Open Access Research

Probability landscapes for integrative genomics

Annick Lesne1 and Arndt Benecke1,2

1Institut des Hautes Études Scientifiques, Bures sur Yvette, France

2Institut de Recherche Interdisciplinaire – CNRS USR3078 – Université Lille I, France


Theoretical Biology and Medical Modelling 2008, 5:9 doi:10.1186/1742-4682-5-9

Published: 20 May 2008

Abstract

Background

Comprehending the gene regulatory code in eukaryotes is one of the major challenges of systems biology and a prerequisite for developing novel therapeutic strategies for multifactorial diseases. Its twofold degeneracy precludes brute-force and statistical approaches based on the genomic sequence alone. Instead, methods are needed that recursively integrate systematic, whole-genome experimental data with advanced statistical predictions of regulatory sequences. Such experimental approaches and prediction tools are only starting to become available, while the growing body of genome sequences and empirical sequence annotations undergoes continual, discovery-driven change. Furthermore, given the complexity of the question, a multi-laboratory effort lasting a decade or more must be envisioned. Any framework intended to pave the way to a successful comprehension of the gene regulatory code must take these constraints into account.

Results

We introduce here a concept for such a framework, based entirely on the systematic annotation of genomic sequence in terms of probability profiles, using any type of relevant experimental and theoretical information, followed by cross-correlation analysis in hypothesis-driven model building and testing.
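
As a rough illustration of what annotating a sequence with probability profiles and cross-correlating them could look like, the sketch below turns interval annotations into per-position probabilities and correlates two such profiles over a range of positional offsets. It is not the authors' implementation: the function names (`interval_profile`, `pearson`, `cross_correlation`), the rule for combining overlapping annotations as independent evidence, and the toy coordinates are all assumptions made for the example.

```python
from math import sqrt


def interval_profile(length, intervals):
    """Build a per-position probability profile over a sequence of `length`.

    `intervals` holds (start, end, probability) annotations, end exclusive.
    ASSUMPTION: overlapping annotations are treated as independent evidence
    and combined as 1 - (1 - p1) * (1 - p2).
    """
    profile = [0.0] * length
    for start, end, p in intervals:
        for i in range(max(start, 0), min(end, length)):
            profile[i] = 1.0 - (1.0 - profile[i]) * (1.0 - p)
    return profile


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def cross_correlation(x, y, max_lag):
    """Correlate x[i] with y[i + lag] for every lag in [-max_lag, max_lag]."""
    n = len(x)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:n - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:n + lag]
        result[lag] = pearson(a, b)
    return result


# Toy data (hypothetical): a predicted motif and a measured signal
# that sits three positions downstream of it.
predicted = interval_profile(40, [(10, 15, 0.9)])
measured = interval_profile(40, [(13, 18, 0.9)])
correlations = cross_correlation(predicted, measured, max_lag=5)
best_lag = max(correlations, key=correlations.get)
```

Here `best_lag` gives the offset at which the two profiles agree most strongly. On real data, both profiles would come from heterogeneous sources mapped onto the same genomic coordinates.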

Conclusion

Probability landscapes, which include as a reference set the probabilistic representation of the genomic sequence, can be used efficiently to discover and analyze correlations amongst initially heterogeneous and unrelatable descriptions and genome-wide measurements. Furthermore, this structure can support the automatic generation and testing of hypotheses for alternative gene regulatory grammars, and their evaluation through statistical analysis of the high-dimensional correlations between genomic sequence, sequence annotations, and experimental data. Finally, this structure provides a concrete and tangible basis for attempting to formulate a mathematical description of gene regulation in eukaryotes on a genome-wide scale.

