|Information-integration approaches to biological discovery in high-dimensional data|
|Professor of Computational Biology and Bioinformatics, Dana-Farber Cancer Institute and the Harvard School of Public Health|
|Life Sciences Annex Auditorium A101
May 28, 2009 - 03:00 pm
Available via Access Grid; please check your local site.

Two trends are driving innovation and discovery in the biological sciences: technologies that allow holistic surveys of genes, proteins, and metabolites, and a realization that biological processes are driven by complex networks of interacting biological molecules. However, there is a gap between the gene lists emerging from genome sequencing projects and the network diagrams that are essential if we are to understand the link between genotype and phenotype. Omic technologies such as DNA microarrays were once heralded as providing a window into those networks, but so far their success has been limited, in large part because the high-dimensional data they produce cannot be fully constrained by the limited number of measurements, and in part because the data themselves represent only a small part of the complete story. To circumvent these limitations, we have developed methods that combine omic data with other sources of information in an effort to leverage more completely the compendium of information that we have been able to amass. Here we will present a number of approaches we have developed, including an integrated database that collects clinical, research, and public domain data and synthesizes them to drive discovery, and a seeded Bayesian network analysis of gene expression data that deduces predictive models of network response.