Integration and Visualization of Diverse Biological Data
Project Number: 5R01GM071966-02
Contact PI/Project Leader: TROYANSKAYA, OLGA G
Awardee Organization: PRINCETON UNIVERSITY
Description
Abstract Text
DESCRIPTION (provided by applicant): Currently, a gap exists between the explosion of high-throughput data generation in molecular biology and the slower growth of reliable functional information extracted from the data. This gap is largely due to the lack of specificity necessary for accurate gene function prediction in the currently available large-scale experimental technologies for rapid protein function assessment. Bioinformatics methods that integrate diverse data sources in their analysis achieve higher accuracy and thus alleviate this lack of specificity, but there is a paucity of generalizable, efficient, and accurate methods for data integration. In addition, no efficient methods exist to display diverse genomic data effectively, even though visualization has been very valuable for analysis of data from large-scale technologies such as gene expression microarrays. The long-term goal of this proposal is to develop an accurate and generalizable bioinformatics framework for integrated analysis and visualization of heterogeneous biological data.
We propose to address the data integration problem with a Bayesian network approach and effective visualization methods. We have shown the efficacy of this method in a proof-of-principle system that increased the accuracy of gene function prediction for Saccharomyces cerevisiae compared to individual data sources. Building on our previous work, we present a two-part plan to improve and expand our system and to develop novel visualization methods for genomic data based on scalable display technology. First, we will investigate the computational and theoretical issues behind accurate integration, analysis, and effective visualization of heterogeneous high-throughput data. Then, leveraging our existing system and the algorithmic improvements developed in the first part of the project, we will design and implement a full-scale data integration and function prediction system for Saccharomyces cerevisiae that will be integrated with the Saccharomyces Genome Database (SGD), a model organism database for yeast.
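As an illustration of why integrating data sources can raise confidence in a predicted functional relationship, the following is a minimal sketch of evidence combination under a naive Bayes assumption, the simplest form of Bayesian network, in which data sources are treated as conditionally independent given whether two genes are functionally related. It is not the network described in this proposal; the function name, the source names, the prior, and all likelihood ratios are hypothetical values chosen for illustration only.

```python
from math import prod


def posterior_functional_link(prior, likelihood_ratios):
    """Combine independent evidence sources using Bayes' rule in odds form.

    prior: probability that two genes are functionally related before
        any data are observed (must be strictly between 0 and 1).
    likelihood_ratios: one value per data source, defined as
        P(observation | related) / P(observation | unrelated).
    Returns the posterior probability that the genes are related.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    # Conditional independence lets the likelihood ratios multiply.
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical likelihood ratios for one gene pair (illustrative only).
evidence = {
    "coexpression": 4.0,     # supports a functional relationship
    "two_hybrid": 10.0,      # strongly supports it
    "colocalization": 0.5,   # weakly argues against it
}
PRIOR = 0.01  # assumed prior probability of a functional relationship

# Posterior from each source alone, then from all sources together.
single = {name: posterior_functional_link(PRIOR, [lr])
          for name, lr in evidence.items()}
combined = posterior_functional_link(PRIOR, evidence.values())

for name, value in single.items():
    print(f"{name:>15}: {value:.3f}")
print(f"{'combined':>15}: {combined:.3f}")
```

Under these assumed numbers, the combined posterior exceeds the posterior from any single source, which is the effect that integration is meant to capture. A full Bayesian network would relax the independence assumption by modeling dependencies among the data sources explicitly.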
The proposed system would provide highly accurate automatic function prediction that can accelerate genomic functional annotation through targeted experimental testing. Furthermore, our system will perform general integration and will offer researchers a unified view of diverse high-throughput data through effective integration and visualization tools, thereby facilitating hypothesis generation and data analysis. Our scalable visualization methods will enable teams of researchers to examine biological data interactively and thus support the highly collaborative nature of genomic research. In addition to contributing to S. cerevisiae genomics, the technology for efficient and accurate heterogeneous data integration and visualization developed under this proposal will form a basis for systems that address the same set of issues for other organisms, including humans.
Project Terms
Internet; Saccharomyces cerevisiae; automated data processing; bioengineering/biomedical engineering; bioinformatics; biotechnology; computer assisted sequence analysis; computer network; computer program/software; computer simulation; computer system design/evaluation; data collection methodology/evaluation; functional/structural genomics; fungal genetics; high throughput technology; information retrieval; mathematical model; molecular biology information system; online computer; statistics/biometry