Awardee Organization: UNIV OF MASSACHUSETTS MED SCH WORCESTER
Description
Abstract Text
PROJECT SUMMARY
Genome-wide association studies (GWAS) have associated tens of thousands of common variants with human
diseases and traits. The rapid expansion of Whole-Genome Sequencing (WGS) studies and biobanks offers
great potential to understand the physiologic and pathophysiologic associations of both common and rare
variants. The IGVF Consortium aims to systematically study the functional and phenotypic effects of genomic
variation; it is not, however, feasible to experimentally characterize the vast number of candidate variants of
interest. Computational models that can accurately predict the context-specific effects of variants are
essential in designing targeted research. We propose an approach anchored on a framework of
high-confidence regulatory elements (REs), from which we will develop methods to learn RE-gene links,
perform rare variant association tests, and fine-map causal common and rare variants. We aim to make all our
results, methods, and tools available to the community through a public portal and the NHGRI and NHLBI Data
Commons. Our proposal has four aims: (1) Develop a core framework of REs from open chromatin regions on
which to anchor our models, improving on past approaches by producing higher-resolution predictions of
functional base-pairs, producing novel RE subclassifications using functional characterization datasets from
IGVF and other sources, and harnessing single-cell datasets to delineate lineage- and stimulus-specific
elements. (2) Use this framework to predict the roles of variants in molecular phenotypes, specifically gene
expression and cellular response to stimuli. We will build statistical and machine-learning methods to predict
context-specific links between REs and their target genes, using three-dimensional conformation data
produced by the IGVF Consortium and external sources. We will apply this method across many cell types and
perform feature selection to build a catalog of high-confidence RE-gene links and regulatory networks. (3)
Develop statistical methods to perform cell type-specific rare variant association tests (cellSTAAR) in WGS
studies, and a latent variable model to prioritize candidate functional variants for traits and diseases, using
results from Aims 1 and 2. We will apply these methods to analyze various metabolic, immune-mediated, and
psychiatric disorders in the multi-ethnic WGS data of the NHLBI Trans-Omic Precision Medicine Program
(TOPMed) and the NHGRI Genome Sequencing Program (GSP) to identify candidate causal
disease-associated variants. (4) Make all the results publicly available by substantially expanding the FAVOR
Portal to include whole genome variant functional annotations of all three billion genomic positions as well as
cell type-specific annotations. We will implement both FAVOR and cellSTAAR in the Data Commons AnVIL
(NHGRI) and BioData Catalyst (NHLBI) so researchers may use them for analysis of new datasets in a
scalable cloud computing environment. We will work closely with other centers and the Data Analysis
Coordinating Center (DACC) of the IGVF on joint analyses and building the IGVF Variant Catalog.
Public Health Relevance Statement
PROJECT NARRATIVE
Scientists have pinpointed tens of thousands of positions in the human genome that may influence
disease. In order to apply this knowledge to improve treatment options and preventative medicine, however,
we need to have a better understanding of how these regions function and where in the body they are active.
This project is a predictive modeling component of the IGVF Consortium, which aims to systematically study
the functional impact of genetic variants and their influence on human diseases and traits. The project
develops and applies cutting-edge computational and statistical methods to predict the functional impacts of
disease-associated genetic variants.
No Sub Projects information available for 5U01HG012064-04