Tooling for accurately studying the epigenome along the human pangenome reference
Project Number: 1U01HG013744-01
Contact PI/Project Leader: STERGACHIS, ANDREW BEN
Awardee Organization: UNIVERSITY OF WASHINGTON
Description
Abstract Text
Project Summary/Abstract
This proposal will provide the foundational tooling for understanding the function of the pan-genome reference
through the accurate annotation of regulatory elements within the pan-genome. As the genetic component of the
pan-genome reference comes into focus, the next challenge is understanding the functional relevance of genetic
variants within this reference. However, resolving this challenge requires tooling that enables users to: (1) get
accurate epigenetic data into a pan-genome reference; and (2) use epigenetic data once it is in a pan-genome
reference. This proposal leverages our team’s unique expertise in long-read epigenetics, short-read epigenetics,
pan-genome assembly, and genomic software development to develop transformative tooling for threading
accurate epigenetic information into a pan-genome graph, as well as extracting epigenetic information from a
pan-genome in a manner that is compatible with existing epigenetic and genetic analysis tools. Our tooling is
grounded in first assembling accurate epigenetic annotations at the level of haploid linear contigs, which are then
threaded into a pan-genome reference. This approach significantly improves the accuracy with which both long-
and short-read epigenetic features are mapped into a pan-genome, enables our tooling to readily adapt to new
pan-genomes, and enables user-generated epigenetic data to be incorporated into a pan-genome reference
without having to remake the pan-genome reference itself. Importantly, we are designing this tooling to work for
diverse types of epigenetic data acquired across sequencing platforms. In addition, this tooling will be available
through AnVIL, Conda, and other platforms, enabling users to readily adopt it into their own research pipelines.
Specifically, in Aim 1 we will develop tooling that uses a semi-supervised machine learning approach to
accurately classify long-read epigenetic data collected using diverse experimental methods and sequencing
platforms. In Aim 2, we will develop tooling that accurately aggregates long-read epigenetic data onto haploid
linear contigs, and then threads either long-read or short-read epigenetic data into a pan-genome reference. In
Aim 3, we will create tools that perform fundamental operations on epigenetic data within a pan-genome, identifying
epigenetic and genetic features at specific points of interest within a pan-genome in a sample-, path-, and
read-aware manner. Finally, we will apply our tooling to existing long-read and short-read epigenetic datasets to
identify genetic variants within the pan-genome reference associated with haplotype-, paralog-, and
sample-specific epigenetic features.
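As a rough illustration of the "annotate on haploid linear contigs, then thread into the pan-genome" idea described above, the following Python sketch lifts a contig interval onto the graph nodes that a haplotype path traverses, then looks up which paths carry a feature on a given node. It is a minimal sketch under stated assumptions, not the project's tooling: the path representation (an ordered list of node IDs and lengths), the function names `thread_interval` and `features_on_node`, and the example node and path names are all hypothetical, and node orientation is ignored for brevity.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple

# Hypothetical representation: a haplotype path is an ordered list of
# (node_id, node_length) pairs whose concatenation spells the contig.
Path = List[Tuple[str, int]]
Piece = Tuple[str, int, int]  # (node_id, start_in_node, end_in_node)


def thread_interval(path: Path, start: int, end: int) -> List[Piece]:
    """Map a half-open contig interval [start, end) onto graph nodes."""
    # Contig offset at which each node begins, plus the total length.
    offsets = [0]
    for _, length in path:
        offsets.append(offsets[-1] + length)
    if not (0 <= start < end <= offsets[-1]):
        raise ValueError("interval falls outside the contig")

    pieces: List[Piece] = []
    i = bisect_right(offsets, start) - 1  # node containing `start`
    while i < len(path) and offsets[i] < end:
        node_id, _ = path[i]
        # Clip the interval to this node and convert to node-local offsets.
        piece_start = max(start, offsets[i]) - offsets[i]
        piece_end = min(end, offsets[i + 1]) - offsets[i]
        pieces.append((node_id, piece_start, piece_end))
        i += 1
    return pieces


def features_on_node(
    threaded: Dict[str, List[Piece]], node_id: str
) -> Dict[str, List[Tuple[int, int]]]:
    """Return, per path name, the node-local spans that touch `node_id`."""
    hits: Dict[str, List[Tuple[int, int]]] = {}
    for path_name, pieces in threaded.items():
        spans = [(s, e) for n, s, e in pieces if n == node_id]
        if spans:
            hits[path_name] = spans
    return hits


# Two hypothetical haplotype paths that share nodes n1 and n3.
hap1: Path = [("n1", 100), ("n2", 50), ("n3", 200)]
hap2: Path = [("n1", 100), ("n4", 30), ("n3", 200)]

threaded = {
    "sampleA#hap1": thread_interval(hap1, 90, 160),
    "sampleA#hap2": thread_interval(hap2, 0, 40),
}
shared = features_on_node(threaded, "n1")
```

Because each annotation is threaded through an existing path rather than stored in the graph itself, new user data can be added in this sketch without rebuilding the graph, which mirrors the design goal stated in the abstract.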
Public Health Relevance Statement
Project Narrative
This proposal will develop foundational computational tooling for integrating accurate long-read or short-read
epigenetic data into a pan-genome reference, and processing epigenetic data that is contained within a
pan-genome reference. Overall, this tooling will improve our understanding of the function of genetic variants within
the pan-genome reference that contribute to human disease through their impact on gene regulatory
elements.