Project Summary
Long-read sequencing is rapidly transforming our knowledge of the human genome as well as the approach to
uncovering human genetic variation and alterations. In contrast to the rapid pace of algorithmic innovations for
long-read sequencing of human genomes, both informatic development and the generation of long-read
cancer genome data have lagged. With the accuracy and cost of long-read sequencing both approaching those of
short reads, we anticipate that long-read cancer genome sequencing will soon become the new frontier of cancer
genomics and the primary engine of cancer genomic discoveries. The overarching goal of this application is to
catalyze long-read cancer genome sequencing efforts through the development of informatic methods for the
discovery and characterization of somatic genetic alterations in cancer genomes. We propose three lines of
research activities to achieve this goal. First, we will improve existing methods for long-read analysis, including
both long-read alignment and assembly, and develop downstream bioinformatic tools for somatic variant
discovery from aligned long reads (Aim 1) and from de novo long-read assembly (Aim 2). Second, in parallel to
the informatic development, we will generate a resource of long-read cancer genome data for
benchmarking and evaluating long-read informatic methods (Aim 3). We will specifically compare the
performance of variant detection from alignment-based and assembly-based approaches to generate best
practices for long-read cancer genome applications. Finally, we aim to build and expand an active community of
researchers who interact with, generate, analyze, or develop informatic methods for long-read cancer genome
data (Aim 4). The community-building effort will initially focus on providing tutorials and user examples based on
the newly developed informatic methods and newly generated long-read data, and will eventually aim to establish a
catalog of reference cancer genome assemblies for use by the cancer research community.
Public Health Relevance Statement
Narrative
This proposal aims to advance applications of long-read sequencing to the identification of genetic changes in
cancer cells by developing cutting-edge informatic methods for long-read sequencing analysis.
No Sub Projects information available for 1U24CA294203-01