Expanding the AnVIL (Analysis, Visualization, and Informatics Lab-space)
Project Number: 5U24HG010263-07
Former Number: 5U24HG010263-05
Contact PI/Project Leader: SCHATZ, MICHAEL
Other PIs
Awardee Organization: JOHNS HOPKINS UNIVERSITY
Description
Abstract Text
The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL) powers the
next generation of computational genomic research. The AnVIL makes available several of the most widely
used analysis environments for genomics and biomedical research including Bioconductor, RStudio, Galaxy,
Jupyter, Cromwell, and IGV in a secure, scalable, and accessible cloud-based environment. It currently houses
>600,000 genomic samples from the largest NHGRI projects including the Centers for Common Disease
Genomics (CCDG), the Centers for Mendelian Disease Genomics (CMG), the Telomere-to-Telomere (T2T)
consortium, and the Genotype-Tissue Expression (GTEx) project. Our user-centered solution for data access,
analysis, and visualization enables investigators across all levels of expertise to fully utilize genomic datasets
using environments they are already familiar with, leveraging well-engineered and optimized scientific
computing infrastructure for greater efficiency and lower costs. In this second phase of the AnVIL, we will
expand the AnVIL experience with several additional high-value services and capabilities with the goal of
increasing the number of researchers using the platform and the depth of their research. In Aim 1, we will
enhance the core platform in several innovative ways. First, to support researchers transitioning into the cloud
environment, we will work to simplify and optimize the research environment with new dashboards for
monitoring costs and managing teams, along with optimizations to the APIs to run in a multi-cloud
environment. Next, we will optimize Galaxy in AnVIL to improve the user experience, enable cost-efficient
computing, develop a workflow recommender system, and enable interoperability by integrating computing
services across multiple clouds. Within Bioconductor, we will introduce new capabilities for reliable software
engineering practices; enhance accessibility through monographs, curriculum authoring, and Shiny apps; and
optimize Bioconductor infrastructure and development for the cloud. Additionally, we will design and
implement standards to ensure AnVIL is interoperable with other cloud-based research systems. In Aim 2, we
will introduce four new scientific services to support critical analysis tasks. These include services for enhanced
machine learning capabilities, data harmonization and metadata autocompletion, new liftover services to
translate genomic knowledge between reference genomes, and comprehensive variant discovery and analysis
using long-read sequencing. In Aim 3, we will expand our efforts for training and outreach. This will begin
with focused high-impact events including community workshops and the AnVIL Champions Program, with
the goal of seeding and developing community-driven support. We will also create scalable, accessible videos
and massive open online courses (MOOCs) leveraging new educational infrastructure we are developing. In
Aim 4, we will continue our joint leadership with our AnVIL partners at the Broad, as well as welcome our
new partners in the forthcoming AnVIL Clinical Resource (ACR) program.
Public Health Relevance Statement
PROJECT NARRATIVE
The goal of this project is to expand the NHGRI Genomic Data Science Analysis, Visualization, and Informatics
Lab-space (AnVIL) platform for genomics and biomedical research. The research enabled by this platform will
accelerate our understanding of the genetic components of human health and disease as well as accelerate
progress towards implementing genomic knowledge in medical practice supported by the AnVIL Clinical
Resource (ACR). This will be accomplished through improvements to the core AnVIL platform for scalability,
usability, and interoperability, the addition of several new scientific services to make researchers more
efficient, and a multi-faceted approach to training and outreach.
Project Terms
Acceleration; Bioconductor; Biomedical Research; Clinical; Clinical Research; Cloud Computing; Collaborations; Communities; Complex; Data; Data Science; Development; Disease; Education; Education and Outreach; Educational Curriculum; Educational process of instructing; Educational workshop; Engineering; Ensure; Environment; Ethnic Origin; Event; Galaxy; Genetic; Genome; Genomics; Genotype-Tissue Expression Project; Goals; Health; Human; Individual; Informatics; Infrastructure; Joints; Knowledge; Leadership; Lifting; Machine Learning; Medical; Mendelian disorder; Metadata; Methods; Minority-Serving Institution; Modeling; Monitor; Monograph; National Human Genome Research Institute; Needs Assessment; Participant; Persons; Phase; Property; Publishing; Research; Research Personnel; Resources; Running; Sampling; Secure; Services; Slide; Software Engineering; Software Tools; Students; System; Tissues; Training; Translating; Variant; Visualization; Work; cloud based; community partnership; computing resources; cost; cost efficient; dashboard; data access; data harmonization; data ingestion; design; empowerment; experience; genomic data; genomic platform; improved; innovation; interoperability; machine learning model; massive open online courses; new technology; next generation; online delivery; outreach; programs; reference genome; repository; scientific computing; success; systems research; task analysis; telomere; tool; usability