The Commons Alliance: A Partnership to Catalyze the Creation of an NIH Data Commons
Project Number: 3OT3OD025460-01S2
Contact PI/Project Leader: PATEN, BENEDICT
Awardee Organization: UNIVERSITY OF CALIFORNIA SANTA CRUZ
Description
Abstract Text
The life sciences are in the midst of a data revolution. Cheap and accurate genome sequencing is a reality,
high-resolution imaging is becoming routine, and clinical data is increasingly stored in machine-readable
formats. These breakthroughs have brought us to the threshold of a new era in biomedicine, one where the
data sciences hold the potential to propel our understanding and treatment of human disease.
Achieving this potential, however, will require creating software platforms that can support storing, sharing, and
analyzing data at unlimited scale. In this application, we propose to address this unmet need by bringing
together three groups — the University of Chicago, the Broad Institute, and the University of California at Santa
Cruz — each with a strong track record of developing production-grade software platforms to support flagship
scientific efforts, including the All of Us Cohort Program, the Genomic Data Commons (GDC) and its affiliated
NCI Cloud Pilots program, and the Human Cell Atlas Data Coordination Platform (HCA DCP). Our goal is to
align and integrate our individual efforts at building data platforms, in order to build a cohesive environment
that can serve the needs of the NIH Data Commons and beyond.
Because these platforms were each developed to fulfill differing use cases, there is currently far more
complementarity than overlap between them. For example, Dr. Grossman has extensive expertise in running a
hybrid cloud at scale to support the needs of the GDC; this provides cost benefits around data transport and
egress that would be invaluable to the NIH Data Commons. Similarly, Dr. Philippakis has developed a
cloud-based model of collaborative workspaces (FireCloud) and software for management of secondary data
use restrictions (DUOS), and Dr. Paten has long been a leader in developing and implementing standardized
APIs as part of the GA4GH. It is this complementarity that motivates us to integrate our efforts.
In the sections below, we present our plans for creating the Commons Alliance Platform. In addition to having
a unified technical vision for what is needed, we are also aligned around a core set of guiding principles:
(1) Open-source - All the software we develop, from user interfaces down to cloud metal, is
open-source. This includes not only the software that would be funded via this awarding mechanism,
but all software developed and deployed by our team.
(2) Modular and interoperable - A design principle of all complex software undertakings is “separation
of concerns,” i.e., the notion that there should be a clean division between architectural components,
each encapsulated by well-defined interfaces. We are committed to building modular and interoperable
software and, in doing so, encouraging the creation of an ecosystem around it.
(3) Standards-driven - Our team is committed to creating and utilizing standardized APIs and data
formats. We have been leaders in GA4GH since its founding, chairing various working groups and
driver projects.
(4) Healthy competition - Our consortium’s philosophy is to collaborate on APIs to support
interoperability, but compete on implementation to encourage creativity and diversity.
(5) Diversity of data types - We have expertise in multiple data types beyond molecular profiling. In
particular, a key goal of All of Us is to collect extensive clinical data in the form of participant-provided
data and medical records. Similarly, through the Brain Health Commons, Dr. Grossman will be
managing clinical and imaging data. These capabilities will be invaluable as the Commons expands to
include additional data types.
(6) Driven by scientific use cases - Our consortium includes many leading scientists, including PIs on
awards for model organism databases, GTEx, and TOPMed. We will leverage their insights via driving
use cases to ensure that our software enables flagship scientific investigations.
NIH Spending Category
Networking and Information Technology R&D
Project Terms
Address; Architecture; Atlases; Automobile Driving; Award; Biological Sciences; California; Cells; Chicago; Clinical Data; Clinical Management; Complex; Computer software; Costs and Benefits; Creativeness; Data; Data Analyses; Data Science; Ecosystem; Encapsulated; Ensure; Environment; Funding; Genome; Genotype-Tissue Expression Project; Goals; Human; Hybrids; Image; Individual; Institutes; Investigation; Medical Records; Metals; Modeling; Molecular Profiling; Participant; Philosophy; Production; Readability; Running; Scientist; Standardization; Trans-Omics for Precision Medicine; United States National Institutes of Health; Universities; Vision; brain health; cloud based; cohesion; cohort; data format; design; genome sequencing; high resolution imaging; human disease; insight; interoperability; model organisms databases; open source; programs; software development; working group