Awardee Organization: ICAHN SCHOOL OF MEDICINE AT MOUNT SINAI
Description
Abstract Text
The NIH Common Fund (CF) programs have produced transformative datasets, databases,
methods, bioinformatics tools and workflows that are significantly advancing biomedical research
in the United States and worldwide. Currently, CF programs are mostly isolated. However,
integrating data from across CF programs has the potential for synergistic discoveries. In addition,
since CF programs have a time limit of 10 years, sustainability of the widely used CF digital
resources after the programs expire is critical. To address these challenges, the NIH established
the Common Fund Data Ecosystem (CFDE) program, which was recently approved to continue into
a second phase. For this second phase, this project will establish
the Data Resource Center (DRC) and the Knowledge Center (KC). Our efforts will culminate in
the CFDE Workbench, which will comprise three main products: the CFDE
information portal, the CFDE data resource portal, and the CFDE knowledge portal. These three
web portals will be full-stack web-based applications with a backend database and will be
integrated into one public site.
The CFDE information portal will be the entry point to the other two portals. It will contain
information about the CFDE in a dedicated About page, information about each participating and
non-participating CF program, information about each data coordination center (DCC), and links to
catalogs of CF datasets and of CF tools and workflows. It will also present news, events, funding
opportunities, standards and protocols, educational programs and opportunities, social media
feeds, and publications.
The CFDE data resource portal will contain metadata, data, workflows, and tools which are the
products of the CF programs and their data coordination centers (DCCs). We will adopt the C2M2
data model for storing the metadata that describes DCC datasets. We will also archive
relatively small omics datasets that do not have a home in widely established repositories and do
not require PHI protection. In addition, we will expand the cataloging to CF tools, APIs, and
workflows. Importantly, we will develop a search engine that will index and present results from
all these assembled digital assets. In addition, continuing the work established in the CFDE pilot
phase, users of the data portal will be able to fetch identified datasets through links provided by
the DCCs via the DRS protocol. This will include links to raw and processed data.
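
As an illustration of the DRS-based fetching described above, the following is a minimal Python sketch, not the project's implementation. It assumes hostname-based DRS URIs as defined by the GA4GH Data Repository Service specification, where an identifier of the form drs://<hostname>/<id> maps to an HTTPS request for /ga4gh/drs/v1/objects/<id>. The host name, object ID, and sample record below are hypothetical, and no network request is made.

```python
from typing import Optional, Sequence
from urllib.parse import quote, urlparse


def drs_uri_to_url(drs_uri: str) -> str:
    """Map a hostname-based DRS URI to the HTTPS URL that returns its DrsObject."""
    parsed = urlparse(drs_uri)
    if parsed.scheme != "drs" or not parsed.netloc:
        raise ValueError(f"not a hostname-based DRS URI: {drs_uri!r}")
    object_id = parsed.path.lstrip("/")
    if not object_id:
        raise ValueError(f"DRS URI has no object ID: {drs_uri!r}")
    # Percent-encode the ID so that reserved characters survive in the URL path.
    return f"https://{parsed.netloc}/ga4gh/drs/v1/objects/{quote(object_id, safe='')}"


def pick_access_url(drs_object: dict,
                    preferred: Sequence[str] = ("https", "s3")) -> Optional[str]:
    """Return the first directly usable access URL, trying preferred types in order."""
    methods = drs_object.get("access_methods", [])
    for wanted in preferred:
        for method in methods:
            access_url = method.get("access_url") or {}
            if method.get("type") == wanted and access_url.get("url"):
                return access_url["url"]
    return None  # e.g. only access_id entries, which need a further API call


# Hypothetical DrsObject, shaped like a DRS response but not taken from any DCC.
sample_object = {
    "id": "abc123",
    "access_methods": [
        {"type": "s3", "access_url": {"url": "s3://example-bucket/abc123.tsv"}},
        {"type": "https", "access_url": {"url": "https://data.example.org/abc123.tsv"}},
    ],
}

resolved = drs_uri_to_url("drs://drs.example.org/abc123")
download = pick_access_url(sample_object)
```

In a notebook, the resolved URL would be requested with an HTTP client to obtain the DrsObject, and the selected access URL would then be used to download the raw or processed file.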
The CFDE knowledge portal will provide access to processed data from CF programs in various
formats including: 1) knowledge graph assertions; 2) gene, drug, metabolite, and other set
libraries; 3) data matrices ready for machine learning and other AI applications; 4) signatures; and
5) bipartite graphs. In addition, the extract, transform, and load (ETL) scripts to process the data
into these formats will be provided. Since such processed data is relatively small, we will archive
it, assign it unique IDs, and serve it via APIs. In addition, we will
develop workflows that will demonstrate how the processed data can be harmonized. At the same
time, we will document APIs from all CF DCCs and provide example Jupyter Notebooks that
demonstrate how these datasets can be accessed, processed, and combined for integrative
omics analysis. For the knowledge portal we will also develop a library of tools that utilize these
processed datasets. These tools will have some uniform requirements enabling a plug-and-play
architecture.
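
To make the relationship between the processed formats more concrete, here is a small Python sketch of one possible ETL step: reading a gene set library and emitting knowledge graph assertions. The abstract does not specify file formats, so this assumes the common tab-delimited GMT layout (term, description, then member genes). The predicate name has_member, the function names, and the sample sets are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Assertion = Tuple[str, str, str]  # (subject, predicate, object)


def parse_gmt(text: str) -> Dict[str, Set[str]]:
    """Parse GMT-formatted text into a mapping of set name to member genes."""
    library: Dict[str, Set[str]] = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank lines and sets without any members
        term, _description, *genes = fields
        library[term] = {gene for gene in genes if gene}
    return library


def to_assertions(library: Dict[str, Set[str]],
                  predicate: str = "has_member") -> List[Assertion]:
    """Flatten a set library into sorted (set, predicate, gene) triples."""
    return sorted(
        (term, predicate, gene)
        for term, genes in library.items()
        for gene in genes
    )


# Hypothetical two-set library with placeholder gene names.
sample_gmt = "set_A\t\tGENE1\tGENE2\nset_B\tdemo\tGENE2\tGENE3\n"

library = parse_gmt(sample_gmt)
assertions = to_assertions(library)
shared = library["set_A"] & library["set_B"]  # genes common to both sets
```

Each triple links a set node to a gene node, so the same list can also be read as the edge list of a bipartite graph, which is one way several of the listed formats could be derived from a single source.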
To achieve these goals, we will work collaboratively with the other newly established CFDE
centers, the participating CFDE DCCs, the CFDE NIH team, and relevant external entities and
potential consumers of these three software products. These interactions will take place via
face-to-face meetings, virtual working group meetings, one-on-one meetings, Slack, GitHub,
project management software, and e-mail exchange. Through these interactions, we will establish
standards, workstreams, feedback, and mini projects towards accomplishing the goal of
developing a lively and productive Common Fund Data Ecosystem.
Project Terms
Address; Adopted; Architecture; Archives; Benchmarking; Biological Assay; Biomedical Research; Cataloging; Catalogs; Complex; Computer software; Data; Data Coordinating Center; Data Set; Databases; Dedications; Ecosystem; Education; Education and Outreach; Electronic Mail; Elements; Event; Feedback; Feeds; Funding; Funding Opportunities; Genes; Genus Mentha; Goals; Graph; Group Meetings; Home; Ingestion; Knowledge; Knowledge Portal; Learning; Libraries; Link; LinkedIn; Machine Learning; Metadata; Methods; Pharmaceutical Preparations; Phase; Play; Process; Productivity; Protocols documentation; Publications; Publishing; Research Personnel; Resources; Site; Social Network; System; Time; Twitter; United States; United States National Institutes of Health; Visualization; Work; bioinformatics tool; cell type; chatbot; cost; data analysis pipeline; data ecosystem; data ingestion; data integration; data modeling; data portal; data resource; data tools; design; digital; experience; indexing; knowledge graph; meetings; news; preservation; programs; repository; search engine; social media; tool; transcriptome sequencing; virtual; web app; web portal; working group
No Sub Projects information available for 3OT2OD036435-01S1