Using the literature to build causal models of retrospective observational data
Project Number: 5K99LM013367-02
Former Number: 1K99LM013367-01
Contact PI/Project Leader: MALEC, SCOTT ALEXANDER
Awardee Organization: UNIVERSITY OF PITTSBURGH AT PITTSBURGH
Description
Abstract Text
Health data contain a wealth of information for research. Health data, such as those found in electronic health
records (EHRs), allow for the identification of links between health events, such as drug exposures and
side-effects. Some of these links indicate stable dependencies deemed to be causes. Causal insight allows
reverse-engineering of disease. If confounding is not addressed, it will be difficult to distinguish causative from
correlative links. Our approach is to identify confounders explicitly. Graphical causal models (GCMs) can be used to
discover causal links from data and prior knowledge. GCMs summarize causal links between variables. Automated
selection of variables would allow GCMs to scale and yield more insight from data. Literature-based discovery
(LBD) methods were developed to identify links between concepts in the literature. Advanced methods permit
the search for concepts linked to each other through specific verbs, e.g., “causes”, “treats”. Our hypothesis is
that we can exploit structured knowledge extracted from the literature to inform GCMs. In prior work, we found
that LBD + GCM was better at identifying side-effects in EHR data than traditional methods. We hypothesize
that, compared to methods that use data alone, our method will increase the ability to detect causal
relationships from EHR data. The first aim is to determine the extent to which LBD-informed GCM improves the
identification of causal links for drug safety. We will build LBD-informed GCMs using publicly available
reference datasets for drug safety. These reference datasets contain drug/side-effect pairs for performance
benchmarking. (A) Test the ability of GCM algorithms to identify known causal links solely using data. We will
systematically evaluate GCM algorithms based on their ability to re-discover causal links in a reference
standard. Results will guide our studies on how GCM can be tuned. (B) Determine the effect of adding different
subsets of LBD-derived information to GCMs on the identification of drug side-effects. We will build causal models
using increasing numbers of confounders. The second aim is to test the ability of LBD built with disease-specific
literature to improve the relevance of LBD-derived confounders for Alzheimer's Disease (AD). We chose AD for
its high prevalence and relative lack of effective pharmacologic treatment. (A) Compare LBD strategies in a
disease-specific setting. We will compare LBD variants restricted to disease-specific literature with LBD lacking
subject-matter restrictions. (B) Define the ability of robust LBD-informed GCM to validate drug repurposing candidates
for treating AD symptoms. We will test the ability of advanced methods to iteratively resolve latent
confounding, when detected, to improve effect estimates. The fulfillment of these aims will yield new methods
to combine insights from the literature with causal modeling to uncover the causal effects of drug exposures
on adverse events and on beneficial outcomes.
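As a rough illustration of how predicate-based literature search might supply candidate confounders for a causal model, the sketch below treats any concept that the literature asserts "CAUSES" both the exposure and the outcome as a candidate. This is a minimal sketch under stated assumptions, not the project's actual pipeline: the `Predication` type, the `candidate_confounders` function, the predicate label, and all concept names are hypothetical, and the toy triples are made up.

```python
from typing import Iterable, List, NamedTuple


class Predication(NamedTuple):
    """A subject-predicate-object triple, as might be extracted from the literature."""
    subject: str
    predicate: str
    obj: str


def candidate_confounders(
    triples: Iterable[Predication],
    exposure: str,
    outcome: str,
    predicate: str = "CAUSES",
) -> List[str]:
    """Return concepts asserted to cause both the exposure and the outcome.

    Assumption: a shared asserted cause is treated as a *candidate*
    confounder only; it would still need to be checked against the data.
    """
    triples = list(triples)  # allow generators, since we scan twice
    causes_of_exposure = {
        t.subject for t in triples
        if t.predicate == predicate and t.obj == exposure
    }
    causes_of_outcome = {
        t.subject for t in triples
        if t.predicate == predicate and t.obj == outcome
    }
    shared = causes_of_exposure & causes_of_outcome
    # The exposure and outcome themselves are never confounders.
    return sorted(shared - {exposure, outcome})


# Made-up triples for illustration only; not real extracted knowledge.
TOY_TRIPLES = [
    Predication("concept_a", "CAUSES", "drug_x"),
    Predication("concept_a", "CAUSES", "event_y"),
    Predication("concept_b", "CAUSES", "event_y"),
    Predication("concept_c", "CAUSES", "drug_x"),
    Predication("concept_c", "CAUSES", "event_y"),
    Predication("concept_d", "TREATS", "event_y"),
    Predication("drug_x", "CAUSES", "event_y"),
]

if __name__ == "__main__":
    print(candidate_confounders(TOY_TRIPLES, "drug_x", "event_y"))
```

In this toy setting, the returned concepts would be the variables added to a graphical causal model alongside the exposure and outcome; how many to include is the kind of question aim 1(B) describes.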
Public Health Relevance Statement
Observational health data, such as electronic health records, contain a wealth of information
that can help to determine the safety, effectiveness, and benefits of drugs. This K99 proposal
aims to integrate knowledge embedded in the literature into graphical causal models to better
characterize the effects of drug exposures. The fulfillment of these aims will advance the state of the art
in pharmacovigilance and drug repositioning using observational health data by
discovering and validating new methods to address confounding.
NIH Spending Category
Patient Safety
Project Terms
Address; Adverse drug event; Adverse event; Age; Aging; Alcoholism; Algorithms; Alzheimer's Disease; Benchmarking; Characteristics; Chronic; Controlled Clinical Trials; Cross-Sectional Studies; Data; Data Set; Dependence; Diagnosis; Disease; Drug Exposure; Drug Side Effects; Drug usage; Effectiveness; Electronic Health Record; Etiology; Event; Health; High Prevalence; Knowledge; Light; Link; Literature; Manuals; Medical; Medical center; Mentors; Methods; Naltrexone; Neurodegenerative Disorders; Outcome; Pancreatitis; Partner in relationship; Patients; Peer Review; Performance; Pharmaceutical Preparations; Pharmacological Treatment; Pharmacology; Phase; Prevalence; Reference Standards; Research; Research Personnel; Reverse engineering; Review Literature; Safety; Selection for Treatments; Structure; Symptoms; System; Testing; Translational Research; Variant; Work; base; causal model; comorbidity; data quality; drug repurposing; guided inquiry; health application; health data; improved; insight; medication safety; pharmacovigilance; prevent; side effect