Strategic Health IT Advanced Research Projects (SHARP) Area 4: Secondary Use of EHR Data Project 3: High-Throughput Phenotyping Jyotishman Pathak, PhD Assistant Professor of Biomedical Informatics June 11, 2012

Feb 24, 2016

Page 1: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Strategic Health IT Advanced Research Projects (SHARP) Area 4: Secondary Use of EHR Data
Project 3: High-Throughput Phenotyping
Jyotishman Pathak, PhD
Assistant Professor of Biomedical Informatics

June 11, 2012

Page 2: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Project 3: Collaborators & Acknowledgments

• CDISC (Clinical Data Interchange Standards Consortium)
  • Rebecca Kush, Landen Bain
• Centerphase Solutions
  • Gary Lubin, Jeff Tarlowe
• Group Health Seattle
  • David Carrell
• Harvard University/MIT
  • Guergana Savova, Peter Szolovits
• Intermountain Healthcare/University of Utah
  • Susan Welch, Herman Post, Darin Wilcox, Peter Haug
• Mayo Clinic
  • Cory Endle, Rick Kiefer, Sahana Murthy, Gopu Shrestha, Dingcheng Li, Gyorgy Simon, Matt Durski, Craig Stancl, Kevin Peterson, Cui Tao, Lacey Hart, Erin Martin, Kent Bailey, Scott Tabor

©2012 MFMER | slide-2

Page 3: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics


Page 4: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Phenotyping is still a bottleneck…

[Image from Wikipedia]

Page 5: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

EHR systems: United States 2002–2011

[Millwood et al. 2012]

Page 6: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Electronic health records (EHRs) driven phenotyping

• EHRs are becoming more and more prevalent within the U.S. healthcare system
  • Meaningful Use is one of the major drivers
• Overarching goal
  • To develop high-throughput automated techniques and algorithms that operate on normalized EHR data to identify cohorts of potentially eligible subjects on the basis of disease, symptoms, or related findings

Page 7: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics


http://gwas.org

Page 8: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

EHR-driven Phenotyping Algorithms - I

• Typical components
  • Billing and diagnoses codes
  • Procedure codes
  • Labs
  • Medications
  • Phenotype-specific co-variates (e.g., Demographics, Vitals, Smoking Status, CASI scores)
  • Pathology
  • Imaging?
• Organized into inclusion and exclusion criteria
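To make the idea of components organized into inclusion and exclusion criteria concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `Patient` record, the codes, the lab threshold, and the medication are placeholders, not a validated eMERGE or SHARPn algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    """Hypothetical, already-normalized patient record (not a SHARPn CEM)."""
    patient_id: str
    diagnosis_codes: set = field(default_factory=set)  # e.g. ICD-9 codes
    medications: set = field(default_factory=set)      # e.g. drug names
    labs: dict = field(default_factory=dict)           # lab name -> latest value


def is_case(p, inclusion, exclusion):
    """Case = every inclusion test passes and no exclusion test passes."""
    return all(test(p) for test in inclusion) and not any(test(p) for test in exclusion)


# Placeholder criteria: codes and thresholds are illustrative only.
inclusion = [
    lambda p: "250.00" in p.diagnosis_codes,     # diagnosis code present
    lambda p: p.labs.get("HbA1c", 0.0) >= 6.5,   # lab value over a threshold
]
exclusion = [
    lambda p: "prednisone" in p.medications,     # confounding medication
]

patients = [
    Patient("001", {"250.00"}, {"metformin"}, {"HbA1c": 7.2}),
    Patient("002", {"250.00"}, {"prednisone"}, {"HbA1c": 8.0}),
    Patient("003", {"401.9"}, set(), {"HbA1c": 5.4}),
]

cohort = [p.patient_id for p in patients if is_case(p, inclusion, exclusion)]
print(cohort)
```

Keeping each criterion as a separate predicate mirrors how the slide lists components independently, and lets a reviewer switch individual criteria on or off during the iterative tuning that chart review usually drives.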

Page 9: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

EHR-driven Phenotyping Algorithms - II

[Figure: algorithm development workflow. Box labels: Data; Transform; Transform; Phenotype Algorithm; Visualization; Evaluation; NLP, SQL; Rules; Mappings]

[eMERGE Network]

Page 10: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Example: Hypothyroidism Algorithm

[Flowchart leading to three outcomes: Case 1, Case 2, and Control. Box labels, in extraction order:]
• No secondary causes (e.g., pregnancy, ablation)
• No ICD-9s for Hypothyroidism
• No Abnormal TSH/FT4
• No Antibodies for TTG/TPO
• ICD-9s for Hypothyroidism
• Antibodies for TTG or TPO (anti-thyroglobulin, anti-thyroperoxidase)
• Abnormal TSH/FT4
• No thyroid-altering medications (e.g., Phenytoin, Lithium)
• Thyroid replace. meds
• No thyroid replace. meds
• 2+ non-acute visits in 3 yrs
• No hx of myasthenia gravis

[Denny et al., 2012]

Page 11: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Hypothyroidism Algorithm: Validation

Positive Predictive Values (PPV) Based on Chart Review – All Sites

Site | EHR-based Cases/Controls | Sampled for Chart Review Cases/Controls | Old Case PPV (%) | New Case PPV (%)
Group Health | 430/1,188 | 50/50 | 92 | 98
Marshfield | 509/1,193 | 50/50 | 88 | 91
Mayo Clinic | 250/2,145 | 100/100 | 76 | 97
Northwestern | 103/516 | 50/50 | 88 | 98
Vanderbilt | 184/1,344 | 50/50 | 90 | 98
All sites | 1,421/6,362 | - | 87 | 96

[Denny et al., 2012]
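As a reminder of what the PPV columns measure, the following Python sketch computes a positive predictive value from chart-review counts. The counts used here are hypothetical and are not taken from the study; only the formula PPV = TP / (TP + FP) is standard.

```python
def ppv(true_positives: int, false_positives: int) -> float:
    """Positive predictive value, in percent, among algorithm-flagged cases."""
    reviewed = true_positives + false_positives
    if reviewed == 0:
        raise ValueError("no algorithm-positive charts were reviewed")
    return 100.0 * true_positives / reviewed


# Hypothetical review: 50 algorithm-identified cases sampled, 49 confirmed by chart review.
print(round(ppv(49, 1)))
```

A site that samples 50 cases can only move its PPV in steps of 2 percentage points, which is worth keeping in mind when comparing sites with small review samples.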

Page 12: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Data Categories used to define the EHR-driven Phenotyping Algorithms

Phenotype | Clinical gold standard | EHR-derived phenotype | Phenotype Definitions | Validation (PPV/NPV)
Alzheimer’s Dementia | Demographics, clinical examination of mental status, histopathologic examination | Diagnoses, medications | Demographics, laboratory tests, radiology reports | 73%
Cataracts | Clinical exam finding (Ophthalmologic examination) | Diagnoses, procedure codes | Demographics, medications | 98%/98%
Peripheral Arterial Disease | Radiology test results (ankle-brachial index or arteriography) | Diagnoses, procedure codes, medications, radiology test results | Demographics | 94%/99%
Type 2 Diabetes | Laboratory Tests | Diagnoses, laboratory tests, medications | Demographics, height, weight, family history | 98%/100%
Cardiac Conduction | ECG measurements | ECG report results | Demographics, diagnoses, procedure codes, medications, laboratory tests | 97%

[eMERGE Network]

Page 13: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Genotype-Phenotype Association Results

[Figure: odds ratios (axis from 0.5 to 5.0), observed vs. published, by marker and gene/region, grouped by disease: Atrial fibrillation, Crohn's disease, Multiple sclerosis, Rheumatoid arthritis, Type 2 diabetes]

Markers (gene/region): rs2200733 (Chr. 4q25), rs10033464 (Chr. 4q25), rs11805303 (IL23R), rs17234657 (Chr. 5), rs1000113 (Chr. 5), rs17221417 (NOD2), rs2542151 (PTPN22), rs3135388 (DRB1*1501), rs2104286 (IL2RA), rs6897932 (IL7RA), rs6457617 (Chr. 6), rs6679677 (RSBN1), rs2476601 (PTPN22), rs4506565 (TCF7L2), rs12255372 (TCF7L2), rs12243326 (TCF7L2), rs10811661 (CDKN2B), rs8050136 (FTO), rs5219 (KCNJ11), rs5215 (KCNJ11), rs4402960 (IGF2BP2)

[Ritchie et al. 2010]

Page 14: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Key lessons learned from eMERGE

• Algorithm design and transportability
  • Non-trivial; requires significant expert involvement
  • Highly iterative process
  • Time-consuming manual chart reviews
  • Representation of “phenotype logic” for transportability is critical
• Standardized data access and representation
  • Importance of unified vocabularies, data elements, and value sets
  • Questionable reliability of ICD & CPT codes (e.g., billing the wrong code since it is easier to find)
  • Natural Language Processing (NLP) is critical

Page 15: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Algorithm Development Process - Modified

[Figure: algorithm development workflow. Box labels: Data; Transform; Transform; Phenotype Algorithm; Visualization; Evaluation; NLP, SQL; Rules; Mappings; Semi-Automatic Execution]

[eMERGE Network]

Page 16: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Algorithm Development Process - Modified

[Figure: algorithm development workflow. Box labels: Data; Transform; Transform; Phenotype Algorithm; Visualization; Evaluation; NLP, SQL; Rules; Mappings; Semi-Automatic Execution]

• Standardized representation of clinical data
  • Create new and re-use existing clinical element models (CEMs)
• Standardized and structured representation of phenotype definition criteria
  • Use the NQF Quality Data Model (QDM)
• Conversion of structured phenotype criteria into executable queries
  • Use JBoss® Drools (DRLs)

[Welch et al. 2012] [Thompson et al., submitted 2012] [Li et al., submitted 2012]

Page 17: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

The SHARPn “phenotyping funnel”

[Figure: funnel diagram. Labels: Intermountain EHR; Mayo Clinic EHR; CEMs; QDMs; DRLs; Phenotype specific patient cohorts]

[Welch et al. 2012] [Thompson et al., submitted 2012] [Li et al., submitted 2012]

Page 18: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Clinical Element Models
Higher-Order Structured Representations

[Stan Huff, IHC]

Page 19: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Pre- and Post-Coordination

[Stan Huff, IHC]

Page 20: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

[Stan Huff, IHC]

CEMs available for patient demographics, medications, lab measurements, procedures etc.

Page 21: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics


SHARPn data normalization flow - I

CEM MySQL database with normalized patient information

[Welch et al. 2012]

Page 22: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn data normalization flow - II

CEM MySQL database with normalized patient information

Page 23: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Algorithm Development Process - Modified

[Figure: algorithm development workflow. Box labels: Data; Transform; Transform; Phenotype Algorithm; Visualization; Evaluation; NLP, SQL; Rules; Mappings; Semi-Automatic Execution]

• Standardized representation of clinical data
  • Create new and re-use existing clinical element models (CEMs)
• Standardized and structured representation of phenotype definition criteria
  • Use the NQF Quality Data Model (QDM)

[Welch et al. 2012] [Thompson et al., submitted 2012] [Li et al., submitted 2012]

Page 24: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Our task: human readable → machine computable

[Thompson et al., submitted 2012]

Page 25: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

NQF Quality Data Model (QDM)

• Standard of the National Quality Forum (NQF)
  • A structure and grammar to represent quality measures in a standardized format
• Groups of codes in a code set (ICD-9, etc.)
  • "Diagnosis, Active: steroid induced diabetes" using "steroid induced diabetes Value Set GROUPING (2.16.840.1.113883.3.464.0001.113)"
• Supports temporality & sequences
  • AND: "Procedure, Performed: eye exam" > 1 year(s) starts before or during "Measurement end date"
• Implemented as set of XML schemas
  • Links to standardized terminologies (ICD-9, ICD-10, SNOMED-CT, CPT-4, LOINC, RxNorm etc.)
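The two QDM ideas on this slide, value-set membership and a temporal constraint, can be sketched in a few lines of Python. This is an illustration only, not the QDM XML schema or any NQF tooling: the `ValueSet` and `Event` classes, the codes placed in the value set, and the 365-day reading of "> 1 year(s) starts before" are all assumptions. Only the OID and the quoted data-type names come from the slide.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ValueSet:
    oid: str
    name: str
    codes: frozenset  # (code_system, code) pairs


@dataclass(frozen=True)
class Event:
    data_type: str    # QDM-style data type, e.g. "Diagnosis, Active"
    code_system: str
    code: str
    start: date


def in_value_set(event, data_type, value_set):
    """True if the event has the given data type and its code is in the value set."""
    return event.data_type == data_type and (event.code_system, event.code) in value_set.codes


def starts_more_than_before(event, reference, days):
    """True if the event starts more than `days` days before the reference date."""
    return reference - event.start > timedelta(days=days)


steroid_dm = ValueSet(
    oid="2.16.840.1.113883.3.464.0001.113",      # OID quoted on the slide
    name="steroid induced diabetes",
    codes=frozenset({("ICD-9-CM", "249.00")}),   # placeholder, not the real value set contents
)

dx = Event("Diagnosis, Active", "ICD-9-CM", "249.00", date(2011, 3, 1))
eye_exam = Event("Procedure, Performed", "CPT-4", "92014", date(2010, 6, 1))  # placeholder code
measurement_end = date(2011, 12, 31)

print(in_value_set(dx, "Diagnosis, Active", steroid_dm))
print(starts_more_than_before(eye_exam, measurement_end, 365))
```

Referring to a value set by OID rather than listing codes inline is what lets the same criterion be re-used across sites that map local codes to the shared terminology.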

Page 26: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics


116 Meaningful Use Phase I Quality Measures

Page 27: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Example: Diabetes & Lipid Mgmt. - I

Human readable HTML

Page 28: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Example: Diabetes & Lipid Mgmt. - II

Computable XML

Page 29: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

NQF Measure Authoring Tool (MAT)

Page 30: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Algorithm Development Process - Modified

[Figure: algorithm development workflow. Box labels: Data; Transform; Transform; Phenotype Algorithm; Visualization; Evaluation; NLP, SQL; Rules; Mappings; Semi-Automatic Execution]

• Standardized representation of clinical data
  • Create new and re-use existing clinical element models (CEMs)
• Standardized and structured representation of phenotype definition criteria
  • Use the NQF Quality Data Model (QDM)
• Conversion of structured phenotype criteria into executable queries
  • Use JBoss® Drools (DRLs)

[Welch et al. 2012] [Thompson et al., submitted 2012] [Li et al., submitted 2012]

Page 31: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

JBoss® open-source Drools rules based management system (RBMS)

• Represents knowledge with declarative production rules
  • Origins in artificial intelligence expert systems
  • Simple when <pattern> then <action> rules specified in text files
• Separation of data and logic into separate components
• Forward chaining inference model (Rete algorithm)
• Domain specific languages (DSL)
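The when/then production-rule idea and forward chaining can be illustrated with a deliberately naive Python sketch. It is not how Drools works internally: Drools uses the Rete algorithm to avoid re-evaluating every rule on every pass, whereas this loop simply re-checks all rules until nothing new is derived. The fact names and rules are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    when: Callable[[set], bool]  # pattern tested against working memory
    then: Callable[[set], set]   # facts to assert when the pattern matches


def run(rules, facts):
    """Naive forward chaining: fire rules until no new facts appear."""
    memory = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.when(memory):
                new_facts = rule.then(memory) - memory
                if new_facts:
                    memory |= new_facts
                    changed = True
    return memory


# Hypothetical rules; listed out of dependency order to show chaining across passes.
rules = [
    Rule(
        "lab evidence plus replacement med suggests a candidate case",
        when=lambda m: {"thyroid_lab_evidence", "thyroid_replacement_med"} <= m,
        then=lambda m: {"candidate_case"},
    ),
    Rule(
        "abnormal TSH counts as thyroid lab evidence",
        when=lambda m: "abnormal_tsh" in m,
        then=lambda m: {"thyroid_lab_evidence"},
    ),
]

result = run(rules, {"abnormal_tsh", "thyroid_replacement_med"})
print(sorted(result))
```

The data (working memory) and the logic (rules) stay in separate structures, which is the separation the slide highlights: rules can be edited or swapped without touching how patient facts are loaded.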

Page 32: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

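Example Drools rule

rule "Glucose <= 40, Insulin On"
when
    // Bind $msg to a GlucoseMsg fact that satisfies both constraints
    $msg : GlucoseMsg(glucoseFinding <= 40, currentInsulinDrip > 0)
then
    // Record the instruction that corresponds to this finding
    glucoseProtocolResult.setInstruction(GlucoseInstructions.GLUCOSE_LESS_THAN_40_INSULIN_ON_MSG);
end

Callouts on the slide:
• {Rule Name}: "Glucose <= 40, Insulin On"
• {binding}: $msg
• {Java Class}: GlucoseMsg
• {Class Getter Method}: glucoseFinding, currentInsulinDrip
• {Class Setter Method}: setInstruction
• Parameter {Java Class}: GlucoseInstructions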

Page 33: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Automatic translation from NQF QDM criteria to Drools

From non-executable to executable

[Diagram. Box labels:]
• Measure Authoring Toolkit
• Data Types: XML-based structured representation
• Value Sets: saved in XLS files
• Measures: XML-based structured representation
• Mapping data types and value sets
• Fact Models
• Converting measures to Drools scripts
• Drools scripts
• Drools Engine

[Li et al., submitted 2012]
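One way to picture the "converting measures to Drools scripts" step is as text generation from structured criteria. The Python sketch below is an assumption-laden illustration, not the SHARPn translator: the `Criterion` structure, the fact-model class names (`Diagnosis`, `LabResult`), their fields, and the action string are all hypothetical. It only shows that a structured criterion can be rendered mechanically into DRL-shaped text.

```python
from dataclasses import dataclass

ALLOWED_OPERATORS = {"==", "!=", "<", "<=", ">", ">="}


@dataclass
class Criterion:
    fact_class: str   # hypothetical fact-model class name
    field: str        # property tested on that class
    operator: str
    value: object


def criterion_to_pattern(c):
    """Render one criterion as a single DRL-style pattern line."""
    if c.operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {c.operator}")
    literal = f'"{c.value}"' if isinstance(c.value, str) else str(c.value)
    return f"    {c.fact_class}({c.field} {c.operator} {literal})"


def to_drl(rule_name, criteria, action):
    """Assemble rule text; patterns listed under `when` are implicitly ANDed in DRL."""
    lines = [f'rule "{rule_name}"', "when"]
    lines += [criterion_to_pattern(c) for c in criteria]
    lines += ["then", f"    {action}", "end"]
    return "\n".join(lines)


drl = to_drl(
    "Diabetes with elevated LDL",
    [
        Criterion("Diagnosis", "code", "==", "250.00"),  # placeholder code
        Criterion("LabResult", "value", ">=", 100),       # placeholder threshold
    ],
    'cohort.add("diabetes-lipid");',                      # hypothetical action
)
print(drl)
```

A real translator would also need to join the patterns on the same patient and resolve value sets to code lists; both are omitted here to keep the shape of the transformation visible.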

Page 34: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Automatic translation from NQF QDM criteria to Drools

[Li et al., submitted 2012]

Page 35: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

The “executable” Drools flow


Page 36: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Phenotype library and workbench - I

1. Converts QDM to Drools
2. Rule execution by querying the CEM database
3. Generate summary reports

http://phenotypeportal.org
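Steps 2 and 3 of this list (execute a rule against a database of normalized data, then summarize) can be sketched in Python with an in-memory SQLite table standing in for the CEM MySQL database. The table layout, column names, and the glucose threshold are assumptions made for this illustration and do not reflect the actual CEM schema or the workbench code; the QDM-to-Drools conversion step is left out.

```python
import sqlite3

# In-memory stand-in for a database of normalized lab results (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE lab (patient_id TEXT, name TEXT, value REAL);
    INSERT INTO lab VALUES ('001', 'glucose', 38);
    INSERT INTO lab VALUES ('002', 'glucose', 95);
    INSERT INTO lab VALUES ('003', 'glucose', 40);
    """
)


def execute_rule(conn, lab_name, threshold):
    """Return patient ids with at least one result at or below the threshold."""
    rows = conn.execute(
        "SELECT DISTINCT patient_id FROM lab "
        "WHERE name = ? AND value <= ? ORDER BY patient_id",
        (lab_name, threshold),
    )
    return [row[0] for row in rows]


def summary(conn, matched):
    """Small summary report: matched patients out of all patients with lab data."""
    total = conn.execute("SELECT COUNT(DISTINCT patient_id) FROM lab").fetchone()[0]
    return {
        "matched": len(matched),
        "total": total,
        "percent": round(100.0 * len(matched) / total, 1),
    }


matched = execute_rule(conn, "glucose", 40)
report = summary(conn, matched)
print(matched, report)
```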

Page 37: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Phenotype library and workbench - II

http://phenotypeportal.org

Page 38: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Phenotype library and workbench - III

http://phenotypeportal.org

Page 39: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics


Phenotype library and workbench - IV

Page 40: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics


Page 41: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Additional on-going research efforts - I

• Machine learning and association rule mining
  • Manual creation of algorithms takes time
  • Let computers do the “hard work”
  • Validate against expert developed ones

[Carroll et al. 2011]

Page 42: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Additional on-going research efforts - I

• Origins in sales data
• Items (columns): co-morbid conditions
• Transactions (rows): patients
• Itemsets: sets of co-morbid conditions
• Goal: find all itemsets (sets of conditions) that frequently co-occur in patients.
  • One of those conditions should be DM.
• Support: # of transactions the itemset I appeared in
  • Support({TB, DLM, ND})=3
• Frequent: an itemset I is frequent, if support(I)>minsup

Patient | TB | DLM | ND | … | IEC
001 Y Y Y Y
002 Y Y Y Y
003 Y Y
004 Y
005 Y Y Y

X: infrequent

[Simon et al. 2012]
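The support and frequent-itemset definitions above can be checked with a short brute-force Python sketch. The patient-to-condition assignments below are an assumption: the column positions of the Y marks were lost when the slide was extracted, so the data were chosen only to be consistent with Support({TB, DLM, ND}) = 3. Production association rule mining would use an algorithm such as Apriori rather than enumerating every subset.

```python
from itertools import combinations

# Hypothetical transactions (patient -> set of co-morbid conditions).
transactions = {
    "001": {"TB", "DLM", "ND", "IEC"},
    "002": {"TB", "DLM", "ND", "IEC"},
    "003": {"TB", "DLM"},
    "004": {"TB"},
    "005": {"TB", "DLM", "ND"},
}


def support(itemset):
    """Number of patients whose conditions include every item in the itemset."""
    wanted = set(itemset)
    return sum(1 for conditions in transactions.values() if wanted <= conditions)


def frequent_itemsets(minsup):
    """All itemsets with support strictly greater than minsup (brute force)."""
    items = sorted(set().union(*transactions.values()))
    found = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            count = support(candidate)
            if count > minsup:
                found[candidate] = count
    return found


print(support({"TB", "DLM", "ND"}))
for itemset, count in frequent_itemsets(minsup=2).items():
    print(itemset, count)
```

To apply the slide's constraint that one condition should be DM, the result could be filtered to itemsets containing that item once it is present in the data.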

Page 43: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Additional on-going research efforts - II

Page 44: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Additional on-going research efforts - II

TRALI/TACO sniffer

Page 45: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics


Page 46: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Active Surveillance for TRALI and TACO

Of the 88 TRALI cases correctly identified by the CART algorithm, only 11 (12.5%) of these were reported to the blood bank by the clinical service.

Of the 45 TACO cases correctly identified by the CART algorithm, only 5 (11.1%) were reported to the blood bank by the clinical service.

Page 47: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Additional on-going research efforts - III

• Phenome-wide association scan (PheWAS)
  • Do a “reverse GWAS” using EHR data
  • Facilitate hypothesis generation

[Pathak et al. submitted 2012]
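The "reverse GWAS" idea, fixing one genetic variant and scanning many phenotypes, can be sketched as a loop over 2x2 tables. The Python below is a simplified illustration with hypothetical counts and phenotype names; it computes an unadjusted odds ratio with a Wald 95% confidence interval, whereas a real PheWAS would typically adjust for covariates and correct for multiple testing.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald CI for a 2x2 table.

    a: carriers with phenotype      b: carriers without phenotype
    c: non-carriers with phenotype  d: non-carriers without phenotype
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this simple estimate")
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(estimate) - z * se)
    high = math.exp(math.log(estimate) + z * se)
    return estimate, low, high


# Hypothetical counts for one variant across two EHR-derived phenotypes.
counts = {
    "phenotype A": (30, 70, 20, 80),
    "phenotype B": (10, 90, 10, 90),
}

# Scan every phenotype and rank by odds ratio, largest first.
scan = sorted(
    ((name, *odds_ratio_ci(*cells)) for name, cells in counts.items()),
    key=lambda row: row[1],
    reverse=True,
)
for name, estimate, low, high in scan:
    print(f"{name}: OR={estimate:.2f} (95% CI {low:.2f} to {high:.2f})")
```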

Page 48: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Publications to date (conservative)

[Bar chart: counts of Papers, Abstracts, and Under review for Year 1 (2011), Year 2 (2012), and Year 3 (2013); y-axis from 0 to 14; bar labels as extracted: 8, 66, 2, 12]

Page 49: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Mayo projects and collaborations

• Ongoing
  • Transfusion related acute lung injury (Kor)
  • Drug induced liver injury (Talwalkar)
  • Drug induced thrombocytopenia and neutropenia (Al-Kali)
  • Active surveillance for celiac disease (Murray)
  • Warfarin dose response & heart valve replacements (Pereira)
  • Phenotype definition standardization (HCPR/Quality)
• Getting started/planning
  • Pharmacogenomics of systolic heart failure (Bielinski/Pereira)
  • Pharmacogenomics of SSRI (Mrazek/Weinshilboum)
  • Lumbar image reporting with epidemiology (Kallmes)
  • Active clinical trial alerting (CTMS/Cancer Center)

Page 50: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

HTP related presentations

• June 11th, 2012
  • Using EHRs for clinical research (Vitaly Herasevich)
  • Association rule mining and T2D risk prediction (Gyorgy Simon)
  • Scenario-based requirements engineering for developing EHR add-ons to support CER in patient care settings (Junfeng Gao)
• June 12th, 2012
  • Exploring patient data in context clinical research studies: Research Data Explorer (Adam Wilcox et al.)
  • Utilizing previous result sets as criteria for new queries with FURTHeR (Dustin Schultz et al.)
  • Semantic search engine for clinical trials (Yugyung Lee)
  • Knowledge-driven workbench for predictive modeling (Peter Haug et al.)
  • Clinical analytics driven care coordination for 30-day readmission: Demonstration from 360 Fresh.com (Ramesh Sairamesh)

Page 51: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Thank You!

[email protected]