i2b2 Rheumatoid Arthritis DBP Defining RA in the electronic health record for future studies Elizabeth Karlson, MD Associate Professor of Medicine Harvard Medical School Brigham and Women’s Hospital

Dec 25, 2015


Charla Taylor
Page 1:

i2b2 Rheumatoid Arthritis DBP
Defining RA in the electronic health record for future studies

Elizabeth Karlson, MD
Associate Professor of Medicine
Harvard Medical School
Brigham and Women’s Hospital

Page 2:

Background: Partners Resources

• i2b2: “Informatics for Integrating Biology and the Bedside”

• RPDR: “Research Patient Data Repository”
• Natural Language Processing (HiTEX)
• Gold standard dataset:
– Training set: 500 manual chart reviews
– Validation set: 400 manual chart reviews

Page 3:

Coded data

• ICD-9 codes for RA

• ICD-9 codes for related phenotypes
– Lupus (SLE), psoriatic arthritis (PsA), juvenile inflammatory arthritis (JIA)
• Lab results for RA-related antibodies
– Rheumatoid factor (RF), anti-CCP
• Medications
– physician entry, escripts

Page 4:
Page 5:

NLP Concepts

NLP queries
– Rheumatoid arthritis
– RA-related antibodies
  • Anti-CCP/RF/seropositive
  • Result coded as positive/negative
– RA Medications
  • Coded as any mention
– Radiographs: RA erosions
  • Coded as any erosion

Page 6:

Approach to develop RA cohort

[Flowchart: RA Mart (n=29,432), defined as ≥ 1 ICD RA (n=25,830) OR Anti-CCP (n=3,602); Training set (n=500); Classification algorithm training; Validation set (n=400); Predicted RA Cases (n=3,585). Arrows labeled ↑ Sensitivity and ↑ Specificity.]

Classification algorithm
Step 1: Develop gold standard training set
Step 2: Identify variables important for predicting RA
Step 3: Develop algorithm

Page 7:

Chart review results

• RA Mart, N=32,000
– ICD9 = 714.xxx
OR
– CCP test ordered
• Manual chart review for 500 patients
– 20% validation rate
– definite RA=100
– possible/no RA=400
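
The RA Mart entry rule above (an ICD-9 714.x code OR a CCP test ordered) can be sketched as a simple filter. This is only an illustration: the record layout and the field names `icd9_codes` and `ccp_ordered` are hypothetical and do not reflect the actual RPDR schema.

```python
def has_ra_icd9(codes):
    """True if any billing code falls in the ICD-9 714.x (RA) family."""
    return any(code.strip().startswith("714") for code in codes)


def in_ra_mart(patient):
    """Broad screen: at least one RA ICD-9 code OR a CCP test ordered."""
    return has_ra_icd9(patient["icd9_codes"]) or patient["ccp_ordered"]


# Hypothetical patient records, for illustration only.
patients = [
    {"id": 1, "icd9_codes": ["714.0", "401.9"], "ccp_ordered": False},
    {"id": 2, "icd9_codes": ["710.0"], "ccp_ordered": True},
    {"id": 3, "icd9_codes": ["250.00"], "ccp_ordered": False},
]

ra_mart_ids = [p["id"] for p in patients if in_ra_mart(p)]
print(ra_mart_ids)  # [1, 2]
```

A screen this broad favors sensitivity over specificity, which is consistent with the 20% validation rate reported from chart review.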

Page 8:

Comparison of NLP to manual chart review

• Precision of NLP queries
– Methotrexate 100%
– Etanercept 100%
– CCP+ 98.7%
– Seropositive 96%
– Erosion 88%

Page 9:

Approach to develop RA cohort

Classification algorithm
Step 2: Define variables (Vivian Gainer, Sergey Goryachev, Qing Zeng-Treitler, Shawn Murphy)
• Codified data
– ICD9 billing codes
– Electronic medication prescription
– CCP, RF lab results
• Narrative data extracted using natural language processing (NLP), i.e. from physician notes, radiology reports
– Erosions
– RF positive, CCP positive, seropositive
– RA medications


Page 10:

Approach to develop RA cohort

Classification algorithm

Step 3: Develop algorithm (Tianxi Cai)
• Penalized logistic regression with adaptive LASSO
• Parsimonious predictors selected based on BIC
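
As a rough illustration of Step 3, the sketch below fits a logistic regression with an adaptive LASSO penalty and picks the penalty strength by BIC. It is not the study's implementation. Several choices are assumptions made only for this example: the ridge fit used for the initial estimate, the adaptive weights 1/|initial coefficient|, the proximal gradient solver, the lambda grid, and the deterministic toy data (feature 0 agrees with the label in 80% of rows; features 1 and 2 are unrelated to it).

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(b0, beta, x):
    return sigmoid(b0 + sum(b * v for b, v in zip(beta, x)))


def log_likelihood(X, y, b0, beta):
    ll = 0.0
    for x, yi in zip(X, y):
        p = min(max(predict(b0, beta, x), 1e-12), 1.0 - 1e-12)
        ll += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return ll


def fit(X, y, penalty, l1=True, step=0.5, iters=800):
    """(Proximal) gradient descent on the mean logistic loss.

    penalty[j] is a per-coefficient L1 weight (l1=True) or ridge weight
    (l1=False). The intercept b0 is never penalized.
    """
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(iters):
        g0, g = 0.0, [0.0] * p
        for x, yi in zip(X, y):
            r = (predict(b0, beta, x) - yi) / n
            g0 += r
            for j in range(p):
                g[j] += r * x[j]
        b0 -= step * g0
        for j in range(p):
            if l1:
                # Gradient step followed by soft-thresholding.
                z = beta[j] - step * g[j]
                shrunk = max(abs(z) - step * penalty[j], 0.0)
                beta[j] = math.copysign(shrunk, z)
            else:
                beta[j] -= step * (g[j] + penalty[j] * beta[j])
    return b0, beta


def adaptive_lasso_bic(X, y, lambdas):
    """Return (bic, lambda, intercept, coefficients) with the lowest BIC."""
    n, p = len(X), len(X[0])
    _, init = fit(X, y, [0.01] * p, l1=False)           # ridge initial fit
    weights = [1.0 / max(abs(b), 1e-6) for b in init]   # adaptive weights
    best = None
    for lam in lambdas:
        b0, beta = fit(X, y, [lam * w for w in weights])
        df = sum(1 for b in beta if b != 0.0)
        bic = -2.0 * log_likelihood(X, y, b0, beta) + df * math.log(n)
        if best is None or bic < best[0]:
            best = (bic, lam, b0, beta)
    return best


# Deterministic toy data, for illustration only.
X, y = [], []
for i in range(120):
    x0, x1, x2 = i % 2, (i // 2) % 2, (i // 4) % 2
    label = x0 if i % 10 not in (0, 3) else 1 - x0
    X.append([x0, x1, x2])
    y.append(label)

bic, lam, b0, beta = adaptive_lasso_bic(X, y, [0.005, 0.02, 0.08])
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("lambda:", lam, "selected features:", selected)
```

The adaptive weights penalize coefficients that were small in the initial fit more heavily, which is what pushes weak predictors to exactly zero and yields a parsimonious model.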


Page 11:

Model                                        RA (n)   PPV (%)   Sensitivity (%)   Difference in PPV
Algorithms
  Narrative + codified                       3585     94        63                reference
  Codified only                              3046     88        51                6
  NLP only                                   3341     89        56                5
Published administrative codified criteria
  ≥ 3 ICD9 RA                                7960     56        80                38
  ≥ 1 ICD9 RA + med                          7799     45        66                49
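
For readers less familiar with the metrics in this table, the sketch below gives the standard definitions of PPV and sensitivity and recomputes the "Difference in PPV" column from the PPV values shown. The confusion counts at the end are hypothetical and are not taken from the study.

```python
def ppv(true_pos, false_pos):
    """Positive predictive value: share of predicted cases that are true cases."""
    return true_pos / (true_pos + false_pos)


def sensitivity(true_pos, false_neg):
    """Share of true cases that the algorithm identifies."""
    return true_pos / (true_pos + false_neg)


# PPV (%) values copied from the table above.
ppv_percent = {
    "Narrative + codified": 94,
    "Codified only": 88,
    "NLP only": 89,
    ">=3 ICD9 RA": 56,
    ">=1 ICD9 RA + med": 45,
}
reference = ppv_percent["Narrative + codified"]
difference_in_ppv = {
    model: reference - value
    for model, value in ppv_percent.items()
    if model != "Narrative + codified"
}
print(difference_in_ppv)
# {'Codified only': 6, 'NLP only': 5, '>=3 ICD9 RA': 38, '>=1 ICD9 RA + med': 49}

# Hypothetical confusion counts, for illustration only.
print(ppv(true_pos=47, false_pos=3))            # 0.94
print(sensitivity(true_pos=63, false_neg=37))   # 0.63
```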

Page 12:

Top 5 predictive variables for RA

Page 13:

i2b2 RA cohort

Liao, et al., Arthritis Care & Research 2010

*Consortium of Rheumatology Researchers of North America

Page 14:

i2b2 Virtual RA Cohort Studies

• Case-control cohort
– ~4,000 RA cases
– ~13,000 matched non-RA controls
– Matched on age, gender, race and health care utilization
• Samples collected from 1500 cases/1500 controls for genotyping
– Genetic risk score predicts RA with same magnitude as in GWAS (Kurreeman, 2010)
– CAD outcomes in RA cases being validated in i2b2

• Pharmacogenetics Research Network (PGRN)

Page 15:

i2b2 RA Project:

• Selected codified data from RPDR

• Performed NLP queries for RA features

• Developed algorithm based on:

coded + NLP data

Liao, 2010

Page 16:

PGRN Methods:

• Select codified data from RPDR (meds)

• Perform NLP queries for RA disease activity features

• Develop algorithm(s) based on:

Meds + NLP data

Page 17:
Page 18:

PGRN Specific Aims

• Aim 1: Define RA disease activity level in the EMR

• Aim 2: Develop an algorithm to predict RA disease activity from EMR data

• Aim 3: Define temporal relations between RA medications and disease activity to define treatment response in RA

Page 19:

Background

• In RA, disease activity score (DAS28) is considered the gold standard tool to evaluate disease activity and response to treatment in clinical practice

• DAS28 has 2 components:
– Disease activity level
– Change in disease activity level

Page 20:

Van Gestel AM et al. Arthritis Rheum 1996; 39: 34-40

• Disease activity level scored as low, moderate, high

• Disease activity change scored as low, moderate, high
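
To make the two components concrete, here is a minimal sketch of a DAS28 calculation with level and change categories. The slides do not give the formula or cut-offs; the code assumes the commonly used DAS28-ESR formula, the usual activity thresholds (2.6, 3.2, 5.1), and the EULAR response categories (good, moderate, none), which are named differently from the low/moderate/high wording on the slide. The input values are made up.

```python
import math


def das28_esr(tender28, swollen28, esr, global_health):
    """DAS28 using ESR (mm/h) and patient global health (0-100 mm VAS)."""
    return (0.56 * math.sqrt(tender28)
            + 0.28 * math.sqrt(swollen28)
            + 0.70 * math.log(esr)
            + 0.014 * global_health)


def activity_level(score):
    """Map a DAS28 score to a disease activity level."""
    if score < 2.6:
        return "remission"
    if score <= 3.2:
        return "low"
    if score <= 5.1:
        return "moderate"
    return "high"


def eular_response(baseline, current):
    """Classify change in DAS28 between two visits (EULAR response criteria)."""
    improvement = baseline - current
    if improvement > 1.2:
        return "good" if current <= 3.2 else "moderate"
    if improvement > 0.6:
        return "moderate" if current <= 5.1 else "none"
    return "none"


# Made-up visit: 4 tender joints, 4 swollen joints, ESR 20, global health 50.
score = das28_esr(tender28=4, swollen28=4, esr=20, global_health=50)
print(round(score, 2), activity_level(score))
print(eular_response(baseline=5.5, current=3.0))
```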

Page 21:

Research Methods

• Construct a virtual cohort of RA patients (N=5906)
• Review charts for disease activity (document level)
– Remission
– Low
– Moderate
– High
– Indeterminate
– Binary grouping: Remission/Low vs. High/Moderate
• Annotate charts for disease activity features (Knowtator)
– Disease_disorder
– Symptoms (reported pain, stiffness, swelling)
– Signs (objective tenderness, limited range of motion, synovitis)
– Anatomic site (relations with signs and symptoms)
– RA medication signature
– RA labs, level of inflammation (CRP, ESR)
– Patient functioning (activities of daily living)

Page 22:
Page 23:

NLP Methods

• Move from keyword matching in i2b2 to ontology mapping in PGRN

• Customize cTAKES for
– RA medications
– RA anatomic sites
• Find relations between entities
• Define new modules
– RA medication changes (start/stop)
– Reasons to stop medications
– Lab values
– Patient functioning status

Page 24:
Page 25:

NLP Analytic Approaches

1- Internal gold standard datasets
– N=200 BWH annotated notes
– N=200 MGH annotated notes
2- Analyses
– Study whether MD summary (1-3 sentences) predicts disease activity
– SVM: construct vectors based on features and relations to predict disease activity
– Bag of concepts to predict disease activity
3- External gold standard datasets:
– DAS28 scores from standardized tool at MGH matched to clinical note
– DAS28 scores from BRASS matched to clinical note
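
The "bag of concepts" idea above can be sketched as follows: each annotated note becomes a vector of concept counts over a shared vocabulary, and vectors of this kind are what a classifier such as an SVM would take as input. The concept labels below are hypothetical placeholders, not actual cTAKES output or ontology identifiers, and the classifier itself is left out.

```python
from collections import Counter


def build_vocabulary(annotated_notes):
    """Sorted list of every concept seen in any note."""
    return sorted({concept for note in annotated_notes for concept in note})


def to_vector(concepts, vocabulary):
    """Count how often each vocabulary concept occurs in one note."""
    counts = Counter(concepts)
    return [counts[concept] for concept in vocabulary]


# Hypothetical concept annotations for two notes, for illustration only.
notes = [
    ["synovitis", "joint_swelling", "methotrexate", "joint_swelling"],
    ["no_synovitis", "methotrexate"],
]

vocabulary = build_vocabulary(notes)
vectors = [to_vector(note, vocabulary) for note in notes]
print(vocabulary)  # ['joint_swelling', 'methotrexate', 'no_synovitis', 'synovitis']
print(vectors)     # [[2, 1, 0, 1], [0, 1, 1, 0]]
```

Word order and relations between entities are discarded in this representation, which is why the slide lists a separate SVM variant built on features and relations.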

Page 26:

Future work

• Define temporal relations between anti-TNF medication use (e.g. new starts) and pre- and post-start disease activity to define response to therapy
– Construct disease activity timeline (patient level)
– Construct medication timeline (patient level)
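
One way to picture the two patient-level timelines is sketched below: a disease activity timeline of dated levels, a medication timeline of dated starts, and a lookup of the last activity level before and the first level after each start. The data structures, dates, and values are hypothetical; the slides do not specify how the timelines would be built.

```python
from datetime import date


def activity_around_start(activity_timeline, start_date):
    """Return (last level before start, first level after start).

    Either element is None if no note exists on that side of the start date.
    Notes dated exactly on the start date are ignored in this sketch.
    """
    before = [entry for entry in activity_timeline if entry[0] < start_date]
    after = [entry for entry in activity_timeline if entry[0] > start_date]
    pre = max(before, key=lambda entry: entry[0])[1] if before else None
    post = min(after, key=lambda entry: entry[0])[1] if after else None
    return pre, post


# Hypothetical timelines for one patient, for illustration only.
activity_timeline = [
    (date(2009, 1, 10), "high"),
    (date(2009, 4, 2), "high"),
    (date(2009, 9, 15), "low"),
]
medication_timeline = [("etanercept", date(2009, 5, 1))]

for drug, started in medication_timeline:
    pre, post = activity_around_start(activity_timeline, started)
    print(drug, "pre-start:", pre, "post-start:", post)
# etanercept pre-start: high post-start: low
```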

Page 27:

Use NLP to define temporal sequence of medication start and adverse event

Page 28:

Questions?

Page 29: