Artificial Intelligence and Machine Learning: Innovations in Clinical Trial Data Automation
Presented by SDC
Richard B. Abelson, Ph.D. | President & CEO
Dale W. Usner, Ph.D. | Sr. VP, Strategic Scientific Consulting & CSO
DIA 2019 Innovation Theater | June 24, 2019

SDC: Statistics & Data Corporation - Artificial Intelligence ......• CDISC Study Data Tabulation Model Implementation Guide • CDISC SDTM Controlled Terminology • Pinnacle21 Reference

Mar 23, 2021




Revolution or Evolution


• The “Japanese Post-War Economic Miracle”

• Six-sigma and lean manufacturing

• A focus on quality, reduced touch time, and automation of repetitive tasks has meant that the cost of the average vehicle has increased by less than 30% since 1970 on an inflation-adjusted basis. (Energy.gov, 2016)

EVOLUTION

01010111 01100101 01101100 01100011 01101111 01101101 01100101 00100000 01110100 01101111 00100000 01000100 01001001 01000001
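
The string of bits on this slide is plain 8-bit ASCII. A minimal sketch for decoding it follows; the helper name `decode_bits` is ours and does not come from the presentation.

```python
def decode_bits(bits: str) -> str:
    """Decode whitespace-separated 8-bit binary groups into ASCII text."""
    return "".join(chr(int(group, 2)) for group in bits.split())


slide_bits = (
    "01010111 01100101 01101100 01100011 01101111 01101101 01100101 "
    "00100000 01110100 01101111 00100000 01000100 01001001 01000001"
)
print(decode_bits(slide_bits))  # Welcome to DIA
```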


What Does This Mean For Us?


Through process automation, we can achieve:

• Higher Quality Data
• at a Lower Cost
• in Less Time

Leading to:

• Higher ROI on R&D
• Increased Profitability
• Better Patient Care


Understanding Automation

The Case of Self-Driving Cars

Level | Description
Level 0 | System issues warnings
Level 1 | The driver and system share control (cruise control, parking assistance, lane control)
Level 2, “Hands off” | The automated system takes control fully but is closely monitored by driver
Level 3, “Eyes off” | Driver may do something else and the system will notify the driver if involvement is needed; driver must be ready to intervene immediately
Level 4, “Mind off” | Driver may go to sleep
Level 5, “Steering wheel is optional” | No human intervention is required in any circumstance


Artificial Intelligence and Machine Learning

• Artificial Intelligence (AI): AI is a general term for any computer system that simulates intelligent behavior

• Machine Learning (ML): ML is an application of AI in which the system learns automatically and improves based on observed data

• Similarity Models: Methods for determining the similarity of text
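
To make the idea of a similarity model concrete, here is a minimal sketch that scores raw variable names against SDTM variable names using Python's standard `difflib.SequenceMatcher`. The slides do not say which similarity measure SDC used, so the measure, the helper names, and the sample raw names are all assumptions for illustration.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(raw_name: str, candidates: list) -> tuple:
    """Return the (candidate, score) pair most similar to raw_name."""
    scored = [(c, name_similarity(raw_name, c)) for c in candidates]
    return max(scored, key=lambda pair: pair[1])


# SDTM variable names taken from the DM annotations in this deck.
sdtm_variables = ["SEX", "RACE", "ETHNIC", "AGE", "BRTHDTC"]

# Hypothetical raw EDC variable names.
for raw in ["Sex", "Race_White", "Ethnicity"]:
    variable, score = best_match(raw, sdtm_variables)
    print(f"{raw} -> {variable} ({score:.2f})")
```

A character-level ratio like this is only one option; token-based or embedding-based measures would slot into `name_similarity` without changing the rest.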



Applications in Clinical Trials


Patient Compliance

AI/ML Based Software as a Medical Device

Drug Discovery

Patient Recruitment


Applications in Clinical Data Sciences

Development of draft CRF and EDC visit schedule

Trending in trial data and key performance and quality indicators

EDC user acceptance testing

SDTM Mapping and aCRF generation



SDTM Mapping and aCRF Generation

• Standard specification and format for submitting data to the FDA

• Requires converting the format of collected clinical data and creating dataset structure (identifier) and trial design variables

• Annotation of the CRF depicting SDTM mapped variables

• These requirements increased SAS programming resources for a study by approximately 20%

• Can AI and ML make this process more efficient?



AI/ML SDTM Auto Mapping Process


EDC, Labs, & Other Data

Raw Study data downloaded

Compile Raw Clinical Data


AI/ML SDTM Auto Mapping Process

EDC, Labs, & Other Data
Raw study data downloaded

AI/ML Process
• Get all saved study data already converted to SDTM
• Train learning models and measure accuracy based on SDTM variable values
• Measure name similarity
• Baseline SDTM term prediction

Predict the SDTM Variable
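
The slides do not describe the learning models themselves. As a rough illustration of "train on values from already-converted studies, then measure accuracy", the sketch below uses a toy frequency-vote classifier as a stand-in. Everything here is an assumption: the function names, the voting rule, and the sample data are hypothetical, not SDC's method.

```python
from collections import Counter, defaultdict


def train(columns):
    """Count how often each normalized value appears under each SDTM variable.

    `columns` is a list of (sdtm_variable, values) pairs taken from studies
    that were already converted to SDTM.
    """
    counts = defaultdict(Counter)
    for variable, values in columns:
        for value in values:
            counts[value.strip().upper()][variable] += 1
    return counts


def predict(counts, values):
    """Vote over a raw column's values; return (variable, share of votes)."""
    votes = Counter()
    for value in values:
        votes.update(counts.get(value.strip().upper(), {}))
    if not votes:
        return None, 0.0
    variable, n = votes.most_common(1)[0]
    return variable, n / sum(votes.values())


# Hypothetical historical SDTM columns (not from the slides).
history = [
    ("SEX", ["M", "F", "F", "M"]),
    ("RACE", ["WHITE", "ASIAN", "WHITE"]),
    ("ETHNIC", ["HISPANIC OR LATINO", "NOT HISPANIC OR LATINO"]),
]
# Hypothetical held-out raw columns with known answers.
held_out = [
    ("SEX", ["f", "m", "m"]),
    ("RACE", ["White", "Asian", "White"]),
]

model = train(history)
correct = sum(predict(model, values)[0] == variable for variable, values in held_out)
accuracy = correct / len(held_out)
print(f"hold-out accuracy: {accuracy:.0%}")
```

The vote share returned by `predict` plays a role loosely comparable to a "Probability of Match" column, though a real model would produce calibrated probabilities.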


Collection of Race in Demographics



AI Application Deliverable

First set of annotations:
• DM.BRTHDTC
• DM.AGE
• DM.SEX
• DM.ETHNIC
• DM.RACE
• SUPPDM.QVAL WHEN SUPPDM.QNAM = ‘RACEOTH’

Second set of annotations:
• DM.BRTHDTC
• DM.SEX
• DM.ETHNIC
• DM.RACE
• SUPPDM.QVAL WHEN SUPPDM.QNAM = ‘RACEOTH’
• DM.AGE


Predict the SDTM Variable

Raw EDC Data (race collected as check boxes):

RACE_AMERICAN | RACE_ASIAN | RACE_BLACK | RACE_HAWAIIAN | RACE_WHITE | RACE_OTHER
FALSE | FALSE | FALSE | FALSE | TRUE | FALSE
FALSE | TRUE | FALSE | FALSE | FALSE | FALSE
FALSE | FALSE | FALSE | FALSE | TRUE | FALSE
FALSE | FALSE | FALSE | FALSE | TRUE | FALSE
FALSE | FALSE | FALSE | FALSE | TRUE | FALSE
TRUE | FALSE | FALSE | FALSE | FALSE | FALSE
FALSE | FALSE | FALSE | FALSE | TRUE | FALSE
FALSE | FALSE | FALSE | FALSE | TRUE | FALSE
FALSE | FALSE | TRUE | FALSE | FALSE | FALSE

Raw EDC Data (race collected as a single field) and SDTM Mapped Data:

RACE (Raw EDC Data) | RACE (SDTM Mapped Data)
White | WHITE
Asian | ASIAN
White | WHITE
White | WHITE
White | WHITE
American Indian or Alaska Native | AMERICAN INDIAN OR ALASKA NATIVE
White | WHITE
White | WHITE
Black or African American | BLACK OR AFRICAN AMERICAN
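
When race is collected as check boxes, the mapping has to collapse several TRUE/FALSE columns into one RACE value. The sketch below shows one way that step might look, using the column names from this slide and the RACE submission values from the code list shown later in the deck. The function name and the rule of rejecting anything other than exactly one checked box are our simplifications; the slides do not say how multiple selections or RACE_OTHER are handled.

```python
from typing import Dict, Optional

# Check-box column -> CDISC submission value. RACE_OTHER is deliberately
# left out because it has no match in the code list shown in the deck.
CHECKBOX_TO_RACE = {
    "RACE_AMERICAN": "AMERICAN INDIAN OR ALASKA NATIVE",
    "RACE_ASIAN": "ASIAN",
    "RACE_BLACK": "BLACK OR AFRICAN AMERICAN",
    "RACE_HAWAIIAN": "NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER",
    "RACE_WHITE": "WHITE",
}


def collapse_race(row: Dict[str, bool]) -> Optional[str]:
    """Collapse one subject's race check boxes into a single RACE value."""
    checked = [name for name, is_checked in row.items() if is_checked]
    if len(checked) != 1:
        # Zero or several boxes need a rule the slides do not describe.
        raise ValueError(f"expected exactly one checked box, got {checked}")
    # RACE_OTHER is not in the mapping, so it comes back as None.
    return CHECKBOX_TO_RACE.get(checked[0])


columns = ["RACE_AMERICAN", "RACE_ASIAN", "RACE_BLACK",
           "RACE_HAWAIIAN", "RACE_WHITE", "RACE_OTHER"]
# Four of the rows shown on the slide.
rows = [
    [False, False, False, False, True, False],
    [False, True, False, False, False, False],
    [True, False, False, False, False, False],
    [False, False, True, False, False, False],
]
mapped = [collapse_race(dict(zip(columns, row))) for row in rows]
print(mapped)
```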


Predict the SDTM Variable

Domain | Study ID | Raw Variable Name | SDTM Variable Name | Variable Label | Probability of Match | Model Method
DM | ###-##-#### | Sex | SEX | Sex | 100% | ML & Similarity
DM | ###-##-#### | Race | RACE | Race | 100% | ML & Similarity
DM | ###-##-#### | Ethnic | ETHNIC | Ethnicity | 89% | ML & Similarity

Domain | Study ID | Raw Variable Name | SDTM Variable Name | Variable Label | Probability of Match | Model Method
DM | ###-##-#### | Sex | SEX | Sex | 100% | ML & Similarity
DM | ###-##-#### | Race_American, Race_Asian, Race_Black, Race_Hawaiian, Race_White, Race_Other | RACE | Race | 73% | Similarity
DM | ###-##-#### | Ethnic | ETHNIC | Ethnicity | 98% | ML & Similarity


AI/ML SDTM Auto Mapping Process

EDC, Labs, & Other Data
Raw study data downloaded

AI/ML Process
• Get all saved study data already converted to SDTM
• Train learning models and measure accuracy based on SDTM variable values
• Measure name similarity
• Baseline SDTM term prediction

Reference Documents
• CDISC Study Data Tabulation Model Implementation Guide
• CDISC SDTM Controlled Terminology
• Pinnacle21

Derive Fields and Validation to SDTM Standards
• Validate to SDTM standards by domain
• Check CDISC code list for non-extensible variables
• Calculate derived fields
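
The deck lists "Calculate derived fields" without giving an example. One plausible derived field is DM.AGE, which appears in the CRF annotations alongside DM.BRTHDTC. The sketch below assumes age is counted in completed years at some reference date and that dates arrive in ISO 8601 form; the function name, the choice of reference date, and the sample dates are hypothetical.

```python
from datetime import date


def derive_age(birth_date: date, reference_date: date) -> int:
    """Return completed years between birth_date and reference_date."""
    had_birthday = (reference_date.month, reference_date.day) >= (
        birth_date.month,
        birth_date.day,
    )
    return reference_date.year - birth_date.year - (0 if had_birthday else 1)


# Hypothetical subject: born the day after the reference date, 49 years earlier.
brthdtc = date.fromisoformat("1970-06-25")
reference = date.fromisoformat("2019-06-24")
print(derive_age(brthdtc, reference))  # 48
```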


Validation and Derived Fields

Example

CDISC code list values:

Code | Codelist Code | Codelist Name | CDISC Submission Value
C74457 | | Race | RACE
C41259 | C74457 | Race | AMERICAN INDIAN OR ALASKA NATIVE
C41260 | C74457 | Race | ASIAN
C16352 | C74457 | Race | BLACK OR AFRICAN AMERICAN
C41219 | C74457 | Race | NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER
C41261 | C74457 | Race | WHITE

Predicted variable mapping:

Domain | Study ID | Raw Variable Name | SDTM Variable Name | Variable Label | Probability of Match | Model Method
DM | ###-##-#### | Sex | SEX | Sex | 100% | ML & Similarity
DM | ###-##-#### | Race | RACE | Race | 100% | ML & Similarity
DM | ###-##-#### | Ethnic | ETHNIC | Ethnicity | 89% | ML & Similarity

Similarity Model, Mapped CDISC Value:

race | CDISC Submission Value
american | AMERICAN INDIAN OR ALASKA NATIVE
asian | ASIAN
black | BLACK OR AFRICAN AMERICAN
hawaiian | NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER
white | WHITE
other | NOT IN LIST

Raw Variable Race Values (RACE):
• American Indian or Alaska Native
• Asian
• Black or African American
• Native Hawaiian or Other Pacific Islander
• White
• Other
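
The code list check on this slide can be illustrated with a simple membership test: each raw value is normalized and looked up in the RACE submission values, and anything without a match is flagged as NOT IN LIST. This sketch uses an exact, case-insensitive match rather than the similarity model named on the slide, and the function name is ours. The code list contents are limited to the values shown in the table here.

```python
# RACE submission values as shown on this slide.
RACE_CODELIST = {
    "AMERICAN INDIAN OR ALASKA NATIVE",
    "ASIAN",
    "BLACK OR AFRICAN AMERICAN",
    "NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER",
    "WHITE",
}


def check_codelist(value: str, codelist: set) -> str:
    """Return the matching submission value, or 'NOT IN LIST' if none."""
    candidate = value.strip().upper()
    return candidate if candidate in codelist else "NOT IN LIST"


# Raw RACE options as listed on the slide.
crf_options = [
    "American Indian or Alaska Native",
    "Asian",
    "Black or African American",
    "Native Hawaiian or Other Pacific Islander",
    "White",
    "Other",
]
for option in crf_options:
    print(f"{option} -> {check_codelist(option, RACE_CODELIST)}")
```

For a non-extensible code list, a NOT IN LIST result means the value cannot simply be added as a new term and needs a separate decision.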


AI/ML SDTM Auto Mapping Process

EDC, Labs, & Other Data
Raw study data downloaded

AI/ML Process
• Get all saved study data already converted to SDTM
• Train learning models and measure accuracy based on SDTM variable values
• Measure name similarity
• Baseline SDTM term prediction

Reference Documents
• CDISC Study Data Tabulation Model Implementation Guide
• CDISC SDTM Controlled Terminology
• Pinnacle21

• Validate to SDTM standards by domain
• Check CDISC code list for non-extensible variables
• Calculate derived fields

Create SDTM Data Sets
• Auto-create new SDTM study data sets by domain

Annotate CRF
• Auto-create annotated Case Report Form with SDTM term next to raw variable term


Annotate CRF

First set of annotations:
• DM.BRTHDTC
• DM.AGE
• DM.SEX
• DM.ETHNIC
• DM.RACE
• SUPPDM.QVAL WHEN SUPPDM.QNAM = ‘RACEOTH’

Second set of annotations:
• DM.BRTHDTC
• DM.SEX
• DM.ETHNIC
• DM.RACE
• SUPPDM.QVAL WHEN SUPPDM.QNAM = ‘RACEOTH’
• DM.AGE
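
Once each raw field has a predicted SDTM target, the annotation text itself is mechanical to build. The sketch below generates strings in the two styles that appear on this slide: `DOMAIN.VARIABLE` for standard variables and the `SUPP--.QVAL WHEN SUPP--.QNAM = ...` form for supplemental qualifiers. The function name and the raw CRF field names are hypothetical, and the output uses straight quotes rather than the typographic quotes in the slide text.

```python
from typing import Optional


def annotate(domain: str, variable: Optional[str] = None,
             qnam: Optional[str] = None) -> str:
    """Build an aCRF annotation string in the style shown on the slide."""
    if qnam is not None:
        # Supplemental qualifier: the value is stored in SUPP<domain>.QVAL.
        return f"SUPP{domain}.QVAL WHEN SUPP{domain}.QNAM = '{qnam}'"
    if variable is None:
        raise ValueError("either variable or qnam is required")
    return f"{domain}.{variable}"


# Hypothetical raw CRF field names; the SDTM targets are those on the slide.
predicted = {
    "Date of Birth": ("DM", "BRTHDTC", None),
    "Age": ("DM", "AGE", None),
    "Sex": ("DM", "SEX", None),
    "Ethnicity": ("DM", "ETHNIC", None),
    "Race": ("DM", "RACE", None),
    "Race, Other Specify": ("DM", None, "RACEOTH"),
}
for raw_field, (domain, variable, qnam) in predicted.items():
    print(f"{raw_field}: {annotate(domain, variable, qnam)}")
```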


AI/ML SDTM Auto Mapping Process

EDC, Labs, & Other Data
Raw study data downloaded

AI/ML Process
• Get all saved study data already converted to SDTM
• Train learning models and measure accuracy based on SDTM variable values
• Measure name similarity
• Baseline SDTM term prediction

Reference Documents
• CDISC Study Data Tabulation Model Implementation Guide
• CDISC SDTM Controlled Terminology
• Pinnacle21

• Validate to SDTM standards by domain
• Check CDISC code list for non-extensible variables
• Calculate derived fields
• Auto-create new SDTM study data sets by domain
• Auto-create annotated Case Report Form with SDTM term next to raw variable term


Lessons Learned for Implementing AI/ML

• Start Focused
  ✓ Define a homogeneous set of training data
    • Consistent version of SDTM Implementation Guide and Controlled Terminology
    • Same EDC system
    • SDTM implemented by the same company
  ✓ Choose a subset of Domains

• Look Beyond the Actual Measure Results
  ✓ Important information resides in the dataset and variable names

• Train Data Scientist on Manual Process
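
The "start focused" advice amounts to filtering the pool of historical studies before training. A minimal sketch of that filter follows, assuming each study carries metadata for the three criteria listed above. The `Study` fields, the version strings, and the system and company names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    study_id: str
    sdtmig_version: str   # SDTM Implementation Guide version
    ct_version: str       # Controlled Terminology release
    edc_system: str
    implemented_by: str   # company that performed the SDTM conversion


def profile(study: Study) -> tuple:
    """Attributes that must match for two studies to count as homogeneous."""
    return (study.sdtmig_version, study.ct_version,
            study.edc_system, study.implemented_by)


def homogeneous_subset(studies, reference: Study):
    """Keep only studies that share the reference study's profile."""
    return [s for s in studies if profile(s) == profile(reference)]


# Hypothetical study metadata.
studies = [
    Study("S1", "3.2", "2018-12-21", "EDC-A", "Company X"),
    Study("S2", "3.1.3", "2016-06-24", "EDC-A", "Company X"),
    Study("S3", "3.2", "2018-12-21", "EDC-A", "Company X"),
    Study("S4", "3.2", "2018-12-21", "EDC-B", "Company Y"),
]
training = homogeneous_subset(studies, reference=studies[0])
print([s.study_id for s in training])
```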



Key Takeaways

• AI is any computer system that simulates intelligent behavior.
  ✓ Patient recruitment, compliance, drug discovery, clinical data.

• Do not need to get to Level 5 automation to see benefits of AI.

• Small investments in AI-driven automation to get to a Level 2 or 3.
  ✓ Minimal oversight and occasional human intervention
  ✓ Higher quality data, less time, less cost

• Driving overall shorter cycle times to meaningful therapeutics.



Visit SDC at Booth #1239