Artificial Intelligence and Machine Learning: Innovations in Clinical Trial Data Automation
Presented by SDC
Richard B. Abelson, Ph.D. | President & CEO
Dale W. Usner, Ph.D. | Sr. VP, Strategic Scientific Consulting & CSO
DIA 2019 Innovation Theater | June 24, 2019
The Case of Self-Driving Cars
• A focus on quality, reduced touch time, and automation of repetitive tasks has kept the cost of the average vehicle from rising more than 30% since 1970 on an inflation-adjusted basis. (Energy.gov, 2016)
• Level 1: The driver and the system share control (cruise control, parking assistance, lane control)
• Level 2 (“Hands off”): The automated system takes full control but is closely monitored by the driver
• Level 3 (“Eyes off”): The driver may do something else, and the system notifies the driver if involvement is needed; the driver must be ready to intervene immediately
• Level 4 (“Mind off”): The driver may go to sleep
• Level 5 (“Steering wheel is optional”): No human intervention is required in any circumstance
Artificial Intelligence and Machine Learning
• Artificial Intelligence (AI): AI is a general term for any computer system that simulates intelligent behavior
• Machine Learning (ML): ML is an application of AI in which the system learns automatically and improves based on observed data
• Similarity Models: Methods for determining the similarity of text
Applications in Clinical Trials
• Patient Compliance
• AI/ML-Based Software as a Medical Device
• Drug Discovery
• Patient Recruitment
Applications in Clinical Data Sciences
• Development of draft CRF and EDC visit schedule
• Trending in trial data and key performance and quality indicators
• EDC user acceptance testing
• SDTM Mapping and aCRF generation
SDTM Mapping and aCRF Generation
• Standard specification and format for submitting data to the FDA
• Requires converting the format of collected clinical data and creating dataset structure (identifier) and trial design variables
• The annotated CRF (aCRF) depicts the SDTM-mapped variables
• These requirements increased the SAS programming resources needed for a study by approximately 20%
Domain | Study ID | Raw Variable Name | SDTM Variable Name | Variable Label | Probability of Match | Model Method
DM | ###-##-#### | Sex | SEX | Sex | 100% | ML & Similarity
DM | ###-##-#### | Race | RACE | Race | 100% | ML & Similarity
DM | ###-##-#### | Ethnic | ETHNIC | Ethnicity | 89% | ML & Similarity

Domain | Study ID | Raw Variable Name | SDTM Variable Name | Variable Label | Probability of Match | Model Method
DM | ###-##-#### | Sex | SEX | Sex | 100% | ML & Similarity
DM | ###-##-#### | Race_American, Race_Asian, Race_Black, Race_Hawaiian, Race_White, Race_Other | RACE | Race | 73% | Similarity
DM | ###-##-#### | Ethnic | ETHNIC | Ethnicity | 98% | ML & Similarity
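The "Similarity" method in the tables above scores how closely a raw variable name resembles an SDTM variable name. The slides do not say which text-similarity measure SDC used, so the following is only a minimal sketch that assumes a character-based ratio from Python's standard `difflib` module. The `SDTM_DM_VARIABLES` dictionary and the function names are hypothetical, and the scores it produces are not the probabilities shown in the tables.

```python
from difflib import SequenceMatcher
from typing import Tuple

# Hypothetical subset of SDTM DM variables and labels, for illustration only.
SDTM_DM_VARIABLES = {
    "SEX": "Sex",
    "RACE": "Race",
    "ETHNIC": "Ethnicity",
    "AGE": "Age",
    "BRTHDTC": "Date/Time of Birth",
}


def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_sdtm_match(raw_name: str) -> Tuple[str, float]:
    """Score a raw name against each SDTM name and label; keep the best."""
    scored = []
    for variable, label in SDTM_DM_VARIABLES.items():
        score = max(name_similarity(raw_name, variable),
                    name_similarity(raw_name, label))
        scored.append((variable, score))
    return max(scored, key=lambda pair: pair[1])


for raw in ["Sex", "Race", "Ethnic", "Race_Asian"]:
    variable, score = best_sdtm_match(raw)
    print(f"{raw:12s} -> {variable:8s} ({score:.0%})")
```

Split variables such as Race_Asian still point to RACE in this sketch, but with a lower score than an exact name, which is consistent in spirit with the lower match probability in the second table.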
AI/ML SDTM Auto Mapping Process
EDC, Labs, & Other Data
• Raw study data downloaded
AI/ML Process
• Get all saved study data already converted to SDTM
• Train learning models and measure accuracy based on SDTM variable values
• Measure name similarity
• Baseline SDTM term prediction
Reference Documents
• CDISC Study Data Tabulation Model Implementation Guide
• CDISC SDTM Controlled Terminology
• Pinnacle21
Derive Fields and Validation to SDTM Standards
• Validate to SDTM standards by domain
• Check CDISC code list for non-extensible variables
• Calculate derived fields
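The step "train learning models and measure accuracy based on SDTM variable values" can be illustrated with a deliberately simple stand-in. The slides do not name the learning algorithm, so this sketch assumes a nearest-profile classifier: it collects the values seen for each SDTM variable in previously converted studies, predicts the variable whose value set overlaps most (Jaccard index) with a new raw column, and reports accuracy on held-out columns. All data, names, and the choice of model here are hypothetical.

```python
from collections import defaultdict

# Hypothetical training data: (SDTM variable, observed values) pairs taken
# from studies already converted to SDTM. All values are made up.
TRAINING_COLUMNS = [
    ("SEX", ["M", "F", "F", "M"]),
    ("SEX", ["MALE", "FEMALE", "FEMALE"]),
    ("RACE", ["WHITE", "ASIAN", "BLACK OR AFRICAN AMERICAN"]),
    ("RACE", ["WHITE", "WHITE", "ASIAN"]),
    ("ETHNIC", ["HISPANIC OR LATINO", "NOT HISPANIC OR LATINO"]),
]

# Hypothetical held-out raw columns with known correct SDTM targets.
HELD_OUT_COLUMNS = [
    ("SEX", ["f", "m", "m"]),
    ("RACE", ["Asian", "White"]),
    ("ETHNIC", ["Not Hispanic or Latino"]),
]


def normalize(values):
    """Upper-case and trim values so comparisons ignore case and spacing."""
    return {str(v).strip().upper() for v in values}


def train(columns):
    """Build one value profile (set of normalized values) per SDTM variable."""
    profiles = defaultdict(set)
    for variable, values in columns:
        profiles[variable] |= normalize(values)
    return dict(profiles)


def predict(profiles, values):
    """Return (variable, score) for the profile with the highest Jaccard overlap."""
    observed = normalize(values)

    def jaccard(profile):
        union = observed | profile
        return len(observed & profile) / len(union) if union else 0.0

    return max(((var, jaccard(p)) for var, p in profiles.items()),
               key=lambda pair: pair[1])


def accuracy(profiles, labelled_columns):
    """Fraction of held-out columns whose predicted variable is correct."""
    hits = sum(predict(profiles, vals)[0] == var for var, vals in labelled_columns)
    return hits / len(labelled_columns)


profiles = train(TRAINING_COLUMNS)
print(predict(profiles, ["M", "F"]))
print(f"held-out accuracy: {accuracy(profiles, HELD_OUT_COLUMNS):.0%}")
```

A value-based prediction like this complements name similarity: it can recognize a column as SEX from its contents even when the raw variable name is uninformative.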
Validation and Derived Fields
Example: CDISC code list values

Code | Codelist Code | Codelist Name | CDISC Submission Value
C74457 | | Race | RACE
C41259 | C74457 | Race | AMERICAN INDIAN OR ALASKA NATIVE
C41260 | C74457 | Race | ASIAN
C16352 | C74457 | Race | BLACK OR AFRICAN AMERICAN
C41219 | C74457 | Race | NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER
C41261 | C74457 | Race | WHITE

Similarity Model: raw variable race values and mapped CDISC values

Raw Variable Race Value | Mapped CDISC Submission Value
american | AMERICAN INDIAN OR ALASKA NATIVE
asian | ASIAN
black | BLACK OR AFRICAN AMERICAN
hawaiian | NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER
white | WHITE
other | NOT IN LIST

RACE
• American Indian or Alaska Native
• Asian
• Black or African American
• Native Hawaiian or Other Pacific Islander
• White
• Other
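The race example above maps raw values to CDISC submission values and flags anything that does not fit the code list. The slides do not describe how the similarity model does this, so the following is a minimal sketch under stated assumptions: candidate terms are the submission values plus raw terms seen in earlier studies (the `KNOWN_RAW_TERMS` lookup is hypothetical and simply reuses the example rows), matching uses `difflib.get_close_matches` with an arbitrary cutoff of 0.8, and the code list is treated as non-extensible.

```python
from difflib import get_close_matches

# CDISC submission values for the Race codelist (C74457), from the table above.
RACE_CODELIST = [
    "AMERICAN INDIAN OR ALASKA NATIVE",
    "ASIAN",
    "BLACK OR AFRICAN AMERICAN",
    "NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER",
    "WHITE",
]

# Hypothetical raw terms seen in previously converted studies, each paired
# with the submission value it was mapped to (taken from the example above).
KNOWN_RAW_TERMS = {
    "american": "AMERICAN INDIAN OR ALASKA NATIVE",
    "asian": "ASIAN",
    "black": "BLACK OR AFRICAN AMERICAN",
    "hawaiian": "NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER",
    "white": "WHITE",
}


def map_race(raw_value: str, cutoff: float = 0.8) -> str:
    """Map a raw race value to a CDISC submission value, or flag it."""
    raw = raw_value.strip().lower()
    # Candidates: known raw terms plus the submission values themselves.
    candidates = dict(KNOWN_RAW_TERMS)
    candidates.update({term.lower(): term for term in RACE_CODELIST})
    match = get_close_matches(raw, list(candidates), n=1, cutoff=cutoff)
    # Non-extensible code list: anything without a close match is flagged.
    return candidates[match[0]] if match else "NOT IN LIST"


for value in ["american", "Asian", "Hawaian", "other"]:
    print(f"{value!r} -> {map_race(value)}")
```

Because the code list cannot be extended, a value flagged as NOT IN LIST needs a different home, such as a supplemental qualifier, rather than a new code list entry.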
AI/ML SDTM Auto Mapping Process
Create SDTM Data Sets and Annotate CRF
• Auto-create new SDTM study data sets by domain
• Auto-create annotated Case Report Form with the SDTM term next to the raw variable term
Annotate CRF
SDTM annotations shown next to the raw fields on the example CRF:
• DM.BRTHDTC
• DM.AGE
• DM.SEX
• DM.ETHNIC
• DM.RACE
• SUPPDM.QVAL WHEN SUPPDM.QNAM = ‘RACEOTH’
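The annotations above follow two patterns: DOMAIN.VARIABLE for standard variables, and a SUPP-- qualifier expression for values kept in a supplemental dataset. A minimal sketch of generating such annotation text from a finished mapping is shown here. The raw CRF field names and the `annotation` helper are hypothetical, and placing the text onto the CRF pages themselves is out of scope.

```python
from typing import Optional


def annotation(domain: str, variable: Optional[str], qnam: Optional[str] = None) -> str:
    """Format a CRF annotation; supplemental qualifiers are identified by QNAM."""
    if qnam is not None:
        return f"SUPP{domain}.QVAL WHEN SUPP{domain}.QNAM = '{qnam}'"
    return f"{domain}.{variable}"


# Hypothetical raw CRF field names paired with their mapped SDTM targets.
CRF_FIELDS = [
    ("Birth_Date", ("DM", "BRTHDTC", None)),
    ("Age", ("DM", "AGE", None)),
    ("Sex", ("DM", "SEX", None)),
    ("Ethnic", ("DM", "ETHNIC", None)),
    ("Race", ("DM", "RACE", None)),
    ("Race_Other_Specify", ("DM", None, "RACEOTH")),
]

for raw_field, (domain, variable, qnam) in CRF_FIELDS:
    print(f"{raw_field:20s} {annotation(domain, variable, qnam)}")
```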
Lessons Learned for Implementing AI/ML
• Start Focused
  ✓ Define a homogeneous set of training data
    • Consistent version of SDTM Implementation Guide and Controlled Terminology
    • Same EDC system
    • SDTM implemented by the same company
  ✓ Choose a subset of Domains
• Look Beyond the Actual Measure Results
  ✓ Important information resides in the dataset and variable names
• Train Data Scientist on Manual Process
Key Takeaways
• AI is any computer system that simulates intelligent behavior.
  ✓ Applications include patient recruitment, compliance, drug discovery, and clinical data.
• Level 5 automation is not needed to see the benefits of AI.
• Small investments in AI-driven automation can get to Level 2 or 3.
  ✓ Minimal oversight and occasional human intervention
  ✓ Higher quality data, less time, less cost
• The result is overall shorter cycle times to meaningful therapeutics.