SDTM Validation: Methodologies and Tools Bay Area CDISC Implementation Network Meeting Friday, April 30 th , 2010 Dan Shiu

Feb 28, 2018

Page 1: SDTM Validation: Methodologies and Toolscdiscportal.digitalinfuzion.com/CDISC User Networks/North America... · SDTM Validation: Methodologies and Tools ... raw to SDTM, SDTM to ADaM)

SDTM Validation: Methodologies and Tools

Bay Area CDISC Implementation Network Meeting

Friday, April 30th, 2010

Dan Shiu


Disclaimer

The ideas and examples presented here do NOT imply:

– They have been or will be implemented at Amgen

– They have not been or will not be implemented at Amgen

– Amgen agrees or disagrees with them

The ideas and examples presented here DO represent:

– My personal views

– My sweat and blood


Regulations, Guidance, and Expectations on SDTM Validation

FDA 21 CFR Part 11 applies to computer systems (e.g. Base SAS) but not to the use/output of those systems (e.g. SAS programs/datasets)

FDA Guidance for Industry: Study Data Specifications for electronic submission – data tabulation datasets should follow SDTMIG

FDA website: SDTM Validation Specifications – validation checks from FDA software tools (Janus)

Data submitted to a regulatory agency are expected to be complete and accurate, regardless of the regulatory requirement


SDTM Validation Categories

SDTM Mapping Validation

– Raw Data → Mapping Specifications/aCRF → Programming → SDTM Data

– Verify raw data is CORRECTLY and TRUTHFULLY converted to SDTM data

SDTM Compliance Checks

– Rules have been developed to ensure the software used by FDA (WebSDM™ by PhaseForward) can check and load the submitted SDTM data into their data warehouse (Janus)

– Each rule carries a degree of severity for non-compliance – in the worst case may result in refusal to file


SDTM Mapping Validation vs. Compliance Checks

[Figure labels: SDTM Mapping Validation, SDTM Compliance Checks]

The QS domain is not intended for use in submitting diaries capturing routine study data

Measurement, Test, or Examination values must have consistent standard unit value (--STRESU) across all records in EG, LB, QS, VS

Start Date/Time of Observation (--STDTC) must be less than or equal to End Date/Time of Observation (--ENDTC)
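The last rule above lends itself to a short programmatic check. The following is a minimal Python sketch, not the WebSDM or Janus implementation: it assumes records are held as dictionaries, that dates are ISO 8601 strings without time-zone offsets, and that partial dates should be compared only at the precision both values share. The function name and the sample AE records are hypothetical.

```python
def find_start_after_end(records, start_var, end_var):
    """Return the records whose start date/time is later than the end date/time.

    Assumes ISO 8601 values without time-zone offsets, so values of the
    same precision sort correctly as plain strings.
    """
    flagged = []
    for rec in records:
        start = (rec.get(start_var) or "").strip()
        end = (rec.get(end_var) or "").strip()
        if not start or not end:
            continue  # missing dates belong to a separate check
        # Compare only the precision both values share,
        # e.g. "2010-04" against "2010-04-15" is compared as "2010-04" vs "2010-04".
        shared = min(len(start), len(end))
        if start[:shared] > end[:shared]:
            flagged.append(rec)
    return flagged


# Hypothetical AE records for illustration only.
ae = [
    {"USUBJID": "S1-001", "AESTDTC": "2010-04-28", "AEENDTC": "2010-04-30"},
    {"USUBJID": "S1-002", "AESTDTC": "2010-05-02", "AEENDTC": "2010-04-30"},
    {"USUBJID": "S1-003", "AESTDTC": "2010-04", "AEENDTC": "2010-04-15"},
    {"USUBJID": "S1-004", "AESTDTC": "2010-04-30", "AEENDTC": ""},
]
bad = find_start_after_end(ae, "AESTDTC", "AEENDTC")
print([r["USUBJID"] for r in bad])  # ['S1-002']
```

Passing the variable names as arguments lets the same function serve any domain that carries a --STDTC/--ENDTC pair.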


SDTM Validation Methodologies

SDTM Mapping Validation

– Full Independent-programming

– Risk-based QC Process

– Characteristics-based QC Process

SDTM Compliance Checks

– WebSDM (v1.5/v2.6/v3.0)

– Janus (v1.0 Draft)

– Other SDTMIG custom checks


Full Independent-Programming

Create SDTM mapping specifications/aCRFs

Programmer creates production SDTM datasets based on mapping specifications/aCRFs

QC role creates QC SDTM datasets based on the same mapping specifications/aCRFs

PROC COMPARE production vs. QC SDTM datasets

Resolve discrepancies until production SDTM matches with QC SDTM
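The comparison step can be illustrated outside SAS as well. The sketch below is a rough Python analogue of a keyed dataset comparison, not a reproduction of PROC COMPARE: it assumes each dataset is a list of dictionaries and that the key variables identify a record uniquely (duplicate keys would silently overwrite one another). The function name and sample data are hypothetical.

```python
def compare_datasets(production, qc, keys):
    """List discrepancies between two datasets given as lists of dicts.

    Each discrepancy is (key, variable, production value, QC value).
    """
    def by_key(rows):
        return {tuple(row[k] for k in keys): row for row in rows}

    prod_rows, qc_rows = by_key(production), by_key(qc)
    diffs = []
    for key in sorted(set(prod_rows) | set(qc_rows)):
        if key not in qc_rows:
            diffs.append((key, "<record>", "present", "missing"))
        elif key not in prod_rows:
            diffs.append((key, "<record>", "missing", "present"))
        else:
            p, q = prod_rows[key], qc_rows[key]
            for var in sorted(set(p) | set(q)):
                if p.get(var) != q.get(var):
                    diffs.append((key, var, p.get(var), q.get(var)))
    return diffs


# Hypothetical production and QC versions of a DM-like dataset.
production = [
    {"USUBJID": "S1-001", "AGE": 34, "SEX": "F"},
    {"USUBJID": "S1-002", "AGE": 51, "SEX": "M"},
]
qc = [
    {"USUBJID": "S1-001", "AGE": 34, "SEX": "F"},
    {"USUBJID": "S1-002", "AGE": 15, "SEX": "M"},
    {"USUBJID": "S1-003", "AGE": 47, "SEX": "F"},
]
for diff in compare_datasets(production, qc, keys=["USUBJID"]):
    print(diff)
```

An empty result means the two datasets agree on every keyed record and variable; anything else is a discrepancy to resolve.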


Issues with Full Independent-Programming

Result is still dependent and biased

Inconsistent QC process across products/studies/milestones

QC not based on risk – more time is spent on less important/risky issues

Double resources – programmers, code, datasets, documentation

Inefficiency – delayed deliverables


Risk-based QC

Not all uses of SDTM data are equally important

Not all programming steps are equally error-prone

Align QC efforts with the intended use of SDTM as well as the programming steps used to produce data

Spend most of your QC resources on data with the greatest business/quality risk!


Risk-based QC Concept


Risk Assessment Examples – Complexity

Programming Complexity

Low

- No pooling or merging of data

- No calculations or derivations

- Basic data steps and sorting

Medium

- Simple data merges

- Simple pre-processing of data, sub-setting, where/if clauses, retains, arrays, transposing

- Steps involving validated/standard macros

High

- Complex merging data across various source data

- Complex derivation and calculation of data


Risk Assessment Examples – Intended Use

Intended Use of SDTM Data

Low

- Internal use only

- Not to be used for major business decisions

Medium

- Data/safety review

- Non-endpoint data

High

- Regulatory submission

- Primary analysis/final CSR

- Endpoint safety and efficacy data


Risk-based QC Method Examples

Method | Responsibility | Time Needed

Log Review – use automated log checking utility to detect potential errors | Programmer, QC Role | Short

Code Review – line-by-line review of code and log | QC Role, designated group | Medium

Requirements/Specifications Review – comparison of SDTM data with specifications/aCRF | Programmer, QC Role, Statistician | Medium

Spot Check Review – ad hoc programming/visual checks on SDTM/raw data | QC Role, Statistician | Medium

Independent Programming – programming to produce matching datasets | QC Role | Long
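The Log Review method depends on an automated log checking utility. A minimal sketch of such a utility is shown here in Python; it assumes the log is available as a list of text lines and that the patterns of interest are ERROR and WARNING lines plus two NOTE messages that often indicate a programming problem. The pattern list is illustrative rather than exhaustive, and the sample log is made up.

```python
import re

# Patterns commonly flagged in SAS logs; illustrative, not exhaustive.
SUSPECT_PATTERNS = [
    re.compile(r"^ERROR(\s+\d+-\d+)?:"),
    re.compile(r"^WARNING:"),
    re.compile(r"^NOTE:.*uninitialized", re.IGNORECASE),
    re.compile(r"^NOTE:.*repeats of BY values", re.IGNORECASE),
]


def scan_log(lines):
    """Return (line number, text) for every log line matching a suspect pattern."""
    findings = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if any(pattern.search(text) for pattern in SUSPECT_PATTERNS):
            findings.append((number, text))
    return findings


# Made-up log excerpt for illustration only.
sample_log = [
    "NOTE: There were 120 observations read from the data set RAW.DEMOG.",
    "NOTE: Variable age_raw is uninitialized.",
    "WARNING: Multiple lengths were specified for the variable USUBJID by input data set(s).",
    "NOTE: The data set SDTM.DM has 120 observations and 18 variables.",
    "ERROR: Variable AGEU not found.",
]
for number, text in scan_log(sample_log):
    print(number, text)
```

Keeping the patterns in one list makes it easy for a team to agree on, and extend, what counts as a potential error.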


Risk Matrix Examples

Rows: Complexity of Program. Columns: Intended Use (Business Risk/Impact of Error).

Complexity | Intended Use: Low | Intended Use: Medium | Intended Use: High

High | 1. Log Review; 2. Requirements/Specifications Review; 3. Code Review | 1. Log Review; 2. Requirements/Specifications Review; 3. Spot Check Review; 4. Code Review | 1. Log Review; 2. Requirements/Specifications Review; 3. Independent Programming

Medium | 1. Log Review; 2. Requirements/Specifications Review | 1. Log Review; 2. Requirements/Specifications Review; 3. Spot Check Review | 1. Log Review; 2. Requirements/Specifications Review; 3. Spot Check Review; 4. Code Review

Low | 1. Log Review; 2. Requirements/Specifications Review | 1. Log Review; 2. Requirements/Specifications Review; 3. Spot Check Review | 1. Log Review; 2. Requirements/Specifications Review; 3. Spot Check Review
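A risk matrix like the one above can be kept as a lookup table so that the QC plan for a program is derived rather than decided ad hoc. The Python sketch below transcribes the example matrix, reading each row's cells in Low, Medium, High order of intended use; the constant and function names are hypothetical, and a real matrix would be whatever the organization has agreed on.

```python
LOG = "Log Review"
SPEC = "Requirements/Specifications Review"
SPOT = "Spot Check Review"
CODE = "Code Review"
INDEP = "Independent Programming"

# Keys are (complexity of program, intended use), transcribed from the example matrix.
RISK_MATRIX = {
    ("High", "Low"): [LOG, SPEC, CODE],
    ("High", "Medium"): [LOG, SPEC, SPOT, CODE],
    ("High", "High"): [LOG, SPEC, INDEP],
    ("Medium", "Low"): [LOG, SPEC],
    ("Medium", "Medium"): [LOG, SPEC, SPOT],
    ("Medium", "High"): [LOG, SPEC, SPOT, CODE],
    ("Low", "Low"): [LOG, SPEC],
    ("Low", "Medium"): [LOG, SPEC, SPOT],
    ("Low", "High"): [LOG, SPEC, SPOT],
}


def qc_methods(complexity, intended_use):
    """Return the ordered QC methods for a given complexity and intended use."""
    try:
        return RISK_MATRIX[(complexity, intended_use)]
    except KeyError:
        raise ValueError(
            f"unknown risk levels: {complexity!r}, {intended_use!r}"
        ) from None


# A complex derivation feeding a regulatory submission.
print(qc_methods("High", "High"))
# A simple mapping used for internal review only.
print(qc_methods("Low", "Low"))
```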


Characteristics-based QC

SDTM Mapping Validation: Full Independent-programming, Risk-based QC

Raw Data → Mapping Specifications/aCRF → Programming → SDTM Data

Are these the best ways?


Characteristics-based QC Concept

Each data element has characteristics

Characteristics describe a data element as a whole

If all characteristics match, data elements match

If all data elements match, raw data is CORRECTLY and TRUTHFULLY converted to SDTM

“Grandma, what big eyes you have!”

“Grandma, what big ears you have!”

“Grandma, what big teeth you have!”


Data Element Examples

Data Element: a group of data, regardless of datasets, variables, records, attributes, that together represent a precise meaning or semantics

– CDISC SHARE Project: The vision for CDISC SHARE is to build a global, accessible electronic library, which through advanced technology, enables precise and standardized data element definitions that can be used in applications and studies to improve biomedical research and its link with healthcare.

Age Element: USUBJID, AGE

Race Element: USUBJID, RACE, SUPPDM.QNAM=“RACEOTH”, QVAL

AE Term Element: USUBJID, AETERM, AEDECOD

SF36 Score Element: USUBJID, QSCAT=“SF36”, QSORRES, QSSTRESC, QSSTRESN, QSSTAT, QSREASND


Data Element Characteristics

Numeric Characteristics – Descriptive Statistics: can be generated from PROC SUMMARY, PROC MEANS, PROC UNIVARIATE

– N, NMISS, MIN, MAX, MEAN, MODE

– SUM, RANGE, VAR, STD, STDMEAN

– Coefficient of Variation, Skewness, Kurtosis

Character Characteristics

– FREQ, NOBS, min/max length

– Checksum: e.g. odd parity bit – a simplified algorithm

“Pain” = 01010000011000010110100101101110

Count the number of 1s → 14

To keep odd parity, the parity bit must be 1 (14 + 1 = 15 is odd) → checksum = 1

If all checksums match → all character values match

If statistics of all checksums match → all character values match
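The parity arithmetic for “Pain” can be reproduced in a few lines. This Python sketch assumes ASCII encoding and uses hypothetical function names. It also shows why the slide calls the algorithm simplified: a single parity bit cannot tell “Pain” from “Paim”, so matching parity is supporting evidence rather than proof, and a stronger checksum such as CRC-32 (here via the standard `zlib` module) would normally be preferred.

```python
import zlib


def to_bits(text):
    """Return the 8-bit binary representation of each ASCII character, concatenated."""
    return "".join(format(byte, "08b") for byte in text.encode("ascii"))


def odd_parity_bit(text):
    """Return the bit that makes the total number of 1s odd."""
    ones = to_bits(text).count("1")
    return 1 if ones % 2 == 0 else 0


print(to_bits("Pain"))             # 01010000011000010110100101101110
print(to_bits("Pain").count("1"))  # 14
print(odd_parity_bit("Pain"))      # 1

# Limitation: "n" and "m" both contain five 1 bits, so the parity bit is the same.
print(odd_parity_bit("Paim"))      # 1

# A stronger checksum distinguishes the two values.
print(zlib.crc32(b"Pain") != zlib.crc32(b"Paim"))  # True
```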


Characteristics-based QC Examples

QC on Age Element

– From raw data: demog.age_raw

– From SDTM: DM.AGE

– Compare: N, MIN, MAX, MEAN, MODE, SUM, STD

QC on AE Term Element

– From raw data: adverse.subjectid, adverse.aevt, adverse.aept

– From SDTM: AE.USUBJID, AE.AETERM, AE.AEDECOD

– Compare: FREQ, NOBS, min/max length, checksum

QC on SF36 Score Element

– From raw data: sf36.subjectid, sf36.score_raw, sf36.cmt

– From SDTM: QS.USUBJID, QS.QSCAT=“SF36”, QS.QSORRES, QS.QSSTRESC, QS.QSSTRESN, QS.QSSTAT, QS.QSREASND

– Compare numeric: N, NMISS, MIN, MAX, MEAN, MODE, SUM, STD, RANGE

– Compare character: FREQ, NOBS, min/max length
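The comparisons listed above can be sketched as two summary functions and a comparison of their results. This is a minimal Python illustration under stated assumptions: values are passed as plain lists with `None` for missing numerics, only a subset of the listed statistics is computed, and statistics are compared for exact equality (a tolerance may be needed for non-integer data). The function names and sample values are hypothetical, not taken from the presentation.

```python
import math
import statistics
from collections import Counter


def numeric_characteristics(values):
    """Summarize a numeric data element (None marks a missing value)."""
    present = [v for v in values if v is not None]
    stats = {"N": len(present), "NMISS": len(values) - len(present)}
    if present:
        stats.update(
            MIN=min(present),
            MAX=max(present),
            SUM=math.fsum(present),
            MEAN=statistics.mean(present),
        )
    if len(present) > 1:
        stats["STD"] = statistics.stdev(present)
    return stats


def character_characteristics(values):
    """Summarize a character data element (empty strings count as missing)."""
    present = [v for v in values if v]
    lengths = [len(v) for v in present]
    return {
        "NOBS": len(values),
        "FREQ": Counter(present),
        "MINLEN": min(lengths, default=0),
        "MAXLEN": max(lengths, default=0),
    }


def mismatches(raw_stats, sdtm_stats):
    """Return the names of the characteristics that differ."""
    names = sorted(set(raw_stats) | set(sdtm_stats))
    return [n for n in names if raw_stats.get(n) != sdtm_stats.get(n)]


# Hypothetical values standing in for demog.age_raw and DM.AGE.
raw_age = [34, 51, 47, None]
sdtm_age = [47, 34, 51, None]   # same values, different record order
print(mismatches(numeric_characteristics(raw_age), numeric_characteristics(sdtm_age)))

# Hypothetical values standing in for adverse.aevt and AE.AETERM.
raw_term = ["Headache", "Nausea", "Headache"]
sdtm_term = ["Headache", "Headache", "Nausea"]
print(mismatches(character_characteristics(raw_term), character_characteristics(sdtm_term)))
```

Because the summaries do not depend on record order or dataset structure, the same functions apply to either side of the conversion without reproducing the mapping logic.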


Characteristics-based QC Benefits

Data element characteristics exist as soon as data is created/refreshed

Characteristics-based QC is an extension of risk-based QC in a more consistent way

Characteristics-based QC can be applied to all end-to-end data conversion processes (e.g. raw to SDTM, SDTM to ADaM)

Characteristics-based QC can be automated!


SDTM Compliance Checks

[Figure labels: Raw Data, SDTM, Mapping Validation]


SDTM Validation and Loading at FDA

[Flow diagram. Recoverable labels:

Sponsor: SDTM, Define.xml, eCTD

Steps: Electronic Submission; Data Validation and Loading; Review

Locations: FDA Electronic Document Room; JANUS Data Repository

Checks: WebSDM Checks; JANUS Checks

FDA Review Tools: JMP, J-Review, WebSDM, etc.

Outcomes: Pass; Communication / Refuse to File; Communication]


WebSDM v3.0 Checks

154 rules based on SDTMIG 3.1.2

Checks apply to data (classes, domains, variables, values) and metadata (define.xml, SDTM Terminology.xls)

Severity (Low, Medium, High) is only an indicator of potential problems or anomalies in the data. There is no direct correlation between a severity value and an FDA decision about whether the data is acceptable for review or not.


Janus v1.0 (Draft) Checks

109 rules based on SDTMIG 3.1.1

Overlap with WebSDM rules but with different definition of the severity levels

Severity | Description

High | The error is serious and will prevent the study data from being loaded successfully into the Janus repository. The SDTM study will not be loaded into the Janus repository.

Medium | The error may impact the reviewability of the submission, but will not have an impact on loading the study data into the Janus repository. The SDTM study will be loaded into the Janus repository.

Low | The error may or may not impact the reviewability or the integrity of the submission but will not have an impact on loading the study data into the Janus repository. The SDTM study will be loaded into the Janus repository.
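Read as a decision rule, the Janus severity definitions say that any High finding blocks loading, while Medium and Low findings do not. The Python sketch below encodes only that reading; the function name, the outcome wording, and the rule IDs in the example are hypothetical and do not come from the Janus specification.

```python
def janus_load_outcome(findings):
    """Summarize the loading outcome implied by a list of (rule_id, severity) findings."""
    severities = {severity.lower() for _, severity in findings}
    if "high" in severities:
        return "not loaded"
    if severities & {"medium", "low"}:
        return "loaded, findings to review"
    return "loaded"


# Hypothetical rule IDs for illustration only.
print(janus_load_outcome([("RULE-A", "Low"), ("RULE-B", "Medium")]))
print(janus_load_outcome([("RULE-A", "Low"), ("RULE-C", "High")]))
print(janus_load_outcome([]))
```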


WebSDM vs. Janus – Severity

WebSDM and Janus may assign different severity levels for the same rule


Custom SDTMIG Compliance Checks

WebSDM/Janus checks cannot cover all of the explicit/implicit rules in SDTMIG:

– 8/40/200 character limitation check

– USUBJID value must be unique for each subject across all trials in the submission

– IDVAR (variable), IDVARVAL (record) reference check against parent domain for CO

– ISO 8601 format check on Duration, Elapsed Time, and Interval values

– And many more …
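Two of the custom checks listed above can be sketched briefly. The Python below is an illustration under stated assumptions: the 8/40/200 limits are taken to mean variable names of at most 8 characters, labels of at most 40, and character values of at most 200; the duration pattern is a simplification that accepts `PnW` and `PnYnMnDTnHnMnS` forms with an optional decimal fraction on seconds only, and it does not cover intervals or every form SDTMIG permits. Function names are hypothetical.

```python
import re

# Simplified ISO 8601 duration pattern: weeks alone, or date/time components in order.
_DURATION = re.compile(
    r"P(?!$)(?:\d+W|(?:\d+Y)?(?:\d+M)?(?:\d+D)?"
    r"(?:T(?=\d)(?:\d+H)?(?:\d+M)?(?:\d+(?:\.\d+)?S)?)?)"
)


def is_iso8601_duration(value):
    """Return True if the value matches the simplified duration pattern."""
    return _DURATION.fullmatch(value) is not None


def check_length_limits(name, label, value):
    """Return a list of findings for the assumed 8/40/200 character limits."""
    findings = []
    if len(name) > 8:
        findings.append(f"variable name {name!r} exceeds 8 characters")
    if len(label) > 40:
        findings.append(f"label for {name!r} exceeds 40 characters")
    if len(value) > 200:
        findings.append(f"value of {name!r} exceeds 200 characters")
    return findings


for candidate in ["P1Y2M", "PT36H", "P2W", "P3DT4H30M", "36H", "PT", "P1Y2W"]:
    print(candidate, is_iso8601_duration(candidate))

print(check_length_limits("AETERM", "Reported Term for the Adverse Event", "x" * 201))
```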


Tools for SDTM Compliance Checks

Proprietary Software: WebSDM™ from Phase Forward, etc.

Free Software:

– OpenCDISC Validator

Direct-download and installation on PC

Graphic user interface

Reporting in Excel, CSV, and HTML

– SAS Clinical Standards Toolkit

PC/UNIX installation support from IT

Interactive/Batch SAS programming interface

Reporting functions not provided but can be custom-built


SAS CST is a framework including:

– Directory structure

– Metadata: datasets, format catalog, XML, Excel

– Data: datasets, format catalog, XML, Excel

– Source code: SAS programs/macros


Tools Comparison

Feature | OpenCDISC Validator | SAS Clinical Standards Toolkit

Installation | User direct-download; PC/USB flash drive, tweak on UNIX | IT/SAS administrator support; PC (9.1.3/9.2) and UNIX (9.2)

Interface | Graphic user interface | Interactive/Batch SAS programming interface

Supported Standards / Features | Validate SDTMIG 3.1.1/3.1.2 based on WebSDM v3/Janus v1 draft; Additional custom checks; CDISC-NCI Terminology; Generate/Validate define.xml based on CRTDDS v1 | Validate SDTMIG 3.1.1 based on WebSDM v2.6/Janus v1 draft; Additional custom checks; CDISC-NCI Terminology; Generate/Validate define.xml based on CRTDDS v1

Reporting | Excel/CSV/HTML reports; Can only limit number of occurrences per rule; WebSDM/Janus rule ID on website but not on reports; Severity levels follow Janus | Results in SAS datasets; Can limit number of occurrences per rule/dataset/actual value; WebSDM/Janus ID in results; Severity levels follow WebSDM/Janus


Tools Comparison (Cont’d)

Feature | OpenCDISC Validator | SAS Clinical Standards Toolkit

Processing | Real memory; Check on SAS transport XPT or other delimited text files | Disk and real memory; Redundant processing steps; Check on SAS datasets

Performance | Fair (hours) for small studies but potential memory crash for large studies | To be improved (1+ day)

Maintenance | Open XML code for configuration; Open Java code on website; Standard/Custom metadata in XML/Excel | Open source SAS code/configuration; Standard/Custom metadata in SAS datasets

Flexibility | Need XML/Java expertise for any customization/enhancement | Select/Deselect rules to check in SAS code; Build custom checks with SAS code; Build graphic user interface in SAS/Excel

Documentation | Website Instructions | Installation Instructions; IQ/OQ document; Examples/Exercises; User’s Guide

Technical Support | Website forum | SAS technical support from phone/email/website


References and Contact

FDA Guidance for Industry, Part 11, Electronic Records; Electronic Signatures – Scope and Application http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm072322.pdf

FDA Guidance for Industry, Study Data Specifications (v1.5.1): http://www.fda.gov/downloads/Drugs/DevelopmentApprovalProcess/FormsSubmissionRequirements/ElectronicSubmissions/UCM199759.pdf

WebSDM Checks: http://www.phaseforward.com/products/cdisc

Janus Checks: http://www.fda.gov/ForIndustry/DataStandards/StudyDataStandards/ucm155327.htm

OpenCDISC Validator: http://www.opencdisc.org

SAS Clinical Standards Toolkit: http://ftp.sas.com/techsup/download/hotfix/12clintlkt.html

Contact Information: [email protected]