AZ CDISC Implementation

A brief history of CDISC implementation

Stephen Harrison

Overview

• Background
• CDISC Implementation Strategy
• First steps
• Business as usual
• ADaM or RDB?
• Lessons learned
• Summary

Background

• Seven R&D sites all operating in their own environments
• Creating and maintaining similar tools across the R&D sites
• Continuous duplication of effort across regions

A&RT Initiative

• Project initiation – April 2003
• Objective: Harmonise the A&R process and environment across ALL R&D sites within AZ
• Multiple workstreams looking at technology, process and standards
• Reporting Database (RDB) workstream
• Deliver standardized reusable code or macros to automate production of analysis and report ready datasets

Data Flow Process

• Previous data flow process was a simple route from existing CRFs to Clinical Study Reports/Higher Level Document outputs
• Reporting Database is created directly from the Module Package
• Remit of project was to use existing internal data standards
• Opportunity to implement CDISC standards

[Diagram: data flow linking CRFs, RAW Data, Module Package, Analysis Datasets/RDB and CSR/HLD Output]

CDISC Implementation Strategy

• RDB completely described in terms of SDTM source – good for reviewer
• No need to construct SDTM at the end of the process
• Linear process fulfils the requirement of traceability

[Diagram: the data flow with SDTM added: CRFs, RAW Data, Module Package, SDTM, Analysis Datasets/RDB, CSR/HLD Output]

Longer term strategy

• Underlying RAW data standards are SDTM friendly
• Transformation process is simplified
• CDASH: Clinical Data Acquisition Standards Harmonization

[Diagram: the data flow with a New CRFs/CDASH box (modified CRFs) added]

Longer term strategy

• Adopt ADaM model, replacing internal data standards
• Utilise industry standard transformation and derivation processes

[Diagram: the data flow with New CRFs/CDASH and ADaM boxes added]

First steps

• Global team set up August 2005 to specify AZ business rules
• Application of SDTM Implementation Guide v3.1.1 from an AZ point of view
• Two team members also part of CDISC SDS team
• Inside track to SDTM
• Scope: all corporate and TA standard modules (>200)
• Mapping exercise took nearly 18 months to complete!

Manual mapping document

Business as usual

• Web Interface developed
• Metadata driven process
• RAW to SDTM and SDTM to RDB mapping function
• Inherit Corporate data standards and maps down to project or study level
• Metadata used by “code builder” to create executable code
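
The idea of a metadata-driven "code builder" can be sketched as follows. This is only an illustration under stated assumptions, not the A&RT tool itself: the metadata layout, the `raw` and `sdtm` library names, and the source variable names are all hypothetical, and only simple one-to-one assignments are handled.

```python
# Hypothetical mapping metadata: one row per target variable.
# Field names and RAW variable names are assumptions for illustration.
MAPPINGS = [
    {"source_ds": "DEM", "source_var": "SUBJECT", "target_ds": "DM", "target_var": "SUBJID"},
    {"source_ds": "DEM", "source_var": "SEX", "target_ds": "DM", "target_var": "SEX"},
    {"source_ds": "DEM", "source_var": "CENTRE", "target_ds": "DM", "target_var": "SITEID"},
]


def build_data_step(mappings, target_ds):
    """Generate the text of a simple SAS data step from mapping metadata."""
    rows = [m for m in mappings if m["target_ds"] == target_ds]
    if not rows:
        raise ValueError(f"no mappings defined for {target_ds}")
    source_ds = rows[0]["source_ds"]  # sketch assumes a single source dataset
    lines = [f"data sdtm.{target_ds};", f"  set raw.{source_ds};"]
    for m in rows:
        # Only emit an assignment when the variable is renamed.
        if m["source_var"] != m["target_var"]:
            lines.append(f"  {m['target_var']} = {m['source_var']};")
    keep = " ".join(m["target_var"] for m in rows)
    lines.append(f"  keep {keep};")
    lines.append("run;")
    return "\n".join(lines)


sas_code = build_data_step(MAPPINGS, "DM")
print(sas_code)
```

Because the program text is derived entirely from the metadata rows, changing a mapping changes the generated code without anyone editing a program by hand.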

[Diagram: components shown: Windows, PMPL Data Standards, CSV file (datasets and variables), A&RT Web Interface (project/study import), RAW Data Metadata, A&RT Application Database]

Standards and Reuse of Code

Levels: Corporate, Therapy Area, Project, Study

Each level holds:
• Data standards
• Mappings

Passed down between levels:
• Locked dataset definitions
• Locked Corporate map
• Locked TA map
• Locked Project map
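
One way to picture locked maps being inherited down the levels is the sketch below. It is an assumption-laden illustration: the module-to-domain pairings are invented for the example, and the rule that a lower level may add mappings but not redefine an inherited one is an interpretation of "locked", not a documented behaviour of the tool.

```python
def inherit(levels):
    """Resolve mappings from Corporate down to Study.

    levels: list of (level_name, {raw_module: sdtm_domain}) ordered from
    the highest level to the lowest. A module mapped at a higher level is
    treated as locked and cannot be redefined lower down.
    """
    resolved = {}
    owner = {}
    for name, mappings in levels:
        for module, domain in mappings.items():
            if module in resolved:
                raise ValueError(
                    f"{module} is locked at {owner[module]} level; "
                    f"{name} cannot redefine it"
                )
            resolved[module] = domain
            owner[module] = name
    return resolved


# Illustrative pairings only (assumptions, not the real AZ maps).
study_map = inherit([
    ("Corporate", {"DEM": "DM", "AELOG": "AE"}),
    ("Therapy Area", {"RESPHIS": "MH"}),
    ("Project", {"PULM": "PF"}),
    ("Study", {}),
])
print(study_map)
```

A study that needs nothing special therefore gets its whole map for free, which is where the reuse of standards and code comes from.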

Inheritance – SDTM

[Diagram: RAW metadata modules (e.g., DEM, AELOG, HISM, RESPHIS, PULM) inherited from Corporate through TA (Respiratory) and Project to Study 1 and Study 2, and mapped to SDTM metadata domains (DM, AE, MH, PF, CF) at Corporate, Project and study level]

Example RAW – SDTM map

Define Simple Mapping

Define Macro Mapping

Transposition Groups

A&RT Mapping Process

[Diagram: the Web Interface and A&RT Application Database (Oracle) hold the imported RAW data metadata and the RAW → SDTM and SDTM → RDB mapping metadata; the Program Builder produces SAS code; jobs executed on UNIX (SAS) load RAW data and populate the RAW Database, SDTM Database and Reporting Database]

ADaM or RDB?

• Well established reporting requirements
• AZ Reporting Database standards defined and in use before CDISC considered
• Perception that ADaM model still quite unstable and subject to significant change
• Unlike SDTM, no regulatory pressure to implement ADaM

Reporting Database

[Diagram: RAW Module Package datasets from sources such as WBDC, LAB, GRand, AMOS and CRO form the Study Database (RAW data), which is mapped to SDTM (data domains and supplemental qualifiers). The Reporting Database holds the unaltered source data in SDTM format, with supplemental qualifiers and key ID variables, plus derived variables and derived observations. Superset datasets: R_AE, R_DM, R_VS, etc. New datasets: RD_xx, RH_xx, etc.]

Reporting Datasets (R_)

• Datasets must remain fundamentally unchanged from the SDTM source data. An R_ dataset is a superset of the SDTM dataset
• Original SDTM dataset name retained, but prefixed with R_
• All information from SUPP-- datasets re-attached to parent RDB dataset

[Diagram: SDTM datasets VS and SUPPVS combined into the RDB dataset R_VS (superset), with additional variables and observations]
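
Re-attaching supplemental qualifiers to their parent dataset can be sketched roughly as below. The SUPP-- variables used (USUBJID, IDVAR, IDVARVAL, QNAM, QVAL) are standard SDTM; the qualifier name `VSCUFF` and all data values are hypothetical, and the sketch assumes IDVAR is always populated.

```python
def reattach_supp(parent_rows, supp_rows):
    """Return copies of the parent rows with SUPP-- qualifiers added as columns."""
    merged = [dict(row) for row in parent_rows]  # never modify the SDTM source
    qnames = sorted({supp["QNAM"] for supp in supp_rows})
    for row in merged:
        for qnam in qnames:
            row.setdefault(qnam, "")  # every row gets every qualifier column
    for supp in supp_rows:
        for row in merged:
            if row["USUBJID"] != supp["USUBJID"]:
                continue
            # IDVAR names the parent key variable; IDVARVAL holds its value as text.
            if str(row[supp["IDVAR"]]) == supp["IDVARVAL"]:
                row[supp["QNAM"]] = supp["QVAL"]
    return merged


# Illustrative data only.
vs = [
    {"USUBJID": "D06-608-123", "VSSEQ": 1, "VSTESTCD": "SYSBP", "VSORRES": "120"},
    {"USUBJID": "D06-608-123", "VSSEQ": 2, "VSTESTCD": "DIABP", "VSORRES": "80"},
]
suppvs = [
    {"USUBJID": "D06-608-123", "IDVAR": "VSSEQ", "IDVARVAL": "1",
     "QNAM": "VSCUFF", "QVAL": "LARGE"},
]

r_vs = reattach_supp(vs, suppvs)
for row in r_vs:
    print(row)
```

The result keeps every original VS variable and observation and only adds columns, which matches the "superset" requirement for R_ datasets.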

RDB General Conventions

• All reporting must take place directly from Reporting Database defined at study level
• All variables used for reporting must be created in relevant reporting dataset
• Subject datasets must have at least 1 observation per randomized subject
• All SDTM data must be present in Reporting Database
• Original SDTM data cannot be amended, but new variables or observations can be created as needed (e.g., imputing dates)
• All naming conventions defined by SDTM must be followed when generating additional variables

RDB Common Dataset Features

• Datasets taken from source database – name prefixed with R_ (e.g., DM becomes R_DM)
• New derived datasets – name prefixed with RD_ (e.g., RD_SUBJ)
• Transposed datasets – name prefixed with RH_ (e.g., R_LB becomes RH_LB)
• Datasets must contain Key variables to uniquely identify every observation
• Duplication of variables across multiple datasets should be avoided (except for Key and Cross variables)
• Duplication of source (SDTM) variables should be avoided
• Variables defined at a higher level must not have attributes changed, except in the following circumstances:
  • Length may be increased
  • Algorithm may be project-specific
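
The attribute rule above lends itself to an automated check. The following is a minimal sketch, assuming variable definitions are held as simple dictionaries; the attribute names and the `AGEGRP` example are hypothetical.

```python
def override_violations(parent, child):
    """List the ways a lower-level definition illegally changes a higher-level one."""
    violations = []
    for attr, parent_value in parent.items():
        child_value = child.get(attr, parent_value)
        if attr == "algorithm":
            continue  # the algorithm may be project-specific
        if attr == "length":
            if child_value < parent_value:  # length may only be increased
                violations.append(
                    f"length reduced from {parent_value} to {child_value}"
                )
        elif child_value != parent_value:
            violations.append(
                f"{attr} changed from {parent_value!r} to {child_value!r}"
            )
    return violations


# Hypothetical variable definition at Corporate level.
corporate_def = {"name": "AGEGRP", "type": "char", "length": 10,
                 "label": "Age group", "algorithm": "corporate rule"}
ok_project_def = dict(corporate_def, length=20, algorithm="project rule")
bad_project_def = dict(corporate_def, length=8, label="Age band")

print(override_violations(corporate_def, ok_project_def))
print(override_violations(corporate_def, bad_project_def))
```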

RDB Use of Codes and Decodes

• Historically, codes and decodes used widely
• Associated using SAS formats
• Loses all meaning outside of SAS
• SDTM does not use codes and decodes
• Variables defined using explicit text values to describe observations
• Clear, unambiguous and interpretable irrespective of the tools or software used
• RDB based on SDTM
• Codes and decodes not used in final reporting datasets
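
Replacing stored codes with explicit text, so the data no longer depends on a SAS format, might look like the sketch below. The RAW variable name `SEVERITY` and the sample rows are assumptions; `AESEV` is the SDTM severity variable.

```python
# Decode table that would previously have lived in a SAS format.
SEVERITY_DECODE = {1: "MILD", 2: "MODERATE", 3: "SEVERE"}


def decode_column(rows, source_var, target_var, decode):
    """Return new rows where a coded column is replaced by its text value."""
    decoded = []
    for row in rows:
        code = row[source_var]
        if code not in decode:
            raise KeyError(f"no decode defined for {source_var}={code!r}")
        new_row = {k: v for k, v in row.items() if k != source_var}
        new_row[target_var] = decode[code]
        decoded.append(new_row)
    return decoded


# Illustrative RAW rows (hypothetical variable name SEVERITY).
raw_ae = [
    {"USUBJID": "D06-608-123", "AETERM": "HEADACHE", "SEVERITY": 1},
    {"USUBJID": "D06-608-123", "AETERM": "NAUSEA", "SEVERITY": 3},
]

ae = decode_column(raw_ae, "SEVERITY", "AESEV", SEVERITY_DECODE)
for row in ae:
    print(row)
```

Failing loudly on an unmapped code is a deliberate choice here: a silent blank would hide a gap in the decode table.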

Transposed Datasets

• RAW datasets may be transposed to contain re-structured RAW data (e.g., RH_dataset = horizontal structure, RV_dataset = vertical)
• Normally only considered for Findings domains
• Original dataset must still exist as R_dataset
• May make reporting easier (e.g., lab parameters reported as columns)

Transposed Datasets

• Carefully consider whether transposed data is essential and/or appropriate
• Duplicates data
• Variable names driven by --TESTCD can be meaningless, e.g.:

Variable   Label
USUBJID    Unique subject identifier
VISIT      Visit name
L01101     Alanine Aminotransferase (ukat/L)
L01118     Albumin (g/L)
L01104     Alkaline Phosphatase (ukat/L)
L01102     Aspartate Aminotransferase (ukat/L)

• Significant loss of information
• e.g., original results, units, reference ranges, analysis flags, etc.
• Contravenes CDISC SDTM convention to store units as a separate variable qualifier to the test result
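
The trade-off described above shows up clearly in a small transposition sketch. The test codes are taken from the example table; the result values are invented for illustration, and the function is an assumption about how such a transposition could be written, not the A&RT implementation.

```python
def transpose_to_horizontal(rows):
    """Turn one-row-per-test lab data into one row per subject and visit."""
    wide = {}
    for row in rows:
        key = (row["USUBJID"], row["VISIT"])
        record = wide.setdefault(
            key, {"USUBJID": row["USUBJID"], "VISIT": row["VISIT"]}
        )
        # The test code becomes the column name; only the numeric result
        # survives, so units and other qualifiers are dropped.
        record[row["LBTESTCD"]] = row["LBSTRESN"]
    return list(wide.values())


# Illustrative vertical data (values are made up).
r_lb = [
    {"USUBJID": "D06-608-123", "VISIT": "VISIT 1", "LBTESTCD": "L01101",
     "LBSTRESN": 0.45, "LBSTRESU": "ukat/L"},
    {"USUBJID": "D06-608-123", "VISIT": "VISIT 1", "LBTESTCD": "L01118",
     "LBSTRESN": 42.0, "LBSTRESU": "g/L"},
]

rh_lb = transpose_to_horizontal(r_lb)
print(rh_lb)
```

The horizontal row is convenient for reporting, but nothing in it says what `L01101` is or which unit applies, which is why the vertical R_ dataset must still exist alongside it.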

Example SDTM to RDB map

Lessons learned

• Mapping takes a lot of effort!
• Ambiguity in guidance
• Individual opinions and interpretations
• Get your conventions right
• Often had to revisit decisions as experience grew
• Big differences between CRF and SDTM standards:
  • Purpose: data collection vs. data storage
  • Coding: codes vs. text (e.g., 1, 2, 3 vs. mild, moderate, severe)
  • Structure: horizontal vs. vertical
• SDTM IG v3.1.2 a big improvement
• Introduction of Clinical Findings (CF) domain really helped with many difficult mappings

Changes for SDTM IG v3.1.2 – CF

[Diagram: SDTM dataset types: General Observation Classes (Interventions, Events, Findings); Special Purpose Datasets (Demographics, Comments); Related Records; Supplemental Qualifiers; Trial Design]

Clinical Findings (CF) Domain

• Findings about Events or Interventions that don’t fit in SDTM domain variables for those classes
• CFOBJ (Object of Measurement): Event or Intervention that is the subject of the test evaluation
• Mandatory, but won’t necessarily have a parent record in another domain

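[Annotated CRF: fields labelled MHTERM, MHOCCUR, MHPRESP, MHSTDTC and MHCAT]

[Annotated CRF: fields labelled CFOBJ, CFTESTCD = OCCUR, CFORRES = answer provided in checkbox, and CFCAT]

[Annotated CRF: fields labelled CFOBJ, CFTEST, CFORRES and CFCAT]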
Example

Row | USUBJID     | CFSEQ | CFOBJ                 | CFTEST                 | CFTESTCD | CFDTC
1   | D06-608-123 | 1     | HYPERTENSION          | OCCURRENCE             | OCCUR    | 2006-08-28
2   | D06-608-123 | 2     | MYOCARDIAL INFARCTION | OCCURRENCE             | OCCUR    | 2006-08-28
3   | D06-608-123 |       | MYOCARDIAL INFARCTION | DATE OF MOST RECENT MI | MY_LDAT  | 2006-06-20
4   | D06-608-123 |       | MYOCARDIAL INFARCTION | NUMBER OF MI           | MYNO     | 2006-08-28

(continued)

Row | USUBJID     | VISITNUM | CFORRES    | CFSTRESC   | CFCAT
1   | D06-608-123 | 1        | CURRENT    | CURRENT    | SPECIFIC CV MEDICAL AND SURGICAL HISTORY
2   | D06-608-123 | 1        | PAST       | PAST       | SPECIFIC CV MEDICAL AND SURGICAL HISTORY
3   | D06-608-123 | 1        | 2006-06-20 | 2006-06-20 | SPECIFIC CV MEDICAL AND SURGICAL HISTORY
4   | D06-608-123 | 1        | 2          | 2          | SPECIFIC CV MEDICAL AND SURGICAL HISTORY
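
A horizontal CRF record can be turned into vertical CF rows like those in the example with a small specification table. This is a sketch under assumptions: the RAW field names (`HYPERT`, `MI`, `MI_LDAT`, `MI_NO`) are hypothetical, and CFDTC is simply set to the collection date on every row.

```python
# (raw field, CFOBJ, CFTESTCD, CFTEST); raw field names are hypothetical.
CF_SPEC = [
    ("HYPERT", "HYPERTENSION", "OCCUR", "OCCURRENCE"),
    ("MI", "MYOCARDIAL INFARCTION", "OCCUR", "OCCURRENCE"),
    ("MI_LDAT", "MYOCARDIAL INFARCTION", "MY_LDAT", "DATE OF MOST RECENT MI"),
    ("MI_NO", "MYOCARDIAL INFARCTION", "MYNO", "NUMBER OF MI"),
]
CF_CATEGORY = "SPECIFIC CV MEDICAL AND SURGICAL HISTORY"


def crf_to_cf(usubjid, visitnum, collected_on, crf_record):
    """Build one CF row per answered CRF field."""
    rows = []
    for field, cfobj, testcd, test in CF_SPEC:
        answer = crf_record.get(field)
        if answer in (None, ""):
            continue  # unanswered fields produce no CF record
        rows.append({
            "USUBJID": usubjid,
            "CFSEQ": len(rows) + 1,
            "CFOBJ": cfobj,
            "CFTESTCD": testcd,
            "CFTEST": test,
            "CFCAT": CF_CATEGORY,
            "CFORRES": str(answer),
            "VISITNUM": visitnum,
            "CFDTC": collected_on,  # simplification: collection date throughout
        })
    return rows


crf = {"HYPERT": "CURRENT", "MI": "PAST", "MI_LDAT": "2006-06-20", "MI_NO": 2}
cf = crf_to_cf("D06-608-123", 1, "2006-08-28", crf)
for row in cf:
    print(row)
```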


Summary

• CDISC Implementation is a huge task
• AZ strategy allows for step-wise implementation
• CDASH
• ADaM
• Mapping tool really assists process
• Easy inheritance
• Reuse of standards and code
• SDTM IG v3.1.2 big improvement

Questions and Answers

Thank You
