© 2009 Octagon Research Solutions, Inc. All Rights Reserved. 1 Octagon Research Solutions, Inc. Leading the Electronic Transformation of Clinical R&D © 2009 Octagon Research Solutions, Inc. All Rights Reserved.

Dec 18, 2015


Arnold Sullivan
Transcript
Page 1:

Octagon Research Solutions, Inc.
Leading the Electronic Transformation of Clinical R&D

Page 2:

Data Profiling

Octagon Research Solutions

Page 3:

Metadata Profiling

• Metadata (structure)
  – Likeness of nomenclature among study databases
  – Answer some planning questions:
    • Claim: “The studies are 90% identical.” Are they?
    • If they indeed are, can you create pool(s) of source data to gain efficiency?

Not our main focus today

Page 4:

Data Profiling

• Data (content)
  – Statistics, e.g., min, max, average
  – Relationship
  – Pattern

Fact: Data are often “bad, worse, or ugly”

Goal: Get a realistic pulse on quality of the data
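
The content statistics named on this slide (min, max, average) can be sketched in a few lines of Python. This is only an illustration: the helper name `profile_numeric` and the sample values are hypothetical and are not taken from the presentation or its tool.

```python
from statistics import mean

def profile_numeric(values):
    """Return count, missing count, min, max, and mean for a list of numbers.

    None is treated as a missing value and excluded from the statistics.
    """
    present = [v for v in values if v is not None]
    if not present:
        return {"n": 0, "missing": len(values),
                "min": None, "max": None, "mean": None}
    return {
        "n": len(present),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": mean(present),
    }

# Hypothetical lab values; None marks a missing result.
sample = [4.0, 5.0, None, 6.0]
summary = profile_numeric(sample)
```

Reporting the missing count next to the statistics matters because a plausible-looking mean can hide a column that is mostly empty.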

Page 5:

Case Study (“Slightly” Altered for Illustration Purposes)

• Background
  – Central lab, i.e., eDT
    • CHEM for biochemistry (20807 records), along with 4 other labs
  – No annotated CRF
    • Mapping document initially authored using variable label

Page 6:

Case Study (cont’d)

• Sponsor decisions:
  – Match standard results with original results, i.e., no unit conversion; therefore, LBSTRESC = LBORRES
  – LPARM to (LBTEST and LBTESTCD) will be done through a sponsor-supplied lookup table

Easy enough, right?
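
The two sponsor decisions can be sketched as a small mapping function. Everything specific here is an assumption for illustration: the lookup rows, the sample LPARM values, and the helper name `map_lab_record` are hypothetical, since the sponsor-supplied table is not shown in the slides.

```python
# Hypothetical lookup rows: LPARM -> (LBTESTCD, LBTEST).
# The real table is sponsor-supplied and not reproduced here.
LOOKUP = {
    "GLUCOSE": ("GLUC", "Glucose"),
    "SGPT": ("ALT", "Alanine Aminotransferase"),
}

def map_lab_record(lparm, lvalue):
    """Map one source record to a dict of LB variables.

    The test fields are None when LPARM is not in the lookup, so that
    unmapped parameters can be reported instead of silently dropped.
    """
    testcd, test = LOOKUP.get(lparm.strip().upper(), (None, None))
    return {
        "LBTESTCD": testcd,
        "LBTEST": test,
        "LBORRES": lvalue,
        "LBSTRESC": lvalue,  # sponsor decision: no unit conversion
    }

mapped = map_lab_record("glucose ", "5.4")
unmapped = map_lab_record("XYZ", "1")
```

Note that nothing in this sketch looks at what `lvalue` actually contains, which is exactly the gap the following slides expose.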

Page 7:

High-level mapping based on source dataset metadata

Page 8:

Case Study (cont’d)

• Programmer noticed errors
  – LBSTRESN is a numeric variable, but CHEM.LVALUE contains non-numeric data
• Programmer determined that the mapping specifications document was not detailed enough and began to involve the analyst

Page 9:

Case Study (cont’d)

• Let’s look at some options at their disposal (novice to veteran):
  – SAS System Viewer
  – A creative method by an Excel-savvy user
  – SAS PROC FREQ

Page 10:

Case Study (cont’d)

• SAS System Viewer
  – Read-only, great for displaying data
  – Unreliable as a data browser
• Analyze data in Excel
  – Very manual
  – Changes of data ownership, possible “lost in translation”?
    • “Smart” behaviors, e.g., “01JAN2009 12:00” to “1/1/2009 12:00:00 PM”, auto-trimming, etc.
• SAS PROC FREQ
  – CHEM.LVALUE: 20807 records reduced to 1237 unique values
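
The frequency-count idea behind the PROC FREQ option can be sketched in Python. This is an analogue for illustration, not SAS code and not the programmer's actual step; the sample values are hypothetical.

```python
from collections import Counter

def value_frequencies(values):
    """Count occurrences of each distinct value, most frequent first."""
    return Counter(values).most_common()

# Hypothetical LVALUE contents, not the study data.
lvalues = ["5.4", "5.4", "12", "<0.1", "12", "5.4"]
freq = value_frequencies(lvalues)
```

A frequency table removes exact duplicates only, so a column of mostly distinct measurements stays long; that is why 1237 unique values were still left to review by eye.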

Page 11:

Case Study (cont’d)

• 4th option
  – A data pattern analyzer

Page 12:

Case Study (cont’d)

– Reduced 20807 records to only 11 patterns

Aha, we found the needle in the haystack! 0.3% of LVALUE is not numeric.
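
One common way to build such patterns is to mask every character by its class and then count the masks. The slides do not say how the demonstrated tool derives its patterns, so the masking rules below (digit to `9`, letter to `A`, everything else kept) are an assumption, and the sample values are hypothetical.

```python
from collections import Counter

def to_pattern(value):
    """Mask a string: digits become 9, letters become A, other characters stay."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)

def pattern_counts(values):
    """Count how many values share each masked pattern."""
    return Counter(to_pattern(v) for v in values)

# Hypothetical LVALUE contents, not the study data.
lvalues = ["5.4", "7.1", "12", "<0.1", "TNP", "98"]
patterns = pattern_counts(lvalues)
```

Because many different numbers share one mask, the list of patterns is far shorter than the list of unique values, and the non-numeric masks stand out immediately.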

Page 13:

Case Study (cont’d)

– Drilled down to the actual values with non-numeric data patterns
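
A drill-down of this kind can be sketched by listing each offending value together with the rows where it occurs. The definition of "numeric" in the regular expression is an assumption, and the helper name and sample values are hypothetical.

```python
import re
from collections import defaultdict

# Assumed definition of "numeric": optional sign, digits, optional decimals.
NUMERIC = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")

def non_numeric_values(values):
    """Group values that fail the numeric check by the row indexes where they occur."""
    hits = defaultdict(list)
    for i, v in enumerate(values):
        if not NUMERIC.fullmatch(v.strip()):
            hits[v].append(i)
    return dict(hits)

# Hypothetical LVALUE contents, not the study data.
lvalues = ["5.4", "<0.1", "12", "TNP", "<0.1"]
offenders = non_numeric_values(lvalues)
share = sum(len(rows) for rows in offenders.values()) / len(lvalues)
```

Keeping the row indexes makes it possible to quote concrete records when raising the issue with the sponsor.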

Page 14:

Through issue/resolution with the sponsor, added detailed instructions for LVALUE to accommodate the non-numeric values

Page 15:

Another Data Pattern Example #1

• Source: Character variable AEV.STOP (AE stop date), being mapped to AE
• Realized source is “somewhat” a free-form field
  – Critical data point, must handle case-by-case using regular expression (regex) technique
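
The case-by-case regex handling can be sketched as an ordered list of rules, each turning one recognized layout into an ISO 8601 string (the format SDTM date/time variables use). The three layouts below are hypothetical examples; the actual contents of AEV.STOP are not shown in the slides, and `to_iso8601` is an illustrative name.

```python
import re

MONTHS = {m: i for i, m in enumerate(
    ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
     "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"], start=1)}

# Each rule: (compiled regex, function building an ISO 8601 string from the match).
RULES = [
    # 12JAN2009 or 12-JAN-2009 -> 2009-01-12
    (re.compile(r"(\d{1,2})[- ]?([A-Z]{3})[- ]?(\d{4})"),
     lambda m: f"{m[3]}-{MONTHS[m[2]]:02d}-{int(m[1]):02d}"),
    # JAN2009 (day unknown) -> 2009-01
    (re.compile(r"([A-Z]{3})[- ]?(\d{4})"),
     lambda m: f"{m[2]}-{MONTHS[m[1]]:02d}"),
    # 2009 (year only) -> 2009
    (re.compile(r"(\d{4})"), lambda m: m[1]),
]

def to_iso8601(raw):
    """Return an ISO 8601 date (possibly partial), or None if no rule applies."""
    text = raw.strip().upper()
    for regex, build in RULES:
        m = regex.fullmatch(text)
        # Reject matches whose three-letter group is not a real month name.
        if m and all(g in MONTHS for g in m.groups() if g.isalpha()):
            return build(m)
    return None
```

Values that return `None` (for example "ONGOING") are the ones to raise with the sponsor rather than guess at. The sketch does not validate day ranges, so a value such as 32JAN2009 would still need a separate check.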

Page 16:

Another Data Pattern Example #2

• Source: Character variable DOSE.DOSE_ACT (Actual dose), being mapped to EX
• Realized source does not always contain numbers
  – Used both EX.EXDOSE and EX.EXDOSTXT
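
Using both variables can be sketched as a simple routing rule: plain numbers go to EXDOSE, anything else is kept as text in EXDOSTXT. The numeric test and the helper name `split_dose` are assumptions for illustration, not the mapping rule actually agreed with the sponsor.

```python
import re

# Assumed definition of a numeric dose: digits with an optional decimal part.
NUMERIC_DOSE = re.compile(r"\d+(\.\d+)?")

def split_dose(dose_act):
    """Return (EXDOSE, EXDOSTXT); at most one is populated, both None if blank."""
    text = dose_act.strip()
    if not text:
        return None, None
    if NUMERIC_DOSE.fullmatch(text):
        return float(text), None
    return None, text
```

For example, `split_dose("50")` yields a numeric dose, while a range such as "50-100" is preserved verbatim as text instead of being forced into a number.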

Page 17:

Wrapping Up

• Integrated data profiling – a tool demo
• The bigger picture:
  – Data rules (e.g., pre-defined business rules, data standards, etc.)
  – Data corrections
• Although ETL is a solution platform for CDISC SDTM data conversion, too much of it is a symptom of a problem

Page 18:

Thank you!

Anthony Chow

[email protected]

(610) 535-6500 x5526