define.xml: A Crash Course
Frank DiIorio, CodeCrafters, Inc., Philadelphia PA

Remember define.pdf?
• Purpose: document deliverables
    Datasets: description, structure, sort order
    Variables: attributes, codes, derivation, et al.
• Created using:
    Metadata, SAS macros
• Contents validated by:
    Visual inspection
    Programmatic checks of the metadata
• FDA now requests define.xml, aka CDISC’s “Case Report Tabulation Data Definition Specification”
• And conceptually it resembles define.pdf …
define.xml: Dataset-Level (transformed by XSL)
define.xml: Variable-Level (transformed by XSL)
define.xml: Similar, but …
• define.xml differs from define “classic”:
    Unlike a PDF, it is easily machine-readable
    It follows a strictly defined format (schema)
    It’s “meatier” than define.pdf, requiring much richer metadata
    Requires validation of
      • syntax
      • compliance with schema
• Clearly, we’re dealing with something new and complex
This Presentation …
• Briefly reviews XML basics
• Describes metadata needed to support construction of define.xml
• Presents one way to build the XML file
• Shows how to validate the file
• Discusses define.pdf (no, not that define.pdf!)
• Focuses on define Version 1 but identifies issues relevant to Version 2
• Is simply an overview of the file creation and validation process
XML Basics
• Extensible Markup Language: plain text with mark-up (“tags”) similar in look & feel to HTML
• Content is user-defined, by schemas
• Files are collections of elements (aka “nodes”), each of which can have one or more attributes. Elements can be arranged in a hierarchy.
• Unlike HTML, emphasis is on data content, not its display
• XML is part of a “family” of specifications
    XSL – transforms XML into another format
    XPath – navigates within the document. Used by XSL.
    XSD/Schema – defines rules for content and structure of an XML file
XML Basics, Illustrated
[Figure: annotated XML fragment. Callouts:]
• “Study” element
• “OID” attribute of “Study” element
• Element hierarchy: “GlobalVariables” is child of “Study”
• Schema specifies which elements can repeat
• Schema specifies valid attribute values
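The annotated fragment from this slide did not survive in the transcript. As a rough stand-in, the sketch below parses a small made-up fragment in Python and picks out the same pieces the callouts name: an element, one of its attributes, and its child. The OID and study values are hypothetical, and Python is used only for illustration; it is not a tool named in the presentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment for illustration only; not taken from a real study.
SAMPLE = """
<Study OID="STUDY.EXAMPLE">
  <GlobalVariables>
    <StudyName>Example Study</StudyName>
    <StudyDescription>Illustrative fragment</StudyDescription>
    <ProtocolName>EX-001</ProtocolName>
  </GlobalVariables>
</Study>
""".strip()

study = ET.fromstring(SAMPLE)                  # the "Study" element
study_oid = study.get("OID")                   # the "OID" attribute of "Study"
child_tags = [child.tag for child in study]    # direct children of "Study"
global_vars = study.find("GlobalVariables")    # "GlobalVariables" is a child of "Study"
leaf_values = {el.tag: el.text for el in global_vars}

print(study.tag, study_oid)
print(child_tags)
print(leaf_values)
```

Nothing here checks the schema rules mentioned in the callouts (which elements may repeat, which attribute values are valid); a parser only confirms that the text is well-formed XML.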
define.xml Basics
• define.xml must be valid from two perspectives:
    Syntax
    Content (compliance with schema)
• define schema/content
    An extension of the CDISC Operational Data Model (ODM)
    Schema controls content, not display
      • Rules for names, attributes, number of occurrences, order of nodes, etc.
      • A value can conform to the schema but still be wrong! (e.g., type is Integer but really should be Float)
    Available at CDISC, OpenCDISC web sites
    Determining what goes where is, arguably, the hardest part of the file creation process.
Node Order
[Figure: start of OpenCDISC XML file showing node order]
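Because the schema dictates the order of nodes, a home-grown check can compare the children of an element against an expected sequence. The sketch below assumes the ODM convention that the children of Study appear as GlobalVariables, BasicDefinitions, MetaDataVersion; that list, the helper names, and the sample fragments are assumptions for illustration, and a real check should take the order from the schema itself.

```python
import xml.etree.ElementTree as ET

# Assumed child order for "Study"; confirm against the schema you are using.
EXPECTED_ORDER = ["GlobalVariables", "BasicDefinitions", "MetaDataVersion"]

def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]

def children_in_order(parent, expected):
    """Return True if the parent's children follow the expected sequence.

    Children may be missing or repeated, but may not appear out of order,
    and names outside the expected list are rejected.
    """
    position = 0
    for child in parent:
        name = local_name(child.tag)
        if name not in expected:
            return False
        index = expected.index(name)
        if index < position:
            return False
        position = index
    return True

# Hypothetical fragments: one in the assumed order, one reversed.
good = ET.fromstring(
    "<Study OID='S1'><GlobalVariables/><MetaDataVersion OID='M1' Name='x'/></Study>"
)
bad = ET.fromstring(
    "<Study OID='S1'><MetaDataVersion OID='M1' Name='x'/><GlobalVariables/></Study>"
)
print(children_in_order(good, EXPECTED_ORDER))
print(children_in_order(bad, EXPECTED_ORDER))
```

A check like this supplements a schema validator; it does not replace one, since it ignores required elements and occurrence limits.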
What You’ll Need
• An XML Viewer/Editor (display ODM schema, define.xml, XSL) such as:
    XMLPad
    SAS XML Mapper
• Validator
    OpenCDISC
    SAS Clinical Standards Toolkit
    XML4Pharma
    Can be supplemented with home-grown tools
• Knowledge and patience
    W3Schools.com, other sites/books
Between the Tags: Metadata
• Metadata
    Drives the creation of the XML
    And can also be used for various tasks throughout the project life cycle (next slide)
• Metadata tables can include:
    Study-level: protocol name, standard name/version
    Datasets: name, structure, key fields
    Variables: attributes, controlled terminology usage, derivation/CRF source
    Value: detail of variable values (test codes, etc.)
    Comp. algorithms: extended and/or repeated derivations
    Controlled terms: descriptions and values of coded/enumerated variables
    Results: description of TFLs – name, content, source(s), etc. (new in define v2)
Metadata: Usage Throughout Study Life Cycle
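[Figure: diagram relating the Variables table to the macros that use it over the life of a study.
 Variables table columns: domain, variable, type, length, label, order, definitionProg, definitionSub, use, crflocation, core.
 Macros: %cre8Spec, %attrib, %domSplit, %domChk, %crXFDF, %xpt, %defXML, %defPDF.
 Stages: study setup, EDC/raw, program, validate, domain, XPT, define.xml/pdf.
 Other labels: \study\data\prog; metadata, config, sdtm, adam; blankcrf.pdf; export, define, other, dataset, spec.]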
Metadata Issues
• Design
    Ideally, maps (directly or through views) to XML elements and attributes with a minimum of transformation
    Should be sensitive to changes in standards:
      • define.xml
      • data (SDTM, ADaM)
• Storage
    The metadata should be regarded as a valuable corporate asset.
    So don’t store it in Excel! Oracle or a similar enterprise-level database is a far better choice (though more resource intensive).
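The design point about views can be sketched as follows. SQLite stands in for an enterprise database only so the example is self-contained; the table layout borrows column names from the Variables table shown earlier, while the view name `itemdef_v`, the OID convention, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE variables (
        domain   TEXT NOT NULL,
        variable TEXT NOT NULL,
        type     TEXT NOT NULL,
        length   INTEGER,
        label    TEXT,
        PRIMARY KEY (domain, variable)
    );
    INSERT INTO variables VALUES ('DM', 'USUBJID', 'text', 20, 'Unique Subject Identifier');
    INSERT INTO variables VALUES ('DM', 'AGE', 'integer', 8, 'Age');

    -- View whose column names match the XML attribute names, so the
    -- program that writes the XML needs little or no transformation.
    CREATE VIEW itemdef_v AS
    SELECT domain || '.' || variable AS OID,
           variable                  AS Name,
           type                      AS DataType,
           length                    AS Length,
           label                     AS Label
    FROM variables;
""")
conn.row_factory = sqlite3.Row
rows = [dict(r) for r in conn.execute("SELECT * FROM itemdef_v ORDER BY OID")]
for row in rows:
    print(row)
```

If the standard renames an attribute, only the view changes; the stored metadata and the people who maintain it are unaffected.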
Metadata Issues: Entry (Dataset-Level)
Metadata Issues: Entry (Variable-Level)
Building the XML
• Many ways to do this, among them
    SAS Clinical Standards Toolkit
    Brute force: Macros, DATA steps
      • Benefits: extreme flexibility with respect to order of dataset display, control of Comments content, selection of XSL, etc. Also, tool (macros) can perform XML validation, create ZIP file of deliverables
      • Drawbacks: lots of code; has to be responsive to changes in the standards
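The presentation's brute-force approach uses SAS macros and DATA steps, which are not shown in the transcript. The sketch below swaps in Python purely to illustrate the idea of turning variable-level metadata rows into XML elements. The namespace URIs, the ItemDef attribute names (including `def:Label`), the OID values, and the metadata rows are all assumptions to be checked against the define.xml Version 1 schema.

```python
import xml.etree.ElementTree as ET

# Namespace URIs assumed for define.xml Version 1; confirm against your schema.
ODM_NS = "http://www.cdisc.org/ns/odm/v1.2"
DEF_NS = "http://www.cdisc.org/ns/def/v1.0"
ET.register_namespace("", ODM_NS)
ET.register_namespace("def", DEF_NS)

# Hypothetical variable-level metadata rows.
VARIABLES = [
    {"domain": "DM", "variable": "USUBJID", "type": "text", "length": 20,
     "label": "Unique Subject Identifier"},
    {"domain": "DM", "variable": "AGE", "type": "integer", "length": 8,
     "label": "Age"},
]

def build_item_defs(rows):
    """Create one ItemDef element per metadata row under a MetaDataVersion."""
    mdv = ET.Element(f"{{{ODM_NS}}}MetaDataVersion",
                     {"OID": "MDV.1", "Name": "Example"})
    for row in rows:
        ET.SubElement(mdv, f"{{{ODM_NS}}}ItemDef", {
            "OID": f"{row['domain']}.{row['variable']}",
            "Name": row["variable"],
            "DataType": row["type"],
            "Length": str(row["length"]),
            f"{{{DEF_NS}}}Label": row["label"],
        })
    return mdv

mdv = build_item_defs(VARIABLES)
xml_text = ET.tostring(mdv, encoding="unicode")
print(xml_text)
```

Building the tree with a library rather than writing tags as text means escaping and well-formedness are handled for you; conformance to the schema still has to be validated separately.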
Building (or not) the XSL
• XSL transforms XML into other formats (HTML is the most common) and makes the XML reader friendly.
• Since the define XML is in a predictable format, transformation of any file for any study can be done with a standard XSL file (the “XML Promise”)
• The XSL is identified by a reference in the XML (an xml-stylesheet processing instruction near the top of the file)
• Consider whether the sponsor will accept the XSL (ActiveX, JavaScript, security considerations)
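The slide's example of the stylesheet reference is missing from the transcript. In XML generally, a stylesheet is associated with a document through an `xml-stylesheet` processing instruction placed before the root element. The sketch below, under the assumption of a made-up document and a hypothetical file name `define.xsl`, reads that instruction back with Python's `minidom`.

```python
from xml.dom import minidom

# Hypothetical document; the stylesheet file name is an assumption.
DOC = (
    '<?xml version="1.0"?>\n'
    '<?xml-stylesheet type="text/xsl" href="define.xsl"?>\n'
    '<ODM FileOID="EXAMPLE"/>'
)

def stylesheet_instructions(xml_text):
    """Return the data of every top-level xml-stylesheet processing instruction."""
    dom = minidom.parseString(xml_text)
    return [
        node.data
        for node in dom.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        and node.target == "xml-stylesheet"
    ]

found = stylesheet_instructions(DOC)
print(found)
```

A check like this is a quick way to confirm that a delivered define.xml points at the XSL file actually shipped alongside it.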
Sample XSL from Early CDISC Pilot

<!-- ***************************************** -->
<!-- Code List Items                           -->
<!-- ***************************************** -->
<xsl:if test="/odm:ODM/odm:Study/odm:MetaDataVersion/odm:CodeList[odm:CodeListItem]">
  <div id="decodelist">
    <xsl:for-each select="/odm:ODM/odm:Study/odm:MetaDataVersion/odm:CodeList[odm:CodeListItem]">
      <fieldset>
        <xsl:attribute name="id">CL.<xsl:value-of select="@OID"/></xsl:attribute>
        <legend>Code List - <xsl:value-of select="@Name"/>, Reference Name(<xsl:value-of select="@OID"/>) </legend>
        <table>
Callouts on the sample:
• Syntax resembles XML
• Inclusion of “pure” HTML
• The XSL can build HTML statements
• Element selection requires knowledge of XPath

Coding of XSL can dramatically affect transformation and readability of an XML file, as shown in next slides …
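To see what the XPath expression in the sample selects, the same path can be tried outside XSL. The sketch below uses the limited XPath support in Python's `xml.etree.ElementTree` on a heavily trimmed, made-up document; the namespace URI and the code list contents are assumptions.

```python
import xml.etree.ElementTree as ET

NS = {"odm": "http://www.cdisc.org/ns/odm/v1.2"}  # assumed ODM namespace URI

# Hypothetical, heavily trimmed document for illustration.
DOC = """
<ODM xmlns="http://www.cdisc.org/ns/odm/v1.2">
  <Study OID="S1">
    <MetaDataVersion OID="M1" Name="Example">
      <CodeList OID="CL.SEX" Name="Sex" DataType="text">
        <CodeListItem CodedValue="F"/>
        <CodeListItem CodedValue="M"/>
      </CodeList>
      <CodeList OID="CL.EMPTY" Name="No items" DataType="text"/>
    </MetaDataVersion>
  </Study>
</ODM>
""".strip()

root = ET.fromstring(DOC)
# Same selection as the XSL sample, written relative to the root ODM element:
# code lists that have at least one CodeListItem child.
code_lists = root.findall(
    "odm:Study/odm:MetaDataVersion/odm:CodeList[odm:CodeListItem]", NS
)
summary = {
    cl.get("OID"): [item.get("CodedValue")
                    for item in cl.findall("odm:CodeListItem", NS)]
    for cl in code_lists
}
print(summary)
```

The predicate in square brackets is what filters out the second code list, which has no CodeListItem children.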
define.xml: Style Sheet 1
The difference is in the HTML created by the XSL, not in the XML itself!
define.xml: Style Sheet 2
The difference is in the HTML created by the XSL, not in the XML itself!
Did We Get It Right? Validating the XML
• Recall define.pdf v. define.xml discussion: different, more stringent and definable validation requirements
• Ensures names/values, attributes, occurrences, order of nodes conform to the schema.
• But we can’t validate that the data makes sense!
    Var. length of 20 may be valid according to the schema, but if length in the dataset was >20, problem lies elsewhere
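The gap between "valid" and "makes sense" can be illustrated with two small home-grown checks: one for syntax only, one comparing declared lengths with the data. This is a sketch under stated assumptions, not a substitute for a schema validator; the function names and sample values are hypothetical, and in practice the observed values would come from the datasets themselves.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Syntax check only: True if the text parses as XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

def length_mismatches(declared, observed_values):
    """Compare declared lengths with the longest value seen in the data.

    declared:        {variable: length stated in the metadata / define.xml}
    observed_values: {variable: list of character values from the dataset}
    Returns {variable: (declared, longest)} where the data exceed the declaration.
    """
    problems = {}
    for name, length in declared.items():
        longest = max((len(v) for v in observed_values.get(name, [])), default=0)
        if longest > length:
            problems[name] = (length, longest)
    return problems

# Hypothetical values for illustration.
declared = {"USUBJID": 20, "SEX": 1}
observed = {"USUBJID": ["STUDY01-SITE001-SUBJ0001"], "SEX": ["F", "M"]}

print(is_well_formed("<ItemDef OID='DM.SEX' Length='1'/>"))
print(is_well_formed("<ItemDef OID='DM.SEX' Length='1'>"))
print(length_mismatches(declared, observed))
```

A file can pass the first check, and a schema validator, while the second check still reports a problem; that problem lies in the metadata or the data, not in the XML.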