OASIS Electronic Trial Master File Standard Technical Committee Content Classification Layer January 20, 2014 9:00 – 10:00 AM PST

OASIS Electronic Trial Master File Standard Technical Committee Content Classification Layer

Feb 24, 2016


yoland

Transcript
Page 1: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

OASIS Electronic Trial Master File Standard Technical Committee

Content Classification Layer

January 20, 2014, 9:00 – 10:00 AM PST

Page 2: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Agenda

Time         Topic                                               Presenter
9:00-9:05    Call to Order & Roll Call                           Zack Schmidt
9:05-9:10    Approval of Minutes                                 All
             https://www.oasis-open.org/committees/documents.php?wg_abbrev=etmf
             TC Process and Administration (deferred)            Chet Ensign
9:10-9:20    Outreach Subcommittee - All                         Jennifer Alpert
9:20-9:50    Tech presentation – Content Classification Layer    Z. Schmidt/Aliaa
9:50-9:55    New Business                                        All
9:55-10:00   Next meeting agenda / Date                          Z. Schmidt

Page 3: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Name                       Company             Voting Status   Present?
Jennifer Alpert Palchak    CareLex             Voter           y
Aliaa Badr                 CareLex             Voter           y
Oleksiy (Alex) Palinkash   CareLex             Voter           y
Troy Jacobson              Forte Research      Voter           y
Lou Chappuie               Individual          Voter           y
Lisa Mulcahy               Individual          Non-Voter       y
Robert Gehrke              Mayo Clinic         Voter           n
Rich Lustig                Oracle              Non-Voter       y
Michael Agard              Paragon Solutions   Non-Voter       y
Christopher McSpiritt      Paragon Solutions   Non-Voter       y
Jamie O’Keefe              Paragon Solutions   Non-Voter       n
Fran Ross                  Paragon Solutions   Non-Voter       y
Peter Alterman             SAFE-BioPharma      Voter           y
Catherine Schmidt          SterlingBio         Voter           y
Zack Schmidt               SureClinical        Voter           y
Trish Whetzel, PhD         SureClinical        Non-Voter       y
Peter Junge                Beijing Sursen      Observer        n
Laura Hilty                Forte Research      Observer        n
Tony O’Hare                Forte Research      Observer        n
Eldin Rammell              Rammell Consulting  Observer        n
Robin Cover                OASIS staff         Non-Voter       n
Chet Ensign                OASIS staff         Non-Voter       n

Roll Call

Page 4: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Meeting Etiquette

• Announce your name prior to making comments or suggestions
• Keep your phone on mute when not speaking (#6)
• Do not put your phone on hold
  – Hang up and dial in again when finished with your other call
  – Hold = Elevator Music = very frustrated speakers and participants
• Meetings will be recorded and posted
  – Another reason to keep your phone on mute when not speaking!
• Use the join.me “Chat” feature for questions / comments / votes
• We will follow Robert’s Rules of Order

NOTE: This meeting is being recorded and minutes will be posted on the TC page after the meeting

From eTMF Std TC to Participants: Hi everyone: remember to keep your phone on mute


Page 5: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

• Status – New Members:
  – Oracle – Joined
  – In Progress: EMC, Kaiser Permanente, Shire, Medtronic
• Activities / Milestones

Outreach Subcommittee

Page 6: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

• Status
• Timeline
• In parallel with other Tech work from charter

Tech Discussion

Page 7: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Classification System Components:

• Classification Categories
  – Taxonomy, hierarchy
• Metadata (‘Tags’)
  – Characterizes content
• Content Model
  – Published set of classifications, metadata for a domain (e.g., eTMF)

Content Classification System Discussion

Page 8: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Classification Categories Component

– Hierarchy of categories

• Categories, subcategories, content types

– Defined relationships with rules: Parent-Child

– All categories, content types required to have unique names and machine codes

– Each content type is associated with Metadata Properties (includes core and domain-specific)

– Content items are linked to content types.

– Unique classification and term codes based on Universal Decimal Classification System (UDC) numbering, widely used in libraries worldwide. Human and machine readable; infinitely expandable

– Can be described, edited and validated using an OWL editor (such as the open source editor Protégé)

– Supports any simple text vocabulary, including TMF Ref Model and other vocabularies

– W3C OWL2 and RDF/XML supported
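The hierarchy rules above (parent-child relationships, unique names and machine codes) can be sketched as a small data structure. This is only an illustration, not the spec's data model: the `Node` and `Taxonomy` names, the `kind` labels, and the rule that a content type is always a leaf are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A category, subcategory, or content type (hypothetical structure)."""
    name: str
    code: str
    kind: str  # "category", "subcategory", or "content_type"
    children: list[Node] = field(default_factory=list)


class Taxonomy:
    """Holds the hierarchy and rejects duplicate names or codes."""

    def __init__(self) -> None:
        self.roots: list[Node] = []
        self._names: set[str] = set()
        self._codes: set[str] = set()

    def add(self, node: Node, parent: Node | None = None) -> Node:
        if node.name in self._names:
            raise ValueError(f"duplicate name: {node.name}")
        if node.code in self._codes:
            raise ValueError(f"duplicate code: {node.code}")
        # Assumption for this sketch: content types are leaves.
        if parent is not None and parent.kind == "content_type":
            raise ValueError("a content type cannot have children")
        self._names.add(node.name)
        self._codes.add(node.code)
        (parent.children if parent is not None else self.roots).append(node)
        return node
```

A caller would add a primary category first, then attach subcategories and content types to it by passing `parent=`; any repeated name or code raises `ValueError`.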

Classification Categories Component

[Figure: Classification Categories Hierarchy; labels include Study and Digital Content]

Page 9: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Metadata Component

– Used to tag or index digital content items

Metadata Classes:

• Core – Comprises four areas: File Properties, Classification, Audit Trail, Business Process
• Domain-specific – Metadata for a domain in life sciences such as eTMF, finance, legal administration, or others. Uses standards-based terms from groups like NCI
• Org-specific – Metadata that meets an organization's needs; not standards based
• General – Obtained from public standards-based vocabulary and terminology resources like Dublin Core

Annotation Properties

– Metadata about classification categories and metadata: Core, Org-Specific metadata

Metadata Component
Core Metadata Example – File Properties:

Page 10: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Content Model Component

– Contains classification hierarchy, metadata in machine readable format:

Content Model Component
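As a rough illustration of a machine-readable content model, the sketch below serializes a two-level hierarchy as RDF/XML using OWL classes and `rdfs:subClassOf`. The `http://example.org/content-model#` namespace, the `Cat100`-style identifiers, and the `build_model` helper are hypothetical; the actual model published with the spec may be organized differently.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
BASE = "http://example.org/content-model#"  # hypothetical namespace

for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def build_model(classes: list[tuple[str, str, str | None]]) -> str:
    """Serialize (identifier, label, parent identifier) entries as RDF/XML."""
    root = ET.Element(f"{{{RDF}}}RDF")
    for ident, label, parent in classes:
        cls = ET.SubElement(
            root, f"{{{OWL}}}Class", {f"{{{RDF}}}about": BASE + ident}
        )
        ET.SubElement(cls, f"{{{RDFS}}}label").text = label
        if parent is not None:
            # Parent-child relationship expressed as a subclass axiom.
            ET.SubElement(
                cls, f"{{{RDFS}}}subClassOf", {f"{{{RDF}}}resource": BASE + parent}
            )
    return ET.tostring(root, encoding="unicode")


print(build_model([
    ("Cat100", "Example Category", None),
    ("Cat100.10", "Example Subcategory", "Cat100"),
]))
```

The output can be opened in an OWL editor such as Protégé for inspection.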

Page 11: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Term Sourcing Concepts:• Terms adopted by standards bodies should be used first in eTMF model

Primary Term Sources for eTMF Classification System:– Internet Standards Dev Orgs: W3C, IETF, ISO, etc.

» Required for interoperability of machine code

– NIH NCIthesaurus: Term database for FDA, CDISC, HL7, other orgs

» Required for interoperability of clinical / health sciences data

Secondary Term Sources for eTMF Classification System:• Industry sources – widely used terms in enterprise content mgmt software, TMF RM

Classification System – Term Sources

*Spec, Table 6, p21

Page 12: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Classification Categories Component

– Classification hierarchy and numbering is based on UDC library numbering standard and XML naming

– Digital dot notation – Designed for human and machine readability

– Each number is also a unique code for naming and ordering in the hierarchy

– Primary Categories (PC): Three digit. eTMF: 100-200

– Subcategories (SC): Two digit: 10-99

– Content Types (CT): Two digit: 10-99

– Maximum number of Sub-Category divisions is 5, excluding the 3-digits for the Primary Category

[1] Per spec section 2.1.1; 6.0
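The numbering rules above can be checked mechanically. The sketch below assumes that a category code is written as a three-digit primary category followed by up to five two-digit divisions joined by dots (for example `100.10.25`); the slide names "digital dot notation" but does not spell out the exact syntax, so the pattern and both function names are illustrative only.

```python
import re

# Three-digit primary category (no leading zero), then 0 to 5 two-digit
# divisions in the range 10-99, separated by dots. Assumed syntax.
CATEGORY_CODE = re.compile(r"[1-9][0-9]{2}(?:\.[1-9][0-9]){0,5}")


def is_valid_category_code(code: str) -> bool:
    """Return True if the code fits the assumed dot-notation pattern."""
    return CATEGORY_CODE.fullmatch(code) is not None


def sort_key(code: str) -> tuple[int, ...]:
    """Order codes numerically, component by component."""
    return tuple(int(part) for part in code.split("."))


codes = ["100.20", "100.10.10", "100.10", "101"]
print(sorted(codes, key=sort_key))
# ['100.10', '100.10.10', '100.20', '101']
```

Because no component has a leading zero, converting each component to an integer loses no information, and the same string serves as both identifier and sort position.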

Classification Categories Component

Classification Categories Hierarchy and Numbering [1]:

Hierarchy Numbering/Naming Considerations:
• Flexible, standards-based approach (W3C XML compliant naming*)
• Ability to add multiple hierarchy divisions / levels
• Proposed: 5 divisions = 100 × 90^5 ≈ 5.9 × 10^11 Content Types
• Uniqueness of numbers – usable as machine code identifiers
• Machine readable, human readable
• No sorting issues, no need for leading zeros*, no special chars

*Leading zeros in XML syntax are ignored: http://www.w3.org/TR/REC-xml/
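The capacity figure quoted for five divisions can be reproduced with a few lines of arithmetic. The reading assumed here is 100 primary category values, and 90 possible two-digit codes (10 through 99) at each of the 5 divisions; the variable names are illustrative.

```python
PRIMARY_VALUES = 100           # primary category values assumed by the slide's formula
CODES_PER_DIVISION = 99 - 10 + 1  # two-digit codes 10..99, i.e. 90
DIVISIONS = 5

total = PRIMARY_VALUES * CODES_PER_DIVISION ** DIVISIONS
print(f"{total:,} (about {total:.1e})")
# 590,490,000,000 (about 5.9e+11)
```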

Page 13: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Numbering and Naming Scheme

Numbering

• Primary Categories and Sub-Categories :

– Category Code number

• Content Type:

– Content Type ID

Naming

• Primary Categories and Sub-Categories

– Simple text-based names

– Unique name, 64 char limit

– Abbreviation – 16 char limit suggested

– Compatible with W3C XML naming standards :

No special characters :

( ) < > ? / % # @ !
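A minimal sketch of how the naming rules above might be checked. It covers only what the slide lists (64-character name limit, suggested 16-character abbreviation limit, and the special characters shown); it does not implement the full W3C XML name grammar, and the function name is hypothetical.

```python
from __future__ import annotations

FORBIDDEN = set("()<>?/%#@!")  # the special characters listed on the slide
MAX_NAME = 64
MAX_ABBREVIATION = 16          # suggested, not mandatory, per the slide


def name_problems(name: str, abbreviation: str | None = None) -> list[str]:
    """Return a list of rule violations; an empty list means no problems found."""
    problems = []
    if not name:
        problems.append("name is empty")
    if len(name) > MAX_NAME:
        problems.append(f"name exceeds {MAX_NAME} characters")
    bad = sorted(FORBIDDEN & set(name))
    if bad:
        problems.append("name contains special characters: " + " ".join(bad))
    if abbreviation is not None and len(abbreviation) > MAX_ABBREVIATION:
        problems.append(
            f"abbreviation exceeds the suggested {MAX_ABBREVIATION} characters"
        )
    return problems
```

Uniqueness of names is a property of the whole taxonomy rather than of a single string, so it is not checked here.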

Classification Categories Component
Example: Classification Categories Hierarchy, Naming, Numbering

Page 14: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Modifying Classification Category Entities – General Editing Rules

Domain Specific

– Classifications cannot be deleted –> Reserve/Unreserve

– Modifications allowed to some annotation properties (see spec)

– Codes (Category Codes, CT Type ID) cannot be generated

Organization Specific

– Classifications can be deleted

– Modifications allowed for classification metadata, annotations

– Codes (Category Codes, CT Type ID) can be generated

Classification Categories Component

Classification Category, Content Type Editing Rules*

Type                    Import Terms   Generate Code   Add/Modify   Delete/Reserve
Domain Specific         Yes            No              No/Yes**     Reserve/Unreserve
Organization Specific   Yes            Yes             Yes/Yes      Delete

*Spec, Table 6, p21

**Annotation metadata
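The editing rules table can be encoded as a lookup that a tool consults before applying a change. This sketch reads the Add/Modify column as "add classifications / modify annotation metadata", which is an interpretation of the slide rather than wording from the spec; the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditingRules:
    import_terms: bool
    generate_code: bool
    add_classification: bool   # first value in the Add/Modify column (assumed meaning)
    modify_annotations: bool   # second value in the Add/Modify column (assumed meaning)
    removal_action: str        # "reserve" or "delete"


RULES = {
    "domain": EditingRules(True, False, False, True, "reserve"),
    "organization": EditingRules(True, True, True, True, "delete"),
}


def removal_action(kind: str) -> str:
    """Return how a classification of the given kind is taken out of use."""
    return RULES[kind].removal_action


print(removal_action("domain"))        # reserve
print(removal_action("organization"))  # delete
```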

Page 15: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Classification Editing Tool – Free, Open Source Protégé (From Stanford University: http://protege.stanford.edu/ )

*Spec, Table 6, p21

Protégé Editor:
- Edit Classification Taxonomy and Metadata Terms
- Validate Taxonomy and Term name compliance
- Create valid RDF/XML Ontology

Page 16: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Proposed Classification System has following Properties:

• Based on Naming and Numbering that is W3C XML compliant

– No special characters: ( ) & # @ / … etc.

– No leading zeros in classification numbers

• Based on Universal Decimal Classification (UDC) system for content classification:

– 100-199 : eTMF Domain

– UDC system used in 170+ countries worldwide; expandable, human and machine readable, sortable http://en.wikipedia.org/wiki/Universal_Decimal_Classification

• Flexible and customizable for organizations, yet interoperable

– Domain classifications – Standardized; Organization-specific classifications – Editable

• Defined set of rules for Editing, modifying Taxonomy

• Any Organization can Modify/Edit taxonomy using open source editors like Protégé

Classification Categories - Summary

*Spec, Table 6, p21

Page 17: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Appendix

Page 18: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Content Classification System – Core Terms needed for Architecture – Objectives:

• Classification, Subclassification concept -

– Supports RDF/XML, OWL languages

– Non-domain specific, generic terms

– Easily understandable by anyone - conveys concept

– Conveys hierarchy

– No conflicts – not a reserved term in RDF/XML, OWL or other compilers/ IDE’s

– First priority – Source terms from standards bodies

Classification System – Core Terms

*Spec, Table 6, p21

Page 19: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Content Classification System – Core Terms needed for Architecture

• Classification, Subclassification term concept:

Classification System – Core Terms

*Spec, Table 6, p21

Term Options: Category, SubCategory
Source: NIH NCIthesaurus
Definition: Category: ‘This term is used informally to mean a class of things’ (NCI code: C25372); Subcategory: ‘A subdivision that has common differentiating characteristics within a larger category.’ (NCI Code C25692)

Term Options: Class, SubClass
Source: W3C OWL
Definition: Class: ‘Resources may be divided into groups called classes’; SubClass: ‘Subclasses are classes; If a class C is a subclass of a class C', then all instances of C will also be instances of C'.’ (W3C RDF Class def)

Term Options: TMF Zone, Section
Source: TMF Ref Model
Definition: TMF Zone = Primary Classification (no published def found online); Section = SubClassification (no published def found online)

Proposed Term

Page 20: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Content Classification System – Core Terms needed for Architecture

• Classification, Subclassification term concept:

Classification System – Core Terms

*Spec, Table 6, p21

Term Options: Category, SubCategory
Source: NIH NCIthesaurus
+ Everyone knows it
+ Describes hierarchy
+ In use by standards body (NIH NCI Thesaurus)
+ Generic

Term Options: Class, SubClass
Source: W3C OWL
+ Describes hierarchy
+ In use by standards body
+ Generic
- Could be a reserved word for some development tools

Term Options: TMF Zone, Section
Source: TMF Ref Model
+ In use by TMF RM users
- Doesn’t convey hierarchy
- Not in use by standards body
- Not Generic

Proposed Term

Page 21: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Content Classification System – Core Terms needed for Architecture – Objectives:

• Content Type concept

– Supports RDF/XML, OWL languages

– Non-domain specific, generic terms

– Easily understandable by anyone – conveys concept

– No conflicts – not a reserved term in RDF/XML, OWL or other compilers/ IDE’s

– First priority – Source terms from standards bodies

Classification System – Core Terms

*Spec, Table 6, p21

Page 22: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Content Classification System – Core Terms needed for Architecture

• Content Type term concept:

Classification System – Core Terms

*Spec, Table 6, p21

Term: Content Type
Source: W3C, CareLex, Oracle
Definition:
W3C: ‘Specifies the nature of a linked resource’ (W3C; [RFC2045] and [RFC2046])
CareLex: A content type is a reusable collection of metadata, business processes, behavior, and other settings for a category of items or documents in electronic content material.
Oracle: Content types are used to define the metadata that you can associate with content.

Term: Artifact
Source: TMF Ref Model
Definition: ‘A collection of documents’, Wikipedia (Not published)

Proposed Term

Page 23: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

Content Classification System – Core Terms needed for Architecture

• Content Type term concept:

Classification System – Core Terms

*Spec, Table 6, p21

Term: Content Type
Source: W3C
+ Widely used in internet SW
+ ECM SW use - Microsoft, Oracle, Alfresco, etc.
+ In use by standards body (W3C)
+ Generic

Term: Artifact
Source: TMF Ref Model
+ In use by TMF RM users
- Not in use by standards body
- Not Generic
- Doesn’t convey concept of metadata

Proposed Term

Page 24: OASIS  Electronic Trial Master File Standard Technical Committee  Content Classification Layer

• Roll call

• Reports
  – Outreach
  – Tech Discussion: Classification Layer: Core Metadata (Charter item 2, p.2)

• New business

Draft Agenda: Next Meeting