Clinical Genomics Work Group (HL7) Mukesh Sharma Washington University in St. Louis

Dec 14, 2015

Desiree Haddon
Transcript
Page 1:

Clinical Genomics Work Group (HL7)

Mukesh Sharma

Washington University in St. Louis

Page 2:

Agenda

• Clinical Genomics Work Group

• Family History Project

• Genetic Variation

• Cytogenetics LOINC codes

• Gene Expression DAM

• Genomic Specimen Model Project

• New Models for Future Ballot

• Useful Links

Page 3:

The HL7 Clinical Genomics (CG) Work Group

• Established as a SIG in 2003

• Mission: To enable the standard use of patient-related genetic data, such as DNA sequence variations and gene expression levels, for healthcare purposes (‘personalized medicine’) as well as for clinical trials and research

• Work Products and Contributions to HL7 Processes

The Work Group will collect, review, develop and document clinical genomics use cases in order to determine what data needs to be exchanged. The WG will review existing genomics standards formats such as BSML (Bioinformatics Sequence Markup Language), MAGE-ML (Microarray and Gene Expression Markup Language), LSID (Life Science Identifier) and others. This group will recommend enhancements to and/or extensions of HL7's normative standards for exchange of information about clinical genomic orders and observations.

In addition, Clinical Genomics will seek to assure that related or supportive standards produced by other HL7 groups are robust enough to accommodate use in both research and clinical care. The group will also monitor information interchange standards developed outside HL7, and attempt to harmonize the information content and representation of such standards with those of HL7.

Page 4:

CG Work Group Leadership (Co-Chairs)

• Joyce Hernandez, Merck & Co. Inc.

• Kevin Hughes, MD, Partners HealthCare System, Inc.

• Amnon Shabo, PhD, IBM

• Mollie Ullman-Cullere, Dana-Farber Cancer Institute

Page 5:

Formal Relationships with Other HL7 Groups

The CG Work Group coordinates with a large number of other Work Groups in order to accomplish its mission. The strongest relationships are with:

• Orders and Observations

• Clinical Statement

• Clinical Decision Support

• Regulated Clinical Research Information Management

• Patient Care

• Electronic Health Records

• Modeling and Methodology

• Structured Documents

Page 6:

WG meetings/Balloting Cycles

• 3 times annually

• January, May, September

• 2010 meetings

• January 17–22, 2010 meeting at Pointe Hilton Squaw Peak, Phoenix, AZ

• May 17–20, 2010 meeting at Windsor Barra Hotel and Congressos, Rio de Janeiro, Brazil

• October 3-8, 2010 meeting at Cambridge, MA

Page 7:

Clinical Genomics Work Group Meeting

OCTOBER, 2010 Update

Page 8:

Open Floor

Discussed FDA regulations and the group's concerns about reporting raw data

• FDA wants raw data to be part of the medical record, but the data is very expensive to store.

• Some members raised concerns that, for next-generation sequencing for example, they do not have space to store raw data, quality scores, etc.

Page 9:

Overview of Activities

Three Tracks

v3:

• Family History (Pedigree) Topic

• Genetic Variations Topic

• Gene Expression Topic

• CMETs defined by the Domain

v2:

• v2 Implementation Guides*

* The IG “Genetic Test Result Reporting to EHR” is modeled after the HL7 Version 2.5.1 Implementation Guide: Orders And Observations; Interoperable Laboratory Result Reporting To EHR (US Realm), Release 1

CDA:

• A CDA Implementation Guide for Genetic Testing Reports

Common:

• Domain Analysis Models for the various topics

• A Domain Information Model (v3) describing the common semantics

• Semantic alignment among the various specs

Normative (V3); DSTU (CDA); Informative (V2)

Page 10:

HL7 Clinical Genomics: The v3 Track

[Diagram: boxes for Family History, Domain Information Model: Genome, Gene Expression, Genetic Variation, and Phenotype (utilizing the HL7 Clinical Statement), connected by "utilize" and "constrain" relationships.]

Page 11:

Family History

Background

• HL7 and ANSI approved pedigree model

• Numerous implementations within care setting

• Deployed by Surgeon General’s My Family Health Portrait and MS Health Vault

Status

• Several groups developing compliant family history tools have confirmed the need for a compliance testing framework; therefore:

• Canonical Pedigree project to develop tools to test compliance to Pedigree standard and interoperability

• Hosted Web Service, using Pedigree Standard, provides hereditary cancer risk assessments

Page 12:

Genetic Variation

Background

• Approved CMET: passed normative ballot under reconciliation

• Published HL7 v2.5.1 Lab Reporting Implementation Guide (IG) for structured clinical genetic test results

Status

• Genetic Test Report Project using Clinical Document Architecture (CDA)

• Release 2 of the 2.5.1 IG, expanding to new clinical scenarios (e.g. tumor genetic profile) and genetic test definition

• Genetic test orders will be a collaborative modeling effort (e.g. Clinical Genomics, Orders & Observations, Laboratory)

• Starting analysis for scope expansion to whole genome sequencing

• Starting analysis for utility of data set in clinical/research data warehouse

Page 13:

Cytogenetics LOINC Codes 1

Background

• CG has a Genetic Variation Implementation Guide that covers genetic mutations located within a gene. Need to report larger genetic changes found in cytogenetic testing.

• Develop LOINC codes for representing cytogenetics test results

• Develop prototype V2 interface based on the LOINC panel structure

  • In Intermountain Healthcare’s DEV environment

  • Potentially real/live interface between ARUP Laboratories and Intermountain Healthcare

Status

• Officially submitted to LOINC for approval

  • Three panels (total 43 codes)

    • Chromosome Analysis G Banded Panel

    • Chromosome Analysis FISH Panel

    • Chromosome Analysis Microarray Copy Number Change Panel

  • Additional 11 codes

• Drafting HL7 V2 Implementation Guide for Cytogenetics

  • Sample messages, etc.

• Detailed data models and associated terminology are created in Intermountain Healthcare’s development environment
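As a rough illustration of the prototype V2 interface described on this slide, the sketch below assembles a single HL7 v2 OBX segment carrying a cytogenetics result. It is a minimal sketch under stated assumptions: `build_obx` is a hypothetical helper, the identifier `00000-0` is a placeholder and not one of the submitted LOINC codes, and the karyotype value is an invented example. Field positions follow the standard OBX layout (OBX-1 Set ID, OBX-2 Value Type, OBX-3 Observation Identifier, OBX-5 Observation Value, OBX-11 Observation Result Status). Escaping of delimiter characters is not handled.

```python
def build_obx(set_id, value_type, code, text, value,
              status="F", coding_system="LN"):
    """Assemble a pipe-delimited OBX segment (fields OBX-1 to OBX-11)."""
    fields = [""] * 12                 # index 0 holds the segment name
    fields[0] = "OBX"
    fields[1] = str(set_id)            # OBX-1 Set ID
    fields[2] = value_type             # OBX-2 Value Type
    fields[3] = f"{code}^{text}^{coding_system}"  # OBX-3 Observation Identifier
    fields[5] = value                  # OBX-5 Observation Value
    fields[11] = status                # OBX-11 Observation Result Status
    return "|".join(fields)


# Hypothetical placeholder, NOT a real LOINC code.
PLACEHOLDER_CODE = "00000-0"

segment = build_obx(1, "ST", PLACEHOLDER_CODE,
                    "Karyotype (placeholder)", "46,XX")
print(segment)
```

A real implementation guide would also specify the panel grouping (OBR segments and OBX sub-IDs), which this sketch leaves out.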

Page 14:

Cytogenetics LOINC Codes 2

Next Step

• HL7 standard development

• Target to ballot the v2 IG in January 2011 ballot cycle

• Develop the cytogenetics section of the CDA Genetic Test Report (GTR)

• Prototyping implementation, eventually real implementation

• Real practical challenges

Page 15:

Genomic Specimen Model Project

Background

• CG has started a Specimen Process Step Project

• Discussion with Orders and Observations (O & O) at the Jan 2010 meeting concluded that the requirements should be captured in the O & O Specimen Model

• O & O will enhance the Universal Specimen CMET. The scope will be updated and named Specimen CMET enhancement phase 2

• CG will drop the specimen process step project and place a change request on the O & O site to make sure that its use cases are captured in the specimen model

Update

• Scope: Project will detail specimen collection, procedure(s) done on specimen and specimen storage that will affect the quality of the specimen.

• Requirements represented in specimen CMET

• Requirements not represented in specimen CMET

Page 16:

Requirements Represented In Specimen CMET

Specimen Handling and Processing

• Type of preservatives used and amount. Examples: additives used to preserve RNA/DNA

• Special handling such as flash freezing

Storage

• Type of storage used for collected specimen and any extracted genetic material.

Specimen Access

• Unique identifiers assigned to all materials (both collected and derived) to help manage access to specimens.

Specimen Type

• Whether fluid, tissue, cell or molecular specimen?

Specimen Quantity

• Quantity and/or size of specimen collected.

• In the Specimen model, Natural class is available to capture this information
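The requirements on this slide can be pictured as a small record type. The following is a minimal sketch only: `SpecimenRecord`, its field names, and the use of UUIDs are assumptions made for illustration and are not taken from the Specimen CMET. It shows one way to give every collected and derived material its own identifier while keeping the link back to the parent specimen.

```python
from dataclasses import dataclass, field
from typing import Optional
import uuid


@dataclass
class SpecimenRecord:
    """Hypothetical record covering type, quantity, handling and storage."""
    specimen_type: str                 # "fluid", "tissue", "cell" or "molecular"
    quantity: float
    unit: str
    preservative: Optional[str] = None
    special_handling: Optional[str] = None   # e.g. "flash frozen"
    storage: Optional[str] = None
    parent_id: Optional[str] = None    # set only for derived material
    specimen_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def derive(self, specimen_type, quantity, unit, **kwargs):
        """Create derived material that points back to this specimen."""
        return SpecimenRecord(specimen_type, quantity, unit,
                              parent_id=self.specimen_id, **kwargs)


blood = SpecimenRecord("fluid", 5.0, "mL", preservative="EDTA")
dna = blood.derive("molecular", 20.0, "ug", storage="-80C freezer")
print(dna.parent_id == blood.specimen_id)
```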

Page 17:

Requirements Represented In Specimen CMET

Specimen Characteristics

• RNA/DNA characteristics: e.g. purity values (A260/A230 and A260/A280), RNA integrity number (RIN), etc.

• QC needs to be done by the specimen core lab

O&O: Captured in ObservationEvent. Need an implementation guide for details. May be separated out from ObservationEvent in the future.
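The purity values named above are simple absorbance ratios. The sketch below computes them and applies a pass/fail check. `NucleicAcidQC` is a hypothetical class, and the default thresholds (1.8 for A260/A280 and 7.0 for RIN) are illustrative assumptions, not values from the slides or from the Specimen CMET; each lab would set its own.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NucleicAcidQC:
    """Hypothetical holder for spectrophotometer readings and RIN."""
    a230: float
    a260: float
    a280: float
    rin: Optional[float] = None        # RNA integrity number, if measured

    @property
    def ratio_260_280(self) -> float:
        return self.a260 / self.a280

    @property
    def ratio_260_230(self) -> float:
        return self.a260 / self.a230

    def passes(self, min_260_280: float = 1.8, min_rin: float = 7.0) -> bool:
        """Apply illustrative (assumed) acceptance thresholds."""
        if self.ratio_260_280 < min_260_280:
            return False
        if self.rin is not None and self.rin < min_rin:
            return False
        return True


qc = NucleicAcidQC(a230=0.5, a260=1.0, a280=0.5, rin=8.2)
print(qc.ratio_260_280, qc.ratio_260_230, qc.passes())
```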

Page 18:

Requirements Not Represented In Specimen CMET 1

Genetic Consent Form

• Linking up with the Genetic Consent Form.

• Form signed by the patient to allow genetic/genomic testing and in some cases to permit long-term storage of genetic samples for further research.

• Need to know that there is consent; duration that it allows the specimen to be used for (indefinite or restricted to particular duration or protocols).

• Consent could be withdrawn and as a result the specimen is pulled out and destroyed.

O&O:

• Bullets 1 and 2: Present in the current model as part of clinical statement.

• Bullet 3: Needs to be handled in Medical Records, as Medical Records owns consent.

• Cannot currently tie consent to a specific specimen.

• In future, could be captured in the SpecimenProcessStep messaging, including a provision to destroy the specimen. The CMET itself does not deal with this activity.
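To make the consent requirements concrete (existence of consent, duration, protocol restrictions, withdrawal), here is a minimal sketch. `GeneticConsent` and all of its fields are hypothetical; in particular, the direct link to a `specimen_id` is an assumption, since the slide notes that the current model cannot tie consent to a specific specimen.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class GeneticConsent:
    """Hypothetical consent record linked to one specimen."""
    specimen_id: str
    signed_on: date
    expires_on: Optional[date] = None              # None means indefinite
    allowed_protocols: Optional[frozenset] = None  # None means unrestricted
    withdrawn_on: Optional[date] = None

    def permits(self, on: date, protocol: str) -> bool:
        """True if the specimen may be used for `protocol` on date `on`."""
        if on < self.signed_on:
            return False
        if self.withdrawn_on is not None and on >= self.withdrawn_on:
            return False
        if self.expires_on is not None and on > self.expires_on:
            return False
        if (self.allowed_protocols is not None
                and protocol not in self.allowed_protocols):
            return False
        return True


consent = GeneticConsent("SPEC-1", date(2010, 1, 1),
                         expires_on=date(2015, 1, 1),
                         allowed_protocols=frozenset({"P-100"}))
print(consent.permits(date(2012, 6, 1), "P-100"))
```

A withdrawal would set `withdrawn_on`, after which `permits` returns False and the specimen could be flagged for destruction.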

Page 19:

Requirements Not Represented In Specimen CMET 2

Specimen Management

• Specimen Collection

• Two use cases:

• i) Patient comes in and we take 2 or more specimens

• ii) Patient comes in and we take 1 specimen

• We need to capture the relationship between multiple specimens collected at one time (use case i)

• The universal CMET only has one entry point (SpecimenChoice), i.e. all CMETs start from Specimen

•  Suggested Action 

• In the SpecimenChoice Box add the SpecimenCollectionGroup class (especially for use case i)
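The suggested action can be sketched as a container that groups specimens taken at the same collection event. This is only an illustration under assumptions: the class name `SpecimenCollectionGroup` comes from the slide, but its attributes and methods here are hypothetical and do not reflect the actual CMET design.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Specimen:
    specimen_id: str
    specimen_type: str


@dataclass
class SpecimenCollectionGroup:
    """Hypothetical grouping of specimens collected at one time."""
    group_id: str
    collected_at: datetime
    specimens: List[Specimen] = field(default_factory=list)

    def add(self, specimen: Specimen) -> None:
        self.specimens.append(specimen)

    def siblings_of(self, specimen_id: str) -> List[Specimen]:
        """Other specimens taken during the same collection event."""
        return [s for s in self.specimens if s.specimen_id != specimen_id]


# Use case i: two specimens taken during one visit.
group = SpecimenCollectionGroup("G-1", datetime(2010, 10, 4, 9, 30))
group.add(Specimen("S-1", "tissue"))
group.add(Specimen("S-2", "fluid"))
print([s.specimen_id for s in group.siblings_of("S-1")])
```

Use case ii (a single specimen) is simply a group with one member, so the same entry point would serve both cases.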

Page 20:

Requirements Not Represented In Specimen CMET 3

[Diagrams: Current Model; Proposed Change]

Page 21:

Gene Expression DAM Update

• Currently reviewing results of the last ballot (informative ballot in May 2010)

• Next steps:

• Finish NCI Generic Assay (IRWG/ICR)

• Changes to GE DAM

• Add “generic” classes from Generic Assay

• Bring over additional BRIDG classes

• Apply suggested changes from the ballot (use case, BRIDG compatibility)

Page 22:

Clinical Genomics DAM (50,000 foot level view)

class Complete Diagram

A-Phenotype

AminoAcid

+ name: String

ArrayDataType

+ name: String
+ version: String

Legend

HL7 CG Elements
Joyce Additions
NCI Model Elements
BRIDG 2.1
Modified for CG
MIAME-MAGE
MAGE-TAB

Expression

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- text: String
- effectiveTime: String
- uncertaintycode: String
- value: String
- interpretationCode: String
- methodCode: String

This is the Domain Information Model for the HL7 Clinical Genomics Work Group.

It consists of the following topics:

1. Gene Expression
2. Genetic Variation
3. Genotype
4. Sequence
5. Proteomics
6. Links to Clinical Phenotypes

Entry point for the Gene Expression CMET POCG_RM000031UV

AssociatedProperty

- classCode: String
- id: Integer
- code: String
- text: String
- effectiveTime: String
- value: String
- methodCode: String

Gene

+ symbol: String
+ fullName: StringColumn
+ genbankAccession: String
+ genbankAccessionVersion: String
+ ensemblgeneID: String
+ unigeneclusterID: String
+ entrezgeneID: String

Chromosome

+ chormosomeNumber: Integer

DNA

- name

Need definitions

Use this class for inherent data about the locus, e.g. chromosome no.

RNA

- name

Nucleotides

+ nucleotideName: String

Intron

+ length: Integer
+ intronClass: String

Exon

+ length: Integer
+ intronClass: String

Nucleobases

+ shortName: String

Phosphate

+ name: String

Ester

+ name: String

Sugar

+ name: String

Codon

+ codonId: Integer

Usha: Relationship should be from Gene to DNA. Portions of DNA correspond to a Gene. Chromosomes would have a bunch of genes.

GeneticLocus

- id: Integer
- text: String
- methodCode: String
- chromosomePosition: Integer
- cellType: String

This class is a placeholder for specifying a locus on the genome, i.e., a position of a particular given sequence in the subject’s genome. Note that the semantics of the locus (e.g., gene) are defined by data assigned in the code & value attributes of this class, and also by placing additional data relating to this locus into the classes (and CMETs) associated with this class.

Genome

- classCode: String
- id: Integer
- code: String
- text: String
- effectiveTime: String
- confidentialityCode: String
- value: String
- interpretationCode: String
- methodCode: String

GeneticLoci

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- titile: String
- text: String
- statusCode: String
- effectiveTime: String
- confidentialityCode: String
- reasonCode: String
- value: String
- interpretationCode: String
- methodCode: String

LargeDuplicaiton

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- text: String
- effectiveTime: String
- uncertaintycode: String
- confidentialitycode: String
- value: String
- interpretationCode: String
- methodCode: String

GeneticDocument

- classCode: String
- id: Integer
- code: String
- title: String
- text: String
- statusCode: String
- effectiveTime: String
- confidentialityCode: String
- languageCode: String
- setId: Integer
- versionNumber: Integer

LargeDeletion

- classCode: String
- id: Integer
- code: String
- text: String
- effectiveTime: String
- confidentialitycode: String
- uncertaintycode: String
- value: String
- interpretationCode: String
- methodCode: String

Sequence

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- titile: String
- text: String
- statusCode: String
- effectiveTime: String
- reasonCode: String
- value: String
- interpretationCode: String
- methodCode: String

Need definitions.

Need definitions.

Need definitions.

Use the value attribute to encapsulate raw data relating to the entire set of loci. For example, SNP genotyping of a large number of genes/markers.

Cytogenetics

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- text: String
- effectiveTime: String
- confidentialitycode: String
- uncertaintycode: String
- value: String
- interpretationCode: String
- methodCode: String

???OtherNonLocusData

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- text: String
- effectiveTime: String
- confidentialitycode: String
- uncertaintycode: String
- value: String
- interpretationCode: String
- methodCode: String

Need definitions.

Need definitions. Need definitions.

Need definitions.

GenotypeFinding

- normalizedXIntensity: float
- normalizedYIntensity: float
- rawXIntensity: float
- rawYIntensity: float
- call: String
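To show how the GenotypeFinding attributes might relate to one another, the sketch below derives a genotype call from the two normalized channel intensities. It is a minimal sketch and an assumption throughout: the polar transform (theta from the angle between the channels, R as their sum) is a convention commonly used with two-channel SNP arrays, not something defined by this DAM, and the thresholds 0.33 and 0.67 are arbitrary illustrative cut-offs. Real calling algorithms use per-SNP cluster positions.

```python
import math
from dataclasses import dataclass


@dataclass
class GenotypeFinding:
    """Subset of the DAM class, with snake_case attribute names."""
    normalized_x_intensity: float
    normalized_y_intensity: float
    call: str = ""

    def theta(self) -> float:
        """Angle between channels scaled to [0, 1] (assumed convention)."""
        return (2.0 / math.pi) * math.atan2(self.normalized_y_intensity,
                                            self.normalized_x_intensity)

    def r(self) -> float:
        """Combined signal intensity."""
        return self.normalized_x_intensity + self.normalized_y_intensity

    def assign_call(self, low: float = 0.33, high: float = 0.67) -> str:
        """Illustrative thresholding; cut-offs are hypothetical."""
        t = self.theta()
        self.call = "AA" if t < low else "BB" if t > high else "AB"
        return self.call


hom = GenotypeFinding(1.0, 0.0)
het = GenotypeFinding(1.0, 1.0)
print(hom.assign_call(), het.assign_call())
```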

IndividualAllele

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- titile: String
- text: String
- statusCode: String
- effectiveTime: String
- reasonCode: String
- value: String
- interpretationCode: String
- methodCode: String

SequenceVariation

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- titile: String
- text: String
- statusCode: String
- effectiveTime: String
- reasonCode: String
- value: String
- interpretationCode: String
- methodCode: String

DeterminantPepetide

- classCode: String
- id: Integer
- text: String
- effectiveTime: String
- value: String
- methodCode: String

AssociatedObservation

- id: Integer
- name: String
- copyNumber: Integer
- zygosity: String
- dominancy: String
- geneFamily: String

Polypeptide

- classCode: String
- id: Integer
- text: String
- effectiveTime: String
- value: String
- methodCode: String

Need definitions. Need definitions.

Should we leave this out and just add classes as needed?

Entry path to the broadest path of the genetic variation model.

ViralGenetics

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- text: String
- effectiveTime: String
- confidentialitycode: String
- uncertaintycode: String
- value: String
- interpretationCode: String
- methodCode: String

Added as a placeholder. > Future Expansion <

Need definitions

SNPAssay

- designAlleles: String
- designScore: Float
- designSequence: String
- designStrand: String
- id: Long
- status: String
- vendorAssayId: String
- version: String

SNPPanel

- assayCount: Integer
- description: String
- id: Long
- name: String
- technology: String
- vendor: String
- vendorPanelId: String
- version: String

SNP Design classes

Material

+ id: Integer
+ description: String
+ name: String
+ formcode: String

ExtractedNon-GeneticSample

- extractedSsampleId: Integer
- extractedAmount: Integer
- extractedAmountUOM: String
- extractionMethod: String
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String

ExtractedGeneticSample

+ geneticSampleId: Integer
+ geneticSampleType: String
+ extractedAmount: Integer
+ extractionMethod: String
+ GeneticSampleType: int
+ hybridization: int
+ authorizationLink: url
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String

OriginalBioSpecimen

+ amount: int
+ unitofMeasure: String
+ statusCode: String
+ statusDateRange: String
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String
::Node
+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

HandlingDocument

- Id: int
- text: String

SpecimenCharacteristics

+ Id: int
+ color: String
+ clarity: String
+ condition: String

Collection

- collectionMethod: int
- id: int
::HandlingDocument
- Id: int
- text: String

Storage

+ id: Integer
+ flashFrozenMethod: String
+ temp: Integer
+ storageMethod: String
::HandlingDocument
- Id: int
- text: String

Transportation

- id: Integer
::HandlingDocument
- Id: int
- text: String

Assume this is a generic list of all material. Specific material used and tracked within the conduct of a study and/or clinical care would be uniquely identified via other classes (i.e. extracted or resectioned samples). The identifier is used only for the original biological specimen.

ArrayGroup

- arraySpacingX: float
- arraySpacingY: float
- barcode: String
- length: float
- numArrays: Integer
- orientationMark: enum(top,bottom,left,right)
- width: float

Array

+ arrayIdentifier: String
+ arrayXOrigin: Integer
+ arrayYOrigin: Integer
+ originRelativeTo: String

ArrayDesign

+ id: Integer
+ version: String
+ comment: String
+ substrateType: String
+ surfaceType: String
+ sequecnePolymerType: String
+ contactId: Integer
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

ArrayManufacture

- manufacurungDate: String
- tolerance: Integer

Gene expression Design classes

Do we need separate classes for Array Design (GE versus Genetic variation)? The attributes I have added are from the new MAGE-TAB model. Do we still need number of features (this came from the old version of the model)?

LabeledExtract

- flourescentLabelingSubstance: String
- flourescentLabelingSubstanceAmount: float
- flourescentLabelingSubstanceUnits: float

Hybridization

- name: String
- amountOfMaterial: float

ArrayManufactureDeviation

This area of the MAGE model seems to be placeholders. There are relationships to both FeatureDefect and ZoneDefect, neither of which has attributes.

FeatureDefect: Stores the defect information for a feature. This class points to PositionDelta which has coordinate information (delta X,Y). PositionDelta points to DistantUnit which contains additional measurement data. FeatureDefect points to an OntologyEntry which contains controlled vocabulary. The link constrains the vocabulary entries to represent only "defectType".

ZoneDefect: Stores the defect information for a zone. This class points to the Zone class which does have lower-right X,Y and upper-left X,Y coordinates, plus a row identifier.

Channel

- channel_no: Integer

BioAssayTreatment

+ bioAssayProcess: String
::Node
+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

ImageAcquistion

+ imageAcquistionMethod

Image

+ name: String
+ url: String
::DataFile
+ id: String
+ name: String
+ dataFileType: String
+ dataFormat: String
::Data
+ uri: String
+ datatype: String

Should we embed the image as blob, rather than point to it? Or provide both options? ANS: Too huge to store in the database. It is rare to go back to them. But some folks want to keep the images. They could be kept in secondary.

DerivedBioAssayData

Need Array Manufacturing control data. Not chip but chip by overall.

NOTE: Image is scanned at different wavelengths.

FactorValue

- value: String
::Measurement
+ value: String
+ minvalue: String
+ maxvalue: String
+ unit: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Lab_Experiment

+ id: Integer
+ title: String
+ description: String
+ date: date
+ assayType: String
+ experimentalDesigns: String
+ formatVersion: String
+ publicIdentifier: String
+ sdrfFile: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Factor

+ type: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

ExperimentalDesigns

+ type: String
+ description: String

QualityControl

+ type: String
+ qualityControlDescription: String

Replicatetypes

+ replicateType: String
+ replicateDescription: String

NormalizationTypes

+ normalizationType
+ normalizationDescription: String

These classes need to be harmonized to the Study Design portion of the BRIDG model.

GenomicProtocol

+ id: Integer
+ name: String
+ type: String
+ description: String
+ hardware: String
+ software: String
+ contact: String
+ url: String
+ publicProtocolUrl: String

Assume there can be multiple "experiments" for complex studies. ???

Also the new version calls this an Investigation. When talking about genomics testing a lot of SMEs use the term "Experiment". Investigation can also be connected more easily to the term study, which already has a broader scope since it represents the "clinical trial" used in the research context.

Another factor is that the MGED ontology makes references to "ExperimentalProtocol" in a number of places, so it might be better to keep a known term.

Which terms does the team prefer? Is there a term that could fit both research and healthcare use?

This class will need specific harmonization to the Study class in BRIDG. Hardware/software requirements for the arrays need to be added. These should probably be normalized into separate classes.

For CG DAM model reviewers: we need more examples. Is there other software required other than the Reporter?

ProtocolApplications

+ edgeId: Integer
+ order: Integer
+ notes: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Is this the proper way to identify an individual channel?

Germline/Somatic needs to be validated by a lab test. Should it just be represented as part of a test and taken out of the bio-specimen?

ImageFile

+ name: String
+ status: String
+ type: String

RawArrayData

DataFile

+ id: String
+ name: String
+ dataFileType: String
+ dataFormat: String
::Data
+ uri: String
+ datatype: String

Need definition on what type of data is carried here and which function in the process populates it.

Validate that "ordered" means sequenced and does not represent "ordered" from lab.

Feature

+ featureId: Integer
+ blockCol: Integer
+ blockRow: Integer
+ col: Integer
+ row: Integer
+ reporterid: Integer
::DesignElement
+ id: Integer
+ compositid: Integer
+ arrayDesignId: Integer
+ featureID: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Reporter

+ id: Integer
+ controlType: String
+ sequence: String
+ databaseEntry: String
::DesignElement
+ id: Integer
+ compositid: Integer
+ arrayDesignId: Integer
+ featureID: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

ReporterGroup

+ reporterGroupId: Integer
+ name: String
::DesignElement
+ id: Integer
+ compositid: Integer
+ arrayDesignId: Integer
+ featureID: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Need to differentiate between frozen and fixed. For breast cancer. Containers need to be added. Example: Non-frozen and frozen tissue samples need to be included. Unfixed tissue sections (slide type and slide mount. In healthcare)

Add class to handle the change of state of the material.

Typically called protocol of treatment.

In MAGE this is the actual Image and everything that was done to get it.

Can do another treatment and get another image. Actual steps are not kept for all images. Usually only recorded for the last image.

JH: MAGE-TAB model conflicts with these statements. It has an Assay Class as part of the sdrf package and an Image class as part of the data package. I will rename this Bio-Assay class to just Assay.

NOTE: Mollie: Wants to constrain model for clinical environment at a later point.

Hardware

+ description: String
+ version: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Software

+ description: String
+ version: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Edge

+ id: Integer
+ experimentIdentifier: String
+ input: String
+ output: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Need examples for EDGE data, primarily for input and output. Couldn't find any at the magetab and tabemage sites.

Node

+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

DimensionElement

+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
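The "::Identifiable" and "::DimensionElement" groups that recur in this listing are inherited attributes shown under each subclass. As a minimal sketch of that inheritance chain, the Python dataclasses below mirror Identifiable, DimensionElement and Node. The mapping of UML types to Python types (URI and String to `str`, Integer to `int`) and the snake_case name `experiment_id` are assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Identifiable:
    id: str            # URI in the diagram
    name: str
    properties: str
    description: str


@dataclass
class DimensionElement(Identifiable):
    node: str


@dataclass
class Node(DimensionElement):
    experiment_id: int   # experimentId in the diagram


# Inherited attributes come first, then the subclass's own attributes.
node_fields = [f.name for f in fields(Node)]
print(node_fields)

n = Node(id="urn:example:node-1", name="scan-1", properties="",
         description="hypothetical node", node="scan", experiment_id=42)
print(isinstance(n, Identifiable))
```

Classes such as Feature, Reporter or Scan would extend this chain in the same way, which is why their listings repeat the same inherited groups.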

Performer

+ id: Integer
+ personID: Integer
+ protocolID: Integer

Person

+ firstname: String
+ lastname: String
+ midinitials: String
+ affiliation: String
::Contact
+ Address: String
+ phone: String
+ email: String
+ fax: String
+ tollFreePhone: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

PersonRole

+ role: String
::Person
+ firstname: String
+ lastname: String
+ midinitials: String
+ affiliation: String
::Contact
+ Address: String
+ phone: String
+ email: String
+ fax: String
+ tollFreePhone: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Publication

+ id: Integer
+ pubMedID: String
+ title: String
+ publicationDOI: String
+ authorlist: String
+ status: String

Contact

+ Address: String
+ phone: String
+ email: String
+ fax: String
+ tollFreePhone: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Assay

+ arrayIdentifier: String
+ arrayDataFiles: String
+ derivedArrayDataFiles: String
+ arrayDataMatricFiles: String
+ derivedArrayDataMatrixFiles: String

TechnologyType

+ technologyType: String

CompositElement

+ id: Integer
+ reporterID: Integer
+ databaseEntry: String
::DesignElement
+ id: Integer
+ compositid: Integer
+ arrayDesignId: Integer
+ featureID: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

DesignElement

+ id: Integer
+ compositid: Integer
+ arrayDesignId: Integer
+ featureID: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Question on EBI example: http://www.ebi.ac.uk/miamexpress/help/array_designs.html#ADF

Should the CompositeSequenceComment be represented as a database entry (Ontology Term / Value pair) or as variable?

Data

+ uri: String
+ datatype: String

DataElement

+ id: Integer
+ datamatrixId: Integer
+ col: Integer
+ row: Integer
+ rowQuantitationType: String
+ index_: Integer
+ secondayKey: String

DataMatrix

Need more information on how this is implemented. Description seems to indicate calculation. See MGED section below:

class QuantitationType
definition: The QuantitationType provides a method for calculating a single datum of the BioAssayData matrix.
superclasses: QuantitationTypePackage
properties:
unique_identifier MO_67
class_role abstract
class_source mage
constraints:
restriction: has_scale has-class Scale
restriction: has_type has-class DataType

Name: Complete Diagram
Author: hernajoy
Version: 1.0
Created: 2006-01-11 12:00:00 AM
Updated: 2010-06-17 7:10:15 PM

DATA MATRIX EXAMPLE from: http://tab2mage.sourceforge.net/docs/magetab_docs.html#datamatrix

Bio-Specimen-Characteristics

+ term: String::Identi fiable+ id: URI+ name: String+ properties: String+ description: String

T his is equivalent to Material class in the MAGE-T AB model .

Material in this model appl ies to BRIDG and HL7 expanded scope which goes beyond biologic material .

Assume this class needs to represent the many to many associations between the fol lowing MGED concepts. T hese associations attempt to group mathematical functions into nodes.

1. Nodes2. Node Values3. Node Value T ypes4. BioAssays5. BioAssayDataCluster

Normalization

+ derviedArrayDataFi le: String::Node+ experimentId: Integer::DimensionElement+ node: String::Identi fiable+ id: URI+ name: String+ properties: String+ description: String

Scan

+ arrayDataFi les: String+ derivedArrayDataFi les: String+ arrayDataMatricFi les: String+ derivedArrayDataMatrixFi les: String::Node+ experimentId: Integer::DimensionElement+ node: String::Identi fiable+ id: URI+ name: String+ properties: String+ description: String

Measurement

+ value: String+ m invalue: String+ maxvalue: String+ uni t: String::Identi fiable+ id: URI+ name: String+ properties: String+ description: String

Need sample data for this class.

ProtocolParameter

::Identi fiable+ id: URI+ name: String+ properties: String+ description: String

ParameterValue

+ protocolParameterId: Integer+ protocolAppl ication: Integer::Measurement+ value: String+ m invalue: String+ maxvalue: String+ uni t: String::Identi fiable+ id: URI+ name: String+ properties: String+ description: String

Source

+ contactid: Integer
::OriginalBioSpecimen
+ amount: int
+ unitofMeasure: String
+ statusCode: String
+ statusDateRange: String
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String
::Node
+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

TreatedSample

::ExtractedGeneticSample
+ geneticSampleId: Integer
+ geneticSampleType: String
+ extractedAmount: Integer
+ extractionMethod: String
+ GeneticSampleType: int
+ hybridization: int
+ authorizationLink: url
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String

NameValueType

+ id: Integer
+ name: String
+ type: String

Definition of Experiment

SPECIMEN HANDLING

ARRAY DESIGN

RELATIONSHIPS BETWEEN: (Samples, Arrays and Data)

GENE EXPRESSION DATA

Usha: May not need sugar and phosphate data.

Specimen Handling

+ type: String
+ name: String
+ amount: Integer
::HandlingDocument
- Id: int
- text: String

Shipper

- dateShipped: String
- senderType: String
- senderName: String
::Transportation
- id: Integer
::HandlingDocument
- Id: int
- text: String

Receiver

- dateRecieved: String
- receiverType: String
- receiverrName: String
::Transportation
- id: Integer
::HandlingDocument
- Id: int
- text: String

SpecimenContainer

+ containerType: String
+ risk: String
+ handling: String
+ capacityQuantity: Integer
+ heightQuantity: Integer
+ diameterQuantity: Integer
+ capType: String
+ separatorType: String
+ barrierQuantity: Integer
+ bottomDeltaQuantity: Integer
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

CellSource

+ Type: String
::OriginalBioSpecimen
+ amount: int
+ unitofMeasure: String
+ statusCode: String
+ statusDateRange: String
::ResultInterpretation
+ id: Integer
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String
::Node
+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

ResultInterpretation

+ id: Integer

[Figure: complete Gene Expression DAM class diagram. Association labels on the diagram: contains / part of; contains / coded by; made up of / is part of; binds / bound by; sourced from / produces; produces / produced by; represented by / represents; done on / undergo; makes / made by; specified by / specifies; defined by / defines; may have / defined by; created by / results in; coded by / codes; may have / can be associated to; appears in / represents a section of; ships to; used for; may contain / belongs to; contains / is described by; belongs to / contains; used in / performed on; labelling produces / results from labelling; may produce / produced from; processing described by / describes processing for. Named links: arrayDataFilesLink, derivedArrayDataFilesLink, arrayDataMatrixFilesLink, derivedArrayDataMatrixFilesLink, printingProtocol. Multiplicities range over 1, 0..1, 0..* and 1..*.]

GeneticVariation

Bio-Specimen

Gene Expression

Color Coding Scheme

class Gene Expression

HL7 CG Elements

Joyce Additions

NCI Model Elements

BRIDG 2.1

Modified for CG

MIAME-MAGE

MAGE-TAB

Legend

CG DAM Views

• Process Models
  • Specimen Handling and Collection (based on NCI public protocol)
  • Genomics Testing Process (high level)
  • Future – interaction diagrams for message flows per Use Case

• Gene Expression – Whole Model
  • Bio-specimen
  • Experiment Definition (gene expression-specific protocol, not entire study)
  • Array Design
  • Common Classes
  • Data
  • Relationships

Generic Assay Overview

Generic Assay Overview 1

[Figure: Study, Experiment, Data, Protocol, Equipment, Software and ExperimentalItem classes, connected by associations with * multiplicities.]

Study: A detailed examination or analysis designed to discover facts about a system under investigation. Systems may include intact organisms, biologic specimens, and natural or synthetic materials.

Experiment: A coordinated set of actions and observations designed to generate data, with the ultimate goal of discovery or hypothesis testing.

Protocol: A rule which guides how an activity should be performed.

ExperimentalItem: Items used in the execution of an experiment: specimens (samples either taken from nature or created for the purpose of study, which are to be the subject of an experiment) and the reagents and supplies used in the execution of an experiment. It does not include instruments, analysis tools, or general-purpose resources (common reagents, lab equipment, personnel).
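The definitions above can be sketched as a small object model. This is only an illustration under stated assumptions: the diagram's association endpoints are not recoverable from the transcript, so the containment shown here (a Study holding Experiments, an Experiment holding Protocols and ExperimentalItems) and the `kind` attribute are hypothetical choices, and Data, Equipment and Software are left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Protocol:
    """A rule which guides how an activity should be performed."""
    name: str


@dataclass
class ExperimentalItem:
    """A specimen, reagent or supply used in an experiment."""
    name: str
    kind: str  # assumed values: "specimen", "reagent", "supply"


@dataclass
class Experiment:
    name: str
    protocols: List[Protocol] = field(default_factory=list)      # assumed association
    items: List[ExperimentalItem] = field(default_factory=list)  # assumed association


@dataclass
class Study:
    name: str
    experiments: List[Experiment] = field(default_factory=list)  # assumed association

    def specimens(self) -> List[str]:
        """Names of all items of kind 'specimen' across the study's experiments."""
        return [
            item.name
            for experiment in self.experiments
            for item in experiment.items
            if item.kind == "specimen"
        ]


# Hypothetical example data.
experiment = Experiment(
    name="expression profiling",
    protocols=[Protocol("hybridization")],
    items=[
        ExperimentalItem("tumor biopsy", "specimen"),
        ExperimentalItem("labelling kit", "reagent"),
    ],
)
study = Study("example study", [experiment])
print(study.specimens())
```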

Generic Assay Overview 2

Notes:

1. ProcessedData has an association to Finding; it is not included on the diagram to keep things focused.
   1. Isn’t the result of an analytical experiment what we’ve called ProcessedData?
   2. Do we need a distinction between Data and ProcessedData? Can we have a self-association on Data to handle both in the DAM?

2. Software needs to be defined.

3. What about an association from ExperimentalItem to ExperimentalStudy?
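The open question about a self-association on Data can be made concrete with a short sketch. It assumes one possible answer, not a decision recorded in the slides: a single Data class whose instances may point to the Data they were derived from, so that "processed" simply means "has at least one source". The names `derived_from`, `is_processed` and `lineage` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Data:
    name: str
    # Self-association: the Data items this one was derived from (assumption).
    derived_from: List["Data"] = field(default_factory=list)

    @property
    def is_processed(self) -> bool:
        """Data counts as processed when it has at least one source."""
        return bool(self.derived_from)

    def lineage(self) -> List[str]:
        """Names of all ancestors, nearest first (depth-first)."""
        names: List[str] = []
        for parent in self.derived_from:
            names.append(parent.name)
            names.extend(parent.lineage())
        return names


# Hypothetical processing chain.
raw = Data("raw_scan")
normalized = Data("normalized", derived_from=[raw])
summarized = Data("summarized", derived_from=[normalized])
print(summarized.lineage())
```

With this design no separate ProcessedData class is needed, at the cost of losing a distinct type to which a Finding association could be attached.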

New v3 Models for Future Ballot

• Domain Information Model (Genome)
  • Allows non-locus specific data (e.g., large deletions, cytogenetics, etc.) to be represented
  • Links to the locus-specific models, i.e., GeneticLoci & GeneticLocus

• Query Model
  • Based on the HL7 V3 Query by Parameter Infrastructure
  • Adds selected attributes from the Clinical Genomics models as parameters of the query message

Useful links

• HL7.org http://www.hl7.org/

• HL7 Wiki http://wiki.hl7.org/index.php?title=Main_Page

• Clinical Genomics Wiki http://wiki.hl7.org/index.php?title=CG

• HL7 Standards http://www.hl7.org/implement/standards/index.cfm

• HL7v3 Ballot Site http://www.hl7.org/v3ballot/html/welcome/environment/index.htm

• ICR (IRWG) Wiki https://wiki.nci.nih.gov/x/kQiG

• ICR (IRWG) comments on CG Gene Expression DAM https://wiki.nci.nih.gov/x/FZZ9AQ

• Clinical Genomics Oct 2010 Meeting Slides http://www.hl7.org/Special/committees/clingenomics/docs.cfm?wg_id=7&wg_docs_subfolder_name=presentations

Questions?

CG Gene Expression DAM May 2010 Ballot; Model Details

• Subpackages

• Array Design
  • Classes e.g. Array, ArrayDesign, ArrayGroup, Reporter etc.

• Common Classes
  • Identifiable, OntologySource, OntologyTerm etc.

• Data
  • DataFile, DataMatrix, Image, ImageAcquistion etc.

• Design Element
  • DesignElement, DimensionElement etc.

• Experiment Definition
  • GenomicProtocol, LabExperiment, NormalizationTypes, ProtocolParameter etc.

• Relationship
  • Relationships between: Samples, Arrays and Data

• Bio-Specimen Diagrams
  • Classes e.g. BioSpecimen, Bio-Specimen-Characteristics, Specimen Handling etc.

Clinical Genomics DAM May 2010 Ballot; Terminology

• Terminology: definitions from NCI EVS team for a number of terms needed for genetic sample type entries

• nDNA (Nuclear DNA)

• pDNA (plasmid DNA)

• RNA (Ribonucleic acid)

• RNAP (RNA polymerase)

• mRNA (Messenger Ribonucleic Acid)

• snRNA (Small nuclear RNA)

• miRNA (microRNA)

• ssRNA (single-stranded RNA)

• dsRNA (double-stranded RNA)

• snoRNA (small nucleolar RNA)

• tRNA (Transfer RNA)

• hnRNA (heterogeneous nuclear RNA)

• RNP (Ribonucleoprotein)

• snRNP (small nuclear ribonucleoproteins)
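One way to use such a term list is as a controlled vocabulary for the geneticSampleType attribute of ExtractedGeneticSample. The sketch below is an assumption about usage rather than part of the ballot material: it encodes the fourteen terms above as a Python `Enum` and adds a hypothetical helper, `parse_sample_type`, that resolves an abbreviation regardless of letter case.

```python
from enum import Enum


class GeneticSampleType(Enum):
    """Terms from the slide; member names are the upper-cased abbreviations."""
    NDNA = "Nuclear DNA"
    PDNA = "plasmid DNA"
    RNA = "Ribonucleic acid"
    RNAP = "RNA polymerase"
    MRNA = "Messenger Ribonucleic Acid"
    SNRNA = "Small nuclear RNA"
    MIRNA = "microRNA"
    SSRNA = "single-stranded RNA"
    DSRNA = "double-stranded RNA"
    SNORNA = "small nucleolar RNA"
    TRNA = "Transfer RNA"
    HNRNA = "heterogeneous nuclear RNA"
    RNP = "Ribonucleoprotein"
    SNRNP = "small nuclear ribonucleoproteins"


def parse_sample_type(abbreviation: str) -> GeneticSampleType:
    """Look up a term by abbreviation, e.g. 'mRNA'; raises KeyError if unknown."""
    return GeneticSampleType[abbreviation.strip().upper()]


print(parse_sample_type("mRNA").value)
```

Rejecting unknown abbreviations with an error, instead of storing free text, is the main reason to prefer an enumeration over the plain String type shown in the class diagram.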

CG Gene Expression DAM May 2010 Ballot

• Model available at http://www.hl7.org/v3ballot/html/domains/uvcg/uvcg_GeneExpressionDAM.htm#POCG_DO000000UV-GeneExpressionDam-ic

• Comments submitted by IRWG (ICR WS) on May 7, 2010: https://wiki.nci.nih.gov/display/ICR/IRWG+Review+of+HL7+CG+DAM+2.0

• Review of the ballot results on the Gene Expression DAM
  • Received 16 negative and 30 affirmative votes
  • Negatives from: CDISC, NCI, FDA and Siemens