Page 1: Clinical Genomics Joint with RCRIM Amnon Shabo Joyce Hernandez Mukesh Sharma.

Clinical Genomics Joint with RCRIM

Amnon Shabo

Joyce Hernandez

Mukesh Sharma

Page 2

AGENDA

• Gene Expression CMET Overview
• Genetic Reports CDA Ballot Overview
• Gene Expression DAM Update
• Generic Assay Overview
• Specimen Model

Page 3

Gene Expression CMET Overview

Page 4

Gene Expression DAM Update

• Currently reviewing results of the last ballot
• Next steps:
  – Finish NCI Generic Assay (IRWG)
  – Changes to GE DAM
    • Add “generic” classes from Generic Assay
    • Bring over additional BRIDG classes
    • Apply suggested changes from the ballot (use case, BRIDG compatibility)

Page 5

Clinical Genomics DAM (50,000 foot level view)

class Complete Diagram

A-Phenotype

AminoAcid

+ name: String

ArrayDataType

+ name: String
+ version: String

HL7 CG Elements

Joyce Additions

NCI Model Elements

BRIDG 2.1

Modified for CG

MIAME-MAGE

MAGE-TAB

Legend

Expression

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- text: String
- effectiveTime: String
- uncertaintycode: String
- value: String
- interpretationCode: String
- methodCode: String

This is the Domain Information Model for the HL7 Clinical Genomics Work Group.

It consists of the following topics:

1. Gene Expression
2. Genetic Variation
3. Genotype
4. Sequence
5. Proteomics
6. Links to Clinical Phenotypes

Entry point for the Gene Expression CMET POCG_RM000031UV

AssociatedProperty

- classCode: String
- id: Integer
- code: String
- text: String
- effectiveTime: String
- value: String
- methodCode: String

Gene

+ symbol: String
+ fullName: StringColumn
+ genbankAccession: String
+ genbankAccessionVersion: String
+ ensemblgeneID: String
+ unigeneclusterID: String
+ entrezgeneID: String

Chromosome

+ chormosomeNumber: Integer

DNA

- name

Need definitions

Use this class for inherent data about the locus, e.g. chromosome no.

RNA

- name

Nucleotides

+ nucleotideName: String

Intron

+ length: Integer
+ intronClass: String

Exon

+ length: Integer
+ intronClass: String

Nucleobases

+ shortName: String

Phosphate

+ name: String

Ester

+ name: String

Sugar

+ name: String

Codon

+ codonId: Integer

Usha: Relationship should be from Gene to DNA. Portions of DNA correspond to a Gene. Chromosomes would have a bunch of genes.

GeneticLocus

- id: Integer
- text: String
- methodCode: String
- chromosomePosition: Integer
- cellType: String

This class is a placeholder for specifying a locus on the genome, i.e., a position of a particular given sequence in the subject’s genome. Note that the semantics of the locus (e.g., gene) are defined by data assigned in the code & value attributes of this class, and also by placing additional data relating to this locus into the classes (and CMETs) associated with this class.

Genome

- classCode: String
- id: Integer
- code: String
- text: String
- effectiveTime: String
- confidentialityCode: String
- value: String
- interpretationCode: String
- methodCode: String

GeneticLoci

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- titile: String
- text: String
- statusCode: String
- effectiveTime: String
- confidentialityCode: String
- reasonCode: String
- value: String
- interpretationCode: String
- methodCode: String

LargeDuplicaiton

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- text: String
- effectiveTime: String
- uncertaintycode: String
- confidentialitycode: String
- value: String
- interpretationCode: String
- methodCode: String

GeneticDocument

- classCode: String
- id: Integer
- code: String
- title: String
- text: String
- statusCode: String
- effectiveTime: String
- confidentialityCode: String
- languageCode: String
- setId: Integer
- versionNumber: Integer

LargeDeletion

- classCode: String
- id: Integer
- code: String
- text: String
- effectiveTime: String
- confidentialitycode: String
- uncertaintycode: String
- value: String
- interpretationCode: String
- methodCode: String

Sequence

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- titile: String
- text: String
- statusCode: String
- effectiveTime: String
- reasonCode: String
- value: String
- interpretationCode: String
- methodCode: String

Need definitions.

Need definitions.

Need definitions.

Use the value attribute to encapsulate raw data relating to the entire set of loci. For example, SNP genotyping of a large number of genes/markers.

Cytogenetics

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- text: String
- effectiveTime: String
- confidentialitycode: String
- uncertaintycode: String
- value: String
- interpretationCode: String
- methodCode: String

???OtherNonLocusData

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- text: String
- effectiveTime: String
- confidentialitycode: String
- uncertaintycode: String
- value: String
- interpretationCode: String
- methodCode: String

Need definitions.

Need definitions. Need definitions.

Need definitions.

GenotypeFinding

- normalizedXIntensity: float
- normalizedYIntensity: float
- rawXIntensity: float
- rawYIntensity: float
- call: String

IndividualAllele

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- titile: String
- text: String
- statusCode: String
- effectiveTime: String
- reasonCode: String
- value: String
- interpretationCode: String
- methodCode: String

SequenceVariation

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- titile: String
- text: String
- statusCode: String
- effectiveTime: String
- reasonCode: String
- value: String
- interpretationCode: String
- methodCode: String

DeterminantPepetide

- classCode: String
- id: Integer
- text: String
- effectiveTime: String
- value: String
- methodCode: String

AssociatedObservation

- id: Integer
- name: String
- copyNumber: Integer
- zygosity: String
- dominancy: String
- geneFamily: String

Polypeptide

- classCode: String
- id: Integer
- text: String
- effectiveTime: String
- value: String
- methodCode: String

Need definitions. Need definitions.

Should we leave this out and just add classes as needed?

Entry path to the broadest path of the genetic variation model.

ViralGenetics

- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- text: String
- effectiveTime: String
- confidentialitycode: String
- uncertaintycode: String
- value: String
- interpretationCode: String
- methodCode: String

Added as a placeholder. > Future Expansion <

Need definitions

SNPAssay

- designAlleles: String
- designScore: Float
- designSequence: String
- designStrand: String
- id: Long
- status: String
- vendorAssayId: String
- version: String

SNPPanel

- assayCount: Integer
- description: String
- id: Long
- name: String
- technology: String
- vendor: String
- vendorPanelId: String
- version: String

SNP Design classes

Material

+ id: Integer
+ description: String
+ name: String
+ formcode: String

ExtractedNon-GeneticSample

- extractedSsampleId: Integer
- extractedAmount: Integer
- extractedAmountUOM: String
- extractionMethod: String
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String

ExtractedGeneticSample

+ geneticSampleId: Integer
+ geneticSampleType: String
+ extractedAmount: Integer
+ extractionMethod: String
+ GeneticSampleType: int
+ hybridization: int
+ authorizationLink: url
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String

OriginalBioSpecimen

+ amount: int
+ unitofMeasure: String
+ statusCode: String
+ statusDateRange: String
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String
::Node
+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

HandlingDocument

- Id: int
- text: String

SpecimenCharacteristics

+ Id: int
+ color: String
+ clarity: String
+ condition: String

Collection

- collectionMethod: int
- id: int
::HandlingDocument
- Id: int
- text: String

Storage

+ id: Integer
+ flashFrozenMethod: String
+ temp: Integer
+ storageMethod: String
::HandlingDocument
- Id: int
- text: String

Transportation

- id: Integer
::HandlingDocument
- Id: int
- text: String

Assume this is a generic list of all material. Specific material used and tracked within the conduct of a study and/or clinical care would be uniquely identified via other classes (i.e. extracted or resectioned samples). The identifier is used only for the original biological specimen.

ArrayGroup

- arraySpacingX: float
- arraySpacingY: float
- barcode: String
- length: float
- numArrays: Integer
- orientationMark: enum(top,bottom,left,right)
- width: float

Array

+ arrayIdentifier: String
+ arrayXOrigin: Integer
+ arrayYOrigin: Integer
+ originRelativeTo: String

ArrayDesign

+ id: Integer
+ version: String
+ comment: String
+ substrateType: String
+ surfaceType: String
+ sequecnePolymerType: String
+ contactId: Integer
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

ArrayManufacture

- manufacurungDate: String
- tolerance: Integer

Gene expression Design classes

Do we need separate classes for Array Design (GE versus Genetic variation)? The attributes I have added are from the new MAGE-TAB model. Do we still need number of features (this came from the old version of the model)?

LabeledExtract

- flourescentLabelingSubstance: String
- flourescentLabelingSubstanceAmount: float
- flourescentLabelingSubstanceUnits: float

Hybridization

- name: String
- amountOfMaterial: float

ArrayManufactureDeviation

This area of the MAGE model seems to consist of placeholders. There are relationships to both FeatureDefect and ZoneDefect, neither of which has attributes.

FeatureDefect: Stores the defect information for a feature. This class points to PositionDelta, which has coordinate information (delta X,Y). PositionDelta points to DistantUnit, which contains additional measurement data. FeatureDefect points to an OntologyEntry, which contains controlled vocabulary. The link constrains the vocabulary entries to represent only "defectType".

ZoneDefect: Stores the defect information for a zone. This class points to the Zone class, which does have lower-right X,Y and upper-left X,Y coordinates, plus a row identifier.

Channel

- channel_no: Integer

BioAssayTreatment

+ bioAssayProcess: String
::Node
+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

ImageAcquistion

+ imageAcquistionMethod

Image

+ name: String
+ url: String
::DataFile
+ id: String
+ name: String
+ dataFileType: String
+ dataFormat: String
::Data
+ uri: String
+ datatype: String

Should we embed the image as blob, rather than point to it? Or provide both options?
ANS: Too huge to store in the database. It is rare to go back to them. But some folks want to keep the images. They could be kept in secondary.

DerivedBioAssayData

Need Array Manufacturing control data. Not chip but chip by overall.

NOTE: Image is scanned at different wavelengths.

FactorValue

- value: String
::Measurement
+ value: String
+ minvalue: String
+ maxvalue: String
+ unit: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Lab_Experiment

+ id: Integer
+ title: String
+ description: String
+ date: date
+ assayType: String
+ experimentalDesigns: String
+ formatVersion: String
+ publicIdentifier: String
+ sdrfFile: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Factor

+ type: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

ExperimentalDesigns

+ type: String
+ description: String

QualityControl

+ type: String
+ qualityControlDescription: String

Replicatetypes

+ replicateType: String
+ replicateDescription: String

NormalizationTypes

+ normalizationType
+ normalizationDescription: String

These classes need to be harmonized to the Study Design portion of the BRIDG model.

GenomicProtocol

+ id: Integer
+ name: String
+ type: String
+ description: String
+ hardware: String
+ software: String
+ contact: String
+ url: String
+ publicProtocolUrl: String

Assume there can be multiple "experiments" for complex studies. ???

Also the new version calls this an Investigation. When talking about genomics testing a lot of SMEs use the term "Experiment". Investigation can also be connected more easily to the term study, which already has a broader scope since it represents the "clinical trial" used in the research context.

Another factor is that the MGED ontology makes references to "ExperimentalProtocol" in a number of places, so it might be better to keep a known term.

Which terms does the team prefer? Is there a term that could fit both research and healthcare use?

This class will need specific harmonization to the Study class in BRIDG. Hardware/software requirements for the arrays need to be added. These should probably be normalized into separate classes.

For CG DAM model reviewers: we need more examples. Is there other software required other than the Reporter?

ProtocolApplications

+ edgeId: Integer
+ order: Integer
+ notes: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Is this the proper way to identify an individual channel?

Germline/Somatic needs to be validated by a lab test. Should it just be represented as part of a test and taken out of the bio-specimen?

ImageFile

+ name: String
+ status: String
+ type: String

Raw ArrayData

DataFile

+ id: String
+ name: String
+ dataFileType: String
+ dataFormat: String
::Data
+ uri: String
+ datatype: String

Need definition on what type of data is carried here and which function in the process populates it.

Validate that "ordered" means sequenced and does not represent "ordered" from lab.

Feature

+ featureId: Integer
+ blockCol: Integer
+ blockRow: Integer
+ col: Integer
+ row: Integer
+ reporterid: Integer
::DesignElement
+ id: Integer
+ compositid: Integer
+ arrayDesignId: Integer
+ featureID: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Reporter

+ id: Integer
+ controlType: String
+ sequence: String
+ databaseEntry: String
::DesignElement
+ id: Integer
+ compositid: Integer
+ arrayDesignId: Integer
+ featureID: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

ReporterGroup

+ reporterGroupId: Integer
+ name: String
::DesignElement
+ id: Integer
+ compositid: Integer
+ arrayDesignId: Integer
+ featureID: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Need to differentiate between frozen and fixed. For breast cancer. Containers need to be added. Example: Non-frozen and frozen tissue samples need to be included. Unfixed tissue sections (slide type and slide mount. In healthcare)

Add class to handle the change of state of the material.

Typically called protocol of treatment.

In MAGE this is the actual Image and everything that was done to get it.

Can do another treatment and get another image. Actual steps are not kept for all images. Usually only recorded for the last image.

JH: MAGE-TAB model conflicts with these statements. It has an Assay class as part of the sdrf package and an Image class as part of the data package. I will rename this Bio-Assay class to just Assay.

NOTE: Mollie: Wants to constrain model for clinical environment at a later point.

Hardware

+ description: String
+ version: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Software

+ description: String
+ version: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Edge

+ id: Integer
+ experimentIdentifier: String
+ input: String
+ output: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Need examples for EDGE data, primarily for input and output. Couldn't find any at the magetab and tabemage sites.

Node

+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

DimensionElement

+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
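The `::Identifiable`, `::DimensionElement` and `::Node` groups that recur in the attribute lists above look like inherited attributes. A minimal Python sketch of that reading follows, assuming the chain Identifiable → DimensionElement → Node; the snake_case field names, the default values and the example URI are illustrative assumptions, not part of the DAM.

```python
from dataclasses import dataclass


@dataclass
class Identifiable:
    # Attributes listed under "::Identifiable" in the diagram dump.
    id: str  # typed as URI in the diagram; a plain string here
    name: str = ""
    properties: str = ""
    description: str = ""


@dataclass
class DimensionElement(Identifiable):
    # Adds the single attribute "node: String".
    node: str = ""


@dataclass
class Node(DimensionElement):
    # Adds "experimentId: Integer" (renamed to snake_case for this sketch).
    experiment_id: int = 0


# Hypothetical instance; the URI and values are made up for illustration.
scan_node = Node(id="urn:example:node:1", name="scan", node="n1", experiment_id=42)
print(scan_node)
```

Under this reading, a class such as Scan or Normalization would extend `Node` and pick up all six inherited attributes without restating them.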

Performer

+ id: Integer
+ personID: Integer
+ protocolID: Integer

Person

+ firstname: String
+ lastname: String
+ midinitials: String
+ affiliation: String
::Contact
+ Address: String
+ phone: String
+ email: String
+ fax: String
+ tollFreePhone: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

PersonRole

+ role: String
::Person
+ firstname: String
+ lastname: String
+ midinitials: String
+ affiliation: String
::Contact
+ Address: String
+ phone: String
+ email: String
+ fax: String
+ tollFreePhone: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Publication

+ id: Integer
+ pubMedID: String
+ title: String
+ publicationDOI: String
+ authorlist: String
+ status: String

Contact

+ Address: String
+ phone: String
+ email: String
+ fax: String
+ tollFreePhone: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Assay

+ arrayIdentifier: String
+ arrayDataFiles: String
+ derivedArrayDataFiles: String
+ arrayDataMatricFiles: String
+ derivedArrayDataMatrixFiles: String

TechnologyType

+ technologyType: String

CompositElement

+ id: Integer
+ reporterID: Integer
+ databaseEntry: String
::DesignElement
+ id: Integer
+ compositid: Integer
+ arrayDesignId: Integer
+ featureID: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

DesignElement

+ id: Integer
+ compositid: Integer
+ arrayDesignId: Integer
+ featureID: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Question on EBI example: http://www.ebi.ac.uk/miamexpress/help/array_designs.html#ADF

Should the CompositeSequenceComment be represented as a database entry (Ontology Term / Value pair) or as a variable?

Data

+ uri: String
+ datatype: String

DataElement

+ id: Integer
+ datamatrixId: Integer
+ col: Integer
+ row: Integer
+ rowQuantitationType: String
+ index_: Integer
+ secondayKey: String

DataMatrix

Need more information on how this is implemented. Description seems to indicate calculation. See MGED section below:

class QuantitationType
definition: The QuantitationType provides a method for calculating a single datum of the BioAssayData matrix.
superclasses: QuantitationTypePackage
properties: unique_identifier MO_67; class_role abstract; class_source mage
constraints: restriction: has_scale has-class Scale; restriction: has_type has-class DataType

Name: Complete Diagram
Author: hernajoy
Version: 1.0
Created: 2006-01-11 12:00:00 AM
Updated: 2010-06-17 7:10:15 PM

DATA MATRIX EXAMPLE from: http://tab2mage.sourceforge.net/docs/magetab_docs.html#datamatrix

Bio-Specimen-Characteristics

+ term: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

This is equivalent to Material class in the MAGE-TAB model.

Material in this model applies to BRIDG and HL7 expanded scope which goes beyond biologic material.

Assume this class needs to represent the many to many associations between the following MGED concepts. These associations attempt to group mathematical functions into nodes.

1. Nodes
2. Node Values
3. Node Value Types
4. BioAssays
5. BioAssayDataCluster

Normalization

+ derviedArrayDataFile: String
::Node
+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Scan

+ arrayDataFiles: String
+ derivedArrayDataFiles: String
+ arrayDataMatricFiles: String
+ derivedArrayDataMatrixFiles: String
::Node
+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Measurement

+ value: String
+ minvalue: String
+ maxvalue: String
+ unit: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Need sample data for this class.

ProtocolParameter

::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

ParameterValue

+ protocolParameterId: Integer
+ protocolApplication: Integer
::Measurement
+ value: String
+ minvalue: String
+ maxvalue: String
+ unit: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

Source

+ contactid: Integer
::OriginalBioSpecimen
+ amount: int
+ unitofMeasure: String
+ statusCode: String
+ statusDateRange: String
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String
::Node
+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

TreatedSample

::ExtractedGeneticSample
+ geneticSampleId: Integer
+ geneticSampleType: String
+ extractedAmount: Integer
+ extractionMethod: String
+ GeneticSampleType: int
+ hybridization: int
+ authorizationLink: url
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String

NameValueType

+ id: Integer
+ name: String
+ type: String

Definition of Experiment

SPECIMEN HANDLING

ARRAY DESIGN

RELATIONSHIPS BETWEEN: (Samples, Arrays and Data)

GENE EXPRESSION DATA

Usha: May not need sugar and phosphate data.

Specimen Handling

+ type: String
+ name: String
+ amount: Integer
::HandlingDocument
- Id: int
- text: String

Shipper

- dateShipped: String
- senderType: String
- senderName: String
::Transportation
- id: Integer
::HandlingDocument
- Id: int
- text: String

Receiver

- dateRecieved: String
- receiverType: String
- receiverrName: String
::Transportation
- id: Integer
::HandlingDocument
- Id: int
- text: String

SpecimenContainer

+ containerType: String
+ risk: String
+ handling: String
+ capacityQuantity: Integer
+ heightQuantity: Integer
+ diameterQuantity: Integer
+ capType: String
+ separatorType: String
+ barrierQuantity: Integer
+ bottomDeltaQuantity: Integer
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

CellSource

+ Type: String
::OriginalBioSpecimen
+ amount: int
+ unitofMeasure: String
+ statusCode: String
+ statusDateRange: String
::ResultInterpretation
+ id: Integer
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String
::Node
+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String

ResultInterpretation

+ id: Integer

Association labels from the diagram (endpoints and most multiplicities are not recoverable from the extracted text):

Contain / about
contains * / coded by 1
made up of * / is part of 1
binds * / bound by 1
sourced from / derived collection store
Sourced from / produces
produces / produced by
represented by / represents
contains 1.* / part of 1
done on an * / undergo 0..1
makes * / made by 1
specified by 1 / specifies *
contain 1.* / part of 1
defined by 1 / defines *
may have 0.* / defined by 1
created by 1 / results in 0..*
coded by 1.* / codes 0..1
1 may have * / * can be associated to 1
arrayDataMatrixFilesLink
appears in * / represents a section of 1
ships to 0..*
+usedfor
derivedArrayDataMatrixFilesLink
contains * / part of 1
may contain 1.* / belongs to
arrayDataFilesLink
derivedArrayDataFilesLink
contains / is described by
printingProtocol
belongs to / contains
used in 0.* / performed on 0.*
labelling produces 1 / results from labelling 1
may produce / produced from
processing described by / describes processing for

GeneticVariation

Bio-Specimen

Gene Expression

Page 6

Color Coding Scheme

class Gene Expression

HL7 CG Elements

Joyce Additions

NCI Model Elements

BRIDG 2.1

Modified for CG

MIAME-MAGE

MAGE-TAB

Legend

Page 7

CG DAM Views

• Process Models
  – Specimen Handling and Collection (based on NCI public protocol)
  – Genomics Testing Process (high level)
  – Future – interaction diagrams for message flows per Use Case

• Gene Expression
  – Whole Model
  – Bio-specimen
  – Experiment Definition (gene expression-specific protocol, not entire study)
  – Array Design
  – Common Classes
  – Data
  – Relationships

Page 8

Diagram classes: Study, Experiment, Data, Protocol, Equipment, Software, ExperimentalItem (associations marked *)

- Study may include other Studies
- Study may be composed of many Experiments
- Experiment may include other Experiments
- Experiment may involve multiple ExperimentalItems
- Experiment may be based on multiple Protocols
- Experiment may be performed using multiple Equipment
- Experiment may be performed using multiple Software
- Experiment may produce multiple Data (Output)

Generic Assay Overview

Page 9

Experiment:
- Affymetrix U133P2 Gene Expression
- Affymetrix U133P2 Analysis
- Specimen definition information entry (might be a component of Affymetrix U133P2 Gene Expression)
- Total RNA extraction and QC (might be a component of Affymetrix U133P2 Gene Expression)
- cDNA synthesis and cleanup
- U133P2 array scan (GCOS: create *.dat and create (.dat to) *.cel)
- GCOS U133P2 Gene Expression Analysis (might be a component of Affymetrix U133P2 Analysis)

Diagram classes: Study, Experiment, Data, Protocol, Equipment, Software, ExperimentalItem (associations marked *)

Study:
- Gene expression analysis of tumor/non-tumor sample pair

Examples of Data (i.e., Output):
- A_U133P2_cDNA
- A_U133P2_cDNA_gel_tif
- A_U133P2_cDNA_gel_doc
- A_U133P2_SpecimenHybChipWashed (ready for stain and wash)
- A_U133P2_Specimen_dat
- A_U133P2_Specimen_cel
- A_U133P2_Specimen_chp (data file with genotypes)

ExperimentalItem:
- Project-specific specimen set

Equipment:
- Thermocycler
- gel apparatus
- camera/image system
- Affymetrix Fluidics WashStation 450
- Affymetrix GS3000 scanner

Protocol:
- Affymetrix Cytogenetics Assay Protocol
- Affymetrix Protocol for One-Cycle cDNA Synthesis

Software:
- image acquisition application
- Agilent 2100 Operating Software
- GCOS application
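As a rough illustration, the worked example on this slide could be captured as nested data keyed by the Generic Assay classes. The names are copied from the slide; the dictionary layout, the key names and the `summarize` helper are assumptions made for this sketch.

```python
# Names are taken from the slide; the layout and key names are assumptions.
generic_assay_example = {
    "study": "Gene expression analysis of tumor/non-tumor sample pair",
    "experiments": [
        "Affymetrix U133P2 Gene Expression",
        "Affymetrix U133P2 Analysis",
        "Specimen definition information entry",
        "Total RNA extraction and QC",
        "cDNA synthesis and cleanup",
        "U133P2 array scan",
        "GCOS U133P2 Gene Expression Analysis",
    ],
    "experimental_items": ["Project-specific specimen set"],
    "protocols": [
        "Affymetrix Cytogenetics Assay Protocol",
        "Affymetrix Protocol for One-Cycle cDNA Synthesis",
    ],
    "equipment": [
        "Thermocycler",
        "gel apparatus",
        "camera/image system",
        "Affymetrix Fluidics WashStation 450",
        "Affymetrix GS3000 scanner",
    ],
    "software": [
        "image acquisition application",
        "Agilent 2100 Operating Software",
        "GCOS application",
    ],
    "data": [
        "A_U133P2_cDNA",
        "A_U133P2_cDNA_gel_tif",
        "A_U133P2_cDNA_gel_doc",
        "A_U133P2_SpecimenHybChipWashed",
        "A_U133P2_Specimen_dat",
        "A_U133P2_Specimen_cel",
        "A_U133P2_Specimen_chp",
    ],
}


def summarize(example: dict) -> dict:
    """Count the entries listed under each class (hypothetical helper)."""
    return {key: len(value) for key, value in example.items() if isinstance(value, list)}


print(summarize(generic_assay_example))
```

The flat lists deliberately ignore the "might be a component of" remarks on the slide, since the nesting of experiments is still an open question there.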

Generic Assay Overview

Page 10

Diagram classes: Study, Experiment, Data, Protocol, Equipment, Software, ExperimentalItem (associations marked *)

- Study may include other Studies
- Study may be composed of many Experiments
- Study may be performed according to multiple Protocols
- Experiment may include other Experiments
- Experiment may involve multiple ExperimentalItems
- Experiment may be performed according to multiple Protocols
- Experiment may be performed using multiple Equipment
- Experiment may be performed using multiple Software
- Experiment may produce multiple Data (Output)
- Experiment may be performed on Data (data an input for analytical experiment)
- Protocol may include other Protocols
- Protocol may specify Equipment
- Protocol may specify Software
- Equipment may specify Software
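The relationships listed above can be sketched as Python dataclasses. This is only one possible reading of the Generic Assay model: the field names, the use of plain lists for every "may ... multiple" association, and the `collect_outputs` helper are assumptions, not part of the DAM.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Software:
    name: str


@dataclass
class Equipment:
    name: str
    software: list[Software] = field(default_factory=list)  # Equipment may specify Software


@dataclass
class Protocol:
    name: str
    sub_protocols: list[Protocol] = field(default_factory=list)  # Protocol may include other Protocols
    equipment: list[Equipment] = field(default_factory=list)
    software: list[Software] = field(default_factory=list)


@dataclass
class ExperimentalItem:
    name: str


@dataclass
class Data:
    name: str


@dataclass
class Experiment:
    name: str
    sub_experiments: list[Experiment] = field(default_factory=list)
    items: list[ExperimentalItem] = field(default_factory=list)
    protocols: list[Protocol] = field(default_factory=list)
    equipment: list[Equipment] = field(default_factory=list)
    software: list[Software] = field(default_factory=list)
    inputs: list[Data] = field(default_factory=list)   # Data as input to an analytical experiment
    outputs: list[Data] = field(default_factory=list)  # Data produced by the experiment


@dataclass
class Study:
    name: str
    sub_studies: list[Study] = field(default_factory=list)
    experiments: list[Experiment] = field(default_factory=list)
    protocols: list[Protocol] = field(default_factory=list)


def collect_outputs(experiment: Experiment) -> list[Data]:
    """Gather outputs of an experiment and all nested experiments (hypothetical helper)."""
    found = list(experiment.outputs)
    for child in experiment.sub_experiments:
        found.extend(collect_outputs(child))
    return found


# Illustrative instances; nesting the scan under the gene expression experiment is an assumption.
scan = Experiment(
    "U133P2 array scan",
    outputs=[Data("A_U133P2_Specimen_dat"), Data("A_U133P2_Specimen_cel")],
)
expression = Experiment("Affymetrix U133P2 Gene Expression", sub_experiments=[scan])
study = Study(
    "Gene expression analysis of tumor/non-tumor sample pair",
    experiments=[expression],
)
print([d.name for d in collect_outputs(expression)])
```

Every association is modeled as an optional list, which matches the "may" wording in the bullets; a stricter model would need the actual multiplicities from the diagram.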

Study: A detailed examination or analysis designed to discover facts about a system under investigation. Systems may include intact organisms, biologic specimens, and natural or synthetic materials.

Experiment: A coordinated set of actions and observations designed to generate data, with the ultimate goal of discovery or hypothesis testing.

Protocol: A rule which guides how an activity should be performed.

Equipment: An object intended for use, whether alone or in combination, for diagnostic, prevention, monitoring, therapeutic, scientific, and/or experimental purposes. For example: mass spectrometer, PCR machine, microscope, pH meter.

ExperimentalItem: Items used in the execution of an experiment: specimens (samples either taken from nature or created for the purpose of study, which are to be the subject of an experiment), and reagents and supplies which will be used in the execution of an experiment. It does not include instruments, analysis tools, and general-purpose resources (common reagents, lab equipment, personnel).

Data: A collection or single item of factual information, derived from measurement or research, from which conclusions may be drawn. For example, an image, a .DAT, or .CEL file.

ProcessedData: Data derived from other data. For example, image annotations derived from an image, or the outcome of running a .CEL file through an analytical tool.

The notion of what is data (vs. processed data) is defined by community consensus and may be mutable. Some may consider the .DAT file to be data, and that the .CEL file is processed, while others may consider the .CEL file itself to be data (unprocessed).

= Proposed last week

Generic Assay Overview

Page 11

Notes:
1. ProcessedData has association to Finding; not included on the diagram to keep things focused
   1. Isn’t the result of an analytical experiment what we’ve called ProcessedData?
   2. Do we need to have distinction between Data and ProcessedData? Can we have self association on Data to handle both in the DAM?
2. Software needs to be defined
3. What about association from ExperimentalItem to ExperimentalStudy?
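The self-association question in the notes can be explored with a small sketch: a single `Data` class whose instances may point back at the data they were derived from, so that "processed" becomes a derived property rather than a separate class. The `derived_from` field, the `is_processed` property and the `lineage` method are hypothetical names, and treating the .chp file as derived from the .cel file is an assumption for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Data:
    name: str
    # Self-association: the data items this one was derived from (empty for raw data).
    derived_from: list[Data] = field(default_factory=list)

    @property
    def is_processed(self) -> bool:
        # Under this sketch, "processed" simply means "has at least one source".
        return bool(self.derived_from)

    def lineage(self) -> list[str]:
        """Names of all ancestors, nearest first (depth-first walk)."""
        names: list[str] = []
        for parent in self.derived_from:
            names.append(parent.name)
            names.extend(parent.lineage())
        return names


dat = Data("A_U133P2_Specimen_dat")
cel = Data("A_U133P2_Specimen_cel", derived_from=[dat])
chp = Data("A_U133P2_Specimen_chp", derived_from=[cel])  # assumed derivation

print(dat.is_processed, cel.is_processed)
print(chp.lineage())
```

One consequence of this design is that the community-dependent boundary between data and processed data does not need to be fixed in the model: a group that treats the .CEL file as unprocessed could simply ignore links above it.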

Generic Assay Overview

Page 12

Specimen Model Overview (Mukesh Sharma)