9/28/2004 REMBRANDT: Empowering Translational Research… REpository of Molecular BRAin Neoplasia DaTa. HL7 Clinical Genomics SIG, Atlanta, September '04.

Mar 27, 2015

Lily Andrews
Transcript
Page 1:

9/28/2004

REMBRANDT: Empowering Translational Research…

REpository of Molecular BRAin Neoplasia DaTa
HL7 Clinical Genomics SIG
Atlanta, September '04

Page 2:

Agenda

Translational Research – Why do we care?
GMDI – How we got here?
Conceptual Model
Gene Expression Use Case analysis
Gene Expression Data analysis
Wire Frames
System Architecture
Object Model
Data warehouse design

Page 3:

Translational Research – Why do we care?

Iressa Drug Case Study (at Harvard Medical School)
  Targeted towards lung cancer
  Phase II trial – A minority of patients showed dramatic tumor shrinkage
  Phase III randomized trial – No survival improvement
  Patients with mutations in Iressa's target, EGFR, showed response to the drug
  The future of pharmacogenomics is based on translational research

Reference: Clinical Pharmacogenomics: Almost a reality; Modern Drug Discovery, August 2004

Page 4:

Scientific goals of GMDI

Develop a molecular classification schema that is both clinically and biologically meaningful, based on gene expression and genomic data from tumors (gliomas) of patients who will be followed prospectively through the natural history and treatment phases of their illness

Page 5:

Rembrandt Knowledgebase

Better understanding

Better treatments

Expression array data

Clinical data

SNP array data

Proteomics data

Page 6:

REMBRANDT Project Goals

Produce a national molecular/genetic/clinical database of several thousand primary brain tumors that is fully open and accessible to all investigators (including intramural and extramural)

Provide informatics support to molecularly characterize a large number of adult and pediatric primary brain tumors and to correlate those data with extensive retrospective and prospective clinical data

Page 7:

Functional genomics data in the knowledge-base

RNA

100K SNP array

Protein

Tissue Arrays for ISH

Gene Expression Analysis

Affy/Oligo Arrays

cDNA/GenePix Arrays

Real-time RT-PCR

LOH Copy No.

DNA

Proteomics (Mass Spec)

Tissue Arrays (IHC)

ArrayCGH

Page 8:

Conceptual Model

User

Input

Sample

SNP

Abnorm_Status

SNP_Expt

Call

Gene

Change_Status

Expr_Expt

E-value

BAC_ID

Map_Location

CGH_Expt

Abnorm_Status

Patient

Survival

Prior_Therapy

Outcome

Demographics

Pathology

Trial

Time course

C3D

caCORE

caArray

Page 9:

REMBRANDT will Leverage NCICB and caBIG Infrastructure Components

Aligns with caBIG principles:
  Open source
  Open access
  Syntactic and semantic interoperability
  Federated data

NCICB Infrastructure
  caArray gene expression data repositories and analysis tools
  Cancer Genome Anatomy Project (CGAP) genomic tools
  C3D Clinical Informatics System
  caCORE Infrastructure (caBIO, EVS, caDSR)

caBIG Infrastructure being delivered by caBIG workspaces

Page 10:

Typical Rembrandt Search

Show me the tumors (tumor samples) that have amplification and over-expression of the genes EGFR and Cyclin D1.

Restrict the search to cases with amplification confirmed by SNP chip and CGH, and over-expression confirmed by oligo and cDNA arrays.

Presentation of Results
  Which genes are under-expressed with respect to normal?
  Does this subset of tumors have better survival?
  Do they segregate to a certain age group, geographical area or ethnicity?
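The search above can be sketched as a filter over per-sample evidence. This is only an illustration under stated assumptions: the `samples` layout, the `REQUIRED` mapping and the `matching_samples` helper are hypothetical and are not part of the REMBRANDT code base. CCND1 is the gene symbol for Cyclin D1.

```python
# Hypothetical per-sample evidence: for each finding and gene, the set of
# platforms that support it. The layout is an assumption for illustration.
samples = {
    "S1": {
        "amplified": {"EGFR": {"SNP", "CGH"}, "CCND1": {"SNP", "CGH"}},
        "overexpressed": {"EGFR": {"Oligo", "cDNA"}, "CCND1": {"Oligo", "cDNA"}},
    },
    "S2": {
        "amplified": {"EGFR": {"SNP"}, "CCND1": {"SNP", "CGH"}},
        "overexpressed": {"EGFR": {"Oligo", "cDNA"}, "CCND1": {"Oligo", "cDNA"}},
    },
}

# Platforms that must all confirm each finding, as stated in the search.
REQUIRED = {"amplified": {"SNP", "CGH"}, "overexpressed": {"Oligo", "cDNA"}}


def matching_samples(samples, genes):
    """Return IDs of samples where every gene has every finding confirmed
    on all required platforms."""
    hits = []
    for sample_id, evidence in samples.items():
        if all(
            platforms <= evidence.get(finding, {}).get(gene, set())
            for finding, platforms in REQUIRED.items()
            for gene in genes
        ):
            hits.append(sample_id)
    return sorted(hits)


print(matching_samples(samples, ["EGFR", "CCND1"]))  # ['S1']
```

Sample S2 is excluded because its EGFR amplification is supported by the SNP chip only.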

Page 11:

True Measure of Translational Research

To present all of the down-regulated genes within each sample in the result set, we have to pivot the result set on its gene expression axis.

All translational queries should allow the user to pivot easily between:
  Disease View
  Patient / Sample View
  Experiment / Annotations View
  Time Course View
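A pivot of this kind can be sketched in a few lines. The row layout (sample, gene, ratio), the `pivot_by_sample` helper and the 0.5 cutoff for calling a gene down-regulated are assumptions made for this example; the slides do not specify them.

```python
from collections import defaultdict

# Hypothetical flat result set: one row per (sample, gene) measurement.
# "ratio" stands for tumor intensity divided by normal intensity.
rows = [
    {"sample": "S1", "gene": "EGFR", "ratio": 4.2},
    {"sample": "S1", "gene": "PTEN", "ratio": 0.3},
    {"sample": "S2", "gene": "EGFR", "ratio": 3.1},
    {"sample": "S2", "gene": "PTEN", "ratio": 0.4},
    {"sample": "S2", "gene": "TP53", "ratio": 0.2},
]


def pivot_by_sample(rows, max_ratio=0.5):
    """Group down-regulated genes (ratio <= max_ratio) under each sample."""
    pivoted = defaultdict(list)
    for row in rows:
        if row["ratio"] <= max_ratio:
            pivoted[row["sample"]].append(row["gene"])
    return {sample: sorted(genes) for sample, genes in pivoted.items()}


print(pivot_by_sample(rows))  # {'S1': ['PTEN'], 'S2': ['PTEN', 'TP53']}
```

The same rows could be regrouped by gene or by disease to produce the other views listed above.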

Page 12:

High-level Search Use cases

Advanced REMBRANDT Dataset Search

Search RBT Affy Gene Expression Dataset

Search RBT cDNA Expression Dataset

Search RBT SNP Array Dataset

Search RBT ArrayCGH Dataset

Search RBT Clinical Dataset

<<Extends>>

Search RBT Gene Expression Dataset

<<Extends>>

<<Extends>> <<Extends>>

Search RBT Comparative Genomic Dataset

<<Extends>>

<<Extends>><<Extends>>

Display Results

Report Selection

RBT_USER

<<Uses>>

Page 13:

Gene Expression Search Use cases

Search RBT Affy Gene Expression Dataset

RBT_USER

Search differential gene expression by Gene Name

Search differential gene expression by fold change

Search differential gene expression by chromosomal region

Obtain gene information from cytoband location

Obtain cytoband location from gene name

Calculate fold change

<<Uses>>

<<Uses>>

<<Uses>>

Search differential gene expression by Probeset ID

<<Extends>>

<<Extends>>

<<Extends>>

Search differential gene expression by GO Terms

<<Uses>>

<<Uses>>

<<Uses>>

<<Uses>>

Search differential gene expression by Pathway name

<<Extends>>

<<Uses>>

Get Genes

<<Uses>>

<<Uses>>

<<Uses>>

<<Extends>>

<<Extends>>

<<Uses>>

Page 14:

Gene Expression data analysis

Binary chp files from GCOS

Convert chp files to txt files using GDAC SDK

Calculate ratio of individual tumor intensity to normal pool

Calculate ratio of average intensities between tumor pool and normal pool

Calculate statistical significance for comparison, using Permax R module
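The two ratio steps can be sketched as follows. This assumes the normal pool is summarized by its arithmetic mean, which the slide does not state; the function names and intensity values are hypothetical. The significance step (Permax R module) is not reproduced here.

```python
from statistics import mean


def individual_ratios(tumor_intensities, normal_pool):
    """Ratio of each tumor sample's intensity to the mean of the normal pool."""
    normal_mean = mean(normal_pool)
    return [t / normal_mean for t in tumor_intensities]


def pooled_ratio(tumor_pool, normal_pool):
    """Ratio of the average tumor-pool intensity to the average normal-pool intensity."""
    return mean(tumor_pool) / mean(normal_pool)


# Hypothetical intensities for one probeset (arbitrary units).
tumor = [800.0, 1200.0, 400.0]
normal = [200.0, 200.0]

print(individual_ratios(tumor, normal))  # [4.0, 6.0, 2.0]
print(pooled_ratio(tumor, normal))       # 4.0
```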

Page 15:

cDNA data handling

Technical Replicates

Pearson correlation between one spot across all arrays and another spot for the same clone across all arrays

Is correlation > 0.7?

Yes: For each array, calculate the average of the expression measurements

No: An "inconsistent" call is made and no e-value is computed for that clone
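The replicate check can be sketched as below. The 0.7 threshold comes from the slide; the function names, the sample values and the choice to treat a constant spot as uncorrelated are assumptions for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant spot is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def combine_replicates(spot_a, spot_b, threshold=0.7):
    """Average two replicate spots per array, or flag the clone as inconsistent."""
    if pearson(spot_a, spot_b) > threshold:
        return [(a + b) / 2 for a, b in zip(spot_a, spot_b)]
    return "inconsistent"


# Hypothetical measurements for one clone: two spots across four arrays.
print(combine_replicates([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 2.5, 4.5]))
print(combine_replicates([1.0, 2.0, 3.0, 4.0], [4.0, 1.0, 3.5, 1.5]))
```

The first call returns per-array averages because the two spots track each other; the second returns the "inconsistent" flag because the spots are negatively correlated.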

Page 16:

UI Wire Frames

Page 17:

UI Wire Frames

Page 18:

Architecture

Nautilus Architecture Ver 1.0

Graphic User Interface: Handles HTTP
  JSP
  Servlet
  Struts
  Tomcat

Client Proxy / Server Proxy (XML)
  (Future) Abstraction between Client & Server Layer
  XML Serialization
  XML Deserialization

Data Transfer Objects (DTOs):
  DomainElement
  Criteria
  Query
  View
  ResultSet
  LookUp

Business Objects
  LookupHandler
  QueryHandler
  QueryFactory
  ViewFactory
  QueryValidator
  ReportGenerator
  QueryResultSetHandler
  SecurityHandler (Future)

QueryProcessingManager

LookUpDataManager: Handles Lookup Data Abstraction

SecurityManager

Data Access Engine (OJB): Abstracts queries to the DWh, Materializes Data to Objects
  Parses DTOs
  Map Business Objects to Schema
  Construct & Execute Queries
  Instantiate DTOs from Resultset

Study Datawarehouse
  Dimensional Database
  Star / (Reverse Star) Schema for fast retrievals
  Currently: Fact Tables, Dimensions, LookUp Tables, No Measures
  Staging Db
  ETL process

OLAP: (Future) OLAP Engine
  JOLAP API
  CWM API
  XMLA API
  Future Consideration (Not Implemented in Nautilus)

Page 19:

Object Model

DomainElement
  Represents the basic elements involved in the translational research space.
  All queries, views and presentation objects are composed of domain elements.
  Provides strong type checking and validations.
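As a rough illustration of what "strong type checking and validations" could look like, the sketch below validates each value at construction time. The class names `GeneSymbol` and `FoldChange` and their rules are hypothetical; the slides name only `DomainElement`, and the original system appears to be Java-based, so this Python version is a stand-in rather than the actual design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainElement:
    """Base class for typed values used to build queries and views."""
    value: object

    def __post_init__(self):
        # Every element validates itself as soon as it is created.
        self.validate()

    def validate(self):
        raise NotImplementedError


@dataclass(frozen=True)
class GeneSymbol(DomainElement):
    # Hypothetical element: a non-empty gene symbol string.
    def validate(self):
        if not isinstance(self.value, str) or not self.value.strip():
            raise TypeError("gene symbol must be a non-empty string")


@dataclass(frozen=True)
class FoldChange(DomainElement):
    # Hypothetical element: a positive numeric fold change.
    def validate(self):
        if isinstance(self.value, bool) or not isinstance(self.value, (int, float)):
            raise TypeError("fold change must be numeric")
        if self.value <= 0:
            raise ValueError("fold change must be positive")


criteria = [GeneSymbol("EGFR"), FoldChange(2.0)]
print(criteria)
```

With this arrangement a malformed criterion such as `FoldChange("high")` fails when it is built, before any query is constructed from it.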

Page 20:

Database Schema

Star schema
  Is a generic, query-optimized schema
  A star schema consists of fact tables and dimensions
  Provides a highly de-normalized view of the data
  Provides a data-neutral framework from which queries can be executed with very fast results
  Prototype usage will help us validate our approach

Page 21:

Database Schema

Fact Table
  Contains key performance indicators
  Helps eliminate expensive joins from queries
  In the future, if multi-dimensional measures are required, then our schema is extensible to allow us to perform OLAP queries

Dimension
  Dimensions are the categories of data analysis
  When a report is requested "by" something, that something is usually a dimension.
  For example, in a gene expression query, the two dimensions needed are genes (GENE_DM) and samples (BIOSPECIMEN_DM)
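A gene expression query against a fact table and its two dimensions might look like the sketch below. It uses an in-memory SQLite database so that it runs anywhere; the table and column names are a small subset borrowed from the schema listed in this deck, while the SQLite types, the sample rows and the `overexpressed_samples` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GENE_DIM (GENE_ID INTEGER PRIMARY KEY, GENE_SYMBOL TEXT);
CREATE TABLE BIOSPECIMEN_DIM (BIOSPECIMEN_ID INTEGER PRIMARY KEY, SAMPLE_ID TEXT);
CREATE TABLE DIFFERENTIAL_EXPRESSION_SFACT (
    DES_ID INTEGER PRIMARY KEY,
    GENE_ID INTEGER REFERENCES GENE_DIM (GENE_ID),
    BIOSPECIMEN_ID INTEGER REFERENCES BIOSPECIMEN_DIM (BIOSPECIMEN_ID),
    EXPRESSION_RATIO REAL
);
INSERT INTO GENE_DIM VALUES (1, 'EGFR'), (2, 'PTEN');
INSERT INTO BIOSPECIMEN_DIM VALUES (10, 'S1'), (11, 'S2');
INSERT INTO DIFFERENTIAL_EXPRESSION_SFACT VALUES
    (100, 1, 10, 4.2), (101, 2, 10, 0.3), (102, 1, 11, 1.1);
""")


def overexpressed_samples(conn, gene_symbol, min_ratio):
    """Report expression 'by' gene and sample: join the fact table to both dimensions."""
    query = """
        SELECT b.SAMPLE_ID, f.EXPRESSION_RATIO
        FROM DIFFERENTIAL_EXPRESSION_SFACT f
        JOIN GENE_DIM g ON g.GENE_ID = f.GENE_ID
        JOIN BIOSPECIMEN_DIM b ON b.BIOSPECIMEN_ID = f.BIOSPECIMEN_ID
        WHERE g.GENE_SYMBOL = ? AND f.EXPRESSION_RATIO >= ?
        ORDER BY b.SAMPLE_ID
    """
    return conn.execute(query, (gene_symbol, min_ratio)).fetchall()


print(overexpressed_samples(conn, "EGFR", 2.0))  # [('S1', 4.2)]
```

Each dimension is reached with a single join from the fact table, which is the property that keeps star-schema queries short.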

Page 22:

Database Schema

   

ARRAY_GENO_ABN_FACT
  AGA_ID: NUMBER (key)
  DISEASE_TYPE_ID: NUMBER
  PROBESET_ID: NUMBER
  CLONE_ID: NUMBER
  CHROMOSOME: VARCHAR2(20)
  CYTOBAND: VARCHAR2(50)
  LOSS_GAIN: VARCHAR2(20)
  COPY_NUMBER: VARCHAR2(20)
  CHANNEL_RATIO: FLOAT
  DATASET_ID: NUMBER
  INSTITUTION_ID: NUMBER
  GENE_ID: NUMBER
  GENDER_CODE: VARCHAR2(1)
  EXP_PLATFORM_ID: NUMBER
  TIMECOURSE_ID: NUMBER
  BIOSPECIMEN_ID: NUMBER
  SURVIVAL_LENGTH_RANGE: VARCHAR2(15)
  AGE_GROUP: VARCHAR2(20)
  TREATMENT_HISTORY_ID: NUMBER
  AGENT_ID: NUMBER
  DISEASE_HISTORY_ID: NUMBER

EXP_PLATFORM_DIM
  EXP_PLATFORM_ID: NUMBER (key)
  EXP_PLATFORM_NAME: VARCHAR2(50)
  EXP_PLATFORM_DESC: VARCHAR2(200)

BIOSPECIMEN_DIM
  BIOSPECIMEN_ID: NUMBER (key)
  SAMPLE_ID: VARCHAR2(50)
  SPECIMEN_NAME: VARCHAR2(100)
  SPECIMEN_DESC: VARCHAR2(255)
  PATIENT_DID: NUMBER

CLONE_DIM
  CLONE_ID: NUMBER (key)
  CLONE_NAME: VARCHAR2(200)
  CLONE_DESC: VARCHAR2(4000)
  CLONE_LOCATION: VARCHAR2(50)
  UTR: NUMBER(1)
  LIBRARY: VARCHAR2(500)
  ACCESSION_NUMBER: VARCHAR2(15)
  UNIGENE_LIBRARY: NUMBER
  UNIGENE_ID: VARCHAR2(50)
  CLONE_TYPE: VARCHAR2(20)

DIFFERENTIAL_EXPRESSION_SFACT
  DES_ID: NUMBER (key)
  PROBESET_ID: NUMBER
  BIOSPECIMEN_ID: NUMBER
  DISEASE_TYPE_ID: NUMBER
  EXPRESSION_RATIO: FLOAT
  SAMPLE_INTENSITY: FLOAT
  NORMAL_INTENSITY: FLOAT
  INSTITUTION_ID: NUMBER
  DATASET_ID: NUMBER
  CLONE_ID: NUMBER
  GENE_ID: NUMBER
  GENDER_CODE: VARCHAR2(1)
  EXP_PLATFORM_ID: NUMBER
  TIMECOURSE_ID: NUMBER
  SURVIVAL_LENGTH_RANGE: VARCHAR2(15)
  AGE_GROUP: VARCHAR2(20)
  TREATMENT_HISTORY_ID: NUMBER
  AGENT_ID: NUMBER
  DISEASE_HISTORY_ID: NUMBER

DISEASE_TYPE_DIM
  DISEASE_TYPE_ID: NUMBER (key)
  DISEASE_TYPE: VARCHAR2(100)
  SUBTYPE: VARCHAR2(100)
  DESC: VARCHAR2(200)

GENE_CLONE
  GENE_ID: NUMBER (key)
  CLONE_ID: NUMBER (key)
  GENE_SYMBOL: VARCHAR2(50)
  CLONE_TYPE: VARCHAR2(20)

GENE_DIM
  GENE_ID: NUMBER (key)
  GENE_SYMBOL: VARCHAR2(50)
  GENE_TITLE: VARCHAR2(2000)
  GENOME_VERSION: VARCHAR2(100)
  ALIGNMENTS: VARCHAR2(255)
  LL_ID: VARCHAR2(50)
  OMIN_ID: VARCHAR2(50)
  CYTOBAND: VARCHAR2(50)
  UNIGENE_ID: VARCHAR2(50)
  EC: VARCHAR2(100)
  KB_START: NUMBER
  KB_END: NUMBER
  CHROMOSOME: VARCHAR2(20)

GENE_ONTOLOGY
  GO_ID: NUMBER (key)
  GENE_ID: NUMBER (key)
  GO_NAME: VARCHAR2(200)
  GO_DESC: VARCHAR2(4000)

GENE_PROBESET
  GENE_ID: NUMBER (key)
  PROBESET_ID: NUMBER (key)

INSTITUTION_DIM
  INSTITUTION_ID: NUMBER (key)
  INSTITUTION_NAME: VARCHAR2(100)
  INSTITUTION_DESC: VARCHAR2(200)

PATHWAY
  PATHWAY_ID: NUMBER (key)
  PATHWAY_NAME: VARCHAR2(200)
  PATHWAY_DESC: VARCHAR2(4000)
  DATA_SOURCE: VARCHAR2(30)

PATIENT
  PATIENT_DID: NUMBER (key)
  POPULATION_TYPE_ID: NUMBER

POPULATION_TYPE
  POPULATION_TYPE_ID: NUMBER (key)
  POPULATION_TYPE_NAME: VARCHAR2(50)
  POPULATION_TYPE_DESC: VARCHAR2(200)

PROBESET_DIM
  PROBESET_ID: NUMBER (key)
  ARRAY_NAME: INTEGER
  PROBESET_NAME: VARCHAR2(200)

PROTEIN_FAMILY
  GENE_ID: NUMBER (key)
  PROTEIN_FAMILY: VARCHAR2(1000)

REFSEQ_MRNA_ID
  GENE_ID: NUMBER (key)
  REFSEQ_MRNA_ID: VARCHAR2(50)

REFSEQ_PROTEIN_ID
  GENE_ID: NUMBER (key)
  REFSEQ_PROTEIN_ID: NUMBER

STUDY_DATASET_DIM
  DATASET_ID: NUMBER (key)
  DATASET_NAME: VARCHAR2(100)
  DATASET_DESC: VARCHAR2(255)

STUDY_TIMECOURSE_DIM
  TIMECOURSE_ID: NUMBER (key)
  TIMECOURSE_NAME: VARCHAR2(50)
  TIMECOURSE_DESC: VARCHAR2(200)

SWISSPROT
  GENE_ID: NUMBER (key)
  SWISSPROT: VARCHAR2(50)

AGE_GROUP_DX
  AGE_GROUP: VARCHAR2(20) (key)
  AGE_GROUP_DESC: VARCHAR2(50)

GENDER
  GENDER_CODE: VARCHAR2(1) (key)
  GENDER_DESC: VARCHAR2(30)

SURVIVAL_LENGTH_RANGE
  SURVIVAL_LENGTH_RANGE: VARCHAR2(15) (key)
  UPPERBOUND: NUMBER
  LOWERBOUND: NUMBER
  GROUP_DESC: VARCHAR2(100)

GENE_PATHWAY
  GENE_ID: NUMBER (key)
  PATHWAY_ID: NUMBER (key)

AGENT
  AGENT_ID: NUMBER (key)
  AGENT_NAME: VARCHAR2(120)
  AGENT_TYPE: VARCHAR2(100)
  NSC_NUMBER: NUMBER
  EVS_ID: VARCHAR2(50)

DISEASE_HISTORY
  DISEASE_HISTORY_ID: NUMBER (key)
  OCCURRENCE_STATUS: VARCHAR2(10)
  OCCURRENCE_STATUS_DESC: VARCHAR2(50)

TREATMENT_HISTORY
  TREATMENT_HISTORY_ID: NUMBER (key)
  TREATMENT_TYPE: VARCHAR2(30)
  TREATMENT_TYPE_DESC: VARCHAR2(200)

DIFFERENTIAL_EXPRESSION_GFACT
  DEG_ID: NUMBER (key)
  EXPRESSION_RATIO: FLOAT
  RATIO_PVAL: CHAR(18)
  SAMPLE_G_INTENSITY: FLOAT
  NORMAL_INTENSITY: FLOAT
  DATASET_ID: NUMBER
  TIMECOURSE_ID: NUMBER
  INSTITUTION_ID: NUMBER
  EXP_PLATFORM_ID: NUMBER
  CLONE_ID: NUMBER
  GENE_ID: NUMBER
  PROBESET_ID: NUMBER
  DISEASE_TYPE_ID: NUMBER

Page 23:

Problem we are trying to solve

A typical Rembrandt data portal search:
  Show me all tumor samples that have amplification of 13q11.3, deletion of 10p21, D7S522 and the FHIT region confirmed by SNP chips and CGH analysis.
  Display regions with LOH for these samples.
  Which genes are under-expressed in these tumor samples with respect to normal?
  Does this subset of tumors have better survival?
  Do they segregate to a certain age group, geographical area or ethnicity?

Page 24:

Fact: Cancer develops as a result of
  Chromosomal aberrations
  Duplications
  Deletions
  Somatic Mutations

To solve this problem, we need to measure chromosomal aberrations

Chrom N, Copy 1

Chrom N, Copy 2

LOH

Complete Loss

Duplication

Page 25:

How to measure aberrations?

CGH

SNP Arrays
  Have higher resolution than CGH
  Analyze chromosomal copy number and genotype in one experiment
  SNP arrays help determine the following between normal blood sample and tumor sample:
    Heterozygous to Homozygous: Loss of one allele
    Heterozygous to No Call: Partial Loss of one allele/No Call
    Homozygous to Homozygous: Unchanged/Loss of one allele
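The three transitions listed above can be written as a small lookup. The genotype encoding ("AA", "AB", "NoCall") and the helper names are assumptions made for this sketch; only the interpretation strings come from the slide.

```python
# Interpretation of paired genotype calls (normal blood -> tumor),
# taken from the three cases listed on the slide.
INTERPRETATION = {
    ("heterozygous", "homozygous"): "Loss of one allele",
    ("heterozygous", "no call"): "Partial loss of one allele / No call",
    ("homozygous", "homozygous"): "Unchanged / Loss of one allele",
}


def call_type(genotype):
    """Classify a genotype string such as 'AB', 'AA' or 'NoCall' (assumed encoding)."""
    if genotype.lower() in ("nocall", "no call", ""):
        return "no call"
    return "homozygous" if len(set(genotype)) == 1 else "heterozygous"


def interpret(normal, tumor):
    """Look up the interpretation for a normal/tumor genotype pair."""
    key = (call_type(normal), call_type(tumor))
    return INTERPRETATION.get(key, "not covered by the listed cases")


print(interpret("AB", "AA"))      # Loss of one allele
print(interpret("AB", "NoCall"))  # Partial loss of one allele / No call
print(interpret("AA", "AA"))      # Unchanged / Loss of one allele
```

A homozygous normal call is uninformative for LOH, which is why the third case cannot distinguish "unchanged" from "loss of one allele".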

Page 26:

Genotype model for Rembrandt

Model basic science
  Model SNPs in relation to chromosomal aberrations and as markers on the genome
  Model to include annotations and external cross-references

Model experimental observations
  Capture observations such as LOH in relation to SNPs and chromosomal aberrations (CGH data)
  Capture expression value for SNP elements on arrays to correlate with DNA copy number

Page 27:

Translational Research use case

The Clinical Genomics model should serve the translational research use case

Model should allow for associations between:
  Basic science / molecular observations (gene expression, SNP, pathway, etc.)
  Clinical science data (prior therapy, outcome, demographics, etc.)

Page 28:

Translational Research Space

0..1 pertinentMutation
typeCode*: <= PERT
pertinentInformation

0..1 pertinentGeneExpression
typeCode*: <= PERT
pertinentInformation3

0..* pertinentPolymorphism
typeCode*: <= PERT
pertinentInformation6

IndividualAllele
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code*: CE CWE [1..1] (allele identifier & classification, e.g. GenBank)
text: ED [0..1]

SNP
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE [0..1] (SNP identifier & classification, e.g. Entrez dbSNP)
text: ED [0..1]
value: BAG<ED> [0..*] (the SNP itself)
methodCode: SET<CE> CWE [0..*]

Haplotype
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE [0..1]

Genotype
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE (e.g., HETEROZYGOTE)
text: ED

0..* haplotype
typeCode*: <= COMP
componentOf

1..3 individualAllele
typeCode*: <= COMP
component

0..* pertinentSNP
typeCode*: <= PERT
pertinentInformation1

AlleleSequence
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: [1..1] (the sequence standard code, e.g. BSML, GMS)
text: (the annotated sequence)
effectiveTime: [1..1]
value: ED [1..1] (the actual sequence)
methodCode: (the sequencing method)

0..1 pertinentAlleleSequence
typeCode*: <= PERT
pertinentInformation2

GeneExpression
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE <= ActCode (the standard's code, e.g., MAGE-ML identifier)
text:
effectiveTime:
value: ED [1..1] (the actual gene expression levels)
methodCode:

Polypeptide
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code*: CE CWE [1..1] (identifier & classification of the protein, e.g., SwissProt, PDB, PIR, HUPO)
text:

0..* outcomePolypeptide
typeCode*: <= OUTC
outcome

DeterminantPeptide
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE (identifier and classification of the determinant, e.g., Entrez)
text: ED

0..* pertinentDeterminantPeptide
typeCode*: <= PERT
pertinentInformation2

Mutation
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE (mutation identifier and classification, e.g. LOINC MOLECULAR GENETICS NAMING)
text:

0..* pertinentMutation
typeCode*: <= PERT
pertinentInformation4

ClinicalPhenotype
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE [0..1] (disease, allergy, sensitivity, ADE, etc.)
text: ED [0..1]
uncertaintyCode: CE CNE [0..1]
value: ANY [0..1]

HL7 Clinical Genomics SIG
Document: Individual Genotype DIM (to be registered as a CMET)
Subject: Genomics Data
Rev: 0.4
Date: March 14, 2004
Authors: Amnon Shabo (IBM Research in Haifa), Shosh Israel (Hadassah University Hospital)

CGSIG (CGEN_RM000002)

Clinical-Genomics
Entry point to the Genotype Model

Note: There must be at least one IndividualAllele and three at the most. The typical case would be an allele pair, one on the paternal chromosome and one on the maternal chromosome. The third allele could be present if the patient has three copies of a chromosome, as in Down's Syndrome.

Mutation

0..* haplotype
typeCode*: <= COMP
componentOf

Constraint: GeneExpression.value
Constrained to a restricted MAGE-ML content model, specified elsewhere.

Constraint: AlleleSequence.value
Constrained to a restricted BSML or GMS content model, specified elsewhere.

0..* pertinentMethod
typeCode*: <= PERT
pertinentInformation1

Method
classCode*: <= PROC
moodCode*: <= EVN
id: II [0..1]
code: CD CWE [0..1] <= ActCode (type of method)
text: ED [0..1] (free text description of the method used)
methodCode: SET<CE> CWE [0..*]

0..* pertinentIndividualAllele
typeCode*: <= PERT
pertinentInformation5

Note: A related allele that is on a different haplotype, and still has significant interrelation with the source allele.

IndividualAllele

0..* priorClinicalPhenotype
typeCode*: <= SEQL
sequelTo

ExternalClinicalPhenotype
classCode*: <= OBS
moodCode*: <= EVN
id*: II [1..1] (the id of an external observation, e.g., in a problem list)

Note: An external observation is a valid Observation instance existing in any other HL7-compliant artifact, e.g., a document or a message.

Note: An observation of a clinical condition represented internally in this model.

Note: Shadowed observations are copies of other observations and thus have all of the original act attributes.

Note: Use methodCode if you don't use the associated method procedure.

Note: Could refine ActRelationship typeCode to elaborate on different types of genomic to phenotype effects.

Method

0..* pertinentMethod
typeCode*: <= PERT
pertinentInformation

Note: Usually this is a computed outcome, i.e., the lab does not produce the actual protein.

0..* referredToExternalClinicalPhenotype
typeCode*: <= x_ActRelationshipExternalReference
reference

ClinicalPhenotype

ClinicalPhenotype

ClinicalPhenotype

0..* priorClinicalPhenotype
typeCode*: <= SEQL
sequelTo

0..* priorClinicalPhenotype
typeCode*: <= SEQL
sequelTo

0..* priorClinicalPhenotype
typeCode*: <= SEQL
sequelTo

Haplotype

Note: The classCode should be OBSGENPOLMUT, which stands for mutation-polymorphism genomic observation, a subtype of OBSGENPOL (polymorphism genomic observation), which is a subtype of OBSGEN (genomic observation).

Note: The classCode should be OBSGENPOLSNP, which stands for SNP-polymorphism genomic observation, a subtype of OBSGENPOL (polymorphism genomic observation), which is a subtype of OBSGEN (genomic observation).

Polymorphism
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CD CWE [0..1] <= ActCode
text: ED [0..1]
value: ANY [0..1]

Note: The classCode should be OBSGENPOL, which stands for polymorphism genomic observation, a subtype of OBSGEN (genomic observation).

Genotype Model

Clinical Trial Model

timePointEvent
classCode*: <= CTTEVENT
moodCode*: <= EVN
code: CD CWE <= ActCode
text: ED
effectiveTime: GTS
reasonCode: SET<CE> CWE <= ActReason

procedureIntervention
classCode*: <= PROC
moodCode*: <= EVN
code: CD CWE [0..1] <= ActCode
negationInd: BL [0..1]
text: ED [0..1]
effectiveTime: GTS [0..1]
reasonCode: SET<CE> CWE [0..*] <= ActReason

medicationIntervention
classCode*: <= SBADM
moodCode*: <= EVN
code: CD CWE <= ActCode
negationInd: BL
text: ED
effectiveTime: GTS
reasonCode: SET<CE> CWE <= ActReason
routeCode: CE CWE <= RouteOfAdministration
doseQuantity: IVL<PQ>
rateQuantity: IVL<PQ>
doseCheckQuantity: SET<RTO<QTY,QTY>>

ManufacturedMaterial
classCode*: <= MMAT
determinerCode*: <= INSTANCE
code: CE CWE <= DrugEntity
formCode: CE CWE <= MaterialForm
lotNumberText: ST

0..1 manufacturedManufacturedMaterial

0..1

medication

0..* scopedRoleName

0..* playedmedication

classCode*: <= MANU
code: CE CWE <= RoleCode

0..* medicationIntervention

0..* medication

typeCode*: <= TPA
functionCode: CD CWE <= ParticipationFunction

therapeuticAgent / therapeuticAgentOf

adverseEvent
classCode*: <= OBS
moodCode*: <= EVN
code: CD CWE <= ActCode
text: ED
effectiveTime: GTS
reasonCode: SET<CE> CWE <= ActReason
targetSiteCode: SET<CD> CWE <= ActSite

Finding
classCode*: <= OBS
moodCode*: <= EVN
code: CD CWE [0..1] <= ActCode
negationInd: BL [0..1]
text: ED [0..1]
statusCode: SET<CS> CNE [0..*] <= ActStatus
effectiveTime: IVL<TS> [0..1]
confidentialityCode: SET<CE> CWE [0..*] <= Confidentiality
reasonCode: SET<CE> CWE [0..*] <= ActReason
value: ANY [0..1]
interpretationCode: SET<CE> CWE [0..*] <= ObservationInterpretation

eventCausality
classCode*: <= OBS
moodCode*: <= EVN
code: CD CWE <= ActCode
text: ED
effectiveTime: GTS
value: CV [0..1]
methodCode: SET<CE> CWE <= ObservationMethod

0..1 manufacturedDevice

0..1 manufacturerOrganization

ManufacturedDevice

0..* scopedManufacturedDevice

0..* playedManufacturedDevice

classCode*: <= MANU

Person

subjectExperience

intervention

0..* eventCausality

0..* intervention

typeCode*: <= SUBJ
subject / subjectOf

0..* subjectAdverseEvent

0..* pertinenteventCausality

typeCode*: <= PERT
pertinentInformation2 / pertainsTo

sourceOf / targetOf

0..* source

0..* target

typeCode*: <= ActRelationshipType
contextControlCode: CS CNE <= ContextControl
contextConductionInd: BL "true"
pauseQuantity: PQ

0..* assignedEntity

typeCode*: <= x_ParticipationAuthorPerformer
authorOrPerformer

CMET: (ASSIGNED) R_AssignedEntity [universal] (COCT_MT090000)

0..1 roleName

0..* subjectExperience

0..* clinicalResearchEventPerformer

typeCode*: <= x_ParticipationAuthorPerformer
authorOrPerformer / production1

InterpretationRange
classCode*: <= OBS
moodCode*: <= EVN.CRT
value: IVL<PQ> [1..1]
interpretationCode: SET<CE> CWE [0..*] <= ObservationInterpretation

0..* subjectFinding

0..* referenceInterpretationRange

typeCode*: <= REFV
referenceRange / referenceRangeFor

0..1 specimenNaturalMaterial

0..1

Specimen

0..* scopedRoleName

0..* specimenSource

classCode*: <= SPEC
id: II [0..1]

naturalMaterial
classCode*: <= MAT
determinerCode*: <= INSTANCE
code: CE CWE <= EntityCode

0..* researchFinding

0..* specimen

typeCode*: <= SBJ
subject1 / subjectOf2

protocolProcedureIntervention
classCode*: <= PROC
moodCode*: <= DEF
code: CD CWE <= ActCode
text: ED
title: ST
reasonCode: SET<CE> CWE <= ActReason

protocolMedicationIntervention
classCode*: <= SBADM
moodCode*: <= DEF
code: CD CWE [0..1] <= ActCode
text: ED [0..1]
title: ST [0..1]
reasonCode: SET<CE> CWE [0..*] <= ActReason
routeCode: CE CWE [0..1] <= RouteOfAdministration
approachSiteCode: SET<CD> CWE [0..*] <= ActSite
doseQuantity: IVL<PQ> [0..1]
rateQuantity: IVL<PQ> [0..1]
doseCheckQuantity: SET<RTO<QTY,QTY>> [0..*]

protocolTimepointEvent
classCode*: <= CTTEVENT
moodCode*: <= DEF
code: CD CWE <= ActCode
text: ED
title: ST

protocolSubjectExperience

protocolTimepointEvent

protocolProcedureIntervention

protocolMedicationIntervention

0..* instantiatingTimePointEvent

0..* protocolTimepointEvent

typeCode*: <= INST
definition / instantiation

0..* instantiatingProcedureIntervention

0..* protocolProcedureIntervention

typeCode*: <= INST
definition / instantiation1

0..* instantiatingMedicationIntervention

0..* protocolMedicationIntervention

typeCode*: <= INST
definition / instantiation1

sourceOf / targetOf

0..* source

0..* target

typeCode*: <= ActRelationshipType
inversionInd: BL
sequenceNumber: INT
priorityNumber: INT
pauseQuantity: PQ
checkpointCode: CS CNE <= ActRelationshipCheckpoint
splitCode: CS CNE <= ActRelationshipSplit
joinCode: CS CNE <= ActRelationshipJoin
negationInd: BL
conjunctionCode: CS CNE <= RelationshipConjunction

protocolResearchFinding
classCode*: <= OBS
moodCode*: <= DEF
code: CD CWE <= ActCode
text: ED
title: ST
reasonCode: SET<CE> CWE <= ActReason
value: ANY [0..1]

0..* instantiatingFinding

0..* protocolResearchFinding

typeCode*: <= INST
definition / instantiation

protocolResearchFinding

SubjectAssignment
classCode*: <= CLNTRL
moodCode*: <= EVN

ClinicalTrial
classCode*: <= CLNTRL
moodCode*: <= EVN
id: II [0..1]
code: CD CWE <= ActCode
title: ST
activityTime: GTS

interventionGroupAssignment
classCode*: <= ACT
moodCode*: <= EVN
code: CD CWE <= ActCode
effectiveTime: IVL<TS>

protocolIntervention

0..* instantiatingInterventionGroupAssignment

0..* protocolIntervention

typeCode*: <= INST
definition / instantiation2

0..* subjectAssignment

0..* interventionGroupAssignment

typeCode*: <= COMP
component1 / componentOf1

ClinicalTrialProtocol
classCode*: <= CLNTRL
moodCode*: <= DEF
id: SET<II>
code: CD CWE <= ActCode
title: ST

0..* instantiatingClinicalTrial

0..* clinicalTrialProtocol

typeCode*: <= INST
definition / instantiation

0..* clinicalTrialProtocol

0..* protocolSubjectExperience

typeCode*: <= COMP
component / componentOf

0..* clinicalTrial

0..* subjectAssignment

typeCode*: <= COMP
component1 / componentOf

0..* subjectAssignment

0..* subjectExperience

typeCode*: <= COMP
component2 / componentOf

0..* subjectAssignment

0..* clinicalResearchSubject

typeCode*: <= RCT
recordTarget / recordTargetOf

0..1 subjectLivingSubject

0..1 researchSponsor

ClinicalResearchSubject

0..* scopedClinicalResearchSubject

0..* playedClinicalResearchSubject

classCode*: <= RESBJ
id: II [0..1]
code: CE CWE <= ResearchSubjectRoleBasis

Person
classCode*: <= PSN
determinerCode*: <= INSTANCE
id: SET<II>
name: BAG<EN>
administrativeGenderCode: CE CWE <= AdministrativeGender
birthTime: TS
raceCode: SET<CE> CWE <= Race
ethnicGroupCode: SET<CE> CWE <= Ethnicity

component2 / componentOf

0..* clinicalTrial1

0..* clinicalTrial2

typeCode*: <= COMP

0..* clinicalTrial

0..* clinicalInvestigator

typeCode*: <= RESP
responsibleParty / responsibleFor

0..1 investigatorPerson

0..1

ClinicalResearchInvestigator

0..* scopedRoleName

0..* clinicalInvestigations

classCode*: <= CRINV
id: II [0..1]
code: CE CWE <= RoleCode

0..* clinicalTrial

0..* researchSponsor

typeCode*: <= AUT
author / origination

0..1 sponsorOrganization

0..1

ClinicalResearchSponsor

0..* scopedRoleName

0..* sponsoredResearch

classCode*: <= CRSPNSR
id: II [0..1]
code: CE CWE <= RoleCode

Organization
classCode*: <= ORG
determinerCode*: <= INSTANCE
id: SET<II>
name: BAG<EN>

experienceParameters
classCode*: <= OBS
moodCode*: <= EVN
code: CD CWE [0..1] <= ActCode
value: ANY [0..1]

0..* subjectSubjectExperience

0..* pertinentexperienceParameters

typeCode*: <= PERT
pertinentInformation / pertainsTo

0..* clinicalTrial

0..* clinicalResearchSite

typeCode*: <= LOC
location / locationOf

0..1 location

0..1

ClinicalResearchSite

0..* scopedRoleName

0..* playedClinicalResearchSite

classCode*: <= SDLOC
id: II [0..1]
code: CE CWE <= RoleCode

Place
classCode*: <= PLC
determinerCode*: <= INSTANCE
id: SET<II>
name: BAG<EN>
addr: AD

CRF
classCode*: <= DOC
moodCode*: <= EVN
id: SET<II>
code: CD CWE <= ActCode
title: ST

CRFSection
classCode*: <= DOCSECT
moodCode*: <= EVN
id: SET<II>
code: CD CWE <= ActCode
title: ST

component / componentOf1

0..* cRFSection1

0..* cRFSection2

typeCode*: <= COMP
sequenceNumber: INT

0..* cRF

0..* cRFSection

typeCode*: <= COMP
sequenceNumber: INT

component / componentOf2

documentationLocatiion

0..* referringDocumentationLocatiion

0..* referredToSubjectExperience

typeCode*: <= REFR
reference1 / referencedBy

0..* referringDocumentationLocatiion

0..* referredToProtocolSubjectExperience

typeCode*: <= REFR
reference2 / referencedBy

subjectRelatedObservation
classCode*: <= OBS
moodCode*: <= EVN
code: CD CWE <= ActCode
value: ANY [0..1]

0..* subjectSubjectAssignment

0..* pertinentsubjectRelatedObservation

typeCode*: <= PERT
pertinentInformation / pertainsTo

component / componentOf2

0..* interventionGroupAssignment1

0..* interventionGroupAssignment2

typeCode*: <= COMP

0..1 subjectTimePointEvent

0..1 pertinentStudyDay1

typeCode*: <= PERT
pertinentInformation1 / pertainsTo2

0..1 subjectIntervention

0..1 pertinentStudyDay1

typeCode*: <= PERT
pertinentInformation1 / pertainsTo1

StudyDay1
classCode*: <= OBS
moodCode*: <= EVN
code: CD CWE <= ActCode
value: BL [0..1] "true"

0..1 associatedAuthorType

0..1 associatedOrganization

ClinicalResearchEventPerformer

0..* scopedClinicalResearchEventPerformer

0..* playedClinicalResearchEventPerformer

classCode*: <= RoleClassAssociative
id: II [0..1]
code: CE CWE <= RoleCode

Organization

0..* subjectExperience

0..* subjectExperienceSite

typeCode*: <= LOC
location / locationOf

0..1 location

0..1

SubjectExperienceSite

0..* scopedRoleName

0..* playedSubjectExperienceSite

classCode*: <= SDLOCid: II [0..1]code: CE CWE <= RoleCode

Place

DeviceclassCode*: <= DEVdeterminerCode*: <= INSTANCEid: SET<II>code: CE CWE <= EntityCodemanufacturerModelName: SC CWE <= ManufacturerModelNamesoftwareName: SC CWE <= SoftwareName

0..1 identifiedOrganization

0..1

ClinicalResearchIdentifiedEntity

0..* scopedRoleName

0..* playedClinicalResearchIdentifiedEntity

classCode*: <= IDENTid: II [0..1]code: CE CWE <= RoleCode

0..* subjectExperience

0..* accession

typeCode*: <= COMP

component /componentOf

AccessionclassCode*: <= ACSNmoodCode*: <= EVNid: II [0..1]activityTime: GTS0..* accession

0..* specimenDefinition

typeCode*: <= COMPcomponent / componentOf

SpecimenDefinitionclassCode*: <= ACTmoodCode*: <= EVN

0..* accession

0..* clinicalResearchEventPerformer

typeCode*: <= x_ParticipationAuthorPerformerauthorOrPerformer / production2 ClinicalResearchEventPerformer

0..* specimenDefinition

0..* specimen

typeCode*: <= SBJ

subject /subjectOf3

nonInterventionProcedureclassCode*: <= PROCmoodCode*: <= EVNid: SET<II> [0..*]code: CD CWE [0..1] <= ActCodetitle: ST [0..1]effectiveTime: GTS [0..1] 0..* instantiatingNonInterventionProcedure

0..* protocolNonInterventionProcedure

typeCode*: <= INSTdefinition / instantiation

protocolNonInterventionProcedureclassCode*: <= PROCmoodCode*: <= DEFid: SET<II>code: CD CWE <= ActCodetitle: ST

protocolNonInterventionProcedure0..* nonInterventionProcedure

0..* specimen

typeCode*: <= PRDproduct / productOf

ContainerRegistrationEventclassCode*: <= CONTREGmoodCode*: <= EVNcode: CD CWE <= ActContainerRegistrationCodeeffectiveTime: GTS

0..* containerRegistrationEvent

0..* specimen

typeCode*: <= SBJsubject / subjectOf1

LivingSubjectclassCode*: <= LIVdeterminerCode*: <= INSTANCEid: SET<II>code: CE CWE <= EntityCodequantity: SET<PQ>name: BAG<EN>desc: EDadministrativeGenderCode: CE CWE <= AdministrativeGenderbirthTime: TS

0..* subjectExperience

0..* nonSubjectResearchParticipant

typeCode*: <= SBJsubject / subjectOf0..1 playingGeneralGeneralEntity

0..1 scopingGeneralGeneralEntity

nonSubjectResearchTarget0..* scopednonSubjectResearchTarget

0..* playednonSubjectResearchTarget

classCode*: <= ROLid: II [0..1]code: CE CWE [0..1] <= RoleCodeaddr: BAG<AD> [0..*]

0..* specimenDefinition

0..* researchFinding

typeCode*: <= COMPcomponent / componentOf1

Finding

AuthorType

ClinicalTrialDMIM(PORT_RM010001)Description

FindingParametersclassCode*: <= OBSmoodCode*: <= EVNcode: CD CWE [0..1] <= ActCodevalue: ANY [0..1]

EventParametersclassCode*: <= OBSmoodCode*: <= EVNcode: CD CWE [0..1] <= ActCodevalue: ANY [0..1]

0..* subjectAdverseEvent

0..* pertinentEventParameters

typeCode*: <= PERTpertinentInformation1 / pertainsTo

0..* subjectFinding

0..* pertinentFindingParameters

typeCode*: <= PERTpertinentInformation / pertainsTo

GeneralEntity

MaterialclassCode*: <= MATdeterminerCode*: <= INSTANCEid: SET<II> [0..*]code: CE CWE [0..1] <= EntityCodequantity: SET<PQ> [0..*]name: BAG<EN> [0..*]desc: ED [0..1]existenceTime: IVL<TS> [0..1]

Note: changed name from ...Participant to ...Target 9/17/03. Participants other than targets have their own roles. This role is specifically for a target participating on behalf of the record target (trial subject).

Note: 9/18/03 - changed name from Range; added interpretationCode

caBIO

EVS

caDSR

caMOD

MAGE-OM/caArray

Next Steps

Reviewing the HL7 Re-usable genotype R-MIM as a starting point to build a clinical genomics object model

Translating the genotype R-MIM into UML to establish relationships and cardinalities between various scientific “observations”

For REMBRANDT, extending the caBIO Object Model

Developing a data warehouse infrastructure for REMBRANDT to define relevant translational spaces and relationships between them

Future: We plan to merge our clinical objects with the HL7 Clinical model
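
As a rough illustration of what translating R-MIM-style classes into an object model with explicit cardinalities could look like, the sketch below models two classes named in the HL7 Clinical Trial D-MIM (Person and ClinicalResearchSubject) as Python dataclasses. The attribute names follow the HL7 model; the Python types, the mapping of SET/BAG to lists and of 0..1 to Optional, the choice of Person as the playing entity, and the check_cardinality helper are all assumptions made for illustration, not part of REMBRANDT, caBIO, or any HL7 tooling.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    # Structural codes fixed by the model: classCode <= PSN, determinerCode <= INSTANCE
    class_code: str = "PSN"
    determiner_code: str = "INSTANCE"
    ids: List[str] = field(default_factory=list)    # id: SET<II> (assumed: list of strings)
    names: List[str] = field(default_factory=list)  # name: BAG<EN> (assumed: list of strings)
    administrative_gender_code: Optional[str] = None
    birth_time: Optional[str] = None                # TS, kept as a plain string here


@dataclass
class ClinicalResearchSubject:
    class_code: str = "RESBJ"                       # classCode <= RESBJ
    id: Optional[str] = None                        # id: II [0..1]
    code: Optional[str] = None                      # code: CE CWE <= ResearchSubjectRoleBasis
    # 0..1 playing entity; typing it as Person is an assumption of this sketch
    subject_living_subject: Optional[Person] = None


def check_cardinality(values, lower, upper=None):
    """Hypothetical helper: True if len(values) lies within lower..upper (upper=None means *)."""
    n = len(values)
    return n >= lower and (upper is None or n <= upper)


person = Person(ids=["1.2.3"], names=["Doe"])
subject = ClinicalResearchSubject(id="S-1", subject_living_subject=person)
```

Mapping 0..1 to Optional and 0..* to a list keeps the cardinalities visible in the type signatures, which is one simple way to carry the R-MIM constraints into a UML-like object model.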

The Rembrandt Team!

Ram Bhattaru, James Luo, Alex Jiang, Prashant Shah, Ryan Landy, Kevin Rosso, Jyotsna Chilukuri, Dana Zhang, Nick Xiao, Smita Hastak, Himanso Sahni, Subha Madhavan

Internal Advisors: Ken Buetow, Peter Covitz, Sue Dubman, Mervi Heiskanen, Carl Schaefer, Christo Andonyadis, Scott Gustafson, Sharon Settnek

External Advisors: Jean-Claude Zenklusen, Yuri Kotliarov, Howard Fine, Tracy Lugo, Bob Finkelstein

I am done. Questions?