Post on 21-Dec-2020

Introduction to microarray analysis and tools

Module B: Survey of Microarray Analysis Tools

Commercial Tools

Agnes Viale, Ph.D.

Genomics Core lab

MSKCC

Microarray assay life cycle

Biological Question

Sample Preparation

Microarray Hybridization

Microarray Detection

Data Analysis & Modeling

M. Schena and R. Davis, Microarray Biochip Technology

Plan

I- GeneChip Operating System (GCOS)

II- GeneSpring

III- Submission to public repository

IV- NetAffx

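Affymetrix GeneChip: Definitions

[Figure: transcript drawn 5’ to 3’ with a 600 bp region marked; probe pairs labeled PM (perfect match) and MM (mismatch)]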

Software name

– Microarray Suite 4.0 (MAS 4.0) = Empirical algorithm

– Microarray Suite 5.0 (MAS 5.0) = Statistical algorithm

– GeneChip Operating System (GCOS)

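Data comparability

• Affy arrays: 1995, 1998, 2001, 2003 (2.0 platform)

Changes between successive array generations, in the order shown on the slide:
– Redesign the oligos; change probe set names
– Keep same name; change the manufacturing process
– Redesign the oligos; change probe set names

• Software: GeneChip Software, MAS 4.0, MAS 5.0, GCOS

– GeneChip Software to MAS 4.0: new algorithm
– MAS 4.0 to MAS 5.0: new algorithm
– MAS 5.0 to GCOS: same algorithm; different data management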

Signal intensity

1- GeneChip software: use all pairs

$$\mathrm{AvgDiff} = \frac{1}{|A|} \sum_{j \in A} (PM_j - MM_j)$$

A: probe pairs selected by the software

2- MAS 4.0: excluded outlier pairs (PM-MM values that were more than 3 SD from the mean PM-MM value)
- not a robust average
- negative Average Difference if MM > PM

3- MAS 5.0: weighted mean of avg. (PM-MM)
- Probe intensities preprocessed for global background.
- PM-IM intensities are log transformed.
- Robust mean of probe set values taken using Tukey Biweight.

$$\mathrm{signal} = \mathrm{TukeyBiweight}\{\log(PM_j - MM_j^{*})\}$$

where MM* is the adjusted mismatch value (written IM above).
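The two summaries above can be sketched in Python. This is only an illustration under stated assumptions, not the Affymetrix implementation: the function names are hypothetical, the adjusted-mismatch step is replaced by a simple positive floor on PM-MM, and the biweight constants (c = 5, epsilon = 0.0001) are assumed defaults.

```python
from math import log2
from statistics import mean, median, pstdev


def avg_diff(pm, mm, n_sd=3.0):
    """MAS 4.0-style Average Difference: mean of PM-MM after dropping
    pairs more than n_sd standard deviations from the mean difference."""
    diffs = [p - m for p, m in zip(pm, mm)]
    mu, sd = mean(diffs), pstdev(diffs)
    kept = [d for d in diffs if sd == 0 or abs(d - mu) <= n_sd * sd]
    return mean(kept)


def tukey_biweight(values, c=5.0, eps=1e-4):
    """One-step Tukey biweight: a weighted mean in which values far from
    the median get low weight (weight 0 beyond c * MAD + eps)."""
    m = median(values)
    mad = median([abs(v - m) for v in values])
    weights = []
    for v in values:
        u = (v - m) / (c * mad + eps)
        weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def mas5_like_signal(pm, mm, floor=1.0):
    """Biweight of log2(PM - MM*), back-transformed to the linear scale.
    ASSUMPTION: MM* is approximated by flooring PM-MM at `floor`;
    the actual mismatch adjustment rule is not reproduced here."""
    logs = [log2(max(p - m, floor)) for p, m in zip(pm, mm)]
    return 2 ** tukey_biweight(logs)
```

For example, `avg_diff([10, 20], [5, 30])` returns -2.5, which illustrates the negative Average Difference that can occur when MM > PM, whereas `mas5_like_signal` always returns a positive value.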

MAS 4.0/MAS 5.0

[Screenshots: MAS 5.0 and MAS 4.0 output; labels: Detection p-value, Change p-value]

MAS 5.0/GCOS

Feature                                                      GCOS Server   GCOS Client   MAS
Centralized Data Sharing                                     +             -             -
User Access, Roles, Security                                 +             -             -
Barcode Support, Automation                                  ++            +             -
MIAME Standard Template                                      +             +             -
Publishing to AADM Database                                  +             +             -
Manage & Associate Projects, Experiments, Samples & Data     +             +             -
Gene Expression Data Analysis (Statistical Algorithm, CHP)   +             +             +
Instrument Control / Data Acquisition                        -             +             +

Raw files and comparison files

[Diagram: file types .EXP, .DAT, .CEL, raw .CHP (file A and file B) and comparison .CHP]

GCOS

Software status window

Data window

Files window

.Exp file

.DAT file

.DAT = scan (image of the array)

Using the .DAT file

1- To identify defective arrays
2- To identify image problems

.CEL file

.CEL = numerical (per-cell intensity) version of the .DAT image. The .CEL file is used to generate the .CHP file.

Raw .CHP file

Comparison .CHP file

.RPT file

• Data set QC

GeneChip built-in control 1: % present genes

GeneChip built-in control 2: 3’/5’ ratio for “housekeeping” genes

Right click a .CHP file

=> Report (.RPT) file
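The two built-in controls listed above can be computed from exported calls and signals. The sketch below is illustrative only: the function names are hypothetical, and the thresholds (`min_present`, `max_ratio`) are assumed placeholder values that each lab should set for its own arrays and sample types.

```python
def percent_present(calls):
    """Percentage of probe sets with a Present ('P') detection call."""
    return 100.0 * sum(1 for call in calls if call == "P") / len(calls)


def three_to_five_ratio(signal_3prime, signal_5prime):
    """Ratio of the 3' probe set signal to the 5' probe set signal
    for one housekeeping gene."""
    return signal_3prime / signal_5prime


def passes_qc(calls, signal_3prime, signal_5prime,
              min_present=30.0, max_ratio=3.0):
    """True if both controls fall inside the (assumed) thresholds."""
    return (percent_present(calls) >= min_present
            and three_to_five_ratio(signal_3prime, signal_5prime) <= max_ratio)
```

A high 3’/5’ ratio suggests degraded RNA or inefficient labeling, so it is worth checking before any downstream analysis.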

Access to probe cell information

Generate a comparison file

- Drag and drop the experimental CEL file
- Choose the baseline file
- Enter the output file name

Scatter plot

[Figure: scatter plot with “2X up” and “2X down” fold-change lines and a “Background box”]
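The regions of the scatter plot can be expressed as a simple rule per gene. This is a minimal sketch assuming positive signal values; the function name and the `background` cutoff of 100 are hypothetical and stand in for whatever low-signal region the "background box" marks on a given plot.

```python
def classify_change(baseline, experiment, fold=2.0, background=100.0):
    """Label one gene by where it falls on a baseline-vs-experiment plot."""
    # Both signals are low: inside the background box, fold change unreliable.
    if baseline < background and experiment < background:
        return "background"
    # On or above the "2X up" line.
    if experiment >= fold * baseline:
        return "up"
    # On or below the "2X down" line.
    if baseline >= fold * experiment:
        return "down"
    return "unchanged"
```

For instance, a gene with baseline 200 and experiment 450 is labeled "up", while one with signals 50 and 60 is labeled "background" even though its ratio is above 1.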

Next steps

• Data export to Excel
• Data export to third party software

Third party software:
• Applied Maths, GenExplore™
• BioDiscovery, GeneSight
• GeneData AG, Expressionist
• LION Bioscience AG
• Molecular Applications Group, Stingray™
• MolecularWare, Inc., ArrayAnalyzerDB
• Partek, Inc., Partek Pro 2000
• Rosetta Inpharmatics, Resolver™
• Scanalytics, Inc., MicroArray Suite
• Silicon Genetics, GeneSpring™
• Spotfire, Inc.
• Media Cybernetics, Array-Pro®
• Microarray software developed by Stanford University
• TIGR (The Institute for Genomic Research) offers software tools (free for academic institutions) for array analysis
• OmniViz, Inc., OmniViz Pro
• Xpogen Inc., PathlinX

Plan

I- Introduction and potential applications of array platform
II- Existing platforms
III- Experimental design
IV- Steps involved in data analysis
• Data set QC
• Normalization
• Feature (gene) filtering
• Replicate analysis
• Clustering
• Statistical tests
• Pathway

Genespring interface

Choice of “genome”

Data import

TXT files from a GeneChip array, from a spotted array, or from any other type of array, as long as you have a “signal” associated with an identifier (gene, transcript, protein, other)
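The import requirement above (one signal per identifier) can be illustrated with a small reader for a tab-delimited TXT file. This is a sketch, not GeneSpring's importer: the function name and the default column names `ID_REF` and `VALUE` are assumptions and should be changed to match the actual export.

```python
import csv
import io


def read_signals(handle, id_column="ID_REF", signal_column="VALUE"):
    """Read a tab-delimited text file into {identifier: signal}."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[id_column]: float(row[signal_column]) for row in reader}


# Hypothetical two-row file held in memory for illustration.
example = io.StringIO("ID_REF\tVALUE\nprobe_1\t13.4\nprobe_2\t730.6\n")
signals = read_signals(example)
```

Any platform works with this approach as long as each row carries an identifier and a numeric signal.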

Samples information

• Sample-centric system (not experiment centric)

• Sample attributes format is MIAME compliant

Minimum information about a microarray experiment (MIAME): toward standards for microarray data. Nat Genet. 2001 Dec;29(4):365-71.

MIAME goal: to specify the minimum information that must be reported about a microarray experiment in order to ensure its interpretability, as well as potential verification of the results

• MIAME format required for microarray data publication

1. Experimental design: the set of the hybridisation experiments as a whole
2. Array design: each array used and each element (spot) on the array
3. Samples: samples used, the extract preparation and labeling
4. Hybridizations: procedures and parameters
5. Measurements: images, quantitation, specifications
6. Controls: types, values, specifications
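The six parts listed above lend themselves to a simple completeness check before submission. The sketch below is hypothetical: the section keys and the function name are invented for illustration and do not correspond to any official MIAME schema or tool.

```python
# Keys mirror the six MIAME parts listed above (names are illustrative).
MIAME_SECTIONS = (
    "experimental_design",
    "array_design",
    "samples",
    "hybridizations",
    "measurements",
    "controls",
)


def missing_miame_sections(record):
    """Return the MIAME parts that are absent or empty in `record`."""
    return [section for section in MIAME_SECTIONS if not record.get(section)]


# Hypothetical annotation record with two parts still to be filled in.
record = {
    "experimental_design": "treated vs untreated, 3 replicates each",
    "array_design": "commercial oligonucleotide array",
    "samples": "total RNA, labeled per manufacturer protocol",
    "hybridizations": "standard protocol",
}
todo = missing_miame_sections(record)
```

Here `todo` lists the measurements and controls parts, which would need to be documented before the data set could be called MIAME compliant.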

6 parts in MIAME

[Figure: MIAME diagram; labels: Experiment, Array, Sample, Hybridisation, Normalisation, Analysis]

Experiment parameters

Parameters can be used for gene filtering with a statistical test
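As an illustration of filtering genes on an experiment parameter, the sketch below splits samples into two groups and keeps genes whose group means differ significantly. It uses an exact two-sided permutation test on the difference of means, chosen here only because it needs no external library; it is not claimed to be the test GeneSpring applies, and all function names are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = count = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        count += 1
        if diff >= observed - 1e-12:  # tolerance for float rounding
            extreme += 1
    return extreme / count


def filter_genes(signals_a, signals_b, alpha=0.05):
    """Keep genes whose permutation p-value is below alpha.
    signals_a / signals_b: {gene: [replicate signals]} for each group."""
    return [gene for gene in signals_a
            if permutation_p_value(signals_a[gene], signals_b[gene]) < alpha]
```

With four replicates per group there are 70 possible splits, so the smallest attainable p-value is 2/70 (about 0.029); with fewer replicates no gene can pass a 0.05 cutoff, which is one reason replicate number matters for this kind of filter.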

Gene filtering

Gene lists

Union/Intersection of gene lists

Venn Diagram
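The union and intersection of gene lists, and the regions of a two-list Venn diagram, map directly onto set operations. A minimal sketch follows; the function name and the gene identifiers are placeholders, not data from the presentation.

```python
def compare_gene_lists(list_a, list_b):
    """Return the regions of a two-set Venn diagram as sorted lists."""
    a, b = set(list_a), set(list_b)
    return {
        "union": sorted(a | b),          # in either list
        "intersection": sorted(a & b),   # in both lists
        "only_a": sorted(a - b),         # unique to the first list
        "only_b": sorted(b - a),         # unique to the second list
    }


# Placeholder identifiers for two filtered gene lists.
regions = compare_gene_lists(["gene_1", "gene_2", "gene_3"],
                             ["gene_2", "gene_3", "gene_4"])
```

In this example the intersection holds gene_2 and gene_3, the genes that passed both filters.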

Statistical analysis

Statistical analysis output

• Venn diagram
• Clustering
• Pathway analysis
• …

Clustering tools in GeneSpring (GS)

Hierarchical clustering

Experiments and samples
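To make the idea of hierarchical clustering concrete, here is a small agglomerative sketch. The choices are assumptions made for illustration (average linkage, distance = 1 minus Pearson correlation) and are not necessarily GeneSpring's defaults; the function names are hypothetical, and profiles with zero variance are not handled.

```python
from itertools import combinations
from math import sqrt


def pearson_distance(x, y):
    """1 - Pearson correlation between two expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return 1.0 - cov / sqrt(var_x * var_y)


def average_linkage(profiles):
    """Agglomerative clustering of {name: profile}.
    Returns the merges in order as (cluster_a, cluster_b, distance)."""

    def cluster_distance(a, b):
        # Average pairwise distance between members of the two clusters.
        pairs = [(i, j) for i in a for j in b]
        return sum(pearson_distance(profiles[i], profiles[j])
                   for i, j in pairs) / len(pairs)

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        merges.append((a, b, cluster_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges
```

The list of merges is the information a dendrogram displays: which samples (or genes) join first, and at what distance.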

Projection methods

• Principal component analysis (PCA)
• Multi-Dimensional Scaling (MDS)
• Not clustering methods but can be used to determine or visualize cluster structure if present
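As a rough illustration of how a projection method exposes cluster structure, the sketch below finds the first principal component by power iteration and returns each sample's score along it. This is a teaching sketch with hypothetical names, not a replacement for a numerical library; it assumes the data have a dominant direction of variance and uses a seeded random start vector.

```python
import random
from math import sqrt


def first_principal_component(rows, iterations=200, seed=0):
    """rows: samples x genes. Returns (loading vector, sample scores)."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]

    def project(vec):
        # Score of every sample along direction `vec`.
        return [sum(c[j] * vec[j] for j in range(p)) for c in centered]

    rng = random.Random(seed)
    v = [rng.uniform(0.5, 1.0) for _ in range(p)]
    for _ in range(iterations):
        scores = project(v)
        # Multiply by X^T X, then renormalize (one power-iteration step).
        w = [sum(centered[i][j] * scores[i] for i in range(n))
             for j in range(p)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v, project(v)
```

If samples from two conditions receive scores of opposite sign on the first component, the projection is showing a group structure that a clustering method could then be used to confirm.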

Data submission to public repository

Do you submit your data to a MIAME-compliant public microarray database?

Response                          Response %   Response total
Always                            9.70%        6
Sometimes                         19.40%       12
Only if requested by publisher    38.70%       24
Never                             33.90%       21
Total respondents                              62

Which database are you submitting your data to?

Response                              Response %   Response total
Gene Expression Omnibus (GEO, NIH)    43.50%       27
ArrayExpress (EMBL)                   29%          18
Other (please specify)                33.90%       21
Total respondents                                  62

Data submission to GEO

3-step process:

1- Submission of the platform (array type)

2- Submission of the samples (MIAME)

ID_REF             VALUE      DETECTION   Detection p-value
AFFX-MurIL2_at     13.4       A           0.953518
AFFX-MurIL10_at    17.3       A           0.843268
AFFX-MurIL4_at     18.1       A           0.749204
AFFX-MurFAS_at     15.8       A           0.425962
AFFX-BioB-5_at     730.6      P           0.001593
AFFX-BioB-M_at     1952.8     P           0.000044
AFFX-BioB-3_at     1267.6     P           0.000147
AFFX-BioC-5_at     3155.5     P           0.00007
AFFX-BioC-3_at     2296.3     P           0.000052
AFFX-BioDn-5_at    2987.8     P           0.000044
AFFX-BioDn-3_at    16968.8    P           0.00006
AFFX-CreX-5_at     31299.5    P           0.000044
AFFX-CreX-3_at     47550      P           0.000044
AFFX-BioB-5_st     117.8      A           0.165861
AFFX-BioB-M_st     155        A           0.108979
AFFX-BioB-3_st     179.6      A           0.327079
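Before submitting a sample table like the one above, it can be useful to check that each detection call agrees with its p-value. The sketch below assumes the commonly used cutoffs of 0.04 (Present) and 0.06 (Marginal); these thresholds, the function name, and the tab-delimited layout are assumptions, and only three rows from the table are included for brevity.

```python
import csv
import io


def detection_call(p_value, alpha1=0.04, alpha2=0.06):
    """Map a detection p-value to a call: P (present), M (marginal), A (absent)."""
    if p_value < alpha1:
        return "P"
    if p_value < alpha2:
        return "M"
    return "A"


# Three rows from the sample table, written as tab-delimited text.
SAMPLE_TABLE = (
    "ID_REF\tVALUE\tDETECTION\tDetection p-value\n"
    "AFFX-MurIL2_at\t13.4\tA\t0.953518\n"
    "AFFX-BioB-5_at\t730.6\tP\t0.001593\n"
    "AFFX-BioB-5_st\t117.8\tA\t0.165861\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE_TABLE), delimiter="\t"))
# Probe sets whose reported call disagrees with the call implied by the p-value.
mismatches = [row["ID_REF"] for row in rows
              if detection_call(float(row["Detection p-value"])) != row["DETECTION"]]
```

For these rows `mismatches` is empty, meaning the reported calls are consistent with the assumed cutoffs.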

3- Submission of a “series” (experiment)

NetAffx

Definition: a comprehensive resource of functional annotations and public database references

Accession number

Access to NetAffx

• Free registration
• Updated every quarter

Quick query input

• Keyword
• Gene symbol
• Public DB number
• Probe set name

Quick query output

• GO: Gene Ontology
• Pathway information

Pathway Diagram

Detailed information

• GeneChip Array Information
• Probe design information
• Genomic Alignment of target sequence
• Public domain and Genome references
• Functional annotations
• Sequence

Genechip Array Information

Probe design/ Genomic Alignment

Link to UC Santa Cruz Genome Browser

Public domain

Functional Annotations

Sequence information

Batch query

Batch query output

• Export data to Excel
• Gene ontology browser

Useful links and lectures

• DNA microarray, Bowtell, Sambrook, CSHL
• A Biologist's Guide to Analysis of DNA Microarray Data, Steen Knudsen
• http://ihome.cuhk.edu.hk/%7Eb400559/array.html
• DNA Microarray (genome chip), Leming Shi, http://www.gene-chips.com/

Conclusion

[Diagram: the following areas feed into a GLOBAL UNDERSTANDING OF MOLECULAR BASIS OF CANCER]

• Proteomics
• Human Genetics (Genotyping)
• Clinical Database
• Genomics
• Basic Research
• Animal Models of Human Cancer
• Pathway Analysis
