Introduction to Gene Chips and Microarray Expression Data

Jan 22, 2016

Page 1: Introduction to Gene Chips and Microarray Expression Data

Introduction to Gene Chips and Microarray Expression Data

Dr. Travis Doom, Assistant Professor

BIRG Lab, Department of Computer Science and Engineering

Wright State University

Page 2: Introduction to Gene Chips and Microarray Expression Data

SIAC 2001 Intro to gene chips - 2

Outline

DNA Microarrays
– Fabrication
– Application
Microarray Data
– Analysis Techniques
New Technology & Open Commentary

Page 3: Introduction to Gene Chips and Microarray Expression Data

Fabrication

Fabrication via Printing
– DNA sequence stuck to glass substrate
– DNA solution pre-synthesized in the lab
Fabrication In Situ
– Sequence “built”
– Photolithographic techniques use light to release capping chemicals
– 365 nm light allows 20 µm resolution

Page 4: Introduction to Gene Chips and Microarray Expression Data

DNA Microarrays

Each probe consists of thousands of strands of identical oligonucleotides
– The DNA sequences at each probe represent important genes (or parts of genes)
Printing Systems
– Ex: HP, Corning Inc.
– Printing systems can build lengths of DNA up to 60 nucleotides long
– 1.28 x 1.28+ cm glass wafer
• Each “print head” has a ~100 µm diameter, and heads are separated by ~100 µm (5,000 – 20,000 probes)
Photolithographic Chips
– Ex: Affymetrix
– 1.28 x 1.28 cm glass/silicon wafer
• 24 x 24 µm probe site (500,000 probes)
– Lengths of DNA up to 25 nucleotides long
– Requires a new set of masks for each new array type

GeneChip

Page 5: Introduction to Gene Chips and Microarray Expression Data

Practical Application of DNA Microarrays

DNA microarrays are used to study gene activity (expression)
– What proteins are being actively produced by a group of cells?
• “Which genes are being expressed?”
How?
– When a cell is making a protein, it transcribes the genes (made of DNA) which code for the protein into RNA used in its production
– The RNA present in a cell can be extracted
– If a gene has been expressed in a cell
• RNA will bind to “a copy of itself” on the array
• RNA with no complementary site will wash off the array
– The RNA can be “tagged” with a fluorescent dye to determine its presence

DNA microarrays provide a high-throughput technique for quantifying the presence of specific RNA sequences
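The binding rule on this slide can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: sequences use the DNA alphabet (A, C, G, T), binding requires a perfect complementary match, and the function names `reverse_complement` and `binds` are hypothetical rather than part of any real toolkit.

```python
# Watson-Crick base pairing, DNA alphabet only (assumption for this sketch).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the strand that pairs with seq, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def binds(probe, target):
    """True if some stretch of target is perfectly complementary to probe."""
    return reverse_complement(probe) in target

print(binds("ATGC", "GGGCATGG"))  # target contains GCAT, the partner of ATGC
print(binds("ATGC", "AAAAAA"))    # no complementary site, so it washes off
```

Real hybridization is stochastic and tolerates partial matches, which is why the later slides add mismatch controls and statistical tests.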

Page 6: Introduction to Gene Chips and Microarray Expression Data

The Process

[Figure: sample preparation flow. Cells → poly-A RNA → cDNA → IVT (in-vitro transcription, with 10% biotin-labeled uracil) → labeled antisense cRNA → fragment (heat, Mg2+) → labeled fragments → hybridize → wash/stain → scan]

Page 7: Introduction to Gene Chips and Microarray Expression Data

Hybridization and Staining

[Figure: a GeneChip plus biotin-labeled cRNA gives a hybridized array, which is then stained with SAPE (streptavidin-phycoerythrin)]

Page 8: Introduction to Gene Chips and Microarray Expression Data

The Result

A light source scans the array, causing the dyes to fluoresce

The glow is picked up by a sensor and is used to determine the relative abundance of the RNA

This information must be processed to determine the level of activity for each expressed gene

Page 9: Introduction to Gene Chips and Microarray Expression Data

The Goals

Basic Understanding
– Arrays can take a snapshot of which subset of genes in a cell is actively making proteins
– Heat shock experiments
Medical diagnosis
– Microarrays can indicate where mutations lie that might be linked to a disease. Other arrays are used to determine whether a person’s genetic profile would make him or her more or less susceptible to drug side effects
– 1999 – A GeneChip containing 6800 human genes was used to distinguish between myeloid leukemia and lymphoblastic leukemia using a set of 50 genes that have different activity levels
Drug design
– Pharmaceutical firms are in a rush to translate the human genome results into new products
• Potential profits are huge
• First, though, they must figure out what the genes do, how they interact, and how they relate to diseases.
– Evaluation, Specificity, Response

Page 10: Introduction to Gene Chips and Microarray Expression Data

The Gains

A decade of rapid advances in biology has swept an avalanche of genetic information into scientists’ laps.

Mass analysis of the vast set of biologic data is impractical without high-throughput techniques

DNA microarrays (aka gene chips, biochips) allow researchers to look for the presence, productivity, or sequence of thousands of genes simultaneously

Advantages:
– Speed
– Feasibility
– Sensitivity
– Reproducibility

Page 11: Introduction to Gene Chips and Microarray Expression Data

Outline

DNA Microarrays
– Fabrication
– Application
Microarray Data
– Analysis Techniques
New Technology & Open Commentary

Page 12: Introduction to Gene Chips and Microarray Expression Data

Microarray Data

First, the Problems:
1. The fabrication process is not error free
2. Probes have a maximum length of 25-60 nucleotides
3. Biologic processes such as hybridization are stochastic
4. Background light may skew the fluorescence
5. How do we decide if/how strongly a particular gene is being expressed?

Solutions to these problems are still in their infancy

Page 13: Introduction to Gene Chips and Microarray Expression Data

Features

Problem #1: The fabrication process is not error free

Solution: Each probe does not represent a unique DNA sequence.

Probe set: A set of probes each containing the same DNA sequence (the Feature)

Remove outermost rows and columns to avoid fabrication-based error

Page 14: Introduction to Gene Chips and Microarray Expression Data

Feature Value

83 112 96 32

47 382 165 87

55 246 140 93

104 552 187 65

Remove outermost rows and columns

Find 75th percentile of remaining values

This value is taken as representative of this feature
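As a worked example, the two steps can be applied to the 4 x 4 grid of intensities on this slide. The slide does not say how the percentile is interpolated, so the sketch below assumes linear interpolation between ranked values; the helper names `percentile` and `feature_value` are hypothetical.

```python
def percentile(values, q):
    """q-th percentile with linear interpolation between ranks (an assumption)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

def feature_value(grid, q=75):
    """Drop the outermost rows and columns, then take the q-th percentile."""
    inner = [v for row in grid[1:-1] for v in row[1:-1]]
    return percentile(inner, q)

grid = [
    [83, 112, 96, 32],
    [47, 382, 165, 87],
    [55, 246, 140, 93],
    [104, 552, 187, 65],
]
print(feature_value(grid))  # inner values are 382, 165, 246, 140 -> 280.0
```

A different percentile convention (for example nearest rank) would return 246 here, so the representative value depends on that choice.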

Page 15: Introduction to Gene Chips and Microarray Expression Data

How Features Are Chosen

[Figure: a gene sequence (5’ to 3’) covered by multiple 25-mer oligo probes (features)]

Problem #2: Probes have a maximum length of 25-60 nucleotides
– Solution: Use multiple features per gene

– Affymetrix claims that this redundancy actually improves detection and quantification of the target gene

Page 16: Introduction to Gene Chips and Microarray Expression Data

Feature Mismatches

[Figure: a gene sequence (5’ to 3’) covered by multiple 25-mer oligo probes, each shown as a perfect match and a mismatch]

Problem #3: Biologic processes such as hybridization are stochastic
– Solution: Include a “control” for each probe: a DNA sequence which differs only slightly from the feature

– In a 25-mer, the mismatch sequence differs in the 13th position (A-T or G-C)
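The mismatch rule can be sketched as follows, reading "A-T or G-C" as swapping the 13th base for its complement. The function name `mismatch_probe` and the example sequence are hypothetical.

```python
# The middle base is replaced by its complement: A<->T, G<->C.
SWAP = {"A": "T", "T": "A", "G": "C", "C": "G"}

def mismatch_probe(perfect_match):
    """Build the mismatch (MM) control for a 25-mer perfect-match (PM) probe."""
    if len(perfect_match) != 25:
        raise ValueError("expected a 25-mer")
    mid = 12  # 13th position, zero-based index
    return perfect_match[:mid] + SWAP[perfect_match[mid]] + perfect_match[mid + 1:]

pm = "ACGT" * 6 + "A"   # made-up 25-mer, for illustration only
mm = mismatch_probe(pm)
print(pm)
print(mm)
```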

Page 17: Introduction to Gene Chips and Microarray Expression Data

Background Noise Removal

Problem #4: Background light may skew the fluorescence

“Measure of non-specific fluorescence attributed to hybridization conditions and sample” = Noise

Solution: Estimate background noise and subtract intensity

The array is divided into equal sectors (16 is standard)

For each sector

– Find the lowest feature intensities (2%)

– Average these

– Subtract this average from the intensity value of all features in the sector
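The per-sector steps can be sketched as below. This is an illustration only: it takes one sector as a flat list of feature intensities, rounds the 2% count to the nearest whole feature with a minimum of one (an assumption, since the slide does not say how to round), and uses the hypothetical name `subtract_background`.

```python
def subtract_background(sector, fraction=0.02):
    """Subtract the mean of the lowest `fraction` of intensities from a sector."""
    n_low = max(1, round(len(sector) * fraction))   # at least one feature
    background = sum(sorted(sector)[:n_low]) / n_low
    return [value - background for value in sector]

sector = list(range(1, 101))          # 100 made-up intensities: 1..100
corrected = subtract_background(sector)
print(corrected[0], corrected[-1])    # background is (1 + 2) / 2 = 1.5
```

Features at or below the estimated background come out zero or negative; how those are handled downstream is not covered on the slide.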

Page 18: Introduction to Gene Chips and Microarray Expression Data

Average Difference Intensity

Problem #5: How do we decide if / how strongly a particular gene is being expressed?

For a given gene
– For each feature match/mismatch pair for the given gene
• Calculate the difference PM − MM
– Calculate the mean μ and standard deviation σ of the differences for this set
– Remove outliers from set
• Ex: abs((PM − MM) − μ) > 3σ
– The average (PM − MM) difference over the set (minus outliers) is the average difference intensity
– This value can be used to compare expression levels for the gene which the features represent

$$\mathrm{AvgDiff} = \frac{1}{\#\,\text{pairs in avg}} \sum_{i \,\in\, \text{pairs in avg}} \left( PM_i - MM_i \right)$$
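A minimal sketch of the average difference calculation follows. It assumes the outlier rule is "more than three standard deviations from the mean" and uses the sample standard deviation; the slide does not specify either detail, and the name `average_difference` is hypothetical.

```python
from statistics import fmean, stdev

def average_difference(pairs, k=3.0):
    """Mean of PM - MM over probe pairs, after dropping k-sigma outliers."""
    diffs = [pm - mm for pm, mm in pairs]
    mu, sigma = fmean(diffs), stdev(diffs)   # needs at least two pairs
    kept = [d for d in diffs if abs(d - mu) <= k * sigma]
    return fmean(kept)

# 19 well-behaved pairs plus one made-up pair with a huge PM signal.
pairs = [(200, 100)] * 19 + [(10100, 100)]
print(average_difference(pairs))   # the 10000 difference is dropped
```

With only a handful of pairs a single extreme value cannot exceed three sample standard deviations, so the rule only starts to bite for probe sets of roughly a dozen pairs or more.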

Page 19: Introduction to Gene Chips and Microarray Expression Data

Positive & Negative Probe Pairs

Problem #5: How do we decide if / how strongly a particular gene is being expressed?

For each perfect match/mismatch probe pair in the feature, perform a standard difference and ratio test

– PM/MM ≥ SRT
– PM − MM ≥ SDT
If both true, mark probe pair as positive evidence

– MM/PM ≥ SRT
– MM − PM ≥ SDT
If both true, mark probe pair as negative evidence

Otherwise, mark probe pair as inconclusive

Example SRT and SDT thresholds:
– SRT: 1.5
– SDT: a multiple of intensity or …
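The ratio and difference tests can be sketched as a small classifier. The comparisons are assumed to be "at least" (≥), the ratio test is written as a multiplication to avoid dividing by zero, and the default `sdt=50.0` is a made-up value for illustration, since the slide only says SDT is a multiple of an intensity measure.

```python
def classify_pair(pm, mm, srt=1.5, sdt=50.0):
    """Label one PM/MM probe pair as positive, negative, or inconclusive."""
    # PM/MM >= SRT is rewritten as PM >= SRT * MM (valid for MM >= 0).
    if pm >= srt * mm and pm - mm >= sdt:
        return "positive"
    if mm >= srt * pm and mm - pm >= sdt:
        return "negative"
    return "inconclusive"

print(classify_pair(300, 100))   # ratio 3.0, difference 200
print(classify_pair(100, 300))   # mirror image of the case above
print(classify_pair(110, 100))   # fails both the ratio and difference tests
```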

Page 20: Introduction to Gene Chips and Microarray Expression Data

Voting Methods for Absolute Call

Problem #5: How do we decide if / how strongly a particular gene is being expressed?
– Solution: Use decision matrix to make absolute call

Positive/negative ratio: PNR = # pos. calls / # neg. calls
Positive fraction: PF = # pos. calls / # probe pairs
Log average ratio: LA = 10 x avg(log(PM/MM))

Decision thresholds (boundaries between the Absent, Marginal, and Present calls):

        Absent/Marginal    Marginal/Present
PNR     3.00               4.00
PF      0.33               0.43
LA      0.90               1.30

VOTE!
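One way to read the decision matrix is sketched below. The slide does not spell out the voting rule, so this sketch makes several assumptions: each metric casts one vote by comparing against the two boundaries, a call needs at least two of three votes, a three-way split is called Marginal, and the logarithm is base 10. The names `metric_vote` and `absolute_call` are hypothetical.

```python
import math
from collections import Counter

# Boundaries from the decision matrix: (Absent/Marginal, Marginal/Present).
THRESHOLDS = {"PNR": (3.00, 4.00), "PF": (0.33, 0.43), "LA": (0.90, 1.30)}

def metric_vote(value, low, high):
    """Turn one metric into a vote using its two boundaries."""
    if value < low:
        return "Absent"
    if value < high:
        return "Marginal"
    return "Present"

def absolute_call(pairs, calls):
    """pairs: (PM, MM) tuples; calls: per-pair 'positive'/'negative'/'inconclusive'."""
    pos, neg = calls.count("positive"), calls.count("negative")
    if neg:
        pnr = pos / neg
    else:
        pnr = math.inf if pos else 0.0   # no negative calls to divide by
    pf = pos / len(pairs)
    la = 10 * sum(math.log10(pm / mm) for pm, mm in pairs) / len(pairs)
    metrics = {"PNR": pnr, "PF": pf, "LA": la}
    votes = Counter(metric_vote(metrics[n], *THRESHOLDS[n]) for n in metrics)
    winner, count = votes.most_common(1)[0]
    return winner if count >= 2 else "Marginal"

pairs = [(400, 100)] * 8 + [(100, 100)] * 2
calls = ["positive"] * 8 + ["inconclusive"] * 2
print(absolute_call(pairs, calls))
```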

Page 21: Introduction to Gene Chips and Microarray Expression Data

Average Difference and Absolute Call

Problem #5: How do we decide if / how strongly a particular gene is being expressed?

Which of these do you base a decision on, for whether a gene is being expressed?

Use the absolute call to decide whether a particular gene is being expressed

Use average difference to compare how strongly a gene which is present is expressed

$$\mathrm{AvgDiff} = \frac{1}{\#\,\text{pairs in avg}} \sum_{i \,\in\, \text{pairs in avg}} \left( PM_i - MM_i \right)$$

Decision thresholds (boundaries between the Absent, Marginal, and Present calls):

        Absent/Marginal    Marginal/Present
PNR     3.00               4.00
PF      0.33               0.43
LA      0.90               1.30

Page 22: Introduction to Gene Chips and Microarray Expression Data

Comparison Analysis

Compare probe sets between two gene chips to determine whether gene expression increased, did not change, or decreased

Comparison analysis has its own set of problems:
– The signals must be adjusted (if necessary) to normalize average signal levels

For each perfect match/mismatch probe pair in the feature, perform a difference and ratio test

If both true, mark probe pair as evidence of increase from base
– (PM/MM)experiment − (PM/MM)base ≥ Change Threshold
– (PM−MM)experiment / (PM−MM)base ≥ Percentage Change Threshold

If both true, mark probe pair as evidence of decrease from base
– (PM/MM)base − (PM/MM)experiment ≥ Change Threshold
– (PM−MM)base / (PM−MM)experiment ≥ Percentage Change Threshold

Otherwise mark probe pair as unchanged
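The per-pair comparison test can be sketched as follows, using the two expressions as written on the slide. The comparisons are assumed to be "at least" (≥), both default thresholds are made-up values, MM intensities are assumed to be positive, and a non-positive PM − MM denominator is treated as a failed test. The name `classify_change` is hypothetical.

```python
def classify_change(pm_exp, mm_exp, pm_base, mm_base,
                    change_threshold=1.0, pct_threshold=1.5):
    """Label one probe pair as increase, decrease, or unchanged versus base."""
    ratio_exp, ratio_base = pm_exp / mm_exp, pm_base / mm_base
    diff_exp, diff_base = pm_exp - mm_exp, pm_base - mm_base
    if (ratio_exp - ratio_base >= change_threshold
            and diff_base > 0 and diff_exp / diff_base >= pct_threshold):
        return "increase"
    if (ratio_base - ratio_exp >= change_threshold
            and diff_exp > 0 and diff_base / diff_exp >= pct_threshold):
        return "decrease"
    return "unchanged"

print(classify_change(800, 100, 200, 100))   # experiment much brighter than base
print(classify_change(200, 100, 800, 100))   # experiment much dimmer than base
print(classify_change(200, 100, 200, 100))   # identical signals
```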

Page 23: Introduction to Gene Chips and Microarray Expression Data

Voting Methods for Comparison Call

Increase fraction: IF = # increase calls / # PP used
Increase ratio: IR = # increase calls / # decrease calls
Log average ratio change: LAC = LAexp – LAbase

If a change is called, use the average difference to measure percent change

Are there better ways to extract patterns from multivariate gene expression profiles?

Decision thresholds (boundaries between the No Change, Marginal, and Increase calls):

        No Change/Marginal    Marginal/Increase
IF      0.33                  0.43
IR      3.0                   4.0
LAC     0.90                  1.30
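The three comparison metrics can be computed as in the sketch below, taking the abbreviations IF, IR, and LAC from the threshold table. It assumes "# PP used" means the number of probe pairs that received a call, and that the two LA values were computed beforehand; the name `comparison_metrics` is hypothetical.

```python
import math

def comparison_metrics(calls, la_exp, la_base):
    """calls: per-pair 'increase'/'decrease'/'unchanged' labels."""
    inc, dec = calls.count("increase"), calls.count("decrease")
    if dec:
        increase_ratio = inc / dec
    else:
        increase_ratio = math.inf if inc else 0.0   # no decrease calls
    return {
        "IF": inc / len(calls),       # increase fraction
        "IR": increase_ratio,         # increase ratio
        "LAC": la_exp - la_base,      # log average ratio change
    }

calls = ["increase"] * 6 + ["decrease"] * 2 + ["unchanged"] * 2
print(comparison_metrics(calls, la_exp=2.5, la_base=1.0))
```

Each metric would then be compared against its pair of boundaries in the table to vote on the comparison call.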

Page 24: Introduction to Gene Chips and Microarray Expression Data

Outline

DNA Microarrays
– Fabrication
– Application
Microarray Data
– Analysis Techniques
New Technology & Open Commentary

Page 25: Introduction to Gene Chips and Microarray Expression Data

Does Moore’s Law apply to Gene Chips?

Ideally, we would like to fit all of an organism’s genes on one chip
– Current estimates for humans are between 30,000 – 40,000 genes

[Chart: Cost, in dollars/gene (axis 0 to 1.2), by year, 1994 to 2004]

[Chart: Density, in genes/chip (axis 0 to 200,000), by year, 1994 to 2004]

Page 26: Introduction to Gene Chips and Microarray Expression Data

Field-Programmable Microarrays?

Nanogen has produced a silicon chip embedded with 100 “programmable” probe pads
– 80 µm platinum pads (each spaced about 200 µm apart)
– Each pad can have a voltage applied (-1.3 to 2.0 V)

Since DNA carries a negative charge, applying a positive charge on a pad “corrals” DNA onto that spot
– This is used to build custom arrays by washing the chip in a single-stranded DNA solution, biasing the desired spot on the chip, and then chemically fixing the DNA to that spot

The electric charge is also useful during the hybridization reaction
– Pooling the DNA onto the charged pads increases the reaction rate by a factor of 1000
– Reversing the charge “shakes loose” imperfectly matched DNA, leading to more accurate results

Page 27: Introduction to Gene Chips and Microarray Expression Data

From the Rumor-Mill

Xeotron Corp: Maskless lithography
– An array of micromirrors is used to direct/block light during fabrication

Motorola: 3D microarrays
– Arrays with a coating of acrylamide gel to allow “certain enzymatic reactions” to occur that might be important to lab-on-a-chip applications

Motorola: Electrical intensity measures
– Arrays contain embedded circuitry to detect hybridization through a change in conductance rather than fluorescence

Ciphergen Biosystems Inc. & Packard Instrument Co.: Protein chips
– Creates microarrays of antibodies (rather than DNA) to bind and identify proteins

Page 28: Introduction to Gene Chips and Microarray Expression Data

Acknowledgements

David Paoletti, Ph.D. Student, BIRG Lab, Wright State University.
Berberich, S, and McGorry, M; GeneChip protocols, Wright State University.
Moore, S K; Making chips to probe genes, IEEE Spectrum, March 2001, 54-60.
GeneChip Gene Expression Algorithm Training, Affymetrix.

Page 29: Introduction to Gene Chips and Microarray Expression Data

Questions?

DNA Microarrays
– Fabrication
– Application
Microarray Data
– Analysis Techniques
New Technology & Open Commentary

Page 30: Introduction to Gene Chips and Microarray Expression Data

The End

DNA Microarrays
– Fabrication
– Application
Microarray Data
– Analysis Techniques
New Technology & Open Commentary