Sequence Analysis & Gene Expression

MUPGRET workshop, Columbia, MO, June 2005 (HJ Bohnert, UIUC)

bohnerth@life.uiuc.edu

Post on 22-Dec-2015



Organism selection: genome size – why – what is the benefit - politics

Decisions: mapping first, “shotgun sequencing”, BAC alignment/sequencing [BAC – bacterial artificial chromosome; also YAC (yeast artificial chromosome)]

Genome sequence: raw sequence – confirmed sequence – gene models – verification

Verification: is the gene model transcribed? Yes/no/perhaps – “ubiquitous” gene, family specific, homolog - ortholog - paralog

Transcript profiles: when – how much [abundant] – where – transcript “variants” – inducible by condition X?

Genomics

information mining, hypotheses, experiment - insight, application, virtual life

expression profiles

knock-outs, RNA & RNAi

protein localization

structure analysis

dynamic metabolite catalogs

biochemical genetics

protein interaction maps


Databases, Integration & Intuition

genome & transcriptome sequences

… not just genes

markers & QTLs

How (much) will ‘encyclopedic’ approaches lead to better understanding?

[Figure: Columbia grown in Soy-FACE; panels: control, O3, CO2]

Field on a dish!

Arabidopsis – model plant

small, fast, prolific, mutants, lines, ecotypes, genome sequence

PIP2;2

Ch-1

PIP1;3 TIP3;2 NIP3;1 TIP2;x pseudo TIP3;1 NIP6;1

Ch-5

SIP1;2 NIP4;1 NIP4;2 TIP2;3 PIP2;4

5 10 20 30 Mb

Ch-4

PIP1;4 TIP1;3 NIP5;1 TIP2;2 NIP1;1 NIP1;2 PIP1;5

Ch-3

TIP1;2 TIP2;1 NIP7;1 SIP1;1 TIP5;1

PIP1;1

PIP2;1

PIP2;5

SIP2;1

Ch-2

PIP1;2 TIP4;1 NIP2;1 pseudo NIP3;1 pseudo NIP2;1

TIP1;1

PIP2;6

PIP2;3

PIP2;8

PIP2;7

(15)

(4)

(14)

(3)

(12)

- duplicated regions that include AQPs.

rDNA

AQPs are distributed over all chromosomes – a few clusters, many duplications

Arabidopsis thaliana (AGI, 2000)

Plants in silico? Sure! And then: Plant Design from Scratch

Ecosystem – population – species – ecotype (- breeding line)

Organism – organ – tissue – cell – compartment

Nucleus – envelope & pore – nucleoplasm, nucleolus & chromosomes

Euchromatin & heterochromatin – gene islands – gene

Promoters – 5’-regulatory (untranslated = UTR) – introns & exons – mature coding region – 3’-regulatory (UTR) regions

The Plant Genome

The Plant Genome: Controls for Gene Expression – many Switchboards

• Chromatin condensation state

• Local chromatin environment

• Transcription initiation

• Transcript elongation

• mRNA splicing

• mRNA export

• mRNA place in the cell

• RNA half-life

• Killer microRNAs

• Ribosome loading

• Protein transport/targeting

• Protein modifications

• Protein turnover

Levels of regulation that affect what we call “gene expression”

The Plant Transcriptome

Killer RNAs (there are micro-genes)

Result: no protein, i.e., the gene is essentially “silenced”

5 years ago, we did not know that such a control system existed!

microRNAs

The Plant Transcriptome

How to sample the transcriptome?

• Morphological dissection (root, leaf, flower - epidermis, guard cell, etc.)

• Cell sorting: make single cells, send through cell sorter (size, color, reporter gene)

• Laser ablation: micromanipulation of laser to cut individual cells

• Biochemical dissection (compartment isolation): chloroplasts, mitochondria, ribosomes, other membranes

Painting cells with a reporter gene, here GFP (Green Fluorescent Protein)

Painting tissues, then isolating desired cells

Enzymatic staining

The Plant Transcriptome

The endodermis of the root tip is highlighted in transgenic plants using pSCR::mGFP5.

Emerging lateral roots [requires plant transformation]

The Plant Transcriptome

> cDNA libraries

• “neat”

• normalized

• subtracted

> SAGE libraries

cDNA – complementary DNA: converts messenger RNA into double-stranded DNA

“Normalization” removes mRNAs for which there are many copies in a cell – thus enriching for “rare mRNAs” (not so much sequencing to do)

Subtraction removes cDNAs which you already know (less sequencing)

Total RNA

Poly(A)+ RNA

1st strand cDNA

ds-cDNA

Size-selected double stranded cDNA (>500 bp)

Ligate to EcoRI adapters/digest NotI

Clone (EcoRI/NotI) digested pBSII/SK+ & adaptored cDNA

Primary cDNA Library

Primary (neat) library may be used for “normalization”

Library Normalization

primary cDNA library

ss-DNA

DNA “tracer”

PCR inserts by T7 and T3 standard primers

DNA “driver”

tracer/driver hybridization

column chromatography (double strands stick)

Non-hybridized DNA from flow-through = normalized clones

make ss-DNA out of primary library

cDNA Libraries

Cloning of root RNAs from segments S1 – S4, root tip (Sharp lab)

sequenced ~18,000 clones

found ~8,000 unique and ~130 novel genes

How many genes make a root?

The Plant Transcriptome

Serial Analysis of Gene Expression (SAGE)

http://www.sagenet.org/

Velculescu et al. 1995
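The counting idea behind SAGE can be sketched in a few lines: each transcript contributes one short tag taken next to a defined restriction site, and tag counts stand in for transcript abundance. This is a minimal sketch, not the published protocol in full; the anchoring site (CATG), the 10-base tag length, and the function names `sage_tag` and `count_tags` are assumptions made for illustration.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # assumed anchoring-enzyme site
TAG_LENGTH = 10       # assumed number of bases kept downstream of the site


def sage_tag(transcript, site=ANCHOR_SITE, tag_length=TAG_LENGTH):
    """Return the tag that follows the 3'-most anchoring site, or None."""
    sequence = transcript.upper()
    pos = sequence.rfind(site)          # 3'-most occurrence of the site
    if pos == -1:
        return None                     # transcript has no anchoring site
    start = pos + len(site)
    tag = sequence[start:start + tag_length]
    # Discard tags truncated by the end of the transcript.
    return tag if len(tag) == tag_length else None


def count_tags(transcripts):
    """Count how often each tag occurs in a collection of transcripts."""
    tags = (sage_tag(t) for t in transcripts)
    return Counter(t for t in tags if t is not None)


if __name__ == "__main__":
    # Toy sequences, not real transcripts.
    pool = ["GGCATGAAACCCGGGTTTAA"] * 3 + ["TTCATGCCCCCCCCCCAA"]
    for tag, n in count_tags(pool).most_common():
        print(tag, n)
```

A tag seen three times as often as another suggests a roughly three-fold difference in mRNA abundance, provided each tag maps to a single gene.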

1 2 3 4 5 6 7 8 9 10 M

coding region (known or expected): forward primer, reverse primer

Amplicon (sequence or clone + sequence)

results

Serial dilution: 1x - 1/5x - 1/25x - 1/125x

[cycle number]

Real-time PCR (quantitative)

RNA (DNA-free) to cDNA

use product in dilutions for amplification

Assumption: each cycle increases amount by factor 2 (or 1.8)

Check by using known amount of cloned control cDNA
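The per-cycle assumption can be checked against the dilution series. If each cycle multiplies the product by a factor E, a 5-fold dilution should delay the threshold cycle by log(5)/log(E) cycles: about 2.3 cycles for E = 2 and about 2.7 for E = 1.8. The sketch below also estimates E from measured threshold cycles by a least-squares line through cycle number versus log10(amount). The function names and the example threshold cycles are illustrative assumptions, not data from the workshop.

```python
import math


def ct_shift(dilution_factor, amplification=2.0):
    """Extra cycles needed after diluting the template by dilution_factor."""
    return math.log(dilution_factor) / math.log(amplification)


def amplification_from_dilutions(relative_amounts, ct_values):
    """Estimate the per-cycle amplification factor from a dilution series.

    Fits Ct = intercept + slope * log10(amount); then E = 10 ** (-1 / slope).
    """
    xs = [math.log10(a) for a in relative_amounts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return 10 ** (-1.0 / slope)


if __name__ == "__main__":
    amounts = [1, 1 / 5, 1 / 25, 1 / 125]          # dilution series from the slide
    # Hypothetical threshold cycles generated with E = 1.8, starting at cycle 20.
    cts = [20 + k * ct_shift(5, 1.8) for k in range(4)]
    print(round(ct_shift(5, 2.0), 2), round(ct_shift(5, 1.8), 2))
    print(round(amplification_from_dilutions(amounts, cts), 3))
```

An estimate well below 2 indicates inefficient amplification, which is why the known amount of cloned control cDNA is run alongside.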

Melting curves

[single products]

Two amplicons are shown

Each shows a single melting curve

Single genes have been amplified here

Melting curves

[multiple products]

More than one gene has been amplified here
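Reading a melting curve amounts to counting peaks in the negative slope of fluorescence against temperature: one peak suggests a single product, several peaks suggest several. The following is a minimal sketch under stated assumptions: the curves are synthetic (sums of logistic steps), and the 20% peak cutoff and the names `melt_peaks` and `synthetic_melt` are chosen for illustration only.

```python
import math


def melt_peaks(temps, fluorescence, min_fraction=0.2):
    """Return temperatures at which -dF/dT has a local maximum.

    Uses central differences; peaks lower than min_fraction of the
    tallest peak are ignored as noise.
    """
    neg_slope = [
        -(fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        for i in range(1, len(temps) - 1)
    ]
    if not neg_slope or max(neg_slope) <= 0:
        return []
    cutoff = min_fraction * max(neg_slope)
    peaks = []
    for j in range(1, len(neg_slope) - 1):
        here = neg_slope[j]
        if here >= cutoff and here > neg_slope[j - 1] and here >= neg_slope[j + 1]:
            peaks.append(temps[j + 1])  # neg_slope[j] belongs to temps[j + 1]
    return peaks


def synthetic_melt(temps, melting_points, width=0.8):
    """Toy melting curve: one logistic drop in fluorescence per product."""
    return [
        sum(1.0 / (1.0 + math.exp((t - tm) / width)) for tm in melting_points)
        for t in temps
    ]


if __name__ == "__main__":
    temps = [60 + 0.5 * i for i in range(71)]       # 60 to 95 degrees C
    print(melt_peaks(temps, synthetic_melt(temps, [82.0])))
    print(melt_peaks(temps, synthetic_melt(temps, [80.0, 88.0])))
```

Real instrument traces are noisier than these toy curves and would normally be smoothed before peak picking.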

Homologous genes

[identity – similarity – divergence]

orthologous – paralogous

relationships

Quantitative PCR in 384-well plates (96 primer pairs, 3 repeats each)

Taking SAGE & cDNA sequences together – corn roots “express” 20,000-23,000 genes (i.e., mRNA is made)

The entire corn genome is expected to include ~50,000 genes

The Plant Transcriptome

Substrates for High Throughput Arrays

Nylon membrane: single label (33P)

Glass slides: dual label (Cy3, Cy5)

GeneChip: single label (biotin/streptavidin)

TeleChem ChipMaker2 Pins

Pin pick-up volume 100-250 nl

Spot diameter 75-200 µm

Spot volume 0.2-1.0 nl
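The pin figures imply how many spots one pick-up can deliver. A rough sketch, assuming every nanolitre picked up is available for spotting (no dead volume or evaporation); volumes are given in picolitres so the division stays in whole numbers, and `spots_per_load` is an illustrative name.

```python
def spots_per_load(pickup_pl, spot_pl):
    """Whole spots one pin load can print (volumes in picolitres)."""
    return pickup_pl // spot_pl


if __name__ == "__main__":
    # Worst case: smallest load (100 nl), largest spot (1.0 nl).
    print(spots_per_load(100_000, 1_000))
    # Best case: largest load (250 nl), smallest spot (0.2 nl).
    print(spots_per_load(250_000, 200))
```

Under these assumptions a single pick-up covers between 100 and 1250 spots.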

Creating cDNA Arrays

cDNA cloned into vector and transformed to create cDNA library

Clones sequenced and unique set chosen and reracked

Slides printed on Cartesian Arrayer

384 well microtiter plate

Q-Pix

PCR on Tecan workstation Final product

Unique set of clones

NSF Soybean Functional Genomics, Steve Clough / Vodkin Lab

Printing Arrays on 50 slides

Slide Chemistry

Glass coatings: silylated (aldehyde), poly-L-lysine (amine), silanated

[Figure: chemical structures of the coated glass surfaces, showing Si–OH, aldehyde and NH3+ groups]


We use SuperAmine and SuperAldehyde from TeleChem (arrayit.com)

GSI Lumonics


Placenta vs. Brain – 3800 Cattle Placenta Array cy3 cy5

GenePix Image Analysis Software

Troubleshooting

The Good

The Bad

The Ugly


Post-Print Processing

Printed slide

Rehydrate spots (hot water)

Snap dry

Fix DNA to coating (UV light)

Chemically block background. Denature to single strands.

Hybridize & Scan

Cells from condition A / Cells from condition B

mRNA

cDNA

Label Dye 1 / Label Dye 2

Mix

ScanArray 3000 Fluorescent Scanner

Overlay Images

Ratio of expression of genes from two sources: equal, over, under
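The "equal, over, under" call for each spot can be expressed as a log2 ratio of the two dye signals. This is a minimal sketch assuming background-subtracted, positive intensities; the two-fold cutoff and the function names are illustrative assumptions, not settings from the slides.

```python
import math


def log2_ratio(dye2_signal, dye1_signal):
    """log2 of the dye 2 signal over the dye 1 signal for one spot."""
    return math.log2(dye2_signal / dye1_signal)


def classify(dye2_signal, dye1_signal, fold_cutoff=2.0):
    """Call a spot 'over', 'under' or 'equal' relative to the dye 1 sample."""
    ratio = log2_ratio(dye2_signal, dye1_signal)
    limit = math.log2(fold_cutoff)
    if ratio >= limit:
        return "over"
    if ratio <= -limit:
        return "under"
    return "equal"


if __name__ == "__main__":
    # Hypothetical spot intensities (dye 2, dye 1).
    for spot in [(4000, 1000), (1000, 4000), (1200, 1000)]:
        print(spot, round(log2_ratio(*spot), 2), classify(*spot))
```

Working in log2 makes a two-fold increase (+1) and a two-fold decrease (-1) symmetric around zero.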

Reverse Labeling

Slide 1: Cy3 over-expressed

Slide 2: Cy5 over-expressed
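Reverse labeling lets a dye-specific bias be separated from the biological difference: a real difference flips sign when the dyes are swapped, whereas a dye bias does not. A minimal sketch for one gene, assuming sample A is labeled Cy3 and sample B Cy5 on slide 1, with the labels swapped on slide 2; the function name and intensities are hypothetical.

```python
import math


def dye_swap(slide1, slide2):
    """Combine a dye-swap pair for one gene.

    Each slide is a (cy3, cy5) intensity pair. Slide 1: A = Cy3, B = Cy5;
    slide 2: A = Cy5, B = Cy3. Returns (log2 of B over A, log2 dye bias).
    """
    m1 = math.log2(slide1[1] / slide1[0])   # log2(B/A) + bias
    m2 = math.log2(slide2[1] / slide2[0])   # log2(A/B) + bias
    return (m1 - m2) / 2, (m1 + m2) / 2


if __name__ == "__main__":
    # Hypothetical gene: B is 4x A, and Cy5 reads 2x too bright.
    slide1 = (100, 800)   # Cy3 = A, Cy5 = B * 2
    slide2 = (400, 200)   # Cy3 = B, Cy5 = A * 2
    print(dye_swap(slide1, slide2))
```

Half the difference of the two log ratios recovers the expression change, and half their sum exposes the dye bias.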

Universal vs. Universal (control v. control)

Problem area at low intensity readings

Lung vs. Control

Cholesterol Biosynthesis

Cell Cycle

Immediate Early Response

Signaling and Angiogenesis

Wound Healing and Tissue Remodeling

Clustered display of data from time course of serum stimulation of primary human fibroblasts.

Eisen et al. Proc. Natl. Acad. Sci. USA 95 (1998) pg 14865

Hierarchical Clustering: 14 Tissues, 7653 Genes
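Hierarchical clustering groups genes whose expression profiles rise and fall together. The sketch below is one simple variant, assumed here for illustration rather than taken from the cited study: distance is 1 minus the Pearson correlation, and clusters are merged by average linkage until a requested number remains. Profiles and gene names are made up, and a profile with no variation would divide by zero.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cluster(profiles, n_clusters):
    """Average-linkage clustering; profiles maps gene name to values."""
    clusters = [[name] for name in profiles]

    def distance(c1, c2):
        # Mean of (1 - r) over all gene pairs drawn from the two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(1 - pearson(profiles[a], profiles[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        candidates = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(candidates, key=lambda ij: distance(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]   # merge the closest pair
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    toy = {
        "geneA": [0.0, 1.0, 2.0, 3.0],
        "geneB": [0.1, 1.2, 1.9, 3.2],
        "geneC": [3.0, 2.0, 1.0, 0.0],
        "geneD": [2.9, 2.2, 0.8, 0.1],
    }
    print(cluster(toy, 2))
```

For thousands of genes this all-pairs search is far too slow; it is meant only to show the merging logic behind the clustered displays.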

Differences in Technology

• One sample, one chip

• Single color scans

• Labeling by incorporating biotin into cRNA, not Cy3 or Cy5 dyes

• Oligonucleotides instead of full-length cDNAs

• Higher density arrays

– Feature sizes down to 18 µm instead of ~100 µm

– Non-contact creation of arrays

Affymetrix

GeneChips

Affy Technology Overview

• Photolithography and combinatorial chemistry

– Technology from microchip industry: “GeneChip”

– Coat slides

– “Mask” to apply light to only desired features, de-protects feature

Technology Overview (cont.)

• Apply required nucleotide base to array

• Apply new mask to de-protect different features

• Stack nucleotides on top of one another

• Repeat with bases and masks until 25-mer oligonucleotides are built directly onto array
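The mask-and-base cycle above can be simulated: in each cycle one base is offered, and the mask exposes only those features whose next required base matches it. This is a toy sketch, not the manufacturer's process; the fixed A, C, G, T cycling order, the short probes, and the name `synthesize` are assumptions, and probe strings are written in order of synthesis.

```python
def synthesize(probes, base_order="ACGT"):
    """Build all probes in parallel, one base and one mask per cycle.

    Returns the grown probes and a list of (base, mask) steps, where each
    mask lists the feature indices that were de-protected in that cycle.
    Assumes probes contain only the letters in base_order.
    """
    grown = ["" for _ in probes]
    steps = []
    while any(len(g) < len(p) for g, p in zip(grown, probes)):
        for base in base_order:
            # Expose only features whose next required base is this one.
            mask = [
                i
                for i, (g, p) in enumerate(zip(grown, probes))
                if len(g) < len(p) and p[len(g)] == base
            ]
            if mask:
                for i in mask:
                    grown[i] += base        # couple the base to exposed features
                steps.append((base, mask))
    return grown, steps


if __name__ == "__main__":
    probes = ["CATAT", "TTCCG"]             # short toy probes
    grown, steps = synthesize(probes)
    print(grown)
    for base, mask in steps:
        print(base, mask)
```

Because every pass through the four bases extends each unfinished probe by at least one position, probes of length n need at most 4n cycles, so 25-mers need at most 100 under this scheme.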

[Figure: light-directed synthesis. A mask over the substrate lets light de-protect selected features, a base is coupled (T, then C), and the cycle repeats until probes such as CATATAGCTGTTCCG are complete.]

Technology Final Steps

• Silicon “wafers” of 90 arrays are cut

• Glass substrate is then added to plastic cartridge for:

– Safe handling

– Easy storage

– Easy hybridization

– Easy scanning

Easy, convenient

Expensive (very much so)

No confirmation of quality

Erroneous data when low intensity

Problems with SNPs*

*not with 70-mer oligo glass slides

Questions?

Give me a call or send a message

217-265-5475

bohnerth@life.uiuc.edu

http://www.life.uiuc.edu/bohnert/

Remember:

YOU CAN ALWAYS FIND EVERYTHING ON GOOGLE!

(though not these slides)
