Page 1

Sequence Analysis & Gene Expression

MUPGRET workshop, Columbia, MO, June 2005 (HJ Bohnert, UIUC)

bohnerth@life.uiuc.edu

Organism selection: genome size – why – what is the benefit – politics

Decisions: mapping first, “shotgun sequencing”, BAC alignment/sequencing [BAC – bacterial artificial chromosome; also YAC (yeast)]

Genome sequence: raw sequence – confirmed sequence – gene models – verification

Verification: is the gene model transcribed? Yes/no/perhaps – “ubiquitous” gene, family specific, homolog – ortholog – paralog

Transcript profiles: when – how much [abundant] – where – transcript “variants” – inducible by condition X?

Page 2

Genomics

information mining, hypotheses, experiment - insight, application, virtual life

[Diagram: “Databases, Integration & Intuition” linked to the following data types]

• genome & transcriptome sequences … not just genes
• markers & QTLs
• expression profiles
• knock-outs, RNA & RNAi
• protein localization
• structure analysis
• dynamic metabolite catalogs
• biochemical genetics
• protein interaction maps

How (much) will ‘encyclopedic’ approaches lead to better understanding?

Page 3

Arabidopsis – model plant

small, fast, prolific, mutants, lines, ecotypes, genome sequence

Field on a dish!

Columbia grown in Soy-FACE

[Figure panels: control, O3, CO2]

Page 4

[Figure: map of aquaporin (AQP) genes on the five Arabidopsis thaliana chromosomes; scale 5, 10, 20, 30 Mb; rDNA marked; duplicated regions that include AQPs marked, with the numbers (15), (4), (14), (3), (12).]

Ch-1: PIP1;3, TIP3;2, NIP3;1, TIP2;x (pseudo), TIP3;1, NIP6;1

Ch-2: PIP1;2, TIP4;1, NIP2;1 (pseudo), NIP3;1 (pseudo), NIP2;1

Ch-3: TIP1;2, TIP2;1, NIP7;1, SIP1;1, TIP5;1

Ch-4: PIP1;4, TIP1;3, NIP5;1, TIP2;2, NIP1;1, NIP1;2, PIP1;5

Ch-5: SIP1;2, NIP4;1, NIP4;2, TIP2;3, PIP2;4

Further gene labels in the figure: PIP1;1, PIP2;1, PIP2;2, PIP2;3, PIP2;5, PIP2;6, PIP2;7, PIP2;8, SIP2;1, TIP1;1

AQPs are distributed over all chromosomes - a few clusters, many duplications

Arabidopsis thaliana (AGI, 2000)

Page 5

Page 6

The Plant Genome

Plants in silico? Sure! And then: Plant Design from Scratch

Ecosystem – population – species – ecotype (- breeding line)

Organism – organ – tissue – cell – compartment

Nucleus – envelope & pore – nucleoplasm, nucleolus & chromosomes

Euchromatin & heterochromatin – gene islands – gene

Promoters – 5’-regulatory (untranslated = UTR) – introns & exons – mature coding region – 3’-regulatory (UTR) regions

Page 7

The Plant Genome

Controls for Gene Expression – many Switchboards

• Chromatin condensation state
• Local chromatin environment
• Transcription initiation
• Transcript elongation
• mRNA splicing
• mRNA export
• mRNA place in the cell
• RNA half-life
• Killer microRNAs
• Ribosome loading
• Protein transport/targeting
• Protein modifications
• Protein turnover

Levels of regulation that affect what we call “gene expression”

Page 8

The Plant Transcriptome

microRNAs

Killer RNAs (there are micro-genes)

Result: no protein, i.e., the gene is essentially “silenced”

5 years ago, we did not know that such a control system existed!

Page 9

The Plant Transcriptome

How to sample the transcriptome?

• Morphological dissection (root, leaf, flower - epidermis, guard cell, etc.)

• Cell sorting: make single cells, send through cell sorter (size, color, reporter gene)

• Laser ablation: micromanipulation of a laser to cut individual cells

• Biochemical dissection (compartment isolation): chloroplasts, mitochondria, ribosomes, other membranes

Painting cells with a reporter gene, here GFP (green fluorescent protein)

Page 10

The Plant Transcriptome

Painting tissues, then isolating desired cells

Enzymatic staining

The endodermis of the root tip is highlighted in transgenic plants using pSCR::mGFP5.

Emerging lateral roots [requires plant transformation]

Page 11

The Plant Transcriptome

> cDNA libraries

• “neat”

• normalized

• subtracted

> SAGE libraries

cDNA – complementary DNA; cDNA synthesis converts messenger RNA into double-stranded DNA

“Normalization” removes mRNAs for which there are many copies in a cell – thus enriching for “rare mRNAs” (not so much sequencing to do)

Subtraction removes cDNAs which you already know (less sequencing)

Page 12

The Plant Transcriptome

cDNA Libraries

Total RNA → Poly(A)+ RNA → 1st strand cDNA → ds-cDNA → size-selected double-stranded cDNA (>500 bp) → ligate to EcoRI adapters / digest with NotI → clone (EcoRI/NotI-digested pBSII/SK+ & adaptored cDNA) → primary cDNA library

Primary (neat) library may be used for “normalization”

Library Normalization

• Make ss-DNA out of the primary cDNA library: DNA “tracer”
• PCR inserts by T7 and T3 standard primers: DNA “driver”
• Tracer/driver hybridization
• Column chromatography (double strands stick)
• Non-hybridized DNA from flow-through = normalized clones

Cloning of root RNAs from segments S1 – S4 (root tip) (Sharp lab): sequenced ~18,000 clones, found ~8,000 unique and ~130 novel genes

How many genes make a root?

Page 13

Serial Analysis of Gene Expression (SAGE)

http://www.sagenet.org/

Velculescu et al. 1995

Page 14

[Gel image: lanes 1 to 10 plus marker (M)]

Coding region (known or expected), with forward primer and reverse primer

Amplicon (sequence, or clone + sequence)

Results

Page 15

Real-time PCR (quantitative)

RNA (DNA-free) to cDNA

Use product in dilutions for amplification

Serial dilution: 1x - 1/5x - 1/25x - 1/125x

[Plot axis: cycle number]

Assumption: each cycle increases amount by factor 2 (or 1.8)

Check by using known amount of cloned control cDNA
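The per-cycle amplification assumption can be turned into numbers. The following is a minimal sketch in Python and is not part of the original slides; the function names and the 3-cycle example are illustrative assumptions. It estimates how many extra cycles each 5-fold dilution step should cost at a per-cycle factor of 2 or 1.8, and converts a cycle difference back into a fold difference in starting template.

```python
import math

def cycles_per_dilution(dilution_factor, efficiency=2.0):
    """Extra cycles needed to reach the same signal after diluting the template.

    efficiency is the per-cycle amplification factor (2 = perfect doubling).
    """
    return math.log(dilution_factor) / math.log(efficiency)

def fold_difference(delta_cycles, efficiency=2.0):
    """Fold difference in starting template implied by a cycle difference."""
    return efficiency ** delta_cycles

# The slide's two per-cycle factors, applied to its 5-fold dilution series
for eff in (2.0, 1.8):
    shift = cycles_per_dilution(5, eff)
    print(f"factor {eff}: each 5-fold dilution adds about {shift:.2f} cycles")

# Hypothetical example: two samples whose curves are 3 cycles apart
print(f"3 cycles apart at factor 2: {fold_difference(3):.0f}-fold difference")
```

If the measured spacing between dilution curves is clearly larger than the value expected for doubling, the per-cycle factor is probably below 2, which is what the cloned control cDNA is meant to reveal.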

Page 16

Melting curves

[single products]

Two amplicons are shown

Each shows a single melting curve

Single genes have been amplified here

Page 17

Melting curves

[multiple products]

More than one gene has been amplified here

Homologous genes

[identity – similarity – divergence]

orthologous – paralogous

relationships

Page 18

The Plant Transcriptome

Quantitative PCR in 384-well plates (96 primer pairs, 3 repeats each)

Taking SAGE & cDNA sequences together: corn roots “express” 20-23,000 genes (i.e., mRNA is made)

The entire corn genome is expected to include ~50,000 genes

Page 19

Substrates for High Throughput Arrays

• Nylon membrane: single label, 33P
• Glass slides: dual label, Cy3 and Cy5
• GeneChip: single label, biotin-streptavidin

Page 20

TeleChem ChipMaker2 Pins

Pin pick-up volume: 100-250 nl

Spot diameter: 75-200 µm

Spot volume: 0.2-1.0 nl
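The pin figures above imply a rough number of spots per pin load. A minimal sketch in Python, not from the original slides: the function name is hypothetical, and the result is only an upper bound because it assumes no evaporation, carry-over, or dead volume in the pin.

```python
def max_spots_per_load(pickup_pl, spot_pl):
    """Upper bound on spots printed from one pin load (volumes in picoliters)."""
    return pickup_pl // spot_pl

# Slide values converted to picoliters: 100-250 nl pick-up, 0.2-1.0 nl per spot
fewest = max_spots_per_load(100_000, 1_000)   # smallest load, largest spots
most = max_spots_per_load(250_000, 200)       # largest load, smallest spots
print(f"one pin load covers roughly {fewest} to {most} spots")
```

Working in integer picoliters avoids floating-point rounding in the division.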

Page 21

Creating cDNA Arrays

cDNA cloned into vector and transformed to create cDNA library

Clones sequenced; unique set chosen and reracked

[Workflow figure labels: Q-Pix; unique set of clones; 384-well microtiter plate; PCR on Tecan workstation; slides printed on Cartesian Arrayer; final product]

Page 22

Printing Arrays on 50 slides

NSF Soybean Functional Genomics, Steve Clough / Vodkin Lab

Page 23

Slide Chemistry

Glass coatings:

• Silylated (aldehyde)
• Silanated (amine)
• Poly-L-lysine

[Figure: chemical structures of the coatings on the glass (Si-O) surface, showing aldehyde groups (CHO) and charged amine groups (NH3+)]

We use SuperAmine and SuperAldehyde from TeleChem (arrayit.com)

NSF Soybean Functional Genomics, Steve Clough / Vodkin Lab

Page 24

GSI Lumonics

NSF Soybean Functional Genomics, Steve Clough / Vodkin Lab

Page 25

Placenta vs. Brain – 3800 Cattle Placenta Array cy3 cy5

GenePix Image Analysis Software

Page 26

Troubleshooting

The Good

The Bad

The Ugly

NSF Soybean Functional Genomics, Steve Clough / Vodkin Lab

Page 27

Post-Print Processing

Printed slide → rehydrate spots (hot water) → snap dry → fix DNA to coating (UV light) → chemically block background; denature to single strands → hybridize & scan

Page 28

Cells from condition A and cells from condition B

mRNA → cDNA; one sample labeled with dye 1, the other with dye 2

Mix

Ratio of expression of genes from two sources: equal, over, under
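The equal / over / under call for each spot comes from the ratio of the two dye signals. A minimal sketch in Python, not from the original slides: the function names, the 2-fold cutoff, and the intensity values are all illustrative assumptions.

```python
import math

def log2_ratio(dye1, dye2):
    """log2 of the dye 2 / dye 1 signal ratio for one spot."""
    return math.log2(dye2 / dye1)

def classify(dye1, dye2, fold=2.0):
    """Call a spot 'over', 'under', or 'equal' using an assumed fold cutoff."""
    m = log2_ratio(dye1, dye2)
    cutoff = math.log2(fold)
    if m >= cutoff:
        return "over"
    if m <= -cutoff:
        return "under"
    return "equal"

# Hypothetical background-corrected intensities (dye 1, dye 2) for three spots
spots = {
    "gene_a": (1000.0, 1050.0),
    "gene_b": (400.0, 1600.0),
    "gene_c": (2400.0, 300.0),
}
for name, (d1, d2) in spots.items():
    print(name, round(log2_ratio(d1, d2), 2), classify(d1, d2))
```

Working on the log2 scale makes over- and under-expression symmetric around zero, so a 4-fold increase (+2) and a 4-fold decrease (-2) are equally far from "equal".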

Page 29

ScanArray 3000 Fluorescent Scanner

Page 30

Overlay Images

Slide 1: Cy3 over-expressed

Slide 2: Cy5 over-expressed

Reverse Labeling
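Reverse labeling lets a dye-specific bias be separated from the biological difference. A minimal sketch in Python, not from the original slides; it assumes slide 1 labels sample A with Cy3 and sample B with Cy5, slide 2 swaps the dyes, and the bias is a constant factor on the log scale. The function name and the numbers are hypothetical.

```python
import math

def dye_swap(cy3_s1, cy5_s1, cy3_s2, cy5_s2):
    """Combine one spot measured on two slides with reversed labeling.

    Returns (log2 B/A with the dye bias cancelled, estimated log2 dye bias).
    """
    m1 = math.log2(cy5_s1 / cy3_s1)  # slide 1: log2(B/A) plus dye bias
    m2 = math.log2(cy5_s2 / cy3_s2)  # slide 2: log2(A/B) plus dye bias
    return (m1 - m2) / 2, (m1 + m2) / 2

# Hypothetical spot: B is 4-fold higher than A, and Cy5 reads 2-fold too bright
log_ratio, bias = dye_swap(cy3_s1=100.0, cy5_s1=800.0, cy3_s2=400.0, cy5_s2=200.0)
print(f"log2(B/A) = {log_ratio}, log2 dye bias = {bias}")
```

A gene that looks Cy5-high on both slides is therefore more likely a labeling artifact than a real difference, since a real difference should switch color when the dyes are swapped.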

Page 31

Universal vs. Universal (control v. control)

Problem area at low intensity readings

Page 32

Lung vs. Control

Page 33

Cholesterol Biosynthesis

Cell Cycle

Immediate Early Response

Signaling and Angiogenesis

Wound Healing and Tissue Remodeling

Clustered display of data from time course of serum stimulation of primary human fibroblasts.

Eisen et al. Proc. Natl. Acad. Sci. USA 95 (1998) pg 14865

Page 34

Hierarchical Clustering: 14 Tissues, 7653 Genes

Page 35

Differences in Technology: Affymetrix GeneChips

• One sample, one chip
• Single color scans
• Labeling by incorporating biotin into cRNA, not Cy3 or Cy5 dyes
• Oligonucleotides instead of full-length cDNAs
• Higher density arrays
– Feature sizes down to 18 µm instead of ~100 µm
– Non-contact creation of arrays

Page 36

Affy Technology Overview

• Photolithography and combinatorial chemistry
– Technology from microchip industry: “GeneChip”
– Coat slides
– “Mask” to apply light to only desired features, de-protects feature

Page 37

Technology Overview (cont.)

• Apply required nucleotide base to array
• Apply new mask to de-protect different features
• Stack nucleotides on top of one another
• Repeat with bases and masks until 25-mer oligonucleotides are built directly onto array

[Figure: light passes through a mask onto the protected substrate (deprotection); a T is coupled to the deprotected features, then a new mask is applied and a C is coupled; the cycle repeats until the probes are complete. Example sequence shown: CATATAGCTGTTCCG]
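The mask-and-base cycle can be simulated in a few lines. A minimal sketch in Python, not from the original slides and not Affymetrix's actual mask design: the function name, the fixed A-C-G-T base order, and the three short probes are illustrative assumptions. In each step one base is offered to the whole array, and the "mask" is the set of features that should receive it.

```python
def synthesis_steps(probes, base_order="ACGT"):
    """Simulate mask-directed synthesis; returns a list of (base, mask) steps.

    mask is the list of feature indices de-protected for that base.
    """
    if any(ch not in base_order for p in probes for ch in p):
        raise ValueError("probes may only contain bases from base_order")
    grown = [0] * len(probes)  # number of bases already built on each feature
    steps = []
    while any(g < len(p) for g, p in zip(grown, probes)):
        for base in base_order:
            # De-protect only the features whose next required base is `base`
            mask = [i for i, p in enumerate(probes)
                    if grown[i] < len(p) and p[grown[i]] == base]
            if mask:
                steps.append((base, mask))
                for i in mask:
                    grown[i] += 1
    return steps

# Three hypothetical 3-mer probes, one per array feature
probes = ["TTC", "TCG", "CAT"]
steps = synthesis_steps(probes)
for base, mask in steps:
    print(f"offer {base} to features {mask}")
```

Because every pass through the four bases extends each unfinished feature at least once, the number of steps never exceeds four times the probe length, so 25-mers need at most 100 mask-and-couple steps regardless of how many features the array holds.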

Page 38

Technology Final Steps

• Silicon “wafers” of 90 arrays are cut
• Glass substrate is then added to plastic cartridge for:
– Safe handling
– Easy storage
– Easy hybridization
– Easy scanning

Easy, convenient
Expensive (very much so)
No confirmation of quality
Erroneous data when low intensity
Problems with SNPs*

*not with 70-mer oligo glass slides

Page 39

Questions?

Give me a call or send a message

217-265-5475

bohnerth@life.uiuc.edu

http://www.life.uiuc.edu/bohnert/

Remember:

YOU CAN ALWAYS FIND EVERYTHING ON GOOGLE!

(though not these slides)