Genome Biology and Biotechnology
7. The phenome
Prof. M. Zabeau
Department of Plant Systems Biology
Flanders Interuniversity Institute for Biotechnology (VIB)
University of Gent
International course 2005
Functional Maps or “-omes”
[Figure: functional maps for genes or proteins 1 to n across “conditions”: ORFeome (genes), Phenome (mutational phenotypes), Transcriptome (expression profiles), DNA Interactome (protein-DNA interactions), Localizome (cellular, tissue location), Interactome (protein interactions), Proteome (proteins)]
After: Vidal M., Cell, 104, 333 (2001)
The phenome: genome-wide phenotypic analysis
¤ Classical (forward) genetic screens
  – Saturated mutagenesis to identify all the genes that exhibit a specific phenotype
  – Drawback
    • characterization of the gene through positional cloning is slow and laborious
¤ Phenomics platforms: reverse genetics
  – Systematic alteration of gene function to identify the functions of
High Throughput Insertion Mutagenesis
¤ Yeast genomic DNA library
  – mutagenized with mTn
  – plasmids were digested with Not I
  – transformed into a diploid yeast strain
  – integrated by homologous recombination
  – transformants were assayed for β-gal activity
Reprinted from: Ross-Macdonald et al., Nature 402: 413 (1999)
Analysis of the mTn Insertion Strains
¤ Identified 11,232 strains expressing lacZ
¤ Sequenced the site of insertion in 6,358 strains
  – 5,442 in or within 200 bp of an annotated ORF
    • Insertions affect 1,917 different ORFs (~30%)
¤ Identified 328 previously non-annotated ORFs
  – 52% overlap an ORF in the antisense direction
  – 33% are in intergenic regions (small ORFs)
  – 15% overlap an ORF in the same orientation in a different frame
  – Genes are missed in the annotation because of
    • an arbitrary lower size limit of 100 amino acids
    • not annotating partially overlapping ORFs
Reprinted from: Ross-Macdonald et al., Nature 402: 413 (1999)
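The effect of a lower size limit on ORF annotation can be illustrated with a minimal ORF scan. This is only a sketch, not the annotation procedure used for the yeast genome: `find_orfs`, `min_codons`, and the toy sequence are hypothetical, and only the forward strand is scanned, so antisense ORFs would need a second pass on the reverse complement.

```python
# Minimal ORF scan on the forward strand (illustrative sketch only).
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=100):
    """Return (start, end, frame) for ATG...stop ORFs with at least
    min_codons codons (start codon counted, stop codon not counted)."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                   # close it and keep scanning
    return orfs

# Hypothetical 31-codon ORF: invisible with a 100-codon cutoff.
toy = "ATG" + "GCT" * 30 + "TAA"
print(find_orfs(toy, min_codons=100))
print(find_orfs(toy, min_codons=20))
```

With the default cutoff the short ORF is not reported, which is the kind of omission the insertion data exposed; lowering the cutoff recovers it at the cost of more false positives in a real genome.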
Analysis of Mutant Phenotypes
¤ Phenotypes of essential genes
  – 14.1% of the insertions are non-viable in haploid strains
    • Represent genes that are essential for viability
¤ Large scale scoring of “other” phenotypes
  – growth under 20 different growth conditions
    • 'phenotypic macroarrays' (96-well format)
  – Insertions in 407 genes (20%) result in a phenotype different from the wild type
¤ The majority (80%) of the insertions exhibit no phenotype!
  – Expand the range of phenotypic assays
  – Utilize more precise criteria for phenotypic analysis
    • Growth rate
Reprinted from: Ross-Macdonald et al., Nature 402: 413 (1999)
Phenotypic Macroarray Analysis of Yeast Mutants
[Figure: phenotypic macroarrays; panels show mutants deficient in oxidative phosphorylation and mutants deficient in cell-wall maintenance]
Reprinted from: Ross-Macdonald et al., Nature 402: 413 (1999)
Genomic Scale Analysis of Phenotypes
¤ Phenotypes observed
  – Expected phenotypes
    • genes involved in microtubule functions are sensitive to benomyl
¤ RNA-mediated gene regulation is ancient in origin
  – Evolved before the divergence of plants and animals
  – Two pathways are interconnected and share molecular components
    • Highly conserved nuclease Dicer
    • Small dsRNAs about 21 to 23 nucleotides in length
  – RNA Interference (RNAi) is thought to be
    • a primitive genetic surveillance mechanism that protects cells from viruses
¤ RNAi is well suited for large scale gene knockout
  – Pioneered in C. elegans
  – Now used in all model organisms
RNA Interference (RNAi) in C. elegans
¤ Injection of antisense or double-stranded RNA into cells
  – can be used to interfere with the function of endogenous genes
  – results in silencing of the corresponding gene
¤ The RNA interference process involves
  – a catalytic or amplification component
    • Only a few molecules of injected dsRNA are required
  – injection of dsRNA into the extracellular body cavity of C. elegans results in silencing in the whole animal
¤ Experimentally, gene silencing is achieved in nematodes by
  – feeding worms E. coli expressing dsRNAs
RNA Interference (RNAi) in C. elegans
¤ dsRNA is expressed in E. coli by
  – bi-directional transcription by phage T7 RNA polymerase
Reprinted from: Timmons et al., Nature 395: 854 (1998)
[Figure: open reading frame flanked by two opposing T7 promoters; worms feeding on wt E. coli compared with worms feeding on E. coli expressing ds GFP RNA]
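Why two opposing promoters give double-stranded RNA can be sketched in a few lines: one promoter reads the insert as written, the other reads the opposite strand, and the two transcripts are reverse complements that can anneal. The function names and the 6-nt insert below are hypothetical and purely illustrative.

```python
# Sketch of bi-directional transcription of a cloned insert (illustrative only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def transcribe_both_strands(insert_dna):
    """Return (sense_rna, antisense_rna), each written 5'->3'."""
    dna = insert_dna.upper()
    sense = dna.replace("T", "U")
    # The opposing promoter reads the other strand: reverse complement.
    antisense = dna.translate(COMPLEMENT)[::-1].replace("T", "U")
    return sense, antisense

def is_fully_paired(sense, antisense):
    """True if the two strands can anneal base by base into dsRNA."""
    return len(sense) == len(antisense) and all(
        (s, a) in PAIRS for s, a in zip(sense, reversed(antisense))
    )

sense, antisense = transcribe_both_strands("ATGGCC")  # hypothetical insert
print(sense, antisense, is_fully_paired(sense, antisense))
```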
Functional Genomic Analysis of C. elegans Chromosome I by Systematic RNAi
¤ Paper reviews/presents
  – RNAi approach to systematically investigate
    • loss-of-function phenotypes of predicted genes of C. elegans chromosome I
  – by feeding worms with E. coli bacteria that express double-stranded RNA
  – Demonstrates that high-throughput genome-wide RNAi screens can be performed using a library of dsRNA-expressing bacteria
    • The specificity of RNAi makes it an ideal tool for investigating gene function
Fraser et al., Nature 408: 325 (2000)
Functional Analysis of Chromosome I Genes
¤ Constructed a library of E. coli expressing dsRNA for
  – the predicted genes on chromosome I
    • 2,416 predicted genes (87.3% of the predicted genes)
¤ Screened the library for detectable phenotypes
  – L3–L4 stage worms were fed for 72 h at 15 °C on bacterial cultures for each targeted gene
  – Phenotypes of adults and progeny were scored
    • Sterile (Ste) – brood size of ≤ 10 (wild-type worms typically give > 50)
    • Progeny sterile (Stp) – brood size of ≤ 10 in the progeny of fed worms
Reprinted from: Fraser et al., Nature 408: 325 (2000)
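The two sterility classes above amount to a simple threshold rule on brood size. A minimal sketch, assuming only the threshold quoted on the slide; `score_sterility` is a hypothetical helper, and the other phenotype classes scored in the screen are not modelled.

```python
# Threshold rule for the two sterility classes (illustrative sketch only).
STERILE_MAX_BROOD = 10  # brood size at or below this counts as sterile

def score_sterility(parent_brood, progeny_brood=None):
    """Return 'Ste', 'Stp', or None for a fed worm.

    parent_brood:  brood size of the worm that was fed dsRNA bacteria
    progeny_brood: brood size of its progeny, if it was scored
    """
    if parent_brood <= STERILE_MAX_BROOD:
        return "Ste"
    if progeny_brood is not None and progeny_brood <= STERILE_MAX_BROOD:
        return "Stp"
    return None

print(score_sterility(4))        # fed worm itself is sterile
print(score_sterility(60, 3))    # fed worm fertile, progeny sterile
print(score_sterility(60, 55))   # no sterility phenotype
```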
Functional Analysis of Chromosome I Genes
¤ Assigned a phenotype to 13.9% of the genes
  – Confirmed 90% of the known embryonic lethal genes
  – Number of genes with known phenotypes increased from 70 to 378
  – Not all genes give an RNAi phenotype
    • Did not find phenotypes for some previously characterized genes
      – genes involved in neuronal function
¤ Highly conserved genes are more likely to have an RNAi phenotype than genes that show no conservation
  – >72% of genes with an RNAi phenotype have a Drosophila match
Reprinted from: Fraser et al., Nature 408: 325 (2000)
Functional Analysis of Chromosome I Genes
¤ Embryonic lethal (Emb) mutants: essential genes
  – genes involved in the basal cellular machinery:
    • RNA-binding proteins, chromosome condensation and separation, components of signal transduction pathways
  – genes involved in basic metabolic processes
  – largest class: >60% of the mutants
¤ Uncoordinated and post-embryonic mutants
  – High proportion (30% to 40%) of genes of unknown function
    • genes that regulate development are still largely unknown
Reprinted from: Fraser et al., Nature 408: 325 (2000)
Biochemical Function and RNAi Phenotype
Reprinted from: Fraser et al., Nature 408: 325 (2000)
Toward Improving Caenorhabditis elegans Phenome Mapping With an ORFeome-Based RNAi Library
¤ Paper presents
  – the use of the C. elegans ORFeome as a starting point for high throughput RNAi with enhanced flexibility
    • increasing the possibilities for phenome mapping in C. elegans
  – additional HT-RNAi libraries can be generated to perform gene knockdowns under various conditions
Rual et al., Genome Research 14:2162-2168 (2004)
Generating RNAi resources from flexible Gateway ORFeome and promoterome collections
Reprinted from: Rual et al., Genome Research 14:2162-2168 (2004)
Screening the ORFeome-RNAi v1.1 Library
¤ The C. elegans ORFeome v1.1 library
  – contains 11,942 ORFs cloned as Gateway Entry clones
  – ORFs were transferred into the RNAi Destination vector (T7 promoter vector)
¤ Genome-Wide Phenotypic Analysis
  – RNAi-by-feeding at the first larval stage
  – observed phenotypes for 1,066 (10%) of the ORFs tested
Reprinted from: Rual et al., Genome Research 14:2162-2168 (2004)
Genome-Wide RNAi Analysis of Growth and Viability in Drosophila Cells
¤ Paper presents
  – a high-throughput RNA-interference (RNAi) screen of nearly all (91%) predicted Drosophila genes
  – Using Drosophila cultured cells to characterize genes involved in cell growth and viability
    • Treatment of cells with dsRNA leads to specific, detectable phenotypes
    • Systematic screen for loss-of-function phenotypes
    • Genome-wide RNAi performed on two embryonic cell lines
  – Established a quantitative assay of cell death: z-score
Boutros et al., Science 303, 832-835 (2004)
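A z-score expresses each measurement as a number of standard deviations from the mean of the screen. The sketch below uses the textbook definition, z = (x − mean) / sd; the normalization actually used in the paper is not given on the slide, so the helper names, the sign convention (low viability signal gives a negative z), the threshold, and the data are all assumptions.

```python
# Z-score hit calling for a viability screen (illustrative sketch only).
from statistics import mean, stdev

def z_scores(values):
    """Standardize values: (x - mean) / sample standard deviation."""
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        raise ValueError("all values identical; z-score undefined")
    return [(v - mu) / sd for v in values]

def call_hits(signal_by_gene, threshold=3.0):
    """Genes whose viability signal lies at least `threshold` sd below the mean."""
    genes = list(signal_by_gene)
    z = z_scores([signal_by_gene[g] for g in genes])
    return sorted(g for g, score in zip(genes, z) if score <= -threshold)

# Hypothetical plate: nine unaffected wells and one strong viability defect.
plate = {f"gene{i}": 100 for i in range(1, 10)}
plate["geneX"] = 10
print(call_hits(plate, threshold=2.0))
```

With only ten wells a single outlier cannot reach |z| = 3, which is why the toy call uses a lower threshold; a genome-wide screen has enough measurements for a stricter cutoff.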
Genome-wide RNAi screen for viability defects
Genome-wide RNAi screening in Arabidopsis
¤ The Arabidopsis GST Entry clone resource was used to
  – Generate a library of hairpin RNA (hpRNA) expression plasmids
    • Large scale transformation of Arabidopsis
Reprinted from: Hilson et al., Genome Research 14:2176-2189 (2004)
[Figure: GST hairpin RNA expression constructs]
Phenotypes of plants carrying a GST hpRNA transgene targeting a subunit of cellulose synthase
Reprinted from: Hilson et al., Genome Research 14:2176-2189 (2004)
Phenotypes of plants carrying a GST hpRNA transgene targeting a H+-ATPase subunit
Reprinted from: Hilson et al., Genome Research 14:2176-2189 (2004)
Conclusions
¤ The function of 10 to 20% of the genes is identified by insertional mutagenesis and RNAi
  – Expect that the detection of phenotypes for other genes will require alternative approaches
    • different growth conditions, for example, environmental stress
    • other genetic backgrounds
¤ Reverse and forward genetics are complementary
  – Reverse genetics
    • Has the advantage of being high throughput and non-redundant
    • Mutant phenotype is automatically connected to a known sequence
  – Classical forward genetics
    • Has the disadvantage that positional cloning is slow and laborious
    • Some genes are resistant to RNAi, while all genes are sensitive to mutagens
    • Can also yield gain-of-function mutations
Genome Biology and Biotechnology
8. The transcriptome
International course 2005
Functional Maps or “-omes”
[Figure: functional maps for genes or proteins 1 to n across “conditions”: ORFeome (genes), Phenome (mutational phenotypes), Transcriptome (expression profiles), DNA Interactome (protein-DNA interactions), Localizome (cellular, tissue location), Interactome (protein interactions), Proteome (proteins)]
After: Vidal M., Cell, 104, 333 (2001)
Summary
¤ Transcriptome mapping
  – Identification of transcribed regions in the genome
    • Experimental confirmation of predicted gene models
    • Discovery of non-coding RNA genes
  – The “evolving” transcriptome map shows that
    • The genome contains many more “genes” than simply genes coding for proteins
¤ Transcriptome profiling
  – Functional characterization of genes based on expression patterns
    • Cluster analysis of expression patterns
    • Identification of co-regulated gene clusters
    • Classification of tumors
¤ Large scale EST sequencing
  – Primarily used to identify protein coding genes
  – Noisy data sets that have been difficult to interpret
¤ Large scale full-length cDNA sequencing
  – Technically very difficult and laborious
  – Limited to a few model organisms: mouse and human
¤ Microarray technologies
  – Have become increasingly powerful as the density of the microarrays has increased tremendously
  – Provide the most detailed view of the transcribed regions in the genome
EST Sequencing
¤ 3’ or 5’ EST sequences of individual cDNA clones
  – cDNAs are often truncated at the 5’ end (not full length)
  – Typically done on 5,000 to 10,000 clones per library
    • Identifies the 1,000 to 2,000 most abundantly expressed genes
¤ Identifying ~70% of the protein coding genes requires
  – Sequencing several tens or even hundreds of libraries
  – Typically EST databases contain >200,000 to 500,000 ESTs
¤ EST sequence assemblies yield unigene collections
  – Clusters of overlapping sequence reads from the same gene
[Figure: cloned cDNA between vector arms, with the 5’ EST and 3’ EST reads at either end and the poly A tail at the 3’ end]
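The idea behind a unigene cluster, grouping reads that overlap, can be sketched with a shared k-mer test and a union-find structure. This is a deliberately crude stand-in for real EST assembly: it ignores sequencing errors and read orientation, and `cluster_reads`, `k`, and the toy sequences are hypothetical.

```python
# Single-linkage clustering of reads that share an exact k-mer (sketch only).
def cluster_reads(reads, k=12):
    """Return clusters as sorted lists of read indices."""
    parent = list(range(len(reads)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # k-mer -> index of the first read containing it
    for idx, read in enumerate(reads):
        for j in range(len(read) - k + 1):
            kmer = read[j:j + k]
            if kmer in first_seen:
                a, b = find(idx), find(first_seen[kmer])
                if a != b:
                    parent[max(a, b)] = min(a, b)  # merge the two clusters
            else:
                first_seen[kmer] = idx

    clusters = {}
    for idx in range(len(reads)):
        clusters.setdefault(find(idx), []).append(idx)
    return sorted(clusters.values())

# Hypothetical transcripts; two overlapping reads from gene_a, one from gene_b.
gene_a = "ACGTTGCAAGGCTTAACCGGATCGATTACG"
gene_b = "TTTTGGGGCCCCAAAATTTTGGGG"
reads = [gene_a[:20], gene_a[8:], gene_b]
print(cluster_reads(reads))
```

The two reads from `gene_a` overlap by 12 nucleotides and end up in one cluster, while the `gene_b` read stays on its own.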
Full length cDNA Sequencing
¤ Technically very challenging
  – Special techniques for selecting full length cDNA clones
    • 5’ end (capped end) selection
    • Aggressive subtraction/normalization required to cover “all” genes
¤ Mouse and human “FANTOM” full length cDNA libraries
  – Large scale sequencing of well over a million 5’-end and 3’-end sequences
  – Complete sequencing of >100,000 full length cDNA clones
¤ Full length cDNAs define transcriptional units (TU)
  – segments of the genome from which transcripts are generated
  – TUs are DNA strand-specific, and are typically bounded by promoters at one end and termination sequences at the other
The human Chr 22 placental transcriptome
¤ Twice as many sequences are transcribed as previously reported
  – Equal number of transcribed sequences in unannotated regions as in annotated regions
¤ Transcripts from unannotated regions comprise
  – transcripts internal to annotated introns
  – transcripts that are antisense to annotated genes
  – a large portion of the novel transcripts is evolutionarily conserved in the mouse
Novel RNAs Identified From an In-Depth Analysis of the Transcriptome of Human Chromosomes 21 and 22
¤ Paper describes
  – Transcriptome analysis of nonrepetitive regions of chromosomes 21 and 22 in 11 different cell lines using
    • High density oligonucleotide arrays with a 35 bp resolution
Poly A+ and poly A– transcription in the nucleus and cytosol
¤ Analysis of poly A+ and poly A– transcripts
  – poly A– transcripts are twice as abundant as poly A+
  – A large proportion of the transcripts is found exclusively in the
¤ Transcriptome mapping experiments show that
  – a larger percentage of the genome is transcribed than can be accounted for by the current state of genome annotations
  – The human transcriptome is composed of
    • a network of overlapping transcripts (> 50% of the transcripts)
    • Poly A– RNAs potentially comprise almost half of the human transcriptome
¤ Our understanding of the human transcriptome is still evolving…
  – What are the functions of the non-coding transcripts?
The complexity of the transcriptome
A Gene Expression Map for the Euchromatic Genome of Drosophila melanogaster
¤ Paper presents
  – Transcriptome map of the Drosophila genome
    • using microarrays with 179,972 unique 36-nucleotide probes
      – 61,371 exon probes for the 13,197 predicted genes
      – 30,787 splice junction probes
      – 87,814 nonexon probes from intronic and intergenic regions
    • Using RNA from six developmental stages during the
¤ DNA sequencing based methods
  – DNA sequencing of individual cDNA clones to count the number of times a cDNA clone is present in a cDNA library
  – Limited resolution but measures absolute RNA levels
¤ DNA fragment analysis based methods
  – PCR-based amplification of DNA fragments derived from mRNA or cDNA whereby
    • Each DNA fragment represents a different mRNA
  – Currently primarily used for not (yet) sequenced species
¤ Array-based hybridization methods
  – Hybridization to microarrays with gene-specific DNA probes
  – Has become the most powerful and most widely used platform
    • High resolution exon microarrays allow quantitative analysis of alternatively spliced transcripts
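The sequencing-based approach is essentially counting: each sequenced clone is one observation of a transcript, and the fraction of clones per gene estimates its abundance in the library. A minimal sketch with hypothetical gene names and helper functions; `prob_sampled` illustrates the limited resolution by estimating how likely a rare transcript is to appear at all, assuming clones are drawn independently.

```python
# Expression estimates from clone counts (illustrative sketch only).
from collections import Counter

def clone_frequencies(clone_gene_ids):
    """Fraction of sequenced clones assigned to each gene."""
    counts = Counter(clone_gene_ids)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no clones sequenced")
    return {gene: n / total for gene, n in counts.items()}

def prob_sampled(frequency, n_clones):
    """Chance that a transcript at this frequency shows up at least once."""
    return 1.0 - (1.0 - frequency) ** n_clones

# Hypothetical library of ten sequenced clones.
library = ["actin"] * 6 + ["tubulin"] * 3 + ["rare_tf"]
freqs = clone_frequencies(library)
print(freqs)

# A transcript at 1 in 100,000 is rarely seen among 5,000 clones.
print(round(prob_sampled(1e-5, 5000), 3))
```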
Cluster Analysis and Display of Genome-wide Expression Patterns
¤ Paper presents
  – Method for analyzing and representing genome-wide expression data
    • Cluster analysis of data using standard statistical algorithms to arrange genes according to similarity in pattern of gene expression
    • The output is displayed graphically, conveying the clustering and the expression data simultaneously in a form intuitive for biologists
Eisen et al., PNAS 95, 14863 (1998)
Cluster Analysis of Expression Patterns
¤ A logical basis for organizing gene expression data is to group genes with similar patterns of expression
  – using a mathematical description of similarity that captures
    • similarity in "shape" of expression profiles
¤ Since there is no a priori knowledge of gene expression patterns, unsupervised methods are favored
  – Pairwise average-linkage cluster analysis, a form of hierarchical clustering similar to that used in sequence and phylogenetic analysis
  – Yields a similarity tree: branch lengths reflect the degree
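The clustering idea can be sketched as follows: Pearson correlation scores similarity in profile "shape" regardless of magnitude, and the two most similar clusters are joined repeatedly until a single tree remains. This sketch defines cluster-to-cluster similarity as the mean of all pairwise correlations, one common form of average linkage that may differ in detail from the published method; the function names and the four profiles are hypothetical.

```python
# Average-linkage hierarchical clustering on Pearson correlation (sketch only).
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_linkage(profiles):
    """Return (tree, merges); tree is nested tuples of gene names."""
    nodes = [(name, [name]) for name in sorted(profiles)]  # (subtree, members)
    merges = []
    while len(nodes) > 1:
        best = None
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                sims = [pearson(profiles[a], profiles[b])
                        for a in nodes[i][1] for b in nodes[j][1]]
                score = sum(sims) / len(sims)  # mean pairwise similarity
                if best is None or score > best[0]:
                    best = (score, i, j)
        score, i, j = best
        tree = (nodes[i][0], nodes[j][0])
        merges.append((tree, score))
        merged = (tree, nodes[i][1] + nodes[j][1])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0][0], merges

# Hypothetical expression profiles over four time points.
profiles = {
    "g1": [1, 2, 3, 4],   # rising
    "g2": [2, 4, 6, 8],   # rising, same shape at twice the magnitude
    "g3": [4, 3, 2, 1],   # falling
    "g4": [9, 7, 4, 2],   # falling
}
tree, merges = average_linkage(profiles)
print(tree)
```

The rising genes pair up first because their correlation is 1 despite the difference in magnitude, which is the sense in which the similarity measure captures shape.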
¤ Temporally regulated genes are
  – maximally expressed at specific times throughout the entire cell cycle
  – Genes were induced immediately before or coincident with each cell cycle-regulated event
Profiles of Genes Associated With DNA Replication and Cell Division