Alternative Splicing
Hedi Hegyi, PhD, Institute of Enzymology, Budapest, http://www.enzim.hu/~hegyi/
Szeged University, Biochemistry Course, Oct 31, 2007
- Spring of 2000: molecular biologists place dollar bets on how many genes the human genome contains.
90,000? 153,000? (C. elegans: 19,500; maize: 40,000)
35,000, 30,000, a paltry 25,000!
C-value paradox: Complexity does not correlate with genome size. (C.A. Thomas, Jr, 1971)
Homo sapiens: 3.4 x 10^9 bp
Amoeba dubia: 6.7 x 10^11 bp
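To make the size gap behind the C-value paradox concrete, the two genome sizes quoted above can be compared directly. This is only a quick arithmetic sketch; the variable names are illustrative.

```python
# Genome sizes in base pairs, as quoted on the slide.
human_bp = 3.4e9
amoeba_bp = 6.7e11

# How many times larger the Amoeba dubia genome is than the human genome.
ratio = amoeba_bp / human_bp
print(f"Amoeba dubia genome is about {ratio:.0f}x the human genome")
```

The amoeba genome is roughly two hundred times larger than the human one, which is why genome size is a poor proxy for organismal complexity.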
N-value paradox: Complexity does not correlate with gene number.
Discovery of Alternative Splicing
- Alternative splicing gives two forms of the protein with different C-termini:
  – One form is shorter and secreted
  – The other stays anchored in the plasma membrane via its C-terminus
- First predicted by Walter Gilbert in 1978
- First discovered for an Immunoglobulin heavy chain gene in 1980 (Edmund Choi, Michael Kuehl & Randolph Wall, Nature 286, 776 - 779)
Alternative splicing of the mouse immunoglobulin μ heavy chain gene
Figure legend:
S - signal peptide
V - variable region
C - constant region
Red – untranslated region
Green – membrane anchor
Yellow – end of coding region for secreted form
Splicing & the spliceosome
Structure
• 60S dynamic structure – a large complex consisting of ~150 proteins
• Five small nuclear RNAs (U1, U2, U4, U5 & U6)
• RNAs assemble with proteins to form snRNPs (“snurps”)
• Protein splicing factors
Assembly of the spliceosome requires ATP
Splicing defects
• Estimation: 15% of all genetic diseases are associated with mutated splice sites
Figure legend: green globule: RNA pol; yellow globule: spliceosome
snRNAs
snRNA   Length (nts)   Function
U1      165            Binds 5’ splice site, then 3’ splice site
U2      185            Binds the branch site and forms part of the catalytic center
U4      116            Masks the catalytic activity of U6
U5      145            Binds the 5’ splice site
U6      106            Catalyzes splicing
Secondary structure of snRNAs (figure panels: U1, U2, U4, U5, U6)
Orange - interaction with 5’ splice site
Green - interaction with branch site
Blue - interaction between U2 and U6
Tan - Sm-binding site (PuAU(4-6)GPu) flanked by two stem-loop structures
U1 snRNA
• Contains conserved sequence complementary to 5’ splice site of nuclear mRNA introns
• Contains pseudouridine (Ψ)
Upstream exon | 5’ splice site
pre-mRNA   5’---GUAAGU-------3’
                ::::::
U1 snRNA   3’---CAUUCA---cap-5’
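The base pairing between the 5’ splice site and U1 snRNA can be checked position by position. The sketch below is a minimal illustration, assuming only Watson-Crick pairs count (G-U wobble pairs are deliberately ignored); the function name `antiparallel_pairs` is hypothetical.

```python
# Watson-Crick pairs for RNA (assumption: G-U wobble pairs are not counted).
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}

def antiparallel_pairs(strand_5to3: str, partner_3to5: str) -> list[bool]:
    """Pair position i of the first strand (read 5'->3') with position i
    of the second strand (read 3'->5'), as the duplex is drawn on the slide."""
    if len(strand_5to3) != len(partner_3to5):
        raise ValueError("strands must have equal length")
    return [(a, b) in PAIRS for a, b in zip(strand_5to3, partner_3to5)]

splice_site = "GUAAGU"  # pre-mRNA 5' splice site, written 5'->3'
u1_region = "CAUUCA"    # U1 snRNA, written 3'->5' as on the slide

pairing = antiparallel_pairs(splice_site, u1_region)
print(pairing, all(pairing))
```

All six positions pair, which is what "complementary to the 5’ splice site" means in practice. A weakened splice site would show one or more `False` entries.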
Splice-site recognition
upstream exon | intron | downstream exon
5’---AG|GUAAGU-----------A--------(Py)nNCAG|G---3’
5’ splice site: GUAAGU at the start of the intron
Branch site: A, ~20 – 50 nts from the 3’ splice site
3’ splice site: (Py)nNCAG at the end of the intron
Branch site in yeast: often 5’-UACUAAC-3’
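The landmarks listed above (GU at the start of the intron, AG at its end, and a branch site some tens of nucleotides before the 3’ splice site) can be checked on a sequence with a few string operations. This is a rough sketch, not a splice-site predictor. The function name, the toy intron, and the choice of the last A of UACUAAC as the branch point are assumptions made for illustration.

```python
BRANCH = "UACUAAC"   # yeast branch-site consensus quoted on the slide
BRANCH_A_OFFSET = 5  # assumption: the branch-point A is the last A of the motif

def describe_intron(intron: str) -> dict:
    """Check an intron (5'->3', first to last intron nucleotide) for landmarks."""
    rna = intron.upper().replace("T", "U")
    pos = rna.rfind(BRANCH)  # branch site closest to the 3' end, -1 if absent
    distance = None
    if pos != -1:
        # nucleotides from the branch-point A to the last intron nucleotide
        distance = len(rna) - 1 - (pos + BRANCH_A_OFFSET)
    return {
        "starts_with_GU": rna.startswith("GU"),
        "ends_with_AG": rna.endswith("AG"),
        "branch_to_3ss": distance,
    }

# Hypothetical toy intron, invented for illustration only.
toy_intron = "GUAUGU" + "C" * 10 + BRANCH + "U" * 20 + "CAG"
info = describe_intron(toy_intron)
print(info)
```

In the toy intron the branch-point A sits 24 nucleotides from the 3’ end, inside the ~20 – 50 nt window given on the slide.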
Splice Site Conservation
Splice junction (5’ to 3’): exon | XX ... intron ... YY | exon
XX = donor (5’) splice site, YY = acceptor (3’) splice site

Class                 XX    YY
U2_GT_AG (13289)      GT    AG
U2_GC_AG (1085)       GC    AG
U12_GT_AG (688)       GT    AG
U12_AT_AC (187)       AT    AC
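The relative weight of each splice-site class is easier to see as a percentage. The sketch below assumes that the numbers in parentheses are counts of observed splice junctions per class; the slide does not say so explicitly.

```python
# Assumption: the parenthesised numbers are junction counts per class.
counts = {
    "U2_GT_AG": 13289,
    "U2_GC_AG": 1085,
    "U12_GT_AG": 688,
    "U12_AT_AC": 187,
}

total = sum(counts.values())
shares = {name: 100 * n / total for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name:10s} {counts[name]:6d} {pct:5.1f}%")
```

Under that assumption, the canonical U2-type GT-AG class accounts for close to nine tenths of the junctions, and the U12-type classes together for a few percent.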
Splicing mechanism
[Figure: Exon 1 - Intron - Exon 2, with the 5’ splice site (GU), the branch site (A) and the 3’ splice site (AG) marked. Panels show U1 and U2 bound to the pre-mRNA, then U4, U5 and U6 joining, with ATP.]
Factors Playing a Role in Exon Recognition
1. Evolution appears to have weakened splice sites
Derived from 253 introns; only 3% of the S. cerevisiae genes contain introns. No alternative splicing.
Derived from 4,697 S. pombe genes; approximately 43% of all genes contain introns. Intron retention.
Derived from 49,778; nearly 100% contain introns. 75% alternative splicing.
- G T C C A T T C A - 5' U1
Exon Recognition is complex
Complexity means multiple points of possible regulation and that exons could be skipped by failing to get all the pieces in place
Nature, Vol. 418, p. 236, 2002
The ability to form or disrupt these interactions is thought to play a key role in alternative splicing!!!
Exon Definition
Intron Definition
Factors Playing a Role in Exon Recognition
Intron statistics
[Table; data rows not recovered. Column headers as extracted: species, average exon no., average intron no., average length (kb), kb mRNA per gene, %.]
How Prevalent is Alternative Splicing?
No one really knows for sure.
EST database estimates: between 35 and 60% of protein-coding genes have alternative mRNAs
Caveat - These databases contain sequences derived from aberrant as well as alternative splicing, they are typically biased toward the 3' and 5' ends, and they contain too few sequences to infer frequency
Therefore, database mining may overestimate the rate of alternative splicing
Array-Based Numbers
Genome-Wide Survey of Human Alternative Pre-mRNA Splicing with Exon Junction Microarrays (Science 302, 2141-44 (2003))
10,000 multi-exon human genes in 52 tissues
Conclusion: 74% of multi-exon human genes are alternatively spliced
Number of Splicing Isoforms per Gene by EST Comparison
Harrington et al. Nature Genetics 36:916 (2004)
3.8
Regulation of/by Alternative Splicing
• Sex determination in Drosophila involves 3 regulatory genes that are differentially spliced in females versus males; 2 of them affect alternative splicing
1. Sxl (sex-lethal) - promotes alternative splicing of tra (exon 2 is skipped) and of its own (exon 3 is skipped) pre-mRNA
2. Tra – promotes alternative splicing of dsx (last 2 exons are excluded)
3. Dsx (double-sex) - Alternatively spliced form of dsx needed to maintain female state
Fig. 14.38
Alternative splicing in Drosophila maintains the female state
Sxl and Tra are SR proteins. Tra binds exon 4 in dsx mRNA, causing it to be retained in mature mRNA.
Known Roles of Alternative Splicing (Stamm et al. Gene 344:1-20 (2005))
• Introduction of stop codons -
25-35% of alternative splicing events introduce stop codons that either function to produce truncated proteins or regulate mRNA stability through the nonsense mediated decay (NMD) pathway
Nonsense Mediated Decay
- A surveillance mechanism that selectively degrades nonsense mRNAs
- Regulates gene expression by alternative splicing
- Transcripts containing a PTC (premature termination codon) are degraded rapidly
1/3rd of alternative transcripts contain premature termination codons
Brenner, SE et al, PNAS January 7, 2003 vol. 100 no. 1 189–192
• Add new protein parts -
75% of alternative splicing involves the protein coding region; in addition to truncations, it can change the overall protein sequence
• Consequences of new protein parts -
Alter protein binding properties, e.g. receptor/ligand
Alter intracellular localization, e.g. membrane insertion
Alter extracellular localization, e.g. secretion
Alter enzymatic or signaling activities, e.g. TK truncations
Alter protein stability, e.g. inclusion of cleavage sites
Insertion of post-translational modification domains
Change ion channel properties, e.g. slo
• Coordinated Regulation of Biological Events
Potassium channel activity associated with hearing (slo)
Muscle contraction
Neurite (axon or dendrite) growth
Cell differentiation
Apoptosis
Neuron development (Dscam) (TIBS 31:581-588, 2006)
The Power of Alternative RNA Splicing
Exon 4: 12 alternatives
Exon 6: 48 alternatives
Exon 9: 33 alternatives
Exon 17: 2 alternatives
12 x 48 x 33 x 2 equals 38,016 possible mRNAs
Genome has only 14,800 genes!
Ig Loop 3 Ig Loop 4 Ig Loop 7 Trans-membrane
The final mRNA chooses 24 exons from 115 possibilities (20 constitutive exons and 4 alternatively spliced ones)
Drosophila DSCAM gene codes for an axon guidance receptor
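The DSCAM numbers on this slide follow from simple combinatorics: one alternative is chosen independently at each of the four variable exon positions. The short sketch below reproduces the arithmetic; the variable names are illustrative.

```python
from math import prod

# Number of mutually exclusive alternatives at each variable exon position.
alternatives = {4: 12, 6: 48, 9: 33, 17: 2}
constitutive_exons = 20

# One alternative is picked per variable position, so the choices multiply.
isoforms = prod(alternatives.values())

# Total exons available in the gene, and exons present in any one mRNA.
exon_pool = constitutive_exons + sum(alternatives.values())
exons_per_mrna = constitutive_exons + len(alternatives)

print(isoforms, exon_pool, exons_per_mrna)
```

The product gives the 38,016 possible mRNAs, and the sums recover the "24 exons from 115 possibilities" quoted for the final mRNA.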
Evolutionary Overview of Alternative Splicing
• Introns unlikely to have been derived from ancient genes
• Multi-intron genes probably predated alternative splicing
• Most eukaryotes have introns, but alternative splicing is prevalent only in multicellular organisms
• S. cerevisiae has only 253 introns (3% of its genes) and only 6 genes have 2 introns
• S. pombe: 43% of its genes have introns (usually 40-75 nt)
• S. cerevisiae and S. pombe have NO alternative splicing
Large-scale multiple alignment of expressed sequences
• Databases:
  • tens of thousands of mRNAs
  • millions of ESTs
• From large-scale alignments: 60-80% of all human genes undergo alternative splicing.
Alu elements
• Length = ~300 bp
• Repetitive: > 1,400,000 times in the human genome
• Constitute >10% of the human genome
• Found mostly in intergenic regions and introns
• Propagate in the genome through retroposition (RNA intermediates)
Alu elements can be divided into subfamilies
The subfamilies are distinguished by ~16 diagnostic positions.
Alu-containing exons
• Out of 1,182 alternatively spliced cassette exons, 62 have a significant hit to an Alu sequence.
• Out of 4,151 constitutively spliced exons, none has a significant hit to an Alu sequence.
All Alu-containing exons are alternatively spliced.
Graur et al., Genome Res. (2002)
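One way to see how striking the 62-versus-0 split is: if the 62 Alu-containing exons were scattered at random over all 5,333 exons in the two sets, how likely would it be that every one of them landed among the alternatively spliced exons? The sketch below computes that hypergeometric probability. It is an illustration added here, not the test used in the cited paper.

```python
from math import comb

alt_total, alt_alu = 1182, 62     # alternatively spliced cassette exons
const_total, const_alu = 4151, 0  # constitutively spliced exons

all_exons = alt_total + const_total
alu_exons = alt_alu + const_alu

# Probability that all Alu-containing exons fall in the alternative set
# when they are drawn at random, without replacement, from all exons.
p_all_alternative = comb(alt_total, alu_exons) / comb(all_exons, alu_exons)

print(f"{all_exons} exons, {alu_exons} with Alu, p = {p_all_alternative:.2e}")
```

The probability is vanishingly small, so under this simple null model the absence of Alu sequences from constitutive exons is very unlikely to be a sampling accident.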
The minus strand of Alu elements contains “near” splice sites
• The minus strand of Alu contains ~3 sites that resemble the acceptor recognition site.
• The minus strand of Alu contains ~9 sites that resemble the consensus donor site.
- Sequence features of alternatively regulated exons are different from constitutive exons. - These features are conserved between species.
Factors Playing a Role in Exon Recognition
Takeda, J.-i. et al. Nucl. Acids Res. 2006 34:3917-3928; doi:10.1093/nar/gkl507
Large-scale identification of alternative splice variants of human gene transcripts using 56,419 cDNAs
Distribution of the length difference between the alternative splicing variants
Alternative splicing databases (1,560,000 hits in Google)
- Alternative Splicing & Transcript Diversity Db (ASTD), http://www.ebi.ac.uk/astd/
- SpliceMiner (querying EVDB - Evidence Viewer Database), http://discover.nci.nih.gov/spliceminer/
- Hollywood, http://hollywood.mit.edu
- Human Alternative Splicing Db (HASDB), http://www.bioinformatics.ucla.edu/~splice/HASDB/
- Putative Alternative Splicing Database (PALS db), http://palsdb.ym.edu.tw/
Databases are integrated, cross-linked and are available through a variety of interface tools
ASTD data are integrated with Ensembl genome annotation
Spliceminer (NCBI)
Querying EVDB (Evidence Viewer DB). Composite of five separate interactive queries. Each query corresponds to a different Affymetrix HG-U133A Probe. The composite permits facile comparison of the exons that are targeted by each of the probes. For example, the probes for exons 16 and 18 uniquely identify the splice variants NM_006487 and NM_006485, respectively.
Kan, Z. et al. Nucl. Acids Res. 2005 33:5659-5666; doi:10.1093/nar/gki834
Evolutionarily conserved and diverged alternative splicing events show different expression and functional profiles (Kan et al, NAR, 2005)
• Alternative splicing events in 10818 pairs of human and mouse genes
• 43% (8921) of mouse alternative splices could be found in the human genome but not in human transcripts
• Only 7% of human alternative splices are conserved in mouse transcripts
• 5 of 11 tested mouse predictions were observed in human tissues
• Diverged alternative splicing is more prevalent in cancerous cell-lines
Microarray expression of alternatively spliced human-mouse pairs (ASP) of genes in different tissues (Kan et al, 2005)
(i) level of conserved alternative splicing is most elevated in brain
(ii) diverged alternative splicing is most enriched in testis
Corticotrophin releasing hormone receptor 2 (CRHR2) alternative splices
Functional transcripts for the α, β (brain, periphery) and γ (brain) receptors
Catalano et al, Molecular Endocrinology. First published December 18, 2002 as doi:10.1210/me.2002-0302
The implications of alternative splicing in the ENCODE protein complement (cont’d)
Fig. 2. The potential effect of splicing on protein structure. Four splice isoforms mapped onto the nearest structural templates. Structures are colored in purple where the sequence of the splice isoform is missing. (a) Hemoglobin, (b) SET domain-containing protein 3, (c) Mitochondrial cysteine desulfurase, (d) Eukaryotic initiation factor 6.
How many genes and transcripts in human genome?
• Ensembl NCBI 35 release (Dec, 2005): 33,869 transcripts derived from 22,218 genes
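The two Ensembl figures translate directly into an average number of annotated transcripts per gene. A minimal arithmetic sketch:

```python
# Ensembl NCBI 35 release (Dec 2005) figures from the slide.
transcripts = 33_869
genes = 22_218

# Average number of annotated transcripts per gene.
transcripts_per_gene = transcripts / genes
print(f"{transcripts_per_gene:.2f} transcripts per gene on average")
```

An average of about 1.5 annotated transcripts per gene is a lower bound on isoform diversity, since it counts only transcripts present in that annotation release.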