Aliens? Oddities? Or misunderstood? Transposons and miRNAs
Jan 11, 2016
Genome sizes (haploid)
Wheat              16 GB     (? ploid) ~7 chromosomes
Human              3.3 GB    23 chromosomes
Mouse              2.5 GB    19 chromosomes
Dog                2.4 GB    38 chromosomes
Chicken            1.2 GB    38 chromosomes plus microchr.
Drosophila         ~180 MB   4-5 chromosomes
C. elegans         100 MB    5 chromosomes
E. coli            5.2 MB    1 chromosome
Carsonella ruddii  160 KB    1 chromosome, 182 ORFs
Number of genes in different organisms
[Bar chart: number of genes (y-axis, 0 to 25,000) for Human, Rice, Mouse, Arabidopsis, Chicken, C. elegans, Dog, Drosophila, E. coli]
What is a transposon?
• Contiguous piece of DNA of varying length (300 bp to 6.5 kb or so)
• Repeated with minor variations throughout the host genome
• Can replicate itself by cut and paste or copy and paste mechanisms (can move around!)
• No known function — most synthetic genome projects aim to remove them
• Structural and functional analogies to viruses
  – Much of the terminology reflects this
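The two replication modes can be illustrated on a toy string "genome". This is only a sketch: the function names, coordinates, and the uppercase `TTTT` stand-in for an element are all made up for illustration, and real transposition involves target-site selection and enzymes that are not modeled here.

```python
def cut_and_paste(genome: str, start: int, end: int, target: int) -> str:
    """Excise genome[start:end] and reinsert it at index `target` of what remains."""
    element = genome[start:end]
    remainder = genome[:start] + genome[end:]
    return remainder[:target] + element + remainder[target:]


def copy_and_paste(genome: str, start: int, end: int, target: int) -> str:
    """Leave the original element in place and insert a new copy at index `target`."""
    element = genome[start:end]
    return genome[:target] + element + genome[target:]


genome = "aaaaTTTTcccc"  # hypothetical host sequence; TTTT marks the element
moved = cut_and_paste(genome, 4, 8, 0)
copied = copy_and_paste(genome, 4, 8, 12)

print(moved)   # TTTTaaaacccc      (same length, element relocated)
print(copied)  # aaaaTTTTccccTTTT  (genome grows by one element)
```

Cut and paste keeps the copy number constant, while copy and paste increases it with every event, which is one way to see how copy-and-paste elements can come to occupy a large share of a genome.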
Barbara McClintock, 1940s
Discovered transposons and characterized their effects on their hosts
She was ostracized for her ideas but won the Nobel Prize in 1983.
Types of transposons
• Cut and paste
  – DNA transposons
• Copy and paste
  – Autonomous retrotransposons
    • ERVs *possibly active in human genome
    • L1 & relatives *active in human genome
  – Nonautonomous retrotransposons
    • SINEs (Alu) *active in human genome
    • SVA *active in human genome
      – Composite element (SINE, VNTR, Alu)
    • Processed pseudogenes
Transposons comprise ~45% of the human genome
• DNA transposons 3%
• Autonomous retrotransposons
  – ERVs (LTR retrotransposons)
  – L1 18% (500,000 copies)
  – L2 3%
  – L3 & relatives 1%
• Nonautonomous retrotransposons
  – SINEs (Alu) 15% (1 million+ copies)
  – SVA (3000 copies)
  – Processed pseudogenes (>8000)
(Simple repeats occupy almost another 10%)
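A quick back-of-the-envelope calculation turns these percentages into base pairs. The sketch below uses only the figures on the slides (3.3 GB haploid human genome, 45% transposons, L1 at 18% in 500,000 copies, Alu at 15% in roughly 1 million copies); the function and variable names are made up for illustration.

```python
HUMAN_GENOME_BP = 3.3e9  # haploid human genome size from the genome-size slide


def mean_copy_length(genome_fraction: float, copies: int,
                     genome_bp: float = HUMAN_GENOME_BP) -> float:
    """Average length per copy implied by a genome fraction and a copy number."""
    return genome_fraction * genome_bp / copies


total_transposon_bp = 0.45 * HUMAN_GENOME_BP
l1_mean_bp = mean_copy_length(0.18, 500_000)
alu_mean_bp = mean_copy_length(0.15, 1_000_000)

print(f"Transposon-derived sequence: {total_transposon_bp / 1e9:.2f} GB")
print(f"Mean L1 copy length:  {l1_mean_bp:.0f} bp")
print(f"Mean Alu copy length: {alu_mean_bp:.0f} bp (upper bound, since the count is '1 million+')")
```

With these inputs the average L1 copy comes out near 1.2 kb, well below the 6.5 kb upper end of transposon lengths quoted earlier, which is consistent with many L1 copies being truncated rather than full length.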
“Junk DNA”?
• What do transposons do?
  – Make more of themselves
  – Move genes around
  – Serve as reservoirs of new sequence
  – Cause genetic instability (repeats stimulate translocation; L1 causes chromosome breakage)
• Can contribute to genes and gene expression
  – 5% of alternatively spliced internal human exons come from Alus
  – 80% of genes have some L1 sequence in noncoding portion
  – 1-4% of coding sequence is L1-derived
  – Act as methylation centers
Importance in genomics
• Transposons are a source of human variability
  – Roughly 5% of people have a transposon not found in either parent (not due to nonpaternity!)
  – Overall polymorphism variable but remarkable (40-50% of youngest elements are polymorphic)
• Transposons can be useful in medicine
  – Occasionally cause disease (de novo insertion in factor VIII clotting gene led to L1 discovery in 1980s)
  – May often be linked to disease loci
Importance in genomics
• Transposons in introns may disrupt gene expression
  – Mechanism depends on whether they are on the sense or antisense strand
  – (+) strand orientation: transcription stalling
  – (-) strand orientation: premature polyadenylation, gene splitting
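The orientation rule above can be written as a small lookup. This sketch assumes that "(+)" means the element lies on the same strand as the gene (sense) and "(-)" means the opposite strand (antisense); the function name is hypothetical, and the returned strings simply repeat the slide's wording.

```python
def intronic_insertion_effect(gene_strand: str, element_strand: str) -> str:
    """Return the slide's predicted effect of an intronic transposon.

    Assumption: the element is 'sense' when it lies on the same strand
    as the gene and 'antisense' otherwise.
    """
    strands = {"+", "-"}
    if gene_strand not in strands or element_strand not in strands:
        raise ValueError("strand must be '+' or '-'")
    if gene_strand == element_strand:
        return "transcription stalling"
    return "premature polyadenylation, gene splitting"


# A gene on the minus strand with an element also on the minus strand is 'sense'.
print(intronic_insertion_effect("-", "-"))  # transcription stalling
print(intronic_insertion_effect("+", "-"))  # premature polyadenylation, gene splitting
```

The point of comparing strands rather than reading the element's strand alone is that orientation only matters relative to the host gene.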
Importance in genomics
• Can have huge effects, through chromosomal translocation, inversion, breakage
Transposon domestication
• Overly active transposons will kill a cell (and then the organism)
• Transposons have tempered their activity
  – Active almost exclusively in germ line
  – Also in cancer cells and neuronal cell precursors
Transposon domestication
• Host cells use many mechanisms to control transposons
  – Methylation (original role?)
  – miRNA defense
  – Sequestered in stress granules
  – Nucleic acid editing
    • APOBEC family of proteins edits cytosines to uracils
    • ADARs edit dsRNA adenosine to inosine
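The two editing reactions can be shown as simple string substitutions. This is a toy sketch: the function names, the example sequences, and the edited positions are all hypothetical, inosine is written as `I`, and nothing here models how the enzymes choose their sites.

```python
def deaminate(seq: str, positions, substrate: str, product: str) -> str:
    """Replace `substrate` with `product` at the given 0-based positions."""
    chars = list(seq)
    for pos in positions:
        if chars[pos] != substrate:
            raise ValueError(f"position {pos} is {chars[pos]!r}, expected {substrate!r}")
        chars[pos] = product
    return "".join(chars)


def apobec_edit(seq: str, positions) -> str:
    """APOBEC-style edit: cytosine to uracil."""
    return deaminate(seq, positions, "C", "U")


def adar_edit(rna: str, positions) -> str:
    """ADAR-style edit: adenosine to inosine (written as 'I')."""
    return deaminate(rna, positions, "A", "I")


print(apobec_edit("GACCT", [2]))    # GAUCT
print(adar_edit("GAUAC", [1, 3]))   # GIUIC
```

Either kind of edit changes the sequence of the invading element, so a heavily edited copy no longer matches the original.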
What to do with transposons?
• Study them
• Work around them (be aware)
  – RepeatMasker (Smit & Jurka)
  – Problem: each element is at least in part unique, and RepeatMasker will mask that too
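The masking problem can be seen in a toy example. The sketch below is not RepeatMasker's algorithm or interface; it only replaces already-annotated intervals with `N`, and the sequences, coordinates, and function name are made up for illustration.

```python
def hard_mask(seq: str, intervals) -> str:
    """Replace every base inside the given half-open (start, end) intervals with 'N'."""
    chars = list(seq)
    for start, end in intervals:
        for i in range(start, end):
            chars[i] = "N"
    return "".join(chars)


# Two hypothetical copies of one element that differ at a single position (G vs A).
copy_a = "GGCCGGGCGC"
copy_b = "GGCCGAGCGC"
genome = "TTTT" + copy_a + "AAAA" + copy_b + "TTTT"

masked = hard_mask(genome, [(4, 14), (18, 28)])
print(genome)
print(masked)
```

Before masking, position 9 is `G` and position 23 is `A`, which is the only thing distinguishing the two copies; after masking both are `N`, so that unique information is no longer available to downstream analysis.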
Another old element, new to science: microRNAs
RNA world hypothesis:
First “organism” was a strand of RNA that could somehow replicate itself.
Eventually RNA used DNA as a more stable storage for genetic material.
1982: Tom Cech reported self-splicing RNAs
microRNA
• 21-25 nucleotide small RNAs
• Discovered in a C. elegans screen
• Alter gene expression at the post-transcriptional level (precise mechanism unknown)
• Tend to be high-level regulators (>100 targets each)
• Percentage of human genes under miRNA control is unknown but possibly 20-30%
• Often are developmental or cell state switches
miRNA
Two mechanisms:
Perfect match to target leads to mRNA cleavage
or
Imperfect match leads to translational repression
Neither is well understood, but both likely involve the dsRNA recognition system
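The perfect-versus-imperfect distinction can be sketched as a comparison between a target site and the miRNA's reverse complement. This is a deliberately crude model under stated assumptions: it considers only Watson-Crick pairs, ignores bulges and G:U wobble, requires equal lengths, and uses a hypothetical `min_paired` cutoff; the example sequence is arbitrary.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    return rna.translate(COMPLEMENT)[::-1]


def predicted_outcome(mirna: str, site: str, min_paired: int = 15) -> str:
    """Classify a miRNA/site pair by how many positions can pair.

    min_paired is a hypothetical cutoff chosen for illustration only.
    """
    if len(mirna) != len(site):
        raise ValueError("toy model: miRNA and site must have the same length")
    perfect_site = reverse_complement(mirna)
    paired = sum(a == b for a, b in zip(site, perfect_site))
    if paired == len(mirna):
        return "mRNA cleavage"
    if paired >= min_paired:
        return "translational repression"
    return "no predicted effect"


mirna = "UGGAAUGUAAAGAAGUAUGUAU"  # arbitrary 22-nt example sequence
perfect = reverse_complement(mirna)
# Introduce a single mismatch in the middle of the site.
imperfect = perfect[:10] + ("A" if perfect[10] != "A" else "C") + perfect[11:]

print(predicted_outcome(mirna, perfect))    # mRNA cleavage
print(predicted_outcome(mirna, imperfect))  # translational repression
```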
Another role?
• Under conditions of cell stress, a miRNA may be activating instead, as responding regulatory proteins interpret the signal differently
Seems odd . . .
• Why would a cell use this sort of mechanism? It’s making an mRNA and then degrading it. Should be easier to just not make it . . .
• But what if the cell is not in control of that RNA, for example if it’s coming from an invasive nucleic acid species under its own promoter?
  – Transposon control!!!
  – piRNA (piwi RNA) are a whole class of small RNAs that control transposons
  – Invasive RNA was a big problem in the RNA world!
Occam’s razor
All other things being equal, the simplest solution is the best
My alternative: If a biological principle is simple, it’s probably wrong.
Evolution tends to higher complexity, as old mechanisms are reused and there’s little incentive to clean up.
Looking for new miRNAs
• Often found within stem-loop precursor structures (hairpins)
• Associated (in the cell) with polysomes and other structures
• Bioinformatics: unexpected sequence conservation in noncoding region, or homology to miRNA in a closely related species (works less often than you would think)
• Identify candidate miRNA targets (TargetScan, by Chris Burge’s group)
  – A target protein usually has multiple target sites
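The stem-loop criterion can be approximated with a very simple check: count how many bases pair when a candidate sequence is folded back on itself from the ends inward. This is only a sketch; real searches rely on thermodynamic folding rather than this end-to-end pairing, and the function name, the `loop_min` parameter, and the test sequences are all hypothetical.

```python
# Watson-Crick pairs plus the G:U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def stem_length(seq: str, loop_min: int = 3) -> int:
    """Count consecutive base pairs formed from the two ends inward.

    Pairing stops at the first mismatch, or when fewer than `loop_min`
    unpaired bases would remain between the two arms.
    """
    i, j = 0, len(seq) - 1
    paired = 0
    while j - i - 1 >= loop_min and (seq[i], seq[j]) in PAIRS:
        paired += 1
        i += 1
        j -= 1
    return paired


print(stem_length("GGGGAAAACCCC"))  # 4: GGGG pairs with CCCC around an AAAA loop
print(stem_length("AAAAAAAAAAAA"))  # 0: no pairing possible
```

A long stem is only a necessary hint, not proof of a miRNA precursor, which fits the slide's caution that the bioinformatic approaches work less often than one would think.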
Problems with miRNAs
• Small! Unstable, hard to get large quantities
• Binding is degenerate, noncontiguous, and includes not only mismatches but bulges
• Actual sequence recognition only 15 or so nucleotides (noncontiguous), varies by target
• Essential “seed” element not well characterized
• Sequences not well conserved across species
• miRNA microarrays: statistics problematic because there are so few spots