RNA & NKS Erik A. Schultes Hedgehog Research hedgehogresearch.info June 16, 2006
Mar 26, 2015
Double-stranded RNA Helix
[Figure: RNA strands labeled 5’ and 3’, with the bases cytosine, uracil, adenine, and guanine]
Ribonucleic Acid:
• Universal biopolymer
• Linear, polarized
• 4 distinct nitrogenous bases
• RNA can store genetic info (like DNA)
• Base-pairing rules: A with U, C with G
• RNA can act as an enzyme (like proteins)
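The base-pairing rules above can be sketched in a few lines of Python. This is only an illustration: the function names and the 8-nt example strand are hypothetical, and only the two pairings listed on the slide are encoded.

```python
# Pairing rules as listed on the slide: A with U, C with G.
PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs with `seq`, written 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(seq))


def can_form_helix(strand_a: str, strand_b: str) -> bool:
    """True if two strands (both written 5' to 3') pair at every position."""
    return strand_b == reverse_complement(strand_a)


example = "GGAAUUGC"  # hypothetical strand, not taken from the slides
partner = reverse_complement(example)
print(example, partner, can_form_helix(example, partner))
```

Because the strands are polarized, the partner is read in the opposite direction, which is why the sequence is reversed before complementing.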
Single-stranded RNA:
Primary Structure, Secondary Structure, Tertiary Structure
5’ GGAAUUGCGGGAAAGGGGUCAACAGCCGUUCAGUACCAAGUCUCAGGGGAAACUUUGAGAUGGCCUUGCAAAGGGUAUGGUAAUAAGCUGACGGACAUGGUCCUAACCACGCAGCCAAGUCCUAAGUCAACAGAUCUUCUGUUGAUAUGGAUGCAGUUCA 3’
[Figure: the same sequence drawn as a secondary structure and as a tertiary structure]
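To make the step from primary to secondary structure concrete, here is a minimal sketch that counts the largest number of nested base pairs a sequence can form. It uses the textbook Nussinov dynamic program, which is a swapped-in illustration and not the method behind the slides. The minimum loop size of 3 and the restriction to A-U and C-G pairs are assumptions.

```python
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested pairs, with at least `min_loop` unpaired bases in every hairpin."""
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with base k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


print(max_base_pairs("GGGAAAUCC"))  # hypothetical 9-nt hairpin
```

Counting pairs ignores energetics and tertiary contacts, so this only shows how one sequence can imply a pairing pattern.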
Single-stranded RNA:
• Hammerhead
• VS
• HDV
• Group I
• Group II
• RNase P
• Hairpin
Single-stranded RNA:
16S rRNA: Universal Tree of Life
A New Kind of Science by Stephen Wolfram
In our everyday experience with computers, the programs that we encounter are normally set up to perform very definite tasks.
Key idea: What happens if one instead just looks at simple arbitrarily chosen programs, created without any specific task in mind?
How do such programs typically behave?
computer (hardware) → RNA molecule
program (software) → nucleotide sequence
In our everyday experience with RNAs, the sequences that we encounter are normally set up to perform very definite tasks.
Key idea: What happens if one instead just looks at simple arbitrarily chosen sequences, created without any specific task in mind?
How do such sequences typically behave?
/. computer (hardware) → RNA molecule
/. program (software) → nucleotide sequence
What do we mean by:
Arbitrary sequence? Generated by random process
RNA behavior? Folding dynamics
• Helix: ordered; rapidly converges on a unique, specific fold (Classes I & II)
• Evolved: complex (Class IV)
• Poly(U): disordered; never converges on a unique, specific fold (Class III)
Analyzing RNA Behavior
1. Lead(II) chemical probing: secondary structure, uniqueness of folding
2. Native gel electrophoresis: uniqueness of folding, size of fold
3. Analytical centrifugation: size & shape of fold
Choosing Arbitrary Sequences
• Control Sequences: tRNAPhe (76 nt), HDV (85 nt), Ligase (87 nt)
• Reference Sequence: Poly(U)85
• Arbitrary Sequences: 10 Permuted HDV (85 nt), 10 Isoheteropolymer (85 nt)
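One way to generate the two kinds of arbitrary sequence is sketched below. It rests on assumptions: "permuted" is taken to mean a random shuffle that preserves base composition, and "isoheteropolymer" is taken to mean a random sequence with near-equal amounts of A, C, G, and U. The function names and the short template are hypothetical, and the real HDV sequence is not reproduced here.

```python
import random
from collections import Counter

BASES = "ACGU"


def permuted(seq: str, rng: random.Random) -> str:
    """Shuffle a sequence: same length and base composition, arbitrary order."""
    letters = list(seq)
    rng.shuffle(letters)
    return "".join(letters)


def isoheteropolymer(length: int, rng: random.Random) -> str:
    """Random sequence with base counts as equal as `length` allows (assumed definition)."""
    per_base, extra = divmod(length, len(BASES))
    letters = list(BASES) * per_base + rng.sample(list(BASES), extra)
    rng.shuffle(letters)
    return "".join(letters)


rng = random.Random(2006)       # fixed seed so the run is repeatable
template = "GGCCAUUAGC"         # hypothetical stand-in for an 85-nt ribozyme
print(permuted(template, rng), Counter(isoheteropolymer(85, rng)))
```

Since 85 is not divisible by 4, one base necessarily appears one extra time in the 85-nt case.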
Automated DNA Synthesis
Structure Probing with Pb++
[Figure: chemical structure of the RNA backbone with Pb++]
[Figure: probing gel for HDV with OH ladder, T1 ladder, and time-course lanes]
Arbitrary sequences acquire sequence-specific folds
Native Gel Electrophoresis
• 8% (29:1)
• 100 mM THE, pH 7.5
• 30 mM KCl
• 0, 1, 10 mM MgCl2
• 3 W, 2000 Vhr
• XC = 100 mm
• T = 23-24 ºC
[Figure: gel with - and + electrodes marked]
Arbitrary sequences acquire compact folds
Inferring molecular size and shape from the concentration distribution of a pure sample under a centrifugal field.
Sedimentation Velocity Experiments
40,000 RPM, 100,000 x g
[Figure: centrifuge cell with reference and sample sectors and the meniscus marked; RNA under sedimenting force Fs; plot of A260 against r (cm)]
tRNA: S = 4.072 S, D = 8.45 Ficks, M = 25.2 kDa, Rs = 26.2 Å
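The tRNA numbers can be related to each other with the standard Svedberg and Stokes-Einstein equations. The sketch below assumes conditions that the slides do not state: 20 ºC, the density and viscosity of water, and a partial specific volume of 0.53 mL/g for RNA. Under those assumptions the results should land close to, but not exactly on, the values shown on the slide.

```python
import math

R = 8.314462618      # gas constant, J/(mol K)
K_B = 1.380649e-23   # Boltzmann constant, J/K


def molar_mass_kda(s_svedberg, d_ficks, temp_k=293.15,
                   vbar_ml_per_g=0.53, rho_g_per_ml=0.9982):
    """Svedberg equation: M = s R T / (D (1 - vbar * rho))."""
    s = s_svedberg * 1e-13   # 1 S = 1e-13 s
    d = d_ficks * 1e-11      # 1 Fick = 1e-7 cm^2/s = 1e-11 m^2/s
    buoyancy = 1.0 - vbar_ml_per_g * rho_g_per_ml
    return s * R * temp_k / (d * buoyancy)   # kg/mol, numerically equal to kDa


def stokes_radius_angstrom(d_ficks, temp_k=293.15, viscosity_pa_s=1.002e-3):
    """Stokes-Einstein relation: Rs = k_B T / (6 pi eta D)."""
    d = d_ficks * 1e-11
    return K_B * temp_k / (6 * math.pi * viscosity_pa_s * d) * 1e10


print(round(molar_mass_kda(4.072, 8.45), 1), "kDa")
print(round(stokes_radius_angstrom(8.45), 1), "Å")
```

Changing the assumed temperature, viscosity, or partial specific volume shifts both results, which is one reason the sketch is not expected to reproduce the slide values exactly.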
Arbitrary sequences acquire compact folds
The behavior of arbitrary RNA
Arbitrary sequences frequently have compact, sequence-specific folding - properties that have always been assumed to be evolutionarily derived.
So far: 20 seq, 2 compositions = (1/2 postdoc)
Next step: 35,000 seq, 1,700 compositions = ($100M)
Principle of Computational Equivalence
• weak RNA PCE: complex, biologically relevant folds are abundant in seq space
• strong RNA PCE: specific folds may be abundant in seq space
Distribution of Folds in RNA Sequence Space
Prototype Ribozymes
[Figure: secondary structures of the class III ligase and HDV prototype ribozymes]
Intersection Sequence
[Figure: the intersection sequence drawn in the HDV fold and in the ligase fold]
Testing for Ligation & Cleavage
Connecting the Prototypes by Neutral Paths
• Prototype Ligase: 42 mutations
• Prototype HDV: 44 mutations
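The idea of a path counted in mutations can be sketched as a walk that changes one position at a time. This is a simplified illustration: it assumes equal-length sequences and substitutions only, and it does not test whether each intermediate keeps its activity, which is what would make a path neutral. The function names and example sequences are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def mutation_path(start: str, end: str) -> list:
    """Walk from `start` to `end`, changing one differing position per step (left to right)."""
    if len(start) != len(end):
        raise ValueError("sequences must have equal length")
    path = [start]
    current = list(start)
    for i, target in enumerate(end):
        if current[i] != target:
            current[i] = target
            path.append("".join(current))
    return path


path = mutation_path("GGAUC", "GCAUG")  # hypothetical 5-nt sequences
print(path, hamming(path[0], path[-1]))
```

In an actual neutral-path experiment, the order of the steps matters, because each intermediate would have to be assayed before moving on.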
Intersection of Fitness Landscapes
Conclusion
1. RNA neutral networks (NN) exist - seq space is highly redundant in folds
2. Different NN are proximal
Biological Implications
1. New folds from existing folds
2. RNAs with different sequences and folds could still share ancestry
NKS and Neutral Networks
1. Are INT sequences typical or rare?
2. Are sequences on NN typical or evolved?
3. Does CA rule space have NN?
4. If so, are there INT rules?
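One hedged way to start on question 3 is to treat the 256 elementary cellular automaton rules as 8-bit "sequences" and call two rules neighbors when they differ in one bit. The sketch below then asks which neighbors give the same evolution from a single starting cell. That notion of "neutral" is an assumption made for illustration, as are the periodic boundary and the function names.

```python
def step(cells, rule):
    """Apply an elementary CA rule once, with periodic boundaries."""
    n = len(cells)
    return [
        (rule >> (4 * cells[(i - 1) % n] + 2 * cells[i] + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def evolve(rule, width=31, steps=15):
    """Rows produced from a single live cell in the center."""
    row = [0] * width
    row[width // 2] = 1
    rows = [row]
    for _ in range(steps):
        row = step(row, rule)
        rows.append(row)
    return rows


def one_bit_neighbors(rule):
    """The 8 rules that differ from `rule` in exactly one rule-table entry."""
    return [rule ^ (1 << k) for k in range(8)]


def neutral_neighbors(rule, width=31, steps=15):
    """Neighbors whose evolution from this one initial condition is unchanged."""
    reference = evolve(rule, width, steps)
    return [r for r in one_bit_neighbors(rule) if evolve(r, width, steps) == reference]


print(neutral_neighbors(0, width=11, steps=5))
```

A fuller test would compare behavior over many initial conditions, since a rule-table entry that is never used from one starting row can still matter from another.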
Acknowledgments
David P. Bartel, Whitehead Institute
NSF/Alfred P. Sloan Fellowship
TMF/Charles A. King Trust Fellowship