BioSystems Synthesis: New optima demand new technologies
17-Sep-2003 Virtual Conference on Genomics & Bioinformatics
Thanks to: DOE GtL, DARPA BioComp, PhRMA, NHLBI
Dec 20, 2015
Harvard-MIT DOE GtL Center
Collaborating PIs: Chisholm, Polz, Church, Kolter, Ausubel, Lory, Kucherlapati
C.Ting
Improving Models & Measures
Why model?
“Killer Applications”: Share, Search, Merge, Check, Design
(e.g. sequence & 3D alignment)
Biosystems Integrating Measures & Models
[Diagram labels]
DNA, RNA, Proteins, Metabolites, Replication rate, Environment, interactions
Microbes; Cancer & stem cells; Darwinian optima; In vitro replication; Small multicellular organisms
RNAi; Insertions; SNPs
The issue is not speed, but integration.
Cost per 99.99% bp: including Reagents, Personnel, Equipment/5yr, Overhead/sq.m
• Sub-mm scale: 1 µm = femtoliter (10^-15)
• Instruments should match GHz / $2K CPU
Why improve measurements?
Human genomes: (6 billion)^2 ≈ 10^19 bp
Immune & cancer genome changes: >10^10 bp per time point
RNA ends & splicing: in situ 10^12 bits/mm^3
Biodiversity: Environmental & lab evolution
Compact storage: 10^5 now to 10^17 bits/mm^3 eventually
& How? ($1K per genome, 10^8-10^13 bits/$)
Examples of cost bottlenecks
Affymetrix $30M? microfabricator limited by chemical reaction rate to one set of chips per day. (~10000X CPU cost)
Electrophoresis limited to 4000 bp/capillary/day. Fixed cost ratio of capillaries to CPUs. (~1e9X CPU cost)
Projected costs determine when biosystems data overdetermination is feasible.
In 1984, pre-HGP (φX, pBR322, etc.): 0.1 bp/$, which would have been $30B per human genome.
In 2002 (de novo full vs. resequencing), ABI/Perlegen/Lynx: $300M vs. $3M
10^3 bp/$ (4 log improvement)
Other data I/O (e.g. video): 10^13 bits/$
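The dollar figures above follow from dividing genome size by throughput per dollar. A minimal sketch of that arithmetic, assuming a haploid human genome of about 3 x 10^9 bp (a round number that is not stated on the slide):

```python
import math

GENOME_BP = 3e9  # assumed haploid human genome size in bp (not from the slide)

def genome_cost(bp_per_dollar, genome_bp=GENOME_BP):
    """Dollars needed to sequence one genome at a given throughput per dollar."""
    return genome_bp / bp_per_dollar

cost_1984 = genome_cost(0.1)   # pre-HGP rate quoted above: 0.1 bp/$
cost_2002 = genome_cost(1e3)   # 2002 resequencing rate quoted above: 10^3 bp/$
log_gain = math.log10(1e3 / 0.1)  # improvement in bp/$, in orders of magnitude

print(f"1984: ${cost_1984:.0e}   2002: ${cost_2002:.0e}   gain: {log_gain:.0f} logs")
```

Under that assumption the two rates give $30B and $3M per genome, and the ratio of rates is the quoted 4 log improvement.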
Steeper than exponential growth
[Plot: bp/$ (log scale, 0.001 to 10000) vs. year, 1970 to 2010]
[Plot: log(IPS/$K) and log(bits/sec transmit) vs. year, 1830 to 2010; IPS = Instructions Per Second]
Fits shown: R2 = 0.985, R2 = 0.992
1965 Moore's law of integrated circuits
1999 Kurzweil’s law
http://www.faughnan.com/poverty.html
http://www.kurzweilai.net/meme/frame.html?main=/articles/art0184.html
Why single molecules?
(1) Integrate from cells/genomes/RNAs to data
(2) Geometry, “cis-ness” on a molecule, complex, or cell, e.g. DNA Haplotypes & RNA splice-forms
(3) Asynchronous dNTP incorporation
Polymerase colonies (Polonies) along a DNA or RNA molecule
HMS: Shendure, Zhu, Butty, Williams
Wash U: Mitra
Ambergen: Olejnik
U. Del: Edwards, Merritt
[Diagram, strand labels A, A’, B: Single Molecule From Library; 1st Round of PCR; Primer is Extended by Polymerase]
Polymerase colony (polony) PCR in a gel
Primer A has 5’ immobilizing Acrydite
Mitra & Church Nucleic Acids Res. 27: e34
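• Hybridize Universal Primer
• Add Red (Cy3) dTTP. Wash.
• Add Green (FITC) dCTP
• Wash; Scan
[Diagram: two polonies, each with primer B annealed to B’ (3’ and 5’ ends labeled); template AGT. extended with TC, template GCG.. extended with C]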
Sequence polonies by sequential, fluorescent single-base extensions
Inexpensive, off-the-shelf equipment
MJR in situ Cycler: $10K
Automated slide fluidics: $4K
Microarray Scanner: $26K-100K
Human Haplotype: CFTR gene, 45 kbp
Rob Mitra, Vincent Butty, Jay Shendure, Ben Williams
Quantitative removal of Fluorophores
Rob Mitra
Template ST30: 3' TCACGAGT
Base added: (C) A G T (C) (A) G (T) C (A) (G) T C A
Template: 3' TCACGAGT
Extended strand: AGTGCTCA
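The base-addition series for template ST30 can be replayed in a few lines. This is a minimal sketch, not the authors' software: it assumes nucleotides are offered in the repeating order C, A, G, T, and that parentheses on the slide mark additions that did not extend the primer. The function name and the log format are illustrative only.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def simulate_extensions(template_3to5, order="CAGT"):
    """Cycle through single-nucleotide additions and log which ones extend the primer.

    template_3to5 is the template read 3'->5', so the new strand grows 5'->3' along it.
    Returns (log, strand), where log is a list of (base_added, n_incorporated).
    """
    expected = [COMPLEMENT[b] for b in template_3to5]
    log, pos, cycle = [], 0, 0
    while pos < len(expected):
        base = order[cycle % len(order)]
        n = 0
        # An unblocked dNTP extends through every consecutive matching position.
        while pos < len(expected) and expected[pos] == base:
            pos += 1
            n += 1
        log.append((base, n))
        cycle += 1
    return log, "".join(expected)

log, strand = simulate_extensions("TCACGAGT")
# Render non-incorporating additions in parentheses, as on the slide.
trace = " ".join(base if n else f"({base})" for base, n in log)
print(trace)
print(strand)
```

For this template the trace has 14 additions and the synthesized strand is AGTGCTCA. Homopolymer runs would incorporate more than one base per addition, which is why the log records a count rather than a yes/no flag.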
Sequencing multiple polonies
Rob Mitra
Multiple Image Alignment
Metric based on optimal coincidence of high intensity noise pixels over a matrix of local offsets (0.4 pixel precision)
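The alignment metric can be illustrated with an exhaustive search over offsets. This is a simplified sketch under stated assumptions: images are reduced to sets of bright-pixel coordinates, a single global offset is searched instead of a matrix of local offsets, and only whole-pixel shifts are tried, so it does not reach the 0.4 pixel precision quoted above. `best_offset` and `max_shift` are hypothetical names.

```python
def best_offset(bright_a, bright_b, max_shift=3):
    """Return (dx, dy, score) for the shift of image B that best matches image A.

    bright_a, bright_b: sets of (x, y) coordinates of high-intensity pixels.
    score counts the bright pixels that coincide after shifting B by (dx, dy).
    """
    best = (0, 0, -1)
    for dx in range(-max_shift, max_shift + 1):
        for dy in range(-max_shift, max_shift + 1):
            shifted = {(x + dx, y + dy) for x, y in bright_b}
            score = len(bright_a & shifted)
            if score > best[2]:
                best = (dx, dy, score)
    return best

# Toy data: image B holds the same bright pixels as image A, displaced by (-2, +1).
image_a = {(5, 5), (9, 2), (12, 14), (3, 11), (7, 8)}
image_b = {(x - 2, y + 1) for x, y in image_a}
dx, dy, score = best_offset(image_a, image_b)
print(dx, dy, score)
```

Sub-pixel precision would need interpolation of the coincidence score between neighboring offsets, which this sketch leaves out.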
1 micron bead sequences
Correct signatures are pseudocolored red, white, yellow; noise signatures blue; and “guide” beads green.
Polony exclusion principle & Single pixel sequences
Mitra & Shendure
Alternatively Spliced Cell Adhesion Molecule
Specific variable exons are up-or-down-regulated in various cancers
Controversial prospective diagnostic / prognostic marker (>1000 papers)
Can full isoforms resolve controversy and/or act as superior markers?
Eph4 = murine mammary epithelial cell line
Eph4bDD = stable transfection of Eph4 with MEK-1 (tumorigenic)
[Diagram: CD44 with variable exons v1 v2 v3 v4 v5 v6 v7 v8 v9 v10; labels F, R, TMA]
CD44 Exon Combinatorics (Zhu & Shendure)
1. Search Signature Image for qualified ‘objects’
   a. > 50 connected pixels with same signature value
   b. ‘solidity’ of > 0.50
   c. long axis / short axis ratio < 3
   OR
   a. > 25 connected pixels with same signature value
   b. ‘solidity’ of > 0.80
   c. long axis / short axis ratio < 1.5
2. Search for internal regional maxima within each object (lest two adjacent polonies with same signature get counted as one)
3. Assign centroid locations as qualified individual ‘polonies’
Algorithm for RNA Polony Finding
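Step 1 of the algorithm above amounts to a two-branch filter on region shape. The sketch below encodes only that filter. It assumes the area, solidity and axis ratio of each connected region have already been measured, and that solidity means region area divided by convex-hull area (the usual image-analysis definition; the slide does not define it). The `Region` container is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Region:
    """Shape summary of one connected region (hypothetical container, not from the slides)."""
    area: int          # connected pixels sharing one signature value
    solidity: float    # assumed: region area / convex hull area
    axis_ratio: float  # long axis / short axis

def is_qualified(region):
    """Apply the two alternative sets of thresholds from step 1."""
    large = region.area > 50 and region.solidity > 0.50 and region.axis_ratio < 3
    small_compact = region.area > 25 and region.solidity > 0.80 and region.axis_ratio < 1.5
    return large or small_compact

regions = [
    Region(area=120, solidity=0.62, axis_ratio=2.1),  # passes the first rule set
    Region(area=30, solidity=0.91, axis_ratio=1.2),   # passes the second rule set
    Region(area=30, solidity=0.60, axis_ratio=1.2),   # too small for rule 1, too ragged for rule 2
    Region(area=200, solidity=0.70, axis_ratio=4.0),  # too elongated for either rule set
]
flags = [is_qualified(r) for r in regions]
print(flags)
```

Steps 2 and 3 (splitting objects at internal regional maxima and assigning centroids) are not covered here.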
[Images: RNA exon polony examples, exons V1 to V10]
[Images: RNA exon examples, auto-regridded & quantitated]
Jun Zhu
EXON PATTERN           Eph4  Eph4bDD  TOTAL   RATIO   P-V
------------7-8-9-10    609      764   1373    1.17   1E-4
--------------8-9-10    320      390    710    1.13   3E-2
----------6-7-8-9-10    431      251    682   -1.85   4E-18
------4-5-6-7-8-9-10    218      216    434   -1.08   2E-1
----------------9-10     68      143    211    1.96   7E-7
--------5-6-7-8-9-10     86       39    125   -2.37   2E-6
----3-4-5-6-7-8-9-10     40       56     96    1.30   9E-2
------4-5---7-8-9-10     16       74     90    4.30   2E-9
--2-3-4-5-6-7-8-9-10     44       28     72   -1.69   1E-2
1-2-3-4-5-6-7-8-9-10     22        5     27   -4.73   3E-4
--------5---7-8-9-10      5       19     24    3.53   3E-3
----3-4-5---7-8-9-10      1       15     16   13.95   4E-4
--2-3-4-5---7-8-9-10      1       10     11    9.30   5E-3
Eph4 = murine mammary epithelial cell line
Eph4bDD = stable transfection of Eph4 with MEK-1 (tumorigenic)
Summary of Counts (RNA isoforms)
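The RATIO column appears to be a signed fold change between the two cell lines after each count is divided by its sample total. That reading is an inference from the numbers, not something the slide states. The sketch below applies it to the 13 isoforms listed; because the totals here cover only those rows, the results agree with the table only to within a few hundredths. The method behind the P-V column is not reproduced.

```python
def signed_fold(count_a, count_b, total_a, total_b):
    """Fold change of sample B relative to sample A, each normalized by its total.

    Enrichment in B is returned as a positive fold; depletion as a negative fold.
    """
    frac_a = count_a / total_a
    frac_b = count_b / total_b
    ratio = frac_b / frac_a
    return ratio if ratio >= 1 else -1 / ratio

# (Eph4, Eph4bDD) polony counts for the 13 isoforms listed in the table.
counts = [(609, 764), (320, 390), (431, 251), (218, 216), (68, 143), (86, 39),
          (40, 56), (16, 74), (44, 28), (22, 5), (5, 19), (1, 15), (1, 10)]
total_eph4 = sum(a for a, _ in counts)
total_bdd = sum(b for _, b in counts)

for a, b in counts[:3]:
    print(a, b, round(signed_fold(a, b, total_eph4, total_bdd), 2))
```

The sign convention (negative for isoforms depleted in Eph4bDD) is chosen to match the table.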
1. Replica plating of DNA images [Mitra et al. NAR 1999]
2. Alternative RNA splicing combinatorics [Zhu et al. Science 2003]
3. Long range haplotyping [Mitra et al. PNAS 2003]
4. Precise SNP-mutant & mRNA ratios [Merritt et al. NAR 2003]
5. Fluorescent in situ Sequencing (FISSEQ) [Mitra et al. Anal. Biochem. 2003]
6. Tumor LOH [Butz et al BMC Biotech. 2003]
7. Polony models [Aach & Church, submitted to JTB 2003]
http://arep.med.harvard.edu/Polonator/
Polony Flavors
Link et al. 1997 Electrophoresis 18:1259-313 (Pub)
Comparison of predicted with observed protein properties (abundance, localization, postsynthetic modifications)
E. coli
[Diagram labels: (Optionally protein separation steps); 3rd, 2nd]
Multidimensional peptide measures
Numbers on top in basepairs. 1700 ORFs are predicted. Proteomic Model is based on mass spectrometry of peptides at 24h time points. DifferenceMap indicates new peptide regions. The 6 colors represent ORFs in the 6 reading frames. (Harvard-MIT GtL: Jaffe, Church, Lindell, Chisholm, et al.)
Prochlorococcus Proteogenomic Map
[Scatter plots with linear regression, axes labeled RNA (3 AM): R2 = .992, R2 = .635, R2 = .1]
(Harvard-MIT GtL: Jaffe, Church, Lindell, Chisholm, et al.)
Circadian time-series (Prochlorococcus) RNA & protein quantitation:
In vivo crosslinking DNA-binding proteins
Comparison of Quantification Methods
[Log-log scatter plot. x-axis: Fractional Composition (percent - total intensity all peptides), 0.0001 to 100; y-axis: Fractional Composition (percent), 0.001 to 100. Labeled proteins: dps, rpoc, rpob, hns, dbha, ssb, gyrb, ihfa, lon, ihfb, top1, uvra, crp, argr, nusa, hrpa, sspa, fur]
RNAs & Proteomics Integration: Next steps
1 Detect a higher fraction of peptides (currently ~ 80% proteins, 87% peptides max, 19% average)
2 Comparative proteomics (e.g. high vs. low light adapted)
3 Smoother time-series.
4 Degradation
Synthetic Biology
• Test or manipulate optimality
• Program minimal cells (100kbp)
• Nanobiotechnology - new polymers
• Manage complex systems e.g. stem cells & ocean ecology
Minimization of Metabolic Adjustment (MoMA) for the analysis of non-optimal metabolic phenotypes
Daniel Segre, Dennis Vitkup
Suboptimality of mutants --integrating growth rate & flux data
- Haemophilus influenzae metabolism (Schilling and Palsson, J.Theor.Biol. 2000)
- Escherichia coli metabolic network and gene deletions (Edwards and Palsson, PNAS 2000, BMC Bioinf. 2000)
- Helicobacter pylori (Edwards, Schilling, Covert, Church, Palsson, J. Bact 2002)
- Escherichia coli MOMA (Segre, Vitkup, & Church, PNAS 2003)
MoMA/FBA REFERENCES
[Diagram: metabolite Xi inside the Membrane, with fluxes Vtrans, Vsyn, Vdeg, Vgrowth]
Growth: c1 X1 + c2 X2 + ... + cm Xm → Biomass
Fluxes include transport, & a growth flux
Steady state: Xi = const., so Σj Sij vj = 0
[Plot: Biomass Composition. x-axis: metabolites (0 to 45); y-axis: coeff. in growth reaction (log scale, 10^-6 to 10^2). Labeled metabolites: ACCOA, COA, ATP, FAD, GLY, NADH, LEU, SUCCOA]
Null(S) = {v : Sv = 0}
[Figure axes: flux 1, flux 2]
Find max{Growth} using simplex
Flux Balance Analysis core
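A toy instance of the optimization above makes the idea concrete. Everything in this sketch is an assumption for illustration: a hypothetical network in which uptake (at most 10) makes metabolite A, reaction v1 converts A to B with yield 1 (capacity 6), reaction v2 converts A to B with yield 0.5 (capacity 10), and growth consumes B. Steady state then gives uptake = v1 + v2 and growth = v1 + 0.5*v2, leaving a linear program in two free fluxes. For brevity it is solved by brute-force vertex enumeration rather than the simplex method named on the slide; both return an optimal vertex for a problem this small.

```python
from itertools import combinations

def solve_lp_2d(objective, constraints):
    """Maximize objective . (x, y) subject to a*x + b*y <= c for each (a, b, c).

    Brute-force vertex enumeration: intersect every pair of constraint lines,
    keep the feasible intersections, and return (value, x, y) for the best one.
    """
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue  # parallel lines have no unique intersection
        x = (c1 * b2 - c2 * b1) / det + 0.0  # "+ 0.0" turns -0.0 into 0.0
        y = (a1 * c2 - a2 * c1) / det + 0.0
        if all(a * x + b * y <= c + 1e-9 for a, b, c in constraints):
            value = objective[0] * x + objective[1] * y
            if best is None or value > best[0]:
                best = (value, x, y)
    return best

# Free fluxes (v1, v2) of the hypothetical network described above.
constraints = [
    (-1.0, 0.0, 0.0),   # v1 >= 0
    (1.0, 0.0, 6.0),    # v1 <= 6
    (0.0, -1.0, 0.0),   # v2 >= 0
    (0.0, 1.0, 10.0),   # v2 <= 10
    (1.0, 1.0, 10.0),   # uptake = v1 + v2 <= 10
]
growth, v1, v2 = solve_lp_2d((1.0, 0.5), constraints)
print(growth, v1, v2)

# Knocking out v1 is one extra constraint, v1 <= 0.
ko_growth, ko_v1, ko_v2 = solve_lp_2d((1.0, 0.5), constraints + [(1.0, 0.0, 0.0)])
print(ko_growth, ko_v1, ko_v2)
```

In a genome-scale model the feasible set lives in hundreds of flux dimensions, so a real LP solver replaces the enumeration.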
Can we use flux analysis to say something
about suboptimal states ?
Flux ratios at each branch point yield optimal polymer composition for replication
x,y are two of the 100s of flux dimensions
Projection can leave the mutant feasible space… so Quadratic programming (QP) to find the nearest point
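The nearest-point idea can be shown on a problem small enough to solve by hand. The sketch assumes a hypothetical four-flux network (uptake, v1, v2, growth) whose wild-type optimum is (10, 6, 4, 8); none of these numbers come from the slides. Knocking out v1 leaves one degree of freedom, because steady state forces uptake = v2 and growth = 0.5*v2 with 0 <= v2 <= 10. The quadratic program then reduces to projecting the wild-type vector onto a line segment, which has a closed form.

```python
def nearest_point_on_segment(target, direction, t_min, t_max):
    """Minimize ||t * direction - target||^2 over t_min <= t <= t_max."""
    dot = sum(w * d for w, d in zip(target, direction))
    norm2 = sum(d * d for d in direction)
    # The objective is a convex quadratic in t, so clamping the
    # unconstrained minimizer to the interval gives the constrained one.
    t = min(max(dot / norm2, t_min), t_max)
    return [t * d for d in direction]

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

# Flux order: (uptake, v1, v2, growth). Assumed wild-type optimum of the toy network.
v_wt = [10.0, 6.0, 4.0, 8.0]
# Knockout feasible set: v1 = 0, uptake = v2, growth = 0.5 * v2, with 0 <= v2 <= 10.
ko_direction = [1.0, 0.0, 1.0, 0.5]

v_moma = nearest_point_on_segment(v_wt, ko_direction, 0.0, 10.0)
v_fba_ko = [10.0, 0.0, 10.0, 5.0]  # growth-maximal knockout state, for comparison

print("MoMA flux vector:", v_moma)
print("squared distance to wild type, MoMA vs. FBA:",
      squared_distance(v_moma, v_wt), squared_distance(v_fba_ko, v_wt))
```

In this toy case the minimal-adjustment state grows more slowly (4) than the growth-maximal knockout state (5), because it stays as close as possible to the wild-type fluxes instead of re-optimizing. A real model needs a general QP solver over the full mutant feasible space.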
12C/13C Flux Ratio Data
[Three scatter plots of Predicted Fluxes vs. Experimental Fluxes, points numbered 1 to 18]
WT (LP): ρ = 0.91, p = 8e-8
pyk (LP): ρ = -0.06, p = 6e-1
pyk (QP): ρ = 0.56, p = 7e-3
Flux Data C-0.09 limited
Flux data (MOMA & FBA)
Condition  Method     ρ1       p-val (a)   ρ2      p-val (c)
C-0.09     wt          0.91    8E-8
           ko (FBA)   -0.064   6E-1        -0.36   9E-1
           ko MoMA     0.56    7E-3         0.48   2E-2
C-0.4      wt          0.97    8E-12
           ko (FBA)    0.77    8E-5         0.36   7E-2
           ko MoMA     0.94    3E-9         0.74   2E-4
N-0.09     wt          0.78    7E-5
           ko (FBA)    0.86    3E-6         0.096  4E-1
           ko MoMA     0.73    3E-4         0.49   2E-2
p-val (b) and p-val (d) entries, in extraction order: 1E-2, 5E-2, 2E-4; 3E-3, 3E-3, 9E-2
Competitive growth data (χ2 p-values; models compared: MOMA, FBA)

Essential         142    80    62
Reduced growth     46    24    22
Non essential     299   119   180    p = 4x10^-3

Essential         162    96    66
Reduced growth     44    19    25
Non essential     281   108   173    p = 1x10^-5
Position effects Novel redundancies
On minimal media
negative small selection effect
Replication rate of a whole-genome set of mutants
Badarinarayana, et al. (2001) Nature Biotech.19: 1060
Replication rate challenge met: multiple homologous domains
[Diagram: probes along homologous domains of thrA (1 2 3), metL (1 2 3) and lysC (1 2); values shown: 1.1, 6.7, 1.8, 1.8, 10.4]
Selective disadvantage in minimal media
Multiple mutations per gene
Correlation between two selection experiments
Badarinarayana, et al. (2001) Nature Biotech.19: 1060
Synthetic Mini-genomes
• 90kbp genome? All 3D structures known.
• Comprehensive functional data too.
• 100X faster replication (10 sec doubling) & selection to evolve widgets & systems?
• Utility of mirror-image & other unnatural polymers.
• Chassis & power supply
A 90 kbp mini-genome
Columns: SP (3D), Stoichiometry, Mge#, Bp, Min, access#, Gene, L.end, R.end, orientation, len2, Sequence (first 20 nt shown)
Total 144 107 89,498 74,310 2853
16S 1 y 1418 1418 3968 rrsB 4164238 4165779 > 124 aaattgaagagtttgatcat...
23S 1 y 2903 2903 3970 rrlB 4166220 4169123 > 1 ggttaagcgactaagcgtac...
5S 1 120 120 3971 rrfB 4169216 4169335 > 0 tgcctggcggcagtagcgcg...
10sb (RNaseP) 375 375 3123 rnpB 3268233 3267857 < 2 gaagctgaccagacagtcgc...
tRNAs 20-46 y 3136 1364 3939 eg. gltT 4165951 4166026 > gtccccttcgtctagaggcc...
Cca (no) ? 1236 3056 cca 3199532 3200770 > 3 gtgaagatttatctggtcgg...
TrmA (22?) ? 1098 3965 trmA 4159749 4160849 < 3 atgacccccgaacaccttcc...
BstNBI (no) 1815 AF329098 1 1815 > 0 atggctaaaaaagttaattg...
Tri1 ? AP001918 traI 92673 97943 > atgatgagtattgcgcaggt...
Flp no 1272 NC_001398 5573 523 > 0 atgccacaatttggtatatt...
GFP no 717 AF302837 27 743 > 0 atgagtaaaggagaagaact...
Rnpa (36%) 357 357 3704 rnpA 3882122 3882481 > 3 gtggttaagctcgcatttcc...
BstPol multiprot 2631 2631 U93028 95 2728 > 3 atgagattgaagaaaaaact...
Rpol_Bpt7 multiprot 2649 2649 NC_001604 3171 5822 > 2 atgaacacgattaacatcgc...
EFTu 451 1179 1179 3339 tufA 3467782 3468966 < 6 gtgtctaaagaaaaatttga...
EFG (59%) 89 2109 2109 3340 fusA 3469037 3471151 < 6 atggctcgtacaacacccat...
EFTs 433 846 846 170 tsf 190857 191708 > 6 atggctgaaattaccgcatc...
EFP (no) 26 561 561 4147 efp 4373277 4373843 > 6 atggcaacgtactatagcaa...
IF1 173 213 213 884 infA 925448 925666 < 6 atggccaaagaagacaatat...
IF2 (25%) 142 2682 2682 3168 infB 3310983 3313655 < -9 atgacagatgtaacgattaa...
IF3 (~50%) 196 540 540 1718 infC 1798120 1798662 < 3 attaaaggcggaaaacgagt...
RF1 (no) 258 1080 1211 prfA 1264235 1265317 > 3 atgaagccttctatcgttgc...
RRF 435 555 555 172 frr 192872 193429 > 3 gtgattagcgatatcagaaa...
RL1 (~50%) 1 82 699 699 3984 rplA 4176457 4177161 > 6 atggctaaactgaccaagcg...
RL2 1 154 816 816 3317 rplB 3448180 3449001 < 6 atggcagttgttaaatgtaa...
The in vitro assembly (& 3D structure) of the prokaryotic ribosomes is known. (e.g. Nomura et al.; Noller et al.)
[Gel images, lanes M and 1 to 21: DNA Template; RNA Transcript]
All 30S-Ribosomal-protein DNAs & mRNAs synthesized in vitro
Tian & Church
His-tagged ribosomal proteins synthesized in vitro
RS-2, 4, 5, 6, 9, 10, 12, 13, 15, 16, 17, and 21 as original constructs.
RS1 required deletion of a feedback motif in the mRNA.
RS-3, 7, 8, 11, 14, 18, 19, 20 are still weakly expressed.
Note that S1, S4, S7, S8, S20, L1, L4, L10 are known to repress their own translation (and are likely titrated by rRNA).
In progress: Resynthesize all genes with less structure.
Tian & Church
David Goodsell
Biosystems Integrating Measures & Models
[Summary diagram labels]
DNA, RNA, Proteins, Metabolites, Environment, interactions
Microbes; Cancer & stem cells; In vitro replication; multicellular organisms
Polonies (CD44 & cancer)
MOMA: Darwinian (sub)optima
Arrays & Mass-spec (circadian & cell cycle)