Thanks to: DOE GtL DARPA BioComp PhRMA NHLBI 17-Sep-2003 Virtual Conference on Genomics & Bioinformatics BioSystems Synthesis: New optima demand new technologies

Dec 20, 2015

Page 1:

Thanks to:

DOE GtL

DARPA BioComp

PhRMA

NHLBI

17-Sep-2003 Virtual Conference on Genomics & Bioinformatics

BioSystems Synthesis: New optima demand new technologies

Page 2:

Harvard-MIT DOE GtL Center

Collaborating PIs: Chisholm, Polz, Church, Kolter, Ausubel, Lory, Kucherlapati

C.Ting

Page 3:

Improving Models & Measures

Why model?

“Killer Applications”: Share, Search, Merge, Check, Design

(e.g. sequence & 3D alignment)

Page 4:

DNA RNA Proteins

Metabolites

Replication rate

Environment

Biosystems Integrating Measures & Models

Microbes; Cancer & stem cells; Darwinian optima; In vitro replication; Small multicellular organisms

RNAi, Insertions, SNPs

interactions

Page 5:

The issue is not speed, but integration.
Cost per 99.99% bp: including reagents, personnel, equipment/5 yr, overhead/sq. m
• Sub-mm scale: 1 µm = femtoliter (10^-15)
• Instruments should match GHz / $2K CPU

Why improve measurements?

Human genomes: (6 billion)^2 = 10^19 bp
Immune & cancer genome changes: >10^10 bp per time point
RNA ends & splicing: in situ 10^12 bits/mm^3

Biodiversity: environmental & lab evolution
Compact storage: 10^5 now to 10^17 bits/mm^3 eventually

& How? ($1K per genome, 10^8-10^13 bits/$)

Page 6:

Examples of cost bottlenecks

Affymetrix $30M? microfabricator limited by chemical reaction rate to one set of chips per day. (~10000X CPU cost)

Electrophoresis limited to 4000 bp/capillary/day. Fixed cost ratio of capillaries to CPUs. (~1e9X CPU cost)

Page 7:

Projected costs determine when biosystems data overdetermination is feasible.

In 1984, pre-HGP (X, pBR322, etc.): 0.1 bp/$, which would have been $30B per human genome.

In 2002 (de novo full vs. resequencing), ABI/Perlegen/Lynx: $300M vs. $3M

10^3 bp/$ (4 log improvement)

Other data I/O (e.g. video): 10^13 bits/$
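The per-genome costs on this slide follow from dividing a genome size by the yield in bp/$. The short Python sketch below reproduces that arithmetic. It assumes a haploid human genome of about 3 billion bp, a figure not stated on the slide, and the function and variable names are hypothetical.

```python
import math

# Assumption: haploid human genome size in base pairs (not given on the slide).
HUMAN_GENOME_BP = 3e9


def genome_cost_usd(bp_per_dollar, genome_bp=HUMAN_GENOME_BP):
    """Cost in dollars to read one genome at a given yield in bp per dollar."""
    return genome_bp / bp_per_dollar


cost_1984 = genome_cost_usd(0.1)   # slide figure for 1984: 0.1 bp/$
cost_2002 = genome_cost_usd(1e3)   # slide figure for 2002: 10^3 bp/$

# Improvement between the two yields, in orders of magnitude ("logs").
log_improvement = math.log10(1e3 / 0.1)

print(f"1984: ${cost_1984:,.0f} per genome")
print(f"2002: ${cost_2002:,.0f} per genome")
print(f"improvement: {log_improvement:.0f} logs")
```

Under that assumption the two yields give about $30B and $3M per genome, which matches the 1984 and resequencing figures quoted on the slide.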

Page 8:

Steeper than exponential growth

[Figure, panel 1: bp/$ (log scale, 0.001 to 10000) versus year, 1970 to 2010.]

[Figure, panel 2: log(IPS/$K) and log(bits/sec transmit) versus year, 1830 to 2010. IPS = Instructions Per Second.]

Fits shown on the plots: R^2 = 0.985 and R^2 = 0.992.

1965: Moore's law of integrated circuits
1999: Kurzweil’s law

http://www.faughnan.com/poverty.html
http://www.kurzweilai.net/meme/frame.html?main=/articles/art0184.html

Page 9:

Why single molecules?

(1) Integrate from cells/genomes/RNAs to data

(2) Geometry, “cis-ness” on a molecule, complex, or cell, e.g. DNA haplotypes & RNA splice-forms

(3) Asynchronous dNTP incorporation

Page 10:

Polymerase colonies (Polonies) along a DNA or RNA molecule

HMS: Shendure, Zhu, Butty, Williams
Wash U: Mitra
Ambergen: Olejnik
U. Del: Edwards, Merritt

Page 11:

Polymerase colony (polony) PCR in a gel

[Figure: template strands with primer sites A, A’ and B. Panel labels: "Single Molecule From Library"; "1st Round of PCR"; "Primer is Extended by Polymerase".]

Primer A has 5’ immobilizing Acrydite

Mitra & Church Nucleic Acids Res. 27: e34

Page 12:

Sequence polonies by sequential, fluorescent single-base extensions

• Hybridize Universal Primer
• Add Red (Cy3) dTTP. Wash.
• Add Green (FITC) dCTP
• Wash; Scan

[Figure labels: B, B’, 3’, 5’, AGT., TC; B, B’, 3’, 5’, GCG.., C]

Page 13:

Inexpensive, off-the-shelf equipment

MJR in situ Cycler: $10K

Automated slide fluidics: $4K

Microarray Scanner: $26K-100K

Page 14:

Human Haplotype: CFTR gene

45 kbp

Rob Mitra, Vincent Butty, Jay Shendure, Ben Williams

Page 15:

Quantitative removal of Fluorophores

Rob Mitra

Page 16:

Sequencing multiple polonies

Template ST30: 3' TCACGAGT

Base added: (C) A G T (C) (A) G (T) C (A) (G) T C A

3' TCACGAGT
   AGTGCTCA

Rob Mitra
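The "Base added" row on this slide can be read as a fixed cycle of single nucleotides (C, A, G, T), where a base in parentheses is one that was offered but not incorporated. The Python sketch below simulates that reading for template ST30. The cycle order and the meaning of the parentheses are inferred from the slide rather than stated on it, and the function name is hypothetical.

```python
# Watson-Crick complements for DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def cyclic_extension(template_3to5, cycle_order="CAGT", max_cycles=100):
    """Simulate sequential single-base extension along a template.

    template_3to5: template written 3'->5', as on the slide.
    Returns (read, trace): the extended sequence and, per cycle, the base
    offered, wrapped in parentheses when nothing was incorporated.
    """
    # The growing strand must pair with the template base by base.
    target = [COMPLEMENT[b] for b in template_3to5]
    pos = 0
    read = []
    trace = []
    for i in range(max_cycles):
        if pos == len(target):
            break
        base = cycle_order[i % len(cycle_order)]
        added = 0
        # Assumption: a run of identical bases is filled in within one cycle.
        while pos < len(target) and target[pos] == base:
            pos += 1
            added += 1
        if added:
            read.append(base * added)
            trace.append(base * added)
        else:
            trace.append("(" + base + ")")
    return "".join(read), trace


read, trace = cyclic_extension("TCACGAGT")
print(" ".join(trace))
print(read)
```

With the ST30 template the simulated trace and read agree with the "Base added" row and the AGTGCTCA strand shown on the slide.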

Page 17:

Multiple Image Alignment

Metric based on optimal coincidence of high intensity noise pixels over a matrix of local offsets (0.4 pixel precision)
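A coincidence metric of this kind can be sketched as a search over candidate offsets, scoring each by how many high-intensity pixels land on top of each other. The Python example below is a simplified illustration with whole-pixel offsets and a single global offset. The slide's 0.4 pixel precision and its matrix of local offsets would need sub-pixel interpolation and per-region searches, which are not shown. All names are hypothetical.

```python
def best_offset(ref_bright, img_bright, max_shift=3):
    """Find the integer (dy, dx) shift of img that best overlaps ref.

    ref_bright, img_bright: sets of (row, col) coordinates of
    high-intensity pixels in the reference and in the image to align.
    Returns ((dy, dx), score), where score counts coincident pixels.
    """
    best = (0, 0)
    best_score = -1
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = {(r + dy, c + dx) for r, c in img_bright}
            score = len(shifted & ref_bright)
            if score > best_score:
                best_score = score
                best = (dy, dx)
    return best, best_score


# Toy data: the second image is the reference displaced by (-2, +1).
reference = {(5, 5), (7, 2), (1, 9), (3, 3)}
displaced = {(r - 2, c + 1) for r, c in reference}

offset, score = best_offset(reference, displaced)
print(offset, score)
```

In the toy data the search recovers the shift (2, -1) that maps the displaced image back onto the reference, with all four bright pixels coinciding.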

Page 18:

1 micron bead sequences

Correct signatures are pseudocolored red, white, and yellow; noise signatures blue; and “guide” beads green.

Page 19:

Polony exclusion principle & Single pixel sequences

Mitra & Shendure

Page 20:

DNA RNA Proteins

Metabolites

Replication rate

Environment

Biosystems Integrating Measures & Models

Microbes; Cancer & stem cells; Darwinian optima; In vitro replication; Small multicellular organisms

RNAi, Insertions, SNPs

interactions

Page 21:

Alternatively Spliced Cell Adhesion Molecule

Specific variable exons are up-or-down-regulated in various cancers

Controversial prospective diagnostic / prognostic marker (>1000 papers)

Can full isoforms resolve controversy and/or act as superior markers?

Eph4 = murine mammary epithelial cell line

Eph4bDD = stable transfection of Eph4 with MEK-1 (tumorigenic)

F R

v1 v2 v3 v4 v5 v6 v7 v8 v9 v10

TMA

CD44

CD44 Exon Combinatorics (Zhu & Shendure)

Page 22:

1. Search Signature Image for qualified ‘objects’

a. > 50 connected pixels with same signature value
b. ‘solidity’ of > 0.50
c. long axis / short axis ratio < 3

OR

a. > 25 connected pixels with same signature value
b. ‘solidity’ of > 0.80
c. long axis / short axis ratio < 1.5

2. Search for internal regional maxima within each object (lest two adjacent polonies with same signature get counted as one)

3. Assign centroid locations as qualified individual ‘polonies’

Algorithm for RNA Polony Finding
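The two alternative criteria in step 1 amount to a simple predicate on three measurements of each connected object. The Python sketch below encodes only that predicate, using the thresholds listed on the slide. It assumes the pixel count, solidity and axis ratio have already been measured by some image-analysis step that is not shown, and the function name is hypothetical.

```python
def is_qualified_object(n_pixels, solidity, axis_ratio):
    """Return True if an object passes either set of step-1 criteria.

    n_pixels:   connected pixels sharing the same signature value
    solidity:   object area divided by the area of its convex hull
    axis_ratio: long axis divided by short axis
    """
    # First rule: larger objects may be less solid and more elongated.
    large = n_pixels > 50 and solidity > 0.50 and axis_ratio < 3
    # Second rule: smaller objects must be more solid and rounder.
    compact = n_pixels > 25 and solidity > 0.80 and axis_ratio < 1.5
    return large or compact


candidates = [
    (60, 0.60, 2.0),   # passes the first rule
    (30, 0.90, 1.2),   # passes the second rule
    (30, 0.60, 1.2),   # too small for rule one, not solid enough for rule two
    (10, 0.95, 1.0),   # too few pixels for either rule
]
results = [is_qualified_object(*c) for c in candidates]
print(results)
```

Steps 2 and 3 (splitting objects at internal regional maxima and assigning centroids) would follow on the objects that pass this filter.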

Page 23:

RNA exon

polony examples

Page 24:

RNA exon examples, auto-regridded & quantitated

[Figure: image rows labeled V1 through V10]

Page 25:

Jun Zhu

EXON PATTERN          Eph4  Eph4bDD  TOTAL  Eph4 FRATIO  LSTP-PV
------------7-8-9-10   609      764   1373         1.17     1E-4
--------------8-9-10   320      390    710         1.13     3E-2
----------6-7-8-9-10   431      251    682        -1.85    4E-18
------4-5-6-7-8-9-10   218      216    434        -1.08     2E-1
----------------9-10    68      143    211         1.96     7E-7
--------5-6-7-8-9-10    86       39    125        -2.37     2E-6
----3-4-5-6-7-8-9-10    40       56     96         1.30     9E-2
------4-5---7-8-9-10    16       74     90         4.30     2E-9
--2-3-4-5-6-7-8-9-10    44       28     72        -1.69     1E-2
1-2-3-4-5-6-7-8-9-10    22        5     27        -4.73     3E-4
--------5---7-8-9-10     5       19     24         3.53     3E-3
----3-4-5---7-8-9-10     1       15     16        13.95     4E-4
--2-3-4-5---7-8-9-10     1       10     11         9.30     5E-3

Eph4 = murine mammary epithelial cell line

Eph4bDD = stable transfection of Eph4 with MEK-1 (tumorigenic)

Summary of Counts (RNA isoforms)

Page 26:

1. Replica plating of DNA images [Mitra et al. NAR 1999]

2. Alternative RNA splicing combinatorics [Zhu et al. Science 2003]

3. Long range haplotyping [Mitra et al. PNAS 2003]

4. Precise SNP-mutant & mRNA ratios [Merritt et al. NAR 2003]

5. Fluorescent in situ Sequencing (FISSEQ) [Mitra et al. Anal. Biochem. 2003]

6. Tumor LOH [Butz et al BMC Biotech. 2003]

7. Polony models [Aach & Church, submitted to JTB 2003]

http://arep.med.harvard.edu/Polonator/

Polony Flavors

Page 27:

DNA RNA Proteins

Metabolites

Replication rate

Environment

Biosystems Integrating Measures & Models

Microbes; Cancer & stem cells; Darwinian optima; In vitro replication; Small multicellular organisms

RNAi, Insertions, SNPs

interactions

Page 28:

Link et al. 1997 Electrophoresis 18:1259-313 (Pub)

Comparison of predicted with observed protein properties (abundance, localization, postsynthetic modifications)

E.coli

Page 29:

(Optionally protein separation steps)

3rd 2nd

Multidimensional peptide measures

Page 30:

Numbers on top are in base pairs. 1700 ORFs are predicted. The Proteomic Model is based on mass spectrometry of peptides at 24h time points. The Difference Map indicates new peptide regions. The 6 colors represent ORFs in the 6 reading frames. (Harvard-MIT GtL: Jaffe, Church, Lindell, Chisholm, et al.)

Prochlorococcus Proteogenomic Map

Page 31:

Circadian time-series (Prochlorococcus) RNA & protein quantitation:

[Figure: scatter plots with axes labeled RNA (3 AM); linear regression fits R^2 = .992, R^2 = .635, R^2 = .1]

(Harvard-MIT GtL: Jaffe, Church, Lindell, Chisholm, et al.)

Page 32:

In vivo crosslinking DNA-binding proteins

Comparison of Quantification Methods

[Figure: log-log scatter plot. x-axis: Fractional Composition (percent - total intensity all peptides), 0.0001 to 100. y-axis: Fractional Composition (percent), 0.001 to 100. Labeled proteins: dps, rpoc, rpob, hns, dbha, ssb, gyrb, ihfa, lon, ihfb, top1, uvra, crp, argr, nusa, hrpa, sspa, fur.]

Page 33:

RNAs & Proteomics Integration: Next steps

1. Detect a higher fraction of peptides (currently ~80% of proteins; 87% of peptides max, 19% average)
2. Comparative proteomics (e.g. high- vs. low-light adapted)
3. Smoother time-series
4. Degradation

Page 34:

DNA RNA Proteins

Metabolites

Replication rate

Environment

Biosystems Integrating Measures & Models

Microbes; Cancer & stem cells; Darwinian optima; In vitro replication; Small multicellular organisms

RNAi, Insertions, SNPs

interactions

Page 35:

Synthetic Biology

• Test or manipulate optimality
• Program minimal cells (100 kbp)
• Nanobiotechnology - new polymers
• Manage complex systems, e.g. stem cells & ocean ecology

Page 36:

Minimization of Metabolic Adjustment (MoMA) for the analysis of non-optimal metabolic phenotypes

Daniel Segre, Dennis Vitkup

Suboptimality of mutants --integrating growth rate & flux data

Page 37:

- Haemophilus influenzae metabolism (Schilling and Palsson, J. Theor. Biol. 2000)

- Escherichia coli metabolic network and gene deletions (Edwards and Palsson, PNAS 2000, BMC Bioinf. 2000)

- Helicobacter pylori (Edwards, Schilling, Covert, Church, Palsson, J. Bact 2002)

- Escherichia coli MOMA (Segre, Vitkup, & Church, PNAS 2003)

MoMA/FBA REFERENCES

Page 38:
Page 39:

[Figure: metabolite Xi inside a membrane, with fluxes Vtrans, Vsyn, Vdeg and Vgrowth]

Growth: c1X1 + c2X2 + ... + cmXm → Biomass

Fluxes include transport, & a growth flux

Xi = const.

vj = 0

Page 40:

Biomass Composition

[Figure: coefficient in the growth reaction (log scale, 10^-6 to 10^2) plotted for metabolites indexed 0 to 45; labeled points include ACCOA, COA, ATP, FAD, GLY, NADH, LEU, SUCCOA.]

Page 41:

Flux Balance Analysis core

1. Null(S) = {v : Sv = 0}

2. Find max{Growth} using simplex
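The two steps above can be written as one linear program. The statement below is a sketch in the usual flux-balance form: S is the stoichiometric matrix and v the flux vector, as on the slide, while the per-flux bounds \alpha_j and \beta_j are an assumption added here to keep the problem bounded and do not appear on the slide.

```latex
\begin{aligned}
\max_{v}\quad & v_{\mathrm{growth}} \\
\text{subject to}\quad & S\,v = 0, \\
& \alpha_j \le v_j \le \beta_j \quad \text{for every flux } j .
\end{aligned}
```

Because both the objective and the constraints are linear in v, a simplex solver applies directly.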

Page 42:

Can we use flux analysis to say something about suboptimal states?

Page 43:

Flux ratios at each branch point yield optimal polymer composition for replication

x,y are two of the 100s of flux dimensions

Page 44:

Projection can leave the mutant feasible space… so Quadratic programming (QP) to find the nearest point
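One way to state the quadratic program is as a distance minimization from the wild-type flux vector to the mutant's feasible space. The sketch below reads "nearest" as Euclidean distance, which is an assumption here. The symbol v^{wt} for the wild-type flux vector and the index k for fluxes removed by the mutation are names introduced for illustration.

```latex
\begin{aligned}
\min_{v}\quad & \lVert v - v^{\mathrm{wt}} \rVert^{2}
  = \sum_{j} \bigl(v_j - v_j^{\mathrm{wt}}\bigr)^{2} \\
\text{subject to}\quad & S\,v = 0, \\
& v_k = 0 \quad \text{for each flux } k \text{ removed by the mutation.}
\end{aligned}
```

The objective is quadratic while the constraints stay linear, so the nearest point is found inside the feasible space rather than projected outside it.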

Page 45:

12C, 13C

Flux Ratio Data

Page 46:

Flux Data C009-limited

[Figure: three scatter plots of Predicted Fluxes versus Experimental Fluxes, with points numbered 1 to 18.]

WT (LP): correlation = 0.91, p = 8e-8
pyk (LP): correlation = -0.06, p = 6e-1
pyk (QP): correlation = 0.56, p = 7e-3

Page 47:

Flux data (MOMA & FBA)

Condition  Method     corr. 1  p-val (a)  corr. 2  p-val (c)
C-0.09     wt           0.91     8E-8
C-0.09     ko (FBA)   -0.064     6E-1      -0.36     9E-1
C-0.09     ko MoMA      0.56     7E-3       0.48     2E-2
C-0.4      wt           0.97     8E-12
C-0.4      ko (FBA)     0.77     8E-5       0.36     7E-2
C-0.4      ko MoMA      0.94     3E-9       0.74     2E-4
N-0.09     wt           0.78     7E-5
N-0.09     ko (FBA)     0.86     3E-6       0.096    4E-1
N-0.09     ko MoMA      0.73     3E-4       0.49     2E-2

p-val (b) and p-val (d) entries (row alignment not recoverable): 1E-2, 5E-2, 2E-4, 3E-3, 3E-3, 9E-2

Page 48:

Competitive growth data

Essential        142   80   62
Reduced growth    46   24   22
Non essential    299  119  180    p = 4∙10^-3

Essential        162   96   66
Reduced growth    44   19   25
Non essential    281  108  173    p = 10^-5

MOMA

FBA

χ^2 p-values: 4x10^-3, 1x10^-5

Position effects

Novel redundancies

On minimal media

negative

small selection effect

Page 49:

Replication rate of a whole-genome set of mutants

Badarinarayana, et al. (2001) Nature Biotech.19: 1060

Page 50:

Replication rate challenge met: multiple homologous domains

[Figure: probes against homologous domains 1, 2, 3 of thrA and metL and domains 1, 2 of lysC; values shown: 1.1, 6.7, 1.8, 1.8, 10.4]

Selective disadvantage in minimal media

Page 51:

Multiple mutations per gene

Correlation between two selection experiments

Badarinarayana, et al. (2001) Nature Biotech.19: 1060

Page 52:

Synthetic Mini-genomes
• 90 kbp genome? All 3D structures known.
• Comprehensive functional data too.
• 100X faster replication (10 sec doubling) & selection to evolve widgets & systems?
• Utility of mirror-image & other unnatural polymers.
• Chassis & power supply

Page 53:

A 90 kbp mini-genomeSP (3D) StochimetryMge# Bp Min access# Gene L.end R.endorientationlen2 SequenceTotal 144 107 89,498 74,310 285316S 1 y 1418 1418 3968 rrsB 4164238 4165779 > 124 aaattgaagagtttgatcatggctcagattgaacgctggcggcaggcctaacacatgcaagtcgaacggtaacaggaagaagcttgcttctttgctgacgagtggcggacgggtgagtaatgtctgggaaactgcctgatggagggggataactactggaaacggtagctaataccgcataacgtcgcaagaccaaagagggggaccttcgggcctcttgccatcggatgtgcccagatgggattagctagtagg23S 1 y 2903 2903 3970 rrlB 4166220 4169123 > 1 ggttaagcgactaagcgtacacggtggatgccctggcagtcagaggcgatgaaggacgtgctaatctgcgataagcgtcggtaaggtgatatgaaccgttataaccggcgatttccgaatggggaaacccagtgtgtttcgacacactatcattaactgaatccataggttaatgaggcgaaccgggggaactgaaacatctaagtaccccgaggaaaagaaatcaaccgagattcccccagtagcggcgagcga5S 1 120 120 3971 rrfB 4169216 4169335 > 0 tgcctggcggcagtagcgcggtggtcccacctgaccccatgccgaactcagaagtgaaacgccgtagcgccgatggtagtgtggggtctccccatgcgagagtagggaactgccaggcat10sb (RNaseP) 375 375 3123 rnpB 3268233 3267857 < 2 gaagctgaccagacagtcgccgcttcgtcgtcgtcctcttcgggggagacgggcggaggggaggaaagtccgggctccatagggcagggtgccaggtaacgcctgggggggaaacccacgaccagtgcaacagagagcaaaccgccgatggcccgcgcaagcgggatcaggtaagggtgaaagggtgcggtaagagcgcaccgcgcggctggtaacagtccgtggcacggtaaactccacccggagcaaggccaatRNAs 20-46 y 3136 1364 3939 eg. gltT 4165951 4166026 > gtccccttcgtctagaggcccaggacaccgccctttcacggcggtaacaggggttcgaatcccctaggggacgccaCca (no) ? 1236 3056 cca 3199532 3200770 > 3 gtgaagatttatctggtcggtggtgctgttcgggatgcattgttagggctaccggtcaaagacagagattgggtggtggtcggcagtacgccacaggagatgctcgacgcgggctaccagcaggtaggccgcgattttcctgtgtttctgcatccgcaaacgcatgaagagtatgcgctggcacgtaccgaacggaaatccggttccggttacaccggttttacttgctatgccgcaccggatgtcacgctggaaTrmA (22?) ? 
1098 3965 trmA 4159749 4160849 < 3 atgacccccgaacaccttccaacagaacagtatgaagcgcagttagccgaaaaagtggtacgtttgcaaagtatgatggcaccgttttctgacctggttccggaagtgtttcgctcgccggtcagtcattaccggatgcgcgcggagttccgcatctggcacgatggcgatgacctgtatcacatcattttcgatcaacaaaccaaaagccgcatccgcgtggatagcttccccgccgccagtgaacttatcaacBstNBI (no) 1815 AF329098 1 1815 > 0 atggctaaaaaagttaattggtatgtttcttgttcacctagaagtccagaaaaaattcagcctgagttaaaagtactagcaaattttgagggaagttattggaaaggggtaaaagggtataaagcacaagaggcatttgctaaagaacttgctgctttaccacaattcttaggtactacttataaaaaagaagctgcattttctactcgagacagagtggcaccaatgaaaacttatggtttcgtatttgtagatTri1 ? AP001918 traI 92673 97943 > atgatgagtattgcgcaggtcagatcggccggaagtgccgggaactattataccgacaaggataattactatgtgctgggcagcatgggagaacgctgggccggcaggggggctgaacagctggggctgcagggcagtgtcgataaggatgtttttacccgtcttctggagggcaggctgccggacggagcggatctaagccgcatgcaggatggcagtaacaggcatcgtcccggctacgatctgaccttctccFlp no 1272 NC_001398 5573 523 > 0 atgccacaatttggtatattatgtaaaacaccacctaaggtgcttgttcgtcagtttgtggaaaggtttgaaagaccttcaggtgagaaaatagcattatgtgctgctgaactaacctatttatgttggatgattacacataacggaacagcaatcaagagagccacattcatgagctataatactatcataagcaattcgctgagtttcgatattgtcaataaatcactccagtttaaatacaagacgcaaaaaGFP no 717 AF302837 27 743 > 0 atgagtaaaggagaagaacttttcactggagttgtcccaattcttgttgaattagatggcgatgttaatgggcaaaaattctctgtcagtggagagggtgaaggtgatgcaacatacggaaaacttacccttaaatttatttgcactactgggaagctacctgttccatggccaacacttgtcactactttcgcgtatggtcttcaatgctttgcgagatacccagatcatatgaaacagcatgactttttcaagRnpa (36%) 357 357 3704 rnpA 3882122 3882481 > 3 gtggttaagctcgcatttcccagggagttacgcttgttaactcccagtcaattcacattcgtcttccagcagccacaacgggctggcacgccgcaaattaccattctcggccgcctgaattcgctggggcatccccgtatcggtcttacagtcgccaagaaaaacgttcgacgcgcccatgaacgcaatcggattaaacgtctgacgcgtgaaagcttccgtctgcgccaacatgaactcccggctatggatttcBstPol multiprot 2631 2631 U93028 95 2728 > 3 
atgagattgaagaaaaaactcgtcttaattgatggcaacagtgtggcataccgcgccttttttgccttgccacttttgcataacgacaaaggcattcatacgaatgcggtttacgggtttacgatgatgttgaacaaaattttggcggaagaacaaccgacccatttacttgtagcgtttgacgccggaaaaacgacgttccggcatgaaacgtttcaagagtataaaggcggacggcaacaaacgcccccggaaRpol_Bpt7 multiprot 2649 2649 NC_001604 3171 5822 > 2 atgaacacgattaacatcgctaagaacgacttctctgacatcgaactggctgctatcccgttcaacactctggctgaccattacggtgagcgtttagctcgcgaacagttggcccttgagcatgagtcttacgagatgggtgaagcacgcttccgcaagatgtttgagcgtcaacttaaagctggtgaggttgcggataacgctgccgccaagcctctcatcactaccctactccctaagatgattgcacgcatcEFTu 451 1179 1179 3339 tufA 3467782 3468966 < 6 gtgtctaaagaaaaatttgaacgtacaaaaccgcacgttaacgttggtactatcggccacgttgaccacggtaaaactactctgaccgctgcaatcaccaccgtactggctaaaacctacggcggtgctgctcgtgcattcgaccagatcgataacgcgccggaagaaaaagctcgtggtatcaccatcaacacttctcacgttgaatacgacaccccgacccgtcactacgcacacgtagactgcccggggcacEFG (59%) 89 2109 2109 3340 fusA 3469037 3471151 < 6 atggctcgtacaacacccatcgcacgctaccgtaacatcggtatcagtgcgcacatcgacgccggtaaaaccactactaccgaacgtattctgttctacaccggtgtaaaccataaaatcggtgaagttcatgacggcgctgcaaccatggactggatggagcaggagcaggaacgtggtattaccatcacttccgctgcgactactgcattctggtctggtatggctaagcagtatgagccgcatcgcatcaacEFTs 433 846 846 170 tsf 190857 191708 > 6 atggctgaaattaccgcatccctggtaaaagagctgcgtgagcgtactggcgcaggcatgatggattgcaaaaaagcactgactgaagctaacggcgacatcgagctggcaatcgaaaacatgcgtaagtccggtgctattaaagcagcgaaaaaagcaggcaacgttgctgctgacggcgtgatcaaaaccaaaatcgacggcaactacggcatcattctggaagttaactgccagactgacttcgttgcaaaaEFP (no) 26 561 561 4147 efp 4373277 4373843 > 6 atggcaacgtactatagcaacgattttcgtgctggtcttaaaatcatgttagacggcgaaccttacgcggttgaagcgagtgaattcgtaaaaccgggtaaaggccaggcatttgctcgcgttaaactgcgtcgtctgctgaccggtactcgcgtagaaaaaaccttcaaatctactgattccgctgaaggcgctgatgttgtcgatatgaacctgacttacctgtacaacgacggtgagttctggcacttcatgIF1 173 213 213 884 infA 925448 925666 < 6 
atggccaaagaagacaatattgaaatgcaaggtaccgttcttgaaacgttgcctaataccatgttccgcgtagagttagaaaacggtcacgtggttactgcacacatctccggtaaaatgcgcaaaaactacatccgcatcctgacgggcgacaaagtgactgttgaactgaccccgtacgacctgagcaaaggccgcattgtcttccgtagtcgctgaIF2 (25%) 142 2682 2682 3168 infB 3310983 3313655 < -9 atgacagatgtaacgattaaaacgctggccgcagagcgacagacctccgtggaacgcctggtacagcaatttgctgatgcaggtatccggaagtctgctgacgactctgtgtctgcacaagagaaacagactttgattgaccacctgaatcagaaaaattcaggcccggacaaattgacgctgcaacgtaaaacacgcagcacccttaacattcctggtaccggtggaaaaagcaaatcggtacaaatcgaagtcIF3 (~50%) 196 540 540 1718 infC 1798120 1798662 < 3 attaaaggcggaaaacgagttcaaacggcgcgccctaaccgtatcaatggcgaaattcgcgcccaggaagttcgcttaacaggtctggaaggcgagcagcttggtattgtgagtctgagagaagctctggagaaagcagaagaagccggagtagacttagtcgagatcagccctaacgccgagccgccggtttgtcgtataatggattacggcaaattcctctatgaaaagagcaagtcttctaaggaacagaagRF1 (no) 258 1080 1211 prfA 1264235 1265317 > 3 atgaagccttctatcgttgccaaactggaagccctgcatgaacgccatgaagaagttcaggcgttgctgggtgacgcgcaaactatcgccgaccaggaacgttttcgcgcattatcacgcgaatatgcgcagttaagtgatgtttcgcgctgttttaccgactggcaacaggttcaggaagatatcgaaaccgcacagatgatgctcgatgatcctgaaatgcgtgagatggcgcaggatgaactgcgcgaagctRRF 435 555 555 172 frr 192872 193429 > 3 gtgattagcgatatcagaaaagatgctgaagtacgcatggacaaatgcgtagaagcgttcaaaacccaaatcagcaaaatacgcacgggtcgtgcttctcccagcctgctggatggcattgtcgtggaatattacggcacgccgacgccgctgcgtcagctggcaagcgtaacggtagaagattcccgtacactgaaaatcaacgtgtttgatcgttcaatgtctccggccgttgaaaaagcgattatggcgtccRL1 (~50%) 1 82 699 699 3984 rplA 4176457 4177161 > 6 atggctaaactgaccaagcgcatgcgtgttatccgcgagaaagttgatgcaaccaaacagtacgacatcaacgaagctatcgcactgctgaaagagctggcgactgctaaattcgtagaaagcgtggacgtagctgttaacctcggcatcgacgctcgtaaatctgaccagaacgtacgtggtgcaactgtactgccgcacggtactggccgttccgttcgcgtagccgtatttacccaaggtgcaaacgctgaaRL2 1 154 816 816 3317 rplB 3448180 3449001 < 6 
atggcagttgttaaatgtaaaccgacatctccgggtcgtcgccacgtagttaaagtggttaaccctgagctgcacaagggcaaaccttttgctccgttgctggaaaaaaacagcaaatccggtggtcgtaacaacaatggccgtatcaccactcgtcatatcggtggtggccacaagcaggcttaccgtattgttgacttcaaacgcaacaaagacggtatcccggcagttgttgaacgtcttgagtacgatccg

Page 54:

The in vitro assembly (& 3D structure) of the prokaryotic ribosomes is known. (e.g. Nomura et al.; Noller et al.)

Page 55:

M 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21

DNA Template

RNA Transcript

All 30S-Ribosomal-protein DNAs & mRNAs synthesized in vitro

Tian & Church

Page 56:

His-tagged ribosomal proteins synthesized in vitro

RS-2, 4, 5, 6, 9, 10, 12, 13, 15, 16, 17, and 21 as original constructs.

RS1 required deletion of a feedback motif in the mRNA.

RS-3, 7, 8, 11, 14, 18, 19, 20 are still weakly expressed.

Note that S1, S4, S7, S8, S20, L1, L4, L10 are known to repress their own translation (and are likely titrated by rRNA).

In progress: Resynthesize all genes with less structure.

Tian & Church

Page 57:

David Goodsell

Page 58:

DNA RNA Proteins

Metabolites

Environment

Biosystems Integrating Measures & Models

Microbes; Cancer & stem cells; In vitro replication; multicellular organisms

interactions

Polonies (CD44 & cancer)

MOMA, Darwinian (sub)optima

Arrays & Mass-spec (circadian & cell cycle)