Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya

Dec 28, 2015

Transcript
Page 1: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Molecular Computing: Challenges across the two tracks in Theoretical Computer Science

Masami Hagiya

Page 2: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Outline

• Japanese Molecular Computer Project
  – Adleman-Lipton Paradigm and Improvements
    • Suyama’s Dynamic Programming DNA Computer
  – Autonomous Molecular Computing
    • Sakamoto’s Hairpin Engines
• Analysis of Computational Power of Molecules
    • Complexity of Molecular Computation
    • Molecular Computation as Randomized Algorithm
• Towards New Computational Paradigms
    • Molecular, Chemical, Cell, and Amorphous Computing
    • Importance of Engineering Viewpoint --- Programming

Page 3: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

JSPS Project on Molecular Computing

• Project Leader - Masami Hagiya (Computer Science)
• Members
  – Takashi Yokomori (Computer Science)
  – Masayuki Yamamura (Computer Science)
  – Masanori Arita (Genome Informatics)
  – Akira Suyama (Biophysics)
  – Yuzuru Husimi (Biophysics)
  – Kensaku Sakamoto (Biochemistry)
  – Shigeyuki Yokoyama (Biochemistry)
• October 1996 - March 2001
• Funded by Japan Society for the Promotion of Science
  – Research for the Future Program

Page 4: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Goals of Molecular Computing

• Analyses and Applications of Computational Power of Biomolecules
  – Understanding Life from the Viewpoint of Computation
    • computational mystery of life
      – Life is computationally very efficient.
  – Engineering Applications (not restricted to computation)
    • combinatorial optimization
    • (computationally inspired) biotechnology
    • nanotechnology, nanomachine
    • cryptography
    • medical and pharmaceutical applications in the future
• New Computational Model, New Simulation Technology

Page 5: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Related Fields

• Genome Informatics
  – applying computer science techniques to analyze genomic information
  – part of the human genome project
  – the other way round
    • But genome informatics is a good application area for molecular computing.
• Quantum Computing
  – massively parallel computation by quantum superposition
• Artificial Life
• Artificial Molecular Evolution

Page 6: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Major Achievements of the Project

• Suyama’s Dynamic Programming DNA Computers
  – reduction of molecules by breadth-first search
  – automation by robots
• Sakamoto’s Hairpin Engines
  – Whiplash PCR and SAT Engine
  – molecular computation by hairpin formation
  – autonomous molecular computation
• Theoretical Studies by Yokomori’s Group
• Nishikawa’s Simulator for DNA computations
• Arita’s New Tool for Code Design
• Husimi’s 3SR-Based Evolutionary Reactor
• Yamamura’s Aqueous Computing (with Head)

Page 7: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Dynamic Programming DNA Computers

Page 8: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Adleman-Lipton Paradigm

• Adleman (Science 1994)
  – Solving Hamiltonian Path Problem by DNA
• Lipton, et al.
  – Solving SAT Problem by DNA
• Massively Parallel Computation by Molecules
  – Mainly for Combinatorial Optimization
  – Random Generation by Self-Assembly
    • solution candidate = DNA molecule
  – Selection by Molecular Biology Experiments

Scaling Up ⇒ Efforts to increase yields and reduce errors

Robot and Chemical IC

Page 9: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

cf. Hamiltonian Path Problem by Adleman

Page 10: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Suyama’s Dynamic Programming DNA Computer

• “counting” (Ogihara and Ray)
  – O(2^(0.4n)) molecules for n-variable 3-SAT
• “dynamic programming” (Suyama)
• Iteration of Generation and Selection
  – generation of candidates of partial solutions
  – selection of partial solutions
• The order of computational complexity does not decrease, but the amount of necessary molecules is drastically reduced.
  – 3-SAT

Page 11: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

DP algorithm for 3CNF-SAT on DNA Computers

function dna3sat(u_1, v_1, w_1, ..., u_m, v_m, w_m)
begin
  T_2 = {X_1^T X_2^T, X_1^F X_2^T, X_1^T X_2^F, X_1^F X_2^F};
  for k = 3 to n do
    amplify(T_{k-1}, T_w^T, T_w^F);
    for j = 1 to m do
      if w_j = x_k then
        T_w^F = getuvsat(T_w^F, u_j, v_j);
      end
      if w_j = ¬x_k then
        T_w^T = getuvsat(T_w^T, u_j, v_j);
      end
    end
    T^T = append(T_w^T, X_k^T, X_{k-1}^{T/F});
    T^F = append(T_w^F, X_k^F, X_{k-1}^{T/F});
    T_k = merge(T^T, T^F);
  end
  return detect(T_n);
end

function getuvsat(T, u, v)
begin
  T_u^T = get(T, +X_u^T);
  T_u^F' = get(T, -X_u^T);
  T_u^F = get(T_u^F', +X_u^F);   /* can be omitted */
  T_v^T = get(T_u^F, +X_v^T);
  T^T = merge(T_u^T, T_v^T);
  return T^T;
end

Number of operations: amplify (n-2), append 2(n-2), merge (n-2) + m, get 3m
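The generate-and-select loop on this slide can be tried out in software. The following is a minimal sketch, assuming test tubes are modelled as Python sets of partial assignments and literals as signed integers (+i for x_i, -i for its negation). The function names mirror the slide's operations, but the encoding, the operation tally, and the example instance are illustrative assumptions rather than part of the original material.

```python
from collections import Counter
from itertools import product


def lit_true(assignment, lit):
    """Return True if the assignment (tuple of bools, x1 first) makes lit true.

    A literal is a nonzero int: +i stands for x_i, -i for its negation.
    """
    value = assignment[abs(lit) - 1]
    return value if lit > 0 else not value


def dna3sat(n, clauses):
    """Simulate the breadth-first 3-SAT procedure on sets instead of test tubes.

    Assumes n >= 2 and that in every clause (u, v, w) the third literal w
    has the largest variable index, which must be at least 3.
    Returns the final tube and a tally of the simulated operations.
    """
    ops = Counter()

    def get(tube, lit, keep):
        ops["get"] += 1
        return {a for a in tube if lit_true(a, lit) == keep}

    def merge(t1, t2):
        ops["merge"] += 1
        return t1 | t2

    def append(tube, value):
        ops["append"] += 1
        return {a + (value,) for a in tube}

    def getuvsat(tube, u, v):
        # Keep the assignments that satisfy u or v (optional check omitted).
        u_true = get(tube, u, True)
        u_false = get(tube, u, False)
        v_true = get(u_false, v, True)
        return merge(u_true, v_true)

    tube = set(product((True, False), repeat=2))  # T_2: all values of x1, x2
    for k in range(3, n + 1):
        ops["amplify"] += 1  # copy T_{k-1} into two working tubes
        t_true, t_false = set(tube), set(tube)
        for u, v, w in clauses:
            if w == k:       # x_k is the 3rd literal: x_k = False needs u or v
                t_false = getuvsat(t_false, u, v)
            elif w == -k:    # not-x_k is the 3rd literal: x_k = True needs u or v
                t_true = getuvsat(t_true, u, v)
        tube = merge(append(t_true, True), append(t_false, False))
    return tube, ops


# Hypothetical 3-variable instance (not taken from the slides).
example = [(1, 2, 3), (1, -2, 3), (-1, 2, -3), (-1, -2, -3)]
tube, ops = dna3sat(3, example)
print(sorted(tube), dict(ops))
```

In this model the tally grows linearly in n and m, while the size of the sets plays the role of the number of molecules.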

Page 12: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

3-CNF SAT Solution on DP DNA Computer

Problem: 4 variables, 10 clauses
[10-clause 3-CNF formula: four clauses on (x1, x2, x3), one on (x1, x2, x4), two on (x1, x3, x4), three on (x2, x3, x4); the literal signs are not recoverable from this transcript]

Solution: YES {X1^T X2^T X3^F X4^F}

Page 13: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

DP algorithm for 3CNF-SAT

Clauses shown: two clauses on (x1, x2, x3) [literal signs lost in extraction]

k’s loop: k ranges over variable indices
  j’s loop: j ranges over clause indices
    if xk is the 3rd literal of the j-th clause then
      remove those assignments which satisfy neither the 1st nor the 2nd literal
      append Xk^F to the remaining assignments
    (do similarly if ¬xk is the 3rd literal)

Before: X1^T X2^T, X1^F X2^T, X1^T X2^F, X1^F X2^F

k = 3, x3

After: X1^T X2^F X3^F, X1^T X2^T X3^F
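The k = 3 step on this slide can be replayed on a few tuples. This is a small sketch in which the two clauses are assumed to be (x1 ∨ x2 ∨ x3) and (x1 ∨ ¬x2 ∨ x3); the transcript has lost the negation marks, so these signs are a guess chosen only because they give the surviving assignments listed on the slide.

```python
def satisfies(assignment, literal):
    """True if the assignment (tuple of bools, x1 first) makes the literal true."""
    value = assignment[abs(literal) - 1]
    return value if literal > 0 else not value


# T_2: every assignment to (x1, x2), in the order listed on the slide.
t2 = [(True, True), (False, True), (True, False), (False, False)]

# Assumed clauses (u, v, w) with x3 as the third literal; signs are hypothetical.
clauses = [(1, 2, 3), (1, -2, 3)]

# Selection: x3 = False is only allowed if u or v already satisfies each clause.
survivors = [
    a for a in t2
    if all(satisfies(a, u) or satisfies(a, v) for u, v, _w in clauses)
]

# Generation: append X3^F to every surviving partial assignment.
extended = [a + (False,) for a in survivors]
print(extended)  # [(True, True, False), (True, False, False)]
```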

Page 14: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

DP algorithm for 3CNF-SAT

Clauses shown: two clauses on (x1, x2, x3) [literal signs lost in extraction]

k’s loop: k ranges over variable indices
  j’s loop: j ranges over clause indices
    if xk is the 3rd literal of the j-th clause then
      remove those assignments which satisfy neither the 1st nor the 2nd literal
      append Xk^F to the remaining assignments
    (do similarly if ¬xk is the 3rd literal)

Before: X1^T X2^T, X1^F X2^T, X1^T X2^F, X1^F X2^F

k = 3, x3

After: X1^F X2^F X3^T, X1^F X2^T X3^T

Page 15: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

DP algorithm for 3CNF-SAT

Clauses shown: three clauses on (x2, x3, x4) [literal signs lost in extraction]

k’s loop: k ranges over variable indices
  j’s loop: j ranges over clause indices
    if xk is the 3rd literal of the j-th clause then
      remove those assignments which satisfy neither the 1st nor the 2nd literal
      append Xk^F to the remaining assignments
    (do similarly if ¬xk is the 3rd literal)

Before: X1^F X2^T X3^T, X1^F X2^F X3^T, X1^T X2^T X3^F, X1^T X2^F X3^F

k = 4, x4

After: X1^T X2^T X3^F X4^F

Page 16: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Implementation of Basic Operations

[Diagram; only the step labels are recoverable]

• get (T, +s), get (T, -s): annealing, immobilization, cold wash, hot wash
• append (T, s, e): annealing and ligation (Taq DNA ligase), immobilization and cold wash, hot wash
• amplify (T, T1, T2, … Tn): PCR, annealing, immobilization and cold wash, hot wash and divide

Page 17: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

On Scaling Up the Size of Computations

• Suyama’s estimate
  – 2×10^-3 g of DNA for 100-variable 3-SAT
    • 2×10^12 g of DNA by Adleman-Lipton
  – Current status: 4-variable 10-clause 3-SAT
  – Project goal: 30-variable 100-clause 3-SAT
  – Ultimate goal: 100-variable 400-clause 3-SAT
• Still, 100 variables are not many.
• A number of breakthroughs (in algorithms and experimental techniques) are required to defeat electronic computers. Robots, for example, …
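To get a feel for why exhaustive generation scales so badly, the mass of a library holding one strand per truth assignment can be estimated with a few lines of arithmetic. This is a rough sketch: the strand length of 15 nucleotides per variable and the average of 330 g/mol per nucleotide are illustrative assumptions, not figures from the slides. With these values the 100-variable library comes out on the order of 10^12 g, the same order of magnitude as the Adleman-Lipton estimate above.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def library_mass_grams(num_variables, nt_per_variable=15, g_per_mol_per_nt=330.0):
    """Mass of one single-stranded molecule per truth assignment.

    nt_per_variable and g_per_mol_per_nt are assumed, illustrative values.
    """
    molecules = 2 ** num_variables                      # one strand per assignment
    molar_mass = num_variables * nt_per_variable * g_per_mol_per_nt
    return molecules * molar_mass / AVOGADRO


for n in (4, 30, 100):
    print(n, "variables:", f"{library_mass_grams(n):.2e}", "g")
```

The exponential factor 2^n dominates; changing the assumed strand length only shifts the result by a constant factor.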

Page 18: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Robot for DNA Computing Based on MAGTRATION™

Page 19: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Automatic Operation of get Command on DNA Computer Robot

[Diagram; only the step labels are recoverable]

• get (T, +s), get (T, -s): annealing, immobilization, cold wash, hot wash

Page 20: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Programming in DNA Computer

protocol-level

[Instrument]
[Reset Counter] 0
[Home Position] 0
[MJ-Open Lid]
・・・
[Get1(0)]
[Get2(1)]
[Append(2)]
・・・
[Exit]

script-level

(1-1-4) [MJ-Open Lid]
Do 2
  _SEND "LID OPEN"
  Do 10
    _SEND "LID?"
    Wait_msec 500
    _CMP_GSTR "OPEN"
    IF_Goto EQ 0 ;open
    Wait_msec 1000
  Loop
Loop
; Time out
End
;open

Pascal/C-level

function dna3sat(u_1, v_1, w_1, ..., u_m, v_m, w_m)
begin
  T_2 = {X_1^T X_2^T, X_1^F X_2^T, X_1^T X_2^F, X_1^F X_2^F};
  for k = 3 to n do
    amplify(T_{k-1}, T_w^T, T_w^F);
    for j = 1 to m do
      if w_j = x_k then
        T_w^F = getuvsat(T_w^F, u_j, v_j);
      end
      if w_j = ¬x_k then
        T_w^T = getuvsat(T_w^T, u_j, v_j);
      end
    end
    T^T = append(T_w^T, X_k^T, X_{k-1}^{T/F});
    T^F = append(T_w^F, X_k^F, X_{k-1}^{T/F});
    T_k = merge(T^T, T^F);
  end
  return detect(T_n);
end

Page 21: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Hairpin Engines

Page 22: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Autonomous Molecular Computing

• Adleman-Lipton Paradigm
  – generation of candidates = autonomous reaction
  – selection of solutions = many operations from outside
• One-Pot Reaction ⇒ Autonomous Computation
  Computation by Successive Autonomous Reactions by Molecules
  – Winfree’s DNA Tile
  – Sakamoto’s Hairpin Engines
    • Whiplash PCR and SAT Engine
• Applications:
  – Nanotechnology, Nanomachine
  – (Computationally Inspired) Biotechnology

Page 23: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

cf. Winfree’s DNA Tile

Page 24: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

cf. Winfree’s DNA Tile

Page 25: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

cf. Winfree’s DNA Tile

Page 26: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Hairpin Engines

• Molecular Computation by Hairpin Formation
  – Hairpin --- Typical Secondary Structure
• Whiplash PCR
  – DNA Automaton: State Machine by DNA
  – 5 Transitions in a Control Experiment
• SAT Engine
  – Selection by Hairpin Structures of DNA
  – 3-SAT: 6-Variable 10-Clause Formula

Page 27: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

SAT Engine

• Sakamoto et al., Science, May 19, 2000.
• Selection by Hairpin Structures of DNA
  – digestion by restriction enzyme
  – exclusive PCR
• 3-SAT
  – ssDNA consisting of literals, each selected from a clause
  – complementary literal = complementary sequence
  – detection of inconsistency ⇒ hairpin
• The essential part of the SAT computation is done by hairpin formation.
  – Autonomous Molecular Computation

Page 28: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

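(a∨b∨c)∧(¬d∨e∨¬f)∧ … ∧(¬c∨¬b∨a)∧ ...

[Figure: an ssDNA carrying the literals b, e, ¬b selected from these clauses; b and ¬b form a hairpin, which is removed by digestion by restriction enzyme and exclusive PCR]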

Page 29: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.
Page 30: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Selection by Hairpin Structures

• Digestion by Restriction Enzyme
  – Hairpins are cut at the restriction site inserted in each literal sequence.
• Exclusive PCR
  – PCR is inefficient for hairpins.
  – In exclusive PCR, solution is diluted in each cycle to keep the difference in amplification.
• The number of steps is independent of the number of variables or clauses.

Page 31: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

6-Variable 10-Clause Formula

(a∨b∨!c)∧(a∨c∨d)∧(a∨!c∨!d)∧(!a∨!c∨d)∧(a∨!c∨e)∧(a∨d∨!f)∧(!a∨c∨d)∧(a∨c∨!d)∧(!a∨!c∨!d)∧(!a∨c∨!d)

! = ¬
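The selection principle of the SAT Engine can be mimicked in software on this formula. The sketch below assumes a strand is modelled as a tuple holding one literal picked from each clause, and that "forms a hairpin" simply means the strand contains some literal together with its complement. The function names are hypothetical, and the model ignores all chemistry such as yields and partial hybridization.

```python
from itertools import product

# The 6-variable 10-clause formula from the slide; "!" marks negation.
CLAUSES = [
    ("a", "b", "!c"), ("a", "c", "d"), ("a", "!c", "!d"), ("!a", "!c", "d"),
    ("a", "!c", "e"), ("a", "d", "!f"), ("!a", "c", "d"), ("a", "c", "!d"),
    ("!a", "!c", "!d"), ("!a", "c", "!d"),
]


def complement(literal):
    """Return the complementary literal: a <-> !a."""
    return literal[1:] if literal.startswith("!") else "!" + literal


def forms_hairpin(strand):
    """A strand is inconsistent if it holds a literal and its complement."""
    chosen = set(strand)
    return any(complement(lit) in chosen for lit in chosen)


def hairpin_free_strands(clauses):
    """Enumerate every one-literal-per-clause strand and keep the consistent ones."""
    return [s for s in product(*clauses) if not forms_hairpin(s)]


def strand_to_assignment(strand):
    """Read the truth values off a consistent strand."""
    return {lit.lstrip("!"): not lit.startswith("!") for lit in strand}


survivors = hairpin_free_strands(CLAUSES)
print(len(survivors), "of", 3 ** len(CLAUSES), "strands are hairpin-free")
print(strand_to_assignment(survivors[0]))
```

For this formula every hairpin-free strand encodes the same assignment: a, d, f false and b, c, e true. The enumeration here is sequential, whereas in the molecular version all strands fold in parallel in one pot.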

Page 32: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Solution of a 6-Variable 10-Clause Formula

Page 33: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Whiplash PCR

• DNA Automaton : State Machine by DNA
  – Polymerization of Hairpin
  – Polymerization Stop
• Autonomous MIMD Computation of Boolean μ-formulas
• Solving NP-Complete Problems in O(1)-Step
  e.g., vertex cover:
    vertex cover candidate = transition table = ssDNA
    vertex cover = transition table that reaches the final state
• 5 Transitions in a Control Experiment
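The "state machine by DNA" idea can be pictured as a strand that carries its own transition table and repeatedly rewrites its current state. The following is a deliberately simplified sketch, assuming a deterministic table of (source, target) pairs and treating each polymerization-and-stop cycle as one table lookup; the state names and the helper names are hypothetical.

```python
def whiplash_run(transitions, start, max_steps=100):
    """Follow the transition table from start and return the visited states.

    transitions: iterable of (source, target) pairs, one target per source.
    Each loop iteration stands for one cycle in which the 3' end anneals to
    the entry for the current state and is extended by the target state.
    """
    table = dict(transitions)
    history = [start]
    state = start
    for _ in range(max_steps):
        if state not in table:   # no matching entry: the strand stops here
            break
        state = table[state]
        history.append(state)
    return history


def reaches_final(transitions, start, final, max_steps=100):
    """True if the run started in `start` ever visits `final`."""
    return final in whiplash_run(transitions, start, max_steps)


# Hypothetical table with two entries: A -> B and B -> C.
print(whiplash_run([("A", "B"), ("B", "C")], "A"))  # ['A', 'B', 'C']
```

In this picture a candidate solution is a transition table, and selecting the tables that reach the final state corresponds to `reaches_final`.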

Page 34: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Whiplash PCR

[Figure labels: strand regions x B A x C, B x; annotations a, b]

Page 35: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Whiplash PCR

[Figure labels: strand regions x B A x C, B]

Page 36: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Whiplash PCR

[Figure labels: strand regions x B A x C B x; annotation a]

Page 37: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Whiplash PCR

[Figure labels: strand regions x B A x C B x; annotations a, b, c]

Page 38: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.
Page 39: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.
Page 40: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

5 Transitions in a Control Experiment

Page 41: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

[Figure labels: 0 1 2 3 4 5 6 7]

Page 42: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Analysis of Computational Power of Molecules

Page 43: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Complexity of Molecular Computation

• Time
  – Number of Laboratory Operations
  – Time for Each Operation
    • more essential for the analysis of the computational power of molecules
• Space (= Parallelism)
  – Number of Molecules
    • maximum number
    • total number
  – Size (Length) of Molecules
• Analysis of the Trade-Off

Page 44: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Some Classical Results

• Reif (SPAA’95)
  – A nondeterministic Turing machine computation with input size n, space s and time 2^O(s) can be executed in our PAM Model using O(s) PA-Match steps and O(s log s) other PAM steps, employing aggregates of length O(s).
• Beaver (DNA1, 1995)
  – Polynomial-step molecular computers compute PSPACE.
• Rooß and Wagner (I&C, 1996)
  – Exactly the problems in P^NP = Δ^p_2 can be solved in polynomial time using Lipton’s model.

Page 45: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Yield and Error in Reactions

• Yield
  – equilibrium --- equilibrium constant (K)
  – time to reach equilibrium --- reaction constant (k)
  – example: A ⇌ B
      [B] = (K/(1+K)) (1 − e^(−(k + k_{−1}) t))
      K = k/k_{−1}
• Error
  – example: mis-hybridization
  – Error probability is never zero.
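The yield example can be checked with a short derivation. The sketch below assumes first-order kinetics for the reversible reaction A ⇌ B with forward constant k and reverse constant k_{-1}, concentrations normalized so that [A] + [B] = 1, and no B present at t = 0; these normalizations are assumptions made here for the derivation.

```latex
\begin{align*}
\frac{d[B]}{dt} &= k\,[A] - k_{-1}[B], \qquad [A] + [B] = 1,\quad [B](0) = 0 \\
                &= k - (k + k_{-1})\,[B] \\
\Rightarrow\quad [B](t) &= \frac{k}{k + k_{-1}}\left(1 - e^{-(k + k_{-1})t}\right)
                 = \frac{K}{1 + K}\left(1 - e^{-(k + k_{-1})t}\right),
\qquad K = \frac{k}{k_{-1}}
\end{align*}
```

As t grows, [B] approaches the equilibrium yield K/(1+K), and the rate of approach is set by k + k_{-1}, which is why the equilibrium constant and the reaction constants both matter for the running time of a laboratory step.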

Page 46: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Reduction of Errors

• Iteration of Laboratory Operations
  – increase in computation time
  – increase in loss of molecules
    • increase in number of molecules
• Reduction of Error Probability
  – appropriate conditions
    • temperature, salt concentration
    • Low temperature leads to frequent mis-hybridization.
    • However, high temperature decreases the yield.
  – good encoding
    • A number of papers have been published on designing good encodings.
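The trade-off behind iterating a laboratory operation can be illustrated with a toy calculation. This sketch assumes that each repetition independently keeps a correct molecule with probability `yield_per_op` and lets a wrong molecule slip through with probability `false_positive_rate`; both parameters and the independence assumption are simplifications introduced here, not measurements from the project.

```python
def repeated_extraction(yield_per_op, false_positive_rate, repeats):
    """Return (fraction of correct molecules kept, fraction of wrong ones kept)."""
    retained = yield_per_op ** repeats
    contamination = false_positive_rate ** repeats
    return retained, contamination


def extra_molecules_factor(yield_per_op, repeats):
    """How many times more starting molecules are needed to offset the loss."""
    return 1.0 / yield_per_op ** repeats


for r in (1, 2, 3):
    kept, wrong = repeated_extraction(0.9, 0.1, r)
    print(r, f"kept={kept:.3f}", f"wrong={wrong:.4f}",
          f"need x{extra_molecules_factor(0.9, r):.2f} molecules")
```

Under these assumptions the error shrinks geometrically with the number of repetitions, but so does the amount of correct material, which matches the slide's point that iteration costs both time and molecules.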

Page 47: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Some Analyses

• Karp, Kenyon and Waarts (SODA’96)
  – The number of extract operations required for achieving error-resilient bit evaluation is (loglog).
• Kurtz (DNA2, 1996)
  – thermodynamical analysis of path formation in Adleman’s experiment
  – time needed to form a Hamiltonian path --- (n^2)
• Winfree (1998, Ph.D. Thesis)
  – thermodynamical analysis of DNA Tiling
• Rose, et al. (GECCO’99)
  – Computational Incoherency (thermodynamical analysis of mis-hybridization)

Page 48: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Efficiency of SAT Engine: Tentative Analysis

• Parameters
  – n : number of clauses
  – ε : the probability that a satisfying assignment cannot be detected
• Orders
  – Time O(n^2.5)
  – Number of Molecules O(4^n ln(1/ε))

Page 49: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Molecular Computation and Randomized Algorithms

• Randomized Algorithms with Molecules
  – Massive Parallelism
  – Random Operations
    • very easy to implement by chemical reactions
• Error in Non-Random Operations
  – Error in non-random operations should not damage the error reducibility of a randomized algorithm.
  – Error should be compensated by random operations.

Page 50: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Some Recent Results

• Chen and Ramachandran (DNA6, 2000)
  – k-SAT by Paturi et al.
• Díaz, Esteban and Ogihara (DNA6, 2000)
  – k-SAT by Schöning
• Sakakibara (DNA6, 2000)
  – PAC Learning of DNF Formulas
  – Approximate Consistent Learning

Page 51: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Towards New Computational Paradigms

Page 52: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

New Computational Paradigms

• Molecular Computing

• Chemical Computing

• Crystal Computing

• Cell Computing

• Gel Computing

• Amorphous Computing

• …

Page 53: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

New Computational Paradigms

• Computation inside a Single Molecule

• Computation by Molecular Interactions

• Computation with Membranes

• Computation with Geometry

• Each paradigm is a rich source of computational power.

• They are strongly related with one another.

Page 54: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Computation inside a Single Molecule

• Computation by Conformational Change (Structure Formation)
  – Whiplash PCR (Sakamoto, et al.)
  – SAT Engine (Sakamoto, et al.)
  – NP-Completeness of Protein Folding (Fraenkel)
• Computation by Modification
  – Stickers Model (Roweis, et al.)
  – Aqueous Computing (Head and Yamamura)
    • write-once molecular memory

Page 55: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Computation by Molecular Interactions

• Computation by Self-Assembly
  – DNA hybridization --- everywhere in DNA computing
  – DNA tiling (Winfree, et al.)
• Computation by Cutting and Pasting
  – restriction enzymes and ligase --- everywhere in DNA computing
  – H Systems --- Splicing Systems (Head)
• Self-Assembly and Conformational Change
  – Self-Assembling Automaton (Saitou)
  – YAC (Yokomori)
• Concurrency Calculi (without Membranes)
• Abstract Chemistry in Artificial Life

Page 56: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Recent Results in Computation by Self-Assembly

• Rothemund and Winfree (STOC 2000)
  – For any non-decreasing unbounded computable function f(N), the number of tiles required for the self-assembly of an N×N square is bounded infinitely often by f(N).
• Winfree, Eng and Rozenberg (DNA6, 2000)
  – Linear assembly of string tiles can generate the output languages of finite-visit Turing Machines.

Page 57: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Computation with Membranes

• Computation with Compartments
  – Chemical IC (MEMS)
  – Liposomes
  – P Systems (Paun)
  – Concurrency Calculi
    • chemical abstract machine, π-calculus, join calculus
    • ambient calculus
• Computation by Cells
    • computation by gene regulation, signal transduction, and metabolism

Page 58: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Computation with Geometry

• Computation with Compartments
  – inside-or-outside topology
• Computation in Gel/on Surface
  – two kinds of molecule: immobile and mobile
• DNA Crystals --- DNA Tiling
  – 2D or 3D topology (lattice)
• Amorphous Computing (Abelson, Knight and Sussman)
  – 2D or 3D topology (continuous)
  – Computational Particles
    • generation of coordinate systems
    • GPL (growth-point language)
  – Cellular Computing (Weiss and Knight)

Page 59: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Importance of Engineering Viewpoint --- Programming

• Not Only Analysis but Also Synthesis
  – Sharp Distinction from Previous Studies:
    • mathematical biology
    • complex systems
• Synthesis = Programming
  – Design and Engineering of Artificial Systems
• Importance of Engineering Applications
  – Milestones of Research
  – Source of Motivations
  – Not Restricted to Computation
    • nanotechnology
    • biotechnology (computationally inspired biotechnology)

Page 60: Molecular Computing: Challenges across the two tracks in Theoretical Computer Science Masami Hagiya.

Challenges

• New Computational Paradigms

• New Computational Models

• New Programming Languages

• New Applications

• These challenges should be simultaneously attacked with the progress of implementation techniques.