Dec 19, 2015

An unbiased adaptive sampling algorithm for the exploration of RNA mutational landscapes under evolutionary pressure

Jérôme Waldispühl, PhD
School of Computer Science, McGill Centre for Bioinformatics, McGill University, Canada

Yann Ponty, PhD
Laboratoire d’informatique (LIX), École Polytechnique, France

Philippe Flajolet (1948 – 2011)

RNAmutants: Algorithms to explore the RNA mutational landscape

Overview

Understanding how mutations influence RNA secondary structures AND how structures influence mutations (Waldispühl et al., PLoS Comp Bio, 2008).

Sampling k-mutants

Classic: 0 mutation (seed)
CAGUGAUUGCAGUGCGAUGC (-1.20)
..((.(((((...)))))))

RNAmutants: 1 mutation
CAGUGAUUGCAGUGCGAUcC (-3.40)
..(.((((((...)))))))
CAGUGAUUGCAGUGCGgUGC (-0.30)
((.((....)).))......
CAGUGAUcGCAGUGCGAUGC (-3.10)
.....(((((...)))))..

RNAmutants: 10 mutations
uAGcGccgGgAGacCGgcGC (-18.00)
..(((((((....)))))))
CccUGgccGCAagGCcAgGg (-20.40)
((((((((....))))))))
CcGUGgccGCgagGCcAcGg (-19.10)
((((((((....))))))))

Lowercase letters mark mutated positions.

Sampling k mutations favors mutants with lower (more stable) folding energies.
Consequence: the C+G content increases.

Objectives

How to efficiently sample sequences at arbitrary C+G contents … without bias!

[Figure: sample frequency versus C+G content (%), with the target C+G content labeled.]

Outline

• Background: RNAmutants in a nutshell. Algorithms to sample RNA secondary structures and mutations.

• Our approach: Adaptive sampling. Uniformly shifting the distribution of samples.

• Results: Evolutionary studies. Insights into the evolutionary pressure stemming from an optimization of the thermodynamic stability.

RNA secondary structure

The secondary structure is the set of base pairs formed within the RNA molecule.

Bracket notation: ((((((((...)))..((((....)))).(((...)))..)))))
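To make the bracket notation concrete, here is a minimal Python sketch that turns a dot-bracket string into its list of base pairs. The function name `parse_dot_bracket` and the 0-based indexing are choices made for this illustration; they are not taken from RNAmutants.

```python
def parse_dot_bracket(structure):
    """Return the base pairs (i, j), 0-indexed with i < j, encoded by a dot-bracket string."""
    stack = []   # positions of '(' still waiting for their partner
    pairs = []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {j}")
            pairs.append((stack.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {j}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


# Structure of the seed sequence shown on the "Sampling k-mutants" slide.
pairs = parse_dot_bracket("..((.(((((...)))))))")
print(len(pairs), pairs[0])  # 7 (2, 19)
```

Each opening bracket is matched with the nearest unmatched closing bracket, so the pairs are nested by construction, which is what makes the structure a secondary structure rather than a pseudoknot.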

Loop decomposition

Stacking pairs

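Parameterization of the mutational landscape

[Figure: sequence ensemble and structure ensemble. Around the sequence UUUACGGCCAGC, the 1-neighborhood (1 mutation) contains UUUACGGCUAGC, UCUACGGCCAGC and UUUAAGGCCAGC; the more distant sequences UCUGAAACCCGU and CCUCAACGAAGC are also shown.]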
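The notion of a 1-neighborhood can be illustrated with a short Python sketch. It takes UUUACGGCCAGC as the seed, which is an assumption based on the sequences listed on this slide, and the helper names are hypothetical.

```python
from math import comb

NUCLEOTIDES = "ACGU"


def one_neighborhood(seed):
    """All sequences that differ from `seed` at exactly one position."""
    neighbors = []
    for i, original in enumerate(seed):
        for nt in NUCLEOTIDES:
            if nt != original:
                neighbors.append(seed[:i] + nt + seed[i + 1:])
    return neighbors


def count_k_mutants(length, k):
    """Number of sequences at Hamming distance exactly k: choose k positions, 3 substitutions each."""
    return comb(length, k) * 3 ** k


seed = "UUUACGGCCAGC"
neighbors = one_neighborhood(seed)
print(len(neighbors), count_k_mutants(len(seed), 1))  # 36 36
print("UUUAAGGCCAGC" in neighbors)                    # True
```

The count grows as C(n, k) · 3^k, which is why explicit enumeration stops being practical for more than a few mutations.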

Classical Recursions (Zuker & Stiegler, McCaskill)

Enumerate all secondary structures
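As a rough illustration of how such recursions cover all secondary structures, the sketch below counts the structures compatible with a sequence. It is a simplified counting recursion, not the Zuker & Stiegler or McCaskill energy model: the pairing rules (Watson-Crick plus G-U), the minimum hairpin loop of 3 unpaired bases, and the name `count_structures` are assumptions made for the example.

```python
from functools import lru_cache

CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases enclosed by a pair


def count_structures(seq):
    """Count the secondary structures compatible with `seq` (the empty structure included)."""

    @lru_cache(maxsize=None)
    def count(i, j):
        # Intervals too short to hold a base pair admit only the unpaired structure.
        if j - i < MIN_LOOP + 1:
            return 1
        total = count(i + 1, j)  # case 1: position i is unpaired
        for k in range(i + MIN_LOOP + 1, j + 1):
            if (seq[i], seq[k]) in CAN_PAIR:
                # case 2: i pairs with k, splitting the interval in two independent parts
                total += count(i + 1, k - 1) * count(k + 1, j)
        return total

    return count(0, len(seq) - 1)


print(count_structures("GGAAACC"))  # 6
```

Replacing the count of each case by a Boltzmann factor turns the same decomposition into a partition function, which is the idea the classical recursions build on.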

RNAmutants Generalize Classical Algorithms

Enumerate all secondary structures over all mutants (Waldispühl et al., ECCB, 2002)

Our approach

Explore the complete mutational landscape.
Polynomial time and space algorithm.
Compute the partition function for all sequences:

Single sequence: $Z(s) = \sum_{S} \exp(\beta \cdot E(s,S))$, with $\beta = -1/RT$

RNAmutants: $Z = \sum_{s} \sum_{S} \exp\left(-\frac{E(s,S)}{RT}\right)$

Sample by backtracking the dynamic programming tables.

RNAmutants (Waldispühl et al., PLoS Comp Bio, 2008)
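The difference between the single-sequence and the all-mutants partition function can be sketched by brute force. The code below is only a toy under stated assumptions: every base pair contributes a hypothetical -1 kcal/mol, the minimum hairpin loop is 3, and the k-mutants are enumerated explicitly, which takes exponential time. RNAmutants itself obtains the sum in polynomial time by dynamic programming; that algorithm is not reproduced here.

```python
import math
from functools import lru_cache
from itertools import combinations, product

RT = 0.0019872 * 310.15   # kcal/mol at 37 degrees Celsius
PAIR_ENERGY = -1.0        # toy energy per base pair (assumption, not a real energy model)
CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3


def partition_function(seq):
    """Z(s): sum over structures S of exp(-E(s,S)/RT) under the toy energy model."""
    pair_factor = math.exp(-PAIR_ENERGY / RT)

    @lru_cache(maxsize=None)
    def z(i, j):
        if j - i < MIN_LOOP + 1:
            return 1.0
        total = z(i + 1, j)  # position i unpaired
        for k in range(i + MIN_LOOP + 1, j + 1):
            if (seq[i], seq[k]) in CAN_PAIR:
                total += pair_factor * z(i + 1, k - 1) * z(k + 1, j)
        return total

    return z(0, len(seq) - 1)


def k_mutants(seed, k):
    """Yield every sequence at Hamming distance exactly k from `seed`."""
    for positions in combinations(range(len(seed)), k):
        choices = [[nt for nt in "ACGU" if nt != seed[p]] for p in positions]
        for replacement in product(*choices):
            mutant = list(seed)
            for p, nt in zip(positions, replacement):
                mutant[p] = nt
            yield "".join(mutant)


def mutant_partition_function(seed, k):
    """Sum of Z(s) over all k-mutants s of the seed (brute force)."""
    return sum(partition_function(s) for s in k_mutants(seed, k))


print(partition_function("GAAAC"), mutant_partition_function("GAAAC", 1))
```

The outer sum over mutants is what RNAmutants folds into the recursions, so that sequences and structures are weighted and sampled together.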

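Our approach: Weighting mutations

[Figure: sequence ensemble and structure ensemble around the seed UUUAAGGCCAGC. Each mutant is weighted by its partition function value multiplied by a weight: $w^{-1} \cdot Z_{C9U}$ (UUUAAGGCUAGC), $1 \cdot Z_{U2A}$ (UAUAAGGCCAGC), $w \cdot Z_{A5G}$ (UUUAGGGCCAGC). Labels: promote A+U content, penalize C+G content, no change. The distant sequences UCUGAAACCCGU and CCUCAACGAAGC are also shown.]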

Page 17: An unbiased adaptive sampling algorithm for the exploration of RNA mutational landscapes under evolutionary pressure Jérôme Waldispühl, PhD School of Computer.

Weighting recursive equations

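[Recursive equations: terms are multiplied by $W(i,x) \times W(j,y)$ or by $W(j,y)$.]

$$W(i,x) = \begin{cases} w & \text{if } A,U \to C,G \\ w^{-1} & \text{if } C,G \to A,U \\ 1 & \text{otherwise} \end{cases}$$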
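The weight W(i,x) can be written as a small Python function. This is a sketch: the names `mutation_weight` and `sequence_weight` are hypothetical, and applying the weight at every position of the sequence is an assumption about how the recursions use it.

```python
GC = set("CG")


def mutation_weight(seed_nt, new_nt, w):
    """W(i, x): weight for replacing the seed nucleotide at a position by x."""
    if seed_nt not in GC and new_nt in GC:
        return w          # A,U -> C,G
    if seed_nt in GC and new_nt not in GC:
        return 1.0 / w    # C,G -> A,U
    return 1.0            # no change in C+G content


def sequence_weight(seed, mutant, w):
    """Product of the per-position weights over the whole sequence."""
    weight = 1.0
    for a, b in zip(seed, mutant):
        weight *= mutation_weight(a, b, w)
    return weight


def gc_count(seq):
    return sum(nt in GC for nt in seq)


seed, mutant = "UUUAAGGCCAGC", "UUUAGGGCCAGC"   # the A5G mutant from the slide
print(sequence_weight(seed, mutant, 2.0))        # 2.0
```

Under that assumption the total weight of a mutant is w raised to its change in C+G count, so two mutants with the same C+G content always receive the same factor.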

Effect of weighted sampling

[Figure: frequency of samples versus C+G content (%) for unweighted sampling, weighted sampling (w=1/2) and weighted sampling (w=2).]

Sampling pipeline

• Keep all samples at the target C+G content and reject the others.
• Update w at each iteration using a bisection method.
• Stop when enough samples have been stored.
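The three steps can be sketched on a toy model. In the code below the RNAmutants sampler is replaced by a hypothetical stand-in that draws each nucleotide independently with weight w for C and G and weight 1 for A and U; there are no structures or energies. The update rule (compare the mean C+G count of the batch with the target, then take the geometric midpoint of the bracket) is one possible reading of "bisection", not necessarily the authors' exact rule.

```python
import math
import random


def sample_sequence(length, w, rng):
    """Toy weighted sampler (assumption): C and G get weight w, A and U weight 1."""
    return "".join(rng.choices("ACGU", weights=[1.0, w, w, 1.0], k=length))


def gc_count(seq):
    return sum(nt in "CG" for nt in seq)


def adaptive_sampling(length, target_gc, wanted, batch=200, seed=0, max_iter=100):
    """Collect `wanted` samples with exactly `target_gc` C+G nucleotides."""
    rng = random.Random(seed)
    low, high = 1e-3, 1e3     # bracket for the weight w
    w = 1.0                   # start from unweighted sampling
    kept = []
    for _ in range(max_iter):
        samples = [sample_sequence(length, w, rng) for _ in range(batch)]
        # Step 1: keep the samples at the target C+G content, reject the others.
        kept.extend(s for s in samples if gc_count(s) == target_gc)
        # Step 3: stop when enough samples have been stored.
        if len(kept) >= wanted:
            break
        # Step 2: bisection on w, driven by the mean C+G count of the batch.
        mean_gc = sum(gc_count(s) for s in samples) / batch
        if mean_gc < target_gc:
            low = w
        else:
            high = w
        w = math.sqrt(low * high)
    return kept[:wanted], w


samples, final_w = adaptive_sampling(length=20, target_gc=14, wanted=100)
print(len(samples), round(final_w, 2))
```

In this toy model all sequences with the same C+G count have the same probability whatever the value of w, so samples kept from different iterations can be pooled without bias.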

Some features

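• After rejection, the weighting scheme only affects the performance, not the probability of the kept samples: sequences with the same C+G content receive the same weight, so their relative probabilities are unchanged. The sampling is unbiased.

• The partition function can be written as a polynomial in w:

$$Z = \sum_{i=0}^{n} a_i \cdot w^i$$

After n iterations we can calculate all $a_i$ and invert the polynomial to compute the optimal weight w.

Remark: in practice, fewer iterations are necessary.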
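The polynomial view can be illustrated with a small Python sketch. Two assumptions are made here: the exponent i counts the C+G nucleotides of a sequence, and the "optimal" weight is the one for which the mean exponent under the weighted distribution, w·Z'(w)/Z(w), equals the target, which is also the weight that maximizes the fraction of samples at the target. The coefficients are recovered exactly from evaluations of Z by solving a Vandermonde system; the function names are hypothetical.

```python
from fractions import Fraction


def interpolate_coefficients(points):
    """Recover a_0..a_n from n+1 pairs (w, Z(w)) by exact Gauss-Jordan elimination."""
    n = len(points)
    rows = [[Fraction(w) ** i for i in range(n)] + [Fraction(z)] for w, z in points]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


def mean_exponent(coeffs, w):
    """Mean of i under the distribution proportional to a_i * w**i."""
    z = sum(a * w ** i for i, a in enumerate(coeffs))
    return sum(i * a * w ** i for i, a in enumerate(coeffs)) / z


def solve_weight(coeffs, target, low=1e-6, high=1e6, steps=200):
    """Bisection (on a log scale) for the w whose mean exponent equals `target`."""
    for _ in range(steps):
        mid = (low * high) ** 0.5
        if mean_exponent(coeffs, mid) < target:
            low = mid
        else:
            high = mid
    return (low * high) ** 0.5


# Toy polynomial Z(w) = (1 + w)^4, i.e. a = [1, 4, 6, 4, 1].
true_coeffs = [1, 4, 6, 4, 1]
points = [(w, sum(a * w ** i for i, a in enumerate(true_coeffs))) for w in (1, 2, 3, 4, 5)]
coeffs = interpolate_coefficients(points)
print(coeffs == true_coeffs, round(solve_weight(coeffs, target=3), 6))  # True 3.0
```

For the toy polynomial the mean exponent is 4w/(1+w), so a target of 3 gives w = 3, which the bisection recovers.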

Example: 40 nt., 10000 samples, 30 mutations, 70% C+G content

Cumulative distribution

Low C+G contents favor structural diversity

Simulation at fixed G+C content from random seeds.

[Figure: panels for 20 nucleotides and 40 nucleotides, at C+G contents of 10%, 30%, 50%, 70% and 90%.]

Low C+G contents favor internal loop insertion

[Figure: number of internal loops at C+G contents of 10%, 30%, 50%, 70% and 90%.]

High G+C contents reduce evolutionary accessibility

Simulation at fixed G+C content from random seeds.

[Figure: panels for 20 nucleotides and 40 nucleotides, at C+G contents of 10%, 30%, 50%, 70% and 90%.]

Perspectives

• More studies of Sequence-Structure maps.

• Applications to RNA design.

• The same techniques can be applied to other parameters (e.g., number of base pairs).

• Can be generalized to multiple parameters.

Acknowledgments

École Polytechnique
• Jean-Marc Steyaert

Boston College
• Peter Clote

INRIA
• Philippe Flajolet

MIT
• Bonnie Berger
• Srinivas Devadas
• Mieszko Lis
• Alex Levin
• Charles W. O’Donnell

Google Inc.
• Behshad Behzadi

Yann Ponty, CNRS at LIX, École Polytechnique, France.

University of Paris 6
• Olivier Bodini

University of Paris 11
• Alain Denise

Would you like to know more?

O. Bodini and Y. Ponty, Multi-dimensional Boltzmann Sampling of Languages, Proceedings of AOFA'10, 49--64, 2010.

J. Waldispühl, S. Devadas, B. Berger and P. Clote, Efficient Algorithms for Probing the RNA Mutation Landscape, PLoS Computational Biology, 4(8):e1000124, 2008.

http://csb.cs.mcgill.ca/RNAmutants