RNA secondary structure Functions Representations Predictions Many slides courtesy of M. Zuker, RPI Math Sequence Analysis '16 -- Lecture 13

RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf

Jun 06, 2020


dariahiddleston
Page 1

RNA secondary structure

• Functions

• Representations

• Predictions

Many slides courtesy of M. Zuker, RPI Math

Sequence Analysis '16 -- Lecture 13

Page 2

When RNA secondary structure matters

mRNA --> protein

ssRNA

protein

Strong secondary structure can block translation.

Page 3

RBS (ribosome binding site)

Registry of Standard Biological Parts: http://parts.igem.org/

[Figure: mRNA drawn 5' to 3' with the ribosome binding site, the AUG start codon (fMet), and the 16S rRNA. Sequence fragments as extracted: UUUCU CU, AAAGA GA, NNNAUGNNNN.]

especially sensitive is the...

species specific!

Anderson RBS family -- bacterial.

Page 4

microRNA (miR)

Found in 3'UTR, introns, exons.

[Figure: microRNA pathway across nucleus and cytoplasm. Labels: transcription, folding; primary-microRNA; endonuclease digestion; dicer; microRNA duplex; passenger strand degraded; argonaut proteins; microRNA; blocks translation, deadenylation.]

Page 5

Ambiguous bases
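
As an illustration of ambiguous bases, the standard IUPAC nucleotide codes can be written as a lookup table. This is a minimal Python sketch; the names `IUPAC` and `matches` are hypothetical and not from the lecture.

```python
# IUPAC nucleotide ambiguity codes (RNA alphabet; use T in place of U for DNA).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG",    # purine
    "Y": "CU",    # pyrimidine
    "S": "CG",    # strong
    "W": "AU",    # weak
    "K": "GU",    # keto
    "M": "AC",    # amino
    "B": "CGU",   # not A
    "D": "AGU",   # not C
    "H": "ACU",   # not G
    "V": "ACG",   # not U
    "N": "ACGU",  # any base
}

def matches(pattern: str, seq: str) -> bool:
    """True if seq (plain A/C/G/U) is consistent with an IUPAC pattern."""
    return len(pattern) == len(seq) and all(
        base in IUPAC[code] for code, base in zip(pattern, seq)
    )
```

For example, a pattern such as NNNAUGNNNN matches any 10-base sequence whose bases 4 to 6 are AUG.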

Page 6

RNA secondary structure is base pairing i•j

Rules for normal SS

Page 7

Not just Watson-Crick.

Different ways of base-pairing allow RNA to adopt duplex structures beyond A and B helix.

If any base-pair is possible, how do we predict pairings?

Page 8

Structure types within RNA sec struct

[Figure: 2D plot with loop regions labeled E, H, M, I, and B.]

Page 9

[Figure: larger structure with regions labeled B, H, E, M, and I.]

Note: no bp lines cross.

Page 10

Converting from 2D plot sec struct to circle/tree.
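
One way to see why the conversion works: a structure with no crossing base pairs is a set of nested pairs, and nesting maps directly onto a tree. Below is a minimal sketch, assuming the structure is given as a list of 1-based (i, j) pairs; `pairs_to_tree` and the nested-dict layout are hypothetical choices made for this example.

```python
def pairs_to_tree(n, pairs):
    """Build a tree from nested base pairs over positions 1..n.

    Each node is a dict: {"pair": (i, j), or None for the root,
    "unpaired": positions lying directly inside this loop,
    "children": nodes for the pairs directly enclosed}.
    Raises ValueError if two pairs cross (a pseudoknot).
    """
    partner = {}
    for i, j in pairs:
        partner[i], partner[j] = j, i
    root = {"pair": None, "unpaired": [], "children": []}
    stack = [root]
    for pos in range(1, n + 1):
        if pos not in partner:
            stack[-1]["unpaired"].append(pos)
        elif partner[pos] > pos:             # opening base: go one level down
            node = {"pair": (pos, partner[pos]), "unpaired": [], "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        else:                                # closing base: go one level up
            node = stack.pop()
            if node["pair"] != (partner[pos], pos):
                raise ValueError("crossing base pairs (pseudoknot)")
    return root
```

In this representation a node with no children closes a hairpin loop, and a node with two or more children closes a multi-loop.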

Page 11

Pseudoknots

[Figure: pseudoknotted sequence (fragments as extracted: CCC, CUCUCC, AG G GGUCAU, CGGA) drawn as a structure and converted to a circle plot.]
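
In a circle plot, a pseudoknot shows up as two chords that cross, which is the same as two base pairs (i, j) and (k, l) with i < k < j < l. A minimal sketch of that test follows; `crossing_pairs` is a hypothetical name and any consistent position numbering works.

```python
from itertools import combinations

def crossing_pairs(pairs):
    """Return every pair of base pairs whose chords cross on a circle plot."""
    norm = sorted(tuple(sorted(p)) for p in pairs)
    crossings = []
    for (i, j), (k, l) in combinations(norm, 2):
        # norm is sorted, so i <= k; the chords cross when k lies inside
        # the interval (i, j) and l lies outside it.
        if i < k < j < l:
            crossings.append(((i, j), (k, l)))
    return crossings
```

An empty result means the structure is nested and can be drawn without any base-pair lines crossing.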

Page 12
Page 13
Page 14
Page 15
Page 16

Prediction of RNA secondary structure

• Dot plot
 - Easy.
 - Can be done on a single sequence.
 - Cannot find non-canonical base pairs.

Free energy calculations

Comparative methods, phylogenetics

• Mutual information (MI)
 - Requires a deep multiple sequence alignment
 - Can find non-canonical base-pairs.

Page 17

Comparative modeling: assume conserved structure between homologs

Page 18

RNA structure by energy minimization

Forward summation of energy matrix E:

e(i,j) = 0 if j-i < 4

Assumes energy is the sum of base pairs.

Cases labeled in the recursion figure:

• add to loop (two cases)
• start a helix, or add a base pair
• join helices in multi-loop
• everywhere is a potential hairpin
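
The case labels on this slide (add to loop, start a helix or add a base pair, join helices in multi-loop) match a Nussinov-style recursion, in which E(i,j) is the minimum over leaving i unpaired, leaving j unpaired, pairing i with j, and splitting the segment at some k. The sketch below assumes that recursion and uses made-up pair energies (-3 for G-C, -2 for A-U, -1 for G-U); the slide's own energy values are not in the text, so `PAIR_ENERGY`, `pair_energy`, and `fill_energy_matrix` are illustrative only. Indices are 0-based.

```python
# Assumed pair energies for illustration; not taken from the lecture.
PAIR_ENERGY = {("G", "C"): -3, ("C", "G"): -3,
               ("A", "U"): -2, ("U", "A"): -2,
               ("G", "U"): -1, ("U", "G"): -1}

def pair_energy(seq, i, j):
    """e(i,j): 0 if j - i < 4 or the two bases cannot pair."""
    if j - i < 4:
        return 0
    return PAIR_ENERGY.get((seq[i], seq[j]), 0)

def fill_energy_matrix(seq):
    """Fill E from the diagonal outward; E[0][n-1] is the minimum total energy."""
    n = len(seq)
    E = [[0] * n for _ in range(n)]
    for span in range(1, n):                  # span = j - i
        for i in range(n - span):
            j = i + span
            best = min(
                E[i + 1][j],                               # i unpaired: add to loop
                E[i][j - 1],                               # j unpaired: add to loop
                E[i + 1][j - 1] + pair_energy(seq, i, j),  # pair i with j
            )
            for k in range(i, j):                          # join two substructures
                best = min(best, E[i][k] + E[k + 1][j])
            E[i][j] = best
    return E
```

Because every segment starts at energy 0 and pairing can only lower it, any position may become the start of a hairpin, which is one reading of "everywhere is a potential hairpin".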

Page 19

RNA structure by energy minimization

[Figure: energy matrix]

1. Energy matrix is initialized starting from the diagonal.

2. Base-pairing is found by tracing back from (1,n).

Find k such that E(i,j) = E(i,k) + E(k+1,j). Push (i,k) and (k+1,j) onto Stack B. Stop with error if no such k exists.

Stack A becomes the answer: a list of base pairs. Stack B is a list of unfinished segments.
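
The traceback can be sketched with the two stacks named on this slide. The version below is an illustration under stated assumptions: it fills E with a Nussinov-style recursion and a uniform, made-up pair energy of -1 for Watson-Crick and G-U pairs, then pops segments from Stack B and pushes base pairs onto Stack A. Only the split step (find k, push (i,k) and (k+1,j)) and the error when no k exists come from the slide; the other branches are the usual companions of that step. Indices are 0-based, so the traceback starts from (0, n-1) rather than (1, n), and all function names are hypothetical.

```python
CAN_PAIR = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}

def e(seq, i, j):
    """Assumed pair energy: -1 for an allowed pair with j - i >= 4, else 0."""
    return -1 if j - i >= 4 and (seq[i], seq[j]) in CAN_PAIR else 0

def fill(seq):
    """Fill the energy matrix from the diagonal outward."""
    n = len(seq)
    E = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            options = [E[i + 1][j], E[i][j - 1], E[i + 1][j - 1] + e(seq, i, j)]
            options += [E[i][k] + E[k + 1][j] for k in range(i, j)]
            E[i][j] = min(options)
    return E

def traceback(seq, E):
    """Recover one optimal list of base pairs from a filled matrix E."""
    stack_a = []                        # Stack A: finished base pairs
    stack_b = [(0, len(seq) - 1)]       # Stack B: unfinished segments
    while stack_b:
        i, j = stack_b.pop()
        if i >= j or E[i][j] == 0:
            continue                    # nothing left to pair in this segment
        if E[i][j] == E[i + 1][j]:      # i is unpaired
            stack_b.append((i + 1, j))
        elif E[i][j] == E[i][j - 1]:    # j is unpaired
            stack_b.append((i, j - 1))
        elif e(seq, i, j) < 0 and E[i][j] == E[i + 1][j - 1] + e(seq, i, j):
            stack_a.append((i, j))      # i pairs with j
            stack_b.append((i + 1, j - 1))
        else:
            # k = i and k = j - 1 were ruled out by the first two branches.
            for k in range(i + 1, j - 1):
                if E[i][j] == E[i][k] + E[k + 1][j]:
                    stack_b.append((i, k))
                    stack_b.append((k + 1, j))
                    break
            else:
                raise RuntimeError(f"no split k found for segment ({i}, {j})")
    return sorted(stack_a)
```

When several structures share the minimum energy, this sketch returns whichever one the branch order reaches first.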

Page 20

RNA structure by Dot Plot

Page 21

RNA structure by Dot Plot
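
A single-sequence dot plot can be sketched as a matrix with a dot wherever base i can pair with base j; helices then appear as runs of dots along anti-diagonals. The sketch below is illustrative only: it allows Watson-Crick pairs alone (which is why non-canonical pairs are missed), and the minimum helix length of 3, the minimum loop of 3 bases, and all function names are assumptions.

```python
WC = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def dot_plot(seq):
    """dots[i][j] is True when seq[i] and seq[j] can form a Watson-Crick pair."""
    n = len(seq)
    return [[(seq[i], seq[j]) in WC for j in range(n)] for i in range(n)]

def helices(seq, min_len=3, min_loop=3):
    """Find maximal anti-diagonal runs of dots as candidate helices (i, j, length).

    A run starting at (i, j) pairs i with j, i+1 with j-1, and so on.
    """
    dots = dot_plot(seq)
    n = len(seq)
    found = []
    for i in range(n):
        for j in range(i + 1, n):
            # Skip cells that only continue a run that started further out.
            if i > 0 and j < n - 1 and dots[i - 1][j + 1]:
                continue
            length = 0
            while ((j - length) - (i + length) > min_loop
                   and dots[i + length][j - length]):
                length += 1
            if length >= min_len:
                found.append((i, j, length))
    return found

def show(seq):
    """Print the upper triangle of the dot plot as text."""
    for i, row in enumerate(dot_plot(seq)):
        print("".join("*" if d and j > i else "." for j, d in enumerate(row)))
```

Calling `show("GGGAAAACCC")` prints the plot as rows of `*` and `.`, and `helices` on the same sequence reports the single stem formed by the three G-C pairs.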

Page 22

How-to: RNA structure by MI

1) Run BLAST search to get homologs.
2) Prune sequences to remove redundancy.*
3) Prune columns to remove uninformative data. (conserved positions tell you nothing)
4) Calculate mutual information M(i,j) for all pairs of positions (i,j).

[Figure: matrix with axes position vs. position.]
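
Steps 2 and 3 of this how-to can be sketched as simple filters on an alignment held as a list of equal-length strings. The 90% identity cutoff, the greedy pruning order, and the function names are all assumptions made for illustration; the lecture does not specify them.

```python
def identity(a, b):
    """Fraction of aligned columns where two sequences have the same character."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def prune_redundant(msa, max_identity=0.9):
    """Greedily keep a sequence only if it is not too similar to one already kept."""
    kept = []
    for seq in msa:
        if all(identity(seq, k) <= max_identity for k in kept):
            kept.append(seq)
    return kept

def informative_columns(msa):
    """Indices of columns that vary; fully conserved columns carry no MI signal."""
    return [c for c in range(len(msa[0])) if len({seq[c] for seq in msa}) > 1]
```

A fully conserved column has only one observed base, so its joint frequency with any other column equals the product of the separate frequencies and its mutual information is zero.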

Page 23

Mutual information, in general

A measure of the surprisingness of a pair of events.

$$M = \sum_{a,b \in \{\mathrm{events}\}} f(a,b) \, \log_2 \frac{f(a,b)}{f(a)\,f(b)}$$

• The sum is over all event pair types.
• f(a,b) is the frequency of a pair of events observed together = N(a,b events)/N(total events).
• f(a)f(b) is the expected frequency of a,b events together: the product of the frequencies of the events separately.
• M is in bits, because we used log2.
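
The formula translates directly into code. Here is a minimal sketch, assuming the observations arrive as a list of (a, b) event pairs; `mutual_information` is a hypothetical name.

```python
from collections import Counter
from math import log2

def mutual_information(observations):
    """M = sum over (a,b) of f(a,b) * log2( f(a,b) / (f(a) * f(b)) ), in bits."""
    n = len(observations)
    joint = Counter(observations)
    left = Counter(a for a, _ in observations)
    right = Counter(b for _, b in observations)
    total = 0.0
    for (a, b), count in joint.items():   # pairs never observed contribute 0
        f_ab = count / n
        total += f_ab * log2(f_ab / ((left[a] / n) * (right[b] / n)))
    return total
```

Two events that always occur together, each half of the time, give 1 bit; events that occur independently give 0 bits.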

Page 24

Mutual information for base-pairs

A measure of the surprisingness of the evolution of two positions in the sequence.

$$M(i,j) = \sum_{B_1,B_2 \in \{A,C,G,T\}} f_{i,j}(B_1,B_2) \, \log_2 \frac{f_{i,j}(B_1,B_2)}{f_i(B_1)\,f_j(B_2)}$$

• (i,j) is a pair of sequence positions.
• The sum is over all base-pair types.
• f_{i,j}(B1,B2) is the frequency of base-pair (B1,B2) at positions (i,j) = N(B1,B2)/N(total sequences).
• f_i(B1) f_j(B2) is the expected frequency of base B1, B2.

Exercise: Calculate M(i,j)=___________

[Figure: alignment for the exercise, axes labeled species and position, numbered 1 to 12, with columns i and j marked.]
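
For an alignment, the sum runs over the base pairs actually seen in columns i and j. Below is a sketch assuming the alignment is a list of equal-length strings; `column_mi` and `top_pairs` are hypothetical names, the toy alignment is invented for this example (it is not the exercise on the slide), and gap characters get no special treatment.

```python
from collections import Counter
from math import log2

def column_mi(msa, i, j):
    """M(i,j) in bits for columns i and j (0-based) of an alignment."""
    n = len(msa)
    f_ij = Counter((s[i], s[j]) for s in msa)
    f_i = Counter(s[i] for s in msa)
    f_j = Counter(s[j] for s in msa)
    return sum((c / n) * log2((c / n) / ((f_i[b1] / n) * (f_j[b2] / n)))
               for (b1, b2), c in f_ij.items())

def top_pairs(msa, count=3):
    """Column pairs as (M, i, j) tuples, ranked by M(i,j), highest first."""
    length = len(msa[0])
    scored = [(column_mi(msa, i, j), i, j)
              for i in range(length) for j in range(i + 1, length)]
    return sorted(scored, reverse=True)[:count]

# Invented toy alignment: columns 1 and 4 covary as Watson-Crick partners,
# while columns 0, 2, 3 and 5 are fully conserved.
toy = ["GACGUC", "GUCGAC", "GCCGGC", "GGCGCC"]
```

In the toy alignment, columns 1 and 4 each show four equally frequent bases that always change together, so M(1,4) = log2(4) = 2 bits, while any pair involving a conserved column scores 0.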

Page 25

Can you find pairs of positions with high MI?

W-C non-canonical

Page 26

Mutual information matrix for 20 aligned sequences

very noisy.

Take home message: You need lots of sequence to do mutual information analysis.

And this is for RNA where the signal is strong. Try protein. You'll need thousands of sequences....
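
Part of the noise comes from estimating frequencies from too few sequences: even two columns that vary independently show positive MI in a small sample. The simulation sketch below (uniformly random bases, fixed seed, hypothetical function names) compares alignments of 20 and 302 sequences, the two depths shown in these slides. It illustrates sampling noise only and is not a model of the lecture's data.

```python
import random
from collections import Counter
from math import log2

def mi(col_a, col_b):
    """Mutual information in bits between two equally long columns."""
    n = len(col_a)
    joint = Counter(zip(col_a, col_b))
    fa, fb = Counter(col_a), Counter(col_b)
    return sum((c / n) * log2((c / n) / ((fa[a] / n) * (fb[b] / n)))
               for (a, b), c in joint.items())

def mean_background_mi(n_seqs, trials=200, seed=0):
    """Average MI between two independent, uniformly random columns."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        col_a = [rng.choice("ACGU") for _ in range(n_seqs)]
        col_b = [rng.choice("ACGU") for _ in range(n_seqs)]
        total += mi(col_a, col_b)
    return total / trials

print(mean_background_mi(20), mean_background_mi(302))
```

The true MI between independent columns is 0 bits, so whatever the simulation reports is pure sampling noise, and it shrinks as the number of sequences grows.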

Page 27

Same RNA, M(i,j) for 302 aligned sequences

Page 28

Helices now appear as straight lines of dots, after re-numbering to remove gaps in the MSA.

Corresponding structure.

[Figure: structure with regions labeled H, I, M, I, H.]

Page 29

Prediction of RNA secondary structure

• Dot plot
 - Easy.
 - Can be done on a single sequence.
 - Cannot find non-canonical base pairs.

Free energy calculations

Comparative methods, phylogenetics

• Mutual information (MI)
 - Requires a deep multiple sequence alignment
 - Can find non-canonical base-pairs.

Page 30

Review questions?

1. What are the IUPAC codes?
2. What are the different types of RNA structure?
3. What is mutual information? How is it calculated?
4. What are the sources of energy for RNA structure?
5. What algorithm is used to calculate the energy over all RNA structures?
6. What is expressed in a dot plot?
7. What is expressed in a circle plot?
8. What is a pseudoknot?
9. What is a non-canonical basepair?
10. Can you convert a dotplot into a graph?
11. Can you convert a graph into a circle plot?
12. Can you see a pseudoknot in a graph, circle, dotplot?