Motif discovery Tutorial 5

Motif discovery Tutorial 5. Motif discovery: MEME creates motif PSSM de-novo (unknown motif); MAST searches for a PSSM in a DB; TOMTOM searches for a PSSM.

Jan 03, 2016


Alan Dean
Transcript
Page 1

Motif discovery

Tutorial 5

Page 2

Motif discovery

• MEME: Creates motif PSSM de-novo (unknown motif)
• MAST: Searches for a PSSM in a DB
• TOMTOM: Searches for a PSSM in motif DBs

Agenda

Cool story of the day: How NOT to be a bioinformatician

Page 3

Motif – definition

Motif: a widespread pattern with a biological significance.

Sequence motif

PTB (RNA binding protein)

UCUU

CAP (DNA binding protein)

TGTGAXXXXXXTCACAXT

Page 4

Sequence motif – definition

     1    2    3    4    5    6    7    8    9    10
A    0    0    0    0    0    3/6  1/6  2/6  0    0
D    0    3/6  2/6  0    0    1/6  5/6  1/6  0    1/6
E    0    0    4/6  1    0    0    0    0    1    5/6
G    0    1/6  0    0    1    2/6  0    0    0    0
H    0    1/6  0    0    0    0    0    0    0    0
N    0    1/6  0    0    0    0    0    0    0    0
Y    1    0    0    0    0    0    0    3/6  0    0

..YDEEGGDAEE..
..YDEEGGDAEE..
..YGEEGADYED..
..YDEEGADYEE..
..YNDEGDDYEE..
..YHDEGAADEE..

Motif: a nucleotide or amino-acid sequence pattern that is widespread and has a biological significance

PSSM - position-specific scoring matrix
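
As a minimal sketch of how a frequency matrix like the one on this slide can be derived, the following Python counts the residues in each column of the six example sites. The function name `column_frequencies` is illustrative and not part of any MEME tool. A real PSSM would usually also add pseudocounts and convert frequencies to log-odds scores, which is left out here.

```python
from collections import Counter
from fractions import Fraction

# The six aligned motif occurrences from the slide (flanking dots removed).
SITES = [
    "YDEEGGDAEE",
    "YDEEGGDAEE",
    "YGEEGADYED",
    "YDEEGADYEE",
    "YNDEGDDYEE",
    "YHDEGAADEE",
]


def column_frequencies(sites):
    """Return one {residue: frequency} dict per alignment column."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for column in zip(*sites):
        counts = Counter(column)
        matrix.append({res: Fraction(n, len(sites)) for res, n in counts.items()})
    return matrix


if __name__ == "__main__":
    for pos, freqs in enumerate(column_frequencies(SITES), start=1):
        row = ", ".join(f"{res}={freq}" for res, freq in sorted(freqs.items()))
        print(f"position {pos}: {row}")
```

Residues that never occur in a column are simply absent from that column's dict, which corresponds to the zeros in the matrix. Each column's frequencies sum to 1.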

Page 5

Can we find motifs using multiple sequence alignment (MSA)?

YES! NO

Local multiple sequence alignment is a hard problem to solve

Page 6

Motif search: from de-novo motifs to motif annotation

gapped motifs

Large DNA data

http://meme.sdsc.edu/

Page 7

MEME

Page 8

MEME – Multiple EM* for Motif Elicitation

• Motif discovery from unaligned sequences - genomic or protein sequences

• Flexible model of motif presence (Motif can be absent in some sequences or appear several times in one sequence)

*Expectation-maximization

http://meme.sdsc.edu/

Page 9

MEME - Input

Input file (fasta file)

How many times in each sequence?

How many motifs?

How many sites?

Range of motif lengths
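
The same choices can be made when MEME is run from the command line instead of the web form. The sketch below only assembles an argument list and does not run anything. The file names are hypothetical, and the option names (`-mod`, `-nmotifs`, `-minw`, `-maxw`, `-minsites`, `-maxsites`, `-oc`) are assumed from the MEME Suite documentation, so they should be checked against the installed version.

```python
import shlex


def build_meme_command(fasta_path, out_dir, alphabet="dna", mod="zoops",
                       nmotifs=3, minw=6, maxw=12,
                       minsites=None, maxsites=None):
    """Assemble a `meme` argument list from the input choices on this slide."""
    if alphabet not in ("dna", "protein"):
        raise ValueError("alphabet must be 'dna' or 'protein'")
    # How many times in each sequence: exactly one (oops),
    # zero or one (zoops), or any number of repetitions (anr).
    if mod not in ("oops", "zoops", "anr"):
        raise ValueError("mod must be 'oops', 'zoops' or 'anr'")
    if minw > maxw:
        raise ValueError("minw must not exceed maxw")
    cmd = [
        "meme", fasta_path, "-" + alphabet,
        "-oc", out_dir,                            # output directory (overwritten)
        "-mod", mod,                               # occurrences per sequence
        "-nmotifs", str(nmotifs),                  # how many motifs
        "-minw", str(minw), "-maxw", str(maxw),    # range of motif lengths
    ]
    # How many sites: optional bounds on the number of motif occurrences.
    if minsites is not None:
        cmd += ["-minsites", str(minsites)]
    if maxsites is not None:
        cmd += ["-maxsites", str(maxsites)]
    return cmd


if __name__ == "__main__":
    # "promoters.fasta" and "meme_out" are placeholder names.
    print(shlex.join(build_meme_command("promoters.fasta", "meme_out")))
```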

Page 10

MEME - Output

Motif e-value

Page 11

MEME – Sequence logo

Motif length

Number of appearances

Motif e-value

A graphical representation of the sequence motif

Page 12

MEME – Sequence logo

High information content = High confidence

The relative sizes of the letters indicate their frequency in the sequences. The total height of the letters depicts the information content of the position, in bits.
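
To make the logo heights concrete, here is a minimal sketch of the usual calculation: the information content of a position is the maximum possible entropy, log2 of the alphabet size, minus the Shannon entropy of the observed frequencies, and each letter's height is its frequency times that value. The function names are illustrative. Logo tools may additionally apply a small-sample correction or use a non-uniform background, which this sketch does not do.

```python
import math


def information_content(freqs, alphabet_size):
    """Bits of information at one position: log2(alphabet size) minus entropy."""
    entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
    return math.log2(alphabet_size) - entropy


def letter_heights(freqs, alphabet_size):
    """Height of each letter in the logo column: frequency times information content."""
    ic = information_content(freqs, alphabet_size)
    return {res: p * ic for res, p in freqs.items()}


if __name__ == "__main__":
    # DNA examples (alphabet size 4): a fully conserved column,
    # a two-way split, and a column with no preference at all.
    conserved = {"A": 1.0}
    split = {"A": 0.5, "G": 0.5}
    uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    for name, column in [("conserved", conserved), ("split", split), ("uniform", uniform)]:
        print(name, information_content(column, 4), letter_heights(column, 4))
```

A fully conserved DNA position reaches the maximum of 2 bits, while a position where all four bases are equally likely carries 0 bits and disappears from the logo.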

Page 13

Multilevel Consensus

MEME – Sequence logo

Page 14

Patterns can be presented as regular expressions

[AG]-x-V-x(2)-{YW}

[] - Either residue
x - Any residue
x(2) - Any residue in the next 2 positions
{} - Any residue except these

Examples: AYVACM, GGVGAA
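
The pattern notation on this slide maps closely onto ordinary regular expressions. The sketch below converts only the four constructs listed here into a Python `re` pattern and checks the two example sequences. The helper name `prosite_to_regex` is illustrative, and other pattern features (anchors, for instance) are not handled.

```python
import re


def prosite_to_regex(pattern):
    """Translate a pattern such as [AG]-x-V-x(2)-{YW} into a Python regex."""
    parts = []
    for element in pattern.split("-"):
        # Separate the residue specification from an optional repeat like "(2)".
        match = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = match.group(1), match.group(2)
        if core == "x":
            core = "."                          # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"      # any residue except these
        # "[...]" (either residue) and single letters are already valid regex.
        parts.append(core + ("{" + repeat + "}" if repeat else ""))
    return "".join(parts)


if __name__ == "__main__":
    regex = prosite_to_regex("[AG]-x-V-x(2)-{YW}")
    print(regex)
    for seq in ["AYVACM", "GGVGAA", "AYVACY"]:
        print(seq, bool(re.fullmatch(regex, seq)))
```

`AYVACM` and `GGVGAA` match, while `AYVACY` does not because its last residue is one of the excluded ones.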

Page 15

Sequence names

Position in sequence

Strength of match

Motif within sequence

MEME – motif alignment

Page 16

Overall strength of motif matches

Motif location in the input sequence

MEME – motif locations

Sequence names

Page 17

What can we do with motifs?

• MAST - Search for them in non-annotated sequence databases (protein and DNA).

• TOMTOM - Find the protein that binds the DNA motif.

Page 18

MAST

Page 19

MAST

• Searches for motifs (one or more) in sequence databases:
  – Like BLAST, but with motifs as input
  – Similar to iterations of PSI-BLAST

• Profile defines strength of match
  – Multiple motif matches per sequence

• MEME uses MAST to summarize results:
  – Each MEME result is accompanied by the MAST result for searching the discovered motifs on the given sequences.

http://meme.sdsc.edu/meme4_4_0/cgi-bin/mast.cgi
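
To illustrate what "profile defines strength of match" can mean in practice, here is a minimal sketch of scanning a sequence with a log-odds scoring matrix. It is not MAST's actual procedure: MAST also converts scores into p-values and combines the evidence from several motifs per sequence, which is omitted. The example sites, the uniform background, and the pseudocount of 1 are assumptions chosen for illustration.

```python
import math


def log_odds_matrix(sites, alphabet, pseudocount=1.0):
    """Per-column log2(frequency / background) scores, uniform background."""
    background = 1.0 / len(alphabet)
    matrix = []
    for column in zip(*sites):
        total = len(column) + pseudocount * len(alphabet)
        matrix.append({
            res: math.log2((column.count(res) + pseudocount) / total / background)
            for res in alphabet
        })
    return matrix


def scan(sequence, matrix):
    """Score every window of the sequence; return a list of (start, score)."""
    width = len(matrix)
    return [
        (start, sum(col[res] for col, res in zip(matrix, sequence[start:start + width])))
        for start in range(len(sequence) - width + 1)
    ]


def best_hit(sequence, matrix):
    """Return the (start, score) pair with the highest score."""
    return max(scan(sequence, matrix), key=lambda hit: hit[1])


if __name__ == "__main__":
    # Hypothetical DNA sites, used only to build a small matrix.
    sites = ["TGTGA", "TGTGA", "TGAGA", "TGTCA"]
    matrix = log_odds_matrix(sites, "ACGT")
    start, score = best_hit("CCCTGTGACCC", matrix)
    print(f"best window starts at {start} with score {score:.2f}")
```

Every window receives a score, so a sequence can contain several matches of differing strength, in line with the "multiple motif matches per sequence" point above.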

Page 20

MAST - Input

Input file (motifs)

Database

Page 21

If you wish to use motifs discovered by MEME

Page 22

MAST - Output

Input motifs

Presence of the motifs in a given database

Page 23

MAST – Output (another example, global view)

Page 24

MAST – Output (another example, global view)

Page 25

TOMTOM

Page 26

TOMTOM

• Searches one or more query DNA motifs against one or more databases of target motifs, and reports for each query a list of target motifs, ranked by p-value.

• The output contains results for each query, in the order that the queries appear in the input file.

http://meme.sdsc.edu/meme/doc/tomtom.html
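
As a rough sketch of what comparing a query motif against a target motif involves, the code below slides one frequency matrix along the other without gaps and reports the offset with the smallest mean column-wise Euclidean distance. This is a simplified stand-in, not TOMTOM's method: TOMTOM turns such column comparisons into p-values, which is what its ranking is based on. The function names, the `min_overlap` parameter, and the example matrices are assumptions made for illustration.

```python
import math


def column_distance(col_a, col_b):
    """Euclidean distance between two frequency columns (same residue order)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(col_a, col_b)))


def best_alignment(query, target, min_overlap=3):
    """Slide query over target without gaps; return (offset, mean distance)."""
    best = None
    for offset in range(-(len(query) - min_overlap), len(target) - min_overlap + 1):
        # Pair up the columns that overlap at this offset.
        pairs = [
            (query[i], target[i + offset])
            for i in range(len(query))
            if 0 <= i + offset < len(target)
        ]
        if len(pairs) < min_overlap:
            continue
        score = sum(column_distance(a, b) for a, b in pairs) / len(pairs)
        if best is None or score < best[1]:
            best = (offset, score)
    return best


if __name__ == "__main__":
    # Columns are [A, C, G, T] frequencies; the query is the core of the target.
    target = [
        [0.25, 0.25, 0.25, 0.25],
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.25, 0.25, 0.25, 0.25],
    ]
    query = target[1:4]
    print(best_alignment(query, target))
```

Requiring a minimum overlap keeps a short, accidental one-column match from winning. Running the same comparison against every motif in a database and sorting by score gives a ranked list of targets per query.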

Page 27

TOMTOM - Input

Input motif

Background frequencies

Database

Page 28

TOMTOM - Output

Input motif

Matching motifs

Page 29

TOMTOM – Output

Wrong input (RNA sequence of RNA binding protein NOVA1)

“OK” results

Page 30

MAST vs. TOMTOM

             MAST                  TOMTOM
Comparison   Profile against DB    Profile against Profile
DB           General DBs           Known motif DBs

Page 31

Cool Story of the day

How NOT to be a bioinformatician
