Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November 2007

Mar 28, 2015


Tyler McMillan
Transcript
Page 1: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Methods course

Multiple sequence alignment and Reconstruction of phylogenetic trees

Burkhard Morgenstern, Fabian Schreiber

Göttingen, October/November 2007

Page 2: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

Multiple alignment: basis of (almost) all methods for sequence analysis in bioinformatics

Page 3: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

T Y I M R E A Q Y E

T C I V M R E A Y E

Page 4: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

T Y I - M R E A Q Y E

T C I V M R E A - Y E

Page 5: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

T Y I M R E A Q Y E

T C I V M R E A Y E

Y I M Q E V Q Q E

Y I A M R E Q Y E

Page 6: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

T Y I - M R E A Q Y E

T C I V M R E A - Y E

Y - I - M Q E V Q Q E

Y - I A M R E - Q Y E

Page 7: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

T Y I - M R E A Q Y E

T C I V M R E A - Y E

- Y I - M Q E V Q Q E

Y - I A M R E - Q Y E

Astronomical Number of possible alignments!
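
To make "astronomical" concrete, here is a minimal sketch that counts the possible alignments of just two sequences. It assumes (this is not stated on the slide) that every alignment column is either a residue pair, a gap in the first sequence, or a gap in the second. The name `count_alignments` is illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_alignments(m: int, n: int) -> int:
    """Number of gapped alignments of two sequences of lengths m and n."""
    if m == 0 or n == 0:
        # Only one way left: pad the remaining residues with gaps.
        return 1
    # The last column is (residue, residue), (residue, gap) or (gap, residue).
    return (count_alignments(m - 1, n - 1)
            + count_alignments(m - 1, n)
            + count_alignments(m, n - 1))


for length in (2, 5, 10, 20):
    print(length, count_alignments(length, length))
```

Even for two sequences of length 10 the count is already in the millions, and it grows much faster once more sequences are added.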

Page 8: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

T Y I - M R E A Q Y E

T C I V - M R E A Y E

- Y I - M Q E V Q Q E

Y - I A M R E - Q Y E

Astronomical Number of possible alignments!

Page 9: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

T Y I - M R E A Q Y E

T C I V M R E A - Y E

- Y I - M Q E V Q Q E

Y - I A M R E - Q Y E

Which one is the best ???

Page 10: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

Questions in development of alignment programs:

(1) What is a good alignment?

→ objective function ('score')

(2) How to find a good alignment?

→ optimization algorithm

Page 11: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

What is a biologically good alignment ??

Page 12: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

Criteria for alignment quality:

1. 3D-Structure: align residues at corresponding positions in 3D structure of protein!

2. Evolution: align residues with common ancestors!

Page 13: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

T Y I - M R E A Q Y E

T C I V M - R E A Y E

- Y I - M Q E V Q Q E

- Y I A M R E - Q Y E

Alignment hypothesis about sequence evolution

Search for most plausible hypothesis!

Page 14: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

T Y I - M R E A Q Y E

T C I V - M R E A Y E

- Y I - M Q E V Q Q E

- Y I A M R E - Q Y E

Alignment hypothesis about sequence evolution

Search for most plausible hypothesis!

Page 15: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

Compute for amino acids a and b

Probability p_{a,b} of substitution a → b (or b → a),

Frequency q_a of a

Define similarity score s(a,b) based on p_{a,b} and q_a

Result: similarity matrix (substitution matrix), e.g. PAM (Dayhoff matrix), BLOSUM, …
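
The slide does not give the formula for s(a,b). One common choice, used here as an assumption, is the log-odds ratio of the pair probability to the product of the single-residue frequencies. The sketch treats the pair probability as the joint probability of seeing a aligned with b; the half-bit scaling and all numbers are illustrative only.

```python
import math


def log_odds_score(p_ab: float, q_a: float, q_b: float) -> float:
    """Log-odds similarity score in half-bit units (the scaling is an assumption)."""
    # Ratio of the observed pair probability to the probability expected
    # if a and b occurred independently of each other.
    return 2.0 * math.log2(p_ab / (q_a * q_b))


# Toy numbers, invented for illustration only.
print(log_odds_score(p_ab=0.02, q_a=0.1, q_b=0.1))    # pair seen more often than chance
print(log_odds_score(p_ab=0.005, q_a=0.1, q_b=0.1))   # pair seen less often than chance
```

Pairs that occur more often than expected by chance get a positive score, pairs that occur less often get a negative one.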

Page 17: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

Page 18: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

Traditional objective functions:

Define Score of alignments as

Sum of individual similarity scores s(a,b) of aligned amino acid residues

Gap penalty g for each gap in alignment

Optimal alignment can be calculated for two sequences, but in practice not for more than 8 sequences
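
The slide does not name the algorithm for the two-sequence case. The sketch below assumes the standard dynamic-programming method (Needleman-Wunsch) with a penalty g per gap position. The scoring function `toy_s` is a placeholder, not a real substitution matrix.

```python
def nw_score(x: str, y: str, s, g: float) -> float:
    """Score of an optimal global alignment of x and y (Needleman-Wunsch)."""
    m, n = len(x), len(y)
    # F[i][j] = best score for aligning x[:i] with y[:j].
    F = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        F[i][0] = -g * i
    for j in range(1, n + 1):
        F[0][j] = -g * j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + s(x[i - 1], y[j - 1]),  # align two residues
                F[i - 1][j] - g,                          # gap in y
                F[i][j - 1] - g,                          # gap in x
            )
    return F[m][n]


def toy_s(a: str, b: str) -> float:
    # Placeholder scores; a real run would use PAM or BLOSUM values.
    return 2.0 if a == b else -1.0


print(nw_score("TYWIV", "TLV", toy_s, g=3.0))
```

The table has (m+1)(n+1) cells for two sequences. The same idea applied to k sequences needs a k-dimensional table, which is why it becomes impractical for more than a handful of sequences.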

Page 19: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Example:

T Y W I V

T - - L V

Score = s(T,T) + s(I,L) + s(V,V) - 2g
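
The example score can be reproduced with a short sketch. It assumes gaps are written as "-" and that g is charged once per gap position, as in the example. `toy_s` is a placeholder scoring function with invented values.

```python
def alignment_score(row1: str, row2: str, s, g: float) -> float:
    """Sum of s(a, b) over aligned residue pairs minus g per gap position."""
    if len(row1) != len(row2):
        raise ValueError("alignment rows must have the same length")
    score = 0.0
    for a, b in zip(row1, row2):
        if a == "-" or b == "-":
            score -= g          # one penalty per gap character
        else:
            score += s(a, b)    # similarity of the aligned pair
    return score


def toy_s(a: str, b: str) -> float:
    # Placeholder values, not taken from PAM or BLOSUM.
    return 2.0 if a == b else -1.0


# s(T,T) + s(I,L) + s(V,V) - 2g with the toy values above and g = 3
print(alignment_score("TYWIV", "T--LV", toy_s, g=3.0))
```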

Page 20: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

Most commonly used heuristic for multiple alignment:

Progressive alignment (mid 1980s):

Idea: calculate multiple alignment as series of pairwise alignments of sequences and profiles

Use guide tree to determine order of pairwise alignments

Page 21: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

'Progressive' Alignment

WCEAQTKNGQGWVPSNYITPVN

WWRLNDKEGYVPRNLLGLYP

AVVIQDNSDIKVVPKAKIIRD

YAVESEAHPGSFQPVAALERIN

WLNYNETTGERGDFPGTYVEYIGRKKISP

Page 22: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

'Progressive' Alignment

WCEAQTKNGQGWVPSNYITPVN

WWRLNDKEGYVPRNLLGLYP

AVVIQDNSDIKVVPKAKIIRD

YAVESEAHPGSFQPVAALERIN

WLNYNETTGERGDFPGTYVEYIGRKKISP

Guide tree

Page 23: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

'Progressive' Alignment

WCEAQTKNGQGWVPSNYITPVN

WW--RLNDKEGYVPRNLLGLYP-

AVVIQDNSDIKVVP--KAKIIRD

YAVESEASFQPVAALERIN

WLNYNEERGDFPGTYVEYIGRKKISP

Profile alignment, “once a gap - always a gap”

Page 24: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

'Progressive' Alignment

WCEAQTKNGQGWVPSNYITPVN

WW--RLNDKEGYVPRNLLGLYP-

AVVIQDNSDIKVVP--KAKIIRD

YAVESEASVQ--PVAALERIN------

WLN-YNEERGDFPGTYVEYIGRKKISP

Profile alignment, “once a gap - always a gap”

Page 25: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

'Progressive' Alignment

WCEAQTKNGQGWVPSNYITPVN-

WW--RLNDKEGYVPRNLLGLYP-

AVVIQDNSDIKVVP--KAKIIRD

YAVESEASVQ--PVAALERIN------

WLN-YNEERGDFPGTYVEYIGRKKISP

Profile alignment, “once a gap - always a gap”

Page 26: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

'Progressive' Alignment

WCEAQTKNGQGWVPSNYITPVN--------

WW--RLNDKEGYVPRNLLGLYP--------

AVVIQDNSDIKVVP--KAKIIRD-------

YAVESEA---SVQ--PVAALERIN------

WLN-YNE---ERGDFPGTYVEYIGRKKISP

Profile alignment, “once a gap - always a gap”

Page 27: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

CLUSTAL W

Most important software program: CLUSTAL W:

J. Thompson, T. Gibson, D. Higgins (1994, Nuc. Acids Res.)

(22,327 citations in the literature!, Oct 2007)

Page 28: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

Problems with traditional approach:

Results depend on gap penalty

Heuristic guide tree determines alignment;

alignment used for phylogeny reconstruction

Algorithm produces global alignments.

Page 29: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for multiple sequence alignment

Problems with traditional approach:

But:

Many sequence families share only local similarity

E.g. sequences share one conserved motif

Page 30: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Local sequence alignment

Find common motif in sequences; ignore the rest

EYENS

ERYENS

ERYAS

Page 31: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Local sequence alignment

Find common motif in sequences; ignore the rest

E-YENS

ERYENS

ERYA-S

Page 32: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Local sequence alignment

Find common motif in sequences; ignore the rest – Local alignment

E-YENS

ERYENS

ERYA-S

Page 33: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Gibbs Motif Sampler

Local multiple alignment without gaps:

E.g. Gibbs sampling

C.E. Lawrence et al. (1993, Science)

Page 34: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Traditional alignment approaches:

Either global or local methods!

Page 35: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

New question: sequence families with multiple local similarities

Neither local nor global methods applicable

Page 36: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

New question: sequence families with multiple local similarities

Alignment possible if order conserved

Page 37: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

The DIALIGN approach

Morgenstern, Dress, Werner (1996, Proc Natl. Acad. Sci.)

Combination of global and local methods

Assemble multiple alignment from gap-free local pairwise alignments ("fragments")
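
A fragment pairs two equal-length segments without gaps, so it lies on a single diagonal of the comparison matrix. The sketch below finds the best-scoring fragment between two sequences. It scores +1 per match and -1 per mismatch, which is a simplification chosen for illustration and not DIALIGN's own fragment weighting. The function name `best_fragment` is hypothetical.

```python
def best_fragment(x: str, y: str):
    """Best gap-free local pairwise alignment ("fragment") of x and y.

    Returns (score, start_in_x, start_in_y, length), scoring +1 per match
    and -1 per mismatch (a stand-in for DIALIGN's own fragment weights).
    """
    best = (0, 0, 0, 0)
    # Each diagonal pairs x[i] with y[i - offset]; no gaps are possible on it.
    for offset in range(-(len(y) - 1), len(x)):
        i = max(offset, 0)
        j = i - offset
        run_score, run_start = 0, i
        while i < len(x) and j < len(y):
            if run_score <= 0:
                # Restart the segment here (maximum-subarray idea).
                run_score, run_start = 0, i
            run_score += 1 if x[i] == y[j] else -1
            if run_score > best[0]:
                best = (run_score, run_start, run_start - offset,
                        i - run_start + 1)
            i += 1
            j += 1
    return best


print(best_fragment("atctaatagttaaactcccccgtgcttag",
                    "cagtgcgtgtattactaacggttcaatcgcg"))
```

A full segment-based aligner would collect many such fragments for every sequence pair and keep only a subset that can be combined into one alignment without contradictions.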

Page 38: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

The DIALIGN approach

atctaatagttaaactcccccgtgcttag

cagtgcgtgtattactaacggttcaatcgcg

caaagagtatcacccctgaattgaataa

Page 39: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

The DIALIGN approach

atctaatagttaaactcccccgtgcttag

cagtgcgtgtattactaacggttcaatcgcg

caaagagtatcacccctgaattgaataa

Page 40: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

The DIALIGN approach

atctaatagttaaactcccccgtgcttag

cagtgcgtgtattactaacggttcaatcgcg

caaagagtatcacccctgaattgaataa

Page 41: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

The DIALIGN approach

atctaatagttaaactcccccgtgcttag

cagtgcgtgtattactaacggttcaatcgcg

caaagagtatcacccctgaattgaataa

Page 42: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

The DIALIGN approach

atctaatagttaaactcccccgtgcttag

cagtgcgtgtattactaacggttcaatcgcg

caaagagtatcacccctgaattgaataa

Page 43: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

The DIALIGN approach

atctaatagttaaactcccccgtgcttag

cagtgcgtgtattactaacggttcaatcgcg

caaagagtatcacccctgaattgaataa

Page 44: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

The DIALIGN approach

atc------taatagttaaactcccccgtgcttag

cagtgcgtgtattactaacggttcaatcgcg

caaagagtatcacccctgaattgaataa

Page 45: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

The DIALIGN approach

atc------taatagttaaactcccccgtgcttag

cagtgcgtgtattactaacggttcaatcgcg

caaa--gagtatcacccctgaattgaataa

Page 46: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

The DIALIGN approach

atc------taatagttaaactcccccgtgcttag

cagtgcgtgtattactaacggttcaatcgcg

caaa--gagtatcacc----------cctgaattgaataa

Page 47: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

The DIALIGN approach

atc------taatagttaaactcccccgtgc-ttag

cagtgcgtgtattactaac----------gg-ttcaatcgcg

caaa--gagtatcacc----------cctgaattgaataa

Page 48: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

The DIALIGN approach

atc------taatagttaaactcccccgtgc-ttag

cagtgcgtgtattactaac----------gg-ttcaatcgcg

caaa--gagtatcacc----------cctgaattgaataa

Consistency!

Page 49: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

The DIALIGN approach

atc------TAATAGTTAaactccccCGTGC-TTag

cagtgcGTGTATTACTAAc----------GG-TTCAATcgcg

caaa--GAGTATCAcc----------CCTGaaTTGAATaa

Page 50: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

The DIALIGN approach

Advantages of segment-based approach:

Program can produce global and local alignments!

Sequence families alignable that cannot be aligned with standard methods

Page 51: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

T-COFFEE

C. Notredame, D. Higgins, J. Heringa (2000, J. Mol. Biol.)

Combination of global and local methods

Page 52: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

T-COFFEE

SeqA GARFIELD THE LAST FAT CAT

SeqB GARFIELD THE FAST CAT

SeqC GARFIELD THE VERY FAST CAT

SeqD THE FAT CAT

Page 53: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

T-COFFEE

SeqA GARFIELD THE LAST FAT CAT

SeqB GARFIELD THE FAST CAT

SeqC GARFIELD THE VERY FAST CAT

SeqD THE FAT CAT

Page 54: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

T-COFFEE

SeqA GARFIELD THE LAST FAT CAT
SeqB GARFIELD THE FAST CAT
SeqC GARFIELD THE VERY FAST CAT
SeqD THE FAT CAT

Progressive Alignment:

SeqA GARFIELD THE LAST FA-T CAT
SeqB GARFIELD THE FAST CA-T ---
SeqC GARFIELD THE VERY FAST CAT
SeqD ---------THE ---- FA-T CAT

Pairwise Alignments:

SeqA GARFIELD THE LAST FAT CAT
SeqB GARFIELD THE FAST CAT ---

SeqA GARFIELD THE LAST FA-T CAT
SeqC GARFIELD THE VERY FAST CAT

SeqA GARFIELD THE LAST FAT CAT
SeqD ---------THE ---- FAT CAT

SeqB GARFIELD THE ---- FAST CAT
SeqC GARFIELD THE VERY FAST CAT

SeqB GARFIELD THE FAST CAT
SeqD ---------THE FA-T CAT

SeqC GARFIELD THE VERY FAST CAT
SeqD ---------THE ---- FA-T CAT

Page 55: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Mixing Heterogeneous Data With T-Coffee

[Figure: diagram with the labels Local Alignment, Global Alignment, Multiple Alignment, Structural, Specialist, and Multiple Sequence Alignment]

Page 57: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

T-COFFEE

Idea:

1. Build library of pairwise alignments

2. Alignments of seq i, j and of seq j, k support the alignment of seq i, k.
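
The support idea can be sketched as a "triplet" extension of the library. The data layout (a dict of weighted residue pairs) and the rule of adding the smaller of the two weights for each intermediate residue are assumptions made for illustration. The function name `extend_library` is hypothetical.

```python
from collections import defaultdict


def extend_library(library: dict) -> dict:
    """Triplet extension of a library of weighted residue pairs.

    `library` maps frozenset({r1, r2}) -> weight, where each residue r is a
    (sequence_name, position) tuple taken from a pairwise alignment.
    """
    # Index: residue -> {partner residue: weight}
    partners = defaultdict(dict)
    for pair, w in library.items():
        r1, r2 = tuple(pair)
        partners[r1][r2] = w
        partners[r2][r1] = w

    extended = defaultdict(float)
    for pair, w in library.items():
        extended[pair] += w                      # direct evidence
    for mid, linked in partners.items():
        residues = list(linked)
        for i, a in enumerate(residues):
            for c in residues[i + 1:]:
                if a[0] == c[0]:
                    continue                     # same sequence: not an alignable pair
                # a-mid and mid-c together support a-c.
                extended[frozenset((a, c))] += min(linked[a], linked[c])
    return dict(extended)


library = {
    frozenset({("SeqA", 0), ("SeqB", 0)}): 88.0,
    frozenset({("SeqB", 0), ("SeqC", 0)}): 100.0,
    frozenset({("SeqA", 0), ("SeqC", 0)}): 77.0,
}
for pair, weight in extend_library(library).items():
    print(sorted(pair), weight)
```

Residue pairs that are confirmed through a third sequence gain weight, while an isolated pairwise match keeps only its direct weight.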

Page 58: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

T-COFFEE

Less sensitive to spurious pairwise similarities

Can handle local homologies better than CLUSTAL

Page 59: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Evaluation of multi-alignment methods

Alignment evaluation by comparison to trusted benchmark alignments.

'True' alignment known by information about structure or evolution.
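
One way to compare a program's output with a benchmark alignment is a sum-of-pairs score: the fraction of residue pairs aligned in the reference that are also aligned in the test alignment. The sketch assumes both alignments contain the same sequences in the same order and use "-" for gaps. The function names are illustrative.

```python
def aligned_pairs(alignment: list[str]) -> set:
    """All residue pairs that share a column, as ((seq, pos), (seq, pos))."""
    counters = [0] * len(alignment)          # residues seen so far per sequence
    pairs = set()
    for col in range(len(alignment[0])):
        present = []
        for seq, row in enumerate(alignment):
            if row[col] != "-":
                present.append((seq, counters[seq]))
                counters[seq] += 1
        for i, a in enumerate(present):
            for b in present[i + 1:]:
                pairs.add((a, b))
    return pairs


def sum_of_pairs_score(test: list[str], reference: list[str]) -> float:
    """Fraction of reference residue pairs that the test alignment recovers."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        return 0.0
    return len(ref_pairs & aligned_pairs(test)) / len(ref_pairs)


reference = ["TYI-M", "TCIVM"]
print(sum_of_pairs_score(["TYIM-", "TCIVM"], reference))
```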

Page 60: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Evaluation of multi-alignment methods

BAliBASE reference alignments

1aboA  1 .NLFVALYDfvasgdntlsitkGEKLRVLgynhn..............gE
1ycsB  1 kGVIYALWDyepqnddelpmkeGDCMTIIhrede............deiE
1pht   1 gYQYRALYDykkereedidlhlGDILTVNkgslvalgfsdgqearpeeiG
1ihvA  1 .NFRVYYRDsrd......pvwkGPAKLLWkg.................eG
1vie   1 .drvrkksga.........awqGQIVGWYctnlt.............peG

1aboA 36 WCEAQt..kngqGWVPSNYITPVN......
1ycsB 39 WWWARl..ndkeGYVPRNLLGLYP......
1pht  51 WLNGYnettgerGDFPGTYVEYIGrkkisp
1ihvA 27 AVVIQd..nsdiKVVPRRKAKIIRd.....
1vie  28 YAVESeahpgsvQIYPVAALERIN......

Key: alpha helix RED, beta strand GREEN, core blocks UNDERSCORE

Page 61: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Result: DIALIGN best method for distantly related sequences, T-Coffee best for globally related proteins

Page 62: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Evaluation of multi-alignment methods

Conclusion: no single best multi alignment program!

Advice: try different methods!

Page 63: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for phylogeny reconstruction

Two approaches covered in this course:

Distance methods, e.g. Neighbour-Joining

Maximum Likelihood

Other important methods (not covered in this course):

Maximum parsimony

Bayesian approaches

Page 64: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for phylogeny reconstruction

Phylogenetic trees:

rooted trees

unrooted trees

Many methods produce unrooted trees: find root using outgroup!

Page 65: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Phylogenetic Reconstruction: An Example

Biological question: Are sponges mono- or paraphyletic?

Organism of interest: Sponge

Page 66: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Build Dataset

Dataset

Query Sequence

DNA/Protein sequence from sponge gene

Search for homologs using e.g. BLAST

Hits from search: “putative” homologs

Page 67: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Sequence alignment

Dataset

Sequence Alignment

Hits from search: “putative” homologs

Use alignment tools to relate the sequences to each other:

- ClustalW
- T-Coffee
- DIALIGN
- ... many more

Page 68: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Estimate Phylogeny

Alignment

Phylogenetic Tree

Phylogeny methods:

Distance-based: NJ, UPGMA

Parsimony: Max. Parsimony (Phylip/Paup)

Statistical: Max. Likelihood (Phyml), Bayesian Inf. (MrBayes)

Page 69: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Interpret results

Hypothesis: Sponges are monophyletic

Page 70: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for phylogeny reconstruction

Distance methods:

For N sequences S_1, …, S_N: calculate distance d(i,j) for any two sequences S_i and S_j

Goal: find tree that represents all distances d(i,j) as closely as possible

To calculate distances d(i,j): construct multiple alignment of input sequences, consider substitutions implied by alignment
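
A minimal sketch of how distances can be read off an alignment. It uses the fraction of differing positions (p-distance) and, as an optional correction for multiple substitutions at the same site, the Jukes-Cantor formula for DNA. Both choices are assumptions; the slides do not say which distance measure the course uses.

```python
import math


def p_distance(row_i: str, row_j: str) -> float:
    """Fraction of differing positions among columns without gaps."""
    compared = differing = 0
    for a, b in zip(row_i, row_j):
        if a == "-" or b == "-":
            continue                     # ignore columns with a gap
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return differing / compared


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor correction for DNA; defined only for p < 0.75."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def distance_matrix(alignment: list[str]) -> list[list[float]]:
    """Matrix of pairwise p-distances d(i, j) for all rows of an alignment."""
    n = len(alignment)
    return [[p_distance(alignment[i], alignment[j]) for j in range(n)]
            for i in range(n)]


for row in distance_matrix(["ACGTACGT", "ACGTACGA", "AC-TTCGA"]):
    print(row)
```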

Page 71: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Matrix of pairwise distances d(i,j)

Page 72: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Find tree that corresponds to distances d(i,j)
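
As an illustration of going from a distance matrix to a tree, here is a sketch of UPGMA (average-linkage clustering). UPGMA is used because it is short; the course names Neighbour-Joining as its distance method, which chooses the pair to join differently. The sketch returns the topology only, without branch lengths, and assumes unique sequence names.

```python
def upgma(names: list[str], d: list[list[float]]) -> str:
    """Cluster by average linkage and return the topology in Newick form."""
    # Each cluster: label -> list of member indices into d.
    clusters = {name: [i] for i, name in enumerate(names)}

    def avg(c1, c2):
        # Mean distance over all pairs with one member in each cluster.
        return sum(d[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        labels = list(clusters)
        # Pick the pair of clusters with the smallest average distance.
        a, b = min(
            ((x, y) for k, x in enumerate(labels) for y in labels[k + 1:]),
            key=lambda pair: avg(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters.pop(a) + clusters.pop(b)
        clusters[f"({a},{b})"] = merged
    return next(iter(clusters)) + ";"


toy_d = [
    [0, 2, 8, 8],
    [2, 0, 8, 8],
    [8, 8, 0, 2],
    [8, 8, 2, 0],
]
print(upgma(["A", "B", "C", "D"], toy_d))
```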

Page 73: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for phylogeny reconstruction

Maximum likelihood:

Consider evolution of sequences as random process. Stochastic model assigns probabilities to substitutions.

Consider tree T as hypothesis about observed sequence data D

Search for tree with highest likelihood P(D|T)

Page 74: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Tools for phylogeny reconstruction

Assumptions:

Positions in sequences (columns in alignment) independent of each other

Events on different branches of tree independent of each other

Result: probabilities can be multiplied

Page 75: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Probability P(D|T) for given residues at internal nodes

Page 78: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Consider all possible residues for internal nodes
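
Summing over all residues at the internal nodes can be done efficiently from the leaves upwards. The sketch below computes P(D|T) for one alignment column of DNA, assuming the Jukes-Cantor substitution model and uniform residue frequencies at the root. The model and the tree encoding as nested tuples are illustrative assumptions and are not taken from the slides.

```python
import math

BASES = "ACGT"


def jc_prob(a: str, b: str, t: float) -> float:
    """Jukes-Cantor probability of a -> b along a branch of length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e


def partial_likelihoods(node) -> dict:
    """P(observed leaves below node | residue at node), for each residue.

    A leaf is a one-letter string; an internal node is a tuple of
    (child, branch_length) pairs.
    """
    if isinstance(node, str):
        return {x: 1.0 if x == node else 0.0 for x in BASES}
    result = {x: 1.0 for x in BASES}
    for child, t in node:
        below = partial_likelihoods(child)
        for x in BASES:
            # Sum over all possible residues y at the child node;
            # branches are independent, so the factors are multiplied.
            result[x] *= sum(jc_prob(x, y, t) * below[y] for y in BASES)
    return result


def site_likelihood(tree) -> float:
    """P(D|T) for one alignment column, with uniform root frequencies."""
    root = partial_likelihoods(tree)
    return sum(0.25 * root[x] for x in BASES)


# Column with residues A, A, C on the tree ((leaf1, leaf2), leaf3).
tree = (((("A", 0.1), ("A", 0.1)), 0.2), ("C", 0.3))
print(site_likelihood(tree))
```

Because columns are assumed independent, the likelihood of a whole alignment is the product of the per-column values. In practice the logarithms are summed to avoid numerical underflow.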

Page 79: Methods course Multiple sequence alignment and Reconstruction of phylogenetic trees Burkhard Morgenstern, Fabian Schreiber Göttingen, October/November.

Testing the reliability of a tree (or parts of it): the bootstrap approach

Bootstrap in general: repeat statistical test after random “re-sampling”, i.e. by drawing new samples with replacement from the observed data.

In phylogeny:

1. Randomly select columns from the alignment (with replacement) and repeat tree reconstruction with the same method (e.g. 1000 times)

2. Calculate for every branch: how often is it observed in newly constructed trees?
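
The two steps can be sketched as follows. The resampling function is complete. The tree-building step is left to the caller: `build_tree` and `splits_of` are hypothetical callables that stand for whatever reconstruction method and tree representation are in use.

```python
import random
from collections import Counter


def resample_columns(alignment: list[str], rng: random.Random) -> list[str]:
    """Draw as many columns as the alignment has, with replacement."""
    length = len(alignment[0])
    cols = [rng.randrange(length) for _ in range(length)]
    return ["".join(row[c] for c in cols) for row in alignment]


def bootstrap_support(alignment, build_tree, splits_of, replicates=1000, seed=0):
    """Fraction of replicate trees that contain each branch (split).

    `build_tree` and `splits_of` are hypothetical callables supplied by the
    caller: the first reconstructs a tree from an alignment, the second
    returns the set of branches (as hashable splits) of a tree.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(replicates):
        tree = build_tree(resample_columns(alignment, rng))
        counts.update(splits_of(tree))
    return {split: n / replicates for split, n in counts.items()}


print(resample_columns(["ACGTAC", "AGGTAA", "ACCTAC"], random.Random(1)))
```

A branch that appears in most replicate trees is well supported by the data, whereas a branch that appears only rarely depends on a few alignment columns.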
