Phylogenetics. What is a tree & how many are there? Principles of phylogenetic reconstruction. Special Issues. Rooting a tree. The Molecular Clock. Almost Clocks.

Dec 31, 2015


Lionel Davidson
Transcript
Page 1:

Phylogenetics

What is a tree & how many are there?

Principles of phylogenetic reconstruction.

Special Issues

Rooting a tree

The Molecular Clock

Almost Clocks.

Page 2:

Trees – graphical & biological.

A graph is a set of vertices (nodes) {v1,..,vk} and a set of edges {e1=(vi1,vj1),..,en=(vin,vjn)}. Edges can be directed, in which case (vi,vj) is viewed as different from (opposite in direction to) (vj,vi), or undirected.

Nodes can be labelled or unlabelled. In phylogenies the leaves are labelled and the rest unlabelled.

The degree of a node is the number of edges it is a part of. A leaf has degree 1.

A graph is connected if any two nodes have a path connecting them.

A tree is a connected graph without any cycles, i.e. only one path between any two nodes.
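The definitions above can be checked mechanically. The following is a minimal sketch, assuming an undirected graph given as a list of node labels and a list of node pairs; the function names are illustrative and not from the slides. It relies on the standard fact that a connected graph on k nodes with exactly k-1 edges contains no cycle.

```python
from collections import deque


def degree(node, edges):
    """Number of edges the node is part of (self-loops are not handled)."""
    return sum(node in edge for edge in edges)


def is_tree(nodes, edges):
    """Return True if the undirected graph (nodes, edges) is a tree."""
    nodes = list(nodes)
    if not nodes or len(edges) != len(nodes) - 1:
        return False
    adjacency = {v: [] for v in nodes}
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)
    # Breadth-first search from an arbitrary node to test connectedness.
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(nodes)


nodes = ["v1", "v2", "v3", "v4"]
star = [("v1", "v2"), ("v2", "v3"), ("v2", "v4")]
print(is_tree(nodes, star), degree("v2", star), degree("v1", star))
```

In the example, v1, v3 and v4 are leaves (degree 1) and v2 is the only internal node.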

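[Figure: example graph on the nodes v1, v2, v3 and v4. An undirected edge is written (v1, v2); a directed edge is written (v2, v4) or (v4, v2).]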

Page 3:

Trees & phylogenies.

A tree with k nodes has k-1 edges (easy to show by induction).

A root is a special node with degree 2 that is interpreted as the point furthest back in time. The leaves are interpreted as being contemporary.

A root introduces a time direction in a tree.

A rooted tree is said to be bifurcating if every node other than the leaves and the root has degree 3, corresponding to 1 ancestor and 2 children. An unrooted tree with this property is said to have valency 3.

Edges can be labelled with a positive real number interpreted as time duration or amount of evolution.

If the length of the path from the root to every leaf is the same, the tree obeys a molecular clock.

Tree Topology: Discrete structure – phylogeny without branch lengths.

[Figure: rooted tree with the root, the internal nodes and the leaves labelled.]

Page 4:

Enumerating Trees: Unrooted & valency 3

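[Figure: the single unrooted tree on the leaves 1, 2, 3; the 3 trees obtained by attaching leaf 4 to one of its edges; and the 5 positions where leaf 5 can be attached to a tree on the leaves 1, 2, 3, 4.]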

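Number of unrooted trees with valency 3 on n labelled leaves:

$$T_n = \prod_{j=3}^{n-1} (2j-3) = \frac{(2n-5)!}{(n-3)!\,2^{n-3}}$$

n     4   5    6     7     8       9          10         15          20
T_n   3   15   105   945   10395   1.4*10^5   2.0*10^6   7.9*10^12   2.2*10^20

Recursion: T_n = (2n-5) T_{n-1}. Initialisation: T_1 = T_2 = T_3 = 1.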
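The recursion can be run directly to reproduce the table. This is a small sketch; the function names are illustrative, and the closed form used for cross-checking is the double factorial (2n-5)!! written with ordinary factorials, valid for n >= 3.

```python
from math import factorial


def unrooted_trees(n):
    """Number of unrooted valency-3 trees on n labelled leaves,
    from T_n = (2n-5) * T_{n-1} with T_1 = T_2 = T_3 = 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    count = 1
    for m in range(4, n + 1):
        count *= 2 * m - 5
    return count


def unrooted_trees_closed_form(n):
    """Same count as (2n-5)! / ((n-3)! * 2^(n-3)), for n >= 3."""
    return factorial(2 * n - 5) // (factorial(n - 3) * 2 ** (n - 3))


for n in (4, 5, 6, 7, 8, 9, 10, 15, 20):
    print(n, unrooted_trees(n))
```

Rooting an unrooted tree amounts to attaching one extra leaf, so the number of rooted trees on n leaves is unrooted_trees(n + 1); for n = 4 this gives 15.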

Page 5:

Local operations on trees.

Nearest Neighbor Interchange:

Subtree cut and regrafting – (subtree root kept)

Subtree cut and regrafting – (subtree root possibly new)

[Figure: a tree with the subtrees A, B, C and D before and after a nearest neighbor interchange.]

Page 6:

Central Principles of Phylogeny Reconstruction

Parsimony

Distance

Likelihood

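[Figure: the same tree on the four sequences s1, s2, s3, s4 (TTCAGT, TCCAGT, GCCAAT, GCCAAT) shown once for each principle. Parsimony: number of substitutions on each branch, total weight 4. Distance: pairwise distances and fitted branch lengths. Likelihood: L=3.1*10^-7, with parameter estimates. Branch lengths shown: 0.4, 0.6, 0.3, 0.7, 1.5.]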

Page 7:

Distance Concepts on Trees I

A: Metric, d( , ):
i: d(a,b)=0 <=> a=b
ii: d(a,b)=d(b,a)
iii: d(a,b) <= d(a,c) + d(c,b)

[Figure: three points a, b and c illustrating the triangle inequality iii.]

Page 8:

Tree Metric: (distance function originates from tree)

d(x,y) + d(z,w) = d(x,z) + d(y,w) > d(x,w) + d(y,z), where x,y,z,w is a permutation of a,b,c,d.

(> implies that no branch has length 0)

Distance Concepts on Trees II

[Figure: unrooted tree on the leaves s1, s2, s3 and s4.]

Reconstruction Principle: d(s1,i) = (d(s1,s2) + d(s1,s3) - d(s2,s3))/2
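Both the four-point condition and the reconstruction principle are easy to test numerically. The sketch below assumes distances stored as a nested dictionary; the helper names and the example tree ((s1:1, s2:2):5, (s3:3, s4:4)) are hypothetical. The check accepts equality of all three sums, i.e. it also allows an internal branch of length 0, which the strict ">" on the slide excludes.

```python
from itertools import combinations


def symmetric(pairs):
    """Build a nested dict d[a][b] = d[b][a] from {(a, b): distance}."""
    d = {}
    for (a, b), value in pairs.items():
        d.setdefault(a, {})[b] = value
        d.setdefault(b, {})[a] = value
    return d


def is_tree_metric(d, tol=1e-9):
    """Four-point condition: for every quartet, the two largest of the
    three pairwise sums must be equal."""
    for x, y, z, w in combinations(sorted(d), 4):
        sums = sorted([d[x][y] + d[z][w], d[x][z] + d[y][w], d[x][w] + d[y][z]])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


def distance_to_branch_point(d, s1, s2, s3):
    """d(s1, i), where i is the node at which the paths to s2 and s3 part."""
    return (d[s1][s2] + d[s1][s3] - d[s2][s3]) / 2


# Hypothetical tree: leaf branches s1:1, s2:2, s3:3, s4:4, internal branch 5.
example = symmetric({("s1", "s2"): 3, ("s1", "s3"): 9, ("s1", "s4"): 10,
                     ("s2", "s3"): 10, ("s2", "s4"): 11, ("s3", "s4"): 7})
print(is_tree_metric(example), distance_to_branch_point(example, "s1", "s2", "s3"))
```

The reconstruction principle recovers the leaf branch lengths of the hypothetical tree, for instance 1 for s1 and 3 for s3.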

[Figure: the leaves s1, s2 and s3 joined at the internal node i.]

Page 9:

Ultra Metric (distance function originates from tree)

d(x,y) = d(x,z) > d(y,z), where x,y,z is a permutation of a,b,c.

(> implies that no branch has length 0)

Distance Concepts on Trees III

[Figure: clock tree on s1, s2 and s3; i is the most recent common ancestor of s1 and s2.]

Reconstruction Principle: d(s1,i) = d(s1,s2)/2
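A sketch of the ultrametric (three-point) condition, assuming the same nested-dictionary layout for distances; the names and example values are illustrative. As with the tree metric check, equality of all three distances is accepted here.

```python
from itertools import combinations


def is_ultrametric(d, tol=1e-9):
    """Three-point condition: in every triple, the two largest of the
    three distances must be equal."""
    for x, y, z in combinations(sorted(d), 3):
        _, middle, largest = sorted([d[x][y], d[x][z], d[y][z]])
        if abs(largest - middle) > tol:
            return False
    return True


def distance_to_mrca(d, s1, s2):
    """Under a clock, both leaves are equally far from their common ancestor."""
    return d[s1][s2] / 2


# Hypothetical clock tree ((s1, s2), s3): s1 and s2 split at height 1, root at height 3.
clock = {"s1": {"s2": 2, "s3": 6}, "s2": {"s1": 2, "s3": 6}, "s3": {"s1": 6, "s2": 6}}
print(is_ultrametric(clock), distance_to_mrca(clock, "s1", "s2"))
```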

Page 10:

Unweighted Pair-Group Method with Arithmetic Mean

Input: Matrix with pairwise distances between sequences, D.

1: Find the smallest distance, d_{i,j}.

2: i,j are now siblings with a distance, d_{i,j}/2, to their MRCA (i,j).

3: Form a new distance matrix of dimension (n-1)*(n-1) where i and j have been replaced by (i,j). All distances to (i,j) are d_{k,(i,j)} = (d_{k,i} + d_{k,j})/2 (when i or j is already a cluster, the average is taken over all of its leaves).

4: This is done n-1 times and the tree has been reconstructed.

Output: An ultrametric.

Comment: i. If UPGMA is given an ultrametric, it will reconstruct the same ultrametric.
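The steps above can be sketched as follows. This is an illustrative implementation, not the one used for the slides: the input layout (a list of names plus a symmetric list-of-lists matrix) and the output layout (nested tuples (left, right, height)) are assumptions. The distance update weights each cluster by its number of leaves, which is the usual UPGMA average; for two single leaves it reduces to the plain mean given in step 3.

```python
def upgma(names, matrix):
    """Build a clock tree by UPGMA.

    names  -- list of leaf labels
    matrix -- symmetric list of lists, matrix[a][b] = distance between
              names[a] and names[b]
    Returns the tree as nested tuples (left, right, height), where height
    is the distance from that node down to any of its leaves.
    """
    n = len(names)
    clusters = {a: (names[a], 1) for a in range(n)}  # id -> (subtree, size)
    dist = {(a, b): matrix[a][b] for a in range(n) for b in range(a + 1, n)}
    next_id = n
    while len(clusters) > 1:
        # Step 1: find the smallest distance d_ij.
        (i, j), dij = min(dist.items(), key=lambda item: item[1])
        tree_i, size_i = clusters.pop(i)
        tree_j, size_j = clusters.pop(j)
        del dist[(i, j)]
        # Step 3: distances from the merged cluster to every other cluster.
        for k in clusters:
            dik = dist.pop((min(i, k), max(i, k)))
            djk = dist.pop((min(j, k), max(j, k)))
            dist[(k, next_id)] = (size_i * dik + size_j * djk) / (size_i + size_j)
        # Step 2: i and j become siblings at height d_ij / 2.
        clusters[next_id] = ((tree_i, tree_j, dij / 2), size_i + size_j)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Hypothetical ultrametric input: (A,B) split at height 1, (C,D) at 2, root at 3.
D = [[0, 2, 6, 6],
     [2, 0, 6, 6],
     [6, 6, 0, 4],
     [6, 6, 4, 0]]
print(upgma(["A", "B", "C", "D"], D))
```

Because the example matrix is ultrametric, the heights in the output reproduce it, in line with comment i.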

UPGMA (Sokal and Michener, 1958)

Page 11:

Assignment to internal nodes: The simple way.

[Figure: tree with nucleotides at the leaves and unknown nucleotides (?) at the internal nodes.]

What is the cheapest assignment of nucleotides to internal nodes, given some (symmetric) distance function d(N1,N2)?

If there are k leaves, there are k-2 internal nodes and 4^(k-2) possible assignments of nucleotides. For k=22, this is more than 10^12.

Page 12:

Cost of a history - minimizing over internal states

[Figure: a node with the states A, C, G, T above its left and right subtrees, each with the states A, C, G, T. One term of the minimisation is marked: d(C,G) + w_C(left subtree).]

$$w_G(\text{subtree}) = \min_{N \in \text{Nucleotides}} \{ d(G,N) + w_N(\text{left subtree}) \} + \min_{N \in \text{Nucleotides}} \{ d(G,N) + w_N(\text{right subtree}) \}$$

Page 13:

Cost of a history – leaves (initialisation).

[Figure: the leaves G and A, each with the states A, C, G, T; the subtree below a leaf is empty and has cost 0.]

Initialisation at the leaves: Cost(N) = 0 if N is the nucleotide at the leaf, otherwise infinity.

Page 14:

Fitch-Hartigan-Sankoff Algorithm

Costs: transition 2, transversion 5. Each node carries its cost vector for (A,C,G,T); * is infinity.

                 (A,C,G,T)
                 (9,7,7,7)
                  /     \
                 /       \
         (A, C, G, T)     \
         (10,2,10,2)       \
           /     \          \
          /       \          \
    (A,C,G,T)  (A,C,G,T)  (A,C,G,T)
    (*,0,*,*)  (*,*,*,0)  (*,*,0,*)

The cost of cheapest tree hanging from this node given there is a “C” at this node
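The recursion can be sketched in a few lines. The tree encoding below (a leaf is a one-letter string, an internal node is a pair of subtrees) and the function names are assumptions made for illustration; the costs are those of the slide (transition 2, transversion 5), and the leaves C, T and G are read off the cost vectors of the example.

```python
INF = float("inf")
NUCLEOTIDES = "ACGT"
PURINES = {"A", "G"}


def substitution_cost(x, y, transition=2, transversion=5):
    """d(x, y): 0 if equal, transition cost within purines or within
    pyrimidines, transversion cost otherwise."""
    if x == y:
        return 0
    same_class = (x in PURINES) == (y in PURINES)
    return transition if same_class else transversion


def sankoff(tree):
    """Return {nucleotide: cost of the cheapest history of the subtree,
    given that nucleotide at the subtree's root}."""
    if isinstance(tree, str):
        # Initialisation: cost 0 for the observed nucleotide, infinity otherwise.
        return {n: (0 if n == tree else INF) for n in NUCLEOTIDES}
    left, right = (sankoff(child) for child in tree)
    return {
        g: min(substitution_cost(g, n) + left[n] for n in NUCLEOTIDES)
        + min(substitution_cost(g, n) + right[n] for n in NUCLEOTIDES)
        for g in NUCLEOTIDES
    }


costs = sankoff((("C", "T"), "G"))
print(costs, min(costs.values()))
```

For the example tree this reproduces the vectors (10,2,10,2) at the internal node and (9,7,7,7) at the root, so the cheapest history has cost 7.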

[Figure: the four nucleotides A, C, G and T.]

Page 15:

5S RNA Alignment & Phylogeny (Hein, 1990)

10 tatt-ctggtgtcccaggcgtagaggaaccacaccgatccatctcgaacttggtggtgaaactctgccgcggt--aaccaatact-cg-gg-gggggccct-gcggaaaaatagctcgatgccagga--ta
17 t--t-ctggtgtcccaggcgtagaggaaccacaccaatccatcccgaacttggtggtgaaactctgctgcggt--ga-cgatact-tg-gg-gggagcccg-atggaaaaatagctcgatgccagga--t-
 9 t--t-ctggtgtctcaggcgtggaggaaccacaccaatccatcccgaacttggtggtgaaactctattgcggt--ga-cgatactgta-gg-ggaagcccg-atggaaaaatagctcgacgccagga--t-
14 t----ctggtggccatggcgtagaggaaacaccccatcccataccgaactcggcagttaagctctgctgcgcc--ga-tggtact-tg-gg-gggagcccg-ctgggaaaataggacgctgccag-a--t-
 3 t----ctggtgatgatggcggaggggacacacccgttcccataccgaacacggccgttaagccctccagcgcc--aa-tggtact-tgctc-cgcagggag-ccgggagagtaggacgtcgccag-g--c-
11 t----ctggtggcgatggcgaagaggacacacccgttcccataccgaacacggcagttaagctctccagcgcc--ga-tggtact-tg-gg-ggcagtccg-ctgggagagtaggacgctgccag-g--c-
 4 t----ctggtggcgatagcgagaaggtcacacccgttcccataccgaacacggaagttaagcttctcagcgcc--ga-tggtagt-ta-gg-ggctgtccc-ctgtgagagtaggacgctgccag-g--c-
15 g----cctgcggccatagcaccgtgaaagcaccccatcccat-ccgaactcggcagttaagcacggttgcgcccaga-tagtact-tg-ggtgggagaccgcctgggaaacctggatgctgcaag-c--t-
 8 g----cctacggccatcccaccctggtaacgcccgatctcgt-ctgatctcggaagctaagcagggtcgggcctggt-tagtact-tg-gatgggagacctcctgggaataccgggtgctgtagg-ct-t-
12 g----cctacggccataccaccctgaaagcaccccatcccgt-ccgatctgggaagttaagcagggttgagcccagt-tagtact-tg-gatgggagaccgcctgggaatcctgggtgctgtagg-c--t-
 7 g----cttacgaccatatcacgttgaatgcacgccatcccgt-ccgatctggcaagttaagcaacgttgagtccagt-tagtact-tg-gatcggagacggcctgggaatcctggatgttgtaag-c--t-
16 g----cctacggccatagcaccctgaaagcaccccatcccgt-ccgatctgggaagttaagcagggttgcgcccagt-tagtact-tg-ggtgggagaccgcctgggaatcctgggtgctgtagg-c--t-
 1 a----tccacggccataggactctgaaagcactgcatcccgt-ccgatctgcaaagttaaccagagtaccgcccagt-tagtacc-ac-ggtgggggaccacgcgggaatcctgggtgctgt-gg-t--t-
18 a----tccacggccataggactctgaaagcaccgcatcccgt-ccgatctgcgaagttaaacagagtaccgcccagt-tagtacc-ac-ggtgggggaccacatgggaatcctgggtgctgt-gg-t--t-
 2 a----tccacggccataggactgtgaaagcaccgcatcccgt-ctgatctgcgcagttaaacacagtgccgcctagt-tagtacc-at-ggtgggggaccacatgggaatcctgggtgctgt-gg-t--t-
 5 g---tggtgcggtcataccagcgctaatgcaccggatcccat-cagaactccgcagttaagcgcgcttgggccagaa-cagtact-gg-gatgggtgacctcccgggaagtcctggtgccgcacc-c--c-
13 g----ggtgcggtcataccagcgttaatgcaccggatcccat-cagaactccgcagttaagcgcgcttgggccagcc-tagtact-ag-gatgggtgacctcctgggaagtcctgatgctgcacc-c--t-
 6 g----ggtgcgatcataccagcgttaatgcaccggatcccat-cagaactccgcagttaagcgcgcttgggttggag-tagtact-ag-gatgggtgacctcctgggaagtcctaatattgcacc-c-tt-

[Figure: phylogeny of the 18 aligned sequences, with numbered nodes.]

Transitions 2, transversions 5

Total weight 843.

Fungi

Animals

Mitochondria Plants

Prokaryotes

Page 16:

The Felsenstein Zone (Felsenstein-Cavender, 1979)

[Figure: tree on the leaves s1, s2, s3 and s4.]

Patterns (16, only 8 shown):

0 1 0 0 0 0 0 0
0 0 1 0 0 1 0 1
0 0 0 1 0 1 1 0
0 0 0 0 1 0 1 1

[Figure: the true tree and the reconstructed tree on the leaves s1, s2, s3 and s4.]

Page 17:

Bootstrapping (Felsenstein, 1985)

[Figure: an alignment of four sequences (each row drawn as ATCTGTAGTCT) with 11 columns. Columns are drawn with replacement; the number of times each column is drawn is 1 0 2 3 0 1 0 1 2 0 1. The resampled alignment of the same length is used to rebuild the tree on the leaves 1, 2, 3 and 4.]
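One bootstrap replicate draws as many columns as the alignment has, with replacement. The sketch below shows only this resampling step; the function name and the return layout are assumptions, and the alignment used is the four-sequence example from the slide on central principles. Rebuilding a tree from each replicate and counting how often a grouping recurs is left out.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample the columns of an alignment with replacement.

    alignment -- list of equal-length strings (one per sequence)
    rng       -- a random.Random instance, so that runs are reproducible
    Returns (resampled alignment, how many times each original column was drawn).
    """
    length = len(alignment[0])
    drawn = [rng.randrange(length) for _ in range(length)]
    counts = [drawn.count(column) for column in range(length)]
    resampled = ["".join(sequence[c] for c in drawn) for sequence in alignment]
    return resampled, counts


alignment = ["TTCAGT", "TCCAGT", "GCCAAT", "GCCAAT"]
replicate, counts = bootstrap_alignment(alignment, random.Random(0))
print(replicate, counts)
```

The same column indices are used for every sequence, so each column of the replicate is an intact column of the original alignment.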

Page 18:

Probability of a pattern - summing over internal states

[Figure: tree with observed nucleotides at the leaves and unknown states (?) at the internal nodes; each internal node ranges over A, C, G, T.]

Page 19:

Probability of leaf observations - summing over internal states

[Figure: a node with the states A, C, G, T above its left and right subtrees, each with the states A, C, G, T. One term of the sum is marked: P(G→C) * P_C(left subtree).]

$$P_G(\text{subtree}) = \Big[ \sum_{N \in \text{Nucleotides}} P(G \to N)\, P_N(\text{left subtree}) \Big] \cdot \Big[ \sum_{N \in \text{Nucleotides}} P(G \to N)\, P_N(\text{right subtree}) \Big]$$

Initialisation: $P_G(\text{leaf}) = \delta_{\text{leaf},G}$, i.e. 1 if the leaf carries G and 0 otherwise.

Page 20:

[Figure: likelihood trees for the five sequences s1-s5, one estimated with a clock (rooted) and one without a clock (unrooted), with branch length estimates and their standard deviations (sd).]

Likelihood with clock: 7.9*10^-14 (parameter estimates: 0.31.1, 0.18.1). Likelihood without clock: 6.2*10^-12 (parameter estimates: 0.34.1, 0.16.1).

2*[ln(6.2*10^-12) - ln(7.9*10^-14)] is χ²-distributed with (n-2) degrees of freedom.
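The clock test can be worked through with the two likelihoods from the slide. The sketch below uses the usual likelihood ratio statistic, twice the difference in log-likelihood, and n = 5 sequences, so n-2 = 3 degrees of freedom. The χ² tail probability is computed with a small helper written for this example (an assumption, valid for integer degrees of freedom) rather than with a statistics library.

```python
import math


def chi2_sf(x, df):
    """P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))   # tail for 1 degree of freedom
    else:
        a, q = 1.0, math.exp(-z)              # tail for 2 degrees of freedom
    # Step up two degrees of freedom at a time:
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while 2 * a < df:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q


likelihood_clock = 7.9e-14      # from the slide
likelihood_no_clock = 6.2e-12   # from the slide
n = 5                           # sequences s1..s5

statistic = 2.0 * (math.log(likelihood_no_clock) - math.log(likelihood_clock))
p_value = chi2_sf(statistic, n - 2)
print(round(statistic, 2), round(p_value, 3))
```

With these numbers the statistic is about 8.7, above the 5% critical value of roughly 7.8 for 3 degrees of freedom, so the clock would be rejected at that level.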

Output from Likelihood Method

Page 21:

First noted by Zuckerkandl & Pauling (1964) as an empirical fact.

How can one detect it?

Known Ancestor Time          Unknown Ancestor Time

      /\   a at time T.            /\
     /  \                         /  \
    /    \                       ?    \
   /      \                     /\     \
  /        \                   /  \     \
 /          \                 /    \     \
s1          s2               s1    s2    s3

The Molecular Clock

Page 22:

3 billion years ago: no reliable clock, no outgroup.

Given 2 sets of homologous proteins, i.e. MDH & LDH, can the archaea (A), prokaryotes (P) and eukaryotes (E) be rooted?

[Figure: unrooted LDH and MDH trees on A, P and E, and below them rooted LDH and MDH trees on P, A and E.]

Rooting the 3 kingdoms

Page 23:

Purpose:
1) To give a time direction in the phylogeny & a most ancient point.
2) To be able to define concepts such as monophyletic group.

Methods:
1) Outgroup: Enhance the data set with a sequence from a species definitely distant to all of them. It will be joined at the root of the original data set.

2) Midpoint: Find midpoint of longest path in tree.

3) Assume Molecular Clock.

Rootings

Page 24:

(Illustration of Langley-Fitch)

[Figure: unrooted tree on s1, s2 and s3 with the branch lengths l1, l2 and l3 meeting at *, next to the rooted clock tree on s1, s2 and s3, for which {l1 = l2 < l3}.]

Given root: (2k-3)-(k-1) = (k-2) degrees of freedom lost in imposing a clock.

Assumptions
1. Ancestral sequences are observable.
2. The number of events on a branch is Poisson distributed with a mean proportional to the branch length. The same proportionality constant applies to all branches.
3. The observed number of differences between sequences at two neighboring nodes is the actual number of events.

[Figure: the same tree for two sets of sequences: sequences 1 (s1, s2, s3) with the branch lengths l1, l2, l3 and sequences 2 (s1', s2', s3') with the branch lengths c*l1, c*l2, c*l3.]

k sequences, s species: s(2k-3)s s(k-1) (2k-3)+s s+(k-1)

The generation/year-time clock

Page 25:

I Smoothing a non-clock tree onto a clock tree (Sanderson).

II Rate of Evolution of the Rate of Evolution (Thorne et al.). The rate of evolution can change at each bifurcation.

III Relaxed Molecular Clock (Huelsenbeck et al.). At random points in time, the rate changes by being multiplied by a gamma-distributed random variable.

Almost Clocks

M.J. Sanderson (1997): “A Nonparametric Approach to Estimating Divergence Times in the Absence of Rate Constancy”. Mol.Biol.Evol. 14(12).1218-31.
J.L. Thorne et al. (1998): “Estimating the Rate of Evolution of the Rate of Evolution”. Mol.Biol.Evol. 15(12).1647-57.
J.P. Huelsenbeck et al. (2000): “A Compound Poisson Process for Relaxing the Molecular Clock”. Genetics 154.1879-92.

Page 26:

Non-contemporaneous leaves.

A. Rambaut (2000): “Estimating the rate of molecular evolution: incorporating non-contemporaneous sequences into maximum likelihood phylogenies”. Bioinformatics 16(4).395-399.

Page 27:

In the presence of recombination and gene conversion, the relationship among sequences might not be describable by a phylogeny!

Recombination and the Molecular Clock I

Common Practice:
I Finding “the phylogeny” anyway.
II Testing for the molecular clock.

Page 28:

What are the consequences of this practice?
I Simulate data with a model including recombination.
II Reconstruct the phylogeny.
III Test for a clock.

Recombination and the Molecular Clock II

Schierup & Hein (2000): Recombination and the Molecular Clock. Mol.Biol.Evol. 17(10).1578-79.
Schierup & Hein (2000): Consequences of Recombination on Traditional Phylogenetic Analysis. Genetics 156.879-91.

Page 29:

History of Phylogenetic Methods

1958 Sokal and Michener publish the UPGMA method for making distance trees with a clock.

1964 Parsimony principle defined, but not advocated, by Edwards and Cavalli-Sforza.

1962-65 Zuckerkandl and Pauling introduce the notion of a Molecular Clock.

1967 First large molecular phylogenies by Fitch and Margoliash.

1969 Heuristic method used by Dayhoff to make trees and reconstruct ancestral sequences.

1970 Neyman analyzes a three-sequence stochastic model with Jukes-Cantor substitution.

1971-73 Fitch, Hartigan & Sankoff independently come up with the same algorithm for reconstructing parsimony ancestral sequences.

1973 Sankoff treats alignment and phylogenies as one general problem – phylogenetic alignment.

Page 30:

1979 Cavender and Felsenstein independently come up with the same evolutionary model where parsimony is inconsistent. Later called the “Felsenstein Zone”.

1981 Felsenstein: Maximum Likelihood Model & Program DNAML (in the program package PHYLIP).

1981 Parsimony tree problem is shown to be NP-Complete.

1985 Felsenstein introduces bootstrapping as confidence interval on phylogenies.

1986 Bandelt and Dress introduce split decomposition as a generalization of trees.

1985- Many authors (Sawyer, Hein, Stephens, M.Smith) try to address the problem of recombinations in phylogenies.

1997-9 Thorne et al., Sanderson & Huelsenbeck introduce the Almost Clock.

2000 Rambaut (and others) make methods that can find trees with non-contemporaneous leaves.

2001- Major rise in the interest in phylogenetic statistical alignment.

Page 31:

Books:
Molecular Systematics (1996) (eds. Hillis and Craig)
New Uses for Phylogenies (1996) (eds. P.Harvey)
W.Maddison and D.Maddison: MacClade
Semple & Steel (2003): Phylogenetics. OUP

Journals:
Molecular Biology and Evolution
J. Molecular Evolution
Molecular Phylogenetics
Systematic Biology
J. of Classification

www-pages:
PAUP – probably the best package for phylogenetic analysis available. David Swofford. http://www.lms.si.edu/PAUP/about.html

MacClade – W. & D. Maddison http://phylogeny.arizona.edu/macclade/macclade.html

PHYLIP – J. Felsenstein. http://depts.washington.edu/genetics/faculty/felsenstein.html

PAML – Z. Yang http://abacus.gene.ucl.ac.uk/

Phylogeny: literature, www and packages.

Page 32:

1: Error function: $\sum_{i<j} w_{i,j}\,(d_{i,j} - p_{i,j})^a$

2: Minimisation has two parts: topology & branch lengths. Try all topologies and solve the branch length problem for each.

3: $A_{(i,j),k}$ is an (n*(n-1)/2)*(2n-3) matrix with 1 if k is an edge on the path from i to j, 0 otherwise.

4: The path length between i & j, $p_{i,j}$, in the given topology is given by: $p_{i,j} = \sum_k A_{(i,j),k}\, s_k$.

5: If $w_{i,j} = 1$ and a=2, this can be solved by linear algebra: minimise $\sum_{i<j} \big(d_{i,j} - \sum_k A_{(i,j),k}\, s_k\big)^2$.

Global Fit Methods

Page 33:

Input: Distance matrix D.

1: For each leaf, the total distance to the others is calculated: r_i = d_{i,1} + d_{i,2} + ... + d_{i,n}.

2: Rate-corrected distance matrix, M, is constructed: m_{i,j} = d_{i,j} - (r_i + r_j)/(n-2). Only the minimal m_{i,j} is needed.

3: Make an ancestral node, u, of the i & j giving the minimal m_{i,j}. New branch lengths are defined by
s_{i,u} = d_{i,j}/2 + (r_i - r_j)/[2*(n-2)]
s_{j,u} = d_{i,j} - s_{i,u}

4: The distances from u to the others are set to d_{k,u} = (d_{i,k} + d_{j,k} - d_{i,j})/2.

Do this n-2 times
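The four steps can be sketched as below. This is an illustrative implementation under stated assumptions: r_i is taken as the summed distance from i to all other current nodes (the form for which the divisions by n-2 in steps 2 and 3 hold), new internal nodes get the made-up labels u0, u1, ..., and the tree is returned as a list of (node, node, branch length) edges. The loop stops when two nodes remain and joins them with a final edge.

```python
def neighbor_joining(names, matrix):
    """Neighbor joining on a symmetric distance matrix (list of lists).

    Returns a list of edges (node, node, branch length); internal nodes
    are labelled u0, u1, ... (labels chosen for this sketch).
    """
    nodes = list(names)
    d = {(a, b): matrix[x][y]
         for x, a in enumerate(names) for y, b in enumerate(names) if x != y}
    edges = []
    while len(nodes) > 2:
        n = len(nodes)
        # Step 1: summed distance from each node to all the others.
        r = {a: sum(d[(a, b)] for b in nodes if b != a) for a in nodes}
        # Step 2: the pair with the minimal rate-corrected distance m_ij.
        pairs = [(a, b) for x, a in enumerate(nodes) for b in nodes[x + 1:]]
        i, j = min(pairs, key=lambda p: d[p] - (r[p[0]] + r[p[1]]) / (n - 2))
        # Step 3: new ancestral node u and the two branch lengths.
        u = "u%d" % (len(edges) // 2)
        s_iu = d[(i, j)] / 2 + (r[i] - r[j]) / (2 * (n - 2))
        s_ju = d[(i, j)] - s_iu
        edges += [(i, u, s_iu), (j, u, s_ju)]
        # Step 4: distances from u to the remaining nodes.
        rest = [k for k in nodes if k not in (i, j)]
        for k in rest:
            d[(k, u)] = d[(u, k)] = (d[(i, k)] + d[(j, k)] - d[(i, j)]) / 2
        nodes = rest + [u]
    a, b = nodes
    edges.append((a, b, d[(a, b)]))
    return edges


# Hypothetical tree metric: leaf branches s1:1, s2:2, s3:3, s4:4, internal branch 5.
D = [[0, 3, 9, 10],
     [3, 0, 10, 11],
     [9, 10, 0, 7],
     [10, 11, 7, 0]]
for edge in neighbor_joining(["s1", "s2", "s3", "s4"], D):
    print(edge)
```

On this tree metric the method recovers the five branch lengths of the hypothetical tree exactly, although the input is not ultrametric.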

Alternative characterisation of the method: Start with the best quadratic fit of a tree with k (k<n) internal nodes, and add the internal branch that gives the largest improvement in the quadratic fit (now k+1 nodes). This is continued until the whole tree is built (k-1 internal nodes have been added).

Neighbor Joining (Saitou and Nei, 1987)

Page 34:

Ø = a low upper bound on the weight of the tree, possibly the weight of a well-guessed tree.

W(n) = the weight of the tree at node n.
R(n) = a high lower bound on the increase in weight from adding the remaining sequences.

Condition for bounding: W(n) + R(n) >= Ø

[Figure: search tree with the values 97, 7 and 102.]

How is R(n) computed?

A T C G A C G G T C G G *

Branch and Bound Algorithm

Page 35:

I. Bootstrapping columns in the alignment.

Example: Human, Chimp, Gorilla & Orangutan with root.

position: 1 2 3 4 5 6 7 8 9 12.586

H     T C T G A C G T T T G A ... C
C     T C T G A C G G T T G A ... C
G     T C T G A C G G T T G A ... C
O     T C A G A C G G T C G A ... C
root  T C A G A C G T A A G A ... C

15 possible trees, only 3 of relevance:

                                   (((H,C),G),O)   (((H,G),C),O)        (((C,G),H),O)
I. Bootstrap probabilities:        0.80            0.09                 0.11
II. Differences in likelihood:     0.0             -16.63 (sd=14.22)    -15.12 (sd=13.95)

Tree topology comparison.