Phylogenetic Trees Lecture 11

Jan 08, 2016

Prem

Page 1: Phylogenetic Trees Lecture 11

Phylogenetic Trees
Lecture 11

Sections 7.1, 7.2, in Durbin et al.

Note (1.6.04): The lecture ended about 10 minutes early, even though I did not rush. Is there room to add material?
Page 2: Phylogenetic Trees Lecture 11

Evolution

Evolution of new organisms is driven by:

Diversity: Different individuals carry different variants of the same basic blueprint.

Mutations: The DNA sequence can be changed by single-base changes, deletion/insertion of DNA segments, etc.

Selection bias

Page 3: Phylogenetic Trees Lecture 11

The Tree of Life

(Figure. Source: Alberts et al.)

Page 4: Phylogenetic Trees Lecture 11

Tree of life: a better picture

(Figure, after Ernst Haeckel, 1891)

Page 5: Phylogenetic Trees Lecture 11

Primate evolution

A phylogeny is a tree that describes the sequence of speciation events that led to the formation of a set of present-day species; it is also called a phylogenetic tree.

Page 6: Phylogenetic Trees Lecture 11

Historical Note

Until the mid-1950s, phylogenies were constructed by experts based on their opinion (subjective criteria).

Since then, the focus has been on objective criteria for constructing phylogenetic trees.

Thousands of articles have been published in the last decades.

Important for many aspects of biology:
Classification
Understanding biological mechanisms

Page 7: Phylogenetic Trees Lecture 11

Morphological vs. Molecular

Classical phylogenetic analysis uses morphological features: number of legs, lengths of legs, etc.

Modern biological methods allow the use of molecular features:
Gene sequences
Protein sequences

The analysis is based on homologous sequences (e.g., globins) in different species.

Page 8: Phylogenetic Trees Lecture 11

Morphological topology

(Figure: tree of mammals based on morphological features; based on McKenna and Bell, 1997.)

Leaves: Bonobo, Chimpanzee, Man, Gorilla, Sumatran orangutan, Bornean orangutan, Common gibbon, Barbary ape, Baboon, White-fronted capuchin, Slow loris, Tree shrew, Japanese pipistrelle, Long-tailed bat, Jamaican fruit-eating bat, Horseshoe bat, Little red flying fox, Ryukyu flying fox, Mouse, Rat, Vole, Cane-rat, Guinea pig, Squirrel, Dormouse, Rabbit, Pika, Pig, Hippopotamus, Sheep, Cow, Alpaca, Blue whale, Fin whale, Sperm whale, Donkey, Horse, Indian rhino, White rhino, Elephant, Aardvark, Grey seal, Harbor seal, Dog, Cat, Asiatic shrew, Long-clawed shrew, Small Madagascar hedgehog, Hedgehog, Gymnure, Mole, Armadillo, Bandicoot, Wallaroo, Opossum, Platypus

Group labels: Archonta, Glires, Ungulata, Carnivora, Insectivora, Xenarthra

Page 9: Phylogenetic Trees Lecture 11

From sequences to a phylogenetic tree

Rat      QEPGGLVVPPTDA
Rabbit   QEPGGMVVPPTDA
Gorilla  QEPGGLVVPPTDA
Cat      REPGGLVVPPTEG

There are many possible types of sequences to use (e.g., mitochondrial vs. nuclear proteins).

Page 10: Phylogenetic Trees Lecture 11

Mitochondrial topology

(Figure: tree of mammals based on mitochondrial sequences; based on Pupko et al.)

Leaves: Donkey, Horse, Indian rhino, White rhino, Grey seal, Harbor seal, Dog, Cat, Blue whale, Fin whale, Sperm whale, Hippopotamus, Sheep, Cow, Alpaca, Pig, Little red flying fox, Ryukyu flying fox, Horseshoe bat, Japanese pipistrelle, Long-tailed bat, Jamaican fruit-eating bat, Asiatic shrew, Long-clawed shrew, Mole, Small Madagascar hedgehog, Aardvark, Elephant, Armadillo, Rabbit, Pika, Tree shrew, Bonobo, Chimpanzee, Man, Gorilla, Sumatran orangutan, Bornean orangutan, Common gibbon, Barbary ape, Baboon, White-fronted capuchin, Slow loris, Squirrel, Dormouse, Cane-rat, Guinea pig, Mouse, Rat, Vole, Hedgehog, Gymnure, Bandicoot, Wallaroo, Opossum, Platypus

Group labels: Perissodactyla, Carnivora, Cetartiodactyla, Rodentia 1, Hedgehogs, Rodentia 2, Primates, Chiroptera, Moles + Shrews, Afrotheria, Xenarthra, Lagomorpha + Scandentia

Page 11: Phylogenetic Trees Lecture 11

Nuclear topology

(Figure: tree of mammals based on nuclear sequences, with groups numbered 1 to 4; tree by Madsen, based on a slide by Pupko et al.)

Leaves: Round Eared Bat, Flying Fox, Hedgehog, Mole, Pangolin, Whale, Hippo, Cow, Pig, Cat, Dog, Horse, Rhino, Rat, Capybara, Rabbit, Flying Lemur, Tree Shrew, Human, Galago, Sloth, Hyrax, Dugong, Elephant, Aardvark, Elephant Shrew, Opossum, Kangaroo

Group labels: Cetartiodactyla, Afrotheria, Chiroptera, Eulipotyphla, Glires, Xenarthra, Carnivora, Perissodactyla, Scandentia + Dermoptera, Pholidota, Primate

Page 12: Phylogenetic Trees Lecture 11

Theory of Evolution

Basic idea: speciation events lead to the creation of different species.

Speciation is caused by physical separation into groups where different genetic variants become dominant.

Any two species share a (possibly distant) common ancestor.

Page 13: Phylogenetic Trees Lecture 11

Phylogenetic trees

Leaves: current-day species
Nodes: hypothetical most recent common ancestors
Edge lengths: “time” from one speciation to the next

(Figure: tree with leaves Aardvark, Bison, Chimp, Dog, Elephant)

Page 14: Phylogenetic Trees Lecture 11

Dangers in Molecular Phylogenies

We have to emphasize that gene/protein sequences can be homologous for several different reasons:

Orthologs -- sequences diverged after a speciation event

Paralogs -- sequences diverged after a duplication event

Xenologs -- sequences diverged after a horizontal transfer (e.g., by virus)

Note (Esti Yeger-Lotem, 31.12.02): Paralogs are genes that were duplicated within the same organism and acquired different properties. When comparing genes (e.g., hemoglobin alpha and beta, present in both mouse and human), comparing human alpha with mouse beta will give misleading data on the distance.
Page 16: Phylogenetic Trees Lecture 11

Dangers of Paralogs

(Figure: gene tree with a gene duplication followed by speciation events; leaves 1A, 2A, 3A, 3B, 2B, 1B)

If we happen to consider genes 1A, 2B, and 3A of species 1, 2, 3, we get a wrong tree that does not represent the phylogeny of the host species of the given sequences, because duplication does not create new species.

In the sequel we assume all given sequences are orthologs.

Page 17: Phylogenetic Trees Lecture 11

Types of Trees

A natural model to consider is that of rooted trees.

(Figure: rooted tree whose root is labeled “Common Ancestor”)

Page 18: Phylogenetic Trees Lecture 11

Types of trees

An unrooted tree represents the same phylogeny without the root node.

Depending on the model, data from current-day species does not distinguish between different placements of the root.

Page 19: Phylogenetic Trees Lecture 11

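Rooted versus unrooted trees

(Figure: three rooted trees, Tree a, Tree b and Tree c, next to an unrooted tree with positions marked a, b, c)

The unrooted tree represents the three rooted trees.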

Page 20: Phylogenetic Trees Lecture 11

Positioning Roots in Unrooted Trees

We can estimate the position of the root by introducing an outgroup: a set of species that are definitely distant from all the species of interest.

(Figure: tree with leaves Aardvark, Bison, Chimp, Dog, Elephant and the outgroup Falcon, with the proposed root marked)

Page 21: Phylogenetic Trees Lecture 11

Type of Data

Distance-based: The input is a matrix of distances between species. A distance can be the fraction of residues on which two sequences disagree, or the alignment score between them, or …

Character-based: Examine each character (e.g., residue) separately.
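As a minimal sketch of the distance-based input, the fraction of disagreeing residues can be computed for the four aligned fragments shown on the "From sequences to a phylogenetic tree" slide. The function names `p_distance` and `distance_matrix` are illustrative choices, not part of the lecture.

```python
from itertools import combinations

# Aligned fragments copied from the sequence slide of this lecture.
sequences = {
    "Rat": "QEPGGLVVPPTDA",
    "Rabbit": "QEPGGMVVPPTDA",
    "Gorilla": "QEPGGLVVPPTDA",
    "Cat": "REPGGLVVPPTEG",
}

def p_distance(s, t):
    """Fraction of aligned positions at which s and t disagree."""
    if len(s) != len(t):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(s, t)) / len(s)

def distance_matrix(seqs):
    """Symmetric dict-of-dicts of pairwise distances (hypothetical helper)."""
    d = {name: {name: 0.0} for name in seqs}
    for u, v in combinations(seqs, 2):
        d[u][v] = d[v][u] = p_distance(seqs[u], seqs[v])
    return d

D = distance_matrix(sequences)
for name in sequences:
    print(name, [round(D[name][other], 3) for other in sequences])
```

On this short fragment the Rat and Gorilla rows are identical, so their distance is 0. Real analyses would normally use much longer alignments.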

Page 22: Phylogenetic Trees Lecture 11

Two Methods of Tree Construction

Distance: A weighted tree that realizes the distances between the objects.

Parsimony: A tree with a minimum total number of character changes between nodes.

We start with distance-based methods, considering the following question: Given a set of species (leaves in a supposed tree), and distances between them, construct a phylogeny which best “fits” the distances.

Note (Shlomo, 12.3.03): Before the construction, insert the four-points theorem (from a separate file), which will replace its earlier proof in Lecture 12. It may also be worth dropping UPGMA. This note of course affects Lecture 12 as well.
Page 23: Phylogenetic Trees Lecture 11

Exact solution: Additive sets

Given a set M of L objects with an L×L distance matrix such that:
d(i,i) = 0, and for i ≠ j, d(i,j) > 0
d(i,j) = d(j,i)
For all i, j, k it holds that d(i,k) ≤ d(i,j) + d(j,k)

Can we construct a weighted tree which realizes these distances?
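The three requirements on the distance matrix can be checked mechanically. The following is a small sketch, assuming the matrix is given as a list of lists; the name `is_metric` and the tolerance parameter `tol` are illustrative and not taken from the lecture.

```python
from itertools import permutations

def is_metric(d, tol=1e-9):
    """Check zero diagonal, positivity, symmetry and the triangle inequality."""
    n = len(d)
    for i in range(n):
        if abs(d[i][i]) > tol:
            return False          # d(i,i) must be 0
        for j in range(n):
            if i != j and d[i][j] <= 0:
                return False      # d(i,j) > 0 for i != j
            if abs(d[i][j] - d[j][i]) > tol:
                return False      # d(i,j) = d(j,i)
    # d(i,k) <= d(i,j) + d(j,k) for all triples of distinct objects
    return all(d[i][k] <= d[i][j] + d[j][k] + tol
               for i, j, k in permutations(range(n), 3))

good = [[0, 2, 2, 2],
        [2, 0, 2, 2],
        [2, 2, 0, 3],
        [2, 2, 3, 0]]
bad = [[0, 1, 5],
       [1, 0, 1],
       [5, 1, 0]]   # 5 > 1 + 1 violates the triangle inequality

print(is_metric(good), is_metric(bad))
```

Passing this check does not by itself mean that a tree realizing the distances exists; that is the question the following slides address.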

Page 24: Phylogenetic Trees Lecture 11

Additive sets (cont.)

We say that the set M with L objects is additive if there is a tree T in which L of the nodes correspond to the L objects, with positive weights on the edges, such that for all i, j, d(i,j) = dT(i,j), the length of the path from i to j in T.

Note: Sometimes the tree is required to be binary, and then the edge weights are required to be non-negative.
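To make the definition concrete, here is a sketch that computes dT(i,j) as the path length in a weighted tree and compares it with a given distance table. The edge-list representation, the function names and the example weights (1, 2, 3, 4, 5) are assumptions made for illustration.

```python
def tree_distances(edges, labeled):
    """Path lengths between labeled nodes of a weighted tree.

    edges: list of (u, v, weight); labeled: nodes that correspond to objects.
    """
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {}
    for src in labeled:
        seen = {src: 0.0}
        stack = [src]
        while stack:                      # depth-first walk; paths in a tree are unique
            u = stack.pop()
            for v, w in adj.get(u, []):
                if v not in seen:
                    seen[v] = seen[u] + w
                    stack.append(v)
        dist[src] = {t: seen[t] for t in labeled}
    return dist

def realizes(edges, labeled, d, tol=1e-9):
    """True if d(i,j) equals the tree path length dT(i,j) for all labeled pairs."""
    dT = tree_distances(edges, labeled)
    return all(abs(dT[i][j] - d[i][j]) <= tol for i in labeled for j in labeled)

# Example tree: leaves i, j hang off m; leaves k, l hang off n; m and n are joined.
edges = [("i", "m", 1), ("j", "m", 2), ("m", "n", 3), ("k", "n", 4), ("l", "n", 5)]
labels = ["i", "j", "k", "l"]
d = {
    "i": {"i": 0, "j": 3, "k": 8, "l": 9},
    "j": {"i": 3, "j": 0, "k": 9, "l": 10},
    "k": {"i": 8, "j": 9, "k": 0, "l": 9},
    "l": {"i": 9, "j": 10, "k": 9, "l": 0},
}
print(realizes(edges, labels, d))
```

With these weights the table `d` is additive by construction, since it was read off the tree.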

Page 25: Phylogenetic Trees Lecture 11

Three-object sets are additive

For L=3: There is always a (unique) tree with one internal node.

(Figure: star tree with internal node m and edges of length a, b, c leading to leaves i, j, k)

d(i,j) = a + b
d(i,k) = a + c
d(j,k) = b + c

Thus c = d(k,m) = (1/2)[d(i,k) + d(j,k) − d(i,j)] ≥ 0
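Solving the three equations for a and b in the same way as for c gives all three edge lengths. A short sketch follows; the function name and the sample distances are illustrative.

```python
def three_point_tree(d_ij, d_ik, d_jk):
    """Edge lengths (a, b, c) from i, j, k to the single internal node m."""
    a = (d_ij + d_ik - d_jk) / 2
    b = (d_ij + d_jk - d_ik) / 2
    c = (d_ik + d_jk - d_ij) / 2
    return a, b, c

a, b, c = three_point_tree(3, 8, 9)
print(a, b, c)
# The tree reproduces the input distances:
print(a + b, a + c, b + c)
```

Each edge length is non-negative exactly when the corresponding triangle inequality holds.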

Page 26: Phylogenetic Trees Lecture 11

How about four objects?

L=4: Not all sets with 4 objects are additive:

e.g., there is no tree which realizes the distances below.

     i   j   k   l
i    0   2   2   2
j        0   2   2
k            0   3
l                0

Page 27: Phylogenetic Trees Lecture 11

The Four Points Condition

Theorem: A set M of L objects is additive iff any subset of four objects can be labeled i, j, k, l so that:

d(i,k) + d(j,l) = d(i,l) + d(k,j) ≥ d(i,j) + d(k,l)

We call {{i,j},{k,l}} the “split” of {i,j,k,l}.

(Figure: tree on the four leaves i, j, k, l)

Proof:
Additivity ⇒ 4P Condition: By the figure...
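The condition says that, of the three pairwise sums for a quadruple, the two largest must be equal. A minimal sketch of that test, applied to the non-additive four-object example of this lecture and to an additive table; function names and the tolerance `tol` are illustrative assumptions.

```python
from itertools import combinations

def four_point_ok(d, i, j, k, l, tol=1e-9):
    """True if the two largest of the three pairwise sums are equal."""
    sums = sorted([d[i][j] + d[k][l],
                   d[i][k] + d[j][l],
                   d[i][l] + d[j][k]])
    return abs(sums[2] - sums[1]) <= tol

def satisfies_four_points(d, tol=1e-9):
    """Check the condition on every subset of four objects."""
    return all(four_point_ok(d, *quad, tol=tol)
               for quad in combinations(range(len(d)), 4))

# The four-object example from the lecture (objects i, j, k, l as 0..3).
non_additive = [[0, 2, 2, 2],
                [2, 0, 2, 2],
                [2, 2, 0, 3],
                [2, 2, 3, 0]]

# Distances read off a tree with edge lengths 1, 2, 3, 4, 5 (illustrative).
additive = [[0, 3, 8, 9],
            [3, 0, 9, 10],
            [8, 9, 0, 9],
            [9, 10, 9, 0]]

print(satisfies_four_points(non_additive), satisfies_four_points(additive))
```

For the lecture's example the three sums are 5, 4 and 4, so the largest sum is unmatched and the condition fails.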

Page 28: Phylogenetic Trees Lecture 11

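4P Condition ⇒ Additivity:

Induction on the number of objects, L.

For L ≤ 3 the condition is empty and a tree exists.

Consider L=4.

B = d(i,k) + d(j,l) = d(i,l) + d(j,k) ≥ d(i,j) + d(k,l) = A

Let y = (B − A)/2 ≥ 0. Then the tree should look as follows:

(Figure: leaves i and j attached to internal node m by edges of length a and b; m joined to internal node n by an edge of length y; leaves k and l attached to n by edges of length c and f)

We have to find the distances a, b, c and f.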

Page 29: Phylogenetic Trees Lecture 11

Tree construction for L=4

(Figure: tree with leaves i, j, k, l, internal nodes m and n, and edge lengths a, b, c, f, y)

Construct the tree by the given distances as follows:
1. Construct a tree for {i, j, k}, with internal vertex m
2. Add vertex n, d(m,n) = y
3. Add edge (n,l), c + f = d(k,l)

Remains to prove:
d(i,l) = dT(i,l)
d(j,l) = dT(j,l)
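The three construction steps can be sketched as follows, assuming the four objects are already labeled so that {{i,j},{k,l}} is the split and that n is placed on the path from m to k. The function name, the dictionary of results and the example matrix are illustrative, not taken from the lecture.

```python
def quartet_tree(d, i, j, k, l):
    """Edge lengths a, b, c, f, y of the four-leaf tree.

    Assumes d(i,k) + d(j,l) = d(i,l) + d(j,k) >= d(i,j) + d(k,l).
    """
    A = d[i][j] + d[k][l]
    B = d[i][k] + d[j][l]
    y = (B - A) / 2                                # length of the edge (m, n)
    # Step 1: three-point tree for {i, j, k} with internal vertex m.
    a = (d[i][j] + d[i][k] - d[j][k]) / 2
    b = (d[i][j] + d[j][k] - d[i][k]) / 2
    m_to_k = (d[i][k] + d[j][k] - d[i][j]) / 2
    # Step 2: put n on the path from m to k at distance y from m.
    c = m_to_k - y
    # Step 3: attach l to n so that c + f = d(k, l).
    f = d[k][l] - c
    return {"a": a, "b": b, "c": c, "f": f, "y": y}

d = [[0, 3, 8, 9],
     [3, 0, 9, 10],
     [8, 9, 0, 9],
     [9, 10, 9, 0]]
t = quartet_tree(d, 0, 1, 2, 3)
print(t)
# The two distances that remain to be proved are reproduced by the tree:
print(t["a"] + t["y"] + t["f"] == d[0][3], t["b"] + t["y"] + t["f"] == d[1][3])
```

The final check mirrors the two equalities that the slide leaves to be proved.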

Page 30: Phylogenetic Trees Lecture 11

Proof for L=4

(Figure: tree with leaves i, j, k, l, internal nodes m and n, and edge lengths a, b, c, f, y)

By the 4 points condition and the definition of y:

d(i,l) = d(i,j) + d(k,l) + 2y − d(k,j) = (a + b) + (c + f) + 2y − (b + y + c) = a + y + f = dT(i,l)

(the middle equality holds since d(i,j), d(k,l) and d(k,j) are realized by the tree)

d(j,l) = dT(j,l) is proved similarly.

Page 31: Phylogenetic Trees Lecture 11

Induction step for L>4:

Remove object L from the set.

By induction, there is a tree, T’, for {1,2,…,L-1}.

For each pair of labeled nodes (i,j) in T’, let aij, bij, cij be defined by the following figure:

(Figure: star tree with internal node mij and edges of length aij, bij, cij leading to i, j, L)

cij = (1/2)[d(i,L) + d(j,L) − d(i,j)]

Page 32: Phylogenetic Trees Lecture 11

Induction step:

Pick i and j that minimize cij.

T is constructed by adding L (and possibly mij) to T’, as in the figure. Then d(i,L) = dT(i,L) and d(j,L) = dT(j,L).

Remains to prove: For each k ≠ i,j: d(k,L) = dT(k,L).

(Figure: tree T’ containing i and j, with L attached at node mij on the path from i to j; edge lengths aij, bij, cij)
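The choice of the pair (i, j) can be sketched as a search over all pairs of the remaining objects. This is only an illustration of the selection step, not of the full tree update; the function name `best_attachment` and the example matrix are assumptions.

```python
from itertools import combinations

def best_attachment(d, last):
    """Pair (i, j) minimizing c_ij = (d(i,L) + d(j,L) - d(i,j)) / 2, with that value.

    `last` is the index of the removed object L.
    """
    others = [x for x in range(len(d)) if x != last]

    def c(pair):
        i, j = pair
        return (d[i][last] + d[j][last] - d[i][j]) / 2

    pair = min(combinations(others, 2), key=c)   # first minimal pair on ties
    return pair, c(pair)

d = [[0, 3, 8, 9],
     [3, 0, 9, 10],
     [8, 9, 0, 9],
     [9, 10, 9, 0]]
(i, j), c_ij = best_attachment(d, last=3)
a_ij = d[i][3] - c_ij    # distance from i to the attachment point m_ij
print((i, j), c_ij, a_ij)
```

Here c_ij is the length of the new edge leading to L, and a_ij locates the attachment point on the path from i to j.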

Page 33: Phylogenetic Trees Lecture 11

Induction step (cont.)

Let k ≠ i,j be an arbitrary node in T’, and let n be the branching point of k in the path from i to j.

By the minimality of cij, {{i,j},{k,L}} is not a “split” of {i,j,k,L}. So assume WLOG that {{i,L},{j,k}} is a “split” of {i,j,k,L}.

(Figure: tree T’ with L attached at mij on the path from i to j, and k branching off that path at node n)

Page 34: Phylogenetic Trees Lecture 11

Induction step (end)

Since {{i,L},{j,k}} is a split, by the 4 points condition

d(L,k) = d(i,k) + d(L,j) − d(i,j)

d(i,k) = dT(i,k) and d(i,j) = dT(i,j) by induction, and

d(L,j) = dT(L,j) by the construction.

Hence d(L,k) = dT(L,k).

QED