
Terminology of Phylogenetic Trees

Feb 12, 2016


Terminology of Phylogenetic Trees. Dan Graur. Evolutionary relationships are usually illustrated by means of a phylogenetic tree (dendrogram). The “tree metaphor” cannot always be used. Ernst Heinrich Haeckel 1834-1919.
Page 1: Terminology of Phylogenetic Trees

1

Dan Graur

Terminology of Phylogenetic Trees

Page 2: Terminology of Phylogenetic Trees

2

• Evolutionary relationships are usually illustrated by means of a phylogenetic tree (dendrogram).
• The “tree metaphor” cannot always be used.

Page 3: Terminology of Phylogenetic Trees

3

Ernst Heinrich Haeckel 1834-1919

Page 4: Terminology of Phylogenetic Trees

4

Jean-Baptiste [Pierre Antoine de Monet, Chevalier de] Lamarck. 1809

Page 5: Terminology of Phylogenetic Trees

5

Charles Darwin, July 1837

July 2007

Page 6: Terminology of Phylogenetic Trees

6

Charles Darwin, November 1859

Page 7: Terminology of Phylogenetic Trees

7

The terminology of phylogenetics is discombobulated.

Page 8: Terminology of Phylogenetic Trees

8

Graduate Student Assignments

Instead of a stream of emails that will yield unsatisfactory results, kindly set up appointments and let’s talk.

Page 9: Terminology of Phylogenetic Trees

9

In mathematics, a graph is an abstract representation of a set of objects called nodes (or vertices), some of which are connected to one another by links called branches (or edges). A path in a graph is a sequence of branches that connect any two nodes.

Page 10: Terminology of Phylogenetic Trees

10

Graphs = Trees + Non-Tree Graphs (or Networks)

In a tree (b), any two nodes are connected by a single path. In a network (a), there may be multiple pathways connecting two nodes.
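The distinction between a tree and a network can be checked mechanically. The following is a minimal sketch, assuming an undirected graph given as a list of nodes and a list of branches; the function name `is_tree` and the node labels are illustrative and do not come from the slides. It relies on the standard result that a connected graph with n nodes is a tree exactly when it has n - 1 branches.

```python
from collections import deque


def is_tree(nodes, branches):
    """Return True if the undirected graph (nodes, branches) is a tree."""
    nodes = list(nodes)
    if not nodes or len(branches) != len(nodes) - 1:
        return False
    # Build an adjacency list so that connectivity can be tested.
    adjacency = {node: [] for node in nodes}
    for u, v in branches:
        adjacency[u].append(v)
        adjacency[v].append(u)
    # Breadth-first search from an arbitrary node.
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == len(nodes)


# Hypothetical example: four external nodes (A-D) and two internal nodes (x, y).
tree_nodes = ["A", "B", "C", "D", "x", "y"]
tree_branches = [("A", "x"), ("B", "x"), ("x", "y"), ("C", "y"), ("D", "y")]
# Adding one extra branch creates a second pathway between A and B.
network_branches = tree_branches + [("A", "B")]

print(is_tree(tree_nodes, tree_branches))     # True
print(is_tree(tree_nodes, network_branches))  # False
```

With the extra branch, A and B are joined both directly and through x, so the graph is a network rather than a tree.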

Page 11: Terminology of Phylogenetic Trees

11

The evolutionary relationships among a group of organisms are illustrated by means of phylogenetic trees (or dendrograms).

Page 12: Terminology of Phylogenetic Trees

12

Branch: Internal; External (or Peripheral)

Page 13: Terminology of Phylogenetic Trees

13

The branching pattern of a tree is called its topology.

Three different styles of trees, one topology.

Page 14: Terminology of Phylogenetic Trees

14

One topology

Page 15: Terminology of Phylogenetic Trees
Page 16: Terminology of Phylogenetic Trees

16

Page 17: Terminology of Phylogenetic Trees

17

Terminal node = Operational taxonomic unit (OTU)
Internal node = Hypothetical taxonomic unit (HTU)
Peripheral (or terminal) branch = relationship between OTU and HTU
Internal branch = relationship between two HTUs

Page 18: Terminology of Phylogenetic Trees

18

Bifurcating and multifurcating trees

A node is bifurcating (or binary or dichotomous) if it has only two immediate descendant lineages, but multifurcating (or polytomous) if it has more than two immediate descendant lineages. In a strictly bifurcating tree, each internal node is incident to exactly three branches, two derived and one ancestral.

Page 19: Terminology of Phylogenetic Trees

19

A bifurcation is always interpreted as a speciation event

Two possible interpretations for a multifurcation (polytomy) in a tree:
1. The polytomy represents the true sequence of events (hard polytomy), whereby an ancestral taxon gave rise to three or more descendant taxa simultaneously.
2. The polytomy represents a lack of resolution (soft polytomy): the exact order of two or more bifurcations cannot be determined unambiguously with the available data.

Page 20: Terminology of Phylogenetic Trees

20

Rooted and unrooted trees

In a rooted tree there exists a particular node, called the root, from which a unique path leads to any other node. The direction of each path corresponds to evolutionary time, and the root is the common ancestor of all the taxonomic units under study.

Page 21: Terminology of Phylogenetic Trees

21

In an unrooted tree with four external nodes, the internal branch is referred to as the central branch.

Page 22: Terminology of Phylogenetic Trees

22

How many unrooted topologies are here?

[Figure: four unrooted trees, numbered 1 to 4, each with tips a, b, c, d, and e.]

Page 23: Terminology of Phylogenetic Trees

23

• In an unrooted phylogenetic tree you cannot immediately assess evolutionary relationships.

• In a rooted phylogenetic tree, evolutionary relationships are evident.

Page 24: Terminology of Phylogenetic Trees

24

Phoronida (horseshoe worms)

Brachiopoda (lampshells)

Arthropoda (arthropods)

Which of the following taxa are evolutionarily the closest to Rick Perry? (a) Phoronida, (b) Brachiopoda, (c) Arthropoda, (d) all three taxa are equidistant from Perry, or (e) two taxa are closer to Perry than the third taxon.

Vertebrata (vertebrates)

Page 25: Terminology of Phylogenetic Trees

25

Cladograms & Phylograms (collectively Dendrograms)

[Figure: the same tree of Bacterium 1-3 and Eukaryote 1-4 drawn twice, once as a phylogram and once as a cladogram.]

Phylograms show branching order and branch lengths.

Cladograms show branching order; branch lengths are meaningless.

Page 26: Terminology of Phylogenetic Trees

Unscaled phylogram
Scaled phylogram

The branch length is the number of changes (e.g., nucleotide substitutions) that have occurred along a branch. The total number of changes in a particular tree is called the tree length.
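As a small illustration of tree length, the sketch below sums the branch lengths of a tree. The representation is an assumption made for this example only: a tip is a string, and an internal node is a list of (child, branch length) pairs. The tip names and lengths are invented.

```python
def tree_length(node):
    """Sum the lengths of all branches below `node`."""
    if isinstance(node, str):  # a tip has no branches below it
        return 0
    return sum(length + tree_length(child) for child, length in node)


# Hypothetical tree: tips A and B, plus an internal node leading to C and D.
example = [("A", 2), ("B", 3), ([("C", 1), ("D", 4)], 5)]

print(tree_length(example))  # 2 + 3 + 5 + 1 + 4 = 15
```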

Page 27: Terminology of Phylogenetic Trees

27

Page 28: Terminology of Phylogenetic Trees

28

Tree balance

Tree balance is a measure of the degree of symmetry of a rooted phylogenetic tree. It serves as an indication of the pattern of speciation events in the group of taxa under study.

Balanced tree Unbalanced or Pectinate (comb-like) tree

Page 29: Terminology of Phylogenetic Trees

29

Tree balance

In an unbalanced tree, only one descendant of a node continues to speciate after a splitting event. In a balanced tree, all descendants of a node participate equally in cladogenesis.

Balanced tree Unbalanced or Pectinate (comb-like) tree

Page 30: Terminology of Phylogenetic Trees

30

Tree balance

Tree balance is an important indicator of the ease of phylogenetic reconstruction. Because, by definition, unbalanced trees contain long branches, they are more difficult to reconstruct phylogenetically than balanced trees. In fact, balanced and unbalanced trees are sometimes referred to as “good” and “bad” trees, respectively (Sackin 1972).

Balanced tree Unbalanced or Pectinate (comb-like) tree
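One way to put a number on tree balance is the Colless imbalance index, which sums, over all internal nodes of a rooted bifurcating tree, the absolute difference between the numbers of tips on the two sides. The slides do not name a specific index, so this choice is an assumption; the nested-tuple representation and the tip names are also illustrative.

```python
def colless(tree):
    """Return (number of tips, Colless imbalance) for a rooted bifurcating tree.

    A tip is a string; an internal node is a pair (left, right).
    """
    if isinstance(tree, str):
        return 1, 0
    left, right = tree
    n_left, imbalance_left = colless(left)
    n_right, imbalance_right = colless(right)
    imbalance = imbalance_left + imbalance_right + abs(n_left - n_right)
    return n_left + n_right, imbalance


balanced = (("a", "b"), ("c", "d"))
pectinate = ((("a", "b"), "c"), "d")

print(colless(balanced)[1])   # 0
print(colless(pectinate)[1])  # 3
```

A value of 0 corresponds to a perfectly balanced tree; for four tips, the pectinate tree gives the largest possible value.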

Page 31: Terminology of Phylogenetic Trees

31

How to describe a phylogenetic tree in computerese?

Page 32: Terminology of Phylogenetic Trees

32

The Newick format In computer programs, trees are represented in a linear form by a string of nested parentheses, enclosing taxon names (and possibly also branch lengths and bootstrap values), and separated by commas. This type of representation is called the Newick format. The originator of this format in mathematics was Arthur Cayley (1821–1895).

Page 33: Terminology of Phylogenetic Trees

33

The Newick format The Newick format for phylogenetic trees was adopted on June 26, 1986 at an informal meeting at Newick's Lobster House in Dover, New Hampshire. The Newick format currently serves as the de facto standard for representing phylogenetic trees and is employed by almost all phylogenetic software tools. Unfortunately, it has never been described in a formal publication; it was first mentioned in a publication in 1992.

Page 34: Terminology of Phylogenetic Trees

34

The Newick format In the Newick format, the pattern of the parentheses indicates the topology of the tree by having each pair of parentheses enclose all members of a monophyletic group. A phylogenetic tree in the Newick format always ends in a semicolon (;).

;

Page 35: Terminology of Phylogenetic Trees

35

The Newick format One can use the Newick format to write down rooted trees, unrooted trees, multifurcations, branch lengths, and bootstrap values.
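A few hedged examples may help. The strings below use invented taxon names A to D. Writing an unrooted tree with a trifurcation at the outermost level, and storing bootstrap values as labels of internal nodes, are common conventions rather than rules stated in the slides. The parser is a minimal sketch that handles only names, branch lengths, and internal labels; it is not a complete implementation of the format (no quoted names or comments).

```python
# Hypothetical Newick strings
ROOTED = "((A,B),(C,D));"
UNROOTED = "(A,B,(C,D));"                  # trifurcation at the base, by convention
MULTIFURCATION = "(A,B,C,D);"
WITH_LENGTHS = "((A:1,B:2):3,(C:1,D:1):2);"
WITH_BOOTSTRAP = "((A,B)95,(C,D)80);"      # support stored as internal labels


def parse_newick(text):
    """Parse a Newick string into nested dicts with name, length, children."""
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("a Newick string must end in a semicolon")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # skip "("
            children.append(parse_node())
            while text[pos] == ",":
                pos += 1  # skip ","
                children.append(parse_node())
            if text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # skip ")"
        # Optional name (tip name or internal label)
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        name = text[start:pos].strip()
        # Optional branch length after ":"
        length = None
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    root = parse_node()
    if text[pos] != ";":
        raise ValueError(f"unexpected character at position {pos}")
    return root


def tip_names(node):
    """List the names of all tips below `node`, left to right."""
    if not node["children"]:
        return [node["name"]]
    return [name for child in node["children"] for name in tip_names(child)]


tree = parse_newick(WITH_LENGTHS)
print(tip_names(tree))                # ['A', 'B', 'C', 'D']
print(tree["children"][0]["length"])  # 3.0
```

Each pair of parentheses becomes one internal node, so the nesting of the dictionaries mirrors the nesting of the monophyletic groups in the string.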

Page 36: Terminology of Phylogenetic Trees

36

Page 37: Terminology of Phylogenetic Trees

37

3 OTUs

1 unrooted tree = 3 rooted trees

Page 38: Terminology of Phylogenetic Trees

38

4 OTUs

3 unrooted trees = 15 rooted trees

Page 39: Terminology of Phylogenetic Trees

39

The number of possible bifurcating rooted trees ($N_R$) for $n \geq 2$ OTUs:

$$N_R = \frac{(2n-3)!}{2^{n-2}\,(n-2)!}$$

The number of possible bifurcating unrooted trees ($N_U$) for $n \geq 3$ OTUs:

$$N_U = \frac{(2n-5)!}{2^{n-3}\,(n-3)!}$$
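The two formulas can be evaluated directly. The sketch below is one straightforward way to do so with exact integer arithmetic; the function names `rooted_trees` and `unrooted_trees` are illustrative.

```python
from math import factorial


def rooted_trees(n):
    """Number of bifurcating rooted trees for n >= 2 OTUs."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


def unrooted_trees(n):
    """Number of bifurcating unrooted trees for n >= 3 OTUs."""
    if n < 3:
        raise ValueError("n must be at least 3")
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))


for n in (3, 4, 10, 20):
    print(n, unrooted_trees(n), rooted_trees(n))
```

For 3 and 4 OTUs this reproduces the counts given earlier (1 unrooted and 3 rooted trees; 3 unrooted and 15 rooted trees). The number of rooted trees for n OTUs equals the number of unrooted trees for n + 1 OTUs, because rooting a tree is equivalent to attaching one additional lineage.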

Page 40: Terminology of Phylogenetic Trees

40

Number of OTUs    Number of possible rooted trees
2                 1
3                 3
4                 15
5                 105
6                 945
7                 10,395
8                 135,135
9                 2,027,025
10                34,459,425
15                213,458,046,676,875
20                8,200,794,532,637,891,559,375

Page 41: Terminology of Phylogenetic Trees

41

Evolution is an historical process.

Only one historical narrative is true.

From 8,200,794,532,637,891,559,375 possibilities, 1 possibility is true and 8,200,794,532,637,891,559,374 are false.

Truth is one, falsehoods are many.

Page 42: Terminology of Phylogenetic Trees

42

How do we know which of the 8,200,794,532,637,891,559,375 trees is true?

Page 43: Terminology of Phylogenetic Trees

43

We don’t. We infer by using decision criteria.

Page 44: Terminology of Phylogenetic Trees

44

True and inferred trees

The sequence of speciation events that has led to the formation of a group of OTUs is historically unique. A tree representing the true evolutionary history is called the true tree.

A tree that is obtained by using a certain set of data and a certain method of tree reconstruction is called an inferred tree.

An inferred tree may or may not be the true tree.

Page 45: Terminology of Phylogenetic Trees

45

ancestor

descendant 1 descendant 2

Cladogenesis = the splitting of an evolutionary lineage into two genetically independent lineages.

Page 46: Terminology of Phylogenetic Trees

46

descendant

Anagenesis = changes occurring along an evolutionary lineage.

Page 47: Terminology of Phylogenetic Trees

In molecular phylogenetics, we assume that species are only created by cladogenesis.

Page 48: Terminology of Phylogenetic Trees

48

Species Trees & Gene Trees

Page 49: Terminology of Phylogenetic Trees

49

At every locus, if we trace back the history of any two alleles from any two populations, we will eventually reach a common ancestral allele from which both contemporary alleles have been derived.

Page 50: Terminology of Phylogenetic Trees

50

The routes of inheritance represent the passage of genes from parents to offspring, and the branching pattern depicts a gene tree.

Page 51: Terminology of Phylogenetic Trees

51

Different genes, however, may have different evolutionary histories, i.e., different routes of inheritance, different gene trees.

Page 52: Terminology of Phylogenetic Trees

52

The routes of inheritance are mostly confined by reproductive barriers—that is, gene flow occurs only within the species. A species is therefore like a bundle of genetic connections, in which many entangled parent-offspring lines form the ties that bundle individuals together into a species lineage.

Page 53: Terminology of Phylogenetic Trees

53

A gene tree may differ from a species tree

S = Divergence time for species 1 and 2

Page 54: Terminology of Phylogenetic Trees

54

A gene tree may differ from a species tree

S = Divergence time for species 1 and 2

G1 = Inferred divergence time by using alleles a and f

Page 55: Terminology of Phylogenetic Trees

55

A gene tree may differ from a species tree

Alleles d and b are closer to each other than alleles d and f.

Page 56: Terminology of Phylogenetic Trees

56

Incomplete lineage sorting due to polymorphism at speciation time

Page 57: Terminology of Phylogenetic Trees

57

Gene trees and species trees

It is often assumed that gene trees always equal species trees. This may not be true.

[Figure: a gene tree with tips a, b, and c, shown beside a species tree with tips A, B, and D.]

Page 58: Terminology of Phylogenetic Trees

58

Taxon (singular); Taxa (plural)

A taxon is a species or a group of species that has been given a name, e.g., Homo sapiens (modern humans) or Lepidoptera (butterflies and moths).

There are codes of biological nomenclature which seek to ensure that every taxon has a single and stable name, and that every name is used for only one taxon.

Page 59: Terminology of Phylogenetic Trees

59

• Strictly: A clade is a group of all the taxa that have been derived from a common ancestor plus the common ancestor itself.

• In molecular phylogenetics: A clade is a group of taxa under study that share a common ancestor, which is not shared by any other species outside the group.

Clades*

*also: monophyletic groups, natural clades

Page 60: Terminology of Phylogenetic Trees

60

• A taxon whose common ancestor is shared by any other taxon is called a paraphyletic taxon or an invalid taxon.

Paraphyletic Taxa

Reptiles are paraphyletic.


Page 61: Terminology of Phylogenetic Trees

61

• A named taxon that lacks phylogenetic validity, but is nonetheless used, is called a convenience taxon.

“a convenience fish”

Page 62: Terminology of Phylogenetic Trees

62

• If a clade is composed of two taxa, these are referred to as sister taxa.

Sister Taxa

Birds and crocodiles are sister taxa.

Page 63: Terminology of Phylogenetic Trees

63

Which of the following groups are not monophyletic?

[Tree tips: E. coli, rat, mouse, baboon, chimp, human]

a. human, chimpanzee, baboon
b. mouse, chimpanzee, baboon
c. rat, mouse
d. human, chimpanzee, baboon, rat, mouse
e. E. coli, human, chimpanzee, baboon, rat, mouse
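A question of this kind can be checked programmatically: a group is monophyletic if it is exactly the set of tips below some node of the tree. The sketch below assumes a topology, since the figure is not reproduced in this transcript: E. coli as the outgroup, rat and mouse as sister taxa, and baboon as sister to the chimpanzee-human clade. The nested-tuple representation and the function names are illustrative.

```python
# Assumed topology (the slide's figure is not available in the transcript).
tree = ("E. coli", (("rat", "mouse"), ("baboon", ("chimpanzee", "human"))))


def collect_clades(node, found):
    """Add the tip set below every node to `found`; return this node's tip set."""
    if isinstance(node, str):
        tips = frozenset([node])
    else:
        tips = frozenset().union(*(collect_clades(child, found) for child in node))
    found.add(tips)
    return tips


def is_monophyletic(tree, group):
    """True if `group` is exactly the set of tips below some node of `tree`."""
    found = set()
    collect_clades(tree, found)
    return frozenset(group) in found


options = {
    "a": ["human", "chimpanzee", "baboon"],
    "b": ["mouse", "chimpanzee", "baboon"],
    "c": ["rat", "mouse"],
    "d": ["human", "chimpanzee", "baboon", "rat", "mouse"],
    "e": ["E. coli", "human", "chimpanzee", "baboon", "rat", "mouse"],
}
for label, group in options.items():
    print(label, is_monophyletic(tree, group))
```

Under the assumed topology, only option b fails the test, because the smallest clade containing mouse, chimpanzee, and baboon also contains rat and human.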

Page 64: Terminology of Phylogenetic Trees

64


Page 65: Terminology of Phylogenetic Trees

65

Two or more sequences are said to be homologous if they are related by descent. Homology is often ascertained on the basis of sequence similarity. Thus, if two or more sequences exhibit high degrees of similarity, it is likely (but not always the case) that they are homologous. Sequence similarity may also arise without common ancestry: by chance, or due to convergence driven by similar selective pressures. Such sequences, which are similar but not homologous, are said to be analogous.

Page 66: Terminology of Phylogenetic Trees

66

Homology is a qualitative statement.

Similarity is a quantitative and, hence, quantifiable statement (e.g., percent similarity, percent identity).

Similarity is a fact. Homology is a hypothesis. Of course, as with any other scientific hypothesis, homology between two sequences may be tested and every so often rejected.

Page 67: Terminology of Phylogenetic Trees

67

Types of homology

• Orthology: Similarity due to speciation.
• Paralogy: Similarity due to gene duplication.
• Ohnology: A special case of paralogy in which similarity is due to genome duplication.
• Xenology: Similarity due to horizontal gene transfer.

Page 68: Terminology of Phylogenetic Trees

68

Orthologs and Paralogs

[Figure: an ancestral gene undergoes duplication, which yields 2 copies (paralogs) on the same genome. Among the descendant genes a, b, c and A, B, C, pairs are marked as orthologous or paralogous.]

Page 69: Terminology of Phylogenetic Trees

69

Orthologs and Paralogs

[Figure: the genes a, b, c and A, B, C, with A, C, and b highlighted.]

A mixture of orthologs and paralogs is sampled.

Only b, C, and A are sampled.

Page 70: Terminology of Phylogenetic Trees

70

Page 71: Terminology of Phylogenetic Trees

71

Page 72: Terminology of Phylogenetic Trees

72

A character provides information about an individual OTU.

A distance represents a quantitative statement concerning the dissimilarity between two OTUs.

Page 73: Terminology of Phylogenetic Trees

73

A character is a well-defined feature that in a taxonomic unit can assume one out of two or more mutually exclusive character states. Mutually exclusive: If David is tall, David cannot be short.

Page 74: Terminology of Phylogenetic Trees

74

Page 75: Terminology of Phylogenetic Trees

75

Page 76: Terminology of Phylogenetic Trees

76

[Figure: classification of characters. Labels: Character; Continuous, Discrete; Binary, Multistate; Unordered, Ordered; Unpolar, Polar.]

Page 77: Terminology of Phylogenetic Trees

77

A character is unordered if a change from one character state to any other character state can occur in one step.

Page 78: Terminology of Phylogenetic Trees

78

A character is ordered if there exists a symmetrical path of change from one character state to another.

Page 79: Terminology of Phylogenetic Trees

79

Polar

A character is polar if there exists an asymmetrical (irreversible) path of change from one character state to another.

Page 80: Terminology of Phylogenetic Trees

80

The number of steps between two character states is specified by a step matrix.
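To make the idea concrete, the sketch below builds step matrices for three kinds of characters. The cost conventions are assumptions commonly used in parsimony analysis rather than definitions taken from the slides: every change costs one step for an unordered character, the cost is the distance along a linear series for an ordered character, and reversals are forbidden (infinite cost) for an irreversible polar character. The state names are invented.

```python
import math


def unordered_matrix(states):
    """Any state can change into any other state in one step."""
    return {a: {b: (0 if a == b else 1) for b in states} for a in states}


def ordered_matrix(states):
    """States form a linear series; cost is the number of intermediate steps."""
    return {
        a: {b: abs(i - j) for j, b in enumerate(states)}
        for i, a in enumerate(states)
    }


def irreversible_matrix(states):
    """Ordered and polar: changes are allowed in the forward direction only."""
    return {
        a: {b: (j - i if j >= i else math.inf) for j, b in enumerate(states)}
        for i, a in enumerate(states)
    }


states = ["0", "1", "2"]  # hypothetical character states
for build in (unordered_matrix, ordered_matrix, irreversible_matrix):
    matrix = build(states)
    print(build.__name__)
    for a in states:
        print("  ", a, [matrix[a][b] for b in states])
```

Reading a row gives the cost of changing from the row's state into each column's state; the ordered matrix is symmetrical, whereas the irreversible one is not.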

Page 81: Terminology of Phylogenetic Trees

81

Assumptions about character evolution

Methods of phylogenetic reconstruction require that we make explicit assumptions about:

(1) the number of discrete steps required for one character state to change into another.

(2) the probability with which such a change may occur.

Page 82: Terminology of Phylogenetic Trees

82

Temporal Polarity of Character States

Character states may be ranked by relative antiquity into:

(1) primitive or ancestral (plesiomorphy)

(2) derived or novel (apomorphy)

Page 83: Terminology of Phylogenetic Trees

83

Taxonomic Distribution of Character States

A primitive state that is shared by several taxa is a symplesiomorphy.

A derived state that is shared by several taxa is a synapomorphy.

A derived character state unique to a particular taxon is an autapomorphy.

A character state that is shared by several taxa due to convergence, parallelism and reversals, rather than due to common descent, is a homoplasy.

symbiosis, sympathy, synapse, synteny

Page 84: Terminology of Phylogenetic Trees

84

[Figure: a tree with character states A, B, C, and D at its nodes, illustrating plesiomorphy, apomorphy (autapomorphy), synapomorphy, symplesiomorphy, and homoplasy.]

Page 85: Terminology of Phylogenetic Trees

85

What is swimming in shark and carp?

[Tree tips: shark, carp, guppy, chicken, rat, bat]

a. symplesiomorphic
b. synapomorphic
c. autapomorphic
d. homoplasic

Page 86: Terminology of Phylogenetic Trees

86

What are scales in guppy and carp?

[Tree tips: shark, carp, guppy, chicken, rat, bat]

a. symplesiomorphic
b. synapomorphic
c. autapomorphic
d. homoplasic

Page 87: Terminology of Phylogenetic Trees

87

What are feathers in chicken?

[Tree tips: shark, carp, guppy, chicken, rat, bat]

a. symplesiomorphic
b. synapomorphic
c. autapomorphic
d. homoplasic

Page 88: Terminology of Phylogenetic Trees

88

What are wings in chicken and bat?

[Tree tips: shark, carp, guppy, chicken, rat, bat]

a. symplesiomorphic
b. synapomorphic
c. autapomorphic
d. homoplasic

Page 89: Terminology of Phylogenetic Trees

89

Distance Data

Page 90: Terminology of Phylogenetic Trees

90

Page 91: Terminology of Phylogenetic Trees

91

Most molecular data yield character states that are subsequently converted into distances.
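The simplest such conversion is the proportion of sites at which two aligned sequences differ, often called the p-distance. The sketch below assumes two aligned sequences of equal length; the sequences are invented for illustration, and the slides do not specify which distance measure is used.

```python
def p_distance(seq1, seq2):
    """Proportion of aligned sites at which the two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    if not seq1:
        raise ValueError("sequences must not be empty")
    differences = sum(1 for x, y in zip(seq1, seq2) if x != y)
    return differences / len(seq1)


# Hypothetical aligned sequences differing at 2 of 10 sites.
otu_1 = "ACGTACGTAC"
otu_2 = "ACGTTCGTAA"

print(p_distance(otu_1, otu_2))  # 0.2
```

Each site is a character with states A, C, G, or T; the distance summarizes all of them in a single number for the pair of OTUs.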

Page 92: Terminology of Phylogenetic Trees

92

Page 93: Terminology of Phylogenetic Trees

93

Page 94: Terminology of Phylogenetic Trees

94


Ultrametricity = Strict Molecular Clock
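Ultrametricity can be tested with the three-point condition: for every three OTUs, the two largest of the three pairwise distances must be equal. Under a strict molecular clock all tips are equally distant from their common ancestor, so distances measured along the tree satisfy this condition. The sketch below assumes a symmetric distance matrix stored as a dictionary of dictionaries; the distances are invented.

```python
from itertools import combinations


def is_ultrametric(distance, tolerance=1e-9):
    """Check the three-point condition on a symmetric distance matrix."""
    for a, b, c in combinations(list(distance), 3):
        smallest, middle, largest = sorted(
            [distance[a][b], distance[a][c], distance[b][c]]
        )
        # The two largest distances must be equal (within the tolerance).
        if abs(largest - middle) > tolerance:
            return False
    return True


# Hypothetical clock-like distances: A and B are equally distant from C.
clock_like = {
    "A": {"A": 0, "B": 2, "C": 6},
    "B": {"A": 2, "B": 0, "C": 6},
    "C": {"A": 6, "B": 6, "C": 0},
}
# Hypothetical non-clock-like distances: the lineage leading to B evolved faster.
not_clock_like = {
    "A": {"A": 0, "B": 4, "C": 6},
    "B": {"A": 4, "B": 0, "C": 8},
    "C": {"A": 6, "B": 8, "C": 0},
}

print(is_ultrametric(clock_like))      # True
print(is_ultrametric(not_clock_like))  # False
```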