Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes

Dec 15, 2015

Page 1: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Phylogenetics workshop: Protein sequence phylogeny

week 2

Darren Soanes

Page 2: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

• Species trees

• Interpretation of trees

• Taxon sampling

• Tools

• Lateral (horizontal) gene transfer

• Fast evolving genes

Page 3: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Using DNA sequence to construct trees

TGCTATT TGCTTTT TGCTTTT

TGCTATT – ancestral DNA sequence

TGCTTTT – sequence change due to mutation

Page 4: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Reversals can confuse phylogenies

TGCTATT TGCTATT TGCTTTT TGCTTTT TGCTTTT

TGCTATT – ancestral DNA sequence

TGCTTTT – sequence change

TGCTATT – reversal
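The effect of a reversal can be illustrated with a small sketch. It assumes the three sequences on the slide form a single lineage (ancestral, mutated, reverted); the function and variable names are illustrative only.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x != y for x, y in zip(a, b))


# Lineage taken from the slide: ancestral -> mutated -> reverted.
lineage = ["TGCTATT", "TGCTTTT", "TGCTATT"]

# Changes that actually happened, summed over each step of the lineage.
actual_changes = sum(hamming(a, b) for a, b in zip(lineage, lineage[1:]))

# Changes still visible when only the first and last sequences are compared.
observed_changes = hamming(lineage[0], lineage[-1])

print("actual:", actual_changes, "observed:", observed_changes)
```

Comparing only the endpoints hides both mutations, which is why a tree built from such data can place the reverted sequence too close to the ancestor.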

Page 5: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

To minimise the effect of reversals

• Use DNA sequences that are evolving slowly – mutations happen rarely.

• Use long stretches of DNA.

• Align sequences, use the parts of the alignment that show a high degree of conservation.

• rDNA sequences (genes that encode ribosomal RNA) are often used.
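Keeping only the well-conserved parts of an alignment could be sketched as follows. This is a minimal illustration, not the method of any particular trimming tool: the 0.75 threshold, the rule of dropping every column that contains a gap, and the toy alignment are all assumptions.

```python
from collections import Counter


def conserved_columns(alignment, min_fraction=0.8, gap="-"):
    """Return indices of gap-free columns whose commonest residue
    makes up at least min_fraction of the sequences."""
    n = len(alignment)
    keep = []
    for i, column in enumerate(zip(*alignment)):
        if gap in column:
            continue  # assumption: any gap disqualifies the column
        commonest_count = Counter(column).most_common(1)[0][1]
        if commonest_count / n >= min_fraction:
            keep.append(i)
    return keep


def trim(alignment, columns):
    """Rebuild each sequence from the selected columns only."""
    return ["".join(seq[i] for i in columns) for seq in alignment]


# Toy alignment (hypothetical data): column 3 has a gap, column 7 is variable.
toy = ["TGCTATTA",
       "TGCTTTTC",
       "TGCTTTTG",
       "TGC-TTTT"]

cols = conserved_columns(toy, min_fraction=0.75)
trimmed = trim(toy, cols)
print(cols)
print(trimmed)
```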

Page 6: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Species tree constructed using ribosomal DNA (rDNA) sequence

Page 7: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Using protein sequences to create species trees

• Advantages

– protein sequences evolve more slowly than DNA sequences (many DNA mutations are neutral – they do not change amino acid sequences)

– reversals are less common than in DNA

• Single copy protein encoding genes identified

• Protein sequences joined together to create a multiple protein sequence for each species

• Sequences aligned

• Disadvantage – need sequenced genomes
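Joining protein sequences per species might look like the sketch below. It assumes a common variant of the workflow in which each protein family is aligned separately first and the aligned blocks are then joined; species missing a protein are padded with gaps. The species labels and sequences are hypothetical.

```python
def concatenate(alignments, gap="-"):
    """Join per-protein alignments into one long sequence per species.

    alignments: list of dicts mapping species name -> aligned sequence,
    one dict per single-copy protein family.
    """
    species = sorted(set().union(*alignments))
    parts = {sp: [] for sp in species}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must have equal length")
        width = lengths.pop()
        for sp in species:
            # Assumption: a species lacking this protein gets a block of gaps.
            parts[sp].append(aln.get(sp, gap * width))
    return {sp: "".join(chunks) for sp, chunks in parts.items()}


# Hypothetical toy data: two protein families, three species.
protein_1 = {"A": "MKT-L", "B": "MKTAL", "C": "MRTAL"}
protein_2 = {"A": "GGS", "B": "GAS"}  # not found in species C

supermatrix = concatenate([protein_1, protein_2])
for name, seq in supermatrix.items():
    print(name, seq)
```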

Page 8: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Fungal species trees – more proteins = better resolution

[Figure: two fungal species trees, one built from 30 proteins and one from 60 proteins. Labelled groups: basidiomycetes, ascomycetes (filamentous ascomycetes, yeasts), zygomycete, microsporidia, oomycete (not fungi), plant.]

Page 9: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Fungal Species Tree (based on 153 concatenated protein sequences)

Page 10: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Clades

A clade consists of an ancestor organism and all its descendants.
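The definition can be checked mechanically on a rooted tree. The sketch below assumes a tree written as nested Python tuples with leaf names as strings; the toy topology and taxon labels are illustrative, not taken from the slides.

```python
def leaves(tree):
    """Return the set of leaf names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Yield the leaf set of every node in the tree (each one is a clade)."""
    yield leaves(tree)
    if not isinstance(tree, str):
        for child in tree:
            yield from clades(child)


def is_clade(tree, group):
    """True if some ancestor has exactly this group as its descendants."""
    return frozenset(group) in set(clades(tree))


# Hypothetical rooted toy tree.
tree = ((("filamentous_ascomycete", "yeast"), "basidiomycete"), "zygomycete")

print(is_clade(tree, {"filamentous_ascomycete", "yeast"}))
print(is_clade(tree, {"yeast", "basidiomycete"}))
```

The second group is not a clade because the smallest subtree containing both members also contains the filamentous ascomycete.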

Page 11: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.
Page 12: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Gene trees

• The evolutionary history of genes can be represented as phylogenetic trees based on alignment of protein sequences.

• Gene duplication and loss can be inferred from phylogenetic trees.

• Protein sequences evolve more slowly than DNA sequences (due to redundancy in the genetic code)

Page 13: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Gene duplication

• Gene duplication due to unequal crossing over during meiosis can create gene families.

• Sequence and function of different members of a gene family can diverge.

Page 14: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Gene duplication

Page 15: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Sequence homology (1)

• Genes are said to be homologous if they share a common evolutionary ancestor.

• Orthologues are genes in different species that evolved from a common ancestral gene by speciation. Normally, orthologues retain the same function in the course of evolution (e.g. myoglobin in mammals).

Page 16: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Sequence homology (2)

• Paralogous genes are related by duplication within a genome. Paralogues often evolve new functions, even if these are related to the original one.

• In-paralogues: paralogues that were duplicated after a speciation event and are therefore in the same species.

• Out-paralogues: paralogues that were duplicated before a speciation event. They are not necessarily in the same species.

Page 17: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Orthology and paralogy

Page 18: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Paralogues

In-paralogues

Out-paralogues

A, B and C are different species

α and β are different paralogues of the same gene

Page 19: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Evolution of globin superfamily in human lineage

Page 20: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

TOR gene duplication events in fungi

TOR: protein kinase, subunit of a complex that regulates cell growth in response to nutrient availability and cellular stresses

Page 21: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Taxon sampling methods

• BLAST – easiest, though subjective

• Occurrence of Pfam (protein family) motif

• Clustering e.g.

– INPARANOID http://inparanoid.sbc.su.se/cgi-bin/index.cgi

– orthoMCL http://www.orthomcl.org/cgi-bin/OrthoMclWeb.cgi

Page 22: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Minimum bootstrap

• 70% bootstrap is thought to be broadly similar to P-value 0.05

• Minimum bootstrap used depends on study

• To improve bootstrap support

– remove poorly aligned sequences if possible (these can be due to mis-annotation of genomes)

– change taxon sampling

Page 23: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Collapse branches with bootstrap less than defined value
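Collapsing weakly supported branches can be sketched on a simple tree structure. The example below assumes a tree stored as nested dictionaries with a "support" value (bootstrap percentage) on internal nodes; the node layout, the 70% cut-off and the toy tree are assumptions for illustration, not the format of any specific tree viewer.

```python
def collapse(node, threshold):
    """Return a copy of the tree in which internal branches with
    bootstrap support below threshold are merged into their parent."""
    if not node.get("children"):
        return {"name": node["name"]}  # leaf: nothing to collapse
    new_children = []
    for child in node["children"]:
        child = collapse(child, threshold)
        support = child.get("support")
        if "children" in child and support is not None and support < threshold:
            # Weak branch: attach its children directly (creates a polytomy).
            new_children.extend(child["children"])
        else:
            new_children.append(child)
    result = {"children": new_children}
    if "support" in node:
        result["support"] = node["support"]
    return result


def to_newick(node):
    """Write the tree in Newick-like form with support values as node labels."""
    if "children" not in node:
        return node["name"]
    inner = ",".join(to_newick(child) for child in node["children"])
    support = node.get("support")
    return f"({inner})" + ("" if support is None else str(support))


# Hypothetical toy tree: the branch grouping C with (D,E) has only 40% support.
tree = {"children": [
    {"support": 95, "children": [{"name": "A"}, {"name": "B"}]},
    {"support": 40, "children": [
        {"name": "C"},
        {"support": 88, "children": [{"name": "D"}, {"name": "E"}]},
    ]},
]}

print(to_newick(tree) + ";")
print(to_newick(collapse(tree, 70)) + ";")
```

After collapsing, C and the (D,E) group hang directly from the root, so the tree no longer claims a relationship that the bootstrap does not support.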

Page 24: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.
Page 25: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Lateral gene transfer (purine-cytosine permease)

oomycete

fungi

Page 26: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Eukaryotic Tree of Life

Phytophthora sojae

Aspergillus oryzae

Page 27: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.
Page 28: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Genes that evolve quickly (1)

• Synonymous substitution – change in DNA sequence that does not affect the amino acid sequence, often in the third position of a codon, e.g. CCG (Pro)→CCA (Pro).

• Non-synonymous substitution – change in DNA sequence that does affect the amino acid sequence, often in the first or second position of a codon, e.g. CCG (Pro)→CAG (Gln).
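The two examples can be checked against the standard genetic code with a short sketch. It assumes the standard nuclear code (NCBI translation table 1) and DNA codons; the function name is illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify(before: str, after: str) -> str:
    """Label a codon change as synonymous or non-synonymous."""
    if before == after:
        raise ValueError("codons are identical")
    same_amino_acid = CODON_TABLE[before] == CODON_TABLE[after]
    return "synonymous" if same_amino_acid else "non-synonymous"


print("CCG -> CCA:", classify("CCG", "CCA"))  # Pro -> Pro
print("CCG -> CAG:", classify("CCG", "CAG"))  # Pro -> Gln
```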

Page 29: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Genes that evolve quickly (2)

• For a given protein encoding gene (comparison between orthologues in more than one species)

• dN = number of non-synonymous substitutions per non-synonymous site

• dS = number of synonymous substitutions per synonymous site

• We can calculate the ratio dN/dS.

• For most genes this is < 1

• For genes under evolutionary pressure to change protein sequence (diversify), dN/dS > 1
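A rough feel for the ratio can be had from a simplified counting approach in the spirit of the Nei-Gojobori method. The sketch below is not a maximum-likelihood estimate: it assumes the standard genetic code, two aligned gap-free coding sequences, at most one difference per codon, no correction for multiple hits, and it treats changes to stop codons as non-synonymous. The toy sequences are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_sites(codon):
    """Expected number of synonymous sites in one codon (between 0 and 3)."""
    total = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODE[mutant] == CODE[codon]:
                total += 1 / 3  # one of three possible changes at this position
    return total


def dn_ds(seq_a, seq_b):
    """Return (dN, dS) as uncorrected proportions of differences per site."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("need aligned coding sequences of equal length")
    syn_sites = nonsyn_sites = syn_diffs = nonsyn_diffs = 0.0
    for i in range(0, len(seq_a), 3):
        a, b = seq_a[i:i + 3], seq_b[i:i + 3]
        s = (synonymous_sites(a) + synonymous_sites(b)) / 2
        syn_sites += s
        nonsyn_sites += 3 - s
        n_diff = sum(x != y for x, y in zip(a, b))
        if n_diff > 1:
            raise ValueError("sketch handles at most one change per codon")
        if n_diff == 1:
            if CODE[a] == CODE[b]:
                syn_diffs += 1
            else:
                nonsyn_diffs += 1
    return nonsyn_diffs / nonsyn_sites, syn_diffs / syn_sites


# Toy pair: codon 1 has a synonymous change, codon 2 a non-synonymous one.
dn, ds = dn_ds("CCGCCGGCT", "CCACAGGCT")
ratio = dn / ds if ds else float("inf")
print(f"dN={dn:.3f} dS={ds:.3f} dN/dS={ratio:.3f}")
```

Dividing by the number of sites matters because most possible changes in a codon are non-synonymous; raw counts alone would overstate dN relative to dS.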

Page 30: Phylogenetics workshop: Protein sequence phylogeny week 2 Darren Soanes.

Genes that evolve quickly (3)

• CodeML (part of the PAML package) will calculate dN/dS for a set of orthologues from different (closely related) species.

• Human vs Chimpanzee – rapidly evolving genes involved in immunity, reproduction and olfaction (smell).

• Genes with very low dN/dS (under purifying selection) involved in metabolism, intracellular signalling, nerve / brain function.