Jun 21, 2015
http://www.nimr.mrc.ac.uk/images/multimedia/news/large/cake-large.jpg
Robert Beiko
Faculty of Computer Science*
Dalhousie University
Halifax, 2 feet of snow last week, Canada
April 5, 2014
The dream of a Tree of Life
Can a ToL be [correctly] [reliably] [accurately] inferred?
Woese
“All happy phylogenies are alike; each unhappy phylogeny is unhappy in its own way.”
- Evolution Leo Tolstoy
Creevey et al. Proc. R. Soc. Lond. B (2004)
Early ancestral signal is probably gone
It gets worse
W. Ford Doolittle, Sci Am (1999)
OMFG it gets even worse
Kunin et al. (2005) Genome Res
make it stop make it stop
Dagan et al. (2008) PNAS
Do not adjust your model
What is the meaning of this??
• Signal saturation + tiny branches that happened a long, long, long time ago
• Other unpleasant biases (G+C, rates, etc.)
• Lateral gene transfer
Finding LGT
en.wikipedia.org
K-mers or codon usage
Wang et al. (2001) MBE
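A minimal sketch of the compositional idea on this slide: compare a gene's k-mer profile with the genome-wide background and treat a large distance as a hint, not proof, of foreign origin. The function names, the choice of k = 2, the Euclidean distance, and the toy sequences are all assumptions made for illustration; this is not the procedure of any paper cited here.

```python
from collections import Counter
from itertools import product
import math


def kmer_profile(seq, k=2):
    """Relative frequency of every DNA k-mer in seq (overlapping windows)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def profile_distance(p, q):
    """Euclidean distance between two k-mer profiles over the same k-mers."""
    return math.sqrt(sum((p[m] - q[m]) ** 2 for m in p))


# Toy, made-up sequences: a G+C-rich "genome" and two candidate genes.
genome_background = "ATGCGCGGCCGCGCGGCGCCGGCGCGCC" * 20
native_like_gene = "GCGCCGGCGCGGCCGCGC" * 5
atypical_gene = "ATATTTAAATATTAAATTTA" * 5

bg = kmer_profile(genome_background)
d_native = profile_distance(bg, kmer_profile(native_like_gene))
d_atypical = profile_distance(bg, kmer_profile(atypical_gene))

print(f"native-like gene distance: {d_native:.3f}")
print(f"atypical gene distance:    {d_atypical:.3f}")
```

The A+T-rich gene sits much further from the background than the G+C-rich one. In real genomes the signal is far noisier, since transferred genes drift toward the host composition over time and native genes can be atypical for other reasons.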
Phylogenetic discordance
Concordance weighted Discordance weighted
Euchlamydispirokaryotes
Extremarchalsobacteriae
Phylogenetics!
MAFs, SPRs, LGTs
Chris Whidden + Norbert Zeh
Building a MAF by edge cutting
Example case: a & c are sisters in the species tree, but not in the gene tree.
What can we do to the gene tree?
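The example case can be sketched with toy trees written as nested tuples. The helper names (`leaves`, `are_sisters`, `prune`) and the four-leaf trees are hypothetical, and the code shows only one candidate cut, not the full agreement-forest search: here, cutting the edge above leaf b in the gene tree restores the a + c sister pair.

```python
def leaves(tree):
    """Set of leaf labels under a nested-tuple tree; a leaf is a string."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def are_sisters(tree, x, y):
    """True if some internal node has exactly {x, y} as its leaf set."""
    if isinstance(tree, str):
        return False
    if leaves(tree) == {x, y}:
        return True
    return any(are_sisters(child, x, y) for child in tree)


def prune(tree, leaf):
    """Cut off one leaf and collapse any node left with a single child."""
    if isinstance(tree, str):
        return None if tree == leaf else tree
    kept = [p for p in (prune(child, leaf) for child in tree) if p is not None]
    if not kept:
        return None
    return kept[0] if len(kept) == 1 else tuple(kept)


# Toy trees (assumed for illustration only).
species_tree = ((("a", "c"), "b"), "d")
gene_tree = ((("a", "b"), "c"), "d")

print(are_sisters(species_tree, "a", "c"))  # a & c are sisters here
print(are_sisters(gene_tree, "a", "c"))     # but not here

# One possible edge cut: remove b, which sits between a and c.
gene_tree_cut = prune(gene_tree, "b")
print(gene_tree_cut, are_sisters(gene_tree_cut, "a", "c"))
```

Cutting a or cutting c instead would also resolve the conflict, so a search has to consider several alternatives at each conflicting pair.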
• Naïve case: O(3^k n)
• Fancy refinements: O(2.42^k n)
• Even fancier refinements: O(2^k n) (conjectured)
FIXED PARAMETER TRACTABLE – exponential in the distance between trees, not the number of leaves
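To make the fixed-parameter point concrete, the sketch below evaluates the three bounds for a few distances k, reading them as base^k times n (an assumption about superscripts lost from the slide text) and ignoring the constants that big-O hides. The function name and the values of k and n are chosen purely for illustration.

```python
def search_bound(base, k, n):
    """Evaluate base**k * n, ignoring constant factors hidden by big-O."""
    return (base ** k) * n


n = 244  # number of leaves; illustrative value
for k in (5, 10, 20):
    naive = search_bound(3, k, n)
    refined = search_bound(2.42, k, n)
    conjectured = search_bound(2, k, n)
    print(f"k={k:2d}  3^k n={naive:.3g}  2.42^k n={refined:.3g}  2^k n={conjectured:.3g}")

# Doubling the number of leaves only doubles the bound...
print(search_bound(3, 10, 2 * n) / search_bound(3, 10, n))
# ...whereas one extra unit of distance multiplies it by the base.
print(search_bound(3, 11, n) / search_bound(3, 10, n))
```

The cost is driven by how different the two trees are: large trees that differ by only a few moves stay cheap to compare, while small trees that differ by many moves do not.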
Hypotheses about LGT
The Complexity Hypothesis (Jain et al., 1999)
• “Informational” proteins have more interactions with other proteins in the cell, and are therefore less likely to be successfully transferred than, say, metabolic stuff
• Cohen et al. (2011): forget about function, it’s all about the connections with other proteins in the cell
The Selfish Operon Hypothesis: Lawrence and Roth (1996)
• Genes associate in operons because it facilitates transfer of all constituents of a pathway at once
• If the genes were dispersed throughout the genome, then the selective advantage of a pathway could not be propagated via transfer
The Public Goods Hypothesis: McInerney et al. (2011)
• Genes are public goods that can be freely shared and cannot be excluded from being available
• These genes are constantly acquired and integrated into genomes, invalidating the idea of a unifying Tree of Life
Highways of gene sharing: Beiko et al. (2005)
• Gene sharing occurs preferentially between certain lineages, and successful gene acquisitions often reflect shared ecology
LGT stories
P. aeruginosa, P. fluorescens, P. lePewtida, P. syringae, P. entomophila, P. stutzeri, P. mendocina
(Catherine) Holloway and Beiko, 2010
“Plume”
Proteobacteria
Planar is plainer, could be pain-er
Beiko, 2011
244 taxa, 40,631 trees = Bacterial SPR supertree
LGT patterns for Clostridium
Whidden et al., 2014
Cold case – Aquifex aeolicus & friends
(Rob) Eveleigh et al., 2013
LGT in the Wild
Hehemann et al. (2010) Nature
WHY DO SOME GUT BACTERIA HAVE PORPHYRANASES?
OH
NORISON
Smillie et al. (2011) Science
Lachnospiraceae – Gut / mouth enthusiasts
(Conor) Meehan and Beiko (2014) GBE
“Good” strains ..?
“Not so good” strains ..?
Butyrate production – a crucial function, subject to LGT
Finding LGT in the microbiome?
• Illumina sequencing - aaaaargh!
• Mixed samples! [imagine what happens when you try to assemble!]
• Strain-level differentiation!
• etc.
What does it all mean?
LGT seriously undermines the recovery (and validity?) of the Tree of Life
Even so, aggregation methods (supertrees, etc.) can provide a useful scaffold for inferring LGT events
LGT serves as a useful starting point for hypotheses of habitat adaptation / invasion
Metagenomic data offer new context to LGT events (and genomic data show we should be looking at communities), but present huge challenges to inference
FIN