Gene family evolution across the Fungi
Jason Stajich, Duke University
Lineage-specific expansions and contractions
Fungal genome evolution
• How have fungi adapted to niches?
• Are family expansions or contractions the result of adaptive evolution?
37 Fully sequenced fungal genomes
[Figure: dated phylogeny of the 37 sequenced fungi, each annotated with its lifestyle; x-axis: million years ago, 0 to 900]
Species: Rhizopus oryzae, Neurospora crassa, Podospora anserina, Chaetomium globosum, Magnaporthe grisea, Fusarium verticillioides, Fusarium graminearum, Trichoderma reesei, Sclerotinia sclerotiorum, Botrytis cinerea, Stagonospora nodorum, Uncinocarpus reesii, Coccidioides immitis, Histoplasma capsulatum, Aspergillus fumigatus, Aspergillus nidulans, Aspergillus terreus, Aspergillus oryzae, Ashbya gossypii, Kluyveromyces lactis, Saccharomyces cerevisiae, Candida glabrata, Candida lusitaniae, Debaryomyces hansenii, Candida guilliermondii, Candida tropicalis, Candida albicans, Candida dubliniensis, Yarrowia lipolytica, Schizosaccharomyces pombe, Cryptococcus neoformans, Cryptococcus neoformans H99, Cryptococcus gattii WM276, Cryptococcus gattii R265, Phanerochaete chrysosporium, Coprinus cinereus, Ustilago maydis
Lifestyle labels used in the figure: saprophyte; bread mold; opportunistic human pathogen; primary human pathogen; hemibiotroph (rice, wheat, maize); necrotroph (fruits); biotroph (maize); industrial uses
Clades: Euascomycota, Hemiascomycota, Archiascomycota, Basidiomycota, Zygomycota
51+ More funded and in progress world-wide
Species | Clade | Sequencing center
Schizosaccharomyces japonicus | Archaeascomycota | Broad-FGI
Schizosaccharomyces octosporus | Archaeascomycota | Broad-FGI
Pneumocystis carinii | Archaeascomycota | Sanger, UC, Broad-FGI
Pneumocystis carinii hominis | Archaeascomycota | UC, Broad-FGI, UC
Amanita bisporigera | Basidiomycota: Homobasidiomycota | MSU
Crinipellis perniciosa | Basidiomycota: Homobasidiomycota | Univ Campinas
Ganoderma lucidum | Basidiomycota: Homobasidiomycota | Yang-Ming Univ
Hebeloma cylindrosporum | Basidiomycota: Homobasidiomycota | INRA
Laccaria bicolor | Basidiomycota: Homobasidiomycota | JGI-DOE
Phakopsora pachyrhizi | Basidiomycota: Homobasidiomycota | JGI-DOE
Postia placenta | Basidiomycota: Homobasidiomycota | JGI-DOE
Schizophyllum commune | Basidiomycota: Homobasidiomycota | JGI-DOE
Sporobolomyces roseus | Basidiomycota: Urediniomycota | JGI-DOE
Phakopsora meibomiae | Basidiomycota: Urediniomycota | JGI-DOE
Batrachochytrium dendrobatidis | Chytridiomycota | Broad-FGI & JGI-DOE
Piromyces sp. | Chytridiomycota | JGI-DOE
Glomus intraradices | Glomeromycota | JGI-DOE
Phycomyces blakesleeanus | Zygomycota | JGI-DOE
Brachiola algerae | Microsporidia | Genoscope
Nosema (Antonospora) locustae | Microsporidia | MBL
Enterocytozoon bieneusi | Microsporidia | Tufts Univ
Table C.3: Additional funded fungal genome sequencing projects as of Spring 2006. These data were partially derived from the Genomes online database (190)
Sequencing In-Progress
Species | Clade | Sequencing center
Aspergillus niger | Euascomycota: Eurotiomycota | DOE-JGI
Aspergillus flavus | Euascomycota: Eurotiomycota | NCSU
Aspergillus clavatus | Euascomycota: Eurotiomycota | OU
Neosartorya fischeri | Euascomycota: Eurotiomycetes | TIGR
Histoplasma capsulatum WU24 | Euascomycota: Eurotiomycota | Broad-FGI
Histoplasma capsulatum 186R, 217B | Euascomycota: Eurotiomycota | WUSTL
Coccidioides posadasii | Euascomycota: Eurotiomycota | TIGR
Coccidioides immitis 10 strains | Euascomycota: Eurotiomycota | Broad-FGI & TIGR
Paracoccidioides brasiliensis | Euascomycota: Eurotiomycota | Univ of Brazil
Ascosphaera apis | Euascomycota: Eurotiomycota | BCM
Epichloe festucae | Euascomycota: Sordariomycetes | UK
Podospora anserina | Euascomycota: Sordariomycetes | Broad-FGI
Trichoderma atroviride | Euascomycota: Sordariomycetes | DOE-JGI
Trichoderma virens | Euascomycota: Sordariomycetes | DOE-JGI
Leptosphaeria maculans | Euascomycota: Dothideomycetes | Genoscope
Alternaria brassicicola | Euascomycota: Dothideomycetes | VPI & WUSTL
Xanthoria parietina (lichen) | Euascomycota: Lecanoromycetes | DOE-JGI
Candida albicans WO-1 | Hemiascomycota | Broad-FGI
Lodderomyces elongisporus | Hemiascomycota | Broad-FGI
Pichia stipitis | Hemiascomycota | JGI-DOE
Saccharomyces bayanus | Hemiascomycota | (49, 167)
Saccharomyces castellii | Hemiascomycota | (49)
Saccharomyces cerevisiae RM11-1A | Hemiascomycota | Broad-FGI
Saccharomyces cerevisiae YJM789 | Hemiascomycota | (113)
Saccharomyces kluyveri | Hemiascomycota | WUSTL (finishing)
Saccharomyces kudriavzevii | Hemiascomycota | (49)
Saccharomyces mikatae | Hemiascomycota | (49, 167)
Saccharomyces paradoxus | Hemiascomycota | (167)
Saccharomyces pastorianus | Hemiascomycota | Kitasato Univ
Zygosaccharomyces rouxii | Hemiascomycota | CNRS-Genoscope
Table C.2: In progress and funded fungal genome sequencing projects as of Spring 2006. These data were partially derived from the Genomes online database (190)
Genome Annotation
• Only a genome assembly (no annotation) available for half of the genomes
• Developed an automated annotation pipeline for fungal genomes
• Combines evidence-based and ab initio gene prediction with species-specific training
[Figure: annotation and analysis pipeline flowchart, http://fungal.genome.duke.edu]
• Inputs: genome, proteins, ESTs
• Ab initio gene predictors: SNAP, Twinscan, Glimmer, Genscan
• Format conversion: ZFF to GFF3, GFF2 to GFF3, GFF to AA (Tools::Genscan, Tools::Glimmer, Tools::GFF)
• Similarity searches and alignment: BLASTZ, BLASTN, HMMER, exonerate protein2genome, exonerate est2genome, protein to genome coordinates (SearchIO, Bio::DB::GFF)
• Non-coding RNA: Rfam, tRNAscan
• Combiner: GLEAN, producing the final genes shown in the Generic Genome Browser
• Analysis of predicted proteins: FASTA all-vs-all, find orthologs, MCL, gene families, multiple sequence alignment, aa2cds alignment (Bio::AlignIO), intron mapping into alignment, intron analysis
[Figure: gene family workflow for the 37 fungal species: FASTA all-vs-all, then MCL, then gene families, then CAFE. The example tree has branches labeled A to E and is shown with a table of per-species family counts.]
Family 1: P < 0.001, branch A
Family 2: P < 0.001, branch B
Family 3: P = 0.02, branches C and E
Family 4: P = 0.03, branch D
Gene family sizes follow power law distribution
[Figure: log-log plot of frequency of family size (y-axis) against family size (x-axis) for N. crassa, A. gossypii, R. oryzae, A. oryzae, A. terreus, C. cinereus, and U. maydis]
• Multicopy genes: sugar transporters, P450 enzymes
• Single-copy genes: PRP8 (splicing), CDC48 (cell cycle ATPase)
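One rough way to check a power-law claim like this is to fit a straight line to log(frequency) against log(family size). The sketch below does that by ordinary least squares. The helper name `fit_power_law` and the toy family sizes are illustrative assumptions, not data or code from the talk, and a maximum-likelihood fit would be more robust on real data.

```python
import math
from collections import Counter

def fit_power_law(family_sizes):
    """Least-squares fit of log10(frequency) = intercept + slope * log10(size).

    family_sizes: one integer per gene family (its number of member genes).
    Returns (slope, intercept); a power law gives a straight line with slope < 0.
    """
    freq = Counter(family_sizes)          # family size -> number of families
    xs = [math.log10(size) for size in freq]
    ys = [math.log10(count) for count in freq.values()]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two distinct family sizes")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy data (made up): frequency = 1000 * size**-2 for sizes 1, 2, 5, 10.
toy_sizes = [1] * 1000 + [2] * 250 + [5] * 40 + [10] * 10
slope, intercept = fit_power_law(toy_sizes)
```

On the toy data the fitted slope recovers the exponent used to build it; real genomes would scatter around the line, especially for the few very large families.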
Phylogenetic evaluation of gene family size change
• Previous methods used only ad hoc statistics
• Explicit model for gene family size change: a birth-death (BD) model
• Apply the BD model to family size along the phylogeny using probabilistic graphical models
• CAFE - Computational Analysis of gene Family Evolution
Hahn et al., Genome Res 2005; De Bie et al., Bioinformatics 2006
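To make the birth-death idea concrete, here is a minimal sketch of the probability that a family of size s at the start of a branch has size c at its end. It assumes equal per-gene birth and death rates (lambda), the special case with a simple closed form: with alpha = lambda*t / (1 + lambda*t), P(c | s) = sum over j from 0 to min(s, c) of C(s, j) * C(s + c - j - 1, s - 1) * alpha^(s + c - 2j) * (1 - 2*alpha)^j. The function name and the rate and branch length in the example are illustrative assumptions, not CAFE's code or fitted values.

```python
from math import comb

def bd_transition_prob(start, end, lam, t):
    """P(size `start` -> size `end` over time t) for a linear birth-death
    process with equal birth and death rate `lam` (an assumed simplification)."""
    if start == 0:
        # An extinct family cannot regain genes under this model.
        return 1.0 if end == 0 else 0.0
    alpha = lam * t / (1.0 + lam * t)
    total = 0.0
    for j in range(min(start, end) + 1):
        total += (comb(start, j)
                  * comb(start + end - j - 1, start - 1)
                  * alpha ** (start + end - 2 * j)
                  * (1.0 - 2.0 * alpha) ** j)
    return total

# Illustrative values only: rate per gene per million years, branch of 100 My.
lam_example, t_example = 0.002, 100.0
alpha_example = lam_example * t_example / (1.0 + lam_example * t_example)
p_loss = bd_transition_prob(1, 0, lam_example, t_example)     # single gene lost
p_5_to_12 = bd_transition_prob(5, 12, lam_example, t_example)  # large expansion
```

A change that is very improbable under the fitted rate, such as the 5 to 12 jump here, is the kind of event a per-branch P-value would flag.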
CAFE
• Ancestral states
• Birth and Death rate
• Per branch changes
• P-values
Methods: gene family identification
• All-vs-All pairwise sequence searches (FASTP)
• Cluster genes by similarity using Markov CLustering (MCL) algorithm
• Identify families with unusually large size changes along phylogeny with CAFE
• 37 fungal genomes from 5 major clades
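The clustering step can be illustrated with a toy version of Markov clustering: alternate expansion (squaring the column-stochastic matrix) with inflation (raising entries to a power and renormalizing) until the flow separates into clusters. This is a minimal sketch of the algorithm's core idea, not the `mcl` program used for the analysis. The function names, the inflation value, and the similarity matrix are assumptions for illustration, and in practice the edge weights would come from the all-vs-all search scores.

```python
def normalize_columns(m):
    """Scale each column of a square matrix so it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]

def mcl(similarity, inflation=2.0, iterations=50):
    """Toy Markov clustering on a dense symmetric similarity matrix."""
    n = len(similarity)
    # Add self-loops so every column has mass, then make it column-stochastic.
    m = [[similarity[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        # Expansion: flow spreads along two-step paths (matrix squared).
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: strong flows are boosted, weak flows decay.
        m = normalize_columns([[v ** inflation for v in row] for row in m])
    # Rows that keep non-zero mass act as attractors; the columns they
    # cover are the members of one cluster.
    clusters = set()
    for row in m:
        members = frozenset(j for j, v in enumerate(row) if v > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)

# Toy similarity graph: genes 0-2 hit each other, genes 3-4 hit each other.
toy = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
families = mcl(toy)
```

Each returned list is one putative gene family; the inflation parameter controls how finely families are split.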
Summary of fungal family analysis
• 917 families have at least one member in the outgroup and are present in more than 75% of ingroup species.
• 47 families had significant expansions and contractions
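The family filter described above can be sketched as a simple pass over a table of per-species counts. The species and family names below are placeholders and `select_families` is a hypothetical helper; only the two criteria (outgroup presence, more than 75% of ingroup species) come from the slide.

```python
def select_families(family_counts, ingroup, outgroup, min_ingroup_fraction=0.75):
    """Keep families with a member in some outgroup species and present in
    more than `min_ingroup_fraction` of the ingroup species."""
    selected = []
    for family, counts in family_counts.items():
        in_outgroup = any(counts.get(sp, 0) > 0 for sp in outgroup)
        n_present = sum(1 for sp in ingroup if counts.get(sp, 0) > 0)
        if in_outgroup and n_present / len(ingroup) > min_ingroup_fraction:
            selected.append(family)
    return selected

# Placeholder species and counts, for illustration only.
ingroup = ["sp_A", "sp_B", "sp_C", "sp_D"]
outgroup = ["sp_out"]
family_counts = {
    "fam1": {"sp_out": 2, "sp_A": 1, "sp_B": 3, "sp_C": 1, "sp_D": 2},
    "fam2": {"sp_out": 0, "sp_A": 4, "sp_B": 2, "sp_C": 1, "sp_D": 1},  # no outgroup member
    "fam3": {"sp_out": 1, "sp_A": 1, "sp_B": 1, "sp_C": 1, "sp_D": 0},  # exactly 75%, not more
}
kept = select_families(family_counts, ingroup, outgroup)
```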
Families with significant expansions
Vitamin & Cofactor transport
Lactose & sugar transport
Amine transport
Myo-inositol, quinate, and glucose transport
Oligopeptide transport
ABC transporter
MFS, drug pump, & sugar transport
Transport
Monocarboxylate & sugar transport
ABC transport
Amino acid permease
Methyltransferase
Cytochrome P450: CYP64
Cytochrome P450: CYP53,57A
Cytochrome P450
Kinase
Subtilase family
NADH flavin oxidoreductase
Aldehyde dehydrogenase
Aldo/keto reductase
Multicopper oxidase
AMP-binding enzyme
Groups: Transporters, Kinases, P450, Oxidation
49 significant families
Transporters
• Of 45 significant families, 22 were related to transport
• Vitamin and amino acid transport
• Sugar and sugar-like transporters
• Multidrug and efflux pumps
• ABC transporters (ATP Binding Cassette)
[Figure: Vitamin & Cofactor Transporters family sizes mapped onto the 37-species phylogeny; x-axis: million years ago, 0 to 900. Marked branches have significant (P<0.05) expansions or contractions.]
Species (family size): Rhizopus oryzae 21; Neurospora crassa 20; Podospora anserina 27; Chaetomium globosum 19; Magnaporthe grisea 46; Fusarium verticillioides 84; Fusarium graminearum 62; Trichoderma reesei 33; Sclerotinia sclerotiorum 22; Botrytis cinerea 25; Stagonospora nodorum 66; Uncinocarpus reesii 25; Coccidioides immitis 22; Histoplasma capsulatum 17; Aspergillus fumigatus 44; Aspergillus nidulans 50; Aspergillus terreus 64; Aspergillus oryzae 90; Ashbya gossypii 5; Kluyveromyces lactis 9; Saccharomyces cerevisiae 7; Candida glabrata 6; Candida lusitaniae 18; Debaryomyces hansenii 24; Candida guilliermondii 18; Candida tropicalis 8; Candida albicans 7; Candida dubliniensis 7; Yarrowia lipolytica 30; Schizosaccharomyces pombe 10; Cryptococcus neoformans neoformans 33; Cryptococcus neoformans grubii 32; Cryptococcus gattii WM276 25; Cryptococcus gattii R265 26; Phanerochaete chrysosporium 27; Coprinus cinereus 24; Ustilago maydis 13
Named internal nodes (family size): Sordariomycetes 31; Eurotiomycota 33; Euascomycota 30; Candidacae 15; Hemiascomycota 21; HemiEuascomycota 23; Ascomycota 22; Basidiomycota 22
Branches with transporter expansions
• Sugar related, drug pump, and Major Facilitator Superfamily
• Aspergillus spp, Fusarium spp, S. nodorum
• Euascomycota
• Vitamin transport
• C. neoformans, Fusarium
• A. nidulans (Biotin)
[Images: Aspergillus, Fusarium, S. nodorum, C. neoformans]
Family size contractions
• Histoplasma, Coccidioides - many families
• Hemiascomycetes - P450
• C. neoformans - P450
• U. maydis - Lactose transport
[Figure: phylogeny of the 37 species with named clades (Pyrenomycota, Sordariomycota, Aspergillus, Eurotiomycota, Candidacae, Saccharomyces, Hemiascomycota, Hemi-Euascomycota, Ascomycota, Homobasidiomycota, Hymenomycota, Basidiomycota); x-axis: million years ago, 0 to 900; node dates marked at 400 My and 966 My]
Focus on Basidiomycota
U. maydis, C. cinereus, P. chrysosporium, C. neoformans
Cryptococcus sugar transporters expansion
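[Figure: sugar transporter family sizes on the Basidiomycota tree; x-axis: million years ago, 0 to 400]
Species (family size): U. maydis 19; C. gattii R265 47; C. gattii WM276 50; C. neoformans JEC21 57; C. neoformans H99 59; P. chrysosporium 23; C. cinereus 20
Internal nodes (family size): gattii 50; neoformans 57; Cryptococcus 50; Homobasidio 23; Hymenomycota 24; root 24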
Sugar transporter use in phytopathogens
• Sugar transporters are used to extract nutrients from host
• Haustorium: specialized structure for plant parasitism
• Many sugar transporters highly and specifically expressed in haustoria
[Image: haustorium. Credit: Robert Bauer, http://tolweb.org/]
FIG. 1. The external and internal structures of C. neoformans are shown by means of a modified India ink preparation. Magnification, ca. ×1,000.
Cryptococcus sugar transporters
• 3x as many sugar transporters in C. neoformans (~50) as in other basidiomycetes
• “sugar coated killer”
• Capsule is a mixture of glucose, xylose, and mannose.
• Transporters could be important in capsule synthesis
Zerpa et al, 1996
Gene family changes
• P450 (CYP64) - [Homobasidio]
• Hydrophobins - [Homobasidio]
• Monosaccharide metabolism - [Homobasidio]
• Oxidoreductase - [Cryptococcus]
P450 CYP64
[Figure: CYP64 family sizes on the Basidiomycota tree; x-axis: million years ago, 0 to 400]
Species (family size): U. maydis 18; C. gattii R265 5; C. gattii WM276 5; C. neoformans JEC21 5; C. neoformans var. grubii 5; P. chrysosporium 163; C. cinereus 141
Internal nodes (family size): gattii 5; neoformans 5; Cryptococcus 5; Homobasidiomycota 136; Hymenomycota 23; Basidiomycota 23
P450 enzymes are involved in the synthesis and cleavage of chemical bonds, and in drug metabolism in animals.
CYP64: a step in the Aspergillus spp. aflatoxin pathway.
In P. chrysosporium, implicated in lignin and hydrocarbon degradation.
CYP64 expansions arose from independent duplications
[Figure: CYP64 gene tree of C. cinereus (ccin) and P. chrysosporium (pchr) genes, with clades labeled "C. cinereus expansion" and "P. chrysosporium expansion". Photos: Tom Volk, Mario Cervini]
Local duplications created CYP64 expansion
[Figure: genome browser view of pchr_24, positions 9k to 24k, from http://fungal.genome.duke.edu]
GLEAN models (probability): GLEAN_02414 (1); GLEAN_02415 (0.999937); GLEAN_02416 (0.646357); GLEAN_02417 (0.990598)
Pfam domains track: Cytochrome P450 (p450) hits with e-values 1e-28, 9e-07, 6.3e-23, 6e-26
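One simple way to look for local duplications like these is to sort a family's members by position and group neighbours that sit close together on the same scaffold. The sketch below assumes gene coordinates are already available as tuples; the function name `tandem_clusters`, the 5 kb gap threshold, and the example genes and coordinates are all made up for illustration and do not come from the annotation shown above.

```python
def tandem_clusters(genes, max_gap=5000):
    """Group family members lying within `max_gap` bp of each other.

    genes: iterable of (name, scaffold, start, end) for ONE gene family.
    Returns a list of (scaffold, [gene names]) for runs of two or more genes.
    """
    by_scaffold = {}
    for name, scaffold, start, end in genes:
        by_scaffold.setdefault(scaffold, []).append((start, end, name))

    clusters = []
    for scaffold in sorted(by_scaffold):
        current, last_end = [], None
        for start, end, name in sorted(by_scaffold[scaffold]):
            # A large gap closes the current run of neighbouring genes.
            if current and start - last_end > max_gap:
                if len(current) > 1:
                    clusters.append((scaffold, current))
                current = []
            current.append(name)
            last_end = end if len(current) == 1 else max(last_end, end)
        if len(current) > 1:
            clusters.append((scaffold, current))
    return clusters

# Hypothetical family members (names and coordinates invented).
example_genes = [
    ("geneA", "scaf1", 9000, 11500),
    ("geneB", "scaf1", 12500, 15000),
    ("geneC", "scaf1", 16000, 19000),
    ("geneD", "scaf1", 80000, 82000),
    ("geneE", "scaf2", 1000, 3000),
]
local_dups = tandem_clusters(example_genes)
```

A family whose members mostly fall into a few such clusters is consistent with expansion by local duplication, although confirming it would still need the gene tree.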
Interpretation of CYP64 expansion
[Figure: CYP64 family sizes on the Basidiomycota tree (same counts as on the P450 CYP64 slide); x-axis: million years ago, 0 to 400, with angiosperm diversification marked]
Hydrophobin Family
• Self-assembling proteins involved in the fungal cell wall
• Part of what makes a mushroom
• 8 cysteine residues critical to function
• Help spores stay airborne by resisting water
Hydrophobin family size: P. chrysosporium 21; C. cinereus 33; C. neoformans 0; U. maydis 2
[Figure: hydrophobin gene tree (scale bar 0.1) of U. maydis (umay), C. cinereus (ccin), and P. chrysosporium (pchr) genes, with local duplications marked in C. cinereus and P. chrysosporium]
Hydrophobin Expansion
• Due to several local duplications
• Expansion is lineage specific
• Important in cell wall construction and mushroom formation
Conclusions
• Sugar transporters are highly expanded in independent lineages
• Saprophytic and phytopathogenic lifestyles
• P450 CYP64 independent expansions in Homobasidiomycetes
• Lignin degradation and saprophytic lifestyles
• Family size contractions among lineages containing primary pathogens
• Genome streamlining?
Future directions
• Confirm contractions in clade of highly pathogenic fungi
• Focus on more clade specific families
• Can we classify fungal lifestyles by family composition?
Acknowledgements
Duke UPGG
Matthew Hahn
Fred Dietrich
Mario Stanke, Ian Korf, Aaron Mackey
Sequencing centers: Broad Institute, Duke University, Joint Genome Institute, Génolevures, Stanford University, Washington University