Protein Evolution: Structure, Function, and Human Health. 11/28/2013, Dr. Daniel Gaston, Department of Pathology

Protein Evolution: Structure, Function, and Human Health

May 06, 2015


Education

Dan Gaston

Guest Lecture, Protein Biochemistry course on basics of evolution at the protein level and some applications.
Transcript
Page 1: Protein Evolution: Structure, Function, and Human Health

Protein Evolution

Structure, Function, and Human Health

11/28/2013 Dr. Daniel Gaston, Department of Pathology

Page 2: Protein Evolution: Structure, Function, and Human Health

So, about this evolution thing?

Why should I care? What use is it?

Page 3: Protein Evolution: Structure, Function, and Human Health


Page 4: Protein Evolution: Structure, Function, and Human Health


Page 5: Protein Evolution: Structure, Function, and Human Health

Lots of reasons

• Knowledge for its own sake is good
– Otherwise, why do science at all?

• Shapes our understanding of ecology and biological diversity

• Practical reasons
– Antibiotic resistance
– Microbiome: fecal transplantation
– Cancer
– Predicting gene/protein function
– Predicting the impact of mutations for potential to cause human disease (Genotype:Phenotype)

Page 6: Protein Evolution: Structure, Function, and Human Health

Evolution of Life on Earth

A (Very) Brief Overview

Page 7: Protein Evolution: Structure, Function, and Human Health

[Figure: rooted tree of life with three branches labelled Eubacteria, Eukaryota, and Archaebacteria. ROOT: Iwabe et al. 1989, Gogarten et al. 1989]

Page 8: Protein Evolution: Structure, Function, and Human Health


Page 9: Protein Evolution: Structure, Function, and Human Health


Page 10: Protein Evolution: Structure, Function, and Human Health
Page 11: Protein Evolution: Structure, Function, and Human Health

You are here

Page 12: Protein Evolution: Structure, Function, and Human Health

A Brief History of Cells and Molecules

• Origin of the Earth: ~4.5 billion years ago
• Origin of life: ~3.0-4.0 billion years ago
– Origin of self-replicating entities
– The RNA world (?)
– Origin of the first genes, proteins & membranes
– Gave rise to the first cells
– The Last Universal Common Ancestor (LUCA) of all cells
– Probably had 500-1000 genes
• First microfossils of bacteria: ~3.5 billion years ago (controversial), ~2.7 billion years ago (for certain)
• Oxygenation of the atmosphere: 2.3-2.4 billion years ago (by photosynthetic bacteria)
• Origin of eukaryotes: ~1.0-2.2 billion years ago (probably 1.5)
• Origin of animals: ~0.6-1.0 billion years ago

Page 13: Protein Evolution: Structure, Function, and Human Health

Some Definitions

• Homology = descent from a common ancestor
– Homology is all or nothing: sequences are either homologous (related) or not homologous (not related)
– Not the same as “similarity” (degrees of similarity are possible)

Page 14: Protein Evolution: Structure, Function, and Human Health

Some Definitions

• Divergence = change in two sequences over time (after splitting from a common ancestor)

• Convergence = similarity due to independent evolutionary events
– On the amino acid sequence level, it is relatively rare & difficult to prove (but see an example later)

[Figure: an ancestral sequence splitting into Sequence 1 and Sequence 2, each branch labelled T]

Page 15: Protein Evolution: Structure, Function, and Human Health

How does evolutionary change happen in proteins?

Page 16: Protein Evolution: Structure, Function, and Human Health

Evolution: Two Groups of Processes

• Mutation
– Many different processes that generate mutations
– Mutations are the raw materials needed for evolution to happen

• Selection and Drift
– Mutations happen in individuals
– Evolution happens in populations of organisms
– Selection and Genetic Drift affect the frequency of mutations in a population over time

Page 17: Protein Evolution: Structure, Function, and Human Health

Mutations

Page 18: Protein Evolution: Structure, Function, and Human Health

Point Mutations

[Figure: a DNA duplex carrying an unrepaired mispaired base (AGGTTCCAATTAA / TCCAAGGTCAATT) goes through REPLICATION (meiotic or mitotic division). One daughter duplex carries the wild-type allele (AGGTTCCAATTAA / TCCAAGGTTAATT) and the other carries the mutant allele (AGGTTCCAGTTAA / TCCAAGGTCAATT). For multicellular organisms these end up in a wild-type gamete and a mutant gamete.]

Page 19: Protein Evolution: Structure, Function, and Human Health

point mutation: AGTCCAAGGCCTTAA -------------> AGTTCAAGGCCTTAA

insertion (CCTTA): AGTCCAAGGCCTTAA -------------> AGTCCAAGGCCTTACCTTAA

deletion (AAGG): AGTCCAAGGCCTTAA -------------> AGTCC-CCTTAA

inversion: AGTCCAAGGCCTTAA -------------> AGTCCCCTTCCTTAA

translocation: AGTCCAAGGCCTTAA + GGTCCTGGAATTCAG -------------> AGTCCAAGGCC + GGTCCTGGAATTCAGTTAA

duplication: AGTCCAAGGCC --------------> AGTCCAAGGCCAGTCCAAGGCC

recombination (AAGG, AGGC): AGTCCAAGGCCTTAA ---------------> AGTCCAAAGGCTTAA
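The simpler mutation types on this slide can be reproduced with a few string operations. This is a minimal sketch, not part of the original lecture: the function names and the 0-based positions are assumptions chosen so that the outputs match the slide's examples, and the inversion is modelled as a reverse complement of the affected segment (which is what the slide's example shows).

```python
# Hypothetical helper functions; positions are 0-based.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def point_mutation(seq, pos, base):
    """Replace the single base at `pos` with `base`."""
    return seq[:pos] + base + seq[pos + 1:]

def insertion(seq, pos, fragment):
    """Insert `fragment` before index `pos`."""
    return seq[:pos] + fragment + seq[pos:]

def deletion(seq, start, length):
    """Remove `length` bases starting at `start`."""
    return seq[:start] + seq[start + length:]

def inversion(seq, start, length):
    """Replace a segment with its reverse complement."""
    segment = seq[start:start + length]
    flipped = segment.translate(COMPLEMENT)[::-1]
    return seq[:start] + flipped + seq[start + length:]

def duplication(seq):
    """Tandem duplication of the whole sequence."""
    return seq + seq

original = "AGTCCAAGGCCTTAA"
print(point_mutation(original, 3, "T"))   # C -> T at index 3
print(insertion(original, 14, "CCTTA"))   # CCTTA inserted near the 3' end
print(deletion(original, 5, 4))           # AAGG removed
print(inversion(original, 5, 4))          # AAGG replaced by CCTT
print(duplication("AGTCCAAGGCC"))
```

Translocation and recombination involve two molecules and are left out of the sketch.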

Page 20: Protein Evolution: Structure, Function, and Human Health

Larger Scale Mutations

Page 21: Protein Evolution: Structure, Function, and Human Health

Exon shuffling and Protein Domains

Exon 1   Exon 2   Exon 3

Page 22: Protein Evolution: Structure, Function, and Human Health

Exon shuffling and Protein Domains

Exon 1   Exon 2   Exon 3

Domain 1   Domain 2

Page 23: Protein Evolution: Structure, Function, and Human Health

Exon shuffling and Protein Domains

Exon 1   Exon 2   Exon 3

Page 24: Protein Evolution: Structure, Function, and Human Health

Exon shuffling and Protein Domains

Exon 1   Exon 2   Exon 3

Domain 2   Domain A

Page 25: Protein Evolution: Structure, Function, and Human Health

Genomic Scale Mutations

Gene 1   Gene 2

Page 26: Protein Evolution: Structure, Function, and Human Health

Genomic Scale Mutations

Gene 1   Gene 2

Page 27: Protein Evolution: Structure, Function, and Human Health

Gene Duplication

Gene 1   Gene 2

Page 28: Protein Evolution: Structure, Function, and Human Health

Gene Duplication

Gene 1   Gene 2   Gene 1a

Page 29: Protein Evolution: Structure, Function, and Human Health

Genetic Drift and Selection

Page 30: Protein Evolution: Structure, Function, and Human Health

Mutations vs. substitutions

•  Mutations happen in individual organisms

• A nucleotide ‘substitution’ occurs IF, after many generations, all individuals in the population harbour the ‘mutation’

• This process is called “fixation of mutations”

• substitution = fixed mutation

• When comparing homologous protein sequences between species, we are looking at amino acid substitutions

Page 31: Protein Evolution: Structure, Function, and Human Health

Fixation of alleles

Population with two alleles:

Proportion of [allele] = 1/14 (7.1%)
Proportion of [allele] = 13/14 (93%)

N generations

Proportion of [allele] = 1.0 (100%)

This is the same as saying that [allele] was fixed in the population in N generations. The ‘mutation’ became a ‘substitution’ after it was fixed in the population.

Page 32: Protein Evolution: Structure, Function, and Human Health

Natural selection and Neutral drift

• Positive selection
– Mutation confers fitness advantage (more offspring that survive)
– RARE

• Purifying selection (negative selection)
– Mutation confers fitness disadvantage (fewer offspring or ‘no’ viable offspring, e.g. lethal)
– FREQUENT

• Neutral evolution (genetic drift)
– Mutation has very little fitness effect
– Will drift in frequency in the population due to random sampling effects
– VERY FREQUENT

Page 33: Protein Evolution: Structure, Function, and Human Health

Nearly-neutral theory

Page 34: Protein Evolution: Structure, Function, and Human Health

Common Examples of Positive Selection

• MHC Genes
– Diversity = Good
– Very polymorphic in humans

• Envelope (gp120) of HIV
– Immune system evasion

• Enzymes involved in human dietary metabolism
– Accelerated positive selection over last ~10,000 years

Page 35: Protein Evolution: Structure, Function, and Human Health

Genetic Drift

Select a marble randomly from a jar and “copy” it into the next [generation]

Fixation of the plain blue allele in 5 generations
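The marble-jar picture can be turned into a small simulation. The sketch below is illustrative only: the jar size, the colours, the seed, and the function names are assumptions, not taken from the slide. Each new jar is filled by drawing marbles at random (with replacement) from the previous jar, with no selection at all, until only one colour is left.

```python
import random

def next_generation(jar, rng):
    """Fill a new jar by copying randomly chosen marbles from the old one."""
    return [rng.choice(jar) for _ in jar]

def drift_until_fixation(jar, rng, max_generations=100_000):
    """Repeat random sampling until a single allele (colour) remains."""
    generations = 0
    while len(set(jar)) > 1 and generations < max_generations:
        jar = next_generation(jar, rng)
        generations += 1
    return jar, generations

rng = random.Random(0)                      # fixed seed for repeatability
start = ["blue"] * 5 + ["red"] * 5          # two alleles at equal frequency
final_jar, n_generations = drift_until_fixation(start, rng)
print(final_jar[0], "fixed after", n_generations, "generations")
```

Which colour wins, and how quickly, depends only on the random draws. That is the point of the slide: with no fitness difference, one allele still reaches fixation by chance.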

Page 36: Protein Evolution: Structure, Function, and Human Health

Polymorphism

• Polymorphisms are sites with more than one allele present in a population
– Mutations that have not yet been fixed

Page 37: Protein Evolution: Structure, Function, and Human Health

Mutation and Codons

Not all mutations are created equal

Page 38: Protein Evolution: Structure, Function, and Human Health

Point mutations in protein genes are classified according to the genetic code:

The genetic code is degenerate: more than one codon often specifies a single amino acid. E.g. Serine has 6 codons, Tyrosine has 2 codons and Tryptophan has one codon!
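The degeneracy figures quoted here can be checked by counting codons per amino acid in the standard genetic code. This is a minimal sketch: the compact table encoding (bases in T, C, A, G order, one-letter amino acid codes, `*` for stop) is a common convention, and the variable names are assumptions made for the example.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

codons_per_aa = Counter(CODON_TABLE.values())
for aa in "SYW":  # Serine, Tyrosine, Tryptophan
    print(aa, codons_per_aa[aa])
```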

Page 39: Protein Evolution: Structure, Function, and Human Health

Point mutations in protein-coding genes

• synonymous (silent) substitutions: cause interchange between two codons that code for the same amino acid:
e.g. CTG --> CTA = Leu --> Leu
Mostly invisible to selection

• non-synonymous (replacement) mutations: cause change between codons that code for different amino acids (missense) or stop codons (nonsense)
e.g. CTG --> ATG = Leu --> Met
TGG --> TGA = Trp --> Stop
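The classification above amounts to translating the codon before and after the change and comparing the results. A minimal sketch using the standard genetic code follows; the function name `classify_codon_change` is hypothetical, and changes that start from a stop codon are deliberately not handled.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def classify_codon_change(ref_codon, alt_codon):
    """Label a codon change as synonymous, missense, or nonsense."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == "*":
        raise ValueError("reference codon is a stop codon")
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"

for ref, alt in [("CTG", "CTA"), ("CTG", "ATG"), ("TGG", "TGA")]:
    print(ref, "-->", alt, classify_codon_change(ref, alt))
```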

Page 40: Protein Evolution: Structure, Function, and Human Health
Page 41: Protein Evolution: Structure, Function, and Human Health

8 kinds of 1st codon-position synonymous mutation: R-->R and L-->L

Page 42: Protein Evolution: Structure, Function, and Human Health

126 kinds of 3rd-codon position synonymous mutation:
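The counts of 8 and 126 can be reproduced by enumerating every single-nucleotide change in every sense codon and keeping the ones that leave the amino acid unchanged. The sketch assumes the standard genetic code, counts each direction of change separately (TTA --> CTA and CTA --> TTA are two kinds), and excludes stop codons; these counting conventions are assumptions that happen to match the numbers on the slides.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Synonymous single-nucleotide changes, tallied by codon position (1, 2, 3)
synonymous_by_position = {1: 0, 2: 0, 3: 0}
for codon, aa in CODON_TABLE.items():
    if aa == "*":
        continue  # only sense codons are considered
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == aa:
                synonymous_by_position[pos + 1] += 1

print(synonymous_by_position)
```

Under these conventions there are no synonymous changes at the 2nd position, which is why nearly all silent mutations fall at the 3rd position.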

Page 43: Protein Evolution: Structure, Function, and Human Health

A Note on Indels

• Ignored because indels are far more likely to be deleterious
– More likely to result in frame shifts

• Can still be non-deleterious
– Particularly if in multiples of three
– Over evolutionary time indels more often observed in loops than more constrained structural elements

Page 44: Protein Evolution: Structure, Function, and Human Health

Evolutionary Rates

Speed of Evolution

Page 45: Protein Evolution: Structure, Function, and Human Health

Rates of protein evolution (i.e. rates at which individual amino acids are substituted)

•  Different regions in proteins have different rates of evolution (functional constraints)

•  Different proteins have different overall rates of evolution

Page 46: Protein Evolution: Structure, Function, and Human Health
Page 47: Protein Evolution: Structure, Function, and Human Health

Enolase

• Ubiquitous glycolytic enzyme, highly conserved throughout evolution

• TIM Barrel family doing an α-proton abstraction

[Figure labels: cMLE, MLE, Archaea, Bacteria, Euks, β, α, γ]

Page 48: Protein Evolution: Structure, Function, and Human Health

All Eukaryotes site rates (63 taxa) mapped on Lobster Enolase

low rates blue high rates red

Page 49: Protein Evolution: Structure, Function, and Human Health

Site rate categories 1 and 2 (slowest sites)

Page 50: Protein Evolution: Structure, Function, and Human Health

Site rates Categories 3 and 4

Page 51: Protein Evolution: Structure, Function, and Human Health

Site rates Categories 5 and 6

Page 52: Protein Evolution: Structure, Function, and Human Health

Site rates Categories 7 and 8 (fastest sites)

Page 53: Protein Evolution: Structure, Function, and Human Health

Evolutionary rates as a function of enolase structure/function

•  Rates of evolution increase from the centre of the molecule (slow) to the surface (fast)

• The pattern is probably due to:
– Distance from the catalytic centre --> catalytic residues don’t change (slowest), residues that interact with catalytic residues are constrained (slow)
– Geometric constraints - residues in the centre of the molecule have restricted ‘space’ around them that constrains them. At the surface, there are fewer such constraints
– Hydrophobic core in centre
– More loops and alpha helices on surface

•  NOTE: this pattern seems to work for soluble globular enzymes with catalytic centre in the centre of mass. It does not hold for structural proteins like tubulin, actin etc.

Page 54: Protein Evolution: Structure, Function, and Human Health

Rates of evolution of sites versus their structural position

• There are no completely general rules!
– It depends on what the protein is doing and where.

•  Functional sites (catalytic sites) or sites at interfaces (protein-protein interactions) are conserved

•  Geometric, chemical, folding and functional constraints (catalysis, binding) determine evolutionary constraints

Page 55: Protein Evolution: Structure, Function, and Human Health

Detecting and Quantifying Evolutionary Relationships

Page 56: Protein Evolution: Structure, Function, and Human Health

How do we know if two proteins are homologous?

(A) If sequences >100 amino acids long are >25% identical --> they are probably significantly similar and very likely to be homologous
- BLAST, FASTA, Smith-Waterman algorithms are likely to find them “significantly similar” (E-value << 1x10^-4)

(B) If they are >100 long and 15-25% identical (Twilight Zone) --> probably homologous BUT need to rigorously test it
- a number of methods are available, e.g. a permutation test

(C) If they are <15% identical --> difficult to prove homology
- test it
- if it's not significant, look for motifs in multiple alignments
- look at tertiary structure
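The rules of thumb in (A) to (C) can be written down as a small helper. This is a rough sketch under stated assumptions: the two sequences are already aligned (gaps as `-`), identity is computed over gap-free columns only (other definitions of percent identity exist), and the function names and return labels are hypothetical. It does not replace a proper significance test such as a BLAST E-value or a permutation test.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)

def homology_zone(identity, aligned_length):
    """Apply the slide's rough identity thresholds for sequences > 100 residues."""
    if aligned_length <= 100:
        return "too short for these rules of thumb"
    if identity > 25:
        return "likely homologous"
    if identity >= 15:
        return "twilight zone"
    return "difficult to prove"

print(percent_identity("MKT-AY", "MRTLAY"))
print(homology_zone(20.0, 150))
```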

Page 57: Protein Evolution: Structure, Function, and Human Health

[Figure: alignment region bracketed and labelled “15-23% identity”]

Page 58: Protein Evolution: Structure, Function, and Human Health
Page 59: Protein Evolution: Structure, Function, and Human Health

Applications

• Evolutionary methods for studying protein function
– Annotating novel proteins
– Functional divergence

• Predicting pathogenicity of mutations
– Mendelian disease
– Cancer

• Informing protein structure prediction

Page 60: Protein Evolution: Structure, Function, and Human Health

Applications of Evolutionary Biology to Medicine

Inherited Genetic Diseases and Cancer

Page 61: Protein Evolution: Structure, Function, and Human Health

Lynch Syndrome

• Autosomal dominant cancer syndrome
• Increased risk for many cancers, mostly colorectal cancer due to mismatch repair defects

Page 62: Protein Evolution: Structure, Function, and Human Health


Page 63: Protein Evolution: Structure, Function, and Human Health


Page 64: Protein Evolution: Structure, Function, and Human Health

Mutator Phenotype

• Inactivation of mismatch repair (MMR) genes led to mutator phenotypes in E. coli and yeast
• Included microsatellite instability

• Careful research identified human homologs
– MLH1 and MSH2
– Defects in these genes cause Lynch Syndrome

Page 65: Protein Evolution: Structure, Function, and Human Health

Mismatch Repair

• Mismatch Repair ->
• Microsatellite Instability ->
• Cancer

Most microsatellites are spread throughout the genome in non-genic regions

But some are found in important tumor suppressor genes

Page 66: Protein Evolution: Structure, Function, and Human Health

Applications of Evolutionary Biology to Medicine

Predicting Pathogenicity and Impact of Human Mutations

Page 67: Protein Evolution: Structure, Function, and Human Health

The Sequencing Revolution

Page 68: Protein Evolution: Structure, Function, and Human Health

Problem

• Often left with hundreds to thousands of potential mutations in a family that “track” with the disease
– Needle in a “stack of needles” problem

• Must discriminate neutral missense mutations from pathogenic ones

Page 69: Protein Evolution: Structure, Function, and Human Health

Evolution at Work

• Many programs exist to make these predictions:
– PolyPhen
– Mutation Taster
– EvoD
– SIFT
– PROVEAN
– FATHMM
– etc

Page 70: Protein Evolution: Structure, Function, and Human Health

Evolution at Work

• Important amino acids have low evolutionary rates
– Higher conservation

• The more important the protein, the more likely it is to be broadly found among eukaryotes
– Also higher overall conservation

• However, many important proteins in humans are only found in primates, mammals, or animals

Page 71: Protein Evolution: Structure, Function, and Human Health

Evolution at Work

Reference Sequence:
…RPLAHTY…

Multiple Sequence Alignment:
…RPLAHTY…
…RPLVHTY…
…RPIAHTY…
…RPIGHTY…
…RPIICTY…
…RPLACTY…
…RPLLCTY…

Page 72: Protein Evolution: Structure, Function, and Human Health

Evolution at Work

Reference Sequence:
…RPLAHTY…

Multiple Sequence Alignment:
…RPLAHTY…
…RPLVHTY…
…RPIAHTY…
…RPIGHTY…
…RPIICTY…
…RPLACTY…
…RPLLCTY…

Compute an Evolutionary Conservation Score for Each Position
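One simple way to score conservation per column is Shannon entropy, rescaled so that 1.0 means fully conserved. This is only an illustrative sketch: the slide does not say which score is used, and tools such as SIFT or PolyPhen use more elaborate models. The function name and the choice to normalise by log2(20) are assumptions. The alignment is the seven-residue window from the slide.

```python
import math
from collections import Counter

ALIGNMENT = [
    "RPLAHTY",
    "RPLVHTY",
    "RPIAHTY",
    "RPIGHTY",
    "RPIICTY",
    "RPLACTY",
    "RPLLCTY",
]

def conservation_scores(alignment, alphabet_size=20):
    """Return 1 - (column entropy / maximum entropy) for every column."""
    max_entropy = math.log2(alphabet_size)
    scores = []
    for column in zip(*alignment):
        counts = Counter(column)
        total = len(column)
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores

for position, score in enumerate(conservation_scores(ALIGNMENT), start=1):
    print(position, round(score, 3))
```

Positions 1, 2, 6 and 7 (R, P, T, Y) never vary and score 1.0; position 4 is the most variable column and scores lowest.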

Page 73: Protein Evolution: Structure, Function, and Human Health

Evolution at Work

Reference Sequence:
…RPLACTY…

Multiple Sequence Alignment:
…RPLAHTY…
…RPLVHTY…
…RPIAHTY…
…RPIGHTY…
…RPIICTY…
…RPLACTY…
…RPLLCTY…

Conservative changes more likely to be neutral

Page 74: Protein Evolution: Structure, Function, and Human Health

Evolution at Work

Reference Sequence:
…RPLACTP…

Multiple Sequence Alignment:
…RPLAHTY…
…RPLVHTY…
…RPIAHTY…
…RPIGHTY…
…RPIICTY…
…RPLACTY…
…RPLLCTY…

Radical changes more likely to be deleterious
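The two examples (H to C at the fifth position, Y to P at the seventh) can be told apart with a very crude rule: has the new residue ever been seen at that position in homologs? The sketch below is a deliberate simplification with hypothetical function names; real predictors weigh substitution matrices, phylogeny, and structure rather than a simple seen/not-seen check.

```python
ALIGNMENT = [
    "RPLAHTY",
    "RPLVHTY",
    "RPIAHTY",
    "RPIGHTY",
    "RPIICTY",
    "RPLACTY",
    "RPLLCTY",
]

def observed_residues(alignment, position):
    """Set of residues seen at a 0-based alignment position."""
    return {sequence[position] for sequence in alignment}

def assess_substitution(alignment, position, new_residue):
    """Crude heuristic: residues already seen in homologs are more likely tolerated."""
    if new_residue in observed_residues(alignment, position):
        return "seen in homologs: more likely neutral"
    return "not seen in homologs: more likely deleterious"

print(assess_substitution(ALIGNMENT, 4, "C"))  # H -> C, as in …RPLACTY…
print(assess_substitution(ALIGNMENT, 6, "P"))  # Y -> P, as in …RPLACTP…
```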

Page 75: Protein Evolution: Structure, Function, and Human Health

Applications of Evolutionary Biology to Protein Function

Functional Divergence

Page 76: Protein Evolution: Structure, Function, and Human Health

Functional Divergence

Gene 1   Gene 2   Gene 1a

Over evolutionary time scales, Gene 1 and Gene 1a are known as paralogs, a subset of homologs

They can diverge from one another in sequence, as well as function.

Page 77: Protein Evolution: Structure, Function, and Human Health

Types of Functional Divergence

• Subfunctionalization
– Paralog specializes and retains only a subset of ancestral function

• Neofunctionalization
– Paralog gains a new function, and loses old function(s)

• Subneofunctionalization
– Paralog undergoes rapid subfunctionalization but then undergoes neofunctionalization

Page 78: Protein Evolution: Structure, Function, and Human Health

Functional Divergence

[Figure: Gene A giving rise to two gene families, Family A and Family B]

Page 79: Protein Evolution: Structure, Function, and Human Health

Functional Divergence

…A L H… Species 1
…A L H… Species 2
…A L H… Species 3
…A L H… Species 4
…A L H… Species 5
…A L H… Species 6

…R A H… Species 1
…R R H… Species 2
…R C H… Species 3
…R A H… Species 4
…R A H… Species 5
…R Y H… Species 6

Family B

Family A
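Comparing the two alignments column by column shows the patterns that suggest functional divergence. The sketch below uses the three-residue windows from the slide; the names `FAMILY_1` and `FAMILY_2` are placeholders, since the transcript does not make clear which alignment is Family A and which is Family B, and the category labels are hypothetical.

```python
FAMILY_1 = ["ALH", "ALH", "ALH", "ALH", "ALH", "ALH"]
FAMILY_2 = ["RAH", "RRH", "RCH", "RAH", "RAH", "RYH"]

def classify_site(column_1, column_2):
    """Compare the residues seen at one site in two paralogous families."""
    residues_1, residues_2 = set(column_1), set(column_2)
    if len(residues_1) == 1 and len(residues_2) == 1:
        if residues_1 == residues_2:
            return "conserved, same residue"
        return "conserved, different residue"
    if len(residues_1) == 1 or len(residues_2) == 1:
        return "conserved in one family only"
    return "variable in both"

site_labels = [classify_site(col_1, col_2)
               for col_1, col_2 in zip(zip(*FAMILY_1), zip(*FAMILY_2))]
for position, label in enumerate(site_labels, start=1):
    print(position, label)
```

The first site (A in one family, R in the other) and the second site (always L in one family, variable in the other) are the candidates for functional divergence; the third site (H everywhere) is simply conserved.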

Page 80: Protein Evolution: Structure, Function, and Human Health

Glyceraldehyde-3-Phosphate Dehydrogenase

Glyceraldehyde-3-Phosphate + NAD+ + Pi <-> 1,3-Bisphosphoglycerate + NADH + H+

Cytosol: Glycolysis

Page 81: Protein Evolution: Structure, Function, and Human Health

Glyceraldehyde-3-Phosphate Dehydrogenase

Glyceraldehyde-3-Phosphate + NADP+ + Pi <-> 1,3-Bisphosphoglycerate + NADPH + H+

Plastid: Calvin Cycle

Page 82: Protein Evolution: Structure, Function, and Human Health

GAPDH Evolution

[Figure: GAPDH tree with clades labelled Green Plants, Cyanobacteria, ‘Chromalveolates’, and two clades labelled Cytosolic GapC]

Page 83: Protein Evolution: Structure, Function, and Human Health

GAPDH Structure

Page 84: Protein Evolution: Structure, Function, and Human Health

NADPH Binding Necessary for Calvin Cycle Function