Jena Institute of Molecular Biotechnology
Swetlana Nikolajewa, Thomas Wilhelm
Theoretical Systems Biology
Mar 18, 2016
Overview
The genetic code - introduction
The new classification scheme of the genetic code shows:
  symmetry characteristics
  an explanation for the number (22) of tRNA genes in the mammalian mitochondrial genome
  amino acid patterns and regularities of codons (strong, mixed and weak codons)
  possible predecessors of our contemporary quaternary triplet code
The Genetic Code
Triplets of nucleotide bases (A, G, C, U) code for 20 amino acids
two purines (A, G), two pyrimidines (C, U)
64 possible codons (4 x 4 x 4 = 4^3)
3 termination codons: UGA, UA(G/A)
61 codons for amino acid coding
Met (AUG) codon is also the start codon
                      2nd base
1st base   U         C         A          G          3rd base
U          UUU Phe   UCU Ser   UAU Tyr    UGU Cys    U
           UUC Phe   UCC Ser   UAC Tyr    UGC Cys    C
           UUA Leu   UCA Ser   UAA Stop   UGA Stop   A
           UUG Leu   UCG Ser   UAG Stop   UGG Trp    G
C          CUU Leu   CCU Pro   CAU His    CGU Arg    U
           CUC Leu   CCC Pro   CAC His    CGC Arg    C
           CUA Leu   CCA Pro   CAA Gln    CGA Arg    A
           CUG Leu   CCG Pro   CAG Gln    CGG Arg    G
A          AUU Ile   ACU Thr   AAU Asn    AGU Ser    U
           AUC Ile   ACC Thr   AAC Asn    AGC Ser    C
           AUA Ile   ACA Thr   AAA Lys    AGA Arg    A
           AUG Met   ACG Thr   AAG Lys    AGG Arg    G
G          GUU Val   GCU Ala   GAU Asp    GGU Gly    U
           GUC Val   GCC Ala   GAC Asp    GGC Gly    C
           GUA Val   GCA Ala   GAA Glu    GGA Gly    A
           GUG Val   GCG Ala   GAG Glu    GGG Gly    G
The Common Genetic Code Table
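The counts quoted for the common table (64 codons, 3 termination codons, 61 coding codons, 20 amino acids) can be checked with a short Python sketch. The 64-letter string is the standard code written in the same U, C, A, G ordering as the table, with one-letter amino acid symbols and `*` for Stop; the names `BASES`, `AMINO_ACIDS` and `CODE` are illustrative choices, not part of the slides.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon. The first base varies slowest
# and the third fastest (UUU, UUC, UUA, UUG, UCU, ...). '*' marks Stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every triplet to its amino acid (or '*').
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

stop_codons = sorted(c for c, aa in CODE.items() if aa == "*")
coding_codons = [c for c, aa in CODE.items() if aa != "*"]
amino_acids = {aa for aa in CODE.values() if aa != "*"}

print(len(CODE))           # 64
print(stop_codons)         # ['UAA', 'UAG', 'UGA']
print(len(coding_codons))  # 61
print(len(amino_acids))    # 20
print(CODE["AUG"])         # M (Met, also the start codon)
```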
The new classification scheme of the genetic code
binary representation: purines (A, G) → 1, pyrimidines (C, U) → 0
2^3 = 8 different binary triplets: 000, 001, …, 111
each of these again has 8 possibilities, for instance:
  000 stands for three pyrimidines: CCC, CCU, UUC, …, UUU
  111 stands for three purines: GGG, GGA, GAA, …, AAA
C and G bind via 3 hydrogen bonds in complementary base pairing
A and U bind via 2 hydrogen bonds in complementary base pairing
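The two rules on this slide can be sketched in Python as follows. The function names are hypothetical, and classifying a codon by the hydrogen bonds of its first two bases (6 = strong, 5 = mixed, 4 = weak) is an assumption taken from the column headings of the scheme table shown on the following slides.

```python
from collections import Counter
from itertools import product

PURINES = "AG"        # coded as 1; pyrimidines (C, U) are coded as 0
STRONG_BASES = "CG"   # C-G pairs: 3 hydrogen bonds; A-U pairs: 2


def binary_triplet(codon):
    """Purine/pyrimidine pattern of a codon, e.g. 'AUG' -> '101'."""
    return "".join("1" if base in PURINES else "0" for base in codon)


def hydrogen_bonds(bases):
    """Total hydrogen bonds the given bases form in complementary pairing."""
    return sum(3 if base in STRONG_BASES else 2 for base in bases)


def strength(codon):
    """Assumed classification by the first two bases of the codon."""
    return {6: "strong", 5: "mixed", 4: "weak"}[hydrogen_bonds(codon[:2])]


codons = ["".join(c) for c in product("UCAG", repeat=3)]
per_pattern = Counter(binary_triplet(c) for c in codons)

print(binary_triplet("CCU"), binary_triplet("GAA"))  # 000 111
print(len(per_pattern), set(per_pattern.values()))   # 8 {8}
print(strength("CCU"), strength("UGG"), strength("AUG"))  # strong mixed weak
```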
The Common Genetic Code Table contains 64 fields…
Code   Strong codons        Mixed codons             Mixed codons          Weak codons
       6 hydrogen bonds     5 hydrogen bonds         5 hydrogen bonds      4 hydrogen bonds
000    Pro CC (C/U)         Ser UC (C/U)             Leu CU (C/U)          Phe UU (C/U)
001    Pro CC (A/G)         Ser UC (A/G)             Leu CU (A/G)          Leu UU (A/G)
100    Ala GC (C/U)         Thr AC (C/U)             Val GU (C/U)          Ile AU (C/U)
101    Ala GC (A/G)         Thr AC (A/G)             Val GU (A/G)          Ile/Met AU (A/G)
010    Arg CG (C/U)         Cys UG (C/U)             His CA (C/U)          Tyr UA (C/U)
011    Arg CG (A/G)         Stop/Trp UG (A/G)        Gln CA (A/G)          Stop UA (A/G)
110    Gly GG (C/U)         Ser AG (C/U)             Asp GA (C/U)          Asn AA (C/U)
111    Gly GG (A/G)         Arg AG (A/G)             Glu GA (A/G)          Lys AA (A/G)
Amino acids: Ala Alanine, Arg Arginine, Asn Asparagine, Asp Aspartic acid, Cys Cysteine, Gln Glutamine, Glu Glutamic acid, Gly Glycine, His Histidine, Ile Isoleucine, Leu Leucine, Lys Lysine, Met Methionine, Phe Phenylalanine, Pro Proline, Ser Serine, Thr Threonine, Trp Tryptophan, Tyr Tyrosine, Val Valine
The new classification scheme (standard genetic code)
The new scheme contains the same information in only 32 fields.
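The reduction from 64 to 32 fields can be illustrated with a minimal Python sketch: each field combines the first two bases with the class of the third base, pyrimidine (C/U) or purine (A/G). The helper `field` and the variable names are illustrative, not taken from the slides. Only two fields hold two different assignments, which matches the Ile/Met and Stop/Trp entries of the scheme.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code in U, C, A, G order; '*' marks Stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def field(codon):
    """First two bases plus the class of the third base."""
    third = "(A/G)" if codon[2] in "AG" else "(C/U)"
    return f"{codon[:2]} {third}"


# Collect the assignments that fall into each of the fields.
fields = defaultdict(set)
for codon, aa in CODE.items():
    fields[field(codon)].add(aa)

# Fields whose two codons are assigned differently.
split_fields = {f: "".join(sorted(aas)) for f, aas in fields.items() if len(aas) > 1}

print(len(fields))    # 32
print(split_fields)   # {'UG (A/G)': '*W', 'AU (A/G)': 'IM'}
```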
Deviations from the Standard Code
Code   Strong codons        Mixed codons             Mixed codons          Weak codons
       6 hydrogen bonds     5 hydrogen bonds         5 hydrogen bonds      4 hydrogen bonds
000    Pro CC (C/U)         Ser UC (C/U)             Leu CU (C/U) 1/1      Phe UU (C/U)
001    Pro CC (A/G)         Ser UC (A/G) 1/0         Leu CU (A/G) 1/2      Leu UU (A/G) 1/0
100    Ala GC (C/U)         Thr AC (C/U)             Val GU (C/U)          Ile AU (C/U)
101    Ala GC (A/G)         Thr AC (A/G)             Val GU (A/G)          Ile/Met AU (A/G) 5/0
010    Arg CG (C/U)         Cys UG (C/U)             His CA (C/U)          Tyr UA (C/U)
011    Arg CG (A/G)         Stop/Trp UG (A/G) 9/0    Gln CA (A/G)          Stop UA (A/G) 2/4
110    Gly GG (C/U)         Ser AG (C/U)             Asp GA (C/U)          Asn AA (C/U)
111    Gly GG (A/G)         Arg AG (A/G) 6/6         Glu GA (A/G)          Lys AA (A/G) 3/0
http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi
Mitochondrial Genomes Have Several Surprising Features
genetic code of mitochondria
only 22 tRNAs are required for mammalian mitochondrial protein synthesis
The Mammalian Mitochondrial Genetic Code
Code   Strong codons        Mixed codons             Mixed codons          Weak codons
       6 hydrogen bonds     5 hydrogen bonds         5 hydrogen bonds      4 hydrogen bonds
000    Pro CC (C/U)         Ser UC (C/U)             Leu CU (C/U)          Phe UU (C/U)
001    Pro CC (A/G)         Ser UC (A/G)             Leu CU (A/G)          Leu UU (A/G)
100    Ala GC (C/U)         Thr AC (C/U)             Val GU (C/U)          Ile AU (C/U)
101    Ala GC (A/G)         Thr AC (A/G)             Val GU (A/G)          Met/Met AU (A/G)
010    Arg CG (C/U)         Cys UG (C/U)             His CA (C/U)          Tyr UA (C/U)
011    Arg CG (A/G)         Trp/Trp UG (A/G)         Gln CA (A/G)          Stop UA (A/G)
110    Gly GG (C/U)         Ser AG (C/U)             Asp GA (C/U)          Asn AA (C/U)
111    Gly GG (A/G)         Stop AG (A/G)            Glu GA (A/G)          Lys AA (A/G)
http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi
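The differences between the two codes can be listed with a short Python sketch. Both 64-letter strings are written in U, C, A, G order with one-letter amino acid symbols (`*` for Stop), following the standard and vertebrate mitochondrial translation tables; the variable names are illustrative.

```python
from itertools import product

BASES = "UCAG"
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
VERT_MITO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"

codons = ["".join(c) for c in product(BASES, repeat=3)]

# codon -> (standard assignment, mitochondrial assignment) where they differ
differences = {
    codon: (std, mito)
    for codon, std, mito in zip(codons, STANDARD, VERT_MITO)
    if std != mito
}

print(differences)
# {'UGA': ('*', 'W'), 'AUA': ('I', 'M'), 'AGA': ('R', '*'), 'AGG': ('R', '*')}
```

The four reassigned codons correspond to the Trp/Trp UG (A/G), Met/Met AU (A/G) and Stop AG (A/G) fields of the mitochondrial table.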
The Mammalian Mitochondrial Code: 8 tRNAs for family codons + 14 tRNAs for non-family codons
Code   Strong codons        Mixed codons             Mixed codons          Weak codons
       6 hydrogen bonds     5 hydrogen bonds         5 hydrogen bonds      4 hydrogen bonds
000    tRNAPro CC           tRNASer1 UC              tRNALeu1 CU           tRNAPhe UU (C/U)
001    "                    "                        "                     tRNALeu2 UU (A/G)
100    tRNAAla GC           tRNAThr AC               tRNAVal GU            tRNAIle AU (C/U)
101    "                    "                        "                     tRNAMet AU (A/G)
010    tRNAArg CG           tRNACys UG (C/U)         tRNAHis CA (C/U)      tRNATyr UA (C/U)
011    "                    tRNATrp UG (A/G)         tRNAGln CA (A/G)      STOP UA (A/G)
110    tRNAGly GG           tRNASer2 AG (C/U)        tRNAAsp GA (C/U)      tRNAAsn AA (C/U)
111    "                    STOP AG (A/G)            tRNAGlu GA (A/G)      tRNALys AA (A/G)
A ditto mark (") means that the family tRNA of the row above also reads these codons.
http://mamit-trna.u-strasbg.fr/2DStructures.html
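The count of 22 tRNAs can be reproduced with a minimal Python sketch, under two assumptions that follow the slide title: one tRNA reads all four codons of a family box (same first two bases, one amino acid), and one tRNA reads each two-codon (C/U) or (A/G) field of a non-family box, except where the field is a Stop. The variable names are illustrative.

```python
from itertools import product

BASES = "UCAG"
# Vertebrate mitochondrial code in U, C, A, G order; '*' marks Stop.
VERT_MITO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), VERT_MITO)}

family, non_family = [], []
for doublet in ("".join(d) for d in product(BASES, repeat=2)):
    box = {CODE[doublet + third] for third in BASES}
    if len(box) == 1:
        # Family box: all four codons carry the same amino acid.
        family.append(doublet)
        continue
    for half in ("CU", "AG"):
        assignments = {CODE[doublet + third] for third in half}
        if assignments != {"*"}:
            # Two-codon field that codes for an amino acid.
            non_family.append(f"{doublet} ({half[0]}/{half[1]})")

print(len(family), len(non_family), len(family) + len(non_family))  # 8 14 22
```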
Amino acid patterns: Polar requirement of NCN and NUN codons
[Table: the new classification scheme of the standard genetic code, repeated from the earlier slide]
C. R. Woese, G. J. Olsen, M. Ibba, D. Söll. Aminoacyl-tRNA Synthetases, the Genetic Code, and the Evolutionary Process. Microbiol. Mol. Biol. Rev. (2000) 64: 202-236
Amino acid patterns: Hydrophobicity
[Table: the new classification scheme of the standard genetic code, repeated from the earlier slide]
Kyte & Doolittle, 1982, http://biology-pages.info
Codon-Anticodon symmetry
[Table: the new classification scheme of the standard genetic code, repeated from the earlier slide]
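One possible reading of this symmetry can be checked in Python. The sketch assumes that the anticodon is written as the base-by-base complement of the codon, without reversing it; this reading and all function names are assumptions, not statements from the slides. Under that assumption, complementation inverts every bit of the binary triplet (000 becomes 111, 001 becomes 110, and so on) while the hydrogen bond count of the first two bases, and hence the strong, mixed or weak column, stays the same.

```python
from itertools import product

COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}


def complement(codon):
    """Base-by-base complement, not reversed (assumed anticodon notation)."""
    return "".join(COMPLEMENT[base] for base in codon)


def binary_triplet(codon):
    """Purines (A, G) -> 1, pyrimidines (C, U) -> 0."""
    return "".join("1" if base in "AG" else "0" for base in codon)


def invert(bits):
    """Flip every bit of a binary string."""
    return "".join("1" if bit == "0" else "0" for bit in bits)


def doublet_bonds(codon):
    """Hydrogen bonds of the first two bases: C or G give 3, A or U give 2."""
    return sum(3 if base in "CG" else 2 for base in codon[:2])


codons = ["".join(c) for c in product("UCAG", repeat=3)]
symmetric = all(
    binary_triplet(complement(c)) == invert(binary_triplet(c))
    and doublet_bonds(complement(c)) == doublet_bonds(c)
    for c in codons
)

print(symmetric)                                      # True
print(complement("CCU"), binary_triplet("GGA"))       # GGA 111
```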
Point symmetry
[Table: the new classification scheme of the standard genetic code, repeated from the earlier slide]
D. Halitsky. Extending the (Hexa-)Rhombic Dodecahedral Model of the Genetic Code: the Code's Four 6-fold Degeneracies and the Ten Orthogonal Projections of the 5-cube as 3-cube. Computer Systems Technology 2004
Correlation of codon strength and amino acid properties
Measure                                       Strong codons   Mixed codons   Weak codons
Dinucleoside monophosphates
  Hydrophilicity (Weber & Lacey 1978)         1.686           1.434          1.235
  Hydrophilicity (Barzilay et al. 1973)       2.72            2.26           2.26
  Hydrophobicity (Garel et al. 1973)          2.556           3.413          3.982
Amino acids
  Molec. Weight (Handbook value)              907             1065.6         1217.5
  Molec. Volume (Grantham 1974)               381             637.5          906
  Refractivity (Jones 1975)                   83.86           140.03         186.51
  Alpha pK1 (Zimmerman et al. 1968)           16.96           17.11          17.43
  Bulkiness (Zimmerman et al. 1968)           93.22           124.345        143.54
  Specific volume (McMeekin et al. 1964)      5.26            5.37           5.8
  Polarity (Zimmerman et al. 1968)            107.16          109.58         58.14
  Polarity (Woese et al. 1967)                61.2            59.15          51
  Polarity (Grantham 1974)                    71.2            67             56.3
  Hydrophobicity (Jones 1975)                 9.18            8.385          16.93
  Hydrophobicity (Levitt 1976)                -2.2            1.6            8.8
  Hydrophobicity (Bull & Breese 1974)         3880            -165           -6790
  Hydrophilicity (Weber & Lacey 1978)         7.02            6.585          5.59
  Partition coefficient (Garel et al. 1973)   1.88            5.58           7.6
  Sequence Frequency (Jungck 1971)            4280            3522           2966
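To see how many of these measures change monotonically from strong through mixed to weak codons, the values of the table can be run through a small Python check. This is only an illustrative sketch: the `trend` helper and its definition of monotonic (ties allowed, ends must differ) are assumptions, and it says nothing about statistical significance.

```python
from collections import Counter

# (measure, strong, mixed, weak), copied from the table above.
ROWS = [
    ("Hydrophilicity, dinucleosides (Weber & Lacey 1978)", 1.686, 1.434, 1.235),
    ("Hydrophilicity, dinucleosides (Barzilay et al. 1973)", 2.72, 2.26, 2.26),
    ("Hydrophobicity, dinucleosides (Garel et al. 1973)", 2.556, 3.413, 3.982),
    ("Molec. Weight (Handbook value)", 907, 1065.6, 1217.5),
    ("Molec. Volume (Grantham 1974)", 381, 637.5, 906),
    ("Refractivity (Jones 1975)", 83.86, 140.03, 186.51),
    ("Alpha pK1 (Zimmerman et al. 1968)", 16.96, 17.11, 17.43),
    ("Bulkiness (Zimmerman et al. 1968)", 93.22, 124.345, 143.54),
    ("Specific volume (McMeekin et al. 1964)", 5.26, 5.37, 5.8),
    ("Polarity (Zimmerman et al. 1968)", 107.16, 109.58, 58.14),
    ("Polarity (Woese et al. 1967)", 61.2, 59.15, 51),
    ("Polarity (Grantham 1974)", 71.2, 67, 56.3),
    ("Hydrophobicity (Jones 1975)", 9.18, 8.385, 16.93),
    ("Hydrophobicity (Levitt 1976)", -2.2, 1.6, 8.8),
    ("Hydrophobicity (Bull & Breese 1974)", 3880, -165, -6790),
    ("Hydrophilicity, amino acids (Weber & Lacey 1978)", 7.02, 6.585, 5.59),
    ("Partition coefficient (Garel et al. 1973)", 1.88, 5.58, 7.6),
    ("Sequence Frequency (Jungck 1971)", 4280, 3522, 2966),
]


def trend(strong, mixed, weak):
    """Direction of change from strong to weak codons (ties allowed)."""
    if strong >= mixed >= weak and strong > weak:
        return "decreasing"
    if strong <= mixed <= weak and strong < weak:
        return "increasing"
    return "non-monotonic"


summary = Counter(trend(s, m, w) for _, s, m, w in ROWS)
exceptions = [name for name, s, m, w in ROWS if trend(s, m, w) == "non-monotonic"]

print(summary["increasing"], summary["decreasing"], summary["non-monotonic"])  # 9 7 2
print(exceptions)
# ['Polarity (Zimmerman et al. 1968)', 'Hydrophobicity (Jones 1975)']
```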
Evolution of the genetic code
binary doublet code: 4^1 = 4 fields (00, 01, 10, 11)
quaternary doublet code: 4^2 = 16 fields
our contemporary code is the quaternary triplet code: 4^3 = 64 fields (CGU, UAC, …)
[Figure: fields labelled 00, 01, 10, 11 and 00*, 01*, 10*, 11*]
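The field counts of the three codes can be enumerated in Python. The grouping at the end, which assigns every quaternary doublet to its binary doublet through the purine/pyrimidine rule (purine → 1, pyrimidine → 0), is an illustrative assumption about how the codes nest; the variable names are hypothetical.

```python
from collections import defaultdict
from itertools import product

binary_doublets = ["".join(p) for p in product("01", repeat=2)]
quaternary_doublets = ["".join(p) for p in product("UCAG", repeat=2)]
quaternary_triplets = ["".join(p) for p in product("UCAG", repeat=3)]

print(binary_doublets)           # ['00', '01', '10', '11']
print(len(quaternary_doublets))  # 16
print(len(quaternary_triplets))  # 64

# Group the 16 quaternary doublets by their purine/pyrimidine pattern.
by_pattern = defaultdict(list)
for doublet in quaternary_doublets:
    pattern = "".join("1" if base in "AG" else "0" for base in doublet)
    by_pattern[pattern].append(doublet)

print(by_pattern["00"])  # ['UU', 'UC', 'CU', 'CC']
```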
Evidence: Evolution of the Genetic Code
[Table: the new classification scheme of the standard genetic code, repeated from the earlier slide]
Outlook
Looking for binary patterns in genomes
Acknowledgment
Maik Friedel, Andreas Beyer, Frank Grosse
Additional information
http://www.imb-jena.de/~sweta/genetic_code/
Thank you for your attention!
The new classification scheme of the standard genetic code
[Table: the new classification scheme of the standard genetic code, repeated from the earlier slide]
T. Wilhelm, S. Nikolajewa. A new classification scheme of the genetic code. J. Mol. Evol. (2004) 59: 598-605