Tests of a stereochemical genetic code

Rob Knight, Laura Landweber† and Michael Yarus

Department of Molecular, Cellular and Developmental Biology

University of Colorado

Boulder, CO 80309-0347

† Dept. of Ecology & Evolutionary Biology

Princeton University

Princeton, NJ 08544-1003

For Translation Mechanisms (Eds. Jacques Lapointe and Lea Brakier-Gingras), Landes Bioscience


Abstract

Does the genetic code assign similar codons to similar amino acids because of chemical

interactions between them? Unlike adaptive explanations, which can only explain the

relative positions of amino acids in the code, stereochemical explanations could tie codon

assignments to absolute, verifiable rules. However, modern translation encodes amino

acid sequences without direct codon/amino acid interaction. If there is a relationship

between RNA sequences with intrinsic affinity for amino acids and the modern genetic

code, we must therefore explain a historical transition in which direct interactions were

abandoned.

We review the literature and find no evidence that interactions between short sequences

(mono-, di- or trinucleotides) and amino acids are strong or specific enough to originate

genetic coding. Instead, interactions between amino acids and longer nucleic acid

sequences appear to recapture some assignments of the modern code. For example, real

codons are concentrated in newly selected amino acid binding sites to a greater extent

than codons from similar, but randomized, codes. This implies that some initial coding

assignments were made by interaction with macromolecular RNA-like molecules, and

have survived. Thus subsequent selection, such as selection to minimize coding errors,

has not erased all primordial chemical relationships. Retention of initial stereochemical

codon assignments for three of six amino acids (arginine, isoleucine, and tyrosine, but not

glutamine, leucine or phenylalanine) is strongly supported.

Combining data for the six amino acids, significant stereochemical relationships are of

more than one type - codons and anticodons are each concentrated in some binding sites.

Further work will be required to catalog the relationships between amino acids and



binding site sequences, especially if, as now appears, more than one type of interaction

has been transmitted to the modern code.

1. The Codon Correspondence Hypothesis

The codon correspondence hypothesis, tested in any stereochemical theory of the origin

of the genetic code, may be stated:

For each amino acid there is a coding sequence for which it has the greatest association.

The association between these sequences and amino acids influenced the form and

content of the genetic code.

The codon correspondence hypothesis is compatible with establishment of the genetic

code either before or during the RNA world. A direct association between mono-, di- or

trinucleotides and their cognate amino acids would suggest that the code arose before

complex RNA catalysts, since trinucleotides would likely occur before the reproducible

synthesis of longer oligonucleotides. Alternatively, an association between trinucleotides

and their cognate amino acids that requires RNA tertiary structure would suggest that the

genetic code arose in the RNA world (the earliest evolutionary time at which long RNA-

like molecules were available). Larger RNAs loosen the constraint on the role of the

coding sequences, which could then support the amino acid binding site but need not

comprise it entirely. Amino acid/RNA complexes might have functioned in translation

from the beginning, but alternatives abound. Their original functions may have been

varied: as coenzyme sites for ribozymes 1, to stabilize RNA double helices 2, or to label

tRNA-like genomic tags 3, 4.



2. Chemical Associations: A Historical Perspective

The idea that the genetic code might be stereochemically determined predates the

elucidation of the code. Gamow’s ‘diamond code’, in which amino acids would fit

specific pockets bounded by four DNA bases, relied on direct interaction between amino

acids and nucleic acids 5. More abstruse possibilities exist: mathematical (and even

numerological) schemes for solving the coding problem abounded before the actual

codon assignments were fully uncovered (reviewed in refs 6, 7).

The structure of the code showed clear patterns. Chemical explanations for such order

were sought by two routes. Physicochemical theorists 8, 9 hoped to measure interaction

between bases and amino acids. This might have resulted in chromatographic co-

partitioning on the early earth, which would be reproducible today by chemical

techniques. In contrast, stereochemical theorists 10, 11 assumed that molecular modeling

could reveal molecular complementarities between amino acids and coding triplets.

Stereochemistry/Molecular Models: The first chemical investigations of codon

assignments were via molecular modeling. Molecular models have been said to prove

that the genetic code was established in quite varied ways. For example, amino acids

might pair with codons 12 or anticodons 10, 13 in the tRNA. Codonic mononucleotides and

α-helical homopolymeric amino acids may bind each other specifically (this model

“correctly predicts the glycine codon GGG”, although it unfortunately fails to predict any

other)14. Free glycine and free nucleotides 15 may have affinity, or free amino acids may

intercalate into adjacent bases in the anticodon doublet through H-bonding between

methylene groups and the π-electrons of the bases 16. Specific 2’ aminoacylation of the



second position anticodon base may have been mediated by the first position anticodon

base 17. Amino acids may be able to intercalate between first and second position bases in

double-stranded RNA molecules 18. Cavities caused by removal of the second-position

codon bases in B-DNA may accept amino acids 19. Perhaps amino acids nestle into a

pentanucleotide cup with the anticodon in the center 20. Pairing between amino acid side-

chains and cavities in a complex of four nucleotides (C4N) on the acceptor stem of tRNA

21 might occur. Or perhaps amino acids can bind their codons transposed 3’ → 5’ 22, 23. A

double-stranded complex of the codon and anticodon has also been suggested 18, 24.

The modeling approach was tarnished early on when a claimed association between

codons and amino acids 12 relied on models that had been built backwards, 3’ to 5’ 25.

Nevertheless, even the idea that there is a relationship between reversed codons and

amino acids has been defended 22, 23.

Clearly, modeling methods used thus far are not sufficiently constrained. As a result, they

allow too many solutions. Additionally, these approaches tend to assume that the entire

code was uniquely determined by stereochemical fit (and even that modern variant codes

reflect fits induced by different environments 26). If amino acids were added to the code

over time and for different reasons, as seems probable 27, 28, such explanations are

overstatements that may prevent confirmation even if the basic hypothesis is true.

Physicochemical Effects/Chromatography: A second line of evidence comes from

chromatography. Because chromatographic properties of amino acids show regular

variation in the genetic code, any mechanism for the code’s origin must account for this

organization. Various studies have shown that the code conserves certain properties, such



as polarity. The polar requirement of amino acids (the ratio of the log relative mobility to

the log mole fraction water in a water-pyridine mixture) orders coding assignments

impressively. Amino acids with U in the second position of their codon are hydrophobic

while those with A are hydrophilic; those with C are intermediate, and those with G are

mixed. Furthermore, codons that share a doublet have almost identical polar requirements

even if not otherwise related (e.g. His and Gln; possibly Cys and Trp) 6, 8, 9, 29. Thus the

code is ordered with respect to amino acid properties, but such evidence cannot tell us

whether the code was optimized to minimize errors due to mutation or established by

direct chemical interactions 28.

Nor does such chemical order suggest a mechanism for actual codon assignments.

Partitioning of amino acids and nucleotides between aqueous and organic phases, as in a

primordial oil slick, might have associated AAA codons with Lys and UUU codons with

Phe 30. However, none of these molecules are produced in prebiotic syntheses 31 and a

further hypothesis is required to bring chromatographic partitioning to bear on codon

assignment. Analysis with two further chromatographic systems, water/micellar sodium

dodecanoate and hexane/ dodecylammonium propionate-trapped water, confirmed the

previous hydrophobicity scales in a context closer to prebiotic conditions 32. The relative

hydrophobicity of the homocodonic amino acids (Phe UUU, Pro CCC, Lys AAA, Gly

GGG) and the four nucleotides in an ammonium acetate/ammonium sulfate system

showed an anticodonic association, and for dinucleoside monophosphates the association

was also with the anticodon, rather than the codon, doublets 33. Multivariate analysis of

the properties of dinucleoside monophosphates and amino acids, focusing on



hydrophobicity, revealed many strong (p < 0.001) correlations between anticodons and

amino acids, but not between codons and amino acids 34.

Thus, chromatographic data suggest anticodonic, rather than codonic interactions (note

the underlying assumption that molecules with similar properties interact). However,

although chemical partitioning on the early earth could conceivably have led to specific

cofractionation between particular nucleotides (or oligonucleotides) and prebiotic amino

acids, there do not seem to be consistent correlations. Chromatographic separation on

various plausibly prebiotic surfaces (silicates, clays, hydroxyapatite, calcium carbonate,

etc.) showed that, on a silica surface under an aqueous solution of MgCl2 and

(NH4)H2PO4, Ala comigrates with CMP and Gly comigrates with GMP 35. Ala is assigned

the GCN codon class, while Gly has the GGN codon class. However, there was no strong

separation between GMP and UMP or between AMP and CMP even on silica, and many

prebiotic amino acids (Pro, Ile, Leu, Val) fell well outside the range of the nucleotides.

The situation was even worse on other surfaces, which did not provide any amino acid-

nucleotide concordances. Thus, the data do not support the conclusion that copartitioning

of nucleotides and amino acids led to the genetic code 35, especially in the absence of a

plausible mechanism for transforming a copartition into modern codon assignments.

Physicochemical Effects/Direct Interactions: The third type of evidence comes from

tests for direct interaction between nucleotides and amino acids. Mononucleotides show

nonspecific but charge-dependent interactions with polyamino acid chains, as measured

by the change in turbidity of the cosolution 14. Affinity chromatography, which tested

retardation of the four nucleotide monophosphates by each of nine amino acids (Gly, Lys,

Pro, Met, Arg, His, Phe, Trp, Tyr) immobilized by their carboxyl groups, showed no


association between binding strength and codon or anticodon assignments 36. Interactions

between free amino acids and poly(A), as measured by the chemical shift of the C2 and C8

protons of A, are also “not easily reconcilable with the genetic code” 37. Further affinity

chromatography and NMR experiments on the interaction between amino acids and

mono-, di-, and trinucleotides showed that amino acids did selectively interact with

specific bases 38, although the interactions did not parallel the genetic code. Imidazole-

activated amino acids esterify the 2’-OH groups of RNA homopolymers with some

specificity 39. However, since the two amino acids tested, phenylalanine and glycine,

much preferred poly(U) over any other polynucleotide, the results do not support the

authors’ contention that this mechanism led to the present codon assignments.

The dissociation constants of AMP complexes with the methyl esters of amino acids also

show selectivity, ranging about seven-fold from Trp (120 mM) to Ser (850 mM) 40.

However, neither Trp (UGG) nor Ser (UCN, AGY) has particularly many or few A

residues in their codons or anticodons, while the amino acids that do (Lys AAR, Phe

UUY) have intermediate dissociation constants (320 and 196 mM respectively). These

data did show a strong negative correlation between the association constant (1/KD) and

amino acid hydrophobicity. There are positive correlations between the dissociation

constant and the number of codons assigned to the amino acid, and to frequency of the

amino acid in proteins 40. Condensation of dipeptides of the form Gly-X in the presence

of AMP, CMP, poly(A) and poly(U) was mainly enhanced by the anticodonic

nucleotides, where a pattern was apparent 41. Different amino acids differ in their ability

to stabilize poly(A)-poly(U) and poly(I)-poly(C) double helices 2, although the order is

similar in each case and so cannot have contributed to the establishment of the genetic



code. Finally, D-ribose adenosine biases esters with L-Phe but not D-Phe towards the 3’-

OH (the pattern is reversed with L-ribose adenosine). Thus, single nucleotides

aminoacylate themselves with moderate regio- and stereoselectivity 42.

Recent evidence also suggests that self-assembly of purine monolayers differentially

affects adsorption of amino acids. The spacing between residues is consistent with

peptide bond distances: such self-assembly might have formed a primordial code,

although apparently one very different from the modern genetic code 43-45.

Summary: Two comprehensive reviews of these and other data 46, 47 suggested that if the

genetic code were established by interactions between simple molecules (not more

complicated than dipeptides or trinucleotides) and amino acids, then the greatest specific

interaction was between amino acids and their anticodon nucleotides. However,

individual experiments were equivocal or correlated with both anticodons and

occasionally codons, so no strong direction is evident in the data.

The absence of obvious, strong or reproducible correlations from these highly varied

approaches, considered alone or especially in sum, weakens the hypothesis that the code

rests on the chemistry of trinucleotide-amino acid interactions. We suggest instead a later

origin for the code, involving larger RNAs.

3. Adaptors and Adaptation

Perhaps the simplest explanation for the observed order in the genetic code 11, 48-50 is that

codon assignments were determined by stereochemical association between

oligonucleotides and amino acids 8-10, 12. This mechanism would assign similar amino



acids to similar codons because of intrinsic affinity, rather than as a result of natural

selection among alternative codes. Although the resulting codon assignments might

appear adaptive, in that they reduce various errors relative to other possible codes, they

would not be an adaptation.

Stereochemical pairing: Several such stereochemical schemes are conceivable. Thus,

the primordial sequences with which pairing occurred can either be the actual codons, or

some simple transform thereof 9. As detailed in Stereochemistry/Molecular Models

above, interactions have been proposed between amino acids and codons 12, anticodons 10,

13, codons read 3’ → 5’ instead of 5’ → 3’ 22, 23, a complex of four nucleotides (C4N)

formed by the three 5’ nucleotides of tRNA with the fourth nucleotide from the 3’ end 21,

and a double-stranded complex of the codon and anticodon 18, 24.

A fundamental problem that all stereochemical models share is that codons and amino

acids are never stereochemically linked in modern translation. Thus an implied

evolutionary shift has occurred in which direct associations were lost, but their logic was

nevertheless transmitted to the present. Such a conservative transition, required to make a

stereochemical origin observable, is supported by a strong argument from continuity. The

shift to indirect associations must occur in a translation apparatus that is making useful

peptides (otherwise the translation apparatus itself could not have been selected). Thus

the logic of the older direct interactions must be preserved or the altered translation

apparatus will be of no use. After consideration of the evidence, we discuss this transition

to indirect coding again.



The existence of adaptors, tRNAs and aminoacyl-tRNA synthetases, in the modern

system allows codon assignments to be readily shuffled among amino acids 51.

Accordingly, adaptive evolution can erase primordial codon assignments. Thus we would

only expect some amino acids to show codon/site associations, especially if others were

added to the code later. Consequently, it is remarkable that any associations persist to the

present 52.

Amino Acid-Binding RNA: Most attention to sequence/binding site associations

initially focused on arginine, since arginine binds specifically to two completely distinct

classes of natural RNA molecules. The first class is the guanosine-binding site of self-

splicing group I introns, which binds arginine as a competitive inhibitor. The

guanidinium side-chain of arginine is similar in structure to the Watson-Crick face of G

53. A conserved Arg codon confers this activity, and the binding site is almost invariably

composed of several Arg codons in close juxtaposition 54, 55. The second class has been

extensively studied because of potential medical importance: free arginine can mimic the

natural interaction of HIV Tat peptides with TAR RNA 56. In this case, however, no Arg

codons are conserved at the binding site 57.

Natural amino acid-binding RNAs are few; more significantly, they can provide only

anecdotal evidence for codon/binding site interactions because they are almost certainly

under strong selection for properties other than binding to the free amino acids. However,

SELEX or selection-amplification, a technique for directed molecular evolution 58-60,

makes it possible to select those RNA molecules that perform a desired catalytic or

binding function from large random pools (see ref. 61 for review). This technological



advance makes it possible to find out whether RNA molecules that bind to particular

amino acids share any characteristic motifs at their binding sites.

Aptamers have now been isolated for a variety of amino acids (Table 7.1), including

hydrophobic amino acids such as valine 62, phenylalanine/tyrosine 63, isoleucine 64,

tyrosine 65, leucine (I. Majerfeld and M. Yarus, unpublished data), and phenylalanine 65a,

and hydrophilic amino acids such as glutamine (G. Tocchini-Valentini, unpublished data)

and citrulline, which is not normally found in proteins 66. However, RNA aptamers for

arginine are most abundant in the literature, and have been independently isolated in

several different experiments 66-73. Since structural information is available for many of

these sequences, it becomes possible to ask whether particular sequences are

overrepresented at recently selected binding sites, and, if so, whether these sequences

have any relationship to the modern genetic code.

4. Statistical Evidence for Triplet/Binding Site Associations

The theory that the code arose by stereochemical means is both specific and unique; its

predictions are explicit and different from other prevalent theories. Coevolution theories

(that coding was extended along biosynthetic pathways 74) are typically agnostic about

which trinucleotide-amino acid pairing established the initial codon assignments, but

predict that such pairings, if they exist at all, can account for only a small part of the

codon catalog. Optimization theories (that coding minimizes errors in expression 75)

predict no correspondence at all between trinucleotides and amino acid binding sites.

Evolution of Binding Triplets: Assuming that original amino acid binding sites were

RNA-like, they could have evolved into any of the components of modern translation:



tRNA, rRNA, mRNA, or primitive aminoacyl-tRNA synthetases (subsequently replaced

by protein enzymes). Depending on which modern translation component descended

from ancient amino acid interactions, we predict different associations between coding

nucleotides and amino acids. If binding sites evolved into tRNAs, for instance, the

anticodons should be overrepresented in amino acid binding sites, whereas if they

evolved into mRNA the codons should be overrepresented 76.
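The competing predictions can be made concrete with a minimal Python sketch that derives the anticodon and the reversed (3’ → 5’) reading of a codon, then tallies occurrences of a triplet set inside and outside a binding site. The function names, the example sequence, the site positions and the all-frames counting rule are illustrative assumptions, not the procedure used to build the tables in this chapter; wobble pairing and modified bases are ignored.

```python
# Hypothetical helpers for illustration only.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def anticodon(codon):
    """Reverse complement of a codon, written 5'->3' (no wobble rules)."""
    return codon.translate(COMPLEMENT)[::-1]

def reversed_codon(codon):
    """The codon read 3'->5' instead of 5'->3'."""
    return codon[::-1]

def count_triplets(seq, site_positions, triplets):
    """Count triplets from `triplets` in every frame of `seq`.

    A triplet is scored as 'in site' if any of its three bases lies at a
    0-based position listed in `site_positions` (an assumed rule).
    Returns (in_site, out_of_site).
    """
    in_site = out_site = 0
    for i in range(len(seq) - 2):
        if seq[i:i + 3] in triplets:
            if any(p in site_positions for p in range(i, i + 3)):
                in_site += 1
            else:
                out_site += 1
    return in_site, out_site

# The six arginine codons (CGN plus AGR) and their anticodons.
ARG_CODONS = {"CGU", "CGC", "CGA", "CGG", "AGA", "AGG"}
ARG_ANTICODONS = {anticodon(c) for c in ARG_CODONS}

# Made-up sequence with a made-up binding site at position 3.
example = count_triplets("AAAGGCGAUU", {3}, ARG_CODONS)
```

Under the tRNA scenario one would tally `ARG_ANTICODONS`, under the mRNA scenario `ARG_CODONS`; the same counter applied to `reversed_codon` sets gives the 3’ → 5’ control.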

The selection of RNA molecules (aptamers) that bind amino acid ligands has made such

conjectures testable (Table 7.1). Because in vitro selection searches a large space of

possible sequences for optimal or near-optimal “solutions” to particular binding

problems, such directed evolution might be able to recapitulate primordial interactions

between amino acids and short RNA sequences. If amino acids interact favorably with

coding RNA sequences, this relation might be observed, or even proven. Since aptamers

can be selected for each amino acid, and since the specific nucleotides important to

binding can be determined, standard statistical tests for association (such as the χ² or G

tests) will reveal any consistent relation between binding-site nucleotides and nucleotides

in coding sequences 77.
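As an illustration of the kind of test meant here, the sketch below computes a G test of independence on a 2x2 table (triplet belongs to the cognate codon set or not, versus triplet lies in the binding site or not). The table layout, the function name and the counts are assumptions chosen for the example; they are not data from this chapter. The P value uses the closed form for a chi-square distribution with one degree of freedom.

```python
import math

def g_test_2x2(a, b, c, d):
    """G test of independence for the table [[a, b], [c, d]].

    Assumed layout: rows = cognate triplet / other triplet,
    columns = inside binding site / outside binding site.
    Returns (G, P) with one degree of freedom.
    """
    observed = [[a, b], [c, d]]
    total = a + b + c + d
    rows = [a + b, c + d]
    cols = [a + c, b + d]
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / total
                g += o * math.log(o / expected)
    g = max(2.0 * g, 0.0)  # guard against tiny negative rounding error
    # For 1 d.f., P(chi-square > g) = erfc(sqrt(g / 2)).
    return g, math.erfc(math.sqrt(g / 2.0))

# Made-up counts: cognate triplets enriched inside the site.
g_stat, p_value = g_test_2x2(30, 10, 20, 40)
```

A table with no enrichment, such as `g_test_2x2(10, 10, 10, 10)`, gives G = 0 and P = 1, which is a quick sanity check on the implementation.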

Such a search for motifs faces predictable difficulties. RNA is more versatile than might

have once been thought, and many oligomers often bind an amino acid. The diversity of

RNAs that bind arginine, for example, shows that efforts to emulate a unique primordial

RNA for each amino acid would be futile 57. Recurrence of specific sequence motifs in

amino acid aptamers, such as codons or anticodons, cannot prove that similar interactions

led to the establishment of present codon assignments. However, suppose that coding

sequences embody such general interactions that they will still be detectable in the most


probable modern binding sites. Proof of any specific pairings at all would show that the

specificity existed to originate a genetic code. If specific pairings detected with in vitro

selection actually match present codon assignments, then similar processes in ancient

translation are supported. If there are frequent, strong associations between present

codons or anticodons and amino acids, their involvement in the origin of the code is the

only plausible explanation.

Binding Site Preferences: That any codon/binding site associations could survive to the

present has been questioned 78. However, the association between arginine and its binding

sites is exceptionally strong, and has proven remarkably robust to statistical

methodology, choice of binding sites, and choice of sequences from selected pools 52, 76-78.

In particular, arginine binding sites show strong associations with arginine codons (Table

7.2), but not anticodons (Table 7.3), codon or anticodon sets for other amino acids, other

groups of 4+2 codons incorporating a family box plus a doublet, or other short motifs.

This relationship remains highly significant even with many plausible modifications.

Sequences where the selected binding site overlaps the constant regions can be excluded,

the data can be corrected for nucleotide bias at binding sites, and alternative sequences

can be chosen from reported pools without altering the conclusion.

Arginine might have been unique: it acts as a nucleotide mimic 53, perhaps more so than

other amino acids. However, significant associations between Tyr aptamer binding sites

and codons have been reported 52, and Ile aptamers contain conserved Ile codons at their

active sites 64. Data from several other amino acids have become available, allowing a

more general test of the association between binding sites and codons. We



now extend the analysis to all available amino acids (Table 7.1) and reassess hypotheses

about specific associations.

Testing Triplet/Site Associations: Codons occur more often in binding sites than

expected for each of the six amino acids for which data are available, an improbable

outcome itself (P = (0.5)^6 = 0.016). Individually, the arginine aptamers showed a

significant association with codons only. Tyrosine and isoleucine aptamers showed

significant associations with both codons and anticodons: except for the association

between tyrosine and its codons, these relationships persist even when corrected for six

multiple comparisons (P < 0.01). Glutamine, leucine and phenylalanine have no

significant tendency to locate codons or anticodons in their binding sites (when corrected

for multiple comparisons). The most sensitive tests combine all data; then we observe

highly significant associations overall with both codons and anticodons, even when the

single most influential amino acid is excluded from the analysis (P < 10^-6 in all cases).

Thus there is reason to believe that codons and anticodons are associated with binding

sites, and this conclusion does not depend on any one selection or set of binding sites.
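The sign-test arithmetic in this paragraph is easy to reproduce. The sketch below assumes that, with no real association, each amino acid is equally likely to show an excess or a deficit of codons in its binding sites. The correction for six comparisons is shown as a Bonferroni adjustment, which is an assumption: the text does not name the correction that was applied.

```python
def prob_all_same_direction(n, p_each=0.5):
    """Chance that n independent amino acids all show an excess."""
    return p_each ** n

def bonferroni(p, m):
    """Bonferroni-adjusted P value for m comparisons (assumed method)."""
    return min(1.0, p * m)

p_six = prob_all_same_direction(6)      # the (0.5)^6 quoted in the text
adjusted = bonferroni(0.001, 6)         # hypothetical raw P of 0.001
```

Here `p_six` evaluates to 0.015625, which rounds to the 0.016 quoted in the text.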

On the other hand, controls show that this method can rule out certain possibilities. There

was no significant association for any amino acid, or for the set as a whole, with the

codons reversed 3’ to 5’, indicating that this hypothesis can be clearly rejected.

It is possible that the 21 codon (or anticodon) sets are an unfair comparison class, since

they range in size from 1 to 6 codons. A less precise, but perhaps more robust, test is to

see whether there is a significant association between the amino acid binding sites and the

codon (or anticodon) that contains the cognate doublet: this reflects the intuitively



plausible idea that the primitive code may have assigned amino acids only to family

boxes. However, doublet analysis (Table 7.4) does not greatly change the outcome.

Significant associations are observed for both doublets and codons/anticodons. Thus,

again, the results to date suggest associations with both codons and anticodons.

We can carry these conclusions a step further by freeing them of the assumptions

required even for standard statistical tests. If there is an association between the triplets

found at amino acid binding sites and the modern genetic code, it should be found only

with the actual genetic code and not with randomized versions of it. Accordingly, we

generated many alternative codes, and tested for codon/binding site associations. This

preserves important aspects of the experimental results, such as the spatial correlations

within binding sites (they occur in specific sections of the molecule), and the influence of

the occurrence of each triplet on the probability of finding others. In order to eliminate

dependence on any particular method for generating variant codes, we used several quite

different permutation methods.

An ISO C program randomized the code according to the following schemes:

1. Codon permutation: a codon can randomly and independently take on any

identity (including its real one). This keeps the number of codons per amino

acid constant, but usually completely disrupts the fine structure of the code

(such as wobble relations). This potentially generates 64! = 1.2 x 10^89 possible

codes.

2. Amino acid permutation: any amino acid can randomly and independently

take any existing coding block(s), including those of stop codons. This



preserves the structure of the code entirely (the number and size of blocks for

codons are preserved, and their relative positions are preserved within the

coding table), but amino acids can be given different numbers of codons. At

one extreme, Arg, which normally has 6 codons split into a 4-block and a 2-

block, might end up with Trp’s single codon. This potentially generates 21! =

5.1 x 10^19 possible codes.

3. Codon block permutation. Keeping the structure of the code constant, we

randomly assorted amino acid identities among groups of codons of the same

size. For example, the CGN block assigned to Arg might be swapped with the

CCN block normally assigned to Pro, but could not swap with the single UGG

codon assigned to Trp. Treating the three Ile codons as a 2-block and a 1-

block, this leads to 8!x14!x4! = 8.4 x 10^16 codes with 8 4-blocks, 14 2-blocks,

and 4 1-blocks. This “n-block” scheme completely preserves the degeneracy

of the code, and also conserves the number of codons assigned to each amino

acid. Compared to the other randomization schemes, amino acids are far more

likely to retain some of their actual codons.

4. Base identity permutation: in addition to the block permutation of method 3,

this method randomizes the meaning of the first and second position bases.

This partially disrupts the code’s structure (so that, for example, the UGN

codon block need not be split into blocks of 2, 1, and 1), but preserves the

degeneracy across a row and down a column. This multiplies the number of

codes from method 3 by a factor of (4!x4!)/2 for a total of 2.4 x 10^19 codes,

and dramatically reduces retention of fragments of the present code.


5. Codon doublet permutation: like method 4, except that any codon doublet

independently takes on the meaning of any other codon doublet. This leads to

16!/(8!x6!x2!) = 360360 times as many codes as method 3, for a total of 3.0 x

10^22 possible codes. Both this and method 4 preserve the number of codons

assigned to each amino acid and their block structure (e.g. Arg will always

have a 4-block and a 2-block), but this method does not preserve the relation

between blocks of particular sizes as does method 4.
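Two of the schemes above, and the counts quoted for them, can be sketched briefly in Python. This is not the ISO C program used for the analysis; the function names and the dictionary representation of the code are assumptions made for illustration. Only schemes 1 and 2 are implemented; the counts for schemes 3 to 5 are recomputed from the factorial expressions given in the list.

```python
import math
import random
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order (UUU, UUC, UUA, UUG, UCU, ...); '*' = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def codon_permutation(code, rng):
    """Scheme 1: shuffle the 64 meanings among the 64 codons.

    Each amino acid keeps its number of codons, but block structure is lost.
    """
    codons = list(code)
    meanings = [code[c] for c in codons]
    rng.shuffle(meanings)
    return dict(zip(codons, meanings))

def amino_acid_permutation(code, rng):
    """Scheme 2: permute the 21 labels (20 amino acids plus stop).

    Block structure is kept, but an amino acid may inherit a block set of
    a different size.
    """
    labels = sorted(set(code.values()))
    shuffled = labels[:]
    rng.shuffle(shuffled)
    relabel = dict(zip(labels, shuffled))
    return {codon: relabel[aa] for codon, aa in code.items()}

# Counts of possible codes, following the expressions in the list above.
f = math.factorial
n_codon = f(64)                                  # scheme 1
n_amino = f(21)                                  # scheme 2
n_block = f(8) * f(14) * f(4)                    # scheme 3
n_base = n_block * (f(4) * f(4) // 2)            # scheme 4
n_doublet = n_block * (f(16) // (f(8) * f(6) * f(2)))  # scheme 5

rng = random.Random(0)  # fixed seed so the example is repeatable
variant = amino_acid_permutation(STANDARD, rng)
```

Passing an explicit `random.Random` instance keeps each randomized code reproducible, which matters when millions of variants are compared against one fixed set of binding sites.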

We generated 10 million randomized codes for each of the 5 schemes listed above, and

compared codon/site associations in observed amino acid binding sites with those found

in the actual code (Fig. 7.1). The “n-block” model (#3) is uniquely right-skewed, because

some of the codons can only swap with a few partners under this model (e.g. there are

only 4 blocks containing one codon) so that some of the present structure of the code will

often be preserved. Even under this highly constrained model, however, only 0.8% of

randomized codes give apparent associations between codons and binding sites better

than the actual code. For the other, more completely scrambled models, between 0.11%

(method 2) and 0.04% (methods 4 and 5) of all random codes do better than the actual

code. Said another way, real codons are more associated with real binding sites than in

99.2 to 99.96% of all randomized codes, even though randomized codes include

fragments of the actual code. Using Fisher’s method for independent probabilities rather

than performing a G test on the summed counts gave similar results (data not shown).

Thus, our result is general and is not sensitive to the choice of alternative codes or to

statistical methodology. It is highly unlikely that we would see as significant an

association between codons and binding sites for a genetic code picked at random as that



actually seen with the real code. Randomization of anticodon assignments gives similar

results, but slightly less significant than for codons. Randomized anticodons are less

associated with binding sites than real ones in 99.2 to 99.5% of all codes. This small

difference in significance appears also in the statistical tests (Table 7.3).
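The two summary statistics used in this comparison can be sketched as follows. The first function gives the fraction of randomized codes that score better than the actual code; the second applies Fisher's method for combining independent probabilities. The scores and P values in the example are made up, and how an association score is computed for each code is left open here as an assumption.

```python
import math

def fraction_better(actual_score, randomized_scores):
    """Fraction of randomized codes with a strictly higher score."""
    better = sum(1 for s in randomized_scores if s > actual_score)
    return better / len(randomized_scores)

def fisher_combined(p_values):
    """Fisher's method: X = -2 * sum(ln p), chi-square with 2k d.f.

    Every p must be greater than zero. Returns (X, combined P). For even
    degrees of freedom the chi-square tail has the closed form
    exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(p_values)
    x = -2.0 * sum(math.log(p) for p in p_values)
    half = x / 2.0
    term = total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return x, math.exp(-half) * total

# Hypothetical inputs, for illustration only.
frac = fraction_better(3.0, [1.0, 2.0, 3.0, 4.0])
x_stat, p_combined = fisher_combined([0.01, 0.01])
```

A useful check is that a single P value passes through unchanged: `fisher_combined([0.5])` returns a combined P of 0.5.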

These controls argue strongly that the most probable modern RNA-amino acid binding

sites capture something of the essential nature of the code. In particular, a stereochemical

process involving macromolecular RNA-like binding sites containing codons, and

perhaps anticodons, gave rise to the present genetic code. Considering individual amino

acids, primordial RNA-like binding sites were probably relevant to the assignment of

codons for at least three of six amino acids for which we have data.

5. Concluding Remarks

We now return to the direct to indirect coding transition implied by every stereochemical

model. RNA amino acid binding sites contain sequences likely to be relevant to the

appearance of the code. Thus the logically predicted transition from direct to indirect

coding rests first on the ability of coding sequences to serve as structural elements in

amino acid binding sites, and then to serve in normal base pairing. Triplets

that became codons might begin as essential elements in binding sites (direct coding),

and later pair with primordial tRNAs (indirect coding). Triplets that became anticodons

might begin within binding sites (direct coding), then employ their better-known base-

pairing activity when they begin to act as anticodons (indirect coding). The conservative

logic of the direct to indirect transition, required by argument from continuity, is implicit



as soon as it is known that nucleotide triplets can be essential elements of amino acid

binding sites (compare the DRT theory 57).

Descendants of the original amino acid-binding sites could play four possible roles: as

tRNAs, mRNAs, ribosomes, or aminoacyl-tRNA synthetases. All these activities are

known to be possible for RNA 79-85, because parallels exist among modern selected

RNAs. With present data, it appears that arginine may have been bound in primordial

sites containing sequences that became codons in mRNA. We found no strong evidence

for association between glutamine, leucine and phenylalanine and their coding sequences.

These negative results rest on limited data; alternatively, these codons may have been

assigned by other means during later code evolution. Tyrosine and isoleucine present a

case we had not anticipated, in which both codons and anticodons are overrepresented

(though not because they are paired in the molecules). We cannot confidently specify the

descent of the coding sequences for these amino acids. Their binding sequences could

have become both tRNA-like and mRNA-like molecules, or these data may be the first

indication of the need for a new, more comprehensive theory.

Ideally, with a large sample of independently derived families of aptamers that bind each

of the amino acids, it should be possible to test associations between binding sites and

individual trinucleotides. If there are, as now appears, several classes of amino acids

with different relations to coding sequences, such high resolution may be required. It is

possible that high-throughput techniques for aptamer isolation will achieve this in the

future, but, for the moment, isolating aptamers and determining binding sites is a time-

consuming process. Consequently, it may be several years before site/triplet associations

are maximally resolved.


However, it is clearly not true that each aptamer binds its target amino acid using only the

cognate codons. Amino acid binding sites always require other nucleotides for their

construction. Where structures are known, the coding sequences can be in contact with

the amino acid or providing less central support for the site - in some cases they are in

both places 52. The fact that binding sites with detectable affinities are far more complex than

single trinucleotides suggests that the code probably began in an RNA world,

after complex RNA molecules were prevalent. Assuming that the RNA world biota were

our immediate antecedents, translation was also probably devised in the RNA world 89.

An economical interpretation is therefore that coding assignments arose predominantly

during initial selection for templated peptide synthesis, rather than via other activities.

These techniques have substantial potential for further analysis. It may be possible to

discover why some amino acids have the actual codon assignments they do, and perhaps

why some amino acids were incorporated into the code while others available on the

early earth or as metabolic intermediates were excluded. Furthermore, with complete data

in hand it may be possible to define a minimal, stereochemically determined code, and

therefore to estimate the relative roles of chemistry and selection in shaping modern

codon assignments.



| Amino Acid | Kd | Comments | Reference |
|---|---|---|---|
| Arg | 400 µM | Group I intron: naturally binds G | 86 |
| Arg | 4 mM | TAR: naturally binds Tat peptide in HIV | 56 |
| Arg | 1 mM | 3 families selected; no structures available | 67 |
| Arg | 4 mM | Selected against GMP binding | 68 |
| Arg | 2-4 mM | Selected by salt elution to mimic TAR | 70 |
| Arg | 60 µM | Derived from citrulline binder by mutagenesis/reselection; NMR structure available | 66 |
| Arg | 330 nM | Intensive selection with heat-denaturation; only one sequence structurally characterized, though many selected | 72 |
| Val | 12 mM | No structural data | 62 |
| Ile | 200-500 µM | Only one family survived selection | 64 |
| Phe/Tyr | 2-25 mM | No structural data | 63 |
| Trp | 18 µM | Binds D-Trp-agarose, not free L-Trp; no structural data | 87 |
| Tyr | 35 µM | Also binds Trp; evolved from L-DOPA binder | 65 |
| Phe | <1 mM | Some clones bind only Phe-agarose | 65a |
| Leu | ~1 mM | | Majerfeld & Yarus, unpublished data |
| Gln | 18-20 mM | | Mannironi et al., unpublished data |

Table 7.1: Natural and Artificial Amino Acid-Binding RNA. Entries in bold are those

with sufficient structural information to define binding site nucleotides, used to test for

statistical association between binding sites and triplet motifs. Natural RNA sequences

that bind arginine were excluded from the analysis, because they are probably under

selection for other properties.



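| Codons | Arg | Tyr | Ile | Gln | Phe | Leu |
|---|---|---|---|---|---|---|
| Ter | 0.05 | 1.28 | -5.02 | -4.19 | 15.86 | 2.65 |
| Ala | 0.09 | -16.95 | -11.97 | -11.57 | -18.51 | -0.38 |
| Cys | -16.97 | -0.66 | -8.42 | -3.32 | -4.79 | 0.04 |
| Asp | 0.15 | 3.96 | -3.45 | 2.89 | -1.82 | -1.08 |
| Glu | 3.44 | -3.17 | -1.79 | -0.01 | 6.81 | 1.47 |
| Phe | -3.38 | -2.38 | -8.42 | 1.26 | 3.73 | -2.00 |
| Gly | 0.35 | 0.25 | 31.57 | 8.94 | 2.25 | 0.00 |
| His | -1.04 | -1.87 | -6.14 | -0.02 | -3.69 | -0.68 |
| Ile | 2.86 | 9.18 | 10.35 | 3.43 | 0.01 | -4.60 |
| Lys | 1.34 | -14.86 | 0.00 | 1.74 | 1.39 | 0.62 |
| Leu | -19.92 | -4.16 | 8.14 | -10.60 | -7.57 | 0.83 |
| Met | -5.60 | 3.06 | 0.00 | -0.02 | -0.15 | -1.35 |
| Asn | 5.46 | -0.04 | -1.79 | 3.25 | 1.04 | 0.01 |
| Pro | 0.00 | -2.30 | -11.17 | -9.55 | -8.26 | -0.15 |
| Gln | 0.27 | 2.30 | -2.85 | 2.98 | 2.00 | 0.62 |
| Arg | 29.11 | 0.24 | -25.10 | 1.66 | 0.17 | -0.78 |
| Ser | -6.07 | -4.95 | -15.73 | -7.54 | -11.32 | 5.65 |
| Thr | -0.10 | 0.57 | -16.54 | 1.94 | -7.32 | 2.61 |
| Val | -0.13 | 4.45 | -0.04 | -0.38 | 5.53 | 2.82 |
| Trp | -7.26 | 0.04 | 42.58 | -1.14 | -2.52 | 0.28 |
| Tyr | -3.38 | 6.69 | 10.90 | -0.33 | 0.03 | -0.12 |
| Rank | 1 | 2 | 4 | 4 | 4 | 6 |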

Table 7.2: Tests for association between amino acid binding sites and their cognate

codons. Rows: codon sets for each amino acid; columns: amino acids for which aptamers

with known structures have been reported. Bold values indicate the cognate codon sets

for each amino acid aptamer; values in italics indicate codon sets with at least as strong

an association as the actual codon set. Tabulated numbers are G values for association

between codons and binding sites, with the Williams correction 88; negative values

indicate codon sets that are found less frequently at binding sites than would be expected

by chance. ‘Rank’ indicates the rank order of the cognate amino acid’s codon set.

Binding sites for this table and all others are taken from ref. 52 where applicable (Arg, Ile,

Tyr), or otherwise from personal communications from the specific aptamer laboratories.

See ref. 76 for discussion of the effects of different choices of binding site.
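
To make the ‘Rank’ row concrete, the following minimal sketch recomputes the rank for one column of Table 7.2. The helper rank_of_cognate is hypothetical, and it assumes that rank means one plus the number of codon sets with a strictly larger G value than the cognate set.

```python
# G values from the Tyr column of Table 7.2, keyed by codon set.
TYR_COLUMN = {
    "Ter": 1.28, "Ala": -16.95, "Cys": -0.66, "Asp": 3.96, "Glu": -3.17,
    "Phe": -2.38, "Gly": 0.25, "His": -1.87, "Ile": 9.18, "Lys": -14.86,
    "Leu": -4.16, "Met": 3.06, "Asn": -0.04, "Pro": -2.30, "Gln": 2.30,
    "Arg": 0.24, "Ser": -4.95, "Thr": 0.57, "Val": 4.45, "Trp": 0.04,
    "Tyr": 6.69,
}

def rank_of_cognate(column, cognate):
    """1 + number of codon sets with a strictly larger G than the cognate set."""
    target = column[cognate]
    return 1 + sum(g > target for g in column.values())

print(rank_of_cognate(TYR_COLUMN, "Tyr"))
```

For the Tyr column only the Ile codon set (9.18) exceeds the cognate value (6.69), so the sketch gives rank 2, in agreement with the table.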



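| Codons | n | +b+c | +b-c | -b+c | -b-c | G | P |
|---|---|---|---|---|---|---|---|
| Arg | 5 | 36 | 16 | 38 | 106 | 29.1 | 3.4E-08 |
| Tyr | 3 | 12 | 71 | 9 | 179 | 6.7 | 4.8E-03 |
| Ile | 5 | 15 | 25 | 30 | 181 | 10.4 | 6.5E-04 |
| Gln | 3 | 6 | 36 | 6 | 108 | 3.0 | 4.2E-02 |
| Leu | 2 | 16 | 46 | 19 | 78 | 0.8 | 1.8E-01 |
| Phe | 8 | 11 | 74 | 35 | 504 | 3.7 | 2.7E-02 |
| Total | 26 | 96 | 268 | 137 | 1156 | 51.6 | 3.5E-13 |
| Total - Arg | 21 | 60 | 252 | 99 | 1050 | 25.1 | 2.7E-07 |

| Anticodons | n | +b+c | +b-c | -b+c | -b-c | G | P |
|---|---|---|---|---|---|---|---|
| Arg | 5 | 20 | 32 | 37 | 107 | 2.9 | 4.5E-02 |
| Tyr | 3 | 18 | 65 | 6 | 182 | 21.7 | 1.6E-06 |
| Ile | 5 | 16 | 24 | 23 | 188 | 17.1 | 1.7E-05 |
| Gln | 3 | 1 | 41 | 17 | 97 | -5.9 | 9.9E-01 |
| Leu | 2 | 27 | 35 | 23 | 74 | 6.7 | 4.7E-03 |
| Phe | 8 | 12 | 73 | 40 | 499 | 3.7 | 2.8E-02 |
| Total | 26 | 94 | 270 | 146 | 1147 | 43.1 | 2.6E-11 |
| Total - Tyr | 21 | 74 | 238 | 109 | 1040 | 39.6 | 1.6E-10 |

| Rev. Codons | n | +b+c | +b-c | -b+c | -b-c | G | P |
|---|---|---|---|---|---|---|---|
| Arg | 5 | 16 | 36 | 42 | 102 | 0.05 | 8.3E-01 |
| Tyr | 3 | 3 | 80 | 6 | 182 | 0.03 | 8.6E-01 |
| Ile | 1 | 10 | 30 | 25 | 186 | 4.10 | 4.3E-02 |
| Gln | 3 | 7 | 35 | 11 | 103 | 1.34 | 2.5E-01 |
| Leu | 2 | 12 | 50 | 29 | 68 | -2.22 | 1.4E-01 |
| Phe | 8 | 2 | 83 | 43 | 496 | -4.33 | 3.7E-02 |
| Total | 22 | 50 | 314 | 156 | 1137 | 0.71 | 4.0E-01 |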

Table 7.3: Test for association between binding sites and the cognate codons, anticodons,

and codons reversed 3’ to 5’. Column headings: n, number of sequences; +b+c, number

of bases both in codons and in binding sites; +b-c, number of bases in binding sites but

not in codons; -b+c, number of bases in codons but not in binding sites; -b-c, number of

bases neither in codons nor in binding sites; G, the G test for association in a 2 x 2 table,

with the Williams correction; P, 1-tailed test for independence with 1 degree of freedom.

Values in italics are significant to P < 0.01 after correcting for 6 comparisons. There are

significant associations between some amino acid binding sites and both codons and

anticodons, even when the single most significant association is removed. However, there

is no association at all between amino acid binding sites and the reversed codons.
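
The G test described in the caption can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the original analysis code: the function name g_test_2x2 is hypothetical, all row and column totals are assumed to be non-zero, the one-tailed P is taken as half the chi-square tail (1 d.f.), and G is given a negative sign when codon bases are depleted in binding sites. With these conventions the sketch reproduces, for example, the Arg codon row and the Gln anticodon row; the reversed-codon P values in the table appear to follow a two-tailed convention instead and are not reproduced here.

```python
import math

def g_test_2x2(both, site_only, codon_only, neither):
    """Signed, Williams-corrected G and one-tailed P for a 2 x 2 table.

    Arguments follow the column order of Table 7.3: +b+c, +b-c, -b+c, -b-c.
    """
    n = both + site_only + codon_only + neither
    rows = (both + site_only, codon_only + neither)   # in site / not in site
    cols = (both + codon_only, site_only + neither)   # in codon / not in codon
    observed = ((both, site_only), (codon_only, neither))

    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                g += 2.0 * o * math.log(o * n / (rows[i] * cols[j]))

    # Williams correction for a 2 x 2 table.
    q = 1.0 + ((n / rows[0] + n / rows[1] - 1.0)
               * (n / cols[0] + n / cols[1] - 1.0)) / (6.0 * n)
    g = max(g / q, 0.0)  # guard against tiny negative rounding error

    # Chi-square upper tail with 1 d.f. equals erfc(sqrt(G / 2)).
    tail = math.erfc(math.sqrt(g / 2.0))
    if both >= rows[0] * cols[0] / n:
        return g, 0.5 * tail           # codon bases enriched in sites
    return -g, 1.0 - 0.5 * tail        # codon bases depleted in sites

g_arg, p_arg = g_test_2x2(36, 16, 38, 106)   # Arg, codons
g_gln, p_gln = g_test_2x2(1, 41, 17, 97)     # Gln, anticodons
print(f"Arg codons:     G = {g_arg:.1f}, P = {p_arg:.1e}")
print(f"Gln anticodons: G = {g_gln:.1f}, P = {p_gln:.1e}")
```

The chi-square tail is written with math.erfc so that the sketch needs only the standard library.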



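| Codon Doublets | # seq | +b+c | +b-c | -b+c | -b-c | G | P |
|---|---|---|---|---|---|---|---|
| Arg | 5 | 24 | 28 | 24 | 120 | 16.4 | 2.5E-05 |
| Tyr | 3 | 22 | 61 | 20 | 168 | 10.2 | 7.1E-04 |
| Ile | 5 | 15 | 25 | 30 | 181 | 10.4 | 6.5E-04 |
| Gln | 3 | 9 | 33 | 15 | 99 | 1.5 | 1.1E-01 |
| Leu | 2 | 7 | 55 | 21 | 76 | -2.9 | 9.6E-01 |
| Phe | 8 | 17 | 68 | 96 | 443 | 0.2 | 3.2E-01 |
| Total | 26 | 94 | 270 | 206 | 1087 | 17.5 | 1.4E-05 |
| Total - Arg | 21 | 70 | 242 | 182 | 967 | 7.1 | 3.9E-03 |

| Anticodon Doublets | # seq | +b+c | +b-c | -b+c | -b-c | G | P |
|---|---|---|---|---|---|---|---|
| Arg | 5 | 11 | 41 | 27 | 117 | 0.1 | 3.6E-01 |
| Tyr | 3 | 23 | 60 | 19 | 169 | 12.5 | 2.1E-04 |
| Ile | 5 | 8 | 32 | 45 | 166 | 0.0 | 5.7E-01 |
| Gln | 3 | 5 | 37 | 46 | 68 | -12.6 | 1.0E+00 |
| Leu | 2 | 22 | 40 | 16 | 81 | 7.2 | 3.6E-03 |
| Phe | 8 | 27 | 58 | 72 | 467 | 15.6 | 3.8E-05 |
| Total | 26 | 96 | 268 | 225 | 1068 | 13.8 | 1.0E-04 |
| Total - Phe | 18 | 85 | 227 | 198 | 951 | 14.8 | 6.1E-05 |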

Table 7.4: Test for association between binding sites and codon doublets (XYN) or

anticodon doublets (NY’X’), where X and Y are specified and N is any base. For

example, the codon doublet for Phe is UUN within a binding site, and the anticodon

doublet is NAA within a site. Again, the specific associations hold for both codons and

anticodons overall, although few of the results are individually significant. Italics indicate

significant values after correction for 6 comparisons.
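
The triplet classes tested in Tables 7.3 and 7.4 can be derived mechanically from a codon. The sketch below is illustrative only: the function names are hypothetical, and the anticodon is taken as the simple Watson-Crick reverse complement, ignoring wobble pairing and modified bases.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon(codon):
    """Reverse complement of a codon, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))

def reversed_codon(codon):
    """Codon read 3' to 5' (the control in Table 7.3)."""
    return codon[::-1]

def codon_doublet(codon):
    """XYN: first two codon positions fixed, third position any base."""
    return codon[:2] + "N"

def anticodon_doublet(codon):
    """NY'X': complement of the codon doublet, written 5' to 3'."""
    return "N" + anticodon(codon)[1:]

for codon in ("UUU", "UUC"):  # the two Phe codons
    print(codon, anticodon(codon), reversed_codon(codon),
          codon_doublet(codon), anticodon_doublet(codon))
```

Both Phe codons map to the codon doublet UUN and the anticodon doublet NAA, as in the example given in the caption.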



Fig. 7.1: Distribution of likelihood for randomized genetic codes. The lines correspond to the different models for random codes described in Testing Triplet/Site Associations. The gray vertical line at the right (G = 51.5) gives the position of the actual code: very few randomized codes give a higher association between ‘codons’ and binding sites, making it highly unlikely that the observed association for the real code is due to chance. The “n-block” line (x) is skewed strongly to the right, because some codons can occupy relatively few blocks under this model. Thus n-block randomization preserves many similarities to the real code.
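
The logic behind such a distribution can be sketched as a small Monte Carlo test. Everything specific here is an assumption made for illustration: the two aptamer sequences and their binding-site positions are hypothetical toy data, the score is a simple count of binding-site bases lying inside cognate triplets rather than the G statistic, and the randomization merely permutes amino acid labels among the existing codon sets, which is not claimed to match any of the five schemes used for the figure.

```python
import random

BASES = "UCAG"
AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' = stop).
STANDARD = {a + b + c: AA[16 * i + 4 * j + k]
            for i, a in enumerate(BASES)
            for j, b in enumerate(BASES)
            for k, c in enumerate(BASES)}

def shuffled_code(code, rng):
    """Permute amino acid labels among the existing codon sets."""
    labels = sorted(set(code.values()))
    permuted = labels[:]
    rng.shuffle(permuted)
    relabel = dict(zip(labels, permuted))
    return {codon: relabel[aa] for codon, aa in code.items()}

def site_codon_overlap(code, aptamers):
    """Count binding-site bases that lie inside a cognate triplet."""
    total = 0
    for aa, seq, site in aptamers:
        covered = set()
        for start in range(len(seq) - 2):
            if code.get(seq[start:start + 3]) == aa:
                covered.update(range(start, start + 3))
        total += len(covered & site)
    return total

def fraction_at_least_as_good(code, aptamers, n_codes, rng):
    """Fraction of randomized codes scoring >= the actual code."""
    actual = site_codon_overlap(code, aptamers)
    hits = sum(site_codon_overlap(shuffled_code(code, rng), aptamers) >= actual
               for _ in range(n_codes))
    return hits / n_codes

# Hypothetical toy data: (amino acid, RNA sequence, 0-based site positions).
TOY_APTAMERS = [
    ("R", "GGAGACGAGGUUCGCGAAC", {3, 4, 5, 12, 13, 14}),
    ("I", "CCAUUGGGAUACUUGG", {2, 3, 4, 8, 9, 10}),
]

fraction = fraction_at_least_as_good(STANDARD, TOY_APTAMERS, 2000,
                                     random.Random(0))
print(f"fraction of shuffled codes at least as good: {fraction:.3f}")
```

A small fraction means that few relabeled codes place their ‘codons’ in the toy binding sites as well as the standard code does, which is the comparison the figure summarizes for the real data.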



References

1. Szathmáry E. Coding coenzyme handles: A hypothesis for the origin of the

genetic code. Proc Natl Acad Sci USA 1993;90: 9916-9920.

2. Porschke D. Differential effect of amino acid residues on the stability of double

helices formed from polyribonucleotides and its possible relation to the evolution of the

genetic code. J Mol Evol 1985;21: 192-198.

3. Maizels N, Weiner AM. Peptide-specific ribosomes, genomic tags, and the origin

of the genetic code. Cold Spring Harb Symp Quant Biol 1987;LII: 743-749.

4. Maizels N, Weiner AM. The genomic tag hypothesis: modern viruses as

molecular fossils of ancient strategies for genomic replication, in The RNA world,

Gesteland RF and Atkins JF, Eds. Cold Spring Harbor Laboratory Press: New York

1993;577-602.

5. Gamow G. Possible mathematical relation between deoxyribonucleic acid and

protein. Kgl Dansk Videnskab Selskab Biol Medd 1954;22: 1-13.

6. Woese CR. The genetic code: the molecular basis for genetic expression. New

York: Harper & Row 1967.

7. Ycas M. The biological code. North-Holland Research Monographs: Frontiers of

Biology, ed. Neuberger A and Tatum EL. Vol. 12. Amsterdam: North-Holland publishing

Company 1969.

8. Woese CR, Dugre DH, Dugre SA, et al. On the fundamental nature and evolution

of the genetic code. Cold Spring Harb Symp Quant Biol 1966;31: 723-736.

9. Woese CR, Dugre DH, Saxinger WC, et al. The molecular basis for the genetic

code. Proc Natl Acad Sci USA 1966;55: 966-974.



10. Dunnill P. Triplet nucleotide-amino acid pairing: a stereochemical basis for the

division between protein and nonprotein amino acids. Nature 1966;210: 1267-1268.

11. Pelc SR. Correlation between coding triplets and amino acids. Nature 1965;207:

597-599.

12. Pelc SR, Welton MGE. Stereochemical relationship between coding triplets and

amino-acids. Nature 1966;209: 868-872.

13. Ralph RK. A suggestion on the origin of the genetic code. Biochem Biophys Res

Comm 1968;33: 213-218.

14. Lacey Jr JC, Pruitt KM. Origin of the genetic code. Nature 1969;223: 799-804.

15. Rendell MS, Harlos JP, Rein R. Specificity in the genetic code: the role of

nucleotide base-amino acid interaction. Biopolymers 1971;10: 2083-2094.

16. Melcher G. Stereospecificity of the genetic code. J Mol Evol 1974;3: 121-140.

17. Nelsestuen GL. Amino acid-directed nucleic acid synthesis. J Mol Evol 1978;11:

109-120.

18. Hendry LB, Whitham FH. Stereochemical recognition in nucleic acid-amino acid

interactions and its implications in biological coding: a model approach. Perspect Biol

Med 1979;22: 333-345.

19. Hendry LB, Bransome Jr ED, Hutson MS, et al. First approximation of a

stereochemical rationale for the genetic code based on the topography and physicochemical

properties of "cavities" constructed from models of DNA. Proc Natl Acad Sci USA

1981;78: 7440-7444.



20. Balasubramanian R. Origin of life: a hypothesis for the origin of adaptor-mediated

ordered synthesis of proteins and an explanation for the choice of terminating codons in

the genetic code. Bio Systems 1982;15: 99-104.

21. Shimizu M. Molecular basis for the genetic code. J Mol Evol 1982;18: 297-303.

22. Root-Bernstein RS. Amino acid pairing. J theor Biol 1982;94: 885-894.

23. Root-Bernstein RS. On the origin of the genetic code. J theor Biol 1982;94: 895-

904.

24. Alberti S. The origin of the genetic code and protein synthesis. J Mol Evol

1997;45: 352-358.

25. Crick FHC. An error in model building. Nature 1967;213: 798.

26. Mellersh A. A model for the prebiotic synthesis of peptides and the genetic code.

Orig Life Evol Biosph 1993;23: 261-274.

27. Crick FHC. The origin of the genetic code. J Mol Biol 1968;38: 367-379.

28. Knight RD, Freeland SJ, Landweber LF. Selection, history and chemistry: the

three faces of the genetic code. Trends Biochem Sci 1999;24: 241-247.

29. Woese CR. Evolution of the genetic code. Naturwissenschaften 1973;60: 447-459.

30. Nagyvary J, Fendler JH. Origin of the genetic code: a physical-chemical model of

primitive codon assignments. Orig Life 1974;5: 357-362.

31. Miller SL. Which organic compounds could have occurred on the prebiotic earth?

Cold Spring Harb Symp Quant Biol 1987;LII: 17-27.

32. Fendler JH, Nome F, Nagyvary J. Compartmentalization of amino acids in

surfactant aggregates. J Mol Evol 1975;6: 215-232.



33. Weber AL, Lacey Jr JC. Genetic code correlations: amino acids and their

anticodon nucleotides. J Mol Evol 1978;11: 199-210.

34. Jungck JR. The genetic code as a periodic table. J Mol Evol 1978;11: 211-224.

35. Lehmann U. Chromatographic separation as selection process for prebiotic

evolution and the origin of the genetic code. Bio Systems 1985;17: 193-208.

36. Saxinger C, Ponnamperuma C. Experimental investigation on the origin of the


37. Raszka M, Mandel M. Is there a physical chemical basis for the present genetic

code? J Mol Evol 1972;2: 38-43.

38. Saxinger C, Ponnamperuma C. Interactions between amino acids and nucleotides

in the prebiotic milieu. Orig Life 1974;5: 189-200.

39. Lacey Jr JC, Weber AL, White Jr WE. A model for the coevolution of the genetic

code and the process of protein synthesis: review and assessment. Orig Life 1975;6: 273-

283.

40. Reuben J, Polk FE. Nucleotide-amino acid interactions and their relation to the


41. Podder SK, Basu HS. Specificity of protein-nucleic acid interaction and the

biochemical evolution. Orig Life 1984;14: 477-484.

42. Lacey Jr JC, Wickramasinghe NSMD, Cook GW, et al. Couplings of character

and of chirality in the origin of the genetic system. J Mol Evol 1993;37: 233-239.

43. Sowerby SJ, Cohn CA, Heckl WM, et al. Differential adsorption of nucleic acid

bases: relevance to the origin of life. Proc Natl Acad Sci USA 2001;98: 820-822.



44. Sowerby SJ, Heckl WM. The role of self-assembled monolayers of the purine and

pyrimidine bases in the emergence of life. Orig Life Evol Biosph 1998;28: 283-310.

45. Sowerby SJ, Stockwell PA, Heckl WM, et al. Self-programmable, self-assembling

two-dimensional genetic matter. Orig Life Evol Biosph 2000;30: 81-99.

46. Lacey Jr JC, Mullins Jr DW. Experimental studies related to the origin of the

genetic code and the process of protein synthesis-a review. Orig Life 1983;13: 3-42.

47. Lacey Jr JC. Experimental studies on the origin of the genetic code and the

process of protein synthesis: a review update. Orig Life Evol Biosph 1992;22: 243-275.

48. Epstein CJ. Role of the amino-acid 'code' and of selection for conformation in the

evolution of proteins. Nature 1966;210: 25-28.

49. Volkenstein MV. Coding of polar and non-polar amino acids. Nature 1965;207:

294-295.

50. Woese CR. Order in the genetic code. Proc Natl Acad Sci USA 1965;54: 71-75.

51. Saks ME, Sampson JR, Abelson J. Evolution of a transfer RNA gene through a

point mutation in the anticodon. Science 1998;279: 1665-1670.

52. Yarus M. RNA-ligand chemistry: a testable source for the genetic code. RNA

2000;6: 475-484.

53. Yarus M. A specific amino acid binding site composed of RNA. Science

1988;240: 1751-1758.

54. Yarus M. Specificity of arginine binding by the Tetrahymena intron.

Biochemistry 1989;28: 980-988.

55. Yarus M. An RNA-amino acid complex and the origin of the genetic code. New

Biologist 1991;3: 183-189.



56. Tao J, Frankel AD. Specific binding of arginine to TAR RNA. Proc Natl Acad Sci

USA 1992;89: 2723-2726.

57. Yarus M. Amino Acids as RNA Ligands: a Direct-RNA-Template Theory for the

Code's Origin. J Mol Evol 1998;47: 109-117.

58. Ellington AD, Szostak JW. In vitro selection of RNA molecules that bind

specific ligands. Nature 1990;346: 818-822.

59. Tuerk C, Gold L. Systematic evolution of ligands by exponential enrichment:

RNA ligands to bacteriophage T4 DNA polymerase. Science 1990;249: 505-510.

60. Robertson DL, Joyce GF. Selection in vitro of an RNA enzyme that specifically

cleaves single-stranded DNA. Nature 1990;344: 467-468.

61. Ciesiolka J, Illangasekare M, Majerfeld I, et al. Affinity selection-amplification

from randomized ribooligonucleotide pools. Meth Enzymol 1996;267: 315-335.

62. Majerfeld I, Yarus M. An RNA pocket for an aliphatic hydrophobe. Nat Struct

Biol 1994;1: 287-292.

63. Zinnen S, Yarus M. An RNA pocket for the planar aromatic side chains of

phenylalanine and tryptophane. Nucl Acid Symp Ser 1995;33: 148-151.

64. Majerfeld I, Yarus M. Isoleucine:RNA sites with essential coding sequences.

RNA 1998;4: 471-478.

65. Mannironi C, Scerch C, Fruscoloni P, et al. Molecular recognition of amino acids

by RNA aptamers: the evolution into an L-tyrosine binder of a dopamine-binding RNA

motif. RNA 2000;6: 520-527.

65a. Illangasekare M, Yarus M. Phenylalanine-binding RNAs and genetic code

evolution. J Mol Evol 2002;54: 298-311.



66. Famulok M. Molecular recognition of amino acids by RNA-aptamers: an L-

citrulline binding RNA motif and its evolution into an L-arginine binder. J Am Chem Soc

1994;116: 1698-1706.

67. Connell GJ, Illangasekare M, Yarus M. Three small ribooligonucleotides with

specific arginine sites. Biochemistry 1993;32: 5497-5502.

68. Connell GJ, Yarus M. RNAs with dual specificity and dual RNAs with similar

specificity. Science 1994;264: 1137-1141.

69. Yarus M. An RNA-amino acid affinity, in The RNA World, Gesteland RF, Atkins

JF, Editors. Cold Spring Harbor Laboratory Press: New York 1993;205-217.

70. Tao J, Frankel AD. Arginine-binding RNAs resembling TAR identified by in

vitro selection. Biochemistry 1996;35: 2229-2238.

71. Burgstaller P, Kochoyan M, Famulok M. Structural probing and damage selection

of citrulline- and arginine-specific RNA aptamers identify base positions required for

binding. Nucl Acid Res 1995;23: 4769-4776.

72. Geiger A, Burgstaller P, von der Eltz H, et al. RNA aptamers that bind L-arginine

with sub-micromolar dissociation constants and high enantioselectivity. Nucl Acid Res

1996;24: 1029-1036.

73. Yang Y, Kochoyan M, Burgstaller P, et al. Structural basis of ligand

discrimination by two related RNA aptamers resolved by NMR spectroscopy. Science

1996;272: 1343-1346.

74. Wong JT-F. A co-evolution theory of the genetic code. Proc Natl Acad Sci USA

1975;72: 1909-1912.



75. Sonneborn TM. Degeneracy of the genetic code: extent, nature, and genetic

implications, in Evolving Genes and Proteins, Bryson V and Vogel HJ, Eds. Academic

Press: New York 1965;377-397.

76. Knight RD, Landweber LF. Guilt by association: the arginine case revisited.

RNA 2000;6: 499-510.

77. Knight RD, Landweber LF. Rhyme or reason: RNA-arginine interactions and the

genetic code. Chem Biol 1998;5: R215-R220.

78. Ellington AD, Khrapov M, Shaw CA. The scene of a frozen accident. RNA

2000;6: 485-498.

79. Illangasekare M, Sanchez G, Nickles T, et al. Aminoacyl-RNA synthesis

catalyzed by an RNA. Science 1995;267: 643-647.

80. Illangasekare M, Yarus M. Specific, rapid synthesis of Phe-RNA by RNA. Proc

Natl Acad Sci USA 1999;96: 5470-5475.

81. Illangasekare M, Yarus M. A tiny RNA that catalyzes both aminoacyl-RNA and

peptidyl-RNA synthesis. RNA 1999;5: 1482-1489.

82. Welch M, Majerfeld I, Yarus M. 23S rRNA similarity from selection for peptidyl

transferase mimicry. Biochemistry 1997;36: 6614-6623.

83. Nissen P, Hansen J, Ban N, et al. The structural basis of ribosome activity in

peptide bond synthesis. Science 2000;289: 920-930.

84. Yarus M, Welch M. Peptidyl transferase: ancient and exiguous. Chem Biol

2000;7: R187-R190.

85. Kumar RK, Yarus M. RNA-catalyzed amino acid activation. Biochemistry

2001;40: 6998-7004.



86. Yarus M, Majerfeld I. Co-optimization of ribozyme substrate stacking and L-

arginine binding. J Mol Biol 1992;225: 945-949.

87. Famulok M, Szostak JW. Stereospecific recognition of tryptophan agarose by in

vitro selected RNA. J Am Chem Soc 1992;114: 3990-3991.

88. Sokal RR, Rohlf FJ. Biometry: The Principles and Practice of Statistics in

Biological Research. 3rd ed. New York: W. H. Freeman and Company 1995.

89. Yarus M. On translation by RNAs alone. Cold Spring Harb Symp Quant Biol

2001;66: 207-215.

