Bioinformatic Analyses of the tRNA:(guanine 26, N²,N²)-dimethyltransferase (Trm1) Family

Janusz M. Bujnicki1*, Richard A. Leach3, Janusz Debski1, and Leszek Rychlewski2

1Bioinformatics Laboratory, International Institute of Molecular and Cell Biology, ul. ks. Trojdena 4, 02-109 Warsaw, Poland
2BioInfoBank Institute, ul. Limanowskiego 24A, 60-744 Poznan, Poland
3Division of Experimental Therapeutics, Walter Reed Army Institute of Research, 503 Robert Grant Avenue, Silver Spring, Maryland 20910, USA

*For correspondence. Email [email protected]; Tel. +48-22 668 5384; Fax. +48-22 668 5288.

J. Mol. Microbiol. Biotechnol. (2002) 4(4): 405–415. JMMB Research Article
© 2002 Horizon Scientific Press

Abstract

Functional analyses of the tRNA:(guanine 26, N²,N²)-dimethyltransferase (Trm1) have been hampered by a lack of structural information about the enzyme and by low sequence similarity to better studied methyltransferases. Here we used computational methods to detect novel homologs of Trm1, infer the evolutionary relationships of the family, and predict the structure of the Trm1 methyltransferase. The N-terminal region of the protein is predicted to form an S-adenosylmethionine-binding domain, which harbors the active site. The C-terminal region is rich in predicted α-helices and, in analogy to other nucleic acid methyltransferases, may constitute the target recognition domain of the enzyme. Interposing these two domains, most Trm1 homologs possess a highly variable inserted sequence that is delimited by a Cys₄ cluster, likely forming a Zn-finger structure. The residues of Trm1 predicted to participate in cofactor binding, target recognition, and catalysis were mapped onto a preliminary structural model, providing a platform for designing new experiments to better understand the molecular functions of this protein family. In addition, identification of novel, atypical Trm1 homologs suggests candidates for cloning and biochemical characterization.

Introduction

RNAs contain a plethora of chemically altered nucleosides, which are formed by enzymatic modification of the primary transcript during RNA maturation. Although modified nucleosides have been identified in all RNAs, the tRNAs contain the greatest number and variety of modifications by far (Rozenski et al., 1999; Grosjean and Benne, 1998). In general, modified nucleosides are of minor importance for cell growth and/or survival and their functions have remained obscure for many years. However, it is now known that these modifications improve the fidelity and efficiency of tRNA in decoding genetic messages and in maintaining the reading frame during protein synthesis (reviewed by Curran, 1998; Bjork et al., 1999).

Among numerous naturally occurring nucleotide modifications, the simplest and most common are methylation of nucleotide bases and of the 2′-hydroxyl group of ribose (Rozenski et al., 1999). One such example is dimethylation of the exocyclic 2-amino group of guanine. N²,N²-dimethylguanosine (m²₂G) has been detected in almost two-thirds of the tRNAs from Eukaryota and in about one-half of the tRNAs from Archaea, but never in any bacterial tRNAs (Sprinzl et al., 1998). In addition to the commonly occurring methylation at position 26 in eukaryotic tRNA, m²₂G is occasionally found also at position 27, whereas in Archaea m²₂G can also be found at position 10, but never concurrently with an m²₂G26 modification (reviewed by Grosjean et al., 1995). It has been demonstrated that m²₂G26, but not m²₂G10, is generated by the S-adenosyl-L-methionine (AdoMet)-dependent methyltransferase (MTase) Trm1 (tRNA methyltransferase 1) and that the reaction proceeds via a monomethylated intermediate (m²G) (Reinhart et al., 1986; Constantinesco et al., 1999). Genes encoding Trm1 in Pyrococcus furiosus (Constantinesco et al., 1998), Saccharomyces cerevisiae (Liu and Straby, 1998), Caenorhabditis elegans (Liu et al., 1999) and humans (Liu and Straby, 2000) (hereafter referred to as PfTrm1, ScTrm1, CeTrm1, and hTrm1, respectively) have been cloned, and the corresponding protein products have been studied experimentally. A Trm1 homolog has been identified in the completely sequenced genome of the thermophilic bacterium Aquifex aeolicus (Deckert et al., 1998). However, to our knowledge, it has not been characterized, and the tRNAs from this bacterium have not been examined for the presence of m²₂G.

The function of m²₂G26 is not known; S. cerevisiae cells having the TRM1 gene deleted exhibit no phenotype (Ellis et al., 1987). However, it has been observed that the TRM1 deletion in Schizosaccharomyces pombe increases the capacity of the sup3-I serine tRNA to translate the UAA (ochre) codon by some unknown mechanism (Niederberger et al., 1999). It has been hypothesized that the dimethylated guanine is involved in determining the flexibility of the tRNA molecule, and that in so doing, it facilitates the interaction with various macromolecules in the cell (Edqvist et al., 1995). It should be noted that the steric hindrance introduced by the dimethylation eliminates
the ability of G to form a Watson-Crick type pair with C, whereas m²₂G pairing to U, A, or G is practically unchanged because the N² group of guanosine is not implicated in these base pairs. Interestingly, it has been proposed that m²₂G26 is required to avoid extension of the anticodon stem to more than six base pairs, and hence, prevent cytosolic tRNAs from folding into the “alternative” type-5 or type-7 pattern observed in some unusual mitochondrial tRNAs (Steinberg and Cedergren, 1994; Steinberg and Cedergren, 1995).

The structure and catalytic mechanism of Trm1 remain unknown. Other families of MTases that methylate exocyclic amino groups of nitrogenous bases in DNA and RNA, namely N⁶ of adenine (m⁶A), N⁴ of cytosine (m⁴C), and N² of guanine (m²G), have a conserved “catalytic” motif IV (the [N/D/S]-P-P-[F/Y/W/H] tetrapeptide) (Klimasauskas et al., 1989; Malone et al., 1995; Fauman et al., 1999; Bujnicki, 2000). Hence, it has been proposed that these enzymes employ a similar reaction mechanism and should be classified as a single family of N-MTases. However, a similar motif could not be identified in the Trm1 sequence, as this particular enzyme has only limited similarity to other MTases. Because of this, it is impossible to make straightforward inferences about functionally important sites based solely on comparison to known MTase structures. Except for the predicted AdoMet-binding site (Constantinesco et al., 1998) and several other amino acid residues shown to be important for the activity of the yeast enzyme (Liu and Straby, 1998), little is known about residues that may be involved in catalysis. Further, the structural elements responsible for tRNA binding and positioning of G26 in the active site are unknown.

To learn more about the evolutionary origin and the sequence-structure-function relationships in the Trm1 family, we attempted to predict the structure of this enzyme in greater detail. Here, we report the results of fold recognition and the analyses of preliminary models built for all domains identified in the Trm1 primary structure. Structure-based sequence analysis allowed prediction of residues involved in cofactor binding, the methyl transfer reaction, and target recognition.

Results and Discussion

Sequence Analysis and Preliminary Structure Prediction

The amino-acid sequences of ScTrm1 and PfTrm1 were used in PSI-BLAST searches to identify homologous proteins. They were subsequently submitted to the fold recognition MetaServer in order to identify boundaries of the AdoMet-binding domain as well as other possible domains (see Experimental Procedures). Figure 1 shows the refined multiple sequence alignment of the Trm1 family along with the predicted secondary structure. The sequence similarity is higher in the N-proximal part of the alignment. However, several well-conserved motifs can be delineated in the C-proximal part. It is noteworthy that the archaeal sequences are shorter than their eukaryotic counterparts, which possess extended termini and variable insertions in the central part of the protein (corresponding to the vicinity of residue 260 in PfTrm1).

According to the fold recognition MetaServer, the N-terminal region, corresponding to aa 30–240 in PfTrm1, exhibited strong similarity to the MTase fold (Pcons score 10.35). Most servers reported the uncharacterized MTase MJ0882 from M. jannaschii (1dus in PDB) as most similar to Trm1. This protein structure was solved as a part of a structural genomics pilot project. However, to our knowledge its functional characterization has not been reported. Sequence analysis suggests that MJ0882 is closely related to a family of bacterial rRNA:m²G MTases (Bujnicki, 2000; J.M. Bujnicki and coworkers, manuscript in preparation). Hence, the structural similarity of tRNA:m²₂G MTases and MJ0882 detected by threading suggests that these two families of guanine-N²-specific enzymes might share a common ancestor. In addition to the compatibility of the Trm1 sequence with the MTase fold, the pattern of secondary structures predicted for Trm1 agreed perfectly with that observed in experimentally determined structures of MTases. Thus, the common MTase motifs map to conserved regions in the alignment of Trm1 sequences (Figure 1).

The C-terminal part of PfTrm1 (aa 283–381) exhibited no clear sequence similarity to known protein domains. Only its central fragment exhibited some similarity to the “winged helix” (wH) family of nucleic acid-binding proteins (reviewed by Gajiwala and Burley, 2000) at the level of the secondary and tertiary structure (Pcons score 4.06; Figure 1). It is tempting to speculate that the C-terminal domain of tRNA:m²₂G MTases is involved in tRNA binding, in analogy to other MTases (Fauman et al., 1999). However, in the absence of unequivocal sequence similarity to wH proteins, the tertiary fold assignment for the C-terminal domain must await further studies.

The central part of the Trm1 sequences is extremely variable and its length varies from roughly 30 aa in Trm1 from Pyrococci, Methanococci, and Thermoplasmales, to roughly 50 aa in yeast, to 150 aa in Plasmodium falciparum (Figure 1). According to the secondary structure prediction methods, this region is quite poor in helices and in strands. Interestingly, all proteins except those from the above-mentioned archaeal lineages possess a cluster of four conserved Cys residues [C-x2-C-xn-C-x1–2-C, where x is any amino acid] that demarcate the variable region. Only the Aeropyrum pernix cluster is partially eroded. Additionally, the first pair of Cys residues from Aspergillus fumigatus Trm1 is separated by three rather than two amino acids. Remarkably, substitution of the first Cys residue of the cluster with Arg in eukaryotic MTases eliminated their activity (Liu and Straby, 2000). Threading analysis using the Methanobacterium thermoautotrophicum Trm1 homolog (aa 236–281) revealed that this region is similar to various small Fe- or Zn-liganding domains, for instance the rubredoxin domain of rubrerythrin (1bj6 in PDB; Pcons score 3.06). Localization of the Cys₄ cluster domain in the variable region between the predicted catalytic and RNA-binding domains suggests that the cluster may be involved in positioning the two functional domains relative to each other and to the substrate.

Figure 1. Multiple sequence alignment of the Trm1 family. Sequences are denoted by their NCBI Gene Identification numbers; the “unf” suffix indicates sequences obtained from the unfinished genome data. Conserved motifs are labeled according to the nomenclature described for the AdoMet-dependent MTase superfamily (Fauman et al., 1999). Conserved residues are highlighted in black; residues with invariant physicochemical character (hydrophobic, small, etc.) are highlighted in gray. Numbers show the size of termini (unknown in preliminary sequences) or insertions omitted for clarity. Based on the threading results, the N-terminal domain has been aligned to the structure of the MJ0882 protein (1dus in PDB) and the C-terminal domain has been aligned to the structure of the SmtB repressor (1smt in PDB). Secondary structures are shown below the alignment: predicted for the Trm1 family and experimentally derived in the case of the crystal structures.
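The Cys₄ cluster pattern quoted above, C-x2-C-xn-C-x1–2-C, translates naturally into a regular expression. The following is a minimal sketch rather than the procedure used in the paper: the function name `find_cys4_clusters` and the bounds on the insert length n (`min_insert`, `max_insert`) are assumptions introduced for illustration, since the text does not constrain n.

```python
import re


def find_cys4_clusters(seq, min_insert=5, max_insert=200):
    """Locate C-x2-C-xn-C-x1-2-C clusters in a protein sequence.

    Returns a list of (start, end, insert_length) tuples with 1-based,
    inclusive coordinates. The limits on n are illustrative assumptions.
    """
    # A lookahead lets matches starting at different positions overlap;
    # the lazy quantifier reports the shortest insert for each start.
    pattern = re.compile(
        r"(?=(C.{2}C(.{%d,%d}?)C.{1,2}C))" % (min_insert, max_insert)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        hits.append((m.start(1) + 1, m.end(1), len(m.group(2))))
    return hits


# Toy sequence (not a real Trm1 fragment): two Cys pairs around a 10-residue insert.
toy = "MKT" + "CAAC" + "G" * 10 + "CAC" + "LL"
print(find_cys4_clusters(toy))
```

A strict C-x2-C first pair would miss the Aspergillus fumigatus arrangement mentioned above, where the first two Cys residues are separated by three residues; relaxing the first spacer to `.{2,3}` would be one way to accommodate it.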

The “orthodox” Trm1 family members from Eukaryota possess a common extension in the C-terminus, which is dissimilar to other protein sequences in the database (data not shown). Deletion of up to 30 C-terminal residues of ScTrm1 lowered its activity to ~7% of the wild type. Similarly, deletion of 29 N-terminal residues from ScTrm1 decreased its m²G MTase activity and profoundly reduced its ability to perform the methylation of m²G to m²₂G (Liu and Straby, 1998). Deletion of 31 N-terminal amino acids resulted in complete elimination of ScTrm1 activity. Moreover, the effect of combined N- and C-terminal deletions exceeded the sum of the individual effects. However, as many as 37 N-terminal residues of ScTrm1 have no counterpart in PfTrm1 and some of the archaeal enzymes have even shorter N-termini. Furthermore, only the eukaryotic Trm1 proteins have the extended C-terminus. Hence, these regions are unlikely to be directly involved in catalysis, but may mediate some Eukaryote-specific interactions with the target tRNA or other proteins. Interestingly, hTrm1 has an additional C-terminal 90 aa protrusion, which harbors a putative Zn finger and a Pro-rich region, and whose deletion did not affect the m²₂G MTase activity (Liu and Straby, 2000). In contrast, the uncharacterized human Trm1 homolog (C1ORF25, identified as such by Sood et al., 2001) exhibits an unrelated N-terminal extension of approximately 200 aa. This extension also harbors a Pro-rich region in its N-proximal part and 7 Cys residues in its C-proximal part.

ScTrm1 is necessary for m²₂G26 formation in both mitochondrial and cytoplasmic tRNAs (Ellis et al., 1986). It is also targeted to the nucleus and an efficient nuclear localization sequence (NLS) has been mapped to the region between aa 95 and 102 (Rose et al., 1992), corresponding to the insertion between motifs X, which is typical for eukaryotic Trm1 proteins (Figure 1). Other eukaryotic TRM1 genes are also believed to encode proteins with different subcellular distributions, called “sorting isozymes”; in their sequences a similar basic type of NLS has been identified (Stanford et al., 2000). However, the mitochondrial targeting sequence (MTS) could not be predicted with confidence. Among “additional” regions typical for eukaryotic proteins that could target them to the appropriate subcellular locations, the N-terminus implicated in mitochondrial targeting of ScTrm1 (Ellis et al., 1989) is not conserved in all family members, and the C-terminus has been shown to be essential for the enzymatic activity of Trm1 rather than subcellular sorting (Liu and Straby, 1998). Stanford et al. (2000) noted that the insertion within the Cys₄ cluster is much larger in eukaryotic Trm1. However, we could not detect any obvious MTS in this region; it remains to be determined experimentally whether it indeed contains the targeting information.

The P. falciparum, T. brucei and L. major Trm1 homologs exhibit particularly long insertions in the common core, which display a low-complexity character (data not shown). Such regions are quite typical for proteins from these organisms and are believed to encode nonglobular domains that are extruded from the protein core and do not interfere with the function of the protein (Pizzi and Frontali, 2001).
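The text does not say how low-complexity character was assessed. As a generic illustration only, one common heuristic is to slide a window along the sequence and flag windows whose Shannon entropy falls below a cutoff. In the sketch below the function names, the window size, and the threshold are all assumptions chosen for the example, not values taken from the paper.

```python
import math
from collections import Counter


def shannon_entropy(window):
    """Shannon entropy (bits) of the residue composition of a window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_windows(seq, size=12, threshold=2.2):
    """Return 1-based start positions of windows with entropy below threshold.

    Both `size` and `threshold` are illustrative defaults.
    """
    return [
        i + 1
        for i in range(len(seq) - size + 1)
        if shannon_entropy(seq[i:i + size]) < threshold
    ]


# Toy sequence: an Asn-rich run followed by a compositionally diverse stretch.
toy = "N" * 12 + "ACDEFGHIKLMP"
print(low_complexity_windows(toy, size=12, threshold=1.0))
```

A window made of a single residue type has entropy 0, whereas a window of 12 distinct residues approaches log2(12) ≈ 3.58 bits, so the cutoff controls how biased the composition must be before a window is flagged.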

Structure-Based Prediction of Functional Residues in Trm1

Since Trm1 comprises several domains, and since some of the elaborations of the common fold (insertions and terminal extensions) cannot be confidently aligned to their counterparts in known structures, we considered detailed homology modeling unfeasible. However, using the fold recognition results and secondary structure prediction, we were able to predict the architecture of Trm1 and compare it with known structures of MTases. Figure 2 shows a working model of a Trm1 structure, which reflects the mutual orientation of all structural elements and onto which most of the conserved residues can be mapped. By analogy to the known MTase structures, we suggest that motifs I–III are involved in binding of the methyl group donor, while the charged, polar, and aromatic residues from motifs X, IV, and VI form the core of the active site.

Analysis of the Cofactor-Binding Region of the MTase Domain

The AdoMet-binding site is the only region in MTases relatively easily recognized at the sequence level when comparing subfamilies modifying different substrates (Fauman et al., 1999). Motif I, the hallmark of the MTase superfamily, was originally discovered by analysis of DNA:m⁵C MTases (Posfai et al., 1989). It is a glycine-rich sequence that makes a sharp turn to form the pocket that binds the methionine moiety of AdoMet. To date, motif I has been the only common MTase motif correctly recognized in the Trm1 sequence (Constantinesco et al., 1998; Liu et al., 1999). Recently, the high-resolution crystal structure of the rRNA:2′-O-ribose MTase RrmJ showed that a conserved D57 residue from motif I coordinates the methionine amino group of AdoMet via an ordered water molecule (Bugl et al., 2000). The acidic side chain at this position, in strand 1, is conserved in nearly all MTase families analyzed (Fauman et al., 1999). It has been shown to be essential for activity of the cap MTase of yeast (Saha et al., 1999). Correspondingly, it is predicted that the nearly invariant D55 in PfTrm1 takes part in AdoMet binding using the same mechanism. The other conserved Asp residues, predicted to make contacts to AdoMet via the ribose 2′ and 3′ hydroxyl groups and the N⁶ amino group of the adenine moiety, are D80 from motif II and D122 from motif III (Figure 2).

Figure 2. Preliminary structural model of the Trm1 MTase. The protein structure and the simplistic representation of the tRNA molecule are not drawn to scale. Black arrows indicate β-strands, white tubes indicate α-helices. Conserved residues proposed to participate in binding or catalysis are listed; numbers correspond to positions in the PfTrm1 sequence. Specific contacts to functional groups of the cofactor moiety are shown as arrows. The residues predicted to form the catalytic pocket are shown in the vicinity of the guanine nucleotide. The residues predicted to participate in tRNA binding are shown near tRNA. The arrangement of secondary structure elements in the C-terminal domain is arbitrary and does not imply any particular fold.

The region corresponding to motif III in this assignment has been predicted to be the counterpart of motif IV by Constantinesco et al. (1998) (named motif II according to the “alternative” nomenclature of Kagan and Clarke, 1994). However, none of the sequence analysis or threading programs reported such a match. Homology models of the Trm1 MTase domain based on the alignment adjusted to this earlier prediction displayed a disrupted hydrophobic core and a misfolded cofactor-binding region (data not shown). The model presented herein offers a convenient basis for experimental verification of either hypothesis (e.g. by site-directed mutagenesis of predicted AdoMet-binding residues).

Analysis of the Guanine-Binding/Active Site Region of the MTase Domain

The remainder of the core MTase structure exhibits sequence similarities only within groups of MTases that methylate chemically related targets, presumably reflecting distinct catalytic requirements (Fauman et al., 1999). MTases that act on the exocyclic amines of adenine, cytosine or guanine (N-MTases) have the degenerate consensus (N/D/S)-P-P-(F/Y/W/H) in motif IV, which is quite easily recognizable in multiple sequence alignments (Malone et al., 1995; Bujnicki and Radlinska, 1999; Fauman et al., 1999; Bujnicki, 2000). The structure of the DNA:m⁴C MTase M.PvuII led to the proposal that the polar side chain of the N/D/S residue and the main-chain carbonyl of the subsequent Pro form a pair of hydrogen bonds to the target amino group. Concurrently, the aromatic F/Y/W/H makes a face-to-face π-stacking interaction with the target base (Gong et al., 1997). This hypothesis has been confirmed by the recently published protein-DNA cocrystal structure of the DNA:m⁶A MTase M.TaqI (Goedecke et al., 2001).

Threading analysis of the Trm1 sequence andpredicted structure revealed an unambiguous match ofmotif IV in known MTase structures to the D-P-(F/Y)-Gtetrapeptide in Trm1 (Figure 1). Conservation of theD140 residue between Trm1 and N-MTases suggeststhat its side chain makes equivalent contacts to theN-2 amino group of the target guanine, as contacts tothe N-4 group of cytosine and the N-6 group of adenineare made by its counterparts (N/D/S) in DNA MTases.Likewise, we propose that the carbonyl oxygen of theinvariant P141 residue hydrogen bonds to the N-2amino group of G26. Surprisingly, in the Trm1 motif IV,the conserved aromatic residue in the third position(F142 in PfTrm1) is misaligned with respect to thearomatic residue observed in the fourth position inmost N-MTases studied to date (Malone et al., 1995;Fauman et al., 1999; Bujnicki, 2000). We simulated asubstitution of the third and fourth consensus positionsin the M.Taq I MTase structure with the FG dipeptidein silico and found the contact of F142 with the targetbase likely to occur (data not shown). However, the‘‘original’’ position of the aromatic side chain in M.TaqIwith respect to the target base could not be reached bythe F142 phenyl ring, unless the conformation of theP141 backbone was modified. This resulted in disrup-tion of the predicted essential contact to the methy-lated N2 group. Therefore, it is unlikely that F142 stacksagainst the guanine in the same manner as does thehydrophobic side chain in other N-MTases (includingthe family of m2G MTases related to MJ0882 (Bujnicki,2000)) unless Trm1 binds the target base in a differentmode. Another conserved aromatic residue in thevicinity of motif IV corresponds to F148 in PfTrm1;however in all preliminary models based on thethreading alignments this amino acids was buried inthe hydrophobic core and could not make contacts withthe target base (data not shown). 
A conserved Tyr residue (Y184 in PfTrm1) is present in nearly all Trm1 family members in the insertion between motifs VI and VII, which is predicted to form an additional α-helix. However, this region is absent from other known N-MTases and, due to the lack of a suitable structural template, could not be modeled with confidence. It remains to be determined experimentally whether F148, Y184, or both of these residues interact with the target base.
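The two motif IV patterns discussed above can be written as regular expressions, which makes the difference in the position of the aromatic residue explicit. The following is a minimal sketch: the pattern names and the two toy fragments are hypothetical illustrations, not real protein sequences, and a regular expression search is not the threading procedure used in this study.

```python
import re

# Degenerate motif IV consensus of N-MTases: (N/D/S)-P-P-(F/Y/W/H)
N_MTASE_MOTIF_IV = re.compile(r"[NDS]PP[FYWH]")
# Trm1 variant reported in the text: D-P-(F/Y)-G
TRM1_MOTIF_IV = re.compile(r"DP[FY]G")


def find_motif(pattern, sequence):
    """Return (1-based start, matched text) for every non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence)]


# Hypothetical toy fragments, invented for illustration only.
toy_n_mtase = "GAVLNPPYKRT"
toy_trm1 = "LSVIDPFGSPA"

print(find_motif(N_MTASE_MOTIF_IV, toy_n_mtase))  # [(5, 'NPPY')]
print(find_motif(TRM1_MOTIF_IV, toy_trm1))        # [(5, 'DPFG')]
print(find_motif(N_MTASE_MOTIF_IV, toy_trm1))     # []
```

The last call shows why a search with the N-MTase consensus alone would miss the Trm1 tetrapeptide: the aromatic residue occupies the third rather than the fourth position.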

Other conserved residues of PfTrm1 predicted to line the catalytic pocket and the tRNA-binding surface of the MTase domain include T167, D168, E196, and R200 from motifs VI and VII, as well as the tripeptides FYN31 and NRD37 from motif X. Threading the Trm1 sequence onto the structure of MJ0882 suggests that F29 or Y30 are likely candidates to make van der Waals interactions with the opposite face of the target guanine. Another highly conserved charged residue that mapped to the catalytic face of the model is the Glu from the highly conserved EG11 dipeptide. In ScTrm1, substitution of E47 and G48 with alanine resulted in 10-fold reduced activity (Liu and Straby, 1998), suggesting that these residues are important but not essential for catalysis. The "EG motif" is located in a region that has no conserved counterparts in structurally characterized MTases; however, its predicted β-hairpin structure (Figures 1 and 3) can be aligned with its counterpart in the MJ0882 template. We predict that this region in Trm1 packs against the three helices of the AdoMet-binding subdomain, just as the N-terminal β-hairpin does in the MJ0882 structure. However, prediction of the exact location of the EG11 dipeptide with respect to the active site of PfTrm1 (Figure 2) is beyond the limits of the current model.

Figure 3. Unrooted evolutionary tree of the Trm1 family. Numbers at the nodes indicate the statistical support of the branching order by the bootstrap criterion. The nodes with bootstrap support < 50% are shown as unresolved. The bar at the bottom of the phylogram indicates the evolutionary distance, to which the branch lengths are scaled based on the estimated divergence.

Conservation of the "DP" part of motif IV and the presence of conserved aromatic residues in the vicinity of the predicted active site suggest that Trm1 and its homologs employ a mechanism of methylation similar to that observed in DNA m4C and m6A MTases. Hence, these MTases likely flip G26 out of the anticodon stem to place it in the catalytic pocket. It should be noted that the N2 group of guanine is located in the minor groove of the double-stranded nucleic acid, unlike the N4 group of cytosine or the N6 group of adenine, which are located in the major groove. This suggests that the flipping mechanism, if relevant, may be significantly modified compared to DNA N-MTases. It is tempting to speculate that the two subsequent methylations carried out by Trm1 may require only two flipping events, between which the cofactor molecule is exchanged. We hope that the presented model will stimulate experimental studies aimed at resolving this issue.

Data on mutations that interfere with Trm1 activity are available for only one residue in the predicted MTase domain. Mutations R246G in CaTrm1 and K290A in ScTrm1, which cause a severe loss of m2,2G MTase activity (Liu et al., 1999), map to the same position, corresponding to L215 in PfTrm1 (Figure 1). In our model, this residue is located in strand β7, which is crucial for the integrity of the MTase core. The basic residue at this position is not conserved in the Trm1 family and is quite remote from the predicted active site. However, replacing the large side chains of Arg or Lys with Gly or Ala will likely create a large cavity in the protein core, which would strongly destabilize the whole protein structure.

Analysis of the C-Terminal Domain

In many nucleic acid MTase families, additional domains appended at the N- or C-termini or inserted into the catalytic domain have been implicated in target recognition (reviewed by Fauman et al., 1999). The "variable" region essential for sequence-specific target recognition was originally described in DNA:m5C MTases (Lauster et al., 1989) and dubbed the "target recognition domain" (TRD). In some RNA MTase families, predicted TRDs resemble known nucleic acid binding proteins (Aravind and Koonin, 1999; Bujnicki, 2000; Bujnicki et al., 2001a), but often they lack similarity to proteins in the database and their structures cannot be predicted with confidence (Bujnicki and Rychlewski, 2000, 2001).

According to secondary structure prediction, the common C-terminus of Trm1 is rich in α-helices (Figure 1). An α-helical TRD has been identified in the C-terminus of two structurally characterized members of the rRNA:m6A MTase family (Yu et al., 1997; Bussiere et al., 1998). However, we could not detect any statistically significant similarity between the C-terminal domains of the rRNA:m6A family and the Trm1 family using either threading or software for sequence profile comparisons. We found that a part of the C-terminal region of Trm1 matched the structures of the "winged helix" (wH) family of nucleic acid-binding proteins, with the SmtB repressor (1smt in PDB) reported as the top hit by the consensus server (Pcons score 4.06; data not shown). However, this region appeared to be only a part of a large helical extension and exhibited a very different conservation pattern from that seen in "classical" wH proteins (reviewed by Gajiwala and Burley, 2000). Hence, three-dimensional structure prediction of the C-terminal domain must await experimental support. The S467L mutation, which abolished all MTase activity in ScTrm1 (Liu and Straby, 1998), maps to this region (T356 in PfTrm1, Figure 2); however, the role of other conserved residues remains to be tested by mutagenesis. We hope that the presented alignment with predicted secondary structures will serve as a useful platform for planning experiments to identify the tRNA-binding determinants in Trm1.

Evolutionary Analysis

In addition to the Trm1 orthologs, uncharacterized Trm1 homologs from humans and plants were detected. Moreover, Trm1 homologs were also found in the unfinished genomic sequences of the cyanobacteria Prochlorococcus marinus MIT9313 and Synechococcus sp. WH8102. Sequence analysis revealed that these proteins possess the central Cys4 cluster domain, similar to the orthodox Trm1 family members from Eukaryota and A. aeolicus (Figure 1). The presence of a shorter version of the insertion corresponding to the Cys4 domain in some Archaea and its absence from the others (Figure 1) was puzzling. It was unclear whether this "patchy" distribution is due to multiple losses or acquisitions of the Cys4 module, or to horizontal transfer and mutual replacement of the Trm1 orthologs between the archaeal species.

To identify evolutionary relationships in the Trm1 family, phylogenetic trees were inferred from the alignment (as described in Experimental Procedures). The final tree (shown in Figure 3) was generated based on the conserved catalytic domain (aa 27–235 in PfTrm1), excluding regions with gaps in more than 50% of sequences. Still, for some branches, the bootstrap support was < 50%. However, the orthodox eukaryotic Trm1 proteins clustered apart from all others, and the branch supporting this pattern received highly significant bootstrap support (98.7% without and 91.5% with the sequences of P. falciparum and G. intestinalis). Most of the sequences from this group share a common C-terminal extension (not shown), which can be regarded as a likely synapomorphy, i.e. a shared feature derived from a common ancestor. Although the entire Trm1 family tree was unrooted, the eukaryotic family served as an outgroup to root the remainder of the tree, and vice versa: the remaining Trm1 homologs served to root the eukaryotic family (Figure 3). The trees for the two branches were recalculated, and the results confirmed that the deep branches with bootstrap support of 50–60% were correctly resolved in the tree using all sequences.
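The exclusion of gap-rich regions before tree inference can be illustrated with a short column filter. This is a minimal sketch under stated assumptions: the function name, the "-" gap character, and the toy alignment are hypothetical, and the filter works column by column, whereas the text speaks of excluded regions.

```python
def filter_columns(alignment, max_gap_fraction=0.5, gap="-"):
    """Keep only columns whose fraction of gaps does not exceed max_gap_fraction."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    n_seqs = len(alignment)
    keep = [
        i for i in range(length)
        if sum(seq[i] == gap for seq in alignment) / n_seqs <= max_gap_fraction
    ]
    return ["".join(seq[i] for i in keep) for seq in alignment]


# Hypothetical toy alignment (four sequences, five columns).
toy = ["MK-LV", "MR-LV", "M--LI", "MKALV"]

# Column 3 has gaps in 3 of 4 sequences (75%) and is dropped.
print(filter_columns(toy))                        # ['MKLV', 'MRLV', 'M-LI', 'MKLV']
# A threshold of 0 removes every column containing a gap.
print(filter_columns(toy, max_gap_fraction=0.0))  # ['MLV', 'MLV', 'MLI', 'MLV']
```

Setting the threshold to zero gives a gapless alignment, the stricter criterion that suits building a profile rather than inferring a tree.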

The newly identified, uncharacterized Trm1 homologs from Cyanobacteria, plants, and vertebrates group together in a separate subfamily receiving 76% bootstrap support. It is known that strongly diverged sequences producing long branches appear to attract one another due to the increasing level of homoplasies, and can be erroneously inferred to be too closely related (reviewed by Lyons-Weiler and Hoelzer, 1997). However, the same branching pattern was observed in a tree generated using the maximum likelihood approach, which is known to be less sensitive to differences in evolutionary rate among lineages (Hasegawa et al., 1991). Members of this subfamily exhibited no common deviation from the consensus pattern of conserved residues; therefore, no specific modification of function can be inferred. It remains to be determined experimentally whether these proteins function as m2,2G MTases and whether their specificity differs from that of the orthodox Trm1 family members.

The topology of the archaeal branch is largely consistent with the topology of the trees based on SSU rRNA (Takai and Horikoshi, 1999) and large combined protein sequence data sets (Brown et al., 2001). Similar congruence of the protein and organismal phylogenies is seen in the orthodox eukaryotic Trm1 family members, suggesting that the ancestor of the Trm1 family was present in the last common ancestor of Eukaryota and Archaea. The archaeal proteins lacking the Cys4 cluster were mapped to the tree (Figure 3), showing no correlation with the organismal phylogeny. It could be argued that this domain originated independently multiple times, but the Cys4 cluster is present in all proteins from both "eukaryotic" lineages and its archaeal versions are similar to each other despite being rather isolated on the phylogenetic landscape. Hence, the most parsimonious hypothesis is that this domain is an ancient feature and has been independently lost in several lineages. Similar loss of metal-binding Cys residues has been observed in other protein families, for instance among the ββα-Me nucleases, i.e. a subclass of the treble clef Zn-finger superfamily (Grishin, 2001).

Concluding Remarks

Sequence analysis and structure prediction of the Trm1 family suggest that these proteins comprise two or more domains responsible for binding to tRNA, binding the cofactor, and catalyzing a methyl transfer reaction. A preliminary structural model of Trm1 has been developed, onto which most of the conserved residues were mapped. This is the first prediction of RNA-binding and catalytic residues for Trm1, and it may therefore serve as a guide for site-directed mutagenesis experiments. We hope that our analysis will stimulate cloning and functional characterization of a new family of Trm1 homologs, as well as detailed studies on the mechanism of tRNA recognition, possible base flipping, and the role of the Cys4 cluster domain, especially in the archaeal Trm1 homologs.

Experimental Procedures

Sequence Database Searches

The position-specific, iterative (PSI-) version of BLAST (Altschul et al., 1997; Schaffer et al., 2001) was used to search the protein sequence databases. The non-redundant (nr) database and translations of the EST (expressed sequence tag), STS (sequence-tagged site), HTG (high throughput genomic) and GSS (genome survey sequence) divisions of the GenBank database (Wheeler et al., 2001), as well as the publicly available complete and incomplete genome sequences, were searched via the websites of the Bioinformatics Laboratory of the International Institute of Molecular and Cell Biology (Warsaw, Poland; http://blast.bioinfo.pl) and the National Center for Biotechnology Information (Bethesda, USA; http://www.ncbi.nlm.nih.gov). The unfinished genomic sequences of Eubacteria and Archaea used in this work were obtained from the U.S. Department of Energy Joint Genome Institute (http://www.jgi.doe.gov), with the exception of the unfinished Pyrobaculum aerophilum sequence, which was analyzed via the ERGO database website (http://wit.integratedgenomics.com/ERGO/). The unfinished genomic sequences of the fungi Aspergillus fumigatus and Candida albicans were obtained from the Institute for Genomic Research (http://www.tigr.org) and the Stanford Genome Technology Center (http://sequence-www.stanford.edu), respectively. Searches were generally initiated with a stringent expectation (e) value threshold of 10^-20 for inclusion in the profile, to avoid a multitude of hits to the common AdoMet-binding region of multiple paralogous MTase families. The e-value threshold was relaxed in subsequent iterations as the profiles became balanced.

Fragments of sequences were assembled using the sequences of genuine Trm1 MTases as guides; the predicted mRNA splice sites were verified using reciprocal BLAST searches against the database comprising sequences of Trm1 MTases. The multiple sequence alignment was retrieved from the PSI-BLAST output using BIB-VIEW (http://bioinfo.pl/bibview.pl), and all columns with gaps were removed. The gapless sequence alignment was then used as a profile, to which all the full-length sequences were realigned using CLUSTALX (Thompson et al., 1997). Manual adjustments were introduced based on the BLAST pairwise comparisons, secondary structure predictions, and threading results.

Structure Prediction

Protein structure predictions were performed via the Meta Server interface (Bujnicki et al., 2001b) (http://bioinfo.pl/meta/) using publicly available online services for fold recognition: FFAS (Rychlewski et al., 2000), 3DPSSM (Kelley et al., 2000), BIOINBGU (Fischer, 2000), GenThreader (Jones, 1999), SAM-T99 (Karplus et al., 1999), and FUGUE (Shi et al., 2001), and secondary structure prediction methods: JPRED2 (Cuff et al., 1999), SAM-T99 (Karplus et al., 1999) and PSI-PRED (McGuffin et al., 2000). Results were further processed by the Pcons neural network (Lundstrom et al., 2001), which compares the structural models and the associated scores produced by the individual servers, and produces a ranking of potentially best predictions including confidence scores. The performance of the Meta Server/Pcons system and the participating fold recognition servers has been extensively benchmarked, demonstrating that they provide a reliable tool for detection of distant evolutionary relationships (see http://bioinfo.pl/LiveBench, Bujnicki et al., 2001c). At a Pcons score of 3, 87% of the threading models are correct, while at a score of 5, 98% of the models are correct.
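To show how the benchmark figures above bear on the scores reported in this paper, the sketch below maps a Pcons score to the accuracy quoted for the highest benchmark threshold it reaches. The step-function reading and the function name are assumptions made for illustration; the benchmark reports accuracy at the two stated scores only and is not a calibrated curve.

```python
# (score threshold, fraction of correct models) as quoted in the text,
# ordered from the highest threshold to the lowest.
PCONS_BENCHMARK = [(5.0, 0.98), (3.0, 0.87)]


def expected_accuracy(score):
    """Return the quoted accuracy for the highest threshold reached, else None."""
    for threshold, accuracy in PCONS_BENCHMARK:
        if score >= threshold:
            return accuracy
    return None  # below the lowest benchmarked threshold


# Scores reported in the paper: 10.35 (N-terminal MTase fold match)
# and 4.06 (winged-helix match in the C-terminal region).
print(expected_accuracy(10.35))  # 0.98
print(expected_accuracy(4.06))   # 0.87
print(expected_accuracy(2.0))    # None
```

Read this way, the MTase fold assignment falls in the most reliable class, whereas the winged-helix match lies between the two thresholds, consistent with the caution expressed about the C-terminal domain.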

Preliminary homology modeling for individual domains and their fragments was carried out based on pairwise target-template alignments produced by the Meta Server. SWISS-PDB VIEWER was used for all manipulations with protein structures and all calculations of electrostatic charge distribution (Guex and Peitsch, 1997). The SWISS-MODEL/PROMOD II server (Guex and Peitsch, 1997) and the GROMOS force field (Scott et al., 1999) were used for modeling and energy minimization. Evaluation of the stereochemical and the energetic parameters of models was carried out using the PROSAII module (Sippl, 1993) embedded within the PROMOD server.

Phylogenetic Analysis

The evolutionary inference was carried out according to the neighbor-joining (NJ) method (Saitou and Nei, 1987) using the algorithm implemented in BIONJ (Gascuel, 1997) and the maximum likelihood (ML) method, as implemented in PAML (Yang, 1997). The NJ trees were used to initiate the computationally more expensive ML search. The software was run via the Institut Pasteur website (http://bioweb.pasteur.fr). The number of amino acid replacements per sequence position in the alignment was estimated using the JTT model (Jones et al., 1992). Multiple runs were conducted with a randomized sequence input order to avoid the tree being caught in a local statistical minimum. The sampling variance of the distance values was estimated from 1000 bootstrap resamplings of the alignment columns, and the nodes with bootstrap support < 50% were collapsed. The phylogenetic groupings observed in the distance matrix-based tree were correlated with the presence of sequence signatures that can be regarded as synapomorphies (shared features derived from a common ancestor).
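The bootstrap step described above amounts to drawing alignment columns with replacement and rebuilding a tree from each pseudo-alignment. The sketch below covers only the resampling and the 50% cutoff. The function names, the toy alignment, and the node labels are hypothetical, and the tree inference itself (BIONJ or PAML in this study) is not reproduced.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample columns with replacement, keeping the original alignment length."""
    length = len(alignment[0])
    cols = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[i] for i in cols) for seq in alignment]


def bootstrap_replicates(alignment, n_replicates=1000, seed=0):
    """Generate n_replicates pseudo-alignments from one seeded random generator."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


def nodes_to_collapse(support_by_node, cutoff=50.0):
    """Return the labels of nodes whose bootstrap support (%) is below the cutoff."""
    return sorted(node for node, s in support_by_node.items() if s < cutoff)


# Hypothetical toy alignment; each replicate would be passed to a tree builder.
toy = ["MKLVD", "MRLVE", "MKLID"]
replicates = bootstrap_replicates(toy, n_replicates=1000, seed=1)
print(len(replicates))  # 1000

# Hypothetical support values (%), e.g. the share of replicate trees with the node.
print(nodes_to_collapse({"A": 98.7, "B": 76.0, "C": 42.0}))  # ['C']
```

Because whole columns are sampled, each replicate keeps the residues of a given site together across all sequences, which is what allows the replicate trees to estimate the sampling variance of the original alignment.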

Acknowledgements

We are grateful to H. Grosjean (CNRS, France) for stimulating discussions and critical reading of the manuscript. We would also like to thank all the genome sequencing groups that make their preliminary data publicly available, without which this work could not have been done. In particular, the use of data generated at the Department of Energy Joint Genome Institute, the Institute for Genomic Research, the Stanford Genome Technology Center, and the California Institute of Technology is gratefully acknowledged. This work was supported by KBN (grant 8T11F01019 to J.M.B.).

References

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25: 3389–3402.

Aravind, L., and Koonin, E.V. 1999. Novel predicted RNA-binding domains associated with the translation machinery. J. Mol. Evol. 48: 291–302.

Bjork, G.R., Durand, J.M., Hagervall, T.G., Leipuviene, R., Lundgren, H.K., Nilsson, K., Chen, P., Qian, Q., and Urbonavicius, J. 1999. Transfer RNA modification: influence on translational frameshifting and metabolism. FEBS Lett. 452: 47–51.

Brown, J.R., Douady, C.J., Italia, M.J., Marshall, W.E., and Stanhope, M.J. 2001. Universal trees based on large combined protein sequence data sets. Nat. Genet. 28: 281–285.

Bugl, H., Fauman, E.B., Staker, B.L., Zheng, F., Kushner, S.R., Saper, M.A., Bardwell, J.C., and Jakob, U. 2000. RNA methylation under heat shock control. Mol. Cell. 6: 349–360.

Bujnicki, J.M. 2000. Phylogenomic analysis of 16S rRNA:(guanine-N2) methyltransferases suggests new family members and reveals highly conserved motifs and a domain structure similar to other nucleic acid amino-methyltransferases. FASEB J. 14: 2365–2368.

Bujnicki, J.M. 2001. Sequence-structure-function relationships in evolution of nucleic acid modification enzymes. Ph.D. thesis, Warsaw University.

Bujnicki, J.M., Blumenthal, R.M., and Rychlewski, L. 2001a. Sequence analysis and structure prediction of 23S rRNA:m1G methyltransferases reveals a conserved core augmented with a putative Zn-binding domain in the N-terminus and family-specific elaborations in the C-terminus. J. Mol. Microbiol. Biotechnol. (in press).

Bujnicki, J.M., Elofsson, A., Fischer, D., and Rychlewski, L. 2001b. Structure prediction Meta Server. Bioinformatics 17: 750–751.

Bujnicki, J.M., Elofsson, A., Fischer, D., and Rychlewski, L. 2001c. LiveBench-1: continuous benchmarking of protein structure prediction servers. Protein Sci. 10: 352–361.

Bujnicki, J.M., and Radlinska, M. 1999. Molecular evolution of DNA-(cytosine-N4) methyltransferases: evidence for their polyphyletic origin. Nucleic Acids Res. 27: 4501–4509.

Bujnicki, J.M., and Rychlewski, L. 2000. Prediction of a novel RNA 2'-O-ribose methyltransferase subfamily encoded by the Escherichia coli YgdE open reading frame and its orthologs. Acta Microbiol. Pol. 49: 253–260.

Bujnicki, J.M., and Rychlewski, L. 2001. Sequence analysis and structure prediction of aminoglycoside-resistance 16S rRNA:m7G methyltransferases. Acta Microbiol. Pol. 50: 7–17.

Bussiere, D.E., Muchmore, S.W., Dealwis, C.G., Schluckebier, G., Nienaber, V.L., Edalji, R.P., Walter, K.A., Ladror, U.S., Holzman, T.F., and Abad-Zapatero, C. 1998. Crystal structure of ErmC', an rRNA methyltransferase which mediates antibiotic resistance in bacteria. Biochemistry 37: 7103–7112.

Constantinesco, F., Benachenhou, N., Motorin, Y., and Grosjean, H. 1998. The tRNA(guanine-26,N2-N2) methyltransferase (Trm1) from the hyperthermophilic archaeon Pyrococcus furiosus: cloning, sequencing of the gene and its expression in Escherichia coli. Nucleic Acids Res. 26: 3753–3761.

Constantinesco, F., Motorin, Y., and Grosjean, H. 1999. Characterisation and enzymatic properties of tRNA(guanine-26, N2-N2)-dimethyltransferase (Trm1p) from Pyrococcus furiosus. J. Mol. Biol. 291: 375–392.

Cuff, J.A., Clamp, M.E., Siddiqui, A.S., Finlay, M., and Barton, G.J. 1999. JPred: a consensus secondary structure prediction server. Bioinformatics 14: 892–893.

Curran, J.F. 1998. Modified nucleosides in translation. In: Modification and editing of RNA. Grosjean, H., and Benne, R. (eds). ASM Press, Washington, DC, pp. 493–516.

Deckert, G., Warren, P.V., Gaasterland, T., Young, W.G., Lenox, A.L., Graham, D.E., Overbeek, R., Snead, M.A., Keller, M., Aujay, M., Huber, R., Feldman, R.A., Short, J.M., Olsen, G.J., and Swanson, R.V. 1998. The complete genome of the hyperthermophilic bacterium Aquifex aeolicus. Nature 392: 353–358.

Edqvist, J., Straby, K.B., and Grosjean, H. 1995. Enzymatic formation of N2,N2-dimethylguanosine in eukaryotic tRNA: importance of the tRNA architecture. Biochimie 77: 54–61.

Ellis, S.R., Hopper, A.K., and Martin, N.C. 1987. Amino-terminal extension generated from an upstream AUG codon is not required for mitochondrial import of yeast N2,N2-dimethylguanosine-specific tRNA methyltransferase. Proc. Natl. Acad. Sci. U.S.A. 84: 5172–5176.

Ellis, S.R., Hopper, A.K., and Martin, N.C. 1989. Amino-terminal extension generated from an upstream AUG codon increases the efficiency of mitochondrial import of yeast N2,N2-dimethylguanosine-specific tRNA methyltransferases. Mol. Cell. Biol. 9: 1611–1620.

Ellis, S.R., Morales, M.J., Li, J.M., Hopper, A.K., and Martin, N.C. 1986. Isolation and characterization of the TRM1 locus, a gene essential for the N2,N2-dimethylguanosine modification of both mitochondrial and cytoplasmic tRNA in Saccharomyces cerevisiae. J. Biol. Chem. 261: 9703–9709.

Fauman, E.B., Blumenthal, R.M., and Cheng, X. 1999. Structure and evolution of AdoMet-dependent MTases. In: S-Adenosylmethionine-dependent methyltransferases: structures and functions. Cheng, X., and Blumenthal, R.M. (eds). World Scientific Inc., Singapore, pp. 1–38.

Fischer, D. 2000. Hybrid fold recognition: combining sequence derived properties with evolutionary information. Pac. Symp. Biocomput. 119–130.

Gajiwala, K.S., and Burley, S.K. 2000. Winged helix proteins. Curr. Opin. Struct. Biol. 10: 110–116.

Gascuel, O. 1997. BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Mol. Biol. Evol. 14: 685–695.

Goedecke, K., Pignot, M., Goody, R.S., Scheidig, A.J., and Weinhold, E. 2001. Structure of the N6-adenine DNA methyltransferase M.TaqI in complex with DNA and a cofactor analog. Nat. Struct. Biol. 8: 121–125.

Gong, W., O'Gara, M., Blumenthal, R.M., and Cheng, X. 1997. Structure of PvuII DNA-(cytosine N4) methyltransferase, an example of domain permutation and protein fold assignment. Nucleic Acids Res. 25: 2702–2715.

Grishin, N.V. 2001. Treble clef finger–a functionally diverse zinc-binding structural motif. Nucleic Acids Res. 29: 1703–1714.

Grosjean, H., and Benne, R. 1998. Modification and editing of RNA. ASM Press, Washington, DC.

Grosjean, H., Sprinzl, M., and Steinberg, S. 1995. Posttranscriptionally modified nucleosides in transfer RNA: their locations and frequencies. Biochimie 77: 139–141.

Guex, N., and Peitsch, M.C. 1997. SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis 18: 2714–2723.

Hasegawa, M., Kishino, H., and Saitou, N. 1991. On the maximum likelihood method in molecular phylogenetics. J. Mol. Evol. 32: 443–445.

Jones, D.T. 1999. GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequences. J. Mol. Biol. 287: 797–815.

Jones, D.T., Taylor, W.R., and Thornton, J.M. 1992. The rapid generation of mutation data matrices from protein sequences. Comput. Appl. Biosci. 8: 275–282.

Kagan, R.M., and Clarke, S. 1994. Widespread occurrence of three sequence motifs in diverse S-adenosylmethionine-dependent methyltransferases suggests a common structure for these enzymes. Arch. Biochem. Biophys. 310: 417–427.

Karplus, K., Barrett, C., Cline, M., Diekhans, M., Grate, L., and Hughey, R. 1999. Predicting protein structure using only sequence information. Proteins Suppl. 3: 121–125.

Kelley, L.A., McCallum, C.M., and Sternberg, M.J. 2000. Enhanced genome annotation using structural profiles in the program 3D-PSSM. J. Mol. Biol. 299: 501–522.

Klimasauskas, S., Timinskas, A., Menkevicius, S., Butkiene, D., Butkus, V., and Janulaitis, A. 1989. Sequence motifs characteristic of DNA[cytosine-N4]methyltransferases: similarity to adenine and cytosine-C5 DNA-methylases. Nucleic Acids Res. 17: 9823–9832.

Lauster, R., Trautner, T.A., and Noyer-Weidner, M. 1989. Cytosine-specific type II DNA methyltransferases. A conserved enzyme core with variable target-recognizing domains. J. Mol. Biol. 206: 305–312.

Liu, J., and Straby, K.B. 1998. Point and deletion mutations eliminate one or both methyl group transfers catalysed by the yeast TRM1 encoded tRNA (m2,2G26)dimethyltransferase. Nucleic Acids Res. 26: 5102–5108.

Liu, J., and Straby, K.B. 2000. The human tRNA(m2,2G(26))dimethyltransferase: functional expression and characterization of a cloned hTRM1 gene. Nucleic Acids Res. 28: 3445–3451.

Liu, J., Zhou, G.Q., and Straby, K.B. 1999. Caenorhabditis elegans ZC376.5 encodes a tRNA (m2,2G(26))dimethyltransferance in which (246)arginine is important for the enzyme activity. Gene 226: 73–81.

Lundstrom, J., Rychlewski, L., Bujnicki, J.M., and Elofsson, A. 2001. Pcons: a neural network based consensus predictor that improves fold recognition. Protein Sci. 10: 2354–2362.

Lyons-Weiler, J., and Hoelzer, G.A. 1997. Escaping from the Felsenstein zone by detecting long branches in phylogenetic data. Mol. Phylogenet. Evol. 8: 375–384.

Malone, T., Blumenthal, R.M., and Cheng, X. 1995. Structure-guided analysis reveals nine sequence motifs conserved among DNA amino-methyltransferases, and suggests a catalytic mechanism for these enzymes. J. Mol. Biol. 253: 618–632.

McGuffin, L.J., Bryson, K., and Jones, D.T. 2000. The PSIPRED protein structure prediction server. Bioinformatics 16: 404–405.

Niederberger, C., Graub, R., Costa, A., Desgres, J., and Schweingruber, M.E. 1999. The tRNA N2,N2-dimethylguanosine-26 methyltransferase encoded by gene trm1 increases efficiency of suppression of an ochre codon in Schizosaccharomyces pombe. FEBS Lett. 464: 67–70.

Pizzi, E., and Frontali, C. 2001. Low-complexity regions in Plasmodium falciparum proteins. Genome Res. 11: 218–229.

Posfai, J., Bhagwat, A.S., Posfai, G., and Roberts, R.J. 1989. Predictive motifs derived from cytosine methyltransferases. Nucleic Acids Res. 17: 2421–2435.

Reinhart, M.P., Lewis, J.M., and Leboy, P.S. 1986. A single tRNA (guanine)-methyltransferase from Tetrahymena with both mono- and di-methylating activity. Nucleic Acids Res. 14: 1131–1148.

Rose, A.M., Joyce, P.B., Hopper, A.K., and Martin, N.C. 1992. Separate information required for nuclear and subnuclear localization: additional complexity in localizing an enzyme shared by mitochondria and nuclei. Mol. Cell. Biol. 12: 5652–5658.

Rozenski, J., Crain, P.F., and McCloskey, J.A. 1999. The RNA Modification Database: 1999 update. Nucleic Acids Res. 27: 196–197.

Rychlewski, L., Jaroszewski, L., Li, W., and Godzik, A. 2000. Comparison of sequence profiles. Strategies for structural predictions using sequence information. Protein Sci. 9: 232–241.

Saha, N., Schwer, B., and Shuman, S. 1999. Characterization of human, Schizosaccharomyces pombe, and Candida albicans mRNA cap methyltransferases and complete replacement of the yeast capping apparatus by mammalian enzymes. J. Biol. Chem. 274: 16553–16562.

Saitou, N., and Nei, M. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4: 406–425.

Schaffer, A.A., Aravind, L., Madden, T.L., Shavirin, S., Spouge, J.L., Wolf, Y.I., Koonin, E.V., and Altschul, S.F. 2001. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res. 29: 2994–3005.

Scott, W.R.P., Hunenberger, P.H., Tironi, I.G., Mark, A.E., Billeter, S.R., Fennen, J., Torda, A.E., Huber, T., Kruger, P., and van Gunsteren, W.F. 1999. The GROMOS biomolecular simulation program package. J. Phys. Chem. 103: 3596–3607.

Shi, J., Blundell, T.L., and Mizuguchi, K. 2001. Fugue: sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties. J. Mol. Biol. 310: 243–257.

Sippl, M.J. 1993. Recognition of errors in three-dimensional structures of proteins. Proteins 17: 355–362.

Sood, R., Bonner, T.I., Makalowska, I., Stephan, D.A., Robbins, C.M., Connors, T.D., Morgenbesser, S.D., Su, K., Faruque, M.U., Pinkett, H., Graham, C., Baxevanis, A.D., Klinger, K.W., Landes, G.M., Trent, J.M., and Carpten, J.D. 2001. Cloning and characterization of 13 novel transcripts and the human RGS8 gene from the 1q25 region encompassing the hereditary prostate cancer (HPC1) locus. Genomics 73: 211–222.

Sprinzl, M., Horn, C., Brown, M., Ioudovitch, A., and Steinberg, S. 1998. Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res. 26: 148–153.

Stanford, D.R., Martin, N.C., and Hopper, A.K. 2000. ADEPTs: information necessary for subcellular distribution of eukaryotic sorting isozymes resides in domains missing from eubacterial and archaeal counterparts. Nucleic Acids Res. 28: 383–392.

Steinberg, S., and Cedergren, R. 1994. Structural compensation in atypical mitochondrial tRNAs. Nat. Struct. Biol. 1: 507–510.

Steinberg, S., and Cedergren, R. 1995. A correlation between N2-dimethylguanosine presence and alternate tRNA conformers. RNA 1: 886–891.

Takai, K., and Horikoshi, K. 1999. Genetic diversity of archaea in deep-sea hydrothermal vent environments. Genetics 152: 1285–1297.

Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., and Higgins, D.G. 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25: 4876–4882.

Wheeler, D.L., Church, D.M., Lash, A.E., Leipe, D.D., Madden, T.L., Pontius, J.U., Schuler, G.D., Schriml, L.M., Tatusova, T.A., Wagner, L., and Rapp, B.A. 2001. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 29: 11–16.

Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13: 555–556.

Yu, L., Petros, A.M., Schnuchel, A., Zhong, P., Severin, J.M., Walter, K., Holzman, T.F., and Fesik, S.W. 1997. Solution structure of an rRNA methyltransferase (ErmAM) that confers macrolide-lincosamide-streptogramin antibiotic resistance. Nat. Struct. Biol. 4: 483–489.
