BioMed Central
BMC Genomics
Open Access
Research article
From biomedicine to natural history research: EST resources for ambystomatid salamanders
Srikrishna Putta†1, Jeramiah J Smith†1, John A Walker†1, Mathieu Rondet2, David W Weisrock1, James Monaghan1, Amy K Samuels1, Kevin Kump1, David C King3, Nicholas J Maness4, Bianca Habermann5, Elly Tanaka6, Susan V Bryant2, David M Gardiner2, David M Parichy7 and S Randal Voss*1
Address: 1Department of Biology, University of Kentucky, Lexington, KY 40506, USA, 2Department of Developmental and Cell Biology and the Developmental Biology Center, University of California, Irvine, CA 92697, USA, 3The Life Sciences Consortium, 519 Wartik Laboratory, Penn State University, University Park, PA 16802, USA, 4Department of Zoology, University of Wisconsin-Madison, 250 N. Mills, Madison, WI 53706, USA, 5Scionics Computer Innovation GmbH, Pfotenhauerstrasse 110, 01307 Dresden, Germany, 6Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, 01307 Dresden, Germany and 7Section of Integrative Biology and Section of Molecular, Cell and Developmental Biology, Institute for Cellular and Molecular Biology, University of Texas, Austin, TX 78712, USA
Abstract
Background: Establishing genomic resources for closely related species will provide comparative insights that are crucial for understanding diversity and variability at multiple levels of biological organization. We developed ESTs for Mexican axolotl (Ambystoma mexicanum) and Eastern tiger salamander (A. tigrinum tigrinum), species with deep and diverse research histories.
Results: Approximately 40,000 quality cDNA sequences were isolated for these species from various tissues, including regenerating limb and tail. These sequences and an existing set of 16,030 cDNA sequences for A. mexicanum were processed to yield 35,413 and 20,599 high quality ESTs for A. mexicanum and A. t. tigrinum, respectively. Because the A. t. tigrinum ESTs were obtained primarily from a normalized library, an approximately equal number of contigs were obtained for each species, with 21,091 unique contigs identified overall. The 10,592 contigs that showed significant similarity to sequences from the human RefSeq database reflected a diverse array of molecular functions and biological processes, with many corresponding to genes expressed during spinal cord injury in rat and fin regeneration in zebrafish. To demonstrate the utility of these EST resources, we searched databases to identify probes for regeneration research, characterized intra- and interspecific nucleotide polymorphism, saturated a human–Ambystoma synteny group with marker loci, and extended PCR primer sets designed for A. mexicanum / A. t. tigrinum orthologues to a related tiger salamander species.
Conclusions: Our study highlights the value of developing resources in traditional model systems where the likelihood of information transfer to multiple, closely related taxa is high, thus simultaneously enabling both laboratory and natural history research.
Background
Establishing genomic resources for closely related species will provide comparative insights that are crucial for understanding diversity and variability at multiple levels of biological organization. Expressed sequence tags (ESTs) are particularly useful genomic resources because they enable multiple lines of research and can be generated for any organism: ESTs allow the identification of molecular probes for developmental studies, provide clones for DNA microchip construction, reveal candidate genes for mutant phenotypes, and facilitate studies of genome structure and evolution. Furthermore, ESTs provide raw material from which strain-specific polymorphisms can be identified for use in population and quantitative genetic analyses. The utility of such resources can be tailored to target novel characteristics of organisms when ESTs are isolated from cell types and tissues that are actively being used by a particular research community, so as to bias the collection of sequences towards genes of special interest. Finally, EST resources produced for model organisms can greatly facilitate comparative and evolutionary studies when their uses are extended to other, closely related taxa.
Salamanders (urodele amphibians) are traditional model organisms whose popularity was unsurpassed early in the 20th century. At their pinnacle, salamanders were the primary model for early vertebrate development. Embryological studies in particular revealed many basic mechanisms of development, including organizer and inducer regions of developing embryos [1]. Salamanders continue to be important vertebrate model organisms for regeneration because they have by far the greatest capacity to regenerate complex body parts in the adult phase. In contrast to mammals, which are not able to regenerate entire structures or organ systems upon injury or amputation, adult salamanders regenerate their limbs, tail, lens, retina, spinal cord, heart musculature, and jaw [2-7]. In addition, salamanders are the model of choice in a diversity of areas, including vision, embryogenesis, heart development, olfaction, chromosome structure, evolution, ecology, science education, and conservation biology [8-15]. All of these disciplines are in need of genomic resources, as fewer than 4100 salamander nucleotide sequences had been deposited in GenBank as of 3/10/04.
Here we describe results from an EST project for two ambystomatid salamanders: the Mexican axolotl, Ambystoma mexicanum, and the eastern tiger salamander, A. tigrinum tigrinum. These two species are members of the Tiger Salamander Complex [16], a group of closely related species and subspecies that are widely distributed in North America. Phylogenetic reconstruction suggests that these species probably arose from a common ancestor about 10–15 million years ago [16]. Ambystoma mexicanum has a long research history of over 100 years and is now principally supplied to the research community by the Axolotl Colony [17], while A. t. tigrinum is obtained from natural populations in the eastern United States. Although closely related with equally large genomes (32 × 10⁹ bp) [18], these two species and others of the Complex differ dramatically in life history: A. mexicanum is a paedomorphic species that retains many larval features and lives in water throughout its life cycle, while A. t. tigrinum undergoes a metamorphosis that is typical of many amphibians. Like many other traditional model organisms of the last century, interest in these two species declined during the rise of genetic models like the fly, zebrafish, and mouse [19]. However, "early" model organisms such as salamanders are beginning to re-attract attention as genome resources can rapidly be developed to exploit the unique features that originally identified their utility for research. We make this point below by showing how the development of ESTs for these two species is enabling research in several areas. Furthermore, we emphasize the value of developing resources in model systems where the likelihood of information transfer to multiple, closely related taxa is high, thus simultaneously enabling both laboratory and natural history research programs.
Results and Discussion
Selection of libraries for EST sequencing
Eleven cDNA libraries were constructed using a variety of tissues (Table 1). Pilot sequencing of randomly selected clones revealed that the majority of the non-normalized libraries were moderately to highly redundant for relatively few transcripts. For example, hemoglobin-like transcripts represented 15–25% of the sampled clones from cDNA libraries V1, V2, and V6. Accordingly, we chose to focus our sequencing efforts on the non-normalized MATH library as well as the normalized AG library, which had lower levels of redundancy (5.5 and 0.25% globins, respectively). By concentrating our sequencing efforts on these two libraries we obtained transcripts deriving primarily from regenerating larval tissues in A. mexicanum and several non-regenerating larval tissues in A. t. tigrinum.
EST sequencing and clustering
A total of 46,064 cDNA clones were sequenced, yielding 39,982 high quality sequences for A. mexicanum and A. t. tigrinum (Table 2). Of these, 3,745 corresponded to mtDNA and were removed from the dataset; complete mtDNA genome data for these and other ambystomatid species will be reported elsewhere. The remaining nuclear ESTs for each species were clustered and assembled separately. We included in our A. mexicanum assembly an additional 16,030 high quality ESTs that were generated recently for regenerating tail and neurula stage embryos [20]. Thus, a total of 32,891 and 19,376 ESTs were clustered for A. mexicanum and A. t. tigrinum, respectively. Using PaCE clustering and CAP3 assembly, a similar number of EST clusters and contigs were identified for each species (Table 2). Overall contig totals were 11,190 and 9,901 for A. mexicanum and A. t. tigrinum, respectively. Thus, although 13,515 more A. mexicanum ESTs were assembled, a roughly equivalent number of contigs were obtained for both species. This indicates that EST development was more efficient for A. t. tigrinum, presumably because ESTs were obtained primarily from the normalized AG library; indeed, there were approximately twice as many ESTs on average per A. mexicanum contig (Table 2). Overall we identified > 21,000 different contigs. Assuming that 20% of the contigs correspond to redundant loci, which has been found generally in large EST projects [21], we identified transcripts for approximately 17,000 different ambystomatid loci. If ambystomatid salamanders have approximately the same number of loci as other vertebrates (e.g. [22]), we have isolated roughly half the expected number of genes in the genome.
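The locus estimate above is simple arithmetic on the contig totals, and it can be checked with a short sketch. The function name and its `redundancy` parameter are illustrative assumptions rather than part of the original analysis; the contig counts and the 20% redundancy figure come from the text.

```python
def estimate_unique_loci(contigs_a, contigs_b, redundancy=0.20):
    """Combine two per-species contig totals and discount redundant loci.

    `redundancy` is the assumed fraction of contigs that represent a
    locus already counted (20% in the text, following [21]).
    """
    total = contigs_a + contigs_b
    return total, round(total * (1 - redundancy))


# Putative transcript totals for A. mexicanum and A. t. tigrinum (Table 2).
total_contigs, estimated_loci = estimate_unique_loci(11190, 9901)
print(total_contigs, estimated_loci)
```

Under these assumptions the combined total is 21,091 contigs, and the discounted figure rounds to the "approximately 17,000" loci quoted in the text.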
Table 2: EST summary and assembly results.

                          A. mex    A. t. tig
cDNA clones sequenced     21830     24234
high-quality sequences    19383     20599
mt DNA sequence            2522      1223
seqs submitted to NCBI    16861     19376
sequences assembled       32891ᵃ    19376
PaCE clusters             11381     10226
ESTs in contigs           25457     12676
contigs                    3756      3201
singlets                   7434      6700
putative transcripts      11190      9901

ᵃ Includes 16,030 ESTs from [20].
Figure 1. Results of BLASTX and TBLASTX searches to identify best BLAST hits for Ambystoma contigs searched against NCBI human RefSeq, nr, and Xenopus Unigene databases.
Identification of vertebrate sequences similar to Ambystoma contigs
We searched all contigs against several vertebrate databases to identify sequences that exhibited significant sequence similarity. As our objective was to reliably annotate as many contigs as possible, we first searched against 19,804 sequences in the NCBI human RefSeq database (Figure 1), which is actively reviewed and curated by biologists. This search revealed 5619 and 4973 "best hit" matches for the A. mexicanum and A. t. tigrinum EST datasets at a BLASTX threshold of E = 10⁻⁷. The majority of contigs were supported at more stringent E-value thresholds (Table 3). Non-matching contigs were subsequently searched against the Non-Redundant (nr) Protein database and Xenopus tropicalis and X. laevis UNIGENE ESTs (Figure 1). These latter two searches yielded a few hundred more "best hit" matches; however, a relatively large number of ESTs from both ambystomatid species were not similar to any sequences from the databases above. Presumably, these non-matching sequences were obtained from the non-coding regions of transcripts or they contain protein-coding sequences that are novel to salamander. Although the majority are probably of the former type, we did identify 3,273 sequences from the non-matching set that had open reading frames (ORFs) of at least 200 bp, and 911 of these were greater than 300 bp.
The distribution of ESTs among contigs can provide perspective on gene expression when clones are randomly sequenced from non-normalized cDNA libraries. In general, frequently sampled transcripts may be expressed at higher levels. We identified the 20 contigs from A. mexicanum and A. t. tigrinum that contained the most assembled ESTs (Table 4). The largest A. t. tigrinum contigs contained fewer ESTs than the largest A. mexicanum contigs, probably because fewer overall A. t. tigrinum clones were sequenced, with the majority selected from a normalized library. However, we note that the contig with the most ESTs was identified for A. t. tigrinum: delta globin. In both species, transcripts corresponding to globin genes were sampled more frequently than all other loci. This may reflect the fact that amphibians, unlike mammals, have nucleated red blood cells that are transcriptionally active. In addition to globin transcripts, a few other house-keeping genes were identified in common from both species; however, the majority of the contigs were unique to each list. Overall, the strategy of sequencing cDNAs from a diverse collection of tissues (from normalized and non-normalized libraries) yielded different sets of highly redundant contigs. Only 25% and 28% of the A. mexicanum and A. t. tigrinum contigs, respectively, were identified in common (Figure 2). We also note that several hundred contigs were identified in common between Xenopus and Ambystoma; this will help facilitate comparative studies among these amphibian models.
Functional annotation
For the 10,592 contigs that showed significant similarity to sequences from the human RefSeq database, we obtained Gene Ontology [23] information to describe ESTs in functional terms. Although there are hundreds of possible annotations, we chose a list of descriptors for molecular and biological processes that we believe are of interest for research programs currently utilizing salamanders as model organisms (Table 5). In all searches, we counted each match between a contig and a RefSeq sequence as identifying a different ambystomatid gene,
Table 3: Ambystoma contig search of NCBI human RefSeq, nr, and Xenopus Unigene databases.
Figure 2. Venn diagram of BLAST comparisons among amphibian EST projects. Values provided are numbers of reciprocal best BLAST hits (E<10⁻²⁰) among quality masked A. mexicanum and A. t. tigrinum assemblies and a publicly available X. tropicalis EST assembly http://www.sanger.ac.uk/Projects/X_tropicalis
[Venn diagram of three sets: A. mexicanum, A. t. tigrinum, X. tropicalis. Region values in extraction order: 7909, 6912, 34296, 353, 523, 465, 2296.]
even when different contigs matched the same RefSeq reference. In almost all cases, approximately the same number of matches was found per functional descriptor for both species. This was not simply because the same loci were being identified for both species, as only 20% of the total number of searched contigs shared sufficient identity (BLASTN; E<10⁻⁸⁰ or E<10⁻²⁰) to be potential homologues. In this sense, the sequencing effort between these two species was complementary in yielding a more diverse collection of ESTs that were highly similar to human gene sequences.
Informatic searches for regeneration probes
The value of a salamander model to regeneration research will ultimately rest on the ease with which data and results can be cross-referenced to other vertebrate models. For example, differences in the ability of mammals and salamanders to regenerate spinal cord may reflect differences in the way cells of the ependymal layer respond to injury. As is observed in salamanders, ependymal cells in adult mammals also proliferate and differentiate after spinal cord injury (SCI) [24,25]; immediately after contusion injury in adult rat, ependymal cell numbers increase and proliferation continues for at least 4 days [[26]; but see [27]]. Rat ependymal cells share some of the same gene expression
Table 4: Top 20 contigs with the most assembled ESTs.
and protein properties of embryonic stem cells [28]; however, no new neurons have been observed to derive from these cells in vivo after SCI [29]. Thus, although endogenous neural progenitors of the ependymal layer may have latent regenerative potential in adult mammals, this potential is not realized. Several recently completed microarray analyses of spinal cord injury in rat now make it possible to cross-reference information between amphibians and mammals. For example, we searched the complete list of significantly up and down regulated genes from Carmel et al. [30] and Song et al. [31] against all Ambystoma ESTs. Based upon amino acid sequence similarity of translated ESTs (TBLASTX; E<10⁻⁷), we identified DNA sequences corresponding to 69 of these 164 SCI rat genes (Table 6). It is likely that we have sequence corresponding to other presumptive orthologues from this list, as many of our ESTs only contain a portion of the coding sequence or the untranslated regions (UTR), and in many cases our searches identified closely related gene family members. Thus, many of the genes that show interesting expression patterns after SCI in rat can now be examined in salamander.
Similar gene expression programs may underlie regeneration of vertebrate appendages such as fish fins and tetrapod limbs. Regeneration could depend on reiterative expression of genes that function in patterning, morphogenesis, and metabolism during normal development and homeostasis. Alternatively, regeneration could depend in part on novel genes that function exclusively in this process. We investigated these alternatives by searching A. mexicanum limb regeneration ESTs against UNIGENE zebrafish fin regeneration ESTs (Figure 3). This search identified 1357 significant BLAST hits (TBLASTX; E<10⁻⁷) that corresponded to 1058 unique zebrafish ESTs. We then asked whether any of these potential regeneration homologues were represented uniquely in limb and fin regeneration databases (and not in databases derived from other zebrafish tissues). A search of the 1058 zebrafish ESTs against > 400,000 zebrafish ESTs that were sampled from non-regenerating tissues revealed 43 that were unique to the zebrafish regeneration database (Table 7). Conceivably, these 43 ESTs may represent transcripts important to appendage regeneration. For example, our search identified several genes (e.g. hspc128, pre-B-cell colony enhancing factor 1, galectin 4, galectin 8) that may be expressed in progenitor cells that proliferate and differentiate during appendage regeneration. Overall, our results suggest that regeneration is achieved largely through the reiterative expression of genes having additional functions in other developmental contexts; however, a small number of genes may be expressed uniquely during appendage regeneration.
DNA sequence polymorphisms within and between A. mexicanum and A. t. tigrinum
The identification of single nucleotide polymorphisms (SNPs) within and between orthologous sequences of A. mexicanum and A. t. tigrinum is needed to develop DNA markers for genome mapping [32], quantitative genetic analysis [33], and population genetics [34]. We estimated within species polymorphism for both species by calculating the frequency of SNPs among ESTs within the 20 largest contigs (Table 4). These analyses considered a total of 30,638 base positions for A. mexicanum and 18,765 base positions for A. t. tigrinum. Two classes of polymorphism were considered in this analysis: those occurring at moderate (identified in 10–30% of the EST sequences) and high frequencies (identified in at least 30% of the EST sequences). Within the A. mexicanum contigs, 0.49% and 0.06% of positions were polymorphic at moderate and high frequency, while higher levels of polymorphism were observed for A. t. tigrinum (1.41% and 0.20%). Higher levels of polymorphism are expected for A. t. tigrinum because they exist in larger, out-bred populations in nature.
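The two frequency classes can be illustrated with a minimal sketch that classifies alignment columns from a contig. The text does not describe how variant frequencies were computed, so several things here are assumptions: a column is a string of the bases that the assembled ESTs show at one position, the frequency is that of the second most common base, and non-ACGT characters are ignored. The function names are hypothetical.

```python
from collections import Counter


def classify_position(bases):
    """Classify one alignment column as monomorphic, low, moderate or high.

    Thresholds follow the text: moderate = variant in 10-30% of ESTs,
    high = variant in at least 30% of ESTs.
    """
    counts = Counter(b for b in bases.upper() if b in "ACGT")
    depth = sum(counts.values())
    if depth == 0 or len(counts) < 2:
        return "monomorphic"
    # Frequency of the second most common base at this position.
    minor_freq = counts.most_common(2)[1][1] / depth
    if minor_freq >= 0.30:
        return "high"
    if minor_freq >= 0.10:
        return "moderate"
    return "low"


def polymorphism_rates(columns):
    """Fraction of positions falling in the moderate and high classes."""
    tally = Counter(classify_position(col) for col in columns)
    return {cls: tally[cls] / len(columns) for cls in ("moderate", "high")}


columns = ["AAAAAAAAAA", "AAAAAAAAAG", "AAAAAAGGGG", "A" * 19 + "G"]
print(polymorphism_rates(columns))
```

Because the classes are defined on EST frequency, deep contigs such as the 20 largest are where the thresholds are most meaningful; a real analysis would probably also need a minimum depth and a base-quality filter, neither of which is specified in the text.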
To identify SNPs between species, we had to first identify presumptive, interspecific orthologues. We did this by performing BLASTN searches between the A. mexicanum and A. t. tigrinum assemblies, and the resulting alignments
MexCluster_3498_Contig1 gi|436934| Sprague Dawley protein kinase C rec. 0
TigCluster_6648_Contig1 0
MexSingletonClusters_BL279A_B12 gi|464196| phosphodiesterase I E-49
TigSingletonClusters_Salamander_25_P03_ab1 E-75
MexCluster_8708_Contig1 gi|466438| 40kDa ribosomal protein E-168
TigCluster_5877_Contig1 E-168
MexSingletonClusters_nm_14_a9_t3_ gi|493208| stress activated protein kinase alpha II E-51
TigSingletonClusters_Salamander_11_A13_ab1 gi|517393| tau microtubule-associated protein E-44
were filtered to retain only those alignments between sequences that were one another's reciprocal best BLAST hit. As expected, the number of reciprocal "best hits" varied depending upon the E value threshold, although increasing the E threshold by several orders of magnitude had a disproportionately small effect on the overall total length of BLAST alignments. A threshold of E<10⁻⁸⁰ yielded 2414 alignments encompassing a total of 1.25 Mbp from each species, whereas a threshold of E<10⁻²⁰ yielded 2820 alignments encompassing a total of 1.32 Mbp. The percent sequence identity of alignments was very high among presumptive orthologues, ranging from 84–100% at the more stringent E threshold of E<10⁻⁸⁰. On average, A. mexicanum and A. t. tigrinum transcripts are estimated to be 97% identical at the nucleotide level, including both protein coding and UTR sequence. This estimate for nuclear sequence identity is surprisingly similar to estimates obtained from complete mtDNA reference sequences for these species (96%, unpublished data), and to estimates for partial mtDNA sequence data obtained from multiple natural populations [16]. These results are consistent with the idea that mitochondrial mutation rates are lower in cold versus warm-blooded vertebrates [35]. From a resource perspective, the high level of sequence identity observed between these species suggests that informatics will rapidly enable the development of probes between these and other species of the A. tigrinum complex.
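The reciprocal best hit filter described above can be sketched in a few lines. The input format is an assumption: each search direction is a list of (query, subject, E-value) tuples, as could be parsed from tabular BLAST output, and the best hit for a query is taken to be the hit with the lowest E-value below the threshold. The function names and toy contig identifiers are hypothetical.

```python
def best_hits(hits, max_evalue):
    """Map each query to its lowest E-value subject below max_evalue."""
    best = {}
    for query, subject, evalue in hits:
        if evalue >= max_evalue:
            continue  # hit does not pass the E-value threshold
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a, max_evalue=1e-20):
    """Return sorted (a, b) pairs that are each other's best hit."""
    a_best = best_hits(a_vs_b, max_evalue)
    b_best = best_hits(b_vs_a, max_evalue)
    return sorted((a, b) for a, b in a_best.items() if b_best.get(b) == a)


# Toy data: hypothetical contig names and E-values, not from the study.
mex_vs_tig = [("m1", "t1", 1e-90), ("m1", "t2", 1e-30),
              ("m2", "t2", 1e-25), ("m4", "t4", 1e-40)]
tig_vs_mex = [("t1", "m1", 1e-88), ("t2", "m1", 1e-30),
              ("t4", "m4", 1e-40)]

print(reciprocal_best_hits(mex_vs_tig, tig_vs_mex, 1e-20))
print(reciprocal_best_hits(mex_vs_tig, tig_vs_mex, 1e-80))
```

In the toy data, relaxing the threshold from 10⁻⁸⁰ to 10⁻²⁰ adds one pair, which mirrors in miniature the modest gain from 2414 to 2820 alignments reported in the text.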
Extending EST resources to other ambystomatid species
Relatively little DNA sequence has been obtained from species that are closely related to commonly used model organisms, and yet, such extensions would greatly facilitate genetic studies of natural phenotypes, population structures, species boundaries, and conservatism and divergence of developmental mechanisms. Like many amphibian species that are threatened by extinction, many of these ambystomatid salamanders are currently in need of population genetic studies to inform conservation and management strategies [e.g. [13]]. We characterized SNPs from orthologous A. mexicanum and A. t. tigrinum ESTs and extended this information to develop informative molecular markers for a related species, A. ordinarium. Ambystoma ordinarium is a stream dwelling paedomorph endemic to high elevation habitats in central Mexico [36]. This species is particularly interesting from an ecological and evolutionary standpoint because it harbors a high level of intraspecific mitochondrial variation, and as an independently derived stream paedomorph, is unique among the typically pond-breeding tiger salamanders. As a reference of molecular divergence, Ambystoma ordinarium shares approximately 98 and 97% mtDNA sequence identity with A. mexicanum and A. t. tigrinum, respectively [16].
To identify informative markers for A. ordinarium, A. mexicanum and A. t. tigrinum EST contigs were aligned to identify orthologous genes with species-specific sequence variations (SNPs or Insertion/Deletions = INDELs). Primer pairs corresponding to 123 ESTs (Table 8) were screened by PCR using a pool of DNA template made from individuals of 10 A. ordinarium populations. Seventy-nine percent (N = 97) of the primer pairs yielded amplification products that were approximately the same size as corresponding A. mexicanum and A. t. tigrinum fragments, using only a single set of PCR conditions. To estimate the frequency of intraspecific DNA sequence polymorphism among this set of DNA marker loci, 43 loci were sequenced using a single individual sampled randomly from each of the 10 populations, which span the geographic range of A. ordinarium. At least one polymorphic site was observed for 20 of the sequenced loci, with the frequency of polymorphisms dependent upon the size of the DNA fragment amplified. Our results suggest that the vast majority of primer sets designed for A. mexicanum / A. t. tigrinum EST orthologues can be used to
Figure 3. Results of BLASTN and TBLASTX searches to identify best BLAST hits for A. mexicanum regeneration ESTs searched against zebrafish EST databases. A total of 14,961 A. mexicanum limb regeneration ESTs were assembled into 4485 contigs for this search.
[Flow diagram: 14,961 A. mexicanum limb regeneration ESTs; TBLASTX vs 19,039 zfish regeneration ESTs; 1058 potential regeneration homologues; BLASTN vs 404,876 zfish non-regeneration ESTs; 43 candidate regeneration homologues.]
Comparative gene mapping
Salamanders occupy a pivotal phylogenetic position for reconstructing the ancestral tetrapod genome structure and for providing perspective on the extremely derived anuran Xenopus [37] that is currently providing the bulk of amphibian genome information. Here we show the utility of ambystomatid ESTs for identifying chromosomal regions that are conserved between salamanders and other vertebrates. A region of conserved synteny that corresponds to human chromosome (Hsa) 17q has been identified in several non-mammalian taxa including reptiles [38] and fishes [39]. In a previous study Voss et al. [40] identified a region of conserved synteny between Ambystoma and Hsa 17q that included collagen type 1 alpha 1 (Col1a1), thyroid hormone receptor alpha (Thra), homeo box b13 (Hoxb13), and distal-less 3 (Dlx3) (Figure 4). To evaluate both the technical feasibility of mapping ESTs and the likelihood that presumptive orthologues map to the same synteny group, we searched our assemblies for presumptive Hsa 17 orthologues and then developed a subset of these loci for genetic linkage mapping. Using a joint assembly of A. mexicanum and A. t. tigrinum contigs, 97 Hsa 17 presumptive orthologues were identified. We chose 15 genes from this list and designed PCR primers to amplify a short DNA fragment containing 1 or more presumptive SNPs that were identified in the joint assembly (Table 9). All but two of these genes were mapped, indicating a high probability of mapping success using markers developed from the joint assembly of A. mexicanum and A. t. tigrinum contigs. All 6 ESTs that exhibited "best hits" to loci within the previously defined human-Ambystoma synteny group did map to this region (Hspc009, Sui1, Krt17, Krt24, Flj13855, and Rpl19). Our results show that BLAST-based definitions of orthology are informative between salamanders and human. All other presumptive Hsa 17 loci mapped to Ambystoma chromosomal regions outside of the previously defined synteny group. It is interesting to note that two of these loci mapped to the same ambystomatid linkage group (Cgi-125, Flj20345), but in human the presumptive orthologues are 50 Mb apart and distantly flank the syntenic loci in Figure 4. Assuming orthology has been assigned correctly for these loci, this suggests a dynamic history for some Hsa 17 orthologues during vertebrate evolution.
Future directions
Ambystomatid salamanders are classic model organisms that continue to inform biological research in a variety of areas. Their future importance in regenerative biology and
Table 8: EST loci used in a population-level PCR amplification screen in A. ordinarium (Continued)
Figure 4. Comparison of gene order between Ambystoma linkage group 1 and an 11 Mb region of Hsa17 (37.7 Mb to 48.7 Mb). Lines connect the positions of putatively orthologous genes.
[Comparative map: Ambystoma LG1 (0.0 cM to 63.7 cM) against HSA 17 (37732 Kb to 48736 Kb). Loci labeled: SUI1, KRT10, DLX3, HSPC009, KRT17, FLJ13855, HOXB13, RPL19, THRα.]
metamorphosis will almost certainly escalate as genome resources and other molecular and cellular approaches become widely available. Among the genomic resources currently under development (see [41]) are a comparative genome map, which will allow mapping of candidate genes, QTL, and comparative anchors for cross-referencing the salamander genome to fully sequenced vertebrate models. In closing, we reiterate a second benefit to resource development in Ambystoma. Genome resources in Ambystoma can be extended to multiple, closely related species to explore the molecular basis of natural, phenotypic variation. Such extensions can better inform our understanding of ambystomatid biodiversity in nature and draw attention to the need for conserving such naturalistic systems. Several paedomorphic species, including A. mexicanum, are on the brink of extinction. We can think of no better investment than one that simultaneously enhances research in all areas of biology and draws attention to the conservation needs of model organisms in their natural habitats.
Conclusions
Approximately 40,000 cDNA sequences were isolated from a variety of tissues to develop expressed sequence
Table 9: Presumptive human chromosome 17 loci that were mapped in Ambystoma
Marker ID Primersᵃ Diagnosisᵇ LGᶜ Symbolᵈ RefSeq IDᵉ E-valueᶠ
Pl_6_E/F_6 F-GAAAACCTGCTCAGCATTAGTGT ASA ul PFN1 NP_005013 E-34
 R-TCTATTACCATAGCATTAATTGGCAG
Pl_5_G/H_5 F-CTATTTCATCTGAGTACCGTTGAATG PE (A) 23 CGI-125 NP_057144 E-56
Pl_6_C/D_5 F-CCGTAAATGTTTCTAAATGACAGTTG PE (G) 2 ACTG1 NP_001605 0
 R-GGAAAGAAAGTACAATCAAGTCCTTC
 E-GATTGAAAACTGGAACCGAAAGAAGATAAA

ᵃ Sequences are 5' amplification primers, 3' amplification primers, or primer extension probes, and are preceded by F-, R-, and E- respectively.
ᵇ Genotyping methods are abbreviated: allele specific amplification (ASA), size polymorphism (SP), restriction digestion (RD), primer extension (PE). Diagnostic restriction enzymes and diagnostic extension bases are provided in parentheses.
ᶜ Ambystoma linkage group ID. "ul" designates markers that are unlinked.
ᵈ Official gene symbols as defined by the Human Genome Organization Gene Nomenclature Committee http://www.gene.ucl.ac.uk/nomenclature/.
ᵉ Best BLASTX hit (highest e-value) from the human RefSeq database using the contig from which each marker was designed as a query sequence.
ᶠ Highest E-value statistic obtained by searching contigs, from which EST markers were designed, against the human RefSeq database.
tags for two model salamander species (A. mexicanum and A. t. tigrinum). An approximately equivalent number of contigs were identified for each species, with 21,091 unique contigs identified overall. The strategy to sequence cDNAs from a diverse collection of tissues from normalized and non-normalized libraries yielded different sets of highly redundant contigs. Only 25% and 28% of the A. mexicanum and A. t. tigrinum contigs, respectively, were identified in common. To demonstrate the utility of these EST resources, we searched databases to identify new probes for regeneration research, characterized intra- and interspecific nucleotide polymorphism, saturated a human/Ambystoma synteny group with marker loci, and extended PCR primer sets designed for A. mexicanum / A. t. tigrinum orthologues to a related tiger salamander species. Over 100 new probes were identified for regeneration research using informatic approaches. With respect to comparative mapping, 13 of 15 EST markers were mapped successfully, and 6 EST markers were mapped to a previously defined synteny group in Ambystoma. These results indicate a high probability of mapping success using EST markers developed from the joint assembly of A. mexicanum and A. t. tigrinum contigs. Finally, we found that primer sets designed for A. mexicanum / A. t. tigrinum EST orthologues can be used to amplify the corresponding sequence in a related A. tigrinum complex species. Overall, the EST resources reported here will enable a diversity of new research areas using ambystomatid salamanders.
Methods
cDNA library construction
Ten cDNA libraries were constructed for the project using various larval tissues of A. mexicanum and A. t. tigrinum (Table 1). Larval A. mexicanum were obtained from adult animals whose ancestry traces back to the Axolotl Colony [17]. Larval A. t. tigrinum were obtained from Charles Sullivan Corp. The GARD and MATH A. mexicanum limb regeneration libraries were constructed using regenerating forelimb mesenchyme. Total RNAs were collected from anterior and posterior limbs amputated at the mid-stylopod level on 15 cm animals, and from the resulting regenerates at 12 h, 2 days, 5 days and early bud stages. One hundred µg fractions of each were pooled together and polyA-selected to yield 5 µg that was utilized for directional library construction (Lambda Zap, Stratagene). The V1 (A. mex), V2 (A. tig), V4-5 (A. tig), and V6-7 (A. mex) libraries were made from an assortment of larval tissues (see Table 1) using the SMART cDNA cloning kits (Clontech). Total RNAs were isolated and reverse transcribed to yield cDNAs that were amplified by long distance PCR and subsequently cloned into pTriplEX. The V3 and AG libraries were constructed by commercial companies (BioS&T and Agencourt, respectively).
cDNA template preparation and sequencing
cDNA inserts were mass excised as phagemids, picked into microtitre plates, grown overnight in LB broth, and then diluted (1/20) to spike PCR reactions (94°C for 2 min; then 30 cycles at 94°C for 45 sec, 58°C for 45 sec, and 72°C for 7 min). All successful amplifications with inserts larger than ~500 bp were sequenced (ABI Big Dye or Amersham Dye terminator chemistry and 5' universal primer). Sequencing and clean-up reactions were carried out according to manufacturers' protocols. ESTs were deposited into the NCBI database under accession numbers BI817205-BI818091, CN033008-CN045937, and CN045944-CN069430.
EST sequence processing and assembly
The PHRED base-calling program [42] was used to generate sequence and quality scores from trace files. PHRED files were then quality clipped and vector/contaminant screened. An in-house program called QUALSCREEN was used to quality clip the ends of sequence traces. Starting at the ends of sequence traces, this program uses a 20 bp sliding window to identify a continuous run of bases that has an average PHRED quality score of 15. Mitochondrial DNA sequences were identified by searching all ESTs against the complete mtDNA genome sequence of A. mexicanum (AJ584639). Finally, all sequences less than 100 bp were removed. The average length of the resulting ESTs was 629 bp. The resulting high quality ESTs were clustered initially using PaCE [43] on the U.K. HP Superdome computer. Multi-sequence clusters were used as input sequence sets for assembly using CAP3 [44] with an 85% sequence similarity threshold. Clusters comprising single ESTs were assembled again using CAP3 with an 80% sequence similarity threshold to identify multi-EST contigs that were missed during the initial analysis. This procedure identified 550 additional contigs comprising 1150 ESTs.
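The end-clipping step described above can be illustrated with a minimal Python sketch. This is not the QUALSCREEN program, whose source is not given here; it assumes that clipping keeps the region between the leftmost and rightmost 20 bp windows whose mean PHRED score is at least 15, and that reads shorter than 100 bp after clipping are discarded. The function names clip_bounds and quality_clip are hypothetical.

```python
from typing import List, Optional, Tuple

WINDOW = 20            # window size in bp, as stated in the text
MIN_MEAN_QUALITY = 15  # mean PHRED score required within a window
MIN_LENGTH = 100       # sequences shorter than this are removed


def clip_bounds(quals: List[int], window: int = WINDOW,
                min_mean: float = MIN_MEAN_QUALITY) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the retained region, or None if no window passes.

    Assumption: the retained region runs from the leftmost passing window
    to the rightmost passing window (end is exclusive).
    """
    n = len(quals)
    if n < window:
        return None
    # Prefix sums give each window total in constant time.
    prefix = [0]
    for q in quals:
        prefix.append(prefix[-1] + q)

    def passes(i: int) -> bool:
        return prefix[i + window] - prefix[i] >= min_mean * window

    passing = [i for i in range(n - window + 1) if passes(i)]
    if not passing:
        return None
    return passing[0], passing[-1] + window


def quality_clip(seq: str, quals: List[int],
                 min_length: int = MIN_LENGTH) -> Optional[str]:
    """Clip both ends of a read; return None if the read should be dropped."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality arrays differ in length")
    bounds = clip_bounds(quals)
    if bounds is None:
        return None
    start, end = bounds
    clipped = seq[start:end]
    return clipped if len(clipped) >= min_length else None
```

A read with 30 low-quality bases at each end and 120 good bases in the middle would retain the central region plus the few flanking bases that still fit inside a passing window.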
Functional annotation
All contigs and singletons were searched against the human RefSeq database (Oct. 2003 release) using BLASTX. The subset of sequences that yielded no BLAST hit was searched against the non-redundant protein sequence database (Feb. 2004) using BLASTX. The remaining subset of sequences that yielded no BLAST hit was searched against Xenopus laevis and X. tropicalis UNIGENE ESTs (Mar. 2004) using TBLASTX. Zebrafish ESTs were downloaded from UNIGENE ESTs (May 2004). BLAST searches were done with an E-value threshold of E < 10⁻⁷ unless specified.
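The tiered search strategy can be sketched as follows. The sketch assumes that each BLAST run has been saved in the standard 12-column tabular format (query ID in column 1, subject ID in column 2, E-value in column 11); the text does not state which output format the authors used, and the function names best_hits and tiered_annotation are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

E_THRESHOLD = 1e-7  # default threshold stated in the text


def best_hits(tabular_lines: Iterable[str],
              e_threshold: float = E_THRESHOLD) -> Dict[str, Tuple[str, float]]:
    """Map each query to its lowest E-value hit below the threshold."""
    best: Dict[str, Tuple[str, float]] = {}
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue >= e_threshold:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


def tiered_annotation(
    query_ids: Iterable[str],
    tiers: List[Tuple[str, Iterable[str]]],
) -> Tuple[Dict[str, Tuple[str, str, float]], Set[str]]:
    """Annotate queries tier by tier; later tiers only see unannotated queries.

    tiers is an ordered list of (database label, tabular BLAST lines).
    Returns the annotations and the set of queries with no hit in any tier.
    """
    annotations: Dict[str, Tuple[str, str, float]] = {}
    remaining = set(query_ids)
    for label, lines in tiers:
        hits = best_hits(lines)
        for query in sorted(remaining & set(hits)):
            subject, evalue = hits[query]
            annotations[query] = (label, subject, evalue)
        remaining -= set(hits)
    return annotations, remaining
```

Passing the RefSeq, non-redundant protein, and Xenopus results in that order reproduces the order of searches described above, with each contig annotated by the first database in which it has a significant hit.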
Sequence comparison of A. mexicanum and A. t. tigrinum assemblies
All low quality base calls within contigs were masked using a PHRED base quality threshold of 16. To identify polymorphisms for linkage mapping, contigs from A. mexicanum and A. t. tigrinum assemblies were joined into a single assembly using CAP3 and the following criteria: an assembly threshold of 12 bp to identify initial matches, a minimum 100 bp match length, and 85% sequence identity. To identify putatively orthologous genes from A. mexicanum and A. t. tigrinum assemblies, and generate an estimate of gene sequence divergence, assemblies were compared using BLASTN with a threshold of E < 10⁻²⁰. Following BLAST, alignments were filtered to obtain reciprocal best BLAST hits.
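The reciprocal best hit filter can be illustrated with a short Python sketch. It assumes that the BLASTN results for each direction are already available as (query, subject, E-value) tuples and that the best hit is the one with the lowest E-value, with ties resolved in favour of the first hit encountered; the text does not specify how ties were handled. The names best_hit_map and reciprocal_best_hits are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, float]  # (query ID, subject ID, E-value)


def best_hit_map(hits: Iterable[Hit], e_threshold: float = 1e-20) -> Dict[str, str]:
    """Map each query to the subject of its lowest E-value hit below threshold."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, evalue in hits:
        if evalue >= e_threshold:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit],
                         e_threshold: float = 1e-20) -> List[Tuple[str, str]]:
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    a_best = best_hit_map(a_vs_b, e_threshold)
    b_best = best_hit_map(b_vs_a, e_threshold)
    return sorted((a, b) for a, b in a_best.items() if b_best.get(b) == a)
```

A contig pair is kept only when each member is the other's top match, which removes one-directional matches such as a contig hitting a paralogue whose own best match lies elsewhere.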
Extending A. mexicanum / A. t. tigrinum sequence information to A. ordinarium
Polymorphic DNA marker loci were identified by locating single nucleotide polymorphisms (SNPs) in the joint A. mexicanum and A. t. tigrinum assembly. Polymerase chain reaction (PCR) primers were designed using Primer 3 [45] to amplify 100-500 bp SNP-containing fragments from 123 different protein-coding loci (Table 8). DNA was isolated from salamander tail clips using SDS, RNAse and proteinase K treatment, followed by phenol-chloroform extraction. Fragments were amplified using 150 ng DNA, 75 ng each primer, 1.5 mM MgCl2, 0.25 U Taq, and a 3-step profile (94°C for 4 min; 33 cycles of 94°C for 45 s, 60°C for 45 s, 72°C for 30 s; and 72°C for 7 min). DNA fragments were purified and sequenced using ABI Big Dye or Amersham Dye terminator chemistry. Single nucleotide polymorphisms were identified by eye from sequence alignments.
Linkage mapping of human chromosome 17 orthologous genes
Putative salamander orthologues of genes on human chromosome 17 (Hsa 17) were identified by comparing the joint A. mexicanum and A. t. tigrinum assembly to sequences from the human RefSeq (NCBI) protein database, using BLASTX at threshold E < 10⁻⁷. Linkage distance and arrangement among markers was estimated using MapManager QTXb19 software [46] and the Kosambi mapping function at a threshold of p = 0.001. All markers were mapped using DNA from a previously described meiotic mapping panel [40]. All PCR primers and primer extension probes were designed using Primer 3 [45] and Array Designer2 (Premier Biosoft) software. Species-specific polymorphisms were assayed by allele specific amplification, restriction digestion, or primer extension, using the reagent and PCR conditions described above. Primer extension markers were genotyped using the AcycloPrime-FP SNP detection assay (Perkin Elmer). See Table 9 for amplification and extension primer sequences, and information about genotyping methodology.
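For readers unfamiliar with the Kosambi mapping function, it converts a recombination fraction r between two markers into a map distance d = (1/4) ln[(1 + 2r) / (1 - 2r)] in Morgans. The Python sketch below implements this standard formula and its inverse; it is an illustration only and makes no claim about how MapManager QTX computes distances internally. The function names are hypothetical.

```python
import math


def kosambi_cm(r: float) -> float:
    """Convert a recombination fraction to Kosambi map distance in centimorgans."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    # d (Morgans) = 0.25 * ln((1 + 2r) / (1 - 2r)); multiply by 100 for cM.
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def recombination_fraction(cm: float) -> float:
    """Inverse of kosambi_cm: r = 0.5 * tanh(2d), with d in Morgans."""
    if cm < 0.0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * math.tanh(2.0 * cm / 100.0)
```

For small r the distance in centimorgans is close to 100r, and the correction grows as r approaches 0.5, reflecting the function's allowance for crossover interference.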
Authors' contributions
SP and DK: bioinformatics; JW: clone management and sequencing in support of A. mexicanum and A. t. tigrinum ESTs; JS: comparative mapping and polymorphism estimation; DW: extending ESTs to A. ordinarium; JM, KK, AS, NM: PCR and gel electrophoresis; BH and ET: cDNA library construction and sequencing for spinal cord regeneration ESTs; MR, SB, DG: cDNA library construction and clone management for limb regeneration ESTs; DP and SV conceived of the project and participated in its design and coordination. All authors read and approved the final manuscript.
Acknowledgements
We thank the Axolotl Colony. We thank Greg Chinchar and Betty Davidson for providing RNA to make cDNA libraries V3 and V4. We acknowledge the support of the National Science Foundation, the National Center for Research Resources at the National Institutes of Health, the Kentucky Spinal Cord and Head Injury Research Trust, and the NSF EPSCOR initiative in Functional Genomics at University of Kentucky.
References
1. Beetschen J-C: How did urodele embryos come into prominence as a model system. Int J Dev Biol 1996, 40:629-636.
2. Gardiner DM, Endo T, Bryant SV: The molecular basis of amphibian limb regeneration: integrating the old with the new. Semin Cell Dev Biol 2002, 13:345-352.
3. Echeverri K, Tanaka EM: Ectoderm to mesoderm lineage switching during axolotl tail regeneration. Science 2002, 298:1933-1936.
4. Del Rio-Tsonis K, Jung JC, Chiu IM, Tsonis PA: Conservation of fibroblast growth factor function in lens regeneration. Proc Natl Acad Sci USA 1997, 94:13701-13706.
5. Ikegami Y, Mitsuda S, Araki M: Neural cell differentiation from retinal pigment epithelial cells of the newt: an organ culture model for the urodele retinal regeneration. J Neurobiol 2002, 50:209-20.
6. Chernoff EAG, Stocum DL, Nye HLD, Cameron JA: Urodele spinal cord regeneration and related processes. Dev Dyn 2003, 226:295-307.
7. Ferretti P: Re-examining jaw regeneration in urodeles: what have we learnt? Int J Dev Biol 1996, 40:807-811.
8. Zhang J, Wu SM: Goalpha labels ON bipolar cells in the tiger salamander retina. J Comp Neurol 2003, 461:276-289.
9. Falck P, Hanken J, Olsson L: Cranial neural crest emergence and migration in the Mexican axolotl (Ambystoma mexicanum). Zoology-Jena 2002, 105:195-202.
10. Zhang C, Dube DK, Huang X, Zajdel RW, Bhatia R, Foster D, Lemanski SL, Lemanski LF: A point mutation in bioactive RNA results in the failure of mutant heart correction in Mexican axolotls. Anat Embryol 2003, 206:495-506.
11. Kauer JS: On the scents of smell in the salamander. Nature 2002, 417:336-342.
12. Voss SR, Prudic KL, Oliver JC, Shaffer HB: Candidate gene analysis of metamorphic timing in ambystomatid salamanders. Mol Ecol 2003, 12:1217-1223.
17. Axolotl Colony Website [http://www.indiana.edu/~axolotl/]
18. Straus NA: Comparative DNA renaturation kinetics in amphibians. Proc Natl Acad Sci USA 1971, 68:799-802.
19. Davis RH: The age of model organisms. Nat Rev 2004, 5:69-77.
20. Habermann B, Bebin A-G, Herklotz S, Volkmer M, Eckelt K, Pehlke K, Epperlein HH, Schackert HK, Wiebe G, Tanaka EM: An Ambystoma mexicanum EST sequencing project: Analysis of 17,352 expressed sequence tags from embryonic and regenerating blastema cDNA libraries. Genome Biology in press.
21. Kawai J, Shinagawa A, Shibata K, et al.: Functional annotation of a full-length mouse cDNA collection. Nature 2001, 409:685-690.
22. Ewing B, Green P: Analysis of expressed sequence tags indicates 35,000 human genes. Nat Genet 2000, 25:232-234.
24. Adrian EK Jr, Walker BE: Incorporation of thymidine-H3 by cells in normal and injured mouse spinal cord. J Neuropathol Exp Neurol 1962, 21:597-609.
25. Namiki J, Tator CH: Cell proliferation and nestin expression in the ependyma of the adult rat spinal cord after injury. J Neuropathol Exp Neurol 1999, 58:489-98.
26. Bruni JE, Anderson WA: Ependyma of the rat fourth ventricle and central canal: Response to injury. Acta Anat 1985, 128:265-273.
27. Takahashi M, Yasuhisa A, Kurosawa H, Sueyoshi N, Shirai S: Ependymal cell reactions in spinal cord segments after compression injury in adult rat. J Neuropathol Exp Neurol 2003, 62:185-194.
28. Yamamoto S, Nagao M, Sugimori M, Kosako H, Nakatomi H, Yamamoto N, Takebayashi H, Nabeshima Y, Kitamura T, Weinmaster G, Nakamura K, Nakafuku M: Transcription factor expression and notch-dependent regulation of neural progenitors in the adult rat spinal cord. J Neurosci 2001, 21:9814-9823.
29. Horner PJ, Power AE, Kempermann G, Kuhn HG, Palmer TD, Winkler J, Thal LJ, Gage FH: Proliferation and differentiation of progenitor cells throughout the intact adult spinal cord. J Neurosci 2000, 20:2218-2228.
30. Carmel JB, Galante A, Soteropoulos P, Tolias P, Recce M, Young W, Hart RP: Gene expression profiling of acute spinal cord injury reveals spreading inflammatory signals and neuron loss. Physiol Genomics 2001, 7:201-213.
31. Song G, Cechvala C, Resnick DK, Dempsey RJ, Rao VLR: GeneChip analysis after acute spinal cord injury in rat. J Neurochem 2001, 79:804-815.
32. Parichy DM, Stigson S, Voss SR: Genetic analysis of Steel and the PG-M/versican-encoding gene AxPG as candidate genes for the white (d) pigmentation mutant in the salamander Ambystoma mexicanum. Dev Genes Evol 1999, 209:349-356.
33. Voss SR, Shaffer HB: Adaptive evolution via a major gene effect: paedomorphosis in the Mexican axolotl. Proc Natl Acad Sci USA 1997, 94:14185-14189.
34. Fitzpatrick BM, Shaffer HB: Environment dependent admixture dynamics in a tiger salamander hybrid zone. Int J Org Evolution 2004, 58:1282-1293.
35. Martin AP, Palumbi SR: Rate of mitochondrial DNA evolution is slow in sharks compared to mammals. Proc Natl Acad Sci USA 1993, 90:4087-4091.
36. Anderson JD, Worthington RD: The life history of the Mexican salamander Ambystoma ordinarium Taylor. Herpetologica 1971, 27:165-176.
37. Cannatella DC, De Sa RO: Xenopus laevis as a model organism. Syst Biol 1993, 42:476-507.
38. Schmid M, Nanda I, Guttenbach M, Steinlein C, Hoehn H, Schartl M, Haaf T, Weigend S, Fries R, Buerstedde J-M, et al.: First report on chicken genes and chromosomes. Cytogenet Cell Genet 2000, 90:169-218.
39. Postlethwait JH, Woods IG, Ngo-Hazelett YP, Yan Y-L, Kelly PD, Chu F, Huang H, Hill-Force A, Talbot WS: Zebrafish comparative genomics and the origins of vertebrate chromosomes. Genome Res 2000, 10:1890-1902.
40. Voss SR, Smith JJ, Gardiner DM, Parichy DM: Conserved vertebrate chromosome segments in the large salamander genome. Genetics 2001, 158:735-746.
42. Ewing B, Green P: Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 1998, 8:186-194.
43. Kalyanaraman A, Aluru S, Kothari S, Brendel V: Efficient clustering of large EST data sets on parallel computers. Nucleic Acids Res 2003, 31:2963-2974.
44. Huang X, Madan A: CAP3: A DNA sequence assembly program. Genome Res 1999, 9:868-877.
45. Rozen S, Skaletsky HJ: Primer 3. 1999 [http://frodo.wi.mit.edu/primer3/primer3_code.html].