Insect Molecular Biology (2006) 15(5), 703–714
© 2006 The Authors. Journal compilation © 2006 The Royal Entomological Society
Blackwell Publishing Ltd
Caste development and reproduction: a genome-wide analysis of hallmarks of insect eusociality
A. S. Cristino, F. M. F. Nunes†, C. H. Lobo†, M. M. G. Bitondi§, Z. L. P. Simões§, L. da Fontoura Costa¶, H. M. G. Lattorff, R. F. A. Moritz, J. D. Evans†† and K. Hartfelder‡
Instituto de Matemática e Estatística, Universidade de São Paulo, São Paulo, Brazil; †Departamento de Genética and ‡Departamento de Biologia Celular e Molecular e Bioagentes Patogênicos, Faculdade de Medicina de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto, Brazil; §Departamento de Biologia, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto, Brazil; ¶Instituto de Física de São Carlos, Universidade de São Paulo, São Carlos, Brazil; Bee Research Laboratory, USDA-ARS, BARC-E, Beltsville, MD, USA
Abstract
The honey bee queen and worker castes are a model system for developmental plasticity. We used established expressed sequence tag information for a Gene Ontology based annotation of genes that are differentially expressed during caste development. Metabolic regulation emerged as a major theme, with a caste-specific difference in the expression of oxidoreductases vs. hydrolases. Motif searches in upstream regions revealed group-specific motifs, providing an entry point to cis-regulatory network studies on caste genes. For genes putatively involved in reproduction, meiosis-associated factors came out as highly conserved, whereas some determinants of embryonic axes either do not have clear orthologs (bag of marbles, gurken, torso) or appear to be lacking (trunk) in the bee genome. Our results are the outcome of a first genome-based initiative to provide an annotated framework for trends in gene regulation during female caste differentiation (representing developmental plasticity) and reproduction.
Keywords: caste development, oogenesis, meiosis, UCR motifs, AlignACE.
Introduction
The evolution of social organization in the Hymenoptera is intricately linked to the division of reproductive activities between highly fertile queens and functionally sterile workers (Wilson, 1971). Ontogenetically, these alternative phenotypes primarily reflect the differential feeding of larvae, a mechanism that is especially pronounced in the honey bee, Apis mellifera. Queen-destined larvae are fed large amounts of royal jelly during the entire larval feeding phase, whereas larvae destined to become workers receive an altered diet during the last larval instars (Haydak, 1970). This differential feeding program in turn acts on the endocrine system, where it generates caste-specific signatures in juvenile hormone (JH) and ecdysteroid titres (Hartfelder & Engels, 1998; Rachinsky et al., 1990). These metamorphic hormones are part of the endocrine programme that drives morphogenesis into either of the two alternative pathways.
The major differences between an adult honey bee queen and a worker reside in the reproductive system. A queen usually has close to 200 ovarioles per ovary and is capable of producing several hundred eggs per day. Workers, in contrast, have between two and 12 ovarioles per ovary (Snodgrass, 1956), which do not show signs of ongoing oogenesis as long as the queen is present. If the queen is lost, a number of workers can activate their ovaries and produce haploid eggs that will develop into drones (Kropácová & Haslbachová, 1971; Page & Erickson, 1988; Moritz et al., 1996).
In order to come to an understanding of the molecular nature and the signal transduction pathways underlying these developmental and ovary activation signals, differential gene expression profiling in honey bee caste development was initiated in the late nineties. The main body of currently available data resulted from a cDNA library generated by suppression subtractive hybridization (SSH) that contrasted queen and worker larvae (Evans & Wheeler, 1999). Subsequent macroarray analyses (Evans & Wheeler, 2000) revealed a clustering of these expressed sequence tags (ESTs) into three distinct groups: genes overexpressed in young (bipotent) larvae, genes overexpressed in fifth-instar queen larvae and genes overexpressed in fifth-instar worker larvae. A second study, focusing on oxidative metabolism, identified a set of differentially expressed mitochondrial genes (Corona et al., 1999). The third approach was a DDRT-PCR screen for hormone responsive genes to investigate the mode of action of ecdysteroids in the differentiation of the larval ovary (Hepperle & Hartfelder, 2001). Many of these EST sets could not be properly annotated at that time, either because of a limited number of fully sequenced insect genomes or because the libraries contained large numbers of transcripts in 3′-gene regions, including poorly conserved untranslated regions (UTRs). The draft assembly for the honey bee genome (Honey Bee Genome Sequencing Consortium, 2006) now permits a much more reliable annotation of this unique set of experimentally validated genes.
Received 20 April 2006; accepted after revision 17 July 2006. Correspondence: Klaus Hartfelder, Departamento de Biologia Celular e Molecular e Bioagentes Patogênicos, Faculdade de Medicina de Ribeirão Preto, Universidade de São Paulo, Avenida Bandeirantes 3900, 14049-900 Ribeirão Preto, Brazil. Tel.: +55 16 36023063; fax: +55 16 36331786; e-mail: klaus@fmrp.usp.br
Re-use of this article is permitted in accordance with the Creative Commons Deed, Attribution 2.5, which does not permit commercial exploitation.
Reproductive activity of honey bees is determined in a two-step process. The basic differences in reproductive capacity between queen and workers manifest themselves during larval development by a wave of programmed cell death that leads to the destruction of over 95% of the ovariole primordia in the larval ovary of workers (Schmidt-Capella & Hartfelder, 1998). In the adult life cycle of each caste, the co-ordinated flux of egg production through previtellogenic and vitellogenic growth will require the activity of other sets of genes. Some of these act as determinants of the major egg and also embryonic axes. As the fruit fly is the most well developed insect model for axis determination (St Johnston & Nüsslein-Volhard, 1992), and maternal factors have not yet been functionally characterized in the honey bee, searching the genome assembly (Honey Bee Genome Sequencing Consortium, 2006) provides the first major opportunity to explore putative patterning networks in honey bees.
The vitellogenic growth phase of the honey bee oocyte has long been the centre of attention as a means of describing differential fertility of the female castes (Engels, 1974). The synthesis of large amounts of vitellogenin by the queen fat body is intimately related to her high reproductive rate. The equally high vitellogenin titres in haemolymph of nonreproducing young worker bees, however, have been an enigma, as their ovaries are inactive in the presence of the queen. Vitellogenin expression has apparently become uncoupled from oocyte growth during the evolution of the sterile worker caste and has acquired secondary functions. It became involved in the production of royal jelly (Amdam et al., 2003) and in the regulation of worker lifespan (Amdam et al., 2004) through an inhibitory effect on the endocrine system (Guidugli et al., 2005). Along with such unique life-history traits related to socially organized reproduction, honey bees also promise to answer new questions involving meiosis, as the honey bee genome exhibits recombination rates that exceed those of all other higher organisms (Hunt & Page, 1995; Solignac et al., 2004), and as honey bee males, being haploid, forego meiosis I in producing gametes.
The honey bee genome sequence database (Honey Bee Genome Sequencing Consortium, 2006) has become an extremely valuable resource, not only for comparative genomics but also for functional genomics. One of the oldest and, for evolutionary biologists, most challenging questions in social insect biology is the development of a reproductive and a nonreproductive caste (Darwin, 1859). Apart from its implications for evolutionary theory in terms of kin selection (Hamilton, 1964), this is essentially a question of how developmental pathways diverge to shape distinct phenotypes, and how oogenesis is regulated to achieve levels of extremely high (queen) and extremely low (worker) fertility.
The annotation of genes related to caste development and differential reproduction in the honey bee has implications well beyond this species. It represents the first genome-wide annotation of a molecular architecture behind reproductive division of labour. In the light of current discussions on the importance of alternative phenotypes in the evolution of novelties (West-Eberhard, 2003), the honey bee genome information is certainly one of the most valuable resources. In the present manuscript we delineate a strategy on how to transcend from a straightforward gene annotation approach to functional studies based on motif analysis of upstream regulatory regions.
Results and discussion
From caste to BLAST: differentially expressed genes in caste development
The full list of genes that are overexpressed in fifth-instar queen or worker larvae is made available online in the Supplementary material (Table 1S). This list includes scaffold number, corresponding EST number(s), GLEAN3-predicted protein sequence, similarity and identity indices to corresponding Drosophila melanogaster orthologs, as well as protein domain information (Pfam).
A general result was that a relatively large subset of genes (nine of 34) overexpressed in honey bee queen larvae is represented by putative Drosophila orthologs for which no Gene Ontology (GO) term for Biological Process is indicated in Flybase. In contrast, all worker genes correspond to functionally relatively well-defined Drosophila genes. Even when taking into consideration the conceptual limits in attributing GO terms on biological process from Drosophila orthologs to honey bee genes, this finding could have a bearing on basic questions in socioevolution, namely, which caste is the novelty, the queen or the worker(s)? Phrased in other terms, the genome sequence information now permits us to address, at a molecular level, questions that are fundamental to understanding the role of (and evolutionary trends in) ontogenetic processes that structure insect societies, especially in hymenopterans. Such basic questions are: (1) how many degrees of freedom (or release from constraints) may actually have been gained from splitting the functions normally performed by a solitary ancestral hymenopteran female into two or more castes, and (2) how was this release from constraints integrated into postembryonic differentiation processes to generate truly alternative phenotypes? A second observation of potential interest to functional genomics was that a relatively large subset of the caste-related genes maps to chromosome 2 (seven of 51 unique sequences).
Most genes in the caste gene list are represented by one or two EST hits, except for a predicted hexamerin 70b gene (GB10869-PA). This gene was evidenced by 10 ESTs, one in a 5′-located exon and nine in the 3′ region (five ESTs comprising parts of exon 7 and parts of the 3′-UTR, the other four ESTs landing in exons 6 and 7). The macroarray data (Evans & Wheeler, 2000) established this gene as overexpressed in the worker caste. Hexamerins are an important class of storage proteins that show interesting expression patterns related to caste and reproduction in many social insects (Martinez et al., 2001; Hunt et al., 2003; Zhou et al., 2006a,b). A cDNA encoding the honey bee Hexamerin 70b subunit has recently been cloned and sequenced (Cunha et al., 2005), and hormone manipulation experiments showed that the abundance of hexamerin 70b transcripts in larval development is positively correlated with high levels of JH and ecdysteroids. This could actually reflect a regulatory feedback function in JH titre regulation, as exemplified in the termite Reticulitermes flavipes, where the Hex1/Hex2 ratio controls JH availability for caste-specifically differentiating tissues (Zhou et al., 2006b).
Within the honey bee caste genes for which GO information was imported and deduced from their Drosophila orthologs, we noted a predominance of terms clustering as ‘cellular physiological process’ (95%; GO:0050875) and ‘metabolism’ (90%; GO:0008152) in the ‘Biological Process’ (GO:0008150) category (Fig. 1A). GO-statistics differences between queens and workers became apparent in terms clustering as ‘cell differentiation’ (0% for queen and 28.5% for workers; GO:0030154) and ‘metabolism’ (96% for queen and 78.5% for worker; GO:0008152) in the ‘Biological Process’ (GO:0008150) category (Fig. 2A).
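As a rough illustration of how such per-caste GO term frequencies can be tabulated, the following sketch counts, for each caste, the percentage of genes annotated with a given term. It is not the FatiGO procedure used for Figs 1 and 2; the function name `go_term_frequencies` and the toy gene identifiers are hypothetical, and only the GO accessions are taken from the text.

```python
from collections import Counter


def go_term_frequencies(gene_to_terms):
    """Return {GO term: percentage of genes annotated with that term}."""
    n_genes = len(gene_to_terms)
    counts = Counter()
    for terms in gene_to_terms.values():
        counts.update(set(terms))  # count each term at most once per gene
    return {term: 100.0 * c / n_genes for term, c in counts.items()}


# Hypothetical toy annotations (gene ids invented for illustration only)
queen = {"q1": ["GO:0008152"], "q2": ["GO:0008152", "GO:0050875"]}
worker = {"w1": ["GO:0030154", "GO:0008152"], "w2": ["GO:0050875"]}

queen_freq = go_term_frequencies(queen)
worker_freq = go_term_frequencies(worker)

# Side-by-side listing: term, percent in queen set, percent in worker set
for term in sorted(set(queen_freq) | set(worker_freq)):
    print(term, queen_freq.get(term, 0.0), worker_freq.get(term, 0.0))
```

A term absent from one caste simply reports 0%, which is how a contrast such as that for ‘cell differentiation’ would show up in this kind of table.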
With respect to ‘Molecular Function’ (GO:0003674), most terms were related to mRNA translation: ‘nucleic acid binding’ (38%; GO:0003676), ‘structural constituent of ribosome’ (24%; GO:0003735), ‘protein binding’ (12%; GO:0005515), ‘nucleotide binding’ (12%; GO:0000166) and ‘translation factor activity, nucleic acid binding’ (7%; GO:0008135). Further important terms were ‘oxidoreductase activity’ (19%; GO:0016491) and ‘hydrolase activity’ (16.5%; GO:0016787) (Fig. 1B). For these latter two terms we noted potentially interesting differences related to caste, with ‘hydrolase activity’ being overrepresented by worker transcribed genes, whereas ‘oxidoreductase activity’ was exclusively represented by queen genes (Fig. 2B). Even though these GO assignments on Molecular Function are based on evidence from D. melanogaster, without experimental evidence for Apis mellifera, the corresponding genes are well conserved in sequence and show the relevant protein domains (Supplementary material, Table 1S), and thus are indicative of functional trends.
Figure 1. Dominant gene ontology terms for (A) Biological Process and (B) Molecular Function in honey bee genes with an experimentally validated caste-specific expression pattern during the last larval instar. The graph was generated by a FatiGO analysis set at level 3. Frequencies indicate the appearance of GO terms in the total set of queen and worker differentially expressed genes.
In general terms, the caste-specific separation into metabolic pathway preferences, oxidoreductases vs. hydrolases, may reflect the switch in diet that a worker larva experiences during the fourth and fifth larval instar. This represents a switch from a protein/lipid-rich diet to a more carbohydrate-rich diet (Haydak, 1970), and this switch apparently is accompanied by an increase in the expression of genes coding for proteins with hydrolase activity. Similar switches in gene expression patterns have recently been reported for D. melanogaster in an experiment where larvae were shifted from a cornmeal diet to a banana diet (Carsten et al., 2005), resulting in the up- or downregulation of 55 genes of a test population of 6000. Among these are five genes with dehydrogenase/oxidoreductase activity. These parallels in dietary switch responses are indicative of conserved coregulated gene networks. An open question is, of course, how these can be co-opted to generate different phenotypes, such as the castes of social insects. In this respect, social insects clearly go a big step beyond the simple metabolic switch response seen in Drosophila. They have apparently incorporated divergent metabolic regulation into a network architecture consistent with morphogenetic differentiation. This required that metabolic regulation became integrated, through the endocrine system, with developmental patterning processes.
The importance of metabolic regulation on caste development has also come to light in a recent Representational Difference Analysis (RDA) study on caste development in the highly eusocial stingless bee Melipona quadrifasciata (Judice et al., 2006). This is particularly interesting because in this genus caste development is thought to be based on a genetic predisposition (Kerr, 1950). Metabolic regulation may thus be a sine qua non for caste development, and caste-specific metabolic pathways may be set in motion rather independently of the nature of the initial switch (nutritional or genetic). The question of how this metabolic switch may integrate with the resultant endocrine signature characteristic for each caste is still a widely open field, but recent studies in Drosophila showing an interaction between ecdysone and insulin signalling in the determination of body size (Colombani et al., 2005; Mirth et al., 2005) may provide a lead.
This is also the point to reflect on how justified it is to heuristically rely on Drosophila orthologs and to use their GO attributes in a developmental context (caste differentiation) that has no parallel in Drosophila. A recent gene expression profiling study in the ant Camponotus festinatus, employing a microarray set-up of 384 clones, showed significantly different expression levels for larval vs. adult ants in 91 genes (21 confirmed by qRT–PCR), including an Apis hexamerin 70b ortholog (Goodisman et al., 2005). When comparing the temporal expression patterns of these ant genes with expression profiles for their respective Drosophila orthologs (Arbeitsman et al., 2002), relatively little accord was noted for the two species, leading to the suggestion that these genes may have taken on distinct functions due to the long divergence time between dipterans and hymenopterans (Goodisman et al., 2005). Differences aside, these examples show that in practically all studies on large-scale functional considerations in gene expression we are strongly wedded to Drosophila, and even though functional divergence in orthologs may have occurred, there is little experimental gene-by-gene evidence available for any of the major insect orders outside of Diptera.
Functional studies are clearly profiting from the now available honey bee genome sequence, as evident from the increasing number of RNAi experiments in honey bees (see citations in Honey Bee Genome Sequencing Consortium, 2006). This is still a small number compared with the large-scale RNAi assays established for Drosophila (Boutros et al., 2004), but the development of cell culture approaches in the honey bee (Bergem et al., 2006) represents a step in this direction.
Figure 2. Gene Ontology categories with caste-specific expression patterns for Biological Process (A). Genes classified as part of cell differentiation processes are significantly overexpressed in workers, whereas genes related to metabolism are overexpressed in queen larvae. In the Molecular Function categories (B) we observed an apparent split indicating differential enzyme preferences in queens (overexpress oxidoreductases) and in workers (overexpress hydrolases). The graph was generated by a FatiGO analysis set at level 3. Frequencies indicate the appearance of GO terms in the queen (black bars) and worker differentially expressed genes (grey bars).
Alternatively, regulatory functional associations between genes and their integration into networks can be inferred from the presence of response elements in upstream control regions. In our analysis of differentially expressed genes in queen-worker development we took a bioinformatics approach for a first look into the molecular architecture of a developmental polyphenism.
Motif search in upstream regions of differentially expressed genes
The genes related to caste development are among the first honey bee genes for which experimentally validated expression data were generated (Corona 2004). Certainly, these 51 genes do not comprise all the genes involved in caste development, but they are expected to be prominent players, as they were the ones that stood out in the SSH and DDRT-PCR approaches. The 51 caste genes do not represent gene families but rather fall into many very different molecular function categories. This made us ask whether the observed overexpression pattern of different genes in either queen or worker larvae may be associated with the occurrence of specific regulatory motifs in the upstream control regions (UCR) of these genes.
Three different algorithms, AlignACE (Roth et al., 1998), MEME (Bailey & Elkan, 1995) and MDscan (Liu et al., 2002), were used to construct a pipeline for detecting overrepresented motifs in the two unaligned sets of UCR sequences for the caste-specifically expressed genes. This pipeline was run on a ‘top-10’ set of 12 genes (six for each caste), which showed the most pronounced caste differences in expression (Evans & Wheeler, 2000), and also on a randomly selected set of UCRs (background control). We calculated four different metrics for each motif: MAP score (Roth et al., 1998), a group-specificity score (Church score) (Hughes et al., 2000), and a ROC AUC and MNCP metric (Clarke & Granek, 2003). A first set of filters was used to detect motifs with a potential for regulatory functions (MAP score ≥ 5; ROC AUC ≥ 0.7). This resulted in 46 motifs out of 123 total UCR motifs found in the queen UCR set, and in 71 motifs out of 261 total found in the worker UCR set (Supplementary material, Table 2S). A parametric statistical test (MANOVA, P = 0.0001, Wilks’ λ = 0.78, F = 7.2) and a nonparametric statistical test (Kolmogorov–Smirnov, Table 1) on ROC AUC and MNCP indices showed that these two sets of filtered motifs are significantly different from a randomly selected set of motifs. The rank-order metrics ROC AUC and MNCP have previously been used to compare the association of short regulatory sequence features with gene expression data (microarray analyses on coregulated genes), and they have been useful in flagging false positives erroneously included in ‘top-10’ sets of differentially expressed genes (Clarke & Granek, 2003).
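The first filtering step can be pictured with a small sketch. It computes a generic rank-based ROC AUC (the probability that a randomly chosen target UCR scores higher for a motif than a randomly chosen background UCR, ties counting one half) and then applies the two thresholds given in the text. This is an illustrative assumption about how such a filter might be coded, not the implementation of Clarke & Granek (2003); the function names and the per-UCR scores are hypothetical.

```python
def roc_auc(target_scores, background_scores):
    """Rank-based ROC AUC: fraction of (target, background) pairs in which
    the target scores higher; ties contribute one half."""
    wins = 0.0
    for t in target_scores:
        for b in background_scores:
            if t > b:
                wins += 1.0
            elif t == b:
                wins += 0.5
    return wins / (len(target_scores) * len(background_scores))


def passes_filter(map_score, auc, min_map=5.0, min_auc=0.7):
    """Thresholds from the text: MAP score >= 5 and ROC AUC >= 0.7."""
    return map_score >= min_map and auc >= min_auc


# Hypothetical motif site counts per UCR (caste set vs. random background)
caste_ucr_scores = [3, 2, 2]
random_ucr_scores = [1, 2, 0, 0]

auc = roc_auc(caste_ucr_scores, random_ucr_scores)
print(round(auc, 3), passes_filter(6.0, auc))
```

An AUC of 0.5 corresponds to a motif that does not separate the caste set from background at all, which is why a cut-off well above 0.5 is used.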
To select highly specific motifs found in each data set, we used the group-specificity score (Church score ≤ 1e−05; Hughes et al., 2000) to identify the most likely motifs involved in decision making for pathways leading to queen (two motifs, Fig. 3A) or to worker development (12 motifs with Church score ≤ 1e−07, Fig. 3B). As the SSH and DDRT-PCR approaches on caste development can be expected to retrieve only a subpopulation of such genes, these motifs represent only a partial scenario of the transcriptional regulatory network underlying caste development. The motifs can now be used to screen other GLEAN3-predicted genes to integrate a candidate list of putatively coregulated genes in caste development that can be submitted to further experimental validation.
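A group-specificity score of this kind is commonly computed as a hypergeometric tail probability: the chance that at least the observed number of motif-bearing UCRs falls inside the gene group if motif-bearing UCRs were drawn at random from the genome. The sketch below assumes that definition; the function name, argument names and the example counts are hypothetical and are not taken from the Supplementary material.

```python
from math import comb


def group_specificity(n_genome, n_group, n_with_motif, n_overlap):
    """Hypergeometric tail P(X >= n_overlap), where X is the number of
    group members among n_with_motif UCRs drawn from n_genome UCRs."""
    upper = min(n_group, n_with_motif)
    total = comb(n_genome, n_with_motif)
    tail = sum(
        comb(n_group, k) * comb(n_genome - n_group, n_with_motif - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 10,000 UCRs, a group of 6 genes, 100 UCRs carrying
# the motif, 5 of which belong to the group.
score = group_specificity(10_000, 6, 100, 5)
print(score, score <= 1e-05)
```

Smaller values mean that the motif is more tightly confined to the group, so a threshold such as ≤ 1e−05 selects motifs that are rarely found outside it.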
Each motif found in UCRs of queen (46) and worker (71) overexpressed genes was compared with the entire set of D. melanogaster cis-regulatory motifs contained in the TRANSFAC database (version 4.0; Wingender et al., 2000). Only alignments passing 80% identity for each position-specific site matrix (PSSM) were considered as significant matches. Whereas none of the most specific motifs for each caste showed similarity to any of the D. melanogaster motifs, some of the more ubiquitous ones did resemble binding sites of transcription factors, such as Antennapedia, Ultrabithorax, zerknüllt, even skipped, trithorax-like, tailless, paired, fushi tarazu and Adh transcription factor 1 (Supplementary material, Table 2S).
When we plotted the positions of the two queen and the 12 worker motifs in the UCRs of the caste-specifically expressed genes (Fig. 4), an interesting pattern emerged for the worker-specific motifs. Some of the worker motifs appeared to be clustered and occurring in tandem; furthermore, they were positioned relatively close to the predicted translation start sites in some of the genes that are overexpressed during worker development (annotation results of these genes are listed in Supplementary material, Table 1S). A position close to the predicted translation start sites is generally taken as a sign of strong regulatory effect (Davidson, 2001).
Table 1. Kolmogorov–Smirnov analysis of ROC AUC and MNCP metric for statistical significance of putative regulatory motifs in upstream control regions of genes with queen or worker-specific expression patterns. These motifs were contrasted with a random set of motifs detected in a random set of UCRs of GLEAN3-predicted honey bee genes.
Group pairs                 ROC AUC    MNCP
Random × (Queen + Worker)   P > 0.1    P < 0.001
Random × Queen              P > 0.1    P < 0.005
Random × Worker             P > 0.1    P < 0.001
Queen × Worker              P < 0.1    P > 0.1
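For readers unfamiliar with the Kolmogorov–Smirnov comparison summarized in Table 1, the following sketch computes the two-sample KS statistic D, the largest vertical distance between the empirical distribution functions of two sets of metric values. It only illustrates the statistic and does not derive P-values; the function name and the sample values are hypothetical and do not reproduce the data behind Table 1.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D: maximum absolute
    difference between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    for v in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, v) / len(a)  # fraction of a that is <= v
        cdf_b = bisect_right(b, v) / len(b)  # fraction of b that is <= v
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical MNCP values for motifs from a caste set and a random set
caste_mncp = [1.4, 1.6, 1.9, 2.2]
random_mncp = [0.9, 1.0, 1.1, 1.5]

print(ks_statistic(caste_mncp, random_mncp))
```

D ranges from 0 (identical empirical distributions) to 1 (no overlap); the significance levels in Table 1 would follow from D together with the two sample sizes.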
As caste development is highly dependent on changes in haemolymph titres of JH and ecdysteroids, we also screened the UCRs of the differentially expressed genes for putative nuclear receptor binding sites. Regulatory elements involved in the JH response are not well understood yet, so any prediction in this direction would be elusive (Wheeler & Nijhout, 2003). Functional ecdysone response elements (EcRE) have, however, been identified, and it is now well established that the EcR/USP complex binds to direct or inverted (palindromic) repeats (Riddihough & Pelham, 1987; Antoniewski et al., 1995; Perera et al., 2005). A PSSM search (Wassermann & Sandelin, 2004) based on a canonical representation (rGkTCAaTGamcy) (Perera et al., 2005) did not reveal any putative EcRE motif in the UCRs of the 51 caste-differentially expressed genes. However, this does not rule out that these genes respond to changes in JH and/or ecdysteroid titres, as these hormones require EcR/USP binding primarily in the expression of early genes, but not necessarily for the late response genes (Li & White, 2003; Sullivan & Thummel, 2003).
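A much simpler stand-in for the PSSM search is an exact consensus scan, sketched here under two stated assumptions: every letter of the consensus rGkTCAaTGamcy is read as an IUPAC nucleotide code regardless of case (r = A/G, k = G/T, m = A/C, y = C/T), and both strands are scanned. A PSSM would score partial matches, which this regular-expression version cannot do; the function names and the test sequence are hypothetical.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regex pattern."""
    return "".join(IUPAC[ch] for ch in consensus.upper())


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq, consensus):
    """Return (strand, start, matched site) for every exact consensus match.
    Minus-strand starts are given on the reverse-complemented sequence."""
    # The lookahead allows overlapping matches to be reported
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.group(1)))
    return hits


ECRE_CONSENSUS = "rGkTCAaTGamcy"  # canonical representation quoted in the text
ucr = "TTTAGGTCAATGACCTTTT"       # hypothetical upstream sequence
print(find_sites(ucr, ECRE_CONSENSUS))
```

An empty result for a set of UCRs would correspond to the negative finding reported above, with the caveat that an exact scan is stricter than a matrix-based search.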
In conclusion, the predictions from such a combined strategy, which searches for group-specific and for conserved regulatory motifs in GLEAN3-predicted honey bee genes, represent a major transition from nonhypothesis-driven high-throughput screens to hypothesis-driven searches for context-dependent gene expression in honey bees. Such directed search results can serve as a platform for experimental analyses of genome-wide integration in hormonal control of caste development in bees. In addition, this study exemplifies how existent algorithms for detecting shared regulatory motifs can be joined into a toolkit for predicting coregulated gene expression patterns in honey bees. These methods have been shown to be robust and are gaining acceptance for use in functional and comparative genomics (Liu et al., 2004; Pritsker et al., 2004; Zhu et al., 2005).
Figure 3. Putative regulatory motifs and their consensus sequences in UCRs of queen and worker overexpressed genes. Scores for MAP, Church, ROC AUC and MNCP metrics indicate degree of group specificity and significance level.
Oogenesis and reproduction
As caste development sets the stage for reproductive division of labour, genes involved in reproductive processes are strong candidates for functional analyses. In the present study we performed BLAST searches to identify honey bee orthologs for a list of 32 fly genes with the GO attribute ‘oogenesis’ and for four genes specifically related to ‘vitellogenesis’. The list for fly genes involved in nuclear events in germ cells consisted of 20 genes for ‘female meiosis’, 12 genes for ‘recombination’ and 21 genes under the heading ‘chromosome segregation, including segregation distortion’ (Supplementary material, Table 3S). In some cases these GO attributes for fly genes, of course, overlapped.
BLASTN and BLASTX searches for these fly genes against the honey bee genome assembly 3.0 and the GLEAN3 Official Set (aa) retrieved statistically well supported putative bee orthologs for most of these candidates. For the genes involved in meiosis, recombination and chromosome segregation this finding, although not unexpected, is of interest, as meiosis in the haploid honey bee drone is strongly modified when compared with a normal diploid meiosis. The first meiosis is initiated, but the nucleus remains undivided and only the superfluous centrioles are eliminated as cytoplasmatic buds (Hoage & Kessel, 1968). An interesting gene, thelytoky (th), has recently been mapped in this context (Lattorff et al., 2005). It almost completely prevents meiotic recombination in the automixis of laying workers of the Cape honey bee. As an indication of the interplay between meiosis and later development, this locus also appears to be an integral part of various gene cascades involved in caste determination (Lattorff et al., 2006).
The fly genes retrieved in the GO searches for ‘oogenesis’ represent a much larger range of Molecular Function categories, such as transcription factors, proteins regulating translation by RNA binding, RNA helicases, enzymes (ubiquitination, transfer of sugar residues, sulfotransferase), GTPase activity, and several factors binding to cytoskeletal proteins (Supplementary material, Table 3S). This wide range of functional categories is expected, as these genes are involved in a series of different steps during oogenesis in the polytrophic meroistic ovary. Oogenesis starts out with the maintenance of germline and somatic stem cell identity in the germline niche in the upper germarium. A key gene involved in this process is pumilio (Forbes & Lehmann, 1998), which is represented by a highly conserved bee ortholog, GB10504-PA. The second step is the formation of germ cell cysts, the determination of an oocyte within each cyst, and the survival of these cysts, involving genes such as benign gonial cell neoplasm (Lin et al., 1994), encore (Hawkins et al., 1996), ovo and ovarian tumour (otu) (Staab & Steinmann-Zwicky, 1995), all well conserved in the honey bee genome. Interestingly, we could not find a clear bee ortholog for bag of marbles (bam), which is one of the prime early response genes in the cystoblast differentiation pathway in Drosophila (McKearin, 1997).
Figure 4. Map of the group-specific motifs found in queen and worker UCRs of caste-specifically expressed genes. The coding region is represented by the GLEAN3 prediction number (assembly 4.0), with arrows indicating the translation start site. Asterisks mark UCRs of the ‘top-10’ set used to find the over-represented motifs.
The third step comprises previtellogenic growth of the follicle, and during these stages a number of maternal factors are deposited and anchored, either within the oocyte or in the perivitelline space, that define the egg and the future embryonic axes (for review see St Johnston & Nüsslein-Volhard, 1992). In the list of Drosophila genes involved in early steps of axis determination, a couple of surprises came up in the search for honey bee orthologs. A big surprise was that we could not find a gurken ortholog in the bee, even though this gene sets up both the anterior–posterior and dorsal–ventral axes in the Drosophila egg (González-Reyes et al., 1995), whereas downstream components of the Gurken signalling cascade appear to be preserved in the bee genome (Honey Bee Genome Sequencing Consortium, 2006). Similar apparent gaps in constituents of patterning cascades were noted for the terminal regions of the embryo, such as a lack of a torso ortholog, whereas its ligand torso-like is represented by a well conserved ortholog in the bee (GB18663-PA).
With respect to genes involved in the final processes of oogenesis, we primarily looked at genes that play a part during vitellogenesis. There are four genes of interest in this class, the primary one coding for the yolk protein precursor, vitellogenin. This gene has already been sequenced for the honey bee (Piulachs et al., 2003) and, as expected, it is much more related to vitellogenins of other insects and even vertebrates than to the Drosophila yolk proteins, which apparently are derived from lipases (Hagedorn et al., 1998). The second gene of interest is the bee ortholog to yolkless, as this (GB16571-PA) could represent a putative vitellogenin receptor. The other two Drosophila genes with clear orthologs in the bee are CG18641 and CG12139, which code for a lipase and an LDL receptor, respectively.
General conclusions
The current analysis made use of previous experimental analyses on differential transcription during caste development of honey bee larvae. In the annotation of these genes, which includes references to Gene Ontology terms associated with their respective Drosophila orthologs, two major configurations emerged. First of all, worker genes were better defined in terms of GO attributes compared with the relatively large number of queen genes that had no GO terms associated to their respective Drosophila orthologs. Even when taking into consideration the conceptual limits in attributing GO terms on molecular function and biological process from Drosophila orthologs to honey bee genes, this finding could have a bearing on general basic questions in socioevolution, namely which caste is more divergent from a nonsocial reproductive female bee prototype or reproductive ground plan, the queen or the worker? Less speculative is the second major conclusion coming out of the GO analysis for Molecular Function, showing and confirming (Eder et al., 1983; Corona et al., 1999) the important role of metabolic regulation in caste development. This facet is demonstrated especially clearly in the caste-specific expression of oxidoreductases (queen) vs. hydrolases (workers).
The honey bee genome information provided not only a much improved annotation platform for caste-specifically expressed ESTs but, even more so, opens the possibility to explore putative regulatory features of the honey bee genome. In the current study we employed modified Gibbs sampling and expectation-maximization algorithms (AlignACE, MDscan, MEME) to detect group-specific motifs in gene regions up to 1000 bp upstream of translation start sites. We detected 14 motifs that were significantly overrepresented in the caste genes when compared with corresponding motifs found in a random set of GLEAN3-predicted honey bee genes. The localization of such motifs in UCRs of worker-overexpressed genes revealed a clustering close to the predicted basal promoter regions, suggesting strong regulatory effects. Such search strategies and the detected motifs can provide the lead to reveal and unravel cis-regulatory networks for and within specific contexts of honey bee biology.
Caste polyphenism in social insects makes a strong case for the emergence of novelties at a microevolutionary level (West-Eberhard, 2003). Without the pretension to discuss exhaustively the mechanisms underlying this surge of developmental plasticity, two major themes become apparent in this and other studies. Regulatory change has been demonstrated in the shut-down of wing disc patterning cascades in ants (Abouheif & Wray, 2002) and is certainly also implicit in observed temporal changes in gene expression during postembryonic development of ants and bumble bees (Goodisman et al., 2005; Pereboom
et al., 2005). Such change would be expected to involve cis-regulatory elements, that is, change in transcription factor binding sites in UCRs, as approached in this study, and also evolutionary change in response thresholds to circulating morphogenetic hormones (for review see Hartfelder & Emlen, 2005). The second and quite unexpected theme is the acquisition of new systemic functions by evolutionarily rather old proteins, such as vitellogenin and hexamerins. These apparently unspectacular proteins have evolved into key players for caste evolution and reproductive division of labour via novel regulatory connectivity with JH (Amdam et al., 2004; Guidugli et al., 2005; Zhou et al., 2006a,b).
Experimental procedures
Selection and annotation of ESTs representing differentially expressed genes in honey bee caste development
The starting point was a set of 164 entries (mainly 3′-ESTs) in GenBank (BG101532–BG101697) from an SSH library (Evans & Wheeler, 1999, 2000). When validated by macroarray analyses, a clustering into three major classes became apparent: (I) genes overexpressed in young larvae, (II) genes overexpressed in last instar queen larvae, and (III) genes overexpressed in last instar worker larvae. For this study we excluded the class I ESTs because their expression is not caste-specific, but rather represents expression differences between young (still bipotent) and older larvae. To the class II queen ESTs (82) we added one complete cDNA entry (AY601642) from a DDRT-PCR screen (Corona et al., 1999), and to the class III set of worker ESTs (40) we added seven GenBank dbEST entries (BG149167–BG149173) from a DDRT-PCR screen on ovary development (Hepperle & Hartfelder, 2001).
The EST sequences were submitted to BLASTN searches (parameters: -G 2 -E 3 -W 15 -F 'm D' -U -e 1e-20) against genome sequence assembly Amel_v3.0 to retrieve matches in linked or unlinked genomic contigs and to exclude no-matches (seven ESTs in queen). ESTs that aligned within the same scaffold were checked for clustering and overlap. This clustering also served to exclude genes that were represented by non-overlapping ESTs from both castes. This procedure generated a set of 51 unique putative gene sequences overexpressed in either queen (34) or worker larvae (17). These 51 nonredundant sequences were submitted to BLASTX searches against the Official Set of GLEAN3-predicted protein sequences (cut-off value at 1e-20). For ESTs with no significant protein sequence matches, the genomic regions adjacent to the mapped EST were searched to find neighbouring ORFs, especially those nearest to putative 3′ UTRs of predicted proteins, as the EST libraries have a bias in this direction.
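The overlap check on ESTs mapped to the same scaffold can be illustrated with a minimal Python sketch. It is not the script used in this study: the function name, the tuple layout and the example identifiers are hypothetical, and hits are assumed to be already reduced to one coordinate pair per EST.

```python
from collections import defaultdict


def cluster_est_hits(hits):
    """Group EST hits that overlap on the same scaffold.

    hits: iterable of (est_id, scaffold, start, end), with start <= end.
    Returns a list of clusters, each a sorted list of EST ids.
    """
    by_scaffold = defaultdict(list)
    for est_id, scaffold, start, end in hits:
        by_scaffold[scaffold].append((start, end, est_id))

    clusters = []
    for scaffold in sorted(by_scaffold):
        current, current_end = [], None
        # Walk the hits in coordinate order and merge while they overlap.
        for start, end, est_id in sorted(by_scaffold[scaffold]):
            if current and start <= current_end:
                current.append(est_id)
                current_end = max(current_end, end)
            else:
                if current:
                    clusters.append(sorted(current))
                current, current_end = [est_id], end
        if current:
            clusters.append(sorted(current))
    return clusters


# Hypothetical EST ids, scaffold names and coordinates, for illustration only.
example_hits = [
    ("EST_A", "Group1.1", 100, 400),
    ("EST_B", "Group1.1", 350, 700),
    ("EST_C", "Group1.1", 900, 1200),
    ("EST_D", "Group2.5", 50, 300),
]
print(cluster_est_hits(example_hits))
```

ESTs that fall into one cluster would be treated as evidence for the same putative gene; ESTs on the same scaffold that do not overlap stay in separate clusters and would need the manual inspection described above.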
Official Set protein sequences were aligned against the Amel_v3.0 sequence assembly using TBLASTN to map protein to genome, and subsequently they were aligned using BLASTP against the GenBank nonredundant (nr) and the Flybase protein sequence databases. The manual features annotation procedure of the Artemis 7.0 program (Rutherford et al., 2000) was used to map ORFs, putative splice sites of exons and ESTs to genome coordinates. The final annotation file was generated with a Python script in GFF format (http://www.sanger.ac.uk/Software/formats/GFF).
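For orientation, a GFF record is a single line of nine tab-separated columns (sequence name, source, feature, start, end, score, strand, frame, attributes). The sketch below shows how one such line could be written in Python; the helper name and the example values are hypothetical and do not reproduce the annotation script used here.

```python
def gff_line(seqid, source, feature, start, end,
             score=".", strand=".", frame=".", attributes=""):
    """Format one feature as a nine-column, tab-separated GFF line."""
    if start > end:
        raise ValueError("start must not exceed end")
    fields = [seqid, source, feature, str(start), str(end),
              str(score), strand, str(frame), attributes]
    return "\t".join(fields)


# Hypothetical scaffold name, coordinates and attribute text.
example_line = gff_line("Group1.1", "manual", "exon", 1200, 1450,
                        strand="+", attributes="example_gene")
print(example_line)
```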
Honey bee sequences annotated as orthologs to D. melanogaster genes were putatively assigned the GO terms listed in the respective Flybase entry. In addition, the definition of new GO terms (Biological Process ontology) related to caste development and polyphenism (GO:0048651 and GO:0048650, respectively) was co-ordinated with the Gene Ontology Consortium (Ashburner et al., 2000). The FatiGO web tool (Al-Sharour et al., 2004) was used to cluster GO terms (level 3 setting) for Biological Process and Molecular Function.
For the detection of conserved domains, the 51 protein sequences were screened against the Pfam database (http://www.sanger.ac.uk/Software/Pfam) using the HMMER platform (current release 2.3.2; http://hmmer.wustl.edu) with a cut-off value set at 1e-10.
Annotation of oogenesis and reproduction genes
In order to identify putative honey bee orthologs to D. melanogaster genes, we searched the following GO terms in Flybase: 'oogenesis' (GO:0009993), 'vitellogenesis' (GO:0007296), 'female meiosis' (GO:0007143), 'DNA recombination' (GO:0006310) and 'chromosome segregation' (GO:0007059). Genes related to segregation distortion were searched for in Flybase in phenotypic descriptions and mutant effects of D. melanogaster genes, as this phenomenon is not represented by a GO term. Hence, this group may be more heterogeneous than the others. From this list we removed genes of pleiotropic function (multifaceted GO entries in Biological Process) and genes that lacked defined transcripts in the Drosophila genome database.
For the GO terms 'oogenesis' and 'vitellogenesis' we performed TBLASTN and BLASTP searches for 42 fruit fly genes against the Amel_v3.0 genome assembly and the GLEAN3-predicted protein sequences (Honey Bee Genome Sequencing Consortium, 2006), respectively. The orthologous D. melanogaster gene was characterized by the same procedure as described above (reciprocal best hit). For the GO terms 'female meiosis', 'DNA recombination', 'chromosome segregation' and the non-GO group 'segregation distortion', transcripts of D. melanogaster were searched against nr databases at NCBI using BLASTP. The obtained sequences were searched against the Amel_v3.0 genome assembly and the GLEAN3-predicted protein sequences using TBLASTN. Homologous sequences (threshold 1e-10) were predicted using the BioEdit software (Hall, 1999). ORFs showing significant homology (BLASTP threshold 1e-20) were assembled and used in BLASTP searches against the nr databases at NCBI.
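The reciprocal best hit criterion mentioned above can be stated compactly: a fly gene and a bee gene are paired only if each is the other's top-scoring hit. A minimal Python sketch follows, assuming the best hit per query has already been extracted from the BLAST reports; the function name and all identifiers in the example are hypothetical.

```python
def reciprocal_best_hits(fly_to_bee, bee_to_fly):
    """Return {fly_gene: bee_gene} pairs that are mutual best hits.

    fly_to_bee: best bee hit for each fly query.
    bee_to_fly: best fly hit for each bee query.
    """
    return {
        fly: bee
        for fly, bee in fly_to_bee.items()
        if bee_to_fly.get(bee) == fly
    }


# Hypothetical identifiers, for illustration only.
fly_best = {"fly_gene_1": "bee_gene_A", "fly_gene_2": "bee_gene_B"}
bee_best = {"bee_gene_A": "fly_gene_1", "bee_gene_B": "fly_gene_9"}
print(reciprocal_best_hits(fly_best, bee_best))
```

In this example only the first pair is retained, because the best fly hit of bee_gene_B is a different fly gene.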
Motif search in upstream regions in caste-specifically expressed genes
In order to detect overrepresented motifs in the upstream control regions (UCRs) of the two sets of caste-related genes, we selected gene subsets based on two criteria: (1) those that had shown the highest caste-specificity in the array analyses (Evans & Wheeler, 2000) and (2) those that had a conserved 5′ region when compared with the Drosophila orthologs. These 'top10' genes consisted of six queen genes (GB13072, GB11628, GB19380, GB14798, GB16047 and GB18242) and of six worker genes (GB10869, GB12371, GB12239, GB10428, GB19006, GB14758). The motif search was conducted separately on the two sets of UCR sequences using three methods: AlignACE (Roth et al., 1998), MEME (Bailey & Elkan, 1995) and MDscan (Liu et al., 2002). Default parameter values were used in all searches, except that GC content in intergenic regions was set to 25%, representing the background value established for the honey bee UCR database generated in this study.
The database containing 10 156 UCR sequences was generated by parsing the Official Set annotation file (downloaded in GFF format from http://www.beegenome.hgsc.bcm.tmc.edu/beeftp.html) to extract upstream regions starting from the terminal 5′-genomic coordinate of each predicted CDS. The UCRs were arbitrarily set to a size frame of 1000 nucleotides (Roth et al., 1998), but were trimmed whenever another predicted ORF was detected in any of these regions.
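The coordinate logic behind such a UCR extraction can be sketched in Python as follows. This is an illustration under stated assumptions, not the parser used in this study: coordinates are taken to be 1-based and inclusive as in GFF, neighbouring ORFs are passed in as plain (start, end) pairs on the same scaffold, the scaffold end is not checked, and the function name is hypothetical.

```python
def upstream_region(cds_start, cds_end, strand, neighbours, size=1000):
    """Return (start, end) of the upstream window, or None if nothing is left.

    The window covers `size` nucleotides 5' of the CDS and is trimmed
    so that it stops short of any neighbouring ORF that reaches into it.
    """
    if strand == "+":
        # Upstream lies at lower coordinates than the CDS start.
        start, end = max(1, cds_start - size), cds_start - 1
        for n_start, n_end in neighbours:
            if n_end >= start and n_start <= end:
                start = max(start, n_end + 1)
    else:
        # On the minus strand, upstream lies beyond the CDS end.
        start, end = cds_end + 1, cds_end + size
        for n_start, n_end in neighbours:
            if n_start <= end and n_end >= start:
                end = min(end, n_start - 1)
    return (start, end) if start <= end else None


# Hypothetical coordinates, for illustration only.
print(upstream_region(5000, 6000, "+", []))
print(upstream_region(5000, 6000, "+", [(4200, 4500)]))
print(upstream_region(5000, 6000, "-", [(6500, 6800)]))
```

The first call keeps the full 1000 nucleotides; the other two are shortened because a neighbouring ORF falls inside the window.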
The MAP (maximum a posteriori log likelihood) score, group specificity score (called Church score in this manuscript) (Hughes et al., 2000), ROC AUC (area under the curve for a receiver-operator characteristic plot) metric and MNCP (mean normalized conditional probability) metric (Clarke & Granek, 2003) were used to detect motifs that most likely correspond to biologically significant cis-regulatory elements. The filters run on the UCRs of the subsets of queen and worker genes were a MAP score cut-off value of 50, followed by a ROC AUC cut-off at 0.7, followed by a group specificity score cut-off at 1e-05. The UCR database of all GLEAN3-predicted honey bee genes was used as the background to calculate these metrics.
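The three filters act as a cascade, each one applied to the survivors of the previous step. A minimal Python sketch is given below. It assumes that each motif is represented as a dictionary with precomputed scores, that higher MAP and ROC AUC values are better, and that a lower group specificity score is better; the key names, the function name and all numbers in the example are hypothetical, and the MAP threshold is left to the caller.

```python
def filter_motifs(motifs, map_cutoff, roc_auc_cutoff=0.7, church_cutoff=1e-5):
    """Apply the MAP, ROC AUC and group specificity filters in sequence."""
    kept = [m for m in motifs if m["map"] >= map_cutoff]
    kept = [m for m in kept if m["roc_auc"] >= roc_auc_cutoff]
    # Assumption: the group specificity score behaves like a probability,
    # so smaller values indicate stronger specificity.
    kept = [m for m in kept if m["church"] <= church_cutoff]
    return kept


# Made-up scores, for illustration only.
example_motifs = [
    {"name": "m1", "map": 12.0, "roc_auc": 0.85, "church": 1e-7},
    {"name": "m2", "map": 2.0, "roc_auc": 0.90, "church": 1e-9},
    {"name": "m3", "map": 15.0, "roc_auc": 0.60, "church": 1e-8},
    {"name": "m4", "map": 20.0, "roc_auc": 0.75, "church": 1e-3},
]
passed = filter_motifs(example_motifs, map_cutoff=10.0)
print([m["name"] for m in passed])
```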
A parametric test (MANOVA) and a nonparametric test (Kolmogorov–Smirnov) were conducted to identify significance levels for the two sets of filtered motifs found in the UCRs of caste-specifically expressed genes against filtered motifs found in the random UCR set. Random motifs were sampled from a motif database (10 391 motifs) generated by running our script 100 times with random sets of UCR sequences.
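As a reminder of what the nonparametric comparison measures, the two-sample Kolmogorov–Smirnov statistic is the largest vertical distance between the empirical cumulative distributions of the two samples. The Python sketch below computes only this statistic (no P-value) and is not the implementation used in this study; the function name and the example values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D (maximum ECDF distance)."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The maximum distance is always reached at one of the observed values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Made-up motif scores for a caste set and a random set.
caste_scores = [1, 2, 3, 4]
random_scores = [3, 4, 5, 6]
print(ks_statistic(caste_scores, random_scores))
```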
The main criterion for identifying known regulatory motifs among these caste-specific ones was the alignment of the PSSM for each bee motif with the Drosophila melanogaster sequences in the TRANSFAC database (release 4.0) (Wingender et al., 2000). Only the alignments passing a threshold of 80% identity for each PSSM were considered as significant matches. In addition, we checked for a specific binding motif, the EcR/USP motif (rGkTCAaTGamcy-3′), known to function in the expression of genes responding to morphogenetic hormone titres (Perera et al., 2005).
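A consensus written in IUPAC nucleotide codes, such as the EcR/USP motif above, can be checked against a UCR by expanding the ambiguity codes into a regular expression. The Python sketch below is one way to do this and not the procedure used in this study. It treats lower-case and upper-case letters alike (which ignores any difference in conservation they may encode), scans the given strand only, and uses a hypothetical test sequence.

```python
import re

# Standard IUPAC nucleotide ambiguity codes (subset sufficient here).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}


def find_motif(sequence, motif):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    body = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead lets the scan report overlapping occurrences as well.
    pattern = re.compile("(?=" + body + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]


# Hypothetical upstream fragment containing one copy of the consensus.
print(find_motif("TTAGGTCAATGACCTAA", "rGkTCAaTGamcy"))
```

A fuller scan would also search the reverse complement, since binding sites can occur on either strand.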
Operating system and programming tools
An Ubuntu Linux (version Breezy) operating system was used to implement all scripts and pipelines designed for annotation procedures and motif discovery. The Python programming language (http://www.python.org), Biopython (http://www.biopython.org) and TAMO (Tools for Analysis of Motifs) packages (Gordon et al., 2005) were used in program design. Other web applications were built using the Zope application server (http://www.zope.org), hosted at http://zulu.fmrp.usp.br/beelab.
Acknowledgements
This work was supported by grants from FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo, 9900719-6) and CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico, 4729632-3-1).
References
Abouheif, E. and Wray, G.A. (2002) Evolution of the genetic network underlying wing polyphenism in ants. Science 297: 249–252.
Al-Sharour, F., Diaz-Uriarte, R. and Dopazo, J. (2004) FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes. Bioinformatics 20: 578–580.
Amdam, G.V., Norberg, K., Hagen, A. and Omholt, S.W. (2003) Social exploitation of vitellogenin. Proc Natl Acad Sci USA 100: 1799–1802.
Amdam, G.V., Simões, Z.L.P., Hagen, A., Norberg, K., Schroder, K., Mikkelsen, O. et al. (2004) Hormonal control of the yolk precursor vitellogenin regulates immune function and longevity in honeybees. Exp Gerontol 39: 767–773.
Antoniewski, C., O'Grady, M.S., Edmondson, R.G., Lassieur, S.M. and Benes, H. (1995) Characterization of an EcR/USP heterodimer target site that mediates ecdysone responsiveness of the Drosophila Lsp-2 gene. Mol Gen Genet 249: 545–556.
Arbeitman, M.N., Furlong, E.E.M., Imam, F., Johnson, E., Null, B.H., Baker, B.S. et al. (2002) Gene expression during the life cycle of Drosophila melanogaster. Science 297: 2270–2275.
Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M. et al. (2000) Gene Ontology: tool for the unification of biology. Nature Genet 25: 25–29.
Bailey, T.L. and Elkan, D. (1995) Unsupervised learning of multiple motifs in biopolymers using expectation maximization. Machine Learning J 21: 58–83.
Bergem, M., Norberg, K. and Aamodt, R.M. (2006) Long-term maintenance of in vitro cultured honeybee (Apis mellifera) embryonic cells. BMC Dev Biol 6: 17 (online).
Boutros, M., Kiger, A.A., Armknecht, S., Kerr, K., Hild, M., Koch, B. et al. (2004) Genome-wide RNAi analysis of growth and viability in Drosophila cells. Science 303: 832–835.
Carsten, L.D., Watts, T. and Markow, T.A. (2005) Gene expression patterns accompanying a dietary switch in Drosophila melanogaster. Mol Ecol 14: 3203–3208.
Clarke, N.D. and Granek, J.A. (2003) Rank order metrics for quantifying the association of sequence features with gene regulation. Bioinformatics 19: 212–218.
Colombani, J., Bianchini, L., Layalle, S., Pondeville, E., Dauphin-Villemant, C., Antoniewski, C. et al. (2005) Antagonistic actions of ecdysone and insulins determine final size in Drosophila. Science 310: 667–670.
Corona, M., Estrada, E. and Zurita, M. (1999) Differential expression of mitochondrial genes between queens and workers during caste determination in the honeybee Apis mellifera. J Exp Biol 202: 929–938.
Cunha, A.C., Nascimento, A.M., Guidugli, K.R., Simões, Z.L.P. and Bitondi, M.M.G. (2005) Molecular cloning and expression of a hexamerin cDNA from the honey bee. J Insect Physiol 51: 1135–1147.
Darwin, C.R. (1859) On the Origin of Species by Means of Natural Selection. John Murray, London.
Davidson, E.H. (2001) Genomic Regulatory Systems: Development and Evolution. Academic Press, San Diego.
Eder, J., Kremer, J.P. and Rembold, H. (1983) Correlation of cytochrome c titer and respiration in Apis mellifera: adaptive responses to caste determination defines workers, intercastes and queens. Comp Biochem Physiol B 76: 703–716.
Engels, W. (1974) Occurrence and significance of vitellogenins in female castes of social Hymenoptera. Am Zool 14: 1229–1237.
Evans, J.D. and Wheeler, D.E. (1999) Differential gene expression between developing queens and workers in the honey bee, Apis mellifera. Proc Natl Acad Sci USA 96: 5575–5580.
Evans, J.D. and Wheeler, D.E. (2000) Expression profiles during honeybee caste determination. Genome Biol 2: 1, 6 pages, online http://genomebiology.com/2000/2/1/research/0001.1
Forbes, A. and Lehmann, R. (1998) Nanos and Pumilio have critical roles in the development and function of Drosophila germline stem cells. Development 125: 679–690.
González-Reyes, A., Elliot, H. and St Johnston, D. (1995) Polarization of both major body axes in Drosophila by gurken-torpedo signalling. Nature 375: 654–658.
Goodisman, M.A., Isoe, J., Wheeler, D.E. and Wells, M.A. (2005) Evolution of insect metamorphosis: a microarray-based study of larval and adult gene expression in the ant Camponotus festinatus. Evolution 59: 858–870.
Gordon, D.B., Nekludova, L., McCallum, S. and Fraenkel, E. (2005) TAMO: a flexible, object-oriented framework for analysing transcriptional regulation using DNA-sequence motifs. Bioinformatics 21: 3164–3165.
Guidugli, K.R., Hepperle, C. and Hartfelder, K. (2004) A member of the short-chain dehydrogenase/reductase (SDR) superfamily is a target of the ecdysone response in honey bee (Apis mellifera) caste development. Apidologie 35: 37–47.
Guidugli, K.R., Nascimento, A.M., Amdam, G.V., Barchuk, A.R., Omholt, S.W., Simões, Z.L.P. and Hartfelder, K. (2005) Vitellogenin regulates hormonal dynamics in the worker caste of a eusocial insect. FEBS Lett 579: 4961–4965.
Hagedorn, H.H., Maddison, D.R. and Tu, Z. (1998) The evolution of vitellogenins, cyclorrhaphan yolk proteins and related molecules. Adv Insect Physiol 27: 335–384.
Hall, T.A. (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41: 95–98.
Hamilton, W.D. (1964) The genetical theory of social behaviour, I & II. J Theor Biol 7: 1–16, 17–52.
Hartfelder, K. and Emlen, D.J. (2005) Endocrine control of insect polyphenism. In Comprehensive Insect Molecular Science (Gilbert, L.I., Iatrou, K. and Gill, S., eds), Vol. 3, pp. 651–703. Elsevier, Oxford.
Hartfelder, K. and Engels, W. (1998) Social insect polymorphism: hormonal regulation of plasticity in development and reproduction in the honeybee. Curr Topics Dev Biol 40: 45–77.
Hawkins, N.C., Thorpe, J. and Schüpbach, T. (1996) encore, a gene required for the regulation of germ line mitosis and oocyte differentiation during Drosophila oogenesis. Development 122: 281–290.
Haydak, M.H. (1970) Honey bee nutrition. Annu Rev Entomol 15: 143–156.
Hepperle, C. and Hartfelder, K. (2001) Differentially expressed regulatory genes in honey bee caste development. Naturwissenschaften 88: 113–116.
Hoage, T.R. and Kessel, R.G. (1968) An electron microscopic study of the process of differentiation during spermatogenesis in the drone honey bee (Apis mellifera L.) with special reference to centriole replication and elimination. J Ultrastruct Res 24: 6–32.
Honey Bee Genome Sequencing Consortium (2006) Insights into social insects from the genome of the honeybee Apis mellifera. Nature (in press).
Hughes, J.D., Estep, P.W., Tavazoie, S. and Church, G.M. (2000) Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae. J Mol Biol 296: 1205–1214.
Hunt, G.J. and Page, R.E. (1995) Linkage map of the honey bee, Apis mellifera, based on RAPD markers. Genetics 139: 1371–1382.
Hunt, J.H., Buck, N.A. and Wheeler, D.E. (2003) Storage proteins in vespid wasps: characterization, developmental pattern, and occurrence in adults. J Insect Physiol 49: 785–794.
Judice, C.C., Carazzole, M.F., Festa, F., Sogayar, M.C., Hartfelder, K. and Pereira, G.A.G. (2006) Gene expression profiles underlying alternative caste phenotypes in a highly eusocial bee, Melipona quadrifasciata. Insect Mol Biol 15: 33–44.
Kerr, W.E. (1950) Genetic determination of castes in the genus Melipona. Genetics 35: 143–152.
Kropácová, S. and Haslbachová, H. (1971) The influence of queenlessness and of unsealed brood on the development of ovaries in worker honeybees. J Apicult Res 10: 57–61.
Lattorff, H.M.G., Moritz, R.F.A. and Fuchs, S. (2005) A single gene determines thelytokous parthenogenesis in honey bee workers (Apis mellifera capensis). Heredity 94: 533–537.
Lattorff, H.M.G., Moritz, R.F.A., Solignac, M. and Crewe, R.M. (2006) Control of reproductive dominance by the thelytoky gene in honeybees. Biol Lett (in press).
Li, T.R. and White, K.P. (2003) Tissue-specific gene expression and ecdysone-regulated genomic networks in Drosophila. Dev Cell 5: 59–72.
Lin, H., Yue, L. and Spradling, A.C. (1994) The Drosophila fusome, a germline-specific organelle, contains membrane skeletal proteins and functions in cyst formation. Development 120: 947–956.
Liu, X.S., Brutlag, D.L. and Liu, J.S. (2002) An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments. Nature Biotechnol 20: 835–839.
Liu, Y., Liu, S., Wei, L., Altman, R.B. and Batzoglou, S. (2004) Eukaryotic regulatory element conservation analysis and identification using comparative genomics. Genome Res 14: 354–366.
Martinez, T., Burmester, T., Veenstra, J.A. and Wheeler, D.E. (2001) Sequence and evolution of a hexamerin from the ant Camponotus festinatus. Insect Mol Biol 9: 427–431.
McKearin, D. (1997) The Drosophila fusome, organelle biogenesis and germ cell differentiation: if you build it… Bioessays 19: 147–152.
Mirth, C., Truman, J.W. and Riddiford, L.M. (2005) The role of the prothoracic gland in determining critical weight for metamorphosis in Drosophila melanogaster. Curr Biol 15: 1–12.
Moritz, R.F.A., Kryger, P. and Allsopp, M.H. (1996) Competition for royalty in bees. Nature 384: 31.
Page, R.E. and Erickson, E.H. (1988) Reproduction by worker honey bees (Apis mellifera L.). Behav Ecol Sociobiol 23: 117–126.
Pereboom, J.J.M., Jordan, W.C., Sumner, S., Hammond, R.L. and Bourke, A.F.G. (2005) Differential gene expression in queen-worker caste determination in bumble bees. Proc R Soc B 272: 1145–1152.
Perera, S.C., Zheng, S., Feng, Q.L., Krell, P.J., Retnakaran, A. and Palli, S.R. (2005) Heterodimerization of ecdysone receptor and ultraspiracle on symmetric and asymmetric response elements. Arch Insect Biochem Physiol 60: 55–70.
Piulachs, M.D., Guidugli, K.R., Barchuk, A.R., Cruz, J., Simões, Z.L.P. and Bellés, X. (2003) The vitellogenin of the honeybee, Apis mellifera: structural analysis of the cDNA and expression studies. Insect Biochem Mol Biol 33: 459–465.
Pritsker, M., Liu, Y.C., Beer, M.A. and Tavazoie, S. (2004) Whole
genome discovery of transcription factor binding sites by network-level conservation. Genome Res 14: 99–108.
Rachinsky, A., Strambi, C., Strambi, A. and Hartfelder, K. (1990) Caste and metamorphosis – hemolymph titers of juvenile hormone and ecdysteroids in last instar honeybee larvae. Gen Comp Endocr 79: 31–38.
Riddihough, G. and Pelham, H.R.B. (1987) An ecdysone response element in the Drosophila hsp27 promoter. EMBO J 6: 3729–3734.
Roth, F.P., Hughes, J.D., Estep, P.W. and Church, G.M. (1998) Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantification. Nature Biotechnol 16: 939–945.
Rutherford, K., Parkhill, J., Crook, J., Horsnell, T., Rice, P., Rajandream, M.A. and Barrell, B. (2000) Artemis: sequence visualisation and annotation. Bioinformatics 16: 944–945.
Schmidt-Capella, I.C. and Hartfelder, K. (1998) Juvenile hormone effect on DNA synthesis and apoptosis in caste-specific differentiation of the larval honey bee (Apis mellifera L.) ovary. J Insect Physiol 44: 385–391.
Snodgrass, R.E. (1956) Anatomy of the Honey Bee. Cornell University Press, Ithaca.
Solignac, M., Vautrin, D., Baudry, E., Mougel, F., Loiseau, A. and Cornuet, J.M. (2004) A microsatellite-based linkage map of the honeybee, Apis mellifera L. Genetics 167: 253–262.
St Johnston, D. and Nüsslein-Volhard, C. (1992) The origin of pattern and polarity in the Drosophila embryo. Cell 68: 201–219.
Staab, S. and Steinmann-Zwicky, M. (1995) Female germ cells of Drosophila require zygotic ovo and otu product for survival in larvae and pupae respectively. Mech Dev 54: 205–210.
Sullivan, A.A. and Thummel, C.S. (2003) Temporal profiles of nuclear receptor gene expression reveal coordinate transcriptional responses during Drosophila development. Mol Endocrinol 17: 2125–2137.
Wassermann, W.W. and Sandelin, A. (2004) Applied bioinformatics for the identification of regulatory elements. Nature Rev Genet 5: 267–287.
West-Eberhard, M.J. (2003) Developmental Plasticity and Evolution. Oxford University Press, Oxford.
Wheeler, D.E. and Nijhout, H.F. (2003) A perspective for understanding the modes of juvenile hormone as a lipid signaling system. Bioessays 25: 994–1001.
Wilson, E.O. (1971) The Insect Societies. Belknap Press of Harvard University Press, Cambridge, MA.
Wingender, E., Chen, X., Hehl, R., Karas, H., Liebich, I., Matys, V. et al. (2000) TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res 28: 316–319.
Zhou, X., Oi, F.M. and Scharf, M.E. (2006a) Social exploitation of hexamerin: RNAi reveals a major caste-regulatory factor in termites. Proc Natl Acad Sci USA 103: 4499–4504.
Zhou, X., Tarver, M.R., Bennett, G.W., Oi, F.M. and Scharf, M.E. (2006b) Two hexamerin genes from the termite Reticulitermes flavipes: sequence, expression, and proposed functions in caste regulation. Gene 376: 47–58.
Zhu, Z., Shendure, J. and Church, G.M. (2005) Discovering functional transcription factor combinations in the human cell cycle. Genome Res 15: 848–855.
Supplementary material
The following material is available for this article online:
Table S1. Annotation results of caste-specifically expressed honey bee genes.
Table S2. Caste-specific motifs in UCRs of honey bee genes.
Table S3. Annotation of honey bee orthologs to Drosophila melanogaster genes involved in reproduction.
This material is available as part of the online article from http://www.blackwell-synergy.com
was initiated in the late nineties. The main body of currently available data resulted from a cDNA library generated by suppression subtractive hybridization (SSH) that contrasted queen and worker larvae (Evans & Wheeler, 1999). Subsequent macroarray analyses (Evans & Wheeler, 2000) revealed a clustering of these expressed sequence tags (ESTs) into three distinct groups: genes overexpressed in young (bipotent) larvae, genes overexpressed in fifth-instar queen larvae, and genes overexpressed in fifth-instar worker larvae. A second study, focusing on oxidative metabolism, identified a set of differentially expressed mitochondrial genes (Corona et al., 1999). The third approach was a DDRT-PCR screen for hormone responsive genes to investigate the mode of action of ecdysteroids in the differentiation of the larval ovary (Hepperle & Hartfelder, 2001). Many of these EST sets could not be properly annotated at that time, either because of a limited number of fully sequenced insect genomes or because the libraries contained large numbers of transcripts in 3′-gene regions, including poorly conserved untranslated regions (UTRs). The draft assembly for the honey bee genome (Honey Bee Genome Sequencing Consortium, 2006) now permits a much more reliable annotation of this unique set of experimentally validated genes.
Reproductive activity of honey bees is determined in a two-step process. The basic differences in reproductive capacity between queen and workers manifest themselves during larval development by a wave of programmed cell death that leads to the destruction of over 95% of the ovariole primordia in the larval ovary of workers (Schmidt-Capella & Hartfelder, 1998). In the adult life cycle of each caste, the co-ordinated flux of egg production through previtellogenic and vitellogenic growth will require the activity of other sets of genes. Some of these act as determinants of the major egg and also embryonic axes. As the fruit fly is the most well developed insect model for axis determination (St Johnston & Nüsslein-Volhard, 1992) and maternal factors have not yet been functionally characterized in the honey bee, searching the genome assembly (Honey Bee Genome Sequencing Consortium, 2006) provides the first major opportunity to explore putative patterning networks in honey bees.
The vitellogenic growth phase of the honey bee oocyte has long been the centre of attention as a means of describing differential fertility of the female castes (Engels, 1974). The synthesis of large amounts of vitellogenin by the queen fat body is intimately related to her high reproductive rate. The equally high vitellogenin titres in haemolymph of nonreproducing young worker bees, however, have been an enigma, as their ovaries are inactive in the presence of the queen. Vitellogenin expression has apparently become uncoupled from oocyte growth during the evolution of the sterile worker caste and has acquired secondary functions. It became involved in the production of royal jelly (Amdam et al., 2003) and in the regulation of worker lifespan (Amdam et al., 2004) through an inhibitory effect on the endocrine system (Guidugli et al., 2005). Along with such unique life-history traits related to socially organized reproduction, honey bees also promise to answer new questions involving meiosis, as the honey bee genome exhibits recombination rates that exceed those of all other higher organisms (Hunt & Page, 1995; Solignac et al., 2004), and as honey bee males, being haploid, forego meiosis I in producing gametes.
The honey bee genome sequence database (Honey Bee Genome Sequencing Consortium, 2006) has become an extremely valuable resource, not only for comparative genomics but also for functional genomics. One of the oldest and, for evolutionary biologists, most challenging questions in social insect biology is the development of a reproductive and a nonreproductive caste (Darwin, 1859). Apart from its implications for evolutionary theory in terms of kin selection (Hamilton, 1964), this is essentially a question of how developmental pathways diverge to shape distinct phenotypes and how oogenesis is regulated to achieve levels of extremely high (queen) and extremely low (worker) fertility.
The annotation of genes related to caste development and differential reproduction in the honey bee has implications well beyond this species. It represents the first genome-wide annotation of a molecular architecture behind reproductive division of labour. In the light of current discussions on the importance of alternative phenotypes in the evolution of novelties (West-Eberhard, 2003), the honey bee genome information is certainly one of the most valuable resources. In the present manuscript we delineate a strategy for moving from a straightforward gene annotation approach to functional studies based on motif analysis of upstream regulatory regions.
Results and discussion
From caste to BLAST differentially expressed genes in caste development
The full list of genes that are overexpressed in fifth-instar queen or worker larvae is made available online in the Supplementary material (Table 1S). This list includes scaffold number, corresponding EST number(s), GLEAN3-predicted protein sequence, similarity and identity indices to corresponding Drosophila melanogaster orthologs, as well as protein domain information (Pfam).
A general result was that a relatively large subset of genes (nine of 34) overexpressed in honey bee queen larvae is represented by putative Drosophila orthologs for which no Gene Ontology (GO) term for Biological Process is indicated in Flybase. In contrast, all worker genes correspond to functionally relatively well-defined Drosophila genes. Even when taking into consideration the conceptual limits in attributing GO terms on biological process from Drosophila orthologs to honey bee genes, this finding could have a bearing on basic questions in socioevolution,
namely, which caste is the novelty, the queen or the worker(s)? Phrased in other terms, the genome sequence information now permits us to address, at a molecular level, questions that are fundamental to understanding the role of (and evolutionary trends in) ontogenetic processes that structure insect societies, especially in hymenopterans. Such basic questions are: (1) how many degrees of freedom (or release from constraints) may actually have been gained from splitting the functions normally performed by a solitary ancestral hymenopteran female into two or more castes, and (2) how was this release from constraints integrated into postembryonic differentiation processes to generate truly alternative phenotypes? A second observation of potential interest to functional genomics was that a relatively large subset of the caste-related genes maps to chromosome 2 (seven of 51 unique sequences).
Most genes in the caste gene list are represented by one or two EST hits, except for a predicted hexamerin 70b gene (GB10869-PA). This gene was evidenced by 10 ESTs, one in a 5′-located exon and nine in the 3′ region (five ESTs comprising parts of exon 7 and parts of the 3′-UTR, the other four ESTs landing in exons 6 and 7). The macroarray data (Evans & Wheeler, 2000) established this gene as overexpressed in the worker caste. Hexamerins are an important class of storage proteins that show interesting expression patterns related to caste and reproduction in many social insects (Martinez et al., 2001; Hunt et al., 2003; Zhou et al., 2006a,b). A cDNA encoding the honey bee Hexamerin 70b subunit has recently been cloned and sequenced (Cunha et al., 2005), and hormone manipulation experiments showed that the abundance of hexamerin 70b transcripts in larval development is positively correlated with high levels of JH and ecdysteroids. This could actually reflect a regulatory feedback function in JH titre regulation, as exemplified in the termite Reticulitermes flavipes, where the Hex1/Hex2 ratio controls JH availability for caste-specifically differentiating tissues (Zhou et al., 2006b).
Within the honey bee caste genes for which GO information was imported and deduced from their Drosophila orthologs, we noted a predominance of terms clustering as ‘cellular physiological process’ (95%, GO:0050875) and ‘metabolism’ (90%, GO:0008152) in the ‘Biological Process’ (GO:0008150) category (Fig. 1A). GO-statistics differences between queens and workers became apparent in terms clustering as ‘cell differentiation’ (0% for queen and 28.5% for workers, GO:0030154) and ‘metabolism’ (96% for queen and 78.5% for worker, GO:0008152) in the ‘Biological Process’ (GO:0008150) category (Fig. 2A).
With respect to ‘Molecular Function’ (GO:0003674), most terms were related to mRNA translation: ‘nucleic acid binding’ (38%, GO:0003676), ‘structural constituent of ribosome’ (24%, GO:0003735), ‘protein binding’ (12%, GO:0005515), ‘nucleotide binding’ (12%, GO:0000166) and ‘translation factor activity, nucleic acid binding’ (7%, GO:0008135). Further important terms were ‘oxidoreductase activity’ (19%, GO:0016491) and ‘hydrolase activity’ (16.5%, GO:0016787) (Fig. 1B). For these latter two terms we noted potentially interesting differences related to caste, with ‘hydrolase activity’ being overrepresented by worker transcribed genes, whereas ‘oxidoreductase activity’ was exclusively represented by queen genes (Fig. 2B). Even though these GO assignments on Molecular Function are based on evidence from D. melanogaster, without experimental evidence for Apis mellifera, the corresponding genes are well conserved in sequence and show the
Figure 1. Dominant gene ontology terms for (A) Biological Process and (B) Molecular Function in honey bee genes with an experimentally validated caste-specific expression pattern during the last larval instar. The graph was generated by a FatiGO analysis set at level 3. Frequencies indicate the appearance of GO terms in the total set of queen and worker differentially expressed genes.
relevant protein domains (Supplementary material, Table 1S), and thus are indicative of functional trends. In general terms, the caste-specific separation into metabolic pathway preferences, oxidoreductases vs. hydrolases, may reflect the switch in diet that a worker larva experiences during the fourth and fifth larval instar. This represents a switch from a protein/lipid-rich diet to a more carbohydrate-rich diet (Haydak, 1970), and this switch apparently is accompanied by an increase in the expression of genes coding for proteins with hydrolase activity. Similar switches in gene expression patterns have recently been reported for D. melanogaster in an experiment where larvae were shifted from a cornmeal diet to a banana diet (Carsten et al., 2005), resulting in the up- or downregulation of 55 genes of a test population of 6000. Among these are five genes with dehydrogenase/oxidoreductase activity. These parallels in dietary switch responses are indicative of conserved coregulated gene networks. An open question is, of course, how these can be co-opted to generate different phenotypes, such as the castes of social insects. In this respect, social insects clearly go a big step beyond the simple metabolic switch response seen in Drosophila. They have apparently incorporated divergent metabolic regulation into a network architecture consistent with morphogenetic differentiation. This required that metabolic regulation became integrated, through the endocrine system, with developmental patterning processes.
The importance of metabolic regulation for caste development has also come to light in a recent Representational Difference Analysis (RDA) study on caste development in the highly eusocial stingless bee Melipona quadrifasciata (Judice et al., 2006). This is particularly interesting because in this genus caste development is thought to be based on a genetic predisposition (Kerr, 1950). Metabolic regulation may thus be a sine qua non for caste development, and caste-specific metabolic pathways may be set in motion rather independently of the nature of the initial switch (nutritional or genetic). The question of how this metabolic switch may integrate with the resultant endocrine signature characteristic for each caste is still a wide open field, but recent studies in Drosophila showing an interaction between ecdysone and insulin signalling in the determination of body size (Colombani et al., 2005; Mirth et al., 2005) may provide a lead.
This is also the point to reflect on how justified it is to heuristically rely on Drosophila orthologs and to use their GO attributes in a developmental context (caste differentiation) that has no parallel in Drosophila. A recent gene expression profiling study in the ant Camponotus festinatus employing a microarray set-up of 384 clones showed significantly different expression levels for larval vs. adult ants in 91 genes (21 confirmed by qRT–PCR), including an Apis hexamerin 70b ortholog (Goodisman et al., 2005). When comparing the temporal expression patterns of these ant genes with expression profiles for their respective Drosophila orthologs (Arbeitsman et al., 2002), relatively little accord was noted for the two species, leading to the suggestion that these genes may have taken on distinct functions due to the long divergence time between dipterans and hymenopterans (Goodisman et al., 2005). Differences aside, these examples show that in practically all studies on large-scale functional considerations in gene expression we are strongly wedded to Drosophila, and even though functional divergence in orthologs may have occurred, there is little experimental gene-by-gene evidence available for any of the major insect orders outside of Diptera.
Functional studies are clearly profiting from the now available honey bee genome sequence, as evident from the increasing number of RNAi experiments in honey bees (see citations in Honey Bee Genome Sequencing Consortium, 2006). This is still a small number compared with
Figure 2. Gene Ontology categories with caste-specific expression patterns for Biological Process (A). Genes classified as part of cell differentiation processes are significantly overexpressed in workers, whereas genes related to metabolism are overexpressed in queen larvae. In the Molecular Function categories (B) we observed an apparent split indicating differential enzyme preferences in queens (overexpress oxidoreductases) and in workers (overexpress hydrolases). The graph was generated by a FatiGO analysis set at level 3. Frequencies indicate the appearance of GO terms in the queen (black bars) and worker differentially expressed genes (grey bars).
the large-scale RNAi assays established for Drosophila (Boutros et al., 2004), but the development of cell culture approaches in the honey bee (Bergem et al., 2006) represents a step in this direction.
Alternatively, regulatory functional associations between genes and their integration into networks can be inferred from the presence of response elements in upstream control regions. In our analysis of differentially expressed genes in queen-worker development we took a bioinformatics approach for a first look into the molecular architecture of a developmental polyphenism.
Motif search in upstream regions of differentially expressed genes
The genes related to caste development are among the first honey bee genes for which experimentally validated expression data were generated (Corona 2004). Certainly, these 51 genes do not comprise all the genes involved in caste development, but they are expected to be prominent players, as they were the ones that stood out in the SSH and DDRT-PCR approaches. The 51 caste genes do not represent gene families but rather fall into many very different molecular function categories. This made us ask whether the observed overexpression pattern of different genes in either queen or worker larvae may be associated with the occurrence of specific regulatory motifs in the upstream control regions (UCR) of these genes.
Three different algorithms, AlignACE (Roth et al., 1998), MEME (Bailey & Elkan, 1995) and MDscan (Liu et al., 2002), were used to construct a pipeline for detecting overrepresented motifs in the two unaligned sets of UCR sequences for the caste-specifically expressed genes. This pipeline was run on a ‘top-10’ set of 12 genes (six for each caste), which showed the most pronounced caste differences in expression (Evans & Wheeler, 2000), and also on a randomly selected set of UCRs (background control). We calculated four different metrics for each motif: MAP score (Roth et al., 1998), a group-specificity score (Church score) (Hughes et al., 2000), and a ROC AUC and MNCP metric (Clarke & Granek, 2003). A first set of filters was used to detect motifs with a potential for regulatory functions (MAP score ≥ 5, ROC AUC ≥ 0.7). This resulted in 46 motifs out of 123 total UCR motifs found in the queen UCR set and in 71 motifs out of 261 total found in the worker UCR set (Supplementary material, Table 2S).
A parametric statistical test (MANOVA: P = 0.0001, Wilks' lambda = 0.78, F = 72) and a nonparametric statistical test (Kolmogorov–Smirnov, Table 1) on ROC AUC and MNCP indices showed that these two sets of filtered motifs are significantly different from a randomly selected set of motifs. The rank-order metrics ROC AUC and MNCP have previously been used to compare the association of short regulatory sequence features with gene expression data (microarray analyses on coregulated genes), and they have been useful in flagging false positives erroneously included in ‘top-10’ sets of differentially expressed genes (Clarke & Granek, 2003).
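As an illustration of what a rank-order metric such as ROC AUC measures, the following is a minimal sketch and not the implementation of Clarke & Granek (2003). It assumes that each UCR has already been given a single motif-match score (the scoring itself is not shown), that caste-gene UCRs form the positive group and background UCRs the negative group, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(positive_scores, negative_scores):
    """Rank-based ROC AUC: the probability that a randomly chosen positive
    UCR scores higher than a randomly chosen negative UCR (ties count 0.5).

    The inputs are assumed, illustrative motif-match scores; they are not
    taken from the study.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))
```

A value near 0.5 means the motif does not separate the two groups; the 0.7 cut-off used in the text keeps only motifs whose scores rank caste-gene UCRs clearly above background.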
To select highly specific motifs found in each data set we used the group-specificity score (Church score ≤ 1e−05; Hughes et al., 2000) to identify the most likely motifs involved in decision making for pathways leading to queen (two motifs, Fig. 3A) or to worker development (12 motifs with Church score ≤ 1e−07, Fig. 3B). As the SSH and DDRT-PCR approaches on caste development can be expected to retrieve only a subpopulation of such genes, these motifs represent only a partial scenario of the transcriptional regulatory network underlying caste development. The motifs can now be used to screen other GLEAN3-predicted genes to assemble a candidate list of putatively coregulated genes in caste development that can be submitted to further experimental validation.
Each motif found in UCRs of queen (46) and worker (71) overexpressed genes was compared with the entire set of D. melanogaster cis-regulatory motifs contained in the TRANSFAC database (version 4.0; Wingender et al., 2000). Only alignments passing 80% identity for each position-specific site matrix (PSSM) were considered as significant matches. Whereas none of the most specific motifs for each caste showed similarity to any of the D. melanogaster motifs, some of the more ubiquitous ones did resemble binding sites of transcription factors such as Antennapedia, Ultrabithorax, zerknüllt, even skipped, trithorax-like, tailless, paired, fushi tarazu and Adh transcription factor 1 (Supplementary material, Table 2S).
When we plotted the positions of the two queen and the 12 worker motifs in the UCRs of the caste-specifically expressed genes (Fig. 4), an interesting pattern emerged for the worker-specific motifs. Some of the worker motifs appeared to be clustered and occurring in tandem; furthermore, they were positioned relatively close to the predicted translation start sites in some of the genes that are overexpressed during worker development (annotation results of these genes are listed in Supplementary material, Table 1S). A position close to the predicted translation start
Table 1. Kolmogorov–Smirnov analysis of ROC AUC and MNCP metrics for statistical significance of putative regulatory motifs in upstream control regions of genes with queen- or worker-specific expression patterns. These motifs were contrasted with a random set of motifs detected in a random set of UCRs of GLEAN3-predicted honey bee genes.
Group pairs                  ROC AUC    MNCP
Random × (Queen + Worker)    P > 0.1    P < 0.001
Random × Queen               P > 0.1    P < 0.005
Random × Worker              P > 0.1    P < 0.001
Queen × Worker               P < 0.1    P > 0.1
sites is generally taken as a sign of strong regulatory effect (Davidson, 2001).
As caste development is highly dependent on changes in haemolymph titres of JH and ecdysteroids, we also screened the UCRs of the differentially expressed genes for putative nuclear receptor binding sites. Regulatory elements involved in the JH response are not well understood yet, so any prediction in this direction would be elusive (Wheeler & Nijhout, 2003). Functional ecdysone response elements (EcRE) have, however, been identified, and it is now well established that the EcR/USP complex binds to direct or inverted (palindromic) repeats (Riddihough & Pelham, 1987; Antoniewski et al., 1995; Perera et al., 2005). A PSSM search (Wassermann & Sandelin, 2004) based on a canonical representation (rGkTCAaTGamcy) (Perera et al., 2005) did not reveal any putative EcRE motif in the UCRs of the 51 caste-differentially expressed genes. However, this does not rule out that these genes respond to changes in JH and/or ecdysteroid titres, as these hormones require EcR/USP binding primarily in the expression of early genes, but not necessarily for the late response genes (Li & White, 2003; Sullivan & Thummel, 2003).
In conclusion, the predictions from such a combined strategy that searches for group-specific and for conserved regulatory motifs in GLEAN3-predicted honey bee genes
Figure 3. Putative regulatory motifs and their consensus sequences in UCRs of queen and worker overexpressed genes. Scores for MAP, Church, ROC AUC and MNCP metrics indicate degree of group specificity and significance level.
represent a major transition from nonhypothesis-driven high-throughput screens to hypothesis-driven searches for context-dependent gene expression in honey bees. Such directed search results can serve as a platform for experimental analyses of genome-wide integration in hormonal control of caste development in bees. In addition, this study exemplifies how existing algorithms for detecting shared regulatory motifs can be joined into a toolkit for predicting coregulated gene expression patterns in honey bees. These methods have been shown to be robust and are gaining acceptance for use in functional and comparative genomics (Liu et al., 2004; Pritsker et al., 2004; Zhu et al., 2005).
Oogenesis and reproduction
As caste development sets the stage for reproductive division of labour, genes involved in reproductive processes are strong candidates for functional analyses. In the present study we performed BLAST searches to identify honey bee orthologs for a list of 32 fly genes with the GO attribute ‘oogenesis’ and for four genes specifically related to ‘vitellogenesis’. The list for fly genes involved in nuclear events in germ cells consisted of 20 genes for ‘female meiosis’, 12 genes for ‘recombination’ and 21 genes under the heading ‘chromosome segregation, including segregation distortion’ (Supplementary material, Table 3S). In some cases, these GO attributes for fly genes of course overlapped.
BLASTN and BLASTX searches for these fly genes against the honey bee genome assembly 3.0 and the GLEAN3 Official Set (aa) retrieved statistically well supported putative bee orthologs for most of these candidates. For the genes involved in meiosis, recombination and chromosome segregation this finding, although not unexpected, is of interest, as meiosis in the haploid honey bee drone is strongly modified when compared with a normal diploid meiosis. The first meiosis is initiated, but the nucleus remains undivided and only the superfluous centrioles are eliminated as cytoplasmatic buds (Hoage & Kessel, 1968). An interesting gene, thelytoky (th), has recently been mapped in this context (Lattorff et al., 2005). It almost completely prevents meiotic recombination in the automixis of laying workers of the Cape honey bee. As an indication of the interplay between meiosis and later development, this locus also appears to be an integral part of various gene cascades involved in caste determination (Lattorff et al., 2006).
The fly genes retrieved in the GO searches for ‘oogenesis’ represent a much larger range of Molecular Function categories, such as transcription factors, proteins regulating
Figure 4. Map of the group-specific motifs found in queen and worker UCRs of caste-specifically expressed genes. The coding region is represented by the GLEAN3 prediction number (assembly 4.0), with arrows indicating the translation start site. Asterisks mark UCRs of the ‘top10’ set used to find the over-represented motifs.
translation by RNA binding, RNA helicases, enzymes (ubiquitination, transfer of sugar residues, sulfotransferase), GTPase activity and several factors binding to cytoskeletal proteins (Supplementary material, Table 3S). This wide range of functional categories is expected, as these genes are involved in a series of different steps during oogenesis in the polytrophic meroistic ovary. Oogenesis starts out with the maintenance of germline and somatic stem cell identity in the germline niche in the upper germarium. A key gene involved in this process is pumilio (Forbes & Lehmann, 1998), which is represented by a highly conserved bee ortholog, GB10504-PA. The second step is the formation of germ cell cysts, the determination of an oocyte within each cyst and the survival of these cysts, involving genes such as benign gonial cell neoplasm (Lin et al., 1994), encore (Hawkins et al., 1996), ovo and ovarian tumour (otu) (Staab & Steinmann-Zwicky, 1995), all well conserved in the honey bee genome. Interestingly, we could not find a clear bee ortholog for bag of marbles (bam), which is one of the prime early response genes in the cystoblast differentiation pathway in Drosophila (McKearin, 1997).
The third step comprises previtellogenic growth of the follicle, and during these stages a number of maternal factors that define the egg and the future embryonic axes are deposited and anchored either within the oocyte or in the perivitelline space (for review see St Johnston & Nüsslein-Volhard, 1992). In the list of Drosophila genes involved in early steps of axis determination, a couple of surprises came up in the search for honey bee orthologs. A big surprise was that we could not find a gurken ortholog in the bee, even though this gene sets up both the anterior–posterior and dorsal–ventral axes in the Drosophila egg (González-Reyes et al., 1995), whereas downstream components of the Gurken signalling cascade appear to be preserved in the bee genome (Honey Bee Genome Sequencing Consortium, 2006). Similar apparent gaps in constituents of patterning cascades were noted for the terminal regions of the embryo, such as a lack of a torso ortholog, whereas its ligand torso-like is represented by a well conserved ortholog in the bee (GB18663-PA).
With respect to genes involved in the final processes of oogenesis, we primarily looked at genes that play a part during vitellogenesis. There are four genes of interest in this class, the primary one coding for the yolk protein precursor vitellogenin. This gene has already been sequenced for the honey bee (Piulachs et al., 2003) and, as expected, it is much more closely related to vitellogenins of other insects and even vertebrates than to the Drosophila yolk proteins, which apparently are derived from lipases (Hagedorn et al., 1998). The second gene of interest is the bee ortholog to yolkless, as this (GB16571-PA) could represent a putative vitellogenin receptor. The other two Drosophila genes with clear orthologs in the bee are CG18641 and CG12139, which code for a lipase and an LDL receptor, respectively.
General conclusions
The current analysis made use of previous experimental analyses on differential transcription during caste development of honey bee larvae. In the annotation of these genes, which includes references to Gene Ontology terms associated with their respective Drosophila orthologs, two major configurations emerged. First of all, worker genes were better defined in terms of GO attributes compared with the relatively large number of queen genes that had no GO terms associated with their respective Drosophila orthologs. Even when taking into consideration the conceptual limits in attributing GO terms on molecular function and biological process from Drosophila orthologs to honey bee genes, this finding could have a bearing on general basic questions in socioevolution, namely, which caste is more divergent from a nonsocial reproductive female bee prototype or reproductive ground plan, the queen or the worker? Less speculative is the second major conclusion coming out of the GO analysis for Molecular Function, showing and confirming (Eder et al., 1983; Corona et al., 1999) the important role of metabolic regulation in caste development. This facet is demonstrated especially clearly in the caste-specific expression of oxidoreductases (queen) vs. hydrolases (workers).
The honey bee genome information provided not only a much improved annotation platform for caste-specifically expressed ESTs but, even more so, opens the possibility to explore putative regulatory features of the honey bee genome. In the current study we employed modified Gibbs sampling and expectation-maximization algorithms (AlignACE, MDscan, MEME) to detect group-specific motifs in gene regions up to 1000 bp upstream of translation start sites. We detected 14 motifs that were significantly overrepresented in the caste genes when compared with corresponding motifs found in a random set of GLEAN3-predicted honey bee genes. The localization of such motifs in UCRs of worker-overexpressed genes revealed a clustering of such motifs close to the predicted basal promoter regions, suggesting strong regulatory effects. Such search strategies and the detected motifs can provide the lead to reveal and unravel cis-regulatory networks for and within specific contexts of honey bee biology.
Caste polyphenism in social insects makes a strong case for the emergence of novelties at a microevolutionary level (West-Eberhard, 2003). Without the pretension to discuss exhaustively the mechanisms underlying this surge of developmental plasticity, two major themes become apparent in this and other studies. Regulatory change has been demonstrated in the shut-down of wing disc patterning cascades in ants (Abouheif & Wray, 2002) and is certainly also implicit in observed temporal changes in gene expression during postembryonic development of ants and bumble bees (Goodisman et al., 2005; Pereboom
et al., 2005). Such change would be expected to involve cis-regulatory elements, that is, change in transcription factor binding sites in UCRs as approached in this study, and also evolutionary change in response thresholds to circulating morphogenetic hormones (for review see Hartfelder & Emlen, 2005). The second and quite unexpected theme is the acquisition of new systemic functions by evolutionarily rather old proteins, such as vitellogenin and hexamerins. These apparently unspectacular proteins have evolved into key players for caste evolution and reproductive division of labour via novel regulatory connectivity with JH (Amdam et al., 2004; Guidugli et al., 2005; Zhou et al., 2006a,b).
Experimental procedures
Selection and annotation of ESTs representing differentially expressed genes in honey bee caste development
The starting point was a set of 164 entries (mainly 3′-ESTs) in GenBank (BG101532–BG101697) from an SSH library (Evans & Wheeler, 1999; Evans & Wheeler, 2000). When validated by macroarray analyses, a clustering into three major classes became apparent: (I) genes overexpressed in young larvae, (II) genes overexpressed in last instar queen larvae, and (III) genes overexpressed in last instar worker larvae. For this study we excluded the class I ESTs because their expression is not caste-specific but rather represents expression differences between young (still bipotent) and older larvae. To the class II queen ESTs (82) we added one complete cDNA entry (AY601642) from a DDRT-PCR screen (Corona et al., 1999), and to the class III set of worker ESTs (40) we added seven GenBank dbEST entries (BG149167–BG149173) from a DDRT-PCR screen on ovary development (Hepperle & Hartfelder, 2001).
The EST sequences were submitted to BLASTN searches (parameters: -G 2 -E 3 -W 15 -F ‘m D’ -U -e 1e-20) against genome sequence assembly Amel_v3.0 to retrieve matches in linked or unlinked genomic contigs and to exclude no-matches (seven ESTs in queen). ESTs that aligned within the same scaffold were checked for clustering and overlap. This clustering also served to exclude genes that were represented by non-overlapping ESTs from both castes. This procedure generated a set of 51 unique putative gene sequences overexpressed in either queen (34) or worker larvae (17). These 51 nonredundant sequences were submitted to BLASTX searches against the Official Set of GLEAN3-predicted protein sequences (cut-off value at 1e−20). For ESTs with no significant protein sequence matches, the genomic regions adjacent to the mapped EST were searched to find neighbouring ORFs, especially those nearest to putative 3′ UTRs of predicted proteins, as the EST libraries have a bias in this direction.
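The check for clustering and overlap of ESTs on a scaffold can be sketched as a simple interval-merging pass. This is an illustration under assumptions, not the authors' procedure: the input tuples (EST identifier, scaffold, start, end) stand for parsed BLASTN hits, the identifiers in the usage example are invented, and the function name `cluster_est_hits` is hypothetical.

```python
from collections import defaultdict


def cluster_est_hits(hits):
    """Group EST hits that overlap on the same scaffold.

    hits: iterable of (est_id, scaffold, start, end) tuples (assumed format).
    Returns a list of (scaffold, [est_id, ...]) clusters, ordered by
    scaffold name and then by position along the scaffold.
    """
    by_scaffold = defaultdict(list)
    for est_id, scaffold, start, end in hits:
        # Normalise coordinates so that lo <= hi regardless of hit strand.
        lo, hi = min(start, end), max(start, end)
        by_scaffold[scaffold].append((lo, hi, est_id))

    clusters = []
    for scaffold in sorted(by_scaffold):
        current, current_end = [], None
        for lo, hi, est_id in sorted(by_scaffold[scaffold]):
            if current and lo <= current_end:
                # Overlaps the running cluster: extend it.
                current.append(est_id)
                current_end = max(current_end, hi)
            else:
                # Gap found: close the running cluster and start a new one.
                if current:
                    clusters.append((scaffold, current))
                current, current_end = [est_id], hi
        if current:
            clusters.append((scaffold, current))
    return clusters
```

Each resulting cluster is a candidate for one nonredundant gene sequence; deciding whether non-overlapping ESTs on one scaffold belong to the same gene still needs the gene prediction, which this sketch does not model.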
Official Set protein sequences were aligned against the Amel_v3.0 sequence assembly using TBLASTN to map protein to genome, and subsequently they were aligned using BLASTP against the GenBank nonredundant (nr) and the Flybase protein sequence databases. The manual features annotation procedure of the Artemis 7.0 program (Rutherford et al., 2000) was used to map ORFs, putative splice sites of exons and ESTs to genome coordinates. The final annotation file was generated with a Python script in GFF format (http://www.sanger.ac.uk/Software/formats/GFF).
Honey bee sequences annotated as orthologs to D. melanogaster genes were putatively assigned the GO terms listed in the respective Flybase entry. In addition, the definition of new GO terms (Biological Process ontology) related to caste development and polyphenism (GO:0048651 and GO:0048650, respectively) was co-ordinated with the Gene Ontology Consortium (Ashburner et al., 2000). The FatiGO web tool (Al-Sharour et al., 2004) was used to cluster GO terms (level 3 setting) for Biological Process and Molecular Function.
For the detection of conserved domains, the 51 protein sequences were screened against the Pfam database (http://www.sanger.ac.uk/Software/Pfam) using the HMMER platform (current release 2.3.2; http://hmmer.wustl.edu), with a cut-off value set at 1e−10.
Annotation of oogenesis and reproduction genes
In order to identify putative honey bee orthologs to D. melanogaster genes, we searched the following GO terms in Flybase: ‘oogenesis’ (GO:0009993), ‘vitellogenesis’ (GO:0007296), ‘female meiosis’ (GO:0007143), ‘DNA recombination’ (GO:0006310) and ‘chromosome segregation’ (GO:0007059). Genes related to segregation distortion were searched for in Flybase in phenotypic descriptions and mutant effects of D. melanogaster genes, as this phenomenon is not represented by a GO term. Hence, this group may be more heterogeneous than the others. From this list we removed genes of pleiotropic function (multifaceted GO entries in Biological Process) and genes that lacked defined transcripts in the Drosophila genome database.
For the GO terms ‘oogenesis’ and ‘vitellogenesis’ we performed TBLASTN and BLASTP searches for 42 fruit fly genes against the Amel_v3.0 genome assembly and the GLEAN3-predicted protein sequences (Honey Bee Genome Sequencing Consortium, 2006), respectively. The orthologous D. melanogaster gene was characterized by the same procedure as described above (reciprocal best hit). For the GO terms ‘female meiosis’, ‘DNA recombination’, ‘chromosome segregation’ and the non-GO group ‘segregation distortion’, transcripts of D. melanogaster were searched against nr databases at NCBI using BLASTP. The obtained sequences were searched against the Amel_v3.0 genome assembly and the GLEAN3-predicted protein sequences using TBLASTN. Homologous sequences (threshold 1e−10) were predicted using the BioEdit software (Hall, 1999). ORFs showing significant homology (BLASTP threshold 1e−20) were assembled and used in BLASTP searches against the nr databases at NCBI.
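The reciprocal best hit criterion mentioned above can be sketched in a few lines. The sketch assumes that the BLAST output has already been reduced to one best-scoring subject per query in each search direction; the dictionaries, the toy identifiers in the usage example and the function name `reciprocal_best_hits` are hypothetical and not taken from the study.

```python
def reciprocal_best_hits(fly_to_bee, bee_to_fly):
    """Return (fly_gene, bee_gene) pairs that are each other's best hit.

    fly_to_bee: dict mapping each fly query to its best bee subject.
    bee_to_fly: dict mapping each bee query to its best fly subject.
    Both dictionaries are assumed to be pre-filtered by an E-value
    threshold chosen by the user.
    """
    pairs = []
    for fly_gene, bee_gene in sorted(fly_to_bee.items()):
        # Keep the pair only if the reverse search points back to the query.
        if bee_to_fly.get(bee_gene) == fly_gene:
            pairs.append((fly_gene, bee_gene))
    return pairs


# Toy usage with invented identifiers:
# reciprocal_best_hits({"flyA": "beeX", "flyB": "beeY"},
#                      {"beeX": "flyA", "beeY": "flyC"})
```

A fly gene whose best bee hit points back to a different fly gene (as `flyB` does in the toy call) is not reported, which is the behaviour one would want for genes without a clear ortholog.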
Motif search in upstream regions in caste-specifically expressed genes
In order to detect overrepresented motifs in the upstream control regions (UCRs) of the two sets of caste-related genes, we selected gene subsets based on two criteria: (1) those that had shown the highest caste-specificity in the array analyses (Evans & Wheeler, 2000), and (2) those that had a conserved 5′ region when compared with the Drosophila orthologs. These ‘top10’ genes consisted of six queen genes (GB13072, GB11628, GB19380, GB14798, GB16047 and GB18242) and six worker genes (GB10869, GB12371, GB12239, GB10428, GB19006 and GB14758). The motif search was conducted separately on the two sets of UCR sequences using three methods: AlignACE (Roth et al., 1998), MEME (Bailey & Elkan, 1995) and MDscan (Liu et al., 2002). Default parameter values were used in all searches, except that GC content in intergenic regions was set to 25%, representing the background value established for the honey bee UCR database generated in this study.
The database containing 10156 UCR sequences was generated by parsing the Official Set annotation file (downloaded in GFF format from http://www.beegenome.hgsc.bcm.tmc.edu/beeftp.html) to extract upstream regions starting from the terminal 5′-genomic coordinate of each predicted CDS. The UCRs were arbitrarily set to a size frame of 1000 nucleotides (Roth et al., 1998), but were trimmed whenever another predicted ORF was detected in any of these regions.
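The extraction rule described above (up to 1000 nucleotides upstream of each CDS, trimmed at the nearest neighbouring ORF) may be sketched as follows. This is a simplified illustration, not the original script: it assumes that the GFF records of one scaffold have already been reduced to tuples of identifier, start, end and strand, the gene identifiers are hypothetical, and scaffold ends are not checked for minus-strand genes.

```python
def upstream_regions(genes, max_len=1000):
    """Compute upstream control region (UCR) intervals on one scaffold.

    `genes` is a list of (gene_id, cds_start, cds_end, strand) tuples with
    1-based inclusive coordinates and cds_start <= cds_end. Returns
    {gene_id: (ucr_start, ucr_end)}; genes left without any upstream
    sequence after trimming are skipped.
    """
    ucrs = {}
    for gene_id, start, end, strand in genes:
        if strand == "+":
            lo, hi = max(1, start - max_len), start - 1
            # Trim at any other gene that reaches into the window.
            for other_id, o_start, o_end, _ in genes:
                if other_id != gene_id and o_start < start and o_end >= lo:
                    lo = max(lo, min(o_end, hi) + 1)
        else:
            # On the minus strand, "upstream" lies at higher coordinates.
            lo, hi = end + 1, end + max_len
            for other_id, o_start, o_end, _ in genes:
                if other_id != gene_id and o_end > end and o_start <= hi:
                    hi = min(hi, max(o_start, lo) - 1)
        if lo <= hi:
            ucrs[gene_id] = (lo, hi)
    return ucrs


# Hypothetical gene models on a single scaffold.
genes = [
    ("g1", 5000, 6000, "+"),
    ("g2", 3000, 4500, "-"),
    ("g3", 8000, 9000, "-"),
    ("g4", 9600, 12000, "+"),
    ("g5", 500, 700, "+"),
]
ucrs = upstream_regions(genes)
```

Here the window of g1 is shortened because g2 ends inside it, and the window of g5 is limited by the start of the scaffold.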
The MAP (maximum a posteriori log likelihood) score, the group specificity score (called Church score in this manuscript) (Hughes et al., 2000), the ROC AUC (area under the curve for a receiver-operator characteristic plot) metric and the MNCP (mean normalized conditional probability) metric (Clarke & Granek, 2003) were used to detect motifs that most likely correspond to biologically significant cis-regulatory elements. The filters run on the UCRs of the subsets of queen and worker genes were a MAP score cut-off value of 5.0, followed by a ROC AUC cut-off at 0.7, followed by a group specificity score cut-off at 1e−05. The UCR database of all GLEAN3-predicted honey bee genes was used as the background to calculate these metrics.
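The sequential filtering can be written down compactly. The sketch below is illustrative only: the `MotifScore` record and the motif names and values are hypothetical, and the `roc_auc` helper uses the generic rank-based definition (the probability that a UCR from the gene group scores higher than a background UCR), which we assume corresponds to the metric of Clarke & Granek (2003).

```python
from dataclasses import dataclass


@dataclass
class MotifScore:
    name: str
    map_score: float
    roc_auc: float
    church: float  # group specificity score; smaller means more specific


def roc_auc(group_scores, background_scores):
    """Rank-based AUC; ties between a group and a background score count 0.5."""
    wins = 0.0
    for g in group_scores:
        for b in background_scores:
            wins += 1.0 if g > b else 0.5 if g == b else 0.0
    return wins / (len(group_scores) * len(background_scores))


def filter_motifs(motifs, min_map=5.0, min_auc=0.7, max_church=1e-05):
    """Apply the three cut-offs in the order given in the text."""
    passed = [m for m in motifs if m.map_score >= min_map]
    passed = [m for m in passed if m.roc_auc >= min_auc]
    return [m for m in passed if m.church <= max_church]


# Hypothetical motif scores, for illustration only.
motifs = [
    MotifScore("m1", 12.3, 0.81, 2e-07),
    MotifScore("m2", 4.1, 0.90, 1e-08),   # fails the MAP cut-off
    MotifScore("m3", 8.7, 0.65, 1e-06),   # fails the ROC AUC cut-off
    MotifScore("m4", 6.0, 0.75, 3e-03),   # fails the group specificity cut-off
]
kept = [m.name for m in filter_motifs(motifs)]
```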
A parametric test (MANOVA) and a nonparametric test (Kolmogorov–Smirnov) were conducted to identify significance levels for the two sets of filtered motifs found in the UCRs of caste-specifically expressed genes against filtered motifs found in the random UCR set. Random motifs were sampled from a motif database (10 391 motifs) generated by running our script 100 times with random sets of UCR sequences.
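For readers unfamiliar with the nonparametric comparison, the two-sample Kolmogorov–Smirnov statistic is simply the largest vertical distance between the empirical cumulative distributions of the two samples. The following sketch computes only this statistic; the function name and the example values are ours, and it is not the code used for Table 1.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Maximum distance between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical MNCP values for caste motifs vs. random motifs.
d_separated = ks_statistic([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
d_identical = ks_statistic([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

A library routine such as `scipy.stats.ks_2samp` additionally returns a P value for the statistic.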
The main criterion for identifying known regulatory motifs among these caste-specific ones was the alignment of the PSSM for each bee motif with the Drosophila melanogaster sequences in the TRANSFAC database (release 4.0) (Wingender et al., 2000). Only the alignments passing a threshold of 80% identity for each PSSM were considered as significant matches. In addition, we checked for a specific binding motif, the EcR/USP motif (rGkTCAaTGamcy-3′), known to function in the expression of genes responding to morphogenetic hormone titres (Perera et al., 2005).
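A simplified way to screen a sequence for the EcR/USP consensus is to expand its IUPAC degeneracy codes into a regular expression and scan both strands. Note that this is a stand-in for, not a reproduction of, the PSSM-based search described here: exact consensus matching is stricter than a weight matrix, the lower-case letters of the consensus are treated like upper-case ones (an assumption), and the function names and test sequence are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_consensus(seq, consensus="rGkTCAaTGamcy"):
    """Return sorted (position, strand) hits; positions are 0-based on the + strand."""
    body = "".join(IUPAC[c] for c in consensus.upper())
    pattern = re.compile("(?=(" + body + "))")  # lookahead allows overlapping hits
    seq = seq.upper()
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    hits += [
        (len(seq) - m.start() - len(consensus), "-") for m in pattern.finditer(rc)
    ]
    return sorted(hits)


# Hypothetical upstream sequence containing one forward-strand match.
hits = find_consensus("TTAGGTCAATGACCTTT")
```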
Operating system and programming tools
An Ubuntu Linux (version Breezy) operating system was used to implement all scripts and pipelines designed for annotation procedures and motif discovery. The Python programming language (http://www.python.org), Biopython (http://www.biopython.org) and TAMO (Tools for Analysis of Motifs) packages (Gordon et al., 2005) were used in program design. Other web applications were built using the Zope application server (http://www.zope.org), hosted at http://zulu.fmrp.usp.br/beelab.
Acknowledgements
This work was supported by grants from FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo, 9900719-6) and CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico, 4729632-3-1).
References
Abouheif, E. and Wray, G.A. (2002) Evolution of the genetic network underlying wing polyphenism in ants. Science 297: 249–252.
Al-Sharour, F., Diaz-Uriarte, R. and Dopazo, J. (2004) FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes. Bioinformatics 20: 578–580.
Amdam, G.V., Norberg, K., Hagen, A. and Omholt, S.W. (2003) Social exploitation of vitellogenin. Proc Natl Acad Sci USA 100: 1799–1802.
Amdam, G.V., Simões, Z.L.P., Hagen, A., Norberg, K., Schroder, K., Mikkelsen, O., et al. (2004) Hormonal control of the yolk precursor vitellogenin regulates immune function and longevity in honeybees. Exp Gerontol 39: 767–773.
Antoniewski, C., O’Grady, M.S., Edmondson, R.G., Lassieur, S.M. and Benes, H. (1995) Characterization of an EcR/USP heterodimer target site that mediates ecdysone responsiveness of the Drosophila Lsp-2 gene. Mol Gen Genet 249: 545–556.
Arbeitsman, M.N., Furlong, E.E.M., Imam, F., Johnson, E., Null, B.H., Baker, B.S., et al. (2002) Gene expression during the life cycle of Drosophila melanogaster. Science 297: 2270–2275.
Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., et al. (2000) Gene Ontology: tool for the unification of biology. Nature Genet 25: 25–29.
Bailey, T.L. and Elkan, D. (1995) Unsupervised learning of multiple motifs in biopolymers using expectation maximization. Machine Learning J 21: 58–83.
Bergem, M., Norberg, K. and Aamodt, R.M. (2006) Long-term maintenance of in vitro cultured honeybee (Apis mellifera) embryonic cells. BMC Dev Biol 6: 17, online.
Boutros, M., Kiger, A.A., Armknecht, S., Kerr, K., Hild, M., Koch, B., et al. (2004) Genome-wide RNAi analysis of growth and viability in Drosophila cells. Science 303: 832–835.
Carsten, L.D., Watts, T. and Markov, T.A. (2005) Gene expression patterns accompanying a dietary switch in Drosophila melanogaster. Mol Ecol 14: 3203–3208.
Clarke, N.D. and Granek, J.A. (2003) Rank order metrics for quantifying the association of sequence features with gene regulation. Bioinformatics 19: 212–218.
Colombani, J., Bianchini, L., Layalle, S., Pondeville, E., Dauphin-Villemant, C., Antoniewski, C., et al. (2005) Antagonistic actions of ecdysone and insulins determine final size in Drosophila. Science 310: 667–670.
Corona, M., Estrada, E. and Zurita, M. (1999) Differential expression of mitochondrial genes between queens and workers during caste determination in the honeybee Apis mellifera. J Exp Biol 202: 929–938.
Cunha, A.C., Nascimento, A.M., Guidugli, K.R., Simões, Z.L.P. and Bitondi, M.M.G. (2005) Molecular cloning and expression of a hexamerin cDNA from the honey bee. J Insect Physiol 51: 1135–1147.
Darwin, C.R. (1859) On the Origin of Species by Means of Natural Selection. John Murray, London.
Davidson, E.H. (2001) Genomic Regulatory Systems: Development and Evolution. Academic Press, San Diego.
Eder, J., Kremer, J.P. and Rembold, H. (1983) Correlation of cytochrome c titer and respiration in Apis mellifera: adaptive responses to caste determination defines workers, intercastes and queens. Comp Biochem Physiol B 76: 703–716.
Engels, W. (1974) Occurrence and significance of vitellogenins in female castes of social Hymenoptera. Am Zool 14: 1229–1237.
Evans, J.D. and Wheeler, D.E. (1999) Differential gene expression between developing queens and workers in the honey bee, Apis mellifera. Proc Natl Acad Sci USA 96: 5575–5580.
Evans, J.D. and Wheeler, D.E. (2000) Expression profiles during honeybee caste determination. Genome Biol 2: 1, 6 pages, online: http://genomebiology.com/2000/2/1/research/0001.1
Forbes, A. and Lehmann, R. (1998) Nanos and Pumilio have critical roles in the development and function of Drosophila germline stem cells. Development 125: 679–690.
González-Reyes, A., Elliot, H. and St Johnston, D. (1995) Polarization of both major body axes in Drosophila by gurken-torpedo signalling. Nature 375: 654–658.
Goodisman, M.A., Isoe, J., Wheeler, D.E. and Wells, M.A. (2005) Evolution of insect metamorphosis: a microarray-based study of larval and adult gene expression in the ant Camponotus festinatus. Evolution 59: 858–870.
Gordon, D.B., Nekludova, L., McCallum, S. and Fraenkel, E. (2005) TAMO: a flexible, object-oriented framework for analysing transcriptional regulation using DNA-sequence motifs. Bioinformatics 21: 3164–3165.
Guidugli, K.R., Hepperle, C. and Hartfelder, K. (2004) A member of the short-chain dehydrogenase/reductase (SDR) superfamily is a target of the ecdysone response in honey bee (Apis mellifera) caste development. Apidologie 35: 37–47.
Guidugli, K.R., Nascimento, A.M., Amdam, G.V., Barchuk, A.R., Omholt, S.W., Simões, Z.L.P. and Hartfelder, K. (2005) Vitellogenin regulates hormonal dynamics in the worker caste of a eusocial insect. FEBS Lett 579: 4961–4965.
Hagedorn, H.H., Maddison, D.R. and Tu, Z. (1998) The evolution of vitellogenins, cyclorrhaphan yolk proteins and related molecules. Adv Insect Physiol 27: 335–384.
Hall, T.A. (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41: 95–98.
Hamilton, W.D. (1964) The genetical theory of social behaviour I & II. J Theor Biol 7: 1–16, 17–52.
Hartfelder, K. and Emlen, D.J. (2005) Endocrine control of insect polyphenism. In Comprehensive Insect Molecular Science (Gilbert, L.I., Iatrou, K. and Gill, S., eds), Vol. 3, pp. 651–703. Elsevier, Oxford.
Hartfelder, K. and Engels, W. (1998) Social insect polymorphism: Hormonal regulation of plasticity in development and reproduction in the honeybee. Curr Topics Dev Biol 40: 45–77.
Hawkins, N.C., Thorpe, J. and Schüpbach, T. (1996) encore, a gene required for the regulation of germ line mitosis and oocyte differentiation during Drosophila oogenesis. Development 122: 281–290.
Haydak, M.H. (1970) Honey bee nutrition. Annu Rev Entomol 15: 143–156.
Hepperle, C. and Hartfelder, K. (2001) Differentially expressed regulatory genes in honey bee caste development. Naturwissenschaften 88: 113–116.
Hoage, T.R. and Kessel, R.G. (1968) An electron microscopic study of the process of differentiation during spermatogenesis in the drone honey bee (Apis mellifera L.) with special reference to centriole replication and elimination. J Ultrastruct Res 24: 6–32.
Honey Bee Genome Sequencing Consortium (2006) Insights into social insects from the genome of the honeybee Apis mellifera. Nature (in press).
Hughes, J.D., Estep, P.W., Tavazole, S. and Church, G.M. (2000) Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae. J Mol Biol 296: 1205–1214.
Hunt, G.J. and Page, R.E. (1995) Linkage map of the honey bee, Apis mellifera, based on RAPD markers. Genetics 139: 1371–1382.
Hunt, J.H., Buck, N.A. and Wheeler, D.E. (2003) Storage proteins in vespid wasps: characterization, developmental pattern and occurrence in adults. J Insect Physiol 49: 785–794.
Judice, C.C., Carazzole, M.F., Festa, F., Sogayar, M.C., Hartfelder, K. and Pereira, G.A.G. (2006) Gene expression profiles underlying alternative caste phenotypes in a highly eusocial bee, Melipona quadrifasciata. Insect Mol Biol 15: 33–44.
Kerr, W.E. (1950) Genetic determination of castes in the genus Melipona. Genetics 35: 143–152.
Kropácová, S. and Haslbachová, H. (1971) The influence of queenlessness and of unsealed brood on the development of ovaries in worker honeybees. J Apicult Res 10: 57–61.
Lattorff, H.M.G., Moritz, R.F.A. and Fuchs, S. (2005) A single gene determines thelytokous parthenogenesis in honey bee workers (Apis mellifera capensis). Heredity 94: 533–537.
Lattorff, H.M.G., Moritz, R.F.A., Solignac, M. and Crewe, R.M. (2006) Control of reproductive dominance by the thelytoky gene in honeybees. Biol Lett (in press).
Li, T.R. and White, K.P. (2003) Tissue-specific gene expression and ecdysone-regulated genomic networks in Drosophila. Dev Cell 5: 59–72.
Lin, H., Yue, L. and Spradling, A.C. (1994) The Drosophila fusome, a germline-specific organelle, contains membrane skeletal proteins and functions in cyst formation. Development 120: 947–956.
Liu, X.S., Brutlag, D.L. and Liu, J.S. (2002) An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments. Nature Biotechnol 20: 835–839.
Liu, Y., Liu, S., Wei, L., Altman, R.B. and Batzoglou, S. (2004) Eukaryotic regulatory element conservation analysis and identification using comparative genomics. Genome Res 14: 354–366.
Martinez, T., Burmester, T., Veenstra, J.A. and Wheeler, D.E. (2001) Sequence and evolution of a hexamerin from the ant Camponotus festinatus. Insect Mol Biol 9: 427–431.
McKearin, D. (1997) The Drosophila fusome, organelle biogenesis and germ cell differentiation: if you build it … Bioessays 19: 147–152.
Mirth, C., Truman, J.W. and Riddiford, L.M. (2005) The role of the prothoracic gland in determining critical weight for metamorphosis in Drosophila melanogaster. Curr Biol 15: 1–12.
Moritz, R.F.A., Kryger, P. and Allsopp, M.H. (1996) Competition for royalty in bees. Nature 384: 31.
Page, R.E. and Erickson, E.H. (1988) Reproduction by worker honey bees (Apis mellifera L.). Behav Ecol Sociobiol 23: 117–126.
Pereboom, J.J.M., Jordan, W.C., Sumner, S., Hammond, R.L. and Bourke, A.F.G. (2005) Differential gene expression in queen-worker caste determination in bumble bees. Proc R Soc B 272: 1145–1152.
Perera, S.C., Zheng, S., Feng, Q.L., Krell, P.J., Retnakaran, A. and Palli, S.R. (2005) Heterodimerization of ecdysone receptor and ultraspiracle on symmetric and asymmetric response elements. Arch Insect Biochem Physiol 60: 55–70.
Piulachs, M.D., Guidugli, K.R., Barchuk, A.R., Cruz, J., Simões, Z.L.P. and Bellés, X. (2003) The vitellogenin of the honeybee, Apis mellifera: Structural analysis of the cDNA and expression studies. Insect Biochem Mol Biol 33: 459–465.
Pritsker, M., Liu, Y.C., Beer, M.A. and Tavazole, S. (2004) Whole genome discovery of transcription factor binding sites by network-level conservation. Genome Res 14: 99–108.
Rachinsky, A., Strambi, C., Strambi, A. and Hartfelder, K. (1990) Caste and metamorphosis – hemolymph titers of juvenile hormone and ecdysteroids in last instar honeybee larvae. Gen Comp Endocr 79: 31–38.
Riddihough, G. and Pelham, H.R.B. (1987) An ecdysone response element in the Drosophila hsp27 promotor. EMBO J 6: 3729–3734.
Roth, F.P., Hughes, J.D., Estep, P.W. and Church, G.M. (1998) Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantification. Nature Biotechnol 16: 939–945.
Rutherford, K., Parkhill, J., Crook, J., Horsnell, T., Rice, P., Rajandream, M.A. and Barrell, B. (2000) Artemis: sequence visualisation and annotation. Bioinformatics 16: 944–945.
Schmidt-Capella, I.C. and Hartfelder, K. (1998) Juvenile hormone effect on DNA synthesis and apoptosis in caste-specific differentiation of the larval honey bee (Apis mellifera L.) ovary. J Insect Physiol 44: 385–391.
Snodgrass, R.E. (1956) Anatomy of the Honey Bee. Cornell University Press, Ithaca.
Solignac, M., Vautrin, D., Baudry, E., Mougel, F., Loiseau, A. and Cornuet, J.M. (2004) A microsatellite-based linkage map of the honeybee, Apis mellifera L. Genetics 167: 253–262.
St Johnston, D. and Nüsslein-Volhard, C. (1992) The origin of pattern and polarity in the Drosophila embryo. Cell 68: 201–219.
Staab, S. and Steinmann-Zwicky, M. (1995) Female germ cells of Drosophila require zygotic ovo and otu product for survival in larvae and pupae respectively. Mech Dev 54: 205–210.
Sullivan, A.A. and Thummel, C.S. (2003) Temporal profiles of nuclear receptor gene expression profiles reveal coordinate transcriptional responses during Drosophila development. Mol Endocrinol 17: 2125–2137.
Wassermann, W.W. and Sandelin, A. (2004) Applied bioinformatics for the identification of regulatory elements. Nature Rev Genet 5: 267–287.
West-Eberhard, M.J. (2003) Developmental Plasticity and Evolution. Oxford University Press, Oxford.
Wheeler, D.E. and Nijhout, H.F. (2003) A perspective for understanding the modes of juvenile hormone as a lipid signaling system. Bioessays 25: 994–1001.
Wilson, E.O. (1971) The Insect Societies. Belknap Press of Harvard University Press, Cambridge, MA.
Wingender, E., Chen, X., Hehl, R., Karas, H., Liebich, I., Matys, V., et al. (2000) TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res 28: 316–319.
Zhou, X., Oi, F.M. and Scharf, M.E. (2006a) Social exploitation of hexamerin: RNAi reveals a major caste-regulatory factor in termites. Proc Natl Acad Sci USA 103: 4499–4504.
Zhou, X., Tarver, M.R., Bennett, G.W., Oi, F.M. and Scharf, M.E. (2006b) Two hexamerin genes from the termite Reticulitermes flavipes: sequence, expression and proposed functions in caste regulation. Gene 376: 47–58.
Zhu, Z., Shendure, J. and Church, G.M. (2005) Discovering functional transcription factor combinations in the human cell cycle. Genome Res 15: 848–855.
Supplementary material
The following material is available for this article online:
Table S1. Annotation results of caste-specifically expressed honey bee genes.
Table S2. Caste-specific motifs in UCRs of honey bee genes.
Table S3. Annotation of honey bee orthologs to Drosophila melanogaster genes involved in Reproduction.
This material is available as part of the online article from http://www.blackwell-synergy.com
namely, which caste is the novelty: the queen or the worker(s)? Phrased in other terms, the genome sequence information now permits us to address, at a molecular level, questions that are fundamental to understanding the role of (and evolutionary trends in) ontogenetic processes that structure insect societies, especially in hymenopterans. Such basic questions are: (1) how many degrees of freedom (or release from constraints) may actually have been gained from splitting the functions normally performed by a solitary ancestral hymenopteran female into two or more castes, and (2) how was this release from constraints integrated into postembryonic differentiation processes to generate truly alternative phenotypes? A second observation of potential interest to functional genomics was that a relatively large subset of the caste-related genes maps to chromosome 2 (seven of 51 unique sequences).
Most genes in the caste gene list are represented by one or two EST hits, except for a predicted hexamerin 70b gene (GB10869-PA). This gene was evidenced by 10 ESTs, one in a 5′-located exon and nine in the 3′ region (five ESTs comprising parts of exon 7 and parts of the 3′-UTR, the other four ESTs landing in exons 6 and 7). The macroarray data (Evans & Wheeler, 2000) established this gene as overexpressed in the worker caste. Hexamerins are an important class of storage proteins that show interesting expression patterns related to caste and reproduction in many social insects (Martinez et al., 2001; Hunt et al., 2003; Zhou et al., 2006a,b). A cDNA encoding the honeybee Hexamerin 70b subunit has recently been cloned and sequenced (Cunha et al., 2005), and hormone manipulation experiments showed that the abundance of hexamerin 70b transcripts in larval development is positively correlated with high levels of JH and ecdysteroids. This could actually reflect a regulatory feedback function in JH titre regulation, as exemplified in the termite Reticulitermes flavipes, where the Hex1/Hex2 ratio controls JH availability for caste-specifically differentiating tissues (Zhou et al., 2006b).
Within the honey bee caste genes for which GO information was imported and deduced from their Drosophila orthologs, we noted a predominance of terms clustering as ‘cellular physiological process’ (95%; GO:0050875) and ‘metabolism’ (90%; GO:0008152) in the ‘Biological Process’ (GO:0008150) category (Fig. 1A). GO-statistics differences between queens and workers became apparent in terms clustering as ‘cell differentiation’ (0% for queen and 28.5% for workers; GO:0030154) and ‘metabolism’ (96% for queen and 78.5% for worker; GO:0008152) in the ‘Biological Process’ (GO:0008150) category (Fig. 2A).
With respect to ‘Molecular Function’ (GO:0003674), most terms were related to mRNA translation: ‘nucleic acid binding’ (38%; GO:0003676), ‘structural constituent of ribosome’ (24%; GO:0003735), ‘protein binding’ (12%; GO:0005515), ‘nucleotide binding’ (12%; GO:0000166) and ‘translation factor activity, nucleic acid binding’ (7%; GO:0008135). Further important terms were ‘oxidoreductase activity’ (19%; GO:0016491) and ‘hydrolase activity’ (16.5%; GO:0016787) (Fig. 1B). For these latter two terms we noted potentially interesting differences related to caste, with ‘hydrolase activity’ being overrepresented by worker-transcribed genes, whereas ‘oxidoreductase activity’ was exclusively represented by queen genes (Fig. 2B). Even though these GO assignments on Molecular Function are based on evidence from D. melanogaster, without experimental evidence for Apis mellifera, the corresponding genes are well conserved in sequence and show the relevant protein domains (Supplementary material, Table 1S), and thus are indicative of functional trends.
Figure 1. Dominant gene ontology terms for (A) Biological Process and (B) Molecular Function in honey bee genes with an experimentally validated caste-specific expression pattern during the last larval instar. The graph was generated by a FatiGO analysis set at level 3. Frequencies indicate the appearance of GO terms in the total set of queen and worker differentially expressed genes.
In general terms, the caste-specific separation into metabolic pathway preferences, oxidoreductases vs. hydrolases, may reflect the switch in diet that a worker larva experiences during the fourth and fifth larval instar. This represents a switch from a protein/lipid-rich diet to a more carbohydrate-rich diet (Haydak, 1970), and this switch apparently is accompanied by an increase in the expression of genes coding for proteins with hydrolase activity. Similar switches in gene expression patterns have recently been reported for D. melanogaster in an experiment where larvae were shifted from a cornmeal diet to a banana diet (Carsten et al., 2005), resulting in the up- or downregulation of 55 genes of a test population of 6000. Among these are five genes with dehydrogenase/oxidoreductase activity. These parallels in dietary switch responses are indicative of conserved coregulated gene networks. An open question is, of course, how these can be co-opted to generate different phenotypes, such as the castes of social insects. In this respect, social insects clearly go a big step beyond the simple metabolic switch response seen in Drosophila. They have apparently incorporated divergent metabolic regulation into a network architecture consistent with morphogenetic differentiation. This required that metabolic regulation became integrated, through the endocrine system, with developmental patterning processes.
The importance of metabolic regulation in caste development has also come to light in a recent Representational Difference Analysis (RDA) study on caste development in the highly eusocial stingless bee Melipona quadrifasciata (Judice et al., 2006). This is particularly interesting because in this genus caste development is thought to be based on a genetic predisposition (Kerr, 1950). Metabolic regulation may thus be a sine qua non for caste development, and caste-specific metabolic pathways may be set in motion rather independently of the nature of the initial switch (nutritional or genetic). The question of how this metabolic switch may integrate with the resultant endocrine signature characteristic for each caste is still a wide open field, but recent studies in Drosophila showing an interaction between ecdysone and insulin signalling in the determination of body size (Colombani et al., 2005; Mirth et al., 2005) may provide a lead.
This is also the point to reflect on how justified it is to heuristically rely on Drosophila orthologs and to use their GO attributes in a developmental context (caste differentiation) that has no parallel in Drosophila. A recent gene expression profiling study in the ant Camponotus festinatus, employing a microarray set-up of 384 clones, showed significantly different expression levels for larval vs. adult ants in 91 genes (21 confirmed by qRT–PCR), including an Apis hexamerin 70b ortholog (Goodisman et al., 2005). When comparing the temporal expression patterns of these ant genes with expression profiles for their respective Drosophila orthologs (Arbeitsman et al., 2002), relatively little accord was noted for the two species, leading to the suggestion that these genes may have taken on distinct functions due to the long divergence time between dipterans and hymenopterans (Goodisman et al., 2005). Differences aside, these examples show that in practically all studies on large-scale functional considerations in gene expression we are strongly wedded to Drosophila, and even though functional divergence in orthologs may have occurred, there is little experimental gene-by-gene evidence available for any of the major insect orders outside of Diptera.
Functional studies are clearly profiting from the now available honey bee genome sequence, as evident from the increasing number of RNAi experiments in honeybees (see citations in Honey Bee Genome Sequencing Consortium, 2006). This is still a small number compared with the large-scale RNAi assays established for Drosophila (Boutros et al., 2004), but the development of cell culture approaches in the honey bee (Bergem et al., 2006) represents a step in this direction.
Figure 2. Gene Ontology categories with caste-specific expression patterns. For Biological Process (A), genes classified as part of cell differentiation processes are significantly overexpressed in workers, whereas genes related to metabolism are overexpressed in queen larvae. In the Molecular Function categories (B) we observed an apparent split indicating differential enzyme preferences in queens (overexpress oxidoreductases) and in workers (overexpress hydrolases). The graph was generated by a FatiGO analysis set at level 3. Frequencies indicate the appearance of GO terms in the queen (black bars) and worker differentially expressed genes (grey bars).
Alternatively, regulatory functional associations between genes and their integration into networks can be inferred from the presence of response elements in upstream control regions. In our analysis of differentially expressed genes in queen-worker development, we took a bioinformatics approach for a first look into the molecular architecture of a developmental polyphenism.
Motif search in upstream regions of differentially expressed genes
The genes related to caste development are among the first honey bee genes for which experimentally validated expression data were generated (Corona 2004). Certainly, these 51 genes do not comprise all the genes involved in caste development, but they are expected to be prominent players, as they were the ones that stood out in the SSH and DDRT-PCR approaches. The 51 caste genes do not represent gene families, but rather fall into many very different molecular function categories. This made us ask whether the observed overexpression pattern of different genes in either queen or worker larvae may be associated with the occurrence of specific regulatory motifs in the upstream control regions (UCRs) of these genes.
Three different algorithms, AlignACE (Roth et al., 1998), MEME (Bailey & Elkan, 1995) and MDscan (Liu et al., 2002), were used to construct a pipeline for detecting overrepresented motifs in the two unaligned sets of UCR sequences for the caste-specifically expressed genes. This pipeline was run on a ‘top-10’ set of 12 genes (six for each caste), which showed the most pronounced caste differences in expression (Evans & Wheeler, 2000), and also on a randomly selected set of UCRs (background control). We calculated four different metrics for each motif: MAP score (Roth et al., 1998), a group-specificity score (Church score) (Hughes et al., 2000), and a ROC AUC and MNCP metric (Clarke & Granek, 2003). A first set of filters was used to detect motifs with a potential for regulatory functions (MAP score ≥ 5; ROC AUC ≥ 0.7). This resulted in 46 motifs out of 123 total UCR motifs found in the queen UCR set and in 71 motifs out of 261 total found in the worker UCR set (Supplementary material, Table 2S).
A parametric statistical test (MANOVA; P = 0.0001, Wilks’ = 0.78, F = 72) and a nonparametric statistical test (Kolmogorov–Smirnov; Table 1) on ROC AUC and MNCP indices showed that these two sets of filtered motifs are significantly different from a randomly selected set of motifs. The rank-order metrics ROC AUC and MNCP have previously been used to compare the association of short regulatory sequence features with gene expression data (microarray analyses on coregulated genes), and they have been useful in flagging false positives erroneously included in ‘top-10’ sets of differentially expressed genes (Clarke & Granek, 2003).
To select highly specific motifs found in each data set, we used the group-specificity score (Church score ≤ 1e−05; Hughes et al., 2000) to identify the most likely motifs involved in decision making for pathways leading to queen (two motifs; Fig. 3A) or to worker development (12 motifs with Church score ≤ 1e−07; Fig. 3B). As the SSH and DDRT-PCR approaches on caste development can be expected to retrieve only a subpopulation of such genes, these motifs represent only a partial scenario of the transcriptional regulatory network underlying caste development. The motifs can now be used to screen other GLEAN3-predicted genes to integrate a candidate list of putatively coregulated genes in caste development that can be submitted to further experimental validation.
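To give an intuition for the group-specificity score, the sketch below computes a hypergeometric tail probability: the chance of finding at least the observed number of group genes among the top-scoring UCRs if these were drawn at random from the background. We assume here that this matches the definition of Hughes et al. (2000); the function name, the choice of 100 top-scoring UCRs and the example counts are illustrative assumptions, not values from this study.

```python
from math import comb


def group_specificity(n_genome, n_group, n_top, n_overlap):
    """P(overlap >= n_overlap) under the hypergeometric distribution.

    n_genome  : number of UCRs in the background database
    n_group   : number of UCRs in the gene group (e.g. queen genes)
    n_top     : number of top-scoring UCRs for the motif genome-wide
    n_overlap : how many of those top-scoring UCRs belong to the group
    """
    upper = min(n_group, n_top)
    total = comb(n_genome, n_top)
    tail = sum(
        comb(n_group, k) * comb(n_genome - n_group, n_top - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 4 of 6 group UCRs among the 100 best sites
# in a background of 10156 UCRs.
score = group_specificity(10156, 6, 100, 4)
```

Small values indicate that the motif's best sites are concentrated in the gene group rather than spread over the genome.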
Each motif found in UCRs of queen (46) and worker (71) overexpressed genes was compared with the entire set of D. melanogaster cis-regulatory motifs contained in the TRANSFAC database (version 4.0; Wingender et al., 2000). Only alignments passing 80% identity for each position-specific scoring matrix (PSSM) were considered as significant matches. Whereas none of the most specific motifs for each caste showed similarity to any of the D. melanogaster motifs, some of the more ubiquitous ones did resemble binding sites of transcription factors, such as Antennapedia, Ultrabithorax, zerknüllt, even skipped, trithorax-like, tailless, paired, fushi tarazu and Adh transcription factor 1 (Supplementary material, Table 2S).
When we plotted the positions of the two queen and the 12 worker motifs in the UCRs of the caste-specifically expressed genes (Fig. 4), an interesting pattern emerged for the worker-specific motifs. Some of the worker motifs appeared to be clustered and occurring in tandem; furthermore, they were positioned relatively close to the predicted translation start sites in some of the genes that are overexpressed during worker development (annotation results of these genes are listed in Supplementary material, Table 1S). A position close to the predicted translation start sites is generally taken as a sign of strong regulatory effect (Davidson, 2001).
Table 1. Kolmogorov–Smirnov analysis of ROC AUC and MNCP metric for statistical significance of putative regulatory motifs in upstream control regions of genes with queen or worker-specific expression patterns. These motifs were contrasted with a random set of motifs detected in a random set of UCRs of GLEAN3-predicted honey bee genes.
Group pairs                  ROC AUC    MNCP
Random × (Queen + Worker)    P > 0.1    P < 0.001
Random × Queen               P > 0.1    P < 0.005
Random × Worker              P > 0.1    P < 0.001
Queen × Worker               P < 0.1    P > 0.1
As caste development is highly dependent on changes in haemolymph titres of JH and ecdysteroids, we also screened the UCRs of the differentially expressed genes for putative nuclear receptor binding sites. Regulatory elements involved in the JH response are not well understood yet, so any prediction in this direction would be elusive (Wheeler & Nijhout, 2003). Functional ecdysone response elements (EcRE) have, however, been identified, and it is now well established that the EcR/USP complex binds to direct or inverted (palindromic) repeats (Riddihough & Pelham, 1987; Antoniewski et al., 1995; Perera et al., 2005). A PSSM search (Wassermann & Sandelin, 2004) based on a canonical representation (rGkTCAaTGamcy) (Perera et al., 2005) did not reveal any putative EcRE motif in the UCRs of the 51 caste-differentially expressed genes. However, this does not rule out that these genes respond to changes in JH and/or ecdysteroid titres, as these hormones require EcR/USP binding primarily in the expression of early genes, but not necessarily for the late response genes (Li & White, 2003; Sullivan & Thummel, 2003).
In conclusion, the predictions from such a combined strategy that searches for group-specific and for conserved regulatory motifs in GLEAN3-predicted honey bee genes represent a major transition from nonhypothesis-driven high-throughput screens to hypothesis-driven searches for context-dependent gene expression in honey bees. Such directed search results can serve as a platform for experimental analyses of genome-wide integration in hormonal control of caste development in bees. In addition, this study exemplifies how existent algorithms for detecting shared regulatory motifs can be joined into a toolkit for predicting coregulated gene expression patterns in honey bees. These methods have been shown to be robust and are gaining acceptance for use in functional and comparative genomics (Liu et al., 2004; Pritsker et al., 2004; Zhu et al., 2005).
Figure 3. Putative regulatory motifs and their consensus sequences in UCRs of queen and worker overexpressed genes. Scores for MAP, Church, ROC AUC and MNCP metrics indicate degree of group specificity and significance level.
Oogenesis and reproduction
As caste development sets the stage for reproductive division of labour, genes involved in reproductive processes are strong candidates for functional analyses. In the present study we performed BLAST searches to identify honey bee orthologs for a list of 32 fly genes with the GO attribute ‘oogenesis’ and for four genes specifically related to ‘vitellogenesis’. The list for fly genes involved in nuclear events in germ cells consisted of 20 genes for ‘female meiosis’, 12 genes for ‘recombination’ and 21 genes under the heading ‘chromosome segregation, including segregation distortion’ (Supplementary material, Table 3S). In some cases these GO attributes for fly genes, of course, overlapped.
BLASTN and BLASTX searches for these fly genes against the honey bee genome assembly 3.0 and the GLEAN3 Official Set (aa) retrieved statistically well-supported putative bee orthologs for most of these candidates. For the genes involved in meiosis, recombination and chromosome segregation this finding, although not unexpected, is of interest, as meiosis in the haploid honey bee drone is strongly modified when compared with a normal diploid meiosis. The first meiosis is initiated, but the nucleus remains undivided and only the superfluous centrioles are eliminated as cytoplasmic buds (Hoage & Kessel, 1968). An interesting gene, thelytoky (th), has recently been mapped in this context (Lattorff et al., 2005). It almost completely prevents meiotic recombination in the automixis of laying workers of the Cape honey bee. As an indication of the interplay between meiosis and later development, this locus also appears to be an integral part of various gene cascades involved in caste determination (Lattorff et al., 2006).
The fly genes retrieved in the GO searches for ‘oogenesis’ represent a much larger range of Molecular Function categories, such as transcription factors, proteins regulating translation by RNA binding, RNA helicases, enzymes (ubiquitination, transfer of sugar residues, sulfotransferase), GTPase activity and several factors binding to cytoskeletal proteins (Supplementary material, Table 3S). This wide range of functional categories is expected, as these genes are involved in a series of different steps during oogenesis in the polytrophic meroistic ovary. Oogenesis starts out with the maintenance of germline and somatic stem cell identity in the germline niche in the upper germarium. A key gene involved in this process is pumilio (Forbes & Lehmann, 1998), which is represented by a highly conserved bee ortholog, GB10504-PA. The second step is the formation of germ cell cysts, the determination of an oocyte within each cyst and the survival of these cysts, involving genes such as benign gonial cell neoplasm (Lin et al., 1994), encore (Hawkins et al., 1996), ovo and ovarian tumour (otu) (Staab & Steinmann-Zwicky, 1995), all well conserved in the honey bee genome. Interestingly, we could not find a clear bee ortholog for bag of marbles (bam), which is one of the prime early response genes in the cystoblast differentiation pathway in Drosophila (McKearin, 1997).
Figure 4. Map of the group-specific motifs found in queen and worker UCRs of caste-specifically expressed genes. The coding region is represented by the GLEAN3 prediction number (assembly 4.0), with arrows indicating the translation start site. Asterisks mark UCRs of the ‘top10’ set used to find the over-represented motifs.
The third step comprises previtellogenic growth of the follicle, and during these stages a number of maternal factors that define the egg and the future embryonic axes are deposited and anchored either within the oocyte or in the perivitelline space (for review see St Johnston & Nüsslein-Volhard, 1992). In the list of Drosophila genes involved in early steps of axis determination, a couple of surprises came up in the search for honey bee orthologs. A big surprise was that we could not find a gurken ortholog in the bee, even though this gene sets up both the anterior–posterior and dorsal–ventral axes in the Drosophila egg (González-Reyes et al., 1995), whereas downstream components of the Gurken signalling cascade appear to be preserved in the bee genome (Honey Bee Genome Sequencing Consortium, 2006). Similar apparent gaps in constituents of patterning cascades were noted for the terminal regions of the embryo, such as the lack of a torso ortholog, whereas its ligand torso-like is represented by a well-conserved ortholog in the bee (GB18663-PA).
With respect to genes involved in the final processes of oogenesis, we primarily looked at genes that play a part during vitellogenesis. There are four genes of interest in this class, the primary one coding for the yolk protein precursor vitellogenin. This gene has already been sequenced for the honey bee (Piulachs et al., 2003) and, as expected, it is much more related to vitellogenins of other insects and even vertebrates than to the Drosophila yolk proteins, which apparently are derived from lipases (Hagedorn et al., 1998). The second gene of interest is the bee ortholog to yolkless, as this (GB16571-PA) could represent a putative vitellogenin receptor. The other two Drosophila genes with clear orthologs in the bee are CG18641 and CG12139, which code for a lipase and an LDL receptor, respectively.
General conclusions
The current analysis made use of previous experimental analyses on differential transcription during caste development of honey bee larvae. In the annotation of these genes, which includes references to Gene Ontology terms associated with their respective Drosophila orthologs, two major findings emerged. First of all, worker genes were better defined in terms of GO attributes compared with the relatively large number of queen genes that had no GO terms associated with their respective Drosophila orthologs. Even when taking into consideration the conceptual limits in attributing GO terms on molecular function and biological process from Drosophila orthologs to honey bee genes, this finding could have a bearing on a basic question in socioevolution, namely which caste is more divergent from a nonsocial reproductive female bee prototype or reproductive ground plan, the queen or the worker. Less speculative is the second major conclusion coming out of the GO analysis for Molecular Function, showing and confirming (Eder et al., 1983; Corona et al., 1999) the important role of metabolic regulation in caste development. This facet is demonstrated especially clearly in the caste-specific expression of oxidoreductases (queen) vs. hydrolases (workers).
The honey bee genome information not only provided a much improved annotation platform for caste-specifically expressed ESTs but, even more so, opens the possibility to explore putative regulatory features of the honey bee genome. In the current study we employed modified Gibbs sampling and expectation-maximization algorithms (AlignACE, MDscan, MEME) to detect group-specific motifs in gene regions up to 1000 bp upstream of translation start sites. We detected 14 motifs that were significantly over-represented in the caste genes when compared with corresponding motifs found in a random set of GLEAN3-predicted honey bee genes. The localization of such motifs in UCRs of worker-overexpressed genes revealed a clustering close to the predicted basal promoter regions, suggesting strong regulatory effects. Such search strategies and the detected motifs can provide the lead to reveal and unravel cis-regulatory networks for and within specific contexts of honey bee biology.
Caste polyphenism in social insects makes a strong case for the emergence of novelties at a microevolutionary level (West-Eberhard, 2003). Without pretending to discuss exhaustively the mechanisms underlying this surge of developmental plasticity, two major themes become apparent in this and other studies. Regulatory change has been demonstrated in the shut-down of wing disc patterning cascades in ants (Abouheif & Wray, 2002) and is certainly also implicit in observed temporal changes in gene expression during postembryonic development of ants and bumble bees (Goodisman et al., 2005; Pereboom et al., 2005). Such change would be expected to involve cis-regulatory elements, that is, change in transcription factor binding sites in UCRs, as approached in this study, and also evolutionary change in response thresholds to circulating morphogenetic hormones (for review see Hartfelder & Emlen, 2005). The second and quite unexpected theme is the acquisition of new systemic functions by evolutionarily rather old proteins, such as vitellogenin and hexamerins. These apparently unspectacular proteins have evolved into key players for caste evolution and reproductive division of labour via novel regulatory connectivity with JH (Amdam et al., 2004; Guidugli et al., 2005; Zhou et al., 2006a,b).
Experimental procedures
Selection and annotation of ESTs representing differentially expressed genes in honey bee caste development
The starting point was a set of 164 entries (mainly 3′-ESTs) in GenBank (BG101532–BG101697) from an SSH library (Evans & Wheeler, 1999; Evans & Wheeler, 2000). When validated by macroarray analyses, a clustering into three major classes became apparent: (I) genes overexpressed in young larvae, (II) genes overexpressed in last instar queen larvae and (III) genes overexpressed in last instar worker larvae. For this study we excluded the class I ESTs because their expression is not caste-specific but rather represents expression differences between young (still bipotent) and older larvae. To the class II queen ESTs (82) we added one complete cDNA entry (AY601642) from a DDRT-PCR screen (Corona et al., 1999), and to the class III set of worker ESTs (40) we added seven GenBank dbEST entries (BG149167–BG149173) from a DDRT-PCR screen on ovary development (Hepperle & Hartfelder, 2001).
The EST sequences were submitted to BLASTN searches (parameters: -G 2 -E 3 -W 15 -F ‘m D’ -U -e 1e-20) against genome sequence assembly Amel_v3.0 to retrieve matches in linked or unlinked genomic contigs and to exclude no-matches (seven ESTs in queen). ESTs that aligned within the same scaffold were checked for clustering and overlap. This clustering also served to exclude genes that were represented by non-overlapping ESTs from both castes. This procedure generated a set of 51 unique putative gene sequences overexpressed in either queen (34) or worker larvae (17). These 51 nonredundant sequences were submitted to BLASTX searches against the Official Set of GLEAN3-predicted protein sequences (cut-off value at 1e−20). For ESTs with no significant protein sequence matches, the genomic regions adjacent to the mapped EST were searched to find neighbouring ORFs, especially those nearest to putative 3′ UTRs of predicted proteins, as the EST libraries have a bias in this direction.
Official Set protein sequences were aligned against the Amel_v3.0 sequence assembly using TBLASTN to map protein to genome, and subsequently they were aligned using BLASTP against the GenBank nonredundant (nr) and the Flybase protein sequence databases. The manual feature annotation procedure of the Artemis 7.0 program (Rutherford et al., 2000) was used to map ORFs, putative splice sites of exons and ESTs to genome coordinates. The final annotation file was generated with a Python script in GFF format (http://www.sanger.ac.uk/Software/formats/GFF).
Honey bee sequences annotated as orthologs to D. melanogaster genes were putatively assigned the GO terms listed in the respective Flybase entry. In addition, the definition of new GO terms (Biological Process ontology) related to caste development and polyphenism (GO:0048651 and GO:0048650, respectively) was co-ordinated with the Gene Ontology Consortium (Ashburner et al., 2000). The FatiGO web tool (Al-Sharour et al., 2004) was used to cluster GO terms (level 3 setting) for Biological Process and Molecular Function.
For the detection of conserved domains, the 51 protein sequences were screened against the Pfam database (http://www.sanger.ac.uk/Software/Pfam) using the HMMER platform (current release 2.3.2, http://hmmer.wustl.edu) with a cut-off value set at 1e−10.
Annotation of oogenesis and reproduction genes
In order to identify putative honey bee orthologs to D. melanogaster genes, we searched the following GO terms in Flybase: ‘oogenesis’ (GO:0009993), ‘vitellogenesis’ (GO:0007296), ‘female meiosis’ (GO:0007143), ‘DNA recombination’ (GO:0006310) and ‘chromosome segregation’ (GO:0007059). Genes related to segregation distortion were searched for in Flybase in phenotypic descriptions and mutant effects of D. melanogaster genes, as this phenomenon is not represented by a GO term. Hence, this group may be more heterogeneous than the others. From this list we removed genes of pleiotropic function (multifaceted GO entries in Biological Process) and genes that lacked defined transcripts in the Drosophila genome database.
For the GO terms ‘oogenesis’ and ‘vitellogenesis’ we performed TBLASTN and BLASTP searches for 42 fruit fly genes against the Amel_v3.0 genome assembly and the GLEAN3-predicted protein sequences (Honey Bee Genome Sequencing Consortium, 2006), respectively. The orthologous D. melanogaster gene was characterized by the same procedure as described above (reciprocal best hit). For the GO terms ‘female meiosis’, ‘DNA recombination’, ‘chromosome segregation’ and the non-GO group ‘segregation distortion’, transcripts of D. melanogaster were searched against nr databases at NCBI using BLASTP. The obtained sequences were searched against the Amel_v3.0 genome assembly and the GLEAN3-predicted protein sequences using TBLASTN. Homologous sequences (threshold 1e−10) were predicted using the BioEdit software (Hall, 1999). ORFs showing significant homology (BLASTP threshold 1e−20) were assembled and used in BLASTP searches against the nr databases at NCBI.
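The reciprocal best hit criterion mentioned above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes that BLAST results have already been parsed into (query, subject, E-value) tuples, the function names are hypothetical, and the default E-value cut-off simply reuses the 1e−20 threshold quoted in the text. In the usage example, only the pumilio/GB10504 pair comes from this article; the other identifiers and all E-values are made up.

```python
def best_hits(hits):
    """Map each query to its subject with the lowest E-value.

    hits: iterable of (query, subject, evalue) tuples.
    """
    best = {}
    for query, subject, evalue in hits:
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(fly_vs_bee, bee_vs_fly, max_evalue=1e-20):
    """Return {fly_gene: bee_gene} pairs that are each other's best hit."""
    forward = best_hits(h for h in fly_vs_bee if h[2] <= max_evalue)
    backward = best_hits(h for h in bee_vs_fly if h[2] <= max_evalue)
    return {fly: bee for fly, bee in forward.items()
            if backward.get(bee) == fly}


# Hypothetical parsed BLAST output (E-values invented for illustration).
fly_vs_bee = [
    ("pumilio", "GB10504", 1e-80),
    ("pumilio", "GB99999", 1e-30),
    ("bam", "GB11111", 1e-5),      # fails the E-value cut-off
]
bee_vs_fly = [
    ("GB10504", "pumilio", 1e-75),
    ("GB99999", "nanos", 1e-40),
]
orthologs = reciprocal_best_hits(fly_vs_bee, bee_vs_fly)
```

A fly gene without any hit below the cut-off, like bam in this toy input, simply does not appear in the result.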
Motif search in upstream regions in caste-specifically expressed genes
In order to detect over-represented motifs in the upstream control regions (UCRs) of the two sets of caste-related genes, we selected gene subsets based on two criteria: (1) those that had shown the highest caste-specificity in the array analyses (Evans & Wheeler, 2000) and (2) those that had a conserved 5′ region when compared with the Drosophila orthologs. These ‘top10’ genes consisted of six queen genes (GB13072, GB11628, GB19380, GB14798, GB16047 and GB18242) and of six worker genes (GB10869, GB12371, GB12239, GB10428, GB19006 and GB14758). The motif search was conducted separately on the two sets of UCR sequences using three methods: AlignACE (Roth et al., 1998), MEME (Bailey & Elkan, 1995) and MDscan (Liu et al., 2002). Default parameter values were used in all searches, except that GC content in intergenic regions was set to 25%, representing the background value established for the honey bee UCR database generated in this study.
The database containing 10156 UCR sequences was generated by parsing the Official Set annotation file (downloaded in GFF format from http://www.beegenome.hgsc.bcm.tmc.edu/beeftp.html) to extract upstream regions starting from the terminal 5′-genomic coordinate of each predicted CDS. The UCRs were arbitrarily set to a size frame of 1000 nucleotides (Roth et al., 1998), but were trimmed whenever another predicted ORF was detected in any of these regions.
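The windowing and trimming rule described above can be expressed compactly. The sketch below is an assumption-laden illustration in Python, not the script used in this study: it assumes that CDS features on one scaffold have already been parsed from the GFF file into (start, end, strand) tuples with 1-based inclusive coordinates, and the function name and arguments are hypothetical.

```python
def upstream_region(gene, neighbours, scaffold_length, size=1000):
    """Return (start, end) of the upstream window of a CDS, or None if empty.

    gene and each neighbour are (start, end, strand) tuples, 1-based inclusive.
    The window is at most `size` nucleotides long and is trimmed where another
    predicted CDS reaches into it.
    """
    start, end, strand = gene
    if strand == "+":
        # Upstream lies at lower coordinates for forward-strand genes.
        lo, hi = max(1, start - size), start - 1
        for _, n_end, _ in neighbours:
            if lo <= n_end < start:
                lo = n_end + 1
    else:
        # Upstream lies at higher coordinates for reverse-strand genes.
        lo, hi = end + 1, min(scaffold_length, end + size)
        for n_start, _, _ in neighbours:
            if end < n_start <= hi:
                hi = n_start - 1
    return (lo, hi) if lo <= hi else None
```

For example, `upstream_region((5000, 6000, "+"), [(4000, 4500, "+")], 100000)` yields a window that starts just after the neighbouring CDS rather than a full 1000-nucleotide frame.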
The MAP (maximum a priori log likelihood) score, the group specificity score (called Church score in this manuscript) (Hughes et al., 2000), the ROC AUC (area under the curve for a receiver-operator characteristic plot) metric and the MNCP (mean normalized conditional probability) metric (Clarke & Granek, 2003) were used to detect motifs that most likely correspond to biologically significant cis-regulatory elements. The filters run on the UCRs of the subsets of queen and worker genes were a MAP score cut-off value of 50, followed by a ROC AUC cut-off at 0.7, followed by a group specificity score cut-off at 1e−05. The UCR database of all GLEAN3-predicted honey bee genes was used as the background to calculate these metrics.
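As a rough illustration of the filtering cascade, the ROC AUC of a motif can be computed as the probability that a UCR from the target group scores higher for the motif than a UCR from the background, and the three cut-offs can then be applied in sequence. The following Python sketch is a simplified stand-in for the TAMO-based implementation: the function names and the dictionary keys are hypothetical, the MAP cut-off is passed explicitly rather than hard-coded, and it assumes that higher MAP and ROC AUC values and lower group specificity scores indicate better motifs.

```python
def roc_auc(target_scores, background_scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for t in target_scores:
        for b in background_scores:
            if t > b:
                wins += 1.0
            elif t == b:
                wins += 0.5
    return wins / (len(target_scores) * len(background_scores))


def passes_filters(motif, map_cutoff, auc_cutoff=0.7, church_cutoff=1e-5):
    """Apply the MAP, ROC AUC and group specificity cut-offs in that order.

    motif: dict with hypothetical keys 'map', 'roc_auc' and 'church'.
    """
    if motif["map"] < map_cutoff:
        return False
    if motif["roc_auc"] < auc_cutoff:
        return False
    # Assumption: the group specificity score is p-value-like (smaller is better).
    return motif["church"] <= church_cutoff
```

The pairwise form is quadratic in the number of sequences, which is acceptable for a sketch but would be replaced by a rank-based computation for a background of about 10 000 UCRs.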
A parametric test (MANOVA) and a nonparametric test (Kolmogorov–Smirnov) were conducted to identify significance levels for the two sets of filtered motifs found in the UCRs of caste-specifically expressed genes against filtered motifs found in the random UCR set. Random motifs were sampled from a motif database (10 391 motifs) generated by running our script 100 times with random sets of UCR sequences.
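For readers unfamiliar with the nonparametric comparison, the two-sample Kolmogorov–Smirnov statistic is the largest vertical distance between the empirical distribution functions of two samples, here for example the ROC AUC or MNCP values of caste motifs versus random motifs. The sketch below is a plain-Python illustration under that assumption; it is not the code used in this study, the function names are hypothetical, and the p-value uses the standard large-sample approximation, which is only rough for small samples.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D (maximum ECDF distance)."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)   # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)   # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic two-sided p-value for D with sample sizes n and m."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam == 0.0:
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))
```

A small p-value for the comparison of random against caste-specific motifs, together with a large one for queen against worker motifs, would correspond to the pattern reported for the MNCP metric in Table 1.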
The main criterion for identifying known regulatory motifs among these caste-specific ones was the alignment of the PSSM for each bee motif with the Drosophila melanogaster sequences in the TRANSFAC database (release 4.0) (Wingender et al., 2000). Only the alignments passing a threshold of 80% identity for each PSSM were considered as significant matches. In addition, we checked for a specific binding motif, the EcR/USP motif (rGkTCAaTGamcy-3′), known to function in the expression of genes responding to morphogenetic hormone titres (Perera et al., 2005).
Operating system and programming tools
An Ubuntu Linux (version Breezy) operating system was used to implement all scripts and pipelines designed for annotation procedures and motif discovery. The Python programming language (http://www.python.org), Biopython (http://www.biopython.org) and TAMO (Tools for Analysis of Motifs) packages (Gordon et al., 2005) were used in program design. Other web applications were built using the Zope application server (http://www.zope.org), hosted at http://zulu.fmrp.usp.br/beelab.
Acknowledgements
This work was supported by grants from FAPESP (Fundação de Amparo a Pesquisa do Estado de São Paulo, 9900719-6) and CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico, 4729632-3-1).
References
Abouheif E and Wray GA (2002) Evolution of the genetic network underlying wing polyphenism in ants. Science 297: 249–252.
Al-Sharour F, Diaz-Uriarte R and Dopazo J (2004) FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes. Bioinformatics 20: 578–580.
Amdam GV, Norberg K, Hagen A and Omholt SW (2003) Social exploitation of vitellogenin. Proc Natl Acad Sci USA 100: 1799–1802.
Amdam GV, Simões ZLP, Hagen A, Norberg K, Schroder K, Mikkelsen O et al. (2004) Hormonal control of the yolk precursor vitellogenin regulates immune function and longevity in honeybees. Exp Gerontol 39: 767–773.
Antoniewski C, O’Grady MS, Edmondson RG, Lassieur SM and Benes H (1995) Characterization of an EcR/USP heterodimer target site that mediates ecdysone responsiveness of the Drosophila Lsp-2 gene. Mol Gen Genet 249: 545–556.
Arbeitman MN, Furlong EEM, Imam F, Johnson E, Null BH, Baker BS et al. (2002) Gene expression during the life cycle of Drosophila melanogaster. Science 297: 2270–2275.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM et al. (2000) Gene Ontology: tool for the unification of biology. Nature Genet 25: 25–29.
Bailey TL and Elkan C (1995) Unsupervised learning of multiple motifs in biopolymers using expectation maximization. Machine Learning J 21: 58–83.
Bergem M, Norberg K and Aamodt RM (2006) Long-term maintenance of in vitro cultured honeybee (Apis mellifera) embryonic cells. BMC Dev Biol 6: 17 (online).
Boutros M, Kiger AA, Armknecht S, Kerr K, Hild M, Koch B et al. (2004) Genome-wide RNAi analysis of growth and viability in Drosophila cells. Science 303: 832–835.
Carsten LD, Watts T and Markow TA (2005) Gene expression patterns accompanying a dietary switch in Drosophila melanogaster. Mol Ecol 14: 3203–3208.
Clarke ND and Granek JA (2003) Rank order metrics for quantifying the association of sequence features with gene regulation. Bioinformatics 19: 212–218.
Colombani J, Bianchini L, Layalle S, Pondeville E, Dauphin-Villemant C, Antoniewski C et al. (2005) Antagonistic actions of ecdysone and insulins determine final size in Drosophila. Science 310: 667–670.
Corona M, Estrada E and Zurita M (1999) Differential expression of mitochondrial genes between queens and workers during caste determination in the honeybee Apis mellifera. J Exp Biol 202: 929–938.
Cunha AC, Nascimento AM, Guidugli KR, Simões ZLP and Bitondi MMG (2005) Molecular cloning and expression of a hexamerin cDNA from the honey bee. J Insect Physiol 51: 1135–1147.
Darwin CR (1859) On the Origin of Species by Means of Natural Selection. John Murray, London.
Davidson EH (2001) Genomic Regulatory Systems: Development and Evolution. Academic Press, San Diego.
Eder J, Kremer JP and Rembold H (1983) Correlation of cytochrome c titer and respiration in Apis mellifera: adaptive responses to caste determination defines workers, intercastes and queens. Comp Biochem Physiol B 76: 703–716.
Engels W (1974) Occurrence and significance of vitellogenins in female castes of social Hymenoptera. Am Zool 14: 1229–1237.
Evans JD and Wheeler DE (1999) Differential gene expression between developing queens and workers in the honey bee, Apis mellifera. Proc Natl Acad Sci USA 96: 5575–5580.
Evans JD and Wheeler DE (2000) Expression profiles during honeybee caste determination. Genome Biol 2: 1, 6 pages, online: http://genomebiology.com/2000/2/1/research/0001.1
Forbes A and Lehmann R (1998) Nanos and Pumilio have critical roles in the development and function of Drosophila germline stem cells. Development 125: 679–690.
González-Reyes A, Elliot H and St Johnston D (1995) Polarization of both major body axes in Drosophila by gurken-torpedo signalling. Nature 375: 654–658.
Goodisman MA, Isoe J, Wheeler DE and Wells MA (2005) Evolution of insect metamorphosis: a microarray-based study of larval and adult gene expression in the ant Camponotus festinatus. Evolution 59: 858–870.
Gordon DB, Nekludova L, McCallum S and Fraenkel E (2005) TAMO: a flexible, object-oriented framework for analysing transcriptional regulation using DNA-sequence motifs. Bioinformatics 21: 3164–3165.
Guidugli KR, Hepperle C and Hartfelder K (2004) A member of the short-chain dehydrogenase/reductase (SDR) superfamily is a target of the ecdysone response in honey bee (Apis mellifera) caste development. Apidologie 35: 37–47.
Guidugli KR, Nascimento AM, Amdam GV, Barchuk AR, Omholt SW, Simões ZLP and Hartfelder K (2005) Vitellogenin regulates hormonal dynamics in the worker caste of a eusocial insect. FEBS Lett 579: 4961–4965.
Hagedorn HH, Maddison DR and Tu Z (1998) The evolution of vitellogenins, cyclorrhaphan yolk proteins and related molecules. Adv Insect Physiol 27: 335–384.
Hall TA (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41: 95–98.
Hamilton WD (1964) The genetical theory of social behaviour I & II. J Theor Biol 7: 1–16, 17–52.
Hartfelder K and Emlen DJ (2005) Endocrine control of insect polyphenism. In Comprehensive Insect Molecular Science (Gilbert LI, Iatrou K and Gill S, eds), Vol. 3, pp. 651–703. Elsevier, Oxford.
Hartfelder K and Engels W (1998) Social insect polymorphism: hormonal regulation of plasticity in development and reproduction in the honeybee. Curr Topics Dev Biol 40: 45–77.
Hawkins NC, Thorpe J and Schüpbach T (1996) encore, a gene required for the regulation of germ line mitosis and oocyte differentiation during Drosophila oogenesis. Development 122: 281–290.
Haydak MH (1970) Honey bee nutrition. Annu Rev Entomol 15: 143–156.
Hepperle C and Hartfelder K (2001) Differentially expressed regulatory genes in honey bee caste development. Naturwissenschaften 88: 113–116.
Hoage TR and Kessel RG (1968) An electron microscopic study of the process of differentiation during spermatogenesis in the drone honey bee (Apis mellifera L.) with special reference to centriole replication and elimination. J Ultrastruct Res 24: 6–32.
Honey Bee Genome Sequencing Consortium (2006) Insights into social insects from the genome of the honeybee Apis mellifera. Nature (in press).
Hughes JD, Estep PW, Tavazoie S and Church GM (2000) Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae. J Mol Biol 296: 1205–1214.
Hunt GJ and Page RE (1995) Linkage map of the honey bee, Apis mellifera, based on RAPD markers. Genetics 139: 1371–1382.
Hunt JH, Buck NA and Wheeler DE (2003) Storage proteins in vespid wasps: characterization, developmental pattern, and occurrence in adults. J Insect Physiol 49: 785–794.
Judice CC, Carazzole MF, Festa F, Sogayar MC, Hartfelder K and Pereira GAG (2006) Gene expression profiles underlying alternative caste phenotypes in a highly eusocial bee, Melipona quadrifasciata. Insect Mol Biol 15: 33–44.
Kerr WE (1950) Genetic determination of castes in the genus Melipona. Genetics 35: 143–152.
Kropácová S and Haslbachová H (1971) The influence of queenlessness and of unsealed brood on the development of ovaries in worker honeybees. J Apicult Res 10: 57–61.
Lattorff HMG, Moritz RFA and Fuchs S (2005) A single gene determines thelytokous parthenogenesis in honey bee workers (Apis mellifera capensis). Heredity 94: 533–537.
Lattorff HMG, Moritz RFA, Solignac M and Crewe RM (2006) Control of reproductive dominance by the thelytoky gene in honeybees. Biol Lett (in press).
Li TR and White KP (2003) Tissue-specific gene expression and ecdysone-regulated genomic networks in Drosophila. Dev Cell 5: 59–72.
Lin H, Yue L and Spradling AC (1994) The Drosophila fusome, a germline-specific organelle, contains membrane skeletal proteins and functions in cyst formation. Development 120: 947–956.
Liu XS, Brutlag DL and Liu JS (2002) An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments. Nature Biotechnol 20: 835–839.
Liu Y, Liu S, Wei L, Altman RB and Batzoglou S (2004) Eukaryotic regulatory element conservation analysis and identification using comparative genomics. Genome Res 14: 354–366.
Martinez T, Burmester T, Veenstra JA and Wheeler DE (2001) Sequence and evolution of a hexamerin from the ant Camponotus festinatus. Insect Mol Biol 9: 427–431.
McKearin D (1997) The Drosophila fusome, organelle biogenesis and germ cell differentiation: if you build it… Bioessays 19: 147–152.
Mirth C, Truman JW and Riddiford LM (2005) The role of the prothoracic gland in determining critical weight for metamorphosis in Drosophila melanogaster. Curr Biol 15: 1–12.
Moritz RFA, Kryger P and Allsopp MH (1996) Competition for royalty in bees. Nature 384: 31.
Page RE and Erickson EH (1988) Reproduction by worker honey bees (Apis mellifera L.). Behav Ecol Sociobiol 23: 117–126.
Pereboom JJM, Jordan WC, Sumner S, Hammond RL and Bourke AFG (2005) Differential gene expression in queen-worker caste determination in bumble bees. Proc R Soc B 272: 1145–1152.
Perera SC, Zheng S, Feng QL, Krell PJ, Retnakaran A and Palli SR (2005) Heterodimerization of ecdysone receptor and ultraspiracle on symmetric and asymmetric response elements. Arch Insect Biochem Physiol 60: 55–70.
Piulachs MD, Guidugli KR, Barchuk AR, Cruz J, Simões ZLP and Bellés X (2003) The vitellogenin of the honeybee, Apis mellifera. Structural analysis of the cDNA and expression studies. Insect Biochem Mol Biol 33: 459–465.
Pritsker M, Liu YC, Beer MA and Tavazoie S (2004) Whole genome discovery of transcription factor binding sites by network-level conservation. Genome Res 14: 99–108.
Rachinsky A, Strambi C, Strambi A and Hartfelder K (1990) Caste and metamorphosis – hemolymph titers of juvenile hormone and ecdysteroids in last instar honeybee larvae. Gen Comp Endocr 79: 31–38.
Riddihough G and Pelham HRB (1987) An ecdysone response element in the Drosophila hsp27 promoter. EMBO J 6: 3729–3734.
Roth FP, Hughes JD, Estep PW and Church GM (1998) Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantification. Nature Biotechnol 16: 939–945.
Rutherford K, Parkhill J, Crook J, Horsnell T, Rice P, Rajandream MA and Barrell B (2000) Artemis: sequence visualisation and annotation. Bioinformatics 16: 944–945.
Schmidt-Capella IC and Hartfelder K (1998) Juvenile hormone effect on DNA synthesis and apoptosis in caste-specific differentiation of the larval honey bee (Apis mellifera L.) ovary. J Insect Physiol 44: 385–391.
Snodgrass RE (1956) Anatomy of the Honey Bee. Cornell University Press, Ithaca.
Solignac M, Vautrin D, Baudry E, Mougel F, Loiseau A and Cornuet JM (2004) A microsatellite-based linkage map of the honeybee, Apis mellifera L. Genetics 167: 253–262.
St Johnston D and Nüsslein-Volhard C (1992) The origin of pattern and polarity in the Drosophila embryo. Cell 68: 201–219.
Staab S and Steinmann-Zwicky M (1995) Female germ cells of Drosophila require zygotic ovo and otu product for survival in larvae and pupae respectively. Mech Dev 54: 205–210.
Sullivan AA and Thummel CS (2003) Temporal profiles of nuclear receptor gene expression profiles reveal coordinate transcriptional responses during Drosophila development. Mol Endocrinol 17: 2125–2137.
Wassermann WW and Sandelin A (2004) Applied bioinformatics for the identification of regulatory elements. Nature Rev Genet 5: 267–287.
West-Eberhard MJ (2003) Developmental Plasticity and Evolution. Oxford University Press, Oxford.
Wheeler DE and Nijhout HF (2003) A perspective for understanding the modes of juvenile hormone as a lipid signaling system. Bioessays 25: 994–1001.
Wilson EO (1971) The Insect Societies. Belknap Press of Harvard University Press, Cambridge, MA.
Wingender E, Chen X, Hehl R, Karas H, Liebich I, Matys V et al. (2000) TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res 28: 316–319.
Zhou X, Oi FM and Scharf ME (2006a) Social exploitation of hexamerin: RNAi reveals a major caste-regulatory factor in termites. Proc Natl Acad Sci USA 103: 4499–4504.
Zhou X, Tarver MR, Bennett GW, Oi FM and Scharf ME (2006b) Two hexamerin genes from the termite Reticulitermes flavipes: sequence, expression, and proposed functions in caste regulation. Gene 376: 47–58.
Zhu Z, Shendure J and Church GM (2005) Discovering functional transcription factor combinations in the human cell cycle. Genome Res 15: 848–855.
Supplementary material
The following material is available for this article online:
Table S1. Annotation results of caste-specifically expressed honey bee genes.
Table S2. Caste-specific motifs in UCRs of honey bee genes.
Table S3. Annotation of honey bee orthologs to Drosophila melanogaster genes involved in Reproduction.
This material is available as part of the online article from http://www.blackwell-synergy.com
relevant protein domains (Supplementary material, Table 1S) and thus are indicative of functional trends.
In general terms, the caste-specific separation into metabolic pathway preferences, oxidoreductases vs. hydrolases, may reflect the switch in diet that a worker larva experiences during the fourth and fifth larval instar. This represents a switch from a protein/lipid-rich diet to a more carbohydrate-rich diet (Haydak, 1970), and this switch apparently is accompanied by an increase in the expression of genes coding for proteins with hydrolase activity. Similar switches in gene expression patterns have recently been reported for D. melanogaster in an experiment where larvae were shifted from a cornmeal diet to a banana diet (Carsten et al., 2005), resulting in the up- or downregulation of 55 genes of a test population of 6000. Among these are five genes with dehydrogenase/oxidoreductase activity. These parallels in dietary switch responses are indicative of conserved coregulated gene networks. An open question is, of course, how these can be co-opted to generate different phenotypes, such as the castes of social insects. In this respect, social insects clearly go a big step beyond the simple metabolic switch response seen in Drosophila. They have apparently incorporated divergent metabolic regulation into a network architecture consistent with morphogenetic differentiation. This required that metabolic regulation became integrated, through the endocrine system, with developmental patterning processes.
The importance of metabolic regulation on caste development has also come to light in a recent Representational Difference Analysis (RDA) study on caste development in the highly eusocial stingless bee Melipona quadrifasciata (Judice et al., 2006). This is particularly interesting because in this genus caste development is thought to be based on a genetic predisposition (Kerr, 1950). Metabolic regulation may thus be a sine qua non for caste development, and caste-specific metabolic pathways may be set in motion rather independently of the nature of the initial switch (nutritional or genetic). The question of how this metabolic switch may integrate with the resultant endocrine signature characteristic for each caste is still a widely open field, but recent studies in Drosophila showing an interaction between ecdysone and insulin signalling in the determination of body size (Colombani et al., 2005; Mirth et al., 2005) may provide a lead.
This is also the point to reflect on how justified it is to heuristically rely on Drosophila orthologs and to use their GO attributes in a developmental context (caste differentiation) that has no parallel in Drosophila. A recent gene expression profiling study in the ant Camponotus festinatus, employing a microarray set-up of 384 clones, showed significantly different expression levels for larval vs. adult ants in 91 genes (21 confirmed by qRT-PCR), including an Apis hexamerin 70b ortholog (Goodisman et al., 2005). When comparing the temporal expression patterns of these ant genes with expression profiles for their respective Drosophila orthologs (Arbeitsman et al., 2002), relatively little accord was noted for the two species, leading to the suggestion that these genes may have taken on distinct functions due to the long divergence time between dipterans and hymenopterans (Goodisman et al., 2005). Differences aside, these examples show that in practically all studies on large-scale functional considerations in gene expression we are strongly wedded to Drosophila, and even though functional divergence in orthologs may have occurred, there is little experimental gene-by-gene evidence available for any of the major insect orders outside of Diptera.
Functional studies are clearly profiting from the now available honey bee genome sequence, as evident from the increasing number of RNAi experiments in honey bees (see citations in Honey Bee Genome Sequencing Consortium, 2006). This is still a small number compared with the large-scale RNAi assays established for Drosophila (Boutros et al., 2004), but the development of cell culture approaches in the honey bee (Bergem et al., 2006) represents a step in this direction.
Figure 2. Gene Ontology categories with caste-specific expression patterns for Biological Process (A). Genes classified as part of cell differentiation processes are significantly overexpressed in workers, whereas genes related to metabolism are overexpressed in queen larvae. In the Molecular Function categories (B) we observed an apparent split indicating differential enzyme preferences in queens (overexpress oxidoreductases) and in workers (overexpress hydrolases). The graph was generated by a FatiGO analysis set at level 3. Frequencies indicate the appearance of GO terms in the queen (black bars) and worker differentially expressed genes (grey bars).
Alternatively, regulatory functional associations between genes and their integration into networks can be inferred from the presence of response elements in upstream control regions. In our analysis of differentially expressed genes in queen-worker development we took a bioinformatics approach for a first look into the molecular architecture of a developmental polyphenism.
Motif search in upstream regions of differentially expressed genes
The genes related to caste development are among the first honey bee genes for which experimentally validated expression data were generated (Corona 2004). Certainly, these 51 genes do not comprise all the genes involved in caste development, but they are expected to be prominent players, as they were the ones that stood out in the SSH and DDRT-PCR approaches. The 51 caste genes do not represent gene families but rather fall into many very different molecular function categories. This made us ask whether the observed overexpression pattern of different genes in either queen or worker larvae may be associated with the occurrence of specific regulatory motifs in the upstream control regions (UCR) of these genes.
Three different algorithms, AlignACE (Roth et al., 1998), MEME (Bailey & Elkan, 1995) and MDscan (Liu et al., 2002), were used to construct a pipeline for detecting overrepresented motifs in the two unaligned sets of UCR sequences for the caste-specifically expressed genes. This pipeline was run on a 'top-10' set of 12 genes (six for each caste) which showed the most pronounced caste differences in expression (Evans & Wheeler, 2000), and also on a randomly selected set of UCRs (background control). We calculated four different metrics for each motif: MAP score (Roth et al., 1998), a group-specificity score (Church score) (Hughes et al., 2000), and a ROC AUC and MNCP metric (Clarke & Granek, 2003). A first set of filters was used to detect motifs with a potential for regulatory functions (MAP score ≥ 5, ROC AUC ≥ 0.7). This resulted in 46 motifs out of 123 total UCR motifs found in the queen UCR set and in 71 motifs out of 261 total found in the worker UCR set (Supplementary material, Table 2S).
A parametric statistical test (MANOVA, P = 0.0001, Wilks' lambda = 0.78, F = 72) and a nonparametric statistical test (Kolmogorov–Smirnov, Table 1) on ROC AUC and MNCP indices showed that these two sets of filtered motifs are significantly different from a randomly selected set of motifs. The rank-order metrics ROC AUC and MNCP have previously been used to compare the association of short regulatory sequence features with gene expression data (microarray analyses on coregulated genes), and they have been useful in flagging false positives erroneously included in 'top-10' sets of differentially expressed genes (Clarke & Granek, 2003).
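The rank-order idea behind the ROC AUC metric can be illustrated with a minimal Python sketch. It assumes that every UCR has already been given a numerical motif score (for example a match count); the function name and the toy scores are hypothetical and this is not the implementation used in the study.

```python
def roc_auc(positive_scores, background_scores):
    """Area under the ROC curve, computed as the probability that a UCR
    from the positive (caste) set scores higher than a UCR from the
    background set. Ties count as half a win."""
    wins = 0.0
    for p in positive_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(positive_scores) * len(background_scores))


# Toy motif scores (hypothetical): caste-set UCRs vs. background UCRs.
caste_scores = [3, 2, 2, 1]
background = [1, 0, 0, 2]
auc = roc_auc(caste_scores, background)
keep_motif = auc >= 0.7  # the ROC AUC cut-off named in the text
```

A value of 0.5 corresponds to a motif that does not separate the two sets, whereas values approaching 1.0 indicate that motif occurrences rank the caste-set UCRs above the background.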
To select highly specific motifs found in each data set we used the group-specificity score (Church score ≤ 1e−05; Hughes et al., 2000) to identify the most likely motifs involved in decision making for pathways leading to queen (two motifs, Fig. 3A) or to worker development (12 motifs with Church score ≤ 1e−07, Fig. 3B). As the SSH and DDRT-PCR approaches on caste development can be expected to retrieve only a subpopulation of such genes, these motifs represent only a partial scenario of the transcriptional regulatory network underlying caste development. The motifs can now be used to screen other GLEAN3-predicted genes to integrate a candidate list of putatively coregulated genes in caste development that can be submitted to further experimental validation.
Each motif found in UCRs of queen (46) and worker (71) overexpressed genes was compared with the entire set of D. melanogaster cis-regulatory motifs contained in the TRANSFAC database (version 4.0; Wingender et al., 2000). Only alignments passing 80% identity for each position-specific site matrix (PSSM) were considered as significant matches. Whereas none of the most specific motifs for each caste showed similarity to any of the D. melanogaster motifs, some of the more ubiquitous ones did resemble binding sites of transcription factors, such as Antennapedia, Ultrabithorax, zerknüllt, even skipped, trithorax-like, tailless, paired, fushi tarazu and Adh transcription factor 1 (Supplementary material, Table 2S).
When we plotted the positions of the two queen and the 12 worker motifs in the UCRs of the caste-specifically expressed genes (Fig. 4), an interesting pattern emerged for the worker-specific motifs. Some of the worker motifs appeared to be clustered and occurring in tandem; furthermore, they were positioned relatively close to the predicted translation start sites in some of the genes that are overexpressed during worker development (annotation results of these genes are listed in Supplementary material, Table 1S). A position close to the predicted translation start sites is generally taken as a sign of strong regulatory effect (Davidson, 2001).
Table 1. Kolmogorov–Smirnov analysis of ROC AUC and MNCP metric for statistical significance of putative regulatory motifs in upstream control regions of genes with queen or worker-specific expression patterns. These motifs were contrasted with a random set of motifs detected in a random set of UCRs of GLEAN3-predicted honey bee genes.
Group pairs                  ROC AUC    MNCP
Random × (Queen + Worker)    P > 0.1    P < 0.001
Random × Queen               P > 0.1    P < 0.005
Random × Worker              P > 0.1    P < 0.001
Queen × Worker               P < 0.1    P > 0.1
As caste development is highly dependent on changes in haemolymph titres of JH and ecdysteroids, we also screened the UCRs of the differentially expressed genes for putative nuclear receptor binding sites. Regulatory elements involved in the JH response are not well understood yet, so any prediction in this direction would be elusive (Wheeler & Nijhout, 2003). Functional ecdysone response elements (EcRE) have, however, been identified, and it is now well established that the EcR/USP complex binds to direct or inverted (palindromic) repeats (Riddihough & Pelham, 1987; Antoniewski et al., 1995; Perera et al., 2005). A PSSM search (Wassermann & Sandelin, 2004) based on a canonical representation (rGkTCAaTGamcy) (Perera et al., 2005) did not reveal any putative EcRE motif in the UCRs of the 51 caste-differentially expressed genes. However, this does not rule out that these genes respond to changes in JH and/or ecdysteroid titres, as these hormones require EcR/USP binding primarily in the expression of early genes, but not necessarily for the late response genes (Li & White, 2003; Sullivan & Thummel, 2003).
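As a simplified illustration of such a screen, the canonical EcRE representation can be expanded from IUPAC codes into a regular expression and scanned along a UCR. This sketch is an exact consensus match on one strand, which is stricter than the PSSM scoring described above; the function names and the test sequence are hypothetical, and treating lower-case and upper-case letters alike is an assumption.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}


def find_sites(consensus, sequence):
    """Return (start, matched_site) pairs for every occurrence of an
    IUPAC consensus in the sequence (0-based starts, forward strand only).
    A lookahead is used so that overlapping sites are also reported."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    pattern = re.compile("(?=(%s))" % body)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical upstream sequence containing one consensus-compatible site.
ucr = "TTAGGTCAATGAACCTT"
sites = find_sites("rGkTCAaTGamcy", ucr)
```

An empty list for every UCR corresponds to the negative result reported above; a PSSM-based scan would additionally tolerate mismatches at weakly conserved positions.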
In conclusion, the predictions from such a combined strategy that searches for group-specific and for conserved regulatory motifs in GLEAN3-predicted honey bee genes represent a major transition from nonhypothesis-driven high-throughput screens to hypothesis-driven searches for context-dependent gene expression in honey bees. Such directed search results can serve as a platform for experimental analyses of genome-wide integration in hormonal control of caste development in bees. In addition, this study exemplifies how existent algorithms for detecting shared regulatory motifs can be joined into a toolkit for predicting coregulated gene expression patterns in honey bees. These methods have been shown to be robust and are gaining acceptance for use in functional and comparative genomics (Liu et al., 2004; Pritsker et al., 2004; Zhu et al., 2005).
Figure 3. Putative regulatory motifs and their consensus sequences in UCRs of queen and worker overexpressed genes. Scores for MAP, Church, ROC AUC and MNCP metrics indicate degree of group specificity and significance level.
Oogenesis and reproduction
As caste development sets the stage for reproductive division of labour, genes involved in reproductive processes are strong candidates for functional analyses. In the present study we performed BLAST searches to identify honey bee orthologs for a list of 32 fly genes with the GO attribute 'oogenesis' and for four genes specifically related to 'vitellogenesis'. The list for fly genes involved in nuclear events in germ cells consisted of 20 genes for 'female meiosis', 12 genes for 'recombination' and 21 genes under the heading 'chromosome segregation, including segregation distortion' (Supplementary material, Table 3S). In some cases these GO attributes for fly genes, of course, overlapped.
BLASTN and BLASTX searches for these fly genes against the honey bee genome assembly 3.0 and the GLEAN3 Official Set (aa) retrieved statistically well supported putative bee orthologs for most of these candidates. For the genes involved in meiosis, recombination and chromosome segregation this finding, although not unexpected, is of interest, as meiosis in the haploid honey bee drone is strongly modified when compared with a normal diploid meiosis. The first meiosis is initiated, but the nucleus remains undivided and only the superfluous centrioles are eliminated as cytoplasmatic buds (Hoage & Kessel, 1968). An interesting gene, thelytoky (th), has recently been mapped in this context (Lattorff et al., 2005). It prevents almost completely meiotic recombination in the automixis of laying workers of the Cape honey bee. As an indication of the interplay between meiosis and later development, this locus also appears to be an integral part of various gene cascades involved in caste determination (Lattorff et al., 2006).
The fly genes retrieved in the GO searches for 'oogenesis' represent a much larger range of Molecular Function categories, such as transcription factors, proteins regulating translation by RNA binding, RNA helicases, enzymes (ubiquitination, transfer of sugar residues, sulfotransferase), GTPase activity and several factors binding to cytoskeletal proteins (Supplementary material, Table 3S). This wide range of functional categories is expected, as these genes are involved in a series of different steps during oogenesis in the polytrophic meroistic ovary. Oogenesis starts out with the maintenance of germline and somatic stem cell identity in the germline niche in the upper germarium. A key gene involved in this process is pumilio (Forbes & Lehmann, 1998), which is represented by a highly conserved bee ortholog, GB10504-PA. The second step is the formation of germ cell cysts, the determination of an oocyte within each cyst and the survival of these cysts, involving genes such as benign gonial cell neoplasm (Lin et al., 1994), encore (Hawkins et al., 1996), ovo and ovarian tumour (otu) (Staab & Steinmann-Zwicky, 1995), all well conserved in the honey bee genome. Interestingly, we could not find a clear bee ortholog for bag of marbles (bam), which is one of the prime early response genes in the cystoblast differentiation pathway in Drosophila (McKearin, 1997).
Figure 4. Map of the group-specific motifs found in queen and worker UCRs of caste-specifically expressed genes. The coding region is represented by the GLEAN3 prediction number (assembly 4.0), with arrows indicating the translation start site. Asterisks mark UCRs of the 'top10' set used to find the over-represented motifs.
The third step comprises previtellogenic growth of the follicle, and during these stages a number of maternal factors are deposited and anchored, either within the oocyte or in the perivitelline space, that define the egg and the future embryonic axes (for review see St Johnston & Nüsslein-Volhard, 1992). In the list of Drosophila genes involved in early steps of axis determination, a couple of surprises came up in the search for honey bee orthologs. A big surprise was that we could not find a gurken ortholog in the bee, even though this gene sets up both the anterior–posterior and dorsal–ventral axes in the Drosophila egg (González-Reyes et al., 1995), whereas downstream components of the Gurken signalling cascade appear to be preserved in the bee genome (Honey Bee Genome Sequencing Consortium, 2006). Similar apparent gaps in constituents of patterning cascades were noted for the terminal regions of the embryo, such as a lack of a torso ortholog, whereas its ligand, torso-like, is represented by a well conserved ortholog in the bee (GB18663-PA).
With respect to genes involved in the final processes of oogenesis, we primarily looked at genes that play a part during vitellogenesis. There are four genes of interest in this class, the primary one coding for the yolk protein precursor vitellogenin. This gene has already been sequenced for the honey bee (Piulachs et al., 2003) and, as expected, it is much more related to vitellogenins of other insects and even vertebrates than to the Drosophila yolk proteins, which apparently are derived from lipases (Hagedorn et al., 1998). The second gene of interest is the bee ortholog to yolkless, as this (GB16571-PA) could represent a putative vitellogenin receptor. The other two Drosophila genes with clear orthologs in the bee are CG18641 and CG12139, which code for a lipase and an LDL receptor, respectively.
General conclusions
The current analysis made use of previous experimental analyses on differential transcription during caste development of honey bee larvae. In the annotation of these genes, which includes references to Gene Ontology terms associated with their respective Drosophila orthologs, two major configurations emerged. First of all, worker genes were better defined in terms of GO attributes compared with the relatively large number of queen genes that had no GO terms associated with their respective Drosophila orthologs. Even when taking into consideration the conceptual limits in attributing GO terms on molecular function and biological process from Drosophila orthologs to honey bee genes, this finding could have a bearing on general basic questions in socioevolution, namely, which caste is more divergent from a nonsocial reproductive female bee prototype or reproductive ground plan, the queen or the worker? Less speculative is the second major conclusion coming out of the GO analysis for Molecular Function, showing and confirming (Eder et al., 1983; Corona et al., 1999) the important role of metabolic regulation in caste development. This facet is demonstrated especially clearly in the caste-specific expression of oxidoreductases (queen) vs. hydrolases (workers).
The honey bee genome information not only provided a much improved annotation platform for caste-specifically expressed ESTs but, even more so, opens the possibility to explore putative regulatory features of the honey bee genome. In the current study we employed modified Gibbs sampling and expectation-maximization algorithms (AlignACE, MDscan, MEME) to detect group-specific motifs in gene regions up to 1000 bp upstream of translation start sites. We detected 14 motifs that were significantly overrepresented in the caste genes when compared with corresponding motifs found in a random set of GLEAN3-predicted honey bee genes. The localization of such motifs in UCRs of worker-overexpressed genes revealed a clustering of such motifs close to the predicted basal promoter regions, suggesting strong regulatory effects. Such search strategies and the detected motifs can provide the lead to reveal and unravel cis-regulatory networks for and within specific contexts of honey bee biology.
Caste polyphenism in social insects makes a strong case for the emergence of novelties at a microevolutionary level (West-Eberhard, 2003). Without the pretension to discuss exhaustively the mechanisms underlying this surge of developmental plasticity, two major themes become apparent in this and other studies. Regulatory change has been demonstrated in the shut-down of wing disc patterning cascades in ants (Abouheif & Wray, 2002) and is certainly also implicit in observed temporal changes in gene expression during postembryonic development of ants and bumble bees (Goodisman et al., 2005; Pereboom et al., 2005). Such change would be expected to involve cis-regulatory elements, that is, change in transcription factor binding sites in UCRs, as approached in this study, and also evolutionary change in response thresholds to circulating morphogenetic hormones (for review see Hartfelder & Emlen, 2005). The second and quite unexpected theme is the acquisition of new systemic functions by evolutionary rather old proteins, such as vitellogenin and hexamerins. These apparently unspectacular proteins have evolved into key players for caste evolution and reproductive division of labour via novel regulatory connectivity with JH (Amdam et al., 2004; Guidugli et al., 2005; Zhou et al., 2006a,b).
Experimental procedures
Selection and annotation of ESTs representing differentially expressed genes in honey bee caste development
The starting point was 164 entries (mainly 3′-ESTs) in GenBank (BG101532–BG101697) from an SSH library (Evans & Wheeler, 1999; Evans & Wheeler, 2000). When validated by macroarray analyses, a clustering into three major classes became apparent: (I) genes overexpressed in young larvae, (II) genes overexpressed in last instar queen larvae and (III) genes overexpressed in last instar worker larvae. For this study we excluded the class I ESTs because their expression is not caste-specific but rather represents expression differences between young (still bipotent) and older larvae. To the class II queen ESTs (82) we added one complete cDNA entry (AY601642) from a DDRT-PCR screen (Corona et al., 1999), and to the class III set of worker ESTs (40) we added seven GenBank dbEST entries (BG149167–BG149173) from a DDRT-PCR screen on ovary development (Hepperle & Hartfelder, 2001).
The EST sequences were submitted to BLASTN searches (parameters: -G 2 -E 3 -W 15 -F 'm D' -U -e 1e-20) against genome sequence assembly Amel_v3.0 to retrieve matches in linked or unlinked genomic contigs and to exclude no-matches (seven ESTs in queen). ESTs that aligned within the same scaffold were checked for clustering and overlap. This clustering also served to exclude genes that were represented by non-overlapping ESTs from both castes. This procedure generated a set of 51 unique putative gene sequences overexpressed in either queen (34) or in worker larvae (17). These 51 nonredundant sequences were submitted to BLASTX searches against the Official Set of GLEAN3-predicted protein sequences (cut-off value at 1e−20). For ESTs with no significant protein sequence matches, the genomic regions adjacent to the mapped EST were searched to find neighbouring ORFs, especially those nearest to putative 3′ UTRs of predicted proteins, as the EST libraries have a bias in this direction.
Official Set protein sequences were aligned against the Amel_v3.0 sequence assembly using TBLASTN to map protein to genome, and subsequently they were aligned using BLASTP against the GenBank nonredundant (nr) and the Flybase protein sequence databases. The manual features annotation procedure of the Artemis 7.0 program (Rutherford et al., 2000) was used to map ORFs, putative splice sites of exons and ESTs to genome coordinates. The final annotation file was generated with a Python script in GFF format (http://www.sanger.ac.uk/Software/formats/GFF).
Honey bee sequences annotated as orthologs to D. melanogaster genes were putatively assigned the GO terms listed in the respective Flybase entry. In addition, the definition of new GO terms (Biological Process ontology) related to caste development and polyphenism (GO:0048651 and GO:0048650, respectively) was co-ordinated with the Gene Ontology Consortium (Ashburner et al., 2000). The FatiGO web tool (Al-Sharour et al., 2004) was used to cluster GO terms (level 3 setting) for Biological Process and Molecular Function.
For the detection of conserved domains, the 51 protein sequences were screened against the Pfam database (http://www.sanger.ac.uk/Software/Pfam) using the HMMER platform (current release 2.3.2; http://hmmer.wustl.edu) with a cut-off value set at 1e−10.
Annotation of oogenesis and reproduction genes
In order to identify putative honey bee orthologs to D. melanogaster genes we searched the following GO terms in Flybase: 'oogenesis' (GO:0009993), 'vitellogenesis' (GO:0007296), 'female meiosis' (GO:0007143), 'DNA recombination' (GO:0006310) and 'chromosome segregation' (GO:0007059). Genes related to segregation distortion were searched for in Flybase in phenotypic descriptions and mutant effects of D. melanogaster genes, as this phenomenon is not represented by a GO term. Hence, this group may be more heterogeneous than the others. From this list we removed genes of pleiotropic function (multifaceted GO entries in Biological Process) and genes that lacked defined transcripts in the Drosophila genome database.
For the GO terms 'oogenesis' and 'vitellogenesis' we performed TBLASTN and BLASTP searches for 42 fruit fly genes against the Amel_v3.0 genome assembly and the GLEAN3-predicted protein sequences (Honey Bee Genome Sequencing Consortium, 2006), respectively. The orthologous D. melanogaster gene was characterized by the same procedure as described above (reciprocal best hit). For the GO terms 'female meiosis', 'DNA recombination', 'chromosome segregation' and the non-GO group 'segregation distortion', transcripts of D. melanogaster were searched against nr databases at NCBI using BLASTP. The obtained sequences were searched against the Amel_v3.0 genome assembly and the GLEAN3-predicted protein sequences using TBLASTN. Homologous sequences (threshold 1e−10) were predicted using the BioEdit software (Hall, 1999). ORFs showing significant homology (BLASTP threshold 1e−20) were assembled and used in BLASTP searches against the nr databases at NCBI.
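The reciprocal best hit criterion mentioned above can be sketched in Python as follows. The sketch assumes that BLAST results have already been parsed into dictionaries of (subject, E-value) pairs; the function names, the E-values and all identifiers other than pumilio and GB10504 are hypothetical toy data, and the 1e−20 cut-off is simply reused from the text.

```python
def best_hit(hits):
    """Return the subject with the lowest E-value, or None if there are no hits."""
    if not hits:
        return None
    return min(hits, key=lambda hit: hit[1])[0]


def reciprocal_best_hits(fly_to_bee, bee_to_fly, max_evalue=1e-20):
    """Pair a fly gene with a bee gene only if each is the other's best
    hit and both hits pass the E-value cut-off."""
    pairs = {}
    for fly_gene, hits in fly_to_bee.items():
        forward = [h for h in hits if h[1] <= max_evalue]
        bee_gene = best_hit(forward)
        if bee_gene is None:
            continue
        backward = [h for h in bee_to_fly.get(bee_gene, []) if h[1] <= max_evalue]
        if best_hit(backward) == fly_gene:
            pairs[fly_gene] = bee_gene
    return pairs


# Toy input (hypothetical E-values and placeholder identifiers).
fly_to_bee = {
    "pumilio": [("GB10504", 1e-80), ("GB_other", 1e-30)],
    "bam": [("GB_other", 1e-5)],
}
bee_to_fly = {
    "GB10504": [("pumilio", 1e-75)],
    "GB_other": [("some_fly_gene", 1e-40), ("bam", 1e-5)],
}
orthologs = reciprocal_best_hits(fly_to_bee, bee_to_fly)
```

In this toy case only the pumilio pair is retained, while bam is left without an assigned ortholog because its only hit fails the cut-off.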
Motif search in upstream regions in caste-specifically expressed genes
In order to detect overrepresented motifs in the upstream control regions (UCRs) of the two sets of caste-related genes we selected gene subsets based on two criteria: (1) those that had shown the highest caste-specificity in the array analyses (Evans & Wheeler, 2000) and (2) those that had a conserved 5′ region when compared with the Drosophila orthologs. These 'top10' genes consisted of six queen genes (GB13072, GB11628, GB19380, GB14798, GB16047 and GB18242) and of six worker genes (GB10869, GB12371, GB12239, GB10428, GB19006, GB14758). The motif search was conducted separately on the two sets of UCR sequences using three methods: AlignACE (Roth et al., 1998), MEME (Bailey & Elkan, 1995) and MDscan (Liu et al., 2002). Default parameter values were used in all searches, except that GC content in intergenic regions was set to 25%, representing the background value established for the honey bee UCR database generated in this study.
The database containing 10 156 UCR sequences was generated by parsing the Official Set annotation file (downloaded in GFF format from http://www.beegenome.hgsc.bcm.tmc.edu/beeftp.html) to extract upstream regions starting from the terminal 5′-genomic coordinate of each predicted CDS. The UCRs were arbitrarily set to a size frame of 1000 nucleotides (Roth et al., 1998) but were trimmed whenever another predicted ORF was detected in any of these regions.
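The extraction rule described above can be sketched in Python. This is a minimal illustration under stated assumptions: each predicted CDS is reduced to a tuple (scaffold, start, end, strand) with 1-based inclusive coordinates as in GFF, the function name is hypothetical, and scaffold lengths are not checked for genes on the minus strand.

```python
def upstream_window(gene, all_genes, size=1000):
    """Return (start, end) of the upstream control region of a gene,
    trimmed where another predicted CDS intrudes, or None if no
    upstream sequence remains."""
    scaffold, start, end, strand = gene
    if strand == "+":
        lo, hi = max(1, start - size), start - 1
    else:
        # On the minus strand "upstream" lies at higher coordinates.
        lo, hi = end + 1, end + size
    for other in all_genes:
        o_scaffold, o_start, o_end, _ = other
        if other == gene or o_scaffold != scaffold:
            continue
        if o_start <= hi and o_end >= lo:  # neighbour overlaps the window
            if strand == "+":
                lo = max(lo, o_end + 1)    # keep the part nearest the gene
            else:
                hi = min(hi, o_start - 1)
    return (lo, hi) if lo <= hi else None


# Toy annotation (hypothetical coordinates).
genes = [
    ("s1", 5000, 6000, "+"),
    ("s1", 4500, 4700, "-"),
    ("s1", 8000, 9000, "-"),
    ("s2", 300, 900, "+"),
]
windows = [upstream_window(g, genes) for g in genes]
```

The first toy gene has a neighbour inside its 1000-nucleotide window, so its UCR is trimmed to the 299 nucleotides between the two coding regions; the last one is truncated by the start of its scaffold.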
The MAP (maximum a priori log likelihood) score, group specificity score (called Church score in this manuscript) (Hughes et al., 2000), ROC AUC (area under the curve for a receiver-operator characteristic plot) metric and MNCP (mean normalized conditional probability) metric (Clarke & Granek, 2003) were used to detect motifs that most likely correspond to biologically significant cis-regulatory elements. The filters run on the UCRs of the subsets of queen and worker genes were a MAP score cut-off value of 5.0, followed by a ROC AUC cut-off at 0.7, followed by a group specificity score cut-off at 1e−05. The UCR database of all GLEAN3-predicted honey bee genes was used as the background to calculate these metrics.
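The sequential filters can be written as a short Python sketch. It assumes that the three scores have already been computed for each motif; the Motif record, the function name and the toy values are hypothetical, and only the three cut-offs are taken from the text.

```python
from collections import namedtuple

# Hypothetical record holding the precomputed scores of one motif.
Motif = namedtuple("Motif", "name map_score roc_auc church")


def filter_motifs(motifs, min_map=5.0, min_auc=0.7, max_church=1e-05):
    """Apply the MAP, ROC AUC and group-specificity cut-offs in sequence."""
    kept = [m for m in motifs if m.map_score >= min_map]
    kept = [m for m in kept if m.roc_auc >= min_auc]
    return [m for m in kept if m.church <= max_church]


# Toy motifs with made-up scores.
candidates = [
    Motif("m1", 12.3, 0.85, 1e-08),  # passes all three filters
    Motif("m2", 3.1, 0.90, 1e-09),   # fails the MAP cut-off
    Motif("m3", 8.0, 0.60, 1e-07),   # fails the ROC AUC cut-off
    Motif("m4", 9.5, 0.75, 1e-03),   # fails the group-specificity cut-off
]
specific = [m.name for m in filter_motifs(candidates)]
```

Because a lower group-specificity score indicates a more specific motif, that last filter is an upper bound, whereas the MAP and ROC AUC filters are lower bounds.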
A parametric test (MANOVA) and a nonparametric test (Kolmogorov–Smirnov) were conducted to identify significance levels for the two sets of filtered motifs found in the UCRs of caste-specifically expressed genes against filtered motifs found in the random UCR set. Random motifs were sampled from a motif database (10 391 motifs) generated by running our script 100 times with random sets of UCR sequences.
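For orientation, the statistic underlying the two-sample Kolmogorov–Smirnov test is the largest distance between the empirical cumulative distributions of the two samples. The following Python sketch computes only this statistic, not the P value; the function name and the toy values are hypothetical, and in practice a library routine such as scipy.stats.ks_2samp would report both.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D: the maximum absolute
    difference between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in sorted(set(a + b)):
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Toy metric values (made up) for caste motifs and random motifs.
caste_values = [1, 2, 3, 4]
random_values = [3, 4, 5, 6]
d = ks_statistic(caste_values, random_values)
```

Identical samples give D = 0 and completely separated samples give D = 1; the toy data above fall in between.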
The main criterion for identifying known regulatory motifs among these caste-specific ones was the alignment of the PSSM for each bee motif with the Drosophila melanogaster sequences in the TRANSFAC database (release 4.0) (Wingender et al., 2000). Only the alignments passing a threshold of 80% identity for each PSSM were considered as significant matches. In addition, we checked for a specific binding motif, the EcR/USP motif (rGkTCAaTGamcy-3′), known to function in the expression of genes responding to morphogenetic hormone titres (Perera et al., 2005).
Operating system and programming tools
An Ubuntu Linux (version Breezy) operating system was used to implement all scripts and pipelines designed for annotation procedures and motif discovery. The Python programming language (http://www.python.org), Biopython (http://www.biopython.org) and TAMO (Tools for Analysis of Motifs) packages (Gordon et al., 2005) were used in program design. Other web applications were built using the Zope application server (http://www.zope.org), hosted at http://zulu.fmrp.usp.br/beelab.
Acknowledgements
This work was supported by grants from FAPESP (Fundação de Amparo a Pesquisa do Estado de São Paulo, 9900719-6) and CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico, 4729632-3-1).
References
Abouheif E and Wray GA (2002) Evolution of the genetic network underlying wing polyphenism in ants. Science 297: 249–252.
Al-Sharour F, Diaz-Uriarte R and Dopazo J (2004) FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes. Bioinformatics 20: 578–580.
Amdam GV, Norberg K, Hagen A and Omholt SW (2003) Social exploitation of vitellogenin. Proc Natl Acad Sci USA 100: 1799–1802.
Amdam GV, Simões ZLP, Hagen A, Norberg K, Schroder K, Mikkelsen O et al. (2004) Hormonal control of the yolk precursor vitellogenin regulates immune function and longevity in honeybees. Exp Gerontol 39: 767–773.
Antoniewski C, O'Grady MS, Edmondson RG, Lassieur SM and Benes H (1995) Characterization of an EcR/USP heterodimer target site that mediates ecdysone responsiveness of the Drosophila Lsp-2 gene. Mol Gen Genet 249: 545–556.
Arbeitsman MN, Furlong EEM, Imam F, Johnson E, Null BH, Baker BS et al. (2002) Gene expression during the life cycle of Drosophila melanogaster. Science 297: 2270–2275.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM et al. (2000) Gene Ontology: tool for the unification of biology. Nature Genet 25: 25–29.
Bailey TL and Elkan D (1995) Unsupervised learning of multiple motifs in biopolymers using expectation maximization. Machine Learning J 21: 58–83.
Bergem M, Norberg K and Aamodt RM (2006) Long-term maintenance of in vitro cultured honeybee (Apis mellifera) embryonic cells. BMC Dev Biol 6: 17 (online).
Boutros M, Kiger AA, Armknecht S, Kerr K, Hild M, Koch B et al. (2004) Genome-wide RNAi analysis of growth and viability in Drosophila cells. Science 303: 832–835.
Carsten LD, Watts T and Markov TA (2005) Gene expression patterns accompanying a dietary switch in Drosophila melanogaster. Mol Ecol 14: 3203–3208.
Clarke ND and Granek JA (2003) Rank order metrics for quantifying the association of sequence features with gene regulation. Bioinformatics 19: 212–218.
Colombani J, Bianchini L, Layalle S, Pondeville E, Dauphin-Villemant C, Antoniewski C et al. (2005) Antagonistic actions of ecdysone and insulins determine final size in Drosophila. Science 310: 667–670.
Corona M, Estrada E and Zurita M (1999) Differential expression of mitochondrial genes between queens and workers during caste determination in the honeybee Apis mellifera. J Exp Biol 202: 929–938.
Cunha AC, Nascimento AM, Guidugli KR, Simões ZLP and Bitondi MMG (2005) Molecular cloning and expression of a hexamerin cDNA from the honey bee. J Insect Physiol 51: 1135–1147.
Darwin CR (1859) On the Origin of Species by Means of Natural Selection. John Murray, London.
Davidson EH (2001) Genomic Regulatory Systems: Development and Evolution. Academic Press, San Diego.
Eder J, Kremer JP and Rembold H (1983) Correlation of cytochrome c titer and respiration in Apis mellifera: adaptive responses to caste determination defines workers, intercastes and queens. Comp Biochem Physiol B 76: 703–716.
Engels W (1974) Occurrence and significance of vitellogenins in female castes of social Hymenoptera. Am Zool 14: 1229–1237.
Evans JD and Wheeler DE (1999) Differential gene expression between developing queens and workers in the honey bee, Apis mellifera. Proc Natl Acad Sci USA 96: 5575–5580.
Evans JD and Wheeler DE (2000) Expression profiles during honeybee caste determination. Genome Biol 2: 1, 6 pages, online http://genomebiology.com/2000/2/1/research/0001.1
Forbes A and Lehmann R (1998) Nanos and Pumilio have critical roles in the development and function of Drosophila germline stem cells. Development 125: 679–690.
González-Reyes A, Elliot H and St Johnston D (1995) Polarization of both major body axes in Drosophila by gurken-torpedo signalling. Nature 375: 654–658.
Goodisman MA, Isoe J, Wheeler DE and Wells MA (2005) Evolution of insect metamorphosis: a microarray-based study of larval and adult gene expression in the ant Camponotus festinatus. Evolution 59: 858–870.
Gordon DB, Nekludova L, McCallum S and Fraenkel E (2005) TAMO: a flexible, object-oriented framework for analysing transcriptional regulation using DNA-sequence motifs. Bioinformatics 21: 3164–3165.
Guidugli KR, Hepperle C and Hartfelder K (2004) A member of the short-chain dehydrogenase/reductase (SDR) superfamily is a target of the ecdysone response in honey bee (Apis mellifera) caste development. Apidologie 35: 37–47.
Guidugli KR, Nascimento AM, Amdam GV, Barchuk AR, Omholt SW, Simões ZLP and Hartfelder K (2005) Vitellogenin regulates hormonal dynamics in the worker caste of a eusocial insect. FEBS Lett 579: 4961–4965.
Hagedorn HH, Maddison DR and Tu Z (1998) The evolution of vitellogenins, cyclorrhaphan yolk proteins and related molecules. Adv Insect Physiol 27: 335–384.
Hall TA (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41: 95–98.
Hamilton WD (1964) The genetical theory of social behaviour I & II. J Theor Biol 7: 1–16, 17–52.
Hartfelder K and Emlen DJ (2005) Endocrine control of insect polyphenism. In Comprehensive Insect Molecular Science (Gilbert LI, Iatrou K and Gill S, eds), Vol. 3, pp. 651–703. Elsevier, Oxford.
Hartfelder K and Engels W (1998) Social insect polymorphism: hormonal regulation of plasticity in development and reproduction in the honeybee. Curr Topics Dev Biol 40: 45–77.
Hawkins NC, Thorpe J and Schüpbach T (1996) encore, a gene required for the regulation of germ line mitosis and oocyte differentiation during Drosophila oogenesis. Development 122: 281–290.
Haydak MH (1970) Honey bee nutrition. Annu Rev Entomol 15: 143–156.
Hepperle C and Hartfelder K (2001) Differentially expressed regulatory genes in honey bee caste development. Naturwissenschaften 88: 113–116.
Hoage TR and Kessel RG (1968) An electron microscopic study of the process of differentiation during spermatogenesis in the drone honey bee (Apis mellifera L.) with special reference to centriole replication and elimination. J Ultrastruct Res 24: 6–32.
Honey Bee Genome Sequencing Consortium (2006) Insights into social insects from the genome of the honeybee Apis mellifera. Nature (in press).
Hughes JD, Estep PW, Tavazoie S and Church GM (2000) Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae. J Mol Biol 296: 1205–1214.
Hunt GJ and Page RE (1995) Linkage map of the honey bee, Apis mellifera, based on RAPD markers. Genetics 139: 1371–1382.
Hunt JH, Buck NA and Wheeler DE (2003) Storage proteins in vespid wasps: characterization, developmental pattern and occurrence in adults. J Insect Physiol 49: 785–794.
Judice CC, Carazzole MF, Festa F, Sogayar MC, Hartfelder K and Pereira GAG (2006) Gene expression profiles underlying alternative caste phenotypes in a highly eusocial bee, Melipona quadrifasciata. Insect Mol Biol 15: 33–44.
Kerr WE (1950) Genetic determination of castes in the genus Melipona. Genetics 35: 143–152.
Kropácová S and Haslbachová H (1971) The influence of queenlessness and of unsealed brood on the development of ovaries in worker honeybees. J Apicult Res 10: 57–61.
Lattorff HMG, Moritz RFA and Fuchs S (2005) A single gene determines thelytokous parthenogenesis in honey bee workers (Apis mellifera capensis). Heredity 94: 533–537.
Lattorff HMG, Moritz RFA, Solignac M and Crewe RM (2006) Control of reproductive dominance by the thelytoky gene in honeybees. Biol Lett (in press).
Li TR and White KP (2003) Tissue-specific gene expression and ecdysone-regulated genomic networks in Drosophila. Dev Cell 5: 59–72.
Lin H, Yue L and Spradling AC (1994) The Drosophila fusome, a germline-specific organelle, contains membrane skeletal proteins and functions in cyst formation. Development 120: 947–956.
Liu XS, Brutlag DL and Liu JS (2002) An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments. Nature Biotechnol 20: 835–839.
Liu Y, Liu S, Wei L, Altman RB and Batzoglou S (2004) Eukaryotic regulatory element conservation analysis and identification using comparative genomics. Genome Res 14: 354–366.
Martinez T, Burmester T, Veenstra JA and Wheeler DE (2001) Sequence and evolution of a hexamerin from the ant Camponotus festinatus. Insect Mol Biol 9: 427–431.
McKearin D (1997) The Drosophila fusome, organelle biogenesis and germ cell differentiation: if you build it … Bioessays 19: 147–152.
Mirth C, Truman JW and Riddiford LM (2005) The role of the prothoracic gland in determining critical weight for metamorphosis in Drosophila melanogaster. Curr Biol 15: 1–12.
Moritz RFA, Kryger P and Allsopp MH (1996) Competition for royalty in bees. Nature 384: 31.
Page RE and Erickson EH (1988) Reproduction by worker honey bees (Apis mellifera L.). Behav Ecol Sociobiol 23: 117–126.
Pereboom JJM, Jordan WC, Sumner S, Hammond RL and Bourke AFG (2005) Differential gene expression in queen-worker caste determination in bumble bees. Proc R Soc B 272: 1145–1152.
Perera SC, Zheng S, Feng QL, Krell PJ, Retnakaran A and Palli SR (2005) Heterodimerization of ecdysone receptor and ultraspiracle on symmetric and asymmetric response elements. Arch Insect Biochem Physiol 60: 55–70.
Piulachs MD, Guidugli KR, Barchuk AR, Cruz J, Simões ZLP and Bellés X (2003) The vitellogenin of the honeybee, Apis mellifera: structural analysis of the cDNA and expression studies. Insect Biochem Mol Biol 33: 459–465.
Pritsker M, Liu YC, Beer MA and Tavazoie S (2004) Whole genome discovery of transcription factor binding sites by network-level conservation. Genome Res 14: 99–108.
Rachinsky A, Strambi C, Strambi A and Hartfelder K (1990) Caste and metamorphosis – hemolymph titers of juvenile hormone and ecdysteroids in last instar honeybee larvae. Gen Comp Endocr 79: 31–38.
Riddihough G and Pelham HRB (1987) An ecdysone response element in the Drosophila hsp27 promoter. EMBO J 6: 3729–3734.
Roth FP, Hughes JD, Estep PW and Church GM (1998) Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantification. Nature Biotechnol 16: 939–945.
Rutherford K, Parkhill J, Crook J, Horsnell T, Rice P, Rajandream MA and Barrell B (2000) Artemis: sequence visualisation and annotation. Bioinformatics 16: 944–945.
Schmidt-Capella IC and Hartfelder K (1998) Juvenile hormone effect on DNA synthesis and apoptosis in caste-specific differentiation of the larval honey bee (Apis mellifera L.) ovary. J Insect Physiol 44: 385–391.
Snodgrass RE (1956) Anatomy of the Honey Bee. Cornell University Press, Ithaca.
Solignac M, Vautrin D, Baudry E, Mougel F, Loiseau A and Cornuet JM (2004) A microsatellite-based linkage map of the honeybee, Apis mellifera L. Genetics 167: 253–262.
St Johnston D and Nüsslein-Volhard C (1992) The origin of pattern and polarity in the Drosophila embryo. Cell 68: 201–219.
Staab S and Steinmann-Zwicky M (1995) Female germ cells of Drosophila require zygotic ovo and otu product for survival in larvae and pupae respectively. Mech Dev 54: 205–210.
Sullivan AA and Thummel CS (2003) Temporal profiles of nuclear receptor gene expression profiles reveal coordinate transcriptional responses during Drosophila development. Mol Endocrinol 17: 2125–2137.
Wassermann WW and Sandelin A (2004) Applied bioinformatics for the identification of regulatory elements. Nature Rev Genet 5: 267–287.
West-Eberhard MJ (2003) Developmental Plasticity and Evolution. Oxford University Press, Oxford.
Wheeler DE and Nijhout HF (2003) A perspective for understanding the modes of juvenile hormone as a lipid signaling system. Bioessays 25: 994–1001.
Wilson EO (1971) The Insect Societies. Belknap Press of Harvard University Press, Cambridge, MA.
Wingender E, Chen X, Hehl R, Karas H, Liebich I, Matys V et al. (2000) TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res 28: 316–319.
Zhou X, Oi FM and Scharf ME (2006a) Social exploitation of hexamerin: RNAi reveals a major caste-regulatory factor in termites. Proc Natl Acad Sci USA 103: 4499–4504.
Zhou X, Tarver MR, Bennett GW, Oi FM and Scharf ME (2006b) Two hexamerin genes from the termite Reticulitermes flavipes: sequence, expression and proposed functions in caste regulation. Gene 376: 47–58.
Zhu Z, Shendure J and Church GM (2005) Discovering functional transcription factor combinations in the human cell cycle. Genome Res 15: 848–855.
Supplementary material
The following material is available for this article online:
Table S1. Annotation results of caste-specifically expressed honey bee genes.
Table S2. Caste-specific motifs in UCRs of honey bee genes.
Table S3. Annotation of honey bee orthologs to Drosophila melanogaster genes involved in reproduction.
This material is available as part of the online article from http://www.blackwell-synergy.com
the large-scale RNAi assays established for Drosophila (Boutros et al., 2004), but the development of cell culture approaches in the honey bee (Bergem et al., 2006) represents a step in this direction.
Alternatively, regulatory functional associations between genes and their integration into networks can be inferred from the presence of response elements in upstream control regions. In our analysis of differentially expressed genes in queen-worker development, we took a bioinformatics approach for a first look into the molecular architecture of a developmental polyphenism.
Motif search in upstream regions of differentially expressed genes
The genes related to caste development are among the first honey bee genes for which experimentally validated expression data were generated (Corona 2004). Certainly, these 51 genes do not comprise all the genes involved in caste development, but they are expected to be prominent players, as they were the ones that stood out in the SSH and DDRT-PCR approaches. The 51 caste genes do not represent gene families but rather fall into many very different molecular function categories. This made us ask whether the observed overexpression pattern of different genes in either queen or worker larvae may be associated with the occurrence of specific regulatory motifs in the upstream control regions (UCR) of these genes.
Three different algorithms, AlignACE (Roth et al., 1998), MEME (Bailey & Elkan, 1995) and MDscan (Liu et al., 2002), were used to construct a pipeline for detecting overrepresented motifs in the two unaligned sets of UCR sequences for the caste-specifically expressed genes. This pipeline was run on a 'top-10' set of 12 genes (six for each caste), which showed the most pronounced caste differences in expression (Evans & Wheeler, 2000), and also on a randomly selected set of UCRs (background control). We calculated four different metrics for each motif: MAP score (Roth et al., 1998), a group-specificity score (Church score) (Hughes et al., 2000), and a ROC AUC and MNCP metric (Clarke & Granek, 2003). A first set of filters was used to detect motifs with a potential for regulatory functions (MAP score ≥ 5, ROC AUC ≥ 0.7). This resulted in 46 motifs out of 123 total UCR motifs found in the queen UCR set and in 71 motifs out of 261 total found in the worker UCR set (Supplementary material, Table 2S).
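The first filter step can be illustrated with a minimal Python sketch. It assumes that each motif already carries a MAP score and that ROC AUC is computed as the usual rank-based probability that a target UCR scores higher than a background UCR. The class name, function names and example values are hypothetical and are not taken from the original pipeline.

```python
from dataclasses import dataclass


@dataclass
class MotifScore:
    name: str         # hypothetical motif identifier
    map_score: float  # MAP score reported by the motif finder
    roc_auc: float    # ROC AUC of the motif over target vs. background UCRs


def roc_auc(target_scores, background_scores):
    """Rank-based AUC: chance that a target UCR outscores a background UCR.

    Ties count as half a win. This is a generic definition, assumed here
    for illustration only.
    """
    wins = 0.0
    for t in target_scores:
        for b in background_scores:
            if t > b:
                wins += 1.0
            elif t == b:
                wins += 0.5
    return wins / (len(target_scores) * len(background_scores))


def passes_first_filter(motif, min_map=5.0, min_auc=0.7):
    """Apply the thresholds quoted in the text (MAP >= 5, ROC AUC >= 0.7)."""
    return motif.map_score >= min_map and motif.roc_auc >= min_auc


# Invented example values, for illustration only.
motifs = [
    MotifScore("motif_a", 12.3, 0.81),
    MotifScore("motif_b", 4.1, 0.90),
    MotifScore("motif_c", 9.8, 0.55),
]
kept = [m.name for m in motifs if passes_first_filter(m)]
```

Only motifs that pass both thresholds are retained, so a high ROC AUC does not rescue a motif with a low MAP score.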
A parametric statistical test (MANOVA, P = 0.0001, Wilks' λ = 0.78, F = 72) and a nonparametric statistical test (Kolmogorov–Smirnov, Table 1) on ROC AUC and MNCP indices showed that these two sets of filtered motifs are significantly different from a randomly selected set of motifs. The rank-order metrics ROC AUC and MNCP have previously been used to compare the association of short regulatory sequence features with gene expression data (microarray analyses on coregulated genes), and they have been useful in flagging false positives erroneously included in 'top-10' sets of differentially expressed genes (Clarke & Granek, 2003).
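For readers unfamiliar with the nonparametric comparison, the two-sample Kolmogorov–Smirnov statistic can be sketched in a few lines of Python. This is a generic textbook formulation, not the code used in the study. The function name is hypothetical, and a P value would additionally require the KS distribution (available, for instance, through scipy.stats.ks_2samp).

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the empirical CDFs.

    sample_a and sample_b could be, for example, the ROC AUC (or MNCP)
    values of caste-specific motifs and of random-set motifs.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to evaluate the gap at every distinct data point.
    for value in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, value) / len(a)
        cdf_b = bisect_right(b, value) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap
```

A statistic near 0 indicates that the two metric distributions largely coincide, whereas a value near 1 indicates almost complete separation.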
To select highly specific motifs found in each data set, we used the group-specificity score (Church score ≤ 1e−05; Hughes et al., 2000) to identify the most likely motifs involved in decision making for pathways leading to queen (two motifs, Fig. 3A) or to worker development (12 motifs with Church score ≤ 1e−07, Fig. 3B). As the SSH and DDRT-PCR approaches on caste development can be expected to retrieve only a subpopulation of such genes, these motifs represent only a partial scenario of the transcriptional regulatory network underlying caste development. The motifs can now be used to screen other GLEAN3-predicted genes to integrate a candidate list of putatively coregulated genes in caste development that can be submitted to further experimental validation.
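The logic behind a group-specificity score can be sketched as follows. This is a minimal illustration which assumes that the score is a hypergeometric tail probability, that is, the chance of observing at least the given overlap between the genes that best match a motif and the gene group in which the motif was found. All names and numbers below are hypothetical and do not reproduce the values in Fig. 3.

```python
from math import comb


def hypergeom_tail(population, group_size, n_best_matches, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    population      -- number of UCRs considered in total
    group_size      -- number of UCRs in the caste-specific group
    n_best_matches  -- number of UCRs that best match the motif
    overlap         -- how many of the best matches belong to the group
    """
    upper = min(group_size, n_best_matches)
    total = comb(population, n_best_matches)
    favourable = sum(
        comb(group_size, i) * comb(population - group_size, n_best_matches - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Invented example: 1000 UCRs, a group of 6, 100 best matches, 4 in the group.
example_score = hypergeom_tail(1000, 6, 100, 4)
```

Smaller values indicate that the motif is more specific to the group, which is why the thresholds in the text are upper bounds.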
Each motif found in UCRs of queen (46) and worker (71) overexpressed genes was compared with the entire set of D. melanogaster cis-regulatory motifs contained in the TRANSFAC database (version 4.0; Wingender et al., 2000). Only alignments passing 80% identity for each position-specific site matrix (PSSM) were considered as significant matches. Whereas none of the most specific motifs for each caste showed similarity to any of the D. melanogaster motifs, some of the more ubiquitous ones did resemble binding sites of transcription factors, such as Antennapedia, Ultrabithorax, zerknüllt, even skipped, trithorax-like, tailless, paired, fushi tarazu and Adh transcription factor 1 (Supplementary material, Table 2S).
When we plotted the positions of the two queen and the 12 worker motifs in the UCRs of the caste-specifically expressed genes (Fig. 4), an interesting pattern emerged for the worker-specific motifs. Some of the worker motifs appeared to be clustered and occurring in tandem; furthermore, they were positioned relatively close to the predicted translation start sites in some of the genes that are overexpressed during worker development (annotation results of these genes are listed in Supplementary material, Table 1S). A position close to the predicted translation start sites is generally taken as a sign of strong regulatory effect (Davidson, 2001).

Table 1. Kolmogorov–Smirnov analysis of ROC AUC and MNCP metric for statistical significance of putative regulatory motifs in upstream control regions of genes with queen or worker-specific expression patterns. These motifs were contrasted with a random set of motifs detected in a random set of UCRs of GLEAN3-predicted honey bee genes.

Group pairs                  ROC AUC    MNCP
Random × (Queen + Worker)    P > 0.1    P < 0.001
Random × Queen               P > 0.1    P < 0.005
Random × Worker              P > 0.1    P < 0.001
Queen × Worker               P < 0.1    P > 0.1
As caste development is highly dependent on changes in haemolymph titres of JH and ecdysteroids, we also screened the UCRs of the differentially expressed genes for putative nuclear receptor binding sites. Regulatory elements involved in the JH response are not well understood yet, so any prediction in this direction would be elusive (Wheeler & Nijhout, 2003). Functional ecdysone response elements (EcRE) have, however, been identified, and it is now well established that the EcR/USP complex binds to direct or inverted (palindromic) repeats (Riddihough & Pelham, 1987; Antoniewski et al., 1995; Perera et al., 2005). A PSSM search (Wassermann & Sandelin, 2004) based on a canonical representation (rGkTCAaTGamcy) (Perera et al., 2005) did not reveal any putative EcRE motif in the UCRs of the 51 caste-differentially expressed genes. However, this does not rule out that these genes respond to changes in JH and/or ecdysteroid titres, as these hormones require EcR/USP binding primarily in the expression of early genes, but not necessarily for the late response genes (Li & White, 2003; Sullivan & Thummel, 2003).
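A simplified version of such a screen can be sketched in Python. The sketch below is not a PSSM search: it assumes that the canonical representation is read as a strict IUPAC consensus (lower-case letters are treated like upper-case ones) and reports exact consensus matches on both strands. The function names are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes expressed as regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def iupac_to_regex(motif):
    """Translate an IUPAC consensus such as rGkTCAaTGamcy into a regex."""
    return "".join(IUPAC[base] for base in motif.upper())


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return sorted (position, strand) hits; positions are 0-based on seq."""
    # The lookahead allows overlapping matches to be reported.
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    hits = [(m.start(), "+") for m in pattern.finditer(seq.upper())]
    for m in pattern.finditer(reverse_complement(seq)):
        # Convert the minus-strand offset back to a plus-strand coordinate.
        hits.append((len(seq) - m.start() - len(motif), "-"))
    return sorted(hits)
```

Calling find_motif(ucr_sequence, "rGkTCAaTGamcy") on each UCR would list candidate sites; an empty list for every UCR corresponds to the negative result described above, within the limits of exact consensus matching.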
In conclusion, the predictions from such a combined strategy, which searches for group-specific and for conserved regulatory motifs in GLEAN3-predicted honey bee genes, represent a major transition from nonhypothesis-driven high-throughput screens to hypothesis-driven searches for context-dependent gene expression in honey bees. Such directed search results can serve as a platform for experimental analyses of genome-wide integration in hormonal control of caste development in bees. In addition, this study exemplifies how existing algorithms for detecting shared regulatory motifs can be joined into a toolkit for predicting coregulated gene expression patterns in honey bees. These methods have been shown to be robust and are gaining acceptance for use in functional and comparative genomics (Liu et al., 2004; Pritsker et al., 2004; Zhu et al., 2005).

Figure 3. Putative regulatory motifs and their consensus sequences in UCRs of queen and worker overexpressed genes. Scores for MAP, Church, ROC AUC and MNCP metrics indicate degree of group specificity and significance level.
Oogenesis and reproduction
As caste development sets the stage for reproductive division of labour, genes involved in reproductive processes are strong candidates for functional analyses. In the present study we performed BLAST searches to identify honey bee orthologs for a list of 32 fly genes with the GO attribute 'oogenesis' and for four genes specifically related to 'vitellogenesis'. The list for fly genes involved in nuclear events in germ cells consisted of 20 genes for 'female meiosis', 12 genes for 'recombination' and 21 genes under the heading 'chromosome segregation including segregation distortion' (Supplementary material, Table 3S). In some cases these GO attributes for fly genes, of course, overlapped.
BLASTN and BLASTX searches for these fly genes against the honey bee genome assembly 3.0 and the GLEAN3 Official Set (aa) retrieved statistically well supported putative bee orthologs for most of these candidates. For the genes involved in meiosis, recombination and chromosome segregation this finding, although not unexpected, is of interest, as meiosis in the haploid honey bee drone is strongly modified when compared with a normal diploid meiosis. The first meiosis is initiated, but the nucleus remains undivided and only the superfluous centrioles are eliminated as cytoplasmic buds (Hoage & Kessel, 1968). An interesting gene, thelytoky (th), has recently been mapped in this context (Lattorff et al., 2005). It almost completely prevents meiotic recombination in the automixis of laying workers of the Cape honey bee. As an indication of the interplay between meiosis and later development, this locus also appears to be an integral part of various gene cascades involved in caste determination (Lattorff et al., 2006).
The fly genes retrieved in the GO searches for 'oogenesis' represent a much larger range of Molecular Function categories, such as transcription factors, proteins regulating translation by RNA binding, RNA helicases, enzymes (ubiquitination, transfer of sugar residues, sulfotransferase), GTPase activity, and several factors binding to cytoskeletal proteins (Supplementary material, Table 3S). This wide range of functional categories is expected, as these genes are involved in a series of different steps during oogenesis in the polytrophic meroistic ovary. Oogenesis starts out with the maintenance of germline and somatic stem cell identity in the germline niche in the upper germarium. A key gene involved in this process is pumilio (Forbes & Lehmann, 1998), which is represented by a highly conserved bee ortholog, GB10504-PA. The second step is the formation of germ cell cysts, the determination of an oocyte within each cyst, and the survival of these cysts, involving genes such as benign gonial cell neoplasm (Lin et al., 1994), encore (Hawkins et al., 1996), ovo and ovarian tumour (otu) (Staab & Steinmann-Zwicky, 1995), all well conserved in the honey bee genome. Interestingly, we could not find a clear bee ortholog for bag of marbles (bam), which is one of the prime early response genes in the cystoblast differentiation pathway in Drosophila (McKearin, 1997).

Figure 4. Map of the group-specific motifs found in queen and worker UCRs of caste-specifically expressed genes. The coding region is represented by the GLEAN3 prediction number (assembly 4.0), with arrows indicating the translation start site. Asterisks mark UCRs of the 'top10' set used to find the over-represented motifs.
The third step comprises previtellogenic growth of the follicle, and during these stages a number of maternal factors are deposited and anchored, either within the oocyte or in the perivitelline space, that define the egg and the future embryonic axes (for review see St Johnston & Nüsslein-Volhard, 1992). In the list of Drosophila genes involved in early steps of axis determination, a couple of surprises came up in the search for honey bee orthologs. A big surprise was that we could not find a gurken ortholog in the bee, even though this gene sets up both the anterior–posterior and dorsal–ventral axes in the Drosophila egg (González-Reyes et al., 1995), whereas downstream components of the Gurken signalling cascade appear to be preserved in the bee genome (Honey Bee Genome Sequencing Consortium, 2006). Similar apparent gaps in constituents of patterning cascades were noted for the terminal regions of the embryo, such as a lack of a torso ortholog, whereas its ligand, torso-like, is represented by a well conserved ortholog in the bee (GB18663-PA).
With respect to genes involved in the final processes of oogenesis, we primarily looked at genes that play a part during vitellogenesis. There are four genes of interest in this class, the primary one coding for the yolk protein precursor vitellogenin. This gene has already been sequenced for the honey bee (Piulachs et al., 2003) and, as expected, it is much more related to vitellogenins of other insects and even vertebrates than to the Drosophila yolk proteins, which apparently are derived from lipases (Hagedorn et al., 1998). The second gene of interest is the bee ortholog to yolkless, as this (GB16571-PA) could represent a putative vitellogenin receptor. The other two Drosophila genes with clear orthologs in the bee are CG18641 and CG12139, which code for a lipase and an LDL receptor, respectively.
General conclusions
The current analysis made use of previous experimental analyses on differential transcription during caste development of honey bee larvae. In the annotation of these genes, which includes references to Gene Ontology terms associated with their respective Drosophila orthologs, two major configurations emerged. First of all, worker genes were better defined in terms of GO attributes compared with the relatively large number of queen genes that had no GO terms associated to their respective Drosophila orthologs. Even when taking into consideration the conceptual limits in attributing GO terms on molecular function and biological process from Drosophila orthologs to honey bee genes, this finding could have a bearing on general basic questions in socioevolution, namely, which caste is more divergent from a nonsocial reproductive female bee prototype, or reproductive ground plan: the queen or the worker? Less speculative is the second major conclusion coming out of the GO analysis for Molecular Function, showing and confirming (Eder et al., 1983; Corona et al., 1999) the important role of metabolic regulation in caste development. This facet is demonstrated especially clearly in the caste-specific expression of oxidoreductases (queen) vs. hydrolases (workers).
The honey bee genome information not only provided a much improved annotation platform for caste-specifically expressed ESTs but, even more so, opens the possibility to explore putative regulatory features of the honey bee genome. In the current study we employed modified Gibbs sampling and expectation-maximization algorithms (AlignACE, MDscan, MEME) to detect group-specific motifs in gene regions up to 1000 bp upstream of translation start sites. We detected 14 motifs that were significantly overrepresented in the caste genes when compared with corresponding motifs found in a random set of GLEAN3-predicted honey bee genes. The localization of such motifs in UCRs of worker-overexpressed genes revealed a clustering of such motifs close to the predicted basal promoter regions, suggesting strong regulatory effects. Such search strategies and the detected motifs can provide the lead to reveal and unravel cis-regulatory networks for and within specific contexts of honey bee biology.
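How a region of up to 1000 bp upstream of a translation start site might be cut from a scaffold can be sketched as follows. This is a minimal, strand-aware illustration. The function names and the coordinate convention (0-based position of the first base of the start codon, as read on the coding strand) are assumptions made for the example and not a description of the original scripts.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(scaffold, start_pos, strand, length=1000):
    """Return up to `length` bases upstream of a translation start site.

    start_pos is the 0-based scaffold index of the first base of the
    start codon as read on the coding strand. The region is truncated
    at the scaffold ends.
    """
    if strand == "+":
        return scaffold[max(0, start_pos - length):start_pos]
    if strand == "-":
        # For minus-strand genes, "upstream" lies at higher coordinates
        # and has to be read as reverse complement.
        return reverse_complement(scaffold[start_pos + 1:start_pos + 1 + length])
    raise ValueError("strand must be '+' or '-'")
```

For example, upstream_region("AAACCCATGGGG", 6, "+", 3) returns the three bases preceding the ATG on the plus strand, and the same call with strand "-" on the reverse-complemented scaffold returns the equivalent region.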
Caste polyphenism in social insects makes a strong case for the emergence of novelties at a microevolutionary level (West-Eberhard, 2003). Without the pretension to discuss exhaustively the mechanisms underlying this surge of developmental plasticity, two major themes become apparent in this and other studies. Regulatory change has been demonstrated in the shut-down of wing disc patterning cascades in ants (Abouheif & Wray, 2002) and is certainly also implicit in observed temporal changes in gene expression during postembryonic development of ants and bumble bees (Goodisman et al., 2005; Pereboom et al., 2005). Such change would be expected to involve cis-regulatory elements, that is, change in transcription factor binding sites in UCRs, as approached in this study, and also evolutionary change in response thresholds to circulating morphogenetic hormones (for review see Hartfelder & Emlen, 2005). The second and quite unexpected theme is the acquisition of new systemic functions by evolutionarily rather old proteins, such as vitellogenin and hexamerins. These apparently unspectacular proteins have evolved into key players for caste evolution and reproductive division of labour via novel regulatory connectivity with JH (Amdam et al., 2004; Guidugli et al., 2005; Zhou et al., 2006a,b).
Experimental procedures
Selection and annotation of ESTs representing differentially expressed genes in honey bee caste development
The starting point was a set of 164 entries (mainly 3′-ESTs) in GenBank (BG101532–BG101697) from an SSH library (Evans & Wheeler, 1999; Evans & Wheeler, 2000). When validated by macroarray analyses, a clustering into three major classes became apparent: (I) genes overexpressed in young larvae, (II) genes overexpressed in last instar queen larvae, and (III) genes overexpressed in last instar worker larvae. For this study we excluded the class I ESTs because their expression is not caste-specific but rather represents expression differences between young (still bipotent) and older larvae. To the class II queen ESTs (82) we added one complete cDNA entry (AY601642) from a DDRT-PCR screen (Corona et al., 1999), and to the class III set of worker ESTs (40) we added seven GenBank dbEST entries (BG149167–BG149173) from a DDRT-PCR screen on ovary development (Hepperle & Hartfelder, 2001).
The EST sequences were submitted to BLASTN searches (parameters: -G 2 -E 3 -W 15 -F 'm D' -U -e 1e-20) against genome sequence assembly Amel_v3.0 to retrieve matches in linked or unlinked genomic contigs and to exclude no-matches (seven ESTs in queen). ESTs that aligned within the same scaffold were checked for clustering and overlap. This clustering also served to exclude genes that were represented by non-overlapping ESTs from both castes. This procedure generated a set of 51 unique putative gene sequences overexpressed in either queen (34) or in worker larvae (17). These 51 nonredundant sequences were submitted to BLASTX searches against the Official Set of GLEAN3-predicted protein sequences (cut-off value at 1e−20). For ESTs with no significant protein sequence matches, the genomic regions adjacent to the mapped EST were searched to find neighbouring ORFs, especially those nearest to putative 3′ UTRs of predicted proteins, as the EST libraries have a bias in this direction.
Official Set protein sequences were aligned against the Amel_v3.0 sequence assembly using TBLASTN to map protein to genome, and subsequently they were aligned using BLASTP against the GenBank nonredundant (nr) and the Flybase protein sequence databases. The manual features annotation procedure of the Artemis 7.0 program (Rutherford et al., 2000) was used to map ORFs, putative splice sites of exons and ESTs to genome coordinates. The final annotation file was generated with a Python script in GFF format (http://www.sanger.ac.uk/Software/formats/GFF).
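As an illustration of the last step, a single GFF record can be assembled as nine tab-separated fields (sequence name, source, feature, start, end, score, strand, frame, attributes). The following Python sketch assumes this general GFF column layout. The function name, the scaffold name and the attribute string are hypothetical and do not reproduce the original annotation script.

```python
def gff_line(seqname, source, feature, start, end, strand,
             score=".", frame=".", attributes=""):
    """Format one feature as a tab-separated, nine-column GFF record.

    Coordinates are 1-based and inclusive, as is usual for GFF; a dot
    marks fields without a value.
    """
    if start > end:
        raise ValueError("start must not exceed end")
    if strand not in ("+", "-", "."):
        raise ValueError("strand must be '+', '-' or '.'")
    fields = [seqname, source, feature, str(start), str(end),
              str(score), strand, str(frame), attributes]
    return "\t".join(fields)


# Invented example record for an EST mapped to a scaffold.
record = gff_line("Group1.1", "example_pipeline", "EST", 100, 250, "+",
                  attributes='EST "example_est_1"')
```

Writing one such line per mapped ORF, exon or EST yields a file that genome browsers and annotation editors can load alongside the assembly.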
Honey bee sequences annotated as orthologs to D. melanogaster genes were putatively assigned the GO terms listed in the respective Flybase entry. In addition, the definition of new GO terms (Biological Process ontology) related to caste development and polyphenism (GO:0048651 and GO:0048650, respectively) was co-ordinated with the Gene Ontology Consortium (Ashburner et al., 2000). The FatiGO web tool (Al-Sharour et al., 2004) was used to cluster GO terms (level 3 setting) for Biological Process and Molecular Function.
For the detection of conserved domains, the 51 protein sequences were screened against the Pfam database (http://www.sanger.ac.uk/Software/Pfam) using the HMMER platform (current release 2.3.2, http://hmmer.wustl.edu) with a cut-off value set at 1e−10.
Annotation of oogenesis and reproduction genes
In order to identify putative honey bee orthologs to D. melanogaster genes, we searched the following GO terms in Flybase: ‘oogenesis’ (GO:0009993), ‘vitellogenesis’ (GO:0007296), ‘female meiosis’ (GO:0007143), ‘DNA recombination’ (GO:0006310) and ‘chromosome segregation’ (GO:0007059). Genes related to segregation distortion were searched for in Flybase in phenotypic descriptions and mutant effects of D. melanogaster genes, as this phenomenon is not represented by a GO term. Hence, this group may be more heterogeneous than the others. From this list we removed genes of pleiotropic function (multifaceted GO entries in Biological Process) and genes that lacked defined transcripts in the Drosophila genome database.
For the GO terms ‘oogenesis’ and ‘vitellogenesis’ we performed TBLASTN and BLASTP searches for 42 fruit fly genes against the Amel_v3.0 genome assembly and the GLEAN3-predicted protein sequences (Honey Bee Genome Sequencing Consortium, 2006), respectively. The orthologous D. melanogaster gene was characterized by the same procedure as described above (reciprocal best hit). For the GO terms ‘female meiosis’, ‘DNA recombination’, ‘chromosome segregation’ and the non-GO group ‘segregation distortion’, transcripts of D. melanogaster were searched against nr databases at NCBI using BLASTP. The obtained sequences were searched against the Amel_v3.0 genome assembly and the GLEAN3-predicted protein sequences using TBLASTN. Homologous sequences (threshold 1e−10) were predicted using the BioEdit software (Hall, 1999). ORFs showing significant homology (BLASTP threshold 1e−20) were assembled and used in BLASTP searches against the nr databases at NCBI.
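The reciprocal best hit criterion mentioned above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the pipeline used here: the input is assumed to be already parsed into (query, subject, E-value) rows, the function names are hypothetical, the E-values are made up, and GB99999-PA and GB88888-PA are invented identifiers (only the pumilio ortholog GB10504-PA appears in the text).

```python
def best_hits(rows):
    """Map each query to its subject with the lowest E-value."""
    best = {}
    for query, subject, evalue in rows:
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward, reverse, max_evalue=1e-10):
    """Keep pairs that are each other's best hit in both search directions."""
    fwd = best_hits(row for row in forward if row[2] <= max_evalue)
    rev = best_hits(row for row in reverse if row[2] <= max_evalue)
    return {query: subject for query, subject in fwd.items()
            if rev.get(subject) == query}


# Made-up E-values: fly proteins searched against bee, and bee against fly.
fly_vs_bee = [
    ("pum", "GB10504-PA", 1e-150),
    ("pum", "GB99999-PA", 1e-30),
    ("bam", "GB88888-PA", 0.5),
]
bee_vs_fly = [
    ("GB10504-PA", "pum", 1e-140),
    ("GB99999-PA", "nos", 1e-25),
]
orthologs = reciprocal_best_hits(fly_vs_bee, bee_vs_fly)
```

In this toy input only pumilio is retained; the weak hit for bam falls below the cut-off and therefore yields no putative ortholog.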
Motif search in upstream regions in caste-specifically expressed genes
In order to detect overrepresented motifs in the upstream control regions (UCRs) of the two sets of caste-related genes, we selected gene subsets based on two criteria: (1) those that had shown the highest caste-specificity in the array analyses (Evans & Wheeler, 2000) and (2) those that had a conserved 5′ region when compared with the Drosophila orthologs. These ‘top10’ genes consisted of six queen genes (GB13072, GB11628, GB19380, GB14798, GB16047 and GB18242) and of six worker genes (GB10869, GB12371, GB12239, GB10428, GB19006, GB14758). The motif search was conducted separately on the two sets of UCR sequences using three methods: AlignACE (Roth et al., 1998), MEME (Bailey & Elkan, 1995) and MDscan (Liu et al., 2002). Default parameter values were used in all searches, except that GC content in intergenic regions was set to 25%, representing the background value established for the honey bee UCR database generated in this study.
The database containing 10156 UCR sequences was generated by parsing the Official Set annotation file (downloaded in GFF format from http://www.beegenome.hgsc.bcm.tmc.edu/beeftp.html) to extract upstream regions starting from the terminal 5′ genomic coordinate of each predicted CDS. The UCRs were arbitrarily set to a size frame of 1000 nucleotides (Roth et al., 1998), but were trimmed whenever another predicted ORF was detected in any of these regions.
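The coordinate arithmetic behind such a UCR extraction can be sketched in Python. The function below is a hypothetical illustration, not the parser used in this study: it assumes 1-based inclusive coordinates, takes the other predicted ORFs on the same scaffold as a plain list of intervals, and ignores scaffold ends on the minus strand.

```python
def ucr_interval(cds_start, cds_end, strand, other_orfs, size=1000):
    """Return (start, end) of the upstream control region, or None if empty.

    Coordinates are 1-based and inclusive. other_orfs is a list of
    (start, end) intervals of other predicted ORFs on the same scaffold.
    """
    if strand == "+":
        # Upstream means lower coordinates for plus-strand genes.
        start, end = max(1, cds_start - size), cds_start - 1
        for o_start, o_end in other_orfs:
            if o_start <= end and o_end >= start:
                start = max(start, o_end + 1)  # trim at the neighbouring ORF
    else:
        # Upstream means higher coordinates for minus-strand genes.
        start, end = cds_end + 1, cds_end + size
        for o_start, o_end in other_orfs:
            if o_start <= end and o_end >= start:
                end = min(end, o_start - 1)  # trim at the neighbouring ORF
    return (start, end) if start <= end else None


# Hypothetical gene at 5000-6200 with a neighbouring ORF in its upstream window.
plus_ucr = ucr_interval(5000, 6200, "+", [(4300, 4600)])
minus_ucr = ucr_interval(5000, 6200, "-", [(6500, 7000)])
```

With these made-up coordinates, the plus-strand window is shortened to the 399 nucleotides between the neighbouring ORF and the CDS start, and the minus-strand window to 299 nucleotides.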
The MAP (maximum a priori log likelihood) score, group specificity score (called Church score in this manuscript) (Hughes et al., 2000), ROC AUC (area under the curve for a receiver-operator characteristic plot) metric and MNCP (mean normalized conditional probability) metric (Clarke & Granek, 2003) were used to detect motifs that most likely correspond to biologically significant cis-regulatory elements. The filters run on the UCRs of the subsets of queen and worker genes were a MAP score cut-off value of 50, followed by a ROC AUC cut-off at 0.7, followed by a group specificity score cut-off at 1e−05. The UCR database of all GLEAN3-predicted honey bee genes was used as the background to calculate these metrics.
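Two of these metrics can be sketched in a few lines of Python. The sketch assumes that the group specificity score is the hypergeometric tail probability of the overlap between the gene group and the top-scoring genes in the genome (our reading of Hughes et al., 2000), and that ROC AUC is computed from per-gene motif scores. The function names, the motif scores and the choice of 100 top-scoring genes are hypothetical; the TAMO implementations may differ in detail.

```python
from math import comb


def group_specificity(n_genome, n_group, n_top, n_overlap):
    """Hypergeometric tail: P(overlap >= n_overlap) between the gene group
    and the n_top best-scoring genes out of n_genome genes."""
    total = comb(n_genome, n_top)
    tail = sum(
        comb(n_group, i) * comb(n_genome - n_group, n_top - i)
        for i in range(n_overlap, min(n_group, n_top) + 1)
    )
    return tail / total


def roc_auc(group_scores, background_scores):
    """Probability that a group gene outscores a background gene
    (ties count one half)."""
    wins = 0.0
    for g in group_scores:
        for b in background_scores:
            wins += 1.0 if g > b else 0.5 if g == b else 0.0
    return wins / (len(group_scores) * len(background_scores))


# Illustrative numbers: 10156 UCRs in the background, a group of 6 genes,
# 4 of which fall among the 100 best-scoring UCRs for a motif.
spec = group_specificity(n_genome=10156, n_group=6, n_top=100, n_overlap=4)
auc = roc_auc([9.1, 7.4, 6.8], [7.4, 3.2, 2.5, 1.0])
keep = auc >= 0.7 and spec <= 1e-05  # the last two filters named in the text
```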
A parametric test (MANOVA) and a nonparametric test (Kolmogorov–Smirnov) were conducted to identify significance levels for the two sets of filtered motifs found in the UCRs of caste-specifically expressed genes against filtered motifs found in the random UCR set. Random motifs were sampled from a motif database (10 391 motifs) generated by running our script 100 times with random sets of UCR sequences.
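For orientation, the statistic behind a two-sample Kolmogorov–Smirnov test is the largest distance between the two empirical distribution functions. The Python sketch below computes only this statistic, on made-up score lists; it does not compute a P value, and it is not the analysis code used here.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Maximum absolute difference between two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Made-up motif scores for a caste set and a random set.
caste_scores = [1, 2, 3, 4]
random_scores = [3, 4, 5, 6]
d_stat = ks_statistic(caste_scores, random_scores)
```

In practice a library routine such as `scipy.stats.ks_2samp` would also return the associated P value.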
The main criterion for identifying known regulatory motifs among these caste-specific ones was the alignment of the PSSM for each bee motif with the Drosophila melanogaster sequences in the TRANSFAC database (release 4.0) (Wingender et al., 2000). Only the alignments passing a threshold of 80% identity for each PSSM were considered as significant matches. In addition, we checked for a specific binding motif, the EcR/USP motif (rGkTCAaTGamcy-3′), known to function in the expression of genes responding to morphogenetic hormone titres (Perera et al., 2005).
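A simple way to look for such a degenerate consensus is to translate its IUPAC codes into a regular expression and scan both strands. The Python sketch below does this as a simplification of the PSSM search; the function names and the test sequence are hypothetical, and lower-case letters in the consensus are assumed to carry the same meaning as upper-case ones.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")


def reverse_complement(seq):
    """Reverse complement of a sequence or IUPAC consensus."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def scan(sequence, consensus):
    """Return sorted (position, strand, match) hits, 0-based, on both strands."""
    sequence = sequence.upper()
    hits = []
    for strand, motif in (("+", consensus.upper()), ("-", reverse_complement(consensus))):
        # The lookahead lets overlapping occurrences be reported as well.
        pattern = re.compile("(?=(" + "".join(IUPAC[c] for c in motif) + "))")
        for match in pattern.finditer(sequence):
            hits.append((match.start(), strand, match.group(1)))
    return sorted(hits)


ECRE = "rGkTCAaTGamcy"  # consensus quoted in the text
hits = scan("TTAGGTCAATGAACCTT", ECRE)  # made-up upstream sequence
```

Positions in a UCR would still have to be converted to genome coordinates, and a PSSM with position-specific weights would score partial matches that this exact-consensus scan misses.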
Operating system and programming tools
An Ubuntu Linux (version Breezy) operating system was used to implement all scripts and pipelines designed for annotation procedures and motif discovery. The Python programming language (http://www.python.org), Biopython (http://www.biopython.org) and TAMO (Tools for Analysis of Motifs) packages (Gordon et al., 2005) were used in program design. Other web applications were built using the Zope application server (http://www.zope.org), hosted at http://zulu.fmrp.usp.br/beelab.
Acknowledgements
This work was supported by grants from FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo, 99/00719-6) and CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico, 4729632-3-1).
References
Abouheif E and Wray GA (2002) Evolution of the genetic network underlying wing polyphenism in ants. Science 297: 249–252.
Al-Sharour F, Diaz-Uriarte R and Dopazo J (2004) FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes. Bioinformatics 20: 578–580.
Amdam GV, Norberg K, Hagen A and Omholt SW (2003) Social exploitation of vitellogenin. Proc Natl Acad Sci USA 100: 1799–1802.
Amdam GV, Simões ZLP, Hagen A, Norberg K, Schroder K, Mikkelsen O et al. (2004) Hormonal control of the yolk precursor vitellogenin regulates immune function and longevity in honeybees. Exp Gerontol 39: 767–773.
Antoniewski C, O’Grady MS, Edmondson RG, Lassieur SM and Benes H (1995) Characterization of an EcR/USP heterodimer target site that mediates ecdysone responsiveness of the Drosophila Lsp-2 gene. Mol Gen Genet 249: 545–556.
Arbeitsman MN, Furlong EEM, Imam F, Johnson E, Null BH, Baker BS et al. (2002) Gene expression during the life cycle of Drosophila melanogaster. Science 297: 2270–2275.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM et al. (2000) Gene Ontology: tool for the unification of biology. Nature Genet 25: 25–29.
Bailey TL and Elkan D (1995) Unsupervised learning of multiple motifs in biopolymers using expectation maximization. Machine Learning J 21: 58–83.
Bergem M, Norberg K and Aamodt RM (2006) Long-term maintenance of in vitro cultured honeybee (Apis mellifera) embryonic cells. BMC Dev Biol 6: 17 (online).
Boutros M, Kiger AA, Armknecht S, Kerr K, Hild M, Koch B et al. (2004) Genome-wide RNAi analysis of growth and viability in Drosophila cells. Science 303: 832–835.
Carsten LD, Watts T and Markov TA (2005) Gene expression patterns accompanying a dietary switch in Drosophila melanogaster. Mol Ecol 14: 3203–3208.
Clarke ND and Granek JA (2003) Rank order metrics for quantifying the association of sequence features with gene regulation. Bioinformatics 19: 212–218.
Colombani J, Bianchini L, Layalle S, Pondeville E, Dauphin-Villemant C, Antoniewski C et al. (2005) Antagonistic actions of ecdysone and insulins determine final size in Drosophila. Science 310: 667–670.
Corona M, Estrada E and Zurita M (1999) Differential expression of mitochondrial genes between queens and workers during caste determination in the honeybee Apis mellifera. J Exp Biol 202: 929–938.
Cunha AC, Nascimento AM, Guidugli KR, Simões ZLP and Bitondi MMG (2005) Molecular cloning and expression of a hexamerin cDNA from the honey bee. J Insect Physiol 51: 1135–1147.
Darwin CR (1859) On the Origin of Species by Means of Natural Selection. John Murray, London.
Davidson EH (2001) Genomic Regulatory Systems: Development and Evolution. Academic Press, San Diego.
Eder J, Kremer JP and Rembold H (1983) Correlation of cytochrome c titer and respiration in Apis mellifera: adaptive responses to caste determination defines workers, intercastes and queens. Comp Biochem Physiol B 76: 703–716.
Engels W (1974) Occurrence and significance of vitellogenins in female castes of social Hymenoptera. Am Zool 14: 1229–1237.
Evans JD and Wheeler DE (1999) Differential gene expression between developing queens and workers in the honey bee, Apis mellifera. Proc Natl Acad Sci USA 96: 5575–5580.
Evans JD and Wheeler DE (2000) Expression profiles during honeybee caste determination. Genome Biol 2: 1, 6 pages (online, http://genomebiology.com/2000/2/1/research/0001.1).
Forbes A and Lehmann R (1998) Nanos and Pumilio have critical roles in the development and function of Drosophila germline stem cells. Development 125: 679–690.
González-Reyes A, Elliot H and St Johnston D (1995) Polarization of both major body axes in Drosophila by gurken-torpedo signalling. Nature 375: 654–658.
Goodisman MA, Isoe J, Wheeler DE and Wells MA (2005) Evolution of insect metamorphosis: a microarray-based study of larval and adult gene expression in the ant Camponotus festinatus. Evolution 59: 858–870.
Gordon DB, Nekludova L, McCallum S and Fraenkel E (2005) TAMO: a flexible, object-oriented framework for analysing transcriptional regulation using DNA-sequence motifs. Bioinformatics 21: 3164–3165.
Guidugli KR, Hepperle C and Hartfelder K (2004) A member of the short-chain dehydrogenase/reductase (SDR) superfamily is a target of the ecdysone response in honey bee (Apis mellifera) caste development. Apidologie 35: 37–47.
Guidugli KR, Nascimento AM, Amdam GV, Barchuk AR, Omholt SW, Simões ZLP and Hartfelder K (2005) Vitellogenin regulates hormonal dynamics in the worker caste of a eusocial insect. FEBS Lett 579: 4961–4965.
Hagedorn HH, Maddison DR and Tu Z (1998) The evolution of vitellogenins, cyclorrhaphan yolk proteins and related molecules. Adv Insect Physiol 27: 335–384.
Hall TA (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41: 95–98.
Hamilton WD (1964) The genetical theory of social behaviour I & II. J Theor Biol 7: 1–16, 17–52.
Hartfelder K and Emlen DJ (2005) Endocrine control of insect polyphenism. In Comprehensive Insect Molecular Science (Gilbert LI, Iatrou K and Gill S, eds), Vol. 3, pp. 651–703. Elsevier, Oxford.
Hartfelder K and Engels W (1998) Social insect polymorphism: Hormonal regulation of plasticity in development and reproduction in the honeybee. Curr Topics Dev Biol 40: 45–77.
Hawkins NC, Thorpe J and Schüpbach T (1996) encore, a gene required for the regulation of germ line mitosis and oocyte differentiation during Drosophila oogenesis. Development 122: 281–290.
Haydak MH (1970) Honey bee nutrition. Annu Rev Entomol 15: 143–156.
Hepperle C and Hartfelder K (2001) Differentially expressed regulatory genes in honey bee caste development. Naturwissenschaften 88: 113–116.
Hoage TR and Kessel RG (1968) An electron microscopic study of the process of differentiation during spermatogenesis in the drone honey bee (Apis mellifera L.) with special reference to centriole replication and elimination. J Ultrastruct Res 24: 6–32.
Honey Bee Genome Sequencing Consortium (2006) Insights into social insects from the genome of the honeybee Apis mellifera. Nature (in press).
Hughes JD, Estep PW, Tavazole S and Church GM (2000) Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae. J Mol Biol 296: 1205–1214.
Hunt GJ and Page RE (1995) Linkage map of the honey bee, Apis mellifera, based on RAPD markers. Genetics 139: 1371–1382.
Hunt JH, Buck NA and Wheeler DE (2003) Storage proteins in vespid wasps: characterization, developmental pattern and occurrence in adults. J Insect Physiol 49: 785–794.
Judice CC, Carazzole MF, Festa F, Sogayar MC, Hartfelder K and Pereira GAG (2006) Gene expression profiles underlying alternative caste phenotypes in a highly eusocial bee, Melipona quadrifasciata. Insect Mol Biol 15: 33–44.
Kerr WE (1950) Genetic determination of castes in the genus Melipona. Genetics 35: 143–152.
Kropácová S and Haslbachová H (1971) The influence of queenlessness and of unsealed brood on the development of ovaries in worker honeybees. J Apicult Res 10: 57–61.
Lattorff HMG, Moritz RFA and Fuchs S (2005) A single gene determines thelytokous parthenogenesis in honey bee workers (Apis mellifera capensis). Heredity 94: 533–537.
Lattorff HMG, Moritz RFA, Solignac M and Crewe RM (2006) Control of reproductive dominance by the thelytoky gene in honeybees. Biol Lett (in press).
Li TR and White KP (2003) Tissue-specific gene expression and ecdysone-regulated genomic networks in Drosophila. Dev Cell 5: 59–72.
Lin H, Yue L and Spradling AC (1994) The Drosophila fusome, a germline-specific organelle, contains membrane skeletal proteins and functions in cyst formation. Development 120: 947–956.
Liu XS, Brutlag DL and Liu JS (2002) An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments. Nature Biotechnol 20: 835–839.
Liu Y, Liu S, Wei L, Altman RB and Batzoglou S (2004) Eukaryotic regulatory element conservation analysis and identification using comparative genomics. Genome Res 14: 354–366.
Martinez T, Burmester T, Veenstra JA and Wheeler DE (2001) Sequence and evolution of a hexamerin from the ant Camponotus festinatus. Insect Mol Biol 9: 427–431.
McKearin D (1997) The Drosophila fusome, organelle biogenesis and germ cell differentiation: if you build it … Bioessays 19: 147–152.
Mirth C, Truman JW and Riddiford LM (2005) The role of the prothoracic gland in determining critical weight for metamorphosis in Drosophila melanogaster. Curr Biol 15: 1–12.
Moritz RFA, Kryger P and Allsopp MH (1996) Competition for royalty in bees. Nature 384: 31.
Page RE and Erickson EH (1988) Reproduction by worker honey bees (Apis mellifera L.). Behav Ecol Sociobiol 23: 117–126.
Pereboom JJM, Jordan WC, Sumner S, Hammond RL and Bourke AFG (2005) Differential gene expression in queen-worker caste determination in bumble bees. Proc R Soc B 272: 1145–1152.
Perera SC, Zheng S, Feng QL, Krell PJ, Retnakaran A and Palli SR (2005) Heterodimerization of ecdysone receptor and ultraspiracle on symmetric and asymmetric response elements. Arch Insect Biochem Physiol 60: 55–70.
Piulachs MD, Guidugli KR, Barchuk AR, Cruz J, Simões ZLP and Bellés X (2003) The vitellogenin of the honeybee, Apis mellifera: Structural analysis of the cDNA and expression studies. Insect Biochem Mol Biol 33: 459–465.
Pritsker M, Liu YC, Beer MA and Tavazole S (2004) Whole genome discovery of transcription factor binding sites by network-level conservation. Genome Res 14: 99–108.
Rachinsky A, Strambi C, Strambi A and Hartfelder K (1990) Caste and metamorphosis – hemolymph titers of juvenile hormone and ecdysteroids in last instar honeybee larvae. Gen Comp Endocr 79: 31–38.
Riddihough G and Pelham HRB (1987) An ecdysone response element in the Drosophila hsp27 promotor. EMBO J 6: 3729–3734.
Roth FP, Hughes JD, Estep PW and Church GM (1998) Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantification. Nature Biotechnol 16: 939–945.
Rutherford K, Parkhill J, Crook J, Horsnell T, Rice P, Rajandream MA and Barrell B (2000) Artemis: sequence visualisation and annotation. Bioinformatics 16: 944–945.
Schmidt-Capella IC and Hartfelder K (1998) Juvenile hormone effect on DNA synthesis and apoptosis in caste-specific differentiation of the larval honey bee (Apis mellifera L.) ovary. J Insect Physiol 44: 385–391.
Snodgrass RE (1956) Anatomy of the Honey Bee. Cornell University Press, Ithaca.
Solignac M, Vautrin D, Baudry E, Mougel F, Loiseau A and Cornuet JM (2004) A microsatellite-based linkage map of the honeybee, Apis mellifera L. Genetics 167: 253–262.
St Johnston D and Nüsslein-Volhard C (1992) The origin of pattern and polarity in the Drosophila embryo. Cell 68: 201–219.
Staab S and Steinmann-Zwicky M (1995) Female germ cells of Drosophila require zygotic ovo and otu product for survival in larvae and pupae respectively. Mech Dev 54: 205–210.
Sullivan AA and Thummel CS (2003) Temporal profiles of nuclear receptor gene expression profiles reveal coordinate transcriptional responses during Drosophila development. Mol Endocrinol 17: 2125–2137.
Wassermann WW and Sandelin A (2004) Applied bioinformatics for the identification of regulatory elements. Nature Rev Genet 5: 267–287.
West-Eberhard MJ (2003) Developmental Plasticity and Evolution. Oxford University Press, Oxford.
Wheeler DE and Nijhout HF (2003) A perspective for understanding the modes of juvenile hormone as a lipid signaling system. Bioessays 25: 994–1001.
Wilson EO (1971) The Insect Societies. Belknap Press of Harvard University Press, Cambridge, MA.
Wingender E, Chen X, Hehl R, Karas H, Liebich I, Matys V et al. (2000) TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res 28: 316–319.
Zhou X, Oi FM and Scharf ME (2006a) Social exploitation of hexamerin: RNAi reveals a major caste-regulatory factor in termites. Proc Natl Acad Sci USA 103: 4499–4504.
Zhou X, Tarver MR, Bennett GW, Oi FM and Scharf ME (2006b) Two hexamerin genes from the termite Reticulitermes flavipes: sequence, expression and proposed functions in caste regulation. Gene 376: 47–58.
Zhu Z, Shendure J and Church GM (2005) Discovering functional transcription factor combinations in the human cell cycle. Genome Res 15: 848–855.
Supplementary material
The following material is available for this article online:
Table S1. Annotation results of caste-specifically expressed honey bee genes.
Table S2. Caste-specific motifs in UCRs of honey bee genes.
Table S3. Annotation of honey bee orthologs to Drosophila melanogaster genes involved in Reproduction.
This material is available as part of the online article from http://www.blackwell-synergy.com
sites is generally taken as a sign of strong regulatory effect (Davidson, 2001).
As caste development is highly dependent on changes in haemolymph titres of JH and ecdysteroids, we also screened the UCRs of the differentially expressed genes for putative nuclear receptor binding sites. Regulatory elements involved in the JH response are not well understood yet, so any prediction in this direction would be elusive (Wheeler & Nijhout, 2003). Functional ecdysone response elements (EcRE) have, however, been identified, and it is now well established that the EcR/USP complex binds to direct or inverted (palindromic) repeats (Riddihough & Pelham, 1987; Antoniewski et al., 1995; Perera et al., 2005). A PSSM search (Wassermann & Sandelin, 2004) based on a canonical representation (rGkTCAaTGamcy) (Perera et al., 2005) did not reveal any putative EcRE motif in the UCRs of the 51 caste-differentially expressed genes. However, this does not rule out that these genes respond to changes in JH and/or ecdysteroid titres, as these hormones require EcR/USP binding primarily in the expression of early genes, but not necessarily for the late response genes (Li & White, 2003; Sullivan & Thummel, 2003).
In conclusion, the predictions from such a combined strategy, which searches for group-specific and for conserved regulatory motifs in GLEAN3-predicted honey bee genes, represent a major transition from nonhypothesis-driven high-throughput screens to hypothesis-driven searches for context-dependent gene expression in honey bees. Such directed search results can serve as a platform for experimental analyses of genome-wide integration in hormonal control of caste development in bees. In addition, this study exemplifies how existent algorithms for detecting shared regulatory motifs can be joined into a toolkit for predicting coregulated gene expression patterns in honey bees. These methods have been shown to be robust and are gaining acceptance for use in functional and comparative genomics (Liu et al., 2004; Pritsker et al., 2004; Zhu et al., 2005).
Figure 3. Putative regulatory motifs and their consensus sequences in UCRs of queen and worker overexpressed genes. Scores for MAP, Church, ROC AUC and MNCP metrics indicate degree of group specificity and significance level.
Oogenesis and reproduction
As caste development sets the stage for reproductive division of labour, genes involved in reproductive processes are strong candidates for functional analyses. In the present study we performed BLAST searches to identify honey bee orthologs for a list of 32 fly genes with the GO attribute ‘oogenesis’ and for four genes specifically related to ‘vitellogenesis’. The list of fly genes involved in nuclear events in germ cells consisted of 20 genes for ‘female meiosis’, 12 genes for ‘recombination’ and 21 genes under the heading ‘chromosome segregation, including segregation distortion’ (Supplementary material, Table S3). In some cases, these GO attributes for fly genes of course overlapped.
BLASTN and BLASTX searches for these fly genes against the honey bee genome assembly 3.0 and the GLEAN3 Official Set (aa) retrieved statistically well supported putative bee orthologs for most of these candidates. For the genes involved in meiosis, recombination and chromosome segregation, this finding, although not unexpected, is of interest, as meiosis in the haploid honey bee drone is strongly modified when compared with a normal diploid meiosis. The first meiosis is initiated, but the nucleus remains undivided and only the superfluous centrioles are eliminated as cytoplasmic buds (Hoage & Kessel, 1968). An interesting gene, thelytoky (th), has recently been mapped in this context (Lattorff et al., 2005). It almost completely prevents meiotic recombination in the automixis of laying workers of the Cape honey bee. As an indication of the interplay between meiosis and later development, this locus also appears to be an integral part of various gene cascades involved in caste determination (Lattorff et al., 2006).
The fly genes retrieved in the GO searches for ‘oogenesis’ represent a much larger range of Molecular Function categories, such as transcription factors, proteins regulating translation by RNA binding, RNA helicases, enzymes (ubiquitination, transfer of sugar residues, sulfotransferase), GTPase activity, and several factors binding to cytoskeletal proteins (Supplementary material, Table S3). This wide range of functional categories is expected, as these genes are involved in a series of different steps during oogenesis in the polytrophic meroistic ovary. Oogenesis starts out with the maintenance of germline and somatic stem cell identity in the germline niche in the upper germarium. A key gene involved in this process is pumilio (Forbes & Lehmann, 1998), which is represented by a highly conserved bee ortholog, GB10504-PA. The second step is the formation of germ cell cysts, the determination of an oocyte within each cyst, and the survival of these cysts, involving genes such as benign gonial cell neoplasm (Lin et al., 1994), encore (Hawkins et al., 1996), ovo and ovarian tumour (otu) (Staab & Steinmann-Zwicky, 1995), all well conserved in the honey bee genome. Interestingly, we could not find a clear bee ortholog for bag of marbles (bam), which is one of the prime early response genes in the cystoblast differentiation pathway in Drosophila (McKearin, 1997).
Figure 4. Map of the group-specific motifs found in queen and worker UCRs of caste-specifically expressed genes. The coding region is represented by the GLEAN3 prediction number (assembly 4.0), with arrows indicating the translation start site. Asterisks mark UCRs of the ‘top10’ set used to find the over-represented motifs.
The third step comprises previtellogenic growth of the follicle, and during these stages a number of maternal factors are deposited and anchored, either within the oocyte or in the perivitelline space, that define the egg and the future embryonic axes (for review see St Johnston & Nüsslein-Volhard, 1992). In the list of Drosophila genes involved in early steps of axis determination, a couple of surprises came up in the search for honey bee orthologs. A big surprise was that we could not find a gurken ortholog in the bee, even though this gene sets up both the anterior–posterior and dorsal–ventral axes in the Drosophila egg (González-Reyes et al., 1995), whereas downstream components of the Gurken signalling cascade appear to be preserved in the bee genome (Honey Bee Genome Sequencing Consortium, 2006). Similar apparent gaps in constituents of patterning cascades were noted for the terminal regions of the embryo, such as a lack of a torso ortholog, whereas its ligand torso-like is represented by a well conserved ortholog in the bee (GB18663-PA).
With respect to genes involved in the final processes of oogenesis, we primarily looked at genes that play a part during vitellogenesis. There are four genes of interest in this class, the primary one coding for the yolk protein precursor vitellogenin. This gene has already been sequenced for the honey bee (Piulachs et al., 2003) and, as expected, it is much more related to vitellogenins of other insects and even vertebrates than to the Drosophila yolk proteins, which apparently are derived from lipases (Hagedorn et al., 1998). The second gene of interest is the bee ortholog to yolkless, as this (GB16571-PA) could represent a putative vitellogenin receptor. The other two Drosophila genes with clear orthologs in the bee are CG18641 and CG12139, which code for a lipase and an LDL receptor, respectively.
General conclusions
The current analysis made use of previous experimental analyses on differential transcription during caste development of honey bee larvae. In the annotation of these genes, which includes references to Gene Ontology terms associated with their respective Drosophila orthologs, two major configurations emerged. First of all, worker genes were better defined in terms of GO attributes compared with the relatively large number of queen genes that had no GO terms associated with their respective Drosophila orthologs. Even when taking into consideration the conceptual limits in attributing GO terms on molecular function and biological process from Drosophila orthologs to honey bee genes, this finding could have a bearing on general basic questions in socioevolution, namely, which caste is more divergent from a nonsocial reproductive female bee prototype or reproductive ground plan, the queen or the worker? Less speculative is the second major conclusion coming out of the GO analysis for Molecular Function, showing and confirming (Eder et al., 1983; Corona et al., 1999) the important role of metabolic regulation in caste development. This facet is demonstrated especially clearly in the caste-specific expression of oxidoreductases (queen) vs. hydrolases (workers).
The honey bee genome information not only provided a much improved annotation platform for caste-specifically expressed ESTs but, even more so, opens the possibility to explore putative regulatory features of the honey bee genome. In the current study we employed modified Gibbs sampling and expectation-maximization algorithms (AlignACE, MDscan, MEME) to detect group-specific motifs in gene regions up to 1000 bp upstream of translation start sites. We detected 14 motifs that were significantly overrepresented in the caste genes when compared with corresponding motifs found in a random set of GLEAN3-predicted honey bee genes. The localization of such motifs in UCRs of worker-overexpressed genes revealed a clustering of such motifs close to the predicted basal promoter regions, suggesting strong regulatory effects. Such search strategies and the detected motifs can provide the lead to reveal and unravel cis-regulatory networks for and within specific contexts of honey bee biology.
Caste polyphenism in social insects makes a strong case for the emergence of novelties at a microevolutionary level (West-Eberhard, 2003). Without the pretension to discuss exhaustively the mechanisms underlying this surge of developmental plasticity, two major themes become apparent in this and other studies. Regulatory change has been demonstrated in the shut-down of wing disc patterning cascades in ants (Abouheif & Wray, 2002) and is certainly also implicit in observed temporal changes in gene expression during postembryonic development of ants and bumble bees (Goodisman et al., 2005; Pereboom et al., 2005). Such change would be expected to involve cis-regulatory elements, that is, change in transcription factor binding sites in UCRs, as approached in this study, and also evolutionary change in response thresholds to circulating morphogenetic hormones (for review see Hartfelder & Emlen, 2005). The second and quite unexpected theme is the acquisition of new systemic functions by evolutionarily rather old proteins, such as vitellogenin and hexamerins. These apparently unspectacular proteins have evolved into key players for caste evolution and reproductive division of labour via novel regulatory connectivity with JH (Amdam et al., 2004; Guidugli et al., 2005; Zhou et al., 2006a,b).
Experimental procedures
Selection and annotation of ESTs representing differentially expressed genes in honey bee caste development
The starting point was a set of 164 entries (mainly 3′-ESTs) in GenBank (BG101532–BG101697) from an SSH library (Evans & Wheeler, 1999; Evans & Wheeler, 2000). When validated by macroarray analyses, a clustering into three major classes became apparent: (I) genes overexpressed in young larvae, (II) genes overexpressed in last instar queen larvae, and (III) genes overexpressed in last instar worker larvae. For this study we excluded the class I ESTs, because their expression is not caste-specific but rather represents expression differences between young (still bipotent) and older larvae. To the class II queen ESTs (82) we added one complete cDNA entry (AY601642) from a DDRT-PCR screen (Corona et al., 1999), and to the class III set of worker ESTs (40) we added seven GenBank dbEST entries (BG149167–BG149173) from a DDRT-PCR screen on ovary development (Hepperle & Hartfelder, 2001).
The EST sequences were submitted to BLASTN searches(parameters -G 2 -E 3 -W 15 -F lsquom Drsquo -U -e 1e-20) against genomesequence assembly Amel_v30 to retrieve matches in linked orunlinked genomic contigs and to exclude no-matches (seven ESTsin queen) ESTs that aligned within the same scaffold werechecked for clustering and overlap This clustering also served toexclude genes that were represented by non-overlapping ESTsfrom both castes This procedure generated a set of 51 uniqueputative gene sequences overexpressed in either queen (34) or inworker larvae (17) These 51 nonredundant sequences weresubmitted to BLASTX searches against the Official Set of GLEAN3-predicted protein sequences (cut-off value at 1eminus20) For ESTs withno significant protein sequence matches the genomic regionsadjacent to the mapped EST were searched to find neighbouringORFs especially those nearest to putative 3prime UTRs of predictedproteins as the EST libraries have a bias in this direction
Official Set protein sequences were aligned against the Amel_v3.0 sequence assembly using TBLASTN to map protein to genome, and subsequently they were aligned using BLASTP against the GenBank nonredundant (nr) and the Flybase protein sequence databases. The manual features annotation procedure of the Artemis 7.0 program (Rutherford et al., 2000) was used to map ORFs, putative splice sites of exons, and ESTs to genome coordinates. The final annotation file was generated with a Python script in GFF format (http://www.sanger.ac.uk/Software/formats/GFF).
Honey bee sequences annotated as orthologs to D. melanogaster genes were putatively assigned the GO terms listed in the respective Flybase entry. In addition, the definition of new GO terms (Biological Process ontology) related to caste development and polyphenism (GO:0048651 and GO:0048650, respectively) was co-ordinated with the Gene Ontology Consortium (Ashburner et al., 2000). The FatiGO web tool (Al-Sharour et al., 2004) was used to cluster GO terms (level 3 setting) for Biological Process and Molecular Function.
For the detection of conserved domains, the 51 protein sequences were screened against the Pfam database (http://www.sanger.ac.uk/Software/Pfam) using the HMMER platform (current release 2.3.2, http://hmmer.wustl.edu) with a cut-off value set at 1e−10.
Annotation of oogenesis and reproduction genes
In order to identify putative honey bee orthologs to D. melanogaster genes, we searched the following GO terms in Flybase: ‘oogenesis’ (GO:0009993), ‘vitellogenesis’ (GO:0007296), ‘female meiosis’ (GO:0007143), ‘DNA recombination’ (GO:0006310) and ‘chromosome segregation’ (GO:0007059). Genes related to segregation distortion were searched for in Flybase in phenotypic descriptions and mutant effects of D. melanogaster genes, as this phenomenon is not represented by a GO term. Hence, this group may be more heterogeneous than the others. From this list we removed genes of pleiotropic function (multifaceted GO entries in Biological Process) and genes that lacked defined transcripts in the Drosophila genome database.
For the GO terms ‘oogenesis’ and ‘vitellogenesis’ we performed TBLASTN and BLASTP searches for 42 fruit fly genes against the Amel_v3.0 genome assembly and the GLEAN3-predicted protein sequences (Honey Bee Genome Sequencing Consortium, 2006), respectively. The orthologous D. melanogaster gene was characterized by the same procedure as described above (reciprocal best hit). For the GO terms ‘female meiosis’, ‘DNA recombination’, ‘chromosome segregation’ and the non-GO group ‘segregation distortion’, transcripts of D. melanogaster were searched against nr databases at NCBI using BLASTP. The obtained sequences were searched against the Amel_v3.0 genome assembly and the GLEAN3-predicted protein sequences using TBLASTN. Homologous sequences (threshold 1e−10) were predicted using the BioEdit software (Hall, 1999). ORFs showing significant homology (BLASTP threshold 1e−20) were assembled and used in BLASTP searches against the nr databases at NCBI.
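The reciprocal best hit criterion can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used here: the input rows are assumed to be already parsed from BLAST output as (query, subject, E-value, bit score) tuples, the best hit is taken as the one with the highest bit score, and the E-value cut-off is a caller-supplied parameter.

```python
def best_hits(rows):
    """Return the highest-scoring subject for each query.

    rows: iterable of (query, subject, evalue, bitscore) tuples
    (hypothetical layout for parsed BLAST results).
    """
    best = {}
    for query, subject, evalue, bitscore in rows:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(fly_vs_bee, bee_vs_fly, max_evalue=1e-10):
    """Pair a fly gene with a bee gene only if each is the other's best hit."""
    forward = best_hits(r for r in fly_vs_bee if r[2] <= max_evalue)
    backward = best_hits(r for r in bee_vs_fly if r[2] <= max_evalue)
    return {fly: bee for fly, bee in forward.items()
            if backward.get(bee) == fly}
```

A fly gene whose best bee hit points back to a different fly gene is left without an ortholog call, which is the conservative outcome for genes such as those with no clear ortholog in the bee.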
Motif search in upstream regions in caste-specifically expressed genes
In order to detect overrepresented motifs in the upstream control regions (UCRs) of the two sets of caste-related genes, we selected gene subsets based on two criteria: (1) those that had shown the highest caste-specificity in the array analyses (Evans & Wheeler, 2000) and (2) those that had a conserved 5′ region when compared with the Drosophila orthologs. These ‘top10’ genes consisted of six queen genes (GB13072, GB11628, GB19380, GB14798, GB16047 and GB18242) and of six worker genes (GB10869, GB12371, GB12239, GB10428, GB19006, GB14758). The motif search was conducted separately on the two sets of UCR sequences using three methods: AlignACE (Roth et al., 1998), MEME (Bailey & Elkan, 1995) and MDscan (Liu et al., 2002). Default parameter values were used in all searches, except that GC content in intergenic regions was set to 25%, representing the background value established for the honey bee UCR database generated in this study.
The database containing 10156 UCR sequences was generated by parsing the Official Set annotation file (downloaded in GFF format from http://www.beegenome.hgsc.bcm.tmc.edu/beeftp.html) to extract upstream regions starting from the terminal 5′-genomic coordinate of each predicted CDS. The UCRs were arbitrarily set to a size frame of 1000 nucleotides (Roth et al., 1998), but were trimmed whenever another predicted ORF was detected in any of these regions.
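The extraction and trimming of UCRs can be sketched in Python as below. This is a minimal illustration, not the script used to build the database: the function name `upstream_regions` and the dictionary layout are assumptions, each gene is reduced to the overall span of its predicted CDS, and scaffold ends are not checked for minus-strand genes.

```python
def upstream_regions(genes, length=1000):
    """Compute upstream windows, trimmed at neighbouring predicted genes.

    genes: dict gene_id -> (scaffold, start, end, strand), where start/end
    give the 1-based inclusive span of the predicted CDS (hypothetical
    layout; the values would be parsed from a GFF file).
    Returns dict gene_id -> (scaffold, ucr_start, ucr_end), or None when
    no upstream sequence remains.
    """
    ucrs = {}
    for gid, (scaf, start, end, strand) in genes.items():
        if strand == "+":
            lo, hi = max(1, start - length), start - 1
        else:
            # Upstream of a minus-strand gene lies at higher coordinates.
            lo, hi = end + 1, end + length
        for other, (oscaf, ostart, oend, _) in genes.items():
            if other == gid or oscaf != scaf:
                continue
            if oend < lo or ostart > hi:
                continue  # neighbour does not reach into the window
            if strand == "+":
                lo = max(lo, oend + 1)
            else:
                hi = min(hi, ostart - 1)
        ucrs[gid] = (scaf, lo, hi) if lo <= hi else None
    return ucrs
```

The window always stays adjacent to the gene's own 5′ end, so trimming shortens it from the distal side only.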
The MAP (maximum a priori log likelihood) score, group specificity score (called Church score in this manuscript) (Hughes et al., 2000), ROC AUC (area under the curve for a receiver-operator characteristic plot) metric, and MNCP (mean normalized conditional probability) metric (Clarke & Granek, 2003) were used to detect motifs that most likely correspond to biologically significant cis-regulatory elements. The filters run on the UCRs of the subsets of queen and worker genes were a MAP score cut-off value of 50, followed by a ROC AUC cut-off at 0.7, followed by a group specificity score cut-off at 1e−05. The UCR database of all GLEAN3-predicted honey bee genes was used as the background to calculate these metrics.
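The sequential filtering can be illustrated with a short Python sketch. It assumes that each motif has already been scored and is held as a dictionary with the hypothetical keys `map`, `roc_auc` and `church`; it does not reproduce the TAMO implementation. The ROC AUC is computed here in its rank form, that is, the probability that a target UCR scores higher than a background UCR, with ties counted as one half.

```python
def roc_auc(target_scores, background_scores):
    """Area under the ROC curve from per-sequence motif scores.

    Equals the fraction of (target, background) pairs in which the
    target sequence scores higher, counting ties as 0.5.
    """
    wins = 0.0
    for t in target_scores:
        for b in background_scores:
            if t > b:
                wins += 1.0
            elif t == b:
                wins += 0.5
    return wins / (len(target_scores) * len(background_scores))

def filter_motifs(motifs, map_min, auc_min, church_max):
    """Apply the three cut-offs one after the other.

    motifs: list of dicts with keys 'map', 'roc_auc' and 'church'
    (hypothetical field names). Cut-offs are supplied by the caller.
    """
    kept = [m for m in motifs if m["map"] >= map_min]
    kept = [m for m in kept if m["roc_auc"] >= auc_min]
    kept = [m for m in kept if m["church"] <= church_max]
    return kept
```

Because the group specificity score is a probability, smaller values indicate stronger specificity, so that filter keeps motifs at or below the cut-off while the other two keep motifs at or above theirs.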
A parametric test (MANOVA) and a nonparametric test (Kolmogorov–Smirnov) were conducted to identify significance levels for the two sets of filtered motifs found in the UCRs of caste-specifically expressed genes against filtered motifs found in the random UCR set. Random motifs were sampled from a motif database (10 391 motifs) generated by running our script 100 times with random sets of UCR sequences.
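For readers unfamiliar with the nonparametric test, the two-sample Kolmogorov–Smirnov statistic is the largest vertical distance between the empirical distribution functions of the two samples. The sketch below computes only this statistic (no P-value) with the Python standard library; it illustrates the principle and is not the code used for the comparison reported here.

```python
import bisect

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D.

    D is the maximum absolute difference between the two empirical
    cumulative distribution functions, evaluated at every observed value.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        # Fraction of each sample that is <= x.
        fa = bisect.bisect_right(a, x) / len(a)
        fb = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d
```

Applied to a motif metric, `sample_a` would hold the values for the caste-specific motifs and `sample_b` the values for motifs sampled from the random UCR sets.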
The main criterion for identifying known regulatory motifs among these caste-specific ones was the alignment of the PSSM for each bee motif with the Drosophila melanogaster sequences in the TRANSFAC database (release 4.0) (Wingender et al., 2000). Only the alignments passing a threshold of 80% identity for each PSSM were considered as significant matches. In addition, we checked for a specific binding motif, the EcR/USP motif (rGkTCAaTGamcy-3′), known to function in the expression of genes responding to morphogenetic hormone titres (Perera et al., 2005).
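A search for a degenerate consensus such as the EcR/USP motif can be sketched by translating IUPAC nucleotide codes into a regular expression and scanning both strands. The sketch makes one simplifying assumption that should be kept in mind: lower-case letters in the consensus, which usually mark less conserved positions, are treated as strictly as upper-case ones. The function names are illustrative only.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
         "S": "[CG]", "W": "[AT]", "B": "[CGT]", "D": "[AGT]",
         "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")

def reverse_complement(seq):
    """Reverse complement that also handles ambiguity codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def find_motif(motif, sequence):
    """Return sorted (position, strand) pairs for all motif matches.

    Positions are 0-based on the forward strand; '-' marks matches of
    the reverse-complemented motif. Overlapping matches are reported.
    """
    sequence = sequence.upper()
    hits = set()
    for strand, pattern in (("+", motif.upper()),
                            ("-", reverse_complement(motif))):
        body = "".join(IUPAC[code] for code in pattern)
        regex = re.compile("(?=(" + body + "))")  # lookahead keeps overlaps
        for match in regex.finditer(sequence):
            hits.add((match.start(), strand))
    return sorted(hits)
```

Calling `find_motif("rGkTCAaTGamcy", ucr_sequence)` on each UCR would list candidate sites, which could then be compared with the positions of the group-specific motifs.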
Operating system and programming tools
An Ubuntu Linux (version Breezy) operating system was used to implement all scripts and pipelines designed for annotation procedures and motif discovery. The Python programming language (http://www.python.org), Biopython (http://www.biopython.org) and TAMO (Tools for Analysis of Motifs) packages (Gordon et al., 2005) were used in program design. Other web applications were built using the Zope application server (http://www.zope.org), hosted at http://zulu.fmrp.usp.br/beelab.
Acknowledgements
This work was supported by grants from FAPESP (Fundação de Amparo a Pesquisa do Estado de São Paulo, 99/00719-6) and CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico, 4729632-3-1).
References
Abouheif, E. and Wray, G.A. (2002) Evolution of the genetic network underlying wing polyphenism in ants. Science 297: 249–252.
Al-Sharour, F., Diaz-Uriarte, R. and Dopazo, J. (2004) FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes. Bioinformatics 20: 578–580.
Amdam, G.V., Norberg, K., Hagen, A. and Omholt, S.W. (2003) Social exploitation of vitellogenin. Proc Natl Acad Sci USA 100: 1799–1802.
Amdam, G.V., Simões, Z.L.P., Hagen, A., Norberg, K., Schroder, K., Mikkelsen, O. et al. (2004) Hormonal control of the yolk precursor vitellogenin regulates immune function and longevity in honeybees. Exp Gerontol 39: 767–773.
Antoniewski, C., O’Grady, M.S., Edmondson, R.G., Lassieur, S.M. and Benes, H. (1995) Characterization of an EcR/USP heterodimer target site that mediates ecdysone responsiveness of the Drosophila Lsp-2 gene. Mol Gen Genet 249: 545–556.
Arbeitsman, M.N., Furlong, E.E.M., Imam, F., Johnson, E., Null, B.H., Baker, B.S. et al. (2002) Gene expression during the life cycle of Drosophila melanogaster. Science 297: 2270–2275.
Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M. et al. (2000) Gene Ontology: tool for the unification of biology. Nature Genet 25: 25–29.
Bailey, T.L. and Elkan, C. (1995) Unsupervised learning of multiple motifs in biopolymers using expectation maximization. Machine Learning J 21: 58–83.
Bergem, M., Norberg, K. and Aamodt, R.M. (2006) Long-term maintenance of in vitro cultured honeybee (Apis mellifera) embryonic cells. BMC Dev Biol 6: 17 (online).
Boutros, M., Kiger, A.A., Armknecht, S., Kerr, K., Hild, M., Koch, B. et al. (2004) Genome-wide RNAi analysis of growth and viability in Drosophila cells. Science 303: 832–835.
Carsten, L.D., Watts, T. and Markow, T.A. (2005) Gene expression patterns accompanying a dietary switch in Drosophila melanogaster. Mol Ecol 14: 3203–3208.
Clarke, N.D. and Granek, J.A. (2003) Rank order metrics for quantifying the association of sequence features with gene regulation. Bioinformatics 19: 212–218.
Colombani, J., Bianchini, L., Layalle, S., Pondeville, E., Dauphin-Villemant, C., Antoniewski, C. et al. (2005) Antagonistic actions of ecdysone and insulins determine final size in Drosophila. Science 310: 667–670.
Corona, M., Estrada, E. and Zurita, M. (1999) Differential expression of mitochondrial genes between queens and workers during caste determination in the honeybee Apis mellifera. J Exp Biol 202: 929–938.
Cunha, A.C., Nascimento, A.M., Guidugli, K.R., Simões, Z.L.P. and Bitondi, M.M.G. (2005) Molecular cloning and expression of a hexamerin cDNA from the honey bee. J Insect Physiol 51: 1135–1147.
Darwin, C.R. (1859) On the Origin of Species by Means of Natural Selection. John Murray, London.
Davidson, E.H. (2001) Genomic Regulatory Systems: Development and Evolution. Academic Press, San Diego.
Eder, J., Kremer, J.P. and Rembold, H. (1983) Correlation of cytochrome c titer and respiration in Apis mellifera: adaptive responses to caste determination defines workers, intercastes and queens. Comp Biochem Physiol B 76: 703–716.
Engels, W. (1974) Occurrence and significance of vitellogenins in female castes of social Hymenoptera. Am Zool 14: 1229–1237.
Evans, J.D. and Wheeler, D.E. (1999) Differential gene expression between developing queens and workers in the honey bee, Apis mellifera. Proc Natl Acad Sci USA 96: 5575–5580.
Evans, J.D. and Wheeler, D.E. (2000) Expression profiles during honeybee caste determination. Genome Biol 2(1): 6 pages (online: http://genomebiology.com/2000/2/1/research/0001.1).
Forbes, A. and Lehmann, R. (1998) Nanos and Pumilio have critical roles in the development and function of Drosophila germline stem cells. Development 125: 679–690.
González-Reyes, A., Elliot, H. and St Johnston, D. (1995) Polarization of both major body axes in Drosophila by gurken-torpedo signalling. Nature 375: 654–658.
Goodisman, M.A., Isoe, J., Wheeler, D.E. and Wells, M.A. (2005) Evolution of insect metamorphosis: a microarray-based study of larval and adult gene expression in the ant Camponotus festinatus. Evolution 59: 858–870.
Gordon, D.B., Nekludova, L., McCallum, S. and Fraenkel, E. (2005) TAMO: a flexible, object-oriented framework for analysing transcriptional regulation using DNA-sequence motifs. Bioinformatics 21: 3164–3165.
Guidugli, K.R., Hepperle, C. and Hartfelder, K. (2004) A member of the short-chain dehydrogenase/reductase (SDR) superfamily is a target of the ecdysone response in honey bee (Apis mellifera) caste development. Apidologie 35: 37–47.
Guidugli, K.R., Nascimento, A.M., Amdam, G.V., Barchuk, A.R., Omholt, S.W., Simões, Z.L.P. and Hartfelder, K. (2005) Vitellogenin regulates hormonal dynamics in the worker caste of a eusocial insect. FEBS Lett 579: 4961–4965.
Hagedorn, H.H., Maddison, D.R. and Tu, Z. (1998) The evolution of vitellogenins, cyclorrhaphan yolk proteins and related molecules. Adv Insect Physiol 27: 335–384.
Hall, T.A. (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41: 95–98.
Hamilton, W.D. (1964) The genetical theory of social behaviour, I & II. J Theor Biol 7: 1–16, 17–52.
Hartfelder, K. and Emlen, D.J. (2005) Endocrine control of insect polyphenism. In Comprehensive Insect Molecular Science (Gilbert, L.I., Iatrou, K. and Gill, S., eds), Vol. 3, pp. 651–703. Elsevier, Oxford.
Hartfelder, K. and Engels, W. (1998) Social insect polymorphism: Hormonal regulation of plasticity in development and reproduction in the honeybee. Curr Topics Dev Biol 40: 45–77.
Hawkins, N.C., Thorpe, J. and Schüpbach, T. (1996) encore, a gene required for the regulation of germ line mitosis and oocyte differentiation during Drosophila oogenesis. Development 122: 281–290.
Haydak, M.H. (1970) Honey bee nutrition. Annu Rev Entomol 15: 143–156.
Hepperle, C. and Hartfelder, K. (2001) Differentially expressed regulatory genes in honey bee caste development. Naturwissenschaften 88: 113–116.
Hoage, T.R. and Kessel, R.G. (1968) An electron microscopic study of the process of differentiation during spermatogenesis in the drone honey bee (Apis mellifera L.) with special reference to centriole replication and elimination. J Ultrastruct Res 24: 6–32.
Honey Bee Genome Sequencing Consortium (2006) Insights into social insects from the genome of the honeybee Apis mellifera. Nature (in press).
Hughes, J.D., Estep, P.W., Tavazoie, S. and Church, G.M. (2000) Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae. J Mol Biol 296: 1205–1214.
Hunt, G.J. and Page, R.E. (1995) Linkage map of the honey bee, Apis mellifera, based on RAPD markers. Genetics 139: 1371–1382.
Hunt, J.H., Buck, N.A. and Wheeler, D.E. (2003) Storage proteins in vespid wasps: characterization, developmental pattern, and occurrence in adults. J Insect Physiol 49: 785–794.
Judice, C.C., Carazzole, M.F., Festa, F., Sogayar, M.C., Hartfelder, K. and Pereira, G.A.G. (2006) Gene expression profiles underlying alternative caste phenotypes in a highly eusocial bee, Melipona quadrifasciata. Insect Mol Biol 15: 33–44.
Kerr, W.E. (1950) Genetic determination of castes in the genus Melipona. Genetics 35: 143–152.
Kropácová, S. and Haslbachová, H. (1971) The influence of queenlessness and of unsealed brood on the development of ovaries in worker honeybees. J Apicult Res 10: 57–61.
Lattorff, H.M.G., Moritz, R.F.A. and Fuchs, S. (2005) A single gene determines thelytokous parthenogenesis in honey bee workers (Apis mellifera capensis). Heredity 94: 533–537.
Lattorff, H.M.G., Moritz, R.F.A., Solignac, M. and Crewe, R.M. (2006) Control of reproductive dominance by the thelytoky gene in honeybees. Biol Lett (in press).
Li, T.R. and White, K.P. (2003) Tissue-specific gene expression and ecdysone-regulated genomic networks in Drosophila. Dev Cell 5: 59–72.
Lin, H., Yue, L. and Spradling, A.C. (1994) The Drosophila fusome, a germline-specific organelle, contains membrane skeletal proteins and functions in cyst formation. Development 120: 947–956.
Liu, X.S., Brutlag, D.L. and Liu, J.S. (2002) An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments. Nature Biotechnol 20: 835–839.
Liu, Y., Liu, S., Wei, L., Altman, R.B. and Batzoglou, S. (2004) Eukaryotic regulatory element conservation analysis and identification using comparative genomics. Genome Res 14: 354–366.
Martinez, T., Burmester, T., Veenstra, J.A. and Wheeler, D.E. (2001) Sequence and evolution of a hexamerin from the ant Camponotus festinatus. Insect Mol Biol 9: 427–431.
McKearin, D. (1997) The Drosophila fusome, organelle biogenesis and germ cell differentiation: if you build it … Bioessays 19: 147–152.
Mirth, C., Truman, J.W. and Riddiford, L.M. (2005) The role of the prothoracic gland in determining critical weight for metamorphosis in Drosophila melanogaster. Curr Biol 15: 1–12.
Moritz, R.F.A., Kryger, P. and Allsopp, M.H. (1996) Competition for royalty in bees. Nature 384: 31.
Page, R.E. and Erickson, E.H. (1988) Reproduction by worker honey bees (Apis mellifera L.). Behav Ecol Sociobiol 23: 117–126.
Pereboom, J.J.M., Jordan, W.C., Sumner, S., Hammond, R.L. and Bourke, A.F.G. (2005) Differential gene expression in queen-worker caste determination in bumble bees. Proc R Soc B 272: 1145–1152.
Perera, S.C., Zheng, S., Feng, Q.L., Krell, P.J., Retnakaran, A. and Palli, S.R. (2005) Heterodimerization of ecdysone receptor and ultraspiracle on symmetric and asymmetric response elements. Arch Insect Biochem Physiol 60: 55–70.
Piulachs, M.D., Guidugli, K.R., Barchuk, A.R., Cruz, J., Simões, Z.L.P. and Bellés, X. (2003) The vitellogenin of the honeybee, Apis mellifera: Structural analysis of the cDNA and expression studies. Insect Biochem Mol Biol 33: 459–465.
Pritsker, M., Liu, Y.C., Beer, M.A. and Tavazoie, S. (2004) Whole genome discovery of transcription factor binding sites by network-level conservation. Genome Res 14: 99–108.
Rachinsky, A., Strambi, C., Strambi, A. and Hartfelder, K. (1990) Caste and metamorphosis: hemolymph titers of juvenile hormone and ecdysteroids in last instar honeybee larvae. Gen Comp Endocr 79: 31–38.
Riddihough, G. and Pelham, H.R.B. (1987) An ecdysone response element in the Drosophila hsp27 promoter. EMBO J 6: 3729–3734.
Roth, F.P., Hughes, J.D., Estep, P.W. and Church, G.M. (1998) Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantification. Nature Biotechnol 16: 939–945.
Rutherford, K., Parkhill, J., Crook, J., Horsnell, T., Rice, P., Rajandream, M.A. and Barrell, B. (2000) Artemis: sequence visualisation and annotation. Bioinformatics 16: 944–945.
Schmidt-Capella, I.C. and Hartfelder, K. (1998) Juvenile hormone effect on DNA synthesis and apoptosis in caste-specific differentiation of the larval honey bee (Apis mellifera L.) ovary. J Insect Physiol 44: 385–391.
Snodgrass, R.E. (1956) Anatomy of the Honey Bee. Cornell University Press, Ithaca.
Solignac, M., Vautrin, D., Baudry, E., Mougel, F., Loiseau, A. and Cornuet, J.M. (2004) A microsatellite-based linkage map of the honeybee, Apis mellifera L. Genetics 167: 253–262.
St Johnston, D. and Nüsslein-Volhard, C. (1992) The origin of pattern and polarity in the Drosophila embryo. Cell 68: 201–219.
Staab, S. and Steinmann-Zwicky, M. (1995) Female germ cells of Drosophila require zygotic ovo and otu product for survival in larvae and pupae respectively. Mech Dev 54: 205–210.
Sullivan, A.A. and Thummel, C.S. (2003) Temporal profiles of nuclear receptor gene expression reveal coordinate transcriptional responses during Drosophila development. Mol Endocrinol 17: 2125–2137.
Wassermann, W.W. and Sandelin, A. (2004) Applied bioinformatics for the identification of regulatory elements. Nature Rev Genet 5: 267–287.
West-Eberhard, M.J. (2003) Developmental Plasticity and Evolution. Oxford University Press, Oxford.
Wheeler, D.E. and Nijhout, H.F. (2003) A perspective for understanding the modes of juvenile hormone as a lipid signaling system. Bioessays 25: 994–1001.
Wilson, E.O. (1971) The Insect Societies. Belknap Press of Harvard University Press, Cambridge, MA.
Wingender, E., Chen, X., Hehl, R., Karas, H., Liebich, I., Matys, V. et al. (2000) TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res 28: 316–319.
Zhou, X., Oi, F.M. and Scharf, M.E. (2006a) Social exploitation of hexamerin: RNAi reveals a major caste-regulatory factor in termites. Proc Natl Acad Sci USA 103: 4499–4504.
Zhou, X., Tarver, M.R., Bennett, G.W., Oi, F.M. and Scharf, M.E. (2006b) Two hexamerin genes from the termite Reticulitermes flavipes: sequence, expression, and proposed functions in caste regulation. Gene 376: 47–58.
Zhu, Z., Shendure, J. and Church, G.M. (2005) Discovering functional transcription factor combinations in the human cell cycle. Genome Res 15: 848–855.
Supplementary material
The following material is available for this article online:
Table S1. Annotation results of caste-specifically expressed honey bee genes.
Table S2. Caste-specific motifs in UCRs of honey bee genes.
Table S3. Annotation of honey bee orthologs to Drosophila melanogaster genes involved in Reproduction.
This material is available as part of the online article from http://www.blackwell-synergy.com
represents a major transition from nonhypothesis-driven high-throughput screens to hypothesis-driven searches for context-dependent gene expression in honey bees. Such directed search results can serve as a platform for experimental analyses of genome-wide integration in hormonal control of caste development in bees. In addition, this study exemplifies how existent algorithms for detecting shared regulatory motifs can be joined into a toolkit for predicting coregulated gene expression patterns in honey bees. These methods have been shown to be robust and are gaining acceptance for use in functional and comparative genomics (Liu et al., 2004; Pritsker et al., 2004; Zhu et al., 2005).
Oogenesis and reproduction
As caste development sets the stage for reproductive division of labour, genes involved in reproductive processes are strong candidates for functional analyses. In the present study we performed BLAST searches to identify honey bee orthologs for a list of 32 fly genes with the GO attribute ‘oogenesis’ and for four genes specifically related to ‘vitellogenesis’. The list of fly genes involved in nuclear events in germ cells consisted of 20 genes for ‘female meiosis’, 12 genes for ‘recombination’ and 21 genes under the heading ‘chromosome segregation, including segregation distortion’ (Supplementary material, Table S3). In some cases, these GO attributes for fly genes of course overlapped.
BLASTN and BLASTX searches for these fly genes against the honey bee genome assembly 3.0 and the GLEAN3 Official Set (aa) retrieved statistically well supported putative bee orthologs for most of these candidates. For the genes involved in meiosis, recombination and chromosome segregation this finding, although not unexpected, is of interest, as meiosis in the haploid honey bee drone is strongly modified when compared with a normal diploid meiosis. The first meiosis is initiated, but the nucleus remains undivided and only the superfluous centrioles are eliminated as cytoplasmic buds (Hoage & Kessel, 1968). An interesting gene, thelytoky (th), has recently been mapped in this context (Lattorff et al., 2005). It almost completely prevents meiotic recombination in the automixis of laying workers of the Cape honey bee. As an indication of the interplay between meiosis and later development, this locus also appears to be an integral part of various gene cascades involved in caste determination (Lattorff et al., 2006).
The fly genes retrieved in the GO searches for ‘oogenesis’ represent a much larger range of Molecular Function categories, such as transcription factors, proteins regulating translation by RNA binding, RNA helicases, enzymes (ubiquitination, transfer of sugar residues, sulfotransferase), GTPase activity, and several factors binding to cytoskeletal proteins (Supplementary material, Table S3). This wide range of functional categories is expected, as these genes are involved in a series of different steps during oogenesis in the polytrophic meroistic ovary. Oogenesis starts out with the maintenance of germline and somatic stem cell identity in the germline niche in the upper germarium. A key gene involved in this process is pumilio (Forbes & Lehmann, 1998), which is represented by a highly conserved bee ortholog, GB10504-PA. The second step is the formation of germ cell cysts, the determination of an oocyte within each cyst, and the survival of these cysts, involving genes such as benign gonial cell neoplasm (Lin et al., 1994), encore (Hawkins et al., 1996), ovo and ovarian tumour (otu) (Staab & Steinmann-Zwicky, 1995), all well conserved in the honey bee genome. Interestingly, we could not find a clear bee ortholog for bag of marbles (bam), which is one of the prime early response genes in the cystoblast differentiation pathway in Drosophila (McKearin, 1997).
Figure 4. Map of the group-specific motifs found in queen and worker UCRs of caste-specifically expressed genes. The coding region is represented by the GLEAN3 prediction number (assembly 4.0), with arrows indicating the translation start site. Asterisks mark UCRs of the ‘top10’ set used to find the over-represented motifs.
The third step comprises previtellogenic growth of the follicle, and during these stages a number of maternal factors that define the egg and the future embryonic axes are deposited and anchored either within the oocyte or in the perivitelline space (for review see St Johnston & Nüsslein-Volhard, 1992). In the list of Drosophila genes involved in early steps of axis determination, a couple of surprises came up in the search for honey bee orthologs. A big surprise was that we could not find a gurken ortholog in the bee, even though this gene sets up both the anterior–posterior and dorsal–ventral axes in the Drosophila egg (González-Reyes et al., 1995), whereas downstream components of the Gurken signalling cascade appear to be preserved in the bee genome (Honey Bee Genome Sequencing Consortium, 2006). Similar apparent gaps in constituents of patterning cascades were noted for the terminal regions of the embryo, such as a lack of a torso ortholog, whereas its ligand, torso-like, is represented by a well conserved ortholog in the bee (GB18663-PA).
With respect to genes involved in the final processes of oogenesis, we primarily looked at genes that play a part during vitellogenesis. There are four genes of interest in this class, the primary one coding for the yolk protein precursor, vitellogenin. This gene has already been sequenced for the honey bee (Piulachs et al., 2003) and, as expected, it is much more closely related to vitellogenins of other insects and even vertebrates than to the Drosophila yolk proteins, which apparently are derived from lipases (Hagedorn et al., 1998). The second gene of interest is the bee ortholog to yolkless, as this (GB16571-PA) could represent a putative vitellogenin receptor. The other two Drosophila genes with clear orthologs in the bee are CG18641 and CG12139, which code for a lipase and an LDL receptor, respectively.
General conclusions
The current analysis made use of previous experimental analyses on differential transcription during caste development of honey bee larvae. In the annotation of these genes, which includes references to Gene Ontology terms associated with their respective Drosophila orthologs, two major configurations emerged. First of all, worker genes were better defined in terms of GO attributes compared with the relatively large number of queen genes that had no GO terms associated with their respective Drosophila orthologs. Even when taking into consideration the conceptual limits in attributing GO terms on molecular function and biological process from Drosophila orthologs to honey bee genes, this finding could have a bearing on general basic questions in socioevolution, namely, which caste is more divergent from a nonsocial reproductive female bee prototype or reproductive ground plan, the queen or the worker? Less speculative is the second major conclusion coming out of the GO analysis for Molecular Function, showing and confirming (Eder et al., 1983; Corona et al., 1999) the important role of metabolic regulation in caste development. This facet is demonstrated especially clearly in the caste-specific expression of oxidoreductases (queen) vs. hydrolases (workers).
The honey bee genome information not only provided a much improved annotation platform for caste-specifically expressed ESTs but, even more so, opens the possibility to explore putative regulatory features of the honey bee genome. In the current study we employed modified Gibbs sampling and expectation-maximization algorithms (AlignACE, MDscan, MEME) to detect group-specific motifs in gene regions up to 1000 bp upstream of translation start sites. We detected 14 motifs that were significantly over-represented in the caste genes when compared with corresponding motifs found in a random set of GLEAN3-predicted honey bee genes. The localization of such motifs in UCRs of worker-overexpressed genes revealed a clustering of such motifs close to the predicted basal promoter regions, suggesting strong regulatory effects. Such search strategies and the detected motifs can provide the lead to reveal and unravel cis-regulatory networks for and within specific contexts of honey bee biology.
Caste polyphenism in social insects makes a strong case for the emergence of novelties at a microevolutionary level (West-Eberhard, 2003). Without the pretension to discuss exhaustively the mechanisms underlying this surge of developmental plasticity, two major themes become apparent in this and other studies. Regulatory change has been demonstrated in the shut-down of wing disc patterning cascades in ants (Abouheif & Wray, 2002), and is certainly also implicit in observed temporal changes in gene expression during postembryonic development of ants and bumble bees (Goodisman et al., 2005; Pereboom et al., 2005). Such change would be expected to involve cis-regulatory elements, that is, change in transcription factor binding sites in UCRs, as approached in this study, and also evolutionary change in response thresholds to circulating morphogenetic hormones (for review see Hartfelder & Emlen, 2005). The second, and quite unexpected, theme is the acquisition of new systemic functions by evolutionarily rather old proteins, such as vitellogenin and hexamerins. These apparently unspectacular proteins have evolved into key players for caste evolution and reproductive division of labour via novel regulatory connectivity with JH (Amdam et al., 2004; Guidugli et al., 2005; Zhou et al., 2006a,b).
Experimental procedures
Selection and annotation of ESTs representing differentially expressed genes in honey bee caste development
The starting point were 164 entries (mainly 3prime-ESTs) in GenBank(BG101532ndashBG101697) from an SSH library (Evans amp Wheeler1999 Evans amp Wheeler 2000) When validated by macroarrayanalyses a clustering into three major classes became apparent(I) genes overexpressed in young larvae (II) genes overexpressedin last instar queen larvae and (III) genes overexpressed in lastinstar worker larvae For this study we excluded the class I ESTsbecause their expression is not caste-specific but rather representsexpression differences between young (still bipotent) and olderlarvae To the class II queen ESTs (82) we added one completecDNA entry (AY601642) from a DDRT-PCR screen (Corona et al1999) and to the class III set of worker ESTs (40) we added sevenGenBank dbEST entries (BG149167ndashBG149173) from a DDRT-PCR screen on ovary development (Hepperle amp Hartfelder 2001)
The EST sequences were submitted to BLASTN searches(parameters -G 2 -E 3 -W 15 -F lsquom Drsquo -U -e 1e-20) against genomesequence assembly Amel_v30 to retrieve matches in linked orunlinked genomic contigs and to exclude no-matches (seven ESTsin queen) ESTs that aligned within the same scaffold werechecked for clustering and overlap This clustering also served toexclude genes that were represented by non-overlapping ESTsfrom both castes This procedure generated a set of 51 uniqueputative gene sequences overexpressed in either queen (34) or inworker larvae (17) These 51 nonredundant sequences weresubmitted to BLASTX searches against the Official Set of GLEAN3-predicted protein sequences (cut-off value at 1eminus20) For ESTs withno significant protein sequence matches the genomic regionsadjacent to the mapped EST were searched to find neighbouringORFs especially those nearest to putative 3prime UTRs of predictedproteins as the EST libraries have a bias in this direction
Official Set protein sequences were aligned against Amel_v30sequence assembly using TBLASTN to map protein to genomeand subsequently they were aligned using BLASTP against theGenBank nonredundant (nr) and the Flybase protein sequencedatabases The manual features annotation procedure of theArtemis 70 program (Rutherford et al 2000) was used to mapORFs putative splice sites of exons and ESTs to genome coordi-nates The final annotation file was generated with a Python scriptin GFF format (httpwwwsangeracukSoftwareformatsGFF)
Honey bee sequences annotated as orthologs to D. melanogaster genes were putatively assigned the GO terms listed in the respective Flybase entry. In addition, the definition of new GO terms (Biological Process ontology) related to caste development and polyphenism (GO:0048651 and GO:0048650, respectively) was co-ordinated with the Gene Ontology Consortium (Ashburner et al., 2000). The FatiGO web tool (Al-Shahrour et al., 2004) was used to cluster GO terms (level 3 setting) for Biological Process and Molecular Function.
For the detection of conserved domains, the 51 protein sequences were screened against the Pfam database (http://www.sanger.ac.uk/Software/Pfam) using the HMMER platform (current release 2.3.2; http://hmmer.wustl.edu) with a cut-off value set at 1e−10.
Annotation of oogenesis and reproduction genes
In order to identify putative honey bee orthologs to D. melanogaster genes, we searched the following GO terms in Flybase: ‘oogenesis’ (GO:0009993), ‘vitellogenesis’ (GO:0007296), ‘female meiosis’ (GO:0007143), ‘DNA recombination’ (GO:0006310) and ‘chromosome segregation’ (GO:0007059). Genes related to segregation distortion were searched for in Flybase in phenotypic descriptions and mutant effects of D. melanogaster genes, as this phenomenon is not represented by a GO term. Hence, this group may be more heterogeneous than the others. From this list we removed genes of pleiotropic function (multifaceted GO entries in Biological Process) and genes that lacked defined transcripts in the Drosophila genome database.
For the GO terms ‘oogenesis’ and ‘vitellogenesis’ we performed TBLASTN and BLASTP searches for 42 fruit fly genes against the Amel_v3.0 genome assembly and the GLEAN3-predicted protein sequences (Honey Bee Genome Sequencing Consortium, 2006), respectively. The orthologous D. melanogaster gene was characterized by the same procedure as described above (reciprocal best hit). For the GO terms ‘female meiosis’, ‘DNA recombination’, ‘chromosome segregation’ and the non-GO group ‘segregation distortion’, transcripts of D. melanogaster were searched against nr databases at NCBI using BLASTP. The obtained sequences were searched against the Amel_v3.0 genome assembly and the GLEAN3-predicted protein sequences using TBLASTN. Homologous sequences (threshold 1e−10) were predicted using the BioEdit software (Hall, 1999). ORFs showing significant homology (BLASTP threshold 1e−20) were assembled and used in BLASTP searches against the nr databases at NCBI.
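The reciprocal best hit criterion mentioned above can be sketched in Python as follows. This is an illustrative reading of the criterion, not the authors' pipeline: the input format (lists of `(query, subject, evalue)` rows from two opposite searches), the function names, the default cut-off and the identifiers `GB99999` and `GB11111` are all assumptions for the example.

```python
def best_hits(rows, max_evalue):
    """Map each query to its lowest-e-value subject at or below max_evalue.

    rows: iterable of (query, subject, evalue) tuples.
    """
    best = {}
    for query, subject, evalue in rows:
        if evalue > max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(fly_vs_bee, bee_vs_fly, max_evalue=1e-10):
    """Return {fly_gene: bee_gene} pairs that are each other's best hit."""
    forward = best_hits(fly_vs_bee, max_evalue)
    backward = best_hits(bee_vs_fly, max_evalue)
    return {fly: bee for fly, bee in forward.items()
            if backward.get(bee) == fly}

# Made-up e-values; only pumilio/GB10504 is a pairing named in the text.
fly_vs_bee = [("pumilio", "GB10504", 1e-80),
              ("pumilio", "GB99999", 1e-15),
              ("bam", "GB11111", 1e-3)]
bee_vs_fly = [("GB10504", "pumilio", 1e-75),
              ("GB99999", "nanos", 1e-30)]
example_orthologs = reciprocal_best_hits(fly_vs_bee, bee_vs_fly)
```

In this toy input the weak `bam` hit is discarded by the e-value cut-off, so no ortholog is reported for it.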
Motif search in upstream regions in caste-specifically expressed genes
In order to detect overrepresented motifs in the upstream control regions (UCRs) of the two sets of caste-related genes, we selected gene subsets based on two criteria: (1) those that had shown the highest caste-specificity in the array analyses (Evans & Wheeler, 2000) and (2) those that had a conserved 5′ region when compared with the Drosophila orthologs. These ‘top10’ genes consisted of six queen genes (GB13072, GB11628, GB19380, GB14798, GB16047 and GB18242) and six worker genes (GB10869, GB12371, GB12239, GB10428, GB19006 and GB14758). The motif search was conducted separately on the two sets of UCR sequences using three methods: AlignACE (Roth et al., 1998), MEME (Bailey & Elkan, 1995) and MDscan (Liu et al., 2002). Default parameter values were used in all searches, except that GC content in intergenic regions was set to 25%, representing the background value established for the honey bee UCR database generated in this study.
The database containing 10156 UCR sequences was generated by parsing the Official Set annotation file (downloaded in GFF format from httpwwwbeegenomehgscbcmtmcedubeeftphtml) to extract upstream regions starting from the terminal 5′-genomic coordinate of each predicted CDS. The UCRs were arbitrarily set to a size frame of 1000 nucleotides (Roth et al., 1998) but were trimmed whenever another predicted ORF was detected in any of these regions.
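The extraction rule described above (up to 1000 nucleotides upstream of the CDS, trimmed at any neighbouring ORF) might be expressed as in the following Python sketch. It is one plausible interpretation under stated assumptions: features are `(start, end, strand)` tuples with 1-based inclusive coordinates on a single scaffold, and `upstream_region` is a hypothetical name, not a function from the authors' code.

```python
def upstream_region(gene, others, scaffold_length, max_len=1000):
    """Return (start, end) of the upstream region of a gene, or None.

    gene: (start, end, strand) of the predicted CDS.
    others: list of (start, end, strand) for other predicted ORFs.
    The window is shortened so that it never includes another ORF.
    """
    start, end, strand = gene
    if strand == "+":
        ucr_start, ucr_end = max(1, start - max_len), start - 1
        for o_start, o_end, _ in others:
            if o_start <= ucr_end and o_end >= ucr_start:
                ucr_start = max(ucr_start, o_end + 1)
    else:
        # On the minus strand, upstream lies at higher coordinates.
        ucr_start, ucr_end = end + 1, min(scaffold_length, end + max_len)
        for o_start, o_end, _ in others:
            if o_start <= ucr_end and o_end >= ucr_start:
                ucr_end = min(ucr_end, o_start - 1)
    if ucr_start > ucr_end:
        return None
    return ucr_start, ucr_end

# Made-up coordinates on a 10 kb scaffold.
plus_ucr = upstream_region((5000, 6000, "+"), [(4200, 4500, "-")], 10000)
minus_ucr = upstream_region((5000, 6000, "-"), [(6300, 6800, "+")], 10000)
```

The returned coordinates would then be used to slice the scaffold sequence, reverse-complementing it for minus-strand genes.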
The MAP (maximum a priori log likelihood) score, group specificity score (called Church score in this manuscript) (Hughes et al., 2000), ROC AUC (area under the curve for a receiver-operator characteristic plot) metric and MNCP (mean normalized conditional probability) metric (Clarke & Granek, 2003) were used to detect motifs that most likely correspond to biologically significant cis-regulatory elements. The filters run on the UCRs of the subsets of queen and worker genes were a MAP score cut-off value of 50, followed by a ROC AUC cut-off at 0.7, followed by a group specificity score cut-off at 1e−05. The UCR database of all GLEAN3-predicted honey bee genes was used as the background to calculate these metrics.
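The sequential filtering described above can be sketched in Python. The ROC AUC is computed here by its pairwise definition (the probability that a sequence from the gene set scores higher than a background sequence, with ties counted as one half), which is a generic formulation and not necessarily how TAMO computes it. The dictionary keys, function names and the direction of each comparison (at least the cut-off for MAP and AUC, at most the cut-off for the specificity score) are assumptions; the MAP cut-off is left as a required argument rather than hard-coded.

```python
def roc_auc(positive_scores, negative_scores):
    """Pairwise ROC AUC: P(positive > negative) + 0.5 * P(tie)."""
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

def passes_filters(motif, map_cutoff, auc_cutoff=0.7,
                   specificity_cutoff=1e-5):
    """Apply the three cut-offs in sequence to one motif record.

    motif: dict with hypothetical keys "map", "roc_auc", "specificity".
    """
    return (motif["map"] >= map_cutoff
            and motif["roc_auc"] >= auc_cutoff
            and motif["specificity"] <= specificity_cutoff)

# Toy motif scores for gene-set UCRs versus background UCRs.
example_auc = roc_auc([3.0, 4.0], [1.0, 2.0])
```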
A parametric test (MANOVA) and a nonparametric test (Kolmogorov–Smirnov) were conducted to identify significance levels for the two sets of filtered motifs found in the UCRs of caste-specifically expressed genes against filtered motifs found in the random UCR set. Random motifs were sampled from a motif database (10 391 motifs) generated by running our script 100 times with random sets of UCR sequences.
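For readers unfamiliar with the nonparametric comparison, the two-sample Kolmogorov–Smirnov statistic is the largest vertical distance between the empirical cumulative distribution functions of the two samples. The Python sketch below computes only this statistic, not a P-value, and is a generic textbook formulation rather than the authors' implementation; `ks_statistic` is a name chosen for this example.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D (maximum ECDF distance)."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The ECDFs only change at observed values, so checking those suffices.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

# Toy score samples, not values from the study.
example_d = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
```

In practice a statistics library (for example `scipy.stats.ks_2samp`) would also supply the associated significance level.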
The main criterion for identifying known regulatory motifs among these caste-specific ones was the alignment of the PSSM for each bee motif with the Drosophila melanogaster sequences in the TRANSFAC database (release 4.0) (Wingender et al., 2000). Only the alignments passing a threshold of 80% identity for each PSSM were considered as significant matches. In addition, we checked for a specific binding motif, the EcR/USP motif (rGkTCAaTGamcy-3′), known to function in the expression of genes responding to morphogenetic hormone titres (Perera et al., 2005).
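A direct check for a degenerate consensus such as the EcR/USP motif can be sketched by translating IUPAC codes into a regular expression. In this Python example the lower-case letters of the consensus are treated as ordinary IUPAC codes (r = A/G, k = G/T, m = A/C, y = C/T), which is an assumption, and only the given strand is scanned; the function names are hypothetical and the test sequence is invented.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}

def motif_to_regex(motif):
    """Compile an IUPAC consensus into a regex that allows overlaps."""
    body = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead reports every start position, even for overlapping hits.
    return re.compile("(?=(" + body + "))")

def find_motif(sequence, motif):
    """Return 0-based start positions of motif matches in sequence."""
    pattern = motif_to_regex(motif)
    return [m.start() for m in pattern.finditer(sequence.upper())]

ECR_USP = "rGkTCAaTGamcy"
example_positions = find_motif("TTTAGGTCAATGACCCTTT", ECR_USP)
```

A complete scan of UCRs would repeat the search on the reverse complement of each sequence.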
Operating system and programming tools
An Ubuntu Linux (version Breezy) operating system was used to implement all scripts and pipelines designed for annotation procedures and motif discovery. The Python programming language (http://www.python.org), Biopython (http://www.biopython.org) and TAMO (Tools for Analysis of Motifs) packages (Gordon et al., 2005) were used in program design. Other web applications were built using the Zope application server (http://www.zope.org), hosted at http://zulu.fmrp.usp.br/beelab.
Acknowledgements
This work was supported by grants from FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo, 9900719-6) and CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico, 4729632-3-1).
References
Abouheif, E. and Wray, G.A. (2002) Evolution of the genetic network underlying wing polyphenism in ants. Science 297: 249–252.
Al-Shahrour, F., Diaz-Uriarte, R. and Dopazo, J. (2004) FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes. Bioinformatics 20: 578–580.
Amdam, G.V., Norberg, K., Hagen, A. and Omholt, S.W. (2003) Social exploitation of vitellogenin. Proc Natl Acad Sci USA 100: 1799–1802.
Amdam, G.V., Simões, Z.L.P., Hagen, A., Norberg, K., Schroder, K., Mikkelsen, O., et al. (2004) Hormonal control of the yolk precursor vitellogenin regulates immune function and longevity in honeybees. Exp Gerontol 39: 767–773.
Antoniewski, C., O'Grady, M.S., Edmondson, R.G., Lassieur, S.M. and Benes, H. (1995) Characterization of an EcR/USP heterodimer target site that mediates ecdysone responsiveness of the Drosophila Lsp-2 gene. Mol Gen Genet 249: 545–556.
Arbeitman, M.N., Furlong, E.E.M., Imam, F., Johnson, E., Null, B.H., Baker, B.S., et al. (2002) Gene expression during the life cycle of Drosophila melanogaster. Science 297: 2270–2275.
Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., et al. (2000) Gene Ontology: tool for the unification of biology. Nature Genet 25: 25–29.
Bailey, T.L. and Elkan, C. (1995) Unsupervised learning of multiple motifs in biopolymers using expectation maximization. Machine Learning J 21: 58–83.
Bergem, M., Norberg, K. and Aamodt, R.M. (2006) Long-term maintenance of in vitro cultured honeybee (Apis mellifera) embryonic cells. BMC Dev Biol 6: 17 (online).
Boutros, M., Kiger, A.A., Armknecht, S., Kerr, K., Hild, M., Koch, B., et al. (2004) Genome-wide RNAi analysis of growth and viability in Drosophila cells. Science 303: 832–835.
Carsten, L.D., Watts, T. and Markow, T.A. (2005) Gene expression patterns accompanying a dietary switch in Drosophila melanogaster. Mol Ecol 14: 3203–3208.
Clarke, N.D. and Granek, J.A. (2003) Rank order metrics for quantifying the association of sequence features with gene regulation. Bioinformatics 19: 212–218.
Colombani, J., Bianchini, L., Layalle, S., Pondeville, E., Dauphin-Villemant, C., Antoniewski, C., et al. (2005) Antagonistic actions of ecdysone and insulins determine final size in Drosophila. Science 310: 667–670.
Corona, M., Estrada, E. and Zurita, M. (1999) Differential expression of mitochondrial genes between queens and workers during caste determination in the honeybee Apis mellifera. J Exp Biol 202: 929–938.
Cunha, A.C., Nascimento, A.M., Guidugli, K.R., Simões, Z.L.P. and Bitondi, M.M.G. (2005) Molecular cloning and expression of a hexamerin cDNA from the honey bee. J Insect Physiol 51: 1135–1147.
Darwin, C.R. (1859) On the Origin of Species by Means of Natural Selection. John Murray, London.
Davidson, E.H. (2001) Genomic Regulatory Systems: Development and Evolution. Academic Press, San Diego.
Eder, J., Kremer, J.P. and Rembold, H. (1983) Correlation of cytochrome c titer and respiration in Apis mellifera: adaptive responses to caste determination defines workers, intercastes and queens. Comp Biochem Physiol B 76: 703–716.
Engels, W. (1974) Occurrence and significance of vitellogenins in female castes of social Hymenoptera. Am Zool 14: 1229–1237.
Evans, J.D. and Wheeler, D.E. (1999) Differential gene expression between developing queens and workers in the honey bee, Apis mellifera. Proc Natl Acad Sci USA 96: 5575–5580.
Evans, J.D. and Wheeler, D.E. (2000) Expression profiles during honeybee caste determination. Genome Biol 2: 1, 6 pages, online: http://genomebiology.com/2000/2/1/research/0001.1
Forbes, A. and Lehmann, R. (1998) Nanos and Pumilio have critical roles in the development and function of Drosophila germline stem cells. Development 125: 679–690.
González-Reyes, A., Elliot, H. and St Johnston, D. (1995) Polarization of both major body axes in Drosophila by gurken-torpedo signalling. Nature 375: 654–658.
Goodisman, M.A., Isoe, J., Wheeler, D.E. and Wells, M.A. (2005) Evolution of insect metamorphosis: a microarray-based study of larval and adult gene expression in the ant Camponotus festinatus. Evolution 59: 858–870.
Gordon, D.B., Nekludova, L., McCallum, S. and Fraenkel, E. (2005) TAMO: a flexible, object-oriented framework for analysing transcriptional regulation using DNA-sequence motifs. Bioinformatics 21: 3164–3165.
Guidugli, K.R., Hepperle, C. and Hartfelder, K. (2004) A member of the short-chain dehydrogenase/reductase (SDR) superfamily is a target of the ecdysone response in honey bee (Apis mellifera) caste development. Apidologie 35: 37–47.
Guidugli, K.R., Nascimento, A.M., Amdam, G.V., Barchuk, A.R., Omholt, S.W., Simões, Z.L.P. and Hartfelder, K. (2005) Vitellogenin regulates hormonal dynamics in the worker caste of a eusocial insect. FEBS Lett 579: 4961–4965.
Hagedorn, H.H., Maddison, D.R. and Tu, Z. (1998) The evolution of vitellogenins, cyclorrhaphan yolk proteins and related molecules. Adv Insect Physiol 27: 335–384.
Hall, T.A. (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41: 95–98.
Hamilton, W.D. (1964) The genetical theory of social behaviour I & II. J Theor Biol 7: 1–16, 17–52.
Hartfelder, K. and Emlen, D.J. (2005) Endocrine control of insect polyphenism. In Comprehensive Insect Molecular Science (Gilbert, L.I., Iatrou, K. and Gill, S., eds), Vol. 3, pp. 651–703. Elsevier, Oxford.
Hartfelder, K. and Engels, W. (1998) Social insect polymorphism: Hormonal regulation of plasticity in development and reproduction in the honeybee. Curr Topics Dev Biol 40: 45–77.
Hawkins, N.C., Thorpe, J. and Schüpbach, T. (1996) encore, a gene required for the regulation of germ line mitosis and oocyte differentiation during Drosophila oogenesis. Development 122: 281–290.
Haydak, M.H. (1970) Honey bee nutrition. Annu Rev Entomol 15: 143–156.
Hepperle, C. and Hartfelder, K. (2001) Differentially expressed regulatory genes in honey bee caste development. Naturwissenschaften 88: 113–116.
Hoage, T.R. and Kessel, R.G. (1968) An electron microscopic study of the process of differentiation during spermatogenesis in the drone honey bee (Apis mellifera L.) with special reference to centriole replication and elimination. J Ultrastruct Res 24: 6–32.
Honey Bee Genome Sequencing Consortium (2006) Insights into social insects from the genome of the honeybee Apis mellifera. Nature (in press).
Hughes, J.D., Estep, P.W., Tavazoie, S. and Church, G.M. (2000) Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae. J Mol Biol 296: 1205–1214.
Hunt, G.J. and Page, R.E. (1995) Linkage map of the honey bee, Apis mellifera, based on RAPD markers. Genetics 139: 1371–1382.
Hunt, J.H., Buck, N.A. and Wheeler, D.E. (2003) Storage proteins in vespid wasps: characterization, developmental pattern, and occurrence in adults. J Insect Physiol 49: 785–794.
Judice, C.C., Carazzole, M.F., Festa, F., Sogayar, M.C., Hartfelder, K. and Pereira, G.A.G. (2006) Gene expression profiles underlying alternative caste phenotypes in a highly eusocial bee, Melipona quadrifasciata. Insect Mol Biol 15: 33–44.
Kerr, W.E. (1950) Genetic determination of castes in the genus Melipona. Genetics 35: 143–152.
Kropácová, S. and Haslbachová, H. (1971) The influence of queenlessness and of unsealed brood on the development of ovaries in worker honeybees. J Apicult Res 10: 57–61.
Lattorff, H.M.G., Moritz, R.F.A. and Fuchs, S. (2005) A single gene determines thelytokous parthenogenesis in honey bee workers (Apis mellifera capensis). Heredity 94: 533–537.
Lattorff, H.M.G., Moritz, R.F.A., Solignac, M. and Crewe, R.M. (2006) Control of reproductive dominance by the thelytoky gene in honeybees. Biol Lett (in press).
Li, T.R. and White, K.P. (2003) Tissue-specific gene expression and ecdysone-regulated genomic networks in Drosophila. Dev Cell 5: 59–72.
Lin, H., Yue, L. and Spradling, A.C. (1994) The Drosophila fusome, a germline-specific organelle, contains membrane skeletal proteins and functions in cyst formation. Development 120: 947–956.
Liu, X.S., Brutlag, D.L. and Liu, J.S. (2002) An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments. Nature Biotechnol 20: 835–839.
Liu, Y., Liu, S., Wei, L., Altman, R.B. and Batzoglou, S. (2004) Eukaryotic regulatory element conservation analysis and identification using comparative genomics. Genome Res 14: 354–366.
Martinez, T., Burmester, T., Veenstra, J.A. and Wheeler, D.E. (2001) Sequence and evolution of a hexamerin from the ant Camponotus festinatus. Insect Mol Biol 9: 427–431.
McKearin, D. (1997) The Drosophila fusome, organelle biogenesis and germ cell differentiation: if you build it… Bioessays 19: 147–152.
Mirth, C., Truman, J.W. and Riddiford, L.M. (2005) The role of the prothoracic gland in determining critical weight for metamorphosis in Drosophila melanogaster. Curr Biol 15: 1–12.
Moritz, R.F.A., Kryger, P. and Allsopp, M.H. (1996) Competition for royalty in bees. Nature 384: 31.
Page, R.E. and Erickson, E.H. (1988) Reproduction by worker honey bees (Apis mellifera L.). Behav Ecol Sociobiol 23: 117–126.
Pereboom, J.J.M., Jordan, W.C., Sumner, S., Hammond, R.L. and Bourke, A.F.G. (2005) Differential gene expression in queen-worker caste determination in bumble bees. Proc R Soc B 272: 1145–1152.
Perera, S.C., Zheng, S., Feng, Q.L., Krell, P.J., Retnakaran, A. and Palli, S.R. (2005) Heterodimerization of ecdysone receptor and ultraspiracle on symmetric and asymmetric response elements. Arch Insect Biochem Physiol 60: 55–70.
Piulachs, M.D., Guidugli, K.R., Barchuk, A.R., Cruz, J., Simões, Z.L.P. and Bellés, X. (2003) The vitellogenin of the honeybee, Apis mellifera. Structural analysis of the cDNA and expression studies. Insect Biochem Mol Biol 33: 459–465.
Pritsker, M., Liu, Y.C., Beer, M.A. and Tavazoie, S. (2004) Whole genome discovery of transcription factor binding sites by network-level conservation. Genome Res 14: 99–108.
Rachinsky, A., Strambi, C., Strambi, A. and Hartfelder, K. (1990) Caste and metamorphosis – hemolymph titers of juvenile hormone and ecdysteroids in last instar honeybee larvae. Gen Comp Endocr 79: 31–38.
Riddihough, G. and Pelham, H.R.B. (1987) An ecdysone response element in the Drosophila hsp27 promoter. EMBO J 6: 3729–3734.
Roth, F.P., Hughes, J.D., Estep, P.W. and Church, G.M. (1998) Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantification. Nature Biotechnol 16: 939–945.
Rutherford, K., Parkhill, J., Crook, J., Horsnell, T., Rice, P., Rajandream, M.A. and Barrell, B. (2000) Artemis: sequence visualisation and annotation. Bioinformatics 16: 944–945.
Schmidt-Capella, I.C. and Hartfelder, K. (1998) Juvenile hormone effect on DNA synthesis and apoptosis in caste-specific differentiation of the larval honey bee (Apis mellifera L.) ovary. J Insect Physiol 44: 385–391.
Snodgrass, R.E. (1956) Anatomy of the Honey Bee. Cornell University Press, Ithaca.
Solignac, M., Vautrin, D., Baudry, E., Mougel, F., Loiseau, A. and Cornuet, J.M. (2004) A microsatellite-based linkage map of the honeybee, Apis mellifera L. Genetics 167: 253–262.
St Johnston, D. and Nüsslein-Volhard, C. (1992) The origin of pattern and polarity in the Drosophila embryo. Cell 68: 201–219.
Staab, S. and Steinmann-Zwicky, M. (1995) Female germ cells of Drosophila require zygotic ovo and otu product for survival in larvae and pupae respectively. Mech Dev 54: 205–210.
Sullivan, A.A. and Thummel, C.S. (2003) Temporal profiles of nuclear receptor gene expression reveal coordinate transcriptional responses during Drosophila development. Mol Endocrinol 17: 2125–2137.
Wasserman, W.W. and Sandelin, A. (2004) Applied bioinformatics for the identification of regulatory elements. Nature Rev Genet 5: 267–287.
West-Eberhard, M.J. (2003) Developmental Plasticity and Evolution. Oxford University Press, Oxford.
Wheeler, D.E. and Nijhout, H.F. (2003) A perspective for understanding the modes of juvenile hormone as a lipid signaling system. Bioessays 25: 994–1001.
Wilson, E.O. (1971) The Insect Societies. Belknap Press of Harvard University Press, Cambridge, MA.
Wingender, E., Chen, X., Hehl, R., Karas, H., Liebich, I., Matys, V., et al. (2000) TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res 28: 316–319.
Zhou, X., Oi, F.M. and Scharf, M.E. (2006a) Social exploitation of hexamerin: RNAi reveals a major caste-regulatory factor in termites. Proc Natl Acad Sci USA 103: 4499–4504.
Zhou, X., Tarver, M.R., Bennett, G.W., Oi, F.M. and Scharf, M.E. (2006b) Two hexamerin genes from the termite Reticulitermes flavipes: sequence, expression, and proposed functions in caste regulation. Gene 376: 47–58.
Zhu, Z., Shendure, J. and Church, G.M. (2005) Discovering functional transcription factor combinations in the human cell cycle. Genome Res 15: 848–855.
Supplementary material
The following material is available for this article online:
Table S1. Annotation results of caste-specifically expressed honey bee genes.
Table S2. Caste-specific motifs in UCRs of honey bee genes.
Table S3. Annotation of honey bee orthologs to Drosophila melanogaster genes involved in Reproduction.
This material is available as part of the online article from http://www.blackwell-synergy.com
translation by RNA binding, RNA helicases, enzymes (ubiquitination, transfer of sugar residues, sulfotransferase), GTPase activity, and several factors binding to cytoskeletal proteins (Supplementary material, Table S3). This wide range of functional categories is expected, as these genes are involved in a series of different steps during oogenesis in the polytrophic meroistic ovary. Oogenesis starts out with the maintenance of germline and somatic stem cell identity in the germline niche in the upper germarium. A key gene involved in this process is pumilio (Forbes & Lehmann, 1998), which is represented by a highly conserved bee ortholog, GB10504-PA. The second step is the formation of germ cell cysts, the determination of an oocyte within each cyst, and the survival of these cysts, involving genes such as benign gonial cell neoplasm (Lin et al., 1994), encore (Hawkins et al., 1996), ovo and ovarian tumour (otu) (Staab & Steinmann-Zwicky, 1995), all well conserved in the honey bee genome. Interestingly, we could not find a clear bee ortholog for bag of marbles (bam), which is one of the prime early response genes in the cystoblast differentiation pathway in Drosophila (McKearin, 1997).
The third step comprises previtellogenic growth of the follicle. During these stages a number of maternal factors that define the egg and the future embryonic axes are deposited and anchored either within the oocyte or in the perivitelline space (for review see St Johnston & Nüsslein-Volhard, 1992). In the list of Drosophila genes involved in early steps of axis determination, a couple of surprises came up in the search for honey bee orthologs. A big surprise was that we could not find a gurken ortholog in the bee, even though this gene sets up both the anterior–posterior and dorsal–ventral axes in the Drosophila egg (González-Reyes et al., 1995), whereas downstream components of the Gurken signalling cascade appear to be preserved in the bee genome (Honey Bee Genome Sequencing Consortium, 2006). Similar apparent gaps in constituents of patterning cascades were noted for the terminal regions of the embryo, such as the lack of a torso ortholog, whereas its ligand torso-like is represented by a well conserved ortholog in the bee (GB18663-PA).
With respect to genes involved in the final processes of oogenesis, we primarily looked at genes that play a part during vitellogenesis. There are four genes of interest in this class, the primary one coding for the yolk protein precursor vitellogenin. This gene has already been sequenced for the honey bee (Piulachs et al., 2003) and, as expected, it is much more closely related to vitellogenins of other insects and even vertebrates than to the Drosophila yolk proteins, which apparently are derived from lipases (Hagedorn et al., 1998). The second gene of interest is the bee ortholog to yolkless, as this (GB16571-PA) could represent a putative vitellogenin receptor. The other two Drosophila genes with clear orthologs in the bee are CG18641 and CG12139, which code for a lipase and an LDL receptor, respectively.
General conclusions
The current analysis made use of previous experimental analyses on differential transcription during caste development of honey bee larvae. In the annotation of these genes, which includes references to Gene Ontology terms associated with their respective Drosophila orthologs, two major patterns emerged. First of all, worker genes were better defined in terms of GO attributes compared with the relatively large number of queen genes that had no GO terms associated with their respective Drosophila orthologs. Even when taking into consideration the conceptual limits in attributing GO terms on molecular function and biological process from Drosophila orthologs to honey bee genes, this finding could have a bearing on general basic questions in socioevolution, namely which caste is more divergent from a nonsocial reproductive female bee prototype or reproductive ground plan, the queen or the worker. Less speculative is the second major conclusion coming out of the GO analysis for Molecular Function, showing and confirming (Eder et al., 1983; Corona et al., 1999) the important role of metabolic regulation in caste development. This facet is demonstrated especially clearly in the caste-specific expression of oxidoreductases (queen) vs. hydrolases (workers).
The honey bee genome information not only provided a much improved annotation platform for caste-specifically expressed ESTs but, even more so, opens the possibility to explore putative regulatory features of the honey bee genome. In the current study we employed modified Gibbs sampling and expectation-maximization algorithms (AlignACE, MDscan, MEME) to detect group-specific motifs in gene regions up to 1000 bp upstream of translation start sites. We detected 14 motifs that were significantly overrepresented in the caste genes when compared with corresponding motifs found in a random set of GLEAN3-predicted honey bee genes. The localization of such motifs in UCRs of worker-overexpressed genes revealed a clustering close to the predicted basal promoter regions, suggesting strong regulatory effects. Such search strategies and the detected motifs can provide a lead for unravelling cis-regulatory networks within specific contexts of honey bee biology.
Caste polyphenism in social insects makes a strong case for the emergence of novelties at a microevolutionary level (West-Eberhard, 2003). Without the pretension to discuss exhaustively the mechanisms underlying this surge of developmental plasticity, two major themes become apparent in this and other studies. The first is regulatory change, which has been demonstrated in the shut-down of wing disc patterning cascades in ants (Abouheif & Wray, 2002) and is certainly also implicit in observed temporal changes in gene expression during postembryonic development of ants and bumble bees (Goodisman et al., 2005; Pereboom et al., 2005). Such change would be expected to involve cis-regulatory elements, that is, change in transcription factor binding sites in UCRs, as approached in this study, and also evolutionary change in response thresholds to circulating morphogenetic hormones (for review see Hartfelder & Emlen, 2005). The second and quite unexpected theme is the acquisition of new systemic functions by evolutionarily rather old proteins, such as vitellogenin and hexamerins. These apparently unspectacular proteins have evolved into key players for caste evolution and reproductive division of labour via novel regulatory connectivity with JH (Amdam et al., 2004; Guidugli et al., 2005; Zhou et al., 2006a,b).
Experimental procedures
Selection and annotation of ESTs representing differentially expressed genes in honey bee caste development
The starting point were 164 entries (mainly 3prime-ESTs) in GenBank(BG101532ndashBG101697) from an SSH library (Evans amp Wheeler1999 Evans amp Wheeler 2000) When validated by macroarrayanalyses a clustering into three major classes became apparent(I) genes overexpressed in young larvae (II) genes overexpressedin last instar queen larvae and (III) genes overexpressed in lastinstar worker larvae For this study we excluded the class I ESTsbecause their expression is not caste-specific but rather representsexpression differences between young (still bipotent) and olderlarvae To the class II queen ESTs (82) we added one completecDNA entry (AY601642) from a DDRT-PCR screen (Corona et al1999) and to the class III set of worker ESTs (40) we added sevenGenBank dbEST entries (BG149167ndashBG149173) from a DDRT-PCR screen on ovary development (Hepperle amp Hartfelder 2001)
The EST sequences were submitted to BLASTN searches(parameters -G 2 -E 3 -W 15 -F lsquom Drsquo -U -e 1e-20) against genomesequence assembly Amel_v30 to retrieve matches in linked orunlinked genomic contigs and to exclude no-matches (seven ESTsin queen) ESTs that aligned within the same scaffold werechecked for clustering and overlap This clustering also served toexclude genes that were represented by non-overlapping ESTsfrom both castes This procedure generated a set of 51 uniqueputative gene sequences overexpressed in either queen (34) or inworker larvae (17) These 51 nonredundant sequences weresubmitted to BLASTX searches against the Official Set of GLEAN3-predicted protein sequences (cut-off value at 1eminus20) For ESTs withno significant protein sequence matches the genomic regionsadjacent to the mapped EST were searched to find neighbouringORFs especially those nearest to putative 3prime UTRs of predictedproteins as the EST libraries have a bias in this direction
Official Set protein sequences were aligned against Amel_v30sequence assembly using TBLASTN to map protein to genomeand subsequently they were aligned using BLASTP against theGenBank nonredundant (nr) and the Flybase protein sequencedatabases The manual features annotation procedure of theArtemis 70 program (Rutherford et al 2000) was used to mapORFs putative splice sites of exons and ESTs to genome coordi-nates The final annotation file was generated with a Python scriptin GFF format (httpwwwsangeracukSoftwareformatsGFF)
Honey bee sequences annotated as orthologs to D mela-nogaster genes were putatively assigned the GO terms listed in
the respective Flybase entry In addition the definition of new GOterms (Biological Process ontology) related to caste developmentand polyphenism (GO0048651 and GO0048650 respectively)was co-ordinated with the Gene Ontology Consortium (Ashburneret al 2000) The FatiGO web tool (Al-Sharour et al 2004) wasused to cluster GO terms (level 3 setting) for Biological Processand Molecular Function
For the detection of conserved domains the 51 proteinsequences were screened against the Pfam database (httpwwwsangeracukSoftwarePfam) using the HMMER platform(current release 232 httphmmerwustledu) with a cut-offvalue set at 1eminus10
Annotation of oogenesis and reproduction genes
In order to identify putative honey bee orthologs to D melanogastergenes we searched the following GO terms in Flybase lsquooogenesisrsquo(GO0009993) lsquovitellogenesisrsquo (GO0007296) lsquofemale meiosisrsquo(GO0007143) lsquoDNA recombinationrsquo (GO0006310) and lsquochromo-some segregationrsquo (GO0007059) Genes related to segregationdistortion were searched for in Flybase in phenotypic descriptionsand mutant effects of D melanogaster genes as this phenomenonis not represented by a GO term Hence this group may be moreheterogeneous than the others From this list we removed genesof pleiotropic function (multifaceted GO entries in Biological Process)and genes that lacked defined transcripts in the Drosophilagenome database
For the GO terms lsquooogenesisrsquo and lsquovitellogenesisrsquo we performedTBLASTN and BLASTP searches for 42 fruit fly genes against theAmel_v30 genome assembly and the GLEAN3-predicted proteinsequences (Honey Bee Genome Sequencing Consortium 2006)respectively The orthologous D melanogaster gene was charac-terized by the same procedure as described above (reciprocalbest hit) For the GO terms lsquofemale meiosisrsquo lsquoDNA recombinationrsquolsquochromosome segregationrsquo and the non-GO group lsquosegregationdistortionrsquo transcripts of D melanogaster were searched againstnr databases at NCBI using BLASTP The obtained sequenceswere searched against the Amel_v30 genome assembly and theGLEAN3-predicted protein sequences using TBLASTN Homolo-gous sequences (threshold 1eminus10) were predicted using the BioEditsoftware (Hall 1999) ORFs showing significant homology (BLASTPthreshold 1eminus20) were assembled and used in BLASTP searchesagainst the nr databases at NCBI
Motif search in upstream regions in caste-specifically expressed genes
In order to detect overrepresented motifs in the upstream controlregions (UCRs) of the two sets of caste-related genes we selectedgene subsets based on two criteria (1) those that had shown thehighest caste-specificity in the array analyses (Evans amp Wheeler2000) and (2) those that had a conserved 5prime region when comparedwith the Drosophila orthologs These lsquotop10rsquo genes consisted of sixqueen genes (GB13072 GB11628 GB19380 GB14798 GB16047and GB18242) and of six worker genes (GB10869 GB12371GB12239 GB10428 GB19006 GB14758) The motif search wasconducted separately on the two sets of UCR sequences using threemethods AlignAce (Roth et al 1998) MEME (Bailey amp Elkan1995) and MDscan (Liu et al 2002) Default parameters valueswere used in all searches except that GC content in intergenicregions was set to 25 representing the background value estab-lished for the honey bee UCR database generated in this study
The database containing 10156 UCR sequences was generated by parsing the Official Set annotation file (downloaded in GFF format from http://www.beegenome.hgsc.bcm.tmc.edu/beeftp.html) to extract upstream regions starting from the terminal 5′-genomic coordinate of each predicted CDS. The UCRs were arbitrarily set to a size frame of 1000 nucleotides (Roth et al., 1998), but were trimmed whenever another predicted ORF was detected in any of these regions.
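The extraction rule described above (a window of up to 1000 nucleotides upstream of each CDS, trimmed at the nearest neighbouring ORF) can be sketched as follows. This is a simplified illustration and not the original script: it assumes 1-based inclusive GFF coordinates, one interval per gene, ignores neighbours that overlap the CDS start, and does not check scaffold ends on the minus strand. The gene and scaffold names are hypothetical.

```python
def upstream_regions(genes, length=1000):
    """genes: list of (gene_id, scaffold, start, end, strand) in GFF coordinates.

    Returns {gene_id: (start, end)} for the upstream window, or None when a
    neighbouring gene leaves no room for one.
    """
    regions = {}
    for gene_id, scaffold, start, end, strand in genes:
        neighbours = [(s, e) for g, sc, s, e, _ in genes
                      if sc == scaffold and g != gene_id]
        if strand == "+":
            lo, hi = max(1, start - length), start - 1
            for s, e in neighbours:
                if e < start:            # neighbour lies upstream: trim window
                    lo = max(lo, e + 1)
        else:
            lo, hi = end + 1, end + length
            for s, e in neighbours:
                if s > end:              # upstream of a minus-strand gene
                    hi = min(hi, s - 1)
        regions[gene_id] = (lo, hi) if lo <= hi else None
    return regions


# Hypothetical annotation records
genes = [("g1", "Group1", 5000, 6000, "+"),
         ("g2", "Group1", 4500, 4800, "+"),
         ("g3", "Group1", 300, 900, "+"),
         ("g4", "Group2", 2000, 2500, "-"),
         ("g5", "Group2", 2700, 3000, "+")]

ucrs = upstream_regions(genes)
for gene_id, region in sorted(ucrs.items()):
    print(gene_id, region)
```

Here g1 receives a trimmed window because g2 ends within 1000 nucleotides of its start, whereas g3 is limited by the start of the scaffold.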
The MAP (maximum a priori log likelihood) score, group specificity score (called Church score in this manuscript) (Hughes et al., 2000), ROC AUC (area under the curve for a receiver-operator characteristic plot) metric and MNCP (mean normalized conditional probability) metric (Clarke & Granek, 2003) were used to detect motifs that most likely correspond to biologically significant cis-regulatory elements. The filters run on the UCRs of the subsets of queen and worker genes were a MAP score cut-off value of 50, followed by a ROC AUC cut-off at 0.7, followed by a group specificity score cut-off at 1e−05. The UCR database of all GLEAN3-predicted honey bee genes was used as the background to calculate these metrics.
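The successive filters can be illustrated with a short sketch. It makes several assumptions that the text does not state: the group specificity score is represented by a hypergeometric tail probability (in the spirit of Hughes et al., 2000), higher MAP and ROC AUC values and lower specificity values are taken as better, and the MAP cut-off is passed in explicitly using the value as printed. All motif names and scores are hypothetical.

```python
from math import comb


def group_specificity(genome_size, group_size, n_targets, n_overlap):
    """P(X >= n_overlap): chance that at least n_overlap of the n_targets
    best-matching genes fall in the group by chance (hypergeometric tail)."""
    total = comb(genome_size, n_targets)
    tail = sum(comb(group_size, i) * comb(genome_size - group_size, n_targets - i)
               for i in range(n_overlap, min(group_size, n_targets) + 1))
    return tail / total


def passes_filters(motif, map_cutoff, auc_cutoff=0.7, specificity_cutoff=1e-05):
    """Apply the three cut-offs; the direction of each comparison is assumed."""
    return (motif["map"] >= map_cutoff
            and motif["roc_auc"] >= auc_cutoff
            and motif["specificity"] <= specificity_cutoff)


# Hypothetical candidate motifs; 10156 is the number of UCRs in the background
candidates = [
    {"name": "motif_1", "map": 62.0, "roc_auc": 0.91,
     "specificity": group_specificity(10156, 6, 100, 4)},
    {"name": "motif_2", "map": 75.0, "roc_auc": 0.55, "specificity": 1e-08},
    {"name": "motif_3", "map": 12.0, "roc_auc": 0.95, "specificity": 1e-09},
]

kept = [m["name"] for m in candidates if passes_filters(m, map_cutoff=50)]
print(kept)
```

Only motif_1 survives in this toy example: motif_2 fails the ROC AUC cut-off and motif_3 fails the MAP cut-off.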
A parametric test (MANOVA) and a nonparametric test (Kolmogorov–Smirnov) were conducted to identify significance levels for the two sets of filtered motifs found in the UCRs of caste-specifically expressed genes against filtered motifs found in the random UCR set. Random motifs were sampled from a motif database (10 391 motifs) generated by running our script 100 times with random sets of UCR sequences.
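As an illustration of the nonparametric comparison, the two-sample Kolmogorov–Smirnov statistic can be computed with the standard library alone. The sketch assumes that each motif set is summarized by one score per motif; which score was actually compared is not stated in the text, and the values below are hypothetical. Only the statistic D is computed, not its significance level.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute difference between the two empirical distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:  # the maximum is attained at one of the observed values
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical scores for caste-specific motifs and for randomly sampled motifs
caste_scores = [0.91, 0.88, 0.95, 0.79, 0.84]
random_scores = [0.52, 0.61, 0.48, 0.70, 0.55, 0.58]

print(ks_statistic(caste_scores, random_scores))
```

A value of D close to 1 means that the two score distributions barely overlap, while a value close to 0 means that they are nearly indistinguishable.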
The main criterion for identifying known regulatory motifs among these caste-specific ones was the alignment of the PSSM for each bee motif with the Drosophila melanogaster sequences in the TRANSFAC database (release 4.0) (Wingender et al., 2000). Only the alignments passing a threshold of 80% identity for each PSSM were considered as significant matches. In addition, we checked for a specific binding motif, the EcR/USP motif (rGkTCAaTGamcy-3′), known to function in the expression of genes responding to morphogenetic hormone titres (Perera et al., 2005).
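The check for the EcR/USP consensus can be sketched as a scan with IUPAC degenerate codes. This covers only the consensus search, not the PSSM alignment against TRANSFAC. It assumes that upper and lower case letters in the printed motif are treated alike and that only the forward strand is scanned; the function names and test sequences are hypothetical.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
         "S": "[CG]", "W": "[AT]", "B": "[CGT]", "D": "[AGT]",
         "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]"}


def motif_to_pattern(motif):
    """Translate an IUPAC consensus into a regular-expression string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_sites(motif, sequence):
    """0-based start positions of all (possibly overlapping) forward-strand matches."""
    lookahead = "(?=" + motif_to_pattern(motif) + ")"
    return [m.start() for m in re.finditer(lookahead, sequence.upper())]


ECR_USP = "rGkTCAaTGamcy"  # consensus as given in the text

# Hypothetical upstream sequences
with_site = "TTTAGGTCAATGACCTGG"
without_site = "TTTAGGTCAATGACCAGG"

print(find_sites(ECR_USP, with_site))
print(find_sites(ECR_USP, without_site))
```

The second sequence differs from the first only at the final degenerate position (y, which allows C or T), which is enough to abolish the match.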
Operating system and programming tools
An Ubuntu Linux (version Breezy) operating system was used to implement all scripts and pipelines designed for annotation procedures and motif discovery. The Python programming language (http://www.python.org), Biopython (http://www.biopython.org) and TAMO (Tools for Analysis of Motifs) packages (Gordon et al., 2005) were used in program design. Other web applications were built using the Zope application server (http://www.zope.org) hosted at http://zulu.fmrp.usp.br/beelab.
Acknowledgements
This work was supported by grants from FAPESP (Fundação de Amparo a Pesquisa do Estado de São Paulo, 9900719-6) and CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico, 4729632-3-1).
References
Abouheif, E. and Wray, G.A. (2002) Evolution of the genetic network underlying wing polyphenism in ants. Science 297: 249–252.
Al-Sharour, F., Diaz-Uriarte, R. and Dopazo, J. (2004) FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes. Bioinformatics 20: 578–580.
Amdam, G.V., Norberg, K., Hagen, A. and Omholt, S.W. (2003) Social exploitation of vitellogenin. Proc Natl Acad Sci USA 100: 1799–1802.
Amdam, G.V., Simões, Z.L.P., Hagen, A., Norberg, K., Schroder, K., Mikkelsen, O. et al. (2004) Hormonal control of the yolk precursor vitellogenin regulates immune function and longevity in honeybees. Exp Gerontol 39: 767–773.
Antoniewski, C., O’Grady, M.S., Edmondson, R.G., Lassieur, S.M. and Benes, H. (1995) Characterization of an EcR/USP heterodimer target site that mediates ecdysone responsiveness of the Drosophila Lsp-2 gene. Mol Gen Genet 249: 545–556.
Arbeitsman, M.N., Furlong, E.E.M., Imam, F., Johnson, E., Null, B.H., Baker, B.S. et al. (2002) Gene expression during the life cycle of Drosophila melanogaster. Science 297: 2270–2275.
Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M. et al. (2000) Gene Ontology: tool for the unification of biology. Nature Genet 25: 25–29.
Bailey, T.L. and Elkan, D. (1995) Unsupervised learning of multiple motifs in biopolymers using expectation maximization. Machine Learning J 21: 58–83.
Bergem, M., Norberg, K. and Aamodt, R.M. (2006) Long-term maintenance of in vitro cultured honeybee (Apis mellifera) embryonic cells. BMC Dev Biol 6: 17 (online).
Boutros, M., Kiger, A.A., Armknecht, S., Kerr, K., Hild, M., Koch, B. et al. (2004) Genome-wide RNAi analysis of growth and viability in Drosophila cells. Science 303: 832–835.
Carsten, L.D., Watts, T. and Markov, T.A. (2005) Gene expression patterns accompanying a dietary switch in Drosophila melanogaster. Mol Ecol 14: 3203–3208.
Clarke, N.D. and Granek, J.A. (2003) Rank order metrics for quantifying the association of sequence features with gene regulation. Bioinformatics 19: 212–218.
Colombani, J., Bianchini, L., Layalle, S., Pondeville, E., Dauphin-Villemant, C., Antoniewski, C. et al. (2005) Antagonistic actions of ecdysone and insulins determine final size in Drosophila. Science 310: 667–670.
Corona, M., Estrada, E. and Zurita, M. (1999) Differential expression of mitochondrial genes between queens and workers during caste determination in the honeybee Apis mellifera. J Exp Biol 202: 929–938.
Cunha, A.C., Nascimento, A.M., Guidugli, K.R., Simões, Z.L.P. and Bitondi, M.M.G. (2005) Molecular cloning and expression of a hexamerin cDNA from the honey bee. J Insect Physiol 51: 1135–1147.
Darwin, C.R. (1859) On the Origin of Species by Means of Natural Selection. John Murray, London.
Davidson, E.H. (2001) Genomic Regulatory Systems: Development and Evolution. Academic Press, San Diego.
Eder, J., Kremer, J.P. and Rembold, H. (1983) Correlation of cytochrome c titer and respiration in Apis mellifera: adaptive responses to caste determination defines workers, intercastes and queens. Comp Biochem Physiol B 76: 703–716.
Engels, W. (1974) Occurrence and significance of vitellogenins in female castes of social Hymenoptera. Am Zool 14: 1229–1237.
Evans, J.D. and Wheeler, D.E. (1999) Differential gene expression between developing queens and workers in the honey bee, Apis mellifera. Proc Natl Acad Sci USA 96: 5575–5580.
Evans, J.D. and Wheeler, D.E. (2000) Expression profiles during honeybee caste determination. Genome Biol 2: 1 (6 pages, online: http://genomebiology.com/2000/2/1/research/0001.1).
Forbes, A. and Lehmann, R. (1998) Nanos and Pumilio have critical roles in the development and function of Drosophila germline stem cells. Development 125: 679–690.
González-Reyes, A., Elliot, H. and St Johnston, D. (1995) Polarization of both major body axes in Drosophila by gurken-torpedo signalling. Nature 375: 654–658.
Goodisman, M.A., Isoe, J., Wheeler, D.E. and Wells, M.A. (2005) Evolution of insect metamorphosis: a microarray-based study of larval and adult gene expression in the ant Camponotus festinatus. Evolution 59: 858–870.
Gordon, D.B., Nekludova, L., McCallum, S. and Fraenkel, E. (2005) TAMO: a flexible, object-oriented framework for analysing transcriptional regulation using DNA-sequence motifs. Bioinformatics 21: 3164–3165.
Guidugli, K.R., Hepperle, C. and Hartfelder, K. (2004) A member of the short-chain dehydrogenase/reductase (SDR) superfamily is a target of the ecdysone response in honey bee (Apis mellifera) caste development. Apidologie 35: 37–47.
Guidugli, K.R., Nascimento, A.M., Amdam, G.V., Barchuk, A.R., Omholt, S.W., Simões, Z.L.P. and Hartfelder, K. (2005) Vitellogenin regulates hormonal dynamics in the worker caste of a eusocial insect. FEBS Lett 579: 4961–4965.
Hagedorn, H.H., Maddison, D.R. and Tu, Z. (1998) The evolution of vitellogenins, cyclorrhaphan yolk proteins and related molecules. Adv Insect Physiol 27: 335–384.
Hall, T.A. (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41: 95–98.
Hamilton, W.D. (1964) The genetical theory of social behaviour I & II. J Theor Biol 7: 1–16, 17–52.
Hartfelder, K. and Emlen, D.J. (2005) Endocrine control of insect polyphenism. In Comprehensive Insect Molecular Science (Gilbert, L.I., Iatrou, K. and Gill, S., eds), Vol. 3, pp. 651–703. Elsevier, Oxford.
Hartfelder, K. and Engels, W. (1998) Social insect polymorphism: Hormonal regulation of plasticity in development and reproduction in the honeybee. Curr Topics Dev Biol 40: 45–77.
Hawkins, N.C., Thorpe, J. and Schüpbach, T. (1996) encore, a gene required for the regulation of germ line mitosis and oocyte differentiation during Drosophila oogenesis. Development 122: 281–290.
Haydak, M.H. (1970) Honey bee nutrition. Annu Rev Entomol 15: 143–156.
Hepperle, C. and Hartfelder, K. (2001) Differentially expressed regulatory genes in honey bee caste development. Naturwissenschaften 88: 113–116.
Hoage, T.R. and Kessel, R.G. (1968) An electron microscopic study of the process of differentiation during spermatogenesis in the drone honey bee (Apis mellifera L.) with special reference to centriole replication and elimination. J Ultrastruct Res 24: 6–32.
Honey Bee Genome Sequencing Consortium (2006) Insights into social insects from the genome of the honeybee Apis mellifera. Nature (in press).
Hughes, J.D., Estep, P.W., Tavazole, S. and Church, G.M. (2000) Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae. J Mol Biol 296: 1205–1214.
Hunt, G.J. and Page, R.E. (1995) Linkage map of the honey bee, Apis mellifera, based on RAPD markers. Genetics 139: 1371–1382.
Hunt, J.H., Buck, N.A. and Wheeler, D.E. (2003) Storage proteins in vespid wasps: characterization, developmental pattern, and occurrence in adults. J Insect Physiol 49: 785–794.
Judice, C.C., Carazzole, M.F., Festa, F., Sogayar, M.C., Hartfelder, K. and Pereira, G.A.G. (2006) Gene expression profiles underlying alternative caste phenotypes in a highly eusocial bee, Melipona quadrifasciata. Insect Mol Biol 15: 33–44.
Kerr, W.E. (1950) Genetic determination of castes in the genus Melipona. Genetics 35: 143–152.
Kropácová, S. and Haslbachová, H. (1971) The influence of queenlessness and of unsealed brood on the development of ovaries in worker honeybees. J Apicult Res 10: 57–61.
Lattorff, H.M.G., Moritz, R.F.A. and Fuchs, S. (2005) A single gene determines thelytokous parthenogenesis in honey bee workers (Apis mellifera capensis). Heredity 94: 533–537.
Lattorff, H.M.G., Moritz, R.F.A., Solignac, M. and Crewe, R.M. (2006) Control of reproductive dominance by the thelytoky gene in honeybees. Biol Lett (in press).
Li, T.R. and White, K.P. (2003) Tissue-specific gene expression and ecdysone-regulated genomic networks in Drosophila. Dev Cell 5: 59–72.
Lin, H., Yue, L. and Spradling, A.C. (1994) The Drosophila fusome, a germline-specific organelle, contains membrane skeletal proteins and functions in cyst formation. Development 120: 947–956.
Liu, X.S., Brutlag, D.L. and Liu, J.S. (2002) An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments. Nature Biotechnol 20: 835–839.
Liu, Y., Liu, S., Wei, L., Altman, R.B. and Batzoglou, S. (2004) Eukaryotic regulatory element conservation analysis and identification using comparative genomics. Genome Res 14: 354–366.
Martinez, T., Burmester, T., Veenstra, J.A. and Wheeler, D.E. (2001) Sequence and evolution of a hexamerin from the ant Camponotus festinatus. Insect Mol Biol 9: 427–431.
McKearin, D. (1997) The Drosophila fusome, organelle biogenesis and germ cell differentiation: if you build it … Bioessays 19: 147–152.
Mirth, C., Truman, J.W. and Riddiford, L.M. (2005) The role of the prothoracic gland in determining critical weight for metamorphosis in Drosophila melanogaster. Curr Biol 15: 1–12.
Moritz, R.F.A., Kryger, P. and Allsopp, M.H. (1996) Competition for royalty in bees. Nature 384: 31.
Page, R.E. and Erickson, E.H. (1988) Reproduction by worker honey bees (Apis mellifera L.). Behav Ecol Sociobiol 23: 117–126.
Pereboom, J.J.M., Jordan, W.C., Sumner, S., Hammond, R.L. and Bourke, A.F.G. (2005) Differential gene expression in queen-worker caste determination in bumble bees. Proc R Soc B 272: 1145–1152.
Perera, S.C., Zheng, S., Feng, Q.L., Krell, P.J., Retnakaran, A. and Palli, S.R. (2005) Heterodimerization of ecdysone receptor and ultraspiracle on symmetric and asymmetric response elements. Arch Insect Biochem Physiol 60: 55–70.
Piulachs, M.D., Guidugli, K.R., Barchuk, A.R., Cruz, J., Simões, Z.L.P. and Bellés, X. (2003) The vitellogenin of the honeybee, Apis mellifera: Structural analysis of the cDNA and expression studies. Insect Biochem Mol Biol 33: 459–465.
Pritsker, M., Liu, Y.C., Beer, M.A. and Tavazole, S. (2004) Whole genome discovery of transcription factor binding sites by network-level conservation. Genome Res 14: 99–108.
Rachinsky, A., Strambi, C., Strambi, A. and Hartfelder, K. (1990) Caste and metamorphosis – hemolymph titers of juvenile hormone and ecdysteroids in last instar honeybee larvae. Gen Comp Endocr 79: 31–38.
Riddihough, G. and Pelham, H.R.B. (1987) An ecdysone response element in the Drosophila hsp27 promotor. EMBO J 6: 3729–3734.
Roth, F.P., Hughes, J.D., Estep, P.W. and Church, G.M. (1998) Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantification. Nature Biotechnol 16: 939–945.
Rutherford, K., Parkhill, J., Crook, J., Horsnell, T., Rice, P., Rajandream, M.A. and Barrell, B. (2000) Artemis: sequence visualisation and annotation. Bioinformatics 16: 944–945.
Schmidt-Capella, I.C. and Hartfelder, K. (1998) Juvenile hormone effect on DNA synthesis and apoptosis in caste-specific differentiation of the larval honey bee (Apis mellifera L.) ovary. J Insect Physiol 44: 385–391.
Snodgrass, R.E. (1956) Anatomy of the Honey Bee. Cornell University Press, Ithaca.
Solignac, M., Vautrin, D., Baudry, E., Mougel, F., Loiseau, A. and Cornuet, J.M. (2004) A microsatellite-based linkage map of the honeybee, Apis mellifera L. Genetics 167: 253–262.
St Johnston, D. and Nüsslein-Volhard, C. (1992) The origin of pattern and polarity in the Drosophila embryo. Cell 68: 201–219.
Staab, S. and Steinmann-Zwicky, M. (1995) Female germ cells of Drosophila require zygotic ovo and otu product for survival in larvae and pupae respectively. Mech Dev 54: 205–210.
Sullivan, A.A. and Thummel, C.S. (2003) Temporal profiles of nuclear receptor gene expression reveal coordinate transcriptional responses during Drosophila development. Mol Endocrinol 17: 2125–2137.
Wassermann, W.W. and Sandelin, A. (2004) Applied bioinformatics for the identification of regulatory elements. Nature Rev Genet 5: 267–287.
West-Eberhard, M.J. (2003) Developmental Plasticity and Evolution. Oxford University Press, Oxford.
Wheeler, D.E. and Nijhout, H.F. (2003) A perspective for understanding the modes of juvenile hormone as a lipid signaling system. Bioessays 25: 994–1001.
Wilson, E.O. (1971) The Insect Societies. Belknap Press of Harvard University Press, Cambridge, MA.
Wingender, E., Chen, X., Hehl, R., Karas, H., Liebich, I., Matys, V. et al. (2000) TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res 28: 316–319.
Zhou, X., Oi, F.M. and Scharf, M.E. (2006a) Social exploitation of hexamerin: RNAi reveals a major caste-regulatory factor in termites. Proc Natl Acad Sci USA 103: 4499–4504.
Zhou, X., Tarver, M.R., Bennett, G.W., Oi, F.M. and Scharf, M.E. (2006b) Two hexamerin genes from the termite Reticulitermes flavipes: sequence, expression, and proposed functions in caste regulation. Gene 376: 47–58.
Zhu, Z., Shendure, J. and Church, G.M. (2005) Discovering functional transcription factor combinations in the human cell cycle. Genome Res 15: 848–855.
Supplementary material
The following material is available for this article online:
Table S1. Annotation results of caste-specifically expressed honey bee genes.
Table S2. Caste-specific motifs in UCRs of honey bee genes.
Table S3. Annotation of honey bee orthologs to Drosophila melanogaster genes involved in Reproduction.
This material is available as part of the online article from http://www.blackwell-synergy.com
et al., 2005). Such change would be expected to involve cis-regulatory elements, that is, change in transcription factor binding sites in UCRs, as approached in this study, and also evolutionary change in response thresholds to circulating morphogenetic hormones (for review see Hartfelder & Emlen, 2005). The second and quite unexpected theme is the acquisition of new systemic functions by evolutionarily rather old proteins, such as vitellogenin and hexamerins. These apparently unspectacular proteins have evolved into key players for caste evolution and reproductive division of labour via novel regulatory connectivity with JH (Amdam et al., 2004; Guidugli et al., 2005; Zhou et al., 2006a,b).
Experimental procedures
Selection and annotation of ESTs representing differentially expressed genes in honey bee caste development
The starting point was a set of 164 entries (mainly 3′-ESTs) in GenBank (BG101532–BG101697) from an SSH library (Evans & Wheeler, 1999; Evans & Wheeler, 2000). When validated by macroarray analyses, a clustering into three major classes became apparent: (I) genes overexpressed in young larvae, (II) genes overexpressed in last instar queen larvae, and (III) genes overexpressed in last instar worker larvae. For this study we excluded the class I ESTs because their expression is not caste-specific, but rather represents expression differences between young (still bipotent) and older larvae. To the class II queen ESTs (82) we added one complete cDNA entry (AY601642) from a DDRT-PCR screen (Corona et al., 1999), and to the class III set of worker ESTs (40) we added seven GenBank dbEST entries (BG149167–BG149173) from a DDRT-PCR screen on ovary development (Hepperle & Hartfelder, 2001).
The EST sequences were submitted to BLASTN searches (parameters: -G 2 -E 3 -W 15 -F ‘m D’ -U -e 1e-20) against genome sequence assembly Amel_v3.0 to retrieve matches in linked or unlinked genomic contigs and to exclude no-matches (seven ESTs in queen). ESTs that aligned within the same scaffold were checked for clustering and overlap. This clustering also served to exclude genes that were represented by non-overlapping ESTs from both castes. This procedure generated a set of 51 unique putative gene sequences overexpressed in either queen (34) or in worker larvae (17). These 51 nonredundant sequences were submitted to BLASTX searches against the Official Set of GLEAN3-predicted protein sequences (cut-off value at 1e−20). For ESTs with no significant protein sequence matches, the genomic regions adjacent to the mapped EST were searched to find neighbouring ORFs, especially those nearest to putative 3′ UTRs of predicted proteins, as the EST libraries have a bias in this direction.
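The check for clustering and overlap of ESTs on a scaffold can be illustrated by a simple interval-merging sketch. It assumes that each EST has already been reduced to a single aligned interval on one scaffold; the EST and scaffold names are hypothetical, and the original procedure may have used additional criteria.

```python
def cluster_ests(alignments):
    """alignments: list of (est_id, scaffold, start, end).

    Returns a list of clusters, each a list of EST identifiers whose aligned
    intervals overlap on the same scaffold.
    """
    clusters = []
    current, current_scaffold, current_end = [], None, None
    # Sort by scaffold and start so that overlapping ESTs become neighbours
    for est_id, scaffold, start, end in sorted(alignments,
                                               key=lambda a: (a[1], a[2], a[3])):
        if scaffold == current_scaffold and start <= current_end:
            current.append(est_id)
            current_end = max(current_end, end)
        else:
            if current:
                clusters.append(current)
            current, current_scaffold, current_end = [est_id], scaffold, end
    if current:
        clusters.append(current)
    return clusters


# Hypothetical EST alignments
alignments = [("est_1", "Group1.1", 100, 400),
              ("est_2", "Group1.1", 350, 700),
              ("est_3", "Group1.1", 900, 1200),
              ("est_4", "Group2.5", 100, 300)]

clusters = cluster_ests(alignments)
print(clusters)
```

Each resulting cluster would then correspond to one putative gene sequence, which is how a redundant EST set can collapse to a smaller nonredundant one.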
Official Set protein sequences were aligned against the Amel_v3.0 sequence assembly using TBLASTN to map protein to genome, and subsequently they were aligned using BLASTP against the GenBank nonredundant (nr) and the Flybase protein sequence databases. The manual features annotation procedure of the Artemis 7.0 program (Rutherford et al., 2000) was used to map ORFs, putative splice sites of exons, and ESTs to genome coordinates. The final annotation file was generated with a Python script in GFF format (http://www.sanger.ac.uk/Software/formats/GFF).
Honey bee sequences annotated as orthologs to D. melanogaster genes were putatively assigned the GO terms listed in the respective Flybase entry. In addition, the definition of new GO terms (Biological Process ontology) related to caste development and polyphenism (GO:0048651 and GO:0048650, respectively) was co-ordinated with the Gene Ontology Consortium (Ashburner et al., 2000). The FatiGO web tool (Al-Sharour et al., 2004) was used to cluster GO terms (level 3 setting) for Biological Process and Molecular Function.
For the detection of conserved domains, the 51 protein sequences were screened against the Pfam database (http://www.sanger.ac.uk/Software/Pfam) using the HMMER platform (current release 2.3.2; http://hmmer.wustl.edu) with a cut-off value set at 1e−10.
McKearin D (1997) The Drosophila fusome organelle biogenesisand germ cell differentiation if you build it hellip Bioessays 19147ndash152
Mirth C Truman JW and Riddiford LM (2005) The role of theprothoracic gland in determining critical weight for metamor-phosis in Drosophila melanogaster Curr Biol 15 1ndash12
Moritz RFA Kryger P and Allsopp MH (1996) Competition forroyalty in bees Nature 384 31
Page RE and Erickson EH (1988) Reproduction by worker honeybees (Apis mellifera L) Behav Ecol Sociobiol 23 117ndash126
Pereboom JJM Jordan WC Sumner S Hammond RL andBourke AFG (2005) Differential gene expression in queen-worker caste determination in bumble bees Proc R Soc B 2721145ndash1152
Perera SC Zheng S Feng QL Krell PJ Retnakaran A andPalli SR (2005) Heterodimerization of ecdysone receptor andultraspiracle on symmetric and asymmetric response elementsArch Insect Biochem Physiol 60 55ndash70
Piulachs MD Guidugli KR Barchuk AR Cruz J SimotildeesZLP and Belleacutes X (2003) The vitellogenin of the honeybeeApis mellifera Structural analysis of the cDNA and expressionstudies Insect Biochem Mol Biol 33 459ndash465
Pritsker M Liu YC Beer MA and Tavazole S (2004) Whole
714 A S Cristino et al
copy 2006 The AuthorsJournal compilation copy 2006 The Royal Entomological Society Insect Molecular Biology 15 703ndash714
genome discovery of transcription factor binding sites bynetwork-level conservation Genome Res 14 99ndash108
Rachinsky A Strambi C Strambi A and Hartfelder K (1990)Caste and metamorphosis ndash hemolymph titers of juvenilehormone and ecdysteroids in last instar honeybee larvae GenComp Endocr 79 31ndash38
Riddihough G and Pelham HRB (1987) An ecdysone responseelement in the Drosophila hsp27 promotor EMBO J 6 3729ndash3734
Roth FP Hughes JD Estep PW and Church GM (1998)Finding DNA regulatory motifs within unaligned noncodingsequences clustered by whole-genome mRNA quantificationNature Biotechnol 16 939ndash945
Rutherford K Parkhill J Crook J Horsnell T Rice PRajandream MA and Barrell B (2000) Artemis sequencevisualisation and annotation Bioinformatics 16 944ndash945
Schmidt-Capella IC and Hartfelder K (1998) Juvenile hormoneeffect on DNA synthesis and apoptosis in caste-specificdifferentiation of the larval honey bee (Apis mellifera L) ovaryJ Insect Physiol 44 385ndash391
Snodgrass RE (1956) Anatomy of the Honey Bee Cornell Uni-versity Press Ithaca
Solignac M Vautrin D Baudry E Mougel F Loiseau A andCornuet JM (2004) A microsatellite-based linkage map of thehoneybee Apis mellifera L Genetics 167 253ndash262
St Johnston D and Nuumlsslein-Volhard C (1992) The origin of pat-tern and polarity in the Drosophila embryo Cell 68 201ndash219
Staab S and Steinmann-Zwicky M (1995) Female germ cells ofDrosophila require zygotic ovo and otu product for survival inlarvae and pupae respectively Mech Dev 54 205ndash210
Sullivan AA and Thummel CS (2003) Temporal profiles ofnuclear receptor gene expression profiles reveal coordinatetranscriptional responses during Drosophila development MolEndocrinol 17 2125ndash2137
Wassermann WW and Sandelin A (2004) Applied bioinformaticsfor the identification of regulatory elements Nature Rev Genet5 267ndash287
West-Eberhard MJ (2003) Developmental Plasticity andEvolution Oxford University Press Oxford
Wheeler DE and Nijhout HF (2003) A perspective for under-standing the modes of juvenile hormone as a lipid signalingsystem Bioessays 25 994ndash1001
Wilson EO (1971) The Insect Societies Belknapp Press of Har-vard University Press Cambridge MA
Wingender E Chen X Hehl R Karas H Liebich I Matys Vet al (2000) TRANSFAC an integrated system for geneexpression regulation Nucleic Acids Res 28 316ndash319
Zhou X Oi FM and Scharf ME (2006a) Social exploitationof hexamerin RNAi reveals a major caste-regulatory factor intermites Proc Natl Acad Sci USA 103 4499ndash4504
Zhou X Tarver MR Bennett GW Oi FM and Scharf ME(2006b) Two hexamerin genes from the termite Reticulitermesflavipes sequence expression and proposed functions incaste regulation Gene 376 47ndash58
Zhu Z Shendure J and Church GM (2005) Discoveringfunctional transcription factor combinations in the human cellcycle Genome Res 15 848ndash855
Supplementary material
The following material is available for this article online
Table S1 Annotation results of caste-specifically expressedhoney bee genes
Table S2 Caste-specific motifs in UCRs of honey bee genes
Table S3 Annotation of honey bee orthologs to Drosophilamelanogaster genes involved in Reproduction
This material is available as part of the online article fromhttpwwwblackwell-synergycom
The database containing 10156 UCR sequences was generated by parsing the Official Set annotation file (downloaded in GFF format from httpwwwbeegenomehgscbcmtmcedubeeftphtml) to extract upstream regions starting from the terminal 5′-genomic coordinate of each predicted CDS. The UCRs were arbitrarily set to a size frame of 1000 nucleotides (Roth et al., 1998), but were trimmed whenever another predicted ORF was detected in any of these regions.
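The coordinate logic of this step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' script: the function name `upstream_region`, the 1-based inclusive coordinates, and the choice to keep the part of the window adjacent to the CDS when a neighbouring ORF intrudes are all hypothetical.

```python
def upstream_region(cds_start, cds_end, strand, other_orfs, max_len=1000):
    """Return (start, end) of the upstream window for one CDS, or None.

    Coordinates are 1-based and inclusive (an assumption of this sketch).
    other_orfs is a list of (start, end) pairs for neighbouring predicted ORFs
    on the same sequence. Chromosome ends beyond position 1 are not checked.
    """
    if strand == "+":
        # Upstream lies at lower coordinates, ending just before the CDS.
        end = cds_start - 1
        start = max(1, end - max_len + 1)
        for orf_start, orf_end in other_orfs:
            if orf_end >= start and orf_start <= end:
                # Trim so the window begins after the intruding ORF.
                start = max(start, orf_end + 1)
    else:
        # On the minus strand, upstream lies at higher coordinates.
        start = cds_end + 1
        end = start + max_len - 1
        for orf_start, orf_end in other_orfs:
            if orf_start <= end and orf_end >= start:
                # Trim so the window stops before the intruding ORF.
                end = min(end, orf_start - 1)
    if start > end:
        return None  # nothing left after trimming
    return (start, end)
```

For a plus-strand CDS starting at 5000 with no neighbours the window is the full 1000 nucleotides; a neighbouring ORF inside that window shortens it to the stretch between the ORF and the CDS.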
The MAP (maximum a priori log likelihood) score, the group specificity score (called Church score in this manuscript) (Hughes et al., 2000), the ROC AUC (area under the curve for a receiver-operator characteristic plot) metric and the MNCP (mean normalized conditional probability) metric (Clarke & Granek, 2003) were used to detect motifs that most likely correspond to biologically significant cis-regulatory elements. The filters run on the UCRs of the subsets of queen and worker genes were a MAP score cut-off value of 50, followed by a ROC AUC cut-off at 0.7, followed by a group specificity score cut-off at 1e−05. The UCR database of all GLEAN3 predicted honey bee genes was used as the background to calculate these metrics.
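The successive filters can be illustrated with a short Python sketch. It assumes that higher MAP and ROC AUC values are better and that a smaller group specificity score is better; the `MotifScores` record, the function names and the example values are hypothetical, and the MAP cut-off is passed in explicitly rather than hard-coded. The `roc_auc` helper uses the standard rank interpretation of the area under the ROC curve and is not taken from TAMO.

```python
from dataclasses import dataclass


@dataclass
class MotifScores:
    name: str
    map_score: float
    roc_auc: float
    group_specificity: float


def roc_auc(group_scores, background_scores):
    """Chance that a random group UCR outscores a random background UCR.

    Ties count as half a win. Quadratic in input size, fine for a sketch.
    """
    wins = 0.0
    for g in group_scores:
        for b in background_scores:
            if g > b:
                wins += 1.0
            elif g == b:
                wins += 0.5
    return wins / (len(group_scores) * len(background_scores))


def filter_motifs(motifs, map_cutoff, auc_cutoff=0.7, specificity_cutoff=1e-5):
    """Apply the three cut-offs one after another, in the order given in the text."""
    kept = [m for m in motifs if m.map_score >= map_cutoff]
    kept = [m for m in kept if m.roc_auc >= auc_cutoff]
    kept = [m for m in kept if m.group_specificity <= specificity_cutoff]
    return kept


# Made-up scores, for illustration only.
example_motifs = [
    MotifScores("m1", 62.0, 0.81, 2e-7),
    MotifScores("m2", 41.0, 0.90, 1e-9),   # fails the MAP filter
    MotifScores("m3", 75.0, 0.65, 1e-8),   # fails the ROC AUC filter
    MotifScores("m4", 58.0, 0.74, 3e-4),   # fails the group specificity filter
]
survivors = filter_motifs(example_motifs, map_cutoff=50)
```

Because the filters are applied in sequence, a motif is kept only if it passes all three, so the order affects the amount of work done but not the final set.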
A parametric test (MANOVA) and a nonparametric test (Kolmogorov–Smirnov) were conducted to identify significance levels for the two sets of filtered motifs found in the UCRs of caste-specifically expressed genes against filtered motifs found in the random UCR set. Random motifs were sampled from a motif database (10 391 motifs) generated by running our script 100 times with random sets of UCR sequences.
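As an illustration of the nonparametric comparison, the two-sample Kolmogorov–Smirnov statistic is the largest vertical distance between the empirical distribution functions of two samples. The sketch below computes only that statistic in plain Python; the function name is hypothetical, and a statistics library would normally be used to obtain the associated p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: max |F_a(x) - F_b(x)| over all observed x."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    for x in a + b:
        # Empirical CDF value = fraction of the sample that is <= x.
        f_a = bisect_right(a, x) / len(a)
        f_b = bisect_right(b, x) / len(b)
        d = max(d, abs(f_a - f_b))
    return d
```

Applied to a metric such as the ROC AUC values of caste-specific motifs versus those of motifs from random UCR sets, a value of D near 0 indicates similar distributions and a value near 1 indicates almost no overlap.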
The main criterion for identifying known regulatory motifs among these caste-specific ones was the alignment of the PSSM for each bee motif with the Drosophila melanogaster sequences in the TRANSFAC database (release 40) (Wingender et al., 2000). Only the alignments passing a threshold of 80% identity for each PSSM were considered as significant matches. In addition, we checked for a specific binding motif, the EcR/USP motif (5′-rGkTCAaTGamcy-3′), known to function in the expression of genes responding to morphogenetic hormone titres (Perera et al., 2005).
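A search for a degenerate consensus such as the EcR/USP motif can be sketched by translating IUPAC ambiguity codes into a regular expression and scanning both strands. This is an assumption-laden illustration rather than the authors' procedure: the function names are hypothetical, upper- and lower-case letters in the consensus are treated alike as strict IUPAC codes, and positions are reported as 0-based offsets on the forward strand.

```python
import re

# IUPAC nucleotide ambiguity codes used to expand the consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def consensus_to_regex(consensus):
    """Expand an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[base] for base in consensus.upper())


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq, consensus="rGkTCAaTGamcy"):
    """Return sorted (forward_offset, strand) pairs for every consensus match."""
    # A lookahead lets overlapping matches be reported as well.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    seq = seq.upper()
    width = len(consensus)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    # Convert offsets found on the reverse complement back to forward coordinates.
    hits += [(len(seq) - m.start() - width, "-") for m in pattern.finditer(rc)]
    return sorted(hits)
```

Scanning each UCR in this way gives a simple presence or absence call per gene, which can then be compared between the queen and worker gene sets.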
Operating system and programming tools
An Ubuntu Linux (version Breezy) operating system was used to implement all scripts and pipelines designed for annotation procedures and motif discovery. The Python programming language (http://www.python.org), Biopython (http://www.biopython.org) and TAMO (Tools for Analysis of Motifs) packages (Gordon et al., 2005) were used in program design. Other web applications were built using the Zope application server (http://www.zope.org), hosted at httpzulufmrpuspbrbeelab.
Acknowledgements
This work was supported by grants from FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo, 9900719-6) and CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico, 4729632-3-1).
References
Abouheif E and Wray GA (2002) Evolution of the genetic network underlying wing polyphenism in ants. Science 297: 249–252.
Al-Shahrour F, Diaz-Uriarte R and Dopazo J (2004) FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes. Bioinformatics 20: 578–580.
Amdam GV, Norberg K, Hagen A and Omholt SW (2003) Social exploitation of vitellogenin. Proc Natl Acad Sci USA 100: 1799–1802.
Amdam GV, Simões ZLP, Hagen A, Norberg K, Schroder K, Mikkelsen O et al. (2004) Hormonal control of the yolk precursor vitellogenin regulates immune function and longevity in honeybees. Exp Gerontol 39: 767–773.
Antoniewski C, O'Grady MS, Edmondson RG, Lassieur SM and Benes H (1995) Characterization of an EcR/USP heterodimer target site that mediates ecdysone responsiveness of the Drosophila Lsp-2 gene. Mol Gen Genet 249: 545–556.
Arbeitman MN, Furlong EEM, Imam F, Johnson E, Null BH, Baker BS et al. (2002) Gene expression during the life cycle of Drosophila melanogaster. Science 297: 2270–2275.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM et al. (2000) Gene Ontology: tool for the unification of biology. Nature Genet 25: 25–29.
Bailey TL and Elkan C (1995) Unsupervised learning of multiple motifs in biopolymers using expectation maximization. Machine Learning J 21: 58–83.
Bergem M, Norberg K and Aamodt RM (2006) Long-term maintenance of in vitro cultured honeybee (Apis mellifera) embryonic cells. BMC Dev Biol 6: 17 (online).
Boutros M, Kiger AA, Armknecht S, Kerr K, Hild M, Koch B et al. (2004) Genome-wide RNAi analysis of growth and viability in Drosophila cells. Science 303: 832–835.
Carsten LD, Watts T and Markow TA (2005) Gene expression patterns accompanying a dietary switch in Drosophila melanogaster. Mol Ecol 14: 3203–3208.
Clarke ND and Granek JA (2003) Rank order metrics for quantifying the association of sequence features with gene regulation. Bioinformatics 19: 212–218.
Colombani J, Bianchini L, Layalle S, Pondeville E, Dauphin-Villemant C, Antoniewski C et al. (2005) Antagonistic actions of ecdysone and insulins determine final size in Drosophila. Science 310: 667–670.
Corona M, Estrada E and Zurita M (1999) Differential expression of mitochondrial genes between queens and workers during caste determination in the honeybee Apis mellifera. J Exp Biol 202: 929–938.
Cunha AC, Nascimento AM, Guidugli KR, Simões ZLP and Bitondi MMG (2005) Molecular cloning and expression of a hexamerin cDNA from the honey bee. J Insect Physiol 51: 1135–1147.
Darwin CR (1859) On the Origin of Species by Means of Natural Selection. John Murray, London.
Davidson EH (2001) Genomic Regulatory Systems: Development and Evolution. Academic Press, San Diego.
Eder J, Kremer JP and Rembold H (1983) Correlation of cytochrome c titer and respiration in Apis mellifera: adaptive responses to caste determination defines workers, intercastes and queens. Comp Biochem Physiol B 76: 703–716.
Engels W (1974) Occurrence and significance of vitellogenins in female castes of social Hymenoptera. Am Zool 14: 1229–1237.
Evans JD and Wheeler DE (1999) Differential gene expression between developing queens and workers in the honey bee, Apis mellifera. Proc Natl Acad Sci USA 96: 5575–5580.
Evans JD and Wheeler DE (2000) Expression profiles during honeybee caste determination. Genome Biol 2: 1 (6 pages, online: httpgenomebiologycom200021research00011).
Forbes A and Lehmann R (1998) Nanos and Pumilio have critical roles in the development and function of Drosophila germline stem cells. Development 125: 679–690.
González-Reyes A, Elliot H and St Johnston D (1995) Polarization of both major body axes in Drosophila by gurken-torpedo signalling. Nature 375: 654–658.
Goodisman MA, Isoe J, Wheeler DE and Wells MA (2005) Evolution of insect metamorphosis: a microarray-based study of larval and adult gene expression in the ant Camponotus festinatus. Evolution 59: 858–870.
Gordon DB, Nekludova L, McCallum S and Fraenkel E (2005) TAMO: a flexible, object-oriented framework for analysing transcriptional regulation using DNA-sequence motifs. Bioinformatics 21: 3164–3165.
Guidugli KR, Hepperle C and Hartfelder K (2004) A member of the short-chain dehydrogenase/reductase (SDR) superfamily is a target of the ecdysone response in honey bee (Apis mellifera) caste development. Apidologie 35: 37–47.
Guidugli KR, Nascimento AM, Amdam GV, Barchuk AR, Omholt SW, Simões ZLP and Hartfelder K (2005) Vitellogenin regulates hormonal dynamics in the worker caste of a eusocial insect. FEBS Lett 579: 4961–4965.
Hagedorn HH, Maddison DR and Tu Z (1998) The evolution of vitellogenins, cyclorrhaphan yolk proteins and related molecules. Adv Insect Physiol 27: 335–384.
Hall TA (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41: 95–98.
Hamilton WD (1964) The genetical theory of social behaviour I & II. J Theor Biol 7: 1–16, 17–52.
Hartfelder K and Emlen DJ (2005) Endocrine control of insect polyphenism. In Comprehensive Insect Molecular Science (Gilbert LI, Iatrou K and Gill S, eds), Vol. 3, pp. 651–703. Elsevier, Oxford.
Hartfelder K and Engels W (1998) Social insect polymorphism: Hormonal regulation of plasticity in development and reproduction in the honeybee. Curr Topics Dev Biol 40: 45–77.
Hawkins NC, Thorpe J and Schüpbach T (1996) encore, a gene required for the regulation of germ line mitosis and oocyte differentiation during Drosophila oogenesis. Development 122: 281–290.
Haydak MH (1970) Honey bee nutrition. Annu Rev Entomol 15: 143–156.
Hepperle C and Hartfelder K (2001) Differentially expressed regulatory genes in honey bee caste development. Naturwissenschaften 88: 113–116.
Hoage TR and Kessel RG (1968) An electron microscopic study of the process of differentiation during spermatogenesis in the drone honey bee (Apis mellifera L.) with special reference to centriole replication and elimination. J Ultrastruct Res 24: 6–32.
Honey Bee Genome Sequencing Consortium (2006) Insights into social insects from the genome of the honeybee Apis mellifera. Nature (in press).
Hughes JD, Estep PW, Tavazoie S and Church GM (2000) Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae. J Mol Biol 296: 1205–1214.
Hunt GJ and Page RE (1995) Linkage map of the honey bee, Apis mellifera, based on RAPD markers. Genetics 139: 1371–1382.
Hunt JH, Buck NA and Wheeler DE (2003) Storage proteins in vespid wasps: characterization, developmental pattern and occurrence in adults. J Insect Physiol 49: 785–794.
Judice CC, Carazzole MF, Festa F, Sogayar MC, Hartfelder K and Pereira GAG (2006) Gene expression profiles underlying alternative caste phenotypes in a highly eusocial bee, Melipona quadrifasciata. Insect Mol Biol 15: 33–44.
Kerr WE (1950) Genetic determination of castes in the genus Melipona. Genetics 35: 143–152.
Kropácová S and Haslbachová H (1971) The influence of queenlessness and of unsealed brood on the development of ovaries in worker honeybees. J Apicult Res 10: 57–61.
Lattorff HMG, Moritz RFA and Fuchs S (2005) A single gene determines thelytokous parthenogenesis in honey bee workers (Apis mellifera capensis). Heredity 94: 533–537.
Lattorff HMG, Moritz RFA, Solignac M and Crewe RM (2006) Control of reproductive dominance by the thelytoky gene in honeybees. Biol Lett (in press).
Li TR and White KP (2003) Tissue-specific gene expression and ecdysone-regulated genomic networks in Drosophila. Dev Cell 5: 59–72.
Lin H, Yue L and Spradling AC (1994) The Drosophila fusome, a germline-specific organelle, contains membrane skeletal proteins and functions in cyst formation. Development 120: 947–956.
Liu XS, Brutlag DL and Liu JS (2002) An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments. Nature Biotechnol 20: 835–839.
Liu Y, Liu S, Wei L, Altman RB and Batzoglou S (2004) Eukaryotic regulatory element conservation analysis and identification using comparative genomics. Genome Res 14: 354–366.
Martinez T, Burmester T, Veenstra JA and Wheeler DE (2001) Sequence and evolution of a hexamerin from the ant Camponotus festinatus. Insect Mol Biol 9: 427–431.
McKearin D (1997) The Drosophila fusome, organelle biogenesis and germ cell differentiation: if you build it … Bioessays 19: 147–152.
Mirth C, Truman JW and Riddiford LM (2005) The role of the prothoracic gland in determining critical weight for metamorphosis in Drosophila melanogaster. Curr Biol 15: 1–12.
Moritz RFA, Kryger P and Allsopp MH (1996) Competition for royalty in bees. Nature 384: 31.
Page RE and Erickson EH (1988) Reproduction by worker honey bees (Apis mellifera L.). Behav Ecol Sociobiol 23: 117–126.
Pereboom JJM, Jordan WC, Sumner S, Hammond RL and Bourke AFG (2005) Differential gene expression in queen-worker caste determination in bumble bees. Proc R Soc B 272: 1145–1152.
Perera SC, Zheng S, Feng QL, Krell PJ, Retnakaran A and Palli SR (2005) Heterodimerization of ecdysone receptor and ultraspiracle on symmetric and asymmetric response elements. Arch Insect Biochem Physiol 60: 55–70.
Piulachs MD, Guidugli KR, Barchuk AR, Cruz J, Simões ZLP and Bellés X (2003) The vitellogenin of the honeybee, Apis mellifera: Structural analysis of the cDNA and expression studies. Insect Biochem Mol Biol 33: 459–465.
Pritsker M, Liu YC, Beer MA and Tavazoie S (2004) Whole genome discovery of transcription factor binding sites by network-level conservation. Genome Res 14: 99–108.
Rachinsky A, Strambi C, Strambi A and Hartfelder K (1990) Caste and metamorphosis – hemolymph titers of juvenile hormone and ecdysteroids in last instar honeybee larvae. Gen Comp Endocr 79: 31–38.
Riddihough G and Pelham HRB (1987) An ecdysone response element in the Drosophila hsp27 promoter. EMBO J 6: 3729–3734.
Roth FP, Hughes JD, Estep PW and Church GM (1998) Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantification. Nature Biotechnol 16: 939–945.
Rutherford K, Parkhill J, Crook J, Horsnell T, Rice P, Rajandream MA and Barrell B (2000) Artemis: sequence visualisation and annotation. Bioinformatics 16: 944–945.
Schmidt-Capella IC and Hartfelder K (1998) Juvenile hormone effect on DNA synthesis and apoptosis in caste-specific differentiation of the larval honey bee (Apis mellifera L.) ovary. J Insect Physiol 44: 385–391.
Snodgrass RE (1956) Anatomy of the Honey Bee. Cornell University Press, Ithaca.
Solignac M, Vautrin D, Baudry E, Mougel F, Loiseau A and Cornuet JM (2004) A microsatellite-based linkage map of the honeybee, Apis mellifera L. Genetics 167: 253–262.
St Johnston D and Nüsslein-Volhard C (1992) The origin of pattern and polarity in the Drosophila embryo. Cell 68: 201–219.
Staab S and Steinmann-Zwicky M (1995) Female germ cells of Drosophila require zygotic ovo and otu product for survival in larvae and pupae respectively. Mech Dev 54: 205–210.
Sullivan AA and Thummel CS (2003) Temporal profiles of nuclear receptor gene expression reveal coordinate transcriptional responses during Drosophila development. Mol Endocrinol 17: 2125–2137.
Wasserman WW and Sandelin A (2004) Applied bioinformatics for the identification of regulatory elements. Nature Rev Genet 5: 267–287.
West-Eberhard MJ (2003) Developmental Plasticity and Evolution. Oxford University Press, Oxford.
Wheeler DE and Nijhout HF (2003) A perspective for understanding the modes of juvenile hormone as a lipid signaling system. Bioessays 25: 994–1001.
Wilson EO (1971) The Insect Societies. Belknap Press of Harvard University Press, Cambridge, MA.
Wingender E, Chen X, Hehl R, Karas H, Liebich I, Matys V et al. (2000) TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res 28: 316–319.
Zhou X, Oi FM and Scharf ME (2006a) Social exploitation of hexamerin: RNAi reveals a major caste-regulatory factor in termites. Proc Natl Acad Sci USA 103: 4499–4504.
Zhou X, Tarver MR, Bennett GW, Oi FM and Scharf ME (2006b) Two hexamerin genes from the termite Reticulitermes flavipes: sequence, expression and proposed functions in caste regulation. Gene 376: 47–58.
Zhu Z, Shendure J and Church GM (2005) Discovering functional transcription factor combinations in the human cell cycle. Genome Res 15: 848–855.
Supplementary material
The following material is available for this article online:
Table S1. Annotation results of caste-specifically expressed honey bee genes.
Table S2. Caste-specific motifs in UCRs of honey bee genes.
Table S3. Annotation of honey bee orthologs to Drosophila melanogaster genes involved in Reproduction.
This material is available as part of the online article from http://www.blackwell-synergy.com
Genomics of honey bee caste development and reproduction 713
copy 2006 The AuthorsJournal compilation copy 2006 The Royal Entomological Society Insect Molecular Biology 15 703ndash714
Evans JD and Wheeler DE (2000) Expression profiles duringhoneybee caste determination Genome Biol 2 1 6 pagesonline httpgenomebiologycom200021research00011
Forbes A and Lehmann R (1998) Nanos and Pumilio havecritical roles in the development and function of Drosophilagermline stem cells Development 125 679ndash690
Gonzaacutelez-Reyes A Elliot H and St Johnston D (1995)Polarization of both major body axes in Drosophila by gurken-torpedo signalling Nature 375 654ndash658
Goodisman MA Isoe J Wheeler DE and Wells MA (2005)Evolution of insect metamorphosis a microarray-based studyof larval and adult gene expression in the ant Camponotusfestinatus Evolution 59 858ndash870
Gordon DB Nekludova L McCallum S and Fraenkel E(2005) TAMO a flexible object-oriented framework for analys-ing transcriptional regulation using DNA-sequence motifsBioinformatics 21 3164ndash3165
Guidugli KR Hepperle C and Hartfelder K (2004) A memberof the short-chain dehydrogenasereductase (SDR) super-family is a target of the ecdysone response in honey bee (Apismellifera) caste development Apidologie 35 37ndash47
Guidugli KR Nascimento AM Amdam GV Barchuk AROmholt SW Simotildees ZLP and Hartfelder K (2005)Vitellogenin regulates hormonal dynamics in the worker casteof a eusocial insect FEBS Lett 579 4961ndash4965
Hagedorn HH Maddison DR and Tu Z (1998) The evolutionof vitellogenins cyclorrhaphan yolk proteins and relatedmolecules Adv Insect Physiol 27 335ndash384
Hall TA (1999) BioEdit a user-friendly biological sequence align-ment editor and analysis program for Windows 9598NTNucleic Acids Symp Ser 41 95ndash98
Hamilton WD (1964) The genetical theory of social behaviour Iamp II J Theor Biol 7(1ndash16) 17ndash52
Hartfelder K and Emlen DJ (2005) Endocrine control of insectpolyphenism In Comprehensive Insect Molecular Science(Gilbert LI Iatrou K and Gill S eds) Vol 3 pp 651ndash703Elsevier Oxford
Hartfelder K and Engels W (1998) Social insect polymorphismHormonal regulation of plasticity in development and repro-duction in the honeybee Curr Topics Dev Biol 40 45ndash77
Hawkins NC Thorpe J and Schuumlpbach T (1996) encore agene required for the regulation of germ line mitosis and oocytedifferentiation during Drosophila oogenesis Development 122281ndash290
Haydak MH (1970) Honey bee nutrition Annu Rev Entomol 15143ndash156
Hepperle C and Hartfelder K (2001) Differentially expressedregulatory genes in honey bee caste development Naturwis-senschaften 88 113ndash116
Hoage TR and Kessel RG (1968) An electron microscopicstudy of the process of differentiation during spermatogenesis inthe drone honey bee (Apis mellifera L) with special reference tocentriole replication and elimination J Ultrastruct Res 24 6ndash32
Honey Bee Genome Sequencing Consortium (2006) Insightsinto social insects from the genome of the honeybee Apismellifera Nature (in press)
Hughes JD Estep PW Tavazole S and Church GM (2000)Computational identification of cis-regulatory elementsassociated with groups of functionally related genes inSaccharomyces cervisiae J Mol Biol 296 1205ndash1214
Hunt GJ and Page RE (1995) Linkage map of the honey bee
Apis mellifera based on RAPD markers Genetics 139 1371ndash1382
Hunt JH Buck NA and Wheeler DE (2003) Storage proteinsin vespid wasps characterization developmental pattern andoccurrence in adults J Insect Physiol 49 785ndash794
Judice CC Carazzole MF Festa F Sogayar MC HartfelderK and Pereira GAG (2006) Gene expression profiles under-lying alternative caste phenotypes in a highly eusocial beeMelipona quadrifasciata Insect Mol Biol 15 33ndash44
Kerr WE (1950) Genetic determination of castes in the genusMelipona Genetics 35 143ndash152
Kropaacutecovaacute S and Haslbachovaacute H (1971) The influence ofqueenlessness and of unsealed brood on the development ofovaries in worker honeybees J Apicult Res 10 57ndash61
Lattorff HMG Moritz RFA and Fuchs S (2005) A single genedetermines thelytokous parthenogenesis in honey bee workers(Apis mellifera capensis) Heredity 94 533ndash537
Lattorff HMG Moritz RFA Solignac M and Crewe RM(2006) Control of reproductive dominance by the thelytokygene in honeybees Biol Lett (in press)
Li TR and White KP (2003) Tissue-specific gene expressionand ecdysone-regulated genomic networks in Drosophila DevCell 5 59ndash72
Lin H Yue L and Spradling AC (1994) The Drosophila fusomea germline-specific organelle contains membrane skeletalproteins and functions in cyst formation Development 120947ndash956
Liu XS Brutlag DL and Liu JS (2002) An algorithm for findingprotein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments Nature Biotechnol20 835ndash839
Liu Y Liu S Wei L Altman RB and Batzoglou S (2004)Eukaryotic regulatory element conservation analysis andidentification using comparative genomics Genome Res 14354ndash366
Martinez T Burmester T Veenstra JA and Wheeler DE(2001) Sequence and evolution of a hexamerin from the antCamponotus festinatus Insect Mol Biol 9 427ndash431
McKearin D (1997) The Drosophila fusome organelle biogenesisand germ cell differentiation if you build it hellip Bioessays 19147ndash152
Mirth C Truman JW and Riddiford LM (2005) The role of theprothoracic gland in determining critical weight for metamor-phosis in Drosophila melanogaster Curr Biol 15 1ndash12
Moritz RFA Kryger P and Allsopp MH (1996) Competition forroyalty in bees Nature 384 31
Page RE and Erickson EH (1988) Reproduction by worker honeybees (Apis mellifera L) Behav Ecol Sociobiol 23 117ndash126
Pereboom JJM Jordan WC Sumner S Hammond RL andBourke AFG (2005) Differential gene expression in queen-worker caste determination in bumble bees Proc R Soc B 2721145ndash1152
Perera SC Zheng S Feng QL Krell PJ Retnakaran A andPalli SR (2005) Heterodimerization of ecdysone receptor andultraspiracle on symmetric and asymmetric response elementsArch Insect Biochem Physiol 60 55ndash70
Piulachs MD Guidugli KR Barchuk AR Cruz J SimotildeesZLP and Belleacutes X (2003) The vitellogenin of the honeybeeApis mellifera Structural analysis of the cDNA and expressionstudies Insect Biochem Mol Biol 33 459ndash465
Pritsker M Liu YC Beer MA and Tavazole S (2004) Whole
714 A S Cristino et al
copy 2006 The AuthorsJournal compilation copy 2006 The Royal Entomological Society Insect Molecular Biology 15 703ndash714
genome discovery of transcription factor binding sites bynetwork-level conservation Genome Res 14 99ndash108
Rachinsky A Strambi C Strambi A and Hartfelder K (1990)Caste and metamorphosis ndash hemolymph titers of juvenilehormone and ecdysteroids in last instar honeybee larvae GenComp Endocr 79 31ndash38
Riddihough G and Pelham HRB (1987) An ecdysone responseelement in the Drosophila hsp27 promotor EMBO J 6 3729ndash3734
Roth FP Hughes JD Estep PW and Church GM (1998)Finding DNA regulatory motifs within unaligned noncodingsequences clustered by whole-genome mRNA quantificationNature Biotechnol 16 939ndash945
Rutherford K Parkhill J Crook J Horsnell T Rice PRajandream MA and Barrell B (2000) Artemis sequencevisualisation and annotation Bioinformatics 16 944ndash945
Schmidt-Capella IC and Hartfelder K (1998) Juvenile hormoneeffect on DNA synthesis and apoptosis in caste-specificdifferentiation of the larval honey bee (Apis mellifera L) ovaryJ Insect Physiol 44 385ndash391
Snodgrass RE (1956) Anatomy of the Honey Bee Cornell Uni-versity Press Ithaca
Solignac M Vautrin D Baudry E Mougel F Loiseau A andCornuet JM (2004) A microsatellite-based linkage map of thehoneybee Apis mellifera L Genetics 167 253ndash262
St Johnston D and Nuumlsslein-Volhard C (1992) The origin of pat-tern and polarity in the Drosophila embryo Cell 68 201ndash219
Staab S and Steinmann-Zwicky M (1995) Female germ cells ofDrosophila require zygotic ovo and otu product for survival inlarvae and pupae respectively Mech Dev 54 205ndash210
Sullivan AA and Thummel CS (2003) Temporal profiles ofnuclear receptor gene expression profiles reveal coordinatetranscriptional responses during Drosophila development MolEndocrinol 17 2125ndash2137
Wassermann WW and Sandelin A (2004) Applied bioinformaticsfor the identification of regulatory elements Nature Rev Genet5 267ndash287
West-Eberhard MJ (2003) Developmental Plasticity andEvolution Oxford University Press Oxford
Wheeler DE and Nijhout HF (2003) A perspective for under-standing the modes of juvenile hormone as a lipid signalingsystem Bioessays 25 994ndash1001
Wilson EO (1971) The Insect Societies Belknapp Press of Har-vard University Press Cambridge MA
Wingender E Chen X Hehl R Karas H Liebich I Matys Vet al (2000) TRANSFAC an integrated system for geneexpression regulation Nucleic Acids Res 28 316ndash319
Zhou X Oi FM and Scharf ME (2006a) Social exploitationof hexamerin RNAi reveals a major caste-regulatory factor intermites Proc Natl Acad Sci USA 103 4499ndash4504
Zhou X Tarver MR Bennett GW Oi FM and Scharf ME(2006b) Two hexamerin genes from the termite Reticulitermesflavipes sequence expression and proposed functions incaste regulation Gene 376 47ndash58
Zhu Z Shendure J and Church GM (2005) Discoveringfunctional transcription factor combinations in the human cellcycle Genome Res 15 848ndash855
Supplementary material
The following material is available for this article online:
Table S1. Annotation results of caste-specifically expressed honey bee genes.
Table S2. Caste-specific motifs in UCRs of honey bee genes.
Table S3. Annotation of honey bee orthologs to Drosophila melanogaster genes involved in reproduction.
This material is available as part of the online article from http://www.blackwell-synergy.com