
ORIGINAL PAPER

Structural organization, classification and phylogenetic relationship of cytochrome P450 genes in Citrus clementina and Citrus sinensis

Suresh Reddy Mittapelli & Shailendar Kumar Maryada &

Venkateswara Rao Khareedu & Dashavantha Reddy Vudem

Received: 16 September 2013 / Revised: 19 December 2013 / Accepted: 26 December 2013 / Published online: 5 January 2014
© Springer-Verlag Berlin Heidelberg 2014

Abstract The genus Citrus is an important fruit crop and a nutritional source for the good health of humans. Cytochrome P450s represent about 1 % of the proteome and mediate diverse biochemical reactions pertaining to both primary and secondary metabolism. Analysis of Citrus genomic resources identified 296 plant cytochrome P450 (CYP) coding genes in Citrus clementina, 272 in double haploid (dh) Citrus sinensis, and 202 in C. sinensis. In C. clementina and dh C. sinensis, CYP genes are distributed into nine clans. In the three genomes, single-intron CYP genes are predominant in the A-type families, whereas multiple-intron genes are predominant among non-A-type CYP families. The larger number of genes in A-type CYP families than in non-A-type families is attributed to rapid evolution of A-type genes facilitated by their gene organization. Further, the complex gene organization of non-A-type genes, with multiple introns, might have contributed to the slower evolution of paralogs. The majority of introns (1,660) from the three genomes showed canonical GT-AG splice sites. However, 33 introns showed non-conventional GC…PyAG splice sites, and the functionality of these splice sites is confirmed by ESTs lacking this intron. Across the families, gene organization is conserved between the three genomes. In dh C. sinensis, 22 genes were identified to have alternate splicing. Examination of scaffolds in C. clementina revealed that the majority of the Citrus CYP genes are solitary and a few of them are in clusters of 3–8 genes. PCR amplification of C. sinensis genomic DNA with gene-specific primers failed to amplify the out-grouped genes Ccl-CYP706A16 and Ccl-CYP706B1, confirming that they are specific to C. clementina. The differential number of CYP genes observed between C. clementina and C. sinensis is attributed to the extent of variability between their parents representing ancestral taxa.

Keywords Citrus clementina · Citrus sinensis · Cytochrome P450 · Gene amplification · Gene families · Intron phasing

Communicated by W.-W. Guo

Suresh Reddy Mittapelli and Shailendar Kumar Maryada contributed equally to this study.

Electronic supplementary material The online version of this article (doi:10.1007/s11295-013-0695-8) contains supplementary material, which is available to authorized users.

S. R. Mittapelli · S. K. Maryada · V. R. Khareedu · D. R. Vudem (*)
Centre for Plant Molecular Biology, Osmania University, Hyderabad 500 007, India
e-mail: [email protected]

Tree Genetics & Genomes (2014) 10:399–409
DOI 10.1007/s11295-013-0695-8

Introduction

The genus Citrus belongs to the family Rutaceae and is considered one of the most economically important fruit crops of the world. It is distributed in tropical and subtropical regions across more than 140 countries (Liu et al. 2012; Malik et al. 2012). According to FAOSTAT (http://faostat.fao.org/site/567/default.aspx), about two thirds of global Citrus fruit production comes from China, Brazil, USA, India, Mexico, and Spain. The primary Citrus species, Citrus medica L. (citrons), Citrus reticulata Blanco (mandarins), and Citrus maxima L. (pomelos), are the ancestors of the cultivated Citrus (Scora 1975). The cultivated species include Citrus aurantium L. (sour orange), Citrus sinensis L. Osb. (sweet orange), Citrus paradisi Macf. (grapefruit), Citrus clementina hort. ex Tan. (Clementine), and Citrus limon Osb. (lemon). These species are believed to have originated from the hybridization of cross-compatible basic taxa (Ollitrault et al. 2012). Most often, Citrus species are diploid with a basic chromosome number of x=9 (Krug 1943), and the genome size of Citrus species is in the range of 360 to 398 Mbp (Gmitter et al. 2012; Ollitrault et al. 2012; Xu et al. 2013).

Plant cytochrome P450s (CYPs) form a large superfamily of heme-containing monooxygenases and mediate diverse reactions of both primary and secondary metabolism (Kumar et al. 2013). Many CYPs are involved in the pathways leading to the synthesis of UV protectants (flavonoids and anthocyanins), defense compounds (isoflavonoids, phytoalexins, hydroxamic acids, and terpenes), fatty acids, hormones (gibberellins, brassinosteroids), signaling molecules (oxylipins, salicylic acid, and jasmonic acid), accessory pigments (carotenoids), and structural polymers like lignins (Schuler and Werck-Reichhart 2003). CYPs also participate in the catabolic breakdown of some of these endogenous molecules as well as exogenous compounds (herbicides, insecticides, and other environmental pollutants). It is very difficult to determine the specific metabolic function of a given CYP, as these enzymes are highly labile and present in low quantities. Sequencing of plant genomes has facilitated the classification of CYPs into unique families with predicted functions. Further, a large number of CYP proteins confer resistance to various pathogens and insects (Smigocki and Wilson 2004).

In view of the divergent roles of CYPs and the economic importance of Citrus species, their vast diversity, cross-species compatibility, and wide range of metabolites, the present study focuses on the structural organization, classification, and phylogenetic relationship of CYP genes in C. clementina, C. sinensis, and doubled haploid (dh) C. sinensis.

Material and methods

Haploid C. clementina and C. sinensis genomes “v1.0” at the Phytozome (2011) database (http://www.phytozome.net/) and the doubled haploid sweet orange genome (http://citrus.hzau.edu.cn/orange/download/data.php) were searched for putative cytochrome P450 genes, and the predicted protein and nucleotide (CDS and genomic) sequences were retrieved. Out-of-range CYP candidate proteins, i.e., those shorter than 300 or longer than 600 amino acids, were validated using the Softberry gene prediction tool (http://linux1.softberry.com/berry.phtml) after extending the scaffold region to 2,000 bp upstream of the 5′ end. Citrus CYP genes were named based on similarity with the gene orthologs of Arabidopsis (Nelson et al. 2004) retrieved from http://drnelson.uthsc.edu/CytochromeP450.html. BLASTP analysis of the Citrus proteome with the retrieved CYP protein orthologs was carried out to identify diversified paralogs.
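The length-based screening step described above can be sketched as follows. This is only an illustration of the 300 to 600 amino acid window stated in the text; the function name and the toy sequences are hypothetical and are not taken from the study.

```python
def flag_out_of_range(proteins, min_len=300, max_len=600):
    """Return the names of proteins shorter than min_len or longer than max_len."""
    return sorted(
        name
        for name, seq in proteins.items()
        # A trailing '*' (stop symbol in some predicted-protein files) is not counted.
        if not (min_len <= len(seq.rstrip("*")) <= max_len)
    )


# Hypothetical candidates; only their lengths matter for this check.
candidates = {
    "cand_A": "M" * 250,  # shorter than 300 aa, needs re-validation
    "cand_B": "M" * 495,  # within the expected window
    "cand_C": "M" * 640,  # longer than 600 aa, needs re-validation
}

flagged = flag_out_of_range(candidates)
print(flagged)
```

Candidates returned by such a filter would then be passed to a gene prediction tool for re-annotation, as the text describes.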

Multiple sequence alignment of Citrus CYP proteins was performed using UPGMB clustering (gap opening penalty −2.9 and gap extension penalty 0) in the MUSCLE module (Edgar 2004) of the MEGA5.2.2 software. The phylogenetic tree was constructed using the Neighbor-Joining method with p-distance in MEGA5.2.2 (Tamura et al. 2011). Further, phylogenetic trees were also generated using the maximum likelihood (ML) method in MEGA5.2.2 (Hall 2013). Statistical support for the phylogenetic trees was assessed by bootstrap testing with 1,000 replications. Polymerase chain reaction (PCR) was performed using genomic DNA of C. sinensis and gene-specific primers corresponding to out-grouped genes of C. clementina. Amplification was performed for 35 cycles at 94 °C for 30 s, 55 °C for 30 s, and 72 °C for 30 s.
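For readers unfamiliar with the p-distance used for the Neighbor-Joining tree, the quantity is simply the proportion of aligned sites at which two sequences differ. The sketch below is a minimal illustration, not the MEGA implementation; skipping gapped columns pairwise is an assumption made here, and the aligned fragments are hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # assumption: columns with a gap in either sequence are ignored
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Hypothetical aligned protein fragments.
d = p_distance("MKT-LLVA", "MKTALLIA")
print(round(d, 3))
```

A distance matrix built from such pairwise values is the input that a Neighbor-Joining procedure clusters into a tree.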

The predicted amino acid sequences of Citrus CYPs were aligned with the genomic DNA sequences using the GeneWise program (http://www.ebi.ac.uk/Tools/psa/genewise/) to identify the positions of introns and exons and the intron phases. The lengths of introns were computed from the generated alignments. CYP coding sequences were also aligned to genomic sequences using the BLAT program (Kent 2002) to predict the number of introns, the exonic/intronic lengths, and the phasing of introns. To identify the evolutionary relationship of species-specific CYPs, BLAST search analysis was performed against the expressed sequence tags (ESTs) of different Citrus species (Babu et al. 2013). The dh C. sinensis proteome was used to identify genes with plausible alternate splicing. CYP gene clusters were identified using the scaffold information of C. clementina. Amino acid sequences of enzymes involved in the hesperidin (Bar-Peled et al. 1991) and luteolin (Martens et al. 2001; Britsch et al. 1990) biosynthetic pathways from MetaCyc (http://metacyc.org/) were aligned, and CYPs showing high homology were integrated into the pathway.
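Intron phase, as used throughout the Results, follows from where an intron interrupts the reading frame: phase 0 falls between two codons, phase 1 after the first nucleotide of a codon, and phase 2 after the second. A minimal sketch under the assumption that the exon lengths count coding nucleotides only, starting at the start codon (the exon lengths below are hypothetical):

```python
def intron_phases(cds_exon_lengths):
    """Phase of each intron, given the coding length of every exon in order."""
    phases = []
    coding_so_far = 0
    for length in cds_exon_lengths[:-1]:  # no intron follows the last exon
        coding_so_far += length
        phases.append(coding_so_far % 3)
    return phases


# Hypothetical four-exon gene with a 603-nt coding sequence.
phases = intron_phases([300, 151, 92, 60])
print(phases)
```

Tools such as GeneWise report these phases directly; the calculation is shown only to make the definition concrete.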

Results

Analysis of the genomes of C. clementina and dh C. sinensis revealed 296 and 272 CYP coding genes, respectively, which were grouped into nine clans (CYP51, 71, 72, 74, 85, 86, 97, 710, and 711) consisting of 43 families (Tables 1, 2, and 3, S1, S2, and S3, Figs. 1 and 2). In contrast, C. sinensis contained 202 CYP genes representing eight clans; the clan CYP74 could not be identified (Tables 1 and 2 and S1). The highest number of CYP genes (207/131/188 in C. clementina/C. sinensis/dh C. sinensis) is present in the CYP71 clan, which represents the whole set of A-type CYP genes belonging to 18 families: CYP71, CYP73, CYP75, CYP76, CYP77, CYP78, CYP79, CYP81, CYP82, CYP83, CYP84, CYP89, CYP93, CYP98, CYP701, CYP703, CYP706, and CYP712. A total of 89 non-A-type CYP genes are present in C. clementina, while dh C. sinensis and C. sinensis contained 84 and 71 genes, respectively. C. clementina, C. sinensis, and dh C. sinensis contained equal numbers of genes in 19 families: CYP51, CYP715, CYP735, CYP85, CYP87, CYP90, CYP718, CYP720, CYP722, CYP724, CYP97, CYP710, CYP711, CYP73, CYP77, CYP78, CYP84, CYP93, and CYP703. In C. clementina and dh C. sinensis, the families CYP71, CYP72, CYP76, CYP79, CYP81, CYP82, CYP83, CYP89, CYP94, CYP701, CYP706, CYP716, and CYP721 contained more genes than the corresponding families of C. sinensis (Tables 1 and 2, S1, S2, and S3). The remaining families showed differences of 1 to 3 genes.

In the CYP83 family, Csi-CYP83B6 is the only gene without introns and exhibited 97 and 96 % identity at the nucleotide and amino acid levels, respectively, with the Csi-CYP83B2 gene, which has three introns. The majority of the genes in the CYP89 and CYP94 families are intronless. In dh C. sinensis, dhCsi-CYP89A8 is the lone gene in the family with introns and is out-grouped in the phylogenetic tree. ESTs from sweet orange (EY668750.1) and Citrus reshni (FC924296.1) corresponding to this gene were identified in the database, whereas the search failed to identify a corresponding EST in C. clementina (Table S4). Ccl-CYP94D3 is the lone gene with introns in its family; however, the search failed to detect corresponding ESTs in the Citrus database.

Among A-type genes, single-intron genes predominate (66.1 to 79.8 %), while intronless genes represent 5.3 to 6.2 %. The remaining A-type genes (14.9 to 27.7 %) contain multiple introns. The majority of the non-A-type genes (73.0 to 78.9 %) contain multiple introns, while 12.7 to 22.5 % are intronless. The remaining non-A-type genes (4.5 to 8.4 %) have a single intron (Table 3). C. clementina A-type genes contained 480 exons and 286 introns in the size ranges of 6 to 1,097 and 30 to 2,106 bp, respectively (Table 3). C. sinensis A-type genes contained 296 exons and 173 introns with size ranges of 9 to 1,179 and 32 to 2,409 bp, respectively. dh C. sinensis A-type genes contained 407 exons in the range of 33 to 1,608 bp, while the size of the 219 introns varied from 30 to 6,303 bp. Non-A-type genes of C. clementina contained 422 exons and 353 introns in the size ranges of 7–1,287 and 29–3,491 bp, respectively. Non-A-type genes of C. sinensis possessed 375 exons and 313 introns in the size ranges of 5–1,489 and 30–2,875 bp, respectively (Table 3). dh C. sinensis non-A-type genes contained 434 exons and 349 introns in the size ranges of 31–1,626 and 62–4,032 bp, respectively. The mean exon length of A-type CYP genes is greater than that of non-A-type CYP genes.

Table 1 Family-wise distribution of A-type cytochrome P450 genes in C. clementina, C. sinensis, and dh C. sinensis

| CYP clan and family name | No. of genes, C. clementina | No. of genes, C. sinensis | No. of genes, dh C. sinensis | Plausible metabolic pathway(s) |
|---|---|---|---|---|
| CYP71 clan | | | | |
| CYP71 | 54 | 40 | 54 | Herbicide and camalexin |
| CYP73 | 2 | 2 | 2 | Phenylpropanoid and lignin |
| CYP75 | 2 | 1 | 2 | Phenylpropanoid and flavonoid |
| CYP76 | 23 | 10 | 20 | Terpenoid and indole alkaloid |
| CYP77 | 2 | 2 | 2 | Fatty acid and cutin |
| CYP78 | 7 | 7 | 7 | |
| CYP79 | 9 | 2 | 8 | Indole glucosinolate, camalexin and auxin |
| CYP81 | 9 | 6 | 10 | Indole glucosinolates |
| CYP82 | 28 | 14 | 27 | Tryptophan-derived secondary metabolites |
| CYP83 | 26 | 15 | 17 | Aliphatic, indole glucosinolates and ascorbate |
| CYP84 | 2 | 2 | 2 | Phenylpropanoid and lignin |
| CYP89 | 8 | 6 | 8 | |
| CYP93 | 8 | 8 | 8 | |
| CYP98 | 3 | 2 | 2 | Phenylpropanoid, lignin monomers and soluble phenolics |
| CYP701 | 2 | 1 | 2 | Gibberellin |
| CYP703 | 1 | 1 | 1 | Sporopollenin |
| CYP706 | 17 | 11 | 12 | |
| CYP712 | 4 | 1 | 4 | |

Among A-type genes, zero-phase introns were the most numerous in C. clementina/C. sinensis/dh C. sinensis (228/134/196), followed by phase one introns (38/19/17) and phase two introns (20/20/6). Among non-A-type genes, most introns of C. clementina/C. sinensis/dh C. sinensis (219/199/222) are in zero phase, followed by phase two (74/63/71) and phase one (60/51/56). The majority of introns (1,660) from the three genomes showed canonical GT-AG splice sites. However, a total of 33 introns from the three genomes showed non-conventional GC…PyAG splice sites (Table 3, Tables S1, S2, and S3). The largest number of introns with non-conventional splice sites was found in C. clementina (6+10), followed by C. sinensis (1+8) and dh C. sinensis (2+6) (Table 3, S1, S2, and S3). The CYP711A2 gene in the three genomes showed a 93-bp GC…CAG intron. Further, the corresponding EST (GenBank ID CX673883.1) showed the absence of this intron and the presence of consecutively positioned flanking exons. dh C. sinensis CYP genes on average code for a protein of 499 amino acids, followed by C. clementina (495 aa) and C. sinensis (470 and 481 aa) (Table 3).
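The intron counts quoted above can be cross-checked against one another with simple arithmetic. The sketch below uses the phase counts reported in Table 3 (A-type and non-A-type per genome) to recover the total number of introns, the share of zero-phase introns per genome, and the number of canonical GT-AG introns; the variable names are illustrative only.

```python
# Intron counts as (A-type, non-A-type) pairs, transcribed from Table 3.
PHASE_COUNTS = {
    "C. clementina": {"phase0": (228, 219), "phase1": (38, 60), "phase2": (20, 74)},
    "C. sinensis": {"phase0": (134, 199), "phase1": (19, 51), "phase2": (20, 63)},
    "dh C. sinensis": {"phase0": (196, 222), "phase1": (17, 56), "phase2": (6, 71)},
}
NON_CANONICAL_INTRONS = 33  # GC...PyAG introns across the three genomes


def summarize(phase_counts):
    """Return {genome: (total introns, percent zero-phase introns)}."""
    summary = {}
    for genome, counts in phase_counts.items():
        total = sum(sum(pair) for pair in counts.values())
        zero_phase = sum(counts["phase0"])
        summary[genome] = (total, round(100 * zero_phase / total, 1))
    return summary


summary = summarize(PHASE_COUNTS)
total_introns = sum(total for total, _ in summary.values())
canonical_introns = total_introns - NON_CANONICAL_INTRONS

for genome, (total, pct_zero) in summary.items():
    print(f"{genome}: {total} introns, {pct_zero} % zero phase")
print(f"canonical GT-AG introns: {canonical_introns}")
```

With these inputs, the total of 1,693 introns minus the 33 non-conventional ones gives the 1,660 canonical introns stated in the text.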

Table 2 Family-wise distribution of non-A-type cytochrome P450 genes in C. clementina, C. sinensis, and dh C. sinensis

| Clan and family name | No. of genes, C. clementina | No. of genes, C. sinensis | No. of genes, dh C. sinensis | Plausible metabolic pathway |
|---|---|---|---|---|
| CYP51 clan | | | | |
| CYP51 | 1 | 1 | 1 | Sterols/steroids |
| CYP72 clan | | | | |
| CYP72 | 9 | 4 | 11 | Brassinosteroids |
| CYP714 | 4 | 2 | 2 | Gibberellin |
| CYP715 | 2 | 2 | 2 | |
| CYP721 | 3 | 2 | 3 | |
| CYP734 | 5 | 4 | 3 | Brassinolide |
| CYP735 | 1 | 1 | 1 | Cytokinin |
| CYP74 clan | | | | |
| CYP74 | 3 | 0 | 3 | Jasmonate and oxylipin |
| CYP85 clan | | | | |
| CYP85 | 2 | 2 | 2 | Brassinolide |
| CYP87 | 3 | 3 | 3 | |
| CYP88 | 2 | 4 | 4 | Gibberellins |
| CYP90 | 4 | 4 | 4 | Brassinolide |
| CYP707 | 5 | 5 | 4 | Abscisic acid |
| CYP716 | 9 | 8 | 9 | Hesperidin and luteolin |
| CYP718 | 1 | 1 | 1 | |
| CYP720 | 1 | 1 | 1 | |
| CYP722 | 3 | 3 | 3 | |
| CYP724 | 2 | 2 | 2 | Brassinolide |
| CYP86 clan | | | | |
| CYP86 | 5 | 5 | 4 | Fatty acid and cutin |
| CYP94 | 10 | 4 | 7 | Jasmonate and its mediated signaling |
| CYP96 | 4 | 3 | 2 | Wax and eriodictyol |
| CYP704 | 4 | 4 | 6 | Sporopollenin |
| CYP97 clan | | | | |
| CYP97 | 3 | 3 | 3 | Carotenoid |
| CYP710 clan | | | | |
| CYP710 | 1 | 1 | 1 | Sterols |
| CYP711 clan | | | | |
| CYP711 | 2 | 2 | 2 | |

Conserved amino acids, coding sequences, positions of exons and introns, and phasing of introns were observed between the genes of the CYP74 clan of C. clementina and dh C. sinensis. In general, conserved gene organization across the three genomes was recorded among the families representing the basic set of genes (Tables S1, S2, and S3). dhCsi-CYP51G1 and Csi-CYP51G1 exhibited 100 % amino acid identity and conserved intron and exon sequences, with a single C/T change at nucleotide 741 of the CDS. Compared with these two genes, Ccl-CYP51G1 showed 99 % amino acid identity, with 15 nucleotide changes distributed across the CDS. Interestingly, the intron of both dhCsi-CYP51G1 and Csi-CYP51G1 showed an AAT trinucleotide repeated nine times, while the Ccl-CYP51G1 intron displayed the AAT trinucleotide repeated four times. In the three genomes, at least one gene in each family was recorded to have a conserved organization. The CYP718 gene is conserved in the three genomes, with two putative alternate translation initiation sites 120 nucleotides apart. In general, the structural organization of multiple-intron genes showed a higher degree of conservation between the three genomes than that of single-intron genes.
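The AAT repeat comparison reported for the CYP51G1 intron amounts to counting the longest uninterrupted run of the trinucleotide. A minimal sketch is given below; the flanking sequences are invented for illustration and are not the actual intron sequences.

```python
import re


def longest_repeat_run(sequence, unit="AAT"):
    """Largest number of consecutive copies of `unit` found in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [
        len(match.group(0)) // len(unit)
        for match in pattern.finditer(sequence.upper())
    ]
    return max(runs, default=0)


# Hypothetical intron fragments carrying nine and four AAT units.
intron_nine = "GTAAGC" + "AAT" * 9 + "GCTTAG"
intron_four = "GTAAGC" + "AAT" * 4 + "GCTTAG"

print(longest_repeat_run(intron_nine), longest_repeat_run(intron_four))
```

A difference in run length between orthologous introns is the kind of length polymorphism that could be scored as a marker.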

The multiple-intron gene CYP97A3 in the C. clementina and C. sinensis genomes is conserved with regard to the size and number of exons (16), phasing (1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0), and number of introns (15). However, the length of the seventh intron (Ccl, 2,702 bp; Csi, 2,875 bp) varied between the two species (Tables S1 and S2). In contrast, the 102-bp exon present in these two orthologs is not found in dhCsi-CYP97A3 and is instead included in the eighth intron of the gene. The seventh intron of this ortholog also showed variation in its size (4,032 bp) (Table S3). ESTs EY770031.1 and EY742825.1 represented the 6th to 12th exons, while EST EY687747.1 represented the 13th to 16th exons and EST EY704803.1 represented the 8th to 16th exons of the CYP97A3 gene. An EST without the 102-bp 9th exon was not found in the EST database of Citrus.

Table 3 Distribution of exons, introns, and phasing of introns in CYP450 genes of Citrus clementina, Citrus sinensis, and dh Citrus sinensis

| Description | Ccl-A-type | Csi-A-type | dhCsi-A-type | Ccl-non-A-type | Csi-non-A-type | dhCsi-non-A-type |
|---|---|---|---|---|---|---|
| No. of families | 18 | 18 | 18 | 25 | 24 | 25 |
| No. of subfamilies | 32 | 17 | 31 | 25 | 19 | 29 |
| No. of genes | 207 | 131 | 188 | 89 | 71 | 84 |
| No. of intronless genes | 13 | 8 | 11 | 20 | 9 | 16 |
| No. of genes with single intron | 137 | 93 | 149 | 4 | 6 | 5 |
| No. of genes with 2 introns | 37 | 19 | 21 | 6 | 4 | 4 |
| No. of genes with 3 introns | 13 | 6 | 4 | 12 | 9 | 10 |
| No. of genes with 4 introns | 2 | 2 | 1 | 13 | 12 | 17 |
| No. of genes with 5 introns | 2 | 2 | – | 7 | 5 | 5 |
| No. of genes with 6 introns | 3 | 1 | 2 | 5 | 7 | 7 |
| No. of genes with 7 introns | – | – | – | 5 | 6 | 8 |
| No. of genes with 8 introns | – | – | – | 14 | 9 | 8 |
| No. of genes with 9 introns | – | – | – | 1 | 2 | 1 |
| No. of genes with 10 introns | – | – | – | – | 1 | – |
| No. of genes with 13 introns | – | – | – | 1 | – | 1 |
| No. of genes with 14 introns | – | – | – | – | – | 1 |
| No. of genes with 15 introns | – | – | – | 1 | 1 | 1 |
| No. of exons | 480 | 296 | 407 | 422 | 375 | 434 |
| No. of introns | 286 | 173 | 219 | 353 | 313 | 349 |
| No. of introns with non-conventional splice site ↓GC…PyAG↓ | 6 | 1 | 2 | 10 | 8 | 6 |
| Size range of exons (bp) | 6–1,097 | 9–1,179 | 33–1,608 | 7–1,287 | 5–1,489 | 31–1,626 |
| Mean exon length (bp) | 601 | 586 | 691 | 241 | 236 | 294 |
| Size range of introns (bp) | 30–2,106 | 32–2,409 | 30–6,303 | 29–3,491 | 30–2,875 | 62–4,032 |
| Mean intron length (bp) | 333 | 299 | 389 | 329 | 265 | 329 |
| Range of amino acids | 303–621 | 307–556 | 304–568 | 328–612 | 314–621 | 354–582 |
| Mean no. of amino acids | 495 | 470 | 499 | 495 | 481 | 499 |
| No. of phase 0 introns | 228 | 134 | 196 | 219 | 199 | 222 |
| No. of phase 1 introns | 38 | 19 | 17 | 60 | 51 | 56 |
| No. of phase 2 introns | 20 | 20 | 6 | 74 | 63 | 71 |

Ccl, C. clementina; Csi, C. sinensis; dhCsi, dh C. sinensis

In the genus Citrus, a total of 22 CYP genes were identified with plausible alternate splicing. Fourteen genes (dhCsi-CYP71B11, dhCsi-CYP79A3P, dhCsi-CYP82F3, dhCsi-CYP82G3, dhCsi-CYP72A7, dhCsi-CYP714A3, dhCsi-CYP714A1, dhCsi-CYP85A1, dhCsi-CYP87A6, dhCsi-CYP88C1, dhCsi-CYP704A3, dhCsi-CYP704B1, dhCsi-CYP97A3, and dhCsi-CYP97B3) have two alternate spliced products, five genes (dhCsi-CYP88B1, dhCsi-CYP90D3, dhCsi-CYP716A5, dhCsi-CYP707A1, and dhCsi-CYP734B2) have three spliced products, two genes (dhCsi-CYP707A5 and dhCsi-CYP72C3) have four spliced products, and one gene (dhCsi-CYP715A6) has eight spliced products. An EST (EY674371.1) corresponding to the dhCsi-CYP97A3 gene includes the 11th intron of 498 bp, while another EST (EY742825.1) lacking this intron represents an alternate spliced product. The dhCsi-CYP707A5 gene showed four alternate spliced products. ESTs DY291781.1 and CF504502.1 probably represent the full-length transcripts of the dhCsi-CYP707A5 gene. Another EST, DC894727.1, contained the fifth intron of 112 bp between the fifth and sixth exons of 90 and 79 bp, respectively. EST DC894728.1 showed the inclusion of the eighth intron of 488 bp.
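The alternate splicing counts listed above can be tallied in a few lines; the snippet only restates the numbers given in the text (genes grouped by the number of spliced products each shows) and the names used are illustrative.

```python
# (number of genes, spliced products per gene), as listed in the text.
splice_classes = [(14, 2), (5, 3), (2, 4), (1, 8)]

genes_with_alt_splicing = sum(n_genes for n_genes, _ in splice_classes)
total_spliced_products = sum(n_genes * n_products for n_genes, n_products in splice_classes)

print(genes_with_alt_splicing, total_spliced_products)
```

The tally gives 22 genes and 59 spliced products in total.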

Examination of the scaffolds of C. clementina revealed that most of the CYP genes in C. clementina are solitary and a few genes are grouped in clusters. Two clusters of eight genes (Ccl-CYP76D2, Ccl-CYP76C5, Ccl-CYP76E4, Ccl-CYP76D7, Ccl-CYP76E3, Ccl-CYP76D1, Ccl-CYP76D4, and Ccl-CYP76C7) and (Ccl-CYP82D2, Ccl-CYP82F1, Ccl-CYP82D5, Ccl-CYP98A9, Ccl-CYP82F2, Ccl-CYP82C2, Ccl-CYP706A5, and Ccl-CYP71C7), a seven-gene cluster (Ccl-CYP87D7, Ccl-CYP74A, Ccl-CYP82D2, Ccl-CYP82F1, Ccl-CYP82D5, Ccl-CYP82F2, and Ccl-CYP82C2), a four-gene cluster (Ccl-CYP83B3, Ccl-CYP83B1, Ccl-CYP83B4, and Ccl-CYP83B2), and three three-gene clusters (Ccl-CYP82C9, Ccl-CYP82C10, and Ccl-CYP82C11), (Ccl-CYP71B14, Ccl-CYP89A5, and Ccl-CYP71B35), and (Ccl-CYP77B1, Ccl-CYP710A1, and Ccl-CYP83A4) were observed.

Fig. 1 ML-based phylogenetic tree of A-type cytochrome P450 proteins of C. clementina and dh C. sinensis. “Ccl” and “dhCsi” represent CYP proteins of C. clementina and dh C. sinensis, respectively

In the phylogenetic tree, 11 C. clementina CYP genes (Ccl-CYP71B24, Ccl-CYP71D2, Ccl-CYP76F2, Ccl-CYP83C2, Ccl-CYP706B1, Ccl-CYP734B1, Ccl-CYP734A3, Ccl-CYP94C1, Ccl-CYP96A1, Ccl-CYP706A16, and Ccl-CYP86C1) are without orthologs and are out-grouped (Figs. 1 and 2). An EST search with the coding sequences of these genes failed to identify corresponding ESTs in C. sinensis (Table S4). The C. clementina genes Ccl-CYP82D6, Ccl-CYP82F7, Ccl-CYP98A9, Ccl-CYP72A8, Ccl-CYP72C1, Ccl-CYP714A2, Ccl-CYP87A4, Ccl-CYP707A6, and Ccl-CYP82D5 were out-grouped in the phylogenetic tree, and their orthologs were not found in dh C. sinensis (Figs. 1 and 2). However, C. sinensis EST searches revealed corresponding ESTs with more than 90 % homology for these genes (Table S4). Genes Ccl-CYP76F1, Ccl-CYP83B11, Ccl-CYP83A10, Ccl-CYP83A9, Ccl-CYP89A9, and Ccl-CYP706A12 are the duplicate genes of Ccl-CYP76D7, Ccl-CYP83B9, Ccl-CYP83A9, Ccl-CYP83C3, Ccl-CYP89A7, and Ccl-CYP706A14, respectively. Orthologs of CYP71A15, CYP71A21, CYP71A28, CYP94D3, CYP83A5, and CYP83B6 of C. clementina and dh C. sinensis are scattered in the phylogenetic tree. The phylogenetic tree showed that the genes dhCsi-CYP81F4, dhCsi-CYP89A8, dhCsi-CYP88B1, and dhCsi-CYP88C1 of dh C. sinensis are out-grouped (Figs. 1 and 2). The corresponding orthologs/ESTs of C. clementina were not detected. However, the out-grouped gene dhCsi-CYP87A5 identified a corresponding EST of C. clementina. The phylogenetic tree also revealed that dhCsi-CYP79A5 is the duplicate gene of dhCsi-CYP79A3P (Table S4). PCR analysis with C. sinensis genomic DNA using gene-specific primers for out-grouped genes of C. clementina failed to amplify genes corresponding to Ccl-CYP706A16 and Ccl-CYP706B1.

Fig. 2 ML-based phylogenetic tree of non-A-type cytochrome P450 proteins of C. clementina and dh C. sinensis. “Ccl” and “dhCsi” represent CYP proteins of C. clementina and dh C. sinensis, respectively

The dh C. sinensis CYP proteins dhCsi-CYP71A27, dhCsi-CYP71C1, and dhCsi-CYP85A1 and the C. clementina CYP protein Ccl-CYP85A1, showing 100 % homology to the enzymes involved in the biosynthesis of hesperidin and luteolin, are integrated into the biosynthetic pathways (Table 4 and Fig. 3).

Discussion

A total of 296 CYP genes in C. clementina, 202 in C. sinensis, and 272 in dh C. sinensis were identified in the present study, corresponding to 0.81 to 1.18 % of the protein-coding genes of their genomes. This observation is in conformity with CYPs representing 0.57 to 1.07 % of protein-coding genes in various plant species (Nelson et al. 2004; Nelson and Werck-Reichhart 2011; Guttikonda et al. 2010). The larger number of A-type CYP genes (207/131/188) than non-A-type CYP genes (89/71/84) in the three genomes suggests rapid expansion of A-type genes relative to non-A-type genes in the Citrus genome. These results are similar to our earlier reports on flax (Babu et al. 2013) and castor (Kumar et al. 2013). The single-family clans CYP51, CYP97, CYP710, CYP711, and CYP74, with few genes, are plausibly ancient and may code for enzymes associated with pivotal metabolism, thereby limiting their diversification. Earlier reports also indicated the presence of orthologs of CYP51, CYP97, CYP710, and CYP711 in green algae (Nelson 2006), which confirms their ancient nature. The larger number of single-intron genes in A-type families (137/93/149) than in non-A-type families (4/6/5) indicates the participation of single-intron genes in gene amplification and the rapid evolution of paralogs with diversified functions (Babu et al. 2013). The smaller number of genes in non-A-type families may be due to the prevalence of multiple-intron genes specifying essential functions, thereby limiting their amplification in the Citrus genome. Non-A-type CYP genes are more ancient than A-type genes, and their structural organization requires more time for gene duplication and rearrangement, contributing to their slow evolution (Guttikonda et al. 2010). In the CYP83 family, Csi-CYP83B6 is the only intronless gene and exhibits 97 % nucleotide identity with the three-intron gene Csi-CYP83B2. This suggests that the intronless gene either evolved by reverse transcription from the transcript of the intron-containing gene or is an assembly artifact, as the study failed to identify the footprints of retroposon-mediated integration.
Prevalenceof intron containing CYP genes (88.9 to 92.7 %) over intronless CYP genes (7.3 to 11.1 %) indicates CYP genes mighthave originated from an intron containing ancestral gene andintron less genes inCitrusmight have originated by the loss ofintrons. Prevalence of zero phase introns (68.5 to 73.6 %)suggest that the intron containing genes might have evolvedby the inclusion of scattered segments into a functional

Table 4 Citrus cytochromeP450s involved in the luteolin andhesperidin biosynthetic pathway

CYP protein Enzyme Homology % Genbank ID and reference

dhCsi-CYP71A27 1,2 Rhamnosyltransferase 100 AAL06646.2

(Bar-Peled et al.1991)dhCsi-CYP71C1 100

Ccl-CYP716A1 86

Ccl-CYP85A1 Flavone synthase I 100 AAP57393.1

(Martens et al. 2001)dhCsi-CYP85A1 100

Ccl-CYP71B16 88

Ccl-CYP76C5 88

Ccl-CYP76E1 86

dhCsi-CYP71B16 86

dhCsi-CYP85A1 Flavone synthase I 100 AAP57393.1

(Britsch et al. 1990)dhCsi-CYP71B16 88

406 Tree Genetics & Genomes (2014) 10:399–409

Page 9: Structural organization, classification and phylogenetic ...

transcription unit. Identification of a few introns with non-conventional GC…PyAG splice sites implies that these rare introns might have originated recently. The presence of a conserved 93-bp GC…CAG intron in CYP711A2 in all three genomes, together with a corresponding EST (GenBank ID CX673883.1) lacking this intron, clearly demonstrates that this intron with non-conventional splice sites is functional. The equal number of genes observed in the clan CYP74 of C. clementina and dh C. sinensis showed conserved structural organization, indicating the pivotal role of these gene products in the Citrus species. However, the failure to identify the CYP74 clan in C. sinensis may be attributed to the large gap in the quality, integrity, and coverage of sequence between the C. clementina/dh C. sinensis and C. sinensis genomes. The C. clementina/dh C. sinensis genomes are of good quality, with scaffold N50 values of 6.8/1.69 Mb, while that of C. sinensis is only 0.25 Mb. C. clementina and dh C. sinensis, with three genes in the CYP74 clan, probably execute the pathway of allene-oxide biosynthesis, which may contribute to their ability to resist various pathogens. Transgenic rice lines overexpressing the OsAOS2 transgene encoding allene-oxide synthase accumulated abundant OsAOS2 transcripts and higher levels of jasmonic acid, which contributed to the enhanced activation of pathogenesis-related genes, resulting in increased resistance to Magnaporthe grisea infection (Mei et al. 2006). Variable lengths of the AAT tri-nucleotide repeat observed in the intron of CYP51G1 of clementine and sinensis indicate polymorphism between the species and can serve as a molecular marker for the identification of these species. In all three genomes, the multiple-intron-containing gene CYP97A3 is conserved with regard to its exons as well as its introns, except for the variable length (2,702; 2,875; and 4,032 bp) of the seventh intron, which suggests a contribution of introns to the expansion of genomes. The size variation recorded may also arise from misassembly of sequences. However, misassembly occurring at the same location in three genomes is quite unlikely. Hence, it is presumed that the variable length of the seventh intron exists between the orthologs. The missing 102-bp ninth exon in dhCsi-CYP97A3 might be a misassembly, as supported by the sweet orange ESTs (EY770031.1 and EY742825.1) containing this exon. Twenty-two CYP genes of dh C. sinensis were identified to have plausible alternate splicing resulting in 59 differently spliced products. An EST (EY674371.1) corresponding to the dhCsi-CYP97A3 gene includes the 11th intron of 498 bp, while another EST (EY742825.1) lacking this intron represents an alternately spliced product and supports alternate splicing in Citrus. The CYP718 gene is conserved in the three genomes with two putative alternate translation initiation sites falling 120 nucleotides apart, indicating that certain CYP genes of Citrus produce proteins of variable length by employing alternate translation initiation sites.

Fig. 3 Citrus cytochrome P450 genes involved in the luteolin and hesperidin biosynthetic pathway
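The distinction drawn above between canonical GT-AG introns and the rarer GC…PyAG introns can be illustrated with a short sketch. It assumes that each intron is supplied 5'→3' on the coding strand. The function name classify_intron and the toy sequences are hypothetical; they are not taken from the Citrus genomes or from the pipeline used in this study.

```python
def classify_intron(seq: str) -> str:
    """Classify an intron by its terminal splice-site nucleotides.

    Returns "GT-AG" for canonical introns, "GC-PyAG" for introns that
    start with GC and end with a pyrimidine (C or T) followed by AG,
    and "other" for everything else.
    """
    s = seq.upper()
    if len(s) < 5:
        raise ValueError("sequence too short to carry both splice sites")
    donor = s[:2]        # first two nucleotides (5' splice site)
    acceptor = s[-3:]    # last three nucleotides (3' splice site)
    if donor == "GT" and acceptor.endswith("AG"):
        return "GT-AG"
    if donor == "GC" and acceptor[0] in "CT" and acceptor.endswith("AG"):
        return "GC-PyAG"
    return "other"


# Synthetic 93-nt sequence: same length as the CYP711A2 intron described
# in the text, but NOT its real sequence (assumption for illustration).
toy_gc_intron = "GC" + "A" * 88 + "CAG"
toy_gt_intron = "GT" + "A" * 20 + "TAG"

print(len(toy_gc_intron), classify_intron(toy_gc_intron))
print(len(toy_gt_intron), classify_intron(toy_gt_intron))
```

Checking only the terminal nucleotides is a deliberate simplification: it says nothing about branch points or about whether an intron is actually spliced, which in the text is supported by ESTs lacking the intron.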

C. clementina genes Ccl-CYP82D6, Ccl-CYP82F7, Ccl-CYP98A9, Ccl-CYP72A8, Ccl-CYP72C1, Ccl-CYP714A2, Ccl-CYP87A4, Ccl-CYP707A6, and Ccl-CYP82D5 out-grouped in the phylogenetic tree, and their orthologs were not detected in dh C. sinensis (Figs. 1 and 2). Further, EST searches of sweet orange revealed corresponding ESTs of more than 90 % homology for these genes, indicating their presence in the dh C. sinensis genome. Similarly, the out-grouped gene dhCsi-CYP87A5 identified a corresponding EST of C. clementina, indicating its presence in the C. clementina genome. Orthologs of CYP71A15, CYP71A21, CYP71A28, CYP94D3, CYP83A5, and CYP83B6 of C. clementina and dh C. sinensis are scattered in the phylogenetic tree, suggesting their divergent evolution. PCR analysis with C. sinensis genomic DNA using gene-specific primers of out-grouped genes of C. clementina showed no amplification of orthologs corresponding to the Ccl-CYP706A16 and Ccl-CYP706B1 genes, indicating their absence in the C. sinensis genome. These results further confirm that the Ccl-CYP706A16 and Ccl-CYP706B1 genes are specific to C. clementina. Compared with C. clementina, the smaller number of CYP genes observed in dh C. sinensis/C. sinensis is attributed to its clonal multiplication and the lesser diversity of its ancestor C. maxima. On the other hand, C. clementina evolved through inter-specific hybridization of ancestral mandarins, pomelos, and sweet oranges, followed by rigorous selection for pest resistance and adaptability to diverse environmental conditions. Earlier studies mentioned that modern cultivars of Citrus have an inter-specific origin and contain genomes with mosaics of large DNA fragments originating from various basic taxa (Garcia-Lor et al. 2012). Further, the C. maxima and C. reticulata gene pools might have contributed to the cultivated Citrus species (Ollitrault et al. 2012). The classification of CYP genes of Citrus species will be of great help in utilizing specific orthologs/paralogs for Citrus improvement programs.
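As a rough illustration of how genes without a detected counterpart in the other genome could be flagged, the sketch below compares CYP names after stripping the genome prefix (Ccl- or dhCsi-). Matching by name is an assumption made here for brevity; the study itself relied on phylogenetic grouping, EST searches, and PCR. The helper names and the short input lists are hypothetical, although the gene names in them are ones mentioned in the text.

```python
def cyp_name(gene_id: str) -> str:
    """Strip the genome prefix, e.g. 'Ccl-CYP82D6' -> 'CYP82D6'."""
    prefix, _, name = gene_id.partition("-")
    if not name:
        raise ValueError(f"no genome prefix found in {gene_id!r}")
    return name


def without_counterpart(genes_a, genes_b):
    """Return genes from genes_a whose CYP name is absent from genes_b."""
    names_b = {cyp_name(g) for g in genes_b}
    return sorted(g for g in genes_a if cyp_name(g) not in names_b)


# Toy subsets only (assumption): the real gene lists are in Tables S1-S3.
ccl_genes = ["Ccl-CYP51G1", "Ccl-CYP97A3", "Ccl-CYP82D6"]
dhcsi_genes = ["dhCsi-CYP51G1", "dhCsi-CYP97A3", "dhCsi-CYP87A5"]

print(without_counterpart(ccl_genes, dhcsi_genes))
print(without_counterpart(dhcsi_genes, ccl_genes))
```

A gene flagged this way is only a candidate: as the EST evidence in the text shows, an ortholog can be missing from an assembly and still be present in the genome.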

Conclusion

The study revealed a total of 296, 272, and 202 CYP genes in C. clementina, dh C. sinensis, and C. sinensis, respectively. All the genes are distributed in 43 families representing nine clans. The A-type CYP genes (207/188/131) in the single clan CYP71 are represented by 18 families. The C. clementina and dh C. sinensis non-A-type CYP genes (89/84) in eight clans are distributed into 25 families. Among A-type CYP genes, single-intron-containing genes are predominant, while multiple-intron-containing genes represent the bulk of non-A-type CYP genes. The majority of families (19) contained a similar number of genes in the three genomes. In general, conserved gene organization was observed between the three genomes. The majority of introns (1,660) from the three genomes showed canonical GT-AG splice sites. However, 33 introns showed non-conventional GC…PyAG splice sites, and the functionality of these splice sites is confirmed by ESTs lacking these introns. In dh C. sinensis, 22 genes were identified to have an alternate splicing pattern. Scaffold examination in C. clementina revealed that the majority of the Citrus CYP genes are solitary and a few of them are in clusters of 3–8 genes. PCR amplification experiments with C. sinensis genomic DNA and gene-specific primers failed to amplify the two out-grouped genes Ccl-CYP706A16 and Ccl-CYP706B1, confirming that they are specific to C. clementina. The differential number of CYP genes observed between the genomes of C. clementina and dh C. sinensis may be attributed to the variability among the parents representing ancestral taxa.
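The gene counts summarized above can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated in this section; the dictionary and function names are hypothetical, and the non-A-type figure it derives for C. sinensis is a subtraction performed here, not a value reported in the text.

```python
# Counts as reported in the Conclusion.
TOTAL = {"C. clementina": 296, "dh C. sinensis": 272, "C. sinensis": 202}
A_TYPE = {"C. clementina": 207, "dh C. sinensis": 188, "C. sinensis": 131}
NON_A_REPORTED = {"C. clementina": 89, "dh C. sinensis": 84}


def non_a_type(genome: str) -> int:
    """Non-A-type genes = total CYP genes minus A-type genes."""
    return TOTAL[genome] - A_TYPE[genome]


def a_type_share(genome: str) -> float:
    """Percentage of CYP genes that are A-type."""
    return 100.0 * A_TYPE[genome] / TOTAL[genome]


# Reported non-A-type counts agree with total minus A-type.
for genome, reported in NON_A_REPORTED.items():
    assert non_a_type(genome) == reported

# 18 A-type families plus 25 non-A-type families give the 43 families.
assert 18 + 25 == 43

for genome in TOTAL:
    print(f"{genome}: non-A-type = {non_a_type(genome)}, "
          f"A-type share = {a_type_share(genome):.1f} %")
```

For C. sinensis the derived difference is 202 − 131 = 71 non-A-type genes.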

Acknowledgments We gratefully acknowledge Prof. T. Papi Reddy, Former Head, Department of Genetics, Osmania University, for critical reading of the manuscript. Thanks are due to Dr. P. R. Babu, Research Associate, Centre for Plant Molecular Biology, Osmania University, for his kind help.

Data archiving Accession numbers of all the CYPs reported in the study are provided in Table S1, Table S2, and Table S3 and can be retrieved from http://www.phytozome.net/ and http://citrus.hzau.edu.cn/orange/download/data.php.

References

Babu PR, Rao KV, Reddy VD (2013) Structural organization and classification of cytochrome P450 genes in flax (Linum usitatissimum L.). Gene 513:156–162

Bar-Peled M, Lewinsohn E, Fluhr R, Gressel J (1991) UDP-rhamnose:flavanone-7-O-glucoside-2-O-rhamnosyltransferase. Purification and characterization of an enzyme catalyzing the production of bitter compounds in citrus. J Biol Chem 266(31):20953–20959

Britsch L (1990) Purification and characterization of flavone synthase I, a 2-oxoglutarate-dependent desaturase. Arch Biochem Biophys 282(1):152–160

Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792–1797

Garcia-Lor A, Luro F, Navarro L, Ollitrault P (2012) Comparative use of InDel and SSR markers in deciphering the interspecific structure of cultivated citrus genetic diversity: a perspective for genetic association studies. Mol Genet Genomics 287(1):77–94

Gmitter FG, Chen C, Machado MA, de Souza AA, Ollitrault P, Froehlicher Y, Shimizu T (2012) Citrus genomics. Tree Genetics & Genomes 8:611–626

Guttikonda SK, Trupti J, Bisht NC, Chen H, An YQC, Pandey S, Xu D, Yu O (2010) Whole genome co-expression analysis of soybean cytochrome P450 genes identifies nodulation-specific P450 monooxygenases. BMC Plant Biol 10:243

Hall BG (2013) Building phylogenetic trees from molecular data with MEGA. Mol Biol Evol 30:1229–1235

Kent WJ (2002) BLAT—the BLAST-like alignment tool. Genome Res12:656–664


Krug CA (1943) Chromosome numbers in the subfamily Aurantioideae, with special reference to the genus Citrus. Bot Gaz 104:602–611

Kumar MS, Babu PR, Rao KV, Reddy VD (2013) Organization and classification of cytochrome P450 genes in Castor (Ricinus communis L.). Proc Natl Acad Sci, India, Sect B Biol Sci. doi:10.1007/s40011-013-0192-8

Liu Y, Heying E, Sherry A, Tanumihardjo SA (2012) History, global distribution, and nutritional importance of citrus fruits. Comprehensive Reviews in Food Science and Food Safety 11:530–545

Malik SK, Rohini MR, Kumar S, Choudhary R, Pal D, Chaudhury R (2012) Assessment of genetic diversity in Sweet Orange [Citrus sinensis (L.) Osbeck] cultivars of India using morphological and RAPD markers. Agric Res 1(4):317–324

Martens S, Forkmann G, Matern U, Lukacin R (2001) Cloning of parsley flavone synthase I. Phytochemistry 58(1):43–46

Mei C, Qi M, Sheng G, Yang Y (2006) Inducible overexpression of a rice allene oxide synthase gene increases the endogenous jasmonic acid level, PR gene expression, and host resistance to fungal infection. Mol Plant Microbe Interact 19:1127–1137

Nelson DR (2006) Plant cytochrome P450s from moss to poplar. Phytochem Rev 5:193–204

Nelson D, Werck-Reichhart D (2011) A P450-centric view of plant evolution. Plant J 66:194–211

Nelson DR, Schuler MA, Paquette SM, Werck-Reichhart D, Bak S (2004) Comparative genomics of Oryza sativa and Arabidopsis thaliana. Analysis of 727 Cytochrome P450 genes and pseudogenes from a monocot and a dicot. Plant Physiol 135:756–772

Ollitrault P, Terol J, Chen C, Federici CT, Lotfy S, Hippolyte I, Ollitrault F, Bérard A, Chauveau A, Cuenca J, Costantino G, Kacar Y, Mu L, Garcia-Lor A, Froelicher Y, Aleza P, Boland A, Billot C, Navarro L, Luro F, Roose ML, Gmitter FG, Talon M, Brunel D (2012) A reference genetic map of C. clementina hort. ex Tan. citrus evolution inferences from comparative mapping. BMC Genomics 13:593

Phytozome (2011) Haploid Clementine Genome, International Citrus Genome Consortium, http://int-citrusgenomics.org/, http://www.phytozome.net/clementine

Schuler MA, Werck-Reichhart D (2003) Functional genomics of P450s. Annu Rev Plant Biol 54:629–667

Scora RW (1975) On the history and origin of Citrus. Bull Torrey Bot Club 102:369–375

Smigocki AC, Wilson D (2004) Pest and disease resistance enhanced by heterologous suppression of a Nicotiana plumbaginifolia cytochrome P450 gene CYP72A2. Biotechnol Lett 26:1809–1814

Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S (2011) MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol 28:2731–2739

Xu Q, Chen LL, Ruan X, Chen D, Zhu A, Chen C, Denis B, Wen-Biao J, Bao-Hai H, Lyon MP et al (2013) The draft genome of sweet orange (Citrus sinensis). Nat Genet 45:59–66
