Sequencing of plant genomes - A review
Mine Türktaş 1, Kuaybe Yücebilgili Kurtoğlu 2, Gabriel Dorado 3, Baohong Zhang 4, Pilar Hernandez 5, Turgay Unver 1 *
1 Cankiri Karatekin University, Faculty of Science, Department of Biology, Cankiri, Turkey
2 Marmara University, Faculty of Arts and Science, Department of Biology, Istanbul, Turkey
3 Dep. Bioquímica y Biología Molecular, Campus Rabanales C6-1-E17, Campus de Excelencia Internacional Agroalimentario, Universidad de Córdoba, 14071 Córdoba, Spain
4 Department of Biology, East Carolina University, Greenville, NC 27858, United States of America
5 Instituto de Agricultura Sostenible (IAS-CSIC), Alameda del Obispo s/n, 14080 Córdoba, Spain
* Corresponding author: Turgay Unver, Faculty of Science, Department of Biology, Cankiri Karatekin University, 18100, Cankiri, Turkey
Email: [email protected], [email protected]
Tel: 0090376 218 95 40
Fax: 0090376 218 95 41
in a quick and rather inexpensive way (Henry et al., 2014). Nucleotide polymorphism and
copy-number variant detection using this method was also carried out in another study,
on the switchgrass Panicum virgatum (Evans et al., 2014). In that study, a total of 1,395,501
SNPs and 8,173 putative copy-number variants were detected, demonstrating the applicability of
exome capture to genomic-variation studies in polyploid species with large, repetitive and
heterozygous genomes. In a similar study carried out in hexaploid wheat (T.
aestivum), a total of 10,251 SNP markers were developed by targeted re-sequencing
of the wheat exome, producing large genomic datasets for eight varieties. These exome-based
SNP markers are a valuable resource, especially for wheat breeders (Allen et al., 2013).
5. Sequenced plant genomes
Along with the breakthrough in sequencing technology, there has been a great accumulation
of genome-sequence data for plant species (Figure 1). The application of the new sequencing
technologies to plant genomes has given rise to rapid improvements in crop science. The availability
of genomic sequences and easy access to such data have enabled researchers to discover and develop
genetic markers, improve breeding and reveal evolutionary relationships between the
sequenced species via comparative genomic analysis in general and synteny approaches in
particular. Currently, bread wheat (Triticum aestivum var. Chinese Spring, 2n = 6x = 42),
a major staple food with an annual production of ~700 million tonnes
<http://www.fao.org>, is being sequenced by the International Wheat Genome Sequencing
Consortium (IWGSC), adopting a chromosome-by-chromosome approach. Owing to the huge
size and complex nature of the wheat genome (17 Gbp, AABBDD), researchers have sorted
chromosomes and analyzed synteny with model grass genomes (Choulet et al., 2014).
Much effort has gone into elucidating the genomic background of wheat, in order to improve
grain yield and quality in the face of limiting factors such as biotic and abiotic stresses.
Thus, 454 pyrosequencing was used to survey individual chromosomes (Vitulo et al.,
2011; Hernandez et al., 2012; Poursarebani et al., 2014; Sergeeva et al., 2014). Recently, a draft
bread wheat (T. aestivum) genome was obtained by Illumina sequencing of flow-sorted
chromosomes (IWGSC, 2014) and published simultaneously with the first reference sequence
of a wheat chromosome (3B) (Choulet et al., 2014). Comparative gene analyses of
wheat subgenomes and extant diploid and tetraploid wheat relatives showed that both high
sequence similarity and structural conservation are retained, with limited gene loss after
polyploidization. The study showed evidence of dynamic gene gain, loss and duplication
across the genomes. Such alterations may have played a critical role in the adaptation of wheat
to a diverse set of climatic conditions (Langridge, 2012).
Before the bread wheat genome draft, the draft genome sequences of two progenitors of
hexaploid wheat had been published simultaneously: Triticum urartu and Aegilops tauschii
(Jia et al., 2013; Ling et al., 2013). Triticum urartu (AA, 2n = 2x = 14), the progenitor of the
A genome of wheat (Chantret et al., 2005; Dvorak and Akhunov, 2005), was sequenced on the
Illumina platform using a whole-genome shotgun strategy, resulting in 448.49 Gbp of high-quality
sequence data, corresponding to ~91x coverage of an estimated 4.94 Gbp genome.
In addition, a total of 34,879 protein-coding gene models were predicted using
transcriptome-sequence data obtained in the same study (Ling et al., 2013).
Aegilops tauschii (DD, 2n = 2x = 14) was sequenced using the same Illumina whole-genome
shotgun strategy. Jia and others generated 398 Gbp of high-quality reads (90x coverage),
representing 97% of the 4.36 Gbp genome size. A 117 Mb transcriptome assembly was
generated from RNA-Seq data obtained from different tissues and used to predict 34,498
high-confidence protein-coding loci (Jia et al., 2013). These studies
identified genes of agronomic importance, such as those related to abiotic-stress resistance and
nutritional quality. Hence, these developments help to understand the environmental
adaptation of wheat, together with its genomic nature. Moreover, the strategy developed
for genome sequencing and assembly of wheat could also be adapted to other large and
complex plant genomes.
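As a rough check on the coverage figures quoted above, average sequencing depth can be approximated as the total number of sequenced bases divided by the estimated genome size. The following Python sketch applies that ratio to the read totals and genome sizes given for the two wheat progenitors. The function name `fold_coverage` is illustrative and is not taken from either study, and the simple ratio ignores read filtering and other details that the original authors may have accounted for.

```python
def fold_coverage(total_bases_gbp: float, genome_size_gbp: float) -> float:
    """Approximate average depth: sequenced bases divided by genome size."""
    if genome_size_gbp <= 0:
        raise ValueError("genome size must be positive")
    return total_bases_gbp / genome_size_gbp


# Values as quoted in the text (Ling et al., 2013; Jia et al., 2013):
# (high-quality sequence in Gbp, estimated genome size in Gbp)
DATASETS = {
    "Triticum urartu": (448.49, 4.94),
    "Aegilops tauschii": (398.0, 4.36),
}

for species, (reads_gbp, genome_gbp) in DATASETS.items():
    depth = fold_coverage(reads_gbp, genome_gbp)
    print(f"{species}: ~{depth:.1f}x")
```

With these inputs the ratio comes out at about 90.8x for T. urartu and 91.3x for Ae. tauschii, close to the ~91x and 90x reported; small differences are expected if the published values were derived from slightly different size estimates.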
Cotton, one of the most economically important crops for the textile
industry, is another genome sequenced with the new technologies. Wang and others
published a draft genome of Gossypium raimondii (2n = 2x = 26), a putative D-genome
donor, employing an Illumina paired-end sequencing strategy. A total of 78.7 Gbp of Illumina
reads were produced, with 103.6x genome coverage. The draft sequence was 775.2 Mbp,
accounting for 88.1% of the estimated genome size. Combining ab initio predictions, homology
searches and EST alignment methods, a total of 40,976 protein-coding genes were identified,
92.2% of which were supported by transcriptome-sequencing data. Comparative analysis
with Theobroma cacao, Arabidopsis thaliana and Zea mays showed that G. raimondii contains a
high proportion of transposable elements and a lower gene density than the other species,
although they all have a similar number of gene families. The study also clarified the
evolutionary relationship between G. raimondii and T. cacao, which probably diverged 33.7
million years ago. The authors also proposed that the draft sequence will serve both as
a reference for the assembly of the tetraploid G. hirsutum genome and as a useful resource for
genetic improvement of cotton quality and yield (Wang et al., 2012a).
Sugar beet (Beta vulgaris) is another important crop, which contributes substantially to
worldwide sugar production. In 2013, the reference genome sequence of this species was
released, representing 85% of its 576 Mbp genome size. A combination of 454, Illumina and
Sanger sequencing platforms was used in this study. In total, 27,421 protein-coding genes
were identified and supported by RNA-Seq data. Based on intraspecific genomic analysis of
five sugar-beet accessions, 7 million genomic variants were identified, together
with large regions of low variation. The availability of the sugar-beet genome enables the discovery
of agronomically important traits that may increase the quality and productivity of the plant.
The genome sequence should also contribute to comparative studies within Caryophyllales and
with other flowering plants (Dohm et al., 2014).
Conifers, the largest division of gymnosperms, have been widely distributed in forests
for almost 200 million years (Nystedt et al., 2013). Besides their economic value as
a source of timber, conifers are of great ecological importance, since these woody plants
account for a high proportion of plant photosynthesis. However, genomic studies on conifers require
much effort, owing to their huge genome size and repetitive nature. In a recent study, de novo
sequencing of the coniferous tree Norway spruce (Picea abies) was performed using
Illumina technology, following a whole-genome shotgun approach. A hierarchical genome-assembly
strategy was developed to combine haploid and diploid genomic and RNA-Seq
data. The genome size of P. abies is estimated at 19.6 Gbp. Nevertheless, only 28,354
high-confidence protein-coding sequences were predicted from EST and transcriptome data,
a number similar to that of the more than 30-times smaller sugar-beet genome. In this case, the large
genome size was interpreted as a result of the accumulation of transposable elements (TE),
especially long terminal repeat (LTR) elements, possibly owing to the lack of an efficient
elimination mechanism. Furthermore, a model for conifer-genome evolution has been
proposed, which suggests that TE removal is less active than in most other plant
species (Bennetzen et al., 2005), with TE insertions into genes resulting in large introns and
pseudogenes (Nystedt et al., 2013). Sequencing the genomes of additional conifer species would
enable comparative analyses and provide further resources to understand the evolution of
important traits in seed plants.
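The contrast between genome size and gene number described above can be made concrete by computing gene density. The following Python sketch uses only the gene counts and genome sizes quoted in this section for Norway spruce and sugar beet. The helper name `genes_per_mbp` is illustrative, and the result is a back-of-the-envelope comparison that depends on the size estimates used, not a figure reported by the cited studies.

```python
def genes_per_mbp(gene_count: int, genome_size_mbp: float) -> float:
    """Gene density expressed as predicted genes per megabase pair."""
    if genome_size_mbp <= 0:
        raise ValueError("genome size must be positive")
    return gene_count / genome_size_mbp


# Values as quoted in this section:
# (predicted protein-coding genes, genome size in Mbp)
GENOMES = {
    "Picea abies": (28_354, 19_600.0),   # 19.6 Gbp
    "Beta vulgaris": (27_421, 576.0),
}

for species, (genes, size_mbp) in GENOMES.items():
    print(f"{species}: {genes_per_mbp(genes, size_mbp):.1f} genes per Mbp")

# Ratio of the two genome sizes, using the same quoted figures.
spruce_size = GENOMES["Picea abies"][1]
beet_size = GENOMES["Beta vulgaris"][1]
print(f"Genome size ratio: {spruce_size / beet_size:.0f}-fold")
```

By these figures, sugar beet carries roughly 48 predicted genes per Mbp against about 1.4 for Norway spruce, which is consistent with the interpretation that the spruce genome is inflated by repetitive sequence and not by additional genes.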
Eucalyptus is one of the most widely planted trees, with more than 20 million
hectares planted throughout the world. The noteworthy diversity and adaptability of
eucalypts can be exploited for sustainable energy production, and the trees mostly provide
cellulose for the paper industry. Myburg et al. (2014) sequenced and assembled a reference
sequence for Eucalyptus grandis, using Sanger WGS, paired BAC-end sequencing and a high-density
genetic linkage map. The E. grandis genome size was estimated to be
640 Mbp, and 36,376 protein-coding loci were predicted. For further gene-expression
analyses, RNA-Seq reads were obtained from diverse E. grandis tissues by Illumina
sequencing. This is the first reference genome published for the eudicot order Myrtales,
providing a resource for insights into the genetic nature of large woody perennials.
Tobacco (Nicotiana tabacum, 2n = 4x = 48) is a widely cultivated non-food crop used as a
model organism in molecular plant studies (Zhang et al., 2011b). In a recent study, three
inbred varieties were sequenced using an Illumina WGS approach. Estimated genome sizes
were reported as 4.41 Gbp for N. tabacum TN90, 4.60 Gbp for N. tabacum K326 and 4.57
Gbp for N. tabacum BX (with 49x, 38x and 29x coverage, respectively). Based on next-generation
sequencing transcriptome data, between 81,000 and 94,000 protein-coding sequences
were identified in the three varieties. The N gene and the va allele, responsible for the
hypersensitive response to tobacco mosaic virus and potyvirus, were also investigated in
these lines. The authors expect the draft genomes to contribute significantly to
functional genomic studies on N. tabacum as a model organism (Sierro et al., 2014).
Watermelon (Citrullus lanatus) is one of the most consumed fresh fruits, with an annual
production of 90 million tonnes. A high-quality draft genome sequence has been published recently.
De novo sequencing on the Illumina platform produced 46.18 Gbp of
reads, corresponding to 108.6x coverage of the estimated 425 Mbp genome of this
species. Subsequently, a total of 23,440 protein-coding genes were identified using ab initio
predictions together with cDNA/EST- and homology-mapping methods. Furthermore, 20 watermelon
accessions were resequenced following the paired-end Illumina strategy. Among them,
6,784,860 candidate SNPs and 965,006 small indels were identified, representing germplasm
biodiversity that can contribute to breeding of the species. Additionally, the comparative
analyses of the transcriptome data should contribute to the understanding of the genetic
diversity and molecular mechanisms underlying some biological processes in watermelon
populations. The evolutionary scenario proposed in this study should thus shed light on the
genetic background of modern cultivars (Guo et al., 2013).
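To give a sense of scale for the resequencing results above, the variant counts can be expressed as a density per kilobase of genome. The Python sketch below divides the SNP and indel totals quoted for the 20 watermelon accessions by the estimated 425 Mbp genome size. The function name `variants_per_kb` is illustrative, and normalizing by the estimated genome size (and not by the assembled or callable fraction) is a simplifying assumption, so the result is only an approximate, pooled density.

```python
def variants_per_kb(variant_count: int, genome_size_mbp: float) -> float:
    """Pooled variant density: variants per kilobase of estimated genome."""
    if genome_size_mbp <= 0:
        raise ValueError("genome size must be positive")
    genome_size_kb = genome_size_mbp * 1_000
    return variant_count / genome_size_kb


# Values as quoted in the text (Guo et al., 2013).
GENOME_SIZE_MBP = 425.0
VARIANTS = {
    "candidate SNPs": 6_784_860,
    "small indels": 965_006,
}

for kind, count in VARIANTS.items():
    density = variants_per_kb(count, GENOME_SIZE_MBP)
    print(f"{kind}: ~{density:.1f} per kb")
```

Under these assumptions the pooled density is about 16 candidate SNPs and about 2.3 small indels per kilobase across the 20 accessions.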
In addition to the draft and reference genomes mentioned above, more than 50 plant species
have been sequenced so far, as listed in Table 2 and Figure 2.
In conclusion, NGS has become a powerful tool for decoding the entire genome of a plant
species, as well as for investigating gene-expression profiles and SNPs. As techniques develop
and more sequencing strategies emerge, selecting among and comparing the different NGS
platforms will be a challenge. In recent years, more than 50 plant species have been sequenced,
providing new resources for plant improvement. However, more bioinformatics tools
need to be developed to better mine the data generated by NGS. Sequencing a genome
is not an end in itself; the final goal should be to use the genome to improve crop yield and
quality and to better understand evolutionary history.
6. Future perspectives
Many new de novo and resequenced plant genomes are expected in the near future, for plants
in general and crop species in particular, using second- and, mostly, third-generation
sequencing platforms. Further work is needed to complete the biggest and most complex
genome drafts, and to achieve high-quality reference sequences for most plant genomes.
This genome knowledge will be coupled with deep gene-expression analyses (RNA-Seq and
true RNA sequencing), uncovering alternative splicing, copy-number variations (CNV), etc.
The availability of ChIP-Seq and microRNA-Seq for an increasing number of crops will further
expand the emerging field of epigenomics. These are all necessary tools to address food
production and security in a climate-changing scenario.
Acknowledgements. MT and TU were funded by the Scientific and Technological Research Council of Turkey
“TÜBİTAK” (grant numbers 111O036, 112O502 and 113O016). PH and GD were
funded by “Ministerio de Economía y Competitividad” (MINECO grants AGL2010-17316
and BIO2011-15237-E) and “Instituto Nacional de Investigación y Tecnología Agraria y
Alimentaria” (MINECO and INIA RF2012-00002-C02-02); “Consejería de Agricultura y
Pesca” (041/C/2007, 75/C/2009 and 56/C/2010), “Consejería de Economía, Innovación y
Ciencia” (P11-AGR-7322 and P12-AGR-0482) and “Grupo PAI” (AGR-248) of “Junta de
Andalucía”; and “Universidad de Córdoba” (“Ayuda a Grupos”), Spain.
References
Ahmad R, Parfitt D, Fass J, Ogundiwin E, Dhingra A, Gradziel T, Lin D, Joshi N, Martinez-Garcia P, Crisosto C (2011). Whole genome sequencing of peach (Prunus persica L.) for SNP identification and selection. BMC Genomics 12:569.
Al-Dous EK, George B, Al-Mahmoud ME, Al-Jaber MY, Wang H, Salameh YM, Al-Azwani EK, Chaluvadi S, Pontaroli AC, DeBarry J et al. (2011). De novo genome sequencing and comparative genomics of date palm (Phoenix dactylifera). Nat Biotech 29:521-527.
Allen AM, Barker GLA, Wilkinson P, Burridge A, Winfield M, Coghill J, Uauy C, Griffiths S, Jack P, Berry S et al. (2013). Discovery and development of exome-based, co-dominant single nucleotide polymorphism markers in hexaploid wheat (Triticum aestivum L.). Plant Biotechnology Journal 11:279-295. doi:10.1111/pbi.12009.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990). Basic local alignment search tool. J Mol Biol 215:403-410. doi:10.1016/s0022-2836(05)80360-2.
Andries K, Verhasselt P, Guillemont J, Gohlmann HW, Neefs JM, Winkler H, Van Gestel J, Timmerman P, Zhu M, Lee E et al. (2005). A diarylquinoline drug active on the ATP synthase of Mycobacterium tuberculosis. Science 307:223-227. doi:10.1126/science.1106753.
Angeloni F, Wagemaker C, Jetten M, Op den Camp H, Janssen-Megens E, Francoijs KJ, Stunnenberg H, Ouborg N (2011). De novo transcriptome characterization and development of genomic tools for Scabiosa columbaria L. using next-generation sequencing techniques. Molecular Ecology Resources 11:662-674.
Argout X, Salse J, Aury J-M, Guiltinan MJ, Droc G, Gouzy J, Allegre M, Chaparro C, Legavre T, Maximova SN et al. (2011). The genome of Theobroma cacao. Nat Genet 43:101-108.
Bao S, Jiang R, Kwan W, Wang B, Ma X, Song YQ (2011). Evaluation of next-generation sequencing software in mapping and assembly. J Hum Genet 56:406-414. doi:10.1038/jhg.2011.43.
Bao Z, Eddy SR (2002). Automated de novo identification of repeat sequence families in sequenced genomes. Genome Res 12:1269-1276. doi:10.1101/gr.88502.
Batzoglou S, Jaffe DB, Stanley K, Butler J, Gnerre S, Mauceli E, Berger B, Mesirov JP, Lander ES (2002). ARACHNE: a whole-genome shotgun assembler. Genome Res 12:177-189. doi:10.1101/gr.208902.
Bennetzen JL, Ma J, Devos KM (2005). Mechanisms of Recent Genome Size Variation in Flowering Plants. Annals of Botany 95:127-132. doi:10.1093/aob/mci008.
Benson G (1999). Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res 27:573-580.
Bergman CM, Quesneville H (2007). Discovering and detecting transposable elements in genome sequences. Brief Bioinform 8:382-392. doi:10.1093/bib/bbm048.
Bolger A, Scossa F, Bolger ME, Lanz C, Maumus F, Tohge T, Quesneville H, Alseekh S, Sørensen I, Lichtenstein G (2014). The genome of the stress-tolerant wild tomato species Solanum pennellii. Nature genetics 46:1034-1038.
Bombarely A, Rosli HG, Vrebalov J, Moffett P, Mueller LA, Martin GB (2012). A draft genome sequence of Nicotiana benthamiana to enhance molecular plant-microbe biology research. Molecular Plant-Microbe Interactions 25:1523-1530.
Brenchley R, Spannagl M, Pfeifer M, Barker GL, D'Amore R, Allen AM, McKenzie N, Kramer M, Kerhornou A, Bolser D et al. (2012). Analysis of the bread wheat genome using whole-genome shotgun sequencing. Nature 491:705-710. doi:10.1038/nature11650.
Cahill MJ, Koser CU, Ross NE, Archer JA (2010). Read length and repeat resolution: exploring prokaryote genomes using next-generation sequencing technologies. PLoS One 5:e11518. doi:10.1371/journal.pone.0011518.
Cantu D, Vanzetti LS, Sumner A, Dubcovsky M, Matvienko M, Distelfeld A, Michelmore RW, Dubcovsky J (2010). Small RNAs, DNA methylation and transposable elements in wheat. BMC Genomics 11:408. doi:10.1186/1471-2164-11-408.
Chalhoub B, Denoeud F, Liu S, Parkin IA, Tang H, Wang X, Chiquet J, Belcram H, Tong C, Samans B (2014). Early allopolyploid evolution in the post-Neolithic Brassica napus oilseed genome. Science 345:950-953.
Chantret N, Salse J, Sabot F, Rahman S, Bellec A, Laubin B, Dubois I, Dossat C, Sourdille P, Joudrier P (2005). Molecular basis of evolutionary events that shaped the hardness locus in diploid and polyploid wheat species (Triticum and Aegilops). The Plant Cell Online 17:1033-1045.
Chen J, Huang Q, Gao D, Wang J, Lang Y, Liu T, Li B, Bai Z, Goicoechea JL, Liang C (2013). Whole-genome sequencing of Oryza brachyantha reveals mechanisms underlying Oryza genome evolution. Nature communications 4:1595.
Consortium TG (2012). The tomato genome sequence provides insights into fleshy fruit evolution. Nature 485:635-641.
D’Hont A, Denoeud F, Aury J-M, Baurens F-C, Carreel F, Garsmeur O, Noel B, Bocs S, Droc G, Rouard M (2012). The banana (Musa acuminata) genome and the evolution of monocotyledonous plants. Nature 488:213-217.
Dassanayake M, Oh D-H, Haas JS, Hernandez A, Hong H, Ali S, Yun D-J, Bressan RA, Zhu J-K, Bohnert HJ (2011). The genome of the extremophile crucifer Thellungiella parvula. Nature genetics 43:913-918.
Der JP, Barker MS, Wickett NJ, Wolf PG (2011). De novo characterization of the gametophyte transcriptome in bracken fern, Pteridium aquilinum. BMC genomics 12:99.
Diaz D, Esteban FJ, Hernandez P, Caballero JA, Guevara A, Dorado G, Galvez S (2014). MC64-ClustalWP2: a highly-parallel hybrid strategy to align multiple sequences in many-core architectures. PLoS One 9:e94044. doi:10.1371/journal.pone.0094044.
Dohm JC, Minoche AE, Holtgrawe D, Capella-Gutierrez S, Zakrzewski F, Tafer H, Rupp O, Sorensen TR, Stracke R, Reinhardt R et al. (2014). The genome of the recently domesticated crop plant sugar beet (Beta vulgaris). Nature 505:546-549. doi:10.1038/nature12817.
Dolezel J, Kubalakova M, Paux E, Bartos J, Feuillet C (2007). Chromosome-based genomics in the cereals. Chromosome Res 15:51-66. doi:10.1007/s10577-006-1106-x.
Dvorak J, Akhunov ED (2005). Tempos of gene locus deletions and duplications and their relationship to recombination rate during diploid and polyploid evolution in the Aegilops-Triticum alliance. Genetics 171:323-332.
Eldem V, Celikkol Akcay U, Ozhuner E, Bakir Y, Uranbey S, Unver T (2012). Genome-Wide Identification of miRNAs Responsive to Drought in Peach (Prunus persica) by High-Throughput Deep Sequencing. PLoS One 7:e50298. doi:10.1371/journal.pone.0050298.
Evans J, Kim J, Childs KL, Vaillancourt B, Crisovan E, Nandety A, Gerhardt DJ, Richmond TA, Jeddeloh JA, Kaeppler SM et al. (2014). Nucleotide polymorphism and copy number variant detection using exome capture and next-generation sequencing in the polyploid grass Panicum virgatum. The Plant Journal:n/a-n/a. doi:10.1111/tpj.12601.
Feuillet C, Leach JE, Rogers J, Schnable PS, Eversole K (2011). Crop genome sequencing: lessons and rationales. Trends Plant Sci 16:77-88. doi:10.1016/j.tplants.2010.10.005.
Flutre T, Duprat E, Feuillet C, Quesneville H (2011). Considering transposable element diversification in de novo annotation approaches. PLoS One 6:e16526. doi:10.1371/journal.pone.0016526.
Franssen SU, Gu J, Bergmann N, Winters G, Klostermeier UC, Rosenstiel P, Bornberg-Bauer E, Reusch TBH (2011a). Transcriptomic resilience to global warming in the seagrass Zostera marina, a marine foundation species. Proceedings of the National Academy of Sciences 108:19276-19281. doi:10.1073/pnas.1107680108.
Franssen SU, Shrestha RP, Bräutigam A, Bornberg-Bauer E, Weber AP (2011b). Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing. BMC genomics 12:227.
Galvez S, Diaz D, Hernandez P, Esteban FJ, Caballero JA, Dorado G (2010). Next-generation bioinformatics: using many-core processor architecture to develop a web service for sequence alignment. Bioinformatics 26:683-686. doi:10.1093/bioinformatics/btq017.
Garcia-Mas J, Benjak A, Sanseverino W, Bourgeois M, Mir G, González VM, Hénaff E, Câmara F, Cozzuto L, Lowy E (2012). The genome of melon (Cucumis melo L.). Proceedings of the National Academy of Sciences 109:11872-11877.
Góngora-Castillo E, Fedewa G, Yeo Y, Chappell J, DellaPenna D, Buell CR (2012). Genomic approaches for interrogating the biochemistry of medicinal plant species. Methods in enzymology 517:139.
Gonnella G, Kurtz S (2012). Readjoiner: a fast and memory efficient string graph-based sequence assembler. BMC Bioinformatics 13:82. doi:10.1186/1471-2105-13-82.
Guo S, Zhang J, Sun H, Salse J, Lucas WJ, Zhang H, Zheng Y, Mao L, Ren Y, Wang Z et al. (2013). The draft genome of watermelon (Citrullus lanatus) and resequencing of 20 diverse accessions. Nat Genet 45:51-58.
Gupta OP, Permar V, Koundal V, Singh UD, Praveen S (2012). MicroRNA regulated defense responses in Triticum aestivum L. during Puccinia graminis f.sp. tritici infection. Mol Biol Rep 39:817-824. doi:10.1007/s11033-011-0803-5.
Haiminen N, Feltus FA, Parida L (2011). Assessing pooled BAC and whole genome shotgun strategies for assembly of complex genomes. BMC Genomics 12:194. doi:10.1186/1471-2164-12-194.
Havlak P, Chen R, Durbin KJ, Egan A, Ren Y, Song XZ, Weinstock GM, Gibbs RA (2004). The Atlas genome assembly system. Genome Res 14:721-732. doi:10.1101/gr.2264004.
He N, Zhang C, Qi X, Zhao S, Tao Y, Yang G, Lee T-H, Wang X, Cai Q, Li D et al. (2013). Draft genome sequence of the mulberry tree Morus notabilis. Nat Commun 4. doi:10.1038/ncomms3445.
Henry IM, Nagalakshmi U, Lieberman MC, Ngo KJ, Krasileva KV, Vasquez-Gross H, Akhunova A, Akhunov E, Dubcovsky J, Tai TH et al. (2014). Efficient Genome-Wide Detection and Cataloging of EMS-Induced Mutations Using Exome Capture and Next-Generation Sequencing. The Plant Cell Online 26:1382-1397. doi:10.1105/tpc.113.121590.
Hernandez D, Francois P, Farinelli L, Osteras M, Schrenzel J (2008). De novo bacterial genome sequencing: millions of very short reads assembled on a desktop computer. Genome Res 18:802-809. doi:10.1101/gr.072033.107.
Hernandez P, Martis M, Dorado G, Pfeifer M, Galvez S, Schaaf S, Jouve N, Simkova H, Valarik M, Dolezel J et al. (2012). Next-generation sequencing and syntenic integration of flow-sorted arms of wheat chromosome 4A exposes the chromosome structure and gene content. Plant J 69:377-386. doi:10.1111/j.1365-313X.2011.04808.x.
Hirsch CN, Robin Buell C (2013). Tapping the promise of genomics in species with complex, nonmodel genomes. Annual review of plant biology 64:89-110.
Huang S, Li R, Zhang Z, Li L, Gu X, Fan W, Lucas WJ, Wang X, Xie B, Ni P et al. (2009). The genome of the cucumber, Cucumis sativus L. Nat Genet 41:1275-1281. doi:10.1038/ng.475.
Huang X, Madan A (1999). CAP3: A DNA sequence assembly program. Genome Res 9:868-877.
Huang X, Yang SP (2005). Generating a genome assembly with PCAP. Curr Protoc Bioinformatics Chapter 11:Unit11.13. doi:10.1002/0471250953.bi1103s11.
Ibarra-Laclette E, Lyons E, Hernández-Guzmán G, Pérez-Torres CA, Carretero-Paulet L, Chang T-H, Lan T, Welch AJ, Juárez MJA, Simpson J (2013). Architecture and evolution of a minute plant genome. Nature 498:94-98.
Imelfort M, Edwards D (2009). De novo sequencing of plant genomes using second-generation technologies. Brief Bioinform 10:609-618. doi:10.1093/bib/bbp039.
International Brachypodium I (2010). Genome sequencing and analysis of the model grass Brachypodium distachyon. Nature 463:763-768. doi:10.1038/nature08747.
IWGSC TIWGSC (2014). A chromosome-based draft sequence of the hexaploid bread wheat (Triticum aestivum) genome. Science 345. doi:10.1126/science.1251788.
Jain M (2012). Next-generation sequencing technologies for gene expression profiling in plants. Briefings in functional genomics 11:63-70.
Jeck WR, Reinhardt JA, Baltrus DA, Hickenbotham MT, Magrini V, Mardis ER, Dangl JL, Jones CD (2007). Extending assembly of short DNA sequences to handle error. Bioinformatics 23:2942-2944. doi:10.1093/bioinformatics/btm451.
Jia J, Zhao S, Kong X, Li Y, Zhao G, He W, Appels R, Pfeifer M, Tao Y, Zhang X et al. (2013). Aegilops tauschii draft genome sequence reveals a gene repertoire for wheat adaptation. Nature 496:91-95. doi:10.1038/nature12028.
Kaufmann K, Muino JM, Osteras M, Farinelli L, Krajewski P, Angenent GC (2010). Chromatin immunoprecipitation (ChIP) of plant transcription factors followed by sequencing (ChIP-SEQ) or hybridization to whole genome arrays (ChIP-CHIP). Nat Protocols 5:457-472.
Kenan-Eichler M, Leshkowitz D, Tal L, Noor E, Melamed-Bessudo C, Feldman M, Levy AA (2011). Wheat Hybridization and Polyploidization Results in Deregulation of Small RNAs. Genetics 188:263-272. doi:10.1534/genetics.111.128348.
Kent WJ (2002). BLAT--the BLAST-like alignment tool. Genome Res 12:656-664. doi:10.1101/gr.229202.
Kim S, Park M, Yeom S-I, Kim Y-M, Lee JM, Lee H-A, Seo E, Choi J, Cheong K, Kim K-T (2014). Genome sequence of the hot pepper provides insights into the evolution of pungency in Capsicum species. Nature genetics.
Koenig D, Jiménez-Gómez JM, Kimura S, Fulop D, Chitwood DH, Headland LR, Kumar R, Covington MF, Devisetty UK, Tat AV (2013). Comparative transcriptomics reveals patterns of selection in domesticated and wild tomato. Proceedings of the National Academy of Sciences 110:E2655-E2662.
Krishnan NM, Pattnaik S, Jain P, Gaur P, Choudhary R, Vaidyanathan S, Deepak S, Hariharan AK, Krishna PB, Nair J (2012). A draft of the genome and four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica. BMC genomics 13:464.
Kurtoglu KY KM, Lucas SJ, Budak H (2013). Unique and Conserved MicroRNAs in Wheat Chromosome 5D Revealed by Next-Generation Sequencing. PLoS ONE 8:e69801.
Kurtz S, Choudhuri JV, Ohlebusch E, Schleiermacher C, Stoye J, Giegerich R (2001). REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res 29:4633-4642.
Langmead B, Trapnell C, Pop M, Salzberg SL (2009). Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10:R25. doi:10.1186/gb-2009-10-3-r25.
Langridge P (2012). Genomics: Decoding our daily bread. Nature 491:678-680.
Leaungthitikanchana S, Fujibe T, Tanaka M, Wang S, Sotta N, Takano J, Fujiwara T (2013). Differential expression of three BOR1 genes corresponding to different genomes in response to boron conditions in hexaploid wheat (Triticum aestivum L.). Plant and Cell Physiology 54:1056-1063.
Lerat E (2010). Identifying repeats and transposable elements in sequenced genomes: how to find your way through the dense forest of programs. Heredity (Edinb) 104:520-533. doi:10.1038/hdy.2009.165.
Li H, Durbin R (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754-1760. doi:10.1093/bioinformatics/btp324.
Li Y-F, Zheng Y, Jagadeeswaran G, Sunkar R (2013). Characterization of small RNAs and their target genes in wheat seedlings using sequencing-based approaches. Plant Science 203–204:17-24.
Ling H-Q, Zhao S, Liu D, Wang J, Sun H, Zhang C, Fan H, Li D, Dong L, Tao Y (2013). Draft genome of the wheat A-genome progenitor Triticum urartu. Nature 496:87-90.
Liu L, Li Y, Li S, Hu N, He Y, Pong R, Lin D, Lu L, Law M (2012). Comparison of next-generation sequencing systems. J Biomed Biotechnol 2012:251364. doi:10.1155/2012/251364.
Llaca V (2012). Sequencing Technologies and Their Use in Plant Biotechnology and Breeding. DNA 1 sequencing–methods and applications:35. 2
Marguerat S, Bähler J (2010). RNA-‐seq: from technology to biology. Cellular and molecular life 3 sciences 67:569-‐579. 4
Metzker ML (2009). Sequencing technologies—the next generation. Nature Reviews Genetics 11 :31-‐5 46. 6
Ming R, Hou S, Feng Y, Yu Q, Dionne-‐Laporte A, Saw JH, Senin P, Wang W, Ly BV, Lewis KL et al. 7 (2008). The draft genome of the transgenic tropical fruit tree papaya (Carica papaya 8 Linnaeus). Nature 452:991-‐996. doi:10.1038/nature06856. 9
Ming R, VanBuren R, Liu Y, Yang M, Han Y, Li L-‐T, Zhang Q, Kim M-‐J, Schatz MC, Campbell M (2013). 10 Genome of the long-‐living sacred lotus (Nelumbo nucifera Gaertn.). Genome biology 14 11 :R41. 12
Mullikin JC, Ning Z (2003). The phusion assembler. Genome Res 13(1):81-90. doi:10.1101/gr.731003.
Myburg AA, Grattapaglia D, Tuskan GA, Hellsten U, Hayes RD, Grimwood J, Jenkins J, Lindquist E, Tice H, Bauer D et al. (2014). The genome of Eucalyptus grandis. Nature 510:356-362. doi:10.1038/nature13308.
Myers EW (2005). The fragment assembly string graph. Bioinformatics 21 Suppl 2:ii79-85. doi:10.1093/bioinformatics/bti1114.
Narzisi G, Mishra B (2011). Comparing de novo genome assembly: the long and short of it. PLoS One 6:e19175. doi:10.1371/journal.pone.0019175.
Novaes E, Drost DR, Farmerie WG, Pappas GJ, Grattapaglia D, Sederoff RR, Kirst M (2008). High-throughput gene and SNP discovery in Eucalyptus grandis, an uncharacterized genome. BMC genomics 9:312.
Nystedt B, Street NR, Wetterbom A, Zuccolo A, Lin Y-C, Scofield DG, Vezzi F, Delhomme N, Giacomello S, Alexeyenko A et al. (2013). The Norway spruce genome sequence and conifer genome evolution. Nature 497(7451):579-584. doi:10.1038/nature12211.
Park PJ (2009). ChIP-seq: advantages and challenges of a maturing technology. Nat Rev Genet 10:669-680. doi:10.1038/nrg2641.
Paszkiewicz K, Studholme DJ (2010). De novo assembly of short sequence reads. Brief Bioinform 11:457-472. doi:10.1093/bib/bbq020.
Paux E, Sourdille P, Salse J, Saintenac C, Choulet F, Leroy P, Korol A, Michalak M, Kianian S, Spielmeyer W et al. (2008). A physical map of the 1-gigabase bread wheat chromosome 3B. Science 322:101-104. doi:10.1126/science.1161847.
Peng Z, Lu Y, Li L, Zhao Q, Feng Q, Gao Z, Lu H, Hu T, Yao N, Liu K et al. (2013). The draft genome of the fast-growing non-timber forest species moso bamboo (Phyllostachys heterocycla). Nat Genet 45:456-461.
Pevzner PA, Tang H, Waterman MS (2001). An Eulerian path approach to DNA fragment assembly. Proc Natl Acad Sci U S A 98:9748-9753. doi:10.1073/pnas.171285098.
Potato Genome Sequencing Consortium, Xu X, Pan S, Cheng S, Zhang B, Mu D, Ni P, Zhang G, Yang S, Li R et al. (2011). Genome sequence and analysis of the tuber crop potato. Nature 475:189-195. doi:10.1038/nature10158.
Price AL, Jones NC, Pevzner PA (2005). De novo identification of repeat families in large genomes. Bioinformatics 21 Suppl 1:i351-358. doi:10.1093/bioinformatics/bti1018.
Prochnik S, Marri PR, Desany B, Rabinowicz PD, Kodira C, Mohiuddin M, Rodriguez F, Fauquet C, Tohme J, Harkins T (2012). The cassava genome: current progress, future directions. Tropical plant biology 5:88-94.
Rahman AYA, Usharraj A, Misra B, Thottathil G, Jayasekaran K, Feng Y, Hou S, Ong SY, Ng FL, Lee LS et al. (2013). Draft genome sequence of the rubber tree Hevea brasiliensis. BMC Genomics 14:75.
Sato S, Hirakawa H, Isobe S, Fukai E, Watanabe A, Kato M, Kawashima K, Minami C, Muraki A, Nakazaki N et al. (2010). Sequence Analysis of the Genome of an Oil-Bearing Tree, Jatropha curcas L. DNA Research. doi:10.1093/dnares/dsq030.
Schatz MC, Delcher AL, Salzberg SL (2010). Assembly of large genomes using second-generation sequencing. Genome Res 20:1165-1173. doi:10.1101/gr.101360.109.
Scheibye-Alsing K, Hoffmann S, Frankel A, Jensen P, Stadler PF, Mang Y, Tommerup N, Gilchrist MJ, Nygard AB, Cirera S et al. (2009). Sequence assembly. Comput Biol Chem 33(2):121-136. doi:10.1016/j.compbiolchem.2008.11.003.
Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J et al. (2010). Genome sequence of the palaeopolyploid soybean. Nature 463:178-183. doi:10.1038/nature08670.
Schnable PS, Ware D, Fulton RS, Stein JC, Wei F, Pasternak S, Liang C, Zhang J, Fulton L, Graves TA et al. (2009). The B73 maize genome: complexity, diversity, and dynamics. Science 326:1112-1115. doi:10.1126/science.1178534.
Schneeberger K (2014). Using next-generation sequencing to isolate mutant genes from forward genetic screens. Nature reviews Genetics advance online publication. doi:10.1038/nrg3745.
Shamimuzzaman M, Vodkin L (2013). Genome-wide identification of binding sites for NAC and YABBY transcription factors and co-regulated genes during soybean seedling development by ChIP-Seq and RNA-Seq. BMC Genomics 14:477.
Shulaev V, Sargent DJ, Crowhurst RN, Mockler TC, Folkerts O, Delcher AL, Jaiswal P, Mockaitis K, Liston A, Mane SP (2011). The genome of woodland strawberry (Fragaria vesca). Nature genetics 43:109-116.
Sierro N, Battey JN, Ouadi S, Bovet L, Goepfert S, Bakaher N, Peitsch MC, Ivanov NV (2013). Reference genomes and transcriptomes of Nicotiana sylvestris and Nicotiana tomentosiformis. Genome biology 14:R60.
Sierro N, Battey JND, Ouadi S, Bakaher N, Bovet L, Willig A, Goepfert S, Peitsch MC, Ivanov NV (2014). The tobacco genome sequence and its comparison with those of tomato and potato. Nature communications 5. doi:10.1038/ncomms4833.
Simpson JT, Durbin R (2012). Efficient de novo assembly of large genomes using compressed data structures. Genome Res 22:549-556. doi:10.1101/gr.126953.111.
Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I (2009). ABySS: a parallel assembler for short read sequence data. Genome Res 19:1117-1123. doi:10.1101/gr.089532.108.
Singh R, Ong-Abdullah M, Low E-TL, Manaf MAA, Rosli R, Nookiah R, Ooi LC-L, Ooi S-E, Chan K-L, Halim MA et al. (2013). Oil palm genome sequence reveals divergence of interfertile species in Old and New worlds. Nature 500:335-339. doi:10.1038/nature12309.
Smaczniak C, Immink RGH, Muiño JM, Blanvillain R, Busscher M, Busscher-Lange J, Dinh QD, Liu S, Westphal AH, Boeren S et al. (2012). Characterization of MADS-domain transcription factor complexes in Arabidopsis flower development. Proceedings of the National Academy of Sciences 109:1560-1565. doi:10.1073/pnas.1112871109.
Staton SE, Bakken BH, Blackman BK, Chapman MA, Kane NC, Tang S, Ungerer MC, Knapp SJ, Rieseberg LH, Burke JM (2012). The sunflower (Helianthus annuus L.) genome reflects a recent history of biased accumulation of transposable elements. The Plant Journal 72:142-153.
Strickler SR, Bombarely A, Mueller LA (2012). Designing a transcriptome next-generation sequencing project for a nonmodel plant species. American journal of botany 99:257-266.
Tang Z, Zhang L, Xu C, Yuan S, Zhang F, Zheng Y, Zhao C (2012). Uncovering Small RNA-Mediated Responses to Cold Stress in a Wheat Thermosensitive Genic Male-Sterile Line by Deep Sequencing. Plant Physiology 159:721-738. doi:10.1104/pp.112.196048.
Taudien S, Steuernagel B, Ariyadasa R, Schulte D, Schmutzer T, Groth M, Felder M, Petzold A, Scholz U, Mayer KF et al. (2011). Sequencing of BAC pools by different next generation sequencing platforms and strategies. BMC Res Notes 4:411. doi:10.1186/1756-0500-4-411.
Tomato Genome Consortium (2012). The tomato genome sequence provides insights into fleshy fruit evolution. Nature 485:635-641. doi:10.1038/nature11119.
Tuskan GA, Difazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, Putnam N, Ralph S, Rombauts S, Salamov A et al. (2006). The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science 313:1596-1604. doi:10.1126/science.1128691.
van Bakel H, Stout J, Cote A, Tallon C, Sharpe A, Hughes T, Page J (2011). The draft genome and transcriptome of Cannabis sativa. Genome Biology 12:R102.
Varshney RK, Chen W, Li Y, Bharti AK, Saxena RK, Schlueter JA, Donoghue MTA, Azam S, Fan G, Whaley AM et al. (2012). Draft genome sequence of pigeonpea (Cajanus cajan), an orphan legume crop of resource-poor farmers. Nat Biotech 30:83-89. doi:10.1038/nbt.2022.
Varshney RK, Nayak SN, May GD, Jackson SA (2009). Next-generation sequencing technologies and their implications for crop genetics and breeding. Trends in biotechnology 27:522-530.
Varshney RK, Song C, Saxena RK, Azam S, Yu S, Sharpe AG, Cannon S, Baek J, Rosen BD, Tar'an B (2013). Draft genome sequence of chickpea (Cicer arietinum) provides a resource for trait improvement. Nature biotechnology 31:240-246.
Vaucheret H (2006). Post-transcriptional small RNA pathways in plants: mechanisms and regulations. Genes & Development 20:759-771.
Velasco R, Zharkikh A, Affourtit J, Dhingra A, Cestaro A, Kalyanaraman A, Fontana P, Bhatnagar SK, Troggio M, Pruss D et al. (2010). The genome of the domesticated apple (Malus x domestica Borkh.). Nat Genet 42:833-839. doi:10.1038/ng.654.
Wang K, Wang Z, Li F, Ye W, Wang J, Song G, Yue Z, Cong L, Shang H, Zhu S et al. (2012a). The draft genome of a diploid cotton Gossypium raimondii. Nat Genet 44:1098-1103.
Wang N, Thomson M, Bodles WJA, Crawford RMM, Hunt HV, Featherstone AW, Pellicer J, Buggs RJA (2013). Genome sequence of dwarf birch (Betula nana) and cross-species RAD markers. Molecular Ecology 22:3098-3111. doi:10.1111/mec.12131.
Wang S, Wang X, He Q, Liu X, Xu W, Li L, Gao J, Wang F (2012b). Transcriptome analysis of the roots at early and late seedling stages using Illumina paired-end sequencing and development of EST-SSR markers in radish. Plant Cell Rep 31:1437-1447. doi:10.1007/s00299-012-1259-3.
Wang X, Wang H, Wang J, Sun R, Wu J, Liu S, Bai Y, Mun J-H, Bancroft I, Cheng F et al. (2011). The genome of the mesopolyploid crop species Brassica rapa. Nat Genet 43:1035-1039.
Wang Z, Fang B, Chen J, Zhang X, Luo Z, Huang L, Chen X, Li Y (2010). De novo assembly and characterization of root transcriptome using Illumina paired-end sequencing and development of cSSR markers in sweet potato (Ipomoea batatas). BMC Genomics 11:726. doi:10.1186/1471-2164-11-726.
Wang Z, Gerstein M, Snyder M (2009). RNA-Seq: a revolutionary tool for transcriptomics. Nature Reviews Genetics 10:57-63.
Wang Z, Hobson N, Galindo L, Zhu S, Shi D, McDill J, Yang L, Hawkins S, Neutelings G, Datla R et al. (2012c). The genome of flax (Linum usitatissimum) assembled de novo from short shotgun sequence reads. The Plant Journal 72:461-473. doi:10.1111/j.1365-313X.2012.05093.x.
Warren RL, Sutton GG, Jones SJ, Holt RA (2007). Assembling millions of short DNA sequences using SSAKE. Bioinformatics 23:500-501. doi:10.1093/bioinformatics/btl629.
Wold B, Myers RM (2008). Sequence census methods for functional genomics. Nat Meth 5(1):19-21.
Wu GA, Prochnik S, Jenkins J, Salse J, Hellsten U, Murat F, Perrier X, Ruiz M, Scalabrin S, Terol J (2014). Sequencing of diverse mandarin, pummelo and orange genomes reveals complex history of admixture during citrus domestication. Nature biotechnology 32:656-662.
Wu J, Wang Z, Shi Z, Zhang S, Ming R, Zhu S, Khan MA, Tao S, Korban SS, Wang H (2013). The genome of the pear (Pyrus bretschneideri Rehd.). Genome research 23:396-408.
Xu Q, Chen L-L, Ruan X, Chen D, Zhu A, Chen C, Bertrand D, Jiao W-B, Hao B-H, Lyon MP (2013). The draft genome of sweet orange (Citrus sinensis). Nature genetics 45:59-66.
Xu X, Pan S, Cheng S, Zhang B, Mu D, Ni P, Zhang G, Yang S, Li R, Wang J et al. (2011). Genome sequence and analysis of the tuber crop potato. Nature 475:189-195. doi:10.1038/nature10158.
Yanik H, Turktas M, Dundar E, Hernandez P, Dorado G, Unver T (2013). Genome-wide identification of alternate bearing-associated microRNAs (miRNAs) in olive (Olea europaea L.). BMC plant biology 13:10. doi:10.1186/1471-2229-13-10.
Yao Y, Sun Q (2012). Exploration of small non coding RNAs in wheat (Triticum aestivum L.). Plant Mol Biol 80:67-73. doi:10.1007/s11103-011-9835-4.
Young ND, Debelle F, Oldroyd GED, Geurts R, Cannon SB, Udvardi MK, Benedito VA, Mayer KFX, Gouzy J, Schoof H et al. (2011). The Medicago genome provides insight into the evolution of rhizobial symbioses. Nature 480:520-524.
Zerbino DR, Birney E (2008). Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 18:821-829. doi:10.1101/gr.074492.107.
Zhang G, Liu X, Quan Z, Cheng S, Xu X, Pan S, Xie M, Zeng P, Yue Z, Wang W (2012a). Genome sequence of foxtail millet (Setaria italica) provides insights into grass evolution and biofuel potential. Nature biotechnology 30(6):549-554.
Zhang J, Chiodini R, Badr A, Zhang G (2011a). The impact of next-generation sequencing on genomics. J Genet Genomics 38(3):95-109. doi:10.1016/j.jgg.2011.02.003.
Zhang J, Liu J, Ming R (2014). Genomic analyses of the CAM plant pineapple. Journal of experimental botany:eru101.
Zhang J, Zhang Y, Du Y, Chen S, Tang H (2011b). Dynamic metabonomic responses of tobacco (Nicotiana tabacum) plants to salt stress. J Proteome Res 10:1904-1914. doi:10.1021/pr101140n.
Zhang Q, Chen W, Sun L, Zhao F, Huang B, Yang W, Tao Y, Wang J, Yuan Z, Fan G et al. (2012b). The genome of Prunus mume. Nature communications 3:1318.