2013 iPlant workshop
Marcela Karey Monaco
Dec 21, 2015
2013
What is Gramene?
• An integrated plant reference genome resource
• Comparative genomics hub of data & tools for the analysis and visualization of plant genome data, informed by evolutionary histories
• Genome sequence, gene annotations, genetic & structural variants, metabolic & regulatory pathways
Gramene offers…
• Genome sequence & gene annotations
– 28 complete & 9 partial reference genome assemblies
Gramene Release 39
• 28 complete reference genomes
– 15 Monocots
– 9 Eudicots
– 4 Lower plants
• 9 partial assemblies
Fully sequenced reference genomes in Gramene
Gramene offers…
• Genome sequence & gene annotations
– 28 complete & 9 partial reference genome assemblies
– Variation data for 9 plant varieties
Variation in Gramene Ensembl Browsers
Per species: variants, source, studies
• Oryza sativa ssp. japonica: 3,332,525 variants
– Source: 160K SNPs x 20 accessions; 1311 SNPs x 395 accessions; NCBI dbSNP
– Studies: McNally et al. 2009. PNAS 106:12273-12278; Zhao et al. 2010. PLoS ONE 5:e10780
• Oryza sativa ssp. indica: 4,747,883 variants
– Source: NCBI dbSNP
• Zea mays: 50,719,843 variants
– Source: HapMap1: NAM founder lines; HapMap2: pre-domesticated & domesticated lines
– Studies: Gore et al. 2009. Science 326:1115-1117; Chia et al. 2012. Nat Genet 44:803-807
• Arabidopsis thaliana: 14,234,197 variants; 13,667 structural variants (SV)
– Source: 250K SNPs x 1179 accessions; 1001 genomes project: 411 resequenced accessions
– Studies: Atwell et al. 2010. Nature 465:627-631
• Brachypodium distachyon: 327,988 variants
– Source: 2 accessions of Brachypodium sylvaticum
– Studies: Fox et al. 2013. Applications in Plant Sciences 1(3):1200011
• Vitis vinifera: 457,404 variants
– Source: Resequencing of USDA germplasm collection
– Studies: Myles et al. 2010. PLoS ONE 5:e8219
• Hordeum vulgare: 12,994,003 variants
– Source: Resequencing of 4 accessions plus wild barley
– Studies: The International Barley Genome Sequencing Consortium. 2012. Nature 491:711-716
• Oryza glaberrima: 7,704,409 variants
– Source: Resequenced 20 accessions of African rice & wild progenitor
– Studies: Oryza Genome Evolution project
• Sorghum bicolor: 64,507 structural variants (SV)
– Source: Structural variants from Database of Genomic Variants archive (dGVA)
– Studies: Zheng et al. 2011. Genome Biol. 12:R114
Gramene offers…
• Genome sequence & gene annotations
– 28 complete & 9 partial reference genome assemblies
– Variation data for 9 plant varieties
• Comparative analysis
– Phylogenetic gene trees => Ortholog predictions
– 56 whole genome alignments
– 20 synteny maps
Gramene offers…
• Genome sequence & gene annotations
• Comparative analysis
• New functionalities
– Highlight gene trees with InterPro & GO
– Variant Effect Predictor service: user loads VCF
– Upload private data and view in browser (BAM, VCF, GFF)
– Assembly converter service
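Both the Variant Effect Predictor service and the private-data upload take VCF input. As a rough sketch of what a minimal file contains, the Python below writes the eight mandatory VCF columns for two made-up SNPs. The chromosome name, positions, alleles and the `snp_demo_1` identifier are invented for illustration, and `Snp` and `to_vcf` are hypothetical names. A real upload may need additional header lines that this sketch does not attempt to cover.

```python
from typing import Iterable, NamedTuple


class Snp(NamedTuple):
    chrom: str
    pos: int            # 1-based position, as VCF requires
    ref: str
    alt: str
    snp_id: str = "."   # "." marks a missing value in VCF


VCF_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def to_vcf(snps: Iterable[Snp]) -> str:
    """Return a minimal VCF text: one meta line, the column header, one row per SNP."""
    lines = ["##fileformat=VCFv4.1", "\t".join(VCF_COLUMNS)]
    # Rows are emitted sorted by chromosome name and position.
    for s in sorted(snps, key=lambda s: (s.chrom, s.pos)):
        # QUAL, FILTER and INFO are left as "." (missing) in this sketch.
        lines.append("\t".join([s.chrom, str(s.pos), s.snp_id, s.ref, s.alt, ".", ".", "."]))
    return "\n".join(lines) + "\n"


# Invented example records, not real Gramene data.
example = to_vcf([Snp("1", 2050, "G", "A"), Snp("1", 1200, "C", "T", "snp_demo_1")])
print(example)
```

The columns are tab-separated, and the sketch sorts records by position before writing, since many VCF consumers expect sorted input.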
Gramene offers…
• Genome sequence & gene annotations
• Comparative analysis
• Tools & services
– Genome browser
– GrameneMart
– FTP site
– Public MySQL
GrameneMart
Use case: Find transcription factors having “stop_gained” alleles
• Custom queries for data mining
• Gene-based & SNP-based queries
• Map, markers, and QTL data
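The use case above could also be expressed as a scripted query. The sketch below assumes GrameneMart accepts BioMart-style XML queries, as BioMart installations generally do, and only builds the XML text without sending it anywhere. The dataset, filter and attribute names are hypothetical placeholders, not GrameneMart's real identifiers. Selecting transcription factors through GO:0003700 (DNA-binding transcription factor activity) is also an assumption about how one might define the gene set.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_mart_query(dataset: str, filters: Dict[str, str], attributes: List[str]) -> str:
    """Build a BioMart-style XML query string (the query is not sent anywhere)."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE Query>\n' + body


# All names below are hypothetical placeholders for illustration only.
xml_query = build_mart_query(
    dataset="hypothetical_snp_dataset",
    filters={
        "hypothetical_consequence_filter": "stop_gained",
        "hypothetical_go_term_filter": "GO:0003700",
    },
    attributes=["hypothetical_gene_id", "hypothetical_variant_name"],
)
print(xml_query)
```

The real dataset and filter names would have to be looked up in the GrameneMart interface before a query like this could return anything.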
FTP site & MySQL server
FTP site for bulk downloads:
ftp://ftp.gramene.org/pub/gramene/
> mysql -hgramenedb.gramene.org -pgramene
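For scripted rather than interactive access, the same command can be assembled as an argument list and given a query via the mysql client's `-e` option. This is only a sketch: the host and password are copied from the command above, `mysql_command` is a hypothetical helper, and `SHOW DATABASES` is just an example statement. Whether the server still accepts these credentials is not something this snippet checks.

```python
import shlex
from typing import List


def mysql_command(host: str, password: str, sql: str) -> List[str]:
    """Assemble an argument list for a one-off, non-interactive mysql query."""
    return ["mysql", f"-h{host}", f"-p{password}", "-e", sql]


cmd = mysql_command("gramenedb.gramene.org", "gramene", "SHOW DATABASES")
print(shlex.join(cmd))

# To actually run it (requires the mysql client and network access):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems when the SQL statement contains spaces or quotes.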
Gramene offers…
• Genome sequence & gene annotations
• Comparative analysis
• Tools & services
• Metabolic & regulatory pathways
– BioCyc platform
– Plant Reactome
BioCyc-formatted Pathway DBs
Developed by Gramene
Gibberellin Biosynthesis I (non C-3, non C-13 hydroxylation)
Pathway diagram callouts:
• Manual curation?
• Evidence / Support
• Enzyme
• Compound / Metabolite
• Zoom up to chemical structures
Object Comparison Organism Selection page
Comparison of a Metabolic Pathway Across Species
• Click to visit pathway
• Click to visit the gene / gene product
• Click to visit the reaction
Pathways
• BioCyc databases
• Plant Reactome
– 129 rice pathways
– Arabidopsis, maize, other plant models
Synergy between Gramene & iPlant
• Genetic variation data submission
– VCF formatting of data for import into Ensembl Variation repository
• Representation of the Pan-Genome
– Large computational cost of genetic (k-mer) mapping
• BioCyc-formatted databases to be served from an iPlant virtual server
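To give a feel for why k-mer based pan-genome work is computationally heavy, here is a toy sketch of k-mer extraction and comparison. A sequence of length n yields n - k + 1 overlapping k-mers, so the number of k-mers to store and compare grows with genome size and with every accession added. The two sequences are invented, k = 4 is arbitrary, and `kmers` is a hypothetical helper. This is not the pipeline used by Gramene or iPlant, and it does not attempt the genetic mapping step itself.

```python
from typing import Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of distinct k-mers in seq (overlapping windows of length k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


reference = "ACGTACGTTAGC"
accession = "ACGTACGATAGC"   # invented; differs from reference by one base

ref_kmers = kmers(reference, 4)
acc_kmers = kmers(accession, 4)

# k-mers present in the accession but absent from the reference
novel = acc_kmers - ref_kmers
print(len(ref_kmers), len(acc_kmers), sorted(novel))
```

A single substitution can introduce up to k new k-mers, which is part of why presence/absence tables across many accessions become large.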
SAB 2013
Acknowledgements
Doreen Ware, PI (USDA ARS, CSHL)
Yinping Jiao, Sunita Kumari, Zhenyuan Lu, Marcela K. Monaco, Andrew Olson, Shiran Pasternak, Joshua Stein, Jim Thomason, Sharon Wei, Ken Youens-Clark, Bo Wang, Liya Wang
Pankaj Jaiswal, Co-PI (OSU)
Vindhya Amarasinghe, Palitha Dharmawardhana, Sushma Naithani, Justin Preece
Paul Kersey / Helen Parkinson (EMBL-EBI)
Dan Bolser, Arnaud Kerhornou, Dan Staines, Brandon Walts / Nuno Fonseca, Maria Keays, Robert Petrysyk, Eleanor Williams
Lincoln Stein (OICR)
Robin Haw; Peter D’Eustachio (NYU); Guanming Wu; David Croft (EBI)
Crispin Taylor (ASPB)
Patty Lockhart