Sharing Genetic Variation Data via EMBL-EBI: The European Variation Archive Gary Saunders, PhD www.ebi.ac.uk/eva

Jan 18, 2016

Transcript
Page 1

Sharing Genetic Variation Data via EMBL-EBI: The European Variation Archive

Gary Saunders, PhD

www.ebi.ac.uk/eva

Page 2

Agenda

• Overview of European Variation Archive (EVA)

• EVA model of data sharing

• Summary of how we share genetic variation data

• Merging open-access variation datasets

Page 3

European Variation Archive – EVA (Eva)

• Curated genetic variation data sharing & analysis platform

• All types of variation:

• SNVs, MNVs, small indels and structural variation

• Germ line, somatic, within / cross population, potentially between species

Single portal for open access variation data
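The variant classes listed on this slide can be told apart from the REF and ALT alleles of a VCF record. The following is a minimal sketch, not EVA's own logic: the function name `classify_variant` and the 50 bp cut-off between small indels and structural variation are assumptions made for illustration (50 bp is a commonly used convention, not something stated in the slides).

```python
def classify_variant(ref, alt, sv_min_length=50):
    """Assign a coarse variant class from REF and ALT allele strings.

    Hypothetical helper: the class names follow the slide, the
    sv_min_length threshold is an assumed convention.
    """
    # Symbolic alleles such as <DEL> and breakend notation with [ or ]
    # are how VCF describes structural variation without full sequence.
    if alt.startswith("<") or "[" in alt or "]" in alt:
        return "structural variation"
    # Equal-length alleles are substitutions: one base or several.
    if len(ref) == len(alt):
        return "SNV" if len(ref) == 1 else "MNV"
    # Unequal lengths are insertions or deletions; large ones are
    # treated as structural variation under the assumed threshold.
    if abs(len(ref) - len(alt)) >= sv_min_length:
        return "structural variation"
    return "small indel"


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("AC", "GT"), ("A", "ACGT"), ("A", "<DEL>")]:
        print(ref, alt, classify_variant(ref, alt))
```

Real submissions can carry several ALT alleles per record, so a fuller version would classify each ALT allele separately.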

Page 4

EVA Data Sharing Model

Submitter Archived at EBI

Sample(s) Methodology Genome

EVA

Page 5

EVA Data Sharing Model

Submitter Archived at EBI

Sample(s) Methodology Genome

EVA Publication

Collaborators

Wider Study Data

Stable POA Credit for reuse

Page 6

EVA: Study Browser

• Core EVA functionality: portal to open-access genetic variation project data (VCF files):

Page 7

EVA: Study Browser – project pages

• Core EVA functionality: portal to open-access genetic variation project data (VCF files):

Page 8

EVA: Study Browser – assessing data quality

• Core EVA functionality: portal to open-access genetic variation project data (VCF files):

Page 9

Submission to EVA

• Minimal or data-rich submissions are accepted

• Collaborative process

• Submitter recognition

• Hold date

• Links to runs / experiments / analyses

• Accession number in 48 hours

• EVA has a dynamic study loading pipeline

• Online documentation

[email protected]

Page 10

Rate of Submission to EVA

[Figure: rate of submission to EVA, March 2014 to October 2015. Series: Non-human, Total. Labelled value: 1 billion.]

Page 11

Merging Open-Access Datasets

Page 12

share data

Merging Open-Access Datasets

Page 13

Data submitters

share data

Merging Open-Access Datasets

Page 14

Data submitters

share data

Merging Open-Access Datasets

Page 15

Merging Open-Access Datasets

Page 16

Merging Open-Access Datasets

Page 17

Conclusion

European Variation Archive

www.ebi.ac.uk/eva

• Open-access genetic variation archive

• Curated resource

• All types of variants

• All species

• Simplified submission system

Page 18

Acknowledgments

Funding

EVA

• Justin Paschall

• Ignacio Medina Castello

• Gary Saunders

• Cristina Yenyxe Gonzalez

• Jag Kandasamy

• Ilkka Lappalainen

EGA

• Jeff Almeida-King

• Vasudev Kumanduri

• Saif Ur-Rehman

• Tom Smith

Ensembl Variation

• Fiona Cunningham

• Sarah Hunt

• William McLaren

• Anja Thormann

• Laurent Gil

ENA

• Rasko Leinonen

• Rajesh Radhakrishnan

• Daniel Vaughan

@ebivariation

www.ebi.ac.uk/eva

Page 19

Case-study: deCODE

• Variation data from 2000 Icelanders

• VCF files

• Novel samples and metadata, custom reference genome

• Hold until publication

Page 20

Case-study: deCODE

Page 21

Case-study: deCODE

Page 22

Variant Call Format (VCF): The Community Standard

• Most VCF validation tools do not truly conform to the specification:

• Of the ~250 human VCFs loaded into EVA, fewer than 10% were truly valid on the first pass

• (EVA has publicized a comprehensive C++ VCF validator that raises errors and warnings)

Most publicly available VCFs are not truly valid
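To make concrete what "truly valid" involves, here is a minimal sketch of a few structural checks drawn from the VCF specification. It is not the EVA C++ validator and covers only a small part of the specification; the function name `check_vcf` and the message wording are hypothetical.

```python
FIXED_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def check_vcf(lines):
    """Return (line_number, message) pairs for basic structural problems.

    Illustrative only: checks the fileformat line, the header columns,
    the column count of each record and that POS is an integer.
    """
    problems = []
    seen_header = False
    n_columns = None
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        # The specification requires the fileformat line to come first.
        if number == 1 and not line.startswith("##fileformat=VCFv"):
            problems.append((number, "first line must be ##fileformat=VCFv..."))
        if line.startswith("##"):
            if seen_header:
                problems.append((number, "meta-information after #CHROM header"))
            continue
        fields = line.split("\t")
        if line.startswith("#"):
            # Header line: must begin with the eight fixed columns.
            seen_header = True
            n_columns = len(fields)
            if fields[:8] != FIXED_COLUMNS:
                problems.append((number, "header must start with the 8 fixed columns"))
            continue
        if not seen_header:
            problems.append((number, "data line before #CHROM header"))
            continue
        # Every record needs the same number of tab-separated fields as the header.
        if len(fields) != n_columns or len(fields) < 8:
            problems.append((number, "wrong number of tab-separated fields"))
            continue
        if not fields[1].isdigit():
            problems.append((number, "POS must be an integer"))
    return problems


if __name__ == "__main__":
    example = [
        "##fileformat=VCFv4.2\n",
        "\t".join(FIXED_COLUMNS) + "\n",
        "1\t100\t.\tA\tG\t.\t.\t.\n",
    ]
    print(check_vcf(example))
```

A full validator also has to check meta-information syntax, INFO and FORMAT definitions, allele consistency and sorting, which is where files that pass lighter checks tend to fail.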

Page 23

Sharing Genetic Variation Data

Page 24

Sharing Genetic Variation Data

PROBLEMS

• Data accuracy

• Metadata

• Links to associated data

• Credit to data generator(s) for reuse