mzTab: Proposal for a Simple Data Format for Proteomics Results
Johannes Griss (jgriss@ebi.ac.uk)
PSI Meeting Heidelberg, April 2011
EBI is an Outstation of the European Molecular Biology Laboratory.

Current Situation

• The necessity of standard data formats has become generally accepted
• Proteomics techniques are constantly evolving
  • Proposed standard formats had to become very complex to adequately capture proteomics data
  • mzIdentML for identification data
  • mzQuantML for quantitative data
• Effective use of these data formats requires sophisticated bioinformatic knowledge
• Many researchers are still used to using MS Excel to “look” at their data

Communication of Proteomics Results

• Proteomics resources require a mechanism to simply/efficiently exchange basic proteomics results
• Collaboration with colleagues from other scientific fields is increasingly important
  • Necessity to share proteomics results with researchers outside of proteomics
• Need to make proteomics data easily accessible

Potential Current Problems

• Currently proposed standard formats are difficult to use without the Java APIs
• “Complete” standard formats are too complex and too large to quickly share the essential results
• Quick scripts (e.g. in Perl) for specific research questions are not easily possible
  • A large amount of potential innovation could be lost
• Reading files requires special software
  • Further processing of the data (e.g. with statistical tools) is not easily possible
  • No standard tools to read / write mz*ML files are available
  • Custom-built software is required for many use cases otherwise fulfilled by “Excel & friends”

mzTab – Aim

• To provide a simple and efficient way of exchanging proteomics data
  • Which protein / peptide was identified in a given experimental setting
• Easy to update and maintain
• Easy to use by the proteomics community, systems biologists as well as providers of knowledge bases

mzTab – Target Audience

• Proteomics repositories (e.g. PRIDE, PeptideAtlas)
• Knowledge base resources (e.g. UniProt, HPRD)
• Researchers outside of proteomics
• Researchers analyzing proteomics data with limited bioinformatic knowledge / support

mzTab – proposed concept

• A tab-delimited file format
• Goals
  • Content should be “readable” using MS Excel
  • Should contain minimal information for proteomics repositories / knowledge bases to exchange data
  • Data should be easily accessible using e.g. scripting languages
  • One file should be able to contain multiple experiments / proteins from different resources
    • Aim: To represent the result of a query to e.g. PRIDE using this format
  • Provide a simplistic summary of proteomics results
• Every entry contains a reference to the source data (in mzIdentML / mzQuantML format)

mzTab – proposed concept

• What the format does NOT aim at:
  • Replace mzIdentML or mzQuantML
  • Contain the complete data of a proteomics experiment
  • Provide detailed evidence for the data
  • Allow a researcher to recreate the process which led to the results
  • Conform to requirements (MIAPE, journal guidelines, etc.)
  • In short: be complete in any way

mzTab – Possible Format Specification

• Three sections
  • (Optional) Metadata section
  • (Required) Protein section
  • (Optional) Peptide section
• Can report proteomics data at different levels
  • Single experiments
  • Multiple (possibly linked) experiments
  • Data generated as the result of a query (possibly to multiple resources)
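
As a rough illustration of the three-section layout, the sketch below splits a file into its sections. It assumes the delimiters used in the example slides of this proposal (a line `----<name>` opening a section and a line `----END` closing it) and the section names `metadata`, `proteins` and `peptides`; the function names are hypothetical and nothing here is a final specification.

```python
from typing import Dict, List, Optional


def split_sections(text: str) -> Dict[str, List[str]]:
    """Group the non-empty lines of a draft mzTab file by section name.

    Assumption: a section starts with '----<name>' and ends with '----END',
    as in the example slides. Lines outside any section are ignored.
    """
    sections: Dict[str, List[str]] = {}
    current: Optional[str] = None  # name of the section we are inside
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "----END":
            current = None
        elif stripped.startswith("----"):
            current = stripped[4:].strip()
            sections.setdefault(current, [])
        elif current is not None and stripped:
            sections[current].append(line)
    return sections


def has_required_sections(sections: Dict[str, List[str]]) -> bool:
    """Only the protein section is marked as required in the proposal."""
    return "proteins" in sections
```

Keeping the raw lines per section leaves the choice of parser for each section open, for example a key/value reader for the metadata and a tab-separated reader for the two tables.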

mzTab – Metadata Section

----metadata
PRIDE_16649-title: The Synaptic Proteome during Development and Plasticity of the Mouse Visual Cortex
PRIDE_16649-species: [NEWT, 10090, Mouse,]
PRIDE_16649-tissue: [EFO, EFO:0000916, visual cortex,]
PRIDE_16649-instrument[1]-type: [MS, MS:1000287, TOF-MS,]
PRIDE_16649-search_engine: [MS, MS:1001207, Mascot, ]
PRIDE_16649-contact[1]-name: August B Smit
PRIDE_16649-contact[1]-email: [email protected]
PRIDE_16649-url: http://www.ebi.ac.uk/pride/q.do?accession=16649
----END
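
The metadata lines above appear to follow the pattern `<unit>-<field>: <value>`, where bracketed values look like controlled-vocabulary parameters with four comma-separated parts. The sketch below parses one such line under that reading. The interpretation of the parts as (CV label, accession, name, value) is an assumption, and `parse_metadata_line` is a hypothetical helper, not part of any mzTab library.

```python
from typing import List, Tuple, Union

Value = Union[str, List[str]]


def parse_metadata_line(line: str) -> Tuple[str, str, Value]:
    """Split '<unit>-<field>: <value>' into (unit, field, value).

    Bracketed values such as '[NEWT, 10090, Mouse,]' are returned as a
    list of stripped parts; everything else is returned as plain text.
    A naive comma split is used, so names containing commas would break.
    """
    key, colon, value = line.partition(":")  # first colon ends the key
    if not colon:
        raise ValueError(f"no ':' separator in metadata line: {line!r}")
    unit, dash, field = key.strip().partition("-")  # first dash ends the unit
    if not dash:
        raise ValueError(f"no '<unit>-<field>' key in metadata line: {line!r}")
    value = value.strip()
    if value.startswith("[") and value.endswith("]"):
        return unit, field, [part.strip() for part in value[1:-1].split(",")]
    return unit, field, value
```

Splitting on the first colon keeps URLs and accessions such as `EFO:0000916` intact, because the key itself never contains a colon in the example.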

mzTab – Protein Section

----proteins
Accession   …   reliability   peptides   …   ambiguity_members
P12345      …   4             2          …   P12346,P123457
…
----END

• A table holding the basic identification information
• Suggestions of how to include
  • quantitative data
  • multiple search engine scores
  • ambiguous modification positions
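
Because the protein table is plain tab-separated text, a few lines of script are enough to read it. The sketch below uses only the four columns visible on the slide (`Accession`, `reliability`, `peptides`, `ambiguity_members`); the elided columns are left out, treating `reliability` and `peptides` as integers is an assumption, and `read_protein_rows` is a hypothetical name.

```python
import csv
from typing import Dict, Iterable, List


def read_protein_rows(lines: Iterable[str]) -> List[Dict[str, object]]:
    """Read the header line and data rows of a protein section.

    `lines` are the tab-separated lines between '----proteins' and
    '----END'. The comma-separated ambiguity_members cell is split
    into a list of accessions.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    rows: List[Dict[str, object]] = []
    for record in reader:
        members = record.get("ambiguity_members") or ""
        rows.append({
            "accession": record["Accession"],
            "reliability": int(record["reliability"]),  # assumed integer
            "peptides": int(record["peptides"]),        # assumed integer
            "ambiguity_members": [m for m in members.split(",") if m],
        })
    return rows
```

The same rows could be opened directly in a spreadsheet, which is the point of the format; the script form is only useful when the table feeds further processing.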

mzTab – Peptide Table

----peptides
sequence   accession   unit         unique   …   reliability   …
DIIL       O00160      PRIDE_3381   false    …   5             …
VESVDL     O00160      PRIDE_3381   true     …   4             …
----END

• A table holding the basic peptide information
• Suggestions of how to include
  • quantitative data
  • multiple search engine scores
  • ambiguous modification positions
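
Peptide rows carry the accession of the protein they belong to, so the two tables can be linked with a simple grouping step. The following sketch groups peptide rows by `accession`, using only the columns shown on the slide. Reading `unique` as a true/false flag and `reliability` as an integer are assumptions, and `peptides_by_protein` is a hypothetical name.

```python
import csv
from collections import defaultdict
from typing import Dict, Iterable, List


def peptides_by_protein(lines: Iterable[str]) -> Dict[str, List[dict]]:
    """Group the rows of a peptide section by protein accession.

    `lines` are the tab-separated lines between '----peptides' and
    '----END', starting with the header line.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for record in reader:
        grouped[record["accession"]].append({
            "sequence": record["sequence"],
            "unit": record["unit"],
            # 'true' / 'false' text is assumed to be a boolean flag
            "unique": record["unique"].strip().lower() == "true",
            "reliability": int(record["reliability"]),  # assumed integer
        })
    return dict(grouped)
```

With the two example rows from the slide, both peptides end up under `O00160`, and the `unit` value still identifies the experiment each one came from.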