MS (and NMR) data standards in Metabolomics why, how and some caveats Steffen Neumann Leibniz Institute of Plant Biochemistry ScienceCampus Halle (WCH) June 23, 2014 S. Neumann (IPB-Halle.DE) (Raw) data standards in metabolomics June 23, 2014

MS (and NMR) data standards in Metabolomics why, how and some caveats

Sep 05, 2014


Steffen Neumann

Presentation held at the Metabolomics Society meeting 2014 in Tsuruoka on netCDF, mzData, mzML and nmrML for Metabolomics
Transcript
Page 1: MS (and NMR) data standards in Metabolomics why, how and some caveats

MS (and NMR) data standards in Metabolomics: why, how and some caveats

Steffen Neumann

Leibniz Institute of Plant Biochemistry
ScienceCampus Halle (WCH)

June 23, 2014

S. Neumann (IPB-Halle.DE) (Raw) data standards in metabolomics June 23, 2014

Page 2: MS (and NMR) data standards in Metabolomics why, how and some caveats

Metabolomics – The Pipeline


Page 3: MS (and NMR) data standards in Metabolomics why, how and some caveats

IPB machine Park

Data processing from

● LC-QqTOF-MS: QStar Pulsar i, microTOF Q
● Bruker: Apex (FTICR), HCT Ultra (IT-MS, CID+ETD), Reflex III (MALDI-TOF)
● Thermo Finnigan: Quantum Ultra AM, LCQ Deca XP


Page 4: MS (and NMR) data standards in Metabolomics why, how and some caveats

netCDF: Grandfather is still alive

● netCDF as file format, ANDI-MS as content specification
● fine for GC/MS and simple LC/MS
● widely supported in software and programming languages
● no mix of MS and MS/MS
● very poor metadata
● Defined in Standard: “ASTM E1947 – 98(2009) Standard Specification for Analytical Data Interchange Protocol for Chromatographic Data”
● available for only $42


Page 5: MS (and NMR) data standards in Metabolomics why, how and some caveats

netCDF in action: I

dataset_completeness = "C1+C2" ;
ms_template_revision = "1.0.1" ;
dataset_origin = "PE-SCIEX" ;
experiment_date_time_stamp = "20050928190327+0100" ;
operator_name = "SYSTEM" ;
source_file_reference = "d:\tt4_4_1.wiff" ;
source_file_format = "PE-SCIEX Wiff version 1" ;
experiment_type = "Continuum Mass Spectrum" ;
test_separation_type = "Normal Phase Liquid Chromatography" ;
test_ms_inlet = "Electrospray Inlet" ;
test_ms_inlet_temperature = 20.f ;
test_ionization_mode = "Electrospray Ionization" ;
test_ionization_polarity = "Positive Polarity" ;
test_detector_type = "Electron Multiplier" ;
test_resolution_type = "Constant Resolution" ;
test_scan_function = "Mass Scan" ;
test_scan_direction = "Up" ;
test_scan_law = "Linear" ;
actual_run_time_length = 3480.54 ;
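The attribute dump above is plain text in the style of `ncdump` output, so the metadata can be collected with a few lines of code. The following is a minimal sketch that parses such `name = value ;` lines into a dictionary. The function name `parse_cdl_attributes` and the sample text are illustrative assumptions; a real workflow would more likely read the attributes through a netCDF library.

```python
import re

# Matches attribute lines such as:  name = "text" ;   or   name = 20.f ;
ATTR_LINE = re.compile(r'^\s*:?(\w+)\s*=\s*(.+?)\s*;\s*$')


def parse_cdl_attributes(text):
    """Return a dict mapping attribute names to str or float values."""
    attrs = {}
    for line in text.splitlines():
        match = ATTR_LINE.match(line)
        if not match:
            continue  # skip blank lines and anything that is not an attribute
        name, raw = match.groups()
        if raw.startswith('"') and raw.endswith('"'):
            attrs[name] = raw[1:-1].strip()
        else:
            # Numeric values may carry a type suffix, e.g. "20.f" for a float.
            attrs[name] = float(raw.rstrip("fFdD"))
    return attrs


# A few lines copied from the slide (hypothetical selection).
sample = '''
experiment_type = "Continuum Mass Spectrum" ;
test_ionization_polarity = "Positive Polarity" ;
test_ms_inlet_temperature = 20.f ;
actual_run_time_length = 3480.54 ;
'''

attrs = parse_cdl_attributes(sample)
print(attrs["test_ionization_polarity"], attrs["actual_run_time_length"])
```

The short list of keys also illustrates the "very poor metadata" point: everything is a flat string or number, with no controlled vocabulary behind it.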


Page 6: MS (and NMR) data standards in Metabolomics why, how and some caveats

netCDF in action: II

scan_acquisition_time = 2.00100016593933, 4.00200033187866,
    6.00100040435791, 8.00100040435791, 10.0040006637573, 12.0020008087158,
    14.0020008087158, 16.0020008087158, 18.0040016174316, 20.0020008087158,

total_intensity = 10541, 10640, 10697, 10455, 10707, 10554, 10612, 10434,
    10738, 10504, 10567, 10646, 10675, 10660, 10676, 10638, 10498, 10581,
    10655, 10843, 10650, 10703, 10792, 10667, 10564, 10732, 10613, 10766,

mass_values = 106.0288, 106.038, 106.0564, 106.061, 106.0656, 106.0702,
    106.0748, 106.0794, 106.0931, 106.9725, 106.9771, 106.9817, 106.9863,
    106.9909, 106.9955, 107.0001, 107.0047, 107.0094, 107.014, 107.0324,
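The listing shows that all scans share one long `mass_values` array. A minimal sketch of how such flat arrays can be cut back into individual scans follows. It assumes the ANDI-MS variables `intensity_values`, `scan_index` (start offset of each scan) and `point_count` (number of points per scan), which are not shown on the slide; the intensity numbers are made up for illustration.

```python
def split_scans(mass_values, intensity_values, scan_index, point_count):
    """Cut flat m/z and intensity arrays into one list of (m/z, intensity) per scan."""
    scans = []
    for start, count in zip(scan_index, point_count):
        stop = start + count
        scans.append(list(zip(mass_values[start:stop],
                              intensity_values[start:stop])))
    return scans


# First five m/z values from the slide; intensities and offsets are invented.
mass_values = [106.0288, 106.038, 106.0564, 106.061, 106.0656]
intensity_values = [12.0, 30.0, 7.0, 18.0, 25.0]
scan_index = [0, 3]    # scan 0 starts at offset 0, scan 1 at offset 3
point_count = [3, 2]   # scan 0 has 3 points, scan 1 has 2

scans = split_scans(mass_values, intensity_values, scan_index, point_count)

# Summing the intensities of each scan gives a total ion current per scan,
# the kind of value stored in total_intensity.
tic = [sum(intensity for _, intensity in scan) for scan in scans]
print(len(scans), tic)
```

Because every scan is just a slice of the same array, there is no natural place to attach per-scan precursor information, which fits the "no mix of MS and MS/MS" limitation mentioned earlier.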


Page 7: MS (and NMR) data standards in Metabolomics why, how and some caveats

More metadata in XML: mzData

Proteomics Standards Initiative
Raw / Measurement Data:
● Mass Spec Equipment
● Software
● (Raw) Peaks
● Isolation windows, collision energies, . . .

Vendor Support: Bruker, Applied Biosystems, Kratos Analytical, Matrix Science, . . .
“Competitor”: mzXML


Page 8: MS (and NMR) data standards in Metabolomics why, how and some caveats

mzData + mzXML = mzML

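[Figure: mzML development timeline, divided into "Early Development" and "Final Development". Formats and versions shown: mzData 1.05, mzXML 3.0, dataXML 0.6, mzML 0.90, mzML 0.91, mzML 0.99 RC, mzML 1.0.0 (Release! 2008-06), mzML 1.1.0 RC5, mzML 1.1.0 (Release! 2009-06). Meetings shown: SFO 2006-05, DC 2006-09, Lyon 2007-04, EBI 2007-06, PSI Doc Proc 2007-11, Toledo 2008-04, Turku 2009-04.]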

● HUPO-PSI
● More stable than mzXML
● Better defined than mzData
● Reference implementations
● Early vendor involvement

[Figure: mzML document structure. Elements shown: mzML, cvList, referenceableParamGroupList, sampleList, instrumentConfigurationList, softwareList, dataProcessingList, acquisitionSettingsList, run, spectrumList (spectrum, spectrumDescription, scan, precursorList, binaryDataArray), chromatogramList (chromatogram, binaryDataArray).]


Martens, Chambers, Sturm, Kessner, Levander, Shofstahl, Tang, Römpp, Neumann, Pizarro, Montecchi-Palazzi, Tasman, Coleman, Reisinger, Souda, Hermjakob, Binz, Deutsch. mzML–a community standard for mass spectrometry data. Mol Cell Proteomics. (2011)

Page 9: MS (and NMR) data standards in Metabolomics why, how and some caveats

mzML in action: I

<mzML>
  <cv id="MS" fullName="PSI MS Vocabularies" />
  <cv id="UO" fullName="unit" />

  <fileContent>
    <cvParam cv="MS" name="MS1 spectrum"/>
    <cvParam cv="MS" name="MSn spectrum"/>
    <cvParam cv="MS" name="centroid spectrum"/>
  </fileContent>

  <sourceFile id="sourceFile" location="C:/MSMSpos15_MM48_1_2-18485.d/analysis.baf">
    <cvParam cv="MS" name="Bruker BAF file"/>
    <cvParam cv="MS" name="SHA-1" value="4ef...7c0"/>
  </sourceFile>
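The `SHA-1` entry in `sourceFile` lets a reader check that a vendor file is still the one the mzML was converted from. A minimal sketch of computing such a checksum with the Python standard library follows; the helper name `sha1_of_file` and the temporary demo file are assumptions for illustration, and the elided checksum on the slide is not reproduced.

```python
import hashlib
import os
import tempfile


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a small temporary file standing in for a raw data file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
    demo_path = tmp.name
try:
    demo_digest = sha1_of_file(demo_path)
finally:
    os.remove(demo_path)

print(demo_digest)
```

Comparing this digest with the value recorded in the mzML file would reveal whether the raw file was modified or swapped after conversion.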


Page 10: MS (and NMR) data standards in Metabolomics why, how and some caveats

mzML in action: II

<software id="exportSoftware" version="3.0.5">
  <cvParam cv="MS" name="CompassXport"/>
</software>
<software id="recalibrationSoftware" version="4.0.234.0">
</software>

<instrumentConfiguration id="instrument">
  <cvParam cv="MS" name="micrOTOF-Q"/>
</instrumentConfiguration>

<dataProcessing id="export">
  <processingMethod order="1" softwareRef="instrumentSoftware">
    <cvParam cv="MS" accession="MS:1000035" name="peak picking"/>
  </processingMethod>
  <processingMethod order="2" softwareRef="recalibrationSoftware">
    <cvParam cv="MS" name="m/z calibration"/>
  </processingMethod>
  <processingMethod order="3" softwareRef="exportSoftware">
    <cvParam cv="MS" name="Conversion to mzML"/>
  </processingMethod>
</dataProcessing>
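The `dataProcessing` block records what happened to the data and in which order, which is exactly the provenance that netCDF could not carry. A minimal sketch of reading that history with the Python standard library follows. The snippet is the simplified excerpt from the slide; real mzML files declare an XML namespace, so the plain tag names used here are an assumption that only holds for this excerpt.

```python
import xml.etree.ElementTree as ET

# Simplified excerpt as shown on the slide (no namespace declaration).
snippet = """
<dataProcessing id="export">
  <processingMethod order="1" softwareRef="instrumentSoftware">
    <cvParam cv="MS" accession="MS:1000035" name="peak picking"/>
  </processingMethod>
  <processingMethod order="2" softwareRef="recalibrationSoftware">
    <cvParam cv="MS" name="m/z calibration"/>
  </processingMethod>
  <processingMethod order="3" softwareRef="exportSoftware">
    <cvParam cv="MS" name="Conversion to mzML"/>
  </processingMethod>
</dataProcessing>
"""


def processing_history(xml_text):
    """Return (order, softwareRef, [step names]) tuples sorted by processing order."""
    root = ET.fromstring(xml_text)
    methods = sorted(root.iter("processingMethod"),
                     key=lambda method: int(method.get("order")))
    return [
        (int(method.get("order")),
         method.get("softwareRef"),
         [param.get("name") for param in method.findall("cvParam")])
        for method in methods
    ]


history = processing_history(snippet)
for order, software, steps in history:
    print(order, software, ", ".join(steps))
```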


Page 11: MS (and NMR) data standards in Metabolomics why, how and some caveats

mzML in action: your data

<spectrum id="scan=16">
  <cvParam cv="MS" name="positive scan"/>
  <cvParam cv="MS" name="MS2 spectrum"/>
  <cvParam cv="MS" name="centroid spectrum"/>
  <precursor>
    <cvParam cv="MS" name="selected ion m/z" value="542.1" unitName="m/z"/>
    <activation>
      <cvParam cv="MS" name="collision energy" value="15.0" unitName="electronvolt"/>
      <cvParam cv="MS" name="low-energy collision-induced dissociation"/>
    </activation>
  </precursor>
  <binaryDataArray>
    <cvParam cv="MS" name="zip compression"/>
    <cvParam cv="MS" name="m/z array" unitName="m/z"/>
    <binary>eNrj/luT+KC02sEswyJj5...doaB42HsdAItdCw4=</binary>
  </binaryDataArray>
  <binaryDataArray>
    <cvParam cv="MS" name="zip compression"/>
    <cvParam cv="MS" name="intensity array" unitName="counts"/>
    <binary>eNpjYACCBkcHBjCwhdKWD...gAXvgH4</binary>
  </binaryDataArray>
</spectrum>
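The `binary` elements hold the actual peak lists as compressed, base64-encoded numbers. A minimal round-trip sketch of that encoding follows. It assumes that "zip compression" means zlib and that values are 64-bit little-endian floats; in a real file the precision is declared by a further cvParam that is not shown on the slide. The helper names are illustrative, and the truncated strings from the slide are not decoded here.

```python
import base64
import struct
import zlib


def encode_array(values, compress=True):
    """Pack floats as little-endian doubles, optionally zlib-compress, then base64."""
    raw = struct.pack("<%dd" % len(values), *values)
    if compress:
        raw = zlib.compress(raw)
    return base64.b64encode(raw).decode("ascii")


def decode_array(text, compressed=True):
    """Reverse of encode_array: base64 -> optional zlib -> list of floats."""
    raw = base64.b64decode(text)
    if compressed:
        raw = zlib.decompress(raw)
    count = len(raw) // 8  # 8 bytes per 64-bit float
    return list(struct.unpack("<%dd" % count, raw))


mz = [106.0288, 106.038, 542.1]
encoded = encode_array(mz)
decoded = decode_array(encoded)
print(encoded)
print(decoded)
```

A zlib stream starts with the byte 0x78, which base64 renders as a leading "eN" or "eJ"; the strings on the slide begin with "eN", which is consistent with this reading.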


Page 12: MS (and NMR) data standards in Metabolomics why, how and some caveats

www.openms.de

● Originally for MS-based Proteomics
● Reads mzData, mzXML, mzML
● NetCDF (Not on 64bit!)
● FileInfo, FileConverter, FileFilter, . . .
● plus Calibration, Merge, NoiseFilter, . . .
● TOPPView Viewer and GUI

⇒ Very useful for preprocessing


M. Sturm, A. Bertsch, C. Gröpl, A. Hildebrandt, R. Hussong, E. Lange, N. Pfeifer, O. Schulz-Trieglaff, A. Zerck, K. Reinert, O. Kohlbacher, 2008. OpenMS – an Open-Source Software Framework for Mass Spectrometry. BMC Bioinformatics, doi:10.1186/1471-2105-9-163.

Page 13: MS (and NMR) data standards in Metabolomics why, how and some caveats

http://proteowizard.sourceforge.net/

● Originally for MS-based Proteomics
● cross-platform (MSVC on Windows, gcc on Linux, XCode on OSX)
● open source (Apache v2)
● Formats supported on all platforms: mzML, mzXML, MGF
● Formats supported on Windows with vendor libraries installed: Thermo RAW, Waters RAW, Bruker FID/YEP/BAF
● msconvert: conversion tool
● msdiff: validation of conversion/preprocessing
● msaccess: command line access: binary data and metadata, EICs & pseudo-2D gel image creation
● SeeMS: interactive viewer for mass spec data files (Windows only)


Chambers, Maclean, Burke, Amodei, Ruderman, Neumann, Gatto, Fischer, Pratt, Egertson, Hoff, Kessner, Tasman, Shulman, Frewen, Baker, Brusniak, Paulse, Creasy, Flashner, Kani, Moulding, Seymour, Nuwaysir, Lefebvre, Kuhlmann, Roark, Rainer, Gerd, Hemenway, Huhmer, Langridge, Eckels, Connolly, Stearns, Deutsch, Katz, Agus, MacCoss, Tabb, Mallick. A cross-platform toolkit for mass spectrometry and proteomics. Nat. Biotech. (2012)
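For batch conversion, msconvert is usually driven from a script. The sketch below only builds the command line and does not run it. The options `--mzML`, `--zlib`, `-o` and `--filter "peakPicking true 1-"` are quoted from memory of the ProteoWizard documentation and should be checked against `msconvert --help` for the installed version; the file and directory names are hypothetical.

```python
import shlex


def msconvert_command(raw_file, out_dir, zlib=True, centroid=False):
    """Build an msconvert argument list for converting one vendor file to mzML."""
    cmd = ["msconvert", raw_file, "--mzML", "-o", out_dir]
    if zlib:
        cmd.append("--zlib")  # compress binary arrays
    if centroid:
        # Vendor peak picking on all MS levels (assumed filter syntax).
        cmd += ["--filter", "peakPicking true 1-"]
    return cmd


cmd = msconvert_command("sample_01.d", "converted", centroid=True)
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where msconvert and the vendor libraries are installed.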

Page 14: MS (and NMR) data standards in Metabolomics why, how and some caveats

Converters: Notes

https://xcmsonline.scripps.edu/docs/fileformats.html

Bruker:
● Calibration requires setting a specific Registry Key:
  HKEY_CURRENT_USER\Software\Bruker Daltonik\CompassXport
  UseRecalibratedSpectra=1

Waters:
● No support for calibration in Waters DLL used by msconvert
● DataBridge writes netCDF only, and writes calibrated data
● Ancient massWolf requires full MassLynx installed, will use calibrated data, but intermingle LockMass Scans


Page 15: MS (and NMR) data standards in Metabolomics why, how and some caveats

Plumbing: libraries for mzML

● pymzML (Python) http://pymzml.github.io/
● jmzML (Java) https://code.google.com/p/jmzml/
● OpenMS (C++) https://www.openms.de/
● Proteowizard (C++) http://proteowizard.sourceforge.net/
● mzR (R/Bioconductor) http://www.bioconductor.org/packages/release/bioc/html/mzR.html

. . . and many more!


Page 16: MS (and NMR) data standards in Metabolomics why, how and some caveats

MS and Metabolomics in BioC

● Collection of biology-related R packages
● Started back in 2002
● Current release: >500 packages!

Package          Maintainer                Title
mzR              Gatto, me, Fischer        parser for netCDF, mzXML, mzData and mzML
xcms             Ralf Tautenhahn           LC/MS and GC/MS Data Analysis
MassSpecWavelet  Pan Du                    Mass spectrum processing by wavelet-based algorithms
CAMERA           Carsten Kuhl              Collection of Annotation related MEthods for mass spectRometry dAta
Rdisop           Steffen Neumann           Decomposition of Isotopic Patterns
MSnbase          Laurent Gatto             Base Functions and Classes for MS-based Proteomics
iontree          Mingshu Cao               Data management and analysis of ion trees from ion-trap MS
rpubchem         Rajarshi Guha             Interface to the PubChem Collection
KEGGSOAP         R. Gentleman              client interface to the KEGG SOAP server
apComplex        D. Scholtens              Estimate protein complex membership using AP-MS protein data
PROcess          X. Li                     Ciphergen SELDI-TOF Processing
simulatorAPMS    Tony Chiang               Computationally simulates the AP-MS technology.
TargetSearch     Cuadros-Inostroza et al.  analysis of GC-MS metabolite profiling data.
flagme           Mark Robinson             Analysis of Metabolomics GC/MS Data


Page 17: MS (and NMR) data standards in Metabolomics why, how and some caveats

LC-MS Data preprocessing with XCMS

www.bioconductor.org

● Import: netCDF, mzXML, mzData, mzML
● Peak detection
● Peak alignment
● Peak integration
● “Differential” metabolites
● Compatible with all MS instruments at the IPB


Lange, Tautenhahn, Neumann, Gröpl. Critical assessment of alignment procedures for LC-MS proteomics and metabolomics measurements. BMC Bioinformatics (2008)

Page 18: MS (and NMR) data standards in Metabolomics why, how and some caveats

FTICR Peak Picking

Bioconductor Package “MassSpecWavelet”
Integration into XCMS:
● Same Annotation and Identification
● Same statistics
● (Same database schema)

[Figure: three panels over m/z 380 to 384. a) MS raw spectrum (intensity vs. m/z value); b) CWT coefficients (CWT coefficient scale vs. m/z value); c) Identified peaks with SNR > 3 (intensity vs. m/z value).]


Student project by Sebastian Wolf & Michael Gerlich. Reference: Du, Kibbe, Lin: Peak Detection of Mass Spectrometry Spectrum by Continuous Wavelet Transform based Pattern Matching, Bioinformatics (2008)
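The idea behind the three panels (raw spectrum, wavelet coefficients, peaks above a signal-to-noise threshold) can be sketched in a few lines. The following is a deliberately simplified single-scale version, not the MassSpecWavelet algorithm, which works across many scales and follows ridge lines. The wavelet width, the externally supplied noise level and the synthetic spectrum are all assumptions made for illustration.

```python
import math


def ricker(width):
    """Mexican-hat (Ricker) wavelet sampled on an odd-length, symmetric grid."""
    half = int(5 * width)
    norm = 2.0 / (math.sqrt(3.0 * width) * math.pi ** 0.25)
    kernel = []
    for offset in range(-half, half + 1):
        t = offset / width
        kernel.append(norm * (1.0 - t * t) * math.exp(-t * t / 2.0))
    return kernel


def cwt_single_scale(signal, width):
    """Correlate the signal with one Ricker wavelet (zero-padded at the edges)."""
    kernel = ricker(width)
    half = len(kernel) // 2
    n = len(signal)
    coeffs = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + k - half
            if 0 <= j < n:
                acc += signal[j] * weight
        coeffs.append(acc)
    return coeffs


def pick_peaks(signal, width, noise, snr=3.0):
    """Return indices of local maxima of the coefficients with coefficient/noise > snr."""
    coeffs = cwt_single_scale(signal, width)
    peaks = []
    for i in range(1, len(coeffs) - 1):
        is_local_max = coeffs[i] > coeffs[i - 1] and coeffs[i] >= coeffs[i + 1]
        if is_local_max and coeffs[i] / noise > snr:
            peaks.append(i)
    return peaks


# Synthetic spectrum: two Gaussian peaks (sigma = 3 points) at indices 50 and 120.
signal = [100.0 * math.exp(-((i - 50) ** 2) / 18.0)
          + 40.0 * math.exp(-((i - 120) ** 2) / 18.0)
          for i in range(200)]

peaks = pick_peaks(signal, width=3.0, noise=1.0)
print(peaks)
```

The wavelet has zero mean, so a flat or slowly varying baseline contributes little to the coefficients, which is one reason wavelet-based picking copes well with profile-mode data.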

Page 19: MS (and NMR) data standards in Metabolomics why, how and some caveats

Plumbing: mzR for MS raw data

● New in BioC 2.10 (Oct 2011)
● Joint work Fischer/Gatto/Neumann
● Conglomerate of former XCMS code, ISB Ramp, Proteowizard via Rcpp
● Read netCDF, mzXML, mzData, mzML (mz5 soon?)
● Read mzIdentML, mzQuantML one day?
● To become the affyIO of MS data?!
● GSoC project 2014 to improve mzR

[Figure: mzR with its backends mzRramp, mzRpwiz and mzRnetCDF]


Chambers, Maclean, Burke, Amodei, Ruderman, Neumann, Gatto, Fischer, Pratt, Egertson, Hoff, Kessner, Tasman, Shulman, Frewen, Baker, Brusniak, Paulse, Creasy, Flashner, Kani, Moulding, Seymour, Nuwaysir, Lefebvre, Kuhlmann, Roark, Rainer, Gerd, Hemenway, Huhmer, Langridge, Eckels, Connolly, Stearns, Deutsch, Katz, Agus, MacCoss, Tabb, Mallick. A cross-platform toolkit for mass spectrometry and proteomics. Nat. Biotech. (2012)

Page 20: MS (and NMR) data standards in Metabolomics why, how and some caveats

imzML: imaging mass spectrometry in mzML

● Huge data files, complex access patterns
● imzML: same ol' mzML, but base64 in 2nd data file
● Some new CV terms
● faster access
● 7/8 space reduction
● lossless mzML ↔ imzML
● http://www.imzml.org

⇒ Open MS imaging software!


Schramm T, Hester A, Klinkert I, Both J-P, Heeren RMA, Brunelle A, Laprévote O, Desbenoit N, Robbe M-F, Stoeckli M, Spengler B, Römpp A (2012) imzML - A common data format for the flexible exchange and processing of mass spectrometry imaging data. J. of Proteomics, doi:10.1016/j.jprot.2012.07.026
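Moving the spectra out of the XML into a second file is what gives imzML its faster access: a reader can jump straight to one pixel's array instead of parsing and decoding everything before it. The sketch below illustrates that access pattern with an in-memory byte string. It assumes 32-bit little-endian floats and an (offset, length) index kept alongside the data; the helper names are hypothetical and details of the real binary file layout are not modelled.

```python
import base64
import struct


def write_arrays(arrays):
    """Pack float32 arrays back to back; return the blob and an (offset, length) index."""
    blob = bytearray()
    index = []
    for values in arrays:
        index.append((len(blob), len(values)))  # byte offset, number of values
        blob += struct.pack("<%df" % len(values), *values)
    return bytes(blob), index


def read_array(blob, offset, length):
    """Read one array directly at its byte offset, without touching the others."""
    return list(struct.unpack_from("<%df" % length, blob, offset))


# Two tiny "pixels" with made-up intensities.
blob, index = write_arrays([[100.0, 200.5], [1.0, 2.0, 3.0]])
second = read_array(blob, *index[1])
print(index, second)

# For comparison: the same bytes embedded as base64 text grow by about a third.
print(len(blob), len(base64.b64encode(blob)))
```

In an actual imzML file the offsets and lengths would be stored as CV terms in the XML part, so the XML stays small and human-readable while the bulk data stays binary.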

Page 21: MS (and NMR) data standards in Metabolomics why, how and some caveats

mz5: netCDF meets mzML

● Convert from XML to HDF5
● HDF5: big cousin of netCDF

Pros:
● size reduction 54%
● read/write speed 3–4-fold
● Fully implemented in pwiz
● HDF5 API for most languages

Cons:
● Not human-readable
● Kills emacs and wordpad


M. Wilhelm, M. Kirchner, J. Steen, H. Steen. mz5: Space- and Time-efficient Storage of Mass Spectrometry Data Sets. MCP, doi:10.1074/mcp.O111.011379

Page 22: MS (and NMR) data standards in Metabolomics why, how and some caveats

Focus of standards in NMR

● D2.6 Metadata: ISAtab
● D2.4 Raw data: nmrML
● Metabolite Identification: mzTab
● Metabolite Quantification: mzTab

Page 23: MS (and NMR) data standards in Metabolomics why, how and some caveats

Capture NMR raw data (equivalent to mzML)
Ingredients for nmrML standard:

● XML Schema and controlled vocabulary (CV)

● Examples, converters and validation suite

● COSMOS partners involved: IPB, EMBL-EBI, UB2, UBHAM, UOXF, IMPERIAL, MRC, Mike Wilson (Canada), Matthias Klein (D), Ian Lewis (US)

New format: nmrML (D2.4, Raw data, nmrML)

Page 24: MS (and NMR) data standards in Metabolomics why, how and some caveats

github.com as development platform

● Web site with content management: http://nmrml.org/

● Version control system, issue tracker, activity statistics

● Free for open source projects

nmrML infrastructure (D2.4, Raw data, nmrML)

Page 25: MS (and NMR) data standards in Metabolomics why, how and some caveats

● Controlled vocabulary developed as OWL ontology

● Based on earlier work by MSI, D. Rubtsov and J. Cruz

● ISAtab can leverage ontologies

● With semantic web / RDF / SPARQL in mind for later deliverables

nmrML Ontology (D2.4, Raw data, nmrML)

Page 26: MS (and NMR) data standards in Metabolomics why, how and some caveats

The need for an open nmr standard

nmrML: an XML-based open standard for NMR data storage and exchange

NMR data is currently accumulating in local data silos, hindering distribution and secondary data usage. Cross platform NMR data access, integration and comparison is hindered by incompatible vendor formats and the lack of a robust vendor-agnostic NMR data standard. Data in proprietary data formats ages fast, posing the danger of irreproducible data from older studies. An open vendor-neutral storage standard is needed as long-term archival format, if emerging metabolomics repositories are to capture data from all vendor formats in a persistent way, yet supporting the dynamics in this field.

To ease format conversions we deliver parsers for Bruker and Varian data formats, which can be incorporated into open NMR processing and analysis software.

Parsers

Although coverage of raw data capture is good, the XSD and CV will be expanded to better cover processed data and quantification data. Our standard is accepted by major open source NMR data processing tools and will provide the MetaboLights repository with a stable storage format.

Daniel Schober 1, Michael Wilson2, Daniel Jacob3, Annick Moing3, Catherine Deborde3, Luis de Figueiredo4, Kenneth Haug4,

Philippe Rocca-Serra5, John Easton6, Christian Ludwig7, Antonio Rosato8, David Wishart2, Christoph Steinbeck4, Reza Salek4, Steffen Neumann1

1Leibniz Institute of Plant Biochemistry, Dept. of Stress and Developmental Biology, Weinberg 3, 06120 Halle, Germany

2Department of Computing/Biological Sciences, University of Alberta, Edmonton, Canada

3INRA, Univ. Bordeaux, Metabolome Facility of Bordeaux Functional Genomics Center, 71 av Edouard Bourlaux, F-33140 Villenave d’Ornon, France

4European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK

5University of Oxford, e-Research Centre, 7 Keble Road, Oxford, OX1 3QG, UK

6School of Electronic, Electrical and Computer Engineering, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK

7School of Cancer Sciences, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK

8Magnetic Resonance Center (CERM), University of Florence, 50019 Sesto Fiorentino (FI), Italy

[Figures: nmrML XML schema excerpt; nmrML data example; nmrML use cases]

The COordination of Standards in MetabOlomicS (COSMOS) EU consortium has teamed up with the Metabolomics Standards Initiative to create an open exchange and storage format for NMR data. We largely follow design principles already established by the Proteomics Standards Initiative (PSI) for the mzML data standard for mass spectrometry. The standard is composed of an XML schema (nmrML.xsd) and an accompanying controlled vocabulary (nmrCV.owl). The more variable and dynamic descriptors are kept in the vocabulary, which is referenced from within an nmrML file, so the vocabulary can be updated without changing the schema and the schema itself stays robust.

• Website: http://www.nmrML.org

• Github: https://github.com/nmrML/nmrML

• nmrML validator: http://msbi.ipb-halle.de/nmrML/index.php

• Cosmos: http://www.cosmos-fp7.eu/

• Email: [email protected]

• Google Group: https://groups.google.com/forum/?hl=en#!forum/nmrml/join

Data from a paper: Farag, M., Porzel, A., Schmidt, J. & Wessjohann, L. Metabolite profiling and fingerprinting of commercial cultivars of Humulus lupulus L. (hop) - a comparison of MS and NMR methods in metabolomics, Metabolomics 8, 492-507, (2012)

<nmrML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://nmrml.org/schema ../../../xml-schemata/nmrML.xsd" xmlns="http://nmrml.org/schema" version="1.0.0">
  <cvList count="2">
    <cv fullName="nmrML Controlled Vocabulary" version="0.0.1" id="NMRCV" URI="http://www.nmrml.org/nmrml-cv.0.0.1.owl"/>
    <cv fullName="Unit Ontology" version="3.2.0" id="UO" URI="http://unit-ontology.googlecode.com/svn/trunk/uo.owl/"/>
  </cvList>
  <contactList>
    <contact id="ID004" fullname="Lutger A. Wessjohann" email="Ludger.Wessjohann [a] ipb-halle.de"/>
    <contact id="ID044" fullname="Mohamed A. Farag" email="mfarag73 [a] yahoo.com"/>
  </contactList>
  <sourceFileList count="2">
    <sourceFile sha1="fd99c095046e2356c7d31154d45353fa79cbc844"
                location="file:///Users/mike/Projects/nmrML/nmrML/examples/IPB_HopExample/FIDs/FAM013_AHTM.PROTON_04.fid/procpar"
                id="SOURCE_FILE_0" name="procpar">
      <cvTerm cvRef="NMRCV" accession="NMR:1400297" name="Varian VNMR Format"/>
      <cvTerm cvRef="NMRCV" accession="NMR:1002006" name="acquisition parameter file"/>
    </sourceFile>
    <sourceFile sha1="e4ffeb41da28b1e9017e72819252ec6d78f8179f"
                location="file:///Users/mike/Projects/nmrML/nmrML/examples/IPB_HopExample/FIDs/FAM013_AHTM.PROTON_04.fid/fid"
                id="SOURCE_FILE_1" name="fid">
      <cvTerm cvRef="NMRCV" accession="NMR:1400297" name="Varian VNMR Format"/>
      <cvTerm cvRef="NMRCV" accession="NMR:1400119" name="FID file"/>
    </sourceFile>
  </sourceFileList>
  <softwareList count="1">
    <software cvRef="NMRCV" accession="NMR:1000277" name="VnmrJ software" version="2.2C" id="SOFTWARE_1"/>
  </softwareList>
  <instrumentConfigurationList count="4">
    <instrumentConfiguration id="INST_CONFIG_1">
      <cvTerm cvRef="NMRCV" accession="NMR:1400234" name="Varian NMR instrument"/>
      <cvTerm cvRef="NMRCV" accession="NMR:1000235" name="Varian probe"/>
      <cvTerm cvRef="NMRCV" accession="NMR:1400234" name="Varian NMR instrument"/>
      <cvTerm cvRef="NMRCV" accession="NMR:1000236" name="5mm HCN probe"/>
    </instrumentConfiguration>
  </instrumentConfigurationList>
  <acquisition>
    <acquisition1D>
      <acquisitionParameterSet numberOfScans="160" numberOfSteadyStateScans="0">
        <sampleAcquisitionTemperature unitName="kelvin" unitCvRef="UO" value="299.15" unitAccession="UO:0000012"/>
        <spinningRate unitName="hertz" unitCvRef="UO" value="0" unitAccession="UO:0000106"/>
        <relaxationDelay unitName="second" unitCvRef="UO" value="22.2737024" unitAccession="UO:0000010"/>
        <pulseSequence/>
        <DirectDimensionParameterSet numberOfDataPoints="65536" decoupled="false">
          <acquisitionNucleus cvRef="NMRCV" accession="NMR:1400151" name="1H"/>
          <gammaB1PulseFieldStrength unitName="hertz" unitCvRef="UO" value="34482.7586207" unitAccession="UO:0000106"/>
          <irradiationFrequency unitName="hertz" unitCvRef="UO" value="599.8311617" unitAccession="UO:0000106"/>
        </DirectDimensionParameterSet>
      </acquisitionParameterSet>
      <fidData byteFormat="Complex128" encodedLength="324160" compressed="true">eJwMl4dfzl8Ux7U3lYZKy0qiomQ […]</fidData>
    </acquisition1D>
  </acquisition>
</nmrML>
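Because nmrML is ordinary namespaced XML, acquisition parameters can be pulled out with a generic XML parser. The sketch below reads a few values from a shortened copy of the example. The element and attribute names are taken from the excerpt; the function name `acquisition_summary` is illustrative, and decoding `fidData` is left out because the excerpt does not state details such as byte order.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the nmrML root element in the example.
NS = {"n": "http://nmrml.org/schema"}

# Shortened copy of the example document.
snippet = """<nmrML xmlns="http://nmrml.org/schema" version="1.0.0">
  <acquisition>
    <acquisition1D>
      <acquisitionParameterSet numberOfScans="160" numberOfSteadyStateScans="0">
        <sampleAcquisitionTemperature unitName="kelvin" unitCvRef="UO" value="299.15" unitAccession="UO:0000012"/>
        <relaxationDelay unitName="second" unitCvRef="UO" value="22.2737024" unitAccession="UO:0000010"/>
        <DirectDimensionParameterSet numberOfDataPoints="65536" decoupled="false">
          <acquisitionNucleus cvRef="NMRCV" accession="NMR:1400151" name="1H"/>
        </DirectDimensionParameterSet>
      </acquisitionParameterSet>
    </acquisition1D>
  </acquisition>
</nmrML>"""


def acquisition_summary(xml_text):
    """Collect scan count, temperature (value, unit) and nucleus from an nmrML text."""
    root = ET.fromstring(xml_text)
    params = root.find(".//n:acquisitionParameterSet", NS)
    temperature = params.find("n:sampleAcquisitionTemperature", NS)
    nucleus = params.find(".//n:acquisitionNucleus", NS)
    return {
        "scans": int(params.get("numberOfScans")),
        "temperature": (float(temperature.get("value")), temperature.get("unitName")),
        "nucleus": nucleus.get("name"),
    }


summary = acquisition_summary(snippet)
print(summary)
```

Values carry their unit and ontology accession next to them, so a consumer does not have to guess whether a temperature is in kelvin or degrees Celsius.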

MetaboLights

The nmrML setup

We also deliver a content validator, which checks that a data file is syntactically well-formed and sufficiently complete, and that minimal information requirements such as the Core Information for Metabolomics Reporting (CIMR) are met.
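The layered checks described above (well-formed first, then completeness, then vocabulary use) can be sketched as follows. This is not the nmrML validator: the list of required elements and the set of known accessions are hypothetical stand-ins for the real schema, CV and reporting rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule set, standing in for the schema and the controlled vocabulary.
REQUIRED_ELEMENTS = ("cvList", "acquisition")
KNOWN_ACCESSIONS = {"NMR:1400151", "NMR:1400297"}


def local_name(tag):
    """Strip an XML namespace prefix of the form {uri} from a tag name."""
    return tag.rsplit("}", 1)[-1]


def validate(xml_text):
    """Return a list of problems; an empty list means all layers passed."""
    # Layer 1: is the document well-formed XML at all?
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return ["not well-formed: %s" % err]

    problems = []
    # Layer 2: are the required top-level sections present?
    present = {local_name(child.tag) for child in root}
    for name in REQUIRED_ELEMENTS:
        if name not in present:
            problems.append("missing element: " + name)

    # Layer 3: does every accession refer to a known vocabulary term?
    for element in root.iter():
        accession = element.get("accession")
        if accession is not None and accession not in KNOWN_ACCESSIONS:
            problems.append("unknown CV accession: " + accession)
    return problems


good = '<nmrML><cvList/><acquisition><n accession="NMR:1400151"/></acquisition></nmrML>'
bad = '<nmrML><cvList/><n accession="NMR:0"/></nmrML>'
print(validate(good))
print(validate(bad))
```

Stopping at the first layer when the XML cannot be parsed keeps the later, more specific messages from being buried under follow-on errors.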

Validators

Outlook Project resources

nmrML setup

• MetaboLights: http://www.ebi.ac.uk/metabolights/

• MSI: http://msi-workgroups.sourceforge.net/

• CIMR-MI: http://mibbi.sourceforge.net/projects/CIMR.shtml

[Figures: Validation layer onion; Validation webservice & result; Validation rules (html)]

Page 27: MS (and NMR) data standards in Metabolomics why, how and some caveats

My pleas for the future

. . . to the vendors:
Please start (or continue!) to support Open Data formats

. . . to the computational mass spec community:
Please use (and improve!) joint data I/O libraries

. . . to YOU (the users):
Please start (or continue!) to REQUEST open formats when inviting bids for a new instrument
