SBML, SBML Packages, SED-ML, COMBINE Archive, and more
Michael Hucka, Ph.D. on behalf of many people
NIH MSM Satellite Meeting, Sep. 10, 2015, Bethesda, MD, USA
Email: [email protected] Twitter: @mhucka
Outline
SBML introduction
SBML Level 3 and Level 3 packages
Annotations
SED-ML
COMBINE Archive format
What is SBML, and why might you care?
ABC-SysBio CellNetAnalyzer Karyote* PaVESy SBW: Auto Layout acslXtreme CellNOpt KEGGconverter PAYAO sbw: javasim ALC Cellware KEGGtranslator PET sbw: stochastic simulator AMIGO CLEML Kineticon PhysioLab Modeler SCIpath Antimony CL-SBML Kinsolver PINT SED-ML Web Tools APMonitor COBRA libAnnotationSBML PK-Sim / MoBi semanticSBML Arcadia CompuCell3D libRoadRunner PNK SensSB Asmparts ConsensusPathDB libSBML PottersWheel SGMP Athena COPASI libSBMLSim PRISM Sigmoid* AutoSBW CRdata libStruct ProcessDB SIGNALIGN AVIS CycSim MASS Toolbox ProMoT SignaLink BALSA CySBML MatCont PROTON SigPath BASIS Cytoscape MathSBML pybrn SigTran BetaWB Cyto-Sim Medicel PyDSTool SIMBA Bifurcation Discovery Tool DBSolve MEMOSys PySB SimBiology BiGG DEDiscover MesoRD PySCeS Simpathica BiNoM Dizzy Meta-All RANGE SimPheny* BiNoM Cytoscape Plugin DOTcvpSB Metaboflux RAVEN Simulate3D Bio Sketch Pad E-CELL MetaCrop Reactome Simulation Core Library BioBayes ecellJ MetaFluxNet ReMatch Simulation Tool BIOCHAM EPE Metannogen RMBNToolbox SimWiz BioCharon ESS Metatool roadRunner SloppyCell BioCyc Facile MetExplore RSBML SmartCell BioGRID FAME MetNetMaker SABIO-RK Snoopy Biological Networks FASIMU MIRIAM Resources Saint SOSlib BioMet Toolbox FBASBW MMT2 SBFC SPDBS BioModels Database FERN modelMaGe SBML Harvester SRS BioModels Importer FluxBalance ModeRator SBML Layout STEPS BioNessie Fluxor Modesto SBML Reaction Finder StochKit BioNetGen Genetdes Moleculizer SBML Translators StochPy BioPARKIN Genetic Network Analyzer MonaLisa SBML2APM StochSim BioPathwise Gepasi Monod SBML2BioPax STOCKS BioPAX2SBML Gillespie2 MOOSE SBML2LaTeX SurreyFBA BioRica GINsim MuVal (Multi-valued logic) SBML2NEURON SyBiL BioSens GNAT Narrator SBML2Octave SYCAMORE BioSPICE Dashboard GNU MCSim nemo SBML2SMW SynBioSS BioSpreadsheet GRENDEL NetBuilder' SBML2TikZ Systrip BioSyS HSMB NetPath SBML2XPP TERANODE Suite BioTapestry HybridSBML NetPro SBMLEditor The Cell Collective BioUML iBioSim Odefy SBML-PET-MPI Tide BoolNet IBRENA Omix SBMLR 
TinkerCell braincirc Insilico Discovery ONDEX SBML-SAT Trelis BRENDA insilicoIDE optflux SBML-shorthand UTKornTools BSTLab iPathways Oscill8 SBMLSim VANTED ByoDyn JACOBIAN PANTHER Pathway SBMLsqueezer Vcell CADLIVE Jacobian Viewer PathArt sbmltidy WebCell Cain Jarnac Pathway Access SBMLToolbox WinSCAMP CARMEN JarnacLite Pathway Analyser SBMM assistant Wolfram SystemModeler Cell Illustrator JDesigner Pathway Builder SBO xCellerator CellDesigner JigCell Pathway Solver SBSI Xholon Cellerator JSBML Pathway Tools SBToolbox2 XPPAUT CellMC JSim PathwayLab sbtranslate CellML2SBML JWS Online PATIKAweb SBW
Many software tools for modeling and simulation are available
https://www.behance.net/gallery/d/7465033
Research often involves the use of more than one tool
Need flexible way to exchange results between tools (and researchers)
Format for representing models of biological processes
• Data structures + principles + serialization to XML
• (Mostly) Declarative, not procedural—not a scripting language
(Mostly) neutral with respect to modeling framework
• E.g., ODE, stochastic systems, etc.
Does not store experimental data, or simulation descriptions
• But software may write their own metadata (annotations) in SBML
For software to read/write, not humans
SBML = Systems Biology Markup Language
SBML is a file format based on XML
Don’t work with it directly! Let software do it.
The process is central
• Literally called “reaction” (not necessarily biochemical)
• Participants are pools of entities of the same kind (“species”)
• Species are located in containers (“compartments”)
- Core SBML assumes well-mixed compartments (but see Level 3)
Models can further include:
• Discontinuous events
• Explicitly-written math
Core SBML concepts are fairly simple
• Unit definitions
• Annotations
• Other constants & variables
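The core constructs listed above (compartments, species, reactions) map directly onto XML elements. The following is a minimal sketch, assuming a toy SBML Level 3 Version 1 document whose ids (`toy`, `cell`, `S1`, `S2`, `r1`) are invented for illustration. It uses only Python's standard library to show the structure; real applications should rely on libSBML or JSBML instead of parsing the XML by hand.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 Core.
SBML_NS = "http://www.sbml.org/sbml/level3/version1/core"

# Toy model; all ids and values are made up for this example.
TOY_SBML = f"""
<sbml xmlns="{SBML_NS}" level="3" version="1">
  <model id="toy">
    <listOfCompartments>
      <compartment id="cell" size="1e-15" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="S1" compartment="cell" initialAmount="1000"
               hasOnlySubstanceUnits="true" boundaryCondition="false" constant="false"/>
      <species id="S2" compartment="cell" initialAmount="0"
               hasOnlySubstanceUnits="true" boundaryCondition="false" constant="false"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="r1" reversible="false" fast="false">
        <listOfReactants>
          <speciesReference species="S1" stoichiometry="2" constant="true"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="S2" stoichiometry="1" constant="true"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

def summarize(sbml_text):
    """Return (compartment ids, {species id: compartment}, {reaction id: (reactants, products)})."""
    ns = {"s": SBML_NS}
    model = ET.fromstring(sbml_text).find("s:model", ns)
    compartments = [c.get("id") for c in model.findall("s:listOfCompartments/s:compartment", ns)]
    species = {sp.get("id"): sp.get("compartment")
               for sp in model.findall("s:listOfSpecies/s:species", ns)}
    reactions = {}
    for reaction in model.findall("s:listOfReactions/s:reaction", ns):
        def participants(list_name):
            # Each participant is a (species id, stoichiometry) pair.
            path = f"s:{list_name}/s:speciesReference"
            return [(ref.get("species"), float(ref.get("stoichiometry", "1")))
                    for ref in reaction.findall(path, ns)]
        reactions[reaction.get("id")] = (participants("listOfReactants"),
                                         participants("listOfProducts"))
    return compartments, species, reactions

compartments, species, reactions = summarize(TOY_SBML)
```

Here every species points at its compartment by id, and every reaction lists its participants as references to species, which is the "pools of entities in containers" picture from the slide.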
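$$n_{a1} A + n_{b1} B \xrightarrow{f_1(\ldots)} n_{c1} C$$
$$n_{a2} A + n_{d2} D \xrightarrow{f_2(\ldots)} n_{e1} E$$
$$n_{c3} C \xrightarrow{f_3(\ldots)} n_{f3} F + n_{g3} G$$
$$\vdots$$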
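A reaction scheme like the one above, with stoichiometries and rate functions, is enough for software to derive rates of change: each species changes by (product stoichiometry minus reactant stoichiometry) times the reaction rate, summed over reactions. A minimal sketch of that bookkeeping follows; the mass-action rate laws, the constants 0.5 and 0.1, and the state values are assumptions chosen for illustration, not part of the slides.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]
# A reaction is (reactant stoichiometries, product stoichiometries, rate function).
Reaction = Tuple[Dict[str, float], Dict[str, float], Callable[[State], float]]

def derivatives(reactions: List[Reaction], state: State) -> State:
    """Compute d[species]/dt for every species in `state`."""
    dxdt = {name: 0.0 for name in state}
    for reactants, products, rate in reactions:
        v = rate(state)  # value of f_j(...) at the current state
        for name, n in reactants.items():
            dxdt[name] -= n * v
        for name, n in products.items():
            dxdt[name] += n * v
    return dxdt

# Hypothetical instance: A + B -> C and C -> F + G, with mass-action rates.
reactions = [
    ({"A": 1, "B": 1}, {"C": 1}, lambda x: 0.5 * x["A"] * x["B"]),
    ({"C": 1}, {"F": 1, "G": 1}, lambda x: 0.1 * x["C"]),
]
state = {"A": 2.0, "B": 3.0, "C": 10.0, "F": 0.0, "G": 0.0}
rates = derivatives(reactions, state)
```

The same reaction list could instead drive a stochastic simulator, which is one sense in which the format stays (mostly) neutral with respect to modeling framework.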
Core SBML constructs support many types of models
Example of model type: example model
Typical ODE models (e.g., cell differentiation): BioModels Database model #BIOMD0000000451
Conductance-based models (e.g., Hodgkin-Huxley): BioModels Database model #BIOMD0000000020
• Typically do not use the SBML “reaction” construct, but instead use the “rate rules” construct
Neural models (e.g., spiking neurons): BioModels Database model #BIOMD0000000127
• Typically use “events” for discontinuous changes
Pharmacokinetic/dynamics models: BioModels Database model #BIOMD0000000234
• “Species” are not required to be biochemical entities
Infectious diseases: BioModels Database model #MODEL1008060001
List originally by Nicolas Le Novère
Many examples of SBML and software resources are available
Accepted by dozens of journals *
Hundreds of software tools available today
• 280+ listed in SBML Software Guide †
Thousands of models available
• ... in public databases, e.g., BioModels Database, Reactome
• ... as supplementary data to papers
• ... in private repositories
* http://sbml.org/Documents/Publications_known_to_accept_submissions_in_SBML_format
† http://sbml.org/SBML_Software_Guide
http://sbml.org
libSBML
• Written in portable C++
- Linux, Mac, Windows
• APIs for C, C++, C#, Java, JavaScript, MATLAB, Octave, Perl, PHP, Python, Ruby
• Reads, writes, validates SBML
• Many other features: e.g., unit checking & conversion
JSBML
• Pure Java
• API very similar to libSBML, but more Java-ish
• Reads, writes, manipulates SBML
• Additional Java-relevant APIs such as listeners
API libraries for supporting SBML
Both are free, open-source under LGPL
http://sbml.org/Software/libSBML http://sbml.org/Software/JSBML
Related effort: CellML
Feature | SBML | CellML
Overall scope | Core: algebraic, ODE, DAE, DDE, stochastic. Level 3: rule-based, constraint-based, qualitative, PDE. | Algebraic, ODE, DAE.
Built-in constructs | Numerous; constructs add predefined semantics | Few; models use components & pure math
Math features | Subset of “content” MathML | Full “content” MathML
Events | Yes | No
Time delays | Yes | No
Units | Yes | Yes
Compartments | Yes | Yes, via nested components
Annotations | 1) Subset of RDF 2) Explicit SBO term attributes | Full RDF
User-defined functions | Yes | In principle, via MathML lambda statements
Modularity | Yes (in Level 3); model structure-oriented | Yes; math-oriented
Extensible syntax | Yes (in Level 3) | No
SBML Level 3 and Level 3 packages
Level 3 packages add constructs on top of SBML Level 3 Core
Level 3 package | What it supports | Status
Hierarchical model composition | Models composed from other models/parts | ✔
Flux balance constraints | Constraint-based (a.k.a. steady-state) models | ✔
Qualitative models | Petri net, Boolean, and similar model types | ✔
Graph layout | Storing layouts of network diagrams | ✔
Arrays | Arrays of components | 🕒
Distributions | Statistical distributions of values | 🕒
Groups | Grouping elements for conceptual purposes | 🕒
Multicomponent/state species | Rule-based descriptions of entities with features | 🕒
Spatial | Nonhomogeneous spatial models | 🕒
Dynamic structures | Creation/destruction of entities during simulation | 🕒
Graph rendering | Storing graphical symbols used in Layout diagrams | 🕒
Annotations | Richer annotation syntax | !
Implementations being tested
SBML Level 3 package: Hierarchical Model Composition (“comp”)
Defines syntax for composing models from other models (or fragments)
Developed by Lucian Smith, Mike Hucka, Stefan Hoops, Chris Myers, Andrew Finney, Martin Ginkel, Ion Moraru, Wolfram Liebermeister
[Figure: in core SBML, Model “A” holds its species, compartments, parameters, and reactions directly; with hierarchical model composition, Model “A” additionally contains Models “B” and “C”, each with its own species, compartments, parameters, and reactions]
The ‘comp’ package supports multiple arrangements
[Figure: Models “A”, “B”, “C”, and “D”, each holding species, compartments, parameters, and reactions, with some models referenced from inside others]
Separate files (possibly in databases)
Substitutions and deletions of entities can be defined
Results can be “flattened” to plain SBML Level 3 Core
Allows tools to read L3 + ‘comp’ models as if they were just plain L3
Algorithm is implemented in libSBML
[Figure: the original model in SBML Level 3 Core + SBML ‘comp’ (Models “A”, “B”, “C”, and “D”) is flattened into a single Model “A” in plain SBML Level 3 Core]
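To make the idea of flattening concrete, here is a deliberately simplified sketch. It is not the algorithm implemented in libSBML: models are plain dictionaries, only species ids are handled, and the `submodelId__` prefixing convention, the `replacements` and `deletions` keys, and all model content are assumptions made for illustration. A real implementation must also rewrite every reference (in math, reactions, rules) to the renamed or replaced elements.

```python
from typing import Dict

def flatten(model: dict, prefix: str = "") -> Dict[str, float]:
    """Merge a model and its nested submodels into one {species id: value} mapping."""
    # Elements defined directly in this model keep their ids (plus any inherited prefix).
    flat = {prefix + sid: value for sid, value in model.get("species", {}).items()}
    # Elements of each submodel are renamed with the submodel id as a prefix.
    for sub_id, submodel in model.get("submodels", {}).items():
        flat.update(flatten(submodel, prefix + sub_id + "__"))
    # A replaced element disappears; the replacing element in this model stands in for it.
    for replaced in model.get("replacements", {}):
        flat.pop(prefix + replaced, None)
    # Deleted elements are dropped outright.
    for deleted in model.get("deletions", []):
        flat.pop(prefix + deleted, None)
    return flat

# Hypothetical hierarchy: A contains B and C, and C contains D.
model_a = {
    "species": {"S1": 1.0},
    "submodels": {
        "B": {"species": {"S1": 2.0, "X": 0.0}},
        "C": {"species": {"Y": 5.0},
              "submodels": {"D": {"species": {"Z": 7.0}}}},
    },
    "replacements": {"B__S1": "S1"},  # A's S1 replaces B's S1
    "deletions": ["B__X"],            # B's X is removed
}
flat_model = flatten(model_a)
```

The result is a single flat namespace, which is what lets a tool without ‘comp’ support treat the output as plain Level 3 Core.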
Define syntax for constraint-based (e.g., flux-balance analysis) models
• E.g. problem: optimize a specific property subject to constraints on reaction fluxes and other parameters
Developed by Brett Olivier and Frank Bergmann, with considerable community discussion and feedback for Version 2.
• Version 2 is essentially final
Implemented in libSBML, JSBML; supported in CBMPy, FAME, SBW; converters available to/from COBRA Toolbox
SBML Level 3 package: Flux Balance Constraints (“fbc”)
http://sbml.org/Documents/Specifications/SBML_Level_3/Packages/fbc
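For readers unfamiliar with the problem class, the standard textbook form of flux balance analysis can be written as a linear program. The symbols below are assumptions introduced here, not taken from the slides: $S$ is the stoichiometric matrix, $v$ the vector of reaction fluxes, $c$ the objective coefficients, and $v^{\mathrm{lb}}$, $v^{\mathrm{ub}}$ the lower and upper flux bounds.

```latex
\begin{aligned}
\text{maximize}\quad   & c^{\mathsf{T}} v \\
\text{subject to}\quad & S\,v = 0, \\
                       & v^{\mathrm{lb}}_j \le v_j \le v^{\mathrm{ub}}_j
                         \quad \text{for each reaction } j
\end{aligned}
```

Roughly speaking, the package gives a model a standard place to state the objective and the flux bounds, while the stoichiometry comes from the reactions already present in the core model.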
SBML Level 3 package: Multistate & Multicomponent species (“multi”)
Core SBML lacks support for structured entities and pattern rules
• Different states of molecular entities must be different entities/species
SBML Level 3 effort for “multi” aims to add support for structures & patterns
• First proposals were by Finney, Blinov, Faeder, Hlavacek, Le Novère
• Revived by F. Zhang from Simmune group (Meier-Schellersheim et al.)
• Aspects of new effort: species types, binding sites, complexes, rules
http://sbml.org/Documents/Specifications/SBML_Level_3/Packages/multi
SBML Level 3 package: Spatial Processes (“spatial”)
Main components:
• Definition of coordinate systems
• Definition of patches of spatial geometries, called domains
• Mapping of SBML compartments, species, & parameters to domains
• Definition of molecular transport mechanisms (advection, diffusion, boundary conditions)
• Mapping of molecular transport mechanisms to domains
Developed mostly by Jim Schaff & Anu Lakshminarayana (VCell), with recent input and involvement from Devin Sullivan (U. Pittsburgh)
• Beta implementation for libSBML available today
http://sbml.org/Documents/Specifications/SBML_Level_3/Packages/spatial
Virtual Cell
Draft implementations available in several tools already
SBML Level 3 Spatial package draft specification
COPASI
MCell and CellBlender
Many people contributed to the development of SBML
SBML & JSBML Team: Mike Hucka, Sarah Keating, Frank Bergmann, Lucian Smith, Andrew Finney, Herbert Sauro, Hamid Bolouri, Ben Bornstein, Maria Schilstra, Jo Matthews, Bruce Shapiro, Linda Taddeo, Akira Funahashi, Akiya Juraku, Ben Kovitz, Nicolas Rodriguez, Andreas Dräger, Alex Thomas
SBML Editors: Mike Hucka, Frank Bergmann, Andreas Dräger, Sarah Keating, Nicolas Le Novère, Chris Myers, Lucian Smith, Stefan Hoops, Sven Sahle, James Schaff, Dagmar Waltemath, Darren Wilkinson, Brett Olivier
SBML Package authors (so far):
Duncan Berenguier, Frank Bergmann, Claudine Chaouiya, Andreas Dräger, Andrew Finney, Ralph Gauges, Colin Gillespie, Martin Ginkel, Tomás Helikar, Stefan Hoops, Sarah Keating, Anu Lakshminarayana, Nicolas Le Novère, Wolfram Liebermeister, Martin Meier-Schellersheim, Stuart Moodie, Ion Moraru, Chris Myers, Aurélien Naldi, Brett Olivier, Ursula Rost, Lucian Smith, Sven Sahle, James Schaff, Devin P. Sullivan, Denis Thieffry, Martijn P. van Iersel, Leandro Watanabe, Katja Wengler, Darren Wilkinson, Maciej Swat (EBI), Fengkai Zhang
SBML funding sources over the past 15 years
• National Institute of General Medical Sciences (USA)
• Air Force Office of Scientific Research (USA)
• BBSRC (UK)
• Beckman Institute, Caltech (USA)
• DARPA IPTO Bio-SPICE Bio-Computation Program (USA)
• Drug Disease Model Resources (EU-EFPIA Innovative Medicines Initiative)
• ELIXIR (UK)
• European Molecular Biology Laboratory (EMBL)
• Google Summer of Code
• International Joint Research Program of NEDO (Japan)
• Japanese Ministry of Agriculture
• Japanese Ministry of Education, Culture, Sports, Science and Technology
• JST ERATO Kitano Symbiotic Systems Project (Japan) (to 2003)
• JST ERATO-SORST Program (Japan)
• Keio University (Japan)
• Molecular Sciences Institute (USA)
• National Science Foundation (USA)
• STRI, University of Hertfordshire (UK)
Annotations
Structured formats provide syntax and only limited semantics
No standard identifiers
Low info content
Modelers are free to choose names
Problems:
• No universal agreement
• Software can’t recognize arbitrary names
Need standard schemes for machine-readable annotations
• Entity identities
• Mathematical semantics
• Links to other data resources
• Authorship & pub. info
Element in the model
Entity elsewhere (e.g., in a database)
relationship qualifier (optional)
Annotations at their simplest
SBML supports two annotation schemes
SBO (Systems Biology Ontology)
• For mathematical semantics
• One SBML object ← one SBO term
• Short, compact, tightly coupled but limited scope
MIRIAM (Minimum Information Requested In the Annotation of Models)
• For any kind of annotation
• One SBML object ← multiple MIRIAM annotations
• Larger, more free-form, wider scope
Both are externalized and independent of SBML
<sbml ...>
  ...
  <listOfCompartments>
    <compartment id="cell" size="1e-15" />
  </listOfCompartments>
  <listOfSpecies>
    <species compartment="cell" id="S1" initialAmount="1000" />
    <species compartment="cell" id="S2" initialAmount="0" />
  </listOfSpecies>
  <listOfParameters>
    <parameter id="k" value="0.005" sboTerm="SBO:0000339" />
  </listOfParameters>
  <listOfReactions>
    <reaction id="r1" reversible="false">
      <listOfReactants>
        <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000010" />
      </listOfReactants>
      <listOfProducts>
        <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000011" />
      </listOfProducts>
      <kineticLaw sboTerm="SBO:0000052">
        <math> ... </math>
        ...
</sbml>
SBO:0000339 = “forward bimolecular rate constant, continuous case”
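Because SBO terms are ordinary attributes, software can collect them without understanding the rest of the model. The sketch below is a minimal illustration using Python's standard library on a cut-down, namespace-free fragment modeled on the example above; the fragment and the function name `sbo_terms` are assumptions for this example, and a real tool would read the terms through libSBML or JSBML.

```python
import xml.etree.ElementTree as ET

# Cut-down fragment modeled on the slide's example (no XML namespace, for brevity).
FRAGMENT = """
<model>
  <listOfParameters>
    <parameter id="k" value="0.005" sboTerm="SBO:0000339"/>
  </listOfParameters>
  <listOfReactions>
    <reaction id="r1" reversible="false">
      <listOfReactants>
        <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000010"/>
      </listOfReactants>
      <kineticLaw sboTerm="SBO:0000052"/>
    </reaction>
  </listOfReactions>
</model>
"""

def sbo_terms(xml_text):
    """Return (element name, id or species reference, SBO term) in document order."""
    found = []
    for element in ET.fromstring(xml_text).iter():
        term = element.get("sboTerm")
        if term is not None:
            label = element.get("id") or element.get("species") or ""
            found.append((element.tag, label, term))
    return found

terms = sbo_terms(FRAGMENT)
```

A tool could then look each term up in SBO to learn, for example, that parameter `k` is a forward bimolecular rate constant regardless of what the modeler chose to call it.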
MIRIAM (Minimum Information Requested In the Annotation of Models)
Addresses 2 general areas of annotation needs:
• Requirements for reference correspondence
• Scheme for encoding annotations
- Annotations for attributing model creators & sources
- Annotations for referring to external data resources
MIRIAM is not specific to SBML
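In SBML files, annotations that refer to external data resources are commonly written as RDF inside an `<annotation>` element, pairing a model element with a relationship qualifier and a resource URI. The sketch below assumes a hand-written annotation fragment (the `#meta_r1` identifier and the choice of the `isVersionOf` qualifier are illustrative) and extracts those triples with Python's standard library.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL = "http://biomodels.net/biology-qualifiers/"

# Illustrative annotation; the metaid "#meta_r1" and the qualifier are assumptions.
ANNOTATION = f"""
<annotation>
  <rdf:RDF xmlns:rdf="{RDF}" xmlns:bqbiol="{BQBIOL}">
    <rdf:Description rdf:about="#meta_r1">
      <bqbiol:isVersionOf>
        <rdf:Bag>
          <rdf:li rdf:resource="http://identifiers.org/ec-code/1.1.1.1"/>
        </rdf:Bag>
      </bqbiol:isVersionOf>
    </rdf:Description>
  </rdf:RDF>
</annotation>
"""

def external_references(annotation_xml):
    """Return (model element, relationship qualifier, resource URI) triples."""
    triples = []
    root = ET.fromstring(annotation_xml)
    for description in root.iter(f"{{{RDF}}}Description"):
        about = description.get(f"{{{RDF}}}about")
        for qualifier in description:
            # Strip the "{namespace}" part of the tag to get the qualifier name.
            name = qualifier.tag.split("}", 1)[1]
            for item in qualifier.iter(f"{{{RDF}}}li"):
                triples.append((about, name, item.get(f"{{{RDF}}}resource")))
    return triples

references = external_references(ANNOTATION)
```

Each triple corresponds to the simple picture shown earlier: an element in the model, an optional relationship qualifier, and an entity elsewhere.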
Why might you care?
http://www.ebi.ac.uk/chebi
Low info content
Known by different names – do you want to write all of them into your model?
salicylic acid
BioModels Database: example of using the annotations
Resolving resource identifiers
MIRIAM Registry supports the creation of globally unique identifiers
• Example MIRIAM identifier: urn:miriam:ec-code:1.1.1.1
• Provides various data about the resource, including alternate servers
• Provides web services
identifiers.org is layered on top of that and provides resolvable URIs
• Can type it in a web browser!
• Example identifiers.org URI: http://identifiers.org/ec-code/1.1.1.1
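The two example identifiers above differ only in form, so the conversion can be sketched in a few lines. This is a minimal sketch under a stated assumption: that the data collection name is the same in both schemes, which holds for the `ec-code` example shown here but is not guaranteed for every collection. The function name is hypothetical; the authoritative mapping comes from the MIRIAM Registry.

```python
from urllib.parse import unquote

def miriam_urn_to_identifiers_uri(urn: str) -> str:
    """Convert 'urn:miriam:<collection>:<id>' to an identifiers.org-style URI.

    Assumes the collection name is identical in both schemes.
    """
    parts = urn.split(":", 3)
    if len(parts) != 4 or parts[0] != "urn" or parts[1] != "miriam":
        raise ValueError(f"not a MIRIAM URN: {urn!r}")
    collection = parts[2]
    identifier = unquote(parts[3])  # identifiers inside URNs may be percent-encoded
    return f"http://identifiers.org/{collection}/{identifier}"

uri = miriam_urn_to_identifiers_uri("urn:miriam:ec-code:1.1.1.1")
```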
Summary: why care about standard ways of writing annotations?
Structured, machine-readable annotations increase your model’s utility
• Allow more precise identification of model components
- Understand model structure
- Search/discover models
- Compare models
• Add a semantic layer that integrates knowledge into the model
- Helps recipients understand the underlying biology
- Allows for better reuse of models
- Supports conversion of models from one form to another
SED-ML – Simulation Experiment Description Markup Language
Another problem: software can’t read figure legends
?
BIOMD0000000319 in BioModels Database
Decroly & Goldbeter, PNAS, 1982
SED-ML = Simulation Experiment Description ML
Application-independent format
• Captures procedures, algorithms, parameter values
Can be used for
• Simulation experiments encoding parametrizations & perturbations
• Simulations using more than one model and/or method
• Data manipulations to produce plot(s)
http://sedml.org
[Figure: main parts of a SED-ML description: Model, Simulation, Task, Data generators, Reports]
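The parts named above refer to one another by identifier: a task pairs a model with a simulation, a data generator draws on a task, and a report lists data generators. The sketch below models just that cross-referencing with plain Python data classes. It is not the SED-ML schema; the class, field names, and example values are hypothetical and chosen only to show how the pieces fit together.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class SimulationExperiment:
    models: Dict[str, str] = field(default_factory=dict)                # model id -> source file
    simulations: Dict[str, Dict[str, float]] = field(default_factory=dict)  # simulation id -> settings
    tasks: Dict[str, Tuple[str, str]] = field(default_factory=dict)     # task id -> (model id, simulation id)
    data_generators: Dict[str, Tuple[str, str]] = field(default_factory=dict)  # generator id -> (task id, variable)
    reports: Dict[str, List[str]] = field(default_factory=dict)         # report id -> generator ids

    def dangling_references(self) -> List[str]:
        """List every reference that points at an id that is not defined."""
        problems = []
        for task_id, (model_id, simulation_id) in self.tasks.items():
            if model_id not in self.models:
                problems.append(f"task {task_id}: unknown model {model_id}")
            if simulation_id not in self.simulations:
                problems.append(f"task {task_id}: unknown simulation {simulation_id}")
        for generator_id, (task_id, _variable) in self.data_generators.items():
            if task_id not in self.tasks:
                problems.append(f"data generator {generator_id}: unknown task {task_id}")
        for report_id, generator_ids in self.reports.items():
            for generator_id in generator_ids:
                if generator_id not in self.data_generators:
                    problems.append(f"report {report_id}: unknown data generator {generator_id}")
        return problems

experiment = SimulationExperiment(
    models={"model1": "model.xml"},
    simulations={"sim1": {"start": 0.0, "end": 100.0, "points": 1000}},
    tasks={"task1": ("model1", "sim1")},
    data_generators={"time": ("task1", "time"), "S1": ("task1", "S1")},
    reports={"report1": ["time", "S1"]},
)
problems = experiment.dangling_references()
```

Keeping models and simulations separate from tasks is what allows one description to cover simulations that use more than one model and/or method.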
COMBINE Archive
The problem
Multiple files usually comprise a single simulation experiment
• Model(s) file(s), possibly in multiple formats
• Simulation set-up (e.g., in SED-ML format)
• Parameter settings data files
• Diagrams (e.g., in SBGN format)
• Other files…
All the files need to be communicated together
• Opportunity to lose or mix up files during exchange & sharing
Open Modeling EXchange format (OMEX)
COMBINE Archive format = single file that supports exchange of all information necessary for any modeling and simulation experiment
• Not SBML-specific at all
• Not programming-language specific
• Not domain specific
OMEX = file format for COMBINE Archive
• ZIP file containing manifest file (in XML form) + other files
• Use of ZIP leverages many existing programming libraries
http://co.mbine.org/documents/archive
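Since the format is a ZIP file with an XML manifest, an archive can be sketched with standard libraries alone. The example below is a minimal sketch, not a complete implementation of the specification: the manifest namespace and format URIs follow the COMBINE Archive conventions as best understood here and should be checked against the specification, and the file names and contents are placeholders.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
FORMATS = "http://identifiers.org/combine.specifications/"

def build_archive(files):
    """Build archive bytes from {location: (format URI, payload bytes)}."""
    entries = [f'  <content location="." format="{FORMATS}omex"/>']
    for location, (format_uri, _payload) in files.items():
        entries.append(f'  <content location="./{location}" format="{format_uri}"/>')
    manifest = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<omexManifest xmlns="{OMEX_NS}">\n' + "\n".join(entries) + "\n</omexManifest>\n"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("manifest.xml", manifest)  # the manifest describes every entry
        for location, (_format_uri, payload) in files.items():
            archive.writestr(location, payload)
    return buffer.getvalue()

def list_contents(archive_bytes):
    """Return {location: format URI} as declared in the archive's manifest."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        root = ET.fromstring(archive.read("manifest.xml"))
    return {c.get("location"): c.get("format") for c in root.iter(f"{{{OMEX_NS}}}content")}

# Placeholder payloads; a real archive would hold complete SBML and SED-ML files.
omex_bytes = build_archive({
    "model.xml": (FORMATS + "sbml", b"<sbml/>"),
    "experiment.sedml": (FORMATS + "sed-ml", b"<sedML/>"),
})
contents = list_contents(omex_bytes)
```

Because the container is plain ZIP, any language with a ZIP library can read or write such archives, which is the point made on the slide.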
Acknowledgments
Huge thanks to everyone in the COMBINE community
Attendees of COMBINE 2013, Paris, France
Attendees of COMBINE 2014, Los Angeles, California, USA
URLs
BioModels Database http://biomodels.net/biomodels
CellML http://cellml.org
COMBINE Archive http://co.mbine.org/documents/archive
identifiers.org http://identifiers.org
MIRIAM http://biomodels.net/miriam
MIRIAM Registry http://www.ebi.ac.uk/miriam/main/
SBML http://sbml.org
SED-ML http://biomodels.net/sed-ml
SBO http://biomodels.net/sbo