Creating a new language to support open innovation
Michael Hucka, Ph.D.
Department of Computing + Mathematical Sciences
California Institute of Technology
Pasadena, CA, USA
BioBriefing – BioMelbourne Network, Australia, August 2013
Email: [email protected]
Twitter: @mhucka
Presentation given on 19 August 2013 at a BioBriefings meeting of the BioMelbourne Network (http://www.biomelbourne.org/events/view/289) in Melbourne, Australia.
Package                        Description                                          Status
Qualitative models             Petri net models, Boolean models                     ✔
Graph layout                   Diagrams of models                                   ✔
Multicomponent/state species   Entities w/ structure; also rule-based models        draft
Spatial                        Nonhomogeneous spatial models                        draft
Graph rendering                Diagrams of models                                   draft
Groups                         Arbitrary grouping of components                     draft
Distributions                  Numerical values as statistical distributions        in dev
Arrays & sets                  Arrays or sets of entities                           in dev
Dynamic structures             Creation & destruction of components                 in dev
Annotations                    Richer annotation syntax
• National Institute of General Medical Sciences (USA)
• European Molecular Biology Laboratory (EMBL)
• JST ERATO Kitano Symbiotic Systems Project (Japan) (to 2003)
• JST ERATO-SORST Program (Japan)
• ELIXIR (UK)
• Beckman Institute, Caltech (USA)
• Keio University (Japan)
• International Joint Research Program of NEDO (Japan)
• Japanese Ministry of Agriculture
• Japanese Ministry of Educ., Culture, Sports, Science and Tech.
• BBSRC (UK)
• National Science Foundation (USA)
• DARPA IPTO Bio-SPICE Bio-Computation Program (USA)
• Air Force Office of Scientific Research (USA)
• STRI, University of Hertfordshire (UK)
• Molecular Sciences Institute (USA)
SBML funding sources over the past 13+ years
Outline
Background and introduction
The Systems Biology Markup Language (SBML)
Complementary efforts: MIRIAM and SED-ML
COMBINE: the Computational Modeling in Biology Network
Conclusion
COMBINE efforts cover different facets of modeling
[Figure labels. Model representation level: mathematical semantics, biological semantics, visual interpretation. Model type: discrete stochastic entities, continuous lumped parameter, state transition, mean field approximation. Model life-cycle: model creation, model annotation, model analysis, numerical results. Concept due to Nicolas Le Novère.]
Raw models alone are insufficient
Need standard schemes for machine-readable annotations
• Identify entities
• Mathematical semantics
• Links to other data resources
• Authorship & pub. info
Modelers want to use their own conventions
Low info content
No standard identifiers
MIRIAM (Minimum Information Requested In the Annotation of Models)
Addresses 2 general areas of annotation needs:
• Requirements for reference correspondence
• Scheme for encoding annotations
- Annotations for attributing model creators & sources
- Annotations for referring to external data resources
MIRIAM is not specific to SBML
Example of a problem that can be solved with annotations
http://www.ebi.ac.uk/chebi
Low info content
Known by different names – do you want to write all of them into your model?
salicylic acid
MIRIAM annotations for external references
Goal: link model constituents to corresponding entities in bioinformatics resources (e.g., databases, controlled vocabularies)
• Supports:
- Precise identification of model constituents
- Discovery of models that concern the same thing
- Comparison of model constituents between different models
MIRIAM approach avoids putting data content directly in the model
• Instead, it points at external resources that contain the data
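To illustrate how pointing at external resources supports discovery and comparison, here is a minimal Python sketch. It is not part of MIRIAM or any SBML library: the two "models" are hypothetical tables mapping each model's own identifiers to resource URIs, and the function name `shared_entities` is invented for this example. The ChEBI identifiers and the identifiers.org URL layout are used for illustration and should be checked against the resources themselves.

```python
# Illustrative resource URIs (assumed; verify against ChEBI before relying on them).
SALICYLIC_ACID = "http://identifiers.org/chebi/CHEBI:16914"
ATP = "http://identifiers.org/chebi/CHEBI:15422"
WATER = "http://identifiers.org/chebi/CHEBI:15377"

# Hypothetical models: each maps the modeler's own identifier to annotation URIs.
# The two modelers use different local names for the same chemical.
MODEL_A = {"SA": {SALICYLIC_ACID}, "ATP": {ATP}}
MODEL_B = {"salicylate_pool": {SALICYLIC_ACID}, "h2o": {WATER}}


def invert(model):
    """Turn {local_id: {uri, ...}} into {uri: {local_id, ...}}."""
    index = {}
    for local_id, uris in model.items():
        for uri in uris:
            index.setdefault(uri, set()).add(local_id)
    return index


def shared_entities(model_a, model_b):
    """Return {uri: (ids_in_a, ids_in_b)} for every URI annotated in both models."""
    index_a, index_b = invert(model_a), invert(model_b)
    common = index_a.keys() & index_b.keys()
    return {uri: (index_a[uri], index_b[uri]) for uri in common}


if __name__ == "__main__":
    for uri, (ids_a, ids_b) in shared_entities(MODEL_A, MODEL_B).items():
        print(uri, sorted(ids_a), sorted(ids_b))
```

The comparison never looks at the local names, only at the URIs, which is why the modelers remain free to use their own conventions.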
How do we create globally unique identifiers consistently?
Long story short: developed by the Le Novère group at the EBI
• Resource identifiers (URIs) combine 2 parts:
- namespace: identifies a dataset
- entity identifier: identifies a datum within the dataset
• There’s a registry for namespaces: MIRIAM Registry
- Allows people & software to use same namespace identifiers
• There’s a URI resolution service: MIRIAM Resources & identifiers.org
- Allows people & software to take a given identifier and figure out what it points to
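The two-part structure of a resource identifier can be sketched in a few lines of Python. This is an illustration only, not the MIRIAM Registry or identifiers.org software: the helper names `make_uri` and `split_uri` are invented here, and the URL layout `http://identifiers.org/<namespace>/<entity identifier>` and the namespace strings `chebi` and `biomodels.db` are assumptions that should be checked against the registry.

```python
from urllib.parse import urlsplit

# Assumed base URL of the resolution service.
BASE = "http://identifiers.org"


def make_uri(namespace, entity_id):
    """Combine a namespace (the dataset) and an entity identifier (the datum)."""
    if not namespace or "/" in namespace:
        raise ValueError(f"invalid namespace: {namespace!r}")
    if not entity_id:
        raise ValueError("entity identifier must not be empty")
    return f"{BASE}/{namespace}/{entity_id}"


def split_uri(uri):
    """Recover (namespace, entity_id) from a URI built by make_uri."""
    path = urlsplit(uri).path.lstrip("/")
    namespace, sep, entity_id = path.partition("/")
    if not sep or not namespace or not entity_id:
        raise ValueError(f"not a namespace/identifier URI: {uri!r}")
    return namespace, entity_id


if __name__ == "__main__":
    # Namespace strings below are assumptions made for this example.
    uri = make_uri("biomodels.db", "BIOMD0000000319")
    print(uri)
    print(split_uri(uri))
```

Because the namespace comes from a shared registry, two tools that build a URI for the same datum end up with the same string, which is what makes the identifiers comparable across models.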
Another problem: software can’t read figure legends
BIOMD0000000319 in BioModels Database
Decroly & Goldbeter, PNAS, 1982
SED-ML = Simulation Experiment Description ML
Application-independent format
Some things we (maybe?) got right with SBML
• Makes people comfortable knowing it will always be available
Be creative about seeking funding
Some things we certainly got wrong
Not waiting for implementations before freezing specifications
• Sometimes finalized specification before implementations tested it
- Especially bad when we failed to do a good job
‣ E.g., “forward thinking” features, or “elegant” designs
Not formalizing the development process sufficiently
• Especially early in the history, did not have a very open process
Not resolving intellectual property issues from the beginning
• Industrial users ask “who has the right to give any rights to this?”
This work was made possible thanks to a great community
Original PIs: John C. Doyle, Hiroaki Kitano
SBML Team: Mike Hucka, Sarah Keating, Frank Bergmann, Lucian Smith, Andrew Finney, Herbert Sauro, Hamid Bolouri, Ben Bornstein, Bruce Shapiro, Akira Funahashi, Akiya Juraku, Ben Kovitz
SBML Editors: Mike Hucka, Nicolas Le Novère, Sarah Keating, Frank Bergmann, Lucian Smith, Chris Myers, Stefan Hoops, Sven Sahle, James Schaff, Darren Wilkinson
BioModels DB: Nicolas Le Novère, Henning Hermjakob, Camille Laibe, Chen Li, Lukas Endler, Nico Rodriguez, Marco Donizelli, Viji Chelliah, Mélanie Courtot, Harish Dharuri
And a huge thanks to many others in the COMBINE community
[Photo: Attendees at SBML 10th Anniversary Symposium, Edinburgh, 2010]