NanoInformaTIX receives funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 814426
eNanoMapper data model
Dr. Nina Jeliazkova
Ideaconsult Ltd. Sofia, Bulgaria
https://www.ideaconsult.net/
Introduction
Nina Jeliazkova
• https://orcid.org/0000-0002-4322-6179
• Ideaconsult Ltd. Sofia, Bulgaria
• We are developing (mostly open source) tools for data management and modeling for
  • Chemical substances (safety, pharma, etc.)
    • FP7 OpenTox, FP7 CADASTER, FP7 ToxBank, H2020 FET ExCAPE, LIFE Concert REACH, AMBIT LRI toolbox (CEFIC), Toxtree, a number of industry projects
  • Nanomaterial safety (nanomaterials are chemical substances!)
    • FP7 eNanoMapper, H2020 NanoReg2, caLIBRAte, GRACIOUS, NanoInformaTIX, Gov4Nano, RiskGone, European Union Observatory for Nanomaterials (EUON)
• Will talk about the eNanoMapper data model
The eNanoMapper data model
• Has concepts and relationships
• Is not a taxonomy
  • But can use (multiple) existing taxonomies to annotate entities
• Is not an ontology
  • But can use (multiple) existing ontologies to annotate entities
  • The data entries based on the eNanoMapper model can be converted to different ontologies. Examples:
    • RDF serialization using BioAssay ontology classes and properties (relationships)
    • https://isa-tools.org/ data model, which itself has RDF serialization using several ontologies
• It could be converted to and from different data models and formats
  • Different representations are appropriate for UI, data capture from instruments, big data integration and analysis, modelling, AI, etc.
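To illustrate the point about converting one data entry into several representations, here is a minimal Python sketch. It is not the eNanoMapper serializer: the entry fields, the `https://example.org/` base URI and the `vocab/...` predicates are hypothetical placeholders, not BioAssay ontology or ISA terms.

```python
import json


def literal(value):
    """Quote a value as a plain N-Triples string literal."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(entry, base="https://example.org/"):
    """Serialise one data entry as N-Triples lines (hypothetical vocabulary)."""
    subject = f"<{base}measurement/{entry['id']}>"
    triples = [
        (f"<{base}vocab/material>", f"<{base}material/{entry['material']}>"),
        (f"<{base}vocab/endpoint>", literal(entry["endpoint"])),
        (f"<{base}vocab/value>", literal(entry["value"])),
    ]
    return "\n".join(f"{subject} {p} {o} ." for p, o in triples)


# Made-up entry, for illustration only.
entry = {"id": "m1", "material": "material-A", "endpoint": "particle size", "value": 20.0}

as_json = json.dumps(entry, sort_keys=True)  # representation for a UI or an API client
as_rdf = to_ntriples(entry)                  # representation for semantic integration
```

In a real conversion the placeholder predicates would be replaced by terms from whichever ontology is used for annotation.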
2. Application domain: data about chemical substances
• Yes, Excel files, plenty of them (the majority of NanoSafety Cluster data)
• OECD Harmonized Templates / IUCLID (mandatory for REACH dossiers)
• CODATA VAMAS Uniform Description System
• Don’t forget ISO standards, this is what industry uses
• BioAssay ontology
• https://isa-tools.org/
3. Intended purpose: Organising the nanosafety data
• Challenges
  • Diverse data sources
  • Diverse data input formats
  • Different data organization
  • Diverse modelling tools
• Approach:
  • Enable mappings!
  • i.e. eNanoMapper
• Physico-chemical identity: Different analytic techniques, manufacturing conditions, batch effects, mixtures, impurities, size distribution, differences in the amount of surface modification, etc.
• Biological identity: Wide variety of measurements, toxicity pathways, effects of ENM coronas, modes-of-action, interactions (cell lines, assays).
• Support for data analysis: Requires a “spreadsheet” or matrix view of data. The experimental data in the public datasets is usually not in a form appropriate for modelling (merging multiple values, conditions, similar experiments into matrix form is a challenge).
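The matrix view mentioned in the last bullet can be sketched as a pivot from one-row-per-value records into a material × endpoint table. This is a minimal illustration only: the record layout, the material and endpoint names, and the choice of averaging replicates are assumptions, and the values are made up.

```python
from collections import defaultdict
from statistics import mean


def to_matrix(records):
    """Pivot long-form records into (materials, endpoints, rows).

    Replicate values for the same material and endpoint are merged by their
    mean; combinations with no data become None.
    """
    cells = defaultdict(list)
    for r in records:
        cells[(r["material"], r["endpoint"])].append(r["value"])
    materials = sorted({m for m, _ in cells})
    endpoints = sorted({e for _, e in cells})
    rows = [
        [mean(cells[(m, e)]) if (m, e) in cells else None for e in endpoints]
        for m in materials
    ]
    return materials, endpoints, rows


# Made-up records, for illustration only.
records = [
    {"material": "material-A", "endpoint": "size_nm", "value": 18.0},
    {"material": "material-A", "endpoint": "size_nm", "value": 22.0},
    {"material": "material-A", "endpoint": "zeta_mV", "value": -10.0},
    {"material": "material-B", "endpoint": "size_nm", "value": 50.0},
]
materials, endpoints, rows = to_matrix(records)
```

Averaging is only one possible merge rule; in practice the experimental conditions attached to each value decide which values may be combined at all.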
4. How do you represent the world
• (as a continuum? as discrete particles? with quantum mechanics?)
  • Irrelevant question. Depends on the use case and data available
• Material representation
  • A material is represented as a REST resource.
  • A REST resource may have many different representations
  • A material is composed of components with a specified role (multiple serialisations)
  • A component is represented as e.g. a chemical structure, including but not limited to name, SMILES, connection table, crystal structure format, or any digital representation deemed important (multiple, requested by MIME type)
• Data about material
  • A measurement is the result of applying a (measurement) protocol to a material sample.
  • Simulation data is the result of applying an (in-silico) protocol to a digital representation of a material
  • Again a REST resource with multiple serializations (including semantic)
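The material representation above (components with roles, each available in several representations selected by MIME type) could be modelled roughly as follows. This is a sketch under stated assumptions, not the eNanoMapper implementation: the class names, the media type strings and the example material are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    role: str  # e.g. "core" or "coating"
    # Representations keyed by media type; the keys used below are illustrative.
    representations: dict = field(default_factory=dict)

    def representation(self, mime_type):
        """Return the representation for the requested media type."""
        try:
            return self.representations[mime_type]
        except KeyError:
            raise LookupError(f"no {mime_type} representation for {self.name}") from None


@dataclass
class Material:
    identifier: str
    components: list = field(default_factory=list)

    def by_role(self, role):
        """Return all components that play the given role."""
        return [c for c in self.components if c.role == role]


# Made-up material, for illustration only.
core = Component(
    name="silicon dioxide",
    role="core",
    representations={
        "text/plain": "silicon dioxide",
        "chemical/x-daylight-smiles": "O=[Si]=O",
    },
)
coating = Component(
    name="polyethylene glycol",
    role="coating",
    representations={"text/plain": "polyethylene glycol"},
)
material = Material(identifier="material-A", components=[core, coating])
```

Requesting a representation that a component does not have raises `LookupError`, which mirrors a REST service declining a media type it cannot produce.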
5/8. Concepts and relations
• Concepts:
  • Substance/Material
    • Nanomaterials are chemical substances
    • Composition, components, structure, structure properties
  • Measurements
    • Protocol applications (protocol, protocol parameters, factors)
    • Results (what is measured and what is the result)
• Relationships
  • Examples: material components (part of, role)
  • A measurement is the result of applying a protocol to a material.
  • The protocols have attributes (e.g. instrument, cell model)
  • The outcomes of a measurement are data values (numeric, scalar, vector, categorical, text, etc.).
  • The data entities can be related to each other (e.g. IC50 is defined by dose response). Measurements can be related to each other as well.
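The relationships above can be sketched as a protocol application that carries its parameters and results, with one result (IC50) derived from another (the dose response). The class and field names are hypothetical, the numbers are made up, and the linear interpolation is a deliberate simplification; real IC50 estimation usually fits a dose-response curve.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Result:
    endpoint: str                           # what is measured
    value: Union[float, str, List[float]]   # numeric, text or vector outcome
    unit: Optional[str] = None
    derived_from: Optional[str] = None      # endpoint this result was computed from


@dataclass
class ProtocolApplication:
    material_id: str
    protocol: str
    parameters: dict = field(default_factory=dict)  # e.g. instrument, cell model
    results: List[Result] = field(default_factory=list)


def interpolate_ic50(doses, viability_percent):
    """Dose at which viability crosses 50 %, by linear interpolation.

    Returns None when the curve never crosses 50 %.
    """
    points = list(zip(doses, viability_percent))
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        if v0 >= 50.0 >= v1 and v0 != v1:
            return d0 + (v0 - 50.0) * (d1 - d0) / (v0 - v1)
    return None


# Made-up dose-response data, for illustration only.
doses = [1.0, 10.0, 100.0]
viability = [90.0, 70.0, 30.0]

assay = ProtocolApplication(
    material_id="material-A",
    protocol="cell viability assay",
    parameters={"cell model": "hypothetical cell line", "exposure_h": 24},
    results=[Result("viability", viability, unit="%")],
)
assay.results.append(
    Result("IC50", interpolate_ic50(doses, viability), unit="ug/mL", derived_from="viability")
)
```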
6. Industrial use cases
• AMBIT LRI tool http://ambitlri.ideaconsult.net (REACH dossiers of chemical substances, same data model)
• EUON https://euon.echa.europa.eu/enanomapper
• Largest integration of nanosafety data https://search.data.enanomapper.net
7. Overlap with other taxonomies and ontologies
• We use (multiple!) existing taxonomies and ontologies to annotate data entries
• Different data models, standards (ISO, OECD HT, etc.), data integration, different tools, synonym search (via query expansion)
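Synonym search via query expansion can be sketched as below. The synonym table is a hand-written stand-in for labels that would come from the annotating ontologies, and the function names and example documents are hypothetical.

```python
# Hand-written stand-in for synonyms taken from ontology annotations.
SYNONYMS = {
    "titanium dioxide": {"tio2", "titania"},
}


def expand_query(term, synonyms=SYNONYMS):
    """Return the term together with all of its known synonyms, lower-cased."""
    term = term.strip().lower()
    for preferred, alternatives in synonyms.items():
        group = {preferred} | alternatives
        if term in group:
            return sorted(group)
    return [term]


def search(term, documents, synonyms=SYNONYMS):
    """Return documents that mention the term or any of its synonyms."""
    terms = expand_query(term, synonyms)
    return [d for d in documents if any(t in d.lower() for t in terms)]


# Made-up document titles, for illustration only.
documents = [
    "Titania nanoparticles, 21 nm",
    "Zinc oxide nanorods",
    "TiO2 (anatase)",
]
hits = search("titanium dioxide", documents)
```

Here a query for "titanium dioxide" also finds entries labelled "Titania" or "TiO2", although neither contains the query string itself.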
Representation
• 9. What is the knowledge your specific ontology represents?
  • Knowledge necessary for a pragmatic description of current practices. Flexible model based on integration of ideas from several approaches of representing data on chemical substances
• 10. How does your ontology represent the relations between different granularity views on the same object?
  • Different REST representations, if needed, denoted by MIME type
• 11. How does your ontology represent materials?
  • See previous slides
• 15. What is the representation language and implementation?
  • REST resources with multiple representations (JSON, JSON-LD, RDF, other domain specific formats)
Representation
• 12. What type of processes do you address? How does your ontology represent these processes?
  • Annotation with external ontologies/taxonomies
• 13. How does your ontology represent manufacturing?
  • Annotation with external ontologies/taxonomies
• 14. How does your ontology address the circular connection between physical properties, materials models (see definition in RoMM Review of Materials Modelling VI) and measurement?
  • We represent models and measurements as application of a protocol (computational or experimental) to a material. The protocol is defined as a procedure to assess a physical property of the material (or approximation of it).
14. Properties and measurements
• Property definitions
  • May differ, see table on the right
• Of interest in nanosafety:
  • Properties that can be measured
  • Properties relevant (e.g. for transport and fate)
• Examples
  • “On powders, He-pycnometry is the most appropriate method for powders (standardized, available, highly reproducible), and it measures the mass of the particle divided by its apparent volume = Apparent particle density = Skeletal density. This is relevant for transport and fate by aerosol and in suspension.”
  • “In contrast, true density = theoretical density is less relevant for nanosafety purposes, because the biological processes do not break up closed pores within the particles. Further, it requires sophisticated methods.”
  • More examples in the extra slides
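The two density definitions quoted above can be written out as follows. The symbols are assumptions introduced here for illustration: m is the particle mass, V_solid the volume of the solid material, and V_closed the volume of closed pores, so that the apparent volume seen by He-pycnometry is taken as V_solid + V_closed.

```latex
\rho_{\text{apparent}} = \rho_{\text{skeletal}}
  = \frac{m}{V_{\text{solid}} + V_{\text{closed}}},
\qquad
\rho_{\text{true}} = \rho_{\text{theoretical}}
  = \frac{m}{V_{\text{solid}}}
```

Under this reading the two values coincide only for particles without closed pores.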
Thank you!
Discussions
14. Properties and measurements. Shape
14. Properties and measurements. Aspect ratio
[Figure: aspect ratio, with panels labelled ISO and eNanoMapper ontology]
14. Properties and measurements. Feret diameter
[Figure: Feret diameter, with panels labelled ISO and eNanoMapper ontology]
Composition
[Figure: material composition, with components labelled Core and Coating]