Eric Little, PhD
VP Data Science
Demystifying Semantics: Practical Utilization of Semantic
Technologies for Real World Applications
Heiner Oberkampf, PhD
Senior Semantic Eng.
Slide 2
HOW WE APPROACH TECHNOLOGY
WE connect data, people and
organizations
Integration at many levels
Technology development pertains to
more than 0’s and 1’s
Slide 3
MOVING TO SMART DATA
Smart data can be added to existing
systems
Does not require replacement of existing
tech
Smart data provides a separation of:
Model Layer
Data Layer
Link to the model layer
Leave data in place
Smart data links information from the
models to instance-level data
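As a rough illustration of the model/data separation on this slide, the sketch below leaves instance records where they are and adds only a thin set of links to a model layer. All names (MODEL, DATA, LINKS, describe) and the sample records are hypothetical and are not taken from any particular product.

```python
# Model layer: classes and their meaning, kept separate from the data.
MODEL = {
    "Compound": {"label": "Chemical compound"},
    "Assay": {"label": "Laboratory assay"},
}

# Data layer: existing records, left in place and unchanged.
DATA = {
    "row-17": {"name": "Ibuprofen", "table": "substances"},
    "row-42": {"name": "Solubility test", "table": "experiments"},
}

# Links from instance-level data to the model layer.
LINKS = {"row-17": "Compound", "row-42": "Assay"}


def describe(record_id):
    """Return a copy of a record enriched with its model class and label."""
    record = DATA[record_id]
    model_class = LINKS[record_id]
    return {
        **record,
        "type": model_class,
        "type_label": MODEL[model_class]["label"],
    }


print(describe("row-17"))
```

The source records are never modified; the model can be extended or replaced without touching them.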
Slide 4
DRIVING BUSINESS VALUE WITH SEMANTICS
Understand what the business drivers are for
your organization
Helps determine value of a given solution
Benefits of semantics:
Integrated data is more valuable
Understanding your data makes it easier to search and
share results
Important patterns in your data drive new analytics
Private data can be linked to public sources
Improved context makes data more meaningful over
time
Slide 5
USE CASES FOR SEMANTICS
Current customers are using semantics for
the following kinds of applications
Data Integration Projects
Taxonomies
Linked Data
Analytics
Reference Master Data
Long-term Storage
Slide 6
THE SPECTRUM OF SEMANTIC SYSTEMS
Slide 7
THE SPECTRUM OF SEMANTIC SYSTEMS
CODE LISTS
Example: Airport codes
Code Name
ATL Atlanta
MIA Miami
JFK John F Kennedy
LGA LaGuardia
HOU Hobby Airport
IAH George Bush Intercontinental Airport
MCO Orlando International Airport
CGN Cologne Bonn Airport
… …
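In code, a code list is little more than a flat lookup table. The sketch below uses the entries shown on the slide; the function name airport_name is an assumption made for illustration.

```python
# Code list from the slide: a flat mapping from code to name, with no
# hierarchy and no relationships between the entries.
AIRPORT_CODES = {
    "ATL": "Atlanta",
    "MIA": "Miami",
    "JFK": "John F Kennedy",
    "LGA": "LaGuardia",
    "HOU": "Hobby Airport",
    "IAH": "George Bush Intercontinental Airport",
    "MCO": "Orlando International Airport",
    "CGN": "Cologne Bonn Airport",
}


def airport_name(code):
    """Return the name for a code, or None if the code is not listed."""
    return AIRPORT_CODES.get(code.strip().upper())


print(airport_name("cgn"))  # prints: Cologne Bonn Airport
```

Nothing in this structure says how two entries relate to each other, which is what the richer systems further along the spectrum add.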
Slide 8
THE SPECTRUM OF SEMANTIC SYSTEMS
INFORMAL HIERARCHY – ORGANIZE BY GROUPING
Definition:
An informal hierarchy (or weak taxonomy) defines an informal
parent/child relationship between entities in order to group them,
without ensuring consistent semantics for the relationship or a
consistent category or type for the elements.
Example: Regulatory Toxicology Data
[Figure: grouping labels shown include Form of exposure, Animal, Test Method]
Slide 9
THE SPECTRUM OF SEMANTIC SYSTEMS
THESAURUS
Definition:
A thesaurus is based on concepts and shows relationships among terms.
[Figure: thesaurus example diagram (Ibuprofen).
Terms shown: Ibuprofen; IP-82; Ibuprofen, Copper (2+) Salt; Calcium Salt Ibuprofen; Ibuprofen, Sodium Salt; Ibuprofen-Zinc; Aluminum Salt Ibuprofen; Ibuprofen Zinc; drug; Phenylpropionate; pain killer; Acetaminophen; pain; Motrin; “Schmerzmittel”@de.
Relationship labels shown: synonym, broader (three links), related, narrower, label.]
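A thesaurus can be sketched as terms connected by a small set of relationship types (synonym, broader, narrower, related). The example below is a minimal illustration only: the Thesaurus class is hypothetical, and the specific term-to-term assignments are assumptions, because the slide's diagram does not survive in this text.

```python
from collections import defaultdict

# Inverse of each relation type; "related" is symmetric.
INVERSE = {"broader": "narrower", "narrower": "broader", "related": "related"}


class Thesaurus:
    def __init__(self):
        self._links = defaultdict(set)     # (term, relation) -> linked terms
        self._synonyms = defaultdict(set)  # preferred term -> synonyms

    def add_synonym(self, term, synonym):
        self._synonyms[term].add(synonym)

    def relate(self, term, relation, other):
        """Record a link between two terms together with its inverse."""
        self._links[(term, relation)].add(other)
        self._links[(other, INVERSE[relation])].add(term)

    def linked(self, term, relation):
        return sorted(self._links.get((term, relation), set()))

    def synonyms(self, term):
        return sorted(self._synonyms.get(term, set()))


thesaurus = Thesaurus()
thesaurus.add_synonym("Ibuprofen", "IP-82")
thesaurus.add_synonym("Ibuprofen", "Ibuprofen, Sodium Salt")
# Assumed assignments, for illustration only:
thesaurus.relate("Ibuprofen", "broader", "pain killer")
thesaurus.relate("pain killer", "broader", "drug")
thesaurus.relate("Ibuprofen", "related", "Acetaminophen")

print(thesaurus.linked("pain killer", "narrower"))  # prints: ['Ibuprofen']
```

The links only say that one term is broader than another. They make no claim that every instance of the narrower term is an instance of the broader one.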
Slide 10
THE SPECTRUM OF SEMANTIC SYSTEMS
TAXONOMY
Definition:
A taxonomy is a formal generalization-specialization (subclass or
is-a) hierarchy. It allows inference along the class hierarchy.
Example: International Classification of Diseases v 10
Source: http://bioportal.bioontology.org/ontologies/ICD10/
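"Inference along the class hierarchy" means that membership in a class implies membership in all of its ancestors. The sketch below shows this with three ICD-10 entries; the dictionary layout and the function names ancestors and is_a are assumptions made for illustration, not part of ICD-10 or BioPortal.

```python
# Direct subclass (is-a) links: child class -> parent class.
SUBCLASS_OF = {
    "A00 Cholera": "A00-A09 Intestinal infectious diseases",
    "A00-A09 Intestinal infectious diseases":
        "A00-B99 Certain infectious and parasitic diseases",
}


def ancestors(cls):
    """Walk up the hierarchy and collect every superclass of cls."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result


def is_a(cls, candidate):
    """True if cls is the candidate class or one of its subclasses."""
    return cls == candidate or candidate in ancestors(cls)


print(is_a("A00 Cholera", "A00-B99 Certain infectious and parasitic diseases"))
# prints: True
```

The inference is sound only because every link is a true subclass relation: every case of cholera is an intestinal infectious disease.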
Slide 11
THE SPECTRUM OF SEMANTIC SYSTEMS
QUESTION: TAXONOMY OR THESAURUS?
MeSH terms
How is the term “Thumb”
categorized here?
Examine the relationships
THESAURUS
TAXONOMY
Slide 12
THE SPECTRUM OF SEMANTIC SYSTEMS
ANSWER: THESAURUS (NOT TAXONOMY)
MeSH Thesaurus
“MeSH hierarchical links are not subclass relations. If you
interpret them as such you get strange inferences such as
‘Every thumb is a hand’. This would do injustice to MeSH,
which is a great resource, which fulfils its goals without
subscribing to OWL semantics.”
Slide 13
THE SPECTRUM OF SEMANTIC SYSTEMS
CONCEPTUAL MODEL
Definition:
A conceptual model formally distinguishes between classes and
instances. It allows properties to be defined for classes and
instances, together with the corresponding inheritance.
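The class/instance distinction and property inheritance can be illustrated with ordinary object-oriented code. This is only an analogy under stated assumptions: the classes Drug and PainKiller and their properties are hypothetical examples, not part of any model shown in the talk.

```python
class Drug:
    # Class-level property: applies to the class and is inherited by subclasses.
    requires_approval = True

    def __init__(self, name, dose_mg):
        # Instance-level properties: differ from one instance to the next.
        self.name = name
        self.dose_mg = dose_mg


class PainKiller(Drug):
    # Property defined on the subclass only.
    treats = "Pain"


# An instance of the subclass.
tablet = PainKiller("Ibuprofen", 400)

print(tablet.requires_approval)  # inherited from Drug, prints: True
print(tablet.treats)             # defined on PainKiller, prints: Pain
```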
Slide 14
THE SPECTRUM OF SEMANTIC SYSTEMS
ONTOLOGY MODEL
Definition: An ontology is a model that provides a formal
description of entities, their attributes and all sorts of
relationships that can hold between them.
Examples:
Hepatitis := Infection AND (hasLocation SOME Liver)
Pain Killer := Drug AND (treats SOME Pain)
DisjointClasses(Acetaminophen, Ibuprofen)
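Definitions of this kind let software classify individuals automatically. The sketch below imitates the first two examples in plain Python. It is a deliberate simplification and not an OWL reasoner: the data layout, the sample individuals and the function classify are all assumptions, and each property value is recorded directly as the class of the related thing.

```python
# Asserted facts about individuals: their types and, per property,
# the classes of the things they are related to.
INDIVIDUALS = {
    "case1": {"types": {"Infection"}, "hasLocation": {"Liver"}},
    "case2": {"types": {"Infection"}, "hasLocation": {"Lung"}},
    "drugX": {"types": {"Drug"}, "treats": {"Pain"}},
}

# Defined classes as (base class, property, filler class), read as:
# base AND (property SOME filler).
DEFINITIONS = {
    "Hepatitis": ("Infection", "hasLocation", "Liver"),
    "Pain Killer": ("Drug", "treats", "Pain"),
}


def classify(name):
    """Return the defined classes whose conditions the individual meets."""
    facts = INDIVIDUALS[name]
    inferred = set()
    for cls, (base, prop, filler) in DEFINITIONS.items():
        if base in facts["types"] and filler in facts.get(prop, set()):
            inferred.add(cls)
    return inferred


print(classify("case1"))  # prints: {'Hepatitis'}
```

A real reasoner would also handle subclass hierarchies, disjointness axioms such as the third example, and open-world semantics, none of which this sketch attempts.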
Slide 15
W3C TECHNOLOGY STACK
STANDARDS ARE KEY
Source: Artificial Intelligence and the Semantic Web: AAAI2006 Keynote. 2006.
URL: http://www.w3.org/2006/Talks/0718-aaai-tbl/Overview.html
Notional standards-driven semantic “stack” for
implementing the Semantic Web
Slide 16
STORAGE AND ACCESS: TRIPLE STORES
Slide 17
WHAT WE FIND WITH VARIOUS
TRIPLE STORES & GRAPH DBS
Many have the same basic functionality
Storage of triples
Performance can vary
Tuning of the DB is often necessary
Some scale higher than others based on
reported testing
Some can better integrate with RDBs
Native querying of non-triples
Some can run analytics internally
Graphical analytics
Reasoning/inferencing capabilities
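The basic functionality shared by these systems, storing triples and matching patterns against them, can be sketched in a few lines. The TripleStore class below is a toy in-memory illustration with hypothetical names and data. It does not reflect the API of any actual triple store, and it ignores indexing, persistence and SPARQL.

```python
class TripleStore:
    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self.triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
store.add("ex:Ibuprofen", "rdf:type", "ex:Drug")
store.add("ex:Ibuprofen", "ex:treats", "ex:Pain")
store.add("ex:Acetaminophen", "ex:treats", "ex:Pain")

# Which subjects treat pain?
for s, _, _ in store.match(predicate="ex:treats", obj="ex:Pain"):
    print(s)
```

Products differ mainly in what surrounds this core: scale, tuning, integration with relational databases, and built-in analytics or reasoning.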
Slide 18
DIFFERENT TYPES OF DBS USED
FOR SEMANTICS
Slide 19
ENTERPRISE APPLICATIONS OFTEN
REQUIRE HYBRID ARCHITECTURES
Slide 20
INDUSTRY USE CASES
Slide 21
R&D APPLICATIONS
Semantic applications can be effectively
used to integrate existing data sources
Mappings are created between semantic
models and other data sources (RDBs, Excel
sheets, etc.)
Offers effective linkage to the external
world
Standards allow for common vocabularies
Several customers are seeing benefits by
exploiting external data points
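A mapping between a semantic model and a tabular source can be as simple as a table from column names to predicates. The sketch below turns rows (as they might come from an RDB query or an Excel export) into triples. The column names, the ex: prefixes and the function rows_to_triples are hypothetical and chosen only for illustration.

```python
# Rows as they might come from a relational table or a spreadsheet.
ROWS = [
    {"id": 1, "compound": "Ibuprofen", "assay": "Solubility"},
    {"id": 2, "compound": "Acetaminophen", "assay": None},
]

# Mapping from source columns to predicates in the semantic model.
MAPPING = {"compound": "ex:hasCompound", "assay": "ex:testedIn"}


def rows_to_triples(rows, mapping, subject_prefix="ex:sample/"):
    """Create one triple per mapped, non-empty cell."""
    triples = []
    for row in rows:
        subject = f"{subject_prefix}{row['id']}"
        for column, predicate in mapping.items():
            value = row.get(column)
            if value is not None:
                triples.append((subject, predicate, value))
    return triples


for triple in rows_to_triples(ROWS, MAPPING):
    print(triple)
```

Because the mapping is data rather than code, another source can be integrated by adding a mapping instead of changing the source system.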
Slide 22
MANUFACTURING EXAMPLE
Machine lines produce large data sets
Materials – raw goods, mixing instructions
(on-site / off-site), batch quality, etc.
Process – transformation of product,
temperature, coating, packaging, etc.
Move QA/QC to near-real time
Integration of different data sources – know
everything about a product when finished
Machines are becoming more self-aware and
communicative
Sensors & IoT
Yield vs. Defects
Can help track trends in good vs. bad products
Use classifications as named graphs
Can be linked to other heuristics - data
science & trending
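One way to read "classifications as named graphs" is that each quality class gets its own graph, and a batch's triples are stored in the graph for its class. The sketch below follows that reading. The graph names, batch identifiers and measurements are invented for illustration, and the functions are hypothetical.

```python
from collections import defaultdict

# Named graphs: graph name -> set of (subject, predicate, object) triples.
graphs = defaultdict(set)


def add_to_graph(graph, subject, predicate, obj):
    graphs[graph].add((subject, predicate, obj))


def subjects_in(graph):
    """Return the distinct subjects (here: batches) stored in a graph."""
    return {s for s, _, _ in graphs.get(graph, set())}


def yield_rate():
    """Share of batches classified as good among all classified batches."""
    good = len(subjects_in("graph:good"))
    bad = len(subjects_in("graph:defect"))
    total = good + bad
    return good / total if total else 0.0


add_to_graph("graph:good", "batch-001", "coatingTemp", 182)
add_to_graph("graph:good", "batch-002", "coatingTemp", 185)
add_to_graph("graph:defect", "batch-003", "coatingTemp", 214)

print(f"yield: {yield_rate():.2f}")
```

Keeping each class in its own graph lets the same query run against good and defective batches separately, which makes the two groups easy to compare over time.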
Slide 23
REGULATORY AFFAIRS
Unstructured data combined with structured
data
Documents can be linked to your internal
DBs
Entity extraction and tagging techniques
Legal entities and countries captured
Different rules exist for each country
Laws change over time – requires constant
monitoring
Patents
Patterns can be found that possess
significant business value for customers
Analytics for patterns of interest
Can be used internally or to monitor
competitors in the market
Customer analysis (sentiment, trending, etc.)
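The entity extraction and tagging step mentioned on this slide can be approximated, in its simplest form, by matching documents against a list of known names. The sketch below is a dictionary-based illustration only. The gazetteer entries (including the company name), the labels and the function tag_entities are hypothetical, and production systems typically use more robust NLP techniques.

```python
import re

# Known entity names and their types (hypothetical sample entries).
GAZETTEER = {
    "Germany": "Country",
    "France": "Country",
    "Example Pharma GmbH": "LegalEntity",
}


def tag_entities(text):
    """Return (position, term, label) for each known term found in text."""
    tags = []
    for term, label in GAZETTEER.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text):
            tags.append((match.start(), term, label))
    return sorted(tags)


doc = "Example Pharma GmbH filed a dossier in Germany."
for position, term, label in tag_entities(doc):
    print(position, term, label)
```

Tags like these are what allow an unstructured document to be linked to the legal entities and countries already held in internal databases.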
Q&A Session
Connecting data, people and organisations