Semantic Technologies for Linked Open Data at the STLab

Aldo Gangemi*, Andrea Nuzzolese, Valentina Presutti*, Diego Reforgiato, Alberto Salvati*, Eva Blomqvist, Enrico Daga*, Francesco Draicchio, Paolo Ciancarini°, Sergio Consoli, Silvio Peroni°, Daria Spampinato*

CNR Semantic Technology Lab, ISTC-CNR, Rome/Catania

name.surname@istc.cnr.it; *name.surname@cnr.it; °surname@cs.unibo.it

http://stlab.istc.cnr.it http://wit.istc.cnr.it/stlab-tools

http://data.cnr.it

People

• STLab@ISTC

Aldo Gangemi, Valentina Presutti, Daria Spampinato, Andrea Nuzzolese, Diego Reforgiato, Stefania Capotosti, Sergio Consoli, Alessio Iabichella

• STLab@SI

Alberto Salvati, Gianluca Troiani

• STLab Associates

Paolo Ciancarini (UniBo)

Malvina Nissim (UniBo)

Massi Ciaramita (Google)

Alfio Gliozzo (IBM)

Eva Blomqvist (Un. Linköping)

Enrico Daga (Open University)

Alessandro Adamou (Open Un.)

Francesco Draicchio (UniBo)

Francesco Antinucci (CNR)

• STLab (Semantic Technology Lab) is a laboratory of the ISTC (Institute of Cognitive Sciences and Technologies) of CNR, with offices in Rome and Catania, and is also active in Bologna and Paris

Outline

• The Linked Open Data (LOD) of CNR

• The Semantic Scout

• Machine reading for the Semantic Web

• Knowledge pattern discovery and usage

A practical experience: data.cnr.it and the Semantic Scout

Joint work by STLab and the Information Systems unit of CNR

Thanks to Alberto Salvati, Enrico Daga, Gianluca Troiani, Andrea Pompili, Angelo Olivieri

Past collaboration with Claudio Baldassarre (UN-FAO) and Alfio Gliozzo (now IBM-Watson)

Linked Open Data in Public Administrations

Objectives and results

• Objectives

  • Publishing CNR data as LOD

  • Matching research demand to research supply at CNR, the largest research institution in Italy

• Results

  • data.cnr.it

    • The CNR ontology network and data available as LOD

    • Semantic interoperability between heterogeneous data sources

  • The Semantic Scout - http://bit.ly/semanticscout

    • Expert finding based on competence

    • Monitoring funding and evolution of different research areas and units

    • Browsing and reporting capabilities

data.cnr.it

Methods for data conversion, extraction, inference, integration, linking, publishing, and searching

Semantic Scout (http://bit.ly/semanticscout)

• Semantic search

• Browsing

• Relation explorer

• Exporting exploration results

• Automated reporting

Machine reading for the Semantic Web

Apache Stanbol

• A set of reusable components for semantic content management

• Extends traditional content management systems with semantic services accessible as HTTP REST services

• Stanbol is the main software result of the EU IP IKS

• Our contribution: the Knowledge Representation and Reasoning layer of Stanbol

  • Services used to define and manipulate semantic data models in CMS, i.e., the Ontology Network Manager component

  • Services able to retrieve additional semantic information about content, i.e., the Reasoners and Rules components
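
Since Stanbol components are exposed as HTTP REST services, a CMS can call them with any HTTP client. The following is a minimal sketch, assuming a Stanbol instance running at http://localhost:8080 (a hypothetical deployment address) and using the Enhancer endpoint (/enhancer), which accepts plain text via POST. The function names are illustrative only, and the endpoints of the Ontology Network Manager, Reasoners, and Rules components are not covered here.

```python
from urllib.request import Request, urlopen


def build_enhancer_request(text, base_url="http://localhost:8080", accept="text/turtle"):
    """Build (but do not send) a POST request for the Stanbol Enhancer.

    base_url is an assumption: adjust it to wherever Stanbol is deployed.
    """
    return Request(
        url=base_url.rstrip("/") + "/enhancer",
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": accept,  # RDF serialization requested for the enhancements
        },
        method="POST",
    )


def enhance(text, **kwargs):
    """Send the text to the Enhancer and return the RDF response as a string."""
    with urlopen(build_enhancer_request(text, **kwargs), timeout=30) as response:
        return response.read().decode("utf-8")
```

With a running instance, enhance("Franz Ferdinand arrived in Sarajevo") would return the extracted annotations in the requested RDF serialization; building the request separately makes it possible to inspect it without network access.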

Stanbol in a nutshell

NER and linking to LOD datasets

• The Black Hand might not have decided to barbarously assassinate Franz Ferdinand after he arrived in Sarajevo on June 28th, 1914

Annotation layers shown in the figure: qualities, modality, negation, type induction, WSD, taxonomy induction, semantic roles, NER, events, dates

The <span xmlns:dbo="http://dbpedia.org/ontology/" xmlns:dbr="http://dbpedia.org/resource/" about="dbr:Black_Hand_(Serbia)" typeof="dbo:Agent">Black Hand</span> might not have decided to barbarously assassinate <span xmlns:schemaorg="http://schema.org/" xmlns:dbr="http://dbpedia.org/resource/" about="dbr:Archduke_Franz_Ferdinand_of_Austria" typeof="schemaorg:Person">Franz Ferdinand</span> after he arrived in <span xmlns:schemaorg="http://schema.org/" xmlns:dbr="http://dbpedia.org/resource/" about="dbr:Sarajevo" typeof="schemaorg:City">Sarajevo</span> on June 28th, 1914

sample RDFa annotation
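
To illustrate how an annotation of this shape could be produced once entity mentions have been recognized and linked, here is a minimal sketch. It is not FRED's implementation: the function names and the PREFIXES table are assumptions made for the example, and the mentions (surface form, resource, type) are supplied by hand rather than by a NER or linking step.

```python
from html import escape

# Prefix table (an assumption of this sketch) for the CURIEs used in the annotations.
PREFIXES = {
    "dbo": "http://dbpedia.org/ontology/",
    "dbr": "http://dbpedia.org/resource/",
    "schemaorg": "http://schema.org/",
}


def rdfa_span(surface, about, typeof):
    """Wrap one mention in a span carrying RDFa 'about' and 'typeof' attributes."""
    used = sorted({about.split(":", 1)[0], typeof.split(":", 1)[0]})
    xmlns = " ".join(f'xmlns:{p}="{PREFIXES[p]}"' for p in used)
    return (
        f'<span {xmlns} about="{escape(about, quote=True)}" '
        f'typeof="{escape(typeof, quote=True)}">{escape(surface, quote=False)}</span>'
    )


def annotate(text, mentions):
    """Annotate text with RDFa spans.

    mentions: list of (surface, about, typeof) tuples, in order of appearance.
    Raises ValueError if a surface form is not found after the previous mention.
    """
    out, cursor = [], 0
    for surface, about, typeof in mentions:
        start = text.index(surface, cursor)
        out.append(escape(text[cursor:start], quote=False))
        out.append(rdfa_span(surface, about, typeof))
        cursor = start + len(surface)
    out.append(escape(text[cursor:], quote=False))
    return "".join(out)


print(annotate(
    "The Black Hand arrived in Sarajevo",
    [("Black Hand", "dbr:Black_Hand_(Serbia)", "dbo:Agent"),
     ("Sarajevo", "dbr:Sarajevo", "schemaorg:City")],
))
```

Keeping the namespace declarations on each span, as in the sample, makes every annotation self-contained at the cost of some repetition.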

FRED (RESTful service): http://wit.istc.cnr.it/stlab-tools/fred/

Further annotation layers shown in the figure: co-reference, tense

Tìpalo

• Motivation

  • It is difficult to automatically generate enterprise taxonomies from data available as plain documents

• Objective

  • To enable the automatic generation of taxonomies by exploiting the richness of natural language text

Typing DBpedia entities with Tìpalo

Components shown in the figure: typing, NER, taxonomy induction, WSD, alignment to WordNet supersenses, alignment to DOLCE

Example input: “Pakito is the alias of French electronic dance music artist Julien Ranouil” (cf. wikipedia.org)

Tìpalo (RESTful service): http://wit.istc.cnr.it/stlab-tools/tipalo/

Sentilo

• Sentilo is a new method of Sentic Computing, i.e., Semantic Sentiment Analysis, which is a new research area

• Motivations

  • Sentiment Analysis does not take into account semantic features when computing opinion scores

  • Semantics can provide valuable information to Sentiment Analysis methods

• Objectives

  • To provide Sentiment Analysis methods with semantic information

  • To identify opinions more easily by also using semantic information

Sentilo

Elements identified in the figure: topic, subtopics, opinion holder, opinions, sentiment scores

Example input: “Robert is happy because Silvio Berlusconi finally was condemned by judges”

Sentilo (RESTful service): http://wit.istc.cnr.it/stlab-tools/sentilo/

Knowledge pattern discovery

Bottom-up: schema extraction

Encyclopedic Knowledge Patterns

• 184 Encyclopedic Knowledge Patterns (EKPs) were discovered by identifying invariances in the structure of Wikipedia page links

• EKPs are represented as OWL2 ontologies

• They capture concepts that are typically used by Wikipedia users for describing things of a certain type
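
One way to read "invariances in the structure of page links" is: for a given subject type, find the object types that most of its instances link to. The sketch below follows that reading on toy data. It is an illustration under stated assumptions, not the authors' algorithm: the function name, the input format, the entity names, and the threshold value are all hypothetical.

```python
from collections import defaultdict


def discover_ekp(links, types, subject_type, threshold=0.75):
    """Return the object types frequently linked from entities of subject_type.

    links: iterable of (source_entity, target_entity) page links.
    types: dict mapping each entity to a single type.
    threshold: minimum share of subject entities that must link to an
               object type for it to be kept (hypothetical value).
    """
    subjects = {e for e, t in types.items() if t == subject_type}
    if not subjects:
        return {}
    # For each object type, collect the subject entities that link to it at least once.
    linking = defaultdict(set)
    for source, target in links:
        if source in subjects and target in types:
            linking[types[target]].add(source)
    popularity = {t: len(s) / len(subjects) for t, s in linking.items()}
    return {t: p for t, p in popularity.items() if p >= threshold}


# Toy data, invented for illustration only.
toy_types = {
    "politician_a": "OfficeHolder",
    "politician_b": "OfficeHolder",
    "country_x": "Country",
    "country_y": "Country",
    "university_z": "University",
}
toy_links = [
    ("politician_a", "country_x"),
    ("politician_b", "country_y"),
    ("politician_a", "university_z"),
    ("country_x", "politician_a"),
]
print(discover_ekp(toy_links, toy_types, "OfficeHolder"))
```

On this toy input only Country passes the threshold, since both office holders link to a country while only one links to a university.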

An EKP for OfficeHolder (http://ontologydesignpatterns.org/ekp/)

• Formal representation

• Access to data

• Textual grounding (from wikipedia.org)

Aemoo

• Aemoo exploits EKPs for

  • Entity summarisation and exploratory search

  • Distinguishing between core and peculiar knowledge

• The data sources are Wikipedia, DBpedia, Twitter, and Google News

• Aemoo is a KP-aware application

  • Benefits from KPs for addressing knowledge interaction tasks

  • Uses KPs as the basic unit of meaning for representing, exchanging, and reasoning with knowledge

Aemoo UI

http://aemoo.org

Conclusions

• We have provided a practical overview of how to build Linked Open Data

• We have provided case studies and scenarios for exploiting Linked Data

• We have shown Linked Data-compliant algorithms and tools

Thank you!
