Nuxeo World Session: Semantic Technologies - Update on Recent Research

Post on 28-Jan-2015

Category:

Technology

DESCRIPTION

Presentation from Nuxeo World 2010 (November 17-18, 2010).

Transcript

Nov. 17 2010 - S. Fermigier & O. Grisel, Nuxeo

Towards semantic ECM: report on the IKS and Scribo projects

Monday, November 22, 2010

Outline

• Introduction to semantic technologies

• Collaborative R&D within the Scribo and IKS projects

• Fise & Apache Stanbol / Nuxeo Integration

1. Introduction to semantic technologies

Illustration source: Mills Davis, “Semantic Social Computing”, Sept. 2007

Invented the web in 1989 (yeah!)

Invented the semantic web in 1999 (duh?)

Photo source: http://www.flickr.com/photos/pixelydixel/

Historical perspective

• From web 1.0: web of pages, aka the World Wide Web

• To web 2.0: web of people and of participation, aka the Social Web

• To web 3.0: web of data, of meaning and of connected knowledge, aka the Semantic Web

Picture source: http://www.flickr.com/photos/pixelydixel/

A “layer cake” of technologies

“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”

Linked Open Data in 2007

2008

2009

2010

Good for Enterprise apps too!

Diagram source: http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/

Key Enablers

• Open Data and Linked Open Data

• Advances in automatic content analysis (linguistics, image processing)

• Computing power (Moore’s law + MapReduce)

• Classical logic and classical AI

The technologies and data are available: let’s put them to use!

Semantic ECM

• Content: Text, Image, Sound, Video

• Meaning: Metadata, Relations, Entities, Tags, Reasoning

Goals for Semantic ECM (& Nuxeo)

• Repurpose existing content

• Improve search and collaboration

• Make information contextual

• Extract and use information from your content

• Make your content smarter!

Challenges

• Extract meaning from content

• Enrich content with knowledge

• Enhance interaction with content thanks to added meaning

Business value from semantic ECM

• Efficiency gains: 20% to 90% (e.g. in search, collaboration)

• Effectiveness gains: better returns from your assets (e.g. news and images from AFP)

• Strategic edge: growth, value capture, new services, an unfair strategic advantage (e.g. vertical ontologies for CEVAs / CCAs)

2. SCRIBO and IKS

Scribo

• Project under the French FUI program, with 9 partners and a budget of 4.7 M€

• Goal: to develop algorithms and collaborative tools for extracting knowledge from unstructured documents and images

• Started in 2008, finishing in Dec. 2010, with results already integrated as a Nuxeo plugin

IKS

• European project under FP7, with 13 partners (6 SMEs) and an 8.5 M€ budget

• Goal: create a semantic software “stack” that will be used by CMS vendors to add semantic features to their products

• Started in Jan. 2009, will last until Dec. 2012

• First tangible result: FISE, already integrated in a Nuxeo plugin

3. Linking Semantic Entities

Apache Stanbol - Nuxeo integration

What are entities?

What is wrong with tags?

• Many terms for the same meaning

• NYC, New York, New York City

• Many meanings for the same term

• Need context to remove any ambiguity

Washington is...

Tagging with Entities

• Global namespace / universal meaning context

• Interoperability across domains

• Interoperability across applications
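
The contrast between free-text tags and entity identifiers can be sketched in a few lines of Python. This is only an illustration, not how FISE or Stanbol links entities: the alias table, the cue words, and the `link_tag` helper are all hypothetical, and the DBpedia URIs serve purely as examples of a global namespace.

```python
from typing import Optional

# Hypothetical alias table: several surface forms, one entity URI.
ALIASES = {
    "nyc": "http://dbpedia.org/resource/New_York_City",
    "new york": "http://dbpedia.org/resource/New_York_City",
    "new york city": "http://dbpedia.org/resource/New_York_City",
}

# Hypothetical candidates for an ambiguous tag, each with a few cue words
# that a surrounding context might contain.
CANDIDATES = {
    "washington": {
        "http://dbpedia.org/resource/George_Washington": {"president", "general"},
        "http://dbpedia.org/resource/Washington,_D.C.": {"capital", "congress"},
        "http://dbpedia.org/resource/Washington_(state)": {"seattle", "state"},
    },
}


def link_tag(tag: str, context: str = "") -> Optional[str]:
    """Map a free-text tag to an entity URI, using context words if ambiguous."""
    key = tag.strip().lower()
    if key in ALIASES:
        return ALIASES[key]
    candidates = CANDIDATES.get(key)
    if not candidates:
        return None
    words = set(context.lower().split())
    best_uri, best_score = None, 0
    for uri, cues in candidates.items():
        score = len(cues & words)  # count cue words present in the context
        if score > best_score:
            best_uri, best_score = uri, score
    return best_uri  # None when the context gives no hint
```

With this toy table, `link_tag("NYC")` and `link_tag("New York City")` resolve to the same URI, while `link_tag("Washington")` stays unresolved until some context is supplied.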

Demo time!

Screencast online at http://blogs.nuxeo.com/dev

How does this work?

• Open Source Semantic Engine

• HTTP Services

• For content driven applications

• OSGi: loosely coupled components

• Analysis Engines

• Knowledge RDF vocabularies

What is a semantic engine?

• Unstructured content => Knowledge

• Language guessing

• Topic classification (Business, Sports, Media, ...)

• Named Entities extraction and linking

• Relationships and properties extraction

RESTful is Beautiful

curl -X POST \
     -H "Accept: application/json" \
     -H "Content-type: text/plain" \
     --data "John Smith works at Smith Consulting in Paris." \
     http://fise.demo.nuxeo.com/engines

{
  "urn:enhancement-1564680b-861c-df6f-fdf9-d34a75d68dfe": {
    "http://fise.iks-project.eu/ontology/selected-text": [
      {
        "datatype": "http://www.w3.org/2001/XMLSchema#string",
        "type": "literal",
        "value": "Paris"
      }
    ],
    "http://fise.iks-project.eu/ontology/selection-context": [
      {
        "datatype": "http://www.w3.org/2001/XMLSchema#string",
        "type": "literal",
        "value": "John Smith works at Smith Consulting Paris."
      }
    ],
    "http://purl.org/dc/terms/type": [
      {
        "type": "uri",
        "value": "http://dbpedia.org/ontology/Place"
      }
    ]
  },
  …
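
A client could reduce a response of this shape to a short list of annotations. The Python sketch below assumes the structure shown on the slide (enhancement URN, then property URI, then a list of value objects). The `SAMPLE` dictionary copies only the single enhancement visible above, and the `first_value` and `summarize` helpers are hypothetical names, not part of FISE.

```python
FISE_NS = "http://fise.iks-project.eu/ontology/"
DC_TYPE = "http://purl.org/dc/terms/type"

# The one enhancement visible on the slide; the rest of the response is elided there.
SAMPLE = {
    "urn:enhancement-1564680b-861c-df6f-fdf9-d34a75d68dfe": {
        FISE_NS + "selected-text": [
            {
                "datatype": "http://www.w3.org/2001/XMLSchema#string",
                "type": "literal",
                "value": "Paris",
            }
        ],
        FISE_NS + "selection-context": [
            {
                "datatype": "http://www.w3.org/2001/XMLSchema#string",
                "type": "literal",
                "value": "John Smith works at Smith Consulting Paris.",
            }
        ],
        DC_TYPE: [
            {"type": "uri", "value": "http://dbpedia.org/ontology/Place"}
        ],
    }
}


def first_value(properties, key):
    """Return the first value recorded for a property URI, or None if absent."""
    values = properties.get(key, [])
    return values[0]["value"] if values else None


def summarize(response):
    """Flatten each enhancement into an id / text / type record."""
    return [
        {
            "id": urn,
            "text": first_value(properties, FISE_NS + "selected-text"),
            "type": first_value(properties, DC_TYPE),
        }
        for urn, properties in response.items()
    ]


if __name__ == "__main__":
    for row in summarize(SAMPLE):
        print(row["text"], "->", row["type"])
```

Missing properties come back as `None` rather than raising, since not every enhancement in a real response is guaranteed to carry all three fields.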

Apache Stanbol = FISE + fast Linked Data local index + semantic rule engine + more?

Apache Stanbol / Nuxeo integration

[Diagram] Local IT infrastructure (LAN):

1. Nuxeo DM addon

2. Apache Stanbol: Engine 1, Engine 2, Engine 3

3. DBpedia, Freebase, Geonames, LDAP

• Implemented as an Operation for Studio

• Entities & Relationships stored in Nuxeo Core

• CMIS interoperability

Soon available on marketplace.nuxeo.com

• http://iks-project.eu

• http://fise.demo.nuxeo.com

• http://scribo.ws

• http://incubator.apache.org/stanbol

• http://blogs.nuxeo.com/dev

Questions?
