
Semantic-assisted Analysis and Search in Customer Specifications

Dec 26, 2014

Martin Voigt

Talk at Future Search Engines 2014 (FoRESEE), INFORMATIK 2014 (http://www.informatik2014.de/)

Abstract (translated from German):
Finding specific information in large document collections is one of the major challenges of our time. This paper describes how we implemented analysis of and search in multilingual customer specifications in a current customer project in mechanical engineering. Document analysis relies on computational-linguistic and semantic technologies. Search is based on the faceted browsing paradigm.
Transcript
Page 1: Semantic-assisted Analysis and Search in Customer Specifications

Semantic-assisted Analysis and Search in Customer Specifications

Martin Voigt, Daniel Hladky

September 2014

[Figure: ONTOS Linked Data Information Workbench. Labels: Multilingual Specifications, Extraction & Analysis, Storage, Indexing, Information & Knowledge Management, Search, Portal, Engineer, Sales]

Page 2: Semantic-assisted Analysis and Search in Customer Specifications

I speak about …

The Problem,

Our Solution,

Insights & Further Work.

Page 3: Semantic-assisted Analysis and Search in Customer Specifications

The Problem

AviComp Controls GmbH

leading engineering contractor for rotating machinery controls

Customers

Engineers

Sales

> 100k Technical Specifications

http://www.avicomp.com/capabilities/turbo-compressor-controls.html

Page 4: Semantic-assisted Analysis and Search in Customer Specifications

The Problem

Analysis: 1) task, 2) current solution, 3) ideas

Problems

Multiple, inefficient tools

Heterogeneity

Knowledge management & transfer

http://answerhub.com/article/the-cost-of-knowledge-loss/

Page 5: Semantic-assisted Analysis and Search in Customer Specifications

Our Solution

http://www.ontos.com/products/ontosldiw/

Page 6: Semantic-assisted Analysis and Search in Customer Specifications

Our Solution

Extraction & Analysis

Homogenization: PDF conversion (Apache POI) & OCR (CuneiForm)

Text extraction (Apache Tika)

Language detection (language-detection API)

Text preparation, e.g., remove headers & footers

SKOS-based concept identification
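
The last two steps of the list above (text preparation and SKOS-based concept identification) could look roughly like the following sketch. It is not the authors' implementation: the thesaurus, its example.org URIs, the sample pages, and the function names are invented for illustration. Headers and footers are assumed to be the first or last non-empty line of most pages, and concept identification is reduced to whole-word label matching.

```python
import re
from collections import Counter

# Hypothetical mini-thesaurus (not from the talk): concept URI -> SKOS-style
# labels (prefLabel and altLabel merged) per language code.
THESAURUS = {
    "http://example.org/concept/compressor": {
        "en": ["compressor", "turbo compressor"],
        "de": ["Verdichter", "Kompressor"],
    },
    "http://example.org/concept/valve": {"en": ["valve"], "de": ["Ventil"]},
}


def _normalize(line):
    """Collapse digit runs so 'Page 1' and 'Page 2' compare as equal."""
    return re.sub(r"\d+", "#", line.strip())


def strip_repeated_lines(pages, min_share=0.6):
    """Remove lines that open or close at least min_share of all pages."""
    edge_counts = Counter()
    for page in pages:
        lines = [_normalize(ln) for ln in page.splitlines() if ln.strip()]
        if lines:
            edge_counts.update({lines[0], lines[-1]})
    repeated = {ln for ln, n in edge_counts.items() if n >= min_share * len(pages)}
    # Every line matching a repeated edge line is dropped, wherever it occurs.
    return [
        "\n".join(ln for ln in page.splitlines() if _normalize(ln) not in repeated)
        for page in pages
    ]


def find_concepts(text, lang):
    """Return URIs of concepts whose label in `lang` occurs as a whole word or phrase."""
    found = set()
    for uri, labels in THESAURUS.items():
        for label in labels.get(lang, []):
            if re.search(r"\b" + re.escape(label) + r"\b", text, re.IGNORECASE):
                found.add(uri)
                break
    return found


# Made-up two-page German specification with a running header and footer.
pages = [
    "ACME Spec Rev 2\nDer Verdichter ist zu regeln.\nPage 1",
    "ACME Spec Rev 2\nJedes Ventil wird geprüft.\nPage 2",
]
body = "\n".join(strip_repeated_lines(pages))
concepts = find_concepts(body, "de")
```

Whole-word matching misses German compounds such as "Verdichterstation", so a real pipeline would probably need stemming or decompounding on top of this.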

Page 7: Semantic-assisted Analysis and Search in Customer Specifications

Our Solution

Storage via OntoQUAD

Triple and/or QuadStore, SPARQL 1.1, …
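
As an illustration of what SPARQL 1.1 adds for SKOS data, the sketch below builds a query that fetches all labels of one concept in one language, using a property-path alternative (a SPARQL 1.1 feature). The helper name and the example URI are assumptions; how the query is sent to OntoQUAD is not covered by the slides and is left out.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"


def labels_query(concept_uri, lang):
    """Build a SPARQL 1.1 query for the preferred and alternative labels of a concept."""
    # No escaping is done here; inputs are assumed to be trusted.
    return (
        f"PREFIX skos: <{SKOS}>\n"
        "SELECT ?label WHERE {\n"
        f"  <{concept_uri}> skos:prefLabel|skos:altLabel ?label .\n"
        f'  FILTER (lang(?label) = "{lang}")\n'
        "}"
    )


# Hypothetical concept URI, for illustration only.
query = labels_query("http://example.org/concept/valve", "de")
```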

Indexing

Full text search, result grouping, faceted browsing, SKOS-based label expansion, …

Apache Solr with lucene-skos plugin (https://github.com/behas/lucene-SKOS)
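
A request combining full-text search, result grouping, and facets might be assembled as in this sketch. The parameter names (`q`, `fq`, `facet`, `facet.field`, `facet.mincount`, `group`, `group.field`, `wt`) are standard Solr query parameters. The core name `specs` and the field names `language`, `concept`, and `document_id` are assumptions, not taken from the project.

```python
from urllib.parse import urlencode


def build_facet_query(text, filters=None,
                      base="http://localhost:8983/solr/specs/select"):
    """Return a Solr select URL with facets on two fields and grouping per document."""
    params = [
        ("q", text),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.field", "language"),   # hypothetical field
        ("facet.field", "concept"),    # hypothetical field
        ("facet.mincount", "1"),
        ("group", "true"),
        ("group.field", "document_id"),  # hypothetical field
    ]
    # Each selected facet value becomes a filter query that narrows the result set.
    for field, value in (filters or {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    return base + "?" + urlencode(params)


url = build_facet_query("anti-surge", {"language": "de"})
```

Sending selected facet values as `fq` rather than appending them to `q` keeps the relevance ranking of the full-text query unchanged while narrowing the results.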

Page 8: Semantic-assisted Analysis and Search in Customer Specifications

Our Solution

Knowledge Management

via OntoDix but SKOS-only

Page 9: Semantic-assisted Analysis and Search in Customer Specifications

Our Solution

Search

via AJAX Solr (https://github.com/evolvingweb/ajax-solr)
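
AJAX Solr itself is a JavaScript library. To stay in one language, the sketch below shows in Python the kind of processing a faceted search front end performs on a Solr response. In Solr's default JSON output, the counts of a facet field arrive as a flat list alternating value and count. The sample response and the field name `language` are made up.

```python
import json


def facet_pairs(response_text, field):
    """Turn Solr's flat [value, count, value, count, ...] list into (value, count) pairs."""
    data = json.loads(response_text)
    flat = data["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))


# Made-up response fragment, for illustration only.
sample = '{"facet_counts": {"facet_fields": {"language": ["de", 12, "en", 7]}}}'
pairs = facet_pairs(sample, "language")
```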

Page 10: Semantic-assisted Analysis and Search in Customer Specifications

Insights & Further Work

Iterative development with early customer testing lowers usage barrier

Lessons learned

Development of a knowledge base

Faceted search user interface

Faceted search on RDF

Multilingual disambiguation mechanisms

Page 11: Semantic-assisted Analysis and Search in Customer Specifications

Q&A

Martin Voigt

Ontos AG / GmbH

Nidau (CH) / Leipzig (DE)

T: +49 341 21559-10

M: +49 178 40 222 58

E: [email protected]

Page 12: Semantic-assisted Analysis and Search in Customer Specifications

About Ontos

DoW – CTI Project

Ontos Group

Key Facts

- Established 2001

- 15+ employees

- Share in Eventos RU (30 people)

- 5± million CHF turnover

Industry

- Media/News

- Law Enforcement

- Government

- (Russia)