Bootstrapping an Ontology-based Information Extraction System Alexander Maedche, Günter Neumann, Steffen Staab (presented by D. Lonsdale) CS 652 – June 7/04

Dec 21, 2015

Bootstrapping an Ontology-based Information Extraction System

Alexander Maedche, Günter Neumann, Steffen Staab

(presented by D. Lonsdale)

CS 652 – June 7/04

Overview

Traditional IE + machine learning
Extensive use of NLP (SMES: German, English, Japanese)
Ontologies and related tools (OntoEdit, OntoBroker)
abstract ontology + lexicon → concrete ontology
Conclusions/reflections

The mantra

Lexical knowledge
  As usual, concepts are grounded in lexical items
Extraction rules
  OntoBroker: deductive, OODB, F-Logic
Ontology
  Abstract ontology + lexicon → concrete ontology
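
To make "concepts are grounded in lexical items" concrete, here is a minimal sketch of a concept lexicon and a lookup from surface words to concepts. The concept names, the word lists, and the helper `concepts_for` are all hypothetical illustrations; the slides do not show the actual lexicon format.

```python
# Hypothetical toy lexicon: each ontology concept lists the words that evoke it.
# Neither the concepts nor the entries are taken from the presented system.
LEXICON = {
    "Accommodation": ["accommodation", "Unterkunft"],
    "Hotel": ["hotel", "Hotel"],
    "Beach": ["beach", "Strand"],
}


def concepts_for(word, lexicon):
    """Return the concepts whose lexical entries match `word`, ignoring case."""
    needle = word.lower()
    return sorted(
        concept
        for concept, entries in lexicon.items()
        if needle in (entry.lower() for entry in entries)
    )
```

Calling `concepts_for("Strand", LEXICON)` looks the word up across all concepts; a word with no entry yields an empty list, which is the point where a lexicon would need to be extended.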

Lexical knowledge

Low-level lexicons, dynamically updated
Basic low-level NLP:
  tokenization (50 classes)
  morphological processing
  POS tagging
  named entity extraction
  chunk parsing
  thematic role assignment (grammatical function)
Cascading finite-state transducers
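
The idea of a cascade, where each stage consumes the output of the previous one, can be sketched in a few lines. This is a toy under stated assumptions, not the SMES implementation: the tag set, the tiny lexicon `TOY_LEXICON`, and the noun-phrase pattern (optional determiner, any adjectives, one or more nouns) are invented for illustration, and each stage is a simple left-to-right pass standing in for a real transducer.

```python
import re

# Hypothetical mini-lexicon mapping lowercase words to coarse POS tags.
TOY_LEXICON = {
    "the": "DET", "a": "DET",
    "small": "ADJ", "old": "ADJ",
    "hotel": "N", "beach": "N", "city": "N",
    "near": "P", "in": "P",
    "is": "V", "offers": "V",
}


def tokenize(text):
    """Stage 1: split text into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def tag(tokens):
    """Stage 2: attach a POS tag to each token; unknown words get 'UNK'."""
    return [(tok, TOY_LEXICON.get(tok.lower(), "UNK")) for tok in tokens]


def chunk(tagged):
    """Stage 3: group DET? ADJ* N+ runs into NP chunks, pass the rest through."""
    chunks = []
    i = 0
    while i < len(tagged):
        j = i
        if tagged[j][1] == "DET":
            j += 1
        while j < len(tagged) and tagged[j][1] == "ADJ":
            j += 1
        k = j
        while k < len(tagged) and tagged[k][1] == "N":
            k += 1
        if k > j:  # at least one noun was found, so emit a noun phrase
            chunks.append(("NP", [tok for tok, _ in tagged[i:k]]))
            i = k
        else:      # no noun phrase starts here, emit the single token
            chunks.append((tagged[i][1], [tagged[i][0]]))
            i += 1
    return chunks


def cascade(text):
    """Run the three stages in sequence."""
    return chunk(tag(tokenize(text)))
```

For example, `cascade("The small hotel is near the beach.")` groups "The small hotel" and "the beach" into NP chunks and leaves the verb, preposition, and final period as single-token chunks. Later stages such as grammatical function assignment would consume these chunks in the same way.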

The NLP component

NLP terms

Dependency syntax
Chunk parsing
Subcategorization
Case
Topological fields
PP attachment

Dependency syntax

Extraction

Concept definitions
Inference rules/axioms
Bridging (forward inferencing)
  Syntactic dependency relations
    “...implementations of idiosyncratic syntactic cues for particular ontological structures...”
  Logical relations (e.g. transitivity, LocatedIn)
OntoBroker engine
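
As an illustration of forward inferencing with a logical relation such as a transitive LocatedIn, the sketch below applies one transitivity rule until no new facts appear. It is plain Python, not F-Logic and not the OntoBroker engine; the fact encoding as `(relation, subject, object)` triples and the place names are assumptions made for the example.

```python
def forward_chain(facts):
    """Close a set of (relation, subject, object) triples under
    transitivity of 'locatedIn' by repeated rule application."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        new_facts = set()
        for rel1, a, b in derived:
            for rel2, c, d in derived:
                # Rule: locatedIn(a, b) and locatedIn(b, d) => locatedIn(a, d)
                if rel1 == rel2 == "locatedIn" and b == c:
                    candidate = ("locatedIn", a, d)
                    if candidate not in derived:
                        new_facts.add(candidate)
        if new_facts:
            derived |= new_facts
            changed = True
    return derived


# Hypothetical extracted facts (names are illustrative only).
FACTS = {
    ("locatedIn", "HotelAdler", "Rostock"),
    ("locatedIn", "Rostock", "Mecklenburg"),
    ("locatedIn", "Mecklenburg", "Germany"),
}
```

Running `forward_chain(FACTS)` adds the three implied facts, including that the hypothetical hotel is located in Germany, without any text ever stating it directly. A deductive engine generalizes this to arbitrary rules and axioms.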

OntoEdit display (tourism)

An abstract ontology

A(n ontology) lexicon

Ontology learning

So how does ontology learning happen?
  Ontology engineer specifies, refines knowledge structures
  Select and process a text corpus with the model
  Use a set of different learning approaches
    “...generalized association rule learning algorithm...”
  Extend the extracted model (all three parts...)
  Human reviews learning decisions
The ontology is concrete, the methodology description less so...
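
Since the slide only quotes the phrase "generalized association rule learning algorithm", the following is a rough sketch of what that term usually means, not a reconstruction of the paper's method. Assumptions: each transaction is the set of concepts found together in one text unit, the taxonomy is a child-to-parent map, and transactions are extended with ancestor concepts before counting pairs. The data, thresholds, and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical taxonomy (child -> parent) and co-occurrence transactions.
TAXONOMY = {
    "Hotel": "Accommodation", "Campground": "Accommodation",
    "Beach": "Area", "City": "Area",
}
TRANSACTIONS = [
    {"Hotel", "Beach"},
    {"Campground", "Beach"},
    {"Hotel", "City"},
    {"Campground", "Beach"},
]


def ancestors(concept, taxonomy):
    """Return all ancestors of `concept`, nearest first."""
    found = []
    while concept in taxonomy:
        concept = taxonomy[concept]
        found.append(concept)
    return found


def mine_pairs(transactions, taxonomy, min_support, min_confidence):
    """Find rules lhs -> rhs between concept pairs, counted over
    transactions that have been extended with ancestor concepts."""
    extended = []
    for transaction in transactions:
        items = set(transaction)
        for concept in transaction:
            items.update(ancestors(concept, taxonomy))
        extended.append(items)

    item_counts, pair_counts = Counter(), Counter()
    for items in extended:
        item_counts.update(items)
        pair_counts.update(combinations(sorted(items), 2))

    rules = []
    for (x, y), count in pair_counts.items():
        # A concept trivially co-occurs with its own ancestors; skip those.
        if x in ancestors(y, taxonomy) or y in ancestors(x, taxonomy):
            continue
        support = count / len(extended)
        if support < min_support:
            continue
        for lhs, rhs in ((x, y), (y, x)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules
```

With `mine_pairs(TRANSACTIONS, TAXONOMY, 0.75, 0.9)`, no pair of leaf concepts is frequent enough on its own, but the generalized pair Accommodation/Area is. That is the kind of candidate relation a human reviewer would then accept or reject.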

The overall approach/system

GETESS visualization

Conclusions/reflections

Heavy use of NLP (good/bad)
Fairly typical mapping of lexical items, concepts, relations
Toolkit approach: lingware, inferencing, GUIs
Machine learning description is vague
A picture is only worth a thousand words...