ClausIE: Clause-Based Open Information Extraction

Luciano Del Corro Rainer Gemulla

Max-Planck-Institut für Informatik

May 2013

Del Corro, Gemulla (MPI) ClausIE May 2013 1 / 18

Open Information Extraction: From sentences to propositions

GOAL: Extract information from natural text

Sentence
  Bell, a telecommunication company, which is based in Los Angeles, makes and distributes electronic, computer and building products.

Extractions/Propositions
  (Bell, 'is', a telecommunication company)
  (Bell, is based in, Los Angeles)
  (Bell, makes, electronic products)
  (Bell, distributes, electronic products)
  . . .

Most OIE extractors
  Propositions expressed as triples (arg1, relation, arg2)
  Verb-based relation
  Arguments restricted to noun phrases

Open Information Extraction: challenges and applications

Challenges/Requirements
  Domain independent
  Unbounded set of relations
  No filtering of information
  Structured output
  Scalable

Applications
  Structured search
  Automatic ontology construction
  Question answering
  Semantic role labeling, discourse parsing, ... ?

Outline

1 Information and Representation

2 Open Information Extractors and Language Technology

3 ClausIE
  Clauses in the English Language
  From clauses to propositions

4 Results

5 Conclusions and Future Directions

Information and Representation: a two-step approach

Information
  What information is expressed?
  How much to retain?
  How to identify it? (e.g. non-verb-mediated propositions)
    Messi, a golden ball winner, plays in Barcelona

Representation
  What is the form of the relation?
    Messi plays in Barcelona → plays or plays in
  Triples or n-ary propositions?
    (Messi, plays football in, Barcelona) or (Messi, plays, football, in Barcelona)
  What should be the scope of the arguments?
    Gandhi was vegetarian

We aim to separate these two phases
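
One way to picture this separation, as a minimal sketch only: the identified information is held as a list of role-labelled constituents, and the representation is chosen afterwards by a rendering function. The names `Constituent`, `as_nary`, and `as_triple` are hypothetical and not part of ClausIE, and the triple form shown is just one of the possible choices the slide asks about.

```python
from typing import List, NamedTuple, Tuple


class Constituent(NamedTuple):
    role: str  # "S", "V", "O", "A", ... (role labels as used in the deck)
    text: str


# Step 1 (information): constituents for "Messi plays football in Barcelona",
# written by hand here rather than produced by a parser.
clause: List[Constituent] = [
    Constituent("S", "Messi"),
    Constituent("V", "plays"),
    Constituent("O", "football"),
    Constituent("A", "in Barcelona"),
]


def as_nary(clause: List[Constituent]) -> Tuple[str, ...]:
    """Representation A: one argument per constituent."""
    return tuple(c.text for c in clause)


def as_triple(clause: List[Constituent]) -> Tuple[str, str, str]:
    """Representation B: subject, verb, and all remaining constituents fused."""
    subject = next(c.text for c in clause if c.role == "S")
    verb = next(c.text for c in clause if c.role == "V")
    rest = " ".join(c.text for c in clause if c.role not in ("S", "V"))
    return (subject, verb, rest)


nary = as_nary(clause)
triple = as_triple(clause)
```

Both renderings read the same `clause` list, so changing the representation does not require re-identifying the information.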

Open Information Extractors and Language Technology

Chunks/POS
  TextRunner
  WOEpos
  Reverb

Dependency Parser
  Wanderlust
  WOEparse
  KrakeN
  OLLIE

Figure: the example sentence with its chunk tags, POS tags, and dependency parse (DP).

Sentence: Bell , a telecommunication company , which is based in Los Angeles , makes and distributes electronic , computer and building products .
Chunks: B-NP B-NP I-NP I-NP , B-NP B-VP I-VP B-PP B-NP I-NP , B-VP I-VP I-VP B-ADJP , B-NP I-NP I-NP I-NP .
POS: NNP DT JJ NN , WDT VBZ VBN IN NNP NNP , VBZ CC VBZ JJ , NN CC NN NNS .
Dependency labels (DP): nsubj, det, nn, appos, nsubjpass, auxpass, rcmod, nn, prep_in, conj_and, amod, conj_and, conj_and, dobj, root

Clause Essentials

A clause is like a simple sentence
  Paul eats a chocolate bar

A sentence can be composed of more than one clause
  Anna drinks coffee and Bob plays football

Each clause encodes one or more propositions

Clauses can have optional adverbials
  He will take the exam in May

A minimal clause is a clause without its optional adverbials
  He will take the exam

The seven clauses

1 SVi → Albert Einstein died.

2 SVe A → Albert Einstein remained in Princeton.

3 SVc C → Albert Einstein is smart.

4 SVmt O → Albert Einstein has won the Nobel Prize.

5 SVdt Oi Od → RSAS gave Albert Einstein the Nobel Prize.

6 SVct O A → The doorman showed Albert Einstein to his office.

7 SVct O C → Albert Einstein declared the meeting open.

S: Subject, V: Verb, A: Adverbial, C: Complement, Oi: Indirect Object, O: Direct Object
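
As a small illustration, the seven clause types can be held in a lookup table keyed by their constituent pattern. This is only a sketch: the dictionary `SEVEN_CLAUSES` and the helper `roles` are hypothetical names, the verb-type names follow the labels used later in the deck (intransitive, extended copular, and so on), and the examples are copied from the list above.

```python
from typing import Dict, List, Tuple

# pattern -> (verb type, example sentence from the slide)
SEVEN_CLAUSES: Dict[str, Tuple[str, str]] = {
    "SV":   ("intransitive",       "Albert Einstein died."),
    "SVA":  ("extended copular",   "Albert Einstein remained in Princeton."),
    "SVC":  ("copular",            "Albert Einstein is smart."),
    "SVO":  ("monotransitive",     "Albert Einstein has won the Nobel Prize."),
    "SVOO": ("ditransitive",       "RSAS gave Albert Einstein the Nobel Prize."),
    "SVOA": ("complex transitive", "The doorman showed Albert Einstein to his office."),
    "SVOC": ("complex transitive", "Albert Einstein declared the meeting open."),
}


def roles(pattern: str) -> List[str]:
    """Split a pattern such as 'SVOA' into its constituent roles."""
    return list(pattern)


svoa_roles = roles("SVOA")
```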

By identifying each minimal clause in a sentence we can identify the essential information

The seven clauses: optional adverbials

Pattern | Clause type | Example | Derived clauses

Some extended patterns
SViAA | SV | AE died in Princeton in 1955. | (AE, died); (AE, died, in Princeton); (AE, died, in 1955); (AE, died, in Princeton, in 1955)
SVeAA | SVA | AE remained in Princeton until his death. | (AE, remained, in Princeton); (AE, remained, in Princeton, until his death)
SVcCA | SVC | AE is a scientist of the 20th century. | (AE, is, a scientist); (AE, is, a scientist, of the 20th century)
SVmtOA | SVO | AE has won the Nobel Prize in 1921. | (AE, has won, the Nobel Prize); (AE, has won, the Nobel Prize, in 1921)
ASVmtO | SVO | In 1921, AE has won the Nobel Prize. | (AE, has won, the Nobel Prize); (AE, has won, the Nobel Prize, in 1921)

S: Subject, V: Verb, A: Adverbial, C: Complement, Oi: Indirect Object, O: Direct Object
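
The derived clauses in the table follow a simple pattern: the minimal clause is always kept, and every subset of the optional adverbials is appended to it. A minimal sketch of that enumeration is given below; the function name `derived_clauses` is hypothetical, and the split into minimal clause and optional adverbials is supplied by hand rather than detected automatically.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def derived_clauses(
    minimal: Sequence[str], optional_adverbials: Sequence[str]
) -> List[Tuple[str, ...]]:
    """Return the minimal clause extended by each subset of optional adverbials."""
    results: List[Tuple[str, ...]] = []
    for k in range(len(optional_adverbials) + 1):
        # combinations() keeps the adverbials in their original order
        for subset in combinations(optional_adverbials, k):
            results.append(tuple(minimal) + subset)
    return results


# SViAA: both adverbials are optional
died = derived_clauses(("AE", "died"), ("in Princeton", "in 1955"))

# SVeAA: "in Princeton" is required by the verb, "until his death" is optional
remained = derived_clauses(("AE", "remained", "in Princeton"), ("until his death",))
```

With n optional adverbials this yields 2^n derived clauses, which matches the four entries for the SViAA row and the two entries for the other rows.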

From clauses to clause types (I)

Gandhi was vegetarian.
POS: NNP VBD JJ .
Dependency labels (DP): nsubj, cop, root

DP → Clause
Q1 Object? No
Q2 Complement? Yes → Copular (SVC)

(S: Gandhi, V: was, C: vegetarian)

From clauses to clause types (II)

ClausIE makes use of dictionaries

Albert Einstein died in Princeton.
Chunks: B-NP I-NP B-VP B-PP B-NP .
POS: NNP NNP VBD IN NNP .
Dependency labels (DP): nn, nsubj, prep_in, root

DP → Clause
Q1 Object? No
Q2 Complement? No
Candidate adverbial? Yes
Known non-ext. copular? Yes → Intransitive (SV)

(S: AE, V: died)
(S: AE, V: died, A: in Princeton)

Figure: decision tree from the dependency parse (DP) of a clause to its clause type.

Q1 Object?
Q2 Complement?
Q3 Candidate adverbial?
Q4 Known non-ext. copular?
Q5 Known ext. copular?
Q6 Conservative?
Q7 Dir. and indirect object?
Q8 Complement?
Q9 Cand. adv. and direct object?
Q10 Potentially compl.-trans.?
Q11 Conservative?

Clause types at the leaves: Copular (SVC), Intransitive (SV), Extended copular (SVA), Ditransitive (SVOO), Complex transitive (SVOC), Monotransitive (SVO), Complex transitive (SVOA)

We first identify the information and then generate the proposition.
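
The decision tree can be sketched as a chain of checks, with loud caveats. Only two paths are confirmed by the worked examples in this deck: no object plus a complement gives SVC ("Gandhi was vegetarian"), and no object, no complement, a candidate adverbial, and a known non-extended-copular verb gives SV ("Albert Einstein died in Princeton"). The Yes/No directions of the remaining branches, including what the "Conservative?" questions select, are assumptions made for illustration. `ClauseFeatures` and `clause_type` are hypothetical names, and the boolean fields stand in for the dictionary lookups and parse checks the slides mention.

```python
from dataclasses import dataclass


@dataclass
class ClauseFeatures:
    has_direct_object: bool = False
    has_indirect_object: bool = False
    has_complement: bool = False
    has_candidate_adverbial: bool = False
    known_non_ext_copular: bool = False   # assumed to come from a verb dictionary
    known_ext_copular: bool = False       # assumed to come from a verb dictionary
    potentially_complex_transitive: bool = False


def clause_type(f: ClauseFeatures, conservative: bool = True) -> str:
    """Return a clause-type pattern; branch directions beyond Q1-Q4 are assumed."""
    if not (f.has_direct_object or f.has_indirect_object):  # Q1 Object? No
        if f.has_complement:                                # Q2
            return "SVC"
        if not f.has_candidate_adverbial:                   # Q3
            return "SV"
        if f.known_non_ext_copular:                         # Q4
            return "SV"
        if f.known_ext_copular:                             # Q5 (assumed)
            return "SVA"
        return "SVA" if conservative else "SV"              # Q6 (assumed)
    if f.has_direct_object and f.has_indirect_object:       # Q7 (assumed)
        return "SVOO"
    if f.has_complement:                                    # Q8 (assumed)
        return "SVOC"
    if not (f.has_candidate_adverbial and f.has_direct_object):  # Q9 (assumed)
        return "SVO"
    if not f.potentially_complex_transitive:                # Q10 (assumed)
        return "SVO"
    return "SVOA" if conservative else "SVO"                # Q11 (assumed)


gandhi = clause_type(ClauseFeatures(has_complement=True))
einstein = clause_type(
    ClauseFeatures(has_candidate_adverbial=True, known_non_ext_copular=True)
)
```

The function only decides the clause type; building the proposition from the typed clause is a separate step, in line with the two-phase idea stated above.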

Example

Reverb →
  (a telecommunication company, is based in, Los Angeles)

Ollie →
  (Bell, distributes, electronic , computer and building products)

ClausIE →
  (S: Bell, V: 'is', C: a telecommunication company)
  (S: Bell, V: is based, A: in Los Angeles)
  (S: Bell, V: makes, O: electronic products)
  (S: Bell, V: makes, O: computer products)
  (S: Bell, V: makes, O: building products)
  (S: Bell, V: distributes, O: electronic products)
  (S: Bell, V: distributes, O: computer products)
  (S: Bell, V: distributes, O: building products)
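
The six SVO propositions in the ClausIE output correspond to every combination of the two coordinated verbs with the three coordinated modifiers of "products". A minimal sketch of that expansion follows; the function `expand_coordination` is a hypothetical name, and the verb and modifier lists are written by hand here, whereas the slides derive the conjuncts from the dependency parse.

```python
from itertools import product
from typing import List, Tuple


def expand_coordination(
    subject: str, verbs: List[str], modifiers: List[str], head: str
) -> List[Tuple[str, str, str]]:
    """Emit one (S, V, O) proposition per verb/modifier combination."""
    return [
        (subject, verb, f"{modifier} {head}")
        for verb, modifier in product(verbs, modifiers)
    ]


propositions = expand_coordination(
    "Bell",
    ["makes", "distributes"],
    ["electronic", "computer", "building"],
    "products",
)
```

Splitting the conjuncts this way is what separates the ClausIE output from the single Ollie extraction, which keeps "electronic , computer and building products" as one argument.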

Del Corro, Gemulla (MPI) ClausIE May 2013 12 / 18

Bell , a telecommunication company , which is based in Los Angeles , makes and distributes electronic , computer and building products .

B-NP B-NP I-NP I-NP , B-NP B-VP I-VP B-PP B-NP I-NP , B-VP I-VP I-VP B-ADJP , B-NP I-NP I-NP I-NP .

NNP DT JJ NN , WDT VBZ VBN IN NNP NNP , VBZ CC VBZ JJ , NN CC NN NNS .

nsubj

detnn

appos

nsubjpass

auxpass

rcmod

nn

prep inconj and

amod

conj and

conj and

dobjroot

1
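The six "makes" and "distributes" propositions in the ClausIE output are the cross product of the two coordinated verbs and the three coordinated object modifiers. A minimal Python sketch of that expansion follows; it assumes the conjuncts have already been read off the conj_and edges of the parse, and the function name and its parameters are hypothetical, not taken from ClausIE.

```python
from itertools import product


def expand_coordinations(subject, verbs, object_modifiers, object_head):
    """Build one (S, V, O) proposition per combination of verb and modifier."""
    return [
        (subject, verb, f"{modifier} {object_head}")
        for verb, modifier in product(verbs, object_modifiers)
    ]


propositions = expand_coordinations(
    subject="Bell",
    verbs=["makes", "distributes"],
    object_modifiers=["electronic", "computer", "building"],
    object_head="products",
)

for s, v, o in propositions:
    print(f"(S: {s}, V: {v}, O: {o})")
```

Because `product` varies the last input fastest, the propositions come out grouped by verb, in the same order as on the slide.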

Identifying information

ClausIE separates the identification of the information from its representation

Identifies essential and optional arguments in a clause

No training data

Initial support for non-verb-mediated relations

Processing of conjunctions (in verbs and subject/arguments)
  Messi and Iniesta play in Barcelona → (Messi, plays, in Barcelona), (Iniesta, plays, in Barcelona)

Resolution of relative clauses
  I saw the man whose house you like → (I, saw, the man), (You, like, the man’s house)

...

Proposition Generation: a flexible process

Arbitrary form of relations
  (Messi, plays football in, Barcelona) or (Messi, plays, football in Barcelona)

Propositions can be customized (e.g. triple, n-ary, etc.)
  (Messi, plays, football in Barcelona) or (Messi, plays, football, in Barcelona)

Arbitrary argument types (e.g. noun phrases, adjectives, etc.)
  (Gandhi, was, vegetarian) or (Gandhi, was, a vegetarian) or (Gandhi from Porbandar, was, a vegetarian)

Optional arguments can be used to generate new propositions
  (Paul, takes, a shower, in the morning) or (Paul, takes, a shower)
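Once a clause's constituents are identified, the same clause can be rendered in several of the forms listed above. The Python sketch below is a hedged illustration of that idea only: the `Clause` record and the `as_nary` and `as_triple` helpers are hypothetical names, and treating "in Barcelona" as an optional adverbial is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Clause:
    """Hypothetical container for the constituents identified in one clause."""
    subject: str
    verb: str
    essential: List[str] = field(default_factory=list)  # objects, complements
    optional: List[str] = field(default_factory=list)   # droppable adverbials


def as_nary(clause: Clause, include_optional: bool = True) -> Tuple[str, ...]:
    """One slot per argument: (S, V, arg1, arg2, ...)."""
    args = clause.essential + (clause.optional if include_optional else [])
    return (clause.subject, clause.verb, *args)


def as_triple(clause: Clause, include_optional: bool = True) -> Tuple[str, str, str]:
    """All arguments concatenated into a single third slot."""
    args = clause.essential + (clause.optional if include_optional else [])
    return (clause.subject, clause.verb, " ".join(args))


# Assumption: "in Barcelona" is an optional adverbial of this clause.
messi = Clause("Messi", "plays", essential=["football"], optional=["in Barcelona"])
paul = Clause("Paul", "takes", essential=["a shower"], optional=["in the morning"])

print(as_triple(messi))                       # triple form
print(as_nary(messi))                         # n-ary form
print(as_nary(paul))                          # with the optional adverbial
print(as_nary(paul, include_optional=False))  # optional adverbial dropped
```

Keeping identification (the `Clause` record) apart from rendering (the `as_*` helpers) is what lets one clause yield several propositions without re-analysing the sentence.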

Results

Outline

1 Information and Representation

2 Open Information Extractors and Language Technology

3 ClausIE
  Clauses in the English Language
  From clauses to propositions

4 Results

5 Conclusions and Future Directions

Evaluation

3 datasets
  Reverb: Web, very noisy (500 sentences)
  New York Times: Complex, written by experts (200 sentences)
  Wikipedia: Simple, written by non-experts (200 sentences)

2 labelers, pessimistic approach.

Agreement 57%-68%.

High precision, high recall.
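The evaluation setup can be sketched in a few lines of Python. Two assumptions are made here because the slide does not spell them out: "pessimistic" is taken to mean that an extraction counts as correct only if both labelers accept it, and extractions are taken to be sorted by descending extractor confidence before precision is measured against the number of extractions. The function names and the toy labels are hypothetical.

```python
from typing import List, Tuple


def pessimistic_labels(labels_a: List[bool], labels_b: List[bool]) -> List[bool]:
    """Assumed reading of 'pessimistic': correct only if both labelers agree it is."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelers must judge the same extractions")
    return [a and b for a, b in zip(labels_a, labels_b)]


def precision_curve(correct_sorted: List[bool]) -> List[Tuple[int, float]]:
    """Precision of the first k extractions for k = 1..n.

    The input is assumed to be ordered by descending extractor confidence.
    """
    curve = []
    hits = 0
    for k, ok in enumerate(correct_sorted, start=1):
        hits += ok
        curve.append((k, hits / k))
    return curve


# Toy judgments for four extractions (made-up data, for illustration only).
labeler_a = [True, True, False, True]
labeler_b = [True, False, False, True]

combined = pessimistic_labels(labeler_a, labeler_b)
for k, precision in precision_curve(combined):
    print(k, round(precision, 2))
```

Each point of the curve pairs a number of extractions with the precision at that cut-off, so an extractor that stays high while moving right produces many correct extractions.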

Results I: Reverb Sentences

[Figure: precision (0.0 to 1.0) against number of extractions (0 to 3000). Curves: ClausIE, ClausIE (non-red.), ClausIE w/o CCs, ClausIE w/o CCs (non-red.), Reverb, OLLIE, TextRunner, TextRunner (Reverb), WOE]

Results II: Wikipedia and New York Times

[Figure: two panels, Wikipedia (200 sentences) and New York Times (200 sentences), each plotting precision (0.0 to 1.0) against number of extractions. Curves in both panels: ClausIE, ClausIE (non-red.), ClausIE w/o CC, ClausIE w/o CC (non-red.), Reverb, OLLIE]

Conclusions and Future Directions

Conclusions
  ClausIE is a principled approach for OIE
  Separates identification and representation
  No training needed
  DP based
  Publicly available http://www.mpi-inf.mpg.de/departments/d5/software/clausie/

Future Directions
  Build dictionaries
  Incorporate context analysis
  Post processing of arguments
  Input to other tasks: discourse processing, SRL, targeted IE, ontology learning, QA, ...

Thank You!