Semantic Enrichment of Text with Background Knowledge Anselmo Peñas NLP & IR Group UNED nlp.uned.es Eduard Hovy USC / ISI isi.edu

Dec 24, 2015


Page 1

Semantic Enrichment of Text with Background Knowledge

Anselmo Peñas
NLP & IR Group, UNED
nlp.uned.es

Eduard Hovy
USC / ISI
isi.edu

Page 2

Text omits information

San Francisco's Eric Davis intercepted a Steve Walsh pass on the next series to set up a seven-yard Young touchdown pass to Brent Jones.

Page 3

Make explicit implicit information

Implicit | (More) explicit
San Francisco's Eric Davis | Eric Davis plays for San Francisco; E.D. is a player, S.F. is a team
Eric Davis intercepted pass1 | -
Steve Walsh pass1 | Steve Walsh threw pass1; Steve Walsh threw interception1 …
Young touchdown pass2 | Young completed pass2 for touchdown …
touchdown pass2 to Brent Jones | Brent Jones caught pass2 for touchdown

San Francisco's Eric Davis intercepted a Steve Walsh pass on the next series to set up a seven-yard Young touchdown pass to Brent Jones.

Page 4

Goals

General goal: automatic recovery of such omitted information

Enrichment is the process of explicitly adding to a text's representation the information that is either implicit or missing in the text

Page 5

The enrichment cycle

Cycle:
1. Read text from collection
2. Ruminate in BKB
3. Enrich text representation
4. Repeat

[Diagram labels: Domain docs, Reading, Background Knowledge Base, Rumination, Enrichment]

Page 6

Goals

Specific goals of this work

Explore the idea of using “Proposition Stores” as Background Knowledge for enrichment

Explore procedures for enrichment

Determine the kinds of knowledge that Proposition Stores must include to enable enrichment

Page 7

Outline

1. Intro
2. BKB
3. Enrichment
4. Features of BKBs for Enrichment
5. Conclusion

Page 8

Elements in our BKB

Entities
• Classes: not limited to a predefined set
• Instances: proper nouns (in this first approach)
• Class:has-instance:Instance relations

Propositions: predefined syntactic structures
• NV, NVPN
• NVN, NVNPN
• NPN, AN
• …

Page 9

Extraction of propositions

Patterns over dependency trees

prop( Type, Form : DependencyConstraints : NodeConstraints ).

Examples:

prop(nv, [N,V] : [V:N:nsubj, not(V:_:'dobj')] : [verb(V)]).

prop(nvnpn, [N1,V,N2,P,N3]:[V:N2:'dobj', V:N3:Prep, subj(V,N1)]:[prep(Prep,P)]).

prop(has_value, [N,Val]:[N:Val:_]:[nn(N), cd(Val), not(lemma(Val,'one'))]).
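The pattern idea can be illustrated with a minimal Python sketch. This is not the authors' Prolog implementation: the edge-list representation, the `pos` dictionary, and the function name `extract_propositions` are all assumptions made for illustration. The sketch covers the `nv` pattern shown above and an analogous `nvn` pattern (subject, verb, direct object).

```python
from collections import Counter

# Hypothetical toy representation: a parse is a list of
# (head, dependent, label) edges plus a POS tag per token.
def extract_propositions(edges, pos):
    """Return ('nv', (N, V)) and ('nvn', (N1, V, N2)) propositions."""
    subjects, objects = {}, {}
    for head, dep, label in edges:
        if label == "nsubj":
            subjects.setdefault(head, []).append(dep)
        elif label == "dobj":
            objects.setdefault(head, []).append(dep)
    props = []
    for verb, subjs in subjects.items():
        if not pos.get(verb, "").startswith("V"):
            continue  # node constraint, mirrors verb(V)
        for subj in subjs:
            if verb in objects:
                for obj in objects[verb]:
                    props.append(("nvn", (subj, verb, obj)))
            else:
                # mirrors not(V:_:'dobj') in the nv pattern
                props.append(("nv", (subj, verb)))
    return props

# Toy input (invented for the example, not taken from the corpus).
edges = [("throw", "quarterback", "nsubj"), ("throw", "pass", "dobj"),
         ("run", "player", "nsubj")]
pos = {"throw": "VB", "run": "VB",
       "quarterback": "NN", "pass": "NN", "player": "NN"}

# Counting repeated propositions gives the frequencies a proposition store keeps.
store = Counter(extract_propositions(edges, pos))
```

Counting with `Counter` is one simple way to accumulate proposition frequencies over a collection; the original system may store them differently.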

Page 10

Background Knowledge Base (NFL, US football)

?> NN NNP:'pass'
NN 24 'Marino':'pass'
NN 17 'Kelly':'pass'
NN 15 'Elway':'pass'

?> X:has-instance:'Marino'
20 'quarterback':has-instance:'Marino'
6 'passer':has-instance:'Marino'
4 'leader':has-instance:'Marino'
3 'veteran':has-instance:'Marino'
2 'player':has-instance:'Marino'

?> NPN 'pass':X:'touchdown'
NPN 712 'pass':'for':'touchdown'
NPN 24 'pass':'include':'touchdown'

?> NVN 'quarterback':X:'pass'
NVN 98 'quarterback':'throw':'pass'
NVN 27 'quarterback':'complete':'pass'

?> NVNPN 'NNP':X:'pass':Y:'touchdown'
NVNPN 189 'NNP':'catch':'pass':'for':'touchdown'
NVNPN 26 'NNP':'complete':'pass':'for':'touchdown'
…

?> NVN 'end':X:'pass'
NVN 28 'end':'catch':'pass'
NVN 6 'end':'drop':'pass'
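Queries of this kind can be approximated with a small wildcard lookup. The following Python sketch is only an illustration: the `query` function and the `(type, arguments)` key layout are assumptions, while the counts are copied from the queries on this slide.

```python
from collections import Counter

# Counts copied from the example queries on this slide.
store = Counter({
    ("NVN", ("quarterback", "throw", "pass")): 98,
    ("NVN", ("quarterback", "complete", "pass")): 27,
    ("NVN", ("end", "catch", "pass")): 28,
    ("NVN", ("end", "drop", "pass")): 6,
    ("NPN", ("pass", "for", "touchdown")): 712,
    ("NPN", ("pass", "include", "touchdown")): 24,
})

def query(store, prop_type, pattern):
    """Return (count, args) matches, most frequent first; None is a wildcard."""
    hits = [
        (count, args)
        for (ptype, args), count in store.items()
        if ptype == prop_type
        and len(args) == len(pattern)
        and all(p is None or p == a for p, a in zip(pattern, args))
    ]
    return sorted(hits, reverse=True)

# Equivalent of: ?> NVN 'quarterback':X:'pass'
verbs = query(store, "NVN", ("quarterback", None, "pass"))
```

Here `None` plays the role of the variable `X` in the slide's query syntax.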

Page 11

Outline

1. Intro
2. BKB
3. Enrichment
4. Features of BKBs for Enrichment
5. Conclusion

Page 12

Enrichment example (1)

…to set up a 7-yard Young touchdown pass to Brent Jones

[Dependency graph: pass -nn- Young; pass -nn- touchdown; pass -to- Jones]

Young pass
?> X:has-instance:Young
X=quarterback
?> NVN:quarterback:X:pass
X=throw
X=complete

pass to Jones
?> X:has-instance:Jones
X=end
?> NVN:end:X:pass
X=catch
X=drop

Page 13

Enrichment example (2)

[Dependency graph: Young -throw/complete- pass; pass -nn- touchdown; Jones -catch/drop- pass]

touchdown pass
?> NVN touchdown:X:pass
False
?> NPN pass:X:touchdown
X=for

…to set up a 7-yard Young touchdown pass to Brent Jones

Page 14

Enrichment example (3)

[Dependency graph: Young -throw/complete- pass; pass -for- touchdown; Jones -catch/drop- pass]

?> NVNPN NAME:X:pass:for:touchdown
X=complete
X=catch

…to set up a 7-yard Young touchdown pass to Brent Jones

Page 15

Enrichment example (4)

[Dependency graph: Young -complete- pass; pass -for- touchdown; Jones -catch- pass]

Young complete pass for touchdown
Jones catch pass for touchdown

…to set up a 7-yard Young touchdown pass to Brent Jones
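The way the queries in examples (1) to (4) narrow down the candidate verbs can be sketched as a set intersection. This is a minimal Python illustration, not the authors' procedure: the dictionaries simply hard-code the answers shown on the slides, and the function name `enrich` is hypothetical.

```python
# Toy BKB answers, copied from enrichment examples (1)-(3).
has_instance = {"Young": "quarterback", "Jones": "end"}
nvn_verbs = {  # answers to NVN:<class>:X:pass
    "quarterback": {"throw", "complete"},
    "end": {"catch", "drop"},
}
nvnpn_verbs = {"complete", "catch"}  # answers to NVNPN NAME:X:pass:for:touchdown

def enrich(name):
    """Return verbs supported by both the class-level and the NVNPN query."""
    cls = has_instance[name]
    return nvn_verbs[cls] & nvnpn_verbs

for name in ("Young", "Jones"):
    for verb in sorted(enrich(name)):
        print(f"{name} {verb} pass for touchdown")
```

With these inputs the loop prints the two enriched propositions shown on this slide: "Young complete pass for touchdown" and "Jones catch pass for touchdown".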

Page 16

Enrichment

Build context for instances

Build context for dependencies
• Finding prepositions
• Finding verbs

Constrain interpretations

Page 17

Enrichment example (5)

San Francisco's Eric Davis intercepted a Steve Walsh pass on the next series to set up a seven-yard Young touchdown pass to Brent Jones.

[Figure: dependency graph of the sentence before enrichment]

[Figure: the same graph after enrichment, with the added labels for, throw, catch, complete]

Page 18

Outline

1. Intro
2. BKB
3. Enrichment
4. Features of BKBs for Enrichment
5. Conclusion

Page 19

What do BKBs need for enrichment? (1)

Ability to answer about instances
• Not complete population
• But allow analogy

Ability to constrain interpretations and accumulate evidence
• Several different queries over the same elements, considering different syntactic structures
• Requires normalization (and parsing)

Page 20

What do BKBs need for enrichment? (1)

Ability to discover entity classes with appropriate granularity level
• Quarterbacks throw passes
• Ends catch passes
• Tagging an entity as person, or even player, is not specific enough for enrichment

Text frequently introduces the relevant class (appropriate granularity level) for understanding

Page 21

What do BKBs need for enrichment? (2)

Ability to digest enough knowledge adapted to the domain
• Crucial

Approaches
• Macro-reading (web scale) + domain adaptation
  • Shallow NLP, lack of normalization
• Reading in context (suggested here)
  • Domain partitioning
  • Deeper NLP, specific domain NLP

Page 22

Digest enough knowledge

DART: general domain propositions store
TextRunner: general domain (web-scale)
BKB: specific domain propositions store (only 30,000 docs)

?> quarterback:X:pass

DART: (no results)
TextRunner: (~200) threw, (~100) completed, (36) to throw, (26) has thrown, (19) makes, (19) has, (18) fires
BKB (US Football): (99) throw, (25) complete, (7) have, (5) attempt, (5) not-throw, (4) toss, (3) release

Page 23

Digest Knowledge in the domain (entity classes)

?> X:intercept:pass

DART: (13) person, (6) person/place/organization, (2) full-back, (1) place
TextRunner: (30) Early, (26) Two plays, (24) fumble, (20) game, (20) ball, (17) Defensively
BKB (US Football): (75) person, (14) cornerback, (11) defense, (8) safety, (7) group, (5) linebacker

Page 24

Digest Knowledge in the domain (ambiguity problem)

?> person:X:pass

DART: (47) make, (45) take, (36) complete, (30) throw, (25) let, (23) catch, (1) make, (1) expect
TextRunner: (22) gets, (17) makes, (10) has, (10) receives, (7) who has, (7) must have, (6) acting on, (6) to catch, (6) who buys, (5) bought, (5) admits, (5) gives
BKB (US Football): (824) catch, (546) throw, (256) complete, (136) have, (59) intercept, (56) drop, (39) not-catch, (37) not-throw, (36) snare, (27) toss, (23) pick off, (20) run
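The effect of class granularity on ambiguity can be made concrete with a short Python sketch. It is only an illustration: the counts are copied from the BKB (US Football) answers in this deck (person:X:pass here, quarterback:X:pass and end:X:pass on earlier slides), lower-ranked verbs are left out, and the helper `top_verb` is a hypothetical name.

```python
# BKB (US football) counts copied from the slides; lower-ranked verbs omitted.
verbs_for_pass = {
    "person": {"catch": 824, "throw": 546, "complete": 256},
    "quarterback": {"throw": 99, "complete": 25},
    "end": {"catch": 28, "drop": 6},
}

def top_verb(cls):
    """Return the most frequent verb for <cls>:X:pass and its share of the listed counts."""
    counts = verbs_for_pass[cls]
    verb = max(counts, key=counts.get)
    return verb, counts[verb] / sum(counts.values())

for cls in verbs_for_pass:
    verb, share = top_verb(cls)
    print(f"{cls}: {verb} ({share:.0%} of listed counts)")
```

At the person level the leading verb takes a noticeably smaller share of the listed counts than at the quarterback or end level, which is one way to read the claim that person is too coarse a class for enrichment.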

Page 25

Domain issue

?> person:X:pass

NFL Domain
905:nvn:[person:n, catch:v, pass:n].
667:nvn:[person:n, throw:v, pass:n].
286:nvn:[person:n, complete:v, pass:n].
204:nvnpn:[person:n, catch:v, pass:n, for:in, yard:n].
85:nvnpn:[person:n, catch:v, pass:n, for:in, touchdown:n].

IC Domain
6:nvn:[person:n, have:v, pass:n]
3:nvn:[person:n, see:v, pass:n]
1:nvnpn:[person:n, wear:v, pass:n, around:in, neck:n]

BIO Domain
<No results>
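Keeping one proposition store per domain makes this contrast easy to reproduce. The Python sketch below is an assumption about how such partitioned stores might be held in memory, not a description of the original system; the `stores` layout and the `verbs` helper are hypothetical, and the NVN counts are copied from this slide.

```python
from collections import Counter

# NVN counts for person:X:pass copied from this slide; BIO had no results.
stores = {
    "NFL": Counter({("person", "catch", "pass"): 905,
                    ("person", "throw", "pass"): 667,
                    ("person", "complete", "pass"): 286}),
    "IC": Counter({("person", "have", "pass"): 6,
                   ("person", "see", "pass"): 3}),
    "BIO": Counter(),
}

def verbs(domain, subj, obj):
    """Return verbs linking subj and obj in one domain's store, most frequent first."""
    hits = Counter({v: c for (s, v, o), c in stores[domain].items()
                    if s == subj and o == obj})
    return [v for v, _ in hits.most_common()]

for domain in stores:
    print(domain, verbs(domain, "person", "pass"))
```

The same query returns different verbs in each domain, and nothing at all in BIO, which mirrors the point that the sense of "pass" depends on the collection that was read.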

Page 26

Domain issue

?> X:receive:Y

NFL Domain
55:nvn:[person:n, receive:v, call:n].
34:nvn:[person:n, receive:v, offer:n].
33:nvn:[person:n, receive:v, bonus:n].
29:nvn:[team:class, receive:v, pick:n].

IC Domain
78 nvn:[person:n, receive:v, call:n]
44 nvn:[person:n, receive:v, letter:n]
35 nvn:[group:n, receive:v, information:n]
31 nvn:[person:n, receive:v, training:n]

BIO Domain
24 nvn:[patients:n, receive:v, treatment:n]
14 nvn:[patients:n, receive:v, therapy:n]
13 nvn:[patients:n, receive:v, care:n]

Page 27

Outline

1. Intro
2. BKB
3. Enrichment
4. Features of BKBs for Enrichment
5. Conclusion

Page 28

Conclusions

Limiting to a specific domain provides some powerful benefits
• Ambiguity is reduced
• Higher density of relevant propositions
• Different distribution of propositions across domains
• Amount of source text is reduced, allowing deeper processing such as parsing
• Specific tools for specific domains

Proposition stores seem to be useful
• Improve parsing, coreference, WSD, …

We presented a new application: ENRICHMENT

Page 29

Current work

Develop automatic procedures for Enrichment

Need better Proposition Stores
• Selectional preferences
• Lexical relatedness
• Structural / frame transformations
• …

Page 30

Future work

Develop appropriate methodologies for evaluation
• Intrinsic?
• Extrinsic: QA over single documents?
• Reading comprehension tests?

Page 31

Thanks!

Page 32

NVN 3 'quarterback':'find':'receiver'
NVNPN 3 'quarterback':'throw':'pass':'to':'receiver'
NVNPN 2 'quarterback':'complete':'pass':'to':'receiver'
NVNPN 1 'receiver':'catch':'pass':'from':'quarterback'

nvn:('NNP':'quarterback'):'hit':('NNP':'receiver'),177).
nvnpn:('NNP':'quarterback'):'throw':'pass':'to':('NNP':'receiver'),143).
nvnpn:('NNP':'quarterback'):'complete':'pass':'to':('NNP':'receiver'),79).
nvn:('NNP':'quarterback'):'find':('NNP':'receiver'),69).
nvnpn:('NNP':'receiver'):'catch':'pass':'from':('NNP':'quarterback'),43).