Joint Information Extraction Ph.D. Thesis Defense Qi Li Advisor: Heng Ji Computer Science Department Rensselaer Polytechnic Institute April 7th, 2015 Doctoral Committee Dr. Heng Ji, Chair, RPI Dr. James Hendler, RPI Dr. Peter Fox, RPI Dr. Dan Roth, UIUC Dr. Daniel Bikel, Google
Joint Information Extraction
Ph.D. Thesis Defense
Qi Li
Advisor: Heng Ji
Computer Science Department
Rensselaer Polytechnic Institute
April 7th, 2015
Doctoral Committee
Dr. Heng Ji, Chair, RPI
Dr. James Hendler, RPI
Dr. Peter Fox, RPI
Dr. Dan Roth, UIUC
Dr. Daniel Bikel, Google
“…, dozens of Israeli tanks advanced into the northern Gaza Strip backed by helicopters which fired at least three rockets in the Jabaliya area, Palestinian security sources said. …” (AFP 2003/03/05)
• Entity mentions: Israeli (GPE), tanks (VEHICLE), Gaza Strip (LOCATION), helicopters (VEHICLE), Jabaliya area (LOCATION), rockets (WEAPON)
• Event triggers: advanced (Event: Transport), fired (Event: Attack)
• Edge labels shown in the figure: Owner, Vehicle, Destination, Instrument, Place, Instrument, Relation
Background
Entity Mention: a mention of an entity in the world
Relation: a semantic relationship between two entity mentions
ACE Relation Type: Example
o Physical: a town(GPE) some 50 miles south of Salzburg(GPE)
o Person-Social: Relatives(PER) of the dead(PER)
o EMP-ORG: The tire maker(ORG) still employs 1,400(PER)
o Agent-Artifact: Rubin Military Design, the makers(ORG) of the Kursk(VEH)
o PER/ORG Affiliation: Republican(ORG) senators(PER)
o GPE-Affiliation: Salzburg(GPE) Red Cross officials(PER)
• Automatic Content Analysis
Background
Event Mention: an occurrence of an event with a particular type and subtype
Event Mention Trigger (anchor): the word (or phrase) that most clearly expresses the event mention
Event Argument: an entity mention that serves as a participant in or an attribute of the event
“In Baghdad, a cameraman died when an American tank
• Lack of global inference about the entire results
• End-to-end performance is limited
o Relation: 40.8%
o Event Arg: 36.6%
Overview
This thesis investigates cross-component, cross-document, and cross-lingual dependencies to improve Information Extraction.
Cross-component Dependencies
Different components have various dependencies
• Long-distance dependencies
• Dependencies among multiple subtasks
Cross-document Dependencies
“Since the June 4 summit in Jordan between Abbas, Sharon and George W. Bush, Hamas has been a thorn in the side of Abbas ...”
o Member of (George W. Bush, Republican Party): 0.91
o Member of (George W. Bush, Hamas): 0.47
“The list included Sheik Ahmed Yassin, Hamas’ founder and spiritual leader, senior Hamas official Abdel Aziz Rantisi”
Cross-lingual Dependencies
Different languages in parallel corpora are complementary
• Resources
• Patterns, features, and language phenomena
• Constrained Conditional Models, ILP Inference [Roth2004; Punyakanok2005; Roth2007; Chang2012; Yang2013; Jindal&Roth2013; Cheng&Roth2013]
o Our method is a single unified joint model for both learning and inference
• Re-ranking Methods [Ji2005; Huang2002; Chen2010; McClosky2011]
o Their models were separately learned
o Need additional training data for re-ranking
• Probabilistic Graphical Models [Sutton2004; Poon2007; Poon2010; Kiddon2012; Wick2012; Singh2013]
o Computationally expensive
o Our method uses beam-search, and thus can
Token-based vs. Segment-based
o Token-based: The/O tire/B-ORG maker/L-ORG still/O employs/O 1,400/U-PER
o Segment-based: The [tire maker](ORG) still employs [1,400](PER)
(Sarawagi & Cohen, 2004) (Zhang & Clark, 2008)
(Florian et al., 2006) (Ratinov & Roth, 2009)
BILOU schema: B-X: beginning of X; L-X: last token of X; U-X: single token of X; O: no type
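The relationship between the two views can be sketched in a few lines of Python. This is only an illustration: the function name `bilou_to_segments` and the (start, end, type) tuple layout are assumptions made here, and the I-X (inside) tag, which the slide does not list, is treated as in the standard BILOU schema.

```python
from typing import List, Tuple


def bilou_to_segments(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert token-level BILOU tags into (start, end_exclusive, type) segments.

    Assumption: tags are well formed, so the type on an L-X tag matches the
    type on the B-X tag that opened the segment (not checked here).
    """
    segments = []
    start = None  # index of the token that opened the current B-X segment
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, _, label = tag.partition("-")
        if prefix == "U":  # single-token mention
            segments.append((i, i + 1, label))
            start = None
        elif prefix == "B":  # mention starts here
            start = i
        elif prefix == "L" and start is not None:  # mention ends here
            segments.append((start, i + 1, label))
            start = None
        # an "I" tag simply keeps the current segment open
    return segments


tokens = ["The", "tire", "maker", "still", "employs", "1,400"]
tags = ["O", "B-ORG", "L-ORG", "O", "O", "U-PER"]
segments = bilou_to_segments(tags)
for start, end, label in segments:
    print(" ".join(tokens[start:end]), label)
```

A token-based decoder scores one tag per token, while a segment-based decoder scores whole spans such as (1, 3, ORG) directly.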
Parameter Estimation
• Structured Perceptron with Beam Search
o Update Weights:
  • Perceptron Update:
  • K-best MIRA (Margin Infused Relaxed Algorithm) [McDonald et al., 2005]
Beam Search [Collins and Roark 2004, Huang et al. 2012]
[Figure: the input is decoded with beam search; weights are updated by comparing the ground-truth with the prediction]
Standard-update vs. Early-update [Collins and Roark 2004, Huang et al. 2012]
[Figure: search space with the beam, the correct solution y, the global 1-best z, and the 1-best prefix z]
• standard update: invalid update!
• early update: update when the ground-truth prefix falls off the beam
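A minimal sketch of early update on a toy tagging task is given below. It is not the thesis implementation: the feature map, the label set, and the function names (`features`, `beam_search`, `train`) are assumptions made for illustration, and weight averaging is left out.

```python
def features(tokens, labels):
    """Hypothetical feature map: emission and transition counts of a label prefix."""
    feats = {}
    prev = "<s>"
    for tok, lab in zip(tokens, labels):  # zip stops at the end of the prefix
        for f in (("emit", tok, lab), ("trans", prev, lab)):
            feats[f] = feats.get(f, 0.0) + 1.0
        prev = lab
    return feats


def score(weights, tokens, labels):
    return sum(weights.get(f, 0.0) * v for f, v in features(tokens, labels).items())


def beam_search(weights, tokens, label_set, beam_size, gold=None):
    """Return (1-best hypothesis, gold prefix of the same length).

    When `gold` is given, decoding stops as soon as the gold prefix falls
    off the beam, which is the early-update condition.
    """
    beam = [()]
    for i in range(len(tokens)):
        candidates = [prefix + (lab,) for prefix in beam for lab in label_set]
        candidates.sort(key=lambda p: score(weights, tokens, p), reverse=True)
        beam = candidates[:beam_size]
        if gold is not None and tuple(gold[: i + 1]) not in beam:
            return beam[0], tuple(gold[: i + 1])
    return beam[0], (tuple(gold) if gold is not None else None)


def train(examples, label_set, beam_size=1, epochs=5):
    weights = {}
    for _ in range(epochs):
        for tokens, gold in examples:
            z, y = beam_search(weights, tokens, label_set, beam_size, gold)
            if z != y:  # perceptron update: w += f(x, y) - f(x, z)
                for f, v in features(tokens, y).items():
                    weights[f] = weights.get(f, 0.0) + v
                for f, v in features(tokens, z).items():
                    weights[f] = weights.get(f, 0.0) - v
    return weights


examples = [(["tank", "fired"], ["VEH", "Attack"])]
labels = ["O", "VEH", "Attack"]
weights = train(examples, labels)
prediction, _ = beam_search(weights, ["tank", "fired"], labels, beam_size=1)
print(prediction)
```

Because the update is made against the 1-best prefix at the step where the gold prefix leaves the beam, the gold structure never scores higher than the hypothesis it is compared with, which is what makes the update valid under inexact search.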
Token-based Search Algorithm
• Assume argument candidates are given
• Decoding example (beam size = 1):
In Baghdad, a cameraman died when an American tank fired on the Palestine Hotel.
[Figure: token-by-token decoding of the sentence. Entity mention types: LOC, PER, VEH, FAC. Trigger labels: Die, Attack; all other tokens are labeled O. Argument roles: place, victim, target, instrument]
Segment-based Search Algorithm
• Limitations of the Token-based decoder
o unfair to compare nodes with different boundaries
  • Complete mention is biased by the model
o difficult to synchronize edge steps
  • (New/B-FAC York/I-FAC) is not yet a complete mention; no link can be made at this step
[Figure label: Not parsed yet]
Segment-based Search Algorithm
• Node-step (search for entity mentions and event triggers)
o propose various nodes at the current token
o append to previous assignments
o evaluate and rank new assignments
[Figure: candidate node types ORG, PER, O, … for the example sentence]
Asif Mohammed Hanif detonated explosives in Tel Aviv
Segment-based Feature
o Context Features: noun phrase, person gazetteer, previous word: “the”, …
[Figure: candidate types PER, ORG, O, …; context features × PER]
[Figure: candidate node labels now include event trigger types: Attack, Injure, PER, O, …]
Append each candidate to previous prefixes
[Figure: buffer at “Hanif” with previous prefixes ending in PER, ORG, O, …; new candidate Attack]
Segment-based Search Algorithm
• Edge-step (search for relation/argument links)
o At each sub-step, connect each new node with a previous one by a typed edge, or NIL.
Asif Mohammed Hanif detonated explosives in Tel Aviv
[Figure: nodes PER, Attack, WEAPON; edges Attacker, Instrument, agent-artifact]
Relation-Event Feature: Attacker, Instrument, Agent-artifact
Segment-based Search Algorithm
• Return the candidate with the highest model score as the final prediction
Asif Mohammed Hanif detonated explosives in Tel Aviv
[Figure: candidates. Final structure with nodes PER, Attack, WEAPON, O, GPE; event arguments attacker, instrument, place; relations agent-artifact, physical]
• The maximal length of each node type
o ORG example: “Pearl River Hang Cheong Real Estate Consultants Ltd”
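The node-step with a per-type length limit can be sketched as follows. The limits in `max_len` and the function name `propose_segments` are hypothetical values chosen for illustration; the transcript does not state the actual limits.

```python
def propose_segments(tokens, end, max_len):
    """Enumerate (start, end_exclusive, type) candidates that end at token index `end`.

    `max_len` maps each node type to the maximal number of tokens a segment
    of that type may cover (hypothetical values in this sketch).
    """
    candidates = []
    for node_type, limit in max_len.items():
        for length in range(1, limit + 1):
            start = end + 1 - length
            if start < 0:  # segment would begin before the sentence
                break
            candidates.append((start, end + 1, node_type))
    return candidates


tokens = "Asif Mohammed Hanif detonated explosives in Tel Aviv".split()
max_len = {"PER": 3, "GPE": 2, "Attack": 1}  # assumed limits, for illustration only
cands = propose_segments(tokens, end=2, max_len=max_len)
for start, end, node_type in cands:
    print(node_type, tokens[start:end])
```

Each candidate would then be appended to the prefixes in the beam that end exactly at `start`, so hypotheses with different segment boundaries are compared only once they cover the same tokens.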
Local Features
o Similar to the features in pipelined approaches
o Only care about local decisions
In Baghdad, a cameraman died when an American tank fired on the Palestine Hotel
Global Features
o Involve a wider range of the output structure
o Ask arbitrary questions about the entire structure
[Figure: argument edges place, target, target, instrument over the example sentence]
1. Does “fired” have only one Place?
2. Is “Baghdad” an argument to “died”?
3. …
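Questions of this kind can be expressed as small functions over the whole predicted structure. The sketch below assumes the structure is a list of (trigger, role, argument) triples; that representation, the helper names, and the particular edges listed are illustrative assumptions rather than the thesis data format.

```python
from collections import Counter


def role_counts(edges):
    """Count how many arguments each trigger has for each role."""
    return Counter((trigger, role) for trigger, role, _ in edges)


def is_argument(edges, trigger, mention):
    """True if `mention` fills any role of `trigger`."""
    return any(t == trigger and a == mention for t, _, a in edges)


# Illustrative structure for the example sentence (assumed, not gold data).
edges = [
    ("died", "Place", "Baghdad"),
    ("died", "Victim", "cameraman"),
    ("fired", "Place", "Baghdad"),
    ("fired", "Target", "cameraman"),
    ("fired", "Target", "Palestine Hotel"),
    ("fired", "Instrument", "tank"),
]

fired_has_one_place = role_counts(edges)[("fired", "Place")] == 1
baghdad_in_died = is_argument(edges, "died", "Baghdad")
print(fired_has_one_place, baghdad_in_died)
```

A local feature looks at one node or one edge, whereas each of these checks needs several edges at once, which is why such features are only available to a model that builds the structure jointly.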
Global Trigger Feature
“a cameraman died when an American tank fired on …”
o Dependency link: Die, advcl, Attack
o Context word: Die, “when”, Attack
advcl: adverbial clause modifier
Global Argument Feature
o two triggers share the same mention as arguments
“a cameraman died when an American tank fired on …”
[Figure: Die(“died”) and Attack(“fired”) are linked by advcl; Entity(“cameraman”) is the Victim of Die and the Target of Attack]
Global Entity Mention Features
• Neighbor entity mentions should have coherent types
“Barbara Starr was reporting from the Pentagon”
o Positive feature: PER and FAC linked by prep_from
o Negative feature: PER and PER linked by prep_from
prep_from: prepositional modifier “from”
Global Relation Features
• Dependency compatibility
o two dependent mentions should have compatible relations
“U.S. forces in Somalia, Haiti and Kosovo”
[Figure: GPE(“Somalia”) and GPE(“Kosovo”) are linked by conj_and; each has a PHYS relation with PER(“forces”)]
conj_and: conjunction by “and”
Experiments
• Data Sets
o ACE’05 corpus: excluding informal genres cts and un
o ACE’04 corpus: bnews and nwire subsets
• Evaluate the performance of each subtask and of the end-to-end systems using the F1 measure

Data Set       # sents  # mentions  # relations  # triggers  # args
ACE’05 Train   7.2k     26.4k       4.7k         2.8k        4.5k
ACE’05 Dev     1.7k     6.4k        1.1k         0.7k        1.1k
ACE’05 Test    1.5k     5.4k        1.1k         0.6k        1.0k
ACE’04         6.7k     22.7k       4.3k         N/A         N/A
Experiments: Token-based Decoder
• Results on ACE’05 with gold-standard entity mentions, values and timex [Q. Li, H. Ji, L. Huang. ACL 2013]
Experiments: End-to-end Relation Extraction
• Results on ACE’05 [Q. Li, H. Ji. ACL 2014]
• Three types of loss functions in K-best MIRA
o F1 Measure
o 0-1 loss
o Similar to F1 loss, but sensitive to the size of structures
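The first two losses can be sketched over structures represented as sets of labeled nodes and edges. The set representation and function names are assumptions made for illustration; the third loss is not reproduced because its formula does not survive in this transcript.

```python
def f1_loss(gold, pred):
    """1 - F1 between two structures given as collections of labeled items."""
    gold, pred = set(gold), set(pred)
    if not gold and not pred:
        return 0.0
    correct = len(gold & pred)
    return 1.0 - 2.0 * correct / (len(gold) + len(pred))


def zero_one_loss(gold, pred):
    """0 if the two structures are identical, 1 otherwise."""
    return 0.0 if set(gold) == set(pred) else 1.0


# Items are (kind, span or endpoints, label); the encoding is hypothetical.
gold = {
    ("node", (0, 3), "PER"),
    ("node", (3, 4), "Attack"),
    ("edge", ((3, 4), (0, 3)), "Attacker"),
}
pred = {
    ("node", (0, 3), "PER"),
    ("node", (3, 4), "Injure"),
    ("edge", ((3, 4), (0, 3)), "Victim"),
}

print(f1_loss(gold, pred), zero_one_loss(gold, pred))
```

Here only the PER node is shared, so the F1-based loss gives partial credit, while the 0-1 loss treats any difference from the gold structure as a full error.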
[Figure: example prediction for “Asif Mohammed Hanif detonated explosives in Tel Aviv” with labels Injure, PER, Victim]
Experiments: Complete Model (Entity Mention, Relation, Event) [Q. Li, H. Ji, Y. Hong, S. Li. EMNLP 2014]
• Overall Performance

Approach                     Entity Mention  Relation  Event Trigger  Event Argument
Preliminary Results
  Pipelined Baseline         79.5            51.6      64.4           35.7
  Pipelined + Token-based    -               -         64.5           43.1
  Li and Ji (2014)           80.8            52.1      -              -
Complete Joint Model
  Joint w/ Avg. Perceptron   81.0            52.0      65.3           45.6
  Joint w/ MIRA w/ F1 Loss   79.0            49.2      61.5           47.4
  Joint w/ MIRA w/ 0-1 Loss  80.0            51.0      63.2           47.9
  Joint w/ MIRA w/ Loss 3    80.7            52.8      65.2           46.8
Remaining Challenges
• Capture world knowledge
o Williams picked up the child and this time, threw(Attack) her out the window.
o We believe that the likelihood of them using(Attack) those weapons goes up.
• Disambiguate physical and non-physical events
o Sam Brownback vowed Monday to defend Kansas' ban of ...
o it is still hurts me to read this. (“hurt” is not an attack here)
• Pronoun resolution
o It’s important that people know that we don’t believe in the war(Attack).
o Nobody questions whether this(Attack) is right or not.
• Semantic inference
o Negotiations between Washington and Pyongyang on their nuclear dispute have been set for April 23 in Beijing and are widely seen here as a blow to Moscow efforts to stamp authority on the region by organizing such a meeting.
This work
• Provided a novel view of the whole task.
• Significantly improved the end-to-end performance.
• Is limited to a single sentence and a single language.
Can we go beyond the sentence boundaries, and break the barrier of different languages?
Next: we study cross-doc dependencies and cross-lingual dependencies.
• Monolingual bigram factors
o Linear chains on each sentence
• Bilingual factors
o Based on word alignment
o Explicitly model the cross-language dependency
• Inference
o Apply loopy belief propagation to do approximate inference (Wainwright et al., 2001; Sutton et al., 2007)
Bilingual Name Tagging
• Training/Test Data
o 288 Chinese-English documents from Parallel Treebank
o 230 documents for training; 58 documents for test
• Evaluation Metric: bilingual name pair metric
o Precision/Recall/F-measure on bilingual name pairs
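A name-pair metric of this kind can be sketched as set overlap between gold and predicted pairs. The sketch assumes a pair counts as correct only when the English name, the Chinese name, and the type all match exactly; the exact matching criterion is not stated in this transcript, and the example pairs are invented for illustration.

```python
def pair_prf(gold_pairs, predicted_pairs):
    """Precision, recall and F-measure over bilingual name pairs.

    Each pair is (english_name, chinese_name, type); exact match on all
    three fields is an assumption of this sketch.
    """
    gold, pred = set(gold_pairs), set(predicted_pairs)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Invented example pairs, not taken from the corpus.
gold_pairs = [("Beijing", "北京", "GPE"), ("Xinhua News Agency", "新华社", "ORG")]
predicted_pairs = [("Beijing", "北京", "GPE"), ("Xinhua", "新华社", "ORG")]
precision, recall, f_measure = pair_prf(gold_pairs, predicted_pairs)
print(precision, recall, f_measure)
```

Under this criterion a tagger gets no credit for a pair when either side has a boundary error, so the metric rewards agreement between the two languages.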
Type                         English  Chinese  Bilingual Pairs
GPE (Geo-political entity)   4.0k     4.0k     4.0k
Person                       1.0k     1.0k     1.0k
Organization                 1.5k     1.5k     1.5k
All                          6.6k     6.6k     6.6k
Bilingual Name Tagging
• Overall Performance [Q. Li et al. CIKM 2012]
• Bilingual name tagging improves name-aware machine translation & word alignment [H. Li et al. ACL 2013]
o Baseline: Hierarchical Phrase-based Machine Translation (Zheng et al., 2009)

Task              Metric            Baseline MT  Name-aware MT
Name Translation  Weak Accuracy     66.5%        72.9%
Overall MT        BLEU              35.8%        36.3%
Overall MT        Name-aware BLEU   36.1%        39.4%
Name Alignment    F-measure         46.0%        50.3%
Conclusions
• We investigated cross-component, cross-document, and cross-lingual dependencies to improve IE performance
1. For the first time, we formulated the problem of IE as the task of constructing information networks. We showed that performing structured learning with global features is possible and very useful for this task. Our joint framework achieved state-of-the-art results in each subtask.
3. Our bilingual name tagger significantly outperforms the traditional monolingual method. It can improve name-aware machine translation.
Future Directions
• Expand Information Types
“A Germanwings flight 9525 crashed in the French Alps”
o Germanwings: Commercial ORG
o Flight 9525: Air Vehicle
o New Event Types: Accident, Rescue, Evacuation etc.
• Knowledge Acquisition for IE
o Use world knowledge to guide IE (Chan & Roth 2010 etc.)
Related Publications
• Constructing Information Networks Using One Single Model. Qi Li, Heng Ji, Yu Hong, Sujian Li. EMNLP 2014
• Incremental Joint Extraction of Entities and Relations. Qi Li, Heng Ji. ACL 2014
• Joint Event Extraction via Structured Prediction with Global Features. Qi Li, Heng Ji, Liang Huang. ACL 2013
• Joint Bilingual Name Tagging for Parallel Corpora. Qi Li, Haibo Li, Heng Ji, Wen Wang, Jing Zheng, Fei Huang. CIKM 2012