FATE: a FrameNet Annotated corpus for Textual Entailment
Marco Pennacchiotti, Aljoscha Burchardt
Computerlinguistik, Saarland University, Germany
LREC 2008, Marrakech, 28 May 2008
SALSA II - The Saarbrücken Lexical Semantics Acquisition Project
Summary
• FrameNet and Textual Entailment
• FATE annotation schema
• Annotation examples and statistics
• Conclusions
28/05/2008 2 / 17 FATE - Marco Pennacchiotti
Frame Semantics [Fillmore 1976, 2003]
• Frame: conceptual structure modeling a prototypical situation
• Frame Elements (FE): participants of the situation
• Frame Evoking Elements (FEE): predicates evoking the situation
Predicate-argument level normalizations
• Berkeley FrameNet Project 1
  – Database of frames for the core lexicon of English
  – 800 frames, 10,000 lemmas, 135,000 annotated sentences
(1) http://framenet.icsi.berkeley.edu
Both surface forms normalize to the same predicate-argument structure:
“Evelyn spoke about her past” / “Evelyn’s statement about her past”
STATEMENT(SPEAKER: Evelyn; TOPIC: her past)
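A frame instance like the one above can be represented as a tiny data structure. This is a hypothetical sketch for illustration only, not the SALSA/TIGER XML format actually used by FATE:

```python
from dataclasses import dataclass

@dataclass
class Frame:
    """A frame instance: the evoked frame plus its role fillers."""
    name: str
    elements: dict  # Frame Element name -> text span filling it

# Both surface forms ("spoke" / "statement") evoke the same frame.
spoke = Frame("STATEMENT", {"SPEAKER": "Evelyn", "TOPIC": "her past"})
statement = Frame("STATEMENT", {"SPEAKER": "Evelyn", "TOPIC": "her past"})
assert spoke == statement  # identical at the predicate-argument level
```

The point of the normalization is exactly this equality: a verb and its nominalization become indistinguishable once mapped to frames.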
Textual Entailment (TE)
Given two text fragments, the Text T and the Hypothesis H, T entails H if the meaning of H can be inferred from the meaning of T, as it would typically be interpreted by people [Dagan 2005]
T: “Yahoo has recently acquired Overture”
H: “Yahoo owns Overture”
(T entails H)
• Recognizing Textual Entailment (RTE)
  – recognize whether entailment holds for a given (T, H) pair
  – models core inferences of many NLP applications (QA, IE, MT, …)
• RTE Challenges [Dagan et al., 2005; Giampiccolo et al., 2007]
  – compare systems for RTE
  – corpus: 800 training pairs, 800 test pairs, evenly split into positive and negative pairs
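The shallow baseline that frame-based systems are later compared against can be sketched as a word-overlap score over a (T, H) pair. This is a generic illustration of the idea, not any specific challenge system:

```python
def lexical_overlap(text: str, hypothesis: str) -> float:
    """Fraction of hypothesis tokens that also occur in the text,
    a common shallow baseline for RTE."""
    t_tokens = set(text.lower().split())
    h_tokens = hypothesis.lower().split()
    matched = sum(1 for tok in h_tokens if tok in t_tokens)
    return matched / len(h_tokens)

t = "Yahoo has recently acquired Overture"
h = "Yahoo owns Overture"
score = lexical_overlap(t, h)  # 2 of 3 hypothesis tokens overlap
```

Note that the baseline scores this positive pair below 1.0 precisely because "owns" and "acquired" do not match at the word level; that lexical gap is what predicate-level resources are meant to bridge.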
Predicate-argument structure and RTE
• Predicate-level inference plays a relevant role in TE (20% of positive examples in RTE-2 [Garoufi, 2007])

T: An avalanche has struck a popular skiing resort in Austria, killing at least 11 people.
H: Humans died in an avalanche.
DEATH(PROTAGONIST: 11 people / humans; CAUSE: avalanche / avalanche)

• Implementation gap:
  – [Burchardt et al., 2007]: a FrameNet-based system performs only comparably to lexical overlap
  – [Hickl et al., 2006]: PropBank-based features are not effective
  – [Raina et al., 2005]: the DIRT paraphrase repository does not help
FATE corpus
FATE: a manually frame-annotated Textual Entailment corpus, built to study the role of frame semantics in RTE.
• Reference corpus: RTE-2 test set, 800 pairs, 29,000 tokens
• Frame resource: FrameNet version 1.3
• Corpus format: SALSA/TIGER XML [Burchardt et al., 2006]
• Pre-processing: annotation on top of the Collins parser syntactic analysis; T and H are randomly reordered to avoid biases
• Annotation: performed by one highly experienced annotator, using the SALTO tool 1; inter-annotator agreement measured on 5% of the corpus:
  – FEE agreement: 82%
  – Frame agreement: 88%
  – Role agreement: 91%
(1) http://www.coli.uni-saarland.de/projects/salsa/salto/doc
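Assuming the figures above are plain percent agreement, they can be computed as the share of identical decisions between the two annotators. A minimal sketch with toy labels (not the actual FATE annotations):

```python
def percent_agreement(ann_a: list, ann_b: list) -> float:
    """Share of items on which two annotators made the same decision."""
    assert len(ann_a) == len(ann_b)
    matches = sum(1 for a, b in zip(ann_a, ann_b) if a == b)
    return matches / len(ann_a)

# Toy example: frame labels assigned by two annotators to five FEEs.
a = ["DEATH", "KIDNAPPING", "STATEMENT", "DETAIN", "PEOPLE"]
b = ["DEATH", "KIDNAPPING", "STATEMENT", "LEADERSHIP", "PEOPLE"]
percent_agreement(a, b)  # 0.8
```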
FATE annotation process: an example
(Three slides step through the same pair: starting from the Collins syntactic analysis of T and H, the annotator first marks each FEE and its frame, then the FEs and their fillers. Annotation is full-text: all words are considered [Ruppenhofer, 2007].)

Maximization principle: choose the largest constituent possible when annotating.
Annotation Schema
Relevance Principle
• Intuition: annotate as FEE only those words evoking a relevant situation (frame) in the sentence at hand
  – Very intuitive flavor, but high agreement: 83% on a pilot set of 15 sentences

“Authorities in Brazil hold 200 people as hostage”
Candidate frames include LEADERSHIP, DETAIN, PEOPLE and KIDNAPPING; only the relevant KIDNAPPING frame is annotated, with the roles PERPETRATOR, VICTIM and PLACE.
Annotation Schema
Span Annotation
• On T of positive pairs, annotate only the fragments (spans) contributing to the inferential process
  – Spans are obtained from the ARTE annotation [Garoufi, 2007]
  – For negative pairs it is not straightforward to derive spans, hence we do full annotation

T: “Soon after the EZLN had returned to Chiapas, Congress approved a different version of the COCOPA Law, which did not include the autonomy clauses, claiming they were in contradiction with some constitutional rights (private property and secret voting); this was seen as a betrayal by the EZLN and other political groups.”
H: “EZLN is a political group.”
Annotation Schema
Other guidelines
• Unknown frames: use an UNKNOWN frame for words evoking situations not present in the FrameNet database
• Anaphora
• Copula and support verbs
• Modal expressions
• Metaphors
• Existential constructions
• …
Corpus statistics
• Annotated pairs: 800 (400 positive, 400 negative)
• Annotated frames: 4,500
  – avg. 5.6 frames per pair
  – 1,600 frames in positive pairs, 2,800 in negative pairs
• Annotated roles: 9,500
  – avg. 2.1 roles per frame
• Annotation time: 230 hours
  – 90 h for positive pairs (13 min/pair)
  – 140 h for negative pairs (21 min/pair)
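The per-pair times follow directly from the totals (the positive figure comes out as 13.5 minutes, which the slide rounds to 13):

```python
# Deriving the per-pair annotation times from the totals above.
pos_hours, neg_hours, pairs_each = 90, 140, 400

pos_min_per_pair = pos_hours * 60 / pairs_each  # 13.5 min/pair (slide: 13)
neg_min_per_pair = neg_hours * 60 / pairs_each  # 21.0 min/pair
total_hours = pos_hours + neg_hours             # 230 hours
```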
FrameNet and RTE (simple case)
• Syntactic normalization
  – Active / Passive
EDUCATIONAL_TEACHING(STUDENT: ground soldiers / soldiers; MATERIAL: virtual reality / virtual reality)
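The normalization above suggests a simple entailment check: H holds if each of its frames is matched in T by a frame with the same name and compatible role fillers. The following is a hypothetical sketch of that idea (with crude token overlap as the filler test), not the system evaluated in the cited work:

```python
def frame_match(t_frames, h_frames):
    """Predict entailment if every H frame appears in T with the same
    frame name and overlapping role fillers."""
    def fillers_compatible(t_filler, h_filler):
        # Crude filler test: any shared token counts as compatible.
        return bool(set(t_filler.lower().split()) & set(h_filler.lower().split()))

    for h_name, h_roles in h_frames:
        matched = any(
            t_name == h_name
            and all(role in t_roles and fillers_compatible(t_roles[role], filler)
                    for role, filler in h_roles.items())
            for t_name, t_roles in t_frames
        )
        if not matched:
            return False
    return True

t = [("EDUCATIONAL_TEACHING",
      {"STUDENT": "ground soldiers", "MATERIAL": "virtual reality"})]
h = [("EDUCATIONAL_TEACHING",
      {"STUDENT": "soldiers", "MATERIAL": "virtual reality"})]
frame_match(t, h)  # True: H's frame and fillers are covered by T
```

On this active/passive pair the frame representation abstracts away the voice alternation, so the match succeeds where pure surface comparison would struggle.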
Implementation gap insights
(1) Resource coverage is too low
(2) Models for predicate-argument inference are weak
(3) Automatic annotation models (SRL) are not good enough to be safely used in RTE
• FrameNet coverage is good:
  – 373 UNKNOWN frames (8% of total frames)
  – unknown roles: 1% of total roles
• Coverage is unlikely to be a limiting factor for using FrameNet in applications
• This speaks against explanation (1); the remaining candidates are weak predicate-argument inference models (2) and insufficient SRL quality (3)
Why should you use FATE?
• To better study predicate-argument inference in RTE
• To experiment with frame-based RTE models on a gold-standard corpus
• To learn better SRL models, by training on FATE

The corpus is freely available on-line.

Thank you! Questions?
FATE download: http://www.coli.uni-saarland.de/projects/salsa/fate
pennacchiotti@coli.uni-sb.de
www.coli.uni-saarland.de/~pennacchiotti