ANNOTATING EVENT ANAPHORA: A CASE STUDY Tommaso Caselli and Irina Prodanof ILC-CNR, Pisa [email protected] [email protected] LREC-10 – May, 19th, La Valletta, Malta

Dec 31, 2015

Page 1: ANNOTATING EVENT ANAPHORA: A CASE STUDY

ANNOTATING EVENT ANAPHORA:

A CASE STUDY

Tommaso Caselli and Irina Prodanof

ILC-CNR, Pisa

[email protected] [email protected]

LREC-10 – May, 19th, La Valletta, Malta

Page 2: ANNOTATING EVENT ANAPHORA: A CASE STUDY

Outline

- Motivations
- Coreference annotation in TimeML
- Annotating event anaphora: a preliminary scheme
- Annotation methodology and results
- Lessons learned and future work

Page 3: ANNOTATING EVENT ANAPHORA: A CASE STUDY

Motivations

Eventualities represent the building blocks of the informative content of a document

Eventualities give rise to relations which create a rich informative network:
- temporal relations
- sharing of participants
- factivity
- coreferential relations

Coreferential relations among eventualities play an important role in facilitating access to content and extracting relevant information

Page 4: ANNOTATING EVENT ANAPHORA: A CASE STUDY

Coref. in TimeML

TimeML & ISO-TimeML are standards for the annotation of events, temporal expressions and a set of relations between these entities (temporal, subordinating and aspectual relations)

Main contribution of TimeML: standard definition of event and methodology for its annotation

It-TimeML: Italian adaptation of TimeML (updated version on request) and part of ISO-TimeML

It-TimeML is currently used for the creation of the Italian TimeBank (172 news articles from ISST, PAROLE and Web, 67,140 tokens)

Page 5: ANNOTATING EVENT ANAPHORA: A CASE STUDY

Coref. in TimeML (2)

TimeML tags involved: EVENT and TLINK (temporal link)

TimeML has no specific link for coreference

Annotation workaround: use of a special value of the TLINK tag: “identity”

“identity” is used to:
- connect two tokens which are part of a single event instance (e.g. light verbs)
- connect events which stand in a coreferential relation, namely set-subset

Page 6: ANNOTATING EVENT ANAPHORA: A CASE STUDY

Coref. in TimeML (3) – Use of “identity”

fare la spesa [to do shopping].

<EVENT id="e1">fare</EVENT> la <EVENT id="e2">spesa</EVENT>
<TLINK lid="l1" eventInstanceID="e1" relatedToEventInstance="e2" relType="IDENTITY"/>

Page 7: ANNOTATING EVENT ANAPHORA: A CASE STUDY

Coref. in TimeML – Use of “identity” (3)

La sessione privata servira’ a tre [adempimenti]j. Innanzitutto, all’[approvazione]j della proposta di Abete (ISST sole006).

The private session will be used for three [fulfillments]j. First, the [approval]j of the proposal of Abete.

La <EVENT id="e1">sessione</EVENT> privata <EVENT id="e2">servira’</EVENT> a tre <EVENT id="e3">adempimenti</EVENT>. <SIGNAL id="s1">Innanzitutto</SIGNAL>, all’<EVENT id="e4">approvazione</EVENT> della <EVENT id="e5">proposta</EVENT> di Abete.

<TLINK lid="l1" eventInstanceID="e4" relatedToEventInstance="e3" relType="IDENTITY"/>
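The “identity” workaround can be made concrete with a small sketch that reads a TimeML-style fragment and lists the event pairs connected by an IDENTITY link. This is only an illustration: the `<TimeML>` wrapper element and the function name `identity_pairs` are assumptions added so that the fragment parses, and the TLINK points directly at EVENT ids, as on the slides.

```python
import xml.etree.ElementTree as ET

# The <TimeML> root is an assumption added here so the fragment is well-formed XML;
# the EVENT and TLINK elements follow the "fare la spesa" example.
FRAGMENT = """<TimeML>
<EVENT id="e1">fare</EVENT> la <EVENT id="e2">spesa</EVENT>
<TLINK lid="l1" eventInstanceID="e1" relatedToEventInstance="e2" relType="IDENTITY"/>
</TimeML>"""


def identity_pairs(xml_text):
    """Return (source_text, target_text) for every TLINK whose relType is IDENTITY."""
    root = ET.fromstring(xml_text)
    # Map each event id to the token(s) it annotates.
    events = {ev.get("id"): (ev.text or "") for ev in root.iter("EVENT")}
    pairs = []
    for link in root.iter("TLINK"):
        if link.get("relType") == "IDENTITY":
            source = events[link.get("eventInstanceID")]
            target = events[link.get("relatedToEventInstance")]
            pairs.append((source, target))
    return pairs


print(identity_pairs(FRAGMENT))
```

Note that the link itself does not say whether the pair is a light-verb construction or a set-subset relation: both uses produce the same IDENTITY value.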

Page 8: ANNOTATING EVENT ANAPHORA: A CASE STUDY

Coref. in TimeML (4)

The use of the value “identity” is not satisfactory since it is NOT homogeneous

During the (current!) annotation effort for the creation of the Italian TimeBank we have observed that this value could be applied to other cases such as:
- synonyms
- hypernyms
- coreference (strict coreference – same referent in the world)

Page 9: ANNOTATING EVENT ANAPHORA: A CASE STUDY

Event Anaphora

Previous works: Hasler et al. 2006; Bejan & Harabagiu 2008

- Hasler et al. 2006: only NP coreference (strict definition); detailed guidelines, but NO specifications for the annotation; which events? ACE event frames (LIFE, CONFLICT, MOVEMENT, JUSTICE, ...); TimeML compliant
- Bejan & Harabagiu 2008: event coreference as a side effect of event structure. Event coreference is considered when two predicates express the same predicate, synonyms or hypernyms and share the same arguments; TimeML compliant

Page 10: ANNOTATING EVENT ANAPHORA: A CASE STUDY

Event Anaphora - Methodology (2)

Our approach:
- no event frames nor event templates; all instances of events annotated in the Italian TimeBank (TimeML compliant)
- open-domain text/discourse
- coarse-grained, bottom-up approach in the definition of the annotation scheme
- reduced and limited set of guidelines
- active discovery of what is needed through annotation and observations from the data
- event anaphora: strict coreference + indirect coreference

Page 11: ANNOTATING EVENT ANAPHORA: A CASE STUDY

Event Anaphora - Annotation scheme (3)

TAGS        ATTRIBUTES
MARKABLE    ID, POS, DEFINITENESS, CLASS
EMPTY       ID
TOPIC       ID
LINK        ID, ANAPHORTYPE, SRC

<MARKABLE> = <EVENT>, BUT extended: it also includes the annotation of pronouns and adverbs.

Page 12: ANNOTATING EVENT ANAPHORA: A CASE STUDY

Event Anaphora - Annotation scheme (4)

<EMPTY> = to annotate cases of zero anaphora and ellipsis (frequent in Italian)

<TOPIC> = to annotate entire portions of text; it provides an anchor for those linguistic entities which can refer to a discourse topic

“Stiamo ancora parlando, come certamente deve essere, e continueremo a consultarci”j . James Baker, segretario al Tesoro americano, ha commentato cosi’j i risultati dell’assemblea. (ISST els019)

“[We are still speaking, as it should be, and we will keep consulting]”j . James Baker, the American Treasury Secretary, commented [thus]j on the results of the assembly.

Page 13: ANNOTATING EVENT ANAPHORA: A CASE STUDY

Event Anaphora - Annotation scheme (4)

<EMPTY> = to annotate cases of zero anaphora and ellipsis (frequent in Italian)

<TOPIC> = to annotate entire portions of text; it provides an anchor for those linguistic entities which can refer to a discourse topic

<LINK> = marks up an anaphoric relation. The attribute “anaphorType” specifies the type of anaphoric relation; “src” marks the anchor
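The four tags and their attributes can be pictured as a small data model. The sketch below is an assumption-laden illustration, not the authors' implementation: the class and field names, the example attribute values and the helper `dangling_links` are all hypothetical, and the slides do not say how a LINK is attached to its anaphor, so only the `src` (anchor) side is checked.

```python
from dataclasses import dataclass


@dataclass
class Markable:
    id: str
    pos: str
    definiteness: str
    event_class: str  # the CLASS attribute ("class" is a reserved word in Python)


@dataclass
class Empty:
    id: str  # zero anaphora or ellipsis


@dataclass
class Topic:
    id: str  # an entire portion of text acting as anchor


@dataclass
class Link:
    id: str
    anaphor_type: str  # ANAPHORTYPE
    src: str           # SRC: id of the anchor element


def dangling_links(links, elements):
    """Return the ids of links whose anchor (src) is not among the annotated elements."""
    known = {element.id for element in elements}
    return [link.id for link in links if link.src not in known]


# Hypothetical ids and values, loosely modelled on the "cosi'" example.
elements = [Topic(id="t1"), Markable(id="m1", pos="ADV", definiteness="none", event_class="none")]
links = [Link(id="l1", anaphor_type="indirect", src="t1"),
         Link(id="l2", anaphor_type="strict", src="t9")]
print(dangling_links(links, elements))
```

A check of this kind is one simple way to catch links that point at an anchor no annotator has marked.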

Page 14: ANNOTATING EVENT ANAPHORA: A CASE STUDY

Event Anaphora – Results (5)

- Annotation tool: PALinkA (Orasan, 2003)
- 3 annotators / 1,792 tokens
- no K scores
- Low agreement on the identification of anaphora, but relatively good agreement on the anchors
- More specific guidelines and information are needed
- Event anaphora is a widespread phenomenon
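No K scores are reported for this experiment. For reference only, a kappa statistic for one pair of annotators could be computed as in the sketch below. This is Cohen's kappa for two annotators, not a measure used by the authors, and the labels are toy values invented for the illustration, not data from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's own label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one and the same label throughout;
        # returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy labels (hypothetical): did the annotator mark the token as an anaphor?
toy_a = ["anaphor", "anaphor", "none", "none"]
toy_b = ["anaphor", "none", "none", "none"]
print(cohen_kappa(toy_a, toy_b))
```

With three annotators, one option is to average the pairwise values; a multi-rater statistic such as Fleiss' kappa is another.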

Page 15: ANNOTATING EVENT ANAPHORA: A CASE STUDY

Lessons Learned and Future Work

- Event anaphora is a widespread phenomenon which must be addressed in separate tasks
  - relations between full events: N, V, PP and Adj
  - no pronominal anaphora
- New annotation scheme:
  - 2 tags: <EVENT> and <AnafLink>
  - different attributes for <EVENT>: FACTIVITY, GENERICITY, POLARITY
  - relations between particular events according to the attributes' values
  - reduced types of anaphors (two values: direct vs. indirect)
- Tracking of the participants: how to?
- Event anaphora annotation as a further link in TimeML or as a separate task which can be built upon the TimeML annotation
- New tool: BAT (thanks to Marc Verhagen)
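As a rough sketch of what a link in the proposed two-tag scheme might look like, the snippet below builds an `<AnafLink>` element and enforces the two anaphor types named on the slide. Only the tag name and the values direct/indirect come from the slides; the attribute names `id`, `anaphor`, `anchor` and `type`, the function `make_anaf_link` and the ids used are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# The two values named on the slide.
ANAPHOR_TYPES = {"direct", "indirect"}


def make_anaf_link(link_id, anaphor_id, anchor_id, link_type):
    """Build an <AnafLink> element; attribute names are hypothetical."""
    if link_type not in ANAPHOR_TYPES:
        raise ValueError(f"unknown anaphor type: {link_type!r}")
    return ET.Element(
        "AnafLink",
        {"id": link_id, "anaphor": anaphor_id, "anchor": anchor_id, "type": link_type},
    )


# Arbitrary ids, chosen only to show the shape of the element.
link = make_anaf_link("a1", "e2", "e1", "direct")
print(ET.tostring(link, encoding="unicode"))
```

Keeping the type as a closed set of two values mirrors the stated goal of reducing the number of anaphor types.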

Page 16: ANNOTATING EVENT ANAPHORA: A CASE STUDY

Lessons Learned and Future Work - Example

Page 17: ANNOTATING EVENT ANAPHORA: A CASE STUDY

Lessons Learned and Future Work - Example

Page 18: ANNOTATING EVENT ANAPHORA: A CASE STUDY

Thank you!