
A Corpus-based Study of Temporal Signals

Jan 26, 2015


Leon Derczynski

Automatic temporal ordering of events described in discourse has been of great interest in recent years. Event orderings are
conveyed in text via various linguistic mechanisms including the use of expressions such as “before”, “after” or “during”
that explicitly assert a temporal relation – temporal signals. We investigate the role of temporal signals in temporal relation extraction and provide a quantitative analysis of these expressions in the TimeBank annotated corpus.
Transcript
Page 1: A Corpus-based Study of Temporal Signals

Introduction Temporal links Temporal signals Improving annotation Summary

A Corpus-based Study of Temporal Signals

Leon Derczynski

University of Sheffield

20 July, 2011


Page 2: A Corpus-based Study of Temporal Signals


Outline

1 Introduction

2 Temporal links

3 Temporal signals

4 Improving annotation

5 Summary


Page 3: A Corpus-based Study of Temporal Signals


Motivation

Language for time helps us describe:

changes

planning

history

Time is not always explicit in natural language – we don’t include a timestamp with every action

Goals:

Try to automatically extract temporal information from documents, so that we can build a model that connects information in a text with time


Page 4: A Corpus-based Study of Temporal Signals


Temporal Entities

What elements can we try to extract from discourse?

Each document might contain:

Basic primitives:

Events – occurrences, states, reports

Times – dates and times, durations, sets

Linkages between primitives:

general temporal link

aspectual links and subordination

We can use the basic primitives as nodes on a graph, and links as its arcs.
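
A minimal sketch of this graph view, in Python. The class and method names (Entity, TemporalGraph, add_link, links_from) and the example identifiers are assumptions made for illustration; they are not part of TimeML or TimeBank. Only the relation labels follow TimeML naming.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A temporal primitive: an event or a time expression."""
    id: str
    kind: str   # "EVENT" or "TIME"
    text: str


@dataclass
class TemporalGraph:
    """Entities are nodes; labelled temporal links are directed arcs."""
    nodes: dict = field(default_factory=dict)
    arcs: list = field(default_factory=list)

    def add_entity(self, entity):
        self.nodes[entity.id] = entity

    def add_link(self, source_id, relation, target_id):
        # Refuse arcs whose endpoints are not known nodes.
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("both endpoints must be added before linking")
        self.arcs.append((source_id, relation, target_id))

    def links_from(self, source_id):
        """Return (relation, target) pairs for arcs leaving source_id."""
        return [(rel, tgt) for src, rel, tgt in self.arcs if src == source_id]


# Hypothetical example: "After I ate, I went to bed", plus an invented time.
graph = TemporalGraph()
graph.add_entity(Entity("e1", "EVENT", "ate"))
graph.add_entity(Entity("e2", "EVENT", "went to bed"))
graph.add_entity(Entity("t1", "TIME", "9 o'clock"))
graph.add_link("e2", "AFTER", "e1")
graph.add_link("e1", "IS_INCLUDED", "t1")
print(graph.links_from("e2"))
```

Storing arcs as (source, relation, target) triples keeps the link label separate from the entities, which is what a link-labelling system would later have to predict.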


Page 6: A Corpus-based Study of Temporal Signals


Temporal link labelling

How do we label the links between temporal entities?

First, choose a relation set: TimeML gives us 13, including before, simultaneous, includes, ...

Some relations have transitive and commutative properties:

If “a before b” and “b before c” then we can infer “a before c”

This means that consistency can be important

Develop a gold-standard corpus – TimeBank
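
The inference rule for “before” can be sketched as a transitive closure, which also gives a simple consistency check. This is an illustrative sketch only: the function names are assumptions, and it handles a single relation rather than the full TimeML relation set.

```python
from itertools import product


def before_closure(pairs):
    """Transitive closure of a set of (a, b) pairs meaning 'a before b'."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def is_consistent(pairs):
    """Inconsistent if the closure places any event before itself."""
    return all(a != b for a, b in before_closure(pairs))


annotated = {("a", "b"), ("b", "c")}
print(("a", "c") in before_closure(annotated))   # True: inferred link
print(is_consistent(annotated | {("c", "a")}))   # False: a cycle appears
```

A single contradictory link makes the whole chain inconsistent, which is why consistency matters when building a gold-standard corpus.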


Page 7: A Corpus-based Study of Temporal Signals


Automated temporal link labelling

How can we automatically label links?

Machine learning approaches: learn how to label a link based on the times and events it may connect

Use TimeBank and other annotated corpora as examples of how links are labelled

A difficult task: notable research efforts, including various evaluation exercises, have attempted it

Overall accuracy remains around 60%–70%: too low¹

¹ See Chambers & Jurafsky, 2008; Mirroshandel et al., 2010; TempEval-2010

Page 8: A Corpus-based Study of Temporal Signals


Source of temporal linking information

What information can we use to label links?

If a human can manage to understand temporal relations, the information must be somewhere

Possible sources:

– tense and aspect

– world knowledge

– discourse structure

– specific time information (at 9 o’clock)

– explicit signals: temporal conjunctions


Page 10: A Corpus-based Study of Temporal Signals


Temporal conjunctions

Are these words/phrases useful for automatic understanding?

A baseline system could learn to label links with 62% accuracy

With simple modification, links in TimeBank that had associated signals could be annotated with 83% accuracy

Clear indication that signals are an accessible source of temporal information


Page 11: A Corpus-based Study of Temporal Signals


Temporal conjunctions in newswire

What do temporal conjunctions look like in TimeBank?

11.2% of temporal links are annotated as having one (718 instances)

Top words:

– prepositions (in, for, on)

– conjunctions (after, before, since)


Page 12: A Corpus-based Study of Temporal Signals


Temporal conjunctions in newswire

Phrase         Corpus freq.   Occurrences as signal   Likelihood of being a signal
subsequently   3              3                       100%
after          72             67                      93%
follows        4              3                       75%
before         33             23                      70%
until          36             25                      69%
during         19             13                      68%
as soon as     3              2                       67%

Table: A sample of phrases most likely to be annotated as a signal when they occur in TimeBank, which occur more than once in the corpus.
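
The likelihood column is the number of occurrences annotated as a signal divided by the corpus frequency. A short sketch that recomputes it from the counts in the table above (the variable and function names are illustrative assumptions):

```python
# (phrase, corpus frequency, occurrences annotated as a signal), from the table
counts = [
    ("subsequently", 3, 3),
    ("after", 72, 67),
    ("follows", 4, 3),
    ("before", 33, 23),
    ("until", 36, 25),
    ("during", 19, 13),
    ("as soon as", 3, 2),
]


def signal_likelihood(as_signal, corpus_freq):
    """Fraction of a phrase's corpus occurrences that are annotated as signals."""
    return as_signal / corpus_freq


for phrase, freq, as_signal in counts:
    print(f"{phrase:<14}{signal_likelihood(as_signal, freq):.0%}")
```

Rounded to whole percentages, these ratios match the last column of the table.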


Page 13: A Corpus-based Study of Temporal Signals


Discrimination of temporal signal words

What else are these temporal signal words used for?

Some words are very likely to have a temporal sense:

subsequently – 3 instances, all temporal;

after – 72 instances, 93% temporal.

Other words are versatile:

from – 366 instances, 5% temporal;

between – 33 instances, 1 temporal.


Page 14: A Corpus-based Study of Temporal Signals


Signal-to-link relations

What temporal relations do these words signify?

after doesn’t always signify a temporal after relation

Word order is important

After I ate, I went to bed

I ate after I went to bed

Signal phrase   TimeML relation   Frequency
after           AFTER             56
after           ENDS              6
after           BEGINS            4
after           IAFTER            1
already         BEFORE            6
already         INCLUDES          4
already         IS_INCLUDED       3
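
To see how strongly a signal phrase predicts one relation, the frequencies listed for each phrase can be turned into proportions. The sketch below uses only the rows given on this slide, so the shares are relative to those rows rather than to every use in the corpus; the names are illustrative assumptions.

```python
from collections import defaultdict

# (signal phrase, TimeML relation, frequency), from the rows on this slide
rows = [
    ("after", "AFTER", 56),
    ("after", "ENDS", 6),
    ("after", "BEGINS", 4),
    ("after", "IAFTER", 1),
    ("already", "BEFORE", 6),
    ("already", "INCLUDES", 4),
    ("already", "IS_INCLUDED", 3),
]

by_signal = defaultdict(dict)
for phrase, relation, freq in rows:
    by_signal[phrase][relation] = freq


def relation_distribution(phrase):
    """Share of each relation among the listed uses of a signal phrase."""
    freqs = by_signal[phrase]
    total = sum(freqs.values())
    return {rel: n / total for rel, n in freqs.items()}


for phrase in by_signal:
    dist = relation_distribution(phrase)
    top = max(dist, key=dist.get)
    print(f"{phrase}: most frequent relation {top} ({dist[top]:.1%})")
```

On these rows, “after” is dominated by a single relation while “already” is spread more evenly, so a signal word alone does not always determine the link label.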


Page 15: A Corpus-based Study of Temporal Signals


Signal class

How can we characterise temporal signals?

Signals are likely to belong to a closed class of words

Common prepositions as seen earlier

Some adverbs – previously, subsequently

Set phrases – as soon as, so far


Page 16: A Corpus-based Study of Temporal Signals


Spatial/Temporal overlap

Time and space are related and events are constrained in terms of both

Language for space and time has some similarities

before has both temporal and spatial senses

Spatially annotated corpora – SpatialML

Relative spatial links in this corpus are much more likely to employ a signal (97.5%)

Possible explanation – temporal language is more diverse (tense, auxiliaries)


Page 18: A Corpus-based Study of Temporal Signals


Re-annotation

Are these signals correctly annotated in TimeBank?

Manual examination: start with words that are likely to be temporal signals

before: found 33 times in the corpus, 23 are signals

Many under-annotated cases:

before the war began

was scheduled to return to port before hostilities erupted


Page 19: A Corpus-based Study of Temporal Signals


Re-annotation

How could we improve signal annotation?

Linguistic description of temporal conjunctions may be weak

Annotation guidelines may be insufficient

Solution: provide an enhanced signal description, and revise TimeBank accordingly


Page 20: A Corpus-based Study of Temporal Signals


Formal signal description

A temporal signal is a word that indicates the type of temporal relation between two intervals

Signal surface forms have a head and an optional quantifier

shortly after – quantified temporal signal

Temporal signals have exactly two arguments (events and/or times)

One argument may be implicit (e.g. for “Later”)
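
This description maps naturally onto a small record type. The sketch below is one possible encoding, not the annotation schema used for TimeBank: the class name, field names and entity identifiers are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TemporalSignal:
    head: str                    # e.g. "after"
    quantifier: Optional[str]    # e.g. "shortly"; None when absent
    first_arg: Optional[str]     # id of an event or time; None if implicit
    second_arg: Optional[str]    # id of an event or time; None if implicit

    def __post_init__(self):
        # Two argument slots always exist; at most one may be left implicit.
        if self.first_arg is None and self.second_arg is None:
            raise ValueError("at most one argument may be implicit")

    @property
    def surface_form(self):
        """Quantifier (if any) followed by the head."""
        return f"{self.quantifier} {self.head}" if self.quantifier else self.head

    @property
    def has_implicit_argument(self):
        return self.first_arg is None or self.second_arg is None


# Hypothetical instances with made-up entity ids.
shortly_after = TemporalSignal("after", "shortly", "e1", "e2")
later = TemporalSignal("later", None, "e3", None)
print(shortly_after.surface_form, later.has_implicit_argument)
```

Keeping the quantifier separate from the head lets “after” and “shortly after” be treated as the same signal word when counting or classifying.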


Page 21: A Corpus-based Study of Temporal Signals


Augmented TimeBank

We examined 30 of the most frequent signal words and phrases that were not annotated as temporal

This comprised around 1,000 instances in text

We annotated any missed temporal signals, including EVENT and TLINK annotations where required

This resulted in 15.8% of TLINKs using a signal


Page 23: A Corpus-based Study of Temporal Signals


Conclusion

Temporal signals are a usable and important source of information

We have provided a definition for temporal signals

Existing corpora have been upgraded with better annotation


Page 24: A Corpus-based Study of Temporal Signals


Future work

Automatic signal discrimination

Signal association

Applying findings to spatial language


Page 25: A Corpus-based Study of Temporal Signals


Thank you. Are there any questions?
