Introduction Concepts and tools Relation Extraction Temporal Signals Modelling Tense Conclusion Determining the Types of Temporal Relations in Discourse Leon Derczynski University of Sheffield 5 March, 2013 Leon Derczynski University of Sheffield Determining the Types of Temporal Relations in Discourse
Page 1: Determining the Types of Temporal Relations in Discourse

Introduction Concepts and tools Relation Extraction Temporal Signals Modelling Tense Conclusion

Determining the Types of Temporal Relations in Discourse

Leon Derczynski

University of Sheffield

5 March, 2013

Page 2: Determining the Types of Temporal Relations in Discourse

The Role of Time

Why is time important in language processing?

World state changes constantly

Every empirical assertion has temporal bounds

“The sky is blue”, but it was not always

Without it, naïve knowledge extraction will fail (given an Almanac of Presidents, who is President?)

By understanding temporal information, you will do better knowledge extraction.

Overall goal

How do we automatically understand temporal information in natural languages?

Page 3: Determining the Types of Temporal Relations in Discourse

Temporal Information Extraction

Existing state of the art

How can we categorise types of temporal information?

Events – e.g. occurrences, states

Temporal expressions (timexes) – e.g. dates, durations

Links – relations between pairs of events or times

Supporting texts – e.g. action cardinality, event ordering

We develop and use ISO-TimeML to annotate these entities.

Main dataset: TimeBank (about 180 annotated documents)

Page 4: Determining the Types of Temporal Relations in Discourse

TimeML

Organizers

Organizers <EVENT eid="e2120" class="REPORTING">state</EVENT> the
<TIMEX3 tid="t29" type="DURATION" value="P2D" temporalFunction="false" functionInDocument="NONE">two days</TIMEX3>
of music, dancing, and speeches is <EVENT eid="e2123" class="I_STATE">expected</EVENT> to
<EVENT eid="e13" class="OCCURRENCE">draw</EVENT> some two million people.

<TLINK eventID="e2123" relatedToTime="t29" relType="BEFORE"/>
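As a rough illustration of how such an annotation can be consumed, the fragment above can be read with Python's standard XML parser. This is only a sketch: the wrapping `<TimeML>` root element and the variable names are assumptions added so that the fragment parses on its own, and the attribute names are copied from the slide as shown.

```python
import xml.etree.ElementTree as ET

# The slide's fragment, wrapped in an assumed <TimeML> root so it is well-formed XML.
SNIPPET = """<TimeML>Organizers
<EVENT eid="e2120" class="REPORTING">state</EVENT> the
<TIMEX3 tid="t29" type="DURATION" value="P2D" temporalFunction="false" functionInDocument="NONE">two days</TIMEX3>
of music, dancing, and speeches is
<EVENT eid="e2123" class="I_STATE">expected</EVENT> to
<EVENT eid="e13" class="OCCURRENCE">draw</EVENT> some two million people.
<TLINK eventID="e2123" relatedToTime="t29" relType="BEFORE"/>
</TimeML>"""

root = ET.fromstring(SNIPPET)

# Events: id -> (class, annotated text)
events = {e.get("eid"): (e.get("class"), e.text) for e in root.iter("EVENT")}

# Temporal expressions: id -> (type, normalised value, annotated text)
timexes = {t.get("tid"): (t.get("type"), t.get("value"), t.text)
           for t in root.iter("TIMEX3")}

# Temporal links as (source, relation type, target) triples
links = [(l.get("eventID"), l.get("relType"), l.get("relatedToTime"))
         for l in root.iter("TLINK")]

for source, rel, target in links:
    print(events[source][1], rel, timexes[target][2])
```

The three collections correspond to the entity types listed earlier: events, timexes, and the links between them.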

Page 5: Determining the Types of Temporal Relations in Discourse

Times and Events

What are temporal expressions?

They refer to a time

Subtasks: recognition and interpretation; SotA recognition is 0.86 F1

What do we consider as events?

Verbal, nominal

State of the art: 0.90 F1 for recognition

Doesn’t cover complex structure; e.g. a music festival

Events are not very useful unless related to other temporal entities

How can we describe this structural complexity?

Start by modeling the document as a graph
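One possible reading of "modeling the document as a graph" is sketched below: events and timexes become nodes, labelled links become edges, and further orderings follow by transitivity. The identifiers `e1`, `e2`, `t1` and the function name `before_closure` are hypothetical, and only BEFORE/AFTER links are handled here.

```python
from collections import defaultdict

def before_closure(links):
    """Return every (x, y) pair such that x BEFORE y follows from the given links."""
    successors = defaultdict(set)
    for x, rel, y in links:
        if rel == "BEFORE":
            successors[x].add(y)
        elif rel == "AFTER":          # x AFTER y is stored as y BEFORE x
            successors[y].add(x)

    closure = set()
    for start in list(successors):
        seen, stack = set(), list(successors[start])
        while stack:                  # depth-first walk over BEFORE edges
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(successors.get(node, ()))
        closure |= {(start, node) for node in seen}
    return closure

# Hypothetical document: two events anchored to the same time expression.
links = [("e1", "BEFORE", "t1"), ("e2", "AFTER", "t1")]
closure = before_closure(links)
print(sorted(closure))
```

Here the ordering of `e1` and `e2` is never stated directly, but it follows from their links to `t1`.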

Page 6: Determining the Types of Temporal Relations in Discourse

Temporal relations

What are temporal relations?

They describe the links between times and events

Can capture both complex and partial orderings

What kinds of temporal relation are there?

1 Interval (before, after, included by, simultaneous)

2 Subordinate (reported speech, modal, conditional)

3 Aspectual (start, culmination – see Vendler, Comrie)

This work is concerned with the coarsest-grained information: the first category

Page 7: Determining the Types of Temporal Relations in Discourse

Problem Definition

How are these relations represented?

Temporal interval algebra (Allen 1984) – a set of 13 basic relations between a pair of intervals

TimeML defines a set of relation types and also types of interval

What is our problem?

Assume discourse w/ perfect event and timex annotations

In fact, assume we know which intervals to link!

“Given an ordered pair of intervals (arg1, arg2), which relation in the set R_Allen describes them?”
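To make the label set concrete, here is a minimal sketch of how an Allen-style relation could be read off two intervals whose endpoints are known numbers. That is a strong assumption: in real discourse the endpoints are rarely known, which is why the task is treated as classification. The function name and the relation names are the conventional ones, not taken from the slides.

```python
def allen_relation(a, b):
    """Name the basic interval relation holding between a and b.

    Each interval is a (start, end) pair with start < end.
    """
    (s1, e1), (s2, e2) = a, b
    if s1 >= e1 or s2 >= e2:
        raise ValueError("intervals must satisfy start < end")
    if e1 < s2:
        return "before"
    if e2 < s1:
        return "after"
    if e1 == s2:
        return "meets"
    if e2 == s1:
        return "met-by"
    if s1 == s2 and e1 == e2:
        return "equals"
    if s1 == s2:
        return "starts" if e1 < e2 else "started-by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    # Only partial overlap remains at this point.
    return "overlaps" if s1 < s2 else "overlapped-by"

print(allen_relation((1, 3), (2, 4)))
```

Swapping the arguments yields the inverse relation, which is why the problem statement speaks of an ordered pair.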

Page 8: Determining the Types of Temporal Relations in Discourse

Relation Extraction

How can relations be labelled?

Machine learning

Using TimeML attributes: some success

Using syntactic relations: matches SotA in tree kernels

What’s the state of the art?

2007: Mani et al.: baseline 56%, system has 61% accuracy

2008: Bethard, Chambers: many sophisticated improvements – ILP, timex-timex ordering. Improved on Mani et al. by 1.5%.

2010: TempEval-2: baseline 58%, best was 65% accuracy

Why do we find this performance ceiling?

Page 9: Determining the Types of Temporal Relations in Discourse

Sources of Temporal Relation Information

What are we missing?

There is a heterogeneous set of temporal information types, including:

Explicit signals – subsequently, as soon as

Linguistic theory offers some models

What is the evidence these two types will help?

Conducted failure analysis on TempEval-2 (2010) [1]

Multiple diverse approaches, same dataset

Find the set of difficult links

Characterise information supporting these links

[1] Verhagen et al., 2010: SemEval Task 13 - TempEval-2

Page 10: Determining the Types of Temporal Relations in Discourse

Figure: TempEval-2 relation labelling tasks, showing proportions of relations according to the number of systems that gave correct labels. Panels: Task C (event-timex intra-sentence relations), Task D (event-DCT relations), Task E (main event inter-sentence relations), Task F (event-subordinate intra-sentence relations). Each panel is divided into segments running from “all systems correct” through “1 fails”, “2 fail”, and so on, to “all systems fail”.

Page 11: Determining the Types of Temporal Relations in Discourse

Figure: Proportion of links within a task that are difficult. x-axis: task (C, D, E, F); y-axis: % difficult (0 to 40).

The problem is difficult, and there is a consistently difficult set of links. Perhaps we are ignoring some critical information.
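The failure analysis can be sketched as a simple count of how many systems mislabel each link. Everything in the example is hypothetical: the system names, link ids and predictions are toy data, and the `min_fail_fraction` threshold for calling a link "difficult" is an assumption, since the slides do not state the exact criterion.

```python
from collections import Counter

def failure_counts(gold, predictions):
    """Map each link id to the number of systems that mislabelled it."""
    return {
        link: sum(1 for output in predictions.values() if output.get(link) != label)
        for link, label in gold.items()
    }

def difficult_links(gold, predictions, min_fail_fraction=1.0):
    """Links that at least the given fraction of systems got wrong."""
    n_systems = len(predictions)
    counts = failure_counts(gold, predictions)
    return sorted(link for link, c in counts.items()
                  if c / n_systems >= min_fail_fraction)

# Toy gold standard and three hypothetical systems.
gold = {"l1": "BEFORE", "l2": "AFTER", "l3": "OVERLAP"}
predictions = {
    "sysA": {"l1": "BEFORE", "l2": "BEFORE", "l3": "BEFORE"},
    "sysB": {"l1": "BEFORE", "l2": "AFTER", "l3": "AFTER"},
    "sysC": {"l1": "BEFORE", "l2": "BEFORE", "l3": "VAGUE"},
}

counts = failure_counts(gold, predictions)
histogram = Counter(counts.values())   # number of failing systems -> number of links
print(difficult_links(gold, predictions))
```

The `histogram` corresponds to the kind of breakdown plotted per task: how many links every system got right, how many one system missed, and so on.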

Page 12: Determining the Types of Temporal Relations in Discourse

New sources of ordering information

Next step: manually characterise each “difficult” link.

Attempt to identify what kind of information could be used to label it.

Sources to investigate

Explicit text – signals: “After you pull the pin, throw the grenade”

Tensed relations: “Having eaten, I left”

Page 13: Determining the Types of Temporal Relations in Discourse

Temporal Signals

What are these?

In TimeML, they are text annotated as being helpful to a temporal relation

Used by 12.2% of TimeBank’s relations

Are temporal signals useful?

A resounding yes! 61% → 83% accuracy with simple features [2]

This level of performance on event-event links is above general state-of-the-art

Existing corpora are under-annotated

[2] Derczynski and Gaizauskas, 2010: Using signals for temporal relation classification

Page 14: Determining the Types of Temporal Relations in Discourse

Temporal Signal Annotation

How can we automatically annotate temporal signals?

Define signals formally [3]

Define a closed class of signals

Re-annotate TimeBank

Train discrimination and association

We included dependency information and function tagging.
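The closed-class idea can be sketched as a lookup that proposes candidate signal phrases, leaving the real decision to the trained discrimination step. The phrase list below is illustrative only: “after”, “subsequently” and “as soon as” appear in the slides, while “before” and “when” are added as assumptions, and the actual closed class used in the work is not given here.

```python
import re

# Illustrative closed class; not the list used in the original study.
SIGNAL_PHRASES = ["as soon as", "subsequently", "after", "before", "when"]

# Longest phrases first, so multi-word signals win over any shorter overlap.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(p) for p in sorted(SIGNAL_PHRASES, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)

def signal_candidates(text):
    """Return (character offset, phrase) pairs for possible temporal signals."""
    return [(m.start(), m.group(0).lower()) for m in _PATTERN.finditer(text)]

print(signal_candidates("After you pull the pin, throw the grenade"))
print(signal_candidates("He left as soon as she arrived."))
```

A match is only a candidate: “before” in “before the court” is not temporal, which is the distinction the discrimination classifier has to learn. Association then attaches each accepted signal to the pair of intervals it relates.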

[3] Derczynski and Gaizauskas, 2011: A corpus-based study of temporal signals

Page 15: Determining the Types of Temporal Relations in Discourse

Results

How well did our approach perform?

1 Discrimination: 92% accuracy, 75% accuracy on positives (0.77 IAA)

2 Association: 99% accuracy / 80% error reduction

3 Inductive bias towards independence assumption was harmful (MaxEnt, Naive Bayes)

Results: 16% of links have signals (31% improvement) and can now be labelled at high accuracy.

What remains to be done?

How can we remedy under-annotation at the source?

Clear links to spatial signal annotation (e.g. -LOC tags)

Page 16: Determining the Types of Temporal Relations in Discourse

Reichenbach’s Model of Verbs

How can we model tense in language?

Each verb happens at event time, E

The verb is uttered at speech time, S

Past tense: E < S – “John ran.”

Present tense: E = S – “I’m free!”

What differentiates simple past from past perfect?

“John ran.” is not the same as “John had run.”

Introduce abstract reference time, R

“John had run.”: E < R < S
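The three-point model lends itself to a small lookup table. The sketch below encodes each tense as a ranking of E, R and S, so that any pair of points can be compared. The simple past and past perfect rows follow the slide; the other rows follow Reichenbach's standard table and are included as assumptions about what the full model would contain. The names `TENSES` and `order` are hypothetical.

```python
# Each tense assigns a rank to event time (E), reference time (R) and
# speech time (S); equal ranks mean the points coincide.
TENSES = {
    "simple past":     {"E": 0, "R": 0, "S": 1},  # E = R < S  "John ran."
    "past perfect":    {"E": 0, "R": 1, "S": 2},  # E < R < S  "John had run."
    "simple present":  {"E": 0, "R": 0, "S": 0},  # E = R = S
    "present perfect": {"E": 0, "R": 1, "S": 1},  # E < R = S  (assumed row)
    "simple future":   {"S": 0, "R": 1, "E": 1},  # S < R = E  (assumed row)
    "future perfect":  {"S": 0, "E": 1, "R": 2},  # S < E < R  (assumed row)
}

def order(tense, a, b):
    """Return '<', '=' or '>' for points a and b (each one of 'E', 'R', 'S')."""
    ranks = TENSES[tense]
    if ranks[a] < ranks[b]:
        return "<"
    if ranks[a] > ranks[b]:
        return ">"
    return "="

for tense in ("simple past", "past perfect"):
    print(tense, "E", order(tense, "E", "R"), "R", order(tense, "R", "S"), "S")
```

In this encoding simple past and past perfect agree on E < S and differ only in where R sits, which is the distinction the reference point was introduced to capture.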

Page 17: Determining the Types of Temporal Relations in Discourse

Reasoning about tense

How is Reichenbach’s model helpful?

We can describe all verbal events as three points linked by either equality or precedence

Automatic and quick inference for relating intervals

Does it work?

Conducted first corpus-driven validation of the framework

For reporting-type links, we used features based on pairwise event-time relations

Add one feature representing the Reichenbachian ordering

Classifier reached 59% accuracy (48% MCC baseline) on 9% of all temporal relations (above SotA)

Page 18: Determining the Types of Temporal Relations in Discourse

Extending the model

How else can we use the model?

Positional use

Timexes relate to reference points

Only consider cases where the event and time are linguistically connected

Identify these using dependency parses

Add a feature hinting at the ordering

We reach 75% accuracy from a 67% baseline (above SotA)

Also useful for timex standard transduction [4]

[4] Derczynski, Llorens and Saquete 2012: Massively increasing TIMEX3 resources

Page 19: Determining the Types of Temporal Relations in Discourse

Contributions

A large part of the difficult relation set (roughly 60%) is catered for by these new information sources.

Difficult task, with notable impact

Focus on automatic annotation of temporal relations

Pushed beyond SotA understanding of the problem

Creation of and contribution to language resources – e.g. ISO-TimeML, RTMML, CAVaT (among others)

... where could we go next?

Page 20: Determining the Types of Temporal Relations in Discourse

Future

Forensic analysis

How can we build a consistent event model from multiple semi-reliable accounts of an event?

Challenges:

Multi-document event and actor co-reference

Story conflict resolution [5]

Spatial and temporal IE from colloquial text

Building and resolving accurate co-constraining models from unreliable data (belief networks)

[5] Regneri, Koller and Pinkal 2010: Learning Script Knowledge with Web Experiments

Page 21: Determining the Types of Temporal Relations in Discourse

Future

Assertion bounding

All assertions have temporal bounds. How can we determine these?

Challenges:

Accurate extraction of document temporal structure

Automated reasoning

High-precision timex normalisation

Doing temporal IE & IR at gigaword scale

Page 22: Determining the Types of Temporal Relations in Discourse

Future

Temporal dataset construction

Many current systems index whole documents by date, but information is more nuanced than that

Challenges:

Mapping events to temporal data points

Storing and extracting events

Anchoring events with uncertain bounds (“last year’s fighting” vs. “the fighting on April 23, 2011”)

Mining complex super-events, e.g. the Fukushima disaster: what happened when?

Page 23: Determining the Types of Temporal Relations in Discourse

Recap

Temporality is ubiquitous, in the world around us and in the language we use to describe our world

Processing it automatically is difficult

Doing high-performance temporal IE opens exciting research avenues

Thank you for your time. Are there any questions?

Page 24: Determining the Types of Temporal Relations in Discourse

Labellings as probability distributions

Automated methods (e.g. classifiers) may have varying degrees of confidence about a link’s label.

We could assign each link a set of candidate labels, each with a probability.

Consistency constraints allow us to find the most likely consistent graph.

A:B → before: 0.9; after: 0.1

B:C → before: 0.5; simultaneous: 0.5

A:C → before: 1.0

Very time-consuming to compute – optimisations welcome!
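A brute-force version of this search over the three example links might look like the sketch below. It makes several simplifying assumptions that are not in the slides: intervals are treated as points, only the labels before, after and simultaneous exist, link probabilities are independent, and the composition table `COMPOSE` is written out by hand.

```python
from itertools import product

# Candidate labels and probabilities for each link, as on the slide.
candidates = {
    ("A", "B"): {"before": 0.9, "after": 0.1},
    ("B", "C"): {"before": 0.5, "simultaneous": 0.5},
    ("A", "C"): {"before": 1.0},
}

ALL = {"before", "after", "simultaneous"}

# COMPOSE[(r1, r2)]: labels possible for A:C given A:B = r1 and B:C = r2.
COMPOSE = {
    ("before", "before"): {"before"},
    ("before", "simultaneous"): {"before"},
    ("simultaneous", "before"): {"before"},
    ("after", "after"): {"after"},
    ("after", "simultaneous"): {"after"},
    ("simultaneous", "after"): {"after"},
    ("simultaneous", "simultaneous"): {"simultaneous"},
    ("before", "after"): ALL,   # opposite directions leave A:C unconstrained
    ("after", "before"): ALL,
}

def ranked_labellings(cands):
    """Enumerate consistent labellings of A:B, B:C, A:C, most probable first."""
    ab, bc, ac = cands[("A", "B")], cands[("B", "C")], cands[("A", "C")]
    results = []
    for (l1, p1), (l2, p2), (l3, p3) in product(ab.items(), bc.items(), ac.items()):
        if l3 in COMPOSE[(l1, l2)]:          # keep only consistent triangles
            results.append((p1 * p2 * p3, (l1, l2, l3)))
    return sorted(results, reverse=True)

ranked = ranked_labellings(candidates)
for probability, labels in ranked:
    print(round(probability, 3), labels)
```

The combination after/simultaneous/before is rejected as inconsistent. Enumeration grows exponentially with the number of links, which matches the slide's remark about computation time; a real system would need constraint propagation or an ILP formulation instead.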

Page 25: Determining the Types of Temporal Relations in Discourse

Unuttered temporal orderings

Event/Time distance

“When I was brushing my teeth”
→ This event happens at least twice daily; assume this instance is 0-16 hours away

Complex events

“When we were putting up the tents for the festival”
→ near the beginning of / just before the “festival” event
