Combining Approaches for Identifying Metonymy Classes of Named Locations
Sven Hartrumpf and Johannes Leveling
Intelligent Information and Communication Systems (IICS), University of Hagen (FernUniversität in Hagen), 58084 Hagen, Germany
[email protected]
EPIA 2007, Dec. 4, Guimarães, Portugal
• place-for-people:
  • place-for-gov(ernment) → people in government
  • place-for-off(icials) → people in official administration
  • place-for-org(anization) → organization at location
  • place-for-pop(ulation) → population
• place-for-product → product from place
• othermet: metonymy not covered by a regular pattern
• MIX: mixed literal and metonymic sense
Example for literal:
Seit Beginn des Kosovo-Krieges rekrutiert die UCK in DEUTSCHLAND Kämpfer. – 9951
(Since the beginning of the Kosovo war, the UCK has been recruiting fighters in GERMANY.)
Example for place-for-event:
Nach dem KOSOVO geht es in Makedonien und Montenegro weiter. – 6336
(After KOSOVO, it will continue in Macedonia and Montenegro.)
Example for place-for-off:
. . . DEUTSCHLAND (wird) mehr Geschick haben als Clinton. – 2435
(. . . GERMANY will be more successful than Clinton.)
Example for place-for-product:
Politisch sollte die Unterschrift Belgrads unter RAMBOUILLET erzwungen werden. – 12087
(The signature of Belgrade under RAMBOUILLET should be forced politically.)
Example for othermet:
Dabei ist AFRIKA auch bei dieser Zusammenstellung von Musik eher eine ideelle Klammer. – 8415
(But AFRICA is an ideational cramp for this composition of music, too.)
Example for mixed:
Die Friedensfahrt gewinnt im Osten DEUTSCHLANDS wieder stark an Renommee. – 1498
(The peace tour is regaining much of its reputation in the eastern part of GERMANY.)
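The label inventory used in the examples can be written down as a small lookup structure. The following is only a sketch: the grouping of the four place-for-people subclasses comes from the list above, while the coarse three-way split (literal, metonymic, mixed) and all identifier names are assumptions made for illustration.

```python
from enum import Enum


class Coarse(Enum):
    """Assumed coarse readings; the names are illustrative."""
    LITERAL = "literal"
    METONYMIC = "metonymic"
    MIXED = "mixed"


# Fine-grained label -> medium-grained label, following the class list.
FINE_TO_MEDIUM = {
    "place-for-gov": "place-for-people",
    "place-for-off": "place-for-people",
    "place-for-org": "place-for-people",
    "place-for-pop": "place-for-people",
}

# Medium-grained labels that count as metonymic readings.
MEDIUM_METONYMIC = {
    "place-for-event",
    "place-for-people",
    "place-for-product",
    "othermet",
}


def medium_label(label: str) -> str:
    """Collapse a fine-grained label to its medium-grained class."""
    return FINE_TO_MEDIUM.get(label, label)


def coarse_label(label: str) -> Coarse:
    """Map any label to the assumed coarse three-way distinction."""
    if label == "literal":
        return Coarse.LITERAL
    if label == "mixed":
        return Coarse.MIXED
    if medium_label(label) in MEDIUM_METONYMIC:
        return Coarse.METONYMIC
    raise ValueError(f"unknown metonymy label: {label!r}")
```

For instance, `medium_label("place-for-off")` yields `"place-for-people"`, so a classifier trained on fine labels can still be scored at the medium level.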
Introduction
Metonymy Classes for Location Names
Corpus Annotation with Metonymy Information
Metonymy Classifiers
Classifier Combination
Evaluation Results
Conclusion and Outlook
References
Data and Annotation (1/2)
• TüBa-D/Z corpus containing articles from the German newspaper taz (27,067 sentences with 500,628 tokens)
• Annotation levels:
  • (PoS tags)
  • NE tags (LOC, PER, ORG, and MISC)
  • NE subclasses (e.g. first names, last names, and other parts of a name)
  • Label corresponding to medium and fine metonymy classification
  • Example: token Africa → (NE, LOC, region, MET, othermet)
→ 1,515 (18.5%) of all toponyms are used in a nonliteral sense
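The annotation tuple for "Africa" suggests a simple per-token record. The sketch below shows one possible representation and how a nonliteral share like the one reported could be computed from such records. The class name, field names, the `"LIT"` marker, and the toy data are assumptions for illustration, not the corpus format.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TokenAnnotation:
    token: str
    ne_tag: Optional[str] = None          # LOC, PER, ORG, or MISC
    ne_subclass: Optional[str] = None     # e.g. "region"
    reading: Optional[str] = None         # "MET"; "LIT" is an assumed counterpart
    metonymy_class: Optional[str] = None  # e.g. "othermet", "literal"


def nonliteral_share(annotations: Iterable[TokenAnnotation]) -> float:
    """Fraction of LOC tokens whose metonymy class is not 'literal'."""
    toponyms = [a for a in annotations if a.ne_tag == "LOC"]
    if not toponyms:
        return 0.0
    nonliteral = [a for a in toponyms if a.metonymy_class != "literal"]
    return len(nonliteral) / len(toponyms)


# Toy data, not taken from the corpus.
sample = [
    TokenAnnotation("Africa", "LOC", "region", "MET", "othermet"),
    TokenAnnotation("Germany", "LOC", "country", "LIT", "literal"),
    TokenAnnotation("recruits"),
]
```

On this toy sample, one of the two toponyms is nonliteral.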
Data and Annotation (2/2)
Annotation checking:
• Applied the variation (or inconsistency) detection tool DECCA (http://decca.osu.edu/)
• Used corrections supplied by the TüBa-D/Z corpus publishers
• Identified additional spelling errors by frequency analysis
→ Errors in text and on levels of PoS tags, NE tags, NE
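The idea behind variation detection can be illustrated with a few lines of code: if the same token in the same local context carries different labels in different places, at least one occurrence is a candidate annotation error. This is a simplified sketch of that general idea, not DECCA's actual algorithm or interface; the function name and the window size are assumptions.

```python
from collections import defaultdict


def find_variations(tagged_sentences, context=1):
    """Return recurring context windows whose focus token has several labels.

    tagged_sentences: list of sentences, each a list of (token, label) pairs.
    A window is (left context, focus token, right context).
    """
    seen = defaultdict(set)
    for sentence in tagged_sentences:
        tokens = [tok for tok, _ in sentence]
        for i, (_, label) in enumerate(sentence):
            lo = max(0, i - context)
            window = (
                tuple(tokens[lo:i]),
                tokens[i],
                tuple(tokens[i + 1:i + 1 + context]),
            )
            seen[window].add(label)
    # Keep only windows annotated inconsistently.
    return {w: labels for w, labels in seen.items() if len(labels) > 1}
```

Each reported window still needs manual inspection, since genuinely ambiguous tokens also show label variation.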
• All classifiers are based on a memory-based learner, TiMBL (supervised learning)
• All classifiers implemented by different people
• Shallow classifier 1 (SC1): relies largely on features obtained from gazetteer lookup
• Shallow classifier 2 (SC2): includes features encoding ontological sorts from the context
• Deep classifier (DC): employs features from parse results (syntactico-semantic parsing with a semantically oriented computer lexicon)
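Memory-based learning stores all training instances and classifies a new instance by its nearest stored neighbours. The sketch below shows the principle with a plain k-nearest-neighbour classifier over symbolic features and an overlap distance. It does not use or reproduce TiMBL; the class name, the method names, and the unweighted distance are assumptions.

```python
from collections import Counter


def overlap_distance(a, b):
    """Number of feature positions at which two instances differ."""
    return sum(x != y for x, y in zip(a, b))


class MemoryBasedClassifier:
    """Minimal k-NN over symbolic feature tuples (illustrative only)."""

    def __init__(self, k=1):
        self.k = k
        self.memory = []

    def fit(self, instances, labels):
        # "Training" just stores the labelled instances.
        self.memory = list(zip(instances, labels))
        return self

    def predict(self, instance):
        nearest = sorted(
            self.memory,
            key=lambda item: overlap_distance(item[0], instance),
        )[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]
```

Because nothing is abstracted away at training time, rare metonymy classes remain available as neighbours even when they occur only a few times.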
Metonymy Classifier SC1
Main features for training instances:
• 109 features
• Character features (e.g. token starts with capital letter?)
• Semantic entities (entity classes for the token obtained from morpholexical analysis)
• PoS tags
• Gazetteer lookups (for cities, countries, etc.)
• Metonymy context (metonymy class of the token to the left)
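A few of the SC1 feature groups can be sketched as a feature extraction function. The toy gazetteer, the feature names, and the function signature below are assumptions for illustration; the semantic-entity features are left out because they depend on the morpholexical analysis, and the real classifier uses 109 features.

```python
# Toy gazetteer (hypothetical entries, not the resource used by SC1).
GAZETTEER = {
    "city": {"Hagen", "Belgrad"},
    "country": {"Deutschland", "Kosovo"},
}


def sc1_features(token, pos_tag, left_metonymy_class):
    """Build a small SC1-style feature dictionary for one token."""
    features = {
        "starts_upper": token[:1].isupper(),  # character feature
        "all_upper": token.isupper(),         # character feature
        "pos": pos_tag,                       # PoS tag
        "left_class": left_metonymy_class,    # metonymy context
    }
    # One boolean feature per gazetteer list.
    for kind, names in GAZETTEER.items():
        features[f"gaz_{kind}"] = token in names
    return features
```

Feeding the predicted class of the left neighbour back in as a feature means tokens have to be classified from left to right.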
Metonymy Classifier SC2
Main features for training instances:
• 269 features
• Sentence context (lemma and distance to the location token)
• Word context (the first three and the last three characters of the token, PoS tag, position in the sentence, upper/lower case information, and word length)
• Metonymy context (metonymy class of two preceding tokens)
• Ontological sorts (for words in the context, using a bit vector representation of a sort hierarchy)
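Two of the SC2 feature groups are easy to make concrete: the word context features and the bit vector encoding of ontological sorts. In the sketch below, the sort hierarchy is a hypothetical toy tree, and the function and feature names are assumptions. A sort is encoded by setting the bits for the sort itself and all of its ancestors, so that related sorts share bits.

```python
# Hypothetical toy hierarchy: sort -> parent sort (None for the root).
SORT_PARENT = {
    "entity": None,
    "object": "entity",
    "concrete-object": "object",
    "situation": "entity",
}
SORT_ORDER = sorted(SORT_PARENT)  # fixed bit positions


def sort_bit_vector(sort):
    """Set one bit for the sort and one for each of its ancestors."""
    active = set()
    while sort is not None:
        active.add(sort)
        sort = SORT_PARENT[sort]
    return [1 if s in active else 0 for s in SORT_ORDER]


def word_context(token, pos_tag, position):
    """Word context features as listed for SC2."""
    return {
        "prefix3": token[:3],
        "suffix3": token[-3:],
        "pos": pos_tag,
        "position": position,
        "starts_upper": token[:1].isupper(),
        "length": len(token),
    }
```

With this encoding, two sorts that share an ancestor differ in fewer bits than two unrelated sorts, which suits a learner that compares instances feature by feature.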
features for the deep classifier
• Semantic result: MultiNet (multilayered extended semantic networks, Helbig (2006)); MultiNet nodes: disambiguated word readings (concepts)
• Syntactic result: dependency graph
• Important resource for the parser: semantically oriented lexicon (HaGenLex)
Metonymy Classifier DC (1/2)
• 13 features
• p-quality: quality of the parser result as a numerical value between 500 and 1000
• token: name token; type: name type (i.e. lemma)
• dep-rel: dependency relation leading to the governor (mother constituent)
• role: semantic role filled by the name
• appos-molec: name accompanied by a molecular apposition?
• adjective: lemma of modifying adjective
• csister-ctype: lemma of coordinated sister node with compound reduction
• csister-entity: semantic entity value of coordinated sister node
• mother-entity: semantic entity value of mother constituent
• mother-sort: ontological sort of mother constituent
• mother-type: type (i.e. lemma) of mother
• mother-ctype: type (i.e. lemma) of mother with compound reduction
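The 13 DC features map naturally onto a fixed-width record. The following sketch mirrors the feature list with one field per feature. The Python field names, the use of `None` for features the parse does not provide, and treating the p-quality bounds as inclusive are assumptions.

```python
from dataclasses import dataclass, astuple
from typing import Optional


@dataclass(frozen=True)
class DeepFeatures:
    p_quality: int                 # parser result quality, 500..1000
    token: str                     # name token
    name_type: str                 # name type (lemma)
    dep_rel: Optional[str]         # dependency relation to the governor
    role: Optional[str]            # semantic role filled by the name
    appos_molec: bool              # accompanied by a molecular apposition?
    adjective: Optional[str]       # lemma of modifying adjective
    csister_ctype: Optional[str]   # coordinated sister lemma, compound-reduced
    csister_entity: Optional[str]  # semantic entity of coordinated sister
    mother_entity: Optional[str]   # semantic entity of mother constituent
    mother_sort: Optional[str]     # ontological sort of mother constituent
    mother_type: Optional[str]     # lemma of mother
    mother_ctype: Optional[str]    # lemma of mother, compound-reduced

    def __post_init__(self):
        # Assumption: both bounds are inclusive.
        if not 500 <= self.p_quality <= 1000:
            raise ValueError("p_quality must lie between 500 and 1000")

    def as_instance(self):
        """Flatten the record into a feature tuple for a learner."""
        return astuple(self)
```

A record like this makes it explicit which features are simply missing when the parser finds no adjective or coordinated sister for a name.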
Classifier Combination
Features for training instances:
• 15 features
• results for the location token (from SC1, SC2, DC)
• results for tokens in the context (from SC1, SC2, DC)
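Combining classifiers this way amounts to stacking: the predictions of SC1, SC2, and DC become the features of a further training instance. The sketch below builds such an instance. The slides state only that there are 15 features; the layout used here (three classifiers times a window of two tokens on each side of the location token) is an assumption, as are the function name and the padding value.

```python
def combination_instance(sc1, sc2, dc, index, window=2, pad="NONE"):
    """Build a stacked instance for the token at `index`.

    sc1, sc2, dc: per-token predictions for one sentence (equal length).
    Positions outside the sentence are filled with `pad`.
    """
    instance = []
    for offset in range(-window, window + 1):
        pos = index + offset
        for predictions in (sc1, sc2, dc):
            inside = 0 <= pos < len(predictions)
            instance.append(predictions[pos] if inside else pad)
    return tuple(instance)
```

The same padding value could also stand in for tokens where the deep classifier produces no result because the parse failed.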
• Classifiers differ in their strengths and weaknesses (for example, the deep method shows the highest precision values, but recall values are low because they are limited by the parser coverage)
→ Combined classifier outperforms each single classifier significantly
• Created a new resource about metonymy in German
• Metonymy support in the lexicon improves results of syntactico-semantic parser
• Future work: investigate semantic representation of metonymic names; application to QA and GIR
Selected References
Helbig, Hermann (2006). Knowledge Representation and the Semantics of Natural
Kamei, Shin-ichiro and Takahiro Wakao (1992). Metonymy: Reassessment, survey of acceptability, and its treatment in machine translation systems. In Proceedings of the 30th Annual Meeting of the Association for Computational Linguistics (ACL'92), pp. 309–311. Newark, Delaware.
Lakoff, George and Mark Johnson (1980). Metaphors We Live By. Chicago University Press.
Leveling, Johannes and Sven Hartrumpf (2006). On metonymy recognition for GIR. In Proceedings of GIR-2006, the 3rd Workshop on Geographical Information Retrieval (hosted by SIGIR 2006). Seattle, Washington. URL http://www.geo.unizh.ch/~rsp/gir06/papers/individual/leveling.pdf.
Markert, Katja and Malvina Nissim (2002). Towards a corpus annotated for metonymies: The case of location names. In Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC 2002). Las Palmas, Spain.
Markert, Katja and Malvina Nissim (2007). Task 08: Metonymy resolution at SemEval-07. In Proceedings of SemEval 2007.
Stallard, David (1993). Two kinds of metonymy. In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics (ACL'93), pp. 87–94. Columbus, Ohio. URL http://www.aclweb.org/anthology/P93-1012.