Page 1
SenDiS
Sectoral Operational Programme "Increase of Economic Competitiveness"
"Investments for your future"
Project co-financed by the European Regional Development Fund
General Word Sense Disambiguation System applied to Romanian and English Languages - SenDiS
Andrei Mincă - aminca@softwin.ro
SenDiS – WSD model, components, algorithms, methods & results
OLN Algorithms
Lexicon network optimizations considering:
  number of edges
  loops or strongly connected components
  number of roots and leaves
  number of levels (in the case of leveling the LN)
Output: ordered lexicon network
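The criteria named for ordering a lexicon network (roots, leaves, loops, levels) can be illustrated with a small sketch. It assumes the network is given as directed (source, target) edges and uses Kahn's topological ordering to assign levels; the function name `level_network` and the edge representation are hypothetical, not taken from SenDiS.

```python
from collections import defaultdict, deque


def level_network(edges):
    """Level a directed lexicon network given as (source, target) pairs.

    Assumption: a node's level is its longest distance from any root.
    Nodes on or behind a loop cannot be leveled and are reported separately.
    """
    successors = defaultdict(set)
    indegree = defaultdict(int)
    nodes = set()
    for source, target in edges:
        nodes.update((source, target))
        if target not in successors[source]:  # ignore duplicate edges
            successors[source].add(target)
            indegree[target] += 1

    roots = sorted(n for n in nodes if indegree[n] == 0)
    leaves = sorted(n for n in nodes if not successors[n])

    # Kahn's algorithm: a node is dequeued only after all its predecessors.
    remaining = {n: indegree[n] for n in nodes}
    level = {n: 0 for n in roots}
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for nxt in successors[node]:
            level[nxt] = max(level.get(nxt, 0), level[node] + 1)
            remaining[nxt] -= 1
            if remaining[nxt] == 0:
                queue.append(nxt)

    # Nodes whose predecessors were never exhausted depend on a loop.
    unleveled = sorted(n for n in nodes if remaining[n] > 0)
    levels = {n: lvl for n, lvl in level.items() if remaining[n] == 0}
    return {
        "roots": roots,
        "leaves": leaves,
        "levels": levels,
        "unleveled": unleveled,
    }


# Toy network: a small hierarchy plus a two-node loop (x <-> y).
toy_edges = [
    ("entity", "animal"),
    ("animal", "dog"),
    ("entity", "dog"),
    ("x", "y"),
    ("y", "x"),
]
result = level_network(toy_edges)
```

On the toy network, the hierarchy receives three levels while the two nodes of the loop are returned in `unleveled`, which is where a loop-breaking or component-merging step would have to intervene before leveling.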
Page 6
BMSS Algorithms
Input: a lexicon network (not necessarily ordered) and a meaning (ID)
Builds a semantic interpretation for the specified meaning over the lexicon network:
  spanning trees
  sets of nodes
  sequences of edges
  or combinations of the above
Output: a semantic interpretation (signature) for the meaning
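As an illustration of the "sets of nodes" kind of signature, the sketch below collects every node reachable from a meaning within a fixed number of edges. The adjacency-dictionary representation, the `max_depth` parameter and the name `build_signature` are assumptions made for the example; the actual BMSS algorithms may build signatures differently.

```python
from collections import deque


def build_signature(network, meaning_id, max_depth=2):
    """Return {node: depth} for nodes within `max_depth` edges of `meaning_id`.

    `network` maps a meaning ID to the IDs it points to (hypothetical layout).
    The breadth-first traversal also yields a spanning tree implicitly, since
    every node is recorded at its shortest distance from the meaning.
    """
    signature = {meaning_id: 0}
    queue = deque([meaning_id])
    while queue:
        node = queue.popleft()
        depth = signature[node]
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbour in network.get(node, ()):
            if neighbour not in signature:
                signature[neighbour] = depth + 1
                queue.append(neighbour)
    return signature


# Toy lexicon network with made-up meaning IDs.
toy_network = {
    "bank#1": ["institution#1", "money#1"],
    "institution#1": ["organization#1"],
    "organization#1": ["group#1"],
}
signature = build_signature(toy_network, "bank#1", max_depth=2)
```

Keeping the depth alongside each node leaves room for comparisons that weight near neighbours more heavily than distant ones.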
Page 7
CMSS Algorithms
Input: two or more semantic signatures
The comparison depends on the nature of the semantic signatures
Output: degrees of similarity
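For signatures that are sets of nodes, one simple comparison is the Jaccard overlap. It is used here purely for illustration and is not claimed to be the measure SenDiS applies; the function names and the pair-keyed result are likewise hypothetical.

```python
def jaccard_similarity(signature_a, signature_b):
    """Degree of similarity in [0, 1] between two node-set signatures."""
    nodes_a, nodes_b = set(signature_a), set(signature_b)
    union = nodes_a | nodes_b
    if not union:
        return 0.0  # two empty signatures share nothing
    return len(nodes_a & nodes_b) / len(union)


def pairwise_similarities(signatures):
    """Compare every pair of signatures; keys are (id_a, id_b) with id_a < id_b."""
    ids = sorted(signatures)
    return {
        (a, b): jaccard_similarity(signatures[a], signatures[b])
        for i, a in enumerate(ids)
        for b in ids[i + 1:]
    }


# Made-up signatures for two senses of "bank" and one sense of "loan".
toy_signatures = {
    "bank#1": {"bank#1", "institution#1", "money#1"},
    "bank#2": {"bank#2", "river#1", "slope#1"},
    "loan#1": {"loan#1", "money#1"},
}
similarities = pairwise_similarities(toy_signatures)
```

Signatures made of trees or edge sequences would need a different comparison, for example one that rewards shared paths rather than shared nodes.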
Page 8
CwsdV Algorithms
Input: a matrix with degrees of similarity between the senses of the context words
Output: one or several WSD variants with the highest cost
[Figure: similarity matrix over the senses of the context words W1, W2, ..., WN, e.g. senses s11..s14 for W1, s21 and s22 for W2, sN1..sN3 for WN]
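Selecting WSD variants from the similarity matrix can be sketched as an exhaustive search that picks one sense per context word. The sketch assumes the "cost" of a variant is the sum of the pairwise similarities between its chosen senses, which is an interpretation rather than something stated in the slides; `best_wsd_variants` and the pair-keyed similarity dictionary are hypothetical.

```python
from itertools import combinations, product


def best_wsd_variants(senses_per_word, similarity, top_n=1):
    """Return the `top_n` (score, variant) pairs with the highest score.

    senses_per_word: one list of candidate sense IDs per context word.
    similarity: maps frozenset({sense_a, sense_b}) to a degree of similarity;
                missing pairs count as 0.0.
    """
    scored = []
    for variant in product(*senses_per_word):  # one sense per word
        score = sum(
            similarity.get(frozenset(pair), 0.0)
            for pair in combinations(variant, 2)
        )
        scored.append((score, variant))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_n]


# Made-up senses and similarities for the context words "bank" and "deposit".
toy_senses = [
    ["bank#money", "bank#river"],
    ["deposit#money", "deposit#sediment"],
]
toy_similarity = {
    frozenset({"bank#money", "deposit#money"}): 0.8,
    frozenset({"bank#river", "deposit#sediment"}): 0.5,
    frozenset({"bank#money", "deposit#sediment"}): 0.1,
}
variants = best_wsd_variants(toy_senses, toy_similarity, top_n=2)
```

The number of variants grows as the product of the sense counts per word, so enumeration like this is only practical for short context windows; larger contexts would call for pruning or an approximate search.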
Page 9
WSD methods
Input:
  text
  list of meanings
  lexicon network
Computing:
  tokenization of text
  annotation of text tokens with meaning interpretations
  selecting a window-text for WSD
  other context filters or topologies
  build meaning semantic signatures for each word-sense
  compare meaning semantic signatures and fill the matrix
  compute best WSD variants
Output: one or more WSD variants with one or more meaning interpretations for each text token
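The first steps of a WSD method (tokenization, annotation of tokens with candidate meanings, window selection) can be sketched as follows. The regular-expression tokenizer, the token-to-meanings dictionary and the fixed-radius window are simplifying assumptions; a real system would rely on lemmatization and on the context filters or topologies mentioned above.

```python
import re


def tokenize(text):
    """Lower-case word tokens; `\\w` is Unicode-aware, so diacritics survive."""
    return re.findall(r"\w+", text.lower())


def annotate(tokens, meanings):
    """Pair each token with its candidate meaning IDs (empty list if unknown)."""
    return [(token, meanings.get(token, [])) for token in tokens]


def select_window(annotated, target_index, radius=2):
    """Keep tokens within `radius` positions of the target that have meanings."""
    start = max(0, target_index - radius)
    end = target_index + radius + 1
    return [(tok, senses) for tok, senses in annotated[start:end] if senses]


# Made-up list of meanings, keyed by surface form for simplicity.
toy_meanings = {
    "bank": ["bank#1", "bank#2"],
    "loan": ["loan#1"],
}
tokens = tokenize("The bank approved the loan")
annotated = annotate(tokens, toy_meanings)
narrow = select_window(annotated, target_index=1, radius=2)
wide = select_window(annotated, target_index=1, radius=3)
```

The window returned here is what the signature-building, comparison and variant-selection steps would then operate on; widening the radius brings "loan" into the context of "bank".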
Page 10
General WSD requirements
  tokenization
  part-of-speech tagging
  lemmatization
  sense interpretations
  chunking
  parsing
Page 11
Performance indicators
P - precision
P = noCorrectlyDisambiguated_TargetWords / noDisambiguated_TargetWords
R - recall
R = noCorrectlyDisambiguated_TargetWords / noTargetWords
F-measure
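The precision and recall definitions translate directly into code. The F-measure formula is not given in the text, so the sketch below assumes the usual balanced form, F = 2PR / (P + R); the function and parameter names are hypothetical, and the counts in the example are invented.

```python
def wsd_scores(no_correct, no_disambiguated, no_target_words):
    """Return (precision, recall, f_measure) for a WSD run.

    no_correct:        target words disambiguated correctly
    no_disambiguated:  target words the system attempted to disambiguate
    no_target_words:   all target words in the test data
    """
    precision = no_correct / no_disambiguated if no_disambiguated else 0.0
    recall = no_correct / no_target_words if no_target_words else 0.0
    if precision + recall == 0:
        f_measure = 0.0
    else:
        # Assumed balanced F-measure (harmonic mean of P and R).
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Invented counts: 100 target words, 80 attempted, 60 correct.
precision, recall, f_measure = wsd_scores(60, 80, 100)
```

With these counts precision is 0.75 and recall is 0.6: a system that leaves some target words undisambiguated scores higher on precision than on recall, and the F-measure falls between the two.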