UNL Lexical Selection with Conceptual Vectors
LREC-2002, Las Palmas, 31 May 2002
Mathieu Lafourcade (LIRMM, Montpellier) & Christian Boitet (GETA, CLIPS, IMAG, Grenoble)
Mathieu.Lafourcade@lirmm.fr, http://www.lirmm.fr/~lafourca
[email protected], http://www-clips.imag.fr/geta
if DA(x, y) = π/2, x ⊥ y (orthogonal): nothing in common
if DA(x, y) = π, DA(x, y) = DA(x, -x): -x is the anti-idea of x
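The angular distance cases can be checked numerically. The sketch below assumes that DA is the angle between two vectors, i.e. the arccosine of their cosine similarity; that definition is not shown on this slide, so treat it as an assumption. The function name `angular_distance` and the sample vectors are illustrative only.

```python
import math

def angular_distance(x, y):
    """Angle in radians between two non-zero vectors, in [0, pi].

    Assumption: DA(x, y) = arccos(cosine similarity of x and y).
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    cos = dot / (norm_x * norm_y)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos)))

x = [1.0, 2.0, 0.0]
same_idea = angular_distance(x, x)                    # close to 0
orthogonal = angular_distance(x, [0.0, 0.0, 3.0])     # close to pi/2
anti_idea = angular_distance(x, [-a for a in x])      # close to pi
print(same_idea, orthogonal, anti_idea)
```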
Collection process
Start from a few handcrafted terms/meanings/vectors
<do forever> // running constantly on Lafourcade's Mac
  <choose a word at random (with or without a CV)>
    find NL definitions of its senses (mainly on the Web)
    for each sense definition SD
      analyze SD into a linguistic tree TreeDef
      attach existing or null CVs to the lexical nodes of TreeDef
      iterate propagation of CVs in TreeDef (linguistic rules used here)
        until CV(root) converges or the limit on the number of cycles is reached
      CV(sense) ← CV(root(TreeDef))
    use vector distance to arrange the CVs of the senses into a binary « discrimination tree »
  </choose>
</do>
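The inner loop of the collection process (propagate CVs in TreeDef until CV(root) converges or a cycle limit is reached) can be sketched as follows. The slides do not give the linguistic propagation rules, so the two rules used here are stand-ins and purely hypothetical: upward, a node's CV is the normalized sum of its children's CVs; downward, each child is pulled toward its parent's CV. The names `Node`, `propagate_up`, `propagate_down`, `root_cv`, and the parameters `weight`, `max_cycles`, `eps` are all assumptions.

```python
import math

def normalize(v):
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v] if n > 0 else list(v)

def angle(x, y):
    dot = sum(a * b for a, b in zip(normalize(x), normalize(y)))
    return math.acos(max(-1.0, min(1.0, dot)))

class Node:
    def __init__(self, cv=None, children=()):
        self.cv = cv                  # None stands for a null CV
        self.children = list(children)

def propagate_up(node, dim):
    """Stand-in rule: internal CV = normalized sum of the children's CVs."""
    if not node.children:
        if node.cv is None:
            node.cv = [0.0] * dim     # null CV on a lexical node
        return node.cv
    total = [0.0] * dim
    for child in node.children:
        total = [t + c for t, c in zip(total, propagate_up(child, dim))]
    node.cv = normalize(total)
    return node.cv

def propagate_down(node, weight=0.5):
    """Stand-in rule: each child is pulled toward its parent's CV."""
    for child in node.children:
        child.cv = normalize([c + weight * p for c, p in zip(child.cv, node.cv)])
        propagate_down(child, weight)

def root_cv(tree, dim, max_cycles=20, eps=1e-6):
    """Iterate down/up propagation until CV(root) stabilizes or the limit is hit."""
    current = propagate_up(tree, dim)
    for _ in range(max_cycles):
        previous = current
        propagate_down(tree)
        current = propagate_up(tree, dim)
        if angle(previous, current) < eps:
            break
    return current

# Toy definition tree: two lexical nodes with CVs, one with a null CV.
tree = Node(children=[
    Node([1.0, 0.0, 0.0]),
    Node(children=[Node([0.0, 1.0, 0.0]), Node()]),
])
cv_sense = root_cv(tree, dim=3)       # plays the role of CV(sense)
print(cv_sense)
```

The stopping test compares successive root vectors by angle, which matches the "converges or limit reached" condition without depending on the particular propagation rules chosen.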
An example discrimination tree
Status on French CVs
By Dec. 2001
– 64,000 terms
– 210,000 CVs
– Average of 3.3 senses/term
Method
– robot to access web lexicon servers
– large-coverage French analyzer by J. Chauché in Sigmart
See more details on http://www.lirmm.fr/~lafourca
Disambiguation in French
Recook the vectors attached to a document tree
– Take a document
– Analyze it with the Sigmart analyzer into ONE possibly big tree (30 pages OK as a unit)
– Use the same process as for processing definitions
– Final CV(root) usable as thematic classifier of the document
– Final CVs of the lexemes used as « sense in context »
Place each recooked vector in the discrimination tree
– Walk down the discrimination tree, using vector distance
– Stop at the nearest node:
  • If leaf node, full disambiguation (relative to the available sense set)
  • If internal node, partial disambiguation (subset of senses)
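The walk down the discrimination tree can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes that every tree node carries a vector and the set of senses below it, that "vector distance" is the angle between vectors, and that the walk stops as soon as no child is nearer than the current node. The class `TreeNode`, the function `place`, the sense labels `s1`..`s3` and the toy vectors are all hypothetical.

```python
import math

def angle(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return math.acos(max(-1.0, min(1.0, dot / (nx * ny))))

class TreeNode:
    def __init__(self, cv, senses, left=None, right=None):
        self.cv = cv            # vector attached to this node
        self.senses = senses    # senses covered by this subtree
        self.left = left
        self.right = right

def place(node, vector):
    """Return the senses of the node nearest to `vector` on the way down."""
    while node.left is not None and node.right is not None:
        best = min((node.left, node.right), key=lambda n: angle(n.cv, vector))
        if angle(best.cv, vector) >= angle(node.cv, vector):
            break               # internal node is nearest: partial disambiguation
        node = best
    return node.senses          # a single sense means full disambiguation

# Toy binary discrimination tree over three senses.
s1 = TreeNode([1.0, 0.0, 0.0], ["s1"])
s2 = TreeNode([0.0, 1.0, 0.0], ["s2"])
s3 = TreeNode([0.0, 0.0, 1.0], ["s3"])
n12 = TreeNode([1.0, 1.0, 0.0], ["s1", "s2"], s1, s2)
root = TreeNode([1.0, 1.0, 1.0], ["s1", "s2", "s3"], n12, s3)

full = place(root, [0.9, 0.1, 0.0])      # ends on the leaf s1
partial = place(root, [1.0, 1.0, 0.1])   # stops at the internal node n12
print(full, partial)
```

In the second call the recooked vector lies midway between `s1` and `s2`, so neither leaf is nearer than their parent and the result is the subset of both senses.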
Example with some ambiguities
• The white ants strike rapidly the trusses of the roof
Initialize: attach CVs to lexemes
• The white ants strike rapidly the trusses of the roof
Up / Down propagation of the CVs
Result: sense selection
• The white ants strike rapidly the trusses of the roof
Disambiguation in UNL-French deconversion
– Our set-up
– Example input UNL-graph
– Outline of the process
– Two usages of DCV (disambiguation with CV)
  • Finding the known UW nearest to an unknown UW
  • Finding the best French lemma for a given UW
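Both usages of DCV listed above amount to picking, among a set of candidates, the one whose CV is nearest to a target CV. The sketch below illustrates that selection under the assumption that nearness is measured by the angle between vectors. The function `nearest`, the UW labels `uw_1`..`uw_3` and the vectors are hypothetical placeholders, not data from the presentation.

```python
import math

def angle(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return math.acos(max(-1.0, min(1.0, dot / (nx * ny))))

def nearest(target_cv, candidates):
    """Return the key of the candidate whose CV is closest to target_cv.

    `candidates` maps a label (a known UW, or a French lemma) to its CV.
    """
    return min(candidates, key=lambda label: angle(candidates[label], target_cv))

# Usage 1: an unknown UW, represented by its CV, mapped to the nearest known UW.
known_uws = {
    "uw_1": [1.0, 0.0, 0.0],
    "uw_2": [0.0, 1.0, 0.0],
    "uw_3": [0.0, 0.0, 1.0],
}
unknown_uw_cv = [0.2, 0.9, 0.1]
best_uw = nearest(unknown_uw_cv, known_uws)
print(best_uw)
```

The same call applies to the second usage by passing the CVs of the candidate French lemmas as `candidates` and the contextual CV of the UW as `target_cv`.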