Structural and Applied Linguistics
Issue 9
ISSN 0202-2400
LA FILOLÓGICA POR LA CAUSA
SAINT PETERSBURG STATE UNIVERSITY
STRUCTURAL AND APPLIED LINGUISTICS
Interuniversity collection
Issue 9
UDC 80+618.31
BBK 81.1
C83
Editorial board: Prof. L. N. Belyaeva, Prof. A. S. Gerd (editor-in-chief), Prof. O. N. Grinbaum, Prof. M. A. Marusenko
Secretary of the editorial board: V. I. Rubiner
Reviewer: Cand. Sci. (Philology), Assoc. Prof. I. P. Pankov
Published by resolution of the Editorial and Publishing Council of the Faculty of Philology, St. Petersburg State University
C83 Structural and Applied Linguistics. Issue 9: Interuniversity collection / ed. by A. S. Gerd. St. Petersburg: St. Petersburg University Press, 2012. 356 p.
The collection (Issue 8 appeared in 2010) contains articles on a wide range of problems in theoretical and applied linguistics and on the application of mathematical methods in linguistics.
For specialists in the theory of language and in applied and theoretical linguistics.
© St. Petersburg State University, 2012
O. A. Mitrofanova, O. N. Lyashevskaya, M. A. Grachkova, A. S. Shimorina, A. S. Shurygina, S. V. Romanov
EXPERIMENTS ON AUTOMATIC WORD SENSE DISAMBIGUATION AND CONSTRUCTION IDENTIFICATION
(based on the Russian National Corpus)*
Abstract. The present study aims at the automatic extraction of linguistic information from contexts of the Russian National Corpus (RNC), with subsequent use of the data in building a comprehensive lexicographic resource, a catalogue of Russian constructions. The proposed approach involves automatic classification of contexts aimed at automatic word sense disambiguation (WSD) and construction identification (CxI). The automatic context classification procedure takes into account the following types of contextual information represented in the multilevel annotation of the RNC: lexical tags (lemma tags) (lex), morphological tags (gr), lexical-semantic tags (sem), as well as combinations of different kinds of tags. Series of WSD and CxI experiments were carried out on representative samples of contexts from the RNC. In each series of experiments we analyze (1) various contextual markers of the meanings of target words and (2) constructions that include contextual markers and target words.
Keywords: word sense disambiguation, constructions, construction identification, Russian National Corpus, context classification
* This work was supported by the Russian Foundation for Basic Research (RFBR, project 10-06-00586-a), by the programme of fundamental research of the Presidium of the Russian Academy of Sciences "Corpus Linguistics" (project FrameBank), and by the research project "A model of an integrated software and linguistic complex for building specialized corpora of the Russian language".
© O. A. Mitrofanova, O. N. Lyashevskaya, M. A. Grachkova, A. S. Shimorina, A. S. Shurygina, S. V. Romanov, 2012
O. A. Mitrofanova, O. N. Lyashevskaya, M. A. Grachkova, A. S. Shimorina, A. S. Shurygina, S. V. Romanov
EXPERIMENTS ON AUTOMATIC WORD SENSE DISAMBIGUATION AND CONSTRUCTION IDENTIFICATION
(Based on the Russian National Corpus)
Summary. The research project reported in this paper aims at automatic extraction of linguistic information from contexts in the Russian National Corpus (RNC) and its subsequent use in building a comprehensive lexicographic resource, the Index of Russian lexical constructions. The proposed approach relies on automatic context classification intended for word sense disambiguation (WSD) and construction identification (CxI). The automatic context processing procedure takes into account the following types of contextual information represented in the RNC multilevel annotation: lexical (lemma) tags (lex), morphological tags (gr), lexical-semantic (taxonomy) tags (sem), and combinations of the various types of tags. Multiple experiments on WSD and CxI are performed using representative RNC context samples. In each series of experiments we analyze (1) different context markers of the meanings of target words and (2) constructions including context markers and target words.
Keywords: Word Sense Disambiguation, constructions, construction identification, Russian National Corpus, context classification
1. Introduction
The project within which this study was carried out is a joint effort of the Russian National Corpus (RNC) team and the Department of Mathematical Linguistics of St. Petersburg State University. The goal of the project is the automated construction of an electronic catalogue of Russian constructions on the basis of the RNC. A construction is understood here as a combination of a target word and contextual markers of its meaning that is characterized by frequency and stability. The contextual markers considered are lemma tags (lex), morphological tags (gr) and lexical-semantic tags (sem), all available in the multilevel annotation of RNC contexts. This understanding of a construction is also consistent with the main ideas of Construction Grammar (Fillmore 1988; Goldberg 1995; 2006; Tomasello 2003; Kuznetsova 2007).
Since constructions are identified by the principle of "bottom-up generalization", contexts such as верные мне люди ('people faithful to me') will be matched against a hierarchical list of templates, including: верный + SPRO; r:pers; dat, верный + SPRO; r:pers; dat + человек, верный + (SPRO; r:pers)|(S; t:hum); dat + S; t:hum, верный + dat + S, верный + S. The first four are characteristic of верный in the meaning надежный, прочный, стойкий, преданный ('reliable, firm, steadfast, devoted': a qualitative adjective denoting a quality of a person, with positive evaluation), whereas the last template is also observed in contexts with other meanings of the adjective.
For example, the following are treated as constructions: combinations of the word вид in the meaning 'a subdivision in a taxonomy that is part of a higher division, the genus; a variety, a type' with right-hand collocates such as спорт (r:abstr t:sport) and деятельность (r:abstr der:v); combinations of the word вид in the meaning 'appearance, visible look; state' with left-hand collocates such as внешний (r:rel t:place der:adv), делать (d:root), сделать (d:pref | t:impact:creat t:be:appear ca:caus), etc. A construction is thus a linguistic object that concentrates heterogeneous information which makes it possible to recognize and distinguish the meanings of a polysemous word. This explains why our study combines two tasks of computational semantics: automatic word sense disambiguation (WSD) and construction identification (CxI) (Mitrofanova, Panicheva, Lyashevskaya 2008; Mitrofanova, Lyashevskaya, Panicheva 2008; Mitrofanova, Panicheva, Lashevskaya 2008; Mitrofanova, Lyashevskaya 2009; Mitrofanova, Grachkova, Shimorina, Lyashevskaya 2010; Shimorina, Grachkova 2011; Automatic Word ... 2011, etc.).
Fairly effective methods of word sense disambiguation in semi-automatic or automatic mode are known (Mihalcea, Pedersen 2005; Word Sense Disambiguation.. 2007; Navigli 2009). Methods of the first type rely on computer thesauri (WordNet, http://wordnet.princeton.edu/; FrameNet, http://framenet.icsi.berkeley.edu/) and formal ontologies as sources of information about word meanings. Methods of the second type are based on statistical data about the contextual environment of words, which makes it possible to distinguish their uses in different meanings (Schütze 1998; Pedersen 2002). There are also hybrid approaches that combine lexicographic and statistical data (Leacock, Chodorow, Miller 1998; Mihalcea 2002). Separate studies have been devoted to establishing the parameters of word sense disambiguation (Yarowsky, Florian 2002).
Both types of methods have been tested on Russian material. The use of a powerful electronic lexicographic resource (RuThes (Lukashevich, Chuiko 2007), the RNC Semantic Dictionary (Rakhilina, Kobritsov, Kustova, Lyashevskaya, Shemanaeva 2006; Kustova, Lashevskaja, Paducheva, Rakhilina 2009)) ensures a high level of word sense disambiguation. If, however, dictionary support has to be dispensed with (for example, when large volumes of text are processed and their vocabulary is not covered by the dictionaries at the researchers' disposal), statistical methods should be preferred. Word sense disambiguation is fairly reliable when it is based on comparing the distributions of part-of-speech tags in the contextual environment of words (Azarova, Marina 2006; Azarova, Bichineva, Vakhitova 2008) and when it is based on lexical markers of contexts (Kobritsov, Lyashevskaya, Panicheva 2005). The thesaurus-based and statistical approaches can also be combined, relying on dictionary information about word co-occurrence models (Toldova, Kustova, Lyashevskaya 2008). According to the data obtained in our project, statistical disambiguation that takes into account the distribution of lexical-semantic tags in contexts proves more effective. Experiments of this kind were carried out for the first time within the project discussed here; no similar studies on Russian corpora had been conducted before.
Compared with automatic word sense disambiguation, methods and algorithms of construction identification are less developed and are currently the subject of extensive discussion (Sahlgren, Knutsson 2009; Proceedings of the NAACL... 2010; Wible, Tsao 2010). The greatest success has been achieved in the extraction of n-grams (collocations, multiword units; cf. Manning, Schütze 2002; Yagunova, Pivovarova 2011). However, idiomatized constructions and constructions with non-standard syntactic structure, although described in detail in the research literature (Borisova 1995; Iordanskaja, Mel'čuk 2007), remain a serious problem for automatic text processing. At the same time, a number of projects pay special attention to the formalization of lexical-syntactic relations between text units, and some of them deal with Russian material: Word Sketches for Russian (Zakharov, Khokhlova 2010), work on the extraction of lexical-syntactic patterns (Bolshakova, Baeva, Bordachenkova, Vasilyeva, Morozov 2007), and work on the automatic construction of collocation dictionaries (Gelbukh, Sidorov, Hernández-Rubio, Chubukova 2004).
The linguistic data and software solutions obtained in the course of this project open up the possibility of creating a catalogue of Russian constructions correlated with particular meanings of target words. This paper discusses one of the most important aspects of the project, namely the building of construction templates on the basis of lexical, morphological and lexical-semantic contextual markers of the meanings of target words.
2. Linguistic data
The experiments on word sense disambiguation and construction identification are carried out on material from the RNC (http://www.ruscorpora.ru/). The contexts of the main RNC subcorpus, from which the samples were drawn, carry annotation of three types: lemma tags (lex: the lexeme to which a word form belongs), morphological tags (gr: grammatical features of word forms, i.e. part of speech, values of grammatical categories, etc.), and lexical-semantic tags (sem: features indicating that a word belongs to a particular lexical-semantic class, for example person, substance, space, speed, motion, possession, human quality; for details see http://www.ruscorpora.ru/corpora-sem.html).
The research group focuses on the Russian nouns дом, вид, орган, лук, глава, etc., the adjectives близкий, верный, etc., and the verbs прописать, справиться, занести, заносить, etc. The lexemes analyzed differ in the number of meanings, in the way polysemy or homonymy has developed, and in the degree to which their meanings are related to one another. We adopt the treatment of ambiguity accepted in computational linguistics, which allows homonymous correlates to be conditionally equated with polysemous words (Rakhilina, Kobritsov, Kustova, Lyashevskaya, Shemanaeva 2006). Word meanings in RNC contexts were annotated on the basis of the RNC Semantic Dictionary.
Experiments were carried out only for meanings represented in the RNC by a sufficient number of contexts (at least 10). For example, the following low-frequency meanings of the word дом were excluded from consideration: место, где живут люди, объединенные общими интересами, условиями существования ('a place where people united by common interests and living conditions live') and династия, род ('dynasty, lineage'); so was the meaning купол церкви ('church dome') of the word глава, which occurred in the sample in only one context. In the training samples lexical-semantic ambiguity was resolved manually; in all other cases this procedure was performed automatically.
3. Software for the experiments
The WSD and CxI software tool performs automatic classification of contexts aimed at word sense disambiguation and construction identification. These procedures are carried out with software developed by S. V. Romanov in Python. The WSD and CxI tool builds a vector model of the experimental sample; a supervised classification algorithm was chosen as the base algorithm. The program works in two modes: forming classes of contexts that correspond to individual meanings of the target word, and generating lists of the most frequent constructions in which a given meaning of the target word is realized. During automatic processing of contexts, the different types of tags present in the multilevel RNC annotation are taken into account: lex, gr and sem tags, and combinations of tags of different types (lex+gr, lex+sem, sem+gr, lex+sem+gr). Experiment parameters can be varied, such as the width of the context window [-l; +r] and processing with or without weights for context elements. The tool also provides additional statistical data.
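To make the window and tag-type parameters concrete, here is a minimal sketch of how features might be collected from a context window [-l; +r]. It is not the authors' code: the token layout (a dict of tag type to value), the function name `window_features` and the sample tags are assumptions made purely for illustration.

```python
from typing import Dict, List

# Hypothetical token layout: tag type ("lex", "gr", "sem") -> tag value.
Token = Dict[str, str]


def window_features(tokens: List[Token], target: int, left: int, right: int,
                    tag_types: List[str]) -> List[str]:
    """Collect tags of the chosen types from the window [-left; +right]."""
    start = max(0, target - left)
    end = min(len(tokens), target + right + 1)
    features = []
    for i in range(start, end):
        if i == target:
            continue  # the target word itself is not a context marker
        for tag_type in tag_types:
            value = tokens[i].get(tag_type)
            if value:
                features.append(f"{tag_type}={value}")
    return features


# Illustrative fragment "во главе с председателем"; the tags are made up.
context = [
    {"lex": "во", "gr": "PR"},
    {"lex": "глава", "gr": "S", "sem": "r:concr t:hum"},
    {"lex": "с", "gr": "PR"},
    {"lex": "председатель", "gr": "S", "sem": "r:concr t:hum"},
]

# lex+sem combination with an asymmetric window [-1; +2]
print(window_features(context, target=1, left=1, right=2, tag_types=["lex", "sem"]))
```

Switching between criteria such as lex, lex+sem or lex+sem+gr then amounts to changing the `tag_types` argument.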
The linguistic information is analyzed as follows. At the preprocessing stage, the number of contexts for each meaning of the target word is determined in the experimental sample. For each meaning, a training sample is formed (randomly selected contexts with resolved ambiguity in which the meaning in question is realized) together with a test sample (contexts for which automatic disambiguation is performed without any a priori linguistic information). At the machine learning stage, statistical patterns of the meanings of the target word are formed. The pattern of a meaning is a vector in a vector space whose coordinates are determined by the frequencies of lex, gr or sem tags in the training sample. The distributions of tags of different types in the sample are established. At the pattern recognition stage, test contexts are represented as vectors in the same vector space. The distance between each context vector and each of the meaning patterns is measured. The similarity measure chosen is Cos(v1, v2), the cosine of the angle between the vectors, see formula (1):
\[ \cos(v_1, v_2) = \frac{v_1 \cdot v_2}{\lVert v_1 \rVert \, \lVert v_2 \rVert} \qquad (1) \]
The pattern closest to the context vector is selected, and the target word in the context is assigned the meaning of that nearest pattern.
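The classification step described above can be sketched as a nearest-pattern classifier with cosine similarity. This is an illustrative reconstruction under stated assumptions, not the tool itself: contexts are given as lists of tag strings, and the names `build_patterns`, `cosine` and `classify`, as well as the toy training data, are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def build_patterns(training: Dict[str, List[List[str]]]) -> Dict[str, Counter]:
    """One tag-frequency vector (pattern) per meaning, summed over its training contexts."""
    return {meaning: Counter(tag for ctx in contexts for tag in ctx)
            for meaning, contexts in training.items()}


def cosine(v1: Counter, v2: Counter) -> float:
    """Cosine of the angle between two sparse frequency vectors."""
    dot = sum(v1[k] * v2[k] for k in v1.keys() & v2.keys())
    norm1 = math.sqrt(sum(x * x for x in v1.values()))
    norm2 = math.sqrt(sum(x * x for x in v2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


def classify(context: List[str], patterns: Dict[str, Counter]) -> str:
    """Assign the meaning whose pattern is closest to the context vector."""
    vec = Counter(context)
    return max(patterns, key=lambda meaning: cosine(vec, patterns[meaning]))


# Toy training data (made up): tag lists per meaning label.
training = {
    "m2": [["lex=во", "lex=с"], ["lex=во", "lex=с", "lex=колхоз"]],
    "m4": [["lex=администрация", "lex=город"], ["lex=государство"]],
}
patterns = build_patterns(training)
print(classify(["lex=во", "lex=с", "lex=председатель"], patterns))
```

Weighting of context elements, which the tool supports as an option, is left out of this sketch.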
Next, the quality of word sense disambiguation is checked: the results of automatic and manual processing of contexts are compared, and two scores are estimated, precision P (the share of contexts in the test sample for which the meaning of the target word was recognized correctly) and recall R (the share of contexts in the test sample for which a decision, correct or erroneous, was made).
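The two scores can be computed directly from these definitions. The sketch below follows the wording of the text literally (P counts correct decisions over all test contexts, R counts contexts that received any decision); the function name and the convention that `None` stands for "no decision" are assumptions.

```python
from typing import List, Optional, Tuple


def precision_recall(gold: List[str],
                     predicted: List[Optional[str]]) -> Tuple[float, float]:
    """P and R as defined in the text; None marks a context left undecided."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and of equal length")
    total = len(gold)
    correct = sum(1 for g, p in zip(gold, predicted) if p == g)
    decided = sum(1 for p in predicted if p is not None)
    return correct / total, decided / total


# Hypothetical test sample of four contexts.
gold = ["m2", "m4", "m4", "m2"]
predicted = ["m2", "m5", None, "m2"]
print(precision_recall(gold, predicted))  # (0.5, 0.75)
```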
Automatic construction identification is based on statistical data about the co-occurrence of target words with the contextual markers of their meanings: lex, gr and sem tags. The co-occurrence information is extracted from the training sample. The program outputs a list of frequent constructions (combinations of the target word with statistically significant left-hand and right-hand contextual markers), together with the frequency of each construction and lists of the lexemes that realize the contextual markers within the constructions.
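A rough sketch of such a frequency list is given below. It assumes immediate left and right neighbours only, and it replaces the test of statistical significance (which the text does not specify) with a simple frequency threshold `min_count`; the function name, the token layout and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple

Token = Dict[str, str]  # hypothetical layout: tag type -> value


def extract_constructions(contexts: List[Tuple[List[Token], int]],
                          tag_type: str = "sem",
                          min_count: int = 2) -> List[Tuple[str, str, int, List[str]]]:
    """Count neighbour tags per side and list the lexemes realizing each tag."""
    counts: Counter = Counter()
    lexemes: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for tokens, target in contexts:
        for side, pos in (("left", target - 1), ("right", target + 1)):
            if 0 <= pos < len(tokens) and tag_type in tokens[pos]:
                key = (side, tokens[pos][tag_type])
                counts[key] += 1
                lexemes[key].add(tokens[pos].get("lex", ""))
    # Frequency threshold used here as a stand-in for a significance test.
    return [(side, tag, n, sorted(lexemes[(side, tag)]))
            for (side, tag), n in counts.most_common() if n >= min_count]


def ctx(neighbour: str, sem: str) -> Tuple[List[Token], int]:
    """Build a two-token toy context: target word глава plus a right neighbour."""
    return ([{"lex": "глава"}, {"lex": neighbour, "sem": sem}], 0)


sample = [
    ctx("государство", "r:concr t:space"),
    ctx("федерация", "r:concr t:space"),
    ctx("город", "r:concr t:space sc:constr"),
]
print(extract_constructions(sample))
```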
165
LA FILOLÓGICA POR LA CAUSA
4. Experiment parameters
To determine the best parameters for automatic word sense disambiguation and construction identification, over 6000 experiments were carried out. They were aimed at (1) establishing the correlation between lex, gr and sem tags, assessing the reliability of the different criteria (lex, gr, sem and their combinations lex+gr, lex+sem, sem+gr, lex+sem+gr) and determining the conditions of their application, and (2) evaluating a number of parameters that may affect the results of the experiments (width of the context window, size of the training samples, etc.).
The comparative study of the criteria for automatic word sense disambiguation and construction identification was aimed at establishing their reliability, i.e. which criterion yields the best precision P and recall R. The most reliable criterion proved to be the combination of lemma tags and semantic tags (lex+sem) (P≈87...89%, R≈95%), and the least reliable was isolated morphological tags (gr). Results acceptable in terms of precision and recall were also obtained with the combination of all three tag types (lex+sem+gr) and with isolated lemma tags (lex).
Experiments were carried out with a varying width of the context window [-l; +r] (l, r ≤ 5); both symmetric and asymmetric context windows were allowed, corresponding to a syntagm or a syntactic group. The best values of the context window width were assessed with the F-measure (2), which takes precision P and recall R into account simultaneously:
\[ F = \frac{2}{1/P + 1/R} \qquad (2) \]
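As a worked illustration of formula (2), the sketch below computes F as the harmonic mean of P and R and picks the window with the highest F. The helper name `f_measure` and the per-window scores are hypothetical; only the formula itself comes from the text.

```python
from typing import Dict, Tuple


def f_measure(p: float, r: float) -> float:
    """Harmonic mean of precision and recall, F = 2 / (1/P + 1/R)."""
    if p <= 0 or r <= 0:
        return 0.0
    return 2 / (1 / p + 1 / r)


# With P = 0.88 and R = 0.95, F is about 0.914.
print(round(f_measure(0.88, 0.95), 3))

# Hypothetical (P, R) scores per context window (l, r); not taken from the paper.
scores: Dict[Tuple[int, int], Tuple[float, float]] = {
    (1, 1): (0.70, 0.90),
    (2, 4): (0.86, 0.95),
    (5, 5): (0.80, 0.96),
}
best_window = max(scores, key=lambda w: f_measure(*scores[w]))
print(best_window)
```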
The best values of the context window width were determined in experiments with different types of tags. For example, in experiments with lemma tags (lex) the best results were obtained with the context window [-4; +5]. When all three tag types were used (lex+sem+gr), the best context windows were [-2; +4] and [-3; +4]. Contexts of this size are most characteristic of nouns, since they contain the co-occurrence information that plays an important role in determining the meaning of the target word in context. Such a context window most often corresponds to syntactic groups placed before the target word (adjectival groups) or after it (nominal, infinitival and other groups). The analysis of contexts performed during the experiments suggests that taking the boundaries of syntactic groups into account raises the precision P of automatic word sense disambiguation by 0.05 to 0.1.
The experiments show that high-quality automatic word sense disambiguation (P≈0.85 on average, P≈0.95...1 in some cases) is achievable provided that the types of contextual markers (tags) and the width of the context window are chosen appropriately and the training sample is large enough (100 to 500 contexts). Tests were run with gradually increasing training samples (10, 15, 55, 75, 100, 200, 500 ... contexts), with the size of the training samples varying in proportion to the total number of contexts for each of the meanings considered (10%, 15%, 20%). According to our observations, a training sample should contain at least 100 contexts, and the best results are obtained with samples of about 500 contexts. In general, the training sample should amount to at least 20% of the total sample of contexts for the target word; otherwise the patterns formed for individual meanings may turn out to be blurred, which lowers the quality of automatic word sense disambiguation.
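The sampling recommendations can be sketched as a random split with a size check. The function names, the fixed random seed and the way the two thresholds (100 contexts, 20%) are combined into one check are assumptions made for illustration; the text states the thresholds but not an implementation.

```python
import random
from typing import List, Sequence, Tuple


def split_sample(contexts: Sequence[str], fraction: float = 0.2,
                 seed: int = 0) -> Tuple[List[str], List[str]]:
    """Randomly select a share of contexts for training; the rest form the test sample."""
    rng = random.Random(seed)
    indices = list(range(len(contexts)))
    rng.shuffle(indices)
    n_train = max(1, round(len(contexts) * fraction))
    train_idx = set(indices[:n_train])
    train = [c for i, c in enumerate(contexts) if i in train_idx]
    test = [c for i, c in enumerate(contexts) if i not in train_idx]
    return train, test


def meets_recommendation(n_train: int, n_total: int) -> bool:
    """At least 100 training contexts and at least 20% of all contexts."""
    return n_train >= 100 and n_train >= 0.2 * n_total


contexts = [f"context_{i}" for i in range(500)]  # placeholder context identifiers
train, test = split_sample(contexts, fraction=0.2)
print(len(train), len(test), meets_recommendation(len(train), len(contexts)))
```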
5. Results of the experiments on automatic word sense disambiguation
Using the WSD and CxI software tool, series of experiments were carried out on automatic disambiguation of the polysemous words under study. Examples of the program output are given in Table 1.
In context [1] the meaning of the target word is recognized correctly, whereas example [2] is interpreted incorrectly. The erroneous decisions are probably due to a contextual environment that is insufficient for identifying the meaning.
Table 1. Examples of automatic processing of contexts of the word глава with the tag combination lex+sem+gr
Context | Original meaning | Recognized meaning | Cos | Context window width
[1] За столом собралось все взрослое население Николаевки во главе с председателем местного колхоза. ('The whole adult population of Nikolaevka gathered at the table, headed by the chairman of the local collective farm.') | m2 | m2 | 0.555 | [-3; +5]
[2] Вместо администрации одного сельского округа будет 10-15 глав администрации во входящих в него деревнях - насколько же будут раздуты штаты управленческого аппарата? ('Instead of the administration of a single rural district there will be 10-15 heads of administration in the villages it comprises: how inflated will the administrative staff become?') | m4 | m5 | 0.112 | [-1; 0]
The main results obtained in the experiments concern the identification and systematization of the different types of contextual markers of the meanings of target words. Contextual markers such as lex and sem tags are of the greatest interest. First, lex tags are ranked by their frequency in the contextual environment of the target word. Second, the values of the contextual markers (lex tags) are generalized on the basis of their lexical-semantic annotation, and the correspondence between sem tags and the lemmas that realize them is established. For example, left-hand neighbours of the word лук such as огурец (r:concr t:fruit t:food), орех (r:concr t:fruit t:food pt:part pc:plant) and картошка (r:concr t:fruit t:food pt:aggr sc:fruit) can be assigned to a single group of concrete nouns denoting food products. As a result, a table of frequent sets of lex and sem tags is compiled for each meaning of the target word (see, for example, Table 2).
From the co-occurrence data for the word глава presented in Table 2, one can establish the contextual markers of the meaning руководитель, начальник, старший по положению ('leader, chief, senior in rank'): the lexemes государство (r:concr t:space), федерация (r:concr t:space), регион (r:concr t:space pt:part pc:space), город (r:concr t:space sc:constr), фонд (r:concr t:space pt:set sc:money), etc. occurred. These contextual markers can be combined into the group of concrete nouns of space and place.
Table 2. Sample analysis of the right-hand neighbours of the word глава in the meaning руководитель, начальник, старший по положению
Meaning | Lexical-semantic annotation | Example | Number of contexts (out of the total number of contexts)
m4. Руководитель, начальник, старший по положению | Right-hand neighbours: r:concr t:space (concrete noun, space and place) | | 51 (of 363)
 | r:concr t:space | из рук главы государства; глава федерации футбола | 41 (of 44)
 | r:concr t:space pt:part pc:space (part of a space) | глава региона | 5
 | r:concr t:space sc:constr (structure) | главы города | 3
 | r:concr t:space pt:set sc:money (set of objects: money) | глава фонда | 2
The observations made while generalizing contextual markers (lex tags) to lexical-semantic classes support the law of semantic agreement (Gak 1972). Semantic agreement is a formal means of organizing an utterance which requires that at least one of the semantic features of the words linked by contextual relations be duplicated. For example, meaning m4 of the word глава is defined in the lexical-semantic annotation as r:concr t:hum (concrete noun, person), and the tag t:hum (person) is part of most contextual markers of this meaning. In contexts with the word глава the seme 'person' is expressed explicitly in the semantic features of the left-hand neighbours r:propn t:hum, r:concr t:hum and r:concr t:hum d:nag der:v, while in other cases it is implicitly part of the lexical-semantic annotation. Thus the seme 'person' is implied in the meaning of verbs of speech (t:speech) and in the meanings of words characterized by the tags t:org (organization), t:group (group) and t:action (event).
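The generalization from individual markers to a lexical-semantic class can be sketched as a filter over annotated rows. The function name `generalize` and the row layout are hypothetical; the annotations and context counts are the ones listed for the right-hand neighbours of глава in Table 2.

```python
from typing import List, Tuple

# (example lexemes, lexical-semantic annotation, number of contexts), after Table 2.
ROWS: List[Tuple[str, str, int]] = [
    ("государство, федерация", "r:concr t:space", 41),
    ("регион", "r:concr t:space pt:part pc:space", 5),
    ("город", "r:concr t:space sc:constr", 3),
    ("фонд", "r:concr t:space pt:set sc:money", 2),
]


def generalize(rows: List[Tuple[str, str, int]],
               required: List[str]) -> Tuple[int, List[str]]:
    """Sum context counts over all rows whose annotation contains every required tag."""
    total, examples = 0, []
    for example, annotation, count in rows:
        if set(required) <= set(annotation.split()):
            total += count
            examples.append(example)
    return total, examples


# All four rows share r:concr t:space, which gives the class total of 51 contexts.
print(generalize(ROWS, ["r:concr", "t:space"]))
# A narrower class keeps only the matching row.
print(generalize(ROWS, ["pt:part"]))
```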
6. Results of the experiments on construction identification
The experiments on automatic construction identification are carried out in several stages. First, a list of contexts of use is compiled for each meaning of the target words under consideration; then the most frequent lexical-semantic and morphological information about the contextual markers of the meaning within a given window is extracted automatically from the contexts. Constructions are identified within the context window [-1; +1], where there is a high probability of encountering context elements that form stable word combinations with the words under study. Next, morphological models of constructions are formed and their lexical-semantic filling is described. The method of linguistic analysis of the morphological models of constructions and their lexical-semantic filling used in our study draws on experience in analysing the co-occurrence restrictions of words of different parts of speech (Mitrofanova, Belik, Kadina 2008). Finally, the results are verified: the resulting lists of constructions are compared with lists of collocations produced by S. A. Sharoff's bigram search service (http://corpus.leeds.ac.uk/ruscorpora.html).
During data processing, characteristic morphological models of constructions were identified for each meaning of the words analysed. The following morphological models of constructions with adjectives were obtained: A + S, V + A, A + A, S + A, Adv + A. For the verb прописать in the meanings m2 (назначить какое-н. лекарство или лечение больному, 'to prescribe a medicine or treatment to a patient') and m5 (сделать запись где-л., 'to make an entry somewhere') the main type of construction is V + S;acc, and in the meaning m1 (оформить официальной записью проживание кого-н. где-н., 'to officially register someone's residence somewhere') the types are V + (S;t:hum);acc, V + Adv, APRO;t:place r:rel: + V.
The lexical-semantic filling of these models for individual meanings of the adjective верный and of the verb занести is shown in Tables 3-4 (lex and sem tags are the left-hand and right-hand markers of the corresponding meanings of the target words).
It was established that certain variants of the lexical-semantic filling of constructions are highly stable. Thus, according to S. A. Sharoff's bigram search service (http://corpus.leeds.ac.uk/ruscorpora.html), the left-hand and right-hand markers of the meanings of target words within constructions include components of collocations with a high Log-Likelihood (LL) score. Some of the collocations are recorded in the MAS and BAS dictionaries as stable combinations (given with the mark ◊ in BAS) and as phraseological units (given with the mark ◊ in MAS and with the mark ~ in BAS). For example, the construction верный + себе (LL = 47.27) is present in both MAS and BAS, and the construction занести + снег (LL = 346.32) is mentioned in BAS.
Table 3. Lexical-semantic filling of constructions with the adjective верный in particular meanings
Meaning | Lexical-semantic annotation | Left-hand markers | Right-hand markers
m1. Надежный, прочный, стойкий, преданный ('reliable, firm, steadfast, devoted') | ev:posit t:humq r:qual | t:loc: остаться; r:spec: самый; r:ref: себя; t:ment: знать | t:hum:kin r:concr: сын, жена, муж; r:abstr: служба; t:animal r:concr: пес; t:space r:concr: путь, место
m3. Несомненный, неизбежный ('certain, inevitable') | t:mod r:qual | t:be:exist: быть; r:rel t:dir: откуда | t:be:disapp r:abstr: гибель; t:asp r:abstr: способ; t:space r:concr t:fam: [...]
Table 4. Lexical-semantic filling of constructions with the verb занести in particular meanings
Meaning | Lexical-semantic annotation | Left-hand markers | Right-hand markers
m7. Отклонить, резко повернуть в сторону или сильно накренить (при движении) ('to deflect, to swerve sharply or to tilt strongly, in motion') | der:v ca:noncaus t:move d:pref | t:tool:device:machine r:concr t:fam: машина | t:tool:device:machine r:concr t:fam: машина
m8. (1-е и 2-е лицо не употр.) Кого-что. Засыпать, замести ('not used in the 1st and 2nd person; to cover, to drift over someone or something') | der:v ca:O t:changest d:pref | t:stuff r:concr t:weather: снег; r:abstr t:weather: метель, снегопад, непогода | r:abstr top:stripe t:space r:concr t:fam: дорога; t:space r:concr pt:part pc:space: улица; top:stripe t:space r:concr: тропа, путь
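For readers unfamiliar with the LL score, the sketch below computes the standard log-likelihood ratio for a bigram from a 2x2 contingency table. That the bigram search service uses exactly this formulation is an assumption, and the counts in the usage example are invented; they are not the corpus frequencies behind the LL values quoted in the text.

```python
import math


def log_likelihood(c12: int, c1: int, c2: int, n: int) -> float:
    """Log-likelihood ratio G2 = 2 * sum(O * ln(O / E)) over a 2x2 table.

    c12: frequency of the bigram (word1, word2)
    c1, c2: frequencies of word1 and word2
    n: total number of bigrams in the corpus
    """
    observed = [[c12, c1 - c12],
                [c2 - c12, n - c1 - c2 + c12]]
    rows = [sum(row) for row in observed]
    cols = [observed[0][j] + observed[1][j] for j in range(2)]
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / n
                g2 += o * math.log(o / expected)
    return 2 * g2


# Invented counts: a pair that co-occurs far more often than chance predicts.
print(round(log_likelihood(c12=50, c1=100, c2=100, n=1000), 2))
```

A score near zero means the two words co-occur about as often as expected by chance; larger values indicate a stronger association.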
The experiments showed that almost every sense of a polysemous word has constructions characteristic of that particular sense. Information of this kind can later be used to build various dictionaries of constructions.
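The observation that a sense tends to have its own characteristic constructions can be read operationally: a marker is treated as characteristic of a sense if, in sense-labelled contexts, it occurs with that sense only. The sketch below works under that assumption. The function name, the frequency threshold, and the toy triples are hypothetical and do not reproduce the authors' software or data; the tags merely imitate the style of the corpus annotation.

```python
from collections import defaultdict


def sense_specific_markers(contexts, min_count=2):
    """Return markers that occur with exactly one sense.

    contexts  - iterable of (sense, side, marker) triples, where side is
                "left" or "right" relative to the target word
    min_count - minimum frequency for a marker to be kept (assumed value)
    """
    # counts[(side, marker)][sense] = number of occurrences
    counts = defaultdict(lambda: defaultdict(int))
    for sense, side, marker in contexts:
        counts[(side, marker)][sense] += 1

    result = defaultdict(list)
    for key, by_sense in counts.items():
        if len(by_sense) == 1:  # the marker is seen with one sense only
            sense, count = next(iter(by_sense.items()))
            if count >= min_count:
                result[sense].append(key)
    return {sense: sorted(markers) for sense, markers in result.items()}


# Toy, made-up sample of labelled contexts for one target word.
toy_contexts = [
    ("1", "right", "t:hum:kin"), ("1", "right", "t:hum:kin"),
    ("1", "left", "r:spec"), ("3", "left", "r:spec"),
    ("3", "right", "t:be:disapp"), ("3", "right", "t:be:disapp"),
    ("3", "right", "t:animal"),
]
print(sense_specific_markers(toy_contexts))
```

Markers shared by several senses (here `r:spec`) are dropped, and the threshold filters out one-off occurrences. A real procedure would more likely weight markers statistically than apply a strict exclusivity test.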
7. Conclusion
The article presents the main results obtained while developing procedures for automatic word sense disambiguation of target words and for identifying constructions in contexts from the Russian National Corpus (RNC; Russian abbreviation НКРЯ): the software for processing the linguistic material is described, and the parameters of the study and the data obtained in the experiments with RNC contexts are analysed.
The main outcome of the study is confirmation that the type and granularity of the linguistic annotation of RNC contexts make it possible to form a set of contextual markers for a given sense from samples of contexts; to generalise the data on contextual markers in terms of their membership in lexico-semantic classes; to describe the classes of constructions associated with a given sense; and to use the resulting co-occurrence model to build a catalogue of constructions automatically on the basis of the RNC.
Азарова И. В., Бичинева С. В., Вахитова Д. Т. Автоматическое разрешение лексической неоднозначности частотных существительных (в терминах структурных единиц RussNet) // Труды Междунар. конф. «Корпусная лингвистика-2008». СПб., 2008.
Азарова И. В., Марина А. С. Автоматизированная классификация контекстов при подготовке данных для компьютерного тезауруса RussNet // Компьютерная лингвистика и интеллектуальные технологии: Труды Междунар. конф. «Диалог-2006». М., 2006.
Большакова Е. И., Баева Н. В., Бордаченкова Е. А., Васильева Н. Э., Морозов С. С. Лексико-синтаксические шаблоны в задачах автоматической обработки текста // Компьютерная лингвистика и интеллектуальные технологии: Труды Междунар. конф. «Диалог-2007». М., 2007.
Борисова Е. Г. Коллокации. Что это такое и как их изучать. М., 1995.
Гак В. Г. К проблеме семантической синтагматики // Проблемы структурной лингвистики. М., 1972.
Гельбух А. Ф., Сидоров Г. О., Эрнандес-Рубио Э., Чубукова М. В. Словари сочетаемости слов: какой метод составления лучше? // Компьютерная лингвистика и интеллектуальные технологии: Труды Междунар. конф. «Диалог-2004». М., 2004.
Захаров В. П., Хохлова М. В. Анализ эффективности статистических методов выявления коллокаций в текстах на русском языке // Компьютерная лингвистика и интеллектуальные технологии: Труды Междунар. конф. «Диалог-2010». М., 2010.
Иорданская Л. Н., Мельчук И. А. Смысл и сочетаемость в словаре. М., 2007.
Кобрицов Б. П., Ляшевская О. Н., Шеманаева О. Ю. Снятие лексико-семантической омонимии в новостных и газетно-журнальных текстах: поверхностные фильтры и статистическая оценка // Интернет-математика 2005: автоматическая обработка веб-данных. М., 2005.
Кузнецова Ю. Л. Грамматика конструкций. Обзор // Научно-техническая информация. 2007. Сер. 2. № 4.
Лукашевич Н. В., Чуйко Д. С. Автоматическое разрешение лексической многозначности на базе тезаурусных знаний // Интернет-математика 2007: сб. работ участников конкурса. Екатеринбург, 2007.
Ляшевская О. Н., Кузнецова Ю. Л. Русский Фреймнет: к задаче создания корпусного словаря конструкций // Компьютерная лингвистика и интеллектуальные технологии: Труды Междунар. конф. «Диалог-2009». М., 2009.
Митрофанова О. А., Белик В. В., Кадина В. В. Корпусное исследование сочетаемостных предпочтений частотных лексем русского языка // Компьютерная лингвистика и интеллектуальные технологии: Труды Междунар. конф. «Диалог-2008». М., 2008.
Митрофанова О. А., Грачкова М. А., Шиморина А. С., Ляшевская О. Н. Лексические, семантические и морфологические признаки контекстов в разрешении неоднозначности русских существительных // XXXIX Междунар. филол. конф. Секция математической лингвистики. СПб., 2010.
Митрофанова О. А., Ляшевская О. Н., Паничева П. В. Эксперименты по статистическому разрешению лексико-семантической неоднозначности русских имен существительных в корпусе // Труды Междунар. конф. «Корпусная лингвистика-2008». СПб., 2008.
Митрофанова О. А., Паничева П. В., Ляшевская О. Н. Статистическое разрешение лексико-семантической неоднозначности в контекстах для предметных имен существительных // Компьютерная лингвистика и интеллектуальные технологии: Труды Междунар. конф. «Диалог-2008». М., 2008.
Рахилина Е. В., Кобрицов Б. П., Кустова Г. И., Ляшевская О. Н., Шеманаева О. Ю. Многозначность как прикладная проблема: лексико-семантическая разметка в Национальном корпусе русского языка // Компьютерная лингвистика и интеллектуальные технологии: Труды Междунар. конф. «Диалог-2006». М., 2006.
Словарь русского языка: в 4 т. / под ред. А. П. Евгеньевой. 2-е изд., испр. и доп. М., 1981-1984 (in the text: MAS).
Словарь современного русского литературного языка: в 17 т. / под ред. В. И. Чернышева. М.; Л., 1948-1965 (in the text: BAS).
Толдова С. Ю., Кустова Г. И., Ляшевская О. Н. Семантические фильтры для разрешения многозначности в Национальном корпусе русского языка: глаголы // Компьютерная лингвистика и интеллектуальные технологии: Труды Междунар. конф. «Диалог-2008». М., 2008.
Ягунова Е. В., Пивоварова Л. М. От коллокаций к конструкциям // Русский язык: конструкционные и лексико-семантические подходы. СПб., 2011.
Fillmore Ch. J. The Mechanisms of Construction Grammar // Proceedings of the Berkeley Linguistics Society. 1988. Vol. 14.
Goldberg A. E. Constructions at Work: the Nature of Generalization in Language. Oxford, 2006.
Goldberg A. E. Constructions: a Construction Grammar Approach to Argument Structure. Chicago (Ill.); London, 1995.
Yarowsky D., Florian R. Evaluating Sense Disambiguation Across Diverse Parameter Spaces // Natural Language Engineering. 2002. Vol. 8(4).
Kustova G. I., Lashevskaja O. N., Paducheva E. V., Rakhilina E. V. Verb Taxonomy: From Theoretical Lexical Semantics to Practice of Corpus Tagging // Cognitive Corpus Linguistics Studies / ed. by B. Lewandowska, K. Dziwirek. Frankfurt, 2009.
Leacock C., Chodorow M., Miller G. Using Corpus Statistics and WordNet Relations for Sense Identification // Computational Linguistics. 1998. Vol. 24(1).
Automatic Word Sense Disambiguation and Construction Identification Based on Corpus Multilevel Annotation / O. Lyashevskaya, O. Mitrofanova, M. Grachkova, S. Romanov, A. Shimorina, A. Shurygina // Text, Speech and Dialogue. Proceedings of the 14th International Conference TSD 2011, Pilsen, Czech Republic, September 1-5, 2011. Pilsen, 2011.
Manning C., Schütze H. Collocations // Foundations of Statistical NLP. 2002.
Mihalcea R. Word Sense Disambiguation Using Pattern Learning and Automatic Feature Selection // Journal of Natural Language and Engineering. 2002. Vol. 1(1).
Mitrofanova O., Lyashevskaya O. Disambiguation of Taxonomy Markers in Context: Russian Nouns // 17th Nordic Conference of Computational Linguistics NODALIDA-2009, Odense, Denmark, May 14-16, 2009.
Mitrofanova O., Panicheva P., Lashevskaya O. Statistical Word Sense Disambiguation in Contexts for Russian Nouns Denoting Physical Objects // Text, Speech and Dialogue. Proceedings of the 11th International Conference TSD 2008, Brno, Czech Republic, September 8-12, 2008. Brno, 2008.
Navigli R. Word Sense Disambiguation: a Survey // ACM Computing Surveys. 2009. Vol. 41(2).
Pedersen T. A Baseline Methodology for Word Sense Disambiguation // CICLing. LNCS. Vol. 2276 / ed. by A. F. Gelbukh. Heidelberg, 2002.
Proceedings of the NAACL HLT Workshop on Extracting and Using Constructions in Computational Linguistics. Los Angeles (CA), 2010.
Sahlgren M., Knutsson O. Workshop on Extracting and Using Constructions in NLP // NODALIDA'09: SICS Technical Report. Odense, 2009.
Schütze H. Automatic Word Sense Disambiguation // Computational Linguistics. 1998. Vol. 24(1).
Shimorina A., Grachkova M. Identification of Context Markers for Russian Nouns // 18th Nordic Conference of Computational Linguistics NODALIDA 2011, Riga, Latvia, May 11-13. Riga, 2011.
Tomasello M. Constructing a Language: A Usage-Based Approach to Child Language Acquisition. Cambridge (MA), 2003.
Wible D., Tsao N.-L. StringNet as a Computational Resource for Discovering and Investigating Linguistic Constructions // Proceedings of the NAACL HLT Workshop on Extracting and Using Constructions in Computational Linguistics. Los Angeles (CA), 2010.
Word Sense Disambiguation: Algorithms and Applications. Text, Speech and Language Technology / ed. by E. Agirre, Ph. Edmonds. Vol. 33. Berlin; Heidelberg; New York, 2007.
Scholarly edition
STRUCTURAL AND APPLIED LINGUISTICS
Interuniversity collection
Issue 9
Editor: L. A. Karpova
Computer typesetting: E. M. Voronkova
Signed for printing 13.07.12. Format 60×84 1/16.
Offset printing. Offset paper.
Conventional printed sheets: 20.69. Print run: 250 copies. Order 2.f.C
St. Petersburg University Press.
199004, St. Petersburg, V.O., 6th Line, 11/21.
Tel. (812) 328-96-17; fax (812) 328-44-22
E-mail: [email protected]
www.unipress.ru
Printing house of the St. Petersburg State University publishing house:
199061, St. Petersburg, Sredny pr., 41.