NooJ 2012, Paris
2012-06-14
Derivation of Adjectives from Proper Nouns
Kristina Vučković, Sara Librenjak, Zdravko Dovedan Han
University of Zagreb, Faculty of Humanities and Social Sciences
Department of Information and Communication Sciences
{kvuckovi, slibrenj, zdovedan}@ffzg.hr
Overview of the work
1. Purpose of the research
2. Proper nouns in Croatian
   1. Proper names
      1. Male names
      2. Female names
   2. Surnames
      1. Derivations of female surnames
   3. Toponyms and geographical terms
   4. Acronyms
3. Results
4. Pending work and issues
Purpose of the research
42 420 proper nouns in the Croatian dictionary
   560 779 inflected forms
   an important part of the dictionary
Main goal:
   a NooJ grammar would save dictionary space and work time
Minor goal:
   semantic marking of adjectives derived from proper nouns could be useful for research concerning syntax and semantics
Proper nouns and adjectives derivation (1)
First and last names of people
   female names (ending in -a or with a zero ending)
      Ana-Anin, Ines-Inesin, Mia-Mijin, Ankica-Ankičin
   male names (various endings)
      Ivan-ov, Luka-Lukin, Ivica-Ivičin, Matej-ev, Harry-jev
   surnames (common endings: -ić, -ar, -ač, -j, -lj, -š, -ac, -ec, -ak, -v, -o)
      Anić-ev, Stipaničev-ljev, Debeljak-ov, Posavec-Posavčev, Varga-Vargin, Krišto-v, Rahimovski-jev, Mance-tov
   female derivations of surnames (-ka, -a, -ova, -eva)
      Anićka-Anićkin, Novakova, Zagorčeva, Pavlova
   foreign names: derived according to pronunciation
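
The name patterns on this slide can be approximated with a few suffix rules. The following is a minimal Python sketch, not the NooJ grammar itself: the function name `possessive_adjective`, the `female` flag, and the assumption that palatal-final names take -ev are illustrative choices inferred from the examples above. Endings it does not cover (for instance Mance-tov, or foreign names that follow pronunciation) raise an error rather than guess.

```python
# Final letters assumed here to select -ev instead of -ov (lj and nj end in j).
PALATALS = "ćčđjšž"


def possessive_adjective(name: str, female: bool = False) -> str:
    """Derive a possessive adjective from a personal name (illustrative rules only)."""
    lower = name.lower()
    if lower.endswith("ica"):
        # Ankica -> Ankičin, Ivica -> Ivičin
        return name[:-3] + "ičin"
    if lower.endswith("a"):
        stem = name[:-1]
        # Mia -> Mijin (j inserted after i); Ana -> Anin, Luka -> Lukin
        return stem + ("jin" if stem.lower().endswith("i") else "in")
    if lower.endswith("o"):
        # Krišto -> Krištov
        return name + "v"
    if lower.endswith(("i", "y")):
        # Harry -> Harryjev, Rahimovski -> Rahimovskijev
        return name + "jev"
    if lower[-1] in "eu":
        raise ValueError(f"no rule in this sketch for {name!r}")
    if female:
        # Female names with a zero ending: Ines -> Inesin
        return name + "in"
    if lower.endswith(("ac", "ec")):
        # Posavec -> Posavčev (vowel dropped, c -> č)
        return name[:-2] + "čev"
    if lower[-1] in PALATALS:
        # Anić -> Anićev, Matej -> Matejev
        return name + "ev"
    # Ivan -> Ivanov, Debeljak -> Debeljakov
    return name + "ov"
```

For example, `possessive_adjective("Ines", female=True)` gives `Inesin`, while `possessive_adjective("Ivan")` gives `Ivanov`. The gender flag is needed because consonant-final female and male names take different suffixes.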
Proper nouns and adjectives derivation (2)
Toponyms, geographic adjectives
   commonly ending in -ski, -čki, -ški
      Split-ski, Kutina-Kutinski, Buje-Bujanski
      Amerika-američki, Karlovac-karlovački, ! Zagreb+a+čki
      Pag-paški, Požega-požeški, Komiža-komiški
Names of languages
   from toponyms or nationalities, ending in -ski
      Portugal-ski, Francuz-francuski, Kinez-kineski
Acronyms
   syntactic grammar (hyphen)
      HDZ-ov, FIFA-in
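
The choice between -ski, -čki and -ški in the toponym examples above follows the final consonant of the stem. The sketch below illustrates that idea in Python under stated assumptions: `toponym_adjective` and the `IRREGULAR` table are hypothetical names, the consonant classes are inferred only from the examples on this slide, and the output is lower-cased. It is not the grammar used in the presented work.

```python
# Forms that the suffix rules below would get wrong, taken from the slide.
IRREGULAR = {"Zagreb": "zagrebački", "Buje": "bujanski"}


def toponym_adjective(toponym: str) -> str:
    """Derive a relational adjective from a toponym or nationality (illustrative rules only)."""
    if toponym in IRREGULAR:
        return IRREGULAR[toponym]
    stem = toponym.lower()
    if stem.endswith("a"):
        # Drop the final -a: Amerika -> amerik, Požega -> požeg
        stem = stem[:-1]
    last = stem[-1]
    if last in "kc":
        # k, c + ski -> čki: američki, karlovački
        return stem[:-1] + "čki"
    if last in "gž":
        # g, ž + ski -> ški: paški, požeški, komiški
        return stem[:-1] + "ški"
    if last in "sz":
        # s, z merge with the suffix: francuski, kineski
        return stem[:-1] + "ski"
    # Default: splitski, kutinski, portugalski
    return stem + "ski"
```

Keeping the irregular forms in a lookup table mirrors the slide's own marking of Zagreb as an exception; any real coverage would need a much larger list.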
Results
Tested on a corpus of 752 sentences with 972 instances of adjectives derived from proper nouns
   Precision: 0.971
   Recall: 0.958
   F-measure: 0.965
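
As a quick check of how these three figures relate, the F-measure is conventionally the harmonic mean of precision and recall. The short Python sketch below assumes that standard (balanced, F1) definition, which the slides do not state explicitly; `f_measure` is an illustrative helper name.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Rounded values reported on the slide.
f = f_measure(0.971, 0.958)
print(round(f, 4))
```

With the rounded inputs this gives about 0.9645, close to the reported 0.965; the small difference is presumably due to the reported score being computed from unrounded precision and recall.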
Pending work and issues
Recall issues:
   irregular adjectives
      Italija->talijanski; Osijek->osječki, Bog->Božji
   compound adjectives
      Bosna i Hercegovina->bosanskohercegovački
   nationalities, names of institutions: not yet added
Precision issues:
   toponyms ending in -ska/-čka: Češka, Hrvatska, Hrvatske, Njemačkoj, Slovačka, Turskoj
   some common nouns (UPP) homographic with last names (Glas, God)