-
Thesis HinMaT MT Framework
83 | P a g e
Chapter 3
Study of Hindi & Marathi Languages
Language is a process of free creation; its laws and principles
are fixed, but the
manner in which the principles of generation are used is free
and infinitely varied.
-Noam Chomsky
Developing a rule-based MT system requires thorough knowledge of the
language pair involved; hence a side-by-side study of the grammar of
the SL and the TL is necessary. This chapter presents our study of Hindi
and Marathi from a purely linguistic and computational perspective. A
complete description of the grammar of both languages is a huge topic in
itself, so we have confined our discussion to grammar in the context of
HinMaT; wherever necessary, however, we discuss particular topics in
detail. One objective of this study is to freeze the approach and
architecture for HinMaT, which is highly constrained by the linguistic
character of the language pair, i.e. Hindi-Marathi. The study was
therefore scoped to the linguistic and MT perspectives. It covered the
script (Devanagari), the basic alphabet set (vowels and consonants), the
morphology of both languages, part-of-speech types, grammatical
categories such as gender, number, person, case, tense, aspect, modality
and voice, sentence structures, and the anatomy of verbs and verb
phrases.
Based on the study, a paper was presented at the All India Conference on
Linguistics (AICL) held at Deccan College, Pune (Bhavsar & Pawar, 2008).
For this study we referred to the linguistic literature (Sing, 2006),
(Sing, 1985), (Sahay, 2000), (Tiwari, 2000), (Pandharipande, 1997),
(Dhongade & Wali, 2009), (Kaul, 2008), (Guru, 1920),
(Bajpayee K. P., 1959), (Deshmukh, 1990), (Deshmukh, 1990),
(Basutkar, 1970), (Hiremath, 1993), (Sabanis, 1974), (Sahay, 2000),
(Vishwanathdev), (Valimbe, 1983), (Damale M. K., 1965) and a few
internet sites. Our study is comparative as well as contrastive in
nature.
3.1 General Introduction
India is a vibrant country, rich in cultural as well as linguistic
diversity. According to a comprehensive survey carried out in 2009-2013
by the People's Linguistic Survey of India (PLSI), a public consultation
and appraisal forum, around 780 spoken languages and 86 scripts are in
use in the country, and the country has lost 250 languages in the last
50 years. The PLSI survey was conducted over 4 years in collaboration
with 85 institutions and universities and involved 3000 experts. It is
the first such survey in independent India, following the earlier
linguistic survey conducted by the Irish linguist George Abraham
Grierson during 1898 to 1928 (Indian Express article dated 16th July
2013). Of these languages, 122 (already recognized in the government
census) are each spoken by more than 10,000 people, while the others are
spoken by fewer. The Government of India, under the Eighth Schedule of
the Indian constitution, has recognized 22 scheduled languages:
Assamese, Bengali, Bodo, Dogri, Gujarati, Hindi, Malayalam, Manipuri,
Marathi, Nepali, Oriya, Punjabi, Sanskrit, Kannada, Kashmiri, Konkani,
Maithili, Santali, Sindhi, Tamil, Telugu and Urdu. The language pair
under the HinMaT purview is thus recognized under the Eighth Schedule.
Figure 3.1 below shows the language map of India, which reveals that
Hindi is the most widely spoken language of India and is spoken in
around 10 states. After independence, Hindi written in Devanagari was
declared the official language of the Indian union (Indian Constitution,
Official Language Act-1963, Part 17, article 343, section A), while
Marathi was declared the official language of Maharashtra state by the
Maharashtra Official Language Act 1964 of the Government of Maharashtra.
Both languages belong to the Indo-Aryan language family, a subset of the
Indo-European language family, which is the world's biggest language
family.
Globalization has changed the linguistic character of countries
worldwide; languages have crossed their traditional boundaries and are
now spoken in many parts of the world. Besides India, Hindi is spoken in
other parts of the world, chiefly Pakistan, Bangladesh, Nepal and the
UAE. Marathi is widely spoken by the Marathi community in Maharashtra
and the rest of the country, as well as in Mauritius and Israel owing to
mass migration of the labour class. Today Hindi is ranked as the world's
4th most spoken language and Marathi as the world's 16th (Lewis, 2014).
Figure 3.1 Language Map of India (courtesy:
http://www.mapsofindia.com)
3.2 History of evolution and development
The origin and evolution of Hindi and Marathi (Sing, 2006), (Guru,
1920), (Pandharipande, 1997) are shown on a time scale in the following
tree diagram (Figure 3.2):
Figure 3.2 Hindi-Marathi Language Evolution Tree
From Figure 3.2 it is clear that both languages have their roots in
Sanskrit and evolved after the Apbhransha era. Milestone eras in the
development of these languages, as excerpted from (Sing, 2006),
(Deshmukh, 1990), are presented in Table 3.1 below.
Table 3.1 Historical Milestones in Development of Hindi and Marathi
Period | Ruler | Highlights

Hindi
1000-1200 AD (Early Period) | Hindu Rulers | Last phase of Hindu rule; use of Apbhransha languages in literature; Shauraseni Apbhransha becomes the language of literary work in Northern Indian literature; evolution of Urdu and Khadi Boli from Apbhransha
1200-1500 AD (Pre-Medieval Period) | Muslim Rulers | Heirs of the Ghulam, Khilji, Tughlaq and Lodi families ruled the country; Spiritual Era: Amir Khusaro (1225-1325), Kabir (1378-1448), Guru Nanak (1469-1538), Sant Dnyaneshwar, Sant Namdev; end of Apbhransha and rise of Khadi Boli; spread of Khadi Boli in the south (Dakhhani); Pharsi became the official language of India
1500-1800 AD (Post-Medieval Period) | Muslim (Mughal) Rulers | Era of peace and prosperity (Akbar, Jahangir, Shahjahan); encouragement for arts, languages and literature; spread of Hindi during Mughal rule
1800-1947 AD (Modern Period) | British Rule | Fall of Mughal rule; use of Khadi Boli for literature preparation and propagation; translation of the Bible into Hindi; Hindi played a crucial role in the Independence movement; Gandhiji's role in popularization of Hindi at national level; era of standardization and of research and development initiatives for Hindi grammar; inclusion of Hindi in the education system through Fort William College
1947-till date (Post-Independence era) | Indian Republic | Modernization and standardization of Hindi; Hindi declared the official language of the Indian union vide article 343 of the Indian constitution; enforcement of Official Language Act 1963; Official Language rules (1976/1987); establishment of Central Hindi Directorate, Central Hindi Training Institute, Central Translation Bureau (CTB), Department of Official Languages; development of Hindi font by GIST group, C-DAC, Pune; development of software like dictionaries, encyclopedia, etc. for Hindi language; emergence of Unicode (1987)

Marathi
605 AD | - | First inscription at Shravan Belgola (Mysore), Karnataka in support of existence of Marathi
1250-1350 AD | Yadav era | Important literary work: Mukundraj (Vivek Sindhu); Dnyaneshwar (Dnyaneshwari); Mahanubhav Panth (Lila Charitra)
1350-1600 AD | Bramhani era | Important literary work: Sant Eknath (Ramayana and Bhagwat); Dasopant (Mitarnav, Padarnav)
1600-1700 AD | Shivaji era | Important literary work: Sant Ramdas (Dasbodh and Manache Shlok); Tukaram (Abhang); Vaman Pandit (Yatharth Dipeeka); Raghunath Pandit (Damayanti Swayamvar); Mukteshwar (Ramayana, Harishchandrakhyan)
1700-1800 AD | Peshava Rule | Important literary work: Moropant (Sanskrit Granth, Mahabharat, Ramayana Kavya); Ram Joshi (Chhand-manjiri); Shridhar (Harivijay, Pandav Pratap, Shivlilamrut)
1800-till date | British Rule to Indian Republic | New dimensions to literature by various scholars; Marathi Grammar (Dadoba); use of Marathi in Mathematics, Geography, Astrology, Newspaper, Drama writing, Spiritual Granthas, Translated Granthas
3.3 Script
Both Hindi and Marathi have officially adopted the Devanagari script for
writing. It is important to note that an alternative cursive script
called Modi was used in official documentation during Hemadpanth's rule
as well as during the Shivaji era (17th century); it was replaced by
Balbodh (based on Sanskrit Devanagari) in 1917 during British rule.
Balbodh script was used for writing Marathi poetry. The Devanagari
script is based on the Brahmi script (abugida family of writing
systems). The Brahmi script is said to have come into existence around
500 BC (though scholars dispute this date). When it came to India, it
split into two streams, the North Stream and the South Stream, which
were transformed into different scripts. The details of this evolution
(Sing, 2006) are presented in chart form below (see Figure 3.3):
Figure 3.3 Evolution of Devanagari Script
[Chart nodes: Brahmi Script (500 BC); North Stream: Gupta Lipi (400-500 AD), Kutil Lipi (600 AD), Ancient Nagari (900-1100 AD), Bangla, Devanagari, Modernization (15th Century), Further Modifications (19th Century), Gujarati, Asamiya; South Stream: Nandnagari, Tamil, Kannada, Telugu, Malayalam]
Like most of the world's scripts, Devanagari is written from left to
right. Owing to its syllabic nature and alphabet set, it is considered
more phonetically complete than other existing scripts such as Chinese,
Arabic and Roman, because every sound can be expressed by a letter.
Unlike Roman script, it has no provision for capital and small letters.
The traditional Devanagari alphabet set has 52 characters. It also has
cluster consonants, Maatras and special symbols. The alphabet set is
exhaustive and was augmented under the influence of foreign languages
such as Arabic-Pharsi (the five Nukta letters listed in Table 3.2) and
English. Some vowels, especially from Sanskrit, have been deprecated
because they are rarely used. The Devanagari alphabet set as adopted in
Unicode is given in Appendix A. The Unicode set contains the original
character set (along with deprecated letters) plus extended characters.
3.3.1 Hindi Alphabet set
We found that the linguistic community disagrees on the number of
consonants and vowels in the original Devanagari alphabet set. Here we
present the alphabet set (see Table 3.2 below) as prescribed by Kendriya
Hindi Sansthan, Agra, an autonomous institution funded by the Government
of India and set up especially for the promotion of Hindi.
Table 3.2 Hindi Alphabet Set (romanized)
Vowels (15): a, aa, i, ee/ii/I, u, oo/uu/U, e, ai/ei, o, au/ou, a^, a:/aH, A, O, Ria
Consonants (33): ka/ca, kha, ga, gha, Nga, cha, chha, ja, jha, Nja, Ta, Tha, Da, Dha, Na, ta, tha, da, dha, na, pa, pha/fa, ba, bha, ma, ya, ra, la, va/wa, sha, Sha, sa, ha
Additional letters: xa/kSha, tra, Gya/jNja/dny, D_a, Dh_a
Nuktas (05): qa, Kha, Ga, za, Fa
Matras (12): aa, i, ee/ii/I, u, oo/uu/O, R, e, ai/ei, o, au/ou, A, O
Special symbols (03): ^, H, M
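The categories of Table 3.2 map fairly directly onto code-point ranges of the Unicode Devanagari block (U+0900-U+097F). The following is a minimal sketch, not part of HinMaT itself: the function name `classify` and the category labels are illustrative assumptions, and the ranges are taken from the Unicode code chart rather than from Appendix A.

```python
# Illustrative sketch: bucket Devanagari code points into the categories
# used in Table 3.2. Labels and function names are assumptions.
RANGES = [
    (0x0904, 0x0914, "vowel"),            # independent vowels
    (0x0915, 0x0939, "consonant"),        # KA .. HA
    (0x0958, 0x095F, "nukta consonant"),  # precomposed QA .. YYA
    (0x093E, 0x094C, "matra"),            # dependent vowel signs
    (0x0901, 0x0903, "special symbol"),   # candrabindu, anusvara, visarga
]
SINGLES = {0x093C: "nukta", 0x094D: "halant"}


def classify(ch: str) -> str:
    """Return the Table 3.2 style category of a single character."""
    cp = ord(ch)
    if cp in SINGLES:
        return SINGLES[cp]
    for low, high, label in RANGES:
        if low <= cp <= high:
            return label
    return "other"


def tally(text: str) -> dict:
    """Count how many characters of each category occur in text."""
    counts = {}
    for ch in text:
        label = classify(ch)
        counts[label] = counts.get(label, 0) + 1
    return counts
```

A tokenizer or spelling normalizer could use such a classifier to decide, for example, whether a nukta should be kept (Hindi) or dropped (Marathi).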
3.3.2 Marathi Alphabet set
Traditionally, the Marathi alphabet set has 52 letters: 16 vowels and 36
consonants. Because Marathi follows the Sanskrit alphabet set, it
contains a few vowels and consonants that are typically found only in
Sanskrit; most grammarians have therefore recommended deleting such
vowels and consonants from the Marathi alphabet set. Several consonants
have been proposed for deletion, whereas the leading linguist M. K.
Damale is of the opinion that only two cluster consonants may be
deleted. Another linguist, M. P. Sabanis, is against deleting any
consonant, while Arvind Manglurkar has advocated only 22 consonants.
With respect to vowels, M. K. Damale suggested deleting certain long
vowels, as they are used only in Sanskrit. Arvind Manglurkar has
recommended 07 traditional vowels (excluding six others), while Dr. Lila
Govilkar is in favor of 9 traditional vowels and 02 from English.
Ideally consonants are written with a halant, but they can't be
pronounced in their bare form; hence we represent them in their
vowelized letter forms, which can be pronounced (see Table 3.3).
Table 3.3 Marathi Alphabet Set
Total: 50 letters (including 14 vowels)
Comparing the alphabet sets of Marathi and Hindi, we find that one
Marathi letter has no counterpart in Hindi, while the Hindi letters
added under the influence of Pharsi and Urdu (the Nukta letters of
Table 3.2, among others) are not found in Marathi. On the vowel side the
two sets are the same with one exception, and, as already stated, many
people have advocated deleting that vowel from the Marathi alphabet set.
Regardless of these conflicting views and controversies, we are of the
opinion that whatever is available in the Unicode character set for
Devanagari and whatever is required by all sections of society (lay
people, intellectuals, writers, poets, government and private sector
employees, the business community etc.) should be acceptable.
In many cases Hindi and Marathi differ in how vowels are spelled:
transformation from a short vowel to a long vowel and vice versa is a
common phenomenon, and a common source of spelling mistakes by speakers
of these languages. As a raw observation, we have witnessed a few
recurring letter transformations, which are illustrated in Table 3.4
below:
Table 3.4 Alphabet Transformations in Hindi-Marathi Spellings
Hindi Marathi Example (Hindi-Marathi)
- , -
- , -, -
/ -, -
-, -
-, -
-
-, -
, -
- , -
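The short/long vowel alternation described above can be sketched as a generator of spelling variants. This is only an illustration under stated assumptions: the mapping covers just the i/ii and u/uu pairs (independent vowels and their matras), the function name `length_variants` is hypothetical, and the sample words are supplied here for illustration rather than taken from Table 3.4.

```python
# Illustrative sketch: generate single-substitution spelling variants by
# swapping short and long i/u vowels (letters and matras).
SHORT_TO_LONG = {
    "\u093f": "\u0940",  # matra i  -> matra ii
    "\u0941": "\u0942",  # matra u  -> matra uu
    "\u0907": "\u0908",  # letter i -> letter ii
    "\u0909": "\u090a",  # letter u -> letter uu
}
LONG_TO_SHORT = {long: short for short, long in SHORT_TO_LONG.items()}


def length_variants(word: str) -> set:
    """Return all words obtained by flipping the length of one vowel."""
    variants = set()
    for i, ch in enumerate(word):
        swap = SHORT_TO_LONG.get(ch) or LONG_TO_SHORT.get(ch)
        if swap:
            variants.add(word[:i] + swap + word[i + 1:])
    return variants


# Example: a word ending in short i yields a candidate ending in long ii.
print(length_variants("कवि"))
```

In a Hindi-to-Marathi pipeline, such candidates would still need to be checked against a target-language lexicon, since not every variant is a valid word.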
3.4 Language Structure
A language is a collection of sentences, which are formed using the
words in its vocabulary and its grammar rules. Vocabulary, in the normal
sense, refers to the collection of words in a given natural language.
Before we discuss the nature and structure of Hindi and Marathi
vocabulary, a few notions related to vocabulary are discussed first.
3.4.1 Word
A meaningful cluster of letters is called a word. Every word has a
definite meaning and an independent existence. When a word is employed
in a sentence, it may undergo some changes (mostly morphological)6 and
is transformed into a new form called a Pada; hence a lexicon or
dictionary always contains words and not Padas.
3.4.2 Classification of Words
Words can be classified along different dimensions of abstraction, such
as grammatical category, morphology and vocabulary. A language's
vocabulary is enriched from various sources. For Indian languages, words
are primarily classified using four main categories as described in
(Sharma); these are presented in Figure 3.4 below:
6 Morphological changes inflect words with Gender, Number, Person, Case,
Tense, Aspect and Mood suffixes (the last three apply only to the verb
category).
Figure 3.4 Word Classification
Word classification criteria:
  Part-of-Speech (POS) category: Noun / Pronoun / Verb / Auxiliary verb / Adjective / Adverb
  Construction:
    Traditional
    Conjunct: Suffixing, Sandhi, Compounding, Prefixing
  Origin:
    Tatsam
    Tadbhav
    Deshaj
    Foreign words: Arabic, Farsi, Turki, English, Portuguese
    Other Indian Languages
  Meaning: Synonyms, Polysemic, Antonyms
  Usage:
    Declinable: Noun, Pronoun, Adjective, Verb
    Indeclinable: Adverb, Conjunctives, Relative post position, Vocatives
3.4.2.1 Part-of-Speech (POS)
Every word plays a syntactic role when employed in a sentence; this
syntactic role is its lexical category, which is called its
part-of-speech (POS). The term POS is also referred to as word type,
syntactic category or simply category. Universally, eight basic POS
categories are recognized: Noun, Adjective, Pronoun, Verb (Auxiliary
verb), Adverb, Pre-positions/Post-positions, Conjunction and
Interjection. Besides these, Determiner/Article and Auxiliary verb,
which are subclasses of Pronoun and Verb respectively, are also treated
as separate POS categories. Each POS type can be further subclassified
to a hierarchy of any depth, thus defining an ontology. Both Hindi and
Marathi follow the universal POS categories, except that pre-positions
are not found in either language; both use post-positions instead.
Table 3.5 lists the POS categories of English, Hindi and Marathi.
Table 3.5 English-Hindi-Marathi POS categories
Sr.No. | English | Hindi | Marathi
1 | Noun | |
2 | Adjective | |
3 | Pronoun | |
4 | Verb / (Auxiliary verb) | |
5 | Adverb | |
6 | Prepositions | Post-positions | Post-positions
7 | Conjunction | |
8 | Interjection (exclamatory) | |
Except for the verb, which is a complex category, the POS categories
have conventional meanings and are comparatively easy to understand;
hence we skip a detailed discussion of them. Since the verb is a very
important lexical category from the computational perspective, we
discuss verbs in detail. Sample words for each POS category from Hindi
and Marathi are listed in Appendix-B.
3.4.2.1.1 Verb category
The verb is a crucial lexical category for sentence analysis and
construction, because it denotes the action/message specified in the
sentence. The entire sentence is anchored at the verb. The verb system
of any language, including Hindi and Marathi, is complex because an
action may be expressed using a combination of lexical categories
(mostly noun, adjective and verb) together with intensifiers,
verbalizers and auxiliary verbs. Verb classification is an interesting
problem, because verbs can be classified at different levels of
abstraction, such as the lexical, functional (predicate-argument
structure) and semantic (action scope) levels.
At the lexical level, verbs are primarily classified as main verbs and
auxiliary verbs. Auxiliary verbs assist the main verb by specifying
grammatical information pertaining to the verb action, such as tense,
mood and aspect (discussed in section 3.5.2.5). The auxiliary verbs in
Hindi and Marathi7 can be further subclassified as forms of / , modal
auxiliaries like /, /, and auxiliaries denoting TAM and voice. A main
verb can be classified as a simple verb (verb root), complex verb,
compound verb or conjunct verb (Sing, 1985).
Simple verbs are formed from verb roots like (eat), (drink), / (wash),
(beat) etc., while complex verbs are formed with the help of a noun or
adjective and a verbalizer. The verbalizers , / are most frequently
used, while in idiomatic/conventional complex verb constructions other
verbalizers like , , , , etc. are also used. These verbalizers can also
be used as intensifiers. A compound verb is constructed using lexical
verbs (generally two verbs) and an intensifier. The intensifier can
itself appear as a simple verb. In compound verbs the intensifier loses
its own lexical meaning and is instead used to intensify the meaning of
the lexical verb. Around 18 intensifiers are observed in Hindi, but
primarily only eight are used. These are / (to go), / (to take sense),
/ (to come), / (obligation sense), /, /, /, / .
The compound verb is a peculiarity of Indian languages. A conjunct verb
is constructed with the help of two lexical verbs such that the meaning
of both verbs is preserved; such verbs indicate multiple actions. E.g.
/ (delivering and returning back) would mean going-giving and coming
back. Use of between two verbs is also observed, like (delivering and
returning back).
7 x/y forms denote word x in Hindi and y in Marathi, unless specified.
From a functional point of view (argument structure), verbs can be
classified as intransitive (), transitive (), di-transitive/co-agentive
( / ), subject + ( ), linking verb ( ), verbs taking an object
complement (), and causatives ( ) (Sing, 2006).
At the meaning level (verb action scope), verbs are classified into
finite and non-finite forms. Finite forms denote an action that is
bounded in time, while non-finite forms express an action that is not.
It is important to note that only finite verb forms can serve as the
root (main verb) of a sentence. Non-finite forms are used in adjectival
and adverbial senses. Non-finite forms are further subcategorized as
infinitives and participles. Participles are further classified as
perfective, imperfective and conjunctive participles (Kaul, 2008). The
complete verb classification is shown in the following chart
(Figure 3.5).
Figure 3.5 Verb Abstraction Classification
The various types of verbs listed in this classification are explained
in brief below:
Intransitive: An intransitive verb takes only a subject (Karta) as its
mandatory argument; it can't take any object (Karma). The verb agrees
with the GNP of the subject (Karta), except in past tense constructions
where the Karta takes a post-position marker and hence does not agree
with the verb. The Hindi and Marathi verb roots used in the sentences
below are examples of intransitive verbs. Example sentences for these
verbs are:
Hindi: (subject) (Shoham slept)
Marathi: (subject) . (Shoham slept)
Transitive: Transitive verbs take a subject (Karta) as well as an object
(Karma). When the subject takes a post-position marker, the verb agrees
with the object in GNP.
E.g. Hindi: (Shyam ate Mango)
Marathi: . (Shyam ate Mango)
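The agreement behaviour described for intransitive and transitive verbs can be sketched as a small decision function. This is a simplified illustration of the rule as stated in the text, not HinMaT's actual agreement module; the function name and boolean parameters are hypothetical, and the case of a marked subject with no object is left undecided because the text only says that the verb does not agree with the marked Karta.

```python
from typing import Optional


def agreement_controller(subject_marked: bool, has_object: bool) -> Optional[str]:
    """Return which argument controls verb agreement in GNP.

    subject_marked: True when the Karta carries a post-position marker.
    has_object:     True for transitive clauses with a Karma.
    """
    if not subject_marked:
        # Unmarked Karta: the verb agrees with the subject.
        return "subject"
    if has_object:
        # Marked Karta in a transitive clause: the verb agrees with the object.
        return "object"
    # Marked Karta and no object: the text does not name a controller.
    return None
```

A generator for the target language could call such a function before choosing the gender, number and person suffix of the verb.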
Linking Verbs: Linking verbs are used in copula sentences, where they
link the subject with the predicate (complement). A noun, adjective or
adverb can appear as the predicate. The Hindi and Marathi copular verb
roots (as in the examples below) are mostly used as linking verbs.
E.g. Hindi: i) (subject) (noun predicate) (Rafiq is my servant)
ii) (subject) (adjective predicate) (Radha is beautiful)
Marathi: i) . (Rafiq is my servant)
ii) . (Radha is beautiful)
Di-transitive/Co-agentive: Di-transitive verb forms take 03 arguments:
subject, direct object and indirect object. Di-transitive constructions
are also referred to as dative constructions.
E.g. consider the following illustrations:
Hindi: i) (subject) (indirect object) (direct object) (Ramesh gave money to servant)
ii) (co-agent: Ablative, source) (Gita took money from mother)
iii) (co-agent: Instrument) (Prashant opened the lock with key)
Their Marathi equivalents are:
i) .
ii) (co-agent: Ablative, source) .
iii) (co-agent: Instrument) .
Object Complements:
The object of such verbs requires a complement to complete the meaning
of the sentence. It is important to note that these verbs are not
classified as di-transitive, though their argument structure appears to
be like that of di-transitives.
E.g.
Hindi: i) (subject) (object) (object complement) (We elected Ramesh as our leader)
ii) (object) (object complement)
Marathi: i) (object) (object complement) .
ii) (object) (object complement) .
Causative verbs:
Two types of causatives, Causative-1 and Causative-2, are found in Hindi
as well as Marathi. Intransitive and transitive verbs can be
causativized by introducing causativization suffixes. Causative-1
constructions involve two essential entities, the causer (sponsor) and
the agent (subject), followed by an optional object8 (Karma) and the
causative form of the verb. Here the sponsor or causer functions as the
grammatical Karta, which agrees with the verb, while the agent (the
Karta from the real-world point of view) takes a post-position marker
(quite often ko:). The causer in Causative-1 forms participates in the
action. Causative-2 verb forms require a mediator to perform the verb
action. The causer in such cases only triggers the verb action; the
mediator participates in the process to aid the agent in performing it.
Intransitive as well as transitive verb forms can be transformed into
Causative-2 verb forms.
It is important to note that not all verb roots can be causativized
(Basutkar, 1970). E.g. Hindi verb roots like (come), (go), (want) etc.
can't be causativized; the same is true for Marathi. Causatives are used
less frequently in Marathi than in Hindi. Hindi suffixes -, , - are
mainly used for obtaining causative forms. Marathi causativization is
done in two ways: first, by adding causative suffixes or -9 with some
morphological transformations, and second, by adding additional affixes
like , , and , intensifier. The first is more popular and frequent.
Table 3.6 shows the causative suffixes for both Hindi and Marathi
(Basutkar, 1970) in both causative forms, along with examples (note that
/ in the table indicates an option).
8 Presence of object depends upon the functional nature of verb i.e.
intransitive or transitive.
Table 3.6 Causative suffixes in Hindi and Marathi
Language Causative-1 suffix Causative-2 suffix Example
Hindi
, -, , ,
E.g.
-,- ,-,,
, -,-
E.g.
-,- ,-
Causitive-1:
i) (causer) (agent) ii)(causer) (agent) Causitive-2:
i) (causer) (mediator) (agent) ii) (causer) (mediator)
(agent)
Marathi
Type1:suffixing
-,-,-E.g.-/ , -/ -/ , - Type2:transformations
, , E.g.
-, -,
Type1:suffixing
-,-,-E.g.
-/ ,-/ ,-/ ,-/
Type2:usingintensifiers incompound verb
Causitive-1:
i) (causer) (agent) .ii) (causer) (agent) .iii) (causer) (agent)
.iv) (causer) (agent) .v)
9 Marathi uses same suffixes for both causative forms i.e.
causative-1 and causative-2.
-, - , , .Causitive-2:
i) (causer) (mediator) (agent) .ii) (causer) (mediator) (agent)
.iii) (causer) (mediator) (agent) .
Infinitives: These are also called gerunds and are used as nouns
(abstract class) or adjectives. In Hindi, the infinitive form is
obtained by adding the suffix - to the verb stem. It may also take an
object as its argument. Adjectival use of infinitives is observed in
verbs of obligation like , . In Hindi, if the infinitive is transitive,
it may take inflected forms of the suffix - i.e. , -. Marathi
infinitives use the suffix - with the stem to mark the infinitive.
E.g. The following examples show the use of infinitives, which are
underlined, while infinitival suffixes are set in bold face.
Hindi: (noun) (Running is good for health)
Marathi: (noun) .
Hindi: (object) (He is fast in book reading)
Marathi: (object) .
Hindi: (object) (I will have to drink the medicine)
Marathi: (object) .
Hindi: (object) (I want to drink water.)
Marathi: (object) .
Participles: Participles in Hindi work as adjectives and adverbs. They
are further classified as perfective, imperfective and conjunctive
participles (Kaul, 2008). Perfective participles indicate completed
activities, while imperfective participles denote unfinished actions.
In Hindi, imperfective participles are formed by adding the suffix
- (ms)10, - (fs), - (mp) or - (fp) to the stem to agree with the noun in
GNP. In adverbial use only the - suffix is used, while in adjectival use
all the suffixes are used. Adverbial imperfectives may be reduplicated
and used in time expressions. Adjectival imperfective as well as
perfective participles are expanded with the simple present inflections
of the Hindi auxiliary verb, i.e. (ms), (*p), (fs).
Perfective participles in Hindi are formed by adding the suffixes -, -,
-. Like imperfectives, perfective participles can be used adjectivally
or adverbially. Perfective adverbial participles are often reduplicated.
Conjunctive participles are used in sentences where two actions share
the same subject and are temporally ordered, the first preceding the
second. In such sentences the first verb appears in stem form followed
by the purvkalik krudant suffix (kar), while the second verb takes the
other conjugation suffixes. Examples of each type of participle are
given in Table 3.7; note that the suffixes are marked in bold, while the
participles are underlined in the examples.
Table 3.7 Participle Usage and Examples
Participle | Usage | Example
Imperfective | Adjectival |
  Hindi: Marathi: . (Running boy fall down)
  Hindi: Marathi: .
Imperfective | Adverbial |
  Hindi: (Radha met me while returning back from the school.)
  Marathi: .
Perfective | Adjectival |
  Hindi: (ms)/ (fs)/(mp) (Sitting man/woman/boys)
  Marathi: (ms)/ (fs)/ (mp)
Perfective | Adverbial |
  Hindi: (ms)/(fs) (The man/woman sitting on the roof was singing)
  Marathi: (ms)/ (fs) .
  Hindi: (I got tired of sitting idle at home.)
  Marathi: .
Conjunctive | |
  Hindi: (verb1 stem + kar) (verb2) (He wrote the letter after reading the newspaper.)
  Marathi: .
  Hindi: (conjunctive marker) (He took tea after finishing the work.)
  Marathi: .
Conjunctive | Adverbial |
  Hindi: (He went and came back)
  Marathi: .
Conjunctive | Fixed expressions |
  Hindi: (I specially meet him.)
  Marathi: .
10 ms-masculine singular, mp-masculine plural, fp-feminine plural,
fs-feminine singular, *-anything (m or f / s or p)
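The participle formation rules described above can be sketched as simple stem-plus-suffix concatenation. The Devanagari suffixes used below (-ता, -ती, -ते, -तीं for the imperfective and कर for the conjunctive) are supplied here from standard Hindi grammar as an assumption, since the suffix glyphs themselves are not legible in the text; the function names are hypothetical and irregular stems are not handled.

```python
# Illustrative sketch of Hindi participle formation by suffixation.
# Suffix values are assumptions drawn from standard Hindi grammar.
IMPERFECTIVE_SUFFIX = {"ms": "ता", "fs": "ती", "mp": "ते", "fp": "तीं"}
CONJUNCTIVE_SUFFIX = "कर"  # the purvkalik krudant suffix (kar)


def imperfective(stem: str, gnp: str = "ms", adverbial: bool = False,
                 reduplicate: bool = False) -> str:
    """Build an imperfective participle from a verb stem.

    Adjectival use agrees with the noun (ms/fs/mp/fp); adverbial use
    always takes the same single suffix, and may be reduplicated.
    """
    suffix = IMPERFECTIVE_SUFFIX["mp"] if adverbial else IMPERFECTIVE_SUFFIX[gnp]
    form = stem + suffix
    return f"{form} {form}" if reduplicate else form


def conjunctive(stem: str) -> str:
    """Build a conjunctive participle: verb stem followed by kar."""
    return stem + CONJUNCTIVE_SUFFIX


print(imperfective("चल", "fs"))
print(imperfective("चल", adverbial=True, reduplicate=True))
print(conjunctive("लिख"))
```

A rule-based generator would typically pair such suffix tables with a list of exceptional stems rather than rely on concatenation alone.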
3.4.2.2 Construction
Words can be classified on the basis of their construction. There are
two subclasses under this categorization, viz. Traditional and Conjunct.
Traditional words are those that have been in use since the past and are
accepted as part of tradition, while conjunct words are formed by
conjoining two or more words. This conjoining may be done through
affixation, Sandhi or compounding. Affixation is done using suffixes and
prefixes.
3.4.2.3 Origin
There are four subtypes under this type, i.e. Tatsam, Tadbhav, Deshaj
and Videshaj. Most of these words are loan words or transformations of
loan words, except for the Deshaj category.
Tatsam words are loan words borrowed from Sanskrit and used as is.
Tadbhav words are borrowed Sanskrit words that have undergone some
transformations.
Deshaj words are not borrowed from Sanskrit or other Indian languages;
they come from dialects and carry a strong influence of local culture
and lifestyle.
Videshaj words are borrowed strictly from foreign (non-Indian)
languages.
3.4.2.4 Meaning
This typically refers to semantic ontological classification
like synonymy, antonymy,
and polysemy. Synonymy talks about meaning equivalence between
two different
words, while antonymy relates words with opposite meaning.
Polysemy refers to
multiple meanings associated with the same word.
3.4.2.5 Usage
This classification is done purely on linguistic consideration
of morphology and part
of speech (POS). Declinable words are those words that undergo
morphological
inflections due to Gender, Number, Person and Case, while
indeclinable words do not
change at all.
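The declinable/indeclinable split of Figure 3.4 can be sketched as a lookup that tells a morphological analyzer whether inflection should be attempted at all. The set contents follow the Usage branch of Figure 3.4; the names `DECLINABLE`, `INDECLINABLE` and `is_declinable` are illustrative assumptions.

```python
# Illustrative sketch: POS categories grouped by the Usage criterion.
DECLINABLE = {"noun", "pronoun", "adjective", "verb"}
INDECLINABLE = {"adverb", "conjunctive", "relative post position", "vocative"}


def is_declinable(pos: str) -> bool:
    """Return True if words of this POS inflect for gender, number,
    person and case; False if they never change form."""
    key = pos.strip().lower()
    if key in DECLINABLE:
        return True
    if key in INDECLINABLE:
        return False
    raise ValueError(f"unknown POS category: {pos!r}")
```

An analyzer could skip suffix stripping entirely for indeclinable categories and look the surface form up directly in the lexicon.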
3.4.3 Hindi & Marathi Vocabulary
Hindi and Marathi are rich languages in terms of vocabulary (Deshmukh,
1990). Their vocabulary consists of all the word types shown in
Figure 3.4 above. Since Maharashtra shares its state border with
neighboring states like Karnataka, Andhra Pradesh, Gujarat and Madhya
Pradesh (Dhongade & Wali, 2009), Marathi vocabulary is enriched by
Telugu, Kannada, Gujarati and even Hindi. Hindi and Marathi also have a
good number of Tatsam and Tadbhav words as well as foreign words (see
Appendix B). Both vocabularies are highly influenced by Sanskrit. It is
common to observe words that have the same origin but different
meanings, and words that have the same origin but different spellings.
Consider the following examples from (Deshmukh, 1990):
E.g. 1) Same source, different spellings:
English word Guest: (Hindi) and (Marathi)
English word Committee: (Hindi) and (Marathi).
2) Same source, different meaning:
Hindi word means (attempt), while in Marathi same word means (mockery);
similarly (Hindi) means education, while in Marathi it means punishment.
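For a Hindi-to-Marathi transfer lexicon, the two phenomena above suggest that shared vocabulary can pass through unchanged only after spelling differences and same-source-different-meaning words have been checked. The sketch below illustrates this ordering; the Devanagari entries are supplied here as assumptions consistent with the English glosses in the text, and the dictionary and function names are hypothetical.

```python
# Illustrative sketch of word-level transfer with exception tables.
# All entries are assumptions chosen to match the glosses above.
FALSE_FRIENDS = {
    "चेष्टा": "प्रयत्न",   # Hindi 'attempt'; the same spelling means 'mockery' in Marathi
    "शिक्षा": "शिक्षण",   # Hindi 'education'; the same spelling means 'punishment' in Marathi
}
SPELLING_MAP = {
    "अतिथि": "अतिथी",   # guest
    "समिति": "समिती",   # committee
}


def transfer(hindi_word: str) -> str:
    """Map a Hindi word to Marathi, consulting exception tables first."""
    if hindi_word in FALSE_FRIENDS:
        return FALSE_FRIENDS[hindi_word]
    if hindi_word in SPELLING_MAP:
        return SPELLING_MAP[hindi_word]
    # Shared vocabulary: assume the word carries over unchanged.
    return hindi_word
```

Checking the false-friend table first matters: copying such a word unchanged would produce a fluent but wrong Marathi sentence.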
3.5 Morphology
Morphology is the branch of linguistics that deals with deriving new
word forms from the language vocabulary. It attracted special attention
from scholars after the advent of NLP, when practical NLP application
development started. Every natural language has its own morphological
system. In the Indian context, the study of morphology is very important
because Indian languages are morphologically very rich. Morphology helps
a word to be employed in a sentence by deriving its Pada. The linguistic
literature (Kaul, 2008), (Bajpayee K. P., 1959), (Guru, 1920) prescribes
two kinds of morphology, viz. inflectional and derivational. The
derivation process may or may not change the part-of-speech (POS)
category of a word; on this basis morphology is classified as
inflectional or derivational. Before we discuss these types, let us look
at the detailed structure of a Pada.
3.5.1 Anatomy of Pada
For study purposes the Pada is divided into two parts: Prakriti
and Pratyaya (Affix). The group of Prakriti and Pratyaya is
called Tidant( ) (Deshmukh, 1990). The Prakriti part can be
further classified as Pratipadik (stem) and Dhaatu (verbal
root). The term Pratipadik (stem) is used to denote non-verb POS
categories, while Dhaatu (verbal root) is used for referring to
verbs. A Pratipadik can be subtyped as Vyutpanna (constructed)
or Avyutpanna. Vyutpanna Pratipadiks are constructed by the
affixation process. Compounds (Saamasik words) are also treated
as Vyutpanna. The Pratyayas (affixes) are classified as
grammatical and Vyutpadak, the latter comprising krut () and
taddhit ( ). The detailed classification is shown in the chart
(pl. see Figure 3.6 below). krut affixes are conjoined only with
a Dhaatu, while taddhit affixes are conjoined only with
Pratipadiks. Both Hindi and Marathi Padas follow the same
structure, as depicted in Figure 3.6 below. It is important to
note here that both languages have only prefixes and suffixes
but no infixes. Words formed by suffixing krut suffixes to a
Dhaatu are called kridant ( ), which can further take taddhit
suffix(es). Words formed by suffixing taddhit suffixes can
recursively take additional taddhit suffix(es) (not more than
2-3 levels). Both Marathi and Hindi have borrowed prefixes from
Sanskrit as well as from foreign languages like Farsi and
Arabic. A list of sample prefixes and suffixes (Sharma),
(Deshmukh, 1990), (Pandharipande, 1997), (Dhongade & Wali, 2009)
is presented in Table 3.8 below. Discussion of grammatical
suffixes is presented in a following section of this chapter.
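The attachment constraints described above (krut only on a Dhaatu, taddhit only on a Pratipadik, limited taddhit recursion) can be sketched as a small data structure. This is only an illustrative sketch, not the HinMaT implementation: the class name `Pada`, the constant `MAX_TADDHIT_LEVELS` (set to 3 as our reading of "2-3 levels") and the placeholder labels `ROOT`, `K1`, `T1` are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

MAX_TADDHIT_LEVELS = 3  # assumption: the text says "not more than 2-3 levels"


@dataclass
class Pada:
    base: str       # the Prakriti: a Dhaatu (verbal root) or a Pratipadik (stem)
    base_kind: str  # "dhaatu" or "pratipadik"
    suffixes: List[Tuple[str, str]] = field(default_factory=list)

    def kind(self) -> str:
        # Once a derivational suffix is attached (kridant or taddhit form),
        # the result behaves like a Pratipadik for further suffixation.
        return "pratipadik" if self.suffixes else self.base_kind

    def add_suffix(self, suffix: str, suffix_kind: str) -> "Pada":
        if suffix_kind == "krut":
            if self.kind() != "dhaatu":
                raise ValueError("krut suffixes attach only to a Dhaatu")
        elif suffix_kind == "taddhit":
            if self.kind() != "pratipadik":
                raise ValueError("taddhit suffixes attach only to a Pratipadik")
            levels = sum(1 for _, k in self.suffixes if k == "taddhit")
            if levels >= MAX_TADDHIT_LEVELS:
                raise ValueError("too many taddhit levels")
        else:
            raise ValueError(f"unknown suffix kind: {suffix_kind}")
        self.suffixes.append((suffix, suffix_kind))
        return self


# "ROOT", "K1" and "T1" are placeholder labels, not real morphemes.
kridant = Pada("ROOT", "dhaatu").add_suffix("K1", "krut")
derived = kridant.add_suffix("T1", "taddhit")
```

Attaching a krut suffix to a plain Pratipadik raises `ValueError` in this sketch, which mirrors the restriction stated in the text.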
Table 3.8 Sample Affixes used in Hindi/Marathi
Affix Type Hindi Marathi
Suffix
Sanskrit:-, -, -, - , -, -, -, - , -, -, -,,-, - ,-, -, -, - ,
-,- , -, -, -, - , -, -, -, -,-,-, -
Arabic/Farsi:-, -, -, -, -,-, -, -, -, -, - , -, -, -, -,
Sanskrit:-,-,-, - ,-, -,-, - , -, -, -, -, -, - ,-, -,-, - ,-,
-, - ,- ,-, - , - , -, , -, -, -
Arabic/Farsi:-, - , -, -, -, -, -, -, - , -,-, -, -/-, -, -, -,
-,-, -
Prefix
Sanskrit: -, -, -, -, -,-, -, -, -, -, -, -, -, -, -, -, -, -,
-, -, -, -, -
Arabic/Farsi:-, -, -, -, -, -, -, -, -, -, -, -
Sanskrit:-, -, -, -, -,-, - -,-, -,-, -, -, -, -, -, -, -, -, -,
-, -,-, -, -, - -, -Arabic/Farsi:-, -, -, -, -, -,-, -, -
Figure 3.6 Anatomy of Pada
3.5.2 Grammatical Categories ( )
Words employed in a sentence (Padas) convey two types of
information, viz. lexical meaning ( ) and grammatical
information ( ). Lexical meaning reflects some object or concept
in the physical world, whereas the grammatical information
includes attributes/features like Gender, Number, Person, Case,
Tense, Aspect, Voice and Mood. Gender, Number, Person and Case
are collectively referred to as GNPC, while Tense, Aspect and
Mood are abbreviated as TAM. Lexical meaning conveys the
semantic character of a word, which is called the logical
category, and the grammatical information is called the
grammatical category (Deshmukh, 1990) (Sing, 2006). The
grammatical category is often confused with the part-of-speech
(POS) category, which is actually the lexical category of a
word. E.g. consider a sentence (Girl is eating chapatti). Here
the word (girl) conveys the lexical meaning of a female human
being and the grammatical information that the word has feminine
gender and singular (only one) number, while the verb phrase
tells about the action called eating in the present continuous
tense. Grammatical
= +
categories are scoped only to the grammatical world. They may or
may not have relevance in the physical world. E.g. the
grammatical category Gender may or may not have any resemblance
to the concept of biological sex: the English word chair has no
biological sex in the physical sense, yet its Hindi equivalent
has feminine gender, while the English word has neuter gender.
In our opinion this is more a matter of convenience for coping
with the grammar world than with the physical world. As stated
above, eight such grammatical categories, viz. Gender, Number,
Person, Case, Tense, Aspect, Voice and Mood, have been
considered. The latter four are specifically applicable only to
verbs and auxiliary verbs, while the first four are applicable
to nouns and adjectives. Their detailed classification is shown
in Figure 3.7 below. A verb does not have gender, number or
person by nature, but verbs do get inflected for GNP features
because, for maintaining the harmony of sentence construction,
the verb form in a sentence has to agree with either the subject
(Karta) or the object (Karma) or none. Hence the verb does get
inflected for gender, number, person and case. The following
section discusses these categories in sufficient detail, first
in a general context and then in the specific context of Hindi
and Marathi.
Figure 3.7 Grammatical Categories ( )
[Figure 3.7: Verb/Aux Verb: Tense (), Aspect ( ), Mood ( ), Voice ( ). Noun/Adjective: Gender ( ), Number (), Person ( ), Case ()]
3.5.2.1 Gender
This important category is amongst the most discussed
grammatical categories in the linguistic literature. As stated
in the opening discussion (pl. see 3.5.2), the concept of Gender
( ) is inspired by biological sex ( ), which is a natural
phenomenon in all living beings, but at times it may not have
any resemblance to natural sex; in such cases it is more a
grammatical convenience. The gender of a word is scoped to the
grammar space, while biological sex is scoped to the physical
world. Languages differ in the number of genders they permit,
ranging from two up to 30 (Deshmukh, 1990). By and large, most
languages follow a three-gender ( ) system comprising masculine,
feminine and neutral genders. As a rule of thumb, animate things
are classified as masculine or feminine according to their
biological sex, and non-living things are normally put under the
neutral gender class. This rule is followed in Sanskrit. Most
Indian languages follow the three-gender system, while Hindi has
only two genders: masculine and feminine. Gender fixing and
identification is a common problem across most languages because
the gender system is irregular and not well defined in many of
them. Here, we present some commonly used tricks for identifying
the gender of words in Hindi and Marathi. These tricks are based
more on intuition and heuristics than on sound scientific
principles. Their applicability to other languages has not been
studied in the present research work, and one may very easily
find exceptions to the rules. These tricks are explained below:
a. Using Affixes: Affixes can be used to identify gender in some
cases in Hindi as
well as Marathi. E.g. Hindi Suffixes , -, -, -, , -,
-,- are commonly used to denote feminine gender , ,
.
b. Karaka: Hindi Vibhakti symbols / / and Marathi Vibhakti
symbols
// can also be used for identifying gender. E.g. (son of
Ram),
(Daughter of Ram), (son of Ram),
(Daughter of Ram), (tree of tamarind)
c. Compound words: [Prince] (masculine), [Princess]
(feminine).
d. Based on Importance/size/greatness: Things bigger in size and
important from
prevailing social norms are treated as masculine, while things
smaller in size and
socially less important are treated as feminine. E.g.
(King),
(proprietor), (driver), (Commander in chief) are treated as
masculine, while (queen), (land lady), etc. are feminine.
Exceptions to this are (police), (rope), (rope), which are
treated as feminine, while (thread) is treated as masculine.
e. Word ending: generally - or- ending words are considered as
feminine in
Hindi as well as Marathi, while ending words are treated as
masculine.
E.g. Hindi: (m) (bull), (f)(rope), (f)(Jalebi-sweet dish)
etc.
Marathi: (m)(boy), (m)(horse), (f)(girl), (f)(mare:
female horse) etc.
Exceptions: Marathi-(f)(school)
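The word-ending trick (e) together with an exception list can be sketched as a simple lookup. This is a hedged illustration only: the function name `guess_gender`, the specific endings used (the long ई matra for feminine and the आ matra for masculine) and the sample words are assumptions chosen for illustration, and, as the text warns, such heuristics are easy to break.

```python
from typing import Optional

# Assumed endings for illustration: the long "ii" matra suggests feminine,
# the "aa" matra suggests masculine.
FEMININE_ENDINGS = ("ी",)
MASCULINE_ENDINGS = ("ा",)

# Exceptions must be listed explicitly, e.g. Marathi "school" is feminine
# although it ends in the "aa" matra.
EXCEPTIONS = {"शाळा": "f"}


def guess_gender(word: str) -> Optional[str]:
    """Return 'm', 'f', or None when the heuristic does not apply."""
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    if word.endswith(FEMININE_ENDINGS):
        return "f"
    if word.endswith(MASCULINE_ENDINGS):
        return "m"
    return None  # fall back to a lexicon or to the other tricks


for w in ("मुलगी", "घोडा", "शाळा", "बैल"):
    print(w, guess_gender(w))
```

Checking the exception list first keeps the irregular words from being misclassified by the ending rule; words that match neither rule return `None` so that a caller can fall back to a dictionary.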
In Indian languages, gender has an impact on number inflections.
It is interesting to note here that synonymous words may be
divided into masculine and feminine classes. Also, the
vocabulary of a language is often dominated by the masculine
gender; hence, by default, gender transformation suffixes are
found for masculine-to-feminine transformation. Discussion of
this issue is presented in the section on the Number category.
3.5.2.1.1 Hindi Gender System
Though Hindi has its roots in Sanskrit, unlike Sanskrit (see
footnote 11) it has two genders, i.e. masculine and feminine. It
does not have a neutral gender. When we tried to trace the
reason for the omission of the neutral gender, we found that
scholars attribute it to the long span of foreign rule,
specifically Islamic, during which Hindi came under the
influence of Farsi, which has only masculine and feminine
genders; hence Hindi too has two genders (Yadav, 2011). Senior
grammarian Kishoridas Bajpayee has strongly justified the
two-gender system on the grounds that it brings simplicity to
the gender system. It is common practice to put most neutral
gender words from Sanskrit and other languages under the
masculine gender. The two-gender system is a source of problems
for Hindi speakers learning foreign and other Indian languages
that have three genders. As reported by (Yadav, 2011), the Hindi
gender system is not regular and stable; this is because Hindi
has a wide geographical spread and hence is influenced by local
culture, traditions and dialects. Gender transformation in
foreign words is also observed in Hindi. Gender forms are
11 Sanskrit has three genders, masculine, feminine, and
neutral
generated by suffixing gender suffixes to stems; this process is
called inflection. It is important to note here that only
masculine nouns are converted to their feminine counterparts,
and not all masculine nouns can be transformed to feminine
nouns. A few nouns are always masculine plural, e.g. (darshana),
(tears), (lips), (hair). Stems are inflected not only for gender
but also for the other grammatical categories listed above. This
kind of inflection is possible for verbs also. Besides GNP
features, verbs are also inflected for Tense, Aspect and Mood
(see footnote 12). Gender has a strong influence on Number
inflections. For the feminine gender, the suffix - is heavily
used in Hindi. Sanskrit feminine suffixes , - are derived as ,
-, -, -, - in Hindi. In addition to these suffixes Hindi also
uses other suffixes, but the eight frequently used suffixes
are: , -, -, -, -, - , and -. Sample masculine and feminine
words in Hindi and Marathi are given in Appendix B. Most Hindi
pronouns in all three persons and cases are gender neutral, so
to say; they can be used for both masculine and feminine gender.
Computationally this is a very important aspect in the context
of Hindi parsing and Machine Translation.
3.5.2.1.2 Marathi Gender System
Since Marathi follows the Sanskrit gender system, it has three
genders, i.e. masculine, feminine and neutral. Marathi pronouns
are more diversified in terms of gender than Hindi pronouns.
Senior grammarian Dadoba Pandurang has advocated an additional
common gender ( ) for words whose gender cannot be determined in
any tense, e.g. (you), (bird), but other grammarians like Damale
and Chiplunkar have opposed the idea of a common gender
(Hiremath, 1993). As far as gender fixing or identification is
concerned, word ending and traditional usage are important
criteria. Due to its three genders, the Marathi gender system is
more complex than that of Hindi. Like Hindi, the Marathi gender
system is also found to be irregular, because we can find words
whose gender cannot really be justified within the existing
frame of rules but which have been followed as part of
tradition and ancient
12 In our view, TAM is a manifestation of case feature for
verbs
practice, e.g. (Gold)-Neutral, (Silver)-Feminine. Neutral gender
is also used for animate living things. Words (common nouns)
representing a particular ontological class (college, class of
all fruits etc.) are generally put under the neutral gender.
Inanimate things can be put under any of the three genders.
Various suffixes are used for gender identification as well as
gender transformation; the frequently used suffixes are listed
in the following table (Table 3.9). It is interesting to note
that only masculine-to-feminine and feminine-to-masculine gender
transformation is observed in Marathi; masculine or feminine to
neutral is not observed. However, neutral to masculine or
feminine is observed in some cases (e.g. / ). Marathi pronouns
are more diversified as compared to Hindi. Marathi verbs are not
affected by gender in their future tense forms. A few words in
Marathi represent more than one gender; common nouns
representing professions generally fall under this category,
e.g. (advocate), (judge), (client), (registrar),
(Prime-minister) etc. Borrowed foreign words as well as Sanskrit
words may also undergo gender transformation in some cases.
Table 3.9 List of Hindi-Marathi Gender Transformation
Suffixes
Language
Gender
Masculine
(word ending)
Feminine
(word ending)
Hindi
-( ) - ( )
-( ) -( )
Hindi
-() -()
-( ) - ( )
-() - ( )
-() - ()
-() - ()
-() - ()
-() - ()
-() - ( )
- ( ) - ( )
- ( ) - ( )
- ( ) - ( )
- () - ()
- ( ) - ( )
Marathi
-() - ()
-() - ( )
() - ( )
-() - ()
-( ) - ( )
-() - ()
-( ) - ( )
-() - ()
- ( ) - ( )
Hindi and Marathi vocabularies contain a lot of common words. A
considerable number of such words that are masculine in Hindi
are put under the neutral gender in Marathi. Some examples of
such words are given in the following table (Table 3.10).
Table 3.10 Hindi-Marathi common words with different gender
Due to the difference in genders between Hindi and Marathi,
gender divergence is widely observed during translation. This
divergence has a strong effect on MT, as it may break the
agreement between sentence constituents. The effect is not
limited to the divergent word and its modifiers; it may even
affect the verb and auxiliary verbs, if the divergent word
governs the GNP features of the verb. HinMaT handles these
issues carefully.
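To make the agreement problem concrete, the following minimal sketch shows one way a changed noun gender could be propagated to the constituents that agree with it. It is not HinMaT's actual mechanism; the `Token` class, the `agrees_with` field and the function `propagate_gender` are hypothetical names introduced only for this illustration, and the lemmas are placeholder labels.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    lemma: str
    gender: str                        # "m", "f" or "n"
    agrees_with: Optional[int] = None  # index of the governing noun, if any


def propagate_gender(tokens: List[Token]) -> List[Token]:
    """Copy the gender of each governing noun to the tokens that agree with it."""
    for tok in tokens:
        if tok.agrees_with is not None:
            tok.gender = tokens[tok.agrees_with].gender
    return tokens


# A Hindi phrase whose head noun is masculine; the adjective and the verb
# both agree with the noun at index 1.
phrase = [Token("ADJ", "m", agrees_with=1), Token("NOUN", "m"), Token("VERB", "m", agrees_with=1)]

# Lexical transfer: suppose the Marathi equivalent of the noun is neutral.
phrase[1].gender = "n"
propagate_gender(phrase)
print([t.gender for t in phrase])
```

The point of the sketch is only that a single divergent noun forces updates on every constituent whose features it governs, including the verb.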
3.5.2.2 Number
This is a simple grammatical category as compared to other
categories. Number is used to represent the cardinality (count)
of things denoted by a lexical item. Even though it is primarily
associated with nouns (see footnote 13), its effect can be
observed on adjectives and verbs, as they can also be inflected
for number by affixing an appropriate suffix. This is a
grammatical convenience for coping with feature agreement
between sentence constituents. Quite often, the notion of
grammatical number agrees with real-world number, but sometimes
there is disagreement, because some words are used that way by
tradition. E.g. wheat (), sugar ( ) denote singular number,
whereas they actually refer to any number of grains. A few words
13 Only common nouns/pronouns are affected by the number
category; other noun types such as proper nouns and abstract
nouns are not affected by number.
Word    Marathi Gender    Hindi Gender
        Neutral           Masculine
        Neutral           Masculine
        Neutral           Masculine
        Neutral           Masculine
        Neutral           Masculine
        Neutral           Feminine
        Neutral           Masculine
        Neutral           Masculine
        Neutral           Masculine
are always used in either singular or plural form. We cannot
change the number of such words; uncountable things (milk,
water, hair) fall under this category. To quantify such things,
units of measure such as liter, gram etc. are used. There is no
uniformity regarding the number category amongst the different
languages of the world. Languages like Greek, Latin, and
Sanskrit have three numbers: singular (), dual number ( ) and
plural (). Fiji language has four numbers: singular (), dual
number ( ), trial number ( ) and plural (). Both Hindi and
Marathi use two numbers, i.e. singular () and plural (). A
detailed discussion of the number systems of Hindi and Marathi
is presented below.
3.5.2.2.1 Hindi Number System
Hindi uses two numbers, singular and plural. The ontological
classification of number (with examples) for the noun POS
category is represented in Figure 3.8.
Figure 3.8 Number category classifications in Hindi
The uncountable nouns under the mass category above are
quantified with the help of units like liter, kilogram, meter
etc., depending on their natural properties (liquid, solid
etc.). The uninflected word form normally denotes singular
number (see footnote 14). Plural forms are derived by affixing
plural suffixes to singular word forms. These suffixes are shown
in Table 3.11 below.
14 Excluding those words which are by default plural
[Figure 3.8: Number
  Countable
    Singular: (boy), (goat), (horse)
    Group: (crowd), (meeting), (family)
  Uncountable
    Mass: (gold), (steel), (water)
    Abstract: (truth), (fear)]
Singularity or plurality is also governed by another factor,
i.e. the case marker. Two forms are found for singular and
plural representation depending on the use of a post-position
marker (see footnote 15); these forms are called (Savibhaktik:
with post-position marker) and (Avibhaktik: without
post-position marker). It is important to note that, in Hindi,
all masculine plural direct-case common noun forms are the same
as the masculine singular oblique forms. E.g. the word (bacche:
boys) denotes the plural direct case, whereas the same word in
the phrase (bacche ne) denotes the singular sense, a boy. This
fact can be modelled using the following equation:
$$\mathrm{Word}_{\text{masculine},\,\text{plural},\,\text{direct}} = \mathrm{Word}_{\text{masculine},\,\text{singular},\,\text{oblique}}$$
Table 3.11 Hindi Plural Suffixes (Direct case)
This phenomenon is not observed in Marathi. For plural forms two
separate word forms are found in Hindi. While pluralizing a
singular form through affixing, long vowels are sometimes
shortened, e.g. (ladki/girl: Singular) (ladkiyaan/girls:
Plural), (Nadii/River: Singular) (Nadiyaan/Rivers: Plural).
Adjectives, pronouns and verbs are also inflected for number.
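The direct-plural versus oblique-singular ambiguity of Hindi masculine forms such as bacche can be resolved by looking at the following token, as the (bacche ne) example suggests. The sketch below illustrates that idea only; the function name `resolve_masculine_form` and the romanized post-position subset `POSTPOSITIONS` are assumptions introduced here, not part of HinMaT.

```python
from typing import Dict, Optional

# Illustrative subset of Hindi post-position markers (romanized); an assumption.
POSTPOSITIONS = frozenset({"ne", "ko", "se"})


def resolve_masculine_form(next_token: Optional[str]) -> Dict[str, str]:
    """Pick the feature reading of an ambiguous masculine form like 'bacche'.

    If a post-position marker follows, the form is read as singular oblique;
    otherwise it is read as plural direct.
    """
    if next_token in POSTPOSITIONS:
        return {"gender": "m", "number": "sg", "case": "oblique"}
    return {"gender": "m", "number": "pl", "case": "direct"}


print(resolve_masculine_form("ne"))   # reading for "bacche ne"
print(resolve_masculine_form(None))   # reading for a bare "bacche"
```

A real analyzer would need the full marker inventory and would have to skip intervening particles; this sketch checks only the immediately following token.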
In the case of adjectives, masculine-singular (ms),
masculine-plural (mp), feminine-singular (fs) and
feminine-plural (fp) forms are possible. However, the feminine
singular and plural forms are the same, and the masculine plural
direct-case form is the same as the masculine singular oblique
form. E.g. (Acchha: good, masculine, S), (Acchhi: good,
feminine, S/P), (Acchhe: good, masculine, P direct/S oblique).
This important fact is depicted using the following equations.
15 The term post-position is used here in a broader sense, which
also includes case markers. The term case marker is also found
in the literature.
Form( ) Singular() Plural() , , , , ,
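$$\mathrm{Word}_{\text{feminine},\,\text{singular}} = \mathrm{Word}_{\text{feminine},\,\text{plural}}$$
$$\mathrm{Word}_{\text{masculine},\,\text{plural},\,\text{direct}} = \mathrm{Word}_{\text{masculine},\,\text{singular},\,\text{oblique}}$$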
In Marathi one more form is found, i.e. four forms in total;
they are explained in a later section. Verb forms get inflected
for number, and take the number of either the subject or the
object or none. In the no-agreement case, i.e. when the Hindi
verb agrees with neither karta nor karma, the Hindi verb form is
always masculine singular. In the case of masculine/feminine
plural verb forms, the auxiliary verbs are nasalized with
anuswaar (), e.g. / (They are going). If the auxiliary verb is
not present then the feminine plural verb form is nasalized with
anuswaar (), e.g. (gave- feminine, plural), (ate- feminine,
plural), (sent- feminine, plural) etc., while the masculine
plural forms become - , e.g. (gave), (ate), (sent). Plurality
can also be described using compound words ( ), reduplication
( ), and quantifiers ( ). A few words are by default used in
plural form ( ) only. Examples of such words are given below:
Compound Words: (cow) + (buffaloes) = , (sheep) +
(goats) = , (teacher) + (community/group)=
Reduplication: -, - etc.
Quantifiers: (five boys), (all students), (some rupees),
(all employees) etc.
Default Plural: (tears), (life), (Darshana) etc.
The number feature of foreign words is mostly decided as per the
rules of the foreign language; in a few cases Hindi suffixes are
conjoined to such words to derive plural forms. A detailed
discussion of this aspect can be found in (Guru, 1920). Pronouns
in Hindi are also affected by number; their morphology is
completely irregular. A detailed discussion of pronouns is
presented under the Person category (section 3.5.2.3 ahead). -
adjectives are normally inflected for gender and number. Plural
forms are also used for denoting honor, in which case they are
actually being used in a singular sense. As stated earlier
(section 3.5.2.1), gender and the word-ending letter ( ) have a
strong influence on number suffixes, along with case. Table 3.12
describes the paradigm used in deriving plural inflections,
considering gender, word ending and direct/oblique case.
Table 3.12 Hindi Plural Suffixes Paradigm
Form ( ) Gender( ) Word Ending(- )
Plural()
Example(.)
( )
Withoutpost-position
marker
(masculine)
- - ,,
- ,
(feminine)
- +
+
, , , ,
- -
- -
-, ,
, - -
(masculine)
-/- - ,
(feminine)
-/- -
./ . -/- - ->- : --
,
./ . - - (--) , Rest AllForms
Rest All Forms - , ,
Besides the above usage with post-position markers, the suffix
is also used to denote plurality of words without post-position
markers, e.g. (many decades), (many years), (both), (thousands),
(crores) etc.
3.5.2.2.2 Marathi Number system
Marathi also follows two numbers, viz. singular and plural. As
in Hindi, the number category affects only common nouns in
Marathi. Pronoun classification is based on gender and number;
personal plural pronouns are also used to express honour, and in
such cases they denote singular number. Verb forms are also
affected by number, depending upon their agreement with either
the subject or the object or none in a given instance; in the
case of no agreement, the verb form is always neutral gender,
singular (ns). No suffixes are required for expressing the
singular form. However, in the singular a word has two forms:
with post-position marker ( ) and without post-position marker
( ). When used with post-positions, the oblique forms of nouns
and adjectives are used. Derivation of Marathi oblique forms is
explained in a later section. The paradigm for pluralization of
Marathi words in different genders is given in the following
chart (Table 3.13).
Table 3.13 Marathi Plural Suffixes Paradigm
WordEnding
Masculine Feminine Neutral
-, -,- - - - No word -, - - -, - - -, - - -, - - No word - - No
word - No word - - No word - No word No word
The chart is self-explanatory. In Marathi only ending masculine
word forms are inflected for plural number; other forms are the
same in singular and plural. It is also important to note here
that certain vowel-ending words are not found in all three
genders. During pluralization, Marathi words also undergo some
morphological changes, like a change in vowel length (long to
short) and the introduction of or at the end (see footnote 16).
ending feminine forms can be pluralized in three different ways
(- , - or both). In both classes, two plural forms are derived,
e.g. (behavior) plural-1, plural-2 etc. As in Hindi, plurality
can also be described using compound words ( ), reduplication
( ), and quantifiers ( ) in Marathi, and a few words are by
default used in plural form ( ) only. It is important to note
here that Marathi has four inflected forms for gender and number
as against three in Hindi. E.g. /// , / // , / // . For
declinable words, the following equations hold.
Word , , = Word , ,
Word , , = Word , ,
Word , / , = Word , / , = Word , / ,
This feature overloading of Marathi word forms is very important
from the computational as well as the storage point of view: we
do not need to store all these forms separately with different
feature specifications, but can store them as a single word with
a compact feature specification. In such cases, we must resolve
them to the appropriate feature specification from the above
before parsing; otherwise we may parse the wrong words.
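The storage idea described above, one surface form carrying a compact set of feature readings that is narrowed down before parsing, might be sketched as follows. The lexicon layout, the entry `FORM_OBL` (a placeholder for a form shared by the oblique case of all genders and numbers) and the function `resolve` are assumptions made for this illustration only.

```python
from itertools import product
from typing import Dict, FrozenSet, List, Optional, Tuple

Feature = Tuple[str, str, str]  # (gender, number, case)

# One surface form stored once, with every feature reading it can carry.
# "FORM_OBL" is a placeholder label, not a real Marathi word form.
LEXICON: Dict[str, FrozenSet[Feature]] = {
    "FORM_OBL": frozenset(product(("m", "f", "n"), ("sg", "pl"), ("obl",))),
}


def resolve(form: str,
            gender: Optional[str] = None,
            number: Optional[str] = None,
            case: Optional[str] = None) -> List[Feature]:
    """Return the readings of `form` compatible with the known constraints."""
    wanted = (gender, number, case)
    return sorted(
        feat for feat in LEXICON.get(form, frozenset())
        if all(w is None or w == v for w, v in zip(wanted, feat))
    )


# Without context the form is six-way ambiguous; agreement with a feminine
# plural head noun narrows it to a single reading.
print(len(resolve("FORM_OBL")))
print(resolve("FORM_OBL", gender="f", number="pl"))
```

Storing the readings as a set keeps the lexicon small, while the `resolve` step corresponds to fixing the feature specification before parsing.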
16 / change, e.g. , . later is governed by letter clustering
rules( ): - += (E.g. - ), - += (E.g. , )
3.5.2.3 Person
Besides the message (sentence gist), every sentence also encodes
a reference to the speaker, the listener or some other entity by
means of the person category. All languages use three persons to
denote the participating entities in the sentence, namely first
person, second person and third person. First person refers to
the speaker, second person to the listener or hearer, while
third person refers to anything other than these two. All nouns
(except pronouns) are always treated as third person. Number
affects the person category in all three persons: different
pronouns are used to denote singular or plural number for each
of the first, second and third persons. First and second person
pronouns are gender neutral, i.e. they are not affected by
gender; only third person pronouns/nouns are affected by the
gender category. The person category also affects the auxiliary
verbs in most languages. The third person pronouns can be
further classified based on grammatical features like human
(+h)/non-human (-h), proximity/remoteness,
definiteness/indefiniteness, interrogative, presence/absence,
relational and reflexive. A detailed discussion of the person
category w.r.t. Hindi and Marathi is presented below.
3.5.2.3.1 Hindi and Marathi Person category
Like English, Hindi and Marathi too have three persons. The
person category primarily affects the pronouns and auxiliary
verbs. Theoretically, the person category is influenced by the
gender as well as the number categories. However, Hindi pronouns
are affected only by number and not by gender, whereas Marathi
pronouns are affected by both number and gender. For third
person pronouns four forms are observed in Marathi, e.g. for the
English demonstrative pronoun (remote) that, we have /// forms
in Marathi, and the demonstrative pronoun (proximity) this has
/ // equivalent forms in Marathi. Hindi has only one form for
all these cases, (this) and (that). The and forms are used for
marking the oblique case in all three genders and both numbers
of Marathi. This is an important factor from the MT point of
view, as there is a one-to-many mapping between the pronouns of
Hindi and Marathi. This hints the parsing process to fix the
person feature value of the pronoun under consideration. As
such, no divergence is observed in pronouns. The complete list
of pronouns in Hindi and Marathi is presented in the following
table (Table 3.14).
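The one-to-many mapping between Hindi and Marathi demonstrative pronouns described above can be illustrated with a small lookup keyed on the gender of the referent. This is a sketch under stated assumptions: the table `MARATHI_DEMONSTRATIVES`, the function `map_pronoun` and the specific singular forms listed are supplied here for illustration from our own understanding of the two languages and are not taken from the HinMaT lexicon.

```python
from typing import Dict

# Hindi singular demonstrative -> Marathi forms by gender, plus the shared
# oblique form. The listed forms are assumptions made for this sketch.
MARATHI_DEMONSTRATIVES: Dict[str, Dict[str, str]] = {
    "यह": {"m": "हा", "f": "ही", "n": "हे", "obl": "ह्या"},  # proximate "this"
    "वह": {"m": "तो", "f": "ती", "n": "ते", "obl": "त्या"},  # remote "that"
}


def map_pronoun(hindi: str, gender: str, oblique: bool = False) -> str:
    """Choose the Marathi form for a Hindi demonstrative pronoun."""
    forms = MARATHI_DEMONSTRATIVES[hindi]
    return forms["obl"] if oblique else forms[gender]


print(map_pronoun("वह", "f"))
print(map_pronoun("यह", "n"))
print(map_pronoun("वह", "m", oblique=True))
```

The gender argument has to come from the referent resolved during parsing, since the Hindi source pronoun itself does not carry it.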
Table 3.14 Hindi & Marathi Person System
As stated earlier, the auxiliary verbs are also affected by the
person category in Hindi as well as Marathi. This impact is
greater in Hindi than in Marathi. The Hindi and Marathi
auxiliary verbs in different persons are listed in the following
table (Table 3.15).
Hindi ( ) Marathi ( )Singular
()Plural
()Singular
()Plural
()(First) (I) (we) (I) (We)
(Second)/(you) (you) (you) /(
you)
(you)()
(you)()
(you)()
(you)()
(Third)
/(Demonstrative)
/(He) /(They)
(m)(He/that)/(f) (she/that)/(n)(it/that)/(*Obl)(that)
(m)/ (f)/: (Those)
(Proximate)
(this)(this)
(these)(these)
(m)(this)/ (f)(this)/(n)(this)/ (*Obl)(this)
(m)/ (f)/ (f):
(these)
(Relative)
(who)/(who) (whom)
(m)(who)/(f)(who)/
(n) (who)/(*Obl)(those)
(m)/ (pl. obl)/(n):(those)
(Interrogative)
(who) (who) (who) (who)
(Indefinite)
/ (anybody)
/ (anybody)
(who)
(Reflexive)
-(automatically)
(self)
-(automati
cally)(self)
(self))
(you)(Self)
Table 3.15 Person suffixes for Auxiliary verbs of Hindi &
Marathi
(Person)
(Tense)
(Hindi) (Marathi)
(Masculine)
(Feminine)
(Masculine)
(Feminine)
(Neutral)Sing
..Plural
..Sing
..Plural
..Sing
..Plural
..Sing
..Plural
..Sing
..Plural
..
First.. - - - - - - -
- -
..
-
-
-
.. -/-
-/-
-/-
-/-
- -
.. -
-
-
-
-
. - - - - - - - - - -.. -
-
-
-
-
- -
-
Second
.. - - -
- - - - -- -
..
-
-
-
-
.. -/-
-/-
-/-
- /-
-/-
--/
- /-
- /-
- -
..
-
-
-
-
- -
.. - - - - - - - - - -.. -
-
-
-
-
/
-
/
-
/
-
/ - -
Third
.. - - - - - - - - - -
..
-
-
-
-
-
-
.. -/-
-/-
-/- -
-/-
-
-
-
-
-
-
-
..
-
-
-
-
-
-
.. - - - - - -/ - - - -.. -
-
-
-
- /
-/
- /
-/
- /
-/
3.5.2.4 Karaka
Karaka (case) is an important category as it is directly related
to the sentence level and can be exploited computationally
during the parsing of a sentence. Generally a sentence is
defined as a sequence of meaningful words, but this is not a
complete definition because it is mandatory for the
participating words to have some relationship with each other.
These relationships help at different levels of sentence
analysis, i.e. morphological, syntactic and semantic. Without
this correlation amongst the words, a sentence is not
meaningful. So we can say that for maintaining the harmony of
the sentence, the words in the sentence must be compatible with
each other. The relationship between these words is denoted by
Karaka. For a sentence to be meaningful, it should have three
characteristics: Yogyata (eligibility), Aakansha (expectancy),
and Aasatti (bonding). These have been mentioned in a Sanskrit
verse by Hindi scholar Vishwanathji (Vishwanathdev), which
says, . The three characteristics mentioned above are explained
below:
Yogyata (eligibility): The meaning of sentence constituents
should be capable of relating to other constituents in a
meaningful way.
Aakansha (expectancy): The meaning of some constituents cannot
be expressed in isolation; such constituents expect the presence
of other constituents to which they relate. This dependence is
called Aakansha (expectancy). The expectancy is of two types:
mandatory and optional. Without fulfillment of the first type
the sentence is not meaningful, while the second type is
required for extending the meaning of the sentence.
Aasatti (bonding): This is also called Saannidhi; it refers to
the positional proximity (closeness of word positions) between
related sentence constituents.
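The distinction between mandatory and optional Aakansha can be sketched as a check of which expected roles of a verb are still unfilled. The frame table `FRAMES`, its single entry for a verb glossed "give", and the function `unmet_expectancy` are hypothetical and serve only to illustrate the idea; they do not describe an actual HinMaT resource.

```python
from typing import Dict, Set

# Hypothetical expectancy frame: the roles a verb expects, split into
# mandatory and optional. Role names follow the Karaka labels.
FRAMES: Dict[str, Dict[str, Set[str]]] = {
    "give": {"mandatory": {"karta", "karm"}, "optional": {"sampradan"}},
}


def unmet_expectancy(verb: str, present_roles: Set[str]) -> Set[str]:
    """Return the mandatory roles of `verb` that the sentence does not fill."""
    return FRAMES[verb]["mandatory"] - present_roles


# A sentence supplying only the doer leaves the object expectancy unmet,
# so it would not count as meaningful under the mandatory type.
print(unmet_expectancy("give", {"karta"}))
print(unmet_expectancy("give", {"karta", "karm", "sampradan"}))
```

Optional roles are deliberately ignored by the check: their absence does not make the sentence ill-formed, they only extend its meaning when present.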
With regard to the exact definition of Karaka, scholars have
different opinions. According to Jespersen (Otto, 1965), the
relationship between noun(s), adjective(s) or pronouns and other
constituents of a sentence is called a Karaka relationship. This
relationship is scoped to noun-noun, noun-verb, auxiliary
verb-main verb, adjective-noun etc. relationships. According to
many Sanskrit scholars, including the Sanskrit legend Panini,
i.e. any relationship between constituents (mostly nouns) and
the verb in a given sentence is a Karaka relation. The nature of
this relationship is functional. These scholars do not accept
noun-noun, adjective-noun or non-noun-verb relationships as
karaka relationships. Western philosopher Fillmore (Fillmore,
1968) gave serious thought to Case theory. His notion of case is
based on an early conception of theta role theory. Theta theory
specifies the following eight theta roles (see footnote 17),
which per se do not go hand in hand with the Sanskrit Karakas:
1. Agent ( ): Doer of action is Agent. E.g. He (agent) is
writing letter.
2. Experiencer (): One who experiences the act mentioned in the
verb or takes the denoted action. E.g. Ramesh (experiencer) was
happy to receive the prize.
17 The number of theta roles as such is not fixed, but we have
described the primary theta roles. Theta roles give the semantic
(functional) relationship between constituent words at the
meaning level.
3. Instrumental: An inanimate thing or object used in carrying out
the action specified by the verb. E.g. He cut the apple with a
knife (instrument).
4. Object/Patient: The thing or person that undergoes the change
implied by the verb. E.g. Ram painted the house (patient).
5. Theme: Something or somebody that is the topic of discussion and
whose state can be perceived by the speaker as in motion or steady.
E.g. The ball (theme) is rolling down; the bottle (theme) is green.
6. Locative: The place of action. E.g. He was arrested in Diamond
Hotel (locative).
7. Source: The point of separation, which remains stationary as the
action progresses, in most motion verbs. E.g. The train departed
from the platform (source).
8. Goal: The end point of the action, where the action ends, in
stative/motion verbs. E.g. Ram went to school (goal) from his home
(source).
The standard Latin and Greek case grammar assumes eight cases:
nominative, accusative, instrumental, dative, ablative, possessive,
locative and vocative. Panini's Karaka theory (500 BC) describes six
Karaka relations: Karta, Karm, Karan, Sampradan, Apadan and
Adhikaran. In Sanskrit, the term Vibhakti refers to the word form
that is morphologically inflected to denote a particular case, using
the post-position marker symbol prescribed for that Vibhakti. The
relation between Karaka and Vibhakti is many-to-many, because the
same Karaka can be expressed by several morphological forms
(Vibhakti) and one morphological form (Vibhakti) can represent
different Karakas. Karakas are syntactico-semantic in nature
(Bharati, Chaitanya, & Sangal, 1995); they are identified with the
help of syntactic cues such as post-position markers (Vibhakti
symbols). Because Karaka and Vibhakti are closely related, the two
are often confused with each other. Panini, however, clearly
differentiated Karaka from Vibhakti: according to him, Karaka is a
semantic element and Vibhakti is its morphological representation.
We discuss this aspect in a later section of this chapter. We now
review the Hindi and Marathi Karaka systems.
3.5.2.4.1 Hindi Karaka system
Owing to Western influence on Hindi grammar, the early grammarians
prescribed eight Karaka relations (Guru, 1920), (Kellogs, 1955).
These Karakas, along with their equivalent cases in Case Theory, are
presented below:
1. Karta Karaka (Nominative case)
2. Karm Karaka (Accusative case)
3. Karan Karaka (Instrumental case)
4. Sampradan Karaka (Dative case)
5. Apadan Karaka (Ablative case)
6. Sambandh Karaka (Possessive case)
7. Adhikaran Karaka (Locative case)
8. Sambodhan Karaka (Vocative case)
Hindi scholars like Kishoridas Bajpayee (Bajpayee K. P., 1959) and
others from the Sanskrit school of thought are not in favor of
Sambandh Karaka (Possessive case) and Sambodhan Karaka (Vocative
case), as these do not describe any relationship with the verb. For
HinMaT, we have considered Karakas 1-7. In Hindi, a Karaka relation
is expressed with the help of post-position markers (Vibhakti
symbol/parsarga). The following table (Table 3.16) shows the Karakas
and their Vibhakti symbols in Hindi.
Table 3.16 Hindi Karaka-Vibhakti Table
(zero = no Vibhakti marker)
Sr. No.  Karaka      Vibhakti Symbol
1        Karta       ne, zero
2        Karma       ko, zero
3        Karan       se
4        Sampradan   ko, ke liye
5        Apadan      se
6        Sambandh    ka / ki / ke
7        Adhikaran   mai / par
It is apparent from the above table that the Vibhakti markers ko
and se are overloaded: ko is used to denote both Karma
(accusative/theme/object) Karaka and Sampradan (dative/beneficiary)
Karaka, while se denotes both Karan (instrumental) and Apadan
(ablative/source) Karaka. From the parsing point of view this is an
important aspect, as we need to resolve the appropriate Karaka.
Whenever a Vibhakti marker specifies the Karaka relationship(s) for
which it is designated, the usage is called SwaVibhaktik (native
usage); where a Vibhakti marker specifies some other Karaka
relationship, the usage is called Parvibhakti (foreign usage).
Foreign Vibhakti usage is covered in the discussion of the
individual Karakas.
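The many-to-many relation between Karaka and Vibhakti can be made concrete with a minimal sketch. The dictionary below is a romanized transcription of Table 3.16, with the string "zero" standing for the absent marker; the function names are illustrative assumptions and do not describe the HinMaT implementation.

```python
from collections import defaultdict

# Romanized transcription of Table 3.16 ("zero" = no Vibhakti marker).
KARAKA_TO_VIBHAKTI = {
    "karta": ["ne", "zero"],
    "karma": ["ko", "zero"],
    "karan": ["se"],
    "sampradan": ["ko", "ke liye"],
    "apadan": ["se"],
    "sambandh": ["ka", "ki", "ke"],
    "adhikaran": ["mai", "par"],
}


def invert(table):
    """Build the marker -> candidate Karakas view of the same relation."""
    inverse = defaultdict(list)
    for karaka, markers in table.items():
        for marker in markers:
            inverse[marker].append(karaka)
    return dict(inverse)


def ambiguous_markers(inverse):
    """Markers that map to more than one Karaka and need disambiguation."""
    return {m: ks for m, ks in inverse.items() if len(ks) > 1}


VIBHAKTI_TO_KARAKA = invert(KARAKA_TO_VIBHAKTI)
print(ambiguous_markers(VIBHAKTI_TO_KARAKA))
```

Inverting the table exposes the overloaded markers directly: ko and se each return two candidate Karakas, and the zero marker is likewise shared by Karta and Karma, so a parser has to choose among the candidates using further cues.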
Hindi words must be converted to their oblique form whenever they
take a Vibhakti marker to specify a Karaka relationship. Proper
nouns have the same form in the direct and the oblique case, so
they do not undergo any morphological inflection, while other noun
types may undergo morphological change(s) depending on their number
and gender. The Hindi oblique morphological suffixes (Kaul, 2008)
are listed in the following table (Table 3.17a), and examples are
presented in Table 3.17b.
Table 3.17a Hindi Oblique Suffixes Table (Kaul, 2008)
Case       Masculine            Feminine
           Singular   Plural    Singular   Plural
Direct
Oblique    -          -         -          -
Vocative   -          -         -          -
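The conversion to the oblique form before a Vibhakti marker is attached can be sketched as follows. The romanized endings and the example words (ladkaa, ladkii and their plurals) are assumptions chosen for illustration; they are not copied from Table 3.17a, and only two common noun classes are covered.

```python
# Assumed romanized endings for two common noun classes (illustrative only).
OBLIQUE_SUFFIX = {
    # (gender, number): (direct ending, oblique ending)
    ("m", "sg"): ("aa", "e"),
    ("m", "pl"): ("e", "on"),
    ("f", "sg"): ("ii", "ii"),
    ("f", "pl"): ("iyaan", "iyon"),
}


def to_oblique(word, gender, number, proper_noun=False):
    """Return the oblique form; proper nouns are left unchanged."""
    if proper_noun:
        return word
    direct, oblique = OBLIQUE_SUFFIX[(gender, number)]
    if word.endswith(direct):
        return word[: len(word) - len(direct)] + oblique
    return word  # noun class not covered by this sketch


def attach_marker(word, marker, **features):
    """Put the word in its oblique form, then add the post-position marker."""
    return f"{to_oblique(word, **features)} {marker}"


print(attach_marker("ladkaa", "ko", gender="m", number="sg"))
```

The marker stays a separate token after the oblique noun, which matches the observation that Vibhakti markers do not agglutinate with nouns.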
Table 3.17b Hindi Oblique Word Form Examples
Case       Masculine                 Feminine
           Singular     Plural       Singular     Plural
Direct     (boy)        (boys)       (girl)       (girls)
Oblique    (boy)        (boys)       (girl)       (girls)
Vocative   (hey boy)    (hey boys)   (hey girl)   (hey girls)
Hindi literature uses the terms Parsarg and Vibhakti to mean case
markers. Early grammarians like Pandit Kamata Prasad Guru and Pandit
Kishoridas Bajpeyi used the term Vibhakti, as it has been used by
tradition since Sanskrit times. Parsarg or Vibhakti differ from
normal suffixes in that normal suffixes agglutinate with the
preceding word (noun, adjective, etc.), while Vibhakti markers do
not (footnote 18). Another argument is that a Vibhakti marker is
the rightmost suffix, i.e. no other marker can follow it, whereas
Parsargas can appear in sequence any finite number of times, for
instance two Parsargas one after the other. In our understanding,
Parsarga is a broader class of post-position markers that also
includes the Vibhakti markers. Parsargas can be further classified
as declinable and non-declinable, as shown in the following figure
(Figure 3.9).
18 Pronouns are an exception: all pronominal forms agglutinate in
their oblique cases.
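Figure 3.9 Post-position marker classification
[Figure: post-position markers are classified as non-declinable
(others) and declinable (possessive, similarity).]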
3.5.2.4.1.1 Karaka Usage in Hindi Sentences
Each of the seven Karakas discussed in the preceding section is
explained below. Our discussion is not confined to the traditional
literature and the opinions of the early grammarians; it also draws
on the modern Paninian analysis from the Computational Linguistics
point of view by Prof. Rajeev Sangal, Vineet Chaitanya and Prof.
Amba Kulkarni (Bharati, et al., 2006), (Begum, Husain, Dhwaj,
Sharma, & L. Bai, 2008):
3.5.2.4.1.1.1 Karta Karaka
Karta Karaka refers to the doer of the action or the subject of the
sentence. E.g. (Ram went home), (Sita cooked the food). Here Ram and
Sita appear in Karta Karaka. The Karta can govern the GNP features
of the verb in a given sentence. In most cases the Karta is an
animate entity. The Karta Karaka may or may not correspond to the
real-world notion of the doer of the action; where it does not, the
scope of the Karta is restricted to the grammatical world and it is
a grammatical Karta, e.g. (The man died). This is a common scenario
in causative sentence constructions, where the action indicated by
the verb is performed by one entity but is enforced or initiated by
someone else with the help of one more participant. E.g. (Mother had
the maid make the baby drink milk). Here the baby (Bachhe: oblique
form of baby) is actually drinking the milk and is the doer of the
action, but he is not drinking it by himself; he drinks it with the
help of the maid, who does so on orders from the mother. The maid
mediates between mother and baby. Mother, maid and baby (in oblique
form) are treated as prayojak Karta (sponsor), madhyasth Karta
(mediator) and prayojya/anubhavak Karta (experiencer) respectively.
This terminology is used in modern Paninian analysis (Bharati, et
al., 2006), (Begum, Husain, Dhwaj, Sharma, & L. Bai, 2008).
Karta Karaka is marked with the Vibhakti marker ne whenever the
verb is in the past tense form (Karmani Prayog), where the verb
generally agrees with the GNP features of the Karma. It takes no
Vibhakti in Kartari Prayog, where the verb agrees with the GNP
features of the Karta. The Vibhakti marker ko is also used to denote
Karta Karaka in conjunction with certain verbs and main verb +
auxiliary verb usages. Whenever the Karta is in the experiencer
role, it takes the Vibhakti marker ko. A detailed discussion of the
use of ko with the Karta is presented in (Sing, 1985). Examples of
this usage are given below:
1. (Mohan wants the book)
2. (Ram is feeling hungry)
3. (He appears to be brave to me)
4. (Kalpana got the prize)
In Bhave Prayog, i.e. where the verb agrees with neither the Karta
nor the Karma in GNP and is always in the masculine singular form,
the Vibhakti symbol se is used to denote the Karta's ability or
inability to perform the action of the verb. E.g. (Ram is not able
to eat), (I committed a mistake).
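The marking of the Karta described above can be summarised as a small decision function. This is a simplified sketch: the romanized marker names (ne, ko, se, and "zero" for no marker), the prayog labels and the single `experiencer` flag are assumptions made for illustration, and real sentences need more features than these two.

```python
def karta_marker(prayog, experiencer=False):
    """Return the romanized Vibhakti marker expected on the Karta.

    prayog: "kartari", "karmani" or "bhave" (assumed labels).
    experiencer: True when the Karta is in the experiencer role.
    """
    if experiencer:
        return "ko"
    by_prayog = {
        "kartari": "zero",  # verb agrees with the Karta, no marker
        "karmani": "ne",    # past tense form, verb agrees with the Karma
        "bhave": "se",      # ability/inability reading, verb in default form
    }
    try:
        return by_prayog[prayog]
    except KeyError:
        raise ValueError(f"unknown prayog: {prayog!r}") from None


print(karta_marker("karmani"))
print(karta_marker("kartari", experiencer=True))
```

Checking the experiencer role first reflects the examples above, where verbs of wanting, feeling and receiving mark the Karta with ko regardless of the other features.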
3.5.2.4.1.1.2 Karma Karaka
The object of the verb, i.e. the thing directly affected by the
action indicated by the verb, is called the Karma. E.g. (Ram eats
mango), (Ram killed Ravan). Here the words for mango and Ravan
represent the Karma Karaka. One more instance of Karma, called gaun
karma, is described in Paninian analysis; in its example sentences
the words glossed as home, Bombay and leader are gaun karma.
The Vibhakti marker ko is used with animate nouns; with inanimate
nouns generally no Vibhakti is used, except in special cases where
the use of ko is allowed, e.g. (you save the country).
The Vibhakti marker se is used with animate nouns in the accusative
case, with special reference to verbs of psychological predicate
such as to speak, to ask and to demand, e.g. (Ram said to his
father). Kamataprasad Guru (Guru, 1920) has described this usage of
Karma as Gaun karma. The Vibhakti marker ka (footnote 19) is used
to mark Karma Karaka in sentences with a complex verb, as discussed
in (3.5.2.4.1.1.6) ahead. E.g. (we worship god)
3.5.2.4.1.1.3 Karan Karaka
Karan denotes the instrument used for carrying out the action. E.g.
(Gita cut the apple with a knife). The knife is used to cut the
apple; hence it occurs in the instrumental case. The Vibhakti marker
se is used to mark this Karaka.
3.5.2.4.1.1.4 Sampradan Karaka
The beneficiary of the action of a ditransitive verb is denoted by
Sampradan Karaka. E.g. (Dhananjay gave a book to Vilas). Vilas, who
received the book, is the beneficiary of the verb (give) and appears
in Sampradan Karaka.
19 ka is used in a representative sense to also denote its
inflections ki and ke.
3.5.2.4.1.1.5 Apadan Karaka
Panini, in his legendary Ashtadhyayi, mentions three types of
Apadan: motion, state (stative) and fear.
Apadan Karaka indicates the point of separation, which remains
stationary as the action indicated by the verb progresses, for
motion verbs or verbs indicating a change of state. Some treat
Apadan as the source from which the verb's action begins. Consider,
for example, the Hindi sentences (The train departed from the
platform), (The cat fell off the roof), (He saved me from the tiger)
and (I am afraid of the snake). Here platform and roof denote the
point of departure, which remains stationary after the train and the
cat move away; hence they appear in Apadan Karaka. So do tiger and
snake with psychological predicates such as saving-from and
afraid-of. In addition, modern Paninian analysis (Bharati, et al.,
2006), (Begum, Husain, Dhwaj, Sharma, & L. Bai, 2008) prescribes one
more type of Apadan, called Prakritik Apadan (natural ablative),
which relates to the change of state of a material, e.g. (Doors are
made from wood), (Ice-cream is made from milk). Here wood and milk
denote the source material from which the doors and the ice-cream
are made. We can observe that the Vibhakti marker se is used to mark
two Karakas, Karan and Apadan; disambiguating each instance between
the two is a computational challenge in Hindi parsing. Apadan is
also expressed with the help of Vibhakti compounds, e.g. (bullet
fired from gun), (he fell off the roof). The different usages of se
are listed in the following table (Table 3.18).
Table 3.18 Use of the se (Se) post-position marker
Sr. No.  Usage               Example
1.       Instrumental
2.       Ablative
3.       Comparative
4.       Incapabilitative    (causative-2)
5.       Indirect Object
6.       Co-agentive
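The Karan/Apadan ambiguity of the se marker can be illustrated with a toy resolver. Everything here is a hypothetical sketch, not the HinMaT disambiguation method: the verb sets are tiny English-gloss lexicons invented for the example, and the `noun_is_tool` flag stands in for whatever semantic information a real parser would consult.

```python
# Hypothetical English-gloss lexicons of verbs whose se-marked noun is Apadan.
SEPARATION_VERBS = {"depart", "fall", "fire"}   # point of separation
FEAR_OR_PROTECTION_VERBS = {"fear", "save"}     # afraid-of, saving-from
CHANGE_OF_STATE_VERBS = {"make"}                # source material (Prakritik Apadan)

APADAN_VERBS = SEPARATION_VERBS | FEAR_OR_PROTECTION_VERBS | CHANGE_OF_STATE_VERBS


def resolve_se(verb_gloss, noun_is_tool):
    """Guess the Karaka of a se-marked noun: 'karan', 'apadan' or 'unresolved'."""
    if noun_is_tool:
        return "karan"       # instrument used to carry out the action
    if verb_gloss in APADAN_VERBS:
        return "apadan"      # source, point of separation or object of fear
    return "unresolved"      # comparative, co-agentive and other usages


print(resolve_se("cut", noun_is_tool=True))
print(resolve_se("depart", noun_is_tool=False))
```

The "unresolved" outcome is deliberate: the comparative, incapabilitative, indirect-object and co-agentive usages of se fall outside this two-way choice and would need their own cues.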
3.5.2.4.1.1.6 Sambandh Karaka
Sambandh Karaka specifies a possessive relationship between two
nouns. The Vibhakti marker kaa is used to mark this Karaka; it is
inflected for GNP features into the feminine form kii and the plural
form ke. It agrees with the GNP features of the noun that follows
it, so this Vibhakti marker behaves like a preposition, e.g. (Lata's
father/father of Lata), (Ram's brother/brother of Ram), (Sachin's
car). These Vibhakti markers are also used together with verbs in
conjunct verbs; hence modern Paninian analysis describes finer
shades of this Karaka, namely Karta-Sambandh vachak and
Karma-Sambandh vachak, for the complex predicate and the argument of
a complex verb.
E.g. 1) (The shop was inaugurated yesterday)
2) (Vice-Chancellor sir inaugurated the exhibition yesterday)
3) (Guests are about to come).
In example 1) the word in bold (exhibition) appears as
Karta-Sambandh karaka, as (Inauguration) is part of the complex
verb, while in example 2) it is Karma-Sambandh karaka, and in 3) the
infinitive appears in Kriya-Sambandh karaka.
3.5.2.4.1.1.7 Adhikaran Karaka
Adhikaran primarily answers questions like where and when in the
context of the verb's action. In the classical sense, it denotes the
place where the action took place. The locative case markers in
Hindi, mai and par, are also used with expressions other than place
and time.
E.g. 1. (We had met in Delhi)
2. (The book was kept on the table)
3. (They were discussing the state of the country)
Here the bold words in 1) and 2) show Adhikaran of place and time
respectively, while in 3) the bold words denote Vishayadhikaran. The
Vibhakti symbol mai, or a Vibhakti compound, is used for comparison
between objects, e.g. (Who sings well out of Suman and Sudip?), (Out
of the three of you, the first one is better).
In foreign usage of Vibhakti markers, se and ko are also used to
express Adhikaran Karaka of time.
E.g. 1) (I had not been to the office for the last four days)
2) (Santosh will return on Sunday)
Cl