Constraint based Dependency Telugu Parser Guided by - Dr.Rajeev Sangal Dr.Dipti Misra Samar Hussain Team members - Phani Chaitanya Ravi kiran

Apr 02, 2015

Constraint based Dependency Telugu Parser

Guided by - Dr. Rajeev Sangal, Dr. Dipti Misra, Samar Hussain

Team members - Phani Chaitanya, Ravi kiran

Overview

• Motivation
• A word about the language
• Overview of constraint based parser
• Analysis of special cases
  – Genitives
  – Copula
  – “ani” construction
  – Conjuncts
• Future work

Motivation

– We set out to build a question answering system in Telugu, mainly for the medical and tourism domains, that could help native Telugu speakers (as a preliminary diagnosis tool and as a travel guide). Such a system needs a parser to make things easier.

A word about the language

• Telugu is a South Asian language
• Features
  – Morphologically rich
  – Free word order
  – Agglutinative
• Challenges
  – No treebank
  – No parser
  – No WordNet

Overview of constraint based parser

Telugu : rAmudu iMtiki vaccAka paMdu ni wiMtAdu

Gloss : Ram home after_coming fruit eats

English : Ram eats a fruit after coming home.

Overview of constraint based parser

1    ((       NP
1.1  rAmudu   NN    <af=rAma,n,,,,0,,adj_vAdu,>
     ))
2    ((       NP
2.1  iMtiki   NN    <af=illu,n,,s,,0,,ki,>
     ))
3    ((       VG
3.1  vaccAka  VRB   <af=vaccu,v,,,any,0,,ina_Aka,>
     ))
4    ((       NP
4.1  paMdu    NN    <af=paMdu,n,,s,,0,,0,>|<af=paMdu,n,,s,,0,,obl,>
4.2  ni       PREP  <af=ni,n,,s,,0,,0,>
     ))
5    ((       VG
5.1  wiMtAdu  VFM   <af=winu,v,,,3_p,0,,wA,>
5.2  .        SYM
     ))
))
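The feature column in the listing above can be read mechanically. The following is a minimal sketch, assuming that the first comma-separated field of an `af` value is the root, the second is the lexical category, and the eighth carries the vibhakti or TAM marker (this is what the examples on the slide suggest; the function name `parse_af` and the dictionary keys are hypothetical, not taken from the slides).

```python
import re

# Matches every <af=...> feature structure in a feature column.
AF_PATTERN = re.compile(r"<\s*af=([^>]*)>")


def parse_af(feature_column):
    """Return one dict per <af=...> alternative (alternatives are joined by '|')."""
    analyses = []
    for body in AF_PATTERN.findall(feature_column):
        fields = body.split(",")
        # Pad short entries so the positional lookups below never fail.
        fields += [""] * (9 - len(fields))
        analyses.append({
            "root": fields[0],       # assumed: root form
            "category": fields[1],   # assumed: lexical category (n, v, ...)
            "marker": fields[7],     # assumed: vibhakti or TAM marker
            "raw": fields,           # all fields, untouched
        })
    return analyses


if __name__ == "__main__":
    for analysis in parse_af("<af=paMdu,n,,s,,0,,0,>|<af=paMdu,n,,s,,0,,obl,>"):
        print(analysis["root"], analysis["category"], analysis["marker"])
```

Keeping the untouched field list under `raw` avoids committing to names for the positions the slides do not explain.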

Overview of constraint based parser

1    ((       NP    Source
1.1  rAmudu   NN    <af=rAma,n,,,,0,,adj_vAdu,>
     ))
2    ((       NP    Source
2.1  iMtiki   NN    <af=illu,n,,s,,0,,ki,>
     ))
3    ((       VG    Demand
3.1  vaccAka  VRB   <af=vaccu,v,,,any,0,,ina_Aka,>
     ))
4    ((       NP    Source
4.1  paMdu    NN    <af=paMdu,n,,s,,0,,0,>
4.2  ni       PREP  <af=ni,n,,s,,0,,0,>
     ))
5    ((       VG    Demand
5.1  wiMtAdu  VFM   <af=winu,v,,,3_p,0,,wA,>
5.2  .        SYM
     ))
))

Overview of constraint based parser

Frame for winu (eat; in basic form, so no transformation required)
------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln
------------------------------------------------------------
k1        | m         | 0        | n       | l    | c
k2        | m         | ni       | n       | l    | c
------------------------------------------------------------

Frame for vaccu (come)
------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln
------------------------------------------------------------
k1        | m         | 0        | n       | l    | c
k2        | m         | ki       | n       | l    | c
------------------------------------------------------------

Transformation chart [ina_Aka (after + -ing)]
---------------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln | op
---------------------------------------------------------------------
k1        | m         | 0        | n       | l    | c    | remove
vmod      | m         | -        | v       | r    | p    | insert
---------------------------------------------------------------------

[Figure: dependency tree. winu[wA] (eat) is the root, with rAmudu (Ram), paMdu (fruit) and vaccu[ina_Aka] (after coming) below it; iMtiki (house) and rAmudu appear below vaccu[ina_Aka]. Arc labels shown: k1, k2, vmod.]
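How a transformation chart rewrites a basic frame can be sketched as follows. This is a minimal illustration that assumes `remove` deletes the demand with the same arc label and `insert` appends a new demand; the `Demand` class and the `apply_chart` function are hypothetical names, and only the rows come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Demand:
    """One row of a demand frame, using the column names from the slide."""
    label: str
    necessity: str
    vibhakti: str
    lextype: str
    posn: str
    reln: str


# Basic frame for vaccu (come), as given on the slide.
VACCU_FRAME = [
    Demand("k1", "m", "0", "n", "l", "c"),
    Demand("k2", "m", "ki", "n", "l", "c"),
]

# Transformation chart for the TAM ina_Aka, as given on the slide.
INA_AKA_CHART = [
    ("remove", Demand("k1", "m", "0", "n", "l", "c")),
    ("insert", Demand("vmod", "m", "-", "v", "r", "p")),
]


def apply_chart(frame, chart):
    """Return a new frame with the chart's operations applied in order."""
    result = list(frame)
    for op, demand in chart:
        if op == "remove":
            # Assumption: a row is identified by its arc label.
            result = [d for d in result if d.label != demand.label]
        elif op == "insert":
            result.append(demand)
        else:
            raise ValueError(f"unknown operation: {op}")
    return result


if __name__ == "__main__":
    for row in apply_chart(VACCU_FRAME, INA_AKA_CHART):
        print(row.label, row.necessity, row.vibhakti, row.lextype, row.posn, row.reln)
```

Applied to the vaccu frame, the sketch leaves a k2 demand with vibhakti ki and adds the vmod demand, which is the frame the slides give for vaccAka after transformation.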

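Overview of constraint based parser

Frame for vaccAka (after transformation)
------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln
------------------------------------------------------------
k2        | m         | ki       | n       | l    | c
vmod      | m         | -        | v       | r    | p
------------------------------------------------------------

Frame for winu
------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln
------------------------------------------------------------
k1        | m         | 0        | n       | l    | c
k2        | m         | ni       | n       | l    | c
------------------------------------------------------------

[Figure: the sentence "rAmudu iMtiki vaccAka paMduni wiMtAdu" with its candidate arcs, each labelled by a variable and a relation: X1:k1, X2:k2, X3:k2, X4:vmod.]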

Overview of constraint based parser

C1 : For each mandatory karaka in the karaka chart of each demand group, there should be exactly one outgoing edge from the demand group labelled by that karaka.

C2 : For each optional or desirable karaka in the karaka chart of each demand group, there should be at most one outgoing edge from the demand group labelled by that karaka.

C3 : There should be exactly one incoming arc into each source group.

Equations formed by applying the above constraints:

C1 : X1 = 1, X2 = 1, X3 = 1, X4 = 1

C2 : No optional field found

C3 : X1 = 1, X2 = 1, X3 = 1, X4 = 1
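The three constraints can be checked mechanically over 0/1 values of the arc variables. The sketch below uses plain brute-force enumeration, which is a stand-in chosen for brevity and not necessarily how the authors solve the equations. The `Arc` type, the `solve` function, and the assignment of X2 to the iMtiki arc and X3 to the paMdu arc are assumptions made for illustration.

```python
from itertools import product
from typing import NamedTuple


class Arc(NamedTuple):
    var: str           # variable name, e.g. "X1"
    demand_group: str  # group whose frame lists the label
    label: str         # karaka or other relation label
    dependent: str     # source group that receives the arc


def count(chosen, **match):
    """Number of chosen arcs whose fields equal all given values."""
    return sum(all(getattr(a, k) == v for k, v in match.items()) for a in chosen)


def solve(arcs, mandatory, optional, sources):
    """Enumerate 0/1 assignments and keep those satisfying C1, C2 and C3."""
    solutions = []
    for bits in product((0, 1), repeat=len(arcs)):
        chosen = [a for a, b in zip(arcs, bits) if b]
        c1 = all(count(chosen, demand_group=g, label=l) == 1 for g, l in mandatory)
        c2 = all(count(chosen, demand_group=g, label=l) <= 1 for g, l in optional)
        c3 = all(count(chosen, dependent=s) == 1 for s in sources)
        if c1 and c2 and c3:
            solutions.append({a.var: b for a, b in zip(arcs, bits)})
    return solutions


# Candidate arcs for the example sentence. The vmod demand is a parent
# demand (reln = p), so vaccAka is both the demanding group and the dependent.
ARCS = [
    Arc("X1", "wiMtAdu", "k1", "rAmudu"),
    Arc("X2", "vaccAka", "k2", "iMtiki"),
    Arc("X3", "wiMtAdu", "k2", "paMdu ni"),
    Arc("X4", "vaccAka", "vmod", "vaccAka"),
]
MANDATORY = [("wiMtAdu", "k1"), ("wiMtAdu", "k2"), ("vaccAka", "k2"), ("vaccAka", "vmod")]
OPTIONAL = []  # the slide reports no optional fields
SOURCES = ["rAmudu", "iMtiki", "vaccAka", "paMdu ni"]

if __name__ == "__main__":
    print(solve(ARCS, MANDATORY, OPTIONAL, SOURCES))
```

Enumeration is exponential in the number of candidate arcs, so it only suits toy inputs like this one; it does make each constraint visible as a single line of code.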

Overview of constraint based parser

1    ((       NP    <af=rAma,n,,,,0,,adj_vAdu,/drel=k1:5/name=1>
1.1  rAmudu   NN    <af=rAma,n,,,,0,,adj_vAdu,>
     ))
2    ((       NP    <af=illu,n,,s,,0,,ki,/drel=k2:3/name=2>
2.1  iMtiki   NN    <af=illu,n,,s,,0,,ki,>
     ))
3    ((       VG    <af=vaccu,v,,,any,0,,ina_Aka,/drel=vmod:5/name=3>
3.1  vaccAka  VRB   <af=vaccu,v,,,any,0,,ina_Aka,>
     ))
4    ((       NP    <af=paMdu,n,,s,,0,,0,/drel=k2:5/name=4>
4.1  paMdu    NN    <af=paMdu,n,,s,,0,,0,>|<af=paMdu,n,,s,,0,,obl,>
4.2  ni       PREP  <af=ni,n,,s,,0,,0,>
     ))
5    ((       VG    <af=winu,v,,,3_p,0,,wA,/name=5>
5.1  wiMtAdu  VFM   <af=winu,v,,,3_p,0,,wA,>
5.2  .        SYM
     ))
))
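The `drel` and `name` attributes in the parser output above are enough to recover the dependency arcs. The following is a small sketch, assuming each chunk header carries its attributes inside one pair of angle brackets separated by `/`, as on the slide; the function name `read_arcs` is hypothetical.

```python
import re

# A chunk header looks like: 1 (( NP <af=.../drel=k1:5/name=1>
HEADER_RE = re.compile(r"^\d+\s+\(\(\s+\w+\s+<([^>]*)>")


def read_arcs(lines):
    """Return (dependent, label, head) triples from chunk header lines."""
    arcs = []
    for line in lines:
        match = HEADER_RE.match(line.strip())
        if not match:
            continue  # token lines such as "1.1 rAmudu NN ..." are skipped
        attrs = {}
        for part in match.group(1).split("/"):
            key, _, value = part.partition("=")
            attrs[key.strip()] = value.strip()
        if "drel" in attrs:  # the root chunk has a name but no drel
            label, _, head = attrs["drel"].partition(":")
            arcs.append((attrs.get("name", ""), label.strip(), head.strip()))
    return arcs


OUTPUT = """\
1 (( NP <af=rAma,n,,,,0,,adj_vAdu,/drel=k1:5/name=1>
1.1 rAmudu NN <af=rAma,n,,,,0,,adj_vAdu,>
))
2 (( NP <af=illu,n,,s,,0,,ki,/drel = k2:3/name=2>
3 (( VG <af=vaccu,v,,,any,0,,ina_Aka,/drel = vmod:5/name=3>
4 (( NP <af=paMdu,n,,s,,0,,0,/drel = k2:5/name=4>
5 (( VG <af=winu,v,,,3_p,0,,wA,/name = 5>
"""

if __name__ == "__main__":
    for dependent, label, head in read_arcs(OUTPUT.splitlines()):
        print(f"chunk {dependent} --{label}--> chunk {head}")
```

The parsing tolerates the spaces around `=` that appear in the original slide text.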

Analysis of special cases

• Genitives
• Copula
• “ani” construction
• Conjuncts

Genitives

• The genitive is the case that marks a noun as the possessor of another noun (e.g. his, her, its).
• Case 1 : the genitive marker is present
  – Telugu : rAmudi yoVkka puswakaM
  – Gloss : Ram 's book
  – When the marker is present the analysis is straightforward: the noun preceding “yoVkka” holds an r6 relation with the noun following “yoVkka”.
• Case 2 : the genitive marker is dropped
  – Telugu : rAmudi puswakaM
  – Gloss : Ram book
  – Here the suffix “udi” in “rAmudi” signals the genitive.

Genitives contd..

• Exceptions in the case where the genitive marker is dropped
  – Telugu : raGu puswakaM rAmudiki icCAdu
  – Gloss : Raghu book to_Ram gave
  – English (sense 1) : Raghu gave a book to Ram.
  – English (sense 2) : Raghu’s book was given to Ram.
• Nouns such as Raghu and Sita, which do not carry the masculine suffix, take no genitive marker in Telugu.
• So we output all possible parses for this case. The parses include:

[Figure: two dependency trees rooted at icCAdu. Parse 1: raGu (k1), puswakam (k2) and rAmudiki (k4) attach to icCAdu. Parse 2: puswakam (k2) and rAmudiki (k4) attach to icCAdu, and raGu attaches to puswakam (r6).]

Copula

• Examples: is, are, were, etc.
• The copula is generally dropped in Telugu. For example:
  – Telugu : rAmudu maMci bAludu
  – Gloss : Ram good boy
  – English : Ram is a good boy.
• We handle these cases by introducing a “NULL_VG”.

Frame for NULL_VG
------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln
------------------------------------------------------------
k1        | m         | 0        | n       | l    | c
k1s       | m         | 0        | n       | l    | c
------------------------------------------------------------
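One way to picture the NULL_VG step is as a pre-processing pass over the chunk sequence. This is only a sketch under stated assumptions: chunks are modelled as `(tag, text)` pairs, a sentence counts as verbless when it has no `VG` chunk, and the empty verb group is placed at the end of the sentence. None of these details are given on the slide, and `insert_null_vg` is a hypothetical name.

```python
def insert_null_vg(chunks):
    """Return a copy of the chunk list, with a NULL_VG appended if no VG exists."""
    if any(tag == "VG" for tag, _ in chunks):
        return list(chunks)
    # Assumption: the dropped copula is restored sentence-finally.
    return list(chunks) + [("NULL_VG", "NULL")]


if __name__ == "__main__":
    # Treating "maMci bAludu" as one NP chunk is also an assumption.
    copular = [("NP", "rAmudu"), ("NP", "maMci bAludu")]
    print(insert_null_vg(copular))
```

Once the NULL_VG chunk exists it can act as a demand group, and the frame above supplies its k1 and k1s demands.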

‘ani’ construction

• ‘ani’ in Telugu is sometimes similar to “that” in English.
• There are three different ways of using “ani”:

Used as a complementizer :
• Telugu : rAmudu paMdu wiMtAdu ani mohan ceVppAdu.
• Gloss : Ram fruit will_eat that Mohan said .
• English : Mohan said that Ram will eat a fruit.

Used as a verb :
• Telugu : mohan rAmudu paMdu wiMtAdu ani vellipoyAdu .
• English : Mohan left, saying that Ram eats a fruit.

Used to state a reason :
• Telugu : mohan rAmudu paMdu winnAdani vellipoyAdu.
• Gloss : Mohan Ram fruit had_eaten went .
• English : Mohan went because Ram had eaten the fruit.

“ani” construction contd …

So we created a demand frame for “ani”.

Frame for ani
------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln
------------------------------------------------------------
ccof      | m         | -        | v_fin   | l    | c
ccof      | m         | -        | v_fin   | r    | p
------------------------------------------------------------

Conjuncts

• In Telugu, conjuncts occur as suffixes (TAM of the verb), as DheergAs, and as lexical items such as “inkA”, “anduke”, “mariyu”, “kAni”, “aiwe” and “anwe”.

Suffixes : Here, applying the corresponding transformation chart of the verb is enough to handle the case.

Telugu : nenu iMtiki velwe nixrapowAnu.

Gloss : I home if_go will_sleep .

English : I will sleep if I go home.

Contd …

• Lexical items : Here we have a frame for each lexical entry, which does the corresponding job.

In case of “mariyu” :

Frame 1 :
------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln
------------------------------------------------------------
ccof      | m         | -        | v       | l    | c
ccof      | m         | -        | v       | r    | c
------------------------------------------------------------

Frame 2 :
------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln
------------------------------------------------------------
ccof      | m         | -        | n       | l    | c
ccof      | m         | -        | n       | r    | c
------------------------------------------------------------

Contd …

• DheergAs : Often the vowel at the end of a lexical item is elongated, so the conjunct information is implicit and no explicit lexical entry such as “mariyu” is needed.
  – Telugu : rAmudU siwA iMtiki vellAru.
  – Gloss : Ram (implicit conj) Sita home went .
  – English : Ram and Sita went home.

In such cases a NULL_CCP is introduced, which serves as an explicit conjunct lexical entry, and we have frames for the NULL_CCP similar to the ones in the previous slide.

Future work !!

• A thorough analysis of relative clauses.
• Analysis and handling of NULL verbs in complex constructions.
• And their implementation.
• Verb and TAM classification.

THANKS !!

Any Queries ??