An Approach to the Automatic Extraction of Complex Predicates in Bengali, by MEGHADITYA ROY CHAUDHURY (BCSE-III), Jadavpur University

Complex predicate meghaditya

May 25, 2015


Seminar presentations as part of the UGC Infrastructure Grants Project on Complex Predicates.
Transcript
Page 1: Complex predicate meghaditya

An Approach to the Automatic Extraction of Complex Predicates in Bengali

by MEGHADITYA ROY CHAUDHURY

(BCSE-III) Jadavpur University

Page 2: Complex predicate meghaditya

What are Complex Predicates?

Complex Predicates are defined as predicates composed of more than one grammatical element (either morphemes or words), each of which contributes a non-trivial part of the information of the complex predicate (Alex Alsina, 1996).

In South Asian languages, Complex Predicates contain (verb + verb) or (noun/adjective + verb) combinations (Hook, 1974).

Page 3: Complex predicate meghaditya

Identifying Complex Predicates in Bengali

Bengali is less well served by computational tools than English, owing to its rich morphology.

Because identifying Complex Predicates requires knowledge of morphology, extracting them automatically is a challenge.

Page 4: Complex predicate meghaditya

Benefits of Identification of Complex Predicates

Detection and interpretation of complex predicates are important for tasks such as machine translation, information retrieval, and summarization.

Even a mere listing of complex predicates constitutes a valuable linguistic resource for lexicographers, wordnet designers, and other NLP system designers.

Page 5: Complex predicate meghaditya

Approach to the Identification of Complex Predicates

A Rule-Based Approach.

In this project, I follow an algorithm for the automatic extraction of Complex Predicates from an untagged corpus using only a morphological analyzer and a root lexicon.

Page 6: Complex predicate meghaditya

Approach to the Extraction of Complex Predicates in Bengali Language

Complex Predicates in Bengali are of two types: Compound verbs and Conjunct verbs.

Compound Verbs: Verb + Light Verb
Conjunct Verbs: Noun/Adj + Verb

The second verb is called the Light Verb.
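The two patterns above can be written down as a small rule. The following is only a minimal sketch, not the project's actual code: the tag names `VERB`, `NOUN`, and `ADJ`, the function name `classify_pair`, and the idea of passing the light-verb roots in as a set are all assumptions made for illustration.

```python
from typing import Optional, Set

# Assumed coarse POS tags; a real tagger may use a different tag set.
NOMINAL_TAGS = {"NOUN", "ADJ"}


def classify_pair(first_pos: str, second_pos: str, second_root: str,
                  light_verbs: Set[str]) -> Optional[str]:
    """Classify two adjacent words by the slide's two surface patterns."""
    if second_pos != "VERB":
        return None  # both patterns require a verb in second position
    if first_pos == "VERB" and second_root in light_verbs:
        return "compound"  # Verb + Light Verb
    if first_pos in NOMINAL_TAGS:
        return "conjunct"  # Noun/Adj + Verb
    return None
```

A pattern match of this kind only proposes candidates: an ordinary noun followed by an ordinary verb also fits the conjunct pattern, so further filtering would be needed in practice.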

Page 7: Complex predicate meghaditya

16 Light Verbs in Bengali

• aSa ‘come’
• dãRa ‘stand’
• rakha ‘keep’
• ana ‘bring’
• deoya ‘give’
• pOra ‘fall’
• paTha ‘send’
• bERano ‘roam’
• neoya ‘take’
• tola ‘lift’
• bOSa ‘sit’
• oTha ‘rise’
• jaoya ‘go’
• chaRa ‘leave’
• phEla ‘drop’
• mOra ‘die’
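A closed list like this lends itself to a simple lookup table. The sketch below stores the sixteen roots and glosses from the slide; the names `LIGHT_VERB_GLOSSES`, `LIGHT_VERBS`, and `is_light_verb` are illustrative and do not come from the presentation.

```python
# Roots and glosses copied from the slide, in the same transliteration.
LIGHT_VERB_GLOSSES = {
    "aSa": "come", "dãRa": "stand", "rakha": "keep", "ana": "bring",
    "deoya": "give", "pOra": "fall", "paTha": "send", "bERano": "roam",
    "neoya": "take", "tola": "lift", "bOSa": "sit", "oTha": "rise",
    "jaoya": "go", "chaRa": "leave", "phEla": "drop", "mOra": "die",
}

# An immutable set gives constant-time membership tests.
LIGHT_VERBS = frozenset(LIGHT_VERB_GLOSSES)


def is_light_verb(root: str) -> bool:
    """Return True if the root form is one of the listed light verbs."""
    return root in LIGHT_VERBS
```

The lookup is on root forms, so inflected surface words would first have to be reduced to their roots by the morphological analyzer.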

Page 8: Complex predicate meghaditya

Bengali Shallow Parser

The analysis begins at the morphological level and proceeds through the POS tagger and the chunker, accumulating the results of each.

The final output combines the results of all these levels and shows them in a single representation (called Shakti Standard Format).
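To make use of such output, the token, POS tag, and root have to be read back out of each line. The sketch below assumes a simplified layout: four tab-separated columns (address, token, category, feature structure), chunk boundaries marked by `((` and `))`, and the root as the first comma-separated value of an `af='...'` attribute. The sample text is made up for illustration and is not real parser output; the actual parser's format may differ in detail.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    word: str
    pos: str
    root: str


# Assumed: the feature structure carries an attribute like af='root,cat,...'.
AF_PATTERN = re.compile(r"af='([^']*)'")


def parse_ssf_tokens(text: str) -> List[Token]:
    """Collect word-level tokens, skipping chunk brackets and short lines."""
    tokens = []
    for line in text.splitlines():
        fields = line.strip().split("\t")
        if len(fields) < 4 or fields[1] in {"((", "))"}:
            continue
        match = AF_PATTERN.search(fields[3])
        # Fall back to the surface word if no root can be found.
        root = match.group(1).split(",")[0] if match else fields[1]
        tokens.append(Token(fields[1], fields[2], root))
    return tokens


# Illustrative sample only; the values are invented.
SAMPLE = "\n".join([
    "1\t((\tVGF\t<fs af='kOra,v,,,,,,'>",
    "1.1\tkore\tVM\t<fs af='kOra,v,,,,,,'>",
    "1.2\tphello\tVAUX\t<fs af='phEla,v,,,,,,'>",
    "\t))",
])
```

Calling `parse_ssf_tokens(SAMPLE)` yields two tokens whose roots are `kOra` and `phEla`; the chunk-opening and chunk-closing lines are skipped.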

Page 9: Complex predicate meghaditya

The Console Output of the Bengali Shallow Parser

Page 10: Complex predicate meghaditya

Functions That Work in the Background

Load_resource()

morph_file_creating()

Find_complex_predicate()

prepareOutput()

deleteFile()
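The slide gives only the function names, so how they fit together is not stated. The sketch below is one plausible ordering under stated assumptions: the names are adapted to Python style, every function body is hypothetical, the intermediate file format (one `root<TAB>POS` pair per line) is invented, and the finder is reduced to the Verb + Light Verb pattern to keep the example short.

```python
import os
import tempfile


def load_resource():
    """Hypothetical: return the light-verb lexicon (a small subset here)."""
    return {"phEla", "jaoya", "deoya"}


def morph_file_creating(analysed_tokens):
    """Hypothetical: write (root, POS) pairs to a temporary file."""
    fd, path = tempfile.mkstemp(suffix=".morph", text=True)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        for root, pos in analysed_tokens:
            handle.write(f"{root}\t{pos}\n")
    return path


def find_complex_predicate(path, light_verbs):
    """Hypothetical: scan adjacent rows for a verb followed by a light verb."""
    with open(path, encoding="utf-8") as handle:
        rows = [line.rstrip("\n").split("\t") for line in handle if line.strip()]
    found = []
    for (root1, pos1), (root2, pos2) in zip(rows, rows[1:]):
        if pos1 == "VERB" and pos2 == "VERB" and root2 in light_verbs:
            found.append((root1, root2))
    return found


def prepare_output(found):
    """Hypothetical: format each candidate as a readable line."""
    return [f"{first} + {second}" for first, second in found]


def delete_file(path):
    """Hypothetical: remove the intermediate file."""
    os.remove(path)


def run(analysed_tokens):
    light_verbs = load_resource()
    path = morph_file_creating(analysed_tokens)
    try:
        return prepare_output(find_complex_predicate(path, light_verbs))
    finally:
        delete_file(path)  # clean up even if an earlier step fails
```

Wrapping the middle steps in `try`/`finally` is a design choice of this sketch: it ensures the temporary file is removed whether or not extraction succeeds.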

Page 11: Complex predicate meghaditya

Sample Run: Input File

Page 12: Complex predicate meghaditya

Sample Run: Execution Begins

Page 13: Complex predicate meghaditya

Sample Run: Execution Ends

Page 14: Complex predicate meghaditya

Sample Run: Output

Page 15: Complex predicate meghaditya

Conclusion

The algorithm depends heavily on the Bengali Shallow Parser, and hence suffers from errors that have crept into the parser tool. This can be improved by reducing that dependence and developing a more self-sufficient algorithm.

This definitely calls for a large amount of work in the future.