Source: staffweb.sjp.ac.lk/sites/default/files/budditha/files/icter2011.pdf
Computational Model of Grammar for English to
Sinhala Machine Translation
By
Budditha Hettige
Department of Statistics and Computer Science,
University of Sri Jayewardenepura, Sri Lanka
&
Asoka S. Karunananda
Faculty of Information Technology,
University of Moratuwa, Sri Lanka
Overview
• Introduction
• Machine Translation
• Sinhala Language
• Computational Model of Grammar for Sinhala Language
• Design & Implementation
• Evaluation
• Conclusion & further work
Introduction
• Machine Translation
– Computer software that translates text or speech from one natural language to another
• Machine Translation offers a potential solution to the language barrier
• Many countries use Machine Translation to address their language barriers
– India
– Japan, etc.
Existing Approaches
• Human-assisted
• Rule-based
• Statistical
• Example-based
• Knowledge-based
• Hybrid
• Agent-based
NLP @ Sri Lanka
• UCSC
– Optical Character Recognizer
– Sinhala Corpus
– Machine translation (MT), etc.
• Other NLP Systems
– Several undergraduate research projects
• BEES
– Rule-based machine translation system built on the concept of “Varanageema” (conjugation)
Sinhala Language
• The Sinhala language has its own writing system, which is a descendant of the Brahmi script
• The Sinhala alphabet consists of 61 letters: 18 vowels, 41 consonants and 2 semi-consonants
• Parts of speech
– Noun
– Verb
– Indeclinable particles (නිපාත, උපසර්ග)
Sinhala Noun Morphology
• In Sinhala grammar, the noun category covers nouns, pronouns and adjectives
• Is inflected for
– Gender (Lingaya)
– Number (Wachana)
– Person (Purusha)
– Case (Vibhakthi)
– Definiteness
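
The five inflection categories listed above can be pictured as a feature bundle attached to each noun form. The following is a minimal sketch only: the class name `NounFeatures`, the enum names and the example values are assumptions made for illustration and do not come from the slides.

```python
from dataclasses import dataclass
from enum import Enum


class Number(Enum):
    """Wachana (number). The two values are illustrative assumptions."""
    SINGULAR = "singular"
    PLURAL = "plural"


class Definiteness(Enum):
    """Definiteness. The two values are illustrative assumptions."""
    DEFINITE = "definite"
    INDEFINITE = "indefinite"


@dataclass(frozen=True)
class NounFeatures:
    """Hypothetical feature bundle for one inflected noun form."""
    gender: str                  # Lingaya; kept as a free string here
    number: Number               # Wachana
    person: int                  # Purusha; 1, 2 or 3 in this sketch
    case: str                    # Vibhakthi; kept as a free string here
    definiteness: Definiteness


# Example bundle; the concrete values are placeholders, not data from the slides.
features = NounFeatures(
    gender="masculine",
    number=Number.SINGULAR,
    person=3,
    case="nominative",
    definiteness=Definiteness.DEFINITE,
)
```

Making the bundle immutable (`frozen=True`) lets it serve as a dictionary key, which is convenient if inflection rules are later stored in a lookup table.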
Word conjugation (නාම වරණැගිල්ල)
• More than 27 noun forms can be generated by inflecting a single root word
• More than a hundred rules govern the conjugation of a noun from a given base form (Prakurthi)
• 15 conjugation patterns (Gana) have been identified for generating Sinhala nouns
– Eath Ganaya (ඇත් ගණය)
– Wasu Ganaya (වසු ගණය)
– Tara Ganaya (තාර ගණය)
– etc.
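
One way to picture pattern-based conjugation is as a rule table keyed by conjugation pattern and grammatical features. The sketch below is a hedged illustration, not the authors' implementation: the function `conjugate`, the pattern names `gana_a` and `gana_b`, the case label `case_1` and the suffixes are all made-up placeholders, and real Sinhala rules may modify the base form rather than simply append a suffix.

```python
from typing import Dict, Tuple

# Key: (pattern, number, case). Value: suffix to attach to the base form.
# Every entry is a placeholder; no real Sinhala rule is encoded here.
RuleKey = Tuple[str, str, str]

RULES: Dict[RuleKey, str] = {
    ("gana_a", "singular", "case_1"): "-x1",
    ("gana_a", "plural", "case_1"): "-x2",
    ("gana_b", "singular", "case_1"): "-y1",
}


def conjugate(base: str, gana: str, number: str, case: str) -> str:
    """Return the inflected form for a base form under a given pattern.

    Raises KeyError when the table has no rule for the requested features.
    """
    key = (gana, number, case)
    if key not in RULES:
        raise KeyError(f"no rule for {key}")
    return base + RULES[key]


# The same base form yields different surface forms under different features.
singular_form = conjugate("BASE", "gana_a", "singular", "case_1")
plural_form = conjugate("BASE", "gana_a", "plural", "case_1")
```

Grouping nouns into a small number of patterns keeps the table compact: each new noun only needs its base form and its pattern, rather than a full list of inflected forms.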