Arabic Tokenization and Stemming
Imam Mohammad Ibn Saud Islamic University
College of Computing and Information Science
Computer Sciences Department
Prepared by: Al-Moammar, A., Al-Abdullah, H., and Al-Ajlan, N.
Supervised by: Dr. Amal Al-Saif
Hybrid Method
The hybrid method incorporates three different techniques for Arabic stemming.
The hybrid algorithm starts by constructing a root file containing more than 9,000 valid Arabic roots.
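A root file of this kind can be used to validate candidate stems. The following is a minimal sketch only, not the authors' implementation: the prefix and suffix lists, the three sample roots, and the function names `load_roots`, `candidate_stems`, and `find_root` are all hypothetical, and plain affix stripping stands in for whatever techniques the hybrid stemmer actually combines.

```python
# Hypothetical sketch: validate candidate stems against a set of known roots.
# The affix lists and the tiny root set are illustrative assumptions, not the
# contents of the authors' root file.

PREFIXES = ["ال", "و", "ب", "ل"]      # assumed sample prefixes
SUFFIXES = ["ات", "ون", "ين", "ة"]    # assumed sample suffixes


def load_roots(lines):
    """Build a set of roots from an iterable of lines (one root per line)."""
    return {line.strip() for line in lines if line.strip()}


def candidate_stems(word):
    """Yield the word itself and its prefix- and/or suffix-stripped variants."""
    heads = [word] + [word[len(p):] for p in PREFIXES if word.startswith(p)]
    for head in heads:
        yield head
        for suffix in SUFFIXES:
            if head.endswith(suffix):
                yield head[:-len(suffix)]


def find_root(word, roots):
    """Return the first candidate found in the root set, or None."""
    for candidate in candidate_stems(word):
        if candidate in roots:
            return candidate
    return None


# In practice the lines would come from the root file; a list stands in here.
roots = load_roots(["كتب", "درس", "علم"])
print(find_root("الدرس", roots))
```

Storing the roots in a set keeps each lookup at constant expected time, which matters little for three roots but is convenient for a file with thousands of entries.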
Results
The hybrid algorithm was found to outperform the other stemming algorithms.
The results show that using the hybrid stemmer improves the performance of some Arabic language processing tasks.
In Arabic text categorization, the average accuracies are 74.41% for the Khoja stemmer, 59.71% for light stemming, 48.17% for n-grams, and 82.33% for the hybrid stemmer.
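To make the size of the improvement explicit, the reported averages can be compared directly. This short sketch uses only the four figures quoted above; the dictionary keys and variable names are illustrative.

```python
# Average text-categorization accuracies (percent) as reported above.
accuracies = {
    "Khoja": 74.41,
    "light stemming": 59.71,
    "n-grams": 48.17,
    "hybrid": 82.33,
}

hybrid = accuracies["hybrid"]

# Gain of the hybrid stemmer over each other method, in percentage points.
gains = {
    name: round(hybrid - acc, 2)
    for name, acc in accuracies.items()
    if name != "hybrid"
}

for name, gain in gains.items():
    print(f"hybrid vs {name}: +{gain:.2f} points")
```

On these figures the hybrid stemmer leads the Khoja stemmer by 7.92 percentage points, light stemming by 22.62, and n-grams by 34.16.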