Hot News. Reporters: Hossein Kamyar, Asef Poormasoomi. Supervisor: Dr. Mohsen Kahani.

Post on 15-Jan-2016


Tehran University

Database Research Group

Natural Language and Text Processing Group

Database Research Group

http://ece.ut.ac.ir/dbrg

Members:

Faculty Staff: 8

Students: 9

Alumni: 17

Dr. Caro Lucas

Dr. Behzad Moshiri

Dr. Rohani Rankouhi

Database Research Group

Research Project: Modernization Of Systems

Information Retrieval

Data Mining

Data Management

Industrial Project

Related Courses:

1. Introduction to Database Systems

2. Advanced Database Systems

3. Special Topics in Database Systems

4. Database Laboratory

5. Data Mining

6. Information Retrieval

7. Natural Language Processing

Persian Corpus

Hamshahri Corpus

Version 1 of the Hamshahri collection is officially maintained and distributed by the CLEF organizers. It was used in CLEF2008 and CLEF2009 and has 100 queries.

Version 2 of the Hamshahri collection was prepared in 1388 (Iranian calendar; 2009-2010) with the UTIRE system in the Database Research Group of the University of Tehran, following the TREC standard.

Bijankhan Corpus

The Bijankhan corpus is a tagged corpus suitable for natural language processing research on the Persian (Farsi) language. The collection was gathered from daily news and common texts. All documents are categorized by subject, such as politics, culture, and so on; in total there are 4300 different subjects. The Bijankhan collection contains about 2.6 million manually tagged words, with a tag set of 40 Persian POS tags.

dotIR Web Test Collection

This collection was built by crawling the .ir web domain and contains one million documents. Using the in-house UTIRE software, 25 users then created 50 queries. These queries were used to search the collection, and the retrieved pages, 18424 documents in total (369 documents per query on average), were judged by the same 25 users. In this way the documents relevant to each query were identified.

In addition, to support the study and comparison of ranking algorithms, a parallel effort extracted 56 features from the documents retrieved for each query, following the LETOR standard (introduced by Microsoft Research Asia). Researchers can use the (feature value, relevance) vectors to compare their own proposed ranking algorithms, or to train and tune them.

This project was supported by the Iran Telecommunication Research Center and the Database Laboratory of the University of Tehran.
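As a rough illustration of how such (feature value, relevance) vectors could be read, the sketch below parses one line in the plain-text layout used by LETOR releases: a relevance label, a `qid:` field, then `index:value` pairs and an optional `#` comment. That the dotIR feature files use exactly this layout is an assumption, and the sample line is hypothetical, not taken from the collection.

```python
from typing import Dict, NamedTuple


class LetorRow(NamedTuple):
    relevance: int              # judged relevance label for the document
    query_id: str               # identifier of the query the document was retrieved for
    features: Dict[int, float]  # feature index -> feature value


def parse_letor_line(line: str) -> LetorRow:
    """Parse one line of the form '<label> qid:<id> <index>:<value> ... # comment'."""
    # Drop the trailing comment, then split the rest on whitespace.
    body = line.split("#", 1)[0].split()
    if len(body) < 2 or not body[1].startswith("qid:"):
        raise ValueError(f"not a LETOR-style line: {line!r}")
    relevance = int(body[0])
    query_id = body[1][len("qid:"):]
    features: Dict[int, float] = {}
    for token in body[2:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return LetorRow(relevance, query_id, features)


# Hypothetical sample line (only 3 of the 56 features shown).
row = parse_letor_line("1 qid:7 1:0.25 2:0.0 56:0.9 # doc-42")
print(row.relevance, row.query_id, row.features)
```

Grouping the parsed rows by `query_id` gives the per-query lists that a learning-to-rank algorithm would be trained or evaluated on.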

Natural Language and Text Processing Group

Members: 10

Heshaam Faili

[Assistant Professor, Ph.D. Artificial Intelligence from Sharif University of Technology]

Research Project:

More than 23 papers

Industrial Project

• Detects and corrects typographical, grammatical, and semantic errors

• Can be installed in the widely used editor Word

• Can learn and improve its performance automatically

• Accurate and efficient

• Free

Persian Corpus

1. TEP: Tehran English-Persian Parallel Corpus

First free English-Persian corpus

4 million tokens on each side

Sentence Aligned

2. TMC: Tehran Monolingual Corpus

Largest freely available monolingual corpus for Persian language

Tokenized

Suitable for Language Modeling

3. Mutual Information

http://ece.ut.ac.ir/nlp/resources.html

Related Courses:

Introduction to Natural Language Processing, Dr. Heshaam Faili

Advanced Database Systems

Shahid Beheshti University

The Natural Language Processing research laboratory was founded by Dr. Mehrnoush Shamsfard at the beginning of 2006 in the Computer Engineering Department of Shahid Beheshti University.

More than 25 members. More than 92 papers.

http://nlp.sbu.ac.ir/

Research Project

A. Developing Linguistic resources

Developing Semantic annotated corpus

Developing chunked corpus

Developing parallel corpus

Developing Persian Verbs database

Semi-automatic Lexicon Acquisition 

Start : 2006

Researchers : Maliheh Monshizadeh, Elham Fekri

Research Project

B. Fundamental Persian text processing tools

Standard Text Preparation for Persian

Stemmer /Morphological analyzer / lemmatizer

Tokenizer

POS Tagger

Spell checker

Chunker

Syntax parser

Persian Named Entity Recognition - SBUNER

Persian Anaphora resolution

Semantic Role Labelling

Start : 2006

Researchers : Samira Noferesti, Rana Forsati, Pooneh Mortazavi, Hoda Sadat Jafari

Research Project

C. NLP Applications

Machine translation - PenTrans project

English to Persian Translation System

Persian to English Translation System

Machine translation evaluation toolkit

Persian Text summarization - PARSUMIST

Question Answering

Persian

English - SBUQA

Information Extraction - Mersad

Text understanding

Conversion between Persian sentences and first order logic

Text generation

Start : 2006

Researchers : Chakaveh Saedi, Yasaman Motazedi, Mostafa Nazari

Research Project

D. Ontology engineering

Ontology development

Development of CMMI-ACQ ontology

Collaborative development of ontology of computer science and engineering (COMON)

Fuzzy ontologies

Ontology learning

Ontology learning from text

Ontology learning from web

Relation extraction

Ontology mapping

Evolutionary ontology matching

A Linguistic-Structural Approach to Bilingual Ontology Mapping

Ontology population and instantiation

Start : 2006

Researchers : Aynaz Taheri, Hakimeh Fadaei, Tara Akhavan, Rahim Dehkharghani, Valeh Montaghami, Bahareh Sarrafzadeh, Amir Sharifloo, Rana Forsati

Research Project

E. Semantic Web

Semantic annotation of documents

Converting web documents into semantic web resources   

Semantic search   

Semantic web service discovery and composition

Start : 2006

Researchers : Bahareh Sarrafzadeh, Hoda Mirzaie, Maryam Haghollahi, Homan Farrokhzad

Research Project

F. Hybrids

Application of fuzzy ontologies in qualitative reasoning

E-learning

Ontology-based Content Rearrangement for Intelligent Tutoring Systems - OCRITS

Intelligent Content Management

Start : 2006

Researchers : Hamzeh Motahari, Marzieh Shariati

Courseware

Ontology Engineering

Natural Language Processing

Semantic Web

Advanced Natural Language Processing, Fall 2005, by Regina Barzilay and Michael Collins

Columbia University, MIT

Tools

FarsNet: The first Persian WordNet

STeP-1: Standard Text Preparation for Persian

Tokenizer

Stemmer

POS tagger

Spell checker

Sharif University

Natural Language Processing

Web Intelligence Laboratory

Natural Language Processing

Dr. Ghasem Sani

Dr. Heshaam Faili

Since 2003 after three inactivity

Eliza

POS Tagger

Unsupervised Natural Grammar Induction

Web Intelligence Laboratory

Supervisor: Dr. Abolhasani, with 28 members

Advanced Research:

Semantic Search Engines

Semantic Web Services

Semantic Web for Pervasive Computing

Annotation

Semantic Grids

Social Network Analysis

Ontology Alignment and Learning

Web Clustering

Business Intelligence

New Research:

Composite Web Service Execution Framework

Tracking news to find hot topics

Semantic Programming

Trust model in Semantic Web

New models for recommender systems

Using the web to create a lecture for a subject

A Farsi framework for Information Retrieval

A semantic-based framework for business intelligence applications

Science & Technology University

Unknown laboratory, but with an online POS tagger

In collaboration with the Prosody (عروض) project, supported by the Supreme Council of Information and Communication Technology (شورای عالی اطلاع رسانی)

http://persianp.ir/index.php?option=com_wrapper&view=wrapper&Itemid=7

http://www.prosody.ir

Conferences

The Cross-Language Evaluation Forum (CLEF)

(i) developing an infrastructure for the testing, tuning and evaluation of information retrieval systems operating on European languages in both monolingual and cross-language contexts

(ii) creating test suites of reusable data which can be employed by system developers for benchmarking purposes

CLEF conferences have been held since 2000.

CLEF2011 will be hosted by the University of Amsterdam.

Computational Approaches to Arabic Script-based Languages (CAASL)

CAASL2011 will be held in Geneva.

Asr Gooyesh Pardaz (عصر گویش پرداز) Corporation

Extracting n-gram statistics for the Persian language

Extracting a grammar for the Persian language

Building a lexicon for the Persian language

Extracting frequently used Persian words by topic

Projects under research:

Probabilistic unigram, bigram, trigram, and four-gram word models for Persian and English

GPSG grammar rules for the Persian language

Probabilistic grammar

Suitable parsers

Language model

Word clustering methods
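The probabilistic word n-gram models listed above (unigram through four-gram) can be illustrated with a minimal sketch. It uses plain maximum-likelihood estimates, P(w_n | w_1 .. w_{n-1}) = count(w_1 .. w_n) / count of n-grams sharing that history, with no smoothing. The function names and the toy English sentence are hypothetical; the slides do not describe the company's actual method.

```python
from collections import Counter
from typing import Dict, List, Tuple


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_probabilities(tokens: List[str], n: int) -> Dict[Tuple[str, ...], float]:
    """Maximum-likelihood estimate of P(last word | preceding n-1 words)."""
    grams = ngram_counts(tokens, n)
    # Total count of n-grams that start with each (n-1)-word history.
    history: Counter = Counter()
    for gram, count in grams.items():
        history[gram[:-1]] += count
    return {gram: count / history[gram[:-1]] for gram, count in grams.items()}


# Toy corpus for illustration only.
tokens = "the cat sat on the mat".split()
unigram = ngram_probabilities(tokens, 1)
bigram = ngram_probabilities(tokens, 2)
print(unigram[("the",)], bigram[("the", "cat")])
```

On a real corpus, unseen n-grams would get probability zero under this estimate, so a practical model would add smoothing or back-off.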
