Ontology-based information retrieval of scientific information Natalia V. Loukachevitch Laboratory of Information Resources Analysis Research Computing Center of Moscow State University (MGU NIVC)

Jan 02, 2016


Audra Gordon
Transcript
Page 1:

Ontology-based information retrieval of scientific information

Natalia V. Loukachevitch

Laboratory of Information Resources Analysis

Research Computing Center of Moscow State University (MGU NIVC)

Page 2:

Thematic Search of Scientific Information

• Knowledge-based (ontology-based) search

• Use of synonyms

• Automatic query expansion

• Automatic analysis of query results

• Help in interactive search
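The use of synonyms and automatic query expansion listed above can be illustrated with a minimal sketch. The toy thesaurus, its concept names, and the function `expand_query` are hypothetical and are not taken from the system described in these slides; a real thesaurus would also need to handle multiword terms and ambiguous expressions.

```python
# Hypothetical toy thesaurus: concept -> synonymous terms (illustrative data only).
THESAURUS = {
    "TAX": {"tax", "levy", "duty"},
    "LAW": {"law", "statute", "act"},
}

# Reverse index: term -> concepts that the term may express.
TERM_TO_CONCEPTS = {}
for concept, terms in THESAURUS.items():
    for term in terms:
        TERM_TO_CONCEPTS.setdefault(term, set()).add(concept)


def expand_query(query):
    """Return, for each query word, the word followed by its thesaurus synonyms."""
    expanded = []
    for word in query.lower().split():
        candidates = set()
        for concept in TERM_TO_CONCEPTS.get(word, ()):
            candidates |= THESAURUS[concept]
        candidates.discard(word)
        # Original word first, then its synonyms in sorted order.
        expanded.append([word] + sorted(candidates))
    return expanded
```

Each inner list can be treated as an OR-group by a search engine, so a document mentioning "levy" can still match a query about "tax".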

Page 3:

[Figure: Sociopolitical domain, with an axis labeled "Levels of hierarchy" and the subdomains Law, Accounting, Taxes, Banking]

Bilingual Sociopolitical Thesaurus

The thesaurus development is based on three methodologies:

• methods for constructing information-retrieval thesauri (information-retrieval context, analysis of terminology, terminology-based concepts, a small set of relation types)

• development of wordnets for various languages (word-based concepts, detailed sets of synonyms, description of ambiguous text expressions)

• ontology and formal ontology research (strictness of relation descriptions, necessity of many-step inference)

(33,000 concepts, 80,000 Russian terms, 85,000 English terms)
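As a rough illustration of how a bilingual thesaurus with a small set of relation types might support many-step inference, consider the sketch below. The `Concept` class, the single `broader` relation, the helper functions, and the sample data are all assumptions made for illustration; the slides do not specify the actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical concept record with terms in two languages."""
    name: str
    terms_ru: set = field(default_factory=set)
    terms_en: set = field(default_factory=set)
    broader: set = field(default_factory=set)  # names of directly broader concepts


def all_broader(concepts, name):
    """Follow 'broader' links transitively (a simple form of many-step inference)."""
    seen = set()
    stack = list(concepts[name].broader)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(concepts[current].broader)
    return seen


def translate(concepts, term, source="ru"):
    """Map a term to the other language's terms through shared concepts."""
    result = set()
    for c in concepts.values():
        src, dst = (c.terms_ru, c.terms_en) if source == "ru" else (c.terms_en, c.terms_ru)
        if term in src:
            result |= dst
    return result


# Illustrative data only.
CONCEPTS = {
    "FINANCES": Concept("FINANCES", {"финансы"}, {"finances"}),
    "TAX": Concept("TAX", {"налог"}, {"tax"}, {"FINANCES"}),
    "VAT": Concept("VAT", {"НДС"}, {"VAT", "value-added tax"}, {"TAX"}),
}
```

Because terms of both languages attach to the same concept, the same structure serves synonym lookup within one language and term mapping across languages.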

Page 4:

Socio-Political Domain vs. General Lexicon and Specific Lexicons

[Figure labels: General Lexicon; Specific Lexicon; Intermediate Zone; Information Security; Aviation Ontology; Cultural Heritage; Ontology on Natural Sciences and Technology (30,000 concepts; 70,000 terms)]

Page 5:

Thematic Structure

• tax; taxation system; tax payer; finances; economy; tax legislation; VAT

• legislation; law; draft law; Taxation Code

• deputy minister; Ministry of Finance; finances; reform; tax reform

• population

• budget, estimate; finances; economy; document

• government; state power; Minister of Finance

• State Duma; state power; state
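One plausible way to obtain groups like those above is to cluster the concepts found in a text by their thesaurus relations. The sketch below is not the author's algorithm: it simply takes connected components over a hypothetical symmetric `related` mapping and picks the most frequent concept as the main concept of each node.

```python
from collections import Counter


def thematic_nodes(text_concepts, related):
    """Group concepts mentioned in a text into nodes of mutually related concepts.

    text_concepts: concept mentions in text order (repeats allowed).
    related: symmetric mapping concept -> set of related concepts (assumed input).
    Returns (main_concept, members) pairs, most frequently mentioned node first.
    """
    counts = Counter(text_concepts)
    unvisited = set(counts)
    nodes = []
    while unvisited:
        start = unvisited.pop()
        component, stack = {start}, [start]
        while stack:
            current = stack.pop()
            for neighbour in related.get(current, ()):
                if neighbour in unvisited:
                    unvisited.remove(neighbour)
                    component.add(neighbour)
                    stack.append(neighbour)
        # The most frequent concept represents the node; ties break by name.
        main = max(component, key=lambda c: (counts[c], c))
        nodes.append((main, sorted(component)))
    nodes.sort(key=lambda n: (-sum(counts[c] for c in n[1]), n[0]))
    return nodes


# Illustrative input only.
MENTIONS = ["tax", "tax", "tax legislation", "VAT", "State Duma", "state power", "tax"]
RELATED = {
    "tax": {"tax legislation", "VAT"},
    "tax legislation": {"tax"},
    "VAT": {"tax"},
    "State Duma": {"state power"},
    "state power": {"State Duma"},
}
```

Calling `thematic_nodes(MENTIONS, RELATED)` yields two nodes, with the tax-related node ranked first because its concepts are mentioned most often.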

Page 6:

Thematic representation of a text

[Figure: thematic nodes in the text, showing Thematic Node i and Thematic Node j]

Page 7:

University Information System RUSSIA (http://www.cir.ru, http://uisrussia.msu.ru)

- Database of full-text documents (1.5 million): legal acts, newspaper articles, scientific reports

- Database “Statistics of the Russian Federation” (socio-economic statistics, demographic statistics, agrarian statistics, urban statistics)

- Database “Budget System of the Russian Federation” (www.budgetrf.ru)

Page 8:

Visualisation of Data in Dynamic Tables and Maps

Page 9:

Unified Technology Platform (Constructor)

[Figure labels: Converters; Processing; Interfaces; Services; Russian University Social Sciences Information and Analytical Consortium (www.cir.ru); www.echr-base.ru; Database "Statistics of Russia"; www.budgetrf.ru]

Page 10:

Cross-Language Information Retrieval

Page 11:

Applications of technology

• Concept-based information retrieval (monolingual, bilingual)

• Information retrieval systems combining word-based and concept-based search

• Concept-based automatic text categorization

• Automatic Question-Answering

• Automatic Text Summarization
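Combining word-based and concept-based search can be sketched as a weighted sum of two relevance scores per document. Linear interpolation is a common choice, but the slides do not say how the two kinds of evidence are merged, so `combined_score`, `rank`, and the `weight` parameter are assumptions for illustration.

```python
def combined_score(word_score, concept_score, weight=0.5):
    """Interpolate between word-based and concept-based relevance scores."""
    return weight * word_score + (1.0 - weight) * concept_score


def rank(docs, weight=0.5):
    """Rank document ids by combined score, best first.

    docs: mapping doc_id -> (word_score, concept_score), both assumed to lie in [0, 1].
    """
    return sorted(
        docs,
        key=lambda doc_id: combined_score(docs[doc_id][0], docs[doc_id][1], weight),
        reverse=True,
    )


# Illustrative scores only.
DOCS = {"a": (0.9, 0.1), "b": (0.4, 0.8)}
```

With `weight=1.0` the ranking is purely word-based, with `weight=0.0` it is purely concept-based, and intermediate values trade one kind of evidence against the other.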

Page 12:

Main Projects

• State Duma of RF (1999 – …)

• Central Election Commission of RF (1997 – …)

• Legal Company “Garant” (2002 – …)

• Ministry of Education (2005–2006)

• Accounting Chamber of RF (2003 – …)

• Central Bank of RF (2006 – …)

Grants:

– MacArthur Foundation (1994, 1995, 2004 – …)

– Ford Foundation (2002, 2003)

– Russian Foundation for Basic Research (9)

– Russian Foundation for Humanities (5)

– Eurasia Foundation (2002, 2003)

Page 13:

Participation in International Forums

• Participation in the Text REtrieval Conference (TREC), organized by NIST and DARPA (TREC-6, TREC-8)

• Participation in the Summarization Conference SUMMAC, organized by NIST and DARPA (1st place)

• Cross-Language Evaluation Forum CLEF (DELOS program)

– participation in the Steering Committee

– provision of Russian collections for evaluation purposes

– domain-specific information retrieval

• Organizers of the Russian Information Retrieval Evaluation Seminar ROMIP (www.romip.ru/en/)