Artificial Intelligence Research Center

Post on 30-Dec-2015

Pereslavl-Zalessky, Russia

Program Systems Institute, RAS

Lines of research:

Knowledge-based Dynamic Systems
Computer Linguistics: Information Extraction, Information Retrieval, Text Categorization
Image Analysis of Data
Nested Petri Nets

Miracle PS

A program system of tools for designing intelligence systems

System Architecture

Control over docking of a space vehicle with the orbital station

Control System Model: docking parameters (restrictions); analytical description of control zones; ship conditions database; ship model; station model; a set of goals; a system of rules; planned trajectory.

Main control fields and boundaries between them

Main Goals: approaching; divergence; minimal-destruction contact with the station.

Subgoals: finding the station; approaching; hovering; flyby.

Interface

Research Prototype

Visualization Module

SIRIUS

Intelligent Meta-Search System

Sirius is a meta-search system with a multi-agent environment for distributed computation and a powerful linguistic module for text analysis.

Features of the Sirius system

Extension of standard keyword search mechanisms
Query input in natural language
Use of semantic text processing methods
Automatic inclusion of new information sources
Increased search accuracy
Use of parallel computation

Example of a search query

Query: “The President has arrived to Bruxelles”

The semantic relation DIR(X, Y) states that Y is the direction of movement of X (the role of X is «subject», the role of Y is «directiv»):

DIR(President, Bruxelles)
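
A relation such as DIR(President, Bruxelles) can be held as a small record and compared against relations found in candidate documents. The sketch below is illustrative only: the `Relation` class, the `matches` helper and the case-insensitive comparison are assumptions, not part of Sirius.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """Hypothetical container for a semantic relation such as DIR(X, Y)."""
    name: str     # relation label, e.g. "DIR"
    subject: str  # X, the entity that moves
    target: str   # Y, the direction of movement


def matches(query_rel: Relation, doc_rel: Relation) -> bool:
    """Assumed matching rule: same label and same arguments, ignoring case."""
    return (
        query_rel.name == doc_rel.name
        and query_rel.subject.lower() == doc_rel.subject.lower()
        and query_rel.target.lower() == doc_rel.target.lower()
    )


query = Relation("DIR", "President", "Bruxelles")
doc_relations = [
    Relation("DIR", "president", "bruxelles"),
    Relation("DIR", "Minister", "Paris"),
]
hits = [r for r in doc_relations if matches(query, r)]
```

Here `hits` contains only the first document relation, since it shares both arguments with the query.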

Calculation of relevance

Relevance is calculated from:

Semantic roles
Semantic connections
Keywords
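
The slides do not give the relevance formula. One plausible reading is a weighted combination of the three kinds of evidence. The sketch below assumes a simple overlap measure and arbitrary weights (0.5, 0.3, 0.2); both are hypothetical and chosen only for illustration.

```python
def overlap(query_items, doc_items):
    """Fraction of the query items that also occur in the document."""
    query_items, doc_items = set(query_items), set(doc_items)
    return len(query_items & doc_items) / len(query_items) if query_items else 0.0


def relevance(query, doc, weights=(0.5, 0.3, 0.2)):
    """Weighted sum over roles, relations and keywords (weights are assumed)."""
    w_roles, w_relations, w_keywords = weights
    return (
        w_roles * overlap(query["roles"], doc["roles"])
        + w_relations * overlap(query["relations"], doc["relations"])
        + w_keywords * overlap(query["keywords"], doc["keywords"])
    )


query = {
    "roles": {("president", "subject"), ("bruxelles", "directiv")},
    "relations": {("DIR", "president", "bruxelles")},
    "keywords": {"president", "arrive", "bruxelles"},
}
# doc_a carries the same semantic analysis as the query.
doc_a = dict(query)
# doc_b shares only two keywords with the query.
doc_b = {"roles": set(), "relations": set(), "keywords": {"president", "bruxelles"}}
```

Under these assumptions `doc_a` scores higher than `doc_b`, because a keyword-only match contributes only through the smallest weight.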

INEX: Tools for Information Extraction

Artificial Intelligence Research Centre
Program Systems Institute
Russian Academy of Sciences
152020 Pereslavl-Zalessky
Russia
+7 08535 98065
inex@epk.botik.ru

Information extraction

Objective: extract meaningful information of a pre-specified type from (typically large amounts of) texts for further analytical purposes

Output: data structures of a pre-specified format (filled scenario templates)
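
A scenario template can be pictured as a record with named slots, some of which stay empty after extraction. The `ArrivalTemplate` class and its slot names below are hypothetical; the actual INEX template format is not shown in the slides.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ArrivalTemplate:
    """Hypothetical scenario template for an 'arrival' event."""
    person: Optional[str] = None
    destination: Optional[str] = None
    date: Optional[str] = None

    def filled_slots(self) -> List[str]:
        """Names of the slots that extraction managed to fill."""
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]


template = ArrivalTemplate(person="President", destination="Bruxelles")
print(template.filled_slots())
```

The `date` slot stays `None` because the example sentence gives no date.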

Possible IE application scenarios:

inference of new information (knowledge acquisition)
query formulation and answering in human-computer systems
automatic generation of abstracts and summaries
visualization of document content, etc.

Named entity recognizer

identifies proper names
assigns semantic features to certain items

Information extraction rules

a domain knowledge representation formalism (scenario templates)

a set of patterns to identify template elements in a text (covering the many possible ways to talk about the target event elements)

An IE pattern includes:

a set of rules that define how to retrieve this pattern in a text

a set of constraints imposed on textual elements to fit into a particular slot of the target template
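
The two parts of a pattern, a retrieval rule plus per-slot constraints, can be sketched with a regular expression and a few predicate functions. Everything here is an assumption made for illustration: the regular expression, the toy gazetteer `KNOWN_PLACES` and the slot names do not come from INEX, which the slides describe as relying on linguistic analysis rather than plain regexes.

```python
import re

# Assumed retrieval rule: "<Person> (has) arrived in/to/at <Place>".
ARRIVAL_RULE = re.compile(
    r"(?P<person>[A-Z]\w+) (?:has )?arrived (?:in|to|at) (?P<destination>[A-Z]\w+)"
)

# Toy gazetteer used by the destination constraint (illustrative values).
KNOWN_PLACES = {"Bruxelles", "Paris", "Moscow"}

# Constraints a text fragment must satisfy to fill each slot.
CONSTRAINTS = {
    "person": lambda s: s[0].isupper(),
    "destination": lambda s: s in KNOWN_PLACES,
}


def apply_pattern(text):
    """Return the filled slots, or None if the rule or a constraint fails."""
    match = ARRIVAL_RULE.search(text)
    if match is None:
        return None
    slots = match.groupdict()
    if all(check(slots[name]) for name, check in CONSTRAINTS.items()):
        return slots
    return None


result = apply_pattern("The President has arrived to Bruxelles")
```

A sentence that fits the rule but names an unknown place is rejected by the `destination` constraint, which is the role constraints play in the description above.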

Coreference Resolver

recognizes different occurrences of the same entity in a text

Merging partial results

merging partially filled templates to produce a final, maximally filled template
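
A minimal sketch of the merging step, assuming templates are dictionaries whose unfilled slots hold `None` and that all partial templates describe the same event. The conflict policy (the first non-empty value wins) is an assumption; the slides do not say how INEX resolves conflicting fillers.

```python
def merge_templates(partials):
    """Merge partially filled templates into one maximally filled template."""
    merged = {}
    for partial in partials:
        for slot, value in partial.items():
            # Fill a slot only while it is still empty (first filler wins).
            if merged.get(slot) is None:
                merged[slot] = value
    return merged


partials = [
    {"person": "President", "destination": None},
    {"destination": "Bruxelles", "date": None},
    {"person": "Minister", "date": "Monday"},  # conflicting person is ignored
]
merged = merge_templates(partials)
```

A real system would first check, for example through coreference resolution, that the partial templates refer to the same event before merging them.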

Text categorization system

The goal of text categorization is to classify documents into a certain number of predefined categories, or classes. Each document may fall into one category, more than one, or none at all. When machine learning is used for text categorization, the goal is to train classifiers on a training set (a set of category-labeled documents).
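
The multi-label setting described above, where a document may receive zero, one or several categories, can be illustrated with a toy classifier. This is a deliberately simple term-overlap scheme with an assumed threshold, not the learning method used by the system; the training documents are invented for the example.

```python
from collections import Counter, defaultdict


def train(labeled_docs):
    """Build one term-frequency profile per category from (text, labels) pairs."""
    profiles = defaultdict(Counter)
    for text, labels in labeled_docs:
        for label in labels:
            profiles[label].update(text.lower().split())
    return profiles


def classify(text, profiles, threshold=0.5):
    """Return every category whose profile covers at least `threshold` of the tokens."""
    tokens = text.lower().split()
    if not tokens:
        return set()
    result = set()
    for label, profile in profiles.items():
        coverage = sum(1 for t in tokens if t in profile) / len(tokens)
        if coverage >= threshold:
            result.add(label)
    return result


training_set = [
    ("president visits bruxelles summit", {"politics"}),
    ("orbital station docking manoeuvre", {"space"}),
    ("president opens orbital station", {"politics", "space"}),
]
profiles = train(training_set)
```

Each category is decided independently, so a document can clear the threshold for several categories or for none.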

Features

Both one-word and multi-word terms are used for text categorization.

Extraction of multi-word terms is based on partial syntactic analysis of texts.

Conventional statistics-based term weighting is enhanced by taking into account different types of term occurrence in a document.
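
The slides do not say which occurrence types are distinguished. One common interpretation is that an occurrence in a title counts for more than one in the body. The sketch below applies that idea to a smoothed TF-IDF weight; the zones, the `ZONE_WEIGHTS` values and the IDF variant are all assumptions.

```python
import math

# Hypothetical occurrence types and their multipliers.
ZONE_WEIGHTS = {"title": 3.0, "body": 1.0}


def weighted_tf(doc, term):
    """Term frequency in which each occurrence is scaled by its zone weight."""
    return sum(ZONE_WEIGHTS[zone] * tokens.count(term) for zone, tokens in doc.items())


def idf(corpus, term):
    """Smoothed inverse document frequency over all zones."""
    containing = sum(
        1 for doc in corpus if any(term in tokens for tokens in doc.values())
    )
    return math.log((1 + len(corpus)) / (1 + containing)) + 1.0


def term_weight(corpus, doc, term):
    return weighted_tf(doc, term) * idf(corpus, term)


corpus = [
    {"title": ["petri", "nets"], "body": ["nested", "petri", "nets", "model"]},
    {"title": ["text", "categorization"], "body": ["petri", "model"]},
]
```

With these toy documents, "petri" receives a higher weight in the first document because it also occurs in the title.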
