SIRIUS - Pereslavl.ru


Program Systems Institute, RAS
Artificial Intelligence
Research Center
Pereslavl-Zalessky, Russia
Lines of research
• Knowledge-based Dynamic Systems
• Computer Linguistics: Information Extraction, Information Retrieval, Text Categorization
• Image Analysis of Data
• Nested Petri Nets

Miracle PS
A software toolkit for designing intelligent systems
System Architecture
Control over docking of a space vehicle with
the orbital station
Control System Model:
• docking parameters (restrictions);
• analytical description of control zones;
• ship conditions database;
• ship model;
• station model;
• a set of goals;
• a system of rules;
• planned trajectory.
Main control zones and the boundaries between them
Main Goals:
• Approaching
• Divergence
• Contact with the station with minimal damage
Subgoals:
• Finding the station
• Approaching
• Hovering
• Flyby
Interface
Visualization Module
Research Prototype
SIRIUS
Intelligent Meta-Search System
Sirius is a meta-search system with a multi-agent environment for distributed computation and a powerful linguistic module for text analysis.
Features of the Sirius system
• Extension of standard keyword search mechanisms
• Query input in natural language
• Use of semantic text processing methods
• Automatic inclusion of new information sources
• Increased search accuracy
• Use of parallel computation
Example of a search query
Query = “The President has arrived in Bruxelles”
The semantic relation DIR(X, Y) states that Y is the direction of movement of X
(the role of X is «subject», the role of Y is «directive»):
DIR(President, Bruxelles)
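A relation such as DIR(X, Y) can be held as a small typed record. The sketch below is illustrative only: the class name SemanticRelation and its field names are assumptions, not part of Sirius, and the relation is built by hand rather than derived by linguistic analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticRelation:
    """Hypothetical container for one binary semantic relation."""
    name: str       # relation type, e.g. "DIR"
    subject: str    # X: the entity that moves (role "subject")
    directive: str  # Y: the direction of movement (role "directive")

    def __str__(self) -> str:
        return f"{self.name}({self.subject}, {self.directive})"


# Hand-built relation for the example query.
query_relation = SemanticRelation("DIR", "President", "Bruxelles")
print(query_relation)
```

Making the record frozen lets relations be stored in sets and compared directly, which is convenient when matching a query against documents.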
Calculation of relevance
Relevance is calculated from:
• Semantic roles
• Semantic connections
• Keywords
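One way to combine the three kinds of evidence is a weighted sum of overlaps between query and document. This is a minimal sketch under stated assumptions: the source does not say how the signals are combined, so the weights, the overlap measure and the function names are all hypothetical.

```python
def overlap(query_items: set, doc_items: set) -> float:
    """Fraction of the query's items that also occur in the document."""
    if not query_items:
        return 0.0
    return len(query_items & doc_items) / len(query_items)


# Hypothetical weights; not taken from the source.
WEIGHTS = {"roles": 0.4, "relations": 0.4, "keywords": 0.2}


def relevance(query: dict, doc: dict) -> float:
    """Weighted sum of role, relation and keyword overlap."""
    return sum(w * overlap(query[k], doc[k]) for k, w in WEIGHTS.items())


query = {
    "roles": {("President", "subject"), ("Bruxelles", "directive")},
    "relations": {("DIR", "President", "Bruxelles")},
    "keywords": {"president", "arrive", "bruxelles"},
}
doc = {
    "roles": {("President", "subject"), ("Bruxelles", "directive")},
    "relations": {("DIR", "President", "Bruxelles")},
    "keywords": {"president", "bruxelles", "summit"},
}
score = relevance(query, doc)
print(round(score, 3))
```

Here the document matches both roles and the relation but only two of the three query keywords, so it scores slightly below the maximum of 1.0.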

INEX: Tools for Information Extraction
Artificial Intelligence Research Centre
Program Systems Institute
Russian Academy of Sciences
152020 Pereslavl-Zalessky
Russia
+7 08535 98065
Information extraction
Objective:
• extract meaningful information of a pre-specified type from (typically large amounts of) text for further analysis
Output:
• data structures of a pre-specified format (filled scenario templates)
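A filled scenario template can be pictured as a record with named slots, some of which may stay empty. The sketch below assumes a hypothetical "visit" scenario; the class VisitTemplate and its slot names are invented for illustration and are not INEX's actual template format.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class VisitTemplate:
    """Hypothetical scenario template for a 'visit' event."""
    visitor: Optional[str] = None
    destination: Optional[str] = None
    date: Optional[str] = None

    def unfilled_slots(self) -> list:
        """Names of slots that extraction has not filled yet."""
        return [slot for slot, value in asdict(self).items() if value is None]


# Template filled from "The President has arrived in Bruxelles": no date is mentioned.
filled = VisitTemplate(visitor="President", destination="Bruxelles")
print(asdict(filled))
print(filled.unfilled_slots())
```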
Possible IE application scenarios:
• inference of new information (knowledge acquisition)
• query formulation and answering in human-computer systems
• automatic generation of abstracts and summaries
• visualization of document content, etc.
Named entity recognizer
• identifies proper names
• assigns semantic features to certain items

Information extraction rules
• a domain knowledge representation formalism (scenario templates)
• a set of patterns to identify template elements in a text (covering the many possible ways to talk about the target event elements)

An IE pattern includes:
• a set of rules that define how to find this pattern in a text
• a set of constraints that textual elements must satisfy to fit into a particular slot of the target template
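The two parts of a pattern, a matching rule and per-slot constraints, can be sketched as follows. Everything here is an assumption made for illustration: the regular expression, the toy word lists standing in for semantic checks, and the names ARRIVAL_PATTERN and apply_pattern. It does not reproduce INEX's rule formalism.

```python
import re

# Hypothetical pattern for an arrival event.
ARRIVAL_PATTERN = {
    # Rule: how to find the pattern in a sentence.
    "rule": re.compile(
        r"(?P<visitor>[A-Z]\w+) (?:has )?arrived (?:in|to|at) (?P<destination>[A-Z]\w+)"
    ),
    # Constraints: what a text fragment must satisfy to fill each slot.
    "constraints": {
        "visitor": lambda text: text in {"President", "Minister"},    # toy "person" check
        "destination": lambda text: text in {"Bruxelles", "Moscow"},  # toy "location" check
    },
}


def apply_pattern(pattern: dict, sentence: str):
    """Return the slot fillers if the rule matches and all constraints hold, else None."""
    match = pattern["rule"].search(sentence)
    if match is None:
        return None
    slots = match.groupdict()
    for slot, check in pattern["constraints"].items():
        if not check(slots[slot]):
            return None
    return slots


print(apply_pattern(ARRIVAL_PATTERN, "The President has arrived to Bruxelles"))
print(apply_pattern(ARRIVAL_PATTERN, "Tourists arrived in Moscow"))
```

The second sentence matches the rule but is rejected because "Tourists" fails the visitor constraint, which is the role the constraints play in keeping unsuitable text out of a slot.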

Coreference Resolver
• recognizes different occurrences of the same entity in a text
Merging partial results
• merging partially filled templates to produce a final, maximally filled template
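Merging can be illustrated with templates held as dictionaries in which an empty slot is None. This is a sketch, not the system's procedure: the function name merge_templates and the rule that conflicting fillers block the merge are assumptions.

```python
from typing import Optional


def merge_templates(a: dict, b: dict) -> Optional[dict]:
    """Merge two partially filled templates; return None if any slot conflicts."""
    merged = {}
    for slot in a.keys() | b.keys():
        va, vb = a.get(slot), b.get(slot)
        if va is not None and vb is not None and va != vb:
            return None  # assumed rule: conflicting fillers mean different events
        merged[slot] = va if va is not None else vb
    return merged


first = {"visitor": "President", "destination": None, "date": None}
second = {"visitor": None, "destination": "Bruxelles", "date": None}
print(merge_templates(first, second))
```

Slots that neither partial template fills stay empty, so the result is maximally filled only with respect to what was actually extracted.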
Text categorization system

The goal of text categorization is to classify documents into a fixed number of predefined categories, or classes. Each document may fall into one category, several, or none at all. When machine learning is used for text categorization, the goal is to train classifiers on a training set (a set of category-labeled documents).
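The multi-label setting, where a document may receive zero, one or several categories, can be shown with a deliberately simple scorer. The sketch below uses a toy term-share score with a threshold as a stand-in for whatever learning method the system actually uses; the function names, training texts and threshold are all assumptions.

```python
from collections import Counter, defaultdict


def train(labeled_docs):
    """Learn, per category, the share of each term's occurrences that fall in that category."""
    total = Counter()
    per_category = defaultdict(Counter)
    for text, categories in labeled_docs:
        terms = text.lower().split()
        total.update(terms)
        for category in categories:
            per_category[category].update(terms)
    return {
        category: {term: counts[term] / total[term] for term in counts}
        for category, counts in per_category.items()
    }


def classify(model, text, threshold=0.5):
    """Return every category whose mean term score reaches the threshold (possibly none)."""
    terms = text.lower().split()
    assigned = set()
    for category, weights in model.items():
        score = sum(weights.get(t, 0.0) for t in terms) / len(terms) if terms else 0.0
        if score >= threshold:
            assigned.add(category)
    return assigned


# Tiny category-labeled training set; the third document carries two labels.
training_set = [
    ("president visits bruxelles summit", {"politics"}),
    ("orbital station docking manoeuvre", {"space"}),
    ("president opens orbital station", {"politics", "space"}),
]
model = train(training_set)
print(classify(model, "president summit"))
print(classify(model, "president visits orbital station"))
print(classify(model, "weather forecast"))
```

Scoring each category independently against a threshold is what allows a document to land in several categories or in none.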
Features
• Both one-word and multi-word terms are used for text categorization.
• Extraction of multi-word terms is based on partial syntactic analysis of texts.
• Conventional statistics-based term weighting is enhanced by taking into account the different types of term occurrence in a document.
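The idea of occurrence-aware weighting can be sketched by boosting a term's frequency according to where it occurs before applying a standard inverse document frequency. The occurrence types ("title", "body"), the boost values and the function names are assumptions; the source does not list the types it distinguishes.

```python
import math

# Hypothetical boosts per occurrence type.
OCCURRENCE_BOOST = {"title": 3.0, "body": 1.0}


def weighted_tf(occurrences):
    """occurrences: list of (term, occurrence_type) pairs for one document."""
    tf = {}
    for term, kind in occurrences:
        tf[term] = tf.get(term, 0.0) + OCCURRENCE_BOOST[kind]
    return tf


def tf_idf(occurrences, document_frequency, n_documents):
    """Multiply the boosted term frequency by a log inverse document frequency."""
    return {
        term: freq * math.log(n_documents / document_frequency[term])
        for term, freq in weighted_tf(occurrences).items()
    }


# One multi-word term and one one-word term, each occurring twice.
doc = [
    ("orbital station", "title"),
    ("docking", "body"),
    ("docking", "body"),
    ("orbital station", "body"),
]
weights = tf_idf(doc, {"orbital station": 10, "docking": 10}, n_documents=100)
print(weights)
```

Both terms occur twice and are equally rare in the collection, yet "orbital station" ends up with twice the weight because one of its occurrences is in the title.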
