Transcript Document

Branch Department
First Joint Research Seminar of AI
Departments of Ukraine and the Netherlands
A Review of Research Topics of the
AI Department in Kharkov
(MetaIntelligence Laboratory)
Vagan Terziyan, Helen Kaikova
November 25 - December 5, 1999
Vrije Universiteit Amsterdam (The Netherlands)
Authors
Vagan Terziyan
[email protected]
Helen Kaikova
[email protected]
In cooperation with:
Metaintelligence Laboratory
Department of Artificial Intelligence
Kharkov State Technical University of Radioelectronics, UKRAINE
Contents
• A Metasemantic Network
• Metasemantic Algebra of Contexts
• The Law of Semantic Balance
• Metapetrinets
• Multidatabase Mining and Ensemble of Classifiers
• Trends of Uncertainty, Expanding Context and Discovering Knowledge
• Recursive Arithmetic
• Similarity Evaluation in Multiagent Systems
• On-Line Learning
A Metasemantic Network
A Semantic Metanetwork
A semantic metanetwork is formally defined as a set of semantic networks stacked on each other in such a way that the links of every previous semantic network are at the same time nodes of the next network.
An Example of a Semantic
Metanetwork
[Figure: a three-level semantic metanetwork. Zero level: nodes A1-A3 connected by links L1-L4; first level: nodes A'1-A'4 connected by links L'1-L'3; second level: nodes A''1-A''3 connected by links L''1-L''2.]
How it Works
• In a semantic metanetwork every higher level controls the semantic structure of the lower level.
• Simple controlling rules might state, for example, in what contexts a certain link of a semantic structure can exist and in what contexts it should be deleted from the semantic structure.
• Such a multilevel network can be used in an adaptive control system whose structure is automatically changed following changes in the context of the environment; a minimal sketch of such a control rule is given below.
• An algebra for reasoning with a semantic metanetwork has also been developed.
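The following minimal Python sketch (our illustration, not the authors' implementation; all names are hypothetical) shows such a controlling rule: each zero-level link is bound to a first-level context node and exists only while that context is active.

```python
# A minimal sketch of a two-level semantic metanetwork: links of the
# zero-level network are nodes of the first-level (context) network,
# and a link is kept only while its controlling context is active.

class SemanticNetwork:
    def __init__(self):
        self.links = {}                   # link name -> (subject, relation, object)

    def add_link(self, name, a_i, relation, a_j):
        self.links[name] = (a_i, relation, a_j)


class Metanetwork:
    def __init__(self, base):
        self.base = base
        self.active_contexts = set()      # first-level nodes currently "on"
        self.controls = {}                # base link name -> controlling context node

    def bind(self, base_link, context_node):
        self.controls[base_link] = context_node

    def visible_links(self):
        # The simple controlling rule: a base link exists only while its
        # controlling first-level node is among the active contexts.
        return {name: triple for name, triple in self.base.links.items()
                if self.controls.get(name) in self.active_contexts}


base = SemanticNetwork()
base.add_link("L1", "A1", "to_hate", "A2")
meta = Metanetwork(base)
meta.bind("L1", "A'1")
meta.active_contexts.add("A'1")
print(meta.visible_links())               # {'L1': ('A1', 'to_hate', 'A2')}
meta.active_contexts.clear()              # the context changes...
print(meta.visible_links())               # {} - the link is deleted
```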
Published and Further Developed in
Puuronen S., Terziyan V., A Metasemantic Network, In:
E. Hyvonen, J. Seppanen and M. Syrjanen (eds.), SteP-92
- New Directions in Artificial Intelligence, Publications of
the Finnish AI Society, Otaniemi, Finland, 1992, Vol. 1,
pp. 136-143.
Terziyan V., Multilevel Models for Knowledge Bases
Control and Their Applications to Automated
Information Systems, Doctor of Technical Sciences
Degree Thesis, Kharkov State Technical University of
Radioelectronics, 1993
A Metasemantic Algebra for
Managing Contexts
A Semantic Predicate
A semantic predicate describes a piece of knowledge (a relation or a property) by the expression

P(A_i, L_k, A_j) = true

if there is knowledge that a relation with name L_k holds between objects A_i and A_j.
Example of Knowledge
“Bill hates poor Mary”.
A1: <Bill>; A2: <Mary> - objects;
L1: <to hate>; L2: <to be poor> - names of relations;
P(A1, L1, A2) = <Bill hates Mary> - relation;
P(A2, L2, A2) = <Mary is poor> - property;
Knowledge: P(A1, L1, A2) ∧ P(A2, L2, A2) = true.
Semantic Operations: Inversion
[Diagram: the link L_k from A_i to A_j is reversed into the link ~L_k from A_j to A_i.]

P(A_i, L_k, A_j) = P(A_j, ~L_k, A_i), where ~(~L_k) = L_k.
Semantic Operations: Negation
¬P(A_i, L_k, A_j) = P(A_i, ¬L_k, A_j)

For example, P(<Mary>, <to_love>, <Tom>) = false is the same as P(<Mary>, <not_to_love>, <Tom>) = true.

Properties: ¬(¬L_k) = L_k; ¬(~L_k) = ~(¬L_k).
Semantic Operations: Composition
[Diagram: the chain A_i -L_k-> A_s -L_n-> A_j is collapsed into the single link L_k * L_n from A_i to A_j.]

P(A_i, L_k, A_s) ∧ P(A_s, L_n, A_j) → P(A_i, L_k * L_n, A_j)
If it is true: P(<Mary>, <to_be_married_with>, <Tom>) and
P(<Tom>, <to_have_mother>, <Diana>),
then it is also true that:
P(<Mary>, <to_have_mother-in-law>, <Diana>).
Semantic Operations: Intersection
[Diagram: two parallel links L_k and L_n from A_i to A_j are merged into the single link L_k ∧ L_n.]

P(A_i, L_k, A_j) ∧ P(A_i, L_n, A_j) → P(A_i, L_k ∧ L_n, A_j)
<to_give_birth_to> +
<to_take_care_of> =
<to_be_mother_of>.
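A minimal Python sketch (our encoding, not the authors' code; the string-based relation names are only illustrative) of the semantic operations above:

```python
# Inversion (~), negation (not_), intersection (&) and composition (*)
# on relation names and knowledge-base triples.

def inverse(link):                       # ~L: <to_love> <-> <~to_love>
    return link[1:] if link.startswith("~") else "~" + link

def negation(link):                      # not L: <to_love> <-> <not_to_love>
    return link[4:] if link.startswith("not_") else "not_" + link

def intersect(l_k, l_n):                 # Lk & Ln between the same pair of objects
    return l_k + " & " + l_n

def compose(kb, a_i, l_k, l_n):
    # P(Ai, Lk, As) and P(As, Ln, Aj) imply P(Ai, Lk*Ln, Aj)
    return {(a_i, l_k + " * " + l_n, y)
            for (x, l1, s) in kb for (s2, l2, y) in kb
            if x == a_i and l1 == l_k and s2 == s and l2 == l_n}

kb = {("Mary", "to_be_married_with", "Tom"),
      ("Tom", "to_have_mother", "Diana")}
print(compose(kb, "Mary", "to_be_married_with", "to_have_mother"))
# {('Mary', 'to_be_married_with * to_have_mother', 'Diana')} - the
# composed link plays the role of <to_have_mother-in-law>
```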
Semantic Operations: Interpretation
[Diagram: (a) the link L_k between A_i and A_j is interpreted in the context of a first-level node A'_l carrying the metarelation L'_n, yielding the interpreted link L_k^{L'_n}; (b) the same operation for a property.]

P(A_i, L_k, A_j) ∧ P(A'_l, L'_n, A'_l) ∧ ist(A'_l, P(A_i, L_k, A_j)) → P(A_i, L_k^{L'_n}, A_j)
Interpreting Knowledge in a Context

⟨interpreted knowledge⟩ = ⟨...⟨⟨knowledge⟩^⟨knowledge about context⟩⟩...⟩^⟨knowledge about metacontext of n-th level⟩

That is, knowledge is interpreted first in its immediate context, then in the metacontext of that context, and so on up to the metacontext of the n-th level.
Example of Interpretation
[Figure: the three-level semantic metanetwork from the earlier example.]

The interpreted knowledge about the relation between A1 and A3, taking all contexts and metacontexts into account, is as follows:

L_{A1-A3} = ((L_1 * L_2) ∨ L_3), interpreted in the first-level context ((L'_1 * L'_2) ∨ (L'_1 * L'_3) ∨ (L'_2 * L'_1) ∨ (L'_2 * L'_3)), which is itself interpreted in the second-level metacontext ((L''_1 * L''_2) ∨ (~L''_2 * ~L''_1)).
Decontextualization
Lxknowledgeabout context  interprete d knowledge 
Suppose that your colleague, whose context you know well,
has described you a situation. You use knowledge about
context of this person to interpret the “real” situation.
Example is more complicated if several persons describe
you the same situation. In this case, the context of the
situation is the semantic sum over all personal contexts.
Context Recognition

Solve ⟨knowledge⟩^{L_x} = ⟨interpreted knowledge⟩ for the unknown context L_x.

Suppose that someone sends you a message describing a situation that you know well. You compare your own knowledge with the knowledge you received. Usually you can derive an opinion about the sender of this letter. The knowledge about the source of the message derived this way can be considered as the context in which the real situation has been interpreted, and this can help you to recognize the source, or at least his motivation to change the reality.
Lifting (Relative Decontextualization)
This means deriving knowledge interpreted in
some context if it is known how this
knowledge was interpreted in another context.
(L_x^{L_m} = L_k) -> decontextualization -> L_x -> interpretation in the new context -> (L_x^{L_n})
Published and Further Developed in
Terziyan V., Puuronen S., Multilevel Context Representation Using Semantic Metanetwork, In: CONTEXT-97 - Proceedings of the International and Interdisciplinary Conference on Modeling and Using Context, Rio de Janeiro, Brazil, February 4-6, 1997, pp. 21-32.
Terziyan V., Puuronen S., Reasoning with Multilevel
Contexts in Semantic Metanetworks, In: D.M. Gabbay
(Ed.), Formal Aspects in Context, Kluwer Academic
Publishers, 1999, pp. 173-190.
The Law of Semantic Balance
An Object in Possible World
[Figure: an object A in a possible world W, connected with surrounding objects A1-A8 by links L1-L6.]
Internal and External View of an
Object
[Figure: (a) the external view of object A in world W, with A treated as an atom; (b) the internal view, exposing A's internal structure of objects and links.]
Internal Semantics of an Object
The internal semantics of an object is equal to the semantic sum of all chains of semantic relations that start and finish on the shell of this object and pass inside it:

L_{A_i - A_i} = ∨_{j,k: j ≠ k, A_j ≠ A_k, P(A_i, HAS_PART, A_j), P(A_i, HAS_PART, A_k)} (HAS_PART * L_{A_j - A_k} * PART_OF)
External Semantics of an Object
The external semantics of an object is equal to the internal semantics of the World if one considers this object as an atom in this World (i.e., after removing the internal structure of the object from the World):

E_ex(A_i) = E_in(World with A_i -> Atom)
The Law of Semantic Balance
The external and internal semantics of any object, as evolutionary knowledge, are equivalent to each other in the limit:

lim_{t→∞} E_in^{(t)}(A_i) = lim_{t→∞} E_ex^{(t)}(A_i)
The Evolution of Knowledge
[Figure: the evolution of knowledge in stages (a)-(f): starting from NIL (nil_in, nil_ex), internal knowledge E_in and external knowledge E_ex grow step by step (Step 1, Step 2, ...), repeatedly passing through states of balance until E_in and E_ex are balanced.]
Published and Further Developed in
Terziyan V., Multilevel Models for Knowledge Bases Control and Their
Applications to Automated Information Systems, Doctor of Technical
Sciences Degree Thesis, Kharkov State Technical University of
Radioelectronics, 1993
Grebenyuk V., Kaikova H., Terziyan V., Puuronen S., The Law of Semantic Balance and its Use in Modeling Possible Worlds, In: STeP-96 - Genes, Nets and Symbols, Publications of the Finnish AI Society, Vaasa, Finland, 1996, pp. 97-103.
Terziyan V., Puuronen S., Knowledge Acquisition Based on Semantic Balance of Internal and External Knowledge, In: I. Imam, Y. Kodratoff, A. El-Dessouki and M. Ali (Eds.), Multiple Approaches to Intelligent Systems, Lecture Notes in Artificial Intelligence, Springer-Verlag, V. 1611, 1999, pp. 353-361.
Metapetrinets for Flexible
Modelling and Control of
Complicated Dynamic Processes
A Metapetrinet
• A metapetrinet is able not only to change the marking of a petrinet but also to dynamically reconfigure its structure.
• Each level of the new structure is an ordinary petrinet of some traditional type.
• A basic-level petrinet simulates the process of some application.
• The second level, i.e. the metapetrinet level, is used to simulate and help control the configuration change at the basic level.
How it Works
• There is a correspondence between the places of the second-level structure and the places or transitions of the basic-level structure.
• One possible control rule is that a certain place or transition is removed from the present configuration of the basic level if the corresponding place at the metalevel becomes empty.
• If at least one token appears in an empty metalevel place, then the originally defined corresponding basic-level place or transition is immediately restored to the configuration (see the sketch below).
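A minimal Python sketch of this control rule (assumed semantics, not the authors' tool; all names are hypothetical):

```python
# A metalevel place controls whether the corresponding basic-level
# place exists in the current configuration of the basic petrinet.

class Metapetrinet:
    def __init__(self, basic_places, meta_marking, control_map):
        self.basic_places = basic_places      # full, originally defined set
        self.meta_marking = meta_marking      # metalevel place -> token count
        self.control_map = control_map        # metalevel place -> basic place

    def current_configuration(self):
        # A basic place is present iff its controlling metalevel place
        # is non-empty (the control rule from the slide).
        removed = {basic for meta, basic in self.control_map.items()
                   if self.meta_marking.get(meta, 0) == 0}
        return [p for p in self.basic_places if p not in removed]


net = Metapetrinet(
    basic_places=["P1", "P2", "P3", "P4"],
    meta_marking={"P'1": 1, "P'2": 0},
    control_map={"P'1": "P1", "P'2": "P3"})
print(net.current_configuration())   # ['P1', 'P2', 'P4'] - P3 is removed
net.meta_marking["P'2"] = 1          # a token appears at the metalevel...
print(net.current_configuration())   # ...and P3 is restored
```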
Example of a Metapetrinet
[Figure: a metapetrinet example. Controlling level: places P'1-P'5 and transitions t'1-t'3; basic level: places P1-P4 and transitions t1-t2.]
Controlling Interactions between
Metapetrinet’s Levels
Controlling effects on the basic-level petrinet attributes:
• <place>: 1) removing a place; 2) restoring a place; 3) changing a place's capacity; 4) changing a place's marking.
• <transition>: 1) removing a transition; 2) restoring a transition; 3) changing time settings; 4) changing the fire rule.
• <link>: 1) removing a link; 2) restoring a link; 3) changing a link's direction; 4) changing a link's capacity.
• <token>: 1) removing a token; 2) restoring a token; 3) changing a token's color; 4) changing a token's place.
Published and Further Developed in
Terziyan V., Multilevel Models for Knowledge Bases
Control and Their Applications to Automated Information
Systems, Doctor of Technical Sciences Degree Thesis,
Kharkov State Technical University of Radioelectronics,
1993
Savolainen V., Terziyan V., Metapetrinets for Controlling
Complex and Dynamic Processes, International Journal
of Information and Management Sciences, V. 10, No. 1,
March 1999, pp.13-32.
Mining Several Databases with
an Ensemble of Classifiers
Problem
Case ONE:MANY
Dynamic Integration of Classifiers
• The final classification is made by weighted voting of classifiers from the ensemble;
• the weights of the classifiers are recalculated for every new instance;
• the weighting is based on the predicted errors of the classifiers in the neighborhood of the instance (a minimal sketch follows).
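A minimal Python sketch of this weighting scheme (our illustration; the error representation and the 1 - error weights are assumptions):

```python
# Dynamic integration of an ensemble: weights are recomputed for every
# new instance from the classifiers' errors in its neighborhood.

import numpy as np

def dynamic_vote(classifiers, X_train, err_train, x_new, k=5):
    """err_train[i, j]: observed error of classifier j on training instance i."""
    d = np.linalg.norm(X_train - x_new, axis=1)
    neighbors = np.argsort(d)[:k]                 # neighborhood of the instance
    pred_err = err_train[neighbors].mean(axis=0)  # predicted error per classifier
    w = 1.0 - pred_err                            # lower error -> higher weight
    votes = {}
    for weight, clf in zip(w, classifiers):
        label = clf(x_new)
        votes[label] = votes.get(label, 0.0) + weight
    return max(votes, key=votes.get)              # weighted-majority class

clfs = [lambda x: "ok", lambda x: "fault"]
X = np.array([[0.0], [1.0], [2.0]])
E = np.array([[0.1, 0.9], [0.2, 0.8], [0.9, 0.1]])
print(dynamic_vote(clfs, X, E, np.array([0.5]), k=2))   # 'ok'
```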
Case MANY:ONE
Integration of Databases
• The final classification of an instance is obtained by weighted voting of the predictions made by the classifier for every database separately;
• the weighting is based on taking the integral of the error function of the classifier across every database.
Case MANY:MANY
Solutions for MANY:MANY
Decontextualization of Predictions
• Sometimes the actual value cannot be predicted as a weighted mean of the individual predictions of the classifiers from the ensemble;
• it means that the actual value is outside the area of predictions;
• it happens if the classifiers are affected by the same type of context with different power;
• it results in a trend among the predictions, from the less powerful context to the most powerful one;
• in this case the actual value can be obtained as the result of "decontextualization" of the individual predictions.
Neighbor Context Trend
[Figure: for an input x_i, the prediction y-(x_i) made in the (1,2) neighbor context ("worse context") and the prediction y+(x_i) made in the (1,2,3) neighbor context ("better context") form a trend towards the actual value y(x_i), the "ideal context".]
Main Decontextualization Formula

y+ - prediction in the better context; y- - prediction in the worse context; y' - decontextualized prediction; y - actual value. Denote the corresponding errors Δ+ = |y - y+|, Δ- = |y - y-|, Δ' = |y - y'|. Then:

Δ' = (Δ- · Δ+) / (Δ- + Δ+)

Since Δ+ < Δ-, it follows that Δ' < Δ- and Δ' < Δ+: the decontextualized prediction is closer to the actual value than either individual prediction.
Some Notes
• Dynamic integration of classifiers based on locally adaptive weights of the classifiers allows us to handle the case «One Dataset - Many Classifiers»;
• integration of databases based on their integral weights relative to the classification accuracy allows us to handle the case «One Classifier - Many Datasets»;
• successive or parallel application of the two above algorithms allows a variety of solutions for the case «Many Classifiers - Many Datasets»;
• decontextualization, as a way of integrating classifiers opposite to weighted voting, allows us to handle the context of classification in the case of a trend.
Published in
Puuronen S., Terziyan V., Logvinovsky A., Mining
Several Data Bases with an Ensemble of Classifiers, In:
T. Bench-Capon, G. Soda and M. Tjoa (Eds.), Database
and Expert Systems Applications, Lecture Notes in
Computer Science, Springer-Verlag, V. 1677, 1999, pp.
882-891.
Other Related Publications
Terziyan V., Tsymbal A., Puuronen S., The Decision Support System for
Telemedicine Based on Multiple Expertise, International Journal of Medical
Informatics, Elsevier, V. 49, No.2, 1998, pp. 217-229.
Tsymbal A., Puuronen S., Terziyan V., Arbiter Meta-Learning with Dynamic
Selection of Classifiers and its Experimental Investigation, In: J. Eder, I. Rozman,
and T. Welzer (Eds.), Advances in Databases and Information Systems, Lecture
Notes in Computer Science, Springer-Verlag, Vol. 1691, 1999, pp. 205-217.
Skrypnik I., Terziyan V., Puuronen S., Tsymbal A., Learning Feature Selection
for Medical Databases, In: Proceedings of the 12th IEEE Symposium on
Computer-Based Medical Systems CBMS'99, Stamford, CT, USA, June 1999,
IEEE CS Press, pp.53-58.
Puuronen S., Terziyan V., Tsymbal A., A Dynamic Integration Algorithm for an
Ensemble of Classifiers, In: Zbigniew W. Ras, Andrzej Skowron (Eds.),
Foundations of Intelligent Systems: 11th International Symposium ISMIS'99,
Warsaw, Poland, June 1999, Lecture Notes in Artificial Intelligence, V. 1609,
Springer-Verlag, pp. 592-600.
An Interval Approach to Discover
Knowledge from Multiple Fuzzy
Estimations
The Problem of Interval Estimation
• Measurements (as well as expert opinions) are not absolutely accurate.
• The measurement result is expected to lie in an interval around the actual value.
• This inaccuracy leads to the need to estimate the resulting inaccuracy of data processing.
• When experts are used to estimate the value of some parameter, intervals are commonly used to describe degrees of belief.
Noise of an Interval Estimation
• In many real-life cases there is also some noise which does not allow direct measurement of parameters.
• The noise can be considered as an undesirable effect (context) on the evaluation of a parameter.
• Different measurement instruments, as well as different experts, possess different resistance against the influence of noise.
• Using measurements from several different instruments, as well as estimations from multiple experts, we try to discover the effect caused by the noise and thus be able to derive the decontextualized measurement result.
Decontextualization of Noise in Pattern
Recognition with Multiple Estimations
[Figure: four noisy estimations (1-4) of a pattern are decontextualized into a single recognized pattern.]
Basic Assumption
• The estimation of some parameter x given by a more accurate knowledge source (i.e., a source that guarantees a smaller upper bound of the measurement error) is supposed to be closer to the actual value of the parameter x (i.e., the source is more resistant against the noise of estimation).
• The assumption allows us to derive different trends in cases when there are multiple estimations that result in shorter estimation intervals.
Basic Idea of Decontextualization
[Figure: the lower bounds a_i = f(u_i) and upper bounds b_i = φ(u_i) of estimation intervals plotted against uncertainty u: for u_i > u_j the interval [a_j, b_j] is nested in [a_i, b_i]; extrapolating both trends towards smaller u yields the decontextualized result interval [a_res, b_res].]
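A minimal Python sketch of this idea (an assumed procedure: linear trends of the bounds against interval width, extrapolated to width zero):

```python
# Order estimations from least to most accurate, fit the trends of the
# lower and upper bounds against interval width, and extrapolate to
# width zero to get a "decontextualized" result interval.

import numpy as np

def decontextualize(intervals):
    """intervals: list of (a, b) estimation intervals of one parameter."""
    ivs = sorted(intervals, key=lambda ab: ab[1] - ab[0], reverse=True)
    u = np.array([b - a for a, b in ivs])        # uncertainty = interval width
    a = np.array([a for a, _ in ivs])
    b = np.array([b for _, b in ivs])
    # linear least-squares trends a = f(u), b = phi(u), extrapolated to u = 0
    a_res = np.polyval(np.polyfit(u, a, 1), 0.0)
    b_res = np.polyval(np.polyfit(u, b, 1), 0.0)
    return a_res, b_res

print(decontextualize([(0, 13), (2, 11), (4, 9)]))   # (6.5, 6.5): the
# nested intervals converge to a point under this linear trend
```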
An Example
[Figure: three nested estimation intervals [a1, b1] ⊃ [a2, b2] ⊃ [a3, b3] plotted on the x axis (0-13); extrapolating their trend yields the narrower result interval [a_res, b_res].]
Some Notes
• If you have several opinions (estimations, recognition results, solutions, etc.) with different values of uncertainty, you can select the most precise one;
• however, it seems more reasonable to order the opinions from the worst to the best one and try to recognize a trend of uncertainty, which helps you to derive an opinion more precise than the best one.
Application of the Trend of
Uncertainty to Image Restoration
Published and Further Developed in
Terziyan V., Puuronen S., Kaikova H., Handling Uncertainty by
Decontextualizing Estimated Intervals, In: Proceedings of MISC'99
Workshop on Applications of Interval Analysis to Systems and Control
with special emphasis on recent advances in Modal Interval Analysis,
24-26 February 1999, Universitat de Girona, Girona, Spain, pp. 111-121.
Terziyan V., Puuronen S., Kaikova H., Interval-Based Parameter
Recognition with the Trends in Multiple Estimations, Pattern Recognition
and Image Analysis: Advances of Mathematical Theory and Applications,
Interperiodica Publishing, V. 9, No. 4, August 1999.
Flexible Arithmetic for Huge Numbers
with Recursive Series of Operations
Infinite Series of Arithmetical Operations

1. Addition: a + b (we use it as the basic operation);
2. Multiplication: a * b = a + a + ... + a (b times);
3. Raising to a power: a ^ b = a * a * ... * a (b times);
4. General case: a ∘(n) b = (a ∘(n) (b - 1)) ∘(n-1) a; a ∘(n) 1 = a; a ∘(1) b = a + b,

where ∘(n) denotes the operation of recursion level n (∘(1) = +, ∘(2) = *, ∘(3) = ^, ...).
Some results
• A recursive expansion of the set of ordinary arithmetical operations was investigated;
• the recursive arithmetical operation a ∘(n) b was defined, where n is the level of recursion starting with ordinary + (n = 1);
• basic properties of recursive operations were investigated, and an algorithm for calculation of these operations was considered (a sketch follows);
• "recursive counters" were proposed for the representation of huge integers, which are the results of recursive operations, in a restricted memory.
Published in
Terziyan V., Tsymbal A., Puuronen S., Flexible
Arithmetic For Huge Numbers with Recursive
Series of Operations, In: 13th AAECC Symposium
on Applied Algebra, Algebraic Algorithms, and
Error-Correcting Codes, 15-19 November 1999,
Hawaii, USA.
A Similarity Evaluation Technique
for Cooperative Problem Solving
with a Group of Agents
Goal
• The goal of this research is to develop a simple similarity evaluation technique to be used for cooperative problem solving based on the opinions of several agents.
• Problem solving here means finding an appropriate solution for the problem among the available ones, based on the opinions of several agents.
Basic Concepts:
Virtual Training Environment (VTE)
The VTE of a group of agents is a quadruple <D, C, S, P>, where:
• D is the set of problems D1, D2, ..., Dn in the VTE;
• C is the set of solutions C1, C2, ..., Cm that are used to solve the problems;
• S is the set of agents S1, S2, ..., Sr who select solutions to solve the problems;
• P is the set of semantic predicates that define relationships between D, C, S.
External Similarity Values
[Figure: external similarity values DC_{i,j} between problem D_i and solution C_j, SD_{k,i} between agent S_k and problem D_i, and SC_{k,j} between agent S_k and solution C_j.]

External Similarity Values (ESV) are binary relations DC, SC, and SD between the elements of (sub)sets of D and C, S and C, and S and D. ESV are based on the total support among all the agents for voting for the appropriate connection (or refusing to vote).
Internal Similarity Values
[Figure: internal similarity values DD_{i',i''} between two problems, CC_{j',j''} between two solutions, and SS_{k',k''} between two agents.]

Internal Similarity Values (ISV) are binary relations between two subsets of D, two subsets of C, and two subsets of S. ISV are based on the total support among all the agents for voting for the appropriate connection (or refusing to vote).
Why do we Need Similarity Values (or a Distance Measure)?
• The distance between problems is used by agents to recognize the nearest solved problems for any new problem;
• the distance between solutions is necessary to compare and evaluate solutions made by different agents;
• the distance between agents is useful to evaluate the weights of all agents, to be able to integrate them by weighted voting.
Deriving External Relation DC:
How well a solution fits the problem

DC_{i,j} = CD_{j,i} = Σ_{k=1}^{r} P(D_i, C_j, S_k), ∀D_i ∈ D, ∀C_j ∈ C

[Figure: three agents S_k vote for the connection between problem D_i and solution C_j, giving DC_{i,j} = 3.]
Deriving External Relation SC:
Measures an Agent's Competence in the Area of Solutions
• The value of the relation (S_k, C_j) in a way represents the total support that the agent S_k obtains by selecting (or refusing to select) the solution C_j to solve all the problems:

SC_{k,j} = CS_{j,k} = Σ_{i=1}^{n} DC_{i,j} · P(D_i, C_j, S_k), ∀S_k ∈ S, ∀C_j ∈ C
Deriving External Relation SD:
Measures an Agent's Competence in the Problems' Area
• The value of the relation (S_k, D_i) represents the total support that the agent S_k receives by selecting (or refusing to select) all the solutions to solve the problem D_i:

SD_{k,i} = DS_{i,k} = Σ_{j=1}^{m} DC_{i,j} · P(D_i, C_j, S_k), ∀S_k ∈ S, ∀D_i ∈ D
j
Agent's Evaluation:
Competence Quality in the Problem Area

Q^D(S_k) = (1/n) · Σ_{i=1}^{n} SD_{k,i}

- a measure of the abilities of an agent in the area of problems, from the support point of view.
Agent's Evaluation:
Competence Quality in the Solutions' Area

Q^C(S_k) = (1/m) · Σ_{j=1}^{m} SC_{k,j}

- a measure of the abilities of an agent in the area of solutions, from the support point of view.
Quality Balance Theorem
Q^D(S_k) = Q^C(S_k)

The evaluation of an agent's competence (ranking, weighting, quality evaluation) does not depend on the competence area ("virtual world of problems" or "conceptual world of solutions"), because both competence values are always equal.
Internal Similarity for Agents:
Problems-based Similarity
[Figure: the problems-based similarity S'S''^D between agent subsets S' and S'' is derived through the relations S'D and DS''.]

∀S' ⊆ S, ∀S'' ⊆ S: S'S''^D = S'D · DS''
Internal Similarity for Agents:
Solutions-Based Similarity
[Figure: the solutions-based similarity S'S''^C between agent subsets S' and S'' is derived through the relations S'C and CS''.]

∀S' ⊆ S, ∀S'' ⊆ S: S'S''^C = S'C · CS''
Internal Similarity for Agents:
Solutions-Problems-Based Similarity
[Figure: the solutions-problems-based similarity S'S''^{CD} between agent subsets S' and S'' is derived through the relations S'C, CD and DS''.]

∀S' ⊆ S, ∀S'' ⊆ S: S'S''^{CD} = S'C · CD · DS''
Conclusion
• We discussed methods of deriving the total support of each binary similarity relation. This can be used, for example, to derive the most supported solution and to evaluate the agents according to their competence.
• We also discussed relations between elements taken from the same set: problems, solutions, or agents. This can be used, for example, to divide agents into groups of similar competence relative to the problems-solutions environment.
Published in
Puuronen S., Terziyan V., A Similarity Evaluation Technique for Cooperative Problem Solving with a Group of Agents, In: M. Klusch, O. M. Shehory, G. Weiss (Eds.), Cooperative Information Agents III, Lecture Notes in Artificial Intelligence, Springer-Verlag, V. 1652, 1999, pp. 163-174.
On-Line Incremental Instance-Based Learning
The Problems Addressed
The following problems have been investigated, both in on-line learning for human experts and for artificial predictors:
• How to derive the most supported knowledge (on-line prediction or classification) from multiple experts (an ensemble of classifiers);
• how to evaluate the quality of the most supported opinion (of the ensemble prediction);
• how to make, evaluate, use and refine the ranks (weights) of all the experts (predictors) to improve the results of the on-line learning algorithm.
Results Published in
Kaikova H., Terziyan V., Temporal Knowledge Acquisition From Multiple Experts, In: Shoval P. & Silberschatz A. (Eds.), Proceedings of NGITS'97 - The Third International Workshop on Next Generation Information Technologies and Systems, Neve Ilan, Israel, June - July 1997, pp. 44-55.
Puuronen S., Terziyan V., Omelayenko B., Experimental Investigation of Two Rank Refinement Strategies for Voting with Multiple Experts, Artificial Intelligence, Donetsk Institute of Artificial Intelligence, V. 2, 1998, pp. 25-41.
Omelayenko B., Terziyan V., Puuronen S., Managing Training Examples for
Fast Learning of Classifiers Ranks, In: CSIT’99 - International Workshop on
Computer Science and Information Technologies, January 1999, Moscow,
Russia, pp. 139-148.
Puuronen S., Terziyan V., Omelayenko B., Multiple Experts Voting: Two Rank
Refinement Strategies, In: Integrating Technology & Human Decisions: Global
Bridges into the 21st Century, Proceedings of the D.S.I.’99 5-th International
Conference, 4-7 July 1999, Athens, Greece, V. 1, pp. 634-636.
We will be Happy to Cooperate
with You !
Branch Department
Advanced Diagnostics
Algorithms in Online Field
Device Monitoring
Vagan Terziyan (editor)
http://www.cs.jyu.fi/ai/Metso_Diagnostics.ppt
“Industrial Ontologies” Group: http://www.cs.jyu.fi/ai/OntoGroup/index.html
“Industrial Ontologies” Group, Agora Center, University of Jyväskylä, 2003
Contents
• Introduction: OntoServ.Net - Global "Health-Care" Environment for Industrial Devices;
• Bayesian Metanetworks for Context-Sensitive Industrial Diagnostics;
• Temporal Industrial Diagnostics with Uncertainty;
• Dynamic Integration of Classification Algorithms for Industrial Diagnostics;
• Industrial Diagnostics with Real-Time Neuro-Fuzzy Systems;
• Conclusion.
Vagan
Terziyan
Andriy
Zharko
Oleksandr
Kononenko
Oleksiy
Khriyenko
Web Services for Smart Devices
Smart industrial devices can also be Web Service "users". Their embedded agents are able to monitor the state of the appropriate device, and to communicate and exchange data with other agents. There is a good reason to launch special Web Services for such smart industrial devices, to provide the necessary online condition monitoring, diagnostics, maintenance support, etc.
OntoServ.Net: “Semantic Web Enabled Network of Maintenance Services
for Smart Devices”, Industrial Ontologies Group, Tekes Project Proposal,
March 2003.
Global Network of Maintenance Services
OntoServ.Net: “Semantic Web Enabled Network of Maintenance Services
for Smart Devices”, Industrial Ontologies Group, Tekes Project Proposal,
March 2003.
Embedded Maintenance Platforms
[Figure: an embedded platform hosting a host agent, connected to a maintenance service with its service agents.]

Based on the online diagnostics, a service agent, selected for the specific emergency situation, moves to the embedded platform to help the host agent manage it and to carry out the predictive maintenance activities.
OntoServ.Net Challenges
• A new group of Web service users: smart industrial devices.
• Internal (embedded) and external (Web-based) agent-enabled service platforms.
• The "Mobile Service Component" concept supposes that any service component can move, be executed and learn at any platform from the Service Network, including the service requestor side.
• The Semantic Peer-to-Peer concept assumes ontology-based decentralized service network management.
Agents in Semantic Web
[Dialogue between agents:]
1. "I feel bad, pressure more than 200, headache, ... Who can advise what to do?"
2. "I think you should stop drinking beer for a while."
3. "Wait a bit, I will give you some pills."
4. "Never had such an experience. No idea what to do."

Agents in the Semantic Web are supposed to understand each other because they will share a common standard, platform, ontology and language.
The Challenge: Global Understanding
eNvironment (GUN)
How to make entities from our physical world understand each other when necessary?
... It's elementary! But not easy!! Just make agents out of them!!!
GUN Concept
[Dialogue:]
1. "I feel bad, temperature 40, pain in stomach, ... Who can advise what to do?"
2. "I have some pills for you."

Entities will interoperate through OntoShells, which are "supplements" of these entities up to Semantic Web enabled agents.
Semantic Web: Before GUN
[Figure: Semantic Web applications sit above Semantic Web resources; the applications "understand", (re)use, share, integrate, etc., the Semantic Web resources.]
GUN Concept:
All GUN resources "understand" each other:

Real World Object + OntoAdapter + OntoShell = GUN Resource

[Figure: real-world objects, including new-generation objects with an OntoAdapter inside, are wrapped by OntoShells into GUN.]
Read Our Reports
• Semantic Web: The Future Starts Today (a collection of research papers and presentations of the Industrial Ontologies Group for the period November 2002 - April 2003) - V. Terziyan
• Semantic Web and Peer-to-Peer: Integration and Interoperability in Industry - A. Zharko
• Semantic Web Enabled Web Services: State-of-Art and Challenges - O. Kononenko
• Distributed Mobile Web Services Based on Semantic Web: Distributed Industrial Product Maintenance System - O. Khriyenko
• Available online at: http://www.cs.jyu.fi/ai/OntoGroup/index.html
Industrial Ontologies Group
Vagan Terziyan
Oleksandra Vitko
Example of Simple Bayesian Network
[Figure: a simple Bayesian network X -> Y with given P(X) and P(Y|X); P(Y) is to be computed.]

Conditional (in)dependence rule: P(X_1, X_2, ..., X_n) = Π_{i=1}^{n} P(X_i | Parents(X_i))

Joint probability rule: P(Y = y_j, X = x_i) = P(X = x_i) · P(Y = y_j | X = x_i)

Marginalization rule: P(Y = y_j) = Σ_i P(X = x_i) · P(Y = y_j | X = x_i)

Bayesian rule: P(X = x_i | Y = y_j) = P(X = x_i) · P(Y = y_j | X = x_i) / P(Y = y_j)
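A minimal Python sketch of these rules for the network X -> Y (the numbers are illustrative):

```python
# Given P(X) and P(Y|X), compute P(Y) by marginalization and invert
# the edge with the Bayesian rule.

import numpy as np

P_X = np.array([0.3, 0.7])                 # P(X = x_i)
P_Y_given_X = np.array([[0.9, 0.1],        # rows: x_i, columns: y_j
                        [0.2, 0.8]])

P_Y = P_X @ P_Y_given_X                            # marginalization rule
P_X_given_Y = (P_X[:, None] * P_Y_given_X) / P_Y   # Bayesian rule, column j = P(X|Y=y_j)
print(P_Y)                                 # [0.41 0.59]
print(P_X_given_Y[:, 0])                   # P(X | Y = y_0)
```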
Contextual and Predictive Attributes
[Figure: a machine and its environment observed through sensors. The machine-related attributes (e.g. emission) x1, x2, x3 are predictive attributes; the environment attributes (air pressure, dust, humidity, temperature) x4, x5, x6, x7 are contextual attributes.]
Contextual Effect on Conditional
Probability
[Figure: predictive attributes x1-x4 and contextual attributes xk, xt, xr.]

Assume conditional dependence between predictive attributes (a causal relation between physical quantities)... some contextual attribute may directly affect the conditional dependence between predictive attributes, but not the attributes themselves.
Contextual Effect on Conditional
Probability
• X = {x1, x2, ..., xn} - predictive attribute with n values;
• Z = {z1, z2, ..., zq} - contextual attribute with q values;
• P(Y|X) = {p1(Y|X), p2(Y|X), ..., pr(Y|X)} - conditional dependence attribute (random variable) between X and Y with r possible values;
• P(P(Y|X)|Z) - conditional dependence between attribute Z and attribute P(Y|X);

P(Y = y_j) = Σ_{k=1}^{r} Σ_{i=1}^{n} { p_k(Y = y_j | X = x_i) · P(X = x_i) · Σ_{m=1}^{q} [ P(Z = z_m) · P(P(Y|X) = p_k(Y|X) | Z = z_m) ] }
Contextual Effect on Unconditional
Probability
[Figure: predictive attributes x1-x5 and contextual attributes xk, xt; the context controls the distribution P(X).]

Assume some predictive attribute is a random variable with an appropriate probability distribution for its values... some contextual attribute may directly affect the probability distribution of the predictive attribute.
Contextual Effect on Unconditional Probability
• X = {x1, x2, ..., xn} - predictive attribute with n values;
• Z = {z1, z2, ..., zq} - contextual attribute with q values, and P(Z) - probability distribution for the values of Z;
• P(X) = {p1(X), p2(X), ..., pr(X)} - probability distribution attribute for X (random variable) with r possible values (different possible probability distributions for X), and P(P(X)) is the probability distribution over the values of attribute P(X);
• P(Y|X) is a conditional probability distribution of Y given X;
• P(P(X)|Z) is a conditional probability distribution for attribute P(X) given Z.

P(Y = y_j) = Σ_{k=1}^{r} Σ_{i=1}^{n} { P(Y = y_j | X = x_i) · p_k(X = x_i) · Σ_{m=1}^{q} [ P(Z = z_m) · P(P(X) = p_k(X) | Z = z_m) ] }
Bayesian Metanetworks for Advanced Diagnostics
Two-level Bayesian Metanetwork for
managing conditional dependencies
[Figure: a two-level Bayesian metanetwork. The contextual level contains nodes A, Q, B, S, R; the predictive level contains X -> Y and related nodes; contextual nodes control the conditional dependencies of the predictive level.]
2-level Bayesian Metanetwork for
modelling relevant features’ selection
[Figure: the contextual level controls which features are relevant at the predictive level.]
Terziyan V., Vitko O., Probabilistic Metanetworks for Intelligent Data Analysis, Artificial Intelligence,
Donetsk Institute of Artificial Intelligence, Vol. 3, 2002, pp. 188-197.
Terziyan V., Vitko O., Bayesian Metanetwork for Modelling User Preferences in Mobile Environment, In:
German Conference on Artificial Intelligence (KI-2003), Hamburg, Germany, September 15-18, 2003.
Two-level Bayesian Metanetwork for
managing conditional dependencies
[Figure: the contextual-level arc P(B|A) controls the predictive-level arc P(Y|X): nodes A, B at the contextual level, X, Y at the predictive level.]
Causal Relation between Conditional
Probabilities
[Figure: conditional probability P(Xn|Xm) with possible values P1(Xn|Xm), P2(Xn|Xm), P3(Xn|Xm), and conditional probability P(Xr|Xk) with possible values P1(Xr|Xk), P2(Xr|Xk), related by P(P(Xr|Xk) | P(Xn|Xm)).]

There might be a causal relationship between two pairs of conditional probabilities.
Example of Bayesian Metanetwork
The nodes of the 2nd-level network correspond to the conditional probabilities of the 1st-level network, P(B|A) and P(Y|X). The arc in the 2nd-level network corresponds to the conditional probability P(P(Y|X) | P(B|A)):

P(Y = y_j) = Σ_k { Σ_i [ p_k(Y = y_j | X = x_i) · P(X = x_i) ] · Σ_r [ P(P(Y|X) = p_k(Y|X) | P(B|A) = p_r(B|A)) · P(P(B|A) = p_r(B|A)) ] }
Other Cases of Bayesian Metanetwork (1)
a) [Figure: contextual level with P(A); predictive level with P(X).]

Unconditional probability distributions associated with nodes of the predictive-level network depend on probability distributions associated with nodes of the contextual-level network.
Other Cases of Bayesian Metanetwork (2)
c) [Figure: contextual level with P(A); predictive level with P(Y|X).]

The metanetwork on the contextual level models conditional dependence, in particular between unconditional and conditional probabilities of the predictive level.
Other Cases of Bayesian Metanetwork (3)
e) [Figure: contextual level with P(A) and P(B); predictive level with X -> Y.]

The combination of cases 1 and 2.
2-level Relevance Bayesian Metanetwork
(for modelling relevant features’ selection)
[Figure: a contextual level controls the selection of relevant features at the predictive level.]
Simple Relevance Bayesian Metanetwork
We consider relevance as a probability of importance of the variable to the inference of the target attribute in the given context. In such a definition, relevance inherits all the properties of a probability.

[Figure: with probability P(ψ(X) = "yes") = ψ_X the model X -> Y with P(X) and P(Y|X) holds; with probability P(ψ(X) = "no") = 1 - ψ_X the prior model P0(Y) holds.]

P(Y) = (1/n_x) · Σ_X P(Y|X) · [ n_x · ψ_X · P(X) + (1 - ψ_X) ]
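A minimal Python sketch of this formula (assuming, for illustration, that the prior P0(Y) is the marginal of P(Y|X) under a uniform X, which is what the 1/n_x term corresponds to):

```python
# With probability psi_X the edge X -> Y is used; otherwise Y follows
# the prior P0(Y).

import numpy as np

psi_X = 0.7                                # P(psi(X) = "yes")
P_X = np.array([0.4, 0.6])
P_Y_given_X = np.array([[0.9, 0.1],
                        [0.3, 0.7]])
n_x = len(P_X)

P0_Y = P_Y_given_X.mean(axis=0)            # assumed prior: uniform-X marginal
P_Y = psi_X * (P_X @ P_Y_given_X) + (1 - psi_X) * P0_Y
# equivalently, the slide's form:
P_Y_slide = (1.0 / n_x) * sum(P_Y_given_X[i] * (n_x * psi_X * P_X[i] + (1 - psi_X))
                              for i in range(n_x))
assert np.allclose(P_Y, P_Y_slide)
print(P_Y)                                 # [0.558 0.442]
```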
Example of 2-level Relevance Bayesian
Metanetwork
In a relevance network the relevancies are considered as random variables, between which conditional dependencies can be learned:

P(Y) = (1/n_x) · Σ_X Σ_{ψ_A} { P(Y|X) · [ n_x · P(X) · P(ψ_X | ψ_A) · P(ψ_A) + (1 - ψ_X) ] }
More Complicated Case of
Managing Relevance (1)
With two predictive attributes X and Z and relevancies ψ_X, ψ_Z, four models are possible:
1. Both X and Z relevant, with probability P(ψ(X)="yes") × P(ψ(Z)="yes") = ψ_X · ψ_Z: the model P(Y|X,Z) with P(X) and P(Z);
2. only X relevant, with probability P(ψ(X)="yes") × P(ψ(Z)="no") = ψ_X · (1 - ψ_Z): the model P(Y|X) with P(X);
3. only Z relevant, with probability P(ψ(X)="no") × P(ψ(Z)="yes") = (1 - ψ_X) · ψ_Z: the model P(Y|Z) with P(Z);
4. neither relevant, with probability P(ψ(X)="no") × P(ψ(Z)="no") = (1 - ψ_X) · (1 - ψ_Z): the prior model P0(Y).
More Complicated Case of
Managing Relevance (2)
P(Y) = ψ_X · ψ_Z · Σ_{i=1}^{n_x} Σ_{k=1}^{n_z} P(Y | X = x_i, Z = z_k) · P(X = x_i) · P(Z = z_k)
 + ψ_X · (1 - ψ_Z) · (1/n_z) · Σ_{i=1}^{n_x} Σ_{k=1}^{n_z} P(Y | X = x_i, Z = z_k) · P(X = x_i)
 + (1 - ψ_X) · ψ_Z · (1/n_x) · Σ_{i=1}^{n_x} Σ_{k=1}^{n_z} P(Y | X = x_i, Z = z_k) · P(Z = z_k)
 + (1 - ψ_X) · (1 - ψ_Z) · (1/(n_x · n_z)) · Σ_{i=1}^{n_x} Σ_{k=1}^{n_z} P(Y | X = x_i, Z = z_k)
General Case of Managing Relevance (1)
Predictive attributes:
X1 with values {x11,x12,…,x1nx1};
X2 with values {x21,x22,…,x2nx2};
…
XN with values {xn1,xn2,…,xnnxn};
Target attribute:
Y with values {y1,y2,…,yny}.
Probabilities:
P(X1), P(X2),…, P(XN);
P(Y|X1,X2,…,XN).
Relevancies:
ψ_X1 = P(ψ(X1) = "yes");
ψ_X2 = P(ψ(X2) = "yes");
...
ψ_XN = P(ψ(XN) = "yes").
Goal: to estimate P(Y).
General Case of Managing Relevance (2)
P(Y) = (1 / Π_{s=1}^{N} n_{xs}) · Σ_{X1} Σ_{X2} ... Σ_{XN} [ P(Y | X1, X2, ..., XN) · Π_{r: ψ(Xr)="yes"} ( n_{xr} · ψ_{Xr} · P(Xr) ) · Π_{q: ψ(Xq)="no"} ( 1 - ψ_{Xq} ) ]
Example of Relevance Metanetwork
[Figure: (a) the predictive level with nodes A, X, Q, B, Y, S, R; (b, c) the relevance level that controls which predictive-level attributes participate in the inference.]
Combined Bayesian Metanetwork
[Figure: contextual levels A and B above the predictive level.]

In a combined metanetwork, two controlling (contextual) levels affect the basic level.
Learning Bayesian Metanetworks from Data
• Learning the Bayesian metanetwork structure (conditional, contextual and relevance (in)dependencies at each level);
• learning the Bayesian metanetwork parameters (conditional and unconditional probabilities and relevancies at each level).
Vitko O., Multilevel Probabilistic Networks for Modelling Complex
Information Systems under Uncertainty, Ph.D. Thesis, Kharkov National
University of Radioelectronics, June 2003. Supervisor: Terziyan V.
When Bayesian Metanetworks ?
1. A Bayesian metanetwork can be considered a very powerful tool in cases where the structure (or strengths) of the causal relationships between the observed parameters of an object essentially depends on context (e.g. external environment parameters);
2. it can also be considered a useful model for an object whose diagnosis depends on a different set of observed parameters depending on the context.
Vagan Terziyan
Vladimir Ryabov
Temporal Diagnostics of Field Devices
• The approach to temporal diagnostics uses the algebra of uncertain temporal relations*.
• Uncertain temporal relations are formalized using a probabilistic representation.
• Relational networks are composed of uncertain relations between some events (a set of symptoms).
• A number of relational networks can be combined into a temporal scenario describing some particular course of events (a diagnosis).
• In the future, a newly composed relational network can be compared with the existing temporal scenarios, and the probabilities of belonging to each particular scenario are derived.
* Ryabov V., Puuronen S., Terziyan V., Representation and Reasoning with
Uncertain Temporal Relations, In: A. Kumar and I. Russel (Eds.), Proceedings of
the Twelfth International Florida AI Research Society Conference - FLAIRS-99,
AAAI Press, California, 1999, pp. 449-453.
Conceptual Schema for Temporal Diagnostics
[Figure: left, generating temporal scenarios: relational networks N1-N5 are combined into a scenario S; right, recognizing temporal scenarios: distances D_{N,S1}, D_{N,S2}, ..., D_{N,Sn} from a new network N to the scenarios S1, S2, ..., Sn.]

• We compose a temporal scenario by combining a number of relational networks consisting of the same set of symptoms and possibly different temporal relations between them.
• We estimate the probability that a particular relational network belongs to the known temporal scenarios.
Terziyan V., Ryabov V., Abstract Diagnostics Based on Uncertain Temporal
Scenarios, International Conference on Computational Intelligence for Modelling
Control and Automation CIMCA’2003, Vienna, Austria, 12-14 February 2003, 6 pp.
Industrial Temporal Diagnostics
(conceptual schema)
[Figure: temporal data from an industrial object passes through estimation into a relational network; the network is recognized against a DB of scenarios (filled by learning) to produce a diagnosis.]
Ryabov V., Terziyan V., Industrial Diagnostics Using Algebra of Uncertain
Temporal Relations, IASTED International Conference on Artificial Intelligence
and Applications, Innsbruck, Austria, 10-13 February 2003, 6 pp.
Imperfect Relation Between Temporal
Point Events: Definition
[Diagram: Event 1 and Event 2 connected by the relation <a1; a2; a3>.]

<a1; a2; a3> - an imperfect temporal relation between temporal points (Event 1 and Event 2):
• P(event 1, before, event 2) = a1;
• P(event 1, same time, event 2) = a2;
• P(event 1, after, event 2) = a3.
Ryabov V., Handling Imperfect Temporal Relations, Ph.D. Thesis, University of
Jyvaskyla, December 2002. Supervisors: Puuronen S., Terziyan V.
Example of Imperfect Relation

<0.5; 0.2; 0.3> - an imperfect temporal relation between temporal points Event 1 and Event 2:
• P(event 1, before, event 2) = 0.5;
• P(event 1, same time, event 2) = 0.2;
• P(event 1, after, event 2) = 0.3.

[Figure: the relation R(Event 1, Event 2) shown as a histogram over the alternatives <, =, >.]
Operations for Reasoning with
Temporal Relations
• Inversion: r_{b,a} = ~r_{a,b}
• Composition: r_{a,c} = r_{a,b} ∘ r_{b,c}
• Sum: r_{a,b} = r1_{a,b} ⊕ r2_{a,b}
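A minimal Python sketch of the point-relation representation (our encoding; averaging as the sum operation is an assumption):

```python
# Uncertain point relations as probability vectors
# <P(before), P(same time), P(after)>.

def invert(r):
    before, same, after = r
    return (after, same, before)          # r_ba = ~r_ab

def rel_sum(r1, r2):
    # Combine two opinions about the same pair of events
    # (simple averaging is our assumed normalization).
    return tuple((p + q) / 2.0 for p, q in zip(r1, r2))

r_ab = (0.5, 0.2, 0.3)
print(invert(r_ab))                       # (0.3, 0.2, 0.5)
print(rel_sum(r_ab, (0.7, 0.1, 0.2)))     # (0.6, 0.15, 0.25)
```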
Temporal Interval Relations

The basic interval relations are the thirteen Allen relations:
• A before (b) B / B after (bi) A
• A meets (m) B / B met-by (mi) A
• A overlaps (o) B / B overlapped-by (oi) A
• A starts (s) B / B started-by (si) A
• A during (d) B / B contains (di) A
• A finishes (f) B / B finished-by (fi) A
• A equals (eq) B / B equals A
Imperfect Relation Between
Temporal Intervals: Definition
[Diagram: interval 1 and interval 2 connected by the relation <a1; a2; ...; a13>.]

<a1; a2; ...; a13> - an imperfect temporal relation between temporal intervals (interval 1 and interval 2):
• P(interval 1, before, interval 2) = a1;
• P(interval 1, meets, interval 2) = a2;
• P(interval 1, overlaps, interval 2) = a3;
...
• P(interval 1, equals, interval 2) = a13.
Industrial Temporal Diagnostics
(composing a network of relations)
[Figure: temporal relations between symptoms reported by sensors 1-3 of an industrial object are estimated into a relational network representing the particular case.]
Industrial Temporal Diagnostics
(generating temporal scenarios)
[Figure: relational networks N1, N2, N3 from objects A, B, C are combined into the temporal scenario S for "Failure X", which is stored in the DB of scenarios.]

1. for i = 1 to n do
2.   for j = i + 1 to n do
3.     if (R1) or ... or (Rk) then
4.     begin
5.       for g = 1 to n do
6.         if not (Rg) then Reasoning(..., Rg)
7.         // if "Reasoning" = False then (Rg) = TUR
8.       (R) = ⊕(Rt), where t = 1, ..., k
9.     end
10.    else go to line 2
Recognition of Temporal Scenario
[Figure: the recognition pipeline: temporal data -> estimation -> relational network -> recognition against the DB of scenarios (filled by learning) -> diagnosis.]

The distance between a relational network N and a scenario S is the weighted average of the distances between their corresponding relations:

D_{N,S} = ( Σ_{i=1}^{m} w_i · d_i ) / ( Σ_{i=1}^{m} w_i )

The distance between two imperfect interval relations is the distance between their balance points:

d(R_{A,B}, R_{C,D}) = | Bal(R_{A,B}) - Bal(R_{C,D}) |

Bal(R_{A,B}) = (1/12) · Σ_{i=0}^{12} i · e^{i+1}_{A,B}

where e^{i+1}_{A,B} is the probability of the (i+1)-th Allen alternative in the order b, o, m, fi, di, si, eq, d, s, f, oi, mi, bi; this gives the weights w_b = 0, w_eq = 0.5, w_f = 0.75, w_bi = 1 shown on the slide.
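A minimal Python sketch of the balance-point distance (the Allen-relation ordering and weights follow the slide):

```python
# Each of the 13 Allen alternatives (0-based index i in the order
# below) gets the weight i/12; Bal is the weighted probability mass.

ALLEN = ["b", "o", "m", "fi", "di", "si", "eq", "d", "s", "f", "oi", "mi", "bi"]
W = {name: i / 12.0 for i, name in enumerate(ALLEN)}   # w_b=0, w_eq=0.5, w_bi=1

def bal(r):
    """r: dict mapping Allen relation names to probabilities (sum = 1)."""
    return sum(W[name] * p for name, p in r.items())

def d(r1, r2):
    """Distance between two imperfect relations: |Bal(r1) - Bal(r2)|."""
    return abs(bal(r1) - bal(r2))

print(d({"b": 0.5, "m": 0.5}, {"eq": 1.0}))   # ~0.417
```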
When Temporal Diagnostics?
1. Temporal diagnostics considers not only a static set of symptoms, but also the time during which they were monitored. This often allows a broader view of the situation, and sometimes only considering the temporal relations between different symptoms can give us a hint towards a precise diagnosis;
2. this approach might be useful, for example, in cases when the appropriate causal relationships between events (symptoms) are not yet known and the only thing available for study is the temporal relationships;
3. a combination of Bayesian (based on probabilistic causal knowledge) and temporal diagnostics would be quite a powerful diagnostic tool.
Vagan
Terziyan
Terziyan V., Dynamic Integration of Virtual Predictors, In: L.I. Kuncheva, F. Steimann,
C. Haefke, M. Aladjem, V. Novak (Eds), Proceedings of the International ICSC Congress
on Computational Intelligence: Methods and Applications - CIMA'2001, Bangor, Wales,
UK, June 19 - 22, 2001, ICSC Academic Press, Canada/The Netherlands, pp. 463-469.
The Problem
During the past several years, in a variety of
application domains, researchers in machine
learning, computational learning theory, pattern
recognition and statistics have tried to combine
efforts to learn how to create and combine an
ensemble of classifiers.
The primary goal of combining several classifiers is to
obtain a more accurate prediction than can be
obtained from any single classifier alone.
Approaches to Integrate Multiple
Classifiers
[Diagram: approaches to integrating multiple classifiers. Selection is either global (static) or local (dynamic); combination is either global (voting-type) or local (the "virtual" classifier); decontextualization is a third approach.]
Inductive learning with integration of predictors

[Figure: sample instances ⟨x_r1, x_r2, ..., x_rm⟩ -> y_r are used by the learning environment to train the predictors/classifiers P1, P2, ..., Pn; for a new instance ⟨x_t1, x_t2, ..., x_tm⟩ their integrated prediction is y_t.]
Virtual Classifier
A Virtual Classifier is a group of seven cooperative agents: the constant team members {TC, TM, TP, TI} (Team Instructors) and the elective team members {FS, DE, CL} (the Classification Team).

Team Instructors:
• TC - Team Collector
• TM - Training Manager
• TP - Team Predictor
• TI - Team Integrator

Classification Team:
• FS - Feature Selector
• DE - Distance Evaluator
• CL - Classification Processor
Classification Team:
Feature Selector
Feature Selector (FS) finds the minimally sized feature subset that is sufficient for correct classification of the instance:

sample instances ⟨X_r⟩ -> y_r become ⟨X'_r⟩ -> y_r, where X'_r ⊆ X_r.
Classification Team:
Distance Evaluator
Distance Evaluator (DE) measures the distance between instances based on their numerical or nominal attribute values:

⟨x_i1, x_i2, ..., x_im⟩, ⟨x_j1, x_j2, ..., x_jm⟩ -> Distance Evaluator -> d_ij

Distance between two instances with heterogeneous attributes (example):

D(X, Y) = sqrt( Σ_{i: x_i ∈ X, y_i ∈ Y} d(x_i, y_i)² )

where, if the i-th attribute is nominal, d(x_i, y_i) = 0 if x_i = y_i and 1 otherwise; else d(x_i, y_i) = |x_i - y_i| / range_i.

d("red", "yellow") = 1
d(15°, 25°) = 10° / ((+50°) - (-50°)) = 0.1
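A minimal Python sketch of this heterogeneous distance (the attribute ranges are assumed for illustration):

```python
# Nominal attributes contribute 0/1; numeric attributes contribute a
# range-normalized difference.

import math

def d(x, y, attr_range=None):
    if attr_range is None:                    # nominal attribute
        return 0.0 if x == y else 1.0
    return abs(x - y) / attr_range            # numeric attribute

def distance(X, Y, ranges):
    """ranges[i] is None for a nominal attribute, else the value range."""
    return math.sqrt(sum(d(x, y, r) ** 2 for x, y, r in zip(X, Y, ranges)))

print(d("red", "yellow"))                     # 1.0
print(d(15, 25, attr_range=100))              # 0.1 (range -50..+50)
print(distance(("red", 15), ("yellow", 25), [None, 100]))  # ~1.005
```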
Classification Team:
Classification Processor
Classification Processor (CL) predicts the class for a new instance based on its selected features and its location relative to the sample instances:

[Figure: a new instance ⟨x_i1, x_i2, ..., x_im⟩ and the sample instances pass through the Feature Selector and the Distance Evaluator into the Classification Processor, which outputs the class y_i.]
Team Instructors:
Team Collector
Team Collector (TC) completes classification teams for future training: it assembles every team ⟨FS_i, DE_j, CL_k⟩ from the available feature selection methods, distance evaluation functions and classification rules.
Team Instructors:
Training Manager
Training Manager (TM) trains all the completed classification teams ⟨FS, DE, CL⟩ on the sample instances ⟨x_r1, x_r2, ..., x_rm⟩ -> y_r, producing sample metadata ⟨x_r1, x_r2, ..., x_rm⟩ -> ⟨w_r1, w_r2, ..., w_rn⟩: the weight of each of the n classification teams on each sample instance.
Team Instructors:
Team Predictor
Team Predictor (TP) predicts the weights ⟨w_i1, w_i2, ..., w_in⟩ of every classification team at a certain location ⟨x_i1, x_i2, ..., x_im⟩, e.g. with a WNN algorithm over the sample metadata ⟨x_r1, x_r2, ..., x_rm⟩ -> ⟨w_r1, w_r2, ..., w_rn⟩.
Team Prediction:
Locality assumption
Each team has certain subdomains in the space of instance attributes where it is more reliable than the others. This assumption is supported by the experience that classifiers usually work well not only at certain points of the domain space, but in certain subareas of it [Quinlan, 1993]. If a team does not work well with the instances near a new instance, then it is quite probable that it will not work well with this new instance either.
Team Instructors:
Team Integrator
Team Integrator (TI) produces the classification result for a new instance by integrating the appropriate outcomes of the learned teams: given the weights ⟨w_t1, w_t2, ..., w_tn⟩ of the classification teams in the location of the new instance ⟨x_t1, x_t2, ..., x_tm⟩, it integrates the teams' predictions y_t1, y_t2, ..., y_tn into the final class y_t.
Static Selection of a Classifier
• Static selection means that we try all the teams on a sample set and, for further classification, select the one which achieved the best classification accuracy on the whole sample set. Thus we select a team only once and then use it to classify all new domain instances.
Dynamic Selection of a Classifier
• Dynamic selection means that the team is selected for every new instance separately, depending on where this instance is located. If it has been predicted that a certain team can classify this new instance better than the other teams, then this team is used to classify it. In such a case we say that the new instance belongs to the "competence area" of that classification team.
Conclusion
• Knowledge discovery with an ensemble of classifiers is known to be more accurate than with any single classifier alone [e.g. Dietterich, 1997].
• If a classifier somehow consists of a certain feature selection algorithm, distance evaluation function and classification rule, then why not consider these parts also as ensembles, making the classifier itself more flexible?
• We expect that classification teams completed from different feature selection, distance evaluation, and classification methods will be more accurate than any ensemble of known classifiers alone, and we focus our research and implementation on this assumption.
Yevgeniy Bodyanskiy
Volodymyr Kushnaryov
Online Stochastic Faults’ Prediction
Control Systems Research Laboratory,
AI Department, Kharkov National University of Radioelectronics. Head: Prof. E. Bodyanskiy. The laboratory carries out research on the development of mathematical and algorithmic support of systems for control, diagnostics, forecasting and emulation:

1. neural network architectures and real-time algorithms for observation and sensor data processing (smoothing, filtering, prediction) under substantial uncertainty conditions;
2. neural networks in polyharmonic sequence analysis with unknown non-stationary parameters;
3. analysis of chaotic time series; adaptive algorithms and neural network architectures for early fault detection and diagnostics of stochastic processes;
4. adaptive multivariable predictive control algorithms for stochastic systems under various types of constraints;
5. adaptive neuro-fuzzy control of non-stationary nonlinear systems;
6. adaptive forecasting of non-stationary nonlinear time series by means of neuro-fuzzy networks;
7. fast real-time adaptive learning procedures for various types of neural and neuro-fuzzy networks.

Selected publications:
Bodyanskiy Y., Vorobyov S., Recurrent Neural Network Detecting Changes in the Properties of Non-Linear Stochastic Sequences, Automation and Remote Control, V. 1, No. 7, 2000, pp. 1113-1124.
Bodyanskiy Y., Vorobyov S., Cichocki A., Adaptive Noise Cancellation for Multi-Sensory Signals, Fluctuation and Noise Letters, V. 1, No. 1, 2001, pp. 12-23.
Bodyanskiy Y., Kolodyazhniy V., Stephan A., An Adaptive Learning Algorithm for a Neuro-Fuzzy Network, In: B. Reusch (ed.), Computational Intelligence. Theory and Applications, Berlin-Heidelberg-New York: Springer, 2001, pp. 68-75.
Existing Tools
Most existing (neuro-)fuzzy systems used for fault diagnosis or classification are based on offline learning with the use of genetic algorithms or modifications of error backpropagation. When the number of features and possible fault situations is large, tuning of the classifying system becomes very time-consuming. Moreover, such systems perform very poorly in high dimensions of the input space, so special modifications of the known architectures are required.
Neuro-Fuzzy Fault Diagnostics
Successful application of the neuro-fuzzy synergism to
fault diagnosis of complex systems demands development
of an online diagnosing system that quickly learns from
examples even with a large amount of data, and maintains
high processing speed and high classification accuracy
when the number of features is large as well.
Challenge: Growing (Learning) Probabilistic
Neuro-Fuzzy Network (1)
[Figure: network architecture: input layer (n inputs), 1st hidden layer (N neurons), 2nd hidden layer (m+1 elements), output layer (m divisors).]
Bodyanskiy Ye., Gorshkov Ye., Kolodyazhniy V., Wernstedt J., Probabilistic Neuro-Fuzzy Network with
Non-Conventional Activation Functions, In: Knowledge-Based Intelligent Information & Engineering
Systems, Proceedings of Seventh International Conference KES’2003, 3–5 September, Oxford, United
Kingdom, LNAI, Springer-Verlag, 2003.
Bodyanskiy Ye., Gorshkov Ye., Kolodyazhniy V. Resource-Allocating Probabilistic Neuro-Fuzzy Network,
In: Proceedings of International Conference on Fuzzy Logic and Technology, 10–12 September, Zittau,
Germany, 2003.
Challenge: Growing (Learning) Probabilistic
Neuro-Fuzzy Network (2)
Tested on real data in comparison with a classical probabilistic neural network.

A unique combination of features:
• implements fuzzy reasoning and classification (fuzzy classification network);
• creates neurons automatically based on the training set (growing network);
• learns the free parameters of the network based on the training set (learning network);
• guarantees high precision of classification based on fast learning (high-performance network);
• able to perform with huge volumes of data with limited computational resources (powerful and economical network);
• able to work in real time (real-time network).
Tests for Neuro-Fuzzy Algorithms
The Industrial Ontologies Group (Kharkov Branch), the Data Mining Research Group and the Control Systems Research Laboratory of the Artificial Intelligence Department of Kharkov National University of Radioelectronics have essential theoretical and practical experience in implementing the neuro-fuzzy approach, and specifically Real-Time Probabilistic Neuro-Fuzzy Systems for Simulation, Modeling, Forecasting, Diagnostics, Clustering, and Control. We are interested in cooperation with Metso in that area, and we are ready to demonstrate the performance of our algorithms on real data taken from any of Metso's products, to compare our algorithms with those existing in Metso.
Inventions we can offer (1)
• A method of intelligent preventive or predictive diagnostics and forecasting of the technical condition of industrial equipment, machines, devices, systems, etc. in real time, based on the analysis of non-stationary stochastic signals (e.g. from sensors of temperature, pressure, current, shifting, frequency, energy consumption, and other parameters with threshold values).
• The method is based on advanced data mining techniques, which utilize fuzzy-neuro technologies, and differs from existing tools by a flexible self-organizing network structure and by the optimization of computational resources while learning.
Inventions we can offer (2)
• A method of intelligent real-time preventive or predictive diagnostics and forecasting of the technical condition of industrial equipment, machines, devices, systems, etc., based on the analysis of signals with non-stationary and non-multiplied periodical components (e.g. from sensors of vibration, noise, frequencies of rotation, current, voltage, etc.).
• The method is based on the optimization of computational resources while learning, through intelligent reduction of the number of signal components being analyzed.
Inventions we can offer (3)
• A method and mechanism for optimal control of the dosage and real-time infusion of anti-wear oil additives into industrial machines, based on their real-time condition monitoring.
Summary of problems we can solve
• A rather global system for condition monitoring and preventive maintenance based on OntoServ.Net technologies (global, agent-based, ontology-based, Semantic Web services-based, semantic P2P search-based) and on modern and advanced data-mining methods and tools, with knowledge creation, warehousing, and updating not only during the device's lifetime, but also utilizing (for various maintenance needs) knowledge obtained afterwards (by various testing and investigation techniques, other than information taken from the "living" device's sensors) from broken-down, worn out or aged components of the same type.
Recently Performed Case Studies (1)
• Ontology Development for Gas Compressing Equipment Diagnostics Realized by Neural Networks
• Available in: http://www.cs.jyu.fi/ai/OntoGroup/docs/July2003.pdf

Semen Simkin
Using NN and Ontology for Diagnostics

[Figure: sensor signals feed a neural network for training and diagnosing; a program creates ontology class instances from the diagnostic output.]

The subclasses and their slots are formed, and the instances are filled with information, automatically by a Java program. The filling is done from an Oracle RDBMS, which contains the actualized base used at "UkrTransGas".
Volodymyr
Kushnaryov
Recently Performed Case Studies (2)

Konstantin Tatarnikov
• The Use of Ontologies for Faults and State Description of Gas-Transfer Units
• Available in: http://www.cs.jyu.fi/ai/OntoGroup/docs/July2003.pdf
[Figure: an ontology for gas-transfer units (GTU): signal types (trend, analog signal, launch, shutdown, compute variable), parameters, maintenance actions and repairs (current, planned, mid-life, major), situations (axle shear, vibration, oil-temperature deviation, rise of temperature), GTU state, GTU nodes and support history; the SCADA systems of a compressor station feed diagnosing agents that communicate through the ontology.]

Volodymyr Kushnaryov
Conclusion
• The Industrial Ontologies Research Group (University of Jyvaskyla), which is piloting the OntoServ.Net concept of a global Semantic Web based system for industrial maintenance, also has powerful branches in Kharkov (e.g. the IOG Kharkov Branch, the Control Systems Research Laboratory, the Data Mining Research Group, etc.), with experts and experience in various challenging methods of data mining and knowledge discovery, online diagnostics, forecasting and control, model learning and integration, etc., which can reasonably be utilized within the ongoing cooperation between Metso and the Industrial Ontologies Group.
Find out about our recent activities at:
• http://www.cs.jyu.fi/ai/OntoGroup/projects.htm
• http://www.cs.jyu.fi/ai/vagan/papers.html