Discovering Robust Knowledge from Databases that Change
Discovering Robust Knowledge from Databases that Change
Chun-Nan Hsu
Craig A. Knoblock
Arizona State University
University of Southern California
Data Mining and Knowledge Discovery, Volume 2, 1998
Presented & Edited by: Danielle Steimke
CS332/295 Data Mining – Spring 2014
Outline
Introduction
Preliminaries
Robustness of Knowledge
Applying Robustness in Knowledge Discovery
Conclusions
Introduction: Importance
Real-world databases are dynamic.
Changing information (updates, deletions) may make currently held rules incorrect.
How can we detect, check, and update these inconsistent rules without high maintenance costs?
Use robustness as a measure of how likely knowledge is to remain consistent after database changes.
Introduction: BASIL
The BASIL rule discovery system discovers semantic rules that optimize query performance.
Many discovered rules may become invalid after database changes.
Robustness estimation guides the data mining via a rule pruning approach:
Search for the pruning pattern that yields highly robust rules.
Introduction: Applications of Robustness Estimation
Minimize the maintenance cost of inconsistent rules in the presence of database changes.
When an inconsistent rule is detected, a system may:
remove the rule, which requires repeatedly invoking the discovery system
repair the rule, which must be done frequently as the data changes
Apply robustness to guide the data mining and evaluation of semantic rules:
In the data mining stage, prune antecedents of a discovered rule to increase its robustness.
In the evaluation stage, eliminate rules with robustness values below a threshold.
Introduction: Robustness Comparison
Robustness vs. support:
Support - the probability that a data instance satisfies a rule.
Robustness - the probability that an entire database state is consistent with a rule.
Robustness vs. predictive accuracy:
Predictive accuracy - the probability that knowledge is consistent with randomly selected unseen data.
The difference is significant in databases that are interpreted under the closed-world assumption (CWA).
Introduction: Closed-world Assumption
Information that is not explicitly present in the database is taken to be false.
Closed-world databases are widely used in relational databases, deductive databases, and rule-based information systems because of:
The characteristics of application domains.
The limitations of the representation systems.
Databases change through deletions, insertions, and updates; in a closed-world database, any of these may affect the validity of a rule.
Because closed-world data tends to be dynamic, it is important for knowledge discovery systems to handle dynamic, closed-world data.
Important Definitions
Database state at time t: the collection of the instances present in the database at time t.
Database literals vs. built-in literals
Range rules vs. relational rules
Consistent rule: a rule is consistent with a given database state if all variable instantiations that satisfy the antecedents of the rule also satisfy its consequent.
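The consistency check can be performed directly from this definition. A minimal sketch, assuming a hypothetical employee relation and a hypothetical range rule (none of these names appear in the paper):

```python
# Hypothetical relation and range rule, for illustration only.
employee = [
    {"name": "ann", "dept": "eng", "salary": 70000},
    {"name": "bob", "dept": "eng", "salary": 65000},
    {"name": "cal", "dept": "hr",  "salary": 40000},
]

# Range rule: dept = 'eng'  ->  salary > 50000
antecedent = lambda row: row["dept"] == "eng"
consequent = lambda row: row["salary"] > 50000

# The rule is consistent with this database state iff every instantiation
# that satisfies the antecedent also satisfies the consequent.
is_consistent = all(consequent(r) for r in employee if antecedent(r))
print(is_consistent)  # True
```

Deleting cal's tuple would not affect consistency here, but an update lowering bob's salary below 50000 would make the rule inconsistent, which is exactly the kind of transaction robustness reasons about.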
First Definition of Robustness
Robust rule: does not become inconsistent
(invalid) after database changes
Definition 1: (Robustness for all states)
Given a rule r, let D be the event that a database
is in a state that is consistent with r.
Robust1(r) = Pr(D) = (# of database states consistent with r) / (# of all possible database states)
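Definition 1 can be computed literally for a toy universe by enumerating every possible database state. A minimal sketch, assuming a hypothetical universe of three candidate tuples and a hypothetical rule (not from the paper); it also shows why this definition does not scale, since the state count grows as 2^n:

```python
from itertools import chain, combinations

# Universe of possible tuples: (name, salary), for illustration only.
universe = [("ann", 70000), ("bob", 25000), ("cal", 50000)]

# Hypothetical rule r: every stored employee earns more than 30000.
def consistent(state):
    return all(salary > 30000 for _, salary in state)

# Every database state = every subset of the universe (2^3 = 8 states).
states = list(chain.from_iterable(
    combinations(universe, k) for k in range(len(universe) + 1)))

# Robust1(r): fraction of states consistent with r.
robust1 = sum(consistent(s) for s in states) / len(states)
print(robust1)  # 4 of 8 states exclude bob's tuple -> 0.5
```

Treating all 8 states as equally probable is exactly the first problem noted on the next slide.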
First Definition Continued
Two problems with this estimate:
It treats all database states as if they were equally probable.
The number of possible database states is enormous, even for small databases.
Second Definition of Robustness
Robustness for accessible states:
Given a rule r and a database in a state d in which r is consistent, let t denote the event of performing a transaction on d that results in a new database state inconsistent with r. The robustness of r in the states accessible from the current state d is defined as
Robust(r | d) = Pr(¬t | d) = 1 − Pr(t | d)
Definitions of Robustness (cont.)
If all transactions are equally probable, then Robust1(r) ≡ Robust(r | d).
The robustness of a rule can differ across database states.
E.g.: suppose d1 and d2 are two states of a given database. To reach a state inconsistent with r, we must delete 10 tuples in d1 but only 1 tuple in d2. Then
Robust(r | d1) > Robust(r | d2)
Estimate Robustness
Estimate the probabilities of data changes, rather than counting possible database states.
Estimate the robustness of a rule from the probability of the transactions that may invalidate it.
Decompose data-changing transactions and estimate their probabilities using the Laplace law of succession.
Estimate Robustness (cont.)
Laplace law of succession:
Given a repeatable experiment whose outcome falls into one of k classes, suppose we have conducted the experiment n times, r of which resulted in some outcome C in which we are interested. The probability that the outcome of the next experiment is C can be estimated as
(r + 1) / (n + k)
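The Laplace estimate is a one-line computation. A minimal sketch (the transaction counts below are assumed, not taken from the paper's experiments):

```python
def laplace_estimate(r, n, k):
    """Laplace law of succession: probability that the next outcome is C,
    after observing C in r of n trials over k possible outcome classes."""
    return (r + 1) / (n + k)

# E.g. 3 of 10 observed transactions were deletions, with k = 3 classes
# (insertion / deletion / update):
print(laplace_estimate(3, 10, 3))  # 4/13, about 0.3077

# With no observations at all (n = 0), every class gets 1/k,
# so the method still yields an estimate given only the schema:
print(laplace_estimate(0, 0, 3))
```

The zero-observation case is why, as the later slide notes, the method can produce estimates even when only database schemas are available.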
Decomposition of Transactions
To estimate Pr(t | d):
Use a Bayesian network model to decompose the transaction and estimate local probabilities first:
X1: type of transaction?
X2: on which relation?
X3: on which tuples?
X4: on which attributes?
X5: what new attribute value?
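The decomposition turns one hard-to-estimate probability into a chain of local conditional probabilities that multiply together. A minimal sketch, assuming a single invalidating transaction class and made-up local probabilities (the relation, attribute, and numbers are illustrative, not from the paper):

```python
# Local conditional probabilities for one hypothetical invalidating
# transaction, chained in the order X1 -> X5 from the slide.
local = {
    "X1: type = update":             0.30,  # Pr(X1)
    "X2: relation = employee":       0.50,  # Pr(X2 | X1)
    "X3: tuples with dept = 'eng'":  0.20,  # Pr(X3 | X1, X2)
    "X4: attribute = salary":        0.25,  # Pr(X4 | X1, X2, X3)
    "X5: new value <= 50000":        0.10,  # Pr(X5 | X1, ..., X4)
}

# Chain rule: Pr(t | d) is the product of the local probabilities.
pr_t_given_d = 1.0
for p in local.values():
    pr_t_given_d *= p

# Robust(r | d) = 1 - Pr(t | d), per the second definition of robustness.
robustness = 1.0 - pr_t_given_d
print(round(pr_t_given_d, 6))  # 0.00075
print(round(robustness, 6))    # 0.99925
```

In practice each local probability would itself come from a Laplace estimate over observed transactions, and Pr(t | d) would sum over all invalidating transaction classes rather than just one.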
Estimating Robustness
Estimation accuracy depends on the available information; however, even given only database schemas, the method can still produce estimates.
How can we derive the transactions that invalidate an arbitrary logic statement?
Most knowledge discovery systems place strong restrictions on the syntax of discovered knowledge.
Therefore the invalidating transactions can be manually generalized into a small set of transaction templates, along with templates of probability estimates for robustness estimation.
Applying Robustness in Knowledge Discovery
A rule pruning algorithm can increase the robustness and applicability of machine-discovered rules by pruning their antecedent literals.
Specification of the rule pruning problem:
Take a machine-generated rule as input, which is consistent with a database but potentially overly specific, and remove antecedent literals so that the rule remains consistent but becomes short and robust.
Basic idea of the pruning algorithm:
Search for a subset of antecedent literals to remove, until any further removal would yield an inconsistent rule.
However, the search space can be exponentially large in the number of literals in the rule.
A beam-search algorithm is presented to trim the search space.
Antecedent pruning algorithm
Pruning rule antecedents:
1. INPUT R = rules (initially the rule to be pruned), B = beam size;
2. LET O = results (initially empty);
3. WHILE R is not empty DO
4.   Move the first rule r in R to O;
5.   Prune r; LET R' = resulting rules;
6.   Remove visited, dangling, or inconsistent rules from R';
7.   Estimate and sort on the robustness of the rules in R';
8.   Retain the top B rules in R' and remove the rest;
9.   Merge the sorted R' into R in sorted order of robustness;
10. RETURN O;
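The loop above can be sketched in runnable form. A minimal sketch where a rule is a frozenset of antecedent literals and "prune" drops one literal at a time; the robustness and consistency functions below are toy stand-ins (BASIL would estimate robustness from transaction probabilities and check consistency against the database):

```python
def prune_rule(rule, beam_size, robustness, consistent):
    """Beam-search antecedent pruning, following the slide's steps."""
    R = [rule]       # open list, kept sorted by robustness (best first)
    O = []           # results
    seen = {rule}
    while R:                                   # step 3
        r = R.pop(0)                           # step 4: move first rule to O
        O.append(r)
        # step 5: prune r by dropping one antecedent literal per candidate
        candidates = [r - {lit} for lit in r]
        # step 6: remove visited or inconsistent candidates
        R2 = [c for c in candidates if c not in seen and consistent(c)]
        seen.update(R2)
        # steps 7-8: sort by estimated robustness, retain top B
        R2.sort(key=robustness, reverse=True)
        R2 = R2[:beam_size]
        # step 9: merge back, keeping R sorted by robustness
        R = sorted(R + R2, key=robustness, reverse=True)
    return O                                   # step 10

# Toy example: shorter rules are more robust, but the literal "a"
# is required for consistency.
rule = frozenset({"a", "b", "c"})
robustness = lambda r: 1.0 - 0.1 * len(r)
consistent = lambda r: "a" in r
results = prune_rule(rule, beam_size=2, robustness=robustness, consistent=consistent)
best = max(results, key=robustness)
print(sorted(best))  # ['a']
```

The beam size B caps how many partially pruned rules survive each expansion, which is what keeps the search from exploring all 2^n literal subsets.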
Pruning Algorithm
Beam-search algorithm:
Apply the robustness estimation approach to estimate the robustness of each partially pruned rule and guide the pruning search.
Two properties to optimize: robustness and length.
For each set of equally short rules, the algorithm searches for the rule that is as robust as possible while still being consistent.
Experimental Results
Experiment setup:
Use the rule discovery system BASIL to derive rules from two large ORACLE relational databases.
BASIL is part of the SIMS information mediator developed by the authors.
Synthesize 123 sample transactions to represent possible transactions on the experimental databases (27 updates, 29 deletions, and 67 insertions).
Steps of the experiments:
Train BASIL to discover a set of rules and estimate their robustness; this generates 355 rules.
Use another set of 202 sample transactions to assist the robustness estimation.
Apply the set of 123 transactions to the two relational databases and check the consistency of all 355 rules.
Conclusions
Formalized the notion of the robustness of a rule r in a given database state d.
Estimated the probabilities of rule-invalidating transactions:
Decomposes the probability of a transaction into local probabilities that can be estimated using the Laplace law.
No additional information needs to be provided for the estimation.
Applied the approach in a rule discovery system:
Prune the antecedents of a discovered rule so that the rule becomes highly robust and widely applicable.
Beam-search algorithm.
Conducted empirical experiments to demonstrate the algorithm.
Real-world Applications
Face recognition software:
The environment is the "data set" and is constantly changing.
Rules must resist variability in lighting and pose.
Systems prognostics:
The data set is continuously updated with diagnostic or calibration information.
Predictive failure rules are used to dictate system resource planning.
First Exam Question
Why is finding robust rules significant?
First Question - Answer
Real-world databases tend to be dynamic.
Changing information – updates and deletions, rather than just additions – could invalidate current rules.
Continually checking and updating rules may incur high maintenance costs, especially for large databases.
Robustness measures how likely the discovered knowledge is to remain consistent after database changes.
Second Exam Question
Compare and contrast robustness estimation with support and predictive accuracy.
Second Question - Answer
Robustness is the probability that an entire database state is consistent with a rule, while support is the probability that a specific data instance satisfies a rule.
Predictive accuracy is the probability that knowledge is consistent with randomly selected unseen data. The difference is significant in closed-world databases, where data tends to be dynamic.
Third Exam Question
Why are transaction classes mutually exclusive
when calculating robustness?
Third Question - Answer
Transaction classes are defined to be mutually exclusive so that no transaction class covers another, because for any two classes of transactions t_a and t_b, if t_a covers t_b, then Pr(t_a ∧ t_b) = Pr(t_a), and it is redundant to consider t_b.