form of the rules

Download Report

Transcript form of the rules

Rajesh
[email protected]
Join Xtreme Friends Group….
[email protected]
Constraint Based Association Rule Mining




Concepts
Metarule-Guided Rule mining
Constraint pushing
Types of rule constraints
 antimonotonic
 monotonic
 succinct
 convertible
 inconvertible
2
 Constraints ?
 Users expectation or intuition helps confine the search
space
 Forms of constraints
 Knowledge type constraints
 Data constraints
 Dimension / Level constraints
 Interestingness constraints
 Rule constraints
3
 Rule constraints ?
 Specify the ‘form of the rules’
 Rules take the form
 Rule Template / Meta Rule
 Set/subset relationships of attributes mined, aggregates etc.
 ‘Mining query optimizer’ must be incorporated in the
mining process to exploit the constraints specified
4
 Specifies the syntactic form of the rules, interested
 Syntactic forms serves as the constraint
 Based on analysts experience, expectation, or
intuition regarding data
 To analyze the customers traits leading to the
purchase of office software, meta rule will be
P1(X,Y) Λ P2(X,Z)  buys (X, ”office software”)
where P1,P2 are the predicates on customer X
5
 Data mining system searches for the rule of the
form that matches the meta rule given
 For ex. The rule generated matching the given
metarule is
age (X, “30..40”) Λ income (X, “30K..50K”)
 buys (X, “office software”)
6

Consider the template
P1 Λ P2 Λ … Λ Pl  Q1 Λ Q2 Λ… Λ Qr
Each Pi’s and Qj’s are predicates (instantiated / variables)
and l + r = p

To mine for the rules satisfying this template
Find all frequent p-predicate sets, Lp
2. Find support & confidence of Lp
1.
7
 Allows pushing constraints deep into mining
process to confine the search space, assuring the
completeness of the result as well
 Rule constraints specified as expected set/subset
relationship of the variables involved, aggregate
functions etc
 Can be used in conjunction with metarule-guided
mining
8
Look at the following scenario
A datawarehouse with
Fact table
:
sales (cust_name, item_name, TID)
Dimension Tables :
lives_in (cust_name, region, city)
item (item_name, region, city)
transaction (TID, day, month, year)
And the mining query
“Find the sales of which cheap items (price<100$) promote sales
of expensive items (price>500$) of the same group for delhi
customers in 2004”
9

The DMQL query above case would be
1)
mine association as
2)
lives_in(C,_, “delhi”) Λ sales+ (C, ?{ I}, {S})  sales+ (C, ?{ I}, {S} )
from sales
where S.year=2004 and T.year=2004 and I.group=J.group
3)
4)
5)
6)
7)
8)
group by C, I.group
having sum(I.price) < 100 and min (J.price)>500
with support threshold = 1%
with confidence threshold = 50%
10
 From this DMQL Query we can deduce the
following constraints specified
 Meta Rule
 Knowledge constraint
 Data constraint
 Level constraint
 Rule constraint
:
:
:
:
:
Line
Line
Line
Line
Line
2
1
3, line2
2
4 and Line6
:
Line 8
 Interestingness
Constraint
11

Rule constraints can be categorized as
1.
2.
3.
4.
5.

antimonotonic
monotonic
succinct
convertible
inconvertible
Ensures completeness of result while pushing
these rules deep into the mining process
12
 antimonotonic
 “if a itemset does not satisfy the rule constraint, then none
of its supersets satisfy” , property of antimonotonic rules
 example :
sum ( I.price >100)
count ( I ) < 100
 avg ( I ) < 250 is not antimonotonic
 Note, apriori property is antimonotonic.
13
 monotonic
 “if a itemset satisfy the rule constraint, then all of its
supersets satisfy” , property of monotonic rules
 Example
:
sum (I.price) > 100
vєS
 min(S) ≥ V is not monotonic
 Once the subset satisfies this property, further testing for
this rule is redundant
14
 succinct
 “All and only those set guaranteed to satisfy the rule can
be enumerated” Property of succinct rules
 The itemsets can be generated that satisfy the rule even
before the support count starts
 Once such subset is generated, iterative testing for the
constraint can be effectively avoided
 Example :
min(J.price) > 500
max(S) < 120
 avg(S) > v , avg(S) <v are not succinct
15
 convertible constraints
 Constraints not satisfying to any of antimonotonic,
monotonic, succinct can be made to satisfy antimonotonic,
monotonic constraints by changing order of elements in
the set
 Ex :
Avg(price) < 100
 Inconvertible
 Constraints which are not convertible
 Ex :
Sum(S) < v , sum (S) > V ,
element of set S could be any real value
16