Actionable Knowledge Discovery - The Analytic Hierarchy Approach

ACTIONABLE KNOWLEDGE DISCOVERY:
THE ANALYTIC HIERARCHY PROCESS
APPROACH
Ikuvwerha L.O.; Odumuyiwa, V.T; Ogunbiyi T.D.;
Uwadia, C.O.; Abass O.
DEPARTMENT OF COMPUTER SCIENCE
UNIVERSITY OF LAGOS AKOKA, LAGOS
INTRODUCTION

There is an urgent need for a new generation of computational theories and
tools to assist humans in extracting useful information (knowledge) from
the rapidly growing volumes of digital data.

One of the central problems of data mining is the discovery of
interesting and actionable patterns. “Actionable patterns” refers to
knowledge that an end user (who could be a decision-maker) can act upon.

Therefore, it is important to filter these patterns using interestingness
measures to produce patterns that are actionable, that is, usable by the
end users.
STATEMENT OF PROBLEM

The blind application of data mining methods (criticized as data
dredging in the statistical literature) can easily lead to the discovery of
meaningless and invalid patterns.

This is because data mining has concentrated mainly on the mining
techniques.
RELATED WORKS
According to Cao & Zhang (2007), “the traditional data-centered mining
methodology could be complimented by the involvement of domain-related
social intelligence in data mining which leads to domain-driven
data mining”. Simply knowing many algorithms used for data analysis is
not sufficient for a successful data mining (DM) and knowledge
discovery (KD) project.
Kavitha and Ramaraj (2013) presented a framework that uses combined
mining and a composite approach to generate actionable patterns in the
form of rules. A concept from meta-learning that uses decision theory was
used to formulate utility interestingness measures (objective and
subjective). The Zoo and Mushroom data sets from the University of
California, Irvine were used for the experiment.
RELATED WORKS (CONT'D)
Cao (2012) summarised the extreme imbalances that exist in current data
mining:
• Algorithm imbalance
• Pattern imbalance
• Decision imbalance
The paper treats AKD as a closed optimisation problem:
AKD := OPTIMISATION(PROBLEM, DATA, ENVIRONMENT, MODEL, DECISION)
AKD is a problem-solving process that transforms a business problem Ψ with
problem status t into a problem solution Φ:
Ψ(·/t) → Φ(·)    (1)
RELATED WORKS (CONT'D)
Amruta and Balachandran (2013) reviewed the four most widely used AKD
frameworks for business needs:
• Postanalysis-interestingness-based AKD
• Unified-interestingness-based AKD
• Combined-mining-based AKD
• Multisource combined-mining-based AKD
Their performance (the number of actionable pattern sets) was
evaluated under a decision-making system using a real-time tennis
data set. The multisource combined-mining-based AKD performed
better than the others.
What is Data Mining?
Fayyad, Piatetsky-Shapiro and Smyth (1996) define it as “the nontrivial process
of identifying valid, novel, potentially useful, and ultimately understandable patterns in data.”
Data mining is a process that takes data as input and outputs knowledge.
KNOWLEDGE DISCOVERY PROCESS
It is defined as the nontrivial process of identifying valid, novel,
potentially useful, and ultimately understandable patterns in data.
It consists of many steps (one of them is Data Mining), each
attempting to complete a particular discovery task and each
accomplished by the application of a discovery method.
Knowledge discovery concerns the entire knowledge extraction
process.
WHAT IS ACTIONABLE KNOWLEDGE?
The term “actionable pattern” refers to knowledge that can be uncovered in large complex
databases and can act as the impetus for some action.
It is important to distinguish these actionable patterns from the lower-value patterns that can be
found in great quantities and with relative ease through so-called data dredging.
WHAT IS KNOWLEDGE?
Knowledge is a subset of information, but one that has been
extracted, filtered, or formatted in a very special way. More specifically,
the information we call knowledge is information that has been subjected
to, and has passed, tests of validation.
The basic differences between KDP and AKD
ASPECTS | KDP | AKD (DOMAIN-DRIVEN)
OBJECT MINED | Data tells the story | Data and domain tell the story
AIM | Develop innovative approach | Generate business impacts
OBJECTIVE | Algorithms are the focus | Solving business problem is the focus
DATA SET | Mining abstract and refined data set | Mining constrained real-life data
PROCESS | Data mining is an automated process | Humans are integrated into the process
EVALUATION | Based on technical metrics | Based on actionable options
GOAL | Let data create and verify research innovation. Push novel algorithms to discover knowledge of research interest. | Let data and metasynthetic knowledge tell the hidden business story. Discover actionable knowledge to satisfy end user.
Measuring Knowledge Actionability
Actionability of a pattern: Given a pattern P, its actionable capability
act(P) is the degree to which it satisfies both technical
interestingness and business interestingness.
Such a pattern is interesting not only to data miners but also to
decision-makers.
∀ x ∈ I, ∃P : x.tech_int(P) ∧ x.biz_int(P) ∧ x.act(P)
Therefore, the work of actionable knowledge discovery must focus on
findings that satisfy not only technical interestingness
but also business measures.
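As a minimal sketch of this definition, the snippet below treats a pattern as actionable only when both its technical and its business interestingness reach a threshold. The `Pattern` class, the score values and the 0.5 cut-offs are illustrative assumptions and are not taken from the cited work.

```python
from dataclasses import dataclass


@dataclass
class Pattern:
    name: str
    tech_int: float  # technical interestingness, assumed to lie in [0, 1]
    biz_int: float   # business interestingness, assumed to lie in [0, 1]


def is_actionable(p, tech_min=0.5, biz_min=0.5):
    """Return True only if the pattern passes BOTH (hypothetical) thresholds."""
    return p.tech_int >= tech_min and p.biz_int >= biz_min


# Hypothetical patterns, for illustration only.
candidates = [
    Pattern("A", tech_int=0.9, biz_int=0.2),  # technically strong, little business value
    Pattern("B", tech_int=0.7, biz_int=0.8),  # satisfies both measures
    Pattern("C", tech_int=0.3, biz_int=0.9),  # business value, weak technical support
]

actionable = [p.name for p in candidates if is_actionable(p)]
print(actionable)
```

Only pattern B survives the filter, which mirrors the point that technical interestingness alone is not enough.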
THE ANALYTIC HIERARCHY PROCESS (AHP)

The foundation of the Analytic Hierarchy Process (AHP) is a set of axioms
that carefully delimits the scope of the problem environment (Saaty, 1996).
It is based on the well-defined mathematical structure of consistent
matrices and the ability of their associated right eigenvectors to generate
true or approximate weights (Saaty, 1980).

It converts individual preferences into ratio scale weights that can be
combined into a linear additive weight w(a) for each alternative a. The
resultant w(a) can be used to compare and rank the alternatives and,
hence, assist the decision maker in making a choice.
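The eigenvector step described above can be sketched as follows. This is one common way to approximate the principal right eigenvector (power iteration), not necessarily the procedure used to produce the figures in these slides. The function name and the example matrix are assumptions; the matrix is a hypothetical, perfectly consistent one chosen so the result is easy to check by hand.

```python
def priority_vector(matrix, iterations=100):
    """Approximate the principal right eigenvector of a pairwise
    comparison matrix by power iteration.

    Returns (weights, lambda_max), with weights normalised to sum to 1.
    """
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        # Multiply the matrix by the current weight estimate, then renormalise.
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    return w, lambda_max


# Hypothetical, perfectly consistent judgements: a is 2x b, b is 2x c, a is 4x c.
example = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]

weights, lambda_max = priority_vector(example)
print([round(x, 4) for x in weights], round(lambda_max, 4))
```

For a perfectly consistent matrix such as this one, the weights are exactly proportional to any column (here 4/7, 2/7, 1/7) and lambda_max equals the matrix order.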
THE PROPOSED CONCEPTUAL MODEL AHP-AKD.
ILLUSTRATIVE EXAMPLE
According to McGarry (2005), a data mining algorithm produced the following patterns.

Pattern 1: IF (age > 60) ∧ (salary = high) THEN loan = approved

Pattern 2: IF (age < 60) ∧ (salary = average) ∧ (record = poor) THEN loan = not approved

Pattern 3: IF (age < 60) ∧ (salary = low) THEN loan = approved

The pattern defined by the end user/expert is:
IF (age > 50) ∧ (salary = low) THEN loan = not approved

The task is to find the pattern that is most actionable in
terms of lower risk.
Structuring the problem using AHP


The goal is to find the actionable pattern. The criteria used are
actionability, unexpectedness and novelty.
The alternatives are Pattern 1, Pattern 2 and Pattern 3.
Table 1: PAIRWISE MATRIX RESULTS
FACTORS | ACTIONABLE | UNEXPECTED | NOVEL | NORMALISED EIGENVECTOR
ACTIONABLE | 1 | 2 | 5 | 0.5701
UNEXPECTED | 1/2 | 1 | 3 | 0.3207
NOVEL | 1/5 | 1/3 | 1 | 0.1092
λmax = 3.041, CR = 0.0356
This result shows that actionability is the most important criterion with
57%, followed by unexpectedness with 32% and novelty with 11%. In
finding actionable patterns or knowledge, the actionability of the pattern
comes first, followed by unexpectedness and novelty.
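The consistency ratio (CR) reported with the matrix can be checked from λmax using Saaty's usual definitions: CI = (λmax − n) / (n − 1) and CR = CI / RI. The sketch below assumes Saaty's random index values (RI = 0.58 for n = 3) and the conventional acceptance threshold of 0.10; the function name is illustrative.

```python
# Saaty's random consistency index (RI) for small matrix orders (assumed table).
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def consistency_ratio(lambda_max, n):
    """CR = CI / RI, where CI = (lambda_max - n) / (n - 1)."""
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# lambda_max as reported for the criteria matrix (n = 3).
cr = consistency_ratio(3.041, 3)
print(round(cr, 4), "acceptable" if cr < 0.10 else "revise judgements")
```

With the reported λmax this gives a CR of roughly 0.035, in line with the reported 0.0356 up to rounding of λmax, and well below 0.10, so the judgements would be treated as acceptably consistent.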
Table 2: Pairwise comparison matrix for the alternatives with respect to the actionable factor
ACTIONABLE | PATTERN 1 | PATTERN 2 | PATTERN 3 | NORMALISED EIGENVECTOR
PATTERN 1 | 1 | 5 | 3 | 0.6485
PATTERN 2 | 0.33 | 1 | 3 | 0.2296
PATTERN 3 | 0.2 | 0.33 | 1 | 0.1219
λmax = 3.002, CR = 0.0138
The patterns are evaluated according to their actionability.
Pattern 1 is the most actionable with 65%, followed by
Pattern 2 with 23% and Pattern 3 with 12%.
Table 3: Pairwise comparison matrix for the alternatives with respect to the unexpectedness factor
UNEXPECTED | PATTERN 1 | PATTERN 2 | PATTERN 3 | NORMALISED EIGENVECTOR
PATTERN 1 | 1 | 0.5 | 0.2 | 0.1213
PATTERN 2 | 2 | 1 | 0.33 | 0.2374
PATTERN 3 | 5 | 3 | 1 | 0.6413
λmax = 3.0392, CR = 0.033
This result shows that, by unexpectedness, the patterns
are ranked as follows: Pattern 3 with 64.13%, Pattern 2 with 23.74% and
Pattern 1 with 12.13%. Pattern 3 contradicts the
user’s belief and is therefore unexpected. This agrees with the results
of McGarry (2005), which used unexpectedness as a factor.
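Why Pattern 3 is the one that contradicts the user's belief can be illustrated with a small check. This is a deliberately simplified notion of contradiction (same salary level, overlapping age range, opposite outcome) assumed here for illustration; it is not McGarry's unexpectedness measure. The dictionary layout and function names are hypothetical, and the extra "record = poor" condition of Pattern 2 is omitted because the belief says nothing about it.

```python
INF = float("inf")


def ranges_overlap(a, b):
    """True if two open age intervals (low, high) share any ages."""
    return max(a[0], b[0]) < min(a[1], b[1])


def contradicts(rule, belief):
    """Simplified check: the rule covers some of the same cases as the
    belief (same salary, overlapping ages) but predicts the opposite outcome."""
    same_scope = (rule["salary"] == belief["salary"]
                  and ranges_overlap(rule["age"], belief["age"]))
    return same_scope and rule["loan"] != belief["loan"]


# The three mined patterns and the expert's belief from the example.
patterns = {
    "Pattern 1": {"age": (60, INF), "salary": "high", "loan": "approved"},
    "Pattern 2": {"age": (0, 60), "salary": "average", "loan": "not approved"},
    "Pattern 3": {"age": (0, 60), "salary": "low", "loan": "approved"},
}
belief = {"age": (50, INF), "salary": "low", "loan": "not approved"}

conflicting = [name for name, rule in patterns.items() if contradicts(rule, belief)]
print(conflicting)
```

Pattern 3 and the belief both cover low-salary applicants aged between 50 and 60 but predict opposite loan decisions, so it is the only pattern flagged.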
Table 4: Overall priority
ALTERNATIVES | OVERALL PRIORITY
PATTERN 1 | 0.4664
PATTERN 2 | 0.2407
PATTERN 3 | 0.2929
This result shows that the patterns are ranked as follows: Pattern 1 with 46.64%, Pattern 3
with 29.29% and Pattern 2 with 24.07%. Pattern 1 is therefore the
most actionable, followed by Pattern 3 and then Pattern 2.
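The overall priority is the linear additive weight w(a): each alternative's local priority under a criterion is multiplied by that criterion's weight, and the products are summed. The sketch below uses the criteria weights from Table 1 and the local priorities from Tables 2 and 3. The comparison matrix for the novelty criterion is not shown in these slides, so equal novelty priorities (1/3 each) are used as a neutral placeholder. This is an assumption, and the resulting numbers therefore do not reproduce Table 4.

```python
# Criteria weights (Table 1).
criteria_weights = {"actionable": 0.5701, "unexpected": 0.3207, "novel": 0.1092}

local_priorities = {
    "actionable": {"Pattern 1": 0.6485, "Pattern 2": 0.2296, "Pattern 3": 0.1219},  # Table 2
    "unexpected": {"Pattern 1": 0.1213, "Pattern 2": 0.2374, "Pattern 3": 0.6413},  # Table 3
    # PLACEHOLDER (assumption): novelty priorities are not reported in the slides.
    "novel": {"Pattern 1": 1 / 3, "Pattern 2": 1 / 3, "Pattern 3": 1 / 3},
}


def overall_priority(weights, priorities):
    """w(a) = sum over criteria c of weights[c] * priorities[c][a]."""
    alternatives = next(iter(priorities.values())).keys()
    return {a: sum(weights[c] * priorities[c][a] for c in weights)
            for a in alternatives}


overall = overall_priority(criteria_weights, local_priorities)
ranking = sorted(overall, key=overall.get, reverse=True)
print(ranking)
```

Even with the placeholder novelty values, the ordering comes out as Pattern 1, Pattern 3, Pattern 2, the same ordering as in Table 4; the exact priorities differ because the real novelty judgements are unknown here.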
CONCLUSION

The major issue in actionable knowledge discovery is the interestingness measure,
both objective and subjective.

The proposed conceptual model uses the AHP as the subjective measure.

This research therefore concludes that AHP can be used effectively as a subjective
interestingness measure for actionable knowledge.
THANK YOU