ERP Centric Data Mining and KD
Webcast - searchsap.com
September 10, 2002
ERP Centric Data Mining and
Knowledge Discovery
Naeem Hashmi
Chief Technology Officer
Information Frameworks
e-mail: [email protected]
Web: http://infoframeworks.com
Copyrights 2002
ERP Data Mining & Knowledge Discovery
webcast searchsap.com Sept 10, 2002
About the Speaker
Naeem Hashmi
• Founder and CTO of Information Frameworks; an author, speaker and world-renowned expert on emerging Information Architectures, Integration and Business Intelligence Technologies.
• Author of the best-selling book
– SAP Business Information Warehouse for SAP, 2000
• Technical Editor
– SAP BW Certification Guide, authored by Catherine Roze, 2002
• Contributing Author, SAP BW Handbook, 2002
• Member of Intelligent ERP magazine's board of editors and a frequent speaker at IT industry conferences including SAP TechEd, ASUG, Oracle Open World, DCI, The ERP World, Data Mining and the Data Warehouse Institute.
• 25+ years of experience in emerging Information Technology research, development, and management; Information Architectures; Enterprise Application Integration; e-business; ERP applications; Data Warehousing; Data Mining; CRM; Internet, Object and Client/Server Technologies; and Strategic Consulting.
• Email: [email protected] URL: http://infoframeworks.com Tel: 603-432-4550
Agenda
• Data Mining and Knowledge Discovery Basics
• ERP Vendors and Data Mining Solutions
• Data Mining in SAP Business Information
Warehouse
• Pros and Cons of ERP-centric Data Mining
• Q&A
What is Data Mining and Knowledge Discovery?
• Data Mining is a tactical process that uses mathematical algorithms to sift through large data stores to extract data patterns, models and rules
• Knowledge Discovery is the process of identifying and understanding potentially useful hidden anomalies, trends and patterns. Data mining is an integral part of the knowledge discovery process
Data Mining and Statistics?
• Data mining sounds very similar to regression analysis, but its approach and purpose are quite different
– Statistical methods test a hypothesis on a data set
– Data mining starts from the data sets to construct a hypothesis
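The contrast between the two approaches can be sketched in Python. This is a minimal illustration, not part of the webcast: the customer records, the column meanings and the function names are all hypothetical. The first half tests a hypothesis that was stated in advance; the second half scans every attribute value and lets the data suggest a hypothesis.

```python
import math

# Hypothetical customer records: (profession, region, churned)
records = [
    ("labourer", "north", True), ("labourer", "south", True),
    ("labourer", "north", True), ("labourer", "south", False),
    ("clerk", "north", False), ("clerk", "south", False),
    ("clerk", "north", True), ("clerk", "south", False),
    ("engineer", "north", False), ("engineer", "south", False),
    ("engineer", "north", False), ("engineer", "south", True),
]

def churn_rate(rows):
    """Share of rows where the customer churned."""
    return sum(1 for r in rows if r[2]) / len(rows)

def two_proportion_z(group_a, group_b):
    """z statistic for the difference between two churn proportions."""
    pa, pb = churn_rate(group_a), churn_rate(group_b)
    na, nb = len(group_a), len(group_b)
    pooled = (pa * na + pb * nb) / (na + nb)
    se = math.sqrt(pooled * (1 - pooled) * (1 / na + 1 / nb))
    return (pa - pb) / se

# Statistical approach: the hypothesis is stated first, then tested.
labourers = [r for r in records if r[0] == "labourer"]
others = [r for r in records if r[0] != "labourer"]
z = two_proportion_z(labourers, others)

# Data mining approach: scan every attribute value and let the data
# suggest which segment deserves a hypothesis.
def rank_segments(rows):
    segments = {}
    for column, name in ((0, "profession"), (1, "region")):
        for value in {r[column] for r in rows}:
            subset = [r for r in rows if r[column] == value]
            segments[(name, value)] = churn_rate(subset)
    return sorted(segments.items(), key=lambda kv: kv[1], reverse=True)

ranked = rank_segments(records)
print(f"z statistic for 'labourers churn more': {z:.2f}")
print("segment with highest churn rate:", ranked[0])
```

With a data set this small the z statistic is only illustrative; the point is the direction of the workflow, not the numbers.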
Data Mining - Present State
Application Domains
• Business: 317 (73%)
• Life Sciences: 85 (20%)
• Other: 31 (7%)
Source: http://www.kdnuggets.com/polls/
Data Mining Methodologies
CRISP-DM
http://www.crisp-dm.org/
Source: http://www.kdnuggets.com/polls/
CRoss Industry Standard Process for Data Mining
Six-Step Process
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment
Source: http://www.crisp-dm.org/
Data Mining Process
http://www.crisp-dm.org/
CRoss Industry Standard Process (CRISP) for Data Mining
[Diagram: the six CRISP-DM phases around the Data Warehouse]
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment
Data Understanding and Data Preparation: initially will take about 60% to 80% of the data mining project time
Source: http://www.crisp-dm.org/
Data Mining - Tools and Data Formats
Domains
• Business: 317 (73%)
• Life Sciences: 85 (20%)
• Other: 31 (7%)
Data Formats
• Flat files: 57%
• Proprietary: 37%
• DBMS: 27%
Source: http://www.kdnuggets.com/polls/
Data Mining Technology
Techniques
• Visualization: use human pattern recognition capabilities
• Statistics: applying statistical techniques to predict
• Decision Trees: building scripts based on historic data
• Association Rules (Rule Induction): reasoning from specific facts to reach a hypothesis
• Clustering: refers to finding and visualizing groups of facts that were not previously known
• Neural Networks: learning how to solve problems based on examples
• K-Nearest Neighbor: classification by looking at similar data
• Genetic Algorithms: survival of the fittest …
Usage
• Discover
• Understand
• Predict
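As one concrete instance of the techniques listed above, K-Nearest Neighbor ("classification by looking at similar data") can be sketched in a few lines of Python. The training points, feature names and the choice of k are hypothetical and only serve to show the idea.

```python
import math
from collections import Counter

# Hypothetical training data: (age, monthly_spend) -> label
training = [
    ((25, 40.0), "left"), ((30, 35.0), "left"), ((28, 50.0), "left"),
    ((45, 120.0), "stayed"), ((50, 150.0), "stayed"), ((42, 110.0), "stayed"),
]

def knn_classify(point, examples, k=3):
    """Label a point by majority vote among its k nearest examples."""
    by_distance = sorted(examples, key=lambda ex: math.dist(point, ex[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

print(knn_classify((27, 45.0), training))
print(knn_classify((48, 130.0), training))
```

In practice the features would usually be scaled first, since plain Euclidean distance lets the column with the largest range dominate the vote.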
Data Mining Models
Two Types of Data Mining Models
Prediction Models (Prediction and Classification)
• Regression algorithms
– Neural Networks, Rule Induction
– Predict Numerical Outcome
• Classification algorithms
– CHAID, discriminant analysis
– Predict Symbolic Outcome
Descriptive Models (Grouping & Associations)
• Clustering/Grouping algorithms
– K-means, Kohonen, Factor Analysis
• Association algorithms
– Apriori, Sequence
Traditional DM vendors
• SPSS Clementine
• SAS Enterprise Miner
• IBM Intelligent Miner
• Salford CART/MARS
• …more
Database Vendors – DM within the Products
• Data Mining Engine in Oracle 9i
– Oracle 9i consists of key products: Oracle9i Database, Oracle9i Application Server, Oracle9i Developer Suite
• IBM Intelligent Miner into DB2
• TeraMiner into Teradata
• Microsoft – SQL Server 2000
• When you implement DM functionality in a DBMS, you are limited to a specific database engine, which is not very flexible in a typical enterprise application landscape (a heterogeneous environment).
Data Mining Standards
• PMML - Predictive Model Markup Language
• OLE DB for Data Mining
• Java Data Mining API
• Other data exchange standards for analytics that need data mining extensions
– CWM: Common Warehouse Metamodel
– XML/A: XML for Analytics
– CPEX: Customer Profile EXchange
– xCIL: Extensible Customer Information Language
Enterprise Applications Landscape
• ERP Solutions
– Oracle
– PeopleSoft
– SAP
• Siebel
• ERP vendors have extended the scope of their applications far beyond traditional ERP functions to a wide array of business solutions such as:
– Customer Relationship Management
– Business Intelligence
– Enterprise Portals
• Oracle Business Intelligence Solution
• PeopleSoft Enterprise Performance Management
• SAP Business Information Warehouse
Oracle Business Intelligence Solution
Business Processes (Pre-Built Portlets)
• Response to Lead (27)
• Lead to Quote (56)
• Quote to Order (15)
• Order to Cash (34)
• Demand to Build (40)
• Procure to Pay (28)
• Revenue to Compensation (29)
• Expiration to Renewal (33)
• Issue to Resolution (51)
• HR Family (43)
Oracle 9i DM Integration
• Oracle Marketing Online for Campaign Management
• Oracle9iAS Personalization
• iStore
• more to come…
Oracle 9i Business Intelligence
• Oracle9iDS Warehouse Builder
• Oracle9iDS Reports
• Oracle9iAS Clickstream Intelligence
• Oracle9i Data Mining
• Oracle9iAS Discoverer
• Oracle9iAS Portal
• Oracle9iAS Personalization
• Oracle9iDS Business Intelligence Beans
Source: Oracle
PeopleSoft Business Intelligence Solution
Enterprise Performance Management (EPM)
• Customer Profitability
• Finance
• Workforce Analytics
• Supply Chain Management Process
• Workforce Rewards
• Enrollment Management
• Retail Merchandise
• Project Analysis
• Student Administration
• Balanced Scorecard
• CRM Prospect Analysis
• Employee Scorecard
• Customer Scorecard
• Vendor Scorecard
• CRM Marketing Analysis
• CRM Sales Effectiveness
• CRM Service Effectiveness
Data Mining Capabilities
• No word on PeopleSoft data mining tools/technologies for predictive analytics (home-grown, acquired, or 3rd-party products)
• No response from PeopleSoft contacts
SAP Business Intelligence Solution
Business Information Warehouse
SAP Markets, Procurement
• Bidding, pattern-based offering
SAP CRM
• Campaign management
• Opportunity analytics
• Activity reporting, service analytics
• Customer behavior modeling
SAP SCM
• Demand planning
• Spend optimization
• SCOR KPIs
SAP Portals
• E-commerce analysis
SAP Financials, Human Capital Management
• SEM
• Balanced scorecard
• Planning
• Economic profit
• Benchmarking
• Employee turnover & retention
• Corporate investment management
+1700 Queries
+420 InfoCubes
90 ODS Objects
Closed loop platform capabilities
• Drill-through (report-report i/f)
• Remote cubes (read through)
• Real-time data warehousing
• Data mining
• Write back to operational system
Source: SAP
CRM Vendors – Data Mining Integration
• Oracle CRM
– Pre 9i Darwin
– Post 9i ODM
• RightPoint and E.piphany
• SPSS and Siebel
• SAP CRM
– Native data mining built into SAP BW (database independent)
– Interface between IBM Intelligent Miner and SAP BW
• PeopleSoft CRM
– No official data mining product or vendor solution
– Waiting for their response on what they have
SAP BW 3.0b Data Mining Implementation
• Currently for Customer Subject Area
• Algorithms Supported
– Decision Trees
– Scoring
– Clustering/Segmentation
– Association
• Data Mining process
– Model definition
– Training the model
– Performing prediction using the training results
– Uploading the results back into BW
– Utilizing the mining results (on the operational side)
• No extensive data staging
• SAPGUI is the interface to the data mining modeling and analysis
Modeling a Decision Tree
Create a mining model
Screenshot callouts:
• Model columns
• Specifying the column parameters
• Data type of the column
• The nature of the column content
• Indicating the key column
• Indicating the prediction column
• Specifying the values in case the original values in the column are to be treated differently
Source: SAP
Modeling a Decision Tree
Specify Model Parameters
Screenshot callouts:
• Size of the window (such as 10%)
• Use a portion (%) of the data for training or the whole data set for training
• The number of repeats with different samples
• Stop training when the no. of cases under the given node is less than/equal to the specified value
• Stop training when the accuracy is greater than or equal to the expected accuracy
• Use the information gain threshold to check the relevance
• If the tree is too big, prune the tree without violating the expected accuracy
Source: SAP
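Two of the parameters above, the information gain threshold and the stopping rules, can be illustrated with a short Python sketch. This is a generic textbook formulation, not SAP BW's implementation: the function names, the sample rows and the threshold value are assumptions made for illustration only.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, column, target):
    """Entropy reduction from splitting rows on one column."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[column] for r in rows}:
        subset = [r[target] for r in rows if r[column] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

def should_stop(labels, min_cases, expected_accuracy):
    """Apply the two stopping rules: small node or accurate enough."""
    majority = Counter(labels).most_common(1)[0][1]
    accuracy = majority / len(labels)
    return len(labels) <= min_cases or accuracy >= expected_accuracy

# Hypothetical records; column and parameter names are illustrative.
rows = [
    {"profession": "labourer", "region": "north", "left": "yes"},
    {"profession": "labourer", "region": "south", "left": "yes"},
    {"profession": "clerk", "region": "north", "left": "no"},
    {"profession": "clerk", "region": "south", "left": "no"},
]
GAIN_THRESHOLD = 0.1  # assumed value for the relevance check

for column in ("profession", "region"):
    gain = information_gain(rows, column, "left")
    status = "relevant" if gain >= GAIN_THRESHOLD else "ignored"
    print(column, round(gain, 3), status)

print(should_stop(["yes", "yes", "no"], min_cases=2, expected_accuracy=0.9))
```

In this toy data the profession column separates the classes completely while the region column carries no information, so only the former would pass the relevance check.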
Modeling a Decision Tree
Create a training source and map the model columns
Screenshot callouts:
• BW Query
• Runtime parameters for query
• Model columns
• Selected source columns
• Mapping between model column and source column
Source: SAP
SAP BW Data Mining – Process Steps
• Create a mining model
• Train the model
• Predictions using training results
• Using the data mining results against BW Query
Source: SAP
Viewing Decision Tree Training Results
Screenshot callouts:
• This decision tree predicts whether the customer has left or is still “on board”
• The chance of a customer leaving is 70.7% if the profession is “LABOURER”
• Out of a total of 705 cases, 41 cases are covered under this node
• Chart shows the distribution at the selected node
• 28/41 customers are likely to leave
• 13/41 customers are likely to stay
Source: SAP
Data Mining – Decision Trees
Source: SAP
• Results uploaded into BW
• Then BEX for further analysis
Data Mining – Association
• Create an Association model
• Define Model Columns
• Train the model
• Predictions using training results
• Using the data mining results against BW Query
Source: SAP
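What "training" an association model produces can be sketched with the two standard measures, support and confidence. The Python below is a brute-force enumeration of item pairs, not the full Apriori algorithm and not SAP BW's implementation; the baskets, item names and thresholds are hypothetical.

```python
from itertools import combinations

# Hypothetical market baskets; item names are illustrative.
baskets = [
    {"printer", "paper", "toner"},
    {"printer", "toner"},
    {"paper", "pens"},
    {"printer", "paper", "toner", "pens"},
    {"paper", "toner"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # pair is too rare to be worth a rule
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules

for lhs, rhs, sup, conf in pair_rules(baskets):
    print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

A rule such as "printer -> toner" is then read as: of the baskets that contain a printer, the stated share also contain toner.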
Data Mining – Association
Source: SAP
Data Mining – Cluster Analysis
• Create a Cluster model
• Train the model
• Predictions using training results
• Using the data mining results against BW Query
Source: SAP
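The idea behind training a cluster model can be sketched with plain k-means, one of the clustering algorithms named earlier in this deck. This is a generic sketch under stated assumptions: the webcast does not say which clustering algorithm SAP BW uses, and the customer data, function name and parameters below are hypothetical.

```python
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means: assign points to nearest centroid, then recompute."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return centroids, clusters

# Hypothetical customers: (orders per year, average order value)
customers = [(2, 20.0), (3, 25.0), (1, 18.0),
             (20, 200.0), (22, 210.0), (19, 190.0)]
centroids, clusters = kmeans(customers, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

Each resulting cluster is a customer segment; assigning a new customer to the nearest centroid corresponds to the prediction step listed above.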
Viewing Cluster Analysis Results
Source: SAP
Viewing Cluster Analysis results
• Results uploaded into BW
• Then BEX for further analysis
Source: SAP
SAP Data Mining
• A good attempt to implement a few data mining algorithms
• Very traditional data mining approach
• Requires a well-versed statistician or data mining expert to model and interpret the results
• Source: BEX Query – a big limitation in data mining
• Weak visualization
• BEX for additional discovery - slicing and dicing
SAP BW - IBM Intelligent Miner
IBM Intelligent Miner is designed to:
• Copy data from SAP BW to IBM Intelligent Miner
– Results of reports in BW – Modeling in Business Explorer Analyzer
– Data direct from InfoCubes (for cross-selling analysis)
– Descriptions, hierarchies
• Load results data from IBM IM back into SAP BW
– Results of segmentation can be loaded as master data or hierarchies
• Data transport is designed through wizards in SAP BW
– Possible to get a good view of Intelligent Miner results from SAP BW
ERPs and Data Mining: Good and the Bad News
• Good News
– Known Business Processes
– Few Data Sources
– Improved Data Quality
– Metadata Integration
– Near real-time data mining
– Closed-loop Knowledge Discovery
– Consistent Infrastructure
• Bad News
– Complex Data Structures
– Performance
– Availability
– Very few Data Mining algorithms - Today
CRISP-DM
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment
Data Mining Process and ERP Data Mining
• Business Understanding, Data Understanding, Data Preparation, Deployment: will reduce data mining project time by up to 50%
CRISP-DM
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment
Source: http://www.crisp-dm.org/
Good News for Future Business Applications
INFORMATION FRAMEWORKS
KNOWLEDGE TRANSFER
• Seminars
• Webinars
• Keynotes
• Panel Moderator
• Publications
• Hands-on training
• Conferences
• Executive and Senior IT Management Consulting
INFORMATION TECHNOLOGY INVESTORS
• Market Research
• Market Assessment
• Competitive Analysis
• Technology due
INFORMATION TECHNOLOGY ORGANIZATION
• Enterprise Information Architectures (EIA)
• Business Case Development
• Information Architecture Application
• Deployment Architectures implementation
• Legacy Application Migration Strategies
• ERP Application deployment strategies
• Enterprise Applications Integration (EAI)
• Architectures, Service Modeling and design, EAI technology assessment
• Tools and Technology Assessment
• Vendor Selection and Assessment
• Conference Room Pilot implementation
• Business Intelligence and Portals
• Architectures, Methodologies
• Tool/technology/Vendor assessment and selection
• Data Warehouse, Data Marts, Analytics, Information Delivery
• Deployment Architectures
• Business Intelligence and eBusiness Integration architectures
• Portals Strategies, Business case, Assessment, Architectures, Modeling, Planning and knowledge Transfer
SOFTWARE AND SOLUTION VENDORS
• Technology/Solution Assessment
• Product Strategy
• Solution Strategy
• Product Positioning
• Competitive Analysis
• Software product architecture
• Marketing Strategy
• Product Performance and Benchmarking Consulting
• Hardware Configuration
http://infoframeworks.com
Questions
Naeem Hashmi
Chief Technology Officer
September 10, 2002
Email: [email protected]
Web Site: http://infoframeworks.com
Tel: 603-432-4550