Transcript LES10

Data Mining
What is Data Mining?
“Data Mining is the process of selecting, exploring and modeling large amounts of data to uncover previously unknown information for using it to make crucial business decisions.”
Goal of Data Mining
Simplification and automation of the overall statistical
process, from data source(s) to model application
DATA → INFORMATION → KNOWLEDGE
Data Mining vs. Other Analytical Approaches
Goal
• DATA MINING: In order to solve problems, companies look into their data for scientific & logical evidence
• OLAP: Users would like to see pre-defined business trends quickly and easily
Deliverables
• DATA MINING: “If customer age is between 35 ~ 45 & product is ‘A’,‘E’ & there is 30% increase in usage of ATM recently then response rate is 4 times higher”
• OLAP: Reports in the form of ‘revenue by year/month/area’, ‘revenue by month/area/weekday’
Output Format
• DATA MINING:
– Rule : If age in (35,45) and product (‘A’,‘E’) and ATM usage > 30% then…
– Score : 0.55, 0.90..
• OLAP: Cross-tab report of area (BK, PK, SM) by month/weekday (01/M, 02/T, 03/W), with one value per cell (9999, 3456, 4335, 1234, 4353, 5467, 3456, 6578, 5673)
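The two output styles in the comparison above can be contrasted in a short Python sketch. The field names (age, product, atm_usage_increase, month, area, revenue), the inclusive age bounds, and all sample records are assumptions made up for illustration; the slides give only the rule and the report names.

```python
from collections import defaultdict

def matches_rule(customer):
    """Data-mining style output: a rule applied to each customer record.
    Inclusive age bounds are an assumption; the slide only says 'in (35,45)'."""
    return (35 <= customer["age"] <= 45
            and customer["product"] in {"A", "E"}
            and customer["atm_usage_increase"] > 0.30)

def revenue_by(rows, *keys):
    """OLAP style output: a pre-defined aggregation, e.g. revenue by month/area."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row["revenue"]
    return dict(totals)

# Hypothetical customer records (illustrative only).
customers = [
    {"id": 1, "age": 40, "product": "A", "atm_usage_increase": 0.35},
    {"id": 2, "age": 50, "product": "A", "atm_usage_increase": 0.40},
    {"id": 3, "age": 38, "product": "B", "atm_usage_increase": 0.50},
]
targets = [c["id"] for c in customers if matches_rule(c)]

# Hypothetical sales rows (illustrative only).
sales = [
    {"month": "01", "area": "BK", "revenue": 100},
    {"month": "01", "area": "BK", "revenue": 50},
    {"month": "01", "area": "PK", "revenue": 70},
    {"month": "02", "area": "BK", "revenue": 30},
]
report = revenue_by(sales, "month", "area")
```

Here targets holds the customers selected by the rule, while report is a fixed summary that answers only the question it was defined for.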
Data Mining is …
Decision Trees
Nearest Neighbor Classification
Neural Networks
Rule Induction
K-Means Clustering
Data Mining Algorithms
Predictive
Use data on past processes to predict future production
Historical data → Predictive algorithm (neural, tree, regression) → Probability of future production
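As a minimal sketch of the predictive flow (historical data in, probability out), the Python below uses nearest neighbor classification, one of the algorithms named earlier, rather than the neural, tree, or regression models listed on this slide. The function name knn_probability, the single numeric feature, and the history records are all hypothetical.

```python
def knn_probability(history, query, k=3):
    """Estimate P(outcome = 1) for `query` as the share of positive outcomes
    among the k historical records whose feature value is closest."""
    if not 1 <= k <= len(history):
        raise ValueError("k must be between 1 and the number of records")
    nearest = sorted(history, key=lambda rec: abs(rec[0] - query))[:k]
    return sum(outcome for _, outcome in nearest) / k

# Made-up history: (feature value, outcome), where outcome is 0 or 1.
history = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]

p_high = knn_probability(history, 6.5)  # nearest records: 6.0, 7.0, 8.0
p_low = knn_probability(history, 4.2)   # nearest records: 3.0, 6.0, 2.0
```

With this toy history, a query near the positive records gets a high probability and a query near the negative records gets a low one.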
Descriptive
Use data on past processes to describe the current situation
Historical data → Descriptive algorithm (cluster, association) → Description of current production
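A descriptive algorithm summarizes the data instead of predicting an outcome. The sketch below is a bare-bones one-dimensional K-Means clustering in Python; the function name kmeans_1d, the fixed starting centroids, and the values are assumptions chosen so the run is deterministic.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Alternate between assigning each value to its nearest centroid and
    moving each centroid to the mean of its cluster, until nothing changes."""
    centroids = list(centroids)
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters

# Made-up measurements forming two obvious groups.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, groups = kmeans_1d(values, centroids=[1.0, 12.0])
```

The returned centers and groups are the "description": two segments and a typical value for each.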
Why Data Mining?—Potential Applications
Data analysis and decision making support
• Market analysis and management
– Target marketing, customer relationship management, market
basket analysis, cross selling, etc
• Risk analysis and management
– Forecasting, customer retention, improved underwriting, quality
control, competitive analysis
• Fraud detection and detection of unusual patterns (outliers)
• Text mining (news group, email, documents) and Web mining
• Stream data mining
• Bioinformatics and bio-data analysis
Market Analysis and Management
• Where does the data come from? Credit card transactions, loyalty cards, discount coupons, customer complaint calls, plus (public) lifestyle studies
• Target marketing
– Find clusters of “model” customers who share the same characteristics: interest, income level, spending habits, etc.
– Determine customer purchasing patterns over time
• Cross-market analysis: Find associations/correlations between product sales, and predict based on such associations
• Customer profiling: What types of customers buy what products (clustering or classification)
• Customer requirement analysis
– Identify the best products for different groups of customers
– Predict what factors will attract new customers
• Provision of summary information
– Multidimensional summary reports
– Statistical summary information (data central tendency and variation)
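The associations between product sales mentioned under cross-market analysis are commonly measured with support and confidence. The Python sketch below assumes each transaction is a set of product names; the baskets and the helper names support and confidence are hypothetical.

```python
def support(baskets, items):
    """Share of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)

def confidence(baskets, antecedent, consequent):
    """Share of baskets containing the antecedent that also contain the
    consequent. Assumes the antecedent occurs in at least one basket."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)

# Made-up transactions (illustrative only).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
s = support(baskets, {"bread", "milk"})       # 2 of 4 baskets
c = confidence(baskets, {"bread"}, {"milk"})  # 2 of the 3 bread baskets
```

A rule such as "bread implies milk" is usually kept only when both numbers clear thresholds that the analyst chooses.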
Corporate Analysis & Risk Management
• Finance planning and asset evaluation
– Cash flow analysis and prediction
– Contingent claim analysis to evaluate assets
– Cross-sectional and time series analysis (financial-ratio, trend analysis, etc.)
• Resource planning
– Summarize and compare the resources and spending
• Competition
– Monitor competitors and market directions
– Group customers into classes and apply a class-based pricing procedure
– Set pricing strategy in a highly competitive market
Fraud Detection & Mining Unusual Patterns
• Approaches: Clustering & model construction for frauds, outlier analysis
• Applications: Health care, retail, credit card service, telecomm.
• Auto insurance: ring of collisions
• Money laundering: suspicious monetary transactions
• Medical insurance
– Professional patients, ring of doctors, and ring of references
– Unnecessary or correlated screening tests
• Telecommunications: phone-call fraud
– Phone call model: destination of the call, duration, time of day or week.
• Retail industry
– Analysts estimate that 38% of retail shrink is due to dishonest
employees
• Anti-terrorism
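Outlier analysis, listed among the approaches above, can be sketched with a simple z-score test on one attribute of the phone call model, here call duration. The threshold of 2 standard deviations, the function name flag_outliers, and the durations are assumptions; a production system would more likely use robust statistics and several attributes.

```python
from statistics import mean, pstdev

def flag_outliers(values, threshold=2.0):
    """Return the values lying more than `threshold` population standard
    deviations away from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Made-up call durations in minutes (illustrative only).
durations = [3, 4, 5, 4, 3, 5, 4, 60]
suspicious = flag_outliers(durations)
```

In this toy series only the 60-minute call is flagged for review; the flag marks an unusual pattern, not proof of fraud.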
Data Mining Process
• Define business problem
• Evaluate environment
• Make data available
• Mine in cycles: Sample, Explore, Modify, Model, Assess
• Implement in production
• Review
Indicative ROI Example
Retention Targeting
Assumptions
Number of customers (in selected segment) = 300,000
Average revenue per user (ARPU)/year = THB 14,400
Annual churn rate = 30%
New churn rate through targeted churn activities = 29%

Annual Loss due to old churn rate = THB 1,296 million

Annual Loss due to new churn rate = THB 1,252.8 million

Annual Savings = THB 43.2 million
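The retention figures follow directly from the assumptions: annual loss = customers × ARPU × churn rate. A short Python check, with hypothetical function and variable names:

```python
def annual_churn_loss(customers, arpu, churn_percent):
    """Revenue lost per year to churn, in THB."""
    return customers * arpu * churn_percent / 100

CUSTOMERS = 300_000
ARPU = 14_400  # THB per user per year

loss_old = annual_churn_loss(CUSTOMERS, ARPU, 30)  # 1,296 million
loss_new = annual_churn_loss(CUSTOMERS, ARPU, 29)  # 1,252.8 million
retention_savings = loss_old - loss_new            # 43.2 million
```

Each percentage point of churn in this segment is worth THB 43.2 million a year.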
Indicative ROI Example
Cross selling/Up selling
Assumptions
Number of customers (in selected segment) = 400,000
Number of direct mail/year = 6
Variable cost per direct mail = THB 80.00
Modeling allows for elimination of lower 20% ranked direct mail
list without significant loss in gross response

Annual Cost without modeling = THB 192 million

Annual Cost with modeling = THB 153.6 million

Annual Savings = THB 38.4 million
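The cross-selling figures can be reproduced the same way: annual cost = customers × mailings per year × cost per mailing, reduced by the share of the list that modeling removes. The names in this Python sketch are hypothetical.

```python
def annual_mail_cost(customers, mails_per_year, cost_per_mail, dropped_percent=0):
    """Yearly direct-mail cost in THB after dropping the lowest-ranked
    `dropped_percent` of the mailing list."""
    full_cost = customers * mails_per_year * cost_per_mail
    return full_cost * (100 - dropped_percent) / 100

cost_without = annual_mail_cost(400_000, 6, 80)                  # 192 million
cost_with = annual_mail_cost(400_000, 6, 80, dropped_percent=20) # 153.6 million
mail_savings = cost_without - cost_with                          # 38.4 million
```

The saving assumes, as the slide states, that dropping the lowest-ranked 20% costs no significant gross response.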
Indicative ROI Example
Acquisition Targeting
Assumptions
Number of targeted prospects = 30,000
Number of direct marketing campaigns/year = 12
Average response rate = 2%
Average revenue per user (ARPU) = THB 14,400
Improved response rate (due to market segmentation & value
proposition) = 3%

Annual Benefit without modeling = THB 103.68 million

Annual Benefit with modeling = THB 155.52 million

Annual Incremental Benefit = THB 51.84 million
Justifying ROI
Indicative ROI Example
Retention Targeting = THB 43.2 million
Cross selling/up selling = THB 38.4 million
Acquisition Targeting = THB 51.84 million
Total Savings/Benefits = THB 133.44 million
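The acquisition benefit and the overall total can be recomputed from the stated assumptions: benefit = prospects × campaigns per year × response rate × ARPU. In this Python sketch the function and variable names are hypothetical, and the retention and cross-selling amounts are taken from the earlier examples. The exact results agree with the figures quoted above to within rounding.

```python
def annual_acquisition_benefit(prospects, campaigns_per_year, response_percent, arpu):
    """Yearly revenue in THB from prospects who respond to the campaigns."""
    responders = prospects * campaigns_per_year * response_percent / 100
    return responders * arpu

benefit_without = annual_acquisition_benefit(30_000, 12, 2, 14_400)  # 103.68 million
benefit_with = annual_acquisition_benefit(30_000, 12, 3, 14_400)     # 155.52 million
acquisition_gain = benefit_with - benefit_without                    # 51.84 million

# Retention and cross-selling savings from the earlier examples, in THB.
RETENTION_SAVINGS = 43_200_000
CROSS_SELL_SAVINGS = 38_400_000
total_benefit = RETENTION_SAVINGS + CROSS_SELL_SAVINGS + acquisition_gain
```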
Case Study
The Financial Services of La Poste
“A bank like other banks, but not like other banks”
Generalist Positioning
28 million people have an account with the Financial Services of La Poste
• 12 million have a current account at La Poste
• 5.6 million customers are under 25
• 1.2 million customers are financially insecure
• 500,000 own assets
• 500,000 are professionals and companies
Multi-channel Customers
800 million incoming annual contacts with La Poste
• 320 million visits to Post Offices
• 368 million cash machine contacts
• 60 million Internet/Minitel contacts
• 40 million "incoming" telephone calls
500 million annual outgoing contacts with La Poste
Very Loyal Customers
Customers who have great confidence in us and who are very loyal to La Poste because they share our values…
… but whom we don’t know well enough and with whom we need to improve the relationship.
Build an integrated CRM