Capabilities Presentation presented to Microsoft 08.17.05

Capabilities
Apollo and SQL Server Data Mining
Presented by
Jeff Kaplan, Principal Client Services
Paul Bradley, Ph.D., Principal Data Mining Technology
312.787.7376
Agenda
Apollo Overview
Data Mining 101
Project REAL Case Study
SQL Server 2005 Data Mining Demo
Real-life Examples
PART ONE
Apollo Overview
overview
Company Background
First company delivering true predictive analytic solutions
10 plus years in data mining and data warehousing
Premier Partner for SQL Server 2005 Data Mining
Serve a wide range of businesses, including Microsoft, Sprint, Wal-Mart, Barnes & Noble, Seattle Times, Knight Ridder
Variety of Industries
• Retail and Consumer Goods
• Media
• Financial Services
• Manufacturing
• Public Services
Industry Recognition
Testimonials
Analytic Landscape
Capabilities
Sales & Distribution
Marketing
Operations
Market Research
• Customer Acquisition
• Inventory Forecasting
• Correlation Analysis
• Claim Analysis
• Campaign Targeting
• Sales Forecasting
• Key Driver Analysis
• Call Center Analytics
• Cross-sell/Up-sell
• Pricing Optimization
• Verbatim Summarization
• Data Warehousing
• Customer Segmentation
• Next Best Offer
• Dashboard Reporting
• Retention Modeling
• Market Basket Analysis
• Behavioral Targeting
• Recency & Frequency Modeling
• Personalization
Customer Targeting Models
• Join Customer Data Sources
• Run Predictive Algorithms
• Score Model Results
• Deliver Targeted Predictions
[Diagram labels: Customer Clustering Models; Predictive Models; SQL-Server 2005; Dashboard & Ad-hoc Reporting; channels: Red Card, Phone, Web Booking, Call Center, Email, Stores, Direct Mail]
Automate Predictions for Targeting, Forecasting, Detection, etc.
Measure Promotion Success
PART TWO
MS Data Mining
ms data mining
Background
Fastest Growing BI Segment (IDC)
• Data Mining Tools: $1.85B in 2006
• Predictive Analytic projects yield a high median ROI of 145%
Uses
• Marketing: Customer Acquisition and Targeting, Cross-Sell/Up-Sell
• Retail: Inventory Forecasting, Price Optimization
• Market Research: Driver Analysis, Verbatim Summarization
• Operations: Call Center Analytics
• Finance: Fraud Detection, Risk Models
Mainstream Emergence
• E-commerce (e.g. Amazon.com)
• Search (e.g. Vivisimo.com)
• Behavioral Advertising
SQL-Server is in a Unique Position to Service Market Needs
Evolution of SQL Server Data Mining
Enter the Game
• Create industry standard
• Target developer audience
• V1.0 product with 2 algorithms
Win Leadership
• Continue standards and developer effort
• Comprehensive feature set
• Penetrate the Enterprise
• Thought leadership
Value of Data Mining
[Chart labels: Relative Business Value; Business Knowledge; Easy to Difficult; levels shown: Reports (Static), Reports (Adhoc), OLAP, SQL-Server 2005 Data Mining]
SQL-Server 2005 BI Platform
[Diagram: SQL Server with Relational Engine, Analysis Services (OLAP & Data Mining), Integration Services (ETL), Reporting Services, Management Tools, Development Tools]
SQL Server 2005 BI Platform
Embed Data Mining: Development Tool Integration
• Make Decisions Without Coding
• Customized Logic Based on Client Data
• Logic Updated by Model Reprocessing – Applications Do Not Need to be Rewritten, Recompiled, and Redeployed
Data Mining Key Points
• Price Point to Achieve Market Penetration
• Database Metaphors for Building, Managing, Utilizing Extracted Patterns and Trends
• APIs for Embedding Data Mining Functionality into Applications
SQL-Server 2005 Algorithms
Decision Trees
Clustering
Sequence Clustering
Time Series
Association
Linear and Logistic Regression
Neural Net
Naïve Bayes
PART THREE
Project REAL
project real
Client Profile – Inventory Forecasting
• Create a Reference Implementation of a BI System Using Real Retail Data
• Partners: Barnes & Noble, Microsoft, Scalability Experts, EMC, Unisys, Panorama, Apollo
• Forecast Out-of-Stock for 5 Book Titles Across Entire Chain (800 Stores)
• Predictive Models to Flag Items That Are Going to be Out-of-Stock
• Model on 48 Weeks of Data, Predictions for Month of December
• Models Predicted Out-of-Stock Occurrences > 90% Accuracy
• Conservative Sales Opportunity for just 5 Titles: $6,800 per year
• Extrapolate Across Millions of Titles - Million Dollar Sales Opportunity
Predictive Modeling Process
STEP 1: STORE + ITEM
Each item belongs to a category. For the category, create a set of store clusters predictive of sales in the category.
STEP 2: Identify the cluster which the store belongs to, for the category of that item.
STEP 3: Utilize sales data to predict item sales 2 weeks out.
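The process on this slide can be read as a chain of lookups followed by a forecast. A minimal Python sketch of the lookup part, assuming hypothetical item, category, store, and cluster values (none of these identifiers come from Project REAL); the trained model itself is out of scope, so the function only assembles the inputs a model would receive:

```python
# Hypothetical lookup tables; names and values are illustrative only.
ITEM_CATEGORY = {"title_a": "childrens_fiction"}         # each item belongs to a category
STORE_CLUSTER = {("childrens_fiction", "store_001"): 3}  # store clusters are built per category


def build_model_input(store, item, weekly_sales):
    """Resolve category and cluster for a store/item pair and bundle them with sales history."""
    category = ITEM_CATEGORY[item]              # Step 1: store + item, item -> category
    cluster = STORE_CLUSTER[(category, store)]  # Step 2: the store's cluster for that category
    # Step 3 would pass this row to a trained model that predicts sales 2 weeks out.
    return {
        "store": store,
        "item": item,
        "category": category,
        "store_cluster": cluster,
        "recent_sales": list(weekly_sales),
    }


row = build_model_input("store_001", "title_a", [2, 0, 1])
print(row)
```

Keying the cluster table on (category, store) reflects the slide's point that a store's cluster depends on the category of the item being forecast.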
Store Clustering Demo
Out-of-Stock Data Preparation Summary
Apollo Explored 3 Data Preparation Strategies
1. Use Sales, On-Hand, On-Order History Data for All Stores in the Same Cluster
   Build One Mining Structure per Cluster, For All Stores in that Cluster for Each Title
   Build One Mining Model per Store, per Cluster for Each Title
   Negative: Few OOS Examples per Store, Computation to Deploy One Mining Model per Store/Title Combination
2. Use Sales, On-Hand, On-Order History for All Stores, Across All Clusters
   Build One Mining Structure per Book, Use Cluster Membership of Store as Input Attribute
   Positive: Optimizes OOS Examples per Title by Considering All Stores
   Negative: Does Not Capture Derivative Sales Information
3. Removed Negative of Strategy 2
   Included Historical Week-on-Week Sales Derivative Information for Each Title
   Increase the Information Content of the Source Data for Modeling
Creating Variables for Success
Using:
• Sales and Inventory History from January 2004 to end of November 2004
• Recommend two (2) years of Historical Data to Increase Accuracy for Training Model
Key:
• Store + Fiscal Year + WeekID
Predicted Variables
• 1 Week Ahead OOS Boolean
• 1 Week Ahead Sales Bin (None, 1 to 2, 3 to 4, 4+)
• 2 Week Ahead OOS Boolean
• 2 Week Ahead Sales Bin (None, 1 to 2, 3 to 4, 4+)
Input Attributes
• Store Cluster Membership (Derived from Store Cluster Model)
• Current Week Sales, On-Hand, On-Order
• Preceding 1-5 Week Sales, On-Hand, On-Order
• Sales Derivative Attributes
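One way to derive the variables listed on this slide from a weekly history is sketched below in Python. The field names, the rule that a week is out-of-stock when on-hand is zero, and the reading of the "4+" bin as "more than 4" are all assumptions made for illustration; On-Order is left out for brevity, and store cluster membership would be joined in separately.

```python
def sales_bin(units):
    """Map weekly unit sales to the bins named on the slide.

    Assumption: "4+" is read as "more than 4", since 4 already falls in "3 to 4".
    """
    if units <= 0:
        return "None"
    if units <= 2:
        return "1 to 2"
    if units <= 4:
        return "3 to 4"
    return "4+"


def build_row(history, week, lags=5):
    """Build one training row for `week`, an index into `history`.

    `history` is a list of dicts with hypothetical keys "sales" and "on_hand".
    Assumption: a week counts as out-of-stock (OOS) when on_hand is 0.
    """
    if week < lags or week + 2 >= len(history):
        raise ValueError("need `lags` weeks of history and 2 future weeks")
    current = history[week]
    row = {"sales": current["sales"], "on_hand": current["on_hand"]}
    for k in range(1, lags + 1):
        row[f"sales_lag_{k}"] = history[week - k]["sales"]
    # Week-on-week sales derivative: change from the previous week.
    row["sales_delta_1"] = current["sales"] - history[week - 1]["sales"]
    for ahead in (1, 2):
        future = history[week + ahead]
        row[f"oos_{ahead}wk"] = future["on_hand"] == 0
        row[f"sales_bin_{ahead}wk"] = sales_bin(future["sales"])
    return row


sales = [1, 2, 0, 3, 4, 2, 5, 0]
on_hand = [5, 4, 4, 2, 1, 3, 0, 2]
history = [{"sales": s, "on_hand": h} for s, h in zip(sales, on_hand)]
row = build_row(history, week=5)
print(row)
```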
Model Training and Testing Scenarios
Purpose: Intelligence on Model Training Frequency
• Scenario 1: Train Models Every 2 Weeks
  Training Dataset: All Data Prior to Last 2 Fiscal Weeks in December 2004
  Test Dataset: Last 2 Fiscal Weeks in December 2004
• Scenario 2: Train Models Monthly
  Training Dataset: All Data Prior to End of Fiscal November 2004
  Test Dataset: Fiscal Month of December 2004
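Both scenarios are time-based holdouts that differ only in the cutoff. A minimal Python sketch, assuming weekly rows keyed by a hypothetical `week_start` date; calendar dates stand in for the fiscal-calendar boundaries, which the slides do not give:

```python
from datetime import date, timedelta


def split_by_cutoff(rows, cutoff):
    """Train on rows strictly before `cutoff`; test on rows on or after it."""
    train = [r for r in rows if r["week_start"] < cutoff]
    test = [r for r in rows if r["week_start"] >= cutoff]
    return train, test


# Six illustrative weeks: 22 Nov 2004 through 27 Dec 2004.
rows = [{"week_start": date(2004, 11, 22) + timedelta(weeks=i)} for i in range(6)]

# Scenario 2 (monthly retraining): hold out all of December.
monthly_train, monthly_test = split_by_cutoff(rows, date(2004, 12, 1))

# Scenario 1 (retraining every 2 weeks): hold out only the last two weeks.
biweekly_train, biweekly_test = split_by_cutoff(rows, date(2004, 12, 20))

print(len(monthly_train), len(monthly_test))
print(len(biweekly_train), len(biweekly_test))
```

Splitting on time rather than at random keeps future weeks out of the training data, which matches how the models would be used in production.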
Balancing Training Data
When Considering All Stores, Still Have Un-Balanced Datasets
• [# Store/Week Combinations Where OOS is False] >> [# Store/Week Combinations Where OOS is True]
• Common in Many Data Mining Applications
Training Datasets were Balanced
• Sample Store/Week Combinations Where OOS is False to Obtain Equal Proportion of True/False Values
“Cost” of Predictive Errors are Equal
• Requested by Client
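The balancing step described here amounts to undersampling the majority class. A minimal Python sketch, assuming each store/week row carries a hypothetical boolean `oos` field and that OOS = False is the majority; the actual sampling procedure used on the project is not described in the slides:

```python
import random


def balance_by_undersampling(rows, label="oos", seed=0):
    """Keep every OOS=True row and an equal-sized random sample of OOS=False rows."""
    positives = [r for r in rows if r[label]]
    negatives = [r for r in rows if not r[label]]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    # Assumes negatives are the majority class; min() guards the opposite case.
    k = min(len(positives), len(negatives))
    balanced = positives + rng.sample(negatives, k)
    rng.shuffle(balanced)
    return balanced


# Illustrative data: 95 in-stock store/weeks and 5 out-of-stock store/weeks.
rows = [{"oos": False} for _ in range(95)] + [{"oos": True} for _ in range(5)]
balanced = balance_by_undersampling(rows)
print(len(balanced), sum(r["oos"] for r in balanced))
```

Undersampling discards data, so it fits best when, as on this slide, both kinds of prediction error are treated as equally costly.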
Prediction Methods
Algorithm Selection
Microsoft Decision Trees for Predicting OOS Boolean flags
Consistently High Overall Accuracy
Straightforward Interpretation
Data Preparation
• Scenario 2
• Rebuild models monthly
Predictive Models are Contextual and Optimized for Behavior in the Coming Month
Prediction Methods
Modeling Methodology Benefits
• Scalability (Titles and Stores)
• Saves 4x to 5x on Computational Cost when Rebuilding Models (versus Neural Networks)
5 Minutes for All 5 Titles => 1 Minute per Title for All Stores
Out-of-Stock Prediction Demo
Inventory Prediction Results
1 week and 2 week prediction accuracies

TITLE                            OOS Week 1   OOS Week 2   Sales Bins Week 1   Sales Bins Week 2
JUNIE B JONES IS A GRADUATION    97.53%       92.87%       98.46%              99.98%
CAPTAIN UNDERPANTS & THE INVA    99.06%       87.67%       99.06%              99.96%
MTH RESEARCH GDE #01 DINO        100.00%      83.82%       100.00%             100.00%
MTH RESEARCH GDE #08 TWISTERS    98.29%       83.60%       99.48%              100.00%
SECRETS OF DROON #04 CITY IN     97.71%       84.31%       99.13%              100.00%
AVERAGE ACCURACY                 98.52%       86.45%       99.23%              99.99%
Sales Opportunity
Data Mining created revenue generating opportunity
Based on 5 titles for Jan 2004 - Dec 2004
(# of weeks OOS across all stores) x (Apollo Boolean Predicted Accuracy)
x (actual % of actual sales across all stores) x (retail price)
= Yearly Increase in Sales Opportunity using Apollo OOS Predictions
TITLE                            # of OOS   1-2 Sales   Price   2 Wk Pred   1 Copy Sales   2 Copy Sales
JUNIE B JONES IS A GRADUATION    1,165      1.16%       14.95   92.87%      $187.44        $374.89
CAPTAIN UNDERPANTS & THE INVA    10,040     1.01%       17.95   87.67%      $1,590.96      $3,181.93
MTH RESEARCH GDE #01 DINO        15,227     0.16%       14.95   83.82%      $305.13        $610.26
MTH RESEARCH GDE #08 TWISTERS    4,444      0.44%       27.95   83.60%      $460.57        $921.14
SECRETS OF DROON #04 CITY IN     7,115      0.65%       21.95   84.31%      $861.37        $1,722.74
Total                                                                       $3,405.48      $6,810.95
Sales bins produced $3.4K, $6.8K potential lift in sales
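The sales opportunity formula can be checked against the table with a short Python calculation. This is a sketch under stated assumptions: the "1-2 Sales" column is read as the sales share in the formula, and the "2 Copy Sales" column as twice the one-copy figure. The inputs are the rounded values shown on the slide, so the results come out close to the published figures but not identical, presumably because of that rounding.

```python
TITLES = [
    # (title, OOS weeks across all stores, 1-2 sales share, retail price, 2-week accuracy)
    ("JUNIE B JONES IS A GRADUATION", 1165, 0.0116, 14.95, 0.9287),
    ("CAPTAIN UNDERPANTS & THE INVA", 10040, 0.0101, 17.95, 0.8767),
    ("MTH RESEARCH GDE #01 DINO", 15227, 0.0016, 14.95, 0.8382),
    ("MTH RESEARCH GDE #08 TWISTERS", 4444, 0.0044, 27.95, 0.8360),
    ("SECRETS OF DROON #04 CITY IN", 7115, 0.0065, 21.95, 0.8431),
]


def opportunity(oos_weeks, accuracy, sales_share, price, copies=1):
    """Yearly sales opportunity for one title, following the slide's formula."""
    return oos_weeks * accuracy * sales_share * price * copies


total_one = sum(opportunity(n, acc, share, price) for _, n, share, price, acc in TITLES)
total_two = sum(opportunity(n, acc, share, price, copies=2) for _, n, share, price, acc in TITLES)

for title, n, share, price, acc in TITLES:
    print(f"{title:32s} {opportunity(n, acc, share, price):10.2f}")
print(f"1 copy total: {total_one:.2f}")
print(f"2 copy total: {total_two:.2f}")
```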
PART FOUR
Client Profiles
client profiles
Client Profile – Customer Acquisition
• Decrease Subscriber Churn
• Increase New Subscriptions
• Segment Geo-Demographic and Attitudinal Behaviors for Subscribers and Non-Subscribers
• Build Predictive Models to Identify Likely New Subscribers
• Using Analysis to Deliver Targeted Marketing Campaigns for Acquisition
• Increased Stop Saves by 2%
Client Profile – Cross sell / Up sell (Global Catalog Retailer)
• Increase Average Purchase Size
• Deploy Product Recommendations on their Website
• Modeling Historical Sales to Determine Product Affinities
• Incorporate Business Logic into Modeling Process (e.g. Same category recommendation)
• Increase Average Shopping Cart Size
• Increase Sales Lift
• Data Mining Driven Product Recommendation Performed Better than Manual Recommendations
Client Profile – Customer Support Automation
• Increase Visibility into Customer Service Center
• Increase Speed of Customer Support
• Utilizing Text Mining Engines to Automate Processing of Customer Support (Email, Web Inquiries, etc.)
• Automating the Process of Rolling up Keywords into Concepts
• Customer Support Center has the Ability to View Trends in Minutes versus Weeks
• Improved Accuracy - Text Mining Engines Removed the Bias and Inaccuracies Often Occurring in Call Center Representative Notes and Tagging
Client Profile – Key Driver Analysis
• Evaluate Customer Satisfaction Metrics
• Increase Customer Satisfaction
• Partnered with Apollo to Develop Market Research Database and Reporting
• Developed Models to Identify “Key” Satisfaction Drivers
• Successfully Identified Drivers to Increase Customer Satisfaction
• Delivered Driver Recommendations to Field Operations - Insight into Action
• Company Wide (sales, marketing, executive level) Visibility into Customer Satisfaction Metrics
Presented by
Jeff Kaplan
Principal Client Services
312.787.7376