EPSAPA - OCBIG


CONFIDENTIAL
Advanced Analytics
Business Intelligence with
Data Mining
Data Mining
 What’s important
 Association/Binning
 Clustering
 Classification
 Segmentation
 What to expect
 What-if
 Estimation
   Curve Fitting
   Fill in Sparse Matrix
 Prediction
   Probability
   Quantitative
Methodology
[Diagram labels: Statistical Analyst – Business Modeling; Collected Sample; Data; Store; Predictive Metrics & Segments; DBA; business interpretation; Marts; Warehouse; Optimize data marts]
Methodology - EDMDAPA
 Extract
 Integrate disparate data systems
 Build holistic business view
 Group and organize large sets of categories
 Discretize/Classify
 Grouping and Segmentation
 Simplify large flat dimensions
 Model
 Create predictive estimation functions
 Deploy
 Build and score data marts and cubes with predictive probability metrics, quantitative metrics, and simplified dimensional categories
 Analyze, Visualize, Scorecard
 Identify KPIs, identify business problems
 Plan
 Predict (Forecast) / Test (What-If)
 Apply performance rules on KPIs
 Act
 Campaigns, personalization, optimization
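The Discretize/Classify step above can be illustrated with a small sketch. It assumes a numeric attribute (customer age) is simplified into a handful of bands; the band edges, labels, and sample ages are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical band edges and labels, chosen only for illustration.
AGE_EDGES = [25, 35, 50]  # each edge is the inclusive lower bound of the next band
AGE_LABELS = ["under 25", "25-34", "35-49", "50 and over"]


def discretize(value, edges, labels):
    """Map a numeric value to the label of the band it falls into."""
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    return labels[bisect_right(edges, value)]


def segment_counts(values, edges, labels):
    """Count how many values fall into each band."""
    counts = {label: 0 for label in labels}
    for value in values:
        counts[discretize(value, edges, labels)] += 1
    return counts


# Made-up ages, standing in for a large flat dimension.
ages = [19, 24, 25, 31, 38, 47, 50, 63]
print(segment_counts(ages, AGE_EDGES, AGE_LABELS))
```

Replacing many distinct values with a few bands is one way to simplify a large flat dimension before it is loaded into a mart or cube.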
Extract
 DecisionStream unites information from disparate data
sources for sampling the enterprise
 80% of the work involved in analytics is collecting,
cleansing, and preparing data
Classification with Scenario
 Segment and classify combinations of stores, regions, divisions, customers, or products
 Benchmark against last month!
Path of success
Model with 4Thought
 Avoids over-fitting
 Works well with data that is
 Noisy
 Collinear
 Limited or sparse
 Factor Analysis
 What-if
Filling in the sparse matrix – e.g. #1
 Revenue estimation:
 Dimensional intersect:

Red shoes, southwest, women, springtime: $50,000
Black shoes, northeast, men, summer: $38,000
Black shoes, southwest, women, summer: $43,000
Black shoes, northeast, men, springtime: ????
 Once a model is built against historical data, the resulting function can productively fill in the question marks
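One possible way to picture "filling in the question marks" is sketched here. It assumes a simple additive model (a baseline plus one effect per dimension level) fitted by gradient descent. This is a stand-in chosen for illustration, not 4Thought's actual algorithm, and the function names are hypothetical. With only the three known cells from the slide the model is heavily underdetermined, so the resulting estimate is illustrative only.

```python
# Known cells from the slide: (color, region, gender, season) -> revenue.
KNOWN = [
    (("red", "southwest", "women", "springtime"), 50_000.0),
    (("black", "northeast", "men", "summer"), 38_000.0),
    (("black", "southwest", "women", "summer"), 43_000.0),
]
QUERY = ("black", "northeast", "men", "springtime")


def fit_additive(cells, steps=2000, rate=0.1):
    """Fit baseline + one effect per (dimension, level) by batch gradient descent."""
    baseline = sum(value for _, value in cells) / len(cells)
    effects = {}  # (dimension index, level) -> effect, all starting at zero
    for levels, _ in cells:
        for key in enumerate(levels):
            effects[key] = 0.0
    for _ in range(steps):
        # Residual of every known cell under the current effects.
        residuals = [
            value - baseline - sum(effects[k] for k in enumerate(levels))
            for levels, value in cells
        ]
        # Move each effect toward the residuals of the cells that use it.
        for (levels, _), res in zip(cells, residuals):
            for key in enumerate(levels):
                effects[key] += rate * res
    return baseline, effects


def estimate(model, levels):
    """Evaluate the fitted function at any dimensional intersect."""
    baseline, effects = model
    # Levels never seen in the history contribute nothing.
    return baseline + sum(effects.get(k, 0.0) for k in enumerate(levels))


model = fit_additive(KNOWN)
print(round(estimate(model, QUERY)))
```

The fitted function reproduces the known cells and can then be evaluated at an intersect that has no history, which is the sense in which it fills the sparse matrix.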
Filling in the sparse matrix – e.g. #2
 Insurance cost estimation:
 Dimensional intersect:

Age 38, southwest, female, non-smoker, married: $1,800
Age 24, northeast, male, smoker, single: $2,300
Age 32, southwest, female, smoker, single: $3,000
Age 28, southwest, male, non-smoker, married: ????
 Once a model is built against historical data, the resulting function can productively fill in the question marks
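When a dimension is numeric, as age is here, a different simple estimator can be sketched: a similarity-weighted average of the known records, in the style of nearest-neighbour methods. This technique is swapped in for illustration and is not claimed to be what 4Thought does. The distance function and the `AGE_SCALE` constant are assumptions, and sex is coded as female/male.

```python
# Known cells from the slide: (age, region, sex, smoker, marital status) -> cost.
KNOWN = [
    ((38, "southwest", "female", "non-smoker", "married"), 1_800.0),
    ((24, "northeast", "male", "smoker", "single"), 2_300.0),
    ((32, "southwest", "female", "smoker", "single"), 3_000.0),
]
QUERY = (28, "southwest", "male", "non-smoker", "married")

# Assumption: a 10-year age gap counts the same as one mismatched category.
AGE_SCALE = 10.0


def distance(a, b):
    """Mixed distance: scaled age gap plus the number of mismatched categories."""
    age_gap = abs(a[0] - b[0]) / AGE_SCALE
    mismatches = sum(1 for x, y in zip(a[1:], b[1:]) if x != y)
    return age_gap + mismatches


def estimate(known, query):
    """Average the known costs, weighting closer records more heavily."""
    weights = [1.0 / (1.0 + distance(levels, query)) for levels, _ in known]
    total = sum(w * cost for w, (_, cost) in zip(weights, known))
    return total / sum(weights)


print(round(estimate(KNOWN, QUERY), 2))
```

Because the result is a weighted average, it always lies between the lowest and highest known cost; a fitted model would be needed to extrapolate beyond that range.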
Deploy with DecisionStream
 DecisionStream uses the predictive function from 4Thought as a user-defined function (UDF) for derivations
 Deploy data marts, cubes, and metadata
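The idea of using a predictive function as a UDF for a derivation can be sketched in plain Python. This is not DecisionStream syntax. The `derive` helper, the `predicted_revenue` function with its invented coefficient, and the sample rows are all hypothetical stand-ins.

```python
def derive(rows, column, function):
    """Yield a copy of each row with one extra column computed by `function`."""
    for row in rows:
        scored = dict(row)  # copy, so the source rows are left untouched
        scored[column] = function(row)
        yield scored


# Hypothetical stand-in for a predictive function exported from a model.
# The uplift coefficient is invented for illustration only.
def predicted_revenue(row):
    uplift = 1.10 if row["season"] == "springtime" else 1.00
    return round(row["last_year_revenue"] * uplift, 2)


# Made-up source rows, standing in for an extracted fact table.
source_rows = [
    {"product": "red shoes", "season": "springtime", "last_year_revenue": 40_000.0},
    {"product": "black shoes", "season": "summer", "last_year_revenue": 35_000.0},
]

mart_rows = list(derive(source_rows, "predicted_revenue", predicted_revenue))
for row in mart_rows:
    print(row)
```

Scoring rows during the load means the predicted metric is stored in the mart or cube alongside the actuals, so reports can read it like any other measure.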
Analyze, Visualize, Scorecard
Plan
 Determine Business Goals and apply
 NoticeCast Agents
 KPI Business Pack
 Exception highlighting with reports
 Forecast with 4Thought
 Access forecasted results with ETL
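Applying performance rules to KPIs and highlighting exceptions can be pictured with a minimal sketch. It does not use or imitate the NoticeCast or KPI Business Pack interfaces. The `Rule` class, the goal values, and the measurements are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    kpi: str
    minimum: float  # values below this count as exceptions


def find_exceptions(measurements, rules):
    """Return (segment, kpi, value, minimum) for every value below its rule's minimum."""
    limits = {rule.kpi: rule.minimum for rule in rules}
    exceptions = []
    for segment, kpi, value in measurements:
        minimum = limits.get(kpi)
        if minimum is not None and value < minimum:
            exceptions.append((segment, kpi, value, minimum))
    return exceptions


# Hypothetical goals and made-up actual or forecast values, for illustration only.
rules = [Rule("revenue", 40_000.0), Rule("margin_pct", 20.0)]
measurements = [
    ("southwest", "revenue", 43_000.0),
    ("northeast", "revenue", 38_000.0),
    ("northeast", "margin_pct", 22.5),
]
for segment, kpi, value, minimum in find_exceptions(measurements, rules):
    print(f"EXCEPTION {segment} {kpi}: {value} is below {minimum}")
```

The same check works on forecast values as well as actuals, so a goal that is likely to be missed can be flagged before the period closes.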
Keys to Mining
 Usefulness
 Can the information discovered be considered knowledge?
 Certainty
 How viable is the discovered knowledge?
 Expressiveness
 Can the discovered knowledge be represented in a meaningful way?
Problems for Mining
 Missing data
 Inconsistent categories
 Too much data
 Difficult to focus
 Not enough data
 Nothing meaningful
 Too many patterns
 Hard to discern knowledge from garbage
 Complexity of discoveries
 Knowledge is too complex to be used
 Unavailable data
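Two of the problems listed above, missing data and inconsistent categories, can be checked for before any modeling starts. This sketch assumes a small lookup table of known spelling variants; the `CANONICAL` mapping and the sample values are hypothetical.

```python
from collections import Counter

# Hypothetical mapping from raw spellings to one canonical category.
CANONICAL = {"sw": "southwest", "south west": "southwest", "ne": "northeast"}


def normalize(label):
    """Trim, lowercase, and map known variants to a canonical category."""
    if label is None or not label.strip():
        return None  # treat blanks as missing
    cleaned = label.strip().lower()
    return CANONICAL.get(cleaned, cleaned)


def profile(labels):
    """Count records per canonical category; missing values are counted under None."""
    return Counter(normalize(label) for label in labels)


# Made-up raw values for illustration.
raw_regions = ["Southwest", "SW", "south west ", "", None, "Northeast", "NE"]
print(profile(raw_regions))
```

A profile like this shows how many records are missing a value and whether variant spellings have been merged, which is part of the collecting, cleansing, and preparing work mentioned earlier.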
The Cognos BI Solution
 Integrating touch-points leads to a 360-degree view of your business.
 Many scored metrics are loaded via predictive models.
 Segmentation is useful for simplifying large flat dimensions.