Portrait Quadstone
Webinar:
Scorecard Secrets
Issue 5.2-1
Please join the teleconference call—any
problems, [email protected]
How to ask questions
Use Q&A (not Chat please):
– Click on the Q&A Panel icon at the bottom-right of your screen
– Type in your question

Webinar: Scorecard Secrets
Presenter: Patrick Surry, VP Technology
Agenda:
– Predictive modeling process
– How do you assess a given model (scorecard)?
– How do you pick the weights in the boxes?
– How do you pick the boxes for each field?
– How do you pick the fields?
– Why use a scorecard (why boxes?), e.g. vs. ‘traditional’ regression
Predictive Modeling Process
1. What is the business problem (what are we predicting)?
2. How is success measured (when is one model ‘better’ than another)?
3. What modeling approach to use?
4. Preprocessing:
– Variable creation
– Variable selection
– Variable transformation
5. Core solver: fitting algorithm to generate “best” model
6. Postprocessing to transform model output into desired prediction (score)
7. Final model
Typical Business Problems in Marketing
Objective field | Prediction      | Examples
Binary          | Probability     | Response, Risk, Churn
Continuous      | Predicted value | Spend, Value
(We’ll focus mainly on binary outcomes; approaches are similar for continuous case)
How do we measure success?
– The score given to a customer is equivalent to either:
  – The estimated probability of a binary outcome
  – The estimated value of a continuous outcome
– Sometimes we only care about performance at a cutoff score (e.g. a bank deciding to make a loan or not)
– Sometimes we only care about ranking or classifying customers (e.g. outbound marketing wants to call the customers most likely to buy first)
– Sometimes we care about a quantitative measure of accuracy (e.g. a bank wants to predict the level of reserves to keep against future bad loans)
How good is a given model?
– Nominal non-parametric measures: how good at a cutoff?
  – Two-by-two contingency tables
  – Information gain, chi-squared significance, Cramér's V
– Ranked non-parametric measures: how well-ordered?
  – Gini / ROC
  – Kolmogorov-Smirnov
– Parametric measures: how accurate for each customer?
  – Divergence statistic
  – Maximum-likelihood measures
  – Linear regression
  – Logistic regression
  – Probit regression
NB. We tend to choose what is mathematically tractable, not what is business-relevant; luckily the measures are typically highly correlated.
Scorecard performance
[Chart: count of goods and bads by score, with a cutoff splitting the population into regions A-D: A+B above the cutoff (accepted; A goods, B bads), C+D below]
– Accept rate (target rate) = (A+B)/(A+B+C+D)
– Bad rate among accepts (hit rate) = B/(A+B)
– Often we can directly assign a financial value to each category
– Ranked metrics (Gini, KS) measure how well the score sorts goods to the right and bads to the left
– Parametric metrics (R², MLE) measure how accurate each prediction is
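The ranked measures (KS and Gini) can be computed directly from a list of scores and outcomes. A minimal sketch, assuming no tied scores and the convention that outcome 1 = good, 0 = bad (the example data are invented):

```python
def ks_and_gini(scores, outcomes):
    """KS and Gini for a score that should rank goods high and bads low.

    Assumes no tied scores; outcomes are 1 (good) or 0 (bad).
    """
    pairs = sorted(zip(scores, outcomes))      # ascending score
    n_bad = sum(1 for _, y in pairs if y == 0)
    n_good = len(pairs) - n_bad
    cum_bad = cum_good = 0
    ks = 0.0
    concordant = 0                             # correctly ordered (good, bad) pairs
    for _, y in pairs:
        if y == 0:
            cum_bad += 1
        else:
            cum_good += 1
            concordant += cum_bad              # every bad seen so far scores lower
        ks = max(ks, abs(cum_bad / n_bad - cum_good / n_good))
    auc = concordant / (n_good * n_bad)        # area under the ROC curve
    return ks, 2 * auc - 1                     # Gini = 2*AUC - 1

# A perfectly separating score: all bads score below all goods.
print(ks_and_gini([1, 2, 3, 4], [0, 0, 1, 1]))  # (1.0, 1.0)
```

KS is the maximum gap between the cumulative bad and good distributions; Gini rescales AUC so that 0 means no ranking power and 1 means perfect separation.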
Scorecards
What is a scorecard?
fields   | bins     | scores (weights)
Age      | <25      | -100
         | 25-34    |  -50
         | 35-54    |    0
         | 55+      |  100
Income   | <20k     |  -25
         | 20k-30k  |    0
         | 30k+     |   50
Base Score          |  500
Apply scorecard
Customer ID | Age | Income | Score
C001        | 19  | 15000  | -100 - 25 + 500 = 375
C002        | 65  | 21000  |  100 +  0 + 500 = 600
C003        | 42  | 65000  |    0 + 50 + 500 = 550
Scorecard ingredients
– How do you assess a given scorecard (model)?
– How do you pick the weights in the boxes?
– How do you pick the boxes for each field?
– How do you pick the fields?
– Why a scorecard (why boxes?): scorecard vs regression
What numbers in the boxes?
Linear model
– Linear model (multiple regression), perhaps with manually transformed variables:
  y = w·x + b
  (e.g. y is Response; x is Age, Income; w are coefficients; b is the intercept)
– Scorecard builder doesn't implement this form
  – Even with continuous outcomes we use transformed inputs
[Diagram: Inputs (x, y) → Linear solver (Optimizer) → Output (w, b)]
Generalized linear model
– Generalized linear model (including a link function, e.g. f() as log-odds):
  f(y) = w·x + b, i.e. y = f⁻¹(w·x + b)
– Although the core is still linear, finding w to optimize the quality metric typically isn't
– Scorecard builder doesn't implement this form
[Diagram: Inputs (x, y) → Non-linear solver (Optimizer) → Output (w, b) → Output Transformation & Rescaling (Postprocessing)]
Generalized additive model
– Generalized additive model (arbitrary functions of the independent variables):
  f(y) = w·F(x) + b
– Scorecard builder uses a very simple class of functions F(x): a piecewise-constant fit of x to the observed outcome, or indicator variables
– For continuous variables, we use this form without the link function
– For binary variables, we always use the link function ("linear regression" just uses an approximation of the non-linear solution)
[Diagram: Inputs (x, y) → Variable Transformation (F) (Preprocessing) → Non-linear or linear-approximation solver (Optimizer) → Output (w, b) → Output Transformation & Rescaling (Postprocessing)]
Core Solver
[Pipeline: Variable Transformation → Core Solver → Output Transformation & Rescaling]
– Choose weights (the numbers in the boxes) to maximize the likelihood of observing the actual outcomes (based on the quality measure)
– Solver window: controls optimization parameters
– Singular value decomposition provides a robust solution with correlated variables
– Can still see sensitivity with very small categories (though this shouldn't impact predictions unless those categories become large when scoring)
Postprocessing
[Pipeline: Variable Transformation → Core Solver → Output Transformation & Rescaling]
– Model types: Risk, Response, Churn, Satisfaction
  – No change to 'core' statistics, just flipping signs and labels
– Scaling of the final score via two constants:
  – Even-odds point: log(odds) = 0 (50% likelihood)
  – Odds-doubling factor: log(odds) increment (e.g. +20 points doubles the odds)
– Core model always fits odds
  – Always 'logistic' form (except with the 'continuous' model)
  – Prediction y is rescaled as Ay + B to give the best logistic fit with the desired scaling
  – The "Linear regression" quality measure solves a linear approximation to the logistic
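The two scaling constants can be illustrated numerically. The values 500 (even-odds point) and 20 (odds-doubling factor) below are example choices, not product defaults:

```python
import math

# Sketch of score scaling via an even-odds point and an odds-doubling factor.
EVEN_ODDS_SCORE = 500   # score at which odds are 1:1 (50% likelihood)
POINTS_TO_DOUBLE = 20   # adding this many points doubles the odds

def probability_to_score(p):
    odds = p / (1 - p)
    return EVEN_ODDS_SCORE + POINTS_TO_DOUBLE * math.log2(odds)

def score_to_probability(s):
    odds = 2 ** ((s - EVEN_ODDS_SCORE) / POINTS_TO_DOUBLE)
    return odds / (1 + odds)

print(probability_to_score(0.5))   # 500.0 (the even-odds point)
print(probability_to_score(2/3))   # 520.0 (odds 2:1, one doubling above even)
```

Because the score is linear in log-odds, score differences have a fixed meaning anywhere on the scale: +20 points always doubles the odds, whether you start at 500 or at 600.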
What boxes for each field?
Variable transformation
[Pipeline: Variable Transformation → Core Solver → Output Transformation & Rescaling]
Generalized additive model: f(y) = w·F(x) + b
– What are the input variables x or F(x)?
– In traditional regression, x are raw variables, or manually transformed, e.g. Income, log(Income)
– In scorecard building, either:
  – One weight per bin: indicator (dummy) variables, one per bin in the source variable, representing bin membership
    – More fitting power (but also more free parameters)
  – One weight per field: continuous variables, one per field, transformed from the source variable based on the outcome rate in each bin
    – This implicit transform (to the observed bad rate) gives a significant advantage over "standard" regression techniques
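The "one weight per field" option replaces each bin of a source variable with the outcome rate observed in that bin. A minimal sketch with invented bin labels and outcomes (1 = bad):

```python
from collections import defaultdict

def bin_rate_transform(bins, outcomes):
    """Map each record's bin label to the bad rate observed in that bin."""
    totals, bads = defaultdict(int), defaultdict(int)
    for b, y in zip(bins, outcomes):
        totals[b] += 1
        bads[b] += y
    rate = {b: bads[b] / totals[b] for b in totals}
    return [rate[b] for b in bins]

ages = ["<25", "<25", "25-34", "25-34", "25-34", "55+"]
bad  = [1, 1, 1, 0, 0, 0]
print(bin_rate_transform(ages, bad))  # bad rate of each record's bin
```

The transformed variable is then fed to the solver as a single continuous input, so the field costs one free parameter instead of one per bin, at the price of fixing the bin-to-bin shape in advance.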
Sample Field Transformations
Field transformation options
Optimized binning
– Maximize a measure of (categorical) association with the outcome
– The default technique is a hierarchical merge
  – Similar to that used to generate a decision tree
  – Maximize the information gain at each step

Optimized binning
[Chart: Age binned with target number of bins = 5]
– Attempts to maximize univariate predictiveness, i.e. minimize loss of predictiveness
– Uses either iterative splitting (like a decision tree) or exhaustive search
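A hierarchical merge of the kind described above can be sketched as follows: start with one bin per raw value and repeatedly merge the adjacent pair that loses the least information, until the target bin count is reached. The entropy-based merge cost and tie-breaking here are illustrative assumptions, not the product's exact criterion:

```python
import math

def entropy(bads, total):
    """Binary entropy (bits) of the bad rate in a bin."""
    if total == 0 or bads in (0, total):
        return 0.0
    p = bads / total
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def merge_bins(bins, target):
    """bins: ordered list of (total_count, bad_count); merge down to `target`."""
    bins = list(bins)
    while len(bins) > target:
        # cost of merging adjacent pair i = weighted entropy after minus before
        def cost(i):
            (n1, b1), (n2, b2) = bins[i], bins[i + 1]
            after = (n1 + n2) * entropy(b1 + b2, n1 + n2)
            before = n1 * entropy(b1, n1) + n2 * entropy(b2, n2)
            return after - before
        i = min(range(len(bins) - 1), key=cost)
        (n1, b1), (n2, b2) = bins[i], bins[i + 1]
        bins[i : i + 2] = [(n1 + n2, b1 + b2)]
    return bins

# Four raw bins with bad rates 0.9, 0.8, 0.2, 0.1: merging to 2 bins
# groups the similar rates together.
print(merge_bins([(10, 9), (10, 8), (10, 2), (10, 1)], 2))  # [(20, 17), (20, 3)]
```

Merging the pair with the smallest entropy increase is equivalent to sacrificing the least information gain at each step, mirroring the greedy construction of a decision tree run in reverse.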
Summary: Scorecard vs traditional (linear) regression
– A scorecard is a generalized additive model based on indicator functions
– The model is thus piecewise constant in each of the independent variables
– It automatically captures non-linear relationships and is more robust to outliers
– Lift increases of 2% or more in real-world risk, response and retention modeling applications
– Simpler to build, understand, explain and socialize
What fields?
Scorecard Builder: stepwise inclusion / exclusion
– Why not use all available fields?
  – More fields typically increase training performance, but risk overfitting on test data
  – A larger model is more difficult to explain and socialize
  – Build time scales with the square of the number of (transformed) variables
– Build a set of trial scorecards:
  – Include a candidate field
  – Build a scorecard
  – Compute the quality measure
  – Include the next field…
– Choose the field that creates the best trial scorecard
  – Linear Fit uses residuals to compute the marginal sum-of-squares error
  – Quality Measure uses a hybrid
  – Similar to the traditional -score technique
“Right-Size” Scorecard
Select the point at which model quality exhibits diminishing returns on test data:
1. Generate a test/training split
2. Build a logistic model on the training data using the remaining variables (initially all)
3. Measure quality (Gini measure) when applied to the test data
4. Exclude the least contributory variable (based on the training data)
5. Repeat from step 2 until no variables remain
6. Choose the last point where test-set performance increases by a minimum threshold
7. Refit the model with the selected number of variables using all data
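The backward-elimination loop (steps 2-6) can be sketched as below. The callables `build_model`, `test_gini` and `contribution` stand in for the product's internals, and the toy example at the bottom is invented purely for illustration:

```python
def right_size(variables, build_model, test_gini, contribution, min_gain=0.5):
    """Return the number of variables at the point of diminishing returns."""
    remaining = list(variables)
    history = []                                   # (n_vars, test-set Gini)
    while remaining:                               # step 5: loop until empty
        model = build_model(remaining)             # step 2: fit on training data
        history.append((len(remaining), test_gini(model)))   # step 3
        weakest = min(remaining, key=lambda v: contribution(model, v))
        remaining.remove(weakest)                  # step 4: drop weakest variable
    history.sort()                                 # ascending model size
    best_n = history[0][0]
    for (_, g0), (n1, g1) in zip(history, history[1:]):
        if g1 - g0 >= min_gain:                    # step 6: last worthwhile gain
            best_n = n1
    return best_n

# Toy example: each variable adds a fixed amount of test Gini; the two
# "noise" variables add almost nothing, so the right size is 3.
VALUE = {"age": 20.0, "income": 15.0, "tenure": 8.0, "noise1": 0.1, "noise2": 0.05}
best = right_size(
    variables=list(VALUE),
    build_model=lambda vars_: frozenset(vars_),
    test_gini=lambda m: sum(VALUE[v] for v in m),
    contribution=lambda m, v: VALUE[v],
)
print(best)  # 3
```

Note that elimination order uses the (training-data) contribution while the stopping point uses test-set quality, matching the split of roles in steps 4 and 6.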
Response Model Right-sizing
[Chart: Gini% (roughly 30-48) vs. number of independent variables (0-50), for Train and Test data]
Automating scorecard building workflow
[Workflow: Analytic Dataset (+ optional parameter overrides) → Optimize Binnings → Variable Reduction → Right-size Model → Final Model]
– Optimized binning of each independent variable: "best" piecewise-constant transformation
– Many variables (1000s) require automated tools to help focus effort: time is money
– Variable reduction using recursive step-wise exclusion
– Model "right-sizing" by seeking the point of diminishing returns on test data
– To perform the same steps by hand would take several days (or weeks); automation completes in minutes
Further Reading
– Generalized additive models (GAM)
– Generalized linear models (GLM)
– Gini
– ROC
– Kolmogorov-Smirnov
– Probit
– Logit
– Singular value decomposition (SVD)
– McCullagh P, Nelder JA, Generalized Linear Models (2nd edition), Chapman and Hall, 1989.
Any questions?
After the webinar
– These slides and a recording of this webinar will be available via http://support.quadstone.com/info/events/webinars/
– Any problems or questions, please contact [email protected]
Upcoming webinars
See http://support.quadstone.com/info/events/webinars/
www.portraitsoftware.com