Comparing Time Series, Neural
Nets and Probability Models for
New Product Trial Forecasting
• Eugene Brusilovskiy
• Ka Lok Lee
• These slides are based on the authors’
presentation at the 4th Annual Hawaii
International Conference on Statistics,
Mathematics, and Related Fields
Problem Introduction
• Goal: To predict future sales using sales information from
an introductory period
• Product: A new (unnamed) soft beverage that was
introduced to a test market
• Data: We have 52 weeks of sales data, which we split into
training (first 39 weeks) and validation (last 13 weeks)
datasets
– We build the models using the training dataset and
then examine how well the models predict sales in the
last 13 weeks
• The methods employed here apply to predicting the sales
of any newly introduced consumer good
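A minimal Python sketch of the training/validation split described above. The series is a hypothetical placeholder (the actual test-market figures are not reproduced in these slides), and the helper name `split_series` is illustrative only.

```python
def split_series(series, n_train):
    """Split a weekly series into training and validation parts, in time order."""
    if not 0 < n_train < len(series):
        raise ValueError("n_train must be between 1 and len(series) - 1")
    return series[:n_train], series[n_train:]


# Hypothetical placeholder: 52 weekly cumulative sales values (not the real data).
cumulative_sales = [float(week) for week in range(1, 53)]

# First 39 weeks for model building, last 13 weeks held out for validation.
train, validation = split_series(cumulative_sales, n_train=39)
print(len(train), len(validation))
```

The split is by position rather than at random, so the validation weeks always come after the training weeks, as they do in a real forecasting setting.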
Prediction Methods Used
• Time Series
– Most common technique, available in almost every
statistics package
• Neural Nets
– Extensive data-mining tool (requires expensive
software)
• Probability Modeling
– Not always available in standard statistical packages,
may be coded in Excel
[Figure: Training Data – Cumulative Sales for the First 39 Weeks. Vertical axis: Cumulative Sales (Units Sold), 0 to 180; horizontal axis: Week, 1 to 52; training cutoff marked at T = 39.]
Time Series
• A time-series (TS) model accounts for patterns in the
past movements of a variable and uses that information
to predict its future movements. In a sense a time-series
model is just a sophisticated method of extrapolation
(Pindyck and Rubinfeld, 1998).
Time Series
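• Autoregressive Moving Average Model: ARMA(1,1) –
generally recognized to be a good approximation for
many observed time series
$$y_t = \phi\, y_{t-1} + \varepsilon_t - \theta\, \varepsilon_{t-1}$$
or
$$(1 - \phi B)\, y_t = (1 - \theta B)\, \varepsilon_t$$
where B is the backshift operator ($B y_t = y_{t-1}$)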
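To make the ARMA(1,1) recursion concrete, here is a hedged Python sketch of multi-step forecasting for a zero-mean series. It assumes the sign convention y_t = phi*y_{t-1} + e_t - theta*e_{t-1} (some texts use + theta instead), assumes the pre-sample values are zero, and takes phi and theta as given; in practice they would be estimated by the statistics software. The function name and the numbers in the usage lines are hypothetical.

```python
def arma11_forecast(y, phi, theta, horizon):
    """Forecast a zero-mean ARMA(1,1): y_t = phi*y_{t-1} + e_t - theta*e_{t-1}."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    # Recover the shocks recursively, assuming y_0 = 0 and e_0 = 0 before the sample.
    y_prev, eps_prev = 0.0, 0.0
    for value in y:
        eps = value - phi * y_prev + theta * eps_prev
        y_prev, eps_prev = value, eps
    # The one-step forecast still uses the last shock; future shocks have mean zero,
    # so every later step only multiplies the previous forecast by phi.
    forecasts = [phi * y_prev - theta * eps_prev]
    for _ in range(horizon - 1):
        forecasts.append(phi * forecasts[-1])
    return forecasts


# Hypothetical data and parameter values, for illustration only.
forecasts = arma11_forecast([1.0, 0.5], phi=0.5, theta=0.2, horizon=3)
print(forecasts)
```

When |phi| < 1, the forecasts decay geometrically toward the series mean (zero here) as the horizon grows.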
Neural Networks
• A Neural Network (NN) is an information processing
paradigm inspired by the way the brain processes
information (Stergiou and Siganos, 1996).
• A Multi-Layer Perceptron (MLP) is used here
Neural Networks
• A Neural Network consists of neuron layers of 3 types:
– Input layer
– Hidden layer
– Output layer
• We use two models with different MLP architectures: a
model with one hidden layer and a model with a skip
layer
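The two architectures can be sketched as a single forward pass in plain Python. This is an illustration under stated assumptions, not the configuration used for the results: "skip layer" is taken to mean direct input-to-output connections added to the hidden-layer path, the hidden units use tanh, the output is linear, and all weights are hypothetical.

```python
import math


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out, w_skip=None):
    """One-hidden-layer MLP with a single linear output.

    If w_skip is given, the inputs also feed the output directly (skip layer).
    """
    # Hidden layer: weighted sum of the inputs plus bias, passed through tanh.
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # Output layer: linear combination of the hidden activations.
    output = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    # Optional skip layer: direct linear contribution of the inputs.
    if w_skip is not None:
        output += sum(w * xi for w, xi in zip(w_skip, x))
    return output


# Hypothetical weights: one hidden unit, two inputs.
x = [1.0, 2.0]
plain = mlp_forward(x, [[0.0, 0.0]], [0.0], [1.0], 0.5)
with_skip = mlp_forward(x, [[0.0, 0.0]], [0.0], [1.0], 0.5, w_skip=[1.0, 1.0])
print(plain, with_skip)
```

With the skip connections, the network contains an ordinary linear model as a special case, and the hidden layer only has to capture the nonlinear remainder.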
Neural Networks (cont’d)
Given the rule on the left, we deduce the pattern on the right:
[Figure: a rule and the patterns deduced from it, involving X1, X2, X3 and X combined with AND / or]
Neural Networks
Structure of Neural Net Models: [Figure: network structure diagram]
Neural Networks
• Neural Networks are especially useful for problems
where
– Prediction is more important than explanation
– A large amount of training data is available
– No mathematical formula that relates inputs to
outputs is known
• Source: SAS Enterprise Miner Reference Help.
Neural Network Node: Reference
Probability Modeling
• Probability models:
– Are representations of individual buying behavior
– Provide structural insight into the ways in which
consumers make purchase decisions (Massy et al., 1970)
• Specific assumptions about the purchase process and latent
propensity (Bayesian flavor)
• Explicit consideration of unobserved heterogeneity
Probability Modeling
• Individual purchase time (time to trial) is modeled by a
“diffusion model”
• The model used is the Exponential-Gamma (EG), also known as the
Pareto distribution of the second kind (Hardie et al., 2003)
• Time to trial ~ Exponential (λ)
• λ ~ Gamma (r, α)
$$F(t) = \int_0^\infty \left(1 - e^{-\lambda t}\right) \frac{\alpha^r \lambda^{r-1} e^{-\alpha\lambda}}{\Gamma(r)}\, d\lambda$$
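The two-stage structure (a Gamma-distributed rate per consumer, then an exponential time to trial) can be checked by simulation. The sketch below assumes Gamma(r, α) is parameterised by shape r and rate α, and compares the simulated share of consumers who have tried by time t with the closed-form value 1 − (α/(α+t))^r that the mixture integral evaluates to. The parameter values and the function name are hypothetical.

```python
import random


def simulate_trial_fraction(t, r, alpha, n=20000, seed=0):
    """Monte Carlo estimate of P(time to trial <= t) under the EG model."""
    rng = random.Random(seed)
    tried = 0
    for _ in range(n):
        # random.gammavariate takes (shape, scale); a rate of alpha is a scale of 1/alpha.
        lam = rng.gammavariate(r, 1.0 / alpha)
        # Given this consumer's rate, the time to trial is exponential.
        if rng.expovariate(lam) <= t:
            tried += 1
    return tried / n


# Hypothetical parameter values, for illustration only.
t, r, alpha = 5.0, 2.0, 10.0
simulated = simulate_trial_fraction(t, r, alpha)
closed_form = 1.0 - (alpha / (alpha + t)) ** r
print(simulated, closed_form)
```

The two numbers should agree up to sampling error, which shrinks as n grows.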
Probability Modeling
• After solving the integral, the cumulative probability
function becomes:
• F(t) = $1 - \left(\dfrac{\alpha}{\alpha + t}\right)^{r}$
• LL = $\sum_{t=1}^{T} \text{Sales}_t \cdot \ln\!\left[\dfrac{F(t) - F(t-1)}{F(T)}\right]$
• Estimation uses Excel Solver
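As a rough Python counterpart to the spreadsheet estimation, the sketch below codes the EG cumulative function and the log-likelihood, then picks the best (r, α) pair from a coarse grid. The grid search is a stand-in for Excel Solver, not the authors' procedure. Sales_t is assumed to mean incremental units sold in week t, and the data, grids and function names are hypothetical.

```python
import math


def eg_cdf(t, r, alpha):
    """Exponential-gamma CDF: F(t) = 1 - (alpha / (alpha + t))**r."""
    return 1.0 - (alpha / (alpha + t)) ** r


def eg_log_likelihood(weekly_sales, r, alpha):
    """LL = sum_t Sales_t * ln[(F(t) - F(t-1)) / F(T)], with T = len(weekly_sales)."""
    f_total = eg_cdf(len(weekly_sales), r, alpha)
    ll = 0.0
    for t, sales in enumerate(weekly_sales, start=1):
        prob = (eg_cdf(t, r, alpha) - eg_cdf(t - 1, r, alpha)) / f_total
        ll += sales * math.log(prob)
    return ll


def fit_eg_grid(weekly_sales, r_grid, alpha_grid):
    """Return (r, alpha, LL) for the grid point with the highest log-likelihood."""
    best = max(
        (eg_log_likelihood(weekly_sales, r, a), r, a)
        for r in r_grid
        for a in alpha_grid
    )
    return best[1], best[2], best[0]


# Hypothetical incremental sales generated from known parameters (r = 1, alpha = 5).
sales = [1000.0 * (eg_cdf(t, 1.0, 5.0) - eg_cdf(t - 1, 1.0, 5.0)) for t in range(1, 11)]
r_hat, alpha_hat, ll_hat = fit_eg_grid(sales, [0.5, 1.0, 1.5, 2.0], [2.5, 5.0, 7.5, 10.0])
print(r_hat, alpha_hat)
```

A numerical optimiser would replace the grid in practice; the grid keeps the example dependency-free and makes the shape of the objective easy to inspect.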
Results
• All three models do a relatively good job predicting future
sales, but Exponential Gamma is the best
Mean Absolute Percentage Error (MAPE)

Model          MAPE
Exp. Gamma     2.7%
Neural Nets    9.0%
Time Series    5.5%

$$\text{MAPE} = \frac{1}{T} \sum_{t=1}^{T} \frac{\left|\text{Actual Sales}_t - \text{Predicted Sales}_t\right|}{\text{Actual Sales}_t}$$
where T is the number of validation weeks (T = 13) and t = 1 is the
first validation week (week 40)
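The MAPE calculation can be sketched in a few lines of Python. The function name and the numbers in the usage lines are hypothetical and are not the study's data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error over paired actual/predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is the absolute error as a fraction of the actual value.
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical validation-period values, for illustration only.
error = mape([100.0, 200.0], [90.0, 220.0])
print(f"{error:.1%}")
```

Because each error is divided by the actual value, the measure is undefined when an actual value is zero; that is not an issue for cumulative sales that are already positive by week 40.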
New Product Sales – Results
[Figure: Cumulative Sales (Units Sold) by Week, weeks 1 to 52, for Actual, Exp. Gamma, Neural Nets, and Time Series; training cutoff marked at T = 39.]
Time Series - Results
• Captures “jumps” in the training data
• Forecasts no additional sales (the product is “dead”),
which is an extreme case
[Figure: Forecast vs. Actual for the time-series model, weeks 1 to 52]
Neural Nets - Results
• Can sometimes be over-responsive to “jumps” in training
data
[Figure: Actual vs. Forecast for the neural-net model, weeks 1 to 52]
Probability Model - Results
• Overall, the best method
• Furthermore, allows the analyst to make statements
about the consumers in the market
[Figure: Actual vs. Forecast for the probability model, weeks 1 to 52]
Next Steps
• Include covariates
• Different training periods
• Perform comparative analysis for other areas of
forecasting
– Customer Lifetime Value
References
• Hardie, B.G.S., Zeithammer, R., and Fader, P. (2003),
Forecasting New Product Trial in a Controlled Test
Market Environment, Journal of Forecasting, 22: 391–410.
• Massy, W.F., Montgomery, D.B., and Morrison, D.G.
(1970), Stochastic Models of Buying Behavior, The M.I.T.
Press, 464 pp.
• Pindyck, R.S. and Rubinfeld, D.L. (1998), Econometric
Models and Economic Forecasts, Irwin/McGraw-Hill.
• SAS Enterprise Miner Reference Help. Article: Neural
Network Node: Reference.
• Stergiou, C. and Siganos, D. (1996), Introduction to Neural
Networks. Available online at
www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html