6.1.4 AIC, Model Selection, and the Correct Model

o Any model is a simplification of reality
o If a model has relatively little bias, it tends to provide accurate
estimates of the quantities of interest
o The best model is often the simplest adequate one (fewer parameters): model parsimony
Akaike Information Criterion (AIC): an alternative to significance testing when the
goal is to estimate quantities of interest
o Criterion for choosing between competing statistical models
o AIC judges a model by how close its fitted values tend to be to the
true values
o The AIC selects the model that minimizes:
AIC = -2(maximized log likelihood – # parameters in the model)
o This penalizes a model for having too many parameters
o Serves the purpose of model comparison only; it does not provide a
diagnostic of how well the model fits the data
In SAS, the same quantity is reported as:
AIC = -2LogL + 2p, where p is the number of parameters in the model
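A minimal sketch of the criterion above, in Python for illustration only. The function names aic, aic_sas, and best_by_aic are hypothetical and are not part of SAS. The two -2 Log L values are the ones printed in the Model Fit Statistics output for the crab data later in this section. The parameter count p = 8 for the full model is an assumption, chosen because it reproduces the AIC of 201.202 printed there.

```python
def aic(max_log_lik, n_params):
    """AIC = -2 * (maximized log likelihood - number of parameters)."""
    return -2.0 * (max_log_lik - n_params)


def aic_sas(neg2_log_l, n_params):
    """The same quantity in the form SAS prints: AIC = -2 Log L + 2p."""
    return neg2_log_l + 2.0 * n_params


def best_by_aic(candidates):
    """Return the name of the candidate model with the smallest AIC.

    candidates maps a model name to (maximized log likelihood, n_params).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


# log L is recovered from the printed -2 Log L by dividing by -2.
candidates = {
    "intercept only": (-225.759 / 2, 1),
    "width weight color spine": (-185.202 / 2, 8),  # p = 8 is an assumption
}

for name, (log_lik, p) in candidates.items():
    # Both forms of the formula give the same number.
    print(name, round(aic(log_lik, p), 3), round(aic_sas(-2 * log_lik, p), 3))

print("smallest AIC:", best_by_aic(candidates))
```

The comparison is relative: a smaller AIC says one candidate is preferred to another, not that either one fits the data well.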
Crab Example:
Table 6.2 (p. 215): The best models have the smallest AICs
o Best models have the main effects COLOR and WIDTH (AIC = 197.5)
PROC LOGISTIC (Backward Elimination):

proc logistic descending ;
  class color spine / param = ref ;
  model y = width weight color spine / selection = backward lackfit ;
run ;

Backward Elimination Procedure
Step 0. The following effects were entered: Intercept width weight color spine
Step 1. Effect spine is removed
Step 2. Effect weight is removed

Model Fit Statistics
Criterion    Intercept Only    Intercept and Covariates
AIC          227.759           201.202
-2 Log L     225.759           185.202

Analysis of Maximum Likelihood Estimates
Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept    1     -12.3508    2.6287            22.0749            <.0001
width        1       0.4972    0.1017            23.8872            <.0001
In our case, the intercept-only AIC is the same at every step:
AIC = -2LogL + 2p = 225.759 + 2(1) = 227.759,
where p = 1
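The arithmetic above can be run in reverse to see how many parameters each column of the Model Fit Statistics table implies. This is a small sketch; implied_n_params is a hypothetical helper name, and only the four printed values are used.

```python
def implied_n_params(aic, neg2_log_l):
    """Solve AIC = -2 Log L + 2p for p."""
    return (aic - neg2_log_l) / 2.0


# (AIC, -2 Log L) pairs copied from the Model Fit Statistics output.
fit_stats = {
    "Intercept Only": (227.759, 225.759),
    "Intercept and Covariates": (201.202, 185.202),
}

implied = {
    column: round(implied_n_params(aic, neg2_log_l))
    for column, (aic, neg2_log_l) in fit_stats.items()
}

for column, p in implied.items():
    print(column, "-> p =", p)
```

The second column works out to p = 8. That is consistent with an intercept, width, weight, three reference-coded indicators for color, and two for spine, assuming color has four levels and spine has three in the crab data.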
6.1.5 Using Causal Hypotheses to Guide Model Building
o Rather than relying on selection techniques such as stepwise procedures, which
look only at the significance level of each parameter, use theory and common sense
to build the model (add and remove parameters where it makes sense to do so)
o A time ordering among variables may suggest causal relationships
Example: (Table 6.3, p. 217)
In a British study, 1036 men and women (married and divorced) were asked whether
they had had premarital and/or extramarital sex. We want to determine whether G =
gender, P = premarital sex, and E = extramarital sex are factors in M = marital
status (married or divorced).
Simple Model :
G→P→E→M
Any of these is an explanatory variable when a variable listed to its right is the response
Complex Model (Triangular) : (Fig. 3.1, p. 218)
1st stage : predicts G has a direct effect on P
2nd stage : predicts P and G have direct effects on E
3rd stage : predicts E has direct effect on M ; P has direct and indirect effects on M;
G has indirect effects through P and E
Table 6.4 : Goodness of Fit Tests for Model Selection
1st Stage: predicts Gender has a direct effect on Premarital Sex
θ̂ = (100 × 219) / (141 × 576) = 0.27
The estimated odds of premarital sex for females are 0.27 times those for males.
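The odds ratio above can be checked directly from the 2 × 2 counts for gender by premarital sex. This is a minimal sketch in Python; the helper names odds and odds_ratio are hypothetical.

```python
def odds(yes, no):
    """Odds of a 'yes' response: count of yes divided by count of no."""
    return yes / no


def odds_ratio(yes_a, no_a, yes_b, no_b):
    """Odds of 'yes' in group a divided by the odds of 'yes' in group b."""
    return odds(yes_a, no_a) / odds(yes_b, no_b)


# Premarital sex (yes, no) counts by gender, from the table in these notes.
female_yes, female_no = 100, 576
male_yes, male_no = 141, 219

theta_hat = odds_ratio(female_yes, female_no, male_yes, male_no)

print("female odds:", round(odds(female_yes, female_no), 3))
print("male odds:  ", round(odds(male_yes, male_no), 3))
print("odds ratio: ", round(theta_hat, 2))
```

Dividing the two odds is algebraically the same as the cross-product form (100 × 219) / (141 × 576).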
data causal2 ;
  input gender $ PMS TOTALPMS ;
  datalines ;
F 100 676
M 141 360
;
run ;

Model (Response P, no Actual Explanatory)
PROC GENMOD DATA = CAUSAL2 DESCENDING ;
  CLASS GENDER ;
  MODEL PMS/TOTALPMS = / DIST = BIN LINK = LOGIT ;
RUN ;

Model (Response P, Actual Explanatory G)
PROC GENMOD DATA = CAUSAL2 DESCENDING ;
  CLASS GENDER ;
  MODEL PMS/TOTALPMS = GENDER / DIST = BIN LINK = LOGIT TYPE3 RESIDUALS OBSTATS ;
RUN ;
               PMS
GENDER    Yes    No     Total
Female    100    576     676
Male      141    219     360
Total     241    795    1036
Model (Response P, no Actual Explanatory)
Criteria For Assessing Goodness Of Fit
Criterion             DF    Value        Value/DF
Deviance              1     75.2594      75.2594
Pearson Chi-Square    1     78.1753      78.1753
Log Likelihood              -561.9568

Model (Response P, Actual Explanatory G)
Criteria For Assessing Goodness Of Fit
Criterion             DF    Value        Value/DF
Deviance              0     0.000
Pearson Chi-Square    0     0.000
Log Likelihood              -524.3271
Goodness of Fit as a Likelihood-Ratio Test
The L-R statistic -2(L0 – L1) tests whether certain model parameters are zero by comparing
the log likelihood L1 for the fitted model M1 with L0 for the simpler model M0
(formula p. 187).
For the example, we will use the fact that -2(L0 – L1) = G2(M0) - G2(M1), reading both
sides from the SAS output.
1st Stage:
G2 = G2(M0) - G2(M1) = 75.2594 – 0.0000 = 75.2594
-2(L0 – L1) = -2(-561.9568 – (-524.3271)) = 75.2594
Df = 1 – 0 = 1, so χ2 p-value < .001 and there is evidence of a gender
effect on premarital sex, suggesting that having G as an explanatory variable
is a better model.
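As a check on the 1st-stage test, the two log likelihoods can be recomputed from the counts. The sketch below is Python, not SAS, and rests on one assumption: that the log likelihood is the binomial kernel, with the binomial coefficients left out. Under that assumption the results agree with the reported values to rounding. The helper names are hypothetical. The df = 1 chi-squared tail probability uses the identity P(X > x) = erfc(sqrt(x/2)).

```python
from math import erfc, log, sqrt


def binom_kernel_loglik(groups, probs):
    """Sum of y*log(p) + (n - y)*log(1 - p); binomial coefficients omitted."""
    return sum(
        y * log(p) + (n - y) * log(1.0 - p)
        for (y, n), p in zip(groups, probs)
    )


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-squared variable with 1 df."""
    return erfc(sqrt(x / 2.0))


# (premarital sex = yes, group total) for females and males.
groups = [(100, 676), (141, 360)]

# M0: no explanatory variable, one common probability for both genders.
p_common = sum(y for y, _ in groups) / sum(n for _, n in groups)
L0 = binom_kernel_loglik(groups, [p_common] * len(groups))

# M1: gender as explanatory variable, one probability per gender.
L1 = binom_kernel_loglik(groups, [y / n for y, n in groups])

G2 = -2.0 * (L0 - L1)
p_value = chi2_sf_df1(G2)

print("L0 =", round(L0, 4))
print("L1 =", round(L1, 4))
print("G2 =", round(G2, 4), "p-value =", p_value)
```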
2nd Stage: predicts Gender and Premarital Sex have direct effects on Extramarital Sex

                       EMS
GENDER    PMS    Yes    No     Total
Female    Yes     21     79     100
Female    No      40    536     576
Male      Yes     39    102     141
Male      No      21    198     219
data causal3 ;
  input gender $ PMS $ EMS TOTALEMS ;
  datalines ;
F Y 21 100
F N 40 576
M Y 39 141
M N 21 219
;
run ;

Model (Response E, no Actual Explanatory)
PROC GENMOD DATA = CAUSAL3 DESCENDING ;
  CLASS GENDER PMS ;
  MODEL EMS/TOTALEMS = / DIST = BIN LINK = LOGIT TYPE3 RESIDUALS OBSTATS ;
RUN ;

Model (Response E, P Actual Explanatory)
PROC GENMOD DATA = CAUSAL3 ;
  CLASS GENDER PMS ;
  MODEL EMS/TOTALEMS = PMS / DIST = BIN LINK = LOGIT TYPE3 RESIDUALS OBSTATS ;
RUN ;

Model (Response E, G+P Actual Explanatory)
PROC GENMOD DATA = CAUSAL3 DESCENDING ;
  CLASS GENDER PMS ;
  MODEL EMS/TOTALEMS = GENDER PMS / DIST = BIN LINK = LOGIT TYPE3 RESIDUALS OBSTATS ;
RUN ;
Model (Response E, no Actual Explanatory)
Criteria For Assessing Goodness Of Fit
Criterion             DF    Value        Value/DF
Deviance              3     48.9244      16.3081
Pearson Chi-Square    3     56.7739      18.9246
Log Likelihood              -373.4687

Model (Response E, P Actual Explanatory)
Criteria For Assessing Goodness Of Fit
Criterion             DF    Value        Value/DF
Deviance              2     2.9080       1.4540
Pearson Chi-Square    2     2.9542       1.4771
Log Likelihood              -350.4605

Model (Response E, G+P Actual Explanatory)
Criteria For Assessing Goodness Of Fit
Criterion             DF    Value        Value/DF
Deviance              1     0.0008       0.0008
Pearson Chi-Square    1     0.0008       0.0008
Log Likelihood              -349.0069
Model E = 1 vs. E = P
G2(M0) - G2(M1) = 48.9244 – 2.9080 = 46.016
-2(L0 – L1) = -2(-373.4687 – (-350.4605)) = 46.016
df = 3-2 = 1, so χ2 p-value < .001, so there is
evidence of a P effect on E
Model E = P vs. E = G+P
G2 = G2(M0) - G2(M1) = 2.9080 - .0008 = 2.9
df = 2-1 = 1, so χ2 .05 < p-value < .10, so only weak
evidence occurs that G had a direct effect as
well as an indirect effect on E. So E = P is a
sufficient model.
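The two 2nd-stage comparisons follow the same recipe: subtract deviances, subtract residual df, and refer the difference to a chi-squared distribution. This is a sketch in Python using only the deviances reported above. compare_nested is a hypothetical helper, and it is restricted to df = 1 so that the tail probability can be computed with the standard library.

```python
from math import erfc, sqrt


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-squared variable with 1 df."""
    return erfc(sqrt(x / 2.0))


def compare_nested(dev_simple, df_simple, dev_complex, df_complex):
    """Likelihood-ratio comparison of two nested models via their deviances."""
    g2 = dev_simple - dev_complex
    df = df_simple - df_complex
    if df != 1:
        raise ValueError("this sketch only handles comparisons with df = 1")
    return g2, df, chi2_sf_df1(g2)


# (deviance, residual df) from the GENMOD output for each model of E.
deviances = {
    "E = 1": (48.9244, 3),
    "E = P": (2.9080, 2),
    "E = G + P": (0.0008, 1),
}

results = {}
for simple, richer in [("E = 1", "E = P"), ("E = P", "E = G + P")]:
    results[(simple, richer)] = compare_nested(*deviances[simple], *deviances[richer])
    g2, df, p = results[(simple, richer)]
    print(f"{simple} vs. {richer}: G2 = {g2:.4f}, df = {df}, p-value = {p:.4g}")
```

The first comparison gives a p-value far below .001. The second, with G2 of about 2.9, gives a p-value between .05 and .10, which is why the evidence for a direct G effect on E is described as weak.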
3rd stage: predicts Extramarital Sex has direct effect on Marriage; Premarital Sex has direct and
indirect effects on Marriage; Gender has indirect effects through PMS and EMS

                             Divorced
GENDER    PMS    EMS    Yes    No     Total
Female    Yes    Yes     17      4      21
Female    Yes    No      54     25      79
Female    No     Yes     36      4      40
Female    No     No     214    322     536
Male      Yes    Yes     28     11      39
Male      Yes    No      60     42     102
Male      No     Yes     17      4      21
Male      No     No      68    130     198
data causal ;
  input gender $ PMS $ EMS $ DIVORCED TOTAL ;
  datalines ;
F Y Y 17 21
F Y N 54 79
F N Y 36 40
F N N 214 536
M Y Y 28 39
M Y N 60 102
M N Y 17 21
M N N 68 198
;
run ;

Model M = E + P vs. M = E*P
G2 = G2(M0) - G2(M1) = 18.1596 – 5.2455 = 12.91, with df = 5-4 = 1,
so χ2 p-value < .001, so the model with the interaction EMS*PMS is a better model to predict Divorce

Model M = E*P vs. M = E*P + G
G2 = G2(M0) - G2(M1) = 5.2455 - .6978 = 4.5477, with df = 4-3 = 1,
so χ2 .025 < p-value < .05, so adding G to the interaction model EMS*PMS fits slightly better.
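The deviance of the model M = E*P can be reproduced by hand, because that model is saturated in P and E. Its fitted divorce probability in each (PMS, EMS) cell is simply the proportion divorced after pooling over gender. The sketch below does this in Python from the eight data lines. The deviances 18.1596 and 0.6978 for the other two models are copied from the notes rather than recomputed, since those fits need an iterative algorithm. Helper names are hypothetical.

```python
from math import erfc, log, sqrt

# (gender, PMS, EMS, divorced, total), copied from data set causal.
rows = [
    ("F", "Y", "Y", 17, 21), ("F", "Y", "N", 54, 79),
    ("F", "N", "Y", 36, 40), ("F", "N", "N", 214, 536),
    ("M", "Y", "Y", 28, 39), ("M", "Y", "N", 60, 102),
    ("M", "N", "Y", 17, 21), ("M", "N", "N", 68, 198),
]


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-squared variable with 1 df."""
    return erfc(sqrt(x / 2.0))


def deviance(counts, probs):
    """Binomial deviance: 2 * sum of observed * log(observed / fitted)."""
    total = 0.0
    for (y, n), p in zip(counts, probs):
        if y > 0:
            total += y * log(y / (n * p))
        if y < n:
            total += (n - y) * log((n - y) / (n * (1.0 - p)))
    return 2.0 * total


# Pool divorced and total counts over gender within each (PMS, EMS) cell.
pooled = {}
for _gender, pms, ems, y, n in rows:
    d, t = pooled.get((pms, ems), (0, 0))
    pooled[(pms, ems)] = (d + y, t + n)

counts = [(y, n) for _g, _p, _e, y, n in rows]
probs = [pooled[(pms, ems)][0] / pooled[(pms, ems)][1]
         for _g, pms, ems, _y, _n in rows]

dev_ep = deviance(counts, probs)
df_ep = len(rows) - len(pooled)  # 8 binomial groups minus 4 cell parameters

g2_interaction = 18.1596 - dev_ep  # M = E + P vs. M = E*P
g2_gender = dev_ep - 0.6978        # M = E*P vs. M = E*P + G

print("deviance of M = E*P:", round(dev_ep, 4), "on", df_ep, "df")
print("interaction:", round(g2_interaction, 2), chi2_sf_df1(g2_interaction))
print("adding G:   ", round(g2_gender, 4), chi2_sf_df1(g2_gender))
```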
Conclusion for Causal Relationships
Using common sense to hypothesize relationships is a good alternative to automated selection for model building
6.1.6 New Model-Building Strategies for Data Mining
o Data mining is the analysis of huge data sets, in order to find
previously unsuspected relationships which are of interest or value
o Model Building is challenging
o There are alternatives to traditional statistical methods, such as
automated algorithms that ignore concepts such as sampling error and
modeling
o Significance tests are usually irrelevant, since nearly any variable has a
significant effect if n is sufficiently large
o For large n, inference is less relevant than summary measures of
predictive power
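A small numerical illustration of the point about large n, in Python. The two proportions (0.51 and 0.50) and the sample sizes are hypothetical, chosen only to show the pattern. The test is the usual two-sample z test for equal proportions with a pooled standard error, and it assumes the sample proportions stay fixed while n grows.

```python
from math import erfc, sqrt


def two_proportion_z(p1, p2, n_per_group):
    """z statistic for equal proportions, n_per_group observations per group."""
    pooled = (p1 + p2) / 2.0
    se = sqrt(pooled * (1.0 - pooled) * 2.0 / n_per_group)
    return (p1 - p2) / se


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return erfc(abs(z) / sqrt(2.0))


p1, p2 = 0.51, 0.50  # hypothetical sample proportions: a one-point difference

for n in (100, 10_000, 1_000_000):
    z = two_proportion_z(p1, p2, n)
    print(f"n per group = {n:>9,}: z = {z:6.2f}, p-value = {two_sided_p(z):.3g}")
```

The effect size is the same in every row; only n changes. The z statistic grows in proportion to the square root of n, so the same one-point difference moves from clearly non-significant to overwhelmingly significant. This is why a summary of predictive power is more informative than a p-value for very large data sets.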