Multiple Regression
From last time
• There were questions about the bowed shape of the
confidence limits around the regression line. The limits
around both the mean and the individuals are curved
because the regression line itself is an estimate.
• In theory, if you knew the population values, you would not
need to bow the CLs around the individuals. You would just
shift the density along the regression line.
CLs
• You do not have the same precision across the
entire range.
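• The CLs around the mean are based on:
$$\hat{y}_x = \bar{y} + b(x - \bar{x})$$
$$\operatorname{var}[\bar{y} + b(x - \bar{x})] = \operatorname{var}(\bar{y}) + (x - \bar{x})^2 \operatorname{var}(b)$$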
• The CLs for a new person have the distance from
the mean of x in the formula as well:
$$s^2\left[1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right]$$
where s^2 is the overall estimated variance in y.
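To see the bow numerically, here is a minimal Python sketch of the two variance formulas for a simple regression. The data are made up, and the function name `band_standard_errors` is hypothetical. It assumes s^2 is estimated from the residuals with n - 2 degrees of freedom.

```python
import math

def band_standard_errors(x, y, x_new):
    """Return (se_mean, se_individual) at x_new for a simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    # Residual variance s^2, assuming n - 2 degrees of freedom.
    residuals = [yi - (y_bar + b * (xi - x_bar)) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - 2)
    # The part of the variance that grows with distance from the mean of x.
    distance_term = 1 / n + (x_new - x_bar) ** 2 / sxx
    se_mean = math.sqrt(s2 * distance_term)
    se_individual = math.sqrt(s2 * (1 + distance_term))
    return se_mean, se_individual

# Made-up illustrative data (not from the lecture).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

for x_new in (4.5, 8, 12):
    se_mean, se_ind = band_standard_errors(x, y, x_new)
    print(f"x = {x_new:>4}: SE(mean) = {se_mean:.3f}, SE(individual) = {se_ind:.3f}")
```

Both standard errors are smallest at the mean of x (4.5 here) and grow as x moves away from it, which is what produces the curved limits.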
Multiple Regression
• You have seen that you can include polynomials
in a regression model. What about including
entirely different predictors? Say you want to
predict the blood pressure of daft scientists
before they give talks at an international
conference. You could predict with many single
predictors or as a set:
– age
– the size of the audience
– number of “hawt” potential evil lackeys in the front row
Explaining variance
[Figure: the total variance in blood pressure, with the portions explained by Age, Audience size, and Lackey Quality]
Explained Multivariate Variance
• The total variance explained depends on how
correlated the predictors are.
• You want to have a global R² to indicate the
amount of the variance explained by the model,
and also measures of the contributions of the
individual predictors.
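For the two-predictor case, the dependence on the correlation between predictors can be written down directly. The sketch below uses the standard formula for R² from the three pairwise correlations. The correlation values are hypothetical, chosen only to show the pattern.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R^2 for a two-predictor regression, from the three pairwise correlations."""
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)

# Hypothetical: each predictor correlates 0.5 with the outcome.
# Only the correlation between the two predictors (r_12) changes.
for r_12 in (0.0, 0.4, 0.8):
    r2 = r_squared_two_predictors(0.5, 0.5, r_12)
    print(f"r_12 = {r_12:.1f} -> R^2 = {r2:.3f}")
```

With uncorrelated predictors the two squared correlations simply add up. As the predictors become more correlated with each other, the same pair explains less in total.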
Multicollinearity
• Even though audience size and lackey quality
both allow you to predict the blood pressure of the
mad scientists, the amount of variance that is
uniquely associated with the lackey variable is
very little. When you have very correlated
predictors, they can’t uniquely explain variance.
• You can end up with a model that is statistically
significant with a big R² but none of the
individual predictors is statistically significant.
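A small numeric sketch of that last point, for two standardized predictors. All the correlations and the sample size are hypothetical, and the critical values are standard table values for alpha = 0.05 rather than anything computed from the lecture.

```python
import math

# Table values for alpha = 0.05 with n = 20 cases and k = 2 predictors.
F_CRIT_2_17 = 3.59   # upper 5% point of F(2, 17)
T_CRIT_17 = 2.110    # two-sided 5% point of t(17)

n, k = 20, 2
# Hypothetical correlations: each predictor correlates 0.6 with the outcome,
# and the two predictors correlate 0.95 with each other.
r_y1, r_y2, r_12 = 0.6, 0.6, 0.95

# Overall fit and its F test.
r2 = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
f_model = (r2 / k) / ((1 - r2) / (n - k - 1))

# Standardized slopes and their (shared) standard error.
beta_1 = (r_y1 - r_y2 * r_12) / (1 - r_12**2)
beta_2 = (r_y2 - r_y1 * r_12) / (1 - r_12**2)
se_beta = math.sqrt((1 - r2) / ((n - k - 1) * (1 - r_12**2)))
t_1, t_2 = beta_1 / se_beta, beta_2 / se_beta

print(f"R^2 = {r2:.3f}, model F = {f_model:.2f} (critical {F_CRIT_2_17})")
print(f"t for predictor 1 = {t_1:.2f}, t for predictor 2 = {t_2:.2f} (critical {T_CRIT_17})")
```

The model F exceeds its critical value while both t statistics fall well short of theirs. The factor (1 - r_12**2) in the standard error is what inflates it when the predictors are highly correlated.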
Stop it before it starts
• Before you do a multiple regression, use
subject matter knowledge to remove highly
correlated predictors. Look at the bivariate
correlation coefficients between the
predictors. If you have highly correlated
variables, use subject matter knowledge and
pragmatism to decide which variable to put
in the model.
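One way to do that screening is sketched below in Python. The data are invented for illustration. The 0.8 cutoff and the variable names are assumptions, not values from the lecture, and the choice of which flagged variable to keep still rests on subject matter knowledge.

```python
import math
from itertools import combinations

def pearson(u, v):
    """Pearson correlation between two equal-length lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

# Invented predictor values for eight daft scientists.
predictors = {
    "age": [52, 34, 60, 41, 29, 47, 38, 45],
    "audience_size": [50, 200, 80, 400, 150, 120, 500, 300],
    "lackey_count": [1, 4, 2, 8, 3, 2, 10, 6],
}

THRESHOLD = 0.8  # assumed cutoff for "highly correlated"

flagged = []
for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
    r = pearson(a, b)
    print(f"{name_a} vs {name_b}: r = {r:+.2f}")
    if abs(r) > THRESHOLD:
        flagged.append((name_a, name_b))

print("Pairs to think hard about:", flagged)
```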
Partitioning Variance
• What is the unique contribution of each variable?
• There are different formulas for adding up the sums of
squares (SS) that make up the variance.
• Sequential (aka Type 1 SS) lets the first variable explain
as much variance as it can, then adds in the second to
see if it can explain any of the remaining variance.
• Simultaneous (aka Type 2 SS) puts all the
variables in at the same time and lets them divvy up the
common variance.
• Simultaneous with interactions (aka Type 3 SS) has each
variable try to explain all it can while taking into account
the interactions it is used in.
• Type 4 SS makes my head hurt.
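The difference between sequential and simultaneous partitioning can be sketched with two predictors and no interaction, working in proportions of the total SS. The correlations are hypothetical. In a main-effects-only model like this one, the Type 2 and Type 3 SS for a variable both correspond to the "entered last" share computed here.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R^2 for a two-predictor regression, from the three pairwise correlations."""
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)

# Hypothetical correlations with blood pressure (y).
r_y_aud, r_y_lackey, r_aud_lackey = 0.60, 0.55, 0.80
r2_full = r_squared_two_predictors(r_y_aud, r_y_lackey, r_aud_lackey)

# Sequential: the first variable takes all it can, the second gets what is left.
sequential_audience_first = {
    "audience": r_y_aud**2,
    "lackey": r2_full - r_y_aud**2,
}
sequential_lackey_first = {
    "lackey": r_y_lackey**2,
    "audience": r2_full - r_y_lackey**2,
}

# Simultaneous: each variable is credited only with what it adds when entered last.
unique = {
    "audience": r2_full - r_y_lackey**2,
    "lackey": r2_full - r_y_aud**2,
}

print(f"Full-model R^2: {r2_full:.3f}")
print("Sequential, audience first:", sequential_audience_first)
print("Sequential, lackey first:  ", sequential_lackey_first)
print("Entered last (unique):     ", unique)
```

The sequential shares always add up to the full R², but they depend on the order of entry. The unique shares do not depend on order, and with correlated predictors they add up to less than R² because the shared part is credited to neither variable.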
SAS vs. R
• S-Plus and SAS use the same formulas
for SS but R does not use the same
formula for Type 3 SS (and I have never
tested it on Type 4 SS).
Partial and Semipartial Correlations
• If you want to look at the correlation between a
predictor (a) and an outcome (z) controlling for
the impact of a second predictor (b), you can do
a partial correlation to remove the impact of b on
both.
$$r_{az.b} = \frac{r_{az} - r_{ab}\, r_{zb}}{\sqrt{1 - r_{ab}^2}\,\sqrt{1 - r_{zb}^2}}$$
• If you want to remove the correlation with b
from a only, you can do a semi-partial correlation.
$$r_{z(a.b)} = \frac{r_{az} - r_{ab}\, r_{zb}}{\sqrt{1 - r_{ab}^2}}$$
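Both quantities are easy to compute from the three pairwise correlations. In this sketch the semi-partial removes b from the predictor a only, as the text describes, so its denominator involves only the a-b correlation. The input correlations are hypothetical.

```python
import math

def partial_correlation(r_az, r_ab, r_zb):
    """Correlation of a and z with b removed from both."""
    return (r_az - r_ab * r_zb) / math.sqrt((1 - r_ab**2) * (1 - r_zb**2))

def semipartial_correlation(r_az, r_ab, r_zb):
    """Correlation of z with the part of a that is unrelated to b."""
    return (r_az - r_ab * r_zb) / math.sqrt(1 - r_ab**2)

# Hypothetical values: a = lackey quality, b = audience size, z = blood pressure.
r_az, r_ab, r_zb = 0.55, 0.80, 0.60

pr = partial_correlation(r_az, r_ab, r_zb)
sr = semipartial_correlation(r_az, r_ab, r_zb)
print(f"partial r = {pr:.3f}, semi-partial r = {sr:.3f}, semi-partial r^2 = {sr**2:.4f}")
```

The squared semi-partial is the share of the total outcome variance that a explains uniquely, over and above b. The partial is never smaller in absolute value than the semi-partial, because its denominator is additionally shrunk by removing b from z.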
Hierarchical Stepwise Regression
• If you take 2nd and 3rd quarter statistics you will
learn the details on how to compare two models.
Hierarchical stepwise regression is the process
of figuring out what variables matter (in
advance) and adding them to a model to see if
you improve the quality of the model as you add
them.
• People frequently look at the R² for the models
and/or use AIC.
• This is a fine thing to do so long as you keep
track of the comparisons you make and report them.
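A rough sketch of what tracking those comparisons might look like. The R² values and sample size are hypothetical. The AIC here assumes an ordinary least-squares model with normal errors and is computed only up to a constant that is the same for every model of the same outcome, so only differences between models mean anything.

```python
import math

n = 40  # hypothetical sample size

# Hypothetical R^2 after each planned step (order of entry decided in advance).
steps = [
    ("age", 0.20),
    ("age + audience size", 0.45),
    ("age + audience size + lackey quality", 0.46),
]

def relative_aic(r2, n_predictors, n_cases):
    """AIC up to an additive constant shared by all models of the same outcome."""
    return n_cases * math.log(1 - r2) + 2 * n_predictors

aics = []
previous_r2 = 0.0
for n_predictors, (label, r2) in enumerate(steps, start=1):
    aic = relative_aic(r2, n_predictors, n)
    aics.append(aic)
    print(f"{label}: R^2 = {r2:.2f}, change = {r2 - previous_r2:+.2f}, "
          f"relative AIC = {aic:.1f}")
    previous_r2 = r2
```

Lower AIC is better. With these made-up numbers, adding audience size improves the model, while adding lackey quality raises R² slightly but makes the AIC worse because the extra parameter is not paying for itself.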
Automatic Stepwise Regression
• These are BAD BAD BAD.
• You feed the software a set of variables and tell
it to put them into the model one at a time to find
the predictor that explains the most outcome
variance. Once that is in the model, it adds each of
the remaining ones one at a time to see if the
residual variance is reduced with a second
variable. Repeat over and over. Some of the
methods subtract variables instead of adding
(others do both).
• The Type 1 error is astronomical.
• These methods have horrible properties. Adding
in completely random variables affects the
model.
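The Type 1 error problem shows up even in the very first step of a forward selection. The simulation below is a simplified sketch, not any particular package's stepwise routine. The outcome and all 20 candidate predictors are pure noise, the "software" picks the predictor most correlated with the outcome, and that winner is then tested at the usual 5% level. The sample size, number of predictors, and number of simulations are arbitrary choices, and the critical value is a standard table value.

```python
import math
import random

N_CASES = 30
N_PREDICTORS = 20
T_CRIT_DF28 = 2.048  # two-sided 5% point of t with N_CASES - 2 = 28 df (table value)

def pearson(u, v):
    """Pearson correlation between two equal-length lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def first_step_looks_significant(rng):
    """Pick the best of several pure-noise predictors and test it naively."""
    y = [rng.gauss(0, 1) for _ in range(N_CASES)]
    best_abs_r = 0.0
    for _ in range(N_PREDICTORS):
        x = [rng.gauss(0, 1) for _ in range(N_CASES)]
        best_abs_r = max(best_abs_r, abs(pearson(x, y)))
    t = best_abs_r * math.sqrt((N_CASES - 2) / (1 - best_abs_r**2))
    return t > T_CRIT_DF28

rng = random.Random(0)
n_sims = 500
hits = sum(first_step_looks_significant(rng) for _ in range(n_sims))
false_positive_rate = hits / n_sims
print(f"Share of runs where a pure-noise predictor looks significant: "
      f"{false_positive_rate:.2f}")
```

If each of the 20 tests were run on its own, 5% would come out significant by chance. Picking the best of 20 and then testing it as though it had been chosen in advance pushes the chance of at least one false positive toward 1 - 0.95^20, which is about 0.64.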