#### Transcript Data Analysis - University of Western Ontario

```Soc 3306a Lecture 9:
Multivariate 2
More on Multiple Regression: Building a Model and Interpreting Coefficients
Assumptions for Multiple Regression
 Random sample
 Distribution of y is relatively normal
   Check histogram
 Standard deviation of y is constant for each value of x
   Check scatterplots (Figure 1)
Problems to Watch For…
 Violation of assumptions, especially non-normality of the DV and heteroscedasticity (Figure 1)
 Multicollinearity

Building a Model in SPSS (Figure 2)
 Should be driven by your theory
 Check at each step whether there is significant improvement in the explanatory power of the model. Use Method=Enter.
 Ask for R² change.
 Click Next, and enter an additional IV.
 Check the Change Statistics in the Model Summary; watch changes in R² and coefficients (esp. partial correlations) carefully.
Multiple Correlation R (Figure 1)
 Measures correlation of all IVs with DV
 Is the correlation of y values with the predicted y values
 Never negative (between 0 and +1)

Coefficient of Determination R²
 Measures the proportional reduction in error (PRE) in predicting y using the prediction equation (taking x into account) rather than the mean of y
 R² = (TSS – SSE)/TSS
   This is the proportion of the variation in y that is explained
 TSS = Total variability around the mean of y
 SSE = Residual sum of squares or error
   This is the unexplained variability
 RSS = TSS – SSE
   This is the regression sum of squares
   The explained variability in y
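Worked example (hypothetical numbers chosen only for illustration, not taken from the Figure 1 printout):
 Suppose TSS = 200 and SSE = 50
 RSS = TSS – SSE = 200 – 50 = 150
 R² = (TSS – SSE)/TSS = 150/200 = .75
 So using the prediction equation rather than the mean of y would reduce prediction error by 75%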
F Statistic and p-value
 Found in the ANOVA table
 F is the ratio of the regression mean square (RSS/df) to the residual (error) mean square (SSE/df)
 The larger the F, the smaller the p-value
 Very small p-value (<.01 or .001) is strong evidence for the significance of the model
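Worked example (hypothetical numbers, not taken from the course printouts; assumes the usual convention that regression df = k and residual df = n-(k+1)):
 Suppose n = 53 cases, k = 2 predictors, RSS = 150 and SSE = 50
 Regression mean square = RSS/df = 150/2 = 75
 Residual mean square = SSE/df = 50/(53-3) = 50/50 = 1
 F = 75/1 = 75
 An F this large on 2 and 50 df would give a p-value well below .001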

Slope (b), β, t-statistic and p-value
 Slope is measured in the actual units of the variables: change in y for 1 unit of x
 In multiple regression, each slope is controlled for all other x variables
 β is the standardized slope; it can be used to compare strength across predictors
 t = b/se with df = n-(k+1), note: k = # of predictors
 Small p-value indicates significant relationship with y, controlling for other variables in model
 Note: in bivariate regression, t² = F and β = r
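Worked example (hypothetical numbers, not taken from the course printouts):
 Suppose b = 1.5 and se = 0.5 in a model with n = 53 cases and k = 2 predictors
 t = b/se = 1.5/0.5 = 3.0
 df = n-(k+1) = 53-3 = 50
 At 50 df the two-sided .05 critical value of t is about 2.01, so a slope with t = 3.0 would be significant, controlling for the other predictor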
A spurious relationship is indicated by a change in the sign of the partial correlations
 See printouts in Figure 1
 Can also check the partial regression plots (ask for all partial plots under Plots)

Multicollinearity (Figure 1 and 2)
 Two independent variables in the model, e.g. x1 and x2, are correlated with y but also highly correlated (>.700) with each other
 Both explain much the same portion of the variation in y, so adding x2 to the model does not increase explanatory value (R, R²)
 Check the correlation between IVs in the correlation matrix.
 Ask for and check partial correlations in multiple regression (Part and Partial under Statistics)
 If the partial correlation in the multiple model is much lower than the bivariate correlation, multicollinearity is indicated
A Few Tips for SPSS Mini 6
 Review the PowerPoint slides for Lectures 8 and 9
 Read the assignment over carefully before starting.