Regression Towards the Mean


Bivariate Regression
CJ 526 Statistical Analysis in Criminal Justice
Regression Towards the Mean
Measures tend to “fall toward” the mean: tall parents have tall children, but not as tall as themselves (Sir Francis Galton).
Regression
1. Prediction: predicting a variable from one or more other variables
2. Karl Pearson’s r correlation coefficient uses one variable to make predictions about another variable (bivariate prediction)
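
As an illustration only (not from the slides), Pearson r can be computed directly from its deviation-score definition; the scores below are invented:

    # Pearson r from deviation scores; x and y are made-up sample data
    import statistics

    x = [1, 2, 3, 4, 5]   # predictor scores (invented)
    y = [2, 4, 5, 4, 6]   # criterion scores (invented)
    n = len(x)
    mx, my = statistics.mean(x), statistics.mean(y)
    sx, sy = statistics.stdev(x), statistics.stdev(y)
    # r = sum of cross-products of deviations / ((n - 1) * Sx * Sy)
    r = sum((a - mx) * (b - my) for a, b in zip(x, y)) / ((n - 1) * sx * sy)
    print(round(r, 3))    # 0.853 for these scores
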
Multivariate Prediction
Uses two or more variables (considered independent variables) to make predictions about another variable:
Y = a + b1X1 + b2X2 + b3X3 + e
Criterion Variable
Criterion variable: the variable whose value is predicted
In the equation, a is a constant (the Y-intercept), X1, X2, etc. are the independent variables, and b1, b2, etc. are the slopes. When standardized, the slopes are referred to as beta weights.
Predictor Variables
1. The variable(s) whose values are used to make predictions
2. Predictions are made from the independent variables, weighted (by the beta weights) so that they BEST predict the criterion variable
Regression Line
1. A straight line that can be used to predict the value of the criterion variable from the value of the predictor variable
Line of Best Fit
2. Graphically, the regression line is the line that minimizes the errors made when it is used for prediction
Predicted Value (Y’)
1. Values of Y that are predicted by the regression line
2. The regression line is the line of best fit that makes the prediction
3. There will be error
4. Error: e = Y – Y’
Least-Squares Criterion
The regression line is determined such that the sum of the squared prediction errors for all observations is as small as possible
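
A minimal Python sketch of this criterion (data invented): the standard closed-form slope and intercept minimize the sum of squared errors e = Y – Y’.

    # Least-squares line: slope b = SP/SSx, intercept a = My - b*Mx
    import statistics

    x = [1, 2, 3, 4, 5]                  # predictor (invented)
    y = [2, 4, 5, 4, 6]                  # criterion (invented)
    mx, my = statistics.mean(x), statistics.mean(y)
    ss_x = sum((xi - mx) ** 2 for xi in x)
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sp / ss_x                        # slope
    a = my - b * mx                      # intercept
    y_pred = [a + b * xi for xi in x]    # predicted values Y'
    errors = [yi - yp for yi, yp in zip(y, y_pred)]   # e = Y - Y'
    print(b, a, sum(e ** 2 for e in errors))  # the sum being minimized
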
Regression Equation
1. The equation of a straight line (bivariate: one predictor and one predicted variable)
2. Y’ = 3X + 2
3. If X = 4, Y’ = 3(4) + 2 = 14
4. If X = 2, Y’ = 3(2) + 2 = 8
Multiple Regression Equation
Multiple regression equations are an expansion of the example above to 2 or more predictor variables used to predict a criterion variable
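
A hedged sketch in Python with NumPy (not part of the slides; data invented) fits such an equation by ordinary least squares and also computes the standardized beta weights mentioned earlier:

    # Multiple regression Y = a + b1*X1 + b2*X2 + e via least squares
    import numpy as np

    x1 = np.array([1., 2., 3., 4., 5.])        # IV 1 (invented)
    x2 = np.array([2., 1., 4., 3., 5.])        # IV 2 (invented)
    y = np.array([3., 4., 8., 8., 11.])        # criterion (invented)
    X = np.column_stack([np.ones_like(x1), x1, x2])  # 1s column -> constant a
    (a, b1, b2), *_ = np.linalg.lstsq(X, y, rcond=None)
    # beta weights: slopes re-expressed in standard-deviation units
    beta1 = b1 * x1.std(ddof=1) / y.std(ddof=1)
    beta2 = b2 * x2.std(ddof=1) / y.std(ddof=1)
    print(a, b1, b2, beta1, beta2)
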
Standard Error of Estimate
A measure of the average amount of variability of the prediction errors
SYX = SY √(1 − r²)
Range of Predictive Error
SYX becomes smaller as r increases
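
A quick numeric check of both points (the SD value is invented):

    # SYX = SY * sqrt(1 - r^2) shrinks as r grows
    import math

    sy = 10.0                            # SD of Y (invented)
    for r in (0.0, 0.5, 0.8, 0.95):
        syx = sy * math.sqrt(1 - r ** 2)
        print(f"r = {r:.2f} -> SYX = {syx:.2f}")
    # prints 10.00, 8.66, 6.00, 3.12: less predictive error at higher r
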
Multiple Regression
Multiple regression can tell us how much variance in a dependent variable is explained by independent variables that are combined into a predictor equation
Collinearity
Very often the independent variables are intercorrelated (related to one another); e.g., lung cancer can be predicted from smoking, but smoking is intercorrelated with other factors such as diet, exercise, social class, medical care, etc.
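
One standard way to quantify such intercorrelation, not covered in the slides themselves, is the variance inflation factor, VIF = 1/(1 − R²) from regressing one IV on the others; a two-IV Python sketch (data invented):

    # Collinearity: with two IVs, the R^2 between them is just r^2
    import numpy as np

    x1 = np.array([1., 2., 3., 4., 5.])           # e.g., smoking measure (invented)
    x2 = np.array([1.2, 1.9, 3.1, 4.2, 4.8])      # correlated lifestyle factor (invented)
    r = np.corrcoef(x1, x2)[0, 1]                 # intercorrelation of the IVs
    vif = 1.0 / (1.0 - r ** 2)                    # large VIF signals collinearity
    print(f"r = {r:.3f}, VIF = {vif:.1f}")
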
Multiple Regression
One purpose of multiple regression is to determine how much of the variability in prediction is uniquely due to each IV
Proportion of Variance
R squared (R²)
The F test can be used to determine the statistical significance of R squared.
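
The F test for R squared has a standard closed form, F = (R²/k) / ((1 − R²)/(n − k − 1)) for k IVs and n cases; a small sketch (numbers invented; scipy is assumed for the p value):

    # Testing the significance of R squared
    from scipy import stats

    r2, n, k = 0.40, 50, 3              # R squared, cases, number of IVs (invented)
    df1, df2 = k, n - k - 1
    F = (r2 / df1) / ((1 - r2) / df2)
    p = stats.f.sf(F, df1, df2)         # upper-tail p value
    print(f"F({df1}, {df2}) = {F:.2f}, p = {p:.4f}")   # F is about 10.22 here
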
SPSS Procedure Regression
Analyze, Regression, Linear
  Move DV into Dependent
  Move IV into Independent
  Method: Enter
  Statistics: Estimates, Model fit, R squared change, Descriptives
SPSS Procedure Regression Output
Descriptive Statistics
  Variables
  Mean
  Standard Deviation
  N
Correlations
  Pearson Correlation
  Sig (1-tailed)
  N
SPSS Procedure Regression Output (continued)
Variables Entered/Removed
Model Summary
  R
  R Square
  Adjusted R Square
  Standard Error of the Estimate
SPSS Procedure Regression Output (continued)
Change Statistics
  R Square Change
  F Change
  df1
  df2
  Sig F Change
SPSS Procedure Regression Output (continued)
ANOVA
  Sum of Squares
  df
  Mean Square
  F
  Sig
SPSS Procedure Regression Output (continued)
Coefficients
  Model
    Constant (Y-Intercept)
    IV
  Unstandardized Coefficients
    B
    Standard Error of B
  Standardized Coefficients
    Beta
  t
  Sig
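
For readers without SPSS, a rough open-source analogue of this procedure is Python’s statsmodels package; the sketch below (simulated data, and statsmodels is an assumption rather than what the course uses) prints a summary containing the same pieces: R Square, the ANOVA F, and coefficients with B, standard errors, t, and Sig (standardized betas are not printed by default).

    # Rough analogue of the SPSS Regression output
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    x1 = rng.normal(size=100)                      # IV 1 (simulated)
    x2 = rng.normal(size=100)                      # IV 2 (simulated)
    y = 2.0 + 1.5 * x1 + 0.5 * x2 + rng.normal(size=100)  # DV (simulated)
    X = sm.add_constant(np.column_stack([x1, x2]))  # constant = Y-intercept
    model = sm.OLS(y, X).fit()                     # "Enter": all IVs at once
    print(model.summary())                         # R Square, F, B, SE, t, Sig
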