Regression Towards the Mean


Bivariate Regression
CJ 526 Statistical Analysis in Criminal Justice
Regression Towards the Mean -- continued
Tall parents have tall children, but not as tall as themselves.
Sir Francis Galton
Regression
1. Prediction
Prediction Equation
1. Karl Pearson, Pearson r correlation coefficient
Bivariate Prediction
1. Uses one variable to make predictions about another variable
Multivariate Prediction
1. Uses two or more variables (considered independent variables) to make predictions about another variable
Y = a + b1X1 + b2X2 + b3X3 + e
a is a constant (the Y-intercept), X1, X2, etc. are the independent variables, and b1, b2, etc. are the slopes; when the slopes are standardized they are referred to as beta weights
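For illustration only, a minimal Python sketch of evaluating this prediction equation; the constant and slope values are made up, not taken from any analysis in this course:

```python
import numpy as np

# Hypothetical constant (a) and slopes (b1, b2, b3); illustrative values only
a = 1.5
b = np.array([0.4, -0.2, 0.8])

# One case's scores on the independent variables X1, X2, X3
x = np.array([10.0, 5.0, 2.0])

# Predicted value of the criterion variable (the error term e is unknown in advance)
y_pred = a + np.dot(b, x)
print(y_pred)  # 1.5 + 0.4*10 - 0.2*5 + 0.8*2 = 6.1
```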
Criterion Variable
1. The variable whose value is predicted
Predictor Variable
1. The variable whose values are used to make predictions
Regression Line
1. A straight line that can be used to predict the value of the criterion variable from the value of the predictor variable
Line of Best Fit
1. Regression line that minimizes the size of the errors that are made when using it to make predictions
Predicted Value (Y’)
1. Value of Y that is predicted by the regression line
Error of Prediction
1. e = Y - Y'
Least-Squares Criterion
1. The regression line is determined such that the sum of the squared prediction errors for all observations is as small as possible
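A small numpy sketch of the least-squares criterion with hypothetical data: the slope and intercept computed below yield a smaller sum of squared prediction errors (e = Y - Y') than any other straight line, as the final check suggests.

```python
import numpy as np

# Small hypothetical data set
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

# Least-squares slope and intercept for the line Y' = a + bX
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()

def sse(intercept, slope):
    """Sum of squared prediction errors, e = Y - Y'."""
    return np.sum((y - (intercept + slope * x)) ** 2)

# The fitted line minimizes the sum of squared errors, so nudging the
# slope (or intercept) can only make the total error larger
print(round(a, 3), round(b, 3))
print(sse(a, b) <= sse(a, b + 0.1))  # True
```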
Regression Equation
1. The equation of a straight line, Y' = a + bX
Example
1. Y' = 3X + 2
2. X = 4, Y' = 3(4) + 2 = 14
3. X = 2, Y' = 3(2) + 2 = 8
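The same worked example, reproduced as a short Python check:

```python
# The regression equation from the example above: Y' = 3X + 2
def predict(x):
    return 3 * x + 2

print(predict(4))  # 14
print(predict(2))  # 8
```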
Standard Error of Estimate
1. Measure of the average amount of variability of the predictive error
SYX = SY √(1 - r²)
Range of Predictive Error
1. SYX becomes smaller as r increases
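A brief Python sketch of this relationship, assuming a hypothetical SY of 10, shows the standard error of estimate shrinking as r increases:

```python
import numpy as np

# Standard error of estimate: SYX = SY * sqrt(1 - r^2)
s_y = 10.0  # hypothetical standard deviation of Y

for r in (0.0, 0.5, 0.9):
    s_yx = s_y * np.sqrt(1 - r ** 2)
    print(r, round(s_yx, 2))
# r = 0.0 -> 10.0  (prediction no better than using the mean)
# r = 0.5 -> 8.66
# r = 0.9 -> 4.36  (SYX becomes smaller as r increases)
```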
Multiple Regression
Multiple regression can tell us how much variance in a dependent variable is explained by independent variables that are combined into a predictor equation
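As an illustration (with simulated, hypothetical data, not course data), a numpy sketch of computing R squared, the proportion of variance in the DV explained by two predictors combined:

```python
import numpy as np

# Simulated, hypothetical data: one DV and two intercorrelated IVs
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = 0.5 * x1 + rng.normal(size=100)
y = 2.0 + 1.0 * x1 + 0.5 * x2 + rng.normal(size=100)

# Design matrix with a constant column for the intercept
X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

y_pred = X @ coef
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot  # proportion of variance explained
print(round(r_squared, 3))
```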
Collinearity
Very often independent variables are intercorrelated (related to one another); e.g., lung cancer can be predicted from smoking, but smoking is intercorrelated with other factors such as diet, exercise, social class, medical care, etc.
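A small numpy sketch of inspecting intercorrelation among predictors; the variables are simulated and only loosely named after the example above:

```python
import numpy as np

# Simulated, hypothetical predictors that are intercorrelated
rng = np.random.default_rng(1)
smoking = rng.normal(size=200)
exercise = -0.6 * smoking + rng.normal(size=200)  # correlated with smoking
diet = 0.4 * smoking + rng.normal(size=200)       # also correlated with smoking

# Correlation matrix of the independent variables: off-diagonal values
# far from zero indicate collinearity among the predictors
print(np.round(np.corrcoef([smoking, exercise, diet]), 2))
```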
Multiple Regression
One purpose of multiple regression is to determine how much of the predicted variability is uniquely due to each IV
Proportion of variance: R squared
The F test can be used to determine the statistical significance of R squared.
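A minimal sketch of the F test for R squared, assuming hypothetical values of R squared, the number of IVs (k), and the sample size (n), with scipy supplying the p value:

```python
import scipy.stats as stats

# F test for the statistical significance of R squared:
# F = (R^2 / k) / ((1 - R^2) / (n - k - 1)), with df1 = k and df2 = n - k - 1
r_squared = 0.40  # hypothetical R squared
k = 3             # number of IVs
n = 50            # number of cases

f_stat = (r_squared / k) / ((1 - r_squared) / (n - k - 1))
p_value = stats.f.sf(f_stat, k, n - k - 1)
print(round(f_stat, 2), round(p_value, 4))  # about 10.22, p < .001
```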
SPSS Procedure Regression
Analyze, Regression, Linear
  Move DV into Dependent
  Move IV into Independent
  Method
    Enter
  Statistics
    Estimates
    Model fit
    R squared change
    Descriptives
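The course procedure uses the SPSS dialogs above; purely as a point of comparison, a rough Python analogue is sketched below. The statsmodels package and the variable names dv, iv1, iv2 are assumptions, not part of the SPSS steps.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical data frame with a DV and two IVs (names are placeholders)
df = pd.DataFrame({
    "dv":  [10, 12, 15, 11, 18, 20, 14, 16],
    "iv1": [1, 2, 3, 2, 4, 5, 3, 4],
    "iv2": [5, 4, 6, 5, 7, 8, 6, 6],
})

X = sm.add_constant(df[["iv1", "iv2"]])   # IVs plus a constant (Y-intercept)
model = sm.OLS(df["dv"], X).fit()         # all IVs entered at once ("Enter")

print(model.summary())   # overall fit, F test, and coefficient table
print(model.rsquared)    # R Square, as in the Model Summary output
```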
SPSS Procedure Regression Output
Descriptive Statistics
  Variables
  Mean
  Standard Deviation
  N
Correlations
  Pearson Correlation
  Sig (1-tailed)
  N
SPSS Procedure Regression Output -- continued
Variables Entered/Removed
Model Summary
  R
  R Square
  Adjusted R Square
  Standard Error of the Estimate
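As an aside, the Adjusted R Square in the Model Summary can be reproduced from R Square, the number of cases, and the number of IVs; a quick sketch with hypothetical values:

```python
# Adjusted R Square as reported in the SPSS Model Summary:
# adj R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)
r_squared = 0.40   # hypothetical R Square
n = 50             # number of cases
k = 3              # number of IVs

adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
print(round(adj_r_squared, 3))  # 0.361
```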

SPSS Procedure Regression Output -- continued
Change Statistics
  R Square Change
  F Change
  df1
  df2
  Sig F Change

SPSS Procedure Regression Output -- continued
ANOVA
  Sum of Squares
  df
  Mean Square
  F
  Sig

SPSS Procedure Regression Output -- continued
Coefficients
  Model
    Constant (Y-Intercept)
    IV
  Unstandardized Coefficients
    B
    Standard Error of B
  Standardized Coefficients
    Beta
  t
  Sig
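As a rough illustration of where the numbers in the Coefficients table come from, a numpy/scipy sketch for the bivariate case with hypothetical data (SPSS reports these quantities for every IV in the model):

```python
import numpy as np
import scipy.stats as stats

# Hypothetical bivariate data: one predictor (X) and one criterion (Y)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.0, 3.5, 4.1, 6.0, 6.2, 7.9, 8.5, 10.1])
n = len(x)

# Unstandardized coefficient B and the constant (Y-intercept)
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()

# Standard Error of B, based on the residual (error) variance
residuals = y - (a + b * x)
mse = np.sum(residuals ** 2) / (n - 2)
se_b = np.sqrt(mse / np.sum((x - x.mean()) ** 2))

# Standardized coefficient (Beta), t, and Sig (two-tailed p value)
beta = b * x.std(ddof=1) / y.std(ddof=1)
t = b / se_b
sig = 2 * stats.t.sf(abs(t), df=n - 2)

print(round(a, 2), round(b, 2), round(se_b, 3), round(beta, 3), round(t, 2), round(sig, 5))
```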