Regression - Maarten Buis

Regression
Maarten Buis
12-12-2005
Outline
• Recap
• Estimation
• Goodness of Fit
• Goodness of Fit versus Effect Size
• Transformation of variables and effect size
Recap
• With regression we look at the effect of one variable on another
• An effect is a comparison of groups
• The effect of, for instance, age would consist of a comparison of too many groups (one for every age)
• So we look at an average effect instead
• An average effect implies a straight line
• The average effect is the slope of that line
room       rent    surface area
room 1     175     13
room 2     180     16
room 3     185     16
room 4     190     20
room 5     200     18
room 6     210     19
room 7     210     20
room 8     210     22
room 9     230     20
room 10    240     18
room 11    240     18
room 12    250     24
room 13    250     20
room 14    280     24
room 15    300     23
room 16    300     26
room 17    310     27
room 18    325     28
room 19    620     49
Mean and regression
• The mean summarizes observations with one number that minimizes the sum of squared deviations from that number.
• Regression summarizes observations with one line that minimizes the sum of squared deviations from that line.
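The claim about the mean can be checked numerically. The sketch below uses the 19 rents from the room table in these slides; the helper name `sum_sq_dev` and the offsets of 20 are illustrative choices, not part of the original material.

```python
# Rents of the 19 rooms, read from the room table in the slides
rent = [175, 180, 185, 190, 200, 210, 210, 210, 230, 240,
        240, 250, 250, 280, 300, 300, 310, 325, 620]


def sum_sq_dev(values, center):
    """Sum of squared deviations of the values from one summary number."""
    return sum((v - center) ** 2 for v in values)


mean_rent = sum(rent) / len(rent)

# Compare the mean with two other candidate summary numbers
# (the offsets of 20 are arbitrary, chosen only for illustration)
for candidate in (mean_rent - 20, mean_rent, mean_rent + 20):
    total = sum_sq_dev(rent, candidate)
    print(f"candidate {candidate:7.2f} -> sum of squares {total:10.2f}")
```

Any candidate other than the mean gives a larger sum of squared deviations; regression applies the same criterion to a line instead of a single number.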
[Figures: rent of room with its mean marked; scatter plots of rent of room (vertical axis, 200 to 600) against surface area of room (horizontal axis, 15 to 50)]
Ordinary Least Squares (OLS)
• $\hat{y} = b_0 + b_1 x$
• So we want to minimize: $\sum (y - \hat{y})^2$
• by choosing optimal values of $b_0$ and $b_1$
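A minimal sketch of this minimization, using the room data from these slides and assuming rooms, rents and surface areas are paired in the order listed. The slides do not show how the optimal values are found; the closed-form solution used here is the standard textbook one, and the function names `ols_fit` and `sse` are hypothetical.

```python
# Room data read from the table in the slides (pairing by position is assumed)
rent = [175, 180, 185, 190, 200, 210, 210, 210, 230, 240,
        240, 250, 250, 280, 300, 300, 310, 325, 620]
area = [13, 16, 16, 20, 18, 19, 20, 22, 20, 18,
        18, 24, 20, 24, 23, 26, 27, 28, 49]


def ols_fit(x, y):
    """Return (b0, b1) that minimize the sum of squared deviations from the line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept: the line passes through the two means
    return b0, b1


def sse(x, y, b0, b1):
    """Sum of squared deviations of y from the line b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


b0, b1 = ols_fit(area, rent)
print(f"intercept b0 = {b0:.2f}, slope b1 = {b1:.2f}")
print(f"sum of squares at the optimum:      {sse(area, rent, b0, b1):.1f}")
print(f"sum of squares with another slope:  {sse(area, rent, b0, b1 + 1):.1f}")
```

Changing either coefficient away from the fitted values increases the sum of squares, which is what "optimal" means here.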
What you need to know
• How to find the slope and intercept in:
– a graph
– a regression equation
– SPSS output
• How to interpret the slope and intercept
Coefficients(a)
                               Unstandardized             Standardized
                               Coefficients               Coefficients
Model 1                        B           Std. Error     Beta            t         Sig.
(Constant)                     4845,644    235,959                        20,536    ,000
age age at day of interview    -33,002     3,317          -,205           -9,950    ,000
a. Dependent Variable: incmid household income in guilders
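To illustrate how the intercept and slope in this output are read, the sketch below plugs the reported coefficients (constant 4845,644 and age -33,002, written with decimal points in Python) into the regression equation. The function name `predicted_income` and the example ages are illustrative assumptions.

```python
# Coefficients copied from the SPSS Coefficients table (decimal commas -> points)
B0 = 4845.644   # (Constant): predicted household income at age 0
B1 = -33.002    # age: change in predicted income per additional year of age


def predicted_income(age):
    """Predicted household income in guilders for a given age."""
    return B0 + B1 * age


# The intercept is the prediction at age 0
print(f"age  0: {predicted_income(0):.3f}")

# The slope is the difference between two people who are one year apart
difference = predicted_income(41) - predicted_income(40)
print(f"age 41 minus age 40: {difference:.3f}")
```

The intercept refers to age 0, which lies outside any plausible range of respondents; that is the motivation for re-centering age at a meaningful value.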
COMPUTE age10 = age/10 .

Coefficients(a)
                               Unstandardized             Standardized
                               Coefficients               Coefficients
Model 1                        B           Std. Error     Beta
(Constant)                     4845,644    235,959
age10                          -330,023    33,167         -,205
a. Dependent Variable: incmid household income in guilders
COMPUTE incmid1000 = incmid/1000 .

Coefficients(a)
                               Unstandardized             Standardized
                               Coefficients               Coefficients
Model 1                        B           Std. Error     Beta            t         Sig.
(Constant)                     4,846       ,236                           20,536    ,000
age age at day of interview    -,033       ,003           -,205           -9,950    ,000
a. Dependent Variable: incmid1000
COMPUTE age55 = age-55 .

Coefficients(a)
                               Unstandardized             Standardized
                               Coefficients               Coefficients
Model 1                        B           Std. Error     Beta            t         Sig.
(Constant)                     3030,520    59,960                         50,542    ,000
age55                          -33,002     3,317          -,205           -9,950    ,000
a. Dependent Variable: incmid household income in guilders
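The coefficients of the three transformed models follow from the original ones by simple arithmetic. The sketch below starts from the reported constant (4845,644) and age coefficient (-33,002) and reproduces the other tables up to rounding; the variable names are illustrative, and decimal commas are written as decimal points.

```python
# Coefficients of the original model: income in guilders, age in years
b0 = 4845.644
b1 = -33.002

# age10 = age / 10: one unit is now ten years, so the slope is ten times
# as large and the intercept (still the prediction at age 0) is unchanged
b1_age10 = b1 * 10

# incmid1000 = incmid / 1000: income is measured in thousands of guilders,
# so both the intercept and the slope are divided by 1000
b0_thousands = b0 / 1000
b1_thousands = b1 / 1000

# age55 = age - 55: the slope is unchanged, and the intercept becomes the
# predicted income at age 55 instead of at age 0
b0_age55 = b0 + b1 * 55

print(f"age10 slope:          {b1_age10:.3f}")
print(f"incmid1000 intercept: {b0_thousands:.3f}, slope: {b1_thousands:.3f}")
print(f"age55 intercept:      {b0_age55:.3f}")
```

Small differences from the SPSS output (for instance in the age55 constant) arise because the starting coefficients are themselves rounded to three decimals. The standardized coefficient (-,205) and the t-value of the slope are the same in every table where they are shown: rescaling or shifting a variable changes the units of the effect, not its strength.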
How well does the regression fit?
• We started with variation in the dependent
variable
• We fitted a regression, which has less
variation around the regression line
• The proportional decrease in variation (the proportion of variance explained) is a measure of fit.
• This measure is called $R^2$
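As a sketch of this idea, the code below computes the proportion of variance explained for the room data from these slides: the variation around the mean, the variation around the fitted line, and the proportional decrease. The pairing of rents and surface areas by position and the function name `r_squared` are assumptions made for the illustration.

```python
# Room data read from the table in the slides (pairing by position is assumed)
rent = [175, 180, 185, 190, 200, 210, 210, 210, 230, 240,
        240, 250, 250, 280, 300, 300, 310, 325, 620]
area = [13, 16, 16, 20, 18, 19, 20, 22, 20, 18,
        18, 24, 20, 24, 23, 26, 27, 28, 49]


def r_squared(x, y):
    """Proportional decrease in squared variation when going from the mean to the line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Variation we started with: squared deviations from the mean
    total = sum((yi - mean_y) ** 2 for yi in y)
    # Variation left over: squared deviations from the regression line
    residual = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - residual / total


r2 = r_squared(area, rent)
print(f"proportion of variance in rent explained by surface area: {r2:.3f}")
```

A value of 0 means the line does no better than the mean, and a value of 1 means all observations lie exactly on the line.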
Model Summary
Model    R          R Square    Adjusted R Square    Std. Error of the Estimate
1        ,205(a)    ,042        ,042                 1,45385
a. Predictors: (Constant), age age at day of interview
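Standard Error of the Estimate
• The name is an unfortunate choice; it should have been the standard deviation of the estimate
• It measures the (unexplained) variation around the regression line, just as the standard deviation measures the variation around the mean.

$S_y = \sqrt{\dfrac{\sum (y_i - \bar{y})^2}{N - 1}}$

$S_{y.x} = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{N - 2}}$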
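The two formulas can be compared side by side on the room data from these slides. This is a minimal sketch that assumes rents and surface areas are paired by position in the table; the function names `std_dev` and `std_error_of_estimate` are hypothetical.

```python
import math

# Room data read from the table in the slides (pairing by position is assumed)
rent = [175, 180, 185, 190, 200, 210, 210, 210, 230, 240,
        240, 250, 250, 280, 300, 300, 310, 325, 620]
area = [13, 16, 16, 20, 18, 19, 20, 22, 20, 18,
        18, 24, 20, 24, 23, 26, 27, 28, 49]


def std_dev(y):
    """S_y: variation around the mean, dividing by N - 1."""
    n = len(y)
    mean_y = sum(y) / n
    return math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))


def std_error_of_estimate(x, y):
    """S_y.x: variation around the regression line, dividing by N - 2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    residual = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(residual / (n - 2))


s_y = std_dev(rent)
s_yx = std_error_of_estimate(area, rent)
print(f"S_y   (around the mean): {s_y:.2f}")
print(f"S_y.x (around the line): {s_yx:.2f}")
```

Both numbers are in the units of the dependent variable, so the standard error of the estimate can be read as the typical distance of an observation from the regression line.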