Transcript Week 4

Week 4: Bivariate Regression, Least Squares and Hypothesis Testing
Lecture Outline
• Method of Least Squares
• Assumptions
• Normality assumption
• Goodness of fit
• Confidence Intervals
• Tests of Significance
• Alpha versus p
Recall . . .
• Regression curve as “line connecting the
mean values” of y for a given x
– No necessary reason for such a construction
to be a line
– Need more information to define a function
Method of Least Squares
• Goal: describe the functional relationship
between y and x
– Assume linearity (in the parameters)
• What is the best line to explain the
relationship?
• Intuition: the line that is “closest to” or “best fits” the data
“Best” line, n = 2
[Figure: scatterplot of y against x; a line drawn through the two data points]
“Best” line, n > 2
[Figure: scatterplot of y against x with more than two points; which candidate line is “best”?]
Least squares: intuition
[Figure: scatterplot with a fitted line; residuals u1, u2, u3 shown as vertical distances from the points to the line]
Goal: $\min(u_1 + u_2 + u_3)$
Least squares, n > 2
[Figure: scatterplot of y against x with the least-squares line]
$\min \sum_i \hat{u}_i^{\,2}$
Why sum of squares?
• The sum of residuals may be zero
• Squaring emphasizes residuals that are far from the regression line
• Better describes the spread of the residuals
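A quick numeric sketch of the first point (hypothetical toy data, not from the lecture): any candidate line forced through the means of x and y has residuals that sum to zero, so the raw sum cannot rank lines, while the sum of squares can.

```python
import numpy as np

# Hypothetical toy data
x = np.array([0.0, 10.0, 20.0, 30.0])
y = np.array([2.0, 14.0, 19.0, 33.0])

def residuals(b1, b2):
    """Residuals u_i = y_i - (b1 + b2 * x_i) for a candidate line."""
    return y - (b1 + b2 * x)

# Two candidate slopes; each line passes through (mean x, mean y),
# so the residuals sum to zero in both cases...
for b2 in (1.0, 0.2):
    b1 = y.mean() - b2 * x.mean()
    u = residuals(b1, b2)
    print(f"slope {b2}: sum(u) = {u.sum():+.1f}, sum(u^2) = {(u**2).sum():.1f}")

# ...but the sum of squared residuals ranks them: the slope-1.0 line
# (sum(u^2) = 14) fits far better than the slope-0.2 line (318).
```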
Least-squares estimates

$y_i = \hat{\beta}_1 + \hat{\beta}_2 x_i + \hat{u}_i$

where $\hat{\beta}_1$ is the intercept, $\hat{\beta}_2$ is the effect of x on y (the slope), and $\hat{u}_i$ are the residuals.
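The estimates that minimize the sum of squared residuals have closed forms: $\hat{\beta}_2 = \sum(x_i - \bar{x})(y_i - \bar{y}) / \sum(x_i - \bar{x})^2$ and $\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x}$. A minimal sketch with hypothetical data:

```python
import numpy as np

# Hypothetical data for illustration
x = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0])
y = np.array([3.0, 6.0, 12.0, 14.0, 19.0, 22.0, 29.0, 33.0])

# Closed-form least-squares estimates for y = b1 + b2*x + u
b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b1 = y.mean() - b2 * x.mean()
u_hat = y - (b1 + b2 * x)  # residuals

print(f"intercept b1 = {b1:.3f}, slope b2 = {b2:.3f}")
print(f"sum of residuals = {u_hat.sum():.2e}")  # zero up to rounding
```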
Gauss-Markov Theorem
• The least-squares method produces best linear unbiased estimators (BLUE)
• Also most efficient (minimum variance)
• Provided the classical assumptions hold
Classical Assumptions
• Focus on #3, #4, and #5 in Gujarati
– Implications of violations for the estimators
• Skim over #1, #2, and #6 through #10
#3: Zero mean value of ui
• Residuals are randomly distributed around
the regression line
• Expected value is zero for any given
observation of x
• NOTE: Equivalent to assuming the model
is fully specified
#3: Zero mean value of ui
[Figure: residuals scattered randomly above and below the regression line; E(u|X) = 0]
#3: Zero mean value of ui
[Figure: y plotted against x when E(u|X) > 0]
#3: Zero mean value of ui
[Figure: y plotted against x when E(u|X) ≠ 0]
Violation of #3
• Estimated betas will be
– Unbiased but
– Inconsistent
– Inefficient
• May arise from
– Systematic measurement error
– Nonlinear relationships (Phillips curve)
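A minimal simulation sketch of the systematic-measurement-error case (all numbers hypothetical): a constant offset in the recorded y gives the disturbance a nonzero mean; the slope estimate is unaffected, but the intercept absorbs the offset.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.uniform(0, 50, n)
y_true = 2.0 + 0.5 * x + rng.normal(0, 5, n)  # true model: b1 = 2, b2 = 0.5

# Systematic measurement error: every y recorded 10 units too high,
# so E(u) = 10 rather than 0
y_obs = y_true + 10.0

b2 = np.sum((x - x.mean()) * (y_obs - y_obs.mean())) / np.sum((x - x.mean()) ** 2)
b1 = y_obs.mean() - b2 * x.mean()
print(f"slope stays near 0.5:        {b2:.3f}")
print(f"intercept shifts toward 12:  {b1:.3f}")
```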
#4: Homoscedasticity
• The variance of the residuals is the same
for all observations, irrespective of the
value of x
• “Equal variance”
• NOTE: #3 and #4 imply (see “Normality Assumption”): $\hat{u} \sim N(0, \sigma^2)$
#4: Homoscedasticity
[Figure: residuals with constant spread around the regression line at every value of x]
#4: Homoscedasticity
[Figure: residuals whose spread changes with x (heteroscedasticity)]
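A minimal simulation sketch of the contrast (hypothetical parameters): homoscedastic errors keep the same spread everywhere, while heteroscedastic errors grow with x, as a split-sample standard deviation shows.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(0, 50, n)

u_homo = rng.normal(0, 5, size=n)          # constant spread: assumption #4 holds
u_hetero = rng.normal(0, 0.3 * x, size=n)  # spread grows with x: violation

for name, u in (("homoscedastic", u_homo), ("heteroscedastic", u_hetero)):
    low, high = u[x < 25].std(), u[x >= 25].std()
    print(f"{name}: sd for x<25 = {low:.1f}, sd for x>=25 = {high:.1f}")
```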
Violation of #4
• Estimated betas will be
– Unbiased
– Consistent but
– Inefficient
• Arise from
– Cross-sectional data
#5: No autocorrelation
• The correlation between any two residuals
is zero
• The residual for xi is unrelated to the residual for xj
#5: No autocorrelation
[Figure: residuals plotted against x, contrasting independent residuals with autocorrelated ones]
Violations of #5
• Estimated betas will be
– Unbiased
– Consistent
– Inefficient
• Arise from
– Time-series data
– Spatial correlation
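A minimal simulation sketch of the time-series case (hypothetical rho): AR(1) disturbances $u_t = \rho u_{t-1} + e_t$ are correlated with their own past, which the lag-1 correlation exposes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

u_iid = rng.normal(0, 1, n)  # independent errors: assumption #5 holds

# AR(1) errors, common in time-series data (violates assumption #5)
rho = 0.8
u_ar = np.zeros(n)
for t in range(1, n):
    u_ar[t] = rho * u_ar[t - 1] + rng.normal()

def lag1_corr(u):
    """Correlation between each residual and the one before it."""
    return np.corrcoef(u[:-1], u[1:])[0, 1]

print(f"iid errors:   lag-1 corr = {lag1_corr(u_iid):+.2f}")  # near 0
print(f"AR(1) errors: lag-1 corr = {lag1_corr(u_ar):+.2f}")   # near rho
```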
Other Assumptions (1)
• Assumption 6: zero covariance between xi
and ui
– Violations are a cause of heteroscedasticity
– Hence violates #4
• Assumption 9: model correctly specified
– Violations may violate #1 (linearity)
– May also violate #3: omitted variables?
Other Assumptions (2)
• #7: n must be greater than number of
parameters to be estimated
– Key in multivariate regression
– King, Keohane, and Verba’s (1994) critique of small-n designs
Normality Assumption
• The distribution of the disturbance is unknown
• Necessary for hypothesis testing of the I.V.s
– The estimates are a function of ui
• The assumption of normality is necessary for inference
• Equivalent to assuming the model is completely specified
Normality Assumption
• Central Limit Theorem: M&Ms
• A linear transformation of a normal variable is itself normal
• Simple distribution (mu, sigma)
• Small samples
Assumptions, Distilled
1. Linearity
2. DV is continuous, interval-level
3. Non-stochastic regressors: no correlation between the independent variables and the error term
4. Residuals are independently and
identically distributed (iid)
a) Mean of zero
b) Constant variance
If so, . . .
• Least-squares method produces BLUE
estimators
Goodness of Fit
• How “well” the least-squares regression
line fits the observed data
• Alternatively: how well the function
describes the effect of x on y
• How much of the observed variation in y
have we explained?
Coefficient of determination
• Commonly referred to as “r2”
• Simply, the ratio of explained variation in y
to the total variation in y
Components of variation
[Figure: scatterplot with fitted line showing, for one observation, the total deviation of y from its mean split into an explained part and a residual part]
Components of variation
• TSS: total sum of squares
• ESS: explained sum of squares
• RSS: residual sum of squares
$r^2 = \dfrac{\mathrm{ESS}}{\mathrm{TSS}} = 1 - \dfrac{\mathrm{RSS}}{\mathrm{TSS}}$
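The identity above is easy to verify numerically; a sketch reusing the hypothetical fit from the earlier examples:

```python
import numpy as np

# Same hypothetical data and closed-form fit as in the earlier sketch
x = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0])
y = np.array([3.0, 6.0, 12.0, 14.0, 19.0, 22.0, 29.0, 33.0])
b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b1 = y.mean() - b2 * x.mean()
y_hat = b1 + b2 * x

TSS = np.sum((y - y.mean()) ** 2)      # total variation in y
ESS = np.sum((y_hat - y.mean()) ** 2)  # variation explained by the line
RSS = np.sum((y - y_hat) ** 2)         # residual (unexplained) variation

print(f"ESS/TSS     = {ESS / TSS:.4f}")
print(f"1 - RSS/TSS = {1 - RSS / TSS:.4f}")  # same number, as the formula says
```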
Hypothesis Testing
• Confidence Intervals
• Tests of significance
• ANOVA
• Alpha versus p-value
Confidence Intervals
• Two components
– Estimate
– Expression of uncertainty
• Interpretation:
– Gujarati, p. 121: “The probability of constructing an interval that contains Beta is 1 − alpha”
– NOT: “The probability that Beta is in the interval is 1 − alpha”
C.I.s for regression
• Depend upon our knowledge or
assumption about the sampling distribution
• Width of the interval is proportional to the standard error of the estimators
• Typically we assume
– The t distribution for Betas
– The chi-square distribution for variances
– Due to unknown true standard error
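A sketch of a t-based interval for the slope (same hypothetical data as the earlier sketches; scipy assumed available). The residual variance is estimated with n − 2 degrees of freedom, which is why the t distribution replaces the normal:

```python
import numpy as np
from scipy import stats

# Hypothetical data and closed-form fit from the earlier sketches
x = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0])
y = np.array([3.0, 6.0, 12.0, 14.0, 19.0, 22.0, 29.0, 33.0])
n = len(x)
Sxx = np.sum((x - x.mean()) ** 2)
b2 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b1 = y.mean() - b2 * x.mean()
u_hat = y - (b1 + b2 * x)

sigma2 = np.sum(u_hat ** 2) / (n - 2)  # estimated error variance
se_b2 = np.sqrt(sigma2 / Sxx)          # standard error of the slope

t_crit = stats.t.ppf(0.975, df=n - 2)  # critical value for a 95% interval
lo, hi = b2 - t_crit * se_b2, b2 + t_crit * se_b2
print(f"b2 = {b2:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```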
Confidence Intervals in IR
• Examples?
The worst weatherman
in the world
• “Three-degree
guarantee”
• If his forecast high is
off by more than three
degrees, someone
wins an umbrella
• Woo hoo
How Many Umbrellas?
• Data: mean daily temperature in February
for Washington, DC
– Daily observations from 1995 to 2005 (n =
311)
– Mean: 47.91 degrees F
– Standard deviation: 10.58
• The interval: +/- 3.5 degrees F
– Due to rounding
– Note: spread of seven (eight?) degrees
The t value
• We don’t know alpha, the level of confidence
• Assume the t distribution and solve for t:
$\Pr\left(\bar{x} - t_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le x \le \bar{x} + t_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$

$\Pr(47.9 - 3.5 \le x \le 47.9 + 3.5) = 1 - \alpha$

$t\left(\frac{10.58}{\sqrt{311}}\right) = 3.5 \;\Rightarrow\; 0.60016\,t = 3.5 \;\Rightarrow\; t = 5.83$
The answer
• From the t table:
Pr t  5.83  3.746 10 8 for df  311
 0.00000003746
Tom will give away an umbrella on
average about once every 26,695,141 days.
Thanks, Tom.
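The umbrella arithmetic is easy to reproduce (a sketch; scipy assumed available, and the tail probability it returns may differ slightly from the slide’s table lookup, which used df = 311):

```python
import math
from scipy import stats

mean, sd, n = 47.91, 10.58, 311  # February mean daily temperatures, DC
half_width = 3.5                 # the +/- 3.5 degree guarantee band

t = half_width / (sd / math.sqrt(n))  # solve t * (sd / sqrt(n)) = 3.5
p_tail = stats.t.sf(t, df=311)        # upper-tail probability, as on the slide

print(f"t = {t:.2f}")                     # ~5.83
print(f"Pr(t > {t:.2f}) = {p_tail:.3e}")
print(f"one umbrella roughly every {1 / p_tail:,.0f} days")
```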
Tests of Significance
• A hypothesis about a point value rather
than an interval
– Does the observed sample value differ
from the hypothesized value?
• Null hypothesis (H0): no difference
• Alternative hypothesis (Ha): significant
difference
Regression Interpretation
• Is the hypothesized causal effect (beta)
significantly different from zero?
– Ho: no effect (β = 0)
– Ha: effect (β ≠ 0)
• The “zero” null hypothesis
Two-tail v. One-tail tests
Two-tail:
• Ha is not concerned with the direction of the difference (exploratory)
• Theory in disagreement
• Critical regions on both ends
One-tail:
• Ha specifies a direction of effect
• Theory well developed
• Critical region on one end only
The 2-t rule
• Gujarati, p. 134: the zero null hypothesis can be rejected if |t| > 2
– D.F. > 20
– Level of significance = 0.05
– Recall Weatherman Tom: t = 5.83!
Alpha versus p-values
Alpha:
• Conventional
• Findings reported at 0.05, 0.1, 0.01
• Accessible, intuitive
• Arbitrary
• Makes assumptions about Type I and II errors
P-value:
• “The lowest significance level at which a null hypothesis can be rejected”
• Widely accepted today
• Know your readers!
ANOVA
• Intuitively similar to r2
– Identical output for bivariate regression
• A good test of the zero null hypothesis
• In multivariate regression, tests the null
hypotheses for all betas
– Check F statistic before checking betas!
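A sketch of the F test of the zero null hypothesis on the same hypothetical fit (scipy assumed available). In the bivariate case the F statistic equals the square of the slope’s t statistic:

```python
import numpy as np
from scipy import stats

# Hypothetical data and closed-form fit from the earlier sketches
x = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0])
y = np.array([3.0, 6.0, 12.0, 14.0, 19.0, 22.0, 29.0, 33.0])
n = len(x)
b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b1 = y.mean() - b2 * x.mean()
y_hat = b1 + b2 * x

ESS = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
RSS = np.sum((y - y_hat) ** 2)         # residual sum of squares

F = (ESS / 1) / (RSS / (n - 2))  # 1 slope parameter; n - 2 residual df
p = stats.f.sf(F, 1, n - 2)      # p-value for the zero null hypothesis
print(f"F = {F:.1f}, p = {p:.2e}")
```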
Limits of ANOVA
• Harder to interpret
• Does not provide information on direction
or magnitude of effect for independent
variables
ANOVA output from SPSS
[Figure: SPSS ANOVA output table]