
Topic 4
Ordinary Least Squares
• Suppose that X is a non-random variable.
• Y is a random variable that is affected by X in a linear fashion and by the random variable e, with E(e) = 0.
That is,
E(Y) = b1 + b2X
or, Y = b1 + b2X + e
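As a concrete illustration, here is a minimal simulation of this model in Python. The parameter values (b1 = 2, b2 = 0.5, σ = 1) and the sample size are arbitrary choices for the example, not values from the notes; later snippets reuse these variables.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) values: intercept b1, slope b2, error s.d. sigma
b1_true, b2_true, sigma = 2.0, 0.5, 1.0

n = 100
X = np.linspace(0.0, 10.0, n)        # X is treated as non-random (chosen values)
e = rng.normal(0.0, sigma, size=n)   # disturbance with E(e) = 0
Y = b1_true + b2_true * X + e        # Y = b1 + b2*X + e
```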
[Figure: scatter of the observed points, Y plotted against X.]
[Figure, repeated across several slides: the observed points together with the actual line Y = b1 + b2X.]
[Figure: the observed points, the actual line Y = b1 + b2X and the fitted line. For a typical observation A, BC is the error of estimation and AC is the effect of the random factor.]
• The Ordinary Least Squares (OLS) estimates are obtained by minimising the sum of the squares of each of these errors.
• The OLS estimates are obtained from the values of X and the actual Y values (YA) as follows:
Error of estimation: e ≡ |YA – YE|, where YE is the estimated value of Y.
∑e² ≡ ∑[YA – YE]²
∑e² ≡ ∑[YA – (b1 + b2X)]²
∂∑e²/∂b1 ≡ 2∑[YA – (b1 + b2X)](–1) = 0
∂∑e²/∂b2 ≡ 2∑[YA – (b1 + b2X)](–X) = 0
From the first condition:
∑[Y – (b1 + b2X)](–1) = 0
–N·YMEAN + N·b1 + b2·N·XMEAN = 0
b1 = YMEAN – b2·XMEAN ….. (1)
From the second condition:
∑[Y – (b1 + b2X)](–X) = 0
b1∑X + b2∑X² = ∑XY ……….. (2)
Equations (1) and (2) are solved together for b1 and b2.
• These estimates are given below (with the superscript on Y dropped):
b^1 = [(∑Y)(∑X²) – (∑X)(∑XY)] / [N∑X² – (∑X)²]
b^2 = [N∑XY – (∑X)(∑Y)] / [N∑X² – (∑X)²]
• Alternatively,
b^1 = YMEAN – b^2·XMEAN
b^2 = Covariance(X,Y) / Variance(X)
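A numerical sketch of these formulas, continuing the simulated X and Y from the earlier snippet (the variable names are mine, not from the notes):

```python
# Closed-form OLS estimates of slope and intercept
N = len(X)
b2_hat = (N * np.sum(X * Y) - np.sum(X) * np.sum(Y)) / (N * np.sum(X**2) - np.sum(X)**2)
b1_hat = Y.mean() - b2_hat * X.mean()

# Equivalent form: slope = sample covariance(X, Y) / sample variance(X)
b2_alt = np.cov(X, Y, bias=True)[0, 1] / np.var(X)
assert np.isclose(b2_hat, b2_alt)
```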
Two Important Results
(a) ∑ei ≡ ∑(Yi – YiE) = 0, and
(b) ∑X2i·ei ≡ ∑X2i(Yi – YiE) = 0,
where YiE is the estimated value of Yi and X2i is the same as Xi from before.
Proof: ∑(Yi – YiE) = ∑(Yi – b^1 – b^2X2i)
= ∑Yi – ∑b^1 – ∑b^2X2i
= n·YMEAN – n·b^1 – n·b^2·XMEAN
= n(YMEAN – b^1 – b^2·XMEAN)
= 0 [since b^1 = YMEAN – b^2·XMEAN]
See the lecture notes for a proof of part (b).
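Both results can be checked numerically on the simulated data (a sketch, using the estimates from the previous snippet):

```python
Y_fitted = b1_hat + b2_hat * X       # Y_iE, the estimated values
residuals = Y - Y_fitted             # e_i = Y_i - Y_iE

print(np.isclose(residuals.sum(), 0.0))          # result (a): sum of residuals is 0
print(np.isclose((X * residuals).sum(), 0.0))    # result (b): sum of X_i * e_i is 0
```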
Total sum of squares (TSS) ≡ ∑(Yi – YMEAN)²
Residual sum of squares (RSS) ≡ ∑(Yi – YiE)²
Explained sum of squares (ESS) ≡ ∑(YiE – YMEAN)²
To prove that TSS = RSS + ESS:
TSS ≡ ∑(Yi – YMEAN)²
= ∑{(Yi – YiE) + (YiE – YMEAN)}²
= ∑(Yi – YiE)² + ∑(YiE – YMEAN)² + 2∑(Yi – YiE)(YiE – YMEAN)
= RSS + ESS + 2∑(Yi – YiE)(YiE – YMEAN)
Now consider the cross-product term:
∑(Yi – YiE)(YiE – YMEAN)
= ∑(Yi – YiE)YiE – YMEAN·∑(Yi – YiE)
= ∑(Yi – YiE)YiE [by (a) above]
∑(Yi – YiE)YiE = ∑(Yi – YiE)(b^1 + b^2Xi)
= b^1∑(Yi – YiE) + b^2∑Xi(Yi – YiE)
= 0 [by (a) and (b) above]
R² ≡ ESS/TSS
Since TSS = RSS + ESS, it follows that
0 ≤ R² ≤ 1
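A quick numerical check of this decomposition and of R², again on the simulated data:

```python
TSS = np.sum((Y - Y.mean())**2)          # total sum of squares
RSS = np.sum((Y - Y_fitted)**2)          # residual sum of squares
ESS = np.sum((Y_fitted - Y.mean())**2)   # explained sum of squares

print(np.isclose(TSS, RSS + ESS))        # TSS = RSS + ESS
R_squared = ESS / TSS
print(0.0 <= R_squared <= 1.0)           # R^2 lies between 0 and 1
```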
Topic 5
Properties of Estimators
In the discussion that follows, θ^ is an estimator of the parameter of interest, θ.
Bias of θ^ ≡ E(θ^) – θ
θ^ is unbiased if Bias of θ^ = 0.
θ^ is negatively biased if Bias of θ^ < 0.
θ^ is positively biased if Bias of θ^ > 0.
The Mean Squared Error (MSE) of estimation for θ^ is given as
MSE(θ^) ≡ E[(θ^ – θ)²]
≡ E[{θ^ – E(θ^) + E(θ^) – θ}²]
≡ E[{θ^ – E(θ^)}²] + E[{E(θ^) – θ}²] + 2E[{θ^ – E(θ^)}·{E(θ^) – θ}]
≡ Var(θ^) + {E(θ^) – θ}² + 2E[{θ^ – E(θ^)}·{E(θ^) – θ}]
Now, since E(θ^) – θ is a constant,
E[{θ^ – E(θ^)}·{E(θ^) – θ}]
= {E(θ^) – E(θ^)}·{E(θ^) – θ}
= 0·{E(θ^) – θ} = 0
Therefore,
MSE(θ^) ≡ Var(θ^) + {E(θ^) – θ}²
MSE(θ^) ≡ Var(θ^) + (bias)²
If θ^ is unbiased, that is, if E(θ^) – θ = 0, then
MSE(θ^) ≡ Var(θ^)
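The decomposition MSE = Var + bias² can be illustrated by simulation. The sketch below compares the sample mean (unbiased) with a deliberately shrunk, and therefore biased, version of it; the true mean, sample size and shrinkage factor are arbitrary illustrative choices:

```python
theta = 3.0                       # assumed true parameter (a population mean)
sample_size, reps = 50, 20000

samples = rng.normal(theta, 2.0, size=(reps, sample_size))
theta_hat = samples.mean(axis=1)      # unbiased estimator: the sample mean
theta_shrunk = 0.9 * theta_hat        # biased estimator (shrunk towards zero)

for name, est in (("sample mean", theta_hat), ("shrunk mean", theta_shrunk)):
    mse = np.mean((est - theta)**2)
    var = est.var()
    bias = est.mean() - theta
    print(name, round(mse, 4), round(var + bias**2, 4))   # MSE = Var + bias^2
```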
An unbiased estimator θ^ of a parameter θ is efficient if and only if it has the smallest variance of all unbiased estimators. That is, for any other unbiased estimator p of θ,
Var(θ^) ≤ Var(p)
An estimator θ^ is said to be consistent if it converges in probability to θ. That is,
lim (n → ∞) Prob(|θ^ – θ| > ε) = 0
for every ε > 0.
When the above condition holds, θ is said to be the probability limit of θ^, that is,
plim θ^ = θ
Sufficient conditions for consistency: if the mean of θ^ converges to θ and Var(θ^) converges to zero (as n approaches ∞), then θ^ is consistent.
That is, θ^n is consistent if it can be shown that
lim (n → ∞) E(θ^n) = θ
and
lim (n → ∞) Var(θ^n) = 0
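A small simulation sketch of these conditions, using the sample mean of n draws as the estimator θ^n (the distribution and values are illustrative):

```python
theta = 3.0
for n_obs in (10, 100, 1000, 10000):
    draws = rng.normal(theta, 2.0, size=(1000, n_obs))
    est = draws.mean(axis=1)        # theta^_n: the sample mean of n_obs draws
    print(n_obs, round(est.mean(), 3), round(est.var(), 5))   # mean -> theta, variance -> 0
```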
The Regression Model with TWO Variables
The Model: Y = b1 + b2X + e
Y is the DEPENDENT variable
X is the INDEPENDENT variable
Yi ≡ b1X1i + b2X2i + ei
Here X1i ≡ 1 for all i, and X2 is nothing but X.
The OLS estimates b^1 and b^2 are sample statistics used to estimate b1 and b2 respectively.
Assumptions about X2:
(1a) X2 is non-random (chosen by the investigator).
(1b) Random sampling is performed from a population of fixed values of X2.
(1c) lim (n → ∞) (1/n)∑X2i = P > 0, and
lim (n → ∞) (1/n)∑x2i² = Q > 0
[where x2i ≡ X2i – X2MEAN]
Assumptions about the disturbance term e:
2a. E(e) = 0.
2b. Var(ei) = σ² for all i (homoskedasticity).
2c. Cov(ei, ej) = 0 for i ≠ j (the e values are uncorrelated across observations).
2d. The ei all have a normal distribution.
Result: b^2 is linear in the dependent variable Yi.
Proof:
b^2 = Covariance(X,Y) / Variance(X)
b^2 = ∑(Yi – YMEAN)(Xi – XMEAN) / ∑(Xi – XMEAN)²
b^2 = ∑Yi(Xi – XMEAN) / ∑(Xi – XMEAN)² + K
≡ ∑CiYi + K
where the Ci and K are constants.
Therefore, b^2 is a linear function of Yi.
Since Yi ≡ b1X1i + b2X2i + ei,
b^2 is a linear function of the ei and hence is normally distributed.
Similarly, b^1 is a linear function of Yi (and hence of the ei) and is normally distributed.
Both b^1 and b^2 are unbiased estimators of b1 and b2 respectively. That is,
E(b^1) = b1 and E(b^2) = b2
Each of b^1 and b^2 is an efficient estimator of b1 and b2 respectively.
Thus, each of b^1 and b^2 is a
Best (efficient)
Linear (in the dependent variable Yi)
Unbiased
Estimator of b1 and b2 respectively.
Also, each of b^1 and b^2 is a consistent estimator of b1 and b2 respectively.
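A Monte Carlo sketch of the unbiasedness claim for b^2, reusing the fixed X and the assumed true parameter values from the first snippet:

```python
# Redraw the disturbances many times for the same fixed X and re-estimate b^2
reps = 5000
b2_draws = np.empty(reps)
for r in range(reps):
    e_r = rng.normal(0.0, sigma, size=len(X))
    Y_r = b1_true + b2_true * X + e_r
    b2_draws[r] = np.cov(X, Y_r, bias=True)[0, 1] / np.var(X)

print(round(b2_draws.mean(), 3), b2_true)   # average of b^2 across samples is close to b2
```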
Var(b^1) = σ²(1/n + X2MEAN²/∑x2i²)
Var(b^2) = σ²/∑x2i²
Cov(b^1, b^2) = –σ²·X2MEAN/∑x2i²
lim (n → ∞) Var(b^2)
= lim (n → ∞) σ²/∑x2i²
= lim (n → ∞) (σ²/n)/(∑x2i²/n)
= 0/Q = 0 [using assumption (1c)]
Because b^2 is an unbiased estimator of b2 and
lim (n → ∞) Var(b^2) = 0,
b^2 is a consistent estimator of b2.
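A sketch comparing the formula Var(b^2) = σ²/∑x2i² with a Monte Carlo estimate, and showing both shrink as n grows (illustrative values; x_dev plays the role of x2i):

```python
for n_obs in (20, 200, 2000):
    X_n = np.linspace(0.0, 10.0, n_obs)          # fixed regressor values
    x_dev = X_n - X_n.mean()                     # x2i = X2i - X2MEAN
    var_formula = sigma**2 / np.sum(x_dev**2)    # Var(b^2) from the formula above

    b2_sim = np.empty(2000)
    for r in range(2000):
        Y_r = b1_true + b2_true * X_n + rng.normal(0.0, sigma, size=n_obs)
        b2_sim[r] = np.sum(Y_r * x_dev) / np.sum(x_dev**2)

    print(n_obs, round(var_formula, 6), round(b2_sim.var(), 6))   # both tend to 0 as n grows
```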
The variance of the random term, σ², is not known.
To perform statistical analysis, we estimate σ² by
σ^² ≡ RSS/(n – 2)
This is because σ^² is an unbiased estimator of σ².