Transcript Regression
Applied Regression Analysis
BUSI 6220
KNN Ch. 4
Simultaneous Inferences and Other Topics
Joint Estimation of β0 and β1
Confidence intervals are used for a single
parameter, confidence regions for two or more
parameters
The region for (β0, β1) defines a set of lines
Since β0 and β1 are (jointly) normal, the natural
confidence region is an ellipse
KNN do rectangles (KNN 4.1)
Bonferroni
We want the probability that both intervals are
correct to be (at least) .95
Basic idea is an error budget (α =.05)
Spend half on β0 (.025) and half on β1 (.025)
We use α =.025 for the β0 CI (97.5% CI)
and α =.025 for the β1 CI (97.5% CI)
Bonferroni (2)
So we use
b1 ± t*s(b1)
b0 ± t*s(b0)
where t* = t(.9875, n-2)
.9875 = 1 – (.05)/(2*2)
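As a sketch, the joint Bonferroni intervals above can be computed in Python; the function name and the use of scipy are mine, not from the slides:

```python
import numpy as np
from scipy import stats

def bonferroni_joint_ci(x, y, alpha=0.05):
    """Joint (1 - alpha) Bonferroni CIs for (beta0, beta1) in simple regression."""
    n = len(x)
    xbar, ybar = x.mean(), y.mean()
    Sxx = np.sum((x - xbar) ** 2)
    b1 = np.sum((x - xbar) * (y - ybar)) / Sxx
    b0 = ybar - b1 * xbar
    mse = np.sum((y - b0 - b1 * x) ** 2) / (n - 2)
    s_b1 = np.sqrt(mse / Sxx)
    s_b0 = np.sqrt(mse * (1.0 / n + xbar ** 2 / Sxx))
    # split the error budget: alpha/2 per interval, two-sided -> alpha/(2*2)
    t_star = stats.t.ppf(1 - alpha / 4, n - 2)  # t(.9875, n-2) when alpha = .05
    return ((b0 - t_star * s_b0, b0 + t_star * s_b0),
            (b1 - t_star * s_b1, b1 + t_star * s_b1))
```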
Bonferroni (3)
Note we start with a 5% error budget and we have
two intervals, so we give 2.5% to each
Each interval has two ends so we again divide by 2
So, .9875 = 1 – (.05)/(2*2)
Bonferroni Inequality
Let the two intervals be I1 and I2
We will use cor (= correct) if the interval
contains the true parameter value, inc
(= incorrect) if not
Bonferroni Inequality (2)
P(both cor) = 1 − P(at least one inc)
P(at least one inc)
= P(I1 inc) + P(I2 inc) − P(both inc)
≤ P(I1 inc) + P(I2 inc)
So P(both cor)
≥ 1 − (P(I1 inc) + P(I2 inc))
Bonferroni Inequality (3)
P(both cor) ≥ 1 − (P(I1 inc) + P(I2 inc))
So if we use .05/2 for each interval,
1 − (P(I1 inc) + P(I2 inc)) = 1 − .05 = .95
So P(both cor) is at least .95
We will use this idea when we do multiple
comparisons in ANOVA
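The inequality can be checked numerically for two simple cases (a sketch; the independence in the first case is assumed only for illustration):

```python
# Bonferroni lower bound: P(both cor) >= 1 - (P(I1 inc) + P(I2 inc))
a1, a2 = 0.025, 0.025
bound = 1 - (a1 + a2)                 # 0.95

# If the two error events happened to be independent, P(both inc) > 0,
# so the joint coverage sits strictly above the bound:
p_both_indep = (1 - a1) * (1 - a2)    # 0.950625

# If the error events are disjoint (P(both inc) = 0), the bound is attained:
p_both_disjoint = 1 - a1 - a2         # 0.95
```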
If the error events overlap, P(both cor) is actually
greater than .95
If they do not overlap at all, P(both cor) = .95
[Figure: two error events of probability .025 each, shown overlapping vs. disjoint]
Mean Response CIs
Simultaneous estimation for all Xh: use Working-Hotelling (KNN 2.6)
E(Yh)(hat) ± W s(E(Yh)(hat))
where W² = 2F(1−α; 2, n−2)
For simultaneous estimation for a few (g) Xh, use
Bonferroni
E(Yh)(hat) ± B s(E(Yh)(hat))
where B = t(1−α/(2g), n−2)
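A minimal sketch of the two multipliers, assuming scipy is available (function names are mine):

```python
import numpy as np
from scipy import stats

def wh_multiplier(alpha, n):
    """Working-Hotelling W for the entire line: W^2 = 2 F(1 - alpha; 2, n - 2)."""
    return np.sqrt(2 * stats.f.ppf(1 - alpha, 2, n - 2))

def bonf_multiplier(alpha, g, n):
    """Bonferroni B for g mean responses: B = t(1 - alpha/(2g); n - 2)."""
    return stats.t.ppf(1 - alpha / (2 * g), n - 2)
```

For a handful of Xh values one would compute both and use whichever is smaller; the W-H band pays a premium for covering the whole line.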
Simultaneous PIs
Simultaneous prediction for a few (g) Xh,
use Bonferroni
Yh(hat) ± B s(Yh(hat))
where B = t(1−α/(2g), n−2)
Or Scheffé
Yh(hat) ± S s(Yh(hat))
where S² = gF(1−α; g, n−2)
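The Scheffé multiplier can be sketched the same way; a common practice is to compute both B and S and use whichever is smaller (function names are illustrative):

```python
import numpy as np
from scipy import stats

def scheffe_multiplier(alpha, g, n):
    """Scheffe S for g simultaneous predictions: S^2 = g F(1 - alpha; g, n - 2)."""
    return np.sqrt(g * stats.f.ppf(1 - alpha, g, n - 2))

def bonf_pi_multiplier(alpha, g, n):
    """Bonferroni B for g predictions: B = t(1 - alpha/(2g); n - 2)."""
    return stats.t.ppf(1 - alpha / (2 * g), n - 2)
```

At g = 1 the two multipliers coincide, since F(1−α; 1, n−2) = t²(1−α/2; n−2).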
Regression through the Origin
Yi = β1Xi + εi
How to set it up in your Stat software:
Check “Constant is Zero” (Excel)
Uncheck “Fit Intercept” in OPTIONS (MINITAB)
Uncheck “Include Const. in Eq.” in OPTIONS (SPSS)
“NOINT” option in PROC REG (SAS)
Generally not a good idea
Problems with r² and other statistics
See cautions, KNN p 163
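Outside the packaged options above, the no-intercept fit is simple enough to do directly (a sketch; the closed form b1 = ΣXY/ΣX² follows from least squares with no intercept term):

```python
import numpy as np

def fit_through_origin(x, y):
    """Least squares for Y = beta1 * X + error (no intercept): b1 = sum(XY) / sum(X^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.sum(x * y) / np.sum(x * x)

# fit_through_origin([1, 2, 3], [2, 4, 6])  # -> 2.0
```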
Measurement Error
For Y, this is usually not a problem
For X, we can get biased estimators of our
regression parameters
See KNN 4.5, pp 164-166
Inverse Predictions
Sometimes called calibration
Given Yh, predict the corresponding value of X, Xh(hat)
Solve the fitted equation for Xh
Xh(hat) = (Yh – b0)/b1, b1 ≠ 0
Approximate CI can be given, see KNN, p 167
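A trivial sketch of the point inverse prediction (the error handling is mine):

```python
def inverse_predict(yh, b0, b1):
    """Point inverse prediction: Xh_hat = (Yh - b0) / b1, requires b1 != 0."""
    if b1 == 0:
        raise ValueError("b1 must be nonzero to invert the fitted line")
    return (yh - b0) / b1

# inverse_predict(10.0, 2.0, 4.0)  # -> 2.0
```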
Choice of X Values (Levels)
Look at the formulas for the variances of the
estimators of interest
Usually we find Σ(Xi – X̄)² in a denominator
So we want to spread out the values of X
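A small illustration of why spread matters, holding σ and n fixed (the function name is mine):

```python
import numpy as np

def sxx(x):
    """Sxx = sum((Xi - Xbar)^2); Var(b1) = sigma^2 / Sxx, so bigger Sxx is better."""
    x = np.asarray(x, dtype=float)
    return np.sum((x - x.mean()) ** 2)

# Same n, same center: spreading the X levels out shrinks Var(b1)
clustered = sxx([4, 5, 6])   # 2.0
spread = sxx([0, 5, 10])     # 50.0
```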
Last slide
Read KNN 4.1 to 4.6, read problems on pp 171-175
Next class we will do all of this with vectors and
matrices so that we can generalize to multiple
regression
If you are rusty in Linear Algebra:
REVIEW KNN 5.1 to 5.7