Transcript Regression

Applied Regression Analysis
BUSI 6220
KNN Ch. 4
Simultaneous Inferences and Other Topics
Joint Estimation of β0 and β1
Confidence intervals are used for a single parameter; confidence regions for two or more parameters
 The region for (β0, β1) defines a set of lines
Since the estimators b0 and b1 are (jointly) normal, the natural confidence region is an ellipse
 KNN do rectangles (KNN 4.1)
Bonferroni
We want the probability that both intervals are correct to be (at least) .95
Basic idea is an error budget (α = .05)
Spend half on β0 (.025) and half on β1 (.025)
We use α = .025 for the β0 CI (97.5% CI)
and α = .025 for the β1 CI (97.5% CI)
Bonferroni (2)
So we use
b1 ± t*·s(b1)
b0 ± t*·s(b0)
where t* = t(.9875, n − 2)
.9875 = 1 − (.05)/(2·2)
Bonferroni (3)
Note we start with a 5% error budget and we have two intervals, so we give 2.5% to each
Each interval has two ends, so we again divide by 2
So, .9875 = 1 − (.05)/(2·2)
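As a sketch of the calculation above (the dataset, seed, and sample size below are made-up illustrations, not from the slides), the Bonferroni joint 95% intervals for β0 and β1 can be computed like this:

```python
import numpy as np
from scipy import stats

# Hypothetical data for illustration only
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 20)
Y = 2.0 + 0.5 * X + rng.normal(scale=1.0, size=X.size)
n = X.size

# Simple linear regression by least squares
Xbar, Ybar = X.mean(), Y.mean()
Sxx = np.sum((X - Xbar) ** 2)
b1 = np.sum((X - Xbar) * (Y - Ybar)) / Sxx
b0 = Ybar - b1 * Xbar

# MSE and the standard errors of b0 and b1 (KNN Ch. 2 formulas)
resid = Y - (b0 + b1 * X)
mse = np.sum(resid ** 2) / (n - 2)
s_b1 = np.sqrt(mse / Sxx)
s_b0 = np.sqrt(mse * (1.0 / n + Xbar ** 2 / Sxx))

# Bonferroni: split alpha = .05 over two intervals, two tails each,
# so t* = t(1 - .05/(2*2), n-2) = t(.9875, n-2)
alpha = 0.05
t_star = stats.t.ppf(1 - alpha / 4, n - 2)

ci_b0 = (b0 - t_star * s_b0, b0 + t_star * s_b0)
ci_b1 = (b1 - t_star * s_b1, b1 + t_star * s_b1)
print(f"t* = {t_star:.4f}")
print(f"joint 95% CI for beta0: ({ci_b0[0]:.3f}, {ci_b0[1]:.3f})")
print(f"joint 95% CI for beta1: ({ci_b1[0]:.3f}, {ci_b1[1]:.3f})")
```

Note that t* here is larger than the usual t(.975, n − 2), which is the price paid for joint coverage.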
Bonferroni Inequality
Let the two intervals be I1 and I2
We will use cor (= correct) if the interval contains the true parameter value, inc (= incorrect) if not
Bonferroni Inequality (2)
P(both cor) = 1 − P(at least one inc)
P(at least one inc) = P(I1 inc) + P(I2 inc) − P(both inc)
≤ P(I1 inc) + P(I2 inc)
So P(both cor) ≥ 1 − (P(I1 inc) + P(I2 inc))
Bonferroni Inequality (3)
P(both cor) ≥ 1 − (P(I1 inc) + P(I2 inc))
So if we use .05/2 for each interval,
1 − (P(I1 inc) + P(I2 inc)) = 1 − .05 = .95
So P(both cor) is at least .95
We will use this idea when we do multiple comparisons in ANOVA
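A quick Monte Carlo check of the inequality (the true line, error scale, and simulation counts are illustrative assumptions): simulate many datasets from a known line, build both Bonferroni intervals each time, and count how often both cover. The observed joint coverage should be at least roughly .95.

```python
import numpy as np
from scipy import stats

# Hypothetical true model and design, chosen only for illustration
rng = np.random.default_rng(1)
beta0, beta1, sigma = 2.0, 0.5, 1.0
X = np.linspace(0, 10, 20)
n, n_sim = X.size, 2000
t_star = stats.t.ppf(1 - 0.05 / 4, n - 2)  # Bonferroni multiplier, t(.9875, n-2)

Xbar = X.mean()
Sxx = np.sum((X - Xbar) ** 2)

both_cover = 0
for _ in range(n_sim):
    Y = beta0 + beta1 * X + rng.normal(scale=sigma, size=n)
    b1 = np.sum((X - Xbar) * (Y - Y.mean())) / Sxx
    b0 = Y.mean() - b1 * Xbar
    mse = np.sum((Y - b0 - b1 * X) ** 2) / (n - 2)
    s_b1 = np.sqrt(mse / Sxx)
    s_b0 = np.sqrt(mse * (1.0 / n + Xbar ** 2 / Sxx))
    # Both intervals correct on this dataset?
    both_cover += (abs(b0 - beta0) <= t_star * s_b0) and (abs(b1 - beta1) <= t_star * s_b1)

coverage = both_cover / n_sim
print(f"joint coverage: {coverage:.3f}  (Bonferroni guarantees at least .95)")
```

Because the two error events typically overlap, the observed joint coverage usually comes out somewhat above .95, which is exactly the conservatism of the bound.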
If the error events overlap, P(both cor) is actually greater than .95
If they do not overlap at all, P(both cor) = .95
[Figure: two error events of probability .025 each, shown overlapping vs. disjoint]
Mean Response CIs
Simultaneous estimation for all Xh: use Working-Hotelling (KNN 2.6)
Ŷh ± W·s(Ŷh)
where W² = 2·F(1 − α; 2, n − 2)
For simultaneous estimation for a few (g) Xh, use Bonferroni
Ŷh ± B·s(Ŷh)
where B = t(1 − α/(2g), n − 2)
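In practice one computes both multipliers and uses the smaller. A minimal sketch (the values of n, α, and g are assumptions for illustration):

```python
import numpy as np
from scipy import stats

# Hypothetical settings: compare the two multipliers and use the smaller
n, alpha, g = 20, 0.05, 3
W = np.sqrt(2 * stats.f.ppf(1 - alpha, 2, n - 2))  # Working-Hotelling, all Xh
B = stats.t.ppf(1 - alpha / (2 * g), n - 2)        # Bonferroni, g intervals
print(f"W = {W:.3f}, B = {B:.3f}")
```

For a small number of levels g, Bonferroni is often the tighter of the two; as g grows, B grows while W stays fixed, so Working-Hotelling eventually wins.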

Simultaneous PIs
Simultaneous prediction for a few (g) Xh
use Bonferroni
Ŷh ± B·s{pred}
where B = t(1 − α/(2g), n − 2)
Or Scheffé
Ŷh ± S·s{pred}
where S² = g·F(1 − α; g, n − 2)
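As with the confidence-interval case, compute both multipliers and take the smaller; a sketch under assumed values of n, α, and g:

```python
import numpy as np
from scipy import stats

# Hypothetical settings for illustration
n, alpha, g = 20, 0.05, 3
B = stats.t.ppf(1 - alpha / (2 * g), n - 2)        # Bonferroni
S = np.sqrt(g * stats.f.ppf(1 - alpha, g, n - 2))  # Scheffe
print(f"B = {B:.3f}, S = {S:.3f}")
```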

Regression through the Origin
Yi = β1Xi + εi
 How to set it up in your Stat software:

 Check “Constant is Zero” (Excel)
 Uncheck “Fit Intercept” in OPTIONS (MINITAB)
 Uncheck “Include Const. in Eq.” in OPTIONS (SPSS)
 “NOINT” option in PROC REG (SAS)
Generally not a good idea
 Problems with r2 and other statistics
 See cautions, KNN p 163
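If you do fit through the origin anyway, the least-squares slope is b1 = ΣXiYi / ΣXi². A minimal sketch with made-up data (not from the slides), checked against numpy's general least-squares routine:

```python
import numpy as np

# Hypothetical data for illustration
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least squares through the origin (KNN 4.4): b1 = sum(X*Y) / sum(X^2)
b1 = (X @ Y) / (X @ X)

# Same fit via lstsq on a single column, i.e. no intercept term
b1_check = np.linalg.lstsq(X[:, None], Y, rcond=None)[0][0]
print(f"b1 = {b1:.4f}")
```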

Measurement Error
For Y, measurement error is usually not a problem
For X, measurement error can give biased estimators of our regression parameters
See KNN 4.5, pp 164-166

Inverse Predictions
Sometimes called calibration
Given Yh, predict the corresponding value of X, X̂h
Solve the fitted equation for Xh
X̂h = (Yh − b0)/b1, b1 ≠ 0
Approximate CI can be given; see KNN, p 167
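A one-line sketch of the point estimate (the fitted coefficients below are hypothetical, not from the slides):

```python
# Hypothetical fitted line for illustration: Yhat = b0 + b1*X
b0, b1 = 2.0, 0.5

def inverse_predict(yh, b0, b1):
    """Solve yh = b0 + b1*x for x (requires b1 != 0)."""
    if b1 == 0:
        raise ValueError("b1 must be nonzero")
    return (yh - b0) / b1

xh_hat = inverse_predict(4.5, b0, b1)
print(xh_hat)  # (4.5 - 2.0) / 0.5 = 5.0
```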
Choice of X Values (Levels)
Look at the formulas for the variances of the
estimators of interest
Usually we find Σ(Xi − X̄)² in a denominator
 So we want to spread out the values of X
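A small numeric illustration (both designs below are made up): since Var(b1) = σ²/Σ(Xi − X̄)², the design with larger spread gives the more precise slope estimate.

```python
import numpy as np

def Sxx(X):
    """Sum of squared deviations about the mean; Var(b1) = sigma^2 / Sxx(X)."""
    return np.sum((X - X.mean()) ** 2)

# Two hypothetical designs over the same range [0, 10] with n = 4
X_endpoints = np.array([0.0, 0.0, 10.0, 10.0])     # all points at the ends
X_even = np.array([0.0, 10 / 3, 20 / 3, 10.0])     # evenly spaced

# Larger Sxx means smaller Var(b1)
print(Sxx(X_endpoints), Sxx(X_even))
```

Of course, putting all points at two levels gives no way to check linearity, so in practice the spread is balanced against model-checking needs.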
Last slide
Read KNN 4.1 to 4.6, read problems on pp 171-175
 Next class we will do all of this with vectors and
matrices so that we can generalize to multiple
regression
 If you are rusty in Linear Algebra:

REVIEW KNN 5.1 to 5.7