Transcript 12.1
Chapter 12
More About Regression
Let's look at the Warm-Up first to remind ourselves
what we did with regression!
Remember FODS!
Section 12.1
Inference for Linear Regression
• Least-Squares Regression fits a straight
line of the form $\hat{y} = a + bx$ to data to
predict a response variable y from the
explanatory variable x.
• Inference in this setting uses the sample
regression line to estimate or test a
claim about the population (true)
regression line.
• Confidence intervals and significance tests
about the slope of the population
regression line are based on the sampling
distribution of b, the slope of the sample
regression line.
Conditions - LINER
• Linear: the actual relationship between x and y is
linear. For any fixed value of x, the mean response $\mu_y$ falls
on the population (true) regression line $\mu_y = \alpha + \beta x$.
Check: graph the scatterplot.
• Independent: individual observations are independent.
• Normal: for any fixed value of x, the response y varies
according to a Normal distribution. Check: graph the
Normal probability plot of the residuals.
• Equal Variance: the standard deviation of y (call it $\sigma$)
is the same for all values of x. Check: graph the residual plot.
• Random: the data are produced from a well-designed
random sample or a randomized experiment.
How it works...
• The slope b and intercept a of the least-squares
regression line estimate the slope $\beta$ and the
intercept $\alpha$ of the population (true) regression line.
• To estimate $\sigma$, use the standard deviation
s of the residuals.
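To see what "estimate" means here, the following is a minimal simulation sketch, not part of the original lesson. It assumes a made-up population line ($\alpha = 1$, $\beta = 2$, $\sigma = 0.5$), draws one sample from it, and computes a, b, and s from that sample. All numbers and variable names are hypothetical.

```python
import random
from math import sqrt

random.seed(12)  # fixed seed so the sketch is repeatable

# Hypothetical population (true) regression line and spread, made up for illustration.
ALPHA, BETA, SIGMA = 1.0, 2.0, 0.5

# One sample: 200 x-values, each y drawn from a Normal distribution
# centered on the population line with standard deviation SIGMA.
x = [i * 0.05 for i in range(200)]
y = [ALPHA + BETA * xi + random.gauss(0, SIGMA) for xi in x]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept of the sample regression line.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b = sxy / sxx
a = y_bar - b * x_bar

# Standard deviation of the residuals, with n - 2 degrees of freedom.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
s = sqrt(sum(r ** 2 for r in residuals) / (n - 2))

print(f"b = {b:.3f} estimates beta = {BETA}")
print(f"a = {a:.3f} estimates alpha = {ALPHA}")
print(f"s = {s:.3f} estimates sigma = {SIGMA}")
```

A different seed gives a different sample and slightly different values of a, b, and s. That sample-to-sample variation in b is what the sampling distribution of the slope describes.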
Confidence Intervals
Confidence intervals and significance tests
for the slope $\beta$ of the population regression
line are based on a t distribution with n - 2
degrees of freedom.
• Why n - 2? We estimate two parameters from the data
(the intercept $\alpha$ and the slope $\beta$), and each
estimate costs one degree of freedom.
• The t interval for the slope $\beta$ has the
form $b \pm t^* \cdot SE_b$
• The standard error of the slope is $SE_b = \dfrac{s}{s_x \sqrt{n-1}}$
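Let's look at SE...
• $SE_b = \dfrac{s}{s_x \sqrt{n-1}}$
• $s = \sqrt{\dfrac{\sum \text{residuals}^2}{n-2}}$
On the AP formula sheet:
• $s_b = \dfrac{\sqrt{\dfrac{\sum (y - \hat{y})^2}{n-2}}}{\sqrt{\sum (x - \bar{x})^2}}$, where $s = \sqrt{\dfrac{\sum (y - \hat{y})^2}{n-2}}$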
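The two forms of the standard error above look different but agree. A short derivation sketch, assuming $s_x$ is the usual sample standard deviation of the x-values (dividing by n - 1):

```latex
\begin{align*}
s_x &= \sqrt{\frac{\sum (x - \bar{x})^2}{n-1}}
  \quad\Longrightarrow\quad
  s_x \sqrt{n-1} = \sqrt{\sum (x - \bar{x})^2} \\[4pt]
SE_b &= \frac{s}{s_x \sqrt{n-1}}
  = \frac{\sqrt{\dfrac{\sum (y - \hat{y})^2}{n-2}}}{\sqrt{\sum (x - \bar{x})^2}}
\end{align*}
```

So the formula-sheet version is the same quantity with s and $s_x$ written out in terms of the data.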
Hypothesis Tests
• To test the null hypothesis $H_0: \beta = \beta_0$ (the hypothesized value),
carry out a t test for the slope.
• Use $t = \dfrac{b - \beta_0}{SE_b}$ with n - 2 degrees of freedom.
• The most common null hypothesis is
$H_0: \beta = 0$, which says that there is no
linear relationship between x and y in the
population.
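As a rough illustration of the test statistic, here is a minimal Python sketch using only the standard library. The values of b, $SE_b$, and n are hypothetical, and the P-value comes from numerically integrating the t density with Simpson's rule, which is a stand-in for the calculator's tcdf or a t table rather than the method used in class.

```python
from math import gamma, pi, sqrt


def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)


def upper_tail_p(t, df, steps=2000):
    """P(T >= t), found by Simpson's rule on [0, |t|] (steps must be even)."""
    h = abs(t) / steps
    total = t_pdf(0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the density between 0 and |t|
    return 0.5 - area if t >= 0 else 0.5 + area


def slope_t_statistic(b, beta_0, se_b):
    """t = (b - beta_0) / SE_b"""
    return (b - beta_0) / se_b


# Hypothetical sample results, made up for illustration.
b, se_b, n = 2.0, 0.8, 20

t = slope_t_statistic(b, 0.0, se_b)       # H0: beta = 0
p_one_sided = upper_tail_p(t, n - 2)      # Ha: beta > 0

print(f"t = {t:.2f}, df = {n - 2}, one-sided P-value = {p_one_sided:.4f}")
```

For a two-sided alternative ($H_a: \beta \neq 0$), double the one-sided tail area for |t|.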
Let's do a confidence interval!
We examined data from a study that investigated why some
people don't gain weight even when they overeat. Researchers
deliberately overfed a random sample of 16 healthy young adults
for 8 weeks. They measured fat gain and change in energy use
from activity other than deliberate exercise (non-exercise
activity, NEA), such as fidgeting and daily living, for each subject. Here
are the results:
NEA Change (cal)   -94   -57   -29   135   143   151   245   355
Fat Gain (kg)      4.2   3.0   3.7   2.7   3.2   3.6   2.4   1.3

NEA Change (cal)   392   473   486   535   571   580   620   690
Fat Gain (kg)      3.8   1.7   1.6   2.2   1.0   0.4   2.3   1.1
Construct and interpret a 90% confidence interval
for the slope of the population regression line.
Check conditions first! Type information into calculator!
• Linear: look at the scatterplot and draw it to show that
you have checked this condition.
• Independent:
• Normal: look at the Normal probability plot of the residuals
and draw it to show you checked this condition. (Run
LinReg first, then make the NPP using RESID, found under 2nd LIST.)
Keep checking conditions...
• Equal Variance: we want the standard deviation of the residuals
(their typical distance from their mean, which is 0) to be about the same
for all values of x. Draw the residual plot to show that you
have looked at it.
• Random:
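For anyone working without the calculator, the residuals behind those two plots can be computed directly. This is a minimal Python sketch using the NEA data above. The plotting positions (i - 0.5)/n for the Normal probability plot are one common convention and an assumption here, so the points may differ slightly from what the calculator draws.

```python
from statistics import NormalDist, mean

# Data from the overfeeding study above.
nea = [-94, -57, -29, 135, 143, 151, 245, 355,
       392, 473, 486, 535, 571, 580, 620, 690]
fat = [4.2, 3.0, 3.7, 2.7, 3.2, 3.6, 2.4, 1.3,
       3.8, 1.7, 1.6, 2.2, 1.0, 0.4, 2.3, 1.1]

# Least-squares line: fat-hat = a + b * nea
x_bar, y_bar = mean(nea), mean(fat)
sxx = sum((x - x_bar) ** 2 for x in nea)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(nea, fat))
b = sxy / sxx
a = y_bar - b * x_bar

# Residual = observed y - predicted y (what the calculator stores in RESID).
residuals = [y - (a + b * x) for x, y in zip(nea, fat)]

# Residual plot points: (x, residual). Look for no pattern and even spread.
residual_plot = list(zip(nea, residuals))

# Normal probability plot points: (sorted residual, expected z-score).
# Look for a roughly straight line.
n = len(residuals)
z_scores = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
npp = list(zip(sorted(residuals), z_scores))

for x, r in residual_plot:
    print(f"{x:5d}  {r:6.2f}")
```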
Do:
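One way to carry out the calculation, sketched in Python rather than on the calculator. The critical value t* = 1.761 is the table value for 90% confidence with 16 - 2 = 14 degrees of freedom. Variable names are illustrative.

```python
from math import sqrt

# Data from the overfeeding study above.
nea = [-94, -57, -29, 135, 143, 151, 245, 355,
       392, 473, 486, 535, 571, 580, 620, 690]
fat = [4.2, 3.0, 3.7, 2.7, 3.2, 3.6, 2.4, 1.3,
       3.8, 1.7, 1.6, 2.2, 1.0, 0.4, 2.3, 1.1]

n = len(nea)
x_bar = sum(nea) / n
y_bar = sum(fat) / n

# Least-squares slope b and intercept a.
sxx = sum((x - x_bar) ** 2 for x in nea)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(nea, fat))
b = sxy / sxx
a = y_bar - b * x_bar

# s = standard deviation of the residuals (n - 2 degrees of freedom).
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(nea, fat))
s = sqrt(sse / (n - 2))

# Standard error of the slope: s / sqrt(sum of (x - x_bar)^2),
# which equals s / (s_x * sqrt(n - 1)).
se_b = s / sqrt(sxx)

# 90% confidence interval: b +/- t* SE_b, with t* from the t table (df = 14).
t_star = 1.761
lower = b - t_star * se_b
upper = b + t_star * se_b

print(f"b = {b:.7f}, s = {s:.4f}, SE_b = {se_b:.7f}")
print(f"90% CI for the slope: ({lower:.4f}, {upper:.4f})")
```

In context, the interval runs from about -0.0047 to -0.0021, so we are 90% confident that each additional calorie of NEA change is associated with between roughly 0.0021 and 0.0047 kg less fat gain, on average, in the population.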
A Significance Test...
Infants who cry easily may be more easily stimulated than others. This may
be a sign of higher IQ. Researchers explored the relationship between the crying
of infants 4 to 10 days old and their later IQ scores. The researchers flicked
the infants with a rubber band and recorded the crying. They measured its
intensity by the number of peaks in the most active 20 seconds. The table
below contains data from a random sample of 38 infants.
a) Here is a scatterplot of the data with the least-squares
regression line added. Describe what this graph tells you
about the relationship between these two variables.
b) Using the Minitab output, what is the equation
of the least-squares regression line?
c) Interpret the slope and y-intercept of
the regression line in context.
d) Do these data provide convincing evidence that there is
a positive linear relationship between crying counts and IQ
scores in the population of infants?
Homework
• Pg 759 (6, 8, 13-15, 18-26)