AP STATISTICS LESSON 14 – 1 (DAY 1)

AP STATISTICS
LESSON 14 – 1
( DAY 1 )
INFERENCE ABOUT THE
MODEL
ESSENTIAL QUESTION:
What is regression inference and
how is it used?
Objectives:
• To understand regression inference.
• To find standard errors for regression
lines.
• To create confidence intervals for
regression slope.
Inference About the Model
When a scatterplot shows a linear
relationship between a quantitative
explanatory variable x and a
quantitative response variable y, we can
use the least-squares line fitted to the
data to predict y for a given value of x.
Example 14.1 Page 781
Crying and IQ
• Plot and interpret.
• Numerical summary
• Mathematical model.
We are interested in
predicting the
response from
information about the
explanatory variable.
So we find the least-squares regression
line for predicting IQ from crying:
ŷ = a + bx
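As a minimal sketch of how a and b are computed, the following uses the usual least-squares formulas on a small made-up data set. The numbers and the names least_squares and predict are assumptions for illustration only; they are not the crying and IQ data from Example 14.1.

```python
# Hypothetical (x, y) pairs for illustration only; not the textbook data.
xs = [10.0, 12.0, 15.0, 17.0, 20.0, 22.0]
ys = [87.0, 97.0, 103.0, 113.0, 119.0, 125.0]


def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


a, b = least_squares(xs, ys)


def predict(x):
    """Predicted response y-hat for a given x."""
    return a + b * x
```

One property worth checking: the least-squares line always passes through the point (mean of x, mean of y), so predict(mean of x) returns the mean of y.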
The Regression Model
We use the notation ŷ to remind ourselves
that the regression line gives predictions of
IQ.
The slope b and intercept a of the least-squares line are statistics. That is, we
calculate them from the sample data.
To do formal inference, we think of a and b as
estimates of unknown parameters.
Conditions for Regression
Inference
We have n observations on an explanatory
variable x and a response variable y. Our
goal is to study or predict the behavior of y for
given values of x.
• For any fixed value of x, the response y varies
according to a normal distribution. Repeated
responses y are independent of each other.
Conditions for Regression
Inference (continued…)
• The mean response μy has a straight-line
relationship with x:
μy = α + βx
The slope β and intercept α are unknown
parameters.
• The standard deviation of y (call it σ ) is the same
for all values of x. The value of σ is unknown.
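One way to see what these conditions say is to generate data from the model. The sketch below assumes particular values for α, β, and σ purely for illustration (in practice they are unknown), and the names mean_response and simulate_responses are hypothetical.

```python
import random

# Hypothetical parameter values chosen only for illustration;
# in a real problem alpha, beta, and sigma are unknown.
ALPHA, BETA, SIGMA = 90.0, 1.5, 10.0


def mean_response(x):
    """True regression line: mean of y at a given x."""
    return ALPHA + BETA * x


def simulate_responses(xs, rng):
    """Draw one independent normal response for each x.

    Each y is centered on the true line and has the same
    standard deviation SIGMA regardless of x.
    """
    return [rng.gauss(mean_response(x), SIGMA) for x in xs]


rng = random.Random(0)      # seeded so the sketch is repeatable
xs = [10, 15, 20, 25, 30]
ys = simulate_responses(xs, rng)
```

Each simulated y scatters around its own mean on the line, with the same spread at every x, which is what the conditions describe.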
The Heart of the
Regression Model
The heart of this model is that there is an “on
the average” straight-line relationship
between y and x. The true regression line μy = α + βx
says that the mean response μy
moves along a straight line as the
explanatory variable x changes.
The mean of the response y moves along this
line as the explanatory variable x takes
different values.
Inference
The first step in inference is to estimate
the unknown parameters α, β, and σ.
The slope b is an unbiased estimator of
the true slope β, and the intercept a of
the least-squares line is an unbiased
estimator of the true intercept α.
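"Unbiased" means that over many repeated samples the estimates average out to the true parameter values. A rough simulation sketch of that idea follows; the parameter values, the x values, and the function names are all assumptions made for illustration.

```python
import random

# Hypothetical "true" parameters, assumed only for this simulation.
ALPHA, BETA, SIGMA = 90.0, 1.5, 10.0
XS = [10.0, 15.0, 20.0, 25.0, 30.0]


def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b


def average_estimates(n_samples, seed=1):
    """Average a and b over many samples drawn from the model."""
    rng = random.Random(seed)
    total_a = total_b = 0.0
    for _ in range(n_samples):
        ys = [rng.gauss(ALPHA + BETA * x, SIGMA) for x in XS]
        a, b = least_squares(XS, ys)
        total_a += a
        total_b += b
    return total_a / n_samples, total_b / n_samples


mean_a, mean_b = average_estimates(5000)
```

Any single sample gives an a and b that miss α and β by some amount, but mean_a and mean_b should land close to ALPHA and BETA.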
Example 14.2 Page 784
Slope and Intercept
A slope is a rate of change.
The true slope β says how much higher
average IQ is for children with one more
peak in their crying measurement.
We need the intercept α to draw the
line, but it has no statistical meaning in this example.
Example 14.2 (continued…)
The standard deviation σ describes the
variability of the response y about the true
regression line.
The least-squares line estimates the true
regression line. Recall that the residuals are
the vertical deviations of the data points from
the least-squares line:
Residual = observed y – predicted y = y – ŷ
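The residual calculation can be sketched as follows, again on made-up data (the values and the helper name least_squares are assumptions, not taken from the example).

```python
# Hypothetical (x, y) pairs for illustration only.
xs = [10.0, 12.0, 15.0, 17.0, 20.0, 22.0]
ys = [87.0, 97.0, 103.0, 113.0, 119.0, 125.0]


def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b


a, b = least_squares(xs, ys)

# Residual = observed y - predicted y, one per data point.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
```

There is one residual per data point, and for a least-squares line with an intercept the residuals sum to zero (up to rounding).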
Standard Error About the
Least-Squares Line
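Because σ is the standard deviation of responses about the true regression
line, we estimate it by a sample standard deviation of the residuals.
We call this sample standard deviation a standard error
to emphasize that it is estimated from data.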
The standard error about the line is
s = √[ (1/(n – 2)) ∑ residual² ]
s = √[ (1/(n – 2)) ∑ (y – ŷ)² ]
Use s to estimate the unknown σ in the regression
model.
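A minimal sketch of the standard error calculation, using a tiny made-up data set so the arithmetic can be checked by hand. The data and the function names are assumptions for illustration.

```python
import math


def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b


def standard_error_about_line(xs, ys):
    """s = sqrt( sum of squared residuals / (n - 2) ); needs n > 2."""
    a, b = least_squares(xs, ys)
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    n = len(xs)
    return math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))


# Hypothetical data: the fitted line is y-hat = 0.5 + 0.8x, the
# residuals are -0.3, 0.9, -0.9, 0.3, and their squares sum to 1.8.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 2.0, 4.0]
s = standard_error_about_line(xs, ys)   # sqrt(1.8 / 2) = sqrt(0.9)
```

The divisor is n – 2 rather than n – 1 because two parameters, the slope and the intercept, were estimated from the data before the residuals were computed.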