Warm-up
O Turn in HW - Ch 8 Worksheet
O Complete the warm-up that you picked up by the
door. (You have 10 minutes.)
Objective
O Define and create residual plots (by hand
and on the calculator).
O Use Residual Plots to determine if using a
linear model is appropriate.
O Define and calculate $r^2$ (coefficient of
determination).
O Use $r^2$ to explain how much of the variation
is accounted for by the model.
Residuals
O The difference between an observed value
of the response variable and the value predicted by
the regression line:
$\text{residual} = e = y - \hat{y}$
O $e$ represents the residual
O $\hat{y}$ represents the predicted response value
O $y$ represents the actual response value
Residuals
o Negative residual
means the model
OVER PREDICTS the
y value.
o Positive residual
means the model
UNDER PREDICTS
the y value.
Residual Example
O Data was collected on 16 Ford F-150 trucks
in order to create a linear regression model
that would help predict the price of a used F-150 based on its mileage.
$\widehat{\text{price}} = 38{,}257 - 0.1629(\text{mileage})$
Calculate the residual for a truck that had
70,583 miles and cost $21,994.
Residual Example
$\widehat{\text{price}} = 38{,}257 - 0.1629(\text{mileage})$
$\widehat{\text{price}} = 38{,}257 - 0.1629(70{,}583)$
$\widehat{\text{price}} = \$26{,}759$
$e = y - \hat{y}$
$e = 21{,}994 - 26{,}759 = -\$4{,}765$
This means that the actual price of the truck is $4,765 lower
than the model predicts based on its mileage.
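The arithmetic in this example can be checked with a short script. This is only a sketch: the function names `predicted_price` and `residual` are made up for illustration, while the intercept, slope, mileage, and price come from the slide.

```python
# Fitted model from the slide: predicted price = 38,257 - 0.1629 * mileage
INTERCEPT = 38257.0
SLOPE = -0.1629


def predicted_price(mileage):
    """Price (in dollars) that the regression line predicts for a given mileage."""
    return INTERCEPT + SLOPE * mileage


def residual(actual, predicted):
    """Residual = actual value minus predicted value."""
    return actual - predicted


y_hat = predicted_price(70583)   # truck with 70,583 miles
e = residual(21994, y_hat)       # truck actually cost $21,994

print(f"predicted price: ${y_hat:,.0f}")
print(f"residual: {e:,.0f} dollars")
```

The residual comes out negative, so the model overpredicts the price of this truck.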
Residual Plots
O A scatterplot of the residuals against the
explanatory variable.
O Help us assess how well a regression line fits
the data.
O The plot should show no obvious pattern.
O The residuals should be relatively small in size.
Try on your own
O Answer #11-13 on your Classwork sheet.
Residual Plot Practice
O Do the first page of the Worksheet
Residual Plot (calculator)
O Enter x values in L1 and y values in L2.
O Scroll to put the cursor on L3. Press 2nd, STAT,
Enter, 1 (RESID). This calculates the
residuals and puts them in L3.
O Go to STAT PLOT. Turn on the scatterplot. Pick L1
for Xlist and L3 (RESID) for Ylist. Press ZOOM 9.
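The same steps can be sketched away from the calculator. The code below fits a least-squares line and builds the list of residuals, playing the role of L3. The data values are invented purely for illustration, and the names `least_squares_line`, `residuals`, `l1`, `l2`, and `l3` are assumptions, not anything from the worksheet.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def residuals(xs, ys):
    """Residual (actual minus predicted) for every data point."""
    a, b = least_squares_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Invented illustration data (not from the worksheet)
l1 = [1, 2, 3, 4, 5, 6]                  # x values
l2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]    # y values
l3 = residuals(l1, l2)                   # plays the role of RESID

# Plotting the pairs (x, residual) gives the residual plot.
for x, e in zip(l1, l3):
    print(f"x = {x}: residual = {e:+.3f}")
```

The residuals from a least-squares line always add up to zero (apart from rounding), which is a quick way to check the list.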
Residual Plot Practice
O Go back to your worksheet. Create a
scatterplot for this data. Is the linear model
appropriate?
Standard Deviation of Residuals
O Gives the approximate size of a "typical" or
"average" prediction error.
O $s = \sqrt{\dfrac{\sum \text{residuals}^2}{n - 2}}$
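O This can be approximated on the calculator by using
STAT-CALC-1-Var Stats on the residuals and
reading off the standard deviation Sx. (1-Var Stats divides by
$n - 1$ rather than $n - 2$, so Sx is slightly smaller than $s$.)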
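As a rough sketch, the formula can be computed directly from a list of residuals. The residual values below are hypothetical, chosen only to keep the arithmetic easy, and `residual_std` is an illustrative name. The ordinary sample standard deviation is shown next to it for comparison; it divides by n - 1 instead of n - 2.

```python
import math
import statistics


def residual_std(resids):
    """Standard deviation of residuals: sqrt(sum of squared residuals / (n - 2))."""
    n = len(resids)
    return math.sqrt(sum(e ** 2 for e in resids) / (n - 2))


# Hypothetical residuals (they sum to zero, as least-squares residuals do)
resids = [1, -2, 1, 0]

s = residual_std(resids)              # divides by n - 2
s_one_var = statistics.stdev(resids)  # ordinary sample std. dev., divides by n - 1

print(f"s (n - 2 divisor): {s:.3f}")
print(f"sample std. dev. (n - 1 divisor): {s_one_var:.3f}")
```

With only four points the two values differ noticeably. The gap shrinks as the number of data points grows.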
Standard Deviation Practice
O Go back to your worksheet. What is the
typical prediction error for the fare when
using the model?
Coefficient of Determination
O $r^2$ is the fraction of the variation in the
values of y that is accounted for by the LSRL
of y on x.
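One way to see "fraction of the variation accounted for" is to compare the squared residuals from the LSRL with the squared deviations of y from its mean. The sketch below assumes that standard identity, r^2 = 1 - SSE/SST. The four data points are invented for illustration, and `r_squared` is an illustrative name.

```python
def r_squared(xs, ys):
    """Coefficient of determination for the least-squares line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Least-squares slope and intercept
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    # Variation left over after using the line (sum of squared residuals)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation of y about its mean
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst


# Invented illustration data
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]

r2 = r_squared(xs, ys)
print(f"About {r2:.0%} of the variation in y is accounted for by the line.")
```

For these points the correlation is r = 0.8, and the function returns 0.64, matching r squared.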
Interpreting $r^2$
O "__($r^2$)__% of the variation in __(response
variable)__ is accounted for by the linear
model relating __(response variable)__ to
__(explanatory variable)__."
O Example: From the Roller Coaster warm-up,
where $\widehat{\text{duration}} = 91.033 + 0.242(\text{drop})$
O If we calculate that $r^2$ is 0.82, we would
interpret that by saying "82% of the variation
in duration of the ride is accounted for by
the linear model relating duration to initial
drop."
Interpreting $r^2$
O Explain what $r^2$ means in context of the data
for the Distance vs. Fare example (#2).
Which scatterplot has $r = 0.816$?
Is the linear model appropriate?
O The scatterplot must meet the "straight enough
condition."
O Correlation coefficient: $r$ close to 1 or $-1$.
O Residual plot: random and not too far from
the line.
Calculating LSRL from means
$\hat{y} = a + bx$
O Slope
$b = r \dfrac{s_y}{s_x}$
O Y-intercept
$a = \bar{y} - b\bar{x}$
$r$ = correlation coefficient
$s_y$ = std. dev. of y-values
$s_x$ = std. dev. of x-values
$\bar{x}$ = mean of x-values
$\bar{y}$ = mean of y-values
Example
In a random sample of 15 high school students, the mean height was
found to be 171.43 cm with a standard deviation of 10.69 cm, and the
mean foot length was 24.76 cm with a standard deviation of 2.71 cm. The
correlation coefficient between foot length and height is $r = 0.697$.
Find the LSRL and the coefficient of determination.
$b = r \dfrac{s_y}{s_x}$
$b = 0.697 \cdot \dfrac{10.69}{2.71}$
$b = 2.75$
$\hat{y} = a + bx$
$a = \bar{y} - b\bar{x}$
$a = 171.43 - 2.75(24.76)$
$a = 103.34$
$\widehat{\text{height}} = 103.34 + 2.75(\text{foot length})$
$r^2 = (0.697)^2$
$r^2 = 0.486$
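The height and foot length example can be re-run as a quick check. This is a sketch only: `lsrl_from_summary` is an illustrative name, and the inputs are the summary statistics given in the example (foot length as x, height as y).

```python
def lsrl_from_summary(x_bar, s_x, y_bar, s_y, r):
    """Return (a, b) for y-hat = a + b*x using b = r*s_y/s_x and a = y_bar - b*x_bar."""
    b = r * s_y / s_x
    a = y_bar - b * x_bar
    return a, b


# Summary statistics from the example: x = foot length (cm), y = height (cm)
a, b = lsrl_from_summary(x_bar=24.76, s_x=2.71, y_bar=171.43, s_y=10.69, r=0.697)
r2 = 0.697 ** 2

print(f"slope b = {b:.2f}")
print(f"intercept a = {a:.2f}")
print(f"r^2 = {r2:.3f}")
```

The slope rounds to 2.75 and r^2 to 0.486, as in the worked solution. The script carries the unrounded slope into the intercept, so it gives about 103.35 rather than the 103.34 obtained by rounding the slope to 2.75 first.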
Outliers and Influential Points
O Outliers: observations that lie outside the overall
pattern in the x or y direction.
O Outliers in the y-direction have large
residuals.
O Influential points: points that, if removed, would
markedly change the result of the
calculation.
O Typically these are extreme x-outliers
with small residuals.
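A small experiment can make the idea of an influential point concrete. The data below are invented for illustration: five points that lie exactly on a line, plus one extreme x-outlier. The sketch fits the LSRL with and without the extra point and compares the slopes. The name `least_squares_line` is an assumption.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return a, b


# Invented data: five points on the line y = 2x, plus one extreme x-outlier
base_x = [1, 2, 3, 4, 5]
base_y = [2, 4, 6, 8, 10]
outlier = (20, 5)

a0, b0 = least_squares_line(base_x, base_y)  # fit without the outlier
a1, b1 = least_squares_line(base_x + [outlier[0]], base_y + [outlier[1]])  # fit with it

# Residuals measured from the line that includes the outlier
outlier_residual = outlier[1] - (a1 + b1 * outlier[0])
other_residuals = [y - (a1 + b1 * x) for x, y in zip(base_x, base_y)]

print(f"slope without the point: {b0:.3f}")
print(f"slope with the point:    {b1:.3f}")
print(f"residual of the extreme point: {outlier_residual:.3f}")
print(f"largest other residual (abs):  {max(abs(e) for e in other_residuals):.3f}")
```

In this made-up data set the extra point pulls the line toward itself: the slope changes sharply, yet the point's own residual is smaller than the largest residual among the other points.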