Warm-up
O Turn in HW – Ch 8 Worksheet
O Complete the warm-up that you picked up by the
door. (You have 10 minutes.)
Objective
O Define and create residual plots (by hand
and on the calculator).
O Use residual plots to determine whether a
linear model is appropriate.
O Define and calculate r² (the coefficient of
determination).
O Use r² to explain how much of the variation
is accounted for by the model.
Residuals
O The difference between an observed value
of the response variable and the value predicted by
the regression line.
residual: \( e = y - \hat{y} \)
O \( e \) represents the residual
O \( \hat{y} \) represents the predicted response value
O \( y \) represents the actual response value
Residuals
o Negative residual
means the model
OVER PREDICTS the
y value.
o Positive residual
means the model
UNDER PREDICTS
the y value.
Residual Example
O Data were collected on 16 Ford F-150 trucks
in order to create a linear regression model
that would help predict the price of a used F-150 based on its mileage.
\( \widehat{\text{price}} = 38{,}257 - 0.1629(\text{mileage}) \)
Calculate the residual for a truck that had
70,583 miles and cost $21,994.
Residual Example
\( \widehat{\text{price}} = 38{,}257 - 0.1629(\text{mileage}) \)
\( e = y - \hat{y} \)
\( \widehat{\text{price}} = 38{,}257 - 0.1629(70{,}583) \)
\( \widehat{\text{price}} = \$26{,}759 \)
\( e = 21{,}994 - 26{,}759 = -\$4765 \)
This means that the actual price of the truck is $4765 lower
than expected based on its miles.
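The same arithmetic can be checked with a short Python sketch. The function names `predicted_price` and `residual` are illustrative choices, not part of the lesson; the coefficients and the truck's mileage and price come from the example above.

```python
def predicted_price(mileage):
    """Predicted price in dollars from the regression line in the example."""
    return 38257 - 0.1629 * mileage


def residual(actual, predicted):
    """Residual = actual (observed) value minus predicted value."""
    return actual - predicted


y_hat = predicted_price(70583)   # predicted price for 70,583 miles
e = residual(21994, y_hat)       # actual price was $21,994

print(round(y_hat))  # 26759
print(round(e))      # -4765 (negative: the model overpredicts)
```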
Residual Plots
O A scatterplot of the residuals against the
explanatory variable.
O Help us assess how well a regression line fits
the data.
O Should show no obvious pattern.
O Residuals should be relatively small in size.
Try on your own
O Answer #11-13 on your Classwork sheet.
Residual Plot Practice
O Do the first page of the Worksheet
Residual Plot (calculator)
O Enter x values in L1 and y values in L2.
O Scroll to put the cursor on L3. Press 2nd, STAT,
Enter, 1 (RESID). This calculates the
residuals and puts them in L3. (RESID is filled in
once a regression has been run on L1 and L2.)
O Go to STAT PLOT. Turn on the scatterplot. Pick L1
for Xlist and L3 (RESID) for Ylist. Press ZOOM 9.
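For comparison, here is a minimal Python sketch of what the calculator does in these steps: fit the least-squares line, then store the residuals in a third list. The data values are hypothetical (made up for illustration), and the names `least_squares_line` and `residuals` are assumptions, not calculator commands.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def residuals(xs, ys):
    """Residuals e = y - y-hat for each data point."""
    a, b = least_squares_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical data, playing the role of lists L1 and L2
L1 = [1, 2, 3, 4, 5]
L2 = [2.1, 3.9, 6.2, 7.8, 10.1]

L3 = residuals(L1, L2)  # plays the role of RESID stored in L3

# The residual plot is the set of points (x, residual)
for x, e in zip(L1, L3):
    print(x, round(e, 2))
```

Plotting the pairs from `L1` and `L3` gives the residual plot. For a least-squares line the residuals always sum to zero, up to rounding.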
Residual Plot Practice
O Go back to your worksheet. Create a
scatterplot for this data. Is the linear model
appropriate?
Standard Deviation of
Residuals
O Gives the approximate size of a "typical" or
"average" prediction error.
O \( s = \sqrt{\dfrac{\sum \text{residuals}^2}{n - 2}} \)
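O This can be approximated on the calculator by using
STAT-CALC-1-Var Stats on the residuals and
reading off the sample standard deviation (Sx). Sx divides
by n − 1 rather than n − 2, so it comes out slightly smaller than s.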
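A minimal Python sketch of the formula, assuming the residuals are already in a list. The residual values are hypothetical and the function name `residual_std_dev` is illustrative. The ordinary sample standard deviation is computed as well, to show the effect of dividing by n − 1 instead of n − 2.

```python
import math
import statistics


def residual_std_dev(residuals):
    """s = sqrt(sum of squared residuals / (n - 2))."""
    n = len(residuals)
    if n <= 2:
        raise ValueError("need more than 2 residuals")
    return math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))


# Hypothetical residuals from a least-squares fit (they sum to zero)
example_residuals = [0.06, -0.13, 0.18, -0.21, 0.10]

s = residual_std_dev(example_residuals)
sx = statistics.stdev(example_residuals)  # divides by n - 1 instead

print(round(s, 3))   # typical prediction error using n - 2
print(round(sx, 3))  # slightly smaller, because it divides by n - 1
```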
Standard Deviation Practice
O Go back to your worksheet. What is the
typical prediction error for the fare when
using the model?
Coefficient of Determination
O r² is the fraction of the variation in the
values of y that is accounted for by the least-squares
regression line (LSRL) of y on x.
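One way to see "fraction of the variation accounted for" is to compare the squared residuals from the line with the squared deviations of y from its mean. The sketch below uses hypothetical data; the name `r_squared` and the variable names are assumptions for illustration.

```python
def r_squared(xs, ys):
    """r^2 = 1 - (sum of squared residuals) / (total variation in y)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx              # slope of the LSRL
    a = y_bar - b * x_bar      # y-intercept of the LSRL
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    sst = sum((y - y_bar) ** 2 for y in ys)                    # total
    return 1 - sse / sst


# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r2 = r_squared(xs, ys)
print(round(r2, 3))  # fraction of variation in y explained by the line
```

For a least-squares line this value equals the square of the correlation coefficient r.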
Interpreting r²
O "__(r²)__% of the variation in _(response
variable)_ is accounted for by the linear
model relating _(response variable)_ to
_(explanatory variable)_."
O Example: From the Roller Coaster warm-up,
where \( \widehat{\text{duration}} = 91.033 + 0.242(\text{drop}) \)
O If we calculate that r² is 0.82, we would
interpret that by saying "82% of the variation
in the duration of the ride is accounted for by
the linear model relating duration to initial
drop."
Interpreting r²
O Explain what r² means in the context of the data
for the Distance vs. Fare example (#2).
Which scatterplot has
r = 0.816?
Is the linear model
appropriate?
O The scatterplot must meet the "straight enough"
condition.
O Correlation coefficient: r close to 1 or −1.
O Residual plot: random and not too far from
the line.
Calculating LSRL from means
\( \hat{y} = a + bx \)
O Slope: \( b = r \dfrac{s_y}{s_x} \)
O Y-intercept: \( a = \bar{y} - b\bar{x} \)
r = correlation coefficient
\( s_y \) = std. dev. of y-values
\( s_x \) = std. dev. of x-values
\( \bar{x} \) = mean of x-values
\( \bar{y} \) = mean of y-values
Example
In a random sample of 15 high school students, the mean height was
found to be 171.43 cm with a standard deviation of 10.69 cm, and the
mean foot length was 24.76 cm with a standard deviation of 2.71 cm. The
correlation coefficient between foot length and height is r = 0.697.
Find the LSRL and the coefficient of determination.
\( b = r \dfrac{s_y}{s_x} = 0.697 \cdot \dfrac{10.69}{2.71} = 2.75 \)
\( a = \bar{y} - b\bar{x} = 171.43 - 2.75(24.76) = 103.34 \)
\( \hat{y} = a + bx \), so \( \widehat{\text{height}} = 103.34 + 2.75(\text{length}) \)
\( r^2 = (0.697)^2 = 0.486 \)
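The example can be reproduced with a short Python sketch. The function name `lsrl_from_summary` is an illustrative choice. The worked solution rounds the slope to 2.75 before computing the intercept, so the sketch does the same to match it; carrying the unrounded slope through gives an intercept of about 103.35 instead.

```python
def lsrl_from_summary(r, x_bar, s_x, y_bar, s_y):
    """Slope and intercept of the LSRL from summary statistics.

    b = r * s_y / s_x   and   a = y_bar - b * x_bar
    """
    b = r * s_y / s_x
    a = y_bar - b * x_bar
    return a, b


# Values from the example: x = foot length (cm), y = height (cm)
r = 0.697
a_exact, b_exact = lsrl_from_summary(r, 24.76, 2.71, 171.43, 10.69)

# Match the worked solution, which rounds the slope first
b = round(b_exact, 2)        # 2.75
a = 171.43 - b * 24.76       # about 103.34

r2 = r ** 2                  # coefficient of determination

print(b, round(a, 2), round(r2, 3))
```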
Outliers and Influential Points
O Outliers: observations that lie
outside the overall
pattern in the x or y
direction.
O Outliers in the y-direction have large
residuals.
O Influential point: a
point that, if
removed, would
markedly change the
result of the
calculation.
O Typically these are
extreme x-outliers
with small residuals.
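A small Python sketch can illustrate an influential point: fit the line with and without one extreme x-value and compare slopes. The data are hypothetical, chosen so that five points lie exactly on y = x and a sixth sits far to the right; the function name `least_squares_line` is an assumption.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data: the last point (20, 5) is an extreme x-outlier
xs = [1, 2, 3, 4, 5, 20]
ys = [1, 2, 3, 4, 5, 5]

a_all, b_all = least_squares_line(xs, ys)              # with the point
a_trim, b_trim = least_squares_line(xs[:-1], ys[:-1])  # point removed

# Residuals measured from the line fitted to ALL the points
resid_last = ys[-1] - (a_all + b_all * xs[-1])
resid_first = ys[0] - (a_all + b_all * xs[0])

print(round(b_all, 3), round(b_trim, 3))            # slope changes markedly
print(round(resid_last, 3), round(resid_first, 3))  # outlier's residual is smaller
```

Removing the extreme point changes the slope markedly, even though that point's own residual is smaller than the residual of an ordinary point, because it pulls the line toward itself.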