3.2 - Least-Squares Regression

Where else have we
seen “residuals?”
• sx = data point - mean (observed - predicted)
• z-scores = observed - expected
* note: this is just the numerator of these calculations
Remember:
AP
Below is the LSRL for sprint time (seconds) and long jump distance (inches).
Find and interpret the residual for John, who had a sprint time of 8.09 seconds
and a jump of 151 inches.
• predicted long jump distance = 304.56 - 27.63(sprint time)
predicted long jump distance = 304.56 - 27.63(8.09) = 81.03 inches
residual = observed - predicted
residual = 151 - 81.03
residual = 69.97 inches
John jumped much farther than our
least-squares regression line predicted.
Based on his sprint time, he jumped almost
70 inches farther than predicted.
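The arithmetic in John's example can be checked with a short sketch. The coefficients and John's numbers come from the slide; the function and variable names are illustrative assumptions only.

```python
# Coefficients and John's data come from the slide above;
# the function and variable names are illustrative only.
INTERCEPT = 304.56   # inches
SLOPE = -27.63       # inches per second of sprint time

def predicted_jump(sprint_time):
    """Predicted long jump distance (inches) from the LSRL."""
    return INTERCEPT + SLOPE * sprint_time

def residual(observed, predicted):
    """Residual = observed - predicted."""
    return observed - predicted

john_predicted = predicted_jump(8.09)
john_residual = residual(151, john_predicted)
print(round(john_predicted, 2), round(john_residual, 2))
```

A positive residual means the observed jump lies above the line; a negative residual would mean it lies below.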
So why a least-squares
regression line?
http://bcs.whfreeman.com/tps4e/#628644__666392__
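One way to see where the name comes from: the LSRL is the line that makes the sum of the squared residuals as small as possible. The sketch below uses a small made-up data set (not from the slides) and compares the LSRL with one arbitrary competing line; all names are hypothetical.

```python
# Hypothetical data, made up for illustration (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def sum_squared_residuals(slope, intercept):
    """Sum of squared vertical distances from the points to a line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Least-squares slope and intercept from the standard formulas.
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

lsrl_ssr = sum_squared_residuals(slope, intercept)
other_ssr = sum_squared_residuals(0.5, 2.5)  # an arbitrary competing line
print(lsrl_ssr < other_ssr)
```

Any other slope and intercept tried in place of the competing line should give a sum of squared residuals at least as large as the LSRL's.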
Residual Plots
a scatterplot of the residuals
against the explanatory
variable.
Use to help assess how well
the regression line fits the data
(whether a linear model is appropriate)
Residual Plots
With Normal probability plots, we want
the graphs to be linear to support the
Normality of our data.
With residual plots, we want the
residuals to be randomly scattered, with
no pattern, so our data can be modeled
with a linear regression.
Remember:
Correlation does NOT assess
linearity, just strength and direction!
What’s a Good
Residual Plot?
No obvious pattern - the LSRL would be
in the middle of the data, some data
above and some below
Relatively small residuals - the data
points are close to the LSRL
Do the following residual plots support
or refute a linear model?
http://content.ebscohost.com/pdf23_24/pdf/2009/D8Y/01Sep09/43669525.pdf?T=P&P=AN&K=43669525&S=R&D=aph&EbscoContent=dGJyMNHX8kSeqK84yOvqOLCmr0qep7RSs6%2B4S7aWxWXS&ContentCustomer=ssk2xqLJNuePfgeyx44Hy
How to Graph?
Take each data point and determine the
residual
Plot the residuals versus the explanatory
variable
• i.e. plot the points (explanatory data, residual)
[Figure: blank residual plot. Vertical axis: residual (-2 to 2). Horizontal axis: explanatory variable (use the same numbers as your scatterplot).]
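The two graphing steps can be sketched in a few lines. The data and the fitted line here are made up for illustration (they are not from the slides), and `residual_points` is a hypothetical helper name.

```python
# Hypothetical data and fitted line, made up for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
intercept, slope = 2.2, 0.6   # LSRL for this made-up data

def residual_points(xs, ys, intercept, slope):
    """Return the (explanatory value, residual) pairs for a residual plot."""
    return [(xi, yi - (intercept + slope * xi)) for xi, yi in zip(xs, ys)]

points = residual_points(x, y, intercept, slope)
for xi, res in points:
    print(f"{xi}: {res:+.2f}")
```

Each pair is one point on the residual plot: the horizontal coordinate is the same x-value used in the scatterplot, and the vertical coordinate is that point's residual.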
Calculator
Construction
If you have a lot of data, follow the instructions on page 178
to construct your residual plot
(you will also have to have done the technology corner on p. 170)
What is Standard
Deviation?
roughly the typical distance
a data point is from the mean
(the square root of the average squared distance)
Is there a sx? Is there a sy?
So why not s? (standard
deviation of residuals)
Standard Deviation of
Residuals
gives the approximate size of
an “average” or “typical”
prediction error from our LSRL
formula on page 177
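As a rough sketch, the standard deviation of the residuals is usually computed as s = sqrt(sum of squared residuals / (n - 2)). The residuals below are made up for illustration and are not from the slides; the function name is hypothetical.

```python
import math

# Hypothetical residuals from a fitted LSRL (made up for illustration).
residuals = [-0.8, 0.6, 1.0, -0.6, -0.2]

def residual_std_dev(residuals):
    """s = sqrt(sum of squared residuals / (n - 2))."""
    n = len(residuals)
    return math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))

s = residual_std_dev(residuals)
print(round(s, 3))
```

The value of s is in the same units as the response variable, so it reads directly as a typical prediction error.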
Why divide by n-2?