Get out p. 193 HW and notes

LEAST-SQUARES
REGRESSION
3.2 Interpreting Computer Regression Output
A number of statistical software packages produce similar regression
output. Be sure you can locate
• the slope b
• the y intercept a
• the values of s and r²
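As a rough illustration of what the s and r² entries in such output measure, the sketch below computes both from the residuals of a fitted line. The data are hypothetical (not from the textbook) and the function name is made up for illustration. It assumes s is the standard deviation of the residuals with n − 2 in the denominator and r² is 1 − SSE/SST.

```python
import math
from statistics import mean

def regression_summary(xs, ys):
    """Return (a, b, s, r_squared) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope and intercept of the least-squares line.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    # Residual = observed y minus predicted y.
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)        # sum of squared residuals
    sst = sum((y - y_bar) ** 2 for y in ys)     # total variation in y
    s = math.sqrt(sse / (n - 2))                # standard deviation of the residuals
    r_squared = 1 - sse / sst                   # fraction of variation explained
    return a, b, s, r_squared

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b, s, r_squared = regression_summary(xs, ys)
print(f"a = {a:.4f}, b = {b:.4f}, s = {s:.4f}, r^2 = {r_squared:.4f}")
```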
Example, p. 181 & 182
A random sample of 15 high school students was selected from the U.S.
CensusAtSchool database. The foot length (in cm) and height (in cm) of each
student in the sample were recorded.
(a) What is the equation of the least-squares regression line that describes the
relationship between foot length and height? Define any variables that you
use.
ŷ = 103.41 + 2.7469x, where x = foot length and y = height.
OR predicted height = 103.41 + 2.7469(foot length)
(c) Find the correlation.
Take the square root of r² = 0.486:
r = ±√0.486 ≈ ±0.697
Because the scatterplot showed a positive relationship, r = 0.697.
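The sign choice can also be read off the slope, since r and the slope b always share a sign. A minimal sketch of that step, where the function name is hypothetical and the inputs are the r² and slope values from this example:

```python
import math

def correlation_from_r_squared(r_squared, slope):
    """Return r given r^2 and the slope of the least-squares line."""
    magnitude = math.sqrt(r_squared)
    # r has the same sign as the slope, because b = r * (s_y / s_x)
    # and both standard deviations are positive.
    return math.copysign(magnitude, slope)

r = correlation_from_r_squared(0.486, 2.7469)
print(round(r, 3))
```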
Regression to the Mean
How to Calculate the Least-Squares Regression Line
We have data on an explanatory variable x and a response
variable y for n individuals. From the data, calculate the means and
the standard deviations of the two variables and their correlation r.
The least-squares regression line is the line ŷ = a + bx with slope
b = r(s_y / s_x)
and y intercept
a = ȳ − bx̄
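The recipe above can be sketched in a few lines of Python. The data set is hypothetical and the function names are made up for illustration. The correlation is computed as the average product of standardized values with n − 1 in the denominator, which is an assumption about the definition used earlier in the course.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)   # sample standard deviations
    products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x using b = r*s_y/s_x and a = y_bar - b*x_bar."""
    r = correlation(xs, ys)
    b = r * stdev(ys) / stdev(xs)
    a = mean(ys) - b * mean(xs)
    return a, b

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)
print(f"y-hat = {a:.2f} + {b:.2f}x")
```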
Example, p. 183
Using Feet to Predict Height. The mean and standard deviation of the
foot lengths are x̄ = 24.76 cm and s_x = 2.71 cm. The mean and standard
deviation of the heights are ȳ = 171.43 cm and s_y = 10.69 cm. The
correlation between foot length and height is r = 0.697.
Problem: Find the equation for the least-squares regression line for
predicting height from foot length. Show your work.
Slope: b = r(s_y / s_x) = 0.697(10.69 / 2.71) ≈ 2.7494
Y-intercept: a = ȳ − bx̄ = 171.43 − 2.7494(24.76) ≈ 103.3549
LSRL: ŷ = 103.3549 + 2.7494x, where x = foot length and y = height.
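The arithmetic in this example can be checked with a short script. This is only a sketch: the function name is hypothetical, and the inputs are the summary statistics given in the problem. The slide rounds the slope to four decimal places before computing the intercept, so the script reports the intercept both ways.

```python
def line_from_summary(x_bar, s_x, y_bar, s_y, r):
    """Return (a, b) for y-hat = a + b*x from summary statistics."""
    b = r * s_y / s_x          # slope
    a = y_bar - b * x_bar      # y intercept
    return a, b

# Summary statistics from the foot length / height example.
x_bar, s_x = 24.76, 2.71
y_bar, s_y = 171.43, 10.69
r = 0.697

a, b = line_from_summary(x_bar, s_x, y_bar, s_y, r)

# Intercept computed from the slope rounded to 4 decimals, as on the slide.
a_from_rounded_slope = y_bar - round(b, 4) * x_bar

print(f"slope b = {b:.4f}")
print(f"intercept a (full precision slope) = {a:.4f}")
print(f"intercept a (rounded slope) = {a_from_rounded_slope:.4f}")
```

The two intercepts differ only in the last decimal places. Both results also differ slightly from the computer output (103.41 and 2.7469), presumably because the summary statistics used here are themselves rounded.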
Correlation and Regression Wisdom
1. The distinction between explanatory and response variables is
important in regression.
2. Correlation and regression lines describe only linear relationships.
[Figure: four scatterplots, each with r = 0.816.]
3. Correlation and least-squares regression lines are not resistant.
4. Association does not imply causation.
Outliers and Influential Observations in Regression
An outlier is an observation that lies outside the overall pattern of the
other observations. Points that are outliers in the y direction but not
the x direction of a scatterplot have large residuals. Other outliers
may not have large residuals.
An observation is influential for a statistical calculation if removing it
would markedly change the result of the calculation. Points that are
outliers in the x direction of a scatterplot are often influential for the
least-squares regression line.
[Figure: scatterplot panels labeled "Outliers" and "Influential".]
Example, p. 190
• The strong influence of Child 18 makes the original
regression of Gesell score on age at first word misleading.
The original data have r² = 0.41, which means that the age at
which a child begins to talk explains 41% of the variation in
scores on a later test of mental ability. This relationship is
strong enough to be interesting to parents. If we leave out
Child 18, r² drops to only 0.11 (11%). The apparent strength of
the association was largely due to a single influential
observation.
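The effect of a single influential point can be illustrated by fitting the line with and without it. The sketch below uses hypothetical data (not the Gesell data) in which the last point is an outlier in the x direction. The function name is made up for illustration.

```python
from statistics import mean

def fit(xs, ys):
    """Return (a, b, r_squared) for the least-squares line y-hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx                      # algebraically equal to r * s_y / s_x
    a = y_bar - b * x_bar
    r_squared = sxy ** 2 / (sxx * syy)
    return a, b, r_squared

# Hypothetical data: five points with almost no linear pattern,
# plus one point far to the right (an outlier in the x direction).
xs = [8, 9, 10, 11, 12, 25]
ys = [100, 104, 98, 103, 99, 60]

a_all, b_all, r2_all = fit(xs, ys)
a_without, b_without, r2_without = fit(xs[:-1], ys[:-1])

print(f"all points:         slope = {b_all:.2f}, r^2 = {r2_all:.2f}")
print(f"without last point: slope = {b_without:.2f}, r^2 = {r2_without:.2f}")
```

Removing the one point changes both the slope and r² markedly, which is the sense in which an observation is called influential.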
HW Due: Friday
• p. 196 #59, 61a, 63, 72, 73