Transcript September 2
2.2 Normal Distributions
9.2.2016
Homework Answers
1.
a) Her percentile is .25, or the 25th percentile. 25% of girls have fewer pairs of shoes.
b) His percentile is .85, or the 85th percentile. 85% of boys have fewer pairs of shoes.
c) The boy is more unusual, because he is further from the 50th percentile (the median). He is 35 percentiles away from the 50th, with only 15% being more extreme than he is, while the girl is 25 percentiles away from the 50th, with 25% being more extreme than she is.
5. The girl is taller than 78% of girls her age, but weighs more than only 48%. She is therefore probably fairly thin.
9.
a) Since the first quartile is the 25th percentile, find 25 on the y-axis and read across to see that the first quartile is about $19. Then find the 75th percentile (third quartile) using the same method; it is about $50. So the IQR is about $31 (50 - 19).
b) Approximately the 25th percentile
c) [Histogram: bins from 0-10 through 90-100 on the horizontal axis; vertical axis marked from 0 to 35]
10.
a) To find the 60th percentile, find 60 on the y-axis and read off the x-value where y=60. It is approximately 1,000 hours.
b) Approximately the 35th percentile
11. For Eleanor, z=1.8. Gerald’s z-score is 1.5. Therefore, Eleanor has the
higher score.
12. For Cobb, z=4.15. For Williams, z=4.26, and for Brett, z=4.07. All
three hitters were more than 4 standard deviations above the mean,
but Williams had the best season.
13.
a) Judy’s bone density is about one and one half standard deviations below the mean for women her age. The negative value indicates that it is below average, and the magnitude of the value indicates how many standard deviations above or below the mean it is.
b) 5.52 grams/cm^2
14.
a) Mary’s score (0.5) indicates that her bone density is about half of a standard deviation above average for women her age. Even though the two bone density scores are exactly the same, Mary is 10 years older, so her bone density is better in comparison to the women her age.
b) 8 grams/cm^2
15.
a) Since 22 salaries were less than Lidge’s salary, his salary was at the 75.86th percentile.
b) Lidge’s salary was .79 standard deviations above the mean salary of $3,388,617.
19.
a) The mean and the median both increase by 18, so the mean is 87.188, and the median is 87.5. The distribution of heights just shifts by 18 inches.
b) The standard deviation and IQR do not change.
20.
a) The mean and median salaries will each increase by $1000 (the distribution of salaries just shifts by $1000).
b) The extremes and quartiles will also each increase by $1000. The standard deviation will not change.
21.
a) To give the heights in feet, not inches, we need to divide each observation by 12 (12 inches = 1 foot). Thus, the mean and the median are divided by 12. The new mean is 5.77 feet, and the new median is 5.79 feet.
b) We simply divide the old standard deviation by 12 to get .27 feet. Similarly, we divide the old first and third quartiles by 12 to get 5.65 feet for Q1 and 5.92 feet for Q3, leaving us with an IQR of .27 feet.
22.
a) The mean and the median will each increase by 5%.
b) The 5% raise will increase the distance of the quartiles from the median. The quartiles and the standard deviation will each increase by 5% (the original values multiplied by 1.05).
23. Mean in degrees Fahrenheit is 77. Standard deviation in
degrees Fahrenheit is 3.6
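The shifting and rescaling answers in problems 19 through 23 can be checked numerically. The sketch below is only an illustration: the five Celsius readings are hypothetical (they are not the homework data) and were chosen so that their mean is 25 and their standard deviation is 2, which happens to reproduce the 77 and 3.6 reported in problem 23.

```python
from statistics import mean, pstdev

# Hypothetical temperatures in degrees Celsius (NOT the homework data).
celsius = [22.0, 24.0, 25.0, 26.0, 28.0]

# Adding a constant shifts the center but leaves the spread alone.
shifted = [c + 18 for c in celsius]

# Converting to Fahrenheit multiplies by 1.8 and then adds 32.
fahrenheit = [1.8 * c + 32 for c in celsius]

for label, data in [("original", celsius),
                    ("plus 18", shifted),
                    ("Fahrenheit", fahrenheit)]:
    print(f"{label:>10}: mean = {mean(data):.2f}, SD = {pstdev(data):.2f}")
```

Adding 18 moves the mean by 18 and leaves the standard deviation unchanged. The conversion to Fahrenheit multiplies the standard deviation by 1.8 but does not add 32 to it.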
31.
a)
b)
Mean is C, median is B
Mean is B, median is B
33. C
34. B
35. C
36. B
37. D
38. E
New Material
Density Curves and Normal Distributions
In Chapter 1, we developed a kit of graphical and numerical
tools for describing distributions. Now, we’ll add one more
step to the strategy.
So far our strategy for exploring data is:
1. Graph the data to get an idea of the overall pattern
2. Calculate an appropriate numerical summary to
describe the center and spread of the distribution.
Sometimes the overall pattern of a large number of
observations is so regular that we can describe it by a
smooth curve, called a density curve.
Describing Location in a Distribution
Density Curves
Definition:
Density Curve
A density curve is a curve that
•is always on or above the horizontal axis, and
•has area exactly 1 underneath it.
A density curve describes the overall pattern of a distribution.
The area under the curve and above any interval of values on
the horizontal axis is the proportion of all observations that fall in
that interval.
Here’s the idea behind density curves:
The overall pattern of this histogram of the scores of all 947 seventh-grade students in Gary, Indiana, on the vocabulary part of the Iowa Test of Basic Skills (ITBS) can be described by a smooth curve drawn through the tops of the bars.
NOTE: No real set of data is exactly described by a density curve. It is simply an approximation that is easy to use and accurate enough for practical use.
Describing Density Curves
The median of the density curve is the “equal-areas” point.
The mean of the density curve is the “balance” point.
Section 2.2
Normal Distributions
Learning Objectives
After this section, you should be able to…
DESCRIBE and APPLY the 68-95-99.7 Rule
DESCRIBE the standard Normal Distribution
PERFORM Normal distribution calculations
ASSESS Normality
Normal Distributions
One particularly important class of density curves is the class of Normal
curves, which describe Normal distributions.
All Normal curves are symmetric, single-peaked, and bell-shaped.
A specific Normal curve is described by giving its mean µ and
standard deviation σ.
Two Normal curves, showing the mean µ and standard deviation σ.
Definition:
Normal Distributions
A Normal distribution is described by a Normal density curve. Any
particular Normal distribution is completely specified by two numbers: its
mean µ and standard deviation σ.
•The mean of a Normal distribution is the center of the symmetric
Normal curve.
•The standard deviation is the distance from the center to the
change-of-curvature points on either side.
•We abbreviate the Normal distribution with mean µ and standard
deviation σ as N(µ,σ).
Normal distributions are good descriptions for some distributions of real data.
Normal distributions are good approximations of the results of many kinds of
chance outcomes.
Many statistical inference procedures are based on Normal distributions.
The 68-95-99.7 Rule
Although there are many Normal curves, they all have
properties in common.
Definition:
The 68-95-99.7 Rule (“The Empirical Rule”)
In the Normal distribution with mean µ and standard deviation σ:
•Approximately 68% of the observations fall within σ of µ.
•Approximately 95% of the observations fall within 2σ of µ.
•Approximately 99.7% of the observations fall within 3σ of µ.
Example, p. 113
The distribution of Iowa Test of Basic Skills (ITBS) vocabulary
scores for 7th grade students in Gary, Indiana, is close to
Normal. Suppose the distribution is N(6.84, 1.55).
a) Sketch the Normal density curve for this distribution.
b) What percent of ITBS vocabulary scores are less than 3.74?
c) What percent of the scores are between 5.29 and 9.94?
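Parts (b) and (c) can be answered with the 68-95-99.7 rule alone. Since 3.74 is two standard deviations below the mean, about half of the 5% outside 2σ lies below it, or roughly 2.5%. Since 5.29 is one standard deviation below the mean and 9.94 is two above, the area between them is about 34% + 47.5% = 81.5%. As a rough check, the sketch below compares those rule-of-thumb answers with the exact areas, assuming Python's standard-library `statistics.NormalDist` is an acceptable stand-in for a table or calculator.

```python
from statistics import NormalDist

# ITBS vocabulary scores, modeled as N(6.84, 1.55) as in the example.
itbs = NormalDist(mu=6.84, sigma=1.55)

# (b) 3.74 is two standard deviations below the mean.
below_3_74 = itbs.cdf(3.74)

# (c) 5.29 is one standard deviation below the mean; 9.94 is two above.
between = itbs.cdf(9.94) - itbs.cdf(5.29)

print(f"P(score < 3.74)        = {below_3_74:.4f}  (rule says about 0.025)")
print(f"P(5.29 < score < 9.94) = {between:.4f}  (rule says about 0.815)")
```

The exact areas differ slightly from the rule's answers because 68, 95, and 99.7 are themselves rounded values.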
The Standard Normal Distribution
All Normal distributions are the same if we measure in units of size σ
from the mean µ as center.
Definition:
The standard Normal distribution is the Normal distribution
with mean 0 and standard deviation 1.
If a variable x has any Normal distribution N(µ,σ) with mean µ
and standard deviation σ, then the standardized variable
$$z = \frac{x - \mu}{\sigma}$$
has the standard Normal distribution, N(0,1).
Because all Normal distributions are the same when we
standardize, we can find areas under any Normal curve from
a single table.
Definition:
The Standard Normal Table
Table A is a table of areas under the standard Normal curve. The table
entry for each value z is the area under the curve to the left of z.
Suppose we want to find the
proportion of observations from the
standard Normal distribution that are
less than 0.81.
We can use Table A:
Z      .00     .01     .02
0.7    .7580   .7611   .7642
0.8    .7881   .7910   .7939
0.9    .8159   .8186   .8212
P(z < 0.81) = .7910
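The Table A excerpt can be regenerated in software. This is a minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the helper name `table_entry` is made up for this illustration. Each entry is the area to the left of z, where z is the row value plus the column value.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def table_entry(z):
    """Area under the standard Normal curve to the left of z, to 4 places."""
    return round(standard_normal.cdf(z), 4)

# Rebuild the three rows and three columns shown in the excerpt.
for row in (0.7, 0.8, 0.9):
    entries = [table_entry(row + col) for col in (0.00, 0.01, 0.02)]
    print(row, entries)
```

The entry in the 0.8 row and .01 column is the area to the left of z = 0.81.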
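Finding Areas Under the Standard Normal Curve
Example, p. 117
Find the proportion of observations from the standard Normal distribution that
are between -1.25 and 0.81.
From Table A, the area to the left of 0.81 is 0.7910 and the area to the left of -1.25 is 0.1056, so the proportion is 0.7910 - 0.1056 = 0.6854.
Can you find the same proportion using a different approach?
Subtract both tails from the total area: 1 - (0.1056 + 0.2090) = 1 - 0.3146 = 0.6854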
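Both routes to this area can be checked in a few lines. The sketch below again assumes Python's `statistics.NormalDist` in place of Table A; the variable names are illustrative only.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

left_tail = standard_normal.cdf(-1.25)      # area to the left of -1.25
right_tail = 1 - standard_normal.cdf(0.81)  # area to the right of 0.81

# Approach 1: area left of 0.81 minus area left of -1.25.
direct = standard_normal.cdf(0.81) - left_tail

# Approach 2: total area of 1 minus both tails.
complement = 1 - (left_tail + right_tail)

print(f"direct     = {direct:.4f}")
print(f"complement = {complement:.4f}")
```

The two approaches agree because the left tail, the middle region, and the right tail together make up the whole area of 1.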
State: Express the problem in terms of the observed variable x.
Plan: Draw a picture of the distribution and shade the area of
interest under the curve.
Do: Perform calculations.
•Standardize x to restate the problem in terms of a standard
Normal variable z.
•Use Table A and the fact that the total area under the curve
is 1 to find the required area under the standard Normal curve.
Conclude: Write your conclusion in the context of the problem.
Normal Distribution Calculations
When Tiger Woods hits his driver, the distance the ball travels can be
described by N(304, 8). What percent of Tiger’s drives travel between 305
and 325 yards?
When x = 305, $z = \frac{305 - 304}{8} = 0.13$
When x = 325, $z = \frac{325 - 304}{8} = 2.63$
Using Table A, we can find the area to the left of z=2.63 and the area to the left of z=0.13.
0.9957 – 0.5517 = 0.4440. About 44% of Tiger’s drives travel between 305 and 325 yards.
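The Tiger Woods calculation can be reproduced without the table. This sketch assumes Python's `statistics.NormalDist`. It computes the answer twice: once with the z-scores rounded to two decimals, as Table A requires, and once with the unrounded values.

```python
from statistics import NormalDist

drives = NormalDist(mu=304, sigma=8)  # driving distance in yards
standard_normal = NormalDist()        # N(0, 1)

# Standardize the endpoints: z = (x - mean) / standard deviation.
z_low = (305 - 304) / 8    # 0.125, which the slide rounds to 0.13
z_high = (325 - 304) / 8   # 2.625, which the slide rounds to 2.63

# Table-style answer, using the rounded z-scores from the slide.
table_answer = standard_normal.cdf(2.63) - standard_normal.cdf(0.13)

# Answer with no rounding of z, working directly in yards.
exact_answer = drives.cdf(325) - drives.cdf(305)

print(f"with rounded z: {table_answer:.4f}")
print(f"without rounding: {exact_answer:.4f}")
```

The small gap between the two results comes only from rounding z to two decimals before using the table. Either way, about 44 to 45 percent of the drives fall in this range.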
Using Technology
There are two useful “Technology Corner” sections of your
textbook that will help you use your calculator to do these
types of calculations: page 118 and page 123.
Let’s focus on the one on Page 123
You can use your calculator as a replacement for Table A
You will use the normalcdf command on your TI 83/84
Let’s try it, using our height data:
Let’s assume that our mean height was 67 inches, and the standard
deviation of our heights was 5 inches
We know that roughly 68 percent of people should fall between 62
inches and 72 inches. How do we know this?
Let’s confirm it
Press ‘2nd’ and then ‘VARS’ to open the ‘DISTR’ menu
Choose #2: ‘normalcdf’
The syntax of this command is normalcdf(lower bound, upper bound, mean,
standard deviation)
What should we choose as lower and upper bounds?
Depends—if we want to know the percentage BETWEEN, we choose our
two endpoints
So our command would look like this: normalcdf(62,72,67,5)
What is the result?
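For anyone without a TI-83/84 at hand, the same command can be imitated in Python. This is a sketch under assumptions: the function below is a stand-in with the same argument order as the calculator command, built on the standard-library `statistics.NormalDist`. It is not the calculator's own implementation.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean, sd):
    """Area under the N(mean, sd) curve between lower and upper."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(upper) - dist.cdf(lower)

# Heights with mean 67 inches and standard deviation 5 inches.
within_one_sd = normalcdf(62, 72, 67, 5)
print(f"{within_one_sd:.4f}")
```

The result should land close to the 68 percent promised by the 68-95-99.7 rule, since 62 and 72 are each one standard deviation from the mean.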
What if we wanted to know what percent of people fall below 5 feet (60 inches)?
normalcdf(0,60,67,5)
What if we wanted to know the percent above 6 feet?
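One-sided questions like these two can be sketched the same way, assuming Python's `statistics.NormalDist` and converting feet to inches first (5 feet = 60 inches, 6 feet = 72 inches). On the calculator, a common way to handle "above" is to enter a very large upper bound. In this sketch the upper tail is computed as 1 minus the area to the left.

```python
from statistics import NormalDist

heights = NormalDist(mu=67, sigma=5)  # inches

# Below 5 feet: all of the area to the left of 60 inches.
below_5_feet = heights.cdf(60)

# Using 0 as the lower bound, as in normalcdf(0,60,67,5), gives practically
# the same answer because 0 is more than 13 standard deviations below the mean.
from_zero = heights.cdf(60) - heights.cdf(0)

# Above 6 feet: total area of 1 minus the area to the left of 72 inches.
above_6_feet = 1 - heights.cdf(72)

print(f"below 5 feet: {below_5_feet:.4f}")
print(f"above 6 feet: {above_6_feet:.4f}")
```

Because 72 inches is exactly one standard deviation above the mean, the second answer should be close to half of the 32 percent that falls outside one standard deviation.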