Chapter 7 Continuous Distributions
Continuous random variables
• Are numerical variables
whose values fall within a
range or interval
• Are measurements
• Can be described by
density curves
Density curves
• Are always on or above the
horizontal axis
• Have an area exactly equal to one
underneath them
• Often describe an overall
distribution
• Describe what proportion of the
observations falls within each range
of values
Unusual density curves
• Can be any shape
• Are generic continuous
distributions
• Probabilities are calculated
by finding the area under
the curve
[Figure: density curve; horizontal axis marked 1 to 5, vertical axis marked .25 and .5]
How do you find the area of a triangle?
P(X < 2) = (2)(.25)/2 = .25
What is the area of a line segment?
[Figure: density curve; horizontal axis marked 1 to 5, vertical axis marked .25 and .5]
P(X = 2) = 0
P(X < 2) = .25
In continuous distributions,
P(X < 2) & P(X ≤ 2) are
the same answer.
Hmmmm…
Is this different than discrete
distributions?
Shape is a trapezoid –
How long are the bases?
b1 = .5, b2 = .375, h = 1
Area = (b1 + b2)h / 2
[Figure: density curve; horizontal axis marked 1 to 5, vertical axis marked .25 and .5]
P(X > 3) = .5(.375+.5)(1) = .4375
P(1 < X < 3) = .5(.125+.375)(2) = .5
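The triangle and trapezoid areas above can be checked numerically. The sketch below assumes the density is the straight line f(x) = x/8 from 0 to 4. That formula is not stated on the slides; it is inferred from the heights they use (.125 at 1, .375 at 3, .5 at 4). The function names are illustrative only.

```python
def density(x):
    """Assumed density: straight line from (0, 0) to (4, 0.5), zero elsewhere."""
    return x / 8 if 0 <= x <= 4 else 0.0


def area_between(lo, hi):
    """Area under the linear density between lo and hi (trapezoid formula)."""
    b1, b2 = density(lo), density(hi)   # the two parallel bases
    h = hi - lo                         # distance between the bases
    return 0.5 * (b1 + b2) * h


p_less_2 = area_between(0, 2)    # triangle: one base has length 0
p_more_3 = area_between(3, 4)    # trapezoid with bases .375 and .5
p_1_to_3 = area_between(1, 3)    # trapezoid with bases .125 and .375
p_equal_2 = area_between(2, 2)   # a line segment has no width, so no area

print(p_less_2, p_more_3, p_1_to_3, p_equal_2)
```

Because a triangle is a trapezoid with one base of length zero, a single formula covers every region on these slides.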
[Figure: density curve; horizontal axis marked 1 to 4, vertical axis marked 0.25 and 0.50]
P(X > 1) = (2)(.25) + .5(2)(.25) = .5 + .25 = .75
P(0.5 < X < 1.5) = (.5)(.25) + .5(.25+.375)(.5) = .125 + .15625 = .28125
Special Continuous
Distributions
Uniform Distribution
• Is a continuous distribution that is
evenly (or uniformly) distributed
• Has a density curve in the shape of a
rectangle
• Probabilities are calculated by finding
the area under the curve
μ_x = (a + b)/2
σ_x² = (b − a)²/12
Where: a & b are the endpoints of the
uniform distribution
How do you find the area of a rectangle?
The Citrus Sugar Company packs sugar in
bags labeled 5 pounds. However, the
packaging isn’t perfect and the actual
weights are uniformly distributed with a
mean of 4.98 pounds and a range of .12
pounds.
a) Construct the uniform distribution above.
What shape does a uniform distribution have?
How long is this rectangle?
What is the height of this rectangle?
[Figure: rectangle of height 1/.12 from 4.92 to 5.04, with 4.98 marked]
• What is the probability that a
randomly selected bag will weigh
more than 4.97 pounds?
What is the length of the shaded region?
P(X > 4.97) = .07(1/.12) = .5833
[Figure: rectangle of height 1/.12 from 4.92 to 5.04, with 4.98 marked]
• Find the probability that a
randomly selected bag weighs
between 4.93 and 5.03 pounds.
What is the length of the shaded region?
P(4.93 < X < 5.03) = .1(1/.12) = .8333
[Figure: rectangle of height 1/.12 from 4.92 to 5.04, with 4.98 marked]
The time it takes for students to drive
to school is evenly distributed with a
minimum of 5 minutes and a range of 35
minutes.
a) Draw the distribution
Where should the rectangle end?
What is the height of the rectangle?
[Figure: rectangle of height 1/35 from 5 to 40]
b) What is the probability that it takes
less than 20 minutes to drive to
school?
P(X < 20) = (15)(1/35) = .4286
[Figure: rectangle of height 1/35 from 5 to 40]
c) What is the mean and standard
deviation of this distribution?
μ = (5 + 40)/2 = 22.5
σ² = (40 − 5)²/12 = 102.083
σ = 10.104
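The rectangle-area reasoning in the two uniform examples above can be sketched in a few lines of Python. The helper names `uniform_prob` and `uniform_mean_sd` are made up for illustration; the endpoints come from the slides (mean 4.98 with range .12 gives 4.92 and 5.04; minimum 5 with range 35 gives 5 and 40).

```python
import math


def uniform_prob(a, b, lo, hi):
    """P(lo < X < hi) for X uniform on [a, b]: shaded width times height 1/(b - a)."""
    lo, hi = max(lo, a), min(hi, b)   # clip the interval to the rectangle
    return max(hi - lo, 0) / (b - a)


def uniform_mean_sd(a, b):
    """Mean (a + b)/2 and standard deviation sqrt((b - a)^2 / 12)."""
    return (a + b) / 2, math.sqrt((b - a) ** 2 / 12)


# Sugar bags, uniform on [4.92, 5.04]
p_over_497 = uniform_prob(4.92, 5.04, 4.97, 5.04)
p_493_to_503 = uniform_prob(4.92, 5.04, 4.93, 5.03)

# Drive times, uniform on [5, 40]
p_under_20 = uniform_prob(5, 40, 5, 20)
mean, sd = uniform_mean_sd(5, 40)

print(round(p_over_497, 4), round(p_493_to_503, 4), round(p_under_20, 4))
print(mean, round(sd, 3))
```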
Normal Distributions
• Symmetrical bell-shaped (unimodal) density curve
• Above the horizontal axis
• N(μ, σ)
• The transition points occur at μ ± σ
• Probability is calculated by finding the area under
the curve
How is this done mathematically?
• As σ increases, the curve flattens &
spreads out
• As σ decreases, the curve gets
taller and thinner
Normal distributions occur
frequently.
• Length of newborn child
• Height
• Weight
• ACT or SAT scores
• Intelligence
• Number of typing errors
• Chemical processes
[Figure: two normal curves labeled A and B; 6 marked on the horizontal axis]
Do these two normal curves have the same mean?
If so, what is it?
YES
Which normal curve has a standard deviation of 3?
B
Which normal curve has a standard deviation of 1?
A
Empirical Rule
• Approximately 68% of the
observations fall within σ of μ
• Approximately 95% of the
observations fall within 2σ of μ
• Approximately 99.7% of the
observations fall within 3σ of μ
Suppose that the height of male
students at FHS is normally
distributed with a mean of 71 inches
and standard deviation of 2.5 inches.
What is the probability that the
height of a randomly selected male
student is more than 73.5 inches?
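1 − .68 = .32 lies outside 71 ± 2.5, split equally between the two tails
P(X > 73.5) = .32/2 = 0.16
[Figure: normal curve centered at 71 with the middle 68% marked]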
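The Empirical Rule percentages and the height example can be compared with exact normal areas. This is a small sketch using `statistics.NormalDist` from the Python standard library (3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

# Heights of male students: mean 71 inches, standard deviation 2.5 inches
heights = NormalDist(mu=71, sigma=2.5)

# Area within 1, 2, and 3 standard deviations of the mean
within = {}
for k in (1, 2, 3):
    within[k] = heights.cdf(71 + k * 2.5) - heights.cdf(71 - k * 2.5)
    print(k, round(within[k], 4))

# Upper tail beyond one standard deviation above the mean (73.5 inches)
p_tall = 1 - heights.cdf(73.5)
print(round(p_tall, 4))
```

The exact areas are close to, but not exactly, 68%, 95%, and 99.7%, which is why the rule says "approximately".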
Standard Normal Density
Curves
Always has μ = 0 & σ = 1
To standardize:
z = (x − μ)/σ
Must have this memorized!
Strategies for finding
probabilities or proportions in
normal distributions
1. State the probability
statement
2. Draw a picture
3. Calculate the z-score
4. Look up the probability
(proportion) in the table
The lifetime of a certain type of battery
is normally distributed with a mean of
200 hours and a standard deviation of 15
hours. What proportion of these
batteries can be expected to last less
than 220 hours?
Write the probability statement
Draw & shade the curve
Calculate z-score
z = (220 − 200)/15 = 1.33
Look up z-score in table
P(X < 220) = .9082
The lifetime of a certain type of battery
is normally distributed with a mean of
200 hours and a standard deviation of 15
hours. What proportion of these
batteries can be expected to last more
than 220 hours?
z = (220 − 200)/15 = 1.33
P(X > 220) = 1 − .9082
= .0918
The lifetime of a certain type of battery
is normally distributed with a mean of
200 hours and a standard deviation of 15
hours. How long must a battery last to be
in the top 5%?
P(X > ?) = .05
Look up 0.95 in table to find z-score
(x − 200)/15 = 1.645
x = 224.675
[Figure: normal curve with area .95 to the left and .05 to the right of z = 1.645]
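The three battery questions follow the same pattern: standardize, then find an area or work backward from one. Here is a sketch with `statistics.NormalDist` (Python 3.8 or later); variable names are illustrative. Because it uses the unrounded z-score, the areas differ slightly from the table values based on z = 1.33.

```python
from statistics import NormalDist

# Battery lifetime: mean 200 hours, standard deviation 15 hours
battery = NormalDist(mu=200, sigma=15)

z = (220 - 200) / 15             # standardize: z = (x - mu) / sigma
p_less = battery.cdf(220)        # P(X < 220), area to the left
p_more = 1 - p_less              # P(X > 220), area to the right
cutoff = battery.inv_cdf(0.95)   # lifetime with 95% of batteries below it

print(round(z, 2), round(p_less, 4), round(p_more, 4), round(cutoff, 2))
```

`inv_cdf` plays the role of the reverse table lookup: it takes an area to the left and returns the matching value.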
The heights of the female students at
FHS are normally distributed with a
mean of 65 inches. What is the
standard deviation of this distribution
if 18.5% of the female students are
shorter than 63 inches?
P(X < 63) = .185
What is the z-score for the 63?
−.9 = (63 − 65)/σ
σ = −2/−.9 = 2.22
[Figure: normal curve with 63 marked at z = -0.9]
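Solving for an unknown standard deviation means finding the z-score from the given area first, then rearranging z = (x − μ)/σ. A short sketch with `statistics.NormalDist` follows; the slide rounds z to −.9, so the result here is slightly different.

```python
from statistics import NormalDist

# z-score with 18.5% of the standard normal area below it
z = NormalDist().inv_cdf(0.185)

# Rearranged from z = (x - mu) / sigma with x = 63 and mu = 65
sigma = (63 - 65) / z

print(round(z, 3), round(sigma, 3))
```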
The heights of female teachers at FHS
are normally distributed with mean of
65.5 inches and standard deviation of
2.25 inches. The heights of male
teachers are normally distributed with
mean of 70 inches and standard
deviation of 2.5 inches.
• Describe the distribution of differences
of heights (male – female) teachers.
Normal distribution with
μ = 70 − 65.5 = 4.5 & σ = √(2.5² + 2.25²) = 3.3634
• What is the probability that a
randomly selected male teacher is
shorter than a randomly selected
female teacher?
P(X < 0) = .0901
z = (0 − 4.5)/3.3634 = −1.34
[Figure: normal curve centered at 4.5]
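The difference of two independent normal variables is itself normal, with the means subtracted and the variances added. In Python, `statistics.NormalDist` supports this directly with the `-` operator, which treats the two variables as independent (a reasonable assumption here since the teachers are selected at random). Variable names are illustrative.

```python
from statistics import NormalDist

male = NormalDist(mu=70, sigma=2.5)
female = NormalDist(mu=65.5, sigma=2.25)

# Male minus female: means subtract, variances add (independence assumed)
diff = male - female
print(diff.mean, round(diff.stdev, 4))

# The male teacher is shorter exactly when the difference is below 0
p_male_shorter = diff.cdf(0)
print(round(p_male_shorter, 4))
```

The probability comes out slightly above the slide's .0901 because the z-score is not rounded to −1.34 before the lookup.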
Will my calculator do any
of this normal stuff?
• Normalpdf – use for graphing ONLY
• Normalcdf – will find probability of area
from lower bound to upper bound
• Invnorm (inverse normal) – will find z-score for probability
Ways to Assess Normality
• Use graphs (dotplots,
boxplots, or histograms)
• Use the Empirical Rule
• Normal probability
(quantile) plot
Normal Probability (Quantile) plots
• The observation (x) is plotted against known
normal z-scores
• If the points on the quantile plot lie close
to a straight line, then the data is normally
distributed
• Deviations on the quantile plot indicate
nonnormal data
• Points far away from the plot indicate
outliers
• Vertical stacks of points (repeated
observations of the same number) are called
granularity
Normal Scores
To construct a normal probability plot,
you can use quantities called normal
scores. The values of the normal scores
depend on the sample size n. The normal
scores when n = 10 are below:
-1.539 -1.001 -0.656 -0.376 -0.123
0.123 0.376 0.656 1.001 1.539
Think of selecting sample after sample of
size 10 from a standard normal
distribution. Then -1.539 is the
average of the smallest observation
from each sample & so on . . .
Suppose we have the following
observations of widths of contact
windows in integrated circuit chips:
3.21 2.49 2.94 4.38 4.02
3.34 3.81 3.62 3.30 2.85
Sketch a scatterplot by pairing the
smallest normal score with the
smallest observation from the data
set & so on
What should happen if our data set is
normally distributed?
[Figure: scatterplot of normal scores and widths of contact windows; axes marked -1 to 1 and 2 to 5]
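The pairing step can be sketched in Python using the ten contact-window widths and the n = 10 normal scores from the slide. The correlation at the end is an addition not on the slides: it is one rough numerical check of how close the plotted points are to a straight line, and the function name is illustrative.

```python
import math

# Widths of contact windows and the normal scores for n = 10 (from the slide)
widths = [3.21, 2.49, 2.94, 4.38, 4.02, 3.34, 3.81, 3.62, 3.30, 2.85]
normal_scores = [-1.539, -1.001, -0.656, -0.376, -0.123,
                 0.123, 0.376, 0.656, 1.001, 1.539]

# Pair the smallest normal score with the smallest observation, and so on
pairs = list(zip(normal_scores, sorted(widths)))
for score, width in pairs:
    print(score, width)


def correlation(points):
    """Pearson correlation of (x, y) pairs; values near 1 suggest a linear plot."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)


r = correlation(pairs)
print(round(r, 3))
```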
Are these approximately normally
distributed?
50 48 54 47 51 52 46 53
52 51 48 48 54 55 57 45
53 50 47 49 50 56 53 52
What is this called?
Both the histogram & boxplot
are approximately
symmetrical, so these data
are approximately normal.
The normal probability
plot is approximately
linear, so these data are
approximately normal.
Normal Approximation to the
Binomial
Before widespread use of
technology, binomial probability
calculations were very tedious.
Let’s see how statisticians
estimated these calculations in
the past!
Premature babies are those born more than
3 weeks early. Newsweek (May 16, 1988)
reported that 10% of the live births in the
U.S. are premature. Suppose that 250 live
births are randomly selected and that the
number X of the “preemies” is determined.
What is the probability that there are
between 15 and 30 preemies, inclusive?
(POD, p. 422)
1) Find this probability using the binomial
distribution.
P(15 ≤ X ≤ 30) = binomialcdf(250,.1,30) –
binomialcdf(250,.1,14) = .866
2) What is the mean and standard deviation
of the above distribution? μ = 25 & σ = 4.743
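For readers without the calculator, the same binomial calculation can be sketched in plain Python. The helpers `binom_pmf` and `binom_cdf` are illustrative stand-ins for the calculator's binomial functions, not a library API.

```python
import math

n, p = 250, 0.1   # 250 births, 10% premature


def binom_pmf(k):
    """P(X = k) for a binomial with n trials and success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def binom_cdf(k):
    """P(X <= k): add up the individual probabilities from 0 through k."""
    return sum(binom_pmf(i) for i in range(k + 1))


# "Between 15 and 30, inclusive" is P(X <= 30) minus P(X <= 14)
prob = binom_cdf(30) - binom_cdf(14)

mean = n * p
sd = math.sqrt(n * p * (1 - p))

print(round(prob, 3), mean, round(sd, 3))
```

Subtracting the cumulative probability at 14 rather than 15 keeps the value 15 in the interval.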
3) If we were to graph a
histogram for the above binomial
distribution, what shape do you
think it will have?
Since the probability is only 10%,
we would expect the histogram to be
strongly skewed right.
Let’s graph this distribution –
• Put the numbers 1-45 in L1
• In L2, use binomialpdf to find
the probabilities.
4) What do you notice about the
shape?
Overlay a normal curve on your
histogram:
• In Y1 = normalpdf(X,μ,σ)
Normal distributions can be used to
estimate probabilities for binomial
distributions when:
1) the probability of success is close to
.5
or
2) n is sufficiently large
Rule: n is large enough if
np ≥ 10 & n(1 – p) ≥ 10
Why 10?
Normal distributions extend infinitely in
both directions; however, binomial
distributions are between 0 and n. If
we use a normal distribution to
estimate a binomial distribution, we
must cut off the tails of the normal
distribution. This is OK if the mean of
the normal distribution (for which we use
the mean of the binomial) is at least
three standard deviations (3σ) from 0
and from n. (BVD, p. 334)
We require: μ − 3σ ≥ 0
Or: μ ≥ 3σ
As binomial: np ≥ 3√(np(1 − p))
Square: n²p² ≥ 9np(1 − p)
Simplify: np ≥ 9(1 − p)
Since (1 − p) < 1, it is enough that: np ≥ 9
And since p < 1, the same argument measured from n gives: n(1 − p) ≥ 9
Therefore, we say the np should be at
least 10 and n (1 – p) should be at least
10.
Since a continuous distribution is used to
estimate the probabilities of a discrete
distribution, a continuity correction is used
to make the discrete values similar to
continuous values. (±.5 to discrete values)
Why?
Think about how discrete histograms
are made. Each bar is centered
over the discrete values. The bar
for “1” actually goes from 0.5 to
1.5 & the bar for “2” goes from 1.5
to 2.5. Therefore, by adding or
subtracting .5 from the discrete
values, you find the actual width
of the bars that you need to
estimate with the normal curve.
(Back to our example) Since P(preemie) = .1 which
is not close to .5, is n large enough?
np = 250(.1) = 25 & n(1-p) = 250(.9) = 225
Yes, Ok to use normal to approximate binomial
5) Use a normal distribution with the binomial mean
and standard deviation above to estimate the
probability that between 15 & 30 preemies,
inclusive, are born in the 250 randomly selected
babies.
Binomial written as Normal (w/cont. correction):
P(15 ≤ X ≤ 30) → P(14.5 < X < 30.5) =
Normalcdf(14.5,30.5,25,4.743) = .8635
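The normal approximation with the continuity correction can be reproduced with `statistics.NormalDist` (Python 3.8 or later). This is a sketch under the conditions the slides give (np and n(1 − p) both at least 10); variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p = 250, 0.1

# Check the conditions before approximating
if n * p < 10 or n * (1 - p) < 10:
    raise ValueError("n is not large enough for the normal approximation")

# Normal curve with the binomial mean and standard deviation
approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))

# Continuity correction: the bars for 15 through 30 span 14.5 to 30.5
p_approx = approx.cdf(30.5) - approx.cdf(14.5)
print(round(p_approx, 4))
```

Widening the interval by .5 on each side matters here: without it, half of the bar at 15 and half of the bar at 30 would be left out of the estimate.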
6) How does the answer in question 5 compare to
the answer in question 1 (Binomial answer = 0.866)?