Lecture 11 - The Department of Mathematics & Statistics


The Normal Probability Distribution
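[Figure: bell-shaped Normal curve with its points of inflection marked at μ − σ and μ + σ; the horizontal axis is labelled μ − 3σ, μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ, μ + 3σ.]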
Main characteristics of the Normal Distribution
• Bell shaped and symmetric about the mean μ.
• The points of inflection on the bell-shaped curve are at μ − σ and μ + σ, that is, one standard deviation from the mean.
• The area under the bell-shaped curve between μ − σ and μ + σ is approximately 2/3 (about 68%).
• The area under the bell-shaped curve between μ − 2σ and μ + 2σ is approximately 95%.
• Close to 100% (about 99.7%) of the area under the bell-shaped curve lies between μ − 3σ and μ + 3σ.
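These three areas can be checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `area_within` is ours and is not part of the lecture.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
# The areas are the same for any mean and standard deviation once
# distances are measured in standard deviations.
Z = NormalDist(mu=0.0, sigma=1.0)

def area_within(k):
    """Area under the curve within k standard deviations of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The printed values should be close to the 2/3, 95% and "close to 100%" figures listed above.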
There are many Normal distributions,
depending on μ and σ
[Figure: density f(x) plotted against x from 0 to 200 for three Normal distributions: μ = 100, σ = 20; μ = 100, σ = 40; and μ = 140, σ = 20.]
The Standard Normal Distribution
μ = 0, σ = 1
[Figure: standard normal density plotted for z from −3 to 3; the vertical axis runs from 0 to 0.4.]
• There are infinitely many Normal probability distributions (differing in μ and σ).
• Area under the Normal distribution with mean μ and standard deviation σ can be converted to area under the standard normal distribution.
• If X has a Normal distribution with mean μ and standard deviation σ, then
$$z = \frac{X - \mu}{\sigma}$$
has a standard normal distribution.
• z is called the standard score (z-score) of X.
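As a small illustration of the z-score, here is a sketch in Python. The function name `z_score` and the numbers used (μ = 100, σ = 20, x = 130) are illustrative choices of ours, not values worked in the lecture.

```python
def z_score(x, mu, sigma):
    """Standard score of x: how many standard deviations x lies from the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Illustrative values only (assumed for this sketch).
print(z_score(130, 100, 20))  # 30 units above the mean, in units of sigma = 20
print(z_score(100, 100, 20))  # the mean itself
```

A positive z-score means the value lies above the mean, a negative one below it.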
Converting area
under the Normal distribution with mean μ and
standard deviation σ
to
area under the standard normal distribution
Perform the z-transformation
$$z = \frac{X - \mu}{\sigma}$$
then
$$P(a \le X \le b) = P\left(\frac{a - \mu}{\sigma} \le \frac{X - \mu}{\sigma} \le \frac{b - \mu}{\sigma}\right) = P\left(\frac{a - \mu}{\sigma} \le z \le \frac{b - \mu}{\sigma}\right)$$
The left-hand side is an area under the Normal distribution with mean μ and standard deviation σ; the right-hand side is an area under the standard normal distribution.
[Figure: two panels. First: the area P(a ≤ X ≤ b) under the Normal distribution with mean μ and standard deviation σ, between a and b. Second: the equal area P((a − μ)/σ ≤ z ≤ (b − μ)/σ) under the standard normal distribution (mean 0, standard deviation 1).]
Using the tables for the Standard Normal
distribution
Example
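Find the area under the standard normal curve between z = −∞
and z = 1.45
[Figure: standard normal curve with the area to the left of z = 1.45 marked as 0.9265.]
• A portion of Table 3 (the row gives z to one decimal place, the column gives the second decimal place):

  z     0.00   0.01   0.02   0.03   0.04   0.05     0.06
  ...
  1.4                                      0.9265
  ...

$$P(z < 1.45) = 0.9265$$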
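The tabled value can be reproduced in software. The sketch below assumes that Table 3 lists cumulative areas to the left of z rounded to four decimal places, which is what the example suggests; `table_value` is a name we introduce for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def table_value(z):
    """Area to the left of z, rounded to 4 places like a printed table."""
    return round(Z.cdf(z), 4)

# Row 1.4, column 0.05 of the table corresponds to z = 1.45.
print(table_value(1.45))
```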
Example
Find the area to the left of z = −0.98, that is, P(z < −0.98)
[Figure: standard normal curve; the area asked for lies to the left of z = −0.98.]
$$P(z < -0.98) = 0.1635$$
Example
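Find the area under the normal curve to the right of z = 1.45, that is, P(z > 1.45)
[Figure: standard normal curve; the area to the left of z = 1.45 is 0.9265, and the area asked for lies to the right of z = 1.45.]
$$P(z > 1.45) = 1.0000 - 0.9265 = 0.0735$$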
Example
Find the area between z = 0 and z = 1.45, that is, P(0 < z < 1.45)
[Figure: standard normal curve; the area asked for lies between z = 0 and z = 1.45.]
$$P(0 < z < 1.45) = 0.9265 - 0.5000 = 0.4265$$
• Area between two points = difference of the two tabled areas
Notes
• Use the fact that the area above zero and the area below zero are each 0.5000.
• When finding normal distribution probabilities, a sketch is always helpful.
Example:
Find the area between the mean (z = 0) and z = −1.26
[Figure: standard normal curve; the area asked for lies between z = −1.26 and z = 0.]
$$P(-1.26 < z < 0) = 0.5000 - 0.1038 = 0.3962$$
Example: Find the area between z = −2.30 and z = 1.80
[Figure: standard normal curve; the required area lies between z = −2.30 and z = 1.80.]
$$P(-2.30 < z < 1.80) = 0.9641 - 0.0107 = 0.9534$$
Example: Find the area between z = −1.40 and z = −0.50
[Figure: standard normal curve; the area asked for lies between z = −1.40 and z = −0.50.]
$$P(-1.40 < z < -0.50) = 0.3085 - 0.0808 = 0.2277$$
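The three kinds of area used in these examples (left of a point, right of a point, between two points) can be sketched in Python as follows. The helper names are ours. Because the code works with unrounded cumulative areas, an answer may differ from the table-based one by about 0.0001.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

def area_left(z):
    """P(Z < z): the tabled area."""
    return Z.cdf(z)

def area_right(z):
    """P(Z > z) = 1 - tabled area."""
    return 1.0 - Z.cdf(z)

def area_between(lo, hi):
    """P(lo < Z < hi) = difference of the two tabled areas."""
    return Z.cdf(hi) - Z.cdf(lo)

print(f"P(z < -0.98)         = {area_left(-0.98):.4f}")
print(f"P(z > 1.45)          = {area_right(1.45):.4f}")
print(f"P(0 < z < 1.45)      = {area_between(0, 1.45):.4f}")
print(f"P(-1.26 < z < 0)     = {area_between(-1.26, 0):.4f}")
print(f"P(-2.30 < z < 1.80)  = {area_between(-2.30, 1.80):.4f}")
print(f"P(-1.40 < z < -0.50) = {area_between(-1.40, -0.50):.4f}")
```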
Computing Areas under the general
Normal Distributions
(mean μ, standard deviation σ)
Approach:
1. Convert the random variable, X, to its z-score:
$$z = \frac{X - \mu}{\sigma}$$
2. Convert the limits on the random variable, X, to
their z-scores.
3. Convert area under the distribution of X to area
under the standard normal distribution:
$$P(a \le X \le b) = P\left(\frac{a - \mu}{\sigma} \le z \le \frac{b - \mu}{\sigma}\right)$$
Example:
A bottling machine is adjusted to fill bottles with a
mean of 32.0 oz of soda and a standard deviation of
0.02 oz. Assume the amount of fill is normally distributed
and a bottle is selected at random:
1) Find the probability that the bottle contains between 32.00 oz and
32.025 oz
2) Find the probability that the bottle contains more than 31.97 oz
Solution, part 1)
When x = 32.00:
$$z = \frac{32.00 - \mu}{\sigma} = \frac{32.00 - 32.0}{0.02} = 0.00$$
When x = 32.025:
$$z = \frac{32.025 - \mu}{\sigma} = \frac{32.025 - 32.0}{0.02} = 1.25$$
Graphical illustration:
[Figure: Normal curve with x = 32.0 (z = 0) and x = 32.025 (z = 1.25) marked; the area asked for lies between them.]
$$P(32.0 < X < 32.025) = P\left(\frac{32.0 - 32.0}{0.02} < \frac{X - 32.0}{0.02} < \frac{32.025 - 32.0}{0.02}\right) = P(0 < z < 1.25) = 0.3944$$
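Example, part 2)
[Figure: Normal curve with x = 31.97 (z = −1.50) and x = 32.0 (z = 0) marked; the area asked for lies to the right of x = 31.97.]
$$P(x > 31.97) = P\left(\frac{x - 32.0}{0.02} > \frac{31.97 - 32.0}{0.02}\right) = P(z > -1.50) = 1.0000 - 0.0668 = 0.9332$$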
Combining Random Variables
Quite often we have two or more random variables
X, Y, Z, etc.
We combine these random variables using a
mathematical expression.
Important question
What is the distribution of the new random variable?
An Example
Suppose that a student will take three tests in the next
three days
1. Mathematics (X is the score he will receive on this
test.)
2. English Literature (Y is the score he will receive on
this test.)
3. Social Studies (Z is the score he will receive on this
test.)
Assume that
1. X (Mathematics) has a Normal distribution with
mean μ = 90 and standard deviation σ = 3.
2. Y (English Literature) has a Normal distribution
with mean μ = 60 and standard deviation σ = 10.
3. Z (Social Studies) has a Normal distribution with
mean μ = 70 and standard deviation σ = 7.
Graphs
[Figure: the three Normal densities plotted for scores from 0 to 100: X (Mathematics), μ = 90, σ = 3; Z (Social Studies), μ = 70, σ = 7; Y (English Literature), μ = 60, σ = 10.]
Suppose that after the tests have been written an overall
score, S, will be computed as follows:
S (Overall score) = 0.50 X (Mathematics) + 0.30 Y
(English Literature) + 0.20 Z (Social Studies) +
10 (Bonus marks)
What is the distribution of the overall score, S?
Sums, Differences, Linear Combinations of R.V.’s
A linear combination of random variables, X, Y, . . . is
a combination of the form:
L = aX + bY + …
where a, b, etc. are numbers – positive or negative.
Most common:
Sum = X + Y
Difference = X – Y
Others
Averages = 1/3 X + 1/3 Y + 1/3 Z
Weighted averages = 0.40 X + 0.25 Y + 0.35 Z
Means of Linear Combinations
If
L = aX + bY + …
The mean of L is:
Mean(L) = a Mean(X) + b Mean(Y) + …
$$\mu_L = a\mu_X + b\mu_Y + \dots$$
Most common:
Mean( X + Y) = Mean(X) + Mean(Y)
Mean(X – Y) = Mean(X) – Mean(Y)
Variances of Linear Combinations
If X, Y, . . . are independent random variables and
L = aX + bY + … then
Variance(L) = a² Variance(X) + b² Variance(Y) + …
$$\sigma_L^2 = a^2\sigma_X^2 + b^2\sigma_Y^2 + \dots$$
Most common:
Variance( X + Y) = Variance(X) + Variance(Y)
Variance(X – Y) = Variance(X) + Variance(Y)
Combining Independent Normal Random Variables
If X, Y, . . . are independent normal random variables,
then L = aX + bY + … is normally distributed.
In particular:
X + Y is normal with mean $\mu_X + \mu_Y$ and standard deviation $\sqrt{\sigma_X^2 + \sigma_Y^2}$
X – Y is normal with mean $\mu_X - \mu_Y$ and standard deviation $\sqrt{\sigma_X^2 + \sigma_Y^2}$
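These rules can be applied to the overall score S = 0.50 X + 0.30 Y + 0.20 Z + 10 from the test-score example. The sketch below rests on two assumptions that the slides do not state for that example: the three scores are independent, and the added constant (the 10 bonus marks) shifts the mean without changing the variance. The function name `combine_independent_normals` is ours.

```python
from math import sqrt
from statistics import NormalDist

def combine_independent_normals(terms, constant=0.0):
    """Distribution of constant + sum(coef * V) for INDEPENDENT normal V.

    terms is a list of (coefficient, NormalDist) pairs.
    """
    mean = constant + sum(c * d.mean for c, d in terms)
    variance = sum(c ** 2 * d.variance for c, d in terms)
    return NormalDist(mu=mean, sigma=sqrt(variance))

X = NormalDist(90, 3)    # Mathematics
Y = NormalDist(60, 10)   # English Literature
Z = NormalDist(70, 7)    # Social Studies

# Assumption: X, Y and Z are independent.
S = combine_independent_normals([(0.50, X), (0.30, Y), (0.20, Z)], constant=10)
print(f"mean of S = {S.mean:.2f}, standard deviation of S = {S.stdev:.3f}")
```

Under those assumptions S is Normal with mean 0.50(90) + 0.30(60) + 0.20(70) + 10 = 87 and variance 0.25(9) + 0.09(100) + 0.04(49) = 13.21.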
Example: Suppose that one performs two
independent tasks (A and B):
X = time to perform task A (normal with mean 25
minutes and standard deviation of 3 minutes.)
Y = time to perform task B (normal with mean 15
minutes and std dev 2 minutes.)
X and Y are independent, so T = X + Y = total time is normal with mean
$$\mu = 25 + 15 = 40$$
and standard deviation
$$\sigma = \sqrt{3^2 + 2^2} \approx 3.6$$
What is the probability that the two tasks take more than 45
minutes to perform?
$$P(T > 45) = P\left(Z > \frac{45 - 40}{3.6}\right) = P(Z > 1.39) = 0.0823$$
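The same calculation can be sketched with `statistics.NormalDist`, whose `+` operator for two `NormalDist` objects treats them as independent. Because the code keeps σ = √13 unrounded rather than using 3.6 and z = 1.39, its answer differs slightly from the table-based 0.0823.

```python
from statistics import NormalDist

X = NormalDist(25, 3)  # time for task A, in minutes
Y = NormalDist(15, 2)  # time for task B, in minutes

# Sum of two independent normal random variables.
T = X + Y

p_more_than_45 = 1.0 - T.cdf(45)
print(f"T: mean = {T.mean:.1f}, standard deviation = {T.stdev:.3f}")
print(f"P(T > 45) = {p_more_than_45:.4f}")
```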
The distribution of averages (the mean)
• Let x1, x2, … , xn denote n independent random
variables, each coming from the same Normal
distribution with mean μ and standard deviation σ.
• Let
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \left(\frac{1}{n}\right)x_1 + \left(\frac{1}{n}\right)x_2 + \dots + \left(\frac{1}{n}\right)x_n$$
What is the distribution of $\bar{x}$?
The distribution of averages (the mean)
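Because the mean is a “linear combination”
$$\mu_{\bar{x}} = \left(\frac{1}{n}\right)\mu_{x_1} + \left(\frac{1}{n}\right)\mu_{x_2} + \dots + \left(\frac{1}{n}\right)\mu_{x_n} = \left(\frac{1}{n}\right)\mu + \left(\frac{1}{n}\right)\mu + \dots + \left(\frac{1}{n}\right)\mu = n\left(\frac{1}{n}\right)\mu = \mu$$
and
$$\sigma_{\bar{x}}^2 = \left(\frac{1}{n}\right)^2\sigma_{x_1}^2 + \left(\frac{1}{n}\right)^2\sigma_{x_2}^2 + \dots + \left(\frac{1}{n}\right)^2\sigma_{x_n}^2 = \left(\frac{1}{n}\right)^2\sigma^2 + \left(\frac{1}{n}\right)^2\sigma^2 + \dots + \left(\frac{1}{n}\right)^2\sigma^2 = n\,\frac{\sigma^2}{n^2} = \frac{\sigma^2}{n}$$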
Thus, if x1, x2, … , xn denote n independent random
variables, each coming from the same Normal
distribution with mean μ and standard deviation σ,
then
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \left(\frac{1}{n}\right)x_1 + \left(\frac{1}{n}\right)x_2 + \dots + \left(\frac{1}{n}\right)x_n$$
has a Normal distribution with
mean $\mu_{\bar{x}} = \mu$,
variance $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$, and
standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
Example
• Suppose we are measuring the cholesterol level of
men age 60-65
• This measurement has a Normal distribution with
mean μ = 220 and standard deviation σ = 17.
• A sample of n = 10 males age 60-65 is selected and
the cholesterol level is measured for those 10 males.
• x1, x2, x3, x4, x5, x6, x7, x8, x9, x10 are those 10
measurements
Find the probability distribution of $\bar{x}$.
Compute the probability that $\bar{x}$ is between 215 and 225.
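Solution
The probability distribution of $\bar{x}$ is Normal with
$$\mu_{\bar{x}} = \mu = 220 \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{17}{\sqrt{10}} = 5.376$$
$$P(215 \le \bar{x} \le 225) = P\left(\frac{215 - 220}{5.376} \le \frac{\bar{x} - 220}{5.376} \le \frac{225 - 220}{5.376}\right) = P(-0.930 \le z \le 0.930) = 0.648$$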
Graphs
[Figure: two Normal densities plotted for cholesterol levels from 150 to 310: the probability distribution of the mean (the taller, narrower curve) and the probability distribution of individual observations (the lower, wider curve).]
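The cholesterol example, and the contrast the graph draws between the mean and a single observation, can be sketched as follows. The variable names are ours; the probability for a single observation is not computed in the lecture and is included only for comparison.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 220, 17, 10

individual = NormalDist(mu, sigma)         # one cholesterol measurement
xbar = NormalDist(mu, sigma / sqrt(n))     # mean of n independent measurements

p_mean = xbar.cdf(225) - xbar.cdf(215)
p_single = individual.cdf(225) - individual.cdf(215)

print(f"standard deviation of the mean = {xbar.stdev:.3f}")
print(f"P(215 < mean < 225)            = {p_mean:.3f}")
print(f"P(215 < one observation < 225) = {p_single:.3f}")
```

The mean is far more likely than a single observation to fall within 5 units of 220, which is why its curve is taller and narrower.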
Normal approximation to the Binomial
distribution
Using the Normal distribution to calculate
Binomial probabilities
[Figure: bar chart of the Binomial distribution with n = 20, p = 0.70 for values from 0 to 20, overlaid with the approximating Normal distribution.]
The approximating Normal distribution has
$$\mu = np = 14, \qquad \sigma = \sqrt{npq} = 2.049$$
Normal Approximation to the Binomial
distribution
$$P[X = a] \approx P\left[a - \tfrac{1}{2} \le Y \le a + \tfrac{1}{2}\right]$$
• X has a Binomial distribution with
parameters n and p
• Y has a Normal distribution with
$$\mu = np, \qquad \sigma = \sqrt{npq}, \qquad q = 1 - p$$
• The ½ is the continuity correction
[Figure: three panels showing the Binomial bar chart with the approximating Normal distribution. First: the bar for P[X = a], spanning a − ½ to a + ½. Second: the area P[a − ½ ≤ Y ≤ a + ½] under the Normal curve. Third: the bar area P[X = a].]
Example
• X has a Binomial distribution with
parameters n = 20 and p = 0.70
We want P[X = 13].
The exact value is
$$P[X = 13] = \binom{20}{13}(0.70)^{13}(0.30)^{7} = 0.1643$$
Using the Normal approximation to the
Binomial distribution
$$P[X = 13] \approx P\left[12\tfrac{1}{2} \le Y \le 13\tfrac{1}{2}\right]$$
where Y has a Normal distribution with:
$$\mu = np = 20(0.70) = 14$$
$$\sigma = \sqrt{npq} = \sqrt{20(0.70)(0.30)} = 2.049$$
Hence
$$P[12.5 \le Y \le 13.5] = P\left[\frac{12.5 - 14}{2.049} \le \frac{Y - 14}{2.049} \le \frac{13.5 - 14}{2.049}\right] = P[-0.73 \le Z \le -0.24]$$
= 0.4052 - 0.2327 = 0.1725
Compare with 0.1643
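The exact and approximate values can be compared in a few lines of Python. This is a sketch with names of our choosing; since the code does not round z to two decimal places, its approximate value differs from 0.1725 by about 0.001.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 20, 0.70
q = 1 - p

def binomial_pmf(k):
    """Exact P[X = k] for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * q ** (n - k)

# Approximating Normal distribution: mean np, standard deviation sqrt(npq).
Y = NormalDist(mu=n * p, sigma=sqrt(n * p * q))

a = 13
exact = binomial_pmf(a)
approx = Y.cdf(a + 0.5) - Y.cdf(a - 0.5)  # continuity correction of 1/2

print(f"exact  P[X = 13] = {exact:.4f}")
print(f"normal approx    = {approx:.4f}")
```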
Normal Approximation to the Binomial
distribution
$$P[a \le X \le b] = p(a) + p(a + 1) + \dots + p(b) \approx P\left[a - \tfrac{1}{2} \le Y \le b + \tfrac{1}{2}\right]$$
• X has a Binomial distribution with
parameters n and p
• Y has a Normal distribution with
$$\mu = np, \qquad \sigma = \sqrt{npq}$$
• The ½ is the continuity correction
[Figure: two panels on the Binomial bar chart. First: the bars making up P[a ≤ X ≤ b], spanning a − ½ to b + ½. Second: the area P[a − ½ ≤ Y ≤ b + ½] under the approximating Normal curve.]
Example
• X has a Binomial distribution with
parameters n = 20 and p = 0.70
We want P[11 ≤ X ≤ 14].
The exact value is
$$P[11 \le X \le 14] = p(11) + p(12) + p(13) + p(14) = \binom{20}{11}(0.70)^{11}(0.30)^{9} + \dots + \binom{20}{14}(0.70)^{14}(0.30)^{6}$$
= 0.0654 + 0.1144 + 0.1643 + 0.1916 = 0.5357
Using the Normal approximation to the
Binomial distribution
$$P[11 \le X \le 14] \approx P\left[10\tfrac{1}{2} \le Y \le 14\tfrac{1}{2}\right]$$
where Y has a Normal distribution with:
$$\mu = np = 20(0.70) = 14$$
$$\sigma = \sqrt{npq} = \sqrt{20(0.70)(0.30)} = 2.049$$
Hence
$$P[10.5 \le Y \le 14.5] = P\left[\frac{10.5 - 14}{2.049} \le \frac{Y - 14}{2.049} \le \frac{14.5 - 14}{2.049}\right] = P[-1.71 \le Z \le 0.24]$$
= 0.5948 - 0.0436 = 0.5512
Compare with 0.5357
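To see what the continuity correction contributes, the sketch below computes the range probability exactly, with the correction, and without it. The uncorrected version is not in the lecture and is included only for contrast; the function names and the `correction` flag are ours.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_range_exact(n, p, a, b):
    """Exact P[a <= X <= b] = p(a) + p(a+1) + ... + p(b)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(a, b + 1))

def binomial_range_normal(n, p, a, b, correction=True):
    """Normal approximation to P[a <= X <= b], optionally with the 1/2 correction."""
    Y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    half = 0.5 if correction else 0.0
    return Y.cdf(b + half) - Y.cdf(a - half)

exact = binomial_range_exact(20, 0.70, 11, 14)
with_cc = binomial_range_normal(20, 0.70, 11, 14)
without_cc = binomial_range_normal(20, 0.70, 11, 14, correction=False)

print(f"exact                       = {exact:.4f}")
print(f"normal, with correction     = {with_cc:.4f}")
print(f"normal, without correction  = {without_cc:.4f}")
```

For this example the corrected approximation lands much closer to the exact value than the uncorrected one.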
Comment:
• The accuracy of the normal
approximation to the binomial
increases with increasing values of n
Example
• The success rate for an eye operation is 85%
• The operation is performed n = 2000 times
Find the probability that
1. The number of successful operations is
between 1680 and 1720 (inclusive).
2. The number of successful operations is at
most 1800.
Solution
• X has a Binomial distribution with
parameters n = 2000 and p = 0.85
We want
$$P[1680 \le X \le 1720] \approx P[1679.5 \le Y \le 1720.5]$$
where Y has a Normal distribution with:
$$\mu = np = 2000(0.85) = 1700$$
$$\sigma = \sqrt{npq} = \sqrt{2000(0.85)(0.15)} = 15.969$$
Hence
$$P[1680 \le X \le 1720] \approx P[1679.5 \le Y \le 1720.5] = P\left[\frac{1679.5 - 1700}{15.969} \le \frac{Y - 1700}{15.969} \le \frac{1720.5 - 1700}{15.969}\right] = P[-1.28 \le Z \le 1.28]$$
= 0.9004 - 0.0996 = 0.8008
(the two areas are evaluated at the unrounded z = ±1.284)
Solution – part 2.
We want
$$P[X \le 1800] \approx P[Y \le 1800.5] = P\left[\frac{Y - 1700}{15.969} \le \frac{1800.5 - 1700}{15.969}\right] = P[Z \le 6.29]$$
= 1.000
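With n = 2000 the exact binomial sum is awkward by hand but easy by machine, so the approximation can be checked directly. The sketch below evaluates each binomial term in log space with `math.lgamma`, because the binomial coefficients are too large for ordinary floating-point arithmetic; this is our choice of technique, not something from the lecture, and the names are ours.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist

n, p = 2000, 0.85
q = 1 - p

def binomial_pmf(k):
    """Exact P[X = k], computed in log space to avoid overflow."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(q))

# Approximating Normal distribution: mean np, standard deviation sqrt(npq).
Y = NormalDist(mu=n * p, sigma=sqrt(n * p * q))

# Part 1: P[1680 <= X <= 1720]
approx_1 = Y.cdf(1720.5) - Y.cdf(1679.5)
exact_1 = sum(binomial_pmf(k) for k in range(1680, 1721))

# Part 2: P[X <= 1800]
approx_2 = Y.cdf(1800.5)
exact_2 = sum(binomial_pmf(k) for k in range(0, 1801))

print(f"part 1: normal approx = {approx_1:.4f}, exact = {exact_1:.4f}")
print(f"part 2: normal approx = {approx_2:.4f}, exact = {exact_2:.4f}")
```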