Normal distribution - Erwin Sitompul
Probability and Statistics
Lecture 7
Dr.-Ing. Erwin Sitompul
President University
http://zitompul.wordpress.com
President University
Erwin Sitompul
PBST 7/1
Chapter 6
Some Continuous Probability Distributions
Chapter 6.1
Continuous Uniform Distribution
|Uniform Distribution| The density function of the continuous
uniform random variable X on the interval [A, B] is
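$$f(x; A, B) = \begin{cases} \dfrac{1}{B - A}, & A \le x \le B \\ 0, & \text{elsewhere} \end{cases}$$
The mean and variance of the uniform distribution are
$$\mu = \frac{A + B}{2} \quad \text{and} \quad \sigma^2 = \frac{(B - A)^2}{12}$$
[Figure: The uniform density function for a random variable on the interval [1, 3]]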
Suppose that a large conference room for a certain company can be
reserved for no more than 4 hours. However, the use of the
conference room is such that both long and short conferences occur
quite often. In fact, it can be assumed that the length X of a conference
has a uniform distribution on the interval [0, 4].
(a) What is the probability density function?
(b) What is the probability that any given conference lasts at least 3
hours?
(a)
$$f(x) = \begin{cases} \dfrac{1}{4}, & 0 \le x \le 4 \\ 0, & \text{elsewhere} \end{cases}$$
(b)
$$P[X \ge 3] = \int_3^4 \frac{1}{4}\,dx = \frac{1}{4}$$
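The conference-room example can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the helper names `uniform_pdf` and `uniform_prob` are hypothetical and chosen only for illustration.

```python
def uniform_pdf(x, a, b):
    """Density of the continuous uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_prob(lo, hi, a, b):
    """P(lo <= X <= hi) for X uniform on [a, b]: overlap length / (b - a)."""
    overlap = max(0.0, min(hi, b) - max(lo, a))
    return overlap / (b - a)


# Conference length X is assumed uniform on [0, 4] hours, as in the example.
a, b = 0.0, 4.0
p_at_least_3 = uniform_prob(3.0, b, a, b)  # P(X >= 3)
mean = (a + b) / 2                         # (A + B) / 2
variance = (b - a) ** 2 / 12               # (B - A)^2 / 12

print(p_at_least_3, mean, variance)
```

Because the density is constant, the probability reduces to a ratio of interval lengths, so no numerical integration is needed.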
Chapter 6.2
Normal Distribution
The normal distribution is the most important continuous probability
distribution in the entire field of statistics.
Its graph, called the normal curve, is the bell-shaped curve which
describes approximately many phenomena that occur in nature,
industry, and research.
The normal distribution is often referred to as the Gaussian
distribution, in honor of Karl Friedrich Gauss, who also derived its
equation from a study of errors in repeated measurements of the
same quantity.
The normal curve
A continuous random variable X having the bell-shaped distribution
as shown on the figure is called a normal random variable.
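The density function of the normal random variable X, with mean μ
and variance σ², is
$$n(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}, \quad -\infty < x < \infty$$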
where π = 3.14159... and e = 2.71828...
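As an illustration, the density can be evaluated directly from this formula. This is a small Python sketch under the assumption that the standard library's `statistics.NormalDist` is an acceptable reference for comparison; the function name `normal_pdf` is hypothetical.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Evaluate n(x; mu, sigma) from the density formula."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


# Height of the standard normal curve at its mode, x = mu = 0.
peak = normal_pdf(0.0, 0.0, 1.0)

# Cross-check against the standard library for an arbitrary mu and sigma.
difference = abs(normal_pdf(45.0, 50.0, 10.0) - NormalDist(50.0, 10.0).pdf(45.0))

print(peak, difference)
```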
Normal Curve
[Figure: Pairs of normal curves with (i) μ1 < μ2, σ1 = σ2; (ii) μ1 = μ2, σ1 < σ2; (iii) μ1 < μ2, σ1 < σ2]
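[Figure: Normal curve f(x) against x, annotated with the following properties]
The mode, the point where the curve is at its maximum, occurs at x = μ.
The curve is symmetric about a vertical axis through the mean μ.
The points of inflection are at x = μ ± σ; the curve is concave downward
between them and concave upward otherwise.
The curve approaches zero asymptotically in either direction away from the mean.
The total area under the curve and above the horizontal axis is equal to 1.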
Chapter 6.3
Areas Under the Normal Curve
Area Under the Normal Curve
The area under the curve bounded by two ordinates x = x1 and
x = x2 equals the probability that the random variable X assumes
a value between x = x1 and x = x2.
$$P(x_1 < X < x_2) = \int_{x_1}^{x_2} n(x; \mu, \sigma)\,dx = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} dx$$
As seen previously, the normal curve is dependent on the mean μ
and the standard deviation σ of the distribution under
investigation.
The same interval of a random variable can yield different
probabilities if μ or σ differ.
[Figure: Same interval, but different probabilities for two different normal curves]
The difficulty encountered in solving integrals of normal density
functions necessitates the tabulation of normal curve area for
quick reference.
Fortunately, we are able to transform all the observations of any
normal random variable X to a new set of observations of a normal
random variable Z with mean 0 and variance 1.
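$$Z = \frac{X - \mu}{\sigma}$$
$$P(x_1 < X < x_2) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} dx = \frac{1}{\sqrt{2\pi}} \int_{z_1}^{z_2} e^{-\frac{z^2}{2}}\,dz = \int_{z_1}^{z_2} n(z; 0, 1)\,dz = P(z_1 < Z < z_2)$$
where z1 = (x1 − μ)/σ and z2 = (x2 − μ)/σ.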
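The transformation to Z can be sketched in a few lines of Python. Here the table lookup is replaced by the error function `math.erf` from the standard library, which is a substitution made for this illustration; the helper names `phi` and `normal_prob` are hypothetical.

```python
import math


def phi(z):
    """Standard normal CDF P(Z < z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_prob(x1, x2, mu, sigma):
    """P(x1 < X < x2) for X normal(mu, sigma), via standardization."""
    z1 = (x1 - mu) / sigma
    z2 = (x2 - mu) / sigma
    return phi(z2) - phi(z1)


# The interval mu - sigma to mu + sigma maps to -1 < Z < 1 for any mu, sigma.
p_general = normal_prob(40.0, 60.0, 50.0, 10.0)
p_standard = normal_prob(-1.0, 1.0, 0.0, 1.0)

print(p_general, p_standard)
```

Both calls give the same probability, which is why a single table for the standard normal distribution is sufficient.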
The distribution of a normal random variable with mean 0 and
variance 1 is called a standard normal distribution.
Table A.3 Normal Probability Table
Interpolation
Interpolation is a method of constructing new data points within
the range of a discrete set of known data points.
Examine the following graph. Two data points are known, which
are (a,f(a)) and (b,f(b)).
If a value of c is given, with a < c < b, then the value of f(c) can be
estimated.
If a value of f(c) is given, with f(a) < f(c) < f(b), then the value of c
can be estimated.
[Figure: Linear interpolation between the points (a, f(a)) and (b, f(b))]
$$f(c) = f(a) + \frac{c - a}{b - a}\,\bigl(f(b) - f(a)\bigr)$$
$$c = a + \frac{f(c) - f(a)}{f(b) - f(a)}\,(b - a)$$
P(Z < 1.172) = ? Answer: 0.8794
P(Z < z) = 0.8700, z = ? Answer: 1.126
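Both interpolation exercises can be reproduced with a short Python sketch. The four table entries used here (0.8790, 0.8810, 0.8686, 0.8708) are the values found in a standard four-decimal normal table and are assumed to match Table A.3; the function name `interpolate` is hypothetical.

```python
def interpolate(x, x0, y0, x1, y1):
    """Linear interpolation: estimate y at x from (x0, y0) and (x1, y1)."""
    return y0 + (x - x0) / (x1 - x0) * (y1 - y0)


# P(Z < 1.172), from the assumed entries P(Z < 1.17) and P(Z < 1.18).
p = interpolate(1.172, 1.17, 0.8790, 1.18, 0.8810)

# Inverse lookup: z with P(Z < z) = 0.8700, from the assumed entries
# P(Z < 1.12) = 0.8686 and P(Z < 1.13) = 0.8708 (roles of x and y swapped).
z = interpolate(0.8700, 0.8686, 1.12, 0.8708, 1.13)

print(round(p, 4), round(z, 3))
```

The same function serves both directions because linear interpolation is symmetric in the two coordinates.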
Given a standard normal distribution, find the area under the curve
that lies (a) to the right of z = 1.84 and (b) between z = –1.97 and
z = 0.86.
(a)
$$P(Z > 1.84) = 1 - P(Z \le 1.84) = 1 - 0.9671 = 0.0329$$
(b)
$$P(-1.97 < Z < 0.86) = P(Z < 0.86) - P(Z < -1.97) = 0.8051 - 0.0244 = 0.7807$$
Given a standard normal distribution, find the value of k such that
(a) P ( Z > k ) = 0.3015, and (b) P ( k < Z < –0.18 ) = 0.4197.
(a)
$$P(Z > k) = 1 - P(Z < k)$$
$$P(Z < k) = 1 - P(Z > k) = 1 - 0.3015 = 0.6985$$
$$k = 0.52$$
(b)
$$P(k < Z < -0.18) = P(Z < -0.18) - P(Z < k)$$
$$P(Z < k) = P(Z < -0.18) - P(k < Z < -0.18) = 0.4286 - 0.4197 = 0.0089$$
$$k = -2.37$$
Given a random variable X having a normal distribution with μ = 50
and σ = 10, find the probability that X assumes a value between 45
and 62.
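$$z_1 = \frac{x_1 - \mu}{\sigma} = \frac{45 - 50}{10} = -0.5, \quad z_2 = \frac{x_2 - \mu}{\sigma} = \frac{62 - 50}{10} = 1.2$$
$$P(45 < X < 62) = P(-0.5 < Z < 1.2) = P(Z < 1.2) - P(Z < -0.5) = 0.8849 - 0.3085 = 0.5764$$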
Given that X has a normal distribution with μ = 300 and σ = 50, find
the probability that X assumes a value greater than 362.
$$z = \frac{x - \mu}{\sigma} = \frac{362 - 300}{50} = 1.24$$
$$P(X > 362) = P(Z > 1.24) = 1 - P(Z < 1.24) = 1 - 0.8925 = 0.1075$$
Given a normal distribution with μ = 40 and σ = 6, find the value of x
that has (a) 45% of the area to the left, and (b) 14% of the area to
the right.
(a) P(Z < z) = 0.45. Interpolating between the table entries
P(Z < −0.13) = 0.4483 and P(Z < −0.12) = 0.4522:
$$z = -0.13 + \frac{0.45 - 0.4483}{0.4522 - 0.4483}\,\bigl(-0.12 - (-0.13)\bigr) = -0.1256$$
$$x = \sigma z + \mu = 40 + (-0.1256)(6) = 39.2464$$
(b)
$$P(Z > z) = 0.14 = 1 - P(Z < z)$$
$$P(Z < z) = 1 - 0.14 = 0.86$$
$$z = 1.08$$
$$x = \sigma z + \mu = 40 + (1.08)(6) = 46.48$$
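The inverse problem, finding x from a given area, can also be sketched with Python's standard library. This uses `statistics.NormalDist.inv_cdf` in place of the table lookup and interpolation, so the results differ slightly from the table-based answers in the last decimals.

```python
from statistics import NormalDist

# X is assumed normal with mu = 40 and sigma = 6, as in the example.
X = NormalDist(mu=40, sigma=6)

# (a) 45% of the area lies to the left of x_a.
x_a = X.inv_cdf(0.45)

# (b) 14% of the area lies to the right of x_b, so 86% lies to the left.
x_b = X.inv_cdf(1 - 0.14)

print(round(x_a, 2), round(x_b, 2))
```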
Chapter 6.4
Applications of the Normal Distribution
A certain type of storage battery lasts, on average, 3.0 years, with a
standard deviation of 0.5 year. Assuming that the battery lives are
normally distributed, find the probability that a given battery will last
less than 2.3 years.
$$z = \frac{x - \mu}{\sigma} = \frac{2.3 - 3.0}{0.5} = -1.4$$
$$P(X < 2.3) = P(Z < -1.4) = 0.0808 = 8.08\%$$
In an industrial process the diameter of a ball bearing is an
important component part. The buyer sets specifications on the
diameter to be 3.0 ± 0.01 cm. All parts falling outside these
specifications will be rejected.
It is known that in the process the diameter of a ball bearing has a
normal distribution with mean 3.0 and standard deviation 0.005.
On the average, how many manufactured ball bearings will be
scrapped?
$$z_1 = \frac{x_1 - \mu}{\sigma} = \frac{2.99 - 3.0}{0.005} = -2, \quad z_2 = \frac{x_2 - \mu}{\sigma} = \frac{3.01 - 3.0}{0.005} = 2$$
$$P(2.99 < X < 3.01) = P(-2 < Z < 2) = P(Z < 2) - P(Z < -2) = 0.9772 - 0.0228 = 0.9544$$
95.44% accepted
4.56% rejected
A certain machine makes electrical resistors having a mean
resistance of 40 Ω and a standard deviation of 2 Ω. It is assumed
that the resistance follows a normal distribution.
What percentage of resistors will have a resistance exceeding 43 Ω
if:
(a) the resistance can be measured to any degree of accuracy.
(b) the resistance can be measured to the nearest ohm only.
(a)
$$z = \frac{43 - 40}{2} = 1.5$$
$$P(X > 43) = P(Z > 1.5) = 1 - P(Z < 1.5) = 1 - 0.9332 = 0.0668 = 6.68\%$$
(b)
$$z = \frac{43.5 - 40}{2} = 1.75$$
$$P(X > 43.5) = P(Z > 1.75) = 1 - P(Z < 1.75) = 1 - 0.9599 = 0.0401 = 4.01\%$$
As many as 6.68% − 4.01% = 2.67% of the resistors will be accepted although
their resistance is greater than 43 Ω, due to the measurement limitation.
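The effect of measuring to the nearest ohm can be illustrated with a short Python sketch based on `statistics.NormalDist`. The variable names are hypothetical, and the exact CDF is used instead of the four-decimal table.

```python
from statistics import NormalDist

# Resistance is assumed normal with mean 40 ohms and standard deviation 2 ohms.
R = NormalDist(mu=40, sigma=2)

# (a) Exact measurement: any resistance above 43 ohms counts as exceeding.
p_exact = 1 - R.cdf(43)

# (b) Nearest-ohm measurement: a reading above 43 means the true value
# exceeds 43.5 ohms.
p_nearest_ohm = 1 - R.cdf(43.5)

# Share of resistors above 43 ohms that the coarser measurement misses.
gap = p_exact - p_nearest_ohm

print(round(p_exact, 4), round(p_nearest_ohm, 4), round(gap, 4))
```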
The average grade for an exam is 74, and the standard deviation is
7. If 12% of the class are given A’s, and the grades are curved to
follow a normal distribution, what is the lowest possible A and the
highest possible B?
$$P(Z > z) = 0.12$$
$$P(Z < z) = 1 - P(Z > z) = 1 - 0.12 = 0.88$$
$$z = 1.175$$
$$x = \sigma z + \mu = 74 + (1.175)(7) = 82.225$$
Lowest possible A is 83
Highest possible B is 82
Chapter 6.5
Normal Approximation to the Binomial
The probabilities associated with binomial experiments are readily
obtainable from the formula b(x;n, p) of the binomial distribution
or from the table when n is small.
For large n, tabulating the distribution is no longer practical.
Nevertheless, the binomial distribution can be nicely approximated
by the normal distribution under certain circumstances.
If X is a binomial random variable with mean μ = np and variance
σ² = npq, then the limiting form of the distribution of
$$Z = \frac{X - np}{\sqrt{npq}}$$
as n → ∞, is the standard normal distribution n(z; 0, 1).
[Figure: Normal approximation of b(x; 15, 0.4)]
Each value of b(x; 15, 0.4) is
approximated by P(x − 0.5 < X < x + 0.5)
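[Figure: Normal approximation of b(4; 15, 0.4) and $\sum_{x=7}^{9} b(x; 15, 0.4)$]
$$P(X = 4) = b(4; 15, 0.4) = \binom{15}{4} (0.4)^4 (0.6)^{11} = 0.1268$$
$$\mu = np = (15)(0.4) = 6, \quad \sigma = \sqrt{npq} = \sqrt{(15)(0.4)(0.6)} = 1.897$$
$$P(X = 4) \approx P(3.5 < X < 4.5) = P(-1.32 < Z < -0.79) = 0.1214$$
$$P(7 \le X \le 9) = \sum_{x=7}^{9} b(x; 15, 0.4) = 0.3564$$
$$P(7 \le X \le 9) \approx P(6.5 < X < 9.5) = P(0.26 < Z < 1.85) = 0.3652$$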
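The comparison between exact binomial probabilities and their normal approximation can be sketched in Python as follows. The helper names `binom_pmf` and `normal_approx` are hypothetical; the approximation uses the exact normal CDF rather than z values rounded to two decimals, so it differs slightly from the table-based figures.

```python
import math
from statistics import NormalDist


def binom_pmf(x, n, p):
    """Exact binomial probability b(x; n, p)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def normal_approx(lo, hi, n, p):
    """Approximate P(lo <= X <= hi) with the +/- 0.5 continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    curve = NormalDist(mu, sigma)
    return curve.cdf(hi + 0.5) - curve.cdf(lo - 0.5)


n, p = 15, 0.4
exact_single = binom_pmf(4, n, p)
approx_single = normal_approx(4, 4, n, p)
exact_range = sum(binom_pmf(x, n, p) for x in range(7, 10))
approx_range = normal_approx(7, 9, n, p)

print(round(exact_single, 4), round(approx_single, 4))
print(round(exact_range, 4), round(approx_range, 4))
```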
The degree of accuracy, that is, how well the normal curve fits the
binomial histogram, increases as n increases.
If the value of n is small and p is not very close to 1/2, the normal
curve will not fit the histogram well, as shown below.
[Figure: Histograms of b(x; 6, 0.2) and b(x; 15, 0.2)]
The approximation using the normal curve will be excellent when n is
large, or when n is small with p reasonably close to 1/2.
As a rule of thumb, if both np and nq are greater than or equal to 5,
the approximation will be good.
Let X be a binomial random variable with parameters n and p. For
large n, X has approximately a normal distribution with μ = np and
σ² = npq = np(1 − p) and
$$P(X \le x) = \sum_{k=0}^{x} b(k; n, p) \approx \text{area under the normal curve to the left of } x + 0.5$$
$$= P(X \le x + 0.5) = P\left(Z \le \frac{(x + 0.5) - \mu}{\sigma}\right)$$
and the approximation will be good if np and nq = n(1–p) are
greater than or equal to 5.
The probability that a patient recovers from a rare blood disease is
0.4. If 100 people are known to have contracted this disease, what is
the probability that fewer than 30 survive?
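$$n = 100, \quad p = 0.4$$
$$\mu = np = (100)(0.4) = 40, \quad \sigma = \sqrt{npq} = \sqrt{(100)(0.4)(0.6)} = 4.899$$
$$P(X < 30) = \sum_{x=0}^{29} b(x; 100, 0.4) \approx P(X < 29.5)$$
$$z = \frac{29.5 - 40}{4.899} = -2.143$$
$$P(Z < -2.143) = 0.01608 = 1.608\% \quad \text{(after interpolation)}$$
Can you calculate the exact solution?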
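One way to approach the exact solution is to sum the binomial probabilities directly. The following Python sketch assumes exact integer binomial coefficients from `math.comb` and compares the sum with the continuity-corrected normal approximation; the variable names are hypothetical.

```python
import math
from statistics import NormalDist

n, p = 100, 0.4

# Exact: sum of b(x; 100, 0.4) for x = 0, 1, ..., 29.
exact = sum(math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(30))

# Normal approximation with continuity correction: P(X < 29.5).
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
approx = NormalDist(mu, sigma).cdf(29.5)

print(exact, approx)
```

Here np = 40 and nq = 60 are both well above 5, so the two values are expected to be close.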
A multiple-choice quiz has 200 questions each with 4 possible
answers of which only 1 is the correct answer. What is the probability
that sheer guess-work yields from 25 to 30 correct answers for 80 of
the 200 problems about which the student has no knowledge?
$$n = 80, \quad p = \tfrac{1}{4}$$
$$\mu = np = (80)(\tfrac{1}{4}) = 20, \quad \sigma = \sqrt{npq} = \sqrt{(80)(\tfrac{1}{4})(\tfrac{3}{4})} = 3.873$$
$$P(25 \le X \le 30) = \sum_{x=25}^{30} b(x; 80, \tfrac{1}{4}) \approx P(24.5 < X < 30.5)$$
$$z_1 = \frac{24.5 - 20}{3.873} = 1.162, \quad z_2 = \frac{30.5 - 20}{3.873} = 2.711$$
$$P(1.162 < Z < 2.711) = P(Z < 2.711) - P(Z < 1.162) = 0.9966 - 0.8774 = 0.1192$$
The PU Physics entrance exam consists of 30 multiple-choice questions,
each with 4 possible answers of which only 1 is the correct answer.
What is the probability that a prospective student will obtain a
scholarship by correctly answering at least 80% of the questions just
by guessing?
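$$n = 30, \quad p = \tfrac{1}{4}$$
$$\mu = np = (30)(\tfrac{1}{4}) = 7.5, \quad \sigma = \sqrt{npq} = \sqrt{(30)(\tfrac{1}{4})(\tfrac{3}{4})} = 2.372$$
At least 80% of 30 questions means at least 24 correct answers.
$$P(X \ge 24) = \sum_{x=24}^{30} b(x; 30, \tfrac{1}{4}) \approx 1 - P(X < 23.5)$$
$$z = \frac{23.5 - 7.5}{2.372} = 6.745$$
$$1 - P(Z < 6.745) \approx 0$$
It is practically impossible to get a scholarship just by pure
luck in the entrance exam.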
Chapter 6.6
Gamma and Exponential Distributions
There are still numerous situations that the normal distribution
cannot cover. For such situations, different types of density
functions are required.
Two such density functions are the gamma and exponential
distributions.
Both distributions find applications in queuing theory and reliability
problems.
The gamma function is defined by
$$\Gamma(\alpha) = \int_0^\infty x^{\alpha - 1} e^{-x}\,dx$$
for α > 0.
|Gamma Distribution| The continuous random variable X has a
gamma distribution, with parameters α and β, if its density
function is given by
$$f(x) = \begin{cases} \dfrac{1}{\beta^\alpha \Gamma(\alpha)}\, x^{\alpha - 1} e^{-x/\beta}, & x > 0 \\ 0, & \text{elsewhere} \end{cases}$$
where α > 0 and β > 0.
|Exponential Distribution| The continuous random variable X
has an exponential distribution, with parameter β, if its density
function is given by
$$f(x) = \begin{cases} \dfrac{1}{\beta}\, e^{-x/\beta}, & x > 0 \\ 0, & \text{elsewhere} \end{cases}$$
where β > 0.
[Figure: Gamma distributions for certain values of the parameters α and β]
The gamma distribution with α = 1 is called the exponential distribution.
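The mean and variance of the gamma distribution are
$$\mu = \alpha\beta \quad \text{and} \quad \sigma^2 = \alpha\beta^2$$
The mean and variance of the exponential distribution are
$$\mu = \beta \quad \text{and} \quad \sigma^2 = \beta^2$$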
Chapter 6.7
Applications of the Gamma and Exponential Distributions
Applications
Suppose that a system contains a certain type of component whose
time in years to failure is given by T. The random variable T is
modeled nicely by the exponential distribution with mean time to
failure β = 5.
If 5 of these components are installed in different systems, what is
the probability that at least 2 are still functioning at the end of 8
years?
The probability that a given component is still functioning at the end of 8
years is
$$P(T > 8) = \frac{1}{5} \int_8^\infty e^{-t/5}\,dt = e^{-8/5} \approx 0.2$$
The probability that at least 2 out of 5 such components are still
functioning at the end of 8 years is
$$P(X \ge 2) = \sum_{x=2}^{5} b(x; 5, 0.2) = 1 - \sum_{x=0}^{1} b(x; 5, 0.2) = 1 - 0.7373 = 0.2627$$
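The two-stage calculation, an exponential survival probability fed into a binomial count, can be sketched in Python. The rounding of the survival probability to 0.2 follows the worked solution; the variable names are hypothetical.

```python
import math

beta = 5.0  # mean time to failure in years

# Stage 1: P(T > 8) for an exponential lifetime with mean beta.
p_survive = math.exp(-8.0 / beta)

# Stage 2: the worked solution rounds this probability to 0.2 and treats the
# number of surviving components out of n = 5 as binomial.
p, n = 0.2, 5
p_at_most_1 = sum(math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(2))
p_at_least_2 = 1 - p_at_most_1

print(round(p_survive, 4), round(p_at_least_2, 4))
```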
Suppose that telephone calls arriving at a particular switchboard
follow a Poisson process with an average of 5 calls coming per
minute.
What is the probability that up to a minute will elapse until 2 calls
have come in to the switchboard?
$$\beta = 1/5, \quad \alpha = 2$$
β is the mean time between calls, in minutes.
α is the number of calls awaited.
$$P(X \le x) = \int_0^x \frac{1}{\beta^2}\, x e^{-x/\beta}\,dx$$
$$P(X \le 1) = 25 \int_0^1 x e^{-5x}\,dx = 1 - e^{-5(1)}(1 + 5) = 0.96$$
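The closed-form result can be cross-checked by integrating the gamma density numerically. This Python sketch uses a simple midpoint rule, which is a choice made for illustration only; the helper names `gamma_pdf` and `integrate` are hypothetical.

```python
import math


def gamma_pdf(x, alpha, beta):
    """Gamma density with shape alpha and scale beta (zero for x <= 0)."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta**alpha * math.gamma(alpha))


def integrate(f, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


alpha, beta = 2, 1 / 5  # waiting for 2 calls, mean 1/5 minute between calls
numeric = integrate(lambda x: gamma_pdf(x, alpha, beta), 0.0, 1.0)
closed_form = 1 - math.exp(-5) * (1 + 5)

print(round(numeric, 4), round(closed_form, 4))
```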
Based on extensive testing, it is determined that the average time
Y before a washing machine requires a major repair is 4 years. This
time is known to be modeled nicely by the exponential
distribution. The machine is considered a bargain if it is unlikely to
require a major repair before the sixth year.
(a) Determine the probability that it survives for more than 6 years
without a major repair.
(b) What is the probability that a major repair occurs in the first
year?
(a)
$$P(Y > 6) = \frac{1}{4} \int_6^\infty e^{-t/4}\,dt = e^{-6/4} = 0.223$$
Only 22.3% survive for more than 6 years without a major repair.
(b)
$$P(Y < 1) = \frac{1}{4} \int_0^1 e^{-t/4}\,dt = 1 - e^{-1/4} = 0.221$$
22.1% will need a major repair within the first year of use.
Chapter 6.8
Chi-Squared Distribution
Another very important special case of the gamma distribution is
obtained by letting α = v/2 and β = 2, where v is a positive
integer.
The result is called the chi-squared distribution, with a single
parameter v called the degrees of freedom.
The chi-squared distribution plays a vital role in statistical
inference. It has considerable application in both methodology and
theory.
Many chapters ahead of us will contain important applications of
this distribution.
|Chi-Squared Distribution| The continuous random variable X
has a chi-squared distribution, with v degrees of freedom, if its
density function is given by
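$$f(x) = \begin{cases} \dfrac{1}{2^{v/2}\,\Gamma(v/2)}\, x^{v/2 - 1} e^{-x/2}, & x > 0 \\ 0, & \text{elsewhere} \end{cases}$$
where v is a positive integer.
The mean and variance of the chi-squared distribution are
$$\mu = v \quad \text{and} \quad \sigma^2 = 2v$$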
Chapter 6.9
Lognormal Distribution
The lognormal distribution is used for a wide variety of
applications.
The distribution applies in cases where a natural log
transformation results in a normal distribution.
|Lognormal Distribution| The continuous random variable X has
a lognormal distribution if the random variable Y = ln(X) has a
normal distribution with mean μ and standard deviation σ. The
resulting density function of X is
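$$f(x) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,\sigma x}\, e^{-[\ln(x) - \mu]^2 / (2\sigma^2)}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$
The mean and variance of the lognormal distribution are
$$E(X) = e^{\mu + \sigma^2/2} \quad \text{and} \quad Var(X) = e^{2\mu + \sigma^2}\,(e^{\sigma^2} - 1)$$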
Concentrations of pollutants produced by chemical plants are
historically known to exhibit behavior that resembles a lognormal
distribution. This is important when one considers issues regarding
compliance with government regulations.
Suppose it is assumed that the concentration of a certain pollutant,
in parts per million, has a lognormal distribution with parameters μ =
3.2 and σ = 1.
What is the probability that the concentration exceeds 8 parts per
million?
$$P(X > 8) = 1 - P(X \le 8)$$
$$P(X \le 8) = F\left(\frac{\ln(8) - 3.2}{1}\right) = F(-1.12) = 0.1314$$
$$P(X > 8) = 1 - 0.1314 = 0.8686$$
F denotes the cumulative distribution function of the standard normal
distribution, that is, the area under the normal curve.
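The lognormal calculation reduces to a standard normal lookup after taking the natural logarithm, which the following Python sketch illustrates. It uses `statistics.NormalDist` for F instead of the table, so the result differs slightly from the table-based value; the variable names are hypothetical.

```python
import math
from statistics import NormalDist

mu, sigma = 3.2, 1.0  # parameters of Y = ln(X), as given in the example

# P(X <= 8) = P(ln X <= ln 8) = F((ln 8 - mu) / sigma)
z = (math.log(8) - mu) / sigma
p_below = NormalDist().cdf(z)
p_exceeds = 1 - p_below

# Mean and variance of X from the lognormal formulas.
mean = math.exp(mu + sigma**2 / 2)
variance = math.exp(2 * mu + sigma**2) * (math.exp(sigma**2) - 1)

print(round(z, 2), round(p_below, 4), round(p_exceeds, 4))
```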
Probability and Statistics
Homework 7
1. Suppose the current measurements in a strip of wire are assumed to
follow a normal distribution with a mean of 10 milliamperes and a
variance of 4 milliamperes². (a) What is the probability that a
measurement will exceed 13 milliamperes? (b) Determine the value for
which the probability that a current measurement is below this value is
98%.
(Mo.E4.13-14 p.113)
2. A lawyer commutes daily from his suburban home to his midtown office. The
average time for a one-way trip is 24 minutes, with a standard deviation
of 3.8 minutes. Assume the distribution of trip times to be normally
distributed. (a) If the office opens at 9:00 A.M. and the lawyer leaves his
house at 8:45 A.M. daily, what percentage of the time is he late for work?
(b) Find the probability that 2 of the next 3 trips will take at least 1/2
hour.
(Wa.6.15 s.186)
3. (a) Suppose that a sample of 1600 tires of the same type are obtained at
random from an ongoing production process in which 8% of all such tires
produced are defective. What is the probability that in such sample 150
or fewer tires will be defective?
(Sou18. CD6-13)
(b) If 10% of men are bald, what is the probability that more than 100 in
a random sample of 818 men are bald?