Normal distribution - Erwin Sitompul


Probability and Statistics
Lecture 7
Dr.-Ing. Erwin Sitompul
President University
http://zitompul.wordpress.com
President University
Erwin Sitompul
PBST 7/1
Chapter 6
Some Continuous Probability Distributions
Chapter 6.1
Continuous Uniform Distribution
• |Uniform Distribution| The density function of the continuous uniform random variable X on the interval [A, B] is

$$f(x; A, B) = \begin{cases} \dfrac{1}{B - A}, & A \le x \le B \\ 0, & \text{elsewhere} \end{cases}$$

• The mean and variance of the uniform distribution are

$$\mu = \frac{A + B}{2} \quad \text{and} \quad \sigma^2 = \frac{(B - A)^2}{12}$$

[Figure: The uniform density function for a random variable on the interval [1, 3]]
Suppose that a large conference room for a certain company can be
reserved for no more than 4 hours. However, the use of the
conference room is such that both long and short conferences occur
quite often. In fact, it can be assumed that the length X of a conference
has a uniform distribution on the interval [0, 4].
(a) What is the probability density function?
(b) What is the probability that any given conference lasts at least 3
hours?
(a)
$$f(x) = \begin{cases} \dfrac{1}{4}, & 0 \le x \le 4 \\ 0, & \text{elsewhere} \end{cases}$$
(b)
$$P(X \ge 3) = \int_3^4 \frac{1}{4}\,dx = \frac{1}{4}$$
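The conference-room result can be checked with a few lines of Python. This is a minimal sketch; the function names `uniform_pdf`, `uniform_prob`, and `uniform_mean_var` are illustrative and not part of the lecture.

```python
def uniform_pdf(x, a, b):
    """Density of the continuous uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_prob(lo, hi, a, b):
    """P(lo <= X <= hi) for X uniform on [a, b]: overlap length / (b - a)."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0.0) / (b - a)

def uniform_mean_var(a, b):
    """Mean (a + b)/2 and variance (b - a)^2 / 12."""
    return (a + b) / 2, (b - a) ** 2 / 12

# Conference length X is uniform on [0, 4]; probability of at least 3 hours.
p_at_least_3 = uniform_prob(3, 4, 0, 4)
mean, var = uniform_mean_var(0, 4)
print(p_at_least_3, mean, var)
```

Because the density is constant, the probability is simply the length of the interval [3, 4] divided by the length of [0, 4].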
Chapter 6.2
Normal Distribution
• The normal distribution is the most important continuous probability distribution in the entire field of statistics.
• Its graph, called the normal curve, is the bell-shaped curve that approximately describes many phenomena that occur in nature, industry, and research.
• The normal distribution is often referred to as the Gaussian distribution, in honor of Karl Friedrich Gauss, who also derived its equation from a study of errors in repeated measurements of the same quantity.
[Figure: The normal curve]
• A continuous random variable X having the bell-shaped distribution shown in the figure is called a normal random variable.
• The density function of the normal random variable X, with mean μ and variance σ², is

$$n(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}, \quad -\infty < x < \infty$$

where π = 3.14159... and e = 2.71828...
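As a small illustration, the density can be evaluated directly from its formula. The sketch below uses only the standard library; the name `normal_pdf` is an assumption made for this example.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density n(x; mu, sigma) of the normal distribution."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2 * math.pi) * sigma)

peak = normal_pdf(0.0, 0.0, 1.0)    # height of the curve at the mean (mu = 0, sigma = 1)
left = normal_pdf(-1.0, 0.0, 1.0)   # one standard deviation below the mean
right = normal_pdf(1.0, 0.0, 1.0)   # one standard deviation above the mean
print(peak, left, right)
```

The equal values at μ – σ and μ + σ reflect the symmetry of the curve about the mean.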
Normal Curve
[Figure: Normal curves with μ1 < μ2, σ1 = σ2; with μ1 = μ2, σ1 < σ2; and with μ1 < μ2, σ1 < σ2]
[Figure: The normal curve f(x) plotted against x, annotated with the properties below]
• The mode, the point where the curve is at its maximum, occurs at x = μ.
• The curve is symmetric about a vertical axis through the mean μ.
• The points of inflection lie at x = μ ± σ; the curve is concave downward between them and concave upward elsewhere.
• The curve approaches zero asymptotically in either direction away from the mean.
• The total area under the curve and above the horizontal axis is equal to 1.
Chapter 6.3
Areas Under the Normal Curve
• The area under the curve bounded by the two ordinates x = x1 and x = x2 equals the probability that the random variable X assumes a value between x = x1 and x = x2.

$$P(x_1 < X < x_2) = \int_{x_1}^{x_2} n(x; \mu, \sigma)\,dx = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} dx$$

• As seen previously, the normal curve depends on the mean μ and the standard deviation σ of the distribution under investigation.
• The same interval of a random variable can yield different probabilities if μ or σ differs.
[Figure: Same interval, but different probabilities for two different normal curves]
• The difficulty encountered in solving integrals of normal density functions necessitates the tabulation of normal curve areas for quick reference.
• Fortunately, we are able to transform all the observations of any normal random variable X into a new set of observations of a normal random variable Z with mean 0 and variance 1.

$$Z = \frac{X - \mu}{\sigma}$$

$$P(x_1 < X < x_2) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} dx = \frac{1}{\sqrt{2\pi}} \int_{z_1}^{z_2} e^{-\frac{z^2}{2}}\,dz = \int_{z_1}^{z_2} n(z; 0, 1)\,dz = P(z_1 < Z < z_2)$$

where z1 = (x1 – μ)/σ and z2 = (x2 – μ)/σ.
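The equality between the two integrals can be checked numerically. The sketch below is one way to do so, under stated assumptions: the values μ = 10, σ = 2, x1 = 8, x2 = 13 are illustrative and not taken from the lecture, and the helper names `phi`, `normal_pdf`, and `integrate` are hypothetical. The standard normal area is obtained from the error function `math.erf` instead of a printed table.

```python
import math

def phi(z):
    """Standard normal CDF, P(Z < z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(x, mu, sigma):
    """Density n(x; mu, sigma) of the normal distribution."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

def integrate(f, lo, hi, steps=10_000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

mu, sigma, x1, x2 = 10.0, 2.0, 8.0, 13.0   # illustrative values, not from the lecture

# Left-hand side: integrate the density of X directly.
by_integration = integrate(lambda x: normal_pdf(x, mu, sigma), x1, x2)

# Right-hand side: standardize the limits and use the standard normal CDF.
z1, z2 = (x1 - mu) / sigma, (x2 - mu) / sigma
by_standardizing = phi(z2) - phi(z1)
print(by_integration, by_standardizing)
```

Both routes give the same probability up to the error of the numerical integration, which is why a single table for Z suffices.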
• The distribution of a normal random variable with mean 0 and variance 1 is called a standard normal distribution.
Table A.3 Normal Probability Table
Interpolation
• Interpolation is a method of constructing new data points within the range of a discrete set of known data points.
• Two data points are known, (a, f(a)) and (b, f(b)).
• If a value of c is given, with a < c < b, then the value of f(c) can be estimated:

$$f(c) \approx f(a) + \frac{c - a}{b - a}\,\bigl[f(b) - f(a)\bigr]$$

• If a value of f(c) is given, with f(a) < f(c) < f(b), then the value of c can be estimated:

$$c \approx a + \frac{f(c) - f(a)}{f(b) - f(a)}\,(b - a)$$

[Figure: Straight line through (a, f(a)) and (b, f(b)) with the interpolated point (c, f(c))]
• P(Z < 1.172) = ? Answer: 0.8794
• P(Z < z) = 0.8700, z = ? Answer: 1.126
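The two interpolation answers can be reproduced as follows. This is a sketch: the function name `interpolate` is illustrative, and the four table entries used are the standard four-decimal values of the normal probability table for z = 1.12, 1.13, 1.17, and 1.18.

```python
def interpolate(x, x0, y0, x1, y1):
    """Linear interpolation: estimate y at x from the points (x0, y0) and (x1, y1)."""
    return y0 + (x - x0) / (x1 - x0) * (y1 - y0)

# Forward use: z is given, the area is estimated.
p = interpolate(1.172, 1.17, 0.8790, 1.18, 0.8810)

# Inverse use: the area is given, z is estimated (roles of the columns swapped).
z = interpolate(0.8700, 0.8686, 1.12, 0.8708, 1.13)

print(round(p, 4), round(z, 3))
```

The same function serves both directions because linear interpolation is symmetric in the roles of the two table columns.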
Given a standard normal distribution, find the area under the curve that lies (a) to the right of z = 1.84 and (b) between z = –1.97 and z = 0.86.
(a)
$$P(Z > 1.84) = 1 - P(Z < 1.84) = 1 - 0.9671 = 0.0329$$
(b)
$$P(-1.97 < Z < 0.86) = P(Z < 0.86) - P(Z < -1.97) = 0.8051 - 0.0244 = 0.7807$$
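Where Python 3.8 or later is available, areas under the standard normal curve can be obtained from `statistics.NormalDist` instead of the table. The sketch below recomputes the two areas for z = 1.84 and for the interval from z = –1.97 to z = 0.86; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

area_right = 1 - Z.cdf(1.84)                # area to the right of z = 1.84
area_between = Z.cdf(0.86) - Z.cdf(-1.97)   # area between z = -1.97 and z = 0.86
print(round(area_right, 4), round(area_between, 4))
```

Small differences in the last digit compared with table-based answers can occur, because the table entries are themselves rounded to four decimals.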
Given a standard normal distribution, find the value of k such that (a) P(Z > k) = 0.3015, and (b) P(k < Z < –0.18) = 0.4197.
(a)
$$P(Z < k) = 1 - P(Z > k) = 1 - 0.3015 = 0.6985 \;\Rightarrow\; k = 0.52$$
(b)
$$P(k < Z < -0.18) = P(Z < -0.18) - P(Z < k)$$
$$P(Z < k) = P(Z < -0.18) - P(k < Z < -0.18) = 0.4286 - 0.4197 = 0.0089 \;\Rightarrow\; k = -2.37$$
Given a random variable X having a normal distribution with μ = 50
and σ = 10, find the probability that X assumes a value between 45
and 62.
$$z_1 = \frac{x_1 - \mu}{\sigma} = \frac{45 - 50}{10} = -0.5, \qquad z_2 = \frac{x_2 - \mu}{\sigma} = \frac{62 - 50}{10} = 1.2$$

$$P(45 < X < 62) = P(-0.5 < Z < 1.2) = P(Z < 1.2) - P(Z < -0.5) = 0.8849 - 0.3085 = 0.5764$$
Given that X has a normal distribution with μ = 300 and σ = 50, find
the probability that X assumes a value greater than 362.
$$z = \frac{x - \mu}{\sigma} = \frac{362 - 300}{50} = 1.24$$

$$P(X > 362) = P(Z > 1.24) = 1 - P(Z < 1.24) = 1 - 0.8925 = 0.1075$$
Given a normal distribution with μ = 40 and σ = 6, find the value of x
that has (a) 45% of the area to the left, and (b) 14% of the area to
the right.
(a)
$$P(Z < z) = 0.45$$
Table A.3 gives P(Z < –0.13) = 0.4483 and P(Z < –0.12) = 0.4522, so by interpolation
$$z = -0.13 + \frac{0.45 - 0.4483}{0.4522 - 0.4483}\,\bigl(-0.12 - (-0.13)\bigr) = -0.1256$$
$$x = \mu + \sigma z = 40 + (-0.1256)(6) = 39.2464$$
(b)
$$P(Z > z) = 0.14 = 1 - P(Z < z)$$
$$P(Z < z) = 1 - 0.14 = 0.86 \;\Rightarrow\; z = 1.08$$
$$x = \mu + \sigma z = 40 + (1.08)(6) = 46.48$$
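The reverse lookup, from an area to a value of x, can also be done in software. The sketch below assumes Python 3.8 or later and uses `NormalDist.inv_cdf`; the variable names are illustrative.

```python
from statistics import NormalDist

X = NormalDist(mu=40, sigma=6)

x_a = X.inv_cdf(0.45)       # (a) 45% of the area lies to the left of x_a
x_b = X.inv_cdf(1 - 0.14)   # (b) 14% of the area lies to the right of x_b
print(round(x_a, 2), round(x_b, 2))
```

The results agree with the table-based answers to about two decimals; the table route rounds z before converting back with x = μ + σz.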
Chapter 6.4
Applications of the Normal Distribution
A certain type of storage battery lasts, on average, 3.0 years, with a
standard deviation of 0.5 year. Assuming that the battery lives are
normally distributed, find the probability that a given battery will last
less than 2.3 years.
$$z = \frac{x - \mu}{\sigma} = \frac{2.3 - 3.0}{0.5} = -1.4$$

$$P(X < 2.3) = P(Z < -1.4) = 0.0808 = 8.08\%$$
In an industrial process the diameter of a ball bearing is an
important component part. The buyer sets specifications on the
diameter to be 3.0 ± 0.01 cm. All parts falling outside these
specifications will be rejected.
It is known that in the process the diameter of a ball bearing has a
normal distribution with mean 3.0 and standard deviation 0.005.
On the average, how many manufactured ball bearings will be
scrapped?
$$z_1 = \frac{x_1 - \mu}{\sigma} = \frac{2.99 - 3.0}{0.005} = -2, \qquad z_2 = \frac{x_2 - \mu}{\sigma} = \frac{3.01 - 3.0}{0.005} = 2$$

$$P(2.99 < X < 3.01) = P(-2 < Z < 2) = P(Z < 2) - P(Z < -2) = 0.9772 - 0.0228 = 0.9544$$

• 95.44% accepted
• 4.56% rejected
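The scrap fraction can be computed for any specification limits with a small helper. This is a sketch; `fraction_outside` is a hypothetical name introduced here, and `statistics.NormalDist` requires Python 3.8 or later.

```python
from statistics import NormalDist

def fraction_outside(mu, sigma, lower, upper):
    """Fraction of a normal population that falls outside [lower, upper]."""
    dist = NormalDist(mu, sigma)
    return 1 - (dist.cdf(upper) - dist.cdf(lower))

# Ball bearings: mean 3.0 cm, standard deviation 0.005 cm, spec 3.0 +/- 0.01 cm.
rejected = fraction_outside(3.0, 0.005, 2.99, 3.01)
print(f"{rejected:.2%}")
```

The printed value may differ from 4.56% in the last digit, since the table-based calculation rounds each area to four decimals before subtracting.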
A certain machine makes electrical resistors having a mean
resistance of 40 Ω and a standard deviation of 2 Ω. It is assumed
that the resistance follows a normal distribution.
What percentage of resistors will have a resistance exceeding 43 Ω
if:
(a) the resistance can be measured to any degree of accuracy.
(b) the resistance can be measured to the nearest ohm only.
(a)
$$z = \frac{43 - 40}{2} = 1.5$$
$$P(X > 43) = P(Z > 1.5) = 1 - P(Z < 1.5) = 1 - 0.9332 = 0.0668 = 6.68\%$$
(b) When the resistance is measured to the nearest ohm, a reading exceeds 43 Ω only if the true resistance is above 43.5 Ω.
$$z = \frac{43.5 - 40}{2} = 1.75$$
$$P(X > 43.5) = P(Z > 1.75) = 1 - P(Z < 1.75) = 1 - 0.9599 = 0.0401 = 4.01\%$$
• As many as 6.68% – 4.01% = 2.67% of the resistors will be accepted although their resistance is greater than 43 Ω, because of the measurement limitation.
The average grade for an exam is 74, and the standard deviation is
7. If 12% of the class are given A’s, and the grades are curved to
follow a normal distribution, what is the lowest possible A and the
highest possible B?
$$P(Z > z) = 0.12$$
$$P(Z < z) = 1 - P(Z > z) = 1 - 0.12 = 0.88 \;\Rightarrow\; z = 1.175$$
$$x = \mu + \sigma z = 74 + (1.175)(7) = 82.225$$
• Lowest possible A is 83
• Highest possible B is 82
Chapter 6.5
Normal Approximation to the Binomial
• The probabilities associated with binomial experiments are readily obtainable from the formula b(x; n, p) of the binomial distribution, or from the table when n is small.
• For large n, tabulating the distribution is no longer practical.
• Nevertheless, the binomial distribution can be nicely approximated by the normal distribution under certain circumstances.
• If X is a binomial random variable with mean μ = np and variance σ² = npq, then the limiting form of the distribution of

$$Z = \frac{X - np}{\sqrt{npq}}$$

as n → ∞, is the standard normal distribution n(z; 0, 1).
[Figure: Normal approximation of b(x; 15, 0.4)]
• Each value of b(x; 15, 0.4) is approximated by P(x – 0.5 < X < x + 0.5)
$$\mu = np = (15)(0.4) = 6, \qquad \sigma = \sqrt{npq} = \sqrt{(15)(0.4)(0.6)} = 1.897$$

Exact value:
$$P(X = 4) = b(4; 15, 0.4) = {}_{15}C_4\,(0.4)^4 (0.6)^{11} = 0.1268$$
Normal approximation:
$$P(X = 4) \approx P(3.5 < X < 4.5) = P(-1.32 < Z < -0.79) = 0.1214$$

Exact value:
$$P(7 \le X \le 9) = \sum_{x=7}^{9} b(x; 15, 0.4) = 0.3564$$
Normal approximation:
$$P(7 \le X \le 9) \approx P(6.5 < X < 9.5) = P(0.26 < Z < 1.85) = 0.3652$$

[Figure: Normal approximation of b(4; 15, 0.4) and of the sum of b(x; 15, 0.4) for x = 7 to 9]
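The comparison between exact binomial probabilities and their normal approximation can be sketched in Python as below. The helper names `binom_pmf` and `normal_approx` are assumptions made for this example; `math.comb` and `statistics.NormalDist` require Python 3.8 or later.

```python
import math
from statistics import NormalDist

def binom_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

def normal_approx(lo, hi, n, p):
    """Approximate P(lo <= X <= hi) using the continuity correction of 0.5."""
    dist = NormalDist(n * p, math.sqrt(n * p * (1 - p)))
    return dist.cdf(hi + 0.5) - dist.cdf(lo - 0.5)

n, p = 15, 0.4
exact_4 = binom_pmf(4, n, p)
approx_4 = normal_approx(4, 4, n, p)
exact_7_9 = sum(binom_pmf(x, n, p) for x in range(7, 10))
approx_7_9 = normal_approx(7, 9, n, p)
print(exact_4, approx_4, exact_7_9, approx_7_9)
```

The approximations printed here use unrounded z values, so they can differ slightly from the figures obtained with z rounded to two decimals for the table.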
• The degree of accuracy, that is, how well the normal curve fits the binomial histogram, will increase as n increases.
• If the value of n is small and p is not very close to 1/2, the normal curve will not fit the histogram well.
[Figure: Histograms of b(x; 6, 0.2) and b(x; 15, 0.2) with the approximating normal curves]
• The approximation using the normal curve will be excellent when n is large, or when n is small with p reasonably close to 1/2.
• As a rule of thumb, if both np and nq are greater than or equal to 5, the approximation will be good.
• Let X be a binomial random variable with parameters n and p. For large n, X has approximately a normal distribution with μ = np and σ² = npq = np(1 – p), and

$$P(X \le x) = \sum_{k=0}^{x} b(k; n, p) \approx \text{area under the normal curve to the left of } x + 0.5 = P\!\left(Z \le \frac{(x + 0.5) - \mu}{\sigma}\right)$$

and the approximation will be good if np and nq = n(1 – p) are greater than or equal to 5.
The probability that a patient recovers from a rare blood disease is
0.4. If 100 people are known to have contracted this disease, what is
the probability that less than 30 survive?
$$n = 100, \quad p = 0.4$$
$$\mu = np = (100)(0.4) = 40, \qquad \sigma = \sqrt{npq} = \sqrt{(100)(0.4)(0.6)} = 4.899$$
$$P(X < 30) = \sum_{x=0}^{29} b(x; 100, 0.4) \approx P(X < 29.5)$$
$$z = \frac{29.5 - 40}{4.899} = -2.143$$
$$P(Z < -2.143) = 0.01608 = 1.608\% \quad \text{(after interpolation)}$$
• Can you calculate the exact solution?
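One way to approach the exact solution is to sum the 30 binomial terms by computer. The sketch below does this with exact integer binomial coefficients from `math.comb` (Python 3.8 or later); the name `binom_cdf` is illustrative.

```python
import math

def binom_cdf(x, n, p):
    """P(X <= x) for a binomial random variable, summed term by term."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))

exact = binom_cdf(29, 100, 0.4)   # P(X < 30) = P(X <= 29)
print(f"{exact:.5f}")
```

Comparing the printed value with 0.01608 shows how close the normal approximation is for n = 100 and p = 0.4.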
A multiple-choice quiz has 200 questions each with 4 possible
answers of which only 1 is the correct answer. What is the probability
that sheer guess-work yields from 25 to 30 correct answers for 80 of
the 200 problems about which the student has no knowledge?
$$n = 80, \quad p = \tfrac{1}{4}$$
$$\mu = np = (80)(\tfrac{1}{4}) = 20, \qquad \sigma = \sqrt{npq} = \sqrt{(80)(\tfrac{1}{4})(\tfrac{3}{4})} = 3.873$$
$$P(25 \le X \le 30) = \sum_{x=25}^{30} b(x; 80, \tfrac{1}{4}) \approx P(24.5 < X < 30.5)$$
$$z_1 = \frac{24.5 - 20}{3.873} = 1.162, \qquad z_2 = \frac{30.5 - 20}{3.873} = 2.711$$
$$P(1.162 < Z < 2.711) = P(Z < 2.711) - P(Z < 1.162) = 0.9966 - 0.8774 = 0.1192$$
The PU Physics entrance exam consists of 30 multiple-choice questions,
each with 4 possible answers of which only 1 is the correct answer.
What is the probability that a prospective student will obtain a
scholarship by correctly answering at least 80% of the questions just
by guessing?
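$$n = 30, \quad p = \tfrac{1}{4}$$
$$\mu = np = (30)(\tfrac{1}{4}) = 7.5, \qquad \sigma = \sqrt{npq} = \sqrt{(30)(\tfrac{1}{4})(\tfrac{3}{4})} = 2.372$$
Answering at least 80% of the 30 questions correctly means X ≥ 24:
$$P(X \ge 24) = \sum_{x=24}^{30} b(x; 30, \tfrac{1}{4}) \approx 1 - P(X < 23.5)$$
$$z = \frac{23.5 - 7.5}{2.372} = 6.745$$
$$1 - P(Z < 6.745) \approx 0$$
• It is practically impossible to get the scholarship by pure luck in the entrance exam.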
Chapter 6.6
Gamma and Exponential Distributions
• There are still numerous situations that the normal distribution cannot cover. For such situations, different types of density functions are required.
• Two such density functions are the gamma and exponential distributions.
• Both distributions find applications in queuing theory and reliability problems.
• The gamma function is defined by

$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha - 1} e^{-x}\,dx$$

for α > 0.
• |Gamma Distribution| The continuous random variable X has a gamma distribution, with parameters α and β, if its density function is given by

$$f(x) = \begin{cases} \dfrac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha - 1} e^{-x/\beta}, & x > 0 \\ 0, & \text{elsewhere} \end{cases}$$

where α > 0 and β > 0.
• |Exponential Distribution| The continuous random variable X has an exponential distribution, with parameter β, if its density function is given by

$$f(x) = \begin{cases} \dfrac{1}{\beta}\, e^{-x/\beta}, & x > 0 \\ 0, & \text{elsewhere} \end{cases}$$

where β > 0.
[Figure: Gamma distributions for certain values of the parameters α and β]
• The gamma distribution with α = 1 is called the exponential distribution.
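The relationship between the two densities can be verified numerically. In this sketch the function names `gamma_pdf` and `exponential_pdf` are illustrative, β = 5 is an arbitrary choice, and `math.gamma` supplies the gamma function Γ(α).

```python
import math

def gamma_pdf(x, alpha, beta):
    """Gamma density with parameters alpha and beta (zero for x <= 0)."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta**alpha * math.gamma(alpha))

def exponential_pdf(x, beta):
    """Exponential density with parameter beta (zero for x <= 0)."""
    return math.exp(-x / beta) / beta if x > 0 else 0.0

# With alpha = 1 the gamma density coincides with the exponential density.
for x in (0.5, 2.0, 7.0):
    print(x, gamma_pdf(x, 1, 5), exponential_pdf(x, 5))
```

For α = 1 the factor x^(α–1) equals 1 and Γ(1) = 1, which is why the two columns agree.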
• The mean and variance of the gamma distribution are

$$\mu = \alpha\beta \quad \text{and} \quad \sigma^2 = \alpha\beta^2$$

• The mean and variance of the exponential distribution are

$$\mu = \beta \quad \text{and} \quad \sigma^2 = \beta^2$$
Chapter 6.7
Applications of the Gamma and Exponential Distributions
Applications
Suppose that a system contains a certain type of component whose
time in years to failure is given by T. The random variable T is
modeled nicely by the exponential distribution with mean time to
failure β = 5.
If 5 of these components are installed in different systems, what is
the probability that at least 2 are still functioning at the end of 8
years?
$$P(T > 8) = \frac{1}{5} \int_8^{\infty} e^{-t/5}\,dt = e^{-8/5} \approx 0.2$$
• This is the probability that a component is still functioning at the end of 8 years.

Let X be the number of components, out of 5, that are still functioning at the end of 8 years. Then
$$P(X \ge 2) = \sum_{x=2}^{5} b(x; 5, 0.2) = 1 - \sum_{x=0}^{1} b(x; 5, 0.2) = 1 - 0.7373 = 0.2627$$
• This is the probability that at least 2 out of 5 such components are still functioning at the end of 8 years.
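The two-stage calculation, an exponential survival probability fed into a binomial, can be sketched as follows. The helper name `at_least` is hypothetical. The block also shows the effect of rounding e^(–8/5) to 0.2, as is done above.

```python
import math

beta = 5.0                        # mean time to failure, in years
p_survive = math.exp(-8 / beta)   # P(T > 8) for the exponential distribution

def at_least(k, n, p):
    """P(X >= k) for a binomial random variable with parameters n and p."""
    return 1 - sum(math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k))

rounded = at_least(2, 5, 0.2)           # p rounded to 0.2
unrounded = at_least(2, 5, p_survive)   # p = exp(-8/5) at full precision
print(round(p_survive, 4), round(rounded, 4), round(unrounded, 4))
```

Keeping p at full precision gives a slightly larger answer than 0.2627, because e^(–8/5) is a little above 0.2.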
Suppose that telephone calls arriving at a particular switchboard
follow a Poisson process with an average of 5 calls coming per
minute.
What is the probability that up to a minute will elapse until 2 calls
have come in to the switchboard?

The waiting time X until 2 calls have come in follows a gamma distribution with
$$\beta = \tfrac{1}{5}, \qquad \alpha = 2$$
• β is the mean time between calls
• α is the number of calls awaited

$$P(X \le x) = \int_0^x \frac{1}{\beta^2}\, t\, e^{-t/\beta}\,dt$$
$$P(X \le 1) = 25 \int_0^1 t\, e^{-5t}\,dt = 1 - e^{-5(1)}(1 + 5) = 0.96$$
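The switchboard probability can be cross-checked in two ways: with the closed form that holds when α is a positive integer, and by numerical integration of the gamma density. Both helpers below (`erlang_cdf`, `gamma_cdf_numeric`) are hypothetical names introduced for this sketch.

```python
import math

def erlang_cdf(x, k, beta):
    """P(X <= x) for a gamma distribution with integer alpha = k and parameter beta."""
    if x <= 0:
        return 0.0
    r = x / beta
    return 1 - math.exp(-r) * sum(r**i / math.factorial(i) for i in range(k))

def gamma_cdf_numeric(x, alpha, beta, steps=20_000):
    """The same probability by midpoint-rule integration of the gamma density."""
    h = x / steps
    const = beta**alpha * math.gamma(alpha)
    total = sum(((i + 0.5) * h) ** (alpha - 1) * math.exp(-(i + 0.5) * h / beta)
                for i in range(steps))
    return total * h / const

closed_form = erlang_cdf(1.0, 2, 1 / 5)
numeric = gamma_cdf_numeric(1.0, 2, 1 / 5)
print(round(closed_form, 4), round(numeric, 4))
```

For α = 2 and β = 1/5 the closed form reduces to 1 – e^(–5)(1 + 5), the expression evaluated above.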
Based on extensive testing, it is determined that the average time
Y before a washing machine requires a major repair is 4 years. This
time is known to be modeled well by the exponential distribution.
The machine is considered a bargain if it is unlikely to
require a major repair before the sixth year.
(a) Determine the probability that it survives for more than 6 years
without a major repair.
(b) What is the probability that a major repair occurs in the first
year?
(a)
$$P(Y > 6) = \frac{1}{4} \int_6^{\infty} e^{-t/4}\,dt = e^{-6/4} = 0.223$$
• Only 22.3% survive for more than 6 years without a major repair.
(b)
$$P(Y < 1) = \frac{1}{4} \int_0^1 e^{-t/4}\,dt = 1 - \frac{1}{4} \int_1^{\infty} e^{-t/4}\,dt = 1 - e^{-1/4} = 0.221$$
• 22.1% will need a major repair within the first year of use.
Chapter 6.8
Chi-Squared Distribution
• Another very important special case of the gamma distribution is obtained by letting α = v/2 and β = 2, where v is a positive integer.
• The result is called the chi-squared distribution, with a single parameter v called the degrees of freedom.
• The chi-squared distribution plays a vital role in statistical inference. It has considerable application in both methodology and theory.
• Many chapters ahead of us will contain important applications of this distribution.
• |Chi-Squared Distribution| The continuous random variable X has a chi-squared distribution, with v degrees of freedom, if its density function is given by

$$f(x) = \begin{cases} \dfrac{1}{2^{v/2}\,\Gamma(v/2)}\, x^{v/2 - 1} e^{-x/2}, & x > 0 \\ 0, & \text{elsewhere} \end{cases}$$

where v is a positive integer.
• The mean and variance of the chi-squared distribution are

$$\mu = v \quad \text{and} \quad \sigma^2 = 2v$$
Chapter 6.9
Lognormal Distribution
• The lognormal distribution is used for a wide variety of applications.
• The distribution applies in cases where a natural log transformation results in a normal distribution.
• |Lognormal Distribution| The continuous random variable X has a lognormal distribution if the random variable Y = ln(X) has a normal distribution with mean μ and standard deviation σ. The resulting density function of X is

$$f(x) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,\sigma x}\, e^{-\frac{[\ln(x) - \mu]^2}{2\sigma^2}}, & x > 0 \\ 0, & x \le 0 \end{cases}$$

• The mean and variance of the lognormal distribution are

$$E(X) = e^{\mu + \sigma^2/2} \quad \text{and} \quad \mathrm{Var}(X) = e^{2\mu + \sigma^2}\,(e^{\sigma^2} - 1)$$
Concentrations of pollutants produced by chemical plants are historically
known to exhibit behavior that resembles a lognormal
distribution. This is important when one considers issues regarding
compliance with government regulations.
Suppose it is assumed that the concentration of a certain pollutant,
in parts per million, has a lognormal distribution with parameters μ =
3.2 and σ = 1.
What is the probability that the concentration exceeds 8 parts per
million?
$$P(X > 8) = 1 - P(X \le 8)$$
$$P(X \le 8) = F\!\left(\frac{\ln(8) - 3.2}{1}\right) = F(-1.12) = 0.1314$$
so that P(X > 8) = 1 – 0.1314 = 0.8686.
• F denotes the cumulative distribution function of the standard normal distribution, a.k.a. the area under the normal curve.
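The lognormal calculation reduces to a normal one after taking logarithms, which the sketch below makes explicit. The function names `lognormal_sf` and `lognormal_mean_var` are illustrative, and `statistics.NormalDist` requires Python 3.8 or later.

```python
import math
from statistics import NormalDist

def lognormal_sf(x, mu, sigma):
    """P(X > x) when ln(X) is normal with mean mu and standard deviation sigma."""
    return 1 - NormalDist(mu, sigma).cdf(math.log(x))

def lognormal_mean_var(mu, sigma):
    """Mean and variance of the lognormal distribution."""
    mean = math.exp(mu + sigma**2 / 2)
    var = math.exp(2 * mu + sigma**2) * (math.exp(sigma**2) - 1)
    return mean, var

# Pollutant concentration with mu = 3.2 and sigma = 1, threshold 8 ppm.
p_exceeds_8 = lognormal_sf(8, 3.2, 1.0)
mean, var = lognormal_mean_var(3.2, 1.0)
print(round(p_exceeds_8, 4), mean, var)
```

The probability can differ from the table-based value in the last digit because the table route rounds z to –1.12.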
Probability and Statistics
Homework 7
1. Suppose the current measurements in a strip of wire are assumed to
follow a normal distribution with a mean of 10 milliamperes and a
variance of 4 (milliamperes)². (a) What is the probability that a
measurement will exceed 13 milliamperes? (b) Determine the value for
which the probability that a current measurement is below this value is
98%.
(Mo.E4.13-14 p.113)
2. A lawyer commutes daily from his suburban home to his midtown office. The
average time for a one-way trip is 24 minutes, with a standard deviation
of 3.8 minutes. Assume the trip times to be normally
distributed. (a) If the office opens at 9:00 A.M. and the lawyer leaves his
house at 8:45 A.M. daily, what percentage of the time is he late for work?
(b) Find the probability that 2 of the next 3 trips will take at least 1/2
hour.
(Wa.6.15 s.186)
3. (a) Suppose that a sample of 1600 tires of the same type is obtained at
random from an ongoing production process in which 8% of all such tires
produced are defective. What is the probability that in such a sample 150
or fewer tires will be defective?
(Sou18. CD6-13)
(b) If 10% of men are bald, what is the probability that more than 100 in
a random sample of 818 men are bald?