Slide 1 - KAIST Department of Mathematical Sciences


Chapter 3.
Discrete Probability Distributions
3.1 The Binomial Distribution
3.2 The Geometric and Negative Binomial Distributions
3.3 The Hypergeometric Distribution
3.4 The Poisson Distribution
3.5 The Multinomial Distribution
NIPRL
3.1 The Binomial Distribution
3.1.1 Bernoulli Random Variables(1/2)
• To model
- the outcome of a coin toss,
- whether a valve is open or shut,
- whether an item is defective or not,
- any other process that has only two possible outcomes.
• The outcomes are labeled 0 and 1
• The random variable is defined by the parameter p, $0 \le p \le 1$,
which is the probability that the outcome is 1.
3.1.1 Bernoulli Random Variables(2/2)
• Expectations
\[
\begin{aligned}
E(X) &= (0 \times P(X = 0)) + (1 \times P(X = 1)) \\
     &= (0 \times (1 - p)) + (1 \times p) = p \\
E(X^2) &= (0^2 \times P(X = 0)) + (1^2 \times P(X = 1)) = p \\
\mathrm{Var}(X) &= E(X^2) - (E(X))^2 = p - p^2 = p(1 - p)
\end{aligned}
\]
3.1.2 Definition of the Binomial Distribution(1/5)
• Consider an experiment consisting of
– n Bernoulli trials (X1, ……, Xn)
– that are independent and
– that each have a constant probability p of success.
• Then the total number of successes X, that is X=X1+…+Xn, is a
random variable that has a binomial distribution with parameters
n and p, which is written
X~B(n,p)
3.1.2 Definition of the Binomial Distribution(2/5)
• The probability mass function of a B(n,p) random variable is
\[
P(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}
\]
for x = 0, 1, ..., n, with
\[
\begin{aligned}
E(X) &= E(X_1) + \cdots + E(X_n) = p + \cdots + p = np \\
\mathrm{Var}(X) &= \mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_n) = p(1 - p) + \cdots + p(1 - p) = np(1 - p)
\end{aligned}
\]
3.1.2 Definition of the Binomial Distribution(3/5)
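• Ex) X ~ B(8, 0.5)
\[
P(X = 3) = \binom{8}{3} \times 0.5^{3} \times (1 - 0.5)^{5} = \frac{8!}{3!\,5!} \times 0.5^{8} = 0.219
\]
\[
\begin{aligned}
P(X \le 1) &= P(X = 0) + P(X = 1) \\
&= \binom{8}{0} \times 0.5^{0} \times (1 - 0.5)^{8} + \binom{8}{1} \times 0.5^{1} \times (1 - 0.5)^{7} \\
&= \frac{8!}{0!\,8!} \times 0.5^{8} + \frac{8!}{1!\,7!} \times 0.5^{8} = 0.004 + 0.031 = 0.035
\end{aligned}
\]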
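The B(8, 0.5) calculations above can be reproduced numerically. The following is a minimal Python sketch, not part of the original slides; the helper name `binomial_pmf` is an illustrative choice, and the code assumes Python 3.8 or later for `math.comb`.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 8, 0.5  # the B(8, 0.5) example above
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

p_exactly_3 = pmf[3]
p_at_most_1 = pmf[0] + pmf[1]

# Mean and variance computed directly from the pmf.
mean = sum(x * q for x, q in enumerate(pmf))
var = sum((x - mean) ** 2 * q for x, q in enumerate(pmf))

print(f"P(X = 3)  = {p_exactly_3:.3f}")
print(f"P(X <= 1) = {p_at_most_1:.3f}")
print(mean, n * p)             # E(X) against np
print(var, n * p * (1 - p))    # Var(X) against np(1 - p)
```

Computing the moments from the pmf rather than from the formulas gives an independent check that E(X) = np and Var(X) = np(1 - p).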
3.1.2 Definition of the Binomial Distribution(4/5)
• Ex) X ~ B(8, 0.5)
[Figure: bar chart of the probability mass function of X ~ B(8, 0.5); horizontal axis: x; vertical axis: Probability]

x    P(X = x)    P(X ≤ x)
0    0.004       0.004
1    0.031       0.035
2    0.109       0.144
3    0.219       0.363
4    0.273       0.636
5    0.219       0.855
6    0.109       0.965
7    0.031       0.996
8    0.004       1.000
3.1.2 Definition of the Binomial Distribution(5/5)
• Symmetric Binomial Distributions :
A B(n,0.5) distribution is a symmetric probability distribution for
any value of the parameter n. The distribution is symmetric
about the expected value n/2.
Example 24 : Air Force Scrambles(1/3)
• There are 16 planes.
• There is a probability of 0.25 that the engines of a particular plane do
not start at a given attempt.
• Then, the number of planes successfully launched has a
binomial distribution with parameters n = 16 and p = 0.75.
• The expected number of planes launched is
\[
E(X) = np = 16 \times 0.75 = 12
\]
Example 24 : Air Force Scrambles(2/3)
• The variance is
\[
\mathrm{Var}(X) = np(1 - p) = 16 \times 0.75 \times 0.25 = 3
\]
• The probability that exactly 12 planes scramble successfully is
\[
P(X = 12) = \binom{16}{12} \times 0.75^{12} \times 0.25^{4} = \frac{16!}{12!\,4!} \times 0.75^{12} \times 0.25^{4} = 0.225
\]
• The probability that at least 14 planes scramble successfully is
\[
\begin{aligned}
P(X \ge 14) &= P(X = 14) + P(X = 15) + P(X = 16) \\
&= \binom{16}{14} \times 0.75^{14} \times 0.25^{2} + \binom{16}{15} \times 0.75^{15} \times 0.25^{1} + \binom{16}{16} \times 0.75^{16} \times 0.25^{0} \\
&= 0.134 + 0.054 + 0.010 = 0.198
\end{aligned}
\]
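As a cross-check of the Air Force example, the two probabilities can be computed directly from the B(16, 0.75) probability mass function. This is a minimal sketch rather than part of the slides; the function name `pmf` is illustrative.

```python
from math import comb

n, p = 16, 0.75  # 16 planes, each launching with probability 0.75


def pmf(x: int) -> float:
    """P(X = x) for X ~ B(16, 0.75)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


p_exactly_12 = pmf(12)
p_at_least_14 = sum(pmf(x) for x in range(14, n + 1))

print(f"P(X = 12)  = {p_exactly_12:.3f}")
print(f"P(X >= 14) = {p_at_least_14:.3f}")
```

Summing the unrounded terms gives about 0.197 for P(X ≥ 14); the value 0.198 on the slide is the sum of the three terms after each has been rounded to three decimals.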
Example 24 : Air Force Scrambles(3/3)
[Figure: bar chart of the probability mass function of X ~ B(16, 0.75); horizontal axis: x; vertical axis: Probability]

x     P(X = x)    P(X ≤ x)
0     0.000       0.000
1     0.000       0.000
2     0.000       0.000
3     0.000       0.000
4     0.000       0.000
5     0.000       0.000
6     0.001       0.001
7     0.006       0.007
8     0.020       0.027
9     0.052       0.079
10    0.110       0.189
11    0.180       0.369
12    0.225       0.594
13    0.208       0.802
14    0.134       0.936
15    0.054       0.990
16    0.010       1.000
Proportion of successes in Bernoulli Trials
• Let $X \sim B(n, p)$.
• Then, if $Y = X/n$,
\[
E(Y) = p, \qquad \mathrm{Var}(Y) = \frac{p(1 - p)}{n}
\]
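The mean and variance of the proportion Y = X/n can be verified exactly from the binomial probability mass function. The sketch below is an illustration only; the values n = 16 and p = 0.75 are assumed for the demonstration and any other valid pair would do.

```python
from math import comb

n, p = 16, 0.75  # illustrative values (assumed, not prescribed by the slide)

# pmf of X ~ B(n, p); Y = X / n takes the value x / n with the same probability.
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

mean_y = sum((x / n) * q for x, q in enumerate(pmf))
var_y = sum((x / n - mean_y) ** 2 * q for x, q in enumerate(pmf))

print(mean_y, p)                 # E(Y) against p
print(var_y, p * (1 - p) / n)    # Var(Y) against p(1 - p)/n
```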
3.2 The Geometric and Negative Binomial Distributions
3.2.1 Definition of the Geometric Distribution(1/2)
• The number of trials up to and including the first success in a
sequence of independent Bernoulli trials with a constant
success probability p has a geometric distribution with
parameter p.
• The probability mass function is
\[
P(X = x) = (1 - p)^{x - 1} p
\]
for x = 1, 2, 3, 4, ....
3.2.1 Definition of the Geometric Distribution(2/2)
• The cumulative distribution function is
\[
P(X \le x) = 1 - (1 - p)^{x}
\]
• The expectation and variance are
\[
E(X) = \frac{1}{p}, \qquad \mathrm{Var}(X) = \frac{1 - p}{p^{2}}
\]
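The geometric formulas can be checked with a few lines of Python. This is a hedged sketch: the function names and the value p = 0.75 are illustrative assumptions, and the infinite sums for the mean and variance are truncated at 200 terms, which is more than enough for this p.

```python
def geometric_pmf(x: int, p: float) -> float:
    """P(X = x) = (1 - p)**(x - 1) * p for x = 1, 2, 3, ..."""
    return (1 - p) ** (x - 1) * p


def geometric_cdf(x: int, p: float) -> float:
    """P(X <= x) = 1 - (1 - p)**x."""
    return 1 - (1 - p) ** x


p = 0.75  # illustrative success probability (assumed)

# The closed-form cdf should agree with the partial sum of the pmf.
partial_sum = sum(geometric_pmf(k, p) for k in range(1, 4))
print(partial_sum, geometric_cdf(3, p))

# Truncated series for the mean and variance; the neglected tail is tiny here.
xs = range(1, 201)
mean = sum(x * geometric_pmf(x, p) for x in xs)
var = sum(x**2 * geometric_pmf(x, p) for x in xs) - mean**2
print(mean, 1 / p)
print(var, (1 - p) / p**2)
```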
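- Expectation of X:
\[
E[X] = \sum_{x=1}^{\infty} x (1 - p)^{x-1} p = p \sum_{x=1}^{\infty} x (1 - p)^{x-1} = p \cdot \frac{1}{p^{2}} = \frac{1}{p}
\]
- Variance of X:
\[
\mathrm{Var}(X) = E[X^{2}] - E^{2}[X]
\]
\[
E[X^{2}] = \sum_{x=1}^{\infty} x^{2} (1 - p)^{x-1} p = p \sum_{x=1}^{\infty} x^{2} (1 - p)^{x-1}
\]
\[
\sum_{x=1}^{\infty} x^{2} (1 - p)^{x-1} = \frac{1}{p}\left( \frac{1}{p} + \frac{2(1 - p)}{p^{2}} \right) = \frac{2 - p}{p^{3}}
\]
\[
\therefore\ \mathrm{Var}(X) = \frac{2 - p}{p^{2}} - \frac{1}{p^{2}} = \frac{1 - p}{p^{2}}
\]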
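The derivation above relies on two power-series sums that the slide does not spell out. As a reminder (standard results, written here with the shorthand q = 1 - p, which is notation introduced only for this note, and valid for |q| < 1), both follow from differentiating the geometric series term by term:

```latex
\[
\sum_{x=0}^{\infty} q^{x} = \frac{1}{1-q}
\;\Longrightarrow\;
\sum_{x=1}^{\infty} x\,q^{x-1} = \frac{1}{(1-q)^{2}} = \frac{1}{p^{2}},
\]
\[
\sum_{x=1}^{\infty} x^{2} q^{x-1}
  = \frac{d}{dq}\left( \sum_{x=1}^{\infty} x\,q^{x} \right)
  = \frac{d}{dq}\left( \frac{q}{(1-q)^{2}} \right)
  = \frac{1+q}{(1-q)^{3}}
  = \frac{2-p}{p^{3}} .
\]
```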
Example 24 : Air Force Scrambles(1/2)
• If the mechanics are unsuccessful in starting the engines, then
they must wait 5 minutes before trying again.
• The number of attempts needed to start a plane’s engines has a
geometric distribution with p = 0.75.
• The probability that the engines start on the third attempt is
\[
P(X = 3) = 0.25^{2} \times 0.75 = 0.047
\]
Example 24 : Air Force Scrambles(2/2)
• The probability that the plane is launched within 10 minutes of
the first attempt to start the engines, that is, on one of the first
three attempts (made at 0, 5, and 10 minutes), is
\[
P(X \le 3) = 1 - 0.25^{3} = 0.984
\]
• The expected number of attempts required to start the engines
is
\[
E(X) = \frac{1}{p} = \frac{1}{0.75} = 1.33
\]
3.2.2 Definition of the Negative Binomial Distribution(1/2)
• The number of trials up to and including the rth success in a
sequence of independent Bernoulli trials with a constant success
probability p has a negative binomial distribution with parameters p
and r.
• The probability mass function is
\[
P(X = x) = \binom{x - 1}{r - 1} (1 - p)^{x - r} p^{r}
\]
for x = r, r + 1, r + 2, ....
3.2.2 Definition of the Negative Binomial Distribution(2/2)
• The expectation and variance are
\[
E(X) = \frac{r}{p}, \qquad \mathrm{Var}(X) = \frac{r(1 - p)}{p^{2}}
\]
Example 12 : Personnel Recruitment(1/2)
• Suppose that a company wishes to hire three new workers and
each applicant interviewed has a probability of 0.6 of being
found acceptable.
• The total number of applicants that the company needs to interview
has a negative binomial distribution with parameters p = 0.6 and r = 3.
• The probability that exactly six applicants need to be interviewed
is
\[
P(X = 6) = \binom{5}{2} \times 0.4^{3} \times 0.6^{3} = 0.138
\]
Example 12 : Personnel Recruitment(2/2)
• If the company has a budget that allows up to six applicants to be
interviewed, then the probability that the budget is sufficient is
\[
\begin{aligned}
P(X \le 6) &= P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6) \\
&= 0.216 + 0.259 + 0.207 + 0.138 = 0.820
\end{aligned}
\]
• The expected number of interviews required is
\[
E(X) = \frac{r}{p} = \frac{3}{0.6} = 5
\]
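The recruitment figures can be reproduced from the negative binomial probability mass function. The following is a minimal sketch, not taken from the slides; `negative_binomial_pmf` is an illustrative helper name.

```python
from math import comb


def negative_binomial_pmf(x: int, r: int, p: float) -> float:
    """P(X = x): the r-th success occurs on trial x, for x = r, r + 1, ..."""
    return comb(x - 1, r - 1) * (1 - p) ** (x - r) * p**r


r, p = 3, 0.6  # three hires, each applicant acceptable with probability 0.6

p_exactly_6 = negative_binomial_pmf(6, r, p)
p_within_budget = sum(negative_binomial_pmf(x, r, p) for x in range(r, 7))
expected_interviews = r / p

print(f"P(X = 6)  = {p_exactly_6:.3f}")
print(f"P(X <= 6) = {p_within_budget:.4f}")
print(f"E(X)      = {expected_interviews:.1f}")
```

The unrounded sum for P(X ≤ 6) is 0.8208; the 0.820 on the slide comes from adding the four terms after rounding each to three decimals.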
3.3 The Hypergeometric Distribution
3.3.1 Definition of the Hypergeometric Distribution(1/3)
• Consider a collection of N items of which r are of a certain kind.
• If one of the items is chosen at random, the probability that it is
of the special kind is clearly
\[
p = \frac{r}{N}
\]
• Consequently, if n items are chosen at random with replacement,
then clearly the distribution of X, the number of items of the special
kind chosen, is
\[
X \sim B(n, r/N)
\]
3.3.1 Definition of the Hypergeometric Distribution(2/3)
• However, if n items are chosen at random without replacement,
then the distribution of X is the hypergeometric distribution.
• The hypergeometric distribution has a probability mass function
given by
\[
P(X = x) = \frac{\binom{r}{x} \binom{N - r}{n - x}}{\binom{N}{n}}
\]
for $\max\{0, n + r - N\} \le x \le \min\{n, r\}$.
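A short sketch of the hypergeometric probability mass function, including the support bounds stated above. It is an illustration rather than the slides' own code; the function name and the values N = 16, r = 6, n = 5 are assumptions chosen for the demonstration.

```python
from math import comb


def hypergeometric_pmf(x: int, N: int, r: int, n: int) -> float:
    """P(X = x) when n items are drawn without replacement from N items,
    r of which are of the special kind."""
    lo, hi = max(0, n + r - N), min(n, r)
    if not lo <= x <= hi:
        return 0.0  # outside the support
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)


# Illustrative values (assumed): N = 16 items, r = 6 special, n = 5 drawn.
N, r, n = 16, 6, 5
pmf = [hypergeometric_pmf(x, N, r, n) for x in range(n + 1)]
print(pmf)
print(sum(pmf))   # the probabilities over the support sum to 1
```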
3.3.1 Definition of the Hypergeometric Distribution(3/3)
• The expectation and variance are
\[
E(X) = \frac{nr}{N}
\]
\[
\mathrm{Var}(X) = \left( \frac{N - n}{N - 1} \right) n \, \frac{r}{N} \left( 1 - \frac{r}{N} \right)
\]
• It represents the distribution of the number of items of a certain
kind in a random sample of size n drawn without replacement
from a population of size N that contains r items of this kind.
• Let X and Y be independent binomial random variables such that
\[
X \sim B(r, p), \qquad Y \sim B(N - r, p)
\]
Let us consider the conditional probability mass function of X given that
X+Y=n. Because X and Y are independent and share the same success
probability p, their sum satisfies X+Y ~ B(N, p).
\[
\begin{aligned}
P\{X = x \mid X + Y = n\}
&= \frac{P\{X = x,\ X + Y = n\}}{P\{X + Y = n\}}
 = \frac{P\{X = x,\ Y = n - x\}}{P\{X + Y = n\}} \\
&= \frac{P\{X = x\}\, P\{Y = n - x\}}{P\{X + Y = n\}} \\
&= \frac{\binom{r}{x} p^{x} (1 - p)^{r - x} \binom{N - r}{n - x} p^{n - x} (1 - p)^{N - r - n + x}}{\binom{N}{n} p^{n} (1 - p)^{N - n}} \\
&= \frac{\binom{r}{x} \binom{N - r}{n - x}}{\binom{N}{n}}
\end{aligned}
\]
That is, the conditional distribution of X
given the value of X+Y is hypergeometric!
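The conditioning argument can be checked numerically: for any p, the conditional probabilities should coincide with the hypergeometric ones. The sketch below is illustrative only; the helper names and the values N = 16, r = 6, n = 5 are assumptions, and it uses the fact that X + Y ~ B(N, p).

```python
from math import comb


def binom_pmf(k: int, m: int, p: float) -> float:
    """P(K = k) for K ~ B(m, p)."""
    return comb(m, k) * p**k * (1 - p) ** (m - k)


def conditional_pmf(x: int, N: int, r: int, n: int, p: float) -> float:
    """P{X = x | X + Y = n} for independent X ~ B(r, p), Y ~ B(N - r, p)."""
    # X + Y ~ B(N, p) because the two counts share the same p.
    return binom_pmf(x, r, p) * binom_pmf(n - x, N - r, p) / binom_pmf(n, N, p)


def hypergeometric_pmf(x: int, N: int, r: int, n: int) -> float:
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)


N, r, n = 16, 6, 5  # illustrative values (assumed)
for p in (0.2, 0.5, 0.9):
    diffs = [abs(conditional_pmf(x, N, r, n, p) - hypergeometric_pmf(x, N, r, n))
             for x in range(n + 1)]
    print(p, max(diffs))  # differences are at floating-point level for every p
```

The parameter p cancels, which is why the printed gaps stay at rounding-error size for all three values.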
• Expectation of X:
Let X_i be the following random variable:
X_i = 1 if the ith selection is acceptable; 0 otherwise.
Then,
\[
P\{X_i = 1\} = \frac{r}{N}
\]
Let X be the hypergeometric random variable with parameters
(r, N, n). Then,
\[
X = \sum_{i=1}^{n} X_i
\]
This gives
\[
E[X] = \sum_{i=1}^{n} E[X_i] = \sum_{i=1}^{n} P\{X_i = 1\} = \frac{nr}{N}
\]
• Variance of X:
\[
\mathrm{Var}(X) = \mathrm{Var}\left( \sum_{i=1}^{n} X_i \right) = \sum_{i=1}^{n} \mathrm{Var}(X_i) + 2 \sum_{i=1}^{n} \sum_{j > i} \mathrm{Cov}(X_i, X_j)
\]
Since X_i is a Bernoulli random variable,
\[
\mathrm{Var}(X_i) = P\{X_i = 1\}(1 - P\{X_i = 1\}) = \frac{r}{N} \cdot \frac{N - r}{N}
\]
Also for i<j,
\[
\begin{aligned}
\mathrm{Cov}(X_i, X_j) &= E[X_i X_j] - E[X_i] E[X_j] \\
E[X_i X_j] &= P\{X_i X_j = 1\} = P\{X_i = 1, X_j = 1\} \\
P\{X_i = 1, X_j = 1\} &= P\{X_i = 1\}\, P\{X_j = 1 \mid X_i = 1\} = \frac{r}{N} \cdot \frac{r - 1}{N - 1} \\
\mathrm{Cov}(X_i, X_j) &= \frac{r}{N} \cdot \frac{r - 1}{N - 1} - \left( \frac{r}{N} \right)^{2} = -\frac{r(N - r)}{N^{2}(N - 1)}
\end{aligned}
\]
Hence,
\[
\mathrm{Var}(X) = \frac{n\, r(N - r)}{N^{2}} - \frac{n(n - 1)\, r(N - r)}{N^{2}(N - 1)} = n \, \frac{r}{N} \left( 1 - \frac{r}{N} \right) \frac{N - n}{N - 1}
\]
Cf. Let p = r/N. Then,
\[
E[X] = np \quad \text{and} \quad \mathrm{Var}(X) = \frac{N - n}{N - 1}\, np(1 - p)
\]
As N goes to infinity,
\[
\mathrm{Var}(X) \to np(1 - p)
\]
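The limit in the last remark can be illustrated by holding n and p = r/N fixed while N grows. This is a minimal sketch under assumed values (n = 5, p = 0.375); the function name is illustrative.

```python
def hypergeometric_var(N: int, r: int, n: int) -> float:
    """Var(X) = (N - n)/(N - 1) * n * (r/N) * (1 - r/N)."""
    p = r / N
    return (N - n) / (N - 1) * n * p * (1 - p)


n, p = 5, 0.375  # fixed sample size and proportion (illustrative, assumed)
binomial_var = n * p * (1 - p)

for N in (16, 160, 1600, 16000):
    r = round(p * N)  # keep r / N equal to p
    print(N, hypergeometric_var(N, r, n), binomial_var)
```

The factor (N - n)/(N - 1) is below 1 for n > 1, so sampling without replacement always gives a smaller variance than the binomial value, and the gap closes as N increases.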
Example 17 : Milk Container Contents(1/3)
• Suppose that milk is shipped to retail outlets in boxes that hold
16 milk containers. One particular box, which happens to
contain six underweight containers, is opened for inspection,
and five containers are chosen at random.
• The number of underweight milk containers in the sample chosen by
the inspector has a hypergeometric distribution
with N=16, r=6, and n=5.
Example 17 : Milk Container Contents(2/3)
• The probability that the inspector chooses exactly two
underweight containers is
\[
P(X = 2) = \frac{\binom{6}{2} \binom{10}{3}}{\binom{16}{5}}
= \frac{\left( \frac{6!}{2!\,4!} \right) \times \left( \frac{10!}{3!\,7!} \right)}{\left( \frac{16!}{5!\,11!} \right)} = 0.412
\]
• The expected number of underweight containers chosen by the
inspector is
\[
E(X) = \frac{nr}{N} = \frac{5 \times 6}{16} = 1.875
\]
Example 17 : Milk Container Contents(3/3)
[Figure: bar chart of the probability mass function; horizontal axis: number of underweight milk containers found by inspector; vertical axis: Probability]

x    P(X = x)
0    0.058
1    0.288
2    0.412
3    0.206
4    0.034
5    0.002
3.4 The Poisson Distribution
3.4.1 Definition of the Poisson Distribution(1/3)
• The Poisson distribution is used to model
– the number of defects in an item,
– the number of radioactive particles emitted by a substance,
– the number of telephone calls received by an operator within
a certain time limit.
• That is, the number of “events” that occur within certain
specified boundaries.
3.4.1 Definition of the Poisson Distribution(2/3)
• A random variable X distributed as a Poisson random variable
with parameter λ, which is written
\[
X \sim P(\lambda)
\]
has a probability mass function
\[
P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}
\]
for x = 0, 1, 2, 3, ....
3.4.1 Definition of the Poisson Distribution(3/3)
• The Poisson distribution is often useful to model the number of
times that a certain event occurs per unit of time, distance, or
volume, and it has a mean and variance both equal to the
parameter value λ.
• The expectation and variance are
\[
E(X) = \mathrm{Var}(X) = \lambda
\]
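That the mean and variance both equal λ can be confirmed numerically from the probability mass function. The sketch below is illustrative: λ = 3 is an assumed value, `poisson_pmf` is a hypothetical helper name, and the infinite support is truncated at x = 100, where the remaining tail is negligible for this λ.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) = exp(-lam) * lam**x / x! for x = 0, 1, 2, ..."""
    return exp(-lam) * lam**x / factorial(x)


lam = 3.0  # illustrative parameter (assumed)

# Truncate the infinite support; the tail beyond x = 100 is negligible here.
xs = range(101)
total = sum(poisson_pmf(x, lam) for x in xs)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
var = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)

print(total, mean, var)   # approximately 1, lam, lam
```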
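• The Poisson random variable can be used as an approximation for
a binomial random variable with parameters (n,p) when n is large
and p is small:
Let X be a binomial random variable with parameters (n,p) and
let $\lambda = np$. Then,
\[
P\{X = i\} = \frac{n!}{(n - i)!\, i!}\, p^{i} (1 - p)^{n - i}
= \frac{n!}{(n - i)!\, i!} \left( \frac{\lambda}{n} \right)^{i} \left( 1 - \frac{\lambda}{n} \right)^{n - i}
\]
That is,
\[
P\{X = i\} = \frac{n(n - 1) \cdots (n - i + 1)}{n^{i}} \, \frac{\lambda^{i}}{i!} \, \frac{(1 - \lambda / n)^{n}}{(1 - \lambda / n)^{i}}
\]
For large n and small p,
\[
\left( 1 - \frac{\lambda}{n} \right)^{n} \approx e^{-\lambda}, \qquad
\frac{n(n - 1) \cdots (n - i + 1)}{n^{i}} \approx 1, \qquad
\left( 1 - \frac{\lambda}{n} \right)^{i} \approx 1
\]
Hence,
\[
P\{X = i\} \approx e^{-\lambda} \frac{\lambda^{i}}{i!}
\]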
Example 3 : Software Errors(1/3)
• Suppose that the number of errors in a piece of software has a
Poisson distribution with parameter λ=3.
• [The expected number of errors]
= [The variance in the number of errors] = 3.
• The probability that a piece of software has no error is
\[
P(X = 0) = \frac{e^{-3} \times 3^{0}}{0!} = e^{-3} = 0.050
\]
Example 3 : Software Errors(2/3)
• The probability that there are three or more errors in a piece of
software is
\[
\begin{aligned}
P(X \ge 3) &= 1 - P(X = 0) - P(X = 1) - P(X = 2) \\
&= 1 - \frac{e^{-3} \times 3^{0}}{0!} - \frac{e^{-3} \times 3^{1}}{1!} - \frac{e^{-3} \times 3^{2}}{2!} \\
&= 1 - e^{-3} \left( \frac{1}{1} + \frac{3}{1} + \frac{9}{2} \right) \\
&= 1 - 0.423 = 0.577
\end{aligned}
\]
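The two software-error probabilities can be reproduced in a few lines. This is a minimal sketch rather than part of the slides; `poisson_pmf` is an illustrative name.

```python
from math import exp, factorial

lam = 3  # Poisson parameter from the example


def poisson_pmf(x: int) -> float:
    """P(X = x) for X ~ P(3)."""
    return exp(-lam) * lam**x / factorial(x)


p_no_errors = poisson_pmf(0)
p_three_or_more = 1 - sum(poisson_pmf(x) for x in range(3))

print(f"P(X = 0)  = {p_no_errors:.3f}")
print(f"P(X >= 3) = {p_three_or_more:.3f}")
```

Using the complement keeps the calculation finite: only the three terms for x = 0, 1, 2 are needed instead of the infinite tail.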
Example 3 : Software Errors(3/3)
[Figure: bar chart of the probability mass function of X ~ P(3); horizontal axis: number of software errors; vertical axis: Probability]

x     P(X = x)
0     0.050
1     0.149
2     0.224
3     0.224
4     0.168
5     0.101
6     0.050
7     0.022
8     0.008
9     0.003
10    0.001
11    0.000
12    0.000
13    0.000
3.5 The Multinomial Distribution
3.5.1 Definition of the Multinomial Distribution(1/2)
• Consider a sequence of n independent trials where each
individual trial can have k outcomes that occur with constant
probability values p1,…, pk with p1+···+pk = 1. The random
variables X1,…, Xk that count the number of occurrences of each
outcome are said to have a multinomial distribution.
• The joint probability mass function is
\[
P(X_1 = x_1, \ldots, X_k = x_k) = \frac{n!}{x_1! \cdots x_k!} \times p_1^{x_1} \times \cdots \times p_k^{x_k}
\]
for nonnegative integer values of the x_i satisfying $x_1 + \cdots + x_k = n$.
3.5.1 Definition of the Multinomial Distribution(2/2)
• The random variables X1,…, Xk have expectations and variances
given by
\[
E(X_i) = np_i, \qquad \mathrm{Var}(X_i) = np_i(1 - p_i)
\]
but they are not independent. (why?)
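The joint probability mass function and the marginal moments can be checked by brute-force enumeration for a small case. The sketch below is illustrative only: `multinomial_pmf` is a hypothetical helper, the values n = 4 and probabilities (0.2, 0.5, 0.3) are assumed for the demonstration, and `math.prod` requires Python 3.8 or later.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """P(X1 = x1, ..., Xk = xk) for n = sum(counts) independent trials."""
    coefficient = factorial(sum(counts))
    for x in counts:
        coefficient //= factorial(x)   # stays an exact integer at every step
    return coefficient * prod(p**x for p, x in zip(probs, counts))


# Illustrative values (assumed): n = 4 trials, k = 3 outcomes.
n, probs = 4, (0.2, 0.5, 0.3)

# Enumerate every (x1, x2, x3) with x1 + x2 + x3 = n.
outcomes = [(x1, x2, n - x1 - x2)
            for x1 in range(n + 1) for x2 in range(n - x1 + 1)]

total = sum(multinomial_pmf(c, probs) for c in outcomes)
mean_x1 = sum(c[0] * multinomial_pmf(c, probs) for c in outcomes)
var_x1 = sum((c[0] - mean_x1) ** 2 * multinomial_pmf(c, probs) for c in outcomes)

print(total)                                   # joint pmf sums to 1
print(mean_x1, n * probs[0])                   # E(X1) against n * p1
print(var_x1, n * probs[0] * (1 - probs[0]))   # Var(X1) against n * p1 * (1 - p1)
```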
Example 1 : Machine Breakdowns(1/4)
• Suppose that the machine breakdowns are attributable to
electrical faults, mechanical faults, and operator misuse, and
these causes occur with probabilities of 0.2, 0.5, and 0.3,
respectively.
• The engineer is interested in predicting the causes of the next
ten breakdowns.
• X1: the number of breakdowns due to electrical reasons.
• X2: the number of breakdowns due to mechanical reasons.
• X3: the number of breakdowns due to operator misuse.
Example 1 : Machine Breakdowns(2/4)
• X1+X2+X3=10
• If the breakdown causes can be assumed to be independent of
one another, then the probability mass function is
\[
P(X_1 = x_1, X_2 = x_2, X_3 = x_3) = \frac{10!}{x_1!\, x_2!\, x_3!} \times 0.2^{x_1} \times 0.5^{x_2} \times 0.3^{x_3}
\]
• The probability that there will be three electrical breakdowns, five
mechanical breakdowns, and two misuse breakdowns is
\[
P(X_1 = 3, X_2 = 5, X_3 = 2) = \frac{10!}{3!\,5!\,2!} \times 0.2^{3} \times 0.5^{5} \times 0.3^{2} = 0.057
\]
Example 1 : Machine Breakdowns(3/4)
• The expected number of electrical breakdowns is
\[
E(X_1) = np_1 = 10 \times 0.2 = 2
\]
• The expected number of mechanical breakdowns is
\[
E(X_2) = np_2 = 10 \times 0.5 = 5
\]
• The expected number of misuse breakdowns is
\[
E(X_3) = np_3 = 10 \times 0.3 = 3
\]
Example 1 : Machine Breakdowns(4/4)
• If the engineer is interested in the probability of there being no
more than two electrical breakdowns, then this can be calculated by
noting that X1~B(10, 0.2), so that
\[
\begin{aligned}
P(X_1 \le 2) &= P(X_1 = 0) + P(X_1 = 1) + P(X_1 = 2) \\
&= \binom{10}{0} \times 0.2^{0} \times 0.8^{10} + \binom{10}{1} \times 0.2^{1} \times 0.8^{9} + \binom{10}{2} \times 0.2^{2} \times 0.8^{8} \\
&= 0.107 + 0.268 + 0.302 = 0.677
\end{aligned}
\]
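The step of treating X1 as B(10, 0.2) can be checked by summing the joint multinomial probabilities over the other two counts. This is a hedged sketch, not the slides' own method; the function names are illustrative.

```python
from math import comb, factorial

n = 10
p1, p2, p3 = 0.2, 0.5, 0.3  # electrical, mechanical, operator misuse


def multinomial_pmf(x1: int, x2: int, x3: int) -> float:
    coefficient = factorial(n) // (factorial(x1) * factorial(x2) * factorial(x3))
    return coefficient * p1**x1 * p2**x2 * p3**x3


def marginal_x1(x1: int) -> float:
    """P(X1 = x1), summing the joint pmf over all (x2, x3) with x2 + x3 = n - x1."""
    return sum(multinomial_pmf(x1, x2, n - x1 - x2) for x2 in range(n - x1 + 1))


def binomial_pmf(x: int) -> float:
    """P(X1 = x) under X1 ~ B(10, 0.2)."""
    return comb(n, x) * p1**x * (1 - p1) ** (n - x)


for x1 in range(3):
    print(x1, marginal_x1(x1), binomial_pmf(x1))   # the two columns agree

p_at_most_two = sum(binomial_pmf(x) for x in range(3))
print(f"P(X1 <= 2) = {p_at_most_two:.4f}")
```

The unrounded total is 0.6778; the 0.677 on the slide is the sum of the three terms after rounding each to three decimals.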