Engineering Statistics

Chapter 2
Special Variables
2D Approximation of Variables
Poisson Distribution as Approximation
to Binomial Distribution
• As n increases, the probability table for Bin(n, p)
becomes longer. When n>30, it is not practical to
tabulate the probabilities for the distribution. The
UTM tables stop at n=30. Some other statistical tables
may go up to n=40, but all such tables have to stop
at some point.
• One reason why we do not create tables for
binomial distribution for n>30 is that, for small
values of p, we can use a Poisson distribution
that is approximately equivalent to the binomial
distribution.
Comparing means and variance
• We recall that the mean for X~Bin(n, p) is np, and
its variance is npq.
• In the case of P(λ), the mean and variance are both
λ.
• Note that when p is nearly 0, then q (=1–p) will be
close to 1, which means np ≈ npq, which indicates
that the mean and variance are nearly equal. This
is similar in property to the Poisson distribution.
• In such a case, the Poisson distribution P(np) is
approximately equivalent to Bin(n, p).
Example 1
• Given X~Bin(40, 0.05), calculate P(X ≤ 6). Compare it to
P(Y ≤ 6) for Y~P(2).
Solution:
For the binomial distribution Bin(40, 0.05)
P(X ≤ 6) = P(X=0)+… +P(X=6)
= 0.128512+0.270552+0.277672+0.185114
+0.090122+0.034151+0.010485=0.9966
For the Poisson distribution Y~P(2)
P(Y ≤ 6) = 0.995. (Value from the table)
The probability from the Poisson distribution is very close to
the binomial value, and looking it up is much easier than
carrying out the multiple binomial calculations.
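The comparison in Example 1 can also be reproduced without tables. The following is a minimal Python sketch using only the standard library; the helper names binom_cdf and poisson_cdf are illustrative and are not part of the course material.

```python
from math import comb, exp, factorial

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Bin(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def poisson_cdf(k, lam):
    """P(Y <= k) for Y ~ P(lam), summed term by term."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

n, p, k = 40, 0.05, 6
exact = binom_cdf(k, n, p)        # binomial value
approx = poisson_cdf(k, n * p)    # Poisson value with lambda = np = 2
print(f"Bin({n}, {p}): P(X <= {k}) = {exact:.4f}")
print(f"P({n * p:g}): P(Y <= {k}) = {approx:.4f}")
```

Changing n, p and k lets the same two functions check the other Poisson examples in this section.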
Points to note
• The approximation of Poisson for binomial is
generally good when n and p satisfy the conditions
n>30 and p<0.1.
• For X~Bin(n, p) and Y~P(np), individual events of
the type P(X = r) and P(Y = r) may differ
significantly, even for n>30 and p nearly 0.
• However, for compound events of the type P(X ≤ r)
and P(Y ≤ r), their values are usually very close.
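To see how individual and compound events behave, one can tabulate the relative error of the Poisson value against the binomial value for both kinds of event. The sketch below does this for Bin(40, 0.05); the function name compare and the choice of relative error as the measure are assumptions made for illustration.

```python
from math import comb, exp, factorial

def binom_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(r, lam):
    """P(Y = r) for Y ~ P(lam)."""
    return exp(-lam) * lam**r / factorial(r)

def compare(n, p, r_max):
    """Rows of (r, relative error of P(Y=r), relative error of P(Y<=r))."""
    lam = n * p
    rows, b_cum, q_cum = [], 0.0, 0.0
    for r in range(r_max + 1):
        b, q = binom_pmf(r, n, p), poisson_pmf(r, lam)
        b_cum += b          # running binomial P(X <= r)
        q_cum += q          # running Poisson P(Y <= r)
        rows.append((r, abs(q - b) / b, abs(q_cum - b_cum) / b_cum))
    return rows

for r, pmf_err, cdf_err in compare(40, 0.05, 8):
    print(f"r={r}: P(=r) error {pmf_err:6.2%}, P(<=r) error {cdf_err:6.2%}")
```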
Example 2
• The probability that a man in a village contracts TB is 0.005. The
authority is checking 500 people. What is the probability that up
to 4 people have TB?
Solution: In this case, direct calculation using the binomial
distribution may be done using a calculator, giving
0.891681. However, it is easier to use the Poisson approximation.
• Let X represent the number of TB patients. X~Bin(500,
0.005). Since n>30 and p<0.1, we approximate X by the
Poisson distribution P(2.5). From the table, we have P(X ≤ 4)
= 0.8912. Again, we note that the answer is not much
different from the direct calculation using the binomial
distribution.
Example 3
• 2% of motorcyclists do not have a valid license. In
an operation, the JPJ stops 60 motorcyclists to
check their licenses. What is the probability 3 to 6
of them have invalid licenses?
• Solution: Let I represent the number of invalid
licenses. I~Bin(60, 0.02). As n>30 and p<0.1, use
the Poisson distribution P(1.2) to represent I.
• P(3 ≤ I ≤ 6) = P(I ≤ 6) – P(I ≤ 2) = 0.9997 – 0.8795 =
0.1202.
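The interval probability in Example 3 is a difference of two cumulative probabilities. A small Python sketch of that step is given here, with the exact binomial value computed alongside for comparison; interval_prob is a hypothetical helper name, not something defined in the course.

```python
from math import comb, exp, factorial

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Bin(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def poisson_cdf(k, lam):
    """P(Y <= k) for Y ~ P(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

def interval_prob(cdf, lo, hi, *params):
    """P(lo <= X <= hi) = P(X <= hi) - P(X <= lo - 1) for integer-valued X."""
    return cdf(hi, *params) - cdf(lo - 1, *params)

approx = interval_prob(poisson_cdf, 3, 6, 60 * 0.02)   # Poisson P(1.2)
exact = interval_prob(binom_cdf, 3, 6, 60, 0.02)       # Bin(60, 0.02)
print(f"Poisson: {approx:.4f}, binomial: {exact:.4f}")
```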
When p>0.9
• If X~Bin(n, p) such that p>0.9, then we
work with X’ = n – X, where X’~Bin(n, q) and
q=1-p. We then interpret the events in X in terms of X’,
as we have discussed earlier in the chapter
on binomial distribution.
• Hence if n>30 and p>0.9, events of X may
also be approximated using the Poisson
distribution through X’.
Example 4
• 97% of octogenarians suffer from cataracts. In a
health screening, 50 octogenarians have their eyes
checked. What is the probability fewer than 47 have
cataracts?
Solution: Let C represent the number of people with cataracts. Then
C~Bin(50, 0.97). Using C’ to represent those with
no cataracts, we have C’~Bin(50, 0.03). We note
that C’ can be approximated using the Poisson
distribution P(1.5).
Now P(C<47) = P(C’ ≥ 4) = 1 – P(C’ ≤ 3) = 1 – 0.9344 =
0.0656.
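The switch from C to its complement C’ can be checked numerically. The sketch below, in Python with illustrative variable names, confirms that the event C<47 under Bin(50, 0.97) has the same exact probability as C’ ≥ 4 under Bin(50, 0.03), and then applies the Poisson approximation to C’.

```python
from math import comb, exp, factorial

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Bin(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def poisson_cdf(k, lam):
    """P(Y <= k) for Y ~ P(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

n, p = 50, 0.97
q = 1 - p                  # probability of "no cataracts"
threshold = n - 47         # C < 47  <=>  C' = n - C > 3

exact_direct = binom_cdf(46, n, p)                  # P(C <= 46)
exact_complement = 1 - binom_cdf(threshold, n, q)   # P(C' >= 4)
approx = 1 - poisson_cdf(threshold, n * q)          # Poisson P(1.5)
print(f"{exact_direct:.4f} {exact_complement:.4f} {approx:.4f}")
```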
Example 5
• At least 95% of visitors to an exhibition end up
buying goods on exhibition. During a period
under observation, 220 to 240 visitors enter the
exhibition hall. What is the probability at least
95% of them buy some exhibits?
Solution: Let B represent the number who make
purchases. Taking the lower value 220 first,
B~Bin(220, 0.95). Using B’ to represent the number
who only visit, B’~Bin(220, 0.05).
Example 5 (contd)
• As n>30, we approximate B’ using the Poisson
distribution, i.e. B’~P(11). Now 95% of 220
is 209. P(B ≥ 209) = P(B’ ≤ 11) = 0.5793.
• Next, if n=240, then B~Bin(240, 0.95) ⇒
B’~Bin(240, 0.05). The Poisson
approximation in this case is P(12). This time,
95% of visitors equals 228. The probability is
P(B ≥ 228) = P(B’ ≤ 12) = 0.5760.
Approximation
• Note that X~Bin(n, p) is approximated by P(np)
only when both the conditions n>30 and p<0.1 are
true.
• In general, when n is large, say >50, and p is
small, <0.05, we have good approximations for
probabilities of events. Indeed, the Poisson
distribution is the limiting distribution of the
binomial distribution as n → ∞ and p → 0 with np held fixed.
• When n is rather small, <40 and p is very near 0.1,
the approximations are not very good.
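The limiting statement above can be verified in a few lines. What follows is the standard textbook derivation, written with λ = np held fixed so that p = λ/n (an assumption the slide leaves implicit):

```latex
\begin{aligned}
P(X = r) &= \binom{n}{r} p^{r} (1-p)^{n-r}, \qquad p = \frac{\lambda}{n} \\
&= \frac{n(n-1)\cdots(n-r+1)}{n^{r}} \cdot \frac{\lambda^{r}}{r!}
   \cdot \left(1 - \frac{\lambda}{n}\right)^{n}
   \cdot \left(1 - \frac{\lambda}{n}\right)^{-r} \\
&\longrightarrow 1 \cdot \frac{\lambda^{r}}{r!} \cdot e^{-\lambda} \cdot 1
 = \frac{e^{-\lambda} \lambda^{r}}{r!} \qquad (n \to \infty).
\end{aligned}
```

The first and last factors tend to 1 for fixed r, and the third factor tends to e^{-λ}, which leaves exactly the Poisson probability.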
When p>0.1 and p<0.9
• Unfortunately, when p>0.1 and p<0.9, the Poisson
distribution P(np) may not be close to the binomial
distribution Bin(n, p) anymore.
• For example, if X~Bin(50, 0.4), then the mean of X is
20.
• Let us examine the probabilities of the event X=10
for Bin(50, 0.4) and P(20):
Bin(50,0.4): P(X=10)=0.0014398
P(20): P(X=10)=0.0058163
• As we can see, the values are very different.
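The two values quoted above can be recomputed directly from the probability formulas. This is a short Python sketch with illustrative variable names:

```python
from math import comb, exp, factorial

n, p, r = 50, 0.4, 10
lam = n * p                                       # Poisson mean np = 20

binom = comb(n, r) * p**r * (1 - p)**(n - r)      # P(X=10) under Bin(50, 0.4)
poisson = exp(-lam) * lam**r / factorial(r)       # P(X=10) under P(20)

print(f"Bin(50, 0.4): {binom:.7f}")
print(f"P(20):        {poisson:.7f}")
print(f"ratio Poisson/binomial: {poisson / binom:.2f}")
```

The ratio makes the size of the discrepancy explicit, which is easy to miss when both numbers are small.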
Normal distribution as Approximation to
Binomial Distribution
• For n>30, and when p lies between 0.1 and 0.9,
we can approximate the binomial distribution
Bin(n, p) as N(np, npq).
• This approximation gives accurate probabilities
when n is sufficiently large and p is
close to 0.5.
• Unfortunately, because the binomial distribution is
discrete, while the normal distribution is
continuous, we need to make some adjustments to
the original events.
Continuity adjustment
• For a discrete event X>5, we do not want to
include 5. But if we start from 6, there is a big
gap left between 5 and 6.
• The proposal is to treat the discrete event X>5 as
the continuous event X>5.5.
• For discrete X ≥ 5, we treat it as X ≥ 4.5, and so on.
• The table below lists the corrections needed when
we approximate a discrete event using a continuous
variable.
Corrections
Discrete event    Continuous event
X < a             X < a – 0.5
X > a             X > a + 0.5
X ≤ a             X ≤ a + 0.5
X ≥ a             X ≥ a – 0.5
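The four rows of the corrections table amount to a fixed shift of ±0.5 that depends only on the inequality sign. A minimal Python sketch of that lookup follows; the names SHIFT and continuity_correct are hypothetical.

```python
# Shift applied to the bound a when the discrete event "X op a"
# is replaced by a continuous one (one entry per table row).
SHIFT = {"<": -0.5, ">": +0.5, "<=": +0.5, ">=": -0.5}

def continuity_correct(op, a):
    """Return the corrected bound for the discrete event 'X op a'."""
    return a + SHIFT[op]

for op, a in [(">", 5), (">=", 5), ("<=", 160), ("<", 47)]:
    print(f"X {op} {a}  ->  X {op} {continuity_correct(op, a)}")
```

A quick way to remember the direction: the corrected bound always lies halfway between the last integer included in the event and the first integer excluded from it.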
Example 6
• 24% of all smokers will die of lung cancer within 2 years of
detection. A record is kept on 100 such cases. What is the
probability at least 30 will die within 2 years?
Solution: L = number of lung cancer patients who will die
within 2 years. n = 100, p = 0.24. L~Bin(100, 0.24).
Since n>30 and 0.1<p<0.9, we can use the normal distribution
as an approximation for L. μ = np = 100 × 0.24 = 24, variance
= σ² = npq = 24 × 0.76 = 18.24.
Hence L~N(24, 18.24) approximately.
The event we want is L ≥ 30; this we convert to L ≥ 29.5 as the
continuity correction.
P(L ≥ 29.5) = P(z ≥ [29.5 – 24]/√18.24) = P(z ≥ 1.29)
= 0.5 – 0.4015 = 0.0985.
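The working in Example 6 can be compared with the exact binomial tail. The sketch below is a Python illustration that builds the normal CDF from math.erf; the helper names are assumptions, and the normal value differs slightly from the table answer because z is not rounded to two decimals.

```python
from math import comb, erf, sqrt

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Bin(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf(x, mean, var):
    """P(X <= x) for X ~ N(mean, var); note the second parameter is a variance."""
    return 0.5 * (1 + erf((x - mean) / sqrt(2 * var)))

n, p = 100, 0.24
mean, var = n * p, n * p * (1 - p)           # 24 and 18.24

exact = 1 - binom_cdf(29, n, p)              # P(L >= 30), binomial
approx = 1 - normal_cdf(29.5, mean, var)     # P(L >= 29.5), normal
print(f"binomial: {exact:.4f}, normal approximation: {approx:.4f}")
```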
Example 7
• In a new agricultural project, 200 dragon fruit cactus
plants are planted. Based on experience,
only 77% of the plants will bear fruit. What is the
probability 160 plants or fewer will be successful?
Solution: F = number of cactus plants bearing fruit. n =
200, p = 0.77. F~Bin(200, 0.77). Mean = 200 ×
0.77 = 154; variance = 200 × 0.77 × 0.23 = 35.42.
Hence we approximate F as N(154, 35.42).
With the continuity correction, F ≤ 160 becomes F ≤ 160.5.
P(F ≤ 160.5) = P(z ≤ [160.5 – 154]/√35.42)
= P(z ≤ 1.09) = 0.5 + 0.3621 = 0.8621.
Example 8
• The probability a new car has more than 2 defects
within 2 years is 0.25 for car A and 0.32 for car B.
A company sells 50 A and 40 B. What is the
probability more A than B will have defects within
the two year guarantee period?
Solution:
A~Bin(50, 0.25);
B~Bin(40, 0.32)
We note that, working with the binomial distributions
directly, there is no simple way to find the probability
of the event A>B. This is when the normal distribution
is needed as an approximation.
Normal distribution for A-B
Using normal distribution as approximation:
A~N(12.5, 9.375);
B~N(12.8, 8.704)
A-B~N(12.5-12.8, 9.375+8.704) = N(-0.3, 18.079)
P(A>B) = P(A-B>0)
≈ P(A-B>0.5) (continuity adjustment)
= P(z>[0.5 – (-0.3)]/√18.079)
= P(z>0.19) = 0.5 – 0.0753 = 0.4247.
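With a computer, P(A>B) can also be obtained by brute force, summing P(A=a)·P(B=b) over every pair with a>b. The Python sketch below does this and sets it beside the normal approximation, standardising with the standard deviation √18.079. It assumes A and B are independent, which the example implies but does not state; all names are illustrative.

```python
from math import comb, erf, sqrt

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal_cdf(x, mean, var):
    """P(X <= x) for X ~ N(mean, var); the second parameter is a variance."""
    return 0.5 * (1 + erf((x - mean) / sqrt(2 * var)))

nA, pA, nB, pB = 50, 0.25, 40, 0.32
pmf_A = [binom_pmf(a, nA, pA) for a in range(nA + 1)]
pmf_B = [binom_pmf(b, nB, pB) for b in range(nB + 1)]

# Exact value: add up the joint probabilities of all pairs with a > b.
exact = sum(pa * pb
            for a, pa in enumerate(pmf_A)
            for b, pb in enumerate(pmf_B) if a > b)

# Normal approximation for A - B with the continuity adjustment.
mean = nA * pA - nB * pB                        # -0.3
var = nA * pA * (1 - pA) + nB * pB * (1 - pB)   # 18.079
approx = 1 - normal_cdf(0.5, mean, var)
print(f"exact: {exact:.4f}, normal approximation: {approx:.4f}")
```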
Example 9
• 15% of men and 8% of women are colour-blind. A
check is done on 80 men and 80 women. What is
the probability more than 20 are colour-blind?
Solution:
M~Bin(80, 0.15) ⇒ M~N(12, 10.2)
W~Bin(80, 0.08) ⇒ W~N(6.4, 5.888)
M+W~N(12+6.4, 10.2+5.888)
~N(18.4, 16.088)
We want the event M+W>20, which on adjustment,
becomes M+W>20.5
Now P(M+W>20.5)
= P(z>[20.5-18.4]/√16.088)
= P(z>0.52) = 0.5 – 0.1985 = 0.3015.
Technically speaking, since p<0.1 in the case of W,
we should not use the normal distribution as an
approximation for W. However, there is no direct
way of combining two binomial distributions.
Using the normal distribution, we can obtain an
answer, even though it may not be very accurate.
Using normal distribution to
approximate Poisson distribution
• Even the Poisson distribution table is
limited. In the UTM table, λ stops at 40. So
we still need a way to overcome the
problem when λ exceeds 40.
• In general, we find that, when λ>40, the
normal distribution N(λ, λ), in which both the
mean and variance are λ, gives a good
approximation.
Example 10
• On average, 80 accidents occur in a day during
the festive season. What is the probability more
than 90 accidents occur on such a day?
Solution: A~P(80). Since λ>40, we use the normal
distribution N(80, 80) as the approximation.
As the Poisson distribution is discrete, we also need to
make a continuity adjustment. Thus P(A>90) is
adjusted to P(A>90.5) = P(z>[90.5-80]/√80)
= P(z>1.17) = 0.5 – 0.379 = 0.121.
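Although λ = 80 is beyond the table, the exact Poisson tail is easy to sum on a computer, which gives a check on the normal approximation in Example 10. The Python sketch below builds each Poisson term from the previous one to avoid very large intermediate numbers; the function names are illustrative.

```python
from math import erf, exp, sqrt

def poisson_cdf(k, lam):
    """P(Y <= k) for Y ~ P(lam), using P(Y=i) = P(Y=i-1) * lam / i."""
    term = total = exp(-lam)      # P(Y = 0)
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total

def normal_cdf(x, mean, var):
    """P(X <= x) for X ~ N(mean, var); the second parameter is a variance."""
    return 0.5 * (1 + erf((x - mean) / sqrt(2 * var)))

lam = 80
exact = 1 - poisson_cdf(90, lam)           # P(A > 90), Poisson
approx = 1 - normal_cdf(90.5, lam, lam)    # P(A > 90.5), N(80, 80)
print(f"Poisson: {exact:.4f}, normal approximation: {approx:.4f}")
```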
Example 11
• Hospital A receives on average 30 male and 25 female
patients each day. What is the probability
the total number of patients on a certain day
is between 40 and 60?
Solution: M~P(30), W~P(25).
M+W~P(55)
Using the normal distribution as
approximation, we have M+W~N(55, 55)
• Unfortunately, the word between here has two
interpretations: either it means more than 40 and less than
60 (40<X<60), or it means from 40 to 60 (40 ≤ X ≤ 60).
Since the question is not clear on this, we have to decide
on our own.
Case I: P(40<M+W<60) ≈ P(40.5<M+W<59.5)
= P([40.5-55]/√55 < z < [59.5-55]/√55)
= P(-1.96<z<0.61) = 0.4750+0.2291 = 0.7041.
Case II: P(40 ≤ M+W ≤ 60) ≈ P(39.5 ≤ M+W ≤ 60.5)
= P([39.5-55]/√55 ≤ z ≤ [60.5-55]/√55)
= P(-2.09 ≤ z ≤ 0.74) = 0.4817+0.2704 = 0.7521.
We usually avoid using the word between without
qualification because of the possible misunderstanding.
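The two readings of "between" differ only in the corrected bounds passed to the normal CDF. The Python sketch below uses the bounds 40 and 60 from the working above; normal_between is a hypothetical helper, and the results differ slightly from the table values because z is not rounded.

```python
from math import erf, sqrt

def normal_cdf(x, mean, var):
    """P(X <= x) for X ~ N(mean, var); the second parameter is a variance."""
    return 0.5 * (1 + erf((x - mean) / sqrt(2 * var)))

def normal_between(lo, hi, mean, var):
    """P(lo < X < hi) for a continuous X ~ N(mean, var)."""
    return normal_cdf(hi, mean, var) - normal_cdf(lo, mean, var)

lam = 30 + 25    # the sum of independent Poisson variables is P(55)

strict = normal_between(40.5, 59.5, lam, lam)      # Case I:  40 <  X <  60
inclusive = normal_between(39.5, 60.5, lam, lam)   # Case II: 40 <= X <= 60
print(f"Case I: {strict:.4f}, Case II: {inclusive:.4f}")
```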
Using Poisson distributions to
combine Binomial Distributions
• As has been shown, there is no simple way to
solve problems of the type P(X1+X2=k) if
X1~Bin(n1, p1) and X2~Bin(n2, p2) unless p1=p2.
• If, however, the n’s are sufficiently big and the p’s
are relatively small, we can achieve good
approximations for P(X1+X2=k) by using the
corresponding Poisson distributions as
approximations for the binomial distributions.
• Let’s examine an example.
Combining Binomials
Example 12. 12% of city dwellers and 8% of
kampung folks are known to suffer from breathing
ailments. A check is made on 25 people living in
cities and 40 people from kampung. What is the
probability up to 6 of them show signs of
breathing sickness?
Solution: We use C and K to represent the numbers
of city and kampung dwellers who are suffering
from breathing sicknesses. Then C~Bin(25, 0.12)
and K~Bin(40, 0.08).
Example 12 (contd)
Obviously, K is a good candidate for using the Poisson
distribution as an approximation. Unfortunately, for
C, the n is a little small and the p is a bit too big.
However, in this case, we shall resort to the Poisson
distribution as an approximation for C as well. Hence
C~P(3) and K~P(3.2) ⇒ C+K~P(6.2). From the
table, P(C+K ≤ 6) = 0.5742.
The approximation of C appears to violate the
conditions needed. However, the amount of time and
work saved compensates for the loss in accuracy.
Using normal distribution instead
Another way out is to use normal distributions as
approximations for both C and K, as we did in Ex
9. As K is rather skewed, it may not be so
desirable. Also, this will involve more calculations
as we need to carry out the continuity adjustment. We
show the work briefly here.
C~N(3, 2.64), K~N(3.2, 2.944) ⇒ C+K~N(6.2,
5.584).
P(C+K ≤ 6) ≈ P(C+K ≤ 6.5) = P(Z ≤ 0.13) = 0.5517.
This is a little different from 0.5742 which we obtained
using the Poisson distribution.
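To judge the two approximations in Example 12, both can be compared with the exact value obtained by summing over all pairs (c, k) with c+k ≤ 6. The Python sketch below does this, computing each normal variance as npq; it assumes C and K are independent, and all names are illustrative.

```python
from math import comb, erf, exp, factorial, sqrt

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_cdf(k, lam):
    """P(Y <= k) for Y ~ P(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

def normal_cdf(x, mean, var):
    """P(X <= x) for X ~ N(mean, var); the second parameter is a variance."""
    return 0.5 * (1 + erf((x - mean) / sqrt(2 * var)))

nC, pC, nK, pK, limit = 25, 0.12, 40, 0.08, 6
pmf_C = [binom_pmf(c, nC, pC) for c in range(nC + 1)]
pmf_K = [binom_pmf(k, nK, pK) for k in range(nK + 1)]

# Exact P(C + K <= 6): add the joint probabilities of all qualifying pairs.
exact = sum(pc * pk
            for c, pc in enumerate(pmf_C)
            for k, pk in enumerate(pmf_K) if c + k <= limit)

lam = nC * pC + nK * pK                          # 3 + 3.2 = 6.2
var = nC * pC * (1 - pC) + nK * pK * (1 - pK)    # sum of the npq terms
poisson = poisson_cdf(limit, lam)                # P(6.2)
normal = normal_cdf(limit + 0.5, lam, var)       # with continuity adjustment
print(f"exact {exact:.4f}, Poisson {poisson:.4f}, normal {normal:.4f}")
```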