discrete probability distribution
CHAPTER 2
BCT2053
COMMONLY USED PROBABILITY DISTRIBUTION
Introduction
Probability – the chance of an event occurring
Distribution – a function which assigns a probability to each
possible value of the random variable
Probability Distribution – the values that a random variable
can assume, together with the corresponding
probabilities of those values
Types of probability distribution:
1. Discrete – describes a discrete random variable
2. Continuous – describes a continuous random variable
CONTENT
2.1 Review on Binomial and Poisson Distributions
2.2 Poisson Approximation for Binomial Distribution
2.3 Review on Normal Distribution
2.4 Central Limit Theorem
2.5 Normal Approximation to the Binomial Distribution
2.6 Normal Approximation to the Poisson Distribution
2.7 Normal Probability Plots
2.1 Binomial Distribution
OBJECTIVE
At the end of this chapter, you should be able to:
1. Explain what a Binomial distribution is, identify Binomial
experiments and compute Binomial probabilities.
2. Find the expected value (mean), variance, and standard
deviation of a Binomial experiment.
Binomial Distribution
A Binomial distribution results from a procedure
that meets all the following requirements
The procedure has a fixed number of trials (the
same trial is repeated)
The trials must be independent
Each trial must have outcomes classified into 2
relevant categories only (success and failure)
The probability of success remains the same in all
trials
• Examples: tossing a coin, the sex of a newborn baby, a true/false question, a product being defective or not, etc.
Notation for the Binomial Distribution
If X is the number of successes in n trials, then X has the Binomial distribution with parameters n and p, denoted by
X ~ Bin (n, p), which is read as
‘‘X is Binomial distributed with number of trials n and probability of success p’’
Binomial Experiment or not ?
1. An advertisement for Vantin claims a 77% end of
treatment clinical success rate for flu sufferers. Vantin
is given to 15 flu patients who are later checked to see
if the treatment was a success.
2. A study showed that 83% of the patients receiving
liver transplants survived at least 3 years. The files of
6 liver recipients were selected at random to see if
each patient was still alive.
3. In a study of frequent fliers (those who made at least
3 domestic trips or one foreign trip per year), it was
found that 67% had an annual income over RM35000.
12 frequent fliers are selected at random and their
income level is determined.
For each problem, state what are X, n, p, and q.
Binomial Probability Formula
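The formula itself did not survive in this transcript. The standard Binomial probability formula is P(X = x) = C(n, x) p^x q^(n - x), with q = 1 - p and x = 0, 1, ..., n, and the mean and variance are np and npq. A minimal Python sketch follows; the function names `binom_pmf` and `binom_mean_var` and the numbers used are illustrative assumptions, not part of the slides.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p): C(n, x) * p**x * q**(n - x), q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

def binom_mean_var(n, p):
    """Mean np and variance npq of X ~ Bin(n, p)."""
    return n * p, n * p * (1 - p)

# Illustrative values (hypothetical, not from the slides): 4 trials, p = 0.5.
print(binom_pmf(2, 4, 0.5))     # C(4, 2) * 0.5**4 = 6/16
print(binom_mean_var(4, 0.5))   # (np, npq)
```

The probabilities for x = 0, 1, ..., n add up to 1, which is a quick sanity check on any hand calculation.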
EXERCISE 2.1
1. A fair coin is tossed 10 times. Let X be
the number of heads that appear. What
is the distribution of X?
2. A lot contains several thousand
components. 10% of the components
are defective. 7 components are sampled
from the lot.
Let X represent the number of defective
components in the sample. What is the
distribution of X?
Solving problems involving linear inequalities
At least, minimum of, no less than: X ≥ k
At most, maximum of, no more than: X ≤ k
Is greater than, more than: X > k
Is less than, smaller than, fewer than: X < k
EXERCISE 2.1
3. Find the probability distribution of the random
variable X if X ~ Bin (10, 0.4).
Find also P(X = 5) and P(X < 2).
Then find the mean and variance for X.
4. A fair die is rolled 8 times. Find the probability
that no more than 2 sixes come up. Then find
the mean and variance for X, the number of sixes.
EXERCISE 2.1
5. A survey found that one out of five Malaysians
says he or she has visited a doctor in any given
month. If 10 people are selected at random, find
the probability that exactly 3 will have visited a
doctor last month.
6. A survey found that 30% of teenage consumers
receive their spending money from part-time
jobs. If 5 teenagers are selected at random, find
the probability that at least 3 of them will have
part-time jobs.
Solve Binomial problems using a statistical table
Use the Cumulative Binomial Probabilities Table
n: number of trials
p: probability of success
k: number of successes in n trials (a value of X)
It gives P (X ≤ k) for various values of n and p
Example: n = 2 , p = 0.3
Then P (X ≤ 1) = 0.9100
Then P (X = 1) = P (X ≤ 1) - P (X ≤ 0) = 0.9100 – 0.4900 = 0.4200
Then P (X ≥ 1) = 1 - P (X <1) = 1 - P (X ≤ 0) = 1 – 0.4900 = 0.5100
Then P (X < 1) = P (X ≤ 0) = 0.4900
Then P (X > 1) = 1 - P (X ≤ 1) = 1- 0.9100 = 0.0900
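The table look-ups above can be reproduced with a short Python sketch that builds P(X ≤ k) by summing the Binomial probabilities. The helper name `binom_cdf` is an assumption made for illustration; it plays the role of the printed cumulative table.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Bin(n, p), summing P(X = x) for x = 0..k."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

n, p = 2, 0.3
p_le_1 = binom_cdf(1, n, p)                        # P(X <= 1)
p_eq_1 = binom_cdf(1, n, p) - binom_cdf(0, n, p)   # P(X = 1)
p_ge_1 = 1 - binom_cdf(0, n, p)                    # P(X >= 1)
p_lt_1 = binom_cdf(0, n, p)                        # P(X < 1)
p_gt_1 = 1 - binom_cdf(1, n, p)                    # P(X > 1)

for name, value in [("P(X<=1)", p_le_1), ("P(X=1)", p_eq_1),
                    ("P(X>=1)", p_ge_1), ("P(X<1)", p_lt_1),
                    ("P(X>1)", p_gt_1)]:
    print(name, round(value, 4))
```

Every case is rewritten in terms of P(X ≤ k) first, which is exactly how the cumulative table is used by hand.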
Using symmetry properties to read
Binomial tables
In general,
P (X = k | X ~ Bin (n, p)) = P (X = n - k | X ~ Bin (n,1 - p))
P (X ≤ k | X ~ Bin (n, p)) = P (X ≥ n - k | X ~ Bin (n,1 - p))
P (X ≥ k | X ~ Bin (n, p)) = P (X ≤ n - k | X ~ Bin (n,1 - p))
Example: n = 8 , p = 0.6
Then P (X ≤ 1) = P (X ≥ 7 | p = 0.4) = 1 - P (X ≤ 6 | p = 0.4)
= 1 – 0.9915 = 0.0085
Then P (X = 1) = P (X = 7 | p = 0.4)
= P (X ≤ 7 | p = 0.4) - P (X ≤ 6 | p = 0.4)
= 0.9993 – 0.9915 = 0.0078
Then P (X ≥ 1) = P (X ≤ 7 | p = 0.4) = 0.9993
Then P (X < 1) = P (X > 7 | p = 0.4) = 1 - P (X ≤ 7 | p = 0.4)
= 1 – 0.9993 = 0.0007
Then P (X > 1) = P (X < 7 | p = 0.4) = P (X ≤ 6 | p = 0.4) = 0.9915
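The symmetry rule can be checked numerically. The sketch below, with an assumed helper `binom_cdf`, computes P(X ≤ 1) for n = 8, p = 0.6 directly and again through the p = 0.4 column, as one would with a table that only lists p up to 0.5.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Bin(n, p); an empty sum (k < 0) gives 0."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

n, p = 8, 0.6
direct = binom_cdf(1, n, p)                  # P(X <= 1 | p = 0.6)
via_symmetry = 1 - binom_cdf(6, n, 1 - p)    # P(X >= 7 | p = 0.4)
print(round(direct, 4), round(via_symmetry, 4))
```

Both routes give the same value because counting at most k successes with probability p is the same as counting at least n - k failures with probability 1 - p.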
EXERCISE 2.1
7. Given that n = 12 , p = 0.25. Then find
P (X ≤ 3)
P (X = 7)
P (X ≥ 5)
P (X < 2)
P (X > 10)
8. Given that n = 9 , p = 0.7. Then find
P (X ≤ 4)
P (X = 8)
P (X ≥ 3)
P (X < 5)
P (X > 6)
EXERCISE 2.1
9. A large industrial firm allows a discount on any
invoice that is paid within 30 days. Of all
invoices, 10% receive the discount. In a
company audit, 12 invoices are sampled at
random.
a) What is the probability that fewer than 4 of the 12
sampled invoices receive the discount?
b) What is the probability that more than 1 of the
12 sampled invoices receive the discount?
EXERCISE 2.1
10. A report shows that 5% of Americans are afraid of being
alone in a house at night. If a random sample of 20
Americans is selected, find the probability that
a) there are exactly 5 people in the sample who are
afraid of being alone at night
b) there are at most 3 people in the sample who are
afraid of being alone at night
c) there are at least 4 people in the sample who are
afraid of being alone at night
2.1 Poisson Distribution
OBJECTIVE
At the end of this chapter, you should be able to:
1. Explain what a Poisson distribution is, identify Poisson
experiments and compute Poisson probabilities.
2. Find the expected value (mean), variance, and standard
deviation of a Poisson experiment.
Poisson Distribution
The Poisson distribution is a discrete probability
distribution that applies to occurrences of some event over
a specified interval (time, volume, area, etc.)
The random variable X is the number of occurrences of an
event over some interval
The occurrences must be random
The occurrences must be independent of each other
The occurrences must be uniformly distributed over the
interval being used
Example of Poisson distribution
1. The number of emergency calls received by an ambulance control in an hour.
2. The number of vehicles approaching a bus stop in a 5-minute interval.
3. The number of flaws in a meter length of material.
Poisson Probability Formula
λ, the mean number of occurrences in the given interval, is known
and finite
Then the variable X is said to be
‘Poisson distributed with mean λ’
X ~ Po (λ)
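The formula itself did not survive in this transcript. The standard Poisson probability formula is P(X = x) = e^(-λ) λ^x / x! for x = 0, 1, 2, ... A minimal Python sketch follows; the names `poisson_pmf` and `poisson_cdf` and the value λ = 2 are illustrative assumptions, not taken from the slides.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam): exp(-lam) * lam**x / x!."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam), summing P(X = x) for x = 0..k."""
    return sum(poisson_pmf(x, lam) for x in range(k + 1))

lam = 2.0                                # hypothetical mean per interval
p_exactly_3 = poisson_pmf(3, lam)        # P(X = 3)
p_at_most_1 = poisson_cdf(1, lam)        # P(X <= 1)
p_at_least_2 = 1 - poisson_cdf(1, lam)   # P(X >= 2) = 1 - P(X <= 1)
print(round(p_exactly_3, 4), round(p_at_most_1, 4), round(p_at_least_2, 4))
```

If the interval changes, λ must be rescaled first (for example, doubling λ when the interval doubles) before the formula is applied.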
EXERCISE 2.2
1. A student finds that the average number of amoebas
in 10 ml of pond water from a particular pond is 4.
Assuming that the number of amoebas follows a
Poisson distribution, find the probability that in a 10
ml sample,
a) there are exactly 5 amoebas
b) there are no amoebas
c) there are fewer than three amoebas
EXERCISE 2.2
2. On average, the school photocopier breaks down 8
times during the school week (Monday - Friday).
Assume that the number of breakdowns can be
modeled by a Poisson distribution.
Find the probability that it breaks down
a) 5 times in a given week
b) once on Monday
c) 8 times in a fortnight (2 weeks)
EXERCISE 2.2
Solve Poisson problems by statistics table
3. Given that X ~ Po (1.6). Use the cumulative Poisson
probabilities table to find
a) P (X ≤ 6)
b) P (X = 5)
c) P (X ≥ 3)
d) P (X < 1)
e) P (X > 10)
Find also the smallest integer n such that
P ( X > n) < 0.01
EXERCISE 2.2
4. A sales firm receives, on the average, three calls per
hour on its toll-free number. For any given hour, find
the probability that it will receive the following:
a) At most three calls
b) At least three calls
c) 5 or more calls
EXERCISE 2.2
5. The number of accidents occurring in a week in a
certain factory follows a Poisson distribution with
variance 3.2.
Find the probability that in a given fortnight,
a) exactly seven accidents happen.
b) more than 5 accidents happen.
2.2 Using the Poisson distribution as
an approximation to the Binomial
distribution
When n is large (n > 50) and p is small (p < 0.1),
the Binomial distribution X ~ Bin (n, p) can be
approximated using a Poisson distribution with
X ~ Po (λ) where mean, λ = np < 5.
The larger the value of n and the smaller the value
of p, the better the approximation.
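To see how close the approximation is, the sketch below compares Bin(n, p) with Po(np) side by side. The values n = 100 and p = 0.02 are hypothetical, chosen only because they satisfy n > 50, p < 0.1 and np < 5; the helper names are also assumptions.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """Exact P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

n, p = 100, 0.02     # hypothetical: n > 50 and p < 0.1
lam = n * p          # mean of the approximating Poisson, here 2.0 < 5

# Print x, the exact Binomial probability, and the Poisson approximation.
for x in range(6):
    print(x, round(binom_pmf(x, n, p), 4), round(poisson_pmf(x, lam), 4))
```

Rerunning with a larger n and a smaller p (keeping np fixed) should bring the two columns closer together, in line with the remark above.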
EXERCISE 2.2
6. Eggs are packed into boxes of 500. On average 0.7%
of the eggs are found to be broken when the eggs
are unpacked.
Find the probability that in a box of 500 eggs,
a) exactly three are broken
b) at least two are broken
EXERCISE 2.2
7. If 2% of the people in a room of 200 people are left-handed, find the probability that
a) exactly five people are left-handed.
b) at least two people are left-handed.
c) at most seven people are left-handed.
2.3 Normal Distribution
OBJECTIVE
At the end of this chapter, you should be able to:
1. Identify the properties of the normal distribution.
2. Find the area under the standard normal distribution, given various Z values.
3. Find probabilities for a normally distributed variable by transforming it
into a standard normal variable.
4. Find specific data values for given percentages, using the standard normal
distribution.
Continuous Distribution
A discrete variable cannot assume all values between
any two given values of the variable.
A continuous variable can assume all values between
any two given values of the variable.
Examples of continuous variables are the heights of
adult men, body temperatures of rats, and cholesterol
levels of adults.
Many continuous variables, such as the examples just
mentioned, have distributions that are bell-shaped,
and these are called approximately normally
distributed variables.
Properties of Normal Distribution
Also known as the bell curve or the Gaussian
distribution, named for the German
mathematician Carl Friedrich Gauss (1777–
1855), who derived its equation.
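X is continuous, with $X \sim N(\mu, \sigma^2)$ and probability density function
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$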
Example: Histograms for the Distribution
of Heights of Adult Women
Observation
The larger the data size, the closer the distribution of
the data comes to a bell (normal) shape.
No variable fits normal distribution perfectly,
since a normal distribution is a theoretical
distribution.
However, a normal distribution can be used to
describe many variables, because the deviations from
normal distribution are very small.
The Normal Probability Curve
The Curve is bell-shaped
The mean, median, and mode
are equal and located at the
center of the distribution.
The curve is unimodal (i.e., it has only one mode).
The curve is symmetric about the mean (its shape is the
same on both sides of a vertical line passing through the
center).
The curve is continuous (there are no gaps or holes).
For each value of X, there is a corresponding value of Y.
The Normal Probability Curve
The curve never touches the x axis.
Theoretically, no matter how far in either
direction the curve extends, it never meets the x
axis—but it gets increasingly closer.
The total area under the normal distribution
curve is equal to 1.00, or 100%.
A Normal Distribution is a continuous,
symmetric, bell shaped distribution of a variable.
Area Under a Normal Distribution Curve
The area under the part of the normal curve that lies
within 1
0.68, or
within 2
within 3
standard deviation of the mean is approximately
68%;
standard deviations, about 0.95, or 95%
standard deviations, about 0.997, or 99.7%.
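Other Characteristics
Finding the probability: area under the curve
P(a < x < b) is the area under the curve between a and b
P(μ - σ < x < μ + σ) ≈ 0.68
P(μ - 2σ < x < μ + 2σ) ≈ 0.95
P(μ - 3σ < x < μ + 3σ) ≈ 0.997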
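The three areas quoted above can be checked with Python's standard library. This is only a sketch: `area_within` is an illustrative helper name, and `statistics.NormalDist` stands in for the printed normal table.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal, mean 0 and standard deviation 1

def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

The result does not depend on μ or σ, because "within k standard deviations of the mean" always corresponds to -k < Z < k after standardizing.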
Example
Given X ~ N(110, 144), find the values of a and b if
P(a < x < b) = 0.68
Shapes of Normal Distributions
The Standard Normal Distribution
The standard normal distribution is a normal distribution
with a mean of 0 and a standard deviation of 1.
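The standard normal variable Z is given by Z ~ N(0, 1), where
$Z = \frac{X - \mu}{\sigma}$ and X ~ N(μ, σ²)
Use the statistical table to obtain probabilities for X:
P(μ < X < x) = P(0 < Z < z), where z = (x - μ)/σ
TIPS (interpolating between table entries)
P(0 < Z < 0.12) = 0.0478
$P(0 < Z < 0.123) \approx 0.0478 + \frac{3}{10}(0.0517 - 0.0478) = 0.0490$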
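The tip above is linear interpolation between two neighbouring table entries. A small Python sketch, with hypothetical helper names `standardize` and `interpolate`, shows the arithmetic and compares it with the value computed by `statistics.NormalDist`. The numbers 110, 12 and 122 in the last line are illustrative only.

```python
from statistics import NormalDist

def standardize(x, mu, sigma):
    """Z = (X - mu) / sigma."""
    return (x - mu) / sigma

def interpolate(z, z_lo, area_lo, z_hi, area_hi):
    """Linear interpolation of a table area between z_lo and z_hi."""
    fraction = (z - z_lo) / (z_hi - z_lo)
    return area_lo + fraction * (area_hi - area_lo)

# Table entries: P(0 < Z < 0.12) = 0.0478 and P(0 < Z < 0.13) = 0.0517.
table_value = interpolate(0.123, 0.12, 0.0478, 0.13, 0.0517)
software_value = NormalDist().cdf(0.123) - 0.5   # P(0 < Z < 0.123)
print(round(table_value, 4), round(software_value, 4))

# Standardizing a hypothetical x = 122 from N(110, 12**2).
print(standardize(122, 110, 12))
```

Interpolated table values and software values can differ slightly in the last decimal place, since the table entries are already rounded.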
Difference between the two curves
[Figure: area under the normal distribution curve, compared with area under the standard normal distribution curve]
Finding Area under the Standard
Normal Distribution
GENERAL PROCEDURE
STEP 1 Draw a picture.
STEP 2 Shade the area
desired.
STEP 3 Find the correct
figure in the following
Procedure Table (the figure
that is similar to the one
you’ve drawn).
STEP 4 Follow the
directions given in the
appropriate block of the
Procedure Table to get the
desired area.
EXAMPLE 1
P (0 < Z < 2.34) = 0.4904
P (-2.34 < Z < 0) = 0.4904
P (0 < Z < 0.156) = 0.0620
P (-1.738 < Z < 0) = 0.4589
Finding Area under the Standard
Normal Distribution
EXAMPLE 2
P (Z > 1.25) = 0.1056
P (Z < -2.13) = 0.0166
P (Z > 2.099) = 0.0179
P (Z < -0.087) = 0.4653
EXAMPLE 3
P (0.21 < Z < 2.34) = 0.4072
P (-2.134 < Z < -0.21) = 0.4004
P (0.67 < Z < 1.156) = 0.1276
P (-1.738 < Z < -0.79) = 0.1737
Finding Area under the Standard
Normal Distribution
EXAMPLE 4
P (-0.21 < Z < 2.34) = 0.5736
P (-2.134 < Z < 0.21) = 0.5668
P (-0.67 < Z < 1.156) = 0.6248
P (|Z| < 0.79) = 0.5704
EXAMPLE 5
P (Z < 1.21) = 0.8869
P (Z < 2.099) = 0.9821
P (Z < 0.512) = 0.6957
Finding Area under the Standard
Normal Distribution
EXAMPLE 6
P (Z >-1.25) = 0.8944
P (Z >-2.13) = 0.9834
P (Z >-0.087) = 0.5347
EXAMPLE 7
P (|Z| > 2.34) = 0.0192
P (|Z| > 0.147) = 0.8832
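The kinds of area used in Examples 1 to 7 can all be written in terms of the cumulative probability P(Z < z). The following Python sketch uses `statistics.NormalDist` in place of the printed table; the helper names are assumptions made for illustration.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal distribution

def left_of(a):
    """P(Z < a)."""
    return Z.cdf(a)

def right_of(a):
    """P(Z > a) = 1 - P(Z < a)."""
    return 1 - Z.cdf(a)

def between(a, b):
    """P(a < Z < b) = P(Z < b) - P(Z < a)."""
    return Z.cdf(b) - Z.cdf(a)

def two_tails(a):
    """P(|Z| > a) = 2 * P(Z > |a|), by symmetry."""
    return 2 * (1 - Z.cdf(abs(a)))

print(round(between(0, 2.34), 4))     # compare with Example 1
print(round(right_of(1.25), 4))       # compare with Example 2
print(round(between(0.21, 2.34), 4))  # compare with Example 3
print(round(left_of(1.21), 4))        # compare with Example 5
print(round(two_tails(2.34), 4))      # compare with Example 7
```

Values computed this way may differ from the table-based answers by about 0.0001, because the table answers are built from entries that have already been rounded to four decimals.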
EXERCISE 2.3
1. Given X ~ N(110, 144), find
(a) P (110 < X < 128)
(b) P (X < 150)
(c) P (X > 130)
(d) P (X > 170)
(e) P (98 < X < 128)
(f) P (X < 60)
TIPS
Transform the original variable X, where X ~ N(μ, σ²),
to a standard normal distribution variable Z, where Z ~ N(0, 1):
$Z = \frac{X - \mu}{\sigma}$
EXERCISE 2.3
TIPS
P(0 < Z < a) = 0.0478 gives a = 0.12
P(0 < Z < a) = 0.0490 gives
$a = 0.12 + \frac{0.0490 - 0.0478}{0.0517 - 0.0478} \times \frac{1}{100} \approx 0.1231$
2. If Z ~ N(0,1), find the value of a if
a) P(Z < a) = 0.9693
b) P(Z < a) = 0.3802
c) P(Z < a) = 0.7367
d) P(Z < a) = 0.0793
3.
If X ~ N(μ,36) and P ( X > 82) = 0.0478, find μ.
4.
If X ~ N(100, σ ²) and P ( X < 82) = 0.0478, find σ.
EXERCISE 2.3
Applications of the Normal Distribution
5. The mean number of hours an American worker spends
on the computer is 3.1 hours per workday. Assume the
standard deviation is 0.5 hour. Find the percentage of
workers who spend less than 3.5 hours on the computer.
Assume the variable is normally distributed.
6. The lengths of metal strips produced by a machine are
normally distributed with a mean length of 150 cm and a
standard deviation of 10 cm.
Find the probability that the length of a randomly
selected strip is
a) shorter than 165 cm
b) within 5 cm of the mean
EXERCISE 2.3
Applications of the Normal Distribution
7. The time taken by the milkman to deliver to Jalan Indah
is normally distributed with a mean of 12 minutes and a
standard deviation of 2 minutes. He delivers milk
every day. Estimate the number of days during the year
when he takes
a) longer than 17 minutes
b) less than ten minutes
c) between 9 and 13 minutes
8. To qualify for a police academy, candidates must score
in the top 10% on a general abilities test. The test has a
mean of 200 and a standard deviation of 20.
Find the lowest possible score to qualify. Assume the
test scores are normally distributed.
EXERCISE 2.3
Applications of the Normal Distribution
9. The heights of female students at a particular college are
normally distributed with a mean of 169cm and a standard
deviation of 9 cm.
a) Given that 80% of these female students have a
height less than h cm. Find the value of h.
b) Given that 60% of these female students have a
height greater than y cm. Find the value of y.
10. For a medical study, a researcher wishes to select people
in the middle 60% of the population based on blood
pressure. If the mean systolic blood pressure is 120 and
the standard deviation is 8, find the upper and lower
readings that would qualify people to participate in the
study.
2.4 Central Limit Theorem
OBJECTIVE
At the end of this chapter, you should be able to:
1. Use the central limit theorem to solve problems involving
sample means for large samples (probability of mean)
The Central Limit Theorem
As the sample size n increases without limit, the
shape of the distribution of the sample means
taken with replacement from a population with
mean µ and standard deviation σ will approach a
normal distribution.
This distribution (for sample mean) will have a
mean µ and a standard deviation σ/√n.
The Central Limit Theorem
MATHEMATICAL EXPLANATION
The Central Limit Theorem
TIPS
If X ~ N(μ, σ²) and a sample of size n is selected, then
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
Use a standard normal distribution variable Z, where Z ~ N(0, 1):
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
EXTRA: If the distribution of X is not normal, a sample size of 30
or more is needed to use the central limit theorem.
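The tip above can be sketched in Python as follows. The function name `prob_sample_mean_above` and the numbers (μ = 50, σ = 8, n = 16) are hypothetical, chosen so that the arithmetic is easy to follow; they do not come from the exercises.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_above(xbar, mu, sigma, n):
    """P(sample mean > xbar) when the sample mean is N(mu, sigma**2 / n)."""
    standard_error = sigma / sqrt(n)     # standard deviation of the sample mean
    z = (xbar - mu) / standard_error     # Z = (xbar - mu) / (sigma / sqrt(n))
    return 1 - NormalDist().cdf(z)

# Hypothetical population: mean 50, standard deviation 8, sample of 16.
p = prob_sample_mean_above(52, mu=50, sigma=8, n=16)
print(round(p, 4))
```

Note that the divisor is σ/√n, not σ: the same cutoff is far less likely to be exceeded by a sample mean than by a single observation.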
EXERCISE 2.4
1. A. C. Nielsen reported that children between the ages of 2
and 5 watch an average of 25 hours of television per
week. Assume the variable is normally distributed and the
standard deviation is 3 hours.
If 20 children between the ages of 2 and 5 are randomly
selected, find the probability that the mean of the number
of hours they watch television will be greater than 26.3
hours.
2. The average age of a vehicle registered in the United
States is 8 years, or 96 months. Assume the standard
deviation is 16 months.
If a random sample of 36 vehicles is selected, find the
probability that the mean of their age is between 90 and
100 months.
EXERCISE 2.4
3. The average number of pounds of meat that a person
consumes a year is 218.4 pounds. Assume that the
standard deviation is 25 pounds and the distribution is
approximately normal.
a. Find the probability that a person selected at random
consumes less than 224 pounds per year.
b. If a sample of 40 individuals is selected, find the
probability that the mean of the sample will be less than
224 pounds per year.
2.5 Normal approximation to
the Binomial Distribution
OBJECTIVE
At the end of this chapter, you should be able to:
1. Use the normal approximation to compute probabilities for a
Binomial variable.
Procedure
1. Check to see whether the normal approximation
can be used
2. Find the mean and standard deviation
3. Write the problem in probability notation using X
4. Rewrite the problem by using the continuity
correction factor, and show the corresponding
area under the normal distribution
5. Find the corresponding Z values
6. Find the solution
Normal Approximation to the
Binomial Distribution
If X ~ Bin (n, p) and n and p are such that np ≥ 5 and
nq ≥ 5 where q = 1 – p then X ~ N (np, npq) approximately.
A continuity correction is needed when using a continuous
distribution (normal) as an approximation for a discrete
distribution (binomial), i.e. each whole-number value k is
treated as the interval from k – 0.5 to k + 0.5.
TIPS: think of class boundaries
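The procedure can be sketched in Python for a probability of the form P(X ≤ k). The helper names and the values n = 50, p = 0.4, k = 18 are hypothetical; the continuity correction used here is the usual one, in which P(X ≤ k) is approximated by the normal area to the left of k + 0.5.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Bin(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

def normal_approx_le(k, n, p):
    """Approximate P(X <= k) by N(np, npq) with a continuity correction."""
    q = 1 - p
    if n * p < 5 or n * q < 5:            # step 1: check the conditions
        raise ValueError("np and nq should both be at least 5")
    mean = n * p                          # step 2: mean and standard deviation
    sd = sqrt(n * p * q)
    return NormalDist(mean, sd).cdf(k + 0.5)   # steps 4 to 6

n, p, k = 50, 0.4, 18                     # hypothetical values
print(round(binom_cdf(k, n, p), 4), round(normal_approx_le(k, n, p), 4))
```

The other cases follow the same class-boundary idea: P(X < k) uses k - 0.5, P(X ≥ k) uses the area to the right of k - 0.5, and P(X = k) uses the area between k - 0.5 and k + 0.5.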
EXERCISE 2.5
1. In a sack of mixed grass seeds, the probability that a
seed is ryegrass is 0.35. Find the probability that in a
random sample of 400 seeds from the sack,
a) less than 120 are ryegrass seeds
b) between 120 and 150 (inclusive) are ryegrass seeds
c) more than 160 are ryegrass seeds
2. Find the probability of obtaining 4, 5, 6 or 7 heads when a
fair coin is tossed 12 times, using a normal approximation
to the binomial distribution
2.6 Normal approximation to
the Poisson Distribution
OBJECTIVE
At the end of this chapter, you should be able to:
1. Use the normal approximation to compute probabilities for a
Poisson variable.
Normal approximation to the
Poisson Distribution
If X ~ Po (λ) and λ > 15, then X can be approximated by
Normal distribution with X ~ N (λ, λ)
The continuity correction is also needed.
EXERCISE 2.6
1. If X ~ Po (35), use the normal approximation to find
a) P ( X ≤ 33)
b) P ( X > 37)
c) P (33 < X < 37)
d) P ( X = 37)
EXERCISE 2.6
2. A radioactive disintegration gives counts that follow a
Poisson distribution with a mean count of 25 per second.
Find the probability that in a one-second interval the count is
between 23 and 27 inclusive.
3. The number of hits on a website follows a Poisson
distribution with mean 27 hits per hour.
Find the probability that there will be 90 or more hits in
three hours.
2.7 Normal Probability Plots
OBJECTIVE
At the end of this chapter, you should be able to:
1. Plot and interpret a Normal Probability Plot
Normal Probability Plots
A normal probability plot is used to determine whether
the sample might have come from a normal population
or not.
The most plausible normal distribution is the one whose
mean and standard deviation are the same as the
sample mean and standard deviation.
How to plot?
Arrange the data sample in ascending (increasing) order
Assign the value (i - 0.5) / n to xi
to reflect the position of xi in the ordered sample.
There are i - 1 values less than xi , and i values less
than or equal to xi . The quantity (i - 0.5) / n is a
compromise between the proportions (i - 1) / n and i / n
Plot xi versus (i - 0.5) / n, using a normal probability scale
for the (i - 0.5) / n axis
If the sample points lie approximately on a straight line,
it is plausible that they came from a normal
population.
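The plotting steps can be sketched in Python as follows. The function name `plotting_positions` and the sample values are hypothetical. Converting each (i - 0.5) / n to a standard normal quantile with `inv_cdf` is an assumption made here so that the points can be drawn on ordinary axes; it is equivalent to using a normal probability scale.

```python
from statistics import NormalDist

def plotting_positions(sample):
    """Return (x_i, (i - 0.5) / n, z_i) for the ordered sample, i = 1..n."""
    ordered = sorted(sample)                  # ascending order
    n = len(ordered)
    standard_normal = NormalDist()
    points = []
    for i, x in enumerate(ordered, start=1):
        position = (i - 0.5) / n              # compromise between (i-1)/n and i/n
        z = standard_normal.inv_cdf(position) # normal quantile for that position
        points.append((x, position, z))
    return points

sample = [4.2, 5.1, 3.8, 6.0, 4.9]            # hypothetical data
for x, position, z in plotting_positions(sample):
    print(x, round(position, 2), round(z, 3))
```

Plotting x against z and judging by eye whether the points fall near a straight line is the same check described in the steps above.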
Normal Probability Plots
Besides plotting by hand, we can obtain the plot from software
such as SPSS, Minitab, or Excel. The normality of
the data can also be tested using the Kolmogorov-Smirnov and
Anderson-Darling tests, for example before applying a parametric test.
EXERCISE 2.7
1. A sample of size 5 is drawn. The sample, arranged in
increasing order, is
3.01 3.35 4.79 5.96 7.89
Do these data appear to come from an approximately
normal distribution?
2. The data shown represent the number of movies in the US for
a 14-year period.
2084 848 1497 837 1014 826 910 815 899 750 870 737 859 637
Do these data appear to come from an approximately
normal distribution?
Conclusion
Statistical Inference involves drawing a sample from a
population and analyzing the sample data to learn about
the population.
In many situations, one has an approximate knowledge of
the probability mass function (discrete) or probability
density function (continuous) of the population.
In these cases, the probability mass
or density function can often be well
approximated by one of the several standard
families of curves or functions discussed
in this chapter.
Thank You
NEXT: CHAPTER 3 Sampling Distribution and Confidence Interval