Transcript notes

Statistics for Business
Chapter 5
Discrete Random Variables
5.1 Two Types of Random Variables
5.2 Discrete Probability Distributions
5.3 The Binomial Distribution
5.4 The Poisson Distribution
Two Types of Random Variables
• Random variable: a variable (qualitative or quantitative
attribute that characterizes a population/sample) that assumes
numerical values which are determined by the
outcome of an experiment
– Discrete
– Continuous
• Discrete random variable: Possible values can be
counted or listed, e.g.
– The number obtained from the throw of a die
– The number of children in a family
– The number of students in a class
Random Variables
Continued
• Continuous random variable: May assume
any numerical value in one or more intervals,
e.g.
– The waiting time for the next bus at a bus stop
– The interest rate charged on a business loan
– Distance in meters a student has to walk per day
Discrete Probability Distributions
• The probability distribution of a discrete
random variable is a table, graph or formula
that gives the probability associated with each
possible value that the variable can assume
• Notation: Denote the values of the random
variable by x and the value’s associated probability
by p(x)
Discrete Probability Distribution Properties
1. For any value x of the random variable, p(x) ≥ 0
2. The probabilities of all the events in the
sample space must sum to 1, that is:
\sum_{\text{all } x} p(x) = 1
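The two properties can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the function name is_valid_distribution and the fair-die data are illustrative assumptions.

```python
import math

def is_valid_distribution(p):
    """Check both properties for a dict mapping each value x to p(x)."""
    non_negative = all(prob >= 0 for prob in p.values())   # property 1
    sums_to_one = math.isclose(sum(p.values()), 1.0)       # property 2
    return non_negative and sums_to_one

# Hypothetical example: a fair six-sided die, each face with probability 1/6.
fair_die = {face: 1 / 6 for face in range(1, 7)}
print(is_valid_distribution(fair_die))

# These probabilities sum to 1.1, so property 2 fails.
print(is_valid_distribution({0: 0.5, 1: 0.6}))
```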
Discrete random variable: an example
You are going to flip a coin three times. Let X be the
number of heads that appear in these three flips. Let H represent
the outcome of a head and T the outcome of a tail.
The possible outcomes of this experiment are:
TTT, TTH, THT, THH,
HTT, HTH, HHT, HHH
Thus the possible values of X
(number of heads) are 0, 1, 2, 3.
X is a discrete random variable.
The probability distribution of X
For a fair coin:
x    P(X = x)
0    1/8
1    3/8
2    3/8
3    1/8
This is an example of a binomial distribution, which is explained later.
What is the chance of getting 3 heads? 1/8
What is the chance of getting 1 head? 3/8
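The table can be reproduced by listing all eight equally likely outcomes and counting heads. This is a small Python sketch under the fair-coin assumption stated above; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All sequences of three flips: HHH, HHT, ..., TTT (8 in total).
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]

# Count how many outcomes give each number of heads.
heads_count = Counter(outcome.count("H") for outcome in outcomes)

# For a fair coin every outcome has probability 1/8.
distribution = {
    x: Fraction(count, len(outcomes)) for x, count in sorted(heads_count.items())
}
for x, prob in distribution.items():
    print(x, prob)
```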
Example: Number of TV Sold in a Day
• Let x be the random variable for the number of TVs
sold per day in an appliance store
– x has values x = 0, 1, 2, 3, 4, 5
• Given: Frequency distribution of sales history over the
past 100 days
– Let f(x) be the number of days (of the past 100) on
which x TVs were sold
# TVs sold, x    Frequency, f(x)    Relative frequency, f(x)/100
0                f(0) = 3           3/100 = 0.03
1                f(1) = 20          20/100 = 0.20
2                f(2) = 50          0.50
3                f(3) = 20          0.20
4                f(4) = 5           0.05
5                f(5) = 2           0.02
Total            100                1.00
Example
Continued
• Interpret the relative frequencies as probabilities
– So for any value x, f(x)/n = p(x)
– Assuming that sales remain stable over time
Number of TVs sold in a store per day
TVs sold, x    Probability, p(x)
0              p(0) = 0.03
1              p(1) = 0.20
2              p(2) = 0.50
3              p(3) = 0.20
4              p(4) = 0.05
5              p(5) = 0.02
Total          1.00
Example
Continued
• What is the chance that fewer than 2 TVs will
be sold per day?
– p(x < 2) = p(x = 0 or x = 1)
= p(x = 0) + p(x = 1)
= 0.03 + 0.20 = 0.23
– This uses the addition rule for the mutually
exclusive values of the random variable.
• What is the chance that three or more TVs will
be sold per day?
– p(x ≥ 3) = p(x = 3, 4, or 5)
= p(x = 3) + p(x = 4) + p(x = 5)
= 0.20 + 0.05 + 0.02 = 0.27
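The same steps, from frequencies to probabilities to event probabilities, can be sketched in Python. The frequencies are the ones given in the example; the variable names are illustrative.

```python
# Days (out of the past 100) on which x TVs were sold.
frequencies = {0: 3, 1: 20, 2: 50, 3: 20, 4: 5, 5: 2}
n_days = sum(frequencies.values())

# Interpret relative frequencies as probabilities: p(x) = f(x) / n.
p = {x: f / n_days for x, f in frequencies.items()}

# Addition rule over mutually exclusive values of x.
p_fewer_than_2 = sum(prob for x, prob in p.items() if x < 2)
p_three_or_more = sum(prob for x, prob in p.items() if x >= 3)

print(round(p_fewer_than_2, 2), round(p_three_or_more, 2))
```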
Expected Value of a Discrete Random Variable
The mean or expected value of a discrete
random variable X is:
\mu_X = \sum_{\text{all } x} x \, p(x)
\mu_X is the value expected (or expectation value)
to occur in the long run and on average
Example: Number of TV Sold per day
• How many TVs should be expected to be sold in a day?
– Calculate the expected value of the number of TVs sold, µX
# of TVs, x    Probability, p(x)    x p(x)
0              p(0) = 0.03          0 × 0.03 = 0.00
1              p(1) = 0.20          1 × 0.20 = 0.20
2              p(2) = 0.50          2 × 0.50 = 1.00
3              p(3) = 0.20          3 × 0.20 = 0.60
4              p(4) = 0.05          4 × 0.05 = 0.20
5              p(5) = 0.02          5 × 0.02 = 0.10
Total          1.00                 2.10
• On average, expect to sell 2.1 TVs per day
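The expected-value sum can be written as a short Python sketch. The probabilities are those of the TV example; the function name expected_value is an illustrative assumption.

```python
def expected_value(dist):
    """Sum of x * p(x) over all values x of a discrete random variable."""
    return sum(x * prob for x, prob in dist.items())

# Probability distribution of the number of TVs sold per day.
p = {0: 0.03, 1: 0.20, 2: 0.50, 3: 0.20, 4: 0.05, 5: 0.02}

mu = expected_value(p)
print(round(mu, 2))
```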
Variance
• The variance is the average of the squared
deviations of the different values of the
random variable from the expected value
• The variance of a discrete random variable is:
\sigma_X^2 = \sum_{\text{all } x} (x - \mu_X)^2 \, p(x)
Standard Deviation
• The standard deviation is the square root of
the variance
\sigma_X = \sqrt{\sigma_X^2}
• The variance and standard deviation measure
the spread of the values of the random
variable from their expected value
Example: Number of TV Sold per day
TVs, x    Probability, p(x)    (x − µX)² p(x)
0         p(0) = 0.03          (0 − 2.1)² (0.03) = 0.1323
1         p(1) = 0.20          (1 − 2.1)² (0.20) = 0.2420
2         p(2) = 0.50          (2 − 2.1)² (0.50) = 0.0050
3         p(3) = 0.20          (3 − 2.1)² (0.20) = 0.1620
4         p(4) = 0.05          (4 − 2.1)² (0.05) = 0.1805
5         p(5) = 0.02          (5 − 2.1)² (0.02) = 0.1682
Total     1.00                 0.8900
Example: Number of TV Sold per day
• Variance equals 0.8900
• Standard deviation is the square root of the
variance
• Standard deviation equals 0.9434
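The variance and standard deviation of the TV example can be recomputed with a few lines of Python. This is a sketch that uses the probabilities from the example; the variable names are illustrative.

```python
import math

# Probability distribution of the number of TVs sold per day.
p = {0: 0.03, 1: 0.20, 2: 0.50, 3: 0.20, 4: 0.05, 5: 0.02}

# Expected value, then the probability-weighted squared deviations from it.
mu = sum(x * prob for x, prob in p.items())
variance = sum((x - mu) ** 2 * prob for x, prob in p.items())
std_dev = math.sqrt(variance)

print(round(variance, 4), round(std_dev, 4))
```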
Flipping a coin three times
Number of heads appearing, x    P(X = x)
0                               1/8
1                               3/8
2                               3/8
3                               1/8
E(X) = 0 × 1/8 + 1 × 3/8 + 2 × 3/8 + 3 × 1/8
= 12/8 = 1.5
What is the variance & standard deviation of X?
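One way to work this question is to apply the variance definition directly to the table. The sketch below uses exact fractions so the arithmetic can be followed by hand; it assumes the fair-coin probabilities shown above.

```python
import math
from fractions import Fraction

# Distribution of the number of heads in three flips of a fair coin.
p = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

mean = sum(x * prob for x, prob in p.items())
# Probability-weighted squared deviations from the mean.
variance = sum((x - mean) ** 2 * prob for x, prob in p.items())
std_dev = math.sqrt(variance)

print(mean, variance, round(std_dev, 4))
```

With these probabilities the variance works out to 3/4, and the standard deviation is its square root, roughly 0.866.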
Permutation
By the PERMUTATIONS of the letters abc we mean all
of their possible arrangements:
abc, acb, bac, bca, cab, cba;
where ab means that a was chosen first and b second; ba means that b was
chosen first and a second; and so on.
There are 6, or 3! = 3*2*1, possible arrangements.
In general, nPk means "the number of permutations
of n different things taken k at a time."
nPk = n(n − 1)(n − 2) · · · to k factors = n! / (n − k)!
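The arrangements of abc and the nPk formula can be checked with Python's standard library. The helper name n_p_k is an illustrative assumption; math.perm is the built-in equivalent in Python 3.8 and later.

```python
import math
from itertools import permutations

# All arrangements of the letters a, b, c.
arrangements = ["".join(letters) for letters in permutations("abc")]
print(arrangements)

def n_p_k(n, k):
    """Number of permutations of n different things taken k at a time."""
    return math.factorial(n) // math.factorial(n - k)

print(n_p_k(3, 3), n_p_k(5, 2))
```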
Combination
In permutations, the order is important -- we
count abc as different from bca. But in
combinations we are concerned only that a, b,
and c have been selected. abc and bca are
the same combination.
nCk = nPk / k!
Example: How many combinations are there of 5
things taken 4 at a time?
Solution. 5C4 = 5· 4· 3· 2 /(1· 2· 3· 4) = 5
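The relation nCk = nPk / k! and the 5C4 example can be verified in a short Python sketch. The helper name n_c_k and the letters abcde are illustrative assumptions; math.comb is the built-in equivalent.

```python
import math
from itertools import combinations

def n_c_k(n, k):
    """Number of combinations: permutations divided by the k! orderings."""
    n_p_k = math.factorial(n) // math.factorial(n - k)
    return n_p_k // math.factorial(k)

# The selections of 4 letters out of 5, ignoring order.
selections = ["".join(chosen) for chosen in combinations("abcde", 4)]
print(n_c_k(5, 4), selections)
```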
The Binomial Distribution
• A binomial experiment has the following properties:
1. The experiment consists of n identical trials
2. Each trial results in one of 2 possible outcomes: “success” or
“failure”
3. The probability of success, p, is constant from trial to trial
– The probability of failure, q, is 1 – p
4. Trials are independent
• If X is the total number of successes in n trials of a
binomial experiment, then X is a binomial random
variable.
• Notice that X can have any value from 0 to n.
Binomial Probability Distribution
P(X = x) = {}_nC_x \, \pi^x (1 - \pi)^{n - x}
n is the number of trials (experiments)
x is the number of successes (may be 0, 1, 2, …, n)
\pi is the probability of success on each trial (written p elsewhere in these notes)
{}_nC_x = \frac{n!}{x!\,(n - x)!}
is the number of ways of arranging x successes
in n trials.
Binomial Distribution
Continued
• For a binomial random variable x, the probability of x
successes in n trials is given by the binomial
distribution:
p(x) = \frac{n!}{x!\,(n - x)!} \, p^x q^{n - x}
– n! is read as “n factorial” and n! = n × (n-1) × (n-2) × ... × 1
– 0! =1
– Not defined for negative numbers or fractions
– And p + q = 1
Examples
• Roll a standard die ten times and count the
number of sixes. The distribution of this
random number is a binomial distribution
with n = 10 and p = 1/6.
• Flip a coin three times and count the number
of heads. The distribution of this random
number is a binomial distribution with n = 3
and p = 1/2.
Flipping a coin three times
The number of heads X is a binomial random variable. For a fair coin, p = q
= 0.5. The probability of getting x heads in 3 trials is:
P(x) = 3Cx / 8
x    P(X = x)
0    1/8
1    3/8
2    3/8
3    1/8
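The binomial formula reproduces this table directly. The following Python sketch uses exact fractions; the function name binomial_pmf is an illustrative assumption.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Three flips of a fair coin: n = 3, p = 1/2.
coin = [binomial_pmf(x, 3, Fraction(1, 2)) for x in range(4)]
for x, prob in enumerate(coin):
    print(x, prob)
```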
Example:
The Department of Labor reports that 20% of the workforce
in a city is unemployed.
In an interview with 14 workers in that city:
What is the probability that exactly
three are unemployed?
P(X = 3) = {}_{14}C_3 \, (.20)^3 (1 - .20)^{11}
= (364)(.0080)(.0859)
= .2501
What is the probability that at least three are unemployed?
P(X \ge 3) = {}_{14}C_3 (.20)^3 (.80)^{11} + \dots + {}_{14}C_{14} (.20)^{14} (.80)^0
= .250 + .172 + \dots + .000 = .551
Example of Binomial Probability Distribution
cont.
What is the probability that at least one is unemployed?
P(X \ge 1) = 1 - P(X = 0)
= 1 - {}_{14}C_0 \, (.20)^0 (1 - .20)^{14}
= 1 - .044 = .956
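The three probabilities in the unemployment example can be recomputed as follows. This is a Python sketch with an illustrative helper binomial_pmf. Summing the unrounded terms gives about .552 for "at least three", which differs slightly from the .551 obtained by adding terms already rounded to three decimals.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 14, 0.20  # 14 workers interviewed, 20% unemployment rate

p_exactly_3 = binomial_pmf(3, n, p)
p_at_least_3 = sum(binomial_pmf(x, n, p) for x in range(3, n + 1))
p_at_least_1 = 1 - binomial_pmf(0, n, p)  # complement of "none unemployed"

print(round(p_exactly_3, 4), round(p_at_least_3, 4), round(p_at_least_1, 4))
```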
Example: Binomial Distribution
n = 4, p = 0.1
Binomial Probability Table
Table 5.7(a) for n = 4, with x = 2 and p = 0.1
values of p (.05 to .50)
x    0.05      0.10      0.15      …    0.50
0    0.8145    0.6561    0.5220    …    0.0625    4
1    0.1715    0.2916    0.3685    …    0.2500    3
2    0.0135    0.0486    0.0975    …    0.3750    2
3    0.0005    0.0036    0.0115    …    0.2500    1
4    0.0000    0.0001    0.0005    …    0.0625    0
     0.95      0.90      0.85      …    0.50      x
For p = 0.1 and x = 2: P(x = 2) = 0.0486
Example 5.10: Incidence of Nausea
after Treatment
• Suppose 10% of the patients will experience nausea
following treatment with a cancer drug.
• Take a sample of 4 patients, all treated with the
same cancer drug. Let X be the number of patients,
who experience nausea after the treatment.
• Find the probability that 0 of the 4 patients
treated will experience nausea
• Given: n = 4, p = 0.1, with X = 0
Then: q = 1 – p = 1 – 0.1 = 0.9
Example 5.10
Continued
p(x = 0) = \frac{4!}{0!\,(4 - 0)!} (0.1)^0 (0.9)^4
= (1)(0.1)^0 (0.9)^4 = 0.6561
Example 5.11: Incidence of Nausea
after Treatment
x = number of patients who will experience nausea
following treatment with the drug out of the 4
patients tested
Find the probability that at least 3 of the 4 patients treated
will experience nausea
Set x = 3, n = 4, p = 0.1, so q = 1 – p = 1 – 0.1 = 0.9
Then:
p(x \ge 3) = p(x = 3 \text{ or } 4)
= p(x = 3) + p(x = 4)
= 0.0036 + 0.0001 = 0.0037
The values 0.0036 and 0.0001 are read from a binomial table.
This uses the addition rule for the
mutually exclusive values of
the binomial random variable
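Examples 5.10 and 5.11 can also be computed from the formula rather than the table. The sketch below is in Python; the helper name binomial_pmf is an illustrative assumption.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 4, 0.1  # 4 patients, 10% chance of nausea for each

# Example 5.10: none of the 4 patients experiences nausea.
p_none = binomial_pmf(0, n, p)

# Example 5.11: at least 3 of the 4 do (x = 3 or x = 4, mutually exclusive).
p_at_least_3 = binomial_pmf(3, n, p) + binomial_pmf(4, n, p)

print(round(p_none, 4), round(p_at_least_3, 4))
```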
Several Binomial Distributions
Mean and Variance of a Binomial Random
Variable
• If x is a binomial random variable with
parameters n and p (so q = 1 – p), then
– Mean \mu_x = np
– Variance \sigma_x^2 = npq
– Standard deviation \sigma_x = \sqrt{npq}
Back to Example 5.11
• Of 4 randomly selected patients, how many
should be expected to experience nausea
after treatment?
– Given: n = 4, p = 0.1
– Then µX = np = 4 × 0.1 = 0.4
– So expect 0.4 of the 4 patients to experience
nausea
• If at least three of four patients experienced nausea,
this would be many more than the 0.4 that are
expected
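The shortcut formulas np and npq can be compared with the general definitions of mean and variance. This Python sketch does so for the nausea example (n = 4, p = 0.1); the variable names are illustrative.

```python
import math
from math import comb

n, p = 4, 0.1
q = 1 - p

# Full binomial distribution for x = 0, 1, ..., n.
pmf = {x: comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)}

# General definitions applied to the distribution.
mean = sum(x * prob for x, prob in pmf.items())
variance = sum((x - mean) ** 2 * prob for x, prob in pmf.items())

# These should agree with the shortcuts n*p and n*p*q.
print(round(mean, 4), round(n * p, 4))
print(round(variance, 4), round(n * p * q, 4))
print(round(math.sqrt(variance), 4))
```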
What happens if…
• We don’t know the probability of a certain
outcome occurring
• But we know the average number of times such an
outcome occurs in
– a fixed period of time, or
– an interval of any kind
• Example: an earthquake of magnitude greater than 8 occurs in
Japan on average once in 100 years.
Remember µ = n•p; µ and p are related.
Poisson experiment is …
* The experiment results in outcomes that can be
classified as successes or failures.
* The average number of successes (μ) that occurs in a
specified region (in time or space) is known.
* The probability that a success will occur is
proportional to the size of the region.
* The probability that a success will occur in an
extremely small region is virtually zero.
* The probability of an event in one interval is
independent of the probability of an event in any other
non-overlapping interval.
The Poisson Distribution
• Consider the number of times an event occurs over
an interval of time or space, and assume that
1. The probability of occurrence is the same for any
intervals of equal length
2. The occurrence in any interval is independent of an
occurrence in any non-overlapping interval
• If X = the number of occurrences in a specified
interval, then X is a Poisson random variable
The Poisson Distribution
Continued
• Suppose μ is the mean or expected number of
occurrences during a specified interval
• The probability of x occurrences in the interval when
μ are expected is described by the Poisson
distribution
p(x) = \frac{e^{-\mu} \mu^x}{x!}
– where x can take any of the values x = 0, 1, 2, 3, …
– and e ≈ 2.71828 (e is the base of the natural logarithm)
Example
• The average number of cars sold by the Acme Car
Company is 2 cars per day. What is the probability
that exactly 3 cars will be sold tomorrow?
• Solution: This is a Poisson experiment in which we know the
following:
• μ = 2; since 2 cars are sold per day, on average.
• x = 3; since we want to find the likelihood that 3 cars will be
sold tomorrow.
• We plug these values into the Poisson formula as follows:
P(3; 2) = \frac{e^{-2} \, 2^3}{3!} = 0.180
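The car-sales calculation can be sketched in Python using only the standard library. The function name poisson_pmf is an illustrative assumption.

```python
import math

def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when mu are expected."""
    return math.exp(-mu) * mu**x / math.factorial(x)

# Acme Car Company: mu = 2 cars per day, x = 3 cars tomorrow.
p_three_cars = poisson_pmf(3, 2)
print(round(p_three_cars, 3))
```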
Example
Suppose the average number of tigers seen on a day trip to
a national forest is 5. What is the probability that tourists
will see fewer than four tigers on the next 1-day trip?
To solve this problem, we need to find the probability that tourists will
see 0, 1, 2, or 3 tigers. Thus, we need to calculate the sum of four
probabilities: P(0; 5) + P(1; 5) + P(2; 5) + P(3; 5).
To compute this sum, we use the Poisson formula:
P(x ≤ 3; 5) = P(0; 5) + P(1; 5) + P(2; 5) + P(3; 5)
= [ (e^-5)(5^0) / 0! ] + [ (e^-5)(5^1) / 1! ] + [ (e^-5)(5^2) / 2! ] + [ (e^-5)(5^3) / 3! ]
= [ 0.006738 ] + [ 0.033690 ] + [ 0.084224 ] + [ 0.140374 ]
= 0.2650
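The sum of the four Poisson probabilities can be checked with a short Python sketch. The helper name poisson_pmf is an illustrative assumption.

```python
import math

def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when mu are expected."""
    return math.exp(-mu) * mu**x / math.factorial(x)

mu = 5  # average number of tigers seen on a day trip

# Fewer than four tigers means x = 0, 1, 2, or 3.
terms = [poisson_pmf(x, mu) for x in range(4)]
p_fewer_than_4 = sum(terms)

print([round(t, 6) for t in terms], round(p_fewer_than_4, 4))
```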
More examples of Poisson Distribution
• The number of people killed by accidents each year
in a country.
• The number of earthquakes recorded per year in the
world.
• The number of people who travel between Zhuhai
and Hong Kong each week.
• The number of accidents a driver is involved in per
100,000 km travelled.
• The number of trees per square mile in a forest.
• The number of mutations in a given stretch of DNA
after a fixed dosage of radiation.
Example 5.13: ATC Center Errors
• An air traffic control (ATC) center has been averaging
20.8 errors per year and lately has been making 3
errors per week
• Let x be the number of errors made by the ATC
center during one week
• Given: µ = 20.8 errors per year
• Then: µ = 0.4 errors per week
– There are 52 weeks per year so µ
for a week is:
– µ = (20.8 errors/year)/(52 weeks/year)
= 0.4 errors/week
Example 5.13: ATC Center Errors
Continued
• Find the probability that 3 errors (x =3) will occur in a
week
– Want p(x = 3) when µ = 0.4
p(x = 3) = \frac{e^{-0.4} (0.4)^3}{3!} = 0.0072
• Find the probability that no errors (x = 0) will occur in
a week
– Want p(x = 0) when µ = 0.4
p(x = 0) = \frac{e^{-0.4} (0.4)^0}{0!} = 0.6703
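Example 5.13 has two steps: rescale the yearly mean to a weekly mean, then apply the Poisson formula. Both are shown in the Python sketch below; the names poisson_pmf, mu_per_year and mu_per_week are illustrative.

```python
import math

def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when mu are expected."""
    return math.exp(-mu) * mu**x / math.factorial(x)

mu_per_year = 20.8
mu_per_week = mu_per_year / 52  # 52 weeks per year gives 0.4 errors/week

p_three_errors = poisson_pmf(3, mu_per_week)
p_no_errors = poisson_pmf(0, mu_per_week)

print(round(mu_per_week, 2), round(p_three_errors, 4), round(p_no_errors, 4))
```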
Poisson Probability Table
Table 5.9
Rows: µ, mean number of occurrences. Columns: x, number of occurrences.
µ       x = 0     x = 1     x = 2     x = 3     x = 4     x = 5
0.1     0.9048    0.0905    0.0045    0.0002    0.0000    0.0000
0.2     0.8187    0.1637    0.0164    0.0011    0.0001    0.0000
…       …         …         …         …         …         …
0.4     0.6703    0.2681    0.0536    0.0072    0.0007    0.0001
…       …         …         …         …         …         …
1.00    0.3679    0.3679    0.1839    0.0613    0.0153    0.0031
For µ = 0.4: p(x = 3) = \frac{e^{-0.4} (0.4)^3}{3!} = 0.0072
Poisson Probability Calculations
Example: Poisson Distribution
µ = 0.4
Mean and Variance of a Poisson Random
Variable
• If x is a Poisson random variable with
parameter µ, then
– Mean \mu_x = \mu
– Variance \sigma_x^2 = \mu
– Standard deviation \sigma_x is the square root of the variance: \sigma_x = \sqrt{\mu}
Several Poisson Distributions
Back to Example 5.13
• In the ATC center situation, 20.8 errors
occurred on average per year
• Assume that the number x of errors during
any span of time follows a Poisson distribution
for that time span
• Per week, the parameters of the Poisson
distribution are:
– mean µ = 0.4 errors/week
– standard deviation σ = 0.6325 errors/week
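The weekly mean and standard deviation can be checked numerically by applying the general definitions to the Poisson probabilities. In this Python sketch the infinite sum is truncated at x = 49, an assumption that is harmless here because the later terms are negligibly small for µ = 0.4.

```python
import math

def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when mu are expected."""
    return math.exp(-mu) * mu**x / math.factorial(x)

mu = 0.4           # errors per week
xs = range(50)     # truncation of x = 0, 1, 2, ... (assumed sufficient)

mean = sum(x * poisson_pmf(x, mu) for x in xs)
variance = sum((x - mean) ** 2 * poisson_pmf(x, mu) for x in xs)
std_dev = math.sqrt(variance)

print(round(mean, 4), round(variance, 4), round(std_dev, 4))
```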
From binomial to Poisson Distribution
We can deduce the Poisson distribution from the binomial
distribution by using µ = np,
where µ is the mean and p is
the probability of
success,
as n becomes very large
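The limit can be illustrated numerically: hold µ fixed, set p = µ/n, and watch the binomial probability approach the Poisson probability as n grows. The values µ = 2 and x = 3 are borrowed from the car-sales example purely for illustration, and the function names are assumptions.

```python
import math
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when mu are expected."""
    return math.exp(-mu) * mu**x / math.factorial(x)

mu, x = 2, 3
errors = []
for n in (10, 100, 1000, 10000):
    p = mu / n  # keeps the mean n * p equal to mu
    gap = abs(binomial_pmf(x, n, p) - poisson_pmf(x, mu))
    errors.append(gap)
    print(n, gap)
```

The printed gap shrinks as n increases, which is what the limiting argument predicts.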
Sum of Poisson Distribution
The sum of the probabilities P(X = r) or simply P(r) for r = 0,
1, 2, … is 1.
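This follows from the power series of the exponential function. The derivation below is a short sketch and writes µ for the mean, as in the Poisson formula above.

```latex
\sum_{r=0}^{\infty} P(r)
  = \sum_{r=0}^{\infty} \frac{e^{-\mu}\,\mu^{r}}{r!}
  = e^{-\mu} \sum_{r=0}^{\infty} \frac{\mu^{r}}{r!}
  = e^{-\mu}\, e^{\mu}
  = 1
```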
Chapter Five-Discrete Random Variables
GOALS
After you have completed this chapter, you will be able to:
ONE
Define the terms random variable and probability distribution.
TWO
Distinguish between discrete and continuous probability distributions.
THREE
Calculate the mean, variance, and standard deviation of a discrete probability
distribution.
FOUR
Compute probabilities, the mean, and the variance of a binomially distributed random
variable.
FIVE
Compute probabilities, the mean, and the variance of a Poisson-distributed random
variable.