Transcript notes

2013
Statistics for Business
Chapter 4
Probability
4.1 The Concept of Probability
4.2 Sample Spaces and Events
4.3 Some Elementary Probability Rules
4.4 Conditional Probability and Independence
Probability and statistics
The use of Probability in our life
• Life is rife with uncertainty.
– Will it rain tomorrow?
– How much oil can be found by drilling here?
– Will the economy be better 6 months from now?
• Sometimes we can predict it, many times we
cannot.
• Probability quantifies the best we can say
about the situation.
“The laws of probability, so true in
general, so fallacious in particular.”
Edward Gibbon
(English Historian, 1737-1794)
Richard P. Feynman
QED: The Strange Theory of Light and Matter
Philosophers have said that if the same circumstances don't
always produce the same results, predictions are impossible and
science will collapse. Here is a circumstance—identical photons
are always coming down in the same direction to the piece of
glass—that produces different results. We cannot predict
whether a given photon will arrive at A or B. All we can predict is
that out of 100 photons that come down, an average of 4 will be
reflected by the front surface. Does this mean that physics, a
science of great exactitude, has been reduced to calculating only
the probability of an event, and not predicting exactly what will
happen? Yes. That's a retreat, but that's the way it is: Nature
permits us to calculate only probabilities. Yet science has not
collapsed.
The Concept of Probability
• An experiment is any process of observation with an
uncertain (random) outcome.
• A sample space is the collection of all possible
outcomes of an experiment.
• A random variable is a function defined on a sample
space that characterizes an outcome.
• Probability is a measure of the chance that an
experimental outcome will occur when an
experiment is carried out.
Probability: basic properties
If E is an experimental outcome, then P(E)
denotes the probability that E will occur
with the following basic properties:
1. 0 ≤ P(E) ≤ 1 such that:
• If E can never occur, then P(E) = 0
• If E is certain to occur, then P(E) = 1
2. The probabilities of all the experimental
outcomes must sum to 1
Classical experiment : A fair die
Possible outcomes: The
numbers 1, 2, 3, 4, 5, 6
There are six possible outcomes and the sample space
consists of six elements: {1, 2, 3, 4, 5, 6}.
One possible event:
The occurrence of an
even number. That is,
we collect the
outcomes 2, 4, and 6.
An Event is the collection
of one or more outcomes of
an experiment.
Events
• An event is a set (or collection) of
experimental outcomes
• The probability of an event is the sum of the
probabilities of the experimental outcomes
that correspond to the event
Fair die experiment
P(E) = (the number of outcomes in which E happened) / (the total number of outcomes)
P(even number) = (the # of outcomes
in which an even number appears)/
(the # of total outcomes)
= 3/6
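The counting rule above can be checked by enumerating the sample space. This is a minimal Python sketch, assuming equally likely outcomes; the names `sample_space`, `even`, and `prob` are illustrative and do not come from the slides.

```python
from fractions import Fraction

# Sample space for one toss of a fair die (equally likely outcomes assumed).
sample_space = {1, 2, 3, 4, 5, 6}

# Event: an even number appears.
even = {x for x in sample_space if x % 2 == 0}

def prob(event, space):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event & space), len(space))

p_even = prob(even, sample_space)
print(p_even)  # 1/2
```

Exact fractions are used here so that 3/6 reduces to 1/2 without floating-point rounding.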
Discrete Sample Space
• Consider the experiment of flipping two coins.
• It is possible to get 0 heads, 1 head, or 2 heads.
Thus, the sample space could be {0, 1, 2}.
• Another way to look at it is as the ordered flips {HH, HT, TH, TT}.
• The second way is sometimes more convenient
because each outcome is equally likely.
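The two descriptions of the sample space can be compared by listing the ordered flips and grouping them by number of heads. This is a rough Python sketch, assuming two distinguishable fair coins; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

# Ordered outcomes of flipping two distinguishable fair coins.
flips = ["".join(p) for p in product("HT", repeat=2)]
print(flips)  # ['HH', 'HT', 'TH', 'TT']

# Group the four equally likely outcomes by number of heads.
heads_count = Counter(f.count("H") for f in flips)
p_heads = {k: heads_count[k] / len(flips) for k in sorted(heads_count)}
print(p_heads)  # {0: 0.25, 1: 0.5, 2: 0.25}
```

The output suggests why {0, 1, 2} is less convenient: its three outcomes are not equally likely.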
• If the two indistinguishable coins are tossed
simultaneously, there are just three possible
outcomes, {H, H}, {H, T}, and {T, T}.
• If the coins are different, or if they are thrown
one after the other, there are four distinct
outcomes: (H, H), (H, T), (T, H), (T, T), which
are often presented in a more concise form:
HH, HT, TH, TT.
• Thus, depending on the nature of the
experiment, the sample space has 3 or 4 outcomes.
Continuous Sample Space
Arrival time. The experimental setting is a metro
(underground) station where trains pass (ideally)
with equal intervals. A person enters the station.
The experiment is to note the time of arrival
past the departure time of the last train. If T is
the interval between two consecutive trains,
then the sample space for the experiment is the
interval [0, T], or
[0, T] = {t: 0 ≤ t ≤ T}.
Human height. The experiment is to randomly
select a person in China and measure his or her height.
The sample space contains the 1.3 billion people
living in China. In this case, the height of the
selected person becomes a random variable.
It is also possible to consider the sample space
consisting of all possible values of height
measurements of the Chinese people. While the
Chinese population is discrete, we may assume
that in some height range near the average, all
possible heights are realized making up a
continuous Sample Space.
Assigning Probabilities to Experimental
Outcomes
• Classical Method (theoretical)
– For equally likely outcomes
• Long-run relative frequency (empirical)
– Long-run experiment (e.g. throwing a die many times)
– The number of times the event happened over
the total number of past observations
• Subjective
– Assessment based on experience, common
sense, intuition or expertise.
Classical Method
• Frequently used when the experimental
outcomes are equally likely to occur
• Example 1: tossing a “fair” coin
– Two outcomes: head (H) and tail (T)
– If the coin is fair, then H and T are equally likely to
occur any time the coin is tossed
– So P(H) = 0.5, P(T) = 0.5
• 0 < P(H) < 1, 0 < P(T) < 1
• P(H) + P(T) = 1
Classical Method: Fair die experiment
If the die is fair, each number is equally likely
to occur any time the die is tossed.
Let X be the number we get in tossing the die
once.
P(X=2) = 1/6
Long-Run Relative Frequency Method
• Let E be an outcome of an experiment
• If it is performed many times, P(E) is the
relative frequency of E
– P(E) is the percentage of times E occurs in many
repetitions of the experiment
• Use sampled or historical data
• Example: Of 1,000 randomly selected consumers, 140
preferred brand X
• The probability of randomly picking a person who
prefers brand X is 140/1000 = 0.14 or 14%
Long-Run Relative Frequency Method: Example
1. An accounts receivable manager knows from
past data that about 70 of 1000 accounts became
uncollectible.
The manager would estimate the probability of bad
debts as 70/1000 = .07 or 7%.
2. Tossing a fair coin 3000 times, we can see that
although the proportion of heads was far from 0.5 in
the first 100 tosses, it seemed to stabilize and
approach 0.5 as the number of tosses increased.
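The stabilizing behaviour described in the coin example can be imitated with a small simulation. This is a minimal Python sketch, not the experiment behind the slide; the function name `running_proportion_of_heads` and the fixed seed are arbitrary choices made for repeatability.

```python
import random

def running_proportion_of_heads(n_tosses, seed=0):
    """Simulate fair coin tosses; return the proportion of heads after each toss."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    heads = 0
    proportions = []
    for i in range(1, n_tosses + 1):
        heads += rng.random() < 0.5  # True counts as 1
        proportions.append(heads / i)
    return proportions

props = running_proportion_of_heads(3000)
for n in (10, 100, 1000, 3000):
    print(n, round(props[n - 1], 3))
```

With more tosses, the printed proportions should generally drift closer to 0.5, although any single run can wander.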
Long-Run Relative Frequency Method:
application
• Often we determine the probability from a random sample
(Long-Run Relative Frequency Method) and apply it to the
population.
• Of 5528 Zhuhai residents randomly sampled,
445 prefer to watch CCTV-1
• Estimated Share P(CCTV-1) = 445 / 5528 = 0.0805
• So the probability that any Zhuhai resident chosen at random
prefers CCTV-1 is 0.0805
• Assuming total population in Zhuhai is 1,000,000 :
• Size of audience in the city = Population x Share
so 1,000,000 x 0.0805 = 80,500
Subjective Probability
• Using experience, intuitive judgment, or personal
expertise to assess/derive a probability
• May or may not have relative frequency
interpretation (Some events cannot be repeated many times)
• Contains a high degree of personal bias.
• What is the probability that your favorite basketball
or football team wins the next game? (e.g. sports
betting)
Subjective probability & betting
The odds in betting reflect the subjective
probability guessed by the crowd.
How much would you be willing to pay for a ticket
that is worth $10 if there is life on Mars and
nothing if there is not?
Subjective probability usually reflects
opinion more than reality. It is an
area of research in psychology.
Probabilities: Equally Likely Outcomes
• If the sample space outcomes (or
experimental outcomes) are all equally likely,
then the probability that an event will occur is
equal to the ratio:
– The number of ways the event can occur
– Over the total number of outcomes
(Number of sample space outcomes that correspond to the event) / (Total number of sample space outcomes)
Watch out!
• Consider, for example, the question of
whether or not there is life on Mars. There are
only two possible outcomes in the sample
space.
1. There is life on Mars.
2. There is no life on Mars.
• However, you cannot conclude that the
probability of life on Mars is p = 1/2, because the two
outcomes need not be equally likely.
Some Elementary Probability Rules
1. Complement
2. Union
3. Intersection
4. Addition
5. Conditional probability
6. Multiplication
Complement
• The complement (Ā) of an event A is the set
of all sample space outcomes not in A
• P(Ā) = 1 – P(A)
[Figure: Venn diagram showing an event inside the sample space]
Union and Intersection
• The union of A and B is the set of elementary events
that belong to either A or B or both
– Written as A ∪ B
• The intersection of A and B is the set of elementary
events that belong to both A and B
– Written as A ∩ B
Other rules
• Complement of complement
–(E’)’ = E
• Complement of intersection/union
–(A∩B)’ = A’ ∪ B’
–(A ∪ B)’ = A’∩B’
Some Elementary Probability Rules
[Figure: Venn diagram of the entire event space]
Mutually Exclusive
• A and B are mutually exclusive if they have no
sample space outcomes in common
• In other words:
P(A∩B) = 0
The Addition Rule (special & general)
• If A and B are mutually exclusive, then the
probability that A or B (the union of A and B)
will occur is
P(A∪B) = P(A) + P(B) (special rule)
• If A and B are not mutually exclusive:
P(A∪B) = P(A) + P(B) – P(A∩B) (general rule)
where P(A∩B) is the joint probability of A and
B both occurring together
Example: Newspaper Subscribers #1
• Define events:
– A = event that a randomly selected household subscribes
to the Atlantic Journal
– B = event that a randomly selected household subscribes
to the Beacon News
• Given:
– total number in city, N = 1,000,000
– number subscribing to A, N(A) = 650,000
– number subscribing to B, N(B) = 500,000
– number subscribing to both, N(A∩B) = 250,000
Example: Newspaper Subscribers #2
• Use the relative frequency method to assign
probabilities
P(A) = 650,000 / 1,000,000 = 0.65
P(B) = 500,000 / 1,000,000 = 0.50
P(A∩B) = 250,000 / 1,000,000 = 0.25
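Example: Newspaper Subscribers (contingency table, Table 4.3)
              B            B'           Total
A          250,000      400,000       650,000
A'         250,000      100,000       350,000
Total      500,000      500,000     1,000,000
The cell counts follow from N, N(A), N(B), and N(A∩B) given for this example.
A contingency table is a tabular representation of categorical data.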
Example: Newspaper Subscribers #3
• Refer to the contingency table
• The chance that a household does not
subscribe to either newspaper
P(A'∩B') = 100,000 / 1,000,000 = 0.10
Example: Newspaper Subscribers #4
• The chance that a household subscribes to either
newspaper:
P(A∪B) = P(A) + P(B) – P(A∩B)
= 0.65 + 0.50 – 0.25
= 0.90
Other method to find more complex
probabilities
Since:
(A'∩B')' = (A')'∪(B')' = A∪B
Therefore:
P(A'∩B') = 1 – P((A'∩B')')
= 1 – P(A∪B)
= 1 – 0.90 = 0.10
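The newspaper calculations can be reproduced in a few lines. This is a minimal Python sketch using the counts given in the example; the variable names are illustrative, and exact fractions are used only to avoid floating-point rounding.

```python
from fractions import Fraction

# Counts given in the newspaper example.
N = 1_000_000
n_a, n_b, n_both = 650_000, 500_000, 250_000

# Relative-frequency probabilities.
p_a = Fraction(n_a, N)
p_b = Fraction(n_b, N)
p_a_and_b = Fraction(n_both, N)

p_a_or_b = p_a + p_b - p_a_and_b   # general addition rule
p_neither = 1 - p_a_or_b           # complement of the union

print(float(p_a_or_b), float(p_neither))  # 0.9 0.1
```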
Conditional Probability and
Independence
• Conditional probability is used to determine
how two events are related.
• The probability of an event A, given that the
event B has occurred, is called the conditional
probability of A given B
– Denoted as P(A|B)
• Further, P(A|B) = P(A∩B) / P(B)
– assume P(B) ≠ 0
Interpretation
• Restrict sample space to just event B
• The conditional probability P(A|B) is the
chance of event A occurring in this new
sample space
• In other words, if B occurred, then what is the
chance of A occurring
Example: Newspaper Subscribers
• Of the households that subscribe to the
Atlantic Journal, what is the chance that they
also subscribe to the Beacon News?
• Want P(B|A), where
P(B|A) = P(A∩B) / P(A)
= 0.25 / 0.65
= 0.3846
Example: mutual fund performance
• Why are some mutual fund managers more successful than
others? One possible factor is where the manager earned his
or her MBA. The following table compares mutual fund
performance against the ranking of the school where the fund
manager earned their MBA:
                                 B1: Mutual fund           B2: Mutual fund doesn’t
                                 outperforms the market    outperform the market
A1 - Top 20 MBA program          .11                       .29
A2 - Not top 20 MBA program      .06                       .54
E.g. P(A1∩B1) = .11 is the probability that a mutual fund
outperforms the market AND the manager was in a top-20 MBA program; it’s a joint probability.
Conditional Probability…
• We want to calculate P(B1|A1)
          B1      B2      P(Ai)
A1        .11     .29     .40
A2        .06     .54     .60
P(Bj)     .17     .83     1.00
P(B1|A1) = P(A1∩B1) / P(A1) = .11 / .40 = .275
Thus, there is a 27.5% chance that a fund will outperform the market given that the
manager graduated from a top-20 MBA program.
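The marginal and conditional probabilities for the mutual fund table can be derived from the four joint probabilities alone. This is a minimal Python sketch; the dictionary layout and the helper names `marginal` and `conditional_b_given_a` are hypothetical choices for illustration.

```python
# Joint probabilities P(Ai ∩ Bj) from the mutual fund table.
joint = {
    ("A1", "B1"): 0.11, ("A1", "B2"): 0.29,
    ("A2", "B1"): 0.06, ("A2", "B2"): 0.54,
}

def marginal(label, position):
    """Sum joint probabilities over the other variable (position 0 = A, 1 = B)."""
    return sum(p for key, p in joint.items() if key[position] == label)

def conditional_b_given_a(b, a):
    """P(b | a) = P(a ∩ b) / P(a)."""
    return joint[(a, b)] / marginal(a, 0)

print(round(marginal("A1", 0), 2))                  # 0.4
print(round(conditional_b_given_a("B1", "A1"), 3))  # 0.275
```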
Example: New test for early detection of cancer
• Let
– C = event that patient has cancer
– C’ = event that patient does not have cancer
– + = event that the test indicates a patient has cancer
– - = event that the test indicates that patient does not have cancer
• Clinical trials indicate that the test is accurate 95% of the time in detecting
cancer for those patients who actually have cancer: P(+|C) = .95
• Unfortunately, it will give a “+” 8% of the time for those patients who are
known not to have cancer: P(+|C’) = .08
• It has also been estimated that approximately 10% of the population have
cancer and don’t know it yet: P(C) = .10
• You take the test and receive a “+” test result.
Should you be worried? P(C|+) = ?
P(+|C) = .95
P(+|C’) = .08
P(C) = .10
True State of Nature        Test Result: +      Test Result: -
Have Cancer: C              (true positive)     False negative
Do Not Have Cancer: C’      False positive      (true negative)
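One way to answer the question P(C|+) is Bayes' rule: P(C|+) = P(C)P(+|C) / [P(C)P(+|C) + P(C’)P(+|C’)]. This is a minimal Python sketch using the three probabilities given for the cancer test; the function name `posterior` and its parameter names are hypothetical.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) by Bayes' rule."""
    p_pos_and_cond = prior * sensitivity               # P(C ∩ +)
    p_pos_and_not = (1 - prior) * false_positive_rate  # P(C' ∩ +)
    return p_pos_and_cond / (p_pos_and_cond + p_pos_and_not)

p = posterior(prior=0.10, sensitivity=0.95, false_positive_rate=0.08)
print(round(p, 3))  # 0.569
```

Under these inputs a positive result raises the probability of cancer from 10% to roughly 57%, well short of certainty. When a problem states a false negative rate instead, the sensitivity is 1 minus that rate.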
Assignment Problem
• The Rapid Test is used to determine whether someone has
HIV. The false positive and false negative rates are 0.05 and
0.09 respectively.
• The doctor just received a positive test result for one of their
patients [assumed to be in a low risk group for HIV].
• The low risk group is known to have a 6% probability of having
HIV.
• What is the probability that this patient actually has HIV [after
the positive test]?
• Use a table to work this problem.
Independence of Events
• Two events A and B are said to be
independent if and only if:
P(A|B) = P(A)
• This is equivalent to
P(B|A) = P(B)
Example: Newspaper Subscribers
• Of the Atlantic Journal subscribers, what is the
chance that they also subscribe to the Beacon News?
– If independent, then P(B|A) = P(B)
• Is P(B|A) = P(B)?
– Know that P(B) = 0.5
– Just calculated that P(B|A) = 0.3846
– 0.5 ≠ 0.3846, so P(B|A) ≠ P(B)
• A is not independent of B
– A and B are said to be dependent
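The independence check can also be written as a comparison between P(A∩B) and P(A)·P(B), which is equivalent to comparing P(B|A) with P(B). This is a small Python sketch using the newspaper probabilities; the variable names are illustrative.

```python
import math

# Probabilities from the newspaper example.
p_a, p_b, p_a_and_b = 0.65, 0.50, 0.25

p_b_given_a = p_a_and_b / p_a

# Independent events would satisfy P(A ∩ B) = P(A) * P(B).
independent = math.isclose(p_a_and_b, p_a * p_b)

print(round(p_b_given_a, 4))  # 0.3846
print(independent)            # False
```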
The Multiplication Rule
• The joint probability that A and B (the
intersection of A and B) will occur is
General Rule of Multiplication
P(A∩B) = P(A) • P(B|A) = P(B) • P(A|B)
• If A and B are independent, then the
probability that A and B will occur is:
Special Rule of Multiplication
P(A∩B) = P(A) • P(B) = P(B) • P(A)
Example: Genders of Two Children
• Let: B be the outcome that child is boy
G be the outcome that child is girl
• Sample space S = {BB, BG, GB, GG}
• If B and G are equally likely, then
P(B) = P(G) = ½ and
P(BB) = P(BG) = P(GB) = P(GG) = ¼
A Tree Diagram: the Genders of Two
Children
Example: Gender of Two Children
• Experimental Outcomes:
BB, BG, GB, GG
• All outcomes equally likely:
P(BB) = … = P(GG) = ¼
• P(one boy and one girl) =
P(BG) + P(GB) = ¼ + ¼ = ½
• P(at least one girl) =
P(BG) + P(GB) + P(GG) = ¼+¼+¼ = ¾
Example: Genders of Two Children
Continued
• Of two children, what is the probability of having a
girl first and then a boy second?
• Want P(G first and B second)?
– Want P(G∩B)
• P(G∩B) = P(G) × P(B|G)
• But the genders of siblings are independent
– So P(B|G) = P(B)
– Then P(G∩B) = P(G) × P(B) = ½ × ½ = ¼
• Consistent with the tree diagram
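The two-children results can be confirmed by listing the four outcomes and counting. This is a minimal Python sketch, assuming boys and girls are equally likely and births are independent; the helper `prob` is an illustrative name.

```python
from fractions import Fraction
from itertools import product

# Birth-order outcomes, assuming boys and girls are equally likely and independent.
outcomes = ["".join(p) for p in product("BG", repeat=2)]  # BB, BG, GB, GG

def prob(condition):
    """Share of the equally likely outcomes that satisfy the condition."""
    favourable = [o for o in outcomes if condition(o)]
    return Fraction(len(favourable), len(outcomes))

print(prob(lambda o: o == "GB"))                # 1/4
print(prob(lambda o: sorted(o) == ["B", "G"]))  # 1/2
print(prob(lambda o: "G" in o))                 # 3/4
```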
Using Tree Diagrams
A Tree Diagram is
used when you have
a list of choices. It
clearly shows
conditional and joint
probabilities.
Example:
In a bag containing
7 red balls and 5 blue balls, 2
balls are selected at random
without replacement.
Construct a tree diagram
showing this information.
Calculate the probability of
getting 1 red and 1 blue ball.
Using Tree Diagrams
P(R1R2) = (7/12)*(6/11)
P(R1B2) = (7/12)*(5/11)
P(B1R2) = (5/12)*(7/11)
Probability of getting 1 red and 1 blue = P(R1B2) + P(B1R2) =
(7/12)*(5/11) + (5/12)*(7/11) = 70/132 ≈ 0.530
Explaining the solution by conditional probability
P(R1)=7/12
P(B2|R1)=5/11
P(R1&B2)=P(B2|R1)*P(R1)=(5/11)*(7/12)
P(B1)=5/12
P(R2|B1)=7/11
P(B1&R2)=P(R2|B1)*P(B1)=(5/12)*(7/11)
P(B1R2 or R1B2)=(2*5*7)/(12*11)
Another explanation:
All permutations to choose any 2 from 12 balls
= 12P2=12*11 (all possible outcomes: sample space)
All permutations to choose 1 R & 1 B
= 2* 7C1 * 5C1 = 2*7*5 (outcomes in the event of interest)
P(B1R2 or R1B2)=(2*5*7)/(12*11)
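The permutation count can be verified by brute force. This is a minimal Python sketch that enumerates every ordered draw of 2 balls from the 12; the list `balls` and the other names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

# 7 red and 5 blue balls, identified by position in the list.
balls = ["R"] * 7 + ["B"] * 5

# Ordered draws of 2 distinct balls out of 12 (12P2 equally likely outcomes).
draws = list(permutations(range(len(balls)), 2))

# Draws with one ball of each colour, in either order.
mixed = [d for d in draws if balls[d[0]] != balls[d[1]]]

p_mixed = Fraction(len(mixed), len(draws))
print(len(draws), len(mixed), p_mixed)  # 132 70 35/66
```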
PROBABILITY AND FREQUENCY DISTRIBUTIONS
Since the frequency distribution of a random variable
X is constructed from data obtained from a
statistical experiment, the possible outcomes of X
comprise the sample space of the experiment.
In an experiment that has accumulated a large amount
of data, the relative frequency distribution of X
approximates the probability distribution of X.
Example
You can determine a probability from a frequency distribution
table by computing the proportion for the X value in question.
Consider the following distribution of scores, which
has been summarized in a frequency distribution
table.
For this distribution of scores, what is the
probability of selecting a score of X = 8?
p(X = 8) = ƒ(X = 8) / N = 3/10 = 0.30
A frequency distribution histogram for a population that
consists of N= 10 scores.
The shaded part of the figure indicates the portion of the
whole population that corresponds to scores greater than
X= 4.
The shaded portion is two-tenths of the whole distribution.
So the probability of X>4 is
P(X>4) = 2/10=0.2
The normal distribution following a z-score
transformation.
Example 1:
Adult heights form a normal distribution with a mean of 68 inches
and a standard deviation of 6 inches. Given this information about
the population we can determine the probability associated with
specific samples from the normal distribution curve.
For example, what is the probability of randomly selecting an
individual from this population who is taller than 6 feet 8 inches (X
= 80 inches)?
Restating this question in probability notation:
p(X > 80) = ?
Solution: Z= (80 – 68)/6 = 12/6 = 2
Thus p(X > 80) = p(z>2) = 2.28%
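The z-score and tail probability in this example can be checked without a printed table. This is a minimal Python sketch using `statistics.NormalDist` from the standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

# Adult heights: mean 68 inches, standard deviation 6 inches.
heights = NormalDist(mu=68, sigma=6)

z = (80 - heights.mean) / heights.stdev
p_taller = 1 - heights.cdf(80)  # area to the right of X = 80

print(z, round(p_taller, 4))  # 2.0 0.0228
```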
The Standard Normal Table
• The standard normal table is a table that lists
the area under the standard normal curve to
the right of the mean (z=0) up to the z value of
interest
– See Table 6.1, Table A.3 in Appendix A, and the
table on the back of the front cover
• This table is so important that it is repeated 3 times in
the textbook!
• Always look at the accompanying figure for guidance on
how to use the table
The Standard Normal Table (Continued)
• The values of z (accurate to the nearest hundredth)
in the table range from 0.00 to 3.09 in
increments of 0.01
– z accurate to tenths are listed in the far left
column
– The hundredths digit of z is listed across the top of
the table
• The areas under the normal curve to the right
of the mean up to any value of z are given in
the body of the table
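A table entry of this kind, the area between the mean and z, can be approximated with the error function. This is a minimal Python sketch; `table_area` is a hypothetical helper name, and z = 1.5 is used only as a sample lookup.

```python
import math

def table_area(z):
    """Area under the standard normal curve between the mean (0) and z >= 0."""
    return 0.5 * math.erf(z / math.sqrt(2))

print(round(table_area(1.5), 4))        # 0.4332
print(round(0.5 - table_area(1.5), 4))  # 0.0668
```

Because the table starts at the mean, the upper-tail probability is 0.5 minus the table entry, as in the second printed value.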
A Standard Normal Table
Example 2:
Scores on the Scholastic Achievement Test (SAT) form a normal
distribution with mean = 500 and sd = 100. What is the
probability of selecting an individual from the population who
scores above 650?
[p(X > 650) = ?]
Solution:
z=(650-500)/100=1.5
P(X>650) = p(z>1.5) = 0.5 - 0.4332 = 0.0668 = 6.68%
Exercise:
The height of the young Chinese male population with age
between 20 and 35 is normally distributed with mean = 175 cm
and sd = 17 cm. What is the probability of finding someone
taller than 199 cm?
What is the probability of finding someone taller than Yao Ming
in this population? (His height is 2.29 m)
Example 3: PERCENTILE RANKS
The IQ score of the whole population is normally distributed
with mean = 100 and sd = 10. What is the percentile
rank for someone with IQ = 114?
Solution:
z=(114-100)/10=1.4
P(X>114) = p(z>1.4) = 0.5 - 0.4192 = 0.0808 = 8.08%
Percentile rank is 100% - 8.08% =91.92%
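The percentile rank is the cumulative probability up to the score, expressed as a percentage. This is a minimal Python sketch using `statistics.NormalDist`; the function name `percentile_rank` is a hypothetical choice.

```python
from statistics import NormalDist

def percentile_rank(x, mean, sd):
    """Percentage of a normal population scoring at or below x."""
    return 100 * NormalDist(mu=mean, sigma=sd).cdf(x)

rank = percentile_rank(114, mean=100, sd=10)
print(round(rank, 2))  # 91.92
```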
PERCENTILE RANKS: exercise
The height of the young Chinese male population with age
between 20 and 35 is normally distributed with mean = 175 cm
and sd = 17 cm. What is the percentile rank for someone with a
height of 165 cm?
Summary
ONE
Define probability in three different ways: the classical,
empirical, and subjective approaches
TWO
Understand the terms: event, outcome
THREE
Special addition rule and General addition rule.
FOUR
Define the terms: conditional probability and
joint probability.
FIVE
The special multiplication rule and the General
multiplication rule.
SIX
PROBABILITY AND FREQUENCY
DISTRIBUTIONS