Transcript May 25

Probability
Some of the Problems about Chance having a great appearance of
Simplicity, the Mind is easily drawn into a belief, that their Solution may
be attained by the meer Strength of natural good Sense; which generally
proving otherwise and the Mistakes occasioned thereby being not
unfrequent, ‘tis presumed that a Book of this Kind, which teaches to
distinguish Truth from what seems so nearly to resemble it, will be looked
upon as a help to good Reasoning
- Abraham de Moivre (1667-1754)
Probability Overview
• Random Generating Processes
• Probability Properties
• Probability Rules
• Example: Binomial Random Processes
Types of Explanations
Data could be generated by:

A purely systematic process

A purely random process

A combination of systematic and random processes
[Figure: three scatterplots of Y against X, one for each type of process]
Types of Explanations
Data could be generated by:

A purely systematic process

A purely random process

A combination of systematic and random processes
[Figure: three charts of a response (0–25) for groups G1 and G2, one for each type of process]
Hypothesis testing

We would like to know which of the three explanations is most likely
correct

The “purely systematic” explanation is easy to confirm or reject with a quick look at the data (it rarely fits social science data).

So we’re left trying to assess the question “could a purely random
process fully account for this data?”

If not, we’ll accept the more complex (systematic + random) model.
Random Generating Processes

To answer that question, we need to understand random generating
processes. (The domain of probability mathematics).

Note: most people intuitively over-estimate the role of systematic
factors. One reason is that people often have a poor grasp of how
random generating processes actually work.
Random Generating Processes

Random is not the same as haphazard or helter-skelter or higgledy-piggledy.

Random generating processes yield “characteristic properties of
uncertainty”.
Random Generating Processes
Example: the Binomial random process

We have two possible outcomes (e.g. heads or tails) associated with a
specific probability (e.g. 0.50)

We can’t predict with certainty the particular outcome for any trial, but
we can describe the per-trial likelihood.

We can’t say too much about the relative frequency of outcomes in the
short-run, but we can say a lot about the relative frequencies in the
long-run.
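As a concrete illustration, here is a minimal simulation sketch of a fair-coin binomial process (the seed and sample sizes are arbitrary): any single flip is unpredictable, but the long-run relative frequency settles near the per-trial probability of .50.

```python
import random

random.seed(1)  # arbitrary seed so the sketch is reproducible

# A binomial random process: independent flips, each heads with probability 0.50.
for n in (10, 100, 10_000):
    flips = [random.random() < 0.50 for _ in range(n)]
    rel_freq = sum(flips) / n
    print(f"{n:>6} flips: relative frequency of heads = {rel_freq:.3f}")

# Short runs wander noticeably; long runs settle near 0.50 (the law of large numbers).
```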
Random Generating Processes

When we say something can be described by a random generating
process, we do not necessarily mean that it is caused by a mystical
thing called “chance”

There may be many independent (but unmeasured) systematic factors
that combine together to create the observed random probability
distribution. E.g. coin tosses

When we say “random” we just mean that we can’t do any better than
some basic (but characteristic) probability statements about how the
outcomes will vary
PROBABILITY

Probabilities are numbers which describe the likelihoods of random
events.

The probability of an event corresponds to the per-trial likelihood of
that event, as well as the long-run relative frequency of that event.

P(A) means “the probability of event A.”

If A is certain, then P(A) = 1

If A is impossible, then P(A) = 0
CHANCES and ODDS
• Chances are probabilities expressed as percents. Chances range from 0% to 100%.
– For example, a probability of .75 is the same as a 75% chance.
• The odds for an event are the probability that the event happens, divided by the probability that the event doesn’t happen. Odds can be any positive number.
– For example, a probability of .75 is the same as 3-to-1 odds.
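The conversions above are easy to automate; a small sketch (the helper name `describe` is just illustrative):

```python
def describe(p):
    """Express a probability as a chance (percent) and as odds for the event."""
    chance = 100 * p        # chances range from 0% to 100%
    odds = p / (1 - p)      # odds for: P(happens) / P(doesn't happen)
    return f"probability {p} = {chance:.0f}% chance = {odds:g}-to-1 odds"

print(describe(0.75))       # probability 0.75 = 75% chance = 3-to-1 odds
```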
Sample Space
• A sample space is a list of all possible outcomes of a random process.
– When I roll a die, the sample space is {1, 2, 3, 4, 5, 6}.
– When I toss a coin, the sample space is {head, tail}.
• An event is one or more members of the sample space.
– For example, “head” is a possible event when I toss a coin. Or “number less than four” is a possible event when I roll a die.
– Events are associated with probabilities.
Probability Properties

All probabilities are between zero and one:
• 0 ≤ P(A) ≤ 1

Something has to happen:
• P(Sample space) = 1

The probability that something happens is one minus the probability that it doesn’t:
• P(A) = 1 - P(not A), where “not A” is called the complement of A
Analytic Approach: Theoretical probabilities
If all outcomes are equally likely:
P(event A) = (# outcomes favorable to event A) / (# outcomes total)
What is the probability of getting exactly two heads in three coin tosses?
Total outcomes: HHH, HHT, HTH, HTT, THH, THT, TTH, TTT
P(exactly two heads) = (# outcomes with exactly two heads) / (# total possible outcomes) = 3/8
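The same count can be checked by enumerating the sample space; a quick sketch:

```python
from itertools import product

# Enumerate the full sample space of three coin tosses: HHH, HHT, ..., TTT.
outcomes = ["".join(seq) for seq in product("HT", repeat=3)]
favorable = [o for o in outcomes if o.count("H") == 2]

print(len(favorable), "/", len(outcomes))   # 3 / 8
print(len(favorable) / len(outcomes))       # 0.375
```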
A box contains red and blue marbles. One marble is drawn at random
from the box. If it is red, you win $1. If it is blue you lose $1. You can
choose between two boxes.
-Box A contains 3 red marbles and 2 blue ones
-Box B contains 51 red marbles and 34 blue ones
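A quick check: Box A gives P(red) = 3/5 = .60 and Box B gives P(red) = 51/85 = .60, so the two boxes offer exactly the same chance of winning.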
Some Typical Probability Problems
•Anja has to pick a four-digit PIN. Each digit will be between 0 and 9. What is the probability that she picks a PIN that has exactly one 3 in it?
•A certain senior class has 6 students. Two will receive $500 scholarships.
What is the probability that Kim and AJ are the winning pair?
P(event A) = (# outcomes favorable to event A) / (# outcomes total)
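Both answers can be checked by brute-force counting; a sketch, assuming each possible PIN and each pair of students is equally likely (the four unnamed students are placeholders):

```python
from itertools import combinations, product

# Anja's PIN: count the four-digit strings (digits 0-9) that contain exactly one 3.
pins = ["".join(d) for d in product("0123456789", repeat=4)]
p_one_three = sum(pin.count("3") == 1 for pin in pins) / len(pins)
print(p_one_three)   # 0.2916  (= 4 * 9**3 / 10**4)

# Scholarships: every pair of the 6 students is equally likely to be picked.
students = ["Kim", "AJ", "C", "D", "E", "F"]
pairs = list(combinations(students, 2))
p_kim_aj = sum(set(pair) == {"Kim", "AJ"} for pair in pairs) / len(pairs)
print(p_kim_aj)      # 1/15, about 0.067
```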
Relative Frequency Approach: Observed %s
If the sample is large:
P(event A) = long-run relative frequency = f(A) / n
What is the probability that a Columbia MBA student is a narcissist?
•From a random sample of n = 250, 70 students were classified as
narcissists.
Relative frequency = f(N) / n = 70 / 250 = .28
* Justification: The law of large numbers
USA Today survey of 966 inventors who hold U.S. patents:
6 a.m. – 12 noon: 290
12 noon – 6 p.m.: 135 (relative frequency 135/966 ≈ .14)
6 p.m. – 12 midnight: 319
12 midnight – 6 a.m.: 222
More Probability Properties
Unconditional Probability

The general probability (relative frequency) of an
event, in the absence of any other information
Conditional Probability

The conditional probability of B, given A, is written as P(B|A). It is
the probability of event B, given that A has occurred.

For example, P(short-sleeved shirt| shorts) is the probability that I
will put on a short-sleeved shirt, given that I have already decided
to wear shorts.

Note that P(B|A) is not the same as P(A|B).

It is very likely that I will wear a short-sleeved shirt if I’m wearing shorts. It is not necessarily likely that I will wear shorts just because I’m wearing a short-sleeved shirt.
Sales Approach Survey
              Sale   No Sale   Total
Aggressive     270       310     580
Passive        416       164     580
Total          686       474    1160
What is the unconditional probability of making a sale? .59
What is the probability of making a sale, given an aggressive approach? .47
What is the probability of making a sale, given a passive approach? .72
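A sketch of the same calculations done directly from the table’s counts (the dictionary layout is just one convenient representation):

```python
# Counts from the sales approach survey (rows: approach, columns: outcome).
counts = {
    "aggressive": {"sale": 270, "no sale": 310},
    "passive":    {"sale": 416, "no sale": 164},
}

total = sum(sum(row.values()) for row in counts.values())    # 1160
total_sales = sum(row["sale"] for row in counts.values())    # 686

p_sale = total_sales / total                                 # unconditional
p_sale_given_aggressive = counts["aggressive"]["sale"] / sum(counts["aggressive"].values())
p_sale_given_passive = counts["passive"]["sale"] / sum(counts["passive"].values())

print(round(p_sale, 2))                   # 0.59
print(round(p_sale_given_aggressive, 2))  # 0.47
print(round(p_sale_given_passive, 2))     # 0.72
```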
Practical Application of Conditional Probability
Sensitivity: probability a test is positive, given disease is present
False Positive rate: probability a test is positive, given disease is absent
False Negative rate: probability a test is negative, given disease is present
Medical Test Survey
                 Disease Present   Disease Absent   Total
Test Result +               110               20     130
Test Result -                20               50      70
Total                       130               70     200
What is the sensitivity of the test? P(+, given condition present)
.85
What is the false negative rate? P(-, given condition present)
.15
What is the false positive rate? P(+, given condition absent)
.28
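(Worked from the table: sensitivity = 110/130 ≈ .85; false negative rate = 20/130 ≈ .15; false positive rate = 20/70 ≈ .28.)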
Independence
 Events A and B are independent if the probability of event B is the
same whether or not A has occurred.
 If (and only if) A and B are independent, then
P(B | A) = P(B | not A) = P(B)
• For example, if I am tossing two coins, the probability that the
second coin lands heads is always .50, whether or not the first coin
lands heads.
Superstition Survey
                 Happy Ending   No Happy Ending   Total
Throw Rice                144               456     600
Not Throw Rice            192               618     800
Total                     336              1074    1400

Is rice-throwing statistically independent of happy endings?
P(happy | throw rice) = .24   P(happy | no throw) = .24   P(happy) = .24
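Because the conditional probabilities match the unconditional one, happy endings are statistically independent of rice-throwing in these data. A quick check from the happy-ending counts and group totals in the table:

```python
# Independence check: compare conditional and unconditional probabilities of a happy ending.
p_happy_given_rice    = 144 / 600     # threw rice
p_happy_given_no_rice = 192 / 800     # did not throw rice
p_happy               = 336 / 1400    # overall

print(p_happy_given_rice, p_happy_given_no_rice, p_happy)   # 0.24 0.24 0.24
```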
Conditional Probability
• The probability of A, given B
• May be larger than, smaller than, or equal to the unconditional P(A)
Joint Probability
• The probability that A and B both occur
• Use the multiplication rule
• Will always be less than or equal to the unconditional P(A)
“Linda is thirty-one years old, single, outspoken, and very
bright. She majored in philosophy. As a student, she was
deeply concerned with issues of discrimination and social
justice, and also participated in antinuclear demonstrations.”
Which is more likely?
•Linda is a bank teller
•Linda is a bank teller and is active in the feminist movement
Probability Rules

Probability of A or B: Addition Rule
P(A or B) = P(A) + P(B), when A and B are mutually exclusive

Probability of A and B: Multiplication Rule
P(A and B) = P(A) x P(B), when A and B are independent
The Addition Rule
[Diagram: two non-overlapping circles labeled A and B]
“mutually exclusive” = A and B cannot both happen
P(A or B) = P(A) + P(B)
Patricia is getting paired up with a big sister from the neighboring high school.
If there are 30 student volunteers (9 seniors, 6 juniors, 7 sophomores, and 8 freshmen),
what is the probability her big sister is an upperclassman?
P(senior or junior) = P(senior) + P(junior) = .30 + .20 = .50
The Multiplication Rule
[Diagram: two circles labeled A and B]
“independent” = A does not affect the likelihood of B, and vice versa
P(A and B) = P(A) x P(B)
The probability that Am Ex will offer Frank a job is 50%. The probability Citibank will offer him a
job is 30%. Am Ex and Citibank are not in contact.
What is the likelihood he gets offered both jobs? What is the likelihood he is offered neither
job?
P(AmEx and Citibank) = P(AmEx) x P(Citibank) = .50 x .30 = .15
P(Not AmEx and Not Citibank) = P(Not AmEx) x P(Not Citibank) = .50 x .70 = .35
General Addition Rule
For all cases
[Diagram: two overlapping circles labeled A and B]
P(A or B) = P(A) + P(B) – P(A and B)
(When A and B are mutually exclusive, the last term is zero.)
There are 20 people sitting in a café. 10 like tea, 10 like coffee, and 2 people like both tea and
coffee. What is the probability that a random person in the café will like tea or coffee?
P(tea or coffee) = P(tea)+P(coffee)-P(tea and coffee) = .50+.50-.10 = .90
General Multiplication Rule
For all cases
[Diagram: two overlapping circles labeled A and B]
P(A and B) = P(A) x P(B│A)
(When A and B are independent, P(B│A) is the same as P(B).)
There are 10 green and 10 blue marbles in a jar. What is the probability that Sue draws two
blue marbles in a row?
P(blue1 and blue2) = P(blue1) x P(blue2│blue1) = (10/20) x (9/19) ≈ .24
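A sketch that computes the exact value and, optionally, checks it by simulating draws without replacement (the seed and trial count are arbitrary):

```python
import random

# Exact value from the general multiplication rule.
exact = (10 / 20) * (9 / 19)
print(round(exact, 4))           # 0.2368, about .24

# Simulation check: draw two marbles without replacement, many times.
random.seed(2)
jar = ["blue"] * 10 + ["green"] * 10
trials = 100_000
hits = sum(random.sample(jar, 2) == ["blue", "blue"] for _ in range(trials))
print(round(hits / trials, 3))   # close to 0.24
```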
Summary

Addition Rule
P(A or B) = P(A) + P(B), when A and B are mutually
exclusive
P(A or B) = P(A) + P(B) – P(A and B), generally

Multiplication Rule
P(A and B) = P(A) x P(B), when A and B are independent
P(A and B) = P(A) x P(B│A), generally
Example: Binomial Random Processes
 Two possible outcomes
- Heads or tails
- Make basket or miss basket
- Fatality, no fatality
 With probability p (or 1-p)
 Events are independent
 Per trial probability is p (or 1-p)
 Long run relative frequency is p (or 1-p)
Example: Binomial Random Processes
 Short run relative frequency is NOT necessarily p
HTHHTHTTHTHTTTHTHH
HTHHTHTTTTTHTHTHH
 Chance is LUMPY
Example: Binomial Random Processes
• People are bad random number generators: when we make up samples, we put in too few “lumps”
[Figure: three charts of a response (0–25) for groups G1 and G2]
 Conversely, people are too quick to draw conclusions of
systematicity from observed “lumps” in a sequence
- A string of wins “must” mean a hot table
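A short simulation sketch of this “lumpiness” (arbitrary seed and sequence length): even genuinely random coin flips routinely contain runs of four or five identical outcomes, so a lump by itself is weak evidence of a systematic cause.

```python
import random
from itertools import groupby

random.seed(3)  # arbitrary seed for reproducibility

# Generate short sequences of fair coin flips and report the longest run in each.
for _ in range(5):
    seq = "".join(random.choice("HT") for _ in range(20))
    longest = max(len(list(run)) for _, run in groupby(seq))
    print(seq, "longest run:", longest)
```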
Example: Binomial Random Processes
 “Representativeness Error”
- People expect a small sample to be more representative of the population (or the long-run frequency) than it really is
 “Law of Small Numbers” Error
- People are overly confident of observed data patterns
based on small samples
Example: Binomial Random Processes
 More representativeness errors: “The Gambler’s Fallacy”
HHHHHHHH?
 People expect tails to be “temporarily advantaged” after
a run of heads
 But events are independent
– after the run of heads, the continuations HHHHHHHHH and HHHHHHHHT are equally likely
Example: Binomial Random Processes
 Predicting a specific versus a general pattern
Which lotto ticket would you buy?
Ticket A: 26 45 8 72 91
Ticket B: 26 26 26 26 26
Both are equally likely (or unlikely) to win, but ticket B is the one less likely to be bought.
• Each specific ticket is equally (un)likely to win
• The winning combination is more likely to “look like” ticket A (varied values) than to “look like” ticket B (identical values), because far more combinations look like A.
•But buyer beware! You are betting on a specific ticket, not a
general class of tickets
Example: Binomial Random Processes
• Probabilities for specific patterns get smaller as you run
more trials
What is the probability of getting heads on the second trial
and tails on all other trials?
P(T,H) = 0.25
P(T, H, T) = 0.125
P(T, H, T, T) = 0.0625
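In general, any one specific sequence of n independent fair-coin outcomes has probability (1/2)^n, so the probability halves with every added trial.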
Example: Binomial Random Processes
• Probabilities for general patterns get larger as you run more
trials
What is the probability of getting at least one heads when
you toss a coin multiple times?
Two tosses:
P(HT or TH or HH) = 0.75
Three tosses: P(HTT or THT or TTH or THH or HHT or HHH) = 0.875
Four tosses: 0.9375
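Both trends come from the same counting: a specific sequence of n tosses has probability 0.5^n, while P(at least one head) = 1 - P(no heads) = 1 - 0.5^n. A quick check:

```python
# Specific pattern shrinks as 0.5**n; the general pattern "at least one head" grows as 1 - 0.5**n.
for n in (2, 3, 4):
    print(n, 0.5 ** n, 1 - 0.5 ** n)   # n, P(specific sequence), P(at least one head)
```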
Example: Binomial Random Processes
• Probabilities for general patterns get larger as you run more
trials
Compare:
P(at least one accident) when you ride in a car 2x a week
P(at least one accident) when you ride in a car 7x a week
They say P (fatality in airplane crash) < P(fatality in car crash)
But people spend more time in cars
P(airplane fatality in one minute) = P(car fatality in one minute)
The “Hot Hand”
• The “hot hand” is a belief about conditional probability.
People believe shots are not independent.
• Gilovich argues that the pattern of data, however, can be
well described by a binomial random process
-Independent shots
-Two outcomes: basket or missed basket
-Player has general probability p of getting a basket
• His Evidence:
- P(basket | miss) = P(basket | basket) = P(basket)
- The frequency of 4-, 5-, and 6-basket “streaks” is no higher than a binomial process would predict
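A minimal simulation sketch of that prediction (not Gilovich’s actual analysis; the function name, shooting percentages, and shot counts are illustrative assumptions): streaks of made baskets arise from a purely binomial shooter, and they become more common as p or the number of shots increases.

```python
import random
from itertools import groupby

random.seed(4)  # arbitrary seed for reproducibility

def count_streaks(p, n_shots, min_len=4, trials=10_000):
    """Average number of made-basket runs of length >= min_len per simulated game."""
    total = 0
    for _ in range(trials):
        shots = [random.random() < p for _ in range(n_shots)]           # True = made basket
        runs = [len(list(run)) for made, run in groupby(shots) if made]
        total += sum(length >= min_len for length in runs)
    return total / trials

print(count_streaks(p=0.40, n_shots=20))   # baseline shooter
print(count_streaks(p=0.55, n_shots=20))   # higher p  -> more long streaks
print(count_streaks(p=0.55, n_shots=40))   # more shots -> more long streaks
```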
The “Hot Hand”
• Are people just deluded?
• There are biases in information processing which contribute
to the misperception
• But also:
 P(streak) is greater when p is greater.
 Thus, by a binomial process, good players will have more streaks
 P(streak) is greater when more shots are taken
- Players are not more likely to make the next shot if they made the previous shot, but it turns out they are more likely to take the next shot if they made the previous shot.