GG 313 Lecture 4 Probability Basics

GG 313 Lecture 4
Probability Basics
9/1/05
Why Study Probability?
We need to determine what is probable - not just
possible.
Expect that the most probable explanation is the
right one. My advisor called this “The principle of
minimum astonishment”. This is not always the
case, however: rare events, such as huge meteor
impacts, do occur. But if you think the occurrence of
an exciting rare event explains your data, be sure to
eliminate the more likely possibilities first, such as
errors in your data.
One aspect of a viable theory is its ability to
predict undetermined values and future events.
The theory of plate tectonics predicts the speeds of the
plates, ages of the ocean floor, and locations of
earthquakes. The values of these parameters can
be predicted to a high probability based on the
principles of plate tectonics. Other theories are
discarded because their predictions are less
reliable.
The work done by Chris Conger will increase the
probability of finding sand deposits in the shallow
ocean. Not too exciting? Most science progresses
in small steps capitalizing on the work of others.
Another advantage of studying probability is its use in everyday
life. What actions will improve your probability of success?
Buying lottery tickets? Investing in junk bonds? Buying gold?
Investing in real estate? Smoking? Going to grad school?
What events are most probable in my lifetime? What kind of
event is likely to kill me? Meteor impact? Earthquake?
Tsunami? Terrorism? War? Accident? Heart disease? Cancer?
Should you buy insurance? How much should you pay for
insurance?
None of these probabilities are fixed. As knowledge increases
and parameters change, so do the probabilities.
Some basics:
Flip a coin 3 times. How many possible outcomes are there?
With each flip there are two possible outcomes, and we do
this 3 times, so all the possible results are:
Flip 1   Flip 2   Flip 3
H        H        H
T        H        H
H        T        H
T        T        H
H        H        T
T        H        T
H        T        T
T        T        T
There are 3 trials, each with two
possible outcomes, so there are a total
of 2*2*2 = 8 results.
In general, the number of possible results N
for k trials, with n_i possible outcomes
in the i-th trial, is
$$N = \prod_{i=1}^{k} n_i$$
How many values can a 3-digit binary number have?
Another example: How many possible license plates are
there using three letters and three numbers?
N=26*26*26*10*10*10= 17,576,000
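The multiplication rule above can be checked with a few lines of Python. This is only a sketch: the helper name `count_outcomes` is made up for illustration, and the coin flips are enumerated explicitly so the count can be compared with the formula.

```python
from itertools import product
from math import prod


def count_outcomes(options_per_trial):
    """Multiplication rule: N = n_1 * n_2 * ... * n_k."""
    return prod(options_per_trial)


# List every sequence of three coin flips and compare with the rule.
flips = list(product("HT", repeat=3))
n_flips = count_outcomes([2, 2, 2])

# A 3-digit binary number is the same situation: two choices, three times.
n_binary = count_outcomes([2, 2, 2])

# Three letters followed by three digits.
n_plates = count_outcomes([26, 26, 26, 10, 10, 10])

print(len(flips), n_flips, n_binary, n_plates)
```

Enumerating works for small cases, but the product formula is the only practical route once the count reaches the millions, as with the license plates.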
Permutations: The number of permutations of r objects taken from a set
of n distinct objects is the number of ordered arrangements of n things
taken r at a time.
Example: We have 20 rock samples. How many ways can you
select 3 samples from the 20 if the order of selection matters?
The first rock can be any one of the 20; the 2nd can be any of
19, and the 3rd can be any of 18. So the answer is 20*19*18 = 6,840.
The formulation is:
$${}_{n}P_{r} = n(n-1)(n-2)\cdots(n-r+1)$$
Factorial: The factorial operation is defined as:
$$n! = \prod_{i=1}^{n} i = n \times (n-1) \times (n-2) \times \cdots \times 1$$
By definition, 0! is set equal to 1.
We can re-write the permutation equation as:
$${}_{n}P_{r} = \frac{n(n-1)(n-2)\cdots(n-r+1)(n-r)\cdots 1}{(n-r)(n-r-1)\cdots 1} = \frac{n!}{(n-r)!}$$
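As a quick check of the factorial form of the permutation formula, here is a minimal Python sketch. The function name `permutations_count` is an illustrative choice; `math.perm` (Python 3.8 or later) computes the same quantity and is used only for comparison.

```python
from math import factorial, perm


def permutations_count(n, r):
    """Number of ordered selections of r items from n: n! / (n - r)!."""
    if not 0 <= r <= n:
        raise ValueError("need 0 <= r <= n")
    return factorial(n) // factorial(n - r)


# The rock-sample example: 20 * 19 * 18 ordered selections.
rocks = permutations_count(20, 3)
print(rocks, rocks == 20 * 19 * 18, rocks == perm(20, 3))
```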
Example: How many different hands are there in straight
poker (no draw)?
$${}_{52}P_{5} = \frac{52!}{(52-5)!} = \frac{52!}{47!} = 52 \times 51 \times 50 \times 49 \times 48 = 311{,}875{,}200$$
The poker example isn’t quite correct, because it assumes
that the order that you received the cards in is important,
which it isn’t. We need another parameter where order isn’t
important.
COMBINATIONS: When we don’t care about the order of
the outcomes (ABC=ACB), then we talk about the number of
COMBINATIONS of n objects taken r at a time. This turns
out to be the number of permutations divided by r!.
$${}_{n}C_{r} = \frac{{}_{n}P_{r}}{r!} = \frac{n!}{r!\,(n-r)!} = \binom{n}{r}$$
The $\binom{n}{r}$ are called the binomial coefficients.
The reason they are called binomial coefficients is
because they are the coefficients of the
expansion:
$$(x+y)^{n} = \sum_{k=0}^{n} \binom{n}{k} x^{k} y^{n-k}$$
For n=2, (x+y)^2 = 1x^2 + 2xy + 1y^2
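The link between combinations and the binomial expansion can be checked numerically. The sketch below uses hypothetical helper names: it lists the coefficients for a given n, then confirms that the term-by-term sum matches (x + y)^n for one pair of integers.

```python
from math import comb


def binomial_coefficients(n):
    """Coefficients of x^k y^(n-k), k = 0..n, in the expansion of (x + y)^n."""
    return [comb(n, k) for k in range(n + 1)]


def expand_numerically(x, y, n):
    """Evaluate the sum of C(n, k) * x^k * y^(n-k) term by term."""
    return sum(comb(n, k) * x**k * y ** (n - k) for k in range(n + 1))


print(binomial_coefficients(2))  # the 1, 2, 1 of the n = 2 case
print(expand_numerically(3, 4, 5) == (3 + 4) ** 5)
```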


So how many different poker hands are there really?
$${}_{52}C_{5} = \frac{{}_{52}P_{5}}{5!} = \frac{311{,}875{,}200}{5 \times 4 \times 3 \times 2 \times 1} = 2{,}598{,}960$$
How many ways can you pick three marbles from 9
marbles?
$${}_{9}C_{3} = \frac{9!}{3!\,6!} = \frac{9 \times 8 \times 7}{3 \times 2 \times 1} = \frac{3 \times 4 \times 7}{1} = 84$$
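The two combination counts can be reproduced in Python as a sanity check. This is a sketch: `combinations_count` is an illustrative name, and it follows the "permutations divided by r!" definition directly rather than calling `math.comb`, which is used only to cross-check.

```python
from math import comb, factorial, perm


def combinations_count(n, r):
    """Unordered selections of r items from n: nPr / r!."""
    return perm(n, r) // factorial(r)


poker_hands = combinations_count(52, 5)
marbles = combinations_count(9, 3)

print(poker_hands, marbles)
print(poker_hands == comb(52, 5), marbles == comb(9, 3))
```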
After picking r objects, there are n-r left, and there are as
many ways of picking n-r objects from n objects as there
are ways to pick r objects from n objects:
$$\binom{n}{r} = \binom{n}{n-r}$$

Probability
Now that we know how to tell what’s possible, how
do we tell what’s probable?
The basic concept is: If there are s possible favorable
outcomes of an event and there are n outcomes
possible, then the probability of success is s/n.
p=s/n
However, this is only true if all outcomes are equally
likely.
Example: What is the probability of drawing an ace from
a deck of cards?
Since there are 52 cards, there are 52 possible
outcomes, and, since there are 4 aces, four of those
outcomes are favorable, thus:
P=4/52=1/13=7.7%
Example: A cancer surgery patient gets biopsies on 6
lymph nodes. If any one is found to contain cancer, then
the cancer will be known to have spread and the patient
will receive chemotherapy. If only 1 of the 10 lymph nodes
is actually cancerous, what is the probability of all six
sampled nodes coming out negative?
Our possible outcomes are 10 nodes taken 6 at a time, or
10C6=10!/(6!(10-6)!)=10*9*8*7/(4*3*2*1)=10*3*7=210.
The outcomes in which all six come out negative are those
that miss the 1 cancerous node, which is the same as picking
all 6 from the 9 clear nodes:
9C6=9!/(6!(9-6)!)=9*8*7/(3*2*1)=84.
So the probability that the cancer goes undetected is
84/210=40%. Lesson to surgeons: sample LOTS of
nodes!
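To see how the chance of a missed diagnosis changes with the number of nodes sampled, the same counting argument can be wrapped in a small function. This is a sketch under the example's assumption that exactly 1 of 10 nodes is cancerous and that nodes are sampled at random; the function name is illustrative.

```python
from math import comb


def prob_all_negative(total_nodes, cancerous, sampled):
    """Probability that a random sample contains none of the cancerous nodes."""
    clear = total_nodes - cancerous
    return comb(clear, sampled) / comb(total_nodes, sampled)


# Probability of missing the single cancerous node for several sample sizes.
for k in (3, 6, 9):
    print(k, prob_all_negative(10, 1, k))
```

With 6 nodes sampled this reproduces the 84/210 figure, and the probability of a miss falls as more nodes are sampled.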
When the probabilities of some outcomes are greater
than those of others, the above calculations don’t work.
A better definition is:
The probability of an outcome is the fraction of
trials where that outcome is observed with a large
number of trials.
Example: “The probability of sunshine for more than 2
hours per day in June in Honolulu is 97%.” This
statistic, valuable to the Tourist Bureau, is based on a
large number of samples of sunshine in Honolulu in
June.
The Law of Large Numbers:
If an experiment is repeated a large
number of times, the fraction of times a
particular outcome is observed will
approach the probability of that outcome.
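A short simulation illustrates the Law of Large Numbers using the ace-drawing example. This is only a sketch: the cards are numbered 0 to 51 and, by assumption, cards 0 to 3 stand for the aces; the seed is fixed so that repeated runs give the same result.

```python
import random


def ace_fraction(n_trials, seed=0):
    """Fraction of single-card draws (with replacement) that turn up an ace."""
    rng = random.Random(seed)
    # Cards are numbered 0..51; cards 0..3 play the role of the four aces.
    hits = sum(1 for _ in range(n_trials) if rng.randrange(52) < 4)
    return hits / n_trials


print("expected:", 4 / 52)
for n in (100, 10_000, 200_000):
    print(n, ace_fraction(n))
```

The observed fraction for small n can be well off, but it settles toward 4/52 as the number of trials grows.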
Rules and definitions
S: Sample Space: All possible outcomes of an experiment
A: Event: a subset of S. An event may contain more than
one outcome
Mutually exclusive: Two events that have no common
outcomes.
The probability of an event must be greater than or equal to
0 and less than or equal to 1.
0 ≤ P(A) ≤ 1
Also, P(S)=1.
If P(A)=1, A is a certainty.
If two events are mutually exclusive, then the
probability that one or the other will occur is the sum
of their probabilities.
∪ : the “Union” symbol. It means “or”
∩ : the “Intersection” symbol. It means “and”
If A and B are mutually exclusive:
P(A ∪ B)= P(A)+P(B)
P(A ∩ B)=0
A’ : the “complement” symbol. It means “not”
P(A)+P(A’)=1
ODDS
Gamblers use odds rather than probabilities. It is an
error to use these two terms interchangeably. If the
probability of an event is p, the odds of it occurring
are:
a:b=p/(1-p) , or p=a/(a+b)
Odds are used because they tell you directly what your
likely winnings are. 1:1 (say 1-to-1) odds mean even
money, or a probability of 50%.
Example: There are 3 blue marbles and one red marble.
You reach into a hat and draw out 1 marble. Your
probability of picking a red marble is 0.25, but the odds
of picking red are 1:3. If you make a bet to pick a red
marble you should get 3 dollars for every dollar bet if
you’re going to break even in the long run.
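The conversion between probability and odds is easy to get backwards, so here is a minimal sketch of both directions. The function names are illustrative, and exact fractions are used so the marble example comes out as whole numbers.

```python
from fractions import Fraction


def odds_from_probability(p):
    """Return (a, b) with a:b = p / (1 - p) in lowest terms; requires p < 1."""
    ratio = Fraction(p) / (1 - Fraction(p))
    return ratio.numerator, ratio.denominator


def probability_from_odds(a, b):
    """Probability corresponding to odds a:b, p = a / (a + b)."""
    return Fraction(a, a + b)


print(odds_from_probability(Fraction(1, 4)))  # the red marble
print(probability_from_odds(1, 3))
print(odds_from_probability(Fraction(1, 2)))  # even money
```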
Additional Probability Addition Rules
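Venn Diagram
[Figure: two overlapping circles in the sample space. Oil only: 0.18; oil and gas (overlap): 0.12; gas only: 0.24]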
Venn diagrams illustrate the probabilities of non-exclusive events. The circles represent two different
events embedded in the sample space. This could be
the probabilities of hitting economical oil (orange) or
gas (pink).
P(oil)=0.18+0.12=0.30
P(gas)=0.24+0.12=0.36
P(gas ∪ oil)=0.18+0.12+0.24=0.54
Note: this is the “inclusive OR” in that both events can
occur and still be counted.
If we had used our previous addition rule,
P(A ∪ B)=P(A)+P(B)=P(oil)+P(gas)=0.30+0.36=0.66,
we would overestimate the probability of finding gas or oil,
because the overlap is counted twice.
We fix that by writing:
P(oil ∪ gas)=P(oil)+P(gas)-P(oil ∩ gas)=0.30+0.36-0.12=0.54
If the events are mutually exclusive, then P(A ∩ B)=0,
and the original rule is recovered.
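The oil and gas numbers can be run through the general addition rule in a few lines. This is a sketch with illustrative names; the three region probabilities are the ones read off the Venn diagram.

```python
def prob_union(p_a, p_b, p_both=0.0):
    """General addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both


# Regions of the Venn diagram.
only_oil, both, only_gas = 0.18, 0.12, 0.24

p_oil = only_oil + both
p_gas = only_gas + both

p_either = prob_union(p_oil, p_gas, both)  # overlap subtracted once
p_naive = prob_union(p_oil, p_gas)         # treats the events as exclusive

print(p_either, p_naive)
```

The naive sum exceeds the correct value by exactly the overlap, which is the double counting the rule removes.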
Conditional Probability
“What if” probabilities are very common - probabilities
where an outcome depends on the occurrence of a
previous outcome.
• If a Category 5 hurricane hits New Orleans, what is
the probability that a dike will fail?
• If an earthquake occurs off the west coast, what is
the probability that a major tsunami will be
generated?
• If a disaster occurs, what is the probability that our
insurance company will not have sufficient funds?
• If oil supply drops below demand, what is the
probability that we can make do with alternative
energy?
Conditional probability is the probability that an event
will occur, given that another event has already
occurred.
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
The probability of A given that B has occurred is equal to
the probability of A and B divided by the probability of B.
In the oil and gas example, what is the probability of
finding oil given that gas was found?
P(oil | gas)= 0.12/0.36= 1/3= 33%
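The conditional probability definition translates directly into code. In this sketch the function name is illustrative; it also computes the reverse question, the probability of gas given oil, to show that the two conditionals are generally different.

```python
def conditional(p_a_and_b, p_b):
    """P(A | B) = P(A and B) / P(B); undefined when P(B) is zero."""
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b


p_oil, p_gas, p_both = 0.30, 0.36, 0.12

p_oil_given_gas = conditional(p_both, p_gas)
p_gas_given_oil = conditional(p_both, p_oil)

print(p_oil_given_gas, p_gas_given_oil)
```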
Bayes’ Basic Theorem
Re-writing the above equation, we get:
P(A ∩ B) = P(B) P(A | B)
and
P(A ∩ B) = P(A) P(B | A).
If A and B are independent, then whether or not B has
occurred does not affect the probability of A:
P(A|B)=P(A).
Substituting into Bayes’ Theorem:
P(A ∩ B)= P(A) P(B), if A and B are independent.
For n independent events with probabilities p_i, the
probability that all of them occur is:
$$P = \prod_{i=1}^{n} p_i$$
Example:
What is the probability of death by meteoroid impact?
The probability of a planet-killer meteoroid impact in a
given year is about 10^-8. The average person lives
about 60 years, and there are about 5x10^9 people.
Every 10^8 years, on average, 5x10^9 people will be killed by an
impact, but every 60 years 5x10^9 people will be killed
by other causes. So, in 10^8 years, (10^8/60) * 5x10^9 people die
of other causes, and 5x10^9 people will be killed by an
impact.
Divide the total deaths by impact by the total deaths by
other causes to get the probability of death by impact:
P(death by impact) ~ 60/10^8 = 6x10^-7, or about 1 in 1.7 million
this is about the same as death by lightning
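Taking the stated inputs at face value (an impact probability of 10^-8 per year, a 60-year lifetime, 5x10^9 people), the arithmetic of the meteoroid example can be sketched as follows. The variable names are illustrative. The second calculation treats the 60 years as independent trials and uses the product rule for independent events, as a cross-check on the ratio argument.

```python
p_impact_per_year = 1e-8   # stated annual probability of a planet-killer impact
lifetime_years = 60        # stated average lifespan
population = 5e9           # stated world population

# Ratio argument: deaths by impact vs. deaths by other causes over 1e8 years.
span_years = 1 / p_impact_per_year
deaths_by_impact = population
deaths_by_other = (span_years / lifetime_years) * population
ratio = deaths_by_impact / deaths_by_other

# Independent-years argument: survive every one of 60 independent years.
p_no_impact = (1 - p_impact_per_year) ** lifetime_years
p_at_least_one = 1 - p_no_impact

print(ratio, p_at_least_one, round(1 / ratio))
```

Both routes give roughly 6x10^-7 for these inputs, because for a rare event the chance of at least one occurrence in 60 years is very nearly 60 times the annual chance.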
Peak Oil Example
The probability that A) demand for oil will outstrip supply
within the next 5 years is ~70%.
The probability that B) we will be able to satisfy that demand
with other energy sources, given A, is ~20%.
The probability of C) global economic chaos if A ∩ B’ occurs is
~60%.
The probability of global economic chaos beginning
within the next 5 years:
P(A ∩ B’) = P(A) P(B’|A)=0.7*(1-0.2)=0.56
P(C) = P(A ∩ B’) P(C | A ∩ B’) = 0.56*0.6 ~ 34%
This is the argument that is getting considerable attention
now: Google “peak oil”
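The chain of conditional probabilities in the peak oil argument can be written out step by step. This sketch uses illustrative variable names and assumes, as the argument implicitly does, that chaos arises only through the path "demand outstrips supply and alternatives fall short".

```python
p_a = 0.70                 # A: demand outstrips supply within 5 years
p_b_given_a = 0.20         # B: alternatives cover the shortfall, given A
p_c_given_a_not_b = 0.60   # C: chaos, given A and not B

# P(A and not B) = P(A) * P(not B | A)
p_a_not_b = p_a * (1 - p_b_given_a)

# P(C) = P(A and not B) * P(C | A and not B), assuming no other route to C
p_c = p_a_not_b * p_c_given_a_not_b

print(p_a_not_b, p_c)
```

Each extra condition in the chain multiplies in another factor below 1, so the final probability is lower than any single estimate in the chain.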
If there is more than one event Bi (all mutually exclusive, and together
covering every way A can occur) conditionally related to event A, then P(A)
is the sum of the conditional probabilities of A given each Bi, weighted by P(Bi):
$$P(A) = \sum_{i=1}^{n} P(A|B_i)\,P(B_i)$$
This yields:
$$P(B_i|A) = \frac{P(B_i)\,P(A|B_i)}{\sum_{j=1}^{n} P(A|B_j)\,P(B_j)}$$
which is the general Bayes’ Theorem.

Like much of statistics, the formulas are incomprehensible
without examples. Consider:
An unknown marine fossil fragment was found at a site
in a stream bed. You want a better fossil, but there are
two possible sources upstream. Drainage basin B1 covers
180 km^2 and B2 covers 100 km^2.
Based on the area alone, the probability that the fossil comes from
one basin or the other is:
P(B1)=180/280=0.64
P(B2)=100/280=0.36
However, a geological map shows that 35% of the outcrops in B1
are marine, while 80% of the outcrops in B2 are marine. The
conditional probabilities are:
P(A|B1)=0.35 probability of a marine fossil given B1
P(A|B2)=0.80 probability of a marine fossil given B2
We can now use Bayes’ theorem to find the probability that the fossil
came from B1, given that the fossil is marine:
$$P(B_1|A) = \frac{P(A|B_1)\,P(B_1)}{P(A|B_1)\,P(B_1) + P(A|B_2)\,P(B_2)} = \frac{0.35 \times 0.64}{0.35 \times 0.64 + 0.80 \times 0.36} = 0.44$$
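The fossil calculation generalizes to any number of candidate sources. The sketch below uses an illustrative function name, takes the priors and conditional probabilities as lists, and returns the posterior for every basin at once. It uses the unrounded area fractions, so the result differs very slightly from the hand calculation with 0.64 and 0.36.

```python
def posterior(priors, likelihoods):
    """General Bayes theorem: P(B_i | A) for each mutually exclusive B_i."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(A | B_i) P(B_i)
    total = sum(joint)                                    # P(A)
    return [j / total for j in joint]


areas_km2 = [180, 100]                    # basins B1 and B2
priors = [a / sum(areas_km2) for a in areas_km2]
likelihoods = [0.35, 0.80]                # fraction of marine outcrops

p_b1, p_b2 = posterior(priors, likelihoods)
print(round(p_b1, 2), round(p_b2, 2))
```

Although B1 is the larger basin, the posterior favors B2 because marine outcrops are much more common there.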