Probability, Probability Distributions, Binomial Distribution

Class 02
Probability, Probability Distributions,
Binomial Distribution
What we learned last class…
• We are not good at recognizing/dealing with randomness
– Our “random” coin flip results weren’t streaky enough.
• If B/G results behave like independent coin flips, we know
how many families to EXPECT with 0,1,2,3,4 girls.
– We expect 6/16 of 4-child families to have 2 girls and 2 boys.
– This is PROBABILITY
• We will compare the actual counts to the expected counts
to judge whether the coin flip assumption is a good one.
– To do this comparison, we will have to recognize that there will
be differences between actual and expected counts even if the
coin flip assumption is a good one.
• That is STATISTICS!
Probability is useful
• To make better (thoughtful) decisions.
– Lend or reject.
– Operate or wait and see.
– Bunt or hit away.
• To help make sense of data
– By comparing what happened to what can happen
by chance.
The First Probability Problem
Two men play chess. The first to win
three games will receive two ducats.
Play is interrupted with player A
ahead 2 games to 1. How should
the prize be divided between the
two men? (circa 1400)
Probability Examples
Flip a fair coin:                        P(Head) = 0.5
Draw a card from a well-shuffled deck:   P(Ace) = 4/52
Observe the weather tomorrow:            P(R) = ?
Probability Fact: The probability that A will not happen is 1 minus the probability that it will happen (and vice versa).
Flip a fair coin:                        P(Head) = 0.5    P(Tail) = 1 - 0.5
Draw a card from a well-shuffled deck:   P(Ace) = 4/52    P(not an Ace) = 1 - 4/52
Observe the weather tomorrow:            P(R) = ?         P(R^c) = 1 - ?
Not A is denoted A^c.
So if it is difficult to find P(A), try finding P(A^c) instead.
P(3 or fewer girls in 4) = 1 – P(4 girls)
P(some students here have the same birthday) = 1 – P(all have different birthdays)
(4.5)
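The birthday example can be worked through with the complement rule. The sketch below is only an illustration: the function name is made up here, and it assumes 365 equally likely birthdays and independent students, neither of which is stated on the slide.

```python
def prob_shared_birthday(n_students: int, days: int = 365) -> float:
    """P(at least two students share a birthday), found as 1 - P(all different).

    Assumes (for illustration only) that every day is equally likely and that
    students' birthdays are independent of each other.
    """
    p_all_different = 1.0
    for k in range(n_students):
        # Student number k+1 must avoid the k birthdays already taken.
        p_all_different *= (days - k) / days
    return 1.0 - p_all_different


# With 23 students the chance of a shared birthday is already about 0.507.
p_23 = prob_shared_birthday(23)
```

Finding P(all different) directly is a simple product, while listing every way that "some students share a birthday" would be much harder. That is the point of working with the complement.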
Consider Two Trials
Flip a fair coin:                        P(H) = 0.5       P(H,H) = (0.5)(0.5)
Draw a card from a well-shuffled deck:   P(Ace) = 4/52    P(Ace,Ace) = (4/52)(3/51)
Observe the weather tomorrow:            P(R) = ?         P(R1,R2) = P(R1) * P(R2│R1)
P(A and B) is written as P(A∩B) or P(A,B).
P(B│A) is the probability of B given A.
P(A∩B) = P(A) * P(B│A) always. THE MULTIPLICATION LAW (4.12)
B and A are INDEPENDENT if P(B│A) = P(B) and vice versa. (4.9)
So P(A∩B) = P(A) * P(B) if A and B are independent. (4.13)
Conditional Probability
People who switched to ALLSTATE saved on average $348 per year.
P(Amount of Savings│You switched) does not equal P(Amount of Savings)
“Amount of Savings” and “Switching” are NOT independent.
http://www.couponsnapshot.com/merchantAllstate-coupons-deals-5106.html
Consider Two Trials
Flip a fair coin:                        Pr(H) = 0.5      Pr(H,H) = (0.5)(0.5)
Draw a card from a well-shuffled deck:   Pr(Ace) = 4/52   Pr(Ace,Ace) = (4/52)(3/51)
Observe the weather tomorrow:            Pr(R) = ?        Pr(R1,R2) = Pr(R1) * Pr(R2│R1)
Pr(A and B) is written as Pr(A∩B).
Pr(A∩B) = Pr(A) * Pr(B│A) always.
B and A are INDEPENDENT if Pr(B│A) = Pr(B) and vice versa.
Pr(A∩B) = Pr(A) * Pr(B) if A and B are independent.
Coin flips are independent.
Card draws are not (unless we replace the first card or the deck is HUGE).
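The multiplication law for the coin and card examples can be checked with exact fractions. This is a small sketch; the variable names are chosen here for illustration.

```python
from fractions import Fraction

# Two coin flips: independent, so P(H,H) = P(H) * P(H).
p_two_heads = Fraction(1, 2) * Fraction(1, 2)            # 1/4

# Two cards WITHOUT replacement: the second draw depends on the first.
p_first_ace = Fraction(4, 52)
p_second_ace_given_first_ace = Fraction(3, 51)            # 3 aces left among 51 cards
p_two_aces = p_first_ace * p_second_ace_given_first_ace   # 1/221

# Two cards WITH replacement: the draws become independent.
p_two_aces_with_replacement = Fraction(4, 52) * Fraction(4, 52)  # 1/169
```

The two card answers differ because replacing the first card resets the deck, so the conditional probability of a second ace goes back to 4/52.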
Independence is often THE question
• Are boy/girl outcomes independent?
– Does P(fourth child is a boy) change based on first
three outcomes?
• Do players get “hot” or “in the zone”?
• Does past fund performance predict future
performance?
The Monty Hall Problem
• Three doors. Prize behind one, goats behind
the other two.
• I pick a door.
• Monty (who knows where the prize is) reveals
a goat. (Assume he ALWAYS reveals a goat).
• What is the probability the prize is behind my
door?
INDEPENDENCE solves the Monty Hall
Problem
• P(Monty reveals a goat) = 1
• P(Monty reveals a goat │ my door has prize) = 1
• Events “Monty reveals a goat” and “my door has prize”
are INDEPENDENT.
• P(my door has prize) = 1/3
• P(my door has prize │Monty reveals a goat) = 1/3
• So….if I switch to the other unopened door…I win the
prize with probability 2/3.
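One way to convince yourself of the 1/3 and 2/3 answers is to simulate many games. The sketch below is an illustration, not part of the slides: the function names, the seed, and the number of games are arbitrary choices, and it assumes Monty picks at random when he has two goat doors to choose from.

```python
import random


def play_monty_hall(switch: bool, rng: random.Random) -> bool:
    """Play one game and return True if the final pick wins the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    my_pick = rng.choice(doors)
    # Monty knows where the prize is and ALWAYS reveals a goat behind a door
    # that is not mine.
    monty_opens = rng.choice([d for d in doors if d != my_pick and d != prize])
    if switch:
        # Move to the one door that is neither my first pick nor the opened door.
        my_pick = next(d for d in doors if d != my_pick and d != monty_opens)
    return my_pick == prize


def win_rate(switch: bool, n_games: int = 100_000, seed: int = 0) -> float:
    """Fraction of simulated games won under the chosen strategy."""
    rng = random.Random(seed)
    wins = sum(play_monty_hall(switch, rng) for _ in range(n_games))
    return wins / n_games


stay_rate = win_rate(switch=False)    # should land near 1/3
switch_rate = win_rate(switch=True)   # should land near 2/3
```

The simulated rates will not equal 1/3 and 2/3 exactly. Differences between actual and expected counts show up even when the model is right.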
Consider Two Traits and a randomly selected 2010 ND undergrad
            A^c      A    total
Female     3479    382     3861
Male       3935    555     4490
total      7414    937     8351
Pr(A) = 937/8351
Pr(A│F) = 382/3861
Pr(F) = 3861/8351
Pr(F│A) = 382/937
Pr(A∩F) = 382/8351
Pr(A∪F) = (3479+382+555)/8351
Any four numbers or %s allow you to fill in everything.
Consider Two Traits and a randomly selected ND undergrad
            A^c      A    total
Female     3479    382     3861
Male       3935    555     4490
total      7414    937     8351
Pr(A) = 937/8351
Pr(A│F) = 382/3861
Pr(F) = 3861/8351
Pr(F│A) = 382/937
Pr(A∩F) = 382/8351 (also P(A) * P(F│A))
Pr(A∪F) = (3479+382+555)/8351
Events A, F are NOT independent.
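The probabilities on this slide all come straight from the four interior counts. The sketch below recomputes them with exact fractions and checks independence. The dictionary layout and variable names are illustrative choices.

```python
from fractions import Fraction

# Interior cells of the table of counts (rows: Female/Male, columns: A / A^c).
counts = {
    ("F", "A"): 382, ("F", "Ac"): 3479,
    ("M", "A"): 555, ("M", "Ac"): 3935,
}
total = sum(counts.values())                       # 8351
n_A = counts[("F", "A")] + counts[("M", "A")]      # 937
n_F = counts[("F", "A")] + counts[("F", "Ac")]     # 3861

p_A = Fraction(n_A, total)
p_F = Fraction(n_F, total)
p_A_and_F = Fraction(counts[("F", "A")], total)
p_A_given_F = Fraction(counts[("F", "A")], n_F)
p_F_given_A = Fraction(counts[("F", "A")], n_A)
p_A_or_F = p_A + p_F - p_A_and_F                   # addition law

# A and F are independent only if conditioning on F leaves Pr(A) unchanged.
independent = (p_A_given_F == p_A)
```

Here Pr(A│F) is about 0.099 while Pr(A) is about 0.112, so the check comes out False, matching the slide.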
Convert Probs to Table of Counts to make things easy to understand
Pr(D) = 1% (I have the disease with probability 1%)
Pr(Pos│D) = 90%
Pr(Pos│D^c) = 20%
            D^c      D     total
Pos        1980     90      2070
Neg        7920     10      7930
total      9900    100    10,000
I tested positive. Do I have the disease?
Pr(D│Pos) = 90/2070
Convert Probs to Table of Counts to make things easy to understand
Pr(D) = 1% (I have the disease with probability 1%)
Pr(Pos│D) = 90%
Pr(Pos│D^c) = 20%
            D^c      D     total
Pos        1980     90      2070
Neg        7920     10      7930
total      9900    100    10,000
Pr(D│Pos) = 90/2070 = 4.3%
We just used BAYES THEOREM!!
See (4.17) or (4.18) or (4.19) to see what the formula looks like.
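The table-of-counts approach can be written as a few lines of arithmetic. In the sketch below the function and parameter names (such as `sensitivity` and `false_positive_rate`) are labels chosen here, not terms from the slides, and the population of 10,000 is just a convenient size.

```python
def posterior_from_counts(prior: float,
                          sensitivity: float,
                          false_positive_rate: float,
                          population: int = 10_000) -> float:
    """Return Pr(D | Pos) by filling in a table of counts.

    prior               = Pr(D)
    sensitivity         = Pr(Pos | D)
    false_positive_rate = Pr(Pos | D^c)
    """
    diseased = prior * population                     # 100 of 10,000
    healthy = population - diseased                   # 9,900
    true_positives = sensitivity * diseased           # 90
    false_positives = false_positive_rate * healthy   # 1,980
    return true_positives / (true_positives + false_positives)


# Slide values: Pr(D) = 1%, Pr(Pos | D) = 90%, Pr(Pos | D^c) = 20%.
p_d_given_pos = posterior_from_counts(0.01, 0.90, 0.20)   # 90/2070, about 4.3%
```

The population size cancels out of the final ratio, so any convenient number works. Choosing 10,000 simply keeps every cell a whole number.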
Consider 3 independent coin flips.
Pr(H,H,H) = 1/8                                Pr(3H) = 1/8
Pr(H,H,T) = Pr(H,T,H) = Pr(T,H,H) = 1/8        Pr(2H) = 3/8  (Addition law)
Pr(H,T,T) = Pr(T,H,T) = Pr(T,T,H) = 1/8        Pr(1H) = 3/8
Pr(T,T,T) = 1/8                                Pr(0H) = 1/8
This is a probability distribution. It is a schedule that assigns the unit of probability to the set of possible numeric outcomes.
Random Variable X is the number of heads in 3 flips.
X is discrete (takes on only a few values), and this is a probability MASS function.
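The same distribution can be built by listing all eight outcomes and adding up the probabilities of those with the same number of heads. This is a minimal sketch with illustrative variable names.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2*2*2 = 8 equally likely sequences of three fair, independent flips.
outcomes = list(product("HT", repeat=3))
p_each = Fraction(1, len(outcomes))        # 1/8 per sequence

# X = number of heads. Add the probabilities of the mutually exclusive
# sequences that give each value of X.
pmf = Counter()
for outcome in outcomes:
    pmf[outcome.count("H")] += p_each
```

The resulting `pmf` maps 0, 1, 2, 3 heads to 1/8, 3/8, 3/8, 1/8, and the four values add up to 1.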
The Addition Law
I never use this.
P(A∪B) = P(A) + P(B) – P(A∩B) (4.6)
= P(A) + P(B) if A,B are MUTUALLY EXCLUSIVE
I use this instead... I figure out ALL the possible
mutually exclusive outcomes and ADD the
probabilities of those that apply.
A and B are mutually exclusive if P(A∩B)=0
So P(1H in 3 tosses) = P(H,T,T) + P(T,H,T) + P(T,T,H)
because there are three mutually exclusive ways
to throw 1 H in three flips.
Don’t Make this mistake
• P(H1∪H2) = P(H1) + P(H2) = ½ + ½ = 1
– This is wrong because H1 and H2 are not mutually exclusive (both can happen, and neither can happen).
Two correct ways
• P(H1∪H2) = P(H1) + P(H2) - P(H1∩H2) = ½ + ½ - ¼ = ¾
• P(H1∪H2) = P(H1,T2) + P(H1,H2) + P(T1,H2) = ¼ + ¼ + ¼ = ¾
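Both correct routes can be checked by enumerating the four outcomes of two flips. The helper `prob` below is a hypothetical name introduced only for this sketch.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=2))   # (H,H), (H,T), (T,H), (T,T)
p_each = Fraction(1, 4)                    # fair, independent flips


def prob(event) -> Fraction:
    """Add up the probabilities of the mutually exclusive outcomes in the event."""
    return sum((p_each for o in outcomes if event(o)), Fraction(0))


def h1(o):
    return o[0] == "H"    # head on flip 1


def h2(o):
    return o[1] == "H"    # head on flip 2


# Route 1: the addition law.
p_union_formula = prob(h1) + prob(h2) - prob(lambda o: h1(o) and h2(o))
# Route 2: add the mutually exclusive outcomes that contain at least one head.
p_union_enumerated = prob(lambda o: h1(o) or h2(o))
# The mistaken version, which forgets that (H,H) is counted twice.
p_mistake = prob(h1) + prob(h2)
```

Both routes give 3/4, while the mistaken sum gives 1, which would wrongly rule out the outcome (T,T).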
Five Probability Mass Functions
Number of                        No. Heads
Flips        0         1         2         3         4         5
1            0.5       0.5
2            0.25      0.5       0.25
3            0.125     0.375     0.375     0.125
4            0.0625    0.25      0.375     0.25      0.0625
5            0.03125   0.15625   0.3125    0.3125    0.15625   0.03125
P(x) is never negative.
Sum of P(x) over all possible x values is equal to 1.
The Binomial (family) of pmf’s.
• Assumptions
– Random variable X is the number of successes in n independent trials with P(success) = p on each trial.
Important word
p can be any number between 0 and 1
• Parameters
– The binomial has two parameters: n and p
• Calculation of the probabilities
Pr(x successes) = BINOMDIST(x,n,p,false)
Pr(x or fewer successes) = BINOMDIST(x,n,p,true)
EMBS: 5.4
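For readers working outside a spreadsheet, the two BINOMDIST calls can be mirrored with the standard binomial formula, C(n, x) * p^x * (1-p)^(n-x). The function names below are made up for this sketch and are not spreadsheet functions.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """Pr(exactly x successes), the analogue of BINOMDIST(x, n, p, false)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(x: int, n: int, p: float) -> float:
    """Pr(x or fewer successes), the analogue of BINOMDIST(x, n, p, true)."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))


# Example: 4 children, girl = "success", p = 0.5.
p_two_girls = binom_pmf(2, 4, 0.5)            # 6/16 = 0.375
p_three_or_fewer_girls = binom_cdf(3, 4, 0.5)  # 15/16 = 0.9375
```

The cumulative version is just the sum of the individual probabilities for 0, 1, ..., x successes, since those outcomes are mutually exclusive.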
Characteristics of any pmf
• MODE (most likely). The x value with the highest probability.
– For the binomial, table the pmf to find the mode.
• MEAN (or expected value). The probability-weighted average X
– Sum over all possible x values of x*P(x)
– For the binomial, the mean will be n*p
• VARIANCE. The probability-weighted average squared distance from
the mean.
– Sum of (x-mean)^2 * p(x)
– For the binomial, VAR(X) = n*p*(1-p)
• STANDARD DEVIATION. The square root of the variance.
– Since VARIANCE is average squared distance, STANDARD DEVIATION
will be an “average distance”.
It is okay if, at this point, you do not appreciate
VARIANCE and STANDARD DEVIATION
EMBS: 5.2, 5.3
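The definitions above can be applied directly to a tabled pmf. The sketch below builds a binomial pmf and then computes the mode, mean, variance, and standard deviation from the probability-weighted sums. The function names are illustrative, and ties for the mode are detected with a small floating-point tolerance.

```python
from math import comb, isclose, sqrt


def binom_pmf_table(n: int, p: float) -> dict:
    """Table the binomial pmf as {x: P(x)} for x = 0..n."""
    return {x: comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)}


def summarize(pmf: dict):
    """Return (modes, mean, variance, std dev) of a discrete pmf."""
    mean = sum(x * px for x, px in pmf.items())
    variance = sum((x - mean) ** 2 * px for x, px in pmf.items())
    top = max(pmf.values())
    modes = [x for x, px in pmf.items() if isclose(px, top)]
    return modes, mean, variance, sqrt(variance)


# Five fair coin flips: n = 5, p = 0.5.
modes, mean, variance, std_dev = summarize(binom_pmf_table(5, 0.5))
```

For n = 5 and p = 0.5 this gives modes 2 and 3, mean 2.5 (which is n*p), variance 1.25 (which is n*p*(1-p)), and a standard deviation of about 1.118.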
Five binomial pmf’s and their mode, mean, var, std dev
Number of                        No. Heads
Flips     0         1         2         3         4         5         Mode   Mean   Var    Std dev
1         0.5       0.5                                               0,1    0.5    0.25   0.5
2         0.25      0.5       0.25                                    1      1      0.5    0.707
3         0.125     0.375     0.375     0.125                         1,2    1.5    0.75   0.866
4         0.0625    0.25      0.375     0.25      0.0625              2      2      1      1
5         0.03125   0.15625   0.3125    0.3125    0.15625   0.03125   2,3    2.5    1.25   1.118
P(x) is never negative.
Sum of P(x) over all possible x values is equal to 1.
Probability Notation
Pr(A^c) = Prob A does not happen = 1 – Pr(A)
Pr(A│B) = Prob A given B = Pr(A∩B)/Pr(B)
Pr(A∩B) = Prob A and B = Pr(A) * Pr(B│A) = Pr(B) * Pr(A│B)
Pr(A∪B) = Prob A or B = Pr(A) + Pr(B) – Pr(A∩B)
Just create a table of counts and go from there, or maybe draw a probability tree to enumerate all possible outcomes.
A Probability Distribution
A schedule that assigns the unit of probability to the possible values
taken on by a random variable (number)
A Probability Mass Function
When the random variable is discrete, its probability distribution is a
probability MASS function because probability MASSES on each
possible discrete outcome value.
Characteristics of any probability
distribution
Mode (most likely), Mean (expected value), variance, standard
deviation.
EMBS: 5.1, 5.2, 5.3
The Binomial Pmf
• Applies to the number of successes in n independent trials.
• Parameters are n and p.
• Mean (expected value) is n*p
• Variance is n*p*(1-p)
• Standard deviation is sqrt(n*p*(1-p))
• =binomdist(X,n,p,false) to find the probability that the binomial random variable equals X.
• =binomdist(X,n,p,true) to find the probability that the binomial random variable is <= X.
EMBS: 5.4
Assignment Due Next Class
TA Office Hours
Tuesday night
7 to 8:30
classroom 266
My “office” hours
Every class day 3 to 4:30
In the classroom L051
Tabular Approach to MONTY HALL
(MRG = Monty reveals a goat)
             My Door      My Door
             not Prize    Prize       total
MRG          200          100         300
Not MRG      0            0           0
total        200          100         300
Pr(Prize│MRG) = 100/300 = 1/3