Transcript Probability

Stat 155, Section 2, Last Time
• Producing Data: How to Sample?
– Placebos
– Double Blind Experiment
– Random Sampling
• Statistical Inference
– Population “parameters” μ, σ, p
– Sample “statistics” x̄, s, p̂
(keep these separate)
• Probability Theory
Reading In Textbook
Approximate Reading for Today’s Material:
Pages 231-240, 256-257
Approximate Reading for Next Class:
Pages 259-271, 277-286
Chapter 4: Probability
Goal: quantify (get numerical) uncertainty
• Key to answering questions above
(e.g. what is “natural variation” in a random sample?)
(e.g. which effects are “significant”)
Idea: Represent “how likely” something is
by a number
Probability
Recall Basics:
Assign numbers (representing “how likely”),
to outcomes
E.g. Die Rolling:
P{comes up 4} = 1/6
• Outcome is “4”
• Probability is 1/6
Simple Probability
Quantify “how likely” outcomes are by
assigning “probabilities”
I.e. a number between 0 and 1, to each
outcome, reflecting “how likely”:
Intuition:
• 0 means “can’t happen”
• ½ means “happens half the time”
• 1 means “must happen”
Simple Probability
Main Rule:
Sum of all probabilities (i.e. over all outcomes) is 1.
E.g. for die rolling:
P{1} = 1/6
P{2} = 1/6
P{3} = 1/6
P{4} = 1/6
P{5} = 1/6
P{6} = 1/6
Sum = 1
Simple Probability
HW:
4.13a
4.15
Probability
General Rules for assigning probabilities:
i. Frequentist View
(what happens in many repetitions?)
ii. Equally Likely: for n outcomes
P{one outcome} = 1/n (e.g. die rolling)
iii. Based on Observed Frequencies
e.g. life tables summarize when people die
Gives “prob of dying” at a given age
“life expectancy”
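The frequentist view (i) can be checked against the equally likely rule (ii) with a small simulation. This is only a sketch: the function name `relative_frequency` and the fixed seed are illustrative choices, and Python's pseudo-random generator stands in for a physical die.

```python
import random

def relative_frequency(target, n_rolls, seed=0):
    """Fraction of n_rolls simulated die rolls that come up `target`."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == target)
    return hits / n_rolls

# With more repetitions, the observed frequency of "4" should settle near 1/6.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(4, n))
```

The printed frequencies will wobble for small n and should drift toward 1/6 ≈ 0.167 as n grows.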
Probability
General Rules for assigning probabilities:
iv. Personal Choice:
– Reflecting “your assessment”
– E.g. Oddsmakers
– Careful: requires some care
(key is prob’s need to sum to 1)
HW:
4.19
Probability - Events
More Terminology (to carry this further):
• An event is a set of outcomes
Die Rolling: “an even #”, is the event {2, 4, 6}
Notes:
– If betting on even don’t care about #, only even or odd
– Thus events are our foundation
– Each outcome is an event: the set containing just that outcome
– So event is the more general concept
Probability on Events
Sample Space is the set of all outcomes
= “event with everything that can happen”
Extend Probability to Events by:
P{event} = sum of probs of outcomes in event
$P\{\text{event}\} = \sum_{\text{outcomes } O \text{ in event}} P\{O\}$
Probability
Technical Summary:
• A probability model is a sample space, i.e. set of outcomes, plus a probability, P
• P assigns numbers to events
• Events are sets of outcomes
Probability Function
The probability, P, is a “function”,
defined on a set of events
Recall function in math: $f(x) = 3x^2 + 2$
(plug in x, get out f(x))
Probability: P{event} = “how likely”
(plug in event, get out number)
Probability Function
E.g. Die Rolling
• Sample Space = {1, 2, 3, 4, 5, 6}
• “an even #” is the event {2, 4, 6} (a “set”)
• P{“even”} = P{2, 4, 6} = $\sum_{o \in \{2,4,6\}} P\{o\}$
= P{2} + P{4} + P{6} =
= 1/6 + 1/6 + 1/6 = 3/6 = ½
• Fits, since expect “even # half the time”
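The idea of P as a function on events can be sketched in a few lines of Python. The names `die` and `prob` are illustrative only; exact fractions are used so that the sums come out as 1/6, 1/2, and 1 without rounding.

```python
from fractions import Fraction

# Probability model for one die roll: each outcome gets probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def prob(event, model):
    """P{event} = sum of the probabilities of the outcomes in the event."""
    return sum((model[outcome] for outcome in event), Fraction(0))

# Main rule: the probabilities over the whole sample space sum to 1.
assert prob(die.keys(), die) == 1

even = {2, 4, 6}        # an event is a set of outcomes
print(prob(even, die))  # 1/2
```

Plugging in a single-outcome event such as `{4}` gives back 1/6, matching the point that each outcome is itself an event.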
Probability HW
HW:
4.11
4.13b
4.17
And now for something
completely different
• Did you hear about the constipated mathematician?
• He worked it out with a pencil!
• Apologies for the juvenile nature…
• But there is an important point:
The pencil is a powerful mathematical tool
• An old student:
– I was once “good in math”
– But suddenly lost that
– Reason: tried to do too much in head
– Reason: never learned power of the pencil
• For us: now is time to start using pencil
• I do PowerPoint in class
• You use pencil on HW (and exams)
• Change in mindset, from Excel…
Probability
Now stretch ideas with more interesting e.g.
E.g. Political Polls, Simple Random Sampling
2 views:
1. Each individual equally likely to be in sample
2. Each possible sample is equally likely
Allows for simple Probability Modelling
Simple Random Sampling
• Sample Space is set of all possible samples
• An Event is a set of some samples
E.g. For population A, B, C, D
– Each is a voter
– Only 4, so easy to work out
S. R. S. Example
For population A, B, C, D,
Draw a S. R. S. of size 2
Sample Space =
{(A,B), (A,C), (A,D), (B,C), (B,D), (C,D)}
outcomes, i.e. possible samples of size 2
S. R. S. Example
Now assign P, using “equally likely” rule:
P{A,B} = P{A,C} = … = P{C,D} =
= 1/(#samples) = 1/6
An interesting event is:
“C in sample” = {(A,C), (B,C), (C,D)}
(set of samples with C in them)
S. R. S. Example
$P\{\text{C in sample}\} = \sum_{\text{samples with C}} P\{\text{sample}\} = \sum_{\text{samples with C}} \frac{1}{6} = (\#\,\text{samples with C}) \cdot \frac{1}{6} = 3 \cdot \frac{1}{6} = \frac{1}{2}$
i.e. happens “half the time”.
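The same count can be done by brute force, listing every sample of size 2 and applying the equally likely rule. This is a sketch only; the variable names are illustrative, and `itertools.combinations` is used to generate the unordered samples.

```python
from fractions import Fraction
from itertools import combinations

population = ["A", "B", "C", "D"]

# Sample space: all unordered samples of size 2.
samples = list(combinations(population, 2))
print(samples)

# Equally likely rule: each sample has probability 1 / (# samples).
p_each = Fraction(1, len(samples))

# Event "C in sample" is the set of samples containing C.
with_c = [s for s in samples if "C" in s]
p_c = len(with_c) * p_each
print(p_c)  # 1/2
```

Changing `population` or the sample size to match a homework problem reuses the same steps.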
S. R. S. Probability HW
HW C10:
Abby, Bob, Mei-Ling, Sally and Roberto work for a
firm. Two will be chosen at random to attend
an overseas meeting. The choice will be made
by drawing names from a hat (this is an S. R.
S. of 2).
a. Write down all possible choices of 2 of the 5
names. This is the sample space.
b. Random choice makes all choices equally
likely. What is the probability of each choice?
(1/10)
c. What is the prob. that Sally is chosen? (4/10)
d. What is the prob. that neither Bob, nor Roberto
is chosen? (3/10)
Political Polls Example
What is your chance of being in a poll of
1000, from S.R.S. out of 200,000,000?
(crude estimate of # of U. S. voters)
Recall each sample is equally likely so:
$P\{\text{sample with you}\} = \frac{\#\,\text{samples with you}}{\text{total}\ \#\,\text{samples}}$
Problem: total # samples is really big, $\approx 2.66 \times 10^{5733}$
(5,734 digits, too big for easy handling…)
Political Polls Example
More careful calculation:
199,999,999 
1 

999
# samples with you


Psample with you 

total # samples
 200,000,000 


 1,000

199,999,999!
1000
1
199,999,000!999!



200,000,000!
200,000,000 200,000
199,999,000!1000!
Makes sense, since you are “equally likely to
be in samples”
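Although the counts are too big for hand arithmetic, exact integer arithmetic can confirm the cancellation. The sketch below assumes Python 3.8 or later for `math.comb`; the variable names are illustrative.

```python
import math
from fractions import Fraction

N = 200_000_000  # crude estimate of # of U.S. voters (from the slide)
n = 1_000        # poll size

total = math.comb(N, n)             # total # samples
with_you = math.comb(N - 1, n - 1)  # you, plus 999 chosen from everyone else

p = Fraction(with_you, total)
print(p)         # 1/200000

# Size of the total count: the power of 10 in its scientific notation.
exponent = math.floor(math.log10(total))
print(exponent)  # 5733
```

The exact fraction reduces to n/N, which is the "equally likely to be in samples" answer.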
And now for something
completely different
.
An interesting phone conversation….
Sound File
Probability
• Now have prob. models
• But still hard to work with
• E.g. prob’s we care about, such as “accuracy estimators”, need better tools
• Need to look more deeply
3 Big Rules of Probability
• Main idea: calculate “complicated prob’s”
• By decomposing events in terms of simple events
• Then calculating probs of these
• And then using simple rules of probability to combine
3 Big Rules of Probability
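Rule I: the not rule:
P{not A} = 1 – P{A}
Why?
E.g. equally likely sample points:
$\frac{\#\,\text{not in } A}{\text{total}\ \#} = \frac{\text{total}\ \# - \#\,\text{in } A}{\text{total}\ \#}$
And more generally:
$\sum_{\text{el's not in } A} \text{probs} = 1 - \sum_{\text{el's in } A} \text{probs}$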
The “Not” Rule of Probability
Text Book Terminology (sec. 4.2):
not A = $A^C$ for “complement”
(set theoretic term)
(I prefer “not”, since more intuitive)
The “Not” Rule of Probability
HW: Rework, using the “not” rule: 4.17b
3 Big Rules of Probability
Rule II: the or rule:
P{A or B} = P{A} + P{B} – P{A and B}
Why?
E.g. equally likely sample points:
$\frac{\#\,\text{in } A \text{ or } B}{\text{total}\ \#} = \frac{\#\,\text{in } A + \#\,\text{in } B - \#\,\text{in } A \,\&\, B}{\text{total}\ \#}$
Helpful Pic:
Big Rules of Probability
E.g. Roll a die,
Let A = “4 or less” = {1, 2, 3, 4}
Let B = “Odd” = {1, 3, 5}
Check how rules work by calculating 2 ways:
Direct:
P{not A} = P{5, 6} = 2/6 = 1/3
By Rule I: P{not A} = 1 – P{A} = 1 – 4/6 = 1/3
The “Or” Rule of Probability
A = “4 or less” = {1, 2, 3, 4}
B = “Odd” = {1, 3, 5}
Check how rule works by calculating 2 ways:
Direct:
P{A or B} = P{1, 2, 3, 4, 5} = 5/6
By Rule II: P{A or B} =
= P{A} + P{B} – P{A and B} =
= 4/6 + 3/6 – 2/6 = 5/6
(check!)
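Both checks can be repeated with Python sets, where "not" is set difference from the sample space, "or" is union, and "and" is intersection. This is a sketch for the equally likely case only; `prob` is an illustrative helper name.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Equally likely outcomes: P{event} = (# in event) / (total #)."""
    return Fraction(len(event), len(sample_space))

A = {1, 2, 3, 4}  # "4 or less"
B = {1, 3, 5}     # "Odd"

# Rule I (not rule): direct count vs. 1 - P{A}
not_A = sample_space - A
assert prob(not_A) == 1 - prob(A) == Fraction(1, 3)

# Rule II (or rule): direct count vs. P{A} + P{B} - P{A and B}
assert prob(A | B) == prob(A) + prob(B) - prob(A & B) == Fraction(5, 6)
```

Subtracting P{A and B} corrects for the outcomes 1 and 3, which would otherwise be counted twice.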
The “Or” Rule of Probability
• Seems too easy?
• Don’t really need rules for these simple things
• But they are the key to bigger problems
• Such as Simple Random Sampling
HW: 4.86 (0.317)
The “Or” Rule of Probability
E.g: A college has 60% Women and 40%
smokers, and 50% women who don’t
smoke.
What is the chance that a randomly selected
student is either a woman or a nonsmoker?
(seems “>60%”, but twice? Must be < 100%,
i.e. must be some overlap…)
College Women – Smokers E.g.
P{W or NS} = P{W} + P{NS} – P{W & NS}
(choice of letters makes this easy to work with)
= 0.6 + (1 – 0.4) – 0.5 = 0.7
(answer is 70% Women or Non-Smokers)
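The arithmetic combines the not rule (to get P{NS} from P{smoker}) with the or rule. A short sketch with exact fractions, using illustrative variable names:

```python
from fractions import Fraction

p_w = Fraction(60, 100)         # P{W}: woman
p_s = Fraction(40, 100)         # P{S}: smoker
p_w_and_ns = Fraction(50, 100)  # P{W & NS}: woman who doesn't smoke

p_ns = 1 - p_s                        # not rule: P{NS} = 1 - P{S}
p_w_or_ns = p_w + p_ns - p_w_and_ns   # or rule

print(p_w_or_ns)         # 7/10
print(float(p_w_or_ns))  # 0.7
```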
Note: rules are powerful when used together
More “Or” Rule HW
HW: C11
A building company bids on two large
projects. The president believes the
chance of winning the 1st is 0.6, the
chance of winning the 2nd is 0.5, and the
chance of winning both is 0.3. What is
the chance of winning at least one of the
jobs?
(0.8)
The “Or” Rule of Probability
E.g. Events A & B are “mutually exclusive”,
i.e. “disjoint”, when P{A & B} = 0
(i.e. no chance of seeing both at same time)
Useful Pic:
Then:
P{A or B} = P{A} + P{B}
Text suggests “new rule”, I say “special case”
The “Exclusive Or” Rule
HW: C12
Choose an acre of land in Canada at
random. The probability is 0.35 that it is
forest, and 0.03 that it is pasture.
a. What is the probability that the acre
chosen is not forested? (0.65)
b. What is the probability that it is either
forest or pasture? (0.38)
c. What is the probability that a randomly
chosen acre in Canada is neither forest
nor pasture? (0.62)
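The stated answers can be reproduced with the not rule and the special case of the or rule, assuming that forest and pasture are mutually exclusive (an acre is not counted as both). Variable names below are illustrative.

```python
from fractions import Fraction

p_forest = Fraction(35, 100)
p_pasture = Fraction(3, 100)

# a. not rule
p_not_forest = 1 - p_forest

# b. or rule with P{forest & pasture} = 0 (assumed disjoint)
p_either = p_forest + p_pasture

# c. not rule applied to the event "forest or pasture"
p_neither = 1 - p_either

print(float(p_not_forest), float(p_either), float(p_neither))  # 0.65 0.38 0.62
```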