
Statistics and Quantitative Analysis U4320
Segment 3: Probability
Prof. Sharyn O’Halloran

Review Descriptive Statistics
A. SAMPLE DATA SET
Code book for Measures
Religion
1. Catholic
2. Protestant
3. Other
9. Don't Know, No Answer
Income
Measured in Thousands of $
-99. DK, NA
Employed
0. Unemployed
1. Employed
9. DK, NA
Class
1. Lower
2. Middle
3. Upper
9. DK, NA
Lower
0. Other
1. Lower
9. DK, NA
Upper
0. Other
1. Upper
9. DK, NA
Survey Data -- Matrix of Cases and Measured Variables:

Case                Religion  Employed  Class  Lower  Upper  Income
1                   1         0         1      1      0      8
2                   3         0         3      0      1      35
3                   2         0         2      0      0      20
4                   1         1         2      0      0      12
5                   1         1         3      0      1      37
6                   2         1         1      1      0      14
7                   3         0         2      0      0      20
8                   2         0         2      0      0      18
9                   2         9         1      1      0      -99
10                  9         0         9      9      9      11
Mode                2         0         2      0      0      20
Median              N/A       0         2      0      0      18
Mean                N/A       .33       N/A    .33    .22    19.44
Variance            N/A       .222      N/A    .222   .173   93.36
Standard Deviation  N/A       .471      N/A    .471   .416   9.66
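As a rough check on the Income column, the summary rows can be reproduced with Python's standard `statistics` module. This is a minimal sketch and not part of the original slides: the variable names are illustrative, and it assumes that the -99 code is dropped before computing and that the variance divides by N (the population form), which is what reproduces the tabled values.

```python
import statistics

# Income column from the case matrix (thousands of $); -99 codes "DK, NA".
income_raw = [8, 35, 20, 12, 37, 14, 20, 18, -99, 11]
MISSING = -99

# Drop the missing-value code before computing any statistic.
income = [x for x in income_raw if x != MISSING]

income_mode = statistics.mode(income)
income_median = statistics.median(income)
income_mean = statistics.mean(income)
# pvariance / pstdev divide by N (not N - 1), matching the slides' formulas.
income_var = statistics.pvariance(income)
income_sd = statistics.pstdev(income)

print(income_mode, income_median, round(income_mean, 2),
      round(income_var, 2), round(income_sd, 2))
```

Leaving the -99 code in the list would badly distort the mean and variance, which is why missing-value codes are filtered out first.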
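Statistics

1. Mean

\[ \bar{X} = \frac{\sum_{i=1}^{N} X_i}{N} \quad \text{or} \quad \bar{X} = \frac{\sum_{i} x_i f_i}{N} \]

2. Variance

\[ \sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N} \quad \text{or} \quad \sigma^2 = \frac{\sum_{i} f_i (x_i - \bar{X})^2}{N} \]

In the second (frequency) form of each formula, the sum runs over the distinct values x_i, and f_i is the number of cases that take the value x_i.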
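The raw-data forms of the two formulas can be checked numerically on the Employed column of the case matrix. The sketch below is illustrative only (the variable names are not from the slides) and assumes the DK, NA code 9 is excluded, leaving N = 9 valid cases.

```python
import math

# Employed column from the case matrix; 9 codes "DK, NA".
employed_raw = [0, 0, 0, 1, 1, 1, 0, 0, 9, 0]
employed = [x for x in employed_raw if x != 9]

n = len(employed)
# Mean: sum of X_i divided by N.
mean = sum(employed) / n
# Variance: sum of squared deviations from the mean, divided by N.
variance = sum((x - mean) ** 2 for x in employed) / n
# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(round(mean, 2), round(variance, 3), round(std_dev, 3))
```

Working with the unrounded mean 3/9 gives a variance of exactly 2/9, about .222, and a standard deviation of about .471.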
Example: Employment

Calculate Mean

\[ \bar{X} = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{3}{9} = .33 \]

Calculate Variance

\[ \sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N} \]

\[ = \frac{(0-.33)^2 + (0-.33)^2 + (0-.33)^2 + (0-.33)^2 + (0-.33)^2 + (0-.33)^2 + (1-.33)^2 + (1-.33)^2 + (1-.33)^2}{9} = .222 \]

And the standard deviation is

\[ \sigma = \sqrt{.222} = .471 \]
How raw data maps into a frequency table

FREQUENCY TABLES

RELIGION

RELIGION      CODED VALUE   FREQUENCY
CATHOLIC      1             3
PROTESTANT    2             4
OTHER         3             2
DK, NA                      1
MODE          PROTESTANT (2)

CLASS

CLASS         CODED VALUE   FREQUENCY
LOWER         1             3
MIDDLE        2             4
UPPER         3             2
DK, NA                      1
MODE          MIDDLE (2)
MEDIAN        MIDDLE (2)

EMPLOYMENT

EMPLOYED      CODED VALUE   FREQUENCY
UNEMPLOYED    0             6
EMPLOYED      1             3
DK, NA                      1
MODE          UNEMPLOYED (0)
MEDIAN        UNEMPLOYED (0)
MEAN          1/3
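The mapping from raw codes to a frequency table, and from the table to the frequency form of the mean and variance, can be sketched in a few lines of Python. This is an illustration rather than the course's own procedure: `collections.Counter` is just one convenient way to tally codes, and the label dictionary and variable names are assumptions made for the example.

```python
from collections import Counter

# Code book for the Employed variable; 9 codes "DK, NA".
labels = {0: "UNEMPLOYED", 1: "EMPLOYED", 9: "DK, NA"}
employed_raw = [0, 0, 0, 1, 1, 1, 0, 0, 9, 0]

# Tally how often each coded value occurs in the raw data.
freq = Counter(employed_raw)
for code, label in labels.items():
    print(f"{label:<12}{code:>3}{freq[code]:>5}")

# Statistics use only the valid codes, so drop DK, NA.
valid = {code: f for code, f in freq.items() if code != 9}
n_valid = sum(valid.values())

# Mode: the valid code with the highest frequency.
mode_code = max(valid, key=valid.get)
# Frequency form: sum of x_i * f_i over distinct values, divided by N.
freq_mean = sum(code * f for code, f in valid.items()) / n_valid
# Frequency form: sum of f_i * (x_i - mean)^2, divided by N.
freq_var = sum(f * (code - freq_mean) ** 2 for code, f in valid.items()) / n_valid

print(mode_code, round(freq_mean, 2), round(freq_var, 3))
```

Because the frequency table only groups identical cases together, these results agree with the ones obtained from the raw data case by case.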
Example: Employment

Mean: (6*0 + 3*1) / 9 = 1/3

Variance:
\[ \sigma^2 = \frac{6(0 - .33)^2 + 3(1 - .33)^2}{9} = .222 \]

Standard Deviation:
\[ \sigma = \sqrt{.222} = .471 \]
Probability Theory

Simple Probability

The relative frequency with which an event occurs when an experiment is repeated many times.

Property: As N gets large,
\[ \frac{f(x)}{N} \to P(x) \]
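The limiting property can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not part of the lecture: it models a fair coin with Python's `random` module, the seed 4320 is arbitrary, and the function name is hypothetical.

```python
import random

def relative_frequency_of_heads(n_tosses, rng):
    """Toss a simulated fair coin n_tosses times and return f(H) / N."""
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# A seeded generator makes the run repeatable; the seed value is arbitrary.
rng = random.Random(4320)
for n_tosses in (10, 100, 10_000, 100_000):
    print(n_tosses, relative_frequency_of_heads(n_tosses, rng))
```

With few tosses the relative frequency can sit well away from 1/2; as the number of tosses grows it tends to settle near P(H) = 1/2.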
Example: Toss a coin 3 times

Outcome Space
{HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}.

Simple probability: The frequency of A divided by the total number of outcomes.
\[ P(A) = \frac{f(A)}{N} \]
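The outcome space and the counting definition of P(A) can be reproduced by enumeration. In this minimal sketch the event "exactly two heads" is chosen only for illustration, and the helper name `prob` is hypothetical; outcomes are assumed equally likely.

```python
from fractions import Fraction
from itertools import product

# All sequences of three tosses: HHH, HHT, ..., TTT (8 outcomes).
outcomes = ["".join(tosses) for tosses in product("HT", repeat=3)]

def prob(event):
    """P(A) = f(A) / N, counting the outcomes for which event(outcome) is True."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))

# Illustrative event A: exactly two heads.
p_two_heads = prob(lambda outcome: outcome.count("H") == 2)
print(outcomes)
print(p_two_heads)
```

Using `Fraction` keeps the result as an exact ratio (here 3/8) rather than a rounded decimal.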
Compound Events

A and B

Example: What is the probability of exactly two heads AND having the first toss be heads?
Definition: The relative frequency of the events that meet both criteria.
\[ P(A \text{ and } B) = \frac{f(A \text{ and } B)}{N} \]

A or B

Example: What is the probability of getting either exactly 2 heads or no heads at all?
Definition: The probability of A or B is the relative frequency of the events that meet either criterion.
\[ P(A \text{ or } B) = \frac{f(A \text{ or } B)}{N} \]
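Both compound-event examples can be worked out by counting over the three-toss outcome space. The following is a sketch under the assumption of equally likely outcomes; the predicate names are illustrative.

```python
from fractions import Fraction
from itertools import product

outcomes = ["".join(tosses) for tosses in product("HT", repeat=3)]
n = len(outcomes)

def two_heads(outcome):
    return outcome.count("H") == 2

def first_head(outcome):
    return outcome[0] == "H"

def no_heads(outcome):
    return outcome.count("H") == 0

# A and B: outcomes with exactly two heads AND a head on the first toss.
f_and = sum(1 for o in outcomes if two_heads(o) and first_head(o))
p_and = Fraction(f_and, n)

# A or B: outcomes with exactly two heads OR no heads at all.
f_or = sum(1 for o in outcomes if two_heads(o) or no_heads(o))
p_or = Fraction(f_or, n)

print(p_and, p_or)
```

Two outcomes (HHT, HTH) meet both criteria in the first example, giving 2/8 = 1/4; four outcomes (HHT, HTH, THH, TTT) meet either criterion in the second, giving 4/8 = 1/2.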
Conditional Probability (A given B)

Example: What is the probability that the first toss is a head given that there are exactly two heads?

Denote: P(1st H | 2H)

Definition:
\[ P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} \]
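The conditional-probability example can be evaluated directly from the definition. This sketch again assumes equally likely outcomes for three tosses; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

outcomes = ["".join(tosses) for tosses in product("HT", repeat=3)]
n = len(outcomes)

# B: exactly two heads. A and B: exactly two heads with a head on the first toss.
p_b = Fraction(sum(1 for o in outcomes if o.count("H") == 2), n)
p_a_and_b = Fraction(
    sum(1 for o in outcomes if o.count("H") == 2 and o[0] == "H"), n
)

# P(A | B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)
```

Conditioning restricts attention to the three outcomes with exactly two heads (HHT, HTH, THH); two of them start with a head, so the result is 2/3.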
Probability Tables

                        DIE
COIN     1      2      3      4      5      6      TOTAL
H        1/12   1/12   1/12   1/12   1/12   1/12   1/2
T        1/12   1/12   1/12   1/12   1/12   1/12   1/2
TOTAL    1/6    1/6    1/6    1/6    1/6    1/6    1

What is the joint probability of tossing a Head and rolling a 6?
\[ P(H \text{ and } 6) = \frac{f(H \text{ and } 6)}{N} = \frac{1}{12} \]

What is the probability of T given that you rolled a 6?
\[ P(T \mid 6) = \frac{P(T \text{ and } 6)}{P(6)} = \frac{1/12}{1/6} = \frac{1}{2} \]
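The probability table can also be built and queried programmatically. The sketch below is illustrative: it assumes the 12 coin-and-die combinations are equally likely, and the helper names `p_coin` and `p_die` are hypothetical.

```python
from fractions import Fraction
from itertools import product

sides = ["H", "T"]
faces = [1, 2, 3, 4, 5, 6]

# Joint table: every (coin, die) cell gets probability 1/12.
cells = list(product(sides, faces))
joint = {cell: Fraction(1, len(cells)) for cell in cells}

def p_coin(side):
    """Row total: marginal probability of a coin side."""
    return sum(p for (s, _), p in joint.items() if s == side)

def p_die(face):
    """Column total: marginal probability of a die face."""
    return sum(p for (_, f), p in joint.items() if f == face)

# Joint probability of a Head and a 6: read the cell directly.
p_h_and_6 = joint[("H", 6)]
# Conditional probability of T given a 6: cell divided by column total.
p_t_given_6 = joint[("T", 6)] / p_die(6)

print(p_h_and_6, p_die(6), p_t_given_6)
```

Joint probabilities are the cells of the table, marginal probabilities are the row and column totals, and a conditional probability is a cell divided by the total of the row or column being conditioned on.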
Independence

Definition: The occurrence of one event does not affect the probability of the other.

P(A|B) = P(A)

Example:

20% of the students play football, 50% play basketball, and 15% play both. How can we put this into a table?

                      BASKETBALL
FOOTBALL     YES      NO       TOTAL
YES
NO
TOTAL                          100%

What's the probability that a student selected at random will:

a) Play neither sport?

b) Play football or basketball?

c) Are playing basketball and football independent events?
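One way to work through the exercise is to fill in the 2x2 table from the three given percentages and then read the answers off it. The sketch below is a suggested solution path, not the slides' own worked answer; the variable names and table keys are illustrative.

```python
from fractions import Fraction

# Given in the example.
p_football = Fraction(20, 100)
p_basketball = Fraction(50, 100)
p_both = Fraction(15, 100)

# Fill the table cells: (plays football, plays basketball).
table = {
    (True, True): p_both,
    (True, False): p_football - p_both,
    (False, True): p_basketball - p_both,
}
# The cells must sum to 100%, which fixes the last one.
table[(False, False)] = 1 - sum(table.values())

# a) Neither sport: the (NO, NO) cell.
p_neither = table[(False, False)]
# b) Football or basketball: everything except the (NO, NO) cell.
p_either = 1 - p_neither
# c) Independent only if P(football and basketball) = P(football) * P(basketball).
independent = p_both == p_football * p_basketball
# Equivalent check via the definition P(A|B) = P(A).
p_basketball_given_football = p_both / p_football

print(float(p_neither), float(p_either), independent,
      float(p_basketball_given_football))
```

Under these figures, 45% play neither sport and 55% play at least one. The two events are not independent: P(basketball | football) = 75%, which differs from P(basketball) = 50%, and equivalently 15% differs from 20% x 50% = 10%.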