Sampling Distributions


The Binomial Distribution
A motivating example…
• 35% of Canadian university students work
more than 20 hours/week in jobs not related
to their studies. This can have a serious
impact on their grades. What is the
probability that I have at least 5 such
students in this class?
Answer: There is better than
a 99% chance!
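A minimal sketch of how that answer could be checked. The class size is not stated on this slide, so the value of 39 used here is an assumption (borrowed from a later slide), and each student is treated as an independent yes/no trial with p = 0.35.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n_students = 39   # ASSUMED class size, not given on the slide
p_work = 0.35     # proportion working more than 20 hours/week

# "At least 5" is the complement of "0, 1, 2, 3 or 4".
p_at_least_5 = 1 - sum(binom_pmf(k, n_students, p_work) for k in range(5))
print(f"P(X >= 5) = {p_at_least_5:.4f}")
```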
What is a Binomial Distribution?
•Any random count that can be cast in a
“yes/no” format where:
•there is a fixed number n of successive, independent choices
•“yes” has probability p and “no” has probability 1-p on every choice
fits a binomial distribution.
Suggest 3 other examples of data sets that
can be modeled as binomial distributions
Looking a bit deeper…
• Suppose someone offered you the following
“game”:
Toss a coin 5 times. If you get exactly 3 heads,
I pay you a dollar; otherwise you pay me 50 cents.
• Should you accept the bet?
• What is your expected return on this bet?
• How can we calculate the odds?
Pascal to the rescue!
There are exactly 10
ways to get 3 heads
What is the probability
of flipping 6 tails in 8 trials?
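The counting argument can be sketched in a few lines. This assumes a fair coin and reads "3 heads" as exactly 3 heads; the variable names are illustrative only.

```python
from math import comb

# Number of ways to place 3 heads among 5 tosses (an entry in Pascal's triangle).
ways_3_heads = comb(5, 3)

# Every sequence of 5 fair tosses is equally likely: 2**5 sequences in total.
p_win = ways_3_heads / 2**5

# Win $1.00 with probability p_win, lose $0.50 otherwise.
expected_return = p_win * 1.00 - (1 - p_win) * 0.50

# Same idea for 6 tails in 8 tosses.
p_6_tails_in_8 = comb(8, 6) / 2**8

print(ways_3_heads, p_win, expected_return, p_6_tails_in_8)
```

Under these assumptions the expected return per game is slightly negative, so the bet favours the person offering it.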
How to generate Pascal’s Triangle
•Pascal’s triangle “unlocks” the mystery of binomial
distributions
•The cells in the triangle represent binomial coefficients
which also represent all possible “yes/no” combinations
•In “math-speak” we use the following notation to
calculate the number of ways “k” events can occur in “n”
choices:
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
Factorial notation
5! = 5x4x3x2x1 = 120
How many ways can 3 people be selected from a class of 39?
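As a rough illustration, the triangle can be generated row by row (each interior cell is the sum of the two cells above it), and the factorial formula gives the same numbers. The helper names `pascal_rows` and `n_choose_k` are made up for this sketch.

```python
from math import comb, factorial

def pascal_rows(n_rows):
    """Yield rows 0 .. n_rows-1 of Pascal's triangle, each built from the previous row."""
    row = [1]
    for _ in range(n_rows):
        yield row
        # Interior cells are sums of adjacent cells in the row above.
        row = [1] + [a + b for a, b in zip(row, row[1:])] + [1]

def n_choose_k(n, k):
    """Binomial coefficient from the factorial formula n! / (k! (n-k)!)."""
    return factorial(n) // (factorial(k) * factorial(n - k))

rows = list(pascal_rows(6))
print(rows[5])             # the row for n = 5
print(n_choose_k(39, 3))   # ways to select 3 people from a class of 39
```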
Math detail (FYI)
• The general binomial probability is:
n k
P(k )    p (1  p)nk
k 
Example: B(9,0.4),what is P(5)?
• The Binomial Table is built from these terms
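A short sketch of the general formula applied to the B(9, 0.4) example, followed by the full column of terms that a binomial table would list for that n and p. The function name `binom_pmf` is illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 9, 0.4
p5 = binom_pmf(5, n, p)
print(f"P(5) = {p5:.4f}")

# One column of a binomial table: P(0), P(1), ..., P(n).
table_column = [binom_pmf(k, n, p) for k in range(n + 1)]
for k, prob in enumerate(table_column):
    print(k, f"{prob:.4f}")
```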
How to use the binomial distribution
• Assign “yes” and “no” and
their respective
probabilities to the
instances in your problem
•Assign “n” and “k” and
either use the formula, look
up in a table or use a stats
package (Excel works well)
•Example 5.5, solved 3 ways:
Look up in table
Use formula
Use Excel
$$P(3) = \binom{15}{3} (0.3)^{3} (0.7)^{12} = 0.1700$$
From Binomial to Normal Distributions
• Binomial is a discrete probability
distribution
• Normal is a continuous distribution
• When n becomes very large we can often
approximate by using a N(μ, σ) distribution with
$$\mu_X = np$$
$$\sigma_X = \sqrt{np(1-p)}$$
• How large is “large”?
Rule of Thumb: when np >= 10 and n(1-p) >= 10 we can use the
Normal Distribution approximation
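The rule of thumb can be sketched as a simple check, with an exact binomial probability compared against its normal approximation. The values n = 100, p = 0.3 and the cutoff of 25 are hypothetical, chosen only for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

def normal_approx_ok(n, p):
    """Rule of thumb: np >= 10 and n(1 - p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, p = 100, 0.3                 # hypothetical example values
mu = n * p                      # mean of X
sigma = sqrt(n * p * (1 - p))   # standard deviation of X

exact = binom_cdf(25, n, p)
approx = NormalDist(mu, sigma).cdf(25)   # no continuity correction here

print(normal_approx_ok(n, p), round(mu, 3), round(sigma, 3))
print(f"exact = {exact:.4f}, normal approx = {approx:.4f}")
```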
Sample Proportions…
• We often are interested in knowing the
proportion of a population that exhibits a
specific property (statistic). We denote this
the following way:
$$\hat{p} = \frac{\text{count of successes}}{\text{size of sample}} = \frac{X}{n}$$
• p is a proportion (often interpreted as a
probability) and is therefore a number
between 0 and 1
Mean and Standard Deviation of a Sample
Proportion
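• If p̂ is the proportion of “successes” in a
large SRS of size n, drawn from a population in which
the proportion of successes is p, then:
$$\mu_{\hat{p}} = p$$
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$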
Look at Example 5.7
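A small sketch of the two formulas for a sample proportion. The numbers used (p = 0.2, n = 1000, 240 observed successes) are hypothetical and only illustrate the arithmetic.

```python
from math import sqrt

def sample_proportion_stats(p, n):
    """Return (mean, standard deviation) of p-hat for an SRS of size n."""
    return p, sqrt(p * (1 - p) / n)

p, n = 0.2, 1000          # hypothetical population proportion and sample size
mean_p_hat, sd_p_hat = sample_proportion_stats(p, n)

x = 240                   # hypothetical count of successes in the sample
p_hat = x / n             # observed sample proportion

print(mean_p_hat, round(sd_p_hat, 5), p_hat)
```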
Working through some examples…
• 5.19: ESP
• A) ¼ = 0.25
• B) p(10)+p(11)+…+p(20) or… 1- [p(0)+…p(9)], this
can be read from Table C or done in EXCEL
• C) use $\mu_X = np$ and $\sigma_X = \sqrt{np(1-p)}$
• You would expect 5 correct choices with a standard
deviation of 1.936
• D) Since the subject knows that all 5 of the shapes are on the card
the choices are no longer random and hence a binomial model is
not appropriate – this was not the case in parts a-c
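Parts B and C can be reproduced with a few lines. The exercise text is not shown on the slide, so n = 20 is inferred from the sum running to p(20) and from the stated mean of 5; treat it as an assumption.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 20, 0.25   # n = 20 is inferred from the slide, not quoted from the exercise

# Part B: P(X >= 10) = 1 - [p(0) + ... + p(9)]
p_at_least_10 = 1 - sum(binom_pmf(k, n, p) for k in range(10))

# Part C: mean and standard deviation of the count of correct choices
mu = n * p
sigma = sqrt(n * p * (1 - p))

print(f"P(X >= 10) = {p_at_least_10:.4f}")
print(f"mean = {mu}, sd = {sigma:.3f}")
```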
• 5.21
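• A) just use $\mu_X = np$ and $\sigma_X = \sqrt{np(1-p)}$
• B) now use: $\mu_{\hat{p}} = p$ and $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$
• C) $z(\hat{p} = 0.24) = \frac{0.24 - 0.2}{0.01265} = 3.16$
• D) p = 0.01 → z = 2.33, use $z = \frac{X - \mu}{\sigma}$; $X = \mu + \sigma z$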
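The z-score steps in parts C and D can be sketched as follows. The standard deviation 0.01265 is taken from the slide; reading p = 0.01 as an upper-tail probability, and the helper name `x_from_z`, are assumptions of this sketch.

```python
from statistics import NormalDist

# Part C: z-score of an observed sample proportion of 0.24
p, sd_p_hat = 0.2, 0.01265            # values as given on the slide
z_c = (0.24 - p) / sd_p_hat

# Part D: z-value that leaves probability 0.01 in the upper tail
z_d = NormalDist().inv_cdf(1 - 0.01)

def x_from_z(mu, sigma, z):
    """Invert z = (X - mu) / sigma to get X = mu + sigma * z."""
    return mu + sigma * z

print(f"z_c = {z_c:.2f}, z_d = {z_d:.2f}")
print(f"p-hat cutoff = {x_from_z(p, sd_p_hat, z_d):.4f}")
```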
Odds on the Oil!
• In order to make the play-offs, the Oilers must win
12 of their remaining 17 games. What is the
probability that they will be successful? They
currently have won 33 of the past 63 games.
• Step 1: re-word as a binomial distribution question,
identify “n” and “k”
• Decide on what probabilities you will need to calculate
• Use either tables, Minitab or EXCEL
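One possible way to carry out these steps in code rather than with tables. It assumes each remaining game is an independent trial with win probability equal to the current record, 33/63, and that "successful" means winning at least 12 of the 17 games.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n_games = 17
k_needed = 12
p_win = 33 / 63     # ASSUMED per-game win probability, from the record so far

# P(X >= 12) = p(12) + p(13) + ... + p(17)
p_playoffs = sum(binom_pmf(k, n_games, p_win) for k in range(k_needed, n_games + 1))
print(f"P(at least {k_needed} wins in {n_games}) = {p_playoffs:.4f}")
```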
Odds on the Oil! – normal
approximation
• Let’s look at using the normal approximation to
solve this:
• In order to make the playoffs the Oilers must have a
better winning average than 33/63!
• However, at their current rate, how many of the 17
games do you expect them to win? What’s the standard
deviation of this?
• Determine a z-score from this and comment on the
likelihood of the Oilers’ success.
• Look at the sub-section “continuity correction” on pg
379 to help answer this.
• Should we expect this to give a reasonable answer?
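A sketch of the normal-approximation route with a continuity correction. It assumes a per-game win probability of 33/63 (the record of 33 wins in 63 games stated earlier) and takes "at least 12 wins" to start at 11.5 on the continuous scale.

```python
from math import sqrt
from statistics import NormalDist

n, p = 17, 33 / 63              # p is an assumption based on the record so far

mu = n * p                      # expected number of wins
sigma = sqrt(n * p * (1 - p))   # standard deviation of the number of wins

# Continuity correction: P(X >= 12) is approximated by P(Y >= 11.5).
z = (11.5 - mu) / sigma
p_approx = 1 - NormalDist().cdf(z)

# Rule of thumb for the approximation: np >= 10 and n(1 - p) >= 10.
rule_ok = n * p >= 10 and n * (1 - p) >= 10

print(f"mean = {mu:.3f}, sd = {sigma:.3f}, z = {z:.2f}")
print(f"P(X >= 12) approx {p_approx:.4f}; rule of thumb met: {rule_ok}")
```

With only 17 games the rule of thumb is not met, so the approximation should be treated with some caution.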
• 5.24
• Identify relevant statistics: n = 1500, p = 0.7
• A) $\mu_X = np = (1500)(0.70) = 1050$
• B) z = (1000-1050)/17.748 → better than 99% chance
• C) z = (1200-1050)/17.748 → NO CHANCE!
• D) $\mu_X = np = 1190$ and $\sigma_X = 18.89$; the chance that more than
1200 accept is now pretty good (probability = 0.2892)
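These numbers can be reproduced roughly as follows. In part D the sample size n = 1700 is inferred from np = 1190 with p = 0.7, and the continuity correction (1200.5) is assumed because it reproduces the quoted 0.2892; neither is stated on the slide.

```python
from math import sqrt
from statistics import NormalDist

def mean_sd(n, p):
    """Mean and standard deviation of a B(n, p) count."""
    return n * p, sqrt(n * p * (1 - p))

std_normal = NormalDist()

# Parts A-C: n = 1500, p = 0.7
mu, sigma = mean_sd(1500, 0.7)
z_b = (1000 - mu) / sigma
p_b = 1 - std_normal.cdf(z_b)        # P(X >= 1000)
z_c = (1200 - mu) / sigma
p_c = 1 - std_normal.cdf(z_c)        # P(X >= 1200)

# Part D: n = 1700 is INFERRED from np = 1190, not quoted from the exercise
mu_d, sigma_d = mean_sd(1700, 0.7)
z_d = (1200.5 - mu_d) / sigma_d      # continuity correction for "more than 1200"
p_d = 1 - std_normal.cdf(z_d)

print(f"mu = {mu:.0f}, sigma = {sigma:.3f}")
print(f"B) {p_b:.4f}  C) {p_c:.2e}  D) {p_d:.4f}")
```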
In conclusion…
• Be sure that you understand what a binomial
distribution is and when it can be applied
• Be able to use the probability equation on page
382
• Know how to read and apply a binomial probability
table (Appendix C)
• Know what Pascal’s triangle is and how it relates
to binomial distributions
• Be able to relate the binomial distribution to the
normal distribution and know when you can
approximate with a normal distribution z-score
analysis