Chapter 7 Introduction to Sampling Distributions


Hmm…déjà vu?
0 Statistic is a numerical descriptive measure of a
sample
0 Parameter is a numerical descriptive measure of a
population
So…let's do this
0 What are the symbols for the statistics mean, variance,
and standard deviation?
Something new
0 Proportion
0 $\hat{p}$ (p hat)
Note:
0 We are going from raw data distribution to a sampling
distribution
Note #2:
0 We often do not have access to all the measurements
of an entire population because of constraints on time,
money, or effort. So we must use measurements from
a sample.
Types of inference
0 1) Estimation: In this type of inference, we estimate
the value of a population parameter
0 2) Testing: In this type of inference, we formulate a
decision about the value of a population parameter
0 3) Regression: In this type of inference, we make
predictions or forecasts about the value of a statistical
variable
Sampling distribution
0 It is a probability distribution of a sample statistic
based on all possible simple random samples of the
same size from the same population
Group Work:
0 1) What is a population parameter? Give an example
0 2) What is a sample statistic? Give an example
0 3) What is a sampling distribution?
Answer
0 1) A population parameter is a numerical descriptive
measure of a population
0 2) A sample statistic, or statistic, is a numerical
descriptive measure of a sample
0 3) A sampling distribution is a probability distribution
for the sample statistic we are using
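As a rough illustration of the definition of a sampling distribution, the sketch below lists every possible simple random sample of size 2 from a tiny population and tabulates the resulting sample means. The population values and the sample size are made up for this sketch and are not from the textbook.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Hypothetical population (made up for illustration).
population = [2, 4, 6, 8, 10]
n = 2  # sample size

# Every simple random sample of size n (without replacement) is equally likely.
samples = list(combinations(population, n))
means = [Fraction(sum(s), n) for s in samples]

# Sampling distribution of the sample mean: value -> probability.
counts = Counter(means)
sampling_dist = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

for m, prob in sampling_dist.items():
    print(f"x-bar = {float(m):4.1f}  P = {prob}")

# Compare the mean of the sampling distribution with the population mean.
mean_of_means = sum(m * p for m, p in sampling_dist.items())
population_mean = Fraction(sum(population), len(population))
print("mean of x-bar distribution:", mean_of_means)
print("population mean:           ", population_mean)
```

In this small case the two printed means agree, which is what it means for the sample mean to be an unbiased statistic.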
Read pages 295-298
Homework Practice
0 Pg 298-299 #1-9
Central Limit Theorem
0 For a Normal Probability Distribution:
0 Let x be a random variable with a normal distribution
whose mean is $\mu$ and whose standard deviation is $\sigma$.
Let $\bar{x}$ be the sample mean corresponding to random
samples of size n taken from the x distribution. Then
the following are true:
0 A) The $\bar{x}$ distribution is a normal distribution
0 B) The mean of the $\bar{x}$ distribution is $\mu$
0 C) The standard deviation of the $\bar{x}$ distribution is $\sigma/\sqrt{n}$
Note:
0 We conclude from the previous theorem that when x has a
normal distribution, the $\bar{x}$ distribution will be normal for any
sample size n. Furthermore, we can convert the $\bar{x}$ distribution to
the standard normal z distribution by using these formulas:
0 $\mu_{\bar{x}} = \mu$
0 $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
0 $z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
0 Where n is the sample size
0 $\mu$ is the mean of the x distribution, and
0 $\sigma$ is the standard deviation of the x distribution
Example
0 Suppose a team of biologists has been studying human
height. Let x be the height of a single person. The group
has determined that x has a normal distribution with mean
$\mu = 5.6$ feet and standard deviation $\sigma = 1.5$ feet.
0 A) What is the probability that a single person taken at
random will be between 4.7 and 6.5 feet tall?
0 B) What is the probability that the mean height $\bar{x}$ of 5
people taken at random is between 4.7 and 6.5 feet?
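Answer
0 A) $z = \frac{x - \mu}{\sigma} = \frac{x - 5.6}{1.5}$
0 $P(4.7 < x < 6.5) = P\left(\frac{4.7 - 5.6}{1.5} < z < \frac{6.5 - 5.6}{1.5}\right) = P(-0.60 < z < 0.60)$
0 B) $z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{\bar{x} - 5.6}{1.5/\sqrt{5}}$
0 $P(4.7 < \bar{x} < 6.5) = P\left(\frac{4.7 - 5.6}{1.5/\sqrt{5}} < z < \frac{6.5 - 5.6}{1.5/\sqrt{5}}\right) \approx P(-1.34 < z < 1.34)$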
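To finish the height example numerically without a z table, here is a minimal sketch using `statistics.NormalDist` from the Python standard library. The variable names are chosen for this sketch only. Results may differ slightly from table lookups, which round z to two decimals.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 5.6, 1.5   # population mean and standard deviation (feet)
lo, hi = 4.7, 6.5      # interval of interest (feet)
n = 5                  # sample size for part B

# Part A: a single person, x ~ Normal(mu, sigma)
x_dist = NormalDist(mu, sigma)
p_single = x_dist.cdf(hi) - x_dist.cdf(lo)

# Part B: mean of n people, x-bar ~ Normal(mu, sigma / sqrt(n))
xbar_dist = NormalDist(mu, sigma / sqrt(n))
p_mean = xbar_dist.cdf(hi) - xbar_dist.cdf(lo)

print(f"A) P(4.7 < x < 6.5)     = {p_single:.4f}")
print(f"B) P(4.7 < x-bar < 6.5) = {p_mean:.4f}")
```

The probability in part B is larger because the $\bar{x}$ distribution has a smaller standard deviation than the x distribution, so more of it falls inside the same interval.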
Group Work
0 Suppose a team of analysts has been studying the size
of men's feet in a particular area. Let x represent the size of
a foot. They have determined that foot size has
a normal distribution with mean $\mu = 8.4$ inches and standard
deviation $\sigma = 3.1$ inches.
0 A) What's the probability that the foot of a single person
taken at random will be between 6 and 8 inches?
0 B) What's the probability that the mean foot size of 8 people
taken at random will be between 7 and 9 inches?
Standard error
0 Standard error is the standard deviation of a sampling
distribution. For the $\bar{x}$ sampling distribution,
0 Standard error = $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
Using the Central Limit Theorem to
convert the $\bar{x}$ distribution to
the standard normal distribution
0 $\mu_{\bar{x}} = \mu$
0 $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
0 $z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
0 Where n is the sample size ($n \ge 30$),
0 $\mu$ is the mean of the x distribution, and
0 $\sigma$ is the standard deviation of the x distribution
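The claim that sample means from a non-normal population behave approximately normally once n reaches 30 can be checked by simulation. The sketch below is only an illustration: the exponential population, the seed, and the number of repetitions are all assumptions made here, not part of the slides.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(7)  # fixed seed so the run is repeatable

# Assumed population: exponential with rate 1, which is skewed (not normal)
# and has mu = 1 and sigma = 1.
mu, sigma = 1.0, 1.0
n = 30                # sample size
num_samples = 20_000  # number of simulated samples

# One sample mean per simulated sample of size n
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

print(f"mean of x-bar values: {mean(sample_means):.3f} (theory: {mu:.3f})")
print(f"SD of x-bar values:   {stdev(sample_means):.3f} (theory: {sigma / sqrt(n):.3f})")

# Standardize each sample mean and see how many fall within 1 of zero.
# For a standard normal variable this share is about 0.68.
z_values = [(xb - mu) / (sigma / sqrt(n)) for xb in sample_means]
within_one = sum(abs(z) < 1 for z in z_values) / num_samples
print(f"share of |z| < 1:     {within_one:.3f}")
```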
Group Work: Central Limit
Theorem
0 A) Suppose x has a normal distribution with population mean 18 and
standard deviation 3. If you draw random samples of size 5 from the x
distribution and x bar represents the sample mean, what can you say about
the x bar distribution? How could you standardize the x bar distribution?
0 B) Suppose you know that the x distribution has population mean 75 and
standard deviation 12 but you have no info as to whether or not the x
distribution is normal. If you draw samples of size 30 from the x distribution
and x bar represents sample mean, what can you say about the x bar
distribution? How could you standardize the x bar distribution?
0 C) Suppose you didn't know that x had a normal distribution. Would you be
justified in saying that the x bar distribution is approximately normal if the
sample size were n=8?
Answer
0 A) Since x is given to be normal, the x bar
distribution will also be normal even though the sample
size is much less than 30. $z = \frac{\bar{x} - 18}{3/\sqrt{5}}$
0 B) Since the sample size is large enough, the x bar
distribution will be approximately normal.
$z = \frac{\bar{x} - 75}{12/\sqrt{30}}$
0 C) No, the sample size is too small. It needs to be 30 or more
Note:
0 A sample statistic is unbiased if the mean of its
sampling distribution equals the value of the
parameter being estimated
0 The spread of the sampling distribution indicates the
variability of the statistic. The spread is affected by
the sampling method and the sample size. Statistics
from larger random samples have smaller spread.
Read pages 304 and 305
Homework Practice
0 Pg 306-308 #1-18 eoe
Sampling Distributions for
Proportions
Think Back to Section 6.4
0 We dealt with normal approximation to the binomial.
0 How is this related to sampling distribution for
proportions?
0 Well in many important situations, we prefer to work
with the proportion of successes r/n rather than the
actual number of successes r in binomial experiments.
Sampling distribution for the
proportion $\hat{p} = r/n$
Given
0 n = number of binomial trials (fixed constant)
0 r = number of successes
0 p = probability of success on each trial
0 q = 1 - p = probability of failure on each trial
0 If np > 5 and nq > 5, then the random variable $\hat{p} = r/n$ can be
approximated by a normal random variable (x) with mean and
standard deviation
0 $\mu_{\hat{p}} = p$ and $\sigma_{\hat{p}} = \sqrt{pq/n}$
Standard Error
0 The standard error for the $\hat{p}$ distribution is the
standard deviation $\sigma_{\hat{p}} = \sqrt{pq/n}$
Where do these formulas
come from?
0 Remember $\mu$ is also known as the expected value or
average.
0 $\mu = \sum x P(x)$
0 So $\mu_{\hat{p}} = \frac{\mu_r}{n} = \frac{np}{n} = p$ (r is the number of successes)
0 $\sigma_{\hat{p}} = \frac{\sigma_r}{n} = \frac{\sqrt{npq}}{n} = \sqrt{\frac{pq}{n}}$
How to make continuity
corrections to $\hat{p}$ intervals
0 If r/n is the right endpoint of a $\hat{p}$ interval, we add
0.5/n to get the corresponding right endpoint of the x
interval
0 If r/n is the left endpoint of a $\hat{p}$ interval, we subtract
0.5/n to get the corresponding left endpoint of the x
interval
Example:
0 Suppose n=30 and we have a $\hat{p}$ interval from
15/30=0.5 to 25/30=.83. Use the continuity
correction to convert this interval to an x interval
Answer
0 0.5/30 = 0.02 (approx)
0 So the x interval is from .5-.02 to .83+.02, which is .48 to .85
0 $\hat{p}$ interval: .5 to .83
0 x interval: .48 to .85
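The continuity correction rule can be written as a small helper. This is a sketch only; the function name `phat_to_x_interval` is made up here. It works with unrounded values, so the left endpoint comes out as 0.4833 rather than the rounded .48 shown on the slide.

```python
def phat_to_x_interval(r_left: int, r_right: int, n: int) -> tuple[float, float]:
    """Convert the p-hat interval [r_left/n, r_right/n] to the corrected x interval."""
    correction = 0.5 / n
    # Subtract at the left endpoint, add at the right endpoint
    return r_left / n - correction, r_right / n + correction

left, right = phat_to_x_interval(15, 25, 30)
print(f"x interval: {left:.4f} to {right:.4f}")
```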
Group Work
0 Suppose n=50 and the $\hat{p}$ interval is .64 and 1.58. Find the
x interval
Word Problem
0 Annual cancer rate in L.A. is 209 per 1000 people. Suppose we
take 40 random people.
0 A) What is the probability p that someone will get cancer and
what's the probability q that they won't get cancer?
0 B) Do you think we can approximate $\hat{p}$ with a normal
distribution? Explain
0 C) What are the mean and standard deviation for $\hat{p}$?
0 D) What is the probability that between 10% and 20% of the
people will be cancer victims? Interpret the result
Answer
0 A) $p = \frac{209}{1000} = .209$ and $q = .791$
0 B) $np = 40(.209) = 8.36$ and $nq = 40(.791) = 31.64$
0 Since both are greater than 5, we can approximate $\hat{p}$ with a normal distribution
0 C) $\mu_{\hat{p}} = p = .209$
0 $\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(.209)(.791)}{40}} = 0.064$
0 D) Since the probability is for $.10 \le \hat{p} \le .20$, we need to convert into the x distribution
0 Continuity correction = 0.5/n = 0.5/40 = 0.0125, so we subtract .01 from the left endpoint and add .01 to the right endpoint
0 So $.09 \le x \le .21$, then convert this into z values.
0 $P\left(\frac{.09 - .209}{.064} \le z \le \frac{.21 - .209}{.064}\right)$, then use the table to find the probability
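The steps of this word problem can also be carried out in code. The sketch below uses `statistics.NormalDist` in place of a z table and keeps the unrounded continuity correction 0.0125, so its result can differ somewhat from a table lookup based on the rounded endpoints .09 and .21. Variable names are chosen for this sketch.

```python
from math import sqrt
from statistics import NormalDist

n, p = 40, 0.209
q = 1 - p

# Part B: the normal approximation is used when np > 5 and nq > 5
assert n * p > 5 and n * q > 5

# Part C: mean and standard deviation of p-hat
mu_phat = p
sigma_phat = sqrt(p * q / n)

# Part D: continuity correction widens the p-hat interval by 0.5/n on each side
correction = 0.5 / n
x_low, x_high = 0.10 - correction, 0.20 + correction

approx = NormalDist(mu_phat, sigma_phat)
prob = approx.cdf(x_high) - approx.cdf(x_low)

print(f"sigma of p-hat = {sigma_phat:.4f}")
print(f"P(0.10 <= p-hat <= 0.20) is approximately {prob:.4f}")
```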
Group work
0 In the OC, the general ethnic profile is about 47%
minority and 53% Caucasian. Suppose a company
recently hired 78 people. However, if 25% of the new
employees are minorities then there is a problem.
What is the probability that at most 25% of the new
hires will be minorities if the selection process is
unbiased and reflects the ethnic profile? (Follow the
steps of the last example)
Answer
0 $n = 78$, $p = .47$, $q = .53$
0 $\hat{p} = .25$
0 $np = 78(.47) = 36.66$ and $nq = 78(.53) = 41.34$
0 Since both are greater than 5, the normal approximation is appropriate
0 $\mu_{\hat{p}} = p = .47$
0 $\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(.47)(.53)}{78}} \approx .06$
0 Continuity correction 0.5/n = 0.5/78 = .006
0 Since .25 is the right endpoint, you add .006
0 $P(\hat{p} \le .25) = P(x \le .256)$
0 $= P\left(z \le \frac{.256 - .47}{.06}\right)$
Remember Control Chart?
0 Sketch what a control chart looks like
P-Chart
0 It is just like the control chart
How to make a P-Chart?
0 1. Estimate $\bar{p}$, the overall proportion of successes
0 $\bar{p} = \frac{\text{total number of observed successes in all samples}}{\text{total number of trials in all samples}}$
0 2. The center line is assigned to be $\mu_{\hat{p}} = \bar{p}$
0 3. Control limits are located at $\bar{p} \pm 2\sqrt{\bar{p}\bar{q}/n}$ (2 standard deviations) and
$\bar{p} \pm 3\sqrt{\bar{p}\bar{q}/n}$ (3 standard deviations), where $\bar{q} = 1 - \bar{p}$
0 4. Interpretation: out-of-control signals
0 A) any point beyond a 3 standard deviation level
0 B) 9 consecutive points on one side of the center line
0 C) at least 2 out of 3 consecutive points beyond a 2 standard deviation level
0 Everything is in control if no out-of-control signal occurs
Example situation:
0 Civics and Economics is taught in each semester. The
course is required to graduate from high school, so it
always fills up to its maximum of 60 students. The
principal asked the class to provide the control chart
for the proportion of A's given in the course each
semester for the past 14 semesters. Make the chart
and interpret the result.
Example:
Semester             1     2     3     4     5     6     7
r = # of A's         9    12     8    15     6     7    13
$\hat{p}$ = r/60   .15   .20   .13   .25   .10   .12   .22

Semester             8     9    10    11    12    13    14
r = # of A's         7    11     9     8    21    11    10
$\hat{p}$ = r/60   .12   .18   .15   .13   .35   .18   .17
Answer
0 $\bar{p} = \frac{\text{total number of observed successes in all samples}}{\text{total number of trials in all samples}}$ (this is the pooled
proportion of successes)
0 n = 60
0 $\bar{p} = \frac{147}{840} = .175$, $\bar{q} = .825$
0 $\sigma_{\hat{p}} = \sqrt{\frac{\bar{p}\bar{q}}{n}} = \sqrt{\frac{(.175)(.825)}{60}} = .049$
0 Control limits are: .077 and .273 (2 standard deviations)
0 .028 and .322 (3 standard deviations)
0 Then graph it! The y-axis is the proportion, and the x-axis is the sample number
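The P-chart calculation for the 14 semesters can be sketched in code as follows. The data are the A counts from the example. The variable names are chosen for this sketch, and signal C is assumed here to require the two points to lie on the same side of the center line. Plotting is left out; the sketch only computes the limits and checks the three signals.

```python
from math import sqrt

n = 60  # students per semester
r = [9, 12, 8, 15, 6, 7, 13, 7, 11, 9, 8, 21, 11, 10]  # A's per semester
p_hat = [ri / n for ri in r]

# Pooled proportion of successes and standard deviation of p-hat
p_bar = sum(r) / (n * len(r))
sigma = sqrt(p_bar * (1 - p_bar) / n)

limits_2sd = (p_bar - 2 * sigma, p_bar + 2 * sigma)
limits_3sd = (p_bar - 3 * sigma, p_bar + 3 * sigma)

# Signal A: any point beyond a 3 standard deviation level
signal_a = [i + 1 for i, ph in enumerate(p_hat)
            if ph < limits_3sd[0] or ph > limits_3sd[1]]

# Signal B: 9 consecutive points on one side of the center line
# (each entry is the semester where such a run ends)
sides = [1 if ph > p_bar else -1 if ph < p_bar else 0 for ph in p_hat]
signal_b = [i + 9 for i in range(len(sides) - 8)
            if abs(sum(sides[i:i + 9])) == 9]

# Signal C: at least 2 of 3 consecutive points beyond a 2 standard deviation
# level (assumed here to mean on the same side of the center line)
high = [ph > limits_2sd[1] for ph in p_hat]
low = [ph < limits_2sd[0] for ph in p_hat]
signal_c = [i + 3 for i in range(len(p_hat) - 2)
            if sum(high[i:i + 3]) >= 2 or sum(low[i:i + 3]) >= 2]

print(f"center line = {p_bar:.3f}, sigma = {sigma:.3f}")
print(f"2 SD limits: {limits_2sd[0]:.3f} to {limits_2sd[1]:.3f}")
print(f"3 SD limits: {limits_3sd[0]:.3f} to {limits_3sd[1]:.3f}")
print("Signal A at semesters:", signal_a)
print("Signal B ends at semesters:", signal_b)
print("Signal C ends at semesters:", signal_c)
```

On these data only semester 12, with 21 A's out of 60, lies beyond the 3 standard deviation limit, so that is the one out-of-control signal to interpret.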
Group Work
0 Mr. Liu went on a streak of asking 30 women out a
month for 11 months. Complete the table and make a
control chart for Mr. Liu and interpret his "game".
month
1
2
3
4
5
6
7
1
3
15
12
20
5
8
9
10
11
r=says yes 2
8
7
23
r=says yes 10
𝑝=r/30
month
𝑝=r/30
Homework Practice
0 P317-320 #1-13 eoo