Sampling Distributions
1
Outline
1. Review of last week
2. Sampling distributions
3. The sampling distribution of the mean
4. The Central Limit Theorem
5. Confidence intervals
6. Normal distribution example
7. Sampling distribution example
8. Confidence interval example
2
Review of last week
Last week, we learned how to use the Standard
Normal Distribution to work out the probability of
finding individual scores in some interval – e.g.,
what is the probability that the next Canadian
woman we meet is taller than 175 cm?
Today, we’re going to do the same sort of thing
with sample means rather than individual scores.
3
[Figure: a sample with mean $\bar{X}$ drawn from a population with mean µ]
The sampling distribution of a sample statistic (such as
$\bar{X}$) is the probability distribution of that statistic.
4
[Figure: four samples (Sample 1 to Sample 4), with means $\bar{X}_1$ to $\bar{X}_4$, each drawn from the same population with mean µ]
The sampling distribution of the mean consists of all
possible sample means, for all possible samples of size
n, that you could take from the population.
5
[Figure: distribution of sample means $\bar{X}$ for samples of size n, centered at $\mu_{\bar{X}}$]
When we draw a sample from a population, we are at
the same time drawing a sample mean from the
distribution of sample means for samples of size n.
6
Sampling distributions
The sampling distribution of a sample statistic is the
probability distribution of that statistic.
We can have sampling distributions of any sample statistic:
Statistic            Symbol
Mean                 $\bar{X}$
Median               M
Variance             $s^2$
Standard deviation   s
7
The sampling distribution of the mean
The sampling distribution of the sample mean $\bar{X}$:
$E(\bar{X}) = \mu_{\bar{X}} = \mu$
Variability of this distribution is given by the
standard error of the mean:
$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} \approx \dfrac{s}{\sqrt{n}}$
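As a rough illustration of the two formulas above, the following Python sketch draws many samples from a normal population and compares the mean and spread of the resulting sample means with µ and σ/√n. The population values (µ = 100, σ = 15), the sample size, and the function names are hypothetical, chosen only for the demonstration.

```python
import random
import statistics
from math import sqrt


def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples samples of size n from Normal(mu, sigma); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]


# Hypothetical population (not from the slides): mu = 100, sigma = 15, n = 25.
means = simulate_sample_means(mu=100, sigma=15, n=25, num_samples=5000)

print("theoretical standard error:", standard_error(15, 25))  # 15 / 5 = 3.0
print("mean of the sample means:  ", statistics.fmean(means))  # close to 100
print("SD of the sample means:    ", statistics.stdev(means))  # close to 3
```

The simulated values will not match the theory exactly, but they should get closer as the number of simulated samples grows.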
8
The Central Limit Theorem
Consider a random sample
of n observations from a
population with mean µ and
standard deviation σ.
When n is sufficiently large,
the sampling distribution of
$\bar{X}$ will be approximately
normal with mean $\mu_{\bar{X}} = \mu$
and standard error $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.
Note: this is true regardless
of the shape of the
underlying distribution of
raw scores
9
The Central Limit Theorem
The larger the sample size,
the better the approximation
to the normal distribution.
For most populations,
n ≥ 30 will be “sufficiently
large.”
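One way to see the effect of sample size is a small simulation. The sketch below samples from a strongly right-skewed population (an exponential distribution, chosen here purely as an assumed example) and measures how skewed the distribution of sample means is for n = 1, 5, and 30. The helper names are hypothetical.

```python
import random
import statistics


def skewness(values):
    """Population skewness: the mean cubed z-score (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - m) / sd) ** 3 for v in values)


def sample_means(n, num_samples, rng):
    """Means of num_samples samples of size n from an exponential population."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


rng = random.Random(1)
# Skewness of the simulated sampling distribution for each sample size.
skew_by_n = {n: skewness(sample_means(n, 4000, rng)) for n in (1, 5, 30)}

for n, sk in skew_by_n.items():
    print(f"n = {n:2d}: skewness of sample means = {sk:.2f}")
```

The skewness should shrink toward 0 as n increases, which is what "better approximation to the normal distribution" means in practice for this population.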
10
The Central Limit Theorem
When we draw a sufficiently large sample and measure
its mean, by the CLT, we may assume the sampling
distribution of the sample mean is approximately normal.
That means we can use the standard normal
distribution (SND) to work out the probability of
finding a sample mean in a given range relative to
the population mean.
11
[Figure: the sampling distribution of the sample mean $\bar{X}$, centered at $\mu_{\bar{X}}$]
12
The sampling distribution of the mean
We use the sampling distribution of the mean the
way we used the SND last week. We obtain
probabilities of finding sample means in a given
range relative to the population mean, for samples
of size n.
Don’t forget to use the standard error, $\sigma_{\bar{X}}$, rather
than the standard deviation, σ!
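To show why the choice between σ and the standard error matters, here is a minimal Python sketch using the standard library's `statistics.NormalDist`. The population (µ = 100, σ = 15), the sample size (n = 36), and the function name are hypothetical values picked for illustration, and the sampling distribution is assumed to be normal.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_between(lo, hi, mu, sigma, n):
    """P(lo <= X-bar <= hi) for samples of size n, using the standard error."""
    se = sigma / sqrt(n)  # standard error of the mean
    dist = NormalDist(mu, se)
    return dist.cdf(hi) - dist.cdf(lo)


# Hypothetical population (not from the slides): mu = 100, sigma = 15.
scores = NormalDist(100, 15)

# Probability that ONE score lies within 5 points of mu (uses sigma).
p_score = scores.cdf(105) - scores.cdf(95)

# Probability that the MEAN of 36 scores lies within 5 points of mu (uses sigma / sqrt(n)).
p_mean = prob_sample_mean_between(95, 105, mu=100, sigma=15, n=36)

print(f"single score within 5 of mu: {p_score:.4f}")
print(f"mean of 36   within 5 of mu: {p_mean:.4f}")
```

With these numbers the standard error is 15/6 = 2.5, so the same ±5 range spans two standard errors for the mean but only a third of a standard deviation for a single score.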
13
Confidence Intervals
There are two ways to estimate population
parameters such as the mean:
1. Point estimates, such as $\bar{X}$
2. Interval estimates, which tell us a range of
values that will contain the parameter with known
probability.
14
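[Figure: sampling distribution of $\bar{X}$ centered at $\mu_{\bar{X}}$, with an area of .45 between the center and each of Z = -1.645 and Z = +1.645]
90% of the time, $\bar{X}$ will fall within the range
Z = -1.645 to Z = +1.645.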
15
Confidence Intervals
If 90% of the time $\bar{X}$ falls in the range Z = -1.645 to
Z = +1.645 around the mean µ, then…
90% of the time, a range of the same width centered on
$\bar{X}$ must contain µ.
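The argument above can be checked by simulation. The sketch below repeatedly draws a sample, builds the range $\bar{X} \pm 1.645\,\sigma/\sqrt{n}$, and counts how often that range contains µ. The population (µ = 50, σ = 10), the sample size, and the function name are hypothetical.

```python
import random
from math import sqrt
from statistics import fmean


def coverage(mu, sigma, n, z, trials, seed=0):
    """Fraction of intervals X-bar +/- z * sigma / sqrt(n) that contain mu."""
    rng = random.Random(seed)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials


# Hypothetical population (not from the slides): mu = 50, sigma = 10, n = 25.
rate = coverage(mu=50, sigma=10, n=25, z=1.645, trials=4000)
print(f"intervals containing mu: {rate:.3f}")  # should be close to 0.90
```

The observed rate fluctuates from run to run (or seed to seed), but it should settle near .90 as the number of trials increases.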
16
Confidence Intervals
For a given α, the 100(1 − α)% Confidence Interval
for $\mu_{\bar{X}}$ is:
$\text{C.I.} = \bar{X} \pm Z_{\alpha/2}\,\sigma_{\bar{X}}$
$\text{C.I.} = \bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$
17
Confidence Intervals
When σ is not known and n is large (≥ 30), use s:
$\text{C.I.} = \bar{X} \pm Z_{\alpha/2}\,s_{\bar{X}}$
$\text{C.I.} = \bar{X} \pm Z_{\alpha/2}\,s/\sqrt{n}$
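The critical value $Z_{\alpha/2}$ in these formulas is normally read from a table. As a sketch of the same lookup in Python, the standard library's `NormalDist().inv_cdf` can be used; the function name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Z value that leaves alpha/2 in each tail, where alpha = 1 - confidence."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


# Critical values for a few common confidence levels.
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: Z = {z_critical(level):.3f}")
```

For 90% confidence this gives a value of about 1.645, and for 95% about 1.96, matching the usual table entries to the precision a table provides.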
18
Normal Distribution Example
The amount of time that students wait to be served
when buying coffee from the “Campus Perks”
coffee outlet is normally distributed with a mean of
62.0 seconds and a 98.5th percentile of 79.36
seconds. In a random sample of 30 students
buying coffee at Campus Perks, approximately
how many will wait between 40 and 58 seconds to
be served?
NOTE: This is not a question about a sample mean!
19
Normal Distribution Example
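[Figure: normal curve of waiting times centered at 62 seconds, with 40 and 58 marked below the mean and $P_{98.5}$ above it; the area below the mean is .50 and the area between the mean and $P_{98.5}$ is .4850]
Z for an area of .4850 = 2.17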
20
Normal Distribution Example
$\sigma = \dfrac{79.36 - 62}{2.17} = 8$
$Z_1 = \dfrac{40 - 62}{8} = -2.75$ (p = .4970 from table)
$Z_2 = \dfrac{58 - 62}{8} = -0.50$ (p = .1915 from table)
21
Normal Distribution Example
P(40 ≤ X ≤ 58) = .4970 - .1915 = .3055
The probability of any one student waiting between
40 and 58 seconds is .3055.
Therefore, in a random sample of 30, we expect
approximately .3055 (30) = 9.165 ≈ 9 students to
wait between 40 and 58 seconds.
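The hand calculation for this example can be cross-checked with a short Python sketch based on `statistics.NormalDist`. The variable names are illustrative, and small differences from the table-based figures are expected because the table rounds Z to two decimals.

```python
from statistics import NormalDist

mean_wait = 62.0  # mean waiting time (seconds), from the problem
p985 = 79.36      # 98.5th percentile (seconds), from the problem

# Z value with 98.5% of the area below it (about 2.17).
z985 = NormalDist().inv_cdf(0.985)

# Solve Z = (p985 - mean) / sigma for sigma (about 8 seconds).
sigma = (p985 - mean_wait) / z985

# Probability that one student waits between 40 and 58 seconds.
wait = NormalDist(mean_wait, sigma)
p_between = wait.cdf(58) - wait.cdf(40)

# Expected number of such students in a random sample of 30.
expected_count = 30 * p_between

print(f"sigma            = {sigma:.2f}")
print(f"P(40 <= X <= 58) = {p_between:.4f}")
print(f"expected count   = {expected_count:.2f}")
```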
22
Sampling Distribution Example
People’s reaction times (RTs) to a simple visual stimulus
are normally distributed with a mean of 500 milliseconds
and a standard deviation of 150 milliseconds. You believe
that people who go on a low-carb diet, however, will have
slower (longer) RTs than this, on average, though their
standard deviation will remain at 150. To test your belief,
you take a random sample of 40 people who self-report
having been on a low-carb diet for at least 6 months and
measure their RTs. You decide that your belief will be
supported if the mean RT of the low-carb group is 565
milliseconds or slower. What is the probability that you will
conclude that your belief has been supported even if a
low-carb diet actually has no effect on RTs whatsoever?
23
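[Figure: sampling distribution of the mean RT centered at 500 ms; the area at and above 565 ms is labeled "We want this probability"]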
You decide that your belief will be supported if the
mean RT of the low-carb group is 565 milliseconds or
slower. What is the probability that you will conclude
that your belief has been supported even if a low-carb
diet actually has no effect on RTs whatsoever?
24
Example 2
What is $P(\bar{X} \ge 565 \mid \mu = 500)$?
$Z = \dfrac{565 - 500}{150/\sqrt{40}}$
25
Example 2
What is $P(\bar{X} \ge 565 \mid \mu = 500)$?
$Z = \dfrac{565 - 500}{150/\sqrt{40}} = \dfrac{65}{23.72} = 2.74$
P for Z = 2.74 (from table) is .4969.
Therefore, the desired probability is .5 - .4969 = .0031.
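As a quick numerical check of this result, the sketch below computes the same upper-tail probability with `statistics.NormalDist`, using the standard error 150/√40 as the spread of the sampling distribution. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 500      # population mean RT (ms) if the diet has no effect
sigma = 150   # population standard deviation (ms)
n = 40        # sample size
cutoff = 565  # sample mean at or above which the belief counts as supported

se = sigma / sqrt(n)  # standard error of the mean, about 23.72
z = (cutoff - mu) / se

# Upper-tail probability of the sampling distribution of the mean.
p_support = 1 - NormalDist(mu, se).cdf(cutoff)

print(f"standard error    = {se:.2f}")
print(f"Z                 = {z:.2f}")
print(f"P(X-bar >= 565)   = {p_support:.4f}")
```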
26
Example 3
Two variables important to a professional football player
are speed and strength. Each year, camps are held to
determine potential players’ speed and strength, both of
which are continuous, normally-distributed, and
independent of each other. The middle 90% of strength
scores is bounded by 600 and 900 (on a composite
strength index). The average time to run 40 yards is 4.6
seconds, and 40 yard time exceeds 6 seconds only 5% of
the time.
a. In order to be considered by a team, a potential player
must not exceed the 75th percentile for time to run 40
yards. What is the slowest a player can run 40 yards and
still be considered?
27
[Figure: probability distribution for time to run 40 yards (seconds), centered at 4.6; the area between 4.6 and the unknown time X is .25, and the area between 4.6 and 6 is .45]
28
Example 3
$Z(.45) = 1.645 = \dfrac{6 - 4.6}{\sigma}$
$\sigma = \dfrac{6 - 4.6}{1.645} = .851$
29
Example 3
Now we can find X (the 75th percentile):
$Z(.25) = 0.675 = \dfrac{X - 4.6}{.851}$
$X = 0.675 \times .851 + 4.6 = 5.17$ (seconds)
30
[Figure: distribution of 40-yard times with 4.6, 5.17, and 6 seconds marked]
The 75th percentile for 40-yard times is 5.17 seconds.
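The two steps of part (a), recovering σ from the 95th percentile and then locating the 75th percentile, can be reproduced with `statistics.NormalDist`. This is only a sketch; the variable names are illustrative, and the Z values come from `inv_cdf` rather than a printed table.

```python
from statistics import NormalDist

mean_time = 4.6  # average 40-yard time (seconds), from the problem
p95_time = 6.0   # time exceeded only 5% of the time, i.e. the 95th percentile

std_normal = NormalDist()

# Step 1: solve Z(.95) = (6 - 4.6) / sigma for sigma.
sigma = (p95_time - mean_time) / std_normal.inv_cdf(0.95)

# Step 2: convert the 75th-percentile Z value back to seconds.
p75_time = mean_time + std_normal.inv_cdf(0.75) * sigma

print(f"sigma           = {sigma:.3f} seconds")
print(f"75th percentile = {p75_time:.2f} seconds")
```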
31
Example 3
Two variables important to a professional football player
are speed and strength. Each year, camps are held to
determine potential players’ speed and strength, both of
which are continuous, normally-distributed, and
independent of each other. The middle 90% of strength
scores is bounded by 600 and 900 (on a composite
strength index). The average time to run 40 yards is 4.6
seconds, and 40 yard time exceeds 6 seconds only 5% of
the time.
b. You take a random sample of 200 potential players.
What is the probability that the average strength score of
the sample is less than or equal to 740?
32
[Figure: probability distribution for strength scores, centered at µ = 750, with an area of .45 between the mean and each of the bounds 600 and 900]
33
Example 3
$Z = 1.645 = \dfrac{900 - 750}{\sigma}$
$\sigma = \dfrac{900 - 750}{1.645} = 91.19$
$Z = \dfrac{740 - 750}{91.19/\sqrt{200}} = -1.55$
34
Example 3
The area between Z = 0 and Z = -1.55 is .4394 (from table; by symmetry, the same as for Z = +1.55).
The tail probability is therefore $P(\bar{X} \le 740) = .5 - .4394 = .0606$.
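The calculation for part (b) can be sketched in Python as follows. It follows the worked solution in assuming that an area of .45 lies between the mean and each of the bounds 600 and 900 (so the bound 900 sits at Z = 1.645); that assumption, like the variable names, is spelled out in the comments.

```python
from math import sqrt
from statistics import NormalDist

lower, upper = 600, 900
mu = (lower + upper) / 2  # 750, the midpoint of the symmetric bounds
n = 200                   # sample size
target = 740              # sample mean of interest

# Assumption taken from the worked solution: .45 of the area lies between
# mu and each bound, so the upper bound is at the 95th percentile (Z = 1.645).
z_bound = NormalDist().inv_cdf(0.95)
sigma = (upper - mu) / z_bound  # about 91.19

se = sigma / sqrt(n)            # standard error of the mean
z = (target - mu) / se          # about -1.55

# Lower-tail probability of the sampling distribution of the mean.
p_below = NormalDist(mu, se).cdf(target)

print(f"sigma            = {sigma:.2f}")
print(f"Z                = {z:.2f}")
print(f"P(X-bar <= 740)  = {p_below:.4f}")
```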
35
[Figure: the sampling distribution of mean strength scores for samples with n = 200, centered at 750; the area below 740 is .0606]
What is the probability that the mean for a sample of
200 players is less than this value?
36
Confidence Interval Example
A researcher samples 36 undergraduates from a
local university and finds it took them 36.4 days,
on average, to find a job, with a standard
deviation of 8 days. Use these data to form a 96%
confidence interval for the true mean time it takes
for graduates to find a job.
NOTE: We are not given the population standard deviation
37
Confidence Interval Example
Recall:
$\text{C.I.} = \bar{X} \pm Z_{\alpha/2}\,s_{\bar{X}} = \bar{X} \pm Z_{\alpha/2}\,s/\sqrt{n}$
$\bar{X} = 36.4$
$s = 8$
$n = 36$
$s/\sqrt{n} = 8/6 = 1.33$
38
Confidence Interval Example
100(1 − α)% = 96%, so α = .04 and α/2 = .02; this is the tail
probability.
A tail probability of α/2 = .02 leaves an area of .48 between 0 and Z, and the table gives $Z_{.48} = 2.05$.
C.I. = 36.4 ± 2.05 (1.33)
(33.67 ≤ µ ≤ 39.13)
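For comparison with the table-based interval, here is a minimal Python sketch of the same computation. The variable names are illustrative; because `inv_cdf` returns a Z value with more decimals than the table's 2.05, the limits may differ from the hand calculation in the second decimal place.

```python
from math import sqrt
from statistics import NormalDist

xbar = 36.4        # sample mean (days)
s = 8.0            # sample standard deviation (days), used in place of sigma
n = 36             # sample size (large enough to use Z with s)
confidence = 0.96

alpha = 1 - confidence
z = NormalDist().inv_cdf(1 - alpha / 2)  # leaves alpha/2 = .02 in the upper tail

se = s / sqrt(n)   # estimated standard error, 8 / 6
margin = z * se

ci_lower = xbar - margin
ci_upper = xbar + margin

print(f"Z               = {z:.3f}")
print(f"standard error  = {se:.2f}")
print(f"96% C.I.        = ({ci_lower:.2f}, {ci_upper:.2f})")
```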