Chapter 7: The Distribution of Sample Means

Download Report

Transcript Chapter 7: The Distribution of Sample Means

COURSE: JUST 3900
INTRODUCTORY STATISTICS
FOR CRIMINAL JUSTICE
Chapter 7:
The Distribution of Sample Means
Instructor:
Dr. John J. Kerbs, Associate Professor
Joint Ph.D. in Social Work and Sociology
Samples and Populations



Samples provide an incomplete picture of the
population.
There are aspects of the population that may not be
included within a sample.
The sampling error is the natural discrepancy (i.e.,
the difference), or amount of error, between a sample
statistic and its corresponding population parameter.
 The sampling error is the measure of the
discrepancy (i.e., difference) between the sample
and the population.
A Sampling Distribution

A Sampling Distribution is a distribution of
statistics obtained by selecting all of the
possible samples of a specific size (n) from
a population.
The Distribution of Sample Means

The Distribution of Sample Means is defined
as the set of sample means for all of the
possible random samples of a particular size
(n) that can be selected from a specific
population.
Often called the Sampling Distribution of M
 This distribution has well-defined (and predictable)
characteristics that are specified in the Central
Limit Theorem

The Distribution of Sample Means

The three characteristics of the Distribution of Sample
Means
 1. Sample means should pile up around the population
mean
 2. The pile of sample means should tend to form a normalshaped distribution. They should pile up in the center of
the distribution (around μ) and the frequencies should
taper off as the distance between M and μ increases.
 3. In general, the larger the sample, the closer the sample
means should be to the population mean (μ).



Larger samples are more representative of the population than
smaller samples
Sample means obtained with large samples (i.e., a large n) should
cluster relatively close to the population parameter
Means obtained by small samples should be more widely scattered
The Central Limit Theorem
 The
Central Limit Theorem is defined as
follows:
 For any population with a mean (μ) and
standard deviation (σ), the distribution of
sample means for sample size n will have a
mean of μ and a standard deviation of σ
and will approach a normal distribution as
n approaches infinity (∞).
The Central Limit Theorem
1. The Expected Value of M is the mean of the distribution
of sample means and the Expected Value of M is always
equal to the mean of the population of scores (μ).
2. The shape of the distribution of sample means tends to
be normal. It is guaranteed to be normal if either a) the
population from which the samples are obtained is
normal, or b) the sample size n ≥ 30.
3. The standard deviation of the distribution of sample
means is called the Standard Error of M (σM) and is
computed by the following:
The Expected Value of M
If two (or more) samples are selected from the
same population, the two samples probably
will have different means.
 Although the samples will have different
means, you should expect the sample mans to
be close to the population mean.
 The mean of the distribution of the sample of
means is equal to the mean of the population
of scores (μ): that is the expected value of M.

Standard Error of M



The standard error (also known as the standard deviation of
the distribution of sample means, σM) provides a measure of
the average distance between M (sample mean) and μ
(population mean).
σM describes the distribution of sample means (variability)
 σM shows how much distance is expected between M and μ
Law of large numbers: The larger the sample size (n), the
more probable or likely it is that M is close to μ.
 Inverse relationship: the larger the sample size, the smaller
the stander error.
 Small standard errors indicate that sample means are close
together (large standard errors indicate that means are
scattered over a large range with larger difference from one
sample to another)
The Standard Error of M
The standard error of M is defined as the
standard deviation of the distribution of sample
means and measures the standard distance
between a sample mean and the population
mean.
 Thus, the Standard Error of M provides a
measure of how accurately, on average, a
sample mean represents its corresponding
population mean.

The Standard Error of M
Consider the changes in Standard
Error of M as n increases from 1 to 4
and then to 100 for a normal
population with a mean of 80 (μ=80)
and a standard deviation of 20 (σ=20)
Do NOT confuse
“standard deviations”
with
“standard errors”
Difference Between Standard Deviations
and Standard Errors



Standard Deviation measures the distance between a score
and the population mean
 X-μ
The Standard Error measures the distance between a sample
mean and the population mean
 M–μ
The Standard Error (σM) is the same as the Standard
Deviation for n = 1
 Note: there is only one population mean
Probability and Sample Means
Because the distribution of sample means
tends to be normal, the z-score value obtained
for a sample mean can be used with the unit
normal table to obtain probabilities.
 The procedures for computing z-scores and
finding probabilities for sample means are
essentially the same as we used for individual
scores

Probability and Sample Means
(cont'd.)
However, when you are using sample means,
you must remember to consider the sample
size (n) and compute the standard error (σM)
before you start any other computations.
 Also, you must be sure that the distribution of
sample means satisfies at least one of the
criteria for normal shape before you can use
the unit normal table:

1. the population from which the samples are
obtained is normal, or
 2. the sample size (n) is 30 or more.

z-Scores and Location within the
Distribution of Sample Means

Within the distribution of sample means, the
location of each sample mean can be specified
by a z-score:
(M – μ)
z = ───── or
σM
z =
(M – μ)
─────
(σ/√n)
z-Scores and Location within the
Distribution of Sample Means
(Continued)
As always, a positive z-score indicates a
sample mean that is greater than μ and a
negative z-score corresponds to a sample
mean that is smaller than μ.
 The numerical value of the z-score indicates
the distance between M and μ measured
in terms of the standard error.

Distribution for Sample Means
(n = 25, μ = 500, σ = 100)
A score of 540 is two
standard errors above
the mean (z=+2.00),
which is very unlikely
(see Unit Normal Table
for z = +2.00,
p = 0.0228)
2.28%
More Thoughts on
Standard Error


Standard errors are nothing more than measures of reliability.
Vogt (2005, p. 274) defines reliability as follows:
 Freedom from measurement (random) error. In practice,
this boils down to consistency or stability of a measure or
test or observation from one use to the next. When
repeated measures of the same thing give highly similar
results, the measurement instrument is said to be reliable.
 Small standard errors indicate that sample means are
close together and so researchers can be fairly
confident that an individual sample mean can act as a
reliable measure of the population mean
 Large standard errors indicate problems with reliability