Sample Design


 No matter how sophisticated the
statistical techniques may be,
reasonably accurate inferences
about a population’s characteristics
can’t be made without a sample that
is:
 representative of the population, and
 large enough.
 What is inferential statistics?
Producing Data

Anecdotal evidence: based on haphazardly selected
cases which often come to our attention because they
are striking in some way.

Available data: data that were produced in the past
for some other purpose but that may help answer a
present question.
 Statistical vs. non-statistical designs for producing
data: e.g., sample surveys or experiments vs. ethnographies
or other case studies.
 Statistical designs for producing data: these rely
on either experiments or observation based on
sampling.
 Sampling: selects part of a population (a
sample) to gain information about the whole
population.
 A sample survey is a kind of observational
study because it does not attempt to impose
treatments that influence responses.
 Census: attempts to collect data on every
individual in a population.
 What are the advantages of a sample survey
versus a census?
 Experiments: deliberately impose some
treatment on individuals to observe their
responses.
 Experimental design, when carried out
correctly, is the most effective of all
research designs in controlling the effects of
lurking variables. Why?
 Because experimental design is
intrinsically comparative.
But experimental results typically
cannot be generalized. Why not?

The Principles of
Experimental Design
 The main principles of experimental
design are control, randomization, &
replication.
 A fundamental research challenge is to
minimize bias.
 Bias: systematic over- or
underestimation of values, i.e. the systematic
favoring of higher or lower values.
 Bias usually can’t be detected by
inspecting a given data set by itself.
Why not?
 Properly done, experimental design is
the most effective way of minimizing
bias.
 Control: comparative design makes
it more likely—when carried out
correctly—that influences other than
the experimental variable operate
equally on all experimental units (or
subjects).
 That is, it makes it more likely that
lurking variables operate equally on all
experimental units.
 Control, then, reduces bias.
Randomization (i.e. randomized
assignment): enables impersonal
chance to allocate experimental units
(or subjects) to the treatment or
control groups.

 If done correctly, it makes it more likely
that influences other than the
experimental variable operate equally
on all experimental units (or subjects).
 Randomization, then, reduces bias,
as does comparative design.
Replication: make sure that the
number of experimental observations is
large enough to permit a result that is
statistically significant (i.e. an
observed effect so large that it would
rarely occur by chance).

 Why, though, can’t experimental results
typically be generalized?
 Because a properly conducted
experiment randomizes assignment to
the treatment and control groups, but
it does not randomly sample from a
wider population.
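
The randomization principle described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the subject IDs, the group sizes, and the function name randomize_assignment are hypothetical, and the course itself works in Stata. The point is that impersonal chance, not the researcher, decides who receives the treatment.

```python
import random

def randomize_assignment(units, seed=None):
    """Shuffle the units by chance and split them into two groups."""
    rng = random.Random(seed)      # seeded so the allocation is replicable
    shuffled = list(units)         # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical pool of 20 experimental subjects.
subjects = [f"subject_{i:02d}" for i in range(1, 21)]
treatment, control = randomize_assignment(subjects, seed=123)

print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```

Note that the sketch randomizes only the assignment of the 20 available subjects. It does not draw those subjects at random from a wider population, which is why the results may not generalize.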
Summary: Experimental Design
 What are the basic features of
experimental design?
 How is experimental design intrinsically
comparative?
 In what ways is it the most effective
research design?
 What is statistical significance (versus
practical or theoretical significance)?
Potential Limitations of
Experimental Designs
 What are the potential limitations of
experimental designs?
 What critical questions need to be asked
when assessing an experimental study?
The Principles of Sample
Design
Sample Design
 Population: the entire group of individuals (i.e.
phenomena or entities) that we are trying to
understand.
 Sample: the part of the population that we actually
examine in order to obtain information.
 Sample design: the method used to choose the
sample from the population. Inadequate sample
design & execution lead to misleading conclusions by
creating bias in the selection of observations.

There are two kinds of error:
(1) Bias: non-random (i.e. systematic)
error—favors either higher or lower values.
(2) Variability: random error, which is just as likely
to involve higher as lower values & thus
has a neutral effect on central tendency.

We want to minimize variability, but bias
is a greater concern: Why?
The Two Basic Types of Samples
 Voluntary (i.e. non-probability) sample: consists of
people who choose themselves by responding to a
general appeal. Because it is not a probability sample,
it is typically highly biased.
 Probability sample: chosen by impersonal chance.
We must know what samples are possible & what
chance, or probability, each possible sample has. If
conducted correctly, bias is minimized.
Basic Types of Probability Sampling
Design
 Sampling relies on a sample frame: a list of
elements from which a probability sample is
drawn.
 Simple random sample: consists of n
individuals from the population chosen so that
every set of n individuals (i.e. units) has an equal
chance to be the sample actually selected.
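 Systematic random sample: randomize the list
of elements to choose from; compute the sampling
interval k (population size divided by sample size);
select a random starting point between 1 and k; and
then choose every kth element after it for inclusion in
the sample. With a randomized list, the result
is virtually identical to simple random sampling.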
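
As a rough illustration of the two designs just described, here is a minimal Python sketch. The sample frame of 1,000 numbered elements, the sample size of 100, and the function names are assumptions made for the example; the course itself works in Stata.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every set of n elements has the same chance of being selected."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Random start within the first k elements, then every kth element."""
    k = len(frame) // n           # sampling interval
    start = rng.randrange(k)      # random position among the first k elements
    return [frame[start + i * k] for i in range(n)]

rng = random.Random(123)
frame = list(range(1, 1001))      # hypothetical sample frame: elements 1..1000

srs = simple_random_sample(frame, 100, rng)
systematic = systematic_sample(frame, 100, rng)

print("simple random:", sorted(srs)[:5], "...")
print("systematic:   ", systematic[:5], "...")
```

In this sketch the frame is left in its original order so that the fixed interval is easy to see. For the systematic sample to behave like a simple random sample, the list would first be randomized, as the slide says.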

Stratified random sample:
(1) Divides the population into groups of
similar individuals (i.e. phenomena or
entities), called strata.
(2) Chooses a separate simple random sample
within each stratum & combines them to
form the full sample.
 Stratified random sampling is based on
an exhaustive list of the target
population.
 The sample is proportional if the
proportions of the sample chosen in the
various strata are the same as those existing
in the population.
 Required statistical correction: see Stata
manual.
 According to Freedman et al., Statistics, we
shouldn’t exaggerate the benefits of
stratification in reducing a sample’s variance.
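
A minimal Python sketch of proportional stratified sampling, as described above. The frame of 1,000 students, the three strata, and the 10% sampling fraction are hypothetical, and the function name is an assumption; this is an illustration, not the Stata procedure the course uses.

```python
import random
from collections import Counter, defaultdict

def proportional_stratified_sample(frame, stratum_of, fraction, rng):
    """Draw a separate simple random sample of the same fraction in each stratum."""
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        n_h = round(len(members) * fraction)   # stratum sample size
        sample.extend(rng.sample(members, n_h))
    return sample

# Hypothetical exhaustive list: (student id, stratum) pairs.
frame = (
    [(i, "lower_division") for i in range(600)]
    + [(i, "upper_division") for i in range(600, 900)]
    + [(i, "graduate") for i in range(900, 1000)]
)

rng = random.Random(123)
sample = proportional_stratified_sample(frame, lambda u: u[1], 0.10, rng)
print(Counter(stratum for _, stratum in sample))
```

Because the same fraction is drawn in every stratum, the strata appear in the sample in the same proportions as in the frame (here 60, 30, and 10 of 100).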

Multistage cluster sample:
(1) Selects successively smaller groups (such as
geographic units) within the population in stages,
resulting in a sample consisting of clusters (i.e.
groups) of individuals (i.e. phenomena or entities).
(2) Samples from the clusters (i.e. not all clusters
end up providing sample observations for the
study).
(3) Samples observations only from within the
sample-selected clusters.
 Each stage in a multi-stage cluster sample
may employ a simple random sample, a
stratified random sample, or some other
type of sample.
 Sample observations are drawn from
within the sample-selected clusters
only, not from within every cluster.
 Cluster sampling is used when it’s
impossible or impractical to compile or
observe an exhaustive list from the target
population.
Stratified vs. Cluster Sampling
 In stratified sampling, a sample is drawn
within every stratum, and the strata are
the groups compared.
 In cluster sampling, clusters are ways of
identifying groups of observations, but a
sample is not drawn from within every
cluster.
 See, e.g., Agresti & Finlay, Statistical
Methods for the Social Sciences, pages 26-29.
 Multistage cluster sample example – U.S.
Census:
 Divide U.S. into geographic areas within
states: primary sampling units (PSUs).
 Divide each PSU into smaller geographic units
(census blocks), then stratify the blocks by
ethnic and other data.
 Take a stratified sample of census blocks.
 Sort each census block’s housing units into
clusters of four nearby units.
 Interview the households in a probability
sample of these clusters.
 While a multistage cluster sample is the most
commonly used sophisticated sample, it involves a
key statistical problem: the observations within
each sampled cluster tend to be more alike than
are the observations between the sampled
clusters.
 This matters because inferential statistics
assumes that the individuals (i.e. units or
observations) are sampled not only randomly but also
independently from each other, and clustering violates
that independence.
 This reduces the amount of statistical information
about the sample’s variability.
 Required statistical correction: In Stata, type
‘help svy’.
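
The selection logic of a two-stage cluster sample can be sketched in a few lines of Python. Everything concrete here is an assumption for illustration: 50 hypothetical census blocks of 40 households each, 5 blocks selected, and 8 households sampled per selected block. The sketch shows selection only; it does not perform the statistical correction that the Stata svy commands handle.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Stage 1: sample clusters. Stage 2: sample units within chosen clusters only."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {c: rng.sample(clusters[c], n_per_cluster) for c in chosen}

# Hypothetical frame: block id -> list of household ids in that block.
clusters = {
    block: [f"block{block:02d}_hh{h:02d}" for h in range(40)]
    for block in range(50)
}

rng = random.Random(123)
sample = two_stage_cluster_sample(clusters, n_clusters=5, n_per_cluster=8, rng=rng)

for block, households in sorted(sample.items()):
    print(block, households[:3], "...")
```

Only 5 of the 50 blocks contribute any observations, which is the contrast with stratified sampling, where every stratum is sampled.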
Sources of Non-Probability Bias in
Probability Samples
Even well-designed & well-executed
probability samples can suffer from bias
due to non-probability problems of:
 Undercoverage: the sample frame (i.e.
list of elements from which the probability
sample is drawn) does not adequately cover
all relevant categories of elements.
 Nonresponse: especially if non-randomly
distributed.
 Response bias: due to traits/behavior of
the interviewer/researcher or
respondent/subject
 Poorly worded questions or problems
due to order of questions
Descriptive vs. Inferential Statistics
 Descriptive statistics: summarizes the
data.
 Inferential statistics: makes inferences
from a sample to a population.
Toward Statistical Inference
Statistical inference: based on impersonal chance,
we use data on sampled individuals (i.e. phenomena
or entities) to infer conclusions about the wider
population.
 Parameter: a number that describes a
population. It is a fixed number, but in practice we
usually don’t know its value.
 Statistic: a number that describes a sample. A
statistic’s value is known when we’ve taken a
sample, but it can vary from sample to sample. We
often use a statistic to estimate an unknown
parameter. This is called inferential statistics.
 A parameter is what we want to know about
in a population: e.g., we want to know about
the cholesterol levels of adults between the
ages of 25 & 64 in South Florida.
 A statistic is what we’ll learn from a random
sample: e.g., the cholesterol levels of
randomly sampled adults between the ages
of 25 & 64 in South Florida.
 Sampling variability: the value of a statistic
varies with repeated random sampling of the
same size from the same population.
 All of statistical inference is based on one
idea: to see how trustworthy a procedure
is, ask what would happen if it were
repeated independently many times.
See Freedman et al., Statistics.
That is, ask what would happen if we took
many independent random samples of the
same size from the same population.
 Take a large number of random samples of the same
size from the same population.
 Compute the mean for each sample.
 Make a histogram of the values of the sample
means.
 Examine the distribution’s shape, center, & spread.
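
The four steps above can be tried out with a small simulation. This is a minimal Python sketch under stated assumptions: the population of 100,000 cholesterol values is made up (roughly normal, mean 200, standard deviation 40), and the sample size, number of samples, and function names are chosen only for illustration.

```python
import random
import statistics

rng = random.Random(123)

# Hypothetical population of cholesterol values (an assumption, not real data).
population = [rng.gauss(200, 40) for _ in range(100_000)]

def sample_means(population, n, reps, rng):
    """Take `reps` random samples of size n and return each sample's mean."""
    return [statistics.fmean(rng.sample(population, n)) for _ in range(reps)]

def text_histogram(values, bin_width):
    """Print a crude histogram: one '#' per 5 values in each bin."""
    counts = {}
    for v in values:
        left = int(v // bin_width) * bin_width
        counts[left] = counts.get(left, 0) + 1
    for left in sorted(counts):
        print(f"{left:>4}-{left + bin_width:<4} {'#' * (counts[left] // 5)}")

means = sample_means(population, n=100, reps=1000, rng=rng)

text_histogram(means, bin_width=2)
print("center (mean of sample means):", round(statistics.fmean(means), 1))
print("spread (SD of sample means):  ", round(statistics.stdev(means), 1))
```

The histogram approximates the sampling distribution of the mean for samples of size 100. Its center should sit near the population mean, and its spread should be far narrower than the spread of the individual cholesterol values.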
 E.g., a medical researcher wants to
estimate the cholesterol levels of South
Florida adults ages 25-64.
 Let’s say that, as a start, the researcher
measures the cholesterol values of 500
randomly sampled adults ages 25-64 (sampled
by place of residence through a
door-to-door random-sample survey).
 This, however, is only one sample.
 What would happen if we repeated
the random sample independently
over & over again with the same
size & in the same population?
 The sampling distribution of a statistic is
the distribution of values taken by the statistic
in all possible random samples of the same
size from the same population.
 It’s a way of conceptualizing what
distribution would emerge if we could see all
possible samples of the same size-n from the
same population.
 Regarding the cholesterol study, what if
the researcher could draw all possible
random samples of n=100 from South
Florida’s population of adults ages 25-64?
 What would be the shape, center, &
spread (including outliers) of the resulting
distribution of cholesterol means from each
of the random samples, lined up together?
 Thinking about sampling distribution
helps us to understand what we’re trying
to accomplish in drawing one or more
actual random samples of size-n from the
same population: big picture versus little
picture.
 Later in the course we’ll see its crucial
importance for tests of statistical
significance.
 How do we begin to connect the little
picture of the actual random samples of
size-n to the big picture of the conceptual
ideal of the sampling distribution?
 We do so by using a histogram to
describe the means of each actual
random sample of size-n, lined up
together.
 E.g., regarding the cholesterol levels of the
South Florida population of adults ages 25-64,
the researcher obtains a huge amount of
funding to take 1000 random samples of size
100.
 The variable, of course, is cholesterol
value: in each random sample of size 100,
what’s the mean cholesterol level?
 Each time we take a random sample of
100, we compute the sample’s mean &
standard deviation for cholesterol level.
 And as we accumulate samples, we
examine them together in one histogram
to find out how much the sampled means
of cholesterol either converge toward the
center or spread out.
 At last, we have our total of 1000 random
samples of size 100 from the South Florida
population of adults ages 25-64.
 According to the histogram, what’s the
shape, center, & spread (including outliers)
of the mean cholesterol levels for each of
the 1000 random samples of size 100, lined
up together?
 In short, we draw a random
sample to obtain a statistic that
we’ll use to estimate the unknown
parameter (i.e. the unknown
population value).
 Our objective is not to do
descriptive statistics, but rather to
do inferential statistics.
 Before we move on, what
could we do to make the
distribution of sample means
less variable (i.e. reduce the
distribution’s standard
deviation)?
 Make sample size-n notably
larger.
 Let’s say that the researcher
manages to do a survey of 100, but
then gets funding to re-do the survey
increasing the sample size-n to 2500.
 What’s the difference in sample
mean variability between the size 100
samples & the size 2500 samples?
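
One way to explore this question is by simulation. The minimal Python sketch below assumes a made-up population of 100,000 cholesterol values (mean 200, standard deviation 40) and uses 300 repeated samples at each size; these numbers and the function name are illustrative assumptions. A standard result is that the standard deviation of the sample mean is roughly the population standard deviation divided by the square root of n, so a 25-fold increase in n should shrink the spread about 5-fold.

```python
import random
import statistics

rng = random.Random(123)

# Hypothetical population of cholesterol values (an assumption, not real data).
population = [rng.gauss(200, 40) for _ in range(100_000)]

def sd_of_sample_means(n, reps=300):
    """Standard deviation of the means of `reps` random samples of size n."""
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(reps)]
    return statistics.stdev(means)

sd_100 = sd_of_sample_means(100)
sd_2500 = sd_of_sample_means(2500)

print("SD of sample means, n = 100: ", round(sd_100, 2))
print("SD of sample means, n = 2500:", round(sd_2500, 2))
print("ratio:", round(sd_100 / sd_2500, 1))
```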
Two Fundamental Problems
 We have to contend with two
fundamental problems in drawing
samples & making inferences:
 Bias
 Variability
Bias
 Bias means that a measurement
systematically underestimates or overestimates
a parameter.
 A statistic used to estimate a parameter is
unbiased if the mean of its sampling
distribution is equal to the true value of the
parameter being estimated.
 Chance errors change from measurement to
measurement, sometimes up & sometimes down.
Chance errors are random errors—and thus don’t
affect a distribution’s central tendency.
 Bias, however, affects all the measurements in
the same direction, either up or down. Bias, then,
is systematic error, which pushes a distribution’s
central tendency either up or down.
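
The difference between chance error and bias can be seen in a small simulation. This minimal Python sketch assumes a true mean cholesterol of 200, a standard deviation of 40, and a hypothetical measuring device that reads 10 units too low; all of these values and names are made up for illustration.

```python
import random
import statistics

rng = random.Random(123)
TRUE_MEAN = 200   # hypothetical population parameter
OFFSET = -10      # hypothetical systematic error: device reads 10 units low

def one_sample_mean(n, offset):
    """Mean of n measurements: true value + chance error + systematic offset."""
    return statistics.fmean(rng.gauss(TRUE_MEAN, 40) + offset for _ in range(n))

unbiased = [one_sample_mean(100, 0) for _ in range(1000)]
biased = [one_sample_mean(100, OFFSET) for _ in range(1000)]

for label, means in (("chance error only", unbiased), ("with bias", biased)):
    print(f"{label:18s} center = {statistics.fmean(means):6.1f}"
          f"  spread = {statistics.stdev(means):4.1f}")
```

Both sets of sample means scatter by about the same amount, but only the biased set is centered away from the true value. This also suggests why bias usually cannot be detected from a single data set: the spread looks the same either way.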
What would be evidence of chance
(i.e. random) errors in the cholesterol
study?

 What would be evidence of bias in
the cholesterol study? What might be
sources of such bias?
Chance (i.e. random) error is
also called sampling (i.e. nonsystematic) error.

 Bias is also called nonsampling (i.e. systematic)
error.
 See King et al., Designing
Social Inquiry, on bias.
Variability
 The variability of a statistic is described
by the spread of its sampling distribution.
This spread is determined by the sampling
design & the sample size n. Statistics from
larger samples have smaller spreads.
 Variability refers to whether the estimated
values fall within a relatively narrow range or
are more widely scattered.
 Bias & variability
See King et al. on bias versus
variability (or ‘efficiency’).

Managing Bias & Variability

To reduce bias use random sampling.
 To reduce the variability of a
statistic in simple random sampling, use
a (much) larger sample.
 A large random sample almost always gives
an estimate that is close to the parameter: the
larger, the better.
 What matters is the sample size, not the
population size: The variability of a statistic from
a random sample does not notably depend on
the size of the population.
 According to Moore/McCabe/Craig, this is true, in
the strictest sense, as long as the population is at
least 100 times larger than the sample, but for
basic purposes it is true in general.
 The larger the random sample, the less
variable—i.e. the more precise—the sample
statistic will be.
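
The claim that population size barely matters can be checked with a small simulation. This minimal Python sketch compares samples of size 100 drawn from two made-up populations, one of 10,000 and one of 500,000, both with mean 200 and standard deviation 40. The population sizes, the number of repetitions, and the function name are assumptions for illustration.

```python
import random
import statistics

rng = random.Random(123)

def sd_of_sample_means(pop_size, n=100, reps=1000):
    """SD of the means of repeated random samples of size n from one population."""
    population = [rng.gauss(200, 40) for _ in range(pop_size)]  # hypothetical values
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(reps)]
    return statistics.stdev(means)

sd_small_pop = sd_of_sample_means(10_000)
sd_large_pop = sd_of_sample_means(500_000)

print("population  10,000: SD of sample means =", round(sd_small_pop, 2))
print("population 500,000: SD of sample means =", round(sd_large_pop, 2))
```

The two spreads should come out nearly the same, even though one population is 50 times larger than the other.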

Freedman et al., Statistics, say: “When estimating
percentages, it is the absolute size of the sample which
determines accuracy, not the size relative to the population.
This is true if the sample is only a small part of the
population, which is the usual case” (p. 367).
 There is a very tiny difference, which the finite
population correction factor (fpc) can compensate for:
perhaps use it when the sample is a large share (say,
30-40+%) of the population. But using it can cause extra
uncertainty for inferring the sample’s results to a
population (Stats http://www.childrensmercy.org/stats/size/population.asp; and Carolina Population
Center
http://www.cpc.unc.edu/services/computer/presentations/statatutorial/example29.htm).
 So use fpc only when descriptive precision, rather
than inference, is the priority.
 See Freedman et al., pp. 367-370; and UCLA-ATS
http://www.ats.ucla.edu/STAT/stata/seminars/svystataintro/d
fpc = square root of [(N – n) / (N – 1)]
N=population size
n=sample size
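
The fpc formula can be evaluated directly. This minimal Python sketch computes it for four combinations of sample size and population size; the function name and the particular values of N and n are chosen for illustration.

```python
import math

def fpc(N, n):
    """Finite population correction: sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))

for n in (50, 4000):
    for N in (10_000, 10_000_000):
        print(f"n = {n:>5,}  N = {N:>10,}  fpc = {fpc(N, n):.4f}")
```

With n = 50 the factor stays very close to 1 for either population size. With n = 4,000 and N = 10,000 it drops to roughly 0.77, because the sample is then 40% of the population.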
 “If fpc is close to 1, then there is almost
no effect. When fpc is much smaller than 1,
then sampling a large fraction of the
population is indeed having an effect on
precision” (Stats http://www.childrensmercy.org/stats/size/population.asp).
 In Stata: ‘help svy’; & see Stata manual
for svy commands
 The fpc for different situations:
Table’s examples: “When the sample size is 50, it
does not matter much whether the population is 10
thousand or 10 million. When the sample size,
however, is four thousand, then we have about
23% more precision with a population of ten
thousand than we would for a population of ten
million” (Stats http://www.childrens-mercy.org/stats/size/population.asp).

To repeat, possibly use fpc when the sample is a large share of the
population (see Carolina Population Center,
http://www.cpc.unc.edu/services/computer/presentations/statatutorial/example29.htm).
 But, recall, using it may cause extra uncertainty if you seek
to infer the sample’s results to a population.
 And keep in mind that the statistical difference is slight, so
it’s generally not a big deal: use fpc only when the priority
is descriptive precision rather than inference.
 Returning to the general issue: Freedman et al.
acknowledge that the relationship of sample size and
accuracy to population is counterintuitive.
 Helpful analogy: “Every cook knows that it only takes a
single sip from a well-stirred soup to determine the taste”
(Stats http://www.childrensmercy.org/stats/size/population.asp).
 Remember: the larger the
random sample, the less
variable—i.e. the more
precise—the statistic.
 Required sample size has
virtually nothing to do with
population size.
 The basic rule, then: obtain
the largest random sample
possible.
Why Use Randomized Assignment or a
Random Sample?
 Randomized assignment or a random sample
guarantees that the results of analyzing our data are
subject to the laws of impersonal probability.
 Nonetheless, keep in mind that proper
statistical design is not the only aspect of a
good sample or experiment. The sampling
distribution says nothing about possible bias
due to non-sampling problems such as
undercoverage, nonresponse, or response bias.
 Consequently, the true distance of
a statistic from the parameter it is
estimating can be much larger than
the sampling distribution suggests.
 Moreover there is no way to say
how large the added error is.
These are non-probability sources
of the fact that conclusions are
inherently uncertain.
See King et al. on the
importance of reporting
statistical uncertainty.

Review
 What are randomized assignment and random
sampling? What are their purposes?
 What are the principal kinds of research design?
 What are the procedures in each kind?
 What are the advantages & disadvantages of each
kind?
 What’s a population, a sample & a sample design?
 What’s measurement error? What’s sampling
variability?
 What’s the basic approach of statistical inference?
 What’s statistical significance?
 What’s a population distribution?
 What’s bias? What’s variability?
 How can we reduce bias? How can we reduce
variability, & what’s the relation of such action to
population size?
 What are the basic kinds of sample? What are the
advantages & disadvantages of each kind?
 Besides lack of randomness, what other problems
can bias a statistic?
Example: Sampling
 FIU’s administration hires you to find out
the proportion & characteristics of FIU
students who are principal caregivers for
children or elders.
Will you do a census or survey, &
why?

 Assuming that you do a survey,
what kind of sample design do you
choose, & why?
What sampling-based problems may
cause bias or unacceptable variability?
What will you do about them (or would
you like to do about them)?

 In Stata, type ‘help svy’ to inspect a
suite of survey-statistic adjustments.
 See Stata manual.
How to draw a random sample with Stata
 Draw a 50% random sample
. use hsb2, clear
. * Set the seed to make the sample replicable
. set seed 123
. summarize
. * Keep a random 50% of the observations
. sample 50
. summarize, detail
. * Then graph the sampled observations with,
. * e.g., histogram or graph box.
 Draw a random sample of 50 observations
. use hsb2, clear
. summarize
. * Set the seed to make the sample replicable
. set seed 123
. * Keep exactly 50 randomly chosen observations
. sample 50, count
. summarize, detail
. * Then graph the sampled observations with,
. * e.g., histogram or graph box.