Transcript RA_sampling

Definitions
• Population: the entire group to which we wish to
project our findings
• Sample: the subgroup that is actually measured
• Unit of analysis: that which “contains” the
variables under study
• Case: single occurrence of a unit of analysis
• Coding: assigning a measurement to a variable
Sampling
• Population: Every member, “case” or element of
the group to which your findings are intended to
apply
– Sampling frame: A list that contains each
element
• Sample: A subset of “cases” or elements selected
from a population
Sampling error
• Differences between the characteristics of the
sample and the population
• Decreases as size of sample increases
• Rule of thumb: number of cases in the smallest
group or subgroup to be separately measured,
tested or compared should be at least 30
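A minimal simulation sketch of the point that sampling error shrinks as the sample grows. The population below is hypothetical (200 made-up scores between 1 and 5, not data from these slides), and the helper name `mean_abs_sampling_error` is ours.

```python
import random
import statistics

def mean_abs_sampling_error(population, n, reps=1000, seed=0):
    """Average |sample mean - population mean| over many random samples of size n."""
    rng = random.Random(seed)
    mu = statistics.mean(population)
    errors = []
    for _ in range(reps):
        sample = rng.sample(population, n)  # drawn without replacement
        errors.append(abs(statistics.mean(sample) - mu))
    return statistics.mean(errors)

# Hypothetical population: 200 scores between 1 and 5 (an assumption for illustration).
_rng = random.Random(42)
population = [_rng.randint(1, 5) for _ in range(200)]

errors = {n: mean_abs_sampling_error(population, n) for n in (5, 30, 100)}
for n, err in errors.items():
    print(f"n = {n:3d}: average sampling error = {err:.3f}")
```

The printed error should fall as n rises from 5 to 30 to 100; the exact values depend on the made-up population and the seed.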
Representative sampling
• Selecting members so that characteristics of
the sample accurately reflect the
characteristics of the population
– Purpose: To be able to generalize from the
sample to the population
– Limitation: Can only generalize to the
population from which the sample was drawn
Probability sampling
• Each element or “case” in the population has an equal
chance to be selected and become a part of the sample.
– If the sampling frame is 5 and we draw two from a hat, each
element’s probability of being selected is 1/5 (.20) on the first
draw.
• Sampling with replacement: During selection, drawn
elements are returned to the population. This keeps the
probability of any element being drawn the same but
makes duplicate draws possible.
– On the second draw, each element’s probability of
being drawn is still 1/5 (.20).
• Sampling without replacement: During selection,
drawn elements are not returned to the population
– On the second draw, each remaining element’s
probability of being drawn is ¼ (.25).
• Sampling without replacement is most
common since most sampling frames are
sufficiently large so that as elements are
drawn, changes in probability are small
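The hat example can be checked with exact fractions. This is only a sketch: the five-element frame and its labels A to E are hypothetical, and `random.choices` / `random.sample` stand in for drawing with and without replacement.

```python
from fractions import Fraction
import random

frame = ["A", "B", "C", "D", "E"]  # hypothetical sampling frame of 5 elements

# First draw: every element has the same chance.
p_first = Fraction(1, len(frame))                # 1/5

# Second draw WITH replacement: the drawn element goes back, so nothing changes.
p_second_with = Fraction(1, len(frame))          # 1/5

# Second draw WITHOUT replacement: one element is gone, four remain.
p_second_without = Fraction(1, len(frame) - 1)   # 1/4

# Chance that a given element ends up in a 2-element sample drawn without
# replacement: picked first, or missed first and then picked second.
p_in_sample = p_first + (1 - p_first) * p_second_without   # 2/5

rng = random.Random(1)
with_replacement = rng.choices(frame, k=2)    # duplicates possible
without_replacement = rng.sample(frame, 2)    # duplicates impossible
```

Note that 2/5 is the chance of ending up in the two-element sample overall, while the chance on any single first draw is 1/5.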
Probability sampling techniques
• Simple random sampling: Each element and
combination of elements has the same
chance of being selected
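To illustrate "each combination of elements has the same chance", the sketch below lists every possible two-element sample from a hypothetical frame of five (labels A to E are assumptions for illustration).

```python
from fractions import Fraction
from itertools import combinations

frame = ["A", "B", "C", "D", "E"]  # hypothetical frame

# Every distinct pair that could be drawn without replacement.
possible_samples = list(combinations(frame, 2))

# Under simple random sampling each pair is equally likely.
p_each_sample = Fraction(1, len(possible_samples))

# Share of possible samples that contain one particular element, here "A".
samples_with_a = [s for s in possible_samples if "A" in s]
p_a_included = Fraction(len(samples_with_a), len(possible_samples))
```

There are 10 possible pairs, so each has a 1/10 chance, and any one element appears in 4 of the 10.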
Bring out the chips!
[Figure: Distribution of sentences for known population of inmates. N = 200 inmates, mean sentence 2.94 yrs. Bar chart; x-axis: sentence in years (1 to 5).]
[Figure: Distribution of sentences - Property crimes (known population parameters). n = 150 inmates, mean sentence 2.88 yrs. N = 200 inmates, mean sentence 2.94 yrs. Bar chart; x-axis: sentence in years (1 to 5).]
[Figure: Distribution of sentences - Violent crimes (known population parameters). n = 50 inmates, mean sentence 3.12 yrs. N = 200 inmates, mean sentence 2.94 yrs. Bar chart; x-axis: sentence in years (1 to 5).]
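The subgroup figures quoted on the charts are consistent with the overall figure: weighting each subgroup mean by its size gives back the population mean. A quick check, using only the numbers stated on the slides (the dictionary layout is ours):

```python
# (number of inmates, mean sentence in years) as stated on the slides
strata = {"property": (150, 2.88), "violent": (50, 3.12)}

total_n = sum(n for n, _ in strata.values())
overall_mean = sum(n * mean for n, mean in strata.values()) / total_n

print(total_n, round(overall_mean, 2))
```

That is, (150 × 2.88 + 50 × 3.12) / 200 = 588 / 200 = 2.94 years.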
• Stratified random sampling: Divide
population into categories (strata) and
randomly sample within each
– Proportionate: Number of elements in each
category is proportionate to that category’s
representation in the population.
Stratified proportionate random sampling
Does the cynicism of female and male patrol officers differ?
Sin City
200 patrol officers
150 male (75 %)
50 female (25 %)
randomly select 30 officers
expect 22.5 males
expect 7.5 females
Compare average cynicism score
within each stratum
Is there a problem?
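A sketch of the proportionate allocation on this slide. The function name `proportionate_allocation` is ours, and checking the result against the 30-case rule of thumb from the sampling error slide is one possible reading of "Is there a problem?", not necessarily the only one.

```python
def proportionate_allocation(strata_sizes, sample_size):
    """Expected cases per stratum when sampling in proportion to stratum size."""
    total = sum(strata_sizes.values())
    return {name: sample_size * size / total for name, size in strata_sizes.items()}

officers = {"male": 150, "female": 50}
expected = proportionate_allocation(officers, 30)

MIN_GROUP = 30  # rule of thumb: at least 30 cases in the smallest compared group
too_small = [name for name, n in expected.items() if n < MIN_GROUP]

print(expected)
print("Below the rule of thumb:", too_small)
```

With a total sample of 30, both groups fall short of 30 cases, and the female group is expected to hold only 7.5.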
• Disproportionate sampling: Randomly
drawing a disproportionate number of
elements from a specific stratum whose
overall representation is low.
Stratified disproportionate random sampling
Does the cynicism of female and male patrol officers differ?
Sin City
200 patrol officers
150 male (75 %)
50 female (25 %)
randomly select 30 cases
from each category
30 males
30 females
Compare average cynicism score for each stratum
(Note: cannot combine results to get department score)
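One way to see why the note warns against combining results: in a 30 + 30 sample, females make up half the cases but only a quarter of the department. The cynicism scores below are hypothetical, chosen only to make the arithmetic obvious.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical scores (an assumption, not data from the slides):
# every male officer scores 4.0 and every female officer scores 2.0.
males = [4.0] * 150
females = [2.0] * 50

# Disproportionate stratified sample: 30 cases from each stratum.
male_sample = rng.sample(males, 30)
female_sample = rng.sample(females, 30)

department_mean = statistics.mean(males + females)
pooled_sample_mean = statistics.mean(male_sample + female_sample)

print("Department mean:   ", department_mean)
print("Pooled sample mean:", pooled_sample_mean)
```

Under these made-up scores the department mean is 3.5, but simply pooling the two samples gives 3.0, because the pooled sample over-represents the smaller stratum. The within-stratum means (4.0 and 2.0) are still reproduced correctly.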
Sampling exercise, Sin City
Research question: Is there more likely to be a personal relationship
between suspect and victim in violent crimes or in crimes against
property?
You have full access to crime data for “Sin City” in 2004.
These statistics show there were 200 crimes, of which 75
percent were property crimes and 25 percent were violent
crimes. For each crime, you know whether the victim and
the suspect were acquainted (yes/no).
1. Identify the population.
2. How would you sample?
3. Would you stratify? How?
4. Do it two ways – using proportionate and disproportionate techniques.
Which is better? Why?
Stratified proportionate random sampling
Is there more likely to be a personal relationship between suspect and
victim in violent crimes or crimes against property?
Sin City
200 crimes in 2004
50 violent (25 %)
150 property (75 %)
randomly select 30 cases
(15% of the population)
(expect 7.5 violent – 25%)
(expect 22.5 property – 75%)
Compare proportions of these cases
where suspects knew the victim
Stratified disproportionate random sampling
Is there more likely to be a personal relationship between suspect and
victim in violent crimes or crimes against property?
Sin City
200 crimes in 2004
50 violent (25 %)
150 property (75 %)
randomly select 30 cases
from each category
30 violent
30 property
Compare proportions within each where suspect and
victim were acquainted
(Note: cannot combine results)
Jay’s cynicism reduction program
Hypothesis: Training reduces cynicism
The Anywhere Police Department has 200 patrol
officers, of which 150 are males and 50 are
females. Jay wants to conduct an experiment
using control groups to test his program.
1. Identify the population.
2. How would you sample?
3. Would you stratify? How?
4. Is it better to use proportionate or disproportionate
techniques? Why?
Stratified disproportionate random assignment
Does Jay’s cynicism reduction program work?
Hypothesis: Training reduces cynicism
Population: 200 patrol officers
150 males (75%)
50 females (25%)
Randomly assign 25 officers to each of four groups (two groups per stratum):
• Control group (program: NO)
• Experimental group (program: YES)
• Experimental group (program: YES)
• Control group (program: NO)
For each group, pre-measure dependent variable officer cynicism
Apply the intervention (adjust the value of independent variable – Jay’s program.)
For each group, post-measure dependent variable officer cynicism
Compare within-group changes – do they support the hypothesis?
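A minimal sketch of the assignment step, assuming each stratum gets one experimental and one control group of 25. The officer IDs and the helper `assign_within_stratum` are hypothetical.

```python
import random

def assign_within_stratum(officer_ids, group_size, rng):
    """Randomly pick 2 * group_size officers and split them into two groups."""
    chosen = rng.sample(officer_ids, 2 * group_size)
    return {
        "experimental": chosen[:group_size],
        "control": chosen[group_size:],
    }

rng = random.Random(2024)

# Hypothetical ID lists standing in for the department roster.
males = [f"M{i:03d}" for i in range(1, 151)]    # 150 male officers
females = [f"F{i:03d}" for i in range(1, 51)]   # 50 female officers

groups = {
    "male": assign_within_stratum(males, 25, rng),
    "female": assign_within_stratum(females, 25, rng),
}

for stratum, assignment in groups.items():
    for condition, members in assignment.items():
        print(stratum, condition, len(members))
```

With 25 per group, all 50 female officers take part, while only 50 of the 150 males do; that is what makes the design disproportionate.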
Quasi-probability sampling
• Systematic sampling: Randomly select first element, then
choose every 5th, 10th, etc. depending on the size of the
frame.
– Problem: Sampling list that is ordered in a particular way could
result in a non-representative sample
• Cluster sampling: Divide population into equal-sized
groups (clusters) chosen on the basis of a neutral
characteristic, then draw a random sample of clusters. The
study sample contains every element of the chosen clusters.
– Often done to study public opinion (city divided into blocks)
– Rule of equally-sized clusters usually violated
– The “neutral” characteristic may wind up being an extraneous
variable and affect outcomes!
– Since not everyone in the population has an equal chance of being
selected, sampling error is likely to be larger than with simple random sampling
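Both techniques above can be sketched in a few lines. The frame of 200 numbered elements, the interval of 10, and the cluster size of 20 are assumptions for illustration; the function names are ours.

```python
import random

def systematic_sample(frame, k, rng):
    """Random start among the first k elements, then every k-th element."""
    start = rng.randrange(k)
    return frame[start::k]

def cluster_sample(frame, cluster_size, n_clusters, rng):
    """Split the frame into equal-sized clusters, pick clusters at random,
    and keep every element of the chosen clusters."""
    clusters = [frame[i:i + cluster_size] for i in range(0, len(frame), cluster_size)]
    chosen = rng.sample(clusters, n_clusters)
    return [element for cluster in chosen for element in cluster]

rng = random.Random(3)
frame = list(range(1, 201))  # hypothetical frame: elements numbered 1 to 200

every_tenth = systematic_sample(frame, 10, rng)    # 20 elements
three_blocks = cluster_sample(frame, 20, 3, rng)   # 3 clusters of 20 elements
```

If the frame were sorted in a way that repeats every 10 elements, `systematic_sample` would pick the same kind of element each time, which is the ordering problem noted above.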
Nonprobability sampling
• Accidental sample: Subjects who happen to be encountered
by researchers
– Example – observer ridealongs in police cars
• Quota sample: Elements are included in proportion to their
known representation in the population
• Purposive/“convenience” sample: Researcher uses best
judgment to select elements that typify the population
– Example: Interview all burglars arrested during the
past month