week 2 powerpoint lecture

HDFS 361—Research Methods
Week 2:
Levels of Measurement
and Sampling
Types of Studies
Descriptive studies
• These studies describe the results for the
participants in the study.
Inferential studies
• These studies seek to generalize beyond the
participants to a specified, larger population.
Sample and Population
Population
• A population includes the universe of people or
groups about whom we are interested.
Sample
• A sample is a subset of a population.
• If a sample is representative of the population
from which it was drawn, we can make an
inference from the sample to the population.
Criteria for Levels of Measurement
• Mutually exclusive—each observation is assigned a
single value or label.
• Exhaustive—every observation is classified (measured),
even if assigned to a category called “other.”
• Ordered—observations are ranked or ordered on how
much of the characteristic they have.
• Equal appearing intervals—an equal difference
between values corresponds to an equal difference on the
characteristic being measured.
• Meaningful zero point—a value of 0 corresponds to the
absolute absence of the characteristic being measured.
Level of Measurement and Measurement
Criteria: The Traditional Approach
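Level of     | Mutually  | Exhaustive | Ordered | Equal     | Meaningful | Examples
Measurement  | Exclusive |            |         | Intervals | zero       |
-------------+-----------+------------+---------+-----------+------------+----------------
Ratio        | Yes       | Yes        | Yes     | Yes       | Yes        | Age
Interval     | Yes       | Yes        | Yes     | Yes       |            | Scales
Ordinal      | Yes       | Yes        | Yes     |           |            | Religiosity
Nominal      | Yes       | Yes        |         |           |            | Marital Status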
Nominal Variables—Frequency
Distributions
Marital status |      Freq.     Percent        Cum.
---------------+-----------------------------------
       married |      1,269       45.90       45.90
       widowed |        247        8.93       54.83
      divorced |        445       16.09       70.92
     separated |         96        3.47       74.39
 never married |        708       25.61      100.00
---------------+-----------------------------------
         Total |      2,765      100.00
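As a rough illustration of how the three columns of a frequency distribution relate to each other, here is a minimal Python sketch. The function name `frequency_table` is made up for this example, and the raw responses are rebuilt from the frequencies in the marital status table rather than read from the actual survey file.

```python
from collections import Counter

def frequency_table(values, order):
    """Return (category, freq, percent, cumulative percent) rows in the given order."""
    counts = Counter(values)
    total = sum(counts[c] for c in order)
    rows, running = [], 0
    for category in order:
        running += counts[category]
        rows.append((category,
                     counts[category],
                     round(100 * counts[category] / total, 2),
                     round(100 * running / total, 2)))
    return rows

# Rebuild one response per person from the frequencies in the slide
# (an assumption for illustration; the real data file is not shown).
freqs = {"married": 1269, "widowed": 247, "divorced": 445,
         "separated": 96, "never married": 708}
responses = [status for status, n in freqs.items() for _ in range(n)]

table = frequency_table(responses, list(freqs))
for row in table:
    print(row)
```

Because marital status is nominal, the order of the rows is arbitrary, so the cumulative column depends entirely on the order we happen to list the categories in.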
Ordinal Ranks--Median
• The median is the value of the case in the middle.
• Rank the observations. If two children were tied at the 3rd
rank, we would give both of them a rank of 3.5, because the
pair occupies both the 3rd and the 4th ranks and the average
of 3 and 4 is 3.5. The next child higher on the scale would
have a rank of 5, resulting in rankings of 1, 2, 3.5, 3.5, 5,
6, 7, and 8.
• If 3 children were tied for most aggressive (in addition to
the 2 tied for third), the rankings would be 1, 2, 3.5, 3.5,
5, 7, 7, 7.
• The three highest-ranking children occupy the 6th, 7th, and 8th
ranks, and the average of 6, 7, and 8 is 7.
• In sporting events they try to be nice and give tied contestants
the highest rank they can.
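The tie rule described above (tied cases share the average of the ranks they occupy) can be sketched in Python as follows. The function name `average_ranks` and the aggression scores are hypothetical, chosen only so that the ties fall where the slide puts them.

```python
def average_ranks(scores):
    """Rank scores from lowest (rank 1) to highest.

    Tied scores all receive the mean of the ranks they jointly occupy.
    Ranks are returned in sorted-score order.
    """
    ordered = sorted(scores)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        # Advance j past every score tied with ordered[i].
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # 0-based positions i..j-1 are ranks i+1..j; their mean is (i+1+j)/2.
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return [rank_of[s] for s in ordered]

# Hypothetical aggression scores: two children tied for the 3rd and 4th ranks.
two_tied = average_ranks([10, 12, 15, 15, 18, 20, 22, 25])

# Same children, but now three are also tied for most aggressive.
three_tied = average_ranks([10, 12, 15, 15, 18, 25, 25, 25])

print(two_tied)
print(three_tied)
```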
Ordinal categories—Frequency
Distribution
     Health |      Freq.     Percent        Cum.
------------+-----------------------------------
  excellent |        568       30.75       30.75
       good |        854       46.24       76.99
       fair |        322       17.43       94.42
       poor |        103        5.58      100.00
------------+-----------------------------------
      Total |      1,847      100.00
Ordinal Categories—Bar Charts
[Bar chart: Distribution of Health Status for U.S. Adults,
General Social Survey 2002. Horizontal axis: condition of
health; vertical axis: Percent (0 to 50). Bars: Poor 5.577,
Fair 17.43, Good 46.24, Excellent 30.75.]
Nominal Level—Bar Charts
[Bar chart: Distribution of U.S. Adults on Marital Status,
General Social Survey 2002. Horizontal axis: Marital Status
(married, widowed, divorced, separated, never married);
vertical axis: Percentage (0 to 50).]
Interval/Ratio Level—Underlying
Continuum
[Figure: Jim, Sue, Joe, and Chandra placed at different points
along an underlying continuum.]
Interval/Ratio Level
• We can use most statistics and graphs
• Means, standard deviations
• Histograms and other charts
• We will cover these later in the course
Data Collection—Random Sample
• A simple random sample means everybody in the population
has the same chance of selection.
• The theory assumes sampling with replacement, but this is
rarely done in practice.
• We need a list of the entire population to draw a random
sample, and such a list is often hard to obtain.
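Using Stata to Select Random Sample
of 1000 People from a Population of
15,000

* create an empty dataset with 15,000 observations
set obs 15000
* number the observations 1 to 15,000
gen id = _n
* keep a random sample of exactly 1,000 observations
sample 1000, count
* list the id of the first 10 sampled observations
list id in 1/10

     +-------+
     |    id |
     |-------|
  1. |  5546 |
  2. |  4530 |
  3. |  6419 |
  4. |  5622 |
  5. |  8877 |
     |-------|
  6. |  3867 |
  7. | 10748 |
  8. |  6179 |
  9. | 11602 |
 10. |   361 |
     +-------+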
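For readers without Stata, the same idea can be sketched in Python. This is only an illustration: the function name `simple_random_sample` and the seed value are assumptions, and the ids drawn will not match the Stata listing because the two programs use different random number generators.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw ids from 1..population_size without replacement.

    Every id has the same chance of selection, and no id can be
    drawn twice.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(range(1, population_size + 1), sample_size)

# 1,000 people from a population of 15,000, as in the slide.
ids = simple_random_sample(15000, 1000, seed=361)
print(ids[:10])  # first 10 sampled ids
```

Like the Stata `sample` command, this draws without replacement, which is what is usually done in practice.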
Sample size and Sampling Error
Sample N | Sampling Error
---------+---------------
      20 |         21.91%
      50 |         13.86%
     100 |          9.80%
     200 |          6.93%
     500 |          4.38%
   1,000 |          3.10%
   1,600 |          2.45%
  10,000 |          0.98%
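The sampling errors in this table are consistent with a 95% margin of error for a proportion, 1.96 times the square root of pq/n, evaluated at p = .5 (the value that gives the largest error). That reading is an inference from the numbers, not something the slide states. A short Python sketch under that assumption, with a made-up function name:

```python
import math

def sampling_error(n, p=0.5, z=1.96):
    """Margin of error for a proportion p with sample size n.

    Assumes a simple random sample and a 95% confidence level (z = 1.96).
    """
    return z * math.sqrt(p * (1 - p) / n)

for n in [20, 50, 100, 200, 500, 1000, 1600, 10000]:
    print(f"{n:>6,}  {sampling_error(n):.2%}")
```

Because n sits under a square root, quadrupling the sample size only halves the sampling error.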
Graphic of 15 Confidence Intervals, n = 500,
True proportion in Population = .48
[Figure: 15 confidence intervals plotted on a horizontal axis
running from .33 to .59, with a reference line at the true
proportion, .48. One interval is labeled “This is the only one
that misses.”]
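A graphic like this one can be approximated by simulation. The sketch below is an assumption-laden illustration, not the procedure used to make the slide: it draws 15 samples of 500 from a population where the true proportion is .48, builds the usual 95% interval (the sample proportion plus or minus 1.96 times the square root of pq/n) for each, and counts how many miss. The function name and the seed are invented for the example.

```python
import math
import random

def simulate_intervals(true_p=0.48, n=500, n_samples=15, seed=361):
    """Return a list of (low, high) 95% intervals, one per simulated sample."""
    rng = random.Random(seed)
    intervals = []
    for _ in range(n_samples):
        # Each person is a "yes" with probability true_p.
        successes = sum(rng.random() < true_p for _ in range(n))
        p = successes / n
        margin = 1.96 * math.sqrt(p * (1 - p) / n)
        intervals.append((p - margin, p + margin))
    return intervals

intervals = simulate_intervals()
misses = sum(not (low <= 0.48 <= high) for low, high in intervals)
print(f"{misses} of {len(intervals)} intervals miss the true proportion")
```

In the long run about 1 interval in 20 should miss, so with only 15 intervals the number of misses will vary from run to run.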
Estimating Confidence Interval for
Proportion
$$p \pm 1.96\sqrt{\frac{pq}{n}}$$
or
$$p \pm 1.96\sqrt{\frac{pq}{n}\left(\frac{N-n}{N}\right)}$$
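The two versions of the interval can be compared with a few lines of Python. This is a sketch under stated assumptions: p is the sample proportion, q = 1 - p, n is the sample size, and N is the population size; the square root is taken over the whole product, and the second version shrinks the interval by the factor (N - n)/N. The function name `proportion_ci` is hypothetical, and the inputs reuse numbers from earlier slides (p = .48, n = 500, N = 15,000) purely for illustration.

```python
import math

def proportion_ci(p, n, N=None, z=1.96):
    """95% confidence interval for a proportion.

    If a population size N is given, the variance is multiplied by
    (N - n) / N, the finite population adjustment in the second formula.
    """
    variance = p * (1 - p) / n
    if N is not None:
        variance *= (N - n) / N
    margin = z * math.sqrt(variance)
    return p - margin, p + margin

low, high = proportion_ci(0.48, 500)
low_fpc, high_fpc = proportion_ci(0.48, 500, N=15000)

print(f"without adjustment: {low:.4f} to {high:.4f}")
print(f"with adjustment:    {low_fpc:.4f} to {high_fpc:.4f}")
```

The adjustment matters only when the sample is a sizable share of the population; with 500 out of 15,000 the two intervals are nearly the same.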
Stratified Sample
• By dividing the population into two or more
strata, each of which is homogeneous, we can
conduct a random sample of each stratum and
then pool the results.
• This is more powerful than a simple random
sample to the extent the strata are homogeneous.
• Rather than taking a random sample of the entire
population, a stratified sample could be used to
take a random sample of each stratum.
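A stratified sample along these lines might be sketched in Python as below. Everything specific here is hypothetical: the function name, the population of 600 women and 400 men, and the choice to sample the same 10 percent from every stratum.

```python
import random

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Take a simple random sample of the same fraction from every stratum,
    then pool the results."""
    rng = random.Random(seed)
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    pooled = []
    for members in strata.values():
        k = round(len(members) * fraction)
        pooled.extend(rng.sample(members, k))  # random sample within the stratum
    return pooled

# Hypothetical population: (id, gender) pairs.
population = [(i, "woman") for i in range(600)] + \
             [(i, "man") for i in range(600, 1000)]

sample = stratified_sample(population, lambda person: person[1], 0.10, seed=2)
n_women = sum(1 for _, gender in sample if gender == "woman")
n_men = sum(1 for _, gender in sample if gender == "man")
print(n_women, n_men)
```

Unlike a simple random sample of 100, which would give roughly 60 women and 40 men, this design fixes each stratum's share of the sample exactly.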
Stratified Samples
[Figure: Two Normal Distributions, one labeled MEN and one
labeled WOMEN, drawn on a shared x axis with tick marks at
-2, 4, 7, and 13.]
Cluster Sample
• Cluster sampling is sometimes confused with
stratified sampling, but it has a different purpose. If
our population is geographically dispersed, we can
often save a great deal of time and money by
dividing the population into geographical clusters,
randomly sampling the clusters, and then collecting
data only within the selected clusters.
• Census data can be used on any city in the U.S. to
list every city block (usually commercial blocks are
excluded). We could then take a sample of blocks
(sampling units) and interview all or some of the
households in each block we included in our sample
of blocks.
Cluster Sample
• A person interested in morale of elementary school teachers in
a large school district could obtain a list of elementary schools
(sampling units) and sample 10 percent of the schools.
• If your clusters are blocks, you can send an interviewer to a
selected block. Once there, the interviewer can go to the first
selected house. If nobody is home, the interviewer can go to the
next selected house, and so on.
• We could sample HDFS students by randomly sampling 20 sections
from the class schedule and then giving the instrument to
everybody in the selected sections.
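The school example can be sketched in Python as a one-stage cluster sample: sample whole schools, then include every teacher in the schools that were drawn. The district below (50 schools with 20 teachers each) and all of the names are invented for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and keep every member of each one.

    `clusters` maps a cluster name to the list of its members.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical district: 50 elementary schools, 20 teachers in each.
schools = {
    f"school_{s:02d}": [f"teacher_{s:02d}_{t:02d}" for t in range(20)]
    for s in range(1, 51)
}

# Sample 10 percent of the schools (5 of 50).
selected = cluster_sample(schools, n_clusters=5, seed=2)
teachers = [t for members in selected.values() for t in members]
print(sorted(selected))
print(len(teachers))
```

Only a list of schools is needed up front, not a list of every teacher, which is where the savings in time and money come from.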
Nonprobability Samples—Quota
Sample
• Quota Sampling tries to be representative by
sampling a reasonable number of certain groups.
• We might sample 100 women and 100 men for a
200-person sample. This would make the sample
representative on gender.
• This approach is better than nothing, but should
not be confused with a probability sample. We
may represent the gender and racial distribution of
our population, but without probability sampling,
we should be hesitant to generalize to the
population.
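One way to see why a quota sample is not a probability sample is to write the procedure down. In the hypothetical Python sketch below, people are taken in whatever order they happen to turn up until each quota is full, so nobody's chance of selection is known. The function name and the list of volunteers are assumptions made for the example.

```python
def quota_sample(volunteers, group_of, quotas):
    """Fill each group's quota with the first people who turn up.

    No random selection is involved, so this is a nonprobability sample.
    """
    filled = {group: [] for group in quotas}
    for person in volunteers:
        group = group_of(person)
        if group in filled and len(filled[group]) < quotas[group]:
            filled[group].append(person)
    return filled

# Hypothetical stream of 600 volunteers: every third one is a man.
volunteers = [(i, "man" if i % 3 == 0 else "woman") for i in range(600)]

sample = quota_sample(volunteers, lambda person: person[1],
                      {"woman": 100, "man": 100})
print(len(sample["woman"]), len(sample["man"]))
```

The result matches the population on gender by construction, but the people in it are simply the earliest volunteers, which is why generalizing from it calls for caution.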
Nonprobability Samples—Snowball
• Snowball Sampling is an approach used for
rare populations.
• What if you wanted to interview lesbian
couples? It is practically impossible to get a
sampling list of lesbian couples.
• You could go to a gay and lesbian group and
interview people, but you would then be
limiting yourself to lesbians who are activists.
Nonprobability Samples—Snowball
• When you interview a lesbian who is in the group,
you ask her to share with you the names of other
lesbians who are not in the group. When you
interview them, you ask them to give you the names
of still other lesbians.
• Several “points of entry” are important
• PFLAG would give you gays/lesbians whose parents
were supportive
• Gay and Lesbian groups would give you gays/lesbians
whether their parents were supportive or not.
• Snowballing would give you gays/lesbians who were
“out.” IRB issues might be a problem.
Nonprobability Samples for
Qualitative Studies
• For qualitative studies, purposive or elite sampling
has decided advantages over probability sampling.
• The researcher wants to tap the full range of people,
and because the interviews are so labor intensive,
the sample must be small, at least in most
qualitative studies.
• If you are limited to interviewing 20 participants
in your study, you want to select them
purposively.
Nonprobability Samples for
Qualitative Studies
• Suppose you were studying the effects of a change in the
welfare system on parents.
• You will want the perspective of both mothers and fathers,
unemployed and underemployed parents, single parents,
cohabiting partners, married parents, and parents with different
racial or ethnic backgrounds.
• You may also need the perspective of social service providers in
the welfare system.
• If you randomly sampled 20 participants, you would not
get this diversity. You need to purposively select each
participant based on the information value they have.