Sampling and Measurement

Sampling, Measurement, Validity
and Reliability
Sampling
• Many full scientific texts have been written about
this subject, but it is also a general phenomenon –
we all come to conclusions based on samples of
experience that we have had.
• Why sample –
• More economical and efficient
• May be more accurate
• More able to control for biases due to over- or underrepresentation of some population segment
Sampling Terms to Know
• 1. Sampling – the process of selecting a part of the
population to represent the entire population
• 2. Population – an entire aggregation of cases
which meet a designated set of criteria – all
nurses, all BSN nurses, all nurses in Hamilton
County
• Accessible population – all cases which conform to the criteria
and which are accessible for the study
• Target population – the entire field of cases which conform to
the criteria
Sampling Terms to Know
• 3. Sampling Unit – elements or a set of
elements used for sampling – if you want a
sample of BSN students, send
questionnaires to BSN schools – the school
is the sampling unit and each student is an
element (the most basic unit about which
information is collected).
Sample Size and Sample Error
• Sample size – always use the largest sample
possible. In general, a sample size should
be at least ten for every subdivision of the
data. 20-30 is preferable. The absolute size
is more important than the relative size.
• Sampling error – the difference between
values obtained from the sample and the
values of the whole population.
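Sampling error can be illustrated with a minimal sketch. The population below is made-up data (hypothetical ages of 1,000 nurses), and the function name `sampling_error` is an assumption introduced for illustration. The sketch compares the sample mean with the population mean for a few sample sizes.

```python
import random
import statistics

def sampling_error(population, sample):
    """Difference between the sample mean and the population mean."""
    return statistics.mean(sample) - statistics.mean(population)

# Hypothetical population: ages of 1,000 nurses (made-up values).
rng = random.Random(0)
population = [rng.randint(22, 65) for _ in range(1000)]

# Draw samples of increasing absolute size and report the error of each.
for n in (10, 30, 300):
    sample = rng.sample(population, n)
    print(n, round(sampling_error(population, sample), 2))
```

Any single draw can land close to or far from the population value by chance. It is the typical size of the error across many draws that shrinks as the sample grows.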
Steps in Sampling
• Identify the target population
• Identify the part of the population that is
accessible to you
• Ask the sample subjects for cooperation
• Collect data
• Interpret the results based on the sample –
be realistic and conservative
Types of Samples
• Representative sample – a sample in which the
key characteristics of the elements closely
approximate those of the population
• Probability sample – a sample that uses some form
of random selection in choosing the elements – the
researcher can specify the probability that each
element of the population would be included
• Non-probability sample – a sample in which the
elements are not chosen by random selection
Types of Samples
• Non-probability sample – the elements are
selected by non-random methods. This type of
sampling is more convenient and economical.
– Convenience sample – this is where the researcher uses
the most readily available persons – also called
accidental samples – such as the first persons who come
into a supermarket or a clinic. This is the weakest
method of sampling
– Snowball sample – persons known to the researcher are
asked to participate, then asked to give the names of
others they know with the same characteristic.
Types of Samples
– Quota sampling – the researcher identifies different
strata of the population and determines the proportions
of elements needed from those various segments of the
population (establishes a quota and fills the quota as the
elements present themselves)
– Purposive sampling (judgmental sampling) – the
researcher’s knowledge about the population is used to
handpick the elements to be included so that the sample
meets “the widest type variety” or the “typical” element.
It is good for testing instruments or validating tests, but
it does risk bias.
– Sequential sampling – Sample one person at a time
until you prove or disprove a statement (“Seven out of
10 times ASA works better.”)
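The quota sampling procedure described above (set a quota per stratum, then fill it as elements present themselves) can be sketched as follows. The function name, the arrival list, and the quotas are hypothetical values chosen for illustration.

```python
def quota_sample(arrivals, stratum_of, quotas):
    """Fill each stratum's quota in order of arrival (non-random)."""
    remaining = dict(quotas)          # copy so the caller's dict is untouched
    sample = []
    for person in arrivals:
        stratum = stratum_of(person)  # the person is classified on arrival
        if remaining.get(stratum, 0) > 0:
            sample.append(person)
            remaining[stratum] -= 1
        if not any(remaining.values()):
            break                     # every quota is filled
    return sample

# Hypothetical arrivals at a clinic: (sex, arrival number).
arrivals = [("F", 1), ("F", 2), ("M", 3), ("F", 4), ("M", 5), ("M", 6)]
chosen = quota_sample(arrivals, lambda p: p[0], {"F": 2, "M": 2})
print(chosen)
```

Because elements are taken in order of arrival, nothing in this procedure is random, which is why quota sampling is classed as a non-probability method.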
Types of Sampling
• Probability Sampling
– Simple random sampling – establish a list from which
the sample will be chosen (a sample frame) and number
all elements consecutively. Use a table of random
numbers or a computer to draw numbers. This
guarantees that the differences in attributes of the
sample and of the population are purely a function of
chance and that the probability of selecting a deviant
sample is low. As the size of the sample increases, the
probability of its deviance from the attributes of the
population decreases.
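A minimal sketch of simple random sampling, with the computer standing in for the table of random numbers. The sampling frame of 200 nurse identifiers is hypothetical, and `simple_random_sample` is a name introduced here for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n elements without replacement; every element is equally likely."""
    rng = random.Random(seed)
    # The frame's positions 0..len-1 act as the consecutive numbering;
    # rng.sample draws n distinct numbers from it.
    numbers = rng.sample(range(len(frame)), n)
    return [frame[i] for i in sorted(numbers)]

# Hypothetical sampling frame of 200 nurses.
frame = [f"nurse_{i:03d}" for i in range(1, 201)]
sample = simple_random_sample(frame, 20, seed=42)
print(sample)
```

Passing a fixed `seed` only makes the draw repeatable for checking. Leaving it as `None` gives a fresh draw each time.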
Types of Sampling
– Stratified random sample – mutually exclusive
segments of the population are established by one or
more specifications (male/female; below 30 yrs/30-45 yrs/46 yrs and over; diploma/ADN/BSN) and
elements are picked randomly from each stratification
of the population. (Decisions about which strata the
elements belong to are made before the selection as
opposed to quota sampling where the person is
questioned and then put into a stratum.) This method
increases representativeness
– Proportional – elements in proportion to population
– Disproportional – to compare greatly unequal proportions
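Proportional stratified sampling can be sketched as below. The population of 100 nurses split by educational preparation is made-up data, and `stratified_sample` is a hypothetical helper. Each element's stratum is known before selection, and elements are then drawn at random within each stratum.

```python
import random
from collections import defaultdict

def stratified_sample(elements, stratum_of, n, seed=None):
    """Proportional stratified random sample of roughly n elements."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in elements:
        strata[stratum_of(element)].append(element)
    sample = []
    for members in strata.values():
        # Allocate in proportion to the stratum's share of the population.
        # Rounding means the total can differ slightly from n.
        k = round(n * len(members) / len(elements))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 diploma, 30 ADN, 10 BSN nurses.
population = ([("diploma", i) for i in range(60)]
              + [("ADN", i) for i in range(30)]
              + [("BSN", i) for i in range(10)])
sample = stratified_sample(population, lambda e: e[0], 20, seed=1)
print(len(sample))
```

A disproportional design would replace the proportional allocation `k` with a fixed number per stratum, so that a small stratum such as BSN is large enough to compare.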
Types of Sampling
– Cluster sampling – this is a process in which a
successive random sampling of units is drawn (states,
then cities, then districts, then blocks, then households)
moving from the largest unit down to the basic element.
It is also called multi-staged sampling. The sampling
error may be larger with it.
– Systematic sampling – the researcher selects every kth
person from a list or a group. Selecting every 10th
person walking by is not random, and selection from a
list is random only if the first number is drawn at random.
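Systematic sampling with a randomly drawn starting point can be sketched as follows. The list of 100 numbered elements and the function name are assumptions made for illustration.

```python
import random

def systematic_sample(frame, k, seed=None):
    """Take every k-th element, starting from a randomly drawn position."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # random start within the first interval
    return frame[start::k]

# Hypothetical list of 100 numbered elements, sampling every 10th.
frame = list(range(100))
chosen = systematic_sample(frame, 10, seed=1)
print(chosen)
```

Only the starting point is random here. Once it is drawn, the rest of the sample is fixed by the interval `k`.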
Measurement
• Measurement is assigning numbers to objects to
represent quantities of attributes or concepts.
• Measurement procedures are operational
definitions of concepts or attributes – the concept
or attribute should really exist although it may be
an abstraction
• Measurement always deals with abstraction – you
don’t measure a person, but a characteristic of that
person
Measurement
• Numbers are assigned to quantify an attribute –
“whatever exists, exists in some amount and can
be measured.” The variability of an attribute is
capable of numerical expression which signifies
how much of the attribute is present in the
element.
• Rules for measuring may have to be invented.
The researcher must specify under what conditions
and according to what criteria, and in what
increments, numerical values are to be assigned.
• Measurement should have a rational
correspondence to reality
Advantages of Measurement
• What would you work with if you did not have
measurement of height, weight, temperature –
intuition, guesses, personal judgment
• Objectivity – scoring minimizes subjectivity.
Analytical procedures are not subjective
• Communication – numbers constitute a non-ambiguous language
Levels of Measurement
• Nominal scale – measurement at its weakest –
numbers or other symbols are used to classify an
element – such as a psychiatric diagnostic number
- 295. You can partition a given class of elements
into a set of mutually exclusive subclasses –
295.30, 295.20. The only relationship involved is
equivalence (=). The kinds of statistics that can be
used with this type of measurement are modes and
frequency counts. You can test hypotheses
regarding distribution of cases among categories
(χ², the chi-square test).
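The statistics available at the nominal level can be sketched with a short example. The list of diagnostic codes is made-up data, and `chi_square_uniform` is a hypothetical helper that tests the simplest hypothesis about distribution among categories: that all categories are equally frequent.

```python
from collections import Counter

# Hypothetical nominal data: diagnostic codes for six cases.
codes = ["295.30", "295.20", "295.30", "295.30", "295.20", "295.10"]

counts = Counter(codes)               # frequency count per category
mode = counts.most_common(1)[0][0]    # most frequent category

def chi_square_uniform(counts):
    """Chi-square statistic against equal expected frequencies."""
    n = sum(counts.values())
    expected = n / len(counts)
    return sum((obs - expected) ** 2 / expected for obs in counts.values())

print(dict(counts), mode, chi_square_uniform(counts))
```

The function returns only the test statistic. Judging significance requires comparing it with the chi-square distribution for (number of categories - 1) degrees of freedom, and expected frequencies this small would be too low for a real test.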
Levels of Measurement
• Ordinal scale – this measurement shows
relationships among classes such as higher
than, more difficult than, etc. It involves
two relations: equivalence (=) and greater
than (>). The researcher can test hypotheses
using non-parametric statistics of order and
ranking such as the Spearman Rank Order
Correlation or the Mann Whitney U.
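A minimal sketch of the Spearman rank-order correlation, written in plain Python so that the role of ranking is visible. It converts each variable to ranks (tied values share their average rank) and then computes a Pearson correlation on the ranks. The function names and example scores are assumptions for illustration.

```python
def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical ordinal ratings from two judges for five subjects.
print(spearman([1, 2, 3, 4, 5], [2, 4, 5, 8, 20]))
```

Because only the order of the scores matters, the unevenly spaced second list still agrees perfectly with the first.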
Levels of Measurement
• Interval scale – this is similar to the ordinal
scale, but the distance between any two
numbers is of a known size. All parametric
tests are able to be used – mean, standard
deviation, Pearson correlation, T-test, F-test,
etc. It involves three relations: equivalence
(=), greater than (>), and a known ratio of
any two intervals.
Levels of Measurement
• Ratio scale – it is like the interval scale, but
it has a true zero point as its origin. You can
use arithmetic with it and all parametric
tests as well as those involving geometric
means. It involves four relationships:
equivalence(=), greater than(>), the known
ratio of any two intervals, and the known
ratio of any two scale values.
Reliability and Validity - Criteria
for Assessing Measuring Tools
• Every score is part true and part error
• Sources of errors in scores
– Situational contaminants
– Response set bias
– Transitory personal factors
– Administration variations
– Instrument clarity
– Response sampling (a person scores 95% and 90% on two tests
which claim to test the same thing)
– Instrument format
Reliability
• This is the major criterion for assessing a
measuring instrument’s quality and adequacy. It is
the consistency with which the instrument
measures the attribute it is supposed to be
measuring.
– The reliability of an instrument is not a property of the
instrument, but rather of the instrument when
administered under certain conditions to a certain
sample. (A death anxiety instrument would not measure
the same when given to teenagers as it measures for
geriatric patients.)
Ways to Check Reliability
• Stability (test-retest reliability) – the same test is
given to a sample of individuals on two occasions,
then the scores are compared by computing a
reliability coefficient. (A reliability coefficient is a
correlation coefficient between the two scores)
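Test-retest reliability can be sketched as a correlation between two administrations of the same test. The scores below are made-up data for six individuals, and the Pearson coefficient is assumed as the correlation coefficient, since the source does not name one.

```python
def reliability_coefficient(first, second):
    """Pearson correlation between scores from two administrations."""
    n = len(first)
    mean_1, mean_2 = sum(first) / n, sum(second) / n
    cov = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    var_1 = sum((a - mean_1) ** 2 for a in first)
    var_2 = sum((b - mean_2) ** 2 for b in second)
    return cov / (var_1 * var_2) ** 0.5

# Hypothetical scores for six individuals on two occasions.
time_1 = [42, 35, 50, 28, 44, 39]
time_2 = [40, 37, 49, 30, 45, 36]
r = reliability_coefficient(time_1, time_2)
print(round(r, 2))
```

A coefficient near 1 means individuals kept roughly the same relative standing on both occasions.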
• Internal consistency (homogeneity) – all of the
subparts of the instrument must measure the same
characteristic. Use the split-half technique – split
the test items in half, score each half, then
compare the scores using a correlation coefficient;
or compare each item (by correlation) with the
total score.
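The split-half technique can be sketched as follows. The source says only to split the items in half. Splitting into odd- and even-numbered items is an assumption made here, as are the function names and the response data.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def split_half(responses):
    """Correlate each respondent's odd-item score with the even-item score.

    responses: one list of item scores per respondent.
    """
    odd_scores = [sum(r[0::2]) for r in responses]    # items 1, 3, 5, ...
    even_scores = [sum(r[1::2]) for r in responses]   # items 2, 4, 6, ...
    return pearson(odd_scores, even_scores)

# Hypothetical responses: four respondents, six items scored 1-5.
responses = [
    [4, 5, 4, 4, 5, 5],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [5, 4, 5, 5, 4, 4],
]
print(round(split_half(responses), 2))
```

The item-total alternative mentioned above would instead correlate each single item's scores with the respondents' total scores.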
Ways to Check Reliability
• Equivalence – can be tested in two ways
– 1. Inter-rater reliability
• Carefully train observers, develop clearly defined, non-overlapping categories, and use behaviors that are molecular
rather than molar
• Two or more observers watch the same event simultaneously
and independently record variables according to a plan or code
• Reliability is computed:
– Reliability = number of agreements / (number of agreements + number of disagreements)
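The agreement formula above can be sketched directly. The two lists of codes are hypothetical recordings by two observers of the same five events, and the function name is introduced for illustration.

```python
def interrater_reliability(codes_a, codes_b):
    """Agreements divided by agreements plus disagreements."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both observers must code the same events")
    agreements = sum(a == b for a, b in zip(codes_a, codes_b))
    disagreements = len(codes_a) - agreements
    return agreements / (agreements + disagreements)

# Hypothetical codes recorded independently by two observers.
observer_1 = ["on", "off", "on", "on", "off"]
observer_2 = ["on", "off", "off", "on", "off"]
print(interrater_reliability(observer_1, observer_2))
```

This simple proportion does not correct for agreement that would occur by chance.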
Ways to Check Reliability
• Interpretation of reliability coefficients
– If you are interested only in group-level comparisons, a
reliability coefficient of .70 or even .60 is sufficient
(male/female, Dr./nurse, smoker/non-smoker)
– If you are interested in decisions about individuals,
such as who gets into school, then a coefficient of .90
or higher is needed
– If the coefficient were .80, then 80% of the scores’
variability would be true variability and 20% would be
extraneous
Ways to Improve Reliability
• Add more items
• Have a more varied group of subjects – the
more homogeneous the group the lower the
reliability coefficient
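The effect of adding more items can be sketched with the Spearman-Brown prophecy formula. The source does not give a formula. This standard result is brought in here as an illustration, and it assumes the added items are comparable in quality to the existing ones. The function name is hypothetical.

```python
def spearman_brown(r, k):
    """Predicted reliability when the test is lengthened by a factor of k.

    r: current reliability coefficient
    k: new length divided by current length (2 means twice as many items)
    """
    return k * r / (1 + (k - 1) * r)

# A test with reliability .60, doubled and then tripled in length.
print(round(spearman_brown(0.60, 2), 2))
print(round(spearman_brown(0.60, 3), 2))
```

The predicted gain shrinks with each further lengthening, so adding items helps most when the instrument is short.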
Validity
• The degree to which an instrument
measures what it is supposed to be
measuring. Validity is difficult to establish.
An instrument that is not reliable cannot be
valid, but an instrument can be reliable and
still not be valid
Types of Instrument Validity
• Content validity – looks at the sampling adequacy
of the content area – used especially for tests that
measure knowledge of a specific content area. It
is evaluated by examining the extent to which the
content of the test represents the total domain of
behaviors encompassing the ability being
measured. It is usually measured by expert
opinion. It is based on judgment. The more
experts who agree on the content to be included,
the better – a blueprint could be developed.
Types of Instrument Validity
• Criterion-related validity – this establishes a
relationship between the instrument and some
other criterion that measures the same attribute.
The scores on both should correlate highly
indicating directly how valid the instrument is.
– Concurrent validity – the criterion measure is obtained
at the same time the test is given
– Predictive validity – the criterion measure is obtained
some time after the test is given and the test is used to
predict future performance on the criterion measure
Types of Instrument Validity
• Construct validity – asks the question: Is the
abstract concept/construct under investigation
being adequately measured with this instrument –
is there a fit between the conceptual definition and
the operational definition of a variable. One way
to test it is through the known groups technique –
groups expected to differ on the critical attribute
are tested and scores should be different. If the
test is a sample of behaviors characteristic of the
concept, the researcher has to prove that its items
are representative of the content of the concept.
Types of Instrument Validity
• Statistical Conclusion Validity – determines
whether the conclusions drawn about the
relationships are an accurate reflection of
the real world and/or whether the
differences drawn from statistical analyses
are an accurate reflection of the real world.
Interpretation of Validity
• Validity cannot be proved but it can be
supported. The researcher does not validate
the instrument itself, but actually some
application of the instrument
Other Criteria for an Instrument
• Efficiency – the number of items, the time it takes
to complete
• Sensitivity – how small a variation in the attribute
can be detected and measured – use item analysis
of tests
• Objectivity – two researchers should agree about
its measurement
• Comprehensibility – subjects can understand what
to do with it
• Balance – to minimize response sets
• Time allowance – adequate time is available for
completion
• Simplicity