experiment - People Server at UNCW
Simpson’s paradox
An association or comparison that holds for all of several groups can
reverse direction when the data are combined (aggregated) to form a
single group. This reversal is called Simpson’s paradox.
Example: Hospital death rates

              Hospital A   Hospital B
Died                  63           16
Survived            2037          784
Total               2100          800
% surv.            97.0%        98.0%

On the surface, Hospital B would seem to have a better record.

But once patient condition is taken into account, we see that Hospital A
has in fact a better record for both patient conditions (good and poor).

Patients in good condition
              Hospital A   Hospital B
Died                   6            8
Survived             594          592
Total                600          600
% surv.            99.0%        98.7%

Patients in poor condition
              Hospital A   Hospital B
Died                  57            8
Survived            1443          192
Total               1500          200
% surv.            96.2%        96.0%
Patient condition is the lurking variable. Now see Ex. 2.40-2.41, p.144…
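The reversal in the hospital example can be checked directly from the counts. The sketch below is only an illustration: the numbers are the ones in the hospital tables, while the names `counts`, `survival_rate`, `by_condition`, and `combined` are made up for this example.

```python
# (died, survived) counts from the hospital example, split by patient condition.
counts = {
    "good": {"A": (6, 594), "B": (8, 592)},
    "poor": {"A": (57, 1443), "B": (8, 192)},
}


def survival_rate(died, survived):
    """Fraction of patients who survived."""
    return survived / (died + survived)


# Survival rate within each patient condition.
by_condition = {
    condition: {h: survival_rate(*ds) for h, ds in hospitals.items()}
    for condition, hospitals in counts.items()
}

# Survival rate after aggregating over patient condition.
combined = {}
for h in ("A", "B"):
    died = sum(counts[c][h][0] for c in counts)
    survived = sum(counts[c][h][1] for c in counts)
    combined[h] = survival_rate(died, survived)

for condition, rates in by_condition.items():
    print(condition, {h: f"{r:.1%}" for h, r in rates.items()})
print("combined", {h: f"{r:.1%}" for h, r in combined.items()})
```

Hospital A comes out ahead within each condition but behind in the combined figures. The reason is that A treats far more patients in poor condition (1500 of its 2100) than B does (200 of 800).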
Does Association Imply Causation?
• Sometimes, but not always! Look at example 2.42 on page 149
(section 2.6, Explaining Causation) for several x,y variables where
association was found - some are causal, others are not.
• The figure below (Fig. 2.29) gives three possible scenarios
explaining a found association between a response variable y and
an explanatory variable x:
• Association between x and y can certainly be because
changes in x cause y to change - but even when
causation is present, there are still other variables
possibly involved in the relationship. (See #1 in Ex. 2.42)
• Be careful of applying a causal relationship between x
and y in one setting to a different setting: (#2 shows a
causal relationship in rats - does it extend to humans?)
• Common response is an example of how a "lurking
variable" can influence both x and y, creating the
association between them (See #3)
• Confounding between two variables arises when their
effects on the response cannot be distinguished from
each other - the confounding variables can either be
explanatory or lurking… (See #5)
Establishing causation
It appears that lung cancer is associated with smoking.
How do we know that both of these variables are not being affected by an
unobserved third (lurking) variable?
For instance, what if there is a genetic predisposition that causes people to
both get lung cancer and become addicted to smoking, but the smoking itself
doesn’t CAUSE lung cancer?
We can evaluate the association using the
following criteria:
1) The association is strong.
2) The association is consistent.
3) Higher doses are associated with stronger
responses.
4) Alleged cause precedes the effect.
5) The alleged cause is plausible.
HW: read 2.6, go over all the examples in the section
(esp. 2.43, 2.44) and then look at # 2.133-2.145
Obtaining data
• Available data are data that were produced in the past for some other
purpose but that may help answer a present question inexpensively.
The library and the Internet are sources of available data.
– Government statistical offices are the primary source for
demographic, economic, and social data (visit the Fed-Stats site
at www.fedstats.gov).
• Beware of drawing conclusions from our own experience or hearsay.
Anecdotal evidence is based on haphazardly selected individual
cases, which we tend to remember because they are unusual in some
way. They also may not be representative of any larger group of cases.
• Some questions require data produced specifically to answer them.
This leads to designing observational or experimental studies.
Observational study: Record data on individuals without attempting
to influence the responses. We typically cannot prove cause & effect
this way.
Example: Based on observations you make in nature,
you suspect that female crickets choose their
mates on the basis of their health. Observe
health of male crickets that mated.
Experimental study: Deliberately impose a treatment on individuals
and record their responses. Lurking variables can be controlled.
Example: Deliberately infect some males
with intestinal parasites and see whether
females tend to choose healthy rather
than ill males.
– a sample is a collection of data drawn from a
population, intended to represent the
population from which it was drawn
– a census is an attempt to sample every
individual in the population.
– an experiment imposes a so-called treatment
on individuals in order to observe their
responses. This is in opposition to an
observational study which simply observes
individuals and measures variables of interest
without intervention
– go over Examples 3.4-3.5 in Ch. 3, Sample
Surveys & Experiments
Terminology of experiments
• The individuals in an experiment are the experimental
units. If they are human, we call them subjects.
• In an experiment, we do something to the subject and
measure the response. The “something” we do
(explanatory variable) is called a treatment, or factor.
The values of the factor are called its levels. Sometimes
a treatment is a combination of levels of more than one
factor.
– The factor may be the administration of a drug – the different
dosages are its levels.
– One group of people may be placed on a diet/exercise program
for six months (treatment), and their blood pressure (response
variable) would be compared with that of people who did not diet
or exercise. Two levels here: on diet, not on diet
• Go over example 3.8 (Section 3.1) and below –
an example of a designed experiment with two
factors and six treatments. Also see Ex. 3.9
(Section 3.1) for an example of an experiment
not designed well... The lack of a control
group causes the problem...
• If the experiment involves giving two different doses of a
drug, we say that we are testing two levels of the factor.
• A response to a treatment is statistically significant if it
is larger than you would expect by chance (due to
random variation among the subjects). We will learn how
to determine this later.
In a study of sickle cell anemia, 150 patients were given the drug
hydroxyurea, and 150 were given a placebo (dummy pill). The researchers
counted the episodes of pain in each subject. Identify:
• The subjects
– patients, all 300
• The factors / treatments
– 1 factor, 2 levels (hydroxyurea and placebo)
• And the response variable
– episodes of pain
• In principle, experiments can give good evidence for causation
through what we call randomized controlled comparative
experiments.
• The need for comparative experiments is shown in Example 3.9 – a
control group is needed so the experimenter can control the effects
of outside (lurking) variables
• The use of randomization is illustrated in Example 3.10 – a chance
mechanism is used to divide the experimental units into groups to
prevent bias.
• The logic behind randomized comparative
experiments is given on p. 175:
– Randomization produces groups of subjects that
should be similar in all respects before the treatments
are applied
– Comparative design ensures that influences other
than the treatment operate equally on all groups
– Therefore, differences in the response must be due
either to the treatment or to chance in the random
assignment of subjects to the groups.
• This leads to three basic principles of
experimental design in the box on the same
page…
• Control the effects of lurking variables on the
response, usually by comparing two or more
treatments
• Randomize – use a chance mechanism to
assign experimental units to treatments. See
the Table B of random digits discussed on the
later slides…
• Repeat each treatment on many units to reduce
chance variation in the results
• Then if you see differences in the response they
are called statistically significant if they would
rarely occur by chance
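One way to see what "would rarely occur by chance" means is to redo the random assignment on paper. The sketch below is an assumption-laden illustration, not the method these slides teach later: the response values are invented purely for demonstration, and `rerandomization_p_value` is a hypothetical helper. It lists every way the same eight responses could have been split into two groups of four and counts how often the difference in group means is at least as large as the one observed.

```python
from itertools import combinations
from statistics import mean


def rerandomization_p_value(treatment, control):
    """Share of all possible group splits whose difference in means is
    at least as large (in absolute value) as the observed difference."""
    pooled = list(treatment) + list(control)
    n_treat = len(treatment)
    observed = abs(mean(treatment) - mean(control))
    at_least_as_large = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_treat):
        chosen = set(idx)
        group1 = [pooled[i] for i in chosen]
        group2 = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(group1) - mean(group2)) >= observed - 1e-12:
            at_least_as_large += 1
    return at_least_as_large / total


# Invented responses, for illustration only (not data from any study).
treatment = [2, 3, 1, 4]
control = [6, 7, 5, 8]
p_value = rerandomization_p_value(treatment, control)
print(f"{p_value:.4f}")
```

With these invented numbers only 2 of the 70 possible splits give a difference this large, so chance in the random assignment alone would rarely produce it.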
Caution about
experimentation
The design of a study is
biased if it systematically
favors certain
outcomes.
The best way to exclude biases in an experiment is to
randomize the design. Both the individuals and
treatments are assigned randomly.
Other ways to remove bias:
A double-blind experiment is one in which neither the
subjects nor the experimenter know which individuals got
which treatment until the experiment is completed. The goal
is to avoid forms of placebo effects and biases in
interpretation.
The best way to make sure your conclusions are robust is to
replicate your experiment—do it over. Replication ensures
that particular results are not due to uncontrolled factors or
errors of manipulation.
Designing “controlled” experiments
Sir Ronald Fisher—The “father of statistics”
He was sent to Rothamsted Experimental Station
in the United Kingdom to evaluate the success of
various fertilizer treatments.
• Fisher found the data from experiments going on for decades to be
basically worthless because of poor experimental design.
– Fertilizer had been applied to a field one year and not in another in
order to compare the yield of grain produced in the two years. BUT
• It may have rained more, or been sunnier, in different years.
• The seeds used may have differed between years as well.
– Or fertilizer was applied to one field and not to a nearby field in the
same year. BUT
• The fields might have different soil, water, drainage, and history
of previous use.
• Too many factors affecting the results were “uncontrolled.”
Fisher’s solution:
“Randomized comparative experiments”
• In the same field and same
year, apply fertilizer to
randomly spaced plots within
the field. Analyze plants from
similarly treated plots
together.
[Figure: a field in which the fertilized plots (F) sit at randomly spaced positions]
• This minimizes the effect of
variation within the field in
drainage and soil
composition on yield, as well
as controlling for weather.
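Fisher's idea of fertilizing randomly chosen plots within one field can be sketched in a few lines. This is an illustration under stated assumptions: the 6-by-8 grid, the 24 fertilized plots, and the name `assign_plots` are all hypothetical and do not come from the Rothamsted experiments.

```python
import random


def assign_plots(n_rows, n_cols, n_fertilized, seed=None):
    """Pick n_fertilized plots at random from an n_rows x n_cols field.

    Returns the field as a grid of "F" (fertilized) and "." (not fertilized).
    """
    rng = random.Random(seed)
    plots = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    fertilized = set(rng.sample(plots, n_fertilized))
    return [
        ["F" if (r, c) in fertilized else "." for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Hypothetical field: 6 rows of 8 plots, half of them fertilized.
layout = assign_plots(6, 8, 24, seed=1)
for row in layout:
    print(" ".join(row))
```

Because chance decides which plots get fertilizer, differences in soil or drainage across the field should affect fertilized and unfertilized plots about equally.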
A Table of Random Digits can be used to
Randomize an Experiment
• any digit in any position in the table is equally
likely to be any of 0, 1, 2, …, 9
• the digits in different positions are independent
in the sense that the value of one has no
influence on the value of any other
• any pair of random digits has the same chance
of being picked as any other (00, 01, 02, … 99)
• any triple of random digits has the same chance
of being picked as any other (000, 001, … 999)
• and so on…
• Now use Table B to randomly divide the 40
students in Ex. 3.10 into the two groups (phone
1 and phone 2 groups)
– Step 1: Label the experimental units with as few
digits as possible
– Step 2: Decide on a protocol for how you will place
the chosen units into the groups
– Step 3: Start anywhere in the Table and begin reading
random digits. Matching them with labeled
experimental units and following the protocol creates
the groups.
EX. 3.10: We need to randomly divide the 40 students into two groups of 20: those using the first cell phone and those using the second cell phone.
1. List and number (label) all available subjects (the group of 40).
2. Decide that the first 20 students chosen go to the phone 1 group; the
remainder to the phone 2 group (this is the protocol)
3. Read Table B in groups of two digits. Match the digits with the labels
and follow the protocol to form the groups, skipping any pair that matches
no label or repeats a label already chosen.
45 46 71 17 09 77 55 80 00 95 32 86 32 94 85 82 22 69 00 56
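The three steps can be traced on the line of digits above. The sketch assumes the students are labeled 01 through 40 (the slides do not state the labels), so 00 and anything above 40 is skipped, as is any repeated label. The name `select_from_digits` is a hypothetical helper.

```python
digits_line = "45 46 71 17 09 77 55 80 00 95 32 86 32 94 85 82 22 69 00 56"


def select_from_digits(pairs, n_units, group_size):
    """Read two-digit groups in order and keep each one that is an unused label.

    Assumes labels run from 01 to n_units; stops once group_size labels are chosen.
    """
    chosen = []
    for pair in pairs:
        label = int(pair)
        if 1 <= label <= n_units and label not in chosen:
            chosen.append(label)
        if len(chosen) == group_size:
            break
    return chosen


phone1_so_far = select_from_digits(digits_line.split(), n_units=40, group_size=20)
print(phone1_so_far)  # [17, 9, 32, 22]
```

This one line of the table fills only 4 of the 20 places in the phone 1 group, so in practice you keep reading further lines of Table B until 20 students are chosen. The remaining 20 students form the phone 2 group.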
• There are many types of experimental designs
in use today in the sciences…read about these
at the end of section 3.1:
– Completely randomized: all experimental units are
allocated at random among all treatments (Ex. 3.10)
– Block designs: A block is a group of experimental
units or subjects known in advance to be similar in
some way that is expected to affect the response to
the treatments. Knowing this, the experimenter can
create a block design, in which the random
assignment of units is carried out separately within
each block. See examples 3.17-3.19 for some
examples
– Matched pairs: This is a common design in which a
block design is used to compare just two treatments.
Sometimes each subject receives both treatments
(acts as its own control), or there is a “before-after”
design - see Example 3.16
Completely randomized designs
Completely randomized experimental designs:
Individuals are randomly assigned to groups, then
the groups are randomly assigned to treatments.
Block designs
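In a block, or stratified, design, subjects are divided into groups,
or blocks, prior to the experiment, and treatments are randomly assigned
separately within each block. Blocking controls for the blocking variable
and lets us look at differences between the groups.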
The blocking, or stratification, here is by gender.
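A block design randomizes separately inside each block. The sketch below assumes hypothetical subject labels, three hypothetical treatments, and a made-up helper named `block_randomize`. Only the blocking by gender comes from the slide.

```python
import random


def block_randomize(blocks, treatments, seed=None):
    """Randomly assign treatments separately within each block.

    blocks: dict mapping a block name to the list of subjects in that block.
    """
    rng = random.Random(seed)
    assignment = {}
    for members in blocks.values():
        shuffled = list(members)
        rng.shuffle(shuffled)
        # Deal treatments out in turn so each block is split as evenly as possible.
        for position, subject in enumerate(shuffled):
            assignment[subject] = treatments[position % len(treatments)]
    return assignment


# Hypothetical subjects, blocked by gender.
blocks = {
    "women": ["W1", "W2", "W3", "W4", "W5", "W6"],
    "men": ["M1", "M2", "M3", "M4", "M5", "M6"],
}
treatments = ["treatment 1", "treatment 2", "treatment 3"]
assignment = block_randomize(blocks, treatments, seed=2024)
for subject in sorted(assignment):
    print(subject, assignment[subject])
```

Each treatment ends up with two women and two men, so gender cannot be confounded with the treatment.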
Matched pairs designs
Matched pairs: Choose pairs of subjects that are closely matched—
e.g., same sex, height, weight, age, and race. Within each pair,
randomly assign who will receive which treatment.
It is also possible to just use a single person, and give the two
treatments to this person over time in random order (“before”/“after”). In
this case, the “matched pair” is just the same person at different points
in time. Pre/post testing of a new teaching method is another example...
The most closely
matched pair
studies use
identical twins.
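In a matched pairs design the only random step is deciding, within each pair, who gets which treatment. A minimal sketch, assuming hypothetical twin labels and a made-up helper named `assign_within_pairs`:

```python
import random


def assign_within_pairs(pairs, treatments=("treatment 1", "treatment 2"), seed=None):
    """For each matched pair, randomly decide which member gets which treatment."""
    rng = random.Random(seed)
    assignment = {}
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)  # acts like a coin toss for this pair
        assignment[first] = order[0]
        assignment[second] = order[1]
    return assignment


# Hypothetical pairs of closely matched subjects (for example, identical twins).
pairs = [("twin 1a", "twin 1b"), ("twin 2a", "twin 2b"), ("twin 3a", "twin 3b")]
assignment = assign_within_pairs(pairs, seed=7)
for subject, treatment in assignment.items():
    print(subject, "->", treatment)
```

When one person receives both treatments, the same coin toss can instead decide the order in which that person gets them.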
• Read the Introduction & Section 3.1 - pay particular
attention to all the Examples. Make sure you understand
the terminology and the sketches of the types of
designs... Also, make sure you can use Table B to
perform a completely randomized design. Also, try to do
each of the exercises that occur within the text of that
section… then try # 3.17, 3.18, 3.23, 3.27, 3.30, 3.40,
3.44-3.46