m4_p2_study_design
Properties of Well-Designed Studies
Learning Objectives
By the end of this lecture, you should be able to:
– Define ‘control group’
– Contrast with ‘experimental group’
– Give examples of different methods for creating a control group
– Define blinding, group stratification, group randomization
– Speculate on possible stratifications for optimal study design
As with the previous lecture, this too is not a “numbers” oriented
lecture. It’s not difficult, but there is a little bit of terminology
involved. You have only ‘gotten it’ when you can describe these
terms in your own words with examples. It may take a couple of
views/reviews.
Avoiding bias when conducting your experiment
How can we reduce bias?
A major objective in research.
• Control group: Any good experiment should include a control group.
• Blinding: When the subjects and, ideally, the researchers as well do NOT know
which individuals received the ‘treatment’ and which individuals were in the
control group until the experiment is completed.
• Randomize the groups: We will see that it is frequently necessary to stratify your
subjects into groups (beyond just the experimental and control groups). Within each
of these groups, subjects should be assigned to treatment or control at random.
The ideal experiment:
A “randomized, double-blind, controlled” trial.
Assume that all data is biased – it’s just a
matter of degree…
A reputable journal will only publish studies that demonstrate a significant
effort to minimize bias.
Comparative experiments
Experiments are comparative in nature: We compare the response to a treatment to:
– Another treatment
– No treatment (a control)
– Older / original treatment (another form of control)
– A placebo (another form of control)
– Any combination of the above
A control is a group to which an experimental treatment is NOT administered. It serves as a
reference mark for comparison (e.g., a group of subjects that do not receive the “new” drug, or a
group of subjects that is given a placebo).
A placebo is a fake treatment, such as a sugar pill. This is to test the hypothesis that the response is
due to the actual treatment and not to the subject’s belief that they were treated. In many studies,
the control group is given a placebo.
Without a control group, you should be very, very skeptical about any conclusions drawn as a
result of the experiment!
Control Group
• Any proper study will always discuss the “controls”. The “control group”
refers to the group that was used as a comparison group with the
“treatment” group.
• As was said previously: Without a control group, you should be very, very
skeptical about any conclusions that come out of the experiment!
Example of experimental and control groups:
Suppose you are a pharmaceutical company that has come up with what you
believe is a breakthrough drug for diabetes.
• Experimental Group: In your study, you will give one group your new
wonder-drug. This group is called the experimental group.
• Control Group: For comparison, any decent study will include a control
group. Examples of control groups:
– Give this group a placebo (perhaps the most common ‘control’)
– Give this group the “older” version of the drug
– Give this group NO drug (however you then sacrifice ‘blinding’)
Control Group - Examples
Figure: one control group and three experimental groups plotted together.
Control and treatment groups are often graphed together like this to
highlight differences, or the lack of differences.
Placebos
Can exist in many forms:
– In a drug trial, the placebo might be a completely inert drug that looks exactly
like the experimental drug and is administered in the same way.
– In studies evaluating acupuncture, a great choice for placebo was a needle that
felt exactly like an acupuncture needle, but did not actually penetrate the skin.
– In a study involving prayer, the experimental group was prayed for, while the
placebo group was only told that they were being prayed for.
Blinding
Blinded: If the patient doesn’t know if they are in the experimental
group or in the control group, the study is said to be ‘blinded’.
Double-Blinded: When both the subject AND the people involved in
carrying out the experiment (e.g. researcher, nurses, etc) don’t know
who is in the control group and who is in the experimental group.
Double-blinded studies are strongly preferred over single-blinded
studies.
Example: In clinical drug trials, a patient is sometimes given a barcode, which
they wear on a wristband. The medications are not labeled by name either;
they carry only a barcode. The researcher or nurse giving the
medication scans the wristband and matches it with the appropriate
medication barcode. So neither the patient nor the researcher knows
whether the patient is getting the treatment or the placebo/control. Only at the
end of the study will the patients and researchers find out who was
in the “experimental group” and who was in the “control group”.
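The barcode idea above can be sketched in a few lines of Python. This is only an illustration under assumptions: the patient IDs, the `MED-` code format, and the function name `make_blinded_allocation` are all hypothetical, and a real trial would use a validated allocation system run by someone outside the care team.

```python
import random

def make_blinded_allocation(patient_ids, seed=0):
    """Randomly split patients into two arms and hide the arm behind opaque codes."""
    rng = random.Random(seed)
    ids = list(patient_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    # The true arm of each patient (never shown to patients or staff).
    arms = {pid: ("treatment" if i < half else "placebo") for i, pid in enumerate(ids)}
    # Unique random codes that carry no hint of the arm.
    codes = rng.sample(range(100000, 1000000), len(ids))
    # What staff see at the bedside: wristband ID -> medication code.
    dispensing_sheet = {pid: f"MED-{code}" for pid, code in zip(ids, codes)}
    # The sealed key, opened only when the study is complete.
    sealed_key = {dispensing_sheet[pid]: arms[pid] for pid in ids}
    return dispensing_sheet, sealed_key

# Hypothetical wristband IDs P001..P008.
sheet, key = make_blinded_allocation([f"P{n:03d}" for n in range(1, 9)])
for pid in sorted(sheet):
    print(pid, sheet[pid])  # the arm does not appear anywhere on this sheet
```

The point of the design is that the dispensing sheet and the sealed key are separate objects: the people giving the medication only ever handle the first one.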
Designing “controlled” experiments
Sir Ronald Fisher, often called the “father of statistics”, was sent to
Rothamsted Experimental Station, an agricultural research station in the
United Kingdom, to evaluate the success of various fertilizer treatments.
Fisher found that the data from experiments that had been going on for decades was
basically worthless because of poor experimental design.
– Fertilizer had been applied to a field one year and not the following year, in order to compare
the yield of grain produced with vs. without the fertilizer.
– What are the flaws in this research methodology?
• It may have rained more or been sunnier during different years.
• The seeds used may have differed between years as well.
– In one case, fertilizer was applied to one field and not applied to a nearby field in the same year.
– BUT:
• The two fields might have had different soil, sun exposure, water, drainage, and farming
history (that is, the two fields may have been farmed differently in previous years).
• In other words, many factors affecting the results were “uncontrolled.”
Any suggestions for a valid control group?
Setting up ‘controls’
• In this particular experiment, you’d like to “control for” the various
confounding variables that exist in this experiment:
– Different soil
– Different sun exposure
– Different water drainage
– Different farming patterns
– etc (it would be possible to come up with several others)
• Fisher came up with a very clever experiment design that did a terrific
job of “controlling for” the confounding variables.
Fisher’s (elegant!) solution:
• In the same field and same year, apply fertilizer
to randomly spaced plots within the field.
• Analyze plants from similarly treated plots
together.
Figure: a field divided into a grid of plots, with the fertilized plots
(marked F) scattered at random among unfertilized plots.
This was a great solution! Both the
experimental group (the fertilized areas) and
the control group (the non-fertilized areas)
were exposed to the same sunlight, weather,
drainage, farming patterns, etc.
Note how in this experiment there is:
• A control group: The areas that were not fertilized
• Randomization: The plots were randomized to either the fertilizer group or
the control group.
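As a rough illustration of the idea of randomly spaced plots, here is a small Python sketch. The grid size (6 by 8) and the number of fertilized plots (24) are made-up values for the example, not figures from Fisher's actual experiments, and `randomize_plots` is a hypothetical helper name.

```python
import random

def randomize_plots(rows, cols, n_fertilized, seed=0):
    """Return a grid in which n_fertilized plots, chosen at random, are marked 'F'."""
    rng = random.Random(seed)
    plots = [(r, c) for r in range(rows) for c in range(cols)]
    # Every plot has the same chance of being fertilized.
    fertilized = set(rng.sample(plots, n_fertilized))
    return [["F" if (r, c) in fertilized else "." for c in range(cols)]
            for r in range(rows)]

# Assumed layout: 48 plots in one field, half of them fertilized.
grid = randomize_plots(rows=6, cols=8, n_fertilized=24)
for row in grid:
    print(" ".join(row))  # 'F' = fertilized plot, '.' = control plot
```

Because the fertilized and control plots are interleaved at random in one field and one season, differences in soil, sun, or drainage tend to affect both groups about equally.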
Randomization
Recall how with samples, we randomize so that no one group is over-represented.
Similarly, when we place subjects into an experimental or control group, we are careful
to do so randomly. (We don’t put our buddy in the experimental group to make sure “he gets
the good stuff.”)
Key Point: All decent studies will randomize which subjects are in the control group vs
which are in the experimental group.
For example, if you are comparing a new cancer treatment vs the ‘older’ treatment,
which patients get the new treatment and which get the older treatment must be
decided at random.
Completely randomized designs
Completely randomized experimental designs:
Individuals are randomly assigned to groups, then
the groups are randomly assigned to treatments.
Which of the two groups is the control group?
Group 1 is the “experimental group”
Group 2 is the “control group”
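The two-step procedure described above (individuals to groups at random, then groups to treatments at random) might look like this in Python. The subject labels and the function name `completely_randomized` are assumptions made for the sketch.

```python
import random

def completely_randomized(subjects, treatments, seed=0):
    """Randomly split subjects into groups, then randomly give each group a treatment."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    k = len(treatments)
    # Step 1: deal the shuffled subjects into k groups of (nearly) equal size.
    groups = [shuffled[i::k] for i in range(k)]
    # Step 2: randomly decide which group receives which treatment.
    labels = list(treatments)
    rng.shuffle(labels)
    return dict(zip(labels, groups))

# Hypothetical roster of 20 subjects, S01..S20.
subjects = [f"S{n:02d}" for n in range(1, 21)]
design = completely_randomized(subjects, ["experimental", "control"])
for treatment, members in design.items():
    print(treatment, len(members), members)
```

Note that the researcher never chooses who goes where; the only inputs are the roster and the list of treatments.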
Some key principles of experimental design
• Control the effects of lurking variables on the response, by comparing the
treatment you are interested in with a second group who either receives a
placebo, or a different treatment.
• Randomize – use some kind of randomization technique to assign subjects
to treatments – in other words, the researcher does not pick who goes in
the treatment group and who goes in the control group.
• Blind: This is another major factor – particularly in medical trials. Neither
the experimenter nor the subjects should be aware which subjects are
receiving the experimental treatment and which subjects are receiving the
control treatment.
Stratification
Individuals (or observations) in a study must be properly stratified (grouped) to try and
ensure that no one batch of people/observations is over-represented in the control group
or in any of the experimental group(s).
Example: Testing a new cancer treatment vs. the old treatment:
– Both treatments must be given to patients with similar severity of disease. So you might
stratify based on the stage of the disease.
– You might suspect that people of different ethnic groups (specifically Northern European
ancestry) will respond differently to your medication. So you might stratify based on those
from Northern European ancestry and those that are not.
– etc
Example: Suppose you suspect that men and women would respond differently to the
treatment. What is one change you should make to your study?
– Answer: try to ensure that you place about equal numbers of men in each group (control
group and each experimental group). Do the same with the women.
This process of organizing your subjects into various blocks according to certain
categories (age, race, severity of illness, etc.) is called stratification.
Block aka “stratified” designs
In a block, or stratified, design, subjects are divided into groups, or blocks, prior to
experiments, to test hypotheses (i.e. theories) about differences between the
groups.
You can stratify based on the treatment, but you can also stratify based on the
subjects (e.g. different ages, different races, different stages of disease, etc).
For example, suppose you are evaluating three different acne treatments on a
group of teenagers between 14 and 16 years old. You would want to randomize
into a minimum of four groups (one group for each treatment, and the control
group)
Can you spot a potentially major flaw in this study?
Gender! At this age, there are all kinds of hormonal changes affecting teenagers, and
they affect acne production differently in males vs females. So you would want
to stratify based on gender as well.
As a result, in order to do this study properly, we would need eight groups!
Boys: 3 treatments + control. Girls: 3 treatments + control.
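A sketch of how the eight groups could be formed: split the subjects by gender first, then randomize to the four arms separately inside each block. The roster of 24 boys and 24 girls, the arm names, and the function name `stratified_assignment` are hypothetical.

```python
import random
from collections import defaultdict

ARMS = ["treatment A", "treatment B", "treatment C", "control"]

def stratified_assignment(subjects, stratum_of, arms=ARMS, seed=0):
    """Randomize subjects to arms separately within each stratum (block)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for s in subjects:
        strata[stratum_of(s)].append(s)
    assignment = {}
    for stratum in sorted(strata):
        members = strata[stratum]
        rng.shuffle(members)  # random order within the block
        for i, s in enumerate(members):
            # Dealing round-robin keeps the arms balanced inside each block.
            assignment[s] = (stratum, arms[i % len(arms)])
    return assignment

# Assumed roster: 24 boys and 24 girls, aged 14 to 16.
subjects = [("boy", i) for i in range(24)] + [("girl", i) for i in range(24)]
assignment = stratified_assignment(subjects, stratum_of=lambda s: s[0])

counts = defaultdict(int)
for cell in assignment.values():
    counts[cell] += 1
for cell in sorted(counts):
    print(cell, counts[cell])  # 8 cells (2 genders x 4 arms), 6 subjects each
```

With this approach, boys and girls are equally represented in every arm, so a difference between arms cannot simply be a difference between genders.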
Stratifying into two blocks of three groups
We divide the subjects into groups, or blocks, prior to the experiments.
This allows us to test hypotheses about differences between the groups.
(Note: There also must be a fourth group for each block, the control.
However, it is not shown in this diagram).
To stratify, or not to stratify…
A researcher wishes to examine the relationship between resting pulse
rates and age. A sample of 52 people had their pulse rate
measured at rest in the lab. Would you stratify?
Answer: Yes. Fitness level: People who do lots of
endurance sports typically have lower resting rates.
Similarly gender: Men and women typically have
different resting pulse rates, so this experiment should
also be stratified by gender.
A researcher wants to determine if BST, a hormone intended to spur
greater milk production, works as advertised. A farming research facility
makes available 60 cattle. Can you think of possible stratifications you
might need?
Answer: Different breeds of cattle may respond differently to
this hormone. As a result, you should consider stratifying by
breed.
Weaknesses in experimental design
• There is no such thing as the perfect experiment. Your goal is to decide
whether any of the limitations in the design are significant enough to limit
the validity of the conclusions.
• Unfortunately, outside of reputable journals, badly designed experiments
are extremely common.
– Which is not to say that “reputable” journals do not also allow shoddy
research to slip through at times – it most certainly does happen!
Example of a randomized, double-blind controlled trial
A major cancer center is excited to hear about a promising new treatment for pancreatic cancer. So:
• They contact all of the patients in their files with this condition.
• They find 408 patients who agree to be in their trial.
• They exclude from their trial 11 patients who say they are moving out of state, since that group cannot be
monitored by the center.
• They exclude 43 others from the trial because they have other significant medical illnesses, which would
be confounding.
• Stratification: Now they have 354 patients remaining. They suspect that men and women will respond
differently to the drug. They also suspect that people will respond differently based on their age. So
they stratify based on both of these variables.
– Gender: 190 are female and 164 are male.
– Age: They use the age groups: 20-40 / 40-60 / 60-80
• Randomization and Control: Within each of these 6 groups (the 3 age groups, each of which is also
stratified by gender), the patients are randomly assigned to receive either the usual treatment (the
control group) or the new treatment (the experimental group). We now have 12 different groups! But
that’s okay, provided that each group is of a reasonable size.
• Blinding: The researchers set up the study to be double-blinded. That is, neither the patients nor the
physicians know which patient is receiving which treatment. They will not find out until the study has
been completed.
• Very good! Yet, there are still some flaws in the design of this study…
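The bookkeeping in this example is easy to check with a few lines of Python. All the figures come from the example above; the one added quantity is the average group size, which assumes (unrealistically) that patients are spread evenly across the 12 groups.

```python
# Enrollment figures taken from the example above.
agreed = 408
excluded_moving = 11        # moving out of state
excluded_other_illness = 43 # other significant medical illnesses

remaining = agreed - excluded_moving - excluded_other_illness
print(remaining)  # 354

# The gender split should account for everyone who remains.
females, males = 190, 164
assert females + males == remaining

age_groups = 3  # 20-40 / 40-60 / 60-80
genders = 2
arms = 2        # usual treatment (control) vs new treatment (experimental)

strata = age_groups * genders
groups = strata * arms
print(strata)  # 6
print(groups)  # 12

# Average group size IF patients were spread evenly (an assumption; real
# strata will be uneven, e.g. few pancreatic cancer patients are aged 20-40).
print(remaining / groups)  # 29.5
```

An average of about 30 per group sounds workable, but the actual sizes depend on how patients fall into the age and gender strata, which is why the "reasonable size" caveat matters.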
Limitations/Flaws in the pancreatic cancer study?
• Stage of cancer – Drugs will affect the cancer differently depending on
how advanced the disease is when the treatment begins.
• Choice of age groups – The choice seems a bit arbitrary.
• Lack of placebo control – It’s always great to have a placebo group as one
of your controls, but often, you cannot. In this case, there are ethical
constraints.
Ethics: Why couldn’t we use a placebo as the control?
It would not be ethical to take patients with cancer and randomly give one block
of them no treatment at all just for the purpose of improving the validity of your
experiment.
• Thoughts?
– Survey: Obtained 36,000 physician office fax numbers, delivered ~16,000 faxes and
received ~700 replies. Their respondents were mostly private practice physicians, and
mostly mid-career. (Source: http://www.dpmafoundation.org/physician-attitudes-onmedicine.html)
– The Doctor Patient Medical Association (DPMA) and the Patient Power Alliance (PPA)
work to repeal health care reform and call themselves "a nonpartisan association of
doctors and patients dedicated to preserving free choice in medicine." The organization
is a member of the National Tea Party Federation and the "American Grassroots
Coalition".
– Note which magazine published this article - hardly a fly-by-night magazine!
• I.e. Even legitimate magazines and news sources are frequently guilty of publishing “studies”
and other polls that are so riddled with flaws as to be completely meaningless.
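One quick way to see the problem with this survey is to work out its response rate. The sketch below uses the approximate figures quoted above (the "~" values are rounded in the source, so the percentages are rough).

```python
# Approximate figures quoted for the fax survey.
fax_numbers = 36_000  # physician office fax numbers obtained
delivered = 16_000    # faxes actually delivered (approximate)
replies = 700         # replies received (approximate)

rate_of_delivered = replies / delivered
rate_of_all_numbers = replies / fax_numbers

print(f"{rate_of_delivered:.1%} of delivered faxes got a reply")   # 4.4%
print(f"{rate_of_all_numbers:.1%} of all fax numbers got a reply") # 1.9%
```

With more than 95% of the physicians who received a fax not answering, and the respondents choosing for themselves whether to reply, the sample is neither random nor likely to be representative.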
Example – Claudication Study (on web page)
• Methods: The first thing they mention is IRB approval; Randomized; Design: 3 groups; Location (Northwestern)
• Inclusion & Exclusion Criteria: defining the population
• Measurement: How they measured the results – sometimes straightforward, sometimes can be a huge and
contentious issue. How do you measure pain symptoms? How do you measure improvement?
• Blinding: Obviously could not be double-blinded since patients knew their ‘treatment’. However, researchers
were blinded. They just saw the data results. They did not know which patients were in which group as the
experiment was going on.
• Details: Many other issues and techniques employed by the study are explained in careful detail.
• Stratifications (Blocks): Claudication vs No Claudication.
• Control group: Nutritional consulting, regular meetings with data-gathering team, etc, but NO exercise.
• Outcomes: In particular note the very frequent mention of p-values and confidence intervals. Very important,
and we will be learning about them.
• Charts and graphs:
– p159: Breakdown of stratifications. Also note the ‘exclusion’ disclaimer at the bottom of the graph. If
you’re gonna leave people out of your analysis, you’d better explain why. In this case, 4 were left out in
the end because they did not respond to follow-up.
– Table 1, p.170: A careful breakdown and description of the people in each stratum (block)
• Conclusion: A study should at some point summarize the researchers’ recommendations on what the study
can tell us. In this study it is in the very last paragraph: “Physicians should recommend supervised treadmill
exercise programs for PAD patients regardless of whether they have classic symptoms of intermittent
claudication”.