Design
Experimental Planning and
the Role of Randomization
Yi Cheng
Hao Ren
Ancient Example of Design
Avicenna (Ibn Sina)
Canon of Medicine
1000 CE
Seven Rules for Medical Experiment
1. The drug must be free from any extraneous, accidental quality.
2. The experimentation must be done with a simple and not a composite disease.
3. The drug must be tested with two contrary types of disease, because sometimes a drug cured one disease by its essential qualities and another by its accidental ones.
4. The quality of the drug must correspond to the strength of the disease.
5. The time of action must be observed, so that essence and accident are not confused.
6. The effect of the drug must be seen to occur constantly or in many cases, for if this did not happen it was an accidental effect.
7. The experimentation must be done with the human body.
Recall
Key principles of Experimental Design:
Randomization
Random assignment of units to treatments
Replication
Use lots of units within a specific study (multiple units per treatment)
Control
Keep other variables constant to investigate the effect(s) of the one(s) of interest
(e.g., blocking, blinding, comparing treatments)
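As a minimal sketch of the first two principles, the snippet below randomly assigns units to treatments with equal replication. The function name `randomize_units`, the plot numbers, and the treatment labels are illustrative assumptions, not part of the original slides.

```python
import random


def randomize_units(units, treatments, seed=None):
    """Randomly split units into equally sized treatment groups.

    Hypothetical helper for illustration: shuffling gives the random
    assignment, and equal group sizes give the replication.
    """
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)  # a fixed seed makes the plan reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)
    reps = len(shuffled) // len(treatments)
    return {
        treatment: sorted(shuffled[i * reps:(i + 1) * reps])
        for i, treatment in enumerate(treatments)
    }


# Twelve hypothetical plots, three treatments, four replicates each.
plots = list(range(1, 13))
design = randomize_units(plots, ["A", "B", "C"], seed=1)
for treatment, assigned in design.items():
    print(treatment, assigned)
```

Control would enter by holding other conditions fixed across the groups, or by running the same shuffle separately within each block of similar units.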
Seven Rules for Medical Experiment
Control
Replication
The danger of confounding effects
The wisdom of observing the effects at many different factor levels
Causal reasoning
“Nothing has changed since Avicenna but the mouse has replaced
the lion as the laboratory animal of choice.”
Seven Rules for Medical Experiment
Look again at Avicenna’s second rule:
“The experimentation must be done with a simple and not a composite disease.”
He says, in effect, to experiment with but one factor at a time.
“One of the most requisite precautions in
experimentation is to vary only one circumstance at a
time, and to maintain all other circumstances rigidly
unchanged.”
-William Stanley Jevons
Now Read Fisher’s Writing in 1926
1890-1962
pioneered the application of statistical
procedures to the design of scientific
experiments
“No aphorism is more frequently repeated in
connection with field trials, than that we must
ask Nature few questions, or, ideally, one
question, at a time.”
Sir Ronald Fisher
But…
“The writer [Fisher] is convinced that this view is wholly mistaken. Nature, he
suggests, will best respond to a logical and carefully thought out questionnaire;
indeed, if we ask her a single question, she will often refuse to answer until
some other topic has been discussed.”
Fisher’s Multifactor designs
A factorial design allows the effects of several factors, and even the interactions between them, to
be determined with the same number of trials as are necessary to determine any one of the
effects by itself with the same degree of accuracy.
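A small worked sketch of this claim, using made-up yields for a 2x2 factorial (two factors, two levels each, two replicate plots per combination). The numbers and variable names are hypothetical. Each estimate below is a contrast over all eight plots, which is why the factorial layout gives every effect the precision of the whole experiment.

```python
# Hypothetical yields, keyed by (level of factor A, level of factor B).
yields = {
    ("low", "low"): [20.0, 22.0],
    ("high", "low"): [30.0, 32.0],
    ("low", "high"): [25.0, 27.0],
    ("high", "high"): [45.0, 47.0],
}


def mean(values):
    return sum(values) / len(values)


# Mean yield of each factor combination.
cell = {combo: mean(obs) for combo, obs in yields.items()}

# Main effect of A: average over B of (high A minus low A).
main_effect_a = (
    (cell[("high", "low")] + cell[("high", "high")]) / 2
    - (cell[("low", "low")] + cell[("low", "high")]) / 2
)

# Main effect of B: average over A of (high B minus low B).
main_effect_b = (
    (cell[("low", "high")] + cell[("high", "high")]) / 2
    - (cell[("low", "low")] + cell[("high", "low")]) / 2
)

# Interaction: half the difference between the effect of A at high B
# and the effect of A at low B.
interaction_ab = (
    (cell[("high", "high")] - cell[("low", "high")])
    - (cell[("high", "low")] - cell[("low", "low")])
) / 2

print(main_effect_a, main_effect_b, interaction_ab)  # 15.0 10.0 5.0
```

A one-factor-at-a-time plan with the same eight plots could not estimate the interaction at all, since it never varies A and B together.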
Rothamsted Experimental Station
Rothamsted Experimental Station was
founded in 1843 by John Bennet Lawes.
It is the longest-running agricultural
research station in the world.
John Bennet Lawes
Broadbalk Wheat Experiment
Each long strip has a different
combination of applied
fertilizers.
Wheat has been grown every
year since 1843.
Some horizontal strips have
been treated differently, such
as being cleared (fallow) to
control weed growth.
The plots with no nitrogen are
palest in color.
Fisher’s Multifactor Design
Fisher revolutionized existing
statistical techniques: he developed
the method of fitting orthogonal
polynomials and used multiple
regression to determine the
relationships between the yield data
and its influential factors for each
year.
“The bible of applied statistics”
Additive Models
Y_ij = μ + α_i + β_j + ε_ij
where:
Y_ij: the crop yield for plot (i,j)
μ: the mean yield for the whole field
α_i: the effect due to seed variety i
β_j: the effect due to fertilizer level j
ε_ij: the random variation for plot (i,j)
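The additive model can be fitted by hand in the balanced case. The sketch below assumes one plot per (variety, fertilizer) combination and the usual sum-to-zero constraints on the effects, under which the least-squares estimates are the grand mean, the row means minus the grand mean, and the column means minus the grand mean. The yield table and the function name `fit_additive` are hypothetical.

```python
def fit_additive(table):
    """Fit Y_ij = mu + alpha_i + beta_j + eps_ij to a complete two-way table.

    Rows are seed varieties (i), columns are fertilizer levels (j).
    Assumes one observation per cell and sum-to-zero effects.
    """
    n_rows, n_cols = len(table), len(table[0])
    mu = sum(sum(row) for row in table) / (n_rows * n_cols)
    alpha = [sum(row) / n_cols - mu for row in table]
    beta = [
        sum(table[i][j] for i in range(n_rows)) / n_rows - mu
        for j in range(n_cols)
    ]
    # Residuals estimate the random variation eps_ij.
    residuals = [
        [table[i][j] - mu - alpha[i] - beta[j] for j in range(n_cols)]
        for i in range(n_rows)
    ]
    return mu, alpha, beta, residuals


# Made-up yields: 3 seed varieties (rows) x 3 fertilizer levels (columns).
yield_table = [
    [20.0, 24.0, 28.0],
    [22.0, 27.0, 29.0],
    [30.0, 33.0, 39.0],
]

mu, alpha, beta, residuals = fit_additive(yield_table)
print("mu    =", mu)
print("alpha =", alpha)
print("beta  =", beta)
```

Large residuals that follow a pattern across rows and columns would suggest that variety and fertilizer interact, so that a purely additive description is too simple.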
Prussian Example
Bortkiewicz’s data were gathered from the large published
Prussian state statistics (three huge volumes each year for
this period). He included 14 corps (G being the Guard
Corps) over 20 years. (Bortkiewicz 1898)
Fisher’s other work
Fisher’s experience with the Rothamsted experiments led him to
propose the main principles of Experimental Design:
Randomization
Replication
Block structure
Factorial combinations
Estimation of error variance
Role of Randomization
Eliminating biases
Basis for estimating standard errors
Foundation for significance tests
David Cox
“The very fact that a sample was random
was what made inference possible”
-Charles S. Peirce
Distributional Assumptions
William Gosset:
Systematic field trials are better
Valid inferences require normality
Ronald Fisher:
Randomized field trials are better
Randomization could also validate inferences
Fisher:
Inference requires only spherical symmetry under the
null hypothesis to work.
Randomization itself could induce a discretized spherical
symmetry, so it works!
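One way to see randomization validating an inference without a normality assumption is a randomization (permutation) test. The sketch below enumerates every way the treatment labels could have been assigned and asks how often the difference in means is at least as extreme as the one observed. The data and the function name are hypothetical, and this illustrates the general idea rather than Fisher's own derivation.

```python
from itertools import combinations


def mean(values):
    return sum(values) / len(values)


def randomization_test(treated, control):
    """Exact two-sided randomization test for a difference in means.

    Under the null hypothesis of no treatment effect, every split of the
    pooled outcomes into groups of these sizes was equally likely.
    """
    pooled = list(treated) + list(control)
    observed = mean(treated) - mean(control)
    extreme = 0
    total = 0
    for chosen in combinations(range(len(pooled)), len(treated)):
        chosen_set = set(chosen)
        group_t = [pooled[i] for i in chosen_set]
        group_c = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        diff = mean(group_t) - mean(group_c)
        # Small tolerance guards against floating-point ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
        total += 1
    return observed, extreme / total


# Made-up outcomes for three treated and three control plots.
observed_diff, p_value = randomization_test([5.0, 6.0, 7.0], [1.0, 2.0, 3.0])
print(observed_diff, p_value)
```

With three units per group there are only 20 possible assignments, so the smallest attainable two-sided p-value is 2/20; more units are needed for a sharper test.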
Lifted Weights Experiment
Lift two small
containers successively
Guess: which is B and
which is B+D?
Gustav Fechner, around 1860
Repeated with:
Varying weights B & D
Different hands
Different orders of lifting
(Figure: two containers, one of weight B and one of weight B+D)
Lifted Weights Experiment
P: Probability of a right guess
D: The amount of differential weight
Just Noticeable Difference (JND)
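To make the relation between P and D concrete, here is a sketch under two assumptions that are not stated in the slides: P rises with D along a cumulative normal curve, P = Φ(h·D), and the JND is taken as the D at which P reaches 0.75. The sensitivity parameter `h` and its value are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def p_right(d, h):
    """Assumed probability of a right guess for differential weight d.

    With d = 0 the subject is guessing, so P = 0.5.
    """
    return STANDARD_NORMAL.cdf(h * d)


def jnd(h, threshold=0.75):
    """Differential weight at which P reaches the chosen threshold."""
    return STANDARD_NORMAL.inv_cdf(threshold) / h


h = 0.05  # hypothetical sensitivity, per gram of differential weight
for d in (0, 5, 10, 20, 40):
    print(d, round(p_right(d, h), 3))
print("JND (grams):", jnd(h))
```

A more sensitive subject has a larger `h` and therefore a smaller JND.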
Refined Lifted Weights Experiment
Q: How to measure extremely subtle
sensations?
Solution: Follow rigorous randomization.
Charles Peirce, around 1884
French Lottery Example
The French were buying
lottery tickets quite early…
The lottery was popular
across the country from
1757 to 1836 (a large
population base).
French Lottery Example
The randomness of the
winning numbers passes all
feasible tests.
The data give a rough idea of where
in France interest in the lottery
was greatest and
how that passion changed
over time.
Interestingly, the lottery was in effect carrying out what is
likely the earliest scientifically randomized
social survey!
Type of Randomization
Fisher, around 1934
Invasive Randomization
Random assignment of
treatments is feasible
Field trials;
Lifted weights example;
Modern medical/clinical
trials...
Noninvasive Randomization
Randomization for only the
selection of a sample
Experiments in social
sciences…
Random sampling
took flight
Selection Bias Example
1936 US Presidential Election Candidates:
Franklin Delano Roosevelt
VS
Alfred Landon
• The magazine Literary Digest projected that
Landon would win.
• The magazine polled 10 million people and
received 2.4 million responses.
Outcome: Roosevelt won by a landslide.
Problem: heavily skewed poll result
(sample contains more rich people than poor people)
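The effect of a skewed sample can be sketched with a small simulation. All numbers below are hypothetical and chosen only for illustration; they are not the 1936 figures. A group that is a minority of voters but favors Landon is heavily over-represented in the biased sample, and a large sample size does nothing to repair that.

```python
import random

# Hypothetical electorate (illustrative values, not historical data).
SHARE_WEALTHY = 0.30       # share of wealthy voters in the population
P_LANDON_WEALTHY = 0.70    # support for Landon among wealthy voters
P_LANDON_OTHER = 0.35      # support for Landon among everyone else

TRUE_SUPPORT = (
    SHARE_WEALTHY * P_LANDON_WEALTHY + (1 - SHARE_WEALTHY) * P_LANDON_OTHER
)


def simulate_poll(n, share_wealthy_in_sample, rng):
    """Return Landon's share in a poll of n respondents.

    share_wealthy_in_sample controls how often a respondent is wealthy;
    it equals SHARE_WEALTHY only when the sample is truly random.
    """
    votes_for_landon = 0
    for _ in range(n):
        wealthy = rng.random() < share_wealthy_in_sample
        p = P_LANDON_WEALTHY if wealthy else P_LANDON_OTHER
        if rng.random() < p:
            votes_for_landon += 1
    return votes_for_landon / n


rng = random.Random(1936)
random_poll = simulate_poll(20_000, SHARE_WEALTHY, rng)  # random sample
biased_poll = simulate_poll(200_000, 0.80, rng)          # skewed sample, 10x larger

print("true support for Landon:", round(TRUE_SUPPORT, 3))
print("random sample estimate: ", round(random_poll, 3))
print("biased sample estimate: ", round(biased_poll, 3))
```

In this sketch the smaller random sample lands near the true value, while the much larger skewed sample points to the wrong winner.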
How to Ensure Randomness?
In the past:
Flipping a coin
Using a shuffled deck of cards
Throwing a die
Using a random number table
in a statistics book
Applied partially
Nowadays:
Using computer-generated
random numbers
http://www.randomization.com/
Golden rule for clinical trials
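As a sketch of computer-generated randomization for a two-arm trial, the snippet below builds a permuted-block allocation list: within every block of four consecutive participants, each arm appears exactly twice, in random order. Permuted blocks are one common scheme and are used here as an assumption; the slides do not say which scheme is meant. The function name, arm labels, and block size are illustrative.

```python
import random


def permuted_block_schedule(n_participants, arms=("treatment", "control"),
                            block_size=4, seed=None):
    """Return an allocation list built from randomly permuted blocks.

    Each full block contains every arm equally often, so group sizes stay
    balanced as participants enroll one after another.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # record the seed so the list can be audited
    schedule = []
    while len(schedule) < n_participants:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        schedule.extend(block)
    return schedule[:n_participants]


schedule = permuted_block_schedule(12, seed=2024)
for number, arm in enumerate(schedule, start=1):
    print(number, arm)
```

In practice the list would be generated before enrollment and kept concealed from the people recruiting participants, so that the next assignment cannot be anticipated.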
Any Questions?