Clinical Trials Methods
Ana Jerončić
P value is a short form for probability value
P=0.07=7%
If no real effect exists, there is a 7% probability that we
will encounter such or more extreme differences by chance.
OR
If no real effect exists and we repeat the experiment
100 times, such a difference (or a more extreme one)
would be found in about 7 experiments.
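The "repeat the experiment" reading can be checked with a small simulation. The sketch below is only an illustration: the group size, the standard normal outcome and the function name are assumptions, not part of the lecture. It repeats a two-group experiment in which no real effect exists and counts how often the difference in means is at least as large as the one observed.

```python
import random
from statistics import mean


def fraction_at_least_as_extreme(observed_diff, n_per_group=30,
                                 n_experiments=2000, seed=1):
    """Share of no-effect experiments with |mean difference| >= observed_diff.

    Both groups are drawn from the same normal distribution (mean 0, SD 1),
    so every difference that appears is due to chance alone.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(n_experiments):
        group_a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        if abs(mean(group_a) - mean(group_b)) >= observed_diff:
            hits += 1
    return hits / n_experiments
```

With these assumed settings, an observed difference of about 0.47 should give a fraction close to 0.07, i.e. roughly 7 out of every 100 repeated experiments.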
P value is a short form for probability value
P=0.99=99%
If no real effect exists, there is a 99% probability that we
will encounter such or even more extreme differences by chance.
OR
If no real effect exists and we repeat the experiment
100 times, such a difference (or a more extreme one)
would be found in about 99 experiments.
P>=0.05
No significant difference between the treatments
(the observed difference could have happened
by chance)
Null hypothesis is not rejected
P<0.05
5%
Significant difference between the
treatments
Null hypothesis is rejected,
alternative is accepted
Significance level (α)
The threshold of P-value that determines when to
reject a null hypothesis
It refers to the chance that you are willing to take
of being wrong, i.e. of concluding that there is a
substantial difference when there is none.
The most common significance level: α=0.05=5%
We are willing to risk that 5% of our decisions
are wrong when no real difference exists.
α=0.05
Out of 40 such decisions => we could expect that 2 are wrong
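The arithmetic behind "2 out of 40" can be written out as a short sketch. It assumes that none of the 40 comparisons has a real effect and, for the second function, that the decisions are independent; both function names are made up for this illustration.

```python
def expected_false_positives(n_decisions, alpha=0.05):
    """Expected number of wrong rejections when no real effect exists."""
    return n_decisions * alpha


def prob_at_least_one_false_positive(n_decisions, alpha=0.05):
    """Chance of one or more wrong rejections, assuming independent decisions."""
    return 1 - (1 - alpha) ** n_decisions


print(expected_false_positives(40))          # 40 x 0.05
print(prob_at_least_one_false_positive(40))  # 1 - 0.95**40
```

The second number shows why many tests at α=0.05 need care: even with no real effect anywhere, at least one "significant" result becomes very likely.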
α is the probability of a Type I error:
the probability of erroneously rejecting the
null hypothesis
Consequence of type I error
Putting a useless medicine on the market!
The sample size calculation was based on the
primary outcome, BMI or BMI z-score, which
was assumed to have a SD of 1.5, or 1.0
respectively. To have 80% power to detect a
difference in mean BMI of 0.38, or mean BMI
z-score of 0.25 units between the groups at
age 2 at the two sided 5% significance level,
we needed a sample size of 252 per group
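The quoted numbers can be approximately reproduced with the usual normal-approximation formula for comparing two means, n = 2 (z(1-α/2) + z(1-β))² SD² / difference². The paper's exact method is not stated, so the formula choice and the function name below are assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Sample size per group for a two-sided comparison of two means.

    delta: difference in means to detect
    sd:    assumed standard deviation of the outcome (same in both groups)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


print(n_per_group(delta=0.25, sd=1.0))  # BMI z-score inputs from the quote
print(n_per_group(delta=0.38, sd=1.5))  # BMI inputs from the quote
```

Under this approximation the BMI z-score inputs give 252 per group, matching the quoted figure, while the BMI inputs give a slightly smaller number, so the larger of the two would be the one reported.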
…. The higher-degree RR was deemed
significantly better if the P-value for the
higher-degree model was 0.01.
…..
Hippocampal gray matter volume change was
assessed statistically using a two-tailed t
contrast with a significance level set to 0.05
(corrected for multiple comparisons within
the ROI). Uncorrected exploratory full-brain
statistics were also performed with two-tailed
t contrasts at a significance level set to 0.001.
β is the probability of a Type II error:
the probability of erroneously failing to reject
the null hypothesis.
The most common β = 0.2
Consequence of type II error
Keeping a good medicine away from patients!
Power quantifies the ability of the study to
find true differences.
Power = 1 − β = P(accept H1 given H1 is true)
the probability of correctly identifying H1
(correctly identifying a better medicine)
If β=0.2, power=0.8=80%
Studies of drug X have shown that its
use induces very serious side effects.
Therefore drug X was withdrawn
from the market.
A new alternative, drug Y, was examined and
a reduction in harmful effects, compared
to drug X, was observed.
What is the significance level that you will use
to evaluate the significance of reduction in
harmful effects of drug Y, compared to drug
X?
The effect of alcohol on drivers' reaction
time was investigated in a simple random
sample. Reaction times observed before
and after alcohol intake showed an
increase in average reaction time after
alcohol intake.
What is the significance level that you will use
to evaluate the significance of increase in
reaction time?
1. the medical and practical consequences of the
two kinds of errors
2. the desired impact of the results
α < β
(the most common approach: α=0.05 and β=0.2)
i.e. if the control treatment is already widely used
and is known to be reasonably safe and effective,
whereas the test treatment is new, costly, or
produces serious side effects.
α > β
i.e. if there is no established control treatment and
the test treatment is relatively inexpensive, easy to
apply and is not known to have any serious side
effects.
Choices other than α=0.05 and β=0.2
α=0.10 and β=0.2 for preliminary trials that
are likely to be replicated.
α=0.01 and β=0.05 for trials that are
unlikely to be replicated.
A company that was developing a clot-busting
product for the indication of occluded central
venous catheters - Nuvelo Pharmaceuticals -
was sued by its investors for setting an
extraordinarily small significance level
α=0.00125
http://onbiostatistics.blogspot.com/2010/01/significant-level-of-000125.html
The minimum difference between groups
that is judged to be clinically important:
1. Minimal effect which has clinical relevance in the
management of patients
or
2. The anticipated effect of the new treatment
Power depends on 4 elements:
The real difference between the two medicines
Big difference => big power
The variation among individuals
Small variation => big power
The sample size, n
Large n => big power
Type I error, α
Large α => big power
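The four dependencies can be seen in a single formula. The sketch below uses a normal approximation for a two-sided comparison of two means and ignores the negligible chance of rejecting in the wrong direction; the formula choice and the function name are assumptions for illustration, not the lecturer's own calculation.

```python
from math import sqrt
from statistics import NormalDist


def power_two_groups(delta, sd, n, alpha=0.05):
    """Approximate power (1 - beta) with n subjects per group.

    delta: real difference between the two medicines
    sd:    variation among individuals (standard deviation)
    n:     sample size per group
    alpha: Type I error
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sd * sqrt(2 / n)  # SE of the difference in means
    return NormalDist().cdf(abs(delta) / standard_error - z_alpha)


base = power_two_groups(delta=0.25, sd=1.0, n=100)
print(base)
print(power_two_groups(delta=0.50, sd=1.0, n=100))              # bigger difference
print(power_two_groups(delta=0.25, sd=0.5, n=100))              # smaller variation
print(power_two_groups(delta=0.25, sd=1.0, n=400))              # larger n
print(power_two_groups(delta=0.25, sd=1.0, n=100, alpha=0.10))  # larger alpha
```

Each of the four changes raises the power above the baseline, which is the pattern listed above.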
Sample Size
“How large a sample do I need?”
-Very commonly asked
-Important question
-Answer not so simple
Statistical power calculations
-Use statistical software or
graphical method
-Depends on data type
Braga L, Byrne R, Lorenzo A et al. Methodological quality
assessment of RCTs in hypospadias literature. 23rd
Annual ESPU Congress - Zurich, Switzerland - 2012
Analyses showed that publication after 2006 (p<0.01),
RCT sample size >50 (p=0.03), significance level
α=0.01 (p<0.01) and blinding of outcome assessor
(p<0.01) were significantly associated with better
quality of RCTs.
Hypospadias is a birth defect of the urethra in males
Weir R. Randomised controlled trial to meta-analysis ratio: a reply
from a group producing systematic reviews. 2007. The New Zel Med
Journal 120, 1-3
Antman et al showed that recommendations for routine
use of thrombolytic therapy first appeared in 1987, 14
years after a statistically significant reduction in mortality
was apparent on a subsequent cumulative meta-analysis
of all relevant RCTs.
At the first time a significant reduction in mortality was
apparent in the cumulative meta-analysis of IV
streptokinase therapy (1973, p=0.01), 2432 patients had
been randomised in eight small trials. The results of a
further 25 studies (34,542 additional patients) published
before routine recommendation of thrombolytic therapy,
reduced the significance level to p=0.001 in 1979 and
p=0.0001 in 1986.
CONCLUSION:
Overall advice to use steam inhalation, or
ibuprofen rather than paracetamol, does not
help control symptoms in patients with acute
respiratory tract infections and must be
balanced against the possible progression of
symptoms during the next month for a
minority of patients. Advice to use ibuprofen
might help short term control of symptoms in
those with chest infections and in children.
CONCLUSION:
Our findings suggest the presence of
heterogeneity in the associations between
individual fruit consumption and risk of type
2 diabetes. Greater consumption of specific
whole fruits, particularly blueberries, grapes,
and apples, is significantly associated with a
lower risk of type 2 diabetes, whereas greater
consumption of fruit juice is associated with a
higher risk.
Conclusions Although limited in quantity,
existing randomised trial evidence on
exercise interventions suggests that exercise
and many drug interventions are often
potentially similar in terms of their mortality
benefits in the secondary prevention of
coronary heart disease, rehabilitation after
stroke, treatment of heart failure, and
prevention of diabetes.
Sanjay Basu et al. Palm oil taxes and
cardiovascular disease mortality in India:
economic-epidemiologic model, BMJ. 2013 Oct
22;347;
Conclusions Curtailing palm oil intake through
taxation may modestly reduce hyperlipidemia and
cardiovascular mortality, but with potential
distributional consequences differentially benefiting
male and urban populations, as well as affecting
food security.