Introduction to meta-analysis


Meta-Analyses: Appropriate Growth
Will G Hopkins
Faculty of Health Science
AUT University, Auckland, NZ
Resources:
- Cochrane Reviewers’ Handbook (2006) at cochrane.org.
- QUOROM: Quality of Reporting of Meta-analyses (of controlled trials). Lancet 354:1896-1900, 1999.
- MOOSE: Meta-analysis of Observational Studies in Epidemiology. JAMA 283:2008-2012, 2000.
- Gene-association studies at HuGeNet.ca.
- Diagnostic tests. Ann. Intern. Med. 120:667-676, 1994.
- An Introduction to Meta-analysis at sportsci.org/2004
Overview
- A meta-analyzed estimate of an effect is:
  - an average of qualifying study-estimates of the effect, with…
  - …more weight for study-estimates with better precision,
  - …adjustment for and estimation of effects of study characteristics,
  - …accounting for any clustering of study-estimates,
  - …accounting for residual differences in study-estimates.
- Possible problems with the Cochrane Handbook and the Review Manager software (RevMan).
An average of qualifying study-estimates of the effect
- Studies that qualify
  - Spell out selection criteria: design, population, treatment.
  - Include conference abstracts to reduce publication bias due to journals rejecting non-significant studies.
- Averaging requires estimates of effects in the same units.
  [Figure: effects expressed in the same units ("this") contrasted with effects in different units ("rather than this").]
- Averaging effects derived from continuous variables
  - Aim for dimensionless units, such as change in % or % change.
  - Convert % changes in physiological and performance measures to factor changes, then log-transform before meta-analysis.
    • Important when effects are greater than ±10%.
    • 37% → 1.37 → log(1.37)
    • -60% → 0.40 → log(0.40) → meta-analysis → 100(e^answer − 1)%
    • 140% → 2.40 → log(2.40)
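The conversion chain above can be sketched in a few lines of Python; the function names are illustrative, not from the presentation:

```python
import math

def percent_to_log(percent_change):
    """Convert a % change to a factor change, then take the natural log."""
    return math.log(1 + percent_change / 100)

def log_to_percent(log_effect):
    """Back-transform a meta-analyzed log effect: 100*(e^effect - 1) %."""
    return 100 * (math.exp(log_effect) - 1)

# Slide examples: 37% -> 1.37 -> log(1.37); -60% -> 0.40; 140% -> 2.40.
x = percent_to_log(-60)       # log(0.40)
print(log_to_percent(x))      # back-transforms to -60% (within rounding)
```

Meta-analyze the log-transformed effects, then back-transform the weighted mean with `log_to_percent`.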
  - Physical performance is best analyzed as % change in mean power output, not % change in time, because a 1% change in endurance power output produces…
    • a 1% change in running time-trial speed or time;
    • a 0.4% change in road-cycling time-trial time;
    • a 0.3% change in rowing-ergometer time-trial time;
    • a 15% change in time to exhaustion in a sub-VO2max constant-power test;
    • a T/0.50% change in time to exhaustion in a supra-VO2max constant-power test lasting T minutes;
    • a 1% change in peak power in an incremental test, but…
    • a >1% change in time to exhaustion in the test (but can usually recalculate to % change in power);
    • a >1% change in power in any test following a pre-load (but sometimes can’t convert to % change in power in a time trial).
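The time-trial factors above can be collected into a small lookup helper; the dictionary keys and function name are illustrative assumptions:

```python
# % change in time-trial time produced by a 1% change in power output,
# per the factors listed above (helper and key names are illustrative).
TIME_CHANGE_PER_PERCENT_POWER = {
    "running time trial": 1.0,
    "road-cycling time trial": 0.4,
    "rowing-ergometer time trial": 0.3,
}

def percent_power_from_percent_time(percent_time, test):
    """Convert a study's % change in time-trial time to % change in power."""
    return percent_time / TIME_CHANGE_PER_PERCENT_POWER[test]

# A 1.2% improvement in road-cycling time is about a 3% change in power.
print(percent_power_from_percent_time(1.2, "road-cycling time trial"))
```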
  - Standardizing (or "Cohenizing") changes is a widespread but misused approach to turning effects into dimensionless units…
    • To standardize, express the difference or change in the mean as a fraction of the between-subject SD: (change in mean)/SD.
    [Figure: pre- and post-test strength distributions illustrating a standardized change of 3.0 (very large).]
    • But study samples are often drawn from populations with different SDs, so differences in effect size will be due partly to the differences in SDs.
    • Such differences are irrelevant and tend to mask interesting differences.
    • So meta-analyze a measure reflecting the biological effect, such as % change.
    • Combine the pre-test SDs from the studies selectively and appropriately, to get one or more population SDs.
    • Express the meta-analyzed effect as a standardized effect using this/these SDs.
    • Use Hopkins’ modified Cohen scale to interpret the magnitude:
      <0.2 trivial; 0.2-0.6 small; 0.6-1.2 moderate; 1.2-2.0 large; 2.0-4.0 very large; >4.0 awesome.
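Standardizing after meta-analysis and reading off the scale can be sketched as follows; the thresholds are those above, the function names are hypothetical:

```python
def standardize(meta_effect, pooled_pre_sd):
    """Express a meta-analyzed effect as a fraction of a pooled pre-test SD."""
    return meta_effect / pooled_pre_sd

def interpret(standardized_effect):
    """Hopkins' modified Cohen scale for the magnitude of a standardized effect."""
    scale = [(0.2, "trivial"), (0.6, "small"), (1.2, "moderate"),
             (2.0, "large"), (4.0, "very large")]
    magnitude = abs(standardized_effect)
    for threshold, label in scale:
        if magnitude < threshold:
            return label
    return "awesome"

print(interpret(standardize(2.5, 10.0)))  # 0.25 falls in 0.2-0.6: "small"
```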
- Averaging effects from psychometric inventories
  - Recalculate effects after rescaling all measures to 0-100.
  - Standardize after meta-analysis, not before.
- Averaging effects from counts (of injury, illness, death)
  - The best measure of effect for such time-dependent events is the hazard ratio.
    • Hazard = instantaneous risk = proportion per small unit of time.
    • Proportional-hazards (Cox) regression is based on the assumption that this ratio is independent of follow-up time.
    • Risk and odds ratios change with follow-up time, so convert either to hazard ratios.
    • Odds ratios from well-designed case-control studies are already hazard ratios.
    • If the condition is uncommon (odds or risks <10% in both groups), risk and odds ratios can be treated as hazard ratios.
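One way to convert cumulative risks over a common follow-up to a hazard ratio is to assume a constant hazard in each group, so that risk = 1 − e^(−hazard × time); that assumption and the function below are mine, not the presentation's:

```python
import math

def hazard_ratio_from_risks(risk_treatment, risk_control):
    """Hazard ratio from cumulative risks over the same follow-up time,
    assuming a constant hazard in each group: hazard = -ln(1 - risk)/T,
    so the follow-up time T cancels in the ratio."""
    return math.log(1 - risk_treatment) / math.log(1 - risk_control)

# For an uncommon condition the risk ratio is already close to the hazard ratio:
print(hazard_ratio_from_risks(0.04, 0.08))  # near the risk ratio 0.04/0.08 = 0.5
```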
More weight for study-estimates with better precision
- The usual weighting factor is 1/(standard error)².
- This is equivalent to sample size, other things (errors of measurement) being equal.
- Calculate the standard error from…
  - the confidence interval or limits;
  - the test statistic (t, χ², F);
    • but F ratios with numerator degrees of freedom >1 can’t be used;
  - the p value.
    • If "p<0.05", analyze as p=0.05.
    • If "p>0.05", it can’t be used.
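Recovering a standard error from a confidence interval or an exact p value can be sketched with a normal approximation (the presentation doesn't specify the method; for small samples a t distribution is more accurate):

```python
from statistics import NormalDist

def se_from_ci(lower, upper, level=0.95):
    """Standard error from confidence limits, via the normal quantile."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return (upper - lower) / (2 * z)

def se_from_p(effect, p):
    """Standard error from an effect estimate and its two-tailed p value.
    If a study reports only "p<0.05", analyze as p=0.05."""
    z = NormalDist().inv_cdf(1 - p / 2)
    return abs(effect) / z

print(se_from_ci(0.02, 1.98))  # half-width 0.98 / 1.96, about 0.5
print(se_from_p(1.0, 0.05))    # |effect| / 1.96, about 0.51
```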
- To get the standard error for controlled trials, can also use…
  - SDs of change scores;
  - post-test SDs (but these very often give a large standard error);
  - p values for each group, but not if one is p>0.05. DO NOT COMPARE STATISTICAL SIGNIFICANCE IN EXPERIMENTAL AND CONTROL GROUPS.
- If none of the above are available for up to ~20% of study-estimates, assume a standard error of measurement to derive a standard error.
- If a standard error can’t be obtained for >20% of study-estimates, use sample size as the weighting factor.
  - The factor is (study sample size)/(mean study sample size).
  - This is equivalent to assuming the dependent variable has the same error of measurement in all studies.
  - For groups of unequal size (n1 ≠ n2), use n = 4n1n2/(n1+n2).
  - Divide each study’s factor by the number of estimates it provided, to ensure equal contribution of all studies.
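The sample-size weighting rules above can be sketched as follows (function names are mine):

```python
def size_weights(sample_sizes, estimates_per_study=None):
    """Weighting factors when standard errors are unavailable for >20%
    of study-estimates: (study sample size) / (mean study sample size),
    optionally divided by the number of estimates each study provided."""
    mean_n = sum(sample_sizes) / len(sample_sizes)
    weights = [n / mean_n for n in sample_sizes]
    if estimates_per_study is not None:
        weights = [w / k for w, k in zip(weights, estimates_per_study)]
    return weights

def effective_n(n1, n2):
    """Effective sample size for groups of unequal size: 4*n1*n2/(n1+n2)."""
    return 4 * n1 * n2 / (n1 + n2)

print(size_weights([10, 20, 30]))  # [0.5, 1.0, 1.5]
print(effective_n(10, 30))         # 30.0
```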
Adjustment for effects of study characteristics
- Include study characteristics as covariates in a meta-regression to try to account for differences in the effect between studies. Examples:
  - duration or dose of treatment;
  - method of measurement of the dependent variable;
  - a score for study quality;
  - gender and mean characteristics of subjects (age, status…).
    • Treat separate outcomes for females and males from the same study as if they came from separate studies.
    • If gender effects aren’t shown separately in one or more studies, analyze gender as the proportion of one gender: e.g., for a study of 3 males and 7 females, “maleness” = 0.3.
- The number of available study-estimates usually limits the analysis to main effects (i.e., no interactions).
- Use a correlation matrix to identify collinearity problems.
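For a single covariate, a fixed-effects meta-regression reduces to weighted least squares; the bare-bones sketch below is illustrative only (a full meta-regression would also include a random effect for residual between-study differences):

```python
def weighted_meta_regression(effects, covariate, weights):
    """Weighted least-squares fit of study-estimates on one study
    characteristic (e.g. dose, or 'maleness' as a proportion).
    Returns (intercept, slope). Illustrative sketch only."""
    sw = sum(weights)
    xbar = sum(w * x for w, x in zip(weights, covariate)) / sw
    ybar = sum(w * y for w, y in zip(weights, effects)) / sw
    sxy = sum(w * (x - xbar) * (y - ybar)
              for w, x, y in zip(weights, covariate, effects))
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(weights, covariate))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Three studies with maleness 0.0, 0.5, 1.0 and equal weights:
intercept, slope = weighted_meta_regression(
    [2.0, 3.0, 4.0], [0.0, 0.5, 1.0], [1, 1, 1])
print(slope)  # effect difference between all-male and all-female studies: 2.0
```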
Accounting for any clustering of study-estimates
- Frequent clusters are several post tests in a controlled trial or different doses of a treatment in a crossover.
- Treating each post test or dose as a separate study biases precision of the time or dose effect low and precision of all other effects high.
- Fix with a mixed-model (= random-effect) meta-analysis.
- Mixed modeling would also allow inclusion of effects in control and experimental groups as a cluster.
  - The current approach is to include only their difference…
  - …which doesn’t allow estimation of how much of the meta-analyzed effect is due to changes in the control groups.
Accounting for residual differences in study-estimates
- There are always real differences in the effect between studies, even after adjustment for study characteristics.
- Use random-effect meta-analysis to estimate the real differences as a standard deviation.
- The mean effect ± this SD is what folks can expect typically from setting to setting.
  - For treatments, the effect on any specific individual will be more uncertain because of individual responses and measurement error.
- Other random effects can account for any clustering.
- A simple meta-analysis using sample size as the weighting factor is a random-effect meta-analysis.
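A common way to estimate the between-study SD is the DerSimonian-Laird moment method; this is a generic sketch of that technique, not necessarily the mixed model Hopkins recommends:

```python
def dersimonian_laird(estimates, standard_errors):
    """Random-effect meta-analysis: returns (mean effect, between-study SD).
    Weights are 1/SE^2; tau^2 is the DerSimonian-Laird moment estimate."""
    w = [1 / se ** 2 for se in standard_errors]
    sw = sum(w)
    fixed_mean = sum(wi * y for wi, y in zip(w, estimates)) / sw
    q = sum(wi * (y - fixed_mean) ** 2 for wi, y in zip(w, estimates))
    k = len(estimates)
    tau2 = max(0.0, (q - (k - 1)) / (sw - sum(wi ** 2 for wi in w) / sw))
    w_star = [1 / (se ** 2 + tau2) for se in standard_errors]
    mean = sum(wi * y for wi, y in zip(w_star, estimates)) / sum(w_star)
    return mean, tau2 ** 0.5

# Heterogeneous study-estimates yield a nonzero between-study SD:
mean, sd = dersimonian_laird([0.0, 2.0], [0.5, 0.5])
print(mean, sd)  # mean 1.0 with a nonzero between-study SD
```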
Possible problems with Cochrane and RevMan
- Comparison of subgroups and estimation of covariates
  - “Subgroup analyses can generate misleading recommendations about directions for future research that, if followed, would waste scarce resources.”
  - “No formal method is currently implemented in RevMan. When there are only two subgroups the overlap of the confidence intervals of the summary estimates in the two groups can be considered…”
- Random-effects modeling
  - “In practice, the difference [between RevMan and more sophisticated random-effects modeling] is likely to be small unless there are few studies.” Really?
- Forest and funnel plots
  - Some minor issues and suggestions for improvement.
This presentation will be available from:
Wait for Sportscience 11, 2007, or contact [email protected]