Statistical Considerations in Research Study Designs


How to Design and
Interpret Observational
Outcomes Studies in
Cardiovascular Disease
Nathan D. Wong, PhD, FACC
Professor and Director
Heart Disease Prevention Program
Division of Cardiology, UC Irvine
Adjunct Professor of Epidemiology, UCLA and UC Irvine
President, American Society for Preventive Cardiology
Why are papers rejected for
publication? (The Top 11 Reasons)
1. The study did not address an important scientific
issue
2. The study was not original
3. The study did not actually test the authors’
hypothesis
4. A different type of study should have been done
5. Practical difficulties led the authors to
compromise on the original study protocol (e.g.,
recruitment, procedures)
Greenhalgh T, BMJ 1997; 315: 243-6
Reasons 6-11 for Paper Rejection
6. The sample size was too small
7. The study was uncontrolled or inadequately
controlled
8. The statistical analysis was incorrect or
inappropriate
9. The authors drew unjustified conclusions from
the data
10. There is a significant conflict of interest among
authors
11. The paper is so badly written that it is
incomprehensible
Critical Appraisal
1. Why was the study done, and what
clinical question is being asked? (a brief
background, review of the literature, and
aim / hypothesis should be stated)
2. What type of study was done?
(experiment, clinical trial, observational
cohort or cross-sectional study, or
survey)
Critical Appraisal (cont.)
3. Was the design appropriate for the research?
• Clinical trial preferred to test efficacy of
treatments
• Cross-sectional study preferred for testing
validity of diagnostic/screening tests or risk
factor associations
• Longitudinal cohort study preferred for
prognostic studies
• Case-control study best to examine effects
of a given agent in relation to occurrence of
an illness, esp. rare illnesses (e.g., cancer)
Outline
• Elements of Designing a Research
Protocol
• Concepts of Study Design: Observational
cross-sectional, case-control, cohort
studies
• Advantages and Disadvantages of
Different Study Designs – which is right for
you?
• Analysis of Observational Studies
Nine Key Elements of a Research Study Protocol
• Background
• Hypotheses
• Clinical Relevance
• Specific Aims / Objectives
• Methodology
• Power / Sample Size
• Measures and Outcomes
• Data Management
• Statistical Methodology
(UCI School of Medicine Scientific Review Committee)
Background
• A brief review of the problem to be
studied and of related studies that
generated the rationale and the
central idea of the proposed study.
Several pertinent references should
be provided.
Was the study original?
• Few studies break entirely new ground
• Many studies add to the evidence base of
earlier studies which may have had other
or more limitations
• Meta-analyses depend on literature
containing multiple studies addressing a
question in a similar manner
Features Distinguishing New vs.
Previous Studies
• Sample size
• Length of follow-up
• More rigorous methodology
• Population studied differs from
that of previous studies (ages, gender,
ethnic groups)
• Does the new study address a clinical
issue of sufficient importance?
Greenhalgh T, BMJ 1997; 315: 305-8
Specific Aims / Objectives
• What the study is intended to study or
demonstrate; includes mention of predictor and
outcome (or endpoint) variables.
• For example: "The primary aim of the study is
to examine whether treatment A is more
effective than treatment B in reducing levels of
C", or "in finding out whether X is associated
with Y", etc.
• There may be both principal and secondary aims
Elements of a Formulated Question
• Patient or Population: Who is the question
about? (e.g., pts with diabetes mellitus)
• Intervention or Exposure: What is being done or
what is happening to the patient/population?
(e.g., tight control)
• Outcome(s): How does the intervention affect
the patient/population (mortality, CHD incidence)
• Comparison(s): What could be done instead of
the intervention? (e.g., standard management)
Hypotheses
• The problem/s stated in the Background may
generate a primary hypothesis and possibly one or
two secondary hypotheses.
• A hypothesis is often stated in the null – e.g., "No
difference between treatments A and B" is
anticipated, or "No association between X and Y
exists".
• Alternatively, it can be stated according to what one
expects e.g., “A will be more effective than B in
reducing levels or symptoms of C", or “X will be
associated with Y".
Clinical / Community Relevance
• In the case of clinical studies, the potential
value in the understanding, diagnosis, or
management of a clinical condition or
pathological state should be stated.
• Funding agencies often now require a
statement of community relevance – e.g.,
how will the results be translated and
disseminated to the target population or
community.
Methodology
• Methodology should validate or not validate the
hypothesis and specific aims using procedures
consistent with sound scientific study design
including:
– the size and nature of the subjects studied
– recruitment, screening, and enrollment
procedures
– inclusion and exclusion criteria
– treatment schedules, and follow-up
procedures, if applicable. A chart of the
studies to be performed at each visit and the
time of each visit and test is needed.
Study Population Issues
• How were the subjects recruited? Is there
potential recruitment bias (e.g., from taking
respondents to advertisements), or was the
survey done in a random (e.g., random
digit-dialing) or consecutive sample?
• Who was included? Many trials exclude
those who have co-morbidities, do not
speak English, or take other
medications—may provide scientifically
clean results, but may not be
representative of disease in question.
Study Population (cont.)
• Who was excluded? Study may exclude
those with more severe forms of disease,
therefore limiting generalizability
• Were subjects studied in “real-life”
circumstances? Are the consenting process
(describing the benefits/risks), access to
study staff, equipment available, etc.
similar to those in an ordinary practice
situation?
Power / Sample Size
• A power/sample size analysis should
include an estimate of minimum effect or
difference expected at a given level of
power when the sample size is fixed, or a
projection of the number of subjects
needed to achieve a clinically important
difference in what is being examined in the
hypotheses and the specific aims.
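The projection of subjects needed can be sketched for the simple case of comparing two event proportions. This is a minimal illustration, not part of the original slides: it assumes the standard normal-approximation formula for a two-sided test with equal group sizes, and the function name `n_per_group` and the planning values (25% vs. 20% event rates) are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to compare two proportions
    (two-sided test, normal approximation, equal group sizes)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    effect = p_control - p_treated
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical planning values: 25% event rate on control, 20% on treatment.
n = n_per_group(0.25, 0.20)
print("Subjects needed per group:", n)
```

With these assumed inputs the requirement is on the order of 1,100 subjects per group; a smaller expected difference or a higher desired power raises it.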
Measures and Outcomes
• Includes both independent (predictor) and
dependent (outcome) variables.
• Outcomes include what the investigator is trying to
predict, e.g., new or recurrent onset of a disease
state, survival, or lowering of cholesterol.
• The independent or predictor variables should
always include treatment status (e.g., active vs.
placebo) in the case of a clinical trial, or primary
variables of interest (such as age, gender, levels of
X at baseline) for other studies.
• The measures and outcomes should be expected to
answer the proposed question and reflect the importance
of the knowledge expected from the research.
Data Management
• Data Management includes how data is
captured for analysis and the tools that will
be utilized while capturing the data. This
includes:
– Case report forms for clinical trials
– Surveys, questionnaires, or interview
instruments
– Computerized spreadsheets or entry forms
– Methods for data entry, error checking, and
maintenance of study databases
Statistical Methods of Analysis
• Statistical analysis includes a description
of the statistical tests planned
to examine the results obtained, e.g.,
– Student’s t-test will be used to compare levels
of A and B between treatment and placebo
groups
– Multiple logistic regression analysis will be
used to examine an independent treatment
effect on the likelihood of recurrent disease.
Hierarchy of Evidence
(for making decisions about clinical
interventions or proving causation)
1. Systematic reviews and meta-analyses
2. Randomized controlled trials with definitive and
clinically significant effects
3. Randomized controlled trials with nondefinitive results
4. Cohort studies
5. Case-control studies
6. Cross-sectional surveys
7. Case reports
Features Affecting Strength and
Generalizability of Study
• sample size
• selection of comparison group (control or
placebo)
• selection of study sample (is it representative of
population the study results are intended to
apply to?)
• length of time of follow-up
• outcome assessed (e.g., hard vs. soft or
surrogate endpoint)
• Measurement and ability to control for potential
confounders
Case Reports and Series
• Provides “anecdotal” evidence about a
treatment or adverse reaction
• Often with significant detail not available in other
study designs
• May generate hypotheses, help in designing a
clinical trial.
• Several reports forming a “case series” can help
establish efficacy of a drug, or through adverse
reports, cause its demise (example: Cerivastatin
fatal cases of rhabdomyolysis).
Observational Studies
• Cross-sectional, prospective, and case-control studies seldom can identify two
groups of subjects (exposed vs.
unexposed or cases vs. controls) that are
similar (e.g., in demographic or other risk
factors).
• Much of the controlling for baseline and/or
follow-up differences in subject
characteristics occurs in the analysis stage
(e.g., multivariable analysis as in
Framingham)
Observational Studies (cont.)
• While statistical procedures may be done
correctly, have we considered all possible
confounders?
• Some covariates may not have been
measured as accurately as possible, and
more often, may not be even known or
measured.
Observational, cross-sectional
• Examines association between two
factors (e.g, an exposure and a disease
state) assessed at a single point in time,
or when temporal relation is unknown
• Example: Prevalence of a known
condition, association of risk factors with
prevalent disease.
• Conclusions: Associations found may
suggest hypotheses to be further tested,
but are far from conclusive in proving causation.
Cross-Sectional Studies and
Surveys
• Examples: NHANES III, CHIS
(telephone), chart-review studies
• Surveys should include a representative,
ideally randomly chosen sample (rather than a
small sample of approached subjects who
actually agree to be surveyed).
• Data collected cannot assume any
directionality in exposure / disease.
• Can statistically adjust for confounders,
but difficult to establish the temporal
nature of exposure and disease.
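Prevalence of CHD by the Metabolic Syndrome and
Diabetes in the NHANES Population Age 50+
[Figure: bar chart of CHD prevalence by metabolic syndrome (MS) and diabetes (DM) status]
• No MS/No DM (54.2% of population): CHD prevalence 8.7%
• MS/No DM (28.7% of population): CHD prevalence 13.9%
• DM/No MS (2.3% of population): CHD prevalence 7.5%
• DM/MS (14.8% of population): CHD prevalence 19.2%
Alexander CM et al. Diabetes 2003;52:1210-1214.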
Prospective (Cohort) Studies
• Cohort studies begin with identification of
a population, assessment of exposure
(e.g., lipid or BP levels)
• Follow-up to the occurrence of outcomes
(CHD events)-- temporal sequence (e.g,
follow-up time) to events is known
Cohort Studies (cont.)
• Difficult to ascertain effect of exposure
because of many differences between
exposed and unexposed groups
(confounding factors).
• Statistical adjustment for known risk factor
differences can help, but unknown factors
that may differ between exposed and
unexposed groups will never be adjusted
for.
Duration of Follow-up
• Is the planned follow-up reasonable and
practical for the study question and
sample size utilized?
– effect of a new painkiller on degree of pain
relief may only require 48 hours
– effect of a cholesterol medication on mortality
may require 5 years
Prospective cohort studies
• Examples:
– Framingham Heart Study
– Cardiovascular Health Study (CHS)
– Multiethnic Study of Atherosclerosis
(MESA)
– Nurses Health Study
• Advantages:
– large sample size
– ability to follow persons from healthy to
diseased states
– temporal relation between risk factor
measures and development of disease
Prospective Studies (cont.)
• Disadvantages:
– expensive due to large sample size often
needed to accrue enough events
– many years to development of disease
– possible attrition
– causal inference not definitive as difficult to
consider all potential confounders
Framingham Heart
Study
• Longest running study of cardiovascular disease in the
world
• Began in 1948 with original cohort of 5,209 subjects aged
30-62 at baseline
• Biennial examinations, still ongoing, most of original
cohort deceased
• Offspring cohort of 5,124 children of the original cohort
enrolled in 1971; more recently, up to 3,500 grandchildren
of the original cohort are being enrolled to better
understand genetic components of CVD risk.
• Routine surveillance of cardiovascular disease events
adjudicated by panel of physicians
Framingham Most Significant
Milestones
• 1960 Cigarette smoking found to increase the risk of heart
disease
• 1961 Cholesterol level, blood pressure, and
electrocardiogram abnormalities found to increase the risk of
heart disease
• 1967 Physical activity found to reduce the risk of heart
disease and obesity to increase the risk of heart disease
• 1970 High blood pressure found to increase the risk of stroke
• 1976 Menopause found to increase the risk of heart disease
• 1978 Psychosocial factors found to affect heart disease
• 1988 High levels of HDL cholesterol found to reduce risk of
death
• 1994 Enlarged left ventricle (one of two lower chambers of the
heart) shown to increase the risk of stroke
• 1996 Progression from hypertension to heart failure described
Low HDL-C Levels Increase CHD Risk
Even When Total-C Is Normal
(Framingham)
[Figure: bar chart of 14-y incidence rates (%) for CHD (y-axis, 0 to 14) by HDL-C (mg/dL): < 40, 40–49, 50–59, ≥ 60, with bars grouped by Total-C level: ≥ 260, 230–259, 200–229, < 200]
Risk of CHD by HDL-C and Total-C levels; aged 48–83 y
Castelli WP et al. JAMA 1986;256:2835–2838
Cardiovascular Health
Study
• 5,201 Medicare eligible individuals aged 65-102 at baseline
enrolled beginning 1992 at six field centers.
• Assessment of newer and older risk factors.
• Ongoing follow-up of cardiovascular events and mortality
• Subclinical disease measures included:
– carotid B-mode ultrasound for carotid IMT at Year 2, Year 7,
and Year 11
– m-mode echocardiographic measures of left ventricular
mass and dimensions, left atrial dimension done at baseline
(Year 2) (at UC Irvine) and follow-up (Year 7) examinations.
– Ankle brachial index (ABI) for measurement of PAD
– Pulmonary function (FVC and FEV1)
[Table: CHS schedule of procedures by contact (Base, Call B, Yr 3, Call 3, Yr 4, Call 4)]
Procedures listed:
• Tracking Update
• Stressful Life Events
• Depression Scale
• Quality of Life
• Social Support and Network
• Medications - Prescription, OTC
• Physical Function: ADL/IADL
• Cognitive Function: MMSE, 3MSE, Digit Symbol Substitution, Benton Visual Retention
• Phlebotomy
• Anthropometry: Weight, Standing Height, Waist Circumference, Hip Circumference, Arm Span
Cardiovascular Health Study:
Combined intimal-medial thickness
predicts total MI and stroke
Cardiovascular Health Study (CHS) (aged 65+): MI or stroke rate 25% over 7 years in
those at highest quintile of combined IMT (O’Leary et al. 1999)
Case-control Studies
• Most frequent type of epidemiologic study, can be
carried out in a shorter time and require a smaller
sample size, so are less expensive
• Only practical approach for identifying risk factors
for rare diseases (where follow-up of a large sample
for occurrence of the condition would be
impractical)
• Selection of appropriately matched control group
(e.g., hospital vs. healthy community controls) and
consideration of possible confounders crucial
• Relies on historical information to obtain exposure
status (and information on confounders)
Case-Control Studies (cont.)
• Cannot determine for sure whether
exposure preceded development of
disease
• Also difficult to identify all differences
between cases and controls that can be
statistically adjusted for
Example of case-control study:
Folate and B6 intake and risk of MI
(Tavani et al. Eur J Clin Nutr 2004)
• Cases were 507 patients with a first episode
of nonfatal AMI, and controls were 478
patients admitted to hospital for acute
conditions
• Information was collected by interviewer-administered questionnaires
• Compared to patients in the lowest tertile of
intake, the ORs for those in the highest tertile
were 0.56 (95% CI 0.35-0.88) for folate and
0.34 (95% CI 0.19-0.60) for vitamin B6.
• Author conclusion: A high intake of folates,
vitamin B6 and their combination is inversely
associated with AMI risk
Potential sources of bias and error
in case control studies
• Information on the potential risk factor or
confounding variables may not be available
from records or subjects’ memories
• Cases may search for a cause of their
disease and be more likely to report an
exposure than controls (recall bias)
• Uncertainty as to whether agent caused
disease or whether occurrence of the disease
caused the person to be exposed to the
agent
• Difficulty in assembling a case group
representative of all cases, and/or
assembling an appropriate control group
Prospective, observational:
nested case-control
• In this design, one takes incident cases
(e.g., incident CVD) and a matched set of
controls to examine the association of a
risk factor measured sometime before
development of the outcome of interest
• Less costly than a true prospective design
where all subjects are included in analysis;
may not provide equivalent estimates
Prospective study of CRP and risk of future
CVD events among apparently healthy
women (Ridker et al., Circulation 1998) – a
nested case control study
• 122 female pts who suffered a first CVD
event and 244 age and smoking-matched
controls free of CVD
• Logistic regression estimated relative risks
and 95% CI’s, adjusted for BMI, diabetes,
HTN, hypercholesterolemia, exercise, family
hx, and trt
• Those who developed CVD events had
higher baseline CRP than controls; those in
the highest quartile of CRP had a 4.8-fold (4.1
adjusted) increased risk of any vascular
event. For MI or stroke, RR=7.3 (5.5
adjusted)
hs-CRP Adds to Predictive Value of TC:HDL
Ratio in Determining Risk of First MI
[Figure: bar chart of relative risk (y-axis, 0.0 to 5.0) by Total Cholesterol:HDL Ratio (Low, Medium, High) and hs-CRP (Low, Medium, High)]
Ridker et al, Circulation. 1998;97:2007–2011.
Examples where observational
studies have taken us down the
wrong path……
• Meta-analyses of observational studies have
shown a 50% lower risk of CHD among estrogen
users vs. non-users (groups that may have had many
unknown differences that were not adjusted for),
but recent randomized trials (HERS, WHI)
show no benefit
• Numerous prospective studies show a 25-50%
lower risk of CHD among those taking vitamin E
and other antioxidants vs. those not taking them; recent
randomized trials (e.g., HOPE, HPS) show no
benefit.
Randomized Clinical Trial
• Considered the gold standard in proving
causation– e.g., by “reducing” putative risk
factor of interest
• Randomization “equalizes” known and
unknown confounders/covariates so that
results can be attributed to treatment with
reasonable confidence
• Inclusion and exclusion criteria can often be
strict (to maximize success of trial) and may
require screening numerous patients for each
patient randomized
Randomized Clinical Trials (2)
• Expensive, labor intensive; attrition from
loss to follow-up or poor compliance can
jeopardize results, esp. if attrition is greater than
the outcome difference between groups
• Conditions are highly controlled and
may not reflect clinical practice or the
real world
• Funding source of study and
commercial interests of investigators
can raise questions about conclusions
of study
Randomized Controlled Trials (3)
• Randomized controlled trial eliminates
systematic bias (in theory) by allocating
treatments among participants in a random
fashion
• The allocation process eliminates selection bias
in group characteristics (check comparability of
baseline characteristics such as age, gender,
severity of disease and covariate risk factors)
Questions to Ask Regarding
Statistical Analysis
• Was there sufficient power/sample size?
• Was the choice of statistical analysis
appropriate?
• Was the choice (and coding/classification) of
outcome and treatment variables appropriate?
• Is there an adequate description of magnitude
and precision of effect?
• Was there adjustment for potential confounders?
• Have the results been correctly interpreted and
not overstated?
Statistical significance and
power
• Statistical significance is based on the Type I
or Alpha error
– the probability of rejecting the null hypothesis
when it was true (saying there was a relationship
when there isn’t one)
– usually we accept being wrong <5% of the time, or
alpha=0.05
• The Type II or Beta error is the probability of
accepting the null when it was false (saying
there is no relationship when there is one)
• Power of a test is the probability of detecting
a true result or difference (rejecting the null
hypothesis of no difference when it is false),
also 1-beta (80% conventional)
Measures of Precision of Effect
• The p-value, judged against the alpha error, is
most commonly reported to indicate how unlikely the
result would be under the null hypothesis, with a
low p-value giving stronger evidence against the null.
• A t-statistic, F-statistic, Chi-square, or r-square value gives the relative magnitude of
a relation.
• The higher the magnitude of the above
statistics, the more precise or stronger is the
relationship between the explanatory
variable (s) and the outcome of interest.
Precision of Effect: The Confidence
Interval
• The estimate of where the true value of a
result lies is expressed within 95%
confidence intervals, which will contain the
true relative risk or odds ratio 95% of the
time – corresponds to 2-tailed alpha=0.05
• 95% Confidence intervals are the RR ±
1.96 × SE (computed on the log scale for ratios
such as the RR or OR); since SE is SD / sqrt(N),
confidence intervals are narrowest
(precision greatest) with larger studies.
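The interval calculation can be sketched as follows. This is an illustrative, unadjusted calculation rather than a published analysis: it assumes the usual large-sample standard error of ln(RR), the function name `relative_risk_ci` is hypothetical, and the counts are the Heart Protection Study vascular event counts that appear in the HPS slides of this deck.

```python
import math


def relative_risk_ci(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Relative risk with an approximate 95% CI computed on the log scale."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    # Large-sample standard error of ln(RR) for two independent groups.
    se_log_rr = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# HPS counts from the slides: 2042 events / 10269 (simvastatin), 2606 / 10267 (placebo).
rr, lower, upper = relative_risk_ci(2042, 10269, 2606, 10267)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

The crude RR is about 0.78 with an interval of roughly 0.74 to 0.82, which excludes the null value of 1.0. Because the standard error shrinks as the event counts grow, the same RR from a smaller study would give a wider interval.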
Variable Classification
• What is your outcome (Y) (dependent variable) of interest?
– Categorical (binary, 3 or more categories) examples:
survival, CHD incidence, achievement of BP control
(yes vs. no)
– Continuous: change in blood pressure
• What is the main explanatory or independent variable (X)
of interest?
– Categorical (binary, 3 or more categories) examples:
treatment status (active vs. placebo), JNC-7 blood
pressure category (normal, pre-HTN, Stage 1 HTN,
Stage 2 HTN)
– Continuous: baseline systolic / diastolic blood pressure
Covariates / Confounders
• The relationship between X and Y may be
partially or completely due to one or more
covariates (C1, C2, C3, etc.) if these
covariates are related to both X and Y
• A comparison of baseline treatment group
differences in all possible known
covariates is often done and presented
• Covariates / confounders normally
equalized between groups only in
randomized clinical trial designs
Analyzing Effects of Confounders
• The effect of confounders can be
assessed by:
– Stratifying your analysis by levels of these
variables (e.g., examine relationship of X and
Y separately among levels of covariates C)
– Adjusting for covariates in a multivariable
analysis
– Considering interaction terms to test whether
effect of one factor (e.g., treatment) on
outcome varies by level of another factor
(e.g., gender)
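Stratification can be illustrated with a small sketch. The slides do not name a pooling method; the Mantel-Haenszel odds ratio is used here as one common choice, and all counts are hypothetical, chosen so that the covariate C is related to both exposure and disease. Each stratum is a tuple (a, b, c, d) = (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases).

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a single 2x2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio across strata of a covariate."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical data: two levels of a confounder C.
strata = [
    (10, 90, 40, 360),   # C absent: low disease risk, few exposed
    (120, 280, 30, 70),  # C present: high disease risk, mostly exposed
]

# Crude analysis ignores C by collapsing the strata into one table.
crude = odds_ratio(*[sum(cell) for cell in zip(*strata)])
adjusted = mantel_haenszel_or(strata)
print(f"Crude OR = {crude:.2f}, stratum-adjusted OR = {adjusted:.2f}")
```

In this constructed example the crude odds ratio is above 2 while the odds ratio within each stratum, and the pooled estimate, is 1.0, so the apparent association is entirely due to C.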
Fallacies in Presenting Results:
Statistically vs. Clinically Significant?
• Having a large sample size can virtually assure
statistically significant results, but often with a very
low effect size or relative risk (e.g., a correlation of
0.10 is low but could be statistically significant
when the sample is large)
• Conversely, an insufficient sample size can hide
(not significant) clinically important differences
where the effect size or relative risk may be large.
• Statistical significance is directly related to sample
size and magnitude of effect or difference, and
inversely related to variance in the measure.
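The correlation example above can be checked with a short sketch. It assumes the standard t statistic for testing a Pearson correlation against zero; the sample sizes of 50 and 1000 are hypothetical, and 1.96 is used as an approximate two-sided 5% critical value.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r = 0.10  # explains only r**2 = 1% of the variance
t_small = correlation_t_statistic(r, 50)
t_large = correlation_t_statistic(r, 1000)
print(f"n=50: t = {t_small:.2f}; n=1000: t = {t_large:.2f}")
```

The same weak correlation falls well short of the critical value with 50 subjects but exceeds it with 1000, even though the effect size is unchanged.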
Assessing Accuracy of a Test
                          TRUE DISEASE STATUS / TREATMENT DIFFERENCE
TEST RESULT               DISEASED / YES    NONDISEASED / NO    TOTAL
POSITIVE / reject null    a                 b                   a+b
NEGATIVE / accept null    c                 d                   c+d
TOTAL                     a+c               b+d                 a+b+c+d
SENSITIVITY = a / (a+c)
SPECIFICITY = d / (b+d)
Pos. Pred. Value = a / (a+b) Neg. Pred. Value = d/(c+d)
False positive error (alpha, Type I) = b / (b+d)
False negative error (beta, Type II) = c/ (a+c)
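The formulas above translate directly into code. This is a minimal sketch using the slide's cell labels a, b, c, d; the function name `accuracy_measures` and the counts are hypothetical.

```python
def accuracy_measures(a, b, c, d):
    """Test accuracy measures from a 2x2 table.

    a = test positive, diseased      b = test positive, nondiseased
    c = test negative, diseased      d = test negative, nondiseased
    """
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "positive_predictive_value": a / (a + b),
        "negative_predictive_value": d / (c + d),
        "false_positive_rate": b / (b + d),  # 1 - specificity
        "false_negative_rate": c / (a + c),  # 1 - sensitivity
    }


# Hypothetical screening results in 1000 people, 100 of whom have the disease.
measures = accuracy_measures(a=90, b=50, c=10, d=850)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Note that with these assumed counts the test has 90% sensitivity and about 94% specificity, yet the positive predictive value is only about 64% because the disease is uncommon in the sample.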
Statistics and Statistical
Procedures for Cross-Sectional
and Case-Control Designs
– When both independent and dependent
variables are continuous: Pearson
correlation or linear/polynomial regression
– When dependent variable is continuous
and independent variables are categorical
(with or without continuous or categorical
covariates)
Analysis of variance (Analysis of
covariance with covariates).
Analysis for Cross-Sectional and
Case Control Designs (cont.)
– When both independent and dependent
variables are categorical: Chi-square test of
proportions- prevalence odds ratio for
likelihood of factor Y in those with vs. w/o
factor X.
– When outcome is binary (e.g., survival) and
explanatory variables are categorical and/or
continuous:
• Student’s t-test or Chi-square for initial analysis
• Logistic regression (multiple logistic regression for
covariate adjustment)
Odds of CVD Stratified by CRP Levels in U.S. Persons
(Malik and Wong et al., Diabetes Care, 2005)
[Figure: bar chart of odds ratio for CVD (y-axis, 0 to 6) in those with no disease, metabolic syndrome, or diabetes, each split by low vs. high CRP]
–*p<.05, **p<.01, ***p<.0001 compared to no disease, low CRP
–CRP categories: >3 mg/L (High) and <3 mg/L (Low)
–age, gender, and risk-factor adjusted logistic regression (n=6497)
Metabolic Syndrome Independently Associated with
Inducible Ischemia from SPECT
(Wong ND et al., Diabetes Care 2005; 28: 1445-50 )
Predictor                        OR       95% CI       P value
Log coronary calcium (per SD)    4.11     2.60-6.51    <0.001
Chest Pain Symp                  2.94     1.69-5.09    <0.001
1-2 MetS risk factors            2.99     0.70-12.8    0.14
3 MetS risk factors              4.80     1.01-22.9    0.049
4-5 MetS risk factors            10.93    2.09-57.2    0.005
Diabetes                         4.55     0.98-21.1    0.053
*Estimates adjusted for age, gender, cholesterol and
smoking. Odds of ischemia for metabolic abnormalities
(yes vs. no) (separate model): 1.98 (1.20-3.98), p=0.008
Statistical Procedures for
Prospective Cohort Studies
• When outcome is continuous: Linear and/or
polynomial regression
• When outcome is binary: Relative risk (RR) for
incidence of disease in those with vs. without
risk factor of interest, adjusted for covariates and
considering follow-up time to event: Cox
proportional hazards regression:
$HR(t, z_i) = HR_0(t)\,\exp(\alpha' z_i)$
• If follow-up time is not known, use logistic
regression:
$p(Y=1 \mid r_1, r_2, \ldots, r_n) = 1 / (1 + \exp[-a - b_1 r_1 - \cdots - b_n r_n])$
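The logistic regression formula can be evaluated directly, which also shows why exponentiating a coefficient gives an odds ratio. This is a sketch only: the intercept and coefficient values are hypothetical, not estimates from any study cited in these slides.

```python
import math


def logistic_probability(intercept, coefficients, values):
    """p(Y=1 | r1, r2, ...) = 1 / (1 + exp(-(a + b1*r1 + ... + bn*rn)))."""
    linear = intercept + sum(b * r for b, r in zip(coefficients, values))
    return 1 / (1 + math.exp(-linear))


def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


# Hypothetical model: a = -2.0, one binary risk factor with b1 = 0.7.
a, b = -2.0, [0.7]
p_without = logistic_probability(a, b, [0])
p_with = logistic_probability(a, b, [1])
odds_ratio = odds(p_with) / odds(p_without)
print(f"p without factor = {p_without:.3f}, with factor = {p_with:.3f}")
print(f"odds ratio = {odds_ratio:.2f}, exp(b1) = {math.exp(b[0]):.2f}")
```

The odds ratio comparing those with and without the risk factor equals exp(b1), which is how the adjusted odds ratios reported from multiple logistic regression are obtained.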
CHD, CVD, and Total Mortality:
US Men and Women Ages 30-74
(age, gender, and risk-factor adjusted Cox regression) NHANES II
Follow-Up (n=6255) (Malik and Wong, et al., Circulation 2004; 110: 1245-1250)
[Figure: bar chart (y-axis, 0 to 7) of risk of CHD Mortality, CVD Mortality, and Total Mortality in five groups: None, MetS, Diabetes, CVD, CVD+Diabetes]
* p<.05, ** p<.01, *** p<.0001 compared to none
CV Event-Free 8-year Survival Using
Combined hs-CRP and LDL-C
Measurements
(n=27,939)
[Figure: curves of probability of event-free survival (y-axis, 0.96 to 1.00) over years of follow-up (x-axis, 0 to 8) for four groups: Low CRP-low LDL, Low CRP-high LDL, High CRP-low LDL, High CRP-high LDL]
Median LDL 124 mg/dl; median CRP 1.5 mg/l
Ridker et al, N Engl J Med. 2002;347:1157-1165.
Questions to ask regarding study
results
• How large is the treatment effect (or likelihood of
outcome)?
– Relative risk reduction (may obscure comparative
absolute risks)
– Absolute risk reduction: is this clinically significant?
• How precise is the treatment effect (or likelihood
of outcome)?
– What are the confidence intervals?
– Do they exclude the null value?
(e.g., is the result statistically significant– magnitude
of Chi-square or F-value)
MRC/BHF Heart Protection Study
(HPS): Eligibility
• Age 40–80 years
• Increased risk of CHD death due to prior disease
– Myocardial infarction or other coronary heart
disease
– Occlusive disease of noncoronary arteries
– Diabetes mellitus or treated hypertension
• Total cholesterol > 3.5 mmol/L (> 135 mg/dL)
• Statin or vitamins not considered clearly indicated or
contraindicated by patient’s own doctors
Heart Protection Study Group. Lancet. 2002;360:7-22.
HPS: First Major Coronary
Event
Type of Major Vascular Event    Statin-Allocated (n = 10269)    Placebo-Allocated (n = 10267)    Relative risk (95% CI)
Coronary events
  Nonfatal MI                   357 (3.5%)                      574 (5.6%)
  Coronary death                587 (5.7%)                      707 (6.9%)
  Subtotal: MCE                 898 (8.7%)                      1212 (11.8%)                     0.73 (0.67-0.79), P < 0.0001
Revascularizations
  Coronary                      513 (5.0%)                      725 (7.1%)
  Noncoronary                   450 (4.4%)                      532 (5.2%)
  Subtotal: any RV              939 (9.1%)                      1205 (11.7%)                     0.76 (0.70-0.83), P < 0.0001
Any MVE                         2033 (19.8%)                    2585 (25.2%)                     0.76 (0.72-0.81), P < 0.0001
[Forest plot axis: 0.4 to 1.4; values below 1.0 = Statin Better, values above 1.0 = Placebo Better]
These results from the Heart Protection Study frequently present a relative risk reduction of 24% (or
relative risk of 0.76), but an absolute risk reduction of only 5.5% associated with the simvastatin
treatment.
Heart Protection Study Collaborative Group. Lancet. 2002;360:7-22.
Examining Magnitude of Effect: HPS Study
Example of Vascular Event Reduction
                           Event Yes    Event No
Simvastatin / Treatment    a = 2042     b = 8227
Placebo / Control          c = 2606     d = 7661
Control event rate (CER) = c/(c+d) = 2606/10267 = 0.254
Experimental event rate (EER) = a/(a+b) = 2042/10269 = 0.199
Relative Risk (RR) = EER/CER = 0.199/0.254 = 0.78
Relative Risk Reduction (RRR) = (CER-EER)/CER = (0.254-0.199)/0.254 = 0.22
Absolute Risk Reduction (ARR) = CER-EER = 0.254 - 0.199 = 0.055, or 5.5%
Number Needed to Treat (NNT) = 1/ARR = 1/0.055 = 18.2 (or about 55 events prevented
per 1000 treated)
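The same measures can be computed from the four cell counts. This is a minimal sketch using the HPS counts shown on this slide; the function name `effect_measures` is hypothetical.

```python
def effect_measures(a, b, c, d):
    """Effect measures from a 2x2 trial table.

    a, b = treated with / without event; c, d = control with / without event.
    """
    eer = a / (a + b)  # experimental event rate
    cer = c / (c + d)  # control event rate
    arr = cer - eer    # absolute risk reduction
    return {
        "EER": eer,
        "CER": cer,
        "RR": eer / cer,
        "RRR": arr / cer,
        "ARR": arr,
        "NNT": 1 / arr,
    }


hps = effect_measures(a=2042, b=8227, c=2606, d=7661)
for name, value in hps.items():
    print(f"{name}: {value:.3f}")
print(f"Events prevented per 1000 treated: {1000 * hps['ARR']:.0f}")
```

Working from the unrounded event rates gives an ARR of about 0.055 and an NNT of about 18, in line with the hand calculation.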
Suggestions for Comparison of
Models for Risk Prediction
1. Compare global model fit
2. Compare calibration and discrimination
3. Assess reclassification
If global fit is better, but calibration/discrimination
similar:
- Is fit better among some individuals? (i.e., high risk)
- Is the new risk category more accurate in those
reclassified?
4. Would a higher or lower risk estimate change
treatment for an individual patient?
Cook N. Circulation 2007.
ROC Curves and the c-Statistic
• Measure of discrimination
– Probability that the predicted risk is higher for a case
than a non-case
• A function of the sensitivity and specificity for
each value of a measure or model
– Perfect discrimination: c-statistic of 1
• Scores for all the cases are higher than scores for all non-cases --- no overlap!
– No discrimination: c-statistic of 0.5 (coin toss)
• NOT the probability that an individual is
classified correctly (i.e., that people with a high score will
become a case); that is the predictive value
Comparison of ROC Areas for Prediction
of Myocardial Ischemia
[Figure: ROC curves, sensitivity (y-axis, 0.0 to 1.0) vs. 1-Specificity (x-axis, 0.0 to 1.0)]
• Age, gender: Area=66%, SE=.03
• Age, gender, CRF: Area=74%, SE=.03
• log(CCS+1): Area=76%, SE=.03
• Age, gender, CRF, log(CCS+1): Area=80%, SE=.03
All p<.001
Berman and Wong et al., J Am Coll Cardiol 2004; 44: 923-930.
Problems with C-statistic
• If improvement in the c-statistic was used
as the criterion for model selection:
– Neither LDL, HDL, nor total cholesterol would
have been included in Framingham score!
Conclusion:
• Discrimination is only 1 aspect of model
performance
Reclassification
• Can new markers accurately stratify
individuals into higher or lower risk
categories?
• Important for clinical risk prediction!
• Net Reclassification Index (NRI)
• Integrated Discrimination Index
Reclassification Table
-Reclassified in desirable direction
-Reclassified in undesirable direction
Schnabel et al. Circ 2010.
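A reclassification table can be summarized with the net reclassification measure. The slides do not give its formula; the sketch below assumes the commonly used definition, in which moving up a risk category is desirable for those who have events and moving down is desirable for those who do not. All counts are hypothetical.

```python
def net_reclassification(up_events, down_events, n_events,
                         up_nonevents, down_nonevents, n_nonevents):
    """Net reclassification: net proportion of events moved up
    plus net proportion of non-events moved down."""
    event_part = (up_events - down_events) / n_events
    nonevent_part = (down_nonevents - up_nonevents) / n_nonevents
    return event_part + nonevent_part


# Hypothetical reclassification after adding a new marker to a risk model.
nri = net_reclassification(
    up_events=30, down_events=10, n_events=200,
    up_nonevents=40, down_nonevents=90, n_nonevents=800,
)
print(f"NRI = {nri:.4f}")
```

In this example a net 10% of events move in the desirable direction and a net 6.25% of non-events do as well, for a total of 0.1625.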
Summary
• Research protocols need to include key design
elements such as hypotheses, background /
aims, and methods, including subject
selection/power analysis and statistical methods.
• Different study designs have key advantages
and disadvantages and levels of evidence for
causation.
• Evaluating results from studies requires an
understanding of appropriate use of measures of
effect and consideration of statistical vs. clinical
significance.
Thank you!
For more
information
contact the UCI
Heart Disease
Prevention
Program at:
www.heart.uci.edu
949-824-5561