Transcript A+B

Biostatistics and Epidemiology:
ASPHO 2015
Lillian Sung MD, PhD
Disclosure Information
• Lillian Sung – No disclosures
Outline
• Principles of Use of Biostatistics in
Research
• Principles of Epidemiology and Clinical
Research Design
• Applying Research to Clinical Practice
Principles of Use of Biostatistics in Research
A. Principles of Use of Biostatistics in Research
1. Types of variables
Distinguish types of variables (eg, continuous, categorical, ordinal, nominal) (slide 7)
Understand how the type of variable (eg, continuous, categorical, nominal) affects the choice of statistical
test (slides 7, 12-16)
2. Distribution of Data
Understand how distribution of data affects the choice of statistical test (slides 8, 12, 13)
Differentiate normal from skewed distribution of data (slide 8)
Understand the appropriate use of the mean, median, and mode (slide 8)
Understand the appropriate use of standard deviation (slide 9)
Understand the appropriate use of standard error (slide 9)
3. Hypothesis testing
Distinguish the null hypothesis from an alternative hypothesis (slides 10, 11)
Interpret the results of hypothesis testing (slides 10, 11, 18)
4. Statistical tests
Understand the appropriate use of the chi-square test versus a t-test (slides 12-16)
Understand the appropriate use of analysis of variance (ANOVA) (slides 12, 13)
Understand the appropriate use of parametric (eg, t-test, ANOVA) versus non-parametric (eg, Mann-Whitney U, Wilcoxon) statistical tests (slides 12, 13)
Interpret the results of chi-square tests (slides 15, 16)
Interpret the results of t-tests (slide 14)
Understand the appropriate use of a paired and non-paired t-test (slides 14, 17)
Determine the appropriate use of a 1- versus 2-tailed test of significance (slides 19, 20)
Interpret a p-value (slides 18-21)
Interpret a p-value when multiple comparisons have been made (slide 21)
Interpret a confidence interval (slide 22)
Identify a type I error (slide 23)
Identify a type II error (slide 23)
5. Measurement of association
Differentiate relative risk reduction from absolute risk reduction (slide 24)
Calculate and interpret a relative risk (slide 25)
Calculate and interpret an odds ratio (slide 25)
Interpret a hazard ratio (slide 25)
Understand the uses and limitations of a correlation coefficient (slide 26)
6. Regression
Identify when to apply regression analysis (eg, linear, logistic) (slide 27)
Interpret a regression analysis (eg, linear, logistic) (slide 27)
Identify when to apply survival analysis (eg, Kaplan-Meier) (slides 28-30)
Interpret a survival analysis (eg, Kaplan-Meier) (slides 28-30)
7. Diagnostic tests
Recognize the importance of an independent "gold standard" in evaluating a diagnostic test (slides 31, 32)
Calculate and interpret sensitivity and specificity (slides 31, 32)
Calculate and interpret positive and negative predictive values (slide 31)
Understand how disease prevalence affects the positive and negative predictive value of a test (slide 32)
Calculate and interpret likelihood ratios (slide 33)
Interpret a receiver operator characteristic curve (slide 34)
Interpret and apply a clinical prediction rule (slide 34)
8. Systematic reviews and meta-analysis
Understand the purpose of a systematic review (slide 35)
Understand the advantages of adding a meta-analysis to a systematic review (slide 35)
Interpret the results of a meta-analysis (slide 35)
Identify the limitations of a systematic review (slide 35)
Identify the limitations of a meta-analysis (slide 35)
Types of Variables
Type                  Example
Dichotomous           Induction death yes/no
Categorical/Nominal   Leukemia type: AML, ALL, CML, other
Ordinal               CTCAE toxicity grade 1, 2, 3, 4, 5
Continuous            Serum creatinine
Survival              Time to death
Nature of outcome variable (dichotomous, categorical, ordinal, survival) drives the choice of statistical tests
Distribution of Data
Central Tendency
[Figure: a normal distribution (density vs. z, from -3 to 3), where Mean=Median=Mode, and a skewed distribution, where the mode, median, and mean differ]
Mean – average value (use if normal)
Median – middle value (use if skewed)
Mode – most common value
Distribution drives the choice of statistical tests
Standard Deviation and Standard Error
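• Standard deviation (SD) – typical spread of individual values around the mean
• The wider the spread, the larger the SD
• Standard error (SE) = SD/sqrt(n) – precision of the estimated mean; gets smaller as n increases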
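As a minimal sketch of the SD and SE definitions above, the following uses only the Python standard library. The creatinine values are made up for illustration and are not from the lecture.

```python
import math
import statistics

# Hypothetical serum creatinine values (mg/dL); illustrative only.
values = [0.6, 0.7, 0.8, 0.9, 1.0]

mean = statistics.mean(values)
sd = statistics.stdev(values)        # sample SD (n - 1 in the denominator)
se = sd / math.sqrt(len(values))     # standard error of the mean = SD / sqrt(n)

print(f"mean={mean:.3f}  SD={sd:.3f}  SE={se:.3f}")
```

The SD describes the spread of the individual values, while the SE shrinks as more observations are added.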
Hypothesis Testing
• Define null hypothesis (treatment does not
work)
• Alternative hypothesis – treatment works
• Determine probability of observing data as
or more extreme assuming the treatment
does not work
• If this probability is sufficiently small –
reject null hypothesis
Rejection of Null Hypothesis
[Figure: standard normal curve (density vs. z, from -3 to 3) with the regions for rejection of the null hypothesis lying beyond Z=-1.96 and Z=+1.96]
Statistical Tests
Nature of outcome variable
(dichotomous, categorical, ordinal,
survival) drives the choice of
statistical tests
Distribution drives the
choice of statistical tests
Outcome is Continuous
Number of Groups of Interest   Parametric                     Non-parametric
Two                            Student’s t-test               Mann-Whitney U (Wilcoxon rank sum)
Three or more                  Analysis of variance (ANOVA)   Kruskal-Wallis
Student’s T-test
Two groups with a continuous outcome measure
t = [mean(gp1) – mean(gp2)] / standard error of the difference (calculated from the variance)
Larger t (in absolute value) ~ smaller p value
Assumptions:
Data normally distributed
Observations independent
If data are matched (eg blood pressure before and after), should use paired T-test
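The t statistic described above can be sketched in a few lines. This is an illustrative sketch, not the lecturer's code: the function names and data are hypothetical, the unpaired version assumes equal variances (pooled estimate), and no p value is computed because that requires the t distribution.

```python
import math
import statistics

def pooled_t(group1, group2):
    """Unpaired Student's t statistic and degrees of freedom (equal variances assumed)."""
    n1, n2 = len(group1), len(group2)
    mean_diff = statistics.mean(group1) - statistics.mean(group2)
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return mean_diff / se_diff, n1 + n2 - 2

def paired_t(before, after):
    """Paired t statistic and degrees of freedom, computed on within-subject differences."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n)), n - 1

# Hypothetical data, for illustration only.
t_unpaired, df_unpaired = pooled_t([5, 6, 7, 8, 9], [2, 3, 4, 5, 6])
t_paired, df_paired = paired_t(before=[140, 150, 160, 155], after=[132, 145, 150, 149])
print(t_unpaired, df_unpaired, t_paired, df_paired)
```

The paired version works on one difference per subject, which is why matched data should not be analyzed with the unpaired formula.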
Outcome is Dichotomous
Number of Groups of Interest: Two or More
Test: Chi square or Fisher’s exact test
Chi Square Test
• Compares proportions in 2 or more groups:
                      WBC ≥ 200   WBC < 200
Induction Death Yes   A           B
Induction Death No    C           D
• Calculate expected values for each cell
• X^2 = ∑ (O-E)^2 / E, where O is the observed and E the expected count in each cell
• Larger X^2 ~ smaller p value
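A short sketch of the chi-square calculation for a 2x2 table follows. The cell counts are hypothetical, and the function name is chosen here for illustration. Expected counts come from the row and column totals, as in the steps above.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for a 2x2 table laid out as [[a, b], [c, d]]."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n   # expected count under the null
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2

# Hypothetical counts: rows = induction death yes/no, columns = WBC >= 200 / WBC < 200.
chi2 = chi_square_2x2(20, 10, 30, 40)
print(round(chi2, 3))
```

For a 2x2 table (1 degree of freedom), a statistic above about 3.84 corresponds to P < 0.05.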
Matched/Paired Versus
Independent
• Are the exposure groups independent of one another?
• Ways to induce matching:
– Compare outcome within an individual (eg, pre-post intervention, cross-over trial)
– Can create match by how you select subjects (eg, in a case-control study, can match cases and controls)
P Values
• P value: probability of obtaining a test statistic at
least as extreme as the one actually observed
assuming the null hypothesis is true
• Translation: Chance of getting results at least as extreme as the ones you saw, assuming that the treatment doesn’t work
• P=0.05: 5% chance of seeing data as extreme assuming null hypothesis
– Translation: Assuming that the treatment doesn’t work, there is a 5% chance of observing a difference at least this large by chance alone
Two vs One Sided P Value
Two-sided P value will evaluate both that the treatment is better and that the treatment is worse than control
Typically use two-sided P values
[Figure: Rejection of the null hypothesis, two-sided P value. Standard normal curve (density vs. z) with a 2.5% rejection region in each tail, beyond Z=-1.96 and Z=+1.96]
Two vs One Sided P Value
One-sided P value will only evaluate that the treatment is better than control
Easier to show “statistical significance”
Less commonly used
[Figure: Rejection of the null hypothesis, one-sided P value. Standard normal curve (density vs. z) with a single 5% rejection region beyond Z=+1.645]
• Multiple testing: if you do many tests, you increase the chance of finding P < 0.05 by chance alone; therefore need to adjust the P value for multiple comparisons
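To make the multiple-testing point concrete, the sketch below computes the chance of at least one P < 0.05 among k tests when every null hypothesis is true. It assumes the tests are independent, and it shows the Bonferroni threshold (alpha divided by k) as one common adjustment; the slides do not name a specific method.

```python
def family_wise_error(alpha, k):
    """Chance of at least one false positive among k independent tests, all nulls true."""
    return 1 - (1 - alpha) ** k

alpha = 0.05
for k in (1, 5, 10, 20):
    bonferroni = alpha / k   # Bonferroni-adjusted significance threshold per test
    print(f"{k:>2} tests: chance of any false positive = "
          f"{family_wise_error(alpha, k):.2f}, Bonferroni threshold = {bonferroni:.4f}")
```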
Confidence Intervals
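• Confidence interval: range of values, calculated from the data, that is expected to contain the true parameter with a stated level of confidence
• For example, 95% CI around the mean: if the study were repeated many times, 95% of the intervals calculated this way would contain the true mean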
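A minimal sketch of a 95% confidence interval around a mean follows. The hemoglobin values are hypothetical, and the interval uses the normal approximation (mean plus or minus about 1.96 standard errors); with a sample this small a t-based interval would be somewhat wider.

```python
import math
import statistics

# Hypothetical hemoglobin values (g/dL); illustrative only.
values = [9.8, 10.4, 11.0, 11.3, 11.9, 12.1, 12.6, 13.3]

mean = statistics.mean(values)
se = statistics.stdev(values) / math.sqrt(len(values))
z = statistics.NormalDist().inv_cdf(0.975)   # about 1.96 for a 95% interval

lower, upper = mean - z * se, mean + z * se
print(f"mean={mean:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```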
Type I and Type II Errors
                        TRUTH
CONCLUSION FROM TEST    Difference Exists                          No Difference Exists
Difference Exists       * (correct conclusion): Power, Sensitivity   Type I or alpha error: False positive
No Difference Exists    Type II or beta error: False negative        * (correct conclusion)
Measures of Association
            Outcome Yes   Outcome No
Treatment   A (1)         B (9)
Control     C (4)         D (6)
Risk of outcome in treatment group = A/(A+B) = 0.10
Risk of outcome in control group = C/(C+D) = 0.40
• Absolute risk reduction: decrease in risk of an outcome associated with an intervention
ARR = C/(C+D) - A/(A+B) = 0.40 – 0.10 = 0.30
• Relative risk reduction: absolute risk reduction divided by event rate in the control arm
RRR = 0.30/0.40 = 0.75
• Number needed to treat = 1/ARR = 1/0.30 = 3.3
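The arithmetic on this slide can be reproduced with a few lines of Python. The cell counts (A=1, B=9, C=4, D=6) are the ones given in the table; the variable names are chosen here for illustration.

```python
# Cell counts from the slide's 2x2 table.
a, b = 1, 9   # treatment: outcome yes / no
c, d = 4, 6   # control:   outcome yes / no

risk_treatment = a / (a + b)
risk_control = c / (c + d)

arr = risk_control - risk_treatment   # absolute risk reduction
rrr = arr / risk_control              # relative risk reduction
nnt = 1 / arr                         # number needed to treat

print(f"ARR={arr:.2f}  RRR={rrr:.2f}  NNT={nnt:.1f}")
```

In practice the NNT is usually rounded up to a whole number of patients (here, 4).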
Odds Ratio and Relative Risk
            Outcome Yes   Outcome No
Treatment   A             B
Control     C             D
• Risk in treatment = A/(A+B); Risk in control = C/(C+D)
Relative risk = [A/(A+B)] / [C/(C+D)]
RR 2.5 = 2.5 times risk of outcome if treated
• Odds in treatment = A/B; Odds in control = C/D
Odds ratio = (A/B) / (C/D) = AD/BC
OR 2.5 = 2.5 times odds of outcome if treated
• Hazard ratio: analogous to a relative risk; used in survival analysis
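As a small worked example of the two formulas, the sketch below reuses the counts from the Measures of Association slide (A=1, B=9, C=4, D=6). The function names are hypothetical.

```python
def relative_risk(a, b, c, d):
    """Risk in treatment divided by risk in control."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Odds in treatment divided by odds in control, equivalent to AD/BC."""
    return (a * d) / (b * c)

rr = relative_risk(1, 9, 4, 6)
odds = odds_ratio(1, 9, 4, 6)
print(f"RR={rr:.2f}  OR={odds:.2f}")
```

With these counts the outcome is fairly common in the control arm (40%), so the odds ratio lies further from 1 than the relative risk; the two are close only when the outcome is rare.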
Correlation Coefficient
• Strength of the linear relationship between two numbers
• -1 ≤ r ≤ +1
• Measure of correlation
• Not measure of concordance
[Figure: two scatter plots, one with points on a rising straight line (r = 1.0) and one with points on a falling straight line (r = -1.0)]
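The distinction between correlation and concordance can be illustrated with a small sketch. The two "methods" below are hypothetical measurements in which the second always reads 2 units higher than the first: the correlation is perfect, yet the two methods never agree.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements; method B is always 2 units higher than method A.
method_a = [1, 2, 3, 4]
method_b = [value + 2 for value in method_a]

r = pearson_r(method_a, method_b)
exact_agreements = sum(a == b for a, b in zip(method_a, method_b))
print(f"r={r:.2f}, exact agreements={exact_agreements}")
```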
Regression
Used to define relationships or to predict an outcome based
on one or more exposure variables
• Univariate: single exposure variable, single outcome
• Multivariable: multiple exposure variables, single outcome
Type of regression depends on nature of outcome variable
• Dichotomous – logistic regression
• Continuous – linear regression
• Survival – Cox proportional hazards model
Survival Analysis
• Outcome is time to event
• Censor patients who don’t have an event when last observed
• Most data is right censored
[Figure: patient follow-up timelines between START STUDY and STOP STUDY, with censored observations marked]
Kaplan-Meier Method
• Example of how to display survival data
• Calculates survival probability whenever an event occurs
[Figure: Kaplan-Meier curve, Survival (y-axis) by Months (x-axis)]
• Use to describe survival at a given time eg survival at 30 months is 40%
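A minimal sketch of the Kaplan-Meier calculation is shown below. The follow-up data are hypothetical and were chosen so that estimated survival at 30 months is 40%. The function name is an assumption, and patients censored at the same time as an event are treated as still at risk at that time (the usual convention).

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up in months; events: 1 = event occurred, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Survival only changes at times when an event occurs.
            survival *= (at_risk - deaths) / at_risk
            curve.append((t, survival))
        at_risk -= leaving   # events and censored patients both leave the risk set
    return curve

# Hypothetical cohort of 8 patients.
times = [6, 12, 12, 18, 24, 30, 30, 36]
events = [1, 1, 0, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"{t} months: survival {s:.2f}")
```

Censored patients contribute to the number at risk until they are last observed, which is what distinguishes this from simply dividing survivors by the starting cohort.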
When to Use Survival Analysis
• Time to event data
• Each individual - different length of follow-up
• Patients may be lost to follow-up
• Patients may be censored
Diagnostic Tests
                    Gold Standard True   Gold Standard False
New Test Positive   A                    B
New Test Negative   C                    D
Need gold standard
• Sensitivity = A/(A+C) – proportion of those with the disease who have a positive test
• Specificity = D/(B+D) – proportion of those without the disease who have a negative test
• Positive predictive value = A/(A+B) – proportion of those who test positive who have the disease
• Negative predictive value = D/(C+D) – proportion of those who test negative who do not have the disease
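The four formulas can be checked with a short sketch. The counts are hypothetical; A, B, C, and D follow the slide's layout (A = test positive and disease present, B = test positive and disease absent, C = test negative and disease present, D = test negative and disease absent).

```python
# Hypothetical 2x2 counts against the gold standard.
a, b = 90, 30    # new test positive: disease present / absent
c, d = 10, 170   # new test negative: disease present / absent

sensitivity = a / (a + c)
specificity = d / (b + d)
ppv = a / (a + b)
npv = d / (c + d)

print(f"sensitivity={sensitivity:.2f}  specificity={specificity:.2f}  "
      f"PPV={ppv:.2f}  NPV={npv:.2f}")
```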
Influence of Prevalence on Diagnostic Tests
                    Gold Standard True   Gold Standard False
New Test Positive   A                    B
New Test Negative   C                    D
• Because (A+C)/(A+B+C+D) is prevalence of disease:
• Prevalence influences PPV and NPV
• Does not influence sensitivity and specificity
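To illustrate the effect of prevalence, the sketch below holds sensitivity and specificity fixed (both 0.90, a hypothetical test) and recomputes the predictive values at several prevalences. The function name is an assumption made for this example.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)

for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.3f}")
```

With the same test, PPV rises and NPV falls as the disease becomes more common.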
Likelihood Ratios
• LR+: How much the odds of a disease increase
when the test is positive
= sensitivity/(1-specificity)
• LR-: How much the odds of a disease decrease
when the test is negative
= (1-sensitivity)/specificity
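A brief sketch of the two likelihood ratio formulas, using hypothetical values for sensitivity (0.90) and specificity (0.85):

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a diagnostic test."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative

lr_pos, lr_neg = likelihood_ratios(0.90, 0.85)
print(f"LR+={lr_pos:.1f}  LR-={lr_neg:.2f}")
```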
Other Diagnostic Test Issue
• Receiver operator curve: to evaluate the optimal threshold for a diagnostic test
[Figure: receiver operator curve, Sensitivity (y-axis) by (1-Specificity) (x-axis)]
• Clinical prediction rule: using signs, symptoms and tests to predict a clinical outcome
Systematic Reviews
• Identify studies that address a similar
question – and synthesize the data either
qualitatively or quantitatively
• Quantitative review – meta-analysis
• Limitations
• Heterogeneity in treatment effect – may not be
appropriate to combine
• Publication bias
Principles of Epidemiology and Clinical Research Design and Applying Research to Clinical Practice
B. Principles of Epidemiology and Clinical Research Design
1. Study types
Distinguish between Phase I, II, III, and IV clinical trials (slide 39)
Recognize a retrospective study (slide 40)
Understand the strengths and limitations of retrospective studies (slide 40)
Recognize a case series (slide 41)
Understand the strengths and limitations of case series (slide 41)
Recognize a cross-sectional study (slide 42)
Understand the strengths and limitations of cross-sectional studies (slide 42)
Recognize a case-control study (slide 43)
Understand the strengths and limitations of case-control studies (slide 44)
Recognize a longitudinal study (slide 48)
Understand the strengths and limitations of longitudinal studies (slide 48)
Recognize a cohort study (slide 45)
Understand the strengths and limitations of cohort studies (slide 46)
Recognize a randomized-controlled study (slide 47)
Understand the strengths and limitations of randomized-controlled studies (slide 47)
Recognize a before-after study (slide 48)
Understand the strengths and limitations of before-after studies (slide 48)
Recognize a crossover study (slide 48)
Understand the strengths and limitations of crossover studies (slide 48)
Recognize an open-label study (slide 49)
Understand the strengths and limitations of open-label studies (slide 49)
Recognize a post-hoc analysis (slide 49)
Understand the strengths and limitations of post-hoc analyses (slide 49)
Recognize a subgroup analysis (slide 49)
Understand the strengths and limitations of subgroup analyses (slide 49)
2. Bias and Confounding
Understand how bias affects the validity of results (slide 50)
Understand how confounding affects the validity of results (slide 50)
Identify common strategies in study design to avoid or reduce bias (slide 51)
Identify common strategies in study design to avoid or reduce confounding (slide 51)
Understand how study results may differ between distinct sub-populations (effect modification) (slide 52)
3. Causation
Understand the difference between association and causation (slide 53)
Identify factors that strengthen causal inference in observational studies (eg, temporal sequence, dose
response, repetition in a different population, consistency with other studies, biologic plausibility) (slide
54)
4. Incidence and Prevalence
Distinguish disease incidence from disease prevalence (slide 55)
5. Screening
Understand factors that affect the rationale for screening for a condition or disease (eg, prevalence, test
accuracy, risk-benefit, disease burden, presence of a presymptomatic state) (slide 56)
6. Decision analysis
Understand the strengths and limitations of decision analyses (slide 57)
Interpret a decision analysis
7. Cost-benefit, cost-effectiveness, and outcomes
Differentiate cost-benefit from cost-effectiveness analysis (slide 58)
Understand how quality-adjusted life years are used in cost analyses (slide 58)
Understand the multiple perspectives (eg, of an individual, payor, society) that influence interpretation
of cost-benefit and cost-effectiveness analyses (slide 58)
8. Sensitivity analysis
Understand the strengths and limitations of sensitivity analysis (slide 59)
Interpret the results of sensitivity analysis (slide 59)
9. Measurement
Understand the types of validity that relate to measurement (eg, face, construct, criterion, predictive,
content) (slide 61)
Distinguish validity from reliability (slides 60, 61)
Distinguish internal from external validity (slide 61)
Distinguish accuracy from precision (slide 60)
Understand and interpret measurements of interobserver reliability (eg, kappa) (slide 60)
Understand and interpret Cronbach's alpha (slide 60)
Study Types
Phases of Drug Studies
Phase I
• Description: Intended to find dose range that is tolerated and safe (maximum tolerated dose, MTD)
• Subjects: Usually very small sample size, typically without a control group
Phase II
• Description: Preliminary efficacy information
• Subjects: Larger than Phase I, but still limited sample size
Phase III
• Description: Definitive efficacy information and some common side effects
• Subjects: Usually large sample size; may be blinded
Phase IV
• Description: Post-marketing surveillance to detect rare side effects
• Subjects: Very large sample size
Retrospective Study
• Exposures and outcomes have already occurred
• Strengths:
• Feasible and inexpensive
• Limitations:
• Limited availability of data on confounders
• No control over when or how exposure or
outcome measured
• Recall bias
Case Series
• Describing similar cases, treatments or
outcomes
• Strengths:
• Feasible and inexpensive
• Limitations:
• Cannot test hypotheses
• Selection bias
Cross-Sectional Study
• All measures obtained on a single occasion
• Strengths:
• Fast/inexpensive
• No loss to follow-up
• Limitations:
• Difficult to establish causal relationships
• Can measure prevalence – not incidence
Case-Control Studies
• Identify those with and without outcome
• Look BACK to see how many had
potential predictor
[Figure: CASES and CONTROLS, each traced back to Predictor Present/Absent?]
Case-Control Studies
• Strengths:
– Good for rare outcomes or long latency
between predictor and outcome
• Limitations:
– Cannot estimate incidence or prevalence of
disease
– Can only study one outcome
– Prone to bias:
• Sampling – selection of controls is critical
• Recall bias
Cohort Studies
• Identify those with and without potential
predictor
• Look FORWARD to see how many have
outcome
• Can be prospective or retrospective
[Figure: PRED YES and PRED NO groups, each followed forward to Outcome Yes/No?]
Cohort Studies
• Strengths:
– Time sequence strengthens inference
– Absence of recall bias
– Can calculate incidence
• Limitations:
– Expensive
Randomized Controlled Trials
• Randomization:
– Ensures that known and unknown potential confounders are equally distributed among the treatment and control groups
– Avoids allocation bias
• Strengths:
– Strongest design to make inferences about therapy
– Limits influence of confounders, allocation bias
• Limitations:
– Expensive
– Usually lack generalizability
Other Study Types
• Longitudinal study: track same individuals over a period of time
and repeat measurements
– Strengths: natural history
– Limitations: hard to determine causation, lost to follow-up
• Before and after study: evaluate an outcome before and
following institution of an intervention
– Strengths: feasible
– Limitations: confounders, regression to the mean
• Crossover study: evaluate an outcome in two time periods
within the same individual – typically randomize order
– Strengths: reduces variability, improves power
– Limitations: need chronic stable conditions, short onset of
action, condition cannot be “cured” by intervention
Other Study Types
• Open label study: not blinded
– Strengths: feasible, ethics
– Limitations: co-interventions (may treat groups
differently), contamination, observer bias
• Post-hoc analysis: examining data after a study has been
completed for relationships not hypothesized a priori
• Limitations: multiple testing
• Sub-group analysis: examining patterns in a sub-group of
patients
• Strengths: may be sub-groups of patients who respond
differently to treatment
• Limitations: multiple testing, limited power
Bias and Confounding
• Bias: systematic error
• Selection bias
• Measurement bias
• Confounder: third variable that is associated
with exposure and outcome variables and
not in the causal pathway
Both bias and confounders are major threats
to validity of any study
Strategies to Avoid Bias/Confounding
• Bias: randomization, double-blinding,
standardize measurement of outcomes
• Confounding:
• Restriction - only include specific sub-groups
• Stratification - analyze by sub-group
• Stratify randomization - ensure equal
distribution
• Multiple regression techniques
Effect Modifier
• Interaction
• Two exposure variables of interest
• Effect of one exposure variable depends on the second exposure variable
• Example: influence of BMI on SBP differs in males and females
[Figure: two panels titled "Systolic BP by BMI", Systolic BP (y-axis, 0 to 250) by BMI (x-axis, 20 to 40), with separate lines for Male and Female]
Understand how results may differ between distinct sub-populations (effect modification)
Causation
• Association: correlation
• Causation: outcome causally related to
exposure
Factors that strengthen causal inference in observational studies (Bradford-Hill criteria):
• Temporality – cause precedes effect
• Strength – large relative risk
• Dose-response
• Reversibility
• Consistency – repeatedly observed
• Biologic plausibility
• Specificity – one cause, one effect
• Analogy – same relationship observed in a different disease
Incidence and Prevalence
• Incidence: number of new cases per unit
time
• Prevalence: total number of existing cases at a given point in time
Screening
Factors that affect rationale for screening:
• Disease prevalence and burden (important)
• Latent stage of disease (presymptomatic)
• Test available, accurate and acceptable
• Screening will do more good than harm
• Availability of treatment
Decision Analysis
• Decision analysis – quantitative method to determine the treatment option with the best expected outcome
• Strengths:
– Quantitative comparison of different treatment strategies that can incorporate benefits, side effects and costs
• Limitations:
– Less useful for individual patient decision making
– Many probabilities and outcomes (such as quality of life) not known
Cost Analyses
• Cost benefit analysis: both benefits and costs expressed in
monetary terms - “net present value”
• Cost effectiveness analysis: compares costs and outcomes
of different strategies eg. cost per quality-adjusted life
years (QALYs)
• QALYs = quality of life * length of life
– In cost effectiveness analysis, illustrates the cost to gain
quality and quantity of life
• Different perspective – individual, society, healthcare
system – influence CBA and CEA
Sensitivity Analysis
• In decision and cost analyses, probabilities and outcomes are uncertain
• Sensitivity analysis – vary probabilities and
outcomes to determine if conclusions
change
• If findings don’t change, conclusions are
robust
Measurement
Reliability: consistency or reproducibility (precise)
• Test re-test (repeated over time), interrater (repeated by different raters)
– Measure agreement with kappa statistic (dichotomous outcomes)
• Cronbach’s alpha – internal consistency
– Extent to which items are correlated
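The kappa statistic mentioned above compares observed agreement with the agreement expected by chance. The sketch below computes Cohen's kappa for two raters and a dichotomous rating; the counts and the function name are hypothetical.

```python
def cohens_kappa(both_yes, rater1_only, rater2_only, both_no):
    """Cohen's kappa for two raters classifying subjects as yes/no."""
    n = both_yes + rater1_only + rater2_only + both_no
    observed = (both_yes + both_no) / n
    # Chance agreement from each rater's marginal proportions of yes and no.
    rater1_yes = (both_yes + rater1_only) / n
    rater2_yes = (both_yes + rater2_only) / n
    expected = rater1_yes * rater2_yes + (1 - rater1_yes) * (1 - rater2_yes)
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of 100 subjects.
kappa = cohens_kappa(both_yes=40, rater1_only=10, rater2_only=5, both_no=45)
print(f"kappa={kappa:.2f}")
```

Here the raters agree on 85% of subjects, but 50% agreement would be expected by chance, so kappa is 0.70.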
Validity
Validity: degree to which it measures what it claims to
measure (accurate)
• Internal - within the specific study
• External - generalizability
• How to determine:
– Face – do the items make sense
– Content – does the instrument seem to contain the
correct items
– Criterion – if there is a gold standard
– Construct – extent to which the measure behaves in the
hypothesized manner
– Predictive – extent to which a score predicts some
criterion measure
C. Applying Research to Clinical Practice
1. Assessment of study design, performance & analysis (internal validity)
Recognize when appropriate control groups have been selected for a case-control study (slide 64)
Recognize when appropriate control groups have been selected for a cohort study (slide 64)
Recognize the use and limitations of surrogate endpoints (slide 64)
Understand the use of intent-to-treat analysis (slide 65)
Understand how sample size affects the power of a study (slide 65)
Understand how sample size may limit the ability to detect adverse events (slide 65)
Understand how to calculate an adequate sample size for a controlled trial (ie, clinically meaningful
difference, variability in measurement, choice of alpha and beta) (slide 66)
2. Assessment of generalizability (external validity)
Identify factors that contribute to or jeopardize generalizability (slide 67)
Understand how non-representative samples can bias results (slide 67)
Assess how the data source (eg, diaries, billing data, discharge diagnostic code) may affect study results
(slide 68)
3. Application of information for patient care
Estimate the post-test probability of a disease, given the pretest probability of the disease and the
likelihood ratio for the test (slide 69)
Calculate absolute risk reduction (slide 69)
Calculate and interpret the number-needed-to treat (slide 69)
Distinguish statistical significance from clinical importance (slide 69)
4. Using the medical literature
Given the need for specific clinical information, identify a clear, structured, searchable clinical question
(slide 70)
Identify the study design most likely to yield valid information about the accuracy of a diagnostic test
(slide 70)
Identify the study design most likely to yield valid information about the benefits and/or harms of an
intervention (slide 70)
Identify the study design most likely to yield valid information about the prognosis of a condition (slide 70)
Internal Validity
• Appropriate control groups for case-control study
• From population at risk that are otherwise similar to
cases
• Appropriate control groups for cohort study
• Non-exposed that are otherwise similar to exposed
group
• Surrogate endpoints:
• Laboratory test or physical finding used instead of a
clinically meaningful endpoint
• Strengths: may be detected earlier or in more patients
• Limitations: effects on surrogate endpoint may not
correspond to effects on clinical endpoint
• Intention to treat analysis:
• Analyzes all randomized subjects in the group to which they were assigned
• Important because patients do not cross over randomly
• Should be primary analysis
• Power – probability the test will reject the null hypothesis when a difference truly exists
• Larger sample size gives:
• More power
• Greater ability to detect adverse events
Sample Size Considerations
To calculate sample size for a controlled trial:
• Alpha - typically 0.05 (5% probability of finding a difference by chance alone if one does not exist)
• Beta (1-power) - typically 0.1 or 0.2 (eg, 0.2 = 20% probability of not finding a difference if one really exists)
• Minimal clinically important difference and variability of outcome
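The ingredients listed above can be combined in a standard normal-approximation formula for comparing two means: n per group = 2 x (z for alpha + z for beta)^2 x SD^2 / difference^2. This formula is not given on the slides and is shown only as a sketch; the function name and the example inputs (difference of 5 units, SD of 10 units) are hypothetical, and exact t-based methods give slightly larger numbers.

```python
import math
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate subjects per arm to detect a mean difference `delta` (two-sided alpha)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)            # about 0.84 for 80% power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)

print(n_per_group(delta=5, sd=10))               # 80% power
print(n_per_group(delta=5, sd=10, power=0.90))   # 90% power needs more subjects
```

A smaller clinically important difference, more variable outcomes, or higher power all increase the required sample size.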
External Validity
• Factors that can jeopardize generalizability
• Highly selected subjects
• Intervention applied in a manner not feasible for
routine practice
• Maneuvers to enhance compliance
• Excessive monitoring
• If study sample not representative – biased
because not generalizable
Influence of Data Sources
• Diaries – missing data, recall bias
• Administrative data – eg billings data,
discharge diagnosis codes:
– Validity needs to be established
– Lack clinical information such as disease risk
status
– Errors
Application of Information
• Post-test odds of disease = pretest odds of disease x likelihood ratio of test (odds = probability/(1 - probability); convert post-test odds back to a probability with odds/(1 + odds))
• ARR = risk (control) – risk (treatment) (slide 21)
• NNT = 1/ARR
• Significance
– Statistical: set by alpha – more likely as sample
size increases
– Clinical: what is important – does not change
with sample size
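Likelihood ratios act on odds rather than directly on probabilities, so estimating a post-test probability takes three steps: convert the pretest probability to odds, multiply by the likelihood ratio, and convert back. A minimal sketch with hypothetical inputs (pretest probability 20%, LR+ of 6, LR- of 0.1):

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Post-test probability of disease from a pretest probability and a likelihood ratio."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)

after_positive = post_test_probability(0.20, 6.0)   # positive test, LR+ = 6
after_negative = post_test_probability(0.20, 0.1)   # negative test, LR- = 0.1
print(f"after positive test: {after_positive:.2f}, after negative test: {after_negative:.3f}")
```

Multiplying the probability itself by the likelihood ratio (0.20 x 6 = 1.2) would give an impossible value above 1, which is why the conversion through odds is needed.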
Using Medical Literature
• To conduct search, need to identify a clear, structured searchable research question
• Optimal study designs:
Purpose                                    Design That Yields Most Valid Information
Benefits and/or harms of an intervention   Randomized controlled trials
Prognosis                                  Cohort studies
Diagnostic test                            Cross-sectional studies
THANKS!
More questions?
Email: [email protected]
Phone: 416-813-5287