Regression Analysis
Chapter 11
Survival Analysis
Part 2
Survival Analysis and
Regression
Combine lots of information
Look at several variables simultaneously
Explore interactions
model interaction directly
Control (adjust) for confounding
Proportional hazards regression
(Cox Regression)
Can we relate predictors to survival time?
We would like something like linear regression
$t = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots$
Can we incorporate censoring too?
Use the hazard function
Hazard function
Given patient survived to time t, what is the
probability they develop outcome very soon?
(t + small amount of time)
Approximates proportion of patients having
event around time t
Hazard function
$\lambda(t) \approx \Pr(t \le T < t + \Delta \mid t \le T) / \Delta$ for a small time increment $\Delta$
Hazard less intuitive than survival curve
Conditional probability the event will occur between t and t + Δ, given it has not previously occurred
Rate per unit of time; as Δ goes to 0 we get the instantaneous rate
Tells us where the greatest risk is given survival up to that
time (risk of the event at that time for an individual)
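The limiting idea above can be checked numerically. The following is a minimal sketch, assuming a hypothetical exponential survival curve with rate 0.2 (not from the slides); the function names `survival` and `approx_hazard` are illustrative only. As the time increment shrinks, the conditional probability per unit of time settles at the constant hazard of the exponential model.

```python
import math

def survival(t, rate=0.2):
    # Hypothetical exponential survival curve S(t) = exp(-rate * t); an assumption for illustration
    return math.exp(-rate * t)

def approx_hazard(t, delta, rate=0.2):
    # Pr(t <= T < t + delta | T >= t), divided by delta to give a rate per unit of time
    cond_prob = (survival(t, rate) - survival(t + delta, rate)) / survival(t, rate)
    return cond_prob / delta

# Shrinking delta moves the approximation toward the instantaneous rate
for delta in (1.0, 0.1, 0.001):
    print(delta, approx_hazard(5.0, delta))
```

For the exponential model the hazard is the same at every t, so the printed values approach 0.2 regardless of the time point chosen.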
Possible Hazard of Death from Birth
Probability of dying in next year as function of age
[Figure: hazard $\lambda(t)$ plotted against age, with axis marks at 0, 6, 17, 23, and 80]
At which age would the hazard be greatest?
Possible Hazard of Divorce
[Figure: hazard of divorce over time, with axis marks at 0, 2, 10, 25, 35, and 50]
Why “proportional hazards”?
Ratio of hazards measures relative risk
$RR(t) = \dfrac{\lambda(t) \text{ for exposed}}{\lambda(t) \text{ for unexposed}}$
If we assume relative risk is constant over time…
$\dfrac{\lambda(t) \text{ for exposed}}{\lambda(t) \text{ for unexposed}} = c$
The hazards are proportional!
Proportional Hazard of Death from Birth
Probability of dying in next year as function of age for
two groups (women, men)
[Figure: hazard $\lambda(t)$ plotted against age for women and men, with axis marks at 0, 6, 17, 23, and 80]
At which age would the hazard be greatest?
Proportional Hazards and
Survival Curves
If we assume proportional hazards then
$S_a(t) = [S_b(t)]^c$
The curves should not cross.
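A short derivation may help show where the power relationship between the two survival curves comes from. It assumes the standard link between survival and hazard, $S(t) = \exp\left(-\int_0^t \lambda(u)\,du\right)$, which the slides do not state explicitly, and writes $\lambda_a$, $\lambda_b$ for the hazards in groups a and b.

```latex
\begin{align*}
\lambda_a(u) &= c\,\lambda_b(u) \quad \text{for all } u
  && \text{(proportional hazards)} \\
S_a(t) &= \exp\!\left(-\int_0^t \lambda_a(u)\,du\right)
        = \exp\!\left(-c\int_0^t \lambda_b(u)\,du\right) \\
       &= \left[\exp\!\left(-\int_0^t \lambda_b(u)\,du\right)\right]^{c}
        = \left[S_b(t)\right]^{c}
\end{align*}
```

Because $0 < S_b(t) < 1$ once events start to occur, raising it to a power $c > 1$ always gives a smaller value and $c < 1$ always gives a larger one, which is why the two curves stay on the same side of each other.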
Proportional hazards regression model
one covariate
$\lambda(t) = \lambda_0(t) \exp(\beta_1 X_1)$
$\lambda_0(t)$ = unspecified baseline hazard (plays the role of the constant term)
$\lambda_0(t)$ is the hazard for a subject with X = 0 (it cannot be negative)
$\beta_1$ = regression coefficient associated with the predictor (X)
Positive $\beta_1$ indicates that larger X increases the hazard
Can include more than one predictor
Interpretation of Regression Parameters
$\lambda(t) = \lambda_0(t) \exp(\beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \beta_p X_p)$
$\log(\lambda(t)) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$, where $\beta_0 = \log(\lambda_0(t))$
For a binary predictor, $X_1$ = 1 if exposed and 0 if unexposed,
$\exp(\beta_1)$ is the relative hazard for exposed versus unexposed
($\beta_1$ is the log of the relative hazard)
$\exp(\beta_1)$ can be interpreted as a relative risk or relative rate with all other covariates held fixed.
Example - risk of outcome for
women vs. men
Suppose X1=1 for females, 0 for males
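$\lambda(t) = \lambda_0(t) \exp(\beta_1 X_1)$
For females: $\lambda(t) = \lambda_0(t) \exp(\beta_1 \cdot 1) = \lambda_0(t) \exp(\beta_1)$
For males: $\lambda(t) = \lambda_0(t) \exp(\beta_1 \cdot 0) = \lambda_0(t)$
Relative hazard $= \dfrac{\text{hazard for females}}{\text{hazard for males}} = \dfrac{\lambda_0(t) \exp(\beta_1)}{\lambda_0(t)} = \exp(\beta_1)$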
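The cancellation of the baseline hazard can be illustrated numerically. This is a minimal sketch under stated assumptions: the baseline hazard `baseline_hazard` and the coefficient value 0.5 are hypothetical choices made only for illustration, not estimates from any data in these slides.

```python
import math

def baseline_hazard(t):
    # Hypothetical baseline hazard that rises with time; any positive function would do
    return 0.01 + 0.002 * t

def hazard(t, x1, beta1):
    # Proportional hazards model with one covariate: baseline times exp(beta1 * x1)
    return baseline_hazard(t) * math.exp(beta1 * x1)

beta1 = 0.5  # hypothetical log relative hazard for females (x1 = 1) vs. males (x1 = 0)

# The hazards themselves change with t, but their ratio does not
for t in (1.0, 10.0, 50.0):
    ratio = hazard(t, 1, beta1) / hazard(t, 0, beta1)
    print(t, hazard(t, 0, beta1), hazard(t, 1, beta1), ratio)
```

Whatever shape the baseline takes, the ratio printed in the last column equals exp(beta1) at every time point.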
Example - Risk of outcome for
1 unit change in blood pressure
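Suppose $X_1$ = systolic blood pressure (mm Hg)
$\lambda(t) = \lambda_0(t) \exp(\beta_1 X_1)$
For person with SBP = 114: $\lambda(t) = \lambda_0(t) \exp(\beta_1 \cdot 114)$
For person with SBP = 113: $\lambda(t) = \lambda_0(t) \exp(\beta_1 \cdot 113)$
Relative risk of 1 unit increase in SBP:
$\dfrac{\lambda_0(t) \exp(114 \beta_1)}{\lambda_0(t) \exp(113 \beta_1)} = \exp(114 \beta_1 - 113 \beta_1) = \exp(\beta_1)$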
Example - Risk of outcome for
10 unit change in blood pressure
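Suppose $X_1$ = systolic blood pressure (mm Hg)
$\lambda(t) = \lambda_0(t) \exp(\beta_1 X_1)$
For person with SBP = 110: $\lambda(t) = \lambda_0(t) \exp(\beta_1 \cdot 110)$
For person with SBP = 100: $\lambda(t) = \lambda_0(t) \exp(\beta_1 \cdot 100)$
Relative risk of 10 unit increase in SBP:
$\dfrac{\lambda_0(t) \exp(110 \beta_1)}{\lambda_0(t) \exp(100 \beta_1)} = \exp(110 \beta_1 - 100 \beta_1) = \exp(10 \beta_1)$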
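The two blood pressure examples follow one rule: the relative risk for a change of k units is exp(k × coefficient). The sketch below assumes a hypothetical coefficient of 0.02 per mm Hg (not an estimate from the slides), and `hazard_ratio` is an illustrative helper name.

```python
import math

def hazard_ratio(beta1, change):
    # Relative hazard for a `change`-unit increase in the predictor
    return math.exp(beta1 * change)

beta1 = 0.02  # hypothetical log hazard ratio per 1 mm Hg of systolic blood pressure

hr_1 = hazard_ratio(beta1, 1)    # 114 vs. 113 mm Hg
hr_10 = hazard_ratio(beta1, 10)  # 110 vs. 100 mm Hg

print(hr_1, hr_10, hr_1 ** 10)
```

The 10-unit hazard ratio equals the 1-unit hazard ratio raised to the 10th power, so effects multiply rather than add on the hazard scale.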
Parameter estimation
How do we come up with estimates for $\beta_i$?
Can’t use least squares: survival times are censored, so the outcome is not fully observed for every subject
Maximum partial likelihood (beyond the scope of this class)
Given our data, what are the values of $\beta_i$ that are most likely?
See page 392 of Le for details
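For readers who want a feel for what maximum partial likelihood does, here is a minimal sketch for one covariate, assuming no tied event times. The four-subject data set is made up purely for illustration, and the crude grid search stands in for the numerical optimizer that real software would use; `cox_partial_loglik` is an illustrative name, not a library function.

```python
import math

def cox_partial_loglik(beta, times, events, x):
    """Log partial likelihood for a single covariate, assuming no tied event times."""
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i != 1:
            continue  # censored subjects contribute only through the risk sets
        # Risk set: everyone still under observation at the event time t_i
        risk_set = [j for j, t_j in enumerate(times) if t_j >= t_i]
        denom = sum(math.exp(beta * x[j]) for j in risk_set)
        loglik += beta * x[i] - math.log(denom)
    return loglik

# Made-up toy data (hypothetical): follow-up time, event indicator (1 = event), covariate
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
x = [1, 0, 1, 0]

# Crude grid search for the value of beta that makes the observed data most likely
grid = [i / 1000 for i in range(-3000, 3001)]
beta_hat = max(grid, key=lambda b: cox_partial_loglik(b, times, events, x))
print(beta_hat, math.exp(beta_hat))
```

Note that the baseline hazard never appears in the calculation: only the ordering of the event times and the covariate values in each risk set matter.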
Inference for proportional hazards regression
Collect data, choose model, estimate the $\beta_i$'s
Describe hazard ratios, $\exp(\beta_i)$, in statistical terms.
How confident are we of our estimate?
Is the hazard ratio different from one due to chance?
95% Confidence Intervals for the relative
risk (hazard ratio)
Based on transforming the 95% CI for the regression coefficient $\beta_i$
$(e^{\beta_i - 1.96\,SE},\ e^{\beta_i + 1.96\,SE})$
Supplied by SAS (PROC PHREG prints them when the RISKLIMITS option is added to the MODEL statement)
If the interval does not include 1: “We have a statistically significant association between the predictor and the outcome controlling for all other covariates”
Equivalent to a hypothesis test; reject Ho: RR = 1 at alpha = 0.05 (Ha: RR ≠ 1)
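The interval can be computed by hand from any coefficient and its standard error. The sketch below uses the treat estimate (0.57276) and standard error (0.50960) reported in the PROC PHREG output later in these slides; `hazard_ratio_ci` is an illustrative helper name, not a SAS or library function.

```python
import math

def hazard_ratio_ci(beta_hat, se, z=1.96):
    # Build the interval on the log scale, then exponentiate both ends
    lower = math.exp(beta_hat - z * se)
    upper = math.exp(beta_hat + z * se)
    return math.exp(beta_hat), lower, upper

# Estimate and standard error for treat, taken from the slides' unadjusted PHREG output
hr, lower, upper = hazard_ratio_ci(0.57276, 0.50960)
print(round(hr, 3), round(lower, 3), round(upper, 3))
```

The resulting interval contains 1, which agrees with the non-significant p-value reported for treat in the unadjusted model.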
Hypothesis test for individual PH
regression coefficient
Null and alternative hypotheses
Ho: $\beta_i = 0$, Ha: $\beta_i \ne 0$
Test statistic and p-values supplied by SAS
If p<0.05, “there is a statistically significant association
between the predictor and outcome variable controlling
for all other covariates” at alpha = 0.05
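When X is binary and is the only predictor, the result is essentially the same as the log-rank test (the Cox score test is the log-rank test when there are no tied event times)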
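The chi-square statistic that PROC PHREG prints for each coefficient is a Wald statistic, the squared ratio of the estimate to its standard error, compared with a chi-square distribution on 1 degree of freedom. As a minimal sketch, the code below reproduces it for the treat estimate and standard error reported in the PROC PHREG output later in these slides; `wald_test` is an illustrative helper name.

```python
import math

def wald_test(beta_hat, se):
    # Wald chi-square on 1 degree of freedom: (estimate / standard error) squared
    chi_sq = (beta_hat / se) ** 2
    # Upper tail of a 1-df chi-square equals the two-sided normal tail: erfc(sqrt(chi_sq / 2))
    p_value = math.erfc(math.sqrt(chi_sq / 2))
    return chi_sq, p_value

# Estimate and standard error for treat, taken from the slides' unadjusted PHREG output
chi_sq, p_value = wald_test(0.57276, 0.50960)
print(round(chi_sq, 4), round(p_value, 4))
```

The values agree with the Chi-Square and Pr > ChiSq columns of that output to within rounding.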
Hypothesis test for all coefficients
Null and alternative hypotheses
Ho: all $\beta_i = 0$, Ha: not all $\beta_i = 0$
Several test statistics, each supplied by SAS
Likelihood ratio, score, Wald
p-values are supplied by SAS
If p<0.05, “there is a statistically significant association
between the predictors and outcome at alpha = 0.05”
Example
Myelomatosis: tumors throughout the body composed of cells derived from hemopoietic (blood) tissues of the bone marrow.
N = 25
dur: time in days from the point of randomization to either death or censoring (which could occur either by loss to follow-up or termination of the observation).
status: 1 if dead; 0 if censored.
treat: 1 or 2, corresponding to the two treatments.
renal: 0 if renal functioning was normal at the time of randomization; 1 for impaired functioning.
The MYEL data set is taken from: Survival Analysis Using SAS: A Practical Guide by Paul D. Allison, page 269
SAS- PHREG
PROC PHREG DATA = myel;
MODEL dur*status(0) =treat;
RUN;
The dur*status(0) syntax is the same as in PROC LIFETEST
Fit proportional hazards model with time to death as outcome
“status(0)”: observations with status = 0 are censored
status = 1 means an event occurred
Look at effect of Treatment 2 vs. Treatment 1 on mortality.
PROC PHREG Output
Analysis of Maximum Likelihood Estimates

Variable  DF  Parameter Estimate  Standard Error  Chi-Square  Pr > ChiSq  Hazard Ratio
treat      1  0.57276             0.50960         1.2633      0.2610      1.773
77% increased risk of death for treatment 2 vs. treatment 1, but it is not statistically significant. Why?
Complications
Competing risks (high death rate): RENAL FUNCTION
Non-proportional hazards: time-dependent covariates (will show you later)
Extreme censoring in one group
SAS- PHREG
PROC PHREG DATA = myel;
MODEL dur*status(0) = renal treat;
RUN;
The dur*status(0) syntax is the same as in PROC LIFETEST
Look at effect of Treatment 2 vs. Treatment 1 on mortality
adjusted for renal functioning at baseline.
Output with adjusted treatment effect
Analysis of Maximum Likelihood Estimates

Variable  DF  Parameter Estimate  Standard Error  Chi-Square  Pr > ChiSq  Hazard Ratio
renal      1  4.10540             1.16451         12.4286     0.0004      60.667
treat      1  1.24308             0.59932          4.3021     0.0381       3.466