Event-Free Duration Analysis (Survival Analysis)

Survival Analysis
Bandit Thinkhamrop, PhD. (Statistics)
Department of Biostatistics and Demography
Faculty of Public Health, Khon Kaen University
Begin at the conclusion
Type of the study outcome: Key for selecting
appropriate statistical methods
• Study outcome
  • Dependent variable or response variable
  • Focus on the primary study outcome if there is more than one
• Type of the study outcome
  • Continuous
  • Categorical (dichotomous, polytomous, ordinal)
  • Numerical (Poisson) count
  • Event-free duration
The outcome determines the statistics
Outcome type   Summary statistic                         Regression model
Continuous     Mean, median                              Linear Reg.
Categorical    Proportion (prevalence or risk)           Logistic Reg.
Count          Rate per “space”                          Poisson Reg.
Survival       Median survival, risk of events at T(t)   Cox Reg.
Statistics quantify errors for judgments
• Parameter estimation [95% CI]
• Hypothesis testing [P-value]
Back to the conclusion
Appropriate statistical methods by type of study outcome:
• Continuous: mean, median
• Categorical: proportion (prevalence or risk)
• Count: rate per “space”
• Survival: median survival, risk of events at T(t)
Report the magnitude of effect with its 95% CI and P-value.
Answer the research question based on the lower or upper limit of the CI.
Study outcome
• Survival outcome = event-free duration
• Event (1=Yes; 0=Censor)
• Duration or length of time between:
• Start date ()
• End date ()
• At the start, no one has had the event (event = 0) at time t(0)
• At any point after the start, the event could occur; this is a failure (event = 1) at time t(t)
• At the end of the study period, if the event has not occurred, the observation is censored (event = 0)
• Thus, the duration is either ‘time-to-event’ or ‘time-to-censoring’
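As a minimal sketch of how a start date, an end date, and an event flag turn into the two columns used in survival analysis, the snippet below builds (duration, event) pairs. The dates and the helper name `to_survival_row` are hypothetical and serve only as an illustration; they are not taken from the slides.

```python
from datetime import date

# Hypothetical follow-up records: (start date, end date, event occurred?)
records = [
    (date(2009, 1, 15), date(2012, 12, 31), False),  # event-free at the end of the study
    (date(2010, 3, 1), date(2011, 5, 1), True),      # event observed on the end date
]

def to_survival_row(start, end, had_event):
    """Return (duration in days, event indicator) for one subject."""
    duration_days = (end - start).days
    event = 1 if had_event else 0  # 1 = event (failure), 0 = censored
    return duration_days, event

rows = [to_survival_row(*record) for record in records]
for duration, event in rows:
    kind = "time-to-event" if event == 1 else "time-to-censoring"
    print(duration, event, kind)
```

The same duration column holds both kinds of time; only the event indicator says which one it is.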
Censoring
• Censored data = incomplete ‘time to event’ data
• In the presence of censoring, the ‘time to event’ is not known
• The duration indicates that no event has occurred from the start date up to the last date assessed or observed, a.k.a. the end date.
• The end date could be
  • End of the study
  • Last observed date prior to the end of the study, due to
    • Loss to follow-up
    • Withdrawn consent
    • Competing events that prevent progression to the event under observation
    • Changes in explanatory variables that make them irrelevant to the occurrence of the event under observation
Magnitude of effects
• Median survival
• Survival probability
• Hazard ratio
SURVIVAL ANALYSIS
Study aims:
• Median survival
• Median survival of liver cancer
• Survival probability
• Five-year survival of liver cancer
• Five-year survival rate of liver cancer
• Hazard ratio
• Factors affecting liver cancer survival
• Effect of chemotherapy on liver cancer survival
SURVIVAL ANALYSIS
Event
• Negative: dead, infection, relapsed, etc.
• Positive: cured, improved, conception, discharged, etc.
• Neutral: smoking cessation, etc.
Natural History of Cancer
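Accrual, Follow-up, and Event
[Figure: timeline from 2009 to 2012 for six patients (ID 1 to 6), from the beginning to the end of the study. The recruitment period runs from the start of accrual to the end of accrual; the follow-up period runs to the end of follow-up. Two patients are marked as dead.]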
[Figure: follow-up of each patient by time since the beginning of the study (years 0 to 4)]
ID   Follow-up    Status   The data
1    48 months             >48
2    22 months             >22
3    14 months    Dead     14
4    40 months    Dead     40
5    26 months             >26
6    13 months             >13
DATA
ID   SURVIVAL TIME (Months)   OUTCOME AT THE END OF THE STUDY                  EVENT
1    48                       Still alive at the end of the study              Censored
2    22                       Dead due to accident                             Censored
3    14                       Dead caused by the disease under investigation   Dead
4    40                       Dead caused by the disease under investigation   Dead
5    26                       Still alive at the end of the study              Censored
6    13                       Lost to follow-up                                Censored
DATA
ID   TIME   EVENT      EVENT (1=Dead; 0=Censored)
1    48     Censored   0
2    22     Censored   0
3    14     Dead       1
4    40     Dead       1
5    26     Censored   0
6    13     Censored   0
ANALYSIS
ID   TIME   EVENT
1    48     0
2    22     0
3    14     1
4    40     1
5    26     0
6    13     0
Prevalence = 2/6
Incidence density = 2/163 person-months
Proportion of surviving at month ‘t’
Median survival time
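The first two quantities on this slide are simple arithmetic on the six-patient data set. A short sketch, using only the TIME and EVENT values from the table (the variable names are chosen here for illustration):

```python
# Six-patient data set from the slides: follow-up in months and event indicator
times = [48, 22, 14, 40, 26, 13]
events = [0, 0, 1, 1, 0, 0]  # 1 = dead, 0 = censored

n_events = sum(events)                 # number of deaths
proportion = n_events / len(events)    # 2/6, called "prevalence" on the slide
person_months = sum(times)             # total time at risk
incidence_density = n_events / person_months

print(f"Proportion with event = {n_events}/{len(events)} = {proportion:.2f}")
print(f"Incidence density = {n_events}/{person_months} person-months "
      f"= {100 * incidence_density:.1f} per 100 person-months")
```

The proportion ignores how long each patient was followed, while the incidence density uses the total follow-up time as its denominator.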
RESULTS
ID   TIME   EVENT
1    48     0
2    22     0
3    14     1
4    40     1
5    26     0
6    13     0
Incidence density = 1.2 per 100 person-months (95% CI: 0.1 to 4.4)
Proportion surviving at 24 months = 80% (95% CI: 20 to 97)
Median survival time = 40 months (95% CI: 14 to 48)
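The slides do not say how the confidence interval for the incidence density was obtained. One possibility, shown here purely as an assumption, is an exact Poisson interval for the 2 observed deaths divided by the 163 person-months. The helper names `poisson_cdf`, `bisect`, and `exact_poisson_ci` are made up for this sketch.

```python
import math

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu)."""
    return sum(math.exp(-mu) * mu**i / math.factorial(i) for i in range(k + 1))

def bisect(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi]; f(lo) and f(hi) must have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_poisson_ci(d, alpha=0.05):
    """Exact two-sided CI for a Poisson mean, given d observed events."""
    if d == 0:
        lower = 0.0
    else:
        # Lower limit solves P(X >= d | mu) = alpha / 2
        lower = bisect(lambda mu: 1 - poisson_cdf(d - 1, mu) - alpha / 2, 0.0, float(d))
    # Upper limit solves P(X <= d | mu) = alpha / 2
    upper = bisect(lambda mu: poisson_cdf(d, mu) - alpha / 2,
                   float(d), d + 10 * math.sqrt(d + 1) + 10)
    return lower, upper

deaths, person_months = 2, 163
low, high = exact_poisson_ci(deaths)
rate_ci_per_100 = (100 * low / person_months, 100 * high / person_months)
print(rate_ci_per_100)
```

Under this assumption the limits round to the 0.1 and 4.4 per 100 person-months reported on the slide; other interval methods would give somewhat different limits.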
Type of Censoring
1) Left censoring: when the patient experiences the event in question before the beginning of the study observation period.
2) Interval censoring: when the patient is followed for a while, then goes on a trip for a while, and then returns to continue being studied.
3) Right censoring:
   1) Single censoring: the patient does not experience the event during the study observation period.
   2) The patient is lost to follow-up within the study period.
   3) The patient experiences the event after the observation period.
   4) Multiple censoring: the patient may experience the event multiple times after study observation ends, when the event in question is not death.
Summary description of survival data set
stdes
• This command describes summary
information about the data set. It provides
summary statistics about the number of
subjects, records, time at risk, failure
events, etc.
Computation of S(t)
1) Suppose the study time is divided into periods, the
number of which is designated by the letter, t.
2) The survivorship probability is computed by
multiplying a proportion of people surviving for each
period of the study.
3) If we subtract the conditional probability of the
failure event for each period from one, we obtain that
quantity.
4) The product of these quantities constitutes the
survivorship function.
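As a worked illustration of these four steps, consider the six-patient data set shown earlier (deaths at months 14 and 40; censoring at months 13, 22, 26, and 48). Writing $d_i$ for the deaths and $n_i$ for the number still at risk at the $i$-th death time (symbols introduced here for the illustration), the survivorship function is the product of one minus the conditional failure probabilities:

```latex
S(t) = \prod_{t_{(i)} \le t} \left(1 - \frac{d_i}{n_i}\right)

% Month 14: 5 patients still at risk (one was censored at month 13), 1 death
S(14) = 1 - \frac{1}{5} = 0.80

% Month 40: 2 patients still at risk, 1 death
S(40) = \left(1 - \frac{1}{5}\right)\left(1 - \frac{1}{2}\right) = 0.40
```

Because no death occurs between months 14 and 40, S(24) = 0.80, which matches the 80% surviving at 24 months reported in the results. The curve first drops to 0.5 or below at month 40, which matches the reported median survival time.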
Kaplan-Meier Methods
Kaplan-Meier survival curve
Median survival time
Survival Function
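• The number in the risk set is used as the denominator.
• For the numerator, the number dying in period t is subtracted from the number in the risk set.
• The product of these ratios over the study time gives the survival function:
$$S(t) = \prod_{t_{(i)} \le t} \frac{n_i - d_i}{n_i}$$
where n_i is the number in the risk set and d_i is the number dying at time t_(i).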
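The product of ratios described above can be sketched in a few lines of plain Python. This is an illustrative implementation and not the routine used by any statistical package; the function names are chosen here, and the data are the six patients from the earlier slides.

```python
def kaplan_meier(times, events):
    """Return [(t, n_at_risk, deaths, S(t))] for each distinct event time."""
    surv = 1.0
    table = []
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        # Risk set: everyone whose follow-up lasts at least until t
        n_at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        surv *= (n_at_risk - deaths) / n_at_risk
        table.append((t, n_at_risk, deaths, surv))
    return table

def survival_at(table, t):
    """S(t): the estimate at the last event time not exceeding t."""
    s = 1.0
    for time, _, _, surv in table:
        if time <= t:
            s = surv
    return s

def median_survival(table):
    """First event time at which S(t) falls to 0.5 or below (None if never)."""
    for time, _, _, surv in table:
        if surv <= 0.5:
            return time
    return None

times = [48, 22, 14, 40, 26, 13]
events = [0, 0, 1, 1, 0, 0]
km_table = kaplan_meier(times, events)
print(km_table)
print(survival_at(km_table, 24), median_survival(km_table))
```

Censored patients never add a death to the numerator, but they stay in the risk set until their censoring time, which is how partial follow-up still contributes to the estimate.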
Survival experience
Survival curve more than one group
Comparing survival between groups
ID   TIME   DEAD   DRUG
1    48     0      1
2    22     0      1
3    14     1      1
4    40     1      1
5    26     0      1
6    13     0      1
7    13     0      0
8    6      1      0
9    12     1      0
10   14     1      0
11   22     1      0
12   13     1      0
Kaplan-Meier curve
[Figure: Kaplan-Meier survival estimates, by drug (drug 1 and drug 0). X-axis: analysis time, 0 to 60. Y-axis: survival probability, 0.00 to 1.00.]
Log-rank test
$$\chi^2 = \sum_{i=1}^{\#\text{ of groups}} \frac{(O_i - E_i)^2}{E_i}$$
• t  = Time
• n  = Number at risk for both groups at time t
• n1 = Number at risk for group 1 at time t
• n2 = Number at risk for group 2 at time t
• d  = Dead for both groups at time t
• c  = Censored for both groups at time t
• O1 = Number of dead for group 1 at time t
• O2 = Number of dead for group 2 at time t
• E1 = Number of expected dead for group 1 at time t
• E2 = Number of expected dead for group 2 at time t
Log-rank test example
• DRUG1 = 48+, 22+, 26+, 13+,14,40
• DRUG0 = 13+, 6, 12, 14, 22, 13
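The example can be worked through with a short sketch, where a trailing "+" in the lists above marks a censored time. The code accumulates observed and expected deaths per group at each death time and then applies the sum of (O - E)²/E form of the statistic given on the log-rank slide. Statistical packages usually report a variance-based version of the log-rank statistic, so their value can differ somewhat from this one. The function name `logrank` is chosen here for illustration.

```python
import math

def logrank(times, events, groups):
    """Observed and expected deaths per group, plus sum((O - E)^2 / E)."""
    labels = sorted(set(groups))
    observed = {g: 0.0 for g in labels}
    expected = {g: 0.0 for g in labels}
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        n = sum(1 for x in times if x >= t)  # at risk, all groups
        d = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        for g in labels:
            n_g = sum(1 for x, grp in zip(times, groups) if x >= t and grp == g)
            d_g = sum(1 for x, e, grp in zip(times, events, groups)
                      if x == t and e == 1 and grp == g)
            observed[g] += d_g
            expected[g] += d * n_g / n  # deaths shared out by size of risk set
    chi2 = sum((observed[g] - expected[g]) ** 2 / expected[g] for g in labels)
    return observed, expected, chi2

# Twelve patients from the comparison table (IDs 1 to 12)
times = [48, 22, 14, 40, 26, 13, 13, 6, 12, 14, 22, 13]
dead = [0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1]
drug = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

observed, expected, chi2 = logrank(times, dead, drug)
# P-value for a chi-square with 1 degree of freedom (valid for two groups only)
p_value = math.erfc(math.sqrt(chi2 / 2))
print(observed, expected, round(chi2, 2), round(p_value, 3))
```

Drug 1 has fewer deaths than expected under the null hypothesis (2 observed against about 4.9 expected), and drug 0 has more (5 against about 2.1).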
Hazard Function
$$h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}$$
Survival Function vs Hazard Function
H(t) = -ln(S(t))
S(t) = EXP(-H(t))
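A small numeric illustration of this pair of identities, using the 24-month survival probability of 0.80 reported earlier for the six-patient example (the function names are chosen here for illustration):

```python
import math

def cumulative_hazard(survival):
    """H(t) = -ln(S(t))."""
    return -math.log(survival)

def survival_from_hazard(cum_hazard):
    """S(t) = exp(-H(t))."""
    return math.exp(-cum_hazard)

H_24 = cumulative_hazard(0.80)       # about 0.223
S_24 = survival_from_hazard(H_24)    # recovers 0.80
print(round(H_24, 3), round(S_24, 2))
```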
Hazard rate
• The conditional probability of the event under study, provided the patient has survived up to and including that time period
• Sometimes called the intensity function, the
failure rate, the instantaneous failure rate
Formulation of the hazard rate
$$h(t) = \lim_{\Delta t \to 0} \frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{S(t)}$$
The hazard rate can vary from 0 to infinity. It can increase, decrease, or remain constant over time, and it is often the focal point of survival analysis.
Cox Regression
• The Cox model presumes that the ratio of the
hazard rate to a baseline hazard rate is an
exponential function of the parameter vector.
h(t) = h0(t) × EXP(b1X1 + b2X2 + b3X3 + . . . + bpXp)
Hazard ratio
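To see where the hazard ratio comes from, take the Cox model above with a binary covariate X1 (for example, 1 = treated, 0 = untreated; this coding is an assumption made for the illustration) and hold the other covariates fixed. The baseline hazard cancels in the ratio:

```latex
\mathrm{HR}
  = \frac{h(t \mid X_1 = 1)}{h(t \mid X_1 = 0)}
  = \frac{h_0(t)\,\exp(b_1 + b_2 X_2 + \dots + b_p X_p)}
         {h_0(t)\,\exp(b_2 X_2 + \dots + b_p X_p)}
  = \exp(b_1)
```

So a coefficient b1 = 0 corresponds to HR = 1 (no effect), b1 < 0 to a lower hazard in the group with X1 = 1, and b1 > 0 to a higher hazard. The ratio does not depend on t, which is the proportional hazards assumption.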
Testing the Adequacy of the
model
1. We save the Schoenfeld residuals of the
model and the scaled Schoenfeld
residuals.
2. For persons censored, the value of the
residual is set to missing.
borrowed from Professor Robert A. Yaffee
A graphical test of the proportional
hazards assumption
• A graph of the log hazard would show two lines over time, one for the baseline hazard (when x = 0) and the other for when x = 1
• The difference between these two curves should be constant over time (= B)
If we plot the Schoenfeld residuals over the line y = 0, the best-fitting line should be parallel to y = 0.
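The constant gap between the two log-hazard curves follows directly from the model. Assuming a single binary covariate x with coefficient B, so that h(t | x) = h0(t) exp(Bx):

```latex
\ln h(t \mid x = 1) - \ln h(t \mid x = 0)
  = \bigl[\ln h_0(t) + B\bigr] - \ln h_0(t)
  = B
```

The difference does not involve t, so under proportional hazards the two curves are parallel on the log scale. A gap that widens or narrows over time suggests that the assumption does not hold.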
Graphical tests
• Criteria of adequacy:
The residuals, particularly the rescaled residuals, plotted against time should show no trend (slope) and should be more or less constant over time.
Other issues
• Time-Varying Covariates
• Interactions may be plotted
• Conditional Proportional Hazards models:
• Stratification of the model may be performed. Then the stphtest should be performed for each stratum.
Suggested Readings for beginners
Suggested Readings for advanced learners
Survival analysis in practice
• For what type of research question should survival analysis be used?
Stata for one-group survival analysis
• stset time, failure(event)
• stdescribe
• tab event
• stsum
• strate
• stci
• sts list, at(12 24)
Stata for one-group survival analysis (cont.)
• sts g
• sts g, atrisk
• sts g, lost
• sts g, enter
• sts g, risktable
• sts g, cumhaz
• sts g, cumhaz ci
• sts g, hazard
Stata for multiple-group survival analysis
• stset time, failure(event)
• stdescribe
• stsum, by(group)
• sts test group
• sts test group, wilcoxon
• strate group
• stci, by(group)
• sts g, by(group) atrisk
• sts g, by(group) risktable
• sts g, by(group) cumhaz lost
• sts g, by(group) hazard ci
Stata for multiple-group survival analysis (cont.)
• sts list, by(group) at(12 24)
• sts list, by(group) at(12 24) compare
• ltable group, interval(#)
• ltable group, graph
• ltable group, hazard
• stmh group
• stmh group, by(strata)
• stmc group
• stcox group
• stir group
Stata for Model Fitting
• Continuous covariate
  • xtile newvar = varlist, nq(4)
  • tabstat varlist, stat(n min max) by(newvar)
  • xi: stcox i.newvar
  • stsum, by(newvar)
• Categorical covariate
  • tab exposure outcome, col
  • xi: stcox i.exposure
Sample size for Cox Model
• stpower cox, failprob(.2) hratio(0.1 0.3) sd(.3) r2(.1) power(0.8 0.9) hr
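To give a feel for what the options in this command control, the sketch below uses a widely cited approximation for the number of events needed to test one covariate in a Cox model: D = (z for 1 - alpha/2 plus z for the power)² / (sd² × ln(HR)² × (1 - R²)), with the sample size taken as D divided by the overall probability of failure. Treating this as the calculation behind the command is an assumption of this sketch, and the function name, the two-sided alpha of 0.05, and the rounding are choices made here, so the figures may not match the software output exactly.

```python
import math
from statistics import NormalDist

def cox_required_events(hratio, sd, r2=0.0, power=0.8, alpha=0.05):
    """Approximate number of events (unrounded) to detect a hazard ratio."""
    z = NormalDist().inv_cdf
    b = math.log(hratio)  # Cox coefficient = ln(hazard ratio)
    return (z(1 - alpha / 2) + z(power)) ** 2 / (sd ** 2 * b ** 2 * (1 - r2))

fail_prob = 0.2  # overall probability of an event, as in failprob(.2)
results = {}
for hr in (0.1, 0.3):
    for pw in (0.8, 0.9):
        d = cox_required_events(hr, sd=0.3, r2=0.1, power=pw)
        results[(hr, pw)] = (math.ceil(d), math.ceil(d / fail_prob))

for (hr, pw), (n_events, n_subjects) in results.items():
    print(f"HR={hr}, power={pw}: events={n_events}, subjects={n_subjects}")
```

A hazard ratio closer to 1, a smaller standard deviation of the covariate, or a higher R² with the other covariates all increase the number of events required.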