hazard function - People Server at UNCW


Survival Data
• survival time examples:
– time a cancer patient is in remission
– time until a disease-free person has a heart attack
– time until death of a healthy mouse
– time until a computer component fails
– time until a paroled prisoner gets rearrested
– time until death of a liver transplant patient
– time until a cell phone customer switches carrier
– time until recovery after surgery
• all are "time until some event occurs"; longer
times are better in all but the last…
Three goals of survival analysis
• estimate the survival function
• compare survival functions (e.g., across
levels of a categorical variable such as treatment vs. placebo)
• understand the relationship of the
survival function to explanatory
variables (e.g., is survival time different
for various values of an explanatory
variable?)
• The survival function S(y)=P(Y>y) can be
estimated by the empirical survival
function, which essentially gets the relative
frequency of the number of Y’s > y…
• Look at Definition 1.3 on p.5:
$Y_1, \dots, Y_n$ are i.i.d. (independent and
identically distributed) survival variables.
Then $S_n(y)$ = empirical survival function at y
= (# of the $Y_i$ > y)/n = estimate of S(y).
• Note that $S_n(y) = \frac{1}{n}\sum_{i=1}^{n} I_A(Y_i)$, where $A = (y, \infty)$
and $I_A$ is the indicator function of A…
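A minimal Python sketch of the empirical survival function in Definition 1.3. The function name `empirical_survival` and the sample `times` are made up for illustration and do not come from the text. Each term counted in the sum plays the role of the indicator of the interval (y, ∞).

```python
def empirical_survival(sample, y):
    """Return S_n(y): the fraction of observations strictly greater than y."""
    n = len(sample)
    # Each counted term is the indicator of the event {Y_i > y}.
    return sum(1 for obs in sample if obs > y) / n

# Hypothetical survival times (illustrative only, not from the textbook).
times = [2.0, 3.5, 3.5, 5.0, 7.2, 9.1, 9.1, 12.0]

s_at_0 = empirical_survival(times, 0.0)    # all 8 observations exceed 0
s_at_5 = empirical_survival(times, 5.0)    # 7.2, 9.1, 9.1, 12.0 exceed 5
s_at_12 = empirical_survival(times, 12.0)  # no observation exceeds 12
print(s_at_0, s_at_5, s_at_12)
```

Note the strict inequality: an observation equal to y is not counted, which matches S(y)=P(Y>y).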
• Review of Bernoulli & Binomial RVs:
– Show that the expected value of a
Bernoulli rv Z with parameter p (i.e.,
P(Z=1)=p) is p and that the variance of Z is
p(1-p)
– Then knowing that the sum of n iid
(independent and identically distributed)
Bernoullis is a Binomial rv with parameters
n and p, show on the next slide that the
empirical survivor function Sn(y) is an
unbiased estimator of S(y)
• Note that $nS_n(y) = \sum_{i=1}^{n} I_{(y,\infty)}(Y_i)$ = sum of iid Bernoullis,
and as such $nS_n(y)$ has a B(n,p) distribution, where
p=P(Y>y)=S(y).
• Also note that for a fixed $y^*$,
$E(nS_n(y^*)) = nS(y^*)$, so $nE(S_n(y^*)) = nS(y^*)$,
so $S_n$ is unbiased as an estimator of S
• What is the Var(Sn)? (see 1.6 and on
p.6 where the confidence interval is
computed…)
• Try this for Example 1.3, p.6
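The unbiasedness claim and the question about the variance can be checked by simulation. The sketch below is in Python and rests on assumptions that are not in the slides: exponential survival times with mean 2, a fixed point y* = 1.5, samples of size n = 50, 5000 replications, and a fixed random seed. Because n·Sn(y*) is B(n,p), the variance of Sn(y*) should come out close to p(1-p)/n.

```python
import math
import random

random.seed(12345)   # fixed seed so the run is reproducible

beta = 2.0           # assumed mean of the exponential survival times
y_star = 1.5         # assumed fixed point y* at which S is estimated
n = 50               # assumed sample size per replication
reps = 5000          # assumed number of simulated samples

p = math.exp(-y_star / beta)   # true S(y*) = P(Y > y*)

estimates = []
for _ in range(reps):
    sample = [random.expovariate(1.0 / beta) for _ in range(n)]
    # Empirical survival function of this sample, evaluated at y*.
    estimates.append(sum(1 for obs in sample if obs > y_star) / n)

mean_est = sum(estimates) / reps
var_est = sum((e - mean_est) ** 2 for e in estimates) / (reps - 1)
var_theory = p * (1.0 - p) / n   # variance of a B(n,p) count divided by n^2

print("true S(y*)            :", round(p, 4))
print("mean of S_n(y*)       :", round(mean_est, 4))
print("p(1-p)/n              :", round(var_theory, 5))
print("sample var of S_n(y*) :", round(var_est, 5))
```

The simulated mean and variance will differ slightly from the theoretical values because of Monte Carlo error.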
• Example 1.4 on page 8 shows that it is
sometimes difficult to compare survival
curves since they can cross each
other… (what makes one survival curve
“better” than another?)
• One way of comparing two survival
curves is by comparing their MTTF
(mean time to failure) values. Let’s try to
use R to draw the two curves given in
Ex. 1.4: $S_1(y)=\exp(-y/2)$ and
$S_2(y)=\exp(-y^2/4)$… see the handout
R#1.
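The crossing of the two curves in Ex. 1.4 can also be seen numerically. This is a Python sketch, not the R handout: the trapezoid-rule helper `integrate` and the truncation of the integral at y = 60 are assumptions made here for illustration. The exponents agree when y/2 = y²/4, so the curves meet at y = 2; S2 is higher before that point and S1 is higher after it.

```python
import math

def s1(y):
    """First curve of Ex. 1.4: exp(-y/2)."""
    return math.exp(-y / 2.0)

def s2(y):
    """Second curve of Ex. 1.4: exp(-y^2/4)."""
    return math.exp(-(y ** 2) / 4.0)

def integrate(func, lower, upper, steps=20000):
    """Composite trapezoid rule on [lower, upper]."""
    width = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for k in range(1, steps):
        total += func(lower + k * width)
    return total * width

# The MTTF is the area under the survival curve; both tails past 60 are negligible.
mttf1 = integrate(s1, 0.0, 60.0)
mttf2 = integrate(s2, 0.0, 60.0)

for y in (1.0, 2.0, 3.0):
    print(y, round(s1(y), 4), round(s2(y), 4))
print("MTTF1:", round(mttf1, 4), " MTTF2:", round(mttf2, 4))
```

By the MTTF criterion the first curve comes out ahead (2 versus √π ≈ 1.77), even though the second curve is higher for y < 2.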
• Note that the MTTF of a survival rv Y is just its
expected value E(Y). We can also show (Theorem
1.2) that
$\mathrm{MTTF} = \int_{0}^{\infty} S(y)\,dy$
(Math & Stat majors: Show this is true using
integration by parts and l’Hospital’s rule…!)
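A sketch of the integration-by-parts argument, under the assumption that Y is continuous with density f and has a finite mean (the finite mean is what makes the boundary term vanish):

```latex
\begin{align*}
E(Y) &= \int_{0}^{\infty} y\, f(y)\,dy
      = \int_{0}^{\infty} y\,\bigl(-S'(y)\bigr)\,dy \\
     &= \bigl[-y\,S(y)\bigr]_{0}^{\infty} + \int_{0}^{\infty} S(y)\,dy \\
     &= \int_{0}^{\infty} S(y)\,dy .
\end{align*}
The boundary term is $0$ at $y = 0$, and as $y \to \infty$
\[
0 \le y\,S(y) = y \int_{y}^{\infty} f(u)\,du
  \le \int_{y}^{\infty} u\, f(u)\,du \longrightarrow 0 .
\]
```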
• So suppose we have an exponential survival
function:
$S(y) = \exp(-y/\beta)$
(btw, can you show this satisfies the properties of a
survival function?)
• Then the MTTF for this variable is $\beta$; show this…
• And for any two such survival functions,
$S_1(y)=\exp(-y/\beta_1)$ and $S_2(y)=\exp(-y/\beta_2)$,
one is “better” than the other if the
corresponding beta is “better”…
$S_1(y) \ge S_2(y)$ for all y iff $\beta_1 \ge \beta_2$ iff $\mathrm{MTTF}_1 \ge \mathrm{MTTF}_2$
• HW: Use R to plot on the same axes at
least two such survival functions with
different values of beta and show this
result.
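Before plotting, the result can be checked numerically. This is a Python sketch rather than the R solution: the two beta values, the grid of y values, and the helper name `mttf_numeric` are hypothetical choices made here. It integrates each exponential survival function to recover its MTTF and checks on the grid that the curve with the larger beta is never below the other.

```python
import math

def exp_survival(y, beta):
    """S(y) = exp(-y / beta) for y >= 0."""
    return math.exp(-y / beta)

def mttf_numeric(beta, steps=20000):
    """Trapezoid-rule integral of S(y) from 0 to 40*beta (the tail beyond is negligible)."""
    upper = 40.0 * beta
    width = upper / steps
    total = 0.5 * (exp_survival(0.0, beta) + exp_survival(upper, beta))
    for k in range(1, steps):
        total += exp_survival(k * width, beta)
    return total * width

beta_small, beta_large = 1.0, 3.0   # hypothetical values chosen for illustration

mttf_small = mttf_numeric(beta_small)
mttf_large = mttf_numeric(beta_large)

# Check S_large(y) >= S_small(y) at every grid point in [0, 20].
grid = [0.1 * k for k in range(201)]
dominates = all(
    exp_survival(y, beta_large) >= exp_survival(y, beta_small) for y in grid
)

print(round(mttf_small, 4), round(mttf_large, 4), dominates)
```

A grid check is only evidence, not a proof; the proof follows from -y/β being increasing in β for y > 0.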
The hazard function
• The hazard function gives the so-called
“instantaneous” risk of death (or failure) at
time t. Recall that for continuous rvs, the
probability of occurrence at time t is 0 for all
t. So we think about the probability in a
“small” interval around t, given that we’ve
survived to t, and then let the small interval
go to zero (in the limit). The result is given
on page 9 as the hazard rate or hazard
function…
• Definition of hazard function:
$h(y) = \lim_{\Delta y \to 0} \frac{P(y \le Y < y + \Delta y \mid Y \ge y)}{\Delta y}$
• notes

– the hazard function is conditional on the
individual having already survived to time y
– the numerator is a non-decreasing function
of $\Delta y$ (it is more likely that the event will occur in a
longer interval) so we divide by the length
of the interval to compensate
– we take the limit as the length of the
interval gets smaller to get the risk at
exactly y - “instantaneous risk”
– we can show (see p.9) that the hazard
function is equal to
$h(y) = \frac{f(y)}{S(y)}$
– use $f(y) = -\frac{d}{dy}S(y)$ and the above to show that
$S(y) = \exp\left(-\int_{0}^{y} h(u)\,du\right)$
– so all three of f, h, and S are
representations that can be found from the
others and are used in various situations…
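The links among f, h, and S can be traced numerically. The Python sketch below assumes the survival function S(y) = exp(-y²/4) from Ex. 1.4, for which the hazard works out to y/2. The finite-difference step and the trapezoid rule are illustrative choices made here. It builds f from S, h from f and S, and then recovers S from the integrated hazard.

```python
import math

def survival(y):
    """Assumed survival function: S(y) = exp(-y^2/4)."""
    return math.exp(-(y ** 2) / 4.0)

def density(y, step=1e-6):
    """f(y) = -dS/dy, approximated by a central difference."""
    return -(survival(y + step) - survival(y - step)) / (2.0 * step)

def hazard(y):
    """h(y) = f(y) / S(y)."""
    return density(y) / survival(y)

def survival_from_hazard(y, steps=2000):
    """S(y) = exp(-integral of h from 0 to y), using the trapezoid rule."""
    width = y / steps
    total = 0.5 * (hazard(0.0) + hazard(y))
    for k in range(1, steps):
        total += hazard(k * width)
    return math.exp(-total * width)

h_at_3 = hazard(3.0)                  # close to 3/2, so the hazard exceeds 1 here
s_direct = survival(3.0)
s_rebuilt = survival_from_hazard(3.0)
print(round(h_at_3, 6), round(s_direct, 6), round(s_rebuilt, 6))
```

The hazard at y = 3 is about 1.5, a reminder that a hazard is a rate and not a probability.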

• more notes on the hazard function:
– hazard is in the form of a rate: it is not a
probability because it can be >1, but it
cannot be negative; so the graph of h(y) does
not have to look at all like that of a survivor
function
– in order to understand the hazard function, it
must be estimated.
– think of the hazard h(y) as the instantaneous
risk the event will occur per unit time, given
that the event has not occurred up to time y.
– note that since h(y)=f(y)/S(y), for a given y and a given
value of f(y), a larger S(y) corresponds
to a smaller h(y) and vice versa…
• life expectancy at age t:
– if Y=survival time and we know that Y>t, then
Y-t=residual lifetime at age t and the mean
residual lifetime at age t is the conditional
expectation E(Y-t|Y>t) = r(t)
– it can be shown that $r(t) = \frac{\int_{t}^{\infty} S(y)\,dy}{S(t)}$
– note that when t=0, r(0)=MTTF
– we define the mean life expectancy at age t as
E(Y|Y>t) = t +r(t)

– go over Example 1.6 on page 11…
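The mean residual lifetime formula can be tried on the two survival functions of Ex. 1.4. This Python sketch is not Example 1.6: the helper name `mean_residual_life`, the truncation of the integral at y = 60, and the choice of ages t = 0 and t = 3 are assumptions made here. For the exponential curve, r(t) should stay at 2 for every t; for the other curve, r(t) shrinks as t grows.

```python
import math

def s1(y):
    """Exponential survival function with mean 2."""
    return math.exp(-y / 2.0)

def s2(y):
    """Survival function exp(-y^2/4)."""
    return math.exp(-(y ** 2) / 4.0)

def mean_residual_life(survival, t, upper=60.0, steps=20000):
    """r(t) = (integral of S from t to infinity) / S(t), truncated at `upper`."""
    width = (upper - t) / steps
    total = 0.5 * (survival(t) + survival(upper))
    for k in range(1, steps):
        total += survival(t + k * width)
    return total * width / survival(t)

r1_0 = mean_residual_life(s1, 0.0)   # equals the MTTF of s1
r1_3 = mean_residual_life(s1, 3.0)   # unchanged for the exponential curve
r2_0 = mean_residual_life(s2, 0.0)   # equals the MTTF of s2
r2_3 = mean_residual_life(s2, 3.0)   # smaller than r2_0

life_exp_3 = 3.0 + r2_3              # mean life expectancy at age 3: t + r(t)
print(round(r1_0, 4), round(r1_3, 4), round(r2_0, 4), round(r2_3, 4))
print(round(life_exp_3, 4))
```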