Describing Continuous-Time Event Occurrence Data


Describing continuous time event occurrence data
ALDA, Chapter Thirteen
“Time has no divisions to mark its passing”
Thomas Mann
John B. Willett & Judith D. Singer
Harvard Graduate School of Education
Chapter 13: Describing continuous-time event occurrence data
Salient features of continuous-time event data (§13.1)—what does it
mean to collect data in “continuous time” and how is this different from
discrete-time?
Redefining the survivor and hazard functions and strategies for
estimation (§13.1.2 and 13.1.3)—it’s not as straightforward as you might
think (or like)—especially for the hazard function
The cumulative hazard function (§13.4)—a relative of the hazard
function that is easy to estimate in continuous time and whose behavior
gives us insight into hazard
Developing your intuition about survivor, cumulative hazard and kernel-smoothed
hazard functions (§13.5)—it takes practice, but knowing what
you now know about discrete-time survival analysis you can learn to
interpret the behavior of the sample estimates of these continuous-time
functions—essential knowledge for model building.
What happens when we record event occurrence in continuous time?
We know the precise instant when events occur—
e.g., Jane took her first drink at 6:19 after release from an alcohol
treatment program
There exists an infinite number of these instants.
Any division of continuous time—weeks, days,
hours, etc—can always be made finer
(in contrast to the finite—and usually small—number of values for
TIME in discrete-time)
The probability of observing any particular
event time is infinitesimally small
(it approaches 0 as time’s divisions get finer).
• This has serious implications for the definition of
hazard, the lynchpin of survival analysis.
• Continuous-time hazard must be defined
differently, and it is difficult to estimate and
display it in data analysis
(ALDA, Section 13.1.1, pp 469-471)
The probability of ties—two or more
individuals sharing an event time—
is therefore infinitesimally small
• Continuous-time survival methods assume no ties.
When they exist—and they inevitably do—they can cause
difficulties.
• Why are ties inevitable in continuous time?
Continuous-time data are really not continuous. Because
they are collected to the nearest unit (year, month, week,
etc), they are really “rounded.” The theory of
continuous-time survival analysis, however, is developed
assuming that the probability of ties → 0.
What do continuous time data look like?
Data source: Diekmann & colleagues
(1996), Journal of Social Psychology
Sample: 57 motorists in Munich
Germany (purposefully) blocked at a
green light by a Volkswagen Jetta
Research design: Tracked from light
change until horn honk
n=43 (75.4%) honked their horns
before the light turned red; the rest are
censored
Event time recorded to the nearest 100th
of a second!
(even so, the data contain only one tie)
(ALDA, Section 13.1.1, pp 471-472)
Notation for continuous-time
event data
T is a continuous random variable
representing event time
Ti indicates the event time for
individual i
CENSORi indicates whether Ti is
censored
tj clocks the infinite number of
instants when the event could
occur
(“a few patient people”: the censored drivers who never honked)
Defining continuous time survivor and hazard functions
The survivor function
The hazard function
The survival probability for individual i
at time tj is the probability that his or
her event time, Ti will exceed tj
Hazard assesses the risk—at a particular
moment—that an individual who has not
yet done so will experience the event
S(t_ij) = Pr[T_i > t_j]
(Note: this definition is essentially identical to that in discrete time)
Hazard can’t be defined as a (conditional) probability, because that probability → 0.
Instead, divide time into an infinite number of vanishingly small intervals: [t_j, t_j + Δt)
Tips for interpreting
continuous-time hazard
• It is not a probability—it is a rate
per unit of time
• You must be explicit about the
unit of time—60 mph, 60K/yr
• Unlike probabilities, rates can
exceed 1 (has implications for
modeling—instead of modeling logit
hazard we model log hazard)
• Intuition using repeatable events—
estimate the number of events in a finite
period (e.g., if monthly hazard =.10,
annual hazard = 1.2)
(ALDA, Sections 13.1.2 & 13.1.3, pp 472-475)
(the interval includes t_j but excludes t_j + Δt)
Compute the probability that T_i falls in this interval as Δt → 0 and divide by the interval width to derive hazard per unit time:
h(t_ij) = limit as Δt → 0 of { Pr[T_i is in the interval [t_j, t_j + Δt) | T_i ≥ t_j] / Δt }
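Before turning to estimation, the limit definition can be sanity-checked numerically. The sketch below (my own illustration, not from ALDA) approximates the conditional probability over a tiny interval and divides by the interval width; for an exponential survivor function with rate 0.5, the recovered hazard is constant at 0.5 at any time point:

```python
import math

def hazard_from_survivor(surv, t, dt=1e-6):
    # h(t) ≈ Pr[t <= T < t + dt | T >= t] / dt
    #      = [S(t) - S(t + dt)] / (dt * S(t))
    return (surv(t) - surv(t + dt)) / (dt * surv(t))

# Exponential survivor function with an illustrative rate of 0.5 per unit time;
# its true hazard is the constant 0.5, whatever the time point.
S = lambda t: math.exp(-0.5 * t)
print(hazard_from_survivor(S, 2.0))  # ≈ 0.5
```

Note that the result is a rate per unit of time, not a probability, which is why it does not shrink toward 0 as dt does.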
Grouped methods for estimating the survivor and hazard functions
Discrete-time method
Group the event times
into intervals and use
discrete-time methods
Median=3.92 secs
Actuarial/Life-table method
• Adapt the estimates by assuming
that times are distributed randomly
throughout the interval
• Plot as step function to reinforce
the association with the entire
interval
[Plot annotations: an initial grace period; risk of honking then increases and peaks; eventual decline]
(ALDA, Section 13.2, pp 475-483)
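The actuarial/life-table adjustment above can be sketched as follows (a minimal illustration with hypothetical counts and my own function names; the halving of censored cases reflects the assumption that censoring is spread evenly across each interval):

```python
def life_table(events, censored, n_start):
    """Actuarial (life-table) survivor estimates, one per interval.

    events[j], censored[j]: counts in interval j. Censored cases are
    assumed at risk for half the interval, so the effective risk set
    is n_j - c_j / 2."""
    n, S, out = n_start, 1.0, []
    for d, c in zip(events, censored):
        q = d / (n - c / 2.0)   # conditional probability of the event
        S *= 1 - q              # cumulative survival, as in discrete time
        out.append(S)
        n -= d + c              # survivors carried to the next interval
    return out

# Hypothetical counts for three intervals, starting risk set of 10:
print(life_table([2, 1, 1], [2, 0, 1], 10))  # ≈ [0.778, 0.648, 0.504]
```

Plotting these values as a step function reinforces that each estimate belongs to an entire interval, not to a single instant.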
But why categorize happy, healthy continuous data?
Kaplan-Meier estimates of the survivor function
Key idea: Use observed event times to construct
intervals so that each contains just one observed
event time; then use standard discrete-time
methods.
• Since the first 3 observed event times are 1.41, 1.51
and 1.67, construct two intervals: [1.41, 1.51), [1.51,
1.67)
• By convention, construct an initial interval [0, 1.41)
• Continue through all the observed event times
Estimated median lifetime=3.5769 seconds
Kaplan-Meier estimate of the survivor function:
Conditional probability of event occurrence: p̂(t_j) = n events_j / n at risk_j
(note how erratic these are, especially as the risk set declines in later intervals)
Ŝ(t_j) = [1 − p̂(t_1)] [1 − p̂(t_2)] … [1 − p̂(t_j)]
(note how smooth these estimates are)
(ALDA, Section 13.3, pp 483-491)
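The interval-per-event-time logic can be sketched in a few lines (a minimal illustration with my own function names, not ALDA's software; the toy data reuse the first honking times mentioned above plus made-up values):

```python
def kaplan_meier(times, censored):
    """Kaplan-Meier survivor estimates as (event time, S-hat) pairs.

    times: observed times; censored: True if that time is censored.
    Each observed event time defines its own interval, per the KM idea."""
    data = sorted(zip(times, censored))
    n = len(data)
    s, out, i = 1.0, [], 0
    while i < n:
        t = data[i][0]
        at_risk = n - i                                    # still in the risk set
        d = sum(1 for tt, c in data if tt == t and not c)  # events at t
        m = sum(1 for tt, c in data if tt == t)            # all observations at t
        if d > 0:
            s *= 1 - d / at_risk                           # (1 - p-hat) factor
            out.append((t, s))
        i += m                                             # drop everyone tied at t
    return out

# First few honking times, plus one made-up censored case and one later event:
print(kaplan_meier([1.41, 1.51, 1.67, 2.0], [False, True, False, False]))
# → [(1.41, 0.75), (1.67, 0.375), (2.0, 0.0)]
```

Notice that the censored observation at 1.51 produces no drop in Ŝ(t); it simply shrinks the risk set for later intervals.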
Kaplan-Meier estimates: Displays, comparisons, pros and cons
Pros of KM approach
• Uses all observed information on event times
without categorization
• If event occurrence is recorded using a truly
continuous metric, the estimated survivor
function appears almost ‘continuous.’
• Estimates are as refined as the fine-ness of
data collection—certainly finer than DT and
Actuarial/Life Table approaches
[Figure: S(t_j) vs. seconds after light turns green (0-20), comparing Kaplan-Meier, discrete-time, and actuarial estimates]
Drawbacks of KM approach
• When examining plots for subgroups, the “drops” will occur in different places, making visual comparison trickier
• No corresponding estimate of hazard. You can compute
ĥ_KM(t_j) = p̂_KM(t_j) / width_j
but the estimates are generally too erratic to be of much direct use (although we will end up using them for another purpose).
(ALDA, Section 13.3, p 483-491)
Without knowing the value of the hazard function, is there any way to discern its shape over time?
The cumulative hazard function: A conceptual introduction
Cumulative hazard function
• Assesses the total amount of accumulated
risk individual i has faced from the
beginning of time (t0) to the present (tj)


H(t_ij) = cumulation of h(t_ij) between t_0 and t_j
• By definition, begins at 0 and rises over
time (never decreasing).
• Has no directly interpretable metric, and
is not a probability
• Cumulation prevents it from directly
assessing unique risk (hazard)
• But, examining its changing shape allows
us to deduce this information
From H(t) to h(t)
To deduce the shape of h(t), study how
the rate of increase in H(t) changes
over time. Any change in its rate of increase
reflects a corresponding change in h(t)
(ALDA, Section 13.4, pp 488-491)
To develop an intuition, first
move from h(t) to H(t).
Because h(t) is constant, H(t)
increases linearly as the same fixed
amount of risk—the constant value of
hazard—is added to the prior
cumulative level at each successive
instant (making H(t) linear).
Next, move from H(t) to h(t)
because this is what you need to do
in practice.
• Guesstimate the rate of increase
in H(t) at different points in time
• Because the slopes are identical,
the rate of change in H(t) is
constant over time, indicating
that h(t) is constant over time
From cumulative hazard to hazard: Developing your intuition
• When H(t) increases more rapidly over time (it accelerates), h(t) must be increasing (the linear increase in h(t) is not guaranteed, but a steady increase is)
• When H(t) increases more slowly over time, h(t) must be decreasing: over time, a smaller amount of risk is added to H(t), suggesting an asymptote in h(t)
• When H(t) increases slowly, then rapidly, and then slowly, h(t) must be initially low, then increase, and then decrease: when the rate of increase in H(t) reverses itself, h(t) has hit a peak (or trough)
(ALDA, Section 13.4.1, pp 488-491)
The cumulative hazard function in practice:
Estimation methods and data analytic practice
Nelson-Aalen method
-ln S(t) method
• Goal: To cumulate the “hazard” that exists
at all possible times between t0 and tj
• Idea: Use the (erratic) Kaplan-Meier-type hazard estimates to compute the “total hazard during each interval,” ĥ_KM(t_j) × width_j
• Sum these up: Ĥ(t_j) = Σ ĥ_KM(t_j) × width_j
• It requires calculus to derive, but it can be established that H(t_j) = −ln S(t_j).
• So…estimate H(t) by taking the negative
log of the KM estimated survivor function
• As one would hope, very similar to NA
estimates, especially at early times when
both sets of estimates are most stable.
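Both estimators can be computed side by side from the same risk-set bookkeeping (a sketch under my own function names, with made-up toy data, not ALDA's software):

```python
import math

def cumulative_hazard(times, censored):
    """Return (t, Nelson-Aalen H-hat, -ln S-hat_KM) at each event time."""
    data = sorted(zip(times, censored))
    n = len(data)
    H, S, out, i = 0.0, 1.0, [], 0
    while i < n:
        t = data[i][0]
        at_risk = n - i
        d = sum(1 for tt, c in data if tt == t and not c)  # events at t
        m = sum(1 for tt, c in data if tt == t)            # all observations at t
        if d > 0:
            H += d / at_risk        # Nelson-Aalen increment
            S *= 1 - d / at_risk    # Kaplan-Meier factor
            out.append((t, H, -math.log(S)))
        i += m
    return out

# Toy data; the two columns agree closely early on and diverge a little
# later, as the slide notes:
for t, na, nls in cumulative_hazard([1, 2, 3, 4, 5], [False]*4 + [True]):
    print(t, round(na, 3), round(nls, 3))
```

In practice you would plot either column against time and read hazard's shape from its changing slope.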
[Figure: Ĥ(t_j) vs. seconds after light turns green (0-20); the Nelson-Aalen and negative log survivor estimates nearly coincide. Examining the changing slopes in Ĥ(t) to learn about hazard: the slowest rate of increase comes first, then a faster rate of increase, then a slowing down.]
Conclusions: Hazard is initially low, increases until around the 5th second, and then decreases again.
(ALDA, Section 13.4.2, pp 491-494)
Can we systematically quantify
these changing rates of increase
so as to describe hazard?
Kernel-smoothed estimates of the hazard function
Idea: Use the changing rates of change in cumulative hazard to form (admittedly erratic) pseudo-hazard estimates, then average them together to stabilize:
• h(t_j) = rate of change in {−ln S(t_j)}
• So, successive differences in the sample cumulative hazard yield “pseudo-slope” estimates of hazard
• Choose a temporal window—a “bandwidth”—and aggregate these estimates together
• This yields approximate hazard values based on nearby estimates
[Panels show kernel-smoothed hazard estimates with bandwidth = 1, 2, and 3]
Finally … a clear window on hazard (especially with wider bandwidths)
But, as the bandwidth widens:
• The link between the smoothed function and hazard diminishes, because it estimates hazard’s average within a broader time frame
• The estimates can only be computed at intermediate times (a big problem if hazard is highest initially)
(ALDA, Section 13.5, p 494-497)
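The pseudo-slope-and-average idea can be sketched with a boxcar (uniform) window (a simplification of my own; published kernel smoothers typically weight estimates by their distance from the focal time, and all names and data here are illustrative):

```python
def smoothed_hazard(event_times, cum_hazard, bandwidth, grid):
    """Average the pseudo-slopes of H(t) that fall within +/- bandwidth
    of each grid point; returns None where no estimates are in the window."""
    slopes, prev_t, prev_H = [], 0.0, 0.0   # H(0) = 0 by definition
    for t, H in zip(event_times, cum_hazard):
        slopes.append((t, (H - prev_H) / (t - prev_t)))  # pseudo-slope
        prev_t, prev_H = t, H
    out = []
    for g in grid:
        window = [s for t, s in slopes if abs(t - g) <= bandwidth]
        out.append(sum(window) / len(window) if window else None)
    return out

# If H(t) rises at a constant 0.5 per unit time, every pseudo-slope is 0.5,
# so the smoothed hazard is flat at 0.5 regardless of bandwidth:
print(smoothed_hazard([1, 2, 3, 4], [0.5, 1.0, 1.5, 2.0], 1.5, [2, 3]))
# → [0.5, 0.5]
```

The `None` at grid points outside the data illustrates the boundary problem: the smoothed function cannot be computed at the earliest and latest times.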
Developing your data analytic intuition for continuous time event data: Overview
We’ll illustrate these ideas using four data sets, chosen because they have very different distributions of event occurrence over time:
A: Time to first heavy drinking day, 89 recently treated alcoholics (Cooney et al., 1991)
B: Years on bench, 107 US Supreme Ct justices (Zorn & Van Winkle, 2000)
C: Age at first depressive episode, 1,974 adults ages 18-94 (Sorenson, Rutter, & Aneshensel, 1991)
D: Employment duration, 2,074 health care workers (Singer et al., 1998)
To describe the distribution of continuous-time event data, plot:
• The survivor function (Kaplan-Meier estimates)—usually begin here
because we can interpret its level and it can be estimated at all times
• The cumulative hazard function (either –LS or Nelson-Aalen estimates)
• If possible, kernel smoothed estimates of the hazard function
(ALDA, Section 13.6, pp 497-502)
Developing your data analytic intuition: Alcohol relapse & judge tenure
Weeks to first heavy drinking day:
• Relapse very common: ML = 22 weeks, final S(t) = .2149 (at 100 wks)
• Cumulative hazard rises sharply initially and then decelerates, suggesting great initial risk of relapse that declines over time
• Kernel-smoothed hazard (bandwidth = 12) shows steady decline over time (although we can’t comment on the first 12 weeks because of the bandwidth)
Years on the Supreme Court:
• Event occurrence very common: all justices eventually retire or die; ML = 16 years
• Cumulative hazard rises slowly initially and then accelerates at ~10 yrs, suggesting low immediate risk followed by a steady increase in risk
• Kernel-smoothed hazard (bandwidth = 5) shows increasing risk over time (although we can’t comment on the first 5 years because of the bandwidth)
[Panels: S(t), H(t), and h(t) plotted against weeks since discharge (0-100) and years on court (0-35)]
(ALDA, Section 13.6, pp 497-502)
Developing your data analytic intuition: Depression onset & employment duration
Age at first depressive episode:
• Onset very rare: no ML, S(t) = .92 at age 54
• Cumulative hazard rises slowly, then sharply, then slowly, suggesting that hazard is first low, then rises to a peak, and then declines
• Kernel-smoothed hazard (bandwidth = 7) shows this inverted-U shape, with a peak between ages 15 and 30
Tenure at Community Health Centers:
• Over 3 years, many people stay: no ML, S(t) = .57 at 139 weeks
• Cumulative hazard seems almost linear, with a few ‘bumps,’ suggesting relatively steady risk
• Kernel-smoothed hazard (bandwidth = 12) reveals pronounced peaks that correspond to anniversaries—12 mos, 18 mos, and 24 mos after hire
[Panels: S(t), H(t), and h(t) (×10⁻³) plotted against age in years (0-100) and weeks since hired (0-130)]
(ALDA, Section 13.6, pp 497-502)