Cox model with additional info on the baseline


Empirical Likelihood
Mai Zhou
Dept. of Statistics, University of Kentucky
• Any first-year Statistical Inference course
will talk about the (parametric) “likelihood
function”.
• Three inference methods (tests) based on
likelihood:
1. Wald test
2. Score test (Rao’s Score test)
3. Likelihood ratio test (Wilks)
Empirical likelihood is a nonparametric
version of 3.
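As a small parametric illustration of the three tests, here is a sketch in R for a binomial proportion. The data (37 successes in 100 trials) and the null value p0 = 0.3 are made up for this example; they are not from the slides.

```r
# Hypothetical data: x successes out of n trials; test H0: p = p0.
x  <- 37
n  <- 100
p0 <- 0.3
phat <- x / n                      # maximum likelihood estimate

# Binomial log-likelihood (the constant choose(n, x) term is dropped)
loglik <- function(p) x * log(p) + (n - x) * log(1 - p)

wald  <- (phat - p0)^2 / (phat * (1 - phat) / n)  # variance estimated at the MLE
score <- (phat - p0)^2 / (p0 * (1 - p0) / n)      # variance evaluated under H0
lrt   <- 2 * (loglik(phat) - loglik(p0))          # likelihood ratio (Wilks)

stats <- c(Wald = wald, Score = score, LRT = lrt)
pvals <- 1 - pchisq(stats, df = 1)  # each is referred to chi-square with 1 df
print(rbind(statistic = stats, p.value = pvals))
```

The Wald and score tests both need a variance expression; the likelihood ratio test only needs the likelihood evaluated at two points, which is the feature empirical likelihood carries over to the nonparametric setting.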
• Empirical Likelihood allows the statistician to
employ likelihood methods, without having to
pick a parametric family of distributions for the
data. --- Owen
• Empirical Likelihood allows for hypothesis
testing and confidence region construction
without an information/variance estimator.-- me
• Plus many additional nice properties.
• First book on this subject: A. Owen,
“Empirical Likelihood” (2001).
But in the Cox model the (partial) likelihood
ratio has existed for a long time (over 20 years).
SAS proc phreg and the Splus function coxph()
both compute it.
Claim: The (partial) likelihood ratio statistic
for the regression coefficients in the Cox
model can be interpreted as a case of
Empirical Likelihood Ratio.
• For n independent observations $x_1, x_2, \ldots, x_n$
from $F(t)$, the empirical likelihood is
$$EL(F) = \prod_{i=1}^{n} \Delta F(x_i)$$
where
$$\Delta F(x_i) = F(x_i) - F(x_i-) = P_F(X = x_i)$$
• EL(F) is maximized by the empirical distribution
function:
$$\Delta F(x_i) = \frac{1}{n}$$
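A quick numerical check of this claim, as a sketch in R: compare the log empirical likelihood of the equal weights 1/n with that of many randomly drawn probability vectors on the same n points. The sample size, the seed, and the way competitors are generated are arbitrary choices made for this illustration.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
n <- 5

# Log empirical likelihood as a function of the jumps w_i = Delta F(x_i)
log_el <- function(w) sum(log(w))

uniform <- rep(1 / n, n)          # jumps of the empirical distribution function

# Draw many other probability vectors on the same n points
random_log_el <- replicate(1000, {
  w <- rexp(n)
  log_el(w / sum(w))
})

max(random_log_el) <= log_el(uniform)   # no competitor beats the 1/n weights
log_el(uniform)                         # equals -n * log(n)
```

This is only a spot check; the general statement follows from the arithmetic-geometric mean inequality applied to the weights.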
• There is often an additional parameter of interest
when maximizing EL(F):
$$\mu = \int g(t)\, dF(t)$$
• $F(t)$, or the jumps $\Delta F(x_i)$, can then be
considered nuisance parameters
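For uncensored data, one standard way to carry out this constrained maximization is by Lagrange multipliers, following Owen's treatment. The notation below is introduced here and is not on the slide: $w_i$ denotes the jump $\Delta F(x_i)$, $\mu$ the fixed value of $\int g(t)\,dF(t)$, and $\lambda$ the multiplier.

```latex
\max_{w_1,\dots,w_n} \prod_{i=1}^{n} w_i
\quad\text{subject to}\quad
w_i \ge 0,\quad \sum_{i=1}^{n} w_i = 1,\quad \sum_{i=1}^{n} w_i\, g(x_i) = \mu
\\[6pt]
w_i = \frac{1}{n\,\bigl(1 + \lambda\,(g(x_i) - \mu)\bigr)},
\qquad\text{where } \lambda \text{ solves}\quad
\sum_{i=1}^{n} \frac{g(x_i) - \mu}{1 + \lambda\,(g(x_i) - \mu)} = 0
```

With $\lambda = 0$ the weights reduce to $1/n$, which is the case where the constraint is already satisfied by the empirical distribution function.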
Censored Observations
• For a right censored observation $x_i$,
the likelihood contribution is
$$1 - F(x_i)$$
• For a left censored observation the contribution is
$$F(x_i)$$
• Interval censored:
$$F(U_i) - F(L_i)$$
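To make the three contributions concrete, here is a minimal sketch in R that evaluates them under one specific F. The exponential distribution with rate 0.1 and the observation values are assumptions made purely for illustration; empirical likelihood itself does not fix a parametric F.

```r
# Hypothetical parametric F used only to make the contributions concrete:
# an exponential distribution with rate 0.1.
cdf <- function(t) pexp(t, rate = 0.1)

right_censored    <- function(x) 1 - cdf(x)          # event occurs after x
left_censored     <- function(x) cdf(x)              # event occurred by x
interval_censored <- function(L, U) cdf(U) - cdf(L)  # event falls in (L, U]

contrib <- c(right    = right_censored(5),
             left     = left_censored(5),
             interval = interval_censored(2, 5))
print(contrib)

# Log-likelihood of these three censored observations under this F
sum(log(contrib))
```

The likelihood of a censored sample is the product of such terms, one per observation, with exact observations contributing the jump of F at the observed value.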
Truncated observations
For a left truncated observation (often referred to as
delayed entry):
(entry time, survival time) = $(y_i, x_i)$
• The likelihood contribution is
$$\frac{\Delta F(x_i)}{1 - F(y_i)}$$
• If the survival time is also right censored, then the
likelihood contribution is
$$\frac{1 - F(x_i)}{1 - F(y_i)}$$
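The effect of conditioning on the entry time can be seen in a small R sketch. It assumes an exponential F (rate 0.1) and made-up entry and survival times; for a continuous F the density plays the role of the jump of F at the survival time.

```r
# Hypothetical continuous F: exponential with rate 0.1
rate <- 0.1
cdf  <- function(t) pexp(t, rate)
dens <- function(t) dexp(t, rate)   # stands in for the jump of F at t

y <- 30   # entry (left truncation) time, made up for illustration
x <- 50   # survival time, made up for illustration

# Event observed at x, given the subject was still at risk at entry time y
observed_contrib <- dens(x) / (1 - cdf(y))

# Still alive at x (right censored), given still at risk at entry time y
censored_contrib <- (1 - cdf(x)) / (1 - cdf(y))

c(observed = observed_contrib, censored = censored_contrib)
```

Because the exponential distribution is memoryless, the right-censored term here reduces to exp(-rate * (x - y)): only the time since entry matters.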
• Empirical Likelihood Theorem:
• If the null hypothesis is true then
$$-2 \log \frac{\max_{F \in H_0} EL(F)}{\max_{F \in H_0 \cup H_A} EL(F)} \longrightarrow \chi^2_1 \quad \text{(in distribution)}$$
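The theorem can be tried out in the simplest setting, a test about the mean with uncensored data. The R sketch below is an illustration under that assumption, not the algorithm used for censored or truncated data; the function name el_mean_test, the simulated exponential sample, and the seed are all choices made here.

```r
# Empirical likelihood ratio test of H0: E(X) = mu for uncensored data.
# Returns the -2 log likelihood ratio and its chi-square(1) p-value.
el_mean_test <- function(x, mu) {
  z <- x - mu
  if (min(z) >= 0 || max(z) <= 0) {
    # mu is outside the range of the data: the constraint cannot be met
    return(list(stat = Inf, pval = 0))
  }
  # lambda must keep every 1 + lambda * z_i positive
  lo <- -1 / max(z)
  hi <- -1 / min(z)
  shrink <- 1 - 1e-8
  g <- function(lam) sum(z / (1 + lam * z))   # strictly decreasing in lam
  lam <- uniroot(g, c(lo * shrink, hi * shrink), tol = 1e-12)$root
  stat <- 2 * sum(log(1 + lam * z))           # -2 log likelihood ratio
  list(stat = stat, pval = 1 - pchisq(stat, df = 1))
}

# Simulation under a true null: how often is H0 rejected at the 5% level?
set.seed(2024)                    # arbitrary seed
reject <- replicate(500, {
  x <- rexp(40)                   # true mean is 1
  el_mean_test(x, mu = 1)$pval < 0.05
})
mean(reject)   # the theorem suggests a value near 0.05 for large samples
```

The weights maximizing EL under the constraint are 1 / (n * (1 + lam * z_i)), so the statistic needs only the root lam and no variance estimate.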
R = GNU S/Splus
http://cran.us.r-project.org
+ many add-on packages
A Package for empirical likelihood with
censored/truncated data
Contributed package – emplik (maintained by
Mai Zhou)
It performs empirical likelihood ratio tests for means
or weighted hazards, based on left-truncated and right
censored data, or on left, right, or doubly censored data.
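It tests hypotheses of the form:
$$H_0: \int g(t)\, dF(t) = \mu$$
$$H_0: \int g(t)\, d\Lambda(t) = \theta$$
with right, left, or doubly censored data, or
with left-truncated, right censored data.
Here $\Lambda$ is the cumulative hazard. For continuous $F$,
$$\Lambda(t) = -\log(1 - F(t))$$
and in general
$$\Lambda(t) = \int_{(-\infty,\, t]} \frac{dF(s)}{1 - F(s-)}$$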
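For a distribution with jumps, the integral definition of the cumulative hazard becomes a sum over the jump points. The R sketch below uses a made-up four-point distribution to show the computation and to compare it with -log(1 - F(t)), which matches the integral form only when F is continuous.

```r
# Hypothetical discrete distribution: jumps Delta F at four time points
times <- c(1, 2, 3, 4)
dF    <- c(0.1, 0.2, 0.3, 0.4)

F_at   <- cumsum(dF)              # F(t) at each jump time
F_left <- F_at - dF               # F(t-) just before each jump
haz    <- dF / (1 - F_left)       # hazard jump dF(s) / (1 - F(s-))
Lambda <- cumsum(haz)             # cumulative hazard from the integral form

# The survival function is recovered from the hazard jumps by a product
surv_from_haz <- cumprod(1 - haz)

print(cbind(times, F_at, Lambda, minus_log_surv = -log(1 - F_at)))
```

At every point before the last jump, the summed hazard is smaller than -log(1 - F(t)); the two agree in the limit of many small jumps.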
• Example: data taken from Klein & Moeschberger (1997),
as reported in their Table 1.7
y = left truncation time
= (51, 58, 55, 28, 25, 48, 47, 25, 31, 30, 33, 43, 45, 35, 36)
x = survival times of female psychiatric inpatients
= (52, 59, 57, 50, 57, 59, 61, 61, 62, 67, 68, 69, 69, 65, 76)
d = censoring status (1 = uncensored, 0 = right censored)
= ( 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1 )
> library(emplik)
> el.ltrc.EM( y, x, d, mu=62)
The mean of the NPMLE is 63.18557.
• (If ‘fun’ is left out, the default is fun(t) = t, so the test is about the mean.)
Two of the outputs are
-2LLR = 0.2740571
Pval = 0.6006231
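The reported p-value can be reproduced from the reported statistic with base R alone, assuming (as the empirical likelihood theorem suggests) that it is the upper tail of a chi-square distribution with 1 degree of freedom:

```r
llr  <- 0.2740571                    # -2LLR reported above for mu = 62
pval <- 1 - pchisq(llr, df = 1)      # upper tail of chi-square with 1 df
crit <- qchisq(0.95, df = 1)         # 5% critical value, about 3.84

pval          # agrees with the reported Pval to the digits shown
llr < crit    # TRUE: mu = 62 is not rejected at the 5% level
```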
• Repeat the test for many different values of
the mean (mu=59, etc.).
• If the hypothesized mean is inside the interval
[58.78936, 67.81304], the p-value of the test is
larger than 0.05. So the 95% confidence
interval for the mean is
• [58.78936, 67.81304]
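Rather than trying values by hand, the search for the two endpoints can be automated with a root finder. The helper invert_test below is a hypothetical name introduced for this sketch, and the statistic used to demonstrate it is a toy quadratic, not the empirical likelihood ratio, so that the answer can be checked by hand.

```r
# Invert a test: find where stat_fun(mu) crosses the chi-square cutoff on
# each side of the point estimate. stat_fun must return -2 log LR at mu.
invert_test <- function(stat_fun, estimate, lower, upper, level = 0.95) {
  cutoff <- qchisq(level, df = 1)
  h <- function(mu) stat_fun(mu) - cutoff
  c(lower = uniroot(h, c(lower, estimate), tol = 1e-8)$root,
    upper = uniroot(h, c(estimate, upper), tol = 1e-8)$root)
}

# Toy stand-in statistic (NOT an empirical likelihood ratio): a quadratic
# with minimum 0 at 10, so the exact interval is 10 -/+ 2 * sqrt(cutoff).
toy_stat <- function(mu) (mu - 10)^2 / 4
ci <- invert_test(toy_stat, estimate = 10, lower = 0, upper = 20)
print(ci)

# For the example above one would pass a wrapper around el.ltrc.EM instead,
# e.g. function(mu) el.ltrc.EM(y, x, d, mu = mu)[["-2LLR"]], assuming the
# returned list carries the statistic under that name (not run here).
```

The search brackets must lie inside the range of means the data can support, and the statistic must exceed the cutoff at both outer brackets for uniroot to succeed.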
• For doubly censored data, the standard deviation of
the NPMLE is hard to compute.
• The Wald test/confidence interval is hard to do.
• No problem with empirical likelihood ratio!
• No need to estimate the standard deviation; instead,
we need to maximize EL under some constraint.
• The maximization can be achieved with the help of
modern computers (E-M algorithm, etc.).