Statistical Methods for Testing Carcinogenic Potential of New Drugs


Statistical Methods for Testing
Carcinogenic Potential of New Drugs
in Animal Carcinogenicity Studies
Hojin Moon, Ph.D.
E-mail: [email protected]
September 16, 2005
Collaborators
Dr. Ralph L. Kodell – DBRA, NCTR, FDA
Dr. Hongshik Ahn – SUNY@Stony Brook
National Center for Toxicological Research, U.S. Food and Drug Administration
Animal Carcinogenicity Study
Studies are conducted to assess the oncogenic potential of chemicals
encountered in food or drugs for the protection of public health
Studies often involve testing the statistical significance of a
dose-response relationship among dose (treatment) groups
Various statistical testing methods exist for a dose-response
relationship (Ahn and Kodell, 1998)
Animal Carcinogenicity Study

The statistical analysis of animal carcinogenicity data and the
controversy over cause-of-death (COD) assignment in the Peto test are
current issues in the government-regulated pharmaceutical industry
(Lee et al., 2002; STP Peto Analysis Working Group, 2001, 2002;
U.S. FDA, 2001)
Town Hall meetings were held in June 2001 and June 2002 at the annual
meetings of the STP to discuss issues surrounding COD assignment and
the implications of using the Peto test or the alternative Poly-3 test
Opinions of a number of statisticians (Lee et al., 2002)
Dose-Related Trend Tests

Cochran-Armitage Trend Test (Cochran, 1954; Armitage, 1955)
Detects a linear trend across dose groups in lifetime tumor incidence rates
Does not require COD
Requires an assumption under H0 that all animals are at equal
risk of developing a tumor over the duration of a study
A problem for this test arises from the presence of
treatment-induced mortality unrelated to the tumor of interest
The CA test is known to be sensitive to increases in treatment
lethality and to fail to control the probability of a Type I error
(Bailer & Portier, 1988; Mancuso et al., 2002; Moon et al., 2003)
Cochran-Armitage Trend Test
Dose group       1           2           …    g           Total
# with tumor     y_1         y_2         …    y_g         y.
# w/o tumor      N_1 - y_1   N_2 - y_2   …    N_g - y_g   N - y.
# subjects       N_1         N_2         …    N_g         N

The CA test utilizes the tumor data pooled over the study duration for each group
Expected # with tumor in group i: $E_i = y_\cdot (N_i / N)$
Observed # with tumor in group i: $y_i$
Dose level in group i: $d_i$

$X = \sum_{i=1}^{g} d_i (y_i - E_i)$

$V = \frac{y_\cdot (N - y_\cdot)}{N (N - 1)} \sum_{i=1}^{g} N_i (d_i - \bar{d})^2, \quad \bar{d} = \frac{1}{N} \sum_{i=1}^{g} N_i d_i$

$Z_{CA} = X / \sqrt{V}$

Under the null hypothesis of equal tumor incidence rates among groups, H0: $Z_{CA} \sim N(0, 1)$
Some treatments shorten overall survival -> decreased risk of tumor onset
Survival time is not utilized
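The CA statistic on this slide can be written out in a few lines. The following is a minimal sketch, not the speaker's code: the function name `cochran_armitage_z` and the tumor counts in the example are hypothetical, and only the dose levels (0, 1, 2, 4) and group size (50) echo the design described later in the talk.

```python
import math

def cochran_armitage_z(doses, n_animals, n_tumors):
    """Z_CA = X / sqrt(V) for dose levels d_i, group sizes N_i, tumor counts y_i."""
    n_total = sum(n_animals)   # N
    y_total = sum(n_tumors)    # y.
    if y_total == 0 or y_total == n_total:
        raise ValueError("V is zero when no animal or every animal has a tumor")
    # Mean dose weighted by group size: d_bar = sum(N_i * d_i) / N
    d_bar = sum(n * d for n, d in zip(n_animals, doses)) / n_total
    # X = sum of d_i * (y_i - E_i), with E_i = y. * N_i / N
    x = sum(d * (y - y_total * n / n_total)
            for d, n, y in zip(doses, n_animals, n_tumors))
    # V uses the N(N - 1) denominator shown on the slide
    v = (y_total * (n_total - y_total) / (n_total * (n_total - 1))
         * sum(n * (d - d_bar) ** 2 for n, d in zip(n_animals, doses)))
    return x / math.sqrt(v)

# Hypothetical tumor counts for 4 groups of 50 animals at doses 0, 1, 2, 4
z_ca = cochran_armitage_z([0, 1, 2, 4], [50, 50, 50, 50], [2, 3, 5, 9])
print(round(z_ca, 3))
```

Because only pooled counts enter the calculation, two studies with very different survival patterns but the same counts give the same Z_CA, which is the weakness the slide points out.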
The Poly-k Trend Test




Appropriate alternative to the Peto-type test
No COD required
Adopted by NTP as its official test for
carcinogenicity
Survival-adjusted quantal-response procedure
that takes dose-group differences in intercurrent
mortality (all deaths other than those resulting
from a tumor of interest) into account.
The Poly-k Trend Test

Bailer & Portier (1988)
Proposed the Poly-3 test, which adjusts the CA test by using a
fractional weighting scheme
Number at risk in group i: $r_i = \sum_{k=1}^{N_i} w_{ik}$, where
$w_{ik} = 1$ if the animal dies with tumor and $w_{ik} = (t_{ik} / t_{\max})^3$ otherwise
(time-at-risk weight for the kth animal in group i)
Replace $N_i$ with $r_i$ in calculating $Z_{CA}$
First mentioned the Poly-k test without specifying how to obtain k
Recommended k=3 following evaluation of neoplasm onset time
distribution in control F344 rats and B6C3F1 mice (Portier et al.,
1986)
The Poly-k test with correct k -> Superior operating
characteristics to the Poly-3 test
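The weighting scheme above can be sketched as follows. This is an illustration under stated assumptions, not the Bailer & Portier implementation: the function names and the toy data are hypothetical, and the trend statistic simply substitutes r_i for N_i (and the sum of the r_i for N) in the CA formula, as the slide describes.

```python
import math

def poly_k_at_risk(death_times, has_tumor, t_max, k=3.0):
    """r_i: weight 1 for tumor-bearing animals, (t / t_max)**k otherwise."""
    return sum(1.0 if tumor else (t / t_max) ** k
               for t, tumor in zip(death_times, has_tumor))

def poly_k_trend_z(doses, groups, t_max, k=3.0):
    """CA-type statistic with N_i replaced by r_i.

    groups holds one (death_times, has_tumor) pair per dose group.
    """
    r = [poly_k_at_risk(times, tumors, t_max, k) for times, tumors in groups]
    y = [sum(tumors) for _, tumors in groups]
    r_total, y_total = sum(r), sum(y)
    d_bar = sum(ri * d for ri, d in zip(r, doses)) / r_total
    x = sum(d * (yi - y_total * ri / r_total)
            for d, ri, yi in zip(doses, r, y))
    v = (y_total * (r_total - y_total) / (r_total * (r_total - 1))
         * sum(ri * (d - d_bar) ** 2 for ri, d in zip(r, doses)))
    return x / math.sqrt(v)

# Toy group: one tumor-free death at week 52, one tumor death at week 78,
# two tumor-free animals surviving to the end of a 104-week study
r_example = poly_k_at_risk([52, 78, 104, 104],
                           [False, True, False, False], t_max=104)

# When every animal survives to t_max all weights are 1 and the
# statistic reduces to the ordinary CA statistic
full_survival = [([104] * 4, [False] * 4), ([104] * 4, [True, True, False, False])]
z_example = poly_k_trend_z([0, 1], full_survival, t_max=104)
```

In the toy group the early tumor-free death contributes (52/104)^3 = 0.125 of an animal, so r_example is 3.125 rather than 4.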
The Poly-k Trend Test

Bieler & Williams (1993)




Further modified the CA test by an adjustment of the variance
estimation of the test statistic using the delta method (Woodruff,
1971)
Showed that the Bailer-Portier Poly-3 test is anticonservative for
low tumor incidence rates and for high treatment toxicity
Characteristics of the BP Poly-3 test and the BW Poly-3 test can
be found in Chen et al. (2000)
Objectives

The Poly-k statistic: asymptotically normal under H0 of equal
tumor incidence rates among groups (Bieler & Williams, 1993)


Valid only if the correct value of k is used
Develop the method of bootstrap resampling to estimate the
empirical distribution of the test statistic and corresponding critical
value of the Poly-k test while taking into account the presence of
competing risks
Generalized Poly-k Test

Moon et al. (2003)

Proposed a method for estimating k for data with interval sacrifices
(interim sacrifices and a terminal sacrifice)
Estimation of the poly-k based empirical lifetime cumulative tumor
incidence rate, a function of k
Estimation of cumulative tumor incidence rate (Kodell & Ahn, 1997)
Equate the two estimates and solve for k
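The "equate and solve" step can be illustrated with a simple root search. This is a sketch under strong assumptions, not the procedure of Moon et al. (2003): it works on a single group, takes the poly-k incidence to be y / r(k), and treats the second estimate as a given number (`target_rate`, hypothetical here) rather than computing the Kodell & Ahn estimator. Since each weight (t / t_max)^k shrinks as k grows, y / r(k) is non-decreasing in k and bisection applies.

```python
def poly_k_incidence(death_times, has_tumor, t_max, k):
    """Poly-k adjusted lifetime tumor incidence y / r(k) for one group."""
    y = sum(has_tumor)
    r = sum(1.0 if tumor else (t / t_max) ** k
            for t, tumor in zip(death_times, has_tumor))
    return y / r

def solve_k(death_times, has_tumor, t_max, target_rate,
            lo=0.0, hi=20.0, tol=1e-9):
    """Bisection for the k at which poly_k_incidence equals target_rate."""
    def gap(k):
        return poly_k_incidence(death_times, has_tumor, t_max, k) - target_rate

    if gap(lo) > 0 or gap(hi) < 0:
        raise ValueError("target_rate is not bracketed on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid   # incidence still too small: k must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Toy data: incidence(k) = 1 / (3 + 0.5**k), so a target of 0.32 gives k = 3
k_hat = solve_k([52, 104, 104, 104], [False, True, False, False],
                t_max=104, target_rate=0.32)
```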
Generalized Poly-k Test

Moon et al. (2005) – Bootstrap-based age-adjusted Poly-k test



Improving the Poly-k test for data with a single
terminal sacrifice
Estimation of k for single sacrifice data is more
difficult than that for data with interval sacrifices due
to lack of information on tumor development among
live animals before the termination of the experiment
Propose a method of bootstrap-based age-adjusted
resampling to improve the Poly-k test via a
modification of the permutation method of Farrar &
Crump (1990), which was used for exact statistical
tests
Bootstrap Method

Suitable for data with the same competing risks survival rate (CRSR)
across dose groups
When the CRSR differs across dose groups in the original data,
bootstrap samples drawn from the pooled data satisfy the null
distribution of equal tumor incidence rate across groups but may
not reflect the CRSR of each group
Need to modify the bootstrap method in order to preserve the
survival rates in each dose group
Develop an age-adjusted scheme
Age-adjusted Bootstrap Scheme
Data set: X = (x1, x2, …, xn), with observed test statistic T(X)
Age-adjusted scheme: I(i,m); i = 1, …, G; m = 1, …, Mi
Bootstrap samples: X*1, X*2, …, X*B
Bootstrap replicates: T(X*1), T(X*2), …, T(X*B)
Critical value CR(X): 100(1-α)th percentile of the bootstrap replicates
Reject H0 if T(X) ≥ CR(X)
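One way to read the flow above is as a stratified bootstrap. The sketch below is an interpretation, not the published algorithm: it assumes that each animal keeps its own group and death time (so every group's survival pattern is preserved) and that tumor indicators are redrawn with replacement from animals of all groups that died in the same age stratum (so tumor incidence is equal across groups given age). The equal-width strata, the function names, the placeholder statistic and the records are all hypothetical; in practice `statistic` would be the Poly-k statistic.

```python
import math
import random

def age_stratified_bootstrap_cv(records, statistic, n_boot=1000, alpha=0.05,
                                n_strata=4, seed=0):
    """100(1 - alpha)th percentile of bootstrap replicates of `statistic`.

    records is a list of (group, death_time, has_tumor) tuples.
    """
    rng = random.Random(seed)
    t_max = max(t for _, t, _ in records)

    def stratum(t):
        # Equal-width age intervals on (0, t_max]; an assumption of this sketch
        return min(int(n_strata * t / t_max), n_strata - 1)

    # Pool tumor indicators across groups within each age stratum
    pools = {}
    for _, t, tumor in records:
        pools.setdefault(stratum(t), []).append(tumor)

    replicates = []
    for _ in range(n_boot):
        # Keep group and death time, redraw tumor status from the age pool
        sample = [(g, t, rng.choice(pools[stratum(t)])) for g, t, _ in records]
        replicates.append(statistic(sample))
    replicates.sort()
    return replicates[min(n_boot - 1, math.ceil((1 - alpha) * n_boot) - 1)]

def tumor_count_gap(sample, high=3, control=0):
    """Placeholder statistic: tumor count in the top dose group minus control."""
    return (sum(tumor for g, _, tumor in sample if g == high)
            - sum(tumor for g, _, tumor in sample if g == control))

records = [(0, 300, False), (0, 730, False), (0, 730, True),
           (1, 500, False), (1, 730, False),
           (2, 450, True), (2, 730, False),
           (3, 200, False), (3, 400, True), (3, 730, True)]
cv = age_stratified_bootstrap_cv(records, tumor_count_gap)
reject = tumor_count_gap(records) >= cv   # decision rule from the slide
```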
Example
• Death times (in days) in a hypothetical animal carcinogenicity data set with 4 groups
ID    Death time (days)
A     74
B     145
C     176
D     185
E     243
F     300
G     316
H     324
I     340
J     341
K     343
L     345
M     351
N     385
…..
Simulation Study


To evaluate the improvement of the proposed
test in terms of the robustness to a variety of
tumor onset distributions
Typical bioassay design according to standard
designs of NTP



4 dose groups (dose levels: 0, 1, 2 and 4) of 50
animals each
Experimental duration of 2 yrs.
A single terminal sacrifice at the end of the
experiment
Simulation Study

Tumor onset distributions: Weibull tumor onset distribution with
shape parameter k = 1.5, 3.0 and 6.0
Tumor rates: .05, .15 and .30 for the control
Size evaluation: tumor rates are the same across dose groups
Power evaluation: tumor rates for the highest dose group by 104
weeks: 5, 3 and 2 times the background tumor rates of .05, .15 and
.30, respectively
CRSR (from NTP feeding studies, Haseman et al., 1998):
(.6, .6, .6, .6); (.6, .5, .4, .3); (.6, .6, .5, .2); (.5, .5, .5, .2);
(.5, .6, .5, .4); (.5, .7, .6, .4); (.5, .7, .6, .5)
5000 simulated data sets; α = .05 significance level
For each data set, 5000 bootstrap samples
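To generate tumor onset times that hit a stated tumor rate by week 104, the Weibull scale can be solved from the shape. The sketch below assumes the parameterization F(t) = 1 - exp(-(t / scale)^shape), which the slides do not spell out; the function names are hypothetical. Setting F(104) = p gives scale = 104 / (-ln(1 - p))^(1 / shape).

```python
import math
import random

def weibull_scale(rate_by_end, shape, t_end=104.0):
    """Scale for which P(onset <= t_end) equals rate_by_end."""
    return t_end / (-math.log(1.0 - rate_by_end)) ** (1.0 / shape)

def weibull_cdf(t, scale, shape):
    """F(t) = 1 - exp(-(t / scale)**shape)."""
    return 1.0 - math.exp(-((t / scale) ** shape))

# Control group with background tumor rate .30 and shape 3.0,
# 50 animals as in the design above; the seed is arbitrary
rng = random.Random(1)
scale = weibull_scale(0.30, shape=3.0)
# random.weibullvariate takes (scale, shape) in that order
onset_weeks = [rng.weibullvariate(scale, 3.0) for _ in range(50)]
```

Onset times beyond 104 weeks correspond to animals that would not develop the tumor within the study; deaths from competing risks would be drawn separately to match the chosen CRSR.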
Simulation Study

Size & Power Evaluation with 5000 simulated data sets, 5000 bootstrap samples for each data
set and 5% nominal significance level

                        Weibull 1.5      Weibull 3.0      Weibull 6.0
TR    CRSR              B       N        B       N        B       N
Size
.3    .6,.6,.6,.6       .053    .050     .054    .050     .055    .052
      .5,.5,.5,.2       .044    .066     .044    .041     .040    .021
      .6,.6,.5,.2       .036    .072     .033    .037     .033    .018
      .6,.5,.4,.3       .047    .069     .043    .045     .040    .024
      .5,.6,.5,.4       .049    .055     .050    .048     .048    .037
      .5,.7,.6,.4       .046    .053     .048    .046     .045    .036
      .5,.7,.6,.5       .054    .050     .051    .047     .054    .044
Power
.3    .6,.6,.6,.6       .918    .934     .908    .923     .893    .904
      .5,.5,.5,.2       .837    .932     .781    .847     .725    .667
      .6,.6,.5,.2       .790    .939     .734    .846     .668    .638
      .6,.5,.4,.3       .864    .938     .825    .884     .773    .748
      .5,.6,.5,.4       .886    .929     .868    .895     .834    .819
      .5,.7,.6,.4       .881    .930     .856    .892     .817    .810
      .5,.7,.6,.5       .904    .927     .884    .909     .859    .865
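When reading the size entries it helps to know how much Monte Carlo noise 5000 simulated data sets leave. The following is a small sketch using the usual binomial standard error; the function name and the 1.96 normal-approximation multiplier are choices made here, not taken from the talk.

```python
import math

def mc_standard_error(rate, n_sim):
    """Binomial standard error of a rejection rate estimated from n_sim runs."""
    return math.sqrt(rate * (1.0 - rate) / n_sim)

# Nominal level .05 with 5000 simulated data sets, as in the table
se = mc_standard_error(0.05, 5000)
lower, upper = 0.05 - 1.96 * se, 0.05 + 1.96 * se
```

The standard error is about .003, so estimated sizes between roughly .044 and .056 are compatible with a true level of .05, while entries such as .072 or .018 are not.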
Example

The 2-yr Gavage Study of Furan



Furan (C4H4O), a clear and colorless liquid, serves
primarily as an intermediate in the synthesis and
preparation of numerous organic compounds (NTP,
1993)
Toxicology and carcinogenesis studies were
conducted by administering furan in corn oil by
gavage to groups of F344/N rats and B6C3F1 mice of
each sex for 2 yrs
Furan was nominated by the NCI for evaluation of
carcinogenic potential because of its large production
volume and use and the potential for widespread human
exposure to a variety of furan-containing compounds
Example

Female F344/N rats



Evaluation of carcinogenic potential on incidences of
cholangiocarcinoma or hepatocellular neoplasms of
the liver
Groups of 50 rats were administered 2, 4 or 8 mg
furan per kg body weight in corn oil by gavage 5 days
per week for 2 yrs
Male B6C3F1 mice


Evaluation of carcinogenic potential on incidences of
adenocarcinoma or alveolar/bronchiolar adenoma of
the lung.
Groups of 50 mice received doses of 8 or 15 mg/kg
furan 5 days per week for 2 yrs
 Test results on the carcinogenic activity of furan in female F344/N rats based
on increased incidences of cholangiocarcinoma and hepatocellular neoplasms
of the liver (Reject when T(X) ≥ CR(X))
mg/kg      T(X) BW (a)   CR(X) Normal (b)    CR(X) Bootstrap (c)
Overall    4.1617        1.6449 (p<.001)     2.0141 (p<.001)
0,2,4      2.7705        1.6449 (p=.003)     1.9584 (p=.004)
0,2,8      4.3559        1.6449 (p<.001)     1.9584 (p<.001)
0,4,8      3.6632        1.6449 (p<.001)     1.8214 (p<.001)
0,2        1.4641        1.6449 (p=.072)     1.4625 (p=.040)
0,4        2.6542        1.6449 (p=.004)     1.5905 (p=.001)
0,8        3.8420        1.6449 (p<.001)     1.7423 (p<.001)
(a) The BWP3 test statistic obtained from the data
(b) Standard normal critical value at the significance level .05
(c) Critical value estimated by the 95th percentile of T(X)'s from our method
 NTP concluded that under the conditions of these 2-yr gavage
studies, there was clear evidence of carcinogenic activity of furan in
female F344/N rats based on increased incidences of
cholangiocarcinoma and hepatocellular neoplasms of the liver
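The "Normal" column of the table can be checked with the Python standard library. This is a minimal sketch assuming a one-sided upper-tail test, which is what the 1.6449 critical value at level .05 implies; the variable and function names are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()

# One-sided critical value at the .05 level (95th percentile of N(0, 1))
critical_value = std_normal.inv_cdf(0.95)

def upper_tail_p(t_stat):
    """One-sided p-value P(Z >= t_stat) under the standard normal."""
    return 1.0 - std_normal.cdf(t_stat)

# Dose groups 0 and 2 mg/kg in the table: T(X) = 1.4641
p_0_2 = upper_tail_p(1.4641)
```

For the 0 vs 2 mg/kg comparison the normal approximation gives p of about .072 and does not reject, whereas the bootstrap critical value 1.4625 lies just below T(X) = 1.4641, so the bootstrap version rejects (p = .040 in the table).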
 Test results on the carcinogenic potential of furan on incidences of
adenocarcinoma and alveolar/bronchiolar adenoma of the lung in male B6C3F1
mice (Reject when T(X) ≥ CR(X))
mg/kg      T(X) BW (a)   CR(X) Normal (b)    CR(X) Bootstrap (c)
Overall    1.6995        1.6449 (p=.045)     1.7774 (p=.058)
0,15       1.6805        1.6449 (p=.046)     1.6938 (p=.052)
0,8        .2229         1.6449 (p=.41)      1.9248 (p=.53)
(a) The BWP3 test statistic obtained from the data
(b) Standard normal critical value at the significance level .05
(c) Critical value estimated by the 95th percentile of T(X)'s from our method
 Our test results agree with the conclusions from NTP
Significance





The statistical analysis of tumorigenicity data from
animal bioassays remains an important regulatory issue
for FDA and the pharmaceutical industry
The present research will further refine the Poly-k test
in order to make it more broadly competitive with
the Peto test
The improved Poly-k test for dose-related trend will be
robust to a variety of tumor onset distributions
It will control the false positive rate better than the
Poly-3 test, thus having enhanced performance in identifying
dose-related trends
With no information on COD or tumor lethality, the
improved version can be used confidently when Peto's
test cannot be implemented
References














Ahn H, Kodell RL (1998). Analysis of long-term carcinogenicity studies. In Design and Analysis of
Animal Studies in Pharmaceutical Development, Chow SC, Liu JP (eds). Marcel Dekker, Inc.:
New York, 259-289.
Armitage P (1955). Tests for linear trends in proportions and frequencies. Biometrics, 11, 375-386.
Bailer AJ, Portier CJ (1988). Effects of treatment-induced mortality and tumor-induced mortality
on tests for carcinogenicity in small samples. Biometrics, 44, 417-431.
Bieler GS, Williams RL (1993). Ratio estimates, the delta method, and quantal response tests for
increased carcinogenicity. Biometrics, 49, 793-801.
Chen JJ, Lin KK, Huque MF, Arani RB (2000). Weighted p-value adjustments for animal
carcinogenicity trend test. Biometrics, 56, 586-592.
Cochran WG (1954). Some methods for strengthening the common χ² tests. Biometrics, 10, 417-451.
Lee PN, Fry JS, Fairweather WR, Haseman JK, Kodell RL, Chen JJ et al. (2002). Current issues:
statistical methods for carcinogenicity studies. Toxicologic Pathology, 30, 403-414.
Mancuso JY, Ahn H, Chen JJ, Mancuso JP (2002). Age-adjusted exact trend tests in the event of
rare occurrences. Biometrics, 58, 403-412.
Moon H, Ahn H, Kodell RL, Lee JJ (2003). Estimation of k for the poly-k test. Statistics in
Medicine, 22, 2619-2636.
National Toxicology Program (1993). Toxicology and carcinogenesis studies of furan in F344/N
rats and B6C3F1 mice (Gavage studies). NTP Technical Report, 402, Research Triangle Park,
NC.
STP Peto Analysis Working Group (2001). The Society of Toxicological Pathology’s position on
statistical methods for rodent carcinogenicity studies. Toxicologic Pathology, 29(6), 670-672.
STP Peto Analysis Working Group (2002). The Society of Toxicological Pathology’s
recommendations on rodent carcinogenicity studies. Toxicologic Pathology, 30, 415-418.
U.S. FDA (2001). Guidance for industry: statistical aspects of the design, analysis, and
interpretation of chronic rodent carcinogenicity studies of pharmaceuticals. Federal Register,
66(89), 23266-23267.
Woodruff RS (1971). A simple method for approximating the variance of a complicated estimate.
Journal of the American Statistical Association, 66, 411-414.