Detection and Inference of Suddenly Emerging Geographical

Early Detection of
Disease Outbreaks
Prospective Surveillance
For a pre-specified geographical area, there are
existing purely temporal statistical methods for
the detection of a sudden disease outbreak.
Two Important Issues
1. Such methods can be used simultaneously
for multiple geographical areas, but that
leads to multiple testing, providing more
false alarms than what is reflected in the
nominal significance level.
2. Disease outbreaks may not conform to the
pre-specified geographical areas.
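To make the first issue concrete, the sketch below computes the chance of at least one false alarm when the same purely temporal test is run separately in several areas. It assumes the tests are independent, which real neighbouring areas need not be; the function name is illustrative, and 32 is the number of counties in the New Mexico example.

```python
def familywise_false_alarm_probability(alpha: float, n_areas: int) -> float:
    """Chance of at least one false alarm among n_areas tests.

    Assumption: the tests are independent and each has level alpha.
    """
    return 1.0 - (1.0 - alpha) ** n_areas


if __name__ == "__main__":
    # One area keeps the nominal level; many areas inflate it quickly.
    for n_areas in (1, 5, 32):
        prob = familywise_false_alarm_probability(0.05, n_areas)
        print(f"{n_areas:>2} areas: {prob:.3f}")
```

Under this independence assumption, 32 areas tested at the 0.05 level give roughly an 80% chance of at least one false alarm.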
Example:
Thyroid Cancer Incidence
in New Mexico
Data Source: New Mexico Tumor Registry
Time Period: 1973-1992
Gender: Male
Population: 580,000
Annual Incidence Rate: 2.8/100,000
Aggregation Level: 32 Counties
Adjustments for: Age and Temporal Trends
Monte Carlo Replications: 999
Example: Thyroid Cancer
Median age at diagnosis: 44 years
United States (SEER) incidence: 4.5 / 100,000
United States mortality: 0.3 / 100,000
Five year survival: 95%
Known risk factors:
• Radiation treatment for head and neck conditions.
• Radioactive fallout (Hiroshima/Nagasaki, Chernobyl,
Marshall Islands)
• Work as radiologic technician (USA) or x-ray operator
(Sweden).
Detecting Emerging Clusters
• Instead of a circular window in two
dimensions, we use a cylindrical window in
three dimensions.
• The base of the cylinder represents space,
while the height represents time.
• The cylinder is flexible in its circular base and
starting date, but we only consider those
cylinders that reach all the way to the end of
the study period. Hence, we are only
considering ‘alive’ clusters.
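The set of candidate cylinders described above can be sketched as follows. This is an illustration under stated assumptions, not the SaTScan implementation: areas are represented by hypothetical centroid coordinates, each circular base is centred on a centroid and grown one area at a time, ties in distance are broken arbitrarily, and no upper limit on cluster size is applied.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Cylinder:
    areas: frozenset  # areas whose centroid lies inside the circular base
    start: int        # first year covered by the cylinder
    end: int          # last year covered; always the end of the study period


def alive_cylinders(centroids, first_year, last_year):
    """Enumerate distinct cylinders that reach the end of the study period."""
    found = set()
    for cx, cy in centroids.values():
        # Order areas by distance from this centre; each prefix is one base.
        by_distance = sorted(
            centroids,
            key=lambda a: hypot(centroids[a][0] - cx, centroids[a][1] - cy),
        )
        for k in range(1, len(by_distance) + 1):
            base = frozenset(by_distance[:k])
            # The start date is flexible, the end date is fixed ("alive").
            for start in range(first_year, last_year + 1):
                found.add(Cylinder(base, start, last_year))
    return found


if __name__ == "__main__":
    toy = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (3.0, 0.0)}  # made-up centroids
    cylinders = alive_cylinders(toy, 1973, 1975)
    print(len(cylinders), "alive cylinders")
```

Fixing the end date at the last year is what restricts the search to clusters that are still ongoing at the time of analysis.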
Hypothesis Test
• Find Likelihood for Each Choice of
Cylinder
• Through Maximum Likelihood Estimation,
Find the Most Likely Cluster
• Apply Likelihood Ratio Test
• Evaluate Significance Through Monte Carlo
Simulation
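The four steps above can be sketched in a few lines. The code assumes a Poisson model with expected counts scaled to sum to the total number of cases, uses the usual Poisson log likelihood ratio for an excess inside the cylinder, and generates replicates by redistributing the observed total over space-time cells in proportion to their expected counts. The data layout (dictionaries keyed by cell, cylinders as sets of cells) and all function names are illustrative choices, not taken from the slides or from SaTScan.

```python
import math
import random


def poisson_llr(c, e, total):
    """Log likelihood ratio for c observed and e expected cases in a cylinder."""
    if e <= 0 or e >= total or c <= e:
        return 0.0  # only an excess inside the cylinder counts as a cluster
    inside = c * math.log(c / e)
    outside = (total - c) * math.log((total - c) / (total - e)) if c < total else 0.0
    return inside + outside


def max_llr(cases, expected, cylinders):
    """Largest log likelihood ratio over all candidate cylinders."""
    total = sum(cases.values())
    best = 0.0
    for cells in cylinders:
        c = sum(cases.get(cell, 0) for cell in cells)
        e = sum(expected[cell] for cell in cells)
        best = max(best, poisson_llr(c, e, total))
    return best


def monte_carlo_p_value(cases, expected, cylinders, replications=999, seed=0):
    """Rank of the observed maximum among maxima from random replicates."""
    rng = random.Random(seed)
    total = sum(cases.values())
    cells = list(expected)
    weights = [expected[cell] for cell in cells]
    observed_stat = max_llr(cases, expected, cylinders)
    rank = 1
    for _ in range(replications):
        # Null hypothesis: each case falls in a cell with probability
        # proportional to that cell's expected count.
        simulated = dict.fromkeys(cells, 0)
        for cell in rng.choices(cells, weights=weights, k=total):
            simulated[cell] += 1
        if max_llr(simulated, expected, cylinders) >= observed_stat:
            rank += 1
    return rank / (replications + 1)


if __name__ == "__main__":
    # Made-up toy data: four cells with equal expectation, one clear excess.
    expected = {"a": 5.0, "b": 5.0, "c": 5.0, "d": 5.0}
    cases = {"a": 14, "b": 2, "c": 2, "d": 2}
    cylinders = [{cell} for cell in expected]
    print(monte_carlo_p_value(cases, expected, cylinders))
```

Because each replicate takes the maximum over all cylinders, the resulting p-value already accounts for the many locations, sizes and start dates that were searched. With 999 replications, the smallest attainable p-value is 0.001.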
Space-Time Scan Statistic: Alive Clusters

Years  Most Likely Cluster                     Cluster Period  Cases  Expected  RR    p
73-78  Bernalillo + 7 counties West            75-78           48     36        1.4   0.60
73-79  Los Alamos, Rio Arriba                  75-79           9      3.3       2.7   0.58
73-80  Los Alamos, Rio Arriba                  75-80           10     3.8       2.6   0.54
73-81  North Central – San Miguel              75-81           72     53        1.4   0.19
73-82  North Central – San Miguel              75-82           85     62        1.4   0.08
73-83  Bernalillo, Valencia                    73-83           84     62        1.4   0.13
73-84  North Central                           73-84           113    90        1.3   0.14
73-85  Lincoln                                 85              3      0.2       13.8  0.23
73-86  North Central + Colfax, Harding         73-86           129    108       1.2   0.49
73-87  North Central + Colfax, Harding         73-87           142    117       1.2   0.21
73-88  North Central – San Miguel              73-88           143    115       1.2   0.08
73-89  North Central + Colfax, Harding         73-89           165    134       1.2   0.06
73-90  Los Alamos, Rio Arriba, Santa Fe, Taos  79-90           41     22        1.8   0.06
73-91  Los Alamos                              89-91           7      0.9       7.6   0.02
73-92  Los Alamos                              89-92           9      1.2       7.4   0.002

North Central Counties = Bernalillo, Los Alamos, Mora, Rio Arriba,
Sandoval, San Miguel, Santa Fe and Taos.
Adjusting for Yearly Surveillance
The Los Alamos Cluster
1991 Analysis: p=0.13
(unadjusted p=0.02)
1992 Analysis: p=0.016
(unadjusted p=0.002)
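One way to picture the difference between the two p-values is sketched below. It assumes that the adjustment compares the observed statistic with replicate maxima taken over cylinders ending at any analysis year so far, while the unadjusted value uses only cylinders ending at the current year. That reading, the input layout and the function names are assumptions made for illustration; the toy numbers are made up and do not reproduce the Los Alamos figures.

```python
def monte_carlo_p(observed_stat, replicate_stats):
    """Rank-based p-value of the observed statistic among the replicates."""
    rank = 1 + sum(1 for s in replicate_stats if s >= observed_stat)
    return rank / (len(replicate_stats) + 1)


def unadjusted_and_adjusted_p(observed_stat, replicate_max_by_end_year):
    """Return (unadjusted, adjusted) p-values for the current analysis.

    replicate_max_by_end_year holds one row per replicate; each row lists
    the maximum statistic over cylinders ending at each analysis year,
    with the current year last.
    """
    current_only = [row[-1] for row in replicate_max_by_end_year]
    any_analysis = [max(row) for row in replicate_max_by_end_year]
    return (
        monte_carlo_p(observed_stat, current_only),
        monte_carlo_p(observed_stat, any_analysis),
    )


if __name__ == "__main__":
    toy_rows = [[1.0, 5.0], [6.0, 2.0], [3.0, 3.0]]  # made-up replicate maxima
    print(unadjusted_and_adjusted_p(4.0, toy_rows))
```

Since the adjusted comparison set includes everything in the unadjusted one, the adjusted p-value can never be smaller, which matches the direction of the change reported for 1991 and 1992.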
[Figure: observed and expected cases per year in Los Alamos, 1973-1992]
Thyroid Cancer in Los Alamos
The New Mexico Department of Health has investigated
the individual nature of all 17 male thyroid cancer cases
reported in Los Alamos in 1970-1995. All were confirmed
cases.
Thyroid Cancer in Los Alamos
• 3/17 had a history of therapeutic ionizing
radiation treatment to the head and neck.
• 8/17 had been regularly monitored for exposure
to ionizing radiation due to their particular work
at the Los Alamos National Laboratory.
• 2/17 had had significant workplace-related
exposure to ionizing radiation from atmospheric
weapons testing fieldwork.
A known risk factor, ionizing radiation, is hence a
likely explanation for the observed cluster.
Practical Considerations
• Chronic or infectious diseases.
• Known or unknown etiology.
• Daily, weekly, monthly, or yearly data, depending
on the type of disease.
• It is not possible to detect clusters much smaller
than the level of data aggregation.
• Data quality control.
• Help prioritize areas for deeper investigation.
• P-values should be used as a general guideline,
rather than in a strict sense.
Limitations
• Space-time clusters may occur for reasons other
than disease outbreaks.
• Automated detection systems do not replace
the observant eyes of physicians and other health
workers.
• Epidemiological investigations by public health
departments are needed to confirm or dismiss the
signals.
Conclusions
• The space-time scan statistic can serve as an
important tool in prospective systematic
time-periodic geographical surveillance for the early
detection of disease outbreaks.
• It is possible to detect emerging clusters, and
we can adjust for the multiple tests performed
over the years.
• The method can be used for different diseases.
Practical Considerations (cont.)
• Possible to specify 0.05 probability of a false
alarm:
- since start
- during last 20 years
- during last 5 years (~ one false alarm per 100 years)
- during last year (~ one false alarm per 20 years)
- during last 18 days (~ one false alarm per year)
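The approximate false alarm rates in parentheses follow from dividing the length of the period by the false alarm probability. A minimal sketch of that arithmetic, with an illustrative function name and a 365-day year assumed for the 18-day case:

```python
def years_between_false_alarms(window_years: float, alpha: float = 0.05) -> float:
    """Expected years per false alarm if each window has probability alpha."""
    return window_years / alpha


if __name__ == "__main__":
    for label, window in (
        ("last 5 years", 5.0),
        ("last year", 1.0),
        ("last 18 days", 18.0 / 365.0),
    ):
        print(f"{label}: about {years_between_false_alarms(window):.1f} years")
```

This gives about 100 years, 20 years and 1 year respectively, in line with the figures listed above.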
Computing Time
Each analysis took between 5 and 75 seconds
to run on a 400 MHz Pentium Pro.
References
Kulldorff M. Prospective time-periodic
geographical disease surveillance using a
scan statistic. Journal of the Royal
Statistical Society, A164:61-72, 2001.
Software: Kulldorff M et al. SaTScan v.3.1.
http://www.satscan.org/