Spatial Statistics
for Cancer Surveillance
Martin Kulldorff
Harvard Medical School and
Harvard Pilgrim Health Care
Two Applications of Spatial Data
and GIS in Cancer Research
Studies of Specific Hypotheses: Evaluate the
relationship between cancer and geographical
variables of interest such as radon, pesticide use or
income levels, adjusting for geographical variation.
Surveillance: Evaluate the geographical variation of
cancer, adjusting for known or suspected variables
such as age, gender or income.
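One common way to adjust for age in surveillance is indirect standardization: compute the number of cases each area would expect if it shared the region-wide age-specific rates, then compare observed to expected. The sketch below is only an illustration. The areas, age groups and counts are hypothetical, and the function name age_adjusted_ratios is an assumption, not something taken from the course material.

```python
from collections import defaultdict

# Hypothetical counts: (area, age group) -> (cases, person-years at risk).
DATA = {
    ("Area A", "0-49"): (2, 10_000),
    ("Area A", "50+"): (18, 5_000),
    ("Area B", "0-49"): (3, 12_000),
    ("Area B", "50+"): (7, 2_000),
}


def age_adjusted_ratios(data):
    """Return {area: (observed, expected, observed / expected)}."""
    cases_by_age = defaultdict(float)
    years_by_age = defaultdict(float)
    for (_, age), (cases, years) in data.items():
        cases_by_age[age] += cases
        years_by_age[age] += years
    # Age-specific rates for the whole study region.
    rate = {age: cases_by_age[age] / years_by_age[age] for age in cases_by_age}

    observed = defaultdict(float)
    expected = defaultdict(float)
    for (area, age), (cases, years) in data.items():
        observed[area] += cases
        # Cases this area would have if it followed the region-wide rates.
        expected[area] += rate[age] * years
    return {a: (observed[a], expected[a], observed[a] / expected[a])
            for a in observed}


for area, (obs, exp, ratio) in age_adjusted_ratios(DATA).items():
    print(f"{area}: observed={obs:.0f} expected={exp:.2f} ratio={ratio:.2f}")
```

A ratio above 1 means more cases than the area's age structure alone would predict. By construction, the expected counts sum to the total observed count.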
Reasons for Geographical
Cancer Surveillance
• Disease Etiology
• Known Etiology but
Unknown Presence
• Health Services
• Public Education
• Outbreak Detection
• New Diseases
Cancer Prevention and Control
• Are people in some geographical area at
higher risk of brain cancer? This could be
due to environmental, socioeconomic,
behavioral or genetic risk factors.
Cancer Prevention and Control
• Are there geographical differences in the
access to and/or use of early detection
programs, such as mammography
screening?
Cancer Prevention and Control
• Are there geographical differences in the
access to and/or use of state-of-the-art
breast cancer treatment?
Different Types of Cancer Data
• Count Data: Incidence, Mortality, Prevalence
• Categorical Data: Stage, Histology, Treatment
• Continuous Data: Survival
For Incidence and Mortality
Poisson Data
Numerator: Number of Cases
Denominator: Person-years at risk
For Prevalence
Bernoulli Data (0/1 Data)
Numerator: People with Thyroid Cancer
Denominator: Everyone in the population, with or without Thyroid Cancer
Note: When prevalence is low, a Poisson model is
a very good approximation for Bernoulli data.
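The note on low prevalence can be checked numerically. The sketch below compares the binomial distribution (the sum of independent 0/1 outcomes) with a Poisson distribution of the same mean, and reports the largest gap between the two probability functions. The population size and prevalence values are hypothetical, chosen only to contrast a rare condition with a common one.

```python
import math


def binomial_pmf(k, n, p):
    """P(K = k) for K ~ Binomial(n, p), computed in log space for stability."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def poisson_pmf(k, mu):
    """P(K = k) for K ~ Poisson(mu), computed in log space for stability."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))


def max_pmf_gap(n, p):
    """Largest absolute difference between the two pmfs over k = 0..n."""
    mu = n * p  # Poisson mean matched to the binomial mean
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, mu))
               for k in range(n + 1))


n = 10_000  # hypothetical population size
print("low prevalence  (p = 0.001):", max_pmf_gap(n, 0.001))
print("high prevalence (p = 0.3):  ", max_pmf_gap(n, 0.3))
```

The gap is much smaller for the rare condition, which is why the Poisson model is a reasonable stand-in when prevalence is low and less so when it is high.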
For Stage, Histology and Treatment
Bernoulli Data (0/1 Data)
Numerator: Cases of a specific type, e.g. late stage.
Denominator: All cases.
Ordinal Data
For example: Stage 1, 2, 3, 4
For Survival
Survival Data
Length of Survival
(Censored Data is Common)
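To show why censoring matters, here is a minimal sketch of the Kaplan-Meier estimator, a standard way to estimate a survival curve when some patients are still alive at last follow-up. It is offered as a generic illustration, not as the method used later in the course. The follow-up times and the function name kaplan_meier are hypothetical.

```python
def kaplan_meier(times, events):
    """Estimate survival probabilities from possibly censored data.

    times  -- follow-up time for each patient
    events -- True if death was observed, False if the patient was censored
    Returns a list of (time, survival probability) at each observed death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up in years; False marks a censored observation.
curve = kaplan_meier([2, 3, 3, 5, 8], [True, False, True, True, False])
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

Censored patients contribute to the risk set up to their last follow-up time but never count as deaths, so their partial information is used rather than discarded.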
Data Aggregation
(spatial resolution)
• Exact Location
• Census Block Group
• Zip Code
• Census Tract
• County
• State
Data Aggregation
• Some level of aggregation usually needed
due to data availability.
• Less aggregation is typically better as more
information is retained.
• Many statistical methods can be used
regardless of aggregation level.
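The trade-off between aggregation levels can be seen in a small sketch that sums census-tract counts up to counties. The tract identifiers, the counts and the tract-to-county lookup are all hypothetical.

```python
from collections import Counter

# Hypothetical case counts per census tract and a tract-to-county lookup.
TRACT_CASES = {"T1": 4, "T2": 0, "T3": 7, "T4": 1}
TRACT_TO_COUNTY = {"T1": "C1", "T2": "C1", "T3": "C2", "T4": "C2"}


def aggregate(counts, lookup):
    """Sum counts from fine units into the coarser units given by lookup."""
    totals = Counter()
    for unit, n in counts.items():
        totals[lookup[unit]] += n
    return dict(totals)


county_cases = aggregate(TRACT_CASES, TRACT_TO_COUNTY)
print(county_cases)
```

The overall total is preserved, but the contrast between tracts within the same county (for example T3 versus T4) can no longer be recovered from the county counts.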
Course Outline
Geographical Cancer Surveillance
1. Mapping Rates and Proportions
2. Smoothed Maps
3. Tests for Spatial Randomness
4. Spatial Scan Statistic
5. Global Clustering Tests
6. Brain Cancer Mortality
7. Survival Data
Course Outline
Space-Time Cancer Surveillance
8. Space-Time Scan Statistic for the Early
Detection of Disease Outbreaks
Statistical Software
9. SaTScan Demonstration
Comments and Questions
WELCOME AT ANY TIME
Software and Slide Presentation
AVAILABLE FROM THE WEB