Innovative Uses of Geographic Information Systems

Lance A. Waller
Department of Biostatistics
Rollins School of Public Health
Emory University
[email protected]
Outline
 Why does the geography of immunization matter?
 What is GIS?
 What does GIS do?
 What data do I have?
 What questions can I answer with my data?
Why geography?
 Is immunization coverage constant?
 If you know where coverage is low, can you do something?
 If you know where coverage is high, can you learn something?
What is GIS?
 A “geographic information system” (GIS) links:
 Geographic features
 Houses
 Census tracts
 Attribute measurements
 Immunized (yes/no)
 Age
 Sociodemographics

Think of…
Map (locations) linked with Table (attributes)
Objects on the map are features.
Each cell contains an attribute value.
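The link between map features and table attributes can be sketched in a few lines of Python. This is only an illustration: the house IDs, coordinates, and attribute values are hypothetical, and a real GIS stores geometry and attributes in its own file formats.

```python
# Hypothetical features: house IDs mapped to (x, y) map coordinates.
features = {
    "house_1": (2.0, 3.5),
    "house_2": (4.1, 1.2),
    "house_3": (0.7, 2.8),
}

# Hypothetical attribute table: one row per feature, one cell per attribute.
attributes = {
    "house_1": {"immunized": True, "age": 4},
    "house_2": {"immunized": False, "age": 2},
    "house_3": {"immunized": True, "age": 6},
}


def link(features, attributes):
    """Join each map feature to its attribute row through the shared ID."""
    return {
        fid: {"location": xy, **attributes[fid]}
        for fid, xy in features.items()
        if fid in attributes
    }


linked = link(features, attributes)

# A question that needs both halves: where are the non-immunized children?
not_immunized = [fid for fid, rec in linked.items() if not rec["immunized"]]
```

Once linked, any attribute query returns features that can be drawn on the map, and any map selection returns rows of the table.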
What does a GIS do?
Basic GIS operation #1:
 Layering
Non-compliers
Health center catchment
Compliers
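One way to picture layering is as a point-in-polygon overlay: which points from one layer fall inside an area from another. The sketch below uses the standard ray-casting test; the catchment polygon and the point coordinates are hypothetical, and GIS packages provide this operation built in.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point lies inside polygon (list of (x, y) vertices)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def overlay(points, polygon):
    """Keep only the points of one layer that fall inside a polygon of another."""
    return [p for p in points if point_in_polygon(p, polygon)]


# Hypothetical layers: a square catchment area and two point layers.
catchment = [(0, 0), (4, 0), (4, 4), (0, 4)]
compliers = [(1, 1), (5, 2)]
non_compliers = [(3, 3), (6, 6)]

compliers_in_catchment = overlay(compliers, catchment)
non_compliers_in_catchment = overlay(non_compliers, catchment)
```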
Basic GIS operation #2:
 Buffering
 Find areas within a user-specified distance of:
 points
 lines
 areas
Buffers around an area
Buffers around a line feature
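A buffer query reduces to a distance test. The following minimal sketch assumes planar (projected) coordinates and hypothetical locations for homes, a clinic, and a road segment; it selects the homes inside a point buffer and inside a line buffer.

```python
import math


def dist_to_point(p, q):
    """Planar distance between two points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def dist_to_segment(p, a, b):
    """Shortest planar distance from point p to the line segment a-b."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: a and b coincide
        return dist_to_point(p, a)
    # Project p onto the line, then clamp to the segment's end points.
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return dist_to_point(p, (ax + t * dx, ay + t * dy))


def within_point_buffer(points, center, radius):
    return [p for p in points if dist_to_point(p, center) <= radius]


def within_line_buffer(points, a, b, radius):
    return [p for p in points if dist_to_segment(p, a, b) <= radius]


# Hypothetical data in arbitrary planar units.
homes = [(0, 0), (3, 4), (10, 0), (5, 9)]
clinic = (0, 0)
road_start, road_end = (0, 2), (10, 2)

near_clinic = within_point_buffer(homes, clinic, 5.0)
near_road = within_line_buffer(homes, road_start, road_end, 2.5)
```

A buffer around an area works the same way, with the distance measured to the nearest edge of the polygon.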
Famous public health map
!
Snow, J. (1949) Snow on Cholera. Oxford University Press: London.
Wow! Can we do that?
 Many introductions to GIS and public health essentially say:
 “If John Snow could do it with shoe leather, ink, and paper, just imagine what we can do with a computer!”
Basic take-home figure
 The Whirling Vortex of GIS analysis
[Figure: four boxes circling “GIS”]
The question you want to answer
The data you need to answer that question
The data you can get
The question you can answer with those data
Original source: Toxicologist EPA Region IV
What kind of questions?
 Where is coverage the lowest?
 Where is coverage the highest?
 Outbreak size starting in a high-coverage area?
 Outbreak size starting in a low-coverage area?
 How could coverage affect the course of an outbreak?
 Best response to current outbreak?
What kind of attributes?
 Compliers
 Residence location
 Census region counts
 Sociodemographic data
 Census summaries on age, race, sex, income of census region residents
 Some information on compliers’ sociodemographics
Additional attributes
 Non-compliers
 Residence location
 Regional counts
 School data
 School district
 Health plan data
 Billing provides residence address
 ZIP codes?
Basic location types
 Point data
 Latitude and longitude
 (Seems) precise
 Distance calculations
 Regional data
 Counts (cases/controls) from census regions
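For point data stored as latitude and longitude, a distance calculation needs to account for the curvature of the Earth. A minimal sketch using the haversine (great-circle) formula follows; it assumes a spherical Earth with a mean radius of 6371 km, which is an approximation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude along the equator, in km.
one_degree = haversine_km(0.0, 0.0, 0.0, 1.0)
```

Over a single city, projecting the coordinates and using planar distance is usually adequate; the great-circle form matters more over larger study areas.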
Any complications?
 Maxcy (1926): Endemic typhus fever in Montgomery, AL
 Where is “where”?
 Which location for each case?
Maxcy, K.F. (1926) “An epidemiological study of endemic typhus (Brill’s disease) in the Southeastern United States with special reference to its mode of transmission.” Public Health Reports 41, 2967-2995.
Residence:
Employment:
Lilienfeld, D.E. and Stolley, P.D. (1994) Foundations of Epidemiology, Third Edition. Oxford University Press: New York, pp. 136-140.
Complication: Nonconstant population density
Complications with regions
 Counts lose some resolution...
[Figure: map regions labeled with counts 1, 4, 1, 1, 2, 2]
Modifiable Areal Unit Problem
 Different aggregations can lead to different results.
[Figure: the same point locations aggregated into different sets of regions, with the resulting region counts]
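The effect of aggregation can be illustrated with a toy example. The sketch below counts the same hypothetical case locations under two different zonings of a 4 x 4 study area; the locations and zone boundaries are invented for illustration only.

```python
from collections import Counter

# Hypothetical case locations on a 4 x 4 unit study area.
cases = [(0.5, 0.5), (1.5, 0.5), (1.5, 1.5), (2.5, 1.5), (2.5, 2.5), (3.5, 3.5)]


def count_by_zone(points, zone_of):
    """Aggregate points into regional counts using a zoning function."""
    return Counter(zone_of(p) for p in points)


def vertical_zones(p):
    """Two regions split at x = 2."""
    return "west" if p[0] < 2 else "east"


def horizontal_zones(p):
    """Two regions split at y = 2."""
    return "south" if p[1] < 2 else "north"


by_vertical = count_by_zone(cases, vertical_zones)      # west 3, east 3
by_horizontal = count_by_zone(cases, horizontal_zones)  # south 4, north 2
```

The first zoning suggests an even spread of cases, while the second suggests a concentration in the south, although the underlying points are identical.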
MAUP example: John Snow
?
Monmonier, M. (1991) How to Lie with Maps. University of Chicago Press: Chicago. p. 142.
What questions can I ask?
 Point locations
 Interesting/uninteresting clusters
 Interesting: clusters of non-compliers away from clusters of compliers
 Regional counts
 Interesting/uninteresting raised counts
 Interesting: Less coverage than “expected”
Point locations
 Treat locations as a spatial point process
 Spatial “intensity” (average number of events per unit area)
 Think of intensity as a surface
 Compare intensity of compliers to intensity of non-compliers.
 Peaks and valleys in same places?
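An intensity surface can be estimated by kernel smoothing. The sketch below uses a Gaussian kernel with a fixed bandwidth and no edge correction, both of which are simplifying assumptions; the two sets of locations are hypothetical.

```python
import math


def kernel_intensity(events, at, bandwidth):
    """Gaussian kernel estimate of intensity (events per unit area) at location `at`."""
    h2 = bandwidth ** 2
    total = 0.0
    for x, y in events:
        d2 = (x - at[0]) ** 2 + (y - at[1]) ** 2
        # Bivariate Gaussian density centred on the event, standard deviation h
        total += math.exp(-d2 / (2 * h2)) / (2 * math.pi * h2)
    return total


# Hypothetical locations: compliers cluster near (1, 1), non-compliers near (4, 4).
compliers = [(1 + 0.1 * i, 1 + 0.1 * (i % 3)) for i in range(10)]
non_compliers = [(4 + 0.1 * i, 4 + 0.1 * (i % 3)) for i in range(10)]

# Evaluate both surfaces on a coarse grid and compare them point by point.
grid = [(x, y) for x in range(6) for y in range(6)]
difference = {
    g: kernel_intensity(non_compliers, g, 1.0) - kernel_intensity(compliers, g, 1.0)
    for g in grid
}
peak_location = max(difference, key=difference.get)
```

Positive values of the difference mark places where non-compliers are relatively concentrated; whether such a peak is larger than chance would produce is a separate question.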
Monte Carlo simulation
 Simulate data sets under the null hypothesis (e.g., constant coverage rate).
 See if observed data (actual compliers) appear “unusual”.
 To compare intensities, split all locations into compliers and non-compliers at random, then find out how high the peaks and how low the valleys can get.
 Most GIS packages will not do this, but it is a very handy tool in spatial statistics.
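The random-split idea can be sketched as a random labeling test. The code below is one possible version under stated assumptions: a Gaussian kernel with fixed bandwidth, the highest peak of the intensity difference as the test statistic, and hypothetical locations. The lowest valley could be handled the same way by swapping the two groups.

```python
import math
import random


def intensity(events, at, h):
    """Gaussian kernel intensity estimate at `at` with bandwidth h (no edge correction)."""
    h2 = h * h
    return sum(
        math.exp(-((x - at[0]) ** 2 + (y - at[1]) ** 2) / (2 * h2)) / (2 * math.pi * h2)
        for x, y in events
    )


def peak_difference(group_a, group_b, grid, h):
    """Highest value over the grid of intensity(group_a) - intensity(group_b)."""
    return max(intensity(group_a, g, h) - intensity(group_b, g, h) for g in grid)


def random_labeling_test(compliers, non_compliers, grid, h, n_sim=199, seed=0):
    """Monte Carlo p-value for the observed peak of non-complier excess."""
    rng = random.Random(seed)
    observed = peak_difference(non_compliers, compliers, grid, h)
    pooled = list(compliers) + list(non_compliers)
    n_non = len(non_compliers)
    exceed = 0
    for _ in range(n_sim):
        rng.shuffle(pooled)  # reassign labels at random, keeping locations fixed
        simulated = peak_difference(pooled[:n_non], pooled[n_non:], grid, h)
        if simulated >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_sim + 1)


# Hypothetical data: two well-separated clusters.
compliers = [(1 + 0.1 * i, 1 + 0.1 * (i % 3)) for i in range(10)]
non_compliers = [(4 + 0.1 * i, 4 + 0.1 * (i % 3)) for i in range(10)]
grid = [(x, y) for x in range(6) for y in range(6)]

observed, p_value = random_labeling_test(compliers, non_compliers, grid, h=1.0)
```

A small p-value says the observed peak is higher than almost all peaks produced by random splits of the same locations, which is the sense in which the data appear “unusual”.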
Regions
 Compare observed counts to “expected” counts.
 Some basic point process results extend to counts (counts of points in regions).
 A constant coverage rate (perhaps age-adjusted) is again a common way of obtaining “expected” counts.
 Monte Carlo simulation for significance.
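For regional counts, the comparison of observed to “expected” can be sketched as follows. This is a simplified illustration: the populations and counts are hypothetical, the constant rate is not age-adjusted, the test statistic (a Pearson-type sum over the immunized counts) is one choice among many, and the simulation conditions on the total number immunized.

```python
import random


def expected_counts(populations, observed):
    """Expected immunized counts if every region shared the overall coverage rate."""
    rate = sum(observed) / sum(populations)
    return [rate * n for n in populations]


def pearson(observed, expected):
    """Pearson-type discrepancy between observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def monte_carlo_p(populations, observed, n_sim=999, seed=0):
    """Monte Carlo p-value under a constant coverage rate."""
    rng = random.Random(seed)
    expected = expected_counts(populations, observed)
    obs_stat = pearson(observed, expected)
    # One region label per resident; the immunized are drawn at random from all residents.
    labels = [i for i, n in enumerate(populations) for _ in range(n)]
    total = sum(observed)
    exceed = 0
    for _ in range(n_sim):
        simulated = [0] * len(populations)
        for region in rng.sample(labels, total):
            simulated[region] += 1
        if pearson(simulated, expected) >= obs_stat:
            exceed += 1
    return expected, obs_stat, (exceed + 1) / (n_sim + 1)


# Hypothetical regions: children resident and children immunized.
populations = [100, 200, 100, 100]
immunized = [90, 180, 60, 90]

expected, statistic, p_value = monte_carlo_p(populations, immunized)
```

Here the overall rate is 420/500 = 0.84, so the third region would be expected to have about 84 immunized children rather than the 60 observed.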
Related work
 Cancer registries: North American Association of Central Cancer Registries (NAACCR) report on GIS (Wiggins 2002)
 Birth outcome registries
 Public Health/Bioterrorism/Syndromic Surveillance
 Similarities:
 Registry data
 Differences:
 Infectious vs. chronic outcome
 Urgency of temporality
Conclusion
 The best work is a collaboration between
 Geographers
 GISers
 Epidemiologists
 Statisticians
 Get the best data you can to answer the questions you want to answer.
Handy references
 Wiggins L (Ed). Using Geographic Information Systems Technology in the Collection, Analysis, and Presentation of Cancer Registry Data: A Handbook of Basic Practices. Springfield (IL): North American Association of Central Cancer Registries, October 2002, 68 pp.
 Cromley, E.K. and McLafferty, S.L. (2002) GIS and Public Health. The Guilford Press.
 Bailey and Gatrell (1995) Interactive Spatial Data Analysis. Longman.
 Waller and Crawford (2004) Applied Spatial Statistics for Public Health Data. Wiley.
What kind of software?
Commercial GIS Software (ArcView, MapInfo)
Statistically challenged
Statistical Software (SAS, S+ Spatial Stats)
Spatially and/or visually challenged
Extensions (Analysts)
$$$, limited capability
Packages
by scientific user
good, but basic
Scripts and Macros
User-contributed
Subject-specific
SpaceStat/GeoDa
SaTScan
GS+
ClusterSeer
WinBUGS/GeoBUGS
XGOBI/XGvis
R (many nice spatial modules, must write code, quality control?)
Link to GIS
S+/ArcView 3.x
SAS Bridge to ArcGIS 8.x
Often do not give numerical output