Innovative Uses of Geographic Information Systems
Lance A. Waller
Department of Biostatistics
Rollins School of Public Health
Emory University
Outline
Why does the geography of
immunization matter?
What is GIS?
What does GIS do?
What data do I have?
What questions can I answer
with my data?
Why geography?
Is immunization coverage
constant?
If you know where coverage is
low, can you do something?
If you know where coverage is
high, can you learn something?
What is GIS?
A “geographic information system”
(GIS) links:
Geographic features
Houses
Census tracts
Attribute measurements
Immunized (yes/no)
Age
Sociodemographics
Think of…
Map (locations)
Objects on the map are features.
linked with
Table (attributes)
Each cell contains an attribute value.
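The map/table link can be sketched as a join on a shared feature ID. This is only an illustration: the tract IDs, centroids, and counts below are made-up values, and `link` is a hypothetical helper, not part of any GIS package.

```python
# Hypothetical map layer: each feature has an ID and a location.
features = {
    "tract_A": {"centroid": (0.0, 0.0)},
    "tract_B": {"centroid": (1.0, 0.5)},
}

# Hypothetical attribute table: one row per feature ID (made-up counts).
attributes = {
    "tract_A": {"children": 200, "immunized": 150},
    "tract_B": {"children": 100, "immunized": 90},
}


def link(features, attributes):
    """Join each map feature to its attribute row by shared ID."""
    return {
        fid: {**geometry, **attributes[fid]}
        for fid, geometry in features.items()
        if fid in attributes
    }


linked = link(features, attributes)

# Once linked, an attribute such as coverage can be computed per feature
# and then drawn on the map at that feature's location.
coverage = {fid: row["immunized"] / row["children"] for fid, row in linked.items()}
```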
What does a GIS do?
Basic GIS operation #1:
Layering
Non-compliers
Health center catchment
Compliers
Basic GIS operation #2:
Buffering
Find areas within a user-specified
distance of:
points
lines
areas
Buffers around an area
Buffers around a line feature
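A minimal sketch of buffering around points and lines, assuming planar (projected) coordinates so that straight-line distance is meaningful. The function names and the example homes, clinic, and road are hypothetical. Buffers around areas need polygon geometry and are left out here.

```python
import math


def dist_point_to_point(p, q):
    """Straight-line distance between two planar points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def dist_point_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:  # degenerate segment: a and b coincide
        return dist_point_to_point(p, a)
    # Project p onto the segment and clamp to its end points.
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return dist_point_to_point(p, (ax + t * dx, ay + t * dy))


def in_point_buffer(p, center, radius):
    """True if p lies within `radius` of a point feature."""
    return dist_point_to_point(p, center) <= radius


def in_line_buffer(p, polyline, radius):
    """True if p lies within `radius` of any segment of a line feature."""
    return any(
        dist_point_to_segment(p, a, b) <= radius
        for a, b in zip(polyline, polyline[1:])
    )


# Hypothetical example: which homes fall inside each buffer?
homes = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]
clinic = (0.0, 0.0)
road = [(0.0, 1.0), (10.0, 1.0)]

near_clinic = [h for h in homes if in_point_buffer(h, clinic, 5.0)]
near_road = [h for h in homes if in_line_buffer(h, road, 1.0)]
```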
Famous public health map
Snow, J. (1949) Snow on Cholera.
Oxford University Press: London.
Wow! Can we do that?
Many introductions to GIS and
public health essentially say:
“If John Snow could do it with
shoe leather, ink, and paper, just
imagine what we can do with a
computer!”
Basic take-home figure
The Whirling Vortex of GIS analysis
The question you want to answer
The data you need to answer that question
The data you can get
The question you can answer with those data
GIS
Original source: Toxicologist EPA Region IV
What kind of questions?
Where is coverage the lowest?
Where is coverage the highest?
Outbreak size starting in high
coverage area?
Outbreak size starting in low
coverage area?
How could coverages impact the
course of an outbreak?
Best response to current outbreak?
What kind of attributes?
Compliers
Residence
location
Census region counts
Sociodemographic data
Census summaries on age, race,
sex, income of census region
residents
Some information on compliers’
sociodemographics
Additional attributes
Noncompliers
Residence
location
Regional counts
School data
School district
Health plan data
Billing provides residence address
ZIP codes?
Basic location types
Point data
Latitude and longitude
(Seems) precise
Distance calculations
Regional data
Counts (cases/controls) from
census regions
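Distance calculations from latitude and longitude can be sketched with the haversine formula. This assumes a spherical Earth with a mean radius of about 6371 km, which is an approximation; the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean radius; assumes a spherical Earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

The "(Seems) precise" caveat still applies: the distance is only as good as the coordinates assigned to each residence.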
Any complications?
Maxcy (1926): Endemic typhus
fever in Montgomery, AL
Where is “where”?
Which location for each case?
Maxcy, K.F. (1926) “An epidemiological study of endemic typhus (Brill’s
disease) in the Southeastern United States with special reference to its
mode of transmission.” Public Health Reports 41, 2967-2995.
[Maps: cases plotted by residence and by place of employment]
Lilienfeld, D.E. and Stolley, P.D. (1994) Foundations of Epidemiology,
Third Edition. Oxford University Press: New York, pp. 136-140.
Complication: Nonconstant
population density
Complications with regions
Counts lose some resolution...
[Figure: point locations aggregated to region counts of 1, 4, 1, 1, 2, 2]
Modifiable Areal Unit Problem
Different aggregations can lead to
different results.
[Figure: the same point locations under alternative region boundaries, giving different sets of region counts]
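A small sketch of the modifiable areal unit problem: the same points are counted under two different zonings of a 4 x 4 study area. The point locations and the two zoning functions are hypothetical, chosen only to show that the counts change with the boundaries.

```python
from collections import Counter

# Hypothetical event locations in a 4 x 4 study area.
points = [(0.5, 0.5), (1.5, 0.5), (1.5, 1.5), (2.5, 1.5), (2.5, 2.5), (3.5, 3.5)]


def count_by_zone(points, zone_of):
    """Count points per zone, where zone_of maps (x, y) to a zone label."""
    return Counter(zone_of(x, y) for x, y in points)


def west_east(x, y):
    """Two zones split by the vertical line x = 2."""
    return int(x // 2)


def south_north(x, y):
    """Two zones split by the horizontal line y = 2."""
    return int(y // 2)


counts_west_east = count_by_zone(points, west_east)      # 3 and 3
counts_south_north = count_by_zone(points, south_north)  # 4 and 2
```

The west/east split suggests an even spread, while the south/north split suggests a concentration in the south, although the points never moved.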
MAUP example: John Snow
Monmonier, M (1991) How to Lie with Maps.
University of Chicago Press: Chicago. p. 142.
What questions can I ask?
Point locations
Interesting/uninteresting
clusters
Interesting: clusters of noncompliers away from clusters of
compliers
Regional counts
Interesting/uninteresting raised
counts
Interesting: Less coverage than
“expected”
Point locations
Treat locations as spatial point
process
Spatial “intensity” (average number
of events per unit area)
Think of intensity as a surface
Compare intensity of compliers to
intensity of non-compliers.
Peaks and valleys in same places?
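One way to picture intensity as a surface is a kernel estimate. The sketch below assumes a Gaussian kernel with a user-chosen bandwidth and no edge correction; comparing the two surfaces through a log ratio is one possible choice, not something the slides prescribe, and both function names are hypothetical.

```python
import math


def kernel_intensity(s, events, bandwidth):
    """Gaussian-kernel estimate of events per unit area at location s.

    No edge correction is applied (a simplifying assumption).
    """
    h2 = bandwidth ** 2
    total = 0.0
    for x, y in events:
        d2 = (s[0] - x) ** 2 + (s[1] - y) ** 2
        total += math.exp(-d2 / (2 * h2))
    return total / (2 * math.pi * h2)


def log_intensity_ratio(s, noncompliers, compliers, bandwidth):
    """Log of non-complier intensity over complier intensity at s.

    Positive values mark locations where non-compliers are relatively dense.
    """
    return math.log(
        kernel_intensity(s, noncompliers, bandwidth)
        / kernel_intensity(s, compliers, bandwidth)
    )
```

Evaluating either function over a grid of locations gives the surface whose peaks and valleys can be compared between the two groups.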
Monte Carlo simulation
Simulate data sets under null hypothesis
(e.g., constant coverage rate).
See if observed data (actual compliers)
appear “unusual”.
To compare intensities, split all locations
into compliers and non-compliers at
random, find out how high peaks, how low
valleys can get.
Most GIS packages will not do this, but it
is a very handy tool in spatial statistics.
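The random-split idea above can be sketched as follows. Assumptions: a Gaussian kernel intensity with no edge correction, a test statistic equal to the highest peak of (non-complier intensity minus complier intensity) checked only at the event locations, and made-up coordinates. The function names are hypothetical. Valleys can be handled the same way with `min` in place of `max`.

```python
import math
import random


def kernel_intensity(s, events, bandwidth):
    """Gaussian-kernel intensity estimate at s (no edge correction)."""
    h2 = bandwidth ** 2
    total = sum(
        math.exp(-((s[0] - x) ** 2 + (s[1] - y) ** 2) / (2 * h2)) for x, y in events
    )
    return total / (2 * math.pi * h2)


def peak_difference(noncompliers, compliers, bandwidth):
    """Highest value of non-complier minus complier intensity over all event sites."""
    sites = noncompliers + compliers
    return max(
        kernel_intensity(s, noncompliers, bandwidth)
        - kernel_intensity(s, compliers, bandwidth)
        for s in sites
    )


def random_labeling_test(noncompliers, compliers, bandwidth, n_sims=199, seed=0):
    """Monte Carlo p-value for the observed peak under random labeling."""
    rng = random.Random(seed)
    observed = peak_difference(noncompliers, compliers, bandwidth)
    pooled = noncompliers + compliers
    n_non = len(noncompliers)
    at_least_as_high = 0
    for _ in range(n_sims):
        shuffled = pooled[:]
        rng.shuffle(shuffled)  # split all locations into two groups at random
        simulated = peak_difference(shuffled[:n_non], shuffled[n_non:], bandwidth)
        if simulated >= observed:
            at_least_as_high += 1
    return observed, (at_least_as_high + 1) / (n_sims + 1)


# Hypothetical data: two well-separated groups of ten locations each.
compliers = [(0.1 * i, 0.0) for i in range(10)]
noncompliers = [(10.0 + 0.1 * i, 10.0) for i in range(10)]
observed, p_value = random_labeling_test(noncompliers, compliers, bandwidth=1.0)
```

A small p-value says the observed peak is higher than random splits of the same locations tend to produce.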
Regions
Compare observed counts to
“expected” counts.
Some basic point process results
extend to counts (counts of points in
regions).
Constant coverage rate (perhaps
age-adjusted) again a common way
of obtaining “expected” counts.
Monte Carlo simulation for
significance.
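A sketch of observed versus "expected" counts under a constant rate, with Monte Carlo significance. Assumptions: no age adjustment, a Pearson chi-square style summary statistic, and simulated data sets that allocate the observed total among regions in proportion to population (a multinomial approximation that is reasonable when the outcome is uncommon). The populations and counts in the usage note are made up.

```python
import random


def expected_counts(populations, observed):
    """Expected count per region if every region shared the overall rate."""
    overall_rate = sum(observed) / sum(populations)
    return [overall_rate * n for n in populations]


def pearson_statistic(observed, expected):
    """Sum over regions of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def monte_carlo_p_value(populations, observed, n_sims=999, seed=0):
    """Share of constant-rate simulations at least as extreme as the data."""
    rng = random.Random(seed)
    expected = expected_counts(populations, observed)
    observed_stat = pearson_statistic(observed, expected)
    total = sum(observed)
    regions = range(len(populations))
    at_least_as_extreme = 0
    for _ in range(n_sims):
        simulated = [0] * len(populations)
        # Place each event in a region with probability proportional to population.
        for r in rng.choices(regions, weights=populations, k=total):
            simulated[r] += 1
        if pearson_statistic(simulated, expected) >= observed_stat:
            at_least_as_extreme += 1
    return (at_least_as_extreme + 1) / (n_sims + 1)
```

For example, with populations `[100, 200, 300, 400]` and non-complier counts `[40, 30, 20, 10]`, the expected counts are 10, 20, 30, and 40, and the first region stands out as far above expectation.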
Related work
Cancer registries: North American
Association of Central Cancer Registries
(NAACCR) report on GIS (Wiggins 2002)
Birth outcome registries
Public Health/Bioterrorism/Syndromic
Surveillance
Similarities:
Registry data
Differences:
Infectious vs. chronic outcome
Urgency of temporality
Conclusion
Best work is a collaboration between
Geographers
GISers
Epidemiologists
Statisticians
Get the best data you can to answer
the questions you want.
Handy references
Wiggins L (Ed). Using Geographic Information
Systems Technology in the Collection, Analysis,
and Presentation of Cancer Registry Data: A
Handbook of Basic Practices. Springfield (IL):
North American Association of Central Cancer
Registries, October 2002, 68 pp.
Cromley, E.K. and McLafferty, S.L. (2002) GIS
and Public Health. The Guilford Press.
Bailey and Gatrell (1995) Interactive Spatial Data
Analysis. Longman.
Waller and Crawford (2004) Applied Spatial
Statistics for Public Health Data. Wiley.
What kind of software?
Commercial GIS Software (ArcView, MapInfo)
Statistically challenged
Statistical Software (SAS, S+ Spatial Stats)
Spatially and/or visually challenged
Extensions (Analysts)
$$$, limited capability
Packages
by scientific user
good, but basic
Scripts and Macros
User-contributed
Subject-specific
SpaceStat/GeoDa
SaTScan
GS+
ClusterSeer
WinBUGS/GeoBUGS
XGobi/XGvis
R (many nice spatial modules, must write code, quality control?)
Link to GIS
S+/ArcView 3.x
SAS Bridge to ArcGIS 8.x
Often do not give numerical output