Zone design methods for epidemiological studies

Samantha Cockings, David Martin
Department of Geography
University of Southampton, UK
Thanks to:
Arne Poulstrup, Henrik Hansen
Medical Office of Health, Province of Vejle, Denmark
Why use areas?

No choice - data only available for areas
• Confidentiality
• Cost

Through choice
• Believe some phenomena are area-level
• Rates/ratios
• Visualisation/mapping
• Decision-making/planning
Problems with using areas

Modifiable areal unit problem (MAUP)
• Scale
• Aggregation
For a given set of data, different aggregations/zoning systems will often show apparently different spatial patterns in the data (Openshaw, 1984)

Ecological fallacy
Relationships between variables which are observed at one level of aggregation may not hold at the individual, or any other, level of aggregation (Blalock, 1964)

Small numbers/instability of rates
Non-nesting units
Recent developments in (UK) automated
zone design methods/tools

2001 UK Census of Population

Automated design of Output Areas (OAs)
Martin et al (2001)1; Martin (2002)2

Based on Automated Zoning Procedure (AZP)
Openshaw (1977)3; Openshaw & Rao (1995)4

Automated Zone Matching software (AZM)
Martin (2002)5
1 Environment & Planning A, 33, 1949-1962
2 Population Trends, 108, 7-15
3 Transactions of the IBG, NS, 2, 459-472
4 Environment & Planning A, 27, 425-446
5 IJGIS, 17, 181-196
Methods
Automated zone design: iterative recombination
[Figure: building blocks → initial random aggregation → iterative recombination (maximise objective function) → aggregated zones. Source: Martin, D (2002), Population Trends, 108, p.11]
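The iterative recombination procedure outlined above can be sketched in a few lines of Python. This is a minimal illustration, not the AZP or AZM implementation: the grid of building blocks, the function names and the objective (closeness of zone populations to a target) are all assumptions made for the example, and the building-block graph is assumed to be connected.

```python
import random
from collections import deque


def grid_adjacency(rows, cols):
    """Hypothetical building blocks: cells of a rows x cols grid, rook adjacency."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {
        (r, c): [(r + dr, c + dc) for dr, dc in steps
                 if 0 <= r + dr < rows and 0 <= c + dc < cols]
        for r in range(rows) for c in range(cols)
    }


def is_connected(blocks, adjacency):
    """True if the given blocks form one non-empty contiguous group."""
    blocks = set(blocks)
    if not blocks:
        return False
    start = next(iter(blocks))
    seen, queue = {start}, deque([start])
    while queue:
        for n in adjacency[queue.popleft()]:
            if n in blocks and n not in seen:
                seen.add(n)
                queue.append(n)
    return seen == blocks


def initial_aggregation(adjacency, n_zones, rng):
    """Grow n_zones contiguous zones outwards from randomly chosen seed blocks."""
    seeds = rng.sample(sorted(adjacency), n_zones)
    zone_of = {block: zone for zone, block in enumerate(seeds)}
    frontier = list(seeds)
    while frontier:
        block = frontier.pop(rng.randrange(len(frontier)))
        for n in adjacency[block]:
            if n not in zone_of:
                zone_of[n] = zone_of[block]
                frontier.append(n)
    return zone_of


def objective(zone_of, population, target):
    """Assumed objective: minus the squared deviation of zone populations from target."""
    totals = {}
    for block, zone in zone_of.items():
        totals[zone] = totals.get(zone, 0) + population[block]
    return -sum((total - target) ** 2 for total in totals.values())


def recombine(adjacency, population, n_zones, target, n_iter=2000, seed=0):
    """Move boundary blocks between zones, keeping only moves that raise the objective."""
    rng = random.Random(seed)
    zone_of = initial_aggregation(adjacency, n_zones, rng)
    best = objective(zone_of, population, target)
    blocks = sorted(adjacency)
    for _ in range(n_iter):
        block = rng.choice(blocks)
        donor = zone_of[block]
        recipients = sorted({zone_of[n] for n in adjacency[block]} - {donor})
        if not recipients:
            continue  # interior block: no neighbouring zone to join
        rest = [b for b in blocks if zone_of[b] == donor and b != block]
        if not is_connected(rest, adjacency):
            continue  # move would empty or split the donor zone
        zone_of[block] = rng.choice(recipients)
        score = objective(zone_of, population, target)
        if score > best:
            best = score
        else:
            zone_of[block] = donor  # revert a non-improving move
    return zone_of, best


# Hypothetical data: 36 building blocks of 100 people, 4 zones, target 900.
adjacency = grid_adjacency(6, 6)
population = {block: 100 for block in adjacency}
zones, score = recombine(adjacency, population, n_zones=4, target=900)
```

Only improving swaps are accepted here, so the search can stall in a local optimum; restarting from several random initial aggregations is one simple way to explore alternatives.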
How can automated zone design help
in environment and health studies?


Explore sensitivity of results to MAUP
Design sets of ‘optimal’ purpose-specific zones

Stability of estimates
• Zones of homogeneous population size?

Exploring spatial patterning of disease
• Zones of homogeneous rates?

Analysing relationships between variables
• Zones of homogeneous risk/confounding factors?

Barriers/boundaries
• Zones constrained by geographical features or administrative boundaries
Empirical study 1: Pre-aggregated data
Morbidity and deprivation in SW England

County of Avon (1991 Census)





1,970 enumeration districts (EDs)
177 wards
Premature (0-64 years) limiting long-term illness (LLTI)
Townsend deprivation score
Standardisation to England & Wales
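As a reminder of what a standardised ratio measures, the sketch below computes an SMR by indirect standardisation: observed cases in a zone divided by the cases expected if reference rates applied to the zone's population. The slides do not state the standardisation method, so this approach, the age bands and all figures are assumptions for illustration only.

```python
def expected_cases(population_by_age, reference_rates):
    """Cases expected if the reference (e.g. national) age-specific rates applied locally."""
    return sum(population_by_age[age] * reference_rates[age]
               for age in population_by_age)


def smr(observed, population_by_age, reference_rates):
    """Standardised ratio: observed cases divided by expected cases."""
    expected = expected_cases(population_by_age, reference_rates)
    if expected == 0:
        raise ValueError("expected cases are zero; the ratio is undefined")
    return observed / expected


# Hypothetical zone: population aged 0-64 by age band, and made-up reference LLTI rates.
zone_population = {"0-15": 100, "16-44": 200, "45-64": 100}
reference_rates = {"0-15": 0.02, "16-44": 0.05, "45-64": 0.15}

ratio = smr(observed=27, population_by_age=zone_population,
            reference_rates=reference_rates)
```

A ratio above 1 indicates more cases than the reference rates would predict; with small zone populations the expected count is small and the ratio becomes unstable, which is one motivation for controlling zone population size.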
[Map: SMR LLTI 0-64. Classes: 0 - 0.62; 0.62 - 0.83; 0.83 - 1.03; 1.03 - 1.36; 1.36 and above; restricted. Scale bar 0-4 km. © Crown copyright/ED-LINE Consortium, ESRC/JISC supported]
[Map: Townsend score. Classes: -6.87 - -3.37; -3.37 - -1.97; -1.97 - -0.43; -0.43 - 1.83; 1.83 and above; restricted. Scale bar 0-4 km.]
[Map: Population (0-64), EDs. Classes: 1 - 291; 292 - 364; 365 - 420; 421 - 500; 501 - 1321; restricted. Scale bar 0-4 km.]
[Map: Population (0-64), wards. Classes: 43 - 1754; 1758 - 2939; 2986 - 4065; 4142 - 7868; 8020 - 14333. Scale bar 0-4 km.]
Aims



Explore sensitivity of association at different
scales (population size)
Explore sensitivity of association for
different aggregations at a given scale
Explore ‘robustness’ of ED and ward level
zoning systems for this type of spatial
analysis
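The two sensitivity questions in these aims can be illustrated with a toy experiment. The sketch below is not the Avon analysis: it uses synthetic data on a one-dimensional strip of equal-population blocks, where two variables share a smooth spatial trend plus independent noise. Zone size stands in for scale, and shifting the zone boundaries (the `offset`) gives different aggregations at the same scale. All names and values are assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def zone_means(values, size, offset=0):
    """Average contiguous runs of `size` blocks from `offset`; incomplete runs are dropped."""
    return [sum(values[i:i + size]) / size
            for i in range(offset, len(values) - size + 1, size)]


def zone_correlation(x, y, size, offset=0):
    """Correlation between two variables after aggregation to zones."""
    return pearson(zone_means(x, size, offset), zone_means(y, size, offset))


# Synthetic building blocks: a shared smooth trend plus independent noise.
rng = random.Random(1)
n_blocks = 2000
trend = [math.sin(i / 40) for i in range(n_blocks)]
deprivation = [t + rng.gauss(0, 1) for t in trend]
illness = [t + rng.gauss(0, 1) for t in trend]

# Scale effect: the same data correlated at increasing zone sizes.
scale_effect = {size: zone_correlation(deprivation, illness, size)
                for size in (1, 5, 20, 50)}

# Aggregation effect: zone size fixed at 20, boundaries shifted.
aggregation_effect = [zone_correlation(deprivation, illness, 20, offset)
                      for offset in range(20)]
```

In this toy setting aggregation averages out block-level noise while largely preserving the shared trend, so the zone-level correlation tends to rise with zone size, and the spread of `aggregation_effect` shows how much a result can depend on where boundaries happen to fall.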
AZM software
[Screenshot: AZM software, © David Martin]
[Map: zoning system with target 3250; mean 0-64 pop. 3713. Scale bar 0-4 km.]
[Map: Population (0-64), target 3250. Classes: 2931 - 3252; 3253 - 3603; 3604 - 3994; 3995 - 4517; 4518 - 5746. Scale bar 0-4 km.]
[Chart: Correlation (Townsend score and LLTI SMR) against mean pop. size: the scale effect. Wards and EDs marked.]
[Chart: Standard deviation (pop. 0-64) against mean pop. size: the scale effect. Wards and EDs marked.]
[Chart: Correlation (LLTI-Townsend) vs. mean population size at given scale: the aggregation effect. x-axis: mean population (0-64), 0 to 5000; y-axis: correlation (LLTI SMR 0-64 and Townsend score), 0.75 to 0.89.]
Results





Observed association affected by choice of
zoning system – MAUP/ecological fallacy
Automated zoning systems demonstrating
greater stability of population size, higher
correlations
Generally increasing Townsend-LLTI
correlation with increasing zone size (pop.)
and iterations
ED and ward correlations at low end of
variation at given scale
Neighbourhood scale of ~3000 for UK?
Empirical study 2: Individual level data
Dioxins and cancer, Kolding, Denmark

Background



c.50,000 residents
Airborne carcinogenic dioxin
Data



Geo-referenced addresses of residents
1986-2002
Roads, rivers, lakes
Buildings/urban areas
Possible zone design criteria


Population size: threshold/target
Physical boundaries



Roads, rivers, lakes
Shape
Homogeneity


Built environment - dwelling type, tenure
Socio-economic - education, income, occupation
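To make the population and homogeneity criteria concrete, the sketch below scores a candidate zoning system. It is an illustration under stated assumptions, not the AZM objective function: the threshold is treated as a minimum zone population, the target as the desired zone population, and homogeneity is measured as the population-weighted within-zone sum of squares of a single attribute (for example a hypothetical proportion of owner-occupied dwellings).

```python
def evaluate_zoning(zones, target, threshold):
    """Score a zoning system.

    zones maps a zone id to a list of (population, attribute) pairs, one per
    building block. Lower target_deviation and within_zone_ss are better.
    """
    report = {"below_threshold": [], "target_deviation": 0.0, "within_zone_ss": 0.0}
    for zone_id, members in zones.items():
        population = sum(p for p, _ in members)
        if population < threshold:
            report["below_threshold"].append(zone_id)
        report["target_deviation"] += abs(population - target)
        if population > 0:
            # Population-weighted mean and sum of squares of the attribute.
            mean = sum(p * a for p, a in members) / population
            report["within_zone_ss"] += sum(p * (a - mean) ** 2 for p, a in members)
    return report


# Hypothetical candidate zoning: zone A is homogeneous, zone B is small and mixed.
candidate = {
    "A": [(150, 0.2), (150, 0.2)],
    "B": [(100, 0.1), (100, 0.9)],
}
report = evaluate_zoning(candidate, target=300, threshold=250)
```

How the separate criteria are weighted against each other is a design decision for the study; the function reports them separately rather than combining them into one score.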
Methods: Thiessen polygons around addresses
Methods: Using constraining features – roads and rivers
Methods: Clipped Thiessen polygons
Illustrative zoning system from AZM: target 300, threshold 250
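The Thiessen polygon of an address is the set of locations closer to that address than to any other. The sketch below approximates this on a raster using only the standard library; the coordinates, extent and cell size are hypothetical, and the clipping to roads and rivers described in the slides is not attempted here.

```python
def nearest_address(x, y, addresses):
    """Index of the address closest to (x, y); ties go to the lowest index."""
    return min(range(len(addresses)),
               key=lambda i: (addresses[i][0] - x) ** 2 + (addresses[i][1] - y) ** 2)


def thiessen_raster(addresses, xmin, ymin, xmax, ymax, cell):
    """Label each grid cell (by its centre) with its nearest address.

    Cells sharing a label approximate that address's Thiessen polygon.
    """
    n_cols = int(round((xmax - xmin) / cell))
    n_rows = int(round((ymax - ymin) / cell))
    return [[nearest_address(xmin + (c + 0.5) * cell,
                             ymin + (r + 0.5) * cell,
                             addresses)
             for c in range(n_cols)]
            for r in range(n_rows)]


# Two hypothetical addresses in a 4 x 2 study area, 1-unit cells.
addresses = [(1.0, 1.0), (3.0, 1.0)]
grid = thiessen_raster(addresses, 0.0, 0.0, 4.0, 2.0, cell=1.0)
```

For real address data a vector Voronoi construction in a GIS or computational geometry library would normally be used; the raster version is only meant to show the nearest-address rule that defines the polygons.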
Next steps

Other design constraints




Physical boundaries in zone design process
Homogeneity
• Built environment
• Social environment
Use zones to calculate rates of cancer
Sensitivity analysis
Conclusions

All zoning systems are imposed and should
not be considered neutral or stable

Zone design methods offer:


The ability to explore the sensitivity and
robustness of existing and alternative zoning
systems
The ability to design purpose-specific zoning
systems according to pre-defined criteria
Environment and health studies:
What are we trying to model?

People (points?)
Health outcome (points/areas?)
Risk factors
Confounding factors

Individual level (points?): ‘People’/‘Composition’
• Predisposing: age, sex, ethnicity, genetics, birthweight
• Lifestyle: smoking, diet, exercise, alcohol
• Socio-economic: occupation, income, education

Area level (areas?): ‘Place’/‘Context’
• Pollution: air, water, noise
• ‘Neighbourhood’: services, housing type/quality, ethnic groupings/population mixing, deprivation, crime, support networks
[Chart: Standard deviation (0-64) vs. mean population size for different aggregations at a given scale. x-axis: mean population size (0-64), 0 to 5000; y-axis: standard deviation (0-64), 0 to 800.]