Modifiable areal units

Some comments on modifiable areal units
Transforming data from one geographical system to another
1
The modifiable areal unit problem
• Various problems linked with the change
from one set of areal units to a different one
– How to aggregate data?
• Simple addition (easy in principle)
• Smoothing (making maps easy to interpret)
– How to disaggregate data?
– Other phenomena: e.g. what happens with
correlations when data are aggregated?
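The last question can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are simulated, the zoning (merging units that are neighbours along the x gradient) and the group size are arbitrary choices, and a different zoning would give a different result.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, group_size = 1000, 50

# Simulated (hypothetical) data: y depends on x, with strong unit-level noise.
x = rng.normal(size=n_units)
y = x + rng.normal(scale=2.0, size=n_units)

# Hypothetical zoning: units that are neighbours along the x gradient are merged.
order = np.argsort(x)
x_agg = x[order].reshape(-1, group_size).mean(axis=1)
y_agg = y[order].reshape(-1, group_size).mean(axis=1)

r_units = np.corrcoef(x, y)[0, 1]
r_agg = np.corrcoef(x_agg, y_agg)[0, 1]
print(f"correlation on original units:   {r_units:.2f}")
print(f"correlation on aggregated units: {r_agg:.2f}")
```

In this setting the averaging removes most of the unit-level noise, so the correlation between the aggregated variables is much higher than between the original ones.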
2
Data disaggregation
• Yi is known for geographic units Ai, i = 1…I.
We want Yk for subunits Bk such that
$$Y_i = \sum_{B_k \subset A_i} Y_k$$
• Different situations:
– Only the target variable Y for units Ai is known.
– A covariable Z is known for units Bk, but limited
information on the link between Y and Z.
– A covariable Z is known for units Bk, and the link
between Y and Z is rather well known.
– Individual data with co-ordinates known (generally
confidential)
• Possibly with a covariable
3
Disaggregation without additional
information
First step:
• Ask yourself: Are you really sure that you
do not have any additional information?
• If you are sure, you have several options,
but usually none of them is good:
– Attribution proportional to area
– Smoothing
• Good for map simplification
• May be used for disaggregation if the target
variable tends to be geographically smooth.
4
Simple areal weighting: illustration
Simulated example of administrative units and catchments
[Figure: three map panels, "Administrative regions", "Catchments" and "Intersections"; units are labelled 1–4 and A–C.]
Aim: attributing to catchments the values of a statistical magnitude known
for administrative units
Method: attributing to each intersection an amount proportional to its
area and reaggregating per catchment
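The method can be sketched in a few lines of NumPy. All numbers are hypothetical (they are not read from the slide), and the intersection areas are assumed to have been computed beforehand, for example with a GIS overlay.

```python
import numpy as np

# Hypothetical intersection areas (km2):
# rows = four administrative regions, columns = three catchments.
inter_area = np.array([
    [30.0, 10.0,  0.0],
    [ 0.0, 20.0, 20.0],
    [25.0,  0.0,  5.0],
    [ 0.0, 15.0, 35.0],
])
region_values = np.array([400.0, 100.0, 60.0, 250.0])  # known totals per region

# Share of each region's area that falls in each intersection.
share = inter_area / inter_area.sum(axis=1, keepdims=True)
# Amount attributed to each intersection, proportional to its area.
inter_values = region_values[:, None] * share
# Reaggregate the intersections per catchment.
catchment_values = inter_values.sum(axis=0)
catchment_density = catchment_values / inter_area.sum(axis=0)

print(catchment_values)
```

The overall total is preserved by construction; whether the catchment values are realistic depends entirely on how evenly the item is spread inside each region.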
5
Effect when the item is spatially concentrated
[Figure: three map panels, "Truth (unknown)", "Estimated density from statistics by administrative region" and "Density attributed to catchments by simple areal weighting". Values shown on the maps: 1.18, 1.26, 0.75, 100, 2.9, 1.02.]
The representation by administrative unit gives a
poor picture, but reallocating to catchments
worsens things.
6
Item with homogeneous distribution
in each administrative unit
[Figure: three map panels, "Item homogeneous by region", "Official statistics" and "Density attributed to catchments by simple areal weighting". Values shown on the maps: 34, 19.1, 2, 2, 18.8, 50, 10, 50, 10, 23.5.]
Reallocation gives a completely wrong picture.
7
Effect when the item is homogeneous
per catchment
[Figure: three map panels, "Item homogeneous by catchment", "Official statistics" and "Density attributed to catchments by simple areal weighting". Values shown on the maps: 18.6, 0, 20, 18.0, 15.9, 17.4, 2, 18.1, 19.4, 18.0, 50.]
Representation by administrative unit is quite bad,
but reallocating to catchments does not improve
things.
8
Covariable Z known for sub-units with
good information on the link Y-Zj.
• Examples of covariables: thematic maps
(land cover, soil, DEM, etc.)
• Areal weighting with coefficients
proportional to known Uj
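• For subunit Bk contained in Ai:
$$Y_k = w_k \sum_j U_j Z_{jk}, \qquad w_k = \frac{Y_i}{\sum_{j,\; B_{k'} \subset A_i} U_j Z_{jk'}}$$
(the sum in the denominator runs over all subunits of Ai, so that the Yk add up to Yi)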
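A minimal numerical sketch of this weighting follows. The coefficients, covariable values and unit totals are hypothetical, and Z[j, k] is assumed to be the area of class j in subunit Bk.

```python
import numpy as np

U = np.array([120.0, 30.0, 0.0])      # hypothetical known coefficient U_j per class j
# Z[j, k]: covariable (assumed here: area of class j) in subunit B_k
Z = np.array([
    [10.0, 0.0, 5.0, 2.0],
    [ 5.0, 8.0, 0.0, 6.0],
    [ 1.0, 4.0, 3.0, 0.0],
])
parent = np.array([0, 0, 1, 1])       # index i of the unit A_i containing each B_k
Y_unit = np.array([3000.0, 500.0])    # known Y_i (hypothetical)

potential = U @ Z                     # sum_j U_j Z_jk for each subunit
unit_potential = np.bincount(parent, weights=potential, minlength=len(Y_unit))
w = Y_unit[parent] / unit_potential[parent]   # w_k, constant within each A_i
Y_sub = w * potential                 # disaggregated Y_k

print(Y_sub)
```

The weight w_k only rescales the coefficients so that the subunit values add up to the known total of their unit; the spatial pattern inside each unit comes entirely from U and Z.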
9
Example of disaggregation with good
information from co-variables
• Target variable: Y = use of fertilizers
• Co-variable Z: CORINE Land Cover
• We assume we have reliable data by NUTS 2
• We need:
– Approx. input per ha of crop in the area
– Proportion of area of each crop in each CLC class
(can be estimated from LUCAS)
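One way to combine the two ingredients, shown here only as a sketch, is to compute an expected input per hectare for each CLC class as the sum over crops of (input per ha of the crop) × (proportion of the class under that crop). The crop list, class list and all numbers are hypothetical, and land not covered by the listed crops is assumed to receive no fertilizer.

```python
import numpy as np

crops = ["wheat", "maize", "grassland"]
input_per_ha = np.array([150.0, 180.0, 40.0])   # hypothetical kg/ha per crop

clc_classes = ["arable land", "pastures", "heterogeneous agricultural areas"]
# crop_share[j, c]: hypothetical proportion of the area of CLC class j under crop c
# (rows need not sum to 1; the remainder is assumed to receive no input).
crop_share = np.array([
    [0.50, 0.30, 0.10],
    [0.02, 0.00, 0.90],
    [0.20, 0.10, 0.40],
])

# Expected input per ha of each CLC class: the coefficients of the areal weighting.
U = crop_share @ input_per_ha

for name, u in zip(clc_classes, U):
    print(f"{name}: {u:.0f} kg/ha")
```

These per-class coefficients can then be used for areal weighting within each NUTS 2 region, rescaled so that the regional totals are respected.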
10
Raw profiles of CLC classes from
LUCAS (EU15 except Sweden)
11
But things are not so easy…
• CLC profiles with LUCAS need to be improved:
– Cleaning noise from co-location inaccuracy
– Adaptation to different geographical areas.
• Input per ha of a given crop is not homogeneous.
• Data per NUTS2 are not necessarily reliable
• Etc…
• But “perfect” is sometimes an enemy of “good”
12
Covariable Z known for sub-units with
little information on the link Y-Z.
• Disaggregation based on a model with
parameters estimated using Y and Z
– Defining a mask
– EM algorithm
– Iterative estimation with several levels of
aggregation
– Etc…
• Examples of covariables: thematic maps
(land cover, soil, DEM, etc.)
13
Simple areal weighting combined with a
mask
[Figure: three map panels, "Concentrated item and mask", "Official statistics mapped with the mask" and "Simple areal weighting with the mask". Values shown on the maps: 1.2, 1.1, 5, 25, 2, 1.6.]
The mask improves the mapping, but reaggregating in a
different system degrades it again.
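The difference between plain and masked areal weighting can be sketched as follows. All areas and values are hypothetical; the mask is assumed to be the part of each intersection where the item can occur at all.

```python
import numpy as np

# Rows = administrative regions, columns = catchments (hypothetical areas, km2).
inter_area = np.array([[30.0, 10.0],
                       [20.0, 40.0]])
# Part of each intersection that lies inside the mask.
masked_area = np.array([[ 3.0,  0.0],
                        [10.0, 10.0]])
region_values = np.array([90.0, 200.0])   # known totals per region

def areal_weighting(values, areas):
    """Spread each region's value over its intersections in proportion to `areas`."""
    return values[:, None] * areas / areas.sum(axis=1, keepdims=True)

plain = areal_weighting(region_values, inter_area).sum(axis=0)
masked = areal_weighting(region_values, masked_area).sum(axis=0)

print("per catchment, plain: ", plain)
print("per catchment, masked:", masked)
```

Both variants preserve the overall total; the mask only changes where inside each region the value is placed.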
14
Example of disaggregation with an iterative
algorithm
• Target variable: population (available by
commune)
• Co-variable: land cover map (CLC)
• Output: estimated population density map
with the resolution of CLC.
15
Disaggregating population density.
Principle of the iterative algorithm
[Diagram: nested levels]
Known levels:
NUTS2
– Commune 1: Land cover 1, Land cover 2
– Commune 2: Land cover 1, Land cover 3
– Commune 3: Land cover 1, Land cover 2, Land cover 3
To be estimated: the values for the land cover classes within each commune
16
Iterative algorithm
• Pretend that you only know data at the highest level
(NUTS2)
• Disaggregate with your covariable (CLC) and an
initial set of coefficients to commune level
• Measure disagreement with known commune data
• Get new coefficients that reduce the disagreement
• Repeat until the disagreement becomes stable
• Apply the estimated coefficients to the commune
data.
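The steps above can be sketched as follows. The slide does not say how the new coefficients are obtained, so the multiplicative (EM-type) update used here is an assumption, as are the data, the stopping tolerance and all names.

```python
import numpy as np

# class_areas[i, j]: area of land cover class j in commune i (hypothetical values).
class_areas = np.array([
    [10.0, 5.0, 0.0],
    [ 4.0, 0.0, 8.0],
    [ 6.0, 3.0, 2.0],
    [ 1.0, 9.0, 4.0],
])
commune_pop = np.array([525.0, 208.0, 317.0, 99.0])   # known commune data (hypothetical)
nuts2_pop = commune_pop.sum()                          # pretend only this total is known

def estimate_coefficients(total, known, areas, n_iter=500, tol=1e-9):
    coef = np.ones(areas.shape[1])                     # initial set of coefficients
    history = []
    for _ in range(n_iter):
        raw = areas @ coef
        predicted = total * raw / raw.sum()            # disaggregate NUTS2 to communes
        history.append(np.abs(predicted - known).sum())  # disagreement with known data
        if len(history) > 1 and abs(history[-2] - history[-1]) < tol:
            break                                      # disagreement has become stable
        # Assumed update rule: raise the coefficients of classes that dominate
        # under-estimated communes, lower them where communes are over-estimated.
        coef = coef * (areas.T @ (known / predicted)) / areas.sum(axis=0)
    return coef, history

coef, history = estimate_coefficients(nuts2_pop, commune_pop, class_areas)

# Final step: apply the coefficients within each commune, keeping commune totals.
weights = class_areas * coef
pop_by_class = commune_pop[:, None] * weights / weights.sum(axis=1, keepdims=True)
density_by_class = np.divide(pop_by_class, class_areas,
                             out=np.zeros_like(pop_by_class), where=class_areas > 0)

print("disagreement, first and last iteration:", history[0], history[-1])
```

Only the ratios between coefficients matter, because every disaggregation is rescaled to a known total. The last step guarantees that the density map adds up to the known commune populations, whatever the quality of the coefficients.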
17
Individual data known (e.g. area frame
survey)
• Aggregating data to a different spatial
system is easy in principle
– Possible impact on the variance
• If a covariable is known: small area
estimators (a Bayesian technique), which use:
– Sample units inside the small area
– Link between sample and co-variable everywhere
• Same spatial system desirable for covariable and results.
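When individual records carry co-ordinates, re-aggregation is just a matter of assigning each record to a zone and summing. The sketch below uses simulated points and two hypothetical zone systems defined by simple thresholds; a real application would use a point-in-polygon test and the survey's sampling weights.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x, y = rng.uniform(0, 10, n), rng.uniform(0, 10, n)   # hypothetical sample co-ordinates
value = rng.gamma(shape=2.0, scale=5.0, size=n)       # hypothetical observed values

# Spatial system 1: two vertical strips; spatial system 2: two horizontal strips.
zone_system_1 = (x >= 5).astype(int)
zone_system_2 = (y >= 5).astype(int)

totals_1 = np.bincount(zone_system_1, weights=value, minlength=2)
totals_2 = np.bincount(zone_system_2, weights=value, minlength=2)
# The number of sample units per zone changes with the zoning,
# which is one way the choice of system can affect the variance.
counts_1 = np.bincount(zone_system_1, minlength=2)
counts_2 = np.bincount(zone_system_2, minlength=2)

print(totals_1, counts_1)
print(totals_2, counts_2)
```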
18
General comments on disaggregation
• It is always possible to disaggregate and
produce a map.
• A different question is the quality of the
disaggregation
• The key point is using pertinent covariables
• A number of algorithms can be used
• Assess how precise the link between the covariable and the target variable is.
19