
19th Advanced Summer School in Regional Science: GIS & Spatial Econometrics Groningen
Data Issues & Career Advancement
Paul Cheshire
email [email protected]
July 2006
Overview
Originally – European data issues & career advancement
But a bit wider…
Data –
•Some general principles and problems
•Some problems with Eurostat data
Career advancement
•What career?
An academic career…
•Publication
•Grant getting
•Poor pay and some distractions
Data
Model → Hypotheses → Data
Data for applied social science enquiry are never
‘accurate’; they are often not ‘right’
Lack of accuracy may not matter (much):
always ask “does it make a difference?”
Not being right – in the sense of not fully reflecting the
model/hypotheses – usually does make a difference
Frequently too much concern with model & analytical
technique relative to choice and measurement of data
The danger of having too many sophisticated buttons
to push
Good research illuminates
relevant questions
Reliable & appropriate data is THE foundation
Ideal of ‘scientific’ data
Experiment in controlled conditions, eliminate all
extraneous influences
Observe & record……But social sciences
“Take nothing on authority”
Results testable and replicable: data ‘objective’,
‘open’ & independent e.g. problems of medical
research, government
Problems
Observer bias: Disturbance
Scientists should & do go to great lengths to minimise
But – need to recognise:
Degrees of bias, accuracy, objectivity & reliability
Not ‘either’/‘or’
Also the type, source and form of inaccuracy are important to assess:
Is the inaccuracy of a type or form that might bias, or even negate, the results?
Social science ↔ value systems ↔ self-interest
Even government data e.g. Unemployment data
National patterns in revision of GDP estimates –
Goldman Sachs ‘UK bad PR…’
Basic test (& requirement) - the statistical provider has no
interest in outcome: best if motives/self-interest of the
statistical provider are geared to the accuracy of the data
Motivation of data provider
National/International Statistical Offices:
Mature democracies cf dictatorships or banana
republics
But again a continuum: there are degrees of
‘massaging’ and independence
First requirement of good ‘scientific’ data is a
clearly defined ‘statistical concept’
Heat…mass…..energy OK
Need agreed units – income, output, unemployment, education…
But…Culture of Statistical offices not that of
researchers:
dedicated to ‘accuracy’ not problems of
researchers
The Dangers of off-the-peg Data
1. Is the data appropriate?
2. Is the data comparable?
(over time? between places?)
Example
Spatial disparities…..but how measured?
between what units?
Income?…Eurostat GDP pc for NUTS regions?
Obvious problems – units arbitrary:
GDP measured where output is produced:
people counted where they live
Therefore net commuting matters
Big city/regions ‘underbounded’
e.g. Bremen, Hamburg, Brussel, London
So GDP pc is overestimated
Commuting across boundaries
1. At last even acknowledged by Eurostat!
2. “…in some regions the GDP per capita figure can be significantly influenced by commuter flows….[so] that GDP per capita can be overestimated in these regions (e.g. Inner London) and underestimated in the regions where commuters live (e.g. Outer London, Kent and Essex).” (Eurostat, 2005, News Release 47/2005, 7th April.)
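A minimal sketch of the mechanism in the Eurostat quotation, assuming a hypothetical pair of regions: "core" (an underbounded city) and "ring" (where its commuters live). All figures, the region names and the helper `gdp_per_capita` are invented for illustration; none of them are Eurostat data.

```python
# Hypothetical two-region illustration; every number here is an assumption.

def gdp_per_capita(gdp, residents):
    """Divide regional GDP by resident population, region by region."""
    return {region: gdp[region] / residents[region] for region in residents}

residents = {"core": 1_000_000, "ring": 1_000_000}
workers_by_residence = {"core": 400_000, "ring": 400_000}
net_commuters_ring_to_core = 150_000   # assumed net commuting flow, ring -> core
output_per_worker = 50_000             # assumed identical for every worker

# Published GDP is counted where the output is produced (workplace basis).
workers_by_workplace = {
    "core": workers_by_residence["core"] + net_commuters_ring_to_core,
    "ring": workers_by_residence["ring"] - net_commuters_ring_to_core,
}
gdp_by_workplace = {r: w * output_per_worker for r, w in workers_by_workplace.items()}
gdp_by_residence = {r: w * output_per_worker for r, w in workers_by_residence.items()}

# Workplace GDP over resident population: the published-style figure.
published = gdp_per_capita(gdp_by_workplace, residents)
# Output attributed to where the workers live: the like-for-like figure.
residence_based = gdp_per_capita(gdp_by_residence, residents)

for region in residents:
    print(region, published[region], residence_based[region])
```

In this invented case the two regions are identical on a residence basis, yet the published-style figure makes the core look more than twice as rich as the ring. The whole gap comes from where the boundary is drawn.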
DGRegio Measures of Spatial
Disparities Expand
Source: 3rd Cohesion Report, 2004
So the Richest ‘Regions’ are Cities
Source: 3rd Cohesion Report, 2004
Range of GDP p.c. in EU Level 2 Regions (1996: EU15 = 100; 2001: EU25 = 100)

Rank 1996   NUTS Region/Country   GDP pc 1996   GDP pc 2001
            Lubelskie             …              31.4
1           Acores                50             61.2
2           Voreio Aigaio         52             68.1
3           Madeira               54             86.0
4           Extremadura           55             58.7
5           Dessau                55             66.0
10          Calabria              59             68.1
11          Alentejo              60             66.6
13          Thüringen             61             72.6
            UK                    99.8          115.7
            Italy                102.7          109.9
            Latvia                …              36.8
198         Greater London       140            180.7
201         Ile de France        160            180.7
205         Brussel/Bruxelles    173            238.5
206         Hamburg              192            187.3

Sources: 6th Periodic Report (1996); 3rd Cohesion Report (2001)
And (NUTS) Boundaries Change
Almost all the time!
For example the UK
Substantial re-organisation of local government
In mid 1990s
London…
For example the Netherlands
Prior to revision of Structural Funds in 1988
created Flevoland
Structure of NUTS 2003
constructed from pre-existing national units

Territorial Unit   Label             No of regions
NUTS level 1       Major zones       72 (78)
NUTS level 2       Macro regions     213 (211)
NUTS level 3       Smaller regions   1091 (1093)
LAU level 1        Districts         1904
LAU level 2        Municipalities    98544*

*Of which about 38 000 are French Communes
Administrative or non-administrative regions

                 level 1   level 2   level 3
Belgium             3        11        43
Denmark             1         1        15
Germany            16        40       441
Greece              4        13        51
Spain               7        18        52
France              9        26       100
Ireland             1         2         8
Italy              11        20       103
Luxembourg          1         1         1
Netherlands         4        12        40
Austria             3         9        35
Portugal            3         7        30
Finland             2         6        20
Sweden              1         8        21
United Kingdom     12        37       133
GDP p.c. for different Londons 1998: EU15=100
source: REGIO

Area                                   GDP p.c.
Greater London                         157.4
Inner London                           250.6
Inner London – West                    461.9
Inner London – East                    129.1
Outer London                            99.4
Outer London – East & North East        77.8
Outer London – South                    95.3
Outer London – West & North West       120.9
South East                             116
Increase in ‘regional disparities’ generated academic articles!
So - Need Appropriate & Consistent Spatial Concepts

Not NUTS!
• Need not just appropriate & consistent statistical concepts (income, unemployment…) but also spatial concepts
• One – Functional Urban Regions (FURs)
– Core – concentration of jobs
– Hinterland – defined on commuting
• So economic sphere of influence & self-contained
• Similar to US (Standard) Metropolitan Statistical Areas, (S)MSAs
• But boundaries change….
Even if you just want to know city size…
Population present (thousands)

Paris 1990                    London 1991
city                2157      City of London        4
Petite couronne     3988      Inner London       2343
Grande couronne     4520      GLA                6394
Ile de France      10660      South East        16794
FUR a              10624      FUR a              8757
FUR b              11418      FUR b             12519

FUR a, b: Functional Urban Region on 1971 and 1991 boundaries respectively
The difference boundaries make:
Some FURs which are also NUTS

                        1991 Population (thousands)    GDP pc @ PPS, % change 1981-91
                        FUR         NUTS               FUR       NUTS      FUR-NUTS
Bremen                  1272         682                58.2      80.7      -22.5
Hamburg                 2806        1645                64.2      84.7      -20.5
Ile de France/Paris    10624       10740               102.1      87.1       15.0
Brussel/Bruxelles       3399         960                73.4      92.9      -19.5
Greater London          8757        6871               114.0      95.2       18.8

Functional Urban Regions self contained... NW Europe defined on 1991 data
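As a check on the FUR-NUTS column, the differences can be recomputed from the growth figures in the slide's table. This is only a sketch: the tuple layout and the names `rows`, `differences` and `population_ratio` are assumptions made for illustration, and the population ratio is an added convenience, not something the slide reports.

```python
# Figures copied from the slide's table:
# (name, FUR population, NUTS population, FUR % change, NUTS % change)
rows = [
    ("Bremen",               1272,   682,  58.2, 80.7),
    ("Hamburg",              2806,  1645,  64.2, 84.7),
    ("Ile de France/Paris", 10624, 10740, 102.1, 87.1),
    ("Brussel/Bruxelles",    3399,   960,  73.4, 92.9),
    ("Greater London",       8757,  6871, 114.0, 95.2),
]

# Gap in measured GDP pc growth between the two boundary definitions,
# in percentage points, rounded to one decimal as in the slide.
differences = {
    name: round(fur_growth - nuts_growth, 1)
    for name, _, _, fur_growth, nuts_growth in rows
}

# How much larger (or smaller) the functional region is than the NUTS unit.
population_ratio = {
    name: round(fur_pop / nuts_pop, 2)
    for name, fur_pop, nuts_pop, _, _ in rows
}

for name in differences:
    print(f"{name:22s} FUR-NUTS = {differences[name]:6.1f}  FUR/NUTS pop = {population_ratio[name]}")
```

The same city shows measured growth that differs by roughly 15 to 22 percentage points, depending only on which boundary is used.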
And Spatial Dependence…






• If studying growth, productivity, living standards
• Cross border commuting likely to show up as (nuisance) spatial dependence
• If net commuting A → B then GDP pc ‘overestimated’ in B, ‘underestimated’ in A
• If people decentralise relative to jobs over time…
• BUT not just boundaries that matter:
• Statistical Concepts change……..
• And values revised….
GDP & Other Concepts Change…
• “Is it possible to compare data from ESA95 and ESA79?
Concepts and definitions between the two systems ESA95 and ESA79 are very different. In addition ESA79 data is of very limited comparability between Member States. Therefore it would not be correct to create long series by linking data from the two systems.”
Source: Eurostat website – frequently asked questions
My experience - differences are greater for regions than for nations; and particularly for some countries: Germany, Spain & Denmark: Not able to reconcile series
Consult the small print starting with European Regional [& Urban] Statistics: Reference Guide (Eurostat: Annual)
• But - Eurostat urban data…to be avoided
Missing data - No Regional Price Deflators
GDP, GVA available in Purchasing Power Standards
• But calculated nationally
• Significant price variations across regions, importance of which differs between countries.
• Relate particularly to housing
• Biggest in UK, France, Spain???
Regional Income Disparities: Effect of Government Transfers, Net Taxes & Regional Price Variation

Region                    GDP pc      P.D. Income   Real P.D.I.
                          EU=100      UK=100        UK=100
Greater London            151.2       125.4         119.2
Rest of South East        [105]       107.6         105.4
South West                 94.8        98.9         100.2
East Anglia                99.8        96.6          97.5
East Midlands              94.5        95.7          97.0
West Midlands              90.5        91.9          93.1
Yorkshire & Humberside     89.0        93.6          95.6
North West                 90.0        93.6          94.8
North                      85.3        91.5          94.5
Wales                      83.2        88.1          90.1
Scotland                   92.8        94.9          95.9
Northern Ireland           75.1        85.5          86.6
Range                      76.1        39.9          32.6

Sources: 5th Periodic Report and Joseph Rowntree Foundation, Enquiry into Income and Wealth.
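The "Range" row is simply the highest regional index minus the lowest, and it can be recomputed from the figures on the slide. The sketch below assumes one reading of how the flattened columns line up with the regions; the names `regions`, `index_range` and `ranges` are illustrative. Note that the GDP index is on an EU=100 base while the two income indices are on a UK=100 base, so the ranges are indicative rather than strictly like-for-like.

```python
# (GDP pc EU=100, personal disposable income UK=100, real P.D.I. UK=100)
# Values copied from the slide; the bracketed [105] is entered as 105.0.
regions = {
    "Greater London":         (151.2, 125.4, 119.2),
    "Rest of South East":     (105.0, 107.6, 105.4),
    "South West":             (94.8,  98.9, 100.2),
    "East Anglia":            (99.8,  96.6,  97.5),
    "East Midlands":          (94.5,  95.7,  97.0),
    "West Midlands":          (90.5,  91.9,  93.1),
    "Yorkshire & Humberside": (89.0,  93.6,  95.6),
    "North West":             (90.0,  93.6,  94.8),
    "North":                  (85.3,  91.5,  94.5),
    "Wales":                  (83.2,  88.1,  90.1),
    "Scotland":               (92.8,  94.9,  95.9),
    "Northern Ireland":       (75.1,  85.5,  86.6),
}

def index_range(values):
    """Gap between the highest and lowest regional index, to one decimal."""
    values = list(values)
    return round(max(values) - min(values), 1)

measures = ("GDP pc", "P.D. Income", "Real P.D.I.")
ranges = {
    measure: index_range(row[i] for row in regions.values())
    for i, measure in enumerate(measures)
}
print(ranges)
```

The measured disparity more than halves once transfers, net taxes and regional prices are taken into account.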
Another Example: Unemployment
Registration compared to sample survey data:
Registration depends on i) incentives to register
and ii) qualifications necessary to be registered
Usually relate to:
Welfare and unemployment benefit systems…
And politicians’ priorities
But Labour Force Survey measure of Unemployment derived from a small sample survey - therefore not ‘accurate’;
But still more reliable, comparable & useful than ‘registration’ for most purposes
Country        Sample:Registered U% 1981
Germany        0.98
UK             1.08
France         1.10
Belgium        1.18
Denmark        1.57
Ireland        1.40
Italy          0.89
Netherlands    1.30
Luxembourg     3.30
What is Unemployment?
ILO Concept – not in work but actively seeking it.
The number varies with how you ask the question
LFS prior to 1983:
If not in work “are you actively seeking work?”
1984 on:
If not in work “have you looked for work in the
past week?”
Region         U% 1983   U% 1984   Difference
South East       9.9       8.5       -1.4
East Anglia      9.9       8.1       -1.8
North West      16.1      14.0       -2.1
Scotland        16.1      13.5       -2.6
N. Ireland      20.2      17.5       -2.7
Data Errors Can Matter







• How big is the error?
• How big is the data set?
• How critical is a precise estimate of the influence of the variable for the investigation?
• Cheshire & Sheppard studies on Reading housing market
• ‘Lucky’: had to generate own data, designed for purpose
• But much work - so trade off of accuracy against number of obs. A bigger, worse data set is likely to be less useful than a smaller, better designed data set with fewer errors
• … “Residual chasing” e.g. price: but also ‘area’
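One way to see why a bigger, worse data set can be less useful is a small simulation of classical measurement error in an explanatory variable. This is a sketch under stated assumptions, not the Cheshire & Sheppard procedure: the model y = 2x + noise, the sample sizes, the error variances and the function names `ols_slope` and `simulate` are all invented for illustration.

```python
import random

def ols_slope(x, y):
    """Slope from a simple one-variable least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def simulate(n_obs, measurement_sd, rng, true_slope=2.0):
    """Generate y = true_slope * x + e, then estimate the slope from a noisy x."""
    x_true = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    y = [true_slope * x + rng.gauss(0.0, 1.0) for x in x_true]
    # The analyst only sees x with added measurement error (assumed classical).
    x_observed = [x + rng.gauss(0.0, measurement_sd) for x in x_true]
    return ols_slope(x_observed, y)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Big data set, badly measured variable (error variance equal to signal variance).
slope_big_noisy = simulate(20_000, 1.0, rng)
# Small data set, variable measured without error.
slope_small_clean = simulate(200, 0.0, rng)

print("true slope 2.0")
print("big, noisy sample:  ", round(slope_big_noisy, 2))
print("small, clean sample:", round(slope_small_clean, 2))
```

Under these assumptions the large noisy sample converges on roughly half the true slope, and adding more observations does not remove that bias. The small clean sample is less precise but centred on the right value.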
Career Advancement: What career?







• Do you want to be an academic? If so - driven by teaching or by research?
• Most academics – certainly most successful – motivated by research
• Do not do it for the money or the (direct) influence
• Money – into financial institutions or consultancy
• Academics: bloody minded and in a sense arrogant: they can find things out others do not know or have got wrong; they ‘think for themselves’
• Most academics inclined to do the very best:
• But need to know when to do enough to satisfy
So - you want an academic career…

Publication & Grants (& consultancy)
• A PhD is only a means to an end – ‘satisfy the examiners’
• The ‘end’ is peer reviewed publication – collectively these are what constitute the ‘body of scientific knowledge’
• From an international (Anglo-Saxon?) perspective…
• Aim for publication in refereed journals
• Have a publication strategy: Need a ‘pipeline’
• All publications take a significant and variable amount of time going through the process:
Working paper - conference paper - submission - revision - acceptance … proofs … citation?
• Go to conferences, seminars etc; offer papers: you learn and it gives you discipline
Books or Journals?

Varies somewhat by subject

e.g. economics cf history
– (in British RAE estimated that a book had a negative contribution
to an individual’s research ranking in economics)
– Contributed chapters to books have the lowest payoff

Choosing journals:
– ‘general interest’ or specialist/field journals?

Expect – indeed – seek rejections
– If you do not get rejections you are not aiming high enough

Impact factors as a measure of a journal’s
standing/influence
– But again varies by discipline & sub-discipline partly because of
citation & referencing practices by discipline e.g. ‘economics’
versus ‘social geography’

Bibliographic measures increasingly used e.g. European
Economic Association rankings
Grants & funding







• Get/apply for funding for research you are interested in doing: funding bodies increasingly trying to guide research
• Research Councils, Charitable Foundations, European Science Foundation, EU, Government, Commercial …
• Have ideas in your mind of research you would like to do – then aim it at calls or initiatives from funders
• Beware of losing independence, constraints on rights to publish and bureaucratic administrative requirements.
• The ideal is to win money for research you have initiated, want to do and entirely control
• Write proposals clearly and logically so they can be understood not just by specialists: make sure what you propose to do sounds as if it can be done (within the budget)
• Ask for money from grants for ‘dissemination’ and equipment
Consultancy is not Wicked & Can be Useful






• Can give access to data; access to funds
• And, of course, extra income
• But it can be a serious distraction and take over, e.g. acquire ‘dependents’, get seduced
• My own strategy – unless consultancy really interesting for its own sake – always ask: “will it generate an academic publication?”
• Finally – beware of plagiarism
• Be co-operative: it is not the end of the world if someone steals your idea – they could not come up with their own. But you could, and you can come up with another one. You will learn far more co-operating with others; and have more fun!