Presentation - Quality on Statistics 2010


European Conference on
Quality in Official Statistics – Q2010
3 - 6 May 2010 - Helsinki (Finland)
Longitudinal data from Italian
Labour Force Survey
Barbara Boschetto
Antonio R. Discenza
Francesca Fiori
Carlo Lucarelli
Simona Rosati
ISTAT - Italian National Institute of Statistics
Labour Force Survey Division
Unit “Methods for LFS data treatment”
Outline of the presentation
Issues related to the production of longitudinal microdata
and gross flows estimates consistent with the official
quarterly estimates
A specific focus is devoted to the weighting procedure, which
both accounts for a suitable reference population and
compensates for total non-response at subsequent waves
the most relevant methodological problems addressed are:
definition of a suitable reference population for the
longitudinal sample
longitudinal non-responses and eligibility
coherence between cross-sectional and longitudinal
estimates
Household rotation scheme 2-2-2 in the
Italian LFS
REFERENCE PERIOD    ROTATION GROUPS
Quarter 4 2000      A2  B1
Quarter 1 2001      B2  C1
Quarter 2 2001      C2  D1
Quarter 3 2001      A3  D2  E1
Quarter 4 2001      A4  B3  E2  F1
Quarter 1 2002      B4  C3  F2  G1
Quarter 2 2002      C4  D3  G2  H1
Quarter 3 2002      D4  E3  H2  I1
Quarter 4 2002      E4  F3  I2  L1
50% of the sample overlaps after 1 quarter and 1 year
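The overlap figure can be checked with a minimal sketch of a 2-2-2 scheme. It assumes that a rotation group is interviewed 0, 1, 4 and 5 quarters after it first enters the sample; the function names and the integer quarter index are illustrative, not part of the survey documentation.

```python
# Offsets (in quarters since the first interview) at which a rotation group
# is interviewed under a 2-2-2 scheme: in for two quarters, out for two,
# in again for two. This encoding is an assumption made for illustration.
IN_SAMPLE_OFFSETS = (0, 1, 4, 5)


def groups_in_sample(quarter: int) -> set[int]:
    """Entry quarters of the rotation groups interviewed in `quarter`."""
    return {quarter - offset for offset in IN_SAMPLE_OFFSETS}


def overlap_share(quarter: int, lag: int) -> float:
    """Share of the groups interviewed in `quarter` that are interviewed again `lag` quarters later."""
    current = groups_in_sample(quarter)
    later = groups_in_sample(quarter + lag)
    return len(current & later) / len(current)


if __name__ == "__main__":
    for lag in (1, 2, 3, 4):
        print(f"lag {lag}: {overlap_share(10, lag):.0%} of rotation groups overlap")
```

Under this encoding half of the rotation groups are shared one quarter apart and four quarters apart, while no group is shared two quarters apart.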
Net changes in quarterly levels
are the final result of a large number of gross flows of
different nature and size
Demographic flows:
– Children aged 15 entering working age
– Deaths
– Internal and International migration
Labour status transitions:
– Flows between the three main activity states
(employment, unemployment and inactivity)
GROSS LABOUR MARKET FLOWS
(Diagram. Inflows: children aged 15 and people entering municipalities. Stocks: employment, unemployment, inactive. Outflows: deaths and people leaving municipalities.)
Choice 1: the reference population is equal to the
population of the initial quarter
Ideally, longitudinal data from LFS should represent the
whole initial population.
However, the initial population actually changes during the
period of observation because of deaths and internal and
international migrations.
Thus longitudinal data could represent the whole initial
population only if the LFS were designed as a “proper”
panel, in which all the individuals in the initial sample were
“followed” for a new interview at a later stage.
This means that information must also be collected on
people moving to another municipality or to another
country.
In practice, information on persons who left the country
during a given period is almost never available.
Figure 1: Scheme for a “desirable” complete matrix with stocks
and gross flows from LFS.
Rows (labour status at the beginning of the period): Employed, Unemployed, Inactive, Total.
Columns (labour status at the end of the period): Employed, Unemployed, Inactive, Total.
Rows and columns together form the transition matrix (T) of the total longitudinal population (15 and over).
Additional columns: Deaths (D); People leaving the country, 15 and over (L); Total cross-sectional population at the beginning of the period, 15 and over (S1).
Additional rows: Children aged 15 (C); People entering the country, 15 and over (E); Total cross-sectional population at the end of the period, 15 and over (S2).
Choice 2: the reference population is a specific
longitudinal population
However, even the population resident in a country at the
beginning of the period, which is still resident in the country at
the end of the reference period, can experience movements to
and from the different municipalities (internal migration).
Usually, in the LFS, people moving out of the household,
across the country, are not “followed” for re-interview.
Is it still correct to use the initial population as the
reference population?
In fact, in the longitudinal sample we have information only
about those individuals still resident in the same municipality at
the end of the period.
The longitudinal component (sub-sample) of the Italian
LFS thus requires the specification of a suitable
reference population.
Choice 2: the reference population is a specific
longitudinal population
• If we weight the longitudinal sample to the initial population
we make a very strong assumption: the behaviour of
individuals who move out of the municipality from one
wave to another is similar to that of those who do not move.
• We have, thus, at least two problems:
– In practice, at least in Italy, these two groups are very
different
– Moreover, if we use the longitudinal microdata to
produce flow estimates, there are no records of
individuals moving to other regions/countries and/or dying
(whereas they do exist in the population).
Definition of longitudinal population
The Longitudinal Population is defined in Italy as the
population which is resident in the same municipality for
the entire 12-month period
excluding
– deaths
– those who have moved to other Italian municipalities
(change of residence)
– migrants to other countries
• It is computed from population register data on the resident
population; it is classified by broad age groups, geographical
area (NUTS III) and nationality (Italian, EU, non-EU)
• Full consistency with the reference population of EU-LFS
quarterly data is also ensured.
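The definition amounts to subtracting the exits from the initial population, cell by cell. The sketch below is only an illustration: the function name is hypothetical, and the cells used are the three labour statuses at 2007Q1 (in thousands) taken from the 2007Q1–2008Q1 matrix reported in this presentation, whereas the actual computation described above works on register data by age group, NUTS III area and nationality.

```python
def longitudinal_population(initial, deaths, movers):
    """Initial population minus deaths and minus people who left the municipality, per cell."""
    return {
        cell: initial[cell] - deaths.get(cell, 0) - movers.get(cell, 0)
        for cell in initial
    }


# Figures in thousands, population aged 15+ at 2007Q1 (from the presentation's matrix).
initial_2007q1 = {"employed": 22_846, "unemployed": 1_556, "inactive": 26_021}
deaths = {"employed": 49, "unemployed": 2, "inactive": 495}
# "People leaving the municipalities": internal movers and emigrants together.
movers = {"employed": 817, "unemployed": 102, "inactive": 377}

if __name__ == "__main__":
    print(longitudinal_population(initial_2007q1, deaths, movers))
```

The result reproduces the row totals of the longitudinal population in that matrix (21,980 employed, 1,452 unemployed and 25,149 inactive, in thousands).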
Figure 2: Scheme for the two transition matrices referred to the
longitudinal population and to internal migrants.
The total transition matrix (T) is the sum of two matrices with the same layout (labour status at the beginning of the period by labour status at the end of the period: Employed, Unemployed, Inactive, Total):
– (TL): longitudinal population (15 and over)
– population which moves across the country (15 and over)
Figure 3: Scheme for an “actual” complete matrix with stocks and
gross flows from the Italian LFS.
Rows (labour status at the beginning of the period): Employed, Unemployed, Inactive, Total.
Columns (labour status at the end of the period): Employed, Unemployed, Inactive, Total.
Rows and columns together form the transition matrix (TL) of the longitudinal population (15 and over).
Additional columns: Deaths (D); People moving across or leaving the country, 15 and over (LM); Total cross-sectional population at the beginning of the period, 15 and over (S1).
Additional rows: Children aged 15 (C); People moving across or entering the country, 15 and over (EM); Total cross-sectional population at the end of the period, 15 and over (S2).
The longitudinal estimates must be consistent with the “official” estimates
provided by the cross-sectional samples (the full sample) at the beginning
and at the end of the observed period.
By using specific constraints in the calibration procedure that weights the
longitudinal sample, it is possible to reduce the risk of obtaining inconsistent
results.
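Read together with the 2007Q1–2008Q1 figures reported in this presentation, the scheme appears to imply the accounting identities below. They are a reading of the figure, not a formula stated by the authors; the subscript $j$ for a labour status and the dot notation for the row and column totals of (TL) are assumptions introduced here.

```latex
\begin{align*}
S1_j &= TL_{j\cdot} + D_j + LM_j \\
S2_j &= TL_{\cdot j} + C_j + EM_j \\
S2_j - S1_j &= \left(TL_{\cdot j} - TL_{j\cdot}\right) + \left(C_j - D_j\right) + \left(EM_j - LM_j\right)
\end{align*}
```

The last line splits the net change in a cross-sectional stock into a longitudinal, a demographic and a migratory component, which is what the consistency constraints are meant to preserve.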
Longitudinal non-responses and eligibility
The longitudinal component of the LFS is usually affected by
unit non-response in subsequent waves, such as:
• Municipality non-response: some (very small) municipalities are
substituted in July at the beginning of a new annual survey cycle
and some others may, for different reasons, fail to provide the
interviews in subsequent waves;
• Household non-response: none of the members of the household
fills in the questionnaire because they refuse to respond;
• Individual non-response: some members of the household do not
fill in the questionnaire because they refuse to respond, cannot
be contacted, or have left the household to create a new
household in the same municipality.
Unit non-response may produce bias if non-respondents have
significantly different labour market characteristics from
respondents.
Figure 4: Classification of individuals from the initial sample and
eligibility in the Italian LFS (in the presence of longitudinal non-response).
Individuals present in the initial sample (initial quarter) are either not matchable or matchable. Matchable individuals fall into two groups:
• Individuals still resident in the same municipality (c): eligible for the longitudinal sample
– Respondents (e): present in the final sample, matched through the longitudinal link, present in the longitudinal sample and given longitudinal weights
– Refusals (f): not present in the final sample, not matched, not in the longitudinal sample
– Unreachable (g): not present in the final sample, not matched, not in the longitudinal sample
• Individuals no longer resident in the same municipality, who exit the initial population (d): not eligible for the longitudinal sample
– Internal migration (to another municipality): not present in the final sample, not matched, not in the longitudinal sample
– International migration (to another country): not present in the final sample, not matched, not in the longitudinal sample
– Deaths: not present in the final sample, not matched, not in the longitudinal sample
Individuals from the final sample (final quarter) without a longitudinal link are classified as not matched (h).
We don’t have enough information to distinguish eligible non-respondents from non-eligible ones.
Eligibility
All the individuals can be classified into two groups:
• Eligible:
– they represent part of the longitudinal population (because they are still
living in the same municipality),
– they should be re-interviewed at the subsequent wave,
– some of them are non-respondents in the final quarter, so
they must be considered in a model for the treatment of non-response
(they must be represented by individuals with similar
characteristics);
• Not eligible:
– they left the initial population during the observed period
(deaths and migrations),
– they do not represent part of the longitudinal population,
– they must be excluded from a model for the treatment of non-response.
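The two groups can be expressed as a small classification sketch. It assumes that the outcome of each individual at the final quarter is known, which in practice is not always the case; the enum and the function names are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    """Possible outcomes at the final quarter for an individual in the initial sample."""
    RESPONDENT = "respondent"
    REFUSAL = "refusal"
    UNREACHABLE = "unreachable"
    INTERNAL_MIGRATION = "internal migration"
    INTERNATIONAL_MIGRATION = "international migration"
    DEATH = "death"


# Outcomes that take the individual out of the initial population.
NOT_ELIGIBLE = frozenset(
    {Outcome.INTERNAL_MIGRATION, Outcome.INTERNATIONAL_MIGRATION, Outcome.DEATH}
)


def is_eligible(outcome: Outcome) -> bool:
    """Eligible individuals still belong to the longitudinal population."""
    return outcome not in NOT_ELIGIBLE


def needs_representation(outcome: Outcome) -> bool:
    """Eligible non-respondents must be represented by respondents with similar characteristics."""
    return is_eligible(outcome) and outcome is not Outcome.RESPONDENT


if __name__ == "__main__":
    for outcome in Outcome:
        print(outcome.value, is_eligible(outcome), needs_representation(outcome))
```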
Weighting longitudinal data in three steps 1/3
Step 1 :
All the individuals who are linkable/matchable at the
beginning of the period are selected.
• They are all the individuals of the two rotation groups which
overlap at 12 months and who are resident in the municipalities
which provided interviews for both waves;
• they can be considered a random sub-sample of the whole
cross-sectional sample
their base longitudinal weights are obtained from cross-sectional weights by applying the following correction
$$ k_i^* = \frac{k_i}{\sum_{i \in \text{Linkable}} k_i} \cdot {}^{1}P $$
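One reading of the Step 1 correction is a proportional rescaling, so that the base weights of the linkable sub-sample add up to the population total of the initial quarter. The sketch below implements that reading only; the function name, the dictionary layout and the toy figures are assumptions.

```python
def base_longitudinal_weights(cross_sectional_weights, population_total):
    """Rescale the cross-sectional weights of linkable units so that they sum to `population_total`."""
    weight_sum = sum(cross_sectional_weights.values())
    if weight_sum <= 0:
        raise ValueError("the linkable sub-sample must have a positive total weight")
    factor = population_total / weight_sum
    return {unit: weight * factor for unit, weight in cross_sectional_weights.items()}


if __name__ == "__main__":
    # Toy example: two linkable individuals and a population of 1,000.
    print(base_longitudinal_weights({"a": 100.0, "b": 300.0}, 1000.0))
```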
Weighting longitudinal data in three steps 2/3
Step 2 :
• Accounts for bias due to municipality non-response
• Accounts for the differences between the rotation groups which
overlap and those which don’t
• Helps to ensure consistency between longitudinal and cross-sectional “official” estimates
The first calibration procedure makes matchable individuals at
the beginning of the period
– represent exactly the same cross-sectional population as
the whole cross-sectional sample;
– provide exactly the same cross-sectional “official” estimates
for a number of relevant figures (cross-classification of sex,
region, age group, labour activity status, education, etc.).
Weighting longitudinal data in three steps 2/3
Thus, from the base longitudinal weights and for all the
linkable individuals, the intermediate longitudinal weights
$$ g_i^* = k_i^* \gamma_i^* $$
are obtained as the result of a constrained minimization problem, as
follows:
$$ \sum_{i \in \text{Linkable}} g_{i,rse}^* = \sum_{i=1}^{N_1} w_{i,rse} = {}^{1}P_{rse} $$
$$ \sum_{i \in \text{Linkable}} g_i^* X_i = \sum_{i=1}^{N_1} w_i X_i $$
$$ \min \sum_{i \in \text{Linkable}} D(k_i^* \gamma_i^*, k_i^*) $$
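A calibration of this kind can be sketched with raking (iterative proportional fitting), which corresponds to one particular choice of distance function. The presentation does not say which distance ISTAT uses, so the choice of raking, the function name and the toy margins in the example are all assumptions; the actual constraints involve many more cross-classified totals.

```python
def rake(units, base_weights, margins, max_iter=100, tol=1e-9):
    """Adjust weights so that weighted totals match the target margins.

    units:        dict unit_id -> dict of categorical attributes
    base_weights: dict unit_id -> starting weight
    margins:      dict variable -> dict category -> target total
    Every target category is assumed to contain at least one unit.
    """
    weights = dict(base_weights)
    for _ in range(max_iter):
        largest_gap = 0.0
        for variable, targets in margins.items():
            # Current weighted totals for each category of this variable.
            totals = dict.fromkeys(targets, 0.0)
            for unit, attributes in units.items():
                totals[attributes[variable]] += weights[unit]
            largest_gap = max(
                largest_gap, max(abs(totals[c] - targets[c]) for c in targets)
            )
            # Scale every unit by the ratio target / current total of its category.
            for unit, attributes in units.items():
                category = attributes[variable]
                weights[unit] *= targets[category] / totals[category]
        if largest_gap < tol:
            break
    return weights


if __name__ == "__main__":
    units = {
        1: {"sex": "M", "status": "E"},
        2: {"sex": "M", "status": "U"},
        3: {"sex": "F", "status": "E"},
        4: {"sex": "F", "status": "U"},
    }
    base = {1: 12.0, 2: 8.0, 3: 10.0, 4: 10.0}
    margins = {"sex": {"M": 30.0, "F": 20.0}, "status": {"E": 35.0, "U": 15.0}}
    print(rake(units, base, margins))
```

Raking keeps the adjusted weights positive and proportional to the starting weights within each adjustment, which is why it is a common choice when the constraints are categorical margins.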
Weighting longitudinal data in three steps 3/3
Step 3 :
• Adjusts for bias due to individual non-response
• Makes weighted longitudinal-sample totals conform to the
longitudinal population.
The underlying hypothesis is that
non-response is random within the cells obtained by
cross-classifying the population by gender, age group and NUTS1,
NUTS2 and NUTS3 domains
Weighting longitudinal data in three steps 3/3
only for linked individuals the final longitudinal weights
$$ w_i^* = g_i^* \gamma_i^* $$
are computed by applying a new calibration stage to make
weighted longitudinal component totals
$$ \sum_{i \in \text{Linked}} w_{i,rse}^* = {}^{l}P_{rse} $$
conform to the longitudinal population under the following
constraints
$$ \min \sum_{i \in \text{Linked}} D(g_i^* \gamma_i^*, g_i^*) $$
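If non-response is random within cells, the simplest adjustment consistent with that hypothesis scales the weights of the linked respondents in each cell so that they add up to the longitudinal population of the cell. The sketch below shows only this simplified cell-ratio version, not the calibration with a distance function described above; all names and the toy figures are hypothetical.

```python
from collections import defaultdict


def cell_adjusted_weights(intermediate_weights, cells, responded, cell_population):
    """Return final weights for respondents, scaled to the population total of their cell.

    intermediate_weights: dict unit_id -> intermediate longitudinal weight
    cells:                dict unit_id -> cell identifier (e.g. gender x age group x area)
    responded:            dict unit_id -> True if the unit was linked at the final quarter
    cell_population:      dict cell identifier -> longitudinal population total
    """
    respondent_total = defaultdict(float)
    for unit, weight in intermediate_weights.items():
        if responded[unit]:
            respondent_total[cells[unit]] += weight

    final_weights = {}
    for unit, weight in intermediate_weights.items():
        if responded[unit]:
            cell = cells[unit]
            final_weights[unit] = weight * cell_population[cell] / respondent_total[cell]
    return final_weights


if __name__ == "__main__":
    weights = {"a": 100.0, "b": 100.0, "c": 200.0}
    cells = {"a": "X", "b": "X", "c": "X"}
    responded = {"a": True, "b": False, "c": True}
    print(cell_adjusted_weights(weights, cells, responded, {"X": 360.0}))
```

Non-respondents receive no final weight; their share of the cell total is carried by the respondents of the same cell.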
Flow chart of weighting procedure, Step 2
(Diagram. Elements shown: the population in the initial quarter, made up of the longitudinal population, internal migration, international migration and deaths; the cross-sectional sample with final cross-sectional weights; the sub-sample matchable at the beginning of the period with intermediate longitudinal weights.)
Flow chart of weighting procedure, Step 3
(Diagram. Elements shown: the matchable sub-sample with intermediate weights, split into matched respondents and not matched individuals, that is refusals, unreachable, internal migration, international migration and deaths; the longitudinal population; the final weights.)
Complete Matrix with net and gross flows.
Quarter 1 2007 – Quarter 1 2008. (Thousands)
Labour Status at 2007Q1        Labour Status at 2008Q1                          People Leaving the  Population aged
(Longitudinal Population)      Employed  Unemployed  Inactive    Total  Deaths      Municipalities       15+ 2007Q1
Employed                         20,346         353     1,281   21,980      49                 817           22,846
Unemployed                          489         449       514    1,452       2                 102            1,556
Inactive                          1,260         757    23,131   25,149     495                 377           26,021
Total                            22,095       1,559    24,926   48,581     547               1,296           50,424
Children aged 15                      0           0       584      584
People Entering
the Municipalities                1,075         202       359    1,636
Population aged 15+ 2008Q1       23,170       1,761    25,870   50,801
Net change in employment due to Longitudinal Population flows: +115
Net change in employment due to Demographic flows: -49
Net change in employment due to Migratory flows: +258
Net change in cross-sectional employment: +324
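The decomposition of the net change in employment can be reproduced from the employment figures of the matrix. The sketch below uses only those figures (in thousands); the function and argument names are illustrative.

```python
def net_change_components(start_longitudinal, end_longitudinal,
                          children, deaths, entering, leaving):
    """Split the net change of a cross-sectional stock into its three components."""
    longitudinal = end_longitudinal - start_longitudinal
    demographic = children - deaths
    migratory = entering - leaving
    return {
        "longitudinal": longitudinal,
        "demographic": demographic,
        "migratory": migratory,
        "total": longitudinal + demographic + migratory,
    }


# Employment, 2007Q1 to 2008Q1, thousands.
employment = net_change_components(
    start_longitudinal=21_980,  # employed at 2007Q1 within the longitudinal population
    end_longitudinal=22_095,    # employed at 2008Q1 within the longitudinal population
    children=0,                 # children aged 15 employed at 2008Q1
    deaths=49,                  # employed at 2007Q1 who died
    entering=1_075,             # employed at 2008Q1 who entered the municipalities
    leaving=817,                # employed at 2007Q1 who left the municipalities
)

if __name__ == "__main__":
    print(employment)
```

The total equals the difference between the two cross-sectional stocks, 23,170 minus 22,846 thousand.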
Transition Matrix for longitudinal population.
Quarter 1 2007 – Quarter 1 2008. (Thousands)
Labour Status at 2007Q1        Labour Status at 2008Q1
(Longitudinal Population)      Employed  Unemployed  Inactive    Total
Employed                         20,346         353     1,281   21,980
Unemployed                          489         449       514    1,452
Inactive                          1,260         757    23,131   25,149
Total                            22,095       1,559    24,926   48,581
Persistence in employment: 20,346
Leaving employment: 1,634
Entering employment: 1,749
Net change: +115
Entries plus exits: almost 3,400 movements
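Persistence and transition probabilities follow from the matrix by dividing each flow by the sum of the flows in its row. The sketch below does this for the three rows of the longitudinal transition matrix (figures in thousands); the names are illustrative, and each row total is recomputed from the three flows rather than taken from the rounded Total column.

```python
STATES = ("employed", "unemployed", "inactive")

# Rows: labour status at 2007Q1; columns: labour status at 2008Q1 (thousands).
FLOWS = {
    "employed": (20_346, 353, 1_281),
    "unemployed": (489, 449, 514),
    "inactive": (1_260, 757, 23_131),
}


def transition_probabilities(flows):
    """Row percentages of the flow matrix, rounded to one decimal place."""
    probabilities = {}
    for origin, row in flows.items():
        row_total = sum(row)
        probabilities[origin] = tuple(round(100 * flow / row_total, 1) for flow in row)
    return probabilities


if __name__ == "__main__":
    for origin, row in transition_probabilities(FLOWS).items():
        print(origin, dict(zip(STATES, row)))
```

For people employed at 2007Q1 this gives a persistence probability of 92.6 per cent, with 1.6 per cent moving to unemployment and 5.8 per cent to inactivity.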
Main Findings
Potentials
• Longitudinal data provide extremely useful insights on labour
market dynamics
• They are obtained without significant additional costs, but with a high
investment in methodology
• They can be produced regularly on a quarterly basis
Constraints
• The EU-LFS is not a panel survey, thus longitudinal estimates can refer
only to a specific longitudinal reference population
• Known totals for this longitudinal reference population must be
available for weighting
• Methods for non-response treatment must be used to reduce bias
• Methods to ensure consistency with cross-sectional estimates
must be used
Some Examples of
Analysis of the Labour Market
from Quarter 1 2004 to Quarter 1 2008
using 12-month longitudinal data
Employment: persistence and transition probabilities
by gender and region. 2007Q1 – 2008Q1
(Bar chart, percentage points. Categories: Italy, Male, Female, North, Centre, South. Series: persistence in employment, transition to unemployment, transition to inactivity.)
Women have a lower persistence probability and a
higher transition probability to inactivity
The South has a lower persistence probability and
much higher transition probabilities
Employment: persistence and transition probabilities by
job characteristics. 2007Q1 – 2008Q1
(Bar chart, percentage points. Categories: Total employment; Self Employed - Full time; Self Employed - Part time; Permanent - Employee - Full time; Permanent - Employee - Part time; Temporary - Employee - Full time; Temporary - Employee - Part time. Series: persistence in employment, transition to unemployment, transition to inactivity.)
High segmentation in persistence and transition
Unemployment: transition probability to employment by
duration of search at the starting point
(Chart, percentage points. Periods: 2004Q1-2005Q1, 2005Q1-2006Q1, 2006Q1-2007Q1, 2007Q1-2008Q1. Series: less than 6 months, from 6 to 11 months, 12 or more months.)
Transition probability is inversely correlated with the duration of
search for employment
Opportunities for the long-term unemployed to find employment
are stable over the period
Unemployment: persistence and transition probabilities
by sex and NUTS1 region. 2007Q1 – 2008Q1
(Bar chart, percentage points. Categories: Italy, Male, Female, North, Centre, South. Series: persistence in unemployment, transition to employment, transition to inactivity.)
Higher probability of finding employment for men
Higher probability of leaving the labour force for women
Huge differences in the persistence and transition
probabilities between North and South
THANK YOU
FOR YOUR ATTENTION