The centre sampling technique in
surveys on foreign migrants
The balance of a multi-year experience
Gian Carlo Blangiardo
Università Milano-Bicocca / Fondazione ISMU
The mission
To increase the knowledge of the
phenomenon of foreign migrants in Italy
According to quantitative and qualitative aspects
Such as (four examples):
Example 1: Numerical consistency and juridical status
Migrants from “High Migration pressure Countries (HMCs)” in relation to their juridical status
in the Lombardia region: 2001-2007 (thousands).
[Figure: line chart of migrants in thousands (axis 0-1000) by year, 2001-2007. Series: Residents; Regular but not residents; Irregular; Total.]
Source: Ismu Foundation
Example 2: Monitoring the frequencies of illegal status…
Rates of irregular immigrants (per 100 migrants present from each macro area) in the
province of Milan: 1998-2006
[Figure: line chart of irregularity rates (axis 0-50) at successive survey dates, 1998-2006. Series: Eastern Europe; Other Africa; Asia; Latin America; North Africa; Total.]
Source: Ismu Foundation
Example 3: …and the “recall” and “amnesty” effects of cyclical regularizations
Estimated irregular migrants in Italy (1990-2007 in thousands)
[Figure: line chart of estimated irregular migrants in thousands (axis 0-800), time axis 1988-2008.]
Source: Ismu Foundation
Example 4: Structural aspects
Comparison between legal and illegal migrants in Lombardia (years 2004-2006)

                                                                   Documented   Undocumented
                                                                   migrants     migrants
                                                                   from HMCs    from HMCs
Number of individuals in the sample                                14061        2837
Gender             % with female head of household                 30.6         37.6
Civil status       % single                                        40.7         58.6
                   % married                                       50.0         33.2
Relatives abroad   % spouse abroad (married individuals)           47.9         90.9
                   % children abroad (individuals with children)   51.5         91.3
Education          % no education                                  10.0         12.7
                   % university                                    13.4         11.1
Accommodation      own property                                    12.9         1.1
                   rented flat                                     72.6         59.1
                   hotel                                           0.3          0.5
                   free accommodation                              6.5          17.8
                   c/o job place                                   7.3          15.9
                   irregular accommodation                         0.3          5.5
Employment status  employed                                        86.6         76.8
                   self employed                                   8.6          6.7
                   unemployed                                      4.8          16.4

                                   Documented migrants    Undocumented migrants
                                   mean       median      mean       median
Number of household members        2.14       1           1.30       1
Number of children: total          1.11       1           0.86       0
Number of children: in Italy       0.59       0           0.09       0
Years of permanence in Italy       7.59       6           2.38       2
Age                                34.45      34          31.67      30
Wage                               1120.70    1000        837.22     800

Source: G.C.Blangiardo, F.Fasani, B.Speciale
In order to achieve similar results
Suitable statistics are required
Do the official sources fulfill these needs?
The contribution of Italian official sources
is increasing but still unsatisfactory
Limits
Official sources consider and report only on regular foreign residents,
and do not cover specific characteristics such as structural aspects
or living conditions.
Are sample surveys a valid alternative?
Limits
Generally, no list of the population is available.
This is particularly relevant if the reference universe includes
all immigrants (irrespective of their juridical
status).
The real problem becomes: how to select (at random,
as probabilistic samples require) and contact
the sample units?
The centre sampling method (CS):
a convenient solution (support to official sources)
The basic principle of the CS method is the assumption that each statistical unit
(the migrant) visits at least one local centre of aggregation of some kind
(institutions, places of worship, entertainment venues, care centres, meeting
points, call centres, etc.).
The centres can be divided into two main categories:
- centres where a complete list of participants could be available (e.g. population
  register, language courses, medical and care centres, etc.)
- centres without any list, further divided into:
  - centres with a limited number of participants (e.g. social assistance with a
    fixed number of places/beds)
  - “open” centres with no information available (e.g. squares, parks,
    shopping centres, bars and discos, etc.)
Menonna 2006
Under the CS method we can imagine that the universe of foreign citizens
present in the area at the time of the survey is made up of a list of H
statistical units, each of which necessarily keeps a set of contacts
with some centres or gathering places located in the area. Once a
sufficiently wide and heterogeneous set of ‘centres’ is identified, the
universe of foreign citizens, whose list of names is not available,
can be formally described by the following table:
List of units (unknown)    List of centres possibly attended (known)
Sequence   Names W(i)      Centre 1   Centre 2   Centre 3   …   …   Centre k-1   Centre k
1          a               1          0          0          …   …   0            1
2          b               0          0          1          …   …   0            0
…          …               …          …          …          …   …   …            …
i          …               …          1          0          …   …   1            0
…          …               …          …          …          …   …   …            …
H-1        w               0          1          1          …   …   0            0
H          z               1          1          0          …   …   1            1
Total                      H(1)       H(2)       H(3)       …   …   H(k-1)       H(k)
In each column the value is 1 if the subject visits that centre, and 0 otherwise. (We can also
consider how much time is spent in each centre; in this case attendance can be
formally expressed by a value 0≤X≤1.) It follows that the total of a given column
identifies the number of individuals (among the H constituting the universe of reference)
visiting that centre.
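As a minimal sketch of the table above, the attendance matrix can be written as a list of rows. The five individuals and four centres here are invented for illustration and are not survey data. The column totals play the role of H(1), …, H(k).

```python
# Hypothetical attendance matrix: one row per individual, one column per centre.
# A 1 means the individual visits that centre, a 0 means they do not.
attendance = [
    [1, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 1],
]


def centre_totals(matrix):
    """Column totals H(j): how many individuals visit each centre."""
    return [sum(column) for column in zip(*matrix)]


def centres_per_person(matrix):
    """Row totals: how many centres each individual visits."""
    return [sum(row) for row in matrix]


totals = centre_totals(attendance)
visits = centres_per_person(attendance)
print(totals)  # [2, 3, 2, 2]
print(visits)  # [2, 1, 1, 2, 3]
```

Every row total is at least 1, which is the basic assumption of the method: each individual visits at least one centre.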
This means that, instead of selecting n sample units by rows (i.e. n
names from the unknown list) we can: a) select n columns/centres
(known) and then b) choose randomly n individuals among those
regularly visiting the selected centres.
According to this assumption the preliminary step is to identify all
(or a sufficiently large set of) the centres located in the chosen
territory and visited by the migrants.
Once the set of centres of aggregation in the
territory of interest has been identified, the interview phase can start.
To maintain the representativeness of the sample, it is very important to
choose the individuals at random. This requirement can be satisfied in many
different ways. Let us assume that in the chosen territory there are k centres
visited by the migrants. These centres are of different size. In practice, the
number of interviews in a certain centre depends on its size. If the centre is
considered to be small, a small number of interviewees will be chosen.
Conversely, the bigger the centre and the more migrants visit it, the more
individuals will be interviewed.
In any selected centre the corresponding set of interviewees must be
selected at random among its visitors.
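The size-dependent allocation described above can be sketched as follows. The slides do not state the exact allocation rule, so proportional allocation with a largest-remainder correction is an assumption made here, and the centre sizes are hypothetical.

```python
def allocate_interviews(centre_sizes, n_interviews):
    """Split n_interviews across centres in proportion to their (assumed known) size.

    Proportional allocation is an assumption of this sketch, not a rule taken
    from the slides. The largest-remainder step makes the quotas sum exactly
    to n_interviews.
    """
    total = sum(centre_sizes)
    exact = [n_interviews * size / total for size in centre_sizes]
    quotas = [int(x) for x in exact]
    leftover = n_interviews - sum(quotas)
    # Give the remaining interviews to the centres with the largest fractional parts.
    by_remainder = sorted(
        range(len(exact)), key=lambda j: exact[j] - quotas[j], reverse=True
    )
    for j in by_remainder[:leftover]:
        quotas[j] += 1
    return quotas


# Hypothetical relative sizes of four centres, 100 interviews in total.
quotas = allocate_interviews([500, 300, 150, 50], 100)
print(quotas)  # [50, 30, 15, 5]
```

Within each centre, the quota of interviewees would still have to be drawn at random among the visitors, as the text requires.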
Next, the interviewees (the chosen individuals) are asked to fill in a
questionnaire on their structural
characteristics, both individual and family-related, for example:
sex, age, civil status, citizenship, education, religion, legal
status of stay, residence, housing conditions, economic
activities, remittances, family structure, etc.
They are also asked which of the k centres (indicated on a specific
annex to the questionnaire) they normally visit.
Once the questionnaires are filled in, the foreign citizens are assigned a
profile according to the centres they visit (all individuals who
visit the same centres are given the same profile).
Their individual probability of inclusion in the sample
depends:
1) directly on the number of selected centres the person actually visits;
and
2) inversely on the number of individuals from the population who
visit those centres.
As a consequence, the sample collected by the CS technique is
initially biased.
It must be transformed into an unbiased sample by means of
appropriate weights associated with each sample unit.
In other words, the more centres an individual in the universe visits,
the larger their probability of being interviewed will be.
Consequently, if drawn into the sample, they will be assigned ex-post
a lower weight.
But the ex-post weights also depend on the number of individuals
who visit those centres.
The larger and more visited the centre is, the smaller the inclusion
probability is, and therefore the value of the weight for this individual
is higher.
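A minimal sketch of weights with these two properties follows. It assumes that an individual's relative inclusion probability is the sum, over the centres they visit, of the sampling fraction n(j)/H(j), where n(j) is the number of interviews at centre j. This functional form, the function name and all the numbers are assumptions for illustration. In practice the totals H(j) are unknown and have to be estimated, which this sketch does not attempt.

```python
def cs_weights(profiles, interviews, attendance_totals):
    """Ex-post weights as the inverse of an assumed relative inclusion probability.

    profiles[i][j]       1 if interviewee i visits centre j, else 0
    interviews[j]        number of interviews carried out at centre j
    attendance_totals[j] number of individuals visiting centre j (assumed known here)
    """
    raw = []
    for profile in profiles:
        # Each interviewee visits at least one centre, so p is strictly positive.
        p = sum(u * n / h for u, n, h in zip(profile, interviews, attendance_totals))
        raw.append(1.0 / p)
    # Rescale so that the weights sum to the sample size.
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


profiles = [
    [1, 0],  # visits only centre 1
    [1, 1],  # visits both centres
    [0, 1],  # visits only centre 2
]
interviews = [10, 10]           # hypothetical interviews per centre
attendance_totals = [100, 400]  # hypothetical H(1), H(2)
weights = cs_weights(profiles, interviews, attendance_totals)
```

With these numbers, the person visiting both centres gets the lowest weight, and the person found only at the larger centre gets the highest, matching the two effects described above.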
Finally, it can be shown that, with these weights,
the sample produced by the CS technique can be considered
representative of the whole universe and fully comparable
to a hypothetical traditional simple random sample, for which,
by contrast, the (generally unknown) list of units is strictly
required.
From “who the migrants are” to “how many they are”
When survey results can contribute
to quantitative estimates
It must be pointed out that:
- CS sampling provides a representative sample from which we can
derive estimates of:
1) The rate of foreigners (by citizenship) who were recorded, at the time
of the survey, in the Official Population Register (the so-called
“anagrafe”): i.e. the rate of residents (per 100 migrants present).
2) The rate of foreigners (by citizenship) who had, at the
time of the survey, legal status with respect to residence: i.e. the rate
of regulars (per 100 migrants present).
- Official sources (National Statistical Institute - Istat) publish yearly the
number of foreigners (by citizenship and sex) who are legally recorded
in the Official Population Register of every Italian municipality.
Opportunity for a fruitful marriage?
Sample rates
Official statistics
Quantitative estimates of immigrants according to
juridical status (specification by citizenship and sex)
Number of: residents, regular not residents, irregular
Example
Estimate of Ukrainian migrants in Milan on 1st July 2007
Ukrainians in the population register of Milan
(residents) on 1st July 2007 = 3628
Rate of resident Ukrainians estimated on 1st July 2007 by CS sample
(residents in Milan per 100 migrants present)
= 67.5%
Rate of regular Ukrainians estimated on 1st July 2007 by CS sample
(regulars in Milan per 100 migrants present)
= 79.8%
Estimate of Ukrainians in Milan on 1st July 2007
total present: 3628 / 0.675
= 5375
Of which
irregular: 5375 × (1 - 0.798)
= 1086
regular not resident: 5375 - 3628 - 1086
= 661
residents
= 3628
Source: Ismu Foundation, Regional Observatory for Integration and Multiethnicity
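The arithmetic of this example can be checked with a short sketch. The function name and dictionary keys are invented here; the inputs are the figures quoted in the example. The total present is the register count divided by the resident rate, and the irregular count is the total multiplied by the share that is not regular.

```python
def estimate_by_status(registered_residents, resident_rate, regular_rate):
    """Combine an official register count with CS sample rates.

    registered_residents  count from the population register
    resident_rate         share of those present who are residents (from the CS sample)
    regular_rate          share of those present who are regular (from the CS sample)
    """
    total_present = round(registered_residents / resident_rate)
    irregular = round(total_present * (1 - regular_rate))
    regular_not_resident = total_present - registered_residents - irregular
    return {
        "total_present": total_present,
        "residents": registered_residents,
        "regular_not_resident": regular_not_resident,
        "irregular": irregular,
    }


ukrainians = estimate_by_status(3628, 0.675, 0.798)
print(ukrainians)
# {'total_present': 5375, 'residents': 3628, 'regular_not_resident': 661, 'irregular': 1086}
```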
More than 15 years of field experiences in surveys
by Centre Sampling method (summary)
• Coordinated survey on the foreign presence in Italy 1993-1994 (MURST 40% group:
Universities of Milan, Bologna, Ancona, Rome, Turin, Latina, Naples) (3000 units).
• IReR - OETAMM surveys: Milan metropolitan area 1991 and 1992 (500 units); Monza
1992 (200 units); Brescia 1993 (300 units).
• NIDI - Eurostat/IRP-CNR survey, 1997 (1000 units; Milan, Rome, Caserta, Modena,
Vicenza)
• I.S.MU Observatory surveys: Milan 1996-2000 (1000 units a year); Province of Milan
1997-2000 (2000 units a year); Province of Lodi 1999 and 2001 (500 units a year); Province of
Mantova 2000-2001 (500 units a year); Province of Varese 2000 (500 units); Province of
Cremona 2000 (500 units); Province of Lecco 2000 and 2001 (500 units a year).
• Regional Observatory of Lombardia 2001-2008 (8000 units a year, 9000 units since
2006)
• ISMU - Ministry of Welfare research 2005 (30000 units)
• Provincial Observatory of Biella 2006 (500 units)
• Provincial Observatory of Venice 2007 (800 units)
• Provincial Observatory of Cuneo 2007 (1000 units)
• University of Trento, Department of Sociology - Eurostat 2007 (900 units)
Methodological Reference for CS method
Baio G. (University College London),
Blangiardo G.C. (Università di Milano-Bicocca),
Blangiardo M., (Imperial College London),
Centre sampling technique in foreign migration surveys.
A methodological note
(forthcoming)
Summary
1. Center sampling methodology
1.1 Framework of analysis
1.2 Identification of the weights
1.3 Estimating the proportion of each profile
1.4 Allocating the sample size among K centres: impact on the computation of
the weights
1.5 Relaxing the assumption on ex-ante knowledge of the relative importance
of each centre
2. Example: estimating the Egyptians in Milan
Thanks for your attention