The centre sampling technique in surveys on foreign migrants
The balance of a multi-year experience
Gian Carlo Blangiardo
Università Milano-Bicocca / Fondazione ISMU

The mission
To increase knowledge of the phenomenon of foreign migrants in Italy, in both its quantitative and qualitative aspects, such as (four examples):

Example 1: Numerical consistency and juridical status
[Figure: Migrants from "High Migration Pressure Countries (HMCs)" by juridical status in the Lombardia region, 2001-2007 (thousands). Series: Residents; Regular but not residents; Irregular; Total. Source: Ismu Foundation]

Example 2: Monitoring the frequencies of illegal status…
[Figure: Irregular immigrant rates (per 100 present in each macro area) in the province of Milan, 1998-2006. Series: Eastern Europe; Asia; North Africa; Other Africa; Latin America; Total. Source: Ismu Foundation]

Example 3: …and the "recall" and "amnesty" effects of cyclic regularization
[Figure: Estimated irregular migrants in Italy, 1990-2007 (thousands). Source: Ismu Foundation]

Example 4: Structural aspects
Comparison between legal and illegal migrants in Lombardia (years 2004-2006)

                                                   Documented    Undocumented
                                                   migrants      migrants
                                                   from HMCs     from HMCs
Number of individuals in the sample                14061         2837
Gender
  % with female head of household                  30.6          37.6
Civil status
  % single                                         40.7          58.6
  % married                                        50.0          33.2
Relatives abroad
  % spouse abroad (married individuals)            47.9          90.9
  % children abroad (individuals with children)    51.5          91.3
Education
  % no education                                   10.0          12.7
  % university                                     13.4          11.1
Accommodation (%)
  own property                                     12.9          1.1
  rented flat                                      72.6          59.1
  hotel                                            0.3           0.5
  free accommodation                               6.5           17.8
  c/o job place                                    7.3           15.9
  irregular accommodation                          0.3           5.5
Employment status (%)
  employed                                         86.6          76.8
  self employed                                    8.6           6.7
  unemployed                                       4.8           16.4

                                   Documented           Undocumented
                                   mean      median     mean      median
Number of household members        2.14      1          1.30      1
Number of children: total          1.11      1          0.86      0
Number of children: in Italy       0.59      0          0.09      0
Years of permanence in Italy       7.59      6          2.38      2
Age                                34.45     34         31.67     30
Wage                               1120.70   1000       837.22    800

Source: G.C. Blangiardo, F. Fasani, B. Speciale

In order to achieve similar results
Suitable statistics are required.
Do the official sources fulfil these needs?

The contribution of Italian official sources is increasing but still unsatisfactory
Limits: Official sources consider, and give information about, only regular foreign residents, without addressing specific characteristics such as structural aspects or living conditions.

Are sampling surveys a valid alternative?
Limits: Generally, no list of the population is available. This is particularly relevant if the reference universe includes all immigrants (irrespective of their juridical status).
The real problem becomes: how to select (at random, as required by probabilistic samples) and contact the sample units?

The centre sampling method (CS): a convenient solution (support to official sources)
The basic principle of the CS method assumes that each statistical unit (the migrant) visits at least one local centre of aggregation of some kind (institutions, places of worship, entertainment, care centres, meeting points, call centres, etc.).
The centres can be divided into two main categories:
- centres where a complete list of participants could be available (e.g. population register, language courses, medical and care centres, etc.)
- centres without any list, with the further distinction between:
  - centres with a limited number of participants (e.g. social assistance with a standard number of places/beds)
  - "open" centres with no information available (e.g. squares, parks, shopping centres, bars and discos, etc.).
(Menonna 2006)

Under the CS method we can imagine that the universe of foreign citizens present in the area at the time of the survey is made up of a list of H statistical units, each of which necessarily keeps a set of contacts with some centres or gathering places located in the area.
Once a sufficiently wide and heterogeneous set of 'centres' is identified, the universe of foreign citizens, whose nominative list is not available, can be formally described by the following table:

List of units (unknown)     List of centres possibly attended (known)
Sequence   Names W(i)       Centre 1   Centre 2   Centre 3   …   Centre k-1   Centre k
1          a                1          0          0          …   0            1
2          b                0          0          1          …   0            0
…          …                …          …          …          …   …            …
i          …                …          1          0          …   1            0
…          …                …          …          …          …   …            …
H-1        w                0          1          1          …   0            0
H          z                1          1          0          …   1            1
Tot.                        H(1)       H(2)       H(3)       …   H(k-1)       H(k)

In each column the value is 1 if the subject visits that centre, and 0 otherwise (we can also consider "how much time" is spent in each centre; in this case attendance can be formally expressed by a value 0≤X≤1).
It follows that the total of a given column identifies the number of individuals (among the H constituting the universe of reference) visiting that centre.
This means that, instead of selecting n sample units by rows (i.e. n names from the unknown list), we can:
a) select columns/centres (known) and then
b) choose randomly n individuals among those regularly visiting the selected centres.

Accordingly, the preliminary step is to identify all (or a sufficiently large set of) the centres located in the chosen territory and visited by the migrants.
Once the set of centres of aggregation in the territory of interest has been identified, the interview phase can start.
To maintain the representativeness of the sample, it is very important to choose the individuals at random. This requirement can be satisfied in many different ways.

Let us assume that in the chosen territory there are k centres visited by the migrants. These centres are of different sizes. In practice, the number of interviews in a given centre depends on its size: in a small centre, a small number of interviewees will be chosen; the bigger the centre, and the more migrants visit it, the more individuals will be interviewed.
In any selected centre the corresponding set of interviewees must be selected at random among its visitors.

The interviewees are then asked to complete a questionnaire about their structural characteristics, both individual and family ones, for example: sex, age, civil status, citizenship, education, religion, legal status of stay, residence, housing conditions, economic activities, remittances, family structure, etc.
They are also asked which of the k centres (listed in a specific annex to the questionnaire) they normally visit.
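The selection procedure described above can be illustrated with a small simulation. The sketch below is only a toy under stated assumptions: in a real survey the attendance table is unknown, whereas here it is given as input; the function name `draw_centre_sample`, the proportional allocation of interviews to centre size, and the six-person universe are all hypothetical choices made for illustration, not the procedure used in the surveys reported in these slides.

```python
import random


def draw_centre_sample(attendance, n, seed=0):
    """Draw interviews centre by centre from a simulated attendance table.

    attendance: one row per individual, one 0/1 flag per centre
                (1 = the individual visits that centre).
    n:          target number of interviews overall.
    """
    rng = random.Random(seed)
    k = len(attendance[0])

    # Column totals H(1), ..., H(k): number of visitors of each centre.
    totals = [sum(row[j] for row in attendance) for j in range(k)]

    # Assumption: interviews are allocated in proportion to centre size.
    # Because of rounding, the quotas need not add up to exactly n.
    grand_total = sum(totals)
    quotas = [round(n * h / grand_total) for h in totals]

    sample = []
    for j in range(k):
        visitors = [i for i, row in enumerate(attendance) if row[j] == 1]
        # Random selection among the visitors of centre j.
        for i in rng.sample(visitors, min(quotas[j], len(visitors))):
            sample.append({
                "unit": i,
                "interviewed_at": j,
                # Attendance profile, as collected in the questionnaire annex.
                "profile": tuple(attendance[i]),
            })
    return totals, quotas, sample


# Toy universe: H = 6 individuals, k = 3 centres (illustrative values only).
universe = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
totals, quotas, sample = draw_centre_sample(universe, n=4)
print(totals)  # [4, 2, 2]
print(quotas)  # [2, 1, 1]
```

In this sketch an individual who visits several centres can be drawn at more than one of them, which is one way to see why people attending many centres are more likely to end up in the sample.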
Once the questionnaires are filled in, the foreign citizens are given a profile according to the centres they visit (all individuals who visit the same centres are given the same profile).
Their individual probability of inclusion in the sample depends:
1) directly on the number of selected centres the person actually visits; and
2) inversely on the number of individuals from the population who visit those centres.
As a consequence, the sample collected by the CS technique is originally biased. It must be transformed into an unbiased sample by means of appropriate weights associated with each sample unit.

In other words, the more centres an individual in the universe visits, the larger his or her probability of being interviewed. Consequently, if drawn into the sample, he or she will be associated ex post with a lower weight.
But the ex-post weights also depend on the number of individuals who visit those centres. The larger and more visited the centre is, the smaller the inclusion probability, and therefore the higher the weight for this individual.
Finally, it can be shown that with these weights the sample produced by the CS technique can be considered representative of the whole universe and fully comparable to a hypothetical traditional simple random sample, for which, by contrast, the (generally unknown) list of units is strictly required.

From "who are the migrants" to "how many they are"
When survey results can contribute to quantitative evaluations
It must be pointed out that:
- CS sampling makes available a representative sample from which we can derive estimates of:
1) The rate of foreigners (by citizenship) who were recorded, at the time of the survey, in the Official Population Register (the so-called "anagrafe"): i.e. the rate of residents (per 100 present).
2) The rate of foreigners (by citizenship) who held, at the time of the survey, legal status with respect to residence: i.e. the rate of regulars (per 100 present).
- Official sources (National Statistical Institute - Istat) produce yearly the number of foreigners (by citizenship and sex) who are legally recorded in the Official Population Register of any Italian municipality.

Opportunity for a fruitful marriage?
Sample rates combined with official statistics yield quantitative estimates of immigrants according to juridical status (specification by citizenship and sex).
Number of: residents, regular not residents, irregular.

Example: Estimate of Ukrainian migrants in Milan on 1st July 2007
Ukrainians in the population register of Milan (residents) on 1st July 2007 = 3628
Rate of resident Ukrainians estimated on 1st July 2007 by CS sample (residents in Milan per 100 present) = 67.5%
Rate of regular Ukrainians estimated on 1st July 2007 by CS sample (regulars per 100 present) = 79.8%
Valuation of Ukrainians in Milan on 1st July 2007:
total present: 3628 / 0.675 = 5375
of which irregular: 5375 × (1 - 0.798) = 1086
regular not resident: 5375 - 3628 - 1086 = 661
residents = 3628
Source: Ismu Foundation, Regional Observatory for Integration and Multiethnicity

More than 15 years of field experience in surveys by the Centre Sampling method (summary)
• Coordinated survey on the foreign presence in Italy, 1993-1994 (MURST 40% group: Università di Milano, Bologna, Ancona, Roma, Torino, Latina, Napoli) (3000 units).
• IReR - OETAMM surveys: Milan metropolitan area 1991 and 1992 (500 units); Monza 1992 (200 units); Brescia 1993 (300 units).
• NIDI survey - Eurostat/IRP-CNR, 1997 (1000 units; Milano, Roma, Caserta, Modena, Vicenza)
• I.S.MU Observatory surveys: Milano 1996-2000 (1000 units a year); Province of Milano 1997-2000 (2000 units a year); Province of Lodi 1999 and 2001 (500 units a year); Province of Mantova 2000-2001 (500 units a year); Province of Varese 2000 (500 units); Province of Cremona 2000 (500 units); Province of Lecco 2000 and 2001 (500 units a year).
• Regional Observatory of Lombardia 2001-2008 (8000 units a year, 9000 units since 2006)
• ISMU - Ministero del Welfare research 2005 (30000 units)
• Provincial Observatory of Biella 2006 (500 units)
• Provincial Observatory of Venezia 2007 (800 units)
• Provincial Observatory of Cuneo 2007 (1000 units)
• Università di Trento, Dipartimento di Sociologia - Eurostat 2007 (900 units)

Methodological reference for the CS method
Baio G. (University College London), Blangiardo G.C. (Università di Milano-Bicocca), Blangiardo M. (Imperial College London), Centre sampling technique in foreign migration surveys. A methodological note (forthcoming)
Summary
1. Centre sampling methodology
1.1 Framework of analysis
1.2 Identification of the weights
1.3 Estimating the proportion of each profile
1.4 Allocation of the sample size into K centres: impact on the computation of the weights
1.5 Relaxing the assumption on ex-ante knowledge of the relative importance of each centre
2. Example: estimating the Egyptians in Milan

Thanks for your attention