Design and Use of the IPUMS-International Data Series
http://international.ipums.org
Matt Sobek
Minnesota Population Center
sobek@pop.umn.edu

IPUMS-International
• Overview
• Processing
• Dissemination system
• Strengths and limitations
• Users

What is IPUMS-International?
• Census data – 1960 to present
• Samples – 1 to 10%, nationally representative
• Microdata – individual-level
• Integrated – consistent codes across time and place
• Downloadable – anonymized
• Extract system – select variables – pooled data

Map of IPUMS Partners
Dark green = disseminating data
Light green = partners, not yet disseminating
83 countries

Current Countries in IPUMS
Africa: Egypt, Ghana, Guinea, Kenya, Rwanda, South Africa, Uganda
Asia: Armenia, Cambodia, China, India, Iraq, Israel, Jordan, Kyrgyz Rep., Malaysia, Mongolia, Palestine, Philippines, Vietnam
Americas: Argentina, Bolivia, Brazil, Canada, Chile, Colombia, Costa Rica, Ecuador, Mexico, Panama, United States, Venezuela
Europe: Austria, Belarus, France, Greece, Hungary, Italy, Netherlands, Portugal, Romania, Slovenia, Spain, United Kingdom
44 countries, 130 samples, 279 million persons

Countries in IPUMS Archive
Bangladesh, Botswana, Cuba, Czech Republic, Dominican Rep., El Salvador, Ethiopia, Fiji, Germany, Guatemala, Haiti, Honduras, Indonesia, Liberia, Madagascar, Malawi, Mali, Mauritius, Nepal, Nicaragua, Pakistan, Paraguay, Peru, Puerto Rico, Senegal, Saint Lucia, Sierra Leone, Sudan, Switzerland, Tanzania, Thailand, Turkmenistan, Uruguay, Zambia

IPUMS Microdata
[Figure] Labeled fields: Relation to head, Marital status, Literacy, Occupation

Availability of Selected Person Variables (Number of samples)
Variable                     Samples
Relationship to head         130
Age                          130
Sex                          130
Marital status               129
Age at first marriage        16
Children ever born           91
Children surviving           59
Mother's mortality status    16
Country of birth             81
Place of birth               90
Citizenship                  67
Year of immigration          22
Migration, international     53
Migration, internal          101
Disability                   32
Religion                     54
Language                     33
Ethnicity                    41
Race                         20
School attendance            105
Literacy                     91
Education attainment         119
Years of schooling           72
Employment status            119
Class of worker              120
Occupation                   116
Industry                     116
Hours worked weekly          38
Total income                 24
Earned income                26

Availability of Selected Household Variables (Number of samples)
Variable                     Samples
Urban-rural status           89
Geography, 1st level         120
Geography, 2nd level         86
Electricity                  81
Water                        95
Sewage                       76
Home ownership               107
Toilet                       86
Number of rooms              102
Cooking fuel                 39
Floor material               46
Telephone                    57
Wall material                40
Television                   45
Roof material                27
Computer                     16
Living Area                  20
Automobiles                  42

536 Integrated variables
10,600 Unharmonized variables

User Access
Application
• Scholarly and educational purposes
• Key: it must not be redistributed
Once approved, access to all data
Free

Making the IPUMS
Pre-processing
• Language translation
• Reformatting
• Error correction
• Sampling
• Confidentiality
Integration
• Metadata
• Data harmonization
• Constructed variables
Dissemination

Census Questionnaire (Mexico 2000)
[Figure] Water Access

Text of Census Questionnaire (Mexico 2000)
5. Number of Rooms
How many rooms are used for sleeping without counting hallways? _____ Write the number
Without counting the hallways or bathrooms how many total rooms are in this dwelling? Count the kitchen _____ Write the number
6. Access to water
Read all of the options until you get an affirmative answer. Circle only one answer
1 Running water inside the dwelling
2 Running water outside the dwelling but on the land
3 Running water from a public faucet or hydrant
4 Running water that is carried from another dwelling
5 Tanked in by truck
6 Water from a well, river, lake, stream or other
Answers 3, 4, 5, 6 continue with number 8
7. Water supply
How many days of the week is water available? Circle only one answer
1 Daily
2 Every third day
3 Twice a week
4 Once a week
5 Occasionally

XML-Tagged Census Questionnaire (Mexico 2000)
[Figure] Water access

Data Integration – Marital Status
MARST  Marital Status            China 1982        Colombia 1973        Kenya 1989      Mexico 1970             U.S.A. 1990
code   label                     CN82A403          CO73A411             KN89A413        MX70A402                US90A425
100    SINGLE/NEVER MARRIED      1=never married   4=single             1=single        9=single                6=never married
200    MARRIED/IN UNION
210    Married (not specified)   2=married         2=married            3=monogamous                            1=married
211    Civil                                                                            3=only civil
212    Religious                                                                        4=only religious
213    Civil and religious                                                              2=civil and religious
214    Polygamous                                                       3=polygamous
220    Consensual union                            1=free union                         5=free union
300    SEPARATED/DIVORCED                          3=sep. or divorced
310    Separated                                                        6=separated     8=separated             3=separated
321    Legally separated
322    De facto separated
330    Divorced                  4=divorced                             5=divorced      7=divorced              4=divorced
400    WIDOWED                   3=widowed         5=widowed            4=widowed       6=widowed               5=widowed
999    UNKNOWN/MISSING           0=missing         6=unknown            B=blank         1=unknown

Family Interrelationship Variables (Simple household)
Pernum  Relate  Age  Sex     Marst    Chborn  Spouse’s Location
1       head    46   male    married  n/a     2
2       spouse  44   female  married  3       1
3       aunt    77   female  widow    7       0
4       child   15   female  single   0       0
5       child   13   female  single   n/a     0
6       child   11   male    single   n/a     0

Pernum  Relate  Age  Sex     Marst    Chborn  Mother’s Location  Father’s Location
1       head    46   male    married  n/a     0                  0
2       spouse  44   female  married  3       0                  0
3       aunt    77   female  widow    7       0                  0
4       child   15   female  single   0       2                  1
5       child   13   female  single   n/a     2                  1
6       child   11   male    single   n/a     2                  1

IPUMS “Pointer” Variables (Complex household)
Pernum  Relationship  Age  Sex     Marst      Chborn  Spouse’s Location  Mother’s Location  Father’s Location
1       head          53   female  separated  6       0                  0                  0
2       child         28   male    single     n/a     0                  1                  0
3       child         22   male    single     n/a     0                  1                  0
4       child         21   male    single     n/a     0                  1                  0
5       child         25   female  married    2       6                  1                  0
6       child-in-law  28   male    married    n/a     5                  0                  0
7       grandchild    3    male    single     n/a     0                  5                  6
8       grandchild    1    male    single     n/a     0                  5                  6
9       non-relative  32   female  separated  2       0                  0                  0
10      non-relative  10   male    single     n/a     0                  9                  0
11      non-relative  5    female  single     n/a     0                  9                  0

Family Interrelationship Pointers
13 censuses include data on location of parent or spouse
Agree Disagree Under age 18
Spouse 99.5 0.5
Mother 98.7 1.3
Father 99.4 0.6
Mother 97.5 2.5
Father 98.7 1.3

IPUMS Home Page
Variables Page
Variables Page Sample Filtering
Variables Page Unharmonized Variables
Variable Description (Marital status)
Comparability Discussion (Marital status)
Enumeration Text (Marital status)
Enumeration Text (Marital status, Cambodia)
Variable Codes (Marital status)

IPUMS Home Page
Extract Step 1 – Login
Extract Step 2 – Select Samples
Extract Step 3 – Select Variables
Extract Step 4 – Variable Options
Extract Step 4 – Select Cases
Extract Step 4 – Attach Characteristics
• Age of spouse
• Employment status of father
• Occupation of father
Extract Step 5 – Customize Sample Sizes
Extract Step 6 – Submit
Download or Revise Extract

Key Strengths of the Census Samples
• Large
  Enable study of relatively small populations
• Internationally comparable
  Pool data across countries – integrated variables
• Temporal depth
  Provide historical perspective
• Microdata
  All of a person’s characteristics – multivariate analysis
• Hierarchical
  Characteristics of everyone a person resided with
  Cohabitation and family interrelationships

Limitations Due to Confidentiality
• Samples
  Too small to answer some questions
• Geography
  20,000 population or larger
• Sensitive variables, very small categories

Other Issues and Limitations
• Cross-sectional data
  Not longitudinal
• User burden
  Information overload; culturally specific knowledge
  Variable labels are insufficient

IPUMS Users
2200 registered users
Academic field (%)
47 Economics
21 Demography
10 Sociology
22 Other
54% Graduate students

Samples Extracted
67% multiple samples
45% multiple countries
17% 5 or more countries

Decade of Extracted Sample
Decade  Percent
1960s   11
1970s   14
1980s   16
1990s   30
2000s   29

Most Frequently Extracted Countries
1. Mexico
2. Brazil
3. United States
4. Colombia
5. France
6. Chile
7. Ecuador
8. Vietnam
9. Kenya
10. Argentina

Most Frequently Extracted Variables
Relation to head
Age
Sex
Marital status
Educational attainment
Years of schooling
School attendance
Literacy
Employment status
Class of worker
Occupation recode
Industry recode
Occupation
Industry
Urban-rural status
Country of birth
Nativity status
Migration status, 5 years
Children ever born
Children surviving
Religion
Ownership of dwelling
Water
Electricity
Sewage
Number of rooms
Toilet
Earned income
Total income
Spouse’s location in household

Median Age by Country
Italy           42    Chile         29    Kyrgyz Republic  22
Greece          39    Argentina     27    Mongolia         21
Austria         38    Israel        27    Philippines      21
Hungary         38    Brazil        25    Bolivia          20
Portugal        38    China         25    Egypt            20
Canada          37    Colombia      25    Jordan           20
France          37    Costa Rica    24    Ghana            19
Netherlands     37    Mexico        24    Cambodia         17
Slovenia        37    Panama        24    Guinea           17
Spain           37    South Africa  24    Iraq             17
United Kingdom  37    Ecuador       23    Kenya            17
Belarus         36    Malaysia      23    Palestine        17
United States   36    Venezuela     23    Rwanda           17
Romania         35    Vietnam       23    Uganda           15
Armenia         31    India         22
(Calculated from the most recent sample from each country.)

Population Pyramids
[Figure] Panels: Palestine, Egypt, Iraq

Population Pyramids
[Figure] Panels: Young (Uganda 2002), Medium (Philippines 2000), Old (USA 2005)

Population Pyramids
[Figure] Panels: Belarus, Cambodia, China; years as labeled: 1998, 1998, 1990

Population Pyramids
[Figure] Mexico. Panels: 1960, 1990, 2005

Married Female Labor Force Participation in Latin America (age 18 to 65)
[Figure] Y-axis: Percent in Labor Force. X-axis: 1960 to 2005. Series: Brazil, Colombia, Venezuela, Chile, Mexico, Costa Rica, Ecuador

Married Female Labor Force Participation: Latin America and U.S. (age 18 to 65)
[Figure] Y-axis: Percent in Labor Force. X-axis: 1920 to 2010. Series: United States, Latin America

Married Female Labor Force Participation: Latin America and U.S. (age 18 to 65)
[Figure] Y-axis: Percent in Labor Force. X-axis: 1920 to 2010. Series: United States, Brazil, Colombia, Venezuela, Ecuador, Chile, Costa Rica, Mexico
Compare Latin America to U.S. 40 years earlier

Married Female Labor Force Participation: Mexican-born Women, 1970-2000
[Figure] Y-axis: Percent in Labor Force. X-axis: 1970 to 2000. Series: Mexican-born Women in United States, Women in Mexico

Working-Age Population in the Labor Force, by Sex
Persons age 16 to 65.
[Figure] Y-axis: Percent of Working-Age Population. Series: Males, Females. Samples shown:
United States 1960, 1970, 1980, 1990, 2000
France 1962, 1968, 1975, 1982, 1990
South Africa 1996, 2001
Kenya 1989, 1999
Vietnam 1989, 1999
China 1982
Venezuela 1971, 1981, 1990
Mexico 1970, 1990, 2000
Ecuador 1962, 1974, 1982, 1990, 2001
Costa Rica 1963, 1973, 1984, 2000
Colombia 1964, 1973, 1985, 1993
Chile 1960, 1970, 1982, 1992, 2002
Brazil 1960, 1970, 1980, 1991, 2000

Population Residing with an Elderly Person
[Figure] Y-axis: Percent of total population. Series: Elderly persons (age 65+), Non-elderly residing with an elderly person. Countries: Brazil, Colombia, Mexico, Kenya, S Africa, China, Vietnam, France, United States

Percent of elders in elder-head intergenerational families
[Figure] Y-axis: Percent. X-axis: 1970 to 2000. Series: Argentina, Brazil, Chile, Colombia, Costa Rica, Ecuador, Kenya, Mexico, Philippines, Romania, Rwanda, Vietnam, South Africa, Uganda, Venezuela

Percent of elders in younger-head families
[Figure] Y-axis: Percent. X-axis: 1970 to 2000. Series: Argentina, Brazil, Chile, Colombia, Costa Rica, Ecuador, Kenya, Mexico, Philippines, Romania, Rwanda, Vietnam, South Africa, Uganda, Venezuela

Trends in Intergenerational Families
Intergenerational families headed by the older generation are becoming more common in most countries, with exceptions mainly in Africa.
Intergenerational families headed by the younger generation (the configuration that suggests old-age support) are much rarer, and they are on the decline in most countries.

Persons with Completed Secondary Education: National Populations Versus Migrants to the United States
[Figure] Y-axis: Percent. Countries: Brazil, Chile, Costa Rica, Ecuador, Mexico, Vietnam, Kenya, South Africa. Series: In home country, ca. 2000; Migrants to U.S. 1995-2000
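
Worked example: using a location pointer to attach a spouse's characteristic

The "Spouse's Location" column in the simple-household table earlier in this deck holds the person number (Pernum) of each person's spouse, with 0 meaning no spouse in the household. The sketch below shows one way such a pointer could be followed to attach "Age of spouse", as offered in the Attach Characteristics step. It is a minimal illustration and not the extract system's actual implementation; the `Person` class, the field name `sploc`, and the function `attach_spouse_age` are hypothetical names chosen for this example. The household rows are copied from the simple-household table.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Person:
    pernum: int  # person number within the household
    relate: str  # relationship to head
    age: int
    sploc: int   # hypothetical name for "Spouse's Location"; 0 = no spouse present


# Rows from the simple-household table in this deck.
household: List[Person] = [
    Person(1, "head", 46, 2),
    Person(2, "spouse", 44, 1),
    Person(3, "aunt", 77, 0),
    Person(4, "child", 15, 0),
    Person(5, "child", 13, 0),
    Person(6, "child", 11, 0),
]


def attach_spouse_age(persons: List[Person]) -> Dict[int, Optional[int]]:
    """Map each pernum to the age of the person its spouse pointer refers to."""
    by_pernum = {p.pernum: p for p in persons}
    result: Dict[int, Optional[int]] = {}
    for p in persons:
        # A pointer of 0 matches no pernum, so the lookup returns None.
        spouse = by_pernum.get(p.sploc)
        result[p.pernum] = spouse.age if spouse is not None else None
    return result


spouse_age = attach_spouse_age(household)
for p in household:
    print(p.pernum, p.relate, p.age, spouse_age[p.pernum])
```

The same lookup pattern would apply to the mother and father pointers, for example to attach a father's occupation to each child's record.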