Overview of European Data from Official Statistics

Roxane Silberman, CNRS/Réseau Quetelet and DwB coordinator
With the support of Cyril Jayet, Marie Cros, Raphaëlle Fleureux, Alexandre Kych (CNRS-RQ)

DwB training course: Working with data from Official Statistics, particularly the Longitudinal SILC
Paris, GENES, 3rd DwB Training Course, February 19-21, 2014

Aim of this presentation
• The main course is on longitudinal SILC, i.e. European microdata provided by Eurostat
• Eurostat microdata are entirely based on national microdata
  o Links between the national and the European level are increasing, but differences remain
  o Using the European microdata requires understanding these links, as this raises important methodological issues
  o National microdata still offer rich resources for comparative research
• 3 objectives
  o Provide an understanding of how the European microdata are constructed
  o Give a comprehensive overview of the official microdata available for comparative research at both European and national level, which are still underused
    - More than Eurostat microdata at European level
    - Related microdata available at national level: more detailed, with other variables, and sometimes over a longer period (e.g. SILC)
    - Other microdata available at national level (e.g. SILC-related topics)
  o How to locate and access these microdata, and the new tools DwB will offer
    - An overview
    - Focus on transnational access to confidential microdata (M. Isnard presentation)

Outline
• National and European official microdata: terminology and historical backgrounds
• Overview of official microdata in Europe
  o Integrated European microdata (Eurostat)
  o Other European microdata
  o National microdata collected and harmonized in European databases
  o Other national microdata (with some focus on topics related to SILC)
• How to locate and access official microdata within Europe?
  o Metadata
  o Transnational access
  o DwB support and new tools

I. National and European official microdata
Terminology and historical backgrounds
• Official microdata? A vast perimeter, moving and covering different types of microdata
• Historical backgrounds for European microdata
• Differences in national systems for official microdata
• Consequences for research

Official microdata? A vast perimeter …
• Different words (official, government, national, etc.)
  o In any case, statistics provided by government bodies
  o A larger perimeter than NSIs and Eurostat
• Microdata provided by:
  o National level
    - National Statistical Institute (NSI)
    - National statistical administrations coordinated by the NSI + tax data
    - Central banks
    - Government agencies, particularly those in charge of social security, health, pensions …
    - Local authorities, increasingly
    - The number of government bodies producing and providing data varies with the organization of each statistical system and its degree of centralization
  o European level
    - Eurostat
    - European Central Bank
    - European Commission and agencies

… moving
• The perimeter may change according to decisions of governments/NSIs
  o E.g. France: the Customs statistical department went out and came back in; tax data were recently included as a statistical department coordinated by INSEE; Cereq microdata on the transition from school to work went out …
  o E.g. Household Finance and Consumption Surveys (HFCS) moved to central banks, coordinated by the ECB
• Mixed status for some government agencies (social security, unemployment …)
• What about data to be certified by NSIs?
• Under different legal frameworks (surveys, administrative data, business data, fiscal data, health data, financial data …), with consequences for access
• The perimeter does not necessarily cover similar data in the different countries, depending on the role of other producers (universities …) and on historical changes (Eastern countries recently joining the EU)

… and covering different types of microdata
• Censuses or registers + longitudinal samples from censuses
• Surveys (including some panels)
  o Individual and household surveys
  o Business surveys
• Administrative data (frequently longitudinal databases)
  o Individual
  o Business
• Combined datasets
  o Administrative datasets
    - More common in register-based countries
    - Now increasingly the case in all countries (yet requiring a common identifier)
  o Administrative data and surveys: more and more common for longitudinal data
• Raising many issues in terms of metadata and access

Historical background for European microdata
• Increasing harmonization at international level since WW2, led by international organisations (UN, OECD …), mostly by encouragement and persuasion
• The European Union framework is specific, as there is a political and legal framework
  o Development of the European Statistical System (ESS) started with the European Coal and Steel Community (ECSC, CECA in French) during the 1950s
  o Progressively developed over 30 years within a somewhat unclear framework (the Commission also relying on other sources)
  o New start and developments since the 1990s towards more autonomy and integration
  o A legal framework establishing the ESS in 2009 (Regulation No 223/2009)
    - Rather recent
    - As a partnership between the Commission statistical authority (Eurostat), the NSIs and other national statistical authorities, in cooperation with the ECB
    - In compliance with the principle of subsidiarity
• Main focus on indicators, less on microdata for research analysis
• Subsidiarity principle at the core of the ESS
• Harmonization and integration process growing: 80% of the national data linked to European requests
• Yet persistent difficulties …

… relying on national official statistical systems built in different ways
• National statistical systems pre-existed the ESS
• Built in quite different ways through history, from diverse sources and bodies, piece by piece
• Toward an increasing coordination role of the NSI
  o Yet keeping traces of this construction, even in centralized statistical systems where some bodies still remain apart
  o Central banks in general apart
• Important differences
  o Survey countries and register countries
  o Centralized vs decentralized/coordinated
  o Political systems
    - Regional autonomy (Spain, UK and Scotland)
    - Federal system (Germany and the Länder)
    - France and overseas departments
    - National and local authorities gaining importance
    - Recent political changes (Eastern countries)
• Resulting in differences also in type, number and variety of datasets

Consequences for working with data
• Eurostat microdata are increasingly rich and underused resources for comparative research
  o However, they still raise a number of methodological issues
• Other resources are important for comparative research at both European and national level
  o Some other European sources, integrated or post-harmonized
  o Related national parts may also offer, in some countries, more variety, a wider historical perspective, more questions and more detailed microdata
    - However, the harmonization process may affect the series and cause breaks in the series at national level
  o Other national datasets available for comparative research, though not harmonized
• Researchers face difficulties and “silos” for information and access at both national and European level
  o Dissemination by European bodies, yet access to the national parts decided by countries
  o Not all integrated microdata include all countries
  o Access still burdensome even under the new regulation (more in M. Isnard presentation)
  o At national level
    - Information and access more or less fragmented, depending on the degree of centralisation of production and dissemination (NSIs, data archives)
    - Transnational access depending on the legal framework

II. European and national microdata
Three subsets from a European perspective
• European integrated microdata
• National microdata harmonized at European level
• Other national microdata for comparative research

European integrated microdata
• Produced and provided at national level, then integrated and provided at European level by European government bodies
  o Reminder: the national part may differ between the national level and the European level
  o Some are pre-existing surveys that integrated harmonization requirements
• Eurostat
• European Central Bank
• European Commission and other European government bodies
• Others under EU regulations or Eurostat recommendations

European integrated microdata: Eurostat microdata
A growing number of datasets
• European Union Labour Force Survey (LFS)
• European Community Household Panel (ECHP)
• Statistics on Income and Living Conditions (SILC)
• Adult Education Survey (AES)
• Community Innovation Survey (CIS)
• Structure of Earnings Survey (SES)
• European Road Freight Transport Survey (ERFT)
• European Health Interview Survey (EHIS)
• Continuous Vocational Training Survey (CVTS)
• Community Statistics on Information Society (CSIS)

However …
• Dates of inclusion differ between countries
  o National microdata may offer a wider historical perspective in some cases
  o See LFS: France since 1962, Spain since 1964, Norway since 1972, Portugal since 1992, Estonia since 1995
• Implementation in national instruments differs: different surveys, variables from administrative data
  o Good metadata are important (e.g. questionnaires)
• More questions and more detailed microdata at national level in several cases
  o The SILC example in Estonia, France, and some other countries

SILC and the Estonian Social Survey
• “ESS is the Estonian branch of a pan-European survey of income and living conditions called the EU-SILC (…). Statistics Estonia, however, has added questions, which are of interest to the domestic consumers of Estonia, to the EU-commissioned survey, and attempts to have the survey be a combination of Estonian and European data requirements.”
• In 2004, four modules were added (…). They were all commissioned by Estonian domestic consumers. The topics of the four modules concerned social contacts; family attitudes and political views; crime, violence and feeling of security; and ethnic integration.
• In 2005, there were three modules in ESS: one by order of Eurostat and the other two by domestic consumers. The topic of the Eurostat module was “Social origin” (…). Estonian domestic modules were entitled “From school to work” and “Trade unions and collective agreements”.

SILC and the French EPCV and SRCV
• SILC (SRCV) starts in 2004, but …
  o Living conditions surveys: 1978-1979, 1986-87, 1993-94
  o European Community Household Panel (1994-2001)
  o Permanent Living Conditions Survey, EPCV (1996-2004)
• The current French SRCV system took over from the former permanent survey of living conditions (EPCV) system in 2004
• It takes over some questions from the EPCV and includes some other questions not required at Eurostat level
• Persons stay in the panel for 9 years, vs 4 years required by Eurostat
• See Stéfan Lollivier presentation

Other examples for SILC implementation at national level
• SILC datasets from Eurostat do not contain Swiss data (2007-2010).
• The Great Britain component of the EU-SILC dataset is collected by the Office for National Statistics (ONS) as part of the General Lifestyle Survey (GLF) (held at the Archive under Special Licence access conditions; see GN 33403).
• The Northern Ireland component is collected by the Northern Ireland Statistics and Research Agency (NISRA) as part of the Living Conditions Survey (LCS).

European integrated microdata: European Central Bank microdata
• Household Finance and Consumption Survey
• Every 3 years
• First deliverable in 2013
• No pre-existing national survey in some countries, older waves in others
  o France: every 6 years, and the oldest waves
  o Questionnaire of 112 pages in France vs 65 pages for the European survey, yet some variables collected at European level are not in the French survey
  o Adaptation of the survey was needed (break in the national series)
  o Moved from the NSI to the Central Bank (with consequences for researcher access)

European integrated microdata: other European government microdata
• European Commission, Directorate-General for Economic and Financial Affairs (DG ECFIN): Business and Consumer Surveys
• EUROFOUND: surveys on working conditions

DG ECFIN and the Business and Consumer Surveys
• The European Commission's Directorate-General for Economic and Financial Affairs (DG ECFIN) manages a network of national institutes to conduct a harmonised EU programme of 6 business and consumer tendency surveys (quarterly or monthly from 1985; from 1995 for services)
  o Industry
  o Services
  o Consumers
  o Retail trade
  o Building
  o Investment
  o and some others

EUROFOUND and the surveys on working conditions
The European Foundation for the Improvement of Living and Working Conditions
• An autonomous EU agency
• Set up by Council Regulation (EEC) No. 1365/75 of 26 May 1975
• Contributes to the planning and design of better living and working conditions

3 surveys combining company and employee surveys
• The European Working Conditions Survey (EWCS): 1990, 1995, 2000, 2005, 2010
  o Working conditions and the quality of work and employment
• The European Quality of Life Survey (EQLS): 2003, 2007, 2010
  o A broad range of indicators of quality of life, both objective and subjective
• The European Company Survey (ECS): 2004, 2009, 2013
  o Workplace practices based on the views of both managers and employee representatives
• Datasets available for download at the ESDS (UKDA)

European integrated microdata: other microdata under EU regulations
• Information and Communication Technologies (ICT) surveys
  o Regulation No 808/2004 and updated Regulation 1006/2009, linked with the European roadmap for ICT
• Household Budget Surveys (HBS)
  o Eurostat recommendations on methodology and harmonization
• Censuses
  o Successive regulations (2008, 2009, 2010 …) to achieve more comparability, output-oriented

National official microdata post-harmonized at European level
• Non-government bodies
  o Collected and harmonized a posteriori by universities and archives
• IECM (IPUMS International): censuses
• May gather official microdata and academic datasets
  o LIS (Luxembourg Income Study): Household Budget Survey
  o MTUS: Time Use Survey
• More detailed microdata often accessible at national level

IECM/IPUMS and national dissemination of European censuses
[Map; legend: IECM + national dissemination / IECM in progress + national dissemination / only national dissemination / no information]

More may be available at national level
Ex.: United Kingdom and France
• Several Public Use Files (PUF) available for dwellings, individuals and residential mobility at different geographical levels on the INSEE website
• Several more detailed Scientific Use Files (SUF) available to researchers via archives
• Access to highly detailed microdata via secure access for approved research projects (ONS in UK, CASD/GENES in France)

LIS and MTUS
• LIS and MTUS are examples of microdata post-harmonized at European level from both government and non-government (university) sources
• More detailed microdata at national level in several countries

LIS is a cross-national data center, located in Luxembourg. LIS is home to the Luxembourg Income Study Database (LIS) and the Luxembourg Wealth Study (LWS) Database.

Country | Survey | Year | Income Unit | Data Collection
Cyprus | LWS | 2001 | Primary Economy Unit | Central Bank of Cyprus and University of Cyprus
Finland | LWS | 1998 | Household Wealth Survey | Statistics Finland
Germany | LWS | 2001 | German Socio Economic Panel | German Institute for Economic Research, DIW
Italy | LWS | 2002 | Survey of Household Income and Wealth | Bank of Italy
UK | LWS | 2000 | British Household Panel Survey + cross national equivalent files | Institute for Social and Economic Research
UK | LIS | 1999 | Family Resources Survey | Department for Work and Pensions, ONS, National Centre for Social Research

The Centre for Time Use Research collects Time Use Surveys

Country | Year | Survey | Data collection
France | | Time Use Survey | INSEE
Belgium | 1966 | The Multinational Comparative Time-Budget Research Project | Pierre Feldheim and Claude Javeau, Sociological Institute, Free University of Brussels
Bulgaria | 1988 | The 1988 Bulgarian National Time Use Survey | Central Statistical Office, Institute of Sociology at the Bulgarian Academy of Sciences
Finland | 1987-88 | Time Use Survey | Statistics Finland
Hungary | 1965 | The Multinational Comparative Time-Budget Research Project | Sociological Research Group, Hungarian Academy of Sciences
Italy | 1979/80 | Il Tempo della Citta. Una Ricerca Sull'uso del Tempo Quotidiano in una Metropoli | University of Turin
UK | 1961 | The People's Activities |
UK | 2005 | Omnibus Survey, One Day Diary of Time Use Module |
UK data collection: The Office for National Statistics coordinated the study and BBC collected the data. The Institute for Social and Economic Research at the University of Essex transferred the diaries into coded electronic data.

Other OS microdata, neither integrated nor post-harmonized at European level, useful for comparative research
• Example: surveys on living conditions (other than those related to SILC)
  o May include some from academic institutions
• Administrative microdata and OS surveys
  o Will be increasingly combined and used by researchers
  o Examples
    - Employers and employees: A guide to Linked Employer-Employee Data Sources in the EU and Beyond (Tanvi Desai, London School of Economics, 2008)
    - Social security, pensions: The Impact of Social Security Contributions on Earnings: Evidence from administrative data in France, Germany, Netherlands and UK (Antoine Bozio, research proposal submitted in September 2011 to the Open Research Area (ORA) call)

Other national surveys on living conditions
• Denmark
  o The register for health and social conditions, 1977-2012
  o Family allowance and child benefits, 1957-2012
• Estonia
  o Estonian Social Survey, 2004-2010
  o Household Budget Survey, 2010
• France
  o Living conditions, 1978-79; 1986-87; 1993-94
  o Permanent Living Conditions Survey (EPCV), 1996-2004
  o The Statistical survey on income and living conditions (SRCV), 2004-2009
  o Survey of users of accommodation and hot meal distribution services (homeless people), 2001
• United Kingdom
  o General Lifestyle Survey (GLF), 2000-2008
  o Living Standards During Unemployment, 1983-1984
  o English Housing Survey, 2008-2011
• Norway
  o Norwegian Level of Living Study, 1973-2007
  o Study on housing conditions among low-income families, 1995
  o Welfare and level of living among the very frail elderly, 2000

Linked Employer-Employee Data sources, examples of national surveys
• Czech Republic
  o National survey: Information System on Average Earnings (ISAE)
  o Data producer: the Czech Ministry of Labour commissions the private agency TREXIMA
  o Access: direct access is only available on-site at TREXIMA
  o Notes: the Structure of Earnings Survey for the Czech Republic is derived from the ISAE; there is no access to the Czech SES data at the national level
• Spain
  o Notes: INE does not provide access to any linked employer-employee or firm panel data resources other than SES. The Banco de España conducts an annual survey of non-financial firms, the Central Balance Sheet Data. However, the microdata for this survey are only available to researchers affiliated with the Banco de España
• Germany
  o National survey: LIAB (Linked Employer-Employee Data of the IAB)
  o Data producer: IAB (Establishment Panel), DESTATIS (employment statistics)
  o Access: data access is possible via on-site use, and afterwards also via remote data access
  o Notes: the LIAB data is a linked employer-employee dataset constructed from the IAB Establishment Panel and the Federal Employment Agency's employment statistics
• France
  o REPONSE (Relations Professionnelles et Négociations d'Entreprise): producer DARES; access via DARES
  o COI (Changements Organisationnels et l'Informatisation): producer DARES; access via Réseau Quetelet (SUF). The COI is used for France's contribution to Eurostat's ICT survey
  o Enquête Familles et Employeurs: producer INED
  o DADS: producer INSEE
  o Labour cost and structure of earnings survey (Ecmoss): producer INSEE. The Structure of Earnings Survey for France is a part of this survey
  o Réseau Quetelet (SUF) is also listed as access provider for two of the last three French sources

Country | Data sources | Years | Data collection | Provider
France | DADS | 1993- | Fiscal and social administration, INSEE | Réseau Quetelet (CMH, GENES)
Germany | Sample of Integrated Labour Market Biographies (SIAB) | 1975-2008 | IAB | IAB
Germany | Lohn- und Einkommensteuerstatistik, faktisch anonymisierte Daten (FAST) | | Fiscal administration; DESTATIS | DESTATIS
Germany | Verdienststrukturerhebung (VSE) | 1990-2006 | DESTATIS | DESTATIS
United Kingdom | New Earnings Survey (NES), 1992-2004; Annual Survey of Hours and Earnings (ASHE), 2004; Annual Business Inquiry | 1975-2003 | ONS | Secure Data Service, UKDA

Source: Bozio, Antoine. The Impact of Social Security Contributions on Earnings: Evidence from administrative data in France, Germany, Netherlands and UK. Research proposal submitted in September 2011 to the Open Research Area (ORA) call.

III. How to locate and access official microdata within Europe?
• Metadata
• Transnational access
• DwB support and new tools

Metadata
• Highly fragmented at national level
  o No single point of access even at national level
  o Some countries are opening a portal for access to official data (open data initiatives), yet mostly for aggregate data
  o In some countries, archives gather metadata from different government producers (yet currently only a few)
• Different metadata standards

A map providing general trends for NSIs and Archives
• Small variations are possible (due to lack of information available on the web or very recent changes)
• Does not include other providers such as statistical departments

Metadata dissemination at national level for national microdata: NSIs and Archives
[Map]

Metadata dissemination at European level
• Each European body (Eurostat, ECB …)
• CESSDA, currently only for some national microdata (depending on the perimeter of the Data Archives members)
  o Using NESSTAR
    - Based on the DDI standard for documentation
    - A unified way to look for data and metadata (documentation describing the data)
    - Allows browsing by variable (instead of looking into the questionnaire)
  o Avoids silos between official and academic data

Access
Terminology issues
Access ranging from fully anonymized to highly detailed microdata
• Campus files (CUF)
• Public Use Files (PUF)
• Scientific Use Files (SUF)
• Confidential, highly detailed, sensitive microdata: scientific confidential files (ScF)
However, differences remain in terminology
• Campus files / Public Use Files
• Scientific Use Files / confidential files
Access via CD-ROM/FTP, on site, remote execution, remote access

Access for Eurostat and other integrated European microdata
• Eurostat: still burdensome
  o Even within the framework of a new regulation (see Michel Isnard presentation)
  o A remote access network in project (DARA ESSnet)
• European Central Bank: in progress
• LIS: remote execution for countries paying fees as members
• IPUMS/IECM: free and easy, yet highly anonymized

Access to national OS in Europe
• Highly fragmented: different types of accreditation procedures, application forms, criteria for each type, type of access for each country/producer/provider/type of data
  o Quite long for comparative research
  o Ex.: type of access at national level for the national components related to the SILC

Type of access at national level for the national components related to the SILC
[Table: one row per country, with columns CUF, PUF, SUF, On-site access, Remote Execution, Remote Access; the per-country marks are not reproduced here]
Countries and years:
• Austria 2003-2007
• Czech 2005-2010
• Estonia 2004-2010
• Finland 1967-2011
• France 2004-2009
• Germany 2005-2008
• Ireland
• Italy 2005-2010
• Latvia 2005-2011
• Lithuania 2005-2010
• Poland 2005-2010
• Portugal 2004-2009
• Slovakia 2006-2011
• Slovenia
• Spain 2004-2011
• Switzerland 2007-2009

However, transnational access is increasingly possible
  o Even for highly detailed microdata
  o Several maps providing general trends
    - Based on CESSDA PPP and DwB work
    - 1 National Statistical Institute selected per country
    - Does not include the NSAs and other government bodies
    - Only general trends (small variations possible due to recent changes)
  o Important for future comparative research based on administrative data, which are difficult to integrate at European level

Transnational access to Public Use Files
[Map]
In some countries the number of PUF is (very) limited.

Transnational access to Scientific Use Files
[Map]
Data archives providing access to SUF for OS

Transnational access to confidential data
[Map]

III. DwB support
Current activities and future perspectives
CIMES and MISSY

DwB support
• Current activities
  o Support for transnational access to highly detailed microdata from 4 countries (Germany, France, UK, Netherlands)
    - DwB regular calls, support for accreditation, financial support for travel and fees for RDCs (2 remaining calls)
  o Metadata
    - CIMES (Centralising and Integrating Metadata from European Statistics): currently 1,796 datasets from 22 European countries, including information on access conditions
    - MISSY (Microdata Information System): an online information system with metadata for all integrated European microdata from official statistics held by Eurostat, and Integrated European Census Microdata (IECM)
• Future perspective
  o A European Service Centre for Official Statistics (ESC-OS) as a single point of access linked to the CESSDA Portal
  o It could offer a range of services: metadata, training, support for accreditation, a European Remote Access Network for access to confidential OS

Thanks for listening
Contact: roxane.silberman@ens.fr
Website: http://www.dwbproject.org/