The Data Search
Principal Data Sources and Access
Susan Mowers, Data Librarian
Sarah Roach, Research Assistant

Outline
• Doors to Data …
 ▫ Microdata
 ▫ Aggregate data …
• Microdata Search (hands-on: Odesi, SAS)
 ▫ Public microdata
 ▫ Confidential microdata: RDC and RTRA
• Aggregate data (hands-on: extract)
 ▫ Canadian: CANSIM & other
 ▫ International: OECD, World Bank, Haver, IMF, …

Suggestion
• Please log on to your computer
 ▫ abcd###@uottawa.ca yyyyddmmsin
• Problems? We can help!

Doors to data
• When to use microdata?
 ▫ For a high degree of detail
 ▫ All variables at the individual unit of analysis, leading to …
   Many choices about subject matter
   Greater range of statistical analyses possible

Doors to data
• When to use aggregate data?
 ▫ Microdata not available? e.g., business survey microdata not readily available
 ▫ Need macroeconomic data? e.g., region, country, provincial or city-level
 ▫ Need time-series data? e.g., comparative values already calculated across time periods

Comparing data types …
Aggregate data
Microdata

Questions?

Outline
We are here
• Doors to Data
 ▫ Microdata
 ▫ Aggregate data
• Microdata Search (hands-on: Odesi, SAS)
 ▫ Public microdata
 ▫ Confidential microdata: RDC and RTRA
• Aggregate data (hands-on: extract)
 ▫ Canadian: CANSIM & other
 ▫ International: OECD, World Bank, Haver, IMF ….

Public Statistics Canada Microdata
Access via the Library and Odesi

Public microdata
• Confidentiality/privacy problems are resolved with PUMFs
 ▫ Low-risk nature of public data
 ▫ 24/7 access via Odesi to Statistics Canada public data*
 ▫ Contact point for help: GSG Centre/MRT
*& other sources, e.g., ICPSR [Link] and World Bank [Link] …

Let’s see!
• Public microdata file
• Personal income variable
 ▫ [LINK to Odesi]
 ▫ Note: What type of data? Would it be specific enough?

Let’s see (cont’d)
Screen 1
- What type of data?
- Would it be specific enough?

Let’s see!
• Public microdata file
• Cultural or racial origin variable
 ▫ [Link to Odesi]
 ▫ Note: Do these values reflect the actual question and the level of detail asked? Would they be specific enough?

Let’s see (cont’d)
Screen 2
- Do these values reflect the actual question and the level of detail asked?
- Would they be specific enough?

Let’s see!
• Public microdata file
 ▫ Is there a correlation between cultural / racial origin AND income?
 ▫ [LINK to example from Odesi]

Let’s see (cont’d)
Screen 3

Did you know?
• Odesi provides both the public microdata files and codebooks
 ▫ Download both (data and codebook)
 ▫ Download the data as a subset or full datafile
 ▫ Always download the codebook
 ▫ More info here, e.g., codebook [LINK] and topical index [LINK], or e-mail smowers@uottawa.ca

Practice: Download public data!
• Download a subset. Note also this how-to video [LINK]
• Download codebook & topical index

Questions?

Outline
• Doors to Data
 ▫ Microdata
 ▫ Aggregate data
• Microdata Search
 ▫ Public microdata
We are here
 ▫ Confidential microdata: RDC and RTRA
• Aggregate data (hands-on: download)
 ▫ Canadian: CANSIM and other
 ▫ International: UN, OECD, World Bank, IMF, Haver

Confidential Statistics Canada Microdata
Access via the RDC and RTRA

Agenda
• Why use confidential microdata?
• Access via Research Data Centre (RDC)
• Access via Real Time Remote Access (RTRA)

Why use confidential microdata?
Need more specific data
Public data has limitations. It often …
(1) aggregates continuous data, like age and income, and
(2) suppresses detailed geography

Let’s see!
• Confidential synthetic file
 ▫ Is there a correlation between cultural / racial origin AND income?
 ▫ [Link to example from Odesi]
Explanation: [Link] for information about uses for this synthetic data file.

Let’s see (cont’d)
Screen 4

Why use confidential microdata?
Need panel data
• Panel data follow a panel of individuals over repeated cycles of a survey.
• Public data limitation: Public data files do not include longitudinal data (for reasons of confidentiality)

Why use confidential microdata?
No public data exists
Not every survey has a public microdata file. For example, there is none for …
 ▫ The Uniform Crime Reporting Survey
 ▫ The Canadian Cancer Registry
 ▫ The Canadian Forces Mental Health Survey

Questions?

Agenda
• Why use confidential microdata?
We are here
• Access via RDC
• Access via RTRA

What is the RDC?
• The Research Data Centre (RDC) provides researchers access to confidential microdata.
• Access is provided in a secure university setting.

Where is the RDC and how is it used?
• The COOL RDC can be found on the uOttawa campus, on the 3rd floor of the Morisset Library!
• All work with the data must be done inside the RDC.
• Output can be released to researchers by request, pending vetting for disclosure risk.

Application Process & Survey Availability
To access the RDC there are 3 steps to follow:
1. Apply online on the SSHRC website
2. Complete a security screening
3. Sign a microdata research contract
A list of the surveys available in the RDC can be found here: http://www.rdc-cdr.ca/datasets-and-surveys

Want more information?
Zacharie Tsala Dimbuene
RDC Analyst
Office: Morisset Library 322
Email: coolrdc@uottawa.ca
Web site: [Link]

Agenda
• Why use confidential microdata?
• Access via RDC
We are here
• Access via RTRA

What is RTRA?
• RTRA (Real Time Remote Access) allows remote access to confidential microdata output
• Provides descriptive statistics
• RTRA can be particularly useful during the proposal stage of a research project.

How does RTRA work?
• Submit code to Stats Can (online) indicating the statistics you want, and receive output within the hour.
• Code is generated in SAS.
• Training sessions are available for new RTRA researchers!

Availability of SAS and help
SAS is available…
 ▫ In the Vanier Labs, or
 ▫ As a free browser version, also available online
New to SAS? Training sessions are available.
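
To make "descriptive statistics" concrete, here is a minimal Python sketch. It is not RTRA code (RTRA jobs are written in SAS and run at Statistics Canada); it only illustrates the kind of weighted count and weighted mean a researcher might ask for. The records, the variable names (region, income, weight) and the helper weighted_summary are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical microdata: one record per respondent (all values invented).
records = [
    {"region": "East", "income": 40000, "weight": 100.0},
    {"region": "East", "income": 60000, "weight": 300.0},
    {"region": "West", "income": 50000, "weight": 200.0},
    {"region": "West", "income": 70000, "weight": 200.0},
]


def weighted_summary(rows, group_var, value_var, weight_var):
    """Return {group: (weighted count, weighted mean of value_var)}."""
    weight_sum = defaultdict(float)
    weighted_total = defaultdict(float)
    for row in rows:
        w = row[weight_var]
        weight_sum[row[group_var]] += w
        weighted_total[row[group_var]] += w * row[value_var]
    return {g: (weight_sum[g], weighted_total[g] / weight_sum[g]) for g in weight_sum}


summary = weighted_summary(records, "region", "income", "weight")
for region, (count, mean_income) in sorted(summary.items()):
    print(f"{region}: weighted count={count:.0f}, weighted mean income={mean_income:.0f}")
```

The survey weight matters here: each respondent stands for a number of people in the population, so an unweighted mean of the sample would generally not estimate the population mean.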
RTRA Surveys
Confidential data available by remote access

RTRA Surveys
Survey                                                            | Availability via PUMF*? | Availability via RDC?
Canadian Cancer Registry (CCR)                                    | NO                      | YES
Canadian Forces Mental Health Survey (CFMH)                       | NO                      | YES
Canadian Survey on Disability (CSD)                               | NO                      | YES
Health Services Access Survey (HSAS)                              | NO                      | YES
Homicide Survey                                                   | NO                      | YES
Life After Service Survey (LASS)                                  | NO                      | YES
Longitudinal Survey of Immigrants to Canada (LSIC)                | NO                      | YES
Maternity Experiences Survey (MES)                                | NO                      | YES
Post-Secondary Education Participation Survey (PEPS)              | NO                      | YES
Postsecondary Student Information System (PSIS)                   | NO                      | NO
Registered Apprenticeship Information System (RAIS)               | NO                      | NO
Survey on Living with Chronic Diseases in Canada (SLCDC): Arthritis | NO                    | YES
The National Apprenticeship Survey (NAS)                          | NO                      | NO
Uniform Crime Reporting Survey (UCR)                              | NO                      | YES
*PUMF = Public Use Microdata File

How do I apply to RTRA?
• Fill out and sign an application form [Link | Info] indicating which survey(s) you would like access to, and email it to me at sarah.roach@uottawa.ca
You should have access within two weeks!

More information?
• Compare regular SAS code versus RTRA SAS code – CCHS 2012 example [Link]

More information?
• RTRA code [Link]
• uOttawa RTRA Web site [Link]

Questions?

Outline
• Doors to Data
 ▫ Microdata
 ▫ Aggregate data
• Microdata Search (hands-on: Odesi, SAS)
We are here
 ▫ Public microdata
 ▫ Confidential microdata: RDC and RTRA
• Aggregate data (hands-on: extract)
 ▫ Canada: CANSIM & other
 ▫ International: OECD, World Bank, Haver, IMF …

Aggregate Data
Canadian and International Sources

About aggregate data …
• Unit of analysis is at the economy level, e.g., Canada, U.S., U.K., province/state …
• Often repeated time-series (aggregate) data
Unemployment rate (sa %)* / Labour Force Surveys 1995-2014 – for U.S., U.K., Canada
*Calculated from Labour force status=unemployed from repeated cycles of Labour Force Surveys

Canadian aggregate data
• CANSIM tables [Link]
• Statistics Canada DLI data server!
  [Link]
• Odesi, various
• Conference Board of Canada e-Data (forecast data, metropolitan-level, confidence indices) [Link]

CANSIM
• Official government data from numerous sources, includes business surveys
• Parts of a CANSIM table:
 ▫ Title: Revenue, expenditure and budgetary balance - Provincial administration, education and health quarterly (dollars x 1,000,000)
 ▫ Table #: 380-0081
 ▫ Dimensions: Geography (1 item: Canada); Seasonal adjustment: Adjusted, unadjusted; Sub-sector accounts (3 items); Estimates (120 items)
 ▫ Time frame: Q1, 1980–Current
 ▫ Vector: Each possible combination of categories and options in a table. Also called a series.
 ▫ Time series: A series (vector), measured over a number of years
 ▫ Footnotes: Data definitions
Source: Adapted from Kwantlen Polytechnic University. (2015). Statistics: CANSIM (Guide). http://libguides.kpu.ca/c.php?g=183875&p=1212158

CANSIM Instructions
1. Go to CANSIM.
2. In the Search box, type “provincial expenditur.”
3. On the Search Results page, click on Table 326-0009.
4. The tabs located above the data table are: Data table (you are by default in this selection), Add/Remove data (to narrow your filtering/search), Manipulate (time series), Download (to save the data), Related information (other useful links), and Help.
TWO OPTIONS – go to the Add/Remove data tab to narrow the search and time frame, OR go to the Download tab and download the entire table as a Beyond 20/20 file (Beyond 20/20 is a data viewer you can install on your computer).
Source: Adapted from the Government of Canada. (2015). Canada Business Network Blog.
http://www.canadabusiness.ca/eng/blog/entry/4005/
to Fall 2015

Census tables
• Two types …

Census tables
Method 1 – Odesi (browse)
 ▫ Census (under Demographics & population)
   2011: Profiles [LINK], Tabulations [LINK]
   2006: Profiles [LINK], Tabulations [LINK]
 ▫ National Household Survey (NHS)* (under Social surveys)
   2011: Profiles [LINK], Tabulations (Data Tables) [LINK]
*Don’t forget the replacement voluntary survey, the NHS
Method 2 – New DLI data server [Link]

Other databases of Economics statistical tables (Canada)?
• http://uottawa.ca.libguides.com/econcan-e
 ▫ Download Beyond 20/20 viewer for Windows [Link] and how-to Flash demos [Link]

Questions?

Outline
• Doors to Data
 ▫ Microdata
 ▫ Aggregate data
• Microdata Search (hands-on: Odesi, Stata, SAS)
 ▫ Public microdata
 ▫ Confidential microdata: RDC and RTRA
• Aggregate data (hands-on: extract)
We are here
 ▫ CANSIM & other
 ▫ International: OECD, World Bank, Haver, IMF, FAO …

International aggregate data
• World Bank, World Development Indicators [Link]
• International Monetary Fund (IMF) [Link]
• OECD.Stat [Link]
• Haver [Link]
• Other: Stocks and commodities

World Development Indicators
• Covers many topics in, and related to, economics, including social development and the environment
• All countries
• Annual data from 1960 to present
• See also Africa Development Indicators, if researching Africa (some additional variables).

World Development Indicators
• Country selection
 1. (Optional) Pick a grouping first, e.g., region or income level, on the left,
 2. click on COUNTRY in the middle,
 3. click on the desired countries.
• Series selection
 1. You can do a keyword search
 2. Or drill down under topics to the left
 3. If you are still having problems, see [Link] for a browsable list of series names, then use the wording you find there
• Select years
 ▫ Use tick boxes
• When downloading
 ▫ You can download many series at a time, but
 ▫ Only one country at a time, so
 ▫ TIP: Within one session, each download to Excel contains all the countries you have downloaded so far in that session, so you can keep adding countries and keep only the latest Excel table.

International Monetary Fund
• Databases: International Financial Statistics, Direction of Trade (1980+), Balance of Payments, and Government Finance Statistics, among others
• Covers countries, regions and NGOs.
• Covers 1948 – Present, for the major IMF database: International Financial Statistics
• Over 7,000 economic concepts
Quick Links
Broad topics
 ▫ Balance of Payments
 ▫ External Trade and Exchange Rates
 ▫ Financial Indicators
 ▫ Fund Accounts
 ▫ Government and Public Sector Finance
 ▫ Indicators of Economic Activity
 ▫ International Investment Position
 ▫ International Reserves
 ▫ Labor Markets
 ▫ National Accounts
 ▫ Prices

International Monetary Fund
Regular portal [Link] | New portal [Link]
• Covers many IMF databases* - includes more visualization features
• No Google Chrome
• To download, REGISTER, then sign in to your account (both portals)
• Note: Your account will work in both portals
• Recommend: Build your own query or Search (to left), then click on View data and Excel icon (top of screen)
• Country filters
• Help info [Link]
• (2) Bulk download
• Help info [Link]
• (1) Customize
* Excludes Trade and Investment

OECD.Stat
• Covers many topics in, and related to, economics, including social development and the environment
• Country coverage usually restricted to member countries [Link]
• Different frequency options: annual, quarterly, etc.
• Great tabulation options …
• OECD.Stat lets you manipulate your tables:
 ▫ Pick and choose from among many variables and items/values
 ▫ Drag your variables to rows / columns
 ▫ Multiple countries and series in one single table
 ▫ Then download!

Haver Analytics
• Customized to be ready for econometric analysis
 ▫ e.g., DLX Add-ins to all versions of Excel provide instant updates of your spreadsheets.
• Many advanced functions built in
 ▫ e.g., calculate growth rates and n-period moving averages, create log scales, recession shading and aggregations, seasonal adjustment.
• Comparable macroeconomic databases
• Additional data
 ▫ Stock index prices
 ▫ Ordering in-depth Asian (South and East Asian, and Chinese) databases in Spring
• Requires a DLX plug-in installation (Windows) [Link]

Training
• Guide [Link]
 ▫ Page 1: Intro
 ▫ Pages 2-3: Excel spreadsheets from Haver
 ▫ Page 4: Haver charts for PowerPoint
• On-site training by Haver Economists: March/April

International aggregate data
• Stocks and commodity prices
 ▫ Bloomberg @ Telfer Financial Research and Learning Lab
 ▫ Finance guides: Stocks [Link], commodities [Link]
 ▫ Other sources of commodity prices
   World Bank
   Food and Agriculture Organization [Link]

Questions?

Appointments
Susan Mowers
Data Librarian
Office: Morisset Library 309B
Email: smowers@uottawa.ca

Sarah Roach
RTRA Research Assistant
Office Hours: By appointment
Email: sarah.roach@uottawa.ca
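
Worked example: the "About aggregate data" slide notes that an unemployment rate is calculated from labour force status across repeated Labour Force Survey cycles, and the Haver slide mentions growth rates and n-period moving averages. The Python sketch below shows one way those calculations could look on a tiny invented dataset. The records, weights and function names are hypothetical; this is not how Statistics Canada or Haver implement them, and it leaves out seasonal adjustment.

```python
# Hypothetical microdata: (year, labour force status, survey weight). Values invented.
microdata = [
    (2012, "employed", 90.0), (2012, "unemployed", 10.0), (2012, "not in labour force", 50.0),
    (2013, "employed", 92.0), (2013, "unemployed", 8.0), (2013, "not in labour force", 50.0),
    (2014, "employed", 95.0), (2014, "unemployed", 5.0), (2014, "not in labour force", 50.0),
]


def unemployment_rates(rows):
    """Aggregate microdata into a yearly unemployment rate (%):
    weighted unemployed / weighted labour force (employed + unemployed)."""
    unemployed, labour_force = {}, {}
    for year, status, weight in rows:
        if status in ("employed", "unemployed"):
            labour_force[year] = labour_force.get(year, 0.0) + weight
        if status == "unemployed":
            unemployed[year] = unemployed.get(year, 0.0) + weight
    return {y: 100.0 * unemployed.get(y, 0.0) / labour_force[y] for y in sorted(labour_force)}


def moving_average(values, n):
    """n-period moving average; one value per full window."""
    return [sum(values[i - n + 1:i + 1]) / n for i in range(n - 1, len(values))]


def growth_rates(values):
    """Period-over-period percentage change."""
    return [100.0 * (curr - prev) / prev for prev, curr in zip(values, values[1:])]


rates = unemployment_rates(microdata)   # aggregate time series, one value per year
series = list(rates.values())
print(rates)
print(moving_average(series, 2))
print(growth_rates(series))
```

The first function turns individual-level records into one value per year, which is the step that distinguishes aggregate data from microdata. The other two operate only on the resulting time series.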