Microeconometrics
dr Katarzyna Kopczewska
Class 01

Outline:
- Course introduction
- Requirements
- Intro to STATA: linear model example

Course introduction and requirements

The Lab in Microeconometrics follows the lecture in Microeconometrics. Students are expected to attend the class (min. 80% of the time). Lab exercises are based on Case Studies solved in STATA. To document your activity during the Lab, fill in the “strategic agenda” for the Case Study presented. To pass the Lab, students have to present their model and fill in the “strategic agendas”.

Class 01 issues
1. Load the data
2. Summarize data in tables / on graph
3. Estimate linear model using options for regress
4. Add a dummy variable
5. Run appropriate tests
6. Check VIF
7. Draw conclusion: fill in “strategic agenda”

Dataset description: Swiss Fertility and Socioeconomic Indicators (1888) Data

Standardized fertility measure and socio-economic indicators for each of 47 French-speaking provinces of Switzerland at about 1888.

A data frame with 47 observations on 6 variables, each of which is in percent, i.e., in [0,100].

[,1]  Fertility         Ig, "common standardized fertility measure"
[,2]  Agriculture       % of males involved in agriculture as occupation
[,3]  Examination       % "draftees" receiving highest mark on army examination
[,4]  Education         % education beyond primary school for "draftees"
[,5]  Catholic          % catholic (as opposed to "protestant")
[,6]  Infant.Mortality  live births who live less than 1 year

All variables but 'Fertility' give proportions of the population.

Details (paraphrasing Mosteller and Tukey):
Switzerland, in 1888, was entering a period known as the "demographic transition"; i.e., its fertility was beginning to fall from the high level typical of underdeveloped countries.

Additional info:
Files for all 182 districts in 1888 and other years have been available at http://opr.princeton.edu/archive/eufert/switz.html

Source:
Mosteller, F. and Tukey, J. W.
(1977) Data Analysis and Regression: A Second Course in Statistics. Addison-Wesley, Reading, Mass.
Kosuke Imai, Gary King and Olivia Lau (2007). Zelig: Everyone's Statistical Software. R package version 3.0-1. http://gking.harvard.edu/zelig

Commands used for modelling:

File/Import                                                   to read the data
Graphics / Two way graph                                      to make scatterplot in new window
. plot Y X                                                    to make scatterplot in results window
. avplots (after regress)                                     to check individual relationships
. correlate                                                   to check correlation among variables
. sum variable, d                                             to display summary of variable
Statistics / Linear models and related / Linear regression    to estimate OLS model
. regress                                                     to estimate OLS model
. vif (after regress)                                         Variance Inflation Factor, to detect multicollinearity
. estat hettest (after regress)                               to check heteroskedasticity

To create dummy variables:

Without missing values:
. gen young = 0
. replace young = 1 if age<25

With missing values:
. gen young = 0
. replace young = 1 if age<25
. replace young = . if missing(age)

Compare on: http://www.stata.com/support/faqs/data/dummy.html

. predict resi, residuals                                     to generate residuals
. egen ze=std(resi)                                           to generate standardised residuals
. swilk variable                                              to test normality

VIF: Variance Inflation Factor

Assuming y = a + b_2 x_2 + b_3 x_3 + ... + b_K x_K + e

Regress x_K on the other x_i and check the R^2 of that regression (called R^2_K).

VIF_K = 1 / (1 - R^2_K)

Variable K is problematic when VIF_K > 10 (usually such variables should be dropped).
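
The VIF definition can be checked by hand against STATA's built-in report. The lines below are only a sketch under assumptions: the Swiss data are taken to be already in memory, and the lower-case variable names (fertility, agriculture, examination, education, catholic, infantmortality) are hypothetical, so they should be adjusted to whatever names the import actually produced. The sketch uses estat vif, the postestimation form of the vif command listed in the handout.

```stata
* Assumption: the Swiss dataset is already loaded (e.g. via File/Import).
* All variable names below are hypothetical; check them with -describe-.

* Step 1: estimate the full linear model
regress fertility agriculture examination education catholic infantmortality

* Step 2: built-in Variance Inflation Factors for every regressor
estat vif

* Step 3: VIF by hand for one regressor (here: examination).
* Regress it on all the other regressors and read off R-squared.
regress examination agriculture education catholic infantmortality
display "R2_K = " e(r2)
display "VIF  = " 1/(1 - e(r2))
```

The value displayed in Step 3 should agree with the line for examination in the estat vif table from Step 2. Note that Step 3 overwrites the stored estimates, so the full model has to be re-run before any further postestimation commands such as estat hettest.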