POLS 395
Quantitative Methods for Political Science I

This mini-course gives students a very basic introduction to fundamental statistical and econometric techniques used in the social sciences, and an introduction to using statistical software to analyze political data. It is extremely important that aspiring political scientists learn the material offered in this class, for several practical reasons. First, it will help you read and understand political science research, whether in journals, books, or conference papers. Second, the material in this course is a foundation for the other required methods course needed for a PhD in political science, and is necessary if you want to learn more advanced techniques later.

The mathematical requirements for the class are minimal. Only a decent knowledge of algebra is assumed. While learning these techniques often appears daunting, understanding and using them is not really that hard.

We are trying to cover a semester's worth of material in 2 weeks. Several things are essential to staying on top of what we are doing:

1. As with graduate school generally, think of this as a slavish full-time job for 2 weeks. That is, you should plan to devote 8-10 hours a day to this, Monday through Friday. (I plan to leave weekends open.)
2. Do the assigned reading carefully before the class session.
3. We will do most problems in class. Participate in doing the problems to ensure that you understand the material and can communicate that to me.

For most sub-fields of political science (political philosophy being the main exception) there is a premium placed on understanding how to use quantitative methods (including why they are ultimately not a substitute for asking interesting questions and being intellectually creative).
Even if you will never use statistical methods in your research, a good grounding in statistical analysis is essential to avoid much of the handwringing about the place of these methods in political science.

Books

There are two books required for the course: Hill, Griffiths and Judge, Undergraduate Econometrics, 2nd edition, and SPSS 10 Guide to Data Analysis. Both will be useful to you as references for future courses and research, so they are worth getting. There are also references to other sources in the syllabus. Full references can be found at the end of the syllabus.

Evaluation

It is impractical to devote huge portions of time to exams. Assessments (i.e., grades) will be based on my evaluation of your progress in the summer, as evidenced by your performance in class and by a research project conducted during the Fall semester that uses an appropriate quantitative analysis. More details about the final project will be forthcoming.

Week 1  Introduction

Morning

Normative vs. positive political science
Research Approach
  1. Research idea
  2. Literature review
  3. Formulating hypotheses
     a. Hypotheses must be testable/falsifiable
     b. Hypotheses are relational, and maybe causal
        National wealth -> national democracy; wealth -> Republicanism
     c. They are derived from theory
     d. Dependent vs. independent variables
  4. Defining the concepts
     Concept = an abstraction based on characteristics of perceived reality
     What are some concepts readily used in political science?
        democracy, politically left, social capital, war, party ID, political efficacy, justice, terrorism
  5. Operationalization
     Specification of the process by which a concept is measured
     How would we operationalize these concepts?
        Democracy: Elections? Free press? Free association? Contestation? (Robert Dahl, Democracy; David Held, Models of Democracy)
        Politically left: statements, survey responses
        War: declaration, use of official army, violent confrontation extensive in space, time and manpower, casualties?
        Terrorism
  6. Measuring the Data
     a. Cases/units of analysis (individual or aggregate)
     b. Levels of measurement
        i. Nominal
        ii. Ordinal
        iii. Interval
        iv. Ratio
        Q1 (GDA, 72)
        Examples: vote choice, income, partisanship, incarceration
  7. Statistical Analysis
     a. Experimental vs. non-experimental data
        i. Intervention vs. observation
        ii. Replication
     b. Statistical analysis to infer from "samples" to populations, as in experiments
        i. Random samples (equal chance of selection from population)
        ii. Probability sample
        iii. Undercoverage and nonresponse, sources of bias
        iv. Random treatments
        v. Estimated effects and their "significance"
  8. Correlation, causation and spuriousness (Corbett Ch 3)

Reading: Undergrad. Econometrics [UE], Ch 1; PKK (to be handed out) (66-8, 82-98)
(Suggested: Corbett Ch. 1-6)

Afternoon

Introducing Statistical Software (SPSS)
Reading: Guide to Data Analysis [GDA], Ch 1-2 (1-31), skim Appendix B
Problems: Basic data entry and transformation exercises
Problems for class 1:
  Open an existing data set.
  Transform a variable.
  Identify the types of the first 15 variables as nominal, ordinal, interval, or ratio.
  Read p. 551 in the SPSS book. Which dataset(s) use aggregate data?
  Retrieve data from ICPSR.

Week 2  Descriptive Statistics and Visual Presentation

Morning

Descriptive Statistics
  Frequencies
    Counts
    Percentages (valid percentages)
    Q1 (51); Q3?
    - Frequency tables
    - Frequency bar chart
    - Histograms
    Missing data
  Central Tendency
    Mean
    Median
    Mode
    Q2, 4, 7, 8 (72-73); Q2 (86)
  Variability
    Range
    Interquartile range
    Variance
    Coefficient of variation
    Standard deviation
    Standardized score
    Q9 (73)
  Distribution (combines central tendency and variability)
    - Histogram
    - Stem and leaf
    - Boxplot
    Q3 (108)
  Crosstabulation
    For looking at more than one variable with few categories
    Row and column percentages (marginals)
    Additional variables
    Q1, 2, 4 (126)
  Scatterplots
    Q1 (159)

Reading: GDA, Ch. 3-8 ("Conceptual" sections)
(Suggested: Corbett, Ch 8-9; Moore and McCabe, Ch 1)
Problems: GDA: p. 51: 1; p.
72-3: 2, 7, 8; p. 86: 2; p. 108: 3; p. 126: 1, 2, 4; p. 159: 1

Afternoon

Reading: GDA, Ch. 3-8 ("How to..." sections), Appendix A
Problems: GDA: p. 56: 2, 3; p. 74: 4, 5; p. 87: 1, 3; p. 109: 1, 2; p. 130: 1, 6; p. 160: 1, 3; p. 162: 11, 12

Week 3  Probability, Random Variables, Expected Values, PDFs

Experiments, Outcomes, and Random Variables
  Experimental data
  Non-experimental data
  Discrete vs. continuous random variables
Probability Distributions of Random Variables
  Discrete (die roll)
  Continuous (income, sum of "n" dice)
Expected Values of Random Variables
  Summation
  Mean
  Mean of a function of a random variable
  Variance of a random variable
  Sampling?
A Joint Density Function
  Marginal pdfs
  Conditional pdfs, independence of variables
Expected Value of a Function of Several Random Variables
  Covariance and correlation
  Mean of a weighted sum
  Variance of a weighted sum
Distributions of Random Variables
  Binomial
    Inference from a sample to a population
    Discrete random variable
    Applications: presidential approval, drug efficacy trials
  Normal
    Standard normal
    Central limit theorem and sample means
    Applications:
  Alternative distributions

Morning

Reading: UE, Ch 2 (11-36); GDA, Ch 9-10 (163-196)
(Suggested: GHJ, Ch. 2-3)
Problems: UE: 2.1-2.6, 2.9; GDA: p. 175-6: 1, 2; p. 190-3: 1, 2, 3, 5, 7, 9

Afternoon

Problems: GDA: p. 176: 2, 3; p. 196: 1, 2

Week 4  Simple Linear Regression Model (SLRM)

Theoretical Model
  Public spending and GDP per capita
  How much does relative spending change, relative to wealth?
Empirical Model
  Introducing the error term
  Error is what is "left over" after removing the systematic component
Key statistical assumptions
  1. The dependent variable y, given x, is a linear function of x (a line): $y = \beta_1 + \beta_2 x + e$
  2. The expected value of the error is zero, $E(e) = 0$, since $E(y) = \beta_1 + \beta_2 x$ [this is a constant]
  3. Errors are homoskedastic: $\mathrm{var}(e) = \mathrm{var}(y) = \sigma^2$ for all $x$
  4. Errors have zero covariance: $\mathrm{cov}(e_i, e_j) = \mathrm{cov}(y_i, y_j) = 0$; knowing y for one observation does not permit prediction of y for another observation.
  5. X values are fixed (not random) and take more than 1 value.
  6. [optional] For each x, $e \sim N(0, \sigma^2)$ (y is likewise normally distributed).

The true parameters of the relationship are unknown constants. Their estimators are random variables! Estimates are obtained from rules of estimation and the data. Similarly, the true errors are not known. But what do the errors actually represent?
  1. Other explanatory factors not in the model.
  2. Deviation from a true linear relationship.
  3. Randomness in behavior.

Estimating Parameters for the Relationship between x and y
  We want to estimate values for $\beta_1$ and $\beta_2$ using the above modeling assumptions and the data.
  Plot the data
    Linearity and others
  Least squares principle
    Estimate a line (i.e., parameter estimates in $b_1 + b_2 x$)
    Obtain errors from the line
    Square the errors
    Sum them
    Find the estimates that minimize the sum of squared errors:
      $b_2 = \dfrac{T \sum x_t y_t - \sum x_t \sum y_t}{T \sum x_t^2 - \left(\sum x_t\right)^2}$
      $b_1 = \bar{y} - b_2 \bar{x}$
  Why must X have at least 2 values?
  Estimators versus estimates
  What do the estimates mean?
  Predicted values and errors
  Do Exercise 3.2, p. 61
  Do Exercise 3.11, p. 65 (give graph paper)
  Do Exercise 3.15
  Do Exercise 3.13 (a-e)

Converting Non-linear Relationships into the Linear Model
  If we suspect that the relationship between the variables has a particular non-linear functional form, we can often transform the variables into a form that is linear. E.g., taking the natural log of both sides gives an estimate of the % change in y given a % change in x. (A one-unit increase in ln x corresponds to multiplying x by e, about 2.718.)

Morning

Reading: UE, Ch. 3 (42-61); GDA, Ch 18-19 (skim 349-391, 399-406)
(Suggested: GHJ, Ch. 5)
Problems: UE: 3.1, 3.2, 3.10, 3.12, 3.13, 3.15; GDA: p. 399: 1, 4, 6

Afternoon

Reading: GDA, Ch 18-19 (pp. 392-398)
Problems: UE: 3.17 [using SPSS, not Excel]; GDA: pp. 403-406: 1-7, 13-18

Week 5  Properties of Least Squares, and Inference in the SLRM

Ch 4. Properties
  1. Least Squares Estimators as Random Variables
     Different samples will produce different estimates
  2. Sampling Properties of Least Squares Estimators
     Concept: efficiency. Bias and precision and targets.
     Expected values of $b_1$ and $b_2$
       $E(b_2) = E(\beta_2 + \sum w_t e_t) = \beta_2$, since $E(e) = 0$ and $w_t$ is a constant.
       $E(b_1) = \beta_1$
       Thus, the least squares estimator is unbiased. This is true only if no (large) systematic effects are omitted from the model (i.e., an important explanatory variable).
       Repeated sampling context: there is sampling variation in the $b_2$ estimates
       Derivation of equation 4.21
         This shows why $b_2 = \beta_2 + \sum w_t e_t$
     Variance and covariance of $b_1$ and $b_2$
       We prefer estimators with lower variance to those with higher variance, as lower variance narrows the likely range in which we can be "confident" that our estimate falls. Lower sampling variance = higher sampling precision.
       $\mathrm{var}(b_2) = \dfrac{\sigma^2}{\sum (x_t - \bar{x})^2}$
       $\mathrm{var}(b_1) = \sigma^2 \left[ \dfrac{\sum x_t^2}{T \sum (x_t - \bar{x})^2} \right]$
       $\mathrm{cov}(b_1, b_2) = \sigma^2 \left[ \dfrac{-\bar{x}}{\sum (x_t - \bar{x})^2} \right]$
       Note that $\mathrm{var}(b_2)$ [and $\mathrm{var}(b_1)$ and $\mathrm{cov}(b_1, b_2)$] are different if $\mathrm{cov}(e_i, e_j) \neq 0$ or if $\mathrm{var}(e_t) = \sigma^2$ does not hold for all $e_t$.
       Implications of the formulae for efficiency
         1. A higher $\sigma^2$ means higher variances of $b_1$ and $b_2$
         2. Less variation in the X values means higher variances of $b_1$ and $b_2$ [graph of greater spread]
         3. A smaller sample size (T) means higher variances of $b_1$ and $b_2$ and a larger $\mathrm{cov}(b_1, b_2)$ in absolute value
         4. The farther the x's are from 0, the greater the variance of $b_1$
         5. $\mathrm{cov}(b_1, b_2)$ is negative when $\bar{x}$ is positive
     Linear estimators: since $b_2$ is a weighted sum (linear combination) of the y's, it is called a linear estimator
  3. Gauss-Markov Theorem: least squares estimators are BLUE
     1. $b_1$ and $b_2$ are best within the class of linear unbiased estimators, not necessarily best among all linear estimators or among all unbiased estimators.
     2. "Best" implies minimum variance, i.e., a higher probability of obtaining an estimate close to the true value.
     3. Gauss-Markov is true only insofar as SR1-SR5 (the first five model assumptions) are true.
     4. Normality is not required.
     5. Gauss-Markov does not apply to the estimate from a single sample, only to the estimator.
  4. Probability Distribution of Least Squares Estimators
     If e/y are normally distributed, then so are the estimators $b_1$ and $b_2$ (because $b_2$ is a weighted sum of the y's). But recall that their variances are not just $\sigma^2$.
     If e/y are not normally distributed, the least squares estimators will be approximately normal if T is sufficiently large (e.g., T > 50).
  5. Estimating the Variance of the Error Term ($\sigma^2$)
     A good basis for estimating the variance of the random error term ($\sigma^2$) is the average squared estimated residual. An otherwise downward bias in this estimate is corrected by subtracting the number of estimated parameters in the model (2 in the SLRM case) from the denominator:
       $\hat{\sigma}^2 = \dfrac{\sum \hat{e}_t^2}{T - 2}$
     [What is the expected value of the error term?]
     Estimates of the variances and covariances of the LS estimators. (These simply plug $\hat{\sigma}^2$ into the formulae in 2. above.)
     The square roots of these estimated variances are the standard errors of $b_1$ and $b_2$.

Ch 5. Inference
  Interval Estimation
    Theory
      Standardize the estimate using b and se(b)
      The probability distribution of the estimated variance is $(T-2)\hat{\sigma}^2 / \sigma^2 \sim \chi^2_{(T-2)}$.
      "t-distribution": divide a standard normal Z by the square root of an independent chi-square variable $V \sim \chi^2_{(m)}$ that has been divided by its degrees of freedom:
        $t = \dfrac{Z}{\sqrt{V/m}} \sim t_{(m)}$
        $t = \dfrac{b_2 - \beta_2}{\mathrm{se}(b_2)} \sim t_{(T-2)}$
    Obtaining interval estimates
    Repeated sampling context
      Interval estimates combine an estimated point location with a range (based on sampling variability) in which the unknown parameters might fall.
    An illustration
  Hypothesis Testing
    The null hypothesis
    Alternative hypothesis
    The test statistic
    The rejection region
    Food example
    Type I and Type II errors
    The p-value of a hypothesis test
    Tests of significance
    Food example
    Relationship between two-tailed hypothesis tests and interval estimation
    One-tailed tests
    Comment on stating null and alternative hypotheses
  The Least Squares Predictor
    Prediction in the food model

Morning

Reading: UE, Ch. 4-5 (68-85, 90-114); GDA, Ch. 20 (407-419, 424-430)
(Suggested: GHJ, Ch. 6-7)
Problems: UE: 4.1, 4.2, 4.4; 5.1-5.5, 5.10, 5.16; GDA: p. 424-426: 2-4, 6

Afternoon

Reading: GDA, Ch 20 (419-423)
Problems: UE: 4.7; GDA: pp. 427-429: 1-5, 10-13

Week 6  Reporting Results and Choosing Functional Form/Diagnostics

Coefficient of Determination ($R^2$)
  ANOVA and $R^2$ for food expenditure
  Correlation analysis
  Correlation and $R^2$
Reporting Regression Results
  The effects of scaling
Choosing a Functional Form
  Common functional forms
  Examples
    Food example
    Other models and forms
  Empirical issues in choosing a functional form
Are residuals normally distributed?

Morning

Reading: UE, Ch. 6 (121-140); GDA, Ch. 21 (431-448)
(Suggested: GHJ, Ch. 8)
Problems: UE: 6.1-6.4

Afternoon

Problems: UE: 6.6, 6.12, 6.19, 6.21; GDA: pp. 451-453: 1, 2, 4

Week 7  Multiple Regression Model

Ch 7. Multiple Regression
  Model Specification and the Data
    Theoretical model
    Empirical model
    General model
    Assumptions of the model
      y is a linear function of the right-hand-side (RHS) variables (correct model)
      $E(e_t) = 0$ and $E(y_t) = \sum_n \beta_n x_{tn}$
      $\mathrm{var}(e_t) = \mathrm{var}(y_t) = \sigma^2$
      $\mathrm{cov}(e_t, e_s) = \mathrm{cov}(y_t, y_s) = 0$
      [optional] $e_t \sim N(0, \sigma^2)$, $y_t \sim N(\sum_n \beta_n x_{tn}, \sigma^2)$
      $x_n$ is fixed
      Absence of exact collinearity in X
  Estimating the Parameters of the Multiple Regression Model
    Least squares procedure
    Hamburger chain example
    Estimation of error variance
  Sampling Properties of the Least Squares Estimator
    Variances and covariances of the least squares estimators
    Important factors affecting the variance of a parameter estimate, $b_2$
      Higher $\sigma^2$, higher $\mathrm{var}(b_2)$
      Larger N, smaller $\mathrm{var}(b_2)$
      Larger variation of X, smaller $\mathrm{var}(b_2)$
      Collinearity in the X's increases $\mathrm{var}(b_2)$
    Properties of least squares estimators assuming normally distributed errors
  Interval Estimation
  Hypothesis Testing for a Single Coefficient
    Testing significance
    One-tailed hypothesis testing
    Testing elasticity of demand (skip)
    Testing effectiveness of advertising
  Measuring Goodness of Fit

Ch 8. Multiple Regression (cont'd)
  F-test
    F-distribution theory
  Testing the Significance of a Model
    Relationship between joint and individual tests
    (The F-test is apparently not conveniently automated in SPSS)
  An Extended Model
  Testing Some Hypotheses
    Significance of advertising
    Optimal level of advertising
    Optimal level of advertising and price
  Use of Non-sample Information (skip)
  Model Specification
    Omitted variables
      1) Correct model: Democracy = f(GDP, Values)
      2) Estimated model: Democracy = f(GDP)
      - Estimates in 2) are biased (and SEs are lower) unless GDP and Values have zero correlation
      - Obvious evidence of an omitted variable is estimates that are unexpected or "too high"
      - F-tests can be used to suggest whether a variable should be included in the model, but "failure to reject a null for a variable" can mean that the data are not good enough to discern a truly significant effect
    Irrelevant variables
      - Reversing the "real world" above implies that Values is "irrelevant." Irrelevant variables increase the variance, and hence the SE, of the GDP coefficient (unless uncorrelated with GDP).
      - There is always a risk of Type I error; we may incorrectly reject a true null.
    Theory and logic are the ultimate guides to deciding what to include and exclude in a model. In most real-world situations, variables have non-zero correlation, and many could be technically required in a model.
    RESET Test
      Tests for omitted variables or wrong functional form. It is calculated as a joint F-test on a model with $\hat{y}^2$ and $\hat{y}^3$ added as regressors
  Collinearity
    Consequences
      - Relationships between variables are hard to isolate
      - But a lack of variance also makes it hard to isolate an effect (more variation in X can reduce the SE of coefficients)
    Identifying
      - Correlation
      - Auxiliary regressions (for multi-collinearity)
    Mitigating
      - Better/more data
      - Restrictions on parameters (theory)
  Prediction
    Made in the same way as in the simple linear model case.

Morning

Reading: UE, Ch 7, 8 (145-193, skip section 8.5); GDA, Ch. 22 (455-468)
(Suggested: GHJ, Ch. 9-11)
Problems: UE: 7.4, 7.5, 8.3; GDA: pp. 482-3: 1-4

Afternoon

Robert Jackman, "Political Institutions and Voter Turnout in the Industrial Democracies." American Political Science Review. 81(2): 405-24.
Aldrich, John and James Battista. 2002. "Conditional Party Government in the States." American Journal of Political Science. 46(1): 164-172.

Week 8  Dummy/Binary Variables and Interaction Effects

Introduction
  Are the estimates in the model appropriate for all observations?
Intercept Dummy Variables
  $y_t = \beta_1 + \delta D_t + \beta_2 x_t$
Slope Dummy Variables
  $y_t = \beta_1 + \beta_2 x_t + \gamma (x_t D_t)$
  $y_t = \beta_1 + \delta D_t + \beta_2 x_t + \gamma (x_t D_t)$
University Effect on Housing Prices Example
Common Applications of Dummy Variables
  Interactions between qualitative factors
    - Specify the function for each group
  Qualitative variables with several categories
    - Reference group
  Controlling for time or "regime"
Testing for the Existence of Qualitative Effects
  Testing jointly for the presence of several qualitative effects (joint F-test, assuming constant error variance)
Testing the Equivalence of Two Regressions Using Dummy Variables
  Chow test
    Can the two populations be "pooled"?
    F-test: $F = \dfrac{(SSE_R - SSE_U)/J}{SSE_U/(T-K)}$, where $SSE_U = SSE_1 + SSE_2$

Morning

Reading: UE, Ch. 9 (199-213)
Problems: (none listed)

Afternoon discussion

Reading: Montinola, Gabriella and Robert Jackman, "Sources of Corruption: A Cross-country Study." British Journal of Political Science. 32(1): 147-70.

Week 9  Estimating Regression Models with Dichotomous/Ordinal Dependent Variables

Introduction
Models with Binary Dependent Variables
  Linear probability model
  Probit
    Maximum likelihood estimation of probit
    Interpretation of probit models
    Example
  Logit models
Other Qualitative Dependent Variables
  Multinomial choice models
  Ordered choice models
  Count data and Poisson regression
Limited DV Models
  Tobit
  Sample selection bias

Morning

Reading: UE, Ch. 18 (368-380); Moore and McCabe, Introduction to the Practice of Statistics, Ch. 15 (handout)
Problems: TBA

Afternoon

Reading:
Farber, Henry and Joanne Gowa. 1995. "Polities and Peace." International Security. 20(2): 123-46.
Thompson and Tucker. 1997. "A Tale of Two Democratic Peace Critiques." Journal of Conflict Resolution. 41(3): 428-454.

Supplementary Readings

Sometimes getting the concepts is a matter of presentation.
There are a number of other books that you might find helpful in dealing with specific topics.

Books requiring minimal math knowledge

Probability and Statistics
  Moore and McCabe. 1999. Introduction to the Practice of Statistics. 3rd ed. (WH Freeman)
  Wonnacott and Wonnacott. 1990. Introductory Statistics. 5th ed. (Wiley)

Advanced Regression/Econometrics
  Hanushek and Jackson. 1977. Statistical Methods for Social Scientists. (Academic Press)
  Gujarati. 1988. Basic Econometrics. 2nd ed. (McGraw-Hill)

More advanced treatments
  Griffiths, Hill and Judge. Learning and Practicing Econometrics. (Wiley)
    Written by the same authors in more or less the same format, this contains the mathematical essentials underlying what we are doing.
  Greene. 1990. Econometric Analysis. (MacMillan)