POLS 395 Quantitative methods for political science I

This mini-course is intended to give students a very basic introduction to
fundamental statistical and econometric techniques used in the social sciences, and an
introduction to using statistical software to conduct political analysis of data. It is
extremely important that aspiring political scientists learn the material offered in this
class for several practical reasons. First, it will help you read and understand political
science research, whether in journals, books, or conference papers. Second, the material
in this course is a foundation for the other required methods course needed for a PhD in
political science, and is necessary if you want to learn more advanced techniques later.
The mathematical requirements for the class are minimal. Only a decent
knowledge of algebra is assumed. While learning these techniques often appears
daunting, understanding and using them is not really that hard.
We are trying to cover a semester’s worth of material in 2 weeks. Several things are
essential to staying on top of what we are doing:
1. As with graduate school generally, think of this as a slavish full time job for 2
weeks. That is, you should plan to devote 8-10 hours a day to this Monday-Friday. (I
plan to leave weekends open.)
2. Do the assigned reading carefully before the class session.
3. We will do most problems in class. Participate in doing the problems to ensure that
you understand the material and can communicate that to me.
For most sub-fields of political science (political philosophy being the main
exception) there is a premium placed on understanding how to use quantitative methods
(including why they are ultimately not a substitute for asking interesting questions and
being intellectually creative). Even if you will never use statistical methods in your
research, a good grounding in statistical analysis is essential to avoid much of the handwringing about the place of these methods in political science.
Books There are two books required for the course: Hill, Griffiths and Judge,
Undergraduate Econometrics, 2nd edition and SPSS 10 Guide to Data Analysis. Both
will be useful for you as references for future courses and research, so they are worth
getting.
There are also references to other sources in the syllabus. Full references can be found at
the end of the syllabus.
Evaluation It is impractical to devote huge portions of time to exams. Assessments
(i.e., grades) will be based on my evaluation of your progress in the summer, as
evidenced by your performance in class and a research project conducted during the Fall
semester that uses an appropriate quantitative analysis. More details about the final
project will be forthcoming.
Week 1
Introduction
Morning
Normative vs. positive political science
Research Approach
1. Research idea
2. Literature Review
3. Formulating hypotheses
a. Hypotheses must be testable/falsifiable
b. Hypotheses are relational, and maybe causal
National Wealth → national democracy; Wealth → Republicanism
c. They are derived from theory
d. Dependent vs. independent variables
4. Defining the concepts
Concept= an abstraction based on characteristics of perceived
reality
What are some concepts readily used in political science?
democracy
politically left
social capital
war
party ID
political efficacy
justice
terrorism
5. Operationalization
Specification of the process by which a concept is measured
How would we operationalize these concepts?
Democracy—Elections? Free press? Free association? Contestation? (Robert Dahl,
Democracy; David Held, Models of Democracy)
Politically left—statements, survey responses
War—declaration, use of official army, violent confrontation extensive in space, time and
manpower, casualties?
Terrorism
6. Measuring the Data
a. Cases/units of analysis (individual or aggregate)
b. Levels of measurement
i. Nominal
ii. Ordinal
iii. Interval
iv. Ratio
Q1, (GDA, 72)
Examples:
Vote choice
Income
Partisanship
Incarceration
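A minimal sketch (in Python, with hypothetical variable names) of what each level of measurement licenses statistically:

```python
# Hypothetical survey variables and their levels of measurement (illustrative).
variables = {
    "vote_choice": "nominal",     # unordered categories
    "partisanship": "ordinal",    # ordered categories, unequal gaps
    "temperature_f": "interval",  # equal gaps, arbitrary zero
    "income": "ratio",            # equal gaps, true zero
}

def valid_statistics(level):
    """Summary statistics that are meaningful at each level of measurement."""
    stats = ["mode"]                   # counting categories always works
    if level in ("ordinal", "interval", "ratio"):
        stats.append("median")         # requires an ordering
    if level in ("interval", "ratio"):
        stats.append("mean")           # requires equal intervals
    if level == "ratio":
        stats.append("ratios")         # "twice as much" needs a true zero
    return stats

for name, level in variables.items():
    print(f"{name} ({level}): {', '.join(valid_statistics(level))}")
```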
7. Statistical Analysis
a. Experimental vs. non-experimental data
i. intervention vs. observation
ii. replication
b. Statistical Analysis to infer from “samples” to populations, as
in experiments
i. Random Samples (equal chance of selection from
population)
ii. Probability sample
iii. Undercoverage and nonresponse, sources of bias
iv. Random treatments
v. Estimated effects and their “significance”
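A quick simulation of the random-sample idea, with a made-up population (the 0.55 approval rate is hypothetical):

```python
import random
import statistics

random.seed(1)
# Hypothetical population of 10,000 voters: 1 = approve, 0 = do not.
population = [1] * 5500 + [0] * 4500   # true approval rate = 0.55

# A simple random sample: every member has an equal chance of selection,
# so the sample mean estimates the population mean, off only by sampling error.
sample = random.sample(population, 1000)
print(statistics.mean(sample))
```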
8. Correlation, causation and spuriousness (Corbett Ch 3)
Reading Undergrad. Econometrics [UE], Ch 1, PKK (to be handed out) (66-8, 82-98)
(Suggested: Corbett Ch. 1-6)
Afternoon
Introducing Statistical Software (SPSS)
Reading Guide to Data Analysis [GDA] Ch 1-2 (1-31), skim Appendix B
Problems: Basic data entry and transformation exercises
Problems for class 1:
Open an existing data set.
Transform a variable
Identify the types of the first 15 variables as nominal, ordinal, interval, or ratio
Read p. 551 in the SPSS book. Which dataset(s) use aggregate data?
Retrieve data from ICPSR
Week 2
Descriptive Statistics and Visual Presentation
Morning
Descriptive Statistics
Frequencies
Counts
Percentages (valid percentages)
Q. 1 (51); Q. 3?
-Frequency Tables
-Frequency bar chart
-Histograms
Missing data
Central Tendency
Mean
Median
Mode
Q. 2, 4, 7, 8 (72-73) Q 2 (86)
Variability
Range
Interquartile Range
Variance
Coefficient of Variation
Standard Deviation
Standardized Score
Q9 (73)
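These measures can be sketched with Python's standard library, using made-up income data (note how the outlier pulls the mean above the median):

```python
import statistics

# Hypothetical incomes (in $1000s) for nine respondents; the last is an outlier.
incomes = [22, 25, 25, 30, 35, 40, 55, 60, 120]

mean = statistics.mean(incomes)      # pulled upward by the outlier
median = statistics.median(incomes)  # resistant to the outlier
mode = statistics.mode(incomes)      # most frequent value
sd = statistics.stdev(incomes)       # sample standard deviation
cv = sd / mean                       # coefficient of variation (unit-free spread)
z = (incomes[-1] - mean) / sd        # standardized score for the outlier

print(f"mean={mean:.1f} median={median} mode={mode} sd={sd:.1f} cv={cv:.2f} z={z:.2f}")
```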
Distribution (combines central tendency, and variability)
-Histogram
-Stem and leaf
-Boxplot
Q3 (108)
Crosstabulation
For looking at more than one variable with few categories
Row and column percentages (marginals)
Additional variables
Q1,2,4 (126)
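A crosstabulation sketch with hypothetical respondent-level data, showing cell counts, row marginals, and row percentages:

```python
from collections import Counter

# Hypothetical data: one (party, vote) pair per respondent.
data = [("Dem", "Yes"), ("Dem", "Yes"), ("Dem", "No"),
        ("Rep", "No"), ("Rep", "No"), ("Rep", "Yes")]

cells = Counter(data)                        # joint (cell) frequencies
rows = Counter(party for party, _ in data)   # row marginals

# Row percentages: the distribution of vote within each party.
for (party, vote), n in sorted(cells.items()):
    print(f"{party} {vote}: {n} ({100 * n / rows[party]:.0f}% of {party}s)")
```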
Scatterplots
Q1 (159)
Reading GDA, Ch. 3-8 (“Conceptual” sections) (Suggested: Corbett, Ch 8-9; Moore and McCabe, Ch 1).
Problems:
GDA: p. 51: 1; p. 72-3: 2,7,8; p. 86: 2; p. 108: 3; p. 126: 1,2, 4; p. 159:
1
Afternoon
Reading GDA, Ch. 3-8 (“How to...” sections), Appendix A
Problems:
GDA: p.56: 2, 3; p.74: 4, 5; p.87: 1, 3; p. 109: 1, 2; p. 130: 1, 6; p. 160:
1, 3; p. 162: 11, 12
Week 3
Probability, Random Variables, Expected Values, PDFs
Experiments, Outcomes, and Random Variables
Experimental data
Non-experimental data
Discrete vs. continuous random variables
Probability Distributions of random variables
Discrete (die roll)
Continuous (income, sum of “n” dice)
Expected Values of Random Variables
Summation
Mean
Mean of a function of a random variable
Variance of a random variable
Sampling?
A Joint Density Function.
Marginal pdfs
Conditional pdfs,
Independence of variables
Expected Value of a Function of several random variables
Covariance and Correlation
Mean of a weighted sum
Variance of a weighted sum
Distributions of random variables
Binomial
Inference from a sample to a population
Discrete random variable
Applications: presidential approval, drug efficacy trials
Normal
Standard normal
Central limit theorem and sample means
Applications:
Alternative Distributions
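The expected value and variance of a discrete random variable, plus a quick central-limit-theorem check, can be sketched with a fair die (all numbers below follow from the die itself, not from any dataset):

```python
import random
import statistics

# E(X) = sum of x * P(x); var(X) = E(X^2) - E(X)^2.
faces = [1, 2, 3, 4, 5, 6]
ex = sum(x * (1 / 6) for x in faces)                  # 3.5
var = sum(x ** 2 * (1 / 6) for x in faces) - ex ** 2  # 35/12, about 2.92

random.seed(0)
# Central limit theorem: means of 30 rolls cluster around 3.5 and look
# bell-shaped, even though a single roll is uniform, not normal.
means = [statistics.mean(random.choice(faces) for _ in range(30))
         for _ in range(2000)]
print(ex, round(var, 4), round(statistics.mean(means), 3))
```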
Morning
Reading: UE, Ch 2 (11-36) GDA, Ch 9-10 (163-196)
(Suggested: GHJ, Ch. 2-3)
Problems:
UE- 2.1-2.6, 2.9;
GDA: p. 175-6: 1,2 ; p. 190-3: 1,2,3, 5, 7, 9
Problems:
GDA p. 176: 2,3; p. 196: 1,2
Afternoon
Week 4
Simple linear regression model (SLRM)
Theoretical Model
Public spending
GDP per capita
How much does relative spending change, relative to wealth?
Empirical model
Introducing the error term
Error is what is “left over” after removing the systematic component
Key statistical assumptions
1. E(y|x), the mean of the dependent variable, is a linear function of X (a line):
y = β1 + β2x + e
2. The expected value of e is zero: E(e) = 0, since E(y) = β1 + β2x [this is a constant]
3. Errors are homoskedastic:
var(e) = var(y) = σ², for all t
4. Errors have zero covariance:
cov(ei, ej) = cov(yi, yj) = 0
knowing y for one observation does not permit prediction of y for another observation.
5. X values are fixed (not random) and take more than one value.
[optional] 6. For each X, e ~ N(0, σ²) (as is y).
True Parameters for relationship are unknowns. They are random variables! They are
estimated from rules of estimation and the data.
Similarly, the true errors are not known. But what do the errors actually represent?
1. other explanatory factors not in the model.
2. deviation from a true linear relationship
3. randomness in behavior.
Estimating Parameters for the relationship between x and y
We want to estimate values for β1 and β2 using the above modeling assumptions
and the data. Plot the data
Linearity and others
Least squares principle
Estimate a line (i.e., parameter estimates in b1 + b2 X)
Obtain errors from line
Square errors
Sum them
Find estimates that minimize squared errors
b2 = [T Σxtyt - (Σxt)(Σyt)] / [T Σxt² - (Σxt)²]
b1 = ȳ - b2x̄
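The least squares formulas can be applied by hand to small made-up data (the x and y values below are hypothetical, chosen so the arithmetic comes out cleanly):

```python
# Hypothetical data: x = income ($100s), y = food expenditure.
x = [3, 5, 7, 9, 11]
y = [8, 14, 17, 24, 27]

T = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

# The least squares formulas, term for term.
b2 = (T * sum_xy - sum_x * sum_y) / (T * sum_x2 - sum_x ** 2)  # slope
b1 = sum_y / T - b2 * sum_x / T                                 # intercept

residuals = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]
print(b1, b2, round(sum(residuals), 10))  # residuals sum to 0 by construction
```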
Why must X have at least 2 values?
Estimators versus estimates
What do the estimates mean?
Predicted values and errors
Do Exercise 3.2 p. 61
Do exercise 3.11 p. 65 (give graph paper)
Do Exercise 3.15
Do exercise 3.13 (a-e)
Converting non-linear relationships into the linear model
If we suspect that the relationship between the variables is of a particular non-linear
functional form, we can often transform the variables into a form that is linear. E.g.,
Taking the natural log of both sides gives an estimate of the % change in y given a 1%
change in x (an elasticity). Note that a one-unit change in ln(x) corresponds to
multiplying x by e ≈ 2.72.
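A sketch of the log-log transformation, using noise-free hypothetical power-law data so the recovered slope is exactly the elasticity:

```python
import math

# If y = A * x^beta, then ln(y) = ln(A) + beta * ln(x): linear in the logs.
beta_true = 0.5
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
ys = [3.0 * x ** beta_true for x in xs]   # exact power law, no noise

lx = [math.log(x) for x in xs]
ly = [math.log(y) for y in ys]

# Least squares slope of ln(y) on ln(x) recovers the elasticity beta.
T = len(lx)
num = T * sum(a * b for a, b in zip(lx, ly)) - sum(lx) * sum(ly)
den = T * sum(a * a for a in lx) - sum(lx) ** 2
beta_hat = num / den  # % change in y per 1% change in x
print(round(beta_hat, 6))
```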
Morning
Reading UE , Ch. 3 (42-61)
GDA, Ch 18-19 (skim 349-391, 399-406)
(Suggested: GHJ, Ch. 5)
Problems:
UE, 3.1, 3.2, 3.10, 3.12, 3.13, 3.15
GDA, p.399: 1, 4, 6
Afternoon
Reading GDA, Ch 18-19 (pp. 392-398)
Problems:
UE, 3.17 [using SPSS, not Excel]
GDA pp. 403-406: 1-7, 13-18
Week 5
Properties of Least Squares, and Inference in the SLRM
Ch 4. Properties
1. Least Squares Estimators as Random Variables
Different samples will produce different estimates
2. Sampling Properties of Least Squares Estimators
Concept-- Efficiency: Bias and precision and targets.
Expected values of b1 and b2
E(b2)= E(2 +  wtet) = 2 since E(e)=0 and wt is a constant.
E(b1)=1
Thus, least squares estimator is unbiased.
This is true only if no (large) systematic effects are omitted from the model (i.e., an
important explanatory variable)
Repeated sampling context- there is sampling variation in b2
estimates
Derivation of equation 4.21
This shows why b2 = β2 + Σ wtet
Variance and covariance of b1 and b2
We prefer estimators with lower variance to higher variance, as it decreases the
likely range in which we can be “confident” that our estimate falls.
Lower sampling variance= higher sampling precision
var(b2) = σ² / Σ(xt - x̄)²
var(b1) = σ² [Σxt² / (T Σ(xt - x̄)²)]
cov(b1, b2) = σ² [-x̄ / Σ(xt - x̄)²]
Note that var(b2) [and var(b1) and cov(b1, b2)] differ from these formulas if
cov(ei, ej) ≠ 0 or var(et) ≠ σ² for all t.
Implications of the formulae for efficiency
1. a higher σ² means higher var(b1) and var(b2)
2. less variation in the X values means higher var(b1) and var(b2) [graph of greater spread]
3. a smaller sample size (T) means higher var(b1), var(b2), and cov(b1, b2)
4. the farther the x values are from 0, the greater the variance of b1
5. cov(b1, b2) is negative when the slope is positive
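These sampling properties can be checked by Monte Carlo simulation (all parameter values below are hypothetical): repeated samples from a known linear model give slope estimates centered on the true β2, with variance close to σ²/Σ(xt - x̄)².

```python
import random
import statistics

random.seed(42)
beta1, beta2, sigma = 1.0, 2.0, 3.0   # hypothetical true parameters
x = list(range(1, 11))                # fixed X values, as assumed
xbar = statistics.mean(x)
sxx = sum((xi - xbar) ** 2 for xi in x)

def draw_b2():
    """Draw a fresh sample of y and return the least squares slope."""
    y = [beta1 + beta2 * xi + random.gauss(0, sigma) for xi in x]
    return sum((xi - xbar) * yi for xi, yi in zip(x, y)) / sxx

draws = [draw_b2() for _ in range(5000)]
print(round(statistics.mean(draws), 3),      # ~2.0: unbiased
      round(statistics.variance(draws), 4),  # ~ sigma^2 / sxx
      round(sigma ** 2 / sxx, 4))
```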
Linear estimators- since b2 is a weighted sum (linear combination) of the y’s, it is called
a linear estimator
3. Gauss-Markov theorem: least squares estimates are BLUE
1. b1 and b2 are best linear unbiased estimators, not necessarily best
linear, or best unbiased.
2. “best” implies minimum variance, i.e., a higher probability of obtaining an
estimate closer to the true one.
3. Gauss-Markov is true only insofar as SR1-SR5 are true.
4. Normality is not required.
5. Gauss-Markov does not apply to the estimate from a single sample, only
to the estimator.
4. Probability Distribution of Least Squares Estimators
If e/y are normally distributed, then so are the estimators b1 and b2 (b/c b2 is a weighted
sum of y). But recall that their variances are not simply σ².
If e/y are not normally distributed, the least squares estimators will be normal if T is
sufficiently large (i.e., T >50).
5. Estimating the Variance of the Error Term (σ²)
A good basis for estimating the variance of the random error term (σ²) is the
average squared residual. An otherwise downward bias in this estimate is corrected by
subtracting the number of estimated parameters in the model (2 in the SLRM case) from
the denominator:
σ̂² = Σ residt² / (T - 2)
[What is the expected value of the error term?]
Estimates of the variances and covariances of the LS estimators.
(These simply plug the above into the formulae in 2. above.)
The square roots of these estimator variances are the standard errors of b1 and
b2.
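A sketch of these two steps with small hypothetical data and made-up estimates (b1 = 1.2, b2 = 2.4 are illustrative, not from any assigned exercise):

```python
import math

# Hypothetical data and least squares estimates for them.
x = [3, 5, 7, 9, 11]
y = [8, 14, 17, 24, 27]
b1, b2 = 1.2, 2.4

T = len(x)
residuals = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]
sigma2_hat = sum(e ** 2 for e in residuals) / (T - 2)  # divide by T-2, not T

xbar = sum(x) / T
sxx = sum((xi - xbar) ** 2 for xi in x)
se_b2 = math.sqrt(sigma2_hat / sxx)  # plug sigma2_hat into var(b2), take the root
print(round(sigma2_hat, 4), round(se_b2, 4))
```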
Ch 5 Inference
Interval Estimation
Theory
Standardize estimate from b and se of b
The probability distribution of the estimated variance is (T - 2)σ̂²/σ² ~ χ²(T - 2).
“t-distribution”
Divide a standard normal Z by the square root of an
independent chi-square variable V ~ χ²(m) over its degrees of freedom:
t = Z / (V/m)^(1/2) ~ t(m)
t = (b2 - β2) / se(b2) ~ t(T - 2)
Obtaining Interval Estimates
Repeated Sampling Context
Interval estimates combine estimated point location with a range (based on
sampling variability) in which the unknown parameters might fall.
An illustration
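A worked interval estimate with hypothetical numbers (the point estimate, standard error, and sample size are all made up for illustration; the t critical value is from a standard table):

```python
# Hypothetical: b2 = 2.4, se(b2) = 0.1732, T = 5, so T - 2 = 3 degrees of freedom.
b2, se_b2 = 2.4, 0.1732
t_crit = 3.182  # t(0.975) with 3 df, from a t-table

lower = b2 - t_crit * se_b2
upper = b2 + t_crit * se_b2
print(round(lower, 3), round(upper, 3))
# In repeated sampling, 95% of intervals built this way cover the true beta2.
```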
Hypothesis Testing
The null hypothesis
Alternative hypothesis
The test statistic
The rejection region
Food example
Type I and Type II errors
The p-value of a hypothesis test
Tests of significance
Food example
Relationship between two-tailed hypothesis tests and interval estimation
One-tailed tests
Comment on stating null and alternative hypotheses
The Least Squares Predictor
Prediction in the food model
Morning
Reading UE, Ch. 4-5 (68-85, 90-114); GDA, Ch. 20 (407-419, 424-430)
(Suggested: GHJ, Ch. 6-7)
Problems:
UE, 4.1, 4.2, 4.4; 5.1-5.5, 5.10; 5.16
GDA, p. 424-426: 2-4, 6
Afternoon
Reading GDA, Ch 20 (419-423)
Problems:
UE, 4.7
GDA, pp. 427-429: 1-5, 10-13
Week 6
Reporting Results and Choosing Functional Form/Diagnostics
Coefficient of Determination (R2)
ANOVA and R2 for food expenditure
Correlation Analysis
Correlation and R2
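A sketch of the identity between R² and the squared correlation in the simple model, using small hypothetical data and the least squares estimates for them:

```python
import math

# Hypothetical data; b1, b2 are the least squares estimates for these values.
x = [3, 5, 7, 9, 11]
y = [8, 14, 17, 24, 27]
b1, b2 = 1.2, 2.4

xbar, ybar = sum(x) / len(x), sum(y) / len(y)
sst = sum((yi - ybar) ** 2 for yi in y)                        # total variation
sse = sum((yi - (b1 + b2 * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained
r2 = 1 - sse / sst

sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
sxx = sum((xi - xbar) ** 2 for xi in x)
r = sxy / math.sqrt(sxx * sst)        # Pearson correlation of x and y
print(round(r2, 4), round(r ** 2, 4))  # identical in the simple model
```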
Reporting Regression Results
The Effects of Scaling
Choosing a Functional Form
Common functional forms
Examples
Food example
Other models and forms
Empirical Issues in Choosing a functional form
Are residuals normally distributed?
Morning
Reading
UE, Ch. 6 (121-140); GDA, Ch. 21 (431-448)
(Suggested: GHJ, Ch. 8)
Problems:
UE, 6.1-6.4
Problems:
UE, 6.6, 6.12, 6.19, 6.21
GDA, pp. 451-453: 1,2,4
Afternoon
Week 7
Multiple Regression Model
Ch 7 Multiple Regression
Model Specification and the Data
Theoretical Model
Empirical Model
General Model
Assumptions of the Model
Y is a linear function of RHS variables
correct model: E(et) = 0 and E(yt) = Σ βn xtn
var(et) = var(yt) = σ²
cov(et, es) = cov(yt, ys) = 0
[optional] et ~ N(0, σ²), yt ~ N(Σ βn xtn, σ²)
xn is fixed
absence of exact collinearity in X
Estimating the Parameters of the Multiple Regression Model
Least Squares Procedure
Hamburger chain example
Estimation of Error variance
Sampling Properties of the Least Squares Estimator
Variances and covariances of the least squares estimators
Important factors affecting variance of parameter estimate, b2
Higher σ2, higher var(b2)
Larger N, smaller var(b2)
Larger variation of X, smaller var(b2)
Collinearity in the X’s increases var(b2)
Properties of least Squares Estimators Assuming Normally Distributed
Errors
Interval Estimation
Hypothesis Testing for a Single Coefficient
Testing Significance
One-tailed hypothesis testing
Testing elasticity of demand (skip)
Testing effectiveness of advertising
Measuring Goodness of Fit
Ch 8 Multiple Regression (cont’d)
F-test
F-distribution theory
Testing the Significance of a Model
Relationship between joint and individual tests
(The F-test is apparently not conveniently automated in SPSS)
An extended model
Testing some hypotheses
Significance of advertising
Optimal level of Advertising
Optimal level of advertising and price
Use of non-sample information (skip)
Model specification
Omitted variables
1)Correct Model: Democracy= f(GDP, Values)
2) Estimated model: Democracy=f(GDP),
-Estimates in 2 are biased (and SEs are lower) unless GDP and Values have zero
correlation
-Obvious evidence of an omitted variable is estimates that are unexpected or “too high”
-F-tests can be used to suggest whether a variable should be included, but “failure to
reject the null for a variable” can mean that the data are not good enough to discern a
truly significant effect
Irrelevant variables
-Reversing the “real world” above implies that Values is “irrelevant.” Irrelevant
variables increase the variance (SE) of the GDP estimate (unless uncorrelated with GDP).
-There is always a risk of Type I error; we may incorrectly reject a true null.
Theory and Logic are the ultimate guides to deciding what to include and exclude in
a model. In most real world situations, variables are non-zero correlated, and many
could be technically required in a model.
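Omitted-variable bias can be demonstrated by simulation with a hypothetical data-generating process (all coefficients below are made up): the true model includes both x and a correlated z, and regressing y on x alone pushes the slope on x above its true value.

```python
import random
import statistics

random.seed(7)
n = 20000
# True model: y = 1 + 2*x + 1*z + e, with x and z positively correlated.
z = [random.gauss(0, 1) for _ in range(n)]
x = [0.8 * zi + random.gauss(0, 1) for zi in z]   # corr(x, z) > 0
y = [1 + 2 * xi + zi + random.gauss(0, 1) for xi, zi in zip(x, z)]

# Bivariate regression of y on x, omitting z.
xbar = statistics.mean(x)
sxx = sum((xi - xbar) ** 2 for xi in x)
b_x = sum((xi - xbar) * yi for xi, yi in zip(x, y)) / sxx
print(round(b_x, 3))  # noticeably above 2: the omitted z loads onto x
```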
RESET Test
Tests for omitted variables or wrong functional form.
It is calculated as a joint F-test on a model with ŷ² and ŷ³ added as regressors
Collinearity
Consequences
Relationships between collinear variables are hard to isolate
- But a lack of variance also makes it hard to isolate an effect (more variation in X can
reduce the SEs of coefficients).
Identifying
-correlation
-auxiliary regressions (for multicollinearity)
Mitigating
better/more data
restrictions on parameters (theory)
Prediction
Made in same way as simple linear model case.
Morning
Reading
UE, Ch 7, 8 (145-193, skip section 8.5); GDA, Ch. 22 (455-468)
(Suggested: GHJ, Ch. 9-11)
Problems:
UE, 7.4, 7.5, 8.3
GDA, pp. 482-3: 1-4
Afternoon
Jackman, Robert. 1987. “Political Institutions and Voter Turnout in the Industrial
Democracies.” American Political Science Review. 81(2): 405-24.
Aldrich, John and James Battista. 2002. “Conditional Party Government in the
States” American Journal of Political Science. 46(1): 164-172.
Week 8
Dummy/Binary Variables and Interaction Effects
Introduction
Are estimates in model appropriate for all observations?
Intercept Dummy Variables
Yt = β1 + δDt + β2Xt + et
Slope Dummy variables
Yt = β1 + β2Xt + γ(XtDt) + et
Yt = β1 + δDt + β2Xt + γ(XtDt) + et
University Effect on Housing Prices Example
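A sketch of how the two kinds of dummies work, with made-up coefficients for a university-housing illustration (none of these numbers come from the textbook example):

```python
# Hypothetical estimates: price = b1 + delta*D + b2*sqft + gamma*(sqft*D),
# where D = 1 if the house is near the university.
b1, delta, b2, gamma = 20.0, 10.0, 0.1, 0.02

def predicted_price(sqft, near_university):
    d = 1 if near_university else 0
    # delta shifts the intercept; gamma changes the slope for the D = 1 group.
    return b1 + delta * d + b2 * sqft + gamma * (sqft * d)

print(predicted_price(1000, False))  # 20 + 0.1*1000 = 120
print(predicted_price(1000, True))   # (20+10) + (0.1+0.02)*1000 = 150
```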
Common Applications of Dummy Variables
Interactions between Qualitative factors
-Specify function for each group
Qualitative Variables with Several Categories
- reference group
Controlling for time or “regime”
Testing for the Existence of Qualitative Effects
Testing jointly for the presence of several qualitative effects (joint F-test,
assuming constant error variance)
Testing the Equivalence of two regressions using dummy variables
Chow test
Can the two populations be “pooled”?
F-test
F = [(SSER - SSEU)/J] / [SSEU/(T - K)], where SSEU = SSE1 + SSE2
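The Chow test statistic worked through with hypothetical sums of squared errors (all values below are invented for illustration):

```python
# Can two groups be pooled into one regression?
sse_r = 120.0             # restricted: one pooled regression
sse_1, sse_2 = 45.0, 50.0
sse_u = sse_1 + sse_2     # unrestricted: separate regressions per group
J = 2                     # restrictions tested (equal intercept and slope)
T, K = 40, 4              # total observations; parameters when run separately

F = ((sse_r - sse_u) / J) / (sse_u / (T - K))
print(round(F, 3))  # compare with the F(J, T-K) critical value
```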
Morning
Reading
UE, Ch. 9 (199-213)
Problems TBA
Afternoon discussion
Reading
Montinola, Gabriella and Robert Jackman. 2002. “Sources of Corruption: A
Cross-Country Study.” British Journal of Political Science. 32(1): 147-70.
Week 9
Estimating regression models with dichotomous/ordinal dependent
variables
Introduction
Models with Binary Dependent Variables
Linear Probability Model
Probit
Max Likelihood Estimation of probit
Interpretation of probit models
Example
Logit Models
Other Qualitative Dependent variables
Multinomial Choice Models
Ordered Choice models
Count data and Poisson Regression
Limited DV Models
Tobit
Sample selection bias
Morning
Reading
UE, Ch. 18 (368-380), Moore and McCabe, Introduction to the Practice of
Statistics. Ch, 15 (handout)
Problems TBA
Afternoon
Reading
Farber, Henry and Joanne Gowa. 1995. “Polities and Peace.” International Security. 20(2):
123-46.
Thompson and Tucker, 1997. “A Tale of Two Democratic Peace Critiques” Journal of
Conflict Resolution. 41(3): 428-454.
Supplementary Readings
Sometimes getting the concepts is a matter of presentation. There are a number of other
books that you might find helpful in dealing with specific topics.
Books requiring minimal math knowledge.
Probability and Statistics
Moore and McCabe. 1999. Introduction to the Practice of Statistics 3rd ed. (WH
Freeman)
Wonnacott and Wonnacott. 1990. Introductory Statistics 5th ed. (Wiley).
Advanced Regression/Econometrics
Hanushek and Jackson. 1977. Statistical Methods for Social Scientists. Academic
Press.
Gujarati. 1988. Basic Econometrics. 2nd ed. McGraw-Hill.
More advanced treatments
Griffiths, Hill and Judge, Learning and Practicing Econometrics. (Wiley) Written by the
same authors in more or less the same format, this contains the mathematical essentials
underlying what we are doing.
Greene. 1990. Econometric Analysis. (MacMillan)