Professor Schneider
324B S. Kedzie Hall
sks@msu.edu
517/355-7682

PLS 802
Spring 2010

REGRESSION ANALYSIS

This course provides an introduction to the theory, methods, and practice of regression analysis. The goals are to provide students with the skills that are necessary to: (1) read, understand, and evaluate the professional literature that uses regression analysis; (2) design and carry out studies that employ regression techniques for testing substantive theories; and (3) prepare to learn about more advanced statistical procedures.

The course will not dwell on statistical theory, but it will also not take a superficial approach. Instead, it will focus on the utility of regression analysis for evaluating empirical relationships between variables as a critical component of the theory-testing process. We will thoroughly cover the basic elements of the regression model and the development of the regression estimators. We will see that this model depends very heavily on several assumptions. Therefore, we will examine these assumptions in detail, considering why they are necessary, whether they are valid in practical research situations, and the consequences of violating them in particular applications of the regression techniques. These formal, analytic treatments will be counterbalanced by frequent substantive examples and class exercises. Again, the overall course objective is not to turn you into a statistician; instead, the aim is to maximize your research skills as a political scientist.

Course Prerequisites:

Any course of this type must assume a working knowledge of elementary statistical concepts and techniques. We will conduct a brief review at the beginning of the course, but students must be familiar with such ideas as descriptive statistics, sampling distributions, statistical inference, confidence intervals, and hypothesis testing before moving on to the more complicated matters that will comprise the majority of the course material.
You must have completed at least one prior course in introductory statistics (i.e., PLS 801 or the equivalent).

Course Requirements:

Formal course requirements are as follows:

(1) Class attendance and active participation. This is mandatory. Statistical knowledge is cumulative, and gaps in the material will have detrimental consequences.

(2) Completion of homework assignments. Most of these are computer-based data analysis exercises, designed to familiarize you with the application of various concepts and techniques. Each assignment will focus on a specific set of topics. However, the later assignments are cumulative in the sense that they build upon earlier material in the class. Homework assignments will be given frequently (about once a week). They will not be assigned grades, but they will be checked for completion, and comments will be provided to make sure that you fully understand the material.

(3) Two examinations. A mid-term examination will be given in class on Wednesday, March 3; the final will be a take-home examination, due on Wednesday, May 5, 2010 at 5:00 p.m.

Assignment of Final Grades:

Homework Assignments & Class Participation    30%
Midterm Examination                           30%
Final Examination                             40%

Textbooks:

The following are the main required texts for the course:

* Gujarati, Damodar N., and Dawn C. Porter. 2009. Basic Econometrics (Fifth Edition). Boston, MA: McGraw-Hill.

OR

* McClendon, McKee. 1994 (reissued 2002). Multiple Regression and Causal Analysis. Prospect Heights, IL: Waveland Press.

You should select either Gujarati or McClendon as your main required text. In addition, we will rely upon material from the following two other required books:

* Berry, William D., and Stanley Feldman. 1985. Multiple Regression in Practice. Beverly Hills, CA: Sage Publications.

* Fox, John. 1991. Regression Diagnostics. Beverly Hills, CA: Sage Publications.

The following recommended books provide more detailed, comprehensive coverage of the material, along with explicit derivations of statistical concepts. If you plan on taking additional methods courses, you should acquire one of these books:

Kennedy, Peter. 2008. A Guide to Econometrics (Sixth Edition). Malden, MA: Blackwell Publishing.

Wooldridge, Jeffrey M. 2006. Introductory Econometrics: A Modern Approach (Third Edition). Mason, OH: Thomson South-Western.

The following supplemental books provide more basic explanations of key terms and concepts:

Berry, William D. 1993. Understanding Regression Assumptions. Beverly Hills, CA: Sage Publications.

Lewis-Beck, Michael. 1980. Applied Regression. Beverly Hills, CA: Sage Publications.

Schroeder, Larry D., David L. Sjoquist, and Paula E. Stephan. 1986. Understanding Regression Analysis: An Introductory Guide. Newbury Park: Sage Publications.

You should read all of the material assigned in the required texts. You should also have access to a basic statistics book to help you review statistical concepts and principles, and to provide alternative discussions of the bivariate and multiple regression models. Most of the recommended and supplemental books are either too advanced or too elementary to be used as central texts in this course. However, several of them are very good and would be extremely useful for greater detail or additional explanations at various points in the course.

Computing and Software:

Computers and statistical software are absolutely necessary for employing modern statistical techniques effectively. Therefore, they will be closely integrated into the course material. We will use STATA for most of the class examples, assignments, and examinations.
However, you may also use other statistical software in this course (e.g., R, SAS, SPSS, or SYSTAT), as long as it has the analytical routines and capabilities required to complete the assignments and examinations. If you are not comfortable using STATA, there are a number of books that can help. These include:

Acock, Alan C. 2008. A Gentle Introduction to Stata (Second Edition). College Station, TX: Stata Corporation.

Adkins, Lee, and Carter Hill. 2009. Using Stata for Principles of Econometrics (Third Edition). New York: Wiley.

Kohler, Ulrich, and Frauke Kreuter. 2009. Data Analysis Using Stata (Second Edition). College Station, TX: Stata Corporation.

Hamilton, Lawrence. 2009. Statistics with Stata. Cengage.

You are not required to purchase these books, but you might find them helpful in your efforts to learn and use STATA.

Topics and Reading Assignments

I. Introduction to Regression Analysis

   Reading: Gujarati, pp. 15-33
            McClendon, pp. 1-19
            Kennedy, pp. 1-10
            Wooldridge, pp. 1-19

II. Preliminary Material and Statistical Review

   A. Frequency Distributions, Univariate Summary Statistics, Probability Distributions

      Reading: Gujarati, pp. 801-823
               McClendon, pp. 20-25

   B. Statistical Inference and the Properties of Statistical Estimators

      Reading: Gujarati, pp. 823-837

      1. Confidence Intervals & Hypothesis Tests
      2. Differences Between Two Means, Two Variances, Etc.

III. Basic Concepts for Understanding Regression Analysis: Functional Dependence, Linear Transformations, and Linear Combinations

   Reading: McClendon, pp. 25-28
            Wooldridge, pp. 707-802

IV. The Bivariate Regression Model

   A. Introduction: Basic Ideas and Concepts

      Reading: Gujarati, pp. 34-54
               McClendon, pp. 28-30
               Berry, pp. 1-22

   B. The Least Squares Criterion and Estimation in the Bivariate Regression Model

      Reading: Gujarati, pp. 55-73
               McClendon, pp. 31-41
               Berry and Feldman, pp. 9-12
               Kennedy, pp. 11-59
               Wooldridge, pp. 50-66, 89-95, 106-109, 123-126, 176-181, 187-190

   C. Goodness of Fit, the Correlation Coefficient, and R²

      Reading: Gujarati, pp. 73-92
               McClendon, pp. 42-49
               Schroeder, Sjoquist, and Stephan, pp. 23-29

   D. Assumptions Underlying the Bivariate Linear Regression Model

      Reading: Gujarati, pp. 61-69; 97-101
               McClendon, pp. 133-146
               Berry and Feldman, pp. 9-12
               Kennedy, pp. 11-59
               Wooldridge, pp. 50-66, 89-95, 106-109, 123-126, 176-181, 187-190

   E. Statistical Inference, Confidence Intervals, and Hypothesis Tests

      Reading: Gujarati, pp. 107-146
               McClendon, pp. 147-154
               Lewis-Beck, pp. 26-47
               Schroeder, Sjoquist, and Stephan, pp. 36-53
               Kennedy, pp. 51-90
               Wooldridge, pp. 126-147

   F. Summary, Extensions, and a Preliminary Look at Residuals, Outliers, and Influential Cases

      Reading: Gujarati, pp. 147-187
               McClendon, pp. 49-59
               Berry, pp. 22-88

V. The Multiple Regression Model

   A. Introduction: Notation, Assumptions, and Interpretation

      Reading: Gujarati, pp. 188-196; 213-227
               McClendon, pp. 60-80
               Berry and Feldman, pp. 9-18
               Wooldridge, pp. 73-88

   B. Measures of Goodness of Fit

      Reading: Gujarati, pp. 196-209
               McClendon, pp. 80-83
               Schroeder, Sjoquist, and Stephan, pp. 32-36

   C. Statistical Inference and the Role of Hypothesis Testing

      Reading: Gujarati, pp. 233-259
               McClendon, pp. 133-174
               Berry and Feldman, pp. 12-18
               Kennedy, pp. 60-80
               Wooldridge, pp. 147-167, 214-218

   D. Models of Substantive Phenomena; The Importance of Model Assumptions

      Reading: McClendon, pp. 83-93, 154-157
               Berry, pp. 1-24
               Lewis-Beck, pp. 63-66

   E. Summary and a Brief Look at Extensions

      Reading: Gujarati, pp. 259-276
               McClendon, pp. 93-118
               (McClendon, pp. 119-132)

VI. Model Building in Multiple Regression Analysis

   A. Model Specification

      Reading: Gujarati, pp. 467-522
               McClendon, pp. 288-321
               Berry and Feldman, pp. 18-26
               Berry, pp. 30-45
               Kennedy, pp. 71-92
               Lewis-Beck, pp. 30-45
               Schroeder, Sjoquist, and Stephan, pp. 67-70

   B. Functional Forms, Nonlinearity, and Transformations

      Reading: Gujarati, pp. 525-540
               McClendon, pp. 230-270
               Berry and Feldman, pp. 51-72
               Berry, pp. 60-66
               Kennedy, pp. 93-111
               Schroeder, Sjoquist, and Stephan, pp. 58-61
               Wooldridge, pp. 304-310

   C. Nominal Independent Variables

      Reading: Gujarati, pp. 277-314
               McClendon, pp. 198-229; 271-287
               Kennedy, pp. 248-258
               Schroeder, Sjoquist, and Stephan, pp. 56-58
               Wooldridge, pp. 230-252

VII. Potential Problems in Multiple Regression Analysis

   A. Interpretation of Results

      Reading: Fox, pp. 3-5

   B. Multicollinearity and Its Effects

      Reading: Gujarati, pp. 320-364
               McClendon, pp. 161-163
               Berry and Feldman, pp. 37-50
               Fox, pp. 10-21
               Berry, pp. 24-27
               Kennedy, pp. 192-202
               Lewis-Beck, pp. 58-63
               Schroeder, Sjoquist, and Stephan, pp. 71-72
               Wooldridge, pp. 101-105

   C. Nonnormal and Nonconstant (Heteroscedastic) Errors

      Reading: Gujarati, pp. 365-411
               McClendon, pp. 174-197
               Berry and Feldman, pp. 73-88
               Fox, pp. 40-53
               Berry, pp. 67, 72-81
               Kennedy, pp. 133-139
               Wooldridge, pp. 181-185

   D. Measurement Error

      Reading: Gujarati, pp. 482-485
               Berry and Feldman, pp. 26-37
               Berry, pp. 45-60
               Kennedy, pp. 157-163
               Schroeder, Sjoquist, and Stephan, pp. 70-71
               Wooldridge, pp. 318-325

   E. Residual Analysis, Outliers, and Influential Observations

      Reading: Fox, pp. 21-40
               Berry, pp. 27-29
               Kennedy, pp. 372-388

VIII. Additional Topics

   A. Dichotomous Dependent Variables

      Reading: Gujarati, pp. 541-590
               Schroeder, Sjoquist, and Stephan, pp. 79-80
               Wooldridge, pp. 252-258

   B. Simultaneous Equation Models

      Reading: Gujarati, pp. 673-736
               McClendon, pp. 288-347
               Berry, pp. 1-54
               Schroeder, Sjoquist, and Stephan, pp. 77-79

   C. Nonindependent Disturbances and Time Series Models

      Reading: Gujarati, pp. 737-800
               Berry, pp. 67-72
               Kennedy, pp. 139-156, 163-179
               Schroeder, Sjoquist, and Stephan, pp. 72-75