SC 704: Topics in Multivariate Statistics
Fall 2010
Wednesday 3:00 - 5:20 pm
306 Carney Hall

Professor: Sara Moorman
Office: 404 McGuinn Hall
Office hours: Mondays 1:00-3:00 pm or by appointment
E-mail: Sara.Moorman.1@bc.edu
Please e-mail me from your BC account, and include “SC704” in the subject line. My response may be delayed up to 24 hours.
Phone: (617) 552 - 4209

About the Course

This applied course is designed for students in sociology, education, nursing, organizational studies, political science, psychology, or social work with a prior background in statistics at the level of SC703: Multivariate Statistics. It assumes a strong grounding in multivariate regression analysis. The major topics of the course will include OLS regression diagnostics, binary, ordered, and multinomial logistic regression, models for the analysis of count data (e.g., Poisson and negative binomial regression), treatment of missing data, and the analysis of clustered and stratified samples. All analyses in the course will be conducted using Stata, but no previous Stata experience is necessary.

Required Readings

Texts to purchase:
Acock, Alan C. 2008. A Gentle Introduction to Stata. 2nd ed. College Station, TX: Stata Press.
Long, J. Scott and Jeremy Freese. 2006. Regression Models for Categorical Dependent Variables Using Stata. 2nd ed. College Station, TX: Stata Press.
McKnight, Patrick E., Katherine M. McKnight, Souraya Sidani, and Aurelio Jose Figueredo. 2007. Missing Data: A Gentle Introduction. New York: Guilford Press.

Course reserves online:
Access “read for next week” entries that are not from the required purchases as .pdf files through the library website (http://www.bc.edu/libraries/) or through the link on the course Blackboard page.

Blackboard

Visit the Blackboard page for this course regularly for announcements, grades, and course materials.

Required Software

This course requires the use of the program Stata.
The most current version is available on the computers in the Sociology graduate student lounge. For use on your own computer, you have two options: (1) access the program through remote connection to apps.bc.edu, or (2) purchase the program through BC’s Research Services. Ask your department administrator about Campuswide GradPlan. Prices start at $29 (price for a six-month license).

Assessment

Grading scale

A    93 – 100%
A-   90 – 92%
B+   87 – 89%
B    83 – 86%
B-   80 – 82%
C    60 – 79%
F    0 – 59%

Task                   Due date                                   Percentage of grade
Article of the week    Weekly (N = 12)                            10 at 1% each: 10%
Project update         September 22; October 6, 20;               6 at 10% each: 60%
                       November 3, 17; December 1
Presentation           December 1                                 10%
Research paper         December 15                                20%

Article of the week: Each week, you will read a published article that uses the method we’re covering that week. I chose articles on substantive (i.e., not methodological) topics that have been published recently in major sociology journals, but I did not select based on execution or explanation of methods. That is, as you read the articles, you should critique them: Does this analysis best answer the research question given the data the researchers had available, or is there a discrepancy between the research question and its empirical operationalization? Would you have chosen different statistics instead of or in addition to the statistics employed? Were you left with any critiques of the data or methods, or did the authors anticipate your concerns? If you had the data at hand, would you be able to replicate the analysis? Were the results interpreted clearly and correctly? Were the results presented effectively in tables and/or figures? Are the interpretations fair, or do they seem to go beyond what the data can really support (e.g., are there causal claims based on cross-sectional data)? Make a list of what you think the authors did well and what they did poorly.
In class we’ll discuss the article of the week, and you’ll submit your list to receive credit. I will simply note whether your work is complete or incomplete rather than judge the content of your responses. Because the purpose of this exercise is to prepare for class discussion, and because I am requiring only 10 out of 12 possible lists, I will not accept late work.

Research project: I find that the best way to learn statistics is to practice them on real data that mean something to you. Therefore, the major product of this course will be a journal-style research article using two or more of the methods covered in class from September 22 onward (e.g., a test of mediation in a complex survey dataset, an outcome that requires binary logistic regression and one that requires Poisson regression, a multinomial logistic regression on multiply-imputed data). If you do not already have your own data, many datasets are publicly available, and many more are available through BC’s subscription to ICPSR.

Every other week during the semester, you will submit a project update. The first “update” will be a project proposal, and updates after that will be drafts of sections of the paper. For instance, for one update you might submit the methods section, for the next update you might draft the introduction, and for the update after that you might submit a draft of the methods section that you’ve revised based on my feedback. Precisely what you turn in will be up to you, although I’m happy to make recommendations on a case-by-case basis.

On December 1, you’ll give a 15-20 minute conference-style presentation of your project in class, and on December 15, you’ll submit your completed paper. I’ll provide further details about length, formatting, etc. in future classes. Late final papers will not be accepted. Project updates submitted after Wednesday at 11:59 pm will be graded, but will incur a late penalty of 10 points per day.
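To see how the pieces of the grade fit together, the short Python sketch below adds up the weighted components from the Assessment table and applies the grading scale. It is only an illustration under several assumptions that the syllabus does not state: project updates, the presentation, and the paper are each scored from 0 to 100; the late penalty is subtracted from an update's score and cannot push it below zero; and the lower bound of each letter range acts as a cutoff, with no rounding. The function names are hypothetical.

```python
# Illustrative sketch only: weights and letter ranges come from the syllabus;
# the 0-100 scoring scale, the floor at zero, and the cutoff rule are assumptions.

LETTER_CUTOFFS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"),
    (80, "B-"), (60, "C"), (0, "F"),
]


def late_adjusted(score, days_late):
    """Subtract 10 points per day late from a project update score (assumed floor of 0)."""
    return max(0, score - 10 * days_late)


def course_percentage(lists_completed, update_scores, presentation, paper):
    """Combine the four graded tasks into a 0-100 course percentage."""
    article_part = min(lists_completed, 10)   # up to 10 lists at 1% each
    update_part = sum(update_scores) / 10     # 6 updates at 10% each
    presentation_part = presentation / 10     # presentation is worth 10%
    paper_part = paper / 5                    # research paper is worth 20%
    return article_part + update_part + presentation_part + paper_part


def letter_grade(percentage):
    """Map a course percentage to a letter using the lower bound of each range."""
    for cutoff, letter in LETTER_CUTOFFS:
        if percentage >= cutoff:
            return letter
    return "F"


# Hypothetical student: 11 lists submitted, third update turned in one day late.
updates = [90, 85, late_adjusted(95, 1), 88, 92, 90]
total = course_percentage(11, updates, presentation=90, paper=88)
print(round(total, 1), letter_grade(total))
```

Because only 10 of the 12 article lists count, submitting an eleventh or twelfth list does not raise the total in this sketch.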
Academic Honesty

Your work must be your words and ideas. When writing papers, use quotation marks around someone else’s exact words and identify whose words they are. If you come across a good idea, by all means use it in your writing, but be sure to acknowledge whose idea it is. Do not allow another student to copy your work. Failure to comply will result in (a) automatic failure of the assignment, and (b) a report to the Dean and the Committee on Academic Integrity. For further information, please review the College’s policies on academic integrity here: http://www.bc.edu/offices/stserv/academic/resources/policy.html#integrity

Schedule

September 8: Ordinary Least Squares (OLS) Regression: Review and Diagnostics

Read for 09/15:
Acock chapter 8 and pp. 219-250
Long & Freese (LF) chapter 2
McVeigh, Rory and Maria-Elena D. Diaz. 2009. Voting to ban same-sex marriage: Interests, values, and communities. American Sociological Review 74: 891-915. doi: 10.1177/000312240907400603

Additional sources on OLS regression, for your reference:
Berry, William D. 1993. Understanding Regression Assumptions. Sage. ISBN: 9780803942639
Cohen, Jacob. 1994. The Earth is round (p < .05). American Psychologist 49: 997-1003.
Fox, John. 1991. Regression Diagnostics. Sage. ISBN: 9780803939714
Least squares applet: http://hadm.sph.sc.edu/COURSES/J716/demos/LeastSquares/LeastSquaresDemo.html
Raftery, Adrian E. 1995. Bayesian model selection in social research. Sociological Methodology 25: 111-63.

September 15: Using Stata

Submit:
Assessment of McVeigh and Diaz

Read for 09/22:
LF chapter 3
Rodgers-Farmer, Antoinette and Diane Davis. 2001. Analyzing complex survey data. Social Work Research 25(3): 185-192.

Additional sources on Stata, for your reference:
Acock chapters 1-4, pp. 316-321.
LF chapter 1
Stata Journal
The Getting Started with Stata (Mac, Unix, and Windows versions) manuals, available for loan at O’Neill or for purchase at http://www.stata.com/bookstore/gs.html
Stata User’s Guide, available for loan at O’Neill or for purchase at http://www.stata.com/bookstore/guide.html
Resources to help you learn and use Stata: http://www.ats.ucla.edu/stat/stata/
SPost: Postestimation analysis with Stata: http://www.indiana.edu/~jslsoc/spost.htm

September 22: Complex Survey Data

Submit:
Research project proposal

Read for 09/29:
Hayes, Andrew F. 2009. Beyond Baron and Kenny: Statistical mediation analysis in the new millennium. Communication Monographs 76(4): 408-20. doi: 10.1080/03637750903310360
MacKinnon, David P., Amanda J. Fairchild, and Matthew S. Fritz. 2007. Mediation analysis. Annual Review of Psychology 58: 593-614. doi: 10.1146/annurev.psych.58.110405.085542
Morgan, Stephen L. and Jennifer J. Todd. 2009. Intergenerational closure and academic achievement in high school: A new evaluation of Coleman’s conjecture. Sociology of Education 82: 267-86. doi: 10.1177/003804070908200304

Additional sources on complex survey data, for your reference:
Acock pp. 240-242 (on weighting).
Heeringa, Steven G., Brady T. West, and Patricia A. Berglund. 2010. Applied Survey Data Analysis. CRC Press. ISBN: 9781420080667
Lee, Eun Sul and Ronald N. Forthofer. 2006. Analyzing Complex Survey Data. Sage. ISBN: 9780761930389
Winship, Christopher and Larry Radbill. 1994. Sampling weights and regression analysis. Sociological Methods and Research 23(2): 230-57. doi: 10.1177/0049124194023002004

September 29: Mediation

Submit:
Assessment of Morgan and Todd

Read for 10/06:
Acock pp. 250-255
LF pp. 423-427
Beyerlein, Kraig and David Sikkink. 2008. Sorrow and solidarity: Why Americans volunteered for 9/11 relief efforts. Social Problems 55(2): 190-215.
doi: 10.1525/sp.2008.55.2.190
Brambor, Thomas and William Roberts Clark. 2006. Understanding interaction models: Improving empirical analyses. Political Analysis 14: 63-82. doi: 10.1093/pan/mpi014

Additional sources on mediation, for your reference:
Iacobucci, Dawn. 2008. Mediation Analysis. Sage. ISBN: 9781412925693
MacKinnon, David P., Chondra M. Lockwood, Jeanne M. Hoffman, Stephen G. West, and Virgil Sheets. 2002. A comparison of methods to test mediation and other intervening variable effects. Psychological Methods 7: 83-104. doi: 10.1037/1082-989X.7.1.83
PRODCLIN: http://www.public.asu.edu/~davidpm/ripl/Prodclin/
Statistical mediation: http://www.public.asu.edu/~davidpm/ripl/mediate.htm

October 6: Moderation

Submit:
Research project update
Assessment of Beyerlein and Sikkink

Read for 10/13:
McKnight et al. chapters 1-5
Schnittker, Jason. 2009. Mirage of health in the era of biomedicalization: Evaluating change in the threshold of illness, 1972-1996. Social Forces 87: 2155-82. doi: 10.1353/sof.0.0218

Additional sources on moderation, for your reference:
Jaccard, James and Robert Turrisi. 2003. Interaction Effects in Multiple Regression. Sage. ISBN: 9780761927426
Miller, Jane E. 2004. The Chicago Guide to Writing about Numbers. University of Chicago Press. ISBN: 0226526313
Moderation variables: An introduction: http://davidakenny.net/cm/moderation.htm
Statistical power calculators:
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/PowerSampleSize
http://www.danielsoper.com/statcalc/default.aspx#c17
http://www.psycho.uni-duesseldorf.de/aap/projects/gpower/

October 13: Missing Data

Submit:
Assessment of Schnittker

Read for 10/20:
Baller, Robert D. and Kelly K. Richardson. 2009. The “dark side” of the strength of weak ties: The diffusion of suicidal thoughts. Journal of Health and Social Behavior 50(3): 261-76. doi: 10.1177/002214650905000302
McKnight et al.
chapters 6-11

Additional sources on missing data, for your reference:
Acock, Alan C. 2005. Working with missing values. Journal of Marriage and Family 67(4): 1012-28. doi: 10.1111/j.1741-3737.2005.00191.x
Allison, Paul D. 2001. Missing Data. Sage. ISBN: 9780761916727
Little, Roderick J. 1988. A test of missing completely at random for multivariate data with missing values. Journal of the American Statistical Association 83: 1198-202.
Little, Roderick J. and Donald B. Rubin. 2002. Statistical Analysis with Missing Data, 2nd Edition. New York, NY: Wiley. ISBN: 9780471183860

October 20: Missing Data

Submit:
Research project update
Assessment of Baller and Richardson

Read for 10/27:
Acock chapter 11
LF chapter 4
Wiepking, Pamala and Ineke Maas. 2009. Resources that make you generous: Effects of social and human resources on charitable giving. Social Forces 87(4): 1973-95. doi: 10.1353/sof.0.0191

October 27: Binary Outcomes

Submit:
Assessment of Wiepking and Maas

Read for 11/03:
LF chapter 5
Liu, Ka-Yuet, Marissa King, and Peter S. Bearman. 2010. Social influence and the autism epidemic. American Journal of Sociology 115(5): 1387-434.

Additional sources on binary outcomes, for your reference:
Afifi, Abdelmonem, Virginia A. Clark, and Susanne May. 2004. Computer-Aided Multivariate Analysis, 4th Edition (Chapter 12). Chapman & Hall. ISBN: 9781584883081
Hilbe, Joseph M. 2009. Logistic Regression Models (Chapters 1-9). Taylor & Francis. ISBN: 9781420075755
Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables (Chapter 3). Sage. ISBN: 0803973748
Menard, Scott. 1995. Applied Logistic Regression Analysis. Sage. ISBN: 9780761922087
Pampel, Fred C. 2000. Logistic Regression: A Primer. Sage. ISBN: 9780761920106
Peng, Chao-Ying Joanne, Kuk Lida Lee, and Gary M. Ingersoll. 2002. An introduction to logistic regression analysis and reporting.
The Journal of Educational Research 96(1): 3-14. doi: 10.1080/00220670209598786
Powers, Daniel A. and Yu Xie. 2000. Statistical Methods for Categorical Data Analysis (Chapter 3). Academic Press. ISBN: 9780125637367
Thou Shalt Not Report Odds Ratios: http://itre.cis.upenn.edu/~myl/languagelog/archives/004767.html

November 3: Ordinal Outcomes

Submit:
Research project update
Assessment of Liu, King, and Bearman

Read for 11/10:
LF chapter 6
Griffin, Larry J. and Kenneth A. Bollen. 2009. What do these memories do? Civil rights remembrance and racial attitudes. American Sociological Review 74: 594-614. doi: 10.1177/000312240907400405

Additional sources on ordinal outcomes, for your reference:
Borooah, Vani K. 2002. Logit and Probit: Ordered and Multinomial Models. Sage. ISBN: 0761922423
Fullerton, Andrew S. 2009. A conceptual framework for ordered logistic regression models. Sociological Methods and Research 38(2): 306-47. doi: 10.1177/0049124109346162
Hilbe, Joseph M. 2009. Logistic Regression Models (Chapter 10). Taylor & Francis. ISBN: 9781420075755
Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables (Chapter 5). Sage. ISBN: 0803973748
O’Connell, Ann A. 2005. Logistic Regression Models for Ordinal Response Variables. Sage. ISBN: 9780761929895
Powers, Daniel A. and Yu Xie. 2008. Statistical Methods for Categorical Data Analysis (Chapter 6). Academic Press. ISBN: 9780125637367

November 10: Nominal Outcomes

Submit:
Assessment of Griffin and Bollen

Read for 11/17:
Bjornstrom, Eileen E. S., Robert L. Kaufman, Ruth D. Peterson, and Michael D. Slater. 2010. Race and ethnic representations of lawbreakers and victims in crime news: A national study of television coverage. Social Problems 57(2): 269-93. doi: 10.1525/sp.2010.57.2.269
LF chapters 7-8
Mood, Carina. 2010. Neighborhood social influence and welfare receipt in Sweden: A panel data analysis. Social Forces 88(3): 1331-56.
doi: 10.1353/sof.0.0304

Additional sources on nominal outcomes, for your reference:
Borooah, Vani K. 2002. Logit and Probit: Ordered and Multinomial Models. Sage. ISBN: 0761922423
Hilbe, Joseph M. 2009. Logistic Regression Models (Chapter 11). Taylor & Francis. ISBN: 9781420075755
Hoffman, Saul D. and Greg J. Duncan. 1988. Multinomial and conditional logit discrete-choice models in demography. Demography 25(3): 415-27.
Liao, Tim Futing. 1994. Interpreting Probability Models. Sage. ISBN: 0803949995
Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables (Chapter 6). Sage. ISBN: 0803973748
McFadden, Daniel. 1973. Conditional logit analysis of qualitative choice behavior. Pp. 105-42 in P. Zarembka (Ed.), Frontiers of Econometrics, New York, NY, Academic Press. ISBN: 9780127761503
Powers, Daniel A. and Yu Xie. 2008. Statistical Methods for Categorical Data Analysis (Chapter 7). Academic Press. ISBN: 9780125637367
On relative risk ratios (exponentiated betas) in Stata: http://www.stata.com/statalist/archive/2005-04/msg00678.html

November 17: Count Data

Submit:
Research project update
Assessment of Bjornstrom, Kaufman, Peterson, and Slater
Assessment of Mood

Read for 12/01:
Hook, Jennifer L. 2010. Gender inequality in the welfare state: Sex segregation in housework, 1965-2003. American Journal of Sociology 115(5): 1480-523.

Additional sources on count data, for your reference:
Cameron, A. Colin and Pravin K. Trivedi. 1998. Regression Analysis of Count Data. Cambridge University Press. ISBN: 0521632013
Coxe, Stefany, Stephen G. West, and Leona S. Aiken. 2009. The analysis of count data: A gentle introduction to Poisson regression and its alternatives. Journal of Personality Assessment 91(2): 121-36. doi: 10.1080/00223890802634175
Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables (Chapter 8). Sage. ISBN: 0803973748
Matthews, David E. and Vernon T.
Farewell. 2007. Using and Understanding Medical Statistics (Chapter 12). Karger. ISBN: 3805581890
Wooldridge, Jeffrey M. 2002. Econometric Analysis of Cross Section and Panel Data (Chapter 19). MIT Press. ISBN: 0262232197

November 24: No class, Thanksgiving

December 1: Presentations

Submit:
Research project update
Assessment of Hook

Read for 12/08:
Afifi, Abdelmonem, Virginia A. Clark, and Susanne May. 2004. Chapter 17 in Computer-Aided Multivariate Analysis, 4th ed. Boca Raton, FL: CRC Press.
Charles, Maria and Karen Bradley. 2009. Indulging our gendered selves? Sex segregation by field of study in 44 countries. American Journal of Sociology 114(4): 924-76.
Foster, Jeremy J., Emma Barkus, and Christian Yavorsky. 2006. Pp. 47-56 in Understanding and Using Advanced Statistics. Thousand Oaks, CA: Sage.

December 8: Log-Linear Analysis

Submit:
Assessment of Charles and Bradley

Additional sources on log-linear analysis, for your reference:
Hardy, Melissa A. and Alan Bryman. 2009. The Handbook of Data Analysis (Chapter 12). Sage. ISBN: 9781848601161
Ishii-Kuntz, Masako. 1994. Ordinal Log-Linear Models. Sage. ISBN: 9780803943766
Knoke, David and Peter J. Burke. 1980. Log-Linear Models. Sage. ISBN: 080391492X
Powers, Daniel A. and Yu Xie. 2008. Statistical Methods for Categorical Data Analysis (Chapter 4). Academic Press. ISBN: 9780125637367