Please note that this syllabus should be regarded as only a general guide to the course. The instructor may have changed specific course content and requirements subsequent to posting this syllabus.
Last Modified: 15:11:09 08/26/2011

SC 704: Topics in Multivariate Statistics
Fall 2011
Tuesday/Thursday 3:00 – 4:15 pm
O’Neill 245

Professor: Sara Moorman
Office: 404 McGuinn Hall
Office hours: Mondays 1:15-3:15 pm or by appointment
E-mail: moormans@bc.edu
Phone: (617) 552-4209

About the Course

This applied course is designed for students in sociology, education, nursing, organizational studies, political science, psychology, or social work with a prior background in statistics at the level of SC703: Multivariate Statistics. It assumes a strong grounding in multivariate regression analysis. The major topics of the course will include OLS regression diagnostics; binary, ordered, and multinomial logistic regression; models for the analysis of count data (e.g., Poisson and negative binomial regression); treatment of missing data; and the analysis of clustered and stratified samples. All analyses in the course will be conducted using Stata, but no previous Stata experience is necessary.

Readings

Required textbooks:
Enders, Craig K. 2010. Applied Missing Data Analysis. ISBN: 9781606236390
Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. ISBN: 0803973748
Long, J. Scott and Jeremy Freese. 2006. Regression Models for Categorical Dependent Variables Using Stata. 2nd ed. ISBN: 1597180114

Recommended textbook:
Acock, Alan C. 2010. A Gentle Introduction to Stata. 3rd ed. ISBN: 1597180750

Course reserves online:
Access “read for next week” entries that are not from the required purchases as .pdf files through the library website (http://www.bc.edu/libraries/) or through the link on the course Blackboard page (https://cms.bc.edu/webct/entryPageIns.dowebct).

Software

This course requires the use of the program Stata. The most current version is available on the computers in the Sociology graduate student lounge. For use on your own computer, you have two options: (1) access the program through remote connection to apps.bc.edu, or (2) purchase the program through BC’s Research Services. Ask your department administrator about Campuswide GradPlan. Prices start at $29 (price for a six-month student license).

Assessment

Grading scale
A     93 – 100%
A-    90 – 92%
B+    87 – 89%
B     83 – 86%
B-    80 – 82%
C     60 – 79%
F     0 – 59%

Task                   Due date                     Percentage of grade
Article of the week    Weekly, Tuesdays (N = 10)    10 at 1% each: 10%
Project proposal       September 20                 10%
Project update I       October 18                   20%
Project update II      November 15                  20%
Presentation           December 6 or 8              20%
Final paper draft      December 15                  20%

Article critique of the week

For each week, I’ve selected a recent publication from a major sociology journal that uses a technique we’re covering in class that week. The authors’ use of methods might be exemplary, or it might leave you wondering why such a good journal accepted such a weak paper! Make a list of what you think the authors did well and what they did poorly. In class we’ll discuss the article, and you’ll submit your list to receive credit. I will simply note whether your work is complete or incomplete rather than judge the content of your responses, so don’t worry if you don’t understand every last thing the authors did. Bring your questions to class and we’ll work them out. Because the purpose of this exercise is to prepare for class discussion, I will not accept late work.

Specific questions to ask yourself while reading and listmaking include:
Does this analysis best answer the research question given the data the researchers had available, or is there a discrepancy between the research question and its empirical operationalization?
Would you have chosen different statistics instead of or in addition to the statistics employed?
Were you left with any critiques of the data or methods, or did the authors anticipate your concerns?
If you had the data at hand, would you be able to replicate the analysis?
Were the results interpreted clearly and correctly?
Were the results presented effectively in tables and/or figures?
Are the interpretations fair, or do they seem to go beyond what the data can really support?

Research project

I find that the best way to learn statistics is to practice them on real data that mean something to you. Therefore, the major product of this course will be a journal-style research article (i.e., 20-35 pages in length, including the standard sections: title page, abstract, introduction, methods, results, discussion, references, tables/figures). The article is required to include two or more of the methods covered in class from September 20 onward. For example, you might run a test of mediation in a complex survey dataset, or one outcome that requires binary logistic regression and a second that requires Poisson regression, or a multinomial logistic regression on multiply-imputed data. (Neither using Stata for your analyses nor testing a simple OLS regression model “counts” toward your two methods.)

In mid-September you’ll submit a one- or two-page project proposal. In mid-October and mid-November you will submit written “updates” that will be drafts of sections of the paper. For instance, for one update you might submit the introduction and methods sections, and for the next update you might submit the results and discussion sections, or you might want me to look at revisions of your introduction and methods sections. Precisely what you turn in will be up to you, although I’m happy to make recommendations on a case-by-case basis.
On December 6 or 8, you’ll give a conference-style presentation of your project in class, and on December 15, you’ll submit your completed paper. Assignments submitted after 11:59 pm on their due dates will be graded, but will incur a late penalty of 10 points per day.

Although it’s certainly not a requirement, you should seriously consider using this project as an opportunity to meet a degree requirement (e.g., area exams), prepare a conference presentation, and/or develop a submission for publication. If you’re already working on a project, I encourage you to use this course to develop it. If you’re starting from scratch, many datasets are publicly available from universities and government agencies, and many more are available to researchers through BC’s subscription to the Inter-University Consortium for Political and Social Research (ICPSR) at the University of Michigan. Be aware that the deadline to submit a paper for presentation at ASA 2012 in Denver will be in mid-January, and your course paper will fit their submission criteria.

Academic Honesty

Your work must be your words and ideas. When writing papers, use quotation marks around someone else’s exact words and identify whose words they are. If you come across a good idea, by all means use it in your writing, but be sure to acknowledge whose idea it is. Do not allow another student to copy your work. Failure to comply will result in (a) automatic failure of the assignment, and (b) a report to the Dean and the Committee on Academic Integrity. For further information, please review the College’s policies on academic integrity here: http://www.bc.edu/offices/stserv/academic/resources/policy.html#integrity

Schedule

September 6 and September 8
Lecture / Stata session topics:
Sept. 6: Using Stata
Sept. 8: Locating and using data for secondary research, presented by Rani Dalgin and Barbara Mento from BC Research Services
Read for next week:
Long chapters 2 and 4
Long & Freese chapters 2 and 3

September 13 and September 15
Lecture / Stata session topic: Ordinary least squares (OLS) regression: Review and diagnostics
Read for next week:
Glavin, Paul, Scott Schieman, and Sarah Reid. 2011. “Boundary-Spanning Work Demands and Their Consequences for Guilt and Psychological Distress.” Journal of Health and Social Behavior 52(1): 43-57.
Johnson, David R. and Lisa A. Elliott. 1998. “Sampling Design Effects: Do They Affect the Analysis of Data from the National Survey of Families and Households?” Journal of Marriage and Family 60(4): 993-1001.
Kreuter, Frauke and Richard Valliant. 2007. “A Survey on Survey Statistics: What Is Done and Can Be Done in Stata.” Stata Journal 7(1): 1-21.
Winship, Christopher and Larry Radbill. 1994. “Sampling Weights and Regression Analysis.” Sociological Methods and Research 23(2): 230-57.
To do for next week:
Research project proposal
Assessment of Glavin, Schieman, and Reid

September 20 and September 22
Lecture / Stata session topic: Complex survey data
Read for next week:
Baron, Reuben M. and David A. Kenny. 1986. “The Moderator-Mediator Variable Distinction in Social Psychological Research: Conceptual, Strategic, and Statistical Considerations.” Journal of Personality and Social Psychology 51(6): 1173-82.
Hayes, Andrew F. 2009. “Beyond Baron and Kenny: Statistical Mediation Analysis in the New Millennium.” Communication Monographs 76(4): 408-20.
MacKinnon, David P., Amanda J. Fairchild, and Matthew S. Fritz. 2007. “Mediation Analysis.” Annual Review of Psychology 58: 593-614.
Staff, Jeremy, Angel Harris, Ricardo Sabates, and Laine Briddell. 2010. “Uncertainty in Early Occupational Aspirations: Role Exploration or Aimlessness?” Social Forces 89(2): 659-84.
To do for next week:
Assessment of Staff, Harris, Sabates, and Briddell

September 27 and September 29
Lecture / Stata session topic: Mediation
Read for next week:
Fairchild, Amanda J. and Samuel D. McQuillin. 2010. “Evaluating Mediation and Moderation Effects in School Psychology: A Presentation of Methods and Review of Current Practice.” Journal of School Psychology 48: 53-84.
Schmutz, Vaughn, and Alison Faupel. 2010. “Gender and Cultural Consecration in Popular Music.” Social Forces 89(2): 685-708.
Wu, Amery D. and Bruno D. Zumbo. 2008. “Understanding and Using Mediators and Moderators.” Social Indicators Research 87: 367-92.
To do for next week:
Assessment of Schmutz and Faupel

October 4 and October 6
Lecture / Stata session topic: Moderation
Read for next week:
Enders chapters 1, 2, and 10
Robnett, Belinda and Cynthia Feliciano. 2011. “Patterns of Racial/Ethnic Exclusion by Internet Daters.” Social Forces 89(3): 807-828.
To do for next week:
Assessment of Robnett and Feliciano

October 11 and October 13
Lecture / Stata session topic: Missing data
Read for next week:
Enders chapters 7, 8, and 9
Schafer, Markus H. and Tetyana Pylypiv Shippee. 2010. “Age Identity in Context: Stress and the Subjective Side of Aging.” Social Psychology Quarterly 73(3): 245-64.
To do for next week:
Assessment of Schafer and Shippee
Project update I

October 18 and October 20
Lecture / Stata session topic: Missing data
Read for next week:
Long chapter 3
Long & Freese chapter 4
Williams, Kirk R. and Nancy G. Guerra. 2011. “Perceptions of Collective Efficacy and Bullying Perpetration in Schools.” Social Problems 58(1): 126-43.
To do for next week:
Assessment of Williams and Guerra

October 25 and October 27
Lecture / Stata session topic: Binary outcomes
Read for next week:
Long chapter 5
Long & Freese chapter 5
Bailey, Amy Kate, Stewart E. Tolnay, E. M. Beck, and Jennifer D. Laird. Forthcoming. “Targeting Lynch Victims: Social Marginality or Status Transgressions?” American Sociological Review.
To do for next week:
Assessment of Bailey, Tolnay, Beck, and Laird

November 1 and November 8
***Note: No class November 3 due to home football game***
Lecture / Stata session topic: Ordinal outcomes
Read for next week:
Long chapter 6
Long & Freese chapters 6 and 7
Coverdill, James E., Carlos A. Lopez, and Michelle A. Petrie. 2011. “Race, Ethnicity, and the Quality of Life in America, 1972-2008.” Social Forces 89(3): 783-806.
To do for next week:
Assessment of Coverdill, Lopez, and Petrie

November 10 and November 15
Lecture / Stata session topic: Nominal outcomes
Read for next week:
Buchmann, Claudia, Dennis J. Condron, and Vincent J. Roscigno. 2010. “Shadow Education, American Style: Test Preparation, the SAT, and College Enrollment.” Social Forces 89(2): 435-62. (has commentaries and rejoinder; see separate .pdf files by Sigal Alon, Eric Grodsky, and Claudia Buchmann)
Anderton, Douglas T. and Eric Cheney. 2009. “Log-Linear Analysis.” Pp. 285-306 in The Handbook of Data Analysis, edited by M. A. Hardy and A. Bryman. Thousand Oaks, CA: Sage.
Long chapter 9 section 5 only
To do for next week:
Research project update II
Assessment of Buchmann, Condron, and Roscigno

November 17
Lecture / Stata session topic: Log-linear analysis
To read for 11/29:
Long & Freese chapter 8
Long chapter 8

November 22
Informal peer review of sections of one another’s papers

November 24
No class; happy Thanksgiving!

November 29 and December 1
Lecture topic: Count data
To read for next week:
Faris, Robert and Diane Felmlee. 2011. “Status Struggles: Network Centrality and Gender Segregation in Same and Cross Gender Aggression.” American Sociological Review 76(1): 48-73.
To do for next week:
Assessment of Faris and Felmlee

December 6 and December 8
Class presentations

December 15
Final paper draft due by 11:59 pm

Additional sources, for your reference

On Stata:
o Acock book
o Stata Journal
o The Getting Started with Stata (Mac, Unix, and Windows versions) manuals, available for loan at O’Neill or for purchase at http://www.stata.com/bookstore/gs.html
o Stata User’s Guide, available for loan at O’Neill or for purchase at http://www.stata.com/bookstore/guide.html
o Resources to help you learn and use Stata: http://www.ats.ucla.edu/stat/stata/
o SPost: Postestimation analysis with Stata: http://www.indiana.edu/~jslsoc/spost.htm

Relevant titles in the “little green” Sage series:
o Allison, Paul D. 2001. Missing Data.
o Berry, William D. 1993. Understanding Regression Assumptions.
o Borooah, Vani K. 2002. Logit and Probit: Ordered and Multinomial Models.
o Fox, John. 1991. Regression Diagnostics.
o Iacobucci, Dawn. 2008. Mediation Analysis.
o Ishii-Kuntz, Masako. 1994. Ordinal Log-Linear Models.
o Jaccard, James and Robert Turrisi. 2003. Interaction Effects in Multiple Regression.
o Knoke, David and Peter J. Burke. 1980. Log-Linear Models.
o Lee, Eun Sul and Ronald N. Forthofer. 2006. Analyzing Complex Survey Data.
o Liao, Tim Futing. 1994. Interpreting Probability Models.
o Menard, Scott. 1995. Applied Logistic Regression Analysis.
o O’Connell, Ann A. 2005. Logistic Regression Models for Ordinal Response Variables. Sage.
o Pampel, Fred C. 2000. Logistic Regression: A Primer.

Full-length textbooks:
o Afifi, Abdelmonem, Virginia A. Clark, and Susanne May. 2004. Computer-Aided Multivariate Analysis, 4th Edition. Chapman & Hall.
o Cameron, A. Colin, and Pravin K. Trivedi. 1998. Regression Analysis of Count Data. Cambridge University Press.
o Heeringa, Steven G., Brady T. West, and Patricia A. Berglund. 2010. Applied Survey Data Analysis. CRC Press.
o Hilbe, Joseph M. 2009. Logistic Regression Models. Taylor & Francis.
o Little, Roderick J. and Donald B. Rubin. 2002. Statistical Analysis with Missing Data, 2nd Edition. New York, NY: Wiley.
o Matthews, David E. and Vernon T. Farewell. 2007. Using and Understanding Medical Statistics. Karger.
o Powers, Daniel A. and Yu Xie. 2000. Statistical Methods for Categorical Data Analysis. Academic Press.
o Wooldridge, Jeffrey M. 2002. Econometric Analysis of Cross Section and Panel Data. MIT Press.

On the web:
Least squares: http://hadm.sph.sc.edu/COURSES/J716/demos/LeastSquares/LeastSquaresDemo.html
PRODCLIN: http://www.public.asu.edu/~davidpm/ripl/Prodclin/
Mediation: http://www.public.asu.edu/~davidpm/ripl/mediate.htm
Moderation: An introduction: http://davidakenny.net/cm/moderation.htm
Statistical power calculators:
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/PowerSampleSize
http://www.danielsoper.com/statcalc/default.aspx#c17
http://www.psycho.uni-duesseldorf.de/aap/projects/gpower/

Other books, chapters, and articles:
Acock, Alan C. 2005. “Working with Missing Values.” Journal of Marriage and Family 67(4): 1012-28.
Cohen, Jacob. 1994. “The Earth Is Round (p < .05).” American Psychologist 49: 997-1003.
Coxe, Stefany, Stephen G. West, and Leona S. Aiken. 2009. “The Analysis of Count Data: A Gentle Introduction to Poisson Regression and its Alternatives.” Journal of Personality Assessment 91(2): 121-36.
Fullerton, Andrew S. 2009. “A Conceptual Framework for Ordered Logistic Regression Models.” Sociological Methods and Research 38(2): 306-47.
Hoffman, Saul D. and Greg J. Duncan. 1988. “Multinomial and Conditional Logit Discrete-Choice Models in Demography.” Demography 25(3): 415-27.
Little, Roderick J. 1988. “A Test of Missing Completely at Random for Multivariate Data with Missing Values.” Journal of the American Statistical Association 83: 1198-202.
MacKinnon, David P., Chondra M. Lockwood, Jeanne M. Hoffman, Stephen G. West, and Virgil Sheets. 2002. “A Comparison of Methods to Test Mediation and Other Intervening Variable Effects.” Psychological Methods 7: 83-104.
McFadden, Daniel. 1973. “Conditional Logit Analysis of Qualitative Choice Behavior.” Pp. 105-42 in Frontiers in Econometrics, edited by P. Zarembka. New York, NY: Academic Press.
Miller, Jane E. 2005. The Chicago Guide to Writing about Multivariate Analysis. University of Chicago Press. ***Highly recommended!***
Peng, Chao-Ying Joanne, Kuk Lida Lee, and Gary M. Ingersoll. 2002. “An Introduction to Logistic Regression Analysis and Reporting.” The Journal of Educational Research 96(1): 3-14.
Raftery, Adrian E. 1995. “Bayesian Model Selection in Social Research.” Sociological Methodology 25: 111-63.