SC 704: Topics in Multivariate Statistics Fall 2010

Wednesday 3:00 - 5:20 pm
306 Carney Hall
Professor: Sara Moorman
Office: 404 McGuinn Hall
Office hours: Mondays 1:00-3:00 pm or by appointment
E-mail: Sara.Moorman.1@bc.edu Please e-mail me from your BC account, and include “SC704”
in the subject line. My response may be delayed up to 24 hours.
Phone: (617) 552 - 4209
About the Course
This applied course is designed for students in sociology, education, nursing, organizational
studies, political science, psychology, or social work with a prior background in statistics at the
level of SC703: Multivariate Statistics. It assumes a strong grounding in multivariate regression
analysis. The major topics of the course will include OLS regression diagnostics; binary,
ordered, and multinomial logistic regression; models for the analysis of count data (e.g., Poisson
and negative binomial regression); treatment of missing data; and the analysis of clustered and
stratified samples. All analyses in the course will be conducted using Stata, but no previous Stata
experience is necessary.
Required Readings
Texts to purchase:
Acock, Alan C. 2008. A Gentle Introduction to Stata. 2nd ed. College Station, TX: Stata Press.
Long, J. Scott and Jeremy Freese. 2006. Regression Models for Categorical Dependent
Variables Using Stata. 2nd ed. College Station, TX: Stata Press.
McKnight, Patrick E., Katherine M. McKnight, Souraya Sidani, and Aurelio Jose Figueredo.
2007. Missing Data: A Gentle Introduction. New York: Guilford Press.
Course reserves online:
Access “read for next week” entries that are not from the required purchases as .pdf files through
the library website (http://www.bc.edu/libraries/) or through the link on the course Blackboard
page.
Blackboard
Visit the Blackboard page for this course regularly for announcements, grades, and course
materials.
Required Software
This course requires the use of the program Stata. The most current version is available on the
computers in the Sociology graduate student lounge. For use on your own computer, you have
two options: (1) access the program through remote connection to apps.bc.edu, or (2) purchase
the program through BC’s Research Services. Ask your department administrator about
Campuswide GradPlan. Prices start at $29 (price for a six-month license).
Assessment
Grading scale
A    93 – 100%
A-   90 – 92%
B+   87 – 89%
B    83 – 86%
B-   80 – 82%
C    60 – 79%
F    0 – 59%

Task                  Due date                        Percentage of grade
Article of the week   Weekly (N = 12)                 10 at 1% each: 10%
Project update        September 22; October 6, 20;    6 at 10% each: 60%
                      November 3, 17; December 1
Presentation          December 1                      10%
Research paper        December 15                     20%
Article of the week: Each week, you will read a published article that uses the method we’re
covering that week. I chose articles on substantive (i.e., not methodological) topics that have
been published recently in major sociology journals, but I did not select based on execution or
explanation of methods. That is, as you read the articles, you should critique them: Does this
analysis best answer the research question given the data the researchers had available, or is there
a discrepancy between the research question and its empirical operationalization? Would you
have chosen different statistics instead of or in addition to the statistics employed? Were you left
with any critiques of the data or methods, or did the authors anticipate your concerns? If you had
the data at hand, would you be able to replicate the analysis? Were the results interpreted clearly
and correctly? Were the results presented effectively in tables and/or figures? Are the
interpretations fair, or do they seem to go beyond what the data can really support (e.g., are there
causal claims based on cross-sectional data)? Make a list of what you think the authors did well
and what they did poorly. In class we’ll discuss the article of the week, and you’ll submit your
list to receive credit. I will simply note whether your work is complete or incomplete rather than
judge the content of your responses. Because the purpose of this exercise is to prepare for class
discussion, and because I am requiring only 10 out of 12 possible lists, I will not accept late
work.
Research project: I find that the best way to learn statistics is to practice them on real data that
mean something to you. Therefore, the major product of this course will be a journal-style
research article using two or more of the methods covered in class from September 22 onward
(e.g., a test of mediation in a complex survey dataset, an outcome that requires binary logistic
regression and one that requires Poisson regression, a multinomial logistic regression on
multiply-imputed data). If you do not already have your own data, many datasets are publicly
available, and many more are available through BC’s subscription to ICPSR. Every other week
during the semester, you will submit a project update. The first “update” will be a project
proposal, and updates after that will be drafts of sections of the paper. For instance, for one
update you might submit the methods section, for the next update you might draft the
introduction, and for the update after that you might submit a draft of the methods section that
you’ve revised based on my feedback. Precisely what you turn in will be up to you, although I’m
happy to make recommendations on a case-by-case basis. On December 1, you’ll give a 15-20
minute conference style presentation of your project in class, and on December 15, you’ll submit
your completed paper. I’ll provide further details about length, formatting, etc. in future classes.
The final paper may not be submitted late. Project updates submitted after 11:59 pm on the
Wednesday they are due will be graded, but will incur a late penalty of 10 points per day.
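For students who want to track their standing, the following is a minimal sketch of how the weights and grading scale under Assessment combine. It rests on several assumptions that the syllabus does not state: each project update, the presentation, and the paper are scored out of 100; the late penalty is taken off that 100-point score; and the final percentage is rounded to the nearest whole number before the letter grade is looked up. The function names are hypothetical and this is not an official grade calculator.

```python
import math

# Lower bound of each letter-grade band, in percent, from the grading scale.
GRADE_BANDS = [(93, "A"), (90, "A-"), (87, "B+"), (83, "B"),
               (80, "B-"), (60, "C"), (0, "F")]


def update_score(raw_points, days_late=0):
    """Score for one project update after the 10-points-per-day late penalty.

    Assumes raw_points is on a 0-100 scale and the score cannot go below 0.
    """
    return max(0.0, float(raw_points) - 10 * days_late)


def course_percentage(lists_completed, update_scores, presentation, paper):
    """Weighted course percentage from the four graded components."""
    articles = float(min(lists_completed, 10))  # 10 lists at 1% each
    updates = sum(update_scores) / 10           # 6 updates at 10% each
    return articles + updates + presentation / 10 + paper / 5


def letter_grade(percentage):
    """Map a percentage to a letter, rounding half up (an assumption)."""
    rounded = math.floor(percentage + 0.5)
    for lower_bound, letter in GRADE_BANDS:
        if rounded >= lower_bound:
            return letter
    return "F"


if __name__ == "__main__":
    # Hypothetical student: all 12 lists submitted, one update two days late.
    scores = [90, 80, 100, update_score(95, days_late=2), 95, 90]
    pct = course_percentage(12, scores, presentation=90, paper=88)
    print(f"{pct:.1f}% -> {letter_grade(pct)}")
```

Submitting all 12 article lists earns the same 10% as submitting 10, which is why the sketch caps that component.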
Academic Honesty
Your work must be your words and ideas. When writing papers, use quotation marks around
someone else’s exact words and identify whose words they are. If you come across a good idea,
by all means use it in your writing, but be sure to acknowledge whose idea it is. Do not allow
another student to copy your work. Failure to comply will result in (a) automatic failure of the
assignment, and (b) a report to the Dean and the Committee on Academic Integrity. For further
information, please review the College’s policies on academic integrity here:
http://www.bc.edu/offices/stserv/academic/resources/policy.html#integrity
Schedule
September 8: Ordinary Least Squares (OLS) Regression: Review and Diagnostics
Read for 09/15:

Acock chapter 8 and pp. 219-250

Long & Freese (LF) chapter 2

McVeigh, Rory and Maria-Elena D. Diaz. 2009. Voting to ban same-sex marriage:
Interests, values, and communities. American Sociological Review 74: 891-915. doi:
10.1177/000312240907400603
Additional sources on OLS regression, for your reference:

Berry, William D. 1993. Understanding Regression Assumptions. Sage. ISBN:
9780803942639

Cohen, Jacob. 1994. The Earth is round (p < .05). American Psychologist 49: 997-1003.

Fox, John. 1991. Regression Diagnostics. Sage. ISBN: 9780803939714

Least squares applet:
http://hadm.sph.sc.edu/COURSES/J716/demos/LeastSquares/LeastSquaresDemo.html

Raftery, Adrian E. 1995. Bayesian model selection in social research. Sociological
Methodology 25: 111-63.
September 15: Using Stata
Submit:

Assessment of McVeigh and Diaz
Read for 09/22:

LF chapter 3

Rodgers-Farmer, Antoinette and Diane Davis. 2001. Analyzing complex survey data.
Social Work Research 25(3): 185-192.
Additional sources on Stata, for your reference:

Acock chapters 1-4, pp. 316-321.

LF chapter 1

Stata Journal

The Getting Started with Stata (Mac, Unix, and Windows versions) manuals, available
for loan at O’Neill or for purchase at http://www.stata.com/bookstore/gs.html

Stata User’s Guide, available for loan at O’Neill or for purchase at
http://www.stata.com/bookstore/guide.html

Resources to help you learn and use Stata: http://www.ats.ucla.edu/stat/stata/

SPost: Postestimation analysis with Stata: http://www.indiana.edu/~jslsoc/spost.htm
September 22: Complex Survey Data
Submit:

Research project proposal
Read for 09/29:

Hayes, Andrew F. 2009. Beyond Baron and Kenny: Statistical mediation analysis in the
new millennium. Communication Monographs 76(4): 408-20. doi:
10.1080/03637750903310360

MacKinnon, David P., Amanda J. Fairchild, and Matthew S. Fritz. 2007. Mediation
analysis. Annual Review of Psychology 58: 593-614. doi:
10.1146/annurev.psych.58.110405.085542

Morgan, Stephen L. and Jennifer J. Todd. 2009. Intergenerational closure and academic
achievement in high school: A new evaluation of Coleman’s conjecture. Sociology of
Education 82: 267-86. doi: 10.1177/003804070908200304
Additional sources on complex survey data, for your reference:

Acock pp. 240-242 (on weighting).

Heeringa, Steven G., Brady T. West, and Patricia A. Berglund. 2010. Applied Survey
Data Analysis. CRC Press. ISBN: 9781420080667

Lee, Eun Sul and Ronald N. Forthofer. 2006. Analyzing Complex Survey Data. Sage.
ISBN: 9780761930389

Winship, Christopher and Larry Radbill. 1994. Sampling weights and regression
analysis. Sociological Methods and Research 23(2): 230-57. doi:
10.1177/0049124194023002004
September 29: Mediation
Submit:

Assessment of Morgan and Todd
Read for 10/06:

Acock pp. 250-255

LF pp. 423-427

Beyerlein, Kraig and David Sikkink. 2008. Sorrow and solidarity: Why Americans
volunteered for 9/11 relief efforts. Social Problems 55(2): 190-215. doi:
10.1525/sp.2008.55.2.190

Brambor, Thomas and William Roberts Clark. 2006. Understanding interaction models:
Improving empirical analyses. Political Analysis 14: 63-82. doi: 10.1093/pan/mpi014
Additional sources on mediation, for your reference:

Iacobucci, Dawn. 2008. Mediation Analysis. Sage. ISBN: 9781412925693

MacKinnon, David P., Chondra M. Lockwood, Jeanne M. Hoffman, Stephen G. West,
and Virgil Sheets. 2002. A comparison of methods to test mediation and other
intervening variable effects. Psychological Methods 7: 83-104. doi:
10.1037//1082-989X.7.1.83

PRODCLIN: http://www.public.asu.edu/~davidpm/ripl/Prodclin/

Statistical mediation: http://www.public.asu.edu/~davidpm/ripl/mediate.htm
October 6: Moderation
Submit:

Research project update

Assessment of Beyerlein and Sikkink
Read for 10/13:

McKnight et al. chapters 1-5

Schnittker, Jason. 2009. Mirage of health in the era of biomedicalization: Evaluating
change in the threshold of illness, 1972-1996. Social Forces 87: 2155-82. doi:
10.1353/sof.0.0218
Additional sources on moderation, for your reference:

Jaccard, James and Robert Turrisi. 2003. Interaction Effects in Multiple Regression.
Sage. ISBN: 9780761927426

Miller, Jane E. 2004. The Chicago Guide to Writing about Numbers. University of
Chicago Press. ISBN: 0226526313

Moderation variables: An introduction: http://davidakenny.net/cm/moderation.htm

Statistical power calculators:
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/PowerSampleSize
http://www.danielsoper.com/statcalc/default.aspx#c17
http://www.psycho.uni-duesseldorf.de/aap/projects/gpower/
October 13: Missing Data
Submit:

Assessment of Schnittker
Read for 10/20:

Baller, Robert D. and Kelly K. Richardson. 2009. The “dark side” of the strength of weak
ties: The diffusion of suicidal thoughts. Journal of Health and Social Behavior 50(3):
261-76. doi: 10.1177/002214650905000302

McKnight et al. chapters 6-11
Additional sources on missing data, for your reference:

Acock, Alan C. 2005. Working with missing values. Journal of Marriage and Family
67(4): 1012-28. doi: 10.1111/j.1741-3737.2005.00191.x

Allison, Paul D. 2001. Missing Data. Sage. ISBN: 9780761916727

Little, Roderick J. 1988. A test of missing completely at random for multivariate data
with missing values. Journal of the American Statistical Association 83: 1198-202.

Little, Roderick J. and Donald B. Rubin. 2002. Statistical Analysis with Missing Data, 2nd
Edition. New York, NY: Wiley. ISBN: 9780471183860
October 20: Missing Data
Submit:

Research project update

Assessment of Baller and Richardson
Read for 10/27:

Acock chapter 11

LF chapter 4

Wiepking, Pamala and Ineke Maas. 2009. Resources that make you generous: Effects of
social and human resources on charitable giving. Social Forces 87(4): 1973-95. doi:
10.1353/sof.0.0191
October 27: Binary Outcomes
Submit:

Assessment of Wiepking and Maas
Read for 11/03:

LF chapter 5

Liu, Ka-Yuet, Marissa King, and Peter S. Bearman. 2010. Social influence and the autism
epidemic. American Journal of Sociology 115(5): 1387-434.
Additional sources on binary outcomes, for your reference:

Afifi, Abdelmonem, Virginia A. Clark, and Susanne May. 2004. Computer-Aided
Multivariate Analysis, 4th Edition (Chapter 12). Chapman & Hall. ISBN: 9781584883081

Hilbe, Joseph M. 2009. Logistic Regression Models (Chapters 1-9). Taylor & Francis.
ISBN: 9781420075755

Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent
Variables (Chapter 3). Sage. ISBN: 0803973748

Menard, Scott. 1995. Applied Logistic Regression Analysis. Sage. ISBN: 9780761922087

Pampel, Fred C. 2000. Logistic Regression: A Primer. Sage. ISBN: 9780761920106

Peng, Chao-Ying Joanne, Kuk Lida Lee, and Gary M. Ingersoll. 2002. An introduction to
logistic regression analysis and reporting. The Journal of Educational Research 96(1):
3-14. doi: 10.1080/00220670209598786

Powers, Daniel A. and Yu Xie. 2000. Statistical Methods for Categorical Data Analysis
(Chapter 3). Academic Press. ISBN: 9780125637367

Thou Shalt Not Report Odds Ratios:
http://itre.cis.upenn.edu/~myl/languagelog/archives/004767.html
November 3: Ordinal Outcomes
Submit:

Research project update

Assessment of Liu, King, and Bearman
Read for 11/10:

LF chapter 6

Griffin, Larry J. and Kenneth A. Bollen. 2009. What do these memories do? Civil rights
remembrance and racial attitudes. American Sociological Review 74: 594-614. doi:
10.1177/000312240907400405
Additional sources on ordinal outcomes, for your reference:

Borooah, Vani K. 2002. Logit and Probit: Ordered and Multinomial Models. Sage.
ISBN: 0761922423

Fullerton, Andrew S. 2009. A conceptual framework for ordered logistic regression
models. Sociological Methods and Research 38(2): 306-47. doi:
10.1177/0049124109346162

Hilbe, Joseph M. 2009. Logistic Regression Models (Chapter 10). Taylor & Francis.
ISBN: 9781420075755

Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent
Variables (Chapter 5). Sage. ISBN: 0803973748

O’Connell, Ann A. 2005. Logistic Regression Models for Ordinal Response Variables.
Sage. ISBN: 9780761929895

Powers, Daniel A. and Yu Xie. 2008. Statistical Methods for Categorical Data Analysis
(Chapter 6). Academic Press. ISBN: 9780125637367
November 10: Nominal Outcomes
Submit:

Assessment of Griffin and Bollen
Read for 11/17:

Bjornstrom, Eileen E. S., Robert L. Kaufman, Ruth D. Peterson, and Michael D. Slater.
2010. Race and ethnic representations of lawbreakers and victims in crime news: A
national study of television coverage. Social Problems 57(2): 269-93. doi:
10.1525/sp.2010.57.2.269

LF chapters 7-8

Mood, Carina. 2010. Neighborhood social influence and welfare receipt in Sweden: A
panel data analysis. Social Forces 88(3): 1331-56. doi: 10.1353/sof.0.0304
Additional sources on nominal outcomes, for your reference:

Borooah, Vani K. 2002. Logit and Probit: Ordered and Multinomial Models. Sage.
ISBN: 0761922423

Hilbe, Joseph M. 2009. Logistic Regression Models (Chapter 11). Taylor & Francis.
ISBN: 9781420075755

Hoffman, Saul D. and Greg J. Duncan. 1988. Multinomial and conditional logit
discrete-choice models in demography. Demography 25(3): 415-27.

Liao, Tim Futing. 1994. Interpreting Probability Models. Sage. ISBN: 0803949995

Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent
Variables (Chapter 6). Sage. ISBN: 0803973748

McFadden, Daniel. 1973. Conditional logit analysis of qualitative choice behavior. Pp.
105-42 in P. Zarembka (Ed.), Frontiers of Econometrics, New York, NY, Academic
Press. ISBN: 9780127761503

Powers, Daniel A. and Yu Xie. 2008. Statistical Methods for Categorical Data Analysis
(Chapter 7). Academic Press. ISBN: 9780125637367

On relative risk ratios (exponentiated betas) in Stata:
http://www.stata.com/statalist/archive/2005-04/msg00678.html
November 17: Count Data
Submit:

Research project update

Assessment of Bjornstrom, Kaufman, Peterson, and Slater

Assessment of Mood
Read for 12/01:

Hook, Jennifer L. 2010. Gender inequality in the welfare state: Sex segregation in
housework, 1965-2003. American Journal of Sociology 115(5): 1480-523.
Additional sources on count data, for your reference:

Cameron, A. Colin and Pravin K. Trivedi. 1998. Regression Analysis of Count Data.
Cambridge University Press. ISBN: 0521632013

Coxe, Stefany, Stephen G. West, and Leona S. Aiken. 2009. The analysis of count data:
A gentle introduction to Poisson regression and its alternatives. Journal of Personality
Assessment 91(2): 121-36. doi: 10.1080/00223890802634175

Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent
Variables (Chapter 8). Sage. ISBN: 0803973748

Matthews, David E. and Vernon T. Farewell. 2007. Using and Understanding Medical
Statistics (Chapter 12). Karger. ISBN: 3805581890

Wooldridge, Jeffrey M. 2002. Econometric Analysis of Cross Section and Panel Data
(Chapter 19). MIT Press. ISBN: 0262232197
November 24: No class, Thanksgiving
December 1: Presentations
Submit:

Research project update

Assessment of Hook
Read for 12/08:

Afifi, Abdelmonem, Virginia A. Clark, and Susanne May. 2004. Chapter 17 in
Computer-Aided Multivariate Analysis, 4th ed. Boca Raton, FL: CRC Press.

Charles, Maria and Karen Bradley. 2009. Indulging our gendered selves? Sex segregation
by field of study in 44 countries. American Journal of Sociology 114(4): 924-76.

Foster, Jeremy J., Emma Barkus, and Christian Yavorsky. 2006. Pp. 47-56 in
Understanding and Using Advanced Statistics. Thousand Oaks, CA: Sage.
December 8: Log-Linear Analysis
Submit:

Assessment of Charles and Bradley
Additional sources on log-linear analysis, for your reference:

Hardy, Melissa A. and Alan Bryman. 2009. The Handbook of Data Analysis (Chapter
12). Sage. ISBN: 9781848601161

Ishii-Kuntz, Masako. 1994. Ordinal Log-Linear Models. Sage. ISBN: 9780803943766

Knoke, David and Peter J. Burke. 1980. Log-Linear Models. Sage. ISBN: 080391492X

Powers, Daniel A. and Yu Xie. 2008. Statistical Methods for Categorical Data Analysis
(Chapter 4). Academic Press. ISBN: 9780125637367