Statistical Analysis SC504-G-SP / HS927-G-SP
Spring Term 2008
Course Tutor: Leanne Andrews
Tel: 4466
Email: landre@essex.ac.uk
Room: JT18 (Department of Health and Human Sciences)
Office Hours: By appointment only
Times and Rooms
14.00-16.00: Lab G
16.00-17.00: Lab G (optional)
Aims of the course
This course is a practical introduction to analysing data produced by survey research and to its potential for investigating the social world. Using a combination of seminar and computer-lab formats, the course is intended to provide participants with the tools to understand the principles and approaches behind quantitative data analysis. As well as enabling you to conduct investigations relevant to your own research, it will equip you to be a critical user of others’ research.
Participants will cover methods for describing and representing information contained in a range of ‘real-life’ surveys, and will then move on to explore approaches and methods, including multivariate methods, designed to test theories about relationships between individuals’ characteristics, such as income, age, health, social class and ethnicity.
The course will be practically oriented, focusing on the application of techniques to a variety of sources and research questions, on the reading and interpretation of results, and on the evaluation of their implications for policy issues. The software employed for the quantitative investigation will be the widely used SPSS package. Expertise in using this software will be built up over the course, and prior knowledge is not assumed.
At the end of the course, participants should be able to: identify and apply appropriate statistical techniques for a variety of research questions; present and interpret statistical findings; and recognise and discuss the strengths and limitations of alternative techniques and data.
As well as emphasising the practicalities of doing statistical analysis, including multivariate analysis, the course will have a clear focus on interpretation.
The primary software package will be SPSS. Microsoft Excel will also be used.
Structure of Course
The course will be taught as nine 1-hour sessions, which introduce the principles of the analysis and how to apply the techniques, with a two-hour practical lab session for familiarising students with the approaches, practising and interpreting techniques, and becoming conversant with the SPSS statistical software through guided exercises. The second hour of the two-hour session is optional but highly recommended, particularly for students who are less confident with the applications or with the software. Students are required to undertake a significant amount of independent reading and computer practice between sessions.
At the end of the course, participants should be able to: identify and apply appropriate statistical techniques, including multivariate statistical techniques, for a variety of research questions, using standard statistical software; present and interpret statistical findings correctly; recognise and discuss the strengths and limitations of alternative techniques and data.
Datasets
Lectures, computer practicals and assignments will draw on a selection of large-scale surveys such as: the British Household Panel Survey; the Health Survey for England; the General Household Survey; the British Crime Survey; the Family Resources Survey; and the Labour Force Survey.
Assessment
SC504
The course is assessed by three assignments: the first is worth 20%, the second 30% and the third 50%. The first assignment will involve some basic exercises around data manipulation and simple statistical measures. The second assignment will involve conducting and reporting the results of a fully-specified series of practical exercises. The third will require students to design, conduct, report and interpret a piece of multivariate analysis within pre-set guidelines. Together the three assignments are equivalent to a 3,500-word essay.
All three assignments should be handed in together by 12 noon on Friday 25th April 2008.
HS927
The course is assessed by two assignments: the first is worth one third and the second two-thirds of the overall mark. The first assignment will involve conducting and reporting the results of a fully-specified series of practical exercises. The second will require students to design, conduct, report and interpret a piece of multivariate analysis within pre-set guidelines. Together the two assignments are equivalent to a 3,500-word essay.
Both assignments should be handed in together by 12 noon on Friday 25th April 2008.
Outline of Coverage
Week 16 (18th January 2008): Principles and practicalities of manipulating data
Week 17 (25th January 2008): Relationships between variables: bivariate statistics
Week 18 (1st February 2008): Ordinary Least Squares (bivariate)
Week 19 (8th February 2008): Multiple Regression (1)
Week 20 (15th February 2008): Multiple Regression (2)
Week 21 (22nd February 2008): READING WEEK
Week 22 (29th February 2008): Introduction to logistic regression
Week 23 (7th March 2008): Logistic regression (2)
Week 24 (14th March 2008): Multinomial and ordered logistic regression
Detailed Course Outline
Week 16 (18th January 2008): Principles and practicalities of manipulating data
Introduction to course: answering questions using secondary data analysis. Manipulation of survey data: variables, labels, sorting and selecting data; creating new variables: transforming and recoding. Measures of central tendency and dispersion.
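As a rough illustration of recoding and of measures of central tendency and dispersion, the sketch below uses Python's standard library rather than SPSS, which is the package used on the course. The ages, the band boundaries and the names recode_age and age_band are all hypothetical and chosen only for the example.

```python
import statistics

# Hypothetical ages of seven survey respondents (illustrative values only)
ages = [23, 35, 41, 58, 62, 35, 47]

def recode_age(age):
    """Recode a continuous age into a three-category banded variable."""
    if age < 30:
        return "under 30"
    if age < 50:
        return "30-49"
    return "50+"

# New categorical variable derived from the original one
age_band = [recode_age(a) for a in ages]

# Central tendency
mean_age = statistics.mean(ages)
median_age = statistics.median(ages)

# Dispersion: sample standard deviation (n - 1 denominator) and range
sd_age = statistics.stdev(ages)
age_range = max(ages) - min(ages)

print(age_band)
print(mean_age, median_age, round(sd_age, 2), age_range)
```

In SPSS the same steps would normally be carried out through the recode and descriptive statistics procedures; the point here is only the logic of deriving a new variable and summarising the original one.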
Week 17 (25th January 2008): Bivariate statistics
Recap on statistical inference. Moving on from the first week to look at relationships between variables using bivariate statistics: chi-square, t-tests and correlation.
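The chi-square test of association can be sketched as follows. This is a minimal hand calculation in Python, not the SPSS procedure used in the labs, and the 2x2 table of counts is made up for the example. Expected counts are computed under independence as (row total x column total) / n.

```python
def chi_square(observed):
    """Return the Pearson chi-square statistic and degrees of freedom
    for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count expected in this cell if the two variables were independent
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

# Hypothetical cross-tabulation: rows are two groups, columns are yes / no
table = [[30, 20],
         [20, 30]]

stat, df = chi_square(table)
print(stat, df)
```

With one degree of freedom the 5% critical value is 3.841, so a statistic above that value would lead to rejecting independence at the 5% level.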
Week 18 (1st February 2008): Ordinary Least Squares – OLS (bivariate)
Coverage of simple (bivariate) ordinary least squares regression; calculating coefficient estimates, statistical significance of coefficient estimates, assessing goodness of fit.
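For orientation, the coefficient estimates and the R-squared measure of fit in a bivariate OLS regression can be computed directly from sums of squares. The sketch below is in plain Python with invented data, and the function name fit_ols is hypothetical; it does not cover significance testing.

```python
def fit_ols(x, y):
    """Fit y = a + b*x by ordinary least squares.
    Returns (intercept a, slope b, R-squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products about the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Goodness of fit: 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return intercept, slope, r_squared

# Invented data for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

intercept, slope, r_squared = fit_ols(x, y)
print(round(intercept, 3), round(slope, 3), round(r_squared, 3))
```

For these data the slope is 0.6: each one-unit increase in x is associated with a predicted increase of 0.6 in y.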
Week 19 (8th February 2008): Multiple Regression (1)
Multiple Regression: extension of the simple OLS regression model to the case of more than one independent variable. Dummy and other categorical independent variables; interaction terms.
Week 20 (15th February 2008): Multiple Regression (2)
Continuation from Week 19: estimation, presentation and interpretation of OLS models with complex independent variables.
Week 21 (22nd February 2008): READING WEEK
Week 22 (29th February 2008): Introduction to logistic regression
Problems with OLS where the dependent variable is categorical. Introduction to logistic regression for the case where the dependent variable is binary. Interpreting logistic coefficient estimates. Confidence intervals for coefficient estimates.
Week 23 (7th March 2008): Logistic regression (2)
Continuation from Week 22: understanding log odds, odds, odds ratios and probabilities. Predicting probabilities using logistic regression estimates. Examples of good and bad practice in presenting and interpreting logistic regression.
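The relationship between log odds, odds, odds ratios and probabilities can be sketched in a few lines of Python. The intercept and coefficient below are hypothetical values standing in for logistic regression estimates; they are not taken from any model fitted on the course.

```python
import math

# Hypothetical estimates from a binary logistic regression with one predictor
INTERCEPT = -1.0
COEF_X = 0.5

def log_odds(x):
    """Linear predictor: the log odds of the outcome for a given x."""
    return INTERCEPT + COEF_X * x

def odds(x):
    """Odds are the exponential of the log odds."""
    return math.exp(log_odds(x))

def predicted_probability(x):
    """Probability = odds / (1 + odds)."""
    o = odds(x)
    return o / (1 + o)

# The odds ratio for a one-unit increase in x is exp(coefficient),
# whatever the starting value of x
odds_ratio = math.exp(COEF_X)

for x in (0, 2, 4):
    print(x, round(log_odds(x), 2), round(odds(x), 3),
          round(predicted_probability(x), 3))
print(round(odds_ratio, 3))
```

Note that the odds ratio is constant across values of x, while the change in predicted probability for a one-unit increase in x is not; this is one reason presenting predicted probabilities alongside odds ratios is often recommended.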
Week 24 (14th March 2008): Multinomial and ordered logistic regression
Extension of the logistic regression model to the case of a dependent variable which can take more than two values. Interpretation of multinomial and ordered logistic regression.
Reading
Allison, Paul D. (1984) Event History Analysis: Regression for Longitudinal Event Data, Sage.
Agresti, A. and Finlay, B. (1997) Statistical Methods for the Social Sciences, Third Edition, Prentice Hall.
Bernard, H.R. (2000) Social Research Methods: Qualitative and Quantitative Approaches, Sage.
Bohrnstedt, G.W. and Knoke, D. (1994) Statistics for Social Data Analysis, 3rd Edition.
Berk, Richard A. (2003) Regression Analysis: A Constructive Critique, Sage.
Borooah, Vani K. (2002) Logit and Probit: Ordered and Multinomial Models, Sage.
Crown, W. (1998) Statistical Models for the Social and Behavioural Sciences, Praeger.
Dale, A., Arber, S. and Procter, M. (1988) Doing Secondary Analysis, Unwin Hyman.
Erickson, B.H. and Nosanchuk, T.A. (1992) Understanding Data, Second Edition, Open University Press.
Field, A. (2006) Discovering Statistics Using SPSS, 2nd Edition, Sage.
Frankfort-Nachmias, C. and Nachmias, D. (1996) Research Methods in the Social Sciences, Fifth Edition, Arnold.
Gilbert, N. (2001) Researching Social Life, 2nd Edition, Sage.
Judd, C.M., Smith, E.R. and Kidder, L.H. (1991) Research Methods in Social Relations, 6th Edition, Holt, Rinehart, and Winston.
Lieberson, S. (1987) Making It Count: The Improvement of Social Research and Theory, University of California Press.
Knoke, D., Bohrnstedt, G.W. and Mee, A.P. (2002) Statistics for Social Data Analysis, 4th Edition, Thomson.
Liao, Tim Futing (1994) Interpreting Probability Models: Logit, Probit and Other Generalized Linear Models, Sage.
Marsh, C. (1988) Exploring Data: An Introduction to Data Analysis for Social Scientists, Polity.
Menard, Scott (1995) Applied Logistic Regression Analysis, Sage.
Menard, Scott (2002) Longitudinal Research, Sage.
Pallant, Julie (2001) SPSS Survival Manual: A Step by Step Guide to Data Analysis Using SPSS for Windows (Version 10 and 11), Open University Press.
Powers, Daniel A. and Xie, Yu (2000) Statistical Methods for Categorical Data Analysis, Academic Press.
Rose, D. and Sullivan, O. (1996) Introducing Data Analysis for Social Scientists, Second Edition, Open University Press.
Wonnacott, R.J. and Wonnacott, T.H. (1985) Introductory Statistics, Fourth Edition, Wiley.
Yamaguchi, K. (1991) Event History Analysis, Sage.
Websites
ESDS (The Economic and Social Data Service): http://www.esds.ac.uk/
The Data Archive: http://www.data-archive.ac.uk/
Tutorials on internet skills for social research methods, for social statistics and for sociologists (among others): http://www.vts.intute.ac.uk/
Encyclopaedia of statistics: www.statsoft.com/textbook/stathome.html
Electronic Statistics Textbook: http://www.statsoft.com/textbook/stathome.html