* Dr Juliet Hassard, Deputy Director, Centre for Sustainable Working Life; Lecturer in Occupational Health Psychology

* What is secondary data analysis?
* Types and sources of data
* Opportunities, limitations, and challenges
* Ethics
* Thinking forward: funding and publishing

* The use of secondary data, or existing data that are freely available to researchers who were not involved in the original study, has a long and rich tradition in the social sciences [1].
  * Sociology, economics, etc.
* Traditionally, the field of psychology (and many of those within it) has dismissed the importance and value of studies using secondary data.
* But times are changing...

* Why collect new data, given the wealth of existing data sets that can be used to answer important questions?
  * Longitudinal & large sample sizes.

* To ask and answer important questions. For example:
  * To understand the longitudinal nature of relationships.
  * To understand group differences and trends over time.
  * To explore new and emerging social phenomena.
* More data (and types of data) are being collected (and made available!) than ever before.
* There is a unique opportunity to explore these ever-growing sources of data, and to ask important research questions.

* Let’s get creative...
* In small groups of 3-5, discuss and outline 4-5 different types of data or information that could be used to investigate an important psychological research question.

* Chat forums
* Blogs
* Published business reports
* Online support groups
* Second Life
* App technology

* The UK Data Service
  * https://www.ukdataservice.ac.uk/
  * Census data
  * International macrodata
  * Longitudinal studies
  * Qualitative/mixed methods
  * UK surveys
* The National Data Service
  * http://www.nationaldataservice.org/about/
* Individual studies may have different access points.
  * E.g., Whitehall II Study, UCL.
* Secondary data is everywhere, some of it in the public forum.
* Online support groups:
  * Coulson, N. S. (2015).
    Exploring patient's engagement with web-based peer support for Inflammatory Bowel Disease: forums or Facebook? Health Psychology Update, 42(2), 3-9.
* Longitudinal data (Whitehall II survey):
  * Kouvonen, A., et al. (2011). Negative aspects of close relationships as a predictor of increased body mass index and waist circumference: the Whitehall II study. American Journal of Public Health, 101(8), 1474-1480.
* Twitter, Instagram...
  * Whiting, R., & Pritchard, K. (2015). Big Data? Qualitative approaches to digital research. Qualitative Research in Organizations and Management: An International Journal, 10(3), 296-298.

‘Traditional’ Challenges in Psychological Research
* Low response rate
* High attrition rates
* Access to high quality measures
* Small sample size
* Reliance on convenience samples
* Correlation does not equal causation
* Limited money & resources to collect primary data
* Limited scope for extensive comparative research (across groups or internationally)

* The data has already been collected.
* Saves time: the researcher does not have to design a study and collect a new set of data.
* The data that are typically collected tend to be of higher quality than individual researchers could obtain on their own.
* They are typically longitudinal, with large samples obtained using elaborate sampling plans.
Ref: Trzesniewski et al., 2011

* Learning how to work with, manage and analyse secondary data can provide individual researchers with the raw materials to make important contributions to the scientific literature
* ... using data sets with impressive levels of external validity.
Ref: Trzesniewski et al., 2011

* Open-source approach to research
* Replicate findings using similar analyses.
* Encourages careful reporting and justification of analytical decisions.
* Allows researchers to test alternative explanations and competing models.
* Encourages transparency, which in turn helps to facilitate good science.
Ref: Trzesniewski et al., 2011

* The data has already been collected!
* You may not have all the information on how or why certain types of information were collected.
* You may not know of any particular problems that occurred during data collection.
* Sometimes you are left wanting more...

* The temptation: a statistical fishing trip.
* Great research is driven by a good research question that is strongly underpinned and shaped by theory.
* The purpose of analysing data is to refine the scientific understanding of the world and to develop theories by testing empirical hypotheses.
* “Mo Money Mo Problems”: Mo Data, Mo Temptations?
* A note about statistical power.
Ref: Trzesniewski et al., 2011

* Considerable time and effort:
  * is invested by the researcher to understand the nature and structure of a data set.
  * is needed by the researcher to explain and justify the theoretical and analytical approach used.
* Although, I would argue there are real advantages to the time invested in doing this.
Ref: Trzesniewski et al., 2011

* Measures in these datasets are often abbreviated, often because the projects themselves were designed to serve multiple purposes and to support a multidisciplinary team.
  * Shortened measures, mixed levels of data, and single-item measures.
* These datasets often have impressive levels of breadth (many constructs are measured), but often with an associated cost in terms of depth of measurement.
* Measurement issues are therefore one of the major issues in secondary data analysis.
* These issues often require quite a bit of conceptual consideration and defending in the peer-review process.
Ref: Trzesniewski et al., 2011

* A good grounding in psychometrics and Classical Test Theory.
* You need to carefully consider and evaluate the trade-offs in reliability and validity.
* You need to defend your position when writing up.
* You need to understand how measurement issues frame your findings and, in turn, your interpretation of your findings.
Ref: Trzesniewski et al., 2011

* Creating and managing data files
  * Data inventory
  * Research journal
* Approach to missing data and data screening procedures
* Use of and/or development of constructs
  * Use of proxy variables
  * Development & testing of composite measures
  * Single-item measures
* Accounting for the data structure in your analysis

* The aim of the doctoral thesis was to develop and test a theoretical model seeking to describe the aetiological role of psychosocial processes, in and out of the workplace, in predicting gender-related diversity issues in men’s and women’s health at a structural/population level.
* An iterative multi-stage methodology was utilised to develop and test the proposed theoretical model.
• Literature review – Theoretical framework Stage one • Identification of suitable source of data Stage two • Data review (data inventory) • Measurement development and testing Stage three • Data cleaning

* European Working Conditions Survey
  * Pan-European cross-sectional survey of working conditions, workers’ health and safety, and living conditions (n > 40,000 workers).
  * Now on the 6th wave of data collection.
  * The survey has evolved over time, asking more questions.
  * Survey items are informed by and based on contemporary theory.
  * The measures used are not always based on validated psychometric measures.
* Single items vs. composite measures?

* The vast majority of latent conceptual constructs are complex and multifaceted in nature.
* Consequently, using a single item to represent a theoretical concept may not yield an accurate, comprehensive, and reliable measurement of the construct of interest.

* The guiding premise of many in the scientific community is that multiple responses reflect the “true” response more accurately than does a single response.
* Imprecision in measurement is one of the key causes (although not the sole cause) of measurement error.
* Measurement error adds ‘noise’ to the observed variables.

* Inaccurate and unreliable measurement of a concept raises key concerns about the overall validity and reliability of the hypotheses tested using the given measurement(s).
* It is generally agreed that research findings that are valid, reliable and generalizable are built on a solid foundation of accurate and consistent measurement.

* The primary objective of creating a series of summated (or composite) scales is to avoid the exclusive use of, or dependence on, single-item constructs where possible.
* The use of several variables as indicators provides an opportunity to represent differing facets of a given concept, with the aim of yielding a more well-rounded perspective and, arguably, a better measurement of the given concept.

* Researchers need to ask: how was consent obtained in the original study? Where sensitive data is involved, we cannot, and should not, assume informed consent.
* Given that it is usually not feasible to seek additional consent, a professional judgement may have to be made about whether the use of secondary data violates the contract made between subjects and the primary researchers.
* Growing interest in secondary data makes it imperative that researchers in general now consider obtaining consent that covers the possibility of secondary analysis as well as the research in hand.
* This is consistent with professional guidelines on ethical practice.
* Heaton, J. (1998). Secondary analysis of qualitative data. Social Research Update (issue 22). See: http://sru.soc.surrey.ac.uk/SRU22.html

* Can you publish secondary data analysis? Yes!
* Never forget: the central role of theory.
* Be detail orientated!
* Justifying your research question is important, but you also need to be prepared to justify and outline the logic of your analysis framework and approach.
* Understand and reflect on how the research design of your secondary data, or any methodological issues encountered with it, may affect or frame the interpretation of your results.

* Secondary data analysis is an important and useful research methodology.
* There are many benefits and strengths to using secondary sources of data.
* But there are also important pragmatic and methodological challenges facing researchers.

* Trzesniewski, K. H., Donnellan, M., & Lucas, R. E. (2011). Secondary data analysis: An introduction for psychologists. American Psychological Association.
* Vartanian, T. P. (2010). Secondary data analysis. Oxford University Press.
* Heaton, J. (2008). Secondary analysis of qualitative data: An overview. Historical Social Research/Historische Sozialforschung, 33(3), 33-45.
* Hinds, P. S., Vogel, R. J., & Clarke-Steffen, L. (1997). The possibilities and pitfalls of doing a secondary analysis of a qualitative data set. Qualitative Health Research, 7(3), 408-424.

j.hassard@bbk.ac.uk
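
The slides on single items versus composite measures argue that several indicators capture a construct more reliably than one item. A minimal sketch of how that might be checked when building a composite from secondary data is given here. It is an illustration only: the slides do not name a specific statistic, so Cronbach's alpha is used as one common internal-consistency estimate, and the item names and responses (`item_a`, `item_b`, `item_c`) are hypothetical, not taken from the European Working Conditions Survey or any other data set.

```python
from statistics import mean, variance


def composite_scores(items):
    """Average each respondent's answers across items (a summated/composite scale)."""
    return [mean(responses) for responses in zip(*items)]


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns (one list of responses per item)."""
    k = len(items)
    if k < 2:
        raise ValueError("alpha is undefined for a single item")
    # Total score per respondent, summed across all items.
    totals = [sum(responses) for responses in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(column) for column in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses from six respondents to three 1-5 items; not real survey data.
item_a = [1, 2, 3, 4, 5, 2]
item_b = [2, 2, 3, 5, 4, 1]
item_c = [1, 3, 3, 4, 5, 2]

items = [item_a, item_b, item_c]
print("composite scores:", composite_scores(items))
print("Cronbach's alpha:", round(cronbach_alpha(items), 3))
```

With a single item the function raises an error, which mirrors the point in the slides: internal consistency cannot be estimated from one item, so the reliability of a single-item measure has to be defended on other grounds when writing up.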