Secondary Data Analysis

Dr Juliet Hassard
Deputy Director, Centre for Sustainable Working Life
Lecturer in Occupational Health Psychology
* What is secondary data analysis?
* Types and sources of data
* Opportunities, limitations, and challenges
* Ethics
* Thinking forward: funding and publishing.
* The use of secondary data, or existing data that are freely available to researchers who were not involved in the original study, has a long and rich tradition in the social sciences [1].
* Sociology, economics, etc.
* Traditionally, the field of psychology (and many of those within it) has dismissed the importance and value of studies using secondary data.
* But times are changing...
* Why collect new data, given the wealth of existing data sets that can be used to answer important questions?
* Longitudinal & large sample sizes.
* To ask and answer important questions. For example:
* To understand the longitudinal nature of relationships.
* To understand group differences and trends over time.
* To explore new and emerging social phenomena.
* More data (and types of data) are being collected (and made available!) than ever before.
* There is a unique opportunity to explore these ever-growing sources of data, and to ask important research questions.
* Let’s get creative...
* In small groups of 3-5, discuss and outline 4-5 different types of data or information that could be used to investigate an important psychological research question.
* Chat forums
* Blogs
* Published business reports
* Online support groups
* Second Life
* App technology
* The UK Data Service
* https://www.ukdataservice.ac.uk/
* Census data
* International macrodata
* Longitudinal studies
* Qualitative/mixed methods
* UK surveys
* The National Data Service
* http://www.nationaldataservice.org/about/
* Individual studies may have different access points.
* E.g., Whitehall II Study, UCL.
* Secondary data is everywhere, some of it in the public forum.
* Online support groups:
Coulson, N. S. (2015). Exploring patient's engagement with web-based peer support for Inflammatory Bowel Disease: forums or Facebook? Health Psychology Update, 42(2), 3-9.
* Longitudinal data (Whitehall II survey):
Kouvonen, A., et al. (2011). Negative aspects of close relationships as a predictor of increased body mass index and waist circumference: the Whitehall II study. American Journal of Public Health, 101(8), 1474-1480.
* Twitter, Instagram...
Whiting, R., & Pritchard, K. (2015). Big Data? Qualitative approaches to digital research. Qualitative Research in Organizations and Management: An International Journal, 10(3), 296-298.
‘Traditional’ Challenges in Psychological Research
* Low response rate
* High attrition rates
* Access to high quality measures
* Small sample size
* Reliance on convenience samples
* Correlation does not equal causation
* Limited money & resources to collect primary data
* Limited scope for extensive comparative research (across groups or internationally)
* The data has already been collected.
* Saves time: the researcher does not have to design a study and collect a new set of data.
* The data that are typically collected tend to be of higher quality than could be obtained by individual researchers.
* Typically longitudinal, with large sample sizes obtained using elaborate sampling plans.
Ref: Trzesniewski et al., 2011
* Learning how to work with, manage and analyse secondary data can provide individual researchers with the raw materials to make important contributions to the scientific literature...
* ... using data sets with impressive levels of external validity.
Ref: Trzesniewski et al., 2011
* Open-source approach to research
* Replicate findings using similar analyses.
* Encourages careful reporting and justification of analytical decisions.
* Allows researchers to test alternative explanations and competing models.
* Encourages transparency, which in turn helps facilitate good science.
Ref: Trzesniewski et al., 2011
* The data has already been collected!
* You may not have all the information on how or why certain types of information were collected.
* You may not know of any particular problems that occurred during data collection.
* Sometimes you are left wanting more...
* The temptation: a statistical fishing trip.
* Great research is driven by a good research question that is strongly underpinned and shaped by theory.
* The purpose of analysing data is to refine the scientific understanding of the world and to develop theories by testing empirical hypotheses.
* “Mo Money Mo Problems”: Mo Data, Mo Temptations?
* A note about statistical power.
Ref: Trzesniewski et al., 2011
* Considerable time and effort is invested by the researcher to understand the nature and structure of a data set.
* Considerable time and effort is also needed by the researcher to explain and justify the theoretical and analytical approach used.
* Although, I would argue, there are real advantages to the time invested in doing this.
Ref: Trzesniewski et al., 2011
* Measures in these datasets are often abbreviated, often because the projects themselves were designed to serve multiple purposes and to support a multidisciplinary team.
* Shortened measures, mixed levels of data, and single-item measures.
* These datasets often have impressive levels of breadth (many constructs are measured), but often with an associated cost in terms of depth of measurement.
* Measurement issues are therefore one of the major issues in secondary data analysis.
* These issues often require quite a bit of conceptual consideration and defending in the peer-review process.
Ref: Trzesniewski et al., 2011
* A good grounding in psychometrics and Classical Test Theory.
* You need to carefully consider and evaluate the trade-offs in reliability and validity.
* You need to defend your position when writing up.
* You need to understand how measurement issues frame your findings and, in turn, your interpretation of your findings.
Ref: Trzesniewski et al., 2011
* Creating and managing data files
* Data inventory
* Research journal
* Approach to missing data and data screening procedures
* Use of and/or development of constructs
* Use of proxy variables
* Development & testing of composite measures
* Single item measures
* Accounting for the data structure in your analysis
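A data inventory and a first missing-data screen can be sketched in a few lines. This is only an illustration, not a procedure taken from the slides: the record layout (one dict per respondent, with None marking a missing value) and the variable names job_strain and gender are assumptions made up for the example.

```python
def data_inventory(records):
    """Summarise each variable: observed count, % missing, distinct values.

    Assumes `records` is a list of dicts (one per respondent) and that
    missing values are stored as None. Both are illustrative assumptions.
    """
    # Collect variable names in the order they first appear.
    variables = []
    for row in records:
        for name in row:
            if name not in variables:
                variables.append(name)

    n = len(records)
    inventory = {}
    for name in variables:
        values = [row.get(name) for row in records]
        observed = [v for v in values if v is not None]
        inventory[name] = {
            "n_observed": len(observed),
            "pct_missing": round(100 * (n - len(observed)) / n, 1) if n else 0.0,
            "distinct_values": len(set(observed)),
        }
    return inventory


# Hypothetical records, invented for this example only.
records = [
    {"id": 1, "job_strain": 3, "gender": "f"},
    {"id": 2, "job_strain": None, "gender": "m"},
    {"id": 3, "job_strain": 4, "gender": "f"},
    {"id": 4, "job_strain": 3, "gender": None},
]

inventory = data_inventory(records)
for name, summary in inventory.items():
    print(name, summary)
```

Keeping the output of a screen like this in the research journal gives a record of what the data looked like before any cleaning decisions were made.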
* The aim of the doctoral thesis was to develop and test a theoretical model seeking to describe the aetiological role of psychosocial processes, in and out of the workplace, in predicting gender-related diversity issues in men’s and women’s health at a structural/population level.
* An iterative multi-stage methodology was utilised to develop and test the proposed theoretical model.
* Stage one: Literature review and theoretical framework
* Stage two: Identification of a suitable source of data
* Stage three: Data review (data inventory); measurement development and testing; data cleaning
* European Working Conditions Survey
* Pan-European cross-sectional survey of working conditions, workers' health and safety, and living conditions (n = over 40,000 workers)
* Now on the 6th wave of data collection.
* The survey has evolved over time, asking more questions.
* Survey items are informed by and based on contemporary theory.
* The measures used are not always based on validated psychometric measures.
* Single items vs. composite measures?
* The vast majority of latent conceptual constructs are complex and multifaceted in nature.
* Consequently, the use of a single item to represent a theoretical concept may not yield an accurate, comprehensive, and reliable measurement of the given construct of interest.
* The guiding premise of many in the scientific community is that multiple responses reflect the “true” response more accurately than does a single response.
* Imprecision in measurement is one of the key causes (although not the sole cause) of measurement error.
* Measurement error adds ‘noise’ to the observed variables.
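The premise above can be illustrated with a small simulation. This is a sketch under classical test theory assumptions (observed score = true score + independent random error). All values are simulated and are not drawn from any real data set. The sample size, number of items, and noise level are arbitrary choices for illustration.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate(n_people=2000, n_items=5, noise_sd=1.0, seed=42):
    """Compare a single noisy item with the mean of several noisy items."""
    rng = random.Random(seed)
    # Unobserved 'true' scores on the construct.
    true_scores = [rng.gauss(0, 1) for _ in range(n_people)]
    # Each item is the true score plus independent random error.
    items = [
        [t + rng.gauss(0, noise_sd) for t in true_scores]
        for _ in range(n_items)
    ]
    single_item = items[0]
    composite = [sum(vals) / n_items for vals in zip(*items)]
    return pearson_r(true_scores, single_item), pearson_r(true_scores, composite)


r_single, r_composite = simulate()
print(f"single item vs true score:     r = {r_single:.2f}")
print(f"5-item composite vs true score: r = {r_composite:.2f}")
```

Under these assumptions the composite tracks the true score more closely than any single item, because the random errors partly cancel when items are averaged.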
* Inaccurate and unreliable measurement of a concept raises key concerns regarding the overall validity and reliability of the hypotheses tested using the given measurement(s).
* It is generally agreed that research findings that are valid, reliable and generalizable are built on a solid foundation of accurate and consistent measurement.
* The primary objective of creating a series of summated (or composite) scales is to avoid the exclusive use of, or dependence on, single item constructs where possible.
* The use of several variables as indicators provides an opportunity to represent differing facets of a given concept, with the aim of yielding a more well-rounded perspective and, arguably, a better measurement of the given concept.
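A minimal sketch of building a composite scale and checking its internal consistency with Cronbach's alpha. The slides do not say which reliability statistic was used, so alpha is an assumption here, chosen because it is a common first check. The three items and their scores are hypothetical.

```python
from statistics import variance


def composite_score(items):
    """Mean item score per respondent.

    `items` is a list of equal-length lists, one list of scores per item.
    """
    k = len(items)
    return [sum(scores) / k for scores in zip(*items)]


def cronbach_alpha(items):
    """Cronbach's alpha: (k / (k - 1)) * (1 - sum(item variances) / variance(total))."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Summated score per respondent.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses from five people to three items (1-5 scale).
item_a = [1, 2, 3, 4, 5]
item_b = [2, 2, 3, 5, 5]
item_c = [1, 3, 3, 4, 4]
items = [item_a, item_b, item_c]

scale = composite_score(items)
alpha = cronbach_alpha(items)
print("composite scores:", [round(s, 2) for s in scale])
print("Cronbach's alpha:", round(alpha, 2))
```

Alpha only speaks to internal consistency. Whether the items cover the intended facets of the concept is still a conceptual judgement that has to be argued from theory.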
* Researchers need to ask: how was consent obtained in the original study? Where sensitive data is involved, we cannot and should not assume informed consent.
* Given that it is usually not feasible to seek additional consent, a professional judgement may have to be made about whether the use of secondary data violates the contract made between subjects and the primary researchers.
* Growing interest in secondary data makes it imperative that researchers in general now consider obtaining consent that covers the possibility of secondary analysis as well as the research in hand.
* This is consistent with professional guidelines on ethical practice.
Heaton, J. (1998). Secondary analysis of qualitative data. Social Research Update (issue 22). See: http://sru.soc.surrey.ac.uk/SRU22.html
* Can you publish secondary data analysis? Yes!
* Never forget: the central role of theory.
* Be detail orientated!
* Justifying your research question is important, but you also need to be prepared to justify and outline the logic of your analysis framework and approach.
* Understand and reflect on how the research design of your secondary data, or any methodological issues encountered with it, may affect or frame the interpretation of your results.
* Secondary data analysis is an important and useful research methodology.
* There are many benefits and strengths to using secondary sources of data.
* But there are also important pragmatic and methodological challenges that face researchers.
* Trzesniewski, K. H., Donnellan, M., & Lucas, R. E. (2011). Secondary data analysis: An introduction for psychologists. American Psychological Association.
* Vartanian, T. P. (2010). Secondary data analysis. Oxford University Press.
* Heaton, J. (2008). Secondary analysis of qualitative data: An overview. Historical Social Research/Historische Sozialforschung, 33-45.
* Hinds, P. S., Vogel, R. J., & Clarke-Steffen, L. (1997). The possibilities and pitfalls of doing a secondary analysis of a qualitative data set. Qualitative Health Research, 7(3), 408-424.
j.hassard@bbk.ac.uk