Making Social Work Count
Lecture 4
An ESRC Curriculum Innovation and
Researcher Development Initiative
What is being studied?
Approaches to measuring variables
Assessment and judgment
• Social workers have to
assess all the time:
– Is there a problem or
need here?
– What is the risk of things
getting worse?
– Have I made a
difference?
• Researchers carry out
similar tasks
• This lecture considers
the key issue of
developing meaningful
measurements for use
in quantitative research
• Many of the issues are
of relevance to the
more general task of
“assessment”
Quantitative and qualitative
• All research involves simplification
– The question is whether we know what is gained and
lost by simplification
• Qualitative studies tend to focus on meaning
– Common strategy is identifying themes of relevance
• Quantitative studies convert issues to numbers
– Allows certain types of important description (e.g.
how many people have this problem?)
– And, crucially, comparison (e.g. are things getting
better? Does one group have more problems?)
Quantitative and qualitative
Quantitative research
• This session focuses on
quantitative research
• It identifies key
considerations in thinking
about the quality of
quantitative study
– Reliability
– Validity
Qualitative research
• Some of these
considerations can also be
applied to qualitative
research
• However, qualitative studies
also have their own criteria
for assessing good research
Learning outcomes
Understand
what a variable
is
Appreciate
different types
of variable that
can be used in
quantitative
research
Understand
issues in
relation to
reliability and
validity
Know what a
standardised
instrument is
Have had the
opportunity to
reflect on
implications for
practice
Example of children in care
• Returning to the idea that care “fails” children
• Lecture 3 suggested that the general population
is not a valid comparison sample for children
who have left care
• Now let’s look at outcome measures
Forrester et al. (2009) review
• The literature review focused on studies that looked
at child welfare over time for children in care
• Strongest finding: very poor research base – this is a
difficult area to research
• Of 13 studies, almost all suggested:
– Most of the harm occurs before care
– Children tend to do better once in care
– Some harm occurs as children leave care
– Even in good placements children still tend to
have problems
But…
What “outcomes” were
being measured?
What outcomes do YOU
think should be measured
for children in care?
Key points
• Deciding on “outcomes” or variables for a study
is NOT some value-neutral, technocratic activity
• Key issues to consider:
– WHO is deciding what is to be measured? (e.g.
experts? Government? Service users?)
– WHAT is being measured?
– HOW is it being measured? [focus of
this lecture]
Key points
What is measured?
• For instance, in studies
reviewed by Forrester:
– the most common issue
“measured” was behaviour
(and particularly problem
behaviour)
– education was the second
most common
– others included physical
growth, social relations, etc
How is it measured?
• Studies in the review:
– obtained information from
social work files and made a
researcher “judgment”
– used school tests
– pooled interview and other
data and made a researcher
“judgment”
– used questionnaires to carers
• What are the strengths and
weaknesses of each?
Attributes and variables
• An attribute – is a
characteristic of an
individual
e.g. height, intelligence,
beauty, serenity
• A variable – is the
operationalisation of an
attribute
e.g. metres, IQ score, marks
out of 10?, err…
It allows attributes to be
compared and described
• The focus of this lecture is on
how attributes are
operationalised
Variables need to be reliable and valid
Reliability
• Are the results consistent, e.g. can the
results be replicated in different conditions
and across different groups?
Validity
• Does the instrument measure what it claims
to measure?
Measures should be both reliable and valid
• Reliable but not valid
• Low reliability: cannot be valid if
not reliable…
• Not reliable AND not valid
Standardised Instruments (SIs)
• Tools that measure a specific quality or
characteristic e.g. psychological distress
• They let us compare results across groups in
different settings e.g. social workers, families,
teachers, police.....
• SIs need to be high in both reliability and
validity
Reliability – overview
• The consistency of a measure
• A test is considered reliable if we get the same
result repeatedly
• Reliability can be estimated in a number of
different ways
– Test-retest reliability: over time
– Inter-rater reliability: between different scorers
– Internal Consistency Reliability:
across items on the same test
Test-Retest Reliability
• Tests the extent to which the test is repeatable
and stable over time
• The same social workers are given the same
questions 2 to 3 weeks later
• If the results differ substantially, and there has
been no intervention, then we should
question the reliability of those questions
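One common way to put a number on test-retest stability is to correlate the scores from the two sittings. The sketch below is illustrative only: the scores are invented, and the slides do not say which statistic should be used, so the Pearson correlation here is an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Note: this divides by zero if either list has no variation at all.
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same five social workers, 2 to 3 weeks apart.
first_sitting = [12, 18, 9, 15, 20]
second_sitting = [13, 17, 10, 14, 21]

test_retest_r = pearson_r(first_sitting, second_sitting)
print(round(test_retest_r, 2))
```

A value close to 1 suggests the questions give stable results over time; a low value, with no intervention in between, is a reason to question them.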
Inter-rater reliability
• Where two or more people rate/score/judge
the test
• The scores of the judges are compared to find
the degree of correlation/consistency
between their judgements
• If there is a high degree of correlation
between the different judgements,
the test can be said to be reliable
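When the judgements are categories (for example "high risk" or "low risk") rather than scores, agreement between two raters is often summarised with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. This is a minimal sketch with invented ratings; the slides speak only of "correlation/consistency", so the choice of kappa is an assumption.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters judging the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty lists of ratings")
    n = len(rater_a)
    # Proportion of cases where the two raters gave the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one and the same category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical judgements by two researchers reading the same eight case files.
rater_a = ["high", "low", "low", "high", "low", "high", "low", "low"]
rater_b = ["high", "low", "high", "high", "low", "high", "low", "low"]

kappa = cohens_kappa(rater_a, rater_b)
print(kappa)
```

Kappa is 1 for perfect agreement and around 0 when agreement is no better than chance.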
Internal Consistency Reliability
• For example where there are two questions
within a SI that seem to be asking the same
thing
• If the test is internally consistent the respondent
should give the same answer to both
questions
• More generally questions should be linked
to one another if they are measuring
the same attribute
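A widely used summary of how strongly the items on a scale are linked to one another is Cronbach's alpha. The slides do not name a statistic, so treating alpha as the measure of internal consistency is an assumption, and the responses below are invented for illustration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha.

    responses: one list per respondent, each holding one score per item.
    """
    if len(responses) < 2:
        raise ValueError("need at least 2 respondents")
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least 2 items")
    # Transpose so that each tuple holds every respondent's score on one item.
    items = list(zip(*responses))
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical answers from five respondents to three items scored 0 to 2.
answers = [
    [2, 2, 1],
    [0, 0, 0],
    [1, 1, 1],
    [2, 1, 2],
    [0, 1, 0],
]

alpha = cronbach_alpha(answers)
print(round(alpha, 2))
```

Higher values mean the items move together, which is what we would expect if they measure the same attribute.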
Validity
• The extent to which a test measures what it
claims to measure:
– Construct validity: the degree to which the test
measures the construct it is intended to measure;
the overarching type of validity
– Predictive validity: the degree to which
performance on a test predicts
performance in a real-life situation
– Content validity: the degree to which items on the
test represent the entire range of possible items
the test should cover
Construct validity
• The degree to which the test measures what it is intended to
measure
• The over-arching concept in validity – all other types of
validity are ways of assessing this
• As a result construct validity has many elements:
– Predictive validity (can it predict things e.g. IQ scores and later test
results)
– Criterion validity (does it correctly differentiate e.g. does a screening
instrument identify people who are depressed)
– Content validity (is the full range of the construct included)
– And other types…
Predictive validity
• Can structured risk assessment tools
predict children who will be abused?
• Are the predictions more accurate than
practitioners’ decisions?
Predictive validity
• Barlow et al (2013) found that most
attempts to predict had low success i.e.
high numbers of false positives or false
negatives
• Further research needed to develop
reliable tools that predict abuse or
re-abuse
• Though this is also true for practitioners…
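The idea of false positives and false negatives can be made concrete by counting them for a set of cases where both the prediction and the later outcome are known. The sketch below uses invented data for ten cases; sensitivity and specificity are standard summaries but are not named in the slides, and none of the figures come from Barlow et al.

```python
def screening_accuracy(predicted, actual):
    """Count prediction errors for a yes/no risk tool.

    predicted, actual: equal-length lists of booleans, one entry per case.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # flagged, harm occurred
    fp = sum(1 for p, a in pairs if p and not a)      # flagged, no harm
    fn = sum(1 for p, a in pairs if not p and a)      # missed, harm occurred
    tn = sum(1 for p, a in pairs if not p and not a)  # not flagged, no harm
    return {
        "false_positives": fp,
        "false_negatives": fn,
        # Share of children who were harmed that the tool flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        # Share of children who were not harmed that the tool left unflagged.
        "specificity": tn / (tn + fp) if (tn + fp) else None,
    }


# Hypothetical tool predictions and later outcomes for ten children.
predicted = [True, True, True, True, False, False, False, False, False, False]
actual = [True, False, False, True, True, False, False, False, False, False]

result = screening_accuracy(predicted, actual)
print(result)
```

The same counts can be produced for practitioners' own decisions, which allows the comparison between tools and professional judgment raised above.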
Content validity
• Refers to the extent to which a measure represents
elements of a social construct or trait
• For example, a depression scale may lack content
validity if it only assesses the affective dimension of
depression but fails to take into account the
behavioural dimension
• Or : how should “ethnicity” be defined? In practice
it is not possible to capture the full range of
possible ethnicities – but what level of
simplification is “valid”?
General Health Questionnaire (GHQ)
• A reliable and valid screening
instrument identifying aspects
of current mental health
(anxiety/depression/social
phobia)
• The self-administered
questionnaire asks if someone
has experienced a particular
symptom or behaviour recently
• Each item is rated on a four-point scale
• Used in many countries in
different languages
GHQ 12 questions
Questions include: Have you recently ......
1. Been able to concentrate on whatever you are doing
2. Lost much sleep over worry
3. Felt that you are playing a useful part in things
4. Felt capable of making decisions about things
5. Felt constantly under strain
6. Felt you couldn’t overcome your difficulties
7. Been able to enjoy your normal day to day activities
8. Been able to face up to your problems
9. Been feeling unhappy and depressed
10. Been losing confidence in yourself
11. Been thinking of yourself as a worthless person
12. Been feeling reasonably happy, all things considered
GHQ 12
• Different ways of measuring risk of psychiatric
problems using data
• All show reasonable link with clinical diagnosis
• A common way is to look for a ‘yes’ or ‘no’ (depending on
the question) that indicates a problem in 4 or more questions
• How do social workers do…?
Clinical scores for social workers and general
population using GHQ
[Bar chart. Vertical axis: 0 to 50. Bars: NQSW, One year later,
General population. Data labels: 43, 33, 18]
Source: Carpenter et al, 2010; ONS, 2010
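The "4 or more questions" approach to the GHQ-12 described earlier can be sketched as a simple count. This is an illustration, not the official scoring procedure: it assumes each of the 12 answers has already been recoded as True when it points towards a problem (a "yes" on some questions, a "no" on others), and the function names are hypothetical.

```python
GHQ12_ITEMS = 12
CASE_THRESHOLD = 4  # "4 or more questions", as stated on the slide


def ghq12_score(symptomatic):
    """Number of the 12 questions answered in the direction of a problem.

    symptomatic: list of 12 booleans, True where the answer indicates a problem
    (assumed to be recoded already, since the direction depends on the question).
    """
    if len(symptomatic) != GHQ12_ITEMS:
        raise ValueError("the GHQ-12 has exactly 12 questions")
    return sum(1 for answer in symptomatic if answer)


def above_threshold(symptomatic, threshold=CASE_THRESHOLD):
    """True if the respondent scores at or above the threshold."""
    return ghq12_score(symptomatic) >= threshold


# Hypothetical respondent with problem-indicating answers on five questions.
respondent = [True, True, False, False, True, False,
              False, True, True, False, False, False]

print(ghq12_score(respondent), above_threshold(respondent))
```

Applying the same rule to every respondent in a group gives the proportion above the threshold, which is the kind of figure that can be compared across groups.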
How to measure children’s emotional and
behavioural welfare?
• SDQ: Questionnaire designed for carers, children and
teachers
• Reliability is tested by:
– comparing emotional and behavioural welfare – and over time
• Validity is tested by:
– seeing whether scores predict children receiving
specialist help, criminal behaviour, exclusion from
school and “real world” outcomes
– also comparing with clinical assessment and
other instruments
Strengths and Difficulties Questionnaire
(SDQ)
• A brief behavioural
screening questionnaire
for parents/carers/
teachers with 3-16 year
olds
• Asks about psychological
attributes, some positive
and others negative
– E.g. emotional, conduct,
hyperactivity, peer
relationship, prosocial
behaviour
SDQ questions
• 25 questions composed of five scales with five
questions in each scale
• E.g. 5 questions in the Emotional Symptoms Scale
1. I get a lot of headaches
2. I worry a lot
3. I am often unhappy
4. I am nervous in
5. I have many fears
Responses: Not true/Somewhat true/Certainly true
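A scale score is typically built by turning each response into a number and adding up the five items. The sketch below assumes a 0/1/2 coding for the three response options; that coding is not given in these slides, and it ignores any items that may need scoring in the opposite direction, so the published scoring instructions should be checked before real use.

```python
# Assumed numeric coding of the three response options (not stated in the slides).
RESPONSE_SCORES = {"Not true": 0, "Somewhat true": 1, "Certainly true": 2}
ITEMS_PER_SCALE = 5


def scale_score(answers):
    """Sum the coded responses for the five items of one scale."""
    if len(answers) != ITEMS_PER_SCALE:
        raise ValueError("each scale has five questions")
    return sum(RESPONSE_SCORES[answer] for answer in answers)


# Hypothetical answers to the five Emotional Symptoms questions.
emotional_answers = [
    "Somewhat true",
    "Certainly true",
    "Not true",
    "Somewhat true",
    "Certainly true",
]

emotional_score = scale_score(emotional_answers)
print(emotional_score)
```

Under this coding each scale ranges from 0 to 10, which makes scores comparable between children and over time.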
Why does this matter?
• Worth considering common social work research
methods such as coming to a “researcher
judgment” – how reliable? How valid?
• More importantly – what about your practice?
• What is a better way of judging whether a child
has emotional or behavioural problems, or an
adult is at risk of psychological problems – your
judgment or a standardised instrument?
• If you want to evaluate whether you are making a
difference – what role might a standardised
instrument have?
Learning outcomes
Do you?
• Understand what a variable is
• Appreciate different types of
variable that can be used in
quantitative research
• Understand issues in relation
to:
– Reliability
– Validity
• Know what a standardised
instrument is
• Have had the opportunity to
reflect on implications for
practice
References
• Goldberg, D. & Williams, P. (1988) A user’s guide to the General Health
Questionnaire. Slough: NFER-Nelson
• Goodman, R. (1997) The Strengths and Difficulties Questionnaire: A
Research Note. Journal of Child Psychology and Psychiatry, 38, 581-586
• http://www.sdqinfo.com/d0.html
• Barlow, J., Fisher, J.D. and Jones, D. (2013) Systematic Review of Models for
Analysing Significant Harm, Department for Education Report; London
Accessed:
https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/183949/DFE-RR199.pdf