'Challenging Behaviour Scales':
Criterion-related Evidence of Validity
Rai Turton rai.turton@gmail.com
(Supporting information and references are included in the notes to the slides)
Scales that assess behaviour that challenges (BtC)
Present lists of topographies of behaviour that are considered to be challenging.
Elicit ratings of listed topographies on dimensions such as frequency, difficulty, severity, or degree of harm caused.
Examples: Aberrant Behavior Checklist; Behavior Problems Inventory; Challenging Behaviour Interview.
Framework for assessing validity
Definition of BtC:
● Behaviour that causes, or is likely to cause, harm or exclusion (defined by impact, not by topography).
‘Face’ or content evidence:
● Rating dimensions represent harm or exclusion;
● Topographies represent behaviour that is likely to cause harm or exclusion.
Criterion-related evidence:
● Scores on scales are associated with valid measures of harm or exclusion or with valid estimates of risk.
Validity and the uses of scales
● People who use scales to assess, inform or make decisions are responsible for the validity of their interpretation of scale scores and should ensure that they use valid scales (American Educational Research Association et al., 1999).
● 'Challenging behaviour scales' are used to assess the effectiveness of services, to categorise or describe people, and to make decisions about people.
Previous reviews of scales: five reviews affirm the validity of several scales but cite little supporting evidence.
Criterion-related evidence of validity for 'challenging behaviour scales'
● The impact of behaviour is in principle observable,
● Direct measurement of impact of behaviour is possible, therefore direct criterion measures are feasible,
● Criterion-related evidence of validity supports the assumption that scores on a scale represent behaviour that challenges,
● The best assessment of validity uses direct measures of impact as criteria.
See Cronbach & Meehl (1955)
Results of literature search
PsycINFO & MEDLINE
Initial search to identify scales used to assess behaviour that challenges: 12 focused scales found
Scale-by-scale searches for published assessments of validity: 21 publications assessed the validity of 9 of the scales. Some assessed more than one scale, some used more than one type of assessment (23 studies of validity in all)
Searches for evidence of validity of scales used as criterion (or alternative construct) measures found very little.
Criterion-related evidence of validity
Direct measures: 1 study
ABC correlated with direct observations of behaviour (Aman et al., 1985)
Topographies of behaviour defined for observation matched ABC subscale topographies
Observers kept 'blind' to ABC ratings.
Recorded dimension of behaviour not stated.
Small sample (n = 36), 18 correlations examined.
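Why the combination of a small sample and many correlations matters can be shown with a little arithmetic. The sketch below is illustrative only: n = 36 and 18 correlations come from the slide, but the 0.05 significance level, the assumption that the 18 tests are independent, and the use of the Fisher z approximation are assumptions added here, not details reported for the study. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests tests.

    Assumes the tests are independent (an assumption, not a study detail).
    """
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance level after a Bonferroni correction."""
    return alpha / n_tests


def critical_r(n: int, alpha: float) -> float:
    """Approximate smallest |r| that is significant at two-sided alpha.

    Uses the Fisher z approximation, where atanh(r) has a standard
    error of 1 / sqrt(n - 3).
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.tanh(z_crit / math.sqrt(n - 3))


if __name__ == "__main__":
    n, n_tests = 36, 18   # taken from the slide
    alpha = 0.05          # assumed conventional level

    corrected = bonferroni_alpha(alpha, n_tests)
    print(f"Family-wise error rate: {familywise_error_rate(alpha, n_tests):.2f}")
    print(f"Bonferroni per-test alpha: {corrected:.4f}")
    print(f"Critical |r|, uncorrected: {critical_r(n, alpha):.2f}")
    print(f"Critical |r|, corrected:   {critical_r(n, corrected):.2f}")
```

Under these assumptions, more than half of such sets of 18 tests would contain at least one spurious 'significant' correlation, and correcting for this raises the size of correlation needed with only 36 participants.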
Criterion-related evidence of validity
Indirect measures = proxy groups: 8 studies
ABC: 3 studies, groups defined by diagnoses or services received
BPI-01: 1 study, groups defined by diagnoses
MOAS: 3 studies, groups defined by diagnoses or services received
IBR-MOAS: 1 study, groups defined by diagnoses
6 of the studies do not provide evidence of their groups' validity as indicators of levels of BtC or aggression
Criterion-related evidence of validity
Indirect measures = other scales: 14 studies
(correlations between sub-scales)
ABC correlated with CBCL-2/3, ADD, SPSS-R, BPI-01 and RSB-R
ASD-BPA correlated with BPI-01
ASD-BPC correlated with BASC-2
BPI-01/S correlated with ABC, ASD-BPA, DASH-II, ICAP, NCBRF, RSB-R
CBI correlated with ABC
OAS correlated with ABC
Two studies offered evidence that the correlated scales measure the same thing. Little evidence that ‘criterion’ measures are valid.
Limitations
Only one study used direct criterion measures, with a small sample.
6 of 8 studies using proxy groups do not provide evidence of their groups' validity as indicators of levels of BtC or aggression.
3 of 14 studies correlating scales with scales assessed criterion-related validity and 11 assessed convergent validity; little evidence that criterion or alternative measures are valid.
Only 3 of the 21 publications explicitly ensured that raters/informants were unaware of other measures.
Little replication of studies.
Conclusions
The validities of challenging behaviour scales have not been satisfactorily demonstrated.
On the positive side, when evidence is sought using direct criterion measures, it is likely to be found for some scales.
The small study that used direct behavioural observations and ensured that the observers were blind to scale ratings shows that good assessments are possible.