British Ability Scales 3: Score Classifications & Descriptions

Descriptive Classifications for IQ and Other Composite Cognitive Scores

Score classifications that are used in various individually-administered tests have changed over time. Older classification systems tended to be value-laden, using such terms as “Mentally Deficient” and “Superior.” The Wechsler/Binet systems, which became widely used, can be traced to the early 1900s. Pintner (1923, p. 77) and the Stanford-Binet (Terman & Merrill, 1937; Merrill, 1938) had the following classification scheme:

Pintner Classification   Stanford-Binet Classification   Intelligence Quotient
Feebleminded             Mentally Defective              0-69
Borderline               Borderline Defective            70-79
Backward                 Low Average                     80-89
Normal                   Normal or Average               90-109
Bright                   High Average                    110-119
Very Bright              Superior                        120-129
Very Superior            Very Superior                   130 and above

Interestingly, although the descriptive terms have changed over time, every subsequently published major cognitive test battery has used the same score boundaries! Classification labels are not in themselves objective statements, but are descriptors of an individual’s level of general cognitive abilities recommended by an author as useful in communicating with lay people.

The Wechsler scales used some of the same descriptors as those of Pintner and Merrill, but these also changed over the years. For example, the WISC-R (1974) and WISC-IV (2003) categories are as follows:

WISC-R Classification    WISC-IV Classification   Intelligence Quotient
Mentally Deficient       Extremely Low            0-69
Borderline               Borderline               70-79
Below Average (Dull)     Low Average              80-89
Average                  Average                  90-109
Above Average (Bright)   High Average             110-119
Superior                 Superior                 120-129
Very Superior            Very Superior            130 and above

The BAS3 Descriptive Classifications

In recent years, there has been a tendency among authors and publishers to move to a classification system that is less qualitative or evaluative, and more quantitative in its descriptions.
It is also desirable to have parallel descriptors above and below the mean. Thus the British Ability Scales, ever since their first publication in 1979, and the Differential Ability Scales (1990, 2006) have used the following classification system:

BAS3 Classification   GCA or other composite score   Percentiles
Very Low              69 and below                   1-2
Low                   70-79                          3-8
Below Average         80-89                          9-24
Average               90-109                         25-74
Above Average         110-119                        75-90
High                  120-129                        91-97
Very High             130 and above                  98-99

How Should Descriptors be Used?

In the BAS3 SRS software, the GCA and SNC composites and the cluster scores are reported with confidence limits. The author recommends that if a score has confidence limits that fall in two categories, both categories should be used in describing the child’s score. For example, if a child’s GCA score is 91, with 90% confidence limits of 86-97, it would be most appropriate to report the child’s score as being in the average to below average range.

A Note on Percentile Ranges

The percentiles covered by the various standard scores are also shown in the table above. From this it will be seen that the central category (Average) has a very neat and interesting feature: it covers 50% of the population. It comprises all those individuals whose scores are in the range of 90 to 109, who lie between the 25th and 74th percentiles. Thus according to these classification boundaries (which have been, and continue to be, adopted by all major cognitive test authors and publishers) an average score is defined as one that would be obtained by someone lying in the middle fifty percent of the population.

A Final Note on the Definition of ‘Average’

Some educational psychology training courses teach that a region plus or minus one standard deviation from the mean should be categorised as ‘average’ (i.e. the middle 68%).
It is important to bear in mind that this is merely a convention and that ‘average’ has only one real statistical meaning: a score at the mean itself. Even then, statisticians prefer to use ‘mean’ in order not to confuse it with the median or the mode.

There have been many different conventions used in psychometrics regarding the definition of ‘average’. For example, British Army and Navy recruitment policies formerly used the middle 40% as their ‘average’ selection grade, a scale adopted by Alice Heim in her well-known AH series. If stanines 4 to 6 are used as ‘average’, this is the middle 54%. Increasingly, schools are using a three-point grading which splits performance into the bottom 25%, the middle 50% and the top 25%. While 68% may have a convenient statistical meaning, this does not make it an ideal convention for conveying accurate meaning to non-statisticians; the middle 50% makes far more sense. Whichever convention is used, it is critical that the meaning which the educational psychologist is attaching to ‘average’ is conveyed to anyone being presented with the results, or they may assume it to mean something else.

References

Elliott, C.D. (1990). Differential Ability Scales (DAS). San Antonio, TX: The Psychological Corporation.
Elliott, C.D. (2006). Differential Ability Scales, 2nd edition (DAS-II). San Antonio, TX: Harcourt Assessment.
Merrill, M.A. (1938). The significance of IQs on the Revised Stanford-Binet Scales. Journal of Educational Psychology, 29, 641-651.
Pintner, R. (1923). Intelligence testing. New York: Holt, Rinehart & Winston.
Terman, L.M. & Merrill, M.A. (1937). Measuring intelligence. London: Harrap.
Wechsler, D. (1974). Manual for the Wechsler Intelligence Scale for Children-Revised (WISC-R). New York: The Psychological Corporation.
Wechsler, D. (2003). Manual for the Wechsler Intelligence Scale for Children, 4th Edition (WISC-IV). San Antonio, TX: Harcourt Assessment.
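
As a worked illustration of the BAS3 classification bands, the confidence-limit rule under "How Should Descriptors be Used?" and the different conventions for ‘average’ discussed in the final note, the following is a minimal Python sketch. It is not part of the BAS3 SRS software. The function names (classify, describe, share_between) are hypothetical, and the population percentages rest on an assumption not stated in the text: that composite scores are normally distributed with a mean of 100 and a standard deviation of 15. The describe function lists the lower category first, which is a presentational choice rather than the author's wording.

```python
from statistics import NormalDist

# BAS3 bands from the classification table: (lowest score in band, label).
# Scores of 69 and below fall through to "Very Low".
BANDS = [
    (130, "Very High"),
    (120, "High"),
    (110, "Above Average"),
    (90, "Average"),
    (80, "Below Average"),
    (70, "Low"),
]


def classify(score):
    """Return the BAS3 descriptive category for a composite score."""
    for floor, label in BANDS:
        if score >= floor:
            return label
    return "Very Low"


def describe(lower, upper):
    """Describe a score by the categories its confidence limits fall in."""
    low_label, high_label = classify(lower), classify(upper)
    if low_label == high_label:
        return low_label
    return f"{low_label} to {high_label}"


# ASSUMPTION (not stated in the text): scores are normally distributed
# with mean 100 and standard deviation 15.
SCORES = NormalDist(mu=100, sigma=15)


def share_between(low, high):
    """Proportion of the assumed population scoring between low and high."""
    return SCORES.cdf(high) - SCORES.cdf(low)


if __name__ == "__main__":
    # The example from the text: GCA 91 with 90% confidence limits 86-97.
    print(describe(86, 97))  # Below Average to Average
    # The Average band, treated as running from 90 up to 110.
    print(f"90 to 110:    {share_between(90, 110):.1%}")
    # Plus or minus one standard deviation from the mean.
    print(f"+/- 1 SD:     {share_between(85, 115):.1%}")
    # Stanines 4 to 6 span +/- 0.75 SD, i.e. 88.75 to 111.25 on this scale.
    print(f"stanines 4-6: {share_between(88.75, 111.25):.1%}")
```

Under the stated normality assumption, the three proportions come out close to the 50%, 68% and 54% figures quoted in the text, which shows how much the meaning of ‘average’ shifts with the convention chosen.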