
Appraisal of Instruments 1

Piers–Harris Children’s Self-Concept Scale, Second Edition

Purpose of Test

The Piers–Harris Children’s Self-Concept Scale, Second Edition (Piers–Harris 2), subtitled The Way I Feel About Myself, was developed as a self-report instrument for assessing self-concept in children and adolescents between the ages of 7 and 18. The manual defines self-concept as “a relatively stable set of attributes reflecting both description and evaluation of one’s own behavior and attributes” (Piers–Harris 2, YEAR???, p.3). The test was originally developed in the early 1960s and revised in 1984.

Quality of Test Manual

The test manual is easy to follow and provides a thorough exploration of the second edition revisions and restandardization. Clear interpretive guidelines for each domain score are provided, including information that associates particularly low domain scale scores with specific behaviors, disorders, or other possible areas of concern (Piers–Harris, p.24-27). Noted strengths of the manual include case studies that illustrate how scores can be used for screening and evaluation (Kelley, M.L., 16th MMY), and relevant literature on the psychometrics of the instrument (Oswald, D.P., 16th MMY).

The Piers–Harris 2 manual also includes a bibliography of studies that used the original instrument.

Quality of Test

Four major types of validity issues are addressed in the manual: exaggeration, response bias, random responding, and moderator variables (Piers–Harris, p.18). The instrument has built-in indicators to assist in measuring and interpreting the impact of these issues. Empirical investigations of potential moderator variables have concluded that they do not have significant effects on scores (Piers–Harris, p.20).

The Piers–Harris 2 manual considers two aspects of reliability: internal consistency and test-retest reliability. Internal consistency estimates for the entire standardization sample are high; for example, the total score (TOT) has an alpha of .91 with an SEM of 3.07. Alpha coefficients across scales range from .60 to .93 (Piers–Harris, p.50). Test-retest reliability data are not available for the Piers–Harris 2 revision. The manual indicates that test-retest studies using the earlier scale produced acceptable results (p.51). However, Donald P. Oswald (16th MMY) considered the lack of new test-retest studies to be a weakness of the revised scale.
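The reported SEM follows closely from the alpha coefficient and the score SD via the standard formula SEM = SD√(1 − r). A minimal sketch, assuming the T-score SD of 10 (the small gap from the manual's 3.07 reflects rounding in the reported values):

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

# TOT alpha = .91 with a T-score SD of 10:
print(round(sem(10, 0.91), 2))  # 3.0, close to the manual's reported 3.07
```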

The manual asserts that content validity is the least important aspect of validity for this instrument because “self-concept is by definition a theoretical entity” (Piers–Harris, p.53). Content validity was assessed using judges’ ratings; Kelley (16th MMY) critiqued the lack of information about the judging process. Construct validity was assessed using factor analysis, with data collected as part of the standardization study (Piers–Harris, p.53). Interscale correlations indicated that the domain scales represent separate but interrelated aspects of self-concept (Piers–Harris, p.54). The manual cites domain-to-TOT correlations ranging from .73 to .81; Oswald (16th MMY) noted that some of the higher correlations between domains may be attributable to shared items. Interscale correlations among the domain scales ranged from .30 to .60. Convergent validity was assessed by correlating the test with concurrent measures of anger/aggression attitudes and psychological symptoms (Piers–Harris, p.57). Tables show that the scale-to-scale relationships are “relatively strong and in the predicted direction,” with larger coefficients (r > .25) shown in bold (Piers–Harris, p.60). Validity evidence was also gathered from correlations with parent, teacher, and peer ratings; multitrait-multimethod studies; and intervention studies (Piers–Harris, p.65-66).

Criterion validity has been assessed primarily through the instrument’s “ability to distinguish between groups that are expected to differ in self-concept,” and findings support its success in this area, with effect sizes ranging from .23 to .60 for items that discriminated between clinic-referred and nonclinical children (Piers–Harris, p.69). Overall, there is a long history of research findings supporting the reliability and validity of the scale. However, an important limitation to consider is that the majority of these studies were conducted with the original version of the scale (Kelley, 16th MMY).

Ease of Administration and Scoring

The instrument can be administered individually or to small groups. The test can be scored manually by the examiner or computer scored, either with a personal computer program or by mailing the response forms to the publisher. Administering and scoring the instrument can be accomplished by someone with minimal training, but interpretation should be conducted only by those trained in psychological assessment (Piers–Harris, p.4). The raw scores include a total score, which is a measure of general self-concept, and six domain scores, which are useful in identifying areas of strength and vulnerability (Piers–Harris, p.21).

The original distribution of raw scores was transformed to approximate a normal distribution.

Raw scores are converted to T scores (mean = 50, SD = 10) and percentiles (Piers–Harris, p.17). The manual includes high, average, and low T-score ranges for the total score (TOT) and each of the domain scales; the normal range is considered 40 to 60. The manual notes that examining individual responses may reveal more about a child’s self-concept. For comparison, tables show the percentage of respondents in the standardization sample who endorsed each item in each direction (Piers–Harris, p.27-29).
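In practice, conversions come from the manual's norm tables, but the T-score transformation itself is the usual linear rescaling against a normative mean and SD. A hedged sketch with made-up norm values (the raw mean and SD below are illustrative, not the actual Piers–Harris norms):

```python
def to_t_score(raw, norm_mean, norm_sd):
    """Linear T transformation: mean 50, SD 10 in the reference group."""
    return 50 + 10 * (raw - norm_mean) / norm_sd

def classify(t):
    """The normal range is roughly 40-60, i.e., within one SD of the mean."""
    if t < 40:
        return "low"
    if t > 60:
        return "high"
    return "average"

# Hypothetical norms (raw mean 45, SD 8) for illustration only:
t = to_t_score(37, norm_mean=45, norm_sd=8)
print(t, classify(t))  # 40.0 average
```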

Interpretation and Adequacy of Test Norms

The Piers–Harris 2 was standardized on a heterogeneous sample of 1,387 students, aged 7 to 18, recruited from school districts across the United States (Piers–Harris, p.3). Criticism of the original Piers–Harris emphasized the limited nature of its standardization sample, which was homogeneous in terms of several demographic variables (Oswald, 16th MMY). The new sample aims for similarity to U.S. Census figures for ethnic groups and SES categories (Piers–Harris, p.39). However, the restandardization did not adequately represent Hispanic/Latino students, which Oswald (16th MMY) noted is unfortunate in view of the growth of the Latino population in the United States.

Selection and Use with School Population

The manual suggests that the test is appropriate for use in research, educational, and clinical settings (Piers–Harris, p.4). The instrument has been used as a screening device to determine the need for psychological and educational interventions, and as a tool to explore the relationship between self-concept and specific behaviors (Piers–Harris, p.4). The instrument has also been applied as a measure of treatment outcomes by monitoring changes in self-concept over time (Piers–Harris, p.69).

Because it relies on a self-report format, the test is not recommended for children who are unable or unwilling to complete it honestly (Piers–Harris, p.7). A readability analysis indicates that second-grade readers should be able to read the items (Piers–Harris, p.40). Internal consistency can differ across age ranges; for example, the Popularity scale demonstrated weaker internal consistency for the youngest children (7- and 8-year-olds; alpha = .60) and for older adolescents (17- and 18-year-olds; alpha = .62) (Piers–Harris, p.49). Test-retest reliability may also be lower in younger children because their self-concept is still developing (Piers–Harris, p.51). The test is not standardized for some minority populations, so special consideration is warranted when using it with Hispanic or other recently immigrated students.

Assistance with Counseling Goals

Kelley (16th MMY) asserted that the Piers–Harris 2 is one of the best questionnaires of its type and suggested it is best used as a screening instrument. The manual offers several clear examples of how the instrument can be used for screening and as part of a psychological evaluation (Piers–Harris, p.31-33). The manual cautions that the instrument cannot on its own provide a comprehensive evaluation of a child’s self-concept. Other sources of data must be integrated into a full evaluation, such as clinical interviews with the child and other relevant informants, prior history, school records and observations, and results from other psychological tests (Piers–Harris, p.5).

Suicidal Ideation Questionnaire

Purpose of Test

The Suicidal Ideation Questionnaire (SIQ) is designed to assess thoughts about suicide in adolescents and young adults. Thoughts about suicide, or suicidal ideation, are considered a potential precursor to suicidal behavior (SIQ, p.iv). There are two versions of the SIQ: one is designed for adolescents in senior high school (grades 10, 11, and 12), and the second, designated the SIQ-JR, is designed for adolescents in junior high school (grades 7, 8, and 9) (SIQ, p.iv). The SIQ was designed primarily for large-scale screening, as a primary prevention effort to identify an individual’s potential risk status (SIQ, p.3 & 12).

Quality of Test Manual

The manual is designed as a technical guide for the instrument as well as a clinical guide to aid in the evaluation of suicidal ideation in adolescents (SIQ, p.iv). Clinical caveats are given to ensure that the SIQ is used appropriately (SIQ, p.35). The manual provides “background information” on suicidal behavior in adolescents and describes the “construct of suicidal ideation” (SIQ, p.iv). The manual is clearly written and accessible.

Quality of Test

The internal consistency reliability of the SIQ and SIQ-JR was computed using Cronbach’s coefficient alpha and found to be high: the manual reports .936 for the SIQ-JR and .971 for the SIQ (SIQ, p.25). The manual presents standard error of measurement values for the various sub-samples, which show very little variability, with most SEMs at approximately 4 raw score points (SIQ, p.25). The manual also presents test-retest reliability based on two assessments given to 801 adolescents approximately 4 weeks apart. The test-retest reliability was moderate (.72), which is stated to be consistent with expectations for a suicidal ideation measure because of external factors and random error variance (SIQ, p.25-26). One critique is that the manual does not address whether test-retest reliability decreases substantially in a suicidal population; this could offer valuable information about how often the screening should be repeated (Conoley, MMYB 11).
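Cronbach's alpha, the statistic behind these internal consistency figures, can be computed directly from an item-score matrix. A self-contained sketch with made-up ratings (illustrative data only, not actual SIQ responses):

```python
def cronbach_alpha(scores):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of totals).
    `scores` is a list of respondents, each a list of k item scores."""
    k = len(scores[0])

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings from four respondents on three items:
data = [[1, 2, 1], [3, 3, 4], [5, 6, 5], [2, 2, 3]]
print(round(cronbach_alpha(data), 2))  # 0.96 -- items covary strongly
```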

Content validity is presented in the congruence of item content with specified suicidal cognitions and in item-total scale correlations (SIQ, p.29). The SIQ item-total scale correlations are high, with the majority ranging from .70 to .90; the SIQ-JR item-total scale correlations are slightly lower, ranging from .46 to .86 (SIQ, p.29). Conoley (11th MMY) noted that the continuum of suicidal thoughts presented in the manual has not been substantiated. Correlations between the SIQ/SIQ-JR and related constructs are presented to support the validity of the instrument (SIQ, p.30). Correlations between the two versions of the SIQ and scales measuring depression, hopelessness, self-esteem, anxiety, academic self-concept, and learned helplessness are explored and show shared variance with the SIQ (SIQ, p.30-31). A multiple regression analysis was also completed to evaluate the relationship between the SIQ, other variables of psychological distress, and individual difference variables such as age, sex, SES, and academic achievement (SIQ, p.31-32). Clinical validity of the SIQ was examined in a clinical setting with a sample of hospitalized suicide attempters. The manual reports the mean SIQ score of the suicide attempters to be 69.60 and compares this to the mean score of the standardization sample, which was 17.79 (SIQ, p.33). Of interest, only two-thirds of the suicide attempters endorsed levels of suicidal ideation above the cutoff score (Carmer, 11th MMY).

Ease of Administration and Scoring

The SIQ is a self-report instrument and respondents rate items on a 7-point scale (SIQ, p.7).

Items are scored in a “pathology direction”: higher scores indicate greater endorsement of the given suicidal cognitions (SIQ, p.7). Hand scoring can be accomplished with the aid of a provided scoring key, which also indicates critical items (SIQ, p.10). A mail-in scoring service is available for large group administrations (SIQ, p.10).

The manual provides a cutoff score (41 on the SIQ, 31 on the SIQ-JR) that may be used to judge the severity of suicidal thoughts (SIQ, p.10-11). For large-scale screening, a more liberal cutoff score is recommended to identify adolescents at approximately the 84th percentile and above (SIQ, p.11).

James Carmer (11th MMY) suggested that evaluators should not rely on the cutoff score alone, given the finding that one-third of suicide attempters scored below it. Carmer cautioned that each SIQ protocol should be individually screened for the endorsement of any critical item (Carmer, MMYB 11).
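Carmer's two-part screening rule — flag a protocol when the total reaches the cutoff, or when any critical item is endorsed — can be sketched as follows. The critical-item positions and the endorsement threshold here are hypothetical placeholders, not the actual SIQ scoring key:

```python
def flag_protocol(item_scores, cutoff=41, critical_items=(2, 5, 8)):
    """Flag a protocol for follow-up if the total score reaches the
    cutoff OR any critical item (hypothetical 0-based positions) is
    endorsed above zero on the 0-6 rating scale."""
    total = sum(item_scores)
    critical_endorsed = any(item_scores[i] > 0 for i in critical_items)
    return total >= cutoff or critical_endorsed

# A protocol far below the cutoff is still flagged when a critical
# item is endorsed -- the point of Carmer's caution:
scores = [0] * 30
scores[5] = 3
print(flag_protocol(scores))  # True
```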

Administration requires “knowledge of professional and ethical guidelines” (SIQ, p.9). The SIQ can be administered individually or in groups. It is suggested that administrators introduce the instrument as “an assessment of the adolescent’s thoughts about himself or herself,” not as a suicide questionnaire, in order to avoid “the possibility of mood induction” (SIQ, p.10). Special considerations are offered in the manual for large group administration and scoring (SIQ, p.10).

Interpretation and Adequacy of Test Norms

The manual clarifies that interpretation of the SIQ requires an examination of the total score, critical items, and item endorsement patterns (SIQ, p.10). The manual notes that false negatives are expected and cautions that individuals thought to be at risk on the basis of other information should be further evaluated (SIQ, p.11). Collie Conoley noted a high probability of false positives and concluded that “it is obviously safer to err on the side of including more rather than fewer adolescents in a follow-up program” (Conoley, MMYB 11). The manual provides case studies as examples and suggestions for interpretation (SIQ, p.13-14). Tables of item response endorsement proportions for boys, girls, and the total standardization sample are also given (SIQ, p.18-20).

A total of 2,180 adolescents were included in the standardization of the SIQ (890 in the senior high sample and 1,290 in the junior high sample) (SIQ, p.15). Collie Conoley (11th MMY) critiqued that the sample appears to be drawn from one high school and two junior high schools in an urban/suburban community in the Midwestern United States (Conoley, MMYB 11). The manual notes that research and descriptive data continued to be collected from samples of over 4,000 adolescents in regular education, special education, and clinical settings (SIQ, p.15). Normative tables based on the total standardization sample, and on the sample broken down by grade and sex, are found in Appendices C and D (SIQ, p.38-41).

The manual cites substantial skewness in the distribution of scores and indicates that this is expected because “suicidal ideation is not normally distributed in nonclinical groups of adolescents” (SIQ, p.15).

The manual also notes sizeable sex differences on the SIQ and SIQ-JR (SIQ, p.15) and provides a graph of score differences between genders (SIQ, p.22).

The manual asserts that the sample represents a “racially and socioeconomically heterogeneous group of adolescents” and states that no significant differences between ethnic groups were apparent (SIQ, p.21-23). However, it is notable that 78.1% of the normative sample for the SIQ and 73.5% for the SIQ-JR were White, while only .4% and .8%, respectively, were Hispanic (SIQ, p.22).

Selection and Use with School Populations and Assistance with Counseling Goals

The manual proposes several potential applications for the SIQ. For example, the SIQ may be used as part of a general assessment in clinical situations, for the evaluation of large-scale intervention and prevention programs (e.g. in schools), as a follow-up measure for the evaluation of individuals at continued risk (i.e. after discharge from a treatment setting), and as a measure for evaluating treatment outcomes (particularly as an outcome measure in the application of cognitive therapies for suicide) (SIQ, p.9).


The manual cites research indicating that early identification and screening is a primary component of school-based suicide prevention programs (SIQ, p.12). Carmer asserted that the SIQ would be a “helpful part of a school crisis response team or overall prevention program” (Carmer, MMYB 11). Low base rates attenuate the predictive value of the instrument, but it may still identify potential risk in the individual who wants to be identified (Reynolds, W.M., 1987, p.3). Conoley (11th MMY) noted that the SIQ should be administered only when and where timely and thorough follow-up is possible; school counselors and psychologists administering the instrument must prepare for this.

Adolescent Anger Rating Scale

Purpose of Test

The Adolescent Anger Rating Scale (AARS) was developed in 1994 to measure anger expression in adolescents ages 11 to 19 (Burney, D.M., 1994, p.7). The instrument is also intended to differentiate between anger control and types of anger, specifically what the author defines as Instrumental Anger and Reactive Anger (Burney, D.M., 1994, p.2).

Quality of Test Manual

The manual clearly outlines five “major investigative stages” important to the development of the instrument, and a table of the statistical analyses present in each stage is provided (AARS, p.10). Etiological factors of anger are considered and explored, though perhaps overstated, as indicated by Carlen Henington (15th MMY). The manual includes several helpful tables for the AARS scale and subscales.

Quality of Test

Estimates of internal consistency and item-total correlations were obtained using Cronbach’s alpha. Coefficients for the entire standardization sample ranged from .81 to .92 (AARS, p.25). Item-total correlations ranged from .42 to .69 for the IA subscale, .37 to .64 for the RA subscale, and .34 to .65 for the AC subscale (AARS, p.25). AARS test-retest reliability was measured using 175 pairs of AARS scores with a 2-week interval between ratings; test-retest correlations ranged from .71 to .79 (AARS, p.27).


The manual describes six steps taken to address content validity, including a review of the development of the original test items and the make-up of the expert panel, an assessment of item relevance and face validity, the assignment of factor domains, and an assessment of applicability and practical need (AARS, p.29-31).

In support of the instrument’s concurrent validity, Pearson product-moment correlation coefficients were computed to determine the relationship between AARS scores and conduct referrals in school (AARS, p.31). A negative correlation was found between Anger Control (AC) subscale scores and the number of conduct referrals (-.31). Positive correlations were found between Total Anger scale scores and (a) the number of conduct referrals (.27), (b) the number of instrumental anger-type referrals (.30), and (c) the number of reactive anger-type referrals (.30). The manual presents the results of a factor analysis of the normative data in support of construct validity; findings were consistent with previous research assessing the three-factor structure (IA, RA, AC) of the AARS (AARS, p.32).

However, Hugh Stephenson (15 th MMY) noted that several unsupported claims were made regarding the nature of the subscale constructs.

To assess convergent validity, Pearson product-moment correlation coefficients were computed between the AARS and two subscales of the Conners-Wells Self-Report Scales-Long (AARS, p.32). For subscales measuring similar constructs, high correlations were observed (e.g., .61, .57); lower negative correlations were observed between subscales that differed in construct (e.g., -.24, -.26) (AARS, p.32). Stephenson (15th MMY) noted the need for convergent validity information against other relevant measures.

Discriminant validity was explored through correlations between the AARS and the Multidimensional Anger Inventory (MAI). Correlations between the MAI and the AARS subscales were moderate to low, e.g., .46 for IA, .44 for RA, and -.11 for AC (AARS, p.32). The shared variance between the MAI and the AARS subscales was also low (range = .01 to .21). The manual asserts that this offers “greater support for the ability of the AARS subscales to measure unique aspects of the construct of anger” (AARS, p.32).
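The shared-variance figures follow directly from squaring the correlations, which is why even moderate correlations correspond to small proportions of shared variance:

```python
def shared_variance(r):
    """Proportion of variance shared by two measures correlating at r."""
    return r ** 2

# The MAI-AARS subscale correlations reported above:
for r in (0.46, 0.44, -0.11):
    print(f"r = {r:+.2f} -> shared variance = {shared_variance(r):.2f}")
# r = +0.46 -> shared variance = 0.21
# r = +0.44 -> shared variance = 0.19
# r = -0.11 -> shared variance = 0.01
```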

Ease of Administration and Scoring


The AARS form is fast and easy to use, yet it should be administered and scored by professionals with formal training in standardized testing and in the psychometric properties associated with statistical analysis, test development, and interpretation (AARS, p.11). The items on the AARS are written at approximately a fourth-grade reading level (AARS, p.9). The examiner may read items aloud for respondents with limited reading proficiency; if this is done, it should be noted in the examiner’s written report (AARS, p.11). The manual suggests that the normative data are appropriate for differing methods of administration, i.e., group or individual, read silently by the respondent or aloud by the administrator (AARS, p.11). The examiner must adhere to specific directions and include a discussion of the importance and purpose of the assessment. This discussion is especially important for encouraging honest responses, as the instrument has no validity or lie scale (Henington, 15th MMY).

Scoring is completed with simple mathematical calculations. A scoring key is attached to the underside of the response pages. The manual includes detailed instructions for scoring, including how to prorate a score in the case of missing responses (AARS, p.12-13). Raw scores are transformed to T scores and percentiles.
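Prorating for missing responses is ordinarily a simple rescaling of the obtained sum by the full item count. A hedged sketch — the item count and minimum-response rule below are illustrative assumptions, not the manual's actual values:

```python
def prorate(responses, n_items, min_answered=None):
    """Scale the sum of answered items up to the full item count.
    `responses` contains raw item scores, with None for missing items."""
    answered = [r for r in responses if r is not None]
    if min_answered is not None and len(answered) < min_answered:
        raise ValueError("too many missing responses to prorate")
    return sum(answered) * n_items / len(answered)

# A hypothetical 10-item scale with two missing responses:
print(prorate([3, 2, None, 4, 3, None, 2, 3, 4, 2], n_items=10))  # 28.75
```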

Interpretation and Adequacy of Test Norms

The manual provides guidelines for interpreting AARS T scores, along with descriptive information and case examples (AARS, p.16-18). Hugh Stephenson (MMYB, 15) noted that the purpose of the test is “immediately obvious” to the test taker and warned that this may result in under- or over-reporting of anger levels; he recommended that future revisions include an item analysis for social desirability (MMYB, 15). Carlen Henington (MMYB, 15) similarly noted the absence of a “validity or lie scale,” but stated that “it is believed that the value of this instrument is likely to outweigh” this concern. Still, it seems an important consideration in the interpretation of scores.

The manual describes a normative sample consisting of 4,187 adolescents from middle and high school classrooms (AARS, p.19). The manual states that data were collected in urban, rural, and suburban public schools in the southeastern, northern, and southwestern United States. However, Stephenson noted that participants were recruited on a voluntary basis and that no statistical breakdown of regional demographic data is provided (MMYB, 15). Henington also noted that no indication of socioeconomic status was provided (MMYB, 15). Ethnic representations are identified separately for boys and girls: the largest groups are Caucasian (61.3% & 59.1%) and African-American (20.9% & 24.6%), followed by Hispanic (8.2% & 8.0%), Asian (3.5% & 3.3%), multi-ethnic (4.9% & 4.1%), and undetermined (1.4% & .9%). Stephenson (15th MMY) critiqued the overrepresentation of some groups.

To aid in interpretation, four norm tables are given: younger boys, younger girls, older boys, and older girls (AARS, p.21). The manual distinguishes the two age groups on the basis of grade level, with the rationale that anger expression varies across adolescent development (AARS, p.16). However, this assertion has not been supported by other empirical studies (Henington, MMYB 15).

Selection and Use with School Populations and Assistance with Counseling Goals

In a school setting the AARS can be used to screen for maladjustment and for skill deficits in impulse/self-control, problem solving, and social skills (Henington, MMYB 15). Henington noted that the AARS would be more useful for treatment planning than as a measure of treatment effectiveness (Henington, MMYB 15). For school counselors, the instrument could assist in identifying individual students who would benefit from anger control training. The AARS manual also suggests that the instrument is useful in identifying atypical patterns of anger; this information would be helpful in developing school-based training groups, as different behavior patterns may be associated with each type of anger. For example, the manual recommends cognitive-behavioral training for adolescents with very high IA scores (AARS, p.40). The introduction to the manual cites multiple school shootings as evidence of the need for school anger scales. Stephenson concluded that this suggestion is misplaced, as there is no evidence that the scale can predict such occurrences given the low base rate of extreme violence (Stephenson, MMYB 15). The AARS does seem useful as a guide to the type of intervention needed and for analyzing the situations that result in conduct referrals on a school campus.


The normative sample of the instrument would support its use within a diverse school population.

However, given the overrepresentation of some groups, caution is warranted when interpreting this measure for Hispanic, Asian, and other multi-ethnic groups. The lack of SES information for the normative sample may also be important, particularly when administering the AARS to a group of students with an unusually low or high SES.

The AARS differentiates behavioral components of anger that make up its three subscales (AARS, p.7), distinguishing anger types for the purposes of treatment planning. However, it has been noted that no principles for differential treatment are offered or outlined in the manual (Stephenson, MMYB 15). Carlen Henington also noted that the usefulness of the AARS for assessing treatment effectiveness remains to be determined (Henington, MMYB 15).

References
