File - Crystal D. Ellis

Critique of Testing Instruments
Crystal Ellis
March 4, 2008
COUN 566
Appraisal and Instruments
Dr. Miars
Introduction
The population that I will be working with is school-aged children in a school
setting, so the tests that I chose to critique are those that can be used with school-aged children.
Upon receiving my graduate degree, I would be qualified to administer these tests. The three
tests I chose are the Reynolds Child Depression Scale (RCDS), the Adolescent Anger Rating
Scale (AARS), and the Clinical Assessment of Interpersonal Relations (CAIR).
The following is an analysis of the critiques of these tests and their uses in the field.
Adolescent Anger Rating Scale (AARS)
Purpose, Selection and Use of the Test. The purpose of the AARS is to
measure total anger expression and to differentiate among instrumental anger, reactive anger,
and anger control (Burney, 2001). The development of the instrument was prompted by interest
in understanding the root causes of adolescent anger, the need to measure types of anger, and the
need to develop specific treatment plans to decrease violence caused by anger expressed
by adolescents (Burney, 2001).
Burney further describes the AARS as an instrument that is intended to be used
specifically with adolescents aged 11-19. AARS items are written at approximately a fourth-grade
reading level (Burney, 2001). The AARS is a 41-item instrument that uses a 4-point Likert
scale ranging from 1 (Hardly Ever) to 4 (Very Often) (Henington, MMYB 15). Twenty items
measure Instrumental Anger (IA), a delayed, possibly covert, goal-related response (Henington,
MMYB 15). Eight items measure Reactive Anger (RA), an immediate response to events
perceived as negative, threatening, or fear provoking (Henington, MMYB 15). Thirteen items
measure Anger Control (AC), a proactive cognitive-behavioral anger management response
(Henington, MMYB 15).
Norm Data and Quality of the Test. The norm sample included adolescents from various
ethnic groups, including African American, Asian, Caucasian, Hispanic, and multi-ethnic groups.
Inner-city, urban, and suburban environments are represented (Burney, 2001). The norm group
consists of 4,187 adolescents from the United States, divided into middle school (grades 6-8,
ages 11-14) and high school (grades 9-12, ages 14-19) (Henington, MMYB 15).
According to Henington’s review of the AARS manual, the following data were reported.
Internal consistency was estimated using Cronbach’s alpha. Coefficients for the
entire standardization sample ranged from .81 to .92. Alpha coefficients and standard errors of
measurement (SEM) for the AARS subscales were provided by grade level and gender group. Alpha
coefficients for girls and boys in Grades 6-8 and Grades 9-12 were consistent with those
observed for the total norm sample, ranging from .80 to .92 for participants in
Grades 6-8 and from .81 to .94 for participants in Grades 9-12. Little variability in SEM was found
across the norm groups (Burney, 2001). AARS test-retest reliability was measured using 175
pairs of AARS protocols with a 2-week interval between ratings (Henington, MMYB 15). The
correlations ranged from .71 to .79 (Henington, MMYB 15). These scores indicate a fairly stable
measure. The Pearson product-moment correlations among the subscales were relatively low
(Henington, MMYB 15), which suggests that each AARS subscale is able to measure a specific
type of anger independently of the other subscales. Reactive Anger (RA) obtained the lowest
alpha values across gender and grade, with younger girls having the lowest (.80) (Henington,
MMYB 15). Instrumental Anger (IA) obtained the highest values across gender and grade, with
older boys having the highest (.94) (Henington, MMYB 15). Item-total correlations ranged from
.42 to .69 for IA items, .37 to .64 for RA items, and .34 to .65 for AC items (Henington, MMYB
15). Content validity
was assessed using an expert panel (school psychologists, school personnel, university professors,
and clinicians); the experts supported the face and content validity of the AARS (Burney, 2001).
Criterion validity was assessed using Pearson product-moment correlations between
the scores on each AARS scale and subscale and the number of conduct referrals, as well as the
number of instrumental and reactive anger conduct referrals (Henington, MMYB 15). The results
yielded positive correlations, and the relationships were described as strong (Henington, MMYB
15). A negative correlation was found between anger control and the number of conduct
referrals (-.31), the number of instrumental referrals (-.29), and the number of reactive-type anger
referrals (-.36), indicating that the more control an adolescent has over his or her anger, the
fewer general or specific conduct referrals he or she receives (Burney, 2001). Furthermore, positive
correlations were found between Total Anger scores and the number of conduct referrals.
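The internal consistency figures reported above for the AARS are Cronbach's alpha coefficients. As a rough illustration of what that statistic summarizes, the sketch below computes alpha from a small response matrix. The function name and the data are my own assumptions for illustration; the responses are made up and are not AARS data.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                   # number of items
    item_columns = list(zip(*responses))    # one tuple of scores per item
    sum_item_variances = sum(pvariance(col) for col in item_columns)
    total_variance = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical 4-point Likert responses (5 respondents x 4 items), illustration only.
example = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 3, 4, 4],
    [2, 1, 2, 1],
]
print(round(cronbach_alpha(example), 2))
```

Alpha rises toward 1 as the items vary together across respondents, which is why values in the .80s and .90s are read as good internal consistency.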
Administration, Scoring, and Quality of the Test Manual. The AARS can be
administered to a group or to an individual. When it is given in a group setting, the environment
should provide privacy and confidentiality. Rapport-building techniques are especially useful in
easing the anxiety of the test taker and in encouraging respondents to answer honestly
(Burney, 2001). It is also important to discuss any questions that come up and to distinguish
between the response options on the Likert scale (Burney, 2001). The manual also suggests (in more
than one place) ensuring that the respondent answers each question.
Application to Counseling Goals. According to Henington, the AARS is a useful tool, and
he acknowledges the careful consideration that was given to the development of the instrument.
However, Henington feels that the author overstates some of the validity data. One key
concern is the etiology of anger and the number of demographic variables. He says that
researchers have found that anger and behavior problems such as conduct disorder, oppositional
defiance, and attention deficit disorder are related to, rather than caused by, these characteristics
(Henington, MMYB 15). Finally, because the AARS is a self-report measure with no validity or lie
scale, it is unknown whether responses will be altered to gain the social acceptance of those who
have access to the information provided by the adolescent. According to Henington, discriminant
comparisons were made between the AARS and the Multidimensional Anger Inventory (MAI). The
rationale for this comparison was that the two instruments measure different aspects of anger. The
correlations between the MAI and the AARS were described as moderately low (Henington, MMYB 15).
These relatively low correlations support the ability of AARS scores to measure
constructs of anger that differ from those captured by current measures of anger. Stephenson regards
this comparison as a flaw in the validity study. He asserts that it would have been useful to see other
convergent data, such as data from the State-Trait Anger Expression Inventory-2 and other scales.
Overall, the value of the instrument is believed likely to outweigh the concerns raised here.
I can see that this instrument would be a useful screening tool in an educational setting, with
special consideration given to identifying types of anger.
Reynolds Child Depression Scale (RCDS)
Purpose, Selection and Use of the Test. The Reynolds Child Depression Scale (RCDS) is
a self-report, paper-and-pencil measure intended to assess the severity of depressive
symptomatology in 8-12 year old (grades 3-6) children (Carlson, MMYB 11). Although it is not
intended for diagnostic purposes, it was developed in accordance with widely accepted
diagnostic systems, such as the Diagnostic and Statistical Manual of Mental Disorders, Third
Edition (DSM-III) and the Research Diagnostic Criteria (RDC), and it can be used as a
screener and as an assessment and evaluation instrument in clinical and research settings
(Carlson, MMYB 11). The RCDS test book is entitled “About Me.” It may be safe to assume that
most children are familiar with this concept, as it is something that most teachers address in their
curriculum at some point. I think that this familiarity is perhaps an advantage to the
young child who may feel nervous. Children are told to choose responses that tell how they have
been feeling for the last two weeks (Rohrbeck, MMYB 11). The author recommends that the
items be read aloud to children in grades 3 and 4 (Reynolds, 1981). The scale includes thirty
items (at a second-grade reading level) that tap cognitive, motor-vegetative, somatic, and
interpersonal symptoms of depression (Rohrbeck, MMYB 11). Twenty-nine items use a 4-point
Likert-type response format, with the choices almost never, sometimes, a lot of the time, and all
of the time. The last item consists of five faces with expressions that range from happy to sad; the
child is asked to choose the one that shows how he or she feels. Seven items on the scale
are reverse scored (Rohrbeck, MMYB 11).
Norm Data and Quality of the Test. The test norms were based on a sample of 1,620
elementary school-aged children in the Midwestern and Western areas of the United States (Carlson,
MMYB 11). Both Carlson and Rohrbeck of the Mental Measurements Yearbook felt that the
sample seems representative of those regions. The norm sample included approximately 30%
ethnic minority children from urban, suburban, and rural areas (Carlson, MMYB 11). The RCDS
mean total score is 56.42, which corresponds to a mean item score of 1.88 (Reynolds, 1981).
Qualitatively, this suggests that the average response to items for which high scores suggest
depressive symptomatology fell between “almost never” and “sometimes.” Given that depression and
depressive symptoms are not considered a normal aspect of childhood, this level of overall symptom
endorsement appears consistent with expectations (Reynolds, 1981). Both Carlson and
Rohrbeck of the Mental Measurements Yearbook also suggested that the information regarding
reliability and validity is quite impressive overall. Carlson asserts that the internal consistency
coefficients (using Cronbach’s alpha) and split-half coefficients, corrected for length by the
Spearman-Brown formula, were in the upper .80s and lower .90s within grades, genders, and
ethnic groups, as well as within a subset of learning disabled children. Test-retest reliability was
surprisingly good as well (.82 and .85). The standard error of measurement, computed to be between
3 and 4 points for the total RCDS scale, lends further support to the clinical use of this measure
(Reynolds, 1981). To establish validity, the manual’s author addresses content, criterion-related,
and construct validity. Evidence of content validity is that the items were developed
to reflect the DSM-III-R and RDC symptoms of depression (Reynolds, 1981). As evidence of
construct validity, there were several studies of convergent validity (Rohrbeck, MMYB 11). The
RCDS correlates with the Children’s Depression Inventory at .76 (Rohrbeck,
MMYB 11). Criterion-related validity was reported by comparing RCDS performance with two
other measures of depression in children. In all instances, correlations in the mid .70s
were obtained (Reynolds, 1981).
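The arithmetic behind two of the RCDS figures above can be checked with a short sketch. The mean item score is the mean total score divided by the thirty items. The score band uses the common convention of plus or minus 1.96 SEM for an approximate 95% band, which is my assumption and not a procedure quoted from the manual. The observed score of 60 and the SEM of 3.5 are hypothetical values chosen from within the reported 3 to 4 point range.

```python
def mean_item_score(total_score, n_items=30):
    """Average response per item, given a total score on an n-item scale."""
    return total_score / n_items


def score_band(observed_score, sem, z=1.96):
    """Approximate band around an observed score: observed +/- z * SEM."""
    margin = z * sem
    return (observed_score - margin, observed_score + margin)


# Reported RCDS mean total score of 56.42 across 30 items.
print(round(mean_item_score(56.42), 2))

# Hypothetical child with an observed total of 60 and an assumed SEM of 3.5.
low, high = score_band(60, 3.5)
print(round(low, 2), round(high, 2))
```

A small SEM relative to the score range keeps this band narrow, which is the sense in which the SEM supports clinical use of the total score.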
Administration, Scoring and Quality of the Test Manual. Reynolds explains that the
RCDS is written at a reading level well below that of the minimum age cutoff for the test. The RCDS
can be administered individually, in small groups of 5-10, or in larger groups (20-30) in a classroom
setting. The test manual also suggests that administering this test is not advised for large
groups. The RCDS is designed to measure depressive symptoms, so to increase the clinical utility of
the RCDS scores, a cutoff score is provided to designate a clinically relevant level of depressive
symptomatology (Reynolds, 1981). There are two forms of the test, Form HS and Form OCR.
Form HS is hand scored; it is administered to small groups or individuals and requires about
10 minutes for each child to complete (Reynolds, 1981). Form OCR is for
larger groups, with the option of mailing it in for machine scoring. The cutoff score of 74
defines a clinically relevant level of depressive symptoms; a child with a score above 80 is
characterized as severe (Reynolds, 1981). The RCDS should not be presented as a test, since this may
suggest to young children that there are right and wrong answers. It is also important to be mindful
of the time at which this test is given. For example, it should not be given around a holiday, field
trip, or report card time. This helps to decrease any stress for the child. The administrator should
have a calm and even demeanor and allow the child to ask any questions. Questions should be answered
honestly and concisely (Reynolds, 1981).
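The cutoff rule described above can be expressed as a simple screening sketch. The function name and the labels are hypothetical, and I am assuming that a total score equal to 74 counts as reaching the cutoff, which the text above does not state explicitly. As noted elsewhere in this critique, the RCDS is a screener and is not intended for diagnosis, so this only flags scores for follow-up.

```python
def rcds_screen(total_score, cutoff=74, severe_above=80):
    """Label an RCDS total score relative to the reported cutoffs."""
    if total_score > severe_above:
        return "severe"                 # scores above 80
    if total_score >= cutoff:           # assumption: 74 itself meets the cutoff
        return "at or above cutoff"
    return "below cutoff"


for score in (56, 74, 80, 81):
    print(score, rcds_screen(score))
```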
The test manual itself is clear and thorough in its presentation of reliability and
validity, as well as in its explicit guidance on the use of the test. Carlson states that throughout
the manual the test author issues many caveats for potential test users in an effort to ensure proper
test use and interpretation. The test manual presents a solid base of studies pertaining not only to
reliability and validity, but also to the standardization procedures and psychometric properties of
the RCDS.
Application to Counseling Goals. Overall, the test is quite useful in determining
depressive symptomatology. Rohrbeck states that the test is not to be given as the sole means of
determining depressive qualities in children, but rather used as a tool to help define
characteristics. This reviewer also mentions that this test is not to be used to determine potential
suicide. The Children’s Depression Inventory would be a better tool for that. But the reviewer
does state that the test is very practical in its use with young children.
Clinical Assessment of Interpersonal Relations (CAIR)
Purpose, Selection and Use of the Test. The Clinical Assessment of Interpersonal
Relations (CAIR) is an instrument developed to measure the quality of the relationships of
children and adolescents with significant people in their lives, specifically parents, peers, and
teachers (Bracken, 1993). Interpersonal relations are defined as “the unique and relatively stable
behavioral patterns that exist or develop between two or more people as a result of individual
and extra-individual influences” (Bracken, 1993). Behavioral aspects of relationships,
environmental influences, and similarity of characteristics are included in this definition. The test
itself is composed of five relationship scales (mother, father, male peers, female peers, teachers)
and a Total Relationship Index (TRI) (Keith, MMYB 13). Each scale contains 35 items, which are
well organized in an eight-page test booklet containing an identification section, directions,
scales, and a summary page. The test taker uses a four-point Likert scale to mark how he or she
honestly feels about each relationship. The relationship characteristics measured are
companionship, emotional support, guidance, emotional comfort, reliance, trust, understanding,
conflict, identification, respect, empathy, intimacy, affect, acceptance, and shared values (Keith,
MMYB 13). This test aligns with the theoretical model of interpersonal relationships. Simply put,
the theories generally suggest that the interpersonal relationships of children and adolescents are
influenced by and related to their functioning in many different settings and often can predict
later psychosocial adjustment (Bracken, 1993). Based on this theoretical position, the CAIR and
the Multidimensional Self Concept Scale (MSCS) were co-developed and co-normed. Bracken,
the author of the test, identified six contexts in which children and adolescents most often function
(social, competence, affect, academic, family, and physical) along with three relationship
domains (social, family, and academic). These contexts and domains form the MSCS and the
CAIR, respectively (Medway, MMYB 13).
One would select and use this test with children and adolescents ranging in age from 9.0
to 19.11. The CAIR would be useful to school psychologists and neurophysiologists, as it would be
helpful in determining a child’s feelings about his or her place based on the relationships he or she
has developed with others. It helps to identify a child’s relationship difficulties, and it
could serve as a guide for a therapist working on intervention in these areas (Keith, MMYB 13).
Norm Data and Quality of the Test. Keith’s findings regarding the norm sample are the
following. A national sample of 2,501 children in grades 5 through 12, ages 9-19 years, was
used for norming. Children were from regular education classes and special education classes
in rural, urban, and suburban school districts. The school districts that participated were from
around the United States. Children from intact, reconstituted, and single-parent family homes, as
well as children living in foster homes, were also included in the norm group. Both genders
were equally represented. The CAIR yields raw scores and standard scores; the standard scores have
a mean of 100 and a standard deviation of 15. The scores can also be converted to T scores with a
mean of 50 and a standard deviation of 10. The classification system of the CAIR describes
the extent to which relationships are positive and/or negative; it is therefore easily understood
by parents and teachers and does not label a child. The internal consistency of the CAIR Total
Relationship Index (TRI) is relatively high, .96 for the total standardization sample
(Bracken, 1993). The test-retest reliability also exceeds .90 for each of the five scales. The CAIR
shows a moderate correlation with the MSCS at .55. Content validity is strong, as it was based on
sound psychological theories, research, and literature support. Standard errors of measurement for
the five subscales range from 3.0 to 3.97, with the TRI at an SEM of 3.0 (Bracken, 1993).
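The relationship between the two score metrics described above (standard scores with a mean of 100 and a standard deviation of 15, and T scores with a mean of 50 and a standard deviation of 10) is a linear rescaling through the z score. The sketch below shows that generic conversion. The function name is my own, and in practice the CAIR manual's tables, not this formula, would be the source for reported scores.

```python
def standard_to_t(standard_score, mean=100.0, sd=15.0):
    """Convert a standard score (mean 100, SD 15) to a T score (mean 50, SD 10)."""
    z = (standard_score - mean) / sd    # distance from the mean in SD units
    return 50.0 + 10.0 * z


for ss in (85, 100, 115):
    print(ss, standard_to_t(ss))
```

A standard score one standard deviation above the mean (115) therefore sits at the same relative position as a T score of 60.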
Administration, Scoring and Quality of the Test Manual. Bracken states that a formal
degree or training is not required to administer the CAIR, but it is highly recommended that the
administration be done under the supervision of a trained professional. Interpretation of the
scores is done solely by a professional with graduate training in a related psychology field. The
test materials consist of the CAIR rating form and the CAIR score summary profile form. The
rating form can be completed in about 15 minutes as long as a quiet, comfortable, nondistracting
environment has been provided. It is essential that the person administering the CAIR have a
good rapport with the student. It is also helpful to go over the rating form prior to the beginning
of the test. The administrator can answer questions regarding words on the test, but may not in
any way add to the meaning of a word when offering help. The test is written at a third-grade
reading level. The CAIR can be given in a group setting or in an individual setting. There is no
time limit on the CAIR, although it has been stated that the average time to take the test is 20
minutes (Medway, MMYB 13). The scoring needs to be done with care because it follows
a differential procedure: positively worded items are scored from 4 to 1,
whereas negatively worded items are scored from 1 to 4. The examiner must be careful to apply the
correct scoring procedure. All raw scores are calculated and then converted to standard scores
based on examinee age and gender. Next, confidence intervals (at levels ranging from 85% to 99%)
are assigned, and percentile ranks and T scores are found (Medway, MMYB 13).
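The differential scoring rule described above can be sketched as follows. I am assuming that each response is recorded as the position (1 through 4) of the option marked on the rating form, so that positively worded items score 4 down to 1 across the options and negatively worded items score 1 up to 4. The function names and the sample responses are hypothetical, and the conversion of raw scores to standard scores still requires the manual's norm tables.

```python
def score_item(option_position, positively_worded):
    """Score one item from the position (1-4) of the option marked."""
    if option_position not in (1, 2, 3, 4):
        raise ValueError("option_position must be 1, 2, 3, or 4")
    # Positively worded items run 4 to 1; negatively worded items run 1 to 4.
    return 5 - option_position if positively_worded else option_position


def raw_score(responses):
    """Sum item scores for a list of (option_position, positively_worded) pairs."""
    return sum(score_item(pos, positive) for pos, positive in responses)


# Hypothetical responses to three items on one relationship scale.
responses = [(1, True), (4, False), (2, True)]
print(raw_score(responses))
```

Keeping the direction of each item explicit in the data is one way to avoid the scoring mistake the examiner is warned about.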
The test manual has been reviewed as being “exceptionally well written,” although
more information is needed about age, race, and gender in comparative analyses (Keith, MMYB
13). Medway goes on to say that the manual adequately covers interpretive issues, such as
differences in ratings across gender and type of interpersonal relationship, so that the examiner
has some understanding of relationship normality.
Application to Counseling Goals. I believe that both reviewers and the author of this test
felt that the CAIR was a psychometrically strong instrument. I can see it being useful and
practical in the school setting. Medway praises the instrument as a “well conceived
and developed instrument that provides a straightforward method of measuring children’s
important social networks” (Medway, MMYB 13).
Bibliography
Burney, Deanna Mckinnie. Adolescent Anger Rating Scale. 2001. Psychological
Assessment Resources, Odessa, FL.
Bracken, Bruce. A Comprehensive Assessment of Interpersonal Relations. 1993. PRO-ED, Inc., Austin, TX.
Carlson, Janet. Mental Measurements Yearbook (MMYB) 11, RCDS.
Henington, Carlsen. MMYB 15, AARS.
Keith, Patricia. MMYB 13, CAIR.
Medway, Fredric. MMYB 13, CAIR.
Reynolds, William. Reynolds Child Depression Scale. 1981-1989. Psychological
Assessment Resources, Odessa, FL.
Rohrbeck, Cynthia. MMYB 11, RCDS.
Stephenson, Hugh. MMYB 15, AARS.