Ashley Rivera
Test Reviews

Test Review #1: Conners 3

Description: The Conners 3rd Edition (Conners 3) is an instrument designed to serve as a thorough and focused assessment of attention-deficit/hyperactivity disorder (ADHD) and other disruptive behavior in children and adolescents. It is the latest update to the venerable Conners' Rating Scales (CRS). Initially introduced in 1989, the CRS became a valuable tool for clinicians concerned about a child who may have ADHD. The foundation of the CRS was rating scales completed by parents, teachers, and (in most cases) the child, providing multiple sources of information for an assessment. There are four forms of the Conners 3: the full version (the "Conners 3"), an abbreviated format (the "Conners 3 Short"), and two screening forms (the "Conners 3 ADHD Index" and the "Conners 3 Global Index"). All forms of the instrument consist of inventories of face-valid items completed by parents, teachers, and, if 8 years of age or older, by the child. Items are scored from 0 (not at all true/seldom/never) to 3 (very much true/very often/very frequently).

Development: The Conners 3 is the second revision of the CRS (copyright 2008). The original instrument was developed in 1989, with the first revision, known as the Conners Rating Scales-Revised (CRS-R), released in 1997. Development of this latest edition builds on the first two instruments and is nearly 50 years in the making, with the original checklists first used by the test author in clinical practice in the 1960s. This update reportedly began with a comprehensive review of clinical and legislative standards related to ADHD, as well as focus groups consisting of clinicians, academics, and educators. Special attention was paid to the federal Individuals with Disabilities Education Act (IDEA 2004) and to common diagnostic criteria for establishing whether a child meets educational disability requirements. These efforts resulted in a large item pool that was pilot tested in both general and clinical populations. Exploratory factor analysis drove item selection and scale construction.

Technical: The normative process for the Conners 3 culminated in an impressive normative sample of 3,400 individuals. Nearly 7,000 rating forms were completed. Over 100 sites in North America provided data from a general population group that was meant to parallel the 2000 U.S. Census distribution of gender, ethnicity, and geographical region. More than 800 individuals from a clinical population were also included. There are separate norms for boys and girls. Temporal stability was highest for parent ratings on Content scales (generally .88 or above) and lowest for self-report ratings on all of the scales (ranging from .71 to .83). Finally, correlations across informants (parent, teacher, and child) showed good interrater reliability. The Conners 3 manual contains 47 pages of validity data that are difficult to distill into a paragraph or two for this review. In brief, strong evidence is presented for the instrument's factorial validity, with confirmatory factor analysis showing good fit.
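To make the rating metric concrete before turning to specific validity findings, the sketch below sums hypothetical 0-3 item ratings into a raw scale score and expresses it as a T score using the standard linear formula (T = 50 + 10z). The item responses and the normative mean and standard deviation are invented for illustration; the Conners 3 derives its T scores from the publisher's own normative tables, not from this simplified calculation.

```python
# Illustrative only: hypothetical item ratings and norm values, not Conners 3 data.
# Each item is rated 0 (not at all true) to 3 (very much true).
item_ratings = [2, 3, 1, 0, 2, 3, 2, 1, 3, 2]   # hypothetical responses on one scale

raw_score = sum(item_ratings)

# Hypothetical normative mean and standard deviation for the relevant norm group.
norm_mean, norm_sd = 9.5, 5.0

# Standard linear T score (mean 50, SD 10); the published instrument uses norm tables.
t_score = 50 + 10 * (raw_score - norm_mean) / norm_sd
print(f"Raw score = {raw_score}, T score = {t_score:.0f}")
```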
Convergent validity was demonstrated with significant correlations with three other instruments: the Behavior Assessment System for Children, Second Edition (BASC-2), the Achenbach System of Empirically Based Assessment (ASEBA), and the Behavior Rating Inventory of Executive Function (BRIEF). Discriminant validity is high, with good sensitivity and specificity, adequate positive predictive power, and an acceptable classification rate.

Commentary: The Conners 3 is a welcome revision of the CRS-R, an invaluable tool in the assessment of ADHD. The revised instrument contains a normative group of 3,400 individuals and demonstrates strong psychometric properties. The manual is well written and user friendly. The full version of the Conners 3 offers a wealth of data in easy-to-interpret T scores on scales that are clinically relevant. The scales concerning Conduct Disorder and Oppositional Defiant Disorder are an excellent addition to this instrument, as many children with disruptive behaviors are first conceptualized as having the impulsive/hyperactive type of ADHD. Similarly, the scales for anxiety and depression may help identify behavior that is initially thought to reflect inattention but is instead a symptom of a mood or anxiety disorder. The Conners 3 Short seems unnecessary, as the full version takes only 10 minutes longer to complete and yields considerably more data. The computer program makes scoring very straightforward and provides a wealth of data, making interpretation easy. (Scoring by hand would be a chore, however.) The data afforded by the Conners 3 are especially helpful in the educational setting: they help the provider determine whether a child meets disability criteria, aid in planning interventions for such youths, and support the formulation of an Individualized Education Program.

Test Review #2: Self-Perceptions of University Instructors

Description: The instrument contains four global factors: Self as a Person/College Professor; Self as a College Professor; Self as a Person/Adjunct; and Self as an Adjunct Faculty. Each factor is divided into four scales: Self-Concept, Ideal Concept, Reflected Self, and Perceptions of Others. Although not explicitly stated, it can be inferred that a respondent would complete items for only two of the four factors; it is expected that a respondent would complete the scales that represent his or her status as a full-time or adjunct faculty member. A person who completes the instrument would thus presumably receive a total of eight scale scores. Each scale of the instrument consists of 40 pairs of adjectives that are polar opposites in meaning. The adjectives are placed at opposite ends of a line that is marked in four quadrants; the middle of the line is interpreted as a neutral point, and each quadrant is anchored by the word very or more. Respondents answer each item by marking the quadrant on the line that most reflects the way the individual views himself or herself. The administration and instructions are clear and easy to follow. The test authors do not indicate how long it takes to complete the instrument, but it would likely take about 15-20 minutes to complete the eight scales, each of which is scored separately. Each item is scored as follows: +2 for a "very" positive position, +1 for a "more" positive position, -1 for a "more" negative position, or -2 for a "very" negative position. Items on each scale are summed to yield a total scale score that ranges from -80 to +80.
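To make the scoring arithmetic concrete, the sketch below sums hypothetical item scores on one 40-item scale and then, anticipating the stanine conversion described next, maps the total onto a stanine using the conventional normal-curve formula. The responses and norm values are invented, and the stanine mapping is only a stand-in for the manual's own conversion table.

```python
# Illustrative only: hypothetical responses on one 40-item bipolar-adjective scale.
# Each item is scored +2 ("very" positive), +1 ("more" positive),
# -1 ("more" negative), or -2 ("very" negative); there is no 0 response.
responses = [2, 1, -1, 2, -2, 1, 1, 2, -1, 1] * 4    # 40 hypothetical item scores

scale_score = sum(responses)                         # ranges from -80 to +80
assert -80 <= scale_score <= 80

# Hypothetical norm values; the manual supplies its own raw-score-to-stanine table.
norm_mean, norm_sd = 30.0, 18.0
z = (scale_score - norm_mean) / norm_sd

# Conventional stanine formula (mean 5, SD 2), truncated to the 1-9 range.
stanine = min(9, max(1, round(2 * z + 5)))
print(f"Scale score = {scale_score}, stanine = {stanine}")
```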
The test authors give procedures for converting raw scores to stanine scores for the purpose of standardized comparisons. The manual provides instructions for developing and interpreting individual profiles; however, the interpretations are complex and at times confusing, and they include interpretations based on moderate stanine scores. Interpretations are given for each of the eight scales, but no instructions are given for calculating a global score. The interpretations are general in nature, and the test authors do not discuss how the profiles can be used in a practical context. The manual also gives instructions for calculating a group profile but does not indicate how to interpret such a profile.

Development: The original Self-Perceptions Inventory was developed during an experiment that was designed to determine the best way to measure the self. The test authors suggest that the items on the instrument were based on Prescott's theory of self-consistency and the idea that accurate measures of self are rooted in comparisons of positive and negative terms. No details were given regarding how specific principles of self-consistency theory served as the foundational theory for the SPI series. Development of the SPI/University Instructors is rooted in the development of the Adult Self-Perception scales, which were copyrighted in 1965. The early instrument consisted of three scales (Self-Concept, Ideal Concept, and Reflected Self). The original scales consisted of 36 pairs of bipolar adjectives, and the early response format is similar to the current response format. Over time the test authors added the Perceptions of Others scale, and the Self as University Instructor version was designed in 2008. No information was given regarding theoretical reasons for adding the fourth factor. The current instrument began with 36 pairs of traits, which were eventually increased to 40 pairs. The test authors indicated that the increase in the number of items was implemented to (a) facilitate comparisons with the Self as a Person Scale and (b) increase the reliability of the variables contained in the two scales.

Technical: The standardization sample for the SPI/University Instructor consisted of 240 full-time and adjunct faculty in a variety of undergraduate and graduate programs; the sample included 40 full-time and 40 adjunct faculty in each group. No details were given on where the study took place. The manual reports means and standard deviations by gender, but no other demographic information is provided about the sample. The lack of detail regarding the demographic characteristics of the normative sample is a limitation of the manual. Because of these limitations, one cannot conclude that the sample is representative of the larger population of university instructors, which limits the generalizability of findings. The issue of generalizability has been addressed in previous reviews of tests from the SPI series (Clare, 2003; Demauro, 1992; Harrell, 1992; Wang, 2003). Several types of validity were reported for the SPI/University Instructor. However, the test authors did not differentiate whether the data were for full-time faculty, adjunct faculty, or a combination of both. Content validity estimates reportedly ranged from .62 to .86 across various scales. Construct validity estimates were reported for two of the scales and ranged from .64 to .72. However, no indication was given regarding the specific statistical procedure used to assess content or construct validity.
Concurrent validity was reported using the multitrait-multirater method. The manual presents a 4 x 4 correlation matrix that includes ratings by self, students (.35), colleagues (.56), and administrators (.43). However, it is not clear which aspect of self was being rated, nor is it clear how the test authors generated the single scores used for the correlations. The test authors also reported predictive validity coefficients of .60 for Self as Adjunct Faculty and .69 for Self as College Professor. The criterion variable for the predictive validity was on-the-job success; however, this construct was not defined, and the method for operationalizing it was not specified. The data reported for reliability were weak. The manual reported a single 9-week stability coefficient of .90. Likewise, only a single coefficient alpha of .90 was given. The manual did not provide details of how these single reliability indices were generated; one would expect to find separate reliability estimates for each of the 16 scales of the instrument. The limitations regarding the technical merits of the SPI have been noted in previous reviews (Clare, 2003; Demauro, 1992; Harrell, 1992; Wang, 2003).

Commentary: The SPI/University Instructor is a group- or individually administered instrument that is easy to administer and score. More extensive data are needed regarding the psychometric properties of the instrument. Data for the normative sample need to be strengthened and more clearly defined. There are many demographic and professional factors (such as age, length of time teaching, ethnicity, position, and tenure status) that can affect a person's perception of self as a university instructor. Data on these variables need to be collected and included in comparative studies of self-perceptions of university instructors. The limited data on the normative sample have been a long-standing concern with the SPI series (Clare, 2003; Demauro, 1992; Harrell, 1992; Wang, 2003). Additional details are needed on how specific validity indices were obtained, including the statistical procedures used to establish validity, and these details need to be presented in narrative form. In addition, information regarding reliability needs to be expanded: the test authors must provide reliability estimates for each of the 16 scales included in the instrument.

Test Review #3: Psychological Processing Checklist-Revised

DESCRIPTION. The Psychological Processing Checklist-Revised (PPC-R) is a brief rating scale that serves as a screener based on a teacher's perceptions of a student's psychological processing. The Individuals with Disabilities Education Act (IDEA) and several professional organizations define specific learning disabilities in terms of deficits in one or more basic psychological processes that impair learning. The PPC-R was developed as a standardized measure of six of the nine processing dimensions identified by the Illinois State Board of Education (ISBE) for identification of learning disabilities; the test authors provide operational definitions for all nine processing dimensions. The PPC-R is unique in that it allows teachers to provide input on students' classroom behaviors in combination with a multisource diagnostic evaluation (e.g., intelligence tests, achievement tests, and Response to Intervention). The PPC-R consists of 35 items on which the teacher rates students in kindergarten through Grade 5 using a 4-point frequency scale.
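As an illustration of the rating format, the sketch below sums hypothetical 4-point item ratings into the six dimension raw scores and applies the gender-normed T-score conversion described in the paragraphs that follow, flagging T-scores above 60. The item-to-dimension assignment and the normative means and standard deviations are invented for illustration; the PPC-R manual supplies its own norm tables.

```python
# Illustrative only: hypothetical item-to-dimension assignment and norm values.
# Teachers rate each of the 35 items on a 4-point frequency scale (here coded 0-3).
ratings = {i: (i * 7) % 4 for i in range(1, 36)}    # hypothetical ratings for items 1-35

# Hypothetical assignment of items to the six processing dimensions.
dimensions = {
    "Auditory Processing":     range(1, 8),    # 7 items
    "Visual Processing":       range(8, 15),   # 7 items
    "Visual-Motor Processing": range(15, 21),  # 6 items
    "Social Perception":       range(21, 26),  # 5 items
    "Organization":            range(26, 31),  # 5 items
    "Attention":               range(31, 36),  # 5 items
}

# Hypothetical gender-specific norms (mean, SD) for each dimension raw score.
norms = {name: (6.0, 3.0) for name in dimensions}

for name, items in dimensions.items():
    raw = sum(ratings[i] for i in items)
    mean, sd = norms[name]
    t = 50 + 10 * (raw - mean) / sd
    flag = "potential difficulty" if t > 60 else "within expectations"
    print(f"{name}: raw = {raw}, T = {t:.0f} ({flag})")
```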
The use of multiple raters, working as a group or individually, is encouraged. Regardless of who completes the PPC-R, the rater should have known the child for at least 6 weeks to ensure a valid description of the child's classroom behavior. PPC-R ratings can be completed in 15-20 minutes or less, and scoring requires an additional 10 minutes. Raw scores on each of the six dimensions and the total score can be converted into standardized scores on a T-score metric and into percentile ranks. T-scores are based on gender norms, with T-scores higher than 60 indicating a potential processing difficulty.

DEVELOPMENT. The PPC-R is a normative update of the original PPC, published in 2003. The PPC was initially developed for one Illinois school district that wanted to better identify the processing deficits of children assessed for learning disabilities. The initial version included 85 items distributed across nine processing areas and included two forms based on grade level (i.e., kindergarten-5th grade and 6th-12th grade). The test authors note that the second form was eliminated because it was difficult for upper-level teachers to reach agreement on students' behaviors. Following a pilot study, the PPC was reduced to 35 items that assess six processing dimensions: Auditory Processing (7 items), Visual Processing (7 items), Visual-Motor Processing (6 items), Social Perception (5 items), Organization (5 items), and Attention (5 items). Three processing dimensions were eliminated: Monitoring, Conceptualization, and Automaticity and Speed of Mental Processing. The PPC-R retains the same items as the original version.

TECHNICAL. The PPC-R was re-standardized in 2007 to expand the standardization sample, to provide greater sampling across geographic regions and racial/ethnic groups, and to increase the overall normative sample size. The PPC-R was normed on 2,107 general education students and 606 special education students with learning disabilities in kindergarten through fifth grade, drawn from all regions of the United States along with a small sample of students from Canada. Forty-three percent of the normative sample came from the Midwest, whereas other regions were greatly underrepresented (e.g., West, 2%; Canada, 6%). The general education sample was used as the normative sample, and the special education sample's raw scores at the 25th, 50th, and 75th percentiles were provided as a reference. The sample size was adequate and roughly equally distributed across grades, although kindergarten students were somewhat underrepresented. Gender representation was fairly even across all grade levels. The ethnicity of the general education students was representative of the U.S. population, but ethnicity was not presented as a function of gender or grade level, and neither socioeconomic status nor characteristics of the school districts (e.g., size, public or private, rural versus urban) are reported. At times the technical manual is unclear: the test authors reference different sample sizes when reporting descriptive statistics, reliability, and validity data, and some tables report a larger sample size than the normative sample. These discrepancies are not explained.
Additionally, the test authors report convergent and divergent validity on the basis of correlations between the PPC-R and two intelligence tests (the Woodcock-Johnson Tests of Cognitive Ability-Revised [WJ-R; Woodcock & Johnson, 1989] and the Cognitive Assessment System [CAS; Naglieri & Das, 1997]) and two group-administered educational tests (the Iowa Test of Basic Skills [ITBS; Hoover, Dunbar, & Frisbie, 2001] and the Cognitive Abilities Test [CogAT; Thorndike & Hagen, 1993]). Evidence for convergent validity was mixed because each PPC-R subscale correlated moderately with multiple WJ-R subtests that theoretically do not involve that form of processing. The validity of the Visual-Motor Processing and Social Perception scales remains largely unexamined because none of the intellectual or educational tests specifically involved tasks requiring these processes.

COMMENTARY. The test authors appear to have made significant attempts to improve the psychometric properties of the instrument compared with the previous version, particularly through a substantial increase in the size of the standardization sample and the inclusion of students outside the state of Illinois. However, it is unclear why the test authors retain a six-subscale model of psychological processing when factor analysis suggested a five-factor model, given the perfect correlation between the Organization and Attention factors. Similarly, all six factors are moderately to highly correlated with one another (rs ranging from .58 to 1.0). Further attention to the relationships among psychological processes should be provided in the technical manual to caution assessors against overcompartmentalizing these related processes. The PPC-R was developed as a screening instrument and should be used only for that purpose. However, the case studies provided in the manual stress the use of the PPC-R to provide information for intervention. No empirical support for the use of the PPC-R in guiding intervention is provided in the manual, nor could any be found in a review of the literature. Therefore, further evidence of validity (e.g., instructional validity) is needed. The literature review located no peer-reviewed publications; research involving the PPC-R appears to consist solely of a few master's theses and posters presented at national conferences.

Test Review #4: Computerized Test of Information Processing

DESCRIPTION: The Computerized Test of Information Processing (CTIP) was designed as a test of information-processing speed, or more specifically as a method of assessing whether processing speed has been compromised by brain injury or disease (especially Traumatic Brain Injury and Multiple Sclerosis). The test also purports to assess for malingering of cognitive deficits by determining whether examinees are exerting maximum effort as they approach the task. The test authors recommend that the CTIP be administered as part of a comprehensive neuropsychological assessment battery that also includes measures of motor ability and potential malingering; the manual clearly states that the CTIP should not be used as a stand-alone indicator of malingering. Three subtests are included on the CTIP, administered in the following order using a computer monitor and keyboard: Simple Reaction Time, Choice Reaction Time, and Semantic Search Reaction Time.
On the Simple subtest, the examinee is instructed to press the spacebar as quickly as possible when a single stimulus appears in the center of the computer screen. Thus, each trial provides a measure of the number of seconds required to process the stimulus and provide a response. On the Choice subtest, the examinee must choose which of two keyboard keys to press based on which stimulus (of two possible stimuli) appears on the screen. Thus, each trial provides a measure of the number of seconds needed to process two pieces of information and decide on a response. Finally, the Semantic Search subtest requires the examinee to choose which of two keyboard keys to press based on whether or not a stimulus word presented on the computer screen fits into a corresponding semantic category, which is also presented on the screen. Each trial provides a measure of the number of seconds required to recognize and process the meaning of the stimulus word, determine whether the word fits into the category presented, and press the corresponding keyboard key. Given the increasingly complex demands of the subtests, reaction times tend to increase from the Simple to the Choice to the Semantic Search subtest. As examinees take more time to process the information contained in these tasks and decide on a response, their reaction time scores increase, indicating decreases in processing speed.

DEVELOPMENT: The operational definition of information-processing speed provided in the manual is "a specific type of attentional processing that represents the speed at which processing operations are carried out" (p. 5). The test authors provide a brief review of the information-processing speed construct, citing its importance to other cognitive functions such as working memory and verbal comprehension. Further, the authors note that deficits in information-processing speed are involved in numerous clinical phenomena, such as normal aging, Traumatic Brain Injury, and neurological diseases such as Multiple Sclerosis. The test authors developed the CTIP based on research indicating the sensitivity and utility of reaction time tests for detecting cognitive impairment. The authors note that the Simple, Choice, and Semantic Search subtests were selected because they met criteria important in the measurement of information-processing speed, including minimal practice effects after repeated administrations, increasing complexity from one subtest to the next, tasks that are less frustrating than other measures of reaction time, and minimal reliance on motor abilities for successful task completion.

TECHNICAL: Standardization. The normative sample included 386 people (173 males, 213 females) with cognitive functioning within normal limits. Standardization participants ranged from 15 to 74 years of age. For purposes of converting raw median reaction time scores into norm-based percentile scores, the sample was divided into four age groups: 15 to 24 years (n = 109), 25 to 44 years (n = 91), 45 to 64 years (n = 132), and 65 to 74 years (n = 54). The rationale for dividing the sample into these age groups is that reaction times became progressively slower as age increased. The mean number of years of education among normative participants was either 14.89 or 14.94 (the text presents one value and a table presents another; it is unclear which is correct).
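To clarify how this norm conversion works in practice, the sketch below computes a median reaction time from a set of hypothetical trials and looks it up in an invented percentile table for one age band; the CTIP provides its own normative tables for each of the four age groups.

```python
import statistics

# Illustrative only: hypothetical trial reaction times (in seconds) for one subtest.
trial_times = [0.42, 0.39, 0.45, 0.41, 0.38, 0.47, 0.40, 0.43]
median_rt = statistics.median(trial_times)

# Hypothetical percentile table for one age band (e.g., 25-44 years).
# Faster (smaller) median reaction times earn higher percentile ranks.
norm_table = [      # (upper cut point in seconds, percentile rank)
    (0.30, 95),
    (0.35, 75),
    (0.40, 50),
    (0.45, 25),
    (0.50, 10),
]

percentile = 2      # default for median RTs slower than every cut point
for cut_point, rank in norm_table:
    if median_rt <= cut_point:
        percentile = rank
        break

print(f"Median RT = {median_rt:.3f} s -> {percentile}th percentile (hypothetical norms)")
```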
Unfortunately, no information is provided about the ethnicity or geographic distribution of the sample, or about the process by which people were selected for participation in the standardization process.

Reliability. Internal consistency analyses are not reported for the CTIP, given the nature of the subtests (i.e., reaction times are the primary index of interest rather than the number of correct or incorrect responses). Nonetheless, some form of internal consistency analysis (e.g., coefficient alpha, item-total correlations) could have been helpful in detecting items that did not function in the same way as other items (see Neuhaus, Carlson, Jeng, Post, & Swank, 2001, for an example of how internal consistency analyses were used with a computerized measure of reaction time). This seems especially important for the Semantic Search subtest, as the content differences among individual items on this subtest are greater than the interitem differences on the Simple and Choice subtests. In the absence of internal consistency analyses, reliability was estimated with two analyses of test-retest stability. In the first study, a sample of 20 undergraduate students took the CTIP on three occasions: The first administration was followed by a second administration after an interval of only 20 minutes, and the third administration occurred 1 week later. Test-retest coefficients for the 20-minute interval ranged from .56 to .66; coefficients for the 1-week interval ranged from .63 to .80. The second test-retest study followed a group of 20 undergraduates over 4 weeks, with one administration per week. Thirteen of these students participated in an additional administration 6 months after the fourth administration. Across the first 4 weeks, coefficients ranged as follows: Simple Reaction Time from .37 to .85, Choice Reaction Time from .61 to .85, and Semantic Search Reaction Time from .54 to .84. As in the first study, longer time intervals generally were associated with higher coefficients. Coefficients based on the 6-month interval ranged from .49 to .66. Overall, some of these test-retest coefficients are lower than expected, given the seemingly stable nature of the information-processing speed construct. Additional research using larger samples will be necessary to explore the reasons for these findings, especially with regard to why the short-term stability coefficients are so low.

Validity. Several types of validity evidence are described in the manual. Construct validity was evaluated by correlating CTIP scores with scores on other measures of neuropsychological constructs; for example, scores were correlated with scores on the Trail Making Test, the Digit Span subtest, and the Digit Symbol test. Using subsamples from the normative group, the three CTIP subtests demonstrated high and statistically significant intercorrelations with one another and moderate, statistically significant correlations with Trails A and Trails B, but very small and nonsignificant correlations with the Digit Span and Digit Symbol tests. Correlational results also are presented for samples of participants with Traumatic Brain Injury or Multiple Sclerosis, but these correlations were rather inconsistent across groups, with no consistent or predictable pattern of relationships among scores.
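Because the reliability evidence rests entirely on test-retest correlations, it may help to show how such a stability coefficient is computed: the sketch below correlates two administrations of the same subtest for a set of hypothetical examinees.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Illustrative only: hypothetical median reaction times (seconds) for ten examinees
# on two administrations of the same subtest.
time_1 = [0.41, 0.38, 0.52, 0.47, 0.35, 0.44, 0.50, 0.39, 0.46, 0.43]
time_2 = [0.43, 0.36, 0.49, 0.50, 0.37, 0.42, 0.53, 0.41, 0.44, 0.45]

# The test-retest stability coefficient is simply this correlation.
print(f"Test-retest r = {pearson_r(time_1, time_2):.2f}")
```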
Several clinical studies were conducted to establish evidence for criterion-related validity by demonstrating that different clinical groups performed in distinct and predictable ways on the CTIP. Overall, the results of these studies generally support criterion-related validity.

COMMENTARY: Perhaps the principal strength of the CTIP is its ease of administration and scoring, as the automated procedures eliminate scoring time and the potential for scoring errors. This reviewer installed the CTIP on a laptop computer in order to gain a "hands-on" understanding of the instrument. Administration, scoring, and report-generating procedures are simple for the examiner to navigate. Observing the examinee during administration should also provide the examiner with clinically useful information, such as behaviors or verbalizations indicative of fatigue, motivation, and tolerance for frustration. The clinical report provides a useful graph that compares the examinee's performance to the normative group for each subtest, which (along with the percentile scores) makes it very easy to see deficits in information-processing speed relative to the normative sample. One caveat related to the computerized administration: Examiners must take care to use a computer monitor without a harsh glare, as glare may influence performance.

Test Review #5: The 16PF Adolescent Personality Questionnaire (APQ)

Description: This test was developed and normed to elicit information regarding an adolescent's personal style, problem-solving skills, interests in specific work activities, and other areas in which the adolescent may be having problems. The purpose of the test is to screen the adolescent and to decide whether to introduce particular topics in a counseling setting. The test includes four sections: (a) personal style, which measures normal personality traits; (b) problem solving, which measures the adolescent's ability to reason; (c) work activity preferences, which measures six career interest areas; and (d) "Life's Difficulties," which consists of questions on sensitive topics (sex, emotional needs, mood, aggression, etc.). The assessment is intended to support educational adjustments where personality is relevant, and it is especially useful in helping professionals adjust their approach to the characteristics of the adolescent. The intended use of the test is to measure traits that influence how youths work and interact with one another. The test provides a broad overview of personality, examining independence, anxiety, tough-mindedness, self-control, and openness to change.

Development: Influenced by the world wars, Raymond Cattell developed the original instrument. Overall, the test defines personality traits. A very popular and widely used measure of personality, it is used to describe and identify the traits that make a person who he or she is.

Technical: In this reviewer's opinion, the test can be adapted for use with students from culturally and linguistically diverse backgrounds, and it is available in many languages. The publisher offers a certification program designed to help users better understand the content of the test and develop interview strategies. Online training is available; to use the assessment, one must hold a master's degree in a field such as social work, psychology, education, human resources, or criminal justice, or must successfully complete the publisher's online certification program.
Commentary: Cattell and Cattell (1995) described the development of the fifth edition of the 16PF, undertaken with the goal of updating and improving item content, standardizing on a current population sample, and refining the instrument psychometrically. Item selection involved an iterative process, commencing with selected items from all earlier versions of the 16PF (presumably excluding items that showed significant sex differences). Factor analyses (H. E. P. Cattell, 2001, 2004) supported the factor structure of the 16PF5 and demonstrated its continuity with earlier versions but, for this version, provided only five second-stratum factors, in line with the currently popular Big Five personality dimensions and the corresponding static Five Factor Model (FFM). However, it is important to note that both Gorsuch and Cattell (1967) and Cattell and Nichols (1972) had previously undertaken extensive investigations into the delineation of higher-stratum Q-data personality factors. For example, from an examination of 10 separate studies, Cattell and Nichols identified no fewer than eight second-stratum 16PF factors. Therefore, the Big Five (FFM) was seen by Cattell as overly restrictive (Cattell, 1995). This issue has been examined independently (Boyle et al., 1995; Boyle & Saklofske, 2004), showing the inadequacy of the FFM, which accounts for less than 60% of the known trait variance within the normal personality sphere alone, not including the abnormal trait domain (Boyle et al., 1995, p. 432; Boyle & Smári, 1997, 1978, 2002).