Chapter 3
Alignment Among Secondary and Postsecondary Assessments in California

The California Assessment Environment

During the past decade, California’s assessment environment has been one of constant change. In the 1970s and 1980s, only one testing program existed in California, the California Assessment Program (CAP). This program was school-focused and low-stakes. In 1990, Governor Deukmejian canceled CAP, and the state began wrestling over the nature of assessment and accountability in California.

The first major issue regarding assessment was the establishment of state curriculum content and achievement standards. A Standards Commission was created, and in 1997 the Commission published state content standards for math and English.

The second major issue of contention was whether schools or students would be the focus of accountability. In 1992, the State Department of Education began work on California’s first performance assessment, the California Learning Assessment System (CLAS). CLAS was supposed to provide school scores initially and eventually provide individual scores. CLAS was administered in California for two years, but outcries from some parent groups regarding possible violations of student privacy, coupled with concerns about its technical qualities and its inability to provide individual student data, led Governor Wilson to veto CLAS funding. In 1995, another assessment was legislated, the California Assessment of Applied Academic Skills (CAAAS), a matrix-sample test proposed as a measure of progress toward state standards. By design, the CAAAS would provide only group scores; individual scores would not be available. Governor Wilson intervened again, withholding funding and demanding a basic skills test that would provide individual scores. Currently, the development of the CAAAS has been postponed. Wilson's demands were eventually met with the authorization of the Standardized Testing and Reporting Program (STAR) in 1997.
The STAR program, which mandates testing for students in grades 2 through 11, consists of three parts. The first part is the Stanford Achievement Test, Version 9 (commonly referred to as Stanford 9 or SAT 9).1 The Stanford 9 includes questions in math and English/language arts (ELA) in all the tested grades, and high school students are also tested in science and history. Initially, the Stanford 9 test was to be used primarily as a diagnostic aid; parents, educators, and policymakers would use the scores to gauge the progress of students and schools in mastering basic skills and knowledge. However, a new high-stakes accountability focus was recently introduced as part of STAR legislation, mandating that test results be used to reward or sanction schools.

1 Stanford 9 is published by Harcourt Brace Educational Measurement.

The second part of STAR consists of the “augmentation items.” Because the Stanford 9 is a battery of tests designed to assess the basic skills taught in most curricula throughout the country, the state added some augmentation items that are aligned with California's content standards. These additional items are now part of the California Standards Tests. The California Standards Tests were initially designed to assess progress toward state standards in math and language arts only, but history, science, and writing were subsequently added to the program. All California Standards Tests are criterion-referenced, and some of the exams are end-of-course measures. Math students in grades 8-11 and science students in grades 9-11 take the math and science tests that are tied to the specific courses in which they are currently enrolled. Writing tests are administered in grades 4 and 7, and language arts tests are administered in grades 2-11.

The third part of STAR is the Spanish Assessment of Basic Education, Second Edition (SABE/2). This is a multiple-choice test intended to assess native Spanish speakers’ knowledge of math and language arts.
The SABE/2 is required for native Spanish speakers during their first year in California public schools.

The California Department of Education administers other assessments as well. The California High School Exit Exam was legislated for development in 1999 and slated for implementation in 2004. The exam assesses state content standards in ELA and math through grade 10. Specifically, the ELA test is aligned with content standards from the 9th/10th grade, whereas the math test is aligned with content standards from the 6th/7th grade and Algebra 1. As a condition of graduating, students must pass both parts of the exam. Students are given up to 8 opportunities to pass the test.

Additionally, the California Department of Education administers the Golden State Exams (GSEs). The GSEs are voluntary tests allowing high school students to earn special recognition for outstanding levels of achievement, culminating in the Golden State Seal Merit Diploma. Most of the GSEs are end-of-course exams, although a few are comprehensive exams that measure knowledge developed over several courses. Currently, the GSEs are low-stakes assessments, although the University of California system is considering whether GSE scores can be used as an alternative or supplementary measure to the SAT I in making admissions decisions.

California Assessments Included in this Study

For this study, we examined the math and reading sections of the Stanford 9. Because security concerns precluded us from obtaining the Stanford 9 form currently in use in California (form T), we instead examined an alternate, low-security form (form S). Because the Stanford 9 form that we analyzed may not resemble the form that is in use, the Stanford 9 results should be interpreted cautiously. STAR augmentation items were unavailable, so these items are excluded from our analysis.
We also excluded the California Standards Tests and the California High School Exit Exam because they were not available when this study was initiated. However, the GSEs are included in this study. The GSEs are 90-minute exams containing both multiple-choice and open-ended items. The tests are offered in key subject areas in grades seven through eleven, and are intended to assess student achievement relative to the state’s content standards. Five GSEs are included in this study: High School Mathematics (HS Math), First Year Algebra, Geometry, Reading/Literature, and Written Composition. The GSE Algebra and GSE Geometry are end-of-course exams, whereas the GSE HS Math, GSE Reading/Literature, and GSE Written Composition are comprehensive exams that measure student knowledge over several courses. The GSE HS Math is intended to assess math knowledge commonly taught in three years of college preparatory math courses, whereas the GSE Reading/Literature and GSE Written Composition are intended to assess reading and writing skills developed during three years of college preparatory English courses.

Additionally, several college placement tests are examined. As mentioned in Chapter 2, the kinds of college placement tests given can vary by institution, so we obtained placement tests from the more selective university system, the University of California (UC), and from the less selective system, California State University (CSU). We also obtained placement tests used at the community college level. CSU has system-wide placement tests for both math and English, whereas UC administers a system-wide test only for English. Both CSU and UC administer college placement tests to assess whether entering students require remediation. In order to be exempt from taking an English placement exam, CSU students must exceed a certain score on the SAT II Writing, the verbal section of the SAT I, or the writing section of the ACT.
For math, CSU students may use satisfactory scores on the math section of either the ACT or SAT I to exempt themselves from taking a math placement test. Students not meeting the minimum standards under the CSU guidelines must take a 75-minute multiple-choice math exam and/or a 105-minute English test, which contains both multiple-choice and open-ended items. To be exempt from taking an English placement test, UC requires a minimum achievement level on either the SAT II Writing or the AP English Language and Composition exam. Examinees not meeting the UC standards for English are required to take a two-hour writing exam that demonstrates their ability to organize and develop their ideas.

Community colleges administer a range of exams; we include the assessments used at Santa Barbara City College in this analysis as an example. All students planning to enroll in an English course at Santa Barbara City College must take the 85-minute College Test for English Placement before registration. The test, consisting of both multiple-choice and open-ended items, is used to place students in an appropriate English course. Santa Barbara City College also uses several placement tests from the Mathematics Diagnostics Testing Project to measure student readiness for a broad range of mathematical courses. Depending upon the math test taken and their performance on this test, students are placed into an appropriate math course. The Algebra Readiness Test, Second Year Algebra Readiness Test, and Mathematical Analysis Readiness Test are included in this study. The Algebra Readiness Test consists of 60 multiple-choice items administered in 50 minutes, whereas both the Second Year Algebra Readiness Test and Mathematical Analysis Readiness Test contain 45 multiple-choice items administered in 45 minutes.

Tables 3.1 and 3.2, organized by test function, list these testing programs and the type of information we were able to obtain for this study.
For most tests, we used a single form from a recent administration or a full-length, published sample test. In a few instances where full-length forms were unavailable, we used published sets of sample items. This was the case for the CSU placement tests and the GSEs. For the ELA tests, Table 3.2 specifies whether the test includes each of three possible skills: reading, editing, and writing.

Table 3.1 Technical Characteristics of the Mathematics Assessments

| Test | Test Type | Materials Examined | Time Limit | Number and Type of Items | Tools | Purpose | Content as Specified in Test Specifications |
|---|---|---|---|---|---|---|---|
| Golden State Exam Algebra | State achievement, end-of-course | Sample items | Two separate 45-minute sessions | 30 MC, 2 OE | Calculator, ruler | Monitor student achievement toward state-approved content standards, provide special diploma | First-year algebra |
| Golden State Exam Geometry | State achievement, end-of-course | Sample items | Two separate 45-minute sessions | 30 MC, 2 OE | Calculator, ruler | Monitor student achievement toward state-approved content standards, provide special diploma | Geometry |
| Golden State Exam High School Mathematics | State achievement | Sample items | Two separate 45-minute sessions | 30 MC, 2 OE | Calculator, ruler | Monitor student achievement toward state-approved content standards, provide special diploma | Algebra I and II, geometry, probability and statistics |
| Stanford 9 | State achievement | Full form | 60 minutes | 48 MC | Calculator, ruler | Monitor student achievement | Two subtests: mathematical problem-solving and mathematical procedures |
| ACT | College admissions | Full sample form | 60 minutes | 60 MC | Calculator | Selection of students for higher education | Prealgebra (23%), elementary algebra (17%), intermediate algebra (15%), coordinate geometry (15%), plane geometry (23%), and trigonometry (7%) |
| SAT I | College admissions | Full sample form | 75 minutes | 35 MC, 15 QC, 10 GR | Calculator | Selection of students for higher education | Arithmetic (13%), algebra (35%), geometry (26%), and other (26%) |
| SAT II Mathematics Level IC | College admissions | Full sample form | 60 minutes | 50 MC | Calculator | Selection of students for higher education | Algebra (30%), geometry (38%, specifically plane Euclidean (20%), coordinate (12%), and three-dimensional (6%)), trigonometry (8%), functions (12%), statistics and probability (6%), and miscellaneous (6%) |
| SAT II Mathematics Level IIC | College admissions | Full sample form | 60 minutes | 50 MC | Calculator | Selection of students for higher education | Algebra (18%), geometry (20%, specifically coordinate (12%) and three-dimensional (8%)), trigonometry (20%), functions (24%), statistics and probability (6%), and miscellaneous (12%) |
| Algebra Readiness Test | College placement | Full sample form | 50 minutes | 60 MC | Calculator | Assess preparation for elementary algebra | Arithmetic, prealgebra |
| California State University Entry-Level Mathematics Placement Exam | College placement | Sample items | 75 minutes | 65 MC | Calculator | Assess whether admitted students possess entry-level math skills | Algebra I and II (60%), geometry (20%), data interpretation, counting, probability, and statistics (20%) |
| Mathematical Analysis Readiness Test | College placement | Full sample form | 45 minutes | 45 MC | Calculator | Assess preparation for trigonometry | Prealgebra, algebra, and geometry |
| Second Year Algebra Readiness Test | College placement | Full sample form | 45 minutes | 45 MC | Calculator | Assess preparation for intermediate algebra | Prealgebra, algebra, geometry |

Notes. MC = multiple-choice; OE = open-ended; GR = grid-in; QC = quantitative comparison.

Table 3.2 Technical Characteristics of the English/Language Arts Assessments

| Test | Test Function | Materials Examined | Time Limit | Number and Type of Items | Purpose | Reading Section? | Editing Section? | Writing Section? |
|---|---|---|---|---|---|---|---|---|
| Golden State Exam Reading/Literature | State achievement | Sample items | Two separate 45-minute sessions | 30 MC reading, 2 OE reading | Monitor student achievement toward state-approved content standards, provide special diploma | Y | N | Y |
| Golden State Exam Written Composition | State achievement | Sample items | Two separate 45-minute sessions | 30 MC editing, 2 OE writing | Monitor student achievement toward state-approved content standards, provide special diploma | N | Y | Y |
| Stanford 9 | State achievement | Full form | 60 minutes | 84 MC reading | Monitor student achievement toward CA standards | Y | N | N |
| ACT | College admissions | Full sample form | 80 minutes (35 reading, 45 editing) | 40 MC reading, 75 MC editing | Selection of students for higher education | Y | Y | N |
| AP Language and Composition | College admissions | Full sample form | 180 minutes (60 reading, 120 writing) | 52 MC reading, 1 OE reading, 2 OE writing | Provide opportunities for HS students to receive college credit and advanced course placement | Y | N | Y |
| SAT I | College admissions | Full sample form | 75 minutes | 40 MC reading, 38 MC editing | Selection of students for higher education | Y | Y | N |
| SAT II Literature | College admissions | Full sample form | 60 minutes | 60 MC reading | Selection of students for higher education | Y | N | N |
| SAT II Writing | College admissions | Full sample form | 60 minutes (40 editing, 20 writing) | 60 MC editing, 1 OE writing | Selection of students for higher education | N | Y | Y |
| California State University Entry-Level English Placement Exam | College placement | Sample items | 105 minutes (30 reading, 30 editing, 45 writing) | 45 MC reading, 45 MC editing, 1 OE writing | Assess whether admitted students possess entry-level English skills | Y | Y | Y |
| College Test for English Placement | College placement | Full sample form | 85 minutes (30 reading, 35 editing, 20 writing) | 35 MC reading, 70 MC editing, 1 OE writing | Assess whether students possess entry-level English skills | Y | Y | Y |
| University of California Subject A Examination | College placement | Sample questions | 120 minutes | 1 OE writing | Assess whether admitted students possess entry-level writing skills | N | N | Y |

Notes. MC = multiple-choice; OE = open-ended.

Alignment Among California Math Assessments

In this section, we describe the results of our alignment exercise for the math assessments. The results are organized so that alignment among tests with the same function is presented first, followed by a discussion of alignment among tests with different functions. Alignment is described by highlighting similarities and differences with respect to technical features, content, and cognitive demands.
That is, we first present how the assessments vary on characteristics such as time limit, format, contextualized items, graphs, diagrams, and formulas. We then document differences with respect to content areas, and conclude with a discussion of discrepancies in terms of cognitive requirements.

Table 3.3 presents the alignment results for the math assessments. The numbers in Table 3.3 represent the percent of items falling into each category. As an example of how to interpret the table, consider the SAT I results: 58% of its items are multiple-choice, 25% are quantitative comparisons, and 17% are grid-in items. With respect to contextualization, 25% of the SAT I questions are framed as real-life word problems. Graphs are included within the item-stem on 7% of the questions, but graphs are not included within the response options (0%), and students are not asked to produce any graphs (0%). Similarly, diagrams are included within the item-stem on 18% of the questions, but diagrams are absent from the response options (0%), and students are not required to produce a diagram (0%). With respect to content, the SAT I does not include trigonometry (0%), and assesses elementary algebra (37%) most frequently. In terms of cognitive demands, procedural knowledge (53%) is the focus of the test, but conceptual understanding (32%) and problem solving (15%) are assessed as well. Results for the other tests are interpreted in an analogous manner.
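As a minimal sketch of this reading, the snippet below encodes the SAT I percentages quoted above and checks which groups of categories account for all items and which category dominates. The dictionary layout and the helper names (`is_partition`, `dominant`) are illustrative assumptions, not part of the study, and the one-point rounding tolerance is likewise assumed.

```python
# SAT I percentages as quoted in the text (illustrative layout, not the study's data format).
sat_i = {
    "format": {"MC": 58, "QC": 25, "GR": 17},
    "cognitive": {"CU": 32, "PK": 53, "PS": 15},
    # Graph categories flag the presence of a feature, so they need not sum to 100.
    "graphs": {"S": 7, "RO": 0, "P": 0},
}


def is_partition(group, tolerance=1):
    """True if the percentages sum to 100 within an assumed rounding tolerance."""
    return abs(sum(group.values()) - 100) <= tolerance


def dominant(group):
    """Return the category holding the largest share of items."""
    return max(group, key=group.get)


print(is_partition(sat_i["format"]))     # format categories cover every item
print(is_partition(sat_i["cognitive"]))  # so do the cognitive-demand categories
print(is_partition(sat_i["graphs"]))     # graph flags do not
print(dominant(sat_i["cognitive"]))
```

Under these assumptions, format and cognitive demands behave as exhaustive classifications of the items, whereas the graph, diagram, and formula columns record how often a feature appears and can sum to far less than 100.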
Table 3.3 Alignment Among the Technical, Content, and Cognitive Demands Categories for the Math Assessments

Part 1: Format, context, graphs, diagrams, and formulas (percent of items)

| Function | Test | MC | QC | GR | OE | C | Graph S | Graph RO | Graph P | Diagram S | Diagram RO | Diagram P | Formula M | Formula G |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| State achievement | GSE Algebra | 95 | 0 | 0 | 5 | 15 | 0 | 5 | 0 | 10 | 0 | 0 | 10 | 0 |
| State achievement | GSE Geometry | 95 | 0 | 0 | 5 | 10 | 0 | 0 | 5 | 75 | 0 | 0 | 25 | 0 |
| State achievement | GSE HS Math | 92 | 0 | 0 | 8 | 33 | 0 | 5 | 0 | 23 | 0 | 5 | 15 | 0 |
| State achievement | Stanford 9 | 100 | 0 | 0 | 0 | 58 | 21 | 4 | 0 | 19 | 0 | 0 | 6 | 6 |
| College admissions | ACT | 100 | 0 | 0 | 0 | 22 | 5 | 2 | 0 | 13 | 0 | 0 | 15 | 0 |
| College admissions | SAT I | 58 | 25 | 17 | 0 | 25 | 7 | 0 | 0 | 18 | 0 | 0 | 1 | 8 |
| College admissions | SAT II Math Level IC | 100 | 0 | 0 | 0 | 18 | 8 | 0 | 0 | 26 | 0 | 0 | 12 | 0 |
| College admissions | SAT II Math Level IIC | 100 | 0 | 0 | 0 | 12 | 12 | 2 | 0 | 2 | 0 | 0 | 10 | 0 |
| College placement | Algebra Readiness Test | 100 | 0 | 0 | 0 | 26 | 9 | 0 | 0 | 12 | 0 | 0 | 8 | 2 |
| College placement | CSU Entry-Level Math Placement Exam | 100 | 0 | 0 | 0 | 24 | 0 | 0 | 0 | 16 | 0 | 0 | 18 | 0 |
| College placement | Second Year Algebra Readiness Test | 100 | 0 | 0 | 0 | 7 | 6 | 4 | 0 | 22 | 0 | 0 | 16 | 0 |
| College placement | Math Analysis Readiness Test | 100 | 0 | 0 | 0 | 7 | 0 | 2 | 0 | 18 | 0 | 0 | 18 | 0 |

Part 2: Content and cognitive demands (percent of items)

| Function | Test | PA | EA | IA | CG | PG | TR | SP | MISC | CU | PK | PS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| State achievement | GSE Algebra | 0 | 52 | 0 | 19 | 14 | 0 | 10 | 5 | 24 | 71 | 5 |
| State achievement | GSE Geometry | 0 | 0 | 0 | 5 | 86 | 10 | 0 | 0 | 52 | 38 | 10 |
| State achievement | GSE HS Math | 23 | 15 | 0 | 23 | 23 | 0 | 15 | 0 | 62 | 23 | 15 |
| State achievement | Stanford 9 | 0 | 13 | 2 | 19 | 19 | 4 | 40 | 4 | 63 | 31 | 6 |
| College admissions | ACT | 17 | 22 | 5 | 15 | 25 | 8 | 3 | 5 | 40 | 53 | 7 |
| College admissions | SAT I | 13 | 37 | 2 | 6 | 19 | 0 | 13 | 11 | 32 | 53 | 15 |
| College admissions | SAT II Math Level IC | 2 | 30 | 10 | 12 | 28 | 4 | 8 | 6 | 34 | 58 | 8 |
| College admissions | SAT II Math Level IIC | 2 | 14 | 22 | 12 | 14 | 18 | 6 | 12 | 26 | 54 | 20 |
| College placement | Algebra Readiness Test | 42 | 20 | 0 | 2 | 12 | 0 | 8 | 0 | 16 | 84 | 0 |
| College placement | CSU Entry-Level Math Placement Exam | 6 | 32 | 8 | 16 | 14 | 2 | 22 | 0 | 28 | 70 | 2 |
| College placement | Second Year Algebra Readiness Test | 2 | 60 | 0 | 4 | 33 | 0 | 0 | 0 | 16 | 82 | 2 |
| College placement | Math Analysis Readiness Test | 2 | 31 | 31 | 7 | 29 | 0 | 0 | 0 | 13 | 82 | 4 |

Notes.
Format: MC = multiple-choice items; QC = quantitative comparison items; GR = fill-in-the-grid items; OE = open-ended items.
Contextualization: C = contextualized items.
Graphs and Diagrams: S = graph/diagram within item-stem; RO = graph/diagram within response options; P = graph/diagram needs to be produced.
Formulas: M = formula needs to be memorized; G = formula is provided.
Content Areas: PA = prealgebra; EA = elementary algebra; IA = intermediate algebra; CG = coordinate geometry; PG = plane geometry; TR = trigonometry; SP = statistics and probability; MISC = miscellaneous topics.
Cognitive Demands: CU = conceptual understanding; PK = procedural knowledge; PS = problem-solving.

Alignment Among Tests With the Same Function

State Achievement Tests

Four state achievement tests are included in this analysis: the GSE Algebra, GSE Geometry, GSE HS Math, and Stanford 9. Of the four assessments, only the Stanford 9 does not require two testing sessions.2 The Stanford 9 is also the only state achievement test that does not include any open-ended questions. The Stanford 9 and GSE HS Math contain many contextualized items (58% and 33%, respectively), whereas the GSE Algebra and GSE Geometry contain relatively few (15% and 10%, respectively). The Stanford 9 also includes many problems that contain graphs within the item-stem (21%), but these types of problems are absent from the GSEs (0% for all three GSE tests). Questions with diagrams in the item-stem comprise a large proportion of the GSE Geometry (75%), a moderate proportion of the GSE HS Math and Stanford 9 (23% and 19%, respectively), and a small proportion of the GSE Algebra (10%) questions. Although formulas are typically not needed to solve a problem, there is variation in the use of formulas among the tests.
Memorized formulas are needed most frequently on the GSE Geometry items (25%) and least frequently on the Stanford 9 items (6%).

There are also differences with respect to emphasis of particular content areas. The GSE Algebra focuses on elementary algebra (52%), the Stanford 9 focuses on statistics and probability (40%), and the GSE Geometry focuses on planar geometry (86%). In contrast, the GSE HS Math is more broadly distributed, emphasizing planar geometry, coordinate geometry, and prealgebra (23% of its items on each content area). Most items on the Stanford 9, GSE HS Math, and GSE Geometry assess conceptual understanding (63%, 62%, and 52%, respectively), but the GSE Algebra items tend to measure procedural knowledge (71%).

2 As part of the STAR, assessments in several areas, including science and social science, are administered over a three-day period. However, the math and ELA assessments are administered in a single testing session, unlike the GSEs, which are administered over two testing sessions.

College Admissions Tests

We examined four college admissions tests: the ACT, SAT I, SAT II Math Level IC, and SAT II Math Level IIC. All tests, except the SAT I, have a one-hour time limit. The SAT I has a 75-minute time limit. All four exams are also predominantly multiple-choice, although the SAT I includes quantitative comparison (25%) as well as grid-in (17%) items. Contextualized questions are most prevalent on the SAT I (25%) and least prevalent on the SAT II Math Level IIC (12%). Students are rarely asked to work with graphs, and questions that contain graphs within the item-stem constitute no more than 12% of items on the college admissions measures. Questions that include diagrams within the item-stem are more prevalent, comprising 26%, 18%, and 13% of items on the SAT II Math Level IC, SAT I, and ACT, respectively. However, questions with diagrams within the item-stem are infrequent on the SAT II Math Level IIC (2%).
Formulas are also uncommon, but there are differences with respect to the extent to which formulas are necessary. Whereas the ACT, SAT II Math Level IC, and SAT II Math Level IIC include some items in which a memorized formula is needed (15%, 12%, and 10%, respectively), these items are largely absent from the SAT I (1%).

Although college admissions exams generally sample from the same content areas, they do not do so to the same extent. Elementary algebra is the largest content area on the SAT I (37%). The SAT II Math Level IC also emphasizes elementary algebra (30%), but focuses on planar geometry as well (28%). The ACT shows a content emphasis similar to that of the SAT II Math Level IC; 22% of its items assess elementary algebra and 25% assess planar geometry. The SAT II Math Level IIC, on the other hand, draws from more advanced content areas, such as intermediate algebra (22%) and trigonometry (18%).

In terms of cognitive demands, all four tests assess procedural knowledge to a similar degree. Procedural knowledge items constitute between 53% and 58% of the items found on college admissions measures. However, there is more variation among the exams with respect to emphasis on problem solving. The SAT I and SAT II Math Level IIC place relatively greater emphasis on problem solving (15% and 20%, respectively) than do the ACT and SAT II Math Level IC (7% and 8%, respectively).

College Placement Tests

We included four college placement tests in this study: the Algebra Readiness Test, CSU Entry-Level Math Placement Exam, Math Analysis Readiness Test, and Second Year Algebra Readiness Test. All four measures are solely multiple-choice tests, and the majority can be completed within an hour (the CSU Entry-Level Math Placement Exam is the exception at 75 minutes).
The Algebra Readiness Test and CSU Entry-Level Math Placement Exam contain a moderate proportion of items framed in a real-life context (26% and 24%, respectively), but contextualized items are largely absent from the Math Analysis Readiness Test and Second Year Algebra Readiness Test (7% each). Questions in which graphs are included within the item-stem are rarely present on college placement measures. They constitute 9% and 6%, respectively, of items on the Algebra Readiness Test and Second Year Algebra Readiness Test, and are not included on either the CSU Entry-Level Math Placement Exam or Math Analysis Readiness Test. Items that contain diagrams within the item-stem comprise a small to moderate proportion of test questions, ranging from 12% of the Algebra Readiness Test items to 22% of the Second Year Algebra Readiness Test items. There is little variation with respect to formulas, as items requiring a memorized formula comprise 18% of items on three of the four college placement exams. The exception is the Algebra Readiness Test, where such items comprise 8%.

Content discrepancies are apparent among the college placement measures. The Second Year Algebra Readiness Test and CSU Entry-Level Math Placement Exam focus on elementary algebra (60% and 32%, respectively), but the Algebra Readiness Test focuses on prealgebra (42%). The Math Analysis Readiness Test has a broader content sampling than the other college placement measures, emphasizing elementary algebra (31%), intermediate algebra (31%), and planar geometry (29%).

Few differences are observed with respect to cognitive requirements, as problem-solving items and conceptual understanding items are downplayed in favor of procedural knowledge items. Procedural knowledge problems constitute the majority of the college placement measures, ranging from 70% of the CSU Entry-Level Math Placement Exam questions to 84% of the Algebra Readiness Test questions.
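The within-function ranges quoted in this section can be reproduced with a short calculation. The sketch below uses the procedural knowledge (PK) percentages as transcribed here from Table 3.3; the dictionary structure and the helper name `pct_range` are assumptions made for illustration only.

```python
# Procedural knowledge (PK) percentages transcribed from Table 3.3, grouped by test function.
pk_by_function = {
    "state achievement": {
        "GSE Algebra": 71, "GSE Geometry": 38, "GSE HS Math": 23, "Stanford 9": 31,
    },
    "college admissions": {
        "ACT": 53, "SAT I": 53, "SAT II Math Level IC": 58, "SAT II Math Level IIC": 54,
    },
    "college placement": {
        "Algebra Readiness Test": 84,
        "CSU Entry-Level Math Placement Exam": 70,
        "Second Year Algebra Readiness Test": 82,
        "Math Analysis Readiness Test": 82,
    },
}


def pct_range(scores):
    """Return (lowest, highest) percentage across the tests in one function group."""
    return min(scores.values()), max(scores.values())


for function, scores in pk_by_function.items():
    low, high = pct_range(scores)
    print(f"{function}: {low}%-{high}%")
```

The narrow spread for admissions and placement tests, set against the wide spread for state achievement tests, is the pattern the text describes as similar emphasis within a function.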
Alignment Among Tests with Different Functions

With the exception of the SAT I and the GSEs, none of the math assessments requires students to generate their own answers. Questions framed within a realistic context typically represent a small to moderate proportion of college admissions (12%-25%) and college placement tests (7%-26%), but are more prevalent on state achievement tests, such as the Stanford 9 and GSE HS Math (58% and 33%, respectively). Excluding the Stanford 9, questions that contain graphs within the item-stem are relatively uncommon, and are more likely to be included on college admissions (5%-12%) than on college placement (0%-9%) or state achievement tests (0% if Stanford 9 is excluded). Diagrams are included on every measure that we examined, but typically represent only a small or moderate fraction of a test. Questions that contain diagrams within the item-stem represent 2%-26% of college admissions items and 12%-22% of college placement items. On state achievement measures, the share of questions with diagrams within the item-stem ranges from 10% on the GSE Algebra to 75% on the GSE Geometry. Items calling for a memorized formula are relatively infrequent, and represent no more than 25% of questions on any given test.

With respect to the content category, college admissions exams assess logic (coded as miscellaneous) and knowledge of advanced courses (i.e., intermediate algebra or trigonometry) more frequently than do state achievement or college placement measures. Intermediate algebra items, for example, constitute 2%-22% of items on college admissions tests, 0%-2% on state achievement tests, and 0%-8% on college placement tests, the Math Analysis Readiness Test notwithstanding.
The anomalously high proportion of intermediate algebra found on the Math Analysis Readiness Test (31%) reflects the fact that this exam is intended to place students into courses up to trigonometry, whereas the other college placement measures are either remedial placement measures (CSU Entry-Level Math Placement Exam and the Algebra Readiness Test) or are intended for course placement no higher than intermediate algebra (Second Year Algebra Readiness Test). Excluding the Math Analysis Readiness Test, prealgebra and elementary algebra together comprise between 38% and 62% of items on the three remaining college placement exams. Some state achievement tests, such as the Stanford 9 and GSE HS Math, are broadly distributed across multiple math areas, whereas the GSE Algebra and GSE Geometry are more narrowly focused on a single, relevant subject matter.

In terms of cognitive requirements, state achievement tests, on average, tend to emphasize conceptual understanding (52%-63% if GSE Algebra is excluded), whereas college admissions and college placement tests emphasize procedural knowledge (53%-58% and 70%-84%, respectively). Problem-solving items are most prevalent on college admissions exams (7%-20%) and least prevalent on college placement exams (0%-4%). Problem-solving questions are also likely to be included on the GSEs (5%-15%), which reflects the fact that the GSEs are used to recognize students who wish to earn special honors on their high school diploma.

Discussion

Below, we discuss the implications of the discrepancies among the math assessments. We begin by highlighting instances in which differences are justifiable, then address whether there were any misalignments that may send students confusing signals. We also explore the possibility that state achievement tests can inform postsecondary decisions.

Which Discrepancies Reflect Differences in Test Use?

As noted in Chapter 1, content discrepancies may reflect differences in intended test use.
To illustrate, consider the SAT II Math Level IIC, Stanford 9, and GSE Geometry. Although both the SAT II Math Level IIC and Stanford 9 include topics from a wide variety of courses, the SAT II Math Level IIC includes many trigonometry and problem-solving items (18% and 20%, respectively), whereas the Stanford 9 rarely includes such material (4% and 6%, respectively). The broad content sampling found on both of these assessments can be further contrasted with the content of the GSE Geometry, which reflects the curriculum of a specific course.

In this particular case, the SAT II Math Level IIC, Stanford 9, and GSE Geometry have disparate functions, and content differences reflect variations in purpose. Because the SAT II Math Level IIC is used to select among higher-achieving students for entrance into universities and colleges, it includes many problem-solving and trigonometry items in order to distinguish among higher-proficiency examinees. The Stanford 9, on the other hand, is used to monitor student achievement statewide, and therefore requires items of more moderate difficulty that can be attempted by students with a wider range of proficiency levels. The GSE Geometry is not a measure of math knowledge developed over several math courses (as is the case for the SAT II Math Level IIC and Stanford 9), but is instead a measure of proficiency in one specific course. Consequently, it is warranted that the GSE Geometry limits its content to a narrow area of math.

The above example represents justifiable discrepancies across tests with different purposes. However, there are also instances in which discrepancies among tests with similar purposes are warranted. Although diagrams appear in 10% of the GSE Algebra items but 75% of the GSE Geometry items, this discrepancy is not a misalignment. Instead, it is indicative of the latter exam’s content focus, which emphasizes figural relations more than any other math content area (Fischbein, 1993).
Likewise, the fact that the SAT I places greater emphasis on problem-solving and non-routine logic problems, whereas the ACT places greater emphasis on procedural knowledge and textbook-like items, is warranted given that the SAT I is intended to be a reasoning measure and the ACT is intended to assess content knowledge found in high-school math courses.

Is There Evidence of Misalignment?

In our analysis of the math tests, we could not find any examples of misalignments, as discrepancies among college admissions, college placement, and state achievement measures are either small or moderate, and could generally be predicted a priori. To illustrate, consider that open-ended items are included on the GSEs but are absent from college admissions measures. As noted in Chapter 1, the inclusion of open-ended items on state achievement exams is indicative of attempts to use these tests as levers of instructional reform. College admissions exams, on the other hand, exclude open-ended items because such items can potentially undermine the public's perception of these tests as "objective" measures with which to make fair comparisons of student proficiency.3 Format differences, in this case, are not misalignments.

3 Open-ended items are also excluded because they are more costly than multiple-choice items.

Can State Achievement Tests Inform Postsecondary Admissions and Course Placement Decisions?

Although there are many discrepancies among exams with different functions, it may still be possible for a test to serve multiple purposes satisfactorily. Currently, some measures are used for more than one purpose. Many postsecondary institutions, for example, allow students to submit scores from college admissions exams such as the SAT I or ACT as a means of exemption from a remedial college placement test. Potentially, state achievement tests, such as the GSEs, can be used for similar purposes, but no postsecondary institution to date has made use of GSE scores for placement decisions.
However, many policymakers have called for expanding the use of GSE scores to purposes beyond monitoring student achievement (Kirst, 2000), because such a policy change would not only reduce the testing burden but also motivate students to focus on state standards rather than on external tests like the SAT I or ACT (California State Board of Education, 2001a; Healy, 2001; Hoff, 2001; Olson, 2001b; Standards for Success, 2001). The UC Latino Eligibility Task Force and the President of the UC System have advocated that GSE scores be used for guiding admissions decisions (Archibold & Colvin, 1997; Atkinson, 2001). Below, we discuss the potential of the GSE Algebra and GSE HS Math for college placement and admissions decisions, respectively.4

It may be possible to use the GSE Algebra for remedial placement decisions, as its test content is similar to or more rigorous than that of remedial college placement exams. The GSE Algebra covers elementary algebra to a greater extent than the Algebra Readiness Test, and to approximately the same extent as the Second Year Algebra Readiness Test. The GSE Algebra also contains a higher proportion of problem-solving items than either remedial college placement exam. Logically, a sufficiently high score on the GSE Algebra could excuse students from having to take the Algebra Readiness Test or Second Year Algebra Readiness Test. However, it is unlikely that the GSE Algebra can be used for broader course placement decisions. Unlike the Math Analysis Readiness Test, the GSE Algebra does not assess intermediate algebra, which means that it cannot inform decisions regarding whether students are prepared to take more advanced math courses, namely trigonometry.

4 Because the content of the GSE Geometry does not match that of any of the college admissions or college placement tests, it is not a viable alternative for informing admissions or placement decisions, and is excluded from the following discussion.
With respect to guiding admissions decisions, the GSE HS Math is a potential alternative to college admissions measures such as the SAT I. It has roughly the same content coverage as the SAT I, and includes the same proportion of problem-solving items (15%). To determine the feasibility of the GSE HS Math as a measure that informs admissions decisions, more research is needed to explore the relationship between GSE HS Math and SAT I scores, as well as the relationship between GSE HS Math scores and first-year college grade point average. Other factors, such as the potential for adverse impact on different student groups if GSE HS Math scores are used, must be considered as well.

Alignment Among California ELA Assessments

Below we present the ELA results. As with math, we discuss discrepancies both within and across test functions. The results are also organized by skill, namely reading, editing, and writing. In some instances, there are only two tests that assess the same skill and share the same function, so it is important to recognize that patterns or comparisons between these tests may not be indicative of more general trends within this category of tests.

Alignment is characterized by describing differences with respect to technical features, content, and cognitive demands. Specifically, we discuss differences in time limit and format, then document discrepancies with respect to topic, voice, and genre of the reading passages, before concluding with variations in cognitive processes. The alignment results for tests that measure reading skills are presented in Tables 3.4 and 3.5. Tables 3.6 and 3.7 provide the results for exams that assess editing skills, and Tables 3.8 and 3.9 provide the findings for exams that assess writing skills. For each table, the numbers represent the percent of items falling in each category.
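The percentages in these tables are category shares, and the arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes every passage or item receives exactly one category code, the function name category_percentages and the four-passage example are hypothetical, and rounding half up to a whole percent is an assumption about how the figures were rounded, not something the report states.

```python
from collections import Counter


def category_percentages(codes, categories):
    """Return the whole-number percent of coded items in each category.

    codes: one category label per item or passage (assumed single-coded).
    categories: every category to report, including those never used.
    """
    if not codes:
        raise ValueError("at least one coded item is required")
    counts = Counter(codes)
    total = len(codes)
    # Round half up (assumption), so that 12.5 becomes 13 rather than 12.
    return {cat: int(100 * counts.get(cat, 0) / total + 0.5) for cat in categories}


TOPICS = [
    "fiction",
    "humanities",
    "natural science",
    "social science",
    "personal accounts",
]

# Hypothetical topic codes for a test with four reading passages.
passage_topics = [
    "personal accounts",
    "personal accounts",
    "humanities",
    "natural science",
]

topic_shares = category_percentages(passage_topics, TOPICS)
print(topic_shares)
```

Because each share is rounded independently, a row of percentages computed this way need not sum to exactly 100.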
To provide a concrete example of how to interpret the findings, consider the content category results for the AP Language and Composition, presented in Table 3.4. With respect to topic, 50% of the reading passages included on the AP Language and Composition are personal accounts, whereas 25% are about humanities, and the remaining 25% are about natural science. It does not include topics from fiction or social science (0% each). In terms of the author's voice, 75% of the passages are written in a narrative style, whereas the other 25% are written in an informative manner. With respect to genre, only essays (100%) are used; passages on the AP Language and Composition are not presented as letters, poems, or stories (0% each). Results for the other tests are interpreted in a similar manner.

Table 3.4 Alignment Within the Content Category for the Reading Passages

Topic
Test                                      Fiction  Humanities  Natural Science  Social Science  Personal Accounts
State Achievement Tests
  GSE Reading/Literature                  0        0           0                0               100
  Stanford 9                              17       33          33               0               17
College Admissions Tests
  ACT                                     25       25          25               25              0
  AP Language and Composition             0        25          25               0               50
  SAT I                                   20       40          20               20              0
  SAT II Literature                       63       0           0                13              25
College Placement Tests
  College Test for English Placement      0        43          43               14              0
  CSU Entry-Level English Placement Exam  0        100         0                0               0

Voice
Test                                      Narrative  Descriptive  Persuasive  Informative
State Achievement Tests
  GSE Reading/Literature                  100        0            0           0
  Stanford 9                              50         0            17          33
College Admissions Tests
  ACT                                     50         0            0           50
  AP Language and Composition             75         0            0           25
  SAT I                                   40         0            0           60
  SAT II Literature                       100        0            0           0
College Placement Tests
  College Test for English Placement      0          0            0           100
  CSU Entry-Level English Placement Exam  100        0            0           0

Genre
Test                                      Letter  Essay  Poem  Story
State Achievement Tests
  GSE Reading/Literature                  0       100    0     0
  Stanford 9                              17      33     0     50
College Admissions Tests
  ACT                                     0       75     0     25
  AP Language and Composition             0       100    0     0
  SAT I                                   0       80     0     20
  SAT II Literature                       13      25     50    13
College Placement Tests
  College Test for English Placement      0       100    0     0
  CSU Entry-Level English Placement Exam  0       100    0     0

Reading Measures

Alignment Among Tests of the Same Function

State Achievement Tests

There are two state achievement tests that measure reading skills, the GSE Reading/Literature and Stanford 9. The GSE Reading/Literature test consists of two separate 45-minute testing sessions, one of which is devoted to open-ended items.
The Stanford 9, on the other hand, contains only multiple-choice items and is administered in a single 60-minute testing session (see Table 3.2). With respect to reading passage topics, the GSE Reading/Literature includes only personal accounts (100%), whereas the Stanford 9 is focused on humanities and natural science (33% each) (see Table 3.4). Reading passages on both exams are most likely to be written in a narrative style (100% for the GSE Reading/Literature and 50% for the Stanford 9), although the Stanford 9 includes informative (33%) and persuasive (17%) pieces as well. In terms of genre, the GSE Reading/Literature favors essays (100%) but the Stanford 9 favors stories (50%). Both tests focus heavily on straightforward recollection of facts, with recall items constituting 71% of the Stanford 9 items and 86% of the GSE Reading/Literature items (see Table 3.5).

Table 3.5 Alignment Within the Cognitive Demands Category for Tests Measuring Reading Skills

Test                                      Recall  Inference  Evaluate Style
State Achievement Tests
  GSE Reading/Literature                  86      14         0
  Stanford 9                              71      29         0
College Admissions Tests
  ACT                                     58      42         0
  AP Language and Composition             23      77         0
  SAT I                                   18      83         0
  SAT II Literature                       13      80         7
College Placement Tests
  College Test for English Placement      54      46         0
  CSU Entry-Level English Placement Exam  33      66         0

College Admissions Tests

Four college admissions exams assess reading proficiency: the ACT, AP Language and Composition, SAT I, and SAT II Literature. With the exception of the AP Language and Composition, no college admissions test assesses reading skills with open-ended items. Testing time devoted specifically to measuring reading skills is 60 minutes for both the SAT II Literature and AP Language and Composition.
Because the SAT I does not contain separate sections for editing and reading items, we cannot determine the testing time earmarked specifically for assessing reading proficiency, although the testing time devoted to assessing both types of skills is 75 minutes (see Table 3.2). Reading passage topics also vary from one measure to the next (see Table 3.4). The SAT II Literature emphasizes fiction (63%) whereas the AP Language and Composition emphasizes personal accounts (50%). The SAT I favors humanities (40%), but the ACT is evenly distributed among fiction, humanities, natural science, and social science (25% each). Narrative pieces are included on all college admissions measures, and range from 40% of the SAT I passages to 100% of the SAT II Literature passages. Essay is generally the most common genre, accounting for 75% of the ACT, 80% of the SAT I, and 100% of the AP Language and Composition passages. However, the SAT II Literature is more likely to include poems (50%) than essays (25%). With the exception of the ACT, college admissions exams place the greatest emphasis on interpretation and analysis of the reading passages. Inference items range from 42% of the ACT questions to 83% of the SAT I questions (see Table 3.5).

College Placement Tests

Two college placement tests, the CSU Entry-Level English Placement Exam and College Test for English Placement, contain reading items. Both assess reading proficiency with a 30-minute multiple-choice reading section (see Table 3.2). Humanities is the prevalent reading passage topic on the CSU Entry-Level English Placement Exam (100%), but the College Test for English Placement is likely to contain either humanities or natural science topics (43% each) (see Table 3.4). With respect to author's voice, the CSU Entry-Level English Placement Exam favors narrative pieces (100%) whereas the College Test for English Placement favors informative pieces (100%).
There are no differences with respect to genre, as both exams present reading passages only as essays. For the cognitive demands category, the College Test for English Placement is almost evenly split between recall (54%) and inference (46%) items, but the CSU Entry-Level English Placement Exam focuses more on inference questions (66%) (see Table 3.5).

Alignment Among Tests of Different Functions

With the exception of the AP Language and Composition and GSE Reading/Literature, reading skills are assessed only with multiple-choice items. Testing time devoted specifically to assessing reading skills ranges from 30 minutes for the CSU Entry-Level English Placement Exam and College Test for English Placement to 60 minutes for the Stanford 9, SAT II Literature, and AP Language and Composition. All but two assessments contain reading passages on two or more topics. The two exceptions are the GSE Reading/Literature, which includes only personal accounts, and the CSU Entry-Level English Placement Exam, which contains only humanities pieces. Every test contains either a narrative passage or an informative passage, and half of the tests include both. Essay is the most prevalent genre, comprising 33%-100% of passages on state achievement tests, 100% on college placement tests, and 75%-100% on college admissions tests, the SAT II Literature exam notwithstanding. The SAT II Literature favors poems instead. State achievement tests emphasize recall questions (71%-86%), whereas college admissions tests focus on inference questions (42%-83%). Compared to the two other types of exams, college placement measures are more evenly distributed between recall (33%-54%) and inference questions (46%-66%).

Editing Measures

Alignment Among Tests of the Same Function

State Achievement Test

Of the state achievement measures, only the GSE Written Composition assesses editing skills. It contains both multiple-choice and open-ended items, and is administered in two separate 45-minute sessions (see Table 3.2). All of its reading
All of its reading 53 passages are personal accounts written in a narrative style, and presented as essays (see Table 3.6). The GSE Written Composition focuses mainly on recall items (67%), although a moderate proportion of test questions measures students’ ability to evaluate style (33%) as well (see Table 3.7). 54 55 0 0 CSU Entry-Level English Exam 0 0 0 Fiction College Test for English Placement College Placement Tests SAT II Writing SAT I ACT College Admissions Tests GSE Written Composition State Achievement Test Test 0 100 100 60 0 0 0 0 N/A 20 0 100 0 0 0 0 Humanities Natural Social Science Science Topic 0 0 0 20 100 Personal Accounts 55 0 50 50 40 100 0 0 0 0 0 N/A 0 0 0 0 0 Narrative Descriptive Persuasive Voice 100 50 50 60 0 Informative Table 3.6 Alignment Within the Content Category for the Editing Passages 0 0 0 0 0 Letter 100 50 100 100 100 0 0 0 0 0 Poem N/A Essay Genre 0 50 0 0 0 Story Table 3.7 Alignment Within the Cognitive Demands Category for Tests Measuring Editing Skills Test Recall Inference Evaluate Style 67 0 33 ACT 48 4 48 SAT I 0 100 0 SAT II Writing 50 3 47 College Test for English Placement 16 57 27 CSU Entry-Level English Placement Exam 14 21 64 State Achievement Test GSE Written Composition College Admissions Tests College Placement Tests College Admissions Tests Items measuring editing skills are included on three college admissions tests, the ACT, SAT I, and SAT II Writing. The exams are predominantly multiple-choice, with testing time ranging from 40 minutes for the ACT to 45 minutes for the SAT II Writing (see Table 3.2). Again, the SAT I does not specify the specific amount of testing time devoted to measuring editing skills. The SAT I does not include a reading passage, but instead uses a few sentences as prompts. In contrast, the ACT and SAT II Writing include reading passages. These reading passages are typically essays about humanities, and written in either a narrative or informative voice (see Table 3.6). 
The ACT and SAT II Writing items are almost evenly split between recall (48% and 50%, respectively) and evaluate style items (48% and 47%, respectively), but the SAT I assesses only inference skills (100%) (see Table 3.7).

College Placement Tests

Two college placement tests, the CSU Entry-Level English Placement Exam and College Test for English Placement, assess editing skills via multiple-choice items. The CSU Entry-Level English Placement Exam allows 30 minutes for students to complete the editing section, whereas the College Test for English Placement allows 35 minutes (see Table 3.2). Reading passages on the CSU Entry-Level English Placement Exam are likely to be informative essays about a topic in social science (see Table 3.6). Passages on the College Test for English Placement, on the other hand, are more variable. They draw from humanities (100%), and are likely to be either essays or stories (50% each), written in a narrative or informative voice (50% each). In terms of cognitive demands, the College Test for English Placement includes many inference items (57%), whereas the CSU Entry-Level English Placement Exam focuses on evaluate style items (64%) (see Table 3.7).

Alignment Among Tests of Different Functions

Editing measures are predominantly multiple-choice, with testing time devoted specifically to editing skills ranging from 30 minutes for the CSU Entry-Level English Placement Exam to 45 minutes for the ACT and GSE Written Composition. All measures except the SAT I, which uses sentences as prompts, include reading passages as a prompt. Excluding the College Test for English Placement, all reading passages are essays in which the author employs a narrative or informative voice. More variation is observed with respect to reading passage topics. The GSE Written Composition favors personal accounts (100%), whereas social science topics are most likely to appear on the CSU Entry-Level English Placement Exam (100%).
In contrast, humanities is the focus of the College Test for English Placement (100%) and of most passages on the college admissions measures (60%-100%). Both the GSE Written Composition and the college admissions exams tend not to assess the full spectrum of the cognitive demands category. The GSE Written Composition, ACT, and SAT II Writing emphasize recall and evaluate style items, but are generally devoid of inference items, whereas the reverse is true for the SAT I. In contrast, college placement measures assess all three areas of the cognitive demands category. Recall items comprise a small proportion of each college placement test (14%-16%), whereas evaluate style (27%-64%) and inference (21%-57%) items constitute a moderate to large proportion, depending upon the measure.

Writing Measures

Alignment Among Tests of the Same Function

State Achievement Tests

Two state achievement tests, the GSE Reading/Literature and GSE Written Composition, require students to produce writing samples (see Table 3.2). Topics vary from humanities and personal accounts to social science (see Table 3.8). With respect to scoring criteria, the GSE Written Composition requires students to demonstrate mechanics, word choice, style, organization, and insight, but the GSE Reading/Literature emphasizes only the two latter elements (see Table 3.9).
Table 3.8 Alignment Among the Writing Prompt Topics

Topic columns: Fiction, Humanities, Natural Science, Social Science, Personal Accounts
State Achievement Tests
  GSE Reading/Literature X
  GSE Written Composition X X X X
College Admissions Tests
  AP Language and Composition X
  SAT II Writing X X
College Placement Tests
  College Test for English Placement X
  CSU Entry-Level English Placement Exam X
  UC Subject A X X X

Table 3.9 Alignment Among the Scoring Criteria for Tests Measuring Writing Skills

Scoring Criteria Elements
Test                                      Mechanics  Word Choice  Organization  Style  Insight
State Achievement Tests
  GSE Reading/Literature                                          X                    X
  GSE Written Composition                 X          X            X             X      X
College Admissions Tests
  AP Language and Composition             X          X            X             X      X
  SAT II Writing                          X          X            X             X
College Placement Tests
  College Test for English Placement      N/A
  CSU Entry-Level English Placement Exam  X          X            X             X      X
  UC Subject A                            X          X            X             X      X

College Admissions Tests

Of the college admissions measures, only the SAT II Writing and AP Language and Composition require a writing sample. The SAT II Writing provides students with a one- or two-sentence writing prompt on a topic (usually humanities), and allows 20 minutes for students to respond (see Tables 3.2 and 3.8). In contrast, prompts on the AP Language and Composition are typically reading passages, and students are required to provide three writing samples in over two hours (see Table 3.2).5 Topics can vary, but are usually about humanities or personal accounts (see Table 3.8). The AP Language and Composition emphasizes all elements of the scoring criteria, but the SAT II Writing downplays the importance of insight (see Table 3.9).

College Placement Tests

Three college placement tests, the College Test for English Placement, CSU Entry-Level English Placement Exam, and UC Subject A, require a written composition.
Testing time varies from 20 minutes for the College Test for English Placement to 2 hours for UC Subject A (see Table 3.2). The College Test for English Placement favors personal accounts, the CSU Entry-Level English Placement Exam favors social science, and UC Subject A includes topics from humanities, natural science, and personal accounts (see Table 3.8). Both the UC Subject A and the CSU Entry-Level English Placement Exam emphasize factors such as mechanics, word choice, style, organization, and insight, but the College Test for English Placement does not provide students with information regarding the scoring criteria (see Table 3.9).

5 The AP Language and Composition requires a total of three writing samples, two of which are produced during the 120-minute writing session, and one during the 60-minute reading session. However, because examinees also respond to a set of multiple-choice items during the reading session, the amount of time students devote specifically to the writing sample is unknown.

Alignment Among Tests of Different Functions

Writing measures can vary from 20 minutes for a single writing sample (SAT II Writing, College Test for English Placement) to over 2 hours for three writing samples (AP Language and Composition). Humanities and personal accounts are the most common topics, as all but one test (the CSU Entry-Level English Placement Exam) include a writing prompt from one, if not both, of these areas. Most measures emphasize mechanics, word choice, organization, style, and insight, but the SAT II Writing omits insight from its scoring criteria, and the GSE Reading/Literature is concerned only with organization and insight.

Discussion

Our discussion of the discrepancies among ELA assessments parallels the math discussion. We first identify examples of discrepancies that are justifiable, then discuss the implications of the misalignments.
We also discuss the feasibility of using state achievement tests to inform admissions decisions.

Which Discrepancies Reflect Differences in Test Use?

Some discrepancies among the ELA assessments reflect differences in purpose. Consider, for instance, discrepancies between the scoring standards of the CSU Entry-Level English Placement Exam and the AP Language and Composition. For the former test, maximum scores can be awarded to writing samples that have minor diction errors, mechanics lapses, and underdeveloped paragraphs. Under the AP Language and Composition guidelines, such compositions might receive adequate scores, but would not be viewed as exemplary papers. Because the AP Language and Composition is used to award academic credit to students who demonstrate college-level proficiency, whereas the CSU Entry-Level English Placement Exam is used to determine whether students need additional remediation, discrepancies between their scoring criteria are warranted.

Even when two measures have similar test functions, discrepancies may still be warranted. For example, although both the GSE Written Composition and GSE Reading/Literature are measures of progress toward state standards, their scoring rubrics differ. The GSE Reading/Literature downplays the importance of mechanics, word choice, and style, but the GSE Written Composition deems these elements to be essential. Given that the GSE Reading/Literature's main purpose is to assess understanding and interpretation of reading passages, elements such as mechanics, word choice, and style are peripheral to demonstrating reading comprehension. However, for the GSE Written Composition, which assesses students' ability to express ideas effectively using standard written English, these elements are essential. Similarly, the large discrepancy in inference items between the SAT I (100%) and both the ACT (4%) and the SAT II Writing (3%) is attributable to subtle differences in purpose.
The SAT I is intended to be a measure of reasoning proficiency, so great emphasis on inference questions is warranted. The ACT and SAT II Writing, on the other hand, are curriculum-based measures, so relatively greater focus on skills learned within English classes (i.e., recall and evaluate style skills) is to be expected.

Is There Evidence of Misalignment?

Although the vast majority of the ELA discrepancies are small or moderate, or stem from variations in test function, one instance of misalignment pertains to the scoring criteria of the SAT II Writing. Insight is included within the scoring criteria of most other writing assessments, but is omitted from the scoring rubrics of the SAT II Writing. Given that insight is also included in the standards of most English courses, it appears that the SAT II Writing standards are incongruent with those that are typically expressed. Potentially, this misalignment can send students mixed messages about the importance of insight with respect to writing skills. If the developers of the SAT II Writing were to add insight to the scoring criteria, or provide a clear rationale for why insight has been omitted from the scoring rubrics, students would receive a more consistent signal about the importance of insight with respect to writing proficiency.

Can State Achievement Tests Inform Postsecondary Admissions Decisions?

As mentioned earlier, policymakers are exploring the possibility that the GSEs can be used for purposes other than monitoring student achievement toward state standards. However, the GSE Reading/Literature appears to have little potential as an alternative measure to college admissions exams. Its lack of emphasis on inference skills (14%) stands in contrast to college admissions measures (42%-83%). That the GSE Reading/Literature contains fewer inference items may mean that it cannot discriminate among higher-achieving students as well as college admissions exams can.
The GSE Written Composition holds more promise as an alternative measure to college admissions tests. It has approximately the same proportion of recall and evaluate style items as do the ACT and SAT II Writing. Furthermore, the GSE Written Composition has the added advantage of requiring multiple writing samples from examinees. The ACT does not require a composition, and the SAT II Writing requires a single writing sample. Arguably, the GSE Written Composition would allow admissions officers to better judge applicants’ writing proficiency than either the ACT or the SAT II Writing. However, any policy changes regarding the use of the GSE Written Composition to inform admissions decisions will require more research that examines the relationship between the GSE Written Composition scores and first-year college grade point average.
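A first step in the research called for above would be to estimate how strongly GSE Written Composition scores track first-year grade point average. The sketch below computes a Pearson correlation in plain Python. It is only an illustration under stated assumptions: the paired scores and GPAs are invented, the function name pearson_r is hypothetical, and an actual validity study would also need to address issues such as range restriction and differences across student groups.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Invented paired records, one per student (illustrative only).
gse_scores = [2, 3, 3, 4, 5, 6]
first_year_gpa = [2.1, 2.4, 2.9, 3.0, 3.4, 3.8]

r = pearson_r(gse_scores, first_year_gpa)
print(f"r = {r:.2f}")
```

A coefficient near zero would suggest that the scores say little about first-year grades, whereas a value comparable to that observed for existing admissions tests would support further study.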