Testing Writing

What to test
- Accuracy
- Content/Substance
- Vocabulary/word choice
- Organization
- Fulfillment of purpose

How to test
- Essay
- Paragraph
- Letter
- Other formats

Scoring
Weighted Factors (Example)
- Content
- Grammar
- Organization
- Vocabulary
- Fulfillment of Task
Holistic Scoring Method

Definition of Holistic Scoring
Holistic scoring is a method by which trained readers evaluate a piece of writing for its overall quality. The holistic scoring used in Florida requires readers to evaluate the work as a whole, while considering four elements: focus, organization, support, and conventions. This method is sometimes called focused holistic scoring. In this type of scoring, readers are trained not to become overly concerned with any one aspect of writing but to look at a response as a whole.

Focus
Focus refers to how clearly the paper presents and maintains a main idea, theme, or unifying point. Papers representing the higher end of the point scale demonstrate a consistent awareness of the topic and do not contain extraneous information.

Organization
Organization refers to the structure or plan of development (beginning, middle, and end) and whether the points logically relate to one another. Organization refers to (1) the use of transitional devices to signal the relationship of the supporting ideas to the main idea, theme, or unifying point and (2) the evidence of a connection between sentences. Papers representing the higher end of the point scale use transitions to signal the plan or text structure and end with summary or concluding statements.

Support
Support refers to the quality of the details used to explain, clarify, or define. The quality of support depends on word choice, specificity, depth, credibility, and thoroughness. Papers representing the higher end of the point scale provide fully developed examples and illustrations in which the relationship between the supporting ideas and the topic is clear.
Conventions
Conventions refer to punctuation, capitalization, spelling, and variation in sentence structure used in the paper. These conventions are basic writing skills included in Florida's Minimum Student Performance Standards and the Uniform Student Performance Standards for Language Arts. Papers representing the higher end of the scale follow, with few exceptions, the conventions of punctuation, capitalization, and spelling and use a variety of sentence structures to present ideas.

Holistic Scoring in More Detail
As previously noted, holistic scoring gives students a single, overall assessment score for the paper as a whole. Although the scoring rubric for holistic scoring lays out specific criteria just as the rubric for analytic scoring does, readers do not assign a score for each criterion in holistic scoring. Rather, as they read, they balance strengths and weaknesses among the various criteria to arrive at an overall assessment of the success or effectiveness of a paper.

The CSU composition placement exam (administered from 1977-2004 and then replaced by the Composition Challenge Exam for a smaller number of students) relied for many years on a 9-point scale for overall assessment. Although the composition program now uses a 6-point scale, the rubric functions in much the same way. Notice that the four key criteria are defined most concretely for "upper-range" papers. Deficits from the most effective demonstration of the criteria characterize the "mid-range" and "lower-range" papers.

A reader writes nothing on the paper itself and assigns the holistic score after reading the paper carefully and completely. A second reader, who does not see the first score, independently reads the paper and assigns a second holistic score. If the two scores differ by more than 2 points, a third reader scores the paper as well. Inter-rater reliability (the proportion of papers given the same score or scores differing by one point) should fall between .85 and .90 for sound holistic scoring.
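
The two-reader procedure and the reliability check described above can be illustrated with a short sketch. This is only a minimal illustration under stated assumptions, not the procedure used by any particular testing program: the function names, the sample scores, and the choice of Python are all hypothetical, and the sketch deliberately does not say how a final score is derived, since the text does not specify that.

```python
from typing import List, Tuple


def needs_third_reader(first: int, second: int, max_gap: int = 2) -> bool:
    """Return True when two independent scores differ by more than max_gap points."""
    return abs(first - second) > max_gap


def adjacent_agreement(pairs: List[Tuple[int, int]]) -> float:
    """Proportion of papers whose two scores are identical or differ by one point."""
    if not pairs:
        raise ValueError("no scored papers")
    close = sum(1 for a, b in pairs if abs(a - b) <= 1)
    return close / len(pairs)


# Hypothetical (first reader, second reader) scores for five papers.
scores = [(6, 5), (4, 4), (3, 6), (2, 3), (5, 3)]

# Papers that would go to a third reader under the "more than 2 points" rule.
flagged = [pair for pair in scores if needs_third_reader(*pair)]

# Agreement rate to compare against the .85 to .90 range mentioned in the text.
rate = adjacent_agreement(scores)
```

With these made-up scores, only the paper scored 3 and 6 would be sent to a third reader, and the agreement rate would be well below the .85 to .90 range, which in practice would suggest that the readers need further training on the rubric.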
Readers who read the same kinds of papers regularly (including students in a large class) can easily be trained to reach acceptable inter-rater reliability scores. AP exams and the SAT II writing test both use holistic scoring to assess student writing skills.

WHAT IS HOLISTIC SCORING?
The holistic scoring method of assessing writing is based on the theory that a whole piece of writing is greater than the sum of its parts. In holistic scoring, the evaluation of a piece of writing, usually an essay, is based on the overall impression it creates rather than on individual aspects of the content, writing style, or mechanics. Each written work is read by two readers, who provide separate, independent judgments on the overall quality of the writing based on a rubric or set of criteria specified for the particular program or test situation. The criteria typically include the elements of organization, development of ideas, style, mechanics, diction, and usage, but readers do not judge each of these elements separately. Holistic scoring is the opposite of primary trait scoring in this regard.

Most holistic scoring programs, such as the one at the Educational Testing Service (ETS), have published scoring criteria, which are applied consistently across scoring sessions. The readers, generally chosen from among professionals who teach writing, set the standards at the outset of the scoring session through discussion of a specific set of papers. For the essays to be scored fairly and consistently, readers must be able and willing to adjust their personal standards of evaluation to those set for the particular testing program. This is the reason for the practice session that precedes the main scoring session; during the practice, readers set and get accustomed to a set of standards by scoring a range of responses to a particular test. Experienced holistic scorers are available to provide support and additional training as needed throughout the scoring process.
(See the ETS Website on holistic scoring at <http://www.ets.org/holistic.html>.)

CSU-EPT SCORING GUIDE (areas related to development and specificity in bold)
This scoring guide is used by the California State University (CSU) faculty who score the student essays written for the English Placement Test (EPT) using a holistic scoring method. Additionally, it is used for portfolio review and assessment in the GEW course (equivalent to English 100) at CSU San Marcos.

Score of 6: Superior. A 6 essay is superior writing, but may have minor flaws.
An essay in this category:
- addresses the topic clearly and responds effectively to all aspects of the task
- explores the issues thoughtfully and in depth
- is coherently organized, with ideas supported by apt reasons and well-chosen examples
- has an effective, fluent style marked by syntactic variety and a clear command of language
- is generally free from errors in mechanics, usage, and sentence structure

Score of 5: Strong. A 5 essay demonstrates clear competence in writing. It may have some errors, but they are not serious enough to distract or confuse the reader.
An essay in this category:
- clearly addresses the topic, but may respond to some aspects of the task more effectively than others
- shows some depth and complexity of thought
- is well organized and developed with appropriate reasons and examples
- displays some syntactic variety and facility in the use of language
- may have a few errors in mechanics, usage, and sentence structure

Score of 4: Adequate. A 4 essay demonstrates adequate writing. It may have some errors that distract the reader, but they do not significantly obscure meaning.
An essay in this category:
- addresses the topic, but may slight some aspects of the task
- may treat the topic simplistically or repetitively
- is adequately organized and developed, generally supporting ideas with reasons and examples
- demonstrates adequate facility with syntax and language
- may have some errors, but generally demonstrates control of mechanics, usage, and sentence structure

Score of 3: Marginal. A 3 essay demonstrates developing competence, but is flawed in some significant way(s).
An essay in this category reveals one or more of the following weaknesses:
- distorts or neglects aspects of the task
- lacks focus, or demonstrates confused or simplistic thinking
- is poorly organized or developed
- does not provide adequate or appropriate details to support generalizations, or provides details without generalizations
- has problems with or avoids syntactic variety
- has an accumulation of errors in mechanics, usage, and sentence structure

Score of 2: Very Weak. A 2 essay is seriously flawed.
An essay in this category reveals one or more of the following weaknesses:
- indicates confusion about the topic or neglects important aspects of the task
- lacks focus and coherence, or often fails to communicate its ideas
- has very weak organization, or little development
- provides simplistic generalizations without support
- has inadequate sentence control and a limited vocabulary
- is marred by numerous errors in mechanics, usage, and sentence structure

Score of 1: Incompetent. A 1 essay demonstrates fundamental deficiencies in writing skills.
An essay in this category reveals one or more of the following weaknesses:
- suggests an inability to comprehend the question or to respond meaningfully to the topic
- is unfocused, illogical, incoherent, or disorganized
- is undeveloped
- provides little or no relevant support
- has serious and persistent errors in word choice, mechanics, usage, and sentence structure

Non-response essays, those that reject the assignment or fail to address the question, should be given to the Table Leader.

Readers should not penalize ESL writers excessively for slight shifts in idiom, problems with articles, confusion over prepositions, or occasional misuse of verb tense and verb forms, so long as such features do not obscure meaning.

EPT. What is the English Placement Test?
The English Placement Test, developed cooperatively by the CSU faculty and Educational Testing Service (ETS), is designed to assess the level of analytical reading and writing skills of students entering the California State University. The EPT consists of a 45-minute timed essay and two 30-minute timed multiple-choice sections designed to assess the level of reading and writing skills of entering lower-division students for the purpose of placing them in appropriate courses. The test is offered only to admitted students and has no effect on admissions decisions. While the test is not a condition for admission to the California State University campuses, students must take it or be exempted by alternative measures (ACT, SAT, Advanced Placement Test, transferable English composition course). Normally the EPT is taken once; if a student passes the test, s/he clears the California State University English Placement Test requirement.

Since its beginning in 1977, the EPT has been administered to more than 430,000 students. From 22,000 to 26,000 regularly admitted, first-time freshman students are tested each year.
Of those students taking the EPT, slightly more than 50 percent demonstrate the need for remediation or for special assistance with writing skills in order to succeed in college-level work. The essay portion of the test requires students to read a brief prompt about a general topic or issue; they must then take and explain a position, drawing upon personal experience, observation, or reading. (Information from <http://rhet.csustan.edu/EPT/nature.htm>.)

GRE. The GRE (Graduate Record Examination) essay from ETS, an analysis of a complex issue, also uses a holistic scoring rubric quite similar to the one used for the EPT. It is available online at <http://www.gre.org/essscore.html>. Examples of essays scoring 1-6 are also available at this site.

Holistic Scoring
Karen Schwalm, GCC
The word "holistic" means looking at the whole rather than at parts; holistic scoring is a procedure for evaluating essays as complete units rather than as a collection of constituent elements. Developed in the 1970s by the National Assessment of Educational Progress (NAEP) and Educational Testing Service (ETS), holistic scoring of student writing has three purposes: to enable valid, quick, and reliable evaluation of student essays.

The movement for holistic scoring was developed to counter the then-current practice of evaluating student writing with multiple-choice tests. ETS argued that the valid measurement of writing ability should include requiring students to write. In addition, believing that testing shapes curricular practices, many writing teachers argued that multiple-choice tests of writing sent students (and faculty) the wrong message about writing instruction.

However, direct assessment of student writing appeared to be much more expensive than using multiple-choice tests if all that was considered was scoring. Edward White, in Teaching and Assessing Writing, points out that when the costs of test creation are included, there is no cost saving in using multiple-choice tests.
In fact, if readers could be trained to read quickly, looking at the piece of writing as a whole unit, holistic writing assessment could compete quite favorably with multiple-choice tests. Some people argued that when the benefits to instructors were included, holistic assessment might offer even greater advantages.

The only drawback was demonstrating that scoring was reliable. Until the development of holistic scoring, no reliable way existed to evaluate the essays that students wrote. Individual readers could read and evaluate essays quickly, but no individuals volunteered to read thousands of papers! And when Paul Diederich, in a landmark study, Measuring Growth in English, gave a group of readers papers to evaluate without including any criteria, he discovered that every paper received every possible score from the group of readers. The problem, White explains, "was to develop a method of scoring papers that retained the economy of a single, general impression score, with its underlying view that writing could be evaluated as a whole, and added to it substantial reliability of scoring."

Several key elements have been developed to ensure reliability in holistic scoring:
- a clearly defined writing assignment or prompt
- highly structured reading sessions
- written scoring "rubrics" or guides
- sample papers (or "anchors") matched with descriptions on the scoring guides
- close monitoring of the scoring while it is going on
- multiple readers (and thus scores)
- careful record keeping to ensure the continuing reliability of participating readers

Good holistic writing assessment programs (ones that are cost-effective and produce valid, reliable results) should be designed with careful attention to these elements.