Testing Writing
What to test
Accuracy
Content/Substance
Vocabulary/word choice
Organization
Fulfillment of purpose
How to test
Essay
Paragraph
Letter
Other formats
Scoring
Weighted Factors (Example)
Content
Grammar
Organization
Vocabulary
Fulfillment of Task
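As a rough illustration of weighted-factor scoring, the sketch below combines a rating on each of the five factors listed above into one score. The weights, the 1-6 rating scale, and the function name are assumptions made for the example; the outline names the factors but does not say how much each one counts.

```python
# Hypothetical percentage weights, for illustration only: the outline lists
# the factors but gives no weights.
WEIGHTS = {
    "content": 30,
    "grammar": 20,
    "organization": 20,
    "vocabulary": 15,
    "fulfillment_of_task": 15,
}


def weighted_score(ratings, weights=WEIGHTS):
    """Return the weighted average of per-factor ratings."""
    if set(ratings) != set(weights):
        raise ValueError("ratings must cover exactly the weighted factors")
    total_weight = sum(weights.values())
    return sum(ratings[name] * weights[name] for name in weights) / total_weight


# Made-up ratings for one paper on an assumed 1-6 scale.
ratings = {
    "content": 5,
    "grammar": 4,
    "organization": 4,
    "vocabulary": 3,
    "fulfillment_of_task": 5,
}
print(weighted_score(ratings))
```

Because the function divides by the sum of the weights, the weights do not have to add up to 100; only their relative sizes matter.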
Holistic Scoring Method
Definition of Holistic Scoring
Holistic scoring is a method by which trained readers evaluate a piece of
writing for its overall quality. The holistic scoring used in Florida requires
readers to evaluate the work as a whole, while considering four elements:
focus, organization, support, and conventions. This method is sometimes
called focused holistic scoring. In this type of scoring, readers are trained not
to become overly concerned with any one aspect of writing but to look at a
response as a whole.
Focus
Focus refers to how clearly the paper presents and maintains a main idea,
theme, or unifying point. Papers representing the higher end of the point scale
demonstrate a consistent awareness of the topic and do not contain extraneous
information.
Organization
Organization refers to the structure or plan of development (beginning,
middle, and end) and whether the points logically relate to one another.
It also refers to (1) the use of transitional devices to signal the
relationship of the supporting ideas to the main idea, theme, or unifying point
and (2) the evidence of a connection between sentences. Papers representing
the higher end of the point scale use transitions to signal the plan or text
structure and end with summary or concluding statements.
Support
Support refers to the quality of the details used to explain, clarify, or define.
The quality of support depends on word choice, specificity, depth, credibility,
and thoroughness. Papers representing the higher end of the point scale
provide fully developed examples and illustrations in which the relationship
between the supporting ideas and the topic is clear.
Conventions
Conventions refer to punctuation, capitalization, spelling, and variation in
sentence structure used in the paper. These conventions are basic writing skills included
in Florida's Minimum Student Performance Standards and the Uniform
Student Performance Standards for Language Arts. Papers representing the
higher end of the scale follow, with few exceptions, the conventions of
punctuation, capitalization, and spelling and use a variety of sentence
structures to present ideas.
Holistic Scoring in More Detail
As previously noted, holistic scoring gives students a single, overall
assessment score for the paper as a whole. Although the scoring rubric for
holistic scoring will lay out specific criteria just as the rubric for analytic
scoring does, readers do not assign a score for each criterion in holistic
scoring. Rather, as they read, they balance strengths and weaknesses
among the various criteria to arrive at an overall assessment of success or
effectiveness of a paper. The CSU composition placement exam
(administered from 1977 to 2004 and then replaced by the Composition
Challenge Exam for a smaller number of students) relied for many years
on a 9-point scale for overall assessment. Although the composition
program now uses a 6-point scale, the rubric functions in much the same
way. Notice that the four key criteria are defined most concretely for
"upper-range" papers. Deficits from the most effective demonstration of
the criteria characterize the "mid-range" and "lower-range" papers.
A reader writes nothing on the paper itself and assigns the holistic score
after reading the paper carefully and completely. A second reader, who
does not see the first score, independently reads and assigns a second
holistic score. If the two scores differ by more than 2 points, then a third
reader scores the paper as well. Inter-rater reliability (the proportion of
papers given the same score or scores differing by one point) should fall
between .85 and .90 for sound holistic scoring. Readers who read the same kinds
of papers regularly (including students in a large class) can easily be
trained to reach acceptable inter-rater reliability scores.
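The two-reader procedure and the reliability figure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the score pairs are invented for the example, and the thresholds (a third reader when scores differ by more than 2 points, agreement counted as the same score or one point apart) are taken from the paragraph above.

```python
def needs_third_reader(score_a, score_b, max_gap=2):
    """Return True when two independent scores differ by more than max_gap points."""
    return abs(score_a - score_b) > max_gap


def adjacent_agreement(pairs):
    """Proportion of papers whose two scores are identical or one point apart."""
    if not pairs:
        raise ValueError("need at least one scored paper")
    close = sum(1 for a, b in pairs if abs(a - b) <= 1)
    return close / len(pairs)


# Made-up (first reader, second reader) scores on a 6-point scale.
pairs = [(4, 4), (5, 4), (3, 5), (6, 5), (2, 2),
         (1, 4), (4, 3), (5, 5), (3, 3), (4, 6)]

# Papers that would go to a third reader under the rule above.
flagged = [pair for pair in pairs if needs_third_reader(*pair)]
print(flagged)
print(adjacent_agreement(pairs))
```

In this invented sample, seven of the ten pairs are within one point of each other, which is below the .85 to .90 range the text describes as sound.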
AP exams and the SAT II writing test both use holistic scoring to assess
student writing skills.
WHAT IS HOLISTIC SCORING?
The holistic scoring method of assessing writing is based on the theory that a
whole piece of writing is greater than the sum of its parts. In holistic scoring,
the evaluation of a piece of writing, usually an essay, is based on the overall
impression it creates rather than on individual aspects of the content or writing
style or mechanics. Each written work is read by two readers, who provide
separate, independent judgments on the overall quality of the writing based
on a rubric or set of criteria specified for the particular program or test
situation. The criteria typically include the elements of organization,
development of ideas, style, mechanics, diction, and usage, but readers do
not judge each of these elements separately. Holistic scoring is the opposite
of primary trait scoring in this regard.
Most holistic scoring programs, such as the one at the Educational Testing
Service (ETS), have published scoring criteria, which are applied consistently
across scoring sessions. The readers, generally chosen from among professionals
who teach writing, set the standards at the outset of the scoring session
through discussion of a specific set of papers. For the essays to be scored
fairly and consistently, readers must be able and willing to adjust their
personal standards of evaluation to those set for the particular testing
program. This is the reason for the practice session which precedes the main
scoring session; during the practice, readers set and get accustomed to a set
of standards by scoring a range of responses to a particular test. Experienced
holistic scorers are available to provide support and additional training as
needed throughout the scoring process. (See the ETS Website on holistic
scoring at <http://www.ets.org/holistic.html>.)
CSU-EPT SCORING GUIDE
(areas related to development and specificity in bold)
This scoring guide is used by the California State University (CSU) faculty
who score the student essays written for the English Placement Test (EPT)
using a holistic scoring method. Additionally, it is used for portfolio review and
assessment in the GEW course (equivalent to English 100) at CSU San
Marcos.
Score of 6: Superior. A 6 essay is superior writing, but may have minor
flaws. An essay in this category:
- addresses the topic clearly and responds effectively to all aspects of
  the task
- explores the issues thoughtfully and in depth
- is coherently organized, with ideas supported by apt reasons and
  well-chosen examples
- has an effective, fluent style marked by syntactic variety and a clear
  command of language
- is generally free from errors in mechanics, usage, and sentence
  structure
Score of 5: Strong. A 5 essay demonstrates clear competence in writing. It
may have some errors, but they are not serious enough to distract or confuse
the reader. An essay in this category:
- clearly addresses the topic, but may respond to some aspects of the
  task more effectively than others
- shows some depth and complexity of thought
- is well organized and developed with appropriate reasons and
  examples
- displays some syntactic variety and facility in the use of language
- may have a few errors in mechanics, usage, and sentence structure
Score of 4: Adequate. A 4 essay demonstrates adequate writing. It may
have some errors that distract the reader, but they do not significantly obscure
meaning. An essay in this category:
- addresses the topic, but may slight some aspects of the task
- may treat the topic simplistically or repetitively
- is adequately organized and developed, generally supporting ideas
  with reasons and examples
- demonstrates adequate facility with syntax and language
- may have some errors, but generally demonstrates control of
  mechanics, usage, and sentence structure
Score of 3: Marginal. A 3 essay demonstrates developing competence, but
is flawed in some significant way(s). An essay in this category reveals one or
more of the following weaknesses:
- distorts or neglects aspects of the task
- lacks focus, or demonstrates confused or simplistic thinking
- is poorly organized or developed
- does not provide adequate or appropriate details to support
  generalizations, or provides details without generalizations
- has problems with or avoids syntactic variety
- has an accumulation of errors in mechanics, usage, and sentence
  structure
Score of 2: Very Weak. A 2 essay is seriously flawed. An essay in this
category reveals one or more of the following weaknesses:
- indicates confusion about the topic or neglects important aspects of the
  task
- lacks focus and coherence, or often fails to communicate its ideas
- has very weak organization, or little development
- provides simplistic generalizations without support
- has inadequate sentence control and a limited vocabulary
- is marred by numerous errors in mechanics, usage, and sentence
  structure
Score of 1: Incompetent. A 1 essay demonstrates fundamental deficiencies
in writing skills. An essay in this category reveals one or more of the following
weaknesses:
- suggests an inability to comprehend the question or to respond
  meaningfully to the topic
- is unfocused, illogical, incoherent, or disorganized
- is undeveloped
- provides little or no relevant support
- has serious and persistent errors in word choice, mechanics, usage,
  and sentence structure
Non-response essays, those that reject the assignment or fail to address the
question, should be given to the Table Leader. Readers should not penalize
ESL writers excessively for slight shifts in idiom, problems with articles and
confusion over prepositions, and occasional misuse of verb tense and verb
forms, so long as such features do not obscure meaning.
EPT. What is the English Placement Test? The English Placement Test,
developed cooperatively by the CSU faculty and Educational Testing Service
(ETS), is designed to assess the level of analytical reading and writing skills of
students entering the California State University. The EPT consists of a 45-minute
timed essay and two 30-minute timed multiple-choice sections designed to
assess the level of reading and writing skills of entering lower-division
students for the purpose of placing them in appropriate courses.
The test is offered only to admitted students and has no effect on admissions
decisions. While the test is not a condition for admission to the California
State University campuses, students must take it--or be exempted by
alternative measures (ACT, SAT, Advanced Placement Test, transferable
English composition course). Normally the EPT is taken once; if a student
passes the test, s/he clears the California State University English Placement
Test requirement.
Since its beginning in 1977, the EPT has been administered to more than
430,000 students. From 22,000 to 26,000 regularly admitted, first-time
freshman students are tested each year. Of those students taking the EPT,
slightly more than 50 percent demonstrate the need for remediation or for
special assistance with writing skills in order to succeed in college-level work.
The essay portion of the test requires students to read a brief prompt about a
general topic or issue; they must then take and explain a position, drawing
upon personal experience, observation, or reading. (Information from
<http://rhet.csustan.edu/EPT/nature.htm>.)
GRE. The writing task on the GRE (Graduate Record Examination) of the ETS, an
analysis of a complex issue, also uses a holistic scoring rubric, quite similar to
the one used for the EPT. It is available online at <http://www.gre.org/essscore.html>.
Examples of essays scoring 1-6 are also available at this site.
Holistic Scoring
Karen Schwalm, GCC
The word "holistic" means looking at the whole rather than at parts; holistic scoring is
a procedure for evaluating essays as complete units rather than as a collection of
constituent elements. Developed in the 1970s by the National Assessment of
Educational Progress (NAEP) and Educational Testing Service (ETS), holistic scoring
of student writing has three aims: to make the evaluation of student essays valid,
quick, and reliable.
The movement for holistic scoring was developed to counter then-current practices of
evaluating student writing by employing multiple-choice tests. ETS argued that the
valid measurement of writing ability should include requiring students to write. In
addition, believing that testing forms curricular practices, many writing teachers
argued that multiple-choice tests of writing sent students (and faculty) the wrong
message about writing instruction.
However, direct assessment of student writing appeared to be much more expensive
than using multiple-choice tests if all that was considered was scoring. Edward White,
in Teaching and Assessing Writing, points out that when the costs of test creation are
included, there is no cost saving in using multiple-choice tests. In fact, if readers
could be trained to read quickly, looking at the piece of writing as a whole unit, holistic
writing assessment could compete quite favorably with multiple-choice tests. Some
people argued that when the benefits to instructors were included, holistic assessment
might offer even greater benefits. The only difficulty was demonstrating that
scoring was reliable.
Until the development of holistic scoring, no reliable way existed to evaluate the
essays that students wrote. Individual readers could read and evaluate essays quickly,
but no individuals volunteered to read thousands of papers! And when Paul Diederich,
in a landmark study, Measuring Growth in English, gave a group of readers papers to
evaluate without including any criteria, he discovered that every paper received every
possible score from the group of readers. The problem, White explains, "was to
develop a method of scoring papers that retained the economy of a single, general
impression score, with its underlying view that writing could be evaluated as a whole,
and added to it substantial reliability of scoring."
Several key elements have been developed to ensure reliability in holistic scoring: a
clearly defined writing assignment or prompt, highly structured reading sessions,
written scoring "rubrics" or guides, sample papers (or "anchors") matched with
descriptions on the scoring guides, close monitoring of the scoring while it is going
on, multiple readers (and thus scores), and careful record keeping to ensure the
continuing reliability of participating readers. Good holistic writing assessment
programs -- ones that are cost-effective and produce valid, reliable results -- should be
designed with careful attention to these elements.