H. Douglas Brown-LANGUAGE ASSESMENT all chapters.summary

LANGUAGE ASSESSMENT
SUMMARY OF PRINCIPLES
AND LANGUAGE PRACTICE
H. Douglas Brown
ARRANGED BY: ELINA WONDA, 2012150001
2022
CHAPTER 1
TESTING, ASSESSING, TEACHING
In a classroom setting, "test" is a word that often makes students feel afraid or anxious
whenever they hear it. Courses of study in every discipline are marked by periodic tests, so
tests are an unavoidable form of assessment in every educational setting. Tests need not be
degrading, artificial, anxiety-provoking experiences; rather, they can be given in a way that
helps students feel confident and responsible in their studies. There are three basic
interrelated concepts in language assessment: testing, assessment, and teaching.
A. Testing
Testing is a method of measuring a student's ability, knowledge, or performance, typically
after a period of teaching. In other words, a test is an instrument: a set of techniques and
procedures that requires performance on the part of the student. The method must be explicit
and structured, for example multiple-choice questions with prescribed correct answers, a
writing prompt with a scoring rubric, or an oral interview based on a question script and a
checklist of expected responses to be filled in by the administrator. A test must be designed
to measure the general ability, specific competencies, or objectives of the students.
The results reported to students vary with the type of test. Some tests yield a letter grade
with comments from the teacher; others yield a numerical score or a percentile rank. A test
may also measure a student's ability to define vocabulary items, to use grammar, or to
identify rhetorical features of the language being learned. A well-constructed test is an
instrument that provides an accurate measure of the student's ability within a particular
domain.
B. Teaching and Assessing
Assessing is an ongoing process that takes place throughout the period of study. Whenever
students respond to a question, offer a comment, or try out a new word, the teacher assesses
them on the basis of that performance. Students' work may be assessed by the students
themselves, by classmates, or by the teacher. A test is one form of assessment, but it takes
place only at set times chosen by the teacher.
To make the classroom an engaging place for learning, students must feel free to express
themselves in the language they are learning without being judged by the teacher or by other
students. Teaching sets up the practice games of language learning: it gives learners
opportunities to listen, think, take risks, set goals, and process feedback from the teacher,
and then to recycle through the skills they are trying to master.
Assessment is carried out in two ways, depending on the techniques used.
I. Informal Assessment
Informal assessment can take the form of a teacher's comment or feedback, such as saying
‘very good’ after a student's presentation. A good deal of a teacher's informal assessment is
embedded in classroom tasks designed to elicit performance without recording results or
making fixed judgments about a student's competence.
II. Formal Assessment
Formal assessments, on the other hand, are activities or procedures designed to measure
specific skills and knowledge. They are systematic, planned techniques constructed to give
students and teachers an appraisal of student achievement. A test is an example of a formal
assessment.
a. Formative and Summative Assessment
Assessment has two functions: formative and summative. Most classroom assessment is
formative: it evaluates students' performance in the process of helping them to build their
skills and competencies. Summative assessment, in contrast, aims to measure what a student
has learned so far, typically through an exam in the middle or at the end of a course.
b. Norm-Referenced and Criterion-Referenced Tests
In a norm-referenced test, each student's score is interpreted relative to the scores of the
other test-takers, placing students in rank order from above average to below average.
Results are usually reported as a numerical score (150 out of 200) or a percentile rank (75%).
The TOEFL is one example of a norm-referenced test.
A criterion-referenced test, by contrast, is designed to give students feedback, usually in
the form of grades, on their performance against the objectives of a particular course
curriculum. Such a test is typically taken by a whole class, and the teacher is the
administrator who assesses the performance.
The historical perspective underscores two major approaches to language testing that were
debated in the 1970s and early 1980s.
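The contrast between the two kinds of test can be sketched in a few lines of Python. This is only an illustration: the function names, the sample cohort, and the 70% mastery cutoff are assumptions made for the example and do not come from Brown's text.

```python
from bisect import bisect_left


def percentile_rank(score, cohort):
    """Norm-referenced view: percentage of cohort scores strictly below `score`."""
    ordered = sorted(cohort)
    below = bisect_left(ordered, score)  # number of scores lower than `score`
    return 100.0 * below / len(ordered)


def meets_criterion(score, max_score, cutoff=0.7):
    """Criterion-referenced view: did the student reach the (assumed) cutoff?"""
    return score / max_score >= cutoff


# Hypothetical scores out of 200 for a group of eight test-takers.
cohort = [120, 135, 150, 150, 165, 180, 190, 110]
print(percentile_rank(150, cohort))
print(meets_criterion(150, 200))
```

The first function only has meaning relative to the other test-takers, whereas the second depends solely on the student's own score and the objective set for the course.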
Discrete-Point and Integrative Testing
1. Discrete-point test
This type of test rests on the assumption that language can be broken down into its component
parts and that those parts can be tested successfully. The components are the skills of
listening, speaking, reading, and writing, together with various units of language such as
lexicon and grammar.
2. Integrative test
The second approach, on the other hand, focuses mainly on communication. Two types of test
are typical of this approach: the cloze test, a reading passage in which students fill in the
blanks with the missing words, and dictation, in which the teacher reads a passage aloud and
students write down every word that is dictated.
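As a rough illustration of how a cloze passage could be prepared, the sketch below blanks out every nth word. The fixed-ratio deletion rule, the function name, and the default of every fifth word are assumptions made for this example; the summary itself only says that students fill in blanks.

```python
def make_cloze(passage, n=5):
    """Replace every nth word with a blank and return the text and answer key.

    Words are split on whitespace, so punctuation attached to a deleted word
    is removed along with it (a simplification for this sketch).
    """
    words = passage.split()
    answers = []
    for i in range(n - 1, len(words), n):
        answers.append(words[i])
        words[i] = "_____"
    return " ".join(words), answers


text, key = make_cloze("one two three four five six seven eight nine ten")
print(text)
print(key)
```

The answer key is returned alongside the passage so that exact-word scoring can be done by comparing each student response with the deleted word.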
Communicative Language Testing
Bachman and Palmer (1996, p. 9) include among the fundamental principles of language testing
the need for a correspondence between language test performance and language use: "In order
for a particular language test to be useful for its intended purposes, test performance must
correspond in demonstrable ways to language use in non-test situations."
Bachman (1990) also proposed a model of language competence consisting of organizational
and pragmatic competence, subdivided respectively into grammatical and textual components
and into illocutionary and sociolinguistic components.
Performance-Based Assessment
Performance-based assessment involves oral production, written production, open-ended
responses, integrated performance, group performance, and other interactive tasks. Higher
content validity is achieved because students are measured while performing the targeted
linguistic acts. The defining characteristic of a performance-based test is the presence of
interactive tasks.
ISSUES IN CLASSROOM TESTING
1. New views on intelligence
Intelligence was traditionally viewed as the ability to solve linguistic and mathematical
problems. Because timed IQ tests could not cover all areas of ability, Howard Gardner
(1983, 1999) extended this view by adding five more frames of mind in his theory of
multiple intelligences:
a) spatial intelligence
b) musical intelligence
c) bodily-kinesthetic intelligence
d) interpersonal intelligence
e) intrapersonal intelligence
Robert Sternberg (1988, 1997) also charted new territory in intelligence research and
recognized that creative thinking and manipulative strategies could also be part of
intelligence. More recently, Daniel Goleman's (1995) concept of EQ (emotional quotient)
suggested that those who manage their emotions, especially emotions that can be detrimental,
tend to be more capable of fully intelligent processing.
Though these new conceptualizations of intelligence have not been universally accepted by
the academic community, their intuitive appeal infused the decade of the 1990s with a sense
of both freedom and responsibility in the testing system.
2. Traditional and "Alternative" Assessment
The table below shows the differences between traditional and alternative assessment.
Traditional Assessment            Alternative Assessment
One-shot, standardized exams      Continuous long-term assessment
Timed, multiple-choice format     Untimed, free-response format
Decontextualized test items       Contextualized communicative tasks
Scores suffice for feedback       Individualized feedback and washback
Norm-referenced scores            Criterion-referenced scores
Focus on the ‘right’ answer       Open-ended, creative answers
Summative                         Formative
Oriented to product               Oriented to process
Non-interactive performance       Interactive performance
Fosters extrinsic motivation      Fosters intrinsic motivation
It is difficult, in fact, to draw a clear line of distinction between what Armstrong (1994) and
Bailey (1998) have called traditional and alternative assessment, because many forms of
assessment fall in between the two, and some combine the best of both.
Computer-Based Testing
Some computer-based tests are small-scale "home-grown" tests available on websites. In a
computer-adaptive test (CAT), the computer is programmed to fulfill the test design as it
continuously adjusts to find questions of appropriate difficulty for test-takers at all
performance levels. The test-taker sees only one question at a time, and the computer scores
each question before selecting the next one. As a result, test-takers cannot skip questions,
and once they have entered and confirmed their answers, they cannot return to questions or
to any earlier part of the test.
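The adaptive behaviour described above can be sketched as a simple loop. This is a deliberately simplified illustration, not how any real CAT is implemented: operational adaptive tests generally rely on statistical models to choose items, whereas this sketch just moves one difficulty level up after a correct answer and one level down after an incorrect one. The function name, the item bank layout, and the `is_correct` callback are all hypothetical.

```python
def run_adaptive_test(item_bank, is_correct, num_items, start_level):
    """Administer items one at a time, adjusting difficulty after each answer.

    item_bank maps consecutive integer difficulty levels to lists of unused
    items (the lists are consumed as items are administered).
    is_correct(item) scores the test-taker's answer to a single item.
    Returns a list of (level, item, correct) tuples in order of administration.
    """
    lowest, highest = min(item_bank), max(item_bank)
    level = start_level
    history = []
    for _ in range(num_items):
        if not item_bank[level]:
            break  # no unused items left at this difficulty level
        item = item_bank[level].pop(0)  # the test-taker sees one question at a time
        correct = is_correct(item)      # scored before the next item is selected
        history.append((level, item, correct))
        # Step up after a correct answer, down after an incorrect one.
        if correct:
            level = min(level + 1, highest)
        else:
            level = max(level - 1, lowest)
    return history
```

Because each answer is scored and recorded before the next item is chosen, there is no way to skip an item or to go back and revise an earlier one, which matches the constraint noted in the paragraph above.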
The lists below show the advantages and disadvantages of computer-based testing.
Advantages
• Classroom-based testing
• Self-directed testing on various aspects of a language
• Practice for upcoming high-stakes standardized tests
• Large-scale standardized tests that are easily administered at different stations
Disadvantages
• Lack of security in unsupervised tests
• Quizzes could come from unofficial websites
• Open-ended responses are less likely to appear
• The interactive element is absent
CHAPTER 2
PRINCIPLES OF LANGUAGE
ASSESSMENT
PRACTICALITY
An effective test is practical. This means that it:
• is not excessively expensive
• stays within appropriate time constraints
• is relatively easy to administer
• has a scoring procedure that is specific and time-efficient
A test that is prohibitively expensive is impractical, as is a test that takes more time and
effort to complete than is necessary to accomplish its objective.
RELIABILITY
A reliable test is consistent and dependable. Some factors that may contribute to the
unreliability of a test are described below.
a) Student-Related Reliability
Student-related unreliability is usually caused by temporary illness, fatigue, a bad day,
anxiety, or other physical and psychological factors.
b) Rater Reliability
Rater reliability concerns the human error, subjectivity, and bias that may enter the scoring
process. It can be divided into two kinds. Inter-rater unreliability occurs when two or more
scorers give inconsistent scores to the same test, for instance through lack of attention to
the scoring criteria. Intra-rater unreliability occurs within a single scorer, such as a
teacher marking a set of papers, because of unclear scoring criteria, fatigue, bias toward
particular students, or simple carelessness.
c) Test Administration Reliability
Unreliability may result from the conditions in which the test is administered. Poor lighting,
the condition of the desks, the temperature of the room, and poorly photocopied test papers
may all affect performance, as may surrounding noise.
d) Test Reliability
The nature of the test itself can cause measurement errors. If a test is too long, students
may become fatigued by the time they reach the last part and fail to perform well on it.
Weaker students are affected most in such a situation.
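One way to put a number on the inter-rater consistency described under (b) is to compare the scores two raters gave to the same set of papers. The sketch below computes simple percent agreement and Cohen's kappa, which corrects agreement for chance. The choice of these two statistics, the function names, and the sample scores are assumptions for illustration; the summary does not prescribe a particular method.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of papers on which both raters gave exactly the same score."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for chance; undefined if chance agreement is 1."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category at random.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical rubric scores (1-5) given by two raters to six essays.
scores_a = [4, 3, 5, 2, 4, 3]
scores_b = [4, 3, 4, 2, 4, 2]
print(percent_agreement(scores_a, scores_b))
print(cohens_kappa(scores_a, scores_b))
```

A kappa close to 1 suggests the raters are applying the scoring criteria consistently, while a value near 0 suggests their agreement is no better than chance.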
VALIDITY
According to Gronlund (1998, p. 226), validity is the extent to which inferences made from
assessment results are appropriate, meaningful, and useful in terms of the purpose of the
assessment. For example, a valid test of reading ability actually measures reading ability,
not previous knowledge of a subject.
There are five types of evidence for the validity of a test:
A). Content-Related Evidence
Content-related evidence, often referred to as content validity, exists when a test actually
samples the subject matter about which conclusions are to be drawn and requires students to
perform the behavior that is being measured.
B). Criterion-Related Evidence
Criterion-related evidence, also referred to as criterion-related validity, is the extent to
which the criterion of the test has actually been reached. In teacher-made classroom
assessments, it can be demonstrated by comparing the results of one assessment with the
results of another assessment of the same criterion.
This evidence falls into two categories. Concurrent validity: the test results are supported
by other concurrent performance beyond the assessment itself. Predictive validity: the test
results predict the test-taker's future performance.
C). Construct-Related Evidence
Construct-related evidence is commonly referred to as construct validity. A construct is any
theory, hypothesis, or model that attempts to explain observed phenomena in our universe of
perceptions. Proficiency and communicative competence are examples of linguistic constructs.
D). Consequential Validity
Consequential validity encompasses all the consequences of a test, including its accuracy in
measuring the intended criteria, its impact on the preparation of test-takers, its effect on
students, and the social consequences of the test's interpretation and use.
E). Face Validity
Face validity refers to how the test looks to the students who take it, that is, how they
feel about the test. Face validity will be high if learners encounter:
• a well-constructed test with familiar tasks
• a test that can be completed within the time allowed
• items that are clearly presented
• instructions that are crystal clear
• a level of difficulty that they can handle
AUTHENTICITY
Authenticity is defined as the degree of correspondence of the characteristics of a given
language test task to the features of a target language task. Authenticity may be present in
a test in the following ways:
• The language in the test is as natural as possible
• Items are contextualized
• Topics are meaningful for learners
• Thematic organization of items is provided
• Tasks represent real-world tasks
WASHBACK
Washback refers to the effects that tests have on instruction, in terms of how students
prepare for the test. Examples are cram courses and teaching to the test. Informal assessment
is more likely to have built-in washback effects because the teacher is usually providing
interactive feedback. A formal test that returns only a score or a letter grade provides
little washback to students.
APPLYING PRINCIPLES TO THE EVALUATION OF CLASSROOM TESTS
The questions below are guidelines for evaluating or designing a test or assessment:
• Are the test procedures practical?
• Is the test reliable?
• Does the procedure demonstrate content validity?
• Is the procedure face valid and "biased for best"?
• Are the test tasks as authentic as possible?
• Does the test offer beneficial washback to the learner?
CHAPTER 3
DESIGNING CLASSROOM
LANGUAGE TESTS
To design a new test or revise an existing one, we have to consider the following questions:
a. What is the purpose of the test?
b. What are the objectives of the test?
c. How will the test specifications reflect both the purpose and the objectives?
d. How will the test tasks be selected and the separate items arranged?
e. What kind of scoring, grading, and feedback is expected?
TEST TYPES
1) Language Aptitude Test
This type of test is designed to measure the capacity or general ability to learn a foreign
language. The Modern Language Aptitude Test and the Pimsleur Language Aptitude Battery are
both examples of this type.
2) Proficiency Test
A proficiency test measures global competence in a language; it tests overall ability rather
than any single course or skill. Proficiency tests are almost always summative and
norm-referenced, and a good example is the TOEFL.
3) Placement Test
Some proficiency tests can serve as placement tests, whose purpose is to place a student into
a particular level or section of a school according to his or her scores. A specific example
is the English as a Second Language Placement Test.
4) Diagnostic Test
This type of test is designed to diagnose specific aspects of a language. For example, a
pronunciation test could be designed to identify the phonological features of English that
are difficult for learners.
5) Achievement Test
This type of test is directly related to classroom lessons, units, or even a total
curriculum. It is limited to the particular material addressed in the curriculum, and its
primary role is to determine whether course objectives have been met, and appropriate
knowledge and skills acquired, by the end of a period of instruction.
SOME PRACTICAL STEPS TO TEST CONSTRUCTION
a. Assessing Clear, Unambiguous Objectives
You must know specifically what you want to test. Begin by taking a careful look at what
students should know or be able to do, based on the material they have been given.
b. Drawing Up Test Specifications
Test specifications for a classroom test can be simple and practical, whereas those for a
large-scale test are much more formal and detailed. The specifications will comprise:
(a) a broad outline of the test and (b) the skills you will test
• The unit test must take no more than 30 minutes
• Test the four skills
• Include oral production in the preceding period
• Time must be divided equally among the language skills
(c) Item types and tasks
• The test prompt can be oral or written
• The options vary widely within each elicitation mode and response mode
• The two modes to specify are the elicitation mode and the response mode
c. Devising Test Tasks
Before the test, lead the class through warm-up activities that are closely related to the
content of the test. By the end of the lesson, all four skills should have been covered in
those activities.
d. Designing Multiple-Choice Questions
Hughes (2003, pp. 76-78) cautions against a number of weaknesses of multiple-choice items:
• They test only recognition knowledge
• Guessing may affect test scores
• They restrict what can be tested
• Washback may be harmful
• Cheating may be facilitated
Practicality and reliability are the two principles that support multiple-choice questions.
They are quick to administer and score, although good items take effort to prepare.
SCORING, GRADING AND GIVING FEEDBACK
Scoring
• When designing a test, you must decide how it will be scored and graded.
• The scoring plan reflects the relative weight of each section and item in the test.
Grading
Grading can take the form of an "A", "B", or "C" at the end of the course, but how letter
grades are assigned is a product of:
• The country, culture, and context of the English classroom
• Institutional expectations
• Explicit and implicit definitions of grades that you have set forth
• The relationship you have established with the class
• Students' expectations
Giving Feedback
• Feedback must provide beneficial washback
• Feedback can include a letter grade, a total score, and marginal comments
• Self-assessment
• Giving suggestions
• Discussion of the results of the test
• Oral feedback
CHAPTER 4
STANDARDIZED TESTING
A standardized test presupposes certain standard objectives, or criteria, that are held
constant from one form of the test to another. The criteria in large-scale standardized tests
are designed to apply to a broad band of competencies that are usually not exclusive to one
particular curriculum. A good standardized test is the product of a thorough process of
empirical research and development. It specifies standard procedures for administration and
scoring. It is typically a norm-referenced test, whose goal is to place test-takers on a
continuum across a range of scores and to differentiate them by their relative ranking.
Advantages of Standardized Tests
• A ready-made, validated product that frees teachers from spending too much time
creating a test
• Administration to large groups can be accomplished within reasonable time limits
• Fast turnaround time in scoring multiple-choice questions
• There is often an air of face validity
Disadvantages of Standardized Tests
• Inappropriate use of the test
• Misunderstanding of the difference between direct and indirect testing
• Some test tasks do not directly specify performance in the target objectives
• Well-standardized tests demonstrate high correlations between performance and target
objectives, but correlations are not sufficient to demonstrate unequivocally the
acquisition of criterion objectives by all test-takers
Developing a Standardized Test
1. Determine the purpose and objectives of the test
All standardized tests are expected to provide high practicality in administration and
scoring without unduly compromising validity. The outlay of money and time for such a test is
significant, but the test will be used repeatedly. It is therefore important to state the
purpose and objectives specifically. For example, the purpose of the TOEFL is to evaluate
the English proficiency of non-native speakers.
2. Design test specifications
• Decisions are needed about how to structure the specifications of the test
• Comprehensive research must be carried out on the construct underlying the test itself
• A standardized test that does not work is often the product of short-sighted construct
validation. For example, because the TOEFL is a proficiency test, the first step in the
development process is to define the construct of language proficiency.
3. Design, select, and arrange test tasks/items
Once the specifications are complete, the process of designing, selecting, and arranging
items begins. The specs act much like a blueprint, determining the number and types of items
to be created.
4. Make appropriate evaluations of different kinds of items
• For production responses, different forms of evaluation become important.
• The principles of practicality and reliability are prominent, along with the concept of
facility.
• Practicality issues for such items include the timing of the test, clarity of directions,
and ease of administration.
• Reliability is a major factor in instances where more than one scorer is employed.
• Facility is key to the validity and success of an item.
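Item facility is conventionally calculated as the proportion of test-takers who answer an item correctly. The sketch below shows that calculation for a small score matrix; the function name and the sample data are made up for illustration, and the summary itself does not give a formula.

```python
def item_facilities(score_matrix):
    """Return the proportion of correct answers for each item.

    score_matrix[s][i] is 1 if student s answered item i correctly, else 0.
    """
    num_students = len(score_matrix)
    # zip(*rows) regroups the scores by item (column) instead of by student (row).
    return [sum(column) / num_students for column in zip(*score_matrix)]


# Hypothetical results: four students, three items.
scores = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
print(item_facilities(scores))
```

An item that nearly everyone answers correctly, or nearly everyone answers incorrectly, tells the test designer little about differences among test-takers, which is why facility is checked when items are evaluated.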
5. Specify scoring procedures and reporting formats
A systematic assembly of test items in pre-selected arrangements and sequences, all of which
are validated to conform to an expected difficulty level, should yield a test that can be
scored accurately. The TOEFL has a straightforward scoring procedure: scores are calculated
for three different sections, and the total score ranges from 40 to 300. The essay is scored
separately.
6. Perform ongoing construct validation studies
• No standardized instrument is expected to be used repeatedly without a rigorous
program of ongoing construct validation.
• A standardized test must be accompanied by systematic periodic corroboration of its
effectiveness.
• A test is usually produced in equated forms, and the forms must be reliable across tests.
• That is why the TOEFL has an impressive program of research to keep the test up to
date.
STANDARDIZED LANGUAGE PROFICIENCY TESTING
• Tests of language proficiency presuppose a comprehensive definition of the specific
competencies that comprise overall language ability.
• The TOEFL provides an illustration of an operational definition of ability for assessment
purposes.
FOUR STANDARDIZED LANGUAGE PROFICIENCY TESTS
• TOEFL
• Michigan English Language Assessment Battery (MELAB)
• IELTS
• TOEIC
The construction of a standardized test is no minor accomplishment, whether the instrument
is large- or small-scale. The design of specifications alone requires a sophisticated process
of construct validation coupled with considerations of practicality.
CHAPTER 5
STANDARDS-BASED ASSESSMENT
The construction of standardized measurement procedures makes possible a concordance between
standardized test specifications and the goals and objectives of educational programs.
English as a Second Language (ESL) instruction is increasingly important in the United
States for non-native speakers.
1) ELD STANDARDS
The process of designing and periodically reviewing ELD standards requires dozens of
curriculum and assessment specialists, teachers, and researchers. To create benchmarks for
accountability, a comprehensive study must be carried out in the following areas:
• Literally thousands of categories of language, ranging from phonology at one end of the
continuum to discourse, pragmatic, functional, and sociolinguistic elements at the other
end
• Specifications of what ELD students' needs are, at 13 different levels, for succeeding
in their academic and social development
• A consideration of what is a realistic number and scope of standards to be included in
the curriculum
• A separate set of standards for teachers
• A thorough analysis of the means available to assess students' attainment of the
standards
Listening and speaking standards for English language learners (ELLs) identify a student's
competency to understand the language and to produce it orally. These two skills are the
building blocks for the foundation of second language acquisition. ELA and ELD standards are
used together to develop students' skills and proficiency in the ELA.
2) ELD ASSESSMENT
• The California English Language Development Test (CELDT) is designed to assess
the attainment of ELD standards across grade levels.
• Stringent budgets within departments of education worldwide predispose those in
decision-making positions to rely on traditional standardized tests for ELD assessment, but
rays of hope lie in the exploration of more student-centered approaches to learner
assessment.
3) CASAS AND SCANS
• Standards-based assessment systems also affect higher levels of education.
• The Comprehensive Adult Student Assessment System (CASAS) is a program designed to
provide broadly based assessments of ESL curricula across the United States.
• CASAS assessment instruments are used to measure all four skills of the English language,
including higher-order thinking skills.
• The Secretary's Commission on Achieving Necessary Skills (SCANS) outlines
competencies necessary for language in the workplace.
4) TEACHER STANDARDS
Student performance depends on the quality of the instructional program provided, which in
turn depends on the quality of teachers' professional development.
Kuhlman (2001) emphasized the importance of teacher standards in three domains:
• Linguistics and language development
• Culture and the interrelationship between language and culture
• Planning and managing instruction
How to assess whether teachers have met the standards remains a complex issue, because
Kuhlman's domains cannot all be evaluated or assessed through a conventional test.
TESOL's standards committee advocates performance-based assessment of teachers for the
following reasons:
• Teachers can demonstrate the standards in their teaching
• Teaching can be assessed through teachers' performance in class
• Performance can be detailed in indicators
• The processes used to assess teachers need to draw on evidence of performance
• Student learning is central to the teacher's performance
CONSEQUENCES OF STANDARDS-BASED AND STANDARDIZED TESTING
A) Positive consequences
• Standardized tests offer high levels of practicality and reliability and impressive
construct validation
• The TOEFL can place tens of thousands of test-takers onto a norm-referenced scale with
high reliability ratios
• The TOEFL is a gatekeeper for students wanting to enter universities and to obtain the
visas that accompany admission
Test Bias
• Test bias can involve culture, race, gender, and teaching and learning styles
• Weir (2001) stated that teachers and students must be given the freedom to choose
formative assessments rather than the summative assessments of standardized tests
in order to avoid test bias
Test-Driven Learning and Teaching
Whenever students or other test-takers know that one single measurement of their
performance will determine their lives, they are less likely to take a positive attitude
toward learning. The motivation in such a context is almost exclusively extrinsic, with
little likelihood of stirring intrinsic interest. A teacher might be a superb teacher, and
his or her students might make excellent progress through the school year.
Ethical Issues: Critical Language Testing
Shohamy (1997) sees the ethics of testing as an extension of what educators call critical
pedagogy, or critical language testing. The issues of critical language testing are numerous:
• Psychometric traditions are challenged by interpretive, individualized procedures for
predicting success and evaluating ability
• Test designers have a responsibility to offer multiple modes of performance to
account for varying styles and abilities among test-takers
• Tests are deeply embedded in culture and ideology
• Test-takers are political subjects in a political context
CHAPTER 6
ASSESSING LISTENING
Observing the Performance of the Four Skills
Performance and observation are interrelated: when you propose to assess someone's ability
in one of the four skills, you assess that person's competence, but you observe the person's
performance. Sometimes performance does not indicate a student's true competence, because
student-related reliability factors affect performance. The first important principle in
assessing a student's performance is therefore to consider the fallibility of the result.
Multiple measures will always give you a more reliable and valid assessment than a single
measure. The second principle is that assessment relies on observable performance: being
able to see or hear what students do in the four language skills.
Teachers should consider the following alternatives before drawing conclusions from a single
performance:
• Several tests that are combined to form an assessment
• A single test with multiple test tasks to account for learning styles and performance
variables
• In-class and extra-class graded work
• Alternative forms of assessment
Importance of Listening
Listening and speaking are two skills that are closely bound together. The overtly observable
nature of speaking renders it more empirically measurable than listening. A good speaker is
often valued more highly than a good listener, but language teachers know that good speaking
depends on good listening comprehension.
Basic Types of Listening
• Recognize speech sounds and hold a temporary ‘imprint’ of them in short-term memory
• Determine the type of speech event and attend to its context
• Use linguistic decoding skills to interpret the message
• Retain the relevant information conceptually in long-term memory
Potential assessment objectives that correspond to the stages of listening above:
• Comprehending the structural elements of the delivered message
• Understanding the pragmatic context
• Determining the meaning of the auditory input
• Developing the gist, a global or comprehensive understanding
Four Types of Listening Performance
• INTENSIVE (listening for perception of the components of language, e.g. phonemes)
• RESPONSIVE (listening to a relatively short stretch of language in order to make an
equally short response, e.g. greetings)
• SELECTIVE (processing stretches of discourse in order to scan for relevant information,
e.g. part one of the IELTS listening test)
• EXTENSIVE (listening to develop a top-down, global understanding of spoken language,
e.g. listening to a lecture)
MICRO- AND MACROSKILLS OF LISTENING
Microskills involve attending to the smaller bits and chunks of language in more of a
bottom-up process, whereas macroskills focus on the larger elements involved in a top-down
process. Richards (1983) and Dunkel (1991) identified the following factors that make
listening difficult:
• Clustering
• Redundancy
• Performance variables
• Colloquial language
• Rate of delivery
• Stress, rhythm, and intonation
• Interaction
DESIGNING ASSESSMENT TASKS:
Intensive Listening
After determining the objectives of the test, the next step is to design tasks that match
those objectives. The tasks can range from intensive listening performance, such as phonemic
recognition, to extensive comprehension of language in communicative contexts.
Some micro skills of intensive listening that a test could target are as follows:
 Recognizing phonological and morphological elements
 Paraphrase recognition
Responsive Listening
The objective of such an item is recognition of a ‘WH’ question (for example, "how much") and
its appropriate response. Distractors are chosen to represent common learner errors. The task
does not have to be multiple-choice; it can also be offered in a more open-ended framework.
Selective Listening
The test-taker listens to a limited quantity of aural input and must discern some specific
information within it. Below are a number of techniques used in selective listening tests:
 Listening cloze or cloze dictation (e.g. filling in blanks)
 Information transfer (e.g. diagram labeling)
 Sentence repetition
 Dictation
Authentic Listening Tasks
Ideally, the language assessment field would have a stockpile of listening test types that are
cognitively demanding, communicative, and authentic, not to mention interactive by means of
integration with speaking. We can assess a test-taker’s comprehension if we take the liberty
of stretching the concept of assessment to extend beyond tests. Here are some possible
tasks:
 Note-taking
 Editing
 Interpretive tasks
 Retelling
ASSESSING SPEAKING
CHAPTER 7
Speaking is a productive skill that is closely tied to listening, because an oral production
task almost always involves aural comprehension. Assessment can focus on fluency,
pronunciation, vocabulary use, grammar, and comprehensibility.
Basic Types of speaking
Imitative
 Ability to imitate a word, a phrase, or even a sentence
Intensive
 Assessment tasks include direct response, reading aloud, sentence and dialogue
completion, and picture-cued tasks.
 Focuses on grammatical, phrasal, lexical, or phonological relationships in the language
used.
Responsive
 Includes brief interaction and tests comprehension (greetings, small talk); it does not
take much time.
Interactive
 Takes more time because the discussion is longer and involves more than one participant.
 An example is a group discussion.
 The exchange can be transactional language or interpersonal.
Extensive (monologue)
 Examples include speeches, oral presentations, and story-telling.
 Listener interaction is limited, often to non-verbal responses, except in informal
monologues.
MICRO AND MACRO SKILLS OF SPEAKING
Micro skills refer to the production of smaller chunks of language such as phonemes,
morphemes, words, collocations, and phrasal units, while macro skills focus on fluency,
discourse, function, style, cohesion, nonverbal communication, and strategic options.
DESIGNING ASSESSMENT TASKS
1. Imitative Speaking
 Focuses on pronunciation, which helps students to be more comprehensible.
 These are occasional, phonologically focused repetition tasks.
 Repetition items can be similar-sounding words, a statement, or a question.
 Examples include the PhonePass test and other computer-based tests.
2. Intensive Speaking
 Students are prompted to produce short stretches of discourse that demonstrate
linguistic ability at a specified level.
 Many tasks are cued.
 Parts C and D of the PhonePass test meet these criteria.
(a) Direct response task
This type of task is mechanical and not communicative, but it elicits grammatically correct
output.
(b) Read aloud test
 Reading goes beyond the sentence level, for example to a paragraph.
 Recording the output makes it easy to assess.
 Teachers note errors and questionable items in order to give proper feedback.
(c) Sentence/Dialogue Completion Tasks and Oral Questionnaires
 Students read a dialogue in which one speaker’s lines have been omitted.
 Test-takers are first given time to read through the dialogue so that they can prepare
appropriate responses.
(d) Picture-Cued Tasks
 Students are given a picture and describe it orally, telling a story or describing an
incident that the picture conveys.
(e) Translations
 Test-takers are given a word, phrase, or sentence and asked to translate it. This
method applies to non-native speakers.
3. Responsive Speaking
(a) Questions and Answers Tasks
 Questions (two or more) are put to students in a way that lets them produce a
meaningful language response.
 Test givers should know why they are asking each question before putting it to
students.
(b) Giving Instructions and Directions Tasks
 Instructions are part of daily life, for example how to tidy up a desk or bake a cake.
 The technique is simple: the teacher poses a question and the student answers.
(c) Paraphrasing
 Students listen to a number of sentences and then paraphrase what they
heard.
4. Interactive Speaking
The last two types of oral production assessment are interactive and extensive speaking.
Interactive tasks need long stretches of interactive discourse: interviews, role plays,
discussions, and games. Extensive tasks involve less interaction: speeches, long
story-telling, and extended explanation and translation.
CHAPTER 8
ASSESSING READING
Reading involves two primary hurdles: bottom-up strategies for processing separate letters,
words, and phrases, and top-down strategies for comprehension. Second language readers must
also develop appropriate content and formal schemata to carry out interpretation
effectively.
Types (Genres) of Reading
A reader must be able to anticipate the conventions of different genres in order to process
meaning effectively. Below are the genres of reading:
o Academic Reading
o Job-related reading
o Personal reading
Micro and Macro skills and Strategies for Reading
(A) Micro skills
 Discriminate among the distinctive graphemes and orthographic patterns of English
 Retain chunks of language of different lengths in short-term memory
 Process writing at an efficient rate of speed to suit the purpose
 Recognize a core of words
 Recognize grammar
 Recognize cohesive devices in written discourse
(B) Macro skills
 Recognize the rhetorical forms of written discourse
 Recognize the communicative functions of written texts
 Infer context that is not explicit by using background knowledge
 Differentiate between literal and implied meanings
 Use a battery of reading strategies and schemata
(C) Principal strategies for reading comprehension
 Identify the purpose of reading
 Apply spelling rules
 Use lexical analysis techniques
 Guess at meaning when uncertain
 Skim the text for the gist and overall meaning
 Scan the text for specific details
 Use silent reading techniques
 Use other resources for further understanding
 Capitalize on discourse markers to process relationships
Types of Reading
1. Perceptive
 Involves attending to the components of larger stretches of discourse; processing is
bottom-up.
2. Selective
 Involves recognition of grammatical, lexical, and discourse features. Picture-cued tasks,
brief paragraphs, and simple charts and graphs are therefore used.
 Both bottom-up and top-down processing are used in this type of reading.
3. Interactive
 In this type, the reader interacts deeply with the text.
 Typical genres involved are anecdotes, short narratives and descriptions,
questionnaires, memos, etc.
 The focus is to identify grammatical, discourse, symbolic, and lexical features in a
short time frame.
4. Extensive
 Involves texts of more than one page, such as essays, articles, reports, short stories,
and books.
 The term comes from reading research.
 Reading usually happens outside school hours.
 The purpose is to find detailed information in the reading.
DESIGNING ASSESSMENT TASKS
(1) Perceptive Reading
o Reading aloud
o Written response
o Multiple choice
o Picture-cued items
(2) Selective Reading
 Multiple-choice (MC) tasks
   o Cloze vocabulary/grammar task
   o Contextualized MC vocabulary/grammar
   o MC vocabulary/grammar
 Matching tasks
   o Vocabulary matching
   o Fill-in vocabulary
 Editing tasks
   o MC grammar editing
 Picture-cued tasks
 Gap-filling tasks
 Cloze tasks
 Short-answer tasks
 Scanning
 Skimming
 Summarizing and responding tasks
 Note-taking and outlining
CHAPTER 9
ASSESSING WRITING
Areas such as handwriting, spelling, construction of grammatical sentences, paragraph
construction, and development of ideas can all be objectives of writing assessment.
Genres of Writing
1. Academic writing
 Possible examples are subject reports, essays, journals, short-answer
responses, theses, and dissertations.
2. Job-related writing
 Examples would be messages, memos, letters/emails, reports, manuals,
advertisements, announcements, schedules, etc.
3. Personal writing
 Examples would be letters, emails, shopping lists, financial documents,
forms, medical reports, diaries, etc.
Types of Writing Performance
Imitative
 This type includes the ability to spell words correctly.
 At this level students are trying to master the mechanics of writing.
Intensive (controlled)
 This type of performance develops skill in producing appropriate vocabulary within a
context, collocations and idioms, and correct grammatical features up to the length of a
sentence.
Responsive
 Tasks require students to perform at a limited discourse level, connecting sentences into
paragraphs and creating a logically connected sequence of two or three paragraphs.
 Tasks are basically guided by pedagogical directives.
 Genres of writing specifically include narratives and descriptions, short reports, lab
reports, summaries, brief responses to reading, and interpretations of charts.
Extensive
 Implies management of all the processes and strategies involved in writing for all
purposes.
 Focuses mainly on achieving a purpose and on organizing and developing ideas.
Micro and Macro Skills of Writing
Micro skills apply more appropriately to imitative and intensive types of writing, while
macro skills are essential for successful mastery of responsive and extensive types of writing.
DESIGNING ASSESSMENT TASKS
1. Imitative writing
 Learners of English, from young to old, need basic training in and assessment of
imitative writing, specifically forming letters, words, and simple sentences, and also
pronunciation.
 Task types include copying, listening cloze selection tasks, picture-cued tasks, form
completion tasks, and conversion of abbreviations and numerical figures to words.
2. Intensive Writing
 Also called controlled writing, form-focused writing, grammar writing, or guided
writing.
 Students produce language to display their competence in grammar, vocabulary,
and sentence formation.
 Task types include dictation, grammatical transformation tasks that can involve all areas
of grammar, ordering tasks, vocabulary assessment tasks, and short-answer and sentence
completion tasks.
 Picture-cued tasks:
   o short sentences
   o picture description
   o picture sequence description
Issues in Assessing Responsive and Extensive Writing
 Authenticity
 Scoring
 Time
3. Responsive and Extensive Writing
These are more open-ended tasks such as short reports, essays, summaries, and responses.
 Paraphrasing means saying or rewriting something in one’s own words, partly in order to
avoid plagiarism.
 Scoring focuses on scales for grammar, vocabulary use, and discourse markers.
 Guided question-and-answer tasks are another type of this writing.
 Paragraph construction tasks are a form of extensive writing that involves constructing
a topic sentence, developing the topic within a paragraph, and developing main and
supporting ideas across paragraphs.
 This type of writing involves many strategies in order to accomplish the purpose of
writing.
Scoring Methods for Responsive and Extensive writing
 Holistic Scoring _ a systematic scoring procedure that assigns a single overall score
 Primary Trait Scoring _ focuses on how well a student can write within a narrowly
defined range of discourse.
 Analytic Scoring _ suited to classroom evaluation of learning; it scores up to six
separate elements of writing.
CHAPTER 10
ALTERNATIVES IN ASSESSMENT
 The principal purpose of the chapter is to examine some of the alternatives in assessment
that differ markedly from formal tests.
 A test is only one possible type of assessment. The difference between testing and
assessment is that testing is a formal method with strict time limits, while assessing is
an ongoing process throughout a course.
 A test measures performance in a specific domain, while assessment procedures are
broader in scope and include smaller-scale methods such as observation.
Characteristics of alternatives in assessment stated by Brown and Hudson (1998)
 Require learners to perform, create, or do something
 Use real-world contexts or simulations
 Are non-intrusive
 Assess students on everyday class activities
 Focus on process as well as product
 Involve higher-level thinking and problem-solving skills
 Make students’ strengths and weaknesses clear
 Are multiculturally sensitive
 Ensure that all scoring is done by humans, not by machines or computers
 Call upon teachers to perform both instructional and assessment roles
The Dilemma of Maximizing Both Practicality and Washback
There is a negative correlation: as a technique increases in its washback and authenticity,
its practicality and reliability tend to be lower. Large-scale multiple-choice tests cannot
offer much washback or authenticity. The challenge is to understand that the alternatives in
assessment are not doomed to be impractical and unreliable.
Performance-Based Assessment
Performance-based assessment implies productive, observable skills, such as speaking and
writing, in content-valid tasks. It also implies an integration of language skills. Students
and teachers are likely to be more motivated to perform such tasks than to work through a set
of multiple-choice questions about facts and figures regarding the solar system.
The following are characteristics of performance assessment:
o Students make a constructed response
o They engage in higher-order thinking, with open-ended tasks
o Tasks are meaningful, engaging, and authentic
o Tasks call for the integration of language skills
Portfolios
A portfolio is a purposeful collection of a student’s work that demonstrates their efforts,
progress, and achievement in given areas. It includes materials such as essays, reports,
poetry, artwork, journals, test scores, and self- and peer-assessments. There are six possible
attributes of a portfolio: collecting, reflecting, assessing, documenting, linking, and
evaluating. Successful portfolio development depends on following a number of steps and
guidelines:
o State objectives clearly
o Give guidelines on what material to include
o Communicate assessment criteria to students
o Designate time within the curriculum for portfolio development
o Establish periodic schedules for review and conferencing
o Designate an accessible place to keep portfolios
o Provide positive, washback-giving final assessments
Journal
A journal is a log of one’s thoughts, feelings, reactions, assessments, ideas, or progress
toward goals, usually written with little attention to structure, form, or correctness.
Journals serve important pedagogical purposes: practice in the mechanics of writing, using
writing as a “thinking” process, individualization, and communication with the teacher. The
following steps are not coincidentally parallel to those cited above for portfolio development:
 Sensitively introduce students to the concept of journal writing
 State the objective of the journal (language learning logs, grammar journals,
responses to readings, strategies-based learning logs, self-assessment reflections,
diaries of attitudes, feelings, and other affective factors, and acculturation logs)
 Give guidelines on what kinds of topics to include
 Carefully specify the criteria for assessing or grading journals
 Provide optimal feedback in your responses (cheerleading feedback, instructional
feedback, and reality-check feedback)
 Designate appropriate time frames and schedules for review
 Provide formative, washback-giving final comments
Conferences and Interviews
In a conference, the teacher plays the role of facilitator and guide, and the student needs to
understand that the teacher is an ally who is encouraging self-reflection and improvement. An
interview is one specialized kind of conference. The goals of an interview are to assess the
student’s oral production, to discover the student’s learning style and preferences, to ask
the student to assess his or her own performance, and to request an evaluation of the
course.
Observations
Teachers’ intuitions about students’ performance are not infallible, and both the
reliability and face validity of their feedback to students can be increased with the help of
empirical means of observing language performance. Kinds of student performance that
can be usefully observed are sentence-level oral production skills, discourse-level skills,
interaction with classmates, quality of teacher-elicited responses, and length of utterances.
To carry out classroom observation, we should take the following steps:
o Determine the specific objectives of the observation
o Decide how many students will be observed at one time
o Set up the logistics for making unnoticed observations
o Design a system for recording observed performances
o Do not overestimate the number of different elements you can observe at one
time
o Plan how many observations you will make
o Determine specifically how you will use the results
Self and Peer Assessments
The ability to set one’s own goals both within and beyond the structure of a classroom
curriculum, to pursue them without the presence of an external prod, and to independently
monitor that pursuit are all keys to success. Peer assessment draws on the principles of
cooperative learning. The benefit of self- and peer-assessment lies in a community of learners
who are capable of teaching each other something.
Types of self- and peer-assessment
Assessment of a performance
 A student monitors himself or herself in either oral or written production and renders
some kind of evaluation of the performance.
Indirect assessment of competence
 Self and peer assessments of performance are limited in time and focus to a relatively
short performance. Assessment of competence may encompass a lesson over several
days, a module, or even a whole term of course work.
Metacognitive assessment (for setting goals)
 Personal goal-setting has the advantage of fostering intrinsic motivation and of
providing learners with that extra-special impetus from having set and accomplished
one’s own goals.
Socio-affective assessment
 It requires looking at oneself through a psychological lens and may not differ greatly
from self-assessment across a number of subject matter areas or for any set of
personal skills.
Student-generated tests
 Gorsuch found that student-generated quiz items transformed routine weekly quizzes
into a collaborative and fulfilling experience. Students in small groups were directed
to create content questions on their reading passages and to collectively choose six
vocabulary items.
Guidelines for self- and peer-assessment
o Tell students the purpose of the assessment
o Define the tasks clearly
o Encourage impartial evaluation of performance or ability
o Ensure beneficial washback through follow-up tasks
CHAPTER 11
GRADING AND STUDENT
EVALUATION
GUIDELINES FOR SELECTING GRADING CRITERIA
Consider the following guidelines:
 It is essential for all components of grading to be consistent with the institutional
philosophy and/or regulations.
 All of the components of a final grade need to be explicitly stated in writing to
students at the beginning of a term of study, with a designation of percentages or
weighting figures for each component.
 If your grading system includes behavior and motivation, it is important to recognize
their subjectivity.
 Finally, consider allocating relatively small weights to items such as oral participation
in class and attendance, so that a grade primarily reflects achievement.
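The weighting described in the guidelines above can be sketched as a short calculation. This is only an illustration: the component names, the percentages in `WEIGHTS`, and the function `final_grade` are assumptions made for the example and are not figures given in the chapter. Achievement components carry most of the weight, while participation and attendance are kept small.

```python
# Hypothetical component weights in percent. The categories and numbers are
# illustrative assumptions, not values prescribed by the source.
WEIGHTS = {
    "midterm": 30,
    "final_exam": 40,
    "written_work": 20,
    "oral_participation": 5,
    "attendance": 5,
}


def final_grade(scores, weights=WEIGHTS):
    """Return the weighted average of component scores on a 0-100 scale."""
    if sum(weights.values()) != 100:
        raise ValueError("weights must add up to 100 percent")
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(scores[name] * weight for name, weight in weights.items()) / 100


# One hypothetical student's component scores (each out of 100).
student = {
    "midterm": 80,
    "final_exam": 70,
    "written_work": 90,
    "oral_participation": 100,
    "attendance": 100,
}
print(final_grade(student))  # 80.0
```

Because the non-achievement items carry only 10 percent between them, perfect attendance and participation move this student's grade by a few points at most.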
Calculating grades: Absolute and relative grading
If you pre-specify standards of performance on a numerical point system, you are using an
absolute system of grading, for example by establishing points for a midterm test, points for
a final exam, and points accumulated for the semester. Relative grading is more commonly
used than absolute grading. It has the advantage of allowing your own interpretation and of
adjusting for unpredicted ease or difficulty of a test. It is usually accomplished by ranking
students in order of performance and assigning cut-off points for grades.
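The difference between the two systems can be shown with a small sketch. The cut-off points, the class shares for each letter, the student names, and the scores below are all invented for illustration; the chapter does not prescribe any of them. Note that this simple ranking does not treat tied scores specially.

```python
def absolute_grade(score, cutoffs=((90, "A"), (80, "B"), (70, "C"), (60, "D"))):
    """Absolute grading: compare a score with pre-specified standards."""
    for minimum, letter in cutoffs:
        if score >= minimum:
            return letter
    return "F"


def relative_grades(scores, shares=(("A", 20), ("B", 30), ("C", 30), ("D", 20))):
    """Relative grading: rank students, then cut the ranking into bands.

    `shares` gives the percentage of the class that receives each letter
    (an assumed distribution). Tied scores are not handled specially.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    grades = {}
    start = 0
    cumulative = 0
    for letter, share in shares:
        cumulative += share
        end = cumulative * len(ranked) // 100
        for name in ranked[start:end]:
            grades[name] = letter
        start = end
    # Anyone left over because of integer rounding gets the last letter.
    for name in ranked[start:]:
        grades[name] = shares[-1][0]
    return grades


# Hypothetical results on an unexpectedly difficult test.
scores = {"Ana": 72, "Budi": 65, "Citra": 58, "Dewi": 55, "Eko": 51,
          "Fajar": 49, "Gita": 44, "Hadi": 40, "Indah": 35, "Joko": 30}

for name, letter in relative_grades(scores).items():
    print(name, scores[name], absolute_grade(scores[name]), letter)
```

On this difficult test the absolute system gives the top student a C and fails eight of the ten students, while the relative system still spreads grades from A to D, which is the adjustment for unpredicted difficulty mentioned above.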
Institutional Expectations and Constraints
Some institutions refuse to employ either a letter grade or a numerical system of evaluating
and instead offer narrative evaluations of students. This preference for more individualized
evaluations is often a reaction to the overgeneralizations of letter and numerical grading. Try
to determine what the institution’s grading philosophy is, because teachers are sometimes
expected to grade students using a system that conforms to an unwritten philosophy.
Cross-cultural factor and the question of difficulty
Learners may hold implicit philosophies of grading at wide variance with those of an
English-speaking culture. Teachers need to understand the context in which they are teaching.
A number of variables bear on the issue. In some cultures:
o It is unheard of to ask a student to self-assess performance
o The teacher assigns a grade, and nobody questions the teacher’s criteria
o One single final examination is the accepted determinant of a student’s entire
course grade
o The notion of a teacher’s preparing students to do their best on a test is an
educational contradiction
Alternatives to Letter Grading
For assessment of a test, paper, report, etc., the possibilities beyond a simple number or
letter include:
 A teacher’s marginal and/or end comments
 A teacher’s written reaction to a student’s self-assessment of performance
 A teacher’s review of the test in the next class period
 Peer-assessment of performance
 Self-assessment of performance
For summative assessment, those additional assessments can take modified forms:
 A teacher’s marginal and/or end-of-exam, paper, or project comments
 A teacher’s summative written evaluative remarks on a journal, portfolio, or other
tangible product
 A teacher’s conference with the student
 A completed summative checklist of competencies with comments
Some principles and guidelines for grading and evaluation
We should understand that:
o Grading is not necessarily based on a universally accepted scale
o Grading is sometimes subjective and context-dependent
o Grading of tests is often done on the curve
o Grades reflect a teacher’s philosophy of grading
o Grades reflect an institutional philosophy of grading
o Cross-cultural variation in grading philosophies needs to be understood