PRINCIPLES OF LANGUAGE ASSESSMENT

Abstract:
This paper discusses the main principles of language
assessment that teachers, as the constructors of assessment,
should follow. There are five major principles of
language assessment: practicality, reliability, validity,
authenticity, and washback.
Teaching is not only about delivering knowledge to students but also
about building their understanding. A teacher has to know what lesson
he is going to deliver to the students, how to deliver it, and how to
assess it. Assessment is an important part of the teaching and learning
process because it is a tool for measuring whether the students know and
understand the material. By giving assessments, a teacher gets information
about students’ achievement.
Brown (2010: 25) stated that there are five major principles of language
assessment: practicality, reliability, validity, authenticity, and washback.
They are described in more detail as follows:
1. Practicality
Brown said that practicality refers to the logistical, down-to-earth,
administrative issues involved in making, giving, and scoring an assessment
instrument (2010: 26). Further, Mousavi in Brown (2010: 26) stated that these
include cost, the amount of time it takes to construct and to administer, ease
of scoring, and ease of reporting the result.
Based on the definition above, it can be concluded that practicality
is defined in terms of cost, time, administration, and scoring/evaluation.
a. Cost
A good test should not be too expensive to conduct. A teacher should
avoid conducting a test that requires excessive budget.
b. Time
A good test should be neither too long nor too short for the students
to finish.
c. Administration
A good test should not be complicated or difficult to conduct; it
should be simple to administer.
d. Scoring/evaluation
A good test should come with tools that make it easy to score, such as
a scoring rubric and an answer key.
2. Reliability
Brown (2010: 27) said that a reliable test is consistent and dependable.
If you give the same test to the same student or matched students on two
different occasions, the test should yield similar results. In other words,
if the test is given to the same students on different occasions, it
should produce almost the same result. For example, a student should get
about the same score whether he or she takes the test, possibly with a
different examiner, on a Monday morning or a Tuesday afternoon.
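One common way to put a number on this kind of consistency is to correlate the scores from the two occasions. The sketch below is only an illustration, not a procedure taken from Brown: the function name `pearson_r` and the two score lists are hypothetical, and it assumes the same five students took the same test twice. A value close to 1 suggests that the two administrations ranked the students in nearly the same way.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)

# Hypothetical scores of the same five students on two occasions.
monday_scores = [78, 85, 62, 90, 71]
tuesday_scores = [80, 83, 65, 88, 70]

r = pearson_r(monday_scores, tuesday_scores)
print(f"test-retest correlation: {r:.2f}")
```

How high the correlation must be before a test counts as reliable is a judgment that depends on the purpose of the test; the code only computes the index.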
Reliability is related to validity. Bachman (2011: 160) said that in
order for a test score to be valid, it must be reliable. Brown
(2010: 28-29) identified several issues related to reliability. They are:
a. Student-related Reliability
According to Brown (2010: 28), the most common learner-related issue in
reliability is caused by temporary illness, fatigue, a "bad day," anxiety, and
other physical or psychological factors, which may make an observed
score deviate from one's true score. For example, when a student is in
a bad mood because of a “bad day” while taking a test, it can
affect his score.
b. Rater Reliability
Rater reliability deals with the scoring process. Unreliable scoring can
be caused by human error and subjectivity.
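As a rough illustration of how agreement between two raters might be checked, the sketch below computes Cohen's kappa, a widely used index that corrects simple percentage agreement for the agreement expected by chance. The choice of kappa is an assumption made for this example and is not taken from Brown; the function name `cohen_kappa` and the letter grades are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same pieces of work."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of work")
    n = len(rater_a)
    # Proportion of pieces of work on which the two raters gave the same grade.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from how often each rater used each grade.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[g] * counts_b[g] for g in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use one single grade")
    return (observed - expected) / (1 - expected)

# Hypothetical grades given by two teachers to the same six essays.
teacher_1 = ["A", "B", "B", "C", "A", "B"]
teacher_2 = ["A", "B", "C", "C", "A", "A"]

kappa = cohen_kappa(teacher_1, teacher_2)
print(f"kappa: {kappa:.2f}")
```

A kappa of 1 means the raters always agree, while a value near 0 means they agree no more often than chance would predict.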
c. Test Administration Reliability
Test administration reliability concerns the situation and conditions in
which the test is administered. For example, when a teacher wants to
conduct a listening test, he should prepare a room that is comfortable for
a listening activity. He has to make sure that the activity will run well by
considering everything related to the test, such as the audio system (it
should be clear to all the students), the lighting, and the seating
arrangement.
d. Test Reliability
Brown said that sometimes the nature of the test can cause measurement
errors (2010: 29). It means that test reliability refers to the test itself. For
example, when the teacher gives a test with multiple-choice items
and one question has more than one correct answer, or when the teacher
uses an ambiguous sentence in the test, it can affect the score.
3. Validity
As stated by Brown (2010: 30), a valid test measures exactly what it
proposes to measure. For example, when the students are given a reading test
about human respiration, a valid test will measure reading ability, such
as identifying general or specific information in the text, not their prior
knowledge (biology) of human respiration.
Brown (2010: 30-35) proposed five ways to establish validity. They
are content validity, criterion validity, construct validity, consequential
validity, and face validity.
a. Content Validity
Content validity refers to the match between the content of the
test and the language skill, structure, etc. that it is meant to measure.
For example:
Suppose the teacher wants to assess students’ speaking ability in a
conversational setting but asks the students to answer
paper-and-pencil multiple-choice questions requiring grammatical
judgments. This test does not achieve content validity. The teacher should
conduct a test that requires the students actually to speak to their
friends.
b. Criterion Validity
Criterion validity emphasizes the relationship between the test
score and the outcome. According to Brown (2010: 32), criterion
validity usually falls into concurrent validity and predictive
validity.
• Concurrent Validity
A test has concurrent validity if its results are supported by
other concurrent performance beyond the assessment itself.
For example:
The validity of a high score on the final exam of a foreign
language course will be verified by actual proficiency in
the language.
• Predictive Validity
Predictive validity concerns how well a test measures and predicts
a test taker’s likelihood of future success.
For example:
The TOEFL test is intended to predict how well someone will
be able to use his English in the future.
c. Construct Validity
Construct validity refers to the concepts or theories that
underlie the use of a certain ability.
For example:
Proficiency, communicative competence, and fluency are examples
of linguistic constructs. When the teacher conducts a speaking test,
the scoring analysis for the test includes several factors in the final
score: pronunciation, fluency, grammatical accuracy, vocabulary
use, and sociolinguistic appropriateness. The justification of these
five factors lies in the theoretical construct that claims those factors
to be major components of oral proficiency. But if he conducts a
test that evaluates only pronunciation and grammar, one could be
justifiably suspicious about the construct validity of that test.
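The speaking-test example can be sketched as a small check on a scoring rubric. This is only an illustration under stated assumptions: the names `ORAL_PROFICIENCY_FACTORS`, `missing_factors`, and `composite_score`, the 1 to 5 rating scale, the equal weighting of the five factors, and the sample ratings are all hypothetical and are not taken from Brown.

```python
# The five factors named in the speaking-test example above.
ORAL_PROFICIENCY_FACTORS = (
    "pronunciation",
    "fluency",
    "grammatical accuracy",
    "vocabulary use",
    "sociolinguistic appropriateness",
)

def missing_factors(rubric):
    """Return the factors of the construct that a rubric does not score."""
    return [f for f in ORAL_PROFICIENCY_FACTORS if f not in rubric]

def composite_score(ratings):
    """Average the ratings over all five factors (equal weights assumed)."""
    missing = missing_factors(ratings)
    if missing:
        raise ValueError(f"rubric does not cover: {', '.join(missing)}")
    total = sum(ratings[f] for f in ORAL_PROFICIENCY_FACTORS)
    return total / len(ORAL_PROFICIENCY_FACTORS)

# Hypothetical ratings for one student on a 1 to 5 scale.
full_ratings = {
    "pronunciation": 4,
    "fluency": 3,
    "grammatical accuracy": 4,
    "vocabulary use": 5,
    "sociolinguistic appropriateness": 3,
}
# A narrower rubric that scores only pronunciation and grammar.
narrow_ratings = {"pronunciation": 4, "grammatical accuracy": 4}

score = composite_score(full_ratings)
gaps = missing_factors(narrow_ratings)
print(score)
print(gaps)
```

The narrow rubric leaves three of the five factors unscored, which is the situation in which the construct validity of the test comes into doubt.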
d. Consequential Validity
Consequential validity refers to the consequences of using a
particular test for a particular purpose. A good test should have
positive consequences for the students. So, the teacher should
consider the effect of assessment on students’ motivation,
independent learning, study habits, and attitude toward school
work.
e. Face Validity
According to Gronlund in Brown (2010: 35), face validity is the extent
to which students view the assessment as fair, relevant, and useful for
improving learning. Moreover, Mousavi in Brown (2010: 35) stated
that face validity refers to the degree to which a test looks right,
and appears to measure the knowledge or abilities it claims to
measure, based on the subjective judgment of the examinees who
take it, the administrative personnel who decide on its use, and
other psychometrically unsophisticated observers.
Students may feel that a test isn’t testing what it’s supposed to test,
and this might affect their performance and consequently affect the
result of the test. To improve students’ perception of the test, the
teacher as test constructor has to consider the following:
• Students will be more confident if they face a well-constructed,
expected format with familiar tasks.
• Students will be less anxious if the test can be accomplished
within an allotted time limit.
• Students will be optimistic if the items of the test are clear and
uncomplicated.
• Students will find the test easier to do if the directions are
clear.
• Students will be less worried if the test is related to their course
work.
4. Authenticity
The fourth major principle of language assessment is authenticity. It
deals with the “real world”. Teachers should construct tests whose items
are likely to be applied in the real context of daily life. Brown (2010: 37)
proposes considerations that might be helpful for presenting authenticity
in a test. They are:
• The language in the test is as natural as possible.
• Topics are meaningful (relevant, interesting) for the learner.
• Some thematic organization to items is provided, such as through a
story line or episode.
• Tasks represent, or closely approximate, real-world tasks.
5. Washback
According to Teflpedia (http://teflpedia.com/Washback_effect), washback
refers to the influence, either positive or negative, that an exam has on
the way in which students are taught. In addition, Hughes in Brown
(2010: 37) said that washback is the effect of testing on teaching and
learning. Based on those definitions, it can be concluded that washback
refers to the effect of testing on the teaching and learning process and
that it has two sides: positive and negative.
• Positive Washback
Positive washback has a beneficial influence on teaching and learning for
both the teachers and the students.
For example:
The teacher gives a daily paper-based test and asks the students to
answer some questions. After they finish, they submit their papers to
the teacher, and the teacher checks their work. The teacher then not
only gives a score but also gives feedback or comments about their
strengths and weaknesses on the test in order to motivate the students.
• Negative Washback
A test which has negative washback is considered to have a negative
influence on teaching and learning.
For example:
The teacher gives a daily paper-based test and asks the students to
answer some questions. After they finish, they submit their papers to
the teacher, and the teacher checks their work but gives them only a
score, without any comments. In practice, a letter grade or numerical
score alone is not enough. The students need feedback from their teacher.
CONCLUSION
Based on the explanation of the principles of language assessment
above, we can conclude that a test is good if it has practicality, high
reliability, good validity, authenticity, and positive washback. Teachers should
apply these five principles when conducting tests in their teaching and learning
process.
References:
Bachman, L.F. 2011. Fundamental Considerations in Language Testing. New
York: Oxford University Press.
Brown, H.D., and Abeywickrama, P. 2010. Language Assessment: Principles
and Classroom Practices (2nd Edition). New York: Pearson Education,
Inc.
Teflpedia. Washback effect. http://teflpedia.com/Washback_effect. Accessed
on March 9th, 2018 at 08.11 am.