Abstract: This paper discusses the main principles of language assessment that teachers should follow as the constructors of assessment. There are five major principles of language assessment: practicality, reliability, validity, authenticity, and washback.

Teaching is not only about delivering knowledge to students but also about constructing students' understanding. A teacher has to know what lesson he is going to deliver, how to deliver it, and how to assess it. Assessment is an important part of the teaching and learning process because it is a tool for measuring whether the students know or understand the material. By giving an assessment, the teacher can get information about students' achievement.

Brown (2010: 25) stated that there are five major principles of language assessment: practicality, reliability, validity, authenticity, and washback. They are described in more detail as follows:

1. Practicality

Brown said that practicality refers to the logistical, down-to-earth, administrative issues involved in making, giving, and scoring an assessment instrument (2010: 26). Further, Mousavi in Brown (2010: 26) stated that these include cost, the amount of time it takes to construct and to administer, ease of scoring, and ease of reporting the results. Based on the definitions above, it can be concluded that practicality is defined in terms of cost, time, administration, and scoring/evaluation.

a. Cost
A good test should not be too expensive to conduct. A teacher should avoid conducting a test that requires an excessive budget.

b. Time
A good test should be neither too long nor too short for the students to finish.

c. Administration
A good test should not be complicated or difficult to conduct; it should be simple to administer.

d. Scoring/evaluation
A good test should come with something that makes it easy to score, such as a scoring rubric and an answer key.

2. Reliability

Brown (2010: 27) said that a reliable test is consistent and dependable. If you give the same test to the same student or matched students on two different occasions, the test should yield similar results. In other words, if the test is given to the same students on different occasions, it will produce almost the same result. For example, a student should get nearly the same score whether he or she takes the test, possibly with a different examiner, on a Monday morning or a Tuesday afternoon.

There is a relationship between reliability and validity, as stated by Bachman (2011: 160). He said that in order for a test score to be valid, it must be reliable.

Brown (2010: 28-29) identified several issues related to reliability. They are:

a. Student-related Reliability
According to Brown (2010: 28), the most common learner-related issue in reliability is caused by temporary illness, fatigue, a "bad day," anxiety, and other physical or psychological factors, which may make an observed score deviate from one's true score. For example, when a student is in a bad mood because of a "bad day" while taking a test, it can affect his score.

b. Rater Reliability
Rater reliability deals with the scoring process. Unreliability in scoring can be caused by human error and subjectivity.

c. Test Administration Reliability
Test administration reliability concerns the situation and conditions in which the test is administered. For example, when a teacher wants to conduct a listening test, he should prepare a room that is comfortable for a listening activity. He has to make sure that the activity will run well by considering everything related to the test, such as the audio system (it should be clear to all the students), the lighting, and the seating arrangement.

d. Test Reliability
Brown said that sometimes the nature of the test can cause measurement errors (2010: 29). In other words, test reliability refers to the test itself.
For example, when the teacher gives a multiple-choice test in which one question has more than one correct answer, or when the teacher uses ambiguous sentences in the test, it can affect the score.

3. Validity

As stated by Brown (2010: 30), a valid test measures exactly what it proposes to measure. For example, when the students are given a reading test about human respiration, a valid test will measure reading ability, such as identifying general or specific information in the text, not their prior knowledge of biology.

Brown (2010: 30-35) proposed five ways to establish validity. They are content validity, criterion validity, construct validity, consequential validity, and face validity.

a. Content Validity
Content validity refers to the correspondence between the content of the test and the language skill, structure, etc. that is being assessed. For example, if a teacher wants to assess students' speaking ability in a conversational setting but asks the students to answer paper-and-pencil multiple-choice questions requiring grammatical judgments, the test does not achieve content validity. The teacher should instead conduct a test that requires the students actually to speak with a classmate.

b. Criterion Validity
Criterion validity emphasizes the relationship between the test score and the outcome. According to Brown (2010: 32), criterion validity usually falls into two categories: concurrent validity and predictive validity.

Concurrent Validity
A test has concurrent validity if its results are supported by other concurrent performance beyond the assessment itself. For example, the validity of a high score on the final exam of a foreign language course will be verified by actual proficiency in the language.

Predictive Validity
Predictive validity concerns how well a test measures and predicts a test taker's likelihood of future success. For example, the TOEFL test is intended to show how well someone will be able to use his English in the future.

c. Construct Validity
Construct validity refers to the concepts or theories that underlie the use of a certain ability. Proficiency, communicative competence, and fluency are examples of linguistic constructs. For example, when the teacher conducts a speaking test, the scoring analysis for the test includes several factors in the final score: pronunciation, fluency, grammatical accuracy, vocabulary use, and sociolinguistic appropriateness. The justification of these five factors lies in the theoretical construct that claims those factors to be major components of oral proficiency. But if the teacher conducts a test that evaluates only pronunciation and grammar, one could be justifiably suspicious about the construct validity of that test.

d. Consequential Validity
Consequential validity refers to the consequences of using a particular test for a particular purpose. A good test must have positive consequences for the students. So, the teacher should consider the effect of assessment on students' motivation, independent learning, study habits, and attitude toward school work.

e. Face Validity
According to Gronlund in Brown (2010: 35), face validity is the extent to which students view the assessment as fair, relevant, and useful for improving learning. Moreover, Mousavi in Brown (2010: 35) stated that face validity refers to the degree to which a test looks right, and appears to measure the knowledge or abilities it claims to measure, based on the subjective judgment of the examinees who take it, the administrative personnel who decide on its use, and other psychometrically unsophisticated observers.

Students may feel that a test isn't testing what it's supposed to test, and this might affect their performance and consequently the result of the test. To address students' perceptions, the teacher as test constructor has to consider the following:
Students will be more confident if they face a well-constructed, expected format with familiar tasks.
Students will be less anxious if the test can be accomplished within the allotted time limit.
Students will be optimistic if the items of the test are clear and uncomplicated.
Students will find the test easy to do if the directions are clear.
Students will be less worried if the test is related to their course work.

4. Authenticity

The fourth major principle of language assessment is authenticity. It deals with the "real world". Teachers should conduct a test whose items are likely to be applied in the real context of daily life. Brown (2010: 37) proposes considerations that might be helpful for presenting authenticity in a test. They are:
The language in the test is as natural as possible.
Topics are meaningful (relevant, interesting) for the learner.
Some thematic organization to items is provided, such as through a story line or episode.
Tasks represent, or closely approximate, real-world tasks.

5. Washback

According to Teflpedia (http://teflpedia.com/Washback_effect), washback refers to the influence, either positive or negative, that an exam has on the way in which students are taught. In addition, Hughes in Brown (2010: 37) said that washback is the effect of testing on teaching and learning. Based on those definitions, it can be concluded that washback refers to the effect of testing on the teaching and learning process and that it has two sides: positive and negative.

Positive Washback
Positive washback has a beneficial influence on teaching and learning for both the teachers and the students. For example, the teacher conducts a daily paper-based test and asks the students to answer some questions. After they finish, the students submit their papers and the teacher checks their work. The teacher then not only gives a score but also gives feedback or comments about their strengths and weaknesses on the test, in order to motivate the students.
Negative Washback
A test which has negative washback is considered to have a negative influence on teaching and learning. For example, the teacher conducts a daily paper-based test and asks the students to answer some questions. After they finish, the students submit their papers, and the teacher checks their work and gives them only a score without any comments. In reality, a letter grade or numerical score is not enough. The students need feedback from their teacher.

CONCLUSION

Based on the explanation of the principles of language assessment above, we can conclude that a test is good if it has practicality, high reliability, good validity, authenticity, and positive washback. Teachers should apply these five principles when conducting tests in their teaching and learning process.

References:
Bachman, L.F. 2011. Fundamental Considerations in Language Testing. New York: Oxford University Press.
Brown, H.D., & Abeywickrama, P. 2010. Language Assessment: Principles and Classroom Practices (2nd Edition). New York: Pearson Education, Inc.
http://teflpedia.com/Washback_effect accessed on March 9th, 2018 at 08.11 am.