PREPARED BY RAZİYE ŞENGÜL

EVALUATION, ASSESSMENT AND TESTING
EVALUATION: It involves looking at all factors that influence the learning process, i.e., syllabus objectives, course design and materials.
ASSESSMENT: It refers to a variety of ways of collecting information on a learner's language ability or achievement. It is an umbrella term for all measures used to evaluate student progress.
TESTING: It is a subcategory of assessment. It is a systematic procedure used to gather information about students' behaviour.

TYPES OF TESTS
PLACEMENT TESTS: They assess students' level of language ability so that they can be placed in an appropriate course or class.
APTITUDE TESTS: They measure capacity or general ability to learn a foreign language.
DIAGNOSTIC TESTS: They identify language areas in which a student needs further help. They are based on failure.
PROGRESS TESTS: They measure the progress that students are making toward defined course or program goals.
ACHIEVEMENT TESTS: They are based on the content of a specific course and are usually administered in the middle or at the end of the semester.
PROFICIENCY TESTS: They assess the overall language ability of students at varying levels or in a particular skill area.

TEST PURPOSE

ADDITIONAL WAYS OF LABELING TESTS

TRADITIONAL VERSUS ALTERNATIVE ASSESSMENT

TYPES OF ALTERNATIVE ASSESSMENT
SELF ASSESSMENT: It allows students to provide feedback on their own work or that of their peers.
PORTFOLIO ASSESSMENT: It is a collection of student work associated with the standards that students are required to learn.
STUDENT-DESIGNED TESTS: Students in upper-year courses might be involved in designing exam questions, reading questions or even entire assignments that they would like to complete in order to demonstrate their learning.
PROJECTS: Students learn about a subject by working for an extended period of time to investigate and respond to a complex question, challenge, or problem.
PRESENTATIONS: They usually consist of a topic for the student to research, discuss and present. A question-and-answer session is usually included after the presentation. This measures the ability of students to respond, think under pressure and manage discussion.

TIMING OF THE TEST
- APTITUDE, ADMISSIONS AND PROFICIENCY TESTS are applied before or outside the program.
- PLACEMENT AND DIAGNOSTIC TESTS occur at the start of the program.
- PROGRESS AND ACHIEVEMENT TESTS take place during the course.
- MASTERY OR CERTIFICATION TESTS occur at the end of the program.

THE CORNERSTONES OF TESTING
- USEFULNESS
- VALIDITY
- RELIABILITY
- PRACTICALITY
- WASHBACK
- AUTHENTICITY
- TRANSPARENCY
- SECURITY
- MANAGEABILITY
- REFERENCING

USEFULNESS
Any language test must be developed with a specific purpose, a particular group of test-takers and a specific language use in mind. It is the most important quality of testing.

VALIDITY
It means the test assesses the course content and outcomes using formats familiar to the students. The types of validity:
1-CONTENT VALIDITY: A test should sample the subject matter and require the test-taker to perform the behaviour that is being measured. For example, if you are trying to assess a person's ability to speak a second language in a conversational setting, asking the learner to answer paper-and-pencil multiple-choice questions requiring grammatical judgments does not achieve content validity.
2-CRITERION-RELATED VALIDITY: It means the extent to which the criterion of the test has actually been reached. For example, a classroom test designed to assess mastery of a point of grammar in communicative use will have criterion validity if test scores are corroborated either by observed subsequent behavior or by other communicative measures of the grammar point in question.
3-CONSTRUCT VALIDITY: A test has to test those attributes it is supposed to test.
For example, if we want to test reading, we need to examine all the underlying skills involved, such as reading both aloud and silently, with accuracy, fluency and verve.
4-CONSEQUENTIAL VALIDITY: It encompasses all the consequences of a test, including such considerations as its accuracy in measuring intended criteria, its impact on the preparation of test-takers, its effect on the learner, and the (intended and unintended) social consequences of a test's interpretation and use.
5-FACE VALIDITY: It refers to the degree to which a test looks right and appears to measure the knowledge or abilities it claims to measure, based on the subjective judgment of the test-takers. Face validity is not something that can be empirically tested by a teacher or even by a testing expert. It is purely a factor of the "eye of the beholder": how the test-taker, or possibly the test giver, intuitively perceives the instrument.

RELIABILITY
It refers to the consistency of test scores, which simply means that a test would offer similar results if it were given at another time. Three important factors affect test reliability:
- Fluctuations in the learner, such as additional learning, forgetting, sickness or emotional problems that affect the results.
- Fluctuations in scoring, such as subjectivity or mechanical errors that affect the results.
- Fluctuations in test administration, such as inconsistent administrative procedures and testing conditions that affect the results.

PRACTICALITY
A teacher should be able to develop, administer and mark the test within the available time and with the available resources. The feedback from the assessment should also be understandable to the students.

WASHBACK
It refers to the effect of testing on teaching and learning. Formative tests, by definition, provide washback in the form of information to the learner on progress toward goals.
But teachers might be tempted to feel that summative tests, which provide assessment at the end of a course or program, do not need to offer much in the way of washback. Such an attitude is unfortunate because the end of every language course or program is always the beginning of further pursuits, more learning, more goals, and more challenges to face. Even a final examination in a course should carry with it some means for giving washback to students.

AUTHENTICITY
When you make a claim for authenticity in a test task, you are saying that this task is likely to be enacted in the "real world." Many test item types fail to simulate real-world tasks. They may be artificial in their attempt to target a grammatical form or a lexical item. In a test, authenticity may be present in the following ways:
- The language in the test is as natural as possible.
- Items are contextualized rather than isolated.
- Topics are meaningful (relevant, interesting) for the learner.
- Some thematic organization to items is provided, such as through a story line or episode.
- Tasks represent, or closely approximate, real-world tasks.

TRANSPARENCY
It refers to the availability of clear, accurate information to students about testing. Such information should include the outcomes to be evaluated, the formats used, the weighting of items and sections, the time allowed to complete the test and the grading criteria.

MANAGEABILITY
The assessment task must not take excessive administrative time, so that the costs do not outweigh the benefits.

REFERENCING (JUDGING PERFORMANCE)
NORM REFERENCING: It is a system of assessment that judges the performance of an individual within a group against the whole group's performance. If the sample is large enough, its scores will tend to form a bell-shaped curve with a mean and standard deviation close to those of the whole population.
CRITERION REFERENCING: It is the principle of defining what is required before a test is sat and then judging individuals against those criteria.
For example, in order to get a swimming certificate, the criterion will be that you have to swim a certain distance within a certain time. You might finish last, but that is irrelevant as long as you do it within the time.

THANKS FOR LISTENING…