Production of comparative language tests
The European Survey on Language Competences
Neil Jones, Cambridge ESOL
SQA, 28 February 2013

Project partners
• Cambridge ESOL: project management, English language tests, language test coordination
• Centre international d’études pédagogiques (CIEP): French language tests
• Goethe Institut: German language tests
• Università per Stranieri di Perugia: Italian language tests
• Universidad de Salamanca / Instituto Cervantes: Spanish language tests
• Gallup Europe: sampling, testing tool, translation
• National Institute for Educational Measurement (Cito): questionnaire design, analysis

Key aims
The ESLC set out to:
• provide information on the general level of foreign language knowledge of pupils
• provide strategic information to policy makers, teachers and learners
Using the following instruments:
• Language tests: English, French, German, Italian, Spanish; 3 skills (Reading, Listening, Writing); A1 to B2 levels of the CEFR
• Contextual questionnaires: addressing 13 language policy issues, for students, teachers, principals and countries

Validity as inference to some “real world”
[Diagram: the world of the test is linked by inference to the “real world” of language use]

Inference to some “real world”: a sequence of steps
[Diagram: test construction (processes, knowledge; test/task features; learner features) leads in four numbered steps through test performance, test score and measure to the “real world” (target situation of use)]

[Diagram: the same chain extended to five steps and annotated with validity questions. Labels recoverable from the slide:]
• Test construction (theory-based validity, context validity): What to observe? How?
• Evaluation (scoring validity): How can we score what we observe?
• Generalization (measurement validity; scale construction, measurement): Are scores consistent and interpretable?
• Extrapolation: Does the test score reflect the candidate’s actual ability?
• Alignment (standard setting, interpretation): How does the specific learning/testing context relate to a more general proficiency framework?
• The chain runs from context-specific measures to context-neutral framework levels.

Approach to developing the language testing framework
• Identify the language testing objectives of the ESLC.
• For each skill, identify test content and testable subskills derived from:
  - a socio-cognitive model of language proficiency
  - language functions or competences salient at levels A1 to B2 in the CEFR
• Identify appropriate task types to test these subskills.
• Develop specifications, item writer guidelines and a collaborative test development process that are shared across languages, in order to produce comparable language tests.

Common European Framework model of language use/learning
“…the actions performed by persons who as individuals and as social agents develop a range of competences, both general and in particular communicative language competences. They draw on the competences at their disposal in various contexts under various conditions and constraints to engage in language activities involving language processes to produce and/or receive texts in relation to themes in specific domains, activating those strategies which seem most appropriate for carrying out the tasks to be accomplished. The monitoring of these actions by the participants leads to the reinforcement or modification of their competences.” (Council of Europe 2001:9, emphasis in original).

CEFR’s model of language use and learning
[Diagram: the language learner/user (knowledge, processes, strategies; monitoring, assessment) engages in language activity through a task on a topic (situation, theme…) within a domain of use]

An interactional view
• Test tasks reflect TLU (target language use) tasks.
• Learner’s engagement with tasks has interactional authenticity.
• Test performance enables inference to performance in TLU.
[Diagram: the CEFR model above, with the test shown as a set of tasks sampled from the tasks of the domain of use (TLU)]
A model for reading (after Weir 2005)
[Diagram: three linked columns, mapped onto the CEFR model’s strategies, processes and knowledge of the language learner/user]

Metacognitive mechanisms / strategies
• Goal setter: selecting appropriate type of reading
  - Careful reading. Local: understand sentence. Global: comprehend main idea(s), comprehend overall text, comprehend overall texts
  - Expeditious reading. Local: scan for specifics. Global: skim for gist, search for main ideas and important detail
• Monitor: goal checking
• Remediation where necessary

Central processing core (processes, from visual input upwards)
• Word recognition
• Lexical access
• Parsing
• Establishing propositional meaning at clause and sentence levels
• Inferencing
• Building a mental model: integrating new information, enriching the proposition
• Creating a text level structure: construct an organised representation of the text [or texts]

Knowledge
• Lexicon, form: orthography, phonology, morphology
• Lexicon, lemma: meaning, word class
• Syntactic knowledge
• General knowledge of the world; topic knowledge
• Meaning representation of text(s) so far
• Text structure knowledge: genre, rhetorical tasks

Domains of language use
               A1     A2     B1     B2
personal       60%    50%    40%    25%
public         30%    40%    40%    50%
educational    10%    10%    20%    20%
professional    0%     0%     0%     5%

Features of approach
• Implementation of construct: subskills mapped to specific task types
• Reading and Listening: objectively marked; Writing: subjectively marked
• Four task development stages: Pilot (2008), Pretesting (2009), Field Trial (2010), Main Study (2011)
• Task adaptation across languages
• Cross-language vetting

Reading – an A1 task
You will read a notice about a cat. For the next 4 questions, answer A, B or C.

Leo is lost. He’s my little cat. He’s white with black paws. He’s small and very sweet. He has brown eyes. He wears a grey collar. He didn’t come home on Monday and it’s Thursday today. That’s a long time for a little cat! Leo often sits on top of the houses near here between Smith’s baker’s shop and King Street. If you find him in your garden or under your car, please telephone me immediately.
Please note – Leo doesn’t like it when people pick him up, and he doesn’t like milk. Thank you for your help!
Sophie Martin, tel: 798286

Busco a mi gato Leo. Ha desaparecido. Es blanco con las patas negras. Es pequeño, tiene 7 meses y es muy bonito. Tiene los ojos marrones. Lleva un collar gris. Le gusta sentarse en los tejados de las casas que están entre la panadería García y la calle de la Victoria. No veo a Leo desde el lunes y hoy es jueves. Es mucho tiempo para un gato tan pequeño. Leo no bebe leche y no come pan. Si lo ves cerca de tu casa o debajo de un coche, llámame. Gracias por tu ayuda.
Sofía Alonso, 626 537 548

Reading – an A1 task (questions)
1 What colour is Leo?
  A white and grey   B brown and grey   C black and white
2 Sophie saw Leo
  A yesterday.   B a few days ago.   C a week ago.
3 Where does Leo like to go?
  A in gardens   B under cars   C on houses
4 If you find Leo
  A phone Sophie.   B give him some milk.   C tell the baker.

1 ¿De qué color es Leo?
  A Blanco y gris   B Marrón y gris   C Blanco y negro
2 Leo lleva fuera de casa
  A un día.   B varios días.   C una semana.
3 A Leo le gusta sentarse
  A en los jardines.   B debajo de los coches.   C en los tejados.
4 Si ves a Leo debes
  A ir a la panadería.   B darle leche.   C llamar a Sofía.

EN - Holiday photo
You are on holiday. Send an email to an English friend with this photo of your holiday. Tell your friend about:
• the hotel
• the weather
• what the people are doing
Write 20–30 words.

FR - Photo de vacances
Tu es en vacances. Tu envoies un email à un ami avec cette photo de tes vacances. Tu utilises la photo pour parler de :
• l’hôtel
• le temps
• les activités
Tu écris 20–30 mots.

ES - Foto de vacaciones
Estás de vacaciones. Envía un e-mail a un amigo español con esta foto de tus vacaciones. Escribe sobre:
• el hotel
• el tiempo
• qué hace la gente
Escribe 20–30 palabras.

DE - Urlaubsfoto
Du hast Ferien. Schreib deiner deutschen Freundin eine E-Mail mit diesem Urlaubsfoto. Schreib deiner Freundin über:
• das Hotel
• das Wetter
• was die Leute machen
Schreib 20–30 Wörter.

IT – A1 level not tested

Marking of Writing
• Responsibility of countries
• Central trickle-down training sessions held for national coordinators
• A proportion of multiple marking in each country: check on in-country rater agreement
• But (all) multiple-marked scripts also centrally marked: additional check on leniency/severity
[Diagram: Countries A, B and C each carry out single marking and multiple marking; the multiple-marked scripts also go to central markers for central marking]

Marking criteria
A. Communication
• how many of the content points are dealt with (clearly)
• how well the points are expanded
• style – register
B. Language
• coherence
• vocabulary
• cohesion
• accuracy
[Diagram: scripts are judged against exemplar scripts. One scale runs from 1 (lower) to 3 (higher); a second scale, with a lower exemplar and a higher exemplar, runs from 1 (lower) to 5 (higher)]

Item response theory and item-banking
[Diagram: a single measurement scale (ticks from 30 to 90) marked A1, A2 and B1, with Test 1 placed at B1, Test 2 at A2 and Test 3 at A1]
• Standards consistently applied
• Learners located on scale
• Tests at appropriate level
• Item bank links all levels

Targeted language testing
[Diagram: a routing test directs students to tests targeted at levels from A1 to B2]

Test design
Reading tasks and their English task codes at each test level. The slide gives no unit for the time row.

Level 1
task       English   time
A1-R1-a    ER111     7.5
A1-R1-b    ER112     7.5
A1-R2-a    ER211     7.5
A1-R2-b    ER212     7.5
A1-R3-a    ER311     7.5
A1-R3-b    ER312     7.5
A2-R2-a    ER221     7.5
A2-R2-b    ER223     7.5
A2-R3-a    ER321     7.5
A2-R3-b    ER323     7.5
A2-R4-a    ER422     7.5
A2-R4-b    ER423     7.5
A2-R5-a    ER522     7.5
A2-R5-b    ER523     7.5

Level 2
task       English   time
A2-R2-a    ER221     7.5
A2-R2-b    ER223     7.5
A2-R3-a    ER321     7.5
A2-R3-b    ER323     7.5
A2-R4-a    ER422     7.5
A2-R4-b    ER423     7.5
A2-R5-a    ER522     7.5
A2-R5-b    ER523     7.5
B1-R5-a    ER532     7.5
B1-R5-b    ER533     7.5
B1-R6-a    ER631     7.5
B1-R6-b    ER633     7.5
B1-R7-a    ER731     7.5
B1-R7-b    ER733     7.5

Level 3
task       English   time
B1-R5-a    ER532     7.5
B1-R5-b    ER533     7.5
B1-R6-a    ER631     7.5
B1-R6-b    ER633     7.5
B1-R7-a    ER731     7.5
B1-R7-b    ER733     7.5
B2-R6-a    ER642     15
B2-R6-b    ER643     15
B2-R7-a    ER741     15
B2-R7-b    ER742     15
B2-R8-a    ER841     15
B2-R8-b    ER843     15

[Booklet allocation matrix: tasks (rows) against booklets b1 to b36 (columns) across the three levels. The cell values are not recoverable from the extraction; the value 30 recurs, apparently as the time total for each booklet.]

Standard setting to the CEFR
• Standard reference: the CoE Manual for relating language exams to the CEFR; http://www.coe.int/t/dg4/linguistic/manuel1_en.asp
• Jones, N (2009) A comparative approach to constructing a multilingual proficiency framework: constraining the role of standard setting. http://www.coe.int/t/dg4/linguistic/Proceedings_CITO_EN.pdf
• See too the CoE Manual for language test development and examining (ALTE): http://www.coe.int/t/dg4/linguistic/ManualtLangageTestAlte2011_EN.pdf

Standard setting to the CEFR
My conclusions:
• Build on what you already know;
• Performance skills are a more practical target for standard setting judgment than indirectly observable, objectively marked skills;
• Comparative judgments are easier than absolute judgments, and therefore ranking may offer more than rating;
• In a multilingual framework it is essential to minimize the role of subjective judgment.

Cross-language alignment
• In ESLC a study was possible for Writing.
• A ranking study, cf. Sèvres (2008) for Speaking

Ranking approach to cross-language comparison (Speaking, CIEP 2008)
[Figure: “Standard Set for Rankings”. Scatter plot of levels from ranking (vertical axis, -10 to 10) against levels from rating (horizontal axis, -10 to 10) for German, English, Spanish, French and Italian, with CEFR bands A1 to C1 marked on both axes]

ESLC Writing alignment: five languages on a single scale
[Figure: for each of English, French, German, Italian and Spanish, the CEFR bands (Pre-A1, A1, A2, B1, B2) placed on one common scale, with the mean and median of the students shown for each language]

First target language (Skills averaged)
[Figure: stacked bar chart of the percentage of students at each CEFR level (Pre-A1, A1, A2, B1, B2), skills averaged, by educational system and tested language: UK-ENG (FR), FR (EN), BE nl (FR), PL (EN), ES (EN), PT (EN), BE fr (EN), BG (EN), BE de (FR), EL (EN), HR (EN), SI (EN), EE (EN), NL (EN), MT (EN), SE (EN)]

Second target language (Skills averaged)
[Figure: stacked bar chart of the percentage of students at each CEFR level (Pre-A1, A1, A2, B1, B2), skills averaged, by educational system and tested language: SE (ES), PL (DE), UK-ENG (DE), EL (FR), PT (FR), FR (ES), HR (DE), BG (DE), SI (DE), EE (DE), BE fr (DE), ES (FR), MT (IT), NL (DE), BE nl (EN), BE de (EN)]

Asset Languages link between GCSE and CEFR
NQF level     General qualifications   Asset Languages   CEFR level
Level 7-8                              Mastery           C2
Levels 4-6                             Proficiency       C1
Level 3       AS/A/AEA                 Advanced          B2
Level 2       Higher GCSE              Intermediate      B1
Level 1       Foundation GCSE          Preliminary       A2
Entry 1-3                              Breakthrough      A1

NQF level       CEFR level   Cambridge ESOL exam
Level 3         C2           CPE
Level 2         C1           CAE
Level 1         B2           FCE
Entry 3 Level   B1           PET
Entry 2 Level   A2           KET
Entry 1 Level   A1

GCSE grades and CEFR levels

http://www.surveylang.org
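
Supplementary sketch: locating learners on one scale
The item response theory and item-banking slide says that learners are located on a single scale and that tests at different levels are linked through an item bank. The sketch below illustrates that idea only. It assumes a Rasch (one-parameter logistic) model, which the slides do not name; the item difficulties, the response patterns and the function names p_correct and estimate_ability are all invented for illustration and are not ESLC data or ESLC code.

```python
import math


def p_correct(theta: float, b: float) -> float:
    """Rasch model (an assumption here): probability that a learner of
    ability theta answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def estimate_ability(responses, difficulties, iterations=25):
    """Maximum-likelihood ability estimate from 0/1 responses to items whose
    difficulties are already known (that is, taken from a calibrated bank).

    Uses Newton-Raphson on the log-likelihood. Under the Rasch model only
    the raw score matters, not which particular items were answered correctly.
    """
    score = sum(responses)
    if score == 0 or score == len(responses):
        raise ValueError("ML estimate is undefined for zero or perfect scores")
    theta = 0.0
    for _ in range(iterations):
        probs = [p_correct(theta, b) for b in difficulties]
        gradient = score - sum(probs)                   # observed minus expected score
        information = sum(p * (1.0 - p) for p in probs)  # test information at theta
        theta += gradient / information
    return theta


# Hypothetical banked difficulties on a common logit scale.
easy_test = [-2.0, -1.5, -1.0, -0.5]   # e.g. a lower-level booklet
hard_test = [0.5, 1.0, 1.5, 2.0]       # e.g. a higher-level booklet

# Learner A scores 3 out of 4 on the easy test,
# learner B scores 2 out of 4 on the hard test.
theta_a = estimate_ability([1, 1, 1, 0], easy_test)
theta_b = estimate_ability([1, 1, 0, 0], hard_test)

print(f"learner A: {theta_a:.2f}  learner B: {theta_b:.2f}")
```

In this made-up case learner B receives the higher ability estimate despite the lower raw score, because the difficulty of the items taken is part of the calculation. That is the sense in which a shared scale lets results from tests targeted at different levels be compared.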