Using the Common European
Framework of Reference
to Report Language Test Scores
Spiros Papageorgiou
University of Michigan
spapag@umich.edu
Overview
• The Common European Framework of Reference (CEFR)
• The Manual for relating language examinations to the CEFR
• Standard setting
• An example of a CEFR standard setting study in Colombia
The CEFR
• Reference document—not prescriptive
• Basis for the elaboration of language syllabi, curricula,
examinations, and textbooks
• Language objectives: Description of what language learners
have to learn to do in order to use a language for
communication
• Six main levels of proficiency: A1 (lowest), A2, B1, B2, C1, C2
(highest)
The Manual for Relating
Examinations to the CEFR
It aims to “help the providers of examinations to develop,
apply and report transparent, practical procedures in a
cumulative process of continuing improvement in order
to situate their examination(s) in relation to the Common
European Framework” (p. 1).
Stages for Relating Test Content
and Test Scores to the CEFR
• Familiarization
• Specification
• Standardization training and benchmarking
• Standard setting
• Validation
Standard Setting
• The decision making process of classifying examination
results in a number of successive levels
• Performance Level Descriptions (PLD): statements
describing what learners can do with language
(e.g., CEFR descriptors)
• Performance Level Labels (PLL): labels of PLD
(e.g., A1–C2)
• Cut scores: the boundary between two successive
levels
• Participation of expert judges (panelists)
PLL  PLD
C2   Can write clear, smoothly flowing, complex texts in an appropriate and effective style and a logical structure which helps the reader to find significant points.
C1   Can write clear, well-structured texts of complex subjects, underlining the relevant salient issues, expanding and supporting points of view at some length with subsidiary points, reasons and relevant examples, and rounding off with an appropriate conclusion.
B2   Can write clear, detailed texts on a variety of subjects related to his field of interest, synthesising and evaluating information and arguments from a number of sources.
B1   Can write straightforward connected texts on a range of familiar subjects within his field of interest, by linking a series of shorter discrete elements into a linear sequence.
A2   Can write a series of simple phrases and sentences linked with simple connectors like “and”, “but” and “because”.
A1   Can write simple isolated phrases and sentences.
An Example of a Standard Setting Study in
Colombia
• Reporting scores for the Michigan English Test on
the CEFR levels
• 13 participants from the 9 Binational centers in
Colombia
• Familiarization with the CEFR
• Training with item difficulty (Pilot Form B)
• Angoff standard setting method
• First round of judgments
• Pilot Form A statistical information
• Second round of judgments
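The slides do not say which variant of the Angoff method was used. The sketch below assumes the common form in which each judge estimates, for every item, the probability that a borderline test taker at a given level answers correctly. A judge's cut score is the sum of those estimates, and the panel cut score is the mean across judges. The function name and the ratings are hypothetical, not data from the study.

```python
from statistics import mean

def angoff_cut_score(ratings):
    """ratings[j][i] is judge j's estimated probability that a borderline
    test taker answers item i correctly (hypothetical input layout).
    Returns (panel cut score, list of per-judge cut scores)."""
    per_judge = [sum(judge_ratings) for judge_ratings in ratings]
    return mean(per_judge), per_judge

# Hypothetical ratings for one cut score: 3 judges x 5 items.
ratings = [
    [0.6, 0.7, 0.4, 0.5, 0.8],
    [0.5, 0.8, 0.5, 0.4, 0.7],
    [0.7, 0.6, 0.3, 0.6, 0.9],
]
panel_cut, per_judge = angoff_cut_score(ratings)
print(per_judge, round(panel_cut, 2))
```

With several cut scores (for example B1, B2, and C1), the same calculation would be repeated with a separate set of ratings for each borderline level.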
Standard Setting Validity Evidence
• Procedural validity: examining whether the procedures
followed were practical and implemented properly; that
feedback given to the judges was effective; and that
documentation was sufficiently compiled.
• Internal validity: addressing issues of accuracy and
consistency of the standard setting results.
• External validation: collecting evidence from
independent sources that support the outcome of the
standard setting meeting.
The Familiarization Task
• A1 = 1, A2 = 2, B1 = 3, B2 = 4, C1 = 5, C2 = 6
Procedural Validity:
Internalization of the CEFR
Correlation of descriptor level judgments with
the CEFR during the Familiarization stage
Descriptors  J1   J2   J3   J4   J5   J6   J7   J8   J9   J10  J11  J12  J13
Listening    .85  .89  .80  .81  .71  .77  .79  .88  .80  .70  .91  .84  .79
Reading      .92  .92  .85  .86  .69  .86  .84  .84  .82  .62  .90  .86  .77
Vocabulary   .89  .93  .91  .96  .70  .76  .73  .92  .90  .84  .97  .90  .86
Grammar      .90  .94  .97  .87  .91  .95  .89  .95  .84  .78  .93  .85  .89
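A correlation of this kind can be computed once the levels are coded numerically (A1 = 1 through C2 = 6). The slides do not say which coefficient was used. The sketch below assumes a Spearman rank correlation, and the judge's answers are hypothetical.

```python
LEVELS = {"A1": 1, "A2": 2, "B1": 3, "B2": 4, "C1": 5, "C2": 6}

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result

def pearson(x, y):
    """Pearson correlation; assumes neither list is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: the CEFR level of six descriptors and one judge's guesses.
cefr_levels = ["A1", "A2", "B1", "B2", "C1", "C2"]
judge_levels = ["A1", "B1", "A2", "B2", "C1", "C2"]
rho = spearman([LEVELS[l] for l in judge_levels],
               [LEVELS[l] for l in cefr_levels])
print(round(rho, 2))
```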
Internal Validity: Method Consistency
Standard error of judgments should be ≤ ½
of the standard error of the test
(Section I 1.71 and Section II 1.74)

Cut score      SEj incl. extreme ratings  SEj excl. extreme ratings
Section I B1   1.97                       1.57
Section I B2   1.34                       1.34
Section I C1   1.69                       1.69
Section II B1  2.00                       1.71
Section II B2  2.30                       1.62
Section II C1  2.57                       1.71
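The slides do not give the formula for SEj. A common definition is the standard deviation of the judges' individual cut scores divided by the square root of the number of judges, and the sketch below assumes it, using the sample (n - 1) standard deviation. The function names and the judge cut scores are hypothetical.

```python
from math import sqrt
from statistics import stdev

def se_judgments(judge_cut_scores):
    """Standard error of judgments: sample SD of the judges' cut scores
    divided by the square root of the number of judges (assumed formula)."""
    return stdev(judge_cut_scores) / sqrt(len(judge_cut_scores))

def meets_criterion(se_j, se_test):
    """True if SEj is no larger than half the standard error of the test."""
    return se_j <= 0.5 * se_test

# Hypothetical cut scores from seven judges for one level.
judge_cut_scores = [30, 32, 28, 35, 31, 29, 33]
se_j = se_judgments(judge_cut_scores)
print(round(se_j, 2))
```

Removing extreme ratings, as in the third column of the table, would amount to calling `se_judgments` on the list with the outlying judges dropped.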
Internal Validity: Decision Consistency
Calculating agreement coefficient rho
(p0; max .98) and kappa (k; max .71)
Cut score      p0   k
Section I B1   .90  .68
Section I B2   .88  .70
Section I C1   .97  .61
Section II B1  .95  .64
Section II B2  .86  .71
Section II C1  .94  .65
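The slides do not state how p0 and kappa were estimated (estimates from a single administration are common in practice). As an illustration of what the two coefficients mean, the sketch below assumes two classifications of the same test takers are available: p0 is the proportion classified the same way both times, and kappa corrects p0 for the agreement expected by chance. The function name and data are hypothetical.

```python
from collections import Counter

def agreement_and_kappa(first, second):
    """Return (p0, kappa) for two classifications of the same test takers."""
    n = len(first)
    p0 = sum(a == b for a, b in zip(first, second)) / n
    counts_first, counts_second = Counter(first), Counter(second)
    # Chance agreement: product of the marginal proportions, summed over categories.
    p_chance = sum(counts_first[c] * counts_second[c] for c in counts_first) / n ** 2
    kappa = (p0 - p_chance) / (1 - p_chance)
    return p0, kappa

# Hypothetical decisions at one cut score: 1 = at or above the cut, 0 = below.
first = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
second = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
p0, kappa = agreement_and_kappa(first, second)
print(round(p0, 2), round(kappa, 2))
```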
Internal Validity: Intra-judge Consistency
Correlation of mean of judgments
with empirical item difficulty
MET section/round of judgments  Correlation
Section I, Round 1              .42
Section I, Round 2              .83
Section II, Round 1             .73
Section II, Round 2             .92
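A check of this kind can be sketched as follows, assuming a Pearson correlation between the mean judgment per item (averaged across judges) and the proportion of test takers who answered the item correctly. The slides do not specify the coefficient or the difficulty index, so both are assumptions, and all numbers are hypothetical.

```python
from statistics import mean

def mean_judgments(ratings):
    """ratings[j][i] is judge j's rating of item i; returns the mean per item."""
    return [mean(item_ratings) for item_ratings in zip(*ratings)]

def pearson(x, y):
    """Pearson correlation; assumes neither list is constant."""
    mean_x, mean_y = mean(x), mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

# Hypothetical data: 3 judges x 4 items, and the empirical proportion correct.
ratings = [
    [0.8, 0.6, 0.4, 0.3],
    [0.7, 0.6, 0.5, 0.2],
    [0.9, 0.5, 0.4, 0.3],
]
proportion_correct = [0.85, 0.60, 0.45, 0.30]
r = pearson(mean_judgments(ratings), proportion_correct)
print(round(r, 2))
```

A higher correlation in the second round, as in the table, would be consistent with judges adjusting their estimates after seeing the item statistics.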
Internal Validity: Inter-judge Consistency
Indices of agreement and consistency
Index  Section I  Section II
ICC    .94        .94
W      .80        .76
Alpha  .94        .94
External Validity: Reasonableness
of the Cut Scores
Classification of Pilot Form A test takers
(N = 660) into CEFR levels
Level  Section I     Section II
A2     105 (15.91%)  55 (8.33%)
B1     408 (61.81%)  323 (48.94%)
B2     95 (14.39%)   214 (32.43%)
C1     52 (7.88%)    68 (10.30%)
External Validity: Comparison of
Level Classifications
Exact and adjacent level agreement of classifications (N =
302) provided by a test center and the cut score
Agreement       Section I     Section II
Exact level     122 (40.40%)  92 (30.46%)
Within 1 level  290 (96.03%)  264 (87.42%)
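Exact and adjacent agreement can be counted as in the sketch below, which assumes each test taker has one level from the test center and one from the cut scores. "Within 1 level" is taken to include exact matches, which is consistent with the counts in the table. The function name and data are hypothetical.

```python
LEVELS = {"A1": 1, "A2": 2, "B1": 3, "B2": 4, "C1": 5, "C2": 6}

def level_agreement(center_levels, cut_score_levels):
    """Return (exact count, within-one-level count, number of test takers)."""
    diffs = [abs(LEVELS[a] - LEVELS[b])
             for a, b in zip(center_levels, cut_score_levels)]
    exact = sum(d == 0 for d in diffs)
    within_one = sum(d <= 1 for d in diffs)  # includes exact matches
    return exact, within_one, len(diffs)

# Hypothetical classifications for five test takers.
center = ["A2", "B1", "B1", "B2", "C1"]
by_cut_score = ["A2", "B1", "B2", "B2", "A2"]
exact, within_one, n = level_agreement(center, by_cut_score)
print(f"exact {exact}/{n}, within one level {within_one}/{n}")
```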
Final Stage Before Reporting
Test Scores: Equating
• A statistical procedure used to allow for comparisons of
scores obtained on different test forms
• Adjustment of differences in test form difficulty
(but not content)
• Scaled scores, not percentages
• Examinee position on the language ability scale
• Scores are comparable across different administrations
• Linked to the CEFR cut scores
Reported Scores
Both section scores should be taken into account when
interpreting the test results for use in decision-making
CEFR Level  MET Section I scores  MET Section II scores
C1          64 and above          64 and above
B2          53–63                 53–63
B1          40–52                 40–52
A2          39 or below           39 or below
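The score bands in the table translate directly into a lookup. The sketch below assumes whole-number scaled scores and applies the same bands to either section; the function name is hypothetical.

```python
def cefr_level(scaled_score):
    """Map an MET section scaled score to the band reported in the table."""
    if scaled_score >= 64:
        return "C1"  # 64 and above
    if scaled_score >= 53:
        return "B2"  # 53-63
    if scaled_score >= 40:
        return "B1"  # 40-52
    return "A2"      # 39 or below

# Hypothetical test taker: the two section scores are reported separately.
section_scores = {"Section I": 55, "Section II": 47}
for section, score in section_scores.items():
    print(section, score, cefr_level(score))
```

The levels are kept separate per section because the slide asks that both section scores be taken into account when interpreting results.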
For more information visit
www.lsa.umich.edu/eli/testing