Good Assessment by Design
International GCSE and GCE Comparative Analyses
Dr. Rose Clesham

Put into context
We have looked at education systems that use high-stakes external assessments for 16- and 18-year-olds, in terms of:
• Contextual national characteristics and factors underpinning these educational systems
• How their curriculum and content standards are defined and operate
• How their summative national assessments are designed and their focus

[Figure: World economy. Chart values: 85%, 81%, 83%, 77%]

The Assessment Analysis
To provide comparisons between UK-based GCSEs and GCEs, PISA tests, and selected high-performing PISA jurisdictions in terms of:
• Their assessment structures and demands in summative assessments
• The use of question types
• Subject-specific areas of focus (e.g. use of context in maths, source-based focus, particular skills)
• Providing examples of good assessment practice
In order to increase our global understanding, and inform the design of WCQ qualifications and assessments.

Cross Business Research Approach
100+ people involved
• Internal/external subject team workshops
• Workshop subject groups agreed on and standardised subject definitions of mapping categories
• 2-4 raters at item level in each subject
• Results aggregated rather than a true score judgement

The Methodology
Content Standards
• Based on Uniform Content Standards (Porter, DfE) that can be applied internationally, e.g. for Physics
Cognitive Operations
• Based on the focus and type of question.
These are similar to, but give more detail than, Assessment Objectives, and can be used to compare national and international assessments (using a variation of Porter)
Cognitive Demand
• Based on considerations of how content standards, assessment questions and mark schemes are written and applied, mainly to do with complexity and linkage (using a new model that incorporates Bloom, Webb, Pollitt, Biggs and Collis)

Outcomes
For example, Content Standards in Mathematics at 16
[Figures: series of charts on content standards in Mathematics at 16]

Cognitive operations in Science (at 16)
[Figure: Cognitive operations, 0% to 100%, by assessment. Categories: Apply concepts / make connections; Analyse information; Demonstrate understanding; Perform procedures; Memorize. Assessments shown include PISA, SQA, Hong Kong and England.]

Proportion of Lower to Higher Order Cognitive Operations in Science
70% : 30%

[Figure: High order science cognitive operations: % differences from the mean, by country]

Cognitive demand in Science
52%, 44%, 4%
[Figure: % difference from mean of low demand items in science, by country]
[Figure: % difference from mean of medium demand items in science, by country]
[Figure: % difference from mean of high demand items in science, by country]

[Figure: Dendrogram showing family relationships between Maths assessments]

[Figure: Porter’s Alignment Indices in Maths]

Why does all of this matter?
[Figure: Economy-wide measures of routine and non-routine task input (US). Vertical axis: mean task input as percentiles of the 1960 task distribution (40 to 65). Horizontal axis: 1960 to 2002. Series: Routine manual; Nonroutine manual; Routine cognitive; Nonroutine analytic; Nonroutine interactive.]

The dilemma of schools:
The skills that are easiest to teach and test are also the ones that are easiest to digitise, automate and outsource (Levy and Murnane)

Education and assessments need to prepare students:
• To deal with more rapid change than ever before
• For jobs that have not yet been created
• Using techniques that have not yet been invented
• To solve problems that we don’t yet know will arise

Overall conclusions of our research:
• Appropriate content representation is a key assessment issue.
• In general, lower order cognitive operations dominate assessments.
• Higher order cognitive operations are not well represented in assessments, for example problem solving skills in maths, and applying concepts and making connections in science.
• Cognitive operations and cognitive demand often work independently of each other and therefore need to be actively designed into assessments.
• A range of different question types was evident. The analysis showed that although more closed question types can assess a range of cognitive operations and demands, they are less effective than more open responses.

Two fundamental points …
• No country’s assessment system was perfect, either in design or in outcome; however, some are clearly more designed and comprehensive than others.
• Good assessment doesn’t happen by chance; it has to be carefully designed and developed to assess the knowledge and skills we want and value.
Implications
This work has enabled us to:
• Evaluate assessments using a set of common criteria
• Make empirical and standardised judgements
• Contrast and compare

As our qualifications now move into developmental phases, we need to ensure that they:
• Reflect intended curriculum aims and content standards
• Assess the knowledge and skills that our learners will need
• Are informed by international best practice
• Are comparable with best practice
• Will be more reliable, valid and discriminating in terms of design, focus and content

Thank you
rose.clesham@pearson.com
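
Supplementary note: alignment indices
The outcomes refer to Porter's alignment indices between Maths assessments. Porter's index compares two assessments by the proportion of marks each gives to every content-by-cognitive-operation cell, and equals 1 minus half the sum of the absolute differences between those proportions (1 means identical emphasis, 0 means no overlap). The Python sketch below is illustrative only and is not the study's actual procedure. The function names, cell labels and mark values are hypothetical, and averaging each rater's proportions is just one assumed way of aggregating the 2-4 rater codings.

```python
from collections import Counter


def cell_proportions(rater_codings):
    """Average, across raters, the share of marks each rater put in each cell.

    rater_codings: one list per rater of (cell, marks) pairs, where a cell is
    a hypothetical (content_area, cognitive_operation) label.
    Assumes every rater coded at least one mark.
    """
    n_raters = len(rater_codings)
    proportions = Counter()
    for codings in rater_codings:
        rater_total = sum(marks for _, marks in codings)
        for cell, marks in codings:
            # Each rater contributes an equal share to the aggregate.
            proportions[cell] += marks / rater_total / n_raters
    return dict(proportions)


def alignment_index(p, q):
    """Porter's alignment index: 1 - sum(|p_i - q_i|) / 2 over all cells."""
    cells = set(p) | set(q)
    return 1 - sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in cells) / 2


if __name__ == "__main__":
    # Hypothetical codings for two invented papers (not data from the study).
    paper_a = [
        [(("algebra", "perform procedures"), 6), (("geometry", "solve problems"), 4)],
        [(("algebra", "perform procedures"), 4), (("geometry", "solve problems"), 6)],
    ]
    paper_b = [
        [(("algebra", "perform procedures"), 10)],
    ]
    index = alignment_index(cell_proportions(paper_a), cell_proportions(paper_b))
    print(f"Alignment index: {index:.2f}")
```

Averaging proportions rather than raw marks keeps every rater's weight equal even when raters disagree on how marks split across cells. A study could equally well aggregate at item level first, which this sketch does not attempt.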