Secondary Assessment and the NCEAs 2011

Changes to Setting and Marking – Why?
• Continuous improvement
• Research based
• Monitored by external experts

Basis of the changes
• Test dimensionality
• Item characteristics
• Candidate characteristics

Item response curves

Grade thresholds

DIF (differential item functioning)
• Provides information about whether candidates perform differently depending on their demographic group.
• Example 1: males/females on a particular question
• Example 2: Asian/non-Asian on a particular question

How do we use the information?

Assessment Principles
• Discuss and identify the key principles for all assessment with those around you
• Write down six adjectives that describe good assessment and could be used as part of the assessment principles

Assessment Principles
Assessment must be:
• Appropriate
• Fair and inclusive.
Evidence gathered must be:
• Valid
• Sufficient.
Assessor judgements must be:
• Consistent
• Transparent.

We ask examiners to include (where relevant to the task)…
• Content accessible to all
• Content that is interesting and relevant
• Tasks that encourage response from a diverse perspective
• Gender-neutral titles and names
• Gender balance.

We ask them to avoid (unless pertinent to the standard)…
• Controversial issues, people or events
• Topics which may upset
• Stereotypical pictures, cartoons and resources
• Tokenism.
• Achievement standard (criteria and explanatory notes)
• Assessment schedule (evidence/judgement statements)
• Assessment task (the means of collecting valid evidence)

Analysis of the standard
• Criteria – determine the difference between each level of performance
• Explanatory notes – identify any restrictions, explanations or specifics
• Curriculum – may provide further clarification or exemplification

Supporting Documents
• Assessment Specifications
• Exemplars
• Assessment reports
• Curriculum

Developing questions
• Verb + Subject + Condition
• Verb must signal clearly what is required and be consistent with criteria and explanatory notes in the standard
• Subject must be appropriate to the curriculum, criteria and explanatory notes
• Condition must be appropriate to the curriculum, criteria and explanatory notes
• Students must have sufficient opportunity to achieve at every level.

The Assessment Schedule…
• Designed to achieve consistency of assessor judgements
• Developed at the same time as the assessment activity
• Provides examples of expected student evidence
• Specifies minimal requirements for achievement at each level (quality and quantity).

Quality Check
Check for:
• Excess or areas of deficiency
• Validity – against the standard – use standard itself
• Level of difficulty
• Time for students to complete task
• Clarity of instructions
• Consistency
• Error-free.

Traps and pitfalls - check…
• All resources are appropriate, clear and accessible.
• Labels and tables – instructions match.
• Diagrams – do they work?
• General accuracy – languages: accents and characters / history: dates and periods.
• Pictures / images – relevant, good quality.
• Genre/context of task and standard match.
• Task numbering is consistent.
• Key words bolded.

Research has affected question structure
• All questions now provide evidence for all levels of achievement
• Most questions now have scaffolding

What is scaffolding?
• Providing the support (footholds) that enable more students to demonstrate evidence of higher performance levels.
• Questions that are scaffolded:
– elicit better responses from students
– have more students attempting the more challenging parts of questions.
• Some form of scaffolding should be provided for all questions.

Scaffolding in Chemistry
Consider the development of the following question:
• Version 1: Discuss the different states of fluorine and bromine at room temperature.
• Version 2: Fluorine, F2, is a gas and bromine, Br2, is a liquid at room temperature. Discuss the different states of these elements at room temperature.
• Version 3: Fluorine, F2, is a gas and bromine, Br2, is a liquid at room temperature. Discuss the different states of these elements at room temperature. You should include in your answer information on particle separation, energy, particle motion and the attractive forces between the particles.

And finally
• Version 4: Fluorine, F2, is a gas and bromine, Br2, is a liquid at room temperature. Discuss the reasons for the different states of these elements at room temperature. You should include in your answer information on:
– particle separation
– energy
– particle motion
– the attractive forces between the particles.

How research supports these changes

Biology Level 3 2006 Question Three
Many metabolic pathways are controlled by multiple genes. An example is the metabolic pathway that produces normal skin pigmentation. Albinism, which is the total lack of pigment, can be caused by a mutation in any one of the genes controlling this pathway.
(e) Discuss the fact that it is possible for two albino parents to have a child with normal skin pigmentation.
Biology Level 3 2007 Question 3 (c) on the same concept

The same concept again in 2009

The 2007 question

The 2009 question

Monitoring marking

Profiles of expected performance
PEPs are:
• indicators of ranges of expected results for external standards
• tolerances developed on the basis of the professional judgement of:
– National Assessment Facilitators (NAFs) in the Secondary Examinations team
– NZQA statistical/research staff
– subject experts
• tools to ensure results that are consistent with the achievement standards.

Why are they used?
PEPs were introduced as a result of the 2005 State Services Commission report into NCEA and New Zealand Scholarship. The report addressed concerns about the excessive variability shown in consecutive years of NCEA results in some standards. Such variability is not acceptable in national examinations where cohorts are large. The NZQA Board adopted the recommendation that a range of expected results should be established to give guidance to examiners and markers.

Do they ever change?
PEPs are:
• set afresh each year
• affected by changes in the registered standards
• affected by changes in cohort performance
• guidelines which can be, and are, broken with justification.

Grade Score Marking (GSM)

Purpose
To improve discrimination between grade levels in NCEA external examinations.

Research
1. Item response theory (IRT) 2006-2009
2. Score based grading 2007-2009
3. Live Pilot 2010

Outcomes: the NZQA perspectives
• National Assessment Facilitators had increased confidence in signing off marking schedules
• Tracking of results during marking was straightforward
• All data procedures post marking were business-as-usual.

Outcomes: the marker perspectives
• Cut score setting enhanced Panel Leaders’ (PLs’) confidence that grade boundaries reflected standards
• Markers liked the refinement of being able to indicate high/low performance within each grade
• PLs thought that discrimination was better at boundaries.
Projected benefits
• Greater accuracy in grade determination
• Fairer to students
• Reduction in year-by-year variability, leading to diminished need for PEPs in the longer term
• More transparency
• Closer alignment of marking to the standard

Grade Score Marking – what is it?
• Grades are assigned according to the criteria in the standard
• Each grade is divided into 2; there is also a zero Ø
• Each question receives a grade and score
• The scores are totalled
• A sample of papers on each score is judged by the panel leader and check marker to set the boundaries for each grade
• The boundaries are called cut scores and become the sufficiency statements for the grades.

Scoring each question with GSM
N: NØ, N1, N2
A: A3, A4
M: M5, M6
E: E7, E8

GSM Assessment schedules
• Schedules are based on the criteria in the standard for A, M and E
• N is below A
• There is an NØ for ‘no response; no relevant evidence’
• Each grade, NAME, is divided into two.

Total score
The scores for the questions are aggregated to give a total score for the paper. The total score is the sum of the scores awarded for the questions.

What is a cut score?
A cut score sets the grade boundaries by establishing the range of total scores for each grade. Senior markers use the actual student exam papers to set the cut scores that define each NAME grade.

Results
• Markers write only the total score on the front of the paper
• Markers enter only the total score online
• The cut score is entered by the NAF after consultation with the senior markers and other NZQA staff
• The correct result grade will be generated automatically.

Results to candidates
• The record of achievement will show only the grade N, A, M, E.
• The judgement statement on the web will show the cut scores for the various grades.
• When candidates receive their booklets back they will also receive an information sheet telling them where to go on the web to check their grades.

Questions
• What do students need to know to sit the exam?
• How are teaching programmes affected?
• Is this a move away from standards-based assessment?
• What is the difference between this and using marks?
• Can GSM be used for internal marking (either for internal assessment or mock exams)?

Jennifer Mackrell and Christine Pallin
Team Leaders - National Assessment Facilitators
Secondary Examinations
NZQA
PO Box 160
Wellington 6140

Contact emails:
Jennifer.mackrell@nzqa.govt.nz
Christine.pallin@nzqa.govt.nz
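
Worked illustration: grade score marking
The grade score marking steps described earlier (score each question from 0 to 8, total the scores, then apply cut scores) can be sketched in a few lines of Python. This is a minimal illustration only, not NZQA's marking system: the function names and the cut-score values are hypothetical, since real cut scores are set each year by the senior markers from actual papers.

```python
from bisect import bisect_right

# Label for each possible question score, as listed in the slides.
# "N0" stands for the slides' NØ (no response; no relevant evidence).
QUESTION_LABELS = ["N0", "N1", "N2", "A3", "A4", "M5", "M6", "E7", "E8"]

# HYPOTHETICAL cut scores for a three-question paper (maximum total 24).
# Each value is the lowest total score that earns the grade.
HYPOTHETICAL_CUTS = {"A": 7, "M": 13, "E": 19}


def question_label(score):
    """Return the grade-and-score label (e.g. 'M5') for one question."""
    if not 0 <= score <= 8:
        raise ValueError("question scores run from 0 to 8")
    return QUESTION_LABELS[score]


def total_score(question_scores):
    """Add up the per-question scores to give the paper's total score."""
    for score in question_scores:
        question_label(score)  # validates the 0-8 range
    return sum(question_scores)


def grade_for_total(total, cut_scores):
    """Map a total score to N, A, M or E using the given cut scores."""
    lower_bounds = [cut_scores["A"], cut_scores["M"], cut_scores["E"]]
    if lower_bounds != sorted(lower_bounds):
        raise ValueError("cut scores must increase from A to M to E")
    # Count how many grade boundaries the total has reached.
    return "NAME"[bisect_right(lower_bounds, total)]


if __name__ == "__main__":
    scores = [4, 5, 3]  # one score per question
    total = total_score(scores)
    print([question_label(s) for s in scores], total,
          grade_for_total(total, HYPOTHETICAL_CUTS))
```

With these assumed cut scores, a paper scored A4, M5 and A3 totals 12 and falls in the A range, one point short of the M boundary. Only the total is recorded by the marker; the grade follows from whichever cut scores are entered afterwards.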