Critical Maths Assessment DRAFT

1. Assessing Critical Maths

It is essential to get the assessment of Critical Maths right in order to ensure that the curriculum aims are met; this is by no means easy. This report considers general principles for the assessment and outlines a number of possible assessment methods. None of the ways of assessing the curriculum is ideal; each has some drawbacks. Individual awarding bodies are invited to draw on their expertise in designing assessments in order to produce a suitable assessment of the curriculum, in line with the principles outlined below. MEI will be glad to engage in dialogue with awarding bodies and to share the outcomes of continuing development of Critical Maths.

2. Essential characteristics of the assessment

The following essential characteristics of assessment have emerged following discussion with awarding bodies. The assessment should be:
• Valid – testing the skills we want students to develop.
• Scalable – can test hundreds in a centre and thousands nationally.
• Not too expensive.
• Not easy to cheat.

Ofqual’s criteria for assessments which are fit for purpose are as follows.

“To be fit for purpose assessments must:
• be valid – they assess what they are intended to assess. For example, the ability to develop and sustain an argument about a historical event cannot be validly assessed by multiple-choice questions, whereas recall of historical dates might be validly assessed in that way.
• be reliable – the outcome of the assessment (the mark or grade) for a student would usually be replicated if the assessment was repeated.
• minimise bias – the assessment must not produce unreasonably adverse outcomes for particular groups of students – for example, assessments should not lead to male students performing less well than female students for reasons unconnected to the knowledge or skills being assessed.
• be comparable – the standard of the assessment (in terms of the subject matter, the complexity of the questions or other assessment tasks, and the level of performance required for students to be awarded a mark or grade), should be comparable whenever the assessments are taken and marked and whichever exam board sets the assessment and awards the qualification.
• be manageable – the time and resources used in preparing for and sitting the assessments are reasonable for both students and centres and are proportionate to the purpose of the qualification.”¹

In order to be included in regulated qualifications, the assessment proposed for the curriculum must meet these requirements.

1 Ofqual GCSE reform consultation June 2013

Page 1 of 23 SD 30/09/2013 Version 5

3. Draft curriculum aims and objectives

3.1 Aims
This curriculum should encourage students to
• Engage in solving realistic problems appropriate to this level.
• Recognise when mathematical and statistical analysis will be helpful.
• Develop skills of representing new situations mathematically and thinking flexibly in problem solving.
• Develop the ability to use their mathematical and statistical knowledge to make logical and reasoned decisions and communicate them clearly.
• Develop the mathematical and statistical knowledge and skills they need to become educated citizens in the context of today’s society.
• Have the confidence to work on a problem where the method of solution is not obvious.

3.2 Objectives
Students should be able to
• Discuss problems, identifying the important features.
• Propose solutions to problems.
• Evaluate strategies for tackling a problem.
• Use quantitative evidence.
• Make reasonable estimates with limited information.
• Communicate their solutions, strategies and reasoning to others.
• Check a solution to see whether it is reasonable and criticise unreasonable solutions.
• Interpret mathematical solutions in terms of the original problem.
• Recognise related problems and apply the knowledge and skills they have learnt to real situations.

4. Categorising levels of understanding and implications for assessment

4.1 Miller’s pyramid of clinical competence
The following model was proposed in the 1990s for assessing medics. Each level of the pyramid is assessed by an appropriate method. Competence at one level also implies competence at all the levels below it. In the problem solving context, the levels could be interpreted as follows. A method of assessment is suggested for each one.

Level: Does
Problem solving interpretation: Is able to use mathematics to solve problems arising in life, study or work.
Possible assessment: This would require future observation of the student – not suitable for end of course assessment.

Level: Shows
Problem solving interpretation: Is able to solve problems suitable for someone studying this curriculum.
Possible assessment: Dependent on the complexity of the problem. Some problems could be assessed by means of portfolio or controlled assessment. Shorter problems and problems related to types which students are likely to have encountered before could be assessed by timed written examination.

Level: Knows how
Problem solving interpretation: Is able to find relevant information to start solving a problem. Is able to make relevant comments about someone else’s solution to a problem.
Possible assessment: Timed examination, possibly with the use of pre-release material.

Level: Knows
Problem solving interpretation: Understands the underlying mathematical concepts.
Possible assessment: Timed examination. The assumed knowledge has already been assessed at GCSE so assessment should concentrate on knowledge acquired during the course.

4.2 The Mathematical Assessment Task Hierarchy
Smith et al² proposed a classification of mathematical tasks into eight categories, collected into three groups (A, B and C), in order to assist the development of appropriate assessments for undergraduates.
This classification (shown below) was based on Bloom’s taxonomy, which consists of Knowledge, Comprehension, Application, Analysis, Synthesis and Evaluation. Darlington³ labels Smith et al’s three groups of competences in the following way.
• Group A: Routine procedures
• Group B: Using existing mathematical knowledge in new ways
• Group C: Application of conceptual knowledge to construct mathematical arguments
She also exemplifies each category with an example from A level Mathematics and an example from undergraduate mathematics. The table below exemplifies the categories with examples from Critical Maths.

2 Smith, G., L. Wood, M. Coupland, B. Stephenson, K. Crawford and G. Ball. 1996. Constructing mathematical examinations to assess a range of knowledge and skills. International Journal of Mathematical Education in Science and Technology
3 Darlington, E. 2013. The use of Bloom’s taxonomy in advanced mathematics questions. Informal Proceedings of the British Society for Research into Learning Mathematics

Group A
Category: Factual knowledge
Critical Maths exemplification: A jury consists of 12 people. If juries are chosen at random from the adult population, how many women would you expect there to be, on average, on a jury?

Category: Comprehension
Critical Maths exemplification: Give an example of regression to the mean.

Category: Routine use of procedures
Critical Maths exemplification: A typical shoe shop has about 500 styles of shoe available in each of 6 sizes. The shoes are stored in boxes. The measurements of a typical shoe box are shown below. Estimate the size of the stock room that the shoe shop needs.

Group B
Category: Information transfer
Critical Maths exemplification: Give an example of a Fermi estimation problem which you find interesting. Pose and solve the problem and explain why you have chosen this problem.

Category: Application in new situations
Critical Maths exemplification: A group of holiday makers consists of 50 people. One night, 10 of them stay late in the hotel lounge and break some of the furniture, causing a lot of expensive damage.
The police are called. Everyone denies being responsible so all 50 are asked to take a lie detector test. A lie detector test is about 85% accurate when someone is lying but only about 50% accurate when someone is telling the truth. Each of the 50 people denies being responsible.
(a) How many people will the lie detector be expected to identify as having lied?
(b) How many of those will be guilty?

Group C
Category: Justifying and interpreting
Critical Maths exemplification:
“Reading standards have fallen in primary school after Cameron cut support for literacy tuition” Education Labour tweet 20/9/13
“The percentage of pupils achieving level 4 or above, in the reading test decreased by 1 percentage point to 86%.” DfE statistical first release 19/9/13
The Every Child a Reader programme was launched in 2008 to provide additional support for Key Stage 1 pupils who were falling behind in literacy. In November 2010, the government took the decision to make the funds which had been allocated to this programme part of the general school budget with schools making their own decisions on how to spend the money.
Percentage of Key Stage 2 pupils achieving level 4, or above, in the reading test
Year: 2007 2008 2009 2010 2011 2012 2013
%:    84   87   86   83   84   87   86
DfE data, used under the terms of the Open Government Licence
Is there evidence that reading standards have fallen in primary schools between 2012 and 2013?

Category: Implications, conjectures and comparisons
Critical Maths exemplification: A budget airline allows hand luggage with maximum dimensions 40 cm x 55 cm x 20 cm. Irene has three bags. Each is 5 cm shorter than the maximum allowed in one direction.
A. 35 cm x 55 cm x 20 cm
B. 40 cm x 50 cm x 20 cm
C. 40 cm x 55 cm x 15 cm
Explain how Irene can put the bags in order of how much they will hold without doing any calculations or experiments.
Category: Evaluation
Critical Maths exemplification: A survey by a dating website checked the hair colour of the wives of 100 billionaires with the following results.
Hair colour: Brown Blonde Black
Number:      62    22     16
Is this evidence that men prefer women with brown hair?

Darlington⁴ comments as follows.
“The mathematical skills associated with Group C – “those that we associate with a practising mathematician and problem solver” (Pountney, Leinbach and Etchells 2002, 15) – are those which, unfortunately, have been found to be most lacking amongst undergraduate mathematicians (Ball et al. 1998; Smith et al. 1996). Similarly, Etchells and Monaghan (1994; cited by Pountney et al. 2002) found that A-level mathematics examinations awarded marks mainly for Group A tasks.”
By contrast, the skills developed in the Critical Maths course are mainly from groups B and C. The three groups of mathematical skills outlined in the table could, with suitable weighting and additional detail, and taken together with the objectives of the Critical Maths curriculum, inform the Assessment Objectives for the assessment of Critical Maths.

5. Assessment methods for Critical Maths

A timed written examination could assess some aspects of Critical Maths. However, students will often need thinking time in order to use mathematical knowledge in new ways and construct mathematical arguments. The Oxford Admissions test (OxMAT) is a 2½ hour written examination; Darlington⁵ has analysed the questions in five years of test papers with the following result.
“This analysis found that the majority of OxMAT questions are from Group C, and the minority Group A.”
The candidates for OxMAT all do A level Mathematics and will usually be expecting to achieve at least grade A.
The candidature for Critical Maths has less experience of mathematics than the candidature for OxMAT and a wider range of prior attainment. They will also be faced with questions set in contexts which may be unfamiliar, whereas the OxMAT questions are all pure mathematics questions. Assessing Critical Maths purely by means of timed written examination could run the risk of reducing validity by limiting what can be assessed and/or introducing bias by favouring students who are familiar with the contexts used.

4 Darlington, E. 2013. The use of Bloom’s taxonomy in advanced mathematics questions. Informal Proceedings of the British Society for Research into Learning Mathematics
5 Ibid.

An assessment which included the response to just one substantial problem would run the risk of reduced reliability; including at least two substantial problems would help remedy this. Crisp⁶ makes the following comment about GCSE coursework.
“Coursework tends to involve just one or two tasks but these are large tasks conducted over a period of time so they effectively increase the sample size for a GCSE qualification more than could be achieved using an equivalent exam, and hence should help to avoid ‘construct under-representation’.”

5.1 Possible approaches to the assessment of Critical Maths
Approaches which allow students to demonstrate what they can do when given sufficient thinking time are listed below. Advantages and disadvantages of each of these approaches are considered.
a. A teacher record of skills demonstrated during the course, possibly combined with a student reflective log.
b. A portfolio of written solutions to problems solved during the course.
c. Controlled assessment in which students respond to one, or more, problems with strict controls on the resources available to them.
d. Pre-release material giving the background to one, or more, problems which will then be solved in a timed written examination.
e.
Questions in timed written examinations which assume that students have solved particular types of problem and ask them to reflect on the process.
f. Timed written examinations combined with formative assessment.

5.1a. Teacher record
Teachers would record skills demonstrated by students during the course; this would require an appropriate record keeping system to be designed and consideration to be given to whether it should be on a yes/no basis or whether some kind of grades or marks should be attached to the individual skills. A student reflective log could provide additional evidence of learning.
Advantages
• Students can demonstrate relevant skills when appropriate and gain credit for doing so.
• The teacher record could be reported separately from the examination grade to show different types of attainment.
Disadvantages
• Keeping a record of individual student attainment throughout the course could become time-consuming and burdensome for teachers.
• Staff turnover can cause difficulties with continuity of the record-keeping process.
• It is difficult to be certain how much has been done by individual students.
• It is difficult to design a manageable moderation process.
• Checking off separate skills as they are attained can lead to atomistic thinking rather than encouraging a holistic approach.
• Combining a mark from a teacher record with an examination mark poses difficulties.

6 Crisp, V. 2009. Does assessing project work enhance the validity of qualifications? The case of GCSE coursework. Educate. http://www.educatejournal.org/

5.1b. Portfolio
Each student would gather a collection of some or all of the work done during the course for assessment at the end of the course. Marking and moderation could take place simultaneously by teachers using adaptive comparative judgement (see section 6.4) which has been used for the assessment of Design Technology portfolios.
The bunching of coursework marks around key grade boundaries has been identified as a feature of teacher marking of internal assessment.
“Drivers in the current context put intolerable pressure on teachers, pulling them in very different directions. On the one hand their performance must continually improve, and on the other they must be impartial and reliable assessors. This leads to a highly-conflicted professional role regarding internal assessment. External accountability measures exert very high pressures for continual improvement and attaining grades at and above the C threshold at GCSE. Many elements such as professional recognition, status and progression are contingent on performance against targets and measures. For the majority of teachers this does not lead them to maladministration of assessment, but it does appear to drive bunching and upwards tilting of marking, and may include a strong element of ‘benefit of the doubt’. At the same time, awarding bodies expect teachers to behave as consistent, fair markers, ensuring that each standards and marking practice are in line with marking schemes and national standards.”⁷
Adaptive Comparative Judgement could overcome this (see section 6.4). An alternative would be to report the portfolio separately, combined with moderation of internally produced rank orders. This would ensure that a mark does not need to be associated with the portfolio; the only decision to be made is whether it merits a grade; the number of grades should be kept small.
Advantages
• Students can have sufficient thinking time to produce their best work.
• The portfolio could be reported separately from the examination grade to show different types of attainment.
Disadvantages
• A portfolio of all solutions from the whole course would be potentially burdensome for students to compile.
• Deciding whether to include particular pieces of work in a portfolio can assume an importance out of proportion to the educational benefit.
• Comparing portfolios with different amounts of work in them poses difficulties for marking and moderation: how should a portfolio consisting of four good solutions be compared to a portfolio with two good solutions and many more pedestrian ones? Some of the difficulties could be overcome by specifying a particular number of solutions (say, three).
• It is difficult to be certain how much of the work has been done by individual students.
• It is desirable for students to discuss solutions to problems while learning the curriculum; would it be acceptable for students to include solutions to problems which have been widely discussed in class as part of their portfolio?

7 Oates, T, 2013, Radical solutions in demanding times: alternative approaches for appropriate placing of ‘coursework components’ in GCSE examinations

5.1c. Controlled assessment
Students undertake tasks under controlled conditions; tasks may be Awarding Body or teacher set. Students may be required to undertake some, or all, of the work under close supervision. Tasks could be Awarding Body set and marked in order to reduce the burden on teachers and increase reliability. In order to allow students time to think about an unfamiliar problem and to understand a context which they may not be familiar with, a possible model is as follows.
• Part 1: students are given a paper of problems to solve in exam conditions and allowed an hour to think about them. They hand in their initial notes and take the problems away to think about and research for one or two days before part 2.
• Part 2: students are given a clean copy of the question paper and the notes they made in part 1. They are not allowed to bring any material into the examination room. They have two hours to complete the solutions to the problems in examination conditions.
The short time period between parts 1 and 2 disadvantages students who are not able to devote as much time to research due to other demands, including examinations in other subjects. However, a longer period between the two parts would allow solutions to be posted on the internet and make it possible for some students to receive undue help from family members or tutors.
Advantages
• Students can have appropriate thinking time to allow them to respond to more complex problems.
• The controlled conditions provide assurance that the work is students’ own.
Disadvantages
• Working under controlled conditions can be stressful for students.
Ofqual’s research into the introduction of controlled assessment⁸ has identified a number of potential concerns which are relevant to the development of the assessment of Critical Maths.
• Accommodating students who are absent or entitled to additional time for assessments.
• Reduced opportunity for students to develop and refine their work.
• Pressure on scarce resources such as ICT and classroom space at critical times.
• Reduction of teaching time due to time spent on controlled assessment.
• Uncertainty about how to interpret guidance regarding what is and is not permitted.

8 Ofqual, 2011, Evaluation of the Introduction of Controlled Assessment

5.1d. Pre-release
The contexts for the assessment problems are made known to students in advance of the examination but they are not told the questions. Tasks and pre-release materials would be provided by the examining Awarding Body which will have experience of preparing these for other subjects. Pre-release materials can include any of the following.
• One (or more) problems to reflect on.
• Some data to analyse.
• Background information to read and understand.
Students are allowed to read around and research the contexts before going into the examination but they do not know what questions will be asked.
Advantages
• The pre-release material allows all students to understand the context and so removes a potential barrier to their being able to work on the problem.
• The use of pre-release material fosters keen interest in students and can be a strong motivator for learning.
Disadvantages
• Students who can solve problems quickly can have an advantage over those who are able to produce a satisfactory solution given sufficient thinking time.
• Teachers may be able to spot likely questions and prepare students for them – this puts some students at an advantage.

5.1e. Examination questions which assume previous classroom work
In the examination, students are asked to reflect on a specific type of problem which they have previously solved in class. They might be asked to outline strategies used or to construct or solve a similar problem.
Advantages
• This type of question in the examination can have a positive effect on what takes place in classroom learning.
• It is motivating for students to know that the work that they are doing in class will be referred to directly in the examination.
Disadvantages
• Examination questions could become formulaic and predictable.

5.1f. Examination and formative assessment
The examination concentrates on the skills it is possible to assess in that way. Other skills could be assessed by means of formative assessment but not reported as a grade. However, students could refer to these skills in personal statements and interviews for university or jobs. This approach has common ground with a proposal in the recent Tim Oates paper on coursework.
“Model 5 A ‘qualifications package’ model
For this we need an SMP or Nuffield style approach where we create a ‘qualifications package’ which gives high detail in these elements, all presented as a linked whole:
Course content
Teaching materials and student materials
In service training re course content
Formative assessment instruments
Exam content
‘Desired skills’ and related outcomes are developed through the learning programme. It relies on professional development and highly refined learning materials. Marks in coursework would not contribute to grades in the examination. This model promotes the idea of an integrated offer which must be consistent with ‘expansive’ rather than ‘instrumental’ education. But developing this is slow, and expensive. It is a viable way forward, but is a long term strategy, due to the high level of both resource and coordination required.”⁹
The Critical Maths development work in which MEI is engaged includes most of these elements; the addition of formative assessment instruments would complete the package.
Advantages
• Assessment is reliable and manageable.
• Teachers are able to track the development of skills gained during the course.
• Students are encouraged to reflect on their progress throughout the course and can make this a part of university or job applications.
• Questions which students would find it hard to make a start on in a timed written examination are incorporated into the course.
Disadvantages
• Skills which are not directly relevant to the examination might not be encouraged.
• Students may not have anything to show that they have developed the skills which have not been examined.

9 Oates, T, 2013, Radical solutions in demanding times: alternative approaches for appropriate placing of ‘coursework components’ in GCSE examinations

6. Marking solutions to problems

Student solutions to longer problems are likely to differ considerably in the approach taken and the progress made.
Possible ways of assessing such solutions include the following.
• A mark scheme specific to the problem; this might include levels of performance rather than specific marks for specific points made.
• Generic criteria giving guidance on marking problems; these could either award specific marks for specific points made or have levels of performance.
• The use of adaptive comparative judgement to produce a rank order with grades or marks being assigned from the rank order.

6.1 Mark scheme
It is difficult to construct a successful mark scheme for a problem without seeing the full range of student responses. External assessment would allow examiners to adjust a provisional mark scheme after seeing student responses but further research needs to be undertaken to find out how reliable the application of such mark schemes would be.

6.2 Specific criteria
Swan and Burkhardt¹⁰ give examples of holistic criterion-based scoring schemes which are specific to tasks; these are similar in principle to levels-based mark schemes which are often used for marking essays in some subjects.

6.3 Generic criteria
MEI, among others, has produced generic marking criteria for mathematics coursework at A level. These have been shown to work, but they are for tasks where the variety is limited, and teachers can find it difficult to award the correct marks until they gain experience. Further research would be needed to determine whether such an approach is feasible.

6.4 Adaptive comparative judgement
The E-scape project¹¹ at Goldsmiths College used computer based adaptive comparative judgement to assess Design Technology portfolios. Assessors were presented with two portfolios at a time on screen and had to decide which was the better. The computer software adapted to present pairs of portfolios which were closer as time went on. The software reports a rank order and identifies any assessors who are out of line with the general view.
Once a rank order has been established, marks or grades can be awarded. This methodology has since been applied to mathematics assessment in a research context. A 2010 Cambridge Assessment research paper¹² summed up as follows.
“Research is needed in order to evaluate the quality of assessment outcomes based entirely on paired comparison or rank-order judgments, and to identify the circumstances in which these outcomes are ‘better’ than those produced by conventional marking. The assumptions, underlying processes, and operational issues associated with using paired comparison / rank-order judgments in public examinations require further scrutiny. Crucially, the judgment process moves more towards a ‘black box’ model of assessment – something which is contrary to the direction in which assessment has been developing. In addition, the increasing demand from schools, pupils and parents for detailed feedback on performance becomes problematic under such arrangements.
In terms of validity, ‘better’ means making the case that the paired comparison / rank-order outcome supports more accurate and complete inferences about what the examinees know and can do in terms of the aims of the assessment.
In terms of reliability, ‘better’ means showing that the paired comparison / rank-order outcomes are more replicable with different judges (markers) or different tasks (questions).
In terms of practicality, we need to show that replacing marking with paired comparison / rank-order judgments is technologically, logistically and financially feasible.
In terms of acceptability, ‘better’ means showing that examinees and other stakeholders are more satisfied with the fairness and accuracy of paired comparison / rank-order assessment outcomes, and the information from the assessment meets school, candidate and user requirements.
In terms of defensibility, ‘better’ means showing that it is easier for examination boards, when challenged, to justify any particular examinee’s result (which clearly could be a significant challenge for a system based entirely on judgment with no equivalent of a detailed ‘mark scheme’).”

10 Swan, M, Burkhardt, H. Designing Assessment of Performance in Mathematics. Educational Designer
11 http://www.gold.ac.uk/teru/projectinfo/projecttitle,12370,en.php
12 http://www.cambridgeassessment.org.uk/images/125350-summary-of-rank-ordering-and-pairedcomparisons-research.pdf

7. Further considerations for the assessment of Critical Maths

• Active participation in the learning of a Critical Maths curriculum is of great potential value to students and should not be undermined by the assessment.
• To ensure that a qualification in Critical Maths can count as the Core Mathematics component of a TechBacc¹³ as well as enabling it to have currency for students, it should either form the whole of a 120 guided learning hour qualification or be combined with another component to make such a qualification.
• Some students may be re-engaged with mathematics through studying the Critical Maths curriculum and decide to take an AS in Mathematics in year 13, having embarked on Critical Maths in year 12. There should be an assessment of their learning at the end of year 12 to ensure that they get credit for their achievement.
• Adaptive Comparative Judgement and level-based mark schemes should be trialled as ways of marking student solutions to more substantial problems.
• Although group discussion is of enormous value during the Critical Maths course, the assignment of individual credit to participants in such discussion makes it difficult to include credit for group discussion in the final assessment; there may, however, be value in trialling this.
• A pass/merit/distinction grading of a Critical Maths qualification may be more appropriate than finer grading.
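As a rough illustration of how a rank order can be derived from paired judgements of the kind described in section 6.4, the following is a minimal sketch only. It assumes a Bradley-Terry model fitted by a simple iterative update; this report does not say which model the E-scape software uses, so the model choice, the function name rank_from_judgements, the prior smoothing parameter and the portfolio labels are all hypothetical. The sketch covers only the ranking step, not the adaptive choice of which pairs to present or the detection of out-of-line assessors.

```python
from collections import defaultdict


def rank_from_judgements(judgements, iterations=200, prior=0.5):
    """Return items ordered best-first from (winner, loser) judgements.

    Assumption: a Bradley-Terry model stands in for whatever model a real
    comparative judgement system uses. `prior` adds that many virtual wins
    and losses against a phantom item of strength 1, so that items with no
    wins or no losses still get a finite strength.
    """
    items = sorted({item for pair in judgements for item in pair})
    wins = defaultdict(float)   # number of comparisons each item won
    met = defaultdict(float)    # number of times each ordered pair was compared
    for winner, loser in judgements:
        wins[winner] += 1.0
        met[(winner, loser)] += 1.0
        met[(loser, winner)] += 1.0

    strength = {item: 1.0 for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            # Contribution of the virtual comparisons with the phantom item.
            denom = 2.0 * prior / (strength[i] + 1.0)
            for j in items:
                n = met.get((i, j), 0.0)
                if j != i and n > 0:
                    denom += n / (strength[i] + strength[j])
            updated[i] = (wins[i] + prior) / denom
        strength = updated

    return sorted(items, key=lambda item: strength[item], reverse=True)


# Hypothetical judgements: each tuple is (preferred portfolio, other portfolio).
example = [("A", "B"), ("B", "C"), ("A", "C")]
print(rank_from_judgements(example))
```

Grades or marks would then be assigned from the resulting order; how boundaries are drawn on that order is a separate decision which this sketch does not address.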
13 https://www.gov.uk/government/news/new-techbacc-will-give-vocational-education-the-high-statusit-deserves

8. Draft Assessment Objectives

AO1 Factual knowledge and routine application (suggested weighting 10-20%)
• Understand the terminology and mathematical ideas appropriate to the specification.
• Give examples of mathematical ideas introduced in the course.
• Use numerical and other information to make a reasonable estimate.

AO2 Using mathematical knowledge (suggested weighting 30-40%)
• Make a start on solving a problem where the solution is not obvious.
• Identify the important variables/features in a situation.
• Interpret quantitative information or evidence.
• Recognise situations related to problems they have encountered before.

AO3 Construction and criticism of arguments using mathematics (suggested weighting 45-60%)
• Communicate solutions, strategies and reasoning.
• Interpret solutions in the context of the original problem.
• Criticise and evaluate solutions to problems.

Appendix Draft Examination Questions

Short questions

1. To stay healthy, adults are advised to walk 10 000 steps a day. About how far is this?
A 50 m
B 500 m
C 5 km
D 50 km
[1] T1, AO1
Answer: C

2. Areas of the UK which have more mobile phone masts also have more births per year. Give a likely explanation for this. [2] M7, AO2
Solution: Reasonable explanation clearly expressed.
Marks: 2
Notes: Example explanation: Areas with more mobile phone masts have higher populations and so more births. SC1 for incomplete explanation.

Longer questions

1. A workplace wants to reduce staff absence due to illness. It introduces a new policy.
• Any member of staff who has more than twice the average number of days’ absence due to illness in a year gets a letter which threatens punishment if he/she does not take fewer days off ill in the coming year.
• Any member of staff who has no absence due to illness in a year is paid a small bonus.

The following year the members of staff who had the letter threatening punishment have, on average, fewer days’ absence due to illness but the members of staff who got the bonus have, on average, more days’ absence due to illness.

The manager says, “This proves that telling people off when they are not doing well enough is more effective than paying a bonus when they do well.”

(i) Suggest an alternative explanation for what has happened. Explain clearly the meaning of any technical terms you use, showing how they apply in this situation.
[2] M14, P8, AO1

(ii) The workplace has 100 staff. Describe a way in which the workplace could test which is more effective at reducing absence: a telling off or a reward.
[4] M8, M13, P3, AO2

Part i
Solution:
• A suitable explanation (E1)
• Clearly explained (E1)
Notes: Suitable explanation: This is an example of regression to the mean. The amount of illness a person has varies. The people who were most ill in the first year did not get as ill in the second year; the people who were least ill got more ill by natural variation.

Part ii
Solution: A description of a randomised controlled trial.
• Assign each worker to one of two groups at random.
• In one group, send a warning letter to all who have more than some amount of absence (average or above).
• In the other group, give a small reward to all who have less than some amount of absence (average or below).
• Compare absence rates for both groups in the following year.
Notes: The term “randomised controlled trial” need not be used. Ignore additional details such as what to do if someone leaves the place of work part way through the trial.
Marks:
4 for clear description including all four points.
3 for clear description which is incomplete.
2 for progress towards a clear description.
1 for a correct, relevant statement or just the term “randomised controlled trial” with no further detail.

2. A new flag is being designed. A requirement of the design is that the total area of the grey regions must be the same as the total area of the white regions.

The design below has been made by drawing a diagonal in a rectangle then taking a point on the diagonal and drawing lines parallel to the sides of the rectangle through that point. The regions formed have been alternately coloured grey and white, as shown.

Explain why this design will meet the requirement that the total area of the grey regions must be the same as the total area of the white regions, no matter what point on the diagonal is taken.
[5] P7, T5, AO3

Solution: A clear argument showing that the grey area equals the white area. This is likely to include the following points.
• Big grey triangle = big white triangle as each is half a rectangle.
• Little grey triangle = little white triangle as each is half a rectangle.
• One big triangle plus one little triangle plus one of the rectangles makes half the flag so the grey rectangle must equal the white rectangle. Hence the total grey area must be equal to the total white area.
Marks:
M1 for identifying either big or little triangles as equal in area.
M1 for explaining why.
A1 for identifying each of the other pair of triangles as equal (dep on both previous M marks).
M1 for identifying each half of the rectangle as equal in area.
A1 for completion.
[5]
Notes: Measuring the given diagram and calculating areas scores zero.

3. An infectious disease becomes an epidemic if each infected person goes on to infect, on average, more than one person. If each infected person goes on to infect, on average, less than one person then the disease dies out.

(i) A new infectious disease is affecting a population.
If nobody is vaccinated, each infected person infects on average five other people. An effective vaccine has been developed. A person who has been vaccinated cannot catch the disease. Provide a clear explanation showing that at least 80% of the population needs to be vaccinated to prevent an epidemic.
[3] U1,2, P1, AO2

(ii) A different disease affects children. It is caught by touching an infected child or a surface which an infected child has touched. When a child gets the disease the typical course is as follows.
• On the first one or two days, the child has no symptoms but can infect others.
• On the next two or three days, the child feels ill and stays in bed at home.
• On the last three or four days, the child continues to be infectious in spite of feeling well.

On average, an infected child infects 2 other children when the child is allowed to go back to school as soon as he/she feels well again. To stop the epidemic, it is suggested that a child who falls ill with the disease must stay off school for one whole week after showing symptoms. Will this stop the epidemic?
[6] U1,2 R5, P1,7, AO3

Part i
Solution: A clearly explained argument showing that 4 out of 5 people need to be vaccinated.
Marks:
3 for clear correct argument with correct answer.
2 for correct argument which is not quite clear or complete but which is clearly leading to the correct answer.
1 for a correct, relevant statement.
Notes: Example clear correct argument: Without vaccination, one person infects 5. This needs to be reduced to (no more than) one person infected instead of 5. So 4 out of every 5 people need to be vaccinated. This is 80%.

Part ii
Solution: A correctly explained argument with a conclusion, including the following points (or similar).
• A child who goes back to school as soon as s/he feels better is infectious at school for 4-6 days.
• The proposed measure would make the child infectious at school for only 1-2 days.
• It could cut the infection rate to less than half so less than one infection per child – this would stop the epidemic.
• We don’t know that a child is equally infectious the whole time. Children can still infect others outside school e.g. siblings or friends who visit.
Marks:
5-6 for a correct argument with a conclusion which includes all the important points.
3-4 for an argument with a conclusion which includes at least two points.
1-2 for a partial argument with no conclusion.
Notes: The conclusion can be “yes”, “no” or “it depends” as long as it arises from the candidate’s argument. A conclusion on its own, with no argument, scores zero.

4. An extract from a local newspaper story is given below.

An amazing coincidence
Joey Pickles (age 5) had a special present on his birthday – a little brother. Ben was born on Joey’s birthday. Their mother, Marie, said “I am amazed – I never expected them both to share a birthday.”

There are 50 000 people living in the area covered by the local newspaper. Estimate how long it is likely to be before another child in the area has a brother or sister born on his/her birthday. You should state and justify any assumptions you make.
[8] U1, 2 T1, 2 M4, AO1 (2), AO2 (2), AO3 (4)

Solution: A clearly presented solution leading to an estimate of somewhere in the region of 3 months to 5 years. The solution should contain the following points.
• A reasonable attempt to estimate the number of women or children in the area.
• An allowance made for children who will not get a younger sibling.
• An estimate of younger siblings born over a specified time period.
• Use of 1/365 (or 1/366 or 1/360) as a probability of a younger sibling matching the birthday of an older sibling.
Marks:
7-8 for a correct argument including all four points and leading to a correct conclusion.
5-6 for a correct argument including three of the four points; may not have a conclusion.
3-4 for two of the four points.
1-2 for one of the points.
SC A numerical answer with no working scores 1 mark if in range.
Notes:
Example solution: 50 000 people aged 0 to 80 (approx); the children are aged 0 to 16 so about one fifth of them are children. 10 000 children. About half of these might expect a younger sibling in the next 5 years (or so), so about 5000 children. In 1/365 of cases, the birthday of the sibling will be the same as the birthday of the original child. 5000/365 ≈ 14. About 14 times over 5 years, so about once every 4 months.
Second example solution: About half the 50 000 people are female aged 0 to 80. About a quarter of those are aged 20 to 40. About 6000 women will have an average of 2 children each over the next 20 years. 12 000 children in 20 years. About a quarter will be only children so 9000 children will have an older sibling. In 1/365 of cases, he/she will have the same birthday as the original child. 9000/365 ≈ 25. 25 in 20 years is a little more than once a year.

5. A patient visits his doctor with symptoms which could indicate one of two medical conditions, a minor condition A or a life-threatening condition B. The doctor’s practice typically sees cases of condition A two or three times a day and cases of condition B perhaps once every year or two years.

A test is available for condition B. If a patient has condition B then the test diagnoses the condition correctly 95% of the time but 2% of patients who do not have condition B will test positive for it.

Treatment for condition A is cheap and effective, usually resulting in a complete cure within a fortnight. Treatment for condition B is expensive and very unpleasant for the patient, resulting in an inability to work for about six months; however, the treatment is usually effective if started quickly enough.

(i) Should the doctor recommend the patient be tested for condition B straight away or should she try treating for condition A first?

(ii) Another patient has the same symptoms as the first patient.
He has a family history which makes it twenty times as likely that he will suffer from condition B compared to the average person. Should the doctor recommend this patient be tested for condition B straight away or should she try treating for condition A first?
[15] U1, 2 P5, 6, 7, R1, T2, M2 AO1 (3), AO2 (2), AO3 (10)

NB This question would be suitable for trialling a marking approach based on Adaptive Comparative Judgment, assigning a rank order for responses and then awarding marks based on this. A suggested mark scheme is offered as an alternative to comparative judgment. The question should be marked as a whole.

Solution: Solutions are likely to include the following points.
• A comparison of numbers suffering from A to those suffering from B (about 1000 to 1 for i).
• Using natural frequencies to estimate the probability of a positive test result meaning the patient actually has condition B.
• A recommendation which takes account of the probability of a false positive, the potential seriousness of condition B and the consequences of treatment for B if, in fact, the patient does not have B.
• For part ii, realising that the frequency of B in these circumstances is 20 times as high as that used in i.
• Amending working for i accordingly and reviewing the resulting recommendation.
• When recommending sending a patient for testing, also recommending action which will take account of the possibility of a false positive.
Marks:
13-15 for a correct complete solution including suitable recommendations; at the lower end of the level, there may be some gaps in the recommendations.
10-12 for a largely correct and largely complete solution. This level includes responses which work correctly with the probabilities in both cases but do not make suitable recommendations and responses which do one part correctly and make substantial progress in the second part.
7-9 for substantial progress towards a complete solution.
This level includes responses which do one part correctly and do little, or nothing, for the other part.
4-6 for some progress towards a complete solution. Responses at this level include some correct work, for example, correctly working with the probabilities for one part of the question.
1-3 for some relevant work. Responses at this level have some correct statements or working but these are not part of a coherently communicated solution.
Notes:
Part i example solution: For 100 patients with B, the practice sees about 100 000 with A. For 100 000 A patients, 2000 test positive and 98 000 test negative. For 100 B patients, 95 test positive and 5 negative. Someone testing positive has only about a 5% chance of having B (95 out of 2095 positive results), so best to treat for A and monitor the patient to see whether treatment is effective. If not effective then send for testing but get a second opinion if the result is positive.
Part ii example solution: For 2000 patients with B, the practice sees about 100 000 with A. For 100 000 A patients, 2000 test positive and 98 000 test negative. For 2000 B patients, 1900 test positive and 100 negative. Someone testing positive has about a 50% chance of having B (1900 out of 3900 positive results), so best to test for B but also get a second opinion.
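
The natural-frequency working in the question 5 example solutions can be checked with a short calculation. The following is a minimal Python sketch and is not part of the draft mark scheme: the function name and parameter names are illustrative assumptions, while the patient counts, the 95% detection rate and the 2% false positive rate are taken from the question and its example solutions.

```python
def chance_positive_means_b(patients_with_b, patients_with_a,
                            detection_rate=0.95, false_positive_rate=0.02):
    """Natural-frequency estimate of the chance that a positive test
    result really means the patient has condition B.

    Assumption (as in the example solutions): every patient without B
    is counted as a condition A patient.
    """
    true_positives = patients_with_b * detection_rate        # B patients who test positive
    false_positives = patients_with_a * false_positive_rate  # A patients who test positive
    return true_positives / (true_positives + false_positives)


# Part i: about 100 patients with B for every 100 000 with A.
ppv_average = chance_positive_means_b(100, 100_000)

# Part ii: a family history makes B twenty times as likely,
# so about 2000 patients with B for every 100 000 with A.
ppv_family = chance_positive_means_b(2_000, 100_000)

print(f"Part i:  {ppv_average:.1%}")   # 4.5%, which the example solution rounds to 5%
print(f"Part ii: {ppv_family:.1%}")    # 48.7%, which the example solution rounds to 50%
```

Changing the two patient counts shows how strongly the conclusion depends on how rare condition B is, which is the point a candidate's recommendation needs to address.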