Completing a statistics GCSE coursework project

Assessment

The assessment criteria are divided into three strands, the first two of which are subdivided. The same criteria apply to both tiers of entry.

• Strand 1: Specify the line of enquiry, design and plan the approach and the collection of data.
• Strand 2: Processing and representing data with calculations and summary statistics.
• Strand 3: Interpretation and discussion of results and conclusions.

Strand 1 is subdivided into (a) planning and (b) collection of data (5 marks each).
Strand 2 is subdivided into (a) analysis, presentation and diagrams and (b) calculations (10 marks each).
Strand 3 is interpretation (10 marks).

The cycle

1. Specify the problem and plan
2. Collect data
3. Process and represent
4. Calculate and obtain summary statistics
5. Interpret and discuss
6. Evaluate; possibly modify, improve or develop

The cycle may be completed by an evaluation with suggestions for improvements or developments that would enhance the project.

Strand 1a    5 marks
Strand 1b    5 marks
Strand 2a   10 marks
Strand 2b   10 marks
Strand 3    10 marks
TOTAL       40 marks

Plan
State your hypotheses and decide which techniques to use. What potential problems do you anticipate, and how will you overcome them? Give consideration to outliers or anomalous data.

How will you deal with bias or exceptional data (outliers)? Identify the secondary data population and its reliability. Consider the nature of the data: is it discrete or continuous? What sampling techniques will you use? Justify your choice of sampling.

Calculations need to be sophisticated (A grade) and relevant to the project; unnecessary diagrams will lose marks. Ensure each diagram you use has a commentary, for example comparisons drawn from a back-to-back stem and leaf diagram, two box plots or cumulative frequency (CF) graphs.

Conclusions about the project. Critically question the validity of your findings. In what ways could you have improved the validity of your findings?
Refinements to your plan may also feature in this section.

Strand 1(a): Planning

Mark (Grade)

3 (C) The candidate will give a reasonably clear description of their strategy. There are two essential requirements for mark 3 and above:
(i) a discussion of the sampling regime to be used (or a justification for taking a census, plus some detail as to what will be done if some information is not available);
(ii) a more complex project is being undertaken which involves the use of techniques of at least grade C in the grade descriptors. This might be through a series of samples and a search for a trend, or a comparison of two or more samples using a measure of centrality and a measure of spread (either median and interquartile range or mean and standard deviation), or a detailed comparison of two or more lines of best fit.

4 (A) This will involve a number of inter-related features constituting a more demanding task, which must be designed in a way that makes it a single project. The candidate will express clear aims and give a reasonable strategy for achieving results efficiently. Candidates are expected to anticipate problems and plan ways of overcoming them. They determine in advance what they will do if their sample contains outliers or anomalous data. They are expected to discuss possible sampling regimes and give an adequate reason for their choice. They will give evidence that their work will be efficient and effective, with no wasted effort. They choose techniques with a clearly stated purpose and plan to use these in a way consistent with the grade A descriptors.

5 Candidates need to do all that is required for mark 4 and use a range of techniques as listed in the grade A descriptors. These techniques must be justified and appropriate. Contrived situations run counter to the philosophy of the project and are not to be rewarded, so imagination and creative thinking will be required.
Candidates justify the methods they plan to use by comparing them with possible alternatives, eg the rationale behind using a stratified rather than a systematic sample. They should discuss the strengths and weaknesses of both sampling regimes. They plan to use the standard deviation rather than the range because…
(i) Outliers are defined before data is collected. The reasons they may occur are discussed and the response to them is planned.
(ii) All of the issues about outliers, exceptional values, non-response, bias etc are expected and discussed.

Strand 1(b): Collecting (data may be primary or secondary)

4 (A) Must be in the context of a mark 4/5 plan. Primary data must avoid bias and be reliable. The candidate states what steps are being taken to guarantee that a like-with-like comparison will occur. For secondary data, details about the mechanics of the sampling should be addressed. It is not necessary to use a stratified sample, but it is necessary to state good reasons for the system used. Anomalous data, missing data or outliers must be identified and dealt with as planned.

5 Candidates using primary data should discuss the problems of obtaining truthful responses in sensitive cases. All of the issues about outliers, exceptional values, non-response, bias etc are expected, identified and responded to as planned. Published data is accessed for comparison (eg obesity tables or RPI). The continuous or discrete nature of the data is discussed.

Strand 2(a): Analysis, presentation and diagrams

This is where planned techniques are found. Inappropriate or unplanned diagrams will not gain marks, so use your plan carefully here.

Mark (Grade)

7 (A) The candidate uses relevant, appropriate graphs and diagrams as planned, explaining their choices and scales. The work must be sufficiently complex to be contextually of a grade A standard. Examples: Comparative pie charts. Cumulative frequency stepped polygons.
Histograms with unequal class intervals (to be rewarded these must be used and arise naturally within the project). Use of Venn diagrams. Comparisons of scattergraphs, cumulative frequency diagrams, and box and whisker diagrams with and without outliers (this must be planned in Strand 1).

8 The candidate uses statistical graphs and diagrams with a greater degree of sophistication. Examples: The candidate uses the box plots to note that in (b) all the key statistics are greater, and talks about the skewness of (b) and the relative symmetry of (a). Decides that curves of best fit will give a better model.

9 All choices should have been planned (Strand 1) and justified in terms of their aim and the methods used. Diagrams, concepts and arguments need to be presented correctly and efficiently with a high degree of justification. An accurate and perceptive analysis must be made within the context of the specified plan. Examples: The use of standardised scores on axes to enable the comparison of frequency distributions, and the shape and properties of the normal distribution. The candidate decides that lines of best fit are not the best model, moves on to compare curves of best fit, considers the equation of the curve, and can explain the variables used and discuss the intercept. OR They superimpose a normal distribution curve onto their histogram to find out whether their data is normally distributed.

10 This mark will be awarded for a concise project in which all the strands are interlinked. Examples: Use of regression, or the candidate's own measure of spread. Superimposing published values on their scattergraph or box and whisker diagram to make comparisons. A discussion, at the planning stage, of how the difference between discrete and continuous variables affects scales could be used as evidence for this mark; for instance, ‘Time is continuous and Money is discrete…’

Strand 2(b): Calculations

This area assesses the choice of calculations used.
How well have you examined the data, and what can be inferred from your results? “It is likely therefore that…”

5 (C) There may be inaccuracies or irrelevant work present. Examples: Calculates an estimate for the mean from continuous grouped data. Obtains the median and quartiles from a discrete data set. Calculates moving averages. Estimates population proportions from samples.

6 Examples: Calculates Spearman's rank correlation in appropriate circumstances. Finds the equation of trend lines from moving averages. Combined probability.

7 (A) Examples: Calculates standard deviation in a situation where this is statistically valid. Calculates frequency densities to draw a histogram. Uses proportion to draw comparative pie charts. Formally identifies outliers. Identifies and calculates the average seasonal effect.

8 Examples: Makes use of frequency density (from unequal intervals) in comparing samples. Uses the geometric mean in comparing proportions. Uses deciles and percentiles from a cumulative frequency diagram to compare distributions.

9 Examples: Uses standardised axes when comparing samples. Uses deciles and percentiles linked with standard deviation to compare distributions. Uses variation and calculates tolerance limits. Finds equations of curves of best fit and defines the variables.

10 Examples: Formal use of averages and standard deviation to test for the normal distribution. Informed use of regression coefficients and the normal distribution. Use of the binomial distribution.

Strand 3: Interpretation

5 (C) Here, non-trivial comparisons of samples with some interpretation or discussion are expected. Examples: Comparison of lines of best fit with regard to what this means in the context of the ‘real world’ situation being modelled. Using median and quartiles in fairly general terms but translating this back to the project plan. It is not enough to say “the medians are … and the lower quartiles are …”; these are statements of fact and gain marks only in Strand 2.

6
As well as satisfying mark 5, this includes aspects of detail which show that the candidate understands the situation in statistical terms and is thinking about the implications. The award is made secure by discussing ways of dealing with problems that may have arisen. Examples: Discussing the effects of outliers or suspect/unrepresentative samples.

7 (A) Within the context of a multi-faceted project, candidates conduct a detailed interpretation which draws the features together. They make a detailed summary of what they have done and why they have done it. Their discussion makes use of all the statistical techniques employed and relates these back to their original hypotheses. They comment on patterns and offer reasons for exceptions. There should be no unjustified conclusions or claims.

8 As well as satisfying mark 7, candidates discuss possible limitations of any inferences that they feel able to make. They discuss the validity of their results in quite specific terms. All the techniques they have used are discussed and the outcomes commented upon. These techniques should be efficient, with no redundancy.

9 The candidate discusses their project in a more statistically sophisticated way. All inferences must be related to hypotheses. There must be no redundant calculations or diagrams. Examples: Uses probability to evaluate the limitations of their conclusions. Discusses their results in comparison with published tables. Uses the normal distribution to make predictions.

10 The candidate looks at their model critically and, if he/she refutes the hypothesis, should offer an alternative hypothesis and describe how they would test it. They should recognise limitations and comment on the practical consequences of their work. Examples: Discusses formally the appropriateness of the normal distribution as a model. Compares data with and without outliers and comments upon the results, discussing the limitations of their conclusions when outliers are or are not included.
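The comparison of data with and without outliers mentioned for mark 10 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1.5 × interquartile range rule is one common way of defining outliers and is not prescribed by the criteria (a candidate must state their own definition in the plan), and the sample values and the names iqr_fences, split_outliers and summarise are hypothetical.

```python
from statistics import mean, median, quantiles

def iqr_fences(values, k=1.5):
    """Return (lower, upper) limits lying k * IQR beyond the quartiles.

    Assumption: k = 1.5 is a common convention, not a rule from the criteria.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1  # interquartile range
    return q1 - k * spread, q3 + k * spread

def split_outliers(values, k=1.5):
    """Separate values inside the fences from those flagged as outliers."""
    lower, upper = iqr_fences(values, k)
    kept = [v for v in values if lower <= v <= upper]
    flagged = [v for v in values if not lower <= v <= upper]
    return kept, flagged

def summarise(values):
    """Summary statistics used for the with/without comparison."""
    return {"n": len(values), "mean": round(mean(values), 2), "median": median(values)}

# Hypothetical sample, invented purely for illustration.
reaction_times = [31, 34, 35, 36, 38, 39, 40, 41, 43, 95]

kept, flagged = split_outliers(reaction_times)
print("Flagged as outliers:", flagged)
print("With outliers:   ", summarise(reaction_times))
print("Without outliers:", summarise(kept))
```

In this made-up sample the single extreme value shifts the mean far more than the median, which is the kind of observation a candidate could then discuss when commenting on the limitations of their conclusions.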
At any stage an evaluation of their strategy strengthens the award. Candidates who suggest modifications to their plan (without carrying them out) can gain an award in the third strand commensurate with the award their revised plan would have received in the first strand. In this way it is possible for the third strand to outscore the other two.

Marking GCSE Statistics coursework

The marks for GCSE Statistics coursework are out of 40. The minimum marks per grade for the Statistics project in GCSE Statistics are approximately:

Grade F = 11 to 13
Grade C = 19 to 21
Grade A = 27 to 29

A grade A* on the GCSE Mathematics coursework is nominally at a mark of around 22 out of 24. In GCSE Statistics, candidates working at the highest level will most likely include techniques such as the normal distribution, standard deviation or similar, fully utilised with no redundancy; such work would normally also earn a high mark in the GCSE Mathematics data handling project.
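As an illustration of the kind of work with standard deviation and the normal distribution referred to above, the sketch below converts a sample to standardised scores and compares the share of values within 1 and 2 standard deviations of the mean with what a normal model predicts (roughly 68% and 95%). It is an informal check under stated assumptions, not a method required by the criteria: the heights are hypothetical, the function names are invented for this example, and the divide-by-n (population) form of the standard deviation is assumed.

```python
from statistics import NormalDist, mean, pstdev

def standardised_scores(values):
    """Convert each value to a z-score: (value - mean) / standard deviation.

    Assumption: the population (divide-by-n) standard deviation is used.
    """
    centre = mean(values)
    spread = pstdev(values)
    return [(v - centre) / spread for v in values]

def share_within(z_scores, k):
    """Proportion of the sample lying within k standard deviations of the mean."""
    return sum(1 for z in z_scores if abs(z) <= k) / len(z_scores)

def expected_share(k):
    """Proportion a normal distribution places within k standard deviations."""
    model = NormalDist()  # standard normal: mean 0, standard deviation 1
    return model.cdf(k) - model.cdf(-k)

# Hypothetical heights in cm, invented purely for illustration.
heights_cm = [150, 155, 158, 160, 161, 163, 164, 165, 166, 167,
              168, 169, 170, 171, 173, 174, 176, 178, 181, 186]

z = standardised_scores(heights_cm)
for k in (1, 2):
    print(f"Within {k} sd: observed {share_within(z, k):.0%}, "
          f"normal model {expected_share(k):.0%}")
```

Large gaps between the observed and model proportions would suggest that the normal distribution is a poor model for the sample, which is something the candidate would then need to discuss in Strand 3.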