SHS Maths | shsmaths.wordpress.com

Representing and Summarising Data
S1 Chapter 2

Mode / Median / Mean

These should be very familiar from GCSE. They are all ‘measures of location’.

Mode is the most common observation.
Median is the middle value when the data are put in order.
Mean is the sum of the observations divided by the number of observations.

Proper notation for the mean: if $x_1, x_2, x_3, \ldots, x_n$ are the observations you have, then:

$\bar{x} = \dfrac{\sum x}{n}$

More accurately, $\bar{x}$ is used for the mean of a sample; $\mu$ (‘mu’) is the symbol for the mean of a population.

Using the Mean

All very simple, but we need to be careful when doing calculations with the mean.

If $\bar{x} = \dfrac{\sum x}{n}$ then we can also say: $\sum x = n\bar{x}$

Need to remember this when combining means!

Eg: Mean score for 25 kids in Class 1 was 6.4. Mean score for 30 kids in Class 2 was 7.2. What’s their combined mean? (Use $\sum x = n\bar{x}$ to find each class total, add the totals, then divide by the 55 kids altogether.)

When to use Mean / Mode / Median

Mode: for qualitative data, and for quantitative data when there are only one or two modes. Useless when each value occurs only once!
Median: for quantitative data; better than the mean when there are extreme values, as it is not affected by them.
Mean: for quantitative data; good because it uses all the data.

Frequency Tables

Very often, data is presented in a frequency table, and we need to be able to find averages from this.

Eg. Collar sizes of 95 shirts sold in a shop:

x       f
15      3
15.5    17
16      29
16.5    34
17      12

We now define the mean as:

$\bar{x} = \dfrac{\sum fx}{\sum f}$

Mode is easy. For the median, it’s easiest to add a cumulative frequency column.

Grouped Frequency

When we have continuous data, we can’t use discrete classes; we have to put the data into groups. This means that detail is lost, so all our measures of location now become estimates.

Mode: this is the ‘modal class’, the class with the highest frequency.
Mean: we can still use $\bar{x} = \dfrac{\sum fx}{\sum f}$, but now $x$ is the midpoint of the class.
Median: we estimate the median using interpolation.

Example 14 – Pine Cones

Length of cone (mm)    freq.
30 - 31                2        2 × 30.5 = 61
32 - 33                25       25 × 32.5 = 812.5
34 - 36                30       30 × 35 = 1050
37 - 39                13       13 × 38 = 494
Totals                 70       2417.5

(The third column is f × mid, the frequency times the class midpoint.)

Note on classes: these have been given to the nearest mm, so we need to think carefully about the class boundaries. For the class 32 – 33:

Lower class limit: 32
Upper class limit: 33
Lower class boundary: 31.5
Upper class boundary: 33.5
Class width: 33.5 − 31.5 = 2

Modal class = 34 – 36

Estimate for mean = 2417.5 ÷ 70 = 34.54 mm (2 d.p.)

To find the median, we need to use interpolation. We have 70 values, so, because n is large, we can say the median lies at the 35th value. Adding cumulative frequencies (2, 27, 57, 70), we can see that this is in the 34 – 36 class… but where?

The principle we use for interpolation is this: the proportion of the way into the class, counted in observations, is the same as the proportion of the way from the lower class boundary, measured in length. ie, for this example:

Length:         33.5 mm    m       36.5 mm
Observation:    27th       35th    57th

The class in which the median lies goes from the 27th to the 57th observation. The 27th observation is taken to be at 33.5 (lower class boundary) and the 57th at 36.5 (upper class boundary). The median, m, is interpolated as the same proportion of the way between 33.5 and 36.5 as the 35th is between 27 and 57:

$\dfrac{m - 33.5}{36.5 - 33.5} = \dfrac{35 - 27}{57 - 27}$

$\dfrac{m - 33.5}{3} = \dfrac{8}{30}$

$m - 33.5 = 3 \times \dfrac{8}{30}$

$m = 33.5 + 0.8 = 34.3$ mm

Coding Data

Sometimes it’s useful to code data to make it more manageable. The normal form of a coding is:

$y = \dfrac{x - a}{b}$

The mean / mode / median of the coded data can be interpreted for the original data, as in Example 16.
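
Checking the worked examples

The grouped-data estimates in Example 14 and the combined-mean question from ‘Using the Mean’ can be checked with a short calculation. The Python below is only a sketch and is not part of the original notes: the function names (`grouped_mean`, `modal_class`, `interpolated_median`, `combined_mean`) and the `gap` parameter are made up for illustration. It assumes classes are given as (lower limit, upper limit, frequency), that class boundaries sit half a unit beyond the limits, and that the median position is taken as n/2, as in the notes.

```python
# Pine cone table from Example 14: (lower limit, upper limit, frequency),
# with lengths recorded to the nearest mm.
classes = [(30, 31, 2), (32, 33, 25), (34, 36, 30), (37, 39, 13)]


def grouped_mean(table):
    """Estimate the mean as sum(f * midpoint) / sum(f)."""
    total_f = sum(f for _, _, f in table)
    total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in table)
    return total_fx / total_f


def modal_class(table):
    """Return the limits of the class with the highest frequency."""
    lo, hi, _ = max(table, key=lambda row: row[2])
    return lo, hi


def interpolated_median(table, gap=1.0):
    """Estimate the median by linear interpolation inside its class.

    `gap` (an assumed parameter) is the distance between one class's upper
    limit and the next class's lower limit, so each class boundary sits
    gap / 2 beyond the class limit.
    """
    n = sum(f for _, _, f in table)
    target = n / 2  # position of the median, as in the notes
    cumulative = 0  # cumulative frequency below the current class
    for lo, hi, f in table:
        if cumulative + f >= target:
            lower_b = lo - gap / 2
            upper_b = hi + gap / 2
            proportion = (target - cumulative) / f
            return lower_b + proportion * (upper_b - lower_b)
        cumulative += f
    raise ValueError("table has no observations")


def combined_mean(groups):
    """Combine (size, mean) pairs using sum(x) = n * mean for each group."""
    total_n = sum(n for n, _ in groups)
    total_sum = sum(n * mean for n, mean in groups)
    return total_sum / total_n


print(modal_class(classes))                              # (34, 36)
print(round(grouped_mean(classes), 2))                   # 34.54
print(round(interpolated_median(classes), 2))            # 34.3
print(round(combined_mean([(25, 6.4), (30, 7.2)]), 2))   # 6.84
```

The combined mean works out as (25 × 6.4 + 30 × 7.2) ÷ 55 = 376 ÷ 55, which is about 6.84. Note that this is not the same as averaging 6.4 and 7.2, because the classes have different sizes.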