Section 2–1B: Class Frequency Tables for Discrete Quantitative Data

A Class Frequency Table is a table that lists several ranges of data values in the first column and, in the second column, the total number of times that the data values in each range occur.

The first column has a heading that describes what the data represents. We attach a variable x to this heading. The variable x represents all the different possible discrete numerical values the data can contain. Each row in the first column represents a range (or class) of numbers. Each class is written as a first number, a dash, and then a second number.

The first row in the table below has a class of 0 – 2. The 0 – 2 class represents any number in the data set from 0 to 2 inclusive (0, 1 or 2). Inclusive means that the first and last numbers are included. The lower limit is 0 and the upper limit is 2. The second row in the table below has a class of 3 – 5. The 3 – 5 class represents any number in the data set from 3 to 5 inclusive (3, 4 or 5). The lower limit is 3 and the upper limit is 5.

The second column contains the frequency. The frequency is the number of data values that fall in the given class. We use the notation freq (x) to represent the frequency for any specific class x from column one. The total of the second column is the total number of data values in the distribution. This means that if we have 25 data values then the total of the second column will be 25.

An Example of a Class Frequency Table

x: age of child    freq (x)
0 – 2              2
3 – 5              8
6 – 8              5
9 – 11             10
12 – 14            5
15 – 17            8
18 – 20            2

∑ freq(x) = 2 + 8 + 5 + 10 + 5 + 8 + 2 = 40

Section 2 – 1B Lecture Page 1 of 13 © 2012 Eitel

You lose some information with the use of a Class Frequency Table instead of a Frequency Table

You lose some information about the frequencies of the individual x values when you count a range of values as a single class.
The first two rows from the Class Frequency Table above are shown below.

x: age of child    freq (x)
0 – 2              2
3 – 5              8

The second class is written as 3 – 5 and stands for any child whose age is 3, 4 or 5 years old. Any child with one of those ages is counted as being in that class. The frequency for that class is 8. We cannot say exactly what ages the 8 children have. They could all be 3 years old. They could all be 5 years old. Many different data sets could produce a class of 3 – 5 with a frequency of 8. All we can say is that 8 people reported having children whose ages were either 3, 4 or 5 years old.

If we had used a Frequency Table with individual ages, as we did in the first set of frequency tables, we would have the exact frequency for each age. The problem with the use of a Frequency Table in this case is that it would have to be 21 rows long, one row for each age from 0 to 20. Many times we want a shorter table with fewer rows. A Class Frequency Table allows you to reduce the table to a manageable number of rows, but you will lose some of the individual information.

Interpreting a Class Frequency Table

The first column has a heading of the age of the children in your family. From the values of x in column one we can see that the data is grouped into classes.

The first class is written as 0 – 2 and stands for any child whose age is 0, 1 or 2 years old. Any child with one of those ages is counted as being in that class.
The second class is written as 3 – 5 and stands for any child whose age is 3, 4 or 5 years old. Any child with one of those ages is counted as being in that class.
The third class is written as 6 – 8 and stands for any child whose age is 6, 7 or 8 years old. Any child with one of those ages is counted as being in that class.
The other classes follow the same pattern.

The second column has a heading freq (x). The second column contains the frequency for each of the classes.
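The loss of information described above can be checked with a short Python sketch. The two data sets below are invented purely for illustration (they are not from the lecture), and the function name class_frequencies is a hypothetical helper. Both data sets produce the same class frequencies even though the individual ages are very different.

```python
def class_frequencies(data, classes):
    """Count the data values in each inclusive (lower, upper) class."""
    return [sum(lower <= value <= upper for value in data)
            for lower, upper in classes]

# The first two classes from the table: 0 - 2 and 3 - 5.
classes = [(0, 2), (3, 5)]

# Hypothetical data sets, made up for this illustration only.
all_threes = [1, 2] + [3] * 8   # every child in the 3 - 5 class is 3
all_fives = [0, 0] + [5] * 8    # every child in the 3 - 5 class is 5

print(class_frequencies(all_threes, classes))  # [2, 8]
print(class_frequencies(all_fives, classes))   # [2, 8]
```

Once the ages are grouped, the table alone cannot tell these two data sets apart.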
The chart to the right of the frequency table below shows how to interpret the numbers in the two columns.

x: age of child    freq (x)
0 – 2              2           2 people reported having children ages 0, 1, or 2
3 – 5              8           8 people reported having children ages 3, 4, or 5
6 – 8              5           5 people reported having children ages 6, 7, or 8
9 – 11             10          10 people reported having children ages 9, 10, or 11
12 – 14            5           5 people reported having children ages 12, 13, or 14
15 – 17            8           8 people reported having children ages 15, 16, or 17
18 – 20            2           2 people reported having children ages 18, 19, or 20

Upper and Lower Class Limits

Lower Class Limit: The smallest value that can belong to the class.
Upper Class Limit: The largest value that can belong to the class.

x: number of units    freq (x)
1 – 3                 2
4 – 6                 5
7 – 9                 1

The first class in the table is 1 – 3. This class starts at 1 and ends at 3. The lower class limit is 1 and the upper class limit is 3.
The second class in the table is 4 – 6. This class starts at 4 and ends at 6. The lower class limit is 4 and the upper class limit is 6.
The third class in the table is 7 – 9. This class starts at 7 and ends at 9. The lower class limit is 7 and the upper class limit is 9.

Class Width

x: number of units    freq (x)
1 – 3                 2
4 – 6                 5
7 – 9                 1

Method 1
If we are working with whole numbers then the class width is the number of values of x that each class contains. The easiest way to find the class width is to count how many values of x are in the class.

The first class contains the numbers 1, 2 and 3. It contains 3 numbers so the class width is 3.
The second class contains the numbers 4, 5 and 6. It contains 3 numbers so the class width is 3.
The third class contains the numbers 7, 8 and 9. It contains 3 numbers so the class width is 3.

This statistics course will require that all classes have the same width. If you find one class width then the other classes will have the same width.
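Method 1 can be sketched in a few lines of Python. This is only an illustration that assumes whole-number data; the function names class_values and class_width_by_counting are made up for this sketch.

```python
def class_values(lower_limit, upper_limit):
    """List every whole number in the class, including both limits."""
    return list(range(lower_limit, upper_limit + 1))

def class_width_by_counting(lower_limit, upper_limit):
    """Method 1: the class width is how many values the class contains."""
    return len(class_values(lower_limit, upper_limit))

# The classes from the table above: 1 - 3, 4 - 6 and 7 - 9.
classes = [(1, 3), (4, 6), (7, 9)]
widths = [class_width_by_counting(lower, upper) for lower, upper in classes]

print(class_values(1, 3))  # [1, 2, 3]
print(widths)              # [3, 3, 3]
```

Every class gives the same width of 3, which is the equal-width requirement stated above.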
Class Width

If the data consists of whole numbers then the easiest way to find the class width is to count how many numbers are in the class. There are two other methods that find the class width without counting the numbers in the class.

x: number of units    freq (x)
1 – 3                 2
4 – 6                 5
7 – 9                 1

Method 2
The class width is the difference between two consecutive lower class limits.

The first class has a lower class limit of 1 and the second class has a lower class limit of 4, so the class width is 4 – 1 = 3.
The second class has a lower class limit of 4 and the third class has a lower class limit of 7, so the class width is 7 – 4 = 3.

Method 3
class width = upper class limit − lower class limit + 1

The first class has an upper class limit of 3 and a lower class limit of 1, so the class width is 3 – 1 + 1 = 3.
The second class has an upper class limit of 6 and a lower class limit of 4, so the class width is 6 – 4 + 1 = 3.

Warning: If a class is listed as 1 – 3 then the values 1, 2 and 3 are in the class. There are three values in the class so the class width is 3. It is a common mistake to think that the class width is 3 minus 1, or 2. This is not true.

Every class should have the same width

It is important that each class be the same width to avoid misrepresenting the data. If the first class is 1 – 3 and the next class is 4 – 50 then you should not compare the frequencies of the two classes.

x: age of house    freq (x)
1 – 3              5
4 – 50             5

The 1 – 3 class contains 5 data values. The 4 – 50 class also contains 5 data values. It would not be fair to compare the frequencies of the two classes because the second class represents a much wider range of values. One half of the data falls into the first class even though it covers only 6% of the possible ages. One half of the data falls into the second class even though it covers 94% of the possible ages. It would be a mistake to compare the frequencies.
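Methods 2 and 3, and the 6% and 94% figures for the house ages, can be checked with the Python sketch below. The function names are hypothetical, and the percentages assume the possible ages are the whole numbers covered by the two classes in the table (1 through 50).

```python
def width_from_lower_limits(lower_limit, next_lower_limit):
    """Method 2: difference between two consecutive lower class limits."""
    return next_lower_limit - lower_limit

def width_from_class_limits(lower_limit, upper_limit):
    """Method 3: upper class limit - lower class limit + 1."""
    return upper_limit - lower_limit + 1

print(width_from_lower_limits(1, 4))  # 3
print(width_from_class_limits(1, 3))  # 3

# The unequal classes from the age-of-house table: 1 - 3 and 4 - 50.
classes = [(1, 3), (4, 50)]
widths = [width_from_class_limits(lower, upper) for lower, upper in classes]
total_ages = sum(widths)  # number of possible whole-number ages covered

# Percent of the possible ages that each class covers.
shares = [round(100 * width / total_ages) for width in widths]
print(shares)  # [6, 94]
```

Each class holds half of the data, yet the second class covers far more of the possible ages, which is why the two frequencies should not be compared.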
The first class has houses that are all almost brand new. These homes are very close in age. The 1 – 3 class can represent all of these homes because they do not vary much in age. With a class width of 47 years (50 – 4 + 1) there is a much wider range of ages in the second class than in the first class. The 4 – 50 class cannot represent its homes in the same way that the 1 – 3 class represents its homes. You should not make comparisons about the relative frequencies of classes unless each class has the same width.

Example 1: Completing a Class Frequency Table

A class frequency table for the number of miles from work a person lives is shown below. Each of the classes is given. The class width is 10.

x: Miles from work    freq (x)
0 – 9
10 – 19
20 – 29

The first lower class limit is 0 and the first upper class limit is 9.
The second lower class limit is 10 and the second upper class limit is 19.
The third lower class limit is 20 and the third upper class limit is 29.

Complete the Class Frequency Table for the data shown below.

20 state workers reported the number of miles away from work they live. The data is listed below.

11  0 21  4 11 23  5 12  5 20
 9 11 17 11  8 10 13  9 18 14

We want to know how many of the numbers in the data set are in each of the listed classes. We can do this with “tally marks” or just count the number of data values that fall in each class.

0 – 9:     ||||| ||
10 – 19:   ||||| |||||
20 – 29:   |||

x: Miles from work    freq (x)
0 – 9                 7
10 – 19               10
20 – 29               3

Lower class limits = 0, 10, 20
Upper class limits = 9, 19, 29
The class width is 10
∑ freq(x) = 7 + 10 + 3 = 20

Deciding on the number of classes to use

Too many classes can lead to a very long table and may overwhelm the user. Many books suggest keeping the number of classes to 10 or fewer. There is no hard and fast rule about the number of classes to use. This book will try to use tables that have between 5 and 10 rows.
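The tallying step in Example 1 (miles from work) can be sketched in Python. This is one possible way to do the count, not a required method; the function name tally is a hypothetical helper, and the data and classes are copied from the example.

```python
# Miles from work reported by the 20 state workers in Example 1.
miles = [11, 0, 21, 4, 11, 23, 5, 12, 5, 20,
         9, 11, 17, 11, 8, 10, 13, 9, 18, 14]

# The given classes as (lower class limit, upper class limit) pairs.
classes = [(0, 9), (10, 19), (20, 29)]

def tally(data, classes):
    """Return the frequency of each inclusive class, in table order."""
    freq = {one_class: 0 for one_class in classes}
    for value in data:
        for lower, upper in classes:
            if lower <= value <= upper:
                freq[(lower, upper)] += 1
                break  # each data value belongs to exactly one class
    return freq

table = tally(miles, classes)
for (lower, upper), count in table.items():
    print(f"{lower} - {upper}: {count}")

# The frequencies must add up to the number of data values.
print(sum(table.values()) == len(miles))
```

Comparing the sum of the frequencies with the number of data values is a quick check that no value was missed or counted twice.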
Deciding on the width of each class

11  0 31  4 41 43  5 42 25 50
 9 31 17 21  8 40 13 41 18 14
 6  2 15 16 22  0  8 16 19 24
46 23 22 38 26 18 50 27 29 26

Consider the data set above. It ranges from 0 to 50. Using a small class width, with classes like 0 – 1 and 2 – 3 and 4 – 5, will result in too many classes. If each class has only 2 members the table would need 26 classes to reach 50. This table would not be a very good summary of the data. Using only two large classes like 0 – 24 and 25 – 50 would not be very useful either. That would break the data up into only 2 classes, which is too much of a summary.

The number of classes and the class width are related.
For any given set of data, if the number of classes increases the class width decreases.
For any given set of data, if the class width increases the number of classes decreases.

Select equal class widths that produce “nice” class limits and that also produce from 5 to 10 classes

There is no hard and fast rule for the correct class width or number of classes. Choose a class width that will produce “nice” class limits but that will also give you from 5 to 10 classes based on the spread of the data.

Class width of 5 starting at 0:     0 – 4     5 – 9      10 – 14    15 – 19
Class width of 10 starting at 0:    0 – 9     10 – 19    20 – 29    30 – 39
Class width of 5 starting at 1:     1 – 5     6 – 10     11 – 15    16 – 20
Class width of 10 starting at 1:    1 – 10    11 – 20    21 – 30    31 – 40

Finding the range of the data

Range of Data = Largest data value – Smallest data value

Class Width

A general formula for determining a class width for a range of data, given the number of classes desired:

class width ≈ (largest data value − smallest data value) / (number of classes desired), and then round up to the next whole number

If the number of classes must be exactly the desired number used in the formula, then round the number this formula gives up to the next whole number and use that as the class width.
If you round down, the desired number of classes may not cover all of the data, so you may need more classes than desired. You may still want to round the result of this formula down to a number that will produce “nice” class limits. If the round off is large then you may not get the exact number of classes desired. If the class limits are more important than the number of classes then use this technique.

Example 1: Finding the Class Limits

Given: The data ranges from 0 to 19.
Range of Data = Largest data value − Smallest data value = 19 − 0 = 19

I decide that I want 4 classes in my class frequency table.

class width ≈ (largest data value − smallest data value) / (number of classes desired)
class width ≈ (19 − 0) / 4 = 4.75 ≈ 5

The data starts at 0 so that is the first lower class limit. The other lower class limits will increase from 0 by the class width of 5.
Lower class limits = 0, 5, 10, 15

The second lower class limit is 5, so the first upper class limit must be 4. The other upper class limits will increase from 4 by the class width of 5.
Upper class limits = 4, 9, 14, 19

0 ≤ x ≤ 4 and 5 ≤ x ≤ 9 and 10 ≤ x ≤ 14 and 15 ≤ x ≤ 19

x          freq. (x)
0 – 4
5 – 9
10 – 14
15 – 19

Notice: The lower class limits start at 0 and increase by 5. The upper class limits start at 4 and increase by 5. This pattern must happen if the limits are correct.

Example 2: Finding the Class Limits

Given: The data ranges from 1 to 60.
Range of Data = Largest data value − Smallest data value = 60 − 1 = 59

I decide that I want 6 classes in my class frequency table.

class width ≈ (largest data value − smallest data value) / (number of classes desired)
class width ≈ (60 − 1) / 6 ≈ 9.8 ≈ 10

The data starts at 1 so that is the first lower class limit. The other lower class limits will increase from 1 by the class width of 10.
Lower class limits = 1, 11, 21, 31, 41, 51

The second lower class limit is 11, so the first upper class limit must be 10.
The other upper class limits will increase from 10 by the class width of 10.
Upper class limits = 10, 20, 30, 40, 50, 60

1 ≤ x ≤ 10 and 11 ≤ x ≤ 20 and 21 ≤ x ≤ 30 and 31 ≤ x ≤ 40 and 41 ≤ x ≤ 50 and 51 ≤ x ≤ 60

x          freq. (x)
1 – 10
11 – 20
21 – 30
31 – 40
41 – 50
51 – 60

Notice: The lower class limits start at 1 and increase by 10. The upper class limits start at 10 and increase by 10. This pattern must happen if the limits are correct.

Example 1: Creating a Class Frequency Table

6 0 3 12 1 22 3 24 23 11 22 16 8 22 9

The data ranges from 0 to 24. I decide I want 5 classes in the table.

class width ≈ (24 − 0) / 5 = 4.8 ≈ 5

The data starts at 0 so the first lower class limit is 0. The class width is 5 so there will be 5 numbers in each class.

If the first class is to start at 0 and contain 5 values the first class will be 0 – 4. (0, 1, 2, 3, 4)
If the second class is to start at 5 and contain 5 values the second class will be 5 – 9. (5, 6, 7, 8, 9)
If the third class is to start at 10 and contain 5 values the third class will be 10 – 14. (10, 11, 12, 13, 14)
If the fourth class is to start at 15 and contain 5 values the fourth class will be 15 – 19. (15, 16, 17, 18, 19)
If the fifth class is to start at 20 and contain 5 values the fifth class will be 20 – 24. (20, 21, 22, 23, 24)

Lower class limits = 0, 5, 10, 15, 20
Upper class limits = 4, 9, 14, 19, 24

We want to know how many of the numbers in the data set are in each of the listed classes. We can do this with “tally marks” or just count the number of data values that fall in each class.

0 – 4:     ||||
5 – 9:     |||
10 – 14:   ||
15 – 19:   |
20 – 24:   |||||

x:         freq (x)
0 – 4      4
5 – 9      3
10 – 14    2
15 – 19    1
20 – 24    5

∑ freq(x) = 4 + 3 + 2 + 1 + 5 = 15

Notice a pattern: The lower class limits are 0, 5, 10, 15 and 20. If the class width is 5 then each lower class limit increases by 5 starting at the first lower class limit of 0. The upper class limits are 4, 9, 14, 19 and 24.
If the class width is 5 then each upper class limit increases by 5 starting at the first upper class limit of 4.

Example 2: Creating a Class Frequency Table

20 students in my statistics class last fall reported the number of hours of TV that they had watched during finals week. The data is listed below.

10 20 35 16 22 41  0 16 19 24
46 23 22 38 26 18 30 27 49  6

The data ranges from 0 to 49. I decide I want 5 classes in the table.

class width ≈ (49 − 0) / 5 = 9.8 ≈ 10

The data starts at 0 so the first lower class limit is 0. The other lower class limits will increase from 0 by the class width of 10.
Lower class limits = 0, 10, 20, 30, 40

The second lower class limit is 10, so the first upper class limit must be 9. The other upper class limits will increase from 9 by the class width of 10.
Upper class limits = 9, 19, 29, 39, 49

The class limits are 0 ≤ x ≤ 9 and 10 ≤ x ≤ 19 and 20 ≤ x ≤ 29 and 30 ≤ x ≤ 39 and 40 ≤ x ≤ 49

We want to know how many of the numbers in the data set are in each of the listed classes. We can do this with “tally marks” or just count the number of data values that fall in each class.

0 – 9:     ||
10 – 19:   |||||
20 – 29:   ||||| ||
30 – 39:   |||
40 – 49:   |||

The first column has a heading of the number of hours of TV watched. The second column has a heading freq (x). The second column contains the frequency for each of the classes.

x: hours of TV watched    freq (x)
0 – 9                     2
10 – 19                   5
20 – 29                   7
30 – 39                   3
40 – 49                   3

∑ freq(x) = 2 + 5 + 7 + 3 + 3 = 20
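The whole procedure (class width, class limits, then frequencies) can be sketched in Python using the TV data from Example 2. This is a minimal sketch under stated assumptions, not the only way to build the table. The function names are hypothetical, the data is assumed to be whole numbers, and "round up to the next whole number" is read here as the whole number just above the quotient, which for these examples gives the same result as ordinary rounding up.

```python
def class_width(smallest, largest, num_classes):
    """(largest - smallest) / num_classes, moved up to the next whole number.

    Assumption: a quotient that is already whole is still moved up by one,
    so that the largest data value always fits in the last class.
    """
    return (largest - smallest) // num_classes + 1

def class_limits(first_lower_limit, width, num_classes):
    """Lower limits increase by the width; each upper limit is lower + width - 1."""
    lowers = [first_lower_limit + i * width for i in range(num_classes)]
    return [(lower, lower + width - 1) for lower in lowers]

def class_frequency_table(data, num_classes):
    """Return (lower limit, upper limit, frequency) for each class."""
    width = class_width(min(data), max(data), num_classes)
    limits = class_limits(min(data), width, num_classes)
    return [(lower, upper, sum(lower <= x <= upper for x in data))
            for lower, upper in limits]

# Hours of TV watched, copied from Example 2.
tv_hours = [10, 20, 35, 16, 22, 41, 0, 16, 19, 24,
            46, 23, 22, 38, 26, 18, 30, 27, 49, 6]

table = class_frequency_table(tv_hours, 5)
for lower, upper, freq in table:
    print(f"{lower} - {upper}: {freq}")

# The frequencies must add up to the number of data values.
print(sum(freq for _, _, freq in table) == len(tv_hours))
```

The same functions applied to the ranges in the two "Finding the Class Limits" examples give class widths of 5 and 10.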