Frequency Distributions

Example
The following are the scores of 30 college students in a statistics test:

75 52 80 96 65 79 71 87 93 95
69 72 81 61 76 86 79 68 50 92
83 84 77 64 71 87 72 92 57 98

Construct a stem-and-leaf display.

Figure: Stem-and-leaf display of test scores.

5 | 2 0 7
6 | 5 9 1 8 4
7 | 5 9 1 2 6 9 7 1 2
8 | 0 7 1 6 3 4 7
9 | 6 3 5 2 2 8

Example
The following data are monthly rents paid by a sample of 30 households selected from a small city.

 880 1210 1151 1081  985  630  721 1231 1175 1075
 932  952 1023  775  850  825 1100 1140 1235 1000
 750  750  915 1140  965 1191 1370  960 1035 1280

Construct a stem-and-leaf display for these data.

Solution
Stem-and-leaf display of rents.

 6 | 30
 7 | 75 50 21 50
 8 | 80 25 50
 9 | 32 52 15 60 85 65
10 | 23 81 35 75 00
11 | 91 51 40 75 40 00
12 | 10 31 35 80
13 | 70

Exercise
Develop your own stem-and-leaf plot with the following temperatures for June.

77 57 67 87 80 80 50 70 80 77
82 62 62 82 68 61 65 83 65 70
65 79 59 69 73 79 61 64 76 71

Example
The following stem-and-leaf display is prepared for the number of hours that 25 students spent working on computers during the last month. Prepare a new stem-and-leaf display by grouping the stems.

0 | 6
1 | 1 7 9
2 | 2 6
3 | 2 4 7 8
4 | 1 5 6 9 9
5 | 3 6 8
6 | 2 4 4 5 7
7 |
8 | 5 6

Solution
Grouped stem-and-leaf display. An asterisk (*) separates the leaves of consecutive stems within a group.

0–2 | 6 * 1 7 9 * 2 6
3–5 | 2 4 7 8 * 1 5 6 9 9 * 3 6 8
6–8 | 2 4 4 5 7 * * 5 6

Grouped Data vs. Ungrouped Data
Ungrouped data: Data that has not been organized into groups. Also called raw data.
Grouped data: Data that has been organized into groups (into a frequency distribution).

Data   Frequency
2      8
3      4
5      6
7      7
8      2
9      5

Data     Frequency
2–4      5
5–7      6
8–10     10
11–13    8
14–16    4
17–19    3

Creating a Categorical Ungrouped Frequency Distribution
Step 1: Make a table with the following columns in order: class, tally, and frequency.
Step 2: Tally the data and place the results in the tally column.
Step 3: Count the tallies and place the results in the frequency column.
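The three steps above can be sketched in Python. This is only an illustration: the `responses` list is hypothetical data invented for the sketch, and `frequency_table` is a helper name chosen here, not something from the slides. Tallies are printed as plain strokes, without the usual grouping by fives.

```python
from collections import Counter

# Hypothetical categorical responses, invented for illustration only.
responses = ["Cat", "Dog", "Dog", "Fish", "Cat", "Dog", "Bird", "Dog"]


def frequency_table(values):
    """Return (class, tally, frequency) rows in order of first appearance."""
    counts = Counter(values)  # Steps 2 and 3: tally the data, then count
    return [(cls, "|" * n, n) for cls, n in counts.items()]


rows = frequency_table(responses)

# Step 1: the three columns, in order: class, tally, frequency.
print(f"{'Class':<6} {'Tally':<6} Frequency")
for cls, tally, f in rows:
    print(f"{cls:<6} {tally:<6} {f}")
print("Total =", sum(f for _, _, f in rows))
```

The same helper works for numeric ungrouped data (for example, test marks out of 10), since each distinct value is simply treated as its own class.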
Example
Below are the marks of 35 students in an English test (out of 10). Arrange these marks in tabular form using tally marks.

5, 8, 7, 6, 10, 8, 2, 4, 6, 3, 7, 5, 8, 5, 1, 7, 4, 6, 3, 5, 2, 8, 4, 2, 6, 4, 2, 8, 9, 5, 4, 7, 5, 5, 8

Example
Consider the following data:

2, 3, 3, 5, 7, 9, 7, 8, 9, 9, 2, 5, 3, 9, 3, 2, 5, 9, 8, 7, 3, 5, 7, 9, 8, 5, 2, 3

Design a frequency table for the above data.

Example
These are the favorite colors of fifteen 2nd graders.

Red     Yellow   Green   Red    Blue
Blue    Red      Red     Green  Red
Green   Yellow   Red     Blue   Green

Class     Tally     Frequency


                    Total =

Grouped Frequency Distribution
• When the range of the data is large, the data must be grouped into classes.

41 105 109 104 57 99 112 107 105
118 67 99 87 78 101 95 125 92

Key Concept: Class Width
• The class width is the range of the class.
• It can be found by subtracting the lower class limit of one class from the lower class limit of the next class.
• When the classes still have to be chosen, it is calculated from the data:

Class width = (Upper boundary − Lower boundary) / (# of classes)

where the upper and lower boundaries are the largest and smallest values in the data set.

Calculating Class Midpoint or Mark

Class midpoint or mark = (Lower limit + Upper limit) / 2

Rules for Grouped Data
Rule #1: Choose the classes. You will normally be told how many classes you need.
Rule #2: Choose the class width. ALWAYS round up to the next whole number.
Rule #3: Mutually exclusive. The class limits cannot overlap, so no data value can fall into more than one class.
Rule #4: Continuous. There should be no gaps in a frequency distribution. Even if there are no values in a class, the class must be included in the frequency distribution (with a frequency of zero).
Rule #5: Exhaustive. There should be enough classes to accommodate all of the data.
Rule #6: Equal width. This avoids a distorted view of the data.
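As a rough sketch of Rules #2, #3 and #6, the functions below build equal-width, non-overlapping class limits and their midpoints. The function names are chosen here for illustration, integer data is assumed, and "round up to the next whole number" is read as "take the whole part and add one", which is an interpretation rather than something the rules spell out. The inputs 67, 125 and 5 are the minimum, maximum and number of classes from the phone-minutes example.

```python
def class_width(minimum, maximum, num_classes):
    """Range divided by the number of classes, rounded up to the next
    whole number (assumes integer data; an exact quotient is also bumped
    up by one so that the maximum still fits in the last class)."""
    return (maximum - minimum) // num_classes + 1


def class_limits(minimum, maximum, num_classes):
    """Return (lower limit, upper limit) pairs of equal width, no overlap."""
    width = class_width(minimum, maximum, num_classes)
    limits = []
    lower = minimum
    for _ in range(num_classes):
        upper = lower + width - 1      # classes are inclusive at both ends
        limits.append((lower, upper))
        lower = upper + 1              # next class starts right after
    return limits


def midpoints(limits):
    """Class midpoint = (lower limit + upper limit) / 2."""
    return [(lo + hi) / 2 for lo, hi in limits]


limits = class_limits(67, 125, 5)
for (lo, hi), mid in zip(limits, midpoints(limits)):
    print(f"{lo}-{hi}  midpoint {mid}")
```

With these inputs the width is 12 and the last class ends at 126, so every value up to the maximum of 125 has a class (Rule #5).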
Table: Class Limits, Class Widths, and Class Midpoints

Class Limits     Class Width    Class Midpoint
400 to 600       200            500
601 to 800       200            700.5
801 to 1000      200            900.5
1001 to 1200     200            1100.5
1201 to 1400     200            1300.5
1401 to 1600     200            1500.5

Minutes Spent on the Phone

102  71 103 105 109 124 104 116  97  99
108  86 103 112 118  87  85 122  87 107
 67  78 105  99 101  82  95 100 125  92

Make a frequency distribution table with five classes.

Minimum value = 67
Maximum value = 125

Construct a Frequency Distribution Table
Minimum = 67, Maximum = 125
Number of classes = 5
Class width = (125 − 67) / 5 = 11.6, rounded up to 12

Class Limits    Tally    f
67–78                    3
79–90                    5
91–102                   8
103–114                  9
115–126                  5
                         Total = 30

Example
Construct a grouped frequency table for the following data:

8, 10, 43, 15, 22, 34, 23, 45, 28, 49, 30, 21, 29, 17, 33, 39, 41, 48, 33, 25

Example
• The following table gives the total home runs hit by all players of each of the 30 Major League Baseball teams during the 2002 season. Construct a frequency distribution table.

Table: Home Runs Hit by Major League Baseball Teams During the 2002 Season

Team                 Home Runs    Team                 Home Runs
Anaheim              152          Milwaukee            139
Arizona              165          Minnesota            167
Atlanta              164          Montreal             162
Baltimore            165          New York Mets        160
Boston               177          New York Yankees     223
Chicago Cubs         200          Oakland              205
Chicago White Sox    217          Philadelphia         165
Cincinnati           169          Pittsburgh           142
Cleveland            192          St. Louis            175
Colorado             152          San Diego            136
Detroit              124          San Francisco        198
Florida              146          Seattle              152
Houston              167          Tampa Bay            133
Kansas City          140          Texas                230
Los Angeles          155          Toronto              187

Solution

Approximate width of each class = (230 − 124) / 5 = 21.2

Now we round this approximate width to a convenient number, say 22.
• Then our classes will be 124–145, 146–167, 168–189, 190–211, 212–233.

Table: Frequency Distribution for the Home Run Data

Total Home Runs    Tally            f
124–145            |||| |           6
146–167            |||| |||| |||    13
168–189            ||||             4
190–211            ||||             4
212–233            |||              3
                                    ∑f = 30

Relative Frequency and Percentage Distributions

Calculating Relative Frequency of a Category

Relative frequency of a category = (Frequency of that category) / (Sum of all frequencies)

Calculating Percentage

Percentage = (Relative frequency) × 100

Solution

Table: Relative Frequency and Percentage Distributions for the Home Run Data

Total Home Runs    f     Relative Frequency    Percentage
124–145            6     .200                  20.0
146–167            13    .433                  43.3
168–189            4     .133                  13.3
190–211            4     .133                  13.3
212–233            3     .100                  10.0
                   ∑f = 30    Sum = .999       Sum = 99.9%

The sums fall slightly short of 1.00 and 100% because the individual relative frequencies are rounded to three decimal places.

Example
After conducting a survey of 30 people, you are left with the following set of data on how many days off each person has taken this year:

7, 8, 9, 4, 10, 36, 19, 9, 26, 5, 11, 6, 2, 9, 10, 8, 16, 29, 7, 9, 8, 25, 4, 27, 8, 7, 6, 10, 34, 8

Construct a frequency table. Assume you want to divide the data into 5 different classes.

Answer

Class Limits    Tally    Frequency
2–8                      14
9–15                     8
16–22                    2
23–29                    4
30–36                    2
                         Total: 30

Example

Somewhat   None       Somewhat   Very       Very       None
Very       Somewhat   Somewhat   Very       Somewhat   Somewhat
Very       Somewhat   None       Very       None       Somewhat
Somewhat   Very       Somewhat   Somewhat   Very       None
Somewhat   Very       Very       Somewhat   None       Somewhat

Construct an ungrouped frequency distribution table for these data.
Solution

Table: Frequency Distribution of Stress on Job

Stress on Job    Tally             Frequency (f)
Very             |||| ||||         10
Somewhat         |||| |||| ||||    14
None             |||| |            6
                                   Sum = 30

Example
• Determine the relative frequencies and percentages for the data in the previous table.

Table: Relative Frequency and Percentage Distributions of Stress on Job

Stress on Job    Frequency (f)    Relative Frequency    Percentage
Very             10               10/30 = .333          .333(100) = 33.3
Somewhat         14               14/30 = .467          .467(100) = 46.7
None             6                6/30 = .200           .200(100) = 20.0
                 Sum = 30         Sum = 1.00            Sum = 100

Example
The following data give the average travel time from home to work (in minutes) for 50 states. The data are based on a sample survey of 700,000 households conducted by the Census Bureau (USA TODAY, August 6, 2001).

22.4 19.7 21.6 15.4 21.1 18.2 27.0 21.9 22.1 25.4
23.7 21.7 23.2 19.6 24.9 19.8 17.6 16.0 21.4 25.5
26.7 17.7 16.1 23.8 20.1 23.4 22.5 22.3 21.9 17.1
23.5 23.7 24.4 21.9 22.5 21.2 28.7 15.6 24.3 29.2
19.9 22.7 26.7 26.1 31.2 23.6 24.2 22.7 22.6 20.8

Construct a frequency distribution table. Calculate the relative frequencies and percentages for all classes.

Solution

Approximate width of each class = (31.2 − 15.4) / 6 = 2.63

Rounding this up to a convenient width of 3 and starting the first class at 15 gives the classes in the table below.

Table: Frequency, Relative Frequency, and Percentage Distributions of Average Travel Time to Work

Class Boundaries       f     Relative Frequency    Percentage
15 to less than 18     7     .14                   14
18 to less than 21     7     .14                   14
21 to less than 24     23    .46                   46
24 to less than 27     9     .18                   18
27 to less than 30     3     .06                   6
30 to less than 33     1     .02                   2
                       Σf = 50    Sum = 1.00       Sum = 100%

Example
The administration in a large city wanted to know the distribution of vehicles owned by households in that city. A sample of 40 randomly selected households from this city produced the following data on the number of vehicles owned:

5 1 1 2 0 1 1 2 1 1
1 3 3 0 2 5 1 2 3 4
2 1 2 2 1 2 2 1 1 1
4 2 1 1 2 1 1 4 1 3

• Construct a frequency distribution table for these data, and draw a bar graph.
Solution

Table: Frequency Distribution of Vehicles Owned

Vehicles Owned    Number of Households (f)
0                 2
1                 18
2                 11
3                 4
4                 3
5                 2
                  Σf = 40

Figure: Bar graph for the frequency distribution of vehicles owned (x-axis: vehicles owned, from No Car to 5 Cars; y-axis: frequency, from 0 to 20).

CUMULATIVE FREQUENCY DISTRIBUTIONS

Definition
A cumulative frequency distribution gives the total number of values that fall below the upper boundary of each class.

Example
Using the frequency distribution from the previous home run example, reproduced below, prepare a cumulative frequency distribution for the home runs hit by Major League Baseball teams during the 2002 season.

Total Home Runs    f
124–145            6
146–167            13
168–189            4
190–211            4
212–233            3

Solution

Table: Cumulative Frequency Distribution of Home Runs by Baseball Teams

Class Limits    f     Cumulative Frequency
124–145         6     6
124–167         13    6 + 13 = 19
124–189         4     6 + 13 + 4 = 23
124–211         4     6 + 13 + 4 + 4 = 27
124–233         3     6 + 13 + 4 + 4 + 3 = 30

Calculating Cumulative Relative Frequency and Cumulative Percentage

Cumulative relative frequency = (Cumulative frequency of a class) / (Total observations in the data set)

Cumulative percentage = (Cumulative relative frequency) × 100

Table: Cumulative Relative Frequency and Cumulative Percentage Distributions for Home Runs Hit by Baseball Teams

Class Limits    Cumulative Relative Frequency    Cumulative Percentage
124–145         6/30 = .200                      20.0
124–167         19/30 = .633                     63.3
124–189         23/30 = .767                     76.7
124–211         27/30 = .900                     90.0
124–233         30/30 = 1.00                     100.0

Definition
An ogive is a curve drawn for the cumulative frequency distribution by joining with straight lines the dots marked above the upper boundaries of classes at heights equal to the cumulative frequencies of the respective classes.
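The cumulative columns for the home run data can be reproduced with a running sum. The sketch below is one way to do it, not the method used to build the slides; `cumulative_table` and `ogive_points` are names chosen here, and each upper class boundary is assumed to be the upper class limit plus 0.5.

```python
from itertools import accumulate

# Upper class limits and frequencies from the home run frequency table.
upper_limits = [145, 167, 189, 211, 233]
frequencies = [6, 13, 4, 4, 3]


def cumulative_table(freqs):
    """Return (cumulative f, cumulative relative f, cumulative %) rows."""
    total = sum(freqs)
    rows = []
    for cum in accumulate(freqs):          # running total of frequencies
        rel = cum / total                  # cumulative relative frequency
        rows.append((cum, round(rel, 3), round(rel * 100, 1)))
    return rows


table = cumulative_table(frequencies)

# Points an ogive would join: (upper class boundary, cumulative frequency).
# Assumption: upper boundary = upper class limit + 0.5.
ogive_points = [(upper + 0.5, cum)
                for upper, (cum, _, _) in zip(upper_limits, table)]

for upper, (cum, rel, pct) in zip(upper_limits, table):
    print(f"124-{upper}: cumulative f = {cum}, relative = {rel}, percent = {pct}")
```

The last cumulative frequency always equals the total number of observations, which is a quick check on the arithmetic.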
Figure: Ogive for the cumulative frequency distribution of home runs (x-axis: total home runs, with class boundaries 123.5, 145.5, 167.5, 189.5, 211.5, 233.5; y-axis: cumulative frequency, from 5 to 30).

Shape
• A graph shows the shape of the distribution.
• A distribution is symmetrical if the left side of the graph is (roughly) a mirror image of the right side.
• One example of a symmetrical distribution is the bell-shaped normal distribution.
• On the other hand, distributions are skewed when scores pile up on one side of the distribution, leaving a "tail" of a few extreme values on the other side.

Positively and Negatively Skewed Distributions
• In a positively skewed distribution, the scores tend to pile up on the left side of the distribution with the tail tapering off to the right.
• In a negatively skewed distribution, the scores tend to pile up on the right side and the tail points to the left.