Statistics (Stat 101)
Sameh Saadeldin Ahmed
Associate Professor of Environmental Eng.
Civil Engineering Department, Engineering College, Almajma’ah University
smohamed1@ksu.edu.sa
http://faculty.ksu.edu.sa/SaMeH
Stat 101 2010/2011 Dr SaMeH

2.2 Organizing and Graphing Qualitative Data
2.2.1 Frequency Distribution
2.2.2 Relative Frequency & Percentage
2.2.3 Graphical Presentation of Qualitative Data
• Bar Graphs
• Pie Chart

2.3 Organizing and Graphing Quantitative Data
2.3.1 Frequency Distribution
2.3.2 Constructing Frequency Distribution Tables
2.3.3 Relative Frequency & Percentage Distributions
2.3.4 Graphing Grouped Data

2.3.2 Constructing Frequency Distribution Tables
• To construct a frequency distribution table, you have to make the following three decisions:
  • Number of classes
  • Class width
  • Lower limit of the first class, or the starting point

Number of Classes
Usually the number of classes for a frequency distribution table varies from 5 to 20. The number of classes is chosen by the person organizing the data; there is no fixed rule.

Class Width
It is preferable to have the same width for all classes. To do so, find the difference between the largest and smallest values in the data. The approximate width of a class is then obtained by dividing this difference by the desired number of classes.

Calculating the Class Width
Approximate class width = [Largest value - Smallest value] / Number of classes

Lower Limit of the First Class or the Starting Point
Any convenient number that is equal to or less than the smallest value in the data set can be used as the lower limit of the first class.
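
The three decisions above can be sketched in Python. This is a minimal sketch, not a prescribed procedure: the function names, the small data set, and the choice to round the width up with math.ceil are all assumptions made for illustration.

```python
import math

def approximate_class_width(values, num_classes):
    """(largest value - smallest value) / number of classes."""
    return (max(values) - min(values)) / num_classes

def class_limits(start, width, num_classes):
    """Lower and upper limit of each class, all with the same width."""
    return [(start + i * width, start + (i + 1) * width)
            for i in range(num_classes)]

# Hypothetical data, invented for this sketch.
values = [12, 46, 23, 35, 18, 41, 29, 15, 38, 26]

num_classes = 5                                        # decision 1: chosen arbitrarily
approx = approximate_class_width(values, num_classes)  # (46 - 12) / 5
width = math.ceil(approx)                              # decision 2: rounded up to a convenient number
start = 12                                             # decision 3: equal to or less than the smallest value
limits = class_limits(start, width, num_classes)

# The last class has to reach the largest value; if it does not,
# a wider class or an extra class is needed.
assert limits[-1][1] >= max(values)

for lower, upper in limits:
    print(f"{lower} to less than {upper}")
```

Rounding the width up rather than down is a deliberate choice in this sketch: it keeps the chosen number of classes wide enough to cover the whole range of the data.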

2.3.3 Relative Frequency and Percentage Distributions
Calculating Relative Frequency and Percentage
Relative frequency of a class = Frequency of that class / Sum of all frequencies = f / ∑f
Percentage = (Relative frequency) x 100

Exercise
Calculate the relative frequencies and percentages for the following table:

Total goals scored    f
124 – 145             6
146 – 167             13
168 – 189             4
190 – 211             4
212 – 233             3

2.3.4 Graphing Grouped Data
Grouped quantitative data can be displayed in a histogram or a polygon.

Histogram
A histogram is a graph in which classes are marked on the horizontal axis and the frequencies, relative frequencies, or percentages are marked on the vertical axis. The frequencies, relative frequencies, or percentages are represented by the heights of the bars. In a histogram the bars are drawn adjacent to each other.

Figure: A histogram for the frequency distribution
Figure: A histogram for the relative frequency
Figure: A histogram for the percentages

Polygons
A graph formed by joining the midpoints of the tops of successive bars in a histogram with straight lines is called a polygon.

2.3.5 More on Classes and Frequency Distribution
• Less than Method for Writing Classes
• Single-Valued Classes

Example 2.6: Less than method
The following data give the average travel time from home to work (in minutes) for 50 cities.

22.4  18.2  23.7  19.8  26.7  23.4  23.5  22.5  24.3  26.7
24.2  26.1  19.9  15.6  19.7  22.7  31.2  22.7  27.0  21.6
22.6  23.6  21.7  21.9  15.4  20.8  17.6  23.2  22.1  21.1
17.7  16.0  19.6  25.4  22.5  16.1  21.4  24.9  23.7  22.3
23.8  25.5  21.2  24.4  21.9  20.1  29.2  28.7  21.9  17.1

1. Construct a frequency distribution table.
2. Calculate the relative frequencies and percentages for all classes.
3. Plot the histograms and polygons for the frequencies and percentages.

Solution
The minimum value is 15.4 and the maximum
value is 31.2. Suppose we decide to group these data using six classes of equal width. Then,
Approximate width of each class = [31.2 – 15.4] / 6 = 2.63
We round this number up to a more convenient number, 3. Starting the first class at 15, the classes are written as 15 to less than 18, 18 to less than 21, and so on.

Sturges’ formula can be used to decide on the number of classes: c = 1 + 3.3 log n, where n is the number of observations and log is the base-10 logarithm. For n = 50 this gives c ≈ 6.6, close to the six classes used here.

Average travel time to work (minutes)   Frequency (f)   Relative Frequency   Percentage
15 to less than 18                      7               0.14                 14
18 to less than 21                      7               0.14                 14
21 to less than 24                      23              0.46                 46
24 to less than 27                      9               0.18                 18
27 to less than 30                      3               0.06                 6
30 to less than 33                      1               0.02                 2
SUM                                     50              1.00                 100%

Example 2.7: Single-valued classes
In this case we use classes that are made of single values rather than intervals. This is useful for discrete data with only a few possible values. See the following example.
The governorate of Almajmaa’h city wanted to know the distribution of the computer sets owned by families in the city. A sample of 40 randomly selected houses from the city produced the following data on the number of computers owned.

5 1 1 1 1 1 1 2 2 1
1 1 3 2 2 2 3 4 1 1
0 3 2 1 1 1 0 1 1 4
1 2 2 4 1 2 5 2 2 3
5 1 1 1 1 1 1 2 2 1

Construct the frequency distribution table for these data using single-valued classes.

Solution:
The observations in this data set assume only 6 distinct values: 0, 1, 2, 3, 4 and 5. Each of these values is used as a class in the frequency distribution.
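
The tallying step for single-valued classes can be sketched in Python as follows. This is only an illustrative sketch: the function name single_valued_table and the short sample list are made up for the example and are not the survey data.

```python
from collections import Counter

def single_valued_table(observations):
    """One row per distinct value: (value, f, relative frequency, percentage)."""
    counts = Counter(observations)          # frequency f of each distinct value
    total = len(observations)               # sum of all frequencies
    rows = []
    for value in sorted(counts):
        f = counts[value]
        rows.append((value, f, f / total, 100 * f / total))
    return rows

# Hypothetical observations, invented for this sketch.
sample = [0, 1, 1, 2, 1, 3, 2, 1, 0, 1]
rows = single_valued_table(sample)

print(f"{'Value':>5} {'f':>4} {'Rel. freq.':>11} {'Percentage':>11}")
for value, f, rel, pct in rows:
    print(f"{value:>5} {f:>4} {rel:>11.3f} {pct:>11.1f}")
```

Each distinct value becomes its own class, so no class width or starting point has to be chosen.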

Computers Owned   Frequency (f)   Relative Frequency   Percentage
0                 2               0.05                 5
1                 18              0.45                 45
2                 11              0.275                27.5
3                 4               0.10                 10
4                 3               0.075                7.50
5                 2               0.05                 5
SUM               40              1.00                 100%

2.1 Raw Data
2.2 Organizing Qualitative Data
2.3 Organizing Quantitative Data
2.4 Cumulative Frequency Distribution
2.5 Stem-and-Leaf Display

2.4 Cumulative Frequency Distribution
A cumulative frequency distribution gives the total number of values that fall below the upper boundary of each class. In the cumulative frequency distribution table, each class has the same lower limit but a different upper limit. Example 2.8 illustrates the procedure for preparing a cumulative frequency distribution.

Example 2.8:
Using the frequency distribution in the following table, prepare a cumulative frequency distribution for the total goals.

Total goals scored    f
124 – 145             6
146 – 167             13
168 – 189             4
190 – 211             4
212 – 233             3

Solution:

Class Limits   Class Boundaries            Cumulative Frequency
124 – 145      123.5 to less than 145.5    6
124 – 167      123.5 to less than 167.5    6 + 13 = 19
124 – 189      123.5 to less than 189.5    6 + 13 + 4 = 23
124 – 211      123.5 to less than 211.5    6 + 13 + 4 + 4 = 27
124 – 233      123.5 to less than 233.5    6 + 13 + 4 + 4 + 3 = 30

From the above table we can determine the number of observations that fall below the upper limit or boundary of each class. For example, 23 football teams scored a total of 189 goals or fewer.

Calculating the Cumulative Relative Frequency and Cumulative Percentage
Each cumulative frequency can also be expressed as a fraction or a percentage of the total number of observations.
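
As a rough sketch, the running totals of Example 2.8 and their fractions of the total can be reproduced in Python. The variable names are assumptions made for illustration; the frequencies and class limits are the ones given in Example 2.8.

```python
from itertools import accumulate

# Upper class limits and frequencies taken from the table in Example 2.8.
upper_limits = [145, 167, 189, 211, 233]
frequencies = [6, 13, 4, 4, 3]

# Running totals: each entry adds the frequency of one more class.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]                      # total observations in the data set

print("Class limits   Cum. f   Cum. rel. f   Cum. %")
for upper, cf in zip(upper_limits, cumulative):
    rel = cf / total                        # cumulative frequency / total observations
    print(f"124 - {upper}      {cf:>6}   {rel:>11.3f}   {100 * rel:>6.1f}")
```

The last running total always equals the total number of observations, which gives a quick check on the arithmetic.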

Cumulative relative frequency = Cumulative frequency of a class / Total observations in the data set
Cumulative percentage = (Cumulative relative frequency) x 100

The following table contains both the cumulative relative frequencies and the cumulative percentages for the data given in Example 2.8.

Class Limits   Cumulative Relative Frequency   Cumulative Percentage
124 – 145      6/30 = 0.200                    20.0
124 – 167      19/30 = 0.633                   63.3
124 – 189      23/30 = 0.767                   76.7
124 – 211      27/30 = 0.900                   90.0
124 – 233      30/30 = 1.000                   100.0

Ogives
An ogive is a curve drawn for the cumulative frequency distribution by joining with straight lines the dots marked above the upper boundaries of the classes at heights equal to the cumulative frequencies of the respective classes. In other words, when the cumulative frequencies are plotted on a diagram, the resulting curve is called an ogive. The next figure gives an ogive for the cumulative frequency distribution of Example 2.8.

End of Part 3
Get ready for quiz (2) next lecture!