• The organization of data in tabular form yields a frequency distribution.
• Data in frequency distributions may be grouped or ungrouped.
• Raw Data – collected data that have not been organized numerically.
• Array – an arrangement of raw data in ascending or descending order of magnitude. In an array, any value may appear several times.
• Frequency – the number of times a value appears in the listing
• Relative Frequency – the frequency of an observation divided by the total frequency.
• When the data set is small (n ≤ 30) or when there are few distinct values, the data may be organized without grouping.
Example 1. A certain machine is set to dispense 1.5 kilos of sodium nitrate per bag. To determine whether it is properly adjusted, the quality control engineer weighed 30 bags filled after the machine was adjusted. The data below give the net weight (in kilos) of each bag.
1.46 1.49 1.52 1.50 1.46 1.52
1.52 1.50 1.49 1.50 1.46 1.46
1.50 1.52 1.49 1.52 1.46 1.49
1.52 1.46 1.52 1.48 1.52 1.50
1.50 1.48 1.49 1.52 1.48 1.48
Weight (kg)   No. of bags (frequency, f)   Relative Frequency
1.46          6                            0.20
1.48          4                            0.13
1.49          5                            0.17
1.50          6                            0.20
1.52          9                            0.30
Σ             30                           1.00
Table 1. Frequency distribution of the weight of sodium nitrate
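The tally in Example 1 can be reproduced with a short program. The following is a minimal Python sketch, assuming the 30 weights are typed in as a list; the names `weights` and `frequency_table` are illustrative and not part of the original example.

```python
from collections import Counter

# Net weights (in kilos) of the 30 bags from Example 1.
weights = [
    1.46, 1.49, 1.52, 1.50, 1.46, 1.52,
    1.52, 1.50, 1.49, 1.50, 1.46, 1.46,
    1.50, 1.52, 1.49, 1.52, 1.46, 1.49,
    1.52, 1.46, 1.52, 1.48, 1.52, 1.50,
    1.50, 1.48, 1.49, 1.52, 1.48, 1.48,
]

def frequency_table(values):
    """Return (value, frequency, relative frequency) rows in ascending order."""
    counts = Counter(values)          # frequency of each distinct value
    n = len(values)                   # total frequency
    return [(v, f, f / n) for v, f in sorted(counts.items())]

for value, f, rel in frequency_table(weights):
    print(f"{value:.2f}  {f:2d}  {rel:.2f}")
```

Sorting the distinct values first corresponds to forming the array; the relative frequencies should add up to 1.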
• Statistical data generated in large masses (n > 30) can be assessed by grouping the data into different classes.
Steps in forming a frequency distribution from raw data:
1. Find the range (R). The range is the difference between the largest and smallest values.
2. Decide on a suitable number of classes. This will depend upon what information the table is supposed to present. Sturges suggested the number of classes (m) as m = 1 + 3.3 log n, where n is the number of cases and log is the common (base-10) logarithm.
3. Determine the class size (c).
c = R/m
The class size (c) may be rounded off to the same place value as the data.
4. Find the number of observations in each class. This is the class frequency (f).
Classes represent the grouping or classification. The range of values in a class is the class interval, consisting of a lower limit and an upper limit. Whenever possible, make the class intervals of equal width, and choose widths that are multiples of numbers that are easy to work with, such as 5, 10 or 100.
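Steps 1 to 3 involve only arithmetic, so they can be sketched in a few lines of Python. The function name `class_setup` is illustrative. The sketch assumes that the logarithm in Sturges' formula is base 10 and that a fractional number of classes is rounded up to the next whole number, which matches how the worked examples in this section arrive at 7 classes.

```python
import math

def class_setup(smallest, largest, n):
    """Steps 1-3: range R, number of classes m (Sturges), unrounded class size c."""
    r = largest - smallest                  # Step 1: range
    m_exact = 1 + 3.3 * math.log10(n)       # Step 2: Sturges' suggestion
    m = math.ceil(m_exact)                  # assumption: round up to a whole class
    c = r / m                               # Step 3: class size before rounding
    return r, m, c

# Smallest value, largest value and n taken from the concrete-block example.
r, m, c = class_setup(82, 148, 50)
print(r, m, round(c, 1))
```

The returned class size is left unrounded on purpose; choosing a convenient width (such as 10) is a judgment the formula does not make.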
Ex. The following are data on the observed compressive strength in psi of 50 samples of concrete interlocking blocks.
136 92 115 118 121 137 132 120 104 125
119 115 101 129 87 108 110 133 135 126
127 103 110 126 118 82 104 137 120 95
146 126 119 119 105 132 126 118 100 113
106 125 117 102 146 129 124 113 95 148
• R = 148 – 82 = 66
• m = 1 + 3.3 log 50 = 6.6, say 7 classes
• c = 66/7 = 9.4
• Round up to c = 10, since the data values are recorded to the nearest whole number.
• The lowest value is 82. It is convenient to start with 80 as the lower limit of the first class. Then 80 + 10 = 90 is the lower limit of the second class.
• The number of observed values tallied in each class is the class frequency.
Compressive Strength (psi)   No. of blocks (frequency, f)   Relative Frequency
80 – 89                      2                              0.04
90 – 99                      3                              0.06
100 – 109                    9                              0.18
110 – 119                    13                             0.26
120 – 129                    13                             0.26
130 – 139                    7                              0.14
140 – 149                    3                              0.06
Σ                            50                             1.00
Table 1.2 Frequency distribution of compressive strength of concrete interlocking blocks
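As a check on Table 1.2, the tally can be sketched in Python. The sketch assumes integer data, a first lower limit of 80 and a class size of 10, as chosen above; `strength` and `grouped_frequencies` are illustrative names.

```python
# Compressive strength (psi) of the 50 blocks from the example.
strength = [
    136, 92, 115, 118, 121, 137, 132, 120, 104, 125,
    119, 115, 101, 129, 87, 108, 110, 133, 135, 126,
    127, 103, 110, 126, 118, 82, 104, 137, 120, 95,
    146, 126, 119, 119, 105, 132, 126, 118, 100, 113,
    106, 125, 117, 102, 146, 129, 124, 113, 95, 148,
]

def grouped_frequencies(values, first_lower, size, n_classes):
    """Count how many integer values fall in each equal-width class."""
    freq = [0] * n_classes
    for v in values:
        k = (v - first_lower) // size   # index of the class containing v
        freq[k] += 1
    return freq

freq = grouped_frequencies(strength, first_lower=80, size=10, n_classes=7)
for k, f in enumerate(freq):
    lower = 80 + 10 * k
    print(f"{lower:3d} - {lower + 9:3d}  {f:2d}  {f / len(strength):.2f}")
```

Integer division by the class size plays the role of the tally: every value from 80 to 89 maps to class 0, every value from 90 to 99 to class 1, and so on.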
1. The following are the observed gasoline consumption figures, in miles per gallon, of 40 cars. Arrange the data in a frequency distribution.
24.5 23.6 24.1 25.0 22.9 24.7 23.8 25.2 23.7 24.4
24.7 23.9 25.1 24.6 23.3 24.3 24.6 23.9 24.1 24.4
24.5 25.7 23.6 24.0 23.9 24.2 24.7 24.9 25.0 24.8
24.5 23.4 24.9 24.8 24.7 24.1 22.8 23.1 25.3 24.6
• R = 25.7 – 22.8 = 2.9
• m = 1 + 3.3 log 40 = 6.3, say 7 classes
• c = 2.9/7 = 0.41
• Use c = 0.5, since the data values are recorded to the nearest tenth.
• The lowest value is 22.8; therefore, 22.5 may be the lower limit of the 1st class. Then 22.5 + 0.5 = 23.0 is the lower limit of the 2nd class.
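One way to finish this exercise is to let a short program do the tally. The sketch below is an illustration under stated assumptions, not the worked solution from the text: it uses the 7 classes of size 0.5 starting at 22.5 set up above, and it converts each reading to whole tenths so that the class index is computed with integers instead of floating-point values. The names `mpg` and `tenths` are illustrative.

```python
# Gasoline consumption (miles per gallon) of the 40 cars.
mpg = [
    24.5, 23.6, 24.1, 25.0, 22.9, 24.7, 23.8, 25.2, 23.7, 24.4,
    24.7, 23.9, 25.1, 24.6, 23.3, 24.3, 24.6, 23.9, 24.1, 24.4,
    24.5, 25.7, 23.6, 24.0, 23.9, 24.2, 24.7, 24.9, 25.0, 24.8,
    24.5, 23.4, 24.9, 24.8, 24.7, 24.1, 22.8, 23.1, 25.3, 24.6,
]

# Work in tenths of a mile per gallon (245, 236, ...) to avoid float rounding.
tenths = [round(x * 10) for x in mpg]

first_lower = 225   # 22.5 expressed in tenths
size = 5            # class size 0.5 expressed in tenths
freq = [0] * 7      # one counter per class
for t in tenths:
    freq[(t - first_lower) // size] += 1

for k, f in enumerate(freq):
    lower = (first_lower + size * k) / 10
    print(f"{lower:.1f} - {lower + 0.4:.1f}  {f:2d}")
```

Each printed row gives the class limits (22.5 to 22.9, 23.0 to 23.4, and so on) followed by the class frequency; the frequencies should add up to 40.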
• The midpoint of the class interval is the class mark. It is half the sum of the lower and upper limits of a class.
• A point that represents halfway, or a dividing point between successive classes, is the class boundary.
• The upper class boundary of the first class is the dividing point between the first class and the second class. The lower class boundary of the second class is the dividing point between the first class and the second class.
• Thus, in Table 1.2, ½ (89+90) = 89.5 is the upper class boundary of the first class. This is the lower class boundary of the second class.
• The class mark of the first class is equal to ½ (80 + 89) = 84.5.
• For the succeeding classes, the class mark may be obtained by adding c = 10 to the preceding class mark, since the classes have equal widths.
Classes      Class Boundaries   Class Marks
80 – 89      79.5 – 89.5        84.5
90 – 99      89.5 – 99.5        94.5
100 – 109    99.5 – 109.5       104.5
110 – 119    109.5 – 119.5      114.5
120 – 129    119.5 – 129.5      124.5
130 – 139    129.5 – 139.5      134.5
140 – 149    139.5 – 149.5      144.5
Table 1.4 Class limits, class boundaries and class marks for frequency distribution presented in Table 1.2
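The entries of Table 1.4 follow mechanically from the class limits. A minimal Python sketch, assuming equal class widths and data recorded to the nearest whole number so that adjacent class limits are one unit apart; `boundaries_and_marks` and its `unit` parameter are illustrative names.

```python
def boundaries_and_marks(first_lower, size, n_classes, unit=1):
    """Return rows of (lower limit, upper limit, lower boundary, upper boundary, class mark).

    `unit` is the gap between the upper limit of one class and the lower
    limit of the next (1 for data recorded to the nearest whole number).
    """
    rows = []
    for k in range(n_classes):
        lower = first_lower + size * k
        upper = lower + size - unit
        rows.append((
            lower,
            upper,
            lower - unit / 2,        # lower class boundary
            upper + unit / 2,        # upper class boundary
            (lower + upper) / 2,     # class mark: half the sum of the limits
        ))
    return rows

for lo, hi, lb, ub, mark in boundaries_and_marks(80, 10, 7):
    print(f"{lo} - {hi}   {lb} - {ub}   {mark}")
```

The upper boundary of each class equals the lower boundary of the next, and successive class marks differ by the class size, c = 10.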