Grouping Data and Histogram: Definitions, Properties, Examples
Dr. Pellumb Kllogjeri

Grouping Data and Histogram: When dealing with large amounts of data, grouping the data is essential. A pictogram or a bar graph can also be used to illustrate such data. Grouped data is data formed by grouping the individual observations of a variable into groups. A frequency distribution table of these groups may be used to summarise or analyse the data. A histogram is a graph that displays frequencies for intervals of values of a metric variable. A histogram is similar to a bar graph. However, it is used for continuous class intervals, that is, intervals with no gaps between them. It also displays the underlying frequency distribution of a set of continuous data.

Grouping Data
When the number of observations is large, we can group the data to separate it into several categories. Individual observations of a variable are collected into groups, and the frequency distribution table of these groups is a helpful way to summarise the data.

Frequency Distribution
A frequency distribution organises the gathered data in a table. The data might be students’ grades, the weather in different cities, match points, and so on. After collecting the data, we must present it in a meaningful manner to aid comprehension: arrange the data in a table such that all of its features are summarised.

Let us consider an example. The following are the temperatures of 20 cities in March, in degrees.

30, 25, 27, 20, 25, 30, 20, 27, 24, 24, 25, 27, 20, 30, 33, 25, 33, 33, 27, 20

Let us present this data in a table and determine the frequency (the number of repetitions of a value) of the cities with the same temperature.

Temperature    Number of Cities
20             4
24             2
25             4
27             4
30             3
33             3

All the collected data is arranged under the Temperature and Number of Cities columns, as can be seen.
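The tabulation above can also be reproduced programmatically. The following is a minimal Python sketch, not part of the original notes; the function name frequency_table is a hypothetical helper chosen for illustration, and the data is the list of 20 temperatures given above.

```python
from collections import Counter

# Temperatures of the 20 cities, copied from the example above.
temperatures = [30, 25, 27, 20, 25, 30, 20, 27, 24, 24,
                25, 27, 20, 30, 33, 25, 33, 33, 27, 20]


def frequency_table(values):
    """Return (value, frequency) pairs sorted by value.

    Hypothetical helper: counts how many times each value occurs.
    """
    counts = Counter(values)
    return sorted(counts.items())


table = frequency_table(temperatures)

print("Temperature  Number of Cities")
for temperature, cities in table:
    print(f"{temperature:<12} {cities}")
```

The pairs in table should match the rows of the frequency table above, and the frequencies add up to the 20 cities observed.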
Arranging the data in a table this way makes it simple to understand the information provided: we can see at a glance the number of cities with the same temperature.

Steps to Draw Frequency Distribution Table for Grouped Data
We will follow the steps below to draw a frequency distribution table for grouped data.
1. Divide the data into groups using the information provided.
2. Sort the observations into ascending order.
3. Find the frequency of each group, that is, the number of observations that fall in it.
4. In the frequency distribution table, write the group name and its frequency.

Advantages of Grouping Data
Below are some of the advantages of grouping data.
1. Grouping data helps to focus on the important subpopulations while setting aside those that are not.
2. Grouping data improves the accuracy and efficiency of estimation.

Ungrouped Data
Raw data, also known as ungrouped data, is information collected from direct observation.
• Example: Consider the marks of 20 students of a class in a particular exam.
• 40, 50, 50, 56, 92, 60, 70, 60, 88, 76, 88, 80, 70, 72, 92, 36, 40, 40, 70, 36
In the above example, the number of students who obtained the same number of marks is called the frequency of that mark. Here, 3 students got 40 marks. So, the frequency of 40 is 3.

Histogram
A histogram is a graphical depiction of a set of data that is divided into user-defined ranges. Like a bar graph, the histogram turns a data series into an easily understandable visual by grouping data points into logical ranges or bins.
The histogram is made up of a series of bars (similar to a bar chart), but these bars are adjacent to one another, and the height of the bars is proportional to the frequency of the various classes when the classes have equal width.
The area of each rectangle denotes the frequency of its class. When the class intervals are equal, the rectangles all have the same width, and their heights directly match the class frequencies. When the class intervals are unequal, the height of a rectangle must be reduced in proportion as the width of its class interval increases, so that its area still represents the frequency.
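The rule that a wider class needs a proportionally lower bar can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not part of the original notes: the class boundaries 30–50, 50–70 and 70–100 are hypothetical, each class is taken to include its lower boundary and exclude its upper one, and heights are expressed per 10 marks. The marks are those of the 20 students in the Ungrouped Data example.

```python
# Marks of the 20 students, copied from the Ungrouped Data example.
marks = [40, 50, 50, 56, 92, 60, 70, 60, 88, 76,
         88, 80, 70, 72, 92, 36, 40, 40, 70, 36]

# Hypothetical class boundaries of unequal width
# (lower boundary included, upper boundary excluded).
classes = [(30, 50), (50, 70), (70, 100)]


def grouped_frequencies(values, classes):
    """Count how many values fall in each class [lo, hi)."""
    return [sum(1 for v in values if lo <= v < hi) for lo, hi in classes]


def bar_heights(freqs, classes, unit=10):
    """Height of each bar, expressed per `unit` of class width.

    height = frequency / class width * unit, so that
    height * (class width / unit) gives back the frequency.
    """
    return [f / (hi - lo) * unit for f, (lo, hi) in zip(freqs, classes)]


freqs = grouped_frequencies(marks, classes)
heights = bar_heights(freqs, classes)

for (lo, hi), f, h in zip(classes, freqs, heights):
    print(f"{lo}-{hi}: frequency {f}, height per 10 marks {h:.2f}")
```

With these assumed classes, the class 70–100 holds twice as many students as 30–50, yet its bar is only about one third taller, because its width is 30 marks rather than 20.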
Histogram for Grouped Data
A histogram is a two-dimensional graphical depiction of a continuous frequency distribution. The bars in a histogram are always placed side by side, with no gaps between them. That is, in a histogram the rectangles are built on the class intervals of the distribution, and the areas of the rectangles are proportional to the frequencies.
Let us now examine the steps involved in creating a histogram for grouped data.
1. If the data is in a discontinuous form, represent it in a continuous form.
2. On a uniform scale, mark the class intervals along the x-axis.
3. On a uniform scale, mark the frequencies along the y-axis.
4. Draw rectangles with the class intervals as the bases and the frequencies as the heights.

Histogram for Ungrouped Data
The histogram is created from grouped data by plotting the class boundaries (not the class limits) on the x-axis and the corresponding frequencies on the y-axis. Before constructing a histogram from ungrouped data, we must therefore first create a grouped frequency distribution.
Bar graphs are usually used for discrete and categorical data. However, in the rare cases where an approximation is required, a histogram may be drawn for such data.
The steps for creating a histogram for ungrouped data are as follows:
1. Mark the possible values on the x-axis.
2. Mark the frequencies along the y-axis.
3. Draw a rectangle centred on each value, extending 0.5 on either side of it, so that all rectangles have equal width.

Solved Examples on Grouping Data and Histogram
Q.1. Draw a histogram for the table below, which represents the marks obtained by 100 students in an examination:

Marks     Number of Students
0–10      5
10–20     10
20–30     15
30–40     20
40–50     25
50–60     12
60–70     8
70–80     5

Ans: The class intervals all have the same length of ten marks. Mark these class intervals along the x-axis. Along the y-axis, mark the number of students on an appropriate scale. Below is a representation of the histogram.
Scale: x-axis: 1 cm = 10 marks; y-axis: 1 cm = 5 students
The bars in the diagram above are drawn adjacent to one another, with no gaps. The rectangles have lengths (heights) that are proportional to the frequencies. Because the class intervals are equal, the areas of the bars are also proportional to the respective frequencies.

Q.2. Draw a histogram to represent the data below:

Class Interval    Frequency
0–10              8
10–20             12
20–30             6
30–40             14
40–50             10
50–60             5

Ans: The histogram for the given data is drawn below.
Scale: x-axis: 1 cm = 10 units; y-axis: 1 cm = 2 units

Q.3. A teacher wanted to analyse the performance of two sections of students in a mathematics test of 100 marks. Looking at their performances, she found that a few students got under 20 marks and a few got 70 marks or above. So she decided to group them into intervals of varying sizes: 0–20, 20–30, …, 60–70, 70–100. Then she formed the following table:

Marks           Number of Students
0–20            7
20–30           10
30–40           10
40–50           20
50–60           20
60–70           15
70 and above    8
Total           90

Ans: Because the class widths vary, we need to adjust the lengths of the rectangles so that the areas are again proportional to the frequencies. The steps to be followed are given below:
1. Select the class interval with the minimum class size. Here the minimum class size is 10.
2. Modify the lengths of the rectangles so that they are proportional to the class size 10, that is, length = (frequency / width of the class) × 10.
When the class size is 20, the length of the rectangle is 7. So, when the class size is 10, the length of the rectangle will be (7/20)×10 = 3.5.
Therefore, the modified table will be as follows.

Marks     Frequency    Width of the class    Length of the rectangle
0–20      7            20                    (7/20)×10 = 3.5
20–30     10           10                    (10/10)×10 = 10
30–40     10           10                    (10/10)×10 = 10
40–50     20           10                    (20/10)×10 = 20
50–60     20           10                    (20/10)×10 = 20
60–70     15           10                    (15/10)×10 = 15
70–100    8            30                    (8/30)×10 = 2.67

Since we have calculated these lengths for an interval of 10 marks in each case, we may call them the “proportion of students per 10 marks interval”.
So, the correct histogram with varying widths is given in the figure below.

Q.4.
In a study of covid patients in a village, the following observations were noted. Represent the given data using a histogram.

Ages      Number of patients
10–20     3
20–30     6
30–40     13
40–50     20
50–60     10
60–70     5

Ans: The histogram for the given data is drawn below.
Scale: x-axis: 1 cm = 10 years; y-axis: 1 cm = 2 patients

Q.5. Draw the histogram for the data below.

Groups    Frequency
0–10      3
10–20     11
20–30     14
30–40     14
40–50     8

Ans: The histogram for the given data is drawn below.
Scale: x-axis: 1 cm = 10 units; y-axis: 1 cm = 5 units

Summary
In this unit, we learnt the definitions of grouping data, frequency distribution, ungrouped data, and histograms. We also studied the method of drawing histograms for grouped and ungrouped data and solved some example problems on the same.

FAQs on Grouping Data and Histogram
Q.1. What do you mean by the grouping of data?
Ans: When the number of observations is large, we can group the data to separate it into several categories. Individual observations of a variable are collected into groups, and the frequency distribution table of these groups is a helpful way to summarise the data.

Q.2. What is grouping data and histogram?
Ans: Grouping data means collecting the individual observations of a variable into groups when the number of observations is large; the frequency distribution table of these groups is a helpful way to summarise the data. A histogram is a graphical depiction of a set of data that is divided into user-defined ranges. Like a bar graph, the histogram turns a data series into an easily understandable visual by grouping data points into logical ranges or bins.

Q.3. What is the purpose of grouping data?
Ans: Grouping data helps to focus on the important subpopulations while setting aside those that are not. Grouping data also improves the accuracy and efficiency of estimation.

Q.4. How do you group data in statistics?
Ans: The steps for grouping data can be summarised as follows:
1. Determine the number of classes.
2. Calculate the range, that is, the difference between the highest and lowest observations in the data.
3. To estimate the size of the class interval, divide the range by the number of classes.
4. To calculate the upper class limit of the lowest class, take its lower class limit and add the class interval to it.

Q.5. What is an ungrouped data example?
Ans: Raw data, also known as ungrouped data, is information collected from direct observation.
Example: Consider the marks of 20 students of a class in a particular exam.
40, 50, 50, 56, 92, 60, 70, 60, 88, 76, 88, 80, 70, 72, 92, 36, 40, 40, 70, 36.

Homework
The heights in centimetres of 30 learners are given below.

142 163 169 132 139 140 152 168 139 150
161 132 162 172 146 152 150 132 157 133
141 170 156 155 169 138 142 160 164 168

Group the data into the following ranges and draw a histogram of the grouped data:
130≤h<140; 140≤h<150; 150≤h<160; 160≤h<170; 170≤h<180
(Note that the ranges do not overlap since each one starts where the previous one ended.)
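
The grouping steps listed in FAQ Q.4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of the original notes: the function names make_classes and count_in_classes are hypothetical, the choice of 6 classes is arbitrary, the class width is rounded up to a whole number, the lowest class starts at the smallest observation, and each class includes its lower limit and excludes its upper limit (as in the homework ranges). The data is the marks of 20 students from FAQ Q.5, not the homework data.

```python
import math

# Marks of the 20 students from the ungrouped data example.
marks = [40, 50, 50, 56, 92, 60, 70, 60, 88, 76,
         88, 80, 70, 72, 92, 36, 40, 40, 70, 36]


def make_classes(values, num_classes):
    """Build non-overlapping classes following the four steps of FAQ Q.4."""
    lowest, highest = min(values), max(values)
    data_range = highest - lowest                    # step 2: range
    # Step 3: class width = range / number of classes.
    # Rounding up to a whole number is an assumption of this sketch.
    width = max(1, math.ceil(data_range / num_classes))
    # Step 4: upper limit = lower limit + class width; each class
    # starts where the previous one ended.
    return [(lowest + i * width, lowest + (i + 1) * width)
            for i in range(num_classes)]


def count_in_classes(values, classes):
    """Count values with lo <= v < hi for each class.

    The last class also accepts a value equal to its upper limit,
    so that the largest observation is never dropped.
    """
    counts = [0] * len(classes)
    last_index = len(classes) - 1
    for v in values:
        for i, (lo, hi) in enumerate(classes):
            if lo <= v < hi or (i == last_index and v == hi):
                counts[i] += 1
                break
    return counts


classes = make_classes(marks, num_classes=6)         # step 1: 6 classes
counts = count_in_classes(marks, classes)

for (lo, hi), c in zip(classes, counts):
    print(f"{lo} <= m < {hi}: {c}")
```

Here the range is 92 − 36 = 56, so with 6 classes the width is rounded up to 10 and the classes run from 36 to 96. The same two helpers can be applied to any list of observations once the number of classes is chosen.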