CHS Statistics
Chapter 4A Notes – Graphical Displays of Data (Part of Chapter 3 included)

Frequency Distributions
A frequency distribution lists classes (or categories) of values, along with frequencies (or counts) of the number of values that fall into each class.
o Lower class limits: the smallest numbers that can belong to the different classes
o Upper class limits: the largest numbers that can belong to the different classes
o Class boundaries: the numbers used to separate classes, but without the gaps created by class limits
    - Find the size of the gap between the upper limit of one class and the lower limit of the next class.
    - Add half of that amount to each upper class limit to find the upper class boundaries.
    - Subtract half of that amount from each lower class limit to find the lower class boundaries.
o Class midpoints: the midpoints of the classes, found by adding the lower and upper limits of each class and dividing by 2
o Class width: the difference between two consecutive lower class limits

Steps to Creating a Frequency Distribution:
1. Find the class width:
   Class Width = (Max # - Min #) / (# of classes), where the number of classes is between 5 and 20
   *Make it the next largest integer!
2. Set up the classes:
   o Start the first class with the Min #.
   o Add the class width to the Min # to get the next lower limit.
   o Continue to do this until you have the number of classes that are required.
   o Go back and create your upper limits (1 less than the next lower limit).
   o Last upper limit: either add the class width to the previous upper limit, or work out what the next lower limit would be if there were another class and go 1 less than that.
3. Make tallies:
   o For each data value, decide which class it falls into and put a tally mark at that class.
4. Count the tallies and put the number in the frequency column.
5. Find the midpoint of each class: add the upper class limit to the lower class limit and divide by 2.
6. Find the relative frequency for each class:
   Relative Frequency = (frequency in that class) / (total of all frequencies)
7. Find the cumulative frequency: the sum of the frequency for that class and all the classes above it in the table.
8. Find the cumulative relative frequency: the sum of the relative frequency for that class and all the classes above it in the table.

Example: Create a frequency distribution for the data given about time (in minutes) spent reading the newspaper in a day.
Data: 7 6 39 5 13 29 9 3 25 11 8 39 22 16 2 15 2 8 18 15 2 12 30 35 7
Table to complete: Classes | Tallies | Frequency | Midpoints | Relative Frequency | Cumulative Frequency | Cumulative Relative Frequency

Graphs of Categorical Data

Pie Chart – contains slices of the pie that are in proper proportion to the total categorical data
o The degrees for each slice are calculated by multiplying the category's percentage, written as a decimal, by 360 degrees.
o Example: 50 students were randomly surveyed to find out their favorite subject. Eleven students said it was math, 4 said history, 15 liked English, 10 said art, 4 said science, and 6 loved phys ed. Create a pie chart of this distribution.
  Table to complete (one row per subject, plus a TOTAL row): Number | Fraction | Percentage | Degree

Bar Chart – categories are on the x-axis and frequencies are on the y-axis
o Bars have gaps between them!
o Example: Create a bar chart of the data above.

Pareto Chart – another bar chart for categorical data where the bars are arranged in ascending or descending order according to frequencies (most commonly descending, with the tallest bar first).
Why might a Pareto chart be easier to read than a bar chart?
How might you convert your bar chart to a Pareto chart?

Graphs for Quantitative Data:

Histogram – A histogram is a bar graph for quantitative data in which the horizontal scale represents the classes and the vertical scale represents the frequencies. The heights of the bars correspond to the frequency values, and the bars touch: NO GAPS (unless there are gaps in the data).
o Example: Create a histogram from the Pepper Pungencies data

  Classes   Frequency   Midpoint   Rel Freq       Cumul Freq
  32-35     3           33.5       3/24 = .125    3
  36-39     9           37.5       9/24 = .375    12
  40-43     8           41.5       8/24 = .33     20
  44-47     3           45.5       3/24 = .125    23
  48-51     1           49.5       1/24 = .04     24

Relative Frequency Histogram – This has the same shape as a histogram of frequencies, but the vertical scale shows relative frequencies (percentages) instead of frequencies.
Image from: http://www.cms.murdoch.edu.au/areas/maths/statsnotes/samplestats/histogrammore.html

Frequency Polygon – uses line segments connected to points located directly above class midpoint values
[Figure: frequency polygon "Children of the Presidents"; horizontal axis: number of children of US presidents (midpoints); vertical axis: frequency]

Ogive – a line graph that depicts cumulative frequencies, just as the cumulative frequency table lists cumulative frequencies
[Figure: ogive "Saturated Fat Intakes"; horizontal axis: people's grams of saturated fat (upper class boundaries); vertical axis: cumulative frequency]

Compressed Scale
o The scale from 0-100 could be compressed and then continue normally from 100-400.
o This is shown by a squiggle on the axis.
o The bars themselves could also be compressed.
Why might this occur?

Stem-and-leaf plot – represents data by separating each value into two parts: the stem and the leaves
o It shows the same distribution as a histogram, but preserves the raw data.
o If your data are too crowded in a row, split each stem into two rows: one for leaves 0-4 and one for leaves 5-9.
o Example: Construct a stem-and-leaf plot from the following biology midterm scores.
  67 72 85 75 89 89 88 90 99 100

Dot Plots – consist of a graph in which each data value is plotted as a point along a scale of values. Dots representing the same value are stacked, so dot plots also preserve the original data values.

Scatter Plots – a plot of the paired (x, y) data used to show the correlation or association between two quantitative variables.
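To check the columns of a frequency distribution such as the Pepper Pungencies table, the midpoint, class boundary, relative frequency, and cumulative frequency rules from these notes can be sketched in Python. This is only an illustration: the function name frequency_table and the dictionary keys are made up for this example, and it assumes the class limits and frequencies have already been found by tallying.

```python
from itertools import accumulate


def frequency_table(lower_limits, upper_limits, frequencies):
    """Build the extra frequency-distribution columns from class limits and counts.

    Hypothetical helper for illustration; assumes at least two classes.
    """
    total = sum(frequencies)
    # Gap between the upper limit of one class and the lower limit of the next.
    gap = lower_limits[1] - upper_limits[0]
    rows = []
    for lower, upper, freq, cum in zip(
        lower_limits, upper_limits, frequencies, accumulate(frequencies)
    ):
        rows.append({
            "class": f"{lower}-{upper}",
            # Subtract half the gap from the lower limit, add half to the upper limit.
            "boundaries": (lower - gap / 2, upper + gap / 2),
            # Midpoint: add the limits and divide by 2.
            "midpoint": (lower + upper) / 2,
            "frequency": freq,
            # Relative frequency: class frequency over total frequency.
            "relative": freq / total,
            # Cumulative frequency: this class plus all classes above it.
            "cumulative": cum,
            "cumulative_relative": cum / total,
        })
    return rows


# Pepper Pungencies classes and frequencies from the table in these notes.
pepper = frequency_table([32, 36, 40, 44, 48], [35, 39, 43, 47, 51], [3, 9, 8, 3, 1])
for row in pepper:
    print(row["class"], row["midpoint"], round(row["relative"], 3), row["cumulative"])
```

The midpoints (33.5 through 49.5) and cumulative frequencies (3, 12, 20, 23, 24) agree with the table, and the upper class boundaries it produces (35.5, 39.5, and so on) are the values an ogive would be plotted against.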
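The pie chart rule (multiply each category's decimal percentage by 360 degrees) can be checked the same way. The sketch below uses the favorite-subject survey counts from the pie chart example; the function name pie_chart_rows and the dictionary keys are hypothetical, chosen only for this illustration.

```python
def pie_chart_rows(counts):
    """Return the fraction, percentage, and slice angle for each category.

    Hypothetical helper for illustration only.
    """
    total = sum(counts.values())
    rows = {}
    for category, count in counts.items():
        rows[category] = {
            "fraction": f"{count}/{total}",
            "percentage": count * 100 / total,
            # Same as (count / total) * 360: the category's share of the full circle.
            "degrees": count * 360 / total,
        }
    return rows


# Survey counts from the favorite-subject example (50 students in all).
favorite_subject = {
    "Math": 11, "History": 4, "English": 15,
    "Art": 10, "Science": 4, "Phys Ed": 6,
}
rows = pie_chart_rows(favorite_subject)
for subject, row in rows.items():
    print(f"{subject:8} {row['fraction']:6} {row['percentage']:5.1f}% {row['degrees']:6.1f} degrees")
```

A quick check on any pie chart table is that the percentages add up to 100 and the degrees add up to 360.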
Shape of Quantitative Distributions

A unimodal distribution/histogram has one apparent peak:
A bimodal histogram has two apparent peaks:
A histogram that doesn't appear to have any mode and in which all the bars are approximately the same height is called uniform:

Symmetric Distributions
o If you can fold the histogram along a vertical line through the middle and have the edges match pretty closely, the histogram is symmetric.

Skewed Distributions
o The (usually) thinner ends of a distribution are called the tails. If one tail stretches out farther than the other, the histogram is said to be skewed to the side of the longer tail.
o In the figure below, the histogram on the left is said to be skewed left, while the histogram on the right is said to be skewed right.
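The shape descriptions above are judgment calls made by eye, but the ideas can be roughed out in code. The sketch below is a crude illustration, not a standard statistical test: count_peaks, describe_skew, and the tolerance parameter are all made up for this example. It counts bars that are taller than both neighbors, applies the "fold in the middle" test by comparing the list of frequencies with its mirror image, and otherwise calls the side with more classes beyond the tallest bar the longer tail.

```python
def count_peaks(frequencies):
    """Count bars strictly taller than both neighbors (hypothetical helper).

    The ends are compared against a height of 0, and flat runs are not counted,
    so a perfectly uniform histogram has 0 peaks under this rule.
    """
    padded = [0] + list(frequencies) + [0]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i - 1] < padded[i] > padded[i + 1]
    )


def describe_skew(frequencies, tolerance=1):
    """Rough 'fold test' and 'longer tail' check (hypothetical helper).

    tolerance is an assumed allowance for how closely the folded bars must match.
    """
    frequencies = list(frequencies)
    mirrored = frequencies[::-1]
    # Fold test: each bar should nearly match the bar the same distance from the other end.
    if all(abs(a - b) <= tolerance for a, b in zip(frequencies, mirrored)):
        return "symmetric"
    # Otherwise, the longer tail is the side with more classes beyond the tallest bar.
    peak = frequencies.index(max(frequencies))
    left_tail = peak
    right_tail = len(frequencies) - 1 - peak
    if left_tail > right_tail:
        return "skewed left"
    if right_tail > left_tail:
        return "skewed right"
    return "unclear"


# Pepper Pungencies frequencies from the histogram example.
pepper_frequencies = [3, 9, 8, 3, 1]
print(count_peaks(pepper_frequencies), describe_skew(pepper_frequencies))
```

For the Pepper Pungencies frequencies this reports one peak and a longer tail on the right. Real histograms are noisier than this rule allows for, so a sketch like this is best used as a check on a decision made from the graph itself.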