FREQUENCY DISTRIBUTION TABLES
FREQUENCY DISTRIBUTION GRAPHS

Frequency Distribution: Lists each category (label) of data and the number of occurrences. The sum of all frequencies = population or sample size.

Relative Frequency: The proportion of occurrences for each category, calculated as:

    Relative Frequency = Frequency / Sum of all frequencies

The sum of all relative frequencies = 1.

Bar Graph: Vertical or horizontal. The x-axis contains the categories or labels. For Frequency Distributions the y-axis is the number of occurrences. For Relative Frequency Distributions the y-axis is the proportion (values between 0 and 1). Bars do not need to be touching.

SPECIES        ZOO A   ZOO B
Elephant           6       3
Giraffe           12       7
Impala            13      24
Zebra              1       2
Ostrich            6       1
Guinea Hens       25      12

[Figures: bar graphs "ZOO A ANIMAL INVENTORY" and "ZOO B ANIMAL INVENTORY", with SPECIES on one axis and NUMBER (0 to 30) on the other.]

Discrete data can be treated like qualitative data (if there are only several values) or the way we will be treating continuous data (if there are many values). Continuous data is separated into classes (intervals) before building distribution tables or graphs.

Frequency Distribution Table: Similar to that for qualitative data, but each class is for a value or an interval (range) of values.

Histograms: Vertical bar graphs, where the x-axis is the number line and each bar is for a class. All bars must touch side to side. The x-axis is labeled with the Lower Class Limits.

Cumulative Frequency Distributions: Each class is listed as before (lowest to largest), but each frequency is the total of that class's frequency and the frequencies of all lower classes.

Relative Cumulative Frequency Distribution: Each Cumulative Frequency divided by the total of all frequencies.
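As a minimal sketch of the relative frequency calculation for qualitative data, the Python below divides each count in the zoo table by its column total. The dictionary and function names are illustrative assumptions, not part of the original notes.

```python
# Counts copied from the SPECIES / ZOO A / ZOO B table.
zoo_a = {"Elephant": 6, "Giraffe": 12, "Impala": 13,
         "Zebra": 1, "Ostrich": 6, "Guinea Hens": 25}
zoo_b = {"Elephant": 3, "Giraffe": 7, "Impala": 24,
         "Zebra": 2, "Ostrich": 1, "Guinea Hens": 12}


def relative_frequencies(freqs):
    """Divide each frequency by the sum of all frequencies."""
    total = sum(freqs.values())
    return {label: count / total for label, count in freqs.items()}


rel_a = relative_frequencies(zoo_a)
rel_b = relative_frequencies(zoo_b)

# Print a relative frequency table; each column should sum to 1.
print(f"{'SPECIES':<12}{'ZOO A':>8}{'ZOO B':>8}")
for species in zoo_a:
    print(f"{species:<12}{rel_a[species]:>8.3f}{rel_b[species]:>8.3f}")
print(f"{'Sum':<12}{sum(rel_a.values()):>8.3f}{sum(rel_b.values()):>8.3f}")
```

Relative frequencies make the two zoos comparable even though their totals differ (63 animals in Zoo A, 49 in Zoo B).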
The last class will have a cumulative value of 1.0.

Use number of siblings:
› Do as Frequency Table
› Do as Relative Frequency
› Do as Cumulative Frequency
› Do as Relative Cumulative Frequency

Class: An interval of numbers along the number line.

Lower Class Limit (LCL): The beginning number of the class.

Upper Class Limit (UCL): The last number of the class.

Class Width: The difference between consecutive lower class limits (or upper class limits). It is found from the data set's maximum and minimum by calculating

    Class Width = (Maximum - Minimum) / (# of Classes)

and rounding up to a convenient value.

Midpoint of Each Class: The point in the middle of the class, found by averaging the class's lower class limit and the next class's lower class limit.

1. Organize data in ascending order (read down each column):

    1.03  1.72  1.99  3.21  4.24  4.58
    1.36  1.75  2.52  3.47  4.27  4.72
    1.45  1.85  2.67  3.50  4.43  4.75
    1.51  1.92  3.06  3.72  4.54  4.79
    1.63  1.95  3.20  3.78  4.57  4.91

2. Determine the number of classes (5 to 20): For this we will use 6.

3. Find the maximum and minimum: For this max = 4.91 and min = 1.03.

4. Calculate the Class Width:

    (MAX - MIN) / (# of CLASSES) = (4.91 - 1.03) / 6 = 0.647

   Round UP to a convenient value. We will use 0.70.

5. Determine the first Lower Class Limit: For this we will use 1.00 (something convenient and lower than the minimum).

6. Determine the next 5 Lower Class Limits by adding the class width to the first and to each subsequent limit to get the next: 1.00 + 0.70 = 1.70; 1.70 + 0.70 = 2.40; then 3.10, 3.80, 4.50.

7. Determine the first Upper Class Limit by subtracting 1 from the last place of the second Lower Class Limit: 1.70 - 0.01 = 1.69.

8. Find the other 5 Upper Class Limits by adding the class width to each previous Upper Class Limit: 1.69 + 0.70 = 2.39; 2.39 + 0.70 = 3.09; then 3.79, 4.49, 5.19.

9. Now construct the table:

    LOWER CLASS LIMIT   UPPER CLASS LIMIT   FREQUENCY
    1.00                1.69                ?
    1.70                2.39                ?
    2.40                3.09                ?
    3.10                3.79                ?
    3.80                4.49                ?
    4.50                5.19                ?
And count the frequencies in each class:

    1.03  1.72  1.99  3.21  4.24  4.58
    1.36  1.75  2.52  3.47  4.27  4.72
    1.45  1.85  2.67  3.50  4.43  4.75
    1.51  1.92  3.06  3.72  4.54  4.79
    1.63  1.95  3.20  3.78  4.57  4.91

And complete the table:

    LOWER CLASS LIMIT   UPPER CLASS LIMIT   FREQUENCY
    1.00                1.69                5
    1.70                2.39                6
    2.40                3.09                3
    3.10                3.79                6
    3.80                4.49                3
    4.50                5.19                7

The last class holds seven values (4.54, 4.57, 4.58, 4.72, 4.75, 4.79, 4.91), so the frequencies sum to 30, the number of data values.

10. Draw the histogram:

[Figure: "Histogram", with LCL on the x-axis (1.00, 1.70, 2.40, 3.10, 3.80, 4.50) and Frequency on the y-axis (0 to 8).]

Data for the next two plots (read down each column):

    5.3  5.8  6.4  7.1
    5.5  5.9  6.6  7.1
    5.6  6.2  6.6  7.3
    5.7  6.3  6.7  7.6
    5.7  6.3  6.8  7.9

Stem Leaf Plot: Used for recording and showing the dispersion of data. The stem can be the integer portion of a number and the leaves the decimal portion. Or the stem could be the tens digit and the leaves the ones digit.

    5 - 3,5,6,7,7,8,9
    6 - 2,3,3,4,6,6,7,8
    7 - 1,1,3,6,9

Dot Plot: Also used to show the dispersion of data. Draw a number line and label the horizontal scale with the numbers from the data from lowest to highest. Then place a dot above a number each time that number occurs. The plot below shows the values from 5.3 to 6.3.

                *                       *
*       *   *   *   *   *           *   *
|___|___|___|___|___|___|___|___|___|___|
5.3 5.4 5.5 5.6 5.7 5.8 5.9 6.0 6.1 6.2 6.3

Polygon Plot: Line graph using the class midpoints for the x-axis and the frequencies for the y-axis. Both ends of the line must come back to 0 on the y-axis.

[Figure: "POLYGON GRAPH", with CLASS MIDPOINT on the x-axis (0 to 24) and FREQUENCY on the y-axis (0 to 10).]

Given a Polygon Plot, construct a Frequency Distribution Table:
› 1. Find the Class Width: the difference between consecutive midpoints.
› 2. Find the first two LCLs: Midpoint +/- ½ * Class Width.
› 3. Find the first Upper Class Limit: 2nd LCL - 1 (one unit in the last place).
› 4. Find the remainder of the LCLs and UCLs.
› 5. Find each class's frequency.

Ogive (pronounced oh jive) Plot: Line graph used for displaying Cumulative Frequency Distributions. The x-axis is the Upper Class Limit and the y-axis is the Cumulative Frequency. The first point is one class width less than the first Upper Class Limit, so that the line starts with a frequency of 0.
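The class-limit steps from the worked example, together with the midpoints a polygon plot uses and the running totals an ogive plots, can be sketched in Python as follows. This is a sketch under assumptions: the variable names are illustrative, and `decimal.Decimal` is used instead of floats only so that limits such as 3.10 and 1.69 come out exact.

```python
from decimal import Decimal

# The 30 sorted values from the worked example.
data = [Decimal(s) for s in """
1.03 1.36 1.45 1.51 1.63 1.72 1.75 1.85 1.92 1.95
1.99 2.52 2.67 3.06 3.20 3.21 3.47 3.50 3.72 3.78
4.24 4.27 4.43 4.54 4.57 4.58 4.72 4.75 4.79 4.91
""".split()]

num_classes = 6
raw_width = (max(data) - min(data)) / num_classes  # about 0.647
width = Decimal("0.70")      # raw width rounded UP to a convenient value
first_lcl = Decimal("1.00")  # convenient value below the minimum
unit = Decimal("0.01")       # one unit in the last decimal place

# Lower class limits: add the class width repeatedly.
lcls = [first_lcl + i * width for i in range(num_classes)]
# Upper class limits: one unit below the next lower class limit.
ucls = [lcl + width - unit for lcl in lcls]
# Midpoints: average of a class's LCL and the next class's LCL.
midpoints = [lcl + width / 2 for lcl in lcls]

# Count how many data values fall inside each class.
frequencies = [sum(1 for x in data if lcl <= x <= ucl)
               for lcl, ucl in zip(lcls, ucls)]

# Cumulative frequency: running total over the classes.
cumulative = []
running = 0
for f in frequencies:
    running += f
    cumulative.append(running)

relative_cumulative = [c / len(data) for c in cumulative]

# Ogive points: start one class width below the first UCL at frequency 0.
ogive_points = [(ucls[0] - width, 0)] + list(zip(ucls, cumulative))

print("LCL    UCL    MID    FREQ  CUM  REL CUM")
for row in zip(lcls, ucls, midpoints, frequencies, cumulative,
               relative_cumulative):
    lcl, ucl, mid, f, c, rc = row
    print(f"{lcl:<7}{ucl:<7}{mid:<7}{f:<6}{c:<5}{rc:.3f}")
```

The last cumulative frequency equals the number of data values (30), so the last relative cumulative frequency is 1.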
[Figure: "Ogive Plot", with Upper Class Limits on the x-axis (1.99 to 13.99) and Cumulative Frequency on the y-axis (0 to 22).]

Time Series Plots: Can be vertical or horizontal bar graphs, or line graphs. The x-axis is time intervals or ages (years, months, days) and the y-axis is frequency.

[Figures: four bar graphs of FREQUENCY by GRADE (A, B, C, D, F), titled "NORMAL DISTRIBUTION", "UNIFORM DISTRIBUTION", "SKEWED LEFT DISTRIBUTION" and "SKEWED RIGHT DISTRIBUTION".]

Vertical Scale Manipulation: Not starting the y-axis at 0, or using a break in the scale. Either can make differences look bigger than they really are.

Exaggeration of Bars or Symbols: Used in pictographs.

Horizontal Scale Manipulation: Not all classes or time intervals are the same width.

"Get your facts first, then you can distort them as you please" Mark Twain

"There are lies, damn lies, and STATISTICS" Mark Twain

"Definition of Statistics: The science of producing unreliable facts from reliable figures." Evan Esar