Graphical Displays of Data
Section 2.2

Objectives
• Create and interpret the basic types of graphs used to display data

Introduction
• A graph is a snapshot that allows us to view patterns at a glance without undergoing lengthy analysis of the data.
• Graphs are much more visually appealing than a table or list.
• A graph should be able to stand alone, without the original data. A graph must be given a title, as well as labels for both axes.

Purpose of Statistical Graphs
• To convey the data to the viewers in pictorial form
  – It is easier for most people to comprehend the meaning of data presented as a picture than data presented as a table. This is especially true if the viewers have little or no statistical knowledge.
• To describe the data set
• To analyze the data set (distribution of the data set)
• To summarize a data set
• To discover a trend or pattern in a situation over a period of time
• To get the viewers’ attention in a publication or speaking presentation

Graphs Used to Display Qualitative Data

Pie Chart
• A pie chart is a circle that is divided into sections or wedges according to the percentage of frequencies in each category of the distribution.
• Shows the relationship of the parts to the whole

Pareto Chart*
• Bar graph
• Used to represent a frequency distribution for a categorical variable (nominal level). The frequencies are displayed by the heights of contiguous vertical bars, which are arranged in order from highest to lowest.

How do I create a Pareto Chart from a categorical frequency distribution?
• STEP 1: Draw the x- and y-axes
• STEP 2: Label the x-axis using the qualitative categories (highest frequency to lowest frequency)
• STEP 3: Label the y-axis using an appropriate scale that encompasses the high and low frequencies
• STEP 4: Draw the contiguous vertical bars

Example
Nursing, Business Admin, Education, Computer Info Systems,
Political Science, Art, General Studies, Nursing,
Education, Education, Psychology, Business Admin,
Psychology, Business Admin, General Studies, General Studies,
General Studies, History, History, History,
General Studies, Education, Computer Info Systems, Nursing,
Education, General Studies, Education, History

Class (Major)              Frequency   Percentage
Art                            1          3.6%
Business Administration        3         10.7%
Computer Info Systems          2          7.1%
Education                      6         21.4%
General Studies                6         21.4%
History                        4         14.3%
Nursing                        3         10.7%
Political Science              1          3.6%
Psychology                     2          7.1%
TOTAL                         28        100%

Other Bar Graphs

Side-by-Side Bar Graph
• Used to compare different groups
• Typically uses different colored bars to distinguish groups

Stacked Bar Graph

Histogram*
• A bar graph that displays the data from a frequency distribution
  – Horizontal scale (x-axis) is labeled using CLASS BOUNDARIES or MIDPOINTS
  – Vertical scale (y-axis) is labeled using frequency
  – NOTE: bars are contiguous (no gaps)

How do I create a histogram from a grouped frequency distribution?
• MINITAB
  – Enter raw data into MINITAB

Example: Construct a histogram of the ages of Nextel Cup Drivers.
Use the class boundaries as the scale on the x-axis.

Ages of NASCAR Nextel Cup Drivers in Years (NASCAR.com)
(Data is ranked; collected Spring 2008)
21 21 21 23 23 23 24 25 25 26 26 26
26 27 27 28 28 28 28 29 29 29 29 30
30 30 30 31 31 31 31 31 32 34 35 35
35 36 36 37 37 38 38 39 41 42 42 42
43 43 43 44 44 44 44 45 45 46 47 48
48 48 49 49 49 50 50 51 51 65 72

Frequency Polygon
• Line graph (rather than a bar graph)
• Uses class midpoints rather than class boundaries on the x-axis

Ogive (Cumulative Frequency Polygon)
• Line graph (rather than a bar graph)
• Uses class boundaries on the x-axis
• Uses cumulative frequencies (total as you go) rather than individual class frequencies
• Used to visually represent how many values are below a specified upper class boundary

Another possibility
• We can use the percentage (relative frequency) rather than the “tallies” (frequency) on the y-axis.
  – Relative Frequency Histogram
  – Relative Frequency Polygon
  – Relative Frequency Ogive
• Used when a comparison between two data sets is desired, especially if the data sets are two different sizes
• Overall shape (distribution) of the graph is the same, but we use a % on the y-axis scale

Stem and Leaf Plot*
– Method for organizing data
– Combination of sorting and graphing
– Original data is retained, unlike with a grouped frequency distribution
– “Leaves” are usually the last digit in each data value; right-hand column of a two-column table
– “Stems” are the remaining digits; left-hand column of a two-column table

Dotplot* (not in text)
– Graph in which each data value is plotted as a point (or dot) along a single horizontal scale of values
– Dots representing equal values are stacked
– Original data is retained

Exam #1 Scores in Mrs. Ralston’s Math 1111 classes in Fall 2008
39 40 41 43 50 59 59 61 63 64 65 66 66
68 70 70 70 71 73 73 75 76 77 78 79 79
80 80 80 80 81 81 82 83 84 84 84 84 85
86 86 87 88 89 89 90 90 90 90 91 91 92
94 94 94 94 95 96 96 98 99 100 100 100 100

• Construct a frequency distribution for the Exam #1 scores. Use 8 classes with a class width of 10 beginning with a lower class limit of 30.
• Use the raw data to construct a histogram of the Exam #1 scores in MINITAB
• Use the raw data to construct a dotplot of the Exam #1 scores in MINITAB

Homework
• Page 71 #2 and 3 (create a Pareto Chart)
• Page 74 #16 (create a Stem and Leaf Plot)
• Worksheet
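
Checking the Exam #1 frequency distribution
The Exam #1 exercise (8 classes, class width 10, first lower class limit 30) can be checked with a short script. These slides use MINITAB, so Python here is only an assumption for illustration, and the function name frequency_distribution is hypothetical. This is a minimal sketch that assumes whole-number scores, so each upper class limit is the lower limit plus the width minus 1.

```python
# Exam #1 scores, copied from the slide data.
scores = [
    39, 40, 41, 43, 50, 59, 59, 61, 63, 64, 65, 66, 66,
    68, 70, 70, 70, 71, 73, 73, 75, 76, 77, 78, 79, 79,
    80, 80, 80, 80, 81, 81, 82, 83, 84, 84, 84, 84, 85,
    86, 86, 87, 88, 89, 89, 90, 90, 90, 90, 91, 91, 92,
    94, 94, 94, 94, 95, 96, 96, 98, 99, 100, 100, 100, 100,
]


def frequency_distribution(data, lower_limit, class_width, num_classes):
    """Return (lower limit, upper limit, frequency) for each class.

    Assumes whole-number data, so the upper class limit is
    lower limit + class width - 1 (for example 30-39).
    """
    table = []
    for i in range(num_classes):
        low = lower_limit + i * class_width
        high = low + class_width - 1
        # Tally the values that fall between the class limits, inclusive.
        freq = sum(1 for x in data if low <= x <= high)
        table.append((low, high, freq))
    return table


table = frequency_distribution(scores, lower_limit=30, class_width=10, num_classes=8)

# Print each class with its frequency and a row of asterisks,
# which gives a rough sideways picture of the histogram's shape.
for low, high, freq in table:
    print(f"{low:>3}-{high:<3} {freq:>3}  {'*' * freq}")
```

The frequencies should add up to the number of scores. The asterisk rows are only a rough preview of the shape; the histogram and dotplot themselves are still to be made in MINITAB.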