Sections 10-1 & 10-2 – Displaying Data

Categorical Data: Data that represent characteristics of objects or individuals (categories).
Numerical Data: Data that represent numerical information about an object or individual.

M & M Activity

PART I: PICTOGRAPHS
Arrange your M&M’s on the chart below.
[Chart: rows numbered 1 to 10; columns Brown, Green, Orange, Red, Blue, Yellow; axis label: M&M Color]
Each circle represents _______________

As you remove each candy from the graph, color its circle the appropriate color.
This type of graph is called a PICTOGRAPH because the data are displayed using parallel columns (or rows) of pictures in which each picture represents one or more of the objects being compared.
You may now eat your M&M’s!

PART II: More PICTOGRAPHS
Construct a pictograph below for the number of each color of M&M’s in the entire bag. Let each circle represent more than 1 M&M.
[Chart: rows Brown, Green, Orange, Red, Blue, Yellow; axis label: Frequency]
Each circle represents _______________

PART III: DOT PLOTS
Record the number of yellow M&M’s in each person’s sample.
[Number line: 0 to 20; axis label: Number of Yellow M&M’s]
This type of graph is called a DOT PLOT. Dot plots provide a quick, simple way to organize numerical data. They work best when there are fewer than 25 data points.

Ordered Stem-and-Leaf Plots
Test scores for Ms. Johnson’s class:
55, 62, 65, 70, 71, 72, 72, 75, 79, 82, 85, 88, 91, 95, 97, 99
[Blank stem-and-leaf plot; title: Test Scores]

Frequency Tables
The following are the ages of a class of statistics students:
17, 19, 18, 22, 25, 30, 34, 36, 26, 22, 24, 20, 21, 23, 19, 19, 18
Construct a grouped frequency table for the data. Start the first class at 17 and use intervals of 4 (17-20, 21-24, and so on).
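The grouping asked for above can be checked with a short Python sketch. It assumes whole-number ages and classes of the form 17-20, 21-24, and so on; the function name `grouped_frequencies` is hypothetical and is not part of the handout.

```python
from collections import Counter

# Ages of the statistics students, copied from the handout.
ages = [17, 19, 18, 22, 25, 30, 34, 36, 26, 22, 24, 20, 21, 23, 19, 19, 18]

def grouped_frequencies(values, start, width):
    """Return (low, high, frequency) for each class, assuming integer values >= start."""
    # Class index 0 covers start .. start + width - 1, index 1 the next class, etc.
    counts = Counter((v - start) // width for v in values)
    n_classes = (max(values) - start) // width + 1
    return [
        (start + k * width, start + k * width + width - 1, counts.get(k, 0))
        for k in range(n_classes)
    ]

table = grouped_frequencies(ages, start=17, width=4)
for low, high, freq in table:
    # One tally mark per data value, followed by the frequency.
    print(f"{low}-{high}: {'|' * freq} {freq}")
```

The frequencies should add up to 17, the number of students, which is a quick way to confirm that no age was skipped or counted twice.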
Ages of Statistics Students | Tallies | Frequency

Histograms (adjoining bars, numeric data)
[Histogram: vertical axis Frequency, 0 to 8; horizontal axis Ages of Statistics Students, classes 17-20, 21-24, 25-28, 29-32, 33-36]

Bar Graphs (spaces between the bars, categorical data)
[Bar graph: Favorite Cola; categories Pepsi, Coke, Dr.P]

Pie Charts (Circle Graphs)
Monthly Income
  40% Rent
  20% Food
  10% Entertainment
  30% Other expenses

Line Graphs (good at showing trend over time)
[Line graph: vertical axis Inches of Precipitation, 0 to 35; horizontal axis Jan, Feb, Mar, Apr, May, June, July]
a) During which month was the precipitation the highest?
b) During which month was the precipitation the lowest?
c) Between which two months was the decrease in precipitation the greatest?
d) Approximate the amount of precipitation in February.

Scatterplots
# of Absences | 0          | 1  | 2      | 3      | 4      | 5
Test Grade    | 98, 90, 85 | 80 | 75, 80 | 78, 70 | 60, 68 | 70, 55, 50
[Scatterplot: horizontal axis absences, 0 to 5]
Scatterplots are good for determining the trend or relationship between the variables. The graph can illustrate a positive trend, negative trend, or no trend. Once a “trend line” has been found, predictions can be made.

Section 10-3 – Measures of Central Tendency and Variation

Measures of Center:
Arithmetic Mean – Add the data values and divide by the total number of values. (balancing point of the data)
Median – Middle data value when data are arranged in order from smallest to largest.
  If you have an ODD number of values, the median is the middle value.
  If you have an EVEN number of values, the median will be the average of the 2 middle values.
Mode – The value that occurs most often. The data set can be bimodal, multimodal, or have no mode.

Example: Calculate the mean, median, and mode for the following data sets:
20 25 30 30 40 5 8 9 12 12 45 50 55

Example: Lily’s test average for 5 tests is an 85. Lily only remembers her grades on the first 4 tests (90, 74, 84, 82). What did Lily make on the fifth test?

Which is most appropriate? Mean, Median, or Mode? See pg.
635

Measures of Spread:
Range – Highest value minus lowest value
Interquartile Range (IQR) – Upper quartile minus lower quartile. Shows where the middle 50% of the data points lie.

Example: The following are the amounts of sugar (in grams) in popular breakfast cereals. Calculate the IQR and the range.
3 7 13 24 30 43 44 47 47

Box and Whisker Plot – Extend a horizontal line from the minimum value to the maximum value. Draw a box with vertical lines at Q1, Q2, and Q3.

5-Number Summary:
1. Minimum value
2. Maximum value
3. Q1
4. Q2 (median)
5. Q3

Example: Draw a box plot for the month in which your classmates were born.
[Number line: 1 to 12]

Outlier – A data point that lies more than 1.5 times the IQR above Q3 or below Q1, i.e., greater than Q3 + 1.5 * IQR or less than Q1 - 1.5 * IQR.
Outliers can have dramatic effects on means and standard deviations.

Example: For a certain data set, the 5-number summary is as follows:
Minimum value = 13
Maximum value = 97
Q1 = 17
Q2 (median) = 27
Q3 = 45.5
Calculate the IQR and determine if 97 is an outlier.

Measures of Variation:
Standard Deviation: A rough estimate of the “average” distance that the scores are from the mean.

$$ s = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}} $$

Example: Calculate the Standard Deviation for the following data set:
Ages of Cars: 9 40 8 5 4

Normal Distributions
The graphs of normal distributions are the bell-shaped curves called normal curves. See pg. 647

Example: Suppose the area under the curve represents the population of women in the US. Suppose that the mean height is 63.6 inches and the standard deviation is 2.5 inches.
a) What percentage of women are between 61.1 and 66.1 inches tall?
b) What percentage of women are taller than 71.1 inches?
c) What percentage of women are between 61.1 and 68.6 inches tall?
d) What percentage of women are shorter than 61.1 inches?

Section 10-4 – Abuses of Statistics
One major misuse of statistics involves the use of graphs. See the graphs on pg. 660–664
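
Looking back at the Section 10-3 spread calculations (range, IQR, the 1.5 * IQR outlier rule, and the standard deviation with n in the denominator), the following Python sketch shows one way to check the worked examples. It is only an illustration: the function names are hypothetical, and the quartiles use the "median of each half" convention, leaving out the middle value when the count is odd. That convention is an assumption, and the textbook's may differ.

```python
import math

def median(sorted_vals):
    """Middle value, or the average of the two middle values."""
    n = len(sorted_vals)
    mid = n // 2
    if n % 2 == 1:
        return sorted_vals[mid]
    return (sorted_vals[mid - 1] + sorted_vals[mid]) / 2

def quartiles(values):
    """Q1, Q2, Q3 by the median-of-each-half convention (an assumption)."""
    s = sorted(values)
    n = len(s)
    return median(s[: n // 2]), median(s), median(s[(n + 1) // 2 :])

def is_outlier(x, q1, q3):
    """True if x is above Q3 + 1.5 * IQR or below Q1 - 1.5 * IQR."""
    iqr = q3 - q1
    return x > q3 + 1.5 * iqr or x < q1 - 1.5 * iqr

def std_dev(values):
    """Standard deviation dividing by n, as in the Section 10-3 formula."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))

# Sugar example from the handout.
sugar = [3, 7, 13, 24, 30, 43, 44, 47, 47]
q1, q2, q3 = quartiles(sugar)
sugar_range = max(sugar) - min(sugar)
sugar_iqr = q3 - q1
print(sugar_range, sugar_iqr)          # 44 35.5

# Outlier example: Q1 = 17, Q3 = 45.5, so the upper fence is 88.25.
print(is_outlier(97, q1=17, q3=45.5))  # True

# Ages of cars example.
car_ages = [9, 40, 8, 5, 4]
print(round(std_dev(car_ages), 2))     # 13.53
```

Dividing by n - 1 instead of n would give the sample standard deviation, which is a different quantity from the one defined in this section.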