David S. Moore • George P. McCabe
Introduction to the Practice of Statistics, Fifth Edition
Chapter 1: Looking at Data—Distributions
Copyright © 2005 by W. H. Freeman and Company
Modifications and Additions by M. Leigh Lunsford, 2005-2006

Class Website
www.IntroStats.blogspot.com
• Policies and Syllabus doc
• Software
• Homework assignments, test announcements, lecture slides, etc.
• Continually updated - check it regularly!

Technology Requirements
• MegaStat Plugin for Excel (website)
• Data Sets in Excel Format on CD (CD accompanying text)
• TI-83

What is Statistics?
The Science of Learning from Data
The Collection and Analysis of Data
• Experimental Design: Chapter 3
• Probability: Chapter 4
• Descriptive Statistics (Data Exploration): Chapters 1, 2
• Inferential Statistics: Chapters 5 - 8

Chapter 1 - Looking at Data
1.1 Displaying Distributions with Graphs
1.2 Describing Distributions with Numbers
1.3 Density Curves and Normal Distributions

Section 1.1 Displaying Distributions with Graphs

Data Basics

Variable Types
An Example (p. 5)

Graphs for Categorical Variables
• Bar Graphs
• Pie Charts
Educational Level Example (page 7):
– A Bar Graph by Hand
– A Pie Chart by Hand
Homework: Try to do these in Excel!

Graphs for Quantitative Data
• Stemplots (Stem and Leaf Plots)
– Generally for small data sets
• Histograms
• Time Plots (if applicable)
Let’s look at an example to see what types of questions one may ask and how these plots help to visualize the answers!

Example 1.7 Page 14
Descriptive and Inferential Stats
1. What percent of the 60 randomly chosen fifth grade students have an IQ score of at least 120?
2. Based on this data, approximately what percent of all fifth grade students have an IQ score of at least 120?
3. What is the average IQ score of the fifth grade students in this sample?
4. Based on this data, what is the average IQ score of all fifth grade students (i.e. the population) from which the sample was drawn?
Inferential? 2 and 4
Descriptive? 1 and 3

Let’s Make a Stemplot!
An Example (Ex. 1.7 p. 14)
Data in Table 1.3 p. 14 (and on next slide)

Stem and Leaf Plot for Example
IQ Test Scores for 60 Randomly Chosen 5th Grade Students
Generated Using the Descriptive Statistics Menu on MegaStat

Stem and Leaf plot for iq
stem unit = 10
leaf unit = 1

Frequency   Stem   Leaf
     3        8    129
     4        9    0467
    14       10    01112223568999
    17       11    00022334445677788
    11       12    22344456778
     9       13    013446799
     2       14    25
    60

Now Let’s Make a Histogram!
• Use the Same Data in Example 1.7 (Data in Table 1.3)
• We will start by hand, using class widths of 10 starting at 80
• Compare the Stemplot to the Histogram!

Histogram for Example
iq
                                                            cumulative   cumulative
lower      upper   midpoint   width   frequency   percent   frequency    percent
  80   <     90       85        10        3          5.0         3           5.0
  90   <    100       95        10        4          6.7         7          11.7
 100   <    110      105        10       14         23.3        21          35.0
 110   <    120      115        10       17         28.3        38          63.3
 120   <    130      125        10       11         18.3        49          81.7
 130   <    140      135        10        9         15.0        58          96.7
 140   <    150      145        10        2          3.3        60         100.0
                                         60        100.0

[Histogram: IQ Scores of Randomly Chosen Fifth Grade Students. Horizontal axis: IQ Score (80 to 150); vertical axis: Percent.]
Compare this Histogram to the Stem & Leaf Plot we Generated Earlier!

Recall Our Earlier Question 1
1. What percent of the 60 randomly chosen fifth grade students have an IQ score of at least 120?
• Numerically?
– From the percent column: 18.3% + 15.0% + 3.3% = 36.6%
– From the counts: (11 + 9 + 2)/60 = 0.367, or 36.7% (the 0.1-point gap comes from adding percents that were already rounded)
• How to Represent Graphically? The grey shaded region of the histogram corresponds to this 36.7% of the data.
What is different from the histogram we generated in class?

Descriptors we will be interested in for data and population distributions:
• Overall Pattern:
– Shape (modes, tails (skewness), symmetry)
– Center (mean, median)
– Spread (range, IQR, standard deviation)
• Deviations: Outliers

Let’s Look at the Distribution we Just Created:
• Overall Pattern: Shape, Center, Spread?
• Deviations: Outliers?

Data Analysis – An Interesting Example (p. 9)!
80 Calls
• Overall Pattern: Shape, Center, Spread?
• Deviations: Outliers?
Moral of this story: making your class widths too small can obscure important features of your data.

Time Plots – For Data Collected Over Time…
Example: Mississippi River Discharge p. 19 (data p. 21)

Example – Dealing with Seasonal Variation
• Original data
• Seasonal variation
• Trend line
• Residuals = original data - trend line - seasonal variation

Extra Slides from Homework
• Problem 1.19
• Problem 1.20
• Problem 1.21
• Problem 1.31
• Problem 1.36
• Problem 1.37-1.38

Problem 1.19, page 30
Problem 1.20, page 31
Problem 1.21, page 31
Problem 1.31, page 36
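
Optional: Checking Example 1.7 with Code
The frequency table and the answer to Question 1 for the IQ data in Example 1.7 can also be checked outside Excel. The following is only a minimal sketch in Python, not the MegaStat procedure used in class. It rebuilds the 60 scores from the stemplot leaves shown earlier, counts them in classes of width 10 starting at 80, and computes the percent of scores of at least 120. The function names (scores_from_stemplot, frequency_table, percent_at_least) are made up for this illustration.

```python
from collections import Counter

# Stemplot from the slides (stem unit = 10, leaf unit = 1).
# Each string holds the leaves for one stem, e.g. stem 8 with leaf 1 is the score 81.
STEMPLOT = {
    8: "129",
    9: "0467",
    10: "01112223568999",
    11: "00022334445677788",
    12: "22344456778",
    13: "013446799",
    14: "25",
}


def scores_from_stemplot(stemplot):
    """Rebuild the individual scores: score = 10 * stem + leaf."""
    return [10 * stem + int(leaf)
            for stem, leaves in stemplot.items()
            for leaf in leaves]


def frequency_table(scores, start=80, width=10):
    """Return rows of (lower, upper, frequency, percent, cum. frequency, cum. percent).

    Classes are lower <= score < upper, matching the 'lower < upper' table on the slide.
    Assumes no score falls below `start`.
    """
    counts = Counter((score - start) // width for score in scores)
    n = len(scores)
    rows = []
    cumulative = 0
    for k in range(max(counts) + 1):
        frequency = counts.get(k, 0)
        cumulative += frequency
        lower = start + k * width
        rows.append((lower, lower + width, frequency,
                     100 * frequency / n, cumulative, 100 * cumulative / n))
    return rows


def percent_at_least(scores, threshold):
    """Percent of scores that are greater than or equal to the threshold."""
    return 100 * sum(score >= threshold for score in scores) / len(scores)


scores = scores_from_stemplot(STEMPLOT)
table = frequency_table(scores)

print("lower  upper  freq  percent  cum.freq  cum.percent")
for lower, upper, freq, pct, cum_freq, cum_pct in table:
    print(f"{lower:5d}  {upper:5d}  {freq:4d}  {pct:7.1f}  {cum_freq:8d}  {cum_pct:11.1f}")

print(f"Percent with IQ of at least 120: {percent_at_least(scores, 120):.1f}%")
```

Changing the width argument (for example, frequency_table(scores, width=5)) is one way to see how the choice of class width changes the picture of the same data.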