Chapter 1: Exploring Data
Section 1.1: Displaying Data

Individuals: the objects being described by the data
Variable: any characteristic of an individual
Distribution of a variable: the pattern of variation; it tells what values the variable takes AND how often each value occurs

Who? Individuals, # (how many)
What? Variables, units
Why? Why did we gather the data?

Categorical: places an individual in a group (category)
Quantitative: takes numerical values for which it makes sense to do arithmetic

DISPLAYING CATEGORICAL DATA

*Bar Graph
- Categorical data
- Bars don’t touch
- Can rearrange categories
- Scale may be in counts or percents

What kind of car do you drive? (example only)
  Type of Car    Number of People    Percentage of People
- Truck          3                   15%
- Van            0                   0%
- Car            9                   45%
- SUV            2                   10%
- Other          4                   20%
- None           2                   10%

[Bar graph: Type of Car (categories) on the horizontal axis; vertical scale in counts (Number of People, 2 to 10) or in percents (Percentage of People, 10% to 50%)]

*Pie Graph (Chart)
- Categorical data
- Must be out of a whole (100%)

*Bar graphs are more flexible since you don’t necessarily need the whole (but then you must use the counts scale, unless the data are already given as percentages; see Example 1.3, p. 10)

DISPLAYING QUANTITATIVE DATA

How tall are you in inches?
Class Heights (example only)
59 63 65 67 68 60 64 67 67 68
62 64 67 67 68 62 65 67 67 68
62 65 67 68 69 69 69 70 70 70
71 71 71 72 72 72 72 73 73 74

*Dotplot
- Quantitative data
- Values at the bottom
- Each dot = 1 piece of data
- Height of each stack of dots = how many at each value

[Dotplot: Height of students (values), horizontal axis from 59 to 74]

*Stem plot
- Quantitative data
- Around 5 stems is a good minimum
- May have to split stems (high/low)
- Numbers listed in increasing order
- Each stem must have an equal # of possibilities
- May need to round data (5.3 ≈ 5)
- Leaves can only be one digit

Heights of Students
55-59   5H   5 | 9
60-64   6L   6 | 0222344
65-69   6H   6 | 5557777777788888999
70-74   7L   7 | 0001112222334
75-79   7H   7 |
Key: 6 | 0 means 60 inches

Read Example 1.5 on p. 13-14.

Comparative Stem plot: used to compare two groups
o For example: boys’ heights vs. girls’ heights

         Boys |   | Girls
              | 5 | 9
          442 | 6 | 0223
9998887777755 | 6 | 577788
   4332222100 | 7 | 011
              | 7 |

Example 1.6 on page 19

*Histogram
- Bars touch
- “Classes” need to have equal widths
- 5 “classes” is a good minimum
- Label & scale axes
- Frequency can be in counts or %

[Histogram: Age of Inauguration on the horizontal axis (40 to 70, ticks every 5 years); Number of Presidents on the vertical axis (scale in counts, 0 to 14)]

DESCRIBING DISTRIBUTIONS

Use SOCS: Shape, Outliers, Center, Spread
Outlier: an individual observation in any graph of data that falls outside the overall pattern of the graph.

Dotplot of student heights
- Shape: Peak at 67
- Outliers: None
- Center: 67
- Spread: 59 to 74

Stem plot of student heights (Key: 6 | 0 means 60 inches)
- Shape: Peak at 65 to 69
- Outliers: None
- Center: 67
- Spread: 59 to 74 (similar to the range)

Histogram of presidents’ ages at inauguration
- Shape: Roughly symmetric, one peak
- Outliers: None
- Center: About 55 years
- Spread: 42 to 69

MORE ON SHAPE
- Peaks are always good to mention
*Symmetric: roughly the same on both sides
*Uniform: roughly the same height everywhere
*Skewed: Skewed Right or Skewed Left (the direction is towards the tail)

Modifiers: Roughly, Approximately, Somewhat, Clearly, Slightly (used more with skewed)
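
As a rough check on the dotplot and split stem plot of the class heights, the tallies can be reproduced with a few lines of Python. This is only a sketch: the function names `dot_counts` and `split_stemplot` are made up for illustration and are not from the textbook, and the low/high split (leaves 0-4 on the L stem, 5-9 on the H stem) follows the 5H/6L/6H/7L/7H labels used in these notes.

```python
from collections import Counter, defaultdict

# Class heights in inches, copied from "Class Heights (example only)".
heights = [
    59, 63, 65, 67, 68, 60, 64, 67, 67, 68,
    62, 64, 67, 67, 68, 62, 65, 67, 67, 68,
    62, 65, 67, 68, 69, 69, 69, 70, 70, 70,
    71, 71, 71, 72, 72, 72, 72, 73, 73, 74,
]


def dot_counts(values):
    """Dotplot tallies: number of dots above every value from min to max."""
    tally = Counter(values)
    return {v: tally[v] for v in range(min(values), max(values) + 1)}


def split_stemplot(values):
    """Split stem plot: {(stem, 'L' or 'H'): leaves}, leaves in increasing order."""
    rows = defaultdict(str)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)        # tens digit is the stem, ones digit the leaf
        half = "L" if leaf <= 4 else "H"  # each half-stem covers 5 possible leaves
        rows[(stem, half)] += str(leaf)
    return dict(rows)


dots = dot_counts(heights)
stems = split_stemplot(heights)

# Print the stem plot, one row per non-empty stem.
for (stem, half), leaves in stems.items():
    print(f"{stem}{half} | {leaves}")
print("Key: 6 | 0 means 60 inches")
```

The printed rows should match the stem plot in the notes, and `dots[67]` gives the height of the tallest stack in the dotplot (the peak at 67). Empty stems such as 7H are not printed by this sketch, so they would need to be added by hand when drawing the plot.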