AP Stats – Mr. Veshio
Unit 1 – Chapter 5 – Describing Distributions Numerically

I – Center – The Median

Let’s use the data we obtained from the Steelers activity – here’s a stem-and-leaf plot of the Steelers’ players’ weights:

34|4
33|7
32|1 4 5
31|5 5
30|1 5 5 5 5 7 7
29|0 2 5 8 8
28|0 5
27|0
26|5
25|0 2 3 5 5 6
24|1 1 2 3 3 3 4 7 8
23|0 0 4 4 9
22|0 2 5 5 8
21|3 6
20|1 5 5 6 7 8 9
19|0 0 0 1 2 5 5
18|0 3 4 5 6 9
17|7 9

Steeler Player Weight (17|7 means 177 lbs.)

Median –

BY HAND –

Range –

Disadvantage?

Better way to describe spread – Interquartile Range

Quartiles –

Lower quartile –

Upper quartile –

Q1 of the Steelers data –

Q3 of the Steelers data –

Interquartile Range (IQR) – Q3 − Q1 =

5-number summary –

Here’s our 5-# summary for the Steelers players’ weights:

II – Boxplots

We can use a 5-# summary to make a different kind of display – a BOXPLOT

1.) Draw a single horizontal axis (could be vertical, see p. 77) that spans the extent of the data. Above the axis, draw short vertical lines at Q1, the median, and Q3. Connect them with horizontal lines to form a box.

2.) Create “fences” – these fences are not part of the final display – they will help us identify outliers.

Upper fence – Q3 + (1.5 x IQR) =

Lower fence – Q1 − (1.5 x IQR) =

ASK: “Do any of the data values fall outside these fences?” If not, you have no outliers. If so, those data values will be marked differently (see p. 78).

3.) Draw lines from the ends of the box left and right to the most extreme data values found within the fences (usually the min and max).

What info can you obtain from this display?

Let’s make them on the calculator – see p. 80 TI Tips

Enter data: STAT -> EDIT -> 1: Edit
To clear whole list – move cursor onto list name (L1, L2, etc.), hit CLEAR, hit down arrow
Graphing Boxplots: 2ND -> Y= (STAT PLOT) -> 1: Plot1
o Turn it ON
o Select first boxplot (this one will show outliers)
o XList: the data you want to plot (if it’s in L1, put L1)
To see graph properly: ZOOM -> 9: ZoomStat
TRACE to see “min, Q1, median, Q3, max”
Plot data in Plot2 under STAT PLOT to compare (may have to hit ZoomStat again)

What should we do when we want to compare groups with boxplots and/or 5-number summaries?

See “Step-By-Step: Comparing Groups”

III – Center – Mean

What’s the difference between median and mean?

Mean –

Median vs. Mean

STEELERS’ DATA – What’s the mean of the player weights?

*** On the TI: STAT -> CALC -> 1: 1-Var Stats (it will automatically calculate statistics on L1 – if your data is in another list, make sure you put the list name after “1-Var Stats” on the main screen) ***

How does it compare to the median?

Which measure of center could we use to describe the “center” of these data?

IV – Spread – Standard Deviation

Didn’t we already have IQR? –

Standard deviation –

How to find standard deviation

1.) Find the mean of the data

2.) Take each data value and subtract the mean (find out how “far” each value is away from the mean) – these are the “deviations”

Could we stop here and average the deviations?

3.) Square each deviation

4.) Sum up the squared deviations

5.) Find the average of the squared deviations by dividing the sum by n − 1

Wait – huh? “n − 1”? Why not “n”? –

The result of these steps is a value called the VARIANCE, denoted by “$s^2$” and given by this formula:

$s^2 = \dfrac{\sum (y - \bar{y})^2}{n - 1}$

So why not just use variance to measure the spread of the data?

Solution – take the square root of the variance – this will give us the standard deviation, denoted by “$s$” and given by this formula:

$s = \sqrt{\dfrac{\sum (y - \bar{y})^2}{n - 1}}$

**ActivStats** 5-2 “The Spread of a Distribution”

Calculators can find standard deviation for you – it’s listed under “1-Var Stats” – “Sx”

Steelers’ data – what was the std. dev. for the players’ weights?

How could we interpret this?

What Can Go Wrong?

Reality check
Sort the data
Numerical summaries of categorical variables are meaningless!
Watch out for multiple modes
Beware of outliers
MAKE A PICTURE
Be careful when comparing groups that have very different spreads

Just Checking

1.)

2.)

3.)

HW: re-read Ch. 5, do #’s 3, 10, 12, 16, 22, 25, 29, 33
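
Checking the hand calculations – the by-hand steps from sections I, II, and IV (5-number summary, fences, and standard deviation with n − 1) can be sketched in Python as a way to check your work. This is only a minimal sketch, not part of the TI Tips: the function names are made up for illustration, and the quartiles assume the "median of the lower half / median of the upper half" convention, which may differ slightly from other software.

```python
from math import sqrt
from statistics import mean, median, stdev

# Stem-and-leaf rows copied from the plot in section I (17|7 means 177 lbs.).
STEM_LEAF = """
34|4
33|7
32|1 4 5
31|5 5
30|1 5 5 5 5 7 7
29|0 2 5 8 8
28|0 5
27|0
26|5
25|0 2 3 5 5 6
24|1 1 2 3 3 3 4 7 8
23|0 0 4 4 9
22|0 2 5 5 8
21|3 6
20|1 5 5 6 7 8 9
19|0 0 0 1 2 5 5
18|0 3 4 5 6 9
17|7 9
"""


def parse_stem_leaf(text):
    """Turn 'stem|leaf leaf ...' rows into a sorted list of whole numbers."""
    values = []
    for row in text.strip().splitlines():
        stem, leaves = row.split("|")
        values.extend(int(stem) * 10 + int(leaf) for leaf in leaves.split())
    return sorted(values)


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max); needs at least two values.

    Assumed convention: Q1 and Q3 are the medians of the lower and upper
    halves, leaving out the middle value when n is odd.
    """
    data = sorted(data)
    half = len(data) // 2
    lower, upper = data[:half], data[len(data) - half:]
    return min(data), median(lower), median(data), median(upper), max(data)


def fences(q1, q3):
    """Return (lower fence, upper fence) using 1.5 x IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def sample_std_dev(data):
    """Follow steps 1-5 of section IV, then take the square root."""
    y_bar = sum(data) / len(data)                        # step 1: mean
    squared_devs = [(y - y_bar) ** 2 for y in data]      # steps 2-3
    variance = sum(squared_devs) / (len(data) - 1)       # steps 4-5: s^2
    return sqrt(variance)                                # s


weights = parse_stem_leaf(STEM_LEAF)
low, q1, med, q3, high = five_number_summary(weights)
lower_fence, upper_fence = fences(q1, q3)
outliers = [w for w in weights if w < lower_fence or w > upper_fence]

print("n =", len(weights))
print("5-number summary:", (low, q1, med, q3, high))
print("IQR:", q3 - q1)
print("fences:", (lower_fence, upper_fence), "outliers:", outliers)
print("mean:", round(mean(weights), 2))
print("standard deviation (n - 1):", round(sample_std_dev(weights), 2))
```

Under the quartile convention assumed here, the sketch reports a 5-number summary of 177, 205.5, 241.5, 291, 344 for the 72 weights, and no values fall outside the fences. Compare the last line of output with “Sx” from 1-Var Stats, since both divide by n − 1.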