Notes Outline - Describing Distributions - AP-Stats-2012-2013

AP Stats – Mr. Veshio
Unit 1 – Chapter 5 – Describing Distributions Numerically
I – Center – The Median
Let’s use the data we obtained from the Steelers activity – here’s a stem-and-leaf
plot of the Steelers’ players’ weights:
34|4
33|7
32|1 4 5
31|5 5
30|1 5 5 5 5 7 7
29|0 2 5 8 8
28|0 5
27|0
26|5
25|0 2 3 5 5 6
24|1 1 2 3 3 3 4 7 8
23|0 0 4 4 9
22|0 2 5 5 8
21|3 6
20|1 5 5 6 7 8 9
19|0 0 0 1 2 5 5
18|0 3 4 5 6 9
17|7 9
Steeler Player Weight
(17|7 means 177 lbs.)
Median –
BY HAND –
Range –
Disadvantage?
Better way to describe spread – Interquartile Range
Quartiles –
Lower quartile –
Upper quartile –
Q1 of the Steelers data –
Q3 of the Steelers data –
Interquartile Range (IQR) –

Q3 – Q1 =
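To check the by-hand work, here is a minimal Python sketch (not part of the original handout). It rebuilds the weights from the stem-and-leaf plot and finds the median, range, quartiles and IQR. The quartile rule used here (median of the lower half, median of the upper half) is an assumption; it matches what TI calculators report, but other software may use slightly different rules. The names STEM_AND_LEAF and quartiles are made up for this example.

```python
from statistics import median

# Stem-and-leaf plot copied from the notes: 17|7 means 177 lbs.
STEM_AND_LEAF = {
    34: "4", 33: "7", 32: "145", 31: "55", 30: "1555577", 29: "02588",
    28: "05", 27: "0", 26: "5", 25: "023556", 24: "112333478",
    23: "00449", 22: "02558", 21: "36", 20: "1556789", 19: "0001255",
    18: "034569", 17: "79",
}

# Turn every stem/leaf pair into a weight and sort from smallest to largest.
weights = sorted(stem * 10 + int(leaf)
                 for stem, leaves in STEM_AND_LEAF.items()
                 for leaf in leaves)

def quartiles(sorted_values):
    """Q1 and Q3 as the medians of the lower and upper halves.

    With an odd count, the middle value is left out of both halves.
    """
    half = len(sorted_values) // 2
    return median(sorted_values[:half]), median(sorted_values[-half:])

q1, q3 = quartiles(weights)
print("n      =", len(weights))
print("median =", median(weights))
print("range  =", weights[-1] - weights[0])
print("Q1     =", q1)
print("Q3     =", q3)
print("IQR    =", q3 - q1)
```

Sorting first matters: the median and quartiles are positions in the ordered list, not in the order the data were collected.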
5-number summary –
Here’s our 5-# summary for the Steelers players’ weights:
II – Boxplots
We can use a 5-# summary to make a different kind of display – a BOXPLOT
1.) Draw a single horizontal axis (could be vertical, see p. 77) that spans the extent
of the data. Above the axis, draw short vertical lines at Q1, the median and Q3.
Connect them with horizontal lines to form a box.
2.) Create “fences” – these fences are not part of the final display – they will help
us identify outliers
Upper fence – Q3 + (1.5 x IQR) =
Lower fence – Q1 – (1.5 x IQR) =
ASK: “Do any of the data values fall outside these fences?” If not, you have no
outliers. If so, those data values will be marked differently (see p. 78).
3.) Draw lines from the ends of the box left and right to the most extreme data
values found within the fences (usually the min and max).
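The boxplot steps above can be sketched in Python. This is only an illustration: the data set is hypothetical (chosen so that one value lands outside the fences), the function names five_number_summary and boxplot_parts are made up, and the quartiles use the median-of-halves rule, which is an assumption.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Needs at least two values.
    """
    data = sorted(values)
    half = len(data) // 2  # an odd count leaves the middle value out of both halves
    return data[0], median(data[:half]), median(data), median(data[-half:]), data[-1]

def boxplot_parts(values):
    """Fences, outliers, and whisker ends for a boxplot of `values`."""
    low, q1, med, q3, high = five_number_summary(values)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme values that are still inside the fences.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = sorted(v for v in values if v < lower_fence or v > upper_fence)
    return {
        "five_number_summary": (low, q1, med, q3, high),
        "fences": (lower_fence, upper_fence),
        "outliers": outliers,
        "whiskers": (min(inside), max(inside)),
    }

sample = [12, 15, 16, 18, 19, 20, 22, 23, 25, 60]  # hypothetical values
parts = boxplot_parts(sample)
for name, value in parts.items():
    print(name, "=", value)
```

In this made-up sample the right whisker stops at 25, not at the maximum, because 60 is beyond the upper fence and gets marked as an outlier instead.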
What info can you obtain from this display?
Let’s make them on the calculator – see p. 80
TI Tips
• Enter data: STAT -> EDIT -> 1: Edit
• To clear a whole list – move the cursor onto the list name (L1, L2, etc.), hit CLEAR,
then hit the down arrow
• Graphing Boxplots:
  - 2ND -> Y= (STAT PLOT) -> 1: Plot1
    o Turn it ON
    o Select the first boxplot (this one will show outliers)
    o XList: the data you want to plot (if it’s in L1, put L1)
• To see the graph properly: ZOOM -> 9: ZoomStat
• TRACE to see “min, Q1, median, Q3, max”
• Plot data in Plot2 under STAT PLOT to compare (may have to hit
ZoomStat again)
What should we do when we want to compare groups with boxplots and/or 5-number summaries?
- See “Step-By-Step: Comparing Groups”
III – Center – Mean
What’s the difference between median and mean?

Mean –
-
Median vs. Mean

STEELERS’ DATA – What’s the mean of the player weights?
*** On the TI: STAT -> CALC -> 1: 1-Var Stats (it will automatically
calculate statistics on L1; if your data is in another list, put the list
name after “1-Var Stats” on the main screen) ***
How does it compare to the median?
Which measure of center could we use to describe the “center” of these data?
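A small Python sketch can show how the mean and the median react to one extreme value. The weights below are hypothetical, not the Steelers data; they are chosen only to keep the arithmetic easy to follow.

```python
from statistics import mean, median

weights = [180, 190, 200, 210, 220]  # hypothetical weights in lbs
with_outlier = weights + [500]       # add one extreme, made-up value

for label, data in [("original", weights), ("with outlier", with_outlier)]:
    print(label, "-> mean =", mean(data), ", median =", median(data))
```

One extreme value pulls the mean up by 50 lbs while the median moves by only 5 lbs. That is the usual reason to prefer the median for skewed data or data with outliers.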
IV – Spread – Standard Deviation
Didn’t we already have IQR? –
Standard deviation –
How to find standard deviation
1.) Find the mean of the data
2.) Take each data value and subtract the mean (find out how “far” each value is
away from the mean) – these are the “deviations”
Could we stop here and average the deviations?
3.) Square each deviation
4.) Sum up the squared deviations
5.) Find the average of the squared deviations by dividing the sum by n - 1
Wait – huh? “n – 1”? Why not “n” –
The result of these steps is a value called the VARIANCE, denoted by “$s^2$” and
given by this formula:
$s^2 = \dfrac{\sum (y - \bar{y})^2}{n - 1}$
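The five steps can be traced one at a time in Python. The five data values are hypothetical, picked so the mean is a whole number; the last line compares the hand calculation with the standard library's statistics.variance, which also divides by n - 1.

```python
from statistics import mean, variance

values = [4, 8, 6, 5, 7]  # hypothetical data

y_bar = mean(values)                       # step 1: the mean
deviations = [y - y_bar for y in values]   # step 2: distance of each value from the mean
# Averaging the deviations gets us nowhere: they always add up to zero.
print("sum of deviations =", sum(deviations))

squared = [d ** 2 for d in deviations]     # step 3: square each deviation
total = sum(squared)                       # step 4: add up the squared deviations
s_squared = total / (len(values) - 1)      # step 5: divide by n - 1

print("variance by hand  =", s_squared)
print("statistics.variance =", variance(values))
```

The positive and negative deviations cancel, which is why the notes square them before averaging.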
So why not just use variance to measure the spread of the data?
Solution – take the square root of the variance. This gives us the standard deviation,
denoted by “$s$” and given by this formula:
$s = \sqrt{\dfrac{\sum (y - \bar{y})^2}{n - 1}}$
**ActivStats** 5 – 2 “The Spread of a Distribution”
Calculators can find standard deviation for you – it’s listed under “1-Var Stats” –
“Sx”
Steelers’ data – what was the std. dev. for the players’ weights?
How could we interpret this?
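As a rough check on calculator output, the sketch below takes the square root of the variance and compares it with Python's statistics.stdev. The data are hypothetical. On TI-83/84 calculators the 1-Var Stats screen lists σx as well as Sx; the assumption here is that Sx corresponds to stdev (divide by n - 1) and σx to pstdev (divide by n).

```python
from math import sqrt
from statistics import pstdev, stdev, variance

values = [4, 8, 6, 5, 7]  # hypothetical data

s = sqrt(variance(values))  # sample standard deviation: divide by n - 1, then square root
sigma = pstdev(values)      # population version: divide by n, then square root

print("s (like Sx)      =", round(s, 4))
print("stdev check      =", round(stdev(values), 4))
print("sigma (like σx)  =", round(sigma, 4))
```

Taking the square root puts the answer back in the original units (lbs for the Steelers' weights), so it can be read as a typical distance from the mean.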
What Can Go Wrong?
• Reality check

• Sort the data

• Numerical summaries of categorical variables are meaningless!

• Watch out for multiple modes

• Beware of outliers

• MAKE A PICTURE

• Be careful when comparing groups that have very different spreads

Just Checking
1.)
2.)
3.)
HW: re-read Ch. 5, do #’s 3, 10, 12, 16, 22, 25, 29, 33