McGraw-Hill/Irwin
Copyright © 2011 by the McGraw-Hill Companies, Inc. All rights reserved.
LEARNING OBJECTIVES
LO2. Compute and understand quartiles, deciles, and percentiles.
LO3. Construct and interpret box plots.
Learning Objective 2
Compute and understand quartiles, deciles, and percentiles.
The median splits the data into two equal-sized halves.
Quartiles split the data into quarters.
Deciles split the data into tenths.
Percentiles split the data into hundredths, so we can choose any percentage split we like.
These measures include quartiles, deciles, and percentiles.
[Figure: the data range from the lowest to the highest data value. The median marks the 50% point, the quartiles Q1, Q2, and Q3 cut the range into four 25% segments, and the deciles cut it into ten 10% segments.]
To formalize the computational procedure, let \(L_p\) refer to the location of a desired percentile. So if we wanted to find the 33rd percentile we would use \(L_{33}\), and if we wanted the median, the 50th percentile, then \(L_{50}\).
The number of observations is \(n\), so if we want to locate the median, its position is at \((n + 1)/2\), or we could write this as \(L_p = (n + 1)(P/100)\), where \(P\) is the desired percentile.
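The location formula can be sketched in a few lines of Python. The function name `percentile_location` and its argument checks are assumptions made for illustration; only the expression (n + 1)(P/100) comes from the slide.

```python
def percentile_location(n, p):
    """Return the location L_p = (n + 1) * (p / 100) in the ordered data.

    n is the number of observations and p the desired percentile.
    The result is a 1-based position and may be fractional.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    return (n + 1) * p / 100


# Location of the median (50th percentile) among 15 observations.
print(percentile_location(15, 50))  # 8.0
```

A fractional result, such as 2.5 for the 25th percentile of 9 observations, means the percentile falls between two ranked values.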
Listed below are the commissions earned last month by a sample of 15 brokers at Salomon Smith Barney’s Oakland, California, office.
$2,038 $1,758 $1,721 $1,637
$2,097 $2,047 $2,205 $1,787
$2,287 $1,940 $2,311 $2,054
$2,406 $1,471 $1,460
Locate the median, the first quartile, and the third quartile for the commissions earned.
Step 1: Organize the data from the smallest to the largest value.
$1,460 $1,471 $1,637 $1,721
$1,758 $1,787 $1,940 $2,038
$2,047 $2,054 $2,097 $2,205
$2,287 $2,311 $2,406
Step 2: Compute the first and third quartiles. Locate \(L_{25}\) and \(L_{75}\) using:
\[ L_{25} = (15 + 1)\frac{25}{100} = 4 \qquad L_{75} = (15 + 1)\frac{75}{100} = 12 \]
Therefore, the first and third quartiles are located at the 4th and 12th positions, respectively:
\[ L_{25} = \$1{,}721 \qquad L_{75} = \$2{,}205 \]
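The two steps can be checked with a short Python sketch. The helper `value_at` is a hypothetical name. It handles only whole-number locations, which is all this example needs. The slides compute only the quartiles, so the median here comes from applying the same location formula with P = 50.

```python
commissions = [2038, 1758, 1721, 1637, 2097, 2047, 2205, 1787,
               2287, 1940, 2311, 2054, 2406, 1471, 1460]

# Step 1: order the data from smallest to largest.
ordered = sorted(commissions)
n = len(ordered)


def value_at(p):
    """Return the ordered value at location (n + 1) * p / 100.

    Only whole-number locations are supported in this sketch.
    """
    location = (n + 1) * p / 100
    if location != int(location):
        raise ValueError("fractional location; interpolation would be needed")
    return ordered[int(location) - 1]  # locations are 1-based


# Step 2: look up the 4th, 8th, and 12th ordered values.
q1, median, q3 = value_at(25), value_at(50), value_at(75)
print(q1, median, q3)  # 1721 2038 2205
```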
Learning Objective 3
Construct and interpret box plots.
A box plot is a graphical display, based on quartiles, that helps us picture a set of data.
To construct a box plot, we need only five statistics:
1. the minimum value,
2. Q1 (the first quartile),
3. the median,
4. Q3 (the third quartile), and
5. the maximum value.
Step 1: Create an appropriate scale along the horizontal axis.
Step 2: Draw a box that starts at Q1 (15 minutes) and ends at Q3 (22 minutes). Inside the box we place a vertical line to represent the median (18 minutes).
Step 3: Extend horizontal lines from the box out to the minimum value (13 minutes) and the maximum value (30 minutes).
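The three steps can be mimicked with a rough text rendering. This is a sketch only: `ascii_box_plot`, its `width` parameter, and the choice of characters are assumptions for illustration, not a standard plotting routine.

```python
def ascii_box_plot(minimum, q1, median, q3, maximum, width=60):
    """Render a one-line text box plot from the five summary statistics."""
    if not minimum <= q1 <= median <= q3 <= maximum:
        raise ValueError("expected minimum <= q1 <= median <= q3 <= maximum")
    if maximum == minimum:
        raise ValueError("maximum must exceed minimum to build a scale")

    def column(value):
        # Step 1: map a data value onto the horizontal scale.
        return round((value - minimum) / (maximum - minimum) * (width - 1))

    line = [" "] * width
    # Step 3: whiskers from the minimum to Q1 and from Q3 to the maximum.
    for i in range(column(minimum), column(q1)):
        line[i] = "-"
    for i in range(column(q3) + 1, column(maximum) + 1):
        line[i] = "-"
    # Step 2: the box from Q1 to Q3, with a bar at the median.
    for i in range(column(q1), column(q3) + 1):
        line[i] = "="
    line[column(q1)] = "["
    line[column(q3)] = "]"
    line[column(median)] = "|"
    return "".join(line)


# The values from the steps above, in minutes.
print(ascii_box_plot(13, 15, 18, 22, 30))
```

The whisker on the right comes out longer than the one on the left, because the maximum (30) lies further from Q3 than the minimum (13) lies from Q1.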
Example: Draw a box-and-whisker plot for the following sample data.
Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22
(n = 9)
\(Q_1\) (\(L_{25}\)) is in the \((9 + 1) \times 25/100 = 2.5\) position of the ranked data, so use the value halfway between the 2nd and 3rd values: \(Q_1 = 12.5\).
\(Q_1\) and \(Q_3\) are measures of non-central location.
\(Q_2\), the median, is a measure of central tendency.
Quartile Measures
Calculating the Quartiles: Example
Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22
(n = 9)
\(Q_1\) is in the \((9 + 1) \times 25/100 = 2.5\) position of the ranked data, so \(Q_1 = (12 + 13)/2 = 12.5\).
\(Q_2\) is in the \((9 + 1) \times 50/100 = 5\)th position of the ranked data, so \(Q_2 = \text{median} = 16\).
\(Q_3\) is in the \((9 + 1) \times 75/100 = 7.5\) position of the ranked data, so \(Q_3 = (18 + 21)/2 = 19.5\).
\(Q_1\) and \(Q_3\) are measures of non-central location.
\(Q_2\), the median, is a measure of central tendency.
Quartile Measures: Calculation Rules
When calculating the ranked position, use the following rules:
― If the result is a whole number, then it is the ranked position to use.
― If the result is a fractional half (e.g. 2.5, 7.5, 8.5), then average the two corresponding data values.
― If the result is neither a whole number nor a fractional half, then interpolate between the two adjacent data values.
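The three rules can be folded into a single function. This is a sketch: `percentile_value` is a hypothetical name. It treats the half-way case as ordinary interpolation with a fraction of 0.5, which gives the same result as averaging the two values.

```python
import math


def percentile_value(data, p):
    """Return the value at the p-th percentile, located at (n + 1) * p / 100."""
    ordered = sorted(data)
    n = len(ordered)
    location = (n + 1) * p / 100
    if not 1 <= location <= n:
        raise ValueError("location falls outside the ranked data")

    lower = math.floor(location)   # rank at or just below the location
    fraction = location - lower    # 0 for a whole number, 0.5 for a half
    if fraction == 0:
        # Rule 1: a whole number is the ranked position itself.
        return ordered[lower - 1]

    below, above = ordered[lower - 1], ordered[lower]
    # Rules 2 and 3: move `fraction` of the way from the lower value
    # to the upper one (a fraction of 0.5 is the plain average).
    return below + fraction * (above - below)


sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print(percentile_value(sample, 25))  # 12.5
print(percentile_value(sample, 50))  # 16
print(percentile_value(sample, 75))  # 19.5
```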
Quartile Measures: The Interquartile Range (IQR)
― The IQR is \(Q_3 - Q_1\) and measures the spread in the middle 50% of the data.
― The IQR is a measure of variability that is not influenced by outliers or extreme values.
― Measures like \(Q_1\), \(Q_3\), and the IQR that are not influenced by outliers are called resistant measures.
The Interquartile Range
Example:
X minimum = 11, \(Q_1\) = 12.5, Median (\(Q_2\)) = 16, \(Q_3\) = 19.5, X maximum = 22
Each of the four intervals between these values holds 25% of the data.
Interquartile range = 19.5 – 12.5 = 7
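The example, and the claim that the IQR resists extreme values, can be checked with Python's standard library. This sketch uses `statistics.quantiles` with `method="exclusive"`, which for this sample reproduces the quartiles shown above. The helper `iqr` and the extreme value 220 are hypothetical, introduced only for illustration.

```python
import statistics

sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]


def iqr(data):
    """Interquartile range Q3 - Q1 from the three quartile cut points."""
    q1, _median, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return q3 - q1


# Hypothetical variant: the maximum (22) is replaced by an extreme value.
with_outlier = sample[:-1] + [220]

print(iqr(sample), max(sample) - min(sample))                    # 7.0 11
print(iqr(with_outlier), max(with_outlier) - min(with_outlier))  # 7.0 209
```

The range jumps from 11 to 209 while the IQR stays at 7, which is what "resistant" means here.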
[Figure: three panels labelled Negatively-Skewed, Symmetrical, and Positively-Skewed, each marking the positions of Q1, Q2, and Q3.]
Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.
If you found that the first quartile was the 13.75th value, then you interpolate like this:
1. Take the 13th and 14th data values.
2. Find the difference |14th – 13th|.
3. Multiply the difference by 0.75.
4. Add the calculated value to the 13th value.
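The interpolation steps can be condensed into one expression. In this sketch \(x_{(k)}\) is an assumed notation, not used elsewhere in the slides, for the k-th value of the ordered data:

```latex
Q_1 = x_{(13)} + 0.75\,\bigl(x_{(14)} - x_{(13)}\bigr)
```

As a purely hypothetical illustration, if the 13th value were 40 and the 14th value were 44, this would give \(Q_1 = 40 + 0.75 \times 4 = 43\).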
Page 116: Q4-21, Q4-23, Q4-25
35 23 18 25 20
16 22 27 33 41
27 37 17 25 27
29 28 31 32 40
Unordered stem-and-leaf display:
1 | 8 6 7
2 | 3 5 0 2 7 7 5 7 9 8
3 | 5 3 7 1 2
4 | 1 0
Key: 1 | 8 means 18
Ordered stem-and-leaf display:
1 | 6 7 8
2 | 0 2 3 5 5 7 7 7 8 9
3 | 1 2 3 5 7
4 | 0 1
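Both displays can be rebuilt from the 20 data values with a short sketch. The function name `stem_and_leaf` and its `ordered` flag are assumptions for illustration. It assumes non-negative whole numbers, with the last digit as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values, ordered=True):
    """Return stem-and-leaf rows, one string per stem."""
    leaves = defaultdict(list)
    for value in (sorted(values) if ordered else values):
        stem, leaf = divmod(value, 10)  # 18 -> stem 1, leaf 8
        leaves[stem].append(leaf)

    rows = []
    # Show every stem from the smallest to the largest, even an empty one.
    for stem in range(min(leaves), max(leaves) + 1):
        row = f"{stem} | " + " ".join(str(leaf) for leaf in leaves[stem])
        rows.append(row.rstrip())
    return rows


data = [35, 23, 18, 25, 20,
        16, 22, 27, 33, 41,
        27, 37, 17, 25, 27,
        29, 28, 31, 32, 40]

for row in stem_and_leaf(data, ordered=False):
    print(row)
print("Key: 1 | 8 means 18")
for row in stem_and_leaf(data):
    print(row)
```

Building the unordered display first and then sorting each row mirrors how the display is usually drawn by hand.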
5.2 6.6 4.3 8.3 5.1
7.5 8.6 7.1 7.8 2.2
6.6 5.8 3.5 7.5 6.1
3.8 2.5 2.7 8.8 4.8
The following data were collected on the ages of cyclists involved in road accidents
66 6 62 19 20 15 21 8 21 63 44 10 44
26 35 26 61 13 61 28 21 7 10 52 13 52
19 22 64 11 39 22 9 13 9 17 64 32 8
62 28 36 37 18 138 16 67 45 10 55 14 66
49 9 23 12 9 37 7 36 9 88 46 12 59
18 20 11 25 7 42 29 6 60 60 16 50 16
18 15 18 17 31 14 22 14 34 20 9 67 61
34
Total 92
Ages of cyclists in road accidents
Always include a key
1 | 0 0 0 1 1 2 2 3 3 3 4 4 4 5 5 6 6 6 7 7 8 8 8 8 9 9
2 | 0 0 0 1 1 1 2 2 2 3 5 6 6 8 8 9
3 | 1 2 4 4 5 6 6 7 7 9
4 | 2 4 4 5 6 9
5 | 0 2 2 5 9
8 | 8
Key: 6|7 means 67 years