McGraw-Hill/Irwin

Describing Data:

Percentiles

Copyright © 2011 by the McGraw-Hill Companies, Inc. All rights reserved.

LEARNING OBJECTIVES

LO1. Compute and understand quartiles, deciles, and percentiles.

LO2. Construct and interpret box plots.

Quartiles, Deciles and Percentiles

Learning Objective 2

Compute and understand quartiles, deciles, and percentiles.

The median splits the data into two equal-sized halves.

Quartiles split the data into quarters.

Deciles split the data into tenths.

Percentiles split the data into hundredths, so we can choose any split we like.

These measures include quartiles, deciles, and percentiles.

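[Figure: a line running from the lowest data value to the highest data value. The median (the 50% value) divides it into two halves of 50% each; the quartiles Q1, Q2 and Q3 divide it into four parts of 25% each; the deciles divide it into ten parts of 10% (1/10) each.]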

Percentile Computation

To formalize the computational procedure, let \( L_p \) refer to the location of a desired percentile. So if we wanted to find the 33rd percentile we would use \( L_{33} \), and if we wanted the median, the 50th percentile, then \( L_{50} \).

The number of observations is \( n \), so if we want to locate the median, its position is at \( (n + 1)/2 \), or we could write this as \( (n + 1)(P/100) \), where \( P \) is the desired percentile:

\[ L_p = (n + 1)\frac{P}{100} \]
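The location formula can be sketched in a few lines of Python. The function name `percentile_location` is an illustrative choice, not something from the slides, and the returned location is 1-based (position 1 is the smallest value).

```python
def percentile_location(n: int, p: float) -> float:
    """Return the 1-based location L_p = (n + 1) * P / 100 in the sorted data."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    return (n + 1) * p / 100


# Median location for 15 observations: (15 + 1) * 50 / 100
print(percentile_location(15, 50))  # 8.0
# Location of the 33rd percentile for the same sample size
print(percentile_location(15, 33))  # 5.28
```

A whole-number result points directly at one observation; a fractional result falls between two observations.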

Percentiles - Example

Listed below are the commissions earned last month by a sample of 15 brokers at Salomon Smith Barney’s Oakland, California, office.

$2,038 $1,758 $1,721 $1,637

$2,097 $2,047 $2,205 $1,787

$2,287 $1,940 $2,311 $2,054

$2,406 $1,471 $1,460

Locate the median, the first quartile, and the third quartile for the commissions earned.


Percentiles – Example (cont.)

Step 1: Organize the data from the lowest to the highest value.

$1,460 $1,471 $1,637 $1,721

$1,758 $1,787 $1,940 $2,038

$2,047 $2,054 $2,097 $2,205

$2,287 $2,311 $2,406

Percentiles – Example (cont.)

Step 2: Compute the first and third quartiles. Locate \( L_{25} \) and \( L_{75} \) using:

\[ L_{25} = (15 + 1)\frac{25}{100} = 4 \qquad L_{75} = (15 + 1)\frac{75}{100} = 12 \]

Therefore, the first and third quartiles are located at the 4th and 12th positions, respectively:

First quartile (position \( L_{25} = 4 \)): $1,721

Third quartile (position \( L_{75} = 12 \)): $2,205
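The broker example can be checked with a short script. This is a minimal sketch that only handles whole-number locations, which is all this example needs. The helper name `value_at` is hypothetical. The median is not worked on the slides; it follows from the same formula at location (15 + 1)(50/100) = 8.

```python
commissions = [2038, 1758, 1721, 1637, 2097, 2047, 2205, 1787,
               2287, 1940, 2311, 2054, 2406, 1471, 1460]

ordered = sorted(commissions)   # Step 1: lowest to highest
n = len(ordered)


def value_at(p):
    """Value at location (n + 1) * p / 100; assumes that location is a whole number."""
    numerator = (n + 1) * p
    if numerator % 100 != 0:
        raise ValueError("location is not a whole number; interpolation needed")
    return ordered[numerator // 100 - 1]   # 1-based location -> 0-based index


q1, median, q3 = value_at(25), value_at(50), value_at(75)
print(q1, median, q3)  # 1721 2038 2205
```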

Learning Objective 3

Construct and interpret box plots.

Boxplots

A box plot is a graphical display, based on quartiles, that helps us picture a set of data.

To construct a box plot, we need only five statistics:

1. the minimum value,

2. Q1 (the first quartile),

3. the median,

4. Q3 (the third quartile), and

5. the maximum value.
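These five statistics can be collected with Python's standard library. The sketch below assumes that `statistics.quantiles` with `method="exclusive"` places the quartiles at (n + 1)(P/100) with linear interpolation, which matches the location formula used in this deck. The function name `five_number_summary` is illustrative, and the data are the broker commissions from the percentile example.

```python
from statistics import quantiles


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, median, q3 = quantiles(data, n=4, method="exclusive")
    return min(data), q1, median, q3, max(data)


commissions = [2038, 1758, 1721, 1637, 2097, 2047, 2205, 1787,
               2287, 1940, 2311, 2054, 2406, 1471, 1460]
print(five_number_summary(commissions))
# (1460, 1721.0, 2038.0, 2205.0, 2406)
```

Other software may use a different quartile convention, so results can differ slightly from hand calculations on small samples.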


Boxplot - Example

Step 1: Create an appropriate scale along the horizontal axis.

Step 2: Draw a box that starts at Q1 (15 minutes) and ends at Q3 (22 minutes). Inside the box we place a vertical line to represent the median (18 minutes).

Step 3: Extend horizontal lines from the box out to the minimum value (13 minutes) and the maximum value (30 minutes).
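The three steps can be imitated with a rough text-mode drawing. This is only a sketch under simplifying assumptions: all five statistics are whole numbers, one text column stands for one minute, and the function name `ascii_boxplot` and the 10 to 32 scale are choices made here, not part of the example.

```python
def ascii_boxplot(minimum, q1, median, q3, maximum, lo, hi, tick=5):
    """Return (plot, axis) strings; one text column per integer unit from lo to hi."""
    width = hi - lo + 1

    # Step 1: a scale along the horizontal axis, labelled every `tick` units.
    axis = [" "] * width
    for value in range(lo, hi + 1, tick):
        label = str(value)
        axis[value - lo:value - lo + len(label)] = label

    plot = [" "] * width
    # Step 3: whiskers run from the minimum to the maximum ...
    for value in range(minimum, maximum + 1):
        plot[value - lo] = "-"
    # Step 2: ... and are overdrawn by the box from Q1 to Q3, with a bar at the median.
    for value in range(q1, q3 + 1):
        plot[value - lo] = "="
    plot[q1 - lo], plot[q3 - lo], plot[median - lo] = "[", "]", "|"

    return "".join(plot).rstrip(), "".join(axis).rstrip()


plot, axis = ascii_boxplot(13, 15, 18, 22, 30, lo=10, hi=32)
print(plot)
print(axis)
```

In the printed result the brackets mark Q1 and Q3, the vertical bar marks the median, and the dashes are the whiskers. The long right-hand whisker reflects the distance from Q3 (22 minutes) to the maximum (30 minutes).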


Example: Draw a Box & Whisker Plot

Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22

(n = 9)

Q1 (at location \( L_{25} \)) is in the (9+1)*25/100 = 2.5 position of the ranked data, so use the value halfway between the 2nd and 3rd values: Q1 = 12.5

Q1 and Q3 are measures of non-central location

Q2 = median, is a measure of central tendency

Quartile Measures

Calculating The Quartiles: Example

Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22

(n = 9)

Q1 is in the (9+1)*25/100 = 2.5 position of the ranked data, so Q1 = (12+13)/2 = 12.5

Q2 is in the (9+1)*50/100 = 5th position of the ranked data, so Q2 = median = 16

Q3 is in the (9+1)*75/100 = 7.5 position of the ranked data, so Q3 = (18+21)/2 = 19.5


Quartile Measures: Calculation Rules

When calculating the ranked position, use the following rules:

― If the result is a whole number, then it is the ranked position to use.

― If the result is a fractional half (e.g. 2.5, 7.5, 8.5, etc.), then average the two corresponding data values.

― If the result is neither a whole number nor a fractional half, then interpolate between the two adjacent data points.
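The three rules can be folded into one function, since averaging at a fractional half is just interpolation with a fraction of 0.5. This is a minimal sketch: the name `percentile_value` is illustrative, the input is assumed to be sorted already, and the checks use the nine-value sample array from the quartile example.

```python
import math


def percentile_value(ordered, p):
    """Value at location (n + 1) * p / 100 in already-sorted data."""
    n = len(ordered)
    location = (n + 1) * p / 100
    lower = math.floor(location)      # ranked position just below the location
    fraction = location - lower       # 0 for a whole number, 0.5 for a fractional half
    if not 1 <= lower <= n or (fraction > 0 and lower == n):
        raise ValueError("location falls outside the data")
    if fraction == 0:
        return ordered[lower - 1]     # rule 1: use the ranked position directly
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)   # rules 2 and 3: interpolate


sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print(percentile_value(sample, 25))  # 12.5
print(percentile_value(sample, 50))  # 16
print(percentile_value(sample, 75))  # 19.5
```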

Quartile Measures: The Interquartile Range (IQR)

― The IQR is Q3 – Q1 and measures the spread in the middle 50% of the data

― The IQR is a measure of variability that is not influenced by outliers or extreme values

― Measures like Q1, Q3, and IQR that are not influenced by outliers are called resistant measures

The Interquartile Range

Example:

X minimum   Q1     Median (Q2)   Q3     X maximum
11          12.5   16            19.5   22

Each of the four intervals between these values contains 25% of the data.

Interquartile range = 19.5 – 12.5 = 7
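A quick way to see why the IQR is called resistant is to replace the largest value with an extreme one and compare it with the ordinary range. The sketch below uses the nine-value sample array from the quartile example. The value 220 is made up for illustration, and the use of `statistics.quantiles` with `method="exclusive"` assumes it follows the (n + 1)(P/100) location rule.

```python
from statistics import quantiles


def iqr(data):
    """Interquartile range Q3 - Q1."""
    q1, _, q3 = quantiles(data, n=4, method="exclusive")
    return q3 - q1


sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]
with_outlier = sample[:-1] + [220]    # hypothetical extreme value replacing 22

print(iqr(sample))                            # 7.0
print(iqr(with_outlier))                      # 7.0
print(max(sample) - min(sample))              # 11
print(max(with_outlier) - min(with_outlier))  # 209
```

The range jumps from 11 to 209, while the IQR stays at 7 because Q1 and Q3 do not depend on the largest value.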

Distribution Shape and The Boxplot

[Figure: three box plots, each marked with Q1, Q2 and Q3, showing a negatively skewed, a symmetrical and a positively skewed distribution.]

Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.

Interpolation

If you found that the first quartile was the 13.75th value, then you interpolate like this:

1. Take the 13th and 14th data values.

2. Find the difference |14th – 13th|.

3. Multiply the difference by 0.75.

4. Add the calculated value to the 13th value.
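Written as a formula, the interpolation looks as follows. The notation \( x_{(13)} \) and \( x_{(14)} \) for the 13th and 14th ordered values is assumed here, and the values 40 and 44 in the second line are hypothetical, chosen only to show the arithmetic.

```latex
\[ Q_1 = x_{(13)} + 0.75\,\bigl(x_{(14)} - x_{(13)}\bigr) \]

\[ x_{(13)} = 40,\quad x_{(14)} = 44 \;\Longrightarrow\; Q_1 = 40 + 0.75\,(44 - 40) = 43 \]
```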

Exercises – To Do

Page 116: Q4-21, Q4-23, Q4-25

Stem and Leaf Diagrams

35 23 18 25 20

16 22 27 33 41

27 37 17 25 27

29 28 31 32 40

Unordered leaves:

1 | 8 6 7
2 | 3 5 0 2 7 7 5 7 9 8
3 | 5 3 7 1 2
4 | 1 0

Key 1|8 means 18

Ordered leaves:

1 | 6 7 8
2 | 0 2 3 5 5 7 7 7 8 9
3 | 1 2 3 5 7
4 | 0 1
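An ordered stem and leaf diagram for the twenty values above can be built in a few lines. This is a sketch that assumes non-negative whole numbers with the tens digit as the stem and the units digit as the leaf. The function name `stem_and_leaf` is illustrative, and stems with no leaves are simply omitted.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units digits)."""
    stems = defaultdict(list)
    for v in sorted(values):          # sorting first gives ordered leaves
        stems[v // 10].append(v % 10)
    return dict(stems)


data = [35, 23, 18, 25, 20, 16, 22, 27, 33, 41,
        27, 37, 17, 25, 27, 29, 28, 31, 32, 40]

for stem, leaves in stem_and_leaf(data).items():
    print(stem, "|", " ".join(map(str, leaves)))
# 1 | 6 7 8
# 2 | 0 2 3 5 5 7 7 7 8 9
# 3 | 1 2 3 5 7
# 4 | 0 1
```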

Stem and Leaf Diagrams

Raw data:

5.2 6.6 4.3 8.3 5.1

7.5 8.6 7.1 7.8 2.2

6.6 5.8 3.5 7.5 6.1

3.8 2.5 2.7 8.8 4.8

The following data were collected on the ages of cyclists involved in road accidents:

66 6 62 19 20 15 21 8 21 63 44 10 44

26 35 26 61 13 61 28 21 7 10 52 13 52

19 22 64 11 39 22 9 13 9 17 64 32 8

62 28 36 37 18 138 16 67 45 10 55 14 66

49 9 23 12 9 37 7 36 9 88 46 12 59

18 20 11 25 7 42 29 6 60 60 16 50 16

18 15 18 17 31 14 22 14 34 20 9 67 61

34

Total 92

Ages of cyclists in road accidents

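Always include a key.

0 | 6 6 7 7 7 8 8 9 9 9 9 9 9
1 | 0 0 0 1 1 2 2 3 3 3 4 4 4 5 5 6 6 6 7 7 8 8 8 8 9 9
2 | 0 0 0 1 1 1 2 2 2 3 5 6 6 8 8 9
3 | 1 2 4 4 5 6 6 7 7 9
4 | 2 4 4 5 6 9
5 | 0 2 2 5 9
6 | 0 0 1 1 1 2 2 3 4 4 6 6 7 7
7 |
8 | 8

Key 6|7 means 67 years

The raw value 138 lies outside these stems and is not plotted.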