Lect 04 - Measures of Dispersion

Measures of Dispersion

Week 4

Dispersion

• Two groups of three students

Group 1: 4, 7, 10
Group 2: 7, 7, 7

• Mean mark

Group 1: (4 + 7 + 10)/3 = 21/3 = 7
Group 2: (7 + 7 + 7)/3 = 21/3 = 7

• Same mean mark, but Group 1’s marks are widely spread, Group 2’s are all the same

• The following diagram reinforces this point

Range

• The absolute difference between the highest and lowest value of the raw data

• Group of students: 4, 7, 10

• Range = Maximum – Minimum = 10 – 4 = 6
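As a quick check, the range of the two groups of marks can be computed in a few lines of Python. This is a minimal sketch; the function name `value_range` is an illustrative choice and does not come from the lecture.

```python
def value_range(values):
    """Return the range: maximum minus minimum of the raw data."""
    return max(values) - min(values)


group_1 = [4, 7, 10]  # marks for Group 1
group_2 = [7, 7, 7]   # marks for Group 2

print(value_range(group_1))  # 6
print(value_range(group_2))  # 0
```

Both groups have the same mean, yet the range separates them: 6 for Group 1 and 0 for Group 2.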

Interquartile Range

• This is the absolute difference between the upper and lower quartiles of the distribution.

• Interquartile Range = Upper Quartile - Lower Quartile

• See the next slides for estimating quartiles

Quartiles (1)

• Upper quartile: the value with 25% of the distribution above it and 75% below it

• Lower quartile: the value with 75% of the distribution above it and 25% below it

Quartiles (2)

• If the data is ungrouped, then put the data in order in an array

• Find the quartile position, then estimate its value, as previously for the median

• Upper quartile (Q3): position = 3(n + 1)/4

• Lower quartile (Q1): position = (n + 1)/4

Quartiles (3)

Example: ungrouped data:

3, 5, 6, 9, 15, 27, 30, 35, 37

• Lower quartile: position = (n + 1)/4 = (9 + 1)/4 = 2.5th

Lower quartile: value = 5.5

(mid-way between 2nd and 3rd number in array)

• Upper quartile: position = 3(n + 1)/4 = 3(9 + 1)/4 = 7.5th

Upper quartile: value = 32.5

(mid-way between 7th and 8th number in array)
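The position rule used in this example can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function name `quartile_by_position` is hypothetical, and it assumes that a fractional position is resolved by linear interpolation between the two neighbouring values, which reproduces the "mid-way" estimates above.

```python
def quartile_by_position(values, k):
    """Estimate the k-th quartile (k = 1 or 3) using position k(n + 1)/4.

    Assumption: a fractional position is resolved by linear interpolation
    between the two neighbouring values in the ordered array.
    """
    ordered = sorted(values)
    n = len(ordered)
    position = k * (n + 1) / 4           # 1-based position in the array
    whole = int(position)                # position of the value just below
    fraction = position - whole          # how far towards the next value
    if whole < 1:                        # position falls before the first value
        return ordered[0]
    if whole >= n:                       # position falls at or past the last value
        return ordered[-1]
    below = ordered[whole - 1]
    above = ordered[whole]
    return below + fraction * (above - below)


data = [3, 5, 6, 9, 15, 27, 30, 35, 37]
print(quartile_by_position(data, 1))  # 5.5
print(quartile_by_position(data, 3))  # 32.5
```

Other textbooks and software packages use slightly different position rules, so quartile values from other tools may differ a little on small data sets.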

Quartiles (4)

• Grouped data : use the same approach as for estimating the median for grouped data in week 4, except this time use the quartile positions

Semi-Interquartile Range

• This is half the interquartile range. It is sometimes called the Quartile Deviation

• Semi-Interquartile Range = (Upper Quartile - Lower Quartile)/2

Example

Using previous ungrouped data

Interquartile range = UQ - LQ = 32.5 – 5.5 = 27

Semi-interquartile range = (UQ - LQ)/2 = (32.5 – 5.5)/2 = 27/2 = 13.5
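The same arithmetic can be checked in Python. This is a small sketch that takes the quartile values 32.5 and 5.5 from the ungrouped example as given; the variable names are illustrative.

```python
upper_quartile = 32.5  # Q3 from the ungrouped example
lower_quartile = 5.5   # Q1 from the ungrouped example

interquartile_range = upper_quartile - lower_quartile
semi_interquartile_range = interquartile_range / 2

print(interquartile_range)       # 27.0
print(semi_interquartile_range)  # 13.5
```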

Mean Deviation

• Average of the absolute deviations from the arithmetic mean (ignoring the sign)

• When two straight lines (rather than curved brackets) surround a number or variable it is referred to as the modulus and we ignore the sign

Mean Deviation of ungrouped data

• $X_1 = 2$, $X_2 = 4$, $X_3 = 3$, so $\bar{X} = 3$

• $\text{MD} = \frac{|X_1 - \bar{X}| + |X_2 - \bar{X}| + |X_3 - \bar{X}|}{n} = \frac{1 + 1 + 0}{3} = \frac{2}{3}$
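The mean deviation of these three values can be reproduced with a short Python function. This is a minimal sketch; the name `mean_deviation` is an illustrative choice.

```python
def mean_deviation(values):
    """Average of the absolute deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    # abs() plays the role of the modulus: it drops the sign of each deviation
    return sum(abs(x - mean) for x in values) / n


print(mean_deviation([2, 4, 3]))  # 0.666..., i.e. 2/3
```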

Variance

• If we square all the deviations from the arithmetic mean, then we no longer need to bother with dropping the signs since all the values will be positive.

• We can then replace the straight line brackets (modulus) for the Mean Deviation with the more usual round brackets.

• Variance is the average of the squared deviations from the arithmetic mean

Variance: ungrouped data (1)

• $\text{Variance} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n}$

• To calculate the variance

1. Calculate the mean value $\bar{X}$

2. Subtract the mean from each value in turn, that is, find $(X_i - \bar{X})$

3. Square each answer to get $(X_i - \bar{X})^2$

Variance: ungrouped data (2)

4. Add up all these squared values to get $\sum_{i=1}^{n} (X_i - \bar{X})^2$

5. Divide the result by n to get $\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n}$

6. You now have the average of the squared deviations from the mean (in square units)
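The steps above can be sketched in Python, one line per step. The function name `population_variance` is an illustrative choice; it divides by n, as in the lecture, rather than by n - 1.

```python
def population_variance(values):
    """Average of the squared deviations from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n                     # step 1: the mean
    deviations = [x - mean for x in values]    # step 2: subtract the mean
    squared = [d ** 2 for d in deviations]     # step 3: square each deviation
    total = sum(squared)                       # step 4: add up the squares
    return total / n                           # step 5: divide by n


print(population_variance([4, 7, 10]))  # 6.0
print(population_variance([7, 7, 7]))   # 0.0
```

The two calls use the marks of Group 1 and Group 2 from the start of the lecture.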

Standard deviation (SD)

• This is simply the square root of the variance

• An advantage is that we avoid the square units of the variance

• The larger the SD, the larger the average dispersion of the data from the mean

• The smaller the SD, the smaller the average dispersion of the data from the mean

Example 1: variance/standard deviation

$x_i$      $x_i - \bar{x}$      $(x_i - \bar{x})^2$
4          4 – 7 = -3           (-3)^2 = 9
7          7 – 7 = 0            0^2 = 0
10         10 – 7 = 3           3^2 = 9
Total                           18

Solutions

$\text{Variance} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n} = \frac{18}{3} = 6$

Standard deviation = √6 = 2.449 units

Example 2: variance/standard deviation

$x_i$      $x_i - \bar{x}$      $(x_i - \bar{x})^2$
7          7 – 7 = 0            0^2 = 0
7          7 – 7 = 0            0^2 = 0
7          7 – 7 = 0            0^2 = 0
Total                           0

Solution

$\text{Variance} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n} = \frac{0}{3} = 0$

Standard deviation = √0 = 0, i.e. there is no spread of values
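Both worked examples can be cross-checked with Python's standard `statistics` module. This sketch assumes the population versions `pvariance` and `pstdev` are the right match, since the lecture divides by n.

```python
import statistics

group_1 = [4, 7, 10]  # Example 1
group_2 = [7, 7, 7]   # Example 2

for name, marks in (("Group 1", group_1), ("Group 2", group_2)):
    variance = statistics.pvariance(marks)  # divides by n
    sd = statistics.pstdev(marks)           # square root of pvariance
    print(f"{name}: variance = {variance:.3f}, SD = {sd:.3f}")

# Group 1: variance = 6.000, SD = 2.449
# Group 2: variance = 0.000, SD = 0.000
```

Note that `statistics.variance` and `statistics.stdev` divide by n - 1 instead, so they would give different answers for the same marks.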

Variance of grouped data

$S^2 = \frac{\sum_{i=1}^{j} F_i X_i^2}{\sum_{i=1}^{j} F_i} - \left( \frac{\sum_{i=1}^{j} F_i X_i}{\sum_{i=1}^{j} F_i} \right)^2$

where

$F_i$ = frequency of the ith class interval

$X_i$ = mid point of the ith class interval

$j$ = number of class intervals

Price of item (£)    No of items sold
LCB     UCB          Fi        Xi      FiXi     FiXi^2
1.5     2.5          15        2       30       60
2.5     3.5          2         3       6        18
3.5     4.5          19        4       76       304
4.5     5.5          10        5       50       250
5.5     6.5          14        6       84       504
Total                60                246      1136

S^2 = 1136/60 – (246/60)^2

S^2 = 18.93 – 4.1^2

S^2 = 18.93 – 16.81

S^2 = 2.12 (in square pounds)

S = √2.12 ≈ £1.46
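The grouped-data calculation can be reproduced in Python from the class frequencies and mid points in the table. This is a minimal sketch; the function name `grouped_variance` is an illustrative choice.

```python
import math


def grouped_variance(frequencies, midpoints):
    """Variance of grouped data: sum(F*X^2)/sum(F) - (sum(F*X)/sum(F))^2."""
    total_f = sum(frequencies)
    sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    sum_fx2 = sum(f * x ** 2 for f, x in zip(frequencies, midpoints))
    return sum_fx2 / total_f - (sum_fx / total_f) ** 2


frequencies = [15, 2, 19, 10, 14]  # Fi: number of items sold per class
midpoints = [2, 3, 4, 5, 6]        # Xi: class mid points (£)

variance = grouped_variance(frequencies, midpoints)
print(round(variance, 2))             # 2.12
print(round(math.sqrt(variance), 2))  # 1.46
```

Working from the unrounded variance, the standard deviation comes out at about £1.457.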

Co-efficient of variation (C of V)

• A measure of relative dispersion

• Given by $S / \bar{X}$, i.e. the standard deviation divided by the arithmetic mean of the data.

• Data sets with a higher co-efficient of variation have higher relative dispersion
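As an illustration, the coefficient of variation of the two groups of marks from the start of the lecture can be computed in Python. This sketch assumes the population standard deviation (dividing by n), as used elsewhere in the lecture; the function name `coefficient_of_variation` is an illustrative choice.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation divided by the arithmetic mean."""
    # Assumption: population SD (pstdev), matching the lecture's division by n
    return statistics.pstdev(values) / statistics.mean(values)


print(round(coefficient_of_variation([4, 7, 10]), 3))  # 0.35
print(coefficient_of_variation([7, 7, 7]))             # 0.0
```

Group 1 has the higher coefficient of variation, so it has the higher relative dispersion.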
