Variation

Measures of variation quantify how spread out the data is.

Variation is one of the core ideas in statistics.

Super-simple measure of variation

Range = highest value – lowest value

The range uses only the two most extreme values, so it is of limited use, but it gives a rough idea of how spread out the data are.
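As a quick illustration, the range takes one line to compute. The data list here is an assumption chosen for this sketch (it matches the small data set used in the worked examples in these notes).

```python
# Illustrative data (assumed for this sketch).
data = [7, 8, 10, 11, 13, 25]

# Range = highest value - lowest value
data_range = max(data) - min(data)
print(data_range)  # 18
```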

Standard Deviation

Standard Deviation is a measure of variation based on the mean

Because of this, it can be strongly influenced by outliers, just like the mean.

The standard deviation is never negative. It is zero only when all the data values are the same.

The standard deviation has the same units as the data
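To see how strongly a single outlier can pull on the standard deviation, the sketch below compares the same small data set with and without its largest value. The data values are assumed for illustration, and `statistics.stdev` from the Python standard library computes the sample standard deviation.

```python
import statistics

# Illustrative data (assumed): the value 25 sits far from the rest.
with_outlier = [7, 8, 10, 11, 13, 25]
without_outlier = [7, 8, 10, 11, 13]

s_with = statistics.stdev(with_outlier)
s_without = statistics.stdev(without_outlier)

print(round(s_with, 1), round(s_without, 1))  # 6.6 2.4
```

Removing one value cuts the standard deviation by more than half, which is the kind of sensitivity to outliers the mean also shows.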

Calculating Standard Deviation

Definitional formula:

$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$

Notice we are measuring variation of the data from the mean.

This formula gives the sample standard deviation. It is based on the sample mean $\bar{x}$ and the sample size $n$.
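The definitional formula can be written out step by step in Python. This is a minimal sketch: the function name `sample_sd` and the data list are assumptions made for illustration, and the result is checked against the standard library's `statistics.stdev`.

```python
import math
import statistics


def sample_sd(data):
    """Sample standard deviation via the definitional formula."""
    n = len(data)
    mean = sum(data) / n
    # Squared deviation of each value from the sample mean.
    squared_deviations = [(x - mean) ** 2 for x in data]
    # Divide by n - 1 (sample formula), then take the square root.
    return math.sqrt(sum(squared_deviations) / (n - 1))


data = [7, 8, 10, 11, 13, 25]  # illustrative data (assumed)
s = sample_sd(data)
print(round(s, 1))  # 6.6

# Agrees with the standard library up to floating-point rounding.
matches_library = math.isclose(s, statistics.stdev(data))
```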

Calculating Standard Deviation

Shortcut formula:

$s = \sqrt{\dfrac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}}$

The advantage: No need to calculate the mean first

The disadvantage: It is less intuitive, because the deviations from the mean are no longer visible in the formula.
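Although the shortcut form looks unrelated to the definition, it is the same quantity rearranged. A short derivation, using only $\bar{x} = \sum x / n$, is sketched here:

```latex
\begin{aligned}
\sum (x - \bar{x})^2
  &= \sum x^2 - 2\bar{x}\sum x + n\bar{x}^2 \\
  &= \sum x^2 - \frac{\left(\sum x\right)^2}{n}
     \qquad \text{since } \bar{x} = \frac{\sum x}{n} \\
  &= \frac{n\sum x^2 - \left(\sum x\right)^2}{n}
\end{aligned}
```

Dividing this by $n - 1$ and taking the square root gives the shortcut formula for $s$.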

Example: Definitional Form

$\bar{x} \approx 12.3$

Data $x$    $x - \bar{x}$         $(x - \bar{x})^2$
7           7-12.3 = -5.3         28.09
8           8-12.3 = -4.3         18.49
10          10-12.3 = -2.3        5.29
11          11-12.3 = -1.3        1.69
13          13-12.3 = 0.7         0.49
25          25-12.3 = 12.7        161.29
Sum                               215.34

$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{215.34}{5}} \approx 6.6$

Example: Shortcut Form

Data $x$    $x^2$
7           49
8           64
10          100
11          121
13          169
25          625
Sums: 74    1128

$s = \sqrt{\dfrac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}} = \sqrt{\dfrac{6(1128) - (74)^2}{6(6 - 1)}} = \sqrt{\dfrac{1292}{30}} \approx 6.6$
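The shortcut calculation can be reproduced in a few lines of Python. This is a sketch only; the function name `sample_sd_shortcut` is an assumption made for illustration. It needs just the two sums, $\sum x$ and $\sum x^2$, and never computes the mean.

```python
import math


def sample_sd_shortcut(data):
    """Sample standard deviation via the shortcut formula."""
    n = len(data)
    sum_x = sum(data)                  # sum of the values
    sum_x2 = sum(x * x for x in data)  # sum of the squared values
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))


data = [7, 8, 10, 11, 13, 25]
sum_x = sum(data)                       # 74
sum_x2 = sum(x * x for x in data)       # 1128
numerator = len(data) * sum_x2 - sum_x ** 2  # 6(1128) - 74^2 = 1292
s = sample_sd_shortcut(data)
print(sum_x, sum_x2, numerator, round(s, 1))  # 74 1128 1292 6.6
```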

Population Standard Deviation

If we have the population data, we can calculate the population standard deviation.

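To distinguish it from the sample standard deviation $s$, we use a different symbol, $\sigma$, and divide by the population size $N$:

$\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$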
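The only computational difference between the two versions is the divisor. As a sketch, Python's standard library exposes both: `statistics.pstdev` divides by $N$ and `statistics.stdev` divides by $n - 1$. The data are assumed for illustration, and treating them as a full population is only for the sake of the comparison.

```python
import statistics

data = [7, 8, 10, 11, 13, 25]  # illustrative data (assumed)

sigma = statistics.pstdev(data)  # population formula: divide by N
s = statistics.stdev(data)       # sample formula: divide by n - 1

print(round(sigma, 2), round(s, 2))  # 5.99 6.56
```

For the same data, the population value is always a little smaller than the sample value, because it divides by a larger number.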

Variance

Sample Variance: $s^2$

Population Variance: $\sigma^2$
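Since the variance is the square of the standard deviation, either one can be recovered from the other. A small sketch using the standard library's `statistics.variance` (sample) and `statistics.pvariance` (population), with illustrative data assumed:

```python
import math
import statistics

data = [7, 8, 10, 11, 13, 25]  # illustrative data (assumed)

sample_var = statistics.variance(data)       # s squared
population_var = statistics.pvariance(data)  # sigma squared

print(round(sample_var, 2), round(population_var, 2))  # 43.07 35.89

# Taking the square root of each variance gives the standard deviation.
sample_ok = math.isclose(math.sqrt(sample_var), statistics.stdev(data))
population_ok = math.isclose(math.sqrt(population_var), statistics.pstdev(data))
```

Note that the variance is in squared units, while the standard deviation is in the original units of the data.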

Understanding Standard Deviation

Main idea:

Bigger value, data is more spread out.

Smaller value, data is closer together.

Rule of Thumb

To very roughly approximate $s$:

$s \approx \dfrac{\text{range}}{4}$

Rough interpretation:

“Most” data will be within two standard deviations of the mean. In other words,

Approximate highest value $= \bar{x} + 2s$

Approximate lowest value $= \bar{x} - 2s$
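Both rules of thumb can be tried on a small data set. The sketch below uses assumed illustrative data; it compares range/4 with the actual sample standard deviation and then checks how many values fall within two standard deviations of the mean.

```python
import statistics

data = [7, 8, 10, 11, 13, 25]  # illustrative data (assumed)

# Rule of thumb: s is very roughly range / 4.
rough_s = (max(data) - min(data)) / 4
s = statistics.stdev(data)
print(rough_s, round(s, 1))  # 4.5 6.6

# "Most" data should lie between mean - 2s and mean + 2s.
mean = statistics.mean(data)
approx_low = mean - 2 * s
approx_high = mean + 2 * s
within = [x for x in data if approx_low <= x <= approx_high]
print(len(within), "of", len(data))  # 6 of 6
```

Here the range/4 estimate is noticeably off, which fits the "very roughly" caveat, while all six values do fall inside the two-standard-deviation interval.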

Empirical Rule

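For data sets with a bell-shaped distribution:

About 68% of the data fall within 1 standard deviation of the mean.

About 95% of the data fall within 2 standard deviations of the mean.

About 99.7% of the data fall within 3 standard deviations of the mean.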

Example

For a particular fast-food store, the time people have to wait at the drive-through has a bell-shaped distribution with $\bar{x} = 3.5$ min and $s = 0.7$ min.

Then about 68% of people wait between $\bar{x} - s = 2.8$ min and $\bar{x} + s = 4.2$ min.

About 95% of people wait between $\bar{x} - 2s = 2.1$ min and $\bar{x} + 2s = 4.9$ min.

Almost everyone (99.7%) waits between $\bar{x} - 3s = 1.4$ min and $\bar{x} + 3s = 5.6$ min.
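The three intervals in the drive-through example come from the same arithmetic repeated for 1, 2, and 3 standard deviations. A minimal sketch, with variable names chosen for illustration:

```python
mean = 3.5  # sample mean wait time, in minutes
s = 0.7     # sample standard deviation, in minutes

# Number of standard deviations -> approximate share of the data.
empirical_rule = {1: "68%", 2: "95%", 3: "99.7%"}

intervals = {}
for k, share in empirical_rule.items():
    low = round(mean - k * s, 1)
    high = round(mean + k * s, 1)
    intervals[k] = (low, high)
    print(f"About {share} wait between {low} and {high} min")
```

These percentages apply only when the distribution is bell-shaped, as stated in the example.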

Homework

2.5: 3, 9, 21, 23, 25, 33
