Basic concepts: mean and variance

Statistics
Basic Concepts: Mean and Standard Deviation
The first concept is a rather intuitive one: the mean (or average, or arithmetic
mean).
The mean is the sum of a list of numbers, divided by the total number of
numbers in the list.
The following table represents the different marks obtained by 7 students:
Mark
33%
42%
51%
59%
67%
75%
83%
So the mean or average mark is
$$\frac{33 + 42 + 51 + 59 + 67 + 75 + 83}{7} = \frac{410}{7} \approx 58.57$$
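As a quick check of the arithmetic above, here is a minimal Python sketch of the calculation. The names `marks` and `mean` are illustrative choices, not part of the original text.

```python
# The seven marks from the table, as plain numbers (percent signs dropped)
marks = [33, 42, 51, 59, 67, 75, 83]

def mean(values):
    """Sum of the values divided by how many values there are."""
    return sum(values) / len(values)

print(sum(marks))             # 410
print(round(mean(marks), 2))  # 58.57
```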
A list of numbers can be represented in a different way. For example, say that
in a group of ten students, seven students got 46% and three students got
61%. In this case you’d have to calculate the mean as follows:
$$\frac{7 \times 46 + 3 \times 61}{10} = 50.5$$
Now, say that in a group of ten students, five students got 17% and five
students got 83%. The mean is, in this case,
$$\frac{5 \times 17 + 5 \times 83}{10} = 50.$$
Even though the average mark is 50%, every mark is far from 50%, so here the
mean tells us little about the list of numbers.
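The two grouped calculations above can be sketched in a few lines of Python. This assumes each group is described by a (count, mark) pair; the helper name `grouped_mean` is hypothetical.

```python
def grouped_mean(groups):
    """Mean of a list given as (count, value) pairs."""
    total = sum(count * value for count, value in groups)
    n = sum(count for count, _ in groups)
    return total / n

# Seven students with 46% and three with 61%
print(grouped_mean([(7, 46), (3, 61)]))  # 50.5

# Five students with 17% and five with 83%
print(grouped_mean([(5, 17), (5, 83)]))  # 50.0
```

The two groups have almost the same mean even though their marks are spread very differently.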
We need another tool to tell us whether the numbers are close to their mean
or far from it. This tool is the standard deviation.
The standard deviation measures how far, on average, the numbers are
from their mean (the standard deviation is usually represented by the Greek
letter $\sigma$).
This is not such an intuitive concept, but if you think about it for a while, it
makes sense.
We need a mathematical formula to better explain the concept.
Say you are given a list of numbers $x_1, \ldots, x_N$. Then the mean is
$$\bar{x} = \frac{x_1 + \cdots + x_N}{N}$$
and the standard deviation is
$$\sigma = \sqrt{\frac{1}{N} \sum_{k=1}^{N} (x_k - \bar{x})^2}.$$
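To make the formula concrete, here is a minimal Python sketch that follows it step by step. The function names `mean` and `std_dev` are illustrative, and the sample list is made up for the demonstration rather than taken from the text. Note that the sketch divides by N, as the formula here does, rather than by N - 1 as is sometimes done for samples.

```python
import math

def mean(values):
    """Sum of the values divided by how many values there are."""
    return sum(values) / len(values)

def std_dev(values):
    """Square root of the average squared distance from the mean."""
    m = mean(values)
    squared_distances = [(x - m) ** 2 for x in values]
    return math.sqrt(sum(squared_distances) / len(values))

# Made-up list: mean is 5, squared distances sum to 32, and 32 / 8 = 4
example = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean(example))     # 5.0
print(std_dev(example))  # 2.0
```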
Let’s go back now to one of the previous examples: five students got 17% and
five students got 83%.
In terms of a list of numbers, you can write it as follows:
$x_1 = x_2 = \cdots = x_5 = 17$ and $x_6 = x_7 = \cdots = x_{10} = 83$.
The mean, as before, is 50%, but since we are trying to stick with
mathematical notation we say
$$\bar{x} = 50$$
The standard deviation is now
$$\sigma = \sqrt{\frac{1}{10}\left(5 \times (17 - 50)^2 + 5 \times (83 - 50)^2\right)} = 33.$$
It works really well, since each mark is precisely 33% away from the average
mark $\bar{x} = 50\%$.
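One way to see why the result is exactly 33: every mark in this list sits the same distance from the mean, so the typical distance can only be that distance. A small sketch, with illustrative variable names:

```python
# Five marks of 17 and five marks of 83
marks = [17] * 5 + [83] * 5
mean = sum(marks) / len(marks)

# Distance of each mark from the mean
distances = [abs(x - mean) for x in marks]

print(mean)            # 50.0
print(set(distances))  # {33.0}
```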
Let’s have a look at the first example now: the different marks obtained by 7
students are
Mark
33%
42%
51%
59%
67%
75%
83%
The mean is
$$\bar{x} \approx 58.57$$
and the standard deviation is
$$\sigma = \sqrt{\frac{1}{7}\left((33 - 58.57)^2 + (42 - 58.57)^2 + \cdots + (83 - 58.57)^2\right)} \approx 16.58.$$
The standard deviation obtained is smaller than the one above, which makes sense
since this time the marks are more closely centred around their mean of 58.57%.
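Both standard deviations can be cross-checked with Python's standard library. The sketch below assumes `statistics.pstdev` (the population standard deviation) is the right match, because it divides by N like the formula in this text; the variable names are illustrative.

```python
import statistics

seven_marks = [33, 42, 51, 59, 67, 75, 83]
split_marks = [17] * 5 + [83] * 5

# pstdev divides by N, matching the formula used in this text
sd_seven = statistics.pstdev(seven_marks)
sd_split = statistics.pstdev(split_marks)

print(round(sd_seven, 2))   # 16.58
print(round(sd_split, 2))   # 33.0
print(sd_seven < sd_split)  # True
```

Evaluating the formula numerically gives about 16.58 for the seven marks, against 33 for the group split between 17% and 83%.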