Measures of Variability

We have already discussed the most frequently used measures of central tendency. Measures of central
tendency allow us to select one number to represent a distribution of scores. In discussing them, we focused on the
type of measurement scale being employed (Nominal, Ordinal, Interval, and Ratio). Measures of Variability
are a second form of descriptive statistic, which we use to describe how spread out the scores in our distribution are.
We will begin by defining the most common ways of measuring variability. We will then discuss how different types
of distributions might affect our choice of measure of central tendency. Finally, we will talk about the importance of
looking at variability when interpreting results. Since the logic underlying inferential statistics depends on a good
understanding of variability, it is important that you understand these concepts.
When dealing with nominal scales, we have the same limitations we had with measures of central tendency.
Numbers assigned to nominal scales are not truly numbers – they are name labels. We therefore cannot calculate a
number that would describe the variability of the responses. In fact, we cannot even meaningfully say that the scores
range from category 1 to category 7, because the order of the categories is arbitrary. We can only summarize by
listing the categories and their frequency of occurrence. If there are only a few categories, you might simply
summarize the distribution in text form (e.g., 38% of respondents were male and 62% were female).
When there are around 4 to 7 categories, bar charts or pie graphs may be appropriate, whereas larger numbers of
categories might best be summarized in tables. This is not a hard and fast rule; it depends on the variable. We tend
to reserve figures for more important variables.
With ordinal scales, we can define the Range of the categories. We might say our sample included
respondents ranging in education level from no high school through to PhDs. This defines the extremes, or end
points, of our ordered categories. Since the intervals between our categories are not equal, we cannot use a number
to define the average difference between scores.
With continuous/scale variables (Interval and Ratio), we can use numbers to describe the variability of the
distribution. As an example, let's compare final exam scores from three sections of a Gen Psych course.
          Section 1    Section 2    Section 3
             160          102          200
             130          101           78
             100          100           77
              70           99           75
              40           98           70
Total        500          500          500
Mean         100          100          100
All three classes have the same mean, but the variability of the grades differs greatly. There are several measures of
variability that can be used to describe these differences.
Range – the simplest measure of variability. It is defined as the highest score minus the lowest score.
Range Section 1 = 160 - 40 = 120
Range Section 2 = 102 - 98 = 4
Range Section 3 = 200 - 70 = 130
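As a quick check, here is a minimal Python sketch of these range calculations (illustrative only – in this course SPSS will do the computing for you):

    # Range = highest score minus lowest score, for each section
    section1 = [160, 130, 100, 70, 40]
    section2 = [102, 101, 100, 99, 98]
    section3 = [200, 78, 77, 75, 70]

    for name, scores in [("Section 1", section1),
                         ("Section 2", section2),
                         ("Section 3", section3)]:
        print(name, "range =", max(scores) - min(scores))
    # Section 1 range = 120, Section 2 range = 4, Section 3 range = 130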
The higher the range, the more variable the scores; however, the range may not be a very representative measure of a
distribution of scores. You might notice that the range is defined by only two numbers: the highest and the lowest.
In Section 1, there is a relatively large range and the scores are fairly evenly distributed within it. The range of
Section 2 is small, but once again the distribution of scores is fairly even. Section 3 has the largest range; however,
this is due to one extremely high score. All other scores in Section 3 are very close to each other. The range can be
strongly affected by the occurrence of extreme scores. Therefore, the range is often not the best way to represent the
variability of scores.
Deviation Scores. One way to represent the distribution of scores is to look at the average amount
that scores deviate (differ) from the mean. We could subtract the mean from each score and then sum these
values. The problem here is that the mean is the arithmetic center of the distribution. By definition, the sum of the
deviations of scores that fall above the mean (positive deviations) is equal to the sum of the deviations of scores
below the mean (negative deviations). When we sum deviation scores, we will therefore always get zero, so the
mean deviation score will always be zero. Clearly, the mean deviation score will not help us describe the
distribution of scores. We could get around this, however, if we used absolute deviation scores (ignoring the
positive or negative sign). In essence, that is what we do when we calculate Variance.
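A quick illustrative Python sketch of why raw deviation scores are useless, using Section 1's scores from the table above:

    scores = [160, 130, 100, 70, 40]            # Section 1
    mean = sum(scores) / len(scores)            # 100.0
    deviations = [x - mean for x in scores]     # [60.0, 30.0, 0.0, -30.0, -60.0]
    print(sum(deviations))                      # 0.0 -- always zero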
Variance – Instead of ignoring the sign, we use a little mathematical trick to convert all the deviation scores to
positive numbers: we square them. You might recall from algebra that all squared values are positive.
E.g., 2² = 4 and (-2)² = 4 (a negative multiplied by a negative is a positive).
If we square all the deviation scores and sum them, we will get a positive number. If we then divide by the number of
scores, we obtain the average squared distance that scores deviate from the mean. Variance is one of the most
commonly used measures of variability. It is at the heart of the analysis that we call Analysis of Variance
(ANOVA). We are also going to learn that the assumption of homogeneity of variance (the requirement that the
variances of the samples we are comparing do not differ from each other) will be something that you will need to test in
order to be able to use ANOVAs. For the moment, the important thing to realize about variance is that it is a
measure of variability in terms of the average squared distance between the individual scores in the distribution and the
mean of that distribution.
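Here is a minimal Python sketch of that definition (illustrative; SPSS will normally do this for you, and the N-1 issue discussed below is ignored here):

    scores = [160, 130, 100, 70, 40]                  # Section 1
    mean = sum(scores) / len(scores)
    squared_devs = [(x - mean) ** 2 for x in scores]  # all positive
    variance = sum(squared_devs) / len(scores)        # divide by N
    print(variance)                                   # 1800.0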
In this course you will not be asked to calculate the variance of a distribution by hand, nor with a calculator.
SPSS will do these calculations for you. There is, however, something you should be aware of about the manner in
which SPSS calculates variance. It sums the squared deviation scores and divides by N-1 (the total number of scores
minus one). Why does it do that? In order to understand this, we have to take a short detour and discuss the
difference between Samples and Populations.
A Population is the entire group of people or scores that you want to apply your sample statistics to. If I wanted to
know the average height of students attending Platteville, I could go out and measure them all and then determine the
exact average height. When we obtain measurements from an entire population, we refer to the descriptive values
(mean, range, variability) as parameters. They are exact. When we do research, more often than not, we measure a
sample of the population. The descriptive statistics from the sample are used as estimates of the population
parameters. The word statistic implies that we are estimating: a statistic is an estimate of a parameter.
When we use statistics, we are taking a subset of scores (the sample) and generalizing them to the
population. One way that statistics can be misleading is that the sample might not be an unbiased subset of the
population. We discussed this a great deal when talking about sample selection and external validity.
Statisticians have also done a great deal of work looking at the degree to which statistics are unbiased estimates of
the parameters. The easiest way to understand this is to look at a technique they use called Monte Carlo studies.
Statisticians generate a large distribution of numbers with a known mean and variability and then repeatedly
(thousands of times) draw random samples of a given size from this population. Generally, they use computers to do
these studies. Monte Carlo studies have provided us with two important findings.
1) Larger samples give more precise estimates of the population parameters. This should make sense: the larger
the sample, the more representative it is of the population. Extreme scores at one end of the distribution are more
likely to be counteracted by extreme scores at the other end, and thus the estimate is more accurate.
2) No matter how large the sample, some statistics are still biased estimates of the population parameters. The mean is an
unbiased estimate. If you calculate the mean from several samples, any given sample is as likely to overestimate
the population mean as it is to underestimate it, so if you average the means from several samples,
you will get a good estimate of the population mean. Variance, however, is not an unbiased estimate of the
population variance: the smaller the sample, the more it underestimates the variance. There are complex
mathematical explanations for this, but they are well beyond what you need to know. The important thing to
know about variances is that this bias is very easily corrected. Statisticians have found that if you divide the sum of
the squared deviation scores by N-1, you get an unbiased estimate of the population variance. Notice that the larger
the sample size, the smaller the correction. For example, dividing by 100-1 instead of 100 is a smaller adjustment
than dividing by 10-1 instead of 10. The small simulation sketch below illustrates this bias and its correction.
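This is an illustrative Python sketch of a miniature Monte Carlo study (the population parameters and sample size here are made up for the demonstration):

    import random

    random.seed(1)
    # A large "population" with known mean (100) and SD (15)
    population = [random.gauss(100, 15) for _ in range(100_000)]
    pop_mean = sum(population) / len(population)
    pop_var = sum((x - pop_mean) ** 2 for x in population) / len(population)

    n, trials = 5, 10_000
    biased_total, unbiased_total = 0.0, 0.0
    for _ in range(trials):
        sample = random.sample(population, n)
        m = sum(sample) / n
        ss = sum((x - m) ** 2 for x in sample)
        biased_total += ss / n           # divide by N
        unbiased_total += ss / (n - 1)   # divide by N-1

    # The N denominator comes out noticeably below pop_var on average;
    # the N-1 denominator lands close to it.
    print(pop_var, biased_total / trials, unbiased_total / trials)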
When you have SPSS calculate the variance of a distribution of scores, it assumes you are working with a sample: it
divides the sum of the squared deviation scores by N-1. If you are really working with a population, you should
correct this by multiplying the variance by N-1 and then dividing by N.
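In Python, that correction looks like this (illustrative sketch, using Section 1's scores again):

    scores = [160, 130, 100, 70, 40]
    n = len(scores)
    mean = sum(scores) / n
    sample_variance = sum((x - mean) ** 2 for x in scores) / (n - 1)  # 2250.0, as SPSS would report
    population_variance = sample_variance * (n - 1) / n               # 1800.0
    print(sample_variance, population_variance)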
One of the major advantages of variance is that it is easy for a computer program to work with. The
major limitation is that, unlike computers, people have a difficult time thinking about squared values. If you look at
the two distributions below, you can see that the variability of the scores in Distribution B is twice that of Distribution A,
but the variance of B is four times as large as A's. We can, however, convert variances to values that are easier to think
about simply by taking their square roots. These are called standard deviations. The standard deviation of
Distribution A is 1.58 and the standard deviation of Distribution B is 3.16. The variability of Distribution A is half
that of Distribution B, and this is also true of the magnitudes of their respective standard deviations. In other words,
it is not easy for us to compare distributions using variance, but it is easy to do so with standard deviations.
Distribution A                              Distribution B
Score   Deviation   Squared                 Score   Deviation   Squared
        score       deviation                       score       deviation
  4       -2           4                      2       -4          16
  5       -1           1                      4       -2           4
  6        0           0                      6        0           0
  7        1           1                      8        2           4
  8        2           4                     10        4          16

Mean = 6    s² = 2.5    s = 1.58            Mean = 6    s² = 10    s = 3.16
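You can verify the table's values with Python's statistics module (illustrative; note that statistics.variance uses the N-1 denominator discussed above):

    import statistics

    dist_a = [4, 5, 6, 7, 8]
    dist_b = [2, 4, 6, 8, 10]
    print(statistics.variance(dist_a), statistics.stdev(dist_a))  # 2.5  1.58...
    print(statistics.variance(dist_b), statistics.stdev(dist_b))  # 10   3.16...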
Standard Deviations. The easiest way to think about a standard deviation is as an approximation of the average
amount that the scores in the distribution deviate from the mean. Yes, I know that the average deviation of scores in
Distribution A is 1.5, not 1.58, and the average deviation for Distribution B is 3, not 3.16, but it is a very
close estimate. The discrepancy, once again, arises because the statistic is constructed to estimate the population
parameter. The important thing to remember is that variances and standard deviations allow us to use a number to
describe the amount to which scores in a distribution differ from each other.
Properties of Variances and Standard Deviations. While standard deviations are more useful for describing the
distribution of scores in a manner most people can understand, and they allow us to compare the average
variability of scores between distributions, standard deviations cannot be meaningfully added or averaged. For
example, if I wanted to calculate the average standard deviation of two distributions, I could not simply add them
together and divide by 2. Instead, I would need to go back to the variances, find their average, and then reconvert to
a standard deviation. The main point is that you cannot add, subtract, divide, or multiply standard
deviations and obtain a meaningful answer. These mathematical manipulations can be done with variances, and
that makes variance much more useful when computing inferential statistics. Although there is debate about which
statistic, variance or standard deviation, should be reported in the results section, I suggest you use the one that is
most easily understood: the standard deviation. Whenever you report a mean, you should report the standard
deviation as well.
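A minimal sketch of the averaging point, using the two distributions above (and assuming the two groups are the same size, so a simple average of the variances is sensible):

    sd_a, sd_b = 1.58, 3.16

    wrong = (sd_a + sd_b) / 2                     # 2.37 -- not meaningful
    right = ((sd_a**2 + sd_b**2) / 2) ** 0.5      # ~2.50 -- via the variances
    print(wrong, right)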
Variability is only one aspect of a distribution that may be important to look at, and perhaps include in your write-up.
The shape of the distribution can also be important. Below I have included some examples of distributions and the
terms used to describe them.
One way we can describe a distribution is as symmetric or skewed. A distribution curve is symmetric if, when folded
in half, the two sides match up. If a curve is not symmetric, it is skewed. When a curve is positively skewed, most
of the scores occur at the lower values of the horizontal axis, and the curve tails off towards the higher end. When a
curve is negatively skewed, most of the scores occur at the higher values, and the curve tails off towards the lower end
of the horizontal axis. SPSS reports skewness as a number. A perfectly symmetrical curve has a skewness value of
0. Positively skewed curves have positive values, whereas negatively skewed curves have negative values.
If a distribution is a bell-shaped symmetrical curve, the mean, median, and mode of the distribution will all be the
same value. When the distribution is skewed, the mean and median will not be equal. Since the mean is the most
affected by extreme scores, it will have a value closer to the extreme scores than will the median.
For example, consider a country so small that its entire population consists of a queen (Queen Cori) and four
subjects. Their annual incomes are:

Citizen        Annual Income
Queen Cori     $1,000,000
Subject 1      $5,000
Subject 2      $4,000
Subject 3      $4,000
Subject 4      $2,000
I, as Queen, might boast that this is a fantastic country with an "average" annual income of $203,000. Before
rushing off to become a citizen, you would be wise to find out what measure of central tendency I am using!
The mean is $203,000, so I am not lying, but this is not a very accurate representation of the incomes of the
population. Money is a continuous (ratio) variable, but in this case the median ($4,000) or the mode (also
$4,000) would be a more representative value of the "average" income. The point to be made here is that the
appropriate measure of central tendency is affected not only by the type of measurement but also by the shape of
the distribution. The mean of a distribution is strongly affected by extreme scores, which we call outliers.
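An illustrative check of Queen Cori's arithmetic in Python:

    import statistics

    incomes = [1_000_000, 5_000, 4_000, 4_000, 2_000]
    print(statistics.mean(incomes))    # 203000 -- pulled up by the one outlier
    print(statistics.median(incomes))  # 4000
    print(statistics.mode(incomes))    # 4000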
Another way to describe the shape of a distribution is kurtosis, which describes how peaked or flat a
distribution is. The standard normal curve (which we will speak about more in the future) has a kurtosis value of 0.
Curves that are narrower (more peaked) have positive values, whereas curves that are flatter (more evenly
distributed) have negative kurtosis values. While numbers can be used to define skewness and kurtosis, these
values are rarely reported in results sections.
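If you are curious, both values are easy to compute; here is an illustrative sketch using scipy (SPSS uses slightly different sample-adjusted formulas, so its numbers will not match exactly):

    from scipy.stats import kurtosis, skew

    incomes = [1_000_000, 5_000, 4_000, 4_000, 2_000]
    print(skew(incomes))      # large positive value: strong positive skew
    print(kurtosis(incomes))  # excess kurtosis; 0 for a normal curve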
Lab 2: Exploratory Data Analysis
Before analyzing data that you have entered into SPSS, it is advisable to conduct a quick exploratory analysis. In the
Lab today we will be learning how to do that. The purpose of the exploratory analysis is to familiarize you with the
nature of your data distributions, and to identify problems that might need to be dealt with, such as errors, extreme
outliers, or extreme skew and/or kurtosis.
For Lab 2, I will give you the data. You should define the variables.
In the first column, I have used 1 to indicate that the subject is male and 2 to indicate that they are female. In the second
column, I have entered final exam scores for all the subjects (who just happen to have been in my Gen Psych class).
In the third column, I have entered scores on a life happiness rating scale that ranges from 1 (very unhappy) to 7
(extremely happy).
The first step is to name and define your variables (just like last week).
To begin with, we will look at the distribution of the overall scores. Then we will redo the exploratory analysis
looking at males' and females' scores individually.
Using the Explore option.
From the top menu, click on Analyze, then click on Descriptive Statistics, and then click on Explore.
For the first analysis, you should move the final grade variable and the happiness rating variable into the Dependent
List box.
In the Display option box, click on Both. Explore will limit output to either statistics (numbers) or plots (graphs), but
I want you to become familiar with both, so choose Both.
Click on the box labeled Statistics. On the menu that comes up, Means should already have a check mark beside it.
This is the default analysis. I want you to click the box next to Outliers as well.
Click on Continue to return to the Explore menu.
Click on the Plots box.
Then click on Histogram (note: Stem-and-leaf should not be selected – it will give you output you do not know how
to interpret). Click on Continue to return to the Explore menu.
Click on OK and SPSS will display the results of your exploratory analysis.
The output will consist of a chart that contains various statistics for both variables. Most of these statistics will be
familiar to you, but some are new. I will explain the new ones.
Remember, statistics are estimates of population parameters. Based on the distribution of the scores, the 95%
Confidence Interval for Mean output defines a range of scores within which we can be 95% sure the population mean
lies. If the sample is representative of the population, then in only 5% of samples would we obtain a sample
so extreme (due to chance alone) that the population mean would not fall within this range.
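For reference, here is an illustrative Python sketch of how such an interval is computed (hypothetical data; SPSS produces this for you in the Explore output):

    import math
    import statistics
    from scipy import stats

    scores = [160, 130, 100, 70, 40]
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)              # N-1 denominator
    t_crit = stats.t.ppf(0.975, df=n - 1)      # two-tailed 95% cutoff
    half_width = t_crit * sd / math.sqrt(n)
    print(mean - half_width, mean + half_width)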
5% Trimmed Mean – To obtain this value, SPSS removes the top and bottom 5% of cases and recalculates the mean.
If you compare the original mean to the trimmed mean, you can see whether some of your extreme scores are having a
strong influence on the mean. If the two are very different, it would be a good idea to check whether there is a possible
data entry error, or whether a subject who should not have been included in your sample for some reason has been
included. Perhaps they differ in age or in some other important way that might explain their extreme score. Perhaps
they were unable to complete the task due to language difficulties or some other impairment.
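The idea behind the trimmed mean in a short illustrative sketch (hypothetical scores):

    import statistics

    scores = sorted([72, 75, 78, 80, 81, 82, 83, 84, 85, 85,
                     86, 86, 87, 88, 89, 90, 91, 93, 95, 300])  # one likely error
    cut = int(len(scores) * 0.05)              # 5% of 20 scores = 1 case per end
    trimmed = scores[cut:len(scores) - cut]
    print(statistics.mean(scores), statistics.mean(trimmed))    # 95.5 vs ~85.4
    # A large gap between the two means flags influential extreme scores.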
Interquartile Range – defines the middle 50% of the scores. It is the range that would remain if the top 25% and the
bottom 25% of scores were removed. We will come back to this at a later time.
The second table gives a list of the five highest and the five lowest scores in each variable's distribution. The case
number is also presented, so that if an error is detected, it can be quickly identified and changed in the data set.
SPSS provides a Histogram so that you can see the distribution of scores for each variable.
The last output you will obtain for each variable is a Boxplot (also called a Box and Whiskers plot). The rectangle
in the middle represents the interquartile range (the middle 50% of cases). The lines protruding from this rectangle
(called whiskers) extend to the smallest and largest values. You may see additional circles outside this range – these
are classified by SPSS as outliers. Data points are considered outliers if they are more than 1.5 box lengths from the
edge of the box. Extreme scores (marked with an asterisk, *) are defined as scores more than three box lengths from
the edge of the box. The line in the middle of the central rectangle is the median of the distribution.
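Here is an illustrative sketch of that outlier rule in Python (hypothetical data; the exact quartile method SPSS uses may differ slightly):

    import statistics

    scores = [70, 72, 74, 75, 76, 77, 78, 79, 80, 200]   # one suspicious score
    q1, q2, q3 = statistics.quantiles(scores, n=4)       # quartiles
    iqr = q3 - q1                                        # the "box length"
    outliers = [x for x in scores
                if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]
    extremes = [x for x in scores
                if x < q1 - 3.0 * iqr or x > q3 + 3.0 * iqr]
    print(outliers, extremes)                            # [200] [200]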
One of the two variables has an extreme value. Identify the value (it is clearly an error), replace it with the most
likely correct entry, and then redo the analysis for this variable. Note: there is no reason to redo the other variable,
but do not lose the results for that variable – you will need them.
One final analysis. This time I want you to look at the final exam scores for males and females individually. You
do this by moving the variable that defines the subjects' sex into the Factor List box on the Explore menu. You will be
comparing these to the overall statistics (when both males and females were included in the same distribution)
obtained for the grade scores.
With this computer printout you should be able to answer the following questions:
1. Compare the statistics you obtained after you removed the error from your data to the statistics you obtained when
the error was present. Pay attention to which statistics (mean, median, mode, range, standard deviation,
skewness, and kurtosis) are affected by extreme scores and which are not.
2. On your printout, find the overall variance for the final exams. If I wanted to determine the variance of the
class's final grades as a parameter of the class population, how would I need to adjust this value? (i.e., what would
the variance be if the class is the entire population rather than a sample?)
3. Find the mean and standard deviation for males' final exam scores and for females' final exam scores. Then
find the overall mean and standard deviation for the final exams for the entire class.
- Average the mean scores for males and females. Is it the same as the overall mean for the distribution?
- Average the standard deviations obtained from the males and the females. Is this value the same as the
standard deviation obtained for the entire class? If not, why? Mathematically, how would you go about
getting the standard deviation of the class, if all you had to work with were the number of subjects in each
subgroup and the standard deviations for each subgroup?