
Name: _____________________________
Government 2301
In-Class Exercise – State Constitutions: Measures of Central Tendency and Dispersion
GENERAL INSTRUCTIONS

This assignment is not terribly difficult but it is fairly involved. You have the entire period to complete the
assignment; you should turn in your work before you leave, regardless of whether you have finished. In
other words, you may not take the assignment home, finish it, and then turn it in later. Do as much as you
can in the time available to you. If you work, rather than chat, with your neighbor, you should be able to
finish it.

You may work with one or two other classmates (groups of no more than 3 people) to divide up some of
the arithmetic; however, your answers to the “thought questions” should be as original as you are. You will
not receive credit if you simply copy answers from someone else. Your answers should demonstrate some
individual thought.

The instructor will award up to ten (10) “bonus” points on the next exam for successful completion of this
assignment.

Please be NEAT! Write large enough so that the instructor can read your answers. The instructor will NOT
grade illegible work. Print, if necessary!

You may ask the instructor if you have any trouble with or questions about this assignment!
OBJECTIVE:
To be able to describe American constitutions by computing and analyzing the following simple
descriptive statistics:
• actual and relative frequency distributions
• arithmetic mean
• median
• mode
• range
• standard deviation
INTRODUCTION:
If your instructor asked you to describe the typical American constitution, what would you say? [In
addition to the United States Constitution, each of the fifty states has its own state constitution, and most
municipalities (cities) have city charters, which are, in effect, constitutions.] Much has been written about
the similarities and differences among American constitutions. One of the frequent observations made by
those who study American constitutions is that state constitutions tend to be longer than the U.S.
Constitution. This exercise is designed to acquaint you with the physical differences among American
constitutions.1
Science, however, demands that we base our observations less on impressions and more on hard data.
Conclusions drawn on the basis of intuition may very well be correct, even logical, but conclusions based
on statistical analysis of hard data (observed facts) have greater certainty.
1
NOTE: This assignment does not introduce you to any substantive provisions of American constitutions. You
should be aware, however, that despite the differences among them in terms of their physical qualities and specific
provisions, all American constitutions share similar philosophical underpinnings (i.e., republicanism, limited
government, individual freedom, etc.).
DESCRIBING ONE VARIABLE
A relatively complete description of any variable answers two (2) questions:
• What is the typical score?
• How much do the scores vary?
Three measures of central tendency (arithmetic mean, median, and mode) are used to help answer the first
question. Two measures of dispersion (range and standard deviation) can be calculated to answer the
second. In addition to providing a solid basis for further analysis, measures of central tendency and
measures of dispersion, of and by themselves, provide a great deal of insight into the workings of certain
phenomena. For example, if a variable does not vary - that is, if each case has approximately the same
score – there is little purpose in computing its co-variation with a second variable.
Arithmetic Mean
The arithmetic mean [also called the mean or the average] is one measure of central tendency. It is
computed according to the following formula:
\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \]
where
\(\bar{X}\) = mean of X = sum of all observations on variable X divided by the sample size (n)
\(\sum\) = add up all individual (i) values of the variable (X) beginning with the first one
(i=1) and running through the entire sample of size n
Simply put, to compute the mean of a set of observations on a variable, one should add all
the scores and divide by the number of cases.
Example:
68+47+55+55+58+42+50+59+38+61+43+49+67+51+51+36+40+62+
50+64+65+42+53+50+42+45+45+59+47+51+ = 1545 [sum of X];
30 = sample size [n];
1545/30 = 51.5 [mean of X]
The mean gives due weight to every observation in a group of numbers. A drawback to using the mean as
a measure of central tendency is that it is influenced by extreme values on either side, particularly in a
small sample.
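The arithmetic in the example can be checked with a few lines of Python. This is only a minimal sketch: the list name `scores` and the helper `arithmetic_mean` are illustrative choices, not part of the assignment.

```python
# The 30 scores for variable X, copied from the worked example.
scores = [68, 47, 55, 55, 58, 42, 50, 59, 38, 61, 43, 49, 67, 51, 51,
          36, 40, 62, 50, 64, 65, 42, 53, 50, 42, 45, 45, 59, 47, 51]


def arithmetic_mean(values):
    """Add all the scores and divide by the number of cases."""
    return sum(values) / len(values)


total = sum(scores)               # sum of X
n = len(scores)                   # sample size
mean_x = arithmetic_mean(scores)  # mean of X

print(total, n, mean_x)           # 1545 30 51.5
```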
Median
The median is a second measure of central tendency. It is the score where one-half of the observations are
greater than that score and one-half are below. To determine the median score, you must first rank the
observations from lowest to highest (ordinal ranking). Below is an ordinal ranking of values of the
variable X:
36, 38, 40, 42, 42, 42, 43, 45, 45, 47, 47, 49, 50, 50, 50,
51, 51, 51, 53, 55, 55, 58, 59, 59, 61, 62, 64, 65, 67, 68
In this sample, because there is an even number of cases (30), there is no single value that is the middle
score. [If there had been 31 cases in our sample, we would take the 16th score as the median because 15
observations would lie above that score and 15 would lie below.] When the ranking has an even number
of cases, the median is the mean of the two middle values. In this example, the two middle values are the
15th and 16th ranks (50 and 51). The median, then, is the mean of these two values [50+51 = 101; 101/2
= 50.5]. The median value of our sample is 50.5.
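The ranking procedure just described can be sketched in Python as follows. The function name `median` and the list name `scores` are illustrative assumptions; the logic is simply "rank the cases, then take the middle score, or the mean of the two middle scores when the number of cases is even."

```python
scores = [68, 47, 55, 55, 58, 42, 50, 59, 38, 61, 43, 49, 67, 51, 51,
          36, 40, 62, 50, 64, 65, 42, 53, 50, 42, 45, 45, 59, 47, 51]


def median(values):
    """Return the middle score of the ranked values."""
    ranked = sorted(values)          # ordinal ranking, lowest to highest
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:                   # odd number of cases: one middle score
        return ranked[middle]
    # even number of cases: mean of the two middle scores
    return (ranked[middle - 1] + ranked[middle]) / 2


median_x = median(scores)
print(median_x)                      # 50.5
```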
As a position measurement, the median is NOT influenced by extreme cases on either side. If the smallest
value in our sample had been 25 rather than 36 and the largest had been 108 rather than 68 [with all other
cases being the same], the median would still be 50.5. However, the mean would be about 52.5 instead of 51.5 as
previously calculated. The insensitivity of the median to extreme cases can be an advantage where a small
number of extremely high or extremely low scores would have a disproportionate effect on the mean.
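To see this effect directly, the short sketch below swaps in the two extreme values (25 and 108) and recomputes both measures with Python's standard `statistics` module. The variable names are illustrative.

```python
from statistics import mean, median

scores = [68, 47, 55, 55, 58, 42, 50, 59, 38, 61, 43, 49, 67, 51, 51,
          36, 40, 62, 50, 64, 65, 42, 53, 50, 42, 45, 45, 59, 47, 51]

# Replace the smallest score (36) with 25 and the largest (68) with 108.
extreme = sorted(scores)
extreme[0] = 25
extreme[-1] = 108

print(median(scores), median(extreme))   # the median is unchanged
print(mean(scores), mean(extreme))       # the mean moves upward
```

The median stays at 50.5 because only the position of the middle scores matters, while the mean rises by roughly one point because the new sum (1,574) is divided by the same 30 cases.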
Remember, a measure of central tendency tells us where the approximate center of our data is. For
example, an average test score in a class of, say, 25 students can be highly misleading if two or three
people score in the 20s or 30s while other students in the class score in the 70s, 80s, and 90s. In such an
example, the median may be a better indicator of how the class performed as a group.2
2
Interestingly, if a variable is normally distributed, the mean and the median (as well as the mode) will be virtually
the same and will be the highest point of the distribution. [There are several other characteristics of a normal
distribution as well.] These points are not important for this assignment, but they are in more sophisticated
statistical analyses. I will be happy to discuss these issues with anyone who is interested in learning more about
statistical analysis.
Mode
The mode is simply the most common score. In our example, there are 3 occurrences of the value 42, 3 of
50, and 3 of 51. In small samples, it is common to get a tie between two or more scores for the mode.
This is why the mode is less attractive as a measure of central tendency than either the mean or the
median. However, as the sample size gets larger and larger, it is likely that one score will emerge as the
true mode [as mentioned in footnote #2, in large samples it will be the same score as the mean and
median if the variable is normally distributed].
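A quick way to find the mode, ties included, is to count how often each score occurs. The sketch below uses `collections.Counter` from the Python standard library; the names `counts`, `highest`, and `modes` are illustrative.

```python
from collections import Counter

scores = [68, 47, 55, 55, 58, 42, 50, 59, 38, 61, 43, 49, 67, 51, 51,
          36, 40, 62, 50, 64, 65, 42, 53, 50, 42, 45, 45, 59, 47, 51]

counts = Counter(scores)              # score -> number of occurrences
highest = max(counts.values())        # frequency of the most common score

# Keep every score that reaches the highest frequency, so ties are visible.
modes = sorted(score for score, count in counts.items() if count == highest)

print(modes, highest)                 # [42, 50, 51] 3
```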
Range
The mean, the median, and the mode are all measurements of the center point of a group of data. The
range, on the other hand, is a measure of dispersion. It tells us the extent to which cases spread from the
center point. The range is computed by subtracting the smallest value from the largest value. The range of
values for the "X" variable is 32 [68-36 = 32].
The principal advantage of the range is that it is easy to calculate. It also gives us a quick and very rough
idea of the amount of variability in the data. Its chief disadvantage is that it is based on only two scores
(highest and lowest) and consequently does not reflect the amount of variability in the rest of the data. For
example, if the smallest value of X in our sample was 36 and the largest was 68, but every other score
was 51, the range would still be 32. We might be led to conclude that there is more variability in the data
than would actually be the case.
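The point can be illustrated with a small sketch. The list `clustered` below is the hypothetical sample described above (one 36, one 68, and 51 for every other case); both names and the helper `value_range` are illustrative.

```python
scores = [68, 47, 55, 55, 58, 42, 50, 59, 38, 61, 43, 49, 67, 51, 51,
          36, 40, 62, 50, 64, 65, 42, 53, 50, 42, 45, 45, 59, 47, 51]

# Hypothetical sample: same extremes, but the other 28 cases all score 51.
clustered = [36] + [51] * 28 + [68]


def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


print(value_range(scores))      # 32
print(value_range(clustered))   # 32, although 28 of the 30 scores are identical
```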
Standard Deviation
The standard deviation, unlike the range, makes use of every score in determining the amount of
variability. The formula for the standard deviation looks imposing. However, it is really very easy to
calculate on a hand calculator (I kid you not), though it can be time-consuming.
There are six steps to calculate the standard deviation:
1. calculate the mean of the data (see above);
2. take the deviation of each individual score [the deviation is the
difference between an individual score and the mean of the sample];
3. square each deviation [this is very important because it gives us all
positive values];
4. sum the squared deviations;
5. divide the sum by the sample size minus one [the resulting quotient
is called the variance (s²)];
6. and, finally, take the square root of the variance [to obtain the
standard deviation (s)].
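The six steps correspond to the usual formula for the sample standard deviation. Written out, and assuming the same notation used for the mean (X_i for an individual score, X-bar for the mean, n for the sample size), it is:

```latex
s^2 = \frac{\sum_{i=1}^{n} \left( X_i - \bar{X} \right)^2}{n - 1},
\qquad
s = \sqrt{\frac{\sum_{i=1}^{n} \left( X_i - \bar{X} \right)^2}{n - 1}}
```

Reading from the inside out follows the steps in order: the mean, each deviation, its square, the sum, the division by n - 1, and finally the square root.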
To illustrate using the "X" variable, first we find the deviation of each score from the mean. We have
already determined that the mean of X in our example is 51.5. Therefore, we subtract the mean from each
value of the X variable (Xi - mean of X). Next, we square each deviation (Xi - mean of X)². [NOTE: The
mean of variable X is frequently denoted as a capital X with a bar over it. See above.] Then, we sum all
squared deviations. The sum of the squared deviations in our example is 2,269.50. We divide this sum by
29 (n - 1 or, in this example, 30 - 1). This gives us the variance, 78.26 (denoted as s²). Finally, when we
take the square root of the variance, we get the standard deviation, 8.85 (denoted as s). The standard
deviation, or typical difference between the mean value of X (51.5) and any individual value in our
example, is approximately 9.
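For anyone who would rather check the hand calculation by computer, the six steps translate almost line for line into Python. This is a sketch only; the variable names are illustrative.

```python
import math

scores = [68, 47, 55, 55, 58, 42, 50, 59, 38, 61, 43, 49, 67, 51, 51,
          36, 40, 62, 50, 64, 65, 42, 53, 50, 42, 45, 45, 59, 47, 51]

mean_x = sum(scores) / len(scores)              # step 1: the mean
deviations = [x - mean_x for x in scores]       # step 2: score minus mean
squared = [d ** 2 for d in deviations]          # step 3: square each deviation
sum_sq = sum(squared)                           # step 4: sum the squares
variance = sum_sq / (len(scores) - 1)           # step 5: divide by n - 1
std_dev = math.sqrt(variance)                   # step 6: square root

print(sum_sq)                 # 2269.5
print(round(variance, 2))     # 78.26
print(round(std_dev, 2))      # 8.85
```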
The standard deviation is considerably more revealing than the range as a measure of dispersion [recall
that the range was 32]. If the variable is normally distributed, about 68% of the cases in our sample
should fall within + or - one standard deviation of the mean. Let's see if that is the case. The mean value
of X was 51.5. If we add one standard deviation (8.85) to the mean we get 60.35. If we subtract one standard
deviation (8.85) from 51.5 we get 42.65. Eighteen (18) scores in our example are between 43 and 60.
Eighteen (18) divided by 30 is 60% -- so somewhat less than the expected 68% falls within + or - one
standard deviation of the mean.
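The same count can be reproduced with a short sketch that uses `mean` and `stdev` from Python's standard `statistics` module (`stdev` divides by n - 1, as in step 5). The names `lower`, `upper`, and `within` are illustrative.

```python
from statistics import mean, stdev

scores = [68, 47, 55, 55, 58, 42, 50, 59, 38, 61, 43, 49, 67, 51, 51,
          36, 40, 62, 50, 64, 65, 42, 53, 50, 42, 45, 45, 59, 47, 51]

m = mean(scores)
s = stdev(scores)             # sample standard deviation (n - 1 denominator)

lower = m - s                 # about 42.65
upper = m + s                 # about 60.35

# Keep the scores that lie within one standard deviation of the mean.
within = [x for x in scores if lower <= x <= upper]
share = len(within) / len(scores)

print(len(within), share)     # 18 0.6
```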
Reflect for a moment on this point: Basing our conclusions on this type of analysis is far preferable to
basing them on our impressions. Frequently, we find that our impressions are supported by the data
(facts). This does not mean that computing the statistic was pointless. It is amazing how often people
comment that they do not need a scientific analysis to tell them what they already know by common
sense. We need scientific analyses to confirm what we believe is common sense knowledge -- because
sometimes what we think we know by common sense cannot be supported by the facts.
The most important aspect of the standard deviation as a measure of dispersion is its applicability in
further statistical analysis. For example, the standard deviation can readily be plugged into certain aspects
of probability theory that are of great assistance in carrying out more sophisticated data analyses.
Name: _______________
THE EXERCISE
Table 1 contains information on several variables related to the physical qualities of American
constitutions (50 state constitutions and 1 federal constitution), including the date each was ratified, the
number of years the constitution has been in existence since ratification, the number of words in the
document, the number of amendments to the constitution, and the frequency of amendment to each
constitution.
1. Complete Table 1 by calculating and entering the values for “Frequency of Amendments” (last
column). This value is calculated by dividing the number of amendments to a constitution by the
number of years since ratification. Alabama’s constitution, for example, has been amended at a rate
of almost 6 times a year (513 divided by 92).
Central Tendency
2. Calculate the mean of “Word Count.” _____________
3. Calculate the mean of “Number of Amendments.” ________________
4. The median of the “Word Count” variable is 22,000. Do you believe the median and the mean of
“Word Count” are providing you with similar or substantially different information about the central
tendency of this variable? ________________________ If they are providing significantly different
information, which do you believe is the better measure of central tendency?
_______________________ Explain.
5. The median of the “Number of Amendments” variable is 86. Do you believe the median and the
mean of “Number of Amendments” are providing you with similar or substantially different
information about the central tendency of this variable? ________________________ If they are
providing significantly different information, which do you believe is the better measure of central
tendency? _______________________ Explain.
Measures of Dispersion: the Range
6. Which constitution has the greatest number of words? __________________________
How many words does that constitution have? ________________________
7. Which constitution has the smallest number of words? __________________________
How many words does that constitution have? ________________________
8. Subtract the number of words in the shortest constitution from the number of words in the longest to
compute the range. ___________________
9. Repeat #6, #7, and #8 for the “Number of Amendments” variable.
________________
________________
________________
________________
________________
Measures of Dispersion: the Standard Deviation
10. Use the six steps outlined above to compute the standard deviation for the “Word Count” variable.
Use Table 2 to complete steps 2 through 6. Enter the standard deviation you computed here.
s = _____
11. Write the mean of “Word Count” here. __________
12. Add one standard deviation to the mean. ____________
Subtract one standard deviation from the mean. ____________
13. How many constitutions fall within this range of + or – one standard deviation of the mean? _______
Convert this frequency to a percentage by dividing by the sample size. _______________
14. Do you believe that the range or the standard deviation is a better measure of dispersion for this
variable? __________
Explain.
15. Use the six steps outlined above to compute the standard deviation for the “Number of Amendments”
variable.
Use Table 2 to complete steps 2 through 6. Enter the standard deviation you computed here.
s = _____
16. Write the mean of “Number of Amendments” here. __________
17. Add one standard deviation to the mean. ____________
Subtract one standard deviation from the mean. ____________
18. How many constitutions fall within this range of + or – one standard deviation of the mean? _______
Convert this frequency to a percentage by dividing by the sample size. _______________
19. Do you believe that the range or the standard deviation is a better measure of dispersion for this
variable? _________
Explain.
20. Without actually calculating the mean, range, and standard deviation, how would you describe the
“Frequency of Amendments” variable? (Just “eyeball” it.) What do you think the central tendency
is?
__________________
How much dispersion is there in the values of the variable? _____________________
21. Finally, write a paragraph about the physical characteristics of American constitutions. Are
American constitutions more different from one another or more similar to one another in terms of
their physical attributes? What do you believe accounts for these differences or similarities? (Write
legibly.)