Frequency Distributions and Histograms

UNC-Wilmington
Department of Economics and Finance
ECN 377
Dr. Chris Dumas
Frequency Distributions
The purpose of a Frequency Distribution is to show the number of data values that fall into each category
of a discrete/categorical variable. Frequency distributions are applied to variables that have discrete categories,
such as character/text variables, ordinal measurement variables, and nominal measurement variables. (Histograms
are used to perform similar analysis for continuous measurement variables; histograms are discussed later in this
handout.) Frequency distributions are useful for visualizing the "spread" (the variation, skewness and kurtosis) in
the values of a discrete variable, even though we can't calculate true skewness and kurtosis for some discrete
variables, such as character/text variables. The four common types of Frequency Distributions are:
• Count/Frequency Distribution: shows the number of observations in each category
• Percentage Distribution: shows the percentage of observations in each category
• Cumulative Count/Frequency Distribution: shows the number of observations up to and including each category
• Cumulative Percentage Distribution: shows the percentage of observations up to and including each category
Example: consider a variable that has 5 different categories, such as the response to a survey question that asks
respondents to evaluate whether they "strongly agree with, agree with, are indifferent to, disagree with, or
strongly disagree with" a policy statement. Suppose the variable is named OPINION, and we use "sa" to mean
"strongly agree", "a" to mean "agree", etc. The variable categories are then: sa, a, i, d and sd. For the OPINION
variable with categories (sa, a, i, d and sd), the four types of frequency distributions would be:




• A Count/Frequency Distribution shows the number of observations (number of survey respondents) in
each category (sa, a, i, d, sd) of OPINION
• A Percentage Distribution shows the percentage of observations (percentage of survey respondents) in
each category of OPINION
• A Cumulative Count/Frequency Distribution shows the number of observations up to and including
each category of OPINION
• A Cumulative Percentage Distribution shows the percentage of observations up to and including each
category of OPINION
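The four distributions for OPINION can be sketched in a few lines of Python. This is only an illustration, not the Excel or SAS method used in this course; the ten survey responses below are made up, and the names responses, categories and tally are hypothetical.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical survey responses (made up for illustration only).
responses = ["sa", "a", "a", "i", "d", "a", "sd", "sa", "i", "a"]

# Categories in their natural order, so "up to and including" is meaningful.
categories = ["sa", "a", "i", "d", "sd"]

tally = Counter(responses)   # number of responses per category
n = len(responses)           # total number of observations

counts = [tally[c] for c in categories]            # Count/Frequency
percents = [100 * k / n for k in counts]           # Percentage
cum_counts = list(accumulate(counts))              # Cumulative Count
cum_percents = [100 * k / n for k in cum_counts]   # Cumulative Percentage

print(f"{'OPINION':<8}{'Count':>6}{'Pct':>7}{'CumCount':>10}{'CumPct':>8}")
for c, k, p, ck, cp in zip(categories, counts, percents, cum_counts, cum_percents):
    print(f"{c:<8}{k:>6}{p:>7.1f}{ck:>10}{cp:>8.1f}")
```

The last category always has a cumulative count equal to the number of observations and a cumulative percentage of 100, which is a quick check that no response was dropped.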
Bar Graphs are typically used to display Frequency Distributions. There are two basic types of Bar Graphs,
vertical (bars run vertically) and horizontal (bars run horizontally).
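[Figure: two bar graphs. Vertical Frequency Distribution: Variable Categories on the horizontal axis, Frequency on the vertical axis. Horizontal Frequency Distribution: Frequency on the horizontal axis, Variable Categories on the vertical axis.]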
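As a rough illustration of the two bar-graph orientations, the Python sketch below draws both as plain text. The counts are hypothetical, and the function names horizontal_bars and vertical_bars are invented for this example; in practice a graphing tool would draw the bars.

```python
# Hypothetical counts for the five OPINION categories (illustration only).
counts = {"sa": 2, "a": 4, "i": 2, "d": 1, "sd": 1}

def horizontal_bars(freq):
    """One text line per category; the bar length equals the count."""
    width = max(len(name) for name in freq)
    return [name.ljust(width) + " | " + "#" * k + f" {k}" for name, k in freq.items()]

def vertical_bars(freq):
    """Text rows from top to bottom, with one column per category."""
    top = max(freq.values())
    rows = []
    for level in range(top, 0, -1):
        # A column gets a mark at this level only if its count reaches the level.
        cells = [("#" if k >= level else " ").center(3) for k in freq.values()]
        rows.append(" ".join(cells))
    rows.append(" ".join(name.center(3) for name in freq))  # category labels
    return rows

print("\n".join(horizontal_bars(counts)))
print()
print("\n".join(vertical_bars(counts)))
```

Both graphs carry the same information; the horizontal form tends to be easier to read when category names are long.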
We will learn how to make Frequency Distributions in both Excel and SAS.
Histograms
Histograms are very similar to Frequency Distributions. The purpose of a Histogram is to show how many of the
data values fall into each of several categories (intervals of values, often called bins) of a continuous measurement
variable. The histogram is an effective graphical technique for visualizing the variation, skewness and kurtosis of a
continuous measurement variable. You typically make a separate histogram for each continuous numerical variable
in a dataset.
When making histograms, you must decide how to divide the continuous data into categories and decide whether
values on the “borderline” of a category should be placed in the category above the borderline or below the
borderline.
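The borderline decision can be made concrete with a short Python sketch. It assumes equal-width categories; the function name histogram_counts, its parameters, and the scores data are all hypothetical and chosen only for illustration.

```python
import math

def histogram_counts(values, low, width, n_bins, borderline="above"):
    """Count values in n_bins equal-width categories starting at `low`.

    borderline="above": a value on a borderline goes to the category above it.
    borderline="below": a value on a borderline goes to the category below it.
    The lowest and highest edges always stay in the first and last category.
    """
    high = low + width * n_bins
    counts = [0] * n_bins
    for v in values:
        if v < low or v > high:
            raise ValueError(f"{v} is outside the range {low} to {high}")
        position = (v - low) / width
        index = math.floor(position)
        if borderline == "below" and position == index:
            index -= 1                            # move borderline value down one category
        index = min(max(index, 0), n_bins - 1)    # keep the end points in range
        counts[index] += 1
    return counts

# Hypothetical data; categories are 50-60, 60-70, 70-80 and 80-90.
scores = [50, 55, 60, 60, 67, 70, 70, 70, 82, 90]
print(histogram_counts(scores, low=50, width=10, n_bins=4, borderline="above"))  # [2, 3, 3, 2]
print(histogram_counts(scores, low=50, width=10, n_bins=4, borderline="below"))  # [4, 4, 0, 2]
```

With several values sitting exactly on 60 and 70, the two rules give visibly different histograms from the same data, so the rule should be chosen deliberately and applied consistently.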
When making histograms, use a larger number of categories when you have many observations in your data set,
and use a smaller number of categories when you have few observations. Adjust the number of categories until
you produce a histogram that (1) reveals the "spread" in your data but (2) doesn't result in many
categories with zero observations (i.e., histogram bars with zero height).
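One way to try out different numbers of categories is sketched below in Python. It assumes equal-width categories running from the smallest to the largest value, with borderline values placed in the category above; the function name empty_bars and the twelve data values are hypothetical.

```python
def empty_bars(values, n_bins):
    """Return (number of zero-height bars, list of counts) for n_bins categories."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("all values are equal, so no categories can be formed")
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The largest value falls on the top edge, so keep it in the last category.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    return counts.count(0), counts

# Hypothetical data set with only 12 observations.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 13]

for n_bins in (3, 6, 12):
    empty, counts = empty_bars(values, n_bins)
    print(n_bins, "categories:", counts, "empty bars:", empty)
```

For this small data set, 3 categories leave no empty bars but hide most of the spread, while 12 categories leave 4 of the 12 bars empty; 6 categories is a reasonable compromise.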
We will learn how to make histograms in both Excel and SAS.