UNC-Wilmington
Department of Economics and Finance
ECN 377
Dr. Chris Dumas

Frequency Distributions and Histograms

Frequency Distributions

The purpose of a Frequency Distribution is to show the number of data values that fall into each category of a discrete/categorical variable. Frequency distributions are applied to variables that have discrete categories, such as character/text variables, ordinal measurement variables, and nominal measurement variables. (Histograms are used to perform similar analysis for continuous measurement variables; histograms are discussed later in this handout.) Frequency distributions are useful for visualizing the “spread” (the variation, skewness and kurtosis) in the values of a discrete variable, even though we can’t calculate true skewness and kurtosis for some discrete variables, such as character/text variables.

The four common types of Frequency Distributions are:

  Count/Frequency Distribution: shows the number of observations in each category
  Percentage Distribution: shows the percentage of observations in each category
  Cumulative Count/Frequency Distribution: shows the number of observations up to and including each category
  Cumulative Percentage Distribution: shows the percentage of observations up to and including each category

Example: consider a variable that has 5 different categories, such as the response to a survey question that asks respondents whether they "strongly agree with, agree with, are indifferent to, disagree with, or strongly disagree with" a policy statement. Suppose the variable is named OPINION, and we use "sa" to mean "strongly agree", "a" to mean "agree", etc. The variable categories are then: sa, a, i, d and sd.
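As a rough illustration of the four distributions defined above, the following Python sketch tabulates them for a small set of OPINION responses. The course itself uses Excel and SAS; Python is used here only as a stand-in. The response data, the function name frequency_distributions, and the category ordering list are all hypothetical and were invented for this example.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical survey responses, invented purely for illustration.
responses = ["sa", "a", "a", "i", "d", "a", "sd", "sa", "i", "a"]

# Categories in their natural order, from strongly agree to strongly disagree.
# The cumulative distributions depend on this ordering.
categories = ["sa", "a", "i", "d", "sd"]


def frequency_distributions(values, ordered_categories):
    """Return one row per category:
    (category, count, percent, cumulative count, cumulative percent).

    Assumes every value belongs to one of the listed categories.
    """
    tally = Counter(values)
    total = len(values)
    counts = [tally[c] for c in ordered_categories]
    cumulative_counts = list(accumulate(counts))  # running totals
    rows = []
    for cat, n, cum in zip(ordered_categories, counts, cumulative_counts):
        rows.append((cat, n, 100 * n / total, cum, 100 * cum / total))
    return rows


table = frequency_distributions(responses, categories)

# Print a simple text table of the four distributions.
print(f"{'cat':>3} {'count':>5} {'pct':>7} {'cum n':>5} {'cum pct':>7}")
for cat, n, pct, cum_n, cum_pct in table:
    print(f"{cat:>3} {n:>5} {pct:>7.1f} {cum_n:>5} {cum_pct:>7.1f}")
```

In this made-up sample, the last row's cumulative count equals the number of respondents and its cumulative percentage equals 100, which is a quick check that every response was counted.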
For the OPINION variable with categories (sa, a, i, d and sd), the four types of frequency distributions would be:

  A Count/Frequency Distribution shows the number of observations (number of survey respondents) in each category (sa, a, i, d, sd) of OPINION
  A Percentage Distribution shows the percentage of observations (percentage of survey respondents) in each category of OPINION
  A Cumulative Count/Frequency Distribution shows the number of observations up to and including each category of OPINION
  A Cumulative Percentage Distribution shows the percentage of observations up to and including each category of OPINION

Bar Graphs are typically used to display Frequency Distributions. There are two basic types of Bar Graphs: vertical (bars run vertically) and horizontal (bars run horizontally).

[Figure: two example bar graphs. Vertical Frequency Distribution: Frequency on the vertical axis, Variable Categories on the horizontal axis. Horizontal Frequency Distribution: Variable Categories on the vertical axis, Frequency on the horizontal axis.]

We will learn how to make Frequency Distributions in both Excel and SAS.

Histograms

Histograms are very similar to Frequency Distributions. The purpose of a Histogram is to show how many of the data values fall into various categories of a continuous measurement variable. The histogram is an effective graphical technique for visualizing the variation, skewness and kurtosis of a continuous measurement variable. You typically make a separate histogram for each continuous numerical variable in a dataset.

When making histograms, you must decide how to divide the continuous data into categories, and you must decide whether a value on the “borderline” between two categories should be placed in the category above the borderline or the category below it. Use a larger number of categories when you have many observations in your data set, and use a smaller number of categories when you have few observations.
You want to adjust the number of categories until you produce a histogram that (1) reveals the "spread" in your data but (2) doesn't result in many categories with zero observations (i.e., histogram bars with zero height).

We will learn how to make histograms in both Excel and SAS.
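The two decisions described above (how many categories to use, and where to place borderline values) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Excel or SAS procedure used in the course. The data, the function name histogram_counts, the equal-width categories, and the rule that a borderline value goes into the category above it are all choices made for this example.

```python
def histogram_counts(values, num_bins, low, high):
    """Count values in num_bins equal-width categories spanning [low, high].

    Borderline rule (an assumption of this sketch): a value exactly on a
    boundary goes into the category above it, except that a value equal
    to `high` stays in the last category.
    """
    width = (high - low) / num_bins
    counts = [0] * num_bins
    for v in values:
        if v < low or v > high:
            continue  # values outside the range are ignored in this sketch
        index = int((v - low) // width)
        index = min(index, num_bins - 1)  # keep v == high in the last category
        counts[index] += 1
    return counts


# Hypothetical observations of a continuous variable, invented for illustration.
incomes = [12, 18, 20, 25, 27, 31, 38, 40, 52, 60]

few_bins = histogram_counts(incomes, num_bins=2, low=10, high=60)
some_bins = histogram_counts(incomes, num_bins=5, low=10, high=60)
many_bins = histogram_counts(incomes, num_bins=25, low=10, high=60)

# Print a text histogram for each choice and report the empty categories.
for label, counts in [("2", few_bins), ("5", some_bins), ("25", many_bins)]:
    print(f"{label} categories, {counts.count(0)} with zero observations")
    for n in counts:
        print("  " + "*" * n)
```

With only ten observations, 25 categories must leave at least 15 bars at zero height, while 2 categories hide most of the spread; an intermediate number is the kind of compromise the guideline above describes.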