Describing Univariate Distributions

Learning Goals
Use “level of measurement” to decide how to describe the variable distribution
Understand frequency distributions
Understand measures of central tendency
Understand measures of dispersion, spread, and variability

Three Little Phrases: Stay Alert!
1. Units of measurement: Standardized and uniform quantities for expressing an amount (e.g., feet/inches or metres; miles or kilometres; years, months, and days; dollars or Euros).
2. Units of analysis: The type of things on which a variable is defined (“cases” in the data file).
3. Level of measurement: Precision of measurement, either nominal or ordinal categories (qualitative) or interval-ratio (quantitative). (Named categories; rank order; precise and meaningful numbers, either continuous or discrete.)

Exercise: Describing “Our” Variables for a Class Data File
[1] HEIGHT
What is its level of measurement?
Is a frequency distribution a good idea?
What measures of central tendency are best?
What measures of dispersion/variability are best?

Describing “Our” Variables
[2] DISTANCE FROM CAMPUS
What is its level of measurement?
Is a frequency distribution a good idea?
What measures of central tendency are best?
What measures of dispersion/variability are best?

Describing “Our” Variables
[3] TIME OF COMMUTE
What is its level of measurement?
Is a frequency distribution a good idea?
What measures of central tendency are best?
What measures of dispersion/variability are best?

Describing “Our” Variables
[4] MODE OF TRANSPORTATION
What is its level of measurement?
Is a frequency distribution a good idea?
What measures of central tendency are best?
What measures of dispersion/variability are best?

Describing “Our” Variables
[5] GENDER
What is its level of measurement?
Is a frequency distribution a good idea?
What measures of central tendency are best?
What measures of dispersion/variability are best?

Describing “Our” Variables
[6] NUMBER OF SIBLINGS
What is its level of measurement?
Is a frequency distribution a good idea?
What measures of central tendency are best?
What measures of dispersion/variability are best?

Describing “Our” Variables
[7] EYE COLOUR
What is its level of measurement?
Is a frequency distribution a good idea?
What measures of central tendency are best?
What measures of dispersion/variability are best?

Describing “Our” Variables
[8] BEER ATTITUDE
What is its level of measurement?
Is a frequency distribution a good idea?
What measures of central tendency are best?
What measures of dispersion/variability are best?
Can we make this a Likert response scale (i.e., strongly agree, agree, uhhh, disagree, strongly disagree)?
Why are Likert scales unlikeable?

Describing “Our” Variables
[9] HOW OFTEN DO YOU ATTEND MUSICAL EVENTS?
What is its level of measurement?
Is a frequency distribution a good idea?
What measures of central tendency are best?
What measures of dispersion/variability are best?

Describing “Our” Variables
[10] MOVIE FAVES
What is its level of measurement?
Is a frequency distribution a good idea?
What measures of central tendency are best?
What measures of dispersion/variability are best?

Frequency Distributions
Used for nominal and ordinal data; interval-ratio data may need to be grouped.
Compute counts (frequencies) and relative frequencies (proportions expressed as %).
Do not do the cumulative percentages for nominal data!
Don’t get whole number variable values (number of pets) confused with frequencies!

Measures of Central Tendency
For interval-ratio data: Mean, median, mode.
For ordinal data: Median and mode.
For nominal data: Mode.

Mean vs. Median
If the variable distribution is skewed (long tail of extreme values on ONE side), median may be preferable.
Mean and median are obtained in VERY DIFFERENT WAYS.
Mean: See formula provided by Garner (2010, p. 59).
Median: See algorithm and formula provided by Garner (2010, pp. 61–62).
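
A quick way to try the points above on a computer is the short Python sketch below. It is only an illustration: the data values and the names `transport`, `commute_minutes`, and `frequency_table` are made up for this example and do not come from the class data file or from Garner (2010). It builds a frequency distribution (counts and percentages) for a nominal variable, picks out the mode, and then compares the mean and the median of a small skewed interval-ratio variable.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical class data (made up for illustration only).
transport = ["bus", "car", "bus", "walk", "bike", "bus", "car"]   # nominal
commute_minutes = [10, 15, 15, 20, 25, 30, 180]  # interval-ratio, one extreme value


def frequency_table(values):
    """Return {category: (count, percent)} for a list of category labels."""
    counts = Counter(values)
    n = len(values)
    return {category: (count, 100 * count / n) for category, count in counts.items()}


# Frequency distribution: counts and relative frequencies (as %).
# No cumulative percentages here, because the variable is nominal.
table = frequency_table(transport)
for category, (count, percent) in table.items():
    print(f"{category:<5} {count:>2} {percent:5.1f}%")

# Mode: the category with the highest frequency (the only option for nominal data).
mode_transport = Counter(transport).most_common(1)[0][0]

# Mean: add up all the values and divide by the number of cases.
commute_mean = mean(commute_minutes)

# Median: sort the values and take the middle one.
commute_median = median(commute_minutes)

print("mode of transport:", mode_transport)
print("mean commute:", round(commute_mean, 1))
print("median commute:", commute_median)
```

With these made-up values, the single 180-minute commute pulls the mean well above the median, which is the situation in which the median may be the better summary.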
Measures of Dispersion/Variability
Range
Standard deviation
Percentile distributions (e.g., interquartile range)
For categoric data: Index of diversity and index of qualitative variation (optional). See Garner (2010, pp. 67–69).

How to Compute the Standard Deviation
Very important! The algorithm (summarized by the formula) needs to be memorized. See Garner (2010, p. 65).
Work through a few simple examples.
Don’t confuse the SD with the mean deviation or the mean absolute deviation.

Mean and SD of a Proportion
Proportion for a variable with two categories (binary or dichotomous).
Coded as 0/1 for the two categories.
The mean = number of cases coded 1 divided by the total number of cases.
The SD is the square root of [p × (1 − p)], where p is the proportion of cases coded 1.
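
The standard deviation steps and the proportion shortcut above can be checked with a small Python sketch. This is an illustration under stated assumptions, not a reproduction of Garner's formula: the function names, the `sample` switch, and the data values are invented for this example. Textbooks differ on whether to divide by N or by N − 1, so the sketch supports both; the shortcut sqrt[p × (1 − p)] matches the divide-by-N version.

```python
from math import sqrt


def standard_deviation(values, sample=False):
    """SD step by step. sample=False divides by N; sample=True divides by N - 1."""
    n = len(values)
    m = sum(values) / n                              # 1. compute the mean
    squared_devs = [(x - m) ** 2 for x in values]    # 2. square each deviation from the mean
    divisor = n - 1 if sample else n                 # 3. choose the divisor (an assumption, see above)
    return sqrt(sum(squared_devs) / divisor)         # 4. average the squares, take the square root


def mean_absolute_deviation(values):
    """Average of the absolute deviations from the mean. NOT the same as the SD."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)


# Made-up interval-ratio values (mean = 5).
scores = [2, 4, 4, 4, 5, 5, 7, 9]
sd_scores = standard_deviation(scores)        # divide-by-N version
mad_scores = mean_absolute_deviation(scores)

# Made-up binary variable coded 0/1: 3 cases coded 1 out of 10.
binary = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
p = sum(binary) / len(binary)                 # mean of a 0/1 variable = proportion coded 1
sd_formula = sqrt(p * (1 - p))                # shortcut for a proportion
sd_long_way = standard_deviation(binary)      # same steps as for any other variable

print("SD:", sd_scores, "mean absolute deviation:", mad_scores)
print("p:", p, "SD by shortcut:", sd_formula, "SD the long way:", sd_long_way)
```

For the made-up scores, the SD and the mean absolute deviation come out different, which is why the two should not be confused. For the 0/1 variable, the shortcut and the long way agree.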