I. Basic Terms

II. Univariate Descriptive Statistics

   E. Three general characteristics of distributions

      1. Central tendency

      2. Variability

         Sample variance and standard deviation:

         $$s^2 = \frac{\sum (Y - \bar{Y})^2}{n - 1} \qquad s = \sqrt{\frac{\sum (Y - \bar{Y})^2}{n - 1}}$$

         Population variance and standard deviation:

         $$\sigma^2 = \frac{\sum (Y - \mu_Y)^2}{N} \qquad \sigma = \sqrt{\frac{\sum (Y - \mu_Y)^2}{N}}$$

         What does the standard deviation tell us?

         For almost all distributions, the value of $s$ (or $\sigma$) is somewhere between the smallest deviation from the mean and the largest deviation from the mean (both taken in absolute value).

         For “bell shaped” distributions, about 68% of the cases are within one standard deviation of the distribution’s mean, about 95% of the cases are within two standard deviations of the mean, and almost all of the cases are within three standard deviations of the mean.

      3. Shape

         a. Two distinctions
            1) Unimodal vs. multimodal
            2) Symmetric vs. asymmetric

         b. Common shapes
            1) Bell shaped distributions
            2) Skewed distributions
            3) Uniform distributions
            4) U-shaped distributions

         c. There are measures that describe how skewed a distribution is, how “flat” it is, etc.

         d. A few notes on how shape is related to the values of the mode, median, mean and standard deviation
            1) For unimodal and symmetric distributions, Mo = Md = Mean.
            2) For symmetric distributions, Md = Mean. In addition, they are both at the “center” of the distribution.
            3) For positively skewed distributions, Mo < Md < Mean. For negatively skewed distributions, Mo > Md > Mean.
            4) For variables measured in terms of a ratio scale, if the standard deviation exceeds the mean, the distribution is positively skewed.

   F. Characterizing scores in terms of the distributions of which they are members

      1. Two common problems
         a. What does a given score mean?
         b. How can I combine scores from different distributions?

      2. Two solutions for the first problem

         a. Transform the score into a percentile rank.

            The percentile rank of score X is the percentage of the cases in the distribution that have scores lower than X.

            Percentile ranks are easy to understand.

            Percentile ranks usually distort differences between scores.
            In general, they exaggerate small differences in regions of a distribution that have high frequencies, and they minimize differences in regions that have low frequencies.

         b. Transform the score into a “standard score” (z-score)

            $$z = \frac{Y - \mu_Y}{\sigma_Y}$$

            A z-score tells you how many standard deviations a given score is away from the mean of the distribution. For scores above the mean, z-scores are positive. For scores below the mean, z-scores are negative. For scores at the mean, the z-score is zero.

            Z-score distributions always have a mean of zero and a standard deviation of one.

            The z-score is one kind of linear transformation of raw scores.

            $$z = -\frac{\mu_Y}{\sigma_Y} + \frac{1}{\sigma_Y}(Y)$$

            Linear transformations involve adding, subtracting, multiplying and/or dividing all of the raw scores (Y) by one or more constants.

            What happens when you add or subtract a constant from all of the scores in a distribution?

            What happens when you multiply or divide all of the scores in a distribution by a constant?

            A key property of linear transformations is that they do not change the shape (bimodal, skewed, etc.) of the original distribution of raw scores. As a result, they preserve information about relative distances between pairs of scores.

            There are many “standard” linear transformations (IQ scores, SAT scores, GRE scores, etc.) in addition to the “standard score transformation.”

      3. Comparing and combining scores from different distributions: Don’t use raw scores!

         1st Exam: mean = 50, s.d. = 5
         2nd Exam: mean = 50, s.d. = 10

                    1st Exam   2nd Exam   Total Pts.
         George        50         60         110
         You           60         50         110

         Does George deserve the same overall course grade as you?

         Instead of raw scores, use z-scores

                    1st Exam   2nd Exam   Total Pts.
         George         0          1           1
         You            2          0           2

III. Probability and Probability Distributions

   A. Probability is defined as the long-run relative frequency of an outcome

      $$P_i = f_i / n$$

      P_i   The probability of the ith outcome
      f_i   The number of times the ith outcome occurred over many trials
      n     The number of trials (big)

      1. We could conceivably determine probabilities empirically, but sometimes we know them a priori.

      2. Two properties of probabilities
         a. $0 \le P_i \le 1.0$
         b. $\sum P_i = 1.0$

      3. Two rules for calculating probabilities of complex events

         a. The probability that any one of a set of mutually exclusive outcomes will occur is equal to the sum of the probabilities of each of those outcomes.

            What’s the probability of getting an even number when you roll an honest die?

         b. The probability that a particular combination of outcomes of independent trials will occur is equal to the product of the probabilities of each of those outcomes.

            What’s the probability of getting two ones when you roll an honest die twice?

   B. Probability distributions

      1. The probability distribution for rolling an honest die

         Number of dots   1     2     3     4     5     6
         P_i              1/6   1/6   1/6   1/6   1/6   1/6

         a. What’s the shape of this distribution?

         b. What’s the mean of this distribution?

            The mean of a probability distribution is a special thing and has a special name: the “expected value”

            $$E(Y) = \sum y \, P(y)$$

         c. What’s the standard deviation of this distribution?

            $$\sigma = \sqrt{\sum \left( y - E(y) \right)^2 P(y)}$$

      2. The probability distribution for the number of females in random samples of one person when the population is half female (f = .5)

         # of females in sample   0     1     Total
         P_i                      .5    .5    1.0

         What is the shape of this distribution? What is its mean? What is its standard deviation?

      3. The probability distribution for the number of females in random samples of two people drawn from a population when the population is half female

         Here are the possible sample results

            F, F
            F, M
            M, F
            M, M

         The probability of each of these four possible sample results equals $.5^2$ or .25 (multiplication rule)

         Thus, the probability of drawing 2 females equals .25, the probability of drawing one female equals .50, and the probability of drawing 0 females equals .25 (addition rule)

         # of females in sample   0      1      2      Total
         P_i                      .25    .50    .25    1.00

         What’s the shape of this distribution? What is its mean and standard deviation?
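One way to check answers to the mean and standard deviation questions above is to apply the expected value and standard deviation formulas directly to a table of outcomes and probabilities. The following is a minimal Python sketch, not part of the original notes. The function names `expected_value` and `std_dev` and the dictionary layout are assumptions made for illustration.

```python
from fractions import Fraction
from math import sqrt


def expected_value(dist):
    """Mean of a discrete probability distribution given as {outcome: probability}."""
    return sum(y * p for y, p in dist.items())


def std_dev(dist):
    """Square root of the probability-weighted squared deviations from the mean."""
    mean = expected_value(dist)
    variance = sum((y - mean) ** 2 * p for y, p in dist.items())
    return sqrt(variance)


# Honest die: each face has probability 1/6.
die = {dots: Fraction(1, 6) for dots in range(1, 7)}

# Number of females in random samples of two people from a half-female population.
two_person_sample = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

for name, dist in [("die", die), ("two-person sample", two_person_sample)]:
    print(name, float(expected_value(dist)), round(std_dev(dist), 4))
```

Exact fractions are used for the probabilities so that the mean comes out without rounding error. Only the final square root is a floating-point value.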
         In the 1700s, Jacob Bernoulli proved that for these kinds of distributions (two possible outcomes, n independent trials) the mean always equals $n\pi$ and the standard deviation always equals $\sqrt{n\pi(1-\pi)}$. Here $n$ = sample size and $\pi$ = probability of drawing a female.
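The closed-form mean and standard deviation can be compared with values obtained the long way, by listing every possible sample and applying the multiplication and addition rules. The sketch below is an illustration under assumptions, not the notes' own method. The names `count_distribution`, `mean_and_sd`, `formula_mean_and_sd`, and `prob_female` are hypothetical.

```python
from itertools import product
from math import sqrt


def count_distribution(n, prob_female):
    """Distribution of the number of females in samples of size n, built by enumeration."""
    dist = {k: 0.0 for k in range(n + 1)}
    # Each sample is a tuple of n draws, coded 1 = female and 0 = male.
    for sample in product([0, 1], repeat=n):
        k = sum(sample)
        # Multiplication rule: independent draws, so multiply their probabilities.
        prob = prob_female ** k * (1 - prob_female) ** (n - k)
        # Addition rule: samples are mutually exclusive, so add their probabilities.
        dist[k] += prob
    return dist


def mean_and_sd(dist):
    """Expected value and standard deviation of a {outcome: probability} table."""
    mean = sum(k * p for k, p in dist.items())
    sd = sqrt(sum((k - mean) ** 2 * p for k, p in dist.items()))
    return mean, sd


def formula_mean_and_sd(n, prob_female):
    """Closed-form mean and standard deviation for n independent two-outcome trials."""
    return n * prob_female, sqrt(n * prob_female * (1 - prob_female))


print(count_distribution(2, 0.5))
print(mean_and_sd(count_distribution(2, 0.5)))
print(formula_mean_and_sd(2, 0.5))
```

Enumeration visits 2^n samples, so it is only practical for small n. The closed-form version gives the same mean and standard deviation for any sample size.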