Z-Scores
Introduction to Statistics, Chapter 5
Feb 4-9, 2010
Classes #6-7

Z-Scores
• Specify the precise location of each X score within a normal distribution

Normal Distribution
• A normal distribution is a distribution that has a symmetric, unimodal, bell-shaped density curve
• The mean and standard deviation completely specify the curve

Examples with approximate Normal distributions
• Height
• Weight
• IQ scores
• Standardized test scores
• Body temperature
• Repeated measurement of the same quantity

Z-Scores: Main purposes
• Make raw scores more meaningful
• Tell us exactly where original scores are located
• Allow us to standardize an entire distribution and thus make better comparisons

The Standard Deviation and the Normal Distribution
• There are known percentages of scores above or below any given point on a normal curve
• About 34% of scores fall between the mean and 1 SD above the mean, and another 34% between the mean and 1 SD below it
• An additional 14% of scores fall between 1 and 2 SDs from the mean, on each side
• Thus, about 96% of all scores are within 2 SDs of the mean (34% + 34% + 14% + 14% = 96%); these rounded figures slightly overstate the exact value of 95.44%
• Note: the 34% and 14% figures can be useful to remember

The Normal Curve
[Figure: normal curve with Mean = 65 and S = 4. Y-axis: relative frequency. X-axis: scores from 51 to 79, marked at -3S, -2S, -1S, 0, +1S, +2S, +3S. 68.26% of cases fall within 1 S of the mean, 95.44% within 2 S, and 99.72% within 3 S. From left to right the bands hold about 2%, 14%, 34%, 34%, 14%, and 2% of cases.]

The 68-95-99.7 Rule

“A standard score expresses a score’s position in relation to the mean of the distribution, using the standard deviation as the unit of measurement”

“A z-score states the number of standard deviations by which the original score lies above or below the mean”

The Standard Normal Curve
• Any score can be converted to a z-score as follows:
  $z = \frac{X - \bar{X}}{S}$

z-score formula
• For a population: $z = \frac{x - \mu}{\sigma}$
• For a sample: $z = \frac{X - M}{s}$

z-score
• One of the primary purposes of a z-score is to describe the exact location of a score within a distribution
• Pay particular attention to the sign (+/-): it tells us whether the score is located above (+) or below (-) the mean
• The number tells us the distance between the score and the mean in terms of standard deviations

Finding Area When the Score is Known
To find the proportion of the curve that lies above or below a particular score:
• Convert the raw score to a Z score
• Look up the area for that Z score in the normal curve table

Finding Scores When the Area is Known
• Draw the normal curve, shading the approximate area for the percentage desired
• Make a rough estimate of the Z score where the shaded area starts
• Find the exact Z score using the normal curve table
• Check that it is close to your estimate
• Convert the Z score to a raw score, if desired

!!! Remember !!!
• We can use the standard normal distribution table (Table B.1 in Appendix B, pp. 584-587) ONLY when our distribution of scores is normal
• Using the standard normal table is not appropriate if our distribution differs markedly from normality, e.g., if it is rectangular, skewed, or bimodal

Comparing Scores from Different Distributions
• Again: the standard normal distribution has a mean of 0 and a standard deviation of 1
• Consider two sections of statistics:
  • Gurnsey’s class has a mean of 80 and S of 5
  • Marcantoni’s class has a mean of 70 and S of 5
  • Student 1 gets 80 in Gurnsey’s class
  • Student 2 gets 75 in Marcantoni’s class
• In relation to the rest of the class, which student did better?

Percentile Ranks and the Normal Distribution
• When we ask what proportion of a distribution lies below a particular z score, we are actually asking for the percentile rank of the score
• e.g., in a distribution with a mean of 100 and standard deviation of 15, 84% of the distribution falls below a score of 115 [z = (115 - 100)/15 = 1]. Therefore, the percentile rank of 115 is 84%

Question
• Example: Dave gets a 50 on his Statistics midterm and a 50 on his Calculus midterm. Did he do equally well on these two exams?
• Big question: How can we compare a person’s score on different variables?
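The table-based steps above (convert a raw score to a z score, find the area below it, go from an area back to a raw score) can be sketched in Python. This is only an illustration and not part of the original slides. It assumes the scores are normally distributed and uses `statistics.NormalDist` from the Python standard library (3.8+) in place of the printed table. The helper names `z_score`, `percentile_rank`, and `raw_score_for_percentile` are hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
STANDARD_NORMAL = NormalDist()


def z_score(x, mean, sd):
    """Number of SDs by which x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def percentile_rank(x, mean, sd):
    """Percentage of a normal distribution that falls below x.

    Assumption: the scores are (approximately) normally distributed.
    """
    return 100 * STANDARD_NORMAL.cdf(z_score(x, mean, sd))


def raw_score_for_percentile(pct, mean, sd):
    """Raw score below which pct percent of a normal distribution falls."""
    z = STANDARD_NORMAL.inv_cdf(pct / 100)
    return mean + z * sd


# Percentile-rank example from the slides: mean 100, SD 15, score 115.
print(z_score(115, 100, 15))                 # 1.0
print(round(percentile_rank(115, 100, 15)))  # 84

# The two statistics sections, each with S = 5.
print(z_score(80, 80, 5))  # Student 1, Gurnsey's class: 0.0
print(z_score(75, 70, 5))  # Student 2, Marcantoni's class: 1.0
```

Under these assumptions Student 2, at one SD above the class mean, did better relative to classmates than Student 1, who scored exactly at the mean.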
Example 1
[Figure: histograms of Statistics and Calculus grades. X-axis: grade (0 to 100). Mean Statistics = 40, Mean Calculus = 60.]
• In one case, Dave’s exam score is 10 points above the mean
• In the other case, Dave’s exam score is 10 points below the mean
• The standard deviation is 10 in both classes
• In an important sense, we must interpret Dave’s grade relative to the average performance of the class
• Dave in Statistics: (50 - 40)/10 = 1 (one SD above the mean)
• Dave in Calculus: (50 - 60)/10 = -1 (one SD below the mean)

Example 2
[Figure: histograms of Statistics and Calculus grades. X-axis: grade (0 to 100).]
• The following week Dave gets the same grades (50 in each class)
• Both distributions have the same mean (40), but different standard deviations (5 vs. 20)
• This is an example where the means are identical, but the two sets of scores have different spreads
• Dave’s Stats z-score: (50 - 40)/5 = 2.00
• Dave’s Calc z-score: (50 - 40)/20 = 0.50
• In Statistics (z = 2.00), Dave is performing better than about 98% of the class. In Calculus (z = 0.50), he is performing better than about 69% of the class. Both figures assume the grades are approximately normally distributed
• Thus, how we evaluate Dave’s performance depends on how much variability there is in the exam scores

Standard (Z) Scores
• In short, we would like to be able to express a person’s score with respect to both (a) the mean of the group and (b) the variability of the scores
• How far a person is from the mean: $X - M$
• Standard score: $Z = \frac{X - M}{s}$

Example 3: Young Women’s Height
• The heights of young women are approximately normal with mean = 64.5 inches and std. dev. = 2.5 inches
• % of young women between 62 and 67 inches?
• % of young women shorter than 62 or taller than 67 inches?
• % between 59.5 and 62 inches?
• % taller than 68.25 inches?
• How about our class?

Three Properties of Standard Scores
1. The mean of a set of z-scores is always zero
• Why? The mean has been subtracted from each score. Therefore, following the definition of the mean as a balancing point, the sum (and, accordingly, the average) of all the deviation scores must be zero.

2. The SD of a set of standardized scores is always 1
• The distribution of z-scores always has an SD of 1.0
• If M = 50 and SD = 10, then x = 60 gives z = (60 - 50)/10 = 10/10 = 1

  x:  20  30  40  50  60  70  80
  z:  -3  -2  -1   0   1   2   3

3. The distribution of a set of standardized scores has the same shape as the unstandardized scores
• The z-score distribution has the same shape as the raw score distribution (but the scaling or metric is different)
[Figure: two histograms of the same data with identical shape, panels UNSTANDARDIZED and STANDARDIZED.]

Two Disadvantages of Standard Scores
1. A person’s score is expressed relative to the group (X - M)
• Example: If Dave had taken his Calculus exam in a class in which everyone knew math well, his score would fall well below the mean and his z-score would be negative. If the class didn’t know math very well, however, Dave would be above the mean. Dave’s z-score depends on everyone else’s scores.

2. If the absolute score is meaningful or of psychological interest, it will be obscured by transforming it to a relative metric

Credits
http://wwwpsychology.concordia.ca/fac/gurnsey/PSYC315/M.Chapter6.ppt#2 95,17,Comparing Scores from Different Distributions
http://www.uic.edu/classes/psych/psych343f/lecture06.ppt#268, 19,Two Disadvantages of Standard Scores
http://www.unc.edu/~zhuz/teaching/Stat31/Notes/Lec05bb.ppt# 352,26,Homework 2.2
http://publish.uwo.ca/~pakvis/The%20Normal%20Curve.ppt#28 9,30,Probability: