S519: Evaluation of Information Systems
Social Statistics
Chapter 7: Are your curves normal?

Last week

This week
Why understanding probability is important
What a normal curve is
How to compute and interpret z scores

What is probability?
The chance of winning a lottery
The chance of getting a head on one flip of a coin
It determines the degree of confidence with which we can state a finding

Normal curve
Symmetrical (bell-shaped): mean = median = mode
Asymptotic: the tails come closer and closer to the horizontal axis but never touch it

Normal distribution
Figure 7.4 (P157)
Almost 100% of the scores fall between -3SD and +3SD
Around 34% of the scores fall between the mean and +1SD

Normal distribution
The distance between    Contains               Range (if mean=100, SD=10)
Mean and 1SD            34.13% of all cases    100-110
1SD and 2SD             13.59% of all cases    110-120
2SD and 3SD             2.15% of all cases     120-130
>3SD                    0.13% of all cases     >130
Mean and -1SD           34.13% of all cases    90-100
-1SD and -2SD           13.59% of all cases    80-90
-2SD and -3SD           2.15% of all cases     70-80
< -3SD                  0.13% of all cases     <70

Z score: standard score
Use z scores when you want to compare individuals in different distributions
Z scores are comparable because they are standardized in units of standard deviations

Z score
Standard score: $z = \frac{X - \bar{X}}{s}$
X: the individual score
$\bar{X}$: the mean
s: the standard deviation

Z score
Z scores across different distributions are comparable
A z score represents a distance of z standard deviations from the mean
Raw score 12.8 (mean=12, SD=2): z=+0.4
Raw score 64 (mean=58, SD=15): z=+0.4
Both scores are the same distance from their means

Excel for z score
STANDARDIZE(x, mean, standard deviation)
or (A2-AVERAGE(A2:A11))/STDEV(A2:A11)

What do z scores represent?
Raw scores below the mean have negative z scores
Raw scores above the mean have positive z scores
A z score represents the number of standard deviations from the mean
The more extreme the z score, the further the raw score is from the mean

What do z scores represent?
84% of all the scores fall below a z score of +1 (why?)
16% of all the scores fall above a z score of +1 (why?)
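One way to check the z scores and the 84%/16% split above is a short Python sketch. It is not part of the course materials, which use Excel. The helper names z_score and proportion_below are made up for illustration, and the normal-curve proportion is computed with the standard library's math.erf instead of being looked up in Table B.1.

```python
import math

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def proportion_below(z):
    """Proportion of a normal distribution that falls below a z score."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# The two raw scores from the slide lie the same distance from their means.
z_a = z_score(12.8, 12, 2)
z_b = z_score(64, 58, 15)
print(round(z_a, 2), round(z_b, 2))  # 0.4 0.4

# Share of scores below and above z = +1, in percent.
below = proportion_below(1.0)
print(round(below * 100, 1), round((1 - below) * 100, 1))  # 84.1 15.9
```

The 84% is the 50% of cases below the mean plus the 34.13% between the mean and +1SD. The remaining 16% is everything above +1SD.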
The percentage of scores above or below a z score represents the probability of a certain score occurring, or an event happening
If it is less than 5%, the event is unlikely to happen

Lab Exercise
In a normal distribution with a mean of 100 and a standard deviation of 10, what is the probability that any one score will be 110 or above?
Answer: 16%
Table B.1 (S-P357)

Lab
If z is not an integer: Table B.1 (S-P357-358)

Exercise
The probability associated with z=1.38
41.62% of all the cases in the distribution fall between the mean and 1.38 standard deviations above it
About 92% fall below 1.38 standard deviations
How and why?

Between two z scores
What is the probability of falling between z scores of 1.5 and 2.5?
z=1.5: 43.32%
z=2.5: 49.38%
So around 6% of all the cases in the distribution fall between 1.5 and 2.5 standard deviations

Lab Exercise
What percentage of the data falls between 110 and 125 in a distribution with mean=100 and SD=10?
Answer: 15.25%

Excel
NORMSDIST(z)
Computes the probability associated with a particular z score (the cumulative proportion of cases falling below z)

Lab Exercise
The probability of a particular score occurring between a z score of +1 and a z score of +2.5
Answer: about 15%

What can we do with z scores?
A research hypothesis presents a statement of the expected event
We use statistics to evaluate how likely that event is
Z tests are reserved for populations
T tests are reserved for samples

Lab Exercise
Compute the z scores for the following raw scores, where mean=50 and standard deviation=5:
55, 50, 60, 57.5, 46

Lab Exercise
Based on a distribution of scores with mean=75 and standard deviation=6.38:
What is the probability of a score falling between a raw score of 70 and 80?
What is the probability of a score falling above a raw score of 80?
What is the probability of a score falling below a raw score of 63?
What is the probability of a score falling between a raw score of 81 and 83?
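The table lookups in the worked examples earlier (110 or above, between 110 and 125, and z=1.38) can be cross-checked with a minimal Python sketch. This is an illustration under assumptions and not the method the course uses. The function names normal_cdf, prob_between and prob_above are hypothetical. normal_cdf plays the same role as Excel's NORMSDIST(z), built here from the standard library's math.erf.

```python
import math

def normal_cdf(z):
    """Cumulative proportion of a normal distribution below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_between(low, high, mean, sd):
    """Probability of a raw score falling between low and high."""
    z_low = (low - mean) / sd
    z_high = (high - mean) / sd
    return normal_cdf(z_high) - normal_cdf(z_low)

def prob_above(x, mean, sd):
    """Probability of a raw score falling above x."""
    return 1.0 - normal_cdf((x - mean) / sd)

# Worked examples from the slides (mean=100, SD=10).
p_above_110 = prob_above(110, 100, 10)        # about 0.16
p_110_125 = prob_between(110, 125, 100, 10)   # about 0.15

# Area between the mean and z = 1.38, as Table B.1 reports it.
p_mean_to_138 = normal_cdf(1.38) - 0.5        # about 0.416

print(round(p_above_110, 4), round(p_110_125, 4), round(p_mean_to_138, 4))
```

Results computed this way can differ from the table-based answers in the last decimal place, because Table B.1 values are rounded before they are subtracted. The same helpers can be applied to the mean=75, SD=6.38 exercise once the raw scores have been worked by hand.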