INF 397C
Introduction to Research in Library and Information Science
Fall, 2009
Day 3
R. G. Bias | School of Information | UTA 5.424 | Phone: 512 471 7046 | rbias@ischool.utexas.edu

Standard Deviation
σ = SQRT(Σ(X - µ)²/N)

New formula for σ
• σ = SQRT(Σ(X - µ)²/N)
  – HARD to calculate when you have a LOT of scores. Gotta do that subtraction with every one!
• New, “computational” equation
  – σ = SQRT((Σ(X²) - (ΣX)²/N)/N)
  – We’ll convince ourselves it gives us the same answer in just a minute.

Remember . . .
Measures of Central Tendency    Measures of Dispersion
Mode                            Range
Median                          SIQR
Mean                            Standard Deviation

Which when?
Mode, Range
  - “Most common score.”
  - Easy to calculate.
  - May be misleading.
Median, SIQR
  - Capture the center.
  - Not influenced by extreme scores.
Mean (µ), SD (σ)
  - Take every score into account.
  - Allow later manipulations.

In-class Practice Exercises

The Normal Distribution
(From Jaisingh [2000])

So far . . .
• . . . we’ve talked of summarizing ONE distribution of scores.
  – By ordering the scores.
  – By organizing them in graphs/tables/charts.
  – By calculating a measure of central tendency and a measure of dispersion.
• What happens when we want to compare TWO distributions of scores?

“Now, why would I want to do that?”
• Is your child taller or heavier?
• Is this month’s SAT test any easier or harder than last month’s?
• Is my 91 in my Research Methods class better than my 95 in my Digital Libraries class?
• Is the new library layout better than the old one?
• Can more employees sign up, more quickly, for benefits with our new intranet site than with our old one?
• Did my class perform better on the TAKS test than they did on the TAAS test?

Well?
• COULD it be the case that your 91 in your Research Methods class is better than your 95 in your Digital Libraries class?
• How?

What if . . .
• The mean in Research Methods was 50, and the mean in Digital Libraries was 99?
• (What, besides the fact that everyone else is trying to drop the Research class!)
• So:
               You    Mean
  Res. Meth.    91     50
  Dig. Lib.     95     99

The Point
• As I said last week, you need to know BOTH a measure of central tendency AND a measure of spread to understand a distribution.
• BUT STILL, this can be convoluted . . .
• “Well, daughter, how are you doing in grad school this semester?”

“Well, Mom . . .
• “. . . I have a 91 in Research Methods but the mean is 50 and the standard deviation is 12. But I only have a 95 in Digital Libraries, whereas the mean in that class is 99 with a standard deviation of 1.”
• Of course, your mom’s reaction will be, “Just call home more often, dear.”

Wouldn’t it be nice . . .
• . . . if there could be one score we could use for BOTH classes, for BOTH the TAKS test and the TAAS test, for BOTH your child’s height and weight?
• There is, and it’s called the “standard score,” or “z score.” (Get ready for another headache.)

Standard Score
• z = (X - µ)/σ
• “Hunh?”
• Each score can be expressed as the number of standard deviations it is from the mean of its own distribution.
• “Hunh?”
• (X - µ) – This is how far the score is from the mean. (Note: Could be negative! No squaring, this time.)
• Then divide by the SD to figure out how many SDs you are from the mean.

Z scores (cont’d.)
• z = (X - µ)/σ
• Notice, if your score (X) equals the mean, then z is, what?
• If your score equals the mean PLUS one standard deviation, then z is, what?
• If your score equals the mean MINUS one standard deviation, then z is, what?

An example
          Test 1   Test 2
Kris        76       76
Robin       52       86
Marty       58       80
Terry       58       90
ΣX         244      332
µ
Mode, median?

Let’s calculate σ – Test 1
           X     X-µ   (X-µ)²
Kris       76     15     225
Robin      52     -9      81
Marty      58     -3       9
Terry      58     -3       9
Σ         244      0     324
/N         61             81
σ                          9

Let’s calculate σ – Test 2
           X     X-µ   (X-µ)²
Kris       76     -7      49
Robin      86      3       9
Marty      80     -3       9
Terry      90      7      49
Σ         332      0     116
/N         83             29
σ                        5.4

So . . . z = (X - µ)/σ
• Kris had a 76 on both tests.
• Test 1 - µ = 61, σ = 9
  – So her z score was (76-61)/9 or 15/9 or 1.67.
    So we say that Kris’s score was 1.67 standard deviations above the mean.
• Test 2 - µ = 83, σ = 5.4
  – So her z score was (76-83)/5.4 or -7/5.4 or -1.3. So we say that Kris’s score was 1.3 standard deviations BELOW the mean.
• Given what I said earlier about two-thirds of the scores being within one standard deviation of the mean . . . .
• Wouldn’t it be nice if we knew exactly how many . . . ?

z = (X - µ)/σ
• If I tell you that the average IQ score is 100, and that the SD of IQ scores is 16, and that Bob’s IQ score is 2 SD above the mean, what’s Bob’s IQ?
• If I tell you that your 75 was 1.5 standard deviations below the mean of a test that had a mean score of 90, what was the SD of that test?

Notice . . .
• The mean of all z scores (for a particular distribution) will be zero, as will be their sum.
• With z scores, we transform raw scores into standard scores.
• These standard scores are RELATIVE distances from their (respective) means.
• All are expressed in units of σ.

z scores – table values
• z = (X - µ)/σ
• It is often the case that we want to know, “What percentage of the scores are above (or below) a certain other score?”
• Asked another way, “What is the area under the curve, beyond a certain point?”
• THIS is why we calculate a z score, and the way we do it is with the z table, on p. 362 of Hinton.

Z distribution

z table practice
1. What percentage of scores fall above a z score of 1.0?
2. What percentage of scores fall between the mean and one standard deviation above the mean?
3. What percentage of scores fall within two standard deviations of the mean?
4. 200 people took a test. My z score is .1. How many scores did I “beat”?
5. My z score is .01. How many scores did I “beat”?
6. My score was higher than only 3% of the class. (I suck.) What was my z score?
7. Oooh, get this. My score was higher than only 3% of the class. The mean was 50 and the standard deviation was 10. What was my raw score?

Probability
• Remember all those decisions we talked about last week.
• VERY little of life is certain.
• It is PROBABILISTIC. (That is, something might happen, or it might not.)

Prob. (cont’d.)
• Life’s a gamble!
• Just about every decision is based on probable outcomes.
• None of you raised your hands in Week 1 when I asked for “statistical wizards.” Yet every one of you does a pretty good job of navigating an uncertain world.
  – None of you touched a hot stove (on purpose).
  – All of you made it to class.

Probabilities
• Always between zero and one.
• Something with a probability of “one” will happen. (e.g., Death, Taxes).
• Something with a probability of “zero” will not happen. (e.g., My becoming a Major League Baseball player).
• Something that’s unlikely has a small, but still positive, probability. (e.g., probability of someone else having the same birthday as you is 1/365 = .0027, or .27%.)

Just because . . .
• . . . there are two possible outcomes doesn’t mean there’s a “50/50 chance” of each happening.
• When driving to school today, I could have arrived alive, or been killed in a fiery car crash. (Two possible outcomes, as I’ve defined them.) Not equally likely.
• But the odds of a flipped coin being “heads,” . . . .

Let’s talk about socks

Prob (cont’d.)
• Probability of something happening is # of “successes” / # of all events
  – P(one flip of a coin landing heads) = ½ = .5
  – P(one die landing as a “2”) = 1/6 = .167
  – P(some score in a distribution of scores is greater than the median) = ½ = .5
  – P(some score in a normal distribution of scores is greater than the mean but has a z score of 1 or less) = ?
  – P(drawing a diamond from a complete deck of cards) = ?

Probabilities – and & or
• From Runyon:
  – Addition Rule: The probability of selecting a sample that contains one or more elements is the sum of the individual probabilities for each element less the joint probability.
    • p(A or B) = p(A) + p(B) - p(A and B)
    • When A and B are mutually exclusive, p(A and B) = 0, so p(A or B) = p(A) + p(B).
  – Multiplication Rule: The probability of obtaining a specific sequence of independent events is the product of the probability of each event.
    • p(A and B and . . .) = p(A) x p(B) x . . .

More prob.
• From Slavin:
  – Addition Rule: If X and Y are mutually exclusive events, the probability of obtaining either of them is equal to the probability of X plus the probability of Y.
  – Multiplication Rule: The probability of the simultaneous or successive occurrence of two events is the product of the separate probabilities of each event.

Yet more prob.
• http://www.midcoast.com.au/~turfacts/maths.html
  – The product or multiplication rule. “If two chances are mutually exclusive the chances of getting both together, or one immediately after the other, is the product of their respective probabilities.” (Careful: as Runyon’s version says, the product rule needs INDEPENDENT events. Mutually exclusive events can never happen together.)
  – The addition rule. “If two or more chances are mutually exclusive, the probability of making ONE OR OTHER of them is the sum of their separate probabilities.”

What’s the probability . . .
• That the next card is a king?
• That the next card is a heart?
• That the next card is a spade?
• That the next card is a club and a king?
• That the next card is a spade OR a heart?
• That the next two cards are kings?

Think this through.
• What are the odds (“what are the chances”) (“what is the probability”) of getting two “heads” in a row?
• Three heads in a row?
• Three flips the same (heads or tails) in a row?

So then . . .
• WHY were the odds in favor of having two people in our class with the same birthday?
• Think about the problem!
• What if there were 367 people in the class?
  – P(2 people with same b’day) = 1.00

Happy B’day to Us
• But we had 50.
• Probability that the first person has a birthday: 1.00.
• Prob of the second person having the same b’day: 1/365
• Prob of the third person having the same b’day as Person 1 or Person 2 is 1/365 + 1/365, minus the chances of all three of them having the same birthday.

Sooooo . . .
• http://en.wikipedia.org/wiki/Birthday_paradox

Homework
• Keep reading.
• Practice problems.
• Date for midterm.
• Next week: Dr. Mary Lynn Rice Lively on qualitative research methods.
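
Worked sketch: σ both ways, then z
Going back to the two σ formulas and the Test 1 / Test 2 example, here is a minimal Python sketch that computes σ with the definitional formula and with the “computational” one, then turns Kris’s 76 into a z score on each test. The function names are made up for illustration; the scores are the ones from the example table, and σ is the population SD (divide by N), as on the slides.

```python
import math


def sd_definitional(scores):
    """Population SD: SQRT(sum of (X - mu)^2 / N)."""
    n = len(scores)
    mu = sum(scores) / n
    return math.sqrt(sum((x - mu) ** 2 for x in scores) / n)


def sd_computational(scores):
    """Population SD: SQRT((sum of X^2 - (sum of X)^2 / N) / N)."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x_squared = sum(x * x for x in scores)
    return math.sqrt((sum_x_squared - sum_x ** 2 / n) / n)


def z_score(x, mu, sigma):
    """Number of SDs that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


# Scores from the example table, in the order Kris, Robin, Marty, Terry.
test1 = [76, 52, 58, 58]
test2 = [76, 86, 80, 90]

for name, scores in (("Test 1", test1), ("Test 2", test2)):
    mu = sum(scores) / len(scores)
    sigma = sd_definitional(scores)
    kris_z = z_score(scores[0], mu, sigma)
    print(f"{name}: mu={mu:.1f} sigma={sigma:.2f} "
          f"(computational: {sd_computational(scores):.2f}) Kris z={kris_z:.2f}")
```

Both functions should agree up to rounding, which is the “convince ourselves” step from the computational-formula slide.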
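
Worked sketch: areas under the curve without the table
The z table questions all come down to “how much area is beyond this z?” The slides use the printed z table in Hinton; as an alternative, this sketch assumes Python 3.8 or later and uses the standard library’s `statistics.NormalDist` to get the same areas. The helper names are hypothetical, and the answers may differ from the table in the last decimal place because the table is rounded.

```python
from statistics import NormalDist

# The z distribution: a normal curve with mean 0 and SD 1.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def area_below(z):
    """Proportion of scores falling below a given z score."""
    return STANDARD_NORMAL.cdf(z)


def area_above(z):
    """Proportion of scores falling above a given z score."""
    return 1.0 - area_below(z)


def area_between(z_low, z_high):
    """Proportion of scores falling between two z scores."""
    return area_below(z_high) - area_below(z_low)


def z_for_proportion_below(p):
    """The z score that has proportion p of the scores below it."""
    return STANDARD_NORMAL.inv_cdf(p)


def raw_score(z, mu, sigma):
    """Undo the z transformation: X = mu + z * sigma."""
    return mu + z * sigma


print(f"above z = 1.0:       {area_above(1.0):.4f}")         # about 0.1587
print(f"mean to z = 1.0:     {area_between(0.0, 1.0):.4f}")  # about 0.3413
print(f"within 2 SDs:        {area_between(-2.0, 2.0):.4f}") # about 0.9545
print(f"z that beats only 3%: {z_for_proportion_below(0.03):.2f}")
```

To go from a z score back to a raw score, as in the last practice question, pass the z along with that test’s mean and SD to `raw_score`.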
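
Worked sketch: “and” and “or” with cards and coins
The addition and multiplication rules can be checked with exact fractions. This sketch assumes a standard 52-card deck and a fair coin, and it assumes the “next two cards” are drawn without putting the first one back, so the second factor is a conditional probability (3 kings left among 51 cards) rather than an independent one. The “club or king” line is an extra illustration of the general addition rule, not one of the slide questions.

```python
from fractions import Fraction

DECK_SIZE = 52

p_king = Fraction(4, DECK_SIZE)
p_heart = Fraction(13, DECK_SIZE)
p_spade = Fraction(13, DECK_SIZE)
p_club = Fraction(13, DECK_SIZE)

# Exactly one card is both a club and a king.
p_club_and_king = Fraction(1, DECK_SIZE)

# Addition rule, mutually exclusive case: a card cannot be a spade AND a heart,
# so the joint probability that gets subtracted is zero.
p_spade_or_heart = p_spade + p_heart - 0

# Addition rule, general case: subtract the overlap (the king of clubs).
p_club_or_king = p_club + p_king - p_club_and_king

# Multiplication rule for independent events: separate coin flips.
p_heads = Fraction(1, 2)
p_two_heads = p_heads * p_heads
p_three_heads = p_heads ** 3
# "Three the same" is three heads OR three tails (mutually exclusive).
p_three_same = p_three_heads + p_three_heads

# Two kings in a row, drawing without replacement (assumption).
p_two_kings = Fraction(4, 52) * Fraction(3, 51)

for label, p in [
    ("spade or heart", p_spade_or_heart),
    ("club or king", p_club_or_king),
    ("two heads", p_two_heads),
    ("three heads", p_three_heads),
    ("three flips the same", p_three_same),
    ("two kings", p_two_kings),
]:
    print(f"P({label}) = {p} = {float(p):.4f}")
```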
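
Worked sketch: the birthday problem
For the birthday question it is easier to compute the chance that NOBODY shares a birthday and subtract it from 1. This is a minimal sketch under two assumptions that the slides do not spell out: 365 equally likely birthdays, and no leap-year birthdays (which is why the slide’s 367 people is more than enough to guarantee a match). The function name is hypothetical.

```python
def p_shared_birthday(n_people, days=365):
    """P(at least two of n_people share a birthday), assuming equally likely days."""
    p_all_different = 1.0
    for already_taken in range(n_people):
        # The next person must avoid every birthday taken so far.
        # Once the days run out, this factor is zero and stays zero.
        p_all_different *= max(days - already_taken, 0) / days
    return 1.0 - p_all_different


for n in (2, 23, 50, 367):
    print(f"{n:>3} people: P(shared birthday) = {p_shared_birthday(n):.4f}")
```

With 2 people this reduces to the 1/365 from the “Happy B’day to Us” slide, and with 367 people the probability is exactly 1.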