INF 397C
Introduction to Research in Information Studies
Fall, 2009
Randolph G. Bias, Ph.D., CHFP
rbias@ischool.utexas.edu
cell: 512-657-3924

R. G. Bias | School of Information | UTA 5.424 | Phone: 512 471 7046 | rbias@ischool.utexas.edu

• There are three kinds of lies: lies, damned lies, and statistics.
  – Benjamin Disraeli (1804 – 1881), British politician

• Statistics are like a bikini. What they reveal is suggestive, but what they conceal is vital.
  – Aaron Levenstein, U.S. politician

• The statistics on sanity are that one out of every four Americans is suffering from some form of mental illness. Think of your three best friends. If they're okay, then it's you.
  – Rita Mae Brown, U.S. author

First . . .
• There are two components of this and any class: Instruction and Evaluation.
• Let’s get the evaluation out of the way, early.
• Need one volunteer.

“Research shows . . .”
• Finger length is a good (and quick!) indicator of intelligence.
• One volunteer – measure your finger length in cm.

Hmmmm . . .
• Everyone in the class will get a grade of “C.”
• But still, we can continue with the “instruction” part of the course.

Oh, so maybe . . .
• Just THIS person isn’t too smart.
• OK, everyone measure your right index finger.
• From the top (last) crease in your hand, to the tip of the finger.
• Write down the length, in cm.
How to measure

Bettin’ Time!
• I’ll give everyone one penny.
• You HAVE to wager it:
  – Turn it heads UP on your desk if you think that two people in this classroom share a birthday (same month and day – NOT necessarily year), and heads DOWN if you think that no two people share a birthday.

Questionnaire
• Please write down your answer to this question: “I approach this class with confidence and enthusiasm.”
  1 – Strongly disagree
  2 – Disagree
  3 – Neutral
  4 – Agree
  5 – Strongly agree

Please go to the computer . . .
• . . . and complete your row:
  – Your first name and last initial.
  – Your birth month and day (I don’t care about the year).
  – The length of your finger, in cm.
  – The number of Major League Baseball games you’ve seen, in person.
  – Your answer (the number) to the questionnaire question.
  – One favorite hobby.

Birthdays – Who won and got to keep the penny?

Now, an experiment
• I will hand you each a slip of paper. Please read it and do NOT let anyone else read it.
  – Women receive a pink slip of paper.
  – Men receive a green slip of paper.
• After everyone has read his/her slip of paper and refolded it, I’ll show some letters of the alphabet, one at a time, for one second each.
• After the last one, I’ll say “Go,” and ask you to write down the letters, in order.
• Any questions?
OK, pencils down.

J F M A M J J A S O N D

Write down the letters.
• In order!

Answers
J F M A M J J A S O N D

Exp. 1 -- Data
           All correct   Not all correct   Total
  Men
  Women
  Total

Who among you . . .
• . . . is a statistical wizard?
• . . . has experience conducting research?

What are some things that you know?

Many ways to learn new things
• Method of Authority – trusted authority tells you something
• Method of Reason – follow basic logical laws from philosophy
• Modeling
• Trial-and-error
• Intuition
• Scientific Method – belief on the basis of experience

After this class . . .
• You’ll know something about how scientists (information scientists) gather new information.
• AND you’ll be good at evaluating information others offer you.

Three Paths to “Belief”
1 – Naïve acceptance.
2 – Cynicism.
3 – Critical skepticism.

Critical Skepticism!
• Rabbit pie story.
What you’ll learn:
• Reliability. (“Oh, just measure it however.”)
• Validity. (Finger length a good indicator of intelligence?)
• Sampling – picking a representative sample and then generalizing to a larger population
• Why larger samples are better

What you’ll learn (cont’d.):
• How to represent a group of numbers, meaningfully.
  – Frequency distributions
  – Measures of central tendency
  – Measures of dispersion (spread)
  – Graphs/Tables
• Operationalizing variables (“intelligence”)
• Probability
• Correlation

What you’ll learn (cont’d.):
• Different measurement scales
• What makes a good research question?
• Experimental design
  – Independent and dependent variables
  – Controls, counterbalancing, and confounds (like in the memory experiment)
  – Hypothesis testing
  – Inferential statistics (is THAT number really bigger than THIS number?)

Objectives
To arm you with a scientist’s skepticism, and a scientist’s tools to conduct research and evaluate others’ research.
The student who successfully completes this course will understand:
1 – descriptive statistics, and how to represent a collection of numbers
2 – how to design a good experiment (and evaluate if someone else has)
3 – inferential statistics and hypothesis testing
4 – other techniques human beings use to gain new information, such as qualitative methods.

Professional History
• B.S. in psych from FSU
• Ph.D. in cognitive psych from UT-Austin
• Bell Labs for 3 years
• IBM-Austin for 11 years
• BMC Software for 5 years
• Co-founded small usability consulting co.
• Previously adjunct faculty member at UT; have taught at UT, Rutgers, Huston-Tillotson, SWTSU
• Assoc. prof. in the UT School of Information since 2003.

Review syllabus

What’s up with “Psychology”?

Break
• Give me the penny back, if you lost the birthday bet.

How to talk about a set of numbers
• We can list ‘em.
  – Can get WAY unwieldy.
  – Plus hard to make any sense out of them.
• First step – put ‘em in order.
• Second step –
  – Graph ‘em, and/or
  – Calculate percentiles/deciles

Frequency Distributions
Histograms
• # of pets ever owned: 13, 2, 1, 4, 0, 1, 3, 0, 5, 1
• Put ‘em in order: 0, 0, 1, 1, 1, 2, 3, 4, 5, 13

Freq Dist
• Raw Scores (in order): 0, 0, 1, 1, 1, 2, 3, 4, 5, 13

  Raw Score   Freq   Cumu Freq
      0         2        2
      1         3        5
      2         1        6
      3         1        7
      4         1        8
      5         1        9
     13         1       10

Histogram
[Figure: histogram of # of pets. Horizontal axis: raw score (0, 1, 2, 3, 4, 5, 13). Vertical axis: frequency (0 to 3).]

Percentiles/Deciles
• The cumulative percentage for any given score is the “percentile” for that score.
• The decile is one-tenth of the percentile (usually rounded to the nearest whole number).
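The frequency distribution above can also be built by machine. What follows is only a sketch, assuming Python 3 and the ten pet scores from the slide; the function name `frequency_table` is made up for illustration and is not part of the course materials. It counts each raw score, keeps a running (cumulative) total, and converts that total to a cumulative percentage, which is the percentile described above.

```python
from collections import Counter

# Raw scores from the "# of pets ever owned" slide.
pets = [13, 2, 1, 4, 0, 1, 3, 0, 5, 1]


def frequency_table(scores):
    """Return one row per distinct score:
    (raw score, frequency, cumulative frequency, cumulative percentage)."""
    counts = Counter(scores)   # how many times each raw score occurs
    n = len(scores)
    rows = []
    running = 0                # cumulative frequency so far
    for score in sorted(counts):   # "put 'em in order"
        running += counts[score]
        rows.append((score, counts[score], running, 100 * running / n))
    return rows


print("Raw Score  Freq  Cumu Freq  Cumu %")
for score, freq, cum_freq, cum_pct in frequency_table(pets):
    print(f"{score:>9}  {freq:>4}  {cum_freq:>9}  {cum_pct:>6.0f}")
```

The Freq and Cumu Freq columns should match the table on the Freq Dist slide; the last column is the extra step from cumulative frequency to percentile.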
Percentiles
• LOCATION of 25th percentile:
  – X.25 = (N+1)(.25)
• LOCATION of 50th percentile:
  – X.50 = (N+1)(.50)
• LOCATION of 75th percentile:
  – X.75 = (N+1)(.75)
• Example: If we had 10 scores,
  – the 25th percentile would be the (11)(.25) = 2.75th score, or part way (half way!) between the 2nd and 3rd scores.
  – The 50th percentile would be the (11)(.50) = 5.5th score, or half way between the 5th and 6th scores.

Example
• # of pets ever owned: 13, 2, 1, 4, 0, 1, 3, 0, 5, 1
• Put ‘em in order: 0, 0, 1, 1, 1, 2, 3, 4, 5, 13

Note . . .
• With an odd number of scores, the 50th percentile will be an actual score:
• Raw Scores (in order): 0, 0, 1, 1, 1, 2, 3, 4, 5, 13, 100
• 50th percentile = (N+1)(.50) = (12)(.5) = 6th score = 2.

Scales
• The data we collect can be represented on one of FOUR types of scales:
  – Nominal
  – Ordinal
  – Interval
  – Ratio
• “Scale” in the sense that an individual score is placed at some point along a continuum.

Nominal Scale
• Describe something by giving it a name. (Name – Nominal. Get it?)
• Mutually exclusive categories.
• For example:
  – Gender: 1 = Female, 2 = Male
  – Marital status: 1 = single, 2 = married, 3 = divorced, 4 = widowed
  – Make of car: 1 = Ford, 2 = Chevy . . .
• The numbers are just names.

Ordinal Scale
• An ordered set of objects.
• But no implication about the relative SIZE of the steps.
• Example:
  – The 50 states in order of population:
    • 1 = California
    • 2 = Texas
    • 3 = New York
    • . . . 50 = Wyoming

Interval Scale
• Ordered, like an ordinal scale.
• Plus there are equal intervals between each pair of scores.
• With Interval data, we can calculate means (averages).
• However, the zero point is arbitrary.
• Examples:
  – Temperature in Fahrenheit or Centigrade.
  – IQ scores

Ratio Scale
• Interval scale, plus an absolute zero.
• Sample:
  – Distance, weight, height, time (but not years – e.g., the year 2002 isn’t “twice” 1001).

Scales (cont’d.)
It’s possible to measure the same attribute on different scales. Say, for instance, your midterm test. I could:
• Give you a “1” if you don’t finish, and a “2” if you finish.
• “1” for highest grade in class, “2” for second highest grade, . . . .
• “1” for first quarter of the class, “2” for second quarter of the class, . . .
• Raw test score (100, 99, . . . .).
  – (NOTE: A score of 100 doesn’t mean the person “knows” twice as much as a person who scores 50, he/she just gets twice the score.)

Scales (cont’d.)
             Nominal              Ordinal                    Interval             Ratio
             Name                 =                          =                    =
             Mutually exclusive   =                          =                    =
                                  Ordered                    =                    =
                                                             Equal interval       =
                                                                                  + abs. 0
  Examples:  Gender, Yes/No       Class rank, Survey ans.    Days of wk., Temp.   Inches, Dollars

Critical Skepticism
• Remember the Rabbit Pie example from earlier?
• The “critical consumer” of statistics asked “what do you mean by ‘50/50’?”
• Let’s look at some other situations and claims.
Company is hurting.
• We’d like to ask you to take a 50% cut in pay.
• But if you do, we’ll give you a 60% raise next month. OK?
• Problem: Base rate.

Sale!
• “Save 100%”
• I doubt it.

Probabilities
• “It’s safer to drive in the fog than in the sunshine.” (Kinda like “Most accidents occur within 25 miles of home.” Doesn’t mean it gets safer once you get to San Marcos.)
• Navy literature around WWI:
  – “The death rate in the Navy during the Spanish-American war was 9/1000. For civilians in NYC during the same period it was 16/1000. So . . . Join the Navy. It’s safer.”

Are all results reported?
• “In an independent study [ooh, magic words], people who used Doakes toothpaste had 23% fewer cavities.”
• How many studies showed MORE cavities for Doakes users?

Sampling problems
• “Average salary of 1999 UT grads – $41,000.”
• How did they find this? I’ll bet it was average salary of THOSE WHO RESPONDED to a survey.
• Who’s inclined to respond?

Correlation ≠ Causation
• Around the turn of the 20th century, there were relatively MANY deaths from tuberculosis in Arizona.
• What’s up with that?

Remember . . .
• I do NOT want you to become cynical.
• Not all “media bias” is intentional.
• Just be sensible, critical, skeptical.
• As you “consume” statistics, ask some questions . . .
Ask yourself. . .
• Who says so? (A Zest commercial is unlikely to tell you that Irish Spring is best.)
• How does he/she know? (That Zest is “the best soap for you.”)
• What’s missing? (One year, 33% of female grad students at Johns Hopkins married faculty.)
• Did somebody change the subject? (“Camrys are bigger than Accords.” “Accords are bigger than Camrys.”)
• Does it make sense? (“Study in NYC: Working woman with family needed $40.13/week for adequate support.”)

Quote on front of Huff book:
• “It ain’t so much the things we don’t know that get us in trouble. It’s the things we know that ain’t so.”
  – Artemus Ward, US author
• Being a critical consumer of statistics will keep you from knowing things that ain’t so.

Claims
• “Better chance of being struck by lightning than being bitten by a shark.”
• Tom Brokaw – Tranquilizers.

Earlier . . .
• We learned about frequency distributions.
• I asserted that a frequency distribution, and/or a histogram (a graphical representation of a frequency distribution), was a good way to summarize a collection of data.
• There’s another, even shorter-hand way.

Measures of Central Tendency
• Mode
  – Most frequent score (or scores – a distribution can have multiple modes)
• Median
  – “Middle score”
  – 50th percentile
• Mean – µ (“mu”)
  – “Arithmetic average”
  – ΣX/N

Let’s calculate some “averages”
• From old data.
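As a stand-in for the in-class calculation, here is a rough sketch of the three “averages” in Python. It assumes Python 3.8 or later (for `statistics.multimode`) and reuses the ten pet scores from the frequency distribution slides rather than any real class data; the function name `central_tendency` is hypothetical.

```python
import statistics

# The ten "# of pets ever owned" scores from the earlier slides.
pets = [13, 2, 1, 4, 0, 1, 3, 0, 5, 1]


def central_tendency(scores):
    """Return the mode(s), median, and mean of a list of scores."""
    return {
        # multimode returns a list, since a distribution can have several modes
        "modes": statistics.multimode(scores),
        # the "middle score" (50th percentile); with an even N this is
        # half way between the two middle scores
        "median": statistics.median(scores),
        # the arithmetic average, written out as ΣX/N
        "mean": sum(scores) / len(scores),
    }


for name, value in central_tendency(pets).items():
    print(name, value)
```

Note that the median here (1.5) is not one of the actual scores, which is worth keeping in mind for the quiz.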
A quiz about averages
1 – If one score in a distribution changes, will the mode change? __Yes __No __Maybe
2 – How about the median? __Yes __No __Maybe
3 – How about the mean? __Yes __No __Maybe
4 – True or false: In a normal distribution (bell curve), the mode, median, and mean are all the same? __True __False

More quiz
5 – (This one is tricky.) If the mode=mean=median, then the distribution is necessarily a bell curve? __True __False
6 – I have a distribution of 10 scores. There was an error, and really the highest score is 5 points HIGHER than previously thought.
  a) What does this do to the mode? __Increases it __Decreases it __Nothing __Can’t tell
  b) What does this do to the median? __Increases it __Decreases it __Nothing __Can’t tell
  c) What does this do to the mean? __Increases it __Decreases it __Nothing __Can’t tell
7 – Which of the following must be an actual score from the distribution?
  a) Mean
  b) Median
  c) Mode
  d) None of the above

OK, so which do we use?
• Means allow further arithmetic/statistical manipulation. But . . .
• It depends on:
  – The type of scale of your data
    • Can’t use means with nominal or ordinal scale data
    • With nominal data, must use mode
  – The distribution of your data
    • Tend to use medians with distributions bounded at one end but not the other (e.g., salary). (Look at our “Number of MLB games” distribution.)
  – The question you want to answer
    • “Most popular score” vs. “middle score” vs. “middle of the see-saw”
• “Statistics can tell us which measures are technically correct. It cannot tell us which are ‘meaningful’” (Tal, 2001, p. 52).
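To see why medians tend to be preferred for distributions that are bounded at one end, here is a small illustrative sketch in Python 3. It compares the ten pet scores with the eleven-score version from the percentile slides, where one person reports 100 pets. The helper name `mean_and_median` is an assumption made for this example only.

```python
import statistics

# Ten pet scores, and the same list with the one extreme score (100)
# that was added on the earlier "Note . . ." slide.
ten_scores = [0, 0, 1, 1, 1, 2, 3, 4, 5, 13]
eleven_scores = ten_scores + [100]


def mean_and_median(scores):
    """Return (mean, median) for a list of scores."""
    return sum(scores) / len(scores), statistics.median(scores)


for label, scores in (("10 scores", ten_scores), ("11 scores", eleven_scores)):
    mean, median = mean_and_median(scores)
    print(f"{label}: mean = {mean:.2f}, median = {median}")
```

One extreme score pulls the mean from 3 to nearly 12, while the median only moves from 1.5 to 2. That is the “middle of the see-saw” versus “middle score” distinction in action.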
Why . . .
• . . . isn’t a “measure of central tendency” all we need to characterize a distribution of scores/numbers/data/stuff?
• “The price for using measures of central tendency is loss of information” (Tal, 2001, p. 49).

Note . . .
• We started with a bunch of specific scores.
• We put them in order.
• We drew their distribution.
• Now we can report their central tendency.
• So, we’ve moved AWAY from specifics, to a summary. But with Central Tendency, alone, we’ve ignored the specifics altogether.
  – Note MANY distributions could have a particular central tendency!
• If we went back to ALL the specifics, we’d be back at square one.

Measures of Dispersion
• Range
• Semi-interquartile range
• Standard deviation
  – σ (sigma)

Homework
• Bring in one claim that you hear in the next week. On the news, in your reading, in an ad, wherever. (Just be prepared to share it aloud.)
• Buy books. (Co-op says they’re in.)
• Read:
  – Huff: All of it.
  – Hinton: Ch. 1, 2, 3
  – S, Z, & Z: Ch. 1, 2, 3
• Come with a guess as to what the most important sentence is in S, Z, & Z, Ch. 2. (In my humble opinion.)
See you next week.