INF397C Introduction to Research in Information Studies
Spring, 2005 – Day 12
R. G. Bias | School of Information | SZB 562BB | Phone: 512 471 7046 | rbias@ischool.utexas.edu

Context
• Where we've been:
  – Descriptive statistics
    • Frequency distributions, graphs
    • Types of scales
    • Probability
    • Measures of central tendency and spread
    • z scores
  – Experimental design
    • The scientific method
    • Operational definitions
    • IV, DV, controls, counterbalancing, confounds
    • Validity, reliability
    • Within- and between-subjects designs
  – Qualitative research
    • Gracy, Rice Lively
  – Inferential statistics
    • Dillon – standard error of the mean, t-tests
    • Doty – chi square

Context (cont'd.)
• Where we're going:
  – More descriptive statistics
    • Correlation
  – Inferential statistics
    • Confidence intervals
    • Hypothesis testing, Type I and II errors, significance level
    • t-tests
    • ANOVA
  – Which method when?
  – Cumulative final

Standard Error of the Mean
• So far, we've computed a sample mean (M, X-bar) and used it to estimate the population mean (µ).
• One thing we've become convinced of (I hope) is that larger sample sizes are better.
  – Think about it – what if I asked ONE of you which school you're a student in, versus asking 10 of you?

Standard Error (cont'd.)
• Well, instead of picking ONE sample and using that mean to estimate the population mean, what if we sampled a BUNCH of samples?
• If we sampled ALL possible samples, the mean of the sample means (µM) would equal the population mean.
• Here are some other things we know:
  – As we take more samples, the mean of the sample means gets closer to the population mean.
  – The distribution of sample means tends to be normal.
  – We can use the z table to find the probability of obtaining a mean of a certain value.
  – And most importantly . . .

Standard Error (cont'd.)
• We can easily work out the standard deviation of the distribution of sample means:
  – SE = SM = S/SQRT(N)
• So, the standard error of the mean is the standard (typical) distance between a sample mean and the population mean.
• Thus, the SE tells us how good an estimate our sample mean is of the population mean.
• Note: as N gets larger, the SE gets smaller, and the sample mean becomes a better estimate of the population mean.
• Hold on – we'll use SE later.

Two methods of making statistical inferences
• Null hypothesis testing
  – Assume the IV has no effect on the DV; the differences we obtain are just due to chance (error variance).
  – If the difference is unlikely enough to have happened by chance (and "unlikely enough" usually means p < .05), then we say there's a true difference.
• Confidence intervals
  – We compute a confidence interval for the "true" population mean from sample data (usually at the 95% level).
  – If two groups' confidence intervals don't overlap, we say (we INFER) there's a true difference.
• (A small numerical sketch of the standard error and a confidence interval follows below.)
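To make the two preceding ideas concrete, here is a minimal Python sketch (my own illustration, not from the slides; the ten scores are invented) that computes the standard error of the mean as S/SQRT(N) and builds an approximate 95% confidence interval around the sample mean.

```python
# Minimal sketch (illustrative data, not from the lecture): the standard error
# of the mean and an approximate 95% confidence interval from one sample.
import math
import statistics

sample = [23, 19, 25, 30, 27, 22, 26, 24, 28, 21]   # hypothetical scores

M = statistics.mean(sample)       # sample mean, our estimate of mu
s = statistics.stdev(sample)      # sample SD (divides by N - 1)
N = len(sample)
SE = s / math.sqrt(N)             # standard error of the mean: s / SQRT(N)

# Approximate 95% CI for the population mean. 1.96 is the large-sample (z)
# critical value; for a small N like this you would use the t critical value.
lower, upper = M - 1.96 * SE, M + 1.96 * SE
print(f"M = {M:.2f}, SE = {SE:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

With only 10 scores the interval is wide; adding scores shrinks SE and tightens the interval, which is the "larger samples are better" point above.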
Remember . . .
• Earlier I said that there are two ways for us to become confident that something is true:
  – Statistical inference
  – Replicability
• Now I'm saying there are two avenues of statistical inference:
  – Hypothesis testing
  – Confidence intervals

Effect Size
• How big an effect does the IV have on the DV?
• Remember, two things that make it hard to find a difference are:
  – There's only a small actual difference.
  – There's a lot of within-group variability (error variance).
  – (Demonstrate with moving distributions.)

Effect Size (cont'd.)
• From S, Z, & Z: "To be able to observe the effect of the IV, given large within-group variability, the difference between the two group means must be large."
• Cohen's d = (µ1 – µ2)/σ
• "Because effect sizes are presented in standard deviation units, they can be used to make meaningful comparisons of effect sizes across experiments using different DVs."

Effect Size (cont'd.)
• When σ isn't known, it's obtained by pooling the within-group variability across groups and dividing by the total number of scores in both groups:
  – σ = SQRT{[(n1 – 1)S1² + (n2 – 1)S2²]/N}
• And, by convention:
  – d of .20 is considered small
  – d of .50 is considered medium
  – d of .80 is considered large

Effect Size example
• Let's look at the heights of men and women.
• Just for grins – intuitively, what would you say: a small, medium, or large difference?
• µwomen = 64.6 in., µmen = 69.8 in., σ = 2.8 in.
• d = (µ1 – µ2)/σ = (69.8 – 64.6)/2.8 = 1.86
• So, a very large difference – indeed, one that everyone is aware of.

t-tests
• Remember z scores:
  – z = (X – µ)/σ
  – It is often the case that we want to know, "What percentage of the scores are above (or below) a certain other score?"
  – Asked another way, "What is the area under the curve beyond a certain point?"
  – THIS is why we calculate a z score, and the way we do it is with the z table, on p. 306 of Hinton.
• Problem: we RARELY truly know µ or σ.

t-tests (cont'd.)
• So, typically what we do is use M to estimate µ and s to estimate σ. (Duh.) (Note: when we estimate σ with s, we divide by N – 1, the degrees of freedom.)
• Then, instead of z, we calculate t.
• Hinton's example on p. 64 is a t-test for when you have a null-hypothesis population mean (µ0) – that is, when you want to test whether your observed sample mean differs from some value.
• Hinton then offers examples in Chapter 8 of related (dependent, within-subjects) and independent (unrelated, between-subjects) t-tests.
• S, Z, & Z's example on p. 409 is a t-test to compare independent means.

Formulae
• For a single mean (compared with µ0):
  – t = (M – µ0)/(s/SQRT(n))
• For related (within-subjects) groups:
  – t = (M1 – M2)/s(M1 – M2), where s(M1 – M2) = s(X1 – X2)/SQRT(n), i.e., the standard deviation of the difference scores over SQRT(n).
  – See Hinton, p. 83.
• For independent groups:
  – t = (M1 – M2)/s(M1 – M2), where s(M1 – M2) = SQRT[(S1²/n1) + (S2²/n2)]
  – From S, Z, & Z, p. 409, and Hinton, p. 87.
  – Will's correction: the sign in the numerator of the formula at the top of page 87 should be a minus sign, and likewise two formulas down. Hinton has it right by the bottom of the page.
• (These formulas are worked through in the code sketch below.)
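The effect-size and t formulas above map directly onto a few lines of code. Here is a minimal Python sketch (my own illustration – the function names and the small demo data are mine, not Hinton's or S, Z, & Z's) of Cohen's d with the pooled σ defined two slides back, plus the three t statistics as written on the Formulae slide.

```python
# Minimal sketch of the effect-size and t formulas from the preceding slides.
# Function names and demo data are invented for illustration.
import math
import statistics

def cohens_d(m1, m2, sd):
    """Cohen's d = (mu1 - mu2) / sigma."""
    return (m1 - m2) / sd

def pooled_sd(s1, n1, s2, n2):
    """Pooled sigma as the slide defines it: divide by the total number of
    scores, N = n1 + n2. (Some texts divide by n1 + n2 - 2 instead.)"""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2))

def t_single(sample, mu0):
    """Single mean vs. a null-hypothesis mean: t = (M - mu0) / (s / sqrt(n))."""
    n, m, s = len(sample), statistics.mean(sample), statistics.stdev(sample)
    return (m - mu0) / (s / math.sqrt(n))

def t_related(scores1, scores2):
    """Related (within-subjects) t: SE is the SD of the difference scores over sqrt(n)."""
    diffs = [a - b for a, b in zip(scores1, scores2)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se

def t_independent(group1, group2):
    """Independent-groups t: s(M1 - M2) = sqrt(S1^2/n1 + S2^2/n2)."""
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)  # n - 1 denominators
    se = math.sqrt(v1 / len(group1) + v2 / len(group2))
    return (m1 - m2) / se

# Height example from the slides: d = (69.8 - 64.6) / 2.8, about 1.86.
print(round(cohens_d(69.8, 64.6, 2.8), 2))

# Tiny made-up before/after demo for the related-groups t.
before = [5, 7, 6, 8, 7]
after = [6, 9, 6, 9, 8]
print(round(t_related(after, before), 2))
```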
Steps
• For a t test for a single sample:
  – Restate the question as a research hypothesis and a null hypothesis about the populations.
  – Determine the characteristics of the comparison distribution.
    • The mean is the known population mean.
    • Compute the standard deviation:
      – Calculate the estimated population variance (S² = SS/df).
      – Calculate the variance of the distribution of means (S²/n).
      – Take the square root to get the SE.
      – Note: we're calculating t with N – 1 df.
  – Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected.
    • Decide on an alpha and one-tailed vs. two-tailed.
    • Look up the critical value in the table.
  – Determine your sample's t score: t = (M – µ)/SE.
  – Decide whether or not to reject the null hypothesis. (If the observed value of t exceeds the table value, reject.)

Steps
• For a t test for dependent means:
  – Restate the question as a research hypothesis and a null hypothesis about the populations.
  – Determine the characteristics of the comparison distribution.
    • Make each person's score into a difference score. From here on out, use difference scores.
    • Compute the mean of the difference scores.
    • Assume a population mean of 0: µ = 0.
    • Compute the standard deviation of the difference scores:
      – Calculate the estimated population variance (S² = SS/df).
      – Calculate the variance of the distribution of means (S²/n).
      – Take the square root to get the SE.
      – Note: we're calculating t with N – 1 df.
  – Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected.
    • Decide on an alpha and one-tailed vs. two-tailed.
    • Look up the critical value in the table.
  – Determine your sample's t score: t = (M – µ)/SE.
  – Decide whether or not to reject the null hypothesis. (If the observed value of t exceeds the table value, reject.)

Steps
• For a t test for independent means:
  – Same as for dependent means, except the value for SE is that squirrely formula in Hinton, p. 87.
  – Basically, here's the point: when you're comparing DEPENDENT (within-subjects, related) means, you can assume both sets of scores come from the same distribution, and thus have the same standard deviation.
  – But when you're comparing INDEPENDENT (between-subjects, unrelated) means, you've basically got to average the variability of each of the two distributions.

Three points
• df
  – Four people, take your choice of candy.
  – One df is used up calculating the mean.
• One or two tails
  – You must be VERY careful when choosing to do a one-tailed test.
• Comparing the z and t tables
  – Check out the .05 t-table values for infinity df (1.96 for a two-tailed test, 1.645 for a one-tailed test).
  – Now find the commensurate values in the z table.
  – (A quick check of this convergence appears in the sketch below.)
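To see the last point for yourself, here is a small Python check (mine, not from the slides; it assumes SciPy is available) that prints two-tailed .05 critical values of t for increasing df alongside the z value of 1.96.

```python
# Small check (not from the lecture): two-tailed .05 critical values from the
# t distribution approach the z value (1.96) as degrees of freedom increase.
from scipy.stats import norm, t

z_crit = norm.ppf(0.975)                  # two-tailed alpha = .05 -> about 1.96
print(f"z critical value: {z_crit:.3f}")

for df in (5, 10, 30, 100, 1000):
    t_crit = t.ppf(0.975, df)             # upper 2.5% cutoff with df degrees of freedom
    print(f"df = {df:>4}: t critical value = {t_crit:.3f}")

# One-tailed .05 cutoffs converge to 1.645 the same way: t.ppf(0.95, df).
```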
Significance Level
• Remember, there are two ways to test statistical significance – hypothesis tests and confidence intervals.
• With confidence intervals: if two groups yield data and their confidence intervals don't overlap, then we conclude there is a significant difference.
• In hypothesis testing: if the probability of finding our differences by chance is smaller than our chosen alpha, then we say we have a significant difference.
• We select alpha (α) by tradition.
• Statistical significance isn't the same thing as practical significance.

Type I and Type II Errors

  Our decision                         World: null hypothesis is false    World: null hypothesis is true
  Reject the null hypothesis           Correct decision (power, 1 – β)    Type I error (α)
  Fail to reject the null hypothesis   Type II error (β)                  Correct decision

Power of the Test
• The power of a statistical test refers to its ability to find a difference between distributions when there really is one.
• Things that influence the power of a test:
  – Size of the effect.
  – Sample size.
  – Variability.
  – Alpha level.
• (A small simulation illustrating these influences follows below.)
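One way to see how these four factors work together is to estimate power by simulation. The sketch below (my own illustration; the population means, σ, and sample sizes are invented, and it assumes SciPy is available) repeatedly draws two independent samples, runs an independent-groups t test, and reports the proportion of runs in which the null hypothesis is rejected.

```python
# Minimal power-by-simulation sketch (illustrative values, not from the lecture).
# Estimated power = proportion of simulated experiments in which H0 is rejected.
import random
from scipy.stats import ttest_ind

def estimated_power(mu1, mu2, sigma, n, alpha=0.05, reps=5000):
    rejections = 0
    for _ in range(reps):
        g1 = [random.gauss(mu1, sigma) for _ in range(n)]
        g2 = [random.gauss(mu2, sigma) for _ in range(n)]
        result = ttest_ind(g1, g2)        # two-tailed independent-groups t test
        if result.pvalue < alpha:
            rejections += 1
    return rejections / reps

# A larger effect, a larger n, a smaller sigma, or a larger alpha -> higher power.
print(estimated_power(mu1=100, mu2=105, sigma=15, n=20))
print(estimated_power(mu1=100, mu2=105, sigma=15, n=80))
```

Re-running with different values for the effect (mu2 – mu1), n, sigma, or alpha shows each influence listed on the slide.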