INF397C Introduction to Research in Information Studies

INF397C
Introduction to Research in Information Studies
Fall, 2009
Day 10
R. G. Bias | School of Information | UTA 5.424 | Phone: 512 471 7046 | rbias@ischool.utexas.edu
Where we’ve been:
• Descriptive statistics
– Frequency distributions
– Graphs
– Types of scales
– Probability
– Measures of central tendency and spread
– z scores
• Experimental design
– The scientific method
– Operational definitions
– IV, DV, controls, counterbalancing, confounds
– Validity, reliability
– Within- and between-subject designs
• Qualitative research
– Gracy, Rice Lively
Context (cont’d.)
• Where we’re going:
– More descriptive statistics
• Correlation (demo)
– Inferential statistics
• Confidence intervals
• Hypothesis testing, Type I and II errors, significance level
• t-tests
• ANOVA (VERY high level)
• Chi square (demo)
– Which method when?
– Cumulative final
First, correcting a lie
                     Parameters (for populations)      Statistics (for samples)
Mean                 µ = ΣX/N                          M (or “X bar”) = ΣX/N
Standard deviation   σ = SQRT((ΣX² – (ΣX)²/N) / N)     s = SQRT((ΣX² – (ΣX)²/N) / (N-1))
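The two standard deviation formulas can be sketched in a few lines of Python. This is only an illustration: the function names and the scores are made up for the demo, not taken from the course materials. The only difference between σ and s is the divisor (N versus N-1).

```python
import math

def sum_of_squares(scores):
    """Computational form: sum(X^2) - (sum(X))^2 / N."""
    n = len(scores)
    return sum(x * x for x in scores) - sum(scores) ** 2 / n

def population_sd(scores):
    """sigma: divide the sum of squares by N."""
    return math.sqrt(sum_of_squares(scores) / len(scores))

def sample_sd(scores):
    """s: divide the sum of squares by N - 1 (needs at least 2 scores)."""
    return math.sqrt(sum_of_squares(scores) / (len(scores) - 1))

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, for illustration only
print(population_sd(scores))  # 2.0
print(sample_sd(scores))      # slightly larger, because the divisor is N - 1
```

Dividing by N-1 always gives a slightly larger value than dividing by N, and the gap shrinks as N grows.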
Degrees of Freedom
• Demo
Standard Error of the Mean
• So far, we’ve computed a sample mean
(M, X bar), and used it to estimate the
population mean (µ).
• One thing we’ve gotten convinced of (I
hope) is . . . larger sample sizes are
better.
– Think about it – what if I asked ONE of you,
what School are you a student in? Versus
asking 10 of you?
Standard Error (cont’d.)
• Well, instead of picking ONE sample, and using that
mean to estimate the population mean, what if we
sampled a BUNCH of samples?
• If we sampled ALL possible samples, the mean of the
means (“µ_M”) would equal the population mean (µ).
• Here are some other things we know:
– As we get more samples, the mean of the sample means gets
closer to the population mean.
– Distribution of sample means tends to be normal.
– We can use the z table to find the probability of a mean of a
certain value.
– And most importantly . . .
Standard Error (cont’d.)
• We can easily work out the standard deviation of the
distribution of sample means:
SE = s_M = s/SQRT(N)
• So, the standard error of the mean is the standard
distance that a sample mean is from the population
mean.
• Thus, the SE tells us how good an estimate our
sample mean is of the population mean.
• Note, as N gets larger, the SE gets smaller, and the
better the sample mean estimates the population
mean.
• Hold on – we’ll use SE later.
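The claims about the distribution of sample means can be checked with a small simulation. The sketch below is an illustration under stated assumptions: the population (10,000 roughly normal scores with mean 50 and SD 10), the sample size, and the number of samples are all hypothetical choices, and `standard_error` is a helper name invented here.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 roughly normal scores (mean 50, SD 10).
population = [random.gauss(50, 10) for _ in range(10_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

def standard_error(sample):
    """SE = s / SQRT(N), with s computed using N - 1."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

n = 25
# Draw many samples of size N and record each sample mean.
sample_means = [
    statistics.fmean(random.sample(population, n)) for _ in range(2_000)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(f"population mean      = {mu:.2f}")
print(f"mean of sample means = {mean_of_means:.2f}")
print(f"SD of sample means   = {sd_of_means:.2f}")
print(f"sigma / SQRT(N)      = {sigma / math.sqrt(n):.2f}")
print(f"SE from one sample   = {standard_error(random.sample(population, n)):.2f}")
```

The mean of the sample means should land very close to the population mean, and the SD of the sample means should be close to sigma / SQRT(N). The last line shows that a single sample's s/SQRT(N) gives a rough estimate of that same quantity.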
A research question
1. Does an iSchool IT-provided online
tutorial lead to better learning than a
face-to-face class?
Two methods of making statistical inferences
• Null hypothesis testing
– Assume IV has no effect on DV; differences we
obtain are just by chance (error variance)
– If the difference is unlikely enough to happen by
chance (and “enough” tends to be p < .05), then we
say there’s a true difference.
• Confidence intervals
– We compute a confidence interval for the “true”
population mean, from sample data. (95% level,
usually.)
– If two groups’ confidence intervals don’t overlap, we
say (we INFER) there’s a true difference.
Remember . . .
• Earlier I said that there are two ways for
us to be confident that something is true:
– Statistical inference
– Replicability
• Now I’m saying there are two avenues of
statistical inference:
– Hypothesis testing
– Confidence intervals
Confidence Intervals
• We calculate a confidence interval for a population parameter.
• The mean of a random sample from a population is a point
estimate of the population mean.
• But there’s variability! (SE tells us how much.)
• What is the range of scores between which we’re 95% confident
that the population mean falls?
• Think about it – the larger the interval we select, the larger the
likelihood it will “capture” the true (population) mean.
• CI = M +/- (t.05)(SE)
• See Box 12.2 on “margin of error.” NOTE: In the box they arrive
at a 95% confidence that the poll has a margin of error of 5%. It
is just coincidence that these two numbers add up to 100%.
CI about a mean -- example
• CI = M +/- (t.05)(SE)
• Establish the level of α (two-tailed) for the CI. (.05)
• M=15.0 s=5.0 N=25
• Use Table A.2 to find the critical value associated with the df.
– t.05(24) = 2.064
• CI = 15.0 +/- 2.064(5.0/SQRT(25))
= 15.0 +/- 2.064
= 12.936 to 17.064
“The odds are 95 out of 100 that the population mean falls between
12.936 and 17.064.”
(NOTE: This is NOT the same as “95% of the scores fall within this
range”!!!)
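The arithmetic of this example can be reproduced with a short Python sketch. The function name `confidence_interval` is an assumption made for illustration. The critical value 2.064 is typed in from the t table (df = 24, α = .05, two-tailed) rather than computed.

```python
import math

def confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for CI = M +/- t * (s / SQRT(N))."""
    se = sd / math.sqrt(n)   # standard error of the mean
    margin = t_crit * se     # half-width of the interval
    return mean - margin, mean + margin

# Values from the example: M = 15.0, s = 5.0, N = 25, t.05(24) = 2.064.
lower, upper = confidence_interval(mean=15.0, sd=5.0, n=25, t_crit=2.064)
print(f"{lower:.3f} to {upper:.3f}")  # 12.936 to 17.064
```

Rerunning this with a larger N (and the matching table value for the new df) shows the interval narrowing, because the SE shrinks as N grows.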
Type I and Type II Errors
                                       World
Our decision                           Null hypothesis is false    Null hypothesis is true
Reject the null hypothesis             Correct decision (1-β)      Type I error (α)
Fail to reject the null hypothesis     Type II error (β)           Correct decision
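To make α concrete, here is a small simulation sketch in which the null hypothesis is true by construction, so every rejection is a Type I error. It rests on several assumptions: the population values (mean 100, SD 15), the sample size, and the number of trials are hypothetical, and it uses a z test with σ known and the critical value 1.96 instead of a t table, to keep the example self-contained.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

MU, SIGMA, N = 100.0, 15.0, 25   # hypothetical population and sample size
Z_CRIT = 1.96                    # two-tailed critical z for alpha = .05
TRIALS = 10_000

se = SIGMA / math.sqrt(N)        # standard error, with sigma known
type_i_errors = 0
for _ in range(TRIALS):
    # The null is true: each sample really comes from a population with mean MU.
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    z = (statistics.fmean(sample) - MU) / se
    if abs(z) > Z_CRIT:          # rejecting a true null is a Type I error
        type_i_errors += 1

rate = type_i_errors / TRIALS
print(f"Type I error rate: {rate:.3f}")  # should land near .05
```

With α set at .05, roughly 5 in 100 samples lead us to reject a null hypothesis that is actually true. That is the cell in the table labeled Type I error.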