INF 397C
Introduction to Research in Library and
Information Science
Fall, 2009
Day 3
R. G. Bias | School of Information | UTA 5.424 | Phone: 512 471 7046 | rbias@ischool.utexas.edu
Standard Deviation
σ = SQRT(Σ(X - µ)²/N)
New formula for σ
• σ = SQRT(Σ(X - µ)²/N)
– HARD to calculate when you have a LOT of
scores. Gotta do that subtraction with every
one!
• New, “computational” equation
– σ = SQRT((Σ(X²) – (ΣX)²/N)/N)
– We’ll convince ourselves it gives us the
same answer in just a minute.
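A minimal sketch of both equations, to check that they agree. The function names are made up for illustration, and the four scores are borrowed from the Test 1 example used elsewhere in these slides.

```python
import math


def sd_definitional(scores):
    """sigma = SQRT(sum((X - mu)^2) / N): one subtraction per score."""
    n = len(scores)
    mu = sum(scores) / n
    return math.sqrt(sum((x - mu) ** 2 for x in scores) / n)


def sd_computational(scores):
    """sigma = SQRT((sum(X^2) - (sum(X))^2 / N) / N): only running sums."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x2 = sum(x * x for x in scores)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / n)


scores = [76, 52, 58, 58]  # assumed sample data (Test 1 scores)
print(sd_definitional(scores), sd_computational(scores))
```

Both calls should print the same value for any list of scores; the computational version only needs ΣX and ΣX², which is why it is easier by hand.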
Remember . . .
Measures of Central Tendency    Measures of Dispersion
Mode                            Range
Median                          SIQR
Mean                            Standard Deviation
Which when?
Mode, Range
-“Most common score.”
-Easy to calculate.
-May be misleading.
Median, SIQR
-Capture the center.
-Not influenced by extreme scores.
Mean (µ), SD (σ)
-Take every score into account.
-Allow later manipulations.
In-class Practice Exercises
The Normal Distribution
(From Jaisingh [2000])
So far . . .
• . . . we’ve talked of summarizing ONE
distribution of scores.
– By ordering the scores.
– By organizing them in graphs/tables/charts.
– By calculating a measure of central
tendency and a measure of dispersion.
• What happens when we want to
compare TWO distributions of scores?
“Now, why would I want to do that?”
• Is your child taller or heavier?
• Is this month’s SAT test any easier or harder than last
month’s?
• Is my 91 in my Research Methods class better than
my 95 in my Digital Libraries class?
• Is the new library layout better than the old one?
• Can more employees sign up, more quickly, for
benefits with our new intranet site than with our old
one?
• Did my class perform better on the TAKS test than
they did on the TAAS test?
Well?
• COULD it be the case that your 91 in
your Research Methods class is better
than your 95 in your Digital Libraries
class?
• How?
What if . . .
• The mean in
Research Methods
was 50, and the
mean in Digital
Libraries was 99?
• (What, besides the
fact that everyone
else is trying to drop
the Research class!)
• So:
              You   Mean
Res. Meth.     91     50
Dig. Lib.      95     99
The Point
• As I said last week, you need to know
BOTH a measure of central tendency
AND a measure of spread to understand
a distribution.
• BUT STILL, this can be convoluted . . .
• “Well, daughter, how are you doing in
grad school this semester?”
“Well, Mom . . .
• “. . . I have a 91 in Research Methods
but the mean is 50 and the standard
deviation is 12. But I only have a 95 in
Digital Libraries, whereas the mean in
that class is 99 with a standard deviation
of 1.”
• Of course, your mom’s reaction will be,
“Just call home more often, dear.”
Wouldn’t it be nice . . .
• . . . if there could be one score we could
use for BOTH classes, for BOTH the
TAKS test and the TAAS test, for BOTH
your child’s height and weight?
• There is – and it’s called the “standard
score,” or “z score.” (Get ready for
another headache.)
Standard Score
• z = (X - µ)/σ
• “Hunh?”
• Each score can be expressed as the number
of standard deviations it is from the mean of its
own distribution.
• “Hunh?”
• (X - µ) – This is how far the score is from the
mean. (Note: Could be negative! No
squaring, this time.)
• Then divide by the SD to figure out how many
SDs you are from the mean.
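A minimal sketch of the z formula, applied to the two grades from the "Well, Mom" slide (91 with µ = 50, σ = 12, and 95 with µ = 99, σ = 1). The function name `z_score` is just an illustrative choice.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma


research_methods = z_score(91, 50, 12)  # positive: above the class mean
digital_libraries = z_score(95, 99, 1)  # negative: below the class mean

print(round(research_methods, 2), round(digital_libraries, 2))
```

On one common scale, the 91 sits well above its class mean while the 95 sits well below its own, which is the comparison the raw scores could not give us.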
Z scores (cont’d.)
• z = (X - µ)/σ
• Notice, if your score (X) equals the
mean, then z is, what?
• If your score equals the mean PLUS one
standard deviation, then z is, what?
• If your score equals the mean MINUS
one standard deviation, then z is, what?
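One way to check your answers to the three questions above is to plug in numbers. This sketch assumes, purely for illustration, a distribution with µ = 61 and σ = 9; any other mean and positive SD would give the same three z values.

```python
def z_score(x, mu, sigma):
    """z = (X - mu) / sigma."""
    return (x - mu) / sigma


mu, sigma = 61, 9  # assumed example values

z_at_mean = z_score(mu, mu, sigma)           # score equals the mean
z_plus_one = z_score(mu + sigma, mu, sigma)  # mean PLUS one SD
z_minus_one = z_score(mu - sigma, mu, sigma) # mean MINUS one SD

print(z_at_mean, z_plus_one, z_minus_one)
```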
An example
         Test 1   Test 2
Kris       76       76
Robin      52       86
Marty      58       80
Terry      58       90
ΣX        244      332
µ
Mode, median?
Let’s calculate σ – Test 1
          X     X-µ   (X-µ)²
Kris     76     15     225
Robin    52     -9      81
Marty    58     -3       9
Terry    58     -3       9
Σ       244      0     324
/N       61             81
σ                        9
Let’s calculate σ – Test 2
          X     X-µ   (X-µ)²
Kris     76     -7      49
Robin    86      3       9
Marty    80     -3       9
Terry    90      7      49
Σ       332      0     116
/N       83             29
σ                      5.4
So . . . z = (X - µ)/σ
• Kris had a 76 on both tests.
• Test 1 - µ = 61, σ = 9
– So her z score was (76-61)/9 or 15/9 or 1.67. So we say that
Kris’s score was 1.67 standard deviations above the mean.
• Test 2 - µ = 83, σ = 5.4
– So her z score was (76-83)/5.4 or -7/5.4 or –1.3. So we say
that Kris’s score was 1.3 standard deviations BELOW the
mean.
• Given what I said earlier about two-thirds of the scores
being within one standard deviation of the mean . . . .
• Wouldn’t it be nice if we knew exactly how many . . . ?
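The Kris calculation can be sketched in a few lines. This assumes Python's standard `statistics` module, whose `pstdev` is the population standard deviation (divide by N), matching the σ formula in these slides. The helper name `z_for` is made up for illustration.

```python
from statistics import mean, pstdev

test1 = {"Kris": 76, "Robin": 52, "Marty": 58, "Terry": 58}
test2 = {"Kris": 76, "Robin": 86, "Marty": 80, "Terry": 90}


def z_for(name, scores):
    """z score of one person within a dict mapping name -> raw score."""
    values = list(scores.values())
    return (scores[name] - mean(values)) / pstdev(values)


kris_z1 = z_for("Kris", test1)  # same raw score of 76 ...
kris_z2 = z_for("Kris", test2)  # ... very different standing

print(round(kris_z1, 2), round(kris_z2, 2))
```

Because the code does not round σ to 5.4 before dividing, the Test 2 value may differ from the hand calculation in the later decimal places.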
z = (X - µ)/σ
• If I tell you that the average IQ score is
100, and that the SD of IQ scores is 16,
and that Bob’s IQ score is 2 SD above
the mean, what’s Bob’s IQ?
• If I tell you that your 75 was 1.5 standard
deviations below the mean of a test that
had a mean score of 90, what was the
SD of that test?
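Both questions are the same formula solved for a different unknown: X = µ + zσ, and σ = (X - µ)/z. A minimal sketch for checking your work, with illustrative function names:

```python
def raw_score(z, mu, sigma):
    """Solve z = (X - mu) / sigma for X."""
    return mu + z * sigma


def sd_from_z(x, mu, z):
    """Solve z = (X - mu) / sigma for sigma (z must not be zero)."""
    return (x - mu) / z


bob_iq = raw_score(2, 100, 16)     # 2 SDs above a mean of 100
test_sd = sd_from_z(75, 90, -1.5)  # 1.5 SDs BELOW the mean, so z is negative

print(bob_iq, test_sd)
```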
Notice . . .
• The mean of all z scores (for a particular
distribution) will be zero, as will be their
sum.
• With z scores, we transform raw scores
into standard scores.
• These standard scores are RELATIVE
distances from their (respective) means.
• All are expressed in units of σ.
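A quick numerical check of the first point, using the four Test 1 scores as assumed sample data. Floating-point arithmetic can leave a tiny non-zero remainder, so compare against zero with a tolerance rather than expecting an exact 0.

```python
from statistics import mean, pstdev

test1 = [76, 52, 58, 58]  # assumed sample data (Test 1 scores)
mu, sigma = mean(test1), pstdev(test1)

# Transform every raw score into a standard score.
z_scores = [(x - mu) / sigma for x in test1]

z_sum = sum(z_scores)
z_mean = mean(z_scores)
print(z_scores, z_sum, z_mean)
```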
z scores – table values
• z = (X - µ)/σ
• It is often the case that we want to know
“What percentage of the scores are
above (or below) a certain other score?”
• Asked another way, “What is the area
under the curve, beyond a certain point?”
• THIS is why we calculate a z score, and
the way we do it is with the z table, on p.
362 of Hinton.
Z distribution
z table practice
1. What percentage of scores fall above a z score of 1.0?
2. What percentage of scores fall between the mean and one
standard deviation above the mean?
3. What percentage of scores fall within two standard deviations
of the mean?
4. 200 people took a test. My z score is .1. How many scores did
I “beat”?
5. My z score is .01. How many scores did I “beat”?
6. My score was higher than only 3% of the class. (I suck.) What
was my z score?
7. Oooh, get this. My score was higher than only 3% of the class.
The mean was 50 and the standard deviation was 10. What
was my raw score?
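If you want to check your table lookups afterwards, here is a sketch that stands in for the printed z table. It assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`, and it assumes the scores are normally distributed. Question 5 is assumed to refer to the same 200 test takers as question 4. Printed tables are rounded, so expect small differences.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

above_1 = 1 - z.cdf(1.0)               # Q1: area above z = 1.0
mean_to_1 = z.cdf(1.0) - z.cdf(0.0)    # Q2: area between the mean and +1 SD
within_2 = z.cdf(2.0) - z.cdf(-2.0)    # Q3: area within 2 SDs of the mean
beat_z_01 = 200 * z.cdf(0.1)           # Q4: expected scores below z = .1
beat_z_001 = 200 * z.cdf(0.01)         # Q5: expected scores below z = .01
z_3pct = z.inv_cdf(0.03)               # Q6: z with 3% of the area below it
raw_3pct = 50 + 10 * z_3pct            # Q7: X = mu + z * sigma

for value in (above_1, mean_to_1, within_2, beat_z_01,
              beat_z_001, z_3pct, raw_3pct):
    print(round(value, 4))
```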
Probability
• Remember all those decisions we talked
about, last week.
• VERY little of life is certain.
• It is PROBABILISTIC. (That is,
something might happen, or it might not.)
Prob. (cont’d.)
• Life’s a gamble!
• Just about every decision is based on
probable outcomes.
• None of you raised your hands in Week 1
when I asked for “statistical wizards.” Yet
every one of you does a pretty good job of
navigating an uncertain world.
– None of you touched a hot stove (on purpose).
– All of you made it to class.
Probabilities
• Always between one and zero.
• Something with a probability of “one” will
happen. (e.g., Death, Taxes).
• Something with a probability of “zero” will not
happen. (e.g., My becoming a Major League
Baseball player).
• Something that’s unlikely has a small, but still
positive, probability. (e.g., probability of
someone else having the same birthday as
you is 1/365 = .0027, or .27%.)
Just because . . .
• . . . There are two possible outcomes,
doesn’t mean there’s a “50/50 chance” of
each happening.
• When driving to school today, I could
have arrived alive, or been killed in a
fiery car crash. (Two possible outcomes,
as I’ve defined them.) Not equally likely.
• But the odds of a flipped coin being
“heads,” . . . .
Let’s talk about socks
Prob (cont’d.)
• Probability of something happening is
– # of “successes” / # of all events
– P(one flip of a coin landing heads) = ½ = .5
– P(one die landing as a “2”) = 1/6 = .167
– P(some score in a distribution of scores is greater
than the median) = ½ = .5
– P(some score in a normal distribution of scores is
greater than the mean but has a z score of 1 or less)
is . . . ?
– P(drawing a diamond from a complete deck of
cards) = ?
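The "successes over all events" idea can be sketched with exact fractions. The helper name `probability` is hypothetical. The deck is assumed to be a standard 52-card deck with 13 diamonds, and the normal-curve item assumes Python 3.8+ for `statistics.NormalDist`.

```python
from fractions import Fraction
from statistics import NormalDist


def probability(successes, events):
    """# of successes / # of all (equally likely) events."""
    return Fraction(successes, events)


p_heads = probability(1, 2)      # one coin flip landing heads
p_die_two = probability(1, 6)    # one die landing as a "2"
p_diamond = probability(13, 52)  # assumed: 13 diamonds in 52 cards

# Area under the normal curve between the mean (z = 0) and z = 1.
std_normal = NormalDist()
p_mean_to_z1 = std_normal.cdf(1.0) - std_normal.cdf(0.0)

print(p_heads, p_die_two, p_diamond, round(p_mean_to_z1, 4))
```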
Probabilities – and & or
• From Runyon:
– Addition Rule: The probability of selecting a
sample that contains one or more elements is the
sum of the individual probabilities for each element
less the joint probability. When A and B are
mutually exclusive,
• p(A and B) = 0.
• p(A or B) = p(A) + p(B) – p(A and B)
– Multiplication Rule: The probability of obtaining a
specific sequence of independent events is the
product of the probability of each event.
• p(A and B and . . .) = p(A) x p(B) x . . .
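A sketch that checks the addition rule by brute-force counting over a deck of cards, and applies the multiplication rule to two independent coin flips. The events chosen (A = king, B = heart) and all names here are illustrative assumptions, not from Runyon.

```python
from fractions import Fraction
from itertools import product

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs


def p(event):
    """Fraction of cards in the deck for which event(card) is true."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))


def is_king(card):
    return card[0] == "K"


def is_heart(card):
    return card[1] == "hearts"


# Addition rule: p(A or B) = p(A) + p(B) - p(A and B).
p_king_and_heart = p(lambda c: is_king(c) and is_heart(c))
p_king_or_heart = p(is_king) + p(is_heart) - p_king_and_heart

# Direct count of the same event, for comparison.
p_king_or_heart_counted = p(lambda c: is_king(c) or is_heart(c))

# Multiplication rule for independent events: two heads in two flips.
p_two_heads = Fraction(1, 2) * Fraction(1, 2)

print(p_king_or_heart, p_king_or_heart_counted, p_two_heads)
```

King and heart are not mutually exclusive (the king of hearts is both), which is why the joint probability has to be subtracted.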
More prob.
• From Slavin:
– Addition Rule: If X and Y are mutually
exclusive events, the probability of obtaining
either of them is equal to the probability of X
plus the probability of Y.
– Multiplication Rule: The probability of the
simultaneous or successive occurrence of
two events is the product of the separate
probabilities of each event.
Yet more prob.
• http://www.midcoast.com.au/~turfacts/maths.html
– The product or multiplication rule. "If two chances
are mutually exclusive the chances of getting
both together, or one immediately after the
other, is the product of their respective
probabilities."
– The addition rule. "If two or more chances are
mutually exclusive, the probability of making ONE
OR OTHER of them is the sum of their separate
probabilities."
What’s the probability . . .
• That the next card is a king?
• That the next card is a heart?
• That the next card is a spade?
• That the next card is a club and a king?
• That the next card is a spade OR a
heart?
• That the next two cards are kings?
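A sketch for checking your answers, under stated assumptions: a full, well-shuffled standard deck of 52 cards (4 suits of 13, one king per suit), and cards drawn without replacement. All variable names are illustrative.

```python
from fractions import Fraction

DECK = 52
PER_SUIT = 13
KINGS = 4

p_king = Fraction(KINGS, DECK)
p_heart = Fraction(PER_SUIT, DECK)
p_spade = Fraction(PER_SUIT, DECK)
p_club_and_king = Fraction(1, DECK)  # exactly one king of clubs

# Spade and heart are mutually exclusive, so the joint probability is 0.
p_spade_or_heart = p_spade + p_heart

# Second draw comes from the 51 remaining cards, 3 of them kings.
p_two_kings = Fraction(KINGS, DECK) * Fraction(KINGS - 1, DECK - 1)

print(p_king, p_heart, p_spade, p_club_and_king,
      p_spade_or_heart, p_two_kings)
```

Note that the two-kings case is not two independent events: the first draw changes the deck, so the second factor is the probability of a king given that one king is already gone.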
Think this through.
• What are the odds (“what are the
chances”) (“what is the probability”) of
getting two “heads” in a row?
• Three heads in a row?
• Three flips the same (heads or tails) in a
row?
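One way to think it through is to list every possible sequence of flips and count. This sketch assumes a fair coin, so all sequences are equally likely; the helper name `p_of` is made up for illustration.

```python
from fractions import Fraction
from itertools import product


def p_of(event, flips):
    """Share of all equally likely H/T sequences that satisfy event."""
    outcomes = list(product("HT", repeat=flips))
    hits = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(hits, len(outcomes))


def all_heads(outcome):
    return all(flip == "H" for flip in outcome)


def all_same(outcome):
    return len(set(outcome)) == 1  # HHH or TTT


p_two_heads = p_of(all_heads, 2)
p_three_heads = p_of(all_heads, 3)
p_three_same = p_of(all_same, 3)

print(p_two_heads, p_three_heads, p_three_same)
```

"Three the same" is twice as likely as "three heads" because there are two ways to succeed out of the same eight sequences.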
So then . . .
• WHY were the odds in favor of having
two people in our class with the same
birthday?
• Think about the problem!
• What if there were 367 people in the
class?
– P(2 people with same b’day) = 1.00
Happy B’day to Us
• But we had 50.
• Probability that the first person has a
birthday: 1.00.
• Prob of the second person having the
same b’day: 1/365
• Prob of the third person having the same
b’day as Person 1 or Person 2 is 1/365
+ 1/365 – the chances of all three of
them having the same birthday.
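Adding up the matches person by person gets messy fast. A common shortcut, sketched here, is to compute the probability that NO two people share a birthday and subtract it from 1. This assumes 365 equally likely birthdays (no leap years) and independent birthdays; the function name is illustrative.

```python
def p_shared_birthday(n, days=365):
    """P(at least two of n people share a birthday), assuming
    `days` equally likely, independent birthdays."""
    if n > days:
        return 1.0  # more people than days: a match is certain
    p_all_distinct = 1.0
    for k in range(n):
        # Person k+1 must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1 - p_all_distinct


print(round(p_shared_birthday(50), 4))   # our class of 50
print(round(p_shared_birthday(23), 4))   # the classic break-even size
print(p_shared_birthday(367))
```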
Sooooo . . .
• http://en.wikipedia.org/wiki/Birthday_paradox
Homework
Keep reading.
Practice problems.
Date for midterm.
Next week – Dr. Mary Lynn Rice Lively on
qualitative research methods.