INF 397C Introduction to Research in Information Studies

INF 397C
Introduction to Research in Information
Studies
Fall, 2009
Randolph G. Bias, Ph.D., CHFP
rbias@ischool.utexas.edu
cell: 512-657-3924
R. G. Bias | School of Information | UTA 5.424 | Phone: 512 471 7046 | rbias@ischool.utexas.edu
• There are three kinds of lies: lies,
damned lies, and statistics.
– Benjamin Disraeli (1804 – 1881), British
politician
• Statistics are like a bikini. What they
reveal is suggestive, but what they
conceal is vital.
– Aaron Levenstein, U.S. business professor
• The statistics on sanity are that one out
of every four Americans is suffering from
some form of mental illness. Think of
your three best friends. If they're okay,
then it's you.
– Rita Mae Brown, U.S. author
First . . .
• There are two components of this and
any class: Instruction and Evaluation.
• Let’s get the evaluation out of the way,
early.
• Need one volunteer.
“Research shows . . .”
• Finger length is a good (and quick!)
indicator of intelligence.
• One volunteer – measure your finger
length in cm.
Hmmmm . . .
• Everyone in the class will get a grade of
“C”
• But still, we can continue with the
“instruction” part of the course.
Oh, so maybe . . .
• Just THIS person isn’t too smart.
• OK, everyone measure your right index
finger.
• From the top (last) crease in your hand,
to the tip of the finger.
• Write down the length, in cm.
How to measure
Bettin’ Time!
• I’ll give everyone one penny.
• You HAVE to wager it:
– Turn it heads UP on your desk if you think
that two people in this classroom share a
birthday (same month and day – NOT
necessarily year), and heads DOWN if you
think that no two people share a birthday.
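Before placing the bet, it can help to see how the odds change with class size. The following is a minimal sketch, assuming 365 equally likely and independent birthdays (no leap day); the function name and the class sizes are illustrative and not from the slides.

```python
def prob_shared_birthday(n_people: int, days: int = 365) -> float:
    """Probability that at least two of n_people share a birthday.

    Assumes every day is equally likely and birthdays are independent.
    """
    if n_people > days:
        return 1.0  # more people than days guarantees a match
    p_all_different = 1.0
    for k in range(n_people):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_different *= (days - k) / days
    return 1.0 - p_all_different


# Hypothetical class sizes, chosen only for illustration.
for size in (10, 23, 40):
    print(size, round(prob_shared_birthday(size), 3))
```

Under these assumptions the probability passes one half at 23 people, which is far fewer than most people guess.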
Questionnaire
• Please write down your answer to this
question:
“I approach this class with confidence and
enthusiasm.”
1 – Strongly disagree
2 – Disagree
3 – Neutral
4 – Agree
5 – Strongly agree
Please go to the computer . . .
• . . . and complete your row:
– Your first name and last initial.
– Your birth month and day (I don’t care about the
year).
– The length of your finger, in cm.
– The number of Major League Baseball games
you’ve seen, in person.
– Your answer (the number) to the questionnaire
question.
– One favorite hobby.
Birthdays – Who won and got to
keep the penny?
Now, an experiment
• I will hand you each a slip of paper. Please
read it and do NOT let anyone else read it.
– Women receive a pink slip of paper.
– Men receive a green slip of paper.
• After everyone has read his/her slip of paper
and refolded it, I’ll show some letters of the
alphabet, one at a time, for one second each.
• After the last one, I’ll say “Go,” and ask you to
write down the letters, in order.
• Any questions?
OK, pencils down.
J F M A M J J A S O N D
Write down the letters.
• In order!
Answers
J F M A M J J A S O N D
Exp. 1 -- Data
         All correct    Not all correct    Total
Men
Women
Total
Who among you . . .
• . . . is a statistical wizard?
• . . . has experience conducting
research?
What are some things that you
know?
Many ways to learn new things
• Method of Authority
– trusted authority tells you something
• Method of Reason
– follow basic logical laws from philosophy
• Modeling
• Trial-and-error
• Intuition
• Scientific Method
– belief on the basis of experience
After this class . . .
• You’ll know something about how
scientists (information scientists) gather
new information.
• AND you’ll be good at evaluating
information others offer you.
Three Paths to “Belief”
1 – Naïve acceptance.
2 – Cynicism.
3 – Critical skepticism.
Critical Skepticism!
• Rabbit pie story.
What you’ll learn:
• Reliability. (“Oh, just measure it
however.”)
• Validity. (Finger length a good indicator
of intelligence?)
• Sampling – picking a representative
sample and then generalizing to a larger
population
• Why larger samples are better
What you’ll learn (cont’d.):
• How to represent a group of numbers,
meaningfully.
– Frequency distributions
– Measures of central tendency
– Measures of dispersion (spread)
– Graphs/Tables
• Operationalizing variables (“intelligence”)
• Probability
• Correlation
What you’ll learn (cont’d.):
• Different measurement scales
• What makes a good research question?
• Experimental design
– Independent and dependent variables
– Controls, counterbalancing, and confounds
(like in the memory experiment)
– Hypothesis testing
– Inferential statistics (is THAT number really
bigger than THIS number?)
Objectives
To arm you with a scientist’s skepticism, and a
scientist’s tools to conduct research and evaluate
others’ research. The student who successfully
completes this course will understand:
1 – descriptive statistics, and how to represent a
collection of numbers
2 – how to design a good experiment (and evaluate
if someone else has)
3 – inferential statistics and hypothesis testing
4 – other techniques human beings use to gain new
information, such as qualitative methods.
Professional History
• B.S. in psych from FSU
• Ph.D. in cognitive psych from UT-Austin
• Bell Labs for 3 years
• IBM-Austin for 11 years
• BMC Software for 5 years
• Co-founded small usability consulting co.
• Previously adjunct faculty member at UT;
have taught at UT, Rutgers, Huston-Tillotson, SWTSU
• Assoc. prof. in the UT School of
Information since 2003.
Review syllabus
What’s up with “Psychology”?
Break
• Give me the penny back, if you lost the
birthday bet.
How to talk about a set of
numbers
• We can list ‘em.
– Can get WAY unwieldy.
– Plus hard to make any sense out of them.
• First step – put ‘em in order.
• Second step –
– Graph ‘em, and/or
– Calculate percentiles/deciles
Frequency Distributions Histograms
• # of pets ever owned
– 13, 2, 1, 4, 0, 1, 3, 0, 5, 1
• Put ‘em in order
– 0, 0, 1, 1, 1, 2, 3, 4, 5, 13
Freq Dist
• Raw Scores (in order)
– 0, 0, 1, 1, 1, 2, 3, 4, 5, 13

Raw Score    Freq    Cumu Freq
0            2       2
1            3       5
2            1       6
3            1       7
4            1       8
5            1       9
13           1       10
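The frequency and cumulative frequency columns can be built mechanically from the raw scores. A minimal sketch in Python, using the "# of pets" scores from the slides; the variable names are illustrative.

```python
from collections import Counter

pets = [13, 2, 1, 4, 0, 1, 3, 0, 5, 1]  # raw scores from the slide

freq = Counter(pets)   # how many times each raw score occurs
rows = []              # (raw score, frequency, cumulative frequency)
running_total = 0
for score in sorted(freq):
    running_total += freq[score]
    rows.append((score, freq[score], running_total))

print("Raw Score  Freq  Cumu Freq")
for score, f, cum in rows:
    print(f"{score:>9}  {f:>4}  {cum:>9}")
```

The last cumulative frequency always equals the number of scores, which is a quick check that nothing was dropped.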
Histogram
[Figure: histogram of # of pets; x-axis values 0, 1, 2, 3, 4, 5, 13; y-axis (frequency) from 0 to 3]
Percentiles/Deciles
• The cumulative percentage for any given
score is the “percentile” for that score.
• The decile is one-tenth of the percentile
(usually rounded to the nearest whole
number).
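As a rough illustration of the two definitions above, the sketch below computes the cumulative percentage for a score and divides it by ten. It uses the "# of pets" scores from the earlier slides; the function name is hypothetical.

```python
pets = [13, 2, 1, 4, 0, 1, 3, 0, 5, 1]  # "# of pets" scores from the slides

def percentile_and_decile(scores, score):
    """Cumulative percentage of `score`, and that percentage divided by 10."""
    at_or_below = sum(1 for s in scores if s <= score)
    percentile = 100 * at_or_below / len(scores)
    # Rounded to the nearest whole number. Note that Python's round()
    # sends exact halves (e.g. 2.5) to the nearest even number.
    decile = round(percentile / 10)
    return percentile, decile

for s in sorted(set(pets)):
    print(s, percentile_and_decile(pets, s))
```

For example, five of the ten scores are at or below 1, so a score of 1 sits at the 50th percentile and the 5th decile.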
Percentiles
• LOCATION of 25th percentile:
– X.25 = (N+1)(.25)
• LOCATION of 50th percentile:
– X.50 = (N+1)(.50)
• LOCATION of 75th percentile:
– X.75 = (N+1)(.75)
• Example: If we had 10 scores,
– the 25th percentile would be the (11)(.25) = 2.75th score, or part
way (half way!) between the 2nd and 3rd scores.
– The 50th percentile would be the (11)(.50) = 5.5th score, or half
way between the 5th and 6th scores.
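The location rule can be sketched in a few lines of Python. This reads the slide's "half way" as the convention for any fractional location, which is an assumption; other texts interpolate linearly instead. The function names are illustrative.

```python
import math

def percentile_location(n_scores: int, p: float) -> float:
    """Location of the p-th percentile (p as a fraction): (N + 1) * p."""
    return (n_scores + 1) * p

def score_at_percentile(scores, p: float) -> float:
    """Score at the p-th percentile, following the convention assumed above.

    A whole-number location picks an actual score; a fractional location
    is taken as half way between the two neighbouring scores.
    Assumes the location falls between 1 and N.
    """
    ordered = sorted(scores)
    loc = percentile_location(len(ordered), p)
    lower = math.floor(loc)
    if loc == lower:
        return float(ordered[lower - 1])  # locations are 1-based
    return (ordered[lower - 1] + ordered[lower]) / 2

pets = [0, 0, 1, 1, 1, 2, 3, 4, 5, 13]
print(percentile_location(len(pets), 0.25))  # the 2.75th score
print(score_at_percentile(pets, 0.25))       # half way between 0 and 1
print(score_at_percentile(pets, 0.50))       # half way between 1 and 2
```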
Example
• # of pets ever owned
– 13, 2, 1, 4, 0, 1, 3, 0, 5, 1
• Put ‘em in order
– 0, 0, 1, 1, 1, 2, 3, 4, 5, 13
Note . . .
• With an odd number of scores, the 50th percentile will
be an actual score:
• Raw Scores (in order)
– 0, 0, 1, 1, 1, 2, 3, 4, 5, 13, 100
• 50th percentile = (N+1).50 = (12).5 = 6th score = 2.
Scales
• The data we collect can be represented
on one of FOUR types of scales:
– Nominal
– Ordinal
– Interval
– Ratio
• “Scale” in the sense that an individual
score is placed at some point along a
continuum.
Nominal Scale
• Describe something by giving it a name.
(Name – Nominal. Get it?)
• Mutually exclusive categories.
• For example:
– Gender: 1 = Female, 2 = Male
– Marital status: 1 = single, 2 = married,
3 = divorced, 4 = widowed
– Make of car: 1 = Ford, 2 = Chevy . . .
• The numbers are just names.
Ordinal Scale
• An ordered set of objects.
• But no implication about the relative
SIZE of the steps.
• Example:
– The 50 states in order of population:
• 1 = California
• 2 = Texas
• 3 = New York
• . . . 50 = Wyoming
Interval Scale
• Ordered, like an ordinal scale.
• Plus there are equal intervals between each
pair of scores.
• With Interval data, we can calculate means
(averages).
• However, the zero point is arbitrary.
• Examples:
– Temperature in Fahrenheit or Centigrade.
– IQ scores
Ratio Scale
• Interval scale, plus an absolute zero.
• Examples:
– Distance, weight, height, time (but not calendar years;
e.g., the year 2002 isn’t “twice” the year 1001).
Scales (cont’d.)
It’s possible to measure the same attribute on
different scales. Say, for instance, your
midterm test. I could:
• Give you a “1” if you don’t finish, and a “2” if
you finish.
• “1” for highest grade in class, “2” for second
highest grade, . . . .
• “1” for first quarter of the class, “2” for second
quarter of the class, . . .
• Raw test score (100, 99, . . . .).
– (NOTE: A score of 100 doesn’t mean the person
“knows” twice as much as a person who scores 50,
he/she just gets twice the score.)
Scales (cont’d.)
            Nominal               Ordinal         Interval          Ratio
Properties  Name                  =               =                 =
            Mutually exclusive    =               =                 =
                                  Ordered         =                 =
                                                  Equal interval    =
                                                                    + abs. 0
Examples    Gender,               Class rank,     Days of wk.,      Inches,
            Yes/No                Survey ans.     Temp.             Dollars
Critical Skepticism
• Remember the Rabbit Pie example from
earlier?
• The “critical consumer” of statistics
asked, “What do you mean by ‘50/50’?”
• Let’s look at some other situations and
claims.
Company is hurting.
• We’d like to ask you to take a 50% cut in
pay.
• But if you do, we’ll give you a 60% raise
next month. OK?
• Problem: Base rate.
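The base-rate problem is easy to check with arithmetic: the 60% raise is applied to the already reduced pay, not the original. A minimal sketch, where the starting salary of 1000 is a hypothetical figure used only for illustration.

```python
def apply_percent_change(amount: float, percent: float) -> float:
    """Return amount after a percentage change (negative for a cut)."""
    return amount * (1 + percent / 100)

salary = 1000.0  # hypothetical starting pay, for illustration only
after_cut = apply_percent_change(salary, -50)      # 50% cut
after_raise = apply_percent_change(after_cut, 60)  # 60% raise on the NEW base

print(after_cut, after_raise)
print(f"Share of original pay: {after_raise / salary:.0%}")
```

Whatever the starting figure, the result is 0.5 × 1.6 = 0.8 of the original pay, so the "deal" is a 20% pay cut.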
Sale!
• “Save 100%”
• I doubt it.
Probabilities
• “It’s safer to drive in the fog than in the
sunshine.” (Kinda like “Most accidents occur
within 25 miles of home.” Doesn’t mean it gets
safer once you get to San Marcos.)
• Navy literature around WWI:
– “The death rate in the Navy during the Spanish-American War was 9/1000. For civilians in NYC
during the same period it was 16/1000. So . . . Join
the Navy. It’s safer.”
Are all results reported?
• “In an independent study [ooh, magic
words], people who used Doakes
toothpaste had 23% fewer cavities.”
• How many studies showed MORE
cavities for Doakes users?
Sampling problems
• “Average salary of 1999 UT grads –
$41,000.”
• How did they find this? I’ll bet it was
average salary of THOSE WHO
RESPONDED to a survey.
• Who’s inclined to respond?
Correlation ≠ Causation
• Around the turn of the 20th century, there
were relatively MANY deaths from
tuberculosis in Arizona.
• What’s up with that?
Remember . . .
• I do NOT want you to become cynical.
• Not all “media bias” is intentional.
• Just be sensible, critical, skeptical.
• As you “consume” statistics, ask some
questions . . .
Ask yourself. . .
• Who says so? (A Zest commercial is unlikely to tell
you that Irish Spring is best.)
• How does he/she know? (That Zest is “the best
soap for you.”)
• What’s missing? (One year, 33% of female grad
students at Johns Hopkins married faculty.)
• Did somebody change the subject? (“Camrys
are bigger than Accords.” “Accords are bigger than
Camrys.”)
• Does it make sense? (“Study in NYC: Working
woman with family needed $40.13/week for adequate
support.”)
Quote on front of Huff book:
• “It ain’t so much the things we don’t
know that get us in trouble. It’s the
things we know that ain’t so.”
– Artemus Ward, U.S. author
• Being a critical consumer of statistics will
keep you from knowing things that ain’t
so.
Claims
• “Better chance of being struck by
lightning than being bitten by a shark.”
• Tom Brokaw – Tranquilizers.
Earlier . . .
• We learned about frequency
distributions.
• I asserted that a frequency distribution,
and/or a histogram (a graphical
representation of a frequency
distribution), was a good way to
summarize a collection of data.
• There’s another, even shorter-hand way.
Measures of Central Tendency
• Mode
– Most frequent score (or scores – a
distribution can have multiple modes)
• Median
– “Middle score”
– 50th percentile
• Mean - µ (“mu”)
– “Arithmetic average”
– ΣX/N
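All three measures can be computed for the "# of pets" scores used earlier. A minimal sketch with Python's standard `statistics` module; the variable names are illustrative.

```python
import statistics

pets = [0, 0, 1, 1, 1, 2, 3, 4, 5, 13]  # "# of pets" scores from the slides

modes = statistics.multimode(pets)  # a list, since there can be several modes
median = statistics.median(pets)    # middle score (50th percentile)
mean = sum(pets) / len(pets)        # ΣX/N

print("mode(s):", modes)
print("median:", median)
print("mean:", mean)
```

Here the three measures disagree (mode 1, median 1.5, mean 3.0), largely because the single score of 13 pulls the mean upward.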
Let’s calculate some “averages”
• From old data.
A quiz about averages
1 – If one score in a distribution changes, will the mode
change?
__Yes __No __Maybe
2 – How about the median?
__Yes __No __Maybe
3 – How about the mean?
__Yes __No __Maybe
4 – True or false: In a normal distribution (bell curve), the
mode, median, and mean are all the same? __True
__False
More quiz
5 – (This one is tricky.) If the mode=mean=median, then the distribution is
necessarily a bell curve?
__True __False
6 – I have a distribution of 10 scores. There was an error, and really the
highest score is 5 points HIGHER than previously thought.
a) What does this do to the mode?
__ Increases it __Decreases it __Nothing __Can’t tell
b) What does this do to the median?
__ Increases it __Decreases it __Nothing __Can’t tell
c) What does this do to the mean?
__ Increases it __Decreases it __Nothing __Can’t tell
7 – Which of the following must be an actual score from the distribution?
a) Mean
b) Median
c) Mode
d) None of the above
OK, so which do we use?
• Means allow further arithmetic/statistical manipulation. But . . .
• It depends on:
– The type of scale of your data
• Can’t use means with nominal or ordinal scale data
• With nominal data, must use mode
– The distribution of your data
• Tend to use medians with distributions bounded at one
end but not the other (e.g., salary). (Look at our “Number
of MLB games” distribution.)
– The question you want to answer
• “Most popular score” vs. “middle score” vs. “middle of the
see-saw”
• “Statistics can tell us which measures are technically
correct. It cannot tell us which are ‘meaningful’” (Tal,
2001, p. 52).
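To see why medians are often preferred for distributions bounded at one end, compare the two measures before and after one extreme score is added. This sketch reuses the "# of pets" scores and the extra score of 100 from the earlier percentile slide; treating them as an example of such a distribution is an assumption for illustration.

```python
import statistics

scores = [0, 0, 1, 1, 1, 2, 3, 4, 5, 13]
with_outlier = scores + [100]  # the 11-score list from the percentile slide

for label, data in (("10 scores", scores),
                    ("plus one score of 100", with_outlier)):
    print(label,
          "| median:", statistics.median(data),
          "| mean:", round(statistics.mean(data), 2))
```

One added score moves the median only from 1.5 to 2, while the mean jumps from 3 to nearly 12, a value larger than ten of the eleven scores.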
Why . . .
• . . . isn’t a “measure of central tendency”
all we need to characterize a distribution
of scores/numbers/data/stuff?
• “The price for using measures of central
tendency is loss of information” (Tal,
2001, p. 49).
Note . . .
• We started with a bunch of specific scores.
• We put them in order.
• We drew their distribution.
• Now we can report their central tendency.
• So, we’ve moved AWAY from specifics, to a
summary. But with Central Tendency, alone,
we’ve ignored the specifics altogether.
– Note MANY distributions could have a particular
central tendency!
• If we went back to ALL the specifics, we’d be
back at square one.
Measures of Dispersion
• Range
• Semi-interquartile range
• Standard deviation
– σ (sigma)
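The three measures can be sketched for the "# of pets" scores as follows. Two assumptions are made here: the quartiles come from Python's `statistics.quantiles`, whose default method interpolates linearly and may differ from a textbook's rule, and σ is treated as the population standard deviation.

```python
import statistics

pets = [0, 0, 1, 1, 1, 2, 3, 4, 5, 13]  # "# of pets" scores from the slides

# Range: distance between the highest and lowest score.
value_range = max(pets) - min(pets)

# Semi-interquartile range: half the distance between the 75th and
# 25th percentiles (quartiles computed by the library's default method).
q1, _q2, q3 = statistics.quantiles(pets, n=4)
semi_interquartile_range = (q3 - q1) / 2

# Standard deviation, treating the ten scores as the whole population.
sigma = statistics.pstdev(pets)

print("range:", value_range)
print("semi-interquartile range:", semi_interquartile_range)
print("sigma:", round(sigma, 2))
```

The range depends only on the two most extreme scores, while the semi-interquartile range ignores them entirely; the standard deviation uses every score.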
Homework
• Bring in one claim that you hear in the next week. On the news,
in your reading, in an ad, wherever. (Just be prepared to share it
aloud.)
• Buy books. (Co-op says they’re in.)
• Read:
– Huff: All of it.
– Hinton: Ch. 1, 2, 3
– S, Z, & Z: Ch. 1, 2, 3
• Come with a guess as to what the most important sentence is in
S, Z, & Z, Ch. 2. (In my humble opinion.)
See you next week.