RELEVANCE? in information science

Tefko Saracevic, Ph.D.
tefkos@rutgers.edu
Two worlds in information science

- IR systems offer as answers their version of what may be relevant, by ever-improving algorithms.
- People go their own way and assess relevance by their problem-at-hand, context, and criteria.
- The two worlds interact.

Covered here: the human world of relevance.
NOT covered: how IR deals with relevance.
Relevance interaction

[Diagram: interaction between the human side (context, information need, ...) and the system side (algorithms, ...).]

URLs, references, and inspirations are in the Notes.
"Our work is to understand a person's real-time goal and match it with relevant information."
Definitions

Merriam-Webster Dictionary Online:
"1 a: relation to the matter at hand
   b: practical and especially social applicability : pertinence <giving relevance to college courses>
 2 : the ability (as of an information retrieval system) to retrieve material that satisfies the needs of the user."
Relevance – by any other name...

Many names, e.g. "pertinent; useful; applicable; significant; germane; material; bearing; proper; related; important; fitting; suited; apropos; ..." and nowadays even "truthful" ...

"A rose by any other name would smell as sweet"
  Shakespeare, Romeo and Juliet

Connotations may differ, but the concept is still relevance.
What is "matter at hand"?

Context in relation to which
- a problem is addressed
- an information need is expressed
- a question is asked
- an interaction is conducted

There is no such thing as considering relevance without a context.
Axiom: One cannot not have a context in information interaction.
context – information seeking – intent

From Latin: contextus, "a joining together"; contexere, "to weave together".

"Context – circumstance, setting: the set of facts or circumstances that surround a situation or event; 'the historic context'" (WordNet)

However, in information science, and in computer science as well:
"There is no term more often used, less often defined and, when defined, defined so variously, as context. Context has the potential to be virtually anything that is not defined as the phenomenon of interest." (Dervin, 1997)
context – information seeking – intent

- Process in which humans purposefully engage in order to change their state of knowledge (Marchionini, 1995)
- A conscious effort to acquire information in response to a need or gap in your knowledge (Case, 2007)
- ... fitting information in with what one already knows and extending this knowledge to create new perspectives (Kuhlthau, 2004)
Information seeking concentrations

Purposeful process [all cognitive] to:
- change state of knowledge
- respond to an information need or gap
- fit information in with what one already knows

In seeking information, people seek to change the state of their knowledge.

Critique: broader social, cultural, environmental ... factors are not included.
context – information seeking – intent

- Many information-seeking studies involved TASK as context and accomplishment of the task as intent
- Tasks distinguished as simple, difficult, complex ...
- But: there is more to a task than the task itself
  - time-line: stages of the task; changes over time
Two large questions

- Why did relevance become a central notion of information science?
- What did we learn about relevance through research in information science?
A bit of history
WHY RELEVANCE?
It all started with

Vannevar Bush (1890-1974): article "As We May Think", 1945.

- Defined the problem as "... the massive task of making more accessible of a bewildering store of knowledge."
  - a problem still with us, and growing
- Suggested a solution, a machine: "Memex ... association of ideas ... duplicate mental processes artificially."
  - a technological fix to the problem
Information Retrieval (IR) – definition

The term "information retrieval" was coined and defined by Calvin Mooers (1919-1994) in 1951:

"IR: ... intellectual aspects of description of information ... and its specification for search ... and systems, technique, or machines ... [to provide information] useful to user"
Technological determinant

In IR the emphasis was not only on organization but even more on searching:
- technology was suitable for searching
- in the beginning, information organization was done by people and searching by machines
- nowadays information organization is done mostly by machines (sometimes by humans as well), and searching almost exclusively by machines
Some of the pioneers

Hans Peter Luhn (1896-1964)
- at IBM, pioneered many IR computer applications
- first to describe searching using Venn diagrams

Mortimer Taube (1910-1965)
- at Documentation Inc., pioneered coordinate indexing
- first to describe searching as Boolean algebra
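As a minimal illustration of the coordinate-indexing idea (not Taube's actual system; the documents and terms below are invented), Boolean searching amounts to set algebra over an inverted index:

    # Minimal sketch of coordinate indexing / Boolean searching.
    # Documents and terms are invented for illustration.
    docs = {
        1: "jet engine noise reduction",
        2: "boundary layer of a jet wing",
        3: "noise in turbulent boundary layer",
    }

    # Coordinate indexing: an inverted index mapping each term
    # ("uniterm") to the set of documents it occurs in.
    index = {}
    for doc_id, text in docs.items():
        for term in text.split():
            index.setdefault(term, set()).add(doc_id)

    # Boolean searching as set (Boolean) algebra on the posting sets.
    hits_and = index["noise"] & index["jet"]     # AND -> {1}
    hits_or = index["noise"] | index["jet"]      # OR  -> {1, 2, 3}
    hits_not = index["boundary"] - index["jet"]  # NOT -> {3}
    print(hits_and, hits_or, hits_not)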
Searching & relevance

- Searching became a key component of information retrieval
  - extensive theoretical and practical concern with searching
  - technology uniquely suitable for searching
- And searching is about retrieval of relevant answers

Thus RELEVANCE emerged as a key notion.
Why relevance?

Aboutness
- A fundamental notion related to organization of information
- Relates to subject and, in a broader sense, to epistemology

Relevance
- A fundamental notion related to searching for information
- Relates to problem-at-hand and context and, in a broader sense, to pragmatism

Relevance emerged as a central notion in information science because of practical and theoretical concerns with searching.
Relevance research
WHAT HAVE WE LEARNED ABOUT RELEVANCE?
Claims & counterclaims in IR

- Historically, from the outset: "My system is better than your system!"
- Well, which one is it? Let's test it. But:
  - what criterion to use?
  - what measures based on the criterion?
- Things got settled by the end of the 1950s and remain mostly the same to this day
Relevance & IR testing

In 1955 Allen Kent (b. 1921) and James W. Perry (1907-1971) were the first to propose two measures for testing IR systems:
- "relevance", later renamed "precision", and "recall"
- a scientific & engineering approach to testing
Relevance as criterion for measures

Precision
- Probability that what is retrieved is relevant
- conversely: how much junk is retrieved?

Recall
- Probability that what is relevant in a file is retrieved
- conversely: how much relevant stuff is missed?

Both measure the probability of agreement between what the system retrieved/did not retrieve as relevant (system relevance) and what the user assessed as relevant (user relevance), where user relevance is the gold standard for comparison.
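As a concrete illustration (a minimal sketch; the document IDs and judgments below are invented, not from the talk), both measures fall out of two sets:

    # Minimal sketch: precision & recall over sets of document IDs.
    retrieved = {1, 2, 3, 4, 5}        # what the system returned
    relevant = {2, 3, 5, 8, 9, 10}     # user's gold-standard judgments

    hits = retrieved & relevant        # retrieved AND relevant

    precision = len(hits) / len(retrieved)  # P(relevant | retrieved) = 3/5
    recall = len(hits) / len(relevant)      # P(retrieved | relevant) = 3/6
    print(f"precision={precision:.2f} recall={recall:.2f}")  # 0.60 0.50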
First test – law of unintended consequences

Mid-1950s test of two competing systems:
- subject headings, by the Armed Services Technical Information Agency
- uniterms (keywords), by Documentation Inc.
- 15,000 documents indexed by each group, 98 questions searched
- but relevance judged by each group separately

Results:
- first group: 2,200 relevant; second group: 1,998 relevant
- but low agreement
- then peace talks; but even after them, agreement came to only 30.9%
- the test collapsed on relevance disagreements

Lesson learned: never, ever use more than a single judge per query. Since then, to this day, IR tests don't.
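A minimal sketch of the kind of overlap calculation behind such an agreement figure (the document IDs are invented, merely sized to match the reported counts; the historical computation may have differed in detail):

    # Minimal sketch: inter-judge agreement as overlap of judgments.
    group_a = set(range(0, 2200))       # 2,200 docs group A judged relevant
    group_b = set(range(1209, 3207))    # 1,998 docs group B judged relevant

    both = group_a & group_b            # judged relevant by both groups
    either = group_a | group_b          # judged relevant by at least one

    agreement = len(both) / len(either)  # overlap (Jaccard) agreement
    print(f"agreement = {agreement:.1%}")  # -> 30.9%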
Cranfield tests 1957-1967

Cyril Cleverdon (1914-1997)

- Funded by NSF
- Controlled testing: different indexing languages, same documents, same relevance judgments
- Used the traditional IR model – non-interactive
- Many results, some surprising
  - e.g. simple keywords "high ranks on many counts"
- Developed the Cranfield methodology for testing
  - still in use today, incl. in TREC – started in 1992, still strong in 2014
Tradeoff in recall vs. precision
Cleverdon's law

Generally, there is a tradeoff:
- recall can be increased by retrieving more, but precision decreases
- precision can be increased by being more specific, but recall decreases

Some users want high precision, others high recall.

[Figure: example recall-precision curve from TREC.]
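A minimal sketch of where the tradeoff comes from, assuming a ranked result list with invented relevance labels: retrieving more (a deeper cutoff) raises recall while precision tends to fall.

    # Minimal sketch: precision/recall tradeoff along a ranked list.
    # The ranking and relevance labels (1 = relevant) are invented.
    ranked = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]   # results in rank order
    total_relevant = sum(ranked)

    for cutoff in (2, 5, 10):                 # retrieve top-cutoff results
        top = ranked[:cutoff]
        precision = sum(top) / cutoff
        recall = sum(top) / total_relevant
        print(f"top-{cutoff}: precision={precision:.2f} recall={recall:.2f}")
    # top-2:  precision=1.00 recall=0.50
    # top-5:  precision=0.60 recall=0.75
    # top-10: precision=0.40 recall=1.00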
Assumptions in Cranfield methodology

IR, and thus relevance, is static (traditional IR model). Relevance is:
- topical
- binary
- independent
- stable
- consistent
- if pooling: complete

These assumptions inspired relevance experimentation on every one of them.
Main finding: none of them holds.
But the simplified assumptions enabled rich IR tests and many developments.
IR & relevance: static vs. dynamic

Q: Do relevance inferences & criteria change over time for the same user & task? A: They do.

For a given task, the user's inferences depend on the stage of the task:
- different stages = differing selections; different stages = similar criteria but different weights
- increased focus = increased discrimination = more stringent relevance inferences

IR & relevance inferences are highly dynamic processes.
Experimental results

Topical
- Topicality: very important, but not an exclusive role.
- Cognitive, situational, affective variables play a role, e.g. user background (cognitive); task complexity (situational); intent, motivation (affective).

Binary
- Continuum: users judge on a continuum and comparatively, not only binary (relevant – not relevant).
- Bi-modality: assessments seem to have high peaks at the end points of the range (not relevant, relevant), with smaller peaks in the middle range.

Independent
- Order: the order in which documents are presented to users seems to have an effect.
- Near beginning: documents presented early seem to have a higher probability of being inferred as relevant.
Experimental results (cont.)

Stable
- Time: relevance judgments are not completely stable; they change over time as tasks progress and learning advances.
- Criteria: criteria for judging relevance are fairly stable.

Consistent
- Expertise: higher expertise = higher agreement, fewer differences; lower expertise = lower agreement, more leniency.
- Individual differences: the most prominent feature and factor in relevance inferences. Experts agree up to 80%; others around 30%.

If pooling: complete
- Number of judges: more judges = less agreement (if only a sample of a collection, or a pool from several searches, is evaluated).
- Additions: with more pools or increased sampling, more relevant objects are found.
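A minimal sketch of the pooling idea referred to above (the runs and rankings are invented): the pool judges assess is the union of the top-k results from several systems, so adding runs or deepening the pool can surface more relevant documents.

    # Minimal sketch of TREC-style pooling; the runs below are invented.
    runs = {
        "run_a": [3, 7, 1, 9, 4, 2],   # each run: ranked document IDs
        "run_b": [7, 5, 3, 8, 1, 6],
        "run_c": [9, 7, 2, 5, 10, 3],
    }

    def build_pool(runs, depth):
        """Union of the top-`depth` documents across all runs."""
        pool = set()
        for ranking in runs.values():
            pool.update(ranking[:depth])
        return pool

    print(build_pool(runs, 2))   # shallow pool: {3, 5, 7, 9}
    print(build_pool(runs, 4))   # deeper pool finds more candidates:
                                 # {1, 2, 3, 5, 7, 8, 9}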
Clues: on what basis & criteria do users make relevance judgments?

Content
- topic, quality, depth, scope, currency, treatment, clarity

Object
- characteristics of information objects, e.g. type, organization, representation, format, availability, accessibility, costs

Validity
- accuracy of information provided, authority, trustworthiness of sources, verifiability
Clues (cont.): matching users

Use or situational match
- appropriateness to situation or tasks, usability, urgency; value in use

Cognitive match
- understanding, novelty, mental effort

Affective match
- emotional responses to information, fun, frustration, uncertainty

Belief match
- personal credence given to information, confidence
Summary of relevance experiments

- First experiment reported in 1961
  - compared effects of various representations (titles, abstracts, full text)
- Over the years, about 300 or so experiments
- Little funding
  - only two funded by a US agency (1967)
- Most important general finding: relevance is measurable
In conclusion

- Information technology & systems will change dramatically
  - even in the short run
  - and in unforeseeable directions
- But relevance is here to stay!
  - and relevance has many faces – some unusual
Innovation ... as well ... not all are digital
[Image]

... and here is its use
[Image]
Unusual services: library therapy dogs
U Michigan, Ann Arbor, Shapiro Library
[Image]
Presentation in Wordle
[Image: word cloud of this presentation]
Thank you for inviting me!