Why is relevance still the basic notion in information science?

(Despite great advances in information
technology & applications)
Tefko Saracevic, Ph.D.
Rutgers University, USA
tefkos@rutgers.edu
Tefko Saracevic
This work is licensed under a
Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License
Fundamental concepts
Every scholarly field has a fundamental,
basic notion, concept, idea ...
Relevance is a fundamental
concept or notion in
information science
• It was, but is it still?
Definition
relevance
1 a: relation to the matter at hand
2: the ability (as of an information retrieval system) to retrieve material that satisfies the needs of the user
What is “matter at hand”?
• Context in relation to
which
– a problem is addressed
– an information need is
expressed as a question
– a query for searching is
formulated
– interaction is taking place
• No such thing as
relevance without a
context
• Axiom:
One cannot not
have a context in
information
interaction.
Relevance is ALWAYS contextual
Context
has to be
there
Relevance – by any other name...
Many names connote
relevance e.g.:
pertinent; useful;
applicable; significant;
germane; material; bearing;
proper; related; important;
fitting; suited; apropos; ...
& nowadays even truthful
“A rose by any other name would smell as sweet”
Shakespeare, Romeo and Juliet
Connotations may differ
but the concept is still
relevance
Two worlds in information science
Information retrieval (IR) systems offer as answers their version of what may be relevant
– by ever improving algorithms
People go their way & assess relevance
– by their problem at hand, context & criteria
The two worlds interact
Considered here: human world of relevance
NOT covered: how IR deals with relevance
Two large questions
Why? (Part I)
• Why did relevance become a central notion of information science?
Why still? (Part II)
• Why did relevance still remain a central notion?
- despite advances in technology
Part I
WHY RELEVANCE?
Bit of history
• Vannevar Bush (1890-1974): article “As We May Think”, 1945
– Defined the problem as “... the massive task of making more accessible our bewildering store of knowledge.”
• problem still with us & growing
– Suggested a solution, a machine: “Memex ... association of ideas ... duplicate mental processes artificially.”
• Technological fix to problem
Information Retrieval (IR) – definition
• Term “information retrieval” coined & defined by Calvin Mooers (1919-1994), 1951
“IR: ... intellectual aspects of description of information, ... and its specification for search ... and systems, technique, or machines ... [to provide information] useful to user”
Technological determinant
• In IR emphasis was not only on organization
but even more on searching
– information technology was eminently suitable for
searching
• particularly computers
• Technological fix to the problem of
information explosion
Two important pioneers
Hans Peter Luhn 1896-1964
• at IBM pioneered many IR
computer applications
– first to describe searching
using Venn diagrams
Mortimer Taube 1910-1965
• at Documentation Inc.
pioneered coordinate
indexing
– first to describe searching
as Boolean algebra
Searching & relevance
• Searching became a key
component of
information retrieval
• And searching is about
retrieval of relevant
answers
– extensive theoretical &
practical concern with
searching
– technology uniquely
suitable for searching
Thus RELEVANCE emerged as a key notion
Basic
Why relevance?
Aboutness
• A fundamental notion
related to organization of
information
• Relates to subject & in a
broader sense to
epistemology
Relevance
• A fundamental notion
related to searching for
information
• Relates to problem-at-hand
and context & in a broader
sense to pragmatism
Relevance emerged as a central notion in information science
because of practical & theoretical concerns with searching
Aboutness vs. relevance
Claims & counterclaims in IR
• Historically & from the outset:
“My system is better than your system!”
• Well, which one is it? OK: Let's test it.
But:
– what criterion to use?
– what measure(s) based on the criterion?
• Things got settled by the end of the 1950s and
remain mostly the same to this day
Relevance & IR testing
• In 1955 Allen Kent (1921-2014) & James W. Perry (1907-1971) were first to propose two measures for test of IR systems:
– “relevance” (later renamed “precision”) & “recall”
• A scientific & engineering approach to testing
Relevance as criterion for
measures
Precision
• Probability that what is
retrieved is relevant
– conversely: how much junk is
retrieved?
Recall
• Probability that what is
relevant in a file is retrieved
– conversely: how much
relevant stuff is missed?
Probability of agreement between what the system
retrieved/not retrieved as relevant (systems relevance) &
what the user assessed as relevant (user relevance)
where user relevance is the gold standard for
comparison
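The two measures above can be illustrated with a minimal sketch. It assumes each result set is simply a set of document identifiers and that the user's judgments are the gold standard. The function name `precision_recall` and the document IDs are hypothetical, chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    retrieved: documents the system returned (system relevance)
    relevant:  documents the user judged relevant (user relevance, gold standard)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # retrieved AND judged relevant by the user

    # Precision: share of what was retrieved that is relevant (the rest is junk).
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: share of what is relevant in the file that was retrieved.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four documents retrieved, three judged relevant.
retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d2", "d4", "d5"}
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.67
```

Here two of the four retrieved documents are relevant (precision 0.50), and two of the three relevant documents were found (recall 0.67); d5 is the relevant item that was missed.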
User relevance still ...
Part II
WHY STILL RELEVANCE?
changing
dramatically,
globally
• Many new applications
• Transformations
• Impacts
• Connections
• New, newer, newest
Social media ...
• Twitter
• Facebook
• Instagram
• LinkedIn
• Tumblr
• YouTube
• Google+
• Pinterest
Search engines ... Discovery tools
And of course ...
Societies, Journals, Conferences ...
Users
Up to the time of the Web
• Primary & almost exclusive users were
– scientists
– professionals
– businesses
– policy makers
• And everything reflected their needs, behavior
• Professionals searched
After the Web took over the world
• Everybody is a user
• Everybody searches for everything
• And everything reflects their needs, fashion, behavior
• People ... all over the globe
Everyone ...
As the world is changing
- so is research
Relevance experiments – then
• First experiments
reported in 1960 & 61
– by an IBM group
– compared effects on
relevance judgements of
various representations
• In the next 50 years
some 300 or so
experiments conducted
• A variety of factors in
human judgments of
relevance addressed
Relevance experiments move on
• Eye-tracking studies in
information science first
reported in 2003
• Continued with studies
of web and online
searching
• Moved to include
relevance in 2012
Jacek Gwizdka, Gmunden Retreat on
NeuroIS 2012 (Neuro Information Systems)
Marrying neuro-cognitive methods
& information science
• Cognitive aspects of
human information
interaction make
information science a
good field for application
of neuroscience theories
& tools
• Hypothesis for
relevance experiments:
– fundamental neural
processes are associated
with relevance decisions
& these processes can
be detected by EEG or
fMRI.
Symbolically
Types of techniques
• Eye-tracking
– measurement of eye activity.
Where do we look? What do we
ignore?
• Functional magnetic
resonance imaging (fMRI)
– measures brain activity by detecting
associated changes in blood flow
• Electro-encephalography (EEG)
– detects electrical activity in the brain
using small, flat metal discs (electrodes)
attached to the scalp
Jacek Gwizdka, Gmunden
Retreat on NeuroIS 2013
• Experiment detecting
brain activity related to
information relevance
judgments
– 10 subjects given news
stories; looked for factual
relevant information
• Provides experimental
design & conduct, but
results are in the next paper
• Won the Dr. Hermann
Zemlicka Award (“most
visionary paper”)
– among 23 papers
• Does the degree of
relevance of a text
document affect how it
is read?
YES
“relevant documents tend
to be read more
coherently, whereas
irrelevant documents
tend to be scanned.”
EEG = Electro-encephalography
A study from
Finland – in SIGIR
Forty participants viewed six
terms: which is relevant for
given topics?
(relevant and irrelevant
terms defined by “experts”)
Findings:
“ ... showed improvement up to
17% in relevance prediction
based on brain signals
alone.”
Another study from
Finland
MEG = magnetoencephalography
Nine subjects viewed images
– which is relevant for a
task?
Findings:
“the relevance of an image
a subject looks at can be
decoded from MEG signals
with performance
significantly better than
chance ...”
“There is one more thing...”
“I think the biggest innovation of the twenty-first
century will be the intersection of biology and
technology. A new era is beginning ...”
But also a reality in numbers
No. of people in the world: 7.3 billion
No. of people using the Internet: 2.9 billion
No. of people NOT connected to the Internet: 4.4 billion (60%)
of these, 3 billion live in only 20 countries
...... different technology...
and relevance in its use
Christian
Franjo
Thank you
for inviting me!
FYI –
For Your Information
Presentation and paper at:
http://comminfo.rutgers.edu/~tefko/articles.htm
URLs and references are in PowerPoint Notes –
accessible after download