Governed by effectiveness & feedback
© Tefko Saracevic 1
• Search statement (query):
– set of search terms with logical connectors and attributes - file and system dependent
• Search strategy (big picture):
– overall approach to searching a question:
selection of systems, files, search statements & tactics, sequence, output formats; cost & time aspects
• Search tactics (action choices):
– choices & variations in search statements:
terms, connectors, attributes
• Move:
– modifications of search strategies or tactics that are aimed at improving the results
• Cycle (particularly applicable to systems such as DIALOG):
– set of commands from start (begin) to viewing (type) results, or from one viewing command to the next viewing command
• Effectiveness:
– performance as to objectives
to what degree did a search accomplish what was desired?
how well was it done in terms of relevance?
• Efficiency:
– performance as to costs
at what cost and/or effort, time?
Both KEY concepts & criteria for selection of strategy, tactics & evaluation
• Search tactics chosen & changed following some criteria of accomplishment, such as:
– none - no thought given
– relevance (very often)
– magnitude (also very often)
– output attributes
– topic/strategy
• Tactics altered interactively
– role & types of feedback
Knowing which tactics may produce which results is key for a professional searcher
• Relevance: attribute/criterion reflecting effectiveness of exchange of information between people (users) & IR systems in communication contacts, based on valuation by people
• Some attributes:
– in IR - user dependent
– multidimensional or faceted
– dynamic
– measurable - somewhat
– intuitively well understood
• Several types of relevance considered:
– Systems or algorithmic relevance:
relation between a query as entered and objects in the file of a system as retrieved or failed to be retrieved by a given procedure or algorithm. Comparative effectiveness.
– Topical or subject relevance:
relation between topic in the query & topic covered by the retrieved objects, or objects in the file(s) of the system, or even in existence. Aboutness.
– Cognitive relevance or pertinence:
relation between state of knowledge & cognitive information need of a user and the objects provided or in the file(s). Informativeness, novelty ...
– Motivational or affective relevance:
relation between intents, goals & motivations of a user & objects retrieved by a system or in the file, or even in existence. Satisfaction ...
– Situational relevance or utility:
relation between the task or problem at hand and the objects retrieved (or in the files). Relates to usefulness in decision-making, reduction of uncertainty ...
• Precision:
– probability that given that an object is retrieved it is relevant, or the ratio of relevant items retrieved to all items retrieved
• Recall:
– probability that given that an object is relevant it is retrieved, or the ratio of relevant items retrieved to all relevant items in a file
• Precision is easy to establish, recall is not
– union of retrievals as a “trick” to establish recall
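The union "trick" can be illustrated with a minimal sketch. It assumes several searches were run on the same question and that a user judged the retrieved items; the function names and the document ids are hypothetical, made up for illustration only.

```python
def precision(retrieved, relevant):
    """Ratio of relevant items retrieved to all items retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


def pooled_recall(retrieved, all_retrievals, judged_relevant):
    """Estimate recall against the relevant items in the union of several retrievals."""
    pool = set().union(*all_retrievals)          # everything any search retrieved
    relevant_pool = pool & set(judged_relevant)  # relevant items known to exist
    if not relevant_pool:
        return 0.0
    return len(set(retrieved) & relevant_pool) / len(relevant_pool)


# Hypothetical data: three searches on the same question, items judged by a user.
search_1 = {"d1", "d2", "d3", "d4"}
search_2 = {"d2", "d5", "d6"}
search_3 = {"d1", "d6", "d7", "d8"}
judged_relevant = {"d1", "d2", "d5", "d6", "d7"}

searches = [search_1, search_2, search_3]
p1 = precision(search_1, judged_relevant)
r1 = pooled_recall(search_1, searches, judged_relevant)
print(f"search 1: precision={p1:.2f}, pooled recall={r1:.2f}")
```

Relevant items that none of the searches retrieved are invisible to this estimate, so the pooled figure can overstate true recall.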
                        Items RETRIEVED                              Items NOT RETRIEVED
Judged RELEVANT         a = no. of items relevant & retrieved        c = no. of items relevant & not retrieved
Judged NOT RELEVANT     b = no. of items not relevant & retrieved    d = no. of items not relevant & not retrieved

$$\text{Precision} = \frac{a}{a + b}$$
$$\text{Recall} = \frac{a}{a + c}$$
High precision = maximize a, minimize b
High recall = maximize a, minimize c
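As a worked example of the two ratios, here is a minimal sketch that computes them from the cell counts a, b and c. The counts are hypothetical and the function name is an assumption chosen for illustration.

```python
def precision_recall(a, b, c):
    """a: relevant & retrieved, b: not relevant & retrieved, c: relevant & not retrieved."""
    precision = a / (a + b) if (a + b) else 0.0
    recall = a / (a + c) if (a + c) else 0.0
    return precision, recall


# Hypothetical counts for one search over a file of 1000 items.
a, b, c, d = 30, 20, 10, 940
p, r = precision_recall(a, b, c)
print(f"precision = {a}/({a}+{b}) = {p:.2f}")
print(f"recall    = {a}/({a}+{c}) = {r:.2f}")
```

Note that d (not relevant & not retrieved) does not enter either ratio.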
• Precision = percent of relevant stuff you have in your answer
– or conversely percent of junk
– high precision = most stuff relevant
– low precision = a lot of junk
• Some users demand high precision
– do not want to wade through much stuff
– but it comes at a price: relevant stuff may be missed
tradeoff
• A file may have a lot of relevant stuff
• Recall = percent of that relevant stuff in the file that you retrieved
– conversely percent of stuff you missed
– high recall = you missed little
– low recall = you missed a lot
• Some users demand high recall (e.g. PhD students doing a dissertation)
– want to make sure that important stuff is not missed
– but will have to pay a price of wading through a lot of junk
tradeoff
• USUALLY: precision & recall are inversely related
– higher recall usually means lower precision & vice versa
[Figure: precision plotted against recall, each axis running from 0 to 100 %]
• It is like in life, usually:
– you get some, you lose some
• Usually, but not always
keep in mind these are probabilities
– when you have high precision most stuff you got is relevant or on the target but you missed stuff that is also relevant – it was left behind
– when you have high recall you did not miss much but you got also a lot of junk - wading through it
You use different tactics for high recall from those for high precision
• What variations possible?
– several ‘things’ in a query can be selected or changed that affect effectiveness
– each variation has consequence in output
if I do X then Y will happen
1. LOGIC
– choice of connectors among terms (AND, OR, NOT, W …)
2. SCOPE
– no. of terms linked by ANDs
(A AND B vs. A AND B AND C)
3. EXHAUSTIVITY
– for each concept, no. of related terms joined by OR connections
(A OR B vs. A OR B OR C)
4. TERM SPECIFICITY
– for each concept, level in hierarchy
(broader vs. narrower terms)
5. SEARCHABLE FIELDS
– choice for text terms & non-text attributes
e.g. titles only, limit as to years
6. FILE OR SYSTEM SPECIFIC CAPABILITIES
– e.g. ranking, sorting
Tactic                                Output size   Recall   Precision
SCOPE - adding more ANDs              down          down     up
EXHAUSTIVITY - adding more ORs        up            up       down
USE OF NOTs - adding more NOTs        down          down     up
BROAD TERM USE - low specificity      up            up       down
PHRASE USE - high specificity         down          down     up
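The effects of SCOPE, EXHAUSTIVITY and NOTs on output size, recall and precision can be tried out on a toy collection. The sketch below is only an illustration: the eight documents, their index terms and the relevance judgments are invented, and Boolean connectors are modelled with Python set operations rather than any real system's query language.

```python
# Hypothetical mini-collection: document id -> set of index terms.
docs = {
    "d1": {"information", "retrieval", "evaluation"},
    "d2": {"information", "retrieval", "systems"},
    "d3": {"information", "seeking", "behavior", "users"},
    "d4": {"document", "retrieval", "evaluation"},
    "d5": {"information", "theory", "users"},
    "d6": {"document", "search", "evaluation"},
    "d7": {"database", "systems"},
    "d8": {"information", "search", "evaluation", "users"},
}
relevant = {"d1", "d2", "d4", "d6", "d8"}  # assumed user judgments


def posting(term):
    """Return the set of documents indexed with the given term."""
    return {doc_id for doc_id, terms in docs.items() if term in terms}


def report(label, retrieved):
    """Print and return output size, precision and recall for a retrieved set."""
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant)
    print(f"{label:<32} size={len(retrieved)} "
          f"precision={precision:.2f} recall={recall:.2f}")
    return len(retrieved), precision, recall


# SCOPE: adding an AND narrows the output.
base = report("information", posting("information"))
scope = report("information AND retrieval",
               posting("information") & posting("retrieval"))

# EXHAUSTIVITY: adding ORs broadens the output.
single = report("retrieval", posting("retrieval"))
exhaustive = report("retrieval OR search OR seeking",
                    posting("retrieval") | posting("search") | posting("seeking"))

# NOT: excluding a term narrows the output.
negated = report("information NOT users",
                 posting("information") - posting("users"))
```

In this invented data each change moves all three measures in the usual direction, but these are tendencies only; on other data a given AND, OR or NOT may leave recall or precision unchanged.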
• To increase precision:
– use precision devices
• To increase recall:
– use recall devices
• Each will also affect magnitude of output
• With experience, use of these devices will become second nature
BROADENING (higher recall):
Fewer ANDs
More ORs
Fewer NOTs
More free text
Fewer controlled
More synonyms
Broader terms
Less specific
More truncation
Fewer qualifiers
Fewer limits
Citation growing
NARROWING (higher precision):
More ANDs
Fewer ORs
More NOTs
Less free text
More controlled
Fewer synonyms
Narrower terms
More specific
Less truncation
More qualifiers
More limits
Building blocks
• Citation growing:
– find a relevant document
– look for documents cited in it
– look for documents citing it
– repeat on newly found relevant documents
• Building blocks
– find documents with term A
– review – add term B & so on
• Using different feedbacks
– a most important tool
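The citation growing steps above can be sketched as a loop over a citation graph. This is a minimal illustration under stated assumptions: the graph, the paper labels and the `is_relevant` callback (a stand-in for a human relevance judgment) are all hypothetical.

```python
from collections import deque

# Hypothetical citation data: paper -> papers it cites.
cites = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": [],
    "D": [],
    "E": ["A"],
    "F": ["E", "X"],
    "X": [],
}
relevant = {"A", "B", "D", "E", "F"}  # stand-in for human judgments


def cited_by(paper):
    """Papers that cite the given paper."""
    return [p for p, refs in cites.items() if paper in refs]


def citation_growing(seed, is_relevant):
    """Start from one relevant paper; follow citations in both directions."""
    found = {seed}
    queue = deque([seed])
    while queue:
        paper = queue.popleft()
        # Documents cited in it, plus documents citing it.
        neighbours = cites.get(paper, []) + cited_by(paper)
        for other in neighbours:
            if other not in found and is_relevant(other):
                found.add(other)
                queue.append(other)  # repeat on newly found relevant documents
    return found


result = citation_growing("A", lambda p: p in relevant)
print(sorted(result))
```

Papers judged not relevant (here C and X) are not followed further, which keeps the set from growing without bound.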
• Any feedback implies loops
– a completion of a process provides information for modification, if any, for the next process
– information from output is used to change previous or create new input
• In searching:
– some information taken from output of a search is used to do something with next query (search statement)
examine what you got to decide what to do next in searching
– a basic tactic in searching
• Several feedback types used in searching
– each used for different decisions
• Content relevance feedback
– judge relevance of items retrieved
– make decision what to do next
switch files, change exhaustivity …
• Term relevance feedback
– find relevant documents
– examine what other terms used in those documents
– search using additional terms
also called query modification & in some systems done automatically
• Magnitude feedback
– on the basis of size of output make tactical decisions
often the size so big that documents are not examined but next search done to limit size
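Two of these feedback types lend themselves to a small sketch. The first function is a simple form of term relevance feedback (adding the terms that occur most often in documents judged relevant); the second turns output size into a tactical decision. Both function names, the example terms and the size thresholds are assumptions for illustration, not taken from any particular system.

```python
from collections import Counter


def expand_query(query_terms, relevant_docs, n_new=2):
    """Term relevance feedback: add terms most frequent in relevant documents."""
    counts = Counter()
    for terms in relevant_docs:
        # Count each new term once per document; skip terms already in the query.
        counts.update(set(terms) - set(query_terms))
    # Sort by document frequency (descending), then alphabetically for stable ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return list(query_terms) + [term for term, _ in ranked[:n_new]]


def magnitude_decision(output_size, low=5, high=100):
    """Magnitude feedback: decide the next tactic from output size alone.

    The thresholds are arbitrary placeholders; real ones depend on the user and file.
    """
    if output_size > high:
        return "narrow"    # too much to examine: apply precision devices
    if output_size < low:
        return "broaden"   # too little: apply recall devices
    return "examine"       # manageable: go on to judge relevance


# Hypothetical index terms of three documents judged relevant.
relevant_docs = [
    ["retrieval", "evaluation", "precision"],
    ["retrieval", "evaluation", "recall"],
    ["retrieval", "precision", "relevance"],
]
new_query = expand_query(["retrieval"], relevant_docs)
print(new_query)
print(magnitude_decision(2500), magnitude_decision(3), magnitude_decision(40))
```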
• Tactical review feedback
– after a number of queries (search statements) in the same search review tactics as to getting desired outputs
review terms, logic, limits …
– change tactics accordingly
• Strategic review feedback
– after a while (or after consultation with user) review the “big” picture of what was searched and how
sources, terms, relevant documents, need satisfaction, changes in question, query …
– do next searches accordingly
– used in reiterative searching
• There is a difference between reviewing strategy & tactics
– but they can be combined
“…moving through many actions towards a general goal of satisfactory completion of research related to information need.”
– query is shifting (continually)
as search progresses queries are changing
different tactics are used
– searcher (user) may move through a variety of sources
new files, resources may be used
strategy may change
– new information may provide new ideas, new directions
feedback is used in various ways
– question is not satisfied by a single set of answers, but by a series of selections & bits of information found along the way
results may vary & may have to be provided in appropriate ways & means