Relevance in information science
Tefko Saracevic, PhD
tefko@scils.rutgers.edu
http://www.scils.rutgers.edu/~tefko/
© 2008 Tefko Saracevic

Preface
1975 & 1976: Relevance: A Review of the Literature and a Framework for Thinking on the Notion in Information Science
2007: Relevance: A Review of the Literature and a Framework for Thinking on the Notion in Information Science. Part II & Part III

Purpose
trace the evolution of thinking on relevance in information science for the past three decades
provide an updated framework within which the still widely dissonant ideas on relevance might be interpreted and related to one another

… in information science
Information retrieval (IR) systems offer their version of what may be relevant
People go their way & assess relevance
The two worlds interact
Concern here: human world of relevance
  how IR deals with relevance NOT covered

Historical note
Relevance came into IR unannounced
From the very start after WWII, IR was about retrieval of relevant information
First concerns: about “false drops” – non-relevant retrievals
First direct recognition in 1955 with proposal for precision & recall as evaluation measures based on relevance

Organization of presentation
1. What is the NATURE of relevance?
  1.1 Meaning?
  1.2 Theories?
  1.3 Models?
2. What are MANIFESTATIONS of relevance?
3. What is the BEHAVIOR of relevance?
4. What are the EFFECTS of relevance?

1.1 Meaning of relevance
Intuitively well understood
  same perception globally – “y’know”
  a “to” and context always present
Relevance: a relation between objects P & Q along property R
  may also include a measure S of the strength of connection

… in information science
Relation between information or information objects (Ps) & contexts (Qs) based on some property R & measure of intensity S
But: relevance is not given – it is established
Big questions & challenges
  How does relevance happen?
  Who does it, under what circumstances, and how?

Summary: Meaning of relevance
Relevance attributes involve:
  relation
  intention
  context
    internal; external
  inference
  selection
  interaction
  measurement

1.2 Theories of relevance
Theories suggested in several fields
  logic – in deduction of inferences, to reject fallacies – relevance logics
  philosophy – in phenomenology
    structure & functioning of the “life-world” (Alfred Schutz)
    it is stratified & relevance is the principle for stratification
    no single relevance but an interdependent system of relevances (plural)
    thematic (topical), interpretational, & motivational relevance

… theories in communication
Sperber & Wilson: Relevance Theory
  based on inferential model of communication
  what must be relevant and why to an individual with a single cognitive intention?
  posited a cognitive & a communicative principle of relevance
  assessed in terms of cognitive effects & processing effort
  individuals pick up the most relevant stimuli & process them to maximize their relevance

Summary: Theories of relevance
IS did not develop indigenous theory but a few “theories-on-loan” attempts
  Logic theories not applied, yet
  Schutz’s life-world theory used to some extent in specifying manifestations & models
  Sperber & Wilson’s theory used a few times as explanation & guide
    not tested

1.3 Models of relevance
Reviews – also produced models
Syracuse school of relevance
  dynamic & situational model (Schamber, Eisenberg & Nilan, 1990)
  connection with human information behavior
“Whole history of relevance” (Mizzaro, 1997)
  duality in modeling & studying documents & queries, 1959-1976
  dynamics & multidimensionality, 1977-present

Split between system & user models
Opposing views of IR: systems & users
  IR traditional model does not deal with users
Battle royal started by Dervin & Nilan (1986)
  criticism of system viewpoint
  call for user orientation
Several user & interaction models proposed – to reconcile, bridge
  still relevance has two basic models & cultures
  & they map like Australia

User vs system models
“Informing systems design” became mantra of all relevance studies
  “Tell us what to do and we will do it.” response from systems side
  But “telling” is not that simple
Issue is not a conflict but: how can we make user & system side work together for benefit of both?

Summary: relevance models
All IR & information seeking models have relevance at their base
Traditional IR model has the most simplified (“weak”) version of relevance
  but with the weak model IR is successful
Variety of integrative models have been proposed
  more complex models = increased challenge to incorporate in practice

2. Manifestations of relevance
“How many relevances in IR?” (Mizzaro, 1998)
Several manifestations recognized since 1950s
Issue: What given objects (Ps & Qs) are related by what given property (Rs) as relation?
  [adjective] relevance or different name
Duality strikes again
  subject (topic, system) relevance vs user (psychological, cognitive) relevance
  objective vs subjective relevance

Issue of primacy: weak and strong relevance
Does topical relevance underlie all others?
  predictably two answers: yes and no
  in a strict correspondence between query & answer topical is basic – weak relevance
  if derivation is involved topical may not be basic – strong relevance
  weak relevance is more associated with systems
  strong more with people

Beyond duality
Numerous other kinds of relevance were identified
  for user relevances: psychological, cognitive, affective, situational, socio-cognitive, pertinence, utility …
  for topical relevances: logical, systems, algorithmic, documentary, bibliographic …
  each indicates different relations

Summary: Manifestations of relevance
There are a limited number of manifestations – could be grouped:
  System or algorithmic relevance
  Topical or subject relevance
  Cognitive relevance or pertinence
  Situational relevance or utility
  Affective relevance
These are interdependent – they feed on each other interactively

3. Behavior of relevance
Relevance does not behave, people do
  how do humans determine relevance of information or information objects?
  reviewed only studies that have data
  some 30 studies – started in 1991
  related to information seeking & use studies & implicit relevance studies – not reviewed
Pattern for review:
  [author] used [subjects] to do [tasks] in order to study [object of research]

Relevance clues
What makes information or information objects relevant? What do people look for in order to infer relevance?
  two approaches: topic & clues analysis
Clues research: uncover & classify attributes or criteria that users concentrate on while making relevance inferences
  usually on documents but also other objects

Relevance dynamics
Do relevance inferences and criteria change over time for the same user and task, and if so, how?
As a user progresses through various stages of a task
  the user’s cognitive state changes
  the task changes as well
  thus, something about relevance also is changing.

Relevance feedback
What factors affect the process of relevance feedback? What types of feedback? How much is feedback used?
Dealing here with manual not automatic feedback
  behavior of people when involved in manual feedback

Summary on behavior
Caveats abound – nothing standardized
  still refreshing to see data
Some generalizations on clues:
  criteria finite in number, similar, but different weights assigned
  Different users, tasks, progress in tasks, classes of users = similar criteria = different weights
  Different ratings of relevance = similar criteria = different weights.

… summary clues
Clues criteria:
  content
  object
  validity
  use or situational match
  cognitive match
  affective match
  belief match

… summary clues
Criteria are not independent; people apply multiple criteria; they interact
  content (topic) criteria very important but not sole – interact with others
  for search outputs value of results as a whole critical
Visual information = faster inference than textual information

… summary dynamics
Inferences dependent on task stage
  criteria stable, selection changes
Different stages = differing selections
  but different stages = similar criteria = different weights
Increased focus = increased discrimination = more stringent relevance inferences
What is topical changes with progress in time and task

… summary feedback
Several kinds
  search term, content, magnitude, tactics
Use of relevance feedback = increase in performance
  however, used rarely in practice
Searching behavior different when using feedback

4. Effects of relevance
Works both ways: relevance affected by and affects host of factors
Relevance judges
  What factors inherent in relevance judges make a difference in relevance inferences?
  How large are and what affects individual differences in relevance inferences?
    similar question asked for a number of information activities – indexing, searching …
    most often studied: domain knowledge

Relevance judgments
What factors affect relevance judgments?
  short answer: a lot of them
  approach: classify into tables, e.g.
    Schamber (1994) 80 factors, 6 categories
    Harter (1996) 24 factors, 4 categories
  different approach here: classify studies along basic assumptions in IR evaluations

Central assumptions
Relevance is:
  topical
  binary
  independent
  stable
  consistent
  if pooling: complete
Not to prove or disprove these assumptions but to organize studies along questions

Beyond topicality
Do people infer relevance based on topicality only?
  Other factors enter & interact
Only a few studies directly addressed
  Wang & Soergel (1998) 11 criteria for selection, with topicality being top
  Xu & Chen (2006) in web searching: topicality & novelty most significant, then reliability & understandability

Beyond binary
Are relevance inferences binary, i.e. relevant – not relevant? If not, what gradation do people use in inferences about relevance of information or information objects?
Number of studies addressed this
  studied distributions of relevance inferences
  regions of relevance

Beyond independence
Are information objects assessed independently of each other? Does the order or size of the presentation affect relevance judgments?
Only a few studies on the questions
  Includes presentation of different representations

Beyond stability
Are relevance judgments stable as tasks and other aspects change? Do relevance inferences and criteria change over time for the same user and task, and if so how?
  mentioned already under dynamics
  judgments not completely stable, criteria are
Plato: “Everything is flux.”

Beyond consistency
Are relevance judgments consistent among judges or group of judges?
  human judgments about anything informational are not consistent, relevance included
Gull (1956) opened Pandora’s box
  classic example of law of unintended consequences
Some 6 studies addressed consistency
  subjects: experts, students

But does it matter?
How does inconsistency in human relevance judgments affect results of IR evaluation?
  main contention by critics
Five studies 1968 – 2000 addressed the question
  four also showed magnitude of agreement

Summary: judges
Subject expertise accounts strongly
  higher expertise = higher agreement, fewer differences
  lesser expertise = more leniency in judgment
large variability in relevance inferences by individuals
  same range as in other cognitive processes

Summary: judgments
Relevance is measurable!
None of the 5 postulates hold
  but by simplifying relevance for labs IR made significant advances
Relevance judgments are not binary but are bimodal
  regions of low, middle and high relevance
  high peaks at both ends
Order affects relevance judgment

… summary: judgments
Consistency:
  Higher expertise = higher consistency = more stringent.
  Lower expertise = lower consistency = more encompassing
  overlap using different populations hovers around 30%
  higher expertise up to 80%
  when 3rd, 4th … judge added overlap falls
Higher expertise = larger overlap. Lower expertise = smaller overlap. More judges = less overlap.

Summary: does it matter?
In lab conditions disagreement among judges does not affect evaluation
  rank order of different IR systems changes minimally
Different judges = same relative performance (on the average)
  swaps in ranking do occur = low probability
but performance for individual topics differs significantly
  law of averages kicks in

Summary: measures
Users can use a variety of scales
  there is no “best” scale
  magnitude scales very appropriate but hard to explain & analyze

Epilogue
Many things changed in IR & information science but goals the same
As to nature of relevance:
  marked progress in understanding
  little in theory
  diversification in models
As to manifestations:
  consensus there are several kinds of relevance, grouped in a half dozen or so well distinguished classes - interdependent

… epilogue
As to behavior & effects:
  seen a number of experimental & observational studies
  lifted the discourse beyond debate, anecdotes to data interpretation
  but generalizations difficult – findings should be treated as hypotheses

… epilogue: reflections
Relevance is poor
  no funding for relevance research
  of studies with data less than 17% mentioned outside funding, half from outside the US
  scholarship progressed sporadically & all over the place
Globalization of IR – globalization of relevance
  as relevance went global & to the masses many & different research questions emerged

… epilogue: reflections…
Proprietary IR – proprietary relevance
  major search engines proprietary = relevance proprietary
  for innovation must study users & use, but findings kept private
  relevance research into public & private branch
  paradox of the internet

… epilogue: research agenda - beyonds
Beyond behaviorism & black box
  many studies stimulus-response & no diagnostics
  need to go use/adapt other approaches

… epilogue: research agenda - beyonds
Beyond mantra “implication for system design”
  incorporating user concerns & characteristics not that simple
  integration between user/cognitive & systems approaches needed
  relevance research and IR research should at least get engaged, if not married
  interactive research on the right track

… epilogue: research agenda - beyonds
Beyond students
  ~70% of behavior & effect studies used students as population
  not surprising – they are affordable
  we know a lot about student relevance
  does it generalize to other populations?

In conclusion
Information technology & systems will change dramatically
  even in the short run
  and in unforeseeable directions
But relevance is here to stay!