An Implicit Feedback approach for Interactive Information Retrieval
Ryen W. White, Joemon M. Jose, Ian Ruthven (University of Glasgow)
Presented by Hamza Hydri Syed – Course Presentation, Web Information Retrieval

Roadmap
• Introduction
• Searcher Interaction
• Binary Voting Model
• Evaluation
• Results & Analysis
• Conclusions

Introduction

Relevance Feedback (RF):
• Automatically improving a system's representation of a searcher's information need through an iterative process of feedback [1].
• Depends on a series of relevance assessments made explicitly by the user.
• Assumes that the underlying need stays the same across all iterations.

Implicit RF:
• The IR system unobtrusively monitors search behaviour.
• Removes the need for the searcher to explicitly indicate which documents are relevant [2].
• A variety of "surrogate" measures have been employed: hyperlinks clicked, mouseovers, scrollbar activity [3, 4].
• These can be unreliable indicators, so this work uses interaction with the full-text documents as implicit feedback.

Approach:
• Searchers can interact with different representations of each document.
• Representations are of varying length, focused on the query, and logically connected at the interface to form an interactive search path.
• Develops a means of better representing searcher needs while minimizing the burden of explicitly reformulating queries.
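For contrast with the implicit approach, the explicit RF cited above (Salton & Buckley [1]) can be sketched as a Rocchio-style query update. This is an illustrative sketch only: the function name and the default parameter values are common textbook choices, not taken from the deck.

```python
def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Classic explicit relevance feedback: move the query vector toward
    the centroid of explicitly judged relevant documents and away from the
    nonrelevant ones. Implicit RF replaces these explicit judgments with
    evidence mined unobtrusively from search behaviour."""
    n = len(query)

    def centroid(docs):
        if not docs:
            return [0.0] * n
        return [sum(d[i] for d in docs) / len(docs) for i in range(n)]

    rel, nonrel = centroid(relevant), centroid(nonrelevant)
    # Negative term weights are conventionally clipped to zero.
    return [max(0.0, alpha * query[i] + beta * rel[i] - gamma * nonrel[i])
            for i in range(n)]

print(rocchio([1.0, 0.0], [[0.0, 1.0]], []))   # -> [1.0, 0.75]
```

The iterative process in the RF definition above corresponds to re-running this update after each round of judgments; the implicit system described in the rest of the deck supplies the "relevant" evidence without asking the searcher.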
Searcher Interaction

Document Representations:
• Focus on query-relevant parts of documents.
• Reduce the likelihood of selecting erroneous terms.
• The interface uses multiple document representations:
  (1) Top-ranking sentences (TRS) – one from each of the top 30 documents retrieved
  (2) Title
  (3) Query-biased summary of the document
  (4) Summary sentence
  (5) Sentence in context
  (6) The document itself

Relevance Path:
• The further along the path a searcher travels, the more relevant the information in the path is taken to be.
• Paths can vary in length, and searchers can access the full text of the document from any step in the path.

Binary Voting Model

Features:
• A heuristics-based model which implicitly selects terms for query modification.
• Utilizes searcher interaction with document representations and relevance paths.
• A term present in a viewed representation receives a "vote"; when not present it receives no vote.
• Winning terms are those with the most votes and hence best describe the information viewed by the searcher.
• The contribution of a vote is weighted by the indicative worth of the representation.
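The voting, cumulative weighting, and strategy selection described in this deck can be sketched minimally in Python. The class and method names are illustrative, not from the paper; the representation step weights and the correlation thresholds follow the values given elsewhere in the deck.

```python
from collections import defaultdict

# Indicative worth of each representation (step weights as given in the deck).
REP_WEIGHTS = {
    "title": 0.1,
    "trs": 0.2,                  # top-ranking sentence
    "summary": 0.2,
    "summary_sentence": 0.2,
    "sentence_in_context": 0.3,
}

class BinaryVotingModel:
    def __init__(self):
        # (document, term) -> cumulative weighted vote score
        self.scores = defaultdict(float)
        self.docs = set()

    def view(self, doc_id, representation, terms):
        """Record one step of a relevance path: each non-stop-word term in
        the viewed representation receives one vote, weighted by the
        representation's indicative worth. Scoring is cumulative."""
        weight = REP_WEIGHTS[representation]
        self.docs.add(doc_id)
        for term in set(terms):          # binary: a term is present or not
            self.scores[(doc_id, term)] += weight

    def top_terms(self, k=6):
        """Rank terms by their average score across all viewed documents
        and return the k best as the implicitly generated query."""
        totals = defaultdict(float)
        for (_, term), score in self.scores.items():
            totals[term] += score
        n_docs = max(len(self.docs), 1)
        ranked = sorted(totals, key=lambda t: totals[t] / n_docs, reverse=True)
        return ranked[:k]

def strategy(rho):
    """Map the Spearman correlation between successive term rankings to a
    search decision, using the thresholds from the deck's strategy slide."""
    if rho < 0.2:
        return "re-search"               # large change in the need
    if rho < 0.5:
        return "reorder-documents"       # weak correlation
    if rho < 0.8:
        return "reorder-trs"             # strong correlation, small change
    return "no-action"

bvm = BinaryVotingModel()
bvm.view("D1", "title", ["t1", "t2", "t7"])   # step weight 0.1
bvm.view("D1", "summary", ["t1", "t5"])       # step weight 0.2
print(bvm.top_terms(2))                       # -> ['t1', 't5']
print(strategy(0.6))                          # -> reorder-trs
```

The defaultdict keyed on (document, term) pairs plays the role of the document × term matrix: an unseen entry is implicitly zero, and each viewed representation simply adds its step weight to the relevant cells.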
Representation step weights (indicative worth of each step):
  Title 0.1 | TRS 0.2 | Summary 0.2 | Summary sentence 0.2 | Sentence in context 0.3

Features (continued):
• Each document is represented by a vector of length n, where n is the total number of unique non-stop-word terms.
• The list holding these terms is the vocabulary.
• A document × term matrix of size (d + 1) × n is built, where d is the number of documents the searcher has seen.

Example – Simple Updating:
• The original query Q0 contains t5 and t9.
• The vector is normalised to give each term a value in [0, 1].
• Each occurring term t is assigned a weight w_t, where p is the number of steps taken, D the document, t the term, r the representation, and w_t,r the weight of t for representation r.
• The weight for each term is added to the appropriate term/document entry in the matrix.
• Starting from the initial state of the document × term matrix, the searcher expresses interest in the Title of document D1 (step weight 0.1), which contains terms t1, t2 and t7.
• The weights of t1, t2 and t7 are directly updated: t2 is now seen as important to D1, and t1 and t7 as more important to D1 than before.
• Scoring is cumulative.

Query Creation:
• After every 5 paths a new query is computed, by which point sufficient implicit evidence has been gathered from searcher interaction.
• To compute the new query, the average score for each term is calculated across all documents, and terms are ranked by this score.
• A high average score implies the term has appeared in many viewed representations and/or in those with high indicative weights.
• In the example, the top 6 terms chosen are t9, t5, t1, t7, t3 and t2. Although t2, t3 and t8 have the same score, t8 is not included, since t3 occurs more recently and t2 occurs in more than
one document.

Tracking Information Need:
• A change in the information need can be measured by computing the change in term ordering between the term lists at different steps, i.e., between queries q_m and q_(m+1).
• Since the vocabulary is static, only the order of the terms in the list changes.
• The change is measured with the Spearman rank-order correlation coefficient ρ, which computes the difference between two ranked lists of unique terms.
• The correlation returns values between -1 and 1: a result closer to -1 means the term lists are dissimilar with respect to rank ordering; a result closer to 1 means the similarity between the term rankings increases.

Strategies Implemented:
• Re-searching – a coefficient value < 0.2 indicates a large change in the term lists (they are substantially different with respect to rank ordering), reflecting a large change in the information need. A new search is run to retrieve a new set of documents.
• Reordering documents – a result in the range [0.2, 0.5) indicates weak correlation and consequently a less substantial change in the information need. The new query is used to reorder the top 30 retrieved documents, using best-match tf-idf scoring.
• Reordering TRS – a coefficient in the range [0.5, 0.8) indicates strong correlation between the two term lists and only a small change in the predicted information need. The new query is used to re-rank the TRS list.

Evaluation

Manual Baseline System:
• Similar to the implicit feedback system, except that the searcher is solely responsible for adding new query terms and selecting which action is undertaken.
• The baseline interface has an additional component – a term/strategy control panel – which allows searchers to decide how best to use the query.
• This nature of the baseline allows us to evaluate how well the implicit feedback system detected information needs from the perspective of the subject.

Experimental Subjects:
• Mainly undergraduate and postgraduate students of the University of Glasgow, divided into 2 groups: experienced and inexperienced.

Experimental Tasks:
• Each subject was asked to complete one search task from each of 4 categories:
  • Fact search (finding a person's mail address)
  • Background search (finding information on dust allergies)
  • Decision search (choosing the best financial instrument)
  • Search for a number of items (finding contact details of a number of employees)
• The search scenarios reflect real-life search situations and allow the subject to make personal assessments of what constitutes relevant material [5].

Results & Analysis

Hypotheses tested:
1. "The terms selected for implicit feedback represent the information needs of the subject (i.e., term selection support)."
2. "The implicit feedback approach estimates changes in the subject's information need."
3. "The implicit feedback approach makes search decisions that correspond closely with those of the subject."

Hypothesis 1 – Information need detection:
• We measure the degree of term overlap using the baseline system.
• The BVM runs in the background, invisible to the subject, and is not directly involved in any query modification decisions.
• High values of term overlap suggest that the terms chosen by the BVM are of good value and match the subject's own impression of the information need.
• Results show the average percentage of occasions where the top 6 terms chosen by the BVM included at least one of the subject's terms.
• The difference between inexperienced and experienced subjects was not significant, though term overlap for experienced subjects was generally higher.
• Also reported are the average number of query iterations and the average query length: an "iteration" is the use of a query for any action (reordering the TRS, reordering the documents, or re-searching the Web); average query length is the number of terms in the new query that were not in the original query.
• The average frequency of query manipulation (adding and/or removing terms) is shown for each subject performing the different types of search.
• Subjects added terms to queries more often for decision and background searches than for fact searches and searches for a number of items.
• Implicit feedback performs better in decision search than in fact search.

Hypothesis 2 – Information need tracking:
• The average number of actions carried out on each system across all search tasks shows differences in the no.
of times the TRS were reordered:
• Experienced subjects make more use of unfamiliar actions.
• Both groups reorder the list of TRS more than the implicit feedback system does, and reorder the documents less frequently.
• Reordering of sentences/documents allows the system to reshape the information space.
• The proportion of each type of action that was undone is also reported; a reversal indicates dissatisfaction with the outcome of the action or with the terms suggested.
• Subjects responded well to the search strategies employed on their behalf.
• Inexperienced subjects disliked the effects of the TRS reordering, whereas experienced subjects liked TRS re-ranking but reversed the re-searching operation more often.

Hypothesis 3 – Relevance paths:
• Subjects were asked to rate the worth of following a relevance path from one document representation to another.
• The relevance paths were rated significantly more helpful, beneficial, appropriate and useful by experienced subjects than by inexperienced ones.
• The distance travelled along a relevance path was a good indicator of the relevance of the information in that path.
• Also reported: the most common path taken, the average number of steps followed, and the average number of complete and partial paths.
• Subjects used relevance paths consistently, although experienced subjects followed the paths for longer.
• Experienced subjects interacted more with the retrieved documents and more frequently used the document representations for viewing the full text of a document.

Conclusions
• The interface uses query-relevant document representations to facilitate access to potentially useful information and to allow searchers to closely examine results.
• This form of implicit feedback is at the extreme end of a spectrum of searcher support; such techniques may be best used to make decisions in conjunction with, not in place of, the searcher.
• The approach has the potential to alleviate some of the problems inherent in explicit relevance feedback, while preserving many of its benefits.
• The success of the approach bodes well for the construction of effective implicit RF systems that work in concert with the searcher.

References
1. Salton, G., & Buckley, C. (1990). Improving retrieval performance by relevance feedback. Journal of the American Society for Information Science, 41(4), 288–297.
2. Morita, M., & Shinoda, Y. (1994). Information filtering based on user behavior analysis and best match text retrieval. In Proceedings of the 17th annual ACM SIGIR conference on research and development in information retrieval (pp. 272–281).
3. Lieberman, H. (1995). Letizia: an agent that assists web browsing. In Proceedings of the 14th international joint conference on artificial intelligence (pp. 475–480).
4. Joachims, T., Freitag, D., & Mitchell, T. (1997). WebWatcher: a tour guide for the world wide web.
In Proceedings of the 15th international joint conference on artificial intelligence (pp. 770–775).
5. Ingwersen, P. (1992). Information retrieval interaction. London: Taylor Graham.

Questions?

Thank You!