Web Image Retrieval Re-Ranking with Relevance Model
Wei-Hao Lin, Rong Jin, Alexander Hauptmann
Language Technologies Institute, School of Computer Science, Carnegie Mellon University
International Conference on Web Intelligence (WIC'03)
Presented by Chu Huei-Ming, 2005/02/24

Reference
• Victor Lavrenko and W. Bruce Croft, "Relevance Models in Information Retrieval," pp. 11-56, Kluwer Academic Publishers, 2003. Center for Intelligent Information Retrieval, Department of Computer Science, University of Massachusetts (UMass).

Outline
• Introduction
• Relevance Model
• Web image retrieval re-ranking
• Estimating a Relevance Model
  – Estimation from a set of examples
  – Estimation without examples
• Ranking Criterion
• Experiment
• Conclusion

Introduction (1/2)
• Most current large-scale web image search engines exploit text and link structure to "understand" the content of web images.
• This paper proposes a re-ranking method that improves web image retrieval by reordering the images retrieved from an image search engine.
• The re-ranking process is based on a relevance model.

Introduction (2/2)
• [Figure: the web image retrieval pipeline with relevance model re-ranking]

Relevance Model (1/2)
• Mathematical formalism
  – $V$ is a vocabulary in some language.
  – $C$ is a large collection of documents.
  – Define the relevant class $R$ to be the subset of documents in $C$ which are relevant to some particular information need.
  – Define the relevance model to be the probability distribution $\Pr(w \mid R)$.
  – For every word $w$, the relevance model gives the probability that we would observe $w$ if we randomly selected some document $D$ from the relevant class $R$ and then picked a random word from $D$.

Relevance Model (2/2)
• The important issue in IR is capturing the topic discussed in a sample of text, and to that end unigram models fare quite well.
• The choice of estimation technique has a particularly strong influence on the quality of relevance models.

Web image retrieval re-ranking (1/2)
• For each image I in the ranked list returned from a web image search engine, there is one associated HTML document D.
• Can we estimate the probability that the image is relevant given the text of document D, i.e., Pr(R|D)?
• By Bayes' Theorem:
$$\Pr(R \mid D) = \frac{\Pr(D \mid R)\,\Pr(R)}{\Pr(D)} \quad (1)$$
  – Pr(D) is equal for all documents: assume every document is equally probable.
  – Pr(D|R) is what needs to be estimated if we want to know the relevance of the document.

Web image retrieval re-ranking (2/2)
• Suppose the document D consists of words $w_1, w_2, \ldots, w_n$.
• Apply the common word-independence assumption:
$$\Pr(D \mid R) = \prod_{i=1}^{n} P(w_i \mid R) \quad (2)$$
• Pr(w|R) can be estimated without training data (a minimal code sketch of this scoring step is given below).

Estimating a Relevance Model
• Estimation from a set of examples
  – We have full information about the set R of relevant documents.
• Estimation without examples
  – We have no examples from which we could estimate $P(w \mid R)$ directly.
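To make the scoring step of equation (2) concrete before turning to estimation, here is a minimal sketch, assuming the relevance model is already available as a word-to-probability dictionary. The names `score_document` and `rerank` and the `floor` parameter are illustrative, not from the paper.

```python
import math

def score_document(words, relevance_model, floor=1e-9):
    """log Pr(D|R) under equation (2): sum of log Pr(w|R) over the words of D.
    Words unseen by the relevance model get a small floor probability so the
    log stays defined (a crude stand-in for the smoothing discussed later)."""
    return sum(math.log(relevance_model.get(w, floor)) for w in words)

def rerank(images_with_docs, relevance_model):
    """Reorder (image_id, document_words) pairs so that images whose
    associated HTML text is most probable under the relevance model come first."""
    return sorted(images_with_docs,
                  key=lambda pair: score_document(pair[1], relevance_model),
                  reverse=True)
```

Working in log space avoids numerical underflow when multiplying many small word probabilities.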
Estimation from a set of examples (1/2)
• Assume we have perfect knowledge of the entire relevant class R.
• The probability distribution P(w|R) is the probability that a word picked at random from a random document $D \in R$ will be the word w.
• Let $p(D \mid R)$ denote the probability of randomly picking document D from the relevant set R. Assume each relevant document is equally likely to be picked at random:
$$p(D \mid R) = \begin{cases} 1/|R| & \text{if } D \in R \\ 0 & \text{otherwise} \end{cases} \quad (3)$$
• |R| is the total number of documents in R.

Estimation from a set of examples (2/2)
• The probability of randomly picking a document D and then observing the word w is
$$P(w, D \mid R) = P_{ml}(w \mid D)\, p(D \mid R) \quad (4)$$
$$P_{ml}(w \mid D) = \#(w, D) / |D| \quad (5)$$
• We assume that the document model of D completely determines word probabilities: once D is fixed, the probability of observing w is independent of the relevant class R and depends only on D:
$$P(w \mid R) = \sum_{D \in C} P_{ml}(w \mid D)\, p(D \mid R) \quad (6)$$
• Smoothing is achieved by interpolating the maximum-likelihood probability from (5) with some background distribution P(w) over the vocabulary:
$$P_{smooth}(w \mid D) = \lambda_D P(w \mid D) + (1 - \lambda_D) P(w) \quad (7)$$

Estimation without examples (1/6)
• In ad-hoc information retrieval, we have only a short 2-3 word query, indicative of the user's information need, and no examples of relevant documents.

Estimation without examples (2/6)
• Assume that for every information need there exists an underlying relevance model R.
• It assigns the probabilities $P(w \mid R)$ to the word occurrences in the relevant documents:
$$P(w \mid R) \approx P(w \mid Q) \quad (8)$$
• Given a large collection of documents and a user query $Q = q_1, q_2, \ldots, q_k$:
$$P(w \mid R) \approx P(w \mid Q) = \frac{P(w, q_1, q_2, \ldots, q_k)}{P(q_1, q_2, \ldots, q_k)} \quad (9)$$
$$P(w \mid Q) \propto P(w, q_1, q_2, \ldots, q_k) \quad (10)$$

Estimation without examples (3/6)
• Method 1: i.i.d. (random) sampling.
• Assume that the query words $q_1, q_2, \ldots, q_k$ and the word w in relevant documents are sampled identically.
• Pick a distribution $D \in C$ with probability p(D) and sample from it k+1 times. Then the total probability of observing w together with $q_1, q_2, \ldots, q_k$ is
$$P(w, q_1, \ldots, q_k) = \sum_{D \in C} p(D)\, P(w, q_1, \ldots, q_k \mid D) \quad (11)$$
• Assume w and all $q_i$ are sampled independently and identically of each other:
$$P(w, q_1, \ldots, q_k \mid D) = P(w \mid D) \prod_{i=1}^{k} P(q_i \mid D) \quad (12)$$
• Final estimate:
$$P(w, q_1, \ldots, q_k) = \sum_{D \in C} p(D)\, P(w \mid D) \prod_{i=1}^{k} P(q_i \mid D) \quad (13)$$

Estimation without examples (4/6)
• Method 2: conditional sampling.
• Use the chain rule and make the assumption that query words are independent given the word w:
$$P(w, q_1, \ldots, q_k) = P(w) \prod_{i=1}^{k} P(q_i \mid w, q_{i-1}, \ldots, q_1) \quad (14)$$
$$\approx P(w) \prod_{i=1}^{k} P(q_i \mid w) \quad (15)$$
• To estimate the conditional probabilities $P(q_i \mid w)$, we compute an expectation over the universe C of our unigram models:
$$P(q_i \mid w) = \sum_{D_i \in C} P(q_i \mid D_i)\, P(D_i \mid w) \quad (16)$$

Estimation without examples (5/6)
• Make the additional assumption that $q_i$ is independent of w once we have picked a distribution $D_i$.
• The final estimate for the joint probability of w and the query is then
$$P(w, q_1, \ldots, q_k) = P(w) \prod_{i=1}^{k} P(q_i \mid w) \quad (17)$$
$$= P(w) \prod_{i=1}^{k} \sum_{D_i \in C} P(q_i \mid D_i)\, P(D_i \mid w) \quad (18)$$

Estimation without examples (6/6)
• The word prior probability is
$$P(w) = \sum_{D \in C} P(w \mid D)\, P(D) \quad (19)$$
• The probability of picking a distribution $D_i$ based on w is
$$P(D_i \mid w) = \frac{P(w \mid D_i)\, P(D_i)}{P(w)} \quad (20)$$
• $P(D_i)$ is kept uniform over all the documents in C.

Comparison of Estimation Methods 1 and 2
• Probability Ranking Principle: documents are ranked by decreasing probability ratio
$$\frac{P(d_1, \ldots, d_n \mid R)}{P(d_1, \ldots, d_n \mid N)} = \prod_{i=1}^{n} \frac{P(d_i \mid R)}{P(d_i \mid N)}$$
• Cross-entropy:
$$H(R \parallel D) = -\sum_{w} P(w \mid R) \log P(w \mid D)$$

Ranking Criterion (1/2)
• Ranking by (2) will favor short documents.
• Use the Kullback-Leibler (KL) divergence to avoid the short-document bias:
$$D\big(\Pr(\cdot \mid D_i) \,\|\, \Pr(\cdot \mid R)\big) = \sum_{v} \Pr(v \mid D_i) \log \frac{\Pr(v \mid D_i)}{\Pr(v \mid R)}$$
• $\Pr(w \mid D_i)$ is the unigram model of the document associated with the rank-i image in the list.
• $\Pr(w \mid R)$ is the aforementioned relevance model.

Ranking Criterion (2/2)
• [Figure]
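The estimation machinery above can be summarized in a short sketch of Method 1 (equations 7 and 11-13), assuming documents arrive as lists of tokens, a uniform prior p(D), and the collection frequency as the background distribution P(w). All function and variable names here are illustrative, not the authors' implementation.

```python
from collections import Counter
from math import prod

def doc_model(words, background, lam=0.6):
    """Smoothed unigram model of one document, per equation (7):
    P_smooth(w|D) = lam * P_ml(w|D) + (1 - lam) * P(w)."""
    counts, total = Counter(words), len(words)
    return lambda w: lam * counts[w] / total + (1 - lam) * background.get(w, 0.0)

def estimate_relevance_model(query, docs, vocab):
    """Method 1, i.i.d. sampling (equation 13):
    P(w, q1..qk) = sum_D p(D) P(w|D) prod_i P(qi|D),
    normalized over the vocabulary to give P(w|R) ~ P(w|Q)."""
    # Background distribution P(w): relative frequency over the whole collection.
    all_words = [w for d in docs for w in d]
    background = {w: c / len(all_words) for w, c in Counter(all_words).items()}

    models = [doc_model(d, background) for d in docs]
    p_d = 1.0 / len(docs)                                 # uniform prior p(D)
    q_lik = [prod(m(q) for q in query) for m in models]   # prod_i P(qi|D)

    joint = {w: sum(p_d * lik * m(w) for m, lik in zip(models, q_lik))
             for w in vocab}
    z = sum(joint.values())   # normalizer, equals P(q1, ..., qk)
    return {w: pw / z for w, pw in joint.items()}
```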
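Building on that sketch, the KL-divergence ranking criterion can be applied as follows; `kl_divergence` and `rerank_by_kl` are hypothetical names, and `p_rel` is a relevance model such as the dictionary produced above.

```python
import math

def kl_divergence(p_doc, p_rel, vocab):
    """D(Pr(.|Di) || Pr(.|R)) = sum_v Pr(v|Di) log(Pr(v|Di) / Pr(v|R)).
    Terms with Pr(v|Di) = 0 contribute nothing; Pr(v|R) is assumed
    nonzero on the vocabulary thanks to smoothing."""
    total = 0.0
    for v in vocab:
        pd = p_doc(v)
        if pd > 0.0:
            total += pd * math.log(pd / p_rel[v])
    return total

def rerank_by_kl(images, doc_models, p_rel, vocab):
    """Sort images so that the document closest to the relevance model
    (smallest KL divergence) comes first."""
    scored = zip(images, (kl_divergence(m, p_rel, vocab) for m in doc_models))
    return [img for img, _ in sorted(scored, key=lambda t: t[1])]
```

Ranking by divergence rather than by the raw product in equation (2) is what removes the short-document bias, since the divergence compares normalized distributions instead of accumulating one log term per word.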
Experiment (1/4)
• The idea of re-ranking is tested on six text queries to a large-scale web image search engine, Google Image Search. From July 2001 to March 2003, it indexed 425 million images.
• The six queries are chosen from image categories in the Corel Image Database.
• Each text query is typed into Google Image Search and the top 200 entries are saved for evaluation.
• The 1200 images for the six queries are fetched and manually labeled into three categories: relevant, ambiguous, irrelevant.

Experiment (2/4)
• [Figure]

Experiment (3/4)
• For each query, the same keywords are sent to Google Web Search to obtain a list of relevant documents via the Google Web APIs.
• The top-ranked 200 web documents are stripped of all HTML tags; words appearing in the INQUERY stop-word list are filtered out, and the remaining words are stemmed with the Porter algorithm.
• The smoothing parameter $\lambda$ is set to 0.6.

Experiments (4/4)
• [Figure: the average precision at DCP (document cut-off point) over the six queries]

Conclusion
• Re-ranking improves the average precision at the top 50 documents from the original 30-35% to 45%.
• Internet users usually have limited time and patience; high precision among top-ranked documents saves users a lot of effort and helps them find relevant images more easily and quickly.
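For completeness, here is a minimal sketch of the document preparation described in Experiment (3/4), assuming NLTK for the Porter stemmer and an INQUERY-style stop-word list in a local file (`stopwords.txt` is a hypothetical path); the HTML stripping here is deliberately naive compared to a real parser.

```python
import re
from nltk.stem import PorterStemmer  # pip install nltk

def preprocess(html, stopword_file="stopwords.txt"):
    """Strip HTML tags, drop stop words, and Porter-stem the rest,
    mirroring the document preparation in Experiment (3/4)."""
    with open(stopword_file) as f:
        stopwords = {line.strip().lower() for line in f}
    text = re.sub(r"<[^>]+>", " ", html)          # naive tag removal
    tokens = re.findall(r"[a-z]+", text.lower())  # keep alphabetic tokens
    stemmer = PorterStemmer()
    return [stemmer.stem(t) for t in tokens if t not in stopwords]
```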