WWW 2006 SIC

Searching with Context
Reiner Kraft
Farzin Maghoul
Chi Chao Chang
Ravi Kumar
Yahoo! Inc., Sunnyvale, CA 94089, USA
Agenda
• Motivation
• Contextual Search
– Introduction
– Case Study: Y!Q
– Algorithms
• Query Rewriting
• Rank-Biasing
• Iterative, Filtering Meta-search (IFM)
• Evaluation and Results
• Conclusion
Motivation
• Traditional web search based on keywords: as good as it gets?
  – Not many qualitative differences between the search results of major
    search engines
  – Introducing anchor text and link analysis to improve search relevancy was
    the last major significant feature (1998)
• Search can be vastly improved in the dimension of precision
• The more we know about a user’s information need, the more precise our
  results can be
• There exists a lot of evidence (context) beyond the terms in the query box
  from which we can infer better knowledge of the information need
• Studies of web query logs show that users already employ a manual form of
  contextual search: they use additional terms to refine and reissue queries
  when the search results for the initial query turn out to be unsatisfactory
• => How can we automatically use context for augmenting, refining, and
  improving a user’s search query to obtain more relevant results?
Contextual Search - General Problems
• Gathering evidence (context)
• Representing and inferring user
information need from evidence
• Using that representation to get more
precise results
Contextual Search - Terminology
• Context
  – In general: any additional information associated with a query
  – More narrowly: a piece of text (e.g., a few words, a sentence, a
    paragraph, an article) that has been authored by someone
• Context Term Vector
  – Dense representation of a context in the vector space model
  – Obtained using keyword extraction algorithms (e.g., Wen-tau Yih et al.,
    KEA, Y! Content Analysis)
• Search Query Types
  – Simple: few keywords, no special or expensive operators
  – Complex: keywords/phrases plus special ranking operators; more expensive
    to evaluate
  – Contextual: query + context term vector
• Search Engine Types
  – Standard: web search engines (e.g., Yahoo, Google, MSN, …) that support
    simple queries
  – Modified: a web search engine that has been modified to support complex
    search queries
Case Study: Y!Q Contextual Search
• Acquiring context:
  – Y!Q provides a simple API that allows publishers to associate visual
    information widgets (actuators) with parts of page content
    (http://yq.search.yahoo.com/publisher/embed.html)
  – Y!Q lets users manually specify or select context (e.g., within Y!
    Toolbar, Y! Messenger, or via an included JavaScript library)
• Contextual Search Application
  – Generates a digest (context term vector) of the associated content piece
    as additional terms of interest for augmenting queries (content analysis)
  – Knows how to perform contextual searches against different search
    back-end providers (query rewriting framework)
  – Knows how to rank results based on query + context (contextual ranking)
  – Integrates seamlessly by displaying results in an overlay or embedded
    within the page, without interrupting the user’s workflow
Example
[Screenshot]
Example
[Screenshot: Y!Q Actuator]
Example
[Screenshot: Y!Q Overlay showing contextual search results]
Example
[Screenshot: Y!Q: Searching in Context]
Example CSRP
[Screenshot: Terms extracted from context]
Y!Q System Architecture
[Architecture diagram]
Implementing Contextual Search
• Assumption:
– We have a query plus a context term vector (contextual
search query)
• Design dimensions:
– Number of queries to send to a search engine per contextual
search query
– Types of queries to send
• Simple
• Complex
• Algorithms:
– Query Rewriting (QR)
– Rank-Biasing (RB)
– Iterative, Filtering, Meta-Search (IFM)
Algorithm 1: Query Rewriting
• Combine query + context term vector using AND/OR semantics
• Input Parameters:
– Query, context term vector
– Number of terms to consider from context term vector
• Experimental Setup:
– QR1 (takes top term only)
– QR2 (takes top two terms only)
– … up to QR5
• Example:
  – QR3: Given query q and context term vector c = (a, b, c, d)
    => q AND a AND b AND c
• Pros:
– Simplicity, supported in all major search engines
• Cons:
  – Possibly low recall for longer queries
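For illustration, a minimal Python sketch of how a QRk rewrite could be assembled; the function and the plain-list representation of the context term vector are assumptions of this sketch, not Y!Q’s actual implementation.

```python
# Minimal sketch of QR query construction. Assumes the context term
# vector is already sorted by descending term weight; illustrative only.

def rewrite_query(query: str, context_terms: list[str], k: int) -> str:
    """QRk: AND the original query with the top-k context terms."""
    return " AND ".join([query] + context_terms[:k])

# QR3 example from this slide:
print(rewrite_query("q", ["a", "b", "c", "d"], 3))
# -> q AND a AND b AND c
```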
Algorithm 2: Rank-Biasing
• Requires a modified search engine with support for a RANK operator for
  rank-biasing
• Complex query comprises:
  – Selection part
  – Optional ranking terms that only impact the score of selected documents
• Input Parameters:
  – Query, context term vector
  – Number of selection terms to consider (conjunctive semantics)
  – Number of RANK operators
  – Weight multiplier for each RANK operator (used for scaling)
• Experimental Setup:
  – RB2 (uses 1 selection term, 2 RANK operators, weight multiplier = 0.1)
  – RB6 (uses 2 selection terms, 6 RANK operators, weight multiplier = 0.01)
• Example:
  – RB2: Given q and c = ((a, 50), (b, 25), (c, 12))
    => q AND a RANK(b, 2.5) RANK(c, 1.2)
• Pros:
  – Ranking terms do not limit recall
• Cons:
  – Requires a modified search engine back-end; more expensive to evaluate
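A hedged sketch of RB query construction, following the RANK syntax of the example above; scaling the ranking-term weights by the multiplier is inferred from that example, and the function itself is an assumption (the modified back-end is not public).

```python
# Sketch of RB complex-query construction: top weighted terms become
# conjunctive selection terms, the next terms become RANK operators with
# weights scaled by the multiplier. RANK syntax follows the slide example.

def build_rb_query(query: str, weighted_terms: list[tuple[str, float]],
                   num_selection: int, num_rank: int, multiplier: float) -> str:
    selection = [term for term, _ in weighted_terms[:num_selection]]
    ranking = weighted_terms[num_selection:num_selection + num_rank]
    q = " AND ".join([query] + selection)
    for term, weight in ranking:
        q += f" RANK({term}, {weight * multiplier:g})"
    return q

# RB2 example from this slide:
print(build_rb_query("q", [("a", 50), ("b", 25), ("c", 12)], 1, 2, 0.1))
# -> q AND a RANK(b, 2.5) RANK(c, 1.2)
```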
Algorithm 3: IFM
• IFM is based on the concept of meta-search (e.g., as used in the Buying
  Guide Finder [Kraft, Stata 2003])
  – Sends multiple (simple) queries to possibly multiple search engines
  – Combines results using rank aggregation methodologies
IFM Query Generation
• Uses a “query templates” approach:
  – Query templates specify how sub-queries get constructed from the pool of
    candidate terms
  – Allows exploring the problem domain in a systematic way
  – Primarily implemented as a sliding window technique using query templates
    (see the sketch below)
  – Example: Given query q and c = (a, b, c, d), a sliding window query
    template of size 2 may construct the following queries:
    • q a b
    • q b c
    • q c d
• Parameters:
– Size of the sliding window
• Experimental Setup:
– IFM-SW1, IFM-SW2, IFM-SW3, IFM-SW4
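A minimal sketch of the sliding window query template described above, assuming the context term vector is an ordered list of terms; the function name is illustrative.

```python
# Sketch of IFM-SWn query generation: slide a window of n consecutive
# context terms over the vector and append each window to the query.

def sliding_window_queries(query: str, context_terms: list[str],
                           window: int) -> list[str]:
    return [
        " ".join([query] + context_terms[i:i + window])
        for i in range(len(context_terms) - window + 1)
    ]

# Slide example: window of size 2 over c = (a, b, c, d)
print(sliding_window_queries("q", ["a", "b", "c", "d"], 2))
# -> ['q a b', 'q b c', 'q c d']
```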
IFM uses Rank Aggregation for
combining different result sets
• Rank aggregation is a robust and principled approach for combining several
  ranked lists into a single ranked list
• Given a universe U and k ranked lists τ_1, …, τ_k over the elements of the
  universe:
  – Combine the k lists into a single list τ* such that Σ_{i=1..k} d(τ*, τ_i)
    is minimized
  – For d(·,·) we used various distance functions (e.g., Spearman footrule,
    Kendall tau)
• Parameters:
  – Style of rank aggregation:
    • Rank averaging (adaptation of the Borda voting method)
    • MC4 (based on Markov chains, more computationally expensive)
• Experimental Setup:
  – IFM-RA, IFM-MC4
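For illustration, a small sketch of the rank-averaging (Borda-style) variant; the penalty rank for documents absent from a list and the alphabetical tie-break are assumptions of this sketch, and MC4 is omitted.

```python
# Sketch of rank averaging: order the union of all documents by their
# average rank across the input lists (an adaptation of Borda voting).

def rank_average(ranked_lists: list[list[str]]) -> list[str]:
    universe = {doc for lst in ranked_lists for doc in lst}

    def avg_rank(doc: str) -> float:
        # A document missing from a list gets a penalty rank past its end.
        return sum(
            lst.index(doc) if doc in lst else len(lst)
            for lst in ranked_lists
        ) / len(ranked_lists)

    # Alphabetical tie-break keeps the output deterministic.
    return sorted(universe, key=lambda d: (avg_rank(d), d))

lists = [["d1", "d2", "d3"], ["d2", "d1", "d4"]]
print(rank_average(lists))  # -> ['d1', 'd2', 'd3', 'd4']
```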
Experimental Setup and Methodology
• Benchmark
– 200 contexts sampled from Y!Q query logs
• Tested 41 configurations
– 15 QR (Yahoo, MSN, Google)
– 18 RB (1 or 2 selection terms; 2, 4, or 6 RANK operators, 0.01, 0.1,
or 0.5 weight multipliers)
– 8 IFM (avg and MC4 on Yahoo, SW1 to SW4)
• Per-item test
  – Relevancy to the context; perceived relevancy used
  – Relevancy Judgments:
    • Yes
    • Somewhat
    • No
    • Can’t Tell
  – 28 expert judges looked at the top 3 results; 24,556 judgments in total
Example
• Context:
  – “Cowboys Cut Carter; Testaverde to Start OXNARD, Calif. Quincy Carter
    was cut by the Dallas Cowboys on Wednesday, leaving 40-year-old Vinny
    Testaverde as the starting quarterback. The team wouldn’t say why it
    released Carter.”
• Judgment Examples:
  – A result directly relating to the “Dallas Cowboys” (football team) or
    Quincy Carter => Yes
  – A result repeating the same or similar information => Somewhat
  – A result about Jimmy Carter, the former U.S. president => No
  – A result that doesn’t provide sufficient information => Can’t Tell
Metrics
• Strong Precision at 1 (SP@1) and 3 (SP@3)
  – Number of relevant results among the top 1 (or 3) retrieved results,
    expressed as a ratio
  – A result is considered relevant if and only if it receives a ‘Yes’
    relevance judgment
• Precision at 1 (P@1) and 3 (P@3)
  – Number of relevant results among the top 1 (or 3) retrieved results,
    expressed as a ratio
  – A result is considered relevant if and only if it receives a ‘Yes’ or
    ‘Somewhat’ relevance judgment
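A small sketch of these metrics for a single query’s judged results; the judgment labels are abbreviated here as 'Y', 'S', 'N', and 'C' (can’t tell).

```python
# Sketch of P@k and SP@k as defined on this slide: relevant results among
# the top k, divided by the number of retrieved results (capped at k).

def precision_at_k(judgments: list[str], k: int, strong: bool = False) -> float:
    relevant = {"Y"} if strong else {"Y", "S"}
    top = judgments[:k]
    if not top:
        return 0.0
    return sum(j in relevant for j in top) / len(top)

judged = ["Y", "S", "N"]
print(precision_at_k(judged, 3))               # P@3  = 2/3
print(precision_at_k(judged, 3, strong=True))  # SP@3 = 1/3
```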
Coverage Results
• Coverage Drop (=0): percentage of contexts with zero results

            QR1   QR2   QR3   QR4   QR5
  MSN        0     0     1     6     9
  Yahoo      0     0     1     6     9
  Google     0     0     3     4     5

• Coverage Drop (<3): percentage of contexts with fewer than 3 results

            QR1   QR2   QR3   QR4   QR5
  MSN        0     1    11    21    28
  Yahoo      0     3    11    20    26
  Google     0     0     4     7    12
• Highlights
  – Substantial drop in recall as the number of vector entries in QR
    increases (expected); comparable between MSN and Yahoo, roughly one
    order of magnitude less on Google
  – For QR4 on MSN and Yahoo, the low recall may affect the user experience
  – The RB configurations tested have the same recall as QR2
  – IFM works on a substantially larger set of candidate results
Relevance Results for QR
• Strong Precision @ 3

            QR1     QR2     QR3     QR4     QR5
  MSN      0.250   0.364   0.390   0.396   0.358
  Yahoo    0.250   0.375   0.397   0.416   0.394
  Google   0.254   0.384   0.395   0.394   0.404

• Precision @ 3

            QR1     QR2     QR3     QR4     QR5
  MSN      0.504   0.687   0.770   0.775   0.757
  Yahoo    0.496   0.688   0.758   0.801   0.780
  Google   0.489   0.717   0.784   0.801   0.802
• Highlights
  – Metrics used: P@1, P@3, SP@1, SP@3
  – SP drops sharply for MSN and Yahoo beyond QR4 (recall issues)
  – Optimal operating point: QR3/QR4 for MSN and Yahoo, QR5 for Google
  – QR4 uses 7.3 terms on average, QR5 uses 8.51 terms on average
Relevance Results for RB and IFM
• RB/IFM Precision

                 P@1     P@3
  RB2           0.803   0.742
  RB6           0.755   0.684
  IFM-RA-SW1    0.524   0.502
  IFM-RA-SW2    0.803   0.730
  IFM-RA-SW3    0.887   0.794
  IFM-RA-SW4    0.855   0.785
  IFM-MC4-SW1   0.503   0.497
  IFM-MC4-SW2   0.797   0.721
  IFM-MC4-SW3   0.870   0.787
  IFM-MC4-SW4   0.845   0.762
• Highlights
  – RB2/RB6 are the best configurations among the RBs; RB2 has the highest SP@1
  – IFM-RA-SW3 is the overall winner (best P@1)
Discussion of Results
• Simple QR can attain high relevancy
  – However, precision decreases as a function of low recall
  – The optimal setting depends on the web search engine
• Human reformulations are unlikely to attain the same level of relevancy as
  QR (the best QR1 issues 2.25 terms on average and attains P@3 of 0.504)
• RB can perform competitively
  – Particularly at SP@1
  – Additional experiments showed that some good results bubble up from the
    middle tier of results (ranked between positions 100 and 1000)
  – Does not do well on SP@3 (a problem if the “right” results are not
    recalled by the selection part)
  – Requires substantial modifications to a web search engine
• Contextual search is not solely a ranking problem, but also one of recall
• IFM
  – Achieves the highest recall and overall relevancy
  – Can be competitive with and, in some measures, superior to QR
  – More costly to execute
Conclusion
• Investigated three algorithms for implementing contextual search:
  – QR
  – RB
  – IFM
• QR
  – Can be easily implemented on top of a commodity search engine
  – Performs surprisingly well
  – Likely to be superior to manual query reformulation
  – Has recall problems
• RB and IFM break the recall limitations of QR
• IFM is very effective
  – Outperforms both QR and RB in terms of recall and precision
• The three algorithms offer a good design spectrum for contextual search
  implementers
Future Work
• Further tuning of contextual search
algorithms
• Alternative presentations of context
• Improve relevancy of context term vectors
• Better word sense disambiguation
• Investigate the usage of different context
types (e.g., time, location, user profiles)
• Improve contextual ranking and blending of
different source types
• How to leverage semantic web technologies
• …
Interested? Email your resume to: thinkbig@yahoo-inc.com
Backup Slides
Example Search Scenarios
• User wants to find the nearest movie theater
– Context: location
– Query: “movie theater”
• User reads a press article about the new Mac OS X Tiger and
wants to learn more about it
– Context: news article
– Query: review
• User signs in to Yahoo! and wants to plan a trip to ‘Java’
  – Context: search history, user preferences
  – Query: java
• => The query alone is not sufficient!
• => Context is critical for returning relevant results
• => Users often manually append context in the form of extra query terms