Information Filtering - School of Information

Collaborative Information Retrieval
- Collaborative Filtering systems
- Recommender systems
- Information Filtering
• Why do we need CIR?
- IR system augmentation
- Filtering
• Focusing on the user
- People-centric view of data
- Linking users by interests
Recommender Systems
• Broader term than CF; may not involve explicit
collaboration
• We get recommendations every day
• Types of recommendations
- Implicit
- Explicit
• Properties of recommendations
- Identity
- Experts
• Use of recommendations
- Aggregation from data
- Leveraging naturally occurring factors
Recommendation Issues
• How do you get people to cooperate?
• How good can the recommendations be?
- Find things you’d never find?
- Step savings, information navigation
• Volume of recommendations vs. number of
recommendable items?
• How accurate can the recommendations be?
- Initially
- Overall
- Over time
• What about changing interests?
Social Issues
• Who controls the sharing?
• Who controls the controls?
• “Give to get” systems
• Anonymity vs. Community
- Community of “friends”
- People as data points
• Free riders
• Logrolling and Over-rating
Information Filtering & IR
• How about filtering, without the
collaboration?
- Individual preferences
- Implicit and Explicit
• Text is analyzed
- Feature extraction
- Recall & precision measures
• Vector space identified
• Relevance Feedback
- Matched with user or rating
- Attributes are matched or added to queries
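The filtering steps above (feature extraction, a vector space, relevance feedback that adds attributes to the query) can be sketched as follows. This is only an illustration: it assumes bag-of-words term counts as features and uses a Rocchio-style update, which the slides do not name. The function names and the alpha, beta and gamma weights are assumptions.

```python
from collections import Counter


def term_vector(text):
    """Feature extraction (assumed): lowercase bag-of-words term counts."""
    return Counter(text.lower().split())


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio-style relevance feedback over sparse term vectors.

    Terms from documents rated relevant are added to the query and
    terms from documents rated non-relevant are down-weighted.
    Terms whose updated weight is not positive are dropped.
    """
    terms = set(query)
    for doc in relevant + non_relevant:
        terms.update(doc)

    updated = {}
    for term in terms:
        pos = sum(d.get(term, 0) for d in relevant) / len(relevant) if relevant else 0.0
        neg = sum(d.get(term, 0) for d in non_relevant) / len(non_relevant) if non_relevant else 0.0
        weight = alpha * query.get(term, 0) + beta * pos - gamma * neg
        if weight > 0:
            updated[term] = weight
    return updated


# Hypothetical example: one document rated relevant, one rated non-relevant.
query = term_vector("jazz")
new_query = rocchio(
    query,
    relevant=[term_vector("jazz piano trio")],
    non_relevant=[term_vector("piano lessons")],
)
```

After feedback the query also carries "piano" and "trio", while "lessons" is dropped because it only occurs in the non-relevant document.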
Two sides of the same coin?
• Filtering is removing data, IR is finding data
• Dynamic datasets
• Profile-based - preferences
• Repeated use of the system, long term
interests
• Precision & Recall of profiles, not info?
• Different needs & motivations
• Less interactive than (Web) IR?
Community Centered CF
• What is a community?
• Helping people find new information
• Mapping community (prefs?)
• Rating Web pages
• Recommended Web pages
- Measuring recommendation quantity?
- Measuring recommendation use
• Constant status
Community CF
• “Personal relationships are not necessary”
• What does this miss?
• If you knew about the user, would that help
with the cold start problem?
• Advisors & Trust
• Ratings
- Population wide
- Advisors
- Weighted sum
• How would an organization use this?
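One way to read the "Weighted sum" bullet above is as a blend of the population-wide average rating and the average rating from a user's chosen advisors. The sketch below assumes that reading. The function name, the 0.7 default weight and the fallback rules are illustrative assumptions, not the method of any particular system.

```python
def combined_rating(population_ratings, advisor_ratings, advisor_weight=0.7):
    """Blend advisor and population-wide ratings with a weighted sum.

    advisor_weight is a hypothetical trust parameter in [0, 1].
    If one group has no ratings, the other group's mean is used alone.
    Returns None when nobody has rated the item.
    """
    pop_mean = sum(population_ratings) / len(population_ratings) if population_ratings else None
    adv_mean = sum(advisor_ratings) / len(advisor_ratings) if advisor_ratings else None

    if pop_mean is None and adv_mean is None:
        return None
    if adv_mean is None:
        return pop_mean
    if pop_mean is None:
        return adv_mean
    return advisor_weight * adv_mean + (1 - advisor_weight) * pop_mean


# Hypothetical ratings on a 1-5 scale.
score = combined_rating(population_ratings=[2, 4], advisor_ratings=[5])
```

Raising advisor_weight makes trusted advisors count for more. Setting it to 0 reduces the score to the population-wide average.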
Contexts for Implicit Ratings
• Who
• When
• What
• How (discovery)
- Web Browsing
- RSS Reading
- Blog posting
- Newsgroup/listserv use
Social Affordance & Implicit
• How can you not use ratings?
• Read wear, clicks, dwell time, chatter
• Not all resources are as identifiable
- Granular - Web pages
- Items - commercial products
• Web is a shared information space without
much sharing
• How do you incent people to contribute?
- Social norms
- Rewards
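Implicit signals such as clicks, dwell time and chatter can be folded into a single score without asking anyone for a rating. The sketch below is one possible combination. The signal names, the weights and the logarithmic damping are all assumptions chosen for illustration.

```python
import math


def implicit_score(clicks, dwell_seconds, mentions,
                   w_click=1.0, w_dwell=1.0, w_mention=2.0):
    """Combine implicit signals into one non-negative score.

    log1p damps each signal so that one very long visit or one very
    active user does not dominate. All weights are hypothetical.
    """
    return (
        w_click * math.log1p(clicks)
        + w_dwell * math.log1p(dwell_seconds / 60.0)  # dwell time in minutes
        + w_mention * math.log1p(mentions)
    )


# Hypothetical usage logs for two Web pages.
page_a = implicit_score(clicks=12, dwell_seconds=300, mentions=3)
page_b = implicit_score(clicks=2, dwell_seconds=20, mentions=0)
```

A score like this only ranks resources that can be identified reliably, which is the granularity problem raised above.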
Contexts for Explicit Ratings
• Movies
• Books
• (Junk) mail
• eBay transactions
• Other content
PHOAKS
• Wider group of people (anyone?)
• Usenet link mining for Web resources
• Raters & Users
• Precision (88%) - belong in category
• Recall (87%) - rules classify as category
• What counts as a recommendation?
- More than one mention?
- Positive & negative?
• Fair and balanced for a Community
• How do you rank resources?
- Weights
- Topics
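Precision and recall figures like those quoted above can be computed by comparing what the classification rules flag against a hand-labelled set. The sketch below assumes both are available as sets of identifiers. The example data is invented and is not the PHOAKS data.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for a set of predicted items.

    precision: share of predicted items that truly belong in the category.
    recall: share of true category members that the rules found.
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical example: links the rules classified as recommendations,
# compared with links a human judged to be recommendations.
flagged = {"url1", "url2", "url3", "url4"}
judged = {"url1", "url2", "url3", "url5", "url6"}
precision, recall = precision_recall(flagged, judged)
```

Here three of the four flagged links are correct and three of the five judged links were found.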
Fab
• Beyond “black box” content
• Combining recommendations & content
• Tastes in the past & future likes
• Identifies “emerging interests”
- Group awareness
- Communication (feedback)
• Profiles of content analysis compared
- Users’ own profile can recommend
- Relation between users can recommend
• User profile = multiple interests
• Content profile = static interest
• Both may change
• Items are continually presented to users
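The idea of combining content analysis with recommendations can be sketched by comparing profiles as term-weight vectors. This is an illustrative hybrid, not Fab's actual algorithm. The profile representation, the cosine measure, the function names and the mix parameter are all assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight profiles."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_score(user, item, neighbours, liked_by, mix=0.5):
    """Blend a content match with a collaborative signal.

    content: similarity of the user's own profile to the item's profile.
    social: similarity to the most similar user who liked the item.
    mix is a hypothetical weight between the two parts.
    """
    content = cosine(user, item)
    social = max(
        (cosine(user, neighbours[name]) for name in liked_by if name in neighbours),
        default=0.0,
    )
    return mix * content + (1 - mix) * social


# Hypothetical profiles: the item does not match the user's own profile,
# but a like-minded user ("ann") liked it.
user = {"jazz": 1.0}
neighbours = {"ann": {"jazz": 1.0}}
score = hybrid_score(user, {"rock": 1.0}, neighbours, liked_by=["ann"])
```

The example shows how a relation between users can surface an item that the user's own profile would not recommend.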
Future Issues in Collab IR
• It may be more interesting to find a like mind
than a resource recommendation
- Social Networking
- Ad hoc group discussions
• Allowing users control over their profile of
interests
- Over time
- Privacy
- Difficult to capture interests
• Working with diverse content or user interests
• Visualization of recommendations & areas
Collaboration
• How important is it to be able to collaborate?
- Add to your own intelligence
- Know about other things you don’t know about
• What are the best scenarios for collaboration
for Information Retrieval?
- Privacy
- Commerce
- Consistency
Is Filtering a Necessary Evil?
• What are the Costs of Content Filtering?
• Do you want filtering?
- What kind of filters?
- Who should control them?
• What is the importance of accuracy for filtering?
- Metadata
- Usage and appropriate content (not just for children)
• Sharing filtering?
Bonus Work
• Up to 4 points on your final course average
- Size of the project
- Quality of project work
• Individual work
• Bibliography building & highlight reviews
- Collaborative Filtering since 1998
- Information Seeking in Financial Environments
- IR & Agents since 1999
• IR resources organization & taxonomy