Information Filtering Personalization

Information Filtering /
Personalization
Luz M. Quiroga
Stimulate 2005
IF-Personalization / Luz M. Quiroga
Information Filtering (IF) /
Personalization

What do we understand by IF?

How does IF differ from IR?

Why might we need it?

What does personalization mean to you?

Do you make use of it? For what purpose?
IF / personalization
issues / related concepts







Blocking, delivering
Profiles, information needs, user modeling
Organizing, searching, finding, discovering
Web design, usability, personas
Database, web, e-mail, distribution lists, blogs, community of practice
Recommenders, alerts, agents
Privacy, ethics, trust

From class feedback
Information Filtering: variants







SDI (selective dissemination of information)
Current awareness
Alert
Routing
Customization
Recommenders
Personalization
Main concepts in IF




Information Filtering vs. Information Retrieval (definition)
Profiles
User models
Agents
IF vs. IR: Definitions of IF



“a field of study designed for creating a systematic approach to extracting information that a particular person finds important from a larger stream of information” (Canavese 1994, p.2).
“tools … which try to filter out irrelevant material” (Khan & Card 1997, p.305)
a process of selecting things from a larger set of possibilities, then presenting them in a prioritized order (Malone et al. 1987).
Defining Information Filtering
Belkin & Croft, 1992. “IF and IR: two sides of the same coin”
• Typical characteristics of the IF process
• Document set: Dynamic
• Information need: Stable, long term, specified in a profile
• Profile: Highly personalized
• Selection process: Delegated
• Filtering: “the process of determining which profiles have a high probability of being satisfied by particular object from the incoming stream”
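The filtering definition above can be illustrated with a minimal Python sketch. It is only an illustration, not the method of Belkin & Croft: it assumes that a profile is a set of keywords and that the "probability of being satisfied" can be approximated by the fraction of profile terms found in a document. The names `tokenize`, `score`, `route`, the sample profiles, and the 0.5 threshold are all hypothetical.

```python
def tokenize(text):
    """Lowercase, split on whitespace, and strip simple punctuation."""
    words = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return [w for w in words if w]


def score(profile_terms, document):
    """Fraction of profile terms that occur in the document (0.0 to 1.0)."""
    if not profile_terms:
        return 0.0
    doc_terms = set(tokenize(document))
    return len(profile_terms & doc_terms) / len(profile_terms)


def route(document, profiles, threshold=0.5):
    """Return (user, score) pairs for the profiles this document is likely
    to satisfy, best match first. The threshold is an assumed cut-off."""
    scored = [(user, score(terms, document)) for user, terms in profiles.items()]
    matches = [(user, s) for user, s in scored if s >= threshold]
    return sorted(matches, key=lambda m: m[1], reverse=True)


# Stable, long-term information needs (hypothetical users and terms).
profiles = {
    "ana": {"filtering", "profiles"},
    "ben": {"museum", "guides"},
}

# Dynamic document set: items arrive one at a time.
stream = [
    "Building user profiles for information filtering",
    "Personalized guides for museum visitors",
    "Weather report for Tuesday",
]

for doc in stream:
    print(doc, "->", route(doc, profiles))
```

Note the direction of the match: in IR a query is run against a stored collection, whereas here each incoming document is run against the stored profiles.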
Retrieval System Model (Douglas Oard)
[Diagram. Components: User, Query Formulation, Detection, Selection, Examination, Delivery; Docs, Indexing, Index]
IF System Model
[Diagram. Components: User profile, Information need (long term), Profile acquisition, Detection, Selection (delegated: agent), Examination, Delivery; Docs (dynamic), Indexing, Index]
Why do we need IF?


Internet growth is exponential: see the MIDS (Matrix Information and Directory Services) home page: http://www.mids.org/
One impact of the Internet is that anyone with access to it can become an author and a publisher. As a consequence, the quality of the information found on the Internet is extremely diverse, and the quantity of information available is enormous (Lynch 1997)
Information overload
Information overload


With the explosion of information, the major concern is not availability but obtaining the right information. Information that is highly important for one individual has no meaning for many others.
“at least 99% of available data is of no interest to at least 99% of the users” (Bowman et al. 1994, p. 106).
The need for IF: History



1945: Vannevar Bush / Memex
“... There is a new profession of trail blazers, those who find delight in the task of establishing useful trails through the enormous mass of the common record..”
1958: Luhn / Selective Dissemination of Information
1965: Ted Nelson / Xanadu / Hypertext
... Professionals who would compete to create better trails, which would attract more users and royalties ...
The need for IF: History

1969: Hollis & Hollis: “Personalizing Information processes”
The amount of information was doubling every seven to ten years
1982: Denning (ACM president) / Filtering e-mail
1987: Malone / Social filtering (collaboration, annotation in documents, groupware)
The need for IF: History
Information Filtering / User profiles / Agents

We need a system that selectively weeds out irrelevant information based on the user's preferences (user profile)

The system acts on behalf of the user and delivers selected, prioritized information (active, agent)
Profiles

User characteristics; user preferences

Profiles are the basis for the performance of IF
systems:



“the construction of accurate profiles is a key task -- the system’s success will depend to a large extent on the ability of the learned profile to represent the user’s actual interest” (Balabanović & Shoham 1997, p.68)
Building a “good” profile is still the central obstacle to achieving reasonable performance in IF systems
Need: evaluation of IF (profiles)

Fidel (corporations’ employees)

Quiroga (consumer health information systems)
User modeling


To build a good system in which a person and a machine cooperate to perform a task, it is important to take into account some significant characteristics of people (Elaine Rich, 1983)
User models are personal characteristics of the user that the system maintains (Chris Borgman)

A profile can be thought of as a user model.
Profiles, IF and User modeling
All information filtering models and systems are based on modeling the user and presenting his or her information needs in the form of a profile [1]
A conceptual framework for the design of IF systems comes from two established lines of research: IR & User Modeling [2]
[1] Shapira, Peretz & Hanani. Dept. of Industrial Engineering, Ben Gurion University; Dept. of IS, Bar-Ilan University
[2] Oard & Marchionini. University of Maryland
Agents



Software programs that implement user delegation [1]
A personal assistant who collaborates with the user in the same work environment; information filtering is one of the many applications an agent can assist with [2]
Mental agents / Society of agents. Each mental agent can perform only a small process; joining these agents in societies leads to true intelligence [3]
[1] Jansen, James. PhD candidate, Texas University, Computer Sc.; US Military Academy. Research: combination of agents & search engines
[2] Maes, Pattie. MIT Media Lab. Research: AI
[3] Minsky, Marvin. The Society of Mind, 1986
Types of user models (Rich)
Depending on:

The user being modeled



Individual
Canonical (stereotype; group)
Acquisition model


Explicit (stated)
Implicit (inferred)
Individual / Canonical user models
(Elaine Rich)


Individual: Each user has an interface appropriate to his/her needs; emphasis on individual differences
Canonical [stereotype, group]: The user is part of a group; one interface for the group; emphasis on what the group has in common

Shared knowledge; communities of practice
Collaborative filtering
Influencing the design of web sites for e-commerce
Individual / Canonical user models
(Elaine Rich)
GRUNDY: an example of a canonical type of user model
• A case study in the use of stereotypes
• Grundy recommends novels that people might like to read
• Stereotypes contain facets that relate to people’s taste in books
• Grundy learns from user feedback: have they read it / liked it (reinforcement); if not, why?
• Experiments showed that Grundy does significantly better with the user model than without it
• It is a good start toward the construction of individual models
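The stereotype idea can be sketched in a few lines of Python. This is not Grundy's actual implementation: the stereotypes, facets, books, numeric values, and the functions `build_model`, `match`, `recommend`, and `apply_feedback` are all invented for illustration. The sketch assumes that a user model is the average of the facet values of the stereotypes triggered for that user, and that feedback simply nudges the facets of the rated book.

```python
# Hypothetical stereotypes: each maps facets of taste to a value in [-1, 1].
STEREOTYPES = {
    "sports_fan": {"action": 0.8, "romance": -0.2},
    "academic": {"philosophy": 0.9, "action": -0.3},
}

# Hypothetical catalogue: each book is described by the facets it exhibits.
BOOKS = {
    "Thriller A": {"action": 1.0},
    "Essay B": {"philosophy": 1.0},
    "Love Story C": {"romance": 1.0},
}


def build_model(active_stereotypes):
    """Average the facet values over the stereotypes triggered for a user."""
    totals, counts = {}, {}
    for name in active_stereotypes:
        for facet, value in STEREOTYPES[name].items():
            totals[facet] = totals.get(facet, 0.0) + value
            counts[facet] = counts.get(facet, 0) + 1
    return {facet: totals[facet] / counts[facet] for facet in totals}


def match(model, book_facets):
    """Weighted agreement between the user model and a book's facets."""
    return sum(model.get(f, 0.0) * w for f, w in book_facets.items())


def recommend(model, exclude=()):
    """Return the title whose facets best match the model."""
    candidates = {t: f for t, f in BOOKS.items() if t not in exclude}
    return max(candidates, key=lambda t: match(model, candidates[t]))


def apply_feedback(model, book_facets, liked, rate=0.5):
    """Move the book's facets up if the user liked it, down otherwise."""
    direction = 1.0 if liked else -1.0
    updated = dict(model)
    for facet, weight in book_facets.items():
        updated[facet] = updated.get(facet, 0.0) + direction * rate * weight
    return updated


model = build_model(["sports_fan"])
first = recommend(model)
# The user reports disliking the first suggestion; the model is adjusted.
model = apply_feedback(model, BOOKS[first], liked=False, rate=1.0)
print(first, "then", recommend(model))
```

The model starts as a group (canonical) description and drifts toward an individual one as feedback accumulates, which is the sense in which stereotypes are "a good start".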
Explicit / Implicit user models (Rich)


Explicit: [stated].
The model is built by the system based on explicit information provided by the user
Implicit: [inferred].
The model is built by the system by means of a learning process based on:
• User feedback (inferred from responses)
• User behavior (inferred from actions) -> AGENTS
Issues to consider:

How to capture “user pre-knowledge”?

User effort

User control (acceptability, understanding)
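A minimal Python sketch of the explicit / implicit distinction, under stated assumptions: the profile is a dictionary of term weights, the explicit part is what the user states up front, and the implicit part is learned from logged actions. The action names and their weights in `ACTION_WEIGHTS`, and the functions `update_profile` and `top_interests`, are hypothetical and not taken from any particular system.

```python
# Assumed strength of evidence for each observed action (hypothetical values).
ACTION_WEIGHTS = {"saved": 1.0, "opened": 0.5, "skipped": -0.5}


def update_profile(profile, doc_terms, action):
    """Return a new profile with the document's terms adjusted by the action."""
    weight = ACTION_WEIGHTS[action]
    updated = dict(profile)
    for term in doc_terms:
        updated[term] = updated.get(term, 0.0) + weight
    return updated


def top_interests(profile, n=3):
    """Terms with positive weight, strongest first (ties broken by name)."""
    positive = [(t, w) for t, w in profile.items() if w > 0]
    return sorted(positive, key=lambda tw: (-tw[1], tw[0]))[:n]


# Explicit part: stated by the user.
profile = {"health": 1.0}

# Implicit part: inferred from behavior, with no extra user effort.
log = [
    ({"health", "nutrition"}, "saved"),
    ({"sports"}, "skipped"),
    ({"nutrition", "recipes"}, "opened"),
]
for terms, action in log:
    profile = update_profile(profile, terms, action)

print(top_interests(profile))
```

Keeping the profile as a plain, readable structure speaks to the user-control issue: the user can inspect what was inferred and correct it.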
ASIS: Closing keynote presentations.
Plenary debate; the future of IR, IF

ASIST 2001

James Hendler: chief scientist of the Information Systems Office at the Defense Advanced Research Projects Agency. He holds joint appointments in the Computer Science Department, the Electrical Engineering Department, and the Institute for Advanced Computer Studies at the University of Maryland, College Park
Ben Shneiderman: professor in the Department of Computer Science at the University of Maryland, College Park. Founder of the Human-Computer Interaction Laboratory; fellow of the ACM; he received the ACM CHI lifetime award in 2001
ASIST 2004

Tim Berners-Lee: inventor of the WWW; currently director of the W3C (World Wide Web Consortium)
ASIS: Closing keynote presentations.
Plenary debate; the future of IR, IF

James Hendler (ASIST 2001)
Solution: AUTONOMOUS AGENTS: when we need information, one way to find it is to talk to an expert; both engage in a conversation; the expert learns about our needs, constraints and preferences; the expert presents options; we decide.

Ben Shneiderman (ASIST 2001)
Solution: Good interfaces; with autonomous agents we lose control; we cannot trust agents; who has the power: the agent or the user?

Tim Berners-Lee (ASIST 2004)
The semantic web; ontological representation of knowledge (metadata)
Critics: any system that requires metadata is bound to fail
Some other user modeling techniques




Social and collective profiles
Collaborative filtering
Social data mining
Filtering and communities of practice
Social Profiles

Ardissono & Goy (1999)



SETA: A recommender system for electronic shops
Based on stereotypes
Profiles include “beneficiaries models”: user models for each third person for whom the shopper is selecting goods
Social profiles

Petrelli et al (1999)



Personalized guides to museums
Based on stereotypes
The study suggests including “family profiles” besides the individualized museum guide
Collaborative profiles



A process where the system gives suggestions based on information gleaned from members of a community or peer group.
Example: Amazon
People who (bought, read) X also (bought, read) Y
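The "people who bought X also bought Y" pattern can be approximated by counting co-occurrences. The Python sketch below is a toy illustration and makes no claim about how Amazon actually computes its suggestions; the `baskets` data and the `also_bought` function are hypothetical.

```python
from collections import Counter


def also_bought(baskets, item, n=2):
    """Items that most often appear in the same basket as `item`."""
    counts = Counter()
    for basket in baskets:
        if item in basket:
            # Count every other item in a basket that contains `item`.
            counts.update(basket - {item})
    # Most frequent first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:n]]


# Each set is one community member's purchases (hypothetical data).
baskets = [
    {"X", "Y"},
    {"X", "Y", "Z"},
    {"X", "Z"},
    {"Y", "W"},
    {"X", "Y"},
]

print(also_bought(baskets, "X"))
```

No individual profile is stored here: the suggestion comes entirely from the behavior of the peer group, which is what distinguishes collaborative from content-based filtering.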
Social data mining


Blogs
Communities of practice / knowledge sharing
Web usability / Personas /
User models for web design

Sources:


Personas: Setting the Stage for Building Usable
Information Sites
By Alison J. Head
http://www.infotoday.com/online/jul03/head.shtml
Alan Cooper, The Inmates Are Running the
Asylum: Why High-Tech Products Drive Us Crazy
and How to Restore the Sanity, Indianapolis:
Sams, 1999
Web usability / Personas /
User models for web design




Personas are hypothetical archetypes; imaginary
Personas are defined by their goals (detailed)
Developed through a series of ethnographic interviews with real and potential users.
• Demographic (quantitative) data, such as age, education, and job title (similar to marketing segmentation)
• More important: collecting qualitative data (persona)
Interfaces are built to satisfy personas' needs and goals.
Personalization and web design
Web usability / Personas




Alan Cooper's original idea: using a fictitious user with a set of goals to guide and focus the design of a product.
“His original idea was turned out into a rigorous form of user model, based on behavior patterns that emerge from ethnographic research.”
“A set of personas represents the key behaviors, attitudes, skill levels, goals, and workflows of real people we interview and observe, which we then use along with scenarios to guide the product's functionality and design.”
“The method has matured to the point that anyone trained in it should be able to get the same personas from the same data.”
Personalization: environments
where it is being used






Databases
Newsgroups, discussion lists
Personal Information Management (desktop files, e-mail, bookmarks, etc.)
News: electronic journals
Search engines
Web sites
• Business
• e-commerce
• e-health
• e-etc.
LIS 678: IF & Personalization
Examples of special topics (previous semesters)











Privacy and personalization
E-commerce and personalization
Mining usage data for web personalization
Machine learning and personalization
Adaptive web sites: learning from visitor access patterns
Children's information seeking for electronic resources
Users' criteria for relevance in IF systems
Patterns in the use of search engines
Satisfaction of information users
Individual differences in organizing, searching, retrieving
and evaluating information
Information retrieval technologies for special users
LIS 678: IF & Personalization
Examples of special topics (this semester)






Personal Ontologies
Personal Information Management
Social / Collaborative filtering (wikis, blogs,
community of practice)
Desktop searching
Semantic Web: metadata, XML, RDF
Probabilistic IR / IF
LIS 678: IF & Personalization
Examples of projects (this semester)




Technology and literacy in developing
countries (panel)
Business application of IF products
Personalized ranking
Semantic web and personalization
IF Independent studies


Alex Guilloux: usability study of bookmarking behaviour; how the specificity level in the hierarchy of bookmarks affects relevance
Susan Lin:
• Bookmarking software; specification for design
• Bookmarking habits of reference librarians (Information Architecture class)
Steve Lum: Ontology mapping; bookmark mapping for collaborative filtering
Jennifer Cambell: Personalization and communities of practice (evaluation)
LIS 678: Projects





Evaluation and comparison of IR / IF systems (e.g. search engines; recommenders; personalization features in digital libraries and portals)
Designing / running an IR/IF experiment (e.g. building a collaborative profile using a movie recommender; testing the usability of a search interface; incorporating personalization in the design of a digital library)
Analysis / design / prototype of an IR/IF component (e.g. a ranking algorithm; building a prototype of a searching interface; designing personalized web sites)
Writing a paper: literature review or reaction paper on IR / IF / user modeling
Conducting research or development on IF and user modeling (e.g. using faceted classification schemes for personalized web-IR; using bookmarks as a source of profiles; visualization for personal information management; observing users' searching behavior: children, young adults, patients, students, members of a community)
Exercises

Use the SIFTER filtering system:
http://ella.slis.indiana.edu/~junzhang/demo.html

Use the information filtering agent at:
http://www.ics.uci.edu/~pazzani/Publications/ - download several papers of interest and see what recommendations you get

Use the MovieLens system: http://movielens.umn.edu/ - rate movies (you decide how many you need to rate to adjust your profile) and see what recommendations you get
For all exercises discuss:



Content of the profile
Does the profile represent the user's interests?
To what extent do these systems allow the user control over their profile?
People / Resources

Douglas Oard IF page:
http://www.ee.umd.edu/medlab/filter/

SIFTER Project
http://sifter.indiana.edu/
People interested in IF at UH




User modeling: Martha Crosby, David Chin
User – Information interaction: Diane Nahl
Filtering in corporations: Bob SW.
Profile acquisition and representation: Luz
Quiroga
Comments

Comments, Questions?

Thanks!