Methods and Challenges in
Recommender systems: A Survey
Shruti Joshi#1, Shailendra Aote*2, Pawan Khade#3
Department of Computer Science and Engineering
Hingna Road, Wanadongri, Nagpur-10, Maharashtra, India
shrutig285@gmail.com
pawan.khade@gmail.com
Hingna Road, Wanadongri, Nagpur-10, Maharashtra, India
Shailendra_aote@rediffmail.com
Abstract — This paper introduces the methods and challenges in
recommender systems. The aim of a recommender system is
to provide a user with the items most relevant to his or her
needs. The paper also discusses Scienstein, a hybrid research
paper recommender system that combines four different
approaches. Recommendation quality can be improved not only
through accuracy but also through diversity, which brings
personalization in the true sense. Although the concept is
clear, maintaining the actual balance between accuracy and
diversity is challenging. A suitable re-ranking technique and a
suitable evaluation metric will improve the system. Dividing
the task into rating prediction and re-ranking brings
flexibility in implementing various state-of-the-art
techniques. Systems that balance both accuracy and diversity
can be tuned with any well-known optimization algorithm;
genetic algorithms and particle swarm optimization are
well-known techniques that may significantly enhance the
efficiency of the system. Instead of increasing the diversity of
individual items, one can also consider improving the diversity
of a sequence, or bundle, of items.
Keywords— Recommender systems, Information retrieval,
collaborative filtering, content-based filtering, hybrid
approach, diversity, ranking-based techniques.
I. INTRODUCTION
The development of recommender systems has been driven by the
exponential growth of the World Wide Web and the emergence
of e-commerce technologies. Information overload is a major
concern, and using this information in a useful way is
becoming very important. The goal of recommender systems is
therefore to help users find the most relevant products or
information. Examples of such applications are Amazon.com,
Netflix.com, and MovieLens. The current generation of
recommender systems still requires further improvements to
make recommendation methods more effective and applicable to
an even broader range of real-life applications. Recommender
systems apply various techniques from statistics and
knowledge discovery to the problem of recommending items
to the users of a system [7]. As noted by [9], although the
roots of recommender systems (RS) can be traced back to work on
cognitive science, approximation theory, information retrieval,
and forecasting theories, RS emerged as an independent
research area in the mid-1990s, when researchers started
focusing on recommendation problems that explicitly rely on
the ratings structure. One of the first known RS was the
Tapestry system, developed at Xerox PARC. This was a
filtering system for electronic documents, primarily e-mail
and Usenet postings. Non-automated filtering systems such as
Tapestry required the user to determine the relevant predictive
relationships within the community, placing a large cognitive
load on the user [3]. Automating the recommendation process
allows recommendations for large communities of users. In
this sense, one of the first automated recommender systems was
GroupLens [8], which used a neighbourhood-based algorithm.
Recommender systems basically use three approaches:
collaborative filtering, content-based filtering, and the
hybrid approach, all discussed in Section I.
LITERATURE SURVEY
1. B. Smyth et al. [1] proposed several ad hoc strategies to
rank items for inclusion in a recommendation list. According
to the authors, maximizing the similarity between the target
query and the retrieved cases is the general strategy in many
domains, but it does not work well in some domains.
2. K. Bradley et al. [2] proposed three new algorithms for
improving individual diversity. According to the authors, the
diversity problem has always been a limitation of content-based
recommendation techniques, and the proposed algorithms form a
benchmark for this concern. Of these, the Bounded Greedy
Selection algorithm greatly reduces the retrieval cost while
causing minimal loss of similarity between the target query
and the recommendations.
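The trade-off made by Bounded Greedy Selection can be illustrated with a minimal Python sketch. It assumes a quality score equal to the query similarity multiplied by the average dissimilarity to the items already chosen. This score, the function names, and the toy similarity values are illustrative assumptions, not the exact formulation of [2].

```python
from typing import Callable, Dict, List, Sequence


def relative_diversity(candidate: str, selected: Sequence[str],
                       sim: Callable[[str, str], float]) -> float:
    """Average dissimilarity between a candidate and the items chosen so far."""
    if not selected:
        return 1.0  # nothing chosen yet, so any candidate counts as fully diverse
    return sum(1.0 - sim(candidate, s) for s in selected) / len(selected)


def bounded_greedy(query_sim: Dict[str, float],
                   sim: Callable[[str, str], float],
                   k: int, bound: int) -> List[str]:
    """Greedily pick k items from the `bound` items most similar to the query."""
    # Bounding step: only the top `bound` items by query similarity are
    # considered, which is what keeps the retrieval cost low.
    pool = sorted(query_sim, key=lambda c: query_sim[c], reverse=True)[:bound]
    selected: List[str] = []
    while pool and len(selected) < k:
        # Assumed quality: query similarity times diversity w.r.t. the selection.
        best = max(pool, key=lambda c: query_sim[c]
                   * relative_diversity(c, selected, sim))
        selected.append(best)
        pool.remove(best)
    return selected


# Toy data (hypothetical): "a" and "b" are near-duplicates, "c" is different.
QUERY_SIM = {"a": 0.9, "b": 0.85, "c": 0.6, "d": 0.2}
PAIR_SIM = {frozenset("ab"): 0.95, frozenset("ac"): 0.1, frozenset("bc"): 0.1}


def toy_sim(x: str, y: str) -> float:
    """Symmetric pairwise similarity looked up from the toy table."""
    return 1.0 if x == y else PAIR_SIM.get(frozenset((x, y)), 0.0)


print(bounded_greedy(QUERY_SIM, toy_sim, k=2, bound=3))  # prints ['a', 'c']
```

A pure similarity ranking would return a and b. In this sketch b is replaced by c because b is almost identical to a, at the cost of a lower similarity to the query.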
3. C. Ziegler et al. [3] proposed topic diversification, a new
heuristic approach that optimizes the balance between accuracy
and diversity so that accuracy stays at a certain level while
diversity increases, specifically for recommendation lists
produced by an item-based collaborative filtering algorithm.
Topic diversification is likened to osmotic pressure, where
selective permeability is the key criterion for optimization.
Taxonomies are created for various domains and arranged
hierarchically. Each product belongs to one or more taxonomies
and also has content descriptions relating to these domain
taxonomies. The authors also propose intra-list similarity, a
new metric well suited to capturing the diversity achieved by
the proposed algorithm. According to the authors, using content
descriptions together with the relevance weights of products
has a strong effect on item ranking, and this is where the
proposed method differs from existing ones. Their experimental
results showed that users preferred the altered, diversified
list, even with some loss of accuracy, to the accurate
unaltered list.
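As a rough illustration of an intra-list similarity measure, the following Python sketch averages the pairwise similarity of the items in a list, so that lower values indicate a more diverse list. The Jaccard similarity over topic sets, the averaging (instead of the normalization used in [3]), and the item names are assumptions made for this example.

```python
from itertools import combinations
from typing import Callable, Dict, Sequence, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two topic sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def intra_list_similarity(items: Sequence[str],
                          sim: Callable[[str, str], float]) -> float:
    """Mean pairwise similarity of a recommendation list (lower = more diverse)."""
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0  # a list with fewer than two items has no pairs to compare
    return sum(sim(a, b) for a, b in pairs) / len(pairs)


# Hypothetical topic assignments taken from a domain taxonomy.
TOPICS: Dict[str, Set[str]] = {
    "m1": {"action", "scifi"},
    "m2": {"action", "scifi"},
    "m3": {"romance"},
}


def topic_sim(x: str, y: str) -> float:
    return jaccard(TOPICS[x], TOPICS[y])


print(intra_list_similarity(["m1", "m2"], topic_sim))  # prints 1.0
print(intra_list_similarity(["m1", "m3"], topic_sim))  # prints 0.0
```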
4. D. Fleder et al. [4] showed how basic design choices
affect the outcome, so that managers can choose
recommender designs that are more consistent with their sales
goals and consumers' preferences. They found that
recommenders can increase sales, and that recommenders that
discount popularity appropriately may increase sales more.
5. M. Zhang et al. [5] proposed an approach that seeks the
best possible subset of items to recommend over all possible
subsets. The similarity of the resulting list to the target
query and the diversity within the list are treated together as
a binary optimization problem. A new evaluation metric, item
novelty, is proposed. Item novelty measures how different an
item is from the items already in the list, and it depends on
the other items in the user profile. Item novelty brings a
certain level of difficulty to recommendation and can hence be
used to generate useful test cases. Adjusting the novelty value
balances the tolerance for accuracy loss. The authors point out
that the probability of recommending novel items is low
whenever similarity is the basic selection criterion.
SECTION – I
1) Collaborative Filtering:
The term collaborative filtering was first used by David
Goldberg at Xerox PARC in 1992 in a paper called "Using
collaborative filtering to weave an information tapestry." He
designed a system called Tapestry that allowed people to
annotate documents as either interesting or uninteresting and
used this information to filter documents for other people.
There are now hundreds of web sites that employ some sort of
collaborative filtering algorithm for movies, music, books,
dating, shopping, other web sites, podcasts, articles, and even
jokes.
Collaborative filtering (CF) algorithms try to predict the
utility of items for a particular user based on the items
previously rated by other users. In this way, a CF recommender
system is not limited to recommending items similar to those
the target user already knows; by taking advantage of
information from other users, it can recommend items that are
completely unknown to him or her.
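The idea of predicting a rating from the ratings of other users can be sketched as a small neighbourhood-based predictor in Python. The Pearson correlation over co-rated items and the mean-centred weighted average are common choices in the literature but are assumptions here, as are the user names and ratings.

```python
from math import sqrt
from typing import Dict

Ratings = Dict[str, Dict[str, float]]  # user -> {item: rating}


def pearson(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Pearson correlation of two users over the items both have rated."""
    common = sorted(set(u) & set(v))
    if len(common) < 2:
        return 0.0
    mu = sum(u[i] for i in common) / len(common)
    mv = sum(v[i] for i in common) / len(common)
    num = sum((u[i] - mu) * (v[i] - mv) for i in common)
    den = (sqrt(sum((u[i] - mu) ** 2 for i in common))
           * sqrt(sum((v[i] - mv) ** 2 for i in common)))
    return num / den if den else 0.0


def predict(ratings: Ratings, user: str, item: str) -> float:
    """Predict the user's rating as their mean plus weighted neighbour deviations."""
    mine = ratings[user]
    base = sum(mine.values()) / len(mine)
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue  # only users who rated the target item can contribute
        w = pearson(mine, theirs)
        their_mean = sum(theirs.values()) / len(theirs)
        num += w * (theirs[item] - their_mean)
        den += abs(w)
    return base + num / den if den else base


# Toy ratings (hypothetical): bob agrees with ann, cat has the opposite taste.
RATINGS: Ratings = {
    "ann": {"A": 5, "B": 1, "C": 4},
    "bob": {"A": 5, "B": 1, "C": 4, "D": 5},
    "cat": {"A": 1, "B": 5, "C": 2, "D": 1},
}

print(round(predict(RATINGS, "ann", "D"), 2))  # prints 4.58
```

Item D is unknown to ann, yet the sketch predicts a high rating: bob, who agrees with her, liked it, and cat, who disagrees with her, did not.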
2) Content-based Filtering:
Content-based filtering (CBF) algorithms search for items
similar to other items that the user liked in the past. That is,
the predicted utility of an item for a user is estimated from
the known utilities of similar items for that user. To estimate
such similarity, the recommender system uses stored information
about the items, e.g., in the case of movies, the genre,
director, etc. This approach has its roots in the classical
Information Retrieval (IR) [4] and Information Filtering [4]
research fields. The improvement over traditional IR approaches
comes from the use of user profiles that contain information
about users' tastes, preferences, and needs. The profiling
information can be elicited from users explicitly, e.g.,
through questionnaires, or learned implicitly from their
transactional behaviour over time [1]. Correspondingly, the
item information stored by the RS is known as the item profile.
It is usually computed by extracting a set of features of the
item (possibly from external sources) and is used to determine
the appropriateness of the item for recommendation purposes.
Because they depend strongly on users' activities and stored
information, content-based RS have a number of limitations [1].
First, content-based RS are limited by the features that are
explicitly associated with the objects that these systems
recommend. Second, when the system can only recommend items
that score highly against a user's profile, the user is limited
to being recommended items similar to those already rated.
Finally, an RS needs enough information in the user profile
before it can generate reliable recommendations. Therefore, a
new user who has entered very little information into the
system will not be able to get accurate recommendations.
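The matching of item profiles against a user profile can be sketched in Python as follows. The sketch assumes that item profiles are sparse feature-weight dictionaries, that the user profile is the average of the liked items' profiles, and that cosine similarity is the matching score. These choices and all item names are illustrative assumptions.

```python
from math import sqrt
from typing import Dict, List

Features = Dict[str, float]  # feature name -> weight


def cosine(a: Features, b: Features) -> float:
    """Cosine similarity of two sparse feature vectors (0.0 if either is empty)."""
    dot = sum(w * b.get(f, 0.0) for f, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def build_profile(liked: List[Features]) -> Features:
    """User profile as the average of the profiles of items the user liked."""
    profile: Features = {}
    for item in liked:
        for feature, weight in item.items():
            profile[feature] = profile.get(feature, 0.0) + weight / len(liked)
    return profile


def rank(profile: Features, candidates: Dict[str, Features]) -> List[str]:
    """Order candidate items by similarity to the user profile, best first."""
    return sorted(candidates, key=lambda n: cosine(profile, candidates[n]),
                  reverse=True)


# Hypothetical item profiles built from genre features.
LIKED = [{"scifi": 1.0, "horror": 1.0}, {"scifi": 1.0, "noir": 1.0}]
CANDIDATES = {
    "romantic_comedy": {"romance": 1.0, "comedy": 1.0},
    "space_drama": {"scifi": 1.0, "drama": 1.0},
}

profile = build_profile(LIKED)
print(rank(profile, CANDIDATES))  # prints ['space_drama', 'romantic_comedy']
```

The sketch also exhibits two of the limitations listed above: the romantic comedy can never score above zero for this user, and an empty profile (a new user) gives every candidate the same score of zero.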
3) Hybrid Approach :
In the RS context, a hybrid recommender is a combination of
content-based and collaborative filtering algorithms, which
helps to avoid some limitations of each algorithm alone.
[1] classify hybrid RS as follows:
1) Implementing collaborative and content-based methods
separately and then combining their predictions. The
predictions can be combined using a linear
combination of ratings or a voting scheme.
Alternatively, at a given moment one of the
individual recommenders can be chosen.
2) Collaborative RS incorporating some content-based
characteristics. For example, using content-based
information in the user profile to calculate user
similarity.
3) Content-based RS incorporating some collaborative
characteristics. For example, using dimensionality
reduction techniques on content-based profiles.
4) RS with a general unifying model that incorporates
both content-based and collaborative characteristics.
That is, using content-based and collaborative
characteristics in a single method that is able to
generate recommendations taking advantage of all
these characteristics.
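The first class in the list above, combining separately computed predictions through a linear combination, can be sketched in a few lines of Python. The mixing weight `alpha`, the fallback for items scored by only one recommender, and the example scores are assumptions made for illustration.

```python
from typing import Dict


def weighted_hybrid(cf_scores: Dict[str, float],
                    cb_scores: Dict[str, float],
                    alpha: float = 0.5) -> Dict[str, float]:
    """Blend collaborative (cf) and content-based (cb) predictions per item."""
    blended: Dict[str, float] = {}
    for item in sorted(set(cf_scores) | set(cb_scores)):
        if item in cf_scores and item in cb_scores:
            # Linear combination when both recommenders produced a prediction.
            blended[item] = alpha * cf_scores[item] + (1 - alpha) * cb_scores[item]
        else:
            # Assumed fallback: use whichever single prediction is available.
            blended[item] = cf_scores.get(item, cb_scores.get(item, 0.0))
    return blended


# Hypothetical predicted ratings from the two separate recommenders.
CF = {"x": 4.0, "y": 2.0}
CB = {"x": 3.0, "z": 5.0}

print(weighted_hybrid(CF, CB, alpha=0.75))  # prints {'x': 3.75, 'y': 2.0, 'z': 5.0}
```

Item z has no collaborative prediction, for instance because nobody has rated it yet, but it still receives a score from the content-based side. This is one way in which a hybrid avoids a limitation of a single algorithm.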
There are various other ways in which recommendations can be
produced with the hybrid approach. Scienstein [6] is one
example: it uses a hybrid approach to recommend research
papers, as illustrated in Figure 1. With Scienstein, users
may provide one or several of the six inputs (text, references,
authors, sources, ratings or documents), adjust the algorithms
to their needs, and receive recommendations for research
papers.
domain valuations are similar as well. Standard recommender
systems based on collaborative filtering compare users
without splitting items into different domains. In cross-domain
systems, user similarities are computed per domain. An
engine creates local neighbourhoods for each user according
to the domains. The computed similarity values and a finite set
of nearest neighbours are then passed on for the overall
similarity computation. The recommender system determines the
overall similarity, creates overall neighbourhoods, and makes
predictions and recommendations.
Four approaches are combined in this hybrid research paper
recommender system: citation analysis, cited-by analysis,
reference lists, and bibliographic coupling. To rank results,
Scienstein applies what its authors call 'in-text citation
frequency analysis' (ICFA) and 'in-text citation distance
analysis'.
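Of the four approaches, bibliographic coupling is the simplest to illustrate: two papers are related to the degree that their reference lists overlap. The Python sketch below counts shared references. It is not Scienstein's implementation, and the paper and reference identifiers are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def coupling_strength(refs: Dict[str, Set[str]], a: str, b: str) -> int:
    """Number of references that papers a and b have in common."""
    return len(refs[a] & refs[b])


def related_papers(refs: Dict[str, Set[str]], paper: str) -> List[Tuple[str, int]]:
    """Papers sharing at least one reference with `paper`, strongest first."""
    scored = [(other, coupling_strength(refs, paper, other))
              for other in refs if other != paper]
    # Sort by descending strength, breaking ties by paper identifier.
    return sorted((s for s in scored if s[1] > 0), key=lambda s: (-s[1], s[0]))


# Hypothetical reference lists: paper id -> set of cited work ids.
REFS = {
    "p1": {"r1", "r2", "r3"},
    "p2": {"r2", "r3", "r4"},
    "p3": {"r3", "r5"},
    "p4": {"r6"},
}

print(related_papers(REFS, "p1"))  # prints [('p2', 2), ('p3', 1)]
```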
SECTION – II
ADVANCED RECOMMENDATION APPROACHES
A. Context-aware approaches
Context is the information about the environment of a user
and the details of the situation he or she is in. Such details
may play a much more significant role in recommendations than
the ratings of items, because ratings alone carry no detailed
information about the circumstances under which users gave
them. Recommender systems that use such information when
giving recommendations are called context-aware recommender
systems. Recommenders on mobile phones are a good example of
such systems.
D. Peer-to-Peer approaches
Recommender systems that follow this approach are
decentralized. Each peer can relate itself to a group of other
peers with the same interests and get recommendations from the
users of that group. Recommendations can also be given
based on the history of a peer. Decentralizing the
recommender system can address the scalability problem.
E. Cross-lingual approaches
A recommender system based on the cross-lingual approach lets
users receive recommendations for items whose descriptions
are in languages they do not speak or understand.
Yang, Chen and Wu proposed an approach for cross-lingual
news group recommendation. The main idea is to map both
text and keywords in different languages into a single feature
space, that is, a probability distribution over latent
topics. From the descriptions of items, the system parses
keywords and then translates them into one defined language
using dictionaries. After that, using collaborative or other
filtering, the system gives recommendations to users. With the
help of semantic analysis, it is possible to build a
language-independent representation of text. An example of
such a recommender system is MARS.
B. Semantic based approaches
Most descriptions of items and users in recommender systems,
and on the rest of the web, are presented in textual form.
Using tags and keywords without any semantic meaning does not
improve the accuracy of recommendations in all cases, as some
keywords may be homonyms. Traditional text mining approaches
based on lexical and syntactic analysis produce descriptions
that can be understood by a user but not by a computer or a
recommender system. This motivated new text mining techniques
based on semantic analysis. Recommender systems that use such
techniques are called semantic-based recommender systems.
The performance of semantic recommender systems depends on a
knowledge base, usually defined as a concept diagram or
ontology.
SECTION – III
CHALLENGES AND ISSUES
Although recommender systems are becoming more popular over
time, designing them poses various challenges. Some of these
challenges are discussed in this section.
C. Cross-domain based approaches
Finding similar users and building an accurate neighbourhood
is an important part of the recommendation process in
collaborative recommender systems. The similarity of two users
is determined from their appreciation of items. But similar
appreciation in one domain does not necessarily mean that in another
B. Trust
The opinions of people with a short history may not be as
relevant as those of people with a rich history in their
profiles. The issue of trust thus arises regarding the
evaluations of a particular customer. The problem could be
addressed by assigning priorities to users.
A. Cold-start
It is difficult to give recommendations to a new user because
the profile is almost empty: the user has not rated any items
yet, so his or her taste is unknown to the system. This is
called the cold-start problem. Some recommender systems
address it with a survey at profile creation. Items can also
have a cold start when they are new to the system and have not
been rated before. Both problems can also be mitigated
with hybrid approaches.
C. Scalability
As the numbers of users and items grow, the system
needs more resources to process information and form
recommendations. The majority of resources are consumed in
determining users with similar tastes and goods
with similar descriptions. This problem can be addressed by
combining various types of filters and by improving the
physical resources of the system. Parts of the computation
may also be performed offline in order to speed up the
delivery of recommendations online.
D. Sparsity
In online shops with a huge number of users and items,
there are almost always users who have rated just a few items.
Using collaborative and other approaches, recommender
systems generally create neighbourhoods of users from their
profiles. If a user has evaluated just a few items, it is
difficult to determine his or her taste, and the user could be
assigned to the wrong neighbourhood. Sparsity is the problem
of lack of information.
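One common way to quantify this lack of information is the fraction of user-item pairs that have no rating. The Python sketch below computes it for a small example; the function name and the data are illustrative assumptions.

```python
from typing import Dict


def sparsity(ratings: Dict[str, Dict[str, float]], n_items: int) -> float:
    """Fraction of the user-item matrix that holds no rating."""
    n_users = len(ratings)
    if n_users == 0 or n_items == 0:
        return 1.0  # an empty matrix carries no information at all
    observed = sum(len(user_ratings) for user_ratings in ratings.values())
    return 1.0 - observed / (n_users * n_items)


# Hypothetical shop with 4 users and a catalogue of 5 items.
SHOP_RATINGS = {
    "u1": {"i1": 5, "i2": 3},
    "u2": {"i3": 4},
    "u3": {"i1": 2, "i5": 1},
    "u4": {},
}

print(sparsity(SHOP_RATINGS, n_items=5))  # prints 0.75
```

Here only 5 of the 20 possible ratings are known. User u4 has rated nothing at all, so no neighbourhood can be determined for that user.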
E. Privacy
Privacy is one of the most important problems. In order to
give the most accurate and correct recommendations, the
system must acquire as much information as possible
about the user, including demographic data and data about the
location of a particular user. Naturally, the question of the
reliability, security and confidentiality of the given
information arises. Many online shops address this by using
specialized algorithms and programs to protect the privacy of
their users.
PROPOSED WORK
The proposed system will recommend items to users based
on a hybrid approach. The system will balance not only
accuracy but also diversity, and it can be optimized using
any well-known optimization algorithm. Instead of increasing
the diversity of individual items, we can also consider
improving the diversity of a sequence, or bundle, of
items. As multiple items are involved in a bundle,
we have to consider the aggregate characteristics of all
its items or services.
1.1 Proposed Approach
As recommender systems have become more commonly used
for producing sets or lists of recommendations, rather than
simply individual predictions, attention has shifted to the
value of the recommendation list as a whole and not simply
the quality of each individual recommendation. A particular
concern expressed in certain domains is "pigeonholing" users,
that is, identifying a single narrow interest and making many
similar recommendations. Ziegler et al. [3] found that users
preferred diversified lists when those lists came from an
item-item recommendation algorithm. Other work experimented
with a variety of collaborative filtering, content filtering,
and hybrid algorithms for research paper recommendation,
finding that different algorithms performed better at
generating different types of recommendation lists
(e.g., related research for a paper, a broad introduction to a
field).
1.2 Proposed Architecture
The proposed architecture contains two feedback loops. The
first loop is executed periodically and computes
recommendation candidates with several recommendation
algorithms, using the relatively static content information
as well as recent usage information from the
web warehouse. The output of the algorithms is combined in
one recommendation database, which is used to dynamically
select recommendations. In the second feedback loop, we
continuously gather and evaluate user reactions to the
presented recommendations. The learning module uses this
information to refine the recommendations in the database and
thus to immediately influence the selection of future
recommendations.
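The interplay of the two loops can be sketched in Python as a small in-memory store. Everything in this sketch is an assumption made for illustration: the class and method names, keeping the maximum score when algorithm outputs are merged, and the simple update rule that moves a weight towards 1 after an accepted recommendation and towards 0 after an ignored one.

```python
from typing import Dict, List


class RecommendationStore:
    """Toy stand-in for the recommendation database shared by both loops."""

    def __init__(self) -> None:
        # context (e.g. the page being viewed) -> {recommended item: weight}
        self.weights: Dict[str, Dict[str, float]] = {}

    def load_candidates(self, outputs: List[Dict[str, Dict[str, float]]]) -> None:
        """First loop (periodic): merge candidates from several algorithms."""
        for output in outputs:
            for context, recs in output.items():
                merged = self.weights.setdefault(context, {})
                for rec, score in recs.items():
                    # Assumed merge rule: keep the highest score seen so far.
                    merged[rec] = max(merged.get(rec, 0.0), score)

    def select(self, context: str, n: int) -> List[str]:
        """Dynamically select the n highest-weighted items for a context."""
        recs = self.weights.get(context, {})
        return sorted(recs, key=lambda r: recs[r], reverse=True)[:n]

    def feedback(self, context: str, rec: str, accepted: bool,
                 rate: float = 0.5) -> None:
        """Second loop (continuous): move the weight towards the observed reaction."""
        reward = 1.0 if accepted else 0.0
        old = self.weights[context][rec]
        self.weights[context][rec] = old + rate * (reward - old)


store = RecommendationStore()
store.load_candidates([{"home": {"a": 0.6}}, {"home": {"a": 0.4, "b": 0.5}}])
print(store.select("home", 1))  # prints ['a']
store.feedback("home", "a", accepted=False)  # the user ignored item "a"
print(store.select("home", 1))  # prints ['b']
```

After a single ignored recommendation the selection for the same context changes. This is the immediate effect on future recommendations that the second loop is meant to provide.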
CONCLUSION
Recommender systems have made significant progress in
recent years, and many techniques have been proposed to
improve recommendation quality. However, most of these
techniques aim to improve the accuracy of
recommendations, whereas recommendation diversity has often
been ignored. Scienstein aims to be a powerful alternative to
academic search engines by not relying solely on keyword
analysis, but by additionally using citation analysis, explicit
ratings, implicit ratings, author analysis, and source analysis.
Although some of the methods used have been known for
decades, they have not been applied in the context of research
paper recommender systems. Other approaches, such as the
'in-text distance similarity index' or collaborative
annotations, classifications and links, were developed
exclusively for Scienstein. The combination of all approaches
is critical, since each approach has disadvantages that
can only be overcome by combining them.
References
[1] B. Smyth and P. McClave, "Similarity vs. Diversity," 4th
International Conference on Case-Based Reasoning, 2001, pp.
348-361.
[2] K. Bradley and B. Smyth, "Improving recommendation
diversity," Proceedings of the 12th Irish Conference on
Artificial Intelligence and Cognitive Science, 2001.
[3] C.-N. Ziegler, S. McNee, J. Konstan, and G. Lausen,
"Improving recommendation lists through topic
diversification," Proceedings of the 14th International WWW
Conference, 2005.
[4] D. Fleder and K. Hosanagar, "Blockbuster culture's next rise
or fall: The impact of recommender systems on sales diversity,"
Proceedings of the 8th ACM conference, 2007.
[5] M. Zhang and N. Hurley, "Avoiding monotony: Improving the
diversity of recommendation lists," 2008.
[6] B. Gipp, J. Beel, and C. Hentschel, "Scienstein: A
Research Paper Recommender System," 2009.
[7] G. Adomavicius and A. Tuzhilin, "Toward the Next
Generation of Recommender Systems: A Survey of the
State-of-the-Art and Possible Extensions," IEEE Trans. on
Knowl. and Data Eng., 17(6), pp. 734-749, 2005.
[8] B. M. Sarwar, G. Karypis, J. A. Konstan, and J. Riedl,
"Recommender Systems for Large-Scale
E-Commerce: Scalable Neighborhood Formation using
Clustering," 2002.
[9] A. Said, B. Kille, B. Jain, and S. Albayrak, "Increasing
diversity through furthest neighbor-based recommendation," 2012.
[10] G. Adomavicius and Y. Kwon, "Improving aggregate
recommendation diversity using ranking-based techniques,"
IEEE Transactions on Knowledge and Data Engineering, 2012.
[11] K. Alodhaibi, A. Brodsky, and G. Mihaila, "A
randomized algorithm for maximizing the diversity of
recommendations," Proceedings of the 44th Hawaii International
Conference on System Sciences, 2011.
[12] M. Ge, F. Gedikli, and D. Jannach, "Placing high-diversity
items in top-n recommendation lists," Proceedings of
International Joint Conferences on Artificial Intelligence, 2011.
[13] B. Wang, Z. Tao, and J. Hu, "Improving the diversity of
user-based top-n recommendation by cloud model," 2010.