Methods and Challenges in Recommender Systems: A Survey

Shruti Joshi#1, Shailendra Aote*2, Pawan Khade#3
Department of Computer Science and Engineering
Hingna Road, Wanadongri, Nagpur-10, Maharashtra, India
shrutig285@gmail.com, pawan.khade@gmail.com
Hingna Road, Wanadongri, Nagpur-10, Maharashtra, India
Shailendra_aote@rediffmail.com

Abstract: This paper introduces the methods and challenges of recommender systems. The aim of a recommender system is to provide the user with detailed and exact items. The paper also discusses Scienstein, a hybrid research paper recommender system that combines four different approaches. Recommendation quality can be improved not only through accuracy but also through diversity, which brings personalization in the true sense. Although the concept is clear, maintaining the actual balance between accuracy and diversity is challenging. A suitable re-ranking technique and a suitable measurement metric will improve the system. Dividing the task into rating prediction and re-ranking brings flexibility in implementing various state-of-the-art techniques. Systems that balance both accuracy and diversity can be optimized with any well-known optimization algorithm; genetic algorithms and particle swarm optimization are well-known techniques that can significantly enhance the efficiency of the system. Instead of increasing the diversity of individual items, one can also consider improving the diversity of a sequence, or bundle, of items.

Keywords: Recommender systems, information retrieval, collaborative filtering, content-based filtering, hybrid approach, diversity, ranking-based techniques.

I. INTRODUCTION

The development of recommender systems has been driven by the exponential growth of the World Wide Web and the emergence of e-commerce technologies.
Information overload is a major concern, and using this information effectively is becoming very important. Recommender systems are developed to meet this need: they help users find the most relevant products or information they are looking for. Examples of such applications are Amazon.com, Netflix.com, and MovieLens. The current generation of recommender systems still requires further improvement to make recommendation methods more effective and applicable to an even broader range of real-life applications. Recommender systems (RS) apply varying techniques from statistics and knowledge discovery to the problem of recommending items to the users of a system [7]. As noted by [9], although the roots of recommender systems can be traced back to work in cognitive science, approximation theory, information retrieval, and forecasting theory, RS emerged as an independent research area in the mid-1990s, when researchers started focusing on recommendation problems that explicitly rely on the ratings structure.

One of the first known RS was the Tapestry system, developed at Xerox PARC. It was a filtering system for electronic documents, primarily e-mail and Usenet postings. Non-automated filtering systems such as Tapestry required the user to determine the relevant predictive relationships within the community, placing a large cognitive load on the user [3]. Automating the recommendation process allows recommendations for large communities of users. In this sense, one of the first automated recommender systems was GroupLens [8], which used a neighbourhood-based algorithm. Recommender systems basically use three approaches, collaborative filtering, content-based filtering, and the hybrid approach, which are discussed in Section I.

LITERATURE SURVEY

1. B. Smyth et al. [1] proposed some ad hoc strategies for ranking items for inclusion in a recommendation list.
According to the authors, maximizing the similarity between the target query and the retrieved cases is the general strategy in many domains, but it does not work in some domains.

2. K. Bradley et al. [2] proposed three new algorithms for improving individual diversity. According to the authors, the diversity problem has always been a limitation of content-based recommendation techniques, and the proposed algorithms have formed a benchmark on this concern. Among them, the Bounded Greedy Selection algorithm greatly reduced the retrieval cost while causing minimal loss of similarity between the target query and the recommendations.

3. C. Ziegler et al. [3] proposed topic diversification, a new heuristic approach for optimizing the balance between accuracy and diversity so that accuracy stays at a certain level as diversity increases, specifically for recommendation lists produced by an item-based collaborative filtering algorithm. Topic diversification resembles an osmotic pressure analogy, in which selective permeability is the key criterion for optimization. Taxonomies are created for various domains and arranged hierarchically. Each product belongs to one or more taxonomies and also has content descriptions relating to these domain taxonomies. The authors also propose intra-list similarity, a new metric that is well suited to capturing the diversity obtained with the proposed algorithm. According to the authors, using content descriptions together with the relevance weights of products has a strong impact on item ranking, and this is where the proposed method differs from existing ones. Their experimental results showed that users preferred the altered, diversified list, even with some loss of accuracy, over the accurate, unaltered list.

4. D. Fleder et al. [4] showed how basic design choices affect the outcome, so that managers can choose recommender designs that are more consistent with their sales goals and consumers' preferences.
They found that recommenders can increase sales, and that recommenders that discount popularity appropriately may increase sales more.

5. M. Zhang et al. [5] proposed an approach that seeks the best possible subset of items to recommend over all possible subsets. Here, the similarity of the resulting list to the target query and the diversity within the list are treated jointly as a binary optimization problem. A new evaluation metric, item novelty, is proposed. Item novelty describes how different an item is from the existing list of items. It depends on the other items already in the user profile. Item novelty brings a certain level of difficulty to recommendation and can therefore be used to generate useful test cases. The tolerated loss in accuracy is balanced by adjusting the novelty value. The authors point out that the probability of recommending novel items is low whenever similarity is the basic selection criterion.

SECTION – I

1) Collaborative Filtering: The term collaborative filtering was first used by David Goldberg at Xerox PARC in 1992 in a paper called "Using collaborative filtering to weave an information tapestry." He designed a system called Tapestry that allowed people to annotate documents as either interesting or uninteresting and used this information to filter documents for other people. There are now hundreds of web sites that employ some sort of collaborative filtering algorithm for movies, music, books, dating, shopping, other web sites, podcasts, articles, and even jokes. Collaborative filtering (CF) algorithms try to predict the utility of items for a particular user based on the items previously rated by other users. In this way, a CF recommender system is not limited to recommending items similar to those the target user already knows; by taking advantage of information from other users, it can recommend items completely unknown to him/her.
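The prediction step described above can be illustrated with a minimal sketch of user-based collaborative filtering. This is not the implementation of any system cited in this survey: the toy ratings, the choice of cosine similarity over co-rated items, and the function names `cosine_similarity`, `predict_rating`, and `recommend` are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical toy data (user -> {item: rating on a 1-5 scale}); not from the paper.
RATINGS = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 3, "C": 5, "D": 4},
    "carol": {"A": 1, "B": 5, "D": 2},
}


def cosine_similarity(r_u, r_v):
    """Cosine similarity computed over the items both users have rated."""
    common = set(r_u) & set(r_v)
    if not common:
        return 0.0
    dot = sum(r_u[i] * r_v[i] for i in common)
    norm_u = sqrt(sum(r_u[i] ** 2 for i in common))
    norm_v = sqrt(sum(r_v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)


def predict_rating(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no user with positive similarity has rated the item.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue
        weighted_sum += sim * other_ratings[item]
        weight_total += sim
    return weighted_sum / weight_total if weight_total else None


def recommend(ratings, user, top_n=3):
    """Rank the items the user has not rated by their predicted rating."""
    seen = set(ratings[user])
    candidates = {i for r in ratings.values() for i in r} - seen
    scored = [(i, predict_rating(ratings, user, i)) for i in sorted(candidates)]
    scored = [(i, s) for i, s in scored if s is not None]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


if __name__ == "__main__":
    print(recommend(RATINGS, "alice"))
```

In this toy data, item D is unknown to alice, yet it can be recommended because bob and carol rated it. Her predicted rating lies between bob's 4 and carol's 2, nearer to bob's because his ratings are more similar to hers. A production system would typically also mean-centre the ratings and limit the neighbourhood size, which this sketch omits.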
2) Content-based Filtering: Content-based filtering (CBF) algorithms search for items similar to other items that the user liked in the past. That is, the predicted utility of an item for a user is estimated from the known utilities of similar items for that user. To estimate such similarity, the recommender system uses stored information about the items, e.g., in the case of movies, genre, director, etc. This approach has its roots in the classical Information Retrieval (IR) [4] and Information Filtering [4] research fields. The improvement over traditional IR approaches comes from the use of user profiles that contain information about users' tastes, preferences, and needs. The profiling information can be elicited from users explicitly, e.g., through questionnaires, or implicitly, learned from their transactional behaviour over time [1]. Correspondingly, the item information stored by the RS is known as the item profile. It is usually computed by extracting a set of features of the item (possibly from external sources) and is used to determine the appropriateness of the item for recommendation purposes.

Because they depend strongly on users' activities and stored information, content-based RS have a number of limitations [1]. First, content-based RS are limited by the features that are explicitly associated with the objects they recommend. Second, because the system can only recommend items that score highly against a user's profile, the user is limited to being recommended items similar to those already rated. Finally, an RS needs enough information in the user profile before it can generate reliable recommendations. Therefore, a new user who has entered very little information into the system will not get accurate recommendations.

3) Hybrid Approach: In the RS context, a hybrid recommender is a combination of content-based and collaborative filtering algorithms, which helps to avoid some of the limitations of each algorithm used alone.
[1] classify hybrid RS as follows:

1) Separate implementations of collaborative and content-based methods whose predictions are then combined. The combination can be made using a linear combination of ratings or a voting scheme. Alternatively, one of the individual recommenders can be chosen at a given moment.
2) Collaborative RS incorporating some content-based characteristics, for example, using content-based information in the user profile to calculate user similarity.
3) Content-based RS incorporating some collaborative characteristics, for example, using dimensionality reduction techniques on content-based profiles.
4) RS with a general unifying model that incorporates both content-based and collaborative characteristics, that is, a single method that generates recommendations by taking advantage of all these characteristics.

There are various other ways in which recommendations can be made using the hybrid approach. In [6], Scienstein's approach is used to recommend research papers, as illustrated in Figure 1. With Scienstein, users may provide one or several of six inputs (text, references, authors, sources, ratings, or documents), adjust the algorithms to their needs, and receive recommendations for research papers. Four approaches are combined in this hybrid research paper recommender system: citation analysis, cited by, reference list, and bibliographic coupling. To rank results, Scienstein applies what its authors call 'in-text citation frequency analysis' (ICFA) and 'in-text citation distance analysis'.

SECTION II. ADVANCED RECOMMENDATION APPROACHES

A. Context-aware approaches
Context is information about the environment of a user and the details of the situation he/she is in. Such details may play a much more significant role in recommendations than item ratings, because ratings alone carry no detailed information about the circumstances under which users gave them. Recommender systems that take such information into account when making recommendations are called context-aware recommender systems. Mobile phones are a good example of where such systems are used.

B. Semantic based approaches
Most descriptions of items and users, in recommender systems and on the rest of the web, are presented in textual form. Using tags and keywords without any semantic meaning does not improve the accuracy of recommendations in all cases, as some keywords may be homonyms. Traditional text mining approaches based on lexical and syntactic analysis produce descriptions that can be understood by a user but not by a computer or a recommender system. This was the reason for creating new text mining techniques based on semantic analysis. Recommender systems that use such techniques are called semantic based recommender systems. The performance of a semantic recommender system depends on its knowledge base, usually defined as a concept diagram or ontology.

C. Cross-domain based approaches
Finding similar users and building an accurate neighbourhood is an important part of the recommendation process in collaborative recommender systems. Similarities between two users are discovered from their appreciation of items. But similar appreciation in one domain does not necessarily mean that valuations in another domain are similar as well. Standard recommender systems based on collaborative filtering compare users without splitting items into different domains. In cross-domain systems, user similarities are computed per domain. An engine creates local neighbourhoods for each user according to domain. The computed similarity values and a finite set of nearest neighbours are then sent for overall similarity computation. The recommender system determines the overall similarity, creates overall neighbourhoods, and makes predictions and recommendations.

D. Peer-to-Peer approaches
Recommender systems that follow this approach are decentralized. Each peer can relate itself to a group of other peers with the same interests and get recommendations from the users of that group. Recommendations can also be given based on the history of a peer. Decentralizing the recommender system can solve the scalability problem.

E. Cross-lingual approaches
A recommender system based on a cross-lingual approach lets users receive recommendations for items whose descriptions are in languages they do not speak or understand. Yang, Chen and Wu proposed an approach for cross-lingual news group recommendation. The main idea is to map both text and keywords in different languages into a single feature space, that is, a probability distribution over latent topics. The system parses keywords from the item descriptions and then translates them into one defined language using dictionaries. After that, using collaborative or other filtering, the system gives recommendations to users. With the help of semantic analysis it is possible to build a language-independent representation of text. An example of such a recommender system is MARS.

SECTION III. CHALLENGES AND ISSUES

Although recommender systems are becoming more popular over time, they face various design challenges. Some of these challenges are discussed in this section.

A. Cold-start
It is difficult to give recommendations to a new user: the profile is almost empty and no items have been rated yet, so the user's taste is unknown to the system. This is called the cold-start problem. In some recommender systems this problem is addressed with a survey at profile creation. Items can also suffer from cold-start when they are new to the system and have not been rated before. Both of these problems can also be addressed with hybrid approaches.

B. Trust
The opinions of people with a short history may not be as relevant as the opinions of those with a rich history in their profiles. The issue of trust arises with respect to the evaluations of a given customer. The problem could be addressed by assigning priorities to users.

C. Scalability
As the number of users and items grows, the system needs more resources to process information and form recommendations. The majority of resources are consumed in determining users with similar tastes and goods with similar descriptions. This problem is also addressed by combining various types of filters and by physically improving the systems. Parts of the computation may also be performed offline in order to speed up the delivery of recommendations online.

D. Sparsity
In online shops with a huge number of users and items, there are almost always users who have rated just a few items. Using collaborative and other approaches, recommender systems generally create neighbourhoods of users from their profiles. If a user has evaluated just a few items, it is difficult to determine his/her taste, and he/she could be assigned to the wrong neighbourhood. Sparsity is the problem of lack of information.

E. Privacy
Privacy has been the most important problem. To deliver the most accurate and correct recommendations, the system must acquire as much information as possible about the user, including demographic data and data about the user's location. Naturally, the question of the reliability, security, and confidentiality of this information arises. Many online shops offer effective protection of users' privacy by utilizing specialized algorithms and programs.

PROPOSED WORK

The proposed system will recommend items to users based on a hybrid approach. The system will balance both accuracy and diversity and can be optimized using any well-known optimization algorithm. Instead of increasing the diversity of individual items, one can also consider improving the diversity of a sequence, or bundle, of items.
Because multiple items are involved in a bundle, the aggregate characteristics of all the items or services have to be considered.

1.1 Proposed Approach

As recommender systems have become more commonly used for producing sets or lists of recommendations, rather than simply individual predictions, attention has shifted to the value of the recommendation list as a whole and not simply the quality of each individual recommendation. A particular concern expressed in certain domains is "pigeonholing" users, that is, identifying a single narrow interest and making many similar recommendations. As discussed in the literature survey, topic diversification [3] addresses this concern for lists that come from an item-item recommendation algorithm. Earlier work has also experimented with a variety of collaborative filtering, content filtering, and hybrid algorithms for research paper recommendation, finding that different algorithms performed better at generating different types of recommendation lists (e.g., related research for a paper, or a broad introduction to a field).

1.2 Proposed Architecture

The proposed architecture is built around two feedback loops. The first loop is executed periodically and involves calculating recommendation candidates with several recommendation algorithms, which use relatively static information about the content as well as recent usage information from the web warehouse. The output of the algorithms is combined in one recommendation database, which is used to select recommendations dynamically. In the second feedback loop, user reactions to the presented recommendations are continuously gathered and evaluated. The learning module uses this information to refine the recommendations in the database and thus to immediately influence the selection of future recommendations.

CONCLUSION

Recommender systems have made significant progress in recent years, and many techniques have been proposed to improve recommendation quality. However, most techniques are designed to improve the accuracy of recommendations, whereas recommendation diversity has often been ignored.
Scienstein aims to be a powerful alternative to academic search engines by not relying solely on keyword analysis, but by additionally using citation analysis, explicit ratings, implicit ratings, author analysis, and source analysis. Although some of the methods used have been known for decades, they have not been applied in the context of research paper recommender systems. Other approaches, such as the 'in-text distance similarity index' or collaborative annotations, classifications, and links, were developed exclusively for Scienstein. The combination of all approaches is critical, since each approach has disadvantages that can only be overcome by combining them.

References

[1] B. Smyth and P. McClave, "Similarity vs. Diversity," 4th International Conference on Case-Based Reasoning, 2001, pp. 348-361.
[2] K. Bradley and B. Smyth, "Improving recommendation diversity," Proceedings of the 12th Irish Conference on Artificial Intelligence and Cognitive Science, 2001.
[3] C.-N. Ziegler, S. McNee, J. Konstan, and G. Lausen, "Improving recommendation lists through topic diversification," Proceedings of the 14th International WWW Conference, 2005.
[4] D. Fleder and K. Hosanagar, "Blockbuster culture's next rise or fall: The impact of recommender systems on sales diversity," Proceedings of the 8th ACM Conference, 2007.
[5] M. Zhang and N. Hurley, "Avoiding monotony: Improving the diversity of recommendation lists," 2008.
[6] B. Gipp, J. Beel, and C. Hentschel, "Scienstein: A Research Paper Recommender System," 2009.
[7] G. Adomavicius and A. Tuzhilin, "Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions," IEEE Transactions on Knowledge and Data Engineering, 17(6), 2005, pp. 734-749.
[8] B. M. Sarwar, G. Karypis, J. A. Konstan, and J. Riedl, "Recommender Systems for Large-Scale E-Commerce: Scalable Neighborhood Formation Using Clustering," 2002.
[9] A. Said, B. Kille, B. Jain, and S. Albayrak, "Increasing diversity through furthest neighbor-based recommendation," 2012.
[10] G. Adomavicius and Y. Kwon, "Improving aggregate recommendation diversity using ranking-based techniques," IEEE Transactions on Knowledge and Data Engineering, 2012.
[11] K. Alodhaibi, A. Brodsky, and G. Mihaila, "A randomized algorithm for maximizing the diversity of recommendations," Proceedings of the 44th Hawaii International Conference on System Sciences, 2011.
[12] M. Ge, F. Gedikli, and D. Jannach, "Placing high-diversity items in top-n recommendation lists," Proceedings of International Joint Conferences on Artificial Intelligence, 2011.
[13] B. Wang, Z. Tao, and J. Hu, "Improving the diversity of user-based top-n recommendation by cloud model," 2010.