2023 International Conference on Recent Advances in Electrical, Electronics, Ubiquitous Communication, and Computational Intelligence (RAEEUCCI) | 979-8-3503-3742-6/23/$31.00 ©2023 IEEE | DOI: 10.1109/RAEEUCCI57140.2023.10134257

MUSIC RECOMMENDATION SYSTEM USING REAL TIME PARAMETERS

Shivam Dawar, Dept. of Electronics and Communication, SRM Institute of Science and Technology, Chennai, India, sd7369@srmist.edu.in
Soumitra Chatterjee, Dept. of Electronics and Communication, SRM Institute of Science and Technology, Chennai, India, sc5401@srmist.edu.in
Mohammed Fardin Hossain, Dept. of Electronics and Communication, SRM Institute of Science and Technology, Chennai, India, mh3279@srmist.edu.in
Dr. Malarvizhi S, Dept. of Electronics and Communication, SRM Institute of Science and Technology, Chennai, India, malarvig@srmist.edu.in

Abstract—The use of music as a means of calming, energizing, and motivating oneself is widely popular. With the help of modern technology, the goal is to recommend suitable music for various situations. This project aims to achieve this by utilizing real-time data such as time, location, weather, facial expressions, artists, and audio attributes to accurately determine the user's emotional state. Machine learning algorithms such as CNNs and DNNs are used to create a model that classifies song samples into different genres, and the Spotify API and LastFM API are connected to the user's database to recommend music based on the individual's parameters. The ultimate objective is to provide personalized music recommendations to users.

Index Terms—music, music recommender, mood detection, music genre classification, CNN, DNN, Spotify API, LastFM API

I. INTRODUCTION

The role of music in people's lives cannot be overemphasized, as it serves as a medium for individuals to express themselves, understand their emotions, and connect with others. However, there are times when individuals may not be fully aware of their current mood or may find it challenging to comprehend their feelings, and music can play a crucial role in helping them do so. Therefore, this project seeks to create a music recommendation system that leverages the latest technologies to assist millions of music lovers in discovering the perfect music for their needs.

The proposed music recommendation system utilizes real-time parameters such as the user's location, time, artist, weather, and facial expression to provide personalized recommendations. By taking these parameters into account, the system can accurately determine the user's current mood and behaviour, making it easier to recommend a song or playlist that aligns with their preferences. To achieve this, the music recommendation system comprises three primary components.

The first component is centred on understanding human emotions and moods through facial expressions. By analyzing facial expressions, the system can detect and interpret the user's current emotional state, providing valuable insights that can be used to make personalized music recommendations. The second component is centred on creating user music profiles based on their favourite artists and music genres. The system analyzes the user's listening habits, taking into account the frequency with which they listen to certain artists and music genres. This information is then used to create a comprehensive music profile that provides insights into the user's musical preferences, making it easier to suggest suitable songs and playlists.
The third and final component of the system involves integrating the data obtained from the first and second components to make personalized music recommendations. By combining the user's current mood and music profile, the system can provide a suitable mix of music that aligns with the user's preferences. The system also takes into account the user's feedback on the recommended songs and playlists, making it possible to refine and improve the recommendations over time.

In summary, the objective of the music recommendation system is to offer users a customized music experience by utilizing machine learning algorithms and real-time data to make precise music recommendations. This personalized approach helps users comprehend their emotions better, improve their mood, and ultimately enhance their overall well-being. The system's accuracy surpasses that of other models because it links a user's preferences with their current emotions, resulting in markedly improved recommendations.

II. LITERATURE SURVEY

In [1], emotion is detected using facial gestures: humans express their emotions through many facial movements, and these facial emotions are detected through the computer's camera using techniques such as convolutional neural networks and OpenCV. In [2], music is recommended by building a personal profile from the user's social media usage, based on what the user views in social media apps, and then suggesting a list of songs; models such as CNNs and VGG networks are used, and the system is developed on an Ubuntu framework with PySpark. In [3], emotion is detected from a user's social media data, such as ads and the posts the user creates, views, or likes; techniques such as graph and tree models, LSTMs, and other frameworks are applied, and this work helped us understand the various ways in which a user's emotion can be analysed. In [4], audio signal beats are classified into different genres with the help of a CNN, and the user receives recommendations for similar music based on their previous listening history through a collaborative filtering algorithm; data visualization is used to display music features such as danceability, loudness, energy, liveness, acousticness, tempo, and speechiness. In [5], after studying the requirements of music categorization, the authors categorize songs directly from audio files and improve the categorization using a CNN model after trying several other models on the GTZAN dataset, a collection of audio files.

III. NOVELTY

To ensure accurate music recommendations, the model relies on frequently collected data parameters. This means the system continuously gathers and analyzes real-time information such as the user's location, time, weather, facial expressions, and preferred artists and music genres. We compared 3-, 5-, and 7-layer CNN models over many attributes to find the model that yields the highest accuracy, and that model is used to build this project. Using this information, the model can build a comprehensive music profile for the user and integrate it with mood detection analysis data to provide the best combination of song suggestions. The advantage of using frequently collected data is that it enables the system to adapt and update its recommendations based on the user's changing preferences and moods.
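As a rough illustration of the kind of context record this frequent data collection could produce, the sketch below assembles time, location, weather, and the detected mood into one structure. The field names, the collect_context helper, and the use of the OpenWeatherMap endpoint are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: field names, the OpenWeatherMap call, and
# collect_context() are assumptions, not the paper's actual implementation.
from dataclasses import dataclass
from datetime import datetime
import requests

@dataclass
class ListeningContext:
    timestamp: datetime      # current time
    city: str                # coarse user location
    weather: str             # e.g. "Rain", "Clear"
    mood: str                # label from the facial-expression model
    top_artists: list        # from the user's music profile

def collect_context(city: str, mood: str, top_artists: list, api_key: str) -> ListeningContext:
    """Gather the real-time parameters used to condition recommendations."""
    resp = requests.get(
        "https://api.openweathermap.org/data/2.5/weather",
        params={"q": city, "appid": api_key},
        timeout=10,
    )
    weather = resp.json()["weather"][0]["main"] if resp.ok else "Unknown"
    return ListeningContext(datetime.now(), city, weather, mood, top_artists)
```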
Compared to previous models, this approach of using multiple-attribute data helps improve the model's detection accuracy and enhances its overall effectiveness. The music recommendation system creates a comprehensive music profile of the user, which includes their musical preferences. This profile is then combined with the mood detection analysis data to generate a highly personalized and accurate playlist. The system selects songs that match both the user's preferences and their current mood, resulting in a playlist that is tailored to their exact needs.

The utilization of the Spotify API and LastFM API is a feature that distinguishes this project from others. The integration of the Spotify API enables us to access the user's song profile from the day they started using Spotify, providing long-term data that can be analysed efficiently. This approach differs from other projects that only gather data from the point of connection. By incorporating the Spotify API and LastFM API, we enhance the accuracy and effectiveness of the music recommendation system.

IV. PROPOSED WORK

Our proposal is to develop a music recommendation system that generates playlists based on real-time parameters such as the user's video feed, time, and location. The project is divided into three parts. The first part focuses on preparing the recommender system by detecting the user's mood. The second part involves creating a deep neural network (DNN) to establish a user song profile. Finally, we integrate the results from the first two parts to build a recommender system that generates personalized playlists as the final output.

A. Mood Detection

In the first phase of our work, we aim to detect the mood of the user, which is a critical aspect of our music recommendation system. As people tend to listen to music that aligns with their current mood, accurately detecting their mood can significantly improve song suggestion accuracy. Although emotions are intangible, we can detect them using facial recognition and questionnaires; a questionnaire is a short set of questions asked to understand the mood of the user. For our project, we have employed the Haar Cascade algorithm, OpenCV, and CNN models with 3, 5, and 7 layers to detect mood through facial recognition from the camera feed and the survey. We found that the five-layer CNN model provides the highest accuracy among these models. A CNN, or convolutional neural network, is a deep learning algorithm that classifies images into different categories, objects, or lists. It is particularly useful for detecting patterns in images, making it an effective method for mood detection from images or video feeds compared to other methods such as the Histogram of Oriented Gradients (HOG), which can produce misleading results after detection. Therefore, we have chosen CNN as the method for detecting mood in our system.

Fig. 1. Flowchart for Mood Detection using Facial Expression
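As a concrete illustration of this face-detection-plus-CNN pipeline, a minimal sketch is given below, assuming 48x48 grayscale face crops and the five mood classes used in our implementation; the layer widths, file names, and helper functions are illustrative assumptions rather than the exact trained 5-layer model whose results are reported later.

```python
# Minimal sketch of mood detection with a Haar cascade and a small CNN.
# Layer widths, file names, and hyperparameters are illustrative assumptions,
# not the exact 5-layer model reported in the results of this paper.
import cv2
import numpy as np
from tensorflow.keras import layers, models

MOODS = ["Neutral", "Angry", "Sad", "Happy", "Fear"]

def build_mood_cnn(input_shape=(48, 48, 1), n_classes=len(MOODS)):
    """Small convolutional classifier for 48x48 grayscale face crops."""
    return models.Sequential([
        layers.Conv2D(32, 3, activation="relu", input_shape=input_shape),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu"),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),
    ])

# Haar cascade face detector shipped with OpenCV.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_mood(frame_bgr, model):
    """Detect the largest face in a frame and classify its mood."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])   # keep the largest face
    face = cv2.resize(gray[y:y + h, x:x + w], (48, 48)) / 255.0
    probs = model.predict(face.reshape(1, 48, 48, 1), verbose=0)[0]
    return MOODS[int(np.argmax(probs))]
```

In practice the network would first be trained on the labelled face-expression dataset and its weights loaded before detect_mood is applied to frames from the camera feed.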
B. User Song Profile

In the next phase of our project, we aim to create a user song profile that provides insights into the user's musical preferences based on factors such as artists, genres, song speed, and duration. This helps enhance the accuracy of our music playlist predictions by capturing the user's musical tastes. We have leveraged APIs to integrate with popular music apps such as Spotify and LastFM and gather user data by connecting these APIs to our project. Authentication through the third-party apps is necessary, and various key links are used to obtain the required data, which is then saved to form a user-specific dataset. Once we have a complete user music profile, we analyze the profile and mood using a machine-learning model to classify the best-suited songs for the occasion.

Fig. 2. Flowchart for Mood Detection using Facial Expression

C. Recommender System

In the final phase of our project, we utilize facial expression detection to analyze human emotions and create a music preference profile that determines the user's preferred music genres and artists. This information is then used to develop a machine-learning model that suggests suitable music or playlists based on the user's current context. Our model incorporates data collected in the previous phases, including song and artist tags, to select a few songs for the suggested playlist. The user's feedback on the recommended playlist is used to refine the model and improve future suggestions.

V. IMPLEMENTATION

This section describes the actual implementation of our proposed system. The first part of the work is detecting the person's mood. We used three CNNs of 3, 5, and 7 layers respectively on the same training data. The training data consist of 35,886 grayscale images of 48x48 pixel resolution categorized into 5 moods, i.e. Happy, Sad, Neutral, Angry, and Fear. While training we found that the 5-layer model, with more than 16,000 trainable parameters (as shown in Fig. 4), was the most efficient. The output of the CNN model is one of 5 moods, i.e. Neutral, Angry, Sad, Happy, and Fear.

After detecting the mood of the user, a user song profile needs to be made, which helps the recommender system suggest songs similar to the user's taste so that the recommendations do not feel too alien. The LastFM API and Spotify API, both RESTful APIs, are used to collect and scrobble (record users' music preferences) data. A LastFM account is created and linked to the user's Spotify account, and LastFM scrobbles through that Spotify account. Using the LastFM API, the user's top 10 artists are collected in JSON format. Using the Spotify API, the top 10 songs of each previously fetched artist are retrieved. Following that, the URI (Uniform Resource Identifier) of each song is fetched using the Spotify API. Finally, the URIs are fed into the song profile DNN model.
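A minimal sketch of this data-gathering step is shown below. It assumes the spotipy client for the Spotify Web API and a plain REST call to LastFM's user.gettopartists method; the helper names, the client-credentials authentication, and the omission of pagination and error handling are our own illustrative choices, not the exact implementation.

```python
# Sketch of collecting a user's top artists (LastFM) and their top tracks'
# audio features (Spotify). Helper names and credential handling are illustrative.
import requests
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

LASTFM_URL = "http://ws.audioscrobbler.com/2.0/"

def lastfm_top_artists(user, api_key, limit=10):
    """Top artists scrobbled by a LastFM user, returned as a list of names."""
    resp = requests.get(LASTFM_URL, params={
        "method": "user.gettopartists", "user": user,
        "api_key": api_key, "format": "json", "limit": limit,
    }, timeout=10)
    return [a["name"] for a in resp.json()["topartists"]["artist"]]

# Reads SPOTIPY_CLIENT_ID / SPOTIPY_CLIENT_SECRET from the environment.
sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials())

def artist_track_features(artist_name, n_tracks=10):
    """Audio features of an artist's top tracks, keyed by track URI."""
    hits = sp.search(q=artist_name, type="artist", limit=1)["artists"]["items"]
    if not hits:
        return {}
    tracks = sp.artist_top_tracks(hits[0]["id"])["tracks"][:n_tracks]
    uris = [t["uri"] for t in tracks]
    return {f["uri"]: f for f in sp.audio_features(uris) if f}
```

The audio-feature dictionaries returned by the last call contain the attributes (danceability, energy, valence, and so on) that are fed to the song profile DNN described next.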
Fig. 3. 3-Layer DNN Model for Song Profile Prediction

The song profile DNN model is a 3-layer DNN having 10 input nodes and 4 output nodes, as shown in Fig. 3. The input parameters are acousticness, danceability, energy, instrumentalness, liveness, valence, loudness, speechiness, tempo, key, and time signature, which are obtained by requesting them from the Spotify API using the song URI. The output is the song profile, which is 'Neutral', 'Angry', 'Fear', 'Sad', or 'Happy'. A sample song profile is shown in Fig. 5.

To produce the final playlist of recommended songs, we use content-based recommendation. We sort the user's top 25 songs on the basis of their emotion. We search all the tags related to these songs using the LastFM API, take the top 10 tags, find the top 5 songs for each tag, and label them with those tags. We then create a similarity matrix using the cosine similarity formula (1):

sim(A, B) = (A · B) / (||A|| · ||B||)    (1)

where A and B are the feature vectors of the two items being compared, · denotes the dot product of the two vectors, and ||A|| and ||B|| are their Euclidean norms. We pick the top 5 songs with the highest similarity and suggest them to the user.

VI. RESULTS

A 5-layer CNN model accurately detected facial emotions through a camera, classifying them as sad, happy, calm, or energetic. The user's time, location, and a question set are also collected, and the Spotify API is used to gather the user's profile information, including their favourite songs and artists, and convert it into URIs. A deep learning model is then used to classify whether the songs listened to by the user are energetic, sad, calm, or happy. The user profile is utilized to recommend songs based on their emotion, location, and time, and the LastFM API is used to recommend more songs based on the user's preferred genre tags. A pie chart of the user's preferred songs shows that they mostly listen to happy songs, followed by fear, angry, sad, and neutral songs at 28.7%, 25.1%, 21.5%, 17.8%, and 6.9% respectively [Fig. 5].

Fig. 5. User's Song Profile

Fig. 6. Confusion Matrix for Song Profile Detection

The confusion matrix is used to evaluate the performance of the classification model, with an overall accuracy of 90%, higher than most other similar systems [Fig. 6]. The mood detection model also achieves an accuracy of around 94% on the validation dataset, demonstrating better categorization than existing mood detection systems [Fig. 7]. An example of mood detection from a video feed is given in Fig. 8.

Accuracy = (TN + TP) / (TN + FP + FN + TP)    (2)

Precision = TP / (FP + TP)    (3)

Recall = TP / (TP + FN)    (4)

Fig. 4. 5-Layer CNN Model for Mood Detection

Fig. 7. Accuracy chart comparison

Fig. 8. Mood detection using a video feed

In (2), (3), and (4), TN stands for true negative, TP for true positive, FP for false positive, and FN for false negative. A true positive (TP) is an observation that is predicted positive and is actually positive. A false positive (FP) is predicted positive but is actually negative. A false negative (FN) is predicted negative but is actually positive. A true negative (TN) is predicted negative and is actually negative. Accuracy measures how often the model is correct overall, whereas precision measures the proportion of correct predictions within a predicted category. Using (2), (3), and (4), the values in the tables are calculated. We present two tables, one for the mood detection performance matrix and one for the song profile performance matrix.

TABLE I
SONG PROFILE PERFORMANCE MATRIX

Song profile    Happy   Sad     Neutral  Angry   Fear
Accuracy (%)    95.97   97.31   96.35    95.39   96.16
Recall (%)      90.65   91.83   90.39    88.79   91.43
Precision (%)   89.82   93.75   91.26    88.79   89.72

TABLE II
MOOD DETECTION PERFORMANCE MATRIX

Mood                Happy   Sad     Neutral  Angry   Fear
Expected Result     91      92      89       92      81
Calculated Result   88      89      86       87      76
Accuracy (%)        96.70   96.73   96.63    94.57   93.82

For mood detection, we have taken the moods to be happy, sad, neutral, angry, and fearful, with the expected result obtained from the training data and the calculated values from the test data; from these we obtain the final accuracy in each category and an average accuracy of 95.69% [Table II]. For the song profile, the categories are also happy, sad, neutral, angry, and fearful, and from the confusion matrix in Fig. 6 we have calculated an average accuracy, recall, and precision of 96.23%, 90.61%, and 90.66% respectively [Table I]. In Fig. 9, the expected and calculated values are plotted as a bar chart to give a visual representation of the calculated accuracy.

Fig. 9. 5-Layer CNN Model for Mood Detection
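To make the relationship between the confusion matrix and equations (2)-(4) explicit, the sketch below derives per-class accuracy, precision, and recall from a confusion matrix under the usual one-vs-rest convention; it is a generic illustration, not the authors' evaluation code.

```python
# Generic illustration of per-class accuracy, precision, and recall from a
# confusion matrix, following equations (2)-(4); not the authors' evaluation code.
import numpy as np

def per_class_metrics(cm: np.ndarray):
    """cm[i, j] = number of samples with true class i predicted as class j."""
    total = cm.sum()
    metrics = []
    for k in range(cm.shape[0]):
        tp = cm[k, k]
        fp = cm[:, k].sum() - tp          # predicted class k but actually another class
        fn = cm[k, :].sum() - tp          # actually class k but predicted another class
        tn = total - tp - fp - fn
        metrics.append({
            "accuracy":  (tn + tp) / total,                        # eq. (2)
            "precision": tp / (fp + tp) if (fp + tp) else 0.0,     # eq. (3)
            "recall":    tp / (tp + fn) if (tp + fn) else 0.0,     # eq. (4)
        })
    return metrics

# Example with a small two-class confusion matrix:
# print(per_class_metrics(np.array([[50, 5], [4, 41]])))
```

Applied to the confusion matrix of Fig. 6, this computation yields the per-class values reported in Table I.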
CONCLUSION

Music plays a significant role in people's lives, influencing their mood and serving as a source of motivation or stress relief. However, everyone has unique music preferences, making it essential to recommend songs accurately based on the user's mood, time, and location. In this study, we utilized different parameters to recommend music that more precisely matches the user's taste. We employed a 5-layer convolutional neural network and OpenCV to detect the user's emotions through their face and other features, incorporating this information with time and location to determine their mood and recommend music accordingly. We utilized the Spotify Web API to extract the user's music preferences and classify them into five categories (Happy, Sad, Angry, Neutral, and Fear) using deep learning algorithms. We then used the LastFM API to recommend more songs based on the user's preferred genre tags. Our results demonstrate that predicting music using real-time parameters and emotion leads to more accurate song recommendations. Future research in this area could help create personalized playlists for users and recommend background music for social media platforms such as Instagram reels or stories.

DATA AVAILABILITY

The data used for mood detection is taken from an open-source platform called Kaggle: https://www.kaggle.com/datasets/jonathanoheix/face-expression-recognition-dataset

The data for the song profile is built using Spotify, LastFM, and a form, which was circulated among users of different genders, age groups, and geographic areas: https://forms.gle/a6F6WACvtoBE8kpAA

REFERENCES

[1] S. Giri et al., "Emotion Detection with Facial Feature Recognition Using CNN and OpenCV," 2022 2nd International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE), Greater Noida, India, 2022, pp. 230-232, doi: 10.1109/ICACITE53722.2022.9823786.
[2] V. Moscato, A. Picariello and G. Sperlì, "An Emotional Recommender System for Music," in IEEE Intelligent Systems, vol. 36, no. 5, pp. 57-68, Sept.-Oct. 2021, doi: 10.1109/MIS.2020.3026000.
[3] X. Zhang, W. Li, H. Ying, F. Li, S. Tang and S. Lu, "Emotion Detection in Online Social Networks: A Multilabel Learning Approach," in IEEE Internet of Things Journal, vol. 7, no. 9, pp. 8133-8143, Sept. 2020, doi: 10.1109/JIOT.2020.3004376.
[4] M. Lahoti, S. Gajam, A. Kasat and N. Raul, "Music Recommendation System Based on Facial Mood Detection," 2022 Third International Conference on Intelligent Computing Instrumentation and Control Technologies (ICICICT), Kannur, India, 2022, pp. 284-289, doi: 10.1109/ICICICT54557.2022.9917956.
[5] A. Ghildiyal, K. Singh and S. Sharma, "Music Genre Classification using Machine Learning," 2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India, 2020, pp. 1368-1372, doi: 10.1109/ICECA49313.2020.9297444.
[6] S. R, N. C, V. S, V. P. R, N. A and N. P. S. M, "Spotify Genre Recommendation Based On User Emotion Using Deep Learning," 2022 Fifth International Conference on Computational Intelligence and Communication Technologies (CCICT), Sonepat, India, 2022, pp. 422-426, doi: 10.1109/CCiCT56684.2022.00081.
[7] A. Karatana and O. Yildiz, "Music genre classification with machine learning techniques," 2017 25th Signal Processing and Communications Applications Conference (SIU), Antalya, Turkey, 2017, pp. 1-4, doi: 10.1109/SIU.2017.7960694.
[8] K. Markov and T. Matsui, "Music Genre and Emotion Recognition Using Gaussian Processes," in IEEE Access, vol. 2, pp. 688-697, 2014, doi: 10.1109/ACCESS.2014.2333095.
[9] L. Xu, Y. Zheng, D. Xu and L. Xu, "Predicting the Preference for Sad Music: The Role of Gender, Personality, and Audio Features," in IEEE Access, vol. 9, pp. 92952-92963, 2021, doi: 10.1109/ACCESS.2021.3090940.