Music Genre Classifier
Group Number X

Team Members
● Members & Bilkent IDs

Dataset
For this project, we currently plan to use the GTZAN Genre Dataset [1]. The dataset is available on Kaggle and was originally used in the 2001 paper “Automatic Musical Genre Classification Of Audio Signals” [2]. It is a collection of 30-second instrumental song clips from 10 genres, which gives 10 classes with 100 tracks each. In total, the dataset contains 1000 ‘.wav’ audio samples, each 30 seconds long. The genres included are “blues, classical, country, disco, hip-hop, jazz, metal, pop, reggae, and rock” [1]. Each audio file is also accompanied by an image that gives a visual representation of the track. In addition, there are 2 CSV files containing extracted features: one for the full 30-second audio and one for a trimmed 3-second version of each track.

Problem Definition
We will start with an exploratory analysis of the dataset. Since audio data is new to us, we may first need to learn what the individual features capture. We might also apply nonlinear transformations to the existing features (squares, logarithms, etc.) to enlarge the feature set and test whether the additional features are helpful.

Our goal is to classify (instrumental) music data by genre, and for that we will build classifiers using several machine learning algorithms. At this stage we plan to use models of different complexities: a K-Nearest Neighbours (KNN) classifier as the simpler model, and a few types of Neural Networks as the more complex models. We will research further which specific network types to use, but for now we are inclined to test a basic Artificial Neural Network (ANN), a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network.

The models will be evaluated using their Confusion Matrices, as well as performance metrics such as Accuracy and the per-class Precision, Recall and F-1 Score.
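As a rough illustration of the evaluation described above, the following minimal Python sketch builds a confusion matrix and derives accuracy and per-class precision, recall and F-1 from it. The function names and the three-genre toy labels are hypothetical and chosen only for this example; they are not results from our models. In practice we would probably rely on a library such as scikit-learn instead.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[pred_label]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_metrics(matrix, labels):
    """Precision, recall and F-1 for each class, computed one-vs-rest."""
    report = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                  # true class i, predicted otherwise
        fp = sum(row[i] for row in matrix) - tp   # predicted i, true class otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Toy labels for illustration only (hypothetical, not real model output).
labels = ["blues", "jazz", "rock"]
y_true = ["blues", "blues", "jazz", "jazz", "rock", "rock"]
y_pred = ["blues", "jazz", "jazz", "jazz", "rock", "blues"]

matrix = confusion_matrix(y_true, y_pred, labels)
report = per_class_metrics(matrix, labels)

for row_label, row in zip(labels, matrix):
    print(row_label, row)
print("accuracy:", round(accuracy(matrix), 3))
for label, scores in report.items():
    print(label, {name: round(value, 3) for name, value in scores.items()})
```

Reading the matrix row by row shows which genres are confused with which, something a single accuracy figure cannot show.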
Since this is a classification problem, we can observe how each model performs by tracking which tracks it classifies correctly and which it misclassifies.

If time permits, we may also try other models, such as some type of Decision Tree or another weak learner, for comparison with the models we have already trained. If there is further time, or if we do not work on additional models, we may try to build an Ensemble Model that combines our different (pre-trained) machine learning models, with the aim of performing better than any of them individually.

References
[1] “GTZAN Dataset - Music Genre Classification,” Kaggle, 2019. [Online]. Available: https://www.kaggle.com/datasets/andradaolteanu/gtzan-dataset-music-genre-classification
[2] G. Tzanetakis, G. Essl, and P. Cook, “Automatic Musical Genre Classification Of Audio Signals,” The International Society for Music Information Retrieval (ISMIR), 2001. [Online]. Available: https://ismir2001.ismir.net/pdf/tzanetakis.pdf