
Music Genre Classifier
Group Number X
Team Members
● Members & Bilkent IDs
Dataset
In this project, we currently plan to use the GTZAN Genre Dataset [1]. The dataset is available on
Kaggle and was originally used in the 2001 paper “Automatic Musical Genre Classification Of Audio
Signals” [2]. It contains 30-second instrumental song clips from 10 genres, with 100 tracks per
genre, so it is a 10-class dataset of 1000 ‘.wav’ audio samples in total. The genres included are
“blues, classical, country, disco, hip-hop, jazz, metal, pop, reggae, and rock” [1]. Each audio file
also has an image version that gives a visual representation of the track. Furthermore, there are two
CSV files of extracted features: one computed over the full 30-second tracks and one computed over
3-second segments trimmed from each track.
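To make the relation between the 30-second tracks and the 3-second segments concrete, the following is a minimal sketch of how one track could be cut into segments. It is our own illustration, not the procedure used to build the Kaggle CSV files: the 22,050 Hz sample rate, the non-overlapping split, and the name `split_into_segments` are all assumptions.

```python
# Assumed values: GTZAN clips are commonly distributed as 22,050 Hz mono audio.
SAMPLE_RATE = 22_050
TRACK_SECONDS = 30
SEGMENT_SECONDS = 3


def split_into_segments(samples, sample_rate=SAMPLE_RATE,
                        segment_seconds=SEGMENT_SECONDS):
    """Cut a sample sequence into consecutive, non-overlapping segments.

    A trailing partial segment (shorter than segment_seconds) is dropped.
    """
    segment_length = sample_rate * segment_seconds
    full_segments = len(samples) // segment_length
    return [
        samples[i * segment_length:(i + 1) * segment_length]
        for i in range(full_segments)
    ]


# Stand-in for one decoded 30-second track (silence); real code would
# read the samples from a '.wav' file instead.
track = [0.0] * (SAMPLE_RATE * TRACK_SECONDS)
segments = split_into_segments(track)

# Upper bound on the number of 3-second samples if every track splits cleanly.
n_genres, tracks_per_genre = 10, 100
total_segments = n_genres * tracks_per_genre * len(segments)
```

Under these assumptions each track yields 10 segments, so the segment-level data would hold at most ten times as many samples as the track-level data. Tracks that are slightly shorter than 30 seconds would yield fewer segments.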
Problem Definition
We will start with an exploratory analysis of the dataset. Since audio data is new to us, we may
first need to learn what the individual features capture. We might also apply nonlinear
transformations to the existing features (taking squares, logs, etcetera) to enlarge the feature set
and test whether the additional features are helpful. Our goal is to classify (instrumental) music
data by genre, and for that we will build classifiers using various machine learning algorithms. At
this stage we plan on models of different complexities: a simpler model such as the K-Nearest
Neighbours (KNN) algorithm, and a few types of Neural Networks as more complex models. We will
research further which specific Neural Network types to use, but at this stage we are inclined to
test a basic Artificial Neural Network (ANN), a Convolutional Neural Network (CNN) and a Long
Short-Term Memory (LSTM) network. The models will be evaluated using their Confusion Matrices, as
well as performance metrics such as Accuracy and per-class Precision, Recall and F1 Score. Since this
is a classification problem, these let us observe how each model performs and track which genres are
classified correctly and which are confused with one another.
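As an illustration of the feature transformations and evaluation metrics mentioned above, here is a minimal sketch in plain Python. The helper names (`expand_features`, `confusion_matrix`, `per_class_metrics`) and the toy labels are hypothetical. Taking the log of the absolute value is our own assumption for handling negative feature values, and a library such as scikit-learn would likely be used in practice.

```python
import math


def expand_features(row):
    """Append x^2 and log(1 + |x|) of every feature to the original row.

    This triples the number of features; the number of samples is unchanged.
    """
    squares = [x * x for x in row]
    logs = [math.log1p(abs(x)) for x in row]  # assumption: |x| avoids log of negatives
    return list(row) + squares + logs


def confusion_matrix(y_true, y_pred, labels):
    """Count predictions; rows are true genres, columns are predicted genres."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(y_true, y_pred):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def per_class_metrics(matrix, labels):
    """Return overall accuracy and precision, recall, F1 for each class."""
    n = len(labels)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    report = {}
    for i, label in enumerate(labels):
        true_pos = matrix[i][i]
        predicted = sum(matrix[r][i] for r in range(n))  # column sum
        actual = sum(matrix[i])                          # row sum
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = correct / total if total else 0.0
    return accuracy, report


# Toy example with three of the ten genres (made-up predictions).
labels = ["blues", "jazz", "rock"]
y_true = ["blues", "blues", "jazz", "jazz", "rock", "rock"]
y_pred = ["blues", "jazz", "jazz", "jazz", "rock", "blues"]

expanded = expand_features([2.0, -3.0])
confusion = confusion_matrix(y_true, y_pred, labels)
accuracy, report = per_class_metrics(confusion, labels)
```

In this toy example the off-diagonal entries of `confusion` show that one blues track was predicted as jazz and one rock track as blues, which is the kind of misclassification pattern we want to track per genre.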
If time permits, we may also try other models, such as some type of Decision Tree or another weak
learner, for comparison with the models already trained. If there is further time, or if we do not
work on additional models, we may try to build an Ensemble Model that combines our different
(previously trained) machine learning models to obtain an even better-performing model.
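One simple way such an ensemble could combine the models is hard (majority) voting over their predicted genres. The sketch below is only one possible design and assumes each trained model has already produced a list of predicted labels. The function name `majority_vote`, the tie-breaking rule, and the example predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by hard voting.

    predictions_per_model holds one list of predicted labels per model,
    all of the same length. On a tie, the label proposed by the
    earliest-listed model among the tied labels wins.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        counts = Counter(votes)
        top_count = max(counts.values())
        # First label, in model order, that reaches the top vote count.
        winner = next(label for label in votes if counts[label] == top_count)
        combined.append(winner)
    return combined


# Made-up predictions from three models on the same three tracks.
knn_pred = ["rock", "jazz", "pop"]
cnn_pred = ["rock", "blues", "disco"]
lstm_pred = ["metal", "blues", "pop"]

ensemble_pred = majority_vote([knn_pred, cnn_pred, lstm_pred])
```

Listing the model we trust most first makes the tie-breaking rule deliberate rather than arbitrary. Weighted or probability-based (soft) voting would be alternatives if the models output class probabilities.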
References
[1] “GTZAN Dataset - Music Genre Classification,” Kaggle, 2019. [Online]. Available:
https://www.kaggle.com/datasets/andradaolteanu/gtzan-dataset-music-genre-classification
[2] G. Tzanetakis, G. Essl, and P. Cook, “Automatic Musical Genre Classification Of Audio Signals,”
The International Society for Music Information Retrieval (ISMIR), 2001. [Online]. Available:
https://ismir2001.ismir.net/pdf/tzanetakis.pdf