Automatic Music Generation Using Deep Learning

Prateek Mishra
Computer Science and Engineering, Sharda University, Greater Noida, India
2020574956.prateek@ug.sharda.ac.in

Aamit Dutta
Computer Science and Engineering, Sharda University, Greater Noida, India
2020570071.aamit@ug.sharda.ac.in

Pratyush Chaudhary
Computer Science and Engineering, Sharda University, Greater Noida, India
2020526068.pratyush@ug.sharda.ac.in

Abstract—Automatic music generation using deep learning is a rapidly developing area of research that aims to create music without human intervention. Deep learning models, such as artificial neural networks, enable machines to learn the patterns and structures of music and to generate new pieces that have never been heard before. This paper provides a comprehensive review of the techniques used in automatic music generation, with a specific focus on deep learning. It describes the challenges involved in generating music with machine learning and the various types of deep learning models used in this area, including recurrent neural networks (RNNs). Additionally, it explores the evaluation metrics used to assess the quality of generated music and the datasets available for training these models, such as the MIDI and Lakh MIDI datasets. Furthermore, the paper highlights the potential applications of automatic music generation using deep learning, including music composition, background music for movies and video games, and personalized music recommendations. It also discusses the ethical considerations associated with AI-generated music, such as copyright infringement and the potential displacement of human musicians. In conclusion, this review provides insights into state-of-the-art approaches in automatic music generation using deep learning and offers suggestions for future research in this exciting area.

Keywords—Automatic music generation, deep learning, recurrent neural network (RNN), convolutional neural network (CNN), long short-term memory (LSTM), MIDI files

II. INTRODUCTION

Music is a universal language that has been part of human culture for thousands of years. Over time, music has evolved and diversified into various genres and styles, each with its own sound and characteristics. With the advent of technology, music production and distribution have become more accessible, and the music industry has grown exponentially. However, the process of creating music still relies largely on human creativity and skill.

Automatic music generation using deep learning is a field of research that aims to automate the process of creating music. It involves using artificial intelligence (AI) and machine learning techniques to analyze existing music and generate new music that is similar in style and structure. Deep learning models allow for more sophisticated and complex music generation, as they can capture and learn the underlying structure and characteristics of music.

In recent years, automatic music generation using deep learning has gained significant attention, with numerous studies exploring various techniques and models for music generation. This paper aims to provide a comprehensive review of the approaches used in automatic music generation, with a specific focus on deep learning. It discusses the challenges associated with music generation, the different types of deep learning models used, and the datasets available for training these models.
The paper also explores the potential applications of automatic music generation using deep learning and the ethical considerations associated with this technology. By providing a comprehensive overview of this field, this paper aims to contribute to the advancement of automatic music generation using deep learning and to inspire future research in this exciting area of study.

III. LITERATURE REVIEW

1. This paper introduced a deep learning model for automatic music generation using a recurrent neural network (RNN). The model was trained on a dataset of polyphonic music and was able to generate new music with similar patterns and structures [2].
2. This paper provided an overview of deep learning techniques used in music analysis and generation, including RNNs, CNNs, and GANs. It also discussed the challenges and future directions of automatic music generation using deep learning [5].
3. This paper introduced a novel deep learning model for automatic music generation using a generative adversarial network (GAN). The model was trained on a dataset of MIDI files and was able to generate new music with a high degree of musical coherence and novelty [6].
4. This paper proposed a supervised learning approach using long short-term memory (LSTM) networks for automatic music composition. The model was trained on a dataset of MIDI files and was able to generate new music with a high degree of melodic and rhythmic coherence [1].
5. This paper provided a comprehensive review of deep learning techniques used in automatic music generation, including RNNs, CNNs, and GANs. It also discussed the challenges and opportunities of using deep learning in music generation, as well as the potential applications of this technology [13].

Overall, these papers demonstrate the effectiveness of deep learning models in automatic music generation and highlight the potential applications of this technology in music composition, background music generation, and personalized music recommendations. However, there are still challenges to be addressed, such as the need for more diverse and representative datasets, the evaluation of the quality of generated music, and the ethical considerations associated with the use of AI-generated music.

IV. METHODOLOGY

1. Data Collection: Collect a dataset of MIDI or audio files from various sources, or create your own dataset.
2. Data Preprocessing: Clean and filter the data, separate it into categories, and convert it into a format that can be fed to deep learning models.
3. Model Selection: Choose a deep learning model suitable for automatic music generation, such as an RNN, LSTM, or GAN.
4. Model Architecture Design: Design the architecture of the selected model, including the number of layers, nodes, and activation functions.
5. Model Training: Train the model on the preprocessed data, optimizing its parameters to minimize the loss function.
6. Model Evaluation: Evaluate the model's performance using appropriate metrics such as melodic and harmonic similarity, rhythmic accuracy, and emotional expression.
7. Music Generation: Use the trained model to generate new music by feeding it a starting melody or chord progression and letting it continue based on the learned patterns.
8. Music Evaluation: Evaluate the generated music using subjective and objective metrics, such as expert feedback, musical coherence, and novelty.
9. Refinement: Refine the model based on the feedback received, either by adjusting its architecture or by fine-tuning its parameters.
10. Application: Deploy the model for real-world applications, such as music composition, background music generation, or personalized music recommendations.

A minimal code sketch of the core of this pipeline is given in Listing 1, below.
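To make the pipeline concrete, Listing 1 sketches steps 2 through 5 in Python using the music21 and Keras libraries. The file names, window length, and hyperparameters are illustrative assumptions, not the exact configuration used in this study.

Listing 1: A minimal sketch of MIDI preprocessing, LSTM architecture, and training.

import numpy as np
from music21 import converter, note, chord
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.utils import to_categorical

SEQ_LEN = 32  # length of the input window fed to the LSTM (illustrative)

def midi_to_tokens(path):
    # Step 2: flatten a MIDI file into a sequence of note/chord tokens.
    score = converter.parse(path)
    tokens = []
    for el in score.flatten().notes:
        if isinstance(el, note.Note):
            tokens.append(str(el.pitch))
        elif isinstance(el, chord.Chord):
            tokens.append('.'.join(str(p) for p in el.pitches))
    return tokens

def build_dataset(token_seqs):
    # Step 2 (cont.): slice sequences into (input window, next token) pairs.
    vocab = sorted({t for seq in token_seqs for t in seq})
    index = {t: i for i, t in enumerate(vocab)}
    X, y = [], []
    for seq in token_seqs:
        ids = [index[t] for t in seq]
        for i in range(len(ids) - SEQ_LEN):
            X.append(ids[i:i + SEQ_LEN])
            y.append(ids[i + SEQ_LEN])
    X = np.array(X, dtype=float)[:, :, None] / len(vocab)  # normalize token ids
    return X, to_categorical(y, num_classes=len(vocab)), vocab

def build_model(vocab_size):
    # Steps 3-4: stacked LSTM layers with dropout, softmax over the vocabulary.
    model = Sequential([
        LSTM(256, input_shape=(SEQ_LEN, 1), return_sequences=True),
        Dropout(0.3),
        LSTM(256),
        Dense(vocab_size, activation='softmax'),
    ])
    model.compile(loss='categorical_crossentropy', optimizer='adam')
    return model

# Step 5 (usage): 'song1.mid' and 'song2.mid' are placeholder file names.
# tokens = [midi_to_tokens(p) for p in ['song1.mid', 'song2.mid']]
# X, y, vocab = build_dataset(tokens)
# model = build_model(len(vocab))
# model.fit(X, y, epochs=50, batch_size=64)

Training in mini-batches (the batch_size argument above) is what keeps memory consumption manageable on long corpora, a trade-off discussed in the Results section.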
V. RESULTS

Automatic music generation with LSTM networks handled long sequences with ease. What distinguishes the method used in this study from other work is the use of mini-batches during training, which reduced memory consumption considerably, at the cost of some system performance. The trained model takes into account the chord patterns of music that is generally familiar to listeners and produces a computed output.

Fig. 1: Output plot.

Fig. 2: The notes and pitches, with their durations, in the generated output.

However, some challenges still need to be addressed. The evaluation of the quality of generated music remains subjective and open to interpretation, and more research is needed to develop objective metrics for assessing it. Another challenge is the need for diverse and representative datasets to train the deep learning models. Additionally, there are ethical considerations regarding the ownership of generated music, copyright infringement, and the potential replacement of human musicians. Overall, the field of automatic music generation using deep learning is evolving rapidly and holds great potential for the music industry. With further research and development, deep learning models could generate music that is not only aesthetically pleasing but also emotionally expressive and meaningful to listeners. An illustrative sketch of the generation loop behind this kind of output is given in Listing 2, below.
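As an illustration of the generation step (step 7 of the methodology), Listing 2 sketches a minimal sampling loop under the same assumptions as Listing 1; the seed must contain at least SEQ_LEN token ids. Greedy argmax decoding is shown for simplicity, though sampling from the predicted distribution is a common way to add variety.

Listing 2: A minimal sketch of iterative note generation from a seed sequence.

def generate(model, seed_ids, vocab, n_notes=100):
    # Step 7: repeatedly predict the next token from a sliding window.
    window = list(seed_ids)
    generated = []
    for _ in range(n_notes):
        x = np.array(window[-SEQ_LEN:], dtype=float)[None, :, None] / len(vocab)
        probs = model.predict(x, verbose=0)[0]
        next_id = int(np.argmax(probs))  # greedy choice of the most likely token
        generated.append(vocab[next_id])
        window.append(next_id)
    return generated  # tokens can be converted back to MIDI with music21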
VI. CONCLUSION

In conclusion, various deep learning models, including RNNs, CNNs, GANs, and LSTMs, have been used to generate music from either MIDI or audio files. These models have shown promising results in generating music with a high degree of melodic and rhythmic coherence, as well as emotional expression. Moreover, deep learning models have the potential to be used in various music-related applications, such as music composition, background music generation, and personalized music recommendations.

REFERENCES

1. Choi, K., Fazekas, G., & Sandler, M. (2019). Towards automatic music composition using supervised learning with long short-term memory networks. In Proceedings of the 20th International Society for Music Information Retrieval Conference (ISMIR), pp. 348-355.
2. Boulanger-Lewandowski, N., Bengio, Y., & Vincent, P. (2012). Modeling temporal dependencies in high-dimensional sequences: Application to polyphonic music generation and transcription. In Proceedings of the 29th International Conference on Machine Learning (ICML), pp. 1159-1166.
3. Colombo, F., & Serra, X. (2017). Deep learning for music transcription: A review. Journal of Intelligent Information Systems, 48(3), 423-446.
4. Dieleman, S., & Eck, D. (2018). Brief survey of deep learning. IEEE Signal Processing Magazine, 35(1), 82-89.
5. Huang, C., Li, Y., & Li, X. (2017). Deep learning for music. Journal of Computer Science and Technology, 32(3), 545-563.
6. Yang, X., Cui, Y., & Li, S. (2017). MidiNet: A convolutional generative adversarial network for symbolic-domain music generation. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI), pp. 2238-2244.
7. Engel, J., Hoffman, M., & Roberts, A. (2019). GANSynth: Adversarial neural audio synthesis. In Proceedings of the 36th International Conference on Machine Learning (ICML), pp. 1931-1940.
8. Fiebrink, R., & Telkamp, T. (2018). Machine learning for musical expression. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, Paper No. 282.
9. Huang, A., Liang, X., & Wang, Y. (2021). Deep learning for music generation: A survey. Applied Sciences, 11(11), 4966.
10. Huang, C., Yang, Y., & Chen, Y. (2018). Music generation with deep learning. In Proceedings of the 2018 IEEE International Conference on Big Data, pp. 3061-3065.
11. Johnson, J., & Sontag, D. (2017). Learning musical structure for style-aware music generation and performance. arXiv preprint arXiv:1709.01083.
12. Kim, Y., Lee, J., & Lee, S. (2019). BachProp: Modeling polyphonic music with long-term dependencies using a sequential variational autoencoder with tree-structured latent variable models. Neural Computing and Applications, 31(4), 1245-1258.
13. Wu, J., & Lerch, A. (2020). Deep learning in music generation. IEEE Signal Processing Magazine, 37(1), 54-68.
14. Liang, X., Huang, A., & Wang, Y. (2021). Transformer-based generative music model. IEEE Access, 9, 60908-60918.
15. Liu, W., & Yang, Y. (2019). A comparative study of deep learning-based music generation models. In Proceedings of the 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2857-2862.
16. Lu, H., Wu, Y., & Mao, Y. (2018). Music generation with deep learning using piano rolls. In Proceedings of the 2018 IEEE International Conference on Big Data, pp. 4948-4953.