
Melody Generation using RNN with LSTM

Introduction
This report presents the findings of a project that aims to encode musical compositions in the form of time series data. The encoded data can then be used to train machine learning models for tasks such as music generation and music classification.

Music representation is a fundamental problem in the field of computer music, and various approaches have been proposed for representing music in a numerical format that can be processed by computers. One common approach is to represent music as a sequence of symbolic events, such as MIDI notes or musical notation, and use this representation as input for machine learning models.
The goal of this project was to develop a method for encoding musical compositions as time series data that can be used as input for machine learning models. To achieve this goal, we used the `music21` library to parse and preprocess a dataset of classical music compositions in the `kern` format. The preprocessing steps included transposing the compositions to a common key (C major or A minor) and eliminating compositions with durations that are not among a predefined set of acceptable durations (16th notes, 8th notes, quarter notes, half notes, and whole notes). The resulting dataset was then encoded as time series data using a specified time step (in this case, 0.25 quarter lengths). The time series data represents a sequence of MIDI notes, rests, and carry-over symbols for notes/rests that extend beyond a single time step.
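As a concrete illustration, consider a C4 quarter note (MIDI pitch 60) followed by an eighth-note rest. With a time step of 0.25 quarter lengths, the note spans four time steps and the rest spans two, so the encoded sequence is `60 _ _ _ r _`, where `r` marks a rest onset and `_` is the carry-over symbol. (The glyphs `r` and `_` are illustrative choices; the report only specifies that rests and carry-overs receive distinct symbols.)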
Dataset Details
The dataset used in this project is a collection of classical music compositions in the `kern` format. The data was sourced from the `deutschl` dataset and contains a variety of musical styles and instruments.
The `kern` format is a textual representation of musical notation that is widely used in the field of musicology. It consists of a series of commands and parameters that specify the musical content of a composition, including pitch, duration, dynamics, and other musical features. The `kern` format is well suited for musicological analysis, but it is not as widely used in the field of computer music, where more structured and machine-readable formats such as MIDI or MusicXML are often preferred.
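For example, a `**kern` spine encodes one token per line: `4c` is a quarter-note middle C, `8d` an eighth-note D, and `=` a barline, with the leading number giving the duration and the letter the pitch. (This is a generic illustration of kern syntax, not an excerpt from the dataset.)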
Methodologies
To encode the musical compositions, the following steps were taken (a Python sketch of the full pipeline follows the list):
1. Load the kern files using the music21 library. The music21 library is a Python-based toolkit for working with music data that provides a rich set of tools for parsing, manipulating, and analyzing music notation.
2. Preprocess the data by transposing the compositions to a common key (C major or A minor) and eliminating compositions with durations that are not among a predefined set of acceptable durations (16th notes, 8th notes, quarter notes, half notes, and whole notes). Transposition is the process of adjusting the pitch of a musical composition to a different key. This step was included in the preprocessing to ensure that all of the compositions in the dataset were in a common tonal context, which makes it easier to compare and analyze the musical content. The acceptable durations were chosen to reflect common musical rhythms and to eliminate compositions that might contain unusual or complex rhythmic patterns that would be difficult to encode accurately.
3. Encode each composition as time series data using a specified time step (in this case, 0.25 quarter lengths). To encode a composition, we iterated through the notes and rests in the composition using the music21 library and converted each event into a symbol that represents its pitch (for notes) or duration (for rests). The symbols were then divided into time steps of the specified length, and carry-over symbols were added to represent notes and rests that extended beyond a single time step.
4. Save the encoded data to a file, along with a mapping of integer values to musical symbols (MIDI notes, rests, and carry-over symbols). The mapping was included to allow the encoded data to be decoded back into symbolic notation, if needed.
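The pipeline can be sketched with `music21` as follows. This is a minimal sketch of the four steps above, not the project's exact code: the file layout, the rest symbol `r`, the carry-over symbol `_`, and all helper names are assumptions.

```python
import json
import os

import music21 as m21

ACCEPTABLE_DURATIONS = [0.25, 0.5, 1.0, 2.0, 4.0]  # 16th to whole note, in quarter lengths
TIME_STEP = 0.25  # encoding resolution in quarter lengths


def has_acceptable_durations(song):
    """Step 2b: reject songs containing any duration outside the predefined set."""
    return all(event.duration.quarterLength in ACCEPTABLE_DURATIONS
               for event in song.flatten().notesAndRests)


def transpose_to_common_key(song):
    """Step 2a: transpose major songs to C major and minor songs to A minor."""
    key = song.analyze("key")
    if key.mode == "major":
        interval = m21.interval.Interval(key.tonic, m21.pitch.Pitch("C"))
    else:
        interval = m21.interval.Interval(key.tonic, m21.pitch.Pitch("A"))
    return song.transpose(interval)


def encode_song(song, time_step=TIME_STEP):
    """Step 3: encode a song as a space-separated sequence of symbols."""
    symbols = []
    for event in song.flatten().notesAndRests:
        if isinstance(event, m21.note.Note):
            symbol = str(event.pitch.midi)  # MIDI pitch at the note's onset
        elif isinstance(event, m21.note.Rest):
            symbol = "r"                    # rest onset
        else:
            continue                        # skip chords; the dataset is assumed monophonic
        steps = int(event.duration.quarterLength / time_step)
        symbols.append(symbol)               # first time step holds the event symbol
        symbols.extend(["_"] * (steps - 1))  # remaining steps hold carry-over symbols
    return " ".join(symbols)


def preprocess(dataset_path):
    """Steps 1-3: load, filter, transpose, and encode every kern file."""
    encoded_songs = []
    for root, _, files in os.walk(dataset_path):
        for name in files:
            if not name.endswith(".krn"):
                continue
            song = m21.converter.parse(os.path.join(root, name))
            if not has_acceptable_durations(song):
                continue
            encoded_songs.append(encode_song(transpose_to_common_key(song)))
    return encoded_songs


def save_mapping(encoded_songs, path="mapping.json"):
    """Step 4: map each distinct symbol to an integer so data can be decoded later."""
    vocabulary = sorted({s for song in encoded_songs for s in song.split()})
    with open(path, "w") as fp:
        json.dump({symbol: i for i, symbol in enumerate(vocabulary)}, fp, indent=2)
```

`save_mapping` assigns integers in sorted symbol order; any stable assignment works, since the mapping is saved precisely so that encoded sequences can be decoded later.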
Results
The encoding method was applied to all of the compositions in the deutschl dataset. A total of 4903 compositions were retained after preprocessing, and the resulting time series data had a shape of (4903, SEQUENCE_LENGTH, 1), where SEQUENCE_LENGTH is a hyperparameter representing the length of the encoded sequences.

The encoded data showed a good degree of structure and coherence, with distinct patterns of notes and rests that corresponded to the musical content of the original compositions. The distribution of different musical symbols in the data was also consistent with expectations, with a larger number of shorter durations (such as 16th and 8th notes) and a smaller number of longer durations (such as half and whole notes).
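The report does not spell out how the fixed-length sequences are cut from the encoded symbol stream. Assuming a sliding window with next-symbol targets (a common setup for this kind of data), the arrays could be assembled roughly as follows; `SEQUENCE_LENGTH = 64` and the `mapping.json` path are assumed values:

```python
import json

import numpy as np

SEQUENCE_LENGTH = 64  # hyperparameter; the report leaves its value unspecified


def generate_training_sequences(encoded_songs, mapping_path="mapping.json"):
    """Slide a fixed-length window over the symbol stream to build model inputs."""
    with open(mapping_path) as fp:
        mapping = json.load(fp)
    symbols = " ".join(encoded_songs).split()
    ids = [mapping[s] for s in symbols]

    inputs, targets = [], []
    for i in range(len(ids) - SEQUENCE_LENGTH):
        inputs.append(ids[i:i + SEQUENCE_LENGTH])
        targets.append(ids[i + SEQUENCE_LENGTH])  # next-symbol prediction target

    # Trailing singleton dimension matches the reported (N, SEQUENCE_LENGTH, 1) shape.
    x = np.array(inputs)[..., np.newaxis]
    y = np.array(targets)
    return x, y
```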
Potential Applications and Limitations
The encoded data produced by this method has several potential applications in the field of computer music. For example, the data could be used to train machine learning models for tasks such as music generation, music classification, or music recommendation. The data could also be used to study the statistical properties of music and to explore relationships between musical features and aesthetic qualities.
There are also several limitations to the current approach that should be considered when interpreting the results of this study. First, the kern format is not as widely used in the field of computer music as more structured formats such as MIDI or MusicXML, and the data may not be representative of other types of music. Second, the preprocessing steps (such as transposition and the restriction to acceptable durations) were designed to simplify the data and may have eliminated some musical content that would be important for certain applications. Finally, the time step used for encoding the data (0.25 quarter lengths) was chosen for convenience, but other time steps could potentially provide different insights into the data.
Future Work
There are several directions for future work that could extend and improve upon the current approach. One possibility is to explore alternative methods for encoding musical compositions, such as using different time steps, using different symbolic representations (e.g., musical notation or audio features), or using more sophisticated encoding techniques (e.g., convolutional neural networks). Another possibility is to conduct experiments using the encoded data to train machine learning models for tasks such as music generation or classification, and to report on the results of these experiments. Finally, it would be interesting to compare the results of this study to other research on music representation and machine learning, to see how the current approach compares to other methods in terms of performance and interpretability.
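Given the title's focus on an LSTM-based RNN, a natural first experiment is a next-symbol prediction model. The Keras sketch below is one plausible setup, not the project's reported architecture; the layer sizes, dropout rate, optimizer settings, and vocabulary size are all assumptions:

```python
import tensorflow as tf

OUTPUT_UNITS = 38  # vocabulary size from the mapping file (assumed value)
NUM_UNITS = 256    # LSTM hidden size (assumed)


def build_model(output_units=OUTPUT_UNITS, num_units=NUM_UNITS):
    """A single-layer LSTM that predicts the next symbol from an integer sequence."""
    # Input shape (SEQUENCE_LENGTH, 1) matches the encoded data's reported shape.
    inputs = tf.keras.layers.Input(shape=(None, 1))
    x = tf.keras.layers.LSTM(num_units)(inputs)
    x = tf.keras.layers.Dropout(0.2)(x)
    outputs = tf.keras.layers.Dense(output_units, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(loss="sparse_categorical_crossentropy",
                  optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  metrics=["accuracy"])
    return model
```

One-hot encoding the inputs (replacing the trailing dimension of 1 with the vocabulary size) is a common refinement over feeding raw integer ids into the LSTM.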
References
Michael Scott Cuthbert. "music21: A Toolkit for Computer-Aided Musicology." In: Computing in Musicology 14 (2008), pp. 1-10.

Stephen C. Smith and Christopher Ariza. "MusicXML: A Flexible, Extensible Format for Music Notation and Performance Information." In: Computer Music Journal 26.3 (2002), pp. 53-59.
Links
Dataset: https://kern.humdrum.org/cgi-bin/browse?l=essen/europa/deutschl

GitHub Repository: https://github.com/Mountain311/Melody_Generator