Neural Networks and Musical Style Recognition

Sarah Callaghan
3rd year Physics and Music BSc
Aims:
The aims of this project were to:
1) Develop a way of symbolically mapping a melody line on to the input layer
of a neural network, while retaining the pitch and the rhythmic structure of the line.
2) Using this mapping technique, train a neural net on a series of examples
from two different composers, JS Bach and Igor Stravinsky, then test the network
with some previously unseen examples.
3) Determine what criteria the neural net uses to classify each unseen piece.
Apparatus:
The neural networks used were simulated by SNNS (Stuttgart Neural Network
Simulator) which is a software simulator for neural networks on UNIX workstations
developed at the Institute for Parallel and Distributed High Performance Systems
(IPVR) at the University of Stuttgart.
The SNNS simulator consists of two main components:
1) simulator kernel written in C
2) graphical user interface under X11R4 or X11R5
The simulator kernel operates on the internal network data structures of the
neural nets and performs all operations of learning and recall. The testing and training
pattern files were created using a simple text editor.
The networks that were created, trained and tested were simple, with no more
than fifty nodes. They used standard backpropagation as their learning function.
The music samples were taken from JS Bach’s “Six Sonatas for Violin and
Klavier” and “Six Sonatas for Unaccompanied Violin”; the Stravinsky music sampled
was “Duo Concertant pour Violon et Piano” and “Three Pieces for Clarinet Solo”.
Methodology:
The music had to be translated into a form that could be presented easily to the
input layer of the neural network. It is possible to use MIDI data for this, but due to
time constraints I chose a simplified way of translating music into numbers.
I also chose to look only at melody lines at first, adding rhythmic aspects at a later
stage in the project.
The melody lines were plotted across the input layer of the network as a
function of space rather than time. Each actual pitch, as given by the sheet music, was
assigned a number between 0 and 1, with the interval of a semitone between notes
being represented by a difference between the two numbers of 0.01. Rests (periods of
silence) were represented by 0. The basic unit of rhythm was taken to be the
semiquaver, and one semiquaver was assigned to each input node. If a note was
longer than a semiquaver, its pitch value was repeated across as many consecutive
nodes as the note lasted in semiquavers (a quaver, for example, occupied two nodes).
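The mapping described above can be sketched in a few lines of Python. This is an illustration only, not the procedure used to build the original pattern files (those were typed by hand). The function name `encode_melody`, the use of MIDI note numbers to identify pitches, and the choice of 0.01 times the MIDI number as the pitch value are all assumptions; the report does not say which pitch corresponds to which value, only that a semitone is a step of 0.01 and a rest is 0.

```python
def encode_melody(notes, n_inputs=20, semitone_step=0.01):
    """Spread a melody across a fixed number of input nodes.

    notes: list of (pitch, semiquavers) pairs, where pitch is a MIDI note
    number (an assumed convention) or None for a rest, and semiquavers is
    the note length counted in semiquavers.
    """
    values = []
    for pitch, semiquavers in notes:
        # Rests map to 0; pitches a semitone apart differ by semitone_step.
        value = 0.0 if pitch is None else round(pitch * semitone_step, 2)
        # One node per semiquaver, so longer notes fill several nodes.
        values.extend([value] * semiquavers)
    # Truncate or pad with rests (an assumed policy) to fit the input layer.
    values = values[:n_inputs]
    values += [0.0] * (n_inputs - len(values))
    return values


# A semiquaver C, a quaver D and a semiquaver rest, on a six-node input layer.
print(encode_melody([(60, 1), (62, 2), (None, 1)], n_inputs=6))
# [0.6, 0.62, 0.62, 0.0, 0.0, 0.0]
```

Because every node stands for one semiquaver, the position of a value along the input layer carries the rhythm and the value itself carries the pitch.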
Networks were also trained and tested using the frequency of occurrence of
the notes in the samples as learning information. To discover whether the network
developed notions of key from the samples it was trained on, it was also trained and
tested using samples that were all in the same key.
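One way a frequency-of-occurrence input could be built is sketched below. The report does not say whether absolute pitches or pitch classes were counted, or how the counts were scaled, so folding pitches into twelve pitch classes and normalising to proportions are assumptions made for illustration, as is the function name `pitch_class_frequencies`.

```python
from collections import Counter


def pitch_class_frequencies(midi_pitches):
    """Return the proportion of notes falling in each of the 12 pitch classes.

    midi_pitches: list of MIDI note numbers (rests left out). Index 0 is C,
    index 1 is C sharp, and so on, ignoring octave.
    """
    counts = Counter(pitch % 12 for pitch in midi_pitches)
    total = len(midi_pitches)
    if total == 0:
        return [0.0] * 12
    return [counts[pc] / total for pc in range(12)]


# Two Cs in different octaves, one E and one G.
freqs = pitch_class_frequencies([60, 72, 64, 67])
print(freqs[0], freqs[4], freqs[7])  # 0.5 0.25 0.25
```

A vector like this discards note order entirely, so a network trained on it can only respond to which notes a sample favours, which is why it is a natural probe for whether key is being picked up.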
The output nodes were set up to give results of (1,0) for any sample by Bach,
and (0,1) for a sample by Stravinsky. The untrained network gave results of 0.5 for
both nodes when shown any sample of music. Over the training cycles a definite trend
became evident, with the outputs moving closer and closer to the expected values.
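The target coding can be written down directly, along with the squared error of an output pair against its target. This is a small sketch; the names `TARGETS`, `squared_error` and `classify` are illustrative, and the rule of picking the composer whose node has the larger output is an assumption, since the report does not say how intermediate outputs were read off.

```python
# Target output pairs as described: first node is Bach, second is Stravinsky.
TARGETS = {"Bach": (1.0, 0.0), "Stravinsky": (0.0, 1.0)}


def squared_error(outputs, target):
    """Sum of squared differences between the output nodes and the target."""
    return sum((t - o) ** 2 for o, t in zip(outputs, target))


def classify(outputs):
    """Pick the composer whose output node is larger (assumed decision rule)."""
    if outputs[0] == outputs[1]:
        return "undecided"
    return "Bach" if outputs[0] > outputs[1] else "Stravinsky"


# An untrained network answering (0.5, 0.5) is equally wrong for either composer.
print(squared_error((0.5, 0.5), TARGETS["Bach"]))        # 0.5
print(squared_error((0.5, 0.5), TARGETS["Stravinsky"]))  # 0.5
print(classify((0.5, 0.5)))                              # undecided
```

As training proceeds and the outputs move towards (1,0) or (0,1), this error falls towards zero for correctly classified samples.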
SNNS automatically plotted graphs of the sum of the squared errors versus the
number of training cycles. The number of training cycles depended on the
number of training samples and the size of the network. In general the network used
had twenty input nodes, ten hidden nodes and two output nodes, the first representing
a result of Bach and the other a result of Stravinsky. The training files had samples
from a number of pieces by the two composers, always with equal numbers of samples
by each composer. In general there were twenty training patterns and a variable
number of test patterns.
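For readers without SNNS, the kind of network described here can be sketched in plain Python. This is not SNNS code and not the original experiment: it is a minimal implementation of standard backpropagation for a 20-10-2 network, in which the logistic activation, the learning rate, the weight initialisation range, the class name `TinyMLP` and the two toy input patterns are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """Fully connected input -> hidden -> output net with logistic units."""

    def __init__(self, n_in=20, n_hidden=10, n_out=2, scale=0.1, seed=0):
        rng = random.Random(seed)
        # One row of incoming weights per unit; the last entry is the bias.
        self.w_hidden = [[rng.uniform(-scale, scale) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-scale, scale) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    @staticmethod
    def _layer(weights, inputs):
        extended = list(inputs) + [1.0]  # constant 1.0 feeds the bias weight
        return [sigmoid(sum(w * x for w, x in zip(row, extended)))
                for row in weights]

    def forward(self, x):
        hidden = self._layer(self.w_hidden, x)
        return hidden, self._layer(self.w_out, hidden)

    def train_pattern(self, x, target, lr=0.5):
        """One online backpropagation step; returns the squared error before it."""
        hidden, out = self.forward(x)
        # Error signals for output units, then propagated back to hidden units.
        d_out = [(t - o) * o * (1 - o) for t, o in zip(target, out)]
        d_hidden = [h * (1 - h) * sum(d * row[j] for d, row in zip(d_out, self.w_out))
                    for j, h in enumerate(hidden)]
        for row, d in zip(self.w_out, d_out):
            for j, h in enumerate(hidden + [1.0]):
                row[j] += lr * d * h
        for row, d in zip(self.w_hidden, d_hidden):
            for i, xi in enumerate(list(x) + [1.0]):
                row[i] += lr * d * xi
        return sum((t - o) ** 2 for t, o in zip(target, out))


# Two made-up 20-node patterns (not real music) with the (1,0)/(0,1) targets.
pattern_a = [0.60, 0.62, 0.64, 0.65, 0.67] * 4
pattern_b = [0.70, 0.00, 0.58, 0.00, 0.66] * 4
training_set = [(pattern_a, [1.0, 0.0]), (pattern_b, [0.0, 1.0])]

net = TinyMLP()
sse_per_cycle = []
for cycle in range(1000):
    sse_per_cycle.append(sum(net.train_pattern(x, t) for x, t in training_set))
```

Plotting `sse_per_cycle` against the cycle number gives the same kind of error curve that SNNS drew during training. With small initial weights, every output of the untrained network sits close to 0.5, which matches the behaviour reported above.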
Results:
The trained network demonstrated its ability to tell two pieces apart with very
high accuracy. In this test the Bach training sets were all taken from the Gigue from
Partita number 3 in E, and all the Stravinsky sets were taken from the first of his
Three Pieces for Clarinet Solo. Only the first hundred or so notes were used to make the
training patterns; the last twenty notes were used to test the network. It consistently
gave the correct answer for these test files.
With large numbers of training patterns taken from many different
pieces, the network did not perform well at all. It had difficulty generalising
well enough to categorise the samples by composer. However,
when the internal logic of the network was examined, its results were consistent
with what it had been trained on. For example, one of the training patterns, which
happened to be a sample from a Bach piece, had a rest at the beginning.
In all testing after the network had been trained using this sample, it classified
any test pattern that had a rest at the beginning as being a sample of Bach.
Commentary:
This project served as an interesting starting point for further work. Assuming
unlimited time and funds I would have loved to have used an actual neural network
instead of the simulator and used MIDI data to give a more accurate picture of what
music really is. In essence what I was actually teaching the neural net was how to
recognise music from its arbitrary numerical pattern. This is much the same as trying
to teach someone who can’t read music to recognise the difference between two pieces
simply by looking at the sheet music.