
March 2003
Music Analyser
MSc Project
By Jean-Marie FROUX
Preliminary report.
Project supervised by D. J. Styles.
Table of contents:
Introduction
1/ The software
2/ The problems to be solved
   a) How to capture the sound?
   b) How to find the note?
   c) How to save the work in the different formats, and in particular in MIDI?
   d) Other problems
3/ Planning
Conclusion
References
Annexes
   Annex 1: The table of frequencies
   Annex 2: A MIDI file example
Introduction:
Computers and music have worked well together for a long time, and the MIDI standard has established itself in computer-assisted music, in particular in score-writing assistance.
However, the musician who wishes to benefit from it must own a MIDI instrument, which in practice almost always means a keyboard. Many musicians who are not pianists would like to enjoy note recognition without having to invest in such equipment.
This is why it is interesting to consider recognising musical notes with only a microphone and a sound card as the interface to the instrument. That is the goal of this project: to create, in the C# language, a piece of software that recognises the notes played by a flute.
In this preliminary report we will study what exactly the software will do, take a first look at all the problems to be solved, and outline a quick plan.
1/ The software.
The first stage of the software will be to tune the instrument, so that the played notes are at the correct frequency.
Then the user will play his music and the software will write the notes on a score in real time. It will be possible to change some parameters of the spectrum analysis, such as the sampling frequency.
Once that is done, it will be possible to listen to what has been played and to what has been written.
The next stage will be the modification of the score: changing any wrong note, or adding new notes or rests, a title or other information, the key signature, bar lines…
Finally, the user will be able to print his work, save it as an image file (such as JPEG) or as a MIDI file, and save the recorded sound as a WAV file.
2/ The problems to be solved.
a) How to capture the sound?
I will use DirectX®. It is designed with hardware acceleration in mind: it tries to provide low-level access to the hardware while still remaining a generic interface.
DirectSound® and DirectMusic® are separate components of DirectX® with some overlapping functionality. Both play WAV sounds, and DirectMusic® ultimately synthesizes all sounds into waveforms that are played through DirectSound buffers.
We can use the DirectSound API independently to play WAV sounds, even in applications that use DirectMusic to play other content. We can also use DirectSound to manipulate sound buffers that are managed by DirectMusic.
The following table summarizes the functionality offered by the two APIs.
Functionality                                    | DirectMusic                                                  | DirectSound
Play WAV sounds                                  | Yes                                                          | Yes
Play MIDI                                        | Yes                                                          | No
Play DirectMusic Producer segments               | Yes                                                          | No
Load content files and manage objects            | Yes                                                          | No, but some support in sample code
Control musical parameters at run time           | Yes                                                          | No
Manage timeline for cuing sounds                 | Yes                                                          | No
Use downloadable sounds (DLS)                    | Yes                                                          | No
Set volume, pitch, and pan of individual sounds  | Yes, through DirectSound API                                 | Yes
Set volume on multiple sounds (audiopaths)       | Yes                                                          | No
Apply effects (DMOs)                             | Yes, through DirectMusic Producer content or DirectSound API | Yes
Chain buffers for mix-in (send) effects          | Yes, through DirectMusic Producer content                    | No
Capture WAV sounds                               | No                                                           | Yes
Implement full duplex                            | No                                                           | Yes
Capture MIDI                                     | Yes                                                          | No
So by using the Microsoft® DirectMusic® and Microsoft® DirectSound® interfaces in my application, I can capture wave sound from a microphone or other input and analyse it in real time. I explain how below.
First I create a capture buffer, calling the DirectSoundCapture8::CreateCaptureBuffer method for capturing waveform audio. Then I start the buffer by calling the IDirectSoundCaptureBuffer8::Start method. The buffer will keep running continuously rather than stopping when it reaches the end.
The program will wait until the desired amount of data is available, using the IDirectSoundNotify8::SetNotificationPositions method. When sufficient data is available, the program locks a portion of the capture buffer by calling the IDirectSoundCaptureBuffer8::Lock method. As parameters to the Lock method, the program passes the size and offset of the block of memory to be read.
The program then copies the data from the buffer, using the addresses and block sizes returned by the Lock method, and unlocks the buffer with the IDirectSoundCaptureBuffer8::Unlock method. The frequency analysis will take place at this stage.
All this is repeated until the user stops the recording, at which point the program will call the IDirectSoundCaptureBuffer8::Stop method.
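The lock/copy/unlock cycle above amounts to reading fixed-size blocks out of a circular buffer that the sound card fills continuously. The following sketch shows that logic only; it is written in Python purely for illustration (the project itself will use the DirectSound API from C#), and all the names in it are hypothetical.

```python
# Illustrative sketch of reading analysis blocks from a circular
# capture buffer. All names are hypothetical; the real project will
# call IDirectSoundCaptureBuffer8 from C#.

class CircularCaptureBuffer:
    def __init__(self, size):
        self.data = [0] * size      # the capture buffer itself
        self.size = size
        self.write_pos = 0          # where the "device" writes next
        self.read_pos = 0           # where the program reads next

    def device_write(self, samples):
        """Simulates the sound card filling the buffer continuously."""
        for s in samples:
            self.data[self.write_pos] = s
            self.write_pos = (self.write_pos + 1) % self.size

    def available(self):
        """Samples captured but not yet read (cf. notification positions)."""
        return (self.write_pos - self.read_pos) % self.size

    def read_block(self, block_size):
        """Lock a region, copy it out, unlock: one analysis block."""
        if self.available() < block_size:
            return None             # wait for the next notification
        block = [self.data[(self.read_pos + i) % self.size]
                 for i in range(block_size)]
        self.read_pos = (self.read_pos + block_size) % self.size
        return block

buf = CircularCaptureBuffer(size=8)
buf.device_write([1, 2, 3, 4, 5, 6])
print(buf.read_block(4))   # -> [1, 2, 3, 4]
```

The point of the circular layout is that capture never stops: the device keeps writing past the end and wraps around, while the reader consumes blocks behind it.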
b) How to find the note?
We can find the note by making a spectral analysis of the signal with an FFT.
A frequency spectrum is an ordered graph of the magnitudes of the sinusoidal components of an acoustic vibration as a function of frequency. As an indication, the ear is sensitive to sounds whose frequency lies between 20 Hz (low register) and 20 kHz (high register), that is, about 10 octaves, and whose magnitude is higher than about 30 dB. But a flute can only play notes between 130.81 Hz and 1046.50 Hz (see the frequency table in the annex), so this will limit our analysis.
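As a first illustration of this restricted analysis, the sketch below synthesizes a 440 Hz tone, computes its magnitude spectrum, and searches for the strongest peak only between 130.81 Hz and 1046.50 Hz. It is written in Python for illustration (the project will use an FFT routine from C#), and it uses a naive DFT so as to stay self-contained; the sampling rate and window length are assumptions of the sketch, not project decisions.

```python
# Minimal sketch: magnitude spectrum of a 440 Hz tone, with the peak
# search limited to the flute's range (130.81 - 1046.50 Hz).
import cmath
import math

def dft_magnitudes(signal):
    """Naive DFT magnitudes; an FFT computes the same result faster."""
    n = len(signal)
    return [abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2)]

sample_rate = 4096            # assumed capture rate for this sketch
n = 512                       # assumed analysis window length
signal = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(n)]

mags = dft_magnitudes(signal)
bin_hz = sample_rate / n      # frequency resolution: 8 Hz per bin
lo, hi = int(130.81 / bin_hz), int(1046.50 / bin_hz)
peak_bin = max(range(lo, hi + 1), key=lambda k: mags[k])
print(peak_bin * bin_hz)      # -> 440.0
```

Restricting the search to the instrument's range both speeds up the analysis and discards spurious peaks the flute cannot have produced.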
One passes from one octave to the next by multiplying (or dividing) the frequency by 2. The equal-tempered scale divides the octave into 12 intervals of 2^(1/12) each (≈ 1.0594). It is a system for recognizing the notes: the reference note is the "A3" at 440 Hz, and each other note is deduced from the preceding one by multiplication by T = 2^(1/12).
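This relation gives a direct way to map a detected frequency back to a note name. The following is a minimal Python sketch of that mapping (illustrative only; the project's version will be in C#), using the A = 440 Hz reference and the semitone ratio 2^(1/12):

```python
# Sketch: nearest equal-tempered note to a measured frequency,
# relative to the reference A at 440 Hz.
import math

NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(freq_hz, a_ref=440.0):
    """Return (name, exact_frequency) of the equal-tempered note
    closest to freq_hz."""
    semitones = round(12 * math.log2(freq_hz / a_ref))  # offset from A
    exact = a_ref * 2 ** (semitones / 12)
    name = NAMES[(9 + semitones) % 12]                  # A is index 9
    return name, round(exact, 2)

print(nearest_note(440.0))    # -> ('A', 440.0)
print(nearest_note(262.0))    # -> ('C', 261.63)
```

Rounding to the nearest semitone also tells us how far out of tune the played note is, which is exactly what the tuning stage of the software needs.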
It is in fact more complicated than that: the emitted note is made up of several components:
- the base component, called the fundamental, which is the first term of the series and whose frequency characterizes the note. In the majority of cases it has the greatest magnitude, but sometimes it does not, which can cause problems in recognizing the octave.
- the harmonics, whose frequencies are whole multiples of the fundamental. The number and intensity of the different harmonics determine the timbre of the instrument. And as if that were not enough, for a single instrument the set of harmonics changes with the pitch of the sound.
The theoretical spectrum of the reference note "A3" (440 Hz) is as follows:
[Figure: theoretical spectrum of the A3 — magnitude against frequency, with peaks at 440, 880, 1320 and 1760 Hz.]
Thus the main problem will be to find which peak corresponds to the note. The problem consists in selecting from the spectrum the note having the most power, knowing that a note corresponds to a fundamental frequency plus harmonic frequencies that are whole multiples of that fundamental. The difficulty consists in identifying, on the spectrum, which harmonics are attached to which fundamental. It is especially necessary to avoid treating the harmonics as notes, because they have never been played.
First we have to clean the spectrum of all the very small peaks due to noise. It is then necessary to associate fundamentals with harmonics: for each frequency, one looks for a possible fundamental in the cleaned spectrum.
We can create a list of all the possible fundamentals with, for each fundamental, a list of its associated harmonics. To select the right fundamental, we can look for the note having the most harmonics or the note having the most power, or even use these two criteria simultaneously by weighting their relative importance.
Another method could be the following:
We run through the table containing the peaks. For the index i of the table, we look at which frequencies j, when multiplied by 1, 2, 3, 4, 5 or 6, can give the sound of frequency i: i would thus be a harmonic of these frequencies.
As soon as a peak j could generate i, we add the amplitude of i to that of j and store the result in slot j of a results table. Thus the energies of the harmonics of a note are accumulated, whereas isolated harmonic energies are not. We can then detect which note has the most energy, and it should be the fundamental.
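The harmonic-summation method above can be sketched in a few lines. This is a Python illustration only (the project will implement it in C#); the tolerance parameter and the dictionary representation of the peak table are assumptions of the sketch.

```python
# Sketch of the harmonic-summation method: `peaks` maps a peak
# frequency (Hz) to its amplitude.

def find_fundamental(peaks, max_harmonic=6, tolerance=0.03):
    """For each candidate fundamental j, accumulate the amplitude of
    every peak i lying near a whole multiple of j (up to the 6th);
    the candidate with the highest total energy wins."""
    energy = {}
    for j in peaks:
        total = 0.0
        for i, amp in peaks.items():
            ratio = i / j
            n = round(ratio)
            if 1 <= n <= max_harmonic and abs(ratio - n) <= tolerance * n:
                total += amp
        energy[j] = total
    return max(energy, key=energy.get)

# A 440 Hz note whose 2nd harmonic (880 Hz) is stronger than the
# fundamental itself -- the summation still selects 440 Hz.
peaks = {440.0: 0.6, 880.0: 1.0, 1320.0: 0.3}
print(find_fundamental(peaks))   # -> 440.0
```

Note how this handles the octave problem mentioned earlier: even when the fundamental is not the strongest peak, its harmonics all contribute to its total, while 880 Hz collects only its own energy.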
c) How to save the work in the different formats, and in particular in MIDI?
A MIDI file is made up of a number of 'chunks' of data. The first of these is the header chunk, which contains information such as the format type, the number of tracks, and the timing resolution. This is followed by one or more track chunks, each of which contains the information for one complete track of MIDI data. All chunks, however, regardless of type, have the same basic components: each contains a four-byte indicator signifying the chunk type, another four-byte word containing the length in bytes of the data following, and lastly the data itself.
See Annex 2 for an example.
Header chunk
The header chunk type is always the same, being the letters 'M', 'T', 'h' and 'd' in ASCII format. Following this, we have four more bytes which indicate the length of the header chunk data. The six bytes of header chunk data are grouped into three sixteen-bit words, representing, in order:
- File format number
- Number of tracks
- Timing resolution. This is the time increment unit which is used in the track data to represent MIDI event durations, and it can be one of two types. If the MSB (Most Significant Bit, i.e. the leftmost bit) is set to 0, the unit is 'ticks' per quarter note (or crotchet), the actual number being expressed in the remaining 15 bits of the word. If the MSB is set to 1, however, the unit is in terms of 'ticks' per time code frame.
So the header chunk of a four-track Format 1 MIDI file with a time resolution of 1024 (&0400) ticks per crotchet will appear as follows: &4D &54 &68 &64 (the type 'MThd'), &00 &00 &00 &06 (six data bytes), &00 &01 (Format 1), &00 &04 (four tracks), &04 &00 (1024 ticks per crotchet).
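Such a header chunk can be generated directly from the three sixteen-bit words. A short Python sketch (illustrative; the project will write its files from C#) packs them big-endian after the type and length fields:

```python
# Sketch: build the 14-byte MIDI header chunk.
import struct

def header_chunk(fmt, n_tracks, ticks_per_crotchet):
    """'MThd' + 4-byte data length (always 6) + three big-endian
    16-bit words: format, track count, timing resolution."""
    return (b"MThd"
            + struct.pack(">I", 6)
            + struct.pack(">HHH", fmt, n_tracks, ticks_per_crotchet))

# Four-track Format 1 file at 1024 ticks per crotchet:
print(header_chunk(1, 4, 1024).hex())
# -> 4d54686400000006000100040400
```

The same `struct.pack(">...")` pattern (big-endian, as the MIDI specification requires) serves for the track chunk length field as well.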
Track chunk
A track chunk contains within it a complete description of one sequencer track. Like the header chunk, it consists of a type ('MTrk' in ASCII), a data length indicator, and the data itself. The data consists of a series of track events. A track event can be anything that can be sent down a MIDI cable: NOTE ON, NOTE OFF, PROGRAM CHANGE etc. Alternatively, it can represent what is known as a meta-event, that is to say, information about the piece of music. This may be the key or time signature, the name of the track or tempo information. A meta-event always starts with a three-byte header, the first byte of which is &FF. The second indicates the meta-event type, and the third indicates the number of bytes in the data following. A C major key signature meta-event, for example, would have the format &FF &59 &02 &00 &00: two data bytes, meaning no sharps or flats and the major mode.
Track events, whether they are MIDI messages or meta-events, are prefixed with a number of bytes representing the delta time. This is the amount of time in ticks which has elapsed since the last event, and it is used to denote note and rest durations.
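The delta time is stored as a variable-length quantity: seven bits per byte, most significant group first, with the top bit set on every byte except the last. A minimal Python sketch (for illustration; the project will do this in C#):

```python
# Sketch: encode a tick count as a MIDI variable-length quantity.

def encode_delta(ticks):
    """Seven bits per byte, most significant first; the continuation
    bit (0x80) is set on every byte except the last."""
    groups = [ticks & 0x7F]
    ticks >>= 7
    while ticks:
        groups.append((ticks & 0x7F) | 0x80)
        ticks >>= 7
    return bytes(reversed(groups))

print(encode_delta(0).hex())      # -> 00
print(encode_delta(512).hex())    # -> 8400  (a delta seen in Annex 2)
print(encode_delta(1024).hex())   # -> 8800
```

This is why the deltas in Annex 2 such as &84 &00 are two bytes long: 512 ticks does not fit in the seven payload bits of a single byte.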
d) Other problems.
There are some other, less important, problems for this project:
How to find the duration of a note?
How to write the score?
3/ Planning.
The project will begin on the 19th of May.
I expect to be able to capture sound by the start of June, and to detect a note by the start of July. From there I think I will have to do a lot of tests to improve the detection. If I still have plenty of time at this stage, I will try to apply my program to other instruments, such as the guitar or the piano.
I want this project to be finished by mid-August, then write the report and be able to present it at the beginning of September, because I then want to continue my studies in a French engineering school.
Conclusion.
This project is a multidisciplinary one: it needs some knowledge of programming, digital signal processing, mathematics, the physics of sound…
After writing this report, the project and its realization are now clearer in my mind.
I am certain that this work will be enthralling.
References:
http://villemin.gerard.free.fr/CultureG/MusNote.htm
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/directx9_c/directx/htm/gettingstartedwithdirectsound.asp
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/directx9_c/directx/htm/directsoundanddirectmusic.asp
http://chv.chez.tiscali.fr/jm/musique/
http://courses.ece.uiuc.edu/ece291/books/labmanual/io-devices-speaker.html
http://www.lgu.ac.uk/~seago/SMF.html
http://sunlightd.virtualave.net/Windows/DirectX.NET/
Annexes:
Annex 1: The table of frequencies.
The following table lists the frequencies of the three octaves which a flute can play. The C at 261.63 Hz is middle C; the range ends on the C at 1046.50 Hz.

Note   Octave 1 (Hz)   Octave 2 (Hz)   Octave 3 (Hz)
C        130.81          261.63          523.25
C#       138.59          277.18          554.37
D        146.83          293.66          587.33
D#       155.56          311.13          622.25
E        164.81          329.63          659.26
F        174.61          349.23          698.46
F#       185.00          369.99          739.99
G        196.00          392.00          783.99
G#       207.65          415.30          830.61
A        220.00          440.00          880.00
A#       233.08          466.16          932.33
B        246.94          493.88          987.77
C        261.63          523.25         1046.50
Annex 2: A MIDI file example:
Taking the score above as an example, the following Format 0 file was created out of it. (Format 0 is the simplest, and is used for recording a single multichannel track of MIDI data, together with tempo information.)

Delta time   Data                          Interpretation
             &4d &54 &68 &64               Header chunk
             &00 &00 &00 &06               Six data bytes
             &00 &00                       Format 0
             &00 &01                       One track
             &04 &00                       1024 ticks/crotchet
             &4d &54 &72 &6b               Track chunk
             &00 &00 &00 &59               89 data bytes
&00          &ff &58 &04 &04 &02 &18 &08   Time signature (4/4)
&00          &ff &59 &02 &00 &00           Key signature (C)
&00          &ff &51 &03 &09 &89 &68       Tempo (crotchet = 96)
&00          &90 &48 &40                   C (NOTE ON)
&84 &00      &80 &48 &00                   NOTE OFF
&00          &90 &43 &40                   G (NOTE ON)
&82 &00      &80 &43 &00                   NOTE OFF
&00          &90 &43 &40                   G (NOTE ON)
&82 &00      &80 &43 &00                   NOTE OFF
&00          &90 &44 &40                   A flat (NOTE ON)
&84 &00      &80 &44 &00                   NOTE OFF
&00          &90 &43 &40                   G (NOTE ON)
&84 &00      &80 &43 &00                   NOTE OFF
&84 &00      &90 &47 &40                   B (NOTE ON)
&84 &00      &80 &47 &00                   NOTE OFF
&00          &90 &48 &40                   C (NOTE ON)
&88 &00      &80 &48 &00                   NOTE OFF
&01          &ff &2f &00                   End of track