funkmeister 7 - Mads Lykke | sound

Funkmeister7 - Group 405 - Aalborg University Copenhagen - Medialogy, 4th semester – spring 2005
FUNKMEISTER 7
ABSTRACT
The Funkmeister7 project is about the creation of a voice-controlled, digital drum synthesizer that
is triggered by the input from a standard microphone. The output sound can be altered with a
tactile sensor that we have built ourselves. The core of the project is the analysis of the human
voice imitating drum sounds, in order to extract the information needed to trigger the drum
synthesizer we have built.
The voice input from the microphone is analysed in order to find the attack characteristics of a
percussive sound and the frequency characteristics of a bass drum and a snare drum,
respectively. The result of this analysis is used to trigger a drum synthesizer, which we built in
Max/MSP.
The drum synthesizer uses elements that simulate the sound of vintage drum synthesizers:
combinations of frequency-swept sinusoidal waves, noise, band-pass filtered noise,
additive synthesis, and amplitude envelopes.
The tactile sensor, used to alter the parameters of output sound from the drum synthesizer,
consists of a voltage divider switching network and a potentiometer, connected to a
microprocessor.
After the design and implementation of the system we subject it to user tests, and draw
conclusions based on these.
TABLE OF CONTENTS
ABSTRACT  2
INTRODUCTION  6
PROBLEM DEFINITION  7
  Delimitations  8
  Success Criteria  8
RESEARCH AND THEORY  9
  Sound Theory  9
    The Basics of Sound  9
    Digital Representation of Sound  12
    Filters and Effects  14
    Sound Synthesis  20
ANALYSIS AND DISCUSSION  23
  Percussive Sound Characteristics  23
    Amplitude Envelope  23
    Voice-to-drum Frequency Analysis  24
    Conclusion of Percussive Sound Characteristics  29
  Short Analysis of DrumSynth 2  29
  Electronic Sensor Interfacing  31
  Mapping  32
  Real-time  35
DESIGN & IMPLEMENTATION  37
  Short Description of the System  37
  System overview  38
  Mapping  41
    Mapping the microphone  42
    Mapping the Tactile Sensor  42
  The Tactile Sensor  45
    Transducing Motion and Pressure  46
    Formatting the Transduced Data  47
    The Tactile Sensor – Software (Max/MSP)  48
  Sound Input Analysis  51
    Introduction to Sound Input Analysis  51
    Attack Detection  52
    Drum-type Detection  53
    Drum Triggering  56
  Sound Synthesis  58
    Introduction to Sound Synthesis  58
    Description of the Sound Synthesis Blocks  59
USER TEST  70
  Introduction to User Test  70
  Summary of the Answers  71
  Conclusion of User Test  72
CONCLUSION  74
  General Conclusion  74
  Sound Analysis  74
  Sound Synthesis  75
  Sensor  75
  The Mapping  76
  Testing  76
  Future Improvements  77
  Perspective  78
LITERATURE AND SOURCES  79
  Primary Literature  79
  Papers  79
  Lectures  80
  Internet  80
  Software  83
APPENDICES  84
  Appendix 1 - Sensor Theory  84
    Voltage  84
    Current  84
    Resistance  85
    Circuits  86
    Switches  86
    Variable Resistor and Voltage Divider Switching Network  87
  Appendix 2 - Sound Hardware Specifications  89
  Appendix 3 – Bass and Snare drum peak frequencies of recorded voice-to-drum samples  90
  Appendix 4 – User Test Questionnaire with Answers  92
    User test of Funkmeister7 – 01  92
    User test of Funkmeister7 – 02  93
    User test of Funkmeister7 – 03  94
    User test of Funkmeister7 – 04  95
    User test of Funkmeister7 – 05  96
    User test of Funkmeister7 – 06  97
INTRODUCTION
Have you ever come up with a catchy tune or a cool drum beat and found yourself humming or
"beat-boxing"[1] it with your mouth? Have you been unable to put it to any good use, because of a
lack of musical knowledge? Imagine if there was a system that could take the humming or
beat-boxing straight from your mouth and turn it into actual music, without forcing you to learn how
to play an instrument or learn any musical theory. This is the entry point of this project: to make the
process of music making much more straightforward and intuitive.
Figure 1 - Roland TR-909
The basic idea for this project is to create a software-based drum machine system, inspired by
the classic vintage ones like the Roland TR-909[2] (see Figure 1), that the user is able to "play" with
his mouth. The drum machine will play different kinds of synthesized, percussive sounds, based
on the microphone input it receives from the user. For example, if the user tries to imitate a bass
drum sound with his mouth, the system should recognize this and play an actual, synthesized
bass drum sound. In addition to triggering the drum sounds, the user should also be able to
manipulate these sounds with his feet, via a controller unit placed on the floor. All of this should be
linked together in a simple graphical user interface.
With this system the user will be able to create a wider range of sounds than is possible with the
mouth alone.
[1] Definition of beat-boxing: http://en.wikipedia.org/wiki/Beatboxing
[2] More info about the TR-909: http://www.synthtopia.com/synth_review/RolandTR-909.html
Figure 2 shows the initial sketch of the idea.
Figure 2 - Initial sketch of the idea
PROBLEM DEFINITION

- How can we analyze voice-to-drum simulated sounds, inputted through a microphone, and identify the different drum types?
- How do we create a synthetic set of drum sounds that resemble the sounds of the vintage drum machines, and that can be triggered in synchronization with the identified sounds from the voice-to-drum input?
- How do we create a tactile[3] sensor that resembles a floorboard and can manipulate the parameters of the synthetically created drum sounds?
[3] Tactile: perceptible to the sense of touch; tangible.
Delimitations
We have decided that the main focus of this project is the sound analysis and sound synthesis,
and therefore we are not going to build the tactile sensor as a floorboard controller, as we
imagine it in the final version of the system. Instead we are going to build a smaller tactile
sensor that resembles the structure we had in mind for the final version, and which can be
operated by hand. The reason for this is simply that we do not have enough time and resources
to produce the floorboard at the current time.
We will not build our own microphone, as the standard ones that are available to us suit our
needs sufficiently.
Furthermore, this prototype version of the system will only detect and play two different kinds of
drums: bass drum and snare drum. This, again, is a matter of time and resources. These two
drums have been chosen, since they are the two most fundamental drum types in most musical
genres.
Success Criteria
In order to be able to determine later in the process whether our system has been a success
(see page 70), we need to define some criteria for success:
- The system should be able to work in 'perceived real-time' in order to be usable in a musical context, i.e. the user should perceive the triggering and playback of the synthesized drum sound, based on the voice-to-drum input, as happening instantaneously (see the "Real-time" chapter on page 35 for more details).
- Most people should be able to learn to use the system within a reasonable period of time (approx. 10-15 min), to make sure that our system is straightforward and intuitive to use.
RESEARCH AND THEORY
Since we have narrowed the project down to focus mainly on sound analysis and sound
synthesis, we are now going to look at some of the basic theory behind these. The purpose is to
provide a better understanding of the choices we make later in the process.
We are going to look at the basics of sound, digital representation of sound, filters and effects, and
sound synthesis. We are also going to look at some basic theory behind the concept of
"mapping", in order to get a better idea of how to map the input from the tactile sensor and the
sound input from the microphone to the synthesized output sounds.
The theory behind the tactile sensor can be found in Appendix 1 - Sensor Theory, on page 84.
Sound Theory
In this chapter we explain some of the fundamentals of sound in general and how sound can be
represented in digital form. We then describe some of the filters and effects that can be applied
to sound, and lastly we explain how basic sound synthesis works.
The Basics of Sound[4]
The sound we hear is basically changes in air pressure. When an object moves, it sets the air
molecules near it in motion, those molecules set their neighbours in motion, and so on.
When this motion reaches our ears, it is perceived as sound.
[4] Sources for this chapter:
- "MSP Tutorials and Topics", pages 13-19, by Cycling74 - http://www.synthesisters.com/download/MSP45TutorialsAndTopics.pdf
- "Digital Audio - introduction to theory" - http://www.chienworks.com/webinfo/digaudio/
- "AP2 – lecture 1" slides by Stefania Serafin - http://www.media.aau.dk/ap2/lecture1ap2.pdf
Sound can be represented as a graph of air pressure over time (see Figure 3).
Figure 3 - Graph of air pressure over time
The amplitude of a sound is the amount of change in air pressure, measured in decibels (dB). In
general, the higher the amplitude, the higher the perceived loudness of the sound.
The amplitude envelope of a sound refers to the shape of the overall change in amplitude over the
course of its duration (see Figure 4).
Figure 4 - Amplitude envelope
The attack part of the amplitude envelope is the range from where the sound starts until it reaches
its peak amplitude. Decay is where the amplitude falls to the sustain part of the sound. At the
sustain part the amplitude roughly keeps the same level, until it reaches the release part, where
the amplitude drops to its final level. These concepts are used in connection with sound synthesis
(see "Sound Synthesis" on page 20).
The frequency of a sound wave is the number of cycles per second, measured in hertz (Hz) (see
Figure 3 on page 10). The higher the frequency, the higher the perceived pitch of the sound will
be. The audible frequency range for a human is approximately 20 - 20,000 Hz.
Most sounds contain more than just a single frequency. These are called complex tones. The
spectrum of a sound is the combination of all the frequencies the sound consists of, together with
their amplitudes. Figure 5 and Figure 6 below show the sound of a snare drum in the
time domain (amplitude over time) and in the frequency domain (amplitude over frequency),
respectively.
Figure 5 - time domain of a snare drum sound
Figure 6 - frequency domain of a snare drum sound
Each individual frequency of a complex tone is called a partial. When these partials are all integer
multiples of the same frequency, the sound has a harmonic spectrum. These harmonic sounds
are usually perceived as having a single pitch, and are used for creating music, for example. The
partials of an inharmonic sound, on the other hand, are not all integer multiples of a fundamental
frequency, and thus they do not blend together in a single perceived pitch as easily as with the
harmonic ones. When we perceive noise, the sound consists of a lot of different frequencies with
no apparent mathematical relationship.
Digital Representation of Sound[5]
The basic concept of digital sound consists of taking a lot of snapshots of the amplitude values of
the sound, saving those values as numbers, and then reproducing the amplitude based on these.
Figure 7 shows the process of converting an analogue sound signal into digital information and
then playing it back.
Figure 7 - Digital recording and playback[6]
From the source, the sound goes into a microphone. The microphone converts the change in air
pressure into change in electrical voltage. To limit the amount of information that needs to be
processed, this change in voltage is only recorded at a certain periodic interval, in a process
called sample and hold. Basically, the voltage value is recorded and then kept at that value until
the next periodic sample is taken (see Figure 8). The number of samples taken per second is
called the sampling rate, and it is measured in hertz (Hz).
[5] The source for this chapter is "MSP Tutorials and Topics", pages 21-28, by Cycling74 - http://www.synthesisters.com/download/MSP45TutorialsAndTopics.pdf
[6] Source: "MSP Tutorials and Topics", page 23, by Cycling74 - http://www.synthesisters.com/download/MSP45TutorialsAndTopics.pdf
Figure 8 - Voltage signal sampled periodically (amplitude over time)[7]
To represent a sound accurately, the computer needs to take many samples per second. The
Nyquist Theorem[8] states that a computer can only accurately represent frequencies that are half
or less than half of the sampling rate. For example, to accurately sample frequencies up to the
20,000 Hz a human can perceive, we would need to use a sampling rate of at least 40,000 Hz.
This is why the signal is sent through a low-pass filter before the sample and hold process: to
remove any frequencies above half the sampling rate, so that those frequencies do not
create noise (aliasing) in the signal (see page 14 for more information on filters).
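The aliasing that the low-pass filter prevents can be demonstrated in a few lines of code. The sketch below is our own Python illustration (the project itself is built in Max/MSP), with frequencies chosen purely for the example: a tone above half the sampling rate produces exactly the same sample values as a lower "alias" tone.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, n_samples):
    """Sample a sine wave of the given frequency at the given rate."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

# A 9,000 Hz tone sampled at only 8,000 Hz violates the Nyquist limit:
# it yields the same sample values as a 1,000 Hz tone (9,000 - 8,000 Hz),
# so the two are indistinguishable after sampling.
sr = 8000
high = sample_sine(9000, sr, 32)
alias = sample_sine(1000, sr, 32)
assert all(abs(a - b) < 1e-9 for a, b in zip(high, alias))
```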
From the sample and hold process, the sampled voltage values then go into a device called an
Analogue-to-Digital Converter (ADC). Here the voltages are converted into strings of
binary digits, in a process called quantization. The higher the resolution of the quantization, the
more values can be assigned to the amplitude range of the sound, and thus the more precisely the
sound can be stored digitally. For example, a resolution of 8 bits allows the amplitude range to be
divided into 256 steps (2⁸), 16 bits allows 65,536 steps (2¹⁶), and so on. If the incoming signal is
higher than the maximum amplitude that can be expressed with the available numbers, a
phenomenon called clipping occurs. Clipping causes the sound to be cut off, and become more
or less distorted (see Figure 9 below).
Figure 9 - Clipping of a signal[9]
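Quantization and clipping can be illustrated with a small Python sketch of our own (not part of the Max/MSP system): amplitudes in the range -1.0 to 1.0 are mapped to signed integers of a given bit depth, and anything outside the representable range is clipped.

```python
def quantize(samples, bits):
    """Quantize samples in [-1.0, 1.0] to signed integers of the given
    bit depth, clipping anything outside the representable range."""
    levels = 2 ** (bits - 1)                   # e.g. 8 bits -> -128..127
    out = []
    for s in samples:
        v = int(round(s * levels))
        v = max(-levels, min(levels - 1, v))   # clipping happens here
        out.append(v)
    return out

# 1.2 is above the representable maximum, so it is clipped to 127.
print(quantize([0.0, 0.5, -1.0, 1.2], 8))   # -> [0, 64, -128, 127]
```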
[7] Source: "MSP Tutorials and Topics", page 22, by Cycling74 - http://www.synthesisters.com/download/MSP45TutorialsAndTopics.pdf
[8] Source: "MSP Tutorials and Topics", page 23, by Cycling74 - http://www.synthesisters.com/download/MSP45TutorialsAndTopics.pdf
[9] Source: "MSP Tutorials and Topics", page 27, by Cycling74 - http://www.synthesisters.com/download/MSP45TutorialsAndTopics.pdf
After the quantization step, the data is stored on the computer. It takes about 10 MB of memory
to store one minute of audio data in compact disc quality (44,100 Hz, 16 bit, stereo):
44,100 samples/s × 2 bytes × 2 channels × 60 s ≈ 10.6 MB.
When the audio needs to be played back again, it first goes through a Digital-To-Analogue
Converter (DAC), which transforms the strings of stored binary digits into a continuous stream of
voltage. The signal then goes through a low-pass filter, to filter out any potential high-frequency
noise created by the sample and hold process, before it is amplified and sent through a speaker.
Filters and Effects[10]
A filter is used to change the characteristics of a sound, by shaping the spectrum of the signal. It
does not change the frequencies present in a signal, only their amplitude and phase (placement in
time). To describe the characteristics of a filter, you apply a sine wave to the input and measure
the differences at the output. The behaviour of the filter across frequencies is called its frequency
response, which consists of an amplitude response and a phase response; both vary with
frequency. The amplitude response is the ratio between the amplitude of the output sine wave and
the amplitude of the input sine wave. You can normally tell which type of filter is being used by the
shape of its amplitude response. The phase response describes the phase change of the output
compared to the input.
Low-pass filter
A low-pass filter only allows frequencies below its cut-off frequency point (fc) to pass.
Frequencies above this point are removed. However, the filter cannot cut off the frequencies
abruptly, and therefore there will always be a smooth transition between the frequencies that
are kept (the passband) and the frequencies that are thrown away (the stopband). Because of this
transition it can be difficult to specify where the cut-off frequency is, but normally it is defined as
the point where the signal has dropped 3 dB below its maximum level (see Figure 10
below).
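The pass/stop behaviour can be sketched with the simplest possible digital low-pass filter, a one-pole filter. This is our own minimal Python illustration, not the filter design used in the project, and the coefficient formula is one common approximation.

```python
import math

def one_pole_lowpass(x, sample_rate_hz, cutoff_hz):
    """Very simple one-pole low-pass: y[n] = a*x[n] + (1-a)*y[n-1]."""
    a = 1.0 - math.exp(-2 * math.pi * cutoff_hz / sample_rate_hz)
    y, prev = [], 0.0
    for s in x:
        prev = a * s + (1 - a) * prev
        y.append(prev)
    return y

def gain_at(freq_hz, sample_rate_hz, cutoff_hz):
    """Measure the steady-state amplitude of a filtered unit sine wave."""
    n = sample_rate_hz  # one second of samples
    x = [math.sin(2 * math.pi * freq_hz * t / sample_rate_hz)
         for t in range(n)]
    y = one_pole_lowpass(x, sample_rate_hz, cutoff_hz)
    return max(abs(v) for v in y[n // 2:])   # skip the initial transient

# Frequencies well below the cut-off pass almost unchanged;
# frequencies well above it are strongly attenuated.
low = gain_at(100, 44100, 1000)
high = gain_at(8000, 44100, 1000)
assert low > 0.9 and high < 0.3
```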
[10] Source for this chapter: "Computer Music", pages 171-174.
Figure 10 - Amplitude response of a low-pass and a high-pass filter
High-pass filter
A high-pass filter does the opposite of a low-pass filter. It discards the low frequencies and
keeps the high frequencies (see Figure 10).
Band-pass filter
A band-pass filter is a combination of a low-pass and a high-pass filter. It discards both low
and high frequencies, with a passband in between. It has a centre frequency (f0), which is the
centre of the passband, and a bandwidth (BW). The bandwidth is defined by a lower cut-off
frequency (fl) and an upper cut-off frequency (fu) (see Figure 11 below). The response of a
band-pass filter can often be described as either sharp or broad, depending on the bandwidth.
Figure 11 - Amplitude response of a band-pass and a band-reject filter
Band-reject filter
The band-reject filter has the opposite amplitude response of a band-pass filter. It rejects a
band of frequencies and passes the rest (see Figure 11, above). It is defined by a centre
frequency and a bandwidth like the band-pass filter.
Delay
A delay is created by changing the phase of the signal as it passes through the filter (phase
response). It is a simple but very useful effect. The most basic type of delay is shown in Figure
12. The input signal is played immediately, and after a certain delay time (t), the delayed signal
is played. The delayed signal is multiplied by an amplitude factor (g) (amplitude response),
which would normally be below 1, since the delayed signal would otherwise be louder than the original.
Figure 12 - Diagram of a basic delay
If the delay time is short (approx. 40-120 ms) it is called a slapback delay; otherwise it is called an
echo[11].
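The diagram in Figure 12 translates directly into code. Below is a Python sketch of our own (the project's delay lives in Max/MSP): the output is the dry signal plus a delayed copy scaled by g.

```python
def simple_delay(x, delay_samples, g):
    """Mix the dry signal with a delayed copy scaled by g, as in
    the basic delay of Figure 12."""
    out = []
    for n in range(len(x)):
        delayed = x[n - delay_samples] if n >= delay_samples else 0.0
        out.append(x[n] + g * delayed)
    return out

# An impulse comes out once immediately and once more, scaled by g,
# after the delay time.
print(simple_delay([1.0, 0.0, 0.0, 0.0, 0.0], 3, 0.5))
# -> [1.0, 0.0, 0.0, 0.5, 0.0]
```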
Comb filter
Comb filters are used together with all-pass filters (see below) to create reverberation[12] (see the
"Reverberation" section on page 18).
The comb filter works by sending the signal through a delay. The delayed signal is fed back to the
input, after being multiplied by an amplitude factor (g). The time the signal takes to go
through this loop is determined by the loop time (t) (see Figure 13). The amplitude factor (g) must
be between 0 and 1, not including 1, so that the signal gets quieter with each loop. The closer g
gets to 1, the more extreme the reverberation will sound[13].
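The feedback loop can be sketched in a few lines. This is one common form of the feedback comb filter, written as a Python illustration of our own rather than the Max/MSP implementation: each pass through the loop is g times quieter than the last.

```python
def comb_filter(x, loop_samples, g):
    """Feedback comb filter: the output from loop_samples ago is scaled
    by g (0 <= g < 1) and fed back into the input, as in Figure 13."""
    y = []
    for n in range(len(x)):
        feedback = g * y[n - loop_samples] if n >= loop_samples else 0.0
        y.append(x[n] + feedback)
    return y

# An impulse produces a repeat every loop time, each one g times quieter,
# which is the decaying echo train that reverbs are built from.
print(comb_filter([1.0] + [0.0] * 7, 2, 0.5))
# -> [1.0, 0.0, 0.5, 0.0, 0.25, 0.0, 0.125, 0.0]
```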
[11] Source: http://www.harmony-central.com/Effects/Articles/Delay/
[12] Source: http://www.harmony-central.com/Effects/Articles/Reverb/
[13] Source: "Computer Music", page 296.
Figure 13 - Comb filter
All-pass filter
All-pass filters are used either alone or together with comb filters to create reverberation (see the
"Reverberation" section on page 18).
The all-pass filter is similar to a comb filter, but a bit more advanced. Again we have a loop time
(t) and the amplitude factor (g) which must be less than 1 (see Figure 14).
Figure 14 - All-pass filter
In an all-pass filter there is, unlike the comb filter, no delay between the input and output. The first
output, or impulse response, will therefore be:
1 · (−g) = −g
The next impulse response will be:
(1 · g) · (−g) + 1 = 1 − g²
And the impulse response after that will be:
(g · g) · (−g) + g = g · (1 − g²)
… and so on[14] (see Figure 15)
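The derivation above can be checked numerically. The difference equation below is one standard form of the all-pass filter consistent with Figure 14; the sketch is our own Python illustration, not project code.

```python
def allpass(x, loop_samples, g):
    """All-pass filter: y[n] = -g*x[n] + x[n-t] + g*y[n-t],
    where t is the loop time in samples (cf. Figure 14)."""
    y = []
    for n in range(len(x)):
        x_del = x[n - loop_samples] if n >= loop_samples else 0.0
        y_del = y[n - loop_samples] if n >= loop_samples else 0.0
        y.append(-g * x[n] + x_del + g * y_del)
    return y

# Feed in an impulse and read off the first three non-zero outputs;
# they match the derivation above exactly.
g = 0.5
h = allpass([1.0] + [0.0] * 5, 2, g)
assert h[0] == -g                 # -g
assert h[2] == 1 - g * g          # 1 - g^2
assert h[4] == g * (1 - g * g)    # g(1 - g^2)
```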
[14] Source: "Computer Music", page 297.
Figure 15 - The impulse response of an all-pass filter
Reverberation
Reverberation (reverb) is the result of sound reflecting off surfaces. It is similar to multiple echoes,
and yet it is not. Imagine that you are standing in a room. At the other end of the room someone hits a
drum. The first sound you will hear is the sound wave that has travelled directly from the drum to
you. But a few milliseconds later, the same sound will hit you again; this time the sound wave
has not travelled directly, but has been reflected off the walls, floor and/or ceiling (see
Figure 16). This continues until the sound "dies". Every time the sound reflects off a surface it
loses some of its energy (amplitude). How much energy it loses depends on the surface.
E.g. hard, solid surfaces, such as marble, absorb very little energy, whereas soft materials, such
as curtains, absorb energy very well. The water vapour in the air also contributes to the
loss of energy. Other factors that influence the amount of reverb are the size and shape of the
environment[15].
[15] Source: "Computer Music", page 289.
Figure 16 - Sound is reflected off the different surfaces
Humans do not perceive all the reflected sounds as independent sounds, because they hit the
listener within a few milliseconds of each other. But we can hear the effect of the reverb. If you
could distinguish between the sounds, they would be like echoes[16].
When creating a digital reverb, a series of delays is not enough. More factors than just
the delays influence the effect: the early and late reflections, and the reverberation time[17].
The early reflections are the first reflected sound waves that reach the listener. Their amplitudes
are almost as high as the sound wave travelling directly to the listener. There is quite a big gap
between the arrivals of these early reflections (see Figure 17).
The late reflections reach the listener after the early reflections. They arrive much closer to each
other, and with a more random interval between them.
The amplitude of the reflections gets lower as time passes, since later reflections have travelled
farther and have reflected off surfaces more times than the earlier ones. However, if you look at
a small section of the spectrum, this will vary a little (see Figure 17).
The reverberation time is how long it takes for the sound to die away to 1/1000th of its original
amplitude[18], i.e. a drop of 60 dB. As explained earlier, this depends on the size, shape, and
surfaces of the environment.
[16] Source: http://www.harmony-central.com/Effects/Articles/Reverb/
[17] Source: http://www.harmony-central.com/Effects/Articles/Reverb/
[18] Source: "Computer Music", page 290.
Figure 17 shows a graphical representation of a reverb. Each of the lines represents the same
piece of the sound when it hits the listener. The height shows the amplitude of the sound when it
reaches the listener.
Figure 17 - Sound decay over time[19]
Sound Synthesis
As we mentioned earlier, digital sound is basically strings of binary digits that hold the amplitude
information of a sound. We have explained how these numbers can be created from real-life
sounds through a microphone and analogue-to-digital conversion (see the "Digital Representation
of Sound" chapter on page 12).
The basic concept of digital sound synthesis is to create these numbers directly on the computer,
without actually having a real-life sound as the source. In this chapter we will explain some of the
basic concepts behind sound synthesis.
Oscillators
The oscillator is fundamental to almost all computer synthesis units, and produces a periodic
waveform. You can set the frequency and amplitude of the waveform.
The output of the oscillator is a sequence of samples which forms a digital signal representing the
waveform.
[19] Image taken from http://www.harmony-central.com/Effects/Articles/Reverb/
An oscillator that produces periodic waveforms has well-defined spectral components; such a
spectrum is called a discrete spectrum. The opposite is a distributed spectrum, which covers all
frequencies. An oscillator that produces such a spectrum is called a noise oscillator.
Noise is sound with an extremely rich spectrum. A signal covering all frequencies is called white
noise, but other types of noise also exist. It is made by generating a random amplitude value at
each sample.[20]
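The two kinds of oscillator can be sketched in a few lines. The project's oscillators are Max/MSP objects; the Python below is our own plain-code illustration of the same ideas.

```python
import math
import random

def sine_oscillator(freq_hz, amp, sample_rate_hz, n_samples):
    """Periodic oscillator: a sine wave, which has a discrete spectrum
    (a single spectral component)."""
    return [amp * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

def white_noise(amp, n_samples):
    """Noise oscillator: a random amplitude value at each sample gives
    a spectrum covering all frequencies."""
    return [amp * random.uniform(-1.0, 1.0) for _ in range(n_samples)]

tone = sine_oscillator(440, 0.8, 44100, 256)
noise = white_noise(0.8, 256)
```

Both produce a sequence of samples forming a digital signal, as described above; only the spectrum differs.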
ADSR envelopes
Amplitude envelopes are used to create a sequence of amplitude values, instead of using a fixed
amplitude. This is done to create a more natural sound; no natural sound has a constant
amplitude. In connection with the synthesis of musical sounds, ADSR envelopes are used to
create more realistic amplitude shapes.
ADSR envelopes consist of four phases: Attack, Decay, Sustain, and Release (see "Figure 4 -
Amplitude envelope" and the further description on page 10). Percussive instruments have a very
short attack, whereas instruments such as the pipe organ or tuba have a longer attack[21].
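An ADSR envelope can be generated as a list of per-sample gains. The sketch below is our own Python illustration using straight-line segments (an assumption for simplicity; real envelopes, including the percussive decays discussed later, are often exponential).

```python
def adsr_envelope(attack, decay, sustain_level, sustain, release):
    """Build an ADSR amplitude envelope as a list of per-sample gains.
    attack, decay, sustain and release are phase lengths in samples;
    sustain_level is the gain held during the sustain phase."""
    env = []
    env += [n / attack for n in range(attack)]                 # rise 0 -> 1
    env += [1 - (1 - sustain_level) * n / decay
            for n in range(decay)]                             # fall 1 -> sustain
    env += [sustain_level] * sustain                           # hold
    env += [sustain_level * (1 - n / release)
            for n in range(release)]                           # fall -> 0
    return env

# A percussive sound would use a very short attack; multiplying a
# synthesized waveform by this envelope, sample by sample, shapes it.
env = adsr_envelope(attack=4, decay=4, sustain_level=0.6, sustain=4, release=4)
assert max(env) <= 1.0 and min(env) >= 0.0
```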
Synthesis techniques
Over time, several synthesis techniques have been invented. Each of them has its own
qualities and can be used for different purposes. In this chapter we look at two techniques,
additive synthesis and subtractive synthesis, because synthetic drum sounds
were traditionally made with these techniques.[22]
Additive synthesis
Additive synthesis is based on the idea that complex tones can be created by the summation, or
addition, of simpler ones.
Basically, additive synthesis starts from scratch and adds sinusoids together, until the desired
sound is achieved.
[20] Source: "Computer Music", pages 75-78 and 95-98.
[21] Source: "Computer Music", page 84.
[22] Sources: http://www.soundonsound.com/sos/apr02/articles/synthsecrets0402.asp and http://www.soundonsound.com/sos/feb02/articles/synthsecrets0202.asp
"Given enough oscillators, any set of spectral components can be synthesized,
and so virtually any sound can be generated."[23]
With additive synthesis you have good control over the sound, but in order to get a very complex
sound you will need many sound generators, which demands a lot of data.
In its most basic form, additive synthesis is the addition of two sinusoids to create a more
complex sound.
A sinusoid can be defined mathematically as:
x(t) = A·sin(2πft + Φ)
where A is the amplitude (always positive, by convention), f is the frequency, and Φ is the starting phase.
Example of adding two sinusoids with frequencies of 500 Hz and 328 Hz:
x(t) = sin(2π·500t + Φ) + sin(2π·328t + Φ)
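The example above translates directly into code. Below is our own Python sketch of the two-sinusoid sum (the 44,100 Hz sampling rate is just the CD-quality rate mentioned earlier, chosen for illustration).

```python
import math

def sinusoid(a, f, phi, t):
    """x(t) = A*sin(2*pi*f*t + phi)"""
    return a * math.sin(2 * math.pi * f * t + phi)

def added(t):
    """The sum of the 500 Hz and 328 Hz sinusoids from the example,
    both with amplitude 1 and starting phase 0."""
    return sinusoid(1.0, 500.0, 0.0, t) + sinusoid(1.0, 328.0, 0.0, t)

# Sample a short excerpt of the summed signal at 44,100 Hz.
samples = [added(n / 44100) for n in range(1024)]
```

Because the two amplitudes are 1, the summed signal never exceeds 2, and the result is a complex tone containing both frequencies.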
Subtractive synthesis
The opposite of additive synthesis is subtractive synthesis. Subtractive synthesis takes a signal
containing all frequencies as its starting point; such signals are called white noise.
From this signal, you filter out the unwanted frequencies, using filters like low-pass, high-pass,
or band-pass filters (see the chapter "Filters and Effects" on page 14 for details). With this
technique it is easier to create more complex sounds, but it is rather difficult to filter out unwanted
frequencies very precisely.
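The principle can be sketched crudely: start from white noise and remove high frequencies. The moving average below is our own stand-in for a proper low-pass filter (a deliberately simple assumption; real subtractive synthesis would use the filters described earlier).

```python
import random

random.seed(0)  # make this sketch reproducible

def white_noise(n_samples):
    """The starting point of subtractive synthesis: all frequencies."""
    return [random.uniform(-1.0, 1.0) for _ in range(n_samples)]

def moving_average(x, width):
    """A crude low-pass filter: averaging neighbouring samples removes
    the fast (high-frequency) variation from the noise."""
    return [sum(x[max(0, n - width + 1):n + 1]) / width
            for n in range(len(x))]

noise = white_noise(4096)
darker = moving_average(noise, 8)   # noise with the top removed
```

Averaging over 8 samples leaves mostly the slow variation, so the filtered noise sounds darker than the raw white noise.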
[23] Source: "Computer Music", page 88.
ANALYSIS AND DISCUSSION
Based on the theory chapter, we will now discuss the choices we make for the system we are
going to create. The areas we cover in this chapter are Percussive Sound Characteristics,
Short Analysis of DrumSynth 2, Electronic Sensor Interfacing, Mapping, and Real-time.
Percussive Sound Characteristics
In this chapter we look at the characteristics of a percussive sound, in order to get a better
understanding of the parameters we need to be aware of when detecting voice-to-drum
simulated sounds and creating our own synthetic drum sounds.
"There are two key elements in percussive sounds: the amplitude envelope shape and the
frequency content. The amplitude envelope usually has a sharp attack followed by a slow
exponential decay. The frequency content of the sound usually consists of non-integer
harmonics or noise, with little or no pitch. There are also often many frequencies at the beginning
of a sound fading into only a few frequencies at the end."[24]
Amplitude Envelope
As the quote above states, a key characteristic of a percussive sound is the rapid attack (see
page 21 for more on amplitude envelopes and attacks). Figure 18 below shows the amplitude
envelope of a typical snare drum (top) and of a typical bass drum (bottom), and supports the
statement about a rapid attack. Because of this, the attack is one of the key elements we have
chosen to use in our system when it comes to detecting drum sounds.
[24] Source: "Percussion Synthesis" by Stephen Dill - http://ccrma-www.stanford.edu/~sdill/220A-project/drums.html
Figure 18 - Snare drum (top) and bass drum (bottom), amplitude over time
Voice-to-drum Frequency Analysis
As we found out in "The Basics of Sound" chapter on page 9, a good way to distinguish between
sounds is by frequency. Furthermore, the quote on page 23 also states that frequency is a key
element of a drum sound. Because of this, we have chosen to look further into using frequency as
another key element in our system when it comes to detecting and distinguishing drum sounds.
The purpose of this chapter is to find the frequency bands where voice-to-drum simulated bass
drum sounds and voice-to-drum simulated snare drum sounds, respectively, typically have their
main amplitude peaks. We want to find out whether we can use these frequency bands to
distinguish between bass drum and snare drum sounds when we build our system. To find the
frequency bands, we use statistics.
We record a set of 18 bass drum sounds and 18 snare drum sounds, made by three different
people with two different microphones, a cheap Labtec microphone and a more expensive
Shure microphone, on a low-latency FireWire soundcard (see "Appendix 2 - Sound Hardware
Specifications" on page 89 for technical details on the hardware). We look at the frequency
spectrum of each sound and find the frequency value of its main amplitude peak. We then
calculate the mean frequency value and the standard deviation[25] based on these peaks, in order
to find the interesting frequency bands in a scientific way.
Bass Drum Frequency Calculation
Figure 19 shows the frequency spectrum of a typical voice-to-drum simulated bass drum sound.
It has a lot of activity in the lower end of the frequency scale, at around 100-400Hz, and not much
activity in the higher end of the scale.
Figure 19 - Frequency Spectrum for a voice-to-drum simulated bass drum sound
"Appendix 3 – Bass and Snare drum peak frequencies of recorded voice-to-drum samples" on
page 90 contains a table of the frequency values for the amplitude peak of each of the 18 bass
drum sounds we recorded. We use these values to find the mean and the standard deviation, in
order to get a more precise frequency band.
Mean = (sum of sampled frequencies) / n = 238
Where n is the number of sampled frequencies.
25 Calculations are based on Thomas Moeslund's slides about statistics:
http://www.cvmt.dk/education/teaching/e04/MED3/AP/ap17+18.ppt – pages 9-12
Variance = [(F_bd1 - F_mean)² + (F_bd2 - F_mean)² + … + (F_bdn - F_mean)²] / n = 5649.2222
Where F_bdx is the frequency peak value of the corresponding bass drum sound, F_mean is
the mean frequency value of all the sounds and n is the number of sampled frequency peaks.
Standard Deviation = σ = √Variance = 75.1613
To cover 95.44% of the sampled bass drum frequency peak values, we multiply the Standard
Deviation by 2:
2σ = 150.3226
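The mean, population variance and 2σ band above can be reproduced with a few lines of code. The sketch below uses hypothetical peak frequencies, not the measured values from Appendix 3:

```python
import math

def frequency_band(peaks):
    """Compute the mean, the population standard deviation, and the
    ±2σ band used in the report to cover ~95.44% of the sampled peaks."""
    n = len(peaks)
    mean = sum(peaks) / n
    variance = sum((f - mean) ** 2 for f in peaks) / n  # population variance
    sigma = math.sqrt(variance)
    return mean, sigma, (mean - 2 * sigma, mean + 2 * sigma)

# Hypothetical peak frequencies in Hz (not the report's recorded data)
mean, sigma, band = frequency_band([150, 200, 240, 260, 300])
print(mean, round(sigma, 1), band)
```

Note that this is the population variance (divide by n), matching the report's formula above, rather than the sample variance (divide by n - 1).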
Snare Drum Frequency Calculation
Figure 20 shows the frequency spectrum of a typical voice-to-drum simulated snare drum
sound. It has its main activity in the higher end of the frequency scale, at around 2000-4000Hz,
but also significant activity in the lower end of the scale, at around 100-400Hz.
Figure 20 - Frequency Spectrum for a voice-to-drum simulated snare drum sound
Since we have already covered the lower end of the frequency scale and calculated a frequency
band based on the sampled values, when we did the bass drum calculation, we are going to
concentrate the snare drum calculation on the higher end of the frequency scale.
"Appendix 3 – Bass and Snare drum peak frequencies of recorded voice-to-drum samples" on
page 90 contains the frequency values for the amplitude peak of each of the 18 recorded snare
drum sounds. These peaks are not necessarily the highest ones, overall, but they are the highest
ones in the higher end of the frequency scale. I.e. we disregard any peaks that occur in the lower
frequency range, at around 100-400Hz, since this range has already been covered.
We use the sampled values to find the mean and the standard deviation, in order to get a more
precise frequency band:
Mean = (sum of sampled frequencies) / n = 2847
Where n is the number of sampled frequencies.
Variance = [(F_sd1 - F_mean)² + (F_sd2 - F_mean)² + … + (F_sdn - F_mean)²] / n = 174916.6111
Where F_sdx is the frequency peak value of the corresponding snare drum sound, F_mean is
the mean frequency value of all the sounds and n is the number of sampled frequency peaks.
Standard Deviation = σ = √Variance = 418.2303
To cover 95.44% of the sampled snare drum frequency peak values, we multiply the Standard
Deviation by 2:
2σ = 836.4607
Conclusion of Percussive Sound Characteristics
In the first part of the chapter we concluded that an important element of a drum sound is a rapid
attack. This is something we need to keep in mind both when it comes to detecting voice-to-drum simulated sounds, and also when it comes to creating our own synthetic drum sounds.
Furthermore, we found out that frequency is another important part of a drum sound, so we
analyzed some recorded voice-to-drum simulated sounds. The results show that frequency is a
parameter we can use to distinguish between a voice-to-drum simulated bass drum sound and
a voice-to-drum simulated snare drum sound. We have found the following interesting frequency
bands:
- For the bass drum, we get a frequency band with a centre frequency equal to the mean value, 238 Hz, and a bandwidth of 2σ ≈ 150 Hz.
- For the snare drum, we get a frequency band with a centre frequency equal to the mean value, 2847 Hz, and a bandwidth of 2σ ≈ 836 Hz.
We will use these results as a basis for creating our system (see the chapter "Design &
Implementation" on page 37).
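As an illustration of how these bands could be used to classify a detected peak, here is a hedged sketch. The band edges come from the numbers above (centre ± half the 2σ bandwidth); the function itself is our own Python illustration, not part of the Max/MSP patch:

```python
def classify_peak(peak_hz):
    """Classify a detected amplitude-peak frequency as a bass drum,
    a snare drum, or neither, using the report's frequency bands:
    bass drum:  238 Hz ± 75 Hz  (half of the ≈150 Hz bandwidth),
    snare drum: 2847 Hz ± 418 Hz (half of the ≈836 Hz bandwidth)."""
    if 238 - 75 <= peak_hz <= 238 + 75:
        return "bass drum"
    if 2847 - 418 <= peak_hz <= 2847 + 418:
        return "snare drum"
    return "neither"

print(classify_peak(250))   # inside the bass band  → bass drum
print(classify_peak(3000))  # inside the snare band → snare drum
print(classify_peak(1000))  # between the two bands → neither
```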
Short Analysis of DrumSynth 2
In this chapter we take a look at how the program DrumSynth 2 creates synthetic drum sounds, in
order to get some inspiration as to how we can create our own sounds from scratch.
As mentioned in the introduction, we have chosen to synthesize drum sounds that resemble the
old analogue vintage drum machines. Such drum machines are still essential in modern electronic
music, and we are all fond of the sound from such machines.
We have been looking into different software drum synthesizers for inspiration, and we were
amazed by the sonic possibilities that DrumSynth 226 could offer with only a few parameters.
26 DrumSynth 2 is available from: http://www.hitsquad.com/smm/programs/DrumSynth/
DrumSynth 2 is a simple program that builds drum sounds by combining elements like swept-frequency sine waves, noise bands, noise, distortion, and overtones (see Figure 21).
Figure 21 - DrumSynth 2
The different elements each have a few controllable parameters, such as amplitude envelope and
frequency range. The elements are then combined in a mixer that controls the overall volume, and
the signal is then sent to the sound card.
Even though the controls of the different elements are limited, it is possible to create rather
complex sounds by combining them. This is an approach we intend to use when we create our
own synthetic drum sounds.
Electronic Sensor Interfacing27
By using sensor technology, we would like to provide an additional interface for user interaction,
in addition to the standard input devices found on a PC (keyboard and mouse). Originally, this
additional user interface was supposed to consist of a floor board, where the user could select
different application commands by pressing buttons on the floor board with the foot, and a foot
pedal, where the user could input a continuous range of values into the application, again by
using foot pressure. Finally, the microphone as a sensor device for auditory input would be
included as a part of our user interface extension.
Since a foot pedal can be understood simply as a mechanical construction that converts rocking
foot motion into rotary motion of a potentiometer28, and since we did not have the resources to
build a foot pedal, we estimated that it would be enough to simply use a potentiometer instead. In
addition, since a potentiometer can be directly connected to a power supply in order to produce
a voltage divider, no additional circuitry but the potentiometer itself is needed to provide a voltage
input to the system, and to simulate a foot pedal input.
A floor board, on the other hand, would electrically consist of push buttons. A single push button
can be interfaced directly to a microprocessor (the Teleo Intro Module), which
converts voltage into digital data (see the chapter "The Tactile Sensor" on page 45 for more
information on this device). However, this would use one of the four available inputs on the
microprocessor per button. Since we want to use six buttons in total, and we only have three
available inputs left (the potentiometer uses one of them), we need to build a voltage divider
switching network. This enables us to use six push buttons with only one input on the
microprocessor.
In this report, the electronic aspect is focused on this circuit, since it is the only one that we had to
build ourselves.
Since a microphone can be regarded as a standard device used in connection with PCs, the
electronic aspect of interfacing a microphone will not be discussed further here29.
27 This chapter is based on "Appendix 1 - Sensor Theory", see page 83
28 Source: http://www.geofex.com/Article_Folders/wahrocker/wahrocker.pdf
29 See the "Digital Representation of Sound" chapter on page 12 for a bit more info about microphones
Mapping
To successfully build a system that is able to "convert" human voice into drums, many factors
have to be taken into account: how do we detect the sound, and which sound should trigger
which synthesis? In our voice-to-synthesis mapping we have a one-to-one mapping; a voice input
with a short attack and frequencies in the lower range triggers the bass drum, while a voice input
with a short attack and high frequencies triggers the snare drum.
When mapping both the sensor and the microphone to the instrument we have built, it is important
that some sort of consistency and logic is maintained. An example could be: when the user changes
the instrument by pushing the left instrument button (a button on a controller), the menu displays
this action too, and the GUI30 button is placed onscreen similarly to the layout of the tactile sensor.
To get a better idea of what the mapping should be like, we have looked at some examples
made by other researchers. One of the projects we have looked at is by Andy Hunt and
Marcelo M. Wanderley, who give an example of what the mapping in a MIDI wind controller
may look like, and how it may be altered to change the experience both of playing the
instrument and of listening to it. The mapping is illustrated in Figure 22, below.
30 Graphical User Interface
Figure 2231 - An example of altered mapping for a MIDI wind controller.
By using different mappings between the actual instrument and the sound synthesis engine like in
the wind controller example, it is possible to alter the playing experience, while only changing the
mapping.
Another example of mapping has been carried out by Andy Hunt and Marcelo M. Wanderley, but
in this case the instrument interface was not a simulation of a real instrument; only a computer
with a mouse and a controller with sliders was used32.
The test was set up in different configurations, again with the mapping altered. In Figure 23 a
one-to-one configuration was used, where each slider on the screen manipulated a different
parameter in the sound synthesis. Only one parameter could be controlled at a time, using the
mouse.
31 Image taken from: "Mapping performer parameters to synthesis engines" by Andy Hunt and Marcelo M. Wanderley.
32 Source: "The importance of parameter mapping in electronic instrument design" by Hunt, Wanderley and Paradis.
Figure 23 - One-to-one
The second configuration also had 4 sliders on the screen, but instead of controlling them with the
mouse, they were controlled by a control board with sliders. The main difference in this
configuration was that the user had to keep moving one of the sliders to produce a sound,
somewhat like using a bow on a violin.
Figure 24 - One-to-one with sliders
The last configuration combined the slider interface and the mouse interface into one. The mouse
had to be moved in order to produce sound, while sound parameters could be changed both by
the sliders and by the position of the mouse.
Figure 25 - The multiparametric interface
Andy Hunt and Marcelo M. Wanderley write that the last configuration seemed frustrating at first,
but most people grew fond of it after a while, because the interface, with only two sliders and
the mouse controlling several parameters, felt more like an actual instrument.
Based on this we can say that the way we build our own mapping is essential to the user
experience and to how much users will like the system. We have to be aware that even a slight
difference in mapping can change the way the user interacts with the system.
Real-time
In order to make an instrument like ours useful in a musical context, the system needs to be fast
enough so that the user does not perceive the actual latency.
In a system like ours, there will always be latency because of e.g. the data throughput of the
computer, but the latency can be reduced through efficient coding and better hardware, such as
e.g. a sound card with a low-latency driver. Also the distance from the speakers to the user
creates latency. In air, sound travels at a speed of approximately 345 m/s33. This means that for
each meter the speakers are located from the user, there will be an extra latency of:
1m * (1000ms / 345m) ≈ 2.9ms
In relation to this it is important to find out how much latency the human ear accepts, without
sensing it as a disturbance in a musical context.
33 Source: "Computer Music", page 289
Nelson Posse Lago has researched how much latency is acceptable between a user action and
the corresponding reaction in music applications. He concludes:
"… up to 20-30ms, are pretty much acceptable for most multimedia and music applications"34
Knowing that if the speakers are positioned 3-4m away there will be an extra latency of
approximately 10ms, our system should not have a latency of more than 20ms from the input (the
microphone) to the output (the speakers), if we want it to be perceived as real-time.
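The acoustic part of this latency budget is simple to compute. A small sketch, using the 345 m/s speed of sound assumed in the report, and taking Lago's 30 ms upper bound as the total budget:

```python
SPEED_OF_SOUND_M_PER_S = 345.0  # speed of sound in air, as used in the report

def acoustic_latency_ms(distance_m):
    """Latency added by the sound travelling from the speakers to the listener."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0

def remaining_budget_ms(distance_m, total_budget_ms=30.0):
    """How much of the perceptual latency budget is left for the system
    itself, after subtracting the acoustic travel time. The 30 ms default
    is the upper bound quoted from Lago."""
    return total_budget_ms - acoustic_latency_ms(distance_m)

print(round(acoustic_latency_ms(1.0), 1))   # ≈ 2.9 ms per metre
print(round(remaining_budget_ms(3.5), 1))   # speakers 3.5 m away → ≈ 19.9 ms left
```

This matches the reasoning above: with speakers 3-4 m away, roughly 20 ms remains for the chain from microphone input to synthesized output.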
34 Source: "Distributed Real-Time Audio Processing" by Nelson Posse Lago, page 7 -
http://gsd.ime.usp.br/~lago/masters/extended_abstract.pdf
DESIGN & IMPLEMENTATION
In this chapter we describe the design and implementation of the actual system we have
created. We start with a general description and an overview of the system, and then go
into a more detailed description of each element.
Short Description of the System
We have named our system "Funkmeister7" (see Figure 26). It is basically a digital, percussive
instrument that you can play using your voice.
Figure 26 - Funkmeister7 Screenshot
The system takes voice-to-drum simulated sounds as input through a microphone, analyzes the
sound, and determines whether it is a snare drum or a bass drum (or neither). When it detects
one of these drum types, the system will play back a synthesized version of the same kind of
drum type. In addition to this, there is a tactile sensor/controller with six buttons and a turning
knob (see Figure 28 on page 39) that allows the user to control certain parameters and effects of
the synthesized output sound, through a menu. The menu is divided into two levels: a main menu
which allows the user to switch between the available types of instruments, and a sub-menu
where the options change based on which instrument the user has chosen in the main menu.
In the sub-menu the parameters of the chosen instrument can be altered.
System overview
Figure 27 below shows the overall system structure.
Figure 27 – System Structure
The hardware part of our system consists of a microphone and two interaction input devices:
a board interface and a potentiometer, and a Teleo Intro Module35 as a microprocessor that
acquires and converts the data from the input devices (see Figure 28 below). The board interface,
the potentiometer and the microprocessor make up what we call "The Tactile Sensor".
Figure 28 – The Tactile Sensor
The microprocessor is connected via USB to a PC with a soundcard, speakers and a monitor. The
microphone is connected to this PC via the soundcard.
The software part of our system is implemented in Max/MSP36. It is based on a hierarchical
structure consisting of 12 patches (see Figure 29).
35 Teleo Intro Module from www.makingthings.com
36 Max/MSP is available from: http://www.cycling74.com/products/maxmsp.html
Figure 29: Overview of the patches and sub-patches in the software part of our system
At the top of the hierarchy is the patch called "FUNKMEISTER7". This patch is the graphical
presentation (GUI) of our system, and from here you control the different effects, volume, etc.
The "FUNKMEISTER7" patch has five sub-patches, which are: "BOARD", "LOOP", "SYNTH",
"MENU", and "MIC_INPUT".
The "BOARD" patch is where we receive the input from the tactile sensor that we have created. It
has a sub-patch called "RESET", which is used to reset the entire system.
The "LOOP" patch is a sample player with four different musical loops that the user can play while
using the instrument.
The "SYNTH" patch is the drum synthesizer; it is in this patch that we connect the drum sounds
with the different effects. "SYNTH" has four sub-patches: "BD_SYNTH", "SD_SYNTH", "DELAY",
and "REVERB". "BD_SYNTH" and "SD_SYNTH" are where the bass drum and the snare drum,
respectively, are synthesized. The "DELAY" patch is where the delay is created. The "REVERB"
patch is where you control the reverb. It has a sub-patch called "reverb", and it is in this patch
that the actual reverb is made (see "Sound Synthesis" on page 58 for details).
In the "MENU" patch we connect the input from the tactile sensor to the rest of the system.
The "MIC_INPUT" patch is where the analysis of the sound input from the microphone takes
place. It detects whether the input is a voice-to-drum simulated bass drum sound, a snare drum
sound, or neither (see "Sound Input Analysis" on page 51 for details).
We will now go through how we have mapped the input of our system to the output, and then
describe the design and implementation of the three parts of the system: Tactile Sensor, Sound
Input Analysis, and Sound Synthesis.
Mapping
In our mapping there are two steps where data is mapped to match either the graphical options
or the sound creation parameters.
Figure 30 - The steps in mapping sensor input to the sound
output and the dynamic user interface
The first step in the mapping takes place after the conversion of physical data. E.g. the pressure of
a hand on a button, or sound waves in the air, is mapped to a normalized data stream that
can be read by the computer and the application. This step is where the input data is mapped to
numbers between 0 and 100 (see "Design & Implementation" on page 41). Both the sound
synthesis parameters and the GUI need these values in order to create the sound and display
what is currently going on.
The second step in the mapping is where the parameters are mapped to actual parts of the
synthesis and objects in the GUI.
Mapping the microphone
In our application one of the main interfaces is the microphone, which we use to map the voice of
the user to specific instruments. This mapping can be illustrated as in Figure 31.
Figure 31 - The mapping of frequency and amplitude from the
microphone.
The illustration (Figure 31) shows that there is a one-to-one mapping between frequency
and which instrument to play, and also a one-to-one mapping between amplitude and detection.
Actually, this mapping can be combined into the next illustration (Figure 32), since frequency
and amplitude together, and not individually, determine the instrument to be played. Now several
parameters control which instrument is played, giving the system a many-to-one relation.
Figure 32 - The combination of frequency and amplitude
Mapping the Tactile Sensor
The tactile sensor is our second sensor. A part of this is the control board. It is used to manipulate
several parameters of the synthesis engine. In this sensor the mapping is not determining what
sound to play, but rather what the sound should sound like. The mapping in this part of the
system is more "function"-based, where different buttons are mapped to specific parameters in
the sound engine.
To give a better picture of how the different buttons and functions are mapped together, the
illustration below (Figure 33) has been made.
Figure 33 - The mapping of the buttons to the parameters in the patch.
The illustration above shows how everything is linked together to produce the augmented
soundscape.
The two buttons to change instruments have a one-to-many relationship, since several of the
same instruments are controlled by these two buttons. The change effect buttons have the same
relationship because many parameters and effects can be changed by only using these buttons.
The reset button has a one-to-one relationship in the system, one button for one control
(resetting) when using the system.
The 'apply' button has a one-to-many relationship, because it is possible to apply both the values
of the different synthesis parameters and effects, and also to start and stop the loop player,
which can run in the background. In the mapping illustration (Figure 33) there are no arrows
going out from the 'apply' button and the 'reset' button; this is on purpose, since both buttons
are connected to almost everything and the arrows would clutter up the illustration. 'Apply' is
also connected to the potentiometer, which would clutter the illustration further.
A part of the tactile sensor is the potentiometer. Although this sensor is connected to the
microprocessor together with the voltage divider switching network, it is still considered a
stand-alone sensor. The potentiometer also works in a different way, since it supplies a continuous
range of numbers, instead of the 7 fixed values from the control board. The mapping for the
potentiometer is illustrated in Figure 34.
Figure 34 - Mapping of the potentiometer
The mapping of the potentiometer is a bit different, since the potentiometer is mapped to many
values, depending on the range of the resistance in the potentiometer. The potentiometer
produces a continuous range of numbers, but after processing by the Teleo Intro Module, the
maximum range is 1024 different numbers.
The potentiometer has a one-to-many relationship since it controls the level of volume, pitch,
distortion, delay and reverberation. It uses a continuous range of values to control this, but with
the sampling resolution set to 5 bits, we have 31 different values, which are then scaled to 0-100
to make for a more intuitive control of e.g. volume.
The Tactile Sensor37
What we are building is two sensors: a control board with six buttons measuring pressure at a
very low level (a voltage divider switching network) and a turning knob measuring motion
(a potentiometer). Both sensors convert, via the microprocessor, the pressure and motion into
different voltages. When the user of the system presses a button, it is determined whether the
button is on or off, and different voltages are produced. The different voltages are produced by
changing the amount of resistance in the circuit depending on which button is pressed. We also
use a potentiometer that supplies the system with a continuous range of values, which can be
used to adjust functions inside the application that require finer steps. The tactile sensor
chapter consists of two main parts: transducing pressure and motion into voltage, and formatting
the voltage into a digital stream. The voltage divider and potentiometer can be thought of as a
transducer and the microprocessor as a formatter.
Figure 35 - Overview of the tactile sensor: a control board (voltage divider switching network),
a turning knob (potentiometer) and a microprocessor (Teleo Intro Module)
37 This section is mainly based on the slides provided by our teacher and supervisor Smilen Dimitrov, unless otherwise noted.
Link: http://www.smilen.net/SensorTechMED4/
Transducing Motion and Pressure
The potentiometer transduces motion, but we will not cover this any further than already
described in "Appendix 1 - Sensor Theory" on page 84. Instead we will focus on the control
board, which works by simply dividing voltage. The Teleo module supplies a 5V output
as a power supply, and it is this voltage that is being divided. When working with small voltages
it is essential that the system is able to tell differences in voltage apart; i.e. if pressing a button
results in only a 0.003V difference, the change might be hard to detect.
To accommodate this, we have made sure that the voltage is spread out as much as
possible in the 0V to 5V range, by using 4kOhm resistors between the different steps and a
10kOhm resistor as the default resistor.
When the user presses a button, the voltage divider circuit is cut off at different places, thereby
giving different measurements, as shown in the illustrations below. The circuit contains 7
resistors, and when no button is pressed the current runs through all of them. When a
button is pressed, a certain amount of resistance is taken out of the circuit, resulting
in a higher measured output voltage. If no button in the circuit is pressed, there will be a total
resistance of 34kOhms, producing a 1.466V output. For each button "lower" in the circuit that is
pressed, the resistance is decreased by 4kOhms. If the button at the "bottom" (button 1) is
pressed, only the default resistor (10kOhms) will be left in the circuit, providing a 4.992V output
(see Figure 36, Figure 37 and Figure 38).
Figure 36 - no button is pressed
Figure 37 – button 2 pressed
Figure 38 - button 3 is pressed
A listing of the values for the different buttons pressed is shown in Figure 39.

The relative distribution schema:

State      | kΩ (out/total) | Spread | Volts
-----------|----------------|--------|-------
Standby    | 10 / 34        | 0.294  | 1.466V
1. button  | 10 / 10        | 1.000  | 4.992V
2. button  | 10 / 14        | 0.714  | 3.569V
3. button  | 10 / 18        | 0.556  | 2.786V
4. button  | 10 / 22        | 0.455  | 2.266V
5. button  | 10 / 26        | 0.385  | 1.917V
6. button  | 10 / 30        | 0.333  | 1.663V

Figure 39: Voltage values from the voltage divider
Formatting the Transduced Data
Teleo is a tool developed for building system prototypes very fast (see Figure 40). Without the
need to understand low-level programming, or even to be an expert in electronics, it is possible
to build systems that really work, and to make them work in real-time.
Figure 40 - Illustration of the Teleo module
As the formatter in our interface we have used the Teleo Intro Module as an easy-to-use
translator, turning the analogue data into data the computer understands. The Teleo module
translates the analogue data into a binary stream that reaches the computer through an extension
installed in Max/MSP. This gives us the possibility to modify the numerical representation of the
input voltage, which, from an input voltage in the range 0-5V, can be sampled with a maximum
of 10-bit resolution and represented as a number from 0 to 2^10 - 1.
The Teleo module has 4 analogue inputs that can all be addressed simultaneously and with
different kinds of sensors. Our control board has been connected to channel 0 and the
potentiometer is connected to channel 2. In this way we are able to capture the values from each
sensor individually.
The Tactile Sensor – Software (Max/MSP)
To make use of the values being sent from the Teleo module, some kind of "logic" has to be built.
The "board" patcher in the application is where the numbers are received and the decision of
what is going to happen is made. The "board" patcher works together with the "menu" patcher in
order to make the user interface look nice, and the variables for the changeable parameters of the
drum sounds are also saved in the "menu" patcher.
Figure 41 - Data acquisition in Max/MSP
After passing through the sensor, the voltage values end up in Max/MSP, displayed as
values on a scale from -100 to +100. These values are then scaled to positive values only, which
gives us the values in Figure 42. The "t.intro.ain" object is the element in Max/MSP that makes
the connection to the Teleo module and returns the values. The object has 4 options:
1st = Sample period (1 to 1000 ms)
2nd = Minimum value of the output (-100 to 100)
3rd = Maximum value of the output (-100 to 100)
4th = Resolution in bits (1 to 10)
The output with the values we have selected is a number between 0 and 100. The object is
receiving a value from the sensor every 250ms with a resolution of 5bits (which gives us 31
different possible values38).
There are two of the above-mentioned t.intro.ain objects in the patch. The second channel on the
module is used by the potentiometer, which gives us the possibility to change e.g. the
volume of the output sound more smoothly. Both blocks are almost identical and do almost
the same thing; the only difference is that a "scale" object has been inserted after the block that
receives the potentiometer value. Because of resistance in the circuit it is not possible to get a
perfect 0 and a perfect 100 (rather 14-76), so the scale object is there to make sure the values
are always between 0 and 100.
The relative distribution schema:

State      | kΩ (out/total) | Spread | Volts  | Max/MSP value
-----------|----------------|--------|--------|--------------
Standby    | 10 / 34        | 0.294  | 1.466V | 29.03
1. button  | 10 / 10        | 1.000  | 4.992V | 100
2. button  | 10 / 14        | 0.714  | 3.569V | 70.97
3. button  | 10 / 18        | 0.556  | 2.786V | 54.84
4. button  | 10 / 22        | 0.455  | 2.266V | 45.16
5. button  | 10 / 26        | 0.385  | 1.917V | 35.48
6. button  | 10 / 30        | 0.333  | 1.663V | 32.26

Figure 42: The voltage values including the scaled Max/MSP values
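The job of the "scale" object on the potentiometer channel can be illustrated with a plain linear rescale. This is a sketch assuming the 14-76 raw range mentioned above; the clamping of out-of-range readings is our own addition, not a documented behaviour of the Max/MSP object:

```python
def rescale(value, in_min=14.0, in_max=76.0, out_min=0.0, out_max=100.0):
    """Linearly map a raw potentiometer reading (roughly 14-76 in our
    setup) onto the 0-100 range the rest of the patch expects,
    clamping readings that fall outside the raw range."""
    value = max(in_min, min(in_max, value))
    return (value - in_min) / (in_max - in_min) * (out_max - out_min) + out_min

print(rescale(14))   # lowest raw reading  → 0.0
print(rescale(76))   # highest raw reading → 100.0
print(rescale(45))   # midpoint            → 50.0
```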
38 Source: "Teleo Starter Kit User Guide" - http://makingthings.com/products/documentation/teleo_intro_user_guide/index.html
The next important section of code is the part where it is actually determined which button was
pressed and which function to call. This is achieved with six if-statements, each determining
whether the received value is within a certain threshold (e.g. between 99 and 101 to activate
button 1) or not (see Figure 43).
The thresholds are:
Button 1 = 99 to 101
Button 2 = 69 to 72
Button 3 = 53 to 56
Button 4 = 43 to 46
Button 5 = 34 to 37
Button 6 = 31 to 33
Figure 43 - Thresholds
Each threshold spans a small range of values, so that a reading still matches the intended button even if the tactile sensor produces slightly different numbers from one press to the next.
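The six threshold tests amount to a window lookup; a minimal Python sketch using the windows from Figure 43 (not the actual Max/MSP implementation):

```python
# Threshold windows from Figure 43, inclusive on both ends.
THRESHOLDS = {
    1: (99, 101),
    2: (69, 72),
    3: (53, 56),
    4: (43, 46),
    5: (34, 37),
    6: (31, 33),
}

def detect_button(value):
    """Return the button whose window contains the sensor value,
    or None if the reading matches no button (e.g. standby ~29)."""
    for button, (lo, hi) in THRESHOLDS.items():
        if lo <= value <= hi:
            return button
    return None

print(detect_button(100))  # -> 1 (button 1 pressed)
print(detect_button(29))   # -> None (standby value)
```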
This part of the patch also allows the user to operate the system manually with the mouse. The "bt1_click" to "bt6_click" receivers take bangs from the front panel (user interface) and thereby simulate actual values being received by the system.
The last part of this patch is the small "control panel" (see Figure 44), which was made during the construction of the patch for simpler control without the sensor. It is still important because some of its elements are still used to control certain parameters, but in a future version it could be removed from the patch.
Figure 44 - The control panel
Sound Input Analysis
Introduction to Sound Input Analysis
The purpose of this part of the system is to analyse the incoming sound from the microphone, detect when the attack of a drum-type sound occurs, and then determine whether the user is trying to simulate a snare drum ("SD") or a bass drum ("BD").
Figure 45: Sound Input Analysis Diagram
The Sound Input Analysis consists of four main parts (see Figure 45):
- The Input, where the analogue sound input is converted into digital sound through the microphone and soundcard
- The Attack Detection, where the sound input is analysed based on amplitude
- The Drum-type Detection, where the sound input is analysed based on frequency
- The Drum Trigger, where the two detection parts are synchronized
We have already covered how a microphone and analogue-to-digital conversion work in the "Digital Representation of Sound" chapter on page 12, so we will not cover that any further in this chapter. Instead we will take a closer look at the other three parts of the Sound Input Analysis.
Attack Detection
The attack detection part of the analysis patch revolves around the "Bonk~" object39. As we found out in the "Percussive Sound Characteristics" chapter on page 23, a typical percussive sound has a rapid attack, which makes the "Bonk~" object very suitable for this task. Bonk~ is an extension to Max/MSP that tracks relative changes in amplitude over time and detects percussive attacks based on two adjustable threshold values, "hithresh" and "lothresh".
If the amplitude of the incoming signal rises above the "hithresh" value within one analysis interval, and then drops below the "lothresh" value, Bonk~ will detect an attack and send out a "bang" command (the general Max/MSP term for "execute"). In addition to these thresholds, you can also set a minimum velocity value ("minvel"), which simply ignores signals with a lower amplitude than the value it is set to, no matter what the relative amplitude change is.40
39 Bonk~ is available from: http://www-crca.ucsd.edu/~tapel/software.html
40 Source: "Real-time audio analysis tools for Pd and MSP" by Puckette, Apel and Zicarelli: http://www-crca.ucsd.edu/~tapel/icmc98.pdf
Figure 46: Attack Detection
Figure 46 shows the attack detection part of the Sound Input Analysis patch. "lothresh" has been set to a value of 6, "hithresh" to a value of 75, and "minvel" to a value of 25. The main method of tweaking these values has been "training by ear": we simply fed the Bonk~ object a lot of percussive sounds, both in the form of pre-recorded samples and live input from a microphone (voice-to-drum simulation, banging on the table, etc.), and then monitored when Bonk~ reported an attack. This way we found the best compromise between detecting the necessary attacks and keeping the system from being too sensitive, e.g. detecting more than one attack in a single percussive sound.
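A toy attack detector loosely modelled on these thresholds can be sketched in Python. This is emphatically not the Bonk~ algorithm (which analyses spectral frames); it only illustrates the hithresh/lothresh/minvel idea, with `frames` as per-interval amplitude values in arbitrary units:

```python
def detect_attacks(frames, hithresh=75, lothresh=6, minvel=25):
    """Report an attack when the frame-to-frame rise exceeds
    hithresh; re-arm once the change falls back below lothresh.
    Frames quieter than minvel are ignored outright."""
    attacks = []
    armed = True
    prev = 0.0
    for i, amp in enumerate(frames):
        growth = amp - prev
        if armed and growth > hithresh and amp >= minvel:
            attacks.append(i)   # report a "bang" at this frame
            armed = False       # wait for the signal to settle
        elif growth < lothresh:
            armed = True
        prev = amp
    return attacks

# One sharp hit; the later bump rises too slowly to count.
print(detect_attacks([0, 2, 90, 80, 10, 5, 20, 22]))  # -> [2]
```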
Drum-type Detection
As we found out in the "Voice-to-drum Frequency Analysis" section on page 24, there are two frequency intervals that are relevant to look at when deciding whether a human is trying to simulate the sound of a snare drum or a bass drum: the lower end of the frequency scale, around 150-300 Hz, and the higher end, around 2500-3500 Hz. The bass drum has a lot of activity in the lower end of the frequency scale, while the snare drum has a lot of activity in the higher end, but also significant activity in the lower end.
Figure 47: Two parallel band-pass filters
Because of this, we have chosen to set up two parallel band-pass filters (see page 15 for more information about band-pass filters): one for the lower end of the spectrum, where a voice-to-drum simulated bass drum sound has its main activity, and one for the higher end, where a voice-to-drum simulated snare drum sound has its main activity (see Figure 47). The idea is to monitor the strength of these two filtered signals, compare them, and then make a set of rules based on the two values that determines whether a sound is a simulated bass drum or a simulated snare drum.
The first band-pass filter has a centre frequency of 238Hz and a bandwidth of approximately 150Hz; the second filter has a centre frequency of approximately 2847Hz and a bandwidth of approximately 836Hz, corresponding to the values we extracted from our prior analysis (the "Voice-to-drum Frequency Analysis" section on page 24).
Figure 48: Drum type detection
As shown in Figure 48, the two signals go into two separate "peakamp" objects, which monitor the peak amplitude of each signal in 5 ms intervals. The "peakamp" value for the lower band-passed signal belongs to the "BD Detector" (bass drum detector), and the value for the higher band-passed signal belongs to the "SD Detector" (snare drum detector). These two "peakamp" values are then compared in two sets of rules. If the first rule is true, the "BD Detector" will detect a bass drum and output a "1"; if not, it will output a "2". The "SD Detector" works in the same way, giving an output of either "1" or "2".
- The BD Detector rule:
If $f1 > 0.2 && $f1 > $f2 * 1.2 then set 1 else set 2
Where $f1 is the peakamp value for the lower band-passed signal (Low Peakamp) and $f2 is the
peakamp value for the higher band-passed signal (High Peakamp).
Explanation: If the Low Peakamp value is above 0.2 (in order to filter out noise and low signals)
and if it is more than 1.2 times higher than the High Peakamp value, then it will report the sound
as being a bass drum.
As we deduced in the "Voice-to-drum Frequency Analysis" section on page 24,
voice-to-drum simulated bass drum sounds have their main activity in the low-frequency area.
The rule above ensures that only signals with more activity in the low-frequency area, compared
to the high-frequency area, will be reported as being a bass drum.
- The SD Detector rule:
If $f2 > 0.2 && $f2 > $f1 * 1 then set 1 else set 2
Where $f1 is the peakamp value for the lower band-passed signal (Low Peakamp) and $f2 is the
peakamp value for the higher band-passed signal (High Peakamp).
Explanation: If the High Peakamp value is above 0.2 (in order to filter out noise and low signals)
and if it is higher than the Low Peakamp value, then it will report the sound as being a snare
drum.
As we deduced in the "Voice-to-drum Frequency Analysis" section on page 24,
voice-to-drum simulated snare drum sounds have their main activity in the high-frequency area.
The rule above ensures that only signals with more activity in the high-frequency area, compared
to the low-frequency area, will be reported as being a snare drum.
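The two rules above can be written directly as functions, with $f1 as the Low Peakamp and $f2 as the High Peakamp (a straightforward Python transcription of the patch rules, for illustration):

```python
def bd_detector(low_peak, high_peak):
    """BD rule: 1 = bass drum detected, 2 = not a bass drum."""
    return 1 if low_peak > 0.2 and low_peak > high_peak * 1.2 else 2

def sd_detector(low_peak, high_peak):
    """SD rule: 1 = snare drum detected, 2 = not a snare drum."""
    return 1 if high_peak > 0.2 and high_peak > low_peak * 1.0 else 2

# A boomy, low-heavy sound: bass drum yes, snare no.
print(bd_detector(0.8, 0.3), sd_detector(0.8, 0.3))  # -> 1 2
# A bright sound with strong highs: snare yes, bass no.
print(bd_detector(0.3, 0.7), sd_detector(0.3, 0.7))  # -> 2 1
```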
Drum Triggering
This part of the patch is divided into two drum triggers, one for the bass drum and one for the
snare drum (See Figure 49). This is where the signals from the two detection parts are joined and
synchronized, to ensure that e.g. a bass drum sound is only triggered if a sound has a rapid
attack AND if the frequency of the sound is in the bass drum band.
Figure 49: Drum Triggers
The bass drum trigger listens to the output of the BD Detector, and the snare drum trigger listens to the output of the SD Detector, i.e. both will receive either a "1" or a "2".
At the same time, both triggers also listen to the output of the Attack Detector, which outputs a "bang" command whenever it detects a percussive attack.
The drum triggers then compare the two kinds of input they receive: if they get a "bang" command from the Attack Detector and at the same time receive a "1" from their Drum Detector, they will output a "hit".
In other words, for a "hit" to be reported, a Drum Trigger has to receive both a "1" from the BD or SD detector respectively AND a "bang" from the Attack Detector at the same time. The BD Trigger will then send a "bdhit" command and the SD Trigger an "sdhit" command, which are used by the Sound Synthesis part of the system.
In order to properly synchronise the two inputs the Drum Triggers receive, the "bang" from the Attack Detector has been delayed by 15ms for the BD Trigger and by 30ms for the SD Trigger. This compensates for the calculation time of the BD and SD Detectors, and for the fact that the main frequency content of a percussive sound occurs after the attack.
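The triggering logic itself is a simple conjunction; a minimal Python sketch (illustrative only, with the delays noted in a comment rather than modelled):

```python
def drum_trigger(detector_value, bang):
    """Fire a hit only when an attack bang and a '1' from the
    matching drum detector arrive at the same time.  In the patch
    the bang is delayed (15 ms for BD, 30 ms for SD) so that the
    two inputs line up."""
    return bool(bang) and detector_value == 1

print(drum_trigger(1, True))   # -> True: attack + matching band = hit
print(drum_trigger(2, True))   # -> False: attack but wrong band
print(drum_trigger(1, False))  # -> False: band matches but no attack
```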
Sound Synthesis
Introduction to Sound Synthesis
This part of the system creates the actual output sound, based on the information from the previous processes. The result of the previous part, Sound Input Analysis, is in the end a "bang", which triggers the sound synthesis engine of either a bass drum or a snare drum. We have created two different drum synthesizers: one for the bass drum and one for the snare drum.
The other input used in this part of the system is the set of values from the tactile sensor. The buttons produce unique integer values used to drive the menu system, as explained earlier; these values are used to "bang" the corresponding parts of the system. The potentiometer produces integer values between 0 and 100, which are used to set the values of the different parts of the sound synthesis system.
Since the tactile sensor and the menu system have been explained in the "Mapping" chapter on page 41, we will here only refer to the results of the previous processes, which are "bangs" and integer values between 0 and 100.
The structure of both the bass- and snare drum synthesizer is inspired by the structure of a
freeware software drum synthesizer called Drumsynth 2.0. DrumSynth creates synthetic drum
sounds from a combination of swept-frequency sine waves, noise, complex waveforms, and
noise with band-pass filters. It can reproduce sounds from classic analogue drum machines or
make new drum sounds (see "Short Analysis of DrumSynth 2" on page 29).
The sound from our two drum synthesizers can be manipulated with the tactile sensor, which can both control some of the parameters of the two synthesizers and add sound effects such as reverberation and delay to the snare drum synthesizer. When the drum sound is generated and
the effects are added, the sound is passed on to the soundcard.
The sound synthesis part of the system contains the following four major blocks: The Bass Drum
Synthesizer, The Snare Drum Synthesizer, The Sound Effects, and The Master Section (see Figure
50).
Figure 50: Overview of sound synthesis blocks
Description of the Sound Synthesis Blocks
The Bass Drum Synthesizer
The bass drum synthesizer consists of a sinusoidal sweep that interpolates linearly between
frequency values from 250Hz to 80Hz in a period of 110ms.
The amplitude of the bass drum is controlled by the following envelope (see Figure 51):
Attack: from 0 to 1 amplitude in 1ms
Decay: from 1 to 0.3 amplitude in 120ms
Release: from 0.4 to 0 amplitude in 120ms
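As a rough illustration (not the Max/MSP implementation), the sweep plus envelope can be sketched in Python. The 44100 Hz sample rate is an assumption, and the envelope is approximated with breakpoints (0 ms, 0), (1 ms, 1), (121 ms, 0.3), (241 ms, 0), smoothing over the 0.3/0.4 step in the listed values:

```python
import math

SR = 44100  # assumed sample rate

def linseg(points, n):
    """Sample a piecewise-linear curve through (time_ms, value)
    breakpoints at n evenly spaced times."""
    total = points[-1][0]
    out = []
    for i in range(n):
        t = total * i / (n - 1)
        for (t0, v0), (t1, v1) in zip(points, points[1:]):
            if t0 <= t <= t1:
                out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
                break
    return out

def bass_drum(f_start=250.0, f_end=80.0, sweep_ms=110.0):
    """A sine whose frequency ramps linearly from 250 Hz to 80 Hz
    over 110 ms, shaped by the approximated envelope."""
    env_points = [(0, 0.0), (1, 1.0), (121, 0.3), (241, 0.0)]
    n = int(SR * env_points[-1][0] / 1000)
    env = linseg(env_points, n)
    samples, phase = [], 0.0
    for i in range(n):
        t_ms = 1000.0 * i / SR
        frac = min(t_ms / sweep_ms, 1.0)   # linear frequency interpolation
        freq = f_start + (f_end - f_start) * frac
        phase += 2 * math.pi * freq / SR   # phase accumulation
        samples.append(env[i] * math.sin(phase))
    return samples

snd = bass_drum()
print(len(snd))  # -> 10628 samples for the 241 ms sound
```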
Figure 51: Amplitude envelope for bass drum synthesizer
Figure 52: Max/MSP patch for the Bass Drum
The Snare Drum Synthesizer
The snare drum synthesizer is a bit more complex than the bass drum synthesizer, since it
consists of more elements. The snare drum sound is created by adding four different elements
together: a sinusoidal frequency sweep, overtones, white noise, and band-passed white noise.
Each element has its own amplitude envelope in order to achieve a more natural sound with
independent temporal evolution41.
All elements are "banged" at the same time, and the output of each element is assembled in a small mixer that controls the amount of each element in the final snare drum sound (see Figure 53).
Figure 53: The elements of the snare drum synthesizer
41 Source: "Computer Music", page 88.
- Walkthrough of snare drum synthesizer elements
Frequency sweep
The sinusoidal frequency sweep interpolates linearly between frequency values from 454Hz to
250Hz in a period of 25ms.
The amplitude of the sinusoidal frequency sweep is controlled by the following envelope:
Attack: from 0 to 1 amplitude in 0ms
Decay: from 1 to 0.5 amplitude in 25ms
Sustain: from 0.5 to 0.2 amplitude in 45ms
Release: from 0.2 to 0 amplitude in 110ms
Overtones
The overtones are created by adding two sinusoids with frequencies of 500 Hz and 328 Hz (see "Synthesis techniques" on page 21).
We use an independent amplitude envelope for the overtone element in order to make the final snare drum sound more natural than if the same amplitude envelope were used for every element.
The amplitude of the overtone element is controlled by the following envelope:
Attack: from 0 to 1 amplitude in 0ms
Decay: from 1 to 0.2 amplitude in 16ms
Release: from 0.2 to 0 amplitude in 34ms
White noise
The white noise enhances the frequency spectrum of the snare drum.
The amplitude of the white noise is controlled by the following envelope:
Attack: from 0 to 1 amplitude in 0ms
Decay: from 1 to 0.3 amplitude in 25ms
Release: from 0.3 to 0 amplitude in 150ms
White noise with band-pass filter
This element is used to give the snare drum a more punchy sound. The punch can be controlled by altering the centre frequency, bandwidth or gain of the filter. The bandwidth is described through Q, which corresponds to the centre frequency divided by the bandwidth. The gain is set relatively high, in order to emphasise the effect of this element.
The parameters of the band-pass filter have the following values:
Centre frequency: 5500Hz
Q: 25
Gain: 50
The band-pass filtered white noise is also controlled by its own amplitude envelope:
Attack: from 0 to 1 amplitude in 0ms
Decay: from 1 to 0.05 amplitude in 32ms
Release: from 0.05 to 0 amplitude in 50ms
Mixer
Here it is possible to control the volume of the four elements individually, to create the mix that
suits the desired snare drum sound. The numbers are fixed in this version of the system, but it is
possible to alter the values within the snare drum patch in Max/MSP.
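The mixer stage amounts to a weighted sum of the four element signals; a minimal Python sketch (the gain values here are hypothetical, not the fixed values in the patch):

```python
def mix(elements, gains):
    """Weighted sum of equal-length sample lists, one per snare
    element, with one gain per element."""
    return [sum(g * sig[i] for g, sig in zip(gains, elements))
            for i in range(len(elements[0]))]

# Two-sample toy signals for sweep, overtones, noise, band-passed noise.
sweep, tones, noise, bp_noise = [1, 0], [0.5, 0.5], [0.2, -0.2], [0.1, 0.1]
out = mix([sweep, tones, noise, bp_noise], gains=[0.8, 0.4, 0.6, 0.5])
print(out)
```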
Changeable Parameters for the Two Synthesizers
In order to make the sound from the two synthesizers more interesting to the user, we have made
it possible to change some of the basic parameters of both synthesizers.
Changeable parameters for bass drum synthesizer
In the bass drum synthesizer you can change the frequency range of the sweep, the volume, and the amount of distortion (the latter by adding a number to the amplitude of the sound).
- Frequency
The frequency range is changed by adding a number between -50 and 50 to the output of the sinusoidal frequency sweep. Example: if the current output sweep runs from 250Hz to 80Hz and you have chosen to add 50Hz to the bass drum, you will hear a frequency sweep of 250Hz(+50Hz) to 80Hz(+50Hz) = 300Hz to 130Hz. The default value added to the frequency is 0.
In order to protect the speakers used with the system, the lowest frequency that can be produced is limited to 35Hz. So even if you add -50 to the sweep's final frequency of 80Hz, you will not get 30Hz, but 35Hz.
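The offset-plus-floor behaviour can be sketched as follows (function name and defaults are ours, for illustration):

```python
def sweep_with_offset(f_start=250.0, f_end=80.0, offset=0.0, floor=35.0):
    """Apply the user's -50..+50 Hz offset to the bass drum sweep,
    never letting either endpoint drop below the 35 Hz
    speaker-safety limit."""
    return (max(f_start + offset, floor), max(f_end + offset, floor))

print(sweep_with_offset(offset=50))   # -> (300.0, 130.0)
print(sweep_with_offset(offset=-50))  # -> (200.0, 35.0)  (floored, not 30 Hz)
```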
- Distortion
The amplitude of the bass drum can be distorted by adding a float between 0 and 1 to the actual amplitude of the envelope.
The number added to the amplitude is set by rotating the potentiometer and applying the value to the synthesizer. In this way it is possible to obtain an amplitude of up to 2 on the bass drum sound, which results in a distorted sound. The default value added to the amplitude is 0.
- Volume
The volume of the bass drum is controlled by sending values between 0 and 100 from the tactile
sensor to the bass drum synthesizer and applying these. The values are then scaled from
integers ranging between 0-100 to floats between 0-1 and applied to the amplitude of the
synthesizer by multiplication. The default volume for the bass drum is 99, or 0.99.
Changeable parameters for snare drum synthesizer
In the snare drum synthesizer you can change the frequency range of the sinusoidal frequency sweep element and the volume. In addition, you can add the two sound effects, delay and reverberation, which are described in the next section.
- Frequency
The frequency range of the sinusoidal frequency sweep element is changed in exactly the same way as for the bass drum synthesizer. Here, however, there is no limiter, so adding numbers from -50 to 50 gives extreme values like:
454Hz + 50Hz = 504Hz (upper limit)
250Hz – 50Hz = 200Hz (lower limit)
The default value added to the frequency sweep is 0.
- Volume
You can control the overall volume of all of the elements of the snare drum sound combined.
This is done in exactly the same way as with the bass drum volume.
The default volume for the snare drum is set to 99, or 0.99.
The Sound Effects
From a musical point of view not all sound effects are equally interesting on any instrument. This is
why we have chosen to have sound effects like delay and reverb for the snare drum only.
Delay patch
The "DELAY" patch is a sub-patch to "SYNTH" and creates a simple delay. The patch is divided into two blocks: an orange one and a blue one (see Figure 54). The orange block controls the delay time, and the blue block sets the delay mix. The orange block receives a default delay, which is set to 0, and a variable delay, which can be a value between 0 and 500 milliseconds. However, the delay object only takes delay times in samples, so we convert the value in ms to a number of samples with the "mstosamps~" object.
The blue block receives the snare drum audio signal from the "SYNTH" patch, which is then split up: it is sent both to a delay and directly to a signal multiplier. The delayed signal is afterwards also sent to a signal multiplier. The gain of the two signal multipliers is controlled by the delay mix value, which ranges from 0 to 100: 0 means that the amplitude of the delayed signal will be zero, and 100 means that it will have the same amplitude as the original signal.
Figure 54: DELAY patch, sub-patch to SYNTH patch
In the end the two signals are added together and sent back to the "SYNTH" patch.
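The ms-to-samples conversion and the delay mix can be sketched in Python (the 44100 Hz sample rate is an assumption; for clarity the mixing function takes its delay directly in samples):

```python
def ms_to_samples(ms, sr=44100):
    """What [mstosamps~] does: convert a delay time in ms to samples."""
    return int(round(ms * sr / 1000))

def delay_mix(signal, delay_samples, mix):
    """Dry signal summed with a delayed copy.  `mix` runs 0-100 as
    in the patch: 0 mutes the delayed copy, 100 gives it the same
    amplitude as the original."""
    g = mix / 100.0
    return [x + g * (signal[i - delay_samples] if i >= delay_samples else 0.0)
            for i, x in enumerate(signal)]

print(ms_to_samples(500))                      # -> 22050
print(delay_mix([1.0, 0.0, 0.0, 0.0], 2, 50))  # -> [1.0, 0.0, 0.5, 0.0]
```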
Reverberation patch
In order to create a reverberator we will need to create early reflections, late reflections, and
reverberation time, as explained earlier in the Reverberation Theory section (see page 18).
Based on our knowledge of creating digital reverbs, we tried to create a patch with both comb filters and all-pass filters. We felt it sounded too "metallic", and in our search to improve it we found a Max/MSP patch created by Scott Wieser42. We modified the reverb part of his patch so that it would fit into our system.
The result consists of two patches. The first one is called "REVERB" and controls the amount of early and late reflections, the reverberation time, and the mix between the original signal and the reverb signal. The "REVERB" patch is a sub-patch to the "SYNTH" patch (see "System Overview", page 38). The second one is called "reverb" and is a sub-patch to "REVERB"; this is the patch that contains the actual filters that create the reverb effect.
42 "Computermuzak" patch available from: http://www.geocities.com/snottywong_1999/maxmsp/
In the "REVERB" patch we receive two signals: SD_delay, which is the synthetic snare drum sound we have created, and reverb_mix, which is a number from 0 to 100 that controls the mix between the reverb and the original sound (see Figure 55).
The audio signal goes into a signal multiplier object and also into channel 1 of the "p reverb" object, which sends it to inlet 1 of the "reverb" sub-patch. Channel 2 of the "p reverb" object is fed with a number that determines the gain of the early reflections; this is sent to inlet 2 of "reverb". Channel 3 is fed with a positive number below 1 that controls the reverberation time, which is sent to inlet 3.
Figure 55: Control patch (REVERB patch) of the reverberator, sub-patch to SYNTH patch
In "reverb" we have the three inlets called 1, 2, and 3 (see Figure 56). Inlet 1 goes to a "tapin~" object, which stores the audio signal. The signal is then tapped with different delays by the "tapout~" objects. The number in the "tapin~" object determines how long the stored audio signal can be, in ms, and the numbers in the two "tapout~" objects set the delays for the signal in ms.
Figure 56: The content of the reverberator (reverb patch), sub patch to REVERB patch
The 2x6 outlets of the "tapout~" objects are multiplied by different amplitude factors and then combined. These signals are the early reflections; their gain is set by inlet 2.
The sixth outlet from each of the two "tapout~" objects also sends the signal to three all-pass filters each. The "allpass~" objects have different delay times; the shortest is 24ms and the longest 40ms. The "allpass~" objects create the late reflections, and inlet 3 controls the reverberation time.
Finally, all of the signals are combined and sent back to the "REVERB" patch via an outlet.
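The role of one such all-pass stage can be sketched with the classic Schroeder all-pass structure, the building block this kind of reverb chains. The delay in samples and the gain of 0.5 are our illustrative choices; the patch's actual coefficients are not documented here:

```python
def allpass(signal, delay, gain=0.5):
    """One Schroeder all-pass stage: flat magnitude response, but
    each input sample is smeared into a decaying train of echoes
    spaced `delay` samples apart."""
    buf = [0.0] * delay          # circular delay line, length = delay
    out = []
    for x in signal:
        delayed = buf[0]         # v[n - delay]
        v = x + gain * delayed   # feedback into the delay line
        y = -gain * v + delayed  # feedforward path
        buf = buf[1:] + [v]
        out.append(y)
    return out

impulse = [1.0] + [0.0] * 7
print(allpass(impulse, delay=3))
# -> [-0.5, 0.0, 0.0, 0.75, 0.0, 0.0, 0.375, 0.0]
```

The impulse response shows the characteristic pattern: a negative direct term, then echoes of height (1 - g^2) * g^(k-1) every `delay` samples.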
The Master Section
The Master section collects the sound produced by the two synthesizers and the loop playback patch, controls the overall volume, and sends the final result to the soundcard (see Figure 57).
The master volume can be controlled by the user by sending values from the tactile sensor.
The Max/MSP object "dac~" is used to convert the signals from digital to analogue (see "Digital Representation of Sound" on page 12 for details about the conversion).
Figure 57: Max/MSP patch for Master section
USER TEST
Introduction to User Test
The system has so far been optimized using the results of the analysis of our own voices, as described in the chapter "Voice-to-drum Frequency Analysis", page 24. In order to see whether the adjustments we have made to the system also apply to other users, we have performed a user test. The purpose of the user test is to ensure that we have reached our success criteria, which are:
- The system should be able to work in 'real-time'. (See chapter "Real Time" on page 35)
- Most people should be able to learn to use the system within a reasonable period of time (approx. 10-15 min).
The test was carried out as a "think-aloud" test, in which the users operate the system with only a few instructions. In order to create a somewhat domestic atmosphere, the test took place in a room where the user was accompanied only by the person taking notes during the test. This was done to prevent the user from becoming too nervous or shy to provide us with useful results.
We simply monitor and take notes of the behaviour and comments of the users. After they have tested and experienced the system, we ask them some questions from a questionnaire43.
We will compare the reactions and answers from the users to our success criteria.
We have tested the system on six users, half of them female and the other half male. It was important to us to have an equal number of both genders represented in the test, because the optimization before the test had been carried out using male voices only.
We were interested in seeing how female voices could control the system with the Sound Input
Analysis mechanism described on page 51.
43 See the questionnaire and the answers in "Appendix 4 - User Test Questionnaire with Answers", on page 91
The users in the test were of different nationalities and were all between 21 and 29 years old. All of them were experienced computer users, but only one of them had a little experience with computer music. Three of the users had played regular instruments before.
The test was carried out using a FireWire low-latency soundcard and a good handheld microphone, in order to create an optimal scenario for the users (see "Appendix 2 - Sound Hardware Specifications" on page 89).
Summary of the Answers
All in all, the users found the system fascinating and were thrilled to be able to trigger synthetic drum sounds with their voice. They were all able to trigger the two different drum sounds individually after only a few tries. However, it was more difficult for them to create a groove with more than 4 beats in a row.
Some users noticed that the snare drum was more difficult to trigger than the bass drum,
irrespective of gender.
Many users found the menu system a bit difficult to understand because of the sub-menu. Some of them said that this could be remedied by displaying the current position in the menu system visually.
Another problem for the users was reading the value of the potentiometer, which is not shown in the GUI after the value has been applied.
All of the users agreed that the tactile sensor would be more efficient and easy to use on the floor.
One user would have liked the control board to work in "real-time", without having to apply the effects to the output sound every time a change is made.
Two of the users noticed that sometimes the buttons produce voltages that make unwanted
menus appear.
The three users with musical experience would like the possibility of recording their performance within the system, both to improve their skills and to make more complex rhythmic patterns by playing along with themselves.
One user found that the music to play along with was too fast for him.
Another user noted the difficulty of producing the correct input sound to trigger the system when the output sound was considerably different.
All the users liked the sound of the snare and bass drum, and the effects. One of them would have liked to have more effects for the bass drum.
None of the users noticed any latency from the system.
Conclusion of User Test
This test was only performed on a total of six users, which is not enough to be statistically representative. Nevertheless, the statements of the users can be used as pointers that can help us improve our prototype before testing a more final version on a real target group.
We have partly fulfilled our success criteria. All of the users found the system interesting and fun, and all of them were able to trigger the sounds, even though the system was set up exactly the same way for every participant and the users employed different microphone techniques. This shows that, after a short introduction, users are able to use the fundamental functionality of the program.
Within the time frame of 15 minutes, none of the users could really create a longer, coherent groove, which was part of our success criteria. The users found it difficult to trigger the sounds in quick succession. Triggering was also made harder by the difference between the input sound they had to produce and the output sound from the system. These problems could possibly be overcome if the users had more time to adjust to the system and practise, just as with other instruments.
Generally, the users found the bass drum easier to trigger than the snare drum. This indicates that
the Sound Input Analysis could be tweaked and improved in order to ease this.
The issues concerning the tactile sensor are not our first priority compared to the rest of the system, since the controller is a very basic prototype, which in a final version of the system we would create in cooperation with trained and skilled engineers.
The issues concerning the flow of the menu system will likewise not be corrected, since these are outside the primary scope of this project.
None of the users experienced any latency whatsoever, so the system works sufficiently fast. This means that we have fully fulfilled one of our main success criteria.
It is our experience that it is possible to achieve a useful result for entertainment purposes with an
ordinary sound card, but if you want to use the system in a musical context, you need to have a
low latency soundcard.
We have obtained a lot of useful advice that can help us improve the system in order to remove
bugs and implement small features that enhance the usability and the usefulness of the system.
But basically the primary features of the system are working as intended.
Nevertheless, we have only partly fulfilled our criteria of success. We were able to make a system
that triggers bass drum sounds and snare drum sounds according to voice input, no matter the
gender of the user. The users did not notice any latency on the system at all.
The creation of rhythmic patterns is more difficult than triggering single sounds. This is due to the following issues:
- The snare drum is more difficult to trigger than the bass drum, which we should be able to improve.
- Creating patterns is difficult when the user has to focus on producing the right input sound, which is very different from the actual output sound. This is something that will improve with practice.
CONCLUSION
In this chapter we will conclude on the different parts of the project. We will compare the
questions in our problem formulation to our final results, and then wrap up the conclusion based
on our test chapter. After this we will discuss the future improvements that can be made within the
scope of the problem definition. Lastly we will try to put the project into perspective.
General Conclusion
Sound Analysis
"How can we analyze voice-to-drum simulated sounds, inputted through a microphone, and
identify the different drum types?"
When we looked at the characteristics of percussive sounds, we found that the human voice
has two important features when simulating drum sounds (see chapter "Percussive Sound
Characteristics" on page 23): the difference in frequency content when simulating a bass drum
sound and a snare drum sound, respectively, and the rapid increase in amplitude at the beginning
of the sound (the attack). These elements were implemented in our system as the criteria for
detecting whether the incoming sound is a voice-simulated drum sound, and if so, which kind of
drum sound it is.
In order to detect a rapid attack we used the bonk~ object in Max/MSP. This object made it
possible to set up rules defining which criteria the attack of the input sound should meet in order
to qualify as a potential drum triggering sound (see chapter "Attack Detection" on page 52).
Based on our analysis of human voices simulating drum sounds, we learned which frequency
intervals to look for amplitude peaks within, in order to recognize the human voice simulating a
bass drum sound and a snare drum sound (see "Voice-to-drum Frequency Analysis" on page
24). This knowledge was used to set up two band-pass filters to filter the input sound from the
microphone. Based on the filtered signals, we set up a set of rules to determine whether the
incoming sound is a bass drum, a snare drum or neither. When a specific drum is detected, and
an attack has been detected by the bonk~ object, the sound analysis part of the system sends a
trigger signal to the corresponding sound synthesis part (see chapter "Sound Input Analysis" on
page 51).
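The rule set described here can be sketched in a few lines of pseudocode-like Python. The band limits, the energy threshold, and the frame-based structure are assumptions chosen for illustration, not the exact values or objects used in our Max/MSP patch:

```python
import numpy as np

# Hypothetical band limits, loosely based on the measured peak frequencies
# in Appendix 3; the real patch uses band-pass filters tuned from that data.
BASS_BAND = (100.0, 350.0)     # Hz
SNARE_BAND = (2200.0, 3500.0)  # Hz

def band_energy(frame, sample_rate, lo, hi):
    """Spectral energy of one audio frame between lo and hi Hz."""
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    mask = (freqs >= lo) & (freqs <= hi)
    return float(np.sum(spectrum[mask] ** 2))

def classify(frame, sample_rate, attack_detected, threshold=1.0):
    """Mimic the rule set: require an attack, then compare band energies."""
    if not attack_detected:          # bonk~'s role: no attack, no trigger
        return None
    bass = band_energy(frame, sample_rate, *BASS_BAND)
    snare = band_energy(frame, sample_rate, *SNARE_BAND)
    if max(bass, snare) < threshold:
        return None                  # neither drum: too little band energy
    return "bass" if bass > snare else "snare"

# A 200 Hz test tone falls in the bass band, so it classifies as a bass drum.
sr = 8000
t = np.arange(1024) / sr
print(classify(np.sin(2 * np.pi * 200 * t), sr, attack_detected=True))
```

In the actual system the attack flag comes from bonk~ and the band energies from the two band-pass filters, but the decision logic follows this shape.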
Sound Synthesis
"How do we create a synthetic set of drum sounds that resemble the sounds from the vintage
drum machines, and can be triggered in sync with the identified sounds from the voice-to-drum
input?"
The construction of our bass drum synthesizer and snare drum synthesizer was based on the
structure of a software drum synthesizer called DrumSynth 2. The program is capable of
producing many different drum sounds, resembling the sound of the vintage drum synthesizers,
which we had chosen as the goal for our sound synthesis.
Through a short analysis of DrumSynth 2 (see page 29), we found that a combination of
sinusoidal frequency sweeps, noise, overtones created through additive synthesis, band-passed
noise, and distortion, combined with the right amplitude envelope, could be used to create a
bass drum sound and a snare drum sound. Both closely resembled the bass drum and snare
drum of one of our favourite vintage drum synthesizers, the Roland TR-909.
For the bass drum synthesizer we used a sinusoidal sweep in combination with an amplitude
envelope. The snare drum sound is a bit more complex and consists of a sinusoidal frequency
sweep, two overtones made with additive synthesis, noise, and band-passed white noise, in
combination with individual amplitude envelopes for the four different elements (see "Sound
Synthesis" on page 58).
The playback of these two drum sounds is synchronized to the trigger signal received from the
sound analysis part.
Sensor
"How do we create a tactile sensor that resembles a floorboard, and can manipulate the
parameters of the synthetically created drum sounds?"
Our original idea was to create a foot controlled floorboard for this system, but we ended up
creating a handheld prototype with the exact same functionality as the intended one. On the
sensor side, this project has mainly focused on the development of the handheld controller, and
only describes the theory of a microphone, since the standardised microphones available to us
suited our needs in that regard.
The tactile sensor for this project was built from our theoretical sensor knowledge, described in
"Appendix 1 - Sensor Theory" on page 84. The actual tactile sensor consists of a potentiometer
and a control board created as a voltage divider switching network, which are both connected to
a microprocessor (Teleo Intro Module).
The voltage divider switching network was created to produce a unique voltage when a button on
the control board is pressed. These voltages, together with the voltage from the potentiometer,
could be used to control some of the parameters of the software system after conversion.
Accessing the computer through the microprocessor eased the conversion of voltages to binary
digits, which otherwise could have been done through the serial or parallel port.
The Mapping
We believe that we have made a reasonable mapping for the microphone for this system. Input
sounds in the low frequency area have been mapped to the synthesis of a low frequency sound
(bass drum), and the high frequencies have been mapped to a higher frequency sound (snare
drum), which seems natural and easy to understand.
The tactile sensor also works as we hoped, although a few minor "errors" exist. For some reason,
turning the potentiometer clockwise produces decreasing values. It would seem more natural to
have it the other way around, since this is also the way most turning knobs work. Apart from this
we believe the mapping is good. It could be improved further by, for example, using the amplitude
of the input voice to shape the output sound from the synthesizer.
Testing
The main purpose of the test was to measure whether we had reached our predefined success
criteria. The test was performed on six users, of whom half were female and half were male.
We needed to find out if the users noticed any latency, which could reduce the usability and
usefulness of the system in a musical context. We also wanted the system to be easily accessible,
meaning that the user should be able to play the instrument within a reasonable period of time
(10-15 minutes).
Even though six users are not enough to conclude anything statistically, the results of the test can
be used as pointers, giving us qualitative information for further improvements and tweaking.
All six participants were able to play the instrument within the given period of time, but not all
succeeded equally well. They could all trigger the two sounds individually, but they all had
difficulties creating rhythmic patterns with combinations of the two sounds. This may be because
it takes more practice to learn the system and to say a particular sound when the output sound is
obviously different.
Some of the users also noticed that the snare drum was harder to trigger than the bass drum.
This is possibly because the filter analysing the input sound was optimized using measurements
of only 36 different voice-simulated drum sounds made by 3 persons. The triggering could be
improved by "training" the analysis part of the system with drum sounds from a larger number of
users.
All in all, we feel the system works in a satisfying way compared to the success criteria we initially
set up. There is always room for improvement, of course, and we will cover possible
improvements in the next chapter.
Future Improvements
Here we will cover some of the improvements we feel we could make within the scope of the
problem definition.
The detection of the input sounds could be improved if we spent more time analysing a wider
range of sounds, as mentioned above, and by using, for example, more band-pass filters and a
more complex set of detection rules.
The sound of the synthesized drums was satisfying, but could have been made to sound even
more like the TR-909 we were trying to emulate, had we spent more time on tweaking it. We could
also have provided a wider range of possible output sounds and effects, and maybe even have
tried to emulate some more natural sounding drums.
The tactile sensor works in the basic way it was intended to, but we did not build it as a
floorboard as originally planned. This, of course, could have been changed had we had the
necessary time and resources.
Perspective
Throughout the last decade many home studios have appeared due to the rapidly dropping
prices of PCs and the recording possibilities that fast modern computers provide. It has become
possible to create great sounding music in your bedroom, but you still need some knowledge of
how to play instruments and arrange music. There has also been a tendency towards easing the
production of electronic music with programs that allow you to create music by combining small
samples of music played by real musicians.
We wanted to allow non-musicians to be able to hear their musical ideas played with instrument
sounds on a stereo, instead of keeping the music inside their heads.
The techniques behind this project are relatively simple and still need development in order to
make them work in a more consistent way and for more users, but our vision has proven to work in
reality.
We imagine that a more finished version of Funkmeister7 would also be able to track frequency
and amplitude in combination, in order to synthesize melodic instruments such as bass, piano or
flute from these parameters. Perhaps the parameters of the input sound could be mapped to
MIDI signals, making the system trigger samples of real instruments and thereby expanding the
sonic possibilities of Funkmeister7.
In this way Funkmeister7 could become an entire music recording program that enables the user
to create and edit entire musical productions without touching a regular instrument, using only the
voice.
LITERATURE AND SOURCES
Primary Literature
"Computer Music – second edition"
Charles Dodge and Thomas A. Jerse
Published in 1997 by Schirmer, Thomson Learning
ISBN: 0-02-864682-7
Papers
"Percussion Synthesis"
Stephen Dill, first-year EE graduate from Stanford University
http://ccrma-www.stanford.edu/~sdill/220A-project/drums.html
Found 2nd of May 2005
"The importance of parameter mapping in electronic instrument design"
Andy Hunt, Marcelo M. Wanderley, Matthew Paradis, for NIME-02.
Borrowed from Juraj Kojs (juko@media.aau.dk), teacher at MED4, spring 2005
"Mapping performer parameters to synthesis engines"
Andy Hunt and Marcelo M. Wanderley, Department of Electronics, the University of York.
Borrowed from Juraj Kojs (juko@media.aau.dk), teacher at MED4, spring 2005
"Distributed Real-Time Audio Processing"
Master's thesis by Nelson Posse Lago
Distributed Systems Research Group
http://gsd.ime.usp.br/~lago/masters/extended_abstract.pdf
Found 15th of May 2005
Lectures
"Signal Processing in Automatic Perception 2"
Lecture 1 Review, by Stefania Serafin
MED4 2005, Aalborg University Copenhagen
Spring 2005
http://www.media.aau.dk/ap2/lecture1ap2.pdf
Found 6th of May
"Video Segmentation in Automatic Perception 1"
Lecture 17 and 18, by Thomas Moeslund
MED3, Aalborg University, Copenhagen
Autumn 2004
http://www.cvmt.dk/education/teaching/e04/MED3/AP/ap17+18.ppt
Found 11th of May
Internet
Wikipedia – web encyclopedia
http://en.wikipedia.org/wiki/Main_Page
Found 24th of April 2005
Synthtopia – portal for electronic music
http://www.synthtopia.com/
Found 26th of April 2005
Max/MSP tutorials and topics
http://www.synthesisters.com/download/MSP45TutorialsAndTopics.pdf
Found 24th of April 2005
Chienworks – computer services
http://www.chienworks.com/
Found 26th of April 2005
Signal Processing in AP2 – website for AP2 sound course on MED4
http://www.media.aau.dk/ap2/soundap2.html
Found 18th of April 2005
Harmony Central - Internet resource for musicians
http://www.harmony-central.com/
Found 27th of April 2005
Sound on Sound – web based music recording technology magazine
http://www.soundonsound.com/
Found 2nd of May 2005
Geofex – Guitar effects oriented webpage
http://www.geofex.com/
Found 10th of May 2005
Sensor Technology – website for Sensor Technology course MED4
http://www.smilen.net/SensorTechMED4/
Found 1st of April
Makingthings – contract services and tools for prototyping and development tools
http://www.makingthings.com/
Found 25th of March 2005
Real-time audio analysis tools for Pd and MSP
http://www-crca.ucsd.edu/~tapel/icmc98.pdf
Found 3rd of April
Resistor Colour Code Tutorial
http://www.uoguelph.ca/~antoon/gadgets/resistors/resistor.htm
Found 5th of May
Ohm's Law and Materials Properties
http://www.techfak.uni-kiel.de/matwis/amat/elmat_en/kap_1/backbone/r1_3_2.html#_13
Found 5th of May
Play Hookey – technical information
http://www.play-hookey.com/dc_theory/voltage_divider.html
Found 6th of May
Terratec – Audio Equipment
http://audioen.terratec.net/modules.php?op=modload&name=News&file=article&sid=5
Found 25th of May
Same Day Music – Musical instrument store
http://www.samedaymusic.com/product--BEHMIC100
Found 25th of May
Shure – audio equipment
http://www.shure.com/microphones/models/sm58.asp
Found 26th of May
Group 405 – Funkmeister7 homepage
http://cphstud.aue.aau.dk/~ka1147/
Found 26th of May 2005
Software
DrumSynth 2 – drum sound synthesizer
http://www.hitsquad.com/smm/programs/DrumSynth/
Found 1st of April 2005
Max/MSP – demo of version 4.5.4
http://www.cycling74.com/products/dlmaxmsp.html
Found 1st of April 2005
bonk~ - extra Max/MSP object for amplitude detection
http://www-crca.ucsd.edu/~tapel/software.html
Found 3rd of April 2005
Synthesizer with reverb – used for reverb creation
http://www.geocities.com/snottywong_1999/maxmsp/computermuzak.sit
Found 6th of May 2005
APPENDICES
Appendix 1 - Sensor Theory
To explain more deeply what we are going to build, we have to look at some of the basics of
electronics. We want to build a simple switching circuit, so general theory about voltage, current,
resistance, circuits, switches, variable resistors, and voltage divider switching networks will need
to be explained.
Voltage
Voltage is defined as the electric potential between two points in a circuit; the voltage can be
understood as the pressure pushing the current (electrons) at a given point. The electrons are
pushed through the circuit by an electrical force (a difference of pressure) caused by the potential
difference in the circuit. If a difference in the amount of electrons exists between the poles of e.g.
a battery in a circuit, the pole with the most electrons will "push" electrons through the circuit. In
real life a battery achieves this by an electrochemical process, which results in the minus side
having a large amount of charged electrons, while the positive side has fewer. If the two poles of
the battery are connected through a circuit and a difference in potential exists, the electrons will
be forced to move through the circuit, thereby producing current. The voltage (difference in
pressure) is measured in volts [V].
Current
Current is the amount of free electrons pushed from one point to another (through an electric
circuit) in a given period of time. The flow of electrons can be compared to water running through
a pipe. How fast the current flows is determined by the amount of pressure (voltage). The current
is the amount of charged electrons passing one point in the circuit per second, and it is
measured in amperes [A]; the number of amperes is proportional to the number of charged
electrons.
Resistance
As a result of the crystal structure of most conductors (metals), the free electrons whose
movement makes up the electric current collide with the crystal lattice, which converts part of the
electrical energy into heat – this phenomenon is known as electric resistance.
Resistance is the measure of how much an electrical component "refuses" to let current flow
through it. Everything in a circuit has a certain amount of resistance, even the wires and the
copper lanes on print boards, but this resistance is so small that it is not taken into consideration.
In electric circuits it is possible to insert resistance at any point one wishes. When resistance is
introduced in our system, we are talking about specific components made to lower the flow of
current. Such components are called resistors; their resistance is measured in Ohms [Ω] and
they come in an almost endless variety of resistance values.
How much resistance a resistor produces can be determined visually by the coloured rings
painted on the resistor, which follow the resistor colour code standard [44]. As an example, a
4 kOhm resistor would look like Figure 58.
Figure 58: A 4 kOhm resistor [45]
Figure 59: A drawing of two resistors
As mentioned, resistance is measured in Ohms, and in technical terms it can be described as the
relationship between voltage and current in Ohmic materials, also known as Ohm's law (V = I · R).
Ohm's law [46] states that the voltage across the poles and the current flowing through a
conductor (i.e. a resistor) are proportional to each other at a given temperature (V / I = R) [47].
The voltage across the terminals of e.g. a resistor is proportional to both the current (I) flowing
through the element and the resistance (R) of the element; therefore we write V = I · R. The
resistance of an element changes with temperature, which is why the law holds at a given
temperature. By using this equation we can determine which voltages we will end up with.
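As a quick illustration of the law, using the 4 kOhm example resistor from above and an assumed current:

```python
# Ohm's law, V = I * R, applied to the 4 kOhm example resistor:
# if 1.25 mA flows through it, the voltage drop across it is 5 V.
R = 4000.0    # resistance in ohms (the 4 kOhm resistor from Figure 58)
I = 0.00125   # current in amperes (1.25 mA, an assumed value)
V = I * R     # voltage in volts
print(V)
```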
[44] http://www.uoguelph.ca/~antoon/gadgets/resistors/resistor.htm
[45] From Smilen Dimitrov's slide: ST_MED4_05_Circuit_Theory_Elementary_Measurement_Labs01.ppt
[46] http://www.techfak.uni-kiel.de/matwis/amat/elmat_en/kap_1/backbone/r1_3_2.html#_13
[47] http://en.wikipedia.org/wiki/Ohms_Law
Circuits
A simple circuit (see Figure 60) consists of a power source and a closed path of conductors,
which forms a circular path with the electric power source through which the current flows. If the
energy source's poles are connected only by a good conductor, the result is a short circuit,
meaning something will "crash". This is because the electrons do not experience much resistance
from the conductor that connects the poles, and the electrons simply return the energy the
source gave them back to the source. To make the circuit work, all that has to be done is to insert
something that provides resistance, like a resistor or a light bulb; then the circuit is complete and
qualifies as the simplest circuit possible.
Figure 60: Illustration of a simple circuit [48]
When talking about circuits there are a few things one has to be aware of. First of all, when
drawing the schematics/diagram for a circuit, the charged electrons are drawn as moving from
positive to negative, although they actually move from minus to plus.
When the circuit is built from resistors in series, the current stays the same, but the voltage
changes [49].
Switches
Switches are used in circuits to open or close the circuit. When the switch is closed (e.g. pushed
down, in the case of push buttons), the current will flow through the circuit; when it is open,
nothing will happen. The push button [50] has a default state of open, meaning that as long as no
pressure is applied to the button, nothing happens in the circuit. This is illustrated in Figure 61.
[48] Image from Smilen Dimitrov's slide: ST_MED4_05_Circuit_Theory_Elementary_Measurement_Labs01.ppt
[49] http://en.wikipedia.org/wiki/Resistor
[50] http://www.bcae1.com/images/gifs/switpush.gif
Figure 61: Push button schematic (open), showing the push button actuator, plunger, movable
contact, stationary contacts, and solder lug terminals
Variable Resistor and Voltage Divider Switching Network
A potentiometer [51] is also a resistor, but one with a defined range of resistance that can be
adjusted (see Figure 62). The potentiometer has three terminals, where the middle terminal is the
ground and also where the difference is measured. If 5 V is connected to the right and left
terminals, the connection from middle to right will have one output voltage and the connection
from middle to left will have a second output voltage. As the meter is trimmed, the resistance
increases or decreases on both sides: as the middle-left connection gets smaller, the middle-right
connection gets bigger, and vice versa. This is also the basic concept of a voltage divider
switching network (voltage divider); put differently, the potentiometer acts as a voltage divider
with only two possible output voltages.
Figure 62: A typical potentiometer (taken from Wikipedia)
The voltage divider [52] has the ability to produce more than two output voltages using only one
circuit. It is made of a series combination of 7 resistors and 6 corresponding push button
switches, as on the illustration. There is only one loop in the circuit and thus only one current. By
pressing a given switch, a part of the resistors is short-circuited, so the total resistance in the
circuit changes, and the output voltage changes as well. Due to this specific construction, it is
only possible to detect one button press at a time. Therefore, if two buttons are pressed
simultaneously, the "lowest" one is reported, since it short-circuits the "upper" one. The principle
of operation can be demonstrated with two cases:
[51] http://en.wikipedia.org/wiki/Potentiometer
[52] http://www.play-hookey.com/dc_theory/voltage_divider.html
In the first case all the switches are open. The current in the circuit is

(1)   I = U / (R1 + R2 + R3 + R4 + R5 + R6 + R7)

Then the output voltage (which is the voltage drop over resistor R1 due to current I) is

(2)   Vout = R1 · I = R1 · U / (R1 + R2 + R3 + R4 + R5 + R6 + R7)

or the same formula, rewritten as a voltage divider:

(3)   Vout = U · R1 / (R1 + R2 + R3 + R4 + R5 + R6 + R7)

In the second case switch no. 3 is closed. The current in the circuit is

(4)   I = U / (R1 + R2 + R3)

Then the output voltage (which is the voltage drop over resistor R1 due to current I) is

(5)   Vout = R1 · I = R1 · U / (R1 + R2 + R3)

or the same formula, rewritten as a voltage divider:

(6)   Vout = U · R1 / (R1 + R2 + R3)
These two cases illustrate that for every pressed switch, a distinct output voltage is generated.
These distinct outputs can then be used to obtain distinct numerical values in our programming
environment.
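The two cases generalize to all six buttons, which can be sketched as follows. The supply voltage and resistor values below are assumptions chosen for illustration, not the components of our actual control board:

```python
# Sketch of the voltage divider switching network described above.
# Seven resistors R1..R7 in series; pressing button k short-circuits the
# resistors above R_k, leaving only R1..Rk in the loop.
U = 5.0                  # supply voltage in volts (assumed)
R = [1000.0] * 7         # R1..R7, all 1 kOhm in this sketch

def output_voltage(pressed=None):
    """Voltage drop over R1 for button `pressed` (1-6), or None if no press."""
    chain = R if pressed is None else R[:pressed]
    current = U / sum(chain)   # one loop, one current (Ohm's law)
    return R[0] * current      # Vout = R1 * I, the drop over R1

# Every button press yields a distinct output voltage.
for k in (None, 1, 2, 3, 4, 5, 6):
    print(k, round(output_voltage(k), 3))
```

Even with identical resistors, each button produces a unique Vout, which is what lets the microprocessor tell the buttons apart from a single analog input.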
Appendix 2 - Sound Hardware Specifications
The following hardware has been used during sound analysis and test phase:
Terratec FW24 [53] (low latency firewire soundcard)
Behringer MIC100 [54] (tube pre-amp)
Shure SM58 [55] (handheld dynamic microphone)
Labtec AM-22 (cheap handheld dynamic microphone for computer use)
[53] http://audioen.terratec.net/modules.php?op=modload&name=News&file=article&sid=5
[54] http://www.samedaymusic.com/product--BEHMIC100
[55] http://www.shure.com/microphones/models/sm58.asp
Appendix 3 – Bass and Snare drum peak frequencies of
recorded voice-to-drum samples
Bass Drum:
Filename                    Main Frequency Peak (Hz)
BD_Allan_Labtec_01.wav      318
BD_Allan_Labtec_02.wav      289
BD_Allan_Labtec_03.wav      301
BD_Allan_Shure_01.wav       199
BD_Allan_Shure_02.wav       200
BD_Allan_Shure_03.wav       200
BD_Kasper_Labtec_01.wav     331
BD_Kasper_Labtec_02.wav     309
BD_Kasper_Labtec_03.wav     330
BD_Kasper_Shure_01.wav      167
BD_Kasper_Shure_02.wav      168
BD_Kasper_Shure_03.wav      170
BD_Mads_Labtec_01.wav       258
BD_Mads_Labtec_02.wav       302
BD_Mads_Labtec_03.wav       336
BD_Mads_Shure_01.wav        110
BD_Mads_Shure_02.wav        118
BD_Mads_Shure_03.wav        186
Snare Drum:
Filename                    Main Frequency Peak (Hz)
SD_Allan_Labtec_01.wav      2647
SD_Allan_Labtec_02.wav      3360
SD_Allan_Labtec_03.wav      2792
SD_Allan_Shure_01.wav       3317
SD_Allan_Shure_02.wav       3461
SD_Allan_Shure_03.wav       3346
SD_Kasper_Labtec_01.wav     2386
SD_Kasper_Labtec_02.wav     2395
SD_Kasper_Labtec_03.wav     2643
SD_Kasper_Shure_01.wav      2230
SD_Kasper_Shure_02.wav      2234
SD_Kasper_Shure_03.wav      2296
SD_Mads_Labtec_01.wav       2905
SD_Mads_Labtec_02.wav       2802
SD_Mads_Labtec_03.wav       3402
SD_Mads_Shure_01.wav        3360
SD_Mads_Shure_02.wav        2834
SD_Mads_Shure_03.wav        2841
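A quick way to read the two tables is to summarize the observed ranges; the sketch below uses the values transcribed from above:

```python
# Summarize the measured peak frequencies (Hz) from the tables above to see
# which frequency ranges separate the two drum types.
bass = [318, 289, 301, 199, 200, 200, 331, 309, 330, 167, 168, 170,
        258, 302, 336, 110, 118, 186]
snare = [2647, 3360, 2792, 3317, 3461, 3346, 2386, 2395, 2643, 2230,
         2234, 2296, 2905, 2802, 3402, 3360, 2834, 2841]

# Print min, max, and mean per drum type; the two ranges do not overlap,
# which is why two band-pass filters can tell the sounds apart.
for name, peaks in (("bass", bass), ("snare", snare)):
    print(name, min(peaks), max(peaks), round(sum(peaks) / len(peaks)))
```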
Appendix 4 – User Test Questionnaire with Answers
User test of Funkmeister7 – 01
Facts about the user
Age, nationality, sex: 22 years, Danish, male
Has the user any kind of previous experience with beat-boxing? If yes; for how many years? no
Has the user any kind of previous experience with musical instruments or singing? If yes; what
kind of instrument and for how many years? no
Has the user any kind of previous experience with drum-synth programs? If yes; what/which
program(s) and for how many years? no
Observations during the user test
Was the user able to trigger the snare drum and bass drum? Yes, individually he was able to
trigger the drums ok, but he had difficulties producing a sequence of sounds
How long did it take the user to figure out how to play the FUNKMEISTER7? After an
introduction to the system, it only took him a couple of minutes, and he was able to create a
coherent pattern with different drum sounds within the ten minutes it took.
Could the user figure out how to use the control board? At first the user found the menu system
difficult to understand, so I introduced it to him. After this it was ok.
Questions to the users
What was good/bad? "Easy to trigger when you know the sound". It's difficult to figure out what
value the chosen effect already has!
What did you miss, what could have been done better/different? BD is too nice, need more rough
effects
How do you think the FUNKMEISTER7 works? (Did it take a long time to learn how to play it, were
there any real-time related problems?) Nice that it actually works! No latency
How do you think the control board works? The buttons do not work every time, it would be nice to
have on the floor
What do you think of the layout/graphic, should it be different? (Is it easy to figure how the menu
system/entire system works?) Difficult to understand that the menu has a main and sub part
What do you think of the bass and snare drum sounds? Fine
What do you think of the effects you can apply? (Do you miss some effects?) More on BD
Anything else? The interface is nice!
User test of Funkmeister7 – 02
Facts about the user
Age, nationality, sex: 29, Danish, male
Has the user any kind of previous experience with beat-boxing? If yes; for how many years? no
Has the user any kind of previous experience with musical instruments or singing? If yes; what
kind of instrument and for how many years? Sings, plays guitar, since age 12
Has the user any kind of previous experience with drum-synth programs? If yes; what/which
program(s) and for how many years? no
Observations during the user test
Was the user able to trigger the snare drum and bass drum? Starting difficulties, especially SD
How long did it take the user to figure out how to play the FUNKMEISTER7? Could create
simple patterns after approx. 5 minutes
Could the user figure out how to use the control board? Ok, after a short presentation.
Questions to the users
What was good/bad? BD pitch has too wide a range; "the music is too fast" – maybe use pitch
correction on the songs that are already there?
What did you miss, what could have been done better/different?
How do you think the FUNKMEISTER7 works? (Did it take a long time to learn how to play it, were
there any real-time related problems?) The system is very difficult to control when you say the
sounds really fast. No latency
How do you think the control board works? Ok, but it should be on the floor
What do you think of the layout/graphic, should it be different? (Is it easy to figure how the menu
system/entire system works?) Really simple = nice. "It would be nice to be able to see where I am
in the menu system"
What do you think of the bass and snare drum sounds? Ok
What do you think of the effects you can apply? (Do you miss some effects?)
Anything else? Does the input monitor work for girls? It would be nice to have the possibility to
save your user presets (favourite adjustments) Real-time adjustment of the sounds would give
more expression possibilities.
User test of Funkmeister7 – 03
Facts about the user
Age, nationality, sex: 24, Danish, female
Has the user any kind of previous experience with beat-boxing? If yes; for how many years? no
Has the user any kind of previous experience with musical instruments or singing? If yes; what
kind of instrument and for how many years? no
Has the user any kind of previous experience with drum-synth programs? If yes; what/which
program(s) and for how many years? no
Observations during the user test
Was the user able to trigger the snare drum and bass drum? Individually it was ok. The BD was
fine, but the SD demanded more resources from her… it got better over time
How long did it take the user to figure out how to play the FUNKMEISTER7? 3 minutes for the
basics
Could the user figure out how to use the control board? She found the terms used difficult to
understand, because she does not play music: "why do the values not work in real-time?", "why
can't I see the current status?"
Questions to the users
What was good/bad? It actually works. Nice!
What did you miss, what could have been done better/different? The menu is a bit tricky
How do you think the FUNKMEISTER7 works? (Did it take a long time to learn how to play it, were
there any real-time related problems?) A fast learner, so the basics took about 5 minutes. No
latency.
How do you think the control board works?
What do you think of the layout/graphic, should it be different? (Is it easy to figure out how the menu
system/entire system works?) 90s layout, easy to see what is going on. "Maybe the menu should
light up the chosen instrument so you know where you are?"
What do you think of the bass and snare drum sounds? fine
What do you think of the effects you can apply? (Do you miss some effects?) OK
Anything else? "Saying another sound than what you hear is difficult." Difficult to trigger the drums
fast.
User test of Funkmeister7 – 04
Facts about the user
Age, nationality, sex 21, Macedonian, female
Has the user any kind of previous experience with beat-boxing? If yes; for how many years? no
Has the user any kind of previous experience with musical instruments or singing? If yes; what
kind of instrument and for how many years? no
Has the user any kind of previous experience with drum-synth programs? If yes; what/which
program(s) and for how many years? no
Observations during the user test
Was the user able to trigger the snare drum and bass drum? Ok, after a while
How long did it take the user to figure out how to play the FUNKMEISTER7? A while… never
really succeeded… was too shy.
Could the user figure out how to use the control board? Yes, it was easy for her… but the system
was a bit unstable because of voltage issues.
Questions to the users
What was good/bad? -
What did you miss, what could have been done better/different? -
How do you think the FUNKMEISTER7 works? (Did it take a long time to learn how to play it, were
there any real-time related problems?) The basics were ok after a few minutes. No latency.
How do you think the control board works? ok
What do you think of the layout/graphic, should it be different? (Is it easy to figure out how the menu
system/entire system works?) It is very nice.
What do you think of the bass and snare drum sounds? -
What do you think of the effects you can apply? (Do you miss some effects?) -
Anything else? -
User test of Funkmeister7 – 05
Facts about the user
Age, nationality, sex 24, Icelandic, female
Has the user any kind of previous experience with beat-boxing? If yes; for how many years? no
Has the user any kind of previous experience with musical instruments or singing? If yes; what
kind of instrument and for how many years? Plays piano and flute, since child
Has the user any kind of previous experience with drum-synth programs? If yes; what/which
program(s) and for how many years? no
Observations during the user test
Was the user able to trigger the snare drum and bass drum? Good individually. SD difficult
How long did it take the user to figure out how to play the FUNKMEISTER7? Approx. 5
minutes
Could the user figure out how to use the control board? Yes, after a short explanation
Questions to the users
What was good/bad? -
What did you miss, what could have been done better/different? -
How do you think the FUNKMEISTER7 works? (Did it take a long time to learn how to play it, were
there any real-time related problems?) No latency.
How do you think the control board works? Ok, would be better on the floor
What do you think of the layout/graphic, should it be different? (Is it easy to figure out how the menu
system/entire system works?) Looks nice.
What do you think of the bass and snare drum sounds? good
What do you think of the effects you can apply? (Do you miss some effects?) fun
Anything else? It would be nice to hear your performance afterwards. A recording option
User test of Funkmeister7 – 06
Facts about the user
Age, nationality, sex 26, Danish, male
Has the user any kind of previous experience with beat-boxing? If yes; for how many years? no
Has the user any kind of previous experience with musical instruments or singing? If yes; what
kind of instrument and for how many years? Plays guitar and sings
Has the user any kind of previous experience with drum-synth programs? If yes; what/which
program(s) and for how many years? Not really, but has used Fruity Loops a bit, and Cool Edit
(sound editor)
Observations during the user test
Was the user able to trigger the snare drum and bass drum? Yes, individually, but did not trigger
every time
How long did it take the user to figure out how to play the FUNKMEISTER7? Approx. 3
minutes, but the triggering was not perfect
Could the user figure out how to use the control board? Yes, after a short introduction
Questions to the users
What was good/bad? BD difficult to do fast, it is not precise enough. "Maybe practice will improve
my skills?"
What did you miss, what could have been done better/different?
How do you think the FUNKMEISTER7 works? (Did it take a long time to learn how to play it, were
there any real-time related problems?) Fine, but sometimes there is sound even if you are not
saying anything
How do you think the control board works?
What do you think of the layout/graphic, should it be different? (Is it easy to figure out how the menu
system/entire system works?) -
What do you think of the bass and snare drum sounds? Many adjustments are possible, but the
interface is "narrow"… which is not good in a live situation.
What do you think of the effects you can apply? (Do you miss some effects?) I would expect them
to be the way they are (mandatory)… they sound fine.
Anything else? The songs are very "Nik & Jay"-like, the amplitude clips, and it would be nice to have a
recording option in order to play along with yourself and create more complex patterns.