Funkmeister7 - Group 405 - Aalborg University Copenhagen - Medialogy, 4th semester – spring 2005

FUNKMEISTER 7

ABSTRACT

The Funkmeister7 project is about the creation of a voice controlled, digital drum synthesizer that is triggered by the input from a standardised microphone. The output sound can be altered with a tactile sensor that we have created ourselves. The core of the project is the analysis of the human voice simulating drum sounds, in order to extract the needed information to trigger the drum synthesizer we have built.

The voice input from the microphone is analysed in order to find the attack characteristics of a percussive sound and the frequency characteristics of a bass drum and a snare drum, respectively. The result of this analysis is used to trigger a drum synthesizer, which we built in Max/MSP. The drum synthesizer uses elements that simulate the sound of vintage drum synthesizers. These are the combination of frequency swept sinusoidal waves, noise, noise with band-passed filter, additive synthesis and amplitude envelopes.

The tactile sensor, used to alter the parameters of output sound from the drum synthesizer, consists of a voltage divider switching network and a potentiometer, connected to a microprocessor.

After the design and implementation of the system we subject it to user tests, and draw conclusions based on these.
TABLE OF CONTENTS

ABSTRACT 2
INTRODUCTION 6
PROBLEM DEFINITION 7
  Delimitations 8
  Success Criteria 8
RESEARCH AND THEORY 9
  Sound Theory 9
    The Basics of Sound 9
    Digital Representation of Sound 12
    Filters and Effects 14
    Sound Synthesis 20
ANALYSIS AND DISCUSSION 23
  Percussive Sound Characteristics 23
    Amplitude Envelope 23
    Voice-to-drum Frequency Analysis 24
    Conclusion of Percussive Sound Characteristics 29
  Short Analysis of DrumSynth 2 29
  Electronic Sensor Interfacing 31
  Mapping 32
  Real-time 35
DESIGN & IMPLEMENTATION 37
  Short Description of the System 37
  System overview 38
  Mapping 41
    Mapping the microphone 42
    Mapping the Tactile Sensor 42
  The Tactile Sensor 45
    Transducing Motion and Pressure 46
    Formatting the Transduced Data 47
    The Tactile Sensor – Software (Max/MSP) 48
  Sound Input Analysis 51
    Introduction to Sound Input Analysis 51
    Attack Detection 52
    Drum-type Detection 53
    Drum Triggering 56
  Sound Synthesis 58
    Introduction to Sound Synthesis 58
    Description of the Sound Synthesis Blocks 59
USER TEST 70
  Introduction to User Test 70
  Summary of the Answers 71
  Conclusion of User Test 72
CONCLUSION 74
  General Conclusion 74
  Sound Analysis 74
  Sound Synthesis 75
  Sensor 75
  The Mapping 76
  Testing 76
  Future Improvements 77
  Perspective 78
LITERATURE AND SOURCES 79
  Primary Literature 79
  Papers 79
  Lectures 80
  Internet 80
  Software 83
APPENDICES 84
  Appendix 1 - Sensor Theory 84
    Voltage 84
    Current 84
    Resistance 85
    Circuits 86
    Switches 86
    Variable Resistor and Voltage Divider Switching Network 87
  Appendix 2 - Sound Hardware Specifications 89
  Appendix 3 – Bass and Snare drum peak frequencies of recorded voice-to-drum samples 90
  Appendix 4 – User Test Questionnaire with Answers 92
    User test of Funkmeister7 – 01 92
    User test of Funkmeister7 – 02 93
    User test of Funkmeister7 – 03 94
    User test of Funkmeister7 – 04 95
    User test of Funkmeister7 – 05 96
    User test of Funkmeister7 – 06 97

INTRODUCTION

Have you ever come up with a catchy tune or a cool drum beat and found yourself humming or "beat-boxing" [1] it with your mouth? Have you been unable to put it to any good use, because of lack of musical knowledge? Imagine if there was a system that could take the humming/beatboxing straight from your mouth and turn it into actual music, without forcing you to learn how to play an instrument or learn any musical theory. This is the entry point to this project – to make the process of music making much more straightforward and intuitive.

Figure 1 - Roland TR-909

The basic idea for this project is to create a software-based drum machine system, inspired by the classic vintage ones like the Roland TR-909 [2] (see Figure 1), that the user is able to "play" with his mouth. The drum machine will play different kinds of synthesized, percussive sounds, based on what kind of microphone input it is getting from the user. I.e. if the user tries to imitate a bass drum sound with his mouth, the system should recognize this and play an actual, synthesized bass drum sound. In addition to triggering the drum sounds, the user should also be able to manipulate these sounds with his feet, via a controller unit placed on the floor. All this should be linked together in a simple Graphical User Interface. With this system the user will be able to create a new range of sounds compared to what is possible with just the mouth alone.
[1] Definition of Beat-boxing: http://en.wikipedia.org/wiki/Beatboxing
[2] More info about TR-909: http://www.synthtopia.com/synth_review/RolandTR-909.html

Figure 2 shows the initial sketch of the idea.

Figure 2 - Initial sketch of the idea

PROBLEM DEFINITION

How can we analyze voice-to-drum simulated sounds, inputted through a microphone, and identify the different drum types?

How do we create a synthetic set of drum sounds, which resemble the sounds from the vintage drum machines, and can be triggered in synchronization with the identified sounds from the voice-to-drum input?

How do we create a tactile [3] sensor that resembles a floorboard, and can manipulate the parameters of the synthetically created drum sounds?

[3] Tactile: Perceptible to the sense of touch; tangible.

Delimitations

We have decided that the main focus of this project is the sound analysis and sound synthesis. Therefore we are not going to build the tactile sensor as a floorboard controller, as we imagine it to be in the final version of the system. Instead we are going to build a smaller tactile sensor that resembles the structure we had in mind for the final version, and which can be controlled by hand. The reason is that we simply do not have enough time and resources to produce the floorboard at the current time.

We will not build our own microphone, as the standardized ones that are available to us suit our needs sufficiently.

Furthermore, this prototype version of the system will only detect and play two different kinds of drums: bass drum and snare drum. This, again, is a matter of time and resources. These two drums have been chosen because they are the two most fundamental drum types in most musical genres.
Success Criteria

In order to determine later in the process (see page 70) whether our system has been a success, we need to define some criteria for success:

The system should be able to work in 'perceived real-time' in order to be usable in a musical context. I.e. the user should perceive the triggering and playback of the synthesized drum sound, based on the voice-to-drum input, as happening instantaneously (see the "Real-time" chapter on page 35 for more details).

Most people should be able to learn to use the system within a reasonable period of time (approx. 10-15 min), in order to make sure that our system is straightforward and intuitive to use.

RESEARCH AND THEORY

Since we have narrowed the project down to focus mainly on sound analysis and sound synthesis, we are now going to look at some of the basic theory behind these. The purpose is to provide a better understanding of the choices we make later in the process. We are going to look at the basics of sound, digital representation of sound, filters and effects, and sound synthesis. We are also going to look at some basic theories behind the concept of "mapping", in order to get a better idea of how to map the input from the tactile sensor and the sound input from the microphone to the synthesized output sounds. The theory behind the tactile sensor can be found in Appendix 1 - Sensor Theory, on page 84.

Sound Theory

In this chapter we explain some of the fundamentals of sound in general and how sound can be represented in digital form. Then we describe some of the filters and effects that can be applied to sound, and lastly we explain how basic sound synthesis works.

The Basics of Sound [4]

The sound we hear is basically change in air pressure.
When an object moves, it sets the air molecules near it in motion, which in turn set the neighbouring molecules in motion, and so on. When this motion reaches our ears, it is perceived as sound.

[4] Sources for this chapter:
- "MSP Tutorials and Topics", pages 13-19, by Cycling74 - http://www.synthesisters.com/download/MSP45TutorialsAndTopics.pdf
- "Digital Audio - introduction to theory" - http://www.chienworks.com/webinfo/digaudio/
- "AP2 – lecture 1" slides by Stefania Serafin - http://www.media.aau.dk/ap2/lecture1ap2.pdf

Sound can be represented as a graph of air pressure over time (see Figure 3).

Figure 3 - Graph of air pressure over time

The amplitude of a sound is the amount of change in air pressure, measured in decibels (dB). In general, the higher the amplitude is, the higher the perceived loudness of the sound will be. The amplitude envelope of a sound refers to the shape of the overall change in amplitude over the course of its duration (see Figure 4).

Figure 4 - Amplitude envelope

The attack part of the amplitude envelope is the range from where the sound starts until it reaches its peak amplitude. Decay is where the amplitude falls to the sustain part of the sound. During the sustain part the amplitude stays at roughly the same level, until it reaches the release part, where the amplitude drops to its final level. These concepts are used in connection with sound synthesis (see "Sound Synthesis" on page 20).

The frequency of a sound wave is the number of cycles per second, measured in hertz (Hz) (see Figure 3 on page 10). The higher the frequency is, the higher the perceived pitch of the sound will be. The audible frequency range for a human is approximately 20 - 20,000 Hz.

Most sounds contain more than just a single frequency.
These are called complex tones. The spectrum of a sound is the combination of all the frequencies the sound consists of, together with their amplitudes. Figure 5 and Figure 6 below show the sound of a snare drum in the time domain (amplitude over time) and the frequency domain (amplitude over frequency), respectively.

Figure 5 - time domain of a snare drum sound

Figure 6 - frequency domain of a snare drum sound

Each individual frequency of a complex tone is called a partial. When these partials are all integer multiples of the same frequency, the sound has a harmonic spectrum. Harmonic sounds are usually perceived as having a single pitch, and are used for creating music, for example. The partials of an inharmonic sound, on the other hand, are not all integer multiples of a fundamental frequency, and thus they do not blend together into a single perceived pitch as easily as the harmonic ones. When we perceive noise, the sound consists of a lot of different frequencies with no apparent mathematical relationship.

Digital Representation of Sound [5]

The basic concept of digital sound consists of taking a lot of snapshots of the amplitude values of the sound, saving those values as numbers, and then reproducing the amplitude based on these. Figure 7 shows the process of converting an analogue sound signal into digital information and then playing it back.

Figure 7 - Digital recording and playback [6]

From the source, the sound goes into a microphone. The microphone converts the change in air pressure into change in electrical voltage. To limit the amount of information that needs to be processed, this change in voltage is only recorded at a certain periodic interval, in a process called sample and hold. Basically, the voltage value is recorded and then kept at that value until the next periodic sample is received (see Figure 8).
The number of samples taken per second is called the sampling rate, and it is measured in hertz (Hz).

[5] The source for this chapter is: "MSP Tutorials and Topics", pages 21-28, by Cycling74 - http://www.synthesisters.com/download/MSP45TutorialsAndTopics.pdf
[6] Source: "MSP Tutorials and Topics", page 23, by Cycling74 - http://www.synthesisters.com/download/MSP45TutorialsAndTopics.pdf

Figure 8 - voltage signal sampled periodically [7] (amplitude over time)

To represent a sound accurately, the computer needs to take many samples per second. The Nyquist Theorem [8] states that a computer can only accurately represent frequencies up to half the sampling rate. I.e. to accurately sample frequencies up to the 20,000 Hz a human can perceive, we would need to use a sampling rate of at least 40,000 Hz. This is why the signal is sent through a low-pass filter before the sample and hold process: it removes any frequencies above half the sampling rate, which would otherwise be misrepresented as lower frequencies (aliasing) and appear as noise in the signal (see page 14 for more information on filters).

From the sample and hold process, the sampled voltage values then go into a device called an Analogue-to-Digital Converter (ADC). Here the voltages are converted into a string of binary digits, in a process called quantization. The higher the resolution of the quantization is, the more values can be assigned to the amplitude range of the sound, and thus the more precisely the sound can be stored digitally. I.e. a resolution of 8 bits allows the amplitude range to be divided into 256 steps (2^8), 16 bits allows 65,536 steps (2^16), and so on.

If the incoming signal is higher than the maximum quantized amplitude that can be expressed with numbers, the phenomenon called clipping occurs. Clipping causes the peaks of the sound to be cut off, so that it becomes more or less distorted (see Figure 9 below).
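The sampling, quantization and clipping steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the signed integer coding and the 440 Hz test tone are our own choices for the example and are not taken from the project or from any particular ADC.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, n_samples, amplitude=1.0):
    """Take n_samples snapshots of a sine wave at regular intervals (the sampling step)."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

def quantize(value, bits):
    """Map an amplitude (nominal range -1.0..1.0) to a signed integer code.

    Values outside the nominal range are clipped to the largest or
    smallest code that the chosen resolution can express.
    """
    max_code = 2 ** (bits - 1) - 1      # 127 for 8 bits, 32767 for 16 bits
    min_code = -(2 ** (bits - 1))       # -128 for 8 bits, -32768 for 16 bits
    code = round(value * max_code)
    return max(min_code, min(max_code, code))

def highest_representable_frequency(sample_rate_hz):
    """Nyquist limit: half the sampling rate."""
    return sample_rate_hz / 2

steps_8_bit = 2 ** 8     # number of amplitude steps at 8-bit resolution
steps_16_bit = 2 ** 16   # number of amplitude steps at 16-bit resolution

# Hypothetical test tone that is 1.5 times too loud, so its peaks get clipped.
too_loud = sample_sine(440, 44100, 100, amplitude=1.5)
codes = [quantize(s, 8) for s in too_loud]
```

In this sketch every sample whose amplitude exceeds the nominal range ends up at the same extreme code, which is the flat-topped waveform that clipping produces.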
Figure 9 - Clipping of a signal [9]

[7] Source: "MSP Tutorials and Topics", page 22, by Cycling74 - http://www.synthesisters.com/download/MSP45TutorialsAndTopics.pdf
[8] Source: "MSP Tutorials and Topics", page 23, by Cycling74 - http://www.synthesisters.com/download/MSP45TutorialsAndTopics.pdf
[9] Source: "MSP Tutorials and Topics", page 27, by Cycling74 - http://www.synthesisters.com/download/MSP45TutorialsAndTopics.pdf

After the quantization step, the data is stored on the computer. It takes about 10 MB of memory space to store one minute of audio data in compact disc quality (44,100 Hz, 16 bit, stereo).

When the audio needs to be played back again, it first goes through a Digital-to-Analogue Converter (DAC), which transforms the strings of stored binary digits into a continuous stream of voltage. The signal then goes through a low-pass filter, to filter out any potential high-frequency noise created by the sample and hold process, before it is amplified and sent through a speaker.

Filters and Effects [10]

A filter is used to change the characteristics of a sound, by shaping the spectrum of the signal. It does not change the frequency of a signal, only the amplitude and phase (placement in time). To describe the characteristics of a filter, you apply a sine wave to the input and measure how the output differs from it. The result, measured across frequencies, is called the frequency response.

The frequency response consists of the amplitude response and the phase response. Both vary with frequency. The amplitude response is the ratio of the amplitude of the output sine wave to the amplitude of the input sine wave. You can normally tell which filter is being used by the shape of its amplitude response. The phase response describes the phase change of the output compared to the input.
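The measurement procedure described above (apply a sine wave, compare output with input) can be tried out with a small Python sketch. The filter used here is a simple one-pole low-pass filter that we picked purely for illustration; it is an assumption of this example, not a filter from the project, and the coefficient `a` and the test frequencies are hypothetical values.

```python
import math

def one_pole_lowpass(signal, a):
    """Simple low-pass filter: y[n] = (1 - a) * x[n] + a * y[n-1], with 0 <= a < 1."""
    out, prev = [], 0.0
    for x in signal:
        prev = (1.0 - a) * x + a * prev
        out.append(prev)
    return out

def amplitude_response(freq_hz, sample_rate_hz=44100, a=0.9, n_samples=8820):
    """Feed a unit-amplitude sine wave through the filter and measure the output amplitude.

    Because the input amplitude is 1.0, the measured output peak is
    directly the output/input amplitude ratio at this frequency.
    """
    sine = [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]
    filtered = one_pole_lowpass(sine, a)
    steady = filtered[n_samples // 2:]   # skip the start-up transient
    return max(abs(v) for v in steady)

# Amplitude response at a low, a middle and a high test frequency.
responses = {f: amplitude_response(f) for f in (100, 1000, 10000)}
```

Plotting such ratios against frequency would give the amplitude response curve of the filter; for this low-pass example the ratio shrinks as the frequency rises.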
Low-pass filter

A low-pass filter only allows frequencies below its cut-off frequency point (fc) to pass. Frequencies above this point are removed. However, the filter cannot cut off the frequencies abruptly, and therefore there will always be a smooth transition between the frequencies that are kept (passband) and the frequencies that are thrown away (stopband). Because of this transition it can be difficult to specify where the cut-off frequency is, but normally it is defined as the point where the signal has dropped by 3 dB compared to its maximum level (see Figure 10 below).

[10] Source for this chapter: "Computer Music", pages 171-174.

Figure 10 - Amplitude response of a low-pass and a high-pass filter

High-pass filter

A high-pass filter does the opposite of a low-pass filter. It discards the low frequencies and keeps the high frequencies (see Figure 10).

Band-pass filter

A band-pass filter is a combination of a low-pass and a high-pass filter. It discards both low and high frequencies, with a passband in between. It has a centre frequency (f0), which is the centre of the passband, and a bandwidth (BW). The bandwidth is defined by a lower cut-off frequency (fl) and an upper cut-off frequency (fu) (see Figure 11 below). The response of a band-pass filter can often be described as either sharp or broad, depending on the bandwidth.

Figure 11 - Amplitude response of a band-pass and a band-reject filter

Band-reject filter

The band-reject filter has the opposite amplitude response of a band-pass filter. It rejects a band of frequencies and passes the rest (see Figure 11, above). It is defined by a centre frequency and a bandwidth, like the band-pass filter.
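To make the 3 dB cut-off definition and the bandwidth more concrete, the following Python sketch converts between decibels and amplitude ratios and computes a bandwidth from two cut-off frequencies. It assumes the usual amplitude convention of 20 times the base-10 logarithm of the ratio, and it takes the bandwidth to be the difference between the upper and lower cut-off frequencies; the function names are our own.

```python
import math

def db_to_amplitude_ratio(db):
    """Convert a level change in dB to an output/input amplitude ratio."""
    return 10 ** (db / 20)

def amplitude_ratio_to_db(ratio):
    """Convert an output/input amplitude ratio to a level change in dB."""
    return 20 * math.log10(ratio)

def bandwidth_hz(lower_cutoff_hz, upper_cutoff_hz):
    """Bandwidth (BW) of a band-pass or band-reject filter: fu - fl."""
    return upper_cutoff_hz - lower_cutoff_hz

# At the cut-off frequency the amplitude has dropped by 3 dB,
# i.e. to roughly 71% of its passband value.
ratio_at_cutoff = db_to_amplitude_ratio(-3)
```

Under this convention the cut-off point corresponds to an amplitude ratio of about 1/sqrt(2), which is the same as half the signal power.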
Delay

A delay is created by changing the phase of the signal as it passes through the filter (phase response). It is a simple, but very useful effect. The most basic type of delay is shown in Figure 12. The input signal is played immediately, and after a certain delay time (t), the delayed signal is played. The delayed signal is multiplied by an amplitude factor (g) (amplitude response), which would normally be below 1, so that the delayed signal is quieter than the original.

Figure 12 - Diagram of a basic delay

If the delay time is short (approx. 40-120 ms) it is called a slapback delay, otherwise it is called an echo [11].

Comb filter

Comb filters are used together with all-pass filters (see below) to create reverberation [12] (see the "Reverberation" section on page 18). The comb filter works by sending the signal through a delay. The delayed signal is fed back to the input, after being multiplied by an amplitude factor (g). The amount of time the signal takes to go through this loop is determined by the loop time (t) (see Figure 13). The amplitude factor (g) must be between 0 and 1, not including 1, so that the signal gets weaker with each loop. The closer g gets to 1, the more extreme the reverberation will sound [13].

[11] Source: http://www.harmony-central.com/Effects/Articles/Delay/
[12] Source: http://www.harmony-central.com/Effects/Articles/Reverb/
[13] Source: "Computer Music", page 296

Figure 13 - Comb filter

All-pass filter

All-pass filters are used either alone or together with comb filters to create reverberation (see the "Reverberation" section on page 18). The all-pass filter is similar to a comb filter, but a bit more advanced. Again we have a loop time (t) and an amplitude factor (g), which must be less than 1 (see Figure 14).

Figure 14 - All-pass filter

In an all-pass filter there is, unlike the comb filter, no delay between the input and output.
The first output, or impulse response, will therefore be:

$1 \cdot (-g) = -g$

The next impulse response will be:

$(1 \cdot g)(-g) + 1 = 1 - g^2$

And the impulse response after that will be:

$(g \cdot g)(-g) + g = g(1 - g^2)$

... and so on [14] (see Figure 15).

[14] Source: "Computer Music", page 297

Figure 15 - The impulse response of an all-pass filter

Reverberation

Reverberation (reverb) is the result of sound reflected off surfaces. It resembles multiple echoes, but is not quite the same. Imagine that you stand in a room. At the other end of the room someone hits a drum. The first sound you hear is the sound wave that has travelled directly from the drum to you. But a few milliseconds later the same sound will reach you again, and this time the sound wave has not travelled directly, but has been reflected off the walls, floor and/or ceiling (see Figure 16). This will continue until the sound "dies".

Every time the sound reflects off a surface it loses some of its energy (amplitude). The amount of energy it loses depends on the surface. E.g. hard, solid surfaces, such as marble, absorb very little energy, whereas soft materials, such as curtains, absorb the energy very well. The water vapour in the air also contributes to the loss of energy. Other factors that influence the amount of reverb are the size and shape of the environment [15].

[15] Source: "Computer Music", page 289.

Figure 16 - Sound is reflected off the different surfaces

Humans do not perceive all the reflected sounds as independent sounds, because they reach the listener within a few milliseconds of each other. But we can hear the effect of the reverb. If you could distinguish between the sounds, it would be like echoes [16].

When creating a digital reverb, a series of delays is not enough.
There are more factors than just the delays that influence the effect. These are: early and late reflections, and reverberation time [17].

The early reflections are the first reflected sound waves that reach the listener. Their amplitudes are almost as high as that of the sound wave travelling directly to the listener. There is quite a big gap between the arrivals of these early reflections (see Figure 17).

The late reflections reach the listener after the early reflections. They arrive much closer to each other and with a more random interval between them. The amplitude of the reflections gets lower as time passes, since these reflections will have travelled farther and have reflected off surfaces more times than the previous ones. However, if you look at a small section of the spectrum, this will vary a little (see Figure 17).

The reverberation time is the time it takes for the sound to die away to 1/1000th of its original amplitude [18]. This depends on the size, shape, and surfaces of the environment, as explained earlier.

[16] Source: http://www.harmony-central.com/Effects/Articles/Reverb/
[17] Source: http://www.harmony-central.com/Effects/Articles/Reverb/
[18] Source: "Computer Music", page 290

Figure 17 shows a graphical representation of a reverb. Each of the lines represents the same piece of the sound when it hits the listener. The height shows the amplitude of the sound when it reaches the listener.

Figure 17 [19] - Showing the sound decay over time

Sound Synthesis

As we mentioned earlier, digital sound is basically strings of binary digits that hold the amplitude information of a sound. We have explained how these numbers can be created from real-life sounds through a microphone and analogue-to-digital conversion (see the "Digital Representation of Sound" chapter on page 12).
The basic concept of digital sound synthesis is to create these numbers directly on the computer, without actually having a real-life sound as the source. In this chapter we will explain some of the basic concepts behind sound synthesis.

Oscillators

The oscillator is fundamental to almost all computer synthesis units, and produces a periodic waveform. You can set the frequency and amplitude of the waveform. The output of the oscillator is a sequence of samples which forms a digital signal representing the waveform.

[19] Image taken from http://www.harmony-central.com/Effects/Articles/Reverb/

The spectrum of an oscillator that produces periodic waveforms with well-defined spectral components is called a discrete spectrum. The opposite is a distributed spectrum, which covers all frequencies. An oscillator that produces such a spectrum is called a noise oscillator. Noise is sound with an extremely rich spectrum. A signal covering all frequencies is called white noise, but other types of noise also exist. White noise is made by creating a random value at each sample. [20]

ADSR envelopes

Amplitude envelopes are used to create an amplitude sequence, instead of using a fixed amplitude. This is done to create a more natural sound, since no natural sound has constant amplitude. In connection with synthesis of musical sounds, the ADSR envelope is used to create more realistic envelopes. An ADSR envelope consists of 4 phases: Attack, Decay, Sustain, and Release (see "Figure 4 - Amplitude envelope" and the further description on page 10). Percussive instruments have a very short attack, whereas instruments such as the pipe organ or tuba have a longer attack [21].

Synthesis techniques

Over time several synthesis techniques have been invented. Each of them has its own qualities and can be used for different purposes.
In this chapter we look at two techniques: additive synthesis and subtractive synthesis. This is because synthetic drum sounds traditionally were made with these techniques. [22]

Additive synthesis

Additive synthesis is based on the idea that complex tones can be created by the summation, or addition, of simpler ones. Basically, additive synthesis starts from scratch and adds sinusoids together, until the desired sound is achieved.

[20] Source: "Computer Music", pages 75-78 + 95-98.
[21] Source: "Computer Music", page 84.
[22] Sources: http://www.soundonsound.com/sos/apr02/articles/synthsecrets0402.asp and http://www.soundonsound.com/sos/feb02/articles/synthsecrets0202.asp

"Given enough oscillators, any set of spectral components can be synthesized, and so virtually any sound can be generated." [23]

With additive synthesis you have good control over the sound, but in order to get a very complex sound you will need many sound generators, which demands a lot of data.

In its most basic form, additive synthesis is the addition of two sinusoids to create a more complex sound. A sinusoid can be defined mathematically this way:

$x(t) = A \sin(2\pi f t + \phi)$

where
$A$ is the amplitude (always positive, by convention),
$f$ is the frequency,
$\phi$ is the starting phase.

Example of adding two sinusoids with the frequencies of 500 Hz and 328 Hz:

$x(t) = \sin(2\pi \cdot 500\, t + \phi) + \sin(2\pi \cdot 328\, t + \phi)$

Subtractive synthesis

The opposite technique of additive synthesis is subtractive synthesis. Subtractive synthesis has all frequencies as the starting point. Such signals, covering all frequencies, are called white noise. From these signals, you filter out the unwanted frequencies, using filters like low-pass, high-pass, or band-pass filters (see the "Filters and Effects" chapter on page 14 for details).
With this technique it is easier to create more complex sounds, but it is rather difficult to filter out unwanted frequencies very precisely.

[23] Source: "Computer Music", page 88.

ANALYSIS AND DISCUSSION

On the basis of the theory chapter, we will now discuss the choices we make for the system we are going to create. The areas we will cover in this chapter are Percussive Sound Characteristics, Short Analysis of DrumSynth 2, Electronic Sensor Interfacing, Mapping, and Real-time.

Percussive Sound Characteristics

In this chapter we look at the characteristics of a percussive sound, in order to get a better understanding of the parameters we need to be aware of when detecting voice-to-drum simulated sounds and creating our own synthetic drum sounds.

"There are two key elements in percussive sounds, the amplitude envelope shape and the frequency content. The amplitude envelope usually has a sharp attack followed by a slow exponential decay. In the frequency content of the sound usually consists of non-integer harmonics or noise, with little or no pitch. There are also often many frequencies at the beginning of a sound fading into only a few frequencies at the end." [24]

Amplitude Envelope

As the quote above states, a key characteristic of a percussive sound is the rapid attack (see page 21 for more on amplitude envelopes and attacks). Figure 18 below shows the amplitude envelope of a typical snare drum (top) and of a typical bass drum (bottom), and supports the statement about a rapid attack. Because of this, one of the key elements we have chosen to use in our system, when it comes to detecting drum sounds, is the attack.
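The envelope shape described in the quote (a sharp attack followed by a slow exponential decay) can be illustrated with a short Python sketch. All numbers here are hypothetical: the 2 ms attack, the 0.15 s decay constant and the function names are assumptions made for the example, not measurements or code from the project.

```python
import math

def percussive_envelope(n_samples, sample_rate_hz=44100,
                        attack_s=0.002, decay_s=0.15):
    """Build an amplitude envelope with a fast linear attack and an exponential decay."""
    attack_n = max(1, int(attack_s * sample_rate_hz))
    envelope = []
    for n in range(n_samples):
        if n < attack_n:
            envelope.append(n / attack_n)             # rise from 0 towards 1
        else:
            t = (n - attack_n) / sample_rate_hz       # seconds since the peak
            envelope.append(math.exp(-t / decay_s))   # fall exponentially from 1
    return envelope

def attack_time_s(envelope, sample_rate_hz=44100):
    """Time from the start of the sound until the envelope reaches its peak."""
    peak_index = max(range(len(envelope)), key=lambda i: envelope[i])
    return peak_index / sample_rate_hz

# Half a second of a hypothetical drum-like envelope.
env = percussive_envelope(22050)
attack = attack_time_s(env)
```

Measuring how quickly an incoming sound reaches its peak, as `attack_time_s` does for this synthetic envelope, is one simple way to think about what a rapid attack means in practice.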
[24] Source: "Percussion Synthesis" by Stephen Dill - http://ccrma-www.stanford.edu/~sdill/220A-project/drums.html

Figure 18 - Snare drum (top) and bass drum (bottom), amplitude over time

Voice-to-drum Frequency Analysis

As we found out in "The Basics of Sound" chapter on page 9, a good way to distinguish between sounds is by frequency. Furthermore, the quote on page 23 also states that frequency is a key element of a drum sound. Because of this, we have chosen to look further into using frequency as another key element in our system, when it comes to detecting and distinguishing drum sounds.

The purpose of this chapter is to find the frequency bands where voice-to-drum simulated bass drum sounds and voice-to-drum simulated snare drum sounds, respectively, typically have their main amplitude peaks. We want to find out whether we can use these frequency bands to distinguish between bass drum and snare drum sounds when we build our system.

To find the frequency bands, we use statistics. We record a set of 18 bass drum sounds and 18 snare drum sounds, performed by three different people with two different microphones – a cheap Labtec microphone and a more expensive Shure microphone – on a low latency FireWire soundcard (see "Appendix 2 - Sound Hardware Specifications" on page 89 for technical details on the hardware). We look at the frequency spectrum of each sound and find the frequency value of its main amplitude peak. We then calculate the mean frequency value and the standard deviation [25] based on these peaks, in order to find the interesting frequency bands in a systematic way.

Bass Drum Frequency Calculation

Figure 19 shows the frequency spectrum of a typical voice-to-drum simulated bass drum sound.
It has a lot of activity in the lower end of the frequency scale, at around 100-400 Hz, and not much activity in the higher end of the scale.

Figure 19 - Frequency Spectrum for a voice-to-drum simulated bass drum sound

"Appendix 3 - Bass and Snare drum peak frequencies of recorded voice-to-drum samples" on page 90 contains a table of the frequency values for the amplitude peak of each of the 18 bass drum sounds we recorded. We use these values to find the mean and the standard deviation, in order to get a more precise frequency band.

Mean: $F_{mean} = \frac{\text{sum of sampled frequencies}}{n} = 238$

Where $n$ is the number of sampled frequencies.

[25] Calculations are based on Thomas Moeslund's slides about statistics, http://www.cvmt.dk/education/teaching/e04/MED3/AP/ap17+18.ppt - pages 9-12

Variance: $\left[(F_{bd1} - F_{mean})^2 + (F_{bd2} - F_{mean})^2 + \dots + (F_{n} - F_{mean})^2\right] / n = 5649.2222$

Where $F_{n}$ / $F_{bdx}$ is the frequency peak value of the corresponding bass drum sound, $F_{mean}$ is the mean frequency value of all the sounds and $n$ is the number of sampled frequency peaks.

Standard Deviation: $\sigma = \sqrt{\text{Variance}} = 75.1613$

To cover 95.44% of the sampled bass drum frequency peak values (assuming they are approximately normally distributed), we multiply the Standard Deviation by 2:

$2\sigma = 150.3226$

Snare Drum Frequency Calculation

Figure 20 shows the frequency spectrum of a typical voice-to-drum simulated snare drum sound. It has its main activity in the higher end of the frequency scale, at around 2000-4000 Hz, but also significant activity in the lower end of the scale, at around 100-400 Hz.
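The statistics used for both drum types can be sketched in a few lines of Python. This is only an illustration, assuming the population form of the variance (division by n) given in the formulas above; the function name `peak_band` and the example values are hypothetical and are not the Appendix 3 measurements.

```python
import math


def peak_band(peaks_hz):
    """Return (mean, variance, sigma, two_sigma) for a list of peak frequencies in Hz."""
    n = len(peaks_hz)
    mean = sum(peaks_hz) / n
    # Population variance: divide by n, as in the formula in the text.
    variance = sum((f - mean) ** 2 for f in peaks_hz) / n
    sigma = math.sqrt(variance)
    return mean, variance, sigma, 2 * sigma


# Hypothetical peak values for illustration only (NOT the recorded samples).
example_peaks = [150.0, 200.0, 250.0, 300.0, 350.0]
mean, variance, sigma, two_sigma = peak_band(example_peaks)
print(f"mean={mean:.1f} Hz, variance={variance:.1f}, sigma={sigma:.2f}, 2*sigma={two_sigma:.2f}")
```

Applied to the 18 recorded peaks of one drum type, the same steps should lead to the mean, variance and standard deviation reported in this chapter.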
Figure 20 - Frequency Spectrum for a voice-to-drum simulated snare drum sound

Since we have already covered the lower end of the frequency scale and calculated a frequency band based on the sampled values when we did the bass drum calculation, we concentrate the snare drum calculation on the higher end of the frequency scale.

"Appendix 3 - Bass and Snare drum peak frequencies of recorded voice-to-drum samples" on page 90 contains the frequency values for the amplitude peak of each of the 18 recorded snare drum sounds. These peaks are not necessarily the highest ones overall, but they are the highest ones in the higher end of the frequency scale. That is, we disregard any peaks that occur in the lower frequency range, at around 100-400 Hz, since this range has already been covered.

We use the sampled values to find the mean and the standard deviation, in order to get a more precise frequency band:

Mean: $F_{mean} = \frac{\text{sum of sampled frequencies}}{n} = 2847$

Where $n$ is the number of sampled frequencies.

Variance: $\left[(F_{sd1} - F_{mean})^2 + (F_{sd2} - F_{mean})^2 + \dots + (F_{n} - F_{mean})^2\right] / n = 174916.6111$

Where $F_{n}$ / $F_{sdx}$ is the frequency peak value of the corresponding snare drum sound, $F_{mean}$ is the mean frequency value of all the sounds and $n$ is the number of sampled frequency peaks.

Standard Deviation: $\sigma = \sqrt{\text{Variance}} = 418.2303$

To cover 95.44% of the sampled snare drum frequency peak values (assuming they are approximately normally distributed), we multiply the Standard Deviation by 2:

$2\sigma = 836.4607$

Conclusion of Percussive Sound Characteristics

In the first part of the chapter we concluded that an important element of a drum sound is a rapid attack.
This is something we need to keep in mind both when it comes to detecting voice-to-drum simulated sounds, and when it comes to creating our own synthetic drum sounds.

Furthermore, we found out that frequency is another important part of a drum sound, so we analyzed some recorded voice-to-drum simulated sounds. The results show that frequency is a parameter we can use to distinguish between a voice-to-drum simulated bass drum sound and a voice-to-drum simulated snare drum sound. We have found the following interesting frequency bands:

For the bass drum, we get a frequency band with a centre frequency equal to the mean value, 238 Hz, and a bandwidth of 2σ ≈ 150 Hz.

For the snare drum, we get a frequency band with a centre frequency equal to the mean value, 2847 Hz, and a bandwidth of 2σ ≈ 836 Hz.

We will use these results as a basis for creating our system (see the chapter "Design & Implementation" on page 37).

Short Analysis of DrumSynth 2

In this chapter we take a look at how the program DrumSynth 2 creates synthetic drum sounds, in order to get some inspiration as to how we can create our own sounds from scratch.

As mentioned in the introduction, we have chosen to synthesize drum sounds that resemble the old analogue vintage drum machines. Such drum machines are still essential in modern electronic music, and we are all fond of the sound from such machines.

We have looked at different software drum synthesizers to get inspired, and we were amazed by the sonic possibilities that DrumSynth 2 [26] could offer with only a few parameters.

[26] DrumSynth 2 is available from: http://www.hitsquad.com/smm/programs/DrumSynth/

DrumSynth 2 is a simple program that builds drum sounds by combining elements like swept-frequency sine waves, noise bands, noise, distortion, and overtones (see Figure 21).
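To illustrate how such elements can combine, here is a minimal Python sketch of one drum-like sound built from a swept-frequency sine wave, a little white noise and an exponentially decaying amplitude envelope. It is not DrumSynth 2's algorithm and not our Max/MSP patch; the function name, the sample rate and every parameter value are assumptions chosen for illustration.

```python
import math
import random

SAMPLE_RATE = 44100  # assumed sample rate in Hz


def swept_sine_drum(duration_s=0.3, f_start=200.0, f_end=50.0,
                    decay=12.0, noise_level=0.1, seed=0):
    """Return a list of samples in [-1, 1] for a simple synthetic drum hit."""
    rng = random.Random(seed)
    n = round(duration_s * SAMPLE_RATE)
    samples = []
    phase = 0.0
    for i in range(n):
        t = i / SAMPLE_RATE
        # Linear frequency sweep from f_start down to f_end.
        freq = f_start + (f_end - f_start) * (i / n)
        # Sharp attack followed by an exponential decay.
        envelope = math.exp(-decay * t)
        tone = math.sin(phase)
        noise = rng.uniform(-1.0, 1.0)
        mixed = (1.0 - noise_level) * tone + noise_level * noise
        samples.append(envelope * mixed)
        # Accumulate phase so the sweep stays continuous.
        phase += 2.0 * math.pi * freq / SAMPLE_RATE
    return samples


hit = swept_sine_drum()
print(len(hit), "samples generated")
```

Lowering `f_start` and `f_end` gives a more bass-drum-like hit, while raising `noise_level` moves the sound towards a snare drum; both are rough tendencies of this sketch rather than measured results.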
Figure 21 - DrumSynth 2

The different elements have a few controllable parameters, including amplitude envelope and e.g. frequency range. The elements are then combined in a mixer that controls the overall volume, and the signal is then sent to the sound card.

Even though the controls of the different elements are limited, it is possible to create rather complex sounds by combining the different elements. This is something we intend to use when we create our own synthetic drum sounds.

Electronic Sensor Interfacing [27]

By using sensor technology, we would like to provide an additional interface for user interaction, in addition to the standard input devices found on a PC (keyboard and mouse). Originally, this additional user interface was supposed to consist of a floor board, where the user could select different application commands by pressing buttons on the floor board with the foot, and a foot pedal, where the user could input a continuous range of values into the application, again by using foot pressure. Finally, the microphone as a sensor device for auditory input would be included as a part of our user interface extension.

Since a foot pedal can be understood simply as a mechanical construction that converts rocking foot motion into rotary motion of a potentiometer [28], and since we did not have the resources to build a foot pedal, we estimated that it would be enough to simply use a potentiometer instead. In addition, since a potentiometer can be directly connected to a power supply in order to produce a voltage divider, no additional circuitry but the potentiometer itself is needed to provide a voltage input to the system, and to simulate a foot pedal input.

A floor board, on the other hand, would electrically consist of push buttons.
A single push button can be interfaced directly to a device called a microprocessor (Teleo Intro Module), which converts voltage into digital data (see the chapter "The Tactile Sensor" on page 45 for more information on this device). However, each button would then need one of the four available inputs on the microprocessor. Since we want to use six buttons in total, and we only have three available inputs left (the potentiometer uses one of the inputs), we need to build a voltage divider switching network. This enables us to use six push buttons with only one input on the microprocessor. In this report, the electronic aspect is focused on this circuit, since it is the only one that we had to build ourselves.

Since a microphone can be taken as a standardized device used in connection with PCs, the electronic aspect of interfacing a microphone will not be debated further here [29].

[27] This chapter is based on "Appendix 1 - Sensor Theory", see page 84
[28] Source: http://www.geofex.com/Article_Folders/wahrocker/wahrocker.pdf
[29] See the "Digital Representation of Sound" chapter on page 12 for a bit more information about microphones

Mapping

To successfully build a system that is able to "convert" the human voice into drums, many factors have to be taken into account: how do we detect the sound, and which sound should trigger which synthesis? In our voice-to-synthesis mapping we have a one-to-one mapping: a voice input with a short attack and frequencies in the lower areas will trigger the bass drum, and a voice input with a short attack and high frequencies will trigger the snare drum.

When mapping both the sensor and the microphone to the instrument we have built, it is important that some sort of consistency and logic is maintained.
An example could be: when the user changes the instrument by pushing the left instrument button (a button on a controller), the menu displays this action too, and the GUI [30] buttons are placed on screen in a layout similar to that of the tactile sensor.

To get a better idea of what the mapping should look like, we have looked at some examples made by other researchers. One of these is described by Andy Hunt and Marcelo M. Wanderley, who give an example of what the mapping in a MIDI wind controller may look like, and how it may be altered to give a different experience both when playing it and when listening to it. The mapping is illustrated in Figure 22, below.

[30] Graphical User Interface

Figure 22 [31] - An example of altered mapping for a MIDI wind controller.

By using different mappings between the actual instrument and the sound synthesis engine, as in the wind controller example, it is possible to alter the playing experience while changing only the mapping.

Another example of mapping has been carried out by Andy Hunt and Marcelo M. Wanderley, but in this case the instrument interface was not a simulation of a real instrument; only a computer with a mouse and a controller with sliders was used [32]. The test was set up in different configurations, again with the mapping altered. In Figure 23 a one-to-one configuration was used, where each slider on the screen manipulated a different parameter in the sound synthesis. Only one parameter could be controlled at a time, using the mouse.

[31] Image taken from: "Mapping performer parameters to synthesis engines" by Andy Hunt and Marcelo M. Wanderley.
[32] Source: "The importance of parameter mapping in electronic instrument design" by Hunt, Wanderley and Paradis.
Figure 23 - One-to-one

The second configuration also had 4 sliders on the screen, but instead of being controlled with the mouse, they were controlled by a control board with sliders. The main difference in this configuration was that the user had to keep moving one of the sliders to produce a sound, much like using a bow on a violin.

Figure 24 - One-to-one with sliders

The last configuration combined the slider interface and the mouse interface into one. The mouse had to be moved in order to produce sound, while sound parameters could be changed by the sliders and also by the position of the mouse.

Figure 25 - The multiparametric interface

Andy Hunt and Marcelo M. Wanderley write that the last configuration seemed frustrating at first, but most people grew fond of it after a while, because the interface, using only two sliders and the mouse to control several parameters, seemed more like an actual instrument.

Based on this we can say that the way we build our own mapping is essential to the user experience, and to how much users will like the system. We have to be aware that a slight difference in mapping can change the way the user interacts with the system.

Real-time

In order to make an instrument like ours useful in a musical context, the system needs to be fast enough that the user does not perceive the latency. In a system like ours there will always be latency because of e.g. the data throughput of the computer, but the latency can be reduced through efficient coding and better hardware, e.g. a sound card with a low-latency driver. The distance from the speakers to the user also creates latency.
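The acoustic part of the latency is simple arithmetic and can be sketched as below. This is only an illustration, assuming a speed of sound in air of roughly 345 m/s; the function names and the `acceptable_total_ms` parameter are hypothetical and do not come from our system.

```python
SPEED_OF_SOUND_M_PER_S = 345.0  # assumed approximate speed of sound in air


def acoustic_delay_ms(distance_m, speed=SPEED_OF_SOUND_M_PER_S):
    """Time in milliseconds for sound to travel distance_m metres."""
    return distance_m / speed * 1000.0


def remaining_budget_ms(distance_m, acceptable_total_ms):
    """Latency left for the computer once the acoustic delay is subtracted."""
    return acceptable_total_ms - acoustic_delay_ms(distance_m)


for distance in (1.0, 3.0, 4.0):
    print(f"{distance:.0f} m -> {acoustic_delay_ms(distance):.1f} ms")
```

The further the speakers are from the user, the smaller the share of any fixed latency budget that remains for input analysis and synthesis.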
In air, sound travels at a speed of approximately 345 m/s [33]. This means that for each meter between the speakers and the user, there will be an extra latency of:

1 m * (1000 ms / 345 m) ≈ 2.9 ms

In relation to this it is important to find out how much latency the human ear accepts without sensing it as a disturbance in a musical context.

[33] Source: "Computer Music", page 289

Nelson Posse Lago has researched how much latency is acceptable between a user action and the corresponding reaction in music applications. He concludes:

"… up to 20-30ms, are pretty much acceptable for most multimedia and music applications" [34]

Speakers positioned 3-4 m away add approximately 10 ms of extra latency. To stay within the 30 ms upper bound, our system should therefore not have a latency of more than 20 ms from the input (the microphone) to the output (the speakers), if we want it to be perceived as real-time.

[34] Source: "Distributed Real-Time Audio Processing" by Nelson Posse Lago, page 7 - http://gsd.ime.usp.br/~lago/masters/extended_abstract.pdf

DESIGN & IMPLEMENTATION

In this chapter we describe the design and implementation of the actual system we have created. We start out with a general description and an overview of the system, and then we go into a more detailed description of each element.

Short Description of the System

We have named our system "Funkmeister7" (see Figure 26). It is basically a digital, percussive instrument that you can play using your voice.

Figure 26 - Funkmeister7 Screenshot

The system takes voice-to-drum simulated sounds as input through a microphone, analyzes the sound, and determines whether it is a snare drum or a bass drum (or neither).
When it detects one of these drum types, the system plays back a synthesized version of the same drum type.

In addition to this, there is a tactile sensor/controller with six buttons and a turning knob (see Figure 28 on page 39) that allows the user to control certain parameters and effects of the synthesized output sound through a menu. The menu is divided into two levels: a main menu, which allows the user to switch between the available types of instruments, and a sub-menu, where the options change based on which instrument the user has chosen in the main menu. In the sub-menu the parameters of the chosen instrument can be altered.

System overview

Figure 27 below shows the overall system structure.

Figure 27 - System Structure

The hardware part of our system consists of a microphone, two interaction input devices (a board interface and a potentiometer), and a Teleo Intro Module [35] as a microprocessor that acquires and converts the data from the input devices (see Figure 28 below). The board interface, the potentiometer and the microprocessor make up what we call "The Tactile Sensor".

Figure 28 - The Tactile Sensor

The microprocessor is connected via USB to a PC with a soundcard, speakers and a monitor. The microphone is connected to this PC via the soundcard.

The software part of our system is implemented in Max/MSP [36]. It is based on a hierarchic construction consisting of 12 patches (see Figure 29).
[35] Teleo Intro Module from www.makingthings.com
[36] Max/MSP is available from: http://www.cycling74.com/products/maxmsp.html

Figure 29: Overview of the patches and sub-patches in the software part of our system

At the top of the hierarchy is the patch called "FUNKMEISTER7". This patch is the graphical presentation (GUI) of our system, and from here you control the different effects, volume, etc.

The "FUNKMEISTER7" patch has five sub-patches, which are: "BOARD", "LOOP", "SYNTH", "MENU", and "MIC_INPUT".

The "BOARD" patch is where we receive the input from the tactile sensor that we have created. It has a sub-patch called "RESET", which is used to reset the entire system.

The "LOOP" patch is a sample player with four different musical loops that the user can play while using the instrument.

The "SYNTH" patch is the drum synthesizer; it is in this patch that we connect the drum sounds with the different effects. "SYNTH" has four sub-patches: "BD_SYNTH", "SD_SYNTH", "DELAY", and "REVERB". "BD_SYNTH" and "SD_SYNTH" are where the bass drum and the snare drum, respectively, are synthesized. The "DELAY" patch is where the delay is created. The "REVERB" patch is where you control the reverb. It has a sub-patch called "reverb", and it is in this patch that the actual reverb is made (see "Sound Synthesis" on page 58 for details).

In the "MENU" patch we connect the input from the tactile sensor to the rest of the system.

The "MIC_INPUT" patch is where the analysis of the sound input from the microphone takes place. It detects whether the input is a voice-to-drum simulated bass drum sound, a voice-to-drum simulated snare drum sound, or neither (see "Sound Input Analysis" on page 51 for details).
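The hierarchy described above can be written down as a small tree, which also makes it easy to check the patch count. The Python structure below is only an illustrative way of listing the patches; Max/MSP patches are graphical, and the variable and function names here are hypothetical.

```python
# Patch names as described in the text; each key maps to its sub-patches.
PATCH_TREE = {
    "FUNKMEISTER7": {
        "BOARD": {"RESET": {}},
        "LOOP": {},
        "SYNTH": {
            "BD_SYNTH": {},
            "SD_SYNTH": {},
            "DELAY": {},
            "REVERB": {"reverb": {}},
        },
        "MENU": {},
        "MIC_INPUT": {},
    }
}


def count_patches(tree):
    """Count every patch in the tree, including nested sub-patches."""
    return sum(1 + count_patches(children) for children in tree.values())


def print_tree(tree, indent=0):
    """Print the hierarchy with two spaces of indentation per level."""
    for name, children in tree.items():
        print(" " * indent + name)
        print_tree(children, indent + 2)


print_tree(PATCH_TREE)
print("Total patches:", count_patches(PATCH_TREE))
```

Counting the entries this way gives 12, which matches the number of patches stated for the system.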
We will now go through how we have mapped the input of our system to the output, and then describe the design and implementation of the three parts of the system: Tactile Sensor, Sound Input Analysis, and Sound Synthesis.

Mapping

In our mapping there are two steps where data is mapped to match either the graphical options or the sound creation parameters.

Figure 30 - The steps in mapping sensor input to the sound output and the dynamic user interface

The first step in the mapping takes place after the conversion of physical data. For example, the pressure of a hand on a button, or sound waves in the air, is converted into a normalized data stream that can be read by the computer and the application. In this step the input data is mapped to numbers between 0 and 100 (see "Design & Implementation" on page 41). Both the sound synthesis parameters and the GUI need these values in order to create the sound and to display what is currently going on.

The second step in the mapping is where the parameters are mapped to actual parts of the synthesis and to objects in the GUI.

Mapping the microphone

In our application one of the main interfaces is the microphone, which we use to map the voice of the user to specific instruments. This mapping can be illustrated as in Figure 31.

Figure 31 - The mapping of frequency and amplitude from the microphone.

Figure 31 shows a one-to-one mapping between frequency and which instrument to play, and also a one-to-one mapping between amplitude and detection. In practice these mappings can be combined as in the next illustration (Figure 32), since frequency and amplitude together, and not individually, determine which instrument is played. Several parameters then control which instrument is played, giving the system a many-to-one relation.
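The many-to-one relation can be sketched as a single decision function that takes both the attack detection result and the peak frequency. This is a hedged illustration, not the logic of our Max/MSP patch: the function name is hypothetical, and the band edges assume that each band is read as the mean plus or minus 2σ from the frequency analysis (238 Hz and 150 Hz for the bass drum, 2847 Hz and 836 Hz for the snare drum).

```python
# Assumed band edges: mean +/- 2*sigma from the voice-to-drum frequency analysis.
BD_BAND_HZ = (238.0 - 150.0, 238.0 + 150.0)
SD_BAND_HZ = (2847.0 - 836.0, 2847.0 + 836.0)


def choose_instrument(attack_detected, peak_hz):
    """Return "BD", "SD" or None; both inputs together decide the result."""
    if not attack_detected:
        return None  # no rapid attack, so nothing is triggered
    if BD_BAND_HZ[0] <= peak_hz <= BD_BAND_HZ[1]:
        return "BD"
    if SD_BAND_HZ[0] <= peak_hz <= SD_BAND_HZ[1]:
        return "SD"
    return None  # attack detected, but the peak is in neither band


print(choose_instrument(True, 240.0))
print(choose_instrument(True, 2900.0))
print(choose_instrument(False, 240.0))
```

Neither input is sufficient on its own: a peak inside a band without an attack triggers nothing, and an attack with a peak outside both bands triggers nothing either.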
Figure 32 - The combination of frequency and amplitude

Mapping the Tactile Sensor

The tactile sensor is our second sensor. One part of it is the control board, which is used to manipulate several parameters of the synthesis engine. In this sensor the mapping does not determine what sound to play, but rather what the sound should sound like. The mapping in this part of the system is more "function"-based, where different buttons are mapped to specific parameters in the sound engine.

To give a better picture of how the different buttons and functions are mapped together, the illustration below (Figure 33) has been made.

Figure 33 - The mapping of the buttons to the parameters in the patch.

The illustration above shows how everything is linked together to produce the augmented soundscape. The two buttons for changing instruments have a one-to-many relationship, since several instruments are controlled by these two buttons. The change-effect buttons have the same relationship, because many parameters and effects can be changed using only these buttons.

The reset button has a one-to-one relationship in the system: one button for one control (resetting).

The 'apply' button has a one-to-many relationship, because it is used both to apply the values of the different synthesis parameters and effects, and to start and stop the loop player, which can run in the background.

In the mapping illustration (Figure 33) there are no arrows going out from the 'apply' button and the 'reset' button. This is on purpose, since both buttons are connected to almost everything and the arrows would clutter the illustration. 'Apply' is also connected to the potentiometer, which would clutter the illustration further.
A part of the tactile sensor is the potentiometer. Although this sensor is connected to the microprocessor together with the voltage divider switching network, it is still considered a standalone sensor. The potentiometer also works in a different way, since it supplies a continuous range of numbers instead of the 7 fixed values from the control board. The mapping for the potentiometer is illustrated in Figure 34.

Figure 34 - Mapping of the potentiometer

The mapping of the potentiometer is a bit different, since the potentiometer is mapped to many values, depending on the range of the resistance in the potentiometer. The potentiometer produces a continuous range of values, but after the processing by the Teleo Intro module, 1024 different numbers is the maximum range.

The potentiometer has a one-to-many relationship, since it controls the level of volume, pitch, distortion, delay and reverberation. It uses a continuous range of values to control this, but with the resolution set to 5 bits we have 32 different values (0 to 31), which are then scaled to 0-100 to give a more intuitive control of e.g. volume.

The Tactile Sensor [37]

We are building two sensors: a control board with six buttons measuring pressure at a very low level (voltage divider switching network), and a turning knob measuring motion (potentiometer). Together with the microprocessor, both sensors convert pressure and motion into different voltages.

When the user of the system presses a button, it is determined whether the button is on or off, and different voltages are produced. The different voltages are produced by changing the amount of resistance in the circuit depending on which button is pressed.
We also use a potentiometer that supplies the system with a continuous range of values, which can be used to adjust parameters inside the application that require finer steps.

The tactile sensor chapter consists of two main parts: transducing pressure and motion into voltage, and formatting the voltage into a digital stream. The voltage divider and the potentiometer can be thought of as transducers and the microprocessor as a formatter.

Figure 35 - Overview of the tactile sensor: Control Board (voltage divider switching network), Turning Knob (potentiometer), Microprocessor (Teleo Intro Module)

[37] This section is mainly based on the slides provided by our teacher and supervisor Smilen Dimitrov, if nothing else is noted. Link: http://www.smilen.net/SensorTechMED4/

Transducing Motion and Pressure

The potentiometer transduces motion, but we will not cover this any further than already described in "Appendix 1 - Sensor Theory" on page 84. Instead we will focus on the control board, which works by the simple principle of dividing voltage.

The Teleo module supplies a 5 V output as a power supply, and it is this voltage that is being divided. When working with small voltages it is essential that the system is able to tell the voltages apart; if pressing a button resulted in a difference of only 0.003 V, it might be hard to detect the change. To accommodate this, the voltages have been spread out as much as possible over the 0 V to 5 V range, by using 4 kOhm resistors between the different steps and a 10 kOhm resistor as the default resistor.

When the user presses a button, the voltage divider circuit gets cut off at different places, thereby giving different measurements, as shown in the illustrations below. The circuit contains 7 resistors, and when no button is pressed the current runs through all of the resistors.
When a button is pressed, a certain amount of resistance is taken out of the circuit, resulting in a higher measured output voltage. If no buttons in the circuit are pressed, there will be a total resistance of 34 kOhm, producing a 1.466 V output. For each position "lower" in the circuit at which a button is pressed, the resistance is decreased by 4 kOhm. If the button at the "bottom" (button 1) is pressed, only the default resistor (10 kOhm) will be left in the circuit, providing a 4.992 V output (see Figure 36, Figure 37 and Figure 38).

Figure 36 - no button is pressed
Figure 37 - button 2 pressed
Figure 38 - button 3 is pressed

A listing of all the values from the different buttons pressed is shown in Figure 39.

The relative distribution schema

State        kΩ / kΩ    Spread    Volts
Standby      10 / 34    0.294     1.466 V
1. button    10 / 10    1.000     4.992 V
2. button    10 / 14    0.714     3.569 V
3. button    10 / 18    0.556     2.786 V
4. button    10 / 22    0.455     2.266 V
5. button    10 / 26    0.385     1.917 V
6. button    10 / 30    0.333     1.663 V

Figure 39: Voltage values from the voltage divider

Formatting the Transduced Data

Teleo is a tool developed for building prototypes of systems very fast (see Figure 40). Without needing to understand low-level programming or to be an expert in electronics, it is possible to build systems that really work, and to make them work in real-time.

Figure 40 - Illustration of the Teleo module

As a formatter in our interface we have used the Teleo Intro module as an easy-to-build translator that translates the analogue data into data the computer can understand. The Teleo module translates the analogue data into a binary stream that the computer understands through an extension installed in Max/MSP.
This gives us the possibility to modify the numerical representation of the input voltage, which from an input voltage in the range 0-5 V can be sampled with a maximum of 10-bit resolution, and represented with a number from 0 to $2^{10} - 1$.

The Teleo module has 4 analogue inputs that can all be addressed simultaneously and with different kinds of sensors. Our control board has been connected to channel 0 and the potentiometer is connected to channel 2. In this way we are able to capture the values from each sensor individually.

The Tactile Sensor - Software (Max/MSP)

To make use of the values being sent from the Teleo module, some kind of "logic" has to be built. The "board" patcher in the application is where the numbers are received, and where the decision about what is going to happen is made. The "board" patcher works together with the "menu" patcher in order to make the user interface look nice, and the variables for the changeable parameters of the drum sounds are also saved in the "menu" patcher.

Figure 41 - Data acquisition in Max/MSP

After passing through the sensor, the voltage values end up in Max/MSP, displayed as values on a scale from -100 to +100. These values are then scaled to positive values only, which gives us the values in Figure 42.

The "t.intro.ain" object is the element in Max/MSP which makes the connection to the Teleo module and returns the values. The object has 4 options:

1st = Sample period (1 to 1000 ms)
2nd = Minimum value of the output (-100 to 100)
3rd = Maximum value of the output (-100 to 100)
4th = Resolution in bits (1 to 10)

With the values we have selected, the output is a number between 0 and 100. The object receives a value from the sensor every 250 ms with a resolution of 5 bits (which gives us 32 possible values, 0 to 31 [38]).

There are two of the above-mentioned t.intro.ain modules in the patch.
The second channel on the module is used by the potentiometer, which is there to give us the possibility to change e.g. the volume of the output sound more smoothly. Both blocks are almost identical and do almost the same thing; the only difference is that a "scale" object has been inserted after the block that receives the potentiometer value. Because of resistance in the circuit it is not possible to get a perfect 0 and a perfect 100 (rather 14-76), so the scale object is there to make sure the values always span 0 to 100.

The relative distribution schema

State        kΩ / kΩ    Spread    Volts      Max/MSP value
Standby      10 / 34    0.294     1.466 V    29.03
1. button    10 / 10    1.000     4.992 V    100
2. button    10 / 14    0.714     3.569 V    70.97
3. button    10 / 18    0.556     2.786 V    54.84
4. button    10 / 22    0.455     2.266 V    45.16
5. button    10 / 26    0.385     1.917 V    35.48
6. button    10 / 30    0.333     1.663 V    32.26

Figure 42: The voltage values including the scaled Max/MSP values

[38] Source: "Teleo Starter Kit User Guide" - http://makingthings.com/products/documentation/teleo_intro_user_guide/index.html

The next important section of the patch is the part where it is actually determined which button was pressed and which function to call. This is achieved with six if-statements, each determining whether the value received is within a certain threshold range (e.g. between 99 and 101 to activate button 1) or not (see Figure 43). The thresholds are:

Button 1 = 99 to 101
Button 2 = 69 to 72
Button 3 = 53 to 56
Button 4 = 43 to 46
Button 5 = 34 to 37
Button 6 = 31 to 33

Figure 43 - Thresholds

Each threshold range spans a few units around the expected value, so that a button is still recognised if the tactile sensor produces a slightly different number.

This part of the patch has also been made to allow the user to operate the system manually with the mouse.
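The chain from resistor values to a detected button can be sketched in Python as follows. This is an illustration under stated assumptions, not the patch itself: it uses the ideal divider ratio with the 10 kOhm default resistor and 4 kOhm steps, it assumes that the 5-bit conversion truncates to a code between 0 and 31, and all function names are hypothetical. The thresholds are the ones listed in Figure 43.

```python
DEFAULT_KOHM = 10  # default resistor
STEP_KOHM = 4      # resistor between two neighbouring buttons
MAX_CODE = 31      # highest code at 5-bit resolution

# Threshold ranges from Figure 43: button -> (low, high).
THRESHOLDS = {1: (99, 101), 2: (69, 72), 3: (53, 56),
              4: (43, 46), 5: (34, 37), 6: (31, 33)}


def divider_ratio(button):
    """Ideal output/supply ratio; button 0 means standby, 1 to 6 a pressed button."""
    steps = 6 if button == 0 else button - 1
    return DEFAULT_KOHM / (DEFAULT_KOHM + steps * STEP_KOHM)


def scaled_value(ratio):
    """Quantize to 5 bits (truncation is an assumption) and scale to 0-100."""
    code = int(ratio * MAX_CODE)
    return round(code / MAX_CODE * 100, 2)


def detect_button(value):
    """Return the button whose threshold range contains value, else None."""
    for button, (low, high) in THRESHOLDS.items():
        if low <= value <= high:
            return button
    return None


for state in range(7):
    value = scaled_value(divider_ratio(state))
    print(state, value, detect_button(value))
```

Under these assumptions the scaled values come out as 29.03, 100.0, 70.97, 54.84, 45.16, 35.48 and 32.26, the same as the Max/MSP column in Figure 42, and the standby value falls outside every threshold range. The voltages in the table differ slightly from the ideal ratios, which is one reason each threshold spans a few units.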
The "bt1_click" to "bt6_click" objects receive bangs from the front panel (user interface) and thereby simulate actual values being received by the system. The last part of this patch is the small "control panel" (see Figure 44), which was made during the construction of the patch to allow simpler control without the sensor. It is still needed because some of its elements are still used to control certain parameters, but in a future version it could be taken out of the patch.

Figure 44 - The control panel

Sound Input Analysis

Introduction to Sound Input Analysis

The purpose of this part of the system is to analyse the incoming sound from the microphone, detect when the attack of a drum-type sound occurs, and then determine whether the user is trying to simulate a snare drum ("SD") or a bass drum ("BD").

Figure 45: Sound Input Analysis Diagram

The Sound Input Analysis consists of four main parts (see Figure 45):

The Input, where the analogue sound input is converted into digital sound through the microphone and soundcard
The Attack Detection, where the sound input is analysed based on amplitude
The Drum-type Detection, where the sound input is analysed based on frequency
The Drum Trigger, where the two detection parts are synchronised

We have already covered how a microphone and analogue-to-digital conversion work in the "Digital Representation of Sound" chapter on page 12, so we will not cover that any further in this chapter. Instead we will take a closer look at the other three parts of the Sound Input Analysis.

Attack Detection

The attack detection part of the analysis patch revolves around the "Bonk~" object [39].
As we found in the "Percussive Sound Characteristics" chapter on page 23, a typical percussive sound has a rapid attack, which makes the "Bonk~" object very suitable for this task. Bonk~ is an extension to Max/MSP that tracks relative changes in amplitude over time and detects percussive attacks based on two adjustable threshold values, "hithresh" and "lothresh". If the amplitude of the incoming signal rises above the "hithresh" value within one analysis interval and then drops below the "lothresh" value, Bonk~ detects an attack and sends out a "bang" command (the general Max/MSP term for "execute"). In addition to these thresholds, you can also set a minimum velocity value ("minvel"), which simply ignores signals with a lower amplitude than the number it is set to, no matter what the relative amplitude change is. [40]

[39] Bonk is available from: http://www-crca.ucsd.edu/~tapel/software.html
[40] Source: "Real-time audio analysis tools for Pd and MSP" by Puckette, Apel and Zicarelli: http://www-crca.ucsd.edu/~tapel/icmc98.pdf

Figure 46: Attack Detection

Figure 46 shows the attack detection part of the Sound Input Analysis patch. "lothresh" has been set to 6, "hithresh" to 75 and "minvel" to 25. The main method of tweaking these values has been "training by ear". We simply fed the Bonk~ object a lot of percussive sounds, both pre-recorded samples and live input from a microphone (voice-to-drum simulation, banging on the table, etc.), and then monitored when Bonk~ reported an attack. This way we found the best compromise between detecting the necessary attacks and keeping the system from being so sensitive that it detects several attacks in one percussive sound.
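The threshold behaviour described above can be illustrated with a small state machine. This is a Python sketch of the rule as this chapter words it, not Bonk~'s actual implementation. It assumes one amplitude value per analysis interval, on the same scale as the three thresholds, and it takes one possible reading of the description: the attack is reported when the level exceeds "hithresh", and the drop below "lothresh" re-arms the detector. The function name `detect_attacks` is hypothetical.

```python
def detect_attacks(levels, hithresh=75, lothresh=6, minvel=25):
    """Return the indices of the analysis intervals where an attack is reported.

    Assumptions: `levels` holds one amplitude value per analysis interval,
    on the same scale as the thresholds; an attack is reported on the rise,
    and the detector is re-armed once the level falls below `lothresh`.
    """
    attacks = []
    armed = True
    for i, level in enumerate(levels):
        if armed and level > hithresh and level >= minvel:
            attacks.append(i)
            armed = False      # ignore the remainder of this sound
        elif not armed and level < lothresh:
            armed = True       # the sound has died away; ready for the next one
    return attacks


if __name__ == "__main__":
    # Two separate hits: the level falls below lothresh in between.
    print(detect_attacks([0, 2, 90, 80, 40, 10, 3, 0, 85, 30, 2]))
    # One hit with a second bump that never fell below lothresh.
    print(detect_attacks([0, 90, 50, 88, 40, 2]))
```

The gap between the two thresholds is what prevents a single percussive sound from being reported more than once: a second bump only counts after the level has first fallen below "lothresh".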
Drum-type Detection

As we found out in the "Voice-to-drum Frequency Analysis" section on page 24, there are two frequency intervals that are relevant to look at when deciding whether a human is trying to simulate the sound of a snare drum or a bass drum: the lower end of the frequency scale, around 150-300 Hz, and the higher end of the frequency scale, around 2500-3500 Hz. The bass drum has a lot of activity in the lower end of the frequency scale, and the snare drum has a lot of activity in the higher end of the scale, but also significant activity in the lower end.

Figure 47: Two parallel band-pass filters

Because of this, we have chosen the method of setting up two parallel band-pass filters (see page 15 for more information about band-pass filters): one for the lower end of the spectrum, where a voice-to-drum simulated bass drum sound has its main activity, and one for the higher end of the spectrum, where a voice-to-drum simulated snare drum sound has its main activity (see Figure 47). The idea is to monitor the strength of these two filtered signals, compare them, and then make a set of rules based on the two values that determine whether a sound is a simulated bass drum or a simulated snare drum.

The first band-pass filter has a centre frequency of 238 Hz and a bandwidth of approximately 150 Hz; the second filter has a centre frequency of approximately 2847 Hz and a bandwidth of approximately 836 Hz, corresponding to the values we extracted from our prior analysis ("Voice-to-drum Frequency Analysis" section on page 24).

Figure 48: Drum type detection

As shown in Figure 48, the two signals go into two separate "peakamp" objects, which monitor the peak amplitude of each signal in 5 ms intervals.
The "peakamp" value for the lower band-passed signal belongs to the "BD Detector" (bass drum detector), and the value for the higher band-passed signal belongs to the "SD Detector" (snare drum detector). These two "peakamp" values are then compared in two sets of rules. If the first rule is true, the "BD Detector" detects a bass drum and outputs a "1"; if the rule is not true, it outputs a "2". The "SD Detector" works in the same way, giving an output of either "1" or "2".

- The BD Detector rule:

if $f1 > 0.2 && $f1 > $f2 * 1.2 then set 1 else set 2

Where $f1 is the peakamp value for the lower band-passed signal (Low Peakamp) and $f2 is the peakamp value for the higher band-passed signal (High Peakamp).

Explanation: If the Low Peakamp value is above 0.2 (in order to filter out noise and low signals) and is more than 1.2 times the High Peakamp value, the sound is reported as a bass drum. As we deduced in the "Voice-to-drum Frequency Analysis" section on page 24, voice-to-drum simulated bass drum sounds have their main activity in the low-frequency area. The rule above ensures that only signals with more activity in the low-frequency area than in the high-frequency area are reported as a bass drum.

- The SD Detector rule:

if $f2 > 0.2 && $f2 > $f1 * 1 then set 1 else set 2

Where $f1 is the peakamp value for the lower band-passed signal (Low Peakamp) and $f2 is the peakamp value for the higher band-passed signal (High Peakamp).

Explanation: If the High Peakamp value is above 0.2 (in order to filter out noise and low signals) and is higher than the Low Peakamp value, the sound is reported as a snare drum.
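The two detector rules and the filters that feed them can be tried out offline. The following Python sketch makes several assumptions of its own: the band-pass filters are modelled as second-order (biquad) sections with unity gain at the centre frequency, which may differ from the filter object used in the patch; the peak is taken over the whole buffer rather than in 5 ms intervals; and the 44100 Hz sample rate and all function names are illustrative. Only the centre frequencies, bandwidths and the two rules come from the text.

```python
import math


def bandpass_coeffs(centre_hz, bandwidth_hz, sample_rate=44100):
    """Biquad band-pass coefficients with unity gain at the centre frequency."""
    q = centre_hz / bandwidth_hz
    w0 = 2 * math.pi * centre_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)               # feed-forward terms
    a = (-2 * math.cos(w0) / a0, (1 - alpha) / a0)   # feedback terms
    return b, a


def peak_after_bandpass(samples, centre_hz, bandwidth_hz, sample_rate=44100):
    """Filter `samples` and return the peak absolute output value."""
    (b0, b1, b2), (a1, a2) = bandpass_coeffs(centre_hz, bandwidth_hz, sample_rate)
    x1 = x2 = y1 = y2 = 0.0
    peak = 0.0
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        peak = max(peak, abs(y))
    return peak


def bd_detector(f1, f2):
    """if $f1 > 0.2 && $f1 > $f2 * 1.2 then set 1 else set 2"""
    return 1 if f1 > 0.2 and f1 > f2 * 1.2 else 2


def sd_detector(f1, f2):
    """if $f2 > 0.2 && $f2 > $f1 * 1 then set 1 else set 2"""
    return 1 if f2 > 0.2 and f2 > f1 * 1 else 2


def classify(samples, sample_rate=44100):
    """Return (BD Detector output, SD Detector output) for a buffer."""
    low = peak_after_bandpass(samples, 238, 150, sample_rate)
    high = peak_after_bandpass(samples, 2847, 836, sample_rate)
    return bd_detector(low, high), sd_detector(low, high)
```

Because the bass drum rule requires a 1.2 margin while the snare drum rule only requires the high band to exceed the low band, the two detectors can never both output "1" for the same pair of values, but both can output "2" when the input is quiet or ambiguous.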
As we deduced in the "Voice-to-drum Frequency Analysis" section on page 24, voice-to-drum simulated snare drum sounds have their main activity in the high-frequency area. The rule above ensures that only signals with more activity in the high-frequency area than in the low-frequency area are reported as a snare drum.

Drum Triggering

This part of the patch is divided into two drum triggers, one for the bass drum and one for the snare drum (see Figure 49). This is where the signals from the two detection parts are joined and synchronised, to ensure that e.g. a bass drum sound is only triggered if a sound has a rapid attack AND if the frequency of the sound is in the bass drum band.

Figure 49: Drum Triggers

The bass drum trigger listens to the output of the BD Detector, and the snare drum trigger listens to the output of the SD Detector, i.e. each of them receives either a "1" or a "2". At the same time, both triggers also listen to the output of the Attack Detector, which outputs a "bang" command whenever it detects a percussive attack. Each drum trigger then compares its two inputs: for a "hit" to be reported, the trigger has to receive a "1" from its detector (BD or SD, respectively) AND a "bang" from the Attack Detector at the same time. The BD Trigger sends a "bdhit" command and the SD Trigger sends an "sdhit" command, which are used by the Sound Synthesis part of the system.

In order to properly synchronise the two inputs the Drum Triggers receive, the "bang" from the Attack Detector is delayed by 15 ms for the BD Trigger and by 30 ms for the SD Trigger.
This was done in order to compensate for the calculation time of the SD and BD Detectors, and because the main frequency content of a percussive sound occurs after the attack.

Sound Synthesis

Introduction to Sound Synthesis

This part of the system creates the actual output sound, based on the information from the previous processes. The result of the previous part, Sound Input Analysis, is in the end simply a "bang", which starts the sound synthesis engine of either the bass drum or the snare drum. We have created two different drum synthesizers, one for the bass drum and one for the snare drum.

The other input used in this part of the system is the set of values from the tactile sensor. The buttons produce unique integer values used to trigger the menu system, as explained earlier. These values are used to "bang" the corresponding parts of the system. The potentiometer produces integer values between 0 and 100, which are used to set the values of the different parts of the sound synthesis system. Since the tactile sensor and the menu system have been explained in the chapter "Mapping" on page 41, we will only refer to the results from the previous processes here, which are "bangs" and integer values between 0 and 100.

The structure of both the bass drum and the snare drum synthesizer is inspired by the structure of a freeware software drum synthesizer called DrumSynth 2.0. DrumSynth creates synthetic drum sounds from a combination of swept-frequency sine waves, noise, complex waveforms, and noise with band-pass filters. It can reproduce sounds from classic analogue drum machines or make new drum sounds (see "Short Analysis of DrumSynth 2" on page 29).
The sound from our two drum synthesizers can be manipulated with the tactile sensor, which can both control some of the parameters of the two synthesizers and add sound effects such as reverberation and delay to the snare drum synthesizer. When the drum sound has been generated and the effects have been added, the sound is passed on to the soundcard. The sound synthesis part of the system contains the following four major blocks: the Bass Drum Synthesizer, the Snare Drum Synthesizer, the Sound Effects, and the Master Section (see Figure 50).

Figure 50: Overview of sound synthesis blocks

Description of the Sound Synthesis Blocks

The Bass Drum Synthesizer

The bass drum synthesizer consists of a sinusoidal sweep that interpolates linearly from 250 Hz to 80 Hz over a period of 110 ms. The amplitude of the bass drum is controlled by the following envelope (see Figure 51):

Attack: from 0 to 1 amplitude in 1 ms
Decay: from 1 to 0.3 amplitude in 120 ms
Release: from 0.4 to 0 amplitude in 120 ms

Figure 51: Amplitude envelope for bass drum synthesizer

Figure 52: Max/MSP patch for the Bass Drum

The Snare Drum Synthesizer

The snare drum synthesizer is a bit more complex than the bass drum synthesizer, since it consists of more elements. The snare drum sound is created by adding four different elements together: a sinusoidal frequency sweep, overtones, white noise, and band-passed white noise. Each element has its own amplitude envelope in order to achieve a more natural sound with independent temporal evolution [41].
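Both synthesizers are built from the same two ingredients: a linearly swept sine wave and a piecewise-linear amplitude envelope. As an illustration, the bass drum values given above can be rendered offline with the Python sketch below. It is not the Max/MSP patch; the function names and the 44100 Hz sample rate are assumptions. The text gives the end of the decay as 0.3 and the start of the release as 0.4; the sketch assumes 0.3 for both so that the envelope is continuous, and it assumes that the frequency stays at 80 Hz once the 110 ms sweep has finished.

```python
import math


def envelope(t_ms, segments):
    """Piecewise-linear envelope value at time `t_ms`.

    `segments` is a list of (start_level, end_level, duration_ms) tuples
    played back to back. Zero-length segments are skipped.
    """
    start = 0.0
    for level_from, level_to, duration in segments:
        if t_ms < start + duration:
            frac = (t_ms - start) / duration
            return level_from + (level_to - level_from) * frac
        start += duration
    return 0.0


# Attack, decay and release of the bass drum (release start assumed to be 0.3).
BD_ENVELOPE = [(0.0, 1.0, 1.0), (1.0, 0.3, 120.0), (0.3, 0.0, 120.0)]


def bass_drum(sample_rate=44100, f_start=250.0, f_end=80.0, sweep_ms=110.0):
    """Render one bass drum hit as a list of samples in the range -1..1."""
    total_ms = sum(duration for _, _, duration in BD_ENVELOPE)
    n_samples = int(total_ms * sample_rate / 1000)
    phase = 0.0
    out = []
    for n in range(n_samples):
        t_ms = 1000.0 * n / sample_rate
        frac = min(t_ms / sweep_ms, 1.0)            # hold f_end after the sweep
        freq = f_start + (f_end - f_start) * frac
        out.append(envelope(t_ms, BD_ENVELOPE) * math.sin(phase))
        phase += 2 * math.pi * freq / sample_rate   # accumulate phase for a smooth sweep
    return out
```

Accumulating the phase sample by sample, instead of computing sin(2πft) with a changing f, keeps the sweep free of phase jumps. The same `envelope` helper would accept the snare drum envelopes, including their 0 ms attacks.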
All elements are "banged" at the same time, and the output of each element is collected in a small mixer that controls the amount of each element in the final snare drum sound (see Figure 53).

Figure 53: The elements of the snare drum synthesizer

[41] Source: "Computer Music", page 88.

- Walkthrough of snare drum synthesizer elements

Frequency sweep

The sinusoidal frequency sweep interpolates linearly from 454 Hz to 250 Hz over a period of 25 ms. The amplitude of the sinusoidal frequency sweep is controlled by the following envelope:

Attack: from 0 to 1 amplitude in 0 ms
Decay: from 1 to 0.5 amplitude in 25 ms
Sustain: from 0.5 to 0.2 amplitude in 45 ms
Release: from 0.2 to 0 amplitude in 110 ms

Overtones

The overtones are created by adding two sinusoids with frequencies of 500 Hz and 328 Hz (see "Synthesis techniques" on page 21). We use an independent amplitude envelope for the overtone element in order to make the final snare drum sound more natural than if the same amplitude envelope were used for every element of the snare drum. The amplitude of the overtone element is controlled by the following envelope:

Attack: from 0 to 1 amplitude in 0 ms
Decay: from 1 to 0.2 amplitude in 16 ms
Release: from 0.2 to 0 amplitude in 34 ms

White noise

The white noise enhances the frequency spectrum of the snare drum. The amplitude of the white noise is controlled by the following envelope:

Attack: from 0 to 1 amplitude in 0 ms
Decay: from 1 to 0.3 amplitude in 25 ms
Release: from 0.3 to 0 amplitude in 150 ms

White noise with band-pass filter

This element is used to give the snare drum a punchier sound. The punch can be controlled by altering the centre frequency, bandwidth or gain of the filter.
The bandwidth is described through Q, which corresponds to the centre frequency divided by the bandwidth. The gain is set relatively high in order to emphasise the effect of this element. The parameters of the band-pass filter have the following values:

Centre frequency: 5500 Hz
Q: 25
Gain: 50

The band-pass filtered white noise is also controlled by its own amplitude envelope:

Attack: from 0 to 1 amplitude in 0 ms
Decay: from 1 to 0.05 amplitude in 32 ms
Release: from 0.05 to 0 amplitude in 50 ms

Mixer

Here it is possible to control the volume of the four elements individually, to create the mix that suits the desired snare drum sound. The numbers are fixed in this version of the system, but it is possible to alter the values within the snare drum patch in Max/MSP.

Changeable Parameters for the Two Synthesizers

In order to make the sound from the two synthesizers more interesting to the user, we have made it possible to change some of the basic parameters of both synthesizers.

Changeable parameters for bass drum synthesizer

In the bass drum synthesizer you can change the frequency range of the sweep, the volume, and the amount of distortion (by adding a number to the amplitude of the sound).

- Frequency

The frequency range is changed by adding a number between -50 and 50 to the output of the sinusoidal frequency sweep. Example: if the current output of the sinusoidal sweep is 250 Hz to 80 Hz and you have chosen to add 50 Hz to the bass drum, you will hear a frequency sweep of 250 Hz (+50 Hz) to 80 Hz (+50 Hz) = 300 Hz to 130 Hz. The default value added to the frequency is 0.
In order to protect the speakers used with the system, the lower limit of the frequencies that can be produced is 35 Hz. So even if you add -50 to 80 Hz, which is the final frequency produced by the sinusoidal sweep, you will not get a frequency of 30 Hz but 35 Hz.

- Distortion

The amplitude of the bass drum can be distorted by adding floats between 0 and 1 to the amplitude of the envelope. The value to be added is set by rotating the potentiometer and applying the value to the synthesizer. In this way it is possible to obtain an amplitude of 2 on the bass drum sound, which results in a distorted sound. The default value added to the amplitude is 0.

- Volume

The volume of the bass drum is controlled by sending values between 0 and 100 from the tactile sensor to the bass drum synthesizer and applying them. The values are scaled from integers in the range 0-100 to floats in the range 0-1 and applied to the amplitude of the synthesizer by multiplication. The default volume for the bass drum is 99, or 0.99.

Changeable parameters for snare drum synthesizer

In the snare drum synthesizer you can change the frequency range of the sinusoidal frequency sweep element and the volume. Besides this, you can add the two sound effects delay and reverberation, which are described in the next section.

- Frequency

The frequency range of the sinusoidal frequency sweep element is changed in exactly the same way as for the bass drum synthesizer. Only here there is no limiter, so you can add numbers from -50 to 50 and get outer values like:

454 Hz + 50 Hz = 504 Hz (upper limit)
250 Hz - 50 Hz = 200 Hz (lower limit)

The default value added to the frequency sweep is 0.

- Volume

You can control the overall volume of all the elements combined in the snare drum sound.
This is done in exactly the same way as with the bass drum volume. The default volume for the snare drum is set to 99, or 0.99.

The Sound Effects

From a musical point of view, not all sound effects are equally interesting on every instrument. This is why we have chosen to have sound effects like delay and reverb for the snare drum only.

Delay patch

The "DELAY" patch is a sub-patch of "SYNTH" and creates a simple delay. The patch is divided into two blocks, an orange one and a blue one (see Figure 54). The orange block controls the delay time, and the blue block sets the delay mix.

The orange block receives a default delay, which is set to 0, and a variable delay, which can be a value between 0 and 500 milliseconds. However, the delay object only takes the delay time in samples. Therefore we convert the value in ms to a number of samples with the "mstosamps~" object.

The blue block receives the snare drum audio signal from the "SYNTH" patch, which is then split up. It is sent both to a delay and directly to a signal multiplier. The delayed signal is afterwards also sent to a signal multiplier. The gain of the two signal multipliers is controlled by the delay mix value, which ranges from 0 to 100. 0 means that the amplitude of the delayed signal will be zero, and 100 means that it will have the same amplitude as the original signal.

Figure 54: DELAY patch, sub-patch to SYNTH patch

In the end the two signals are added together and sent back to the "SYNTH" patch.

Reverberation patch

In order to create a reverberator we need to create early reflections, late reflections, and reverberation time, as explained earlier in the Reverberation Theory section (see page 18). Based on the knowledge we had of creating digital reverbs, we tried to create a patch with both comb filters and all-pass filters.
We felt it sounded too "metallic", and in our search to improve it we found a Max/MSP patch created by Scott Wieser [42]. We modified the reverb part of his patch so that it would fit into our system. It consists of two patches. The first one is called "REVERB", and it controls the amount of early and late reflections, the reverberation time and the mix between the original signal and the reverb signal. The "REVERB" patch is a sub-patch of the "SYNTH" patch (see "System Overview", page 38). The second one is called "reverb" and is a sub-patch of "REVERB". This patch contains the actual filters that create the reverb effect.

[42] "Computermuzak" patch available from: http://www.geocities.com/snottywong_1999/maxmsp/

In the "REVERB" patch we receive two signals: SD_delay, which is the synthetic snare drum sound we have created, and reverb_mix, which is a number from 0 to 100 that controls the mix between the reverb and the original sound (see Figure 55). The audio signal goes into a signal multiplier object and also into channel 1 of the "p reverb" object, which sends it to inlet 1 of the "reverb" sub-patch. Channel 2 of the "p reverb" object is fed with a number that determines the gain of the early feedback; this is sent to inlet 2 in "reverb". Channel 3 is fed with a positive number below 1 that controls the reverberation time, which is sent to inlet 3.

Figure 55: Control patch (REVERB patch) of the reverberator, sub-patch to SYNTH patch

In "reverb" we have the three inlets called 1, 2, and 3 (see Figure 56). Inlet 1 goes to a "tapin~" object, which stores the audio signal. The signal is then read out with different delays by the "tapout~" objects. The number in the "tapin~" object determines how long the stored audio signal can be in ms, and the numbers in the two "tapout~" objects set the delays for the signal in ms.
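A "tapin~"/"tapout~" pair is essentially a delay line that is written once and read back at several delay times. The Python sketch below illustrates that idea together with the millisecond-to-sample conversion that "mstosamps~" performs in the delay patch. The tap times and gains in the usage note are made-up illustration values, not those of the patch, and the function names are hypothetical.

```python
def ms_to_samples(ms, sample_rate=44100):
    """Convert a time in milliseconds to a whole number of samples."""
    return int(round(ms * sample_rate / 1000.0))


def multitap(signal, taps_ms, gains, sample_rate=44100):
    """Return the sum of delayed, scaled copies of `signal`.

    Each (tap, gain) pair reads the same stored signal at its own delay,
    as several outlets of one tap delay would. There is no feedback.
    """
    delays = [ms_to_samples(ms, sample_rate) for ms in taps_ms]
    out = [0.0] * (len(signal) + max(delays))
    for delay, gain in zip(delays, gains):
        for n, x in enumerate(signal):
            out[n + delay] += gain * x
    return out
```

For example, `multitap([1.0], [2, 5], [0.5, 0.25], sample_rate=1000)` turns a single impulse into two echoes, 2 and 5 samples later, with amplitudes 0.5 and 0.25.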
Figure 56: The content of the reverberator (reverb patch), sub-patch to REVERB patch

The 2*6 outlets of the "tapout~" objects are multiplied by different amplitude factors and are then combined in a signal multiplier. These signals are the early reflections. The gain for the early reflections is set by inlet 2. The sixth outlet from each of the two "tapout~" objects also sends the signal to three all-pass filters each. The "allpass~" objects have different delay times; the lowest is 24 ms and the highest is 40 ms. The "allpass~" objects create the late reflections, and inlet 3 controls the reverberation time. Finally, all of the signals are combined and sent back to the "REVERB" patch via an outlet.

The Master Section

The Master section collects the sound produced by the two synthesizers and the loop playback patch, controls the overall volume, and sends the final result to the soundcard (see Figure 57). The master volume can be controlled by the user by sending values from the tactile sensor. The Max/MSP object "dac~" is used to convert the signals from digital to analogue (see "Digital Representation of Sound" on page 12 for details about analogue-to-digital conversion).

Figure 57: Max/MSP patch for Master section

USER TEST

Introduction to User Test

The system has so far been optimised using the results of the analysis of our own voices, as described in the chapter "Voice-to-drum Frequency Analysis", page 24. In order to see whether the adjustments we have made to the system also apply to other users, we have performed a user test.
The purpose of the user test is to check whether we have reached our success criteria, which are:

The system should be able to work in "real-time" (see chapter "Real-time" on page 35).
Most people should be able to learn to use the system within a reasonable period of time (approx. 10-15 min).

The test was carried out as a "think-out-loud" test, where the users are supposed to use the system with only a few instructions. In order to create a somewhat domestic atmosphere, the test took place in a room where the user was accompanied only by the person taking notes during the test. This was done to prevent the user from getting too nervous or shy to provide us with useful results. We only monitored the users and took notes of their behaviour and comments. After they had tested and experienced the system, we asked them some questions according to a questionnaire [43]. We then compared the reactions and answers from the users to our success criteria.

We tested the system on six users, half of them female and the other half male. It was important to us to have both genders equally represented in the test, because the optimisation before the test had been carried out using male voices only. We were interested in seeing how female voices could control the system with the Sound Input Analysis mechanism described on page 51.

[43] See the questionnaire and the answers in "Appendix 4 – User Test Questionnaire with Answers", on page 91.

The users in the test were of different nationalities and were all between 21 and 29 years old. All of them were experienced computer users, but only one of them had a little experience with computer music. Three of the users had played regular instruments before. The test was carried out using a low-latency FireWire sound card and a good handheld microphone in order to create an optimal scenario for the users.
(See "Appendix 2 - Sound Hardware Specifications" on page 89.)

Summary of the Answers

All in all, the users found the system fascinating and were thrilled to be able to trigger synthetic drum sounds with their voice. They were all able to trigger the two different drum sounds individually after only a few tries. However, it was more difficult for them to create a groove with more than 4 beats in a row. Some users noticed that the snare drum was more difficult to trigger than the bass drum, irrespective of gender.

Many users found the menu system a bit difficult to understand because of the sub-menu. Some of them said that this could be avoided by displaying the current position in the menu system visually. Another problem for the users was reading the value of the potentiometer, which is not shown in the GUI after the value has been applied. All of the users agreed that the tactile sensor would be more efficient and easier to use on the floor. One user would like the control board to work in "real-time", without having to apply the effects to the output sound every time a change is made. Two of the users noticed that the buttons sometimes produce voltages that make unwanted menus appear.

The three users with musical experience would like to be able to record their performance within the system, both to improve their skills and to make more complex rhythmic patterns by playing along with themselves. One user found that the music to play along with was too fast for him. Another user noticed how difficult it was to say the correct input sound to trigger the system when the output sound was considerably different.

All the users liked the sound of the snare and bass drum, and the effects. One of them would have liked more effects for the bass drum. None of the users noticed any latency from the system.
Conclusion of User Test

This test was performed on only six users, which is not enough to be statistically representative. Nevertheless, the statements of the users can be used as pointers that can help us improve our prototype before testing a more final version on a real target group.

We have partly fulfilled our criteria of success. All of the users found the system interesting and fun. All of them were able to trigger the sounds, even though the system was set up in exactly the same way for all participants and the users used different microphone techniques. This indicates that users, after a short presentation, are able to use the fundamental functionality of the program.

Within the time frame of 15 minutes none of the users could really create a longer, coherent groove, which was part of our success criteria. The users noticed that it was difficult to trigger the sounds in quick succession. The triggering was also difficult because of the difference between the input sound they had to produce and the output sound from the system. These problems could possibly be avoided if the users had more time to adjust to the system and practise, just as with other instruments.

Generally, the users found the bass drum easier to trigger than the snare drum. This indicates that the Sound Input Analysis could be tweaked and improved in order to ease this.

The issues concerning the tactile sensor are not our first priority compared to the rest of the system, since the controller is a very basic prototype, which in a final version of the system we would create in cooperation with more trained and skilled engineers. The issues concerning the flow of the menu system will not be corrected either, since they are outside the primary scope of this project.
None of the users experienced any latency whatsoever, so the system works sufficiently fast. This means that we have fully met one of our main success criteria. It is our experience that it is possible to achieve a useful result for entertainment purposes with an ordinary sound card, but if you want to use the system in a musical context, you need a low-latency soundcard.

We have obtained a lot of useful advice that can help us improve the system, remove bugs and implement small features that enhance its usability and usefulness. Basically, the primary features of the system are working as intended. Nevertheless, we have only partly fulfilled our criteria of success.

We were able to make a system that triggers bass drum sounds and snare drum sounds according to voice input, regardless of the gender of the user.
The users did not notice any latency in the system at all.
The creation of rhythmic patterns is more difficult than triggering single sounds. This is due to the following issues:
The snare drum is more difficult to trigger than the bass drum, which we should be able to improve.
Creating patterns is difficult when the user has to focus on producing the right input sound, which is very different from the actual output sound. This is something that will improve with practice.

CONCLUSION

In this chapter we will conclude on the different parts of the project. We will compare the questions in our problem formulation to our final results, and then wrap up the conclusion based on our test chapter. After this we will discuss the future improvements that can be made within the scope of the problem definition. Lastly we will try to put this project into perspective.
General Conclusion

Sound Analysis

"How can we analyze voice-to-drum simulated sounds, inputted through a microphone, and identify the different drum types?"

When we looked at the characteristics of percussive sounds, we found that the human voice has two important features when simulating drum sounds (see chapter "Percussive Sound Characteristics" on page 23). These are the difference in frequency content between a simulated bass drum sound and a simulated snare drum sound, and the rapid increase in amplitude at the beginning of the sound (the attack). These features were implemented in our system as the criteria for detecting whether the incoming sound is a voice-simulated drum sound, and then which kind of drum sound it is.

In order to detect a rapid attack we used the bonk~ object in Max/MSP. This object made it possible to set up rules defining which criteria the attack of the input sound should meet in order to qualify as a potential drum-triggering sound (see chapter "Attack Detection" on page 52).

Based on our analysis of human voices simulating drum sounds, we learned in which frequency intervals to look for amplitude peaks in order to recognise the voice simulating a bass drum sound and a snare drum sound (see "Voice-to-drum Frequency Analysis" on page 24). This knowledge was used to set up two band-pass filters that filter the input sound from the microphone. Based on the filtered signals, we set up rules to determine whether the incoming sound is a bass drum, a snare drum or neither. When a specific drum is detected, and an attack has been detected by the bonk~ object, the sound analysis part of the system sends a trigger signal to the corresponding sound synthesis part (see chapter "Sound Input Analysis" on page 51).
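The decision logic summarised above can be illustrated with a minimal Python sketch. It is not the Max/MSP patch used in the project: the attack check is a crude frame-level stand-in for bonk~, the two bands are measured with the Goertzel algorithm instead of band-pass filters, and the band limits (chosen to bracket the peak frequencies in Appendix 3, roughly 110 to 336 Hz for the bass drum and 2230 to 3461 Hz for the snare drum), the thresholds and all function names are assumptions made for the illustration.

```python
import math

# Assumed analysis bands in Hz, loosely bracketing the peaks listed in Appendix 3.
BASS_BAND = (100.0, 350.0)
SNARE_BAND = (2200.0, 3500.0)


def goertzel_power(samples, sample_rate, freq):
    """Power of `samples` at a single frequency (Goertzel algorithm)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def band_power(samples, sample_rate, band, step_hz):
    """Mean power over probe frequencies spaced `step_hz` apart inside `band`."""
    low, high = band
    count = int((high - low) / step_hz) + 1
    probes = [low + i * step_hz for i in range(count)]
    return sum(goertzel_power(samples, sample_rate, f) for f in probes) / len(probes)


def has_attack(samples, frame=256, growth=4.0, floor=1e-4):
    """Crude stand-in for bonk~: True if one frame is much louder than the previous one."""
    levels = [
        sum(x * x for x in samples[i:i + frame]) / frame
        for i in range(0, len(samples) - frame + 1, frame)
    ]
    return any(cur > floor and cur > growth * prev
               for prev, cur in zip(levels, levels[1:]))


def detect_drum(samples, sample_rate=44100, min_power=1.0, ratio=4.0):
    """Return "bass", "snare" or "none" for one buffer of input samples."""
    if not has_attack(samples):
        return "none"  # no rapid rise in level, so nothing is triggered
    bass = band_power(samples, sample_rate, BASS_BAND, 50.0)
    snare = band_power(samples, sample_rate, SNARE_BAND, 100.0)
    if max(bass, snare) < min_power:
        return "none"  # too little energy in either band
    if bass > ratio * snare:
        return "bass"
    if snare > ratio * bass:
        return "snare"
    return "none"  # ambiguous spectrum: neither drum is triggered
```

A buffer that starts silent and then contains a 200 Hz tone would be classified as a bass drum by this sketch, while a steady tone without an onset triggers nothing, which mirrors the requirement that both an attack and a matching frequency band must be present.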
Sound Synthesis

"How do we create a synthetic set of drum sounds, that resemble the sounds from the vintage drum machines, and can be triggered in sync with the identified sounds from the voice-to-drum input?"

The construction of our bass drum synthesizer and snare drum synthesizer was based on the structure of a software drum synthesizer called DrumSynth 2. The program is capable of producing many different drum sounds resembling the sound of the vintage drum synthesizers, which we had chosen as the goal for our sound synthesis.

Through a short analysis of DrumSynth 2 (see page 29), we found that a combination of sinusoidal frequency sweeps, noise, overtones created through additive synthesis, band-passed noise and distortion, together with the right amplitude envelope, could be used to create a bass drum sound and a snare drum sound. Both sounds came close to the bass drum and snare drum of one of our own favourite vintage drum synthesizers, the Roland TR-909.

For the bass drum synthesizer we used a sinusoidal sweep in combination with an amplitude envelope. The snare drum sound is a bit more complex and consists of a sinusoidal frequency sweep, two overtones made with additive synthesis, noise, and band-passed white noise, each of the four elements with its own amplitude envelope (see "Sound Synthesis" on page 58). The playback of these two drum sounds is synchronized to the trigger signal received from the sound analysis part.

Sensor

"How do we create a tactile sensor that resembles a floorboard, and can manipulate the parameters of the synthetically created drum sounds?"

Our primary idea was to create a foot-controlled floor board for this system, but we ended up creating a handheld prototype with exactly the same functionality as the intended one.
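Returning briefly to the bass drum described in the Sound Synthesis conclusion above: a sinusoidal sweep combined with an amplitude envelope can be sketched in a few lines of Python. This is only an illustration of the principle, not the Max/MSP patch; the function name, the exponential shape of the sweep and envelope, and all default values are assumptions.

```python
import math


def bass_drum(sample_rate=44100, duration=0.3, start_hz=150.0, end_hz=50.0,
              sweep_time=0.05, decay_time=0.08):
    """Return one bass drum hit as a list of samples in the range -1..1.

    The pitch falls exponentially from start_hz towards end_hz (the sweep),
    while the amplitude decays exponentially (the envelope).
    """
    n_samples = round(sample_rate * duration)
    phase = 0.0
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        freq = end_hz + (start_hz - end_hz) * math.exp(-t / sweep_time)
        amp = math.exp(-t / decay_time)
        out.append(amp * math.sin(phase))
        # Accumulate phase so the changing frequency does not cause clicks.
        phase += 2.0 * math.pi * freq / sample_rate
    return out
```

Calling `bass_drum()` whenever the analysis part reports a bass drum would give the triggered playback described above. A snare drum along the same lines would add overtones and noise, each with its own envelope.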
On the sensor side, this project has mainly focused on the development of the handheld controller and only describes the theory of a microphone, since the standardised microphones available to us suited our needs in that regard.

The tactile sensor created for this project was built on our theoretical sensor knowledge, described in "Appendix 1 - Sensor Theory" on page 84. The actual tactile sensor consists of a potentiometer and a control board built as a voltage divider switching network, which are both connected to a microprocessor (Teleo Intro Module). The voltage divider switching network was created to produce a unique voltage for each button pressed on the control board. After conversion, these voltages, together with the voltages from the potentiometer, could be used to control some of the parameters of the software system. Accessing the computer through the microprocessor eased the conversion of voltages to binary digits, which otherwise could have been achieved through the serial or parallel port.

The Mapping

We believe that we have made a reasonable mapping for the microphone in this system. Input sounds in the low-frequency area have been mapped to the synthesis of a low-frequency sound (bass drum), and the high frequencies have been mapped to a higher-frequency sound (snare drum), which seems natural and easy to understand.

The tactile sensor is also working as we hoped, although a few minor "errors" exist. For some reason, turning the potentiometer clockwise produces decreasing values. It would seem more natural to have it the other way around, since this is the way most turning knobs work. Besides this we believe the mapping is good. The mapping could be improved further by, for example, using the amplitude of the input voice in the output sound from the synthesizer.
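The reversed potentiometer direction mentioned above could be corrected in software rather than in hardware. The following Python sketch shows the idea; the function name and the raw range of 0 to 1023 are assumptions, since the value range delivered by the microprocessor is not stated here.

```python
def map_potentiometer(raw, raw_max=1023, out_min=0.0, out_max=1.0, invert=True):
    """Scale a raw sensor reading to a synthesis parameter range.

    raw_max is a hypothetical full-scale reading. With invert=True a
    decreasing raw value gives an increasing parameter value.
    """
    raw = min(max(raw, 0), raw_max)   # clamp readings outside the expected range
    position = raw / raw_max          # normalise to 0..1
    if invert:
        position = 1.0 - position     # flip the direction of the knob
    return out_min + position * (out_max - out_min)
```

With `invert=True`, a reading that falls as the knob is turned clockwise is turned into a parameter value that rises, which matches the behaviour of most turning knobs.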
Testing

The main purpose of the test was to measure whether we had reached our predefined success criteria. The test was performed on six users, half of whom were female and half male. We needed to find out whether the users noticed any latency, which could reduce the usability and usefulness of the system in a musical context. We also wanted the system to be easily accessible, meaning that the user should be able to play the instrument within a reasonable period of time (10-15 minutes). Even though six users are not enough to conclude anything statistically, the results of the test can be used as pointers, giving us qualitative information for further improvements and tweaking.

All six participants were able to play the instrument within the given period of time, but not all succeeded equally well. They could all trigger the two sounds individually, but they all had difficulties creating rhythmic patterns with combinations of the two sounds. This may be because it takes more practice to learn the system, and to say a particular sound, when the output sound is obviously different.

Some of the users also noticed that the snare drum was harder to trigger than the bass drum. This is possibly because the filter analysing the input sound was optimized using only measurements of 36 different voice-simulated drum sounds made by 3 persons. The triggering could be improved by "training" the analysis part of the system with drum sounds spoken by a larger number of different users.

All in all, we feel the system works in a satisfying way compared to the success criteria we initially set up. There is always room for improvement, of course, and we cover this in the next chapter.

Future Improvements

Here we cover some of the improvements we feel we could make within the scope of the problem definition.
The detection of the input sounds could be improved if we spent more time analysing a wider range of sounds, as mentioned above, and by using, for example, more band-pass filters and a more complex set of detection rules.

The sound of the synthesized drums was satisfying, but could have been made to sound more like the TR-909 we were trying to emulate, had we spent more time tweaking it. We could also have provided a wider range of possible output sounds and effects, and perhaps even have tried to emulate more natural-sounding drums.

The tactile sensor works in the basic way it was intended to, but we did not build it as a floor board as originally planned. This could have been changed if we had had the necessary time and resources.

Perspective

Throughout the last decade many home studios have appeared, due to the rapidly dropping prices of PCs and the recording possibilities that fast modern computers provide. It has become possible to create great-sounding music in your bedroom, but you still need some knowledge of how to play instruments and arrange music. There has also been a tendency to ease the production of electronic music with programs that allow you to create music by combining small samples of music played by real musicians.

We wanted to allow non-musicians to hear their musical ideas played with instrument sounds on a stereo, instead of keeping the music inside their heads. The techniques behind this project are relatively simple and still need development in order to work more consistently and for more users, but our vision has proven to work in reality.
We imagine that a more final version of Funkmeister7 would also be able to track frequency and amplitude in combination, in order to synthesize melodic instruments such as bass, piano or flute from these parameters. Perhaps the parameters of the input sound could be mapped to MIDI signals, in order to make the system trigger samples of real instruments and thereby expand the sonic possibilities of Funkmeister7. In this way Funkmeister7 could become an entire music recording program that enables the user to create and edit entire musical productions without touching a regular instrument, using only the voice.

LITERATURE AND SOURCES

Primary Literature

"Computer Music – second edition"
Charles Dodge and Thomas A. Jerse
Published in 1997 by Schirmer, Thomson Learning
ISBN: 0-02-864682-7

Papers

"Percussion Synthesis"
Stephen Dill, first-year EE graduate from Stanford University
http://ccrma-www.stanford.edu/~sdill/220A-project/drums.html
Found 2nd of May 2005

"The importance of parameter mapping in electronic instrument design"
Andy Hunt, Marcelo M. Wanderley, Matthew Paradis, for NIME-02.
Borrowed from Juraj Kojs (juko@media.aau.dk), teacher at MED4, spring 2005

"Mapping performer parameters to synthesis engines"
Andy Hunt and Marcelo M. Wanderley, Department of Electronics, the University of York.
Borrowed from Juraj Kojs (juko@media.aau.dk), teacher at MED4, spring 2005

"Distributed Real-Time Audio Processing"
Master by Nelson Posse Lago, Distributed Systems Research Group
http://gsd.ime.usp.br/~lago/masters/extended_abstract.pdf
Found 15th of May 2005

Lectures

"Signal Processing in Automatic Perception 2"
Lecture 1 Review, by Stefania Serafin
MED4 2005, Aalborg University Copenhagen, Spring 2005
http://www.media.aau.dk/ap2/lecture1ap2.pdf
Found 6th of May

"Video Segmentation in Automatic Perception 1"
Lecture 17 and 18, by Thomas Moeslund
MED3, Aalborg University, Copenhagen, Autumn 2004
http://www.cvmt.dk/education/teaching/e04/MED3/AP/ap17+18.ppt
Found 11th of May

Internet

Wikipedia – web encyclopedia
http://en.wikipedia.org/wiki/Main_Page
Found 24th of April 2005

Synthtopia – Portal for electronic music
http://www.synthtopia.com/
Found 26th of April 2005

Max/MSP tutorials and topics
http://www.synthesisters.com/download/MSP45TutorialsAndTopics.pdf
Found 24th of April 2005

Chienworks – computer services
http://www.chienworks.com/
Found 26th of April 2005

Signal Processing in AP2 – website for AP2 sound course on MED4
http://www.media.aau.dk/ap2/soundap2.html
Found 18th of April 2005

Harmony Central - Internet resource for musicians
http://www.harmony-central.com/
Found 27th of April 2005

Sound on Sound – web based music recording technology magazine
http://www.soundonsound.com/
Found 2nd of May 2005

Geofex – Guitar effects oriented webpage
http://www.geofex.com/
Found 10th of May 2005

Sensor Technology – website for Sensor Technology course MED4
http://www.smilen.net/SensorTechMED4/
Found 1st of April

Makingthings – contract services and tools for prototyping and development
http://www.makingthings.com/
Found 25th of March 2005
Realtime Audio Analysis Tools for PD and Max – "Real-time audio analysis tools for Pd and MSP"
http://www-crca.ucsd.edu/~tapel/icmc98.pdf
Found 3rd of April

Resistor Colour Code Tutorial
http://www.uoguelph.ca/~antoon/gadgets/resistors/resistor.htm
Found 5th of May

Ohm's Law and Materials Properties
http://www.techfak.uni-kiel.de/matwis/amat/elmat_en/kap_1/backbone/r1_3_2.html#_13
Found 5th of May

Play Hookey – technical information
http://www.play-hookey.com/dc_theory/voltage_divider.html
Found 6th of May

Terratec – Audio Equipment
http://audioen.terratec.net/modules.php?op=modload&name=News&file=article&sid=5
Found 25th of May

Same Day Music – Musical instrument store
http://www.samedaymusic.com/product--BEHMIC100
Found 25th of May

Shure – audio equipment
http://www.shure.com/microphones/models/sm58.asp
Found 26th of May

Group 405 – Funkmeister7 homepage
http://cphstud.aue.aau.dk/~ka1147/
Found 26th of May 2005

Software

Drumsynth 2 – Drum sound synthesizer
http://www.hitsquad.com/smm/programs/DrumSynth/
Found 1st of April 2005

Max/MSP – demo of version 4.5.4
http://www.cycling74.com/products/dlmaxmsp.html
Found 1st of April 2005

Bonk~ - extra Max/MSP object for amplitude detection
http://www-crca.ucsd.edu/~tapel/software.html
Found 3rd of April 2005

Synthesizer with reverb – used for reverb creation
http://www.geocities.com/snottywong_1999/maxmsp/computermuzak.sit
Found 6th of May 2005

APPENDICES

Appendix 1 - Sensor Theory

To explain a little more deeply what we are going to build, we will have to look at some of the basics in electronics.
We want to build a simple switching circuit; therefore general theory about voltage, current, resistance, circuits, switches, variable resistors and voltage divider switching networks needs to be explained.

Voltage

Voltage is defined as the difference in electric potential between two points in a circuit; it can be understood as the pressure pushing the current (electrons). The electrons are pushed through the circuit by an electrical force (a difference in pressure) caused by the potential difference in the circuit. If a difference in the amount of electrons exists between the poles of e.g. a battery in a circuit, the pole with the most electrons will "push" electrons through the circuit. In real life a battery achieves this by an electrochemical process, which results in the minus side having a large surplus of electrons, while the positive side has fewer. If the two poles of the battery are connected through a circuit and a difference in potential exists, the electrons are forced to move through the circuit, thereby producing current. The voltage (difference in pressure) is measured in volts [V].

Current

Current is the amount of free electrons pushed from one point to another (through an electric circuit) in a given period of time. The flow of electrons can be compared to water running through a pipe. How large the current becomes is decided by the amount of pressure (voltage). The current is the amount of charge passing one point in the circuit per second, and it is measured in ampere [A]; the number of amperes is proportional to the number of electrons passing per second.
Resistance

As a result of the crystal structure of most conductors (metals), the free electrons whose movement makes up the electric current experience collisions with the crystal lattice, which converts part of the electrical energy into heat. This phenomenon is known as electric resistance. Resistance is the measure of how much an electrical component "refuses" to let current flow through it. Everything in a circuit has a certain amount of resistance, even the wires and the copper lanes on printed circuit boards, but this resistance is so small that it is not taken into consideration.

In electric circuits it is possible to insert resistance at any point one wishes. When resistance is introduced in our system, we are talking about specific components made to lower the flow of current. Such components are called resistors; their resistance is measured in Ohms [Ω] and they come in an almost endless variety of resistance values. How much resistance a resistor provides can be determined visually from the coloured rings painted on the resistor, which follow the resistor colour code standard [44]. As an example, a 4 kOhm resistor would look like Figure 58.

Figure 58: A 4 kOhm resistor [45]
Figure 59: A drawing of two resistors

As mentioned, resistance is measured in Ohms, and in technical terms it can be described as the relationship between voltage and current in Ohmic materials, also known as Ohm's law (V = I * R). Ohm's law [46] states that the voltage across the ends of a conductor (e.g. a resistor) and the current flowing through it are proportional to each other at a given temperature (V / I = R) [47]. The voltage across the terminals of e.g. a resistor is proportional to both the current (I) flowing through the element and the resistance (R) of the element. Therefore we write V = I * R.
The resistance of an element changes with temperature, which is why the law is stated for a given temperature. By using this equation we can determine which voltages we will end up with in our circuit.

[44] http://www.uoguelph.ca/~antoon/gadgets/resistors/resistor.htm
[45] From Smilen Dimitrov's slide: ST_MED4_05_Circuit_Theory_Elementary_Measurement_Labs01.ppt
[46] http://www.techfak.uni-kiel.de/matwis/amat/elmat_en/kap_1/backbone/r1_3_2.html#_13
[47] http://en.wikipedia.org/wiki/Ohms_Law

Circuits

A simple circuit (see Figure 60) consists of a power source and a closed path of conductors, which together form a loop through which the current flows. If the poles of the energy source are connected only by a good conductor, the result is a short circuit, meaning something will "crash". This is because the electrons do not experience much resistance from the conductor that connects the poles, and simply return the energy they were given to the source. To make the circuit work, all that has to be done is to insert something that provides resistance, like a resistor or a light bulb; the circuit is then complete and qualifies as the simplest circuit possible.

Figure 60: Illustration of a simple circuit [48]

When talking about circuits there are a few things one has to be aware of. First of all, in the schematics/diagram of a circuit the current is drawn as moving from positive to negative, although the electrons actually move from minus to plus. When the circuit is built from resistors in series, the current stays the same through all of them, but the voltage changes [49].

Switches

Switches are used to open or close a circuit. When the switch is closed (e.g. pushed down, in the case of push buttons) the current flows through the circuit; when it is open nothing happens.
The pushbutton [50] has a default state of open, meaning that as long as the button is not pressed nothing happens in the circuit. This is illustrated in Figure 61.

[48] Image from Smilen Dimitrov's slide: ST_MED4_05_Circuit_Theory_Elementary_Measurement_Labs01.ppt
[49] http://en.wikipedia.org/wiki/Resistor
[50] http://www.bcae1.com/images/gifs/switpush.gif

Figure 61: Push button schematic (open). Labelled parts: push button actuator, plunger, movable contact, stationary contacts, solder lug terminals.

Variable Resistor and Voltage Divider Switching Network

A potentiometer [51] is also a resistor, but it has a defined range of resistance that can be adjusted. The potentiometer has three terminals, where the middle terminal is the ground and is also where the difference is measured. If 5 V are connected to the right and left terminals, the connection from middle to right will have one output voltage and the connection from middle to left will have a second output voltage. As the potentiometer is turned, the resistance increases on one side and decreases on the other: as the middle-left resistance gets smaller, the middle-right resistance gets bigger, and vice versa.

Figure 62 - Typical potentiometer (taken from Wikipedia)

This is also the basic concept of a voltage divider switching network (voltage divider). Put another way, the potentiometer acts as a voltage divider, but with only two possible output voltages. The voltage divider [52] is able to produce more than two output voltages using only one circuit. It is made of a series combination of 7 resistors and 6 corresponding push button switches, as shown in the illustration. There is only one loop in the circuit and thus only one current. By pressing a given switch, some of the resistors are short-circuited, so the total resistance in the circuit changes, and the output voltage changes as well.
Due to this specific construction, it is only possible to detect one button press at a time. Therefore, if two buttons are pressed simultaneously, the "lowest" one is reported, since it short-circuits the "upper" one. The principle of the operation can be demonstrated with two cases.

[51] http://en.wikipedia.org/wiki/Potentiometer
[52] http://www.play-hookey.com/dc_theory/voltage_divider.html

In the first case all the switches are open. The current in the circuit is

$$I = \frac{U}{R_1 + R_2 + R_3 + R_4 + R_5 + R_6 + R_7} \tag{1}$$

Then the output voltage (which is the voltage drop on resistor R1 due to current I) is

$$V_{out} = R_1 \cdot I = R_1 \cdot \frac{U}{R_1 + R_2 + R_3 + R_4 + R_5 + R_6 + R_7} \tag{2}$$

or the same formula, rewritten as a voltage divider:

$$V_{out} = U \cdot \frac{R_1}{R_1 + R_2 + R_3 + R_4 + R_5 + R_6 + R_7} \tag{3}$$

In the second case switch nr. 3 is closed. The current in the circuit is

$$I = \frac{U}{R_1 + R_2 + R_3} \tag{4}$$

Then the output voltage (which is the voltage drop on resistor R1 due to current I) is

$$V_{out} = R_1 \cdot I = R_1 \cdot \frac{U}{R_1 + R_2 + R_3} \tag{5}$$

or the same formula, rewritten as a voltage divider:

$$V_{out} = U \cdot \frac{R_1}{R_1 + R_2 + R_3} \tag{6}$$

These two cases illustrate that every pressed switch generates a distinct output voltage. These distinct outputs can then be used to obtain distinct numerical values in our programming environment.
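The two cases can be generalised to any of the six switches. The following Python sketch evaluates the voltage divider formula for each switch position; the function name, the 5 V supply and the equal 1 kOhm resistor values are assumptions chosen for the example, not the values used on the control board.

```python
def output_voltage(supply, resistors, closed_switch=None):
    """Output voltage across R1 of the series voltage divider.

    resistors[0] is R1, resistors[1] is R2, and so on. With all switches
    open the current passes through every resistor. Closing switch k
    (1-based) short-circuits the resistors above Rk, so only R1..Rk
    remain in the loop.
    """
    active = resistors if closed_switch is None else resistors[:closed_switch]
    return supply * resistors[0] / sum(active)


# Hypothetical component values: 5 V supply and seven 1 kOhm resistors.
SUPPLY = 5.0
RESISTORS = [1000.0] * 7

voltages = {"open": output_voltage(SUPPLY, RESISTORS)}
for switch in range(1, 7):
    voltages[switch] = output_voltage(SUPPLY, RESISTORS, switch)
```

With these example values the open circuit gives 5/7 V and switch nr. 3 gives 5/3 V, in agreement with the two cases above, and all seven states produce different voltages, which is what allows the software to tell the buttons apart.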
Appendix 2 - Sound Hardware Specifications

The following hardware has been used during the sound analysis and test phase:

- Terratec FW24 [53] (low latency firewire soundcard)
- Behringer MIC100 [54] (tube pre-amp)
- Shure SM58 [55] (handheld dynamic microphone)
- Labtec AM-22 (cheap handheld dynamic microphone for computer use)

[53] http://audioen.terratec.net/modules.php?op=modload&name=News&file=article&sid=5
[54] http://www.samedaymusic.com/product--BEHMIC100
[55] http://www.shure.com/microphones/models/sm58.asp

Appendix 3 – Bass and Snare drum peak frequencies of recorded voice-to-drum samples

Bass Drum:

Filename                   Main Frequency Peak (Hz)
BD_Allan_Labtec_01.wav     318
BD_Allan_Labtec_02.wav     289
BD_Allan_Labtec_03.wav     301
BD_Allan_Shure_01.wav      199
BD_Allan_Shure_02.wav      200
BD_Allan_Shure_03.wav      200
BD_Kasper_Labtec_01.wav    331
BD_Kasper_Labtec_02.wav    309
BD_Kasper_Labtec_03.wav    330
BD_Kasper_Shure_01.wav     167
BD_Kasper_Shure_02.wav     168
BD_Kasper_Shure_03.wav     170
BD_Mads_Labtec_01.wav      258
BD_Mads_Labtec_02.wav      302
BD_Mads_Labtec_03.wav      336
BD_Mads_Shure_01.wav       110
BD_Mads_Shure_02.wav       118
BD_Mads_Shure_03.wav       186

Snare Drum:

Filename                   Main Frequency Peak (Hz)
SD_Allan_Labtec_01.wav     2647
SD_Allan_Labtec_02.wav     3360
SD_Allan_Labtec_03.wav     2792
SD_Allan_Shure_01.wav      3317
SD_Allan_Shure_02.wav      3461
SD_Allan_Shure_03.wav      3346
SD_Kasper_Labtec_01.wav    2386
SD_Kasper_Labtec_02.wav    2395
SD_Kasper_Labtec_03.wav    2643
SD_Kasper_Shure_01.wav     2230
SD_Kasper_Shure_02.wav     2234
SD_Kasper_Shure_03.wav     2296
SD_Mads_Labtec_01.wav      2905
SD_Mads_Labtec_02.wav      2802
SD_Mads_Labtec_03.wav      3402
SD_Mads_Shure_01.wav       3360
SD_Mads_Shure_02.wav       2834
SD_Mads_Shure_03.wav       2841
Appendix 4 – User Test

Questionnaire with Answers

User test of Funkmeister7 – 01

Facts about the user
Age, nationality, sex: 22 years, Danish, male
Has the user any kind of previous experience with beat-boxing? If yes; for how many years? no
Has the user any kind of previous experience with musical instruments or singing? If yes; what kind of instrument and for how many years? no
Has the user any kind of previous experience with drum-synth programs? If yes; what/which program(s) and for how many years? no

Observations during the user test
Was the user able to trigger the snare drum and bass drum? Yes, individually he was able to trigger the drums ok, but he had difficulties producing a sequence of sounds
How long time did it take the user to figure out how to play the FUNKMEISTER7? After an introduction to the system, it only took him a couple of minutes, but he was able to create a coherent pattern with different drum sounds with the ten minutes it took.
Could the user figure out how to use the control board? At first the user found the menu system difficult to understand, so I introduced it to him. After this it was ok.

Questions to the users
What was good/bad? "Easy to trigger when you know the sound". It's difficult to figure out what value the chosen effect already has!
What did you miss, what could have been done better/different? BD is too nice, need more rough effects
How do you think the FUNKMEISTER7 works? (Did it take a long time to learn how to play it, were there any real-time related problems?) Nice that it actually works! No latency
How do you think the control board works? The buttons do not work every time, it would be nice to have on the floor
What do you think of the layout/graphic, should it be different? (Is it easy to figure how the menu system/entire system works?)
Difficult to understand that the menu has a main and sub part
What do you think of the bass and snare drum sounds? Fine
What do you think of the effects you can apply? (Do you miss some effects?) More on BD
Anything else? The interface is nice!

User test of Funkmeister7 – 02

Facts about the user
Age, nationality, sex: 29, Danish, male
Has the user any kind of previous experience with beat-boxing? If yes; for how many years? no
Has the user any kind of previous experience with musical instruments or singing? If yes; what kind of instrument and for how many years? Sings, plays guitar, since age 12
Has the user any kind of previous experience with drum-synth programs? If yes; what/which program(s) and for how many years? no

Observations during the user test
Was the user able to trigger the snare drum and bass drum? Starting difficulties, especially SD
How long time did it take the user to figure out how to play the FUNKMEISTER7? Could create simple patterns after approx. 5 minutes
Could the user figure out how to use the control board? Ok, after a short presentation.

Questions to the users
What was good/bad? BD pitch has a too wide range, "the music is too fast", maybe use pitch correction on the songs that are already there?
What did you miss, what could have been done better/different?
How do you think the FUNKMEISTER7 works? (Did it take a long time to learn how to play it, were there any real-time related problems?) The system is very difficult to control when you say the sounds really fast. No latency
How do you think the control board works? Ok, but it should be on the floor
What do you think of the layout/graphic, should it be different? (Is it easy to figure how the menu system/entire system works?) Really simple = nice. "It would be nice to be able to see where I am in the menu system"
What do you think of the bass and snare drum sounds?
Ok
What do you think of the effects you can apply? (Do you miss some effects?)
Anything else? Does the input monitor work for girls? It would be nice to have the possibility to save your user presets (favourite adjustments). Real-time adjustment of the sounds would give more expression possibilities.

User test of Funkmeister7 – 03

Facts about the user
Age, nationality, sex: 24, Danish, female
Has the user any kind of previous experience with beat-boxing? If yes; for how many years? no
Has the user any kind of previous experience with musical instruments or singing? If yes; what kind of instrument and for how many years? no
Has the user any kind of previous experience with drum-synth programs? If yes; what/which program(s) and for how many years? no

Observations during the user test
Was the user able to trigger the snare drum and bass drum? Individually it was ok. The BD was fine, but the SD demanded more resources from her… it was better over time
How long time did it take the user to figure out how to play the FUNKMEISTER7? 3 minutes for the basics
Could the user figure out how to use the control board? Difficult to understand the used terms, because she does not play music, "why do the values not work in real-time?", "why can't I see the current status?"

Questions to the users
What was good/bad? It actually works. Nice!
What did you miss, what could have been done better/different? The menu is a bit tricky
How do you think the FUNKMEISTER7 works? (Did it take a long time to learn how to play it, were there any real-time related problems?) fast learner, so the basics took about 5 minutes. No latency
How do you think the control board works?
What do you think of the layout/graphic, should it be different? (Is it easy to figure how the menu system/entire system works?)
90s layout, easy to see what is going on, "maybe the menu should lighten up the chosen instrument so you know where you are?"
What do you think of the bass and snare drum sounds? fine
What do you think of the effects you can apply? (Do you miss some effects?) OK
Anything else? "Saying another sound than what you hear is difficult". Difficult to trigger the drums fast

User test of Funkmeister7 – 04

Facts about the users
Age, nationality, sex: 21, Macedonia, female
Has the user any kind of previous experience with beat-boxing? If yes; for how many years? no
Has the user any kind of previous experience with musical instruments or singing? If yes; what kind of instrument and for how many years? no
Has the user any kind of previous experience with drum-synth programs? If yes; what/which program(s) and for how many years? no

Observations during the user test
Was the user able to trigger the snare drum and bass drum? Ok, after a while
How long time did it take the user to figure out how to play the FUNKMEISTER7? A while… never really succeeded… was too shy
Could the user figure out how to use the control board? Yes, it was easy to her… but the system was a bit unstable because of voltage issues.

Questions to the users
What was good/bad?
What did you miss, what could have been done better/different?
How do you think the FUNKMEISTER7 works? (Did it take a long time to learn how to play it, were there any real-time related problems?) The basics were ok after a few minutes. No latency
How do you think the control board works? ok
What do you think of the layout/graphic, should it be different? (Is it easy to figure how the menu system/entire system works?) it is very nice
What do you think of the bass and snare drum sounds?
What do you think of the effects you can apply? (Do you miss some effects?)
Anything else?
-

User test of Funkmeister7 – 05

Facts about the users

Age, nationality, sex
24, Iceland, female

Has the user any kind of previous experience with beat-boxing? If yes; for how many years?
no

Has the user any kind of previous experience with musical instruments or singing? If yes; what kind of instrument and for how many years?
Plays piano and flute, since childhood

Has the user any kind of previous experience with drum-synth programs? If yes; what/which program(s) and for how many years?
no

Observations during the user test

Was the user able to trigger the snare drum and bass drum?
Good individually. SD difficult

How long did it take the user to figure out how to play the FUNKMEISTER7?
Approx. 5 minutes

Could the user figure out how to use the control board?
Yes, after a short explanation

Questions to the users

What was good/bad?

What did you miss, what could have been done better/different?

How do you think the FUNKMEISTER7 works? (Did it take a long time to learn how to play it, were there any real-time related problems?)
No latency

How do you think the control board works?
Ok, would be better on the floor

What do you think of the layout/graphic, should it be different? (Is it easy to figure how the menu system/entire system works?)
looks nice

What do you think of the bass and snare drum sounds?
good

What do you think of the effects you can apply? (Do you miss some effects?)
fun

Anything else?
It would be nice to hear your performance afterwards. A recording option

User test of Funkmeister7 – 06

Facts about the users

Age, nationality, sex
26, Danish, male

Has the user any kind of previous experience with beat-boxing? If yes; for how many years?
no

Has the user any kind of previous experience with musical instruments or singing? If yes; what kind of instrument and for how many years?
Plays guitar and sings

Has the user any kind of previous experience with drum-synth programs? If yes; what/which program(s) and for how many years?
Not really, but has used Fruity Loops a bit, and Cool Edit (sound editor)

Observations during the user test

Was the user able to trigger the snare drum and bass drum?
Yes, individually, but did not trigger every time

How long did it take the user to figure out how to play the FUNKMEISTER7?
Approx. 3 minutes, but the triggering was not perfect

Could the user figure out how to use the control board?
Yes, after a short introduction

Questions to the users

What was good/bad?
BD difficult to do fast, it is not precise enough, “maybe practice will improve my skills?”

What did you miss, what could have been done better/different?

How do you think the FUNKMEISTER7 works? (Did it take a long time to learn how to play it, were there any real-time related problems?)
Fine, but sometimes there is sound even if you are not saying anything

How do you think the control board works?

What do you think of the layout/graphic, should it be different? (Is it easy to figure how the menu system/entire system works?)

What do you think of the bass and snare drum sounds?
Many adjustments are possible, but the interface is “narrow”… which is not good in a live situation

What do you think of the effects you can apply? (Do you miss some effects?)
I would expect them to be the way they are (mandatory)… they sound fine

Anything else?
The songs are very “Nik & Jay”-like, the amplitude clips, it would be nice with a recording option in order to play along with yourself and create more complex patterns.