CSC 391/691
Digital Audio and Video
Spring 2009
Burg
Digital Audio and MIDI Compared
There are two basic ways in which sound is stored in a computer: as digital audio and as
MIDI.
Digital Audio
Sound is produced by vibrations that cause air pressure to change over time.
Mathematically, it can be represented as a function over time and graphed as a waveform.
[Figure: The word “boo” recorded in Audacity]
The amplitude of the wave corresponds to the loudness of the sound. It is customarily
measured in decibels. The frequency of the wave corresponds to the pitch of the sound.
Frequency is measured in Hertz. One Hertz is one cycle/second. One kilohertz,
abbreviated kHz, is 1,000 cycles per second. One megahertz, abbreviated MHz, is
1,000,000 cycles/second.
[Figure: Sound wave in blue, two cycles shown, first cycle ending at the red vertical line]
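If you like to see ideas in code, here is a minimal Python sketch that evaluates a pure tone as a function of time, like the waveform graphed above. The 440 Hz frequency (concert A) and 0.5 amplitude are arbitrary choices for illustration:

    import math

    frequency = 440.0          # cycles per second (Hz); pitch of the tone
    amplitude = 0.5            # relative air pressure amplitude; loudness
    duration = 2 / frequency   # two cycles, as in the figure above

    # Evaluate the waveform at 100 evenly spaced points in time.
    waveform = [amplitude * math.sin(2 * math.pi * frequency * i * duration / 100)
                for i in range(100)]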
Sound in digital audio format is stored as a sequence of numbers representing the
amplitude of air pressure as it varies over time. Sound is converted to digital audio
format by sampling and quantization. When you attach a microphone to your sound
card and create a digital recording using a program like Audacity or Music Creator as the
interface, the analog-to-digital converter (ADC) in your sound card is doing this
process of sampling and quantization. The mic detects the changing air pressure
amplitude, communicating this information to the ADC at evenly spaced points in time. This
is the sampling process. The sound card quantizes these values and sends them to the
computer to be stored. When the sound file is played, the reverse process happens. The
sound card has a digital-to-analog converter (DAC) that converts the digital samples
back to continuously varying air pressure, a form that can be converted to vibrations in
the air – sound.
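Here is a minimal Python sketch of the two steps, assuming the incoming signal is a pure 440 Hz tone (a real recording would be whatever the microphone picks up). Sampling measures the signal at evenly spaced points in time; quantization rounds each measurement to one of the allowed integer levels:

    import math

    SAMPLE_RATE = 44100    # samples per second (CD quality)
    BIT_DEPTH = 16         # bits per sample

    def signal(t):
        """The continuous signal the microphone 'hears': a 440 Hz sine wave."""
        return math.sin(2 * math.pi * 440 * t)

    # Sampling: measure the signal at evenly spaced points in time (one second).
    samples = [signal(n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]

    # Quantization: round each value to the nearest of the allowed levels.
    max_level = 2 ** (BIT_DEPTH - 1) - 1   # 32767 for 16-bit signed samples
    quantized = [round(s * max_level) for s in samples]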
When you make a digital recording, the sampling rate must be at least twice the
frequency of the highest-frequency component in the sound. Otherwise, frequency
components above half the sampling rate are falsely recorded as lower frequencies, so the
recording won’t sound exactly like what you were trying to record. This is called aliasing.
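A small Python sketch makes aliasing concrete. The 1,000 Hz sampling rate is deliberately low so the effect is easy to see: a 900 Hz tone sampled this way produces exactly the same samples as a 100 Hz tone, so the recording would play back at the wrong pitch:

    import math

    SAMPLE_RATE = 1000   # deliberately low, so aliasing is easy to see

    def sample(freq_hz, n):
        """Sample a cosine of the given frequency at sample index n."""
        return math.cos(2 * math.pi * freq_hz * n / SAMPLE_RATE)

    # 900 Hz is more than half the 1000 Hz sampling rate, so it is
    # aliased: its samples are indistinguishable from a 100 Hz tone's.
    for n in range(5):
        print(round(sample(900, n), 6), round(sample(100, n), 6))  # columns match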
Since the highest frequency that humans can hear is about 20,000 Hz, the CD-quality
sampling rate is set at 44,100 samples/second. Samples per second are also expressed in
Hertz, so CD-quality digital audio is sampled at 44.1 kHz.
When samples of the air pressure amplitude of a sound are taken, they are stored in a
computer as binary numbers. Binary numbers – base 2, that is – consist of bits, each of
which can have a value of 0 or 1. Each number in a computer is contained in a certain
number of bits. This is called the bit depth. The bit depth per audio sample puts a limit
on the number of values that can be represented. If two bits are used, then four values
can be represented.
    Base 2    Base 10
    00        0
    01        1
    10        2
    11        3
If three bits are used, eight values can be represented.
    Base 2    Base 10
    000       0
    001       1
    010       2
    011       3
    100       4
    101       5
    110       6
    111       7
In general, if b bits are used, 2^b values can be represented. Each value is used to
represent one air pressure amplitude level between a minimum and maximum. Thus, the
larger the bit depth, the more precisely you can represent air pressure amplitude. If you
have more bits, you don’t have to round values up or down so much to the nearest
allowable value.
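The following Python sketch illustrates the rounding. The quantize function and the 0.3 test amplitude are just for illustration, but they show why more bits mean smaller rounding errors:

    def quantize(amplitude, bits):
        """Round an amplitude in [-1.0, 1.0] to the nearest of 2^bits levels."""
        levels = 2 ** bits                     # number of representable values
        step = 2.0 / (levels - 1)              # spacing between adjacent levels
        return round(amplitude / step) * step  # round to the nearest level

    print(quantize(0.3, 3))    # coarse: only 8 levels, large rounding error
    print(quantize(0.3, 16))   # fine: 65,536 levels, tiny rounding error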
The bit depth for CD-quality digital audio is 16 bits per channel with two stereo channels
for each sample. A byte is equal to eight bits, so that’s four bytes per sample for a stereo
recording and two bytes per sample for a mono recording.
When you record digital audio, you’ll need to choose the sampling rate and bit depth.
CD-quality is probably fine for your purposes, so a sampling rate of 44.1 kHz and bit
depth of 16 bits per sample is good. You don’t need to record in stereo. You can create
stereo channels later in the editing if you want to. Recording in mono is fine.
A digital audio recording captures exactly the sound that is being transmitted at the
moment the sound is made. With a high enough sampling rate and bit depth, the resulting
recording can have great fidelity to the original. For example, when a singer is being
recorded, all the nuances of the performance are captured – the breathing, characteristic
resonance of the voice, stylistic performance of the song, subtle shifts in timing, and so
forth. This is one of the advantages of digital audio over MIDI.
A disadvantage of digital audio is that it results in a large file. You can easily do the
arithmetic. If you have 44,100 samples/second, four bytes per sample, a one minute
recording, and 60 seconds in each minute, how many bytes of data do you get for a stereo
digital recording?
44,100 samples/second * 4 bytes/sample * 60 seconds/min * 1 min = 10,584,000 bytes/min
That’s about 10 megabytes per minute.
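The same arithmetic as a short Python sketch:

    # Bytes per minute of CD-quality stereo digital audio.
    SAMPLE_RATE = 44100        # samples per second
    BYTES_PER_SAMPLE = 4       # 2 bytes x 2 stereo channels
    SECONDS_PER_MINUTE = 60

    bytes_per_minute = SAMPLE_RATE * BYTES_PER_SAMPLE * SECONDS_PER_MINUTE
    print(bytes_per_minute)                    # 10584000
    print(bytes_per_minute / 1_000_000, "MB")  # about 10.6 megabytes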
Because digital audio results in such big files, it is usually compressed before
distribution. Often, you keep it in uncompressed form while you are working on it, and
then compress it at the end to a format like .mp3. If you want to import an uncompressed
audio file into your project, you can use the .wav format.
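As an illustration, here is a sketch using Python’s standard wave module to write one second of uncompressed 16-bit mono audio in .wav format. The 440 Hz tone and the file name tone.wav are arbitrary choices:

    import math
    import struct
    import wave

    SAMPLE_RATE = 44100

    with wave.open("tone.wav", "wb") as f:
        f.setnchannels(1)            # mono
        f.setsampwidth(2)            # 2 bytes = 16 bits per sample
        f.setframerate(SAMPLE_RATE)  # 44.1 kHz
        for n in range(SAMPLE_RATE):
            value = int(32767 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
            f.writeframes(struct.pack("<h", value))   # little-endian 16-bit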
In summary, digital audio is stored as a sequence of numbers, each representing the air
pressure amplitude of a sound wave at a certain moment in time. In CD-quality audio,
each number is stored in two bytes (with two parallel sequences of numbers, one for each
of the two stereo channels). There are 44,100 of these numbers per second for each of
the two stereo channels. This creates a big file, which is why digital audio is compressed
for distribution.
MIDI
MIDI stands for Musical Instrument Digital Interface. It is another way in which sound
can be stored and communicated in a computer. The recording, encoding, and playing of
MIDI is done by the interaction of three basic components:
• a MIDI input device – often an electronic keyboard or other MIDI-enabled
instrument – which you attach to a computer;
• a MIDI sequencer – often a piece of software like Cakewalk Music Creator,
Logic, or Pro Tools – that receives and records the messages sent by the MIDI
input device;
• and a MIDI synthesizer or sampler – e.g., the sound card of your computer or a
software synthesizer bundled with a MIDI sequencer – that knows how to
interpret the MIDI messages and convert them into sound waves that can be
played.
A MIDI file does not consist of audio samples. MIDI uses a different method of
encoding sound and music. When you play a note – say, middle C – on a musical
keyboard attached to a computer that is running a MIDI sequencer, the keyboard sends a
message that says, essentially:
Note On, C, velocity v
When you lift your finger, a Note Off message is then sent.
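To make this concrete, here is a Python sketch of the raw bytes behind those two messages, assuming MIDI channel 1 and middle C, which is note number 60 in MIDI:

    NOTE_ON = 0x90    # status byte: Note On, channel 1
    NOTE_OFF = 0x80   # status byte: Note Off, channel 1
    MIDDLE_C = 60     # MIDI note number for middle C
    velocity = 100    # how hard the key was struck (0-127)

    press = bytes([NOTE_ON, MIDDLE_C, velocity])   # sent when the key goes down
    release = bytes([NOTE_OFF, MIDDLE_C, 0])       # sent when the key comes up

Each message here is three bytes on the wire; a MIDI file stores the stream even more compactly with techniques such as running status, which is why playing and releasing a note takes only a few bytes.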
Your MIDI keyboard doesn’t even have to make a sound. It’s just engineered to send a
message to the receiving device, which in this case is your computer and the sequencer
on it (e.g., Music Creator).
This is different from hooking a microphone up to the computer, holding the mic close to
a music keyboard, and recording yourself playing the note C as digital audio. In this
case, your keyboard must make a sound in order for anything to be recorded. When you
record digital audio, the microphone and ADC in your sound card are sampling and
quantizing the changing air pressure amplitude caused by your striking the note C. A
sequence of samples is stored, that sequence lasting however long you make the
recording. If it’s one second, it will be 44,100 samples, stored as 176,400 bytes. In
comparison, the MIDI message for playing the note C and then releasing it requires only
four bytes. There are also messages that say what type of instrument you want to hear
when the file is played back. When the MIDI is played back, it doesn’t have to sound
like a piano or an electronic keyboard. You can say, in the file, that you want to hear a
clarinet, flute, or any other of the 128 standard MIDI instruments. (There can be even
more, but 128 instruments are always available in standard MIDI.) Each instrument is
called a patch. Indicating in a MIDI file that you want a different instrument at some
point is called a patch change. A whole set of 128 instruments is called a bank.
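A patch change is just as compact. As a sketch, the status byte 0xC0 means Program Change on channel 1, and its single data byte picks one of the 128 patches in the current bank. The patch numbers below are the standard General MIDI clarinet and flute, numbered from 0 as in the raw protocol:

    PROGRAM_CHANGE = 0xC0   # status byte: Program Change, channel 1
    CLARINET = 71           # General MIDI patch number for clarinet
    FLUTE = 73              # General MIDI patch number for flute

    switch_to_clarinet = bytes([PROGRAM_CHANGE, CLARINET])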
When digital audio is played, the changing air pressure amplitudes that were recorded are
reproduced by the playing device so that we hear the sound. MIDI is different, because
no air pressure amplitudes are recorded.
In the case of MIDI, a synthesizer must know how to create a note C that sounds like the
instrument you specified in the file. The synthesizer could be a hardware or software
device.
There are a number of places where MIDI sounds can be synthesized:
• the sound card of your computer
• a software synthesizer provided by the operating system of your computer, like
Microsoft GS Wavetable SW Synth
• a software synthesizer, such as the Cakewalk TTS-1 provided by Music Creator
or the many synthesizers provided by Reason
The synthesizer converts the MIDI messages into a waveform representation of sound
and sends it to the sound card. When the music is played, the DAC in the sound card
converts the digitally-encoded waveforms into continuously varying air pressure
amplitudes – vibrations – that cause the sound to be played.
A MIDI synthesizer can convert MIDI messages into encoded sound waves in one of two
basic ways. The sound waves can be created by mathematical synthesis, using
operations that combine sine waves in ways that have been determined to create the
desired pitches and timbres of instruments. Alternatively, rather than creating the waves
by mathematics, a synthesizer can simply look up sound clips that have been stored in its
memory bank. Some people make a distinction between these two methods. They call
something a synthesizer if it uses mathematical synthesis and a sampler if it reads from
a memory bank of sound samples. Some people use the term synthesizer in either case.
In this course, we’ll use the terms interchangeably. (For a more detailed discussion of the
different types of MIDI synthesis, see http://en.wikipedia.org/wiki/Synthesizer.)
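Here is a rough Python sketch of the mathematical-synthesis approach: summing sine waves at a fundamental frequency and its harmonics. The harmonic strengths below are made up for illustration, not taken from a real instrument model:

    import math

    SAMPLE_RATE = 44100

    def synthesize(fundamental_hz, harmonics, duration_s=1.0):
        """Sum sine waves; harmonics maps harmonic number -> relative strength."""
        n_samples = int(SAMPLE_RATE * duration_s)
        return [
            sum(strength * math.sin(2 * math.pi * fundamental_hz * h * n / SAMPLE_RATE)
                for h, strength in harmonics.items())
            for n in range(n_samples)
        ]

    # Middle C is about 261.6 Hz; odd harmonics dominate a clarinet-like timbre.
    wave = synthesize(261.6, {1: 1.0, 3: 0.75, 5: 0.5, 7: 0.14})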
Because MIDI sounds are synthesized or read from stored samples, the quality of the
sound that you end up with is dependent on the quality of your synthesizer. The sound
card that comes with your computer may or may not give you good MIDI sound.
Different software synthesizers offer a different number of sounds with different
qualities. This is why it’s good to experiment with different synthesizers if you have
them available, like the ones in Music Creator and Reason.
Another disadvantage of MIDI-created music is that it lacks the individuality of digitally
recorded individual performances. However, you can compensate for this lack of
individuality somewhat by altering the MIDI sounds with filters, pitch bends, and so
forth.
An advantage of MIDI is the ease with which you can make changes and corrections.
One quick patch change – a click of the mouse – can change your sound from a piano to a
violin. The key or tempo in which a piece is played can be changed just as easily. These
changes are non-destructive in that you can set things back to their original state any time
you want to. If you play music on a keyboard and record it as MIDI and you make a few
mistakes, you can go in and change those notes individually, which is impossible in
digital audio because the notes aren’t stored as separate entities.
Mixing Digital Audio and MIDI
You can combine digital audio and MIDI in a music production if you have a multi-track
editor that handles both – e.g., Music Creator, Logic, or Pro Tools.
Each track is designated as either a MIDI track or an audio track. You can designate
different inputs and outputs for each track.
You can combine audio and MIDI tracks in a wide variety of ways. For example, you
can record drums on one track, a violin on another, and piano on another, all in MIDI.
Then you might want to record one voice on one audio track and another voice on a
second audio track. All these recordings can be done at the same time or at different
times (depending on the power of your computer processor to handle simultaneous
multi-track recording). You can record one voice on one track and then play that track while
you record a second voice in harmony.
The advantage of having different instruments and voices on different tracks is that you
can then edit them separately without one interfering with another. The adjustments that
you want to make to one voice or instrument may not be the same as the adjustments you
want to make for another. You can also separate tracks into different output channels, a
way to create stereo separation.
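Here is a minimal Python sketch of that last point. The mix_to_stereo function and its pan values are hypothetical, but they show how two mono tracks can be weighted differently into the left and right output channels:

    def mix_to_stereo(track_a, track_b, pan_a=0.2, pan_b=0.8):
        """Pan 0.0 = hard left, 1.0 = hard right. Returns (left, right) samples."""
        left = [a * (1 - pan_a) + b * (1 - pan_b) for a, b in zip(track_a, track_b)]
        right = [a * pan_a + b * pan_b for a, b in zip(track_a, track_b)]
        return left, right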
When you’re done with all your editing, you mix everything down to digital audio and
compress it for distribution. However, you should keep a copy of the multi-track file in
the file format of your sequencer (e.g., .cwp for Music Creator). That way you can edit it
more later if you want to.