How Digital Audio Works:

EE-93 Special Topics in Recording Engineering (1 credit)
Thursdays 6 - 7:15 PM, plus 3 - 4:15 TU and THU in Granoff 252
Tom Bates, instructor; George Nagel, teaching assistant
Week 6
The Continuous Analog World
In the analog world, audio signals are provided continuously.
Pebble in a pool analogy (moves like dominos, not bullets)
We put up a device with a large sail to catch the fluctuating air pressures.
Audio is converted from fluctuating air pressure to an equivalently fluctuating electrical
signal by the microphone. The resulting analog signals that we process in our audio work
are the electrical analogue/equivalent of the compression and rarefaction energy in the
original sound waves as they move through the air.
Everything we do to them after that is a modification of those electrical signals, and then
they are converted back to fluctuating air pressure by a loudspeaker or headphone to be
experienced by human listeners at a later time and different place.
Living in a Sampled World
Unlike the analog world, the sampled world is not a continuum.
It is a world where we take samples of a signal of interest and then later try to recreate the
original phenomenon by looking at our notes.
You have long been used to living in a sampled world.
Digitally sampled signals work because when samples are experienced by humans at a
high enough rate of speed they are perceived as being continuous.
The best example of this is movies. As you know, movies actually consist of a long
series of still pictures that are shown in the theater at a high rate of speed (at 24 pictures
per second in the US, 25 per second in Europe). This is fast enough that it looks like a
continuous moving picture to the human eye. But in reality, when you sit through a
two-hour movie, you have been sitting in the dark for at least one hour.
Latest advances in cinema projectors.
Television works the same way in that you see 30 pictures per second in the US and 25 in
Europe.
You are all familiar with the still/pause button on your VCR or DVD player that lets you
pause and view just one of those pictures. When you release that mode and go back to
viewing the video, it looks like continuous action.
Note: In movies the persistence in the eye (light fading as it is turned off) works to hold
the last image until the new one is illuminated.
Note: In video, flicker is further reduced by the phosphor that remains glowing for a time
in a cathode ray tube monitor, after being struck by electrons. Example: P22.
Terms: Interlaced and Progressive Scans – in an LCD and Plasma world.
Digital audio works on the same principle, except that we must provide more samples per
second than pictures. On CDs, for example, new audio samples are provided at a
rate of 44,100 new samples per second, and the human ear hears that as a continuous flow
of audio information.
Term: As in the case of television phosphors, we hold the digital audio signal until the
next sample is ready to update our audio. A circuit which holds the previous signal until
updated is called a “Sample and Hold” circuit.
How Sampling Works:
In the digital world, a little man in a tiny house sleeps only to be awakened by an alarm
clock when it is time for him to take the samples. When the alarm goes off, he goes out
and attaches a (digital) volt meter to the line coming from the microphone and writes
down the voltage in a journal. He then goes back to sleep until the alarm clock goes off
again.
Term: When he takes his voltage samples of the analog signal on the wire, he is
converting that analog sample into digital measurements, or a digital representation of
some of the analog signal. This function is called an “Analog to Digital Converter” or an
A/D converter.
Term: The writing down of the measurement numbers in a journal is called “Digital
Recording.”
Term: The Clock or Master Clock is the timing signal for taking samples.
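The little man's routine can be sketched in a few lines of Python. This is only an illustration of the analogy, not a model of a real converter: the function names, the 1 kHz test tone, and the 1 ms duration are all assumptions made for the example, and the rounding of each reading to a fixed number of bits is left out here.

```python
import math

def sample_signal(signal, sample_rate_hz, duration_s):
    """Read signal(t) once per clock tick and return the list of readings."""
    num_samples = int(round(sample_rate_hz * duration_s))
    journal = []
    for n in range(num_samples):
        t = n / sample_rate_hz        # time of the n-th alarm-clock tick
        journal.append(signal(t))     # "write down the voltage in the journal"
    return journal

def mic_voltage(t):
    """Hypothetical microphone line: a 1 kHz sine wave, 1 V peak."""
    return math.sin(2 * math.pi * 1000 * t)

# One millisecond of audio at the CD clock rate.
journal = sample_signal(mic_voltage, sample_rate_hz=44100, duration_s=0.001)
print(len(journal))   # 44 journal entries
```

The clock rate alone decides how many entries end up in the journal; the signal itself has no say in when it gets measured.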
The Completion of the Analogy
The recreation of a digital signal, when it is to be re-experienced as an analog signal,
involves a very similar circuit analogy.
In the digital world, a little man in a tiny house sleeps only to be awakened by an alarm
clock (when it is time for him to recreate the analog audio). When the alarm goes off, he
goes out and attaches a (digital) volt meter to an electrical line. He then turns a knob that
adjusts a voltage on that line until it reads the same as the measurement that he had
written down in his journal. He then goes back to sleep until the alarm clock goes off
again. Then he goes out and adjusts the voltage again until it matches the next entry in
his journal.
Term: This act of creating a voltage on a wire that follows the digital recording of
journal entries is called a “Digital to Analog Converter” or D/A converter.
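The playback side of the analogy can be sketched the same way. The sketch below assumes a simple "hold" behavior, where the voltage on the line stays at the last journal entry until the next alarm; the function name and the tiny four-entry journal are made up for illustration. Real D/A converters normally follow this stage with a smoothing filter, which is not shown.

```python
def hold_output(journal, sample_rate_hz, t):
    """Voltage on the output line at time t: the most recent journal entry."""
    n = int(t * sample_rate_hz)             # index of the last tick at or before t
    n = max(0, min(n, len(journal) - 1))    # stay inside the journal
    return journal[n]

journal = [0.0, 0.5, 1.0, 0.5]   # made-up journal entries, in volts
rate = 4                         # made-up clock: 4 alarms per second

print(hold_output(journal, rate, 0.0))   # 0.0
print(hold_output(journal, rate, 0.3))   # 0.5 (still holding the entry from t = 0.25)
print(hold_output(journal, rate, 0.6))   # 1.0
```

Between alarms the output does not move, which is why the raw output of this stage looks like a staircase rather than a smooth wave.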
Errors in these processes result mostly in noise added to the signal rather than
conventional analog distortion products. We’ll return to distortion products in a few
Sample Rate
Nyquist Theorem: Not actually developed by Harry Nyquist - tell story.
A scientist/mathematician with Bell Labs proved many years ago that in order to describe
a signal by digital sampling, you need at least two points for every cycle. Another way to
state the same principle is to say that you need to sample at no less than twice the
highest frequency in your signal of interest.
If audio is to consist of frequencies up to 20 kHz, then the sample rate must be at least
40 kHz.
Term: The rate at which samples are taken is called, obviously enough, the “Sample
Rate”, but is also often erroneously referred to as the “Nyquist Sampling Rate”.
However, there are several reasons why, in the real world, the sample rate needs to be at
least a bit higher than twice the rate of the highest signal of interest (and benefits from
being a large amount higher).
1. If there are any signals of interest higher in frequency than half the sample rate, they
violate the Nyquist Theorem and, instead of just disappearing, they show up in your
audio signal as nasty distortion elements. (i.e. difference tones)
Term: These are called Aliasing Frequencies. Alias means to appear under another
name or disguise, but in this context, it also means to appear as a different or changed
frequency.
Alias signals are folded down around ½ the sample rate, so their frequencies bear no
musical relationship to the original source. Sampled at 40 kHz, a 21 kHz input signal
would come out transformed into a 19 kHz sampled signal (40 kHz minus 21 kHz). Instead
of being 1 kHz higher than ½ the sample rate, it becomes 1 kHz lower.
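The folding arithmetic can be checked with a short Python sketch. The function name is hypothetical, and the sketch assumes an ideal sampler with no filter in front of it.

```python
def alias_frequency(input_hz, sample_rate_hz):
    """Frequency at which a sampled tone shows up after folding."""
    folded = input_hz % sample_rate_hz       # sampling cannot tell f from f + k * rate
    if folded > sample_rate_hz / 2:
        folded = sample_rate_hz - folded     # reflect about half the sample rate
    return folded

print(alias_frequency(21000, 40000))   # 19000
print(alias_frequency(19000, 40000))   # 19000 (below half the rate, unchanged)
print(alias_frequency(25000, 40000))   # 15000
```

Note that a genuine 19 kHz tone and an aliased 21 kHz tone land on the same frequency, so once the samples are taken there is no way to tell them apart.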
These changed signals are so unattractive to experience that signals above ½ the sample
rate must be removed by a special filter before sampling. There must be room for such a
filter to work (keeping in mind the portion of the audio band needed for the transition
band and start of the stop band in the filter), and indeed the audio sounds better if more
room is left and the filter is made to be gentler. This is part of the reason why higher
sampling rates often sound better than lower sampling rates (e.g. 96 kHz sampling sounds
better than
Term: A filter to remove these offending signals is called an “Anti-Aliasing” filter.
2. There are two things that are analyzed by the human brain when it is absorbing audio
input. One is the complex interplay of frequencies from the lowest to the highest within
the range of human hearing. But the other is the phase differences as signals appear at
the left and right ear. (This goes back to a time when humans needed to know if
something was about to attack and eat them.)
Phase information is how we localize where a signal is coming from, and presenting that
phase information precisely allows for the brain to perceive where the instruments are on
the sound stage to a very exact degree.
Higher sample rates can more effectively preserve this phase information, although
the improvement diminishes as the sample rate goes up.
3. Another part of the listening experience is the effect of an acoustic shock wave. A
physical event which results in an impact wave, such as an explosion or the crash of two
objects as they collide, sets up shock waves which the brain analyzes by their phase more
than their frequency. Shock waves don’t have a coherent frequency, although they often
result in a noise component following them. Even a bass drum is often more shock wave
than frequency, as are Fourth of July fireworks.
This is best recreated if the sampling rate is high, thus allowing more accurate impact
phase information to be transmitted.
Note: These are some of the reasons that audio recorded at higher sample rates may
sound a bit more vivid and life-like than audio sampled at lower rates. However, the
higher the sample rate, the less important the contribution of moving to even higher
sample rates.
Bit Depth and Audio Resolution
What does bit depth have to do with audio resolution, or quality? It defines the accuracy
with which you can measure, record, manipulate and reproduce an audio signal.
Tell the story of measuring the plexiglas box.
Tell the story of measuring your child’s height with 3’ increments, then 1.5’, etc.
(Zero to 7’ requires 8 steps of 1’ each; zero to 7.5’ requires 16 steps of ½’ each.)
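The step counts in the height story can be tied to bit counts with a small Python sketch. The function name is hypothetical; it assumes the measuring range starts at zero and that zero counts as one of the steps.

```python
def bits_needed(max_value, step):
    """Return (number of steps, bits required) to cover 0..max_value in the given step."""
    levels = round(max_value / step) + 1     # count zero as a step
    bits = (levels - 1).bit_length()         # smallest bit count that can label every step
    return levels, bits

print(bits_needed(7, 1))       # (8, 3)
print(bits_needed(7.5, 0.5))   # (16, 4)
```

Halving the step size doubles the number of steps and costs exactly one more bit.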
Some Results of Varying Bit Depth
Every time you add a bit of measurement to a digital audio system, you double its
accuracy and lower its noise by about 6 dB.
If you add 2 bits, you are 4 times as accurate and therefore about 12 dB quieter.
Note: In digital audio, if you add 8 bits (such as going from 16 bits to 24 bits in a
workstation) you are 256 times more accurate and, in theory, have lowered the noise floor
of the system by 48 dB.
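The "about 6 dB per bit" rule can be checked with a short Python sketch. It assumes the simplest theoretical model, where the noise floor sits one measuring step below full scale; the function name is hypothetical.

```python
import math

def noise_floor_db(bits):
    """Theoretical ratio of full scale to one step, in dB: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bits)

print(round(noise_floor_db(1), 1))    # 6.0
print(round(noise_floor_db(8), 1))    # 48.2
print(round(noise_floor_db(16), 1))   # 96.3
print(round(noise_floor_db(24), 1))   # 144.5
```

Each bit contributes the same 6.02 dB, so the figures for 8 and 24 bits are simply 8 and 24 times the single-bit value.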
Further Note: The interesting thing is that 24-bit A/D converters have a noise floor that
is down 117 dB or, in a rare few instances, 120 dB. At 24 bits they should have a noise
floor that is down 144 dB. Why can’t they make a more accurate A/D converter than that?
The answer is that the largest noise contribution doesn’t actually come from inaccuracies
in the digital converter itself. It comes from the analog input stage to the converter
whose molecules are excited by the heat of our living environment. It is
not easily possible, by the laws of physics, to get the analog noise down any further,
because at room temperature (290 K as opposed to absolute zero) the random motion of
molecules in the signal path creates random electrical energy at that –120 dB level. There
is no easy way to reduce that noise level in the real world, other than freezing the
environment to near absolute zero, which would kill any human performers! So the
element that is holding digital audio back from being more accurate at this time is actually
the limitations of analog audio! You may remember a similar discussion when we
analyzed noise sources in mic pres.
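A rough sense of where that analog limit comes from can be had from the standard thermal (Johnson) noise formula, v = sqrt(4 k T R B). The sketch below is only a ballpark illustration: the 1 kΩ source resistance, the 20 kHz bandwidth, and the 1 V RMS reference level are assumptions chosen for the example, not figures from any particular converter.

```python
import math

BOLTZMANN = 1.380649e-23   # Boltzmann constant, joules per kelvin

def thermal_noise_volts(temp_k, resistance_ohms, bandwidth_hz):
    """RMS thermal noise voltage of a resistance: sqrt(4 * k * T * R * B)."""
    return math.sqrt(4 * BOLTZMANN * temp_k * resistance_ohms * bandwidth_hz)

# Assumed values: room temperature, 1 kOhm source, 20 kHz audio bandwidth.
noise_v = thermal_noise_volts(temp_k=290, resistance_ohms=1000, bandwidth_hz=20000)
noise_db = 20 * math.log10(noise_v / 1.0)   # relative to an assumed 1 V RMS reference

print(noise_v)    # a little under 0.6 microvolts
print(noise_db)   # roughly -125 dB with these assumed values
```

Because the noise voltage depends on the square root of the temperature, cooling the room by a few degrees changes almost nothing; only temperatures near absolute zero make a real difference.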
Digital Audio Terms and Principles:
Sample Rate (Covered) (Use 44.1 kHz for recording music for CDs rather than a higher
sample rate followed by downconverting!)
Bit Depth (Covered)
Aliasing (Covered)