Signal processing for Direct Stream Digital

A tutorial for digital Sigma Delta modulation and 1-bit digital audio processing
Derk Reefman
derk.reefman@philips.com
and
Erwin Janssen
erwin.e.janssen@philips.com
version 1.0
18 December 2002
Contents
1 Introduction
2 Characteristics of Direct Stream Digital
2.1 Example: Filtering
2.2 Example: Non-linear operations
2.3 Example: Anti-aliasing filters
3 Sigma Delta Modulation
3.1 Overview
3.2 A linear model
3.3 Bit stream
4 Characteristics of SD modulators
4.1 SDM silence
4.2 SDM stability
4.3 Idle tones
5 Design of SDM modulators: I
5.1 Loop-filter design
5.2 Enforcing SDM stability
6 Design of SDM modulators: II
7 Signal processing
8 Dithering and linearizing SDM’s
9 Non-linearity in a SDM
9.1 Pre-correction
9.2 SDPC and dither
9.3 Performance of a realistic SDM with SDPC
10 Acknowledgements
A SDM-code
Glossary
ADC: Analogue-to-Digital Converter. This device converts analogue input signals (from,
e.g., a microphone) to a digital signal that can be used in computations (for example in a
PC program).
(Anti-) aliasing filter: Filter designed to remove any signal components above the Nyquist frequency.
Authoring: Process in which the final disc image is created. This includes lossless compression, creation of the table of contents, etc.
Class-D: Amplifier topology that relies on Pulse Modulation. The pulses drive switches
which connect the load (loudspeaker) either to the positive or negative supply voltage.
Characterised by high efficiency; often also called ‘digital amplifier’.
Clipping: The phenomenon that when a format is designed to handle signal levels no larger
than a level C, every level larger than C is coded as C. For example, the digital format on
a CD cannot handle more than 65536 sub-levels; any signal corresponding to a level larger
than +32767 is represented as +32767 (and likewise for negative signals less than -32768).
Clock jitter: Technically the unwanted phase shift of digital pulses over a transmission
medium. A discrepancy between when a digital edge transition is supposed to occur and
when it actually does occur.
DAC: Digital-to-Analogue Converter: the reverse of an ADC.
Distortion: Any deviation from a linear input/output relationship, where a linear relationship is defined such that the output equals (apart from a constant gain factor) the
input.
Dithering: The addition of a (quasi-)random number to the signal which is subsequently
quantised. Due to the dither, the quantization appears as an (almost) linear process.
DSD: The digital format stored in Super Audio CD. DSD is a format in which 2822400
times per second a 1-bit signal is stored. Lowpass-filtering this signal will restore the
original waveform.
DST encoding: Direct Stream Transfer, a lossless compression algorithm specifically tailored
to the lossless compression of DSD signals.
Editing: In its simplest form, editing is the process of ‘cutting and pasting’ the music such
that undesirable parts of the recording are removed. Often, volume changes are also applied
and mixing of different channels is performed.
Filter ringing: The effect that a filter with a steep transition band in the frequency domain
produces artefacts in the time domain that extend over a significant period of time.
Idle tone: Tone appearing at the output of a noise shaper that bears a simple relation to
the input of the Sigma Delta Modulator.
Limit cycle: Signal at the output of a Sigma Delta Modulator that requires a precisely
defined input in order to occur, and disappears if the input deviates slightly from the
mentioned precise value.
Linearity: See distortion.
Lossless compression: A way of compacting digital audio streams such that, when they are
unpacked, the original stream is restored. Comparable with the ‘ZIP’ program on PCs.
Mastering: Process in which the edit master is subject to processes such as EQ to obtain
the best sound performance.
Matching: The accuracy to which electronic components are the same. This is important
if an electronic circuit relies on the cancellation of two signals: if the components are not
exactly identical, a residual (undesirable) signal will remain.
Noise shaping: The shift of spectral content of the (quantization) noise. For example, in a
Sigma Delta Modulator the energy of the quantization noise is shifted to high frequencies,
leaving no or little noise at low frequency.
Nyquist Frequency: The largest frequency that can be represented by a digital format; the
Nyquist frequency is half the sample frequency.
PCM: Pulse Code Modulation. A digital format, used for example in CD, whereby a digital
signal is represented by an accurate representation (e.g., 16 bits, meaning that the range
-1,+1 is subdivided in 65536 sub-intervals) of the wave form at equidistant points in time
(for example, in CD 44100 times per second a 16-bit approximation of the wave form is
stored).
Pulse Density Modulation: A form of pulse modulation where a large positive signal is
represented by a long series of positive pulses; a zero signal is represented by alternating
positive and negative pulses.
Recording: The process of storing the music signals on a medium - either in analogue form
or in digital form.
(Re-)Quantization: The mapping of a signal of infinite precision to a signal with limited
precision. On a CD, e.g., a signal is quantized to 16 bits.
Sigma Delta Modulator: Device which transforms an analogue or PCM signal into a DSD
signal. Often abbreviated to SDM, and also often referred to as Delta Sigma Modulator.
Super Audio CD: Super Audio Compact Disc. Format for music distribution proposed by
Philips and Sony. Super Audio CD is based on a new digital format called DSD.
Topology: Particular way of connecting building blocks to create a circuit.
Up/Down sampling: A signal processing technique whereby the sample rate of a digital
signal is enlarged or reduced. In the latter case, this also corresponds to a loss of information.
1 Introduction
The introduction of Super Audio Compact Disc (SACD) as a successor to the CD has
introduced the need for a change in signal processing. Underlying this change is the
radically different signal format that is adopted in SACD compared to CD. Whereas in
CD the audio format is Pulse Code Modulation (PCM), a 16-bit word at a sample
rate of 44100 samples per second, for SACD this is Direct Stream Digital (DSD), a 1-bit word at a sample rate of 64 times 44100 samples per second. In the early nineties,
the time of the conception of DSD, analogue-to-digital converters (ADCs) and digital-to-analogue converters (DACs) were built with 1-bit technology [9]. The driving forces for
the use of this technology were purely technical: in the CD era, demands for distortion levels
were becoming more stringent, and it proved virtually impossible to create low-distortion
devices with many (16) bits. By contrast, it was much easier to create low-distortion
converters using a digital format of 1 bit, running at very high sample rates
such as 64 or 128 times 44.1 kHz. The conversion of this high-speed, 1-bit format to the
44.1 kHz/16-bit CD format can easily be accomplished in the digital domain using filtering
and signal processing, which does not introduce any non-linear distortion. This technique
has been highly successful, and the so-called ‘oversampling’ and/or ‘bitstream’ technology
dramatically increased the performance of CD players in the nineties. In fact, those
CD players were all generating their own DSD internally from the CD source; this DSD
would then be fed into a high-quality, 1-bit DAC.
It therefore seemed logical to introduce a format that would store this 1-bit output directly,
instead of the ‘intermediate’ CD format: in this way, all filtering and signal processing
needed to convert to and from the 1-bit format is eliminated which, by definition, can only
increase the sound quality. After the first experiments with DSD, it appeared indeed that
the sound quality was significantly better compared to the 44.1 kHz/16 bit format. Also,
at the same time, new ADCs and DACs were appearing on the market, that were still
using high sample rates (64 or 128 times 44.1 kHz), but exploited a few bits (1.5 to 5)
instead of 1. Again, this had purely technical fundamentals: ingenious tricks to reduce the
distortion problems of a multi-bit converter had appeared, and were feasible to implement
for a limited number of bits. Because 1-bit converters are more sensitive to clock-jitter,
the ‘few-bit’ converters took their place in the high-end audio market. This re-introduced
the need for some mild signal processing, because SACD can only store a 1-bit format.
Interestingly, this did not lead to any observable degradation in sound quality. Therefore,
it is now believed that the very high sample rate of DSD is the key factor in the extremely
good sound quality of SACD. The fact that the data is 1 bit instead of a few bits, however, has
retained its value because it reduces the storage requirements of the audio, thus creating
the possibility to store over 70 minutes of stereo and multi-channel DSD on a single Super
Audio CD.
The purpose of this document is to explain some technical details of Direct Stream Digital.
It tries to give an overview of several signal processing steps which are needed in the world
of DSD, which are different from the accustomed way of doing things. Its purpose is not
to give a full explanation of the perceived sound quality of DSD; this white paper is meant
to be an introduction to DSD and DSD signal processing for the educated ‘DSD-novice’.
Reflecting the importance for SACD, a crucial part in this paper is the 1-bit Sigma Delta
Modulator (SDM). The design of such a device will be discussed in detail, and a working
example will be designed to illustrate the design process.
Another important issue that will be dealt with, is DSD signal processing. A typical signal
processing chain for DSD is provided in Fig. 1. In Fig. 1, several steps are envisaged
which occur typically in the creation of an SACD. Most of these steps involve analog or
digital signal processing in one way or another. Starting with the AD converter, this is not
necessarily a native 1-bit converter. Often, high-end AD converters are 3-6 bit converters
running at sample rates between 128 fs and 512 fs, where fs denotes a sample rate
of 44.1 kHz or 48 kHz. These signal formats need to be converted to 1-bit formats, where
any change to the signal information is to be avoided. As this introduces the need for
a 1-bit SDM, we will start with some introduction to Sigma Delta Modulation, and the
various options that exist to realize a SDM.
In the editing phase, volume adjustments need to be done, and switching between bit
streams is necessary. Switching of bit streams is a technique which is rather different from
standard signal processing, and is detailed in a separate document [12]. In the mastering
phase, heavy signal processing is often involved, ranging from relatively simple equalization
to sophisticated reverberation techniques. In the sequel, it will be demonstrated how most
of the sophisticated techniques developed for PCM can be easily adjusted for application
to DSD. In this respect, it is essential to realize that DSD at 64 fs is a consumer format and hence not necessarily the format that is used in the studio, which can in principle be any
format as long as it is of equal or better quality compared to standard DSD.
In the authoring phase, finally, no changes to signal content are made anymore. However,
in most cases the format of the data will be transformed to DST (Direct Stream Transfer),
which is the compressed format of DSD. This lossless compression scheme allows multi-channel, high quality DSD data to fit on the approximately 4.7 Gbyte of a high density
layer of an SACD disk.
2 Characteristics of Direct Stream Digital
Before diving into the generation of Direct Stream Digital (DSD), we will first review some
characteristics of the format as it is used within the context of Super Audio CD. First
and foremost, DSD characterizes itself by the huge sample rate of 64 times 44.1 kHz, or
2.8 MHz. Rather irrespective of the number of bits, high sample rates in the digital world
are desirable because the larger the sample rate, the less the audio artefacts introduced by
the time quantization. We will review a few examples which demonstrate
that 44.1 kHz (or a small multiple of it) is not enough to avoid significant signal distortions
due to the time quantization.
Figure 1: Typical signal processing chain for DSD applications (Recording, Editing, Mastering, Authoring/DST encoding, SACD Player).
Figure 2: The effect of warping the analog frequencies to the limited range of digital frequencies; the frequencies are in reduced units (i.e., 0 . . . π). Red: the warped frequency. Green:
the original frequency. The vertical line shows till what frequency the warped frequency can
be considered to be an accurate representation of the actual frequency. [Plot: warped frequency versus analog frequency.]
2.1 Example: Filtering
A well-known issue in (time discrete) digital systems is the problem of mapping (‘warping’)
the infinitely high frequencies, which are allowed in a time-continuous (analogue) system,
to a system where the highest representable frequency is the Nyquist frequency (half the
sample rate). Obviously, as the ultimate goal of digital signal processing is to present an
improvement over analog signal processing, this is a very serious issue. Exemplary for this
problem is the bi-linear transform, which maps the analogue frequency ωa to the digital
frequency ωd according to:
$$\omega_d = \frac{2}{T} \arctan\left(\frac{\omega_a T}{2}\right) \tag{1}$$
where T is the sampling period. As illustrated in Fig. 2, this mapping is almost linear only
for a limited frequency regime; for frequencies above 0.1 fs quite substantial deviations
occur. As a result, mapping an analog filter (say, a Butterworth filter) to its digital
equivalent causes significant distortion of its frequency response. If the sample rate is very
high (as, for example, with DSD) the mapping artefacts are benign in the frequency regime
which is most important for audio reproduction. Obviously, it is still possible to create a
filter which has the characteristics of a digital filter at low sample rate; hence with the use
of DSD, one has significant freedom in the choice of filters and filter characteristics.
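The size of the warping can be made concrete with a small numerical sketch of Eq. (1), rewritten in Hz as f_d = (fs/π)·atan(π·f_a/fs). The function name and the choice of a 20 kHz test frequency are illustrative assumptions, not part of the original text.

```python
import math

DSD_RATE = 64 * 44100  # DSD sample rate in Hz
CD_RATE = 44100        # CD sample rate in Hz

def warped_frequency_hz(f_analog_hz, fs_hz):
    """Bilinear-transform mapping of Eq. (1), expressed in Hz instead of rad/s."""
    return (fs_hz / math.pi) * math.atan(math.pi * f_analog_hz / fs_hz)

# Compare how far a 20 kHz analog frequency is displaced at both rates.
for name, fs in (("CD", CD_RATE), ("DSD", DSD_RATE)):
    fd = warped_frequency_hz(20000.0, fs)
    deviation = 100.0 * (20000.0 - fd) / 20000.0
    print(f"{name}: 20 kHz maps to {fd:.1f} Hz ({deviation:.3f} % low)")
```

At the CD rate the top of the audio band is displaced by several kilohertz, whereas at the DSD rate the displacement is only a few hertz, which is the sense in which the mapping artefacts are benign for DSD.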
2.2 Example: Non-linear operations
In audio signal processing, operations such as compression/limiting and clipping are quite
common. In compression/limiting, the gain of the signal is adjusted according to the signal;
this clearly represents a non-linear operation. Also in clipping, the signal transfer is highly
non-linear. If these non-linear operations are performed in the analog domain, they will
cause higher harmonics to appear. For example, if a 14 kHz signal is clipped, this will give
rise to a third harmonic component at 42 kHz. In the analog domain, this could then be
filtered off, if desired. If the clipping were done in the digital domain at a sample rate
of 44.1 kHz, however, the 42 kHz harmonic would alias back to low frequency: 42 kHz
is 19.95 kHz above the Nyquist frequency (22.05 kHz). The third harmonic would thus
be aliased to (22.05-19.95) = 2.1 kHz, which would give very audible distortion as that
frequency is not harmonically related to 14 kHz. Also, there is no way to remove this
distortion by a filter operation. The only remedy is to up-sample to a high frequency,
and do the non-linear operation at that high rate, thus ensuring that only high frequency,
high order harmonic components are aliased. This causes less harm, because high order
components tend to be of lower amplitude. Then, down-sample with the appropriate low
pass filtering again to 44.1 kHz. Now, obviously, in DSD the sample rate is so high that
non-linear operations behave as they would in the analog domain. Hence, no up- and down
sampling is required, and the decision whether to remove high order distortion components
or not is up to the sound engineer - and not dictated by the format.
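The aliasing arithmetic of this example can be reproduced with a short sketch. The helper name `aliased_frequency_hz` is an assumption made for illustration; it simply folds a frequency back into the range from 0 to the Nyquist frequency.

```python
def aliased_frequency_hz(f_hz, fs_hz):
    """Frequency at which a component at f_hz appears after sampling at fs_hz."""
    r = f_hz % fs_hz              # spectrum is periodic in fs
    if r > fs_hz / 2:             # above Nyquist: mirror around fs/2
        r = fs_hz - r
    return r

fundamental = 14000               # Hz, the clipped tone from the text
third_harmonic = 3 * fundamental  # 42 kHz

# At the CD rate the third harmonic folds back into the audio band.
cd_alias = aliased_frequency_hz(third_harmonic, 44100)
# At the DSD rate it stays where an analog system would put it.
dsd_alias = aliased_frequency_hz(third_harmonic, 64 * 44100)
print(cd_alias, dsd_alias)
```

The same helper can be applied to higher harmonics (5 × 14 kHz, 7 × 14 kHz, and so on) to see which of them land in the audio band at a given sample rate.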
2.3 Example: Anti-aliasing filters
Because of the extremely high sample rate, DSD sets only very relaxed requirements for the
anti-aliasing filters, which, hence, can be chosen to be rather sloppy. As a result, the ringing
in the time domain is substantially lower compared to systems of lower sample rate where
steep anti-aliasing filters are mandatory. This effect is clearly illustrated in Fig. 3. The
impulse responses of 4 different systems in a multi-channel configuration are depicted: a
48 kHz system, with a bandwidth of 20 kHz (that is, 8 kHz transition bandwidth is allowed
for anti-aliasing filtering), a 96 kHz system with 35 kHz bandwidth (26 kHz transition
bandwidth), a 192 kHz system with 75 kHz bandwidth (42 kHz transition bandwidth)
and an SACD system with 95 kHz bandwidth (and about 120 kHz transition bandwidth).
Though none of the systems reproduce the input exactly, the DSD system shows the least
artefacts. Clearly, the 48 kHz system has great difficulty in reproducing the click; due to
the steep filtering it starts ‘wobbling’, or ringing, at a -30 dB level with respect to the
top of the response approximately 0.4 ms before the click, which is very audible (this is
also the reason why many people prefer ‘sloppy’ anti-alias filters in CD-players; even at
the cost of reduced anti-aliasing characteristics). It also continues to ring after the click
for the same length of time, but most probably this ‘after-ringing’ is audibly masked by
the click itself and, hence, is not as important as the pre-ringing. Apart from this effect,
the amplitude is also only a fifth of what it should be. Especially when the sound
traverses a non-linear medium, such as the human ear, this may lead to even larger
perceived differences than what can be concluded directly from Fig. 3. Also at the higher
sampling frequencies, the ringing phenomenon cannot be removed, though it is reduced
significantly. Only the DSD system is very effective in suppressing the ringing effect, due
to very slow filtering above 95 kHz. The price to pay for this is the increase in noise floor
with respect to the other systems; however, as the noise floor contains only high frequency
components which are uncorrelated with the audio, they are not perceptible.
Figure 3: Responses (from left to right) of a DSD, a 192 kHz, a 96 kHz and a 48 kHz
system on a -6 dB block input (‘click’) of 3 µs duration, and amplitude 0.25. Note the
linear amplitude scale. [Plot: amplitude versus time (s).]
3 Sigma Delta Modulation
In this section, it will be assumed throughout that the sample rate equals 64 times 44.1 kHz
(≈ 2.8 MHz), i.e., the sample rate of SACD. By far the most common way to generate such a
1-bit DSD stream is by the use of a Sigma Delta Modulator (SDM), although it is nowhere
stated in the definitions of Super Audio CD [10] that the bit-stream present on the disk
must be generated by a SDM.
In fact, recently many other methods have been developed which are not simply a (single)
SDM. For example, in [3] a type of SDM with an elaborate re-ordering scheme is presented,
and in [5] a so-called Trellis-SDM is presented. In [11], a cascaded structure of two SDMs
is presented, which will be discussed in a slightly modified form in Sec. 9. All of these
new developments have in common that their performance is in some way better than that
of an ordinary SDM, but at the same time there is a substantial increase in complexity.
Because a single SDM is still at the basis of all these new developments, and because a
standard SDM is still by far the most widely used device to generate a bitstream, we will
continue by elaborating on the principles of a simple SDM.
Figure 4: Typical output spectrum of an SDM (4 kHz, -6 dB input). [Plot: power (dB) versus frequency, logarithmic frequency axis.]
3.1 Overview
Sigma Delta Modulation, often also known as noise shaping, is in most general terms a
technique which allows (digital) quantization errors to be spectrally shaped. In the SDMs
that are typically used for DSD applications, the aim of this spectral shaping is to push
the gross quantization errors made by the coarse 1-bit quantizer to high frequencies,
where these errors are inaudible. This is possible due to the high oversampling factor of 64,
which leaves a band from approximately 80-100 kHz (which is determined by the maximum
allowable input, as will be discussed later in Sec. 5) to 1.4 MHz (the Nyquist frequency)
to accommodate virtually all the quantization errors. An illustration of this phenomenon
is given in Fig. 4.
Indeed, the spectrum illustrates that this SDM design allows for a very high dynamic range
in the audio band (0-20 kHz), decreasing dynamic range in the band from 20 to 80-100 kHz,
from where the dynamic range remains constant till 1.4 MHz.
Schematically, a SDM can be represented as in Fig. 5.
Figure 5: Above: Sigma Delta structure (in feed forward configuration), with input u, loop filter H(z), quantizer Q and output y. Below: equivalent
noise shaper structure, with input u, quantizer Q, filter F(z) and output y.
Historically, the SDM is preceded by the noise-shaper (NS) (also see Fig. 5). The most
significant difference between a noise shaper architecture and a sigma delta structure is
the position of the filter: in a noise shaper, the filter is in the feedback loop, whereas in a SDM
the filter is in the feed-forward path. Due to the filter in the feedback loop, the error of the
quantizer is spectrally shaped by the filter F(z) and fed back to the input of the quantizer.
It is this process which is called noise shaping of the quantization error. Though this
appears rather different from a SDM, the noise shaper structure is virtually identical to
the SDM topology. In fact, the SDM and the NS in Fig. 5 are identical if the filter F(z) in
the noise shaper equals F(z) = H(z)/(H(z) + 1). In that case, the input still needs to be
pre-filtered by the filter H(z)/(H(z) + 1) to obtain an identical signal transfer function.
It is important to realize that, because of their equivalence, both a noise shaper and an
SDM perform noise shaping of the quantization noise. Because of that reason, a SDM is
often (mistakenly) called a noise shaper, even though the topology of a noise shaper is
different from a SDM.
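The equivalence condition can be verified in a few lines. The sketch below assumes a particular sign convention for both structures (the quantizer error E is defined as quantizer output minus quantizer input, and it is subtracted after filtering in the noise shaper); the symbols V, E and U' are introduced here for the derivation only.

```latex
\begin{align*}
\text{SDM:}\quad & V = H\,(U - Y), \qquad Y = V + E
  \;\Longrightarrow\; Y = \frac{H}{1+H}\,U + \frac{1}{1+H}\,E \\
\text{Noise shaper:}\quad & V = U' - F\,E, \qquad Y = V + E
  \;\Longrightarrow\; Y = U' + (1 - F)\,E \\
\text{Equal noise shaping:}\quad & 1 - F = \frac{1}{1+H}
  \;\Longrightarrow\; F = \frac{H}{1+H} \\
\text{Equal signal transfer:}\quad & U' = \frac{H}{1+H}\,U
\end{align*}
```

Under these assumptions the quantization error sees the same transfer function in both structures, and the only remaining difference is the pre-filter H/(1+H) on the input of the noise shaper.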
The noise-shaper architecture is not often used in analog to digital converters because
matching in the analog domain is difficult, and thus leads to implementation problems.
Generally one resorts to SDM topologies, where one has less analogue problems. In the
digital domain, where precision is arbitrary, matching is not a fundamental problem and
both structures can be used. Because of the identical nature, we will restrict the discussion
to the SDM-like structures.
3.2 A linear model
For applications in SACD, the quantizer Q in a SDM is a 1-bit quantizer, which outputs
only values of +1 and −1. This is a highly non-linear element, which has its ramifications
on the operation of the SDM. To gain some initial insight in the characteristics of the
SDM, however, we will resort to a simple linear model and replace the highly non-linear
quantizer by a (linear) gain c and a noise source n, which models the quantization error,
as indicated in Fig. 6.
Figure 6: Linearization of the Sigma Delta structure. The quantizer is replaced by a (signal
independent) gain c and an additive noise source n. The signal transfer function STF and
noise transfer function NTF are defined by Y = STF·U + NTF·N, where Y is the Fourier
transform of the output y, U is the Fourier transform of the input u and N the Fourier
transform of the additive noise n.
Doing this, we can write for the signal transfer function (STF) and the noise transfer
function (NTF) the following expressions:
$$\mathrm{STF}(z) = \frac{c\,H(z)}{1 + c\,H(z)} \tag{2}$$
$$\mathrm{NTF}(z) = \frac{1}{1 + c\,H(z)} \tag{3}$$
Assuming that the quantizer gain c ≈ 1, this shows how, in a situation where the loop-gain
H(z) is very large, the signal transfer function approximates 1. The noise transfer function,
on the contrary, is negligible for large H(z). As the loop filter H(z) typically is a low-pass
filter with large low-frequency gain, this shows that in SACD applications the quantization noise is
suppressed in the audio band. In Fig. 4, for example, the loop filter is a Chebyshev type
II design with a corner frequency of 90 kHz.
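Equations (2) and (3) can be evaluated numerically to see the behaviour described above. The sketch below uses a deliberately simple, hypothetical loop filter (a first-order delaying integrator) rather than the Chebyshev type II filter of Fig. 4, and takes c = 1; all function names are illustrative.

```python
import cmath
import math

FS = 64 * 44100  # DSD sample rate in Hz

def loop_filter(z):
    """Hypothetical first-order loop filter H(z) = z^-1 / (1 - z^-1)."""
    return (1 / z) / (1 - 1 / z)

def stf(z, c=1.0):
    """Signal transfer function of the linear model, Eq. (2)."""
    h = loop_filter(z)
    return c * h / (1 + c * h)

def ntf(z, c=1.0):
    """Noise transfer function of the linear model, Eq. (3)."""
    return 1 / (1 + c * loop_filter(z))

def on_unit_circle(freq_hz):
    """Point z = exp(j*2*pi*f/fs) at which the transfer functions are evaluated."""
    return cmath.exp(2j * math.pi * freq_hz / FS)

# Avoid f = 0, where this integrator has infinite gain.
for f in (1e3, 20e3, 100e3, 1e6):
    z = on_unit_circle(f)
    print(f"{f:>9.0f} Hz  |STF| = {abs(stf(z)):.4f}  |NTF| = {abs(ntf(z)):.4f}")
```

For this toy filter the signal passes with unit gain while the noise gain is very small in the audio band and rises above 1 towards the Nyquist frequency. A higher-order loop filter would suppress the in-band noise much more strongly, but as the next paragraph stresses, the linear model itself remains only a crude approximation.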
It is of crucial importance, however, to realize that the replacement of the quantizer by a
gain element c and an additive noise source, is a very crude approximation, the more so if
c = 1 is taken. Typically, the Signal-to-Noise Ratios (SNRs) as calculated from simulations
on the actual SDM with the non-linearity included differ significantly from those obtained
by the use of the linearized model. Also other characteristics, discussed in Sec. 4, are not
properly, or not at all, explained by the linearized model.
Figure 7: Above: A fourth order Sigma Delta structure in feed forward configuration.
Below: a fourth order feedback topology. If c′1 = c1/c4, c′2 = c2/c4, etc., the NTFs of these
modulators are identical.
There also exist other SDM realizations. Whereas the SDM structure in Fig. 5 is referred
to as a feed-forward topology, there also exist feedback topologies. A feedback topology
is displayed in Fig. 7. As in the comparison of the noise shaper vs. the feed-forward SDM,
there is some equivalence between a feedback and a feed-forward topology. We will see this
in a later section. The choice of which topology to use then depends on the design of
the complete system.
3.3 Bit stream
In Fig. 8, a characteristic output sequence of a SDM is shown, receiving a sine wave of
amplitude 0.95 and frequency 20 kHz as its input. Even though Fig. 4 leaves no doubt
about the very high accuracy with which the signal is represented in the SDM output,
it is hard to visualize the sine wave from a series of +1s and −1s. An idea is that the
signal that is represented by the bit stream can be obtained by taking a local average of
the bitstream: clearly, when the input sine wave is positive/negative, most bits that are
output from the SDM are positive/negative too, and outnumber the opposite bits by far.
Likewise, around zero input the number of positive and negative bits is roughly identical.
Hence, the global wave form of the underlying (low frequency) signal can be estimated
by taking a local average of the bit stream - akin to pulse-density modulation, which is
sometimes used in Class-D amplifiers. Obviously, this local average will not represent a
highly accurate representation of the wave form. A better impression about the accuracy
with which the input is represented is obtained by filtering the output of the SDM with a
filter which removes the signal in the DSD stream above 20 kHz (in fact, local averaging
is a low pass filter, albeit not a very good one).
Figure 8: Comparison of the DSD output of a SDM (red) and the input to the SDM (blue), plotted against sample number.
Clearly, in regions where the input sine wave is negative, the bits that are output from the
SDM are predominantly negative, and vice versa.
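The local-average idea can be tried out with a toy simulation. The sketch below assumes a first-order modulator (a single integrator followed by a 1-bit quantizer), which is far simpler than the modulators designed in this paper, and a 1 kHz sine of amplitude 0.5 as a hypothetical test input; the window length of 64 samples is likewise an arbitrary choice.

```python
import math

FS = 64 * 44100  # DSD sample rate in Hz

def first_order_sdm(samples):
    """Toy first-order 1-bit modulator: integrate (input - previous bit), take the sign."""
    state = 0.0
    prev_bit = 0.0
    bits = []
    for x in samples:
        state += x - prev_bit
        prev_bit = 1.0 if state >= 0.0 else -1.0
        bits.append(prev_bit)
    return bits

def moving_average(values, width):
    """Running mean over `width` consecutive samples (a crude low-pass filter)."""
    acc = sum(values[:width])
    out = [acc / width]
    for i in range(width, len(values)):
        acc += values[i] - values[i - width]
        out.append(acc / width)
    return out

n = 4096
u = [0.5 * math.sin(2 * math.pi * 1000 * k / FS) for k in range(n)]
bits = first_order_sdm(u)

width = 64
avg_bits = moving_average(bits, width)  # local average of the bit stream
avg_in = moving_average(u, width)       # same average applied to the input
max_err = max(abs(a - b) for a, b in zip(avg_bits, avg_in))
print("largest deviation of the local average:", max_err)
```

The local average follows the input but only to within a few percent of full scale, which illustrates why a proper low-pass filter, rather than simple averaging, is needed to judge the real accuracy of the bit stream.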
It is, therefore, informative to build a system as presented in Fig. 9.
This system allows us to compare the original input signal with the signal which has passed
through the Sigma Delta Modulator. To this end, the bit stream output of the SDM is
low-pass filtered with a steep filter, so as to remove any components above 20 kHz. The
input signal can be any signal of large enough resolution; below, we will take signals with
a (digital) resolution of 24 bits. Due to the filter after the SDM, the input signal has to be
delayed by an appropriate amount to compensate for the delay introduced by the filter. If
the input and output signals are subtracted, the residual signal ε can be inspected.
Figure 9: Setup which allows comparison of an upsampled, low rate high resolution signal
with its DSD equivalent. Note that the down sampling is necessary only for the purpose
of comparison.
In Fig. 10, two results are displayed: a first residual signal ε, where the input signal was a
sine wave (0 dB SACD, 1 kHz), and a second signal (0 dB SACD, 20 kHz). The resulting
signal is, for both inputs, noise-like, with an amplitude that corresponds to a resolution
of at least 120 dB. Likewise, this experiment can be performed with real audio signals,
where the result will be the same. Obviously, when the low-pass filtering applied in the
down sampling process does not suppress the noise above 20 kHz to a level of -120 dB, the
residual signal will be larger.
Figure 10: Time domain representation of the difference signal ε, for both a 1 kHz input
signal (red) and a 20 kHz signal (green). [Plot: absolute difference versus time (s).]
A separate issue that becomes clear from Fig. 8 is the fact that while negative parts of a
sine wave are represented by predominantly negative output values, the pattern in which
the +1s and −1s appear is never the same. This observation leads to the issue of editing
DSD streams. While this is an important topic, the reader is referred to [12] for a discussion
of editing and switching DSD, as this is a non-trivial issue.
4 Characteristics of SD modulators
Sigma Delta modulators represent a new class of devices, which display phenomena different from those we are used to in the PCM world. In the sequel, a few of these features which are
important in practical applications will be highlighted.
4.1 SDM silence
Sigma Delta modulators have some characteristics which we are not familiar with in the
PCM world. A first important aspect is that the output of the SDM always has a power
equaling 1, because the output can only take the values ±1. As a result, silence, as referred
to in DSD, only means that the power spectrum of the DSD signal is empty below a threshold
frequency, above which no signal can be perceived. For example, the following repetitive patterns
are often used and are referred to as DSD silence patterns:
pattern     hex code    pattern frequency
01010101    0x55        1.4 MHz
10101010    0xaa        1.4 MHz
10010110    0x96        352.8 kHz
01101001    0x69        352.8 kHz
Indeed, the patterns do not contain signal components below 80 kHz; however, they still
represent signals with a total power of 1. Often, these patterns are referred to by their
hexadecimal equivalent. The fact that these signals are silent, but still contain information,
can be exploited to use these signals as synchronization words [4].
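The pattern frequencies in the table can be checked with a small discrete Fourier transform over one period of each pattern. The sketch assumes that the bits are taken most significant bit first and mapped to the levels −1 and +1; the function names are illustrative.

```python
import cmath

FS = 64 * 44100  # DSD sample rate in Hz

def pattern_to_samples(byte):
    """Map the 8 bits of `byte` (MSB first) to the DSD levels -1 and +1."""
    return [1.0 if (byte >> (7 - i)) & 1 else -1.0 for i in range(8)]

def dft(samples):
    """Plain DFT of one period; bin k corresponds to the frequency k * FS / len(samples)."""
    n = len(samples)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, x in enumerate(samples))
            for k in range(n)]

def lowest_component_hz(byte, tol=1e-9):
    """Frequency of the lowest non-zero spectral line of the endlessly repeated pattern."""
    spectrum = dft(pattern_to_samples(byte))
    for k in range(len(spectrum) // 2 + 1):
        if abs(spectrum[k]) > tol:
            return k * FS / len(spectrum)
    return None

def mean_power(byte):
    """Average power of the pattern; always 1 for a signal that only takes the values -1 and +1."""
    s = pattern_to_samples(byte)
    return sum(x * x for x in s) / len(s)

for code in (0x55, 0xAA, 0x96, 0x69):
    print(hex(code), lowest_component_hz(code), mean_power(code))
```

None of the four patterns has a DC component or any spectral line below 352.8 kHz, yet each carries a total power of 1, in line with the statement above.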
4.2 SDM stability
Another important aspect is that, while the output of the SDM varies between ±1, its input
most often cannot vary over this range because the SDM becomes unstable for inputs of
high amplitude. While a full theoretical description of this phenomenon is still lacking, a
wealth of heuristic knowledge [9] is available on the stability of higher (> 2) order SDMs.
Because of all this experimentally obtained insight, accurate descriptions of instability are
present that can be used in the design of properly functioning modulators (see Sec. 5).
In fig. 11, the performance of a SDM is shown as function of its input amplitude. Clearly,
above a certain threshold, the performance collapses (in fact, the SDM gets into wild
oscillations if no precautions are taken). The exact amplitude at which the sudden collapse
occurs depends on the waveform of the input, its frequency, and the SDM design,
and is thus not an easy quantity to determine.

Figure 11: Graphical representation of the stability problems for large inputs: for a simple
SDM (in red) discussed in Sec. 5 the SNR collapses for signal amplitudes (in this case: a
4 kHz sine wave) of more than 0.59. In green the result of so-called ‘graceful degradation’
is shown. (Axes: Signal to Noise Ratio in dB, 0 to 140, versus input amplitude, 0.25 to 0.7.)

In section 5, precautions that can be
taken to prevent this uncontrolled behaviour are discussed, that lead to so-called graceful
degradation: instead of a sudden collapse in performance, the performance drops in a much
less aggressive way. This overload phenomenon is the reason why the SACD 0 dB reference
level has been set to 50% of the ‘maximum theoretically possible modulation depth’ [10];
in the cases discussed here this means that allowable input levels vary between −0.5 and
0.5. This definition introduces the possibility to allow signal levels which are larger than
0 dB, in contrast to PCM which has a clear limit at 0 dB: all inputs larger than 0 dB are
harshly clipped to 0 dB. As will be clear in following sections, for SDMs this overload is
possible at the limited cost of increased distortion (clipping of the internal integrators). In
this respect, the DSD format compares to analogue tape recordings, which also allowed for
serious signal overload, but also at the cost of significant distortion. Obviously, for high
fidelity recordings for Super Audio CD, the 0 dB level should never be crossed.
4.3 Idle tones
As discussed in Sec. 4.1, silence in DSD is often equivalent to having a high
powered tone outside the signal band. These tones are called ‘idle tones’. For higher
order SDMs, the 1-bit output signal still carries these idle tones, although they have much
reduced amplitude compared to the purely repetitive patterns shown in Sec. 4.1, and are
embedded in a large amount of uncorrelated noise. For non-zero DC inputs, these tones
start to move down in frequency with increasing DC level; at the same time, tones may
start to appear in the LF part, which can, potentially, be audible.
The origin of the tones appearing in the LF part lies in the feedback character of a SDM:
suppose, we have a DC input of 0.25. The most likely combination of bits which represents
that value is 1, −1, 1, −1, 1, −1, 1, 1. If this sequence is repeated, a tone at a frequency of
8 × 44.1 kHz will result. For each halving of the DC input, this frequency will be halved
too; eventually, this tone will end up below 20 kHz. This phenomenon can be reduced, or
even removed, in several ways, as will be discussed more extensively in Sec. 8, by dithering
and other means.
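
The arithmetic of this example can be sketched in Python. The sketch assumes a DSD bit rate of 64 x 44.1 kHz and, as a simplification, that a DC level of 2/N is always coded by the shortest repeating N-bit pattern; the function names are illustrative only.

```python
FS_DSD = 64 * 44100  # assumed DSD bit rate in Hz

def pattern_dc_and_tone(pattern):
    """DC value and repetition frequency (Hz) of an endlessly repeated +/-1 pattern."""
    dc = sum(pattern) / len(pattern)
    tone_hz = FS_DSD / len(pattern)
    return dc, tone_hz

def shortest_pattern_tone(dc):
    """Tone frequency if the DC level dc = 2/N is coded by a repeating N-bit pattern."""
    n_bits = 2.0 / dc
    return FS_DSD / n_bits

def first_audible_dc(start_dc=0.25, limit_hz=20000.0):
    """Halve the DC level until the associated tone drops below limit_hz."""
    dc = start_dc
    while shortest_pattern_tone(dc) >= limit_hz:
        dc /= 2
    return dc, shortest_pattern_tone(dc)

print(pattern_dc_and_tone([1, -1, 1, -1, 1, -1, 1, 1]))
print(first_audible_dc())
```

Under these assumptions the 8-bit pattern has a mean of 0.25 and repeats at 8 × 44.1 kHz, and five halvings of the DC level are needed before the tone falls below 20 kHz.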
If the SDM is undithered, audibility of these tones depends on the SDM used. Typically,
the higher the order of the SDM, the lower the power of the tone in the audio band. For
a typical (undithered) SDM the tones in the audible band are below -130 dB.
5 Design of SDM modulators: I
In this section, a fully operational SDM will be designed. We will use the linearized model
of the SDM to obtain values for the coefficients of the SDM, following to a large extent
the design route proposed in [2], and also discuss ways to ameliorate the stability problem.
From the start, it is important to know that the only way to obtain reliable insight in
the performance of the SDM, is by simulation; although the linear approximation usually
results in a working SDM, it is too crude to provide numbers about SNR and, even more
important, it does not provide any insight in stability.
Also, in the design process, we assume an effective quantizer gain c = 1. Simulations based
on this design can give some idea about what the effective gain c actually is within the
limitations of the linear model, and be used for further refinement of the loop-filter.
5.1 Loop-filter design
A very convenient way to start the design of a SDM modulator is the linear model of Fig. 6,
where we take the gain c = 1. We take a feed-forward structure from Fig. 7, and write
down the NTF that is associated with it. We can write for the loop-filter H(z):
$$ H(z) = c_1 \frac{z^{-1}}{1 - z^{-1}} + c_2 \left(\frac{z^{-1}}{1 - z^{-1}}\right)^2 + c_3 \left(\frac{z^{-1}}{1 - z^{-1}}\right)^3 + c_4 \left(\frac{z^{-1}}{1 - z^{-1}}\right)^4 \tag{4} $$
and making use of the relation $NTF(z) = 1/(1 + H(z))$ we arrive at:
$$ NTF(z) = \frac{(1 - z^{-1})^4}{(1 - z^{-1})^4 + c_1 z^{-1}(1 - z^{-1})^3 + c_2 z^{-2}(1 - z^{-1})^2 + c_3 z^{-3}(1 - z^{-1}) + c_4 z^{-4}} \tag{5} $$
which is to be recognized as a filter of the appearance $NTF(z) = (1 - z^{-1})^n / P_n(z^{-1})$. This
is the form of a Butterworth or a Chebyshev type II filter (see footnote 1); the choice of either of those
realizations dictates the final appearance of the polynomial $P_n(z^{-1})$. Likewise, the STF can
be computed as $STF(z) = 1 - NTF(z)$, resulting in:
$$ STF(z) = \frac{c_1 z^{-1}(1 - z^{-1})^3 + c_2 z^{-2}(1 - z^{-1})^2 + c_3 z^{-3}(1 - z^{-1}) + c_4 z^{-4}}{(1 - z^{-1})^4 + c_1 z^{-1}(1 - z^{-1})^3 + c_2 z^{-2}(1 - z^{-1})^2 + c_3 z^{-3}(1 - z^{-1}) + c_4 z^{-4}} \tag{6} $$
The approach that can now be followed is to design a high-pass filter for $NTF(z)$, according
to Butterworth or Chebyshev-II (or any other) rules, and reorganize terms such that it
is in the shape of Eq. (5). One way of approaching this is to use a symbolic manipulation
package such as Mathematica [14], or to collect terms in powers of $z$ and equate identical
powers. From an engineering point of view, a very easy way of obtaining the coefficients
$c_i$ is by recognizing that $1/NTF(z)$ is linear in the coefficients $c_i$. It is then possible to set
up a linear system for (at least as many as the order of the system) different values of $z$.
These values must have no simple relation to each other, but need not be complex. In this
way, it is also irrelevant whether the Butterworth filter is provided as a cascade of biquads,
or as a direct realization.
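
As an illustration of the "equate identical powers" route, the following Python sketch converts between the coefficients $c_i$ and the denominator polynomial of Eq. (5), written in powers of $x = z^{-1}$. It is a minimal sketch under the assumption that the denominator of the designed high-pass filter is already available as a coefficient list with leading term 1; the function names are hypothetical. Because the term with $c_k$ first contributes at power $x^k$, the system is triangular and can be solved by simple substitution instead of a general linear solve.

```python
from math import comb

def denominator_from_coeffs(c):
    """Return p[0..n], the coefficients of P_n in powers of x = z^-1, where
    P_n(x) = (1-x)^n + sum_{k=1..n} c[k-1] * x^k * (1-x)^(n-k)   (cf. Eq. (5))."""
    n = len(c)
    p = [0.0] * (n + 1)
    weights = [1.0] + list(c)            # the k = 0 term (1-x)^n has weight 1
    for k, w in enumerate(weights):
        for j in range(n - k + 1):       # binomial expansion of (1-x)^(n-k)
            p[k + j] += w * comb(n - k, j) * (-1) ** j
    return p

def coeffs_from_denominator(p):
    """Recover c_1..c_n from p by equating powers of x, lowest power first.
    Term k starts at x^k with unit weight, so the system is triangular."""
    n = len(p) - 1
    c = []
    for k in range(1, n + 1):
        known = comb(n, k) * (-1) ** k                     # from (1-x)^n
        for m, cm in enumerate(c, start=1):                # from c_1..c_{k-1}
            known += cm * comb(n - m, k - m) * (-1) ** (k - m)
        c.append(p[k] - known)
    return c

# Denominator of the Butterworth example, rounded to two decimals as quoted in the text.
p_rounded = [1.00, -3.13, 3.75, -2.03, 0.42]
print(coeffs_from_denominator(p_rounded))
```

With the two-decimal denominator quoted for the 150 kHz Butterworth example, this yields coefficients of roughly 0.87, 0.36, 0.08 and 0.01; the remaining differences with the ten-digit values in the text come from the rounding of the input.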
Footnote 1: albeit scaled such that the first term $c_0 z^0$ of $H(z)$ equals zero. If this term were non-zero, the resulting
SDM would not contain a delay in the closed loop and hence would not be realizable.

When we inspect the feedback structure (lower part of Fig. 7), we see that the transfer characteristic for the $NTF(z)$ is identical to the NTF of the feed-forward structure discussed
above. However, the STF is given by
$$ STF(z) = \frac{z^{-4}}{(1 - z^{-1})^4 + c_1 z^{-1}(1 - z^{-1})^3 + c_2 z^{-2}(1 - z^{-1})^2 + c_3 z^{-3}(1 - z^{-1}) + c_4 z^{-4}} \tag{7} $$
which, for low frequencies equals about 1 if the coefficient c4 equals unity (this refers to the
scaling as applied in Fig. 7). For higher frequencies, the STF displays an almost third-order
roll-off. This is in contrast to the feed-forward topology, where the STF rolls off only very
slightly (first order) for high frequencies.
As an example, we will design a fourth order SDM, with a NTF according to a Butterworth
high-pass filter design. The cut-off frequency is chosen as 150 kHz. Because the SDM needs
to be realizable, the total loop needs to embody at least a single delay, i.e., the term with
$z^0$ in the STF needs to be zero. This corresponds with the requirement that the high pass
filter should have 1 as its first value of the impulse response. This can be accomplished by
multiplying the high pass filter with a certain coefficient (larger than 0), resulting in a HF
gain which is larger than 1. With the above in mind, we obtain for the NTF:
$$ NTF(z) = \frac{1.00 - 4.00 z^{-1} + 6.00 z^{-2} - 4.00 z^{-3} + 1.00 z^{-4}}{1.00 - 3.13 z^{-1} + 3.75 z^{-2} - 2.03 z^{-3} + 0.42 z^{-4}} \tag{8} $$
This results in the following coefficients in the feed-forward structure:
$$ c_1 = 0.8707115357, \quad c_2 = 0.3594322506, \quad c_3 = 0.0811807847, \quad c_4 = 0.0083240406 \tag{9} $$
For the feed-forward structure, the STF is now given by:
$$ STF(z) = \frac{0.00 + 0.87 z^{-1} - 2.25 z^{-2} + 1.97 z^{-3} - 0.58 z^{-4}}{1.00 - 3.13 z^{-1} + 3.75 z^{-2} - 2.03 z^{-3} + 0.42 z^{-4}} \tag{10} $$
For the feedback structure, the STF is given by:
$$ STF(z) = \frac{z^{-4}}{1.00 - 3.13 z^{-1} + 3.75 z^{-2} - 2.03 z^{-3} + 0.42 z^{-4}} \tag{11} $$
In Fig. 12, the different STF’s for a feed-forward and feedback structure, with an identical
NTF, have been calculated. The NTF’s are designed as 4’th order Butterworth high pass
characteristics, with a cut-off frequency of 150 kHz. Clearly, the strong roll-off characteristic of the feedback structure can be observed. Interestingly, the feed-forward topology
displays a strong peak in its transfer characteristic at the cross-over frequency. This feature
is not obvious from Eq. (3) if only the magnitude response |H| is used. The maximum
peak height is in this case about 6 dB.

Figure 12: Signal transfer functions for a feed-forward topology (red) and a feedback topology
(green) with identical NTF’s. (Axes: gain in dB, −60 to 10, versus frequency in Hz, 10 to 1e+07; curves ‘FF.STF’ and ‘FB.STF’.)

Table 1: Trade-off of the maximum input range and the SNR in the base-band.

cut-off (kHz)   DR (dB)   max. input level
100             85        0.77
120             90        0.70
150             97        0.57
170             100       0.49

This loop-filter design gives rise to an SDM with a maximum input of about -5 dB (i.e.,
0.57 w.r.t. the feedback signal from the quantizer). At an input of a sine with an amplitude
of 0.5, the (unweighted) Signal to Noise Ratio (SNR) in the band 0-20 kHz is about 97
dB. In SACD applications, this is not sufficient: a signal-to-noise ratio of at least 100 dB
is desirable. However, one might argue that the A-weighted SNR is much better, because
the noise floor is large only for frequencies close to 20 kHz. Indeed, for this example, the
A-weighted SNR amounts to about 105 dB. More important is the maximum modulation
depth of the modulator. The definition of the 0 dB level in SACD is 50% modulation depth,
i.e., the sine wave from the previous example would correspond to 0 dB SACD exactly.
Peaks in the signal of +3.1 dB are allowed (though for a short period only; see footnote 2). Hence, the
SDM needs to be stable for inputs up to a level of about 0.71.
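
The step from "+3.1 dB above the 50% reference" to "about 0.71" is a plain decibel conversion. The Python sketch below makes it explicit and applies the same conversion to the maximum input levels listed in Table 1; the function names are illustrative.

```python
import math

REFERENCE_LEVEL = 0.5   # SACD 0 dB reference: 50% modulation depth
PEAK_DB = 3.1           # allowed short-term peak level above the reference

def db_to_linear(db):
    """Convert a level in dB to an amplitude ratio."""
    return 10 ** (db / 20)

def headroom_db(max_input, reference=REFERENCE_LEVEL):
    """Level of the maximum stable input relative to the SACD reference, in dB."""
    return 20 * math.log10(max_input / reference)

required_max_input = REFERENCE_LEVEL * db_to_linear(PEAK_DB)
print(round(required_max_input, 3))

# Maximum input levels of the Butterworth designs listed in Table 1.
table1 = {100: 0.77, 120: 0.70, 150: 0.57, 170: 0.49}
for cutoff_khz, max_input in table1.items():
    ok = headroom_db(max_input) >= PEAK_DB
    print(cutoff_khz, round(headroom_db(max_input), 2), ok)
```

Of the four designs in Table 1, only the 100 kHz design leaves +3.1 dB of headroom, and it is also the one with the lowest dynamic range.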
For every SDM design, there is a trade-off between stability of the modulator and the SNR
in the base-band. As an example, consider the results in table 1 for different 4’th order
SDM’s, which have all been created using Butterworth high pass filters as design NTF.
Clearly, for these modulators it is not possible to obtain a dynamic range (unweighted)
exceeding 100 dB, while maintaining the possibility for seriously overloading the SDM to
a level of +3.1 dB.
One way of increasing the SNR in the audio band, while hardly reducing the maximum
input level, is to use higher order filters for the NTF, and to use a Chebyshev type II -like
high pass filter for the NTF design instead of a Butterworth characteristic. Chebyshev
type II high pass filters can easily be created in SDM’s by the construction of resonator
sections, as displayed in Fig. 13.
The construction in Fig. 13 is, in principle, applicable to a feed-forward topology; for a
feedback topology, a similar arrangement with a feedback loop over two integrator sections
is possible. In Fig. 13, two outputs of the resonator section are indicated, $R_1$ and $R_2$; the
relation between these is that $R_2(z) = h(z)R_1(z)$, designating the transfer characteristic
of the integrator section as $h(z) = z^{-1}/(1 - z^{-1})$.

Footnote 2: These and other audio requirements are in part 2 of the SACD Scarlet Book [10].

Figure 13: A cascade of two integrator sections in a SDM, with a feedback loop between the
integrators. The two different ways of incorporating the feedback loop result in slightly different pole characteristics. Indicated are the two different outputs, which are characterized
by a transfer function $R_1(z)$ and $R_2(z)$, respectively. (Diagram labels: two delay elements T, feedback coefficient $f$ with negative sign, outputs $R_1$ and $R_2$.)

Also, two different realizations of the feedback path (with coefficient $f$) are possible. The
full drawn curve in Fig. 13 doesn’t incorporate the delay that the dotted realization does.
The effects of the dotted feedback structure can be obtained as follows. The transfer $R_2(z)$
of the resonator section becomes:
$$ R_2(z) = \frac{h^2(z)}{1 + f h(z)^2} \tag{12} $$
This function has a pole at $z_p$ when $z = z_p$ solves $(1 + f)z^{-2} - 2z^{-1} + 1 = 0$, i.e.,
$$ z_p = 1 \pm i\sqrt{f} \tag{13} $$
Hence, the norms $|z_p| > 1$. The reduced frequencies $f_{pole}$ of these poles are thus given by
$$ f_{pole} = \operatorname{atan}(\sqrt{f}) \tag{14} $$
In the case of the full feedback path in Fig. 13, the resonator has transfer functions
$R_1(z)$, $R_2(z)$ given by:
$$ R_1(z) = \frac{h(z)}{1 + z f h(z)^2}; \qquad R_2(z) = h(z) R_1(z) \tag{15} $$
In this case, the poles are given by
$$ z_p = 1 - \frac{f}{2} \pm \frac{i}{2}\sqrt{4f - f^2} \tag{16} $$
Contrary to the previous case, these poles are exactly on the unit circle. The pole frequencies are given by:
$$ f_{pole} = \operatorname{acos}\left(1 - \frac{f}{2}\right) \tag{17} $$
which, for small values of $f$, virtually coincides with the pole frequencies given by Eq. (14).

Figure 14: Principle of a clipped integrator. The absolute value of the output of the integrator cannot exceed a value of C. (Diagram labels: T, −C, +C.)

As such a feedback loop over two integrator sections transforms the two poles at DC
($z^{-1} = 1$) into two complex conjugate poles away from DC, care should be taken that there
is enough DC gain in the loop-filter to avoid DC drift. As an example, consider the 4th order
SDM with a Butterworth design, corner frequency 150 kHz. Choosing the poles to move
from DC to ±10 and ±19 kHz, the numerical values of the feedback coefficients obtained
are 0.000496 and 0.001789. The SDM obtained has a maximum input of 0.57 (0.57 without
resonators) and a SNR of 107 dB (97 dB without resonators). Indeed, the addition of the
poles, turning the Butterworth characteristic into a Chebyshev II-like characteristic, gives
significantly better SNR; the DC suppression of the loopfilter is still better than 120 dB,
which is sufficient. In terms of A-weighted SNR, the improvement is smaller,
because the poles primarily serve to suppress the noise between 10 and 20 kHz.
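
The feedback coefficients quoted in this example can be reproduced by inverting Eq. (17). The Python sketch below assumes that the reduced frequency is the pole angle in radians per sample and that the modulator runs at 64 x 44.1 kHz; both are assumptions of this illustration, and the function names are hypothetical.

```python
import math

FS_DSD = 64 * 44100  # assumed SDM sample rate in Hz

def resonator_coefficient(pole_hz, fs=FS_DSD):
    """Invert Eq. (17): f = 2 * (1 - cos(theta)), with theta = 2*pi*pole_hz/fs."""
    theta = 2 * math.pi * pole_hz / fs
    return 2 * (1 - math.cos(theta))

def pole_frequency_hz(f, fs=FS_DSD):
    """Eq. (17) converted to Hz: pole angle acos(1 - f/2) times fs / (2*pi)."""
    return math.acos(1 - f / 2) * fs / (2 * math.pi)

for pole_hz in (10_000, 19_000):
    f = resonator_coefficient(pole_hz)
    print(pole_hz, round(f, 6), round(pole_frequency_hz(f), 3))
```

Under these assumptions, pole frequencies of 10 kHz and 19 kHz give coefficients that round to 0.000496 and 0.001789, the values quoted above.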
A further improvement can be obtained when using a fifth order SDM, with a Butterworth
NTF design (corner frequency 110 kHz) plus the poles at 10,19 kHz: in that case the SDM
is stable to inputs up to 0.58, with a SNR of 120 dB. Note, that in this case, there is still
1 integrator with a pole at DC, and thus there cannot be any DC drift. To clarify the
operation of such a SDM, pseudo-code of the SDM is provided in App. A.
A drawback of the above implementations of resonator sections is that the resulting filter
is not minimum phase; because of this, the full potential that noise shaping offers cannot be
realized. Although the improvement that can be realized by a minimum-phase filter is (in
this case) limited, a very interesting suggestion is the following (see footnote 3). Suppose that we create
a resonator section, which contains both the dotted and the full drawn realization of the
feedback. Denote the feedback coefficient in the full drawn realization by $f_1$, the feedback
coefficient in the dotted structure by $f_2$. The poles (for $(1 + f_2) \ge (1 - f_1/2)^2$) are then given by
$$ z_p = \left(1 - \frac{f_1}{2}\right) \pm i\sqrt{(1 + f_2) - \left(1 - \frac{f_1}{2}\right)^2} \tag{18} $$
and have $|z_p| = \sqrt{1 + f_2}$, with reduced pole frequencies $f_{pole} = \operatorname{atan}\left(\sqrt{\frac{1 + f_2}{(1 - f_1/2)^2} - 1}\right)$.
Hence, the radius and pole position can be adjusted independently, and it is possible
to have $|z_p| < 1$ at the cost of an additional feedback path in the resonator section.
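
The pole positions for the combined feedback can be checked numerically. The sketch below assumes, consistent with Eqs. (13), (16) and (18), that the poles are the roots of $z^2 - (2 - f_1)z + (1 + f_2) = 0$; the function name and the trial value of $f_2$ are illustrative only.

```python
import cmath
import math

def resonator_poles(f1, f2):
    """Roots of z^2 - (2 - f1) z + (1 + f2) = 0 (assumed characteristic equation)."""
    b = 2 - f1
    c = 1 + f2
    disc = cmath.sqrt(b * b - 4 * c)   # complex square root handles both cases
    return (b + disc) / 2, (b - disc) / 2

# f2 = 0 reproduces the 'full drawn' case: poles exactly on the unit circle.
on_circle, _ = resonator_poles(0.000496, 0.0)
# f1 = 0 reproduces the 'dotted' case: poles slightly outside the unit circle.
outside, _ = resonator_poles(0.0, 0.000496)
# A small negative f2 (hypothetical value) pulls the poles inside the unit circle.
inside, _ = resonator_poles(0.000496, -1e-4)

for z in (on_circle, outside, inside):
    print(abs(z), math.atan2(z.imag, z.real))
```

In each case the pole radius equals $\sqrt{1 + f_2}$, so $f_2$ sets the radius while $f_1$ mainly sets the pole angle.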

Footnote 3: This observation has been made by prof. S.P. Lipshitz.

5.2 Enforcing SDM stability
So far, we have not bothered about what happens if the SDM input exceeds its maximum:
the SDM gets into wild oscillations, with constantly increasing amplitude in the integrator
states and decreasing frequency. Even worse, when the input is removed from the system,
the SDM does not return to its original state. To avoid such a situation, it is customary
to use clippers in each integrator stage. In Fig. 14, a schematic representation of a clipped
integrator is given. The idea is that the output of the integrator can never exceed its clip
value, C. In other words, the integrator section simply stops integrating when the clip level
C has been reached.
The purpose of these clippers is to avoid a situation where the values in the integrator
stages get too high (and cause the SDM to start to oscillate), while still allowing integrator
values which occur during normal operation. Whereas the main purpose of the clippers
is to let the SDM return to normal operation after overload, it is also desirable to avoid
serious distortion in the signal if clipping occurs.
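
A clipped integrator can be sketched in a few lines of Python. This is a minimal illustration of the principle only, with hypothetical names; it is not the pseudo-code of App. A and leaves out the delay and the resonator feedback of the actual integrator sections.

```python
def clipped_integrator_step(state, x, clip):
    """Add the input x to the integrator state, limiting the result to [-clip, +clip]."""
    return max(-clip, min(clip, state + x))

def run_integrator(inputs, clip, state=0.0):
    """Return the sequence of integrator states for a list of input samples."""
    states = []
    for x in inputs:
        state = clipped_integrator_step(state, x, clip)
        states.append(state)
    return states

# A constant input drives the state into the clipper, where it stays at the clip level;
# once the input changes sign, the state leaves the clipper immediately.
print(run_integrator([1.0] * 6 + [-1.5] * 2, clip=4.0))
```

Because the state is held at the clip level instead of growing without bound, the integrator recovers as soon as its input reverses, which is what allows the SDM to return to normal operation after overload.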
Heuristic ways of obtaining reasonable numerical values for the clipper levels are monitoring
the integrator levels during very large sine wave inputs and square wave inputs, close to
overload of the SDM. The clipper levels C1 and C2 of the first 2 integrator stages can be
set according to these values. If the higher integrator stages are assigned values according
to this recipe as well, the situation occurs that the SDM returns to normal operation after
overload, but can have all clippers activated simultaneously. This will cause serious clicks
and pops (especially if the first integrators run in their clippers). Hence, the higher order
clippers should be designed such that the high order clippers are activated first, before the
low order clippers are activated.
As an example, let us consider the fifth order SDM designed previously. Its feed-forward
coefficients are:
c1 = 0.79188240; c2 = 0.30454538; c3 = 0.06992965; c4 = 0.00949572; c5 = 0.00060680
with resonator coefficients:
f1 = 0.000496; f2 = 0.001789
The pseudo-code of this SDM is provided in App. A, suitable for easy implementation in
any programming language. Without any clippers, the SDM is stable for sine inputs up
to 0.58; for higher amplitudes, the SDM gets fully unstable. Looking at the maximum
integrator values during operation close to overload, we obtain a value C1 and C2 for the
first and second clipper respectively of about 4 and 9. The following clipper values are
chosen such, that the product Ci ci of the clipper value and the corresponding feed-forward
coefficient is reduced by about 1.5 - 2 per integrator stage. This is illustrated in table 2.
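
The rule of thumb for the products $C_i c_i$ can be checked against the values of table 2 with a short Python sketch (variable names are illustrative):

```python
# Feed-forward coefficients c_i and clipper levels C_i as listed in table 2.
c = [0.7918824022, 0.3045453872, 0.0699296548, 0.0094957213, 0.0006068024]
C = [4, 9, 25, 92, 700]

# Product of clipper level and feed-forward coefficient per integrator stage.
products = [ci * Ci for ci, Ci in zip(c, C)]

# Reduction factor of the product from each stage to the next.
ratios = [products[i] / products[i + 1] for i in range(len(products) - 1)]

for stage, p in enumerate(products, start=1):
    print(stage, round(p, 2))
print([round(r, 2) for r in ratios])
```

The first two clipper levels follow from the observed integrator values; from the third stage on, the product drops by a factor between roughly 1.5 and 2.1 per stage, in line with the recipe.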
From table 2, we can obtain some idea about the influence of the clippers on the SDM
operation. The clippers are sometimes activated during operation at 0.5 input level, which
causes a small reduction in SNR with respect to the 120 dB without clippers. However,
whereas the original SDM turned unstable at inputs of 0.59, its clipped version shows
continuous stable operation. Even at inputs of 0.65, the first integrator is not clipped,
indicating that the signal distortion is still limited, and highly audible clicks are absent.
In fact, only at input levels exceeding 0.75, the initial integrator will clip, which causes a
clearly audible effect. At the level of 0.75, the SNR has dropped to about 60 dB.
As an alternative to clipping in the SDM, clipping before the SDM might be considered.
However, in this case dynamic range must be sacrificed, although the resulting system is
unconditionally stable for large inputs.

Integrator   Ci    ci             ci Ci
1            4     0.7918824022   3.16
2            9     0.3045453872   2.7
3            25    0.0699296548   1.75
4            92    0.0094957213   0.87
5            700   0.0006068024   0.42

Input level   C1   C2    C3     C4     C5      SNR (dB)
0.5           0    0     0      0      1836    118
0.55          0    0     0      0      6595    117
0.59          0    0     12     57     16285   107
0.60          0    5     48     175    18829   104
0.65          0    512   2283   3258   38155   67

Table 2: Determination of clipper values for a SDM (above) and the influence of the clippers
on the normal SDM operation (below). The columns with clippers Ci indicate the number
of times a clipper was activated in a run of 300,000 samples.

The above route has given a complete design example of a modulator reaching the ‘magical’
120 dB SNR limit. In practice, however, it is dubious whether 120 dB SNR is necessary.
As most electronic equipment seems to be closer to 110 dB, and human hearing seems
not capable of reaching a dynamic range of more than 100 dB, 110 dB SNR in the SDM
design seems realistic. As an alternative, one might consider a SDM design according to a
Butterworth NTF design with a corner frequency of 95 kHz. The resonator poles remain
unchanged. The coefficients for such a modulator are:
c1 = 0.68402124; c2 = 0.22813609; c3 = 0.04563584; c4 = 0.00542804; c5 = 0.00030590
with clipper levels C1 = 5; C2 = 12; C3 = 40; C4 = 200; C5 = 1100. This SDM is stable
(without clipping) up to inputs of about 0.65, while reaching a SNR figure of about 115
dB. These figures seem to represent a very agreeable compromise between dynamic range
and maximum allowable input. However, for every application this balance should be
re-judged.
6 Design of SDM modulators: II
In the previous section, a design method is presented which in general leads to SDM’s
of good performance with a Butterworth high-pass type NTF. However, sometimes there
may be specific demands which necessitate the use of other designs. An example of such
a demand may be the specification of a limited amount of HF noise in the band above
40 kHz. Though several designs exist which allow for this, we will outline two.

Figure 15: Example of a SDM which has been created according to NTF design by cascading
a third order high pass filter and a fourth order high pass filter. (Plot of the averaged power spectrum ‘test.AvgPwr’; vertical scale 0 to −250, horizontal scale 10 to 1e+07.)

The first, in line with the previous section, consists of cascading 2 (or more) high pass
filters, which then make up the SDM NTF. For example, one could wish to create a SDM
which is third order starting from 150 kHz, and then turns 7’th order at about 40 kHz.
An example of such a design is given in Fig. 15. That SDM has been obtained by designing
an NTF as a cascade of a third order Chebyshev high pass filter, with a corner frequency
of 150 kHz, and a fourth order filter of the same type with a corner frequency of 40 kHz.
The cascade is hence 7’th order below 40 kHz, and in this way some of the merits of a low
order and high order SDM can be combined.
A more heuristic approach is to set each coefficient ci in the SDM to a fraction of its
previous coefficient ci−1 . An example of such a SDM in non-delayed feed-forward topology
is given in Fig. 16, which represents a 7’th order SDM where each coefficient is 0.475 times
its previous coefficient. Note, that this is really a recipe; the actual performance of the
SDM is determined too by its topology (e.g., a SDM with delayed feedback topology would
be unstable with these coefficients).
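
The heuristic recipe amounts to a geometric series of coefficients, as the following Python sketch shows (the function name is illustrative). It only generates the coefficients; as noted above, whether the resulting SDM is stable depends on the topology and has to be established by simulation.

```python
def geometric_coefficients(order=7, ratio=0.475, c1=1.0):
    """Feed-forward coefficients with c_i = ratio * c_(i-1), starting from c_1."""
    return [c1 * ratio ** i for i in range(order)]

coeffs = geometric_coefficients()
for i, ci in enumerate(coeffs, start=1):
    print(f"c{i} = {ci:.6f}")
```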
It is interesting to see, that a NTF characteristic as displayed in Fig. 16 can be approximated by a cascade of first order filters with different corner frequencies. In that case,
there is full control over the SDM design.

Figure 16: Example of a SDM which has been created by setting each feed-forward coefficient
$c_i$ to $0.475\,c_{i-1}$ ($c_1 = 1$). (Plot of the averaged power spectrum ‘test.AvgPwr’; vertical scale 0 to −250, horizontal scale 10 to 1e+07.)

7 Signal processing
A crucial point in any audio chain is signal processing, ranging from simple volume adjustments to complex equalizations. It is immediately apparent that a direct translation of
the ‘PCM-way’ of signal processing does not exist in DSD. For example, if a DSD signal is
volume-adjusted, with a gain g = 0.123456, the resulting output (the one-bit signal multiplied with g) is a multi-bit word. Hence, any signal processing for DSD always consists
of a cascade of the actual processing step, followed by a re-quantization as shown in Fig. 17.
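
The cascade of a processing step and a re-quantization can be sketched as follows. This is a deliberately simplified illustration: it uses a first-order modulator, chosen here only because it stays stable for any input below full scale, whereas practical SACD modulators are of higher order and, as discussed below, need additional low-pass filtering of their input. All names are hypothetical.

```python
def requantize_first_order(samples):
    """First-order sigma-delta re-quantizer: multi-bit samples in, +/-1 values out."""
    state = 0.0
    out = []
    for x in samples:
        state += x                    # accumulate the multi-bit input
        y = 1 if state >= 0 else -1   # 1-bit quantizer
        state -= y                    # feed the quantizer decision back
        out.append(y)
    return out

def dsd_gain(bits, g):
    """Volume adjustment of a +/-1 stream: multiply (multi-bit result), then re-quantize."""
    multibit = [g * b for b in bits]  # no longer representable with 1 bit
    return requantize_first_order(multibit)

bits = [1, 1, 1, -1] * 250            # toy 1-bit input with a mean of 0.5
out = dsd_gain(bits, 0.123456)
print(sum(bits) / len(bits), sum(out) / len(out))
```

The output is again a ±1 stream whose running sum tracks the running sum of the scaled input to within one quantizer step, so the low-frequency content is preserved while the difference appears as shaped noise.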
It is possible to contract some signal processing steps and the SDM re-modulator. An
example, where an IIR filter is contracted with a SDM, is shown in Fig. 18. It is important
to note, however, that such a device is not different from the cascade of signal processing/remodulation, although the intermediate multi-bit path is absent.

Figure 17: Examples of DSD signal processing: gain adjustment and filter operations. (Block diagrams: DSD input, Gain or IIR, multi-bit intermediate at high rate, Σ∆, DSD output.)

Figure 18: Contraction of IIR filter characteristic and SDM, giving a structure with DSD
input and DSD output.

Figure 19: Advisable way of performing two operations on DSD data. First, a gain adjustment is applied, after which an IIR filter operation is applied without leaving the intermediate high rate, multi-bit domain. (Block diagram labels: DSD input, Gain, multi-bit intermediate at high rate, Σ∆, DSD output.)

Figure 20: Transfer function of a filter which can be used to remove the HF of a DSD
signal, such that it can be input to a subsequent SDM. (Axes: magnitude in dB, −140 to 20, versus frequency in Hz, 0 to 350000.)

Figure 21: Schematic presentation of the effect of multiple quantizations. (Panels: amplitude in dB versus frequency in kHz, with 20 and 100 kHz marked; signal quality versus number of requantizations.)

To obtain a realizable system, a low pass filter is generally necessary as indicated in Fig. 19.
The reason for this is that the SDM which is used as a re-modulator cannot cope with the
high signal levels the DSD presents. As virtually all of the power of these signals is above
100 kHz, a low pass filter operating above this frequency is sufficient to remove enough
power such that the re-modulator remains in stable operation. In this respect, the feed-forward and feedback structures have quite different behaviour. As elaborated in Sec. 5,
the feed-forward structure has little suppression of the input signal over the whole band
(up to Nyquist), and sometimes even a gain just at the corner frequency of the NTF filter
characteristic. The feedback structure, on the contrary, has strong suppression of the input
signal from the aforementioned corner frequency (see also Fig. 12). Hence, a ‘feed-forward’
SDM will need more severe filtering of its input signal compared to a ‘feedback’ SDM in
order to maintain stability. The response of a (64 taps) FIR filter which gives sufficient
HF suppression to allow subsequent re-quantization, is shown in Fig. 20. The total signal
transfer characteristic of the cascade of a feed-forward SDM and this filter will be roughly
identical to the STF of a feedback SDM. Clearly, the application of such a filter will turn
the 1-bit signal directly into a multi-bit signal. It is therefore important to realize that the
benefits of DSD are in the high sample rate; they are not in the fact that DSD is 1-bit! The
importance of this remark is further emphasized by the following notion: suppose that a
sequence of signal processing steps is necessary. If each of these steps is built according
to Fig. 17, the total signal path will contain multiple requantizations. As a result of this,
build-up of HF noise will occur. This effect is illustrated in Fig. 21, where schematically
the effect of multiple requantizations is displayed. This figure can be explained as follows.
If we have a DSD signal, its noise starts to rise above 20-30 kHz, and reaches an almost
flat level at about 90 kHz. If, in a subsequent re-quantization, the bandwidth of DSD is
maintained, the signal is low pass-filtered at a frequency of about the same value (90 kHz).
If this signal is fed to a next SDM, its output signal will contain both its own quantization
noise, as well as the quantization noise that has been input to it. If this cascade is repeated,
it is easy to see why there will be a build-up of HF noise in the area of about 80-90 kHz.
Eventually, this signal will be large enough to drive the SDM into its clippers, or, worse:
instability. This effect is shown in the right of Fig. 21; as the number of requantizations
increases, the signal quality drops slowly. At the moment that the HF noise is large enough
to activate the clippers, the signal quality drops rapidly.
Hence, all signal processing should be done in a multi-bit domain; only after the final signal
processing step the conversion to 64fs 1-bit signals should be made.
8 Dithering and linearizing SDM’s
SDM’s are devices with a quantizer; as we are used to with the quantizers from the PCM
world, we need to linearize the devices that use a quantizer. With the multi-bit quantizing
PCM devices, it is common knowledge that the quantizers need to be dithered with TPDF
dither (dither, distributed according to a Triangular shaped Probability Density Function)
of full width at half height of 1 LSB [6]. Such dither can easily be obtained by adding
2 random numbers from a uniform distribution of width 1 LSB. For SDM, this recipe
is a contradiction in terms, since the quantizer spans only one bit and hence cannot
accommodate the aforementioned TPDF dither, which spans 2 bits. Still, dithering in what
we will coin ‘the classical sense’ is a very useful technique and has been well-researched;
see [8] and [9] and references therein. Even so, new dither techniques are being discussed,
which are more appropriate for 1-bit coders; see, e.g. [3]. Next, we will discuss some
aspects of dithering in the classical sense.
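
The TPDF recipe mentioned above for multi-bit PCM quantizers (adding two uniform random numbers of width 1 LSB) looks as follows in Python. The sketch uses the standard `random` module with a fixed seed; the function name is illustrative.

```python
import random

def tpdf_dither(lsb=1.0, rng=random):
    """Sum of two independent uniform values of width 1 LSB: triangular PDF,
    total width 2 LSB, full width at half height 1 LSB."""
    return rng.uniform(-lsb / 2, lsb / 2) + rng.uniform(-lsb / 2, lsb / 2)

rng = random.Random(0)                       # fixed seed, for reproducibility only
samples = [tpdf_dither(1.0, rng) for _ in range(10000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(min(samples), max(samples), mean, var)  # variance should be close to 1/6 LSB^2
```

The samples span up to ±1 LSB, which makes visible why this recipe does not carry over to a quantizer that has only the two output levels ±1.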
As dither is used to remove the effect of non-linearity, we can distinguish two different
appearances of the non-linearity: limit cycles (see footnote 4) on the one hand, and idle tones and distortion on the other. As the idle
tones and distortion are heavily suppressed by the loopfilter, we will ignore them for the
moment. In Sec. 9, a more detailed discussion about non-linearity in an SDM is presented.
Limit cycles, however, can be very annoying: they can appear in the audible range and,
even in the audible range, have high power. Consider, for example, an SDM with the
topology at the top of Fig. 7, characterized by the following feedforward coefficients: $c_1 =
2048$; $c_2 = 768$; $c_3 = 128$; $c_4 = 16$; $c_5 = 1$. Clearly, this SDM is extremely well-suited
for implementation in hardware, as the coefficients represent simple powers of 2, except $c_2$,
which is the sum of two powers.
Its spectrum, for zero input, is displayed in Fig. 22 which does not show any resemblance
with the familiar noise-shaped curve: it is a limit cycle. A limit cycle is a purely repetitive
pattern of certain length; for example, a repeated sequence (representing zero input - see
also Sec. 4.1) of 1, −1, −1, 1, −1, 1, 1, −1 represents a limit cycle of length 8. The limit
cycle in Fig. 22 has length 32, as can be read from its fundamental at 88 kHz.
Fortunately, little needs to be done to break up the limit cycle. For example, any input
signal exceeding an amplitude of -90 dB will remove the limit cycle completely. To allow
for digital silence, though, the use of dither is required, and a very useful way is by applying
dither with a rectangular PDF (RPDF dither) just before the quantizer. In the case of
the SDM we are discussing here, an appropriate amount of dither has a pdf with a width
of 200 (and a mean of 0), and needs to be added immediately before the quantizer. The
resulting spectrum is displayed in Fig. 22. This has the advantage, that the dither will
become noise-shaped too (as the quantization error) and the increase in noise floor will be
marginal. In this case, the undithered SDM has a dynamic range of 98.4 dB (full scale
SACD), whereas the dithered SDM has a dynamic range of 98.0 dB. The maximum input,
before the SDM turns unstable, has been reduced from 0.7104 to 0.7098 for an input of a
1 kHz sine wave. Hence, this amount of dither has hardly any drawbacks, and significant
advantages.

Footnote 4: In the literature, these tones are sometimes also called idle tones. We reserve the name idle tones for
signals which are not purely repetitive - see also Sec. 9.

Figure 22: Example of a limit cycle occurring in a SDM with zero input (green). In red,
the spectrum after application of dither (also zero input) is shown. (Axes: power in dB, −300 to 0, versus frequency in Hz, 100 to 1e+06.)

Figure 23: A noise shaper which is typically used in SACD applications. The spectrum
has been coherently averaged 100 times, and this has been repeated 10 times to obtain a
power averaged spectrum. (Axes: power in dB, −220 to −20, versus frequency, 100 to 1e+06.)

The distortion introduced by the SDM amounts to -150 dB in the band 0-20 kHz (see
Sec. 9 for a more detailed discussion about non-linearity in an SDM). The dither added
before the quantizer will hardly change that number, and it is doubtful whether this amount of
distortion (in PCM, this would have been below the 25 bit level) would lead to audible
effects.
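The RPDF dithering described in this section can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the implementation used for the figures: the function names are hypothetical, and the C library rand() merely stands in for a proper random number generator.

```c
#include <assert.h>
#include <stdlib.h>

/* Zero-mean dither with a rectangular PDF of the given total width:
   returns a value in [-width/2, +width/2). */
static double rpdf_dither(double width)
{
    assert(width >= 0.0);
    double u = (double)rand() / ((double)RAND_MAX + 1.0);  /* u in [0, 1) */
    return width * (u - 0.5);
}

/* 1-bit quantizer with the dither added immediately before the decision,
   so that the dither is noise-shaped by the loop like the quantization error. */
static int quantize_dithered(double loop_filter_out, double dither)
{
    return (loop_filter_out + dither >= 0.0) ? 1 : -1;
}
```

Inside the modulator loop, the undithered decision on the loop-filter output would be replaced by quantize_dithered(sum, rpdf_dither(200.0)), where the width of 200 is the value quoted above for the SDM with integer coefficients.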
9 Non-linearity in a SDM
To present a realistic situation, the spectrum of a SDM that is typically used in SACD
applications is presented in Fig. 23. For the purpose of this discussion, this SDM has not
been dithered. The input to this SDM is a 4 kHz sine (-6 dB SACD amplitude).
If we are interested in the base-band, extending from 0 to 20 kHz, the relevant distortion
products are the 2nd up to the 4th harmonic. From inspection of Fig. 23, it can be
concluded that the distortion components are all at most -165 dB; the noise in the
FFT obscures any information deeper than that. The noise floor of this SDM is at -127 dB,
resulting in a DR of about 120 dB (recall that the SACD reference 0 dB level has been
defined as -6 dB with respect to the level in the feedback path). It is also instructive to
extend the region of interest to the band 0-80 kHz. Obviously, the noise floor increases
steeply (in the case presented in Fig. 23, this increase is fifth order), causing the maximum
Signal-to-Noise Ratio (SNR) to drop to about 90 dB in the band 0-40 kHz, and about
55 dB in the band 0-80 kHz. Any harmonic distortion component, however, is below
-95 dB. Clearly, any harmonic distortion component that we are dealing
with in the broader sense of the audio band is extremely small, and its importance for
the perceptual audio quality can be doubted. Since this SDM has not
been dithered, dithering will reduce these numbers even further. In fact, if
this SDM is dithered to its maximum level (where it is just not overloaded), the distortion
components in the audio band are all below -180 dB, only observable after 5000 coherent
averages, and the components in the broader audio band are below -110 dB.

Figure 24: Example of an audio chain found in an SACD-capable player. The DSD is first
low pass filtered in the digital domain, followed by up-sampling to m · 64 fs, typically 128 or
256 fs. This high-rate signal is then fed to an n-bit SDM, where n typically varies between
1.5 and 5. Finally, the analog output is passed through an analog low pass filter.
[Signal chain: DSD (64 fs) → digital LPF → multi-bit signal at m · 64 fs → n-bit SDM →
n-bit signal at m · 64 fs → DAC → analogue LPF → analogue output.]
Still, the total amount of coherent power that is present in the dithered signal is significant.
The amount of coherent power can easily be estimated if the actual noise is assumed to
have no correlation with the signal. It appears that the total amount of coherent power
which is present in Fig. 23 is about -10 dB. This power is located mostly above
1 MHz; 99.99% of the coherent power is found in this high frequency area. The exact
value of the frequency above which most of the correlated signal is found depends
on the signal which is input to the SDM; it will, however, never be very much lower than
the quoted 1 MHz. It is beyond doubt that the origin of these signals in the very high
frequency area lies in the non-linear behavior of the SDM. Indeed, if a multi-bit quantizer
with triangular PDF dither is used in the noise-shaper, the high frequency components disappear.
Thus, the coherent signal above 1 MHz can be considered in some sense to be distortion.
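One possible way to make such an estimate explicit is sketched below. It is an assumption of this sketch, not a procedure given in the text: if the noise is uncorrelated with the signal, coherently averaging N records leaves the coherent power C unchanged and divides the noise power U by N. The mean power of a single record is then P1 = C + U and that of the averaged record is PN = C + U/N, from which C follows. The function name is hypothetical.

```c
#include <assert.h>

/* Estimate the coherent power C from
     p_single = C + U      (mean power of a single record)
     p_avg    = C + U / N  (mean power after N coherent averages)
   assuming the noise U is uncorrelated with the signal. */
static double coherent_power(double p_single, double p_avg, unsigned n_avg)
{
    assert(n_avg > 1);
    return ((double)n_avg * p_avg - p_single) / ((double)n_avg - 1.0);
}
```

For a 1-bit stream the mean power of a single record equals 1. A hypothetical averaged power of 0.109 after 100 coherent averages would then correspond to a coherent power of 0.1, that is, -10 dB.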
To judge whether these distortion components are harmful, we need to look at the full audio
chain which is used to replay DSD in a typical SACD-capable player. Such a configuration
is shown in Fig. 24. A typical DAC chip (see e.g. [1] or [7]) contains the first 4 blocks
displayed in Fig. 24. The digital filter in the path leading to the n-bit SDM is a crucial
part, where most of the HF signal present in the DSD signal can be removed without any
compromise. As an example, consider a filter that is designed according to the following
criteria: pass-band 0-100 kHz, flat within 0.01 dB; transition band 100 kHz - 900 kHz;
stop band 900 kHz - 1.4 MHz, suppression 100 dB. This leads to a filter with only 22 taps,
which does not pose any additional constraint in terms of hardware; the filters which are
necessary to do proper up-sampling from a low sample rate format to the required m · 64 fs
are much more demanding. Also, the digital LPF does not influence the impulse response
of DSD [13], as the transition width is extremely large. Clearly, the application of
this filtering will lead to significant suppression of the high frequency components present
in the original DSD stream. Still, the signal contains substantial amounts of HF, which
is predominantly white noise. The signal is then up-sampled to the frequency at which the
digital-to-analog conversion is performed. The SDM will noise-shape this signal into
an n-bit signal, where n typically varies between 3 [1] and 5 [7]. It is this signal which
is converted to the analog domain. Due to the noise shaping process, which is intrinsic
in modern, high-end DA converters and is the sole basis for their very high performance,
some additional high frequency noise extending to frequency regimes well above 1 MHz is
introduced. This noise is usually removed by an analog low pass filter of first or second
order. This filtering is most often passive, and can thus be performed with exceptionally
low distortion and inter-modulation. In most SACD players, some additional filtering is
provided to reduce the amount of HF noise (which by then is mostly due to the DSD
signal) even further, to levels well below -30 dB. It is important to remark that the HF
signal levels at which these additional filters need to operate are quite low due to the
digital pre-filtering (which removed a very substantial amount of HF signal, causing the
total signal power to be substantially less than 1); hence, the linearity of the filters can
be quite high and the filtering operation is performed without additional inter-modulation
products.
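The modest length of the digital low-pass filter quoted above can be made plausible with Kaiser's empirical estimate of the required FIR order. The sketch below is a rule-of-thumb check only, not the design method used by the authors; the function name is hypothetical, and the sample rate of 2.8224 MHz (64 × 44.1 kHz) is an assumption.

```c
#include <assert.h>

/* Kaiser's empirical estimate of the FIR order needed for a stop-band
   attenuation of atten_db with a transition band from f_pass to f_stop,
   at sample rate fs (all frequencies in Hz). */
static double kaiser_fir_order(double atten_db, double f_pass,
                               double f_stop, double fs)
{
    assert(f_stop > f_pass && fs > 0.0);
    double delta_f = (f_stop - f_pass) / fs;   /* normalized transition width */
    return (atten_db - 7.95) / (14.36 * delta_f);
}
```

For the criteria given above, kaiser_fir_order(100.0, 100e3, 900e3, 2822400.0) evaluates to roughly 22.6, which is in line with the filter length of 22 taps. The very wide transition band is what keeps the filter this short.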
This example of a typical SACD signal path shows that the non-linearity above 1 MHz is
not important at all, and does not influence the signal quality. In fact, one can argue that
these components are favorable. Because the total power of the SDM output is constant
and equals 1, the power which is present in these high frequency tones causes the SNR in the
lower frequencies to be higher than anticipated on the basis of the linear noise transfer function.
Hence, they contribute favorably to the dynamic range of an SDM. This discussion then
leads to the question whether it would be possible to linearize a SDM in the important
signal band, without bothering about its high frequency behavior.
9.1 Pre-correction
In order to have a system which clearly demonstrates the effects that we will study
in this section, a third-order SDM has been designed. Such low order SDM’s are notorious
for their relatively bad signal properties [9]. The spectrum of the third order SDM that
will be used in the remainder of this paper is shown in Fig. 25.
This third-order SDM has a dynamic range of about 90 dB, and its third harmonic is at
a level of -104 dB. While this is still a rather respectable number, it is about 60 dB larger
than the distortion component of the SDM shown in the previous section. The higher
order harmonic distortion products are significant, too. Also in the broader signal band
(0-80 kHz) the distortion components are larger. It should be remarked that this type of
SDM is not recommended for practical use.
Figure 25: Spectrum of the third order noise-shaper used in the analysis of the pre-correction
technique. The input signal is a 3 kHz sine wave, -6 dB SACD. To obtain
this spectrum, a series of 4 coherent averages and 10 power averages has been used.
[Plot: power (dB) versus frequency, 100 Hz to 1 MHz.]

When we model the SDM as a non-linear element Σ∆, its transfer characteristic can be
written as:

$$\Sigma\Delta(x) = x + \alpha_2 x^2 + \alpha_3 x^3 + \ldots \qquad (19)$$

Now, if we could create a signal s(x) according to:

$$s(x) = x - \alpha_2 x^2 - \alpha_3 x^3 - \ldots \qquad (20)$$

then the resulting output signal Σ∆(s(x)) would be given by:

$$\Sigma\Delta(s(x)) = x - 2\alpha_2^2 x^3 + O(x^4) \qquad (21)$$

In other words, the second harmonic distortion component has been completely removed,
and the third harmonic component has been substantially reduced (note that for the low
distortions we are dealing with, $\alpha_i \ll 1$). An estimate of the signal s(x) can be obtained
using the structure depicted in Fig. 26.
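The cancellation expressed by Eq. (21) can be verified numerically. The sketch below evaluates Eqs. (19) and (20) truncated after the cubic term; the function names and the coefficient values used in the check are illustrative assumptions, not values taken from an actual SDM.

```c
#include <assert.h>

/* Static polynomial model of the SDM non-linearity, Eq. (19),
   truncated after the cubic term. */
static double sd_model(double x, double a2, double a3)
{
    return x + a2 * x * x + a3 * x * x * x;
}

/* Ideal pre-corrected signal s(x), Eq. (20), truncated after the cubic term. */
static double precorrect(double x, double a2, double a3)
{
    return x - a2 * x * x - a3 * x * x * x;
}

/* Error that remains after pre-correction: sd_model(s(x)) - x.
   According to Eq. (21) its leading term is -2 * a2^2 * x^3. */
static double residual(double x, double a2, double a3)
{
    return sd_model(precorrect(x, a2, a3), a2, a3) - x;
}
```

For small x the residual approaches the leading term of Eq. (21); for example, with the (deliberately large) illustrative values a2 = a3 = 0.1 and x = 0.01, residual(x, a2, a3) agrees with -2 * a2 * a2 * x * x * x to within a few percent, the difference being the O(x^4) contribution.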
Figure 26: Basic Sigma Delta Pre-Correction (SDPC) structure. [Block diagram with the
blocks SDM, Delay, filter F and a second SDM, and the signals x, v, F(v), s'(x) and y.]

The topology of Fig. 26 operates as follows. The first SDM generates a signal, which
is subtracted from the original input signal x. This difference signal v now contains all
the distortion components which are generated by the SDM, and the uncorrelated noise
which has been added to the signal because of the noise shaping. This signal v is now
low-pass filtered in the filter F, which has, for example, a cut-off frequency of 100 kHz.
This results in the signal denoted F(v) in Fig. 26. Next, the original input signal x (after
the appropriate delay to correct for the delay in the filter F) is added to F(v), resulting in
the signal s'(x). While the filtering action has removed all HF noise, and in particular
the strong signals above 1 MHz, it has not removed any noise in the band
below 100 kHz. Hence, the signal s'(x) is only an approximation to the signal s(x)
in Eq. 20. The signal s'(x) is then input to a second SDM, which is identical to the SDM
used to generate v, resulting in the final output signal y.
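The signal flow described above can be sketched in C as follows. The sketch is meant to illustrate the structure only and rests on several assumptions that are not taken from the text: a first-order 1-bit SDM stands in for the third-order modulator, a 15-tap moving average stands in for the low-pass filter F, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SDPC_TAPS  15                      /* length of the moving average F (odd) */
#define SDPC_DELAY ((SDPC_TAPS - 1) / 2)   /* delay of F in samples */

/* First-order 1-bit SDM (stand-in): quantize the integrator state,
   then integrate the error x - y. */
static int sdm1_step(double *state, double x)
{
    int y = (*state >= 0.0) ? 1 : -1;
    *state += x - (double)y;
    return y;
}

/* Basic SDPC structure: y = SDM( delay(x) + F(x - SDM(x)) ). */
static void sdpc_run(const double *x, int *y, size_t n)
{
    double st1 = 0.0, st2 = 0.0;           /* states of the two SDMs */
    double v_hist[SDPC_TAPS] = {0.0};      /* most recent values of v */
    double v_sum = 0.0;                    /* running sum of v_hist */

    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++) {
        /* v: input minus the output of the first SDM */
        double v = x[i] - (double)sdm1_step(&st1, x[i]);

        /* F(v): moving average over the last SDPC_TAPS samples of v */
        size_t slot = i % SDPC_TAPS;
        v_sum += v - v_hist[slot];
        v_hist[slot] = v;
        double fv = v_sum / SDPC_TAPS;

        /* delayed input compensates the delay of F (zero during warm-up) */
        double x_del = (i >= SDPC_DELAY) ? x[i - SDPC_DELAY] : 0.0;

        /* s'(x) = delayed x + F(v) drives the second, identical SDM */
        y[i] = sdm1_step(&st2, x_del + fv);
    }
}
```

The output y remains a 1-bit stream whose average tracks the input; the suppression of distortion reported in this section depends on the actual modulator and filter and is not reproduced by these stand-ins.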
To gain some insight in the performance of this algorithm, which we will refer to as Sigma
Delta pre-correction (SDPC), we have applied it to the third order SDM displayed in
Fig. 25. The spectrum of the resulting signal y is displayed in Fig. 27 in the range
0-100 kHz. The huge suppression of the distortion components is clearly visible. Typically,
the distortion has been reduced by about 20 dB. For higher frequencies, the suppression
becomes less effective, even though the signal s'(x) contains all distortion components
unattenuated in this frequency regime. As always, there is a price to pay for this improvement
in THD, which in this case is an increase in the noise floor by 3 dB. This is clear from
inspection of Fig. 27, when one realizes that the corrected spectrum has been obtained using
twice as many coherent averages, which lowers the noise floor by 3 dB, and that its noise
floor is nevertheless identical to the noise floor of the uncorrected spectrum. This also
corroborates the fact that this is indeed white noise; if it were correlated, it would result
in an increase of more than 3 dB. The origin of the increase of the noise floor is the fact
that the signal s'(x) still contains the quantization noise present in the low frequency range;
the second SDM in the cascade adds its own quantization noise to it. Though not visible in
Fig. 27, the high frequency signals above 1 MHz are completely unchanged using the new
topology, which is expected on the basis of the absence of correction components in the signal s'(x).
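The 3 dB bookkeeping in this argument can be written out explicitly. The sketch below assumes, in line with the reasoning above, that the second SDM adds noise of the same power as the first and that the two contributions are uncorrelated; the function names are hypothetical and the noise power is in arbitrary units.

```c
#include <assert.h>

/* Power of uncorrelated noise after coherently averaging n records:
   the noise power drops by a factor n. */
static double noise_after_coherent_avg(double noise_power, unsigned n)
{
    assert(n > 0);
    return noise_power / (double)n;
}

/* Two uncorrelated noise contributions add in power. */
static double add_uncorrelated(double p1, double p2)
{
    return p1 + p2;
}
```

With a noise power of 1 for a single SDM, the uncorrected spectrum after 4 coherent averages has a noise floor of noise_after_coherent_avg(1.0, 4) = 0.25, and the corrected spectrum after 8 coherent averages has noise_after_coherent_avg(add_uncorrelated(1.0, 1.0), 8) = 0.25 as well. The floors coincide, so the factor of two gained from averaging (10 log 2, about 3 dB) has been spent on the noise added by the second SDM.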
9.2 SDPC and dither
To appreciate the effect of SDPC, it is also instructive to study the combined action of
dither and pre-correction. To that end, we have applied a dither level of 0.1 (the SDM
starts overloading at levels of 0.8) to the SDM.
Spectra of the original SDM and of the SDPC structure are displayed in Fig. 28. Also in this
case, the suppression of the distortion components is at least 22 dB in the band 0-20 kHz;
in fact, even after 64 coherent averages, no distortion components can be observed. Note
that distortion has decreased to levels below -135 dB! Hence, the combined action of small
amounts of dither and the pre-correction technique results in extremely low distortion
figures. Again, the reduced distortion suppression for higher frequencies is visible; for
example, in the region above 40 kHz, the suppression is typically only 8-10 dB.

Figure 27: Spectra of the original SDM (green), and its implementation according to Fig. 26
(red). The spectrum of the original SDM has been obtained using 4 coherent averages and
10 power averages; the other using 8 coherent averages and 10 power averages. The fact
that the noise floors of the spectra coincide precisely illustrates the 3 dB loss in SNR due
to SDPC. [Plot: power (dB) versus frequency, 1 kHz to 100 kHz.]

Figure 28: Spectra of the original (dithered) SDM (green), and its implementation according
to Fig. 26 (red) using the same dither. The spectrum of the original SDM has been
obtained using 8 coherent averages and 10 power averages; the other using 64 coherent
averages and 5 power averages. [Plot: power (dB) versus frequency, 1 kHz to 100 kHz.]

Figure 29: Phase characteristic of the signal transfer function of the third order SDM used
in this paper. [Plot: phase (rad) versus frequency.]
While the higher harmonics are suppressed less than the lower harmonics, as shown
by Eq. (21), this does not fully explain the reduced suppression. Another origin of this
reduced suppression for higher frequencies lies in the fact that the phase characteristic of
the SDM used here is not straight for frequencies above 20 kHz. This results in some
phase distortion, which is not accounted for in the pre-correction technique according to
Fig. 26. To obtain an estimate of the significance of these errors, consider a single harmonic
$h(\omega t) = A \sin(\omega t)$, which is positioned around 50 kHz. The absence of a correction
for the phase error $\Delta$ will cause incomplete cancellation of the harmonic; a residual power
of $4A^2\Delta^2$ will remain.
In this case, this results in a maximum power reduction of the harmonic by only 14 dB.
An improved pre-correction technique is therefore displayed in Fig. 30. In this diagram, the
phase error introduced by the SDM is corrected for by the filter L. Another improvement
can be obtained by cascading the structures displayed in Figs. 26 and 30. In a non-cascaded
structure, the cancellation of lower order terms causes the generation of higher order terms,
albeit of much lower amplitude, as can be concluded from Eq. (21). These new, higher
order terms can in turn be canceled in exactly the same way as the lower order ones were
canceled, which amounts to cascading the structure in Fig. 30.
Figure 30: Improved pre-correction structure. By cascading the Sigma Delta Pre-Correction
structure (SDPC) n times, n harmonics can be removed. [Block diagram with the blocks
SDM, Delay, filters G and L and a second SDM, and the signals x, v, F(v), s'(x) and y.]
Figure 31: Fifth order SDM, with a 3 kHz input of -6 dB. The uncorrected spectrum (green)
has been obtained after 16 coherent and 10 power averages; the corrected spectrum (red)
after 2048 coherent and 10 power averages. [Plot: power (dB) versus frequency, 10 kHz to 80 kHz.]
Figure 32: Fifth order SDM, with a DC input of 1/1024. The uncorrected spectrum (green)
has been obtained after 4 coherent and 10 power averages; the corrected spectrum (red)
after 32 coherent and 10 power averages. [Plot: power (dB) versus frequency, 100 Hz to 100 kHz.]
9.3 Performance of a realistic SDM with SDPC
To end with a realistic situation, and to show how SDPC also suppresses DC tones, a
standard fifth order SDM has been designed, with an SNR of 118 dB over 0-20 kHz.
As illustrated in Fig. 31, harmonic distortion levels of this SDM in the phase-corrected
SDPC structure are reduced to well below -185 dB if undithered, which amounts to an
improvement of about 35 dB, compared to the 20 dB improvement with the standard SDPC.
If the SDM is slightly dithered, the distortion drops to much deeper levels, which
appeared to be numerically inaccessible (i.e., below -220 dB). Also, distortion levels at
higher frequencies are reduced more than with the standard SDPC algorithm. Compared with
the uncorrected SDM, the SNR in the base-band (0-20 kHz) is slightly reduced, from 118 dB
to about 115 dB (no dithering) or 114 dB (with dithering).
The effects of a DC input to the SDPC system are illustrated in Fig. 32. As input to this
system, a DC value of 1/1024 has been applied, which results in a tone around 5.5 kHz.
The SDM has not been dithered.
In the spectrum of Fig. 32, a tone can be observed with an amplitude of about -145 dB.
Application of the pre-correction algorithm, in its basic form, reduces this amplitude to
about -165 dB. If a small amount of dithering (RPDF with amplitude 0.05) is applied, which
is much less than the maximum allowed amount of dither (0.4 RPDF), the tone cannot be
observed after 256 coherent averages, indicating that it is below about -175 dB.
Application of the improved SDPC likewise results in values for
spurious signals that are not easily accessible numerically.
10 Acknowledgements
The authors want to thank prof. S.P. Lipshitz, prof. J. Vanderkooy, Dr. J.D. Reiss and
H. ten Pierick for their valuable comments and proofreading of the manuscript.
A SDM-code
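In this appendix, we provide C code for the SDM discussed in Sec. 5.2.
The code simulates 100000 clock cycles of the SDM, with a DC input of 0.1, and prints
the mean of the output bit stream.

#include <stdio.h>

int main(void)
{
    /* Coefficients: feed-forward weights of the integrator states */
    const double c[5] = { 0.791882, 0.304545, 0.069930, 0.009496, 0.000607 };
    /* Coefficients: local feedback around the integrator pairs */
    const double f[2] = { 0.000496, 0.001789 };

    /* Initialization */
    double s0 = 0, s1 = 0, s2 = 0, s3 = 0, s4 = 0;  /* integrator states */
    const double x = 0.1;                           /* DC input */
    const long N = 100000;                          /* number of clock cycles */
    long acc = 0;                                   /* running sum of the output bits */
    int y = 1;

    /* Main loop */
    for (long i = 0; i < N; i++) {
        /* 1-bit quantizer on the weighted sum of the integrator states */
        double sum = c[0]*s0 + c[1]*s1 + c[2]*s2 + c[3]*s3 + c[4]*s4;
        y = (sum >= 0) ? 1 : -1;
        acc += y;

        /* State update, last integrator first (order as in the original listing) */
        s4 = s4 + s3;
        s3 = s3 + s2 - f[1]*s4;
        s2 = s2 + s1;
        s1 = s1 + s0 - f[0]*s2;
        s0 = s0 + (x - y);
    }

    printf("mean output = %f\n", (double)acc / N);
    return 0;
}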
References
[1] B. Adams, K. Nguyen, and K. Sweetland. A 116 dB SNR multi-bit noise shaping DAC
with 192 kHz sample rate. In Proceedings of the 106th AES Convention, Munich, 1999.
Preprint 4963.
[2] R.W. Adams, P.F. Ferguson, A. Ganesan, S. Vincelette, A. Volpe, and A. Libert.
Theory and practical implementation of a fifth order sigma-delta A/D converter. J.
Audio Eng. Soc., 39:515–528, 1991.
[3] M.O.J. Hawksford. Time-quantized frequency modulation with time dispersive codes
for the generation of sigma-delta modulation. In Proceedings of the 112th AES Convention,
Munich, 10-13 May 2002. Preprint 5618.
[4] H. Inose and Y. Yasuda. A unity bit coding method by negative feedback. Proc. IEEE,
51:1524–1535, 1963.
[5] H. Kato. Trellis noise-shaping convertors and 1-bit digital audio. In Proceedings of
the 112th AES Convention, Munich, 10-13 May 2002. Preprint 5615.
[6] S.P. Lipshitz, R.A. Wannamaker, and J. Vanderkooy. Quantization and dither: a
theoretical survey. J. Audio Eng. Soc., 40:355–375, 1992.
[7] S. Nakao, H. Terasaw, F. Aoyagi, N. Terada, and T. Hamasaki. A 117 dB D-range
current-mode multi-bit audio DAC for PCM and DSD audio playback. In Proceedings of
the 109th AES Convention, Los Angeles, 2000. Preprint 5190.
[8] S.R. Norsworthy and D.A. Rich. Idle channel tones and dithering in delta-sigma
modulators. In Proceedings of the 95th AES Convention, New York, October 1993.
Preprint 3711.
[9] S.R. Norsworthy, R. Schreier, and G.C. Temes. Delta-Sigma Converters, Theory,
Design and Simulation. IEEE Press, New York, 1997.
[10] Philips and Sony. Super Audio CD System Description. Philips Licensing, Eindhoven,
The Netherlands, 2002.
[11] D. Reefman and E. Janssen. Enhanced sigma delta structures for Super Audio CD
application. In Proceedings of the 112th AES Convention, Munich, 10-13 May 2002.
Preprint 5616.
[12] D. Reefman and P.A.C.M. Nuijten. Editing and switching in 1-bit audio streams.
In Proceedings of the 110th AES Convention, Amsterdam, 12-15 May 2001.
Preprint 5399.
[13] D. Reefman and P.A.C.M. Nuijten. Why Direct Stream Digital is the best choice as
a digital format. In Proceedings of the 110th AES Convention, Amsterdam, 12-15 May 2001.
Preprint 5396.
[14] S. Wolfram. The Mathematica Book. Wolfram Media/Cambridge University Press,
Cambridge, 4th edition, 1999.