Signal processing for Direct Stream Digital
A tutorial for digital Sigma Delta modulation and 1-bit digital audio processing

Derk Reefman (derk.reefman@philips.com) and Erwin Janssen (erwin.e.janssen@philips.com)
version 1.0, 18 December 2002

Contents
1 Introduction
2 Characteristics of Direct Stream Digital
2.1 Example: Filtering
2.2 Example: Non-linear operations
2.3 Example: Anti-aliasing filters
3 Sigma Delta Modulation
3.1 Overview
3.2 A linear model
3.3 Bit stream
4 Characteristics of SD modulators
4.1 SDM silence
4.2 SDM stability
4.3 Idle tones
5 Design of SDM modulators: I
5.1 Loop-filter design
5.2 Enforcing SDM stability
6 Design of SDM modulators: II
7 Signal processing
8 Dithering and linearizing SDM's
9 Non-linearity in a SDM
9.1 Pre-correction
9.2 SDPC and dither
9.3 Performance of a realistic SDM with SDPC
10 Acknowledgements
A SDM-code

Glossary
ADC: Analogue-to-Digital Converter.
This device converts analogue input signals (from, e.g., a microphone) to a digital signal that can be used in computations (for example in a PC program).
(Anti-)aliasing filter: Filter designed to remove any signal component above the Nyquist frequency.
Authoring: Process in which the final disc image is created. This includes lossless compression, creation of the table of contents, etc.
Class-D: Amplifier topology that relies on Pulse Modulation. The pulses drive switches which connect the load (loudspeaker) either to the positive or negative supply voltage. Characterised by high efficiency; often also called ‘digital amplifier’.
Clipping: The phenomenon that when a format is designed to handle signal levels no larger than a level C, every level larger than C is coded as C. For example, the digital format on a CD cannot handle more than 65536 sub-levels; any signal corresponding to a level larger than +32767 is represented as +32767 (and likewise for negative signals less than -32768).
Clock jitter: Technically, the unwanted phase shift of digital pulses over a transmission medium: a discrepancy between when a digital edge transition is supposed to occur and when it actually does occur.
DAC: Digital-to-Analogue Converter: the reverse of an ADC.
Distortion: Any deviation from a linear input/output relationship, where a linear relationship is defined such that the output equals (apart from a constant gain factor) the input.
Dithering: The addition of a (quasi-)random number to the signal which is subsequently quantised. Due to the dither, the quantization appears as an (almost) linear process.
DSD: The digital format stored on Super Audio CD. DSD is a format in which a 1-bit signal is stored 2822400 times per second. Lowpass-filtering this signal will restore the original waveform.
DST encoding: Direct Stream Transfer, a lossless compression algorithm specifically tailored to the lossless compression of DSD signals.
Editing: In its simplest form, editing is the process of ‘cutting and pasting’ the music such that undesirable parts of the recording are removed. Often, volume changes are also applied and different channels are mixed.
Filter ringing: The effect that a filter with a steep transition band in the frequency domain produces artefacts in the time domain that extend over a significant period of time.
Idle tone: Tone appearing at the output of a noise shaper that bears a simple relation to the input of the Sigma Delta Modulator.
Limit cycle: Signal at the output of a Sigma Delta Modulator that requires a precisely defined input in order to occur, and disappears if the input deviates slightly from that precise value.
Linearity: See distortion.
Lossless compression: A way of compacting digital audio streams such that, when they are unpacked, the original stream is restored. Comparable with the ‘ZIP’ program on PCs.
Mastering: Process in which the edit master is subjected to processes such as EQ to obtain the best sound performance.
Matching: The accuracy to which electronic components are the same. This is important if an electronic circuit relies on the cancellation of two signals: if the components are not exactly identical, a residual (undesirable) signal will remain.
Noise shaping: The shift of spectral content of the (quantization) noise. For example, in a Sigma Delta Modulator the energy of the quantization noise is shifted to high frequencies, leaving little or no noise at low frequencies.
Nyquist Frequency: The largest frequency that can be represented by a digital format; the Nyquist frequency is half the sample frequency.
PCM: Pulse Code Modulation.
A digital format, used for example in CD, whereby a digital signal is represented by an accurate representation (e.g., 16 bits, meaning that the range -1,+1 is subdivided in 65536 sub-intervals) of the wave form at equidistant points in time (for example, in CD a 16-bit approximation of the wave form is stored 44100 times per second).
Pulse Density Modulation: A form of pulse modulation where a large positive signal is represented by a long series of positive pulses; a zero signal is represented by alternating positive and negative pulses.
Recording: The process of storing the music signals on a medium - either in analogue form or in digital form.
(Re-)Quantization: The mapping of a signal of infinite precision to a signal with limited precision. On a CD, e.g., a signal is quantized to 16 bits.
Sigma Delta Modulator: Device which transforms an analogue or PCM signal into a DSD signal. Often abbreviated to SDM, and also often referred to as Delta Sigma Modulator.
Super Audio CD: Super Audio Compact Disc. Format for music distribution proposed by Philips and Sony. Super Audio CD is based on a new digital format called DSD.
Topology: Particular way of connecting building blocks to create a circuit.
Up/Down sampling: A signal processing technique whereby the sample rate of a digital signal is enlarged or reduced. In the latter case, this also corresponds to a loss of information.

1 Introduction

The introduction of Super Audio Compact Disc (SACD) as a successor to the CD has introduced the need for a change in signal processing. Underlying this change is the radically different signal format that is adopted in SACD compared to CD. Whereas in CD the audio format is Pulse Code Modulation (PCM), a 16-bit word at a sample rate of 44100 samples per second, for SACD this is Direct Stream Digital (DSD), a 1-bit word at a sample rate of 64 times 44100 samples per second.
In the early nineties, the time of the conception of DSD, analogue-to-digital converters (ADCs) and digital-to-analogue converters (DACs) were built with 1-bit technology [9]. The driving forces for the use of this technology were purely technical: in the CD era, demands for distortion levels were becoming more stringent, and it proved virtually impossible to create low-distortion devices with many (16) bits. In contrast, it was much easier to create low-distortion converters using a digital format of 1 bit, running at very high sample rates such as 64 or 128 times 44.1 kHz. The conversion of this high-speed, 1-bit format to the 44.1 kHz/16-bit CD format can easily be accomplished in the digital domain using filtering and signal processing, which does not introduce any non-linear distortion. This technique has been highly successful, and the so-called ‘oversampling’ and/or ‘bitstream’ technology dramatically increased the performance of CD players in the nineties. In fact, those CD players were all generating their own DSD internally from the CD source; this DSD would then be fed into a high-quality, 1-bit DAC.

It therefore seemed logical to introduce a format that would store this 1-bit output directly, instead of the ‘intermediate’ CD format: in this way, all filtering and signal processing needed to convert to and from the 1-bit format is eliminated which, by definition, can only increase the sound quality. After the first experiments with DSD, it indeed appeared that the sound quality was significantly better compared to the 44.1 kHz/16-bit format.

At the same time, new ADCs and DACs were appearing on the market that still used high sample rates (64 or 128 times 44.1 kHz), but exploited a few bits (1.5 to 5) instead of 1. Again, this had purely technical reasons: ingenious tricks to reduce the distortion problems of a multi-bit converter had appeared, and were feasible to implement for a limited number of bits.
Because 1-bit converters are more sensitive to clock jitter, the ‘few-bit’ converters took their place in the high-end audio market. This re-introduced the need for some mild signal processing, because SACD can only store a 1-bit format. Interestingly, this did not lead to any observable degradation in sound quality. Therefore, it is now believed that the very high sample rate of DSD is the key factor in the extremely good sound quality of SACD. The fact that the data is 1 bit instead of a few bits, however, has retained its value because it reduces the storage requirements of the audio, thus creating the possibility to store over 70 minutes of stereo and multi-channel DSD on a single Super Audio CD.

The purpose of this document is to explain some technical details of Direct Stream Digital. It tries to give an overview of several signal processing steps which are needed in the world of DSD, and which are different from the accustomed way of doing things. Its purpose is not to give a full explanation of the perceived sound quality of DSD; this white paper is meant to be an introduction to DSD and DSD signal processing for the educated ‘DSD-novice’. Reflecting its importance for SACD, a crucial part of this paper is the 1-bit Sigma Delta Modulator (SDM). The design of such a device will be discussed in detail, and a working example will be designed to illustrate the design process.

Another important issue that will be dealt with is DSD signal processing. A typical signal processing chain for DSD is provided in Fig. 1. In Fig. 1, several steps are envisaged which typically occur in the creation of an SACD. Most of these steps involve analogue or digital signal processing in one way or another. Starting with the AD converter, this is not necessarily a native 1-bit converter. Often, high-end AD converters are 3-6 bit converters running at sample rates between 128fs and 512fs, where fs stands for a sample rate of 44.1 kHz or 48 kHz.
These signal formats need to be converted to 1-bit formats, where any change to the signal information is to be avoided. As this introduces the need for a 1-bit SDM, we will start with some introduction to Sigma Delta Modulation, and the various options that exist to realize a SDM.

In the editing phase, volume adjustments need to be done, and switching between bit streams is necessary. Switching of bit streams is a technique which is rather different from standard signal processing, and is detailed in a separate document [12].

In the mastering phase, heavy signal processing is often involved, ranging from relatively simple equalization to sophisticated reverberation techniques. In the sequel, it will be demonstrated how most of the sophisticated techniques developed for PCM can be easily adjusted for application to DSD. In this respect, it is essential to realize that DSD at 64fs is a consumer format; hence, it is not necessarily the format that is used in the studio, which can in principle be any format as long as it is of equal or better quality compared to standard DSD.

In the authoring phase, finally, no changes to the signal content are made anymore. However, in most cases the format of the data will be transformed to DST (Direct Stream Transfer), which is the compressed format of DSD. This lossless compression scheme allows multi-channel, high-quality DSD data to fit on the approximately 4.7 Gbyte of a high-density layer of an SACD disc.

2 Characteristics of Direct Stream Digital

Before diving into the generation of Direct Stream Digital (DSD), we will first review some characteristics of the format as it is used within the context of Super Audio CD. First and foremost, DSD is characterized by the huge sample rate of 64 times 44.1 kHz, or 2.8 MHz. Largely irrespective of the number of bits, high sample rates in the digital world are desirable because the larger the sample rate, the smaller the audio artefacts introduced by the time quantization.
We will review a few examples which show that 44.1 kHz (or a small multiple of it) is not enough to avoid significant signal distortions due to the time quantization.

Figure 1: Typical signal processing chain for DSD applications (Recording, Editing, Mastering, Authoring/DST encoding, SACD Player).

Figure 2: The effect of warping the analog frequencies to the limited range of digital frequencies; the frequencies are in reduced units (i.e., 0 . . . π). Horizontal axis: analog frequency; vertical axis: warped frequency. Red: the warped frequency. Green: the original frequency. The vertical line shows up to what frequency the warped frequency can be considered to be an accurate representation of the actual frequency.

2.1 Example: Filtering

A well-known issue in (time-discrete) digital systems is the problem of mapping (‘warping’) the infinitely high frequencies, which are allowed in a time-continuous (analogue) system, to a system where the highest representable frequency is the Nyquist frequency (half the sample rate). Obviously, as the ultimate goal of digital signal processing is to present an improvement over analog signal processing, this is a very serious issue. Exemplary for this problem is the bi-linear transform, which maps the analogue frequency ωa to the digital frequency ωd according to:

\[ \omega_d = \frac{2}{T}\,\arctan\!\left(\frac{\omega_a T}{2}\right) \tag{1} \]

where T is the sampling period. As illustrated in Fig. 2, this mapping is almost linear only for a limited frequency regime; for frequencies above 0.1 fs quite substantial deviations occur. As a result, mapping an analog filter (say, a Butterworth filter) to its digital equivalent causes significant distortion of its frequency response. If the sample rate is very high (as, for example, with DSD) the mapping artefacts are benign in the frequency regime which is most important for audio reproduction. Obviously, it is still possible to create a filter which has the characteristics of a digital filter at low sample rate; hence with the use of DSD, one has significant freedom in the choice of filters and filter characteristics.

2.2 Example: Non-linear operations

In audio signal processing, operations such as compression/limiting and clipping are quite common. In compression/limiting, the gain of the signal is adjusted according to the signal; this clearly represents a non-linear operation. Also in clipping, the signal transfer is highly non-linear. If these non-linear operations are performed in the analog domain, they will cause higher harmonics to appear. For example, if a 14 kHz signal is clipped, this will give rise to a third harmonic component at 42 kHz. In the analog domain, this could then be filtered off, if desired. If the clipping were done in the digital domain at a sample rate of 44.1 kHz, however, the 42 kHz harmonic would alias back to low frequency: 42 kHz is 19.95 kHz above the Nyquist frequency (22.05 kHz). The third harmonic would thus be aliased to (22.05 - 19.95) = 2.1 kHz, which would give very audible distortion as that frequency is not harmonically related to 14 kHz. Moreover, there is no way to remove this distortion by a filter operation. The only remedy is to up-sample to a high frequency, and do the non-linear operation at that high rate, thus ensuring that only high-frequency, high-order harmonic components are aliased. This causes less harm, because high-order components tend to be of lower amplitude. The signal is then down-sampled again to 44.1 kHz, with the appropriate low-pass filtering. Now, obviously, in DSD the sample rate is so high that non-linear operations behave as they would in the analog domain. Hence, no up- and down-sampling is required, and the decision whether or not to remove high-order distortion components is left to the sound engineer - and not dictated by the format.
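The aliasing arithmetic of the clipped 14 kHz example can be checked with a few lines of Python. This is only a sketch: the helper name `alias_frequency` is introduced here for illustration and is not part of the original text; it folds a frequency into the band 0 . . . fs/2, as ideal sampling would.

```python
def alias_frequency(f_hz, fs_hz):
    """Frequency (Hz) at which a tone of f_hz appears after ideal sampling
    at fs_hz, i.e. folded into the band 0 .. fs_hz/2.
    Hypothetical helper, for illustration only."""
    f = f_hz % fs_hz              # sampling cannot distinguish f from f + k*fs
    return fs_hz - f if f > fs_hz / 2 else f

third_harmonic = 3 * 14_000       # 42 kHz, produced by clipping a 14 kHz tone

cd_alias = alias_frequency(third_harmonic, 44_100)        # CD sample rate
dsd_alias = alias_frequency(third_harmonic, 64 * 44_100)  # DSD sample rate

print(cd_alias)   # 2100: folded back into the audio band
print(dsd_alias)  # 42000: stays where it would be in the analogue domain
```

At the DSD rate the harmonic lies far below the Nyquist frequency of about 1.4 MHz, so it is not folded at all and can be filtered or kept as desired.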
2.3 Example: Anti-aliasing filters

Because of the extremely high sample rate, DSD sets only very relaxed requirements for the anti-aliasing filters, which, hence, can be chosen to be rather sloppy. As a result, the ringing in the time domain is substantially lower compared to systems of lower sample rate where steep anti-aliasing filters are mandatory. This effect is clearly illustrated in Fig. 3. The impulse responses of 4 different systems in a multi-channel configuration are depicted: a 48 kHz system with a bandwidth of 20 kHz (that is, 8 kHz transition bandwidth is allowed for anti-aliasing filtering), a 96 kHz system with 35 kHz bandwidth (26 kHz transition bandwidth), a 192 kHz system with 75 kHz bandwidth (42 kHz transition bandwidth) and an SACD system with 95 kHz bandwidth (and about 120 kHz transition bandwidth). Though none of the systems reproduces the input exactly, the DSD system shows the least artefacts. Clearly, the 48 kHz system has great difficulty in reproducing the click; due to the steep filtering it starts ‘wobbling’, or ringing, at a -30 dB level with respect to the top of the response approximately 0.4 ms before the click, which is very audible (this is also the reason why many people prefer ‘sloppy’ anti-alias filters in CD players, even at the cost of reduced anti-aliasing characteristics). It also continues to ring after the click for the same length of time, but most probably this ‘after-ringing’ is audibly masked by the click itself and, hence, not as important as the pre-ringing. Apart from this effect, the amplitude is also only a fifth of what it should be.
Especially when the sound will traverse through a non-linear medium, such as the human ear, this may lead to even larger perceived differences than what can be concluded directly from Fig. 3. Also at the higher sampling frequencies, the ringing phenomenon cannot be removed, though it is reduced significantly. Only the DSD system is very effective in suppressing the ringing effect, due to very slow filtering above 95 kHz. The price to pay for this is the increase in noise floor with respect to the other systems; however, as the noise floor contains only high-frequency components which are uncorrelated with the audio, they are not perceptible.

Figure 3: Responses (from left to right) of a DSD, a 192 kHz, a 96 kHz and a 48 kHz system on a -6 dB block input (‘click’) of 3 µs duration, and amplitude 0.25. Horizontal axis: time (s); vertical axis: Amplitude. Note the linear amplitude scale.

3 Sigma Delta Modulation

In this section, it will be assumed throughout that the sample rate equals 64 times 44.1 kHz (≈ 2.8 MHz), i.e., the sample rate of SACD. By far the most common way to generate such a 1-bit DSD stream is by the use of a Sigma Delta Modulator (SDM), although it is nowhere stated in the definitions of Super Audio CD [10] that the bit stream present on the disc must be generated by a SDM. In fact, recently many other methods have been developed which are not simply a (single) SDM. For example, in [3] a type of SDM with an elaborate re-ordering scheme is presented, and in [5] a so-called Trellis-SDM is presented. In [11], a cascaded structure of 2 SDMs is presented, which will be presented in a slightly modified form in Sec. 9.
All of these new developments have in common that their performance is in some way better than that of an ordinary SDM, but at the same time there is a substantial increase in complexity. Because a single SDM is still at the basis of all these new developments, and because a standard SDM is still by far the most widely used device to generate a bit stream, we will continue by elaborating on the principles of a simple SDM.

Figure 4: Typical output spectrum of an SDM (4 kHz, -6 dB input). Horizontal axis: Frequency; vertical axis: Power (dB).

3.1 Overview

Sigma Delta Modulation, often also known as noise shaping, is in most general terms a technique which allows (digital) quantization errors to be spectrally shaped. In the SDMs that are typically used for DSD applications, the aim of this spectral shaping is to push the gross quantization errors made by the coarse 1-bit quantizers to high frequencies, where these errors are inaudible. This is possible due to the high oversampling factor of 64, which leaves a band from approximately 80-100 kHz (the lower edge is determined by the maximum allowable input, as will be discussed later in Sec. 5) up to 1.4 MHz (the Nyquist frequency) to accommodate virtually all the quantization errors. An illustration of this phenomenon is given in Fig. 4. Indeed, the spectrum illustrates that this SDM design allows for a very high dynamic range in the audio band (0-20 kHz) and a decreasing dynamic range in the band from 20 to 80-100 kHz, from where the dynamic range remains constant up to 1.4 MHz.

Schematically, a SDM can be represented as in Fig. 5.

Figure 5: Above: Sigma Delta structure (in feed forward configuration), with input u, loop filter H(z), quantizer Q and output y. Below: equivalent noise shaper structure, with filter F(z) in the feedback path.

Historically, the SDM is preceded by the noise shaper (NS) (also see Fig. 5).
The most significant difference between a noise shaper architecture and a sigma delta structure is the position of the filter: in a noise shaper, the filter is in the feedback loop; in a SDM, the filter is in the feed-forward path. Due to the filter in the feedback loop, the error of the quantizer is spectrally shaped by the filter F(z) and fed back to the input of the quantizer. It is this process which is called noise shaping of the quantization error. Though this appears rather different from a SDM, the noise shaper structure is virtually identical to the SDM topology. In fact, the SDM and the NS in Fig. 5 are identical if the filter F(z) in the noise shaper equals F(z) = H(z)/(H(z) + 1). In that case, the input still needs to be pre-amplified by the filter H(z)/(H(z) + 1) to obtain an identical signal transfer function. It is important to realize that, because of their equivalence, both a noise shaper and an SDM perform noise shaping of the quantization noise. For that reason, a SDM is often (mistakenly) called a noise shaper, even though the topology of a noise shaper is different from that of a SDM. The noise shaper architecture is not often used in analog-to-digital converters because matching in the analog domain is difficult, and thus leads to implementation problems. Generally one resorts to SDM topologies, which suffer less from analogue problems. In the digital domain, where precision is arbitrary, matching is not a fundamental problem and both structures can be used. Because of their identical nature, we will restrict the discussion to the SDM-like structures.

3.2 A linear model

For applications in SACD, the quantizer Q in a SDM is a 1-bit quantizer, which outputs only values of +1 and −1. This is a highly non-linear element, which has its ramifications on the operation of the SDM.
To gain some initial insight in the characteristics of the SDM, however, we will resort to a simple linear model and replace the highly non-linear quantizer by a (linear) gain c and a noise source n, which models the quantization error, as indicated in Fig. 6.

Figure 6: Linearization of Sigma Delta structure. The quantizer is replaced by a (signal independent) gain c, and an additive noise source n. The signal transfer function STF and noise transfer function NTF are defined by $Y = \mathrm{STF}\cdot U + \mathrm{NTF}\cdot N$, where Y is the Fourier transform of the output y, U is the Fourier transform of the input u and N the Fourier transform of the additive noise n.

Doing this, we can write for the signal transfer function (STF) and the noise transfer function (NTF) the following expressions:

\[ \mathrm{STF}(z) = \frac{c\,H(z)}{1 + c\,H(z)} \tag{2} \]

\[ \mathrm{NTF}(z) = \frac{1}{1 + c\,H(z)} \tag{3} \]

Assuming that the quantizer gain c ≈ 1, this shows how, in a situation where the loop gain H(z) is very large, the signal transfer function approximates 1. The noise transfer function, on the contrary, is negligible for large H(z). As the loop filter H(z) typically is a low-pass filter with large LF gains, it shows that in SACD applications the quantization noise is suppressed in the audio band. In Fig. 4, for example, the loop filter is a Chebyshev type II design with a corner frequency of 90 kHz.

It is of crucial importance, however, to realize that the replacement of the quantizer by a gain element c and an additive noise source is a very crude approximation, the more so if c = 1 is taken. Typically, the Signal-to-Noise Ratios (SNRs) as calculated from simulations on the actual SDM with the non-linearity included differ significantly from those obtained by the use of the linearized model. Also other characteristics, discussed in Sec. 4, are not properly, or not at all, explained by the linearized model.

Figure 7: Above: A fourth order Sigma Delta structure in feed forward configuration, with coefficients c_1 . . . c_4. Below: a fourth order feedback topology, with coefficients c'_1 . . . c'_4. If c'_1 = c_1/c_4; c'_2 = c_2/c_4 etc., the NTFs of these modulators are identical.

There also exist other SDM realizations. Whereas the SDM structure in Fig. 5 is referred to as a feed-forward topology, there also exist feedback topologies. A feedback topology is displayed in Fig. 7. As in the comparison of the noise shaper vs. the feed-forward SDM, there is some equivalence between a feedback and a feed-forward topology. We will see this in a later section. The choice of which topology to use then depends on the design of the complete system.

3.3 Bit stream

In Fig. 8, a characteristic output sequence of a SDM is shown, receiving a sine wave of amplitude 0.95 and frequency 20 kHz as its input.

Figure 8: Comparison of the DSD output of a SDM (red) and the input to the SDM (blue). Horizontal axis: sample number. Clearly, in regions where the input sine wave is negative, the bits that are output from the SDM are predominantly negative, and vice versa.

Even though Fig. 4 leaves no doubt about the very high accuracy with which the signal is represented in the SDM output, it is hard to visualize the sine wave from a series of +1s and −1s. An idea is that the signal that is represented by the bit stream can be obtained by taking a local average of the bit stream: clearly, when the input sine wave is positive/negative, most bits that are output from the SDM are positive/negative too, and outnumber the opposite bits by far. Likewise, around zero input the number of positive and negative bits is roughly identical. Hence, the global wave form of the underlying (low frequency) signal can be estimated by taking a local average of the bit stream - akin to pulse-density modulation, which is sometimes used in Class-D amplifiers.
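The local-average idea can be tried out numerically. The sketch below is an assumption-laden illustration, not the modulator used for the figures: it uses the simplest possible first-order SDM (a single integrator followed by a 1-bit quantizer) instead of a higher-order design, and the function names `first_order_sdm` and `local_average` as well as the window length of 64 samples are chosen here for illustration only.

```python
import math

FS = 64 * 44_100  # DSD sample rate in Hz

def first_order_sdm(u):
    """Simplest 1-bit sigma delta modulator: one integrator plus a 1-bit
    quantizer. Illustrative stand-in, NOT the higher-order SDM of the text."""
    v = 0.0   # integrator state
    y = 1.0   # previously output bit (+1 or -1)
    bits = []
    for x in u:
        v += x - y                       # integrate input minus fed-back bit
        y = 1.0 if v >= 0.0 else -1.0    # 1-bit quantizer
        bits.append(y)
    return bits

def local_average(x, window):
    """Sliding mean over `window` consecutive samples."""
    out, acc = [], 0.0
    for i, s in enumerate(x):
        acc += s
        if i >= window:
            acc -= x[i - window]
        if i >= window - 1:
            out.append(acc / window)
    return out

# Input as in the text: sine wave of amplitude 0.95 and frequency 20 kHz.
u = [0.95 * math.sin(2 * math.pi * 20_000 * n / FS) for n in range(4000)]
bits = first_order_sdm(u)

window = 64  # assumed averaging length
avg_bits = local_average(bits, window)
avg_input = local_average(u, window)
worst = max(abs(a - b) for a, b in zip(avg_bits, avg_input))
print(f"largest deviation between the local averages: {worst:.3f}")
```

The local average of the bits follows the local average of the input to within a few percent here, which is enough to recognize the wave form but far from the accuracy that the spectrum of a well-designed SDM suggests.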
Obviously, this local average will not be a highly accurate representation of the wave form. A better impression of the accuracy with which the input is represented is obtained by filtering the output of the SDM with a filter which removes the signal in the DSD stream above 20 kHz (in fact, local averaging is a low-pass filter, albeit not a very good one). It is, therefore, informative to build a system as presented in Fig. 9. This system allows us to compare the original input signal with the signal which has passed through the Sigma Delta Modulator. To this end, the bit stream output of the SDM is low-pass filtered with a steep filter, so as to remove any components above 20 kHz. The input signal can be any signal of large enough resolution; below, we will take signals with a (digital) resolution of 24 bits. Due to the filter after the SDM, the input signal has to be delayed by an appropriate amount to compensate for the delay introduced by the filter. If the input and output signals are subtracted, the residual signal ε can be inspected.

Figure 9: Setup which allows an upsampled, low-rate, high-resolution signal to be compared with its DSD equivalent; the difference signal is denoted ε. Note that the down sampling is necessary only for the purpose of comparison.

Figure 10: Time domain representation of the difference signal ε, for both a 1 kHz input signal (red) and a 20 kHz signal (green). Horizontal axis: Time (s); vertical axis: Absolute difference.

In Fig. 10, two results are displayed: a first residual signal ε, where the input signal was a sine wave (0 dB SACD, 1 kHz), and a second residual signal (0 dB SACD, 20 kHz). The resulting signal is, for both inputs, noise-like, with an amplitude that corresponds to a resolution of at least 120 dB. Likewise, this experiment can be performed with real audio signals, where the result will be the same.
Obviously, when the low-pass filtering applied in the down-sampling process does not suppress the noise above 20 kHz to a level of -120 dB, the residual signal will be larger.

A separate issue that becomes clear from Fig. 8 is the fact that while negative parts of a sine wave are represented by predominantly negative output values, the pattern in which the +1s and −1s appear is never the same. This observation leads to the issue of editing DSD streams. While this is an important topic, the reader is referred to [12] for a discussion of editing and switching DSD, as this is a non-trivial issue.

4 Characteristics of SD modulators

Sigma Delta modulators represent a new class of devices, which display phenomena other than those we are used to in the PCM world. In the sequel, a few of these features which are important in practical applications will be highlighted.

4.1 SDM silence

A first important aspect is that the output of the SDM always has a power equal to 1, because the output can only take the values ±1. As a result, silence, as referred to in DSD, only means that the power spectrum of the DSD is empty below a threshold frequency, above which no signal can be perceived. For example, the following repetitive patterns are often used and are referred to as DSD silence patterns:

pattern     hex code   pattern frequency
01010101    0x55       1.4 MHz
10101010    0xaa       1.4 MHz
10010110    0x96       352.8 kHz
01101001    0x69       352.8 kHz

Indeed, the patterns do not contain signal components below 80 kHz; however, they still represent signals with a total power of 1. Often, these patterns are referred to by their hexadecimal equivalent. The fact that these signals are silent, but still contain information, can be exploited to use these signals as synchronization words [4].
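The properties of the silence patterns can be verified with a small discrete Fourier transform over one repetition of the pattern. The following sketch makes a few assumptions that are not stated in the text: a bit 1 is mapped to +1 and a bit 0 to −1, the byte is read most significant bit first, and the helper names `pattern_to_bits` and `tone_frequencies` are hypothetical.

```python
import cmath

DSD_RATE = 64 * 44_100  # 2822400 samples per second

def pattern_to_bits(byte):
    """Unpack an 8-bit pattern (MSB first, assumed) into +1/-1 samples."""
    return [1 if (byte >> (7 - i)) & 1 else -1 for i in range(8)]

def tone_frequencies(byte, fs=DSD_RATE):
    """Frequencies (Hz, 0 .. fs/2) present when the pattern repeats forever."""
    bits = pattern_to_bits(byte)
    n = len(bits)
    freqs = []
    for k in range(n // 2 + 1):
        coeff = sum(b * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, b in enumerate(bits))
        if abs(coeff) > 1e-9:          # ignore bins that are numerically zero
            freqs.append(k * fs / n)
    return freqs

for code in (0x55, 0xAA, 0x96, 0x69):
    bits = pattern_to_bits(code)
    power = sum(b * b for b in bits) / len(bits)
    print(f"0x{code:02x}: power {power:.0f}, tones at {tone_frequencies(code)} Hz")

# 0x55 and 0xaa contain only fs/2 = 1411200 Hz;
# 0x96 and 0x69 contain 352800 Hz and its third harmonic at 1058400 Hz.
```

None of the patterns has a DC component or any component below 352.8 kHz, while the power is 1 in every case, in line with the table above.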
4.2 SDM stability

Another important aspect is that, while the output of the SDM varies between ±1, its input most often cannot vary over this range because the SDM becomes unstable for inputs of high amplitude. While a full theoretical description of this phenomenon is still lacking, a wealth of heuristic knowledge [9] is available on the stability of higher (> 2) order SDMs. Because of all this experimentally obtained insight, accurate descriptions of instability are present that can be used in the design of properly functioning modulators (see Sec. 5). In Fig. 11, the performance of a SDM is shown as a function of its input amplitude. Clearly, above a certain threshold, the performance collapses (in fact, the SDM gets into wild oscillations if no precautions are taken). The exact amplitude where the sudden collapse occurs is dependent on the wave form of the input and its frequency, and on the SDM design, and is thus not an easy quantity to determine. In Sec. 5, precautions that can be taken to prevent this uncontrolled behaviour are discussed, which lead to so-called graceful degradation: instead of a sudden collapse in performance, the performance drops in a much less aggressive way.

Figure 11: Graphical representation of the stability problems for large inputs: for a simple SDM (in red) discussed in Sec. 5 the SNR collapses for signal amplitudes (in this case: a 4 kHz sine wave) of more than 0.59. In green the result of so-called ‘graceful degradation’ is shown. Horizontal axis: input amplitude; vertical axis: Signal to Noise Ratio (dB).

This overload phenomenon is the reason why the SACD 0 dB reference level has been set to 50% of the ‘maximum theoretically possible modulation depth’ [10]; in the cases discussed here this means that allowable input levels vary between −0.5 and 0.5.
This definition introduces the possibility to allow signal levels which are larger than 0 dB, in contrast to PCM, which has a clear limit at 0 dB: all inputs larger than 0 dB are harshly clipped to 0 dB. As will become clear in the following sections, for SDMs this overload is possible at the limited cost of increased distortion (clipping of the internal integrators). In this respect, the DSD format compares to analogue tape recordings, which also allowed for serious signal overload, but also at the cost of significant distortion. Obviously, for high fidelity recordings for Super Audio CD, the 0 dB level should never be crossed.

4.3 Idle tones

As discussed in Sec. 4.1, silence in DSD is often equivalent to having a high powered tone outside the signal band. These tones are called ‘idle tones’. For higher order SDMs, the 1-bit output signal still carries these idle tones, although they have much reduced amplitude compared to the purely repetitive patterns shown in Sec. 4.1, and are embedded in a large amount of uncorrelated noise. For non-zero DC inputs, these tones start to move down in frequency with increasing DC level; at the same time, tones may start to appear in the LF part, which can, potentially, be audible. The origin of the tones appearing in the LF part lies in the feedback character of a SDM: suppose we have a DC input of 0.25. The most likely combination of bits which represents that value is 1, −1, 1, −1, 1, −1, 1, 1. If this sequence is repeated, a tone at a frequency of 8 × 44.1 kHz will result. For each halving of the DC input, this frequency will be halved too; eventually, this tone will end up below 20 kHz. This phenomenon can be reduced, or even removed, in several ways (dithering and other means), as will be discussed more extensively in Sec. 8. If the SDM is undithered, audibility of these tones depends on the SDM used. Typically, the higher the order of the SDM, the lower the power of the tone in the audio band.
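The relation between DC input and tone frequency can be illustrated with the simplest possible modulator. The following sketch assumes a plain first-order SDM (one integrator, 1-bit quantizer, no dither), which is a deliberate simplification and not one of the higher order designs discussed in this tutorial; the function names are hypothetical. The bit order it produces need not coincide with the pattern quoted above, but the period and the average value do.

```python
FS = 2_822_400  # DSD sample rate in Hz (64 x 44.1 kHz)

def first_order_sdm(dc, n):
    """First-order sigma-delta modulator driven by a constant input `dc`."""
    state = 0.0
    out = []
    for _ in range(n):
        y = 1 if state >= 0 else -1   # 1-bit quantizer
        out.append(y)
        state += dc - y               # integrate input minus fed-back output
    return out

def smallest_period(bits):
    """Smallest p such that the whole sequence repeats with period p."""
    n = len(bits)
    for p in range(1, n // 2 + 1):
        if all(bits[i] == bits[i + p] for i in range(n - p)):
            return p
    return None

for dc in (0.25, 0.125):
    bits = first_order_sdm(dc, 256)
    p = smallest_period(bits)
    print(f"DC {dc}: period {p} samples, tone at {FS / p:.0f} Hz, "
          f"mean {sum(bits[:p]) / p}")
```

For a DC input of 0.25 the output repeats every 8 samples (a tone at 8 × 44.1 kHz); halving the input to 0.125 doubles the period and halves the tone frequency.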
For a typical (undithered) SDM the tones in the audible band are below -130 dB.

5 Design of SDM modulators: I

In this section, a fully operational SDM will be designed. We will use the linearized model of the SDM to obtain values for the coefficients of the SDM, following to a large extent the design route proposed in [2], and also discuss ways to ameliorate the stability problem. From the start, it is important to know that the only way to obtain reliable insight into the performance of the SDM is by simulation; although the linear approximation usually results in a working SDM, it is too crude to provide numbers about SNR and, even more important, it does not provide any insight into stability. Also, in the design process, we assume an effective quantizer gain c = 1. Simulations based on this design can give some idea about what the effective gain c actually is within the limitations of the linear model, and can be used for further refinement of the loop-filter.

5.1 Loop-filter design

A very convenient way to start the design of a SDM modulator is the linear model of Fig. 6, where we take the gain c = 1. We take a feed-forward structure from Fig. 7, and write down the NTF that is associated with it. We can write for the loop-filter H(z):

$$ H(z) = c_1 \frac{z^{-1}}{1 - z^{-1}} + c_2 \left( \frac{z^{-1}}{1 - z^{-1}} \right)^2 + c_3 \left( \frac{z^{-1}}{1 - z^{-1}} \right)^3 + c_4 \left( \frac{z^{-1}}{1 - z^{-1}} \right)^4 \tag{4} $$

and making use of the relation $NTF(z) = 1/(1 + H(z))$ we arrive at:

$$ NTF(z) = \frac{(1 - z^{-1})^4}{(1 - z^{-1})^4 + c_1 z^{-1} (1 - z^{-1})^3 + c_2 z^{-2} (1 - z^{-1})^2 + c_3 z^{-3} (1 - z^{-1}) + c_4 z^{-4}} \tag{5} $$

which is to be recognized as a filter of the form $NTF(z) = (1 - z^{-1})^n / P_n(z^{-1})$. This is the form of a Butterworth or a Chebyshev type II filter (see footnote 1); the choice of either of those realizations dictates the final appearance of the polynomial P(z).
Likewise, the STF can be computed as $STF(z) = 1 - NTF(z)$, resulting in:

$$ STF(z) = \frac{c_1 z^{-1} (1 - z^{-1})^3 + c_2 z^{-2} (1 - z^{-1})^2 + c_3 z^{-3} (1 - z^{-1}) + c_4 z^{-4}}{(1 - z^{-1})^4 + c_1 z^{-1} (1 - z^{-1})^3 + c_2 z^{-2} (1 - z^{-1})^2 + c_3 z^{-3} (1 - z^{-1}) + c_4 z^{-4}} \tag{6} $$

The approach that can now be followed is to design a high-pass filter for NTF(z), according to Butterworth or Chebyshev-II (or any other) rules, and to reorganize terms such that it is in the shape of Eq. (5). One way of approaching this is to use a symbolic manipulation package such as Mathematica [14], or to collect terms in powers of z and equate identical powers. From an engineering point of view, a very easy way of obtaining the coefficients $c_i$ is by recognizing that $1/NTF(z)$ is linear in the coefficients $c_i$. It is then possible to set up a linear system for different values of z (at least as many as the order of the system). These values must have no simple relation to each other, but need not be complex. In this way, it is also irrelevant whether the Butterworth filter is provided as a cascade of biquads, or as a direct realization.

When we inspect the feedback structure (lower part of Fig. 7), we see that the transfer characteristic for the NTF(z) is identical to the NTF of the feed-forward structure discussed above. However, the STF is given by

$$ STF(z) = \frac{z^{-4}}{(1 - z^{-1})^4 + c_1 z^{-1} (1 - z^{-1})^3 + c_2 z^{-2} (1 - z^{-1})^2 + c_3 z^{-3} (1 - z^{-1}) + c_4 z^{-4}} \tag{7} $$

which, for low frequencies, equals about 1 if the coefficient $c_4$ equals unity (this refers to the scaling as applied in Fig. 7). For higher frequencies, the STF displays an almost third-order roll-off. This is in contrast to the feed-forward topology, where the STF rolls off only very slightly (first order) for high frequencies.

Footnote 1: albeit scaled such that the first term $c_0 z^0$ of H(z) equals zero. If this term were non-zero, the resulting SDM would not contain a delay in the closed loop and hence would not be realizable.
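The linear-system route for obtaining the coefficients can be sketched as follows. Since $1/NTF(z) = 1 + H(z) = 1 + \sum_k c_k h(z)^k$ with $h(z) = z^{-1}/(1 - z^{-1})$, evaluating the target NTF at as many points z as there are coefficients gives a square linear system. Everything specific in this sketch is an assumption made for illustration: the target NTF is generated from hypothetical coefficients (so the result can be checked by round trip), the four real evaluation points are arbitrary, and the small Gaussian elimination routine stands in for any linear solver.

```python
def integrator(z):
    """h(z) = z^-1 / (1 - z^-1) = 1 / (z - 1)."""
    return 1.0 / (z - 1.0)

def inverse_ntf(z, coeffs):
    """1/NTF(z) = 1 + H(z) = 1 + sum_k c_k h(z)^k for a feed-forward loop filter."""
    h = integrator(z)
    return 1.0 + sum(c * h ** (k + 1) for k, c in enumerate(coeffs))

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_coefficients(target_inverse_ntf, z_points):
    """Recover c_1..c_n from samples of 1/NTF at n distinct points z."""
    order = len(z_points)
    a = [[integrator(z) ** (k + 1) for k in range(order)] for z in z_points]
    b = [target_inverse_ntf(z) - 1.0 for z in z_points]
    return solve_linear(a, b)

# Hypothetical target: an NTF built from known coefficients (round-trip check).
true_c = [0.8, 0.3, 0.07, 0.008]
z_points = [2.0, 3.0, -2.0, -1.5]   # arbitrary real points, none equal to 1
recovered = fit_coefficients(lambda z: inverse_ntf(z, true_c), z_points)
print(recovered)
```

In a real design the lambda would be replaced by the evaluation of the designed high-pass filter, in whatever form (direct or biquad cascade) it happens to be available.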
As an example, we will design a fourth order SDM, with a NTF according to a Butterworth high-pass filter design. The cut-off frequency is chosen as 150 kHz. Because the SDM needs to be realizable, the total loop needs to embody at least a single delay, i.e., the term with $z^0$ in the STF needs to be zero. This corresponds with the requirement that the high pass filter should have 1 as the first value of its impulse response. This can be accomplished by multiplying the high pass filter with a certain coefficient (larger than 0), resulting in a HF gain which is larger than 1. With the above in mind, we obtain for the NTF:

$$ NTF(z) = \frac{+1.00 z^{-0} - 4.00 z^{-1} + 6.00 z^{-2} - 4.00 z^{-3} + 1.00 z^{-4}}{+1.00 z^{-0} - 3.13 z^{-1} + 3.75 z^{-2} - 2.03 z^{-3} + 0.42 z^{-4}} \tag{8} $$

This results in the following coefficients in the feed-forward structure:

$$ c_1 = 0.8707115357, \quad c_2 = 0.3594322506, \quad c_3 = 0.0811807847, \quad c_4 = 0.0083240406 \tag{9} $$

For the feed-forward structure, the STF is now given by:

$$ STF(z) = \frac{+0.00 z^{-0} + 0.87 z^{-1} - 2.25 z^{-2} + 1.97 z^{-3} - 0.58 z^{-4}}{+1.00 z^{-0} - 3.13 z^{-1} + 3.75 z^{-2} - 2.03 z^{-3} + 0.42 z^{-4}} \tag{10} $$

For the feedback structure, the STF is given by:

$$ STF(z) = \frac{z^{-4}}{+1.00 z^{-0} - 3.13 z^{-1} + 3.75 z^{-2} - 2.03 z^{-3} + 0.42 z^{-4}} \tag{11} $$

In Fig. 12, the different STF’s for a feed-forward and a feedback structure, with an identical NTF, have been calculated. The NTF’s are designed as 4’th order Butterworth high pass characteristics, with a cut-off frequency of 150 kHz. Clearly, the strong roll-off characteristic of the feedback structure can be observed. Interestingly, the feed-forward topology displays a strong peak in its transfer characteristic at the cross-over frequency. This feature is not obvious from Eq. (3) if only the magnitude response |H| is used. The maximum peak height is in this case about 6 dB. This loop-filter design gives rise to an SDM with a maximum input of about -5 dB (i.e., 0.57 w.r.t. the feedback signal from the quantizer).
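The numbers in Eqs. (8) to (10) can be reproduced from the coefficients of Eq. (9) by expanding Eq. (5) in powers of $z^{-1}$. The sketch below does this with plain coefficient lists; the helper names are hypothetical and the code only checks the algebra, it does not design the Butterworth filter itself.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists in powers of z^-1."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_pow(p, n):
    """Raise a polynomial to a non-negative integer power."""
    out = [1.0]
    for _ in range(n):
        out = poly_mul(out, p)
    return out

def ntf_polynomials(c):
    """Numerator and denominator of Eq. (5) for feed-forward coefficients c_1..c_n."""
    order = len(c)
    one_minus = [1.0, -1.0]                      # 1 - z^-1
    num = poly_pow(one_minus, order)             # (1 - z^-1)^n
    den = num[:]
    for k, ck in enumerate(c, start=1):
        # Term c_k z^-k (1 - z^-1)^(n-k): shift by k, then scale.
        term = [0.0] * k + [ck * v for v in poly_pow(one_minus, order - k)]
        den = [d + t for d, t in zip(den, term)]
    return num, den

c = [0.8707115357, 0.3594322506, 0.0811807847, 0.0083240406]   # Eq. (9)
ntf_num, ntf_den = ntf_polynomials(c)
stf_ff_num = [d - n for d, n in zip(ntf_den, ntf_num)]         # STF = 1 - NTF

print([round(v, 2) for v in ntf_num])
print([round(v, 2) for v in ntf_den])
print([round(v, 2) for v in stf_ff_num])
```

Rounded to two decimals, the three lists match the numerator and denominator of Eq. (8) and the numerator of Eq. (10). Note that the $z^0$ term of the STF numerator vanishes, which is the realizability condition mentioned above.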
At an input of a sine with an amplitude of 0.5, the (unweighted) Signal to Noise Ratio (SNR) in the band 0-20 kHz is about 97 dB. In SACD applications, this is not sufficient: a signal-to-noise ratio of at least 100 dB is desirable. However, one might argue that the A-weighted SNR is much better, because the noise floor is large only for frequencies close to 20 kHz. Indeed, for this example, the A-weighted SNR amounts to about 105 dB. More important is the maximum modulation depth of the modulator. The definition of the 0 dB level in SACD is 50% modulation depth, i.e., the sine wave from the previous example would correspond to 0 dB SACD exactly. Peaks in the signal of +3.1 dB are allowed (though for a short period only; see footnote 2). Hence, the SDM needs to be stable for inputs up to a level of about 0.71.

Figure 12: Signal transfer functions for a feed-forward topology (red) and a feedback topology (green) with identical NTF’s. (Axes: Gain (dB), -60 to 10, versus frequency (Hz), 10 to 1e+07; curves ’FF.STF’ and ’FB.STF’.)

For every SDM design, there is a trade-off between stability of the modulator and the SNR in the base-band. As an example, consider the results in Table 1 for different 4’th order SDM’s, which have all been created using Butterworth high pass filters as design NTF.

cut-off (kHz)   DR (dB)   max. input level
100             85        0.77
120             90        0.70
150             97        0.57
170             100       0.49

Table 1: Trade-off of the maximum input range and the SNR in the base-band.

Clearly, for these modulators it is not possible to obtain a dynamic range (unweighted) exceeding 100 dB, while maintaining the possibility of seriously overloading the SDM to a level of +3.1 dB. One way of increasing the SNR in the audio band, while hardly reducing the maximum input level, is to use higher order filters for the NTF, and to use a Chebyshev type II-like high pass filter for the NTF design instead of a Butterworth characteristic.
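The level bookkeeping used above is easy to get wrong, so a small worked example may help. The sketch assumes the convention stated in the text (0 dB SACD corresponds to an amplitude of 0.5 relative to the quantizer feedback level) and copies the values of Table 1; the function and variable names are hypothetical.

```python
import math

SACD_0DB = 0.5  # 0 dB SACD = 50 % modulation depth, relative to the feedback level

def sacd_db_to_amplitude(level_db):
    """Amplitude (relative to the quantizer feedback level) of an SACD level in dB."""
    return SACD_0DB * 10 ** (level_db / 20)

def amplitude_to_db(amplitude):
    """Level in dB relative to the quantizer feedback level (amplitude 1)."""
    return 20 * math.log10(amplitude)

# Table 1: (cut-off in kHz, dynamic range in dB, maximum input level)
TABLE_1 = [(100, 85, 0.77), (120, 90, 0.70), (150, 97, 0.57), (170, 100, 0.49)]

peak = sacd_db_to_amplitude(3.1)                  # required overload headroom
survivors = [row for row in TABLE_1 if row[2] >= peak]

print(f"+3.1 dB SACD peak amplitude: {peak:.3f}")
print(f"0.57 relative to feedback:   {amplitude_to_db(0.57):.2f} dB")
print("designs tolerating the peak:", survivors)
```

Only the 100 kHz design tolerates the +3.1 dB peak, and it does so at a dynamic range of 85 dB, which illustrates the trade-off expressed by Table 1.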
Chebyshev type II high pass filters can easily be created in SDM’s by the construction of resonator sections, as displayed in Fig. 13. The construction in Fig. 13 is, in principle, applicable to a feed-forward topology; for a feedback topology, a similar arrangement with a feedback loop over two integrator sections is possible. In Fig. 13, two outputs of the resonator section are indicated, $R_1$ and $R_2$; the relation between these is that $R_2(z) = h(z) R_1(z)$, designating the transfer characteristic of the integrator section as $h(z) = z^{-1}/(1 - z^{-1})$. Also, two different realizations of the feedback path (with coefficient f) are possible. The full drawn curve in Fig. 13 doesn’t incorporate the delay that the dotted realization does.

Figure 13: A cascade of two integrator sections in a SDM, with a feedback loop between the integrators. The two different ways of incorporating the feedback loop result in slightly different pole characteristics. Indicated are the two different outputs, which are characterized by a transfer function $R_1(z)$ and $R_2(z)$, respectively.

The effects of the dotted feedback structure can be obtained as follows. The transfer $R_2(z)$ of the resonator section becomes:

$$ R_2(z) = \frac{h^2(z)}{1 + f h(z)^2} \tag{12} $$

This function has a pole at $z_p$ when $z = z_p$ solves $(1 + f) z^{-2} - 2 z^{-1} + 1 = 0$, i.e.,

$$ z_p = 1 \pm i \sqrt{f} \tag{13} $$

Hence, the norms $|z_p| > 1$. The reduced frequencies $f_{pole}$ of these poles are thus given by

$$ f_{pole} = \mathrm{atan}(\sqrt{f}) \tag{14} $$

In the case of the full feedback path in Fig. 13, the resonator has transfer functions $R_1(z)$, $R_2(z)$ given by:

$$ R_1(z) = \frac{h(z)}{1 + z f h(z)^2}; \qquad R_2(z) = h(z) R_1(z) \tag{15} $$

In this case, the poles are given by

$$ z_p = 1 - \frac{f}{2} \pm \frac{i}{2} \sqrt{4f - f^2} \tag{16} $$

Contrary to the previous case, these poles are exactly on the unit circle. The pole frequencies are given by:

$$ f_{pole} = \mathrm{acos}\left(1 - \frac{f}{2}\right) \tag{17} $$

which, for small values of f, virtually coincides with the pole frequencies given by Eq. (14). As such a feedback loop over two integrator sections transforms the two poles at DC ($z^{-1} = 1$) into two complex conjugate poles away from DC, care should be taken that there is enough DC gain in the loop-filter to avoid DC drift.

As an example, consider the 4th order SDM with a Butterworth design, corner frequency 150 kHz. Choosing the poles to move from DC to ±10 and ±19 kHz, the numerical values of the feedback coefficients obtained are 0.000496 and 0.001789. The SDM obtained has a maximum input of 0.57 (0.57 without resonators) and a SNR of 107 dB (97 dB without resonators). Indeed, the addition of the poles, turning the Butterworth characteristic into a Chebyshev II-like characteristic, gives significantly better SNR; the DC suppression of the loop-filter is still better than 120 dB, which is sufficient. For the A-weighted SNR figures, the improvement is smaller, because the poles primarily serve to suppress the noise between 10 and 20 kHz. A further improvement can be obtained when using a fifth order SDM, with a Butterworth NTF design (corner frequency 110 kHz) plus the poles at 10 and 19 kHz: in that case the SDM is stable for inputs up to 0.58, with a SNR of 120 dB. Note that in this case there is still 1 integrator with a pole at DC, and thus there cannot be any DC drift. To clarify the operation of such a SDM, pseudo-code of the SDM is provided in App. A.

Figure 14: Principle of a clipped integrator. The absolute value of the output of the integrator cannot exceed a value of C.

Footnote 2: These and other audio requirements are in part 2 of the SACD scarlet book [10].

A drawback of the above implementations of resonator sections is that the resulting filter is not minimum phase; due to this, the full potential that noise shaping offers cannot be realized.
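The resonator pole positions quoted above can be verified directly from Eqs. (13) and (16). The sketch below assumes a sample rate of 64 fs = 2.8224 MHz and converts the pole angle (the reduced frequency, in radians per sample) to Hz; the function names are hypothetical, and only the pole with positive imaginary part is returned.

```python
import cmath
import math

FS = 2_822_400  # DSD sample rate in Hz (64 x 44.1 kHz)

def pole_delayed_feedback(f):
    """Eq. (13): z_p = 1 + i sqrt(f), feedback path containing the extra delay."""
    return complex(1.0, math.sqrt(f))

def pole_undelayed_feedback(f):
    """Eq. (16): z_p = 1 - f/2 + (i/2) sqrt(4f - f^2), feedback path without delay."""
    return complex(1.0 - f / 2.0, 0.5 * math.sqrt(4.0 * f - f * f))

def pole_frequency_hz(zp, fs=FS):
    """Convert the pole angle (reduced frequency in rad/sample) to Hz."""
    return cmath.phase(zp) * fs / (2.0 * math.pi)

for f in (0.000496, 0.001789):
    zd = pole_delayed_feedback(f)
    zu = pole_undelayed_feedback(f)
    print(f"f = {f}: delayed   |z| = {abs(zd):.6f}, {pole_frequency_hz(zd):8.1f} Hz")
    print(f"f = {f}: undelayed |z| = {abs(zu):.6f}, {pole_frequency_hz(zu):8.1f} Hz")
```

Both realizations place the poles close to 10 kHz and 19 kHz for the two coefficients; they differ in the pole radius, which is slightly larger than 1 for the delayed path and equal to 1 for the undelayed path.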
Although the improvement that can be realized by a minimum-phase filter is (in this case) limited, a very interesting suggestion is the following (see footnote 3). Suppose that we create a resonator section which contains both the dotted and the full drawn realization of the feedback. Denote the feedback coefficient in the full drawn realization by $f_1$, and the feedback coefficient in the dotted structure by $f_2$. The poles (for $(1 + f_2) \ge (1 - f_1/2)^2$) are then given by

$$ z_p = \left(1 - \frac{f_1}{2}\right) \pm i \sqrt{(1 + f_2) - \left(1 - \frac{f_1}{2}\right)^2} \tag{18} $$

and have $|z_p| = \sqrt{1 + f_2}$, with reduced pole frequencies $f_{pole} = \mathrm{atan}\left(\sqrt{\frac{1 + f_2}{(1 - f_1/2)^2} - 1}\right)$. Hence, the radius and pole position can be adjusted independently, and it is possible to have $|z_p| < 1$ at the cost of an additional feedback path in the resonator section.

Footnote 3: This observation has been made by prof. S.P. Lipshitz.

5.2 Enforcing SDM stability

So far, we have not bothered about what happens if the SDM input exceeds its maximum: the SDM gets into wild oscillations, with constantly increasing amplitude in the integrator states and decreasing frequency. Even worse, when the input is removed from the system, the SDM does not return to its original state. To avoid such a situation, it is customary to use clippers in each integrator stage. In Fig. 14, a schematic representation of a clipped integrator is given. The idea is that the output of the integrator can never exceed its clip value, C. In other words, the integrator section simply stops integrating when the clip level C has been reached. The purpose of these clippers is to avoid a situation where the values in the integrator stages get too high (and cause the SDM to start to oscillate), while still allowing integrator values which occur during normal operation. Whereas the main purpose of the clippers is to let the SDM return to normal operation after overload, it is also desirable to avoid serious distortion in the signal if clipping occurs.
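A clipped integrator as described above can be sketched in a few lines. This is an illustration under assumptions, not the code of App. A: the class name, the delaying form $h(z) = z^{-1}/(1 - z^{-1})$, and the counter of clip events (in the spirit of the activation counts one would monitor in a simulation) are choices made here.

```python
import math

class ClippedIntegrator:
    """Delaying integrator whose state is limited to the range [-clip, +clip]."""

    def __init__(self, clip):
        self.clip = clip
        self.state = 0.0
        self.clip_count = 0     # number of times the clipper was activated

    def step(self, x):
        out = self.state                      # output is the delayed state
        nxt = self.state + x                  # ordinary integration
        if abs(nxt) > self.clip:              # stop integrating at the clip level
            nxt = math.copysign(self.clip, nxt)
            self.clip_count += 1
        self.state = nxt
        return out

integ = ClippedIntegrator(clip=4.0)
outputs = [integ.step(1.0) for _ in range(10)]   # constant overload drives it into the clipper
print(outputs, integ.clip_count)
integ.step(-1.0)                                 # input reverses: the state leaves the clip level at once
print(integ.state)
```

Because the state is held at ±C instead of growing without bound, the integrator recovers immediately once the input changes sign, which is the property that lets the SDM return to normal operation after overload.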
A heuristic way of obtaining reasonable numerical values for the clipper levels is to monitor the integrator levels during very large sine wave and square wave inputs, close to overload of the SDM. The clipper levels $C_1$ and $C_2$ of the first 2 integrator stages can be set according to these values. If the higher integrator stages are assigned values according to this recipe as well, the situation occurs that the SDM returns to normal operation after overload, but can have all clippers activated simultaneously. This will cause serious clicks and pops (especially if the first integrators run into their clippers). Hence, the clipper levels should be designed such that the high order clippers are activated first, before the low order clippers. As an example, let us consider the fifth order SDM designed previously. Its feed-forward coefficients are: $c_1 = 0.79188240$; $c_2 = 0.30454538$; $c_3 = 0.06992965$; $c_4 = 0.00949572$; $c_5 = 0.00060680$, with resonator coefficients $f_1 = 0.000496$; $f_2 = 0.001789$. The pseudo-code of this SDM is provided in App. A, suitable for easy implementation in any programming language. Without any clippers, the SDM is stable for sine inputs up to 0.58; for higher amplitudes, the SDM becomes fully unstable. Looking at the maximum integrator values during operation close to overload, we obtain values $C_1$ and $C_2$ for the first and second clipper of about 4 and 9, respectively. The subsequent clipper values are chosen such that the product $C_i c_i$ of the clipper value and the corresponding feed-forward coefficient is reduced by a factor of about 1.5 - 2 per integrator stage. This is illustrated in Table 2. From Table 2, we can obtain some idea about the influence of the clippers on the SDM operation. The clippers are sometimes activated during operation at 0.5 input level, which causes a small reduction in SNR with respect to the 120 dB without clippers.
However, whereas the original SDM turned unstable at inputs of 0.59, its clipped version shows continuous stable operation. Even at inputs of 0.65, the first integrator is not clipped, indicating that the signal distortion is still limited, and highly audible clicks are absent. In fact, only at input levels exceeding 0.75 will the first integrator clip, which causes a clearly audible effect. At the level of 0.75, the SNR has dropped to about 60 dB. As an alternative to clipping in the SDM, clipping before the SDM might be considered. However, in this case dynamic range must be sacrificed, although the resulting system is unconditionally stable for large inputs.

Integrator   C_i    c_i            c_i C_i
1            4      0.7918824022   3.16
2            9      0.3045453872   2.7
3            25     0.0699296548   1.75
4            92     0.0094957213   0.87
5            700    0.0006068024   0.42

Input level   C1   C2    C3     C4     C5      SNR (dB)
0.5           0    0     0      0      1836    118
0.55          0    0     0      0      6595    117
0.59          0    0     12     57     16285   107
0.60          0    5     48     175    18829   104
0.65          0    512   2283   3258   38155   67

Table 2: Determination of clipper values for a SDM (above) and the influence of the clippers on the normal SDM operation (below). The columns with clippers Ci indicate the number of times a clipper was activated in a run of 300,000 samples.

The above route has given a complete design example of a modulator reaching the ‘magical’ 120 dB SNR limit. In practice, however, it is dubious whether 120 dB SNR is necessary. As most electronic equipment seems to be closer to 110 dB, and human hearing seems not capable of reaching a dynamic range of more than 100 dB, 110 dB SNR in the SDM design seems realistic. As an alternative, one might consider a SDM design according to a Butterworth NTF design with a corner frequency of 95 kHz. The resonator poles remain unchanged. The coefficients for such a modulator are: $c_1 = 0.68402124$; $c_2 = 0.22813609$; $c_3 = 0.04563584$; $c_4 = 0.00542804$; $c_5 = 0.00030590$, with clipper levels $C_1 = 5$; $C_2 = 12$; $C_3 = 40$; $C_4 = 200$; $C_5 = 1100$.
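The design rule for the clipper levels (a product $c_i C_i$ that shrinks from stage to stage) can be checked for both coefficient sets quoted above. The sketch below simply tabulates the products and the stage-to-stage ratios; the dictionary keys and function names are labels chosen here for illustration.

```python
designs = {
    "Butterworth 110 kHz": {
        "c": [0.7918824022, 0.3045453872, 0.0699296548, 0.0094957213, 0.0006068024],
        "C": [4, 9, 25, 92, 700],
    },
    "Butterworth 95 kHz": {
        "c": [0.68402124, 0.22813609, 0.04563584, 0.00542804, 0.00030590],
        "C": [5, 12, 40, 200, 1100],
    },
}

def clipper_products(c, clip):
    """Product c_i * C_i: the largest contribution stage i can make to the quantizer input."""
    return [ci * Ci for ci, Ci in zip(c, clip)]

def stage_ratios(products):
    """Factor by which the product drops from one integrator stage to the next."""
    return [products[i] / products[i + 1] for i in range(len(products) - 1)]

for name, d in designs.items():
    prods = clipper_products(d["c"], d["C"])
    print(name)
    print("  c_i C_i:", [round(p, 2) for p in prods])
    print("  ratios: ", [round(r, 2) for r in stage_ratios(prods)])
```

In both designs the products decrease monotonically with the stage index, so a clipped high order stage contributes less to the quantizer input than a clipped low order stage.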
This SDM is stable (without clipping) up to inputs of about 0.65, while reaching a SNR figure of about 115 dB. These figures seem to represent a very agreeable compromise between dynamic range and maximum allowable input. However, for every application this balance should be re-judged.

6 Design of SDM modulators: II

In the previous section, a design method was presented which in general leads to SDM’s of good performance with a Butterworth high-pass type NTF. However, sometimes there may be specific demands which necessitate the use of other designs. An example of such a demand may be the specification of a limited amount of HF noise in the band above 40 kHz. Though several designs exist which allow for this, we will outline two.

The first, in line with the previous section, consists of cascading 2 (or more) high pass filters, which then make up the SDM NTF. For example, one could wish to create a SDM which is third order starting from 150 kHz, and then turns 7’th order at about 40 kHz. An example of such a design is given in Fig. 15. That SDM has been obtained by designing an NTF as a cascade of a third order Chebyshev high pass filter, with a corner frequency of 150 kHz, and a fourth order filter of the same type with a corner frequency of 40 kHz. The cascade is hence 7’th order below 40 kHz, and in this way some of the merits of a low order and a high order SDM can be combined.

Figure 15: Example of a SDM which has been created according to NTF design by cascading a third order high pass filter and a fourth order high pass filter.

A more heuristic approach is to set each coefficient $c_i$ in the SDM to a fraction of its previous coefficient $c_{i-1}$. An example of such a SDM in non-delayed feed-forward topology is given in Fig. 16, which represents a 7’th order SDM where each coefficient is 0.475 times its previous coefficient.
Note that this is really a recipe; the actual performance of the SDM is also determined by its topology (e.g., a SDM with delayed feedback topology would be unstable with these coefficients). It is interesting to see that a NTF characteristic as displayed in Fig. 16 can be approximated by a cascade of first order filters with different corner frequencies. In that case, there is full control over the SDM design.

Figure 16: Example of a SDM which has been created by setting each feed-forward coefficient $c_i$ to $0.475 c_{i-1}$ ($c_1 = 1$).

7 Signal processing

A crucial point in any audio chain is signal processing, ranging from simple volume adjustments to complex equalizations. It is immediately apparent that a direct translation of the ‘PCM-way’ of signal processing does not exist in DSD. For example, if a DSD signal is volume-adjusted with a gain g = 0.123456, the resulting output (the one-bit signal multiplied with g) is a multi-bit word. Hence, any signal processing for DSD always consists of a cascade of the actual processing step, followed by a re-quantization as shown in Fig. 17. It is possible to contract some signal processing steps and the SDM re-modulator. An example, where an IIR filter is contracted with a SDM, is shown in Fig. 18. It is important to note, however, that such a device is not different from the cascade of signal processing/remodulation, although the intermediate multi-bit path is absent. To obtain a realizable system, a low pass filter is generally necessary, as indicated in Fig. 19. The reason for this is that the SDM which is used as a re-modulator cannot cope with the high signal levels the DSD presents. As virtually all of the power of these signals is above 100 kHz, a low pass filter operating above this frequency is sufficient to remove enough power such that the re-modulator remains in stable operation.
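The cascade of processing step and re-quantization can be made concrete with the volume example. The sketch below is deliberately simplified and rests on assumptions: the re-modulator is a first-order SDM (which remains bounded for inputs of magnitude below 1, so the low pass filter can be omitted here), the input is the repetitive DC pattern of Sec. 4.3 rather than real audio, and all names are hypothetical. It shows that the scaled signal is no longer 1-bit, and that the re-quantized 1-bit stream preserves the scaled low-frequency content.

```python
def requantize_first_order(samples):
    """Re-modulate a multi-bit sequence to 1 bit with a first-order SDM (simplification)."""
    state = 0.0
    out = []
    for x in samples:
        y = 1 if state >= 0 else -1    # 1-bit quantizer
        out.append(y)
        state += x - y                 # integrate the difference between input and output
    return out

g = 0.123456                                         # gain from the volume example
dsd_in = [1, -1, 1, -1, 1, -1, 1, 1] * 1000          # 1-bit stream with mean 0.25
scaled = [g * b for b in dsd_in]                     # multi-bit intermediate signal
dsd_out = requantize_first_order(scaled)             # back to 1 bit

mean_in = sum(dsd_in) / len(dsd_in)
mean_out = sum(dsd_out) / len(dsd_out)
print("intermediate values:", sorted(set(scaled)))   # no longer +1/-1
print("output values:      ", sorted(set(dsd_out)))
print("mean in, g*mean in, mean out:", mean_in, g * mean_in, mean_out)
```

A practical re-modulator would be of higher order and would need the low pass filter discussed above to remain stable; the principle of the cascade is unchanged.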
In this respect, the feed-forward and feedback structures have quite different behaviour. As elaborated in Sec. 5, the feed-forward structure has little suppression of the input signal over the whole band (up to Nyquist), and sometimes even a gain just at the corner frequency of the NTF filter characteristic. The feedback structure, on the contrary, has strong suppression of the input signal from the aforementioned corner frequency (see also Fig. 12). Hence, a ‘feed-forward’ SDM will need more severe filtering of its input signal compared to a ‘feedback’ SDM in order to maintain stability. The response of a (64 taps) FIR filter which gives sufficient HF suppression to allow subsequent re-quantization is shown in Fig. 20. The total signal transfer characteristic of the cascade of a feed-forward SDM and this filter will be roughly identical to the STF of a feedback SDM.

Figure 17: Examples of DSD signal processing: gain adjustment and filter operations. (Diagram labels: DSD input, Gain, Multi-bit intermediate, Σ∆, DSD output, ‘High rate!’; DSD input, IIR, Multi-bit intermediate, Σ∆, DSD output.)

Figure 18: Contraction of IIR filter characteristic and SDM, giving a structure with DSD input and DSD output.

Figure 19: Advisable way of performing two operations on DSD data. First, a gain adjustment is applied, after which an IIR filter operation is applied without leaving the intermediate high rate, multi-bit domain.

Figure 20: Transfer function of a filter which can be used to remove the HF of a DSD signal, such that it can be input to a subsequent SDM. (Axes: Magnitude (dB), -140 to 20, versus frequency (Hz), 0 to 350000.)

Figure 21: Schematic presentation of the effect of multiple quantizations. (Panels: Amplitude (dB) versus frequency (kHz), with 20 and 100 kHz marked; Signal quality versus # requantizations.)
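To give an impression of what such a pre-filter does, the sketch below builds a 64-tap low-pass FIR filter and applies it to the 0x55 silence pattern. The design method (a Hamming-windowed sinc with a 100 kHz cut-off) is an assumption made here for illustration and is not the filter of Fig. 20; the function names are hypothetical.

```python
import cmath
import math

FS = 2_822_400  # DSD sample rate in Hz (64 x 44.1 kHz)

def windowed_sinc_lowpass(num_taps, cutoff):
    """Hamming-windowed sinc low-pass; cutoff in cycles/sample; unity DC gain."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = 2 * cutoff if t == 0 else math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [v / total for v in taps]          # normalise to a DC gain of 1

def response(taps, freq):
    """Complex frequency response at `freq` (cycles/sample)."""
    return sum(h * cmath.exp(-2j * math.pi * freq * n) for n, h in enumerate(taps))

def fir_filter(taps, samples):
    """Direct-form FIR filtering; returns only fully overlapped output samples."""
    return [sum(h * samples[n - k] for k, h in enumerate(taps))
            for n in range(len(taps) - 1, len(samples))]

taps = windowed_sinc_lowpass(64, 100_000 / FS)
silence = [1, -1] * 64                        # 0x55 pattern: all power at fs/2
filtered = fir_filter(taps, silence)

print("gain at DC:     ", abs(response(taps, 0.0)))
print("gain at 20 kHz: ", abs(response(taps, 20_000 / FS)))
print("gain at fs/2:   ", abs(response(taps, 0.5)))
print("largest filtered silence sample:", max(abs(v) for v in filtered))
```

The filtered silence pattern is essentially zero: the full-power 1-bit input has turned into a multi-bit signal of negligible level, which a subsequent re-modulator can handle without difficulty.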
Clearly, the application of such a filter will turn the 1-bit signal directly into a multi-bit signal. It is therefore important to realize that the benefits of DSD are in the high sample rate; they are not in the fact that DSD is 1-bit! The importance of this remark is further emphasized by the following notion: suppose that a sequence of signal processing steps is necessary. If each of these steps is built according to Fig. 17, the total signal path will contain multiple requantizations. As a result of this, build-up of HF noise will occur. This effect is illustrated in Fig. 21, where the effect of multiple requantizations is displayed schematically. This figure can be explained as follows. If we have a DSD signal, its noise starts to rise above 20-30 kHz, and reaches an almost flat level at about 90 kHz. If, in a subsequent re-quantization, the bandwidth of DSD is maintained, the signal is low pass filtered at a frequency of about the same value (90 kHz). If this signal is fed to a next SDM, its output signal will contain both its own quantization noise and the quantization noise that has been input to it. If this cascade is repeated, it is easy to see why there will be a build-up of HF noise in the area of about 80-90 kHz. Eventually, this signal will be large enough to drive the SDM into its clippers or, worse, into instability. This effect is shown in the right of Fig. 21; as the number of requantizations increases, the signal quality drops slowly. At the moment that the HF noise is large enough to activate the clippers, the signal quality drops rapidly. Hence, all signal processing should be done in a multi-bit domain; only after the final signal processing step should the conversion to 64 fs 1-bit signals be made.

8 Dithering and linearizing SDM’s

SDM’s are devices with a quantizer; as we are used to with the quantizers from the PCM world, we need to linearize the devices that use a quantizer.
With the multi-bit quantizing PCM devices, it is common knowledge that the quantizers need to be dithered with TPDF dither (dither distributed according to a Triangular shaped Probability Density Function) with a full width at half height of 1 LSB [6]. Such dither can easily be obtained by adding 2 random numbers from a uniform distribution of width 1 LSB. For SDM, this recipe is a contradiction in terms, since the quantizer spans only one bit and hence cannot accommodate the aforementioned TPDF dither, which spans 2 bits. Still, dithering in what we will coin ‘the classical sense’ is a very useful technique and has been well-researched; see [8] and [9] and references therein. Even so, new dither techniques are being discussed which are more appropriate for 1-bit coders; see, e.g., [3]. Next, we will discuss some aspects of dithering in the classical sense. As dither is used to remove the effect of non-linearity, we can distinguish two different appearances of the non-linearity: limit cycles (see footnote 4) on the one hand, and idle tones and distortion on the other. As the idle tones and distortion are heavily suppressed by the loop-filter, we will ignore them for the moment. In Sec. 9, a more detailed discussion about non-linearity in an SDM is presented. Limit cycles, however, can be very annoying: they can appear in the audible range and, even in the audible range, have high power. Consider, for example, an SDM with the topology at the top of Fig. 7, characterized by the following feed-forward coefficients: $c_1 = 2048$; $c_2 = 768$; $c_3 = 128$; $c_4 = 16$; $c_5 = 1$. Clearly, this SDM is extremely well-suited for implementation in hardware, as the coefficients represent simple powers of 2, except $c_2$, which is the sum of two powers. Its spectrum for zero input is displayed in Fig. 22, which does not show any resemblance to the familiar noise-shaped curve: it is a limit cycle. A limit cycle is a purely repetitive pattern of certain length; for example, a repeated sequence (representing zero input - see also Sec.
4.1) of 1, −1, −1, 1, −1, 1, 1, −1 represents a limit cycle of length 8. The limit cycle in Fig. 22 has length 32, as can be read from its fundamental at 88 kHz. Fortunately, little needs to be done to break up the limit cycle. For example, any input signal exceeding an amplitude of -90 dB will remove the limit cycle completely. To allow for digital silence, though, the use of dither is required, and a very useful way is to apply dither with a rectangular PDF (RPDF dither) just before the quantizer. In the case of the SDM we are discussing here, an appropriate amount of dither has a PDF with a width of 200 (and a mean of 0), and needs to be added immediately before the quantizer. The resulting spectrum is displayed in Fig. 22. Adding the dither at this point has the advantage that the dither will become noise-shaped too (like the quantization error), and the increase in noise floor will be marginal. In this case, the undithered SDM has a dynamic range of 98.4 dB (full scale SACD), whereas the dithered SDM has a dynamic range of 98.0 dB. The maximum input, before the SDM turns unstable, has been reduced from 0.7104 to 0.7098 for an input of a 1 kHz sine wave. Hence, this amount of dither has hardly any drawbacks, and significant advantages.

Figure 22: Example of a limit cycle occurring in a SDM with zero input (green). In red, the spectrum after application of dither (also zero input) is shown. (Axes: Power (dB), -300 to 0, versus frequency (Hz), 100 to 1e+06.)

Figure 23: A noise shaper which is typically used in SACD applications. The spectrum has been coherently averaged 100 times, and this has been repeated 10 times to obtain a power averaged spectrum. (Axes: Power (dB), -220 to -20, versus frequency (Hz), 100 to 1e+06.)

Footnote 4: In the literature, these tones are sometimes also called idle tones. We reserve the name idle tones for signals which are not purely repetitive - see also Sec. 9.
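The two kinds of dither mentioned in this section are straightforward to generate. The sketch below is an illustration with hypothetical function names: RPDF dither is drawn from a uniform distribution, TPDF dither is formed as the sum of two independent RPDF values of width 1 LSB, and the 1-bit quantizer adds RPDF dither of width 200 immediately before the decision, as described above for the SDM with $c_1 = 2048$. The loop filter itself is not modelled.

```python
import random

FS = 2_822_400  # DSD sample rate in Hz (64 x 44.1 kHz)

def rpdf(width, rng):
    """Rectangular-PDF dither with zero mean and total width `width`."""
    return rng.uniform(-width / 2.0, width / 2.0)

def tpdf(lsb, rng):
    """Triangular-PDF dither: sum of two independent RPDF values of width 1 LSB."""
    return rpdf(lsb, rng) + rpdf(lsb, rng)

def dithered_quantizer(v, width, rng):
    """1-bit quantizer with RPDF dither added immediately before the decision."""
    return 1 if v + rpdf(width, rng) >= 0 else -1

rng = random.Random(2002)                       # fixed seed, for reproducibility only
r = [rpdf(200, rng) for _ in range(100_000)]
t = [tpdf(1.0, rng) for _ in range(100_000)]

mean_r = sum(r) / len(r)
var_t = sum(x * x for x in t) / len(t) - (sum(t) / len(t)) ** 2
print("RPDF range:", min(r), max(r), "mean:", mean_r)
print("TPDF variance (expected 1/6 LSB^2):", var_t)
print("limit cycle of length 32 has its fundamental at", FS / 32, "Hz")
```

A quantizer input far larger than the dither width is unaffected by the dither, whereas an input near zero is decided essentially at random; this is what breaks up a purely repetitive limit cycle.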
The distortion introduced by the SDM amounts to -150 dB in the band 0-20 kHz (see Sec. 9 for a more detailed discussion about non-linearity in an SDM). The dither added to the quantizer will hardly change that number, but it is disputable whether this amount of distortion (in PCM, this would have been below the 25 bit level) would lead to audible effects.

9 Non-linearity in a SDM

To present a realistic situation, a spectrum of a SDM that is typically used in SACD applications is presented in Fig. 23. For the purpose of this discussion, this SDM has not been dithered. The input to this SDM has been a 4 kHz sine (-6 dB SACD amplitude). If we are interested in the base-band, extending from 0 to 20 kHz, the relevant distortion products are the 2’nd up to the 4’th component. From inspection of Fig. 23, it can be concluded that the distortion components are all at most -165 dB, where the noise in the FFT obscures any information deeper than that. The noise floor of this SDM is at -127 dB, resulting in a DR of about 120 dB (recall that the SACD reference 0 dB level has been defined as -6 dB with respect to the level in the feedback path).

Figure 24: Example of an audio chain found in an SACD-capable player. The DSD is first low pass filtered in the digital domain, followed by up-sampling to m · fs, typically 128 or 256 fs. This high-rate signal is then fed to an n-bit SDM, where n typically varies between 1.5 and 5. Finally, the analog output is passed through an analog low pass filter. (Diagram labels: DIGITAL LPF, n-BIT SDM, DAC, ANALOGUE LPF; signal labels: DSD, 64fs; multi-bit, m.64fs; n-bit, m.64fs; analogue.)

It is also instructive to extend the region of interest to the band 0-80 kHz. Obviously, the noise floor is increasing steeply (in the case presented in Fig. 23, this increase is fifth order), causing the maximum Signal-to-Noise Ratio (SNR) to drop to about 90 dB in the band 0-40 kHz, and about 55 dB in the band 0-80 kHz.
Any harmonic distortion component, however, is at a level below -95 dB. Clearly, any harmonic distortion component that we are dealing with in the broader sense of the audio band is extremely small, and its importance for the perceptual audio quality can be doubted. Since this SDM has not been dithered, it is clear that dithering will reduce these numbers even further. In fact, if this SDM is dithered to its maximum level (where it is just not overloaded), the distortion components in the audio band are all below -180 dB, only observable after 5000 coherent averages, and the components in the broader audio band are below -110 dB.

Still, the total amount of coherent power that is present in the dithered signal is significant. The amount of coherent power can easily be estimated if the actual noise is assumed to have no correlation with the signal. It appears that the total amount of coherent power which is present in Fig. 23 is about -10 dB. This power is found mostly above 1 MHz; 99.99% of the coherent power lies in this high frequency area. The exact value of the frequency above which most of the correlated signal is found depends on the signal which is input to the SDM; it will, however, never be very much lower than the quoted 1 MHz. It is beyond doubt that the origin of these signals in the very high frequency area lies in the non-linear behavior of the SDM. Indeed, if a multi-bit quantizer with triangular-PDF dither is used in the noise-shaper, the high frequency components disappear. Thus, the coherent signal above 1 MHz can be considered in some sense to be distortion.

To judge whether these distortion components are harmful, we need to look at the full audio chain which is used to replay DSD in a typical SACD-capable player. Such a configuration is shown in Fig. 24. A typical DAC-chip (see e.g. [1] or [7]) contains the first 4 blocks displayed in Fig. 24.
The digital filter in the path leading to the n-bit SDM is a crucial part, where most of the HF signal present in the DSD signal can be removed without any compromise. As an example, consider a filter that is designed according to the following criteria: pass-band 0-100 kHz, flat within 0.01 dB; transition band 100 kHz - 900 kHz; stop-band 900 kHz - 1.4 MHz, suppression 100 dB. This leads to a filter with only 22 taps, and thus does not pose any additional constraint in terms of hardware; the filters which are necessary to do proper up-sampling from a low sample rate format to the required m · 64 fs are much more demanding. Also, the digital LPF does not influence the impulse response of DSD [13], as the transition width is extremely large. It is clear that the application of this filtering will lead to significant suppression of the high frequency components present in the original DSD stream. Still, the signal contains substantial amounts of HF, which is foremost white noise.

The signal is then up-sampled to the frequency at which the digital-to-analog conversion is performed. The SDM will noise-shape this signal into an n-bit signal, where n typically varies between 3 [1] and 5 [7]. It is this signal which is converted to the analog domain. Due to the noise shaping process, which is intrinsic in modern, high-end DA converters and is the sole basis for their very high performance, some additional high frequency noise extending to frequency regimes well above 1 MHz is introduced. This noise is usually removed by an analog low pass filter of first or second order. This filtering is most often passive, and can thus be performed with exceptionally low distortion and inter-modulation. In most SACD players, some additional filtering is provided to reduce the amount of HF noise (which by then is mostly due to the DSD signal) even further, to levels well below -30 dB.
It is important to remark that the HF signal levels at which these additional filters need to operate are quite low due to the digital pre-filtering (which removed a very substantial amount of HF signal, causing the total signal power to be substantially less than 1); hence, the linearity of the filters can be quite high and the filtering operation is performed without additional inter-modulation products.

This example of a typical SACD signal path shows that the non-linearity above 1 MHz is not important at all, and does not influence the signal quality. In fact, one can argue that these components are favorable. Because the total power of the SDM output is constant and equals 1, the power which is present in these high frequency tones causes the SNR in the lower frequencies to be higher than anticipated on the basis of the linear noise transfer function. Hence, they contribute favorably to the dynamic range of an SDM. This discussion then leads to the question whether it would be possible to linearize a SDM in the important signal band, without bothering about its high frequency behavior.

9.1 Pre-correction

In order to have a system which clearly demonstrates the effects that we will study in this section, a third-order SDM has been designed. Such low order SDM's are notorious for their relatively bad signal properties [9]. The spectrum of the third order SDM that will be used in the remainder of this paper is shown in Fig. 25. While this third-order SDM has a dynamic range of about 90 dB, its third harmonic is at a level of -104 dB. Although this is still a rather respectable number, it is about 60 dB larger than the distortion component of the SDM shown in the previous section. The higher order harmonic distortion products are significant, too. Also in the broader signal band (0-80 kHz) the distortion components are larger. It should be remarked that this type of SDM is not recommended for practical use.
When we model the SDM as a non-linear element Σ∆, its transfer characteristic can be written as:

$$\Sigma\Delta(x) = x + \alpha_2 x^2 + \alpha_3 x^3 + \ldots \qquad (19)$$

Now, if we could create a signal s(x) according to:

$$s(x) = x - \alpha_2 x^2 - \alpha_3 x^3 - \ldots \qquad (20)$$

then the resulting output signal Σ∆(s(x)) would be given by:

$$\Sigma\Delta(s(x)) = x - 2\alpha_2^2 x^3 + O(x^4) \qquad (21)$$

In other words, the second harmonic distortion component has been completely removed, and the third harmonic component has been substantially reduced (note that for the low distortions we are dealing with, $\alpha_i \ll 1$). An estimate of the signal s(x) can be obtained using the structure depicted in Fig. 26.

Figure 25: Spectrum of the third order noise-shaper used in the analysis of the pre-correction technique. The input signal is a 3 kHz sine wave, -6 dB SACD. To obtain this spectrum, a series of 4 coherent averages and 10 power averages has been used. [Plot: power (dB) versus frequency, 100 Hz to 1 MHz.]

Figure 26: Basic Sigma Delta Pre-Correction (SDPC) structure.

The topology of Fig. 26 operates as follows. The first SDM generates a signal, which is subtracted from the original input signal x. This difference signal v now contains all the distortion components which are generated by the SDM, and the uncorrelated noise which has been added to the signal because of the noise shaping. This signal v is now low-pass filtered in the filter F, which has, for example, a cut-off frequency of 100 kHz. This results in the signal denoted F(v) in Fig. 26. Next, the original input signal x (after the appropriate delay to correct for the delay in the filter F) is added to F(v), resulting in the signal s'(x). While the filtering action has removed all HF noise, and in particular the strong signals above 1 MHz, it has not removed any noise in the band below 100 kHz. Hence, the signal s'(x) presents only an approximation to the signal s(x) in Eq. 20.
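The step from Eqs. (19) and (20) to Eq. (21) can be checked by direct substitution. The following is a sketch that keeps terms up to third order only, under the assumption that the series in Eq. (19) may be truncated after the cubic term:

```latex
\begin{align*}
s(x)   &= x - \alpha_2 x^2 - \alpha_3 x^3 - \ldots \\
s(x)^2 &= x^2 - 2\alpha_2 x^3 + O(x^4), \qquad s(x)^3 = x^3 + O(x^4) \\
\Sigma\Delta(s(x)) &= s + \alpha_2 s^2 + \alpha_3 s^3 + \ldots \\
  &= \left(x - \alpha_2 x^2 - \alpha_3 x^3\right)
     + \alpha_2\left(x^2 - 2\alpha_2 x^3\right)
     + \alpha_3 x^3 + O(x^4) \\
  &= x - 2\alpha_2^2 x^3 + O(x^4)
\end{align*}
```

The quadratic terms cancel exactly, and the cubic coefficient changes from $\alpha_3$ to $-2\alpha_2^2$, which is of second order in the small coefficients.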
The signal s'(x) is then input to a next SDM, which is identical to the SDM used to generate v, resulting in the final output signal y.

To gain some insight in the performance of this algorithm, which we will refer to as Sigma Delta pre-correction (SDPC), we have applied it to the third order SDM displayed in Fig. 25. The spectrum of the resulting signal y is displayed in Fig. 27 in the range 0-100 kHz. The huge suppression of the distortion components is clearly visible. Typically, the distortion has been reduced by about 20 dB. For higher frequencies, the suppression becomes less effective, even though the signal s'(x) contains all distortion components unattenuated in this frequency regime. As always, there is a price to pay for this improvement in THD, which in this case is an increase in the noise floor by 3 dB. This is clear from inspection of Fig. 27, when one realizes that the corrected spectrum has been obtained using twice as many coherent averages, which lowers the noise floor by 3 dB, and that the resulting noise floor nevertheless coincides with that of the uncorrected spectrum. This also corroborates the fact that this is white noise indeed; if it were correlated, it would result in an increase of more than 3 dB. The origin of the increase of the noise floor is the fact that the signal s'(x) still contains the quantization noise present in the low frequency range; the second SDM in the cascade adds its own quantization noise to it. Though not visible in Fig. 27, the high frequency signals above 1 MHz are completely unchanged using the new topology, which is expected on the basis of the absence of correction components in the signal s'(x).

9.2 SDPC and dither

To appreciate the effect of SDPC, it is also instructive to study the combined action of dither and pre-correction. To that end, we have applied a dither level of 0.1 (the SDM starts overloading at levels of 0.8) to the SDM. Spectra of the original SDM and the SDPC spectrum are displayed in Fig. 28.
Also in this case, the suppression of the distortion components is at least 22 dB in the band 0-20 kHz; in fact, even after 64 coherent averages, no distortion components can be observed. Note that the distortion has decreased to levels below -135 dB! Hence, the combined action of small amounts of dither and the pre-correction technique results in extremely low distortion figures.

Figure 27: Spectra of the original SDM (green), and its implementation according to Fig. 26 (red). The spectrum of the original SDM has been obtained using 4 coherent averages and 10 power averages; the other using 8 coherent averages and 10 power averages. The fact that the noise floors of the spectra coincide precisely illustrates the 3 dB loss in SNR due to SDPC. [Plot: power (dB) versus frequency, 1 kHz to 100 kHz.]

Figure 28: Spectra of the original (dithered) SDM (green), and its implementation according to Fig. 26 (red) using the same dither. The spectrum of the original SDM has been obtained using 8 coherent averages and 10 power averages; the other using 64 coherent averages and 5 power averages. [Plot: power (dB) versus frequency, 1 kHz to 100 kHz.]

Figure 29: Phase characteristic of the signal transfer function of the third order SDM used in this paper. [Plot: phase (rad) versus frequency.]

Again, the reduced distortion suppression for higher frequencies is visible; for example, in the region above 40 kHz, the suppression is typically only 8-10 dB. While the higher harmonics are suppressed less than the lower harmonics, as shown by Eq. (21), this does not fully explain the reduced suppression. Another origin of this reduced suppression for higher frequencies lies in the fact that the phase characteristic of the SDM used here is not straight for frequencies above 20 kHz.
This results in some phase distortion, which is not accounted for in the pre-correction technique according to Fig. 26. To obtain an estimate of the significance of these errors, consider a single harmonic $h(\omega t) = A \sin(\omega t)$, which is positioned around 50 kHz. If the phase error $\Delta$ is left uncorrected, the cancellation of the harmonic is incomplete; a residual power of $4A^2\Delta^2$ will remain. In this case, this results in a maximum power reduction of the harmonic by only 14 dB. An improved pre-correction technique is therefore displayed in Fig. 30. In this diagram, the phase error introduced by the SDM is corrected for by the filter L.

Another improvement can be obtained by cascading the structures displayed in Figs. 26 and 30. In a non-cascaded structure, the cancellation of lower order terms causes the generation of higher order terms, albeit of much lower amplitude, as can be concluded from Eq. (21). These new, higher order terms can in turn be canceled in exactly the same way as the lower order ones were canceled, which leads to cascading the structure in Fig. 30.

Figure 30: Improved pre-correction structure. By cascading the Sigma Delta Pre-Correction structure (SDPC) n times, n harmonics can be removed.

Figure 31: Fifth order SDM, with a 3 kHz input of -6 dB. The uncorrected spectrum (green) has been obtained after 16 coherent and 10 power averages; the corrected spectrum (red) after 2048 coherent and 10 power averages. [Plot: power (dB) versus frequency, 10 kHz to 80 kHz.]

Figure 32: Fifth order SDM, with a DC input of 1/1024. The uncorrected spectrum (green) has been obtained after 4 coherent and 10 power averages; the corrected spectrum (red) after 32 coherent and 10 power averages. [Plot: power (dB) versus frequency, 100 Hz to 100 kHz.]
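As a recap of the basic structure of Fig. 26 (without the phase-correcting filter L of Fig. 30), the signal flow can be sketched in C. This is a minimal sketch under several assumptions: both SDMs are the fifth-order modulator of Appendix A rather than the third-order SDM analysed above, the filter F is replaced by a plain moving average whose first null lies near 100 kHz at a 2.8224 MHz clock (a crude stand-in for a properly designed low-pass filter), and the names `Sdm`, `Sdpc`, `sdm_step`, `sdpc_step` and `FIR_LEN` are hypothetical.

```c
#define FIR_LEN   29                    /* hypothetical length of the filter F */
#define FIR_DELAY ((FIR_LEN - 1) / 2)   /* group delay of a symmetric FIR      */

/* Fifth-order SDM of Appendix A: integrator states and coefficients. */
typedef struct { double s[5]; } Sdm;
static const double C[5] = { 0.791882, 0.304545, 0.069930, 0.009496, 0.000607 };
static const double R[2] = { 0.000496, 0.001789 };

static int sdm_step(Sdm *m, double x)
{
    double *s = m->s;
    double sum = C[0]*s[0] + C[1]*s[1] + C[2]*s[2] + C[3]*s[3] + C[4]*s[4];
    int y = (sum >= 0.0) ? 1 : -1;
    s[4] += s[3];
    s[3] += s[2] - R[1]*s[4];
    s[2] += s[1];
    s[1] += s[0] - R[0]*s[2];
    s[0] += x - y;
    return y;
}

/* SDPC state: two identical SDMs, the history of v for the filter F,
 * and the history of x for the matching delay. */
typedef struct {
    Sdm first, second;
    double v_hist[FIR_LEN];
    double x_hist[FIR_DELAY];
    int v_pos, x_pos;
} Sdpc;

int sdpc_step(Sdpc *p, double x)
{
    /* v = x - (first SDM output): distortion plus shaped noise. */
    double v = x - sdm_step(&p->first, x);

    /* F(v): moving average over the last FIR_LEN values of v. */
    p->v_hist[p->v_pos] = v;
    p->v_pos = (p->v_pos + 1) % FIR_LEN;
    double fv = 0.0;
    for (int k = 0; k < FIR_LEN; k++)
        fv += p->v_hist[k];
    fv /= FIR_LEN;

    /* Delay x by FIR_DELAY samples: read the oldest entry, then overwrite it. */
    double x_delayed = p->x_hist[p->x_pos];
    p->x_hist[p->x_pos] = x;
    p->x_pos = (p->x_pos + 1) % FIR_DELAY;

    /* s'(x) = delayed x + F(v), fed to the second, identical SDM. */
    return sdm_step(&p->second, x_delayed + fv);
}
```

A caller would zero-initialise the state with `Sdpc p = {0};` and call `sdpc_step(&p, x)` once per clock cycle. The sketch only illustrates the topology; reproducing the suppression figures quoted above would require the SDM and filter actually used for those measurements.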
9.3 Performance of a realistic SDM with SDPC

To end with a realistic situation, and to show how SDPC also suppresses DC tones, a standard fifth order SDM has been designed, with a SNR of 118 dB over 0-20 kHz. As illustrated in Fig. 31, harmonic distortion levels of this SDM in the phase-corrected SDPC structure are reduced to well below -185 dB if undithered, which amounts to an improvement of about 35 dB, compared to the 20 dB improvement obtained with the standard SDPC. If the SDM is slightly dithered, the distortion levels drop to much deeper levels, which appeared to be numerically inaccessible (i.e., below -220 dB). Also, distortion levels at higher frequencies are reduced more than with the standard SDPC algorithm. As before, the SNR in the base-band (0-20 kHz) is slightly reduced relative to the uncorrected SDM, from 118 dB to about 115 dB (no dithering) or 114 dB (with dithering).

The effects of a DC input to the SDPC system are illustrated in Fig. 32. As input to this system, a DC value of 1/1024 has been applied, which results in a tone around 5.5 kHz. The SDM has not been dithered. In the spectrum of Fig. 32, a tone can be observed with an amplitude of about -145 dB. Application of the pre-correction algorithm, in its basic form, reduces this amplitude to about -165 dB. If a small amount of dithering (RPDF with amplitude 0.05) is applied, which is much less than the maximum allowed amount of dither (0.4 RPDF), the tone cannot be observed after 256 coherent averages, indicating that the tone is below about -175 dB. Application of the improved SDPC likewise results in values for spurious signals that are not easily accessible numerically.

10 Acknowledgements

The authors want to thank prof. S.P. Lipshitz, prof. J. Vanderkooy, Dr. J.D. Reiss and H. ten Pierick for their valuable comments and proofreading of the manuscript.

A SDM-code

In this appendix, we provide the C-like pseudo code for the SDM discussed in Sec. 5.2.
The code simulates 100000 clock cycles of the SDM, with a DC input of 0.1.

    #include <stdio.h>

    int main(void)
    {
        /* Coefficients: feed-forward weights c[], resonator feedback f[] */
        const double c[5] = { 0.791882, 0.304545, 0.069930, 0.009496, 0.000607 };
        const double f[2] = { 0.000496, 0.001789 };

        /* Initialization */
        double s0 = 0, s1 = 0, s2 = 0, s3 = 0, s4 = 0;
        const double x = 0.1;    /* DC input */
        const long N = 100000;   /* number of clock cycles */

        /* Main loop */
        for (long i = 0; i < N; i++) {
            /* 1-bit quantizer acting on the weighted sum of the states */
            double sum = c[0]*s0 + c[1]*s1 + c[2]*s2 + c[3]*s3 + c[4]*s4;
            int y = (sum >= 0) ? 1 : -1;
            printf("%d\n", y);   /* output bit; printing added so the run is visible */

            /* Update the integrators, last stage first */
            s4 = s4 + s3;
            s3 = s3 + s2 - f[1]*s4;
            s2 = s2 + s1;
            s1 = s1 + s0 - f[0]*s2;
            s0 = s0 + (x - y);
        }
        return 0;
    }

References

[1] B. Adams, K. Nguyen, and K. Sweetland. A 116 dB SNR multi-bit noise shaping DAC with 192 kHz sample rate. In Proceedings of the 106th AES Convention, Munich, 1999. Preprint 4963.
[2] R.W. Adams, P.F. Ferguson, A. Ganesan, S. Vincelette, A. Volpe, and A. Libert. Theory and practical implementation of a fifth order sigma-delta A/D converter. J. Audio Eng. Soc., 39:515-528, 1991.
[3] M.O.J. Hawksford. Time-quantized frequency modulation with time dispersive codes for the generation of sigma-delta modulation. In Proceedings of the 112th AES Convention, Munich, May 10-13, 2002. Preprint 5618.
[4] H. Inose and Y. Yasuda. A unity bit coding method by negative feedback. Proc. IEEE, 51:1524-1535, 1963.
[5] H. Kato. Trellis noise-shaping convertors and 1-bit digital audio. In Proceedings of the 112th AES Convention, Munich, May 10-13, 2002. Preprint 5615.
[6] S.P. Lipshitz, R.A. Wannamaker, and J. Vanderkooy. Quantization and dither: a theoretical survey. J. Audio Eng. Soc., 40:355-375, 1992.
[7] S. Nakao, H. Terasaw, F. Aoyagi, N. Terada, and T. Hamasaki. A 117 dB D-range current-mode multi-bit audio DAC for PCM and DSD audio playback. In Proceedings of the 109th AES Convention, Los Angeles, 2000. Preprint 5190.
[8] S.R. Norsworthy and D.A. Rich. Idle channel tones and dithering in delta-sigma modulators. In Proceedings of the 95th AES Convention, New York, October 1993. Preprint 3711.
[9] S.R. Norsworthy, R. Schreier, and G.C. Temes. Delta-Sigma Converters, Theory, Design and Simulation. IEEE Press, New York, 1997.
[10] Philips and Sony. Super Audio CD System Description. Philips Licensing, Eindhoven, The Netherlands, 2002.
[11] D. Reefman and E. Janssen. Enhanced sigma delta structures for Super Audio CD application. In Proceedings of the 112th AES Convention, Munich, May 10-13, 2002. Preprint 5616.
[12] D. Reefman and P.A.C.M. Nuijten. Editing and switching in 1-bit audio streams. In Proceedings of the 110th AES Convention, Amsterdam, May 12-15, 2001. Preprint 5399.
[13] D. Reefman and P.A.C.M. Nuijten. Why Direct Stream Digital is the best choice as a digital format. In Proceedings of the 110th AES Convention, Amsterdam, May 12-15, 2001. Preprint 5396.
[14] S. Wolfram. The Mathematica Book. Wolfram Media/Cambridge University Press, Cambridge, 4th edition, 1999.