Digital Audio

Introduction

The concept of digital audio is a simple one: transform a time-continuous analog signal into a discrete sequence of numbers. There are many ways to do this, but the most basic and perhaps most widely used technique is linear pulse code modulation (linear PCM for short). The basic idea is to measure the waveform at a constant rate and turn those measurements into (usually) integer values. The sequence of values can be thought of as a simple one-dimensional array for storage and processing purposes. For reconstruction of the signal (i.e., conversion back to analog), an inverse circuit retrieves the values and turns them back into voltage levels. The resulting stepped waveform is then smoothed with a filter to reconstruct the original signal.

The accuracy of the conversion is primarily controlled by two parameters: fs (the sampling rate) and the bit-depth (i.e., the maximum number of bits in the values used). Shannon's sampling theorem, along with Nyquist's limitations, dictates that the sampling frequency must be at least twice as high as the highest frequency component in any broadband signal. Failure to meet this requirement will result in alias signals. Aliases are, in essence, signal components above fs/2 that are folded back into the desired bandwidth. Aliases are a particularly nasty form of distortion and should be avoided. A simple way to do so is to insert a low-pass filter prior to the digitizing circuit. The upper break frequency and roll-off rate must be sufficient to guarantee that no signal above fs/2 gets to the digitizer.

In contrast, the bit-depth controls the accuracy of the individual sample points, and thus determines the noise level (see footnote 1). A good rule of thumb is that each bit contributes approximately 6 dB to the signal-to-noise ratio. Clearly, then, high sample rates with long word lengths will yield the highest quality, but they will also yield the greatest amount of data.
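The two parameters described above can be explored with a short Python sketch. It is only an illustration under stated assumptions: the function names (alias_frequency, quantize, measured_snr_db) are hypothetical, the test tone is a full-scale sine, and the quantizer is an idealized rounding step rather than a model of any particular converter.

```python
import math

def alias_frequency(f_hz, fs_hz):
    """Apparent frequency of a tone at f_hz sampled at fs_hz with no anti-alias filter."""
    f_mod = f_hz % fs_hz                 # sampling cannot tell f from f + k*fs
    return min(f_mod, fs_hz - f_mod)     # fold the result into the range 0..fs/2

def quantize(x, bits):
    """Round x (assumed to lie in -1.0..1.0) to the nearest signed integer PCM level."""
    scale = 2 ** (bits - 1) - 1          # largest positive code, e.g. 32767 for 16 bits
    return round(x * scale) / scale

def measured_snr_db(bits, n=10000, cycles=37):
    """Quantize a full-scale sine and return signal power over error power, in dB."""
    signal_power = 0.0
    noise_power = 0.0
    for k in range(n):
        x = math.sin(2 * math.pi * cycles * k / n)
        err = quantize(x, bits) - x      # quantization error for this sample
        signal_power += x * x
        noise_power += err * err
    return 10 * math.log10(signal_power / noise_power)

if __name__ == "__main__":
    # A 30 kHz tone sampled at 44.1 kHz folds back to 14.1 kHz.
    print(alias_frequency(30000, 44100))
    # Each added bit should raise the measured ratio by roughly 6 dB.
    for bits in (8, 12, 16):
        print(bits, "bits:", round(measured_snr_db(bits), 1), "dB")
```

The measured figures should come out slightly above 6 dB times the bit count, because a full-scale sine adds a small constant (about 1.8 dB) to the ideal ratio; the per-bit slope is what the rule of thumb captures.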
In practical systems there needs to be a balance between quality and storage requirements.

Standards

The most widely used audio digitizing standards are those for CD audio and for telephone service. CD audio is digitized using 16 bit resolution at a sampling rate of 44.1 kHz. This yields a maximum permissible bandwidth of 22.05 kHz, although 20 kHz is a practical upper limit. The signal-to-noise ratio is approximately 96 dB maximum. In contrast, standard phone lines are digitized at an 8 kHz rate using a compressive (non-linear) 8 bit scheme. The signal-to-noise ratio and bandwidth are both severely degraded compared to CD audio, although they are sufficient for simple voice communication. The advantage is that the telephone scheme only requires a 64 kbps data rate for real-time transfer (8000 samples per second times 8 bits per sample), while CD audio requires about 706 kbps for mono and about 1.41 Mbps for stereo (see footnote 2).

On personal computers, several common variations exist. The first is often called the multimedia rate. This is 8 bit linear PCM sampled at 22.05 kHz. Per channel, it requires only one-fourth the storage of CD audio. The bandwidth is also cut in half, and the signal-to-noise ratio is degraded considerably, to about 48 dB. Voice quality sampling also uses 8 bit linear PCM, but the sample rate is cut in half again to 11.025 kHz, giving a practical bandwidth of about 5 kHz. It is roughly equivalent to the telephone scheme mentioned earlier.

Footnote 1: Any deviation from the exact signal level can be seen as a noise signal in the long term. The larger the deviations, the greater the noise.
Footnote 2: The stereo CD data rate is roughly equivalent to that of a T-1 line (1.544 Mbps).

ET163 Audio Technology Lecture Notes: Digital Audio

In professional audio circles, both higher sample rates and higher bit-depths are used. Many professional digital tape and hard-drive based recording systems use a 48 kHz sample rate.
So-called high resolution audio typically runs at twice the normal sample rates, although quadruple rates are not uncommon. These yield rates of 88.2 kHz, 96 kHz, 176.4 kHz, and 192 kHz. The newer DVD-Audio format offers these (with some limitations). Bit depths may also be increased. Proprietary systems may use 20, 22, or 24 bit integer representations, with 24 bits generally preferred because it is a whole number of bytes. Some systems also use single precision floating point representation. This offers resolution comparable to the 24 bit systems but with much greater dynamic range.

With the availability of low cost computing hardware, the system cost for digital audio has dropped considerably in the past decade. Modern personal computers have sufficient computational ability to perform complex calculations on the digital data in a real-time setting. This makes possible a variety of signal processing tasks in the digital domain that were unheard of only 10 or 15 years ago. Coupled with inexpensive, large hard drives and the now ubiquitous CD burner, an individual can create a miniature recording and production facility on a desktop. This is not to say that the era of the large recording studio, with its equally large mixing desk and arrays of outboard gear, has come to a close. Rather, individuals can now afford many of the same functions and processing tools in more modest settings.

Example Problems

1. Q: You wish to digitize the sounds of bat echolocation, which lie below 75 kHz. What is the minimum acceptable sampling frequency?
A: fs needs to be at least twice the highest frequency component for wideband signals, or 150 kHz in this case.

2. Q: A digitizing system uses 12 bit resolution at a 32 kHz sampling rate. What are the approximate signal-to-noise and bandwidth limits of the system?
A: At 6 dB per bit, the signal-to-noise ratio is about 72 dB.
The upper frequency limit is half of the sampling rate, or 16 kHz (although it will be somewhat less in a practical system).

3. Q: How many megabytes of disk space are required to store Stravinsky's Rite of Spring (35 minutes) at normal CD audio rates, in stereo?
A: 35 minutes x 60 seconds/minute x 44,100 samples/second x 2 bytes/sample x 2 channels = 370,440,000 bytes, or 370.44 MB.
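The arithmetic in the example problems, and the data rates quoted for the telephone and CD standards, can be checked with a small Python sketch. The function names are hypothetical helpers introduced here for illustration; the sketch assumes uncompressed linear PCM, the 6 dB per bit rule of thumb, and 1 MB = 1,000,000 bytes (the convention used in the answer to problem 3).

```python
def min_sample_rate_hz(f_max_hz):
    """Minimum sampling rate for a signal whose content lies below f_max_hz."""
    return 2 * f_max_hz

def approx_snr_db(bits):
    """Rule-of-thumb signal-to-noise ratio: about 6 dB per bit."""
    return 6 * bits

def data_rate_bps(sample_rate_hz, bits_per_sample, channels=1):
    """Raw PCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

def storage_bytes(seconds, sample_rate_hz, bits_per_sample, channels=1):
    """Raw PCM storage in bytes for a recording of the given length."""
    return seconds * data_rate_bps(sample_rate_hz, bits_per_sample, channels) // 8

if __name__ == "__main__":
    print(min_sample_rate_hz(75000))                 # problem 1: 150000 Hz
    print(approx_snr_db(12), 32000 // 2)             # problem 2: 72 dB, 16000 Hz
    print(storage_bytes(35 * 60, 44100, 16, 2) / 1e6, "MB")  # problem 3: 370.44 MB
    print(data_rate_bps(8000, 8))                    # telephone: 64000 bps
    print(data_rate_bps(44100, 16, 2))               # CD stereo: 1411200 bps
```

Keeping the rate and storage calculations in separate functions makes it easy to try the other formats mentioned in the notes, for example 8 bits at 22,050 Hz for the multimedia rate.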