12/9/99 (T.F. Weiss)

Lecture #25: Digital audio - the compact disc

Outline:
• Brief history of sound reproduction from the phonograph to the CD
• Description of a compact disc
• Signal processing in the compact disc
• Conclusion

Motivation:
• An example of the integrated use of both CT and DT signal processing techniques.
• CD has contributed to the culmination of a century old quest to record and reproduce the music of a live performance.

Brief history of sound reproduction

Mechanical storage - phonograph

The phonograph was developed by Thomas Alva Edison in 1877 during a time when the telegraph was well-established and the telephone was under development by Alexander Graham Bell. The sketch of the phonograph is from Edison’s notebook. Edison’s phonograph was a purely mechanical device.

Sound set a diaphragm and an attached stylus in motion. Depressions were made on tin foil wrapped around a brass cylinder which had a spiral groove and was mounted on a feedscrew. The cylinder was turned by a hand crank and advanced axially. Playback was achieved by cranking the cylinder and listening to the sound produced by the diaphragm as a lightly spring-loaded stylus passed over the groove. Recording of sounds could only be achieved by shouting onto the diaphragm and playback resulted in a very faint sound.

The photograph was taken on April 18, 1879 in Washington, DC by Mathew Brady (the famous photographer who recorded the Civil War in hundreds of photographs). It shows Thomas Edison sitting in front of a phonograph.

Around 1890, Emile Berliner invented the flat recording disc which had a spiral groove on a shellac surface giving rise to the record player or gramophone. The stylus moved side-to-side rather than up and down in the groove.
By 1908 the Victor Talking Machine Company and the Columbia Company were mass-producing recording discs as the medium for storage of audio. These discs recorded 3 minutes of music.

Between 1910 and 1930, electronic means of recording and playback replaced the purely mechanical methods. The following components were developed: microphone, electronic amplifier, magnetic pickup, and the dynamic loudspeaker.

78 rpm records were developed in the 1930s. The grooves and stylus were 3 mils wide. A 12-inch record played 7-12 minutes of sound on one side. This was definitely lo-fi recording and reproduction (bandwidth was typically 200-5000 Hz). In addition, the records and needles wore out easily.

33 1/3 rpm records were developed in the 1950s. The grooves were reduced to 0.7 mils so that 1/2 hour of music was recorded per side. The fidelity was greatly increased.

Magnetic storage

In 1898, Valdemar Poulsen, a Dane, obtained a patent for a device, called the telegraphone, to record and reproduce sound by the orientation of magnetic domains. He won the grand prize at the Paris Exposition of 1900 for this achievement. The telegraphone consisted of a cylindrical drum that contained a spiral groove into which a steel piano wire was wrapped. For recording, a microphone was used to transduce sound and to apply a current to the electromagnet that slid along the wire and magnetized it.

[Figure: magnetic reading head over magnetized material, with N and S poles marked]

For reproduction, a reading head was used to produce a voltage proportional to the magnetization in the wire. Although it created a sensation at the time, the sound produced by the telegraphone was weak and of poor quality. The development of the electronic amplifier solved the first problem; the use of biasing solved the second.

Magnetic storage, cont’d

1927 Magnetic tapes were developed. Prior to then all magnetic recording was done on wire or steel tape.
1947 Magnetic recording of broadcast radio was first used (the Bing Crosby show).
1947 Oxide tapes were developed by the 3M Company.
1948 Audio tape recorders became available from Ampex.
1955 Development of the magnetic disk drive by IBM to provide random access memory for digital computers.

Compact disc

1972 Optical storage of digital audio developed by Philips Corporation in the Netherlands.
1980 Sony Corporation and Philips Corporation proposed a standard format for signal storage and for the disc material, which was adopted by a group of 25 manufacturers.
1982-83 The compact disc was introduced.
1996 Over a billion compact discs are sold annually.

Description of a compact disc

Specifications based on human hearing

Design of audio systems for human use must include the specifications of the receiver, the human auditory system.

[Figure: threshold of hearing (dB SPL, 0 to 120) versus tone frequency (Hz, 10 to 100000); data from Dadson & King, 1952; Yeowart, Brian, & Tempest, 1967; Green, Kidd, & Stevens, 1987]

These are measurements of the threshold of hearing of human subjects for tones as a function of tone frequency. The threshold is expressed in dB SPL, which is the pressure in decibels relative to 20 µPa. The frequency range of hearing is approximately 20 Hz to 20 kHz. Sound levels above 120 dB SPL are painful. Hence the dynamic range of hearing, from the threshold of hearing to the threshold of pain, is approximately 120 dB.

Specifications

Number of channels. Two stereo signal channels.
Quantization. Signal is quantized to 16 bits to give a dynamic range of 20 log10(2^16) ≈ 96 dB.
Sampling. Signal is sampled at 44.1 kHz, which is greater than 2 × 20 kHz.
Duration. A CD usually contains a maximum of 74 minutes of music.

CD Tracks

The information on the CD is stored in spiral tracks spaced 1.6 µm apart. Each track contains a sequence of 0.11 µm deep pits of width 0.6 µm. The lengths of the pits encode the information.
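As a quick check on the specifications above, the short Python sketch below recomputes the 16-bit dynamic range, the margin between half the sampling rate and the 20 kHz audible limit, and the raw number of audio bits implied by two channels, 16 bits per sample, 44.1 kHz, and 74 minutes. The function and constant names are illustrative only, and the bit count assumes uncompressed samples with no error-correction or synchronization overhead.

```python
import math

# Values taken from the specifications; the names are illustrative.
CHANNELS = 2
BITS_PER_SAMPLE = 16
SAMPLE_RATE_HZ = 44_100
DURATION_MIN = 74
AUDIBLE_LIMIT_HZ = 20_000


def dynamic_range_db(bits):
    """Dynamic range of a uniform quantizer with the given word length."""
    return 20 * math.log10(2 ** bits)


def nyquist_margin_hz(sample_rate_hz, band_limit_hz):
    """Gap between half the sampling rate and the highest audible frequency."""
    return sample_rate_hz / 2 - band_limit_hz


def raw_audio_bits(channels, bits, sample_rate_hz, minutes):
    """Raw audio payload in bits, ignoring error correction and sync data."""
    return channels * bits * sample_rate_hz * minutes * 60


if __name__ == "__main__":
    total_bits = raw_audio_bits(CHANNELS, BITS_PER_SAMPLE, SAMPLE_RATE_HZ, DURATION_MIN)
    print(f"dynamic range : {dynamic_range_db(BITS_PER_SAMPLE):.1f} dB")
    print(f"Nyquist margin: {nyquist_margin_hz(SAMPLE_RATE_HZ, AUDIBLE_LIMIT_HZ):.0f} Hz")
    print(f"raw audio     : {total_bits:.3e} bits = {total_bits / 8 / 1e9:.2f} Gbytes")
```

The Gbytes figure here uses 10^9 bytes per Gbyte, which is an assumption about the unit convention.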
Specifications, cont’d

Dimensions.

[Figure: disc dimensions: 120 mm diameter, 15 mm center hole, 1.2 mm thickness; track spacing 1.6 µm, minimum pit length 0.83 µm]

Capacity. The total storage capacity on a CD is

Capacity = 2 (stereo) × (74 × 60) (duration) × 44.1 × 10^3 (sampling rate) × 16 bits/sample
         = 6.3 × 10^9 bits = 0.78 Gbytes.

Actually, there is additional data on the CD for error correction and synchronization, which raises the information stored to 15.5 × 10^9 bits = 1.9 Gbytes.

Pits in plastic

[Figure: schematic diagram of a CD showing pits. Labeled parts: label, plastic, metal reflective layer, pits, land, transparent plastic, laser light]

The surface of the transparent plastic substrate is covered with the pits, which are covered by a thin (50-100 nm thick) layer of metal, which is in turn covered by a layer of plastic. Laser light travels through the plastic layer and is reflected by the metallic layer. The difference in path length of the light reflecting from a pit and from the land between pits is detected by an interferometric method. Reflection from a pit is canceled by the interference while reflection from the land is not.

From pits to bits

Each pit and land has a length n∆, where n is an integer (3 ≤ n ≤ 11) and ∆ ≈ 0.28 µm. The disc drive has a feedback system to keep the linear velocity constant at the read site. The sequence of pits and lands gives rise to a square wave of reflected light, and the transitions produce a pulse train that defines a binary number.

[Figure: pulse train 1001000100010000100010000000010001, with pit and land lengths labeled 3 4 4 5 3 9 4]

Signal processing in the compact disc

Recording

The recording of a compact disc can be represented by the block diagram shown below.

[Block diagram: analog audio input → amplifier → anti-aliasing CT filter → sample and hold (sampling period T) → analog-to-digital converter → error correction → record modulation → store data on CD]

Recording - 1× sampling

The notation is shown in the diagram. First, we consider a system for which the sampling frequency is 1/T = fs = 44.1 kHz.
[Diagram: xm(t) → anti-aliasing CT filter → x(t) → sample and hold (fs = 44.1 kHz) → analog-to-digital converter → x[n]; the converter output then goes to error correction, record modulation, and storage on the CD]

We will focus on the design of a portion of this system: the anti-aliasing filter and the conversion of a continuous-time signal to a discrete-time signal.

Recording - 1× sampling, cont’d

The audible spectrum extends to about 20 kHz. Hence, the sound should be sampled at a sampling frequency > 40 kHz. The sampling rate for a CD is 44.1 kHz for historical reasons related to the synchronization of sound samples with video, i.e.,

fs = (3 samples/line) × (490 lines/frame) × (30 frames/sec) = 44.1 kHz.

[Figure: audio spectrum Xm(f) with an ideal anti-aliasing filter passing −20 to 20 kHz]

The spectrum of the audio signal may extend above 20 kHz even though the signal above 20 kHz is inaudible. Recording the inaudible signal requires storage on the disc, which decreases the duration of audible sound on the disc. If ideal LPFs were causal, they could be used to filter the signal at 20 kHz and the signal could be sampled at just above 40 kHz.

Recording - aliasing for 1× sampling

The anti-aliasing filter is key to avoiding aliasing.

[Figure: spectrum after sampling, f from 0 to 44.1 kHz, shown without an anti-aliasing filter and with an ideal anti-aliasing filter (band edges at 20 and 24.1 kHz)]

Since an ideal LPF is not causal and cannot be realized, the gap between 20 and 24.1 kHz gives a causal anti-aliasing filter a transition band in which to roll off, which simplifies its design.

Recording - anti-aliasing filter design

The specifications for an LPF for a CD are that the ripple in the passband is < 0.5 dB and the stopband is 80 dB below the passband. Thus, the attenuation must increase by 80 dB between 20 kHz and 24.1 kHz, which is a slope of 80/log10(24.1/20) ≈ 988 dB/decade.

[Figure: magnitude (dB) and angle (deg) versus frequency (kHz, 10^0 to 10^2) for Butterworth filters]

The frequency response is shown for Butterworth filters of order 1, 10, 20, 30, 40, and 50.
The CD specification can be achieved with a 50th order Butterworth filter, but this filter has considerable phase distortion in the passband.

Recording - aliasing for 1× sampling, cont’d

Ideal lowpass filters are not causal and cannot be built as CT filters. However, lowpass filters with sharp cutoffs can be designed. Design specifications of a lowpass filter are illustrated for a ninth-order elliptic filter design whose specifications are: the ripple in the passband is < 1 dB, the attenuation in the stopband is > 20 dB, and the corner frequency is 20 kHz.

[Figure: magnitude (dB, 0 to −30) versus frequency (kHz, 10^0 to 10^2) for an ideal LPF and the elliptic filter, with passband, ripple, and stopband marked]

Recording - anti-aliasing filter design, cont’d

Several filter types can be designed to meet the magnitude specifications, some with a much lower order than the Butterworth. These designs include a 50th order Butterworth, an 18th order Chebyshev, and a 9th order elliptic filter. The phase distortion in the passband differs for these 3 filters, which all meet the same magnitude specifications.

[Figure: magnitude (dB, 0 to −80) and angle (deg, 0 to −1000) versus frequency (kHz, 10^0 to 10^2) for the elliptic, Butterworth, and Chebyshev designs]

Recording - problems with CT filters

Problems with higher order CT filters with many components:
• Tolerances on component values complicate matching the frequency response specifications.
• Analog filter component values change with time and with changes in environmental variables (e.g., temperature).
• Components add noise to the signal.
• It is difficult to match the frequency responses of stereo channels.
• Filters with a sharp cut-off have complex phase responses in the passband and oscillatory step responses.

Recording - 4× oversampling

The solution used in CD recording is to sample at fs = 1/T = 4 × 44.1 kHz = 176.4 kHz, called 4× oversampling.
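To get a feel for why raising the sampling rate helps, the sketch below estimates the Butterworth order needed to reach 80 dB of attenuation at the start of the stopband. It does this first for 1× sampling (stopband from 24.1 kHz) and then for 176.4 kHz sampling, where the first alias of the audible band begins at 176.4 − 20 = 156.4 kHz. It assumes an ideal Butterworth magnitude response with its −3 dB cutoff at 20 kHz, which ignores the 0.5 dB passband ripple requirement, so the numbers are rough estimates rather than a filter design. The function names are made up for this illustration.

```python
import math


def required_slope_db_per_decade(atten_db, f_pass_hz, f_stop_hz):
    """Average roll-off needed to gain atten_db between the two band edges."""
    return atten_db / math.log10(f_stop_hz / f_pass_hz)


def butterworth_attenuation_db(order, f_hz, cutoff_hz):
    """Attenuation in dB (positive) of an ideal Butterworth lowpass at f_hz."""
    return 10 * math.log10(1 + (f_hz / cutoff_hz) ** (2 * order))


def min_butterworth_order(atten_db, f_stop_hz, cutoff_hz):
    """Smallest order whose attenuation at f_stop_hz is at least atten_db."""
    needed = math.log10(10 ** (atten_db / 10) - 1)
    return math.ceil(needed / (2 * math.log10(f_stop_hz / cutoff_hz)))


if __name__ == "__main__":
    cutoff = 20e3  # assumed -3 dB cutoff, Hz
    for label, f_stop in (("1x sampling", 24.1e3), ("4x oversampling", 156.4e3)):
        order = min_butterworth_order(80, f_stop, cutoff)
        atten = butterworth_attenuation_db(order, f_stop, cutoff)
        slope = required_slope_db_per_decade(80, cutoff, f_stop)
        print(f"{label}: slope {slope:.0f} dB/decade, order {order}, "
              f"attenuation {atten:.1f} dB at {f_stop / 1e3:.1f} kHz")
```

Under these assumptions the required order drops from about 50 to about 5, which is the kind of reduction that makes the analog filter practical.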
The design of the CT anti-aliasing filter is simplified and the DT filter is easily designed to attenuate the frequencies above 20 kHz. To reduce storage, the DT signal is then downsampled to 44.1 kHz before the information is stored on the CD.

[Block diagram: xm(t) → anti-aliasing CT filter → x(t) → sample and hold (fs = 176.4 kHz) → analog-to-digital converter → x[n] → DT filter → y[n] → downsample → ydown[n]]

Recording - 4× oversampling, cont’d

The CT anti-aliasing filter for sampling at 176.4 kHz needs to ensure that the component that is 20 kHz below 176.4 kHz, i.e., at 156.4 kHz, is attenuated by at least 80 dB. This specification is easily met with a low-order filter.

[Figure: spectra X(f) before and after sampling, f from 0 to 176.4 kHz, with the CT filter response]

Recording - anti-aliasing filter for 4× oversampling

[Figure: magnitude (dB, 0 to −80) and angle (deg, 0 to −400) versus frequency (kHz, 10^1 to 10^3)]

A fifth-order Butterworth filter with a cutoff frequency of 20 kHz has an attenuation of more than 80 dB above 156.4 kHz. To reduce the phase distortion further, an elliptic filter could be used.

Recording - DT filter for 4× oversampling

As an example of a DT filter design, we examine the frequency response of a 200 point FIR filter designed by the Parks-McClellan algorithm using MATLAB’s remez function.

[Figure: magnitude (dB) and angle (deg) versus DT frequency φ (0 to 0.5), with 20 kHz, 24.1 kHz, and 88.2 kHz marked]

In the passband (f ≤ 20 kHz), the magnitude is 1 (within 0.5 dB) and the angle changes linearly with frequency. In the stopband (f ≥ 24.1 kHz), the attenuation exceeds ∼80 dB.

[Figure: pole-zero diagram, imaginary part versus real part]

The filter design specifications are achieved in the stopband by placing zeros on the unit circle for 0.5 ≥ |φ| ≥ 0.1366. In the passband, a gain of one is achieved by placing zeros on either side of the unit circle for 0 ≤ |φ| ≤ 0.1134. The band edges in DT frequency are 20/176.4 = 0.1134 and 24.1/176.4 = 0.1366.

Recording - DT filter for 4× oversampling, cont’d

The unit sample response of the DT filter is shown below.
[Figure: unit sample response of the DT filter, amplitude (−0.1 to 0.3) versus n (0 to 200)]

The pole-zero diagram of the DT FIR filter contains only zeros (apart from the poles at z = 0). The unit sample response of the filter contains 200 points in time.

Recording - downsampling for 4× oversampling

After DT filtering to remove the components above 20 kHz, the signal is downsampled before it is encoded and recorded onto a CD.

[Figure: DT spectra versus φ (0 to 1): X̃(φ), the DT signal after sampling at 176.4 kHz, with the DT filter response; Ỹ(φ), after filtering with the DT filter; Ỹdown(φ), after downsampling]

Playback - block diagram

Playback involves the reverse of the signal processing steps involved in recording. Once again we will be concerned only with the D/A conversion, sample and hold, and filtering stages.

[Block diagram: data stored on CD → demodulation → error correction → digital-to-analog converter → sample and hold → anti-imaging filter → amplifier → analog audio output]

Playback - the problem of image frequencies

The spectrum of the sequence recorded on the CD is shown below.

[Figure: spectrum of samples on CD, Ỹdown(φ) versus φ (0 to 1)]

This sequence could be played out through a sample and hold circuit at 44.1 kHz and then through a CT LPF that attenuates the high frequencies. Since humans do not hear much above 20 kHz, the image frequencies at multiples of 44.1 kHz would be inaudible. However, the images at multiples of 44.1 kHz could be a problem if the CD player were connected to electronics that demodulated the signal so that the image frequencies were made audible. Therefore, the image frequencies at multiples of 44.1 kHz are removed.

Playback - 4× upsampling to the rescue

The problems with making a sharp CT anti-imaging filter are the same as in recording, so the following scheme is used.

[Figure: spectra versus φ (0 to 1): samples on CD; upsampled 4×, with the DT filter response; after filtering with the DT filter]

The data are then played out at 176.4 kHz through the sample and hold circuit and then through a CT anti-imaging filter that needs to remove images at 176.4 kHz rather than at 44.1 kHz.
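The rate changes described above can be sketched in a few lines of Python. This is a minimal illustration and not the player's actual implementation. It assumes downsampling keeps every 4th sample and upsampling inserts zeros between samples (the textbook form), and it leaves out the DT lowpass filter that must precede downsampling and follow upsampling. The 1 kHz tone used to list image frequencies is a hypothetical example, and all function names are illustrative.

```python
def downsample(x, m):
    """Keep every m-th sample (assumes x was already lowpass filtered)."""
    return x[::m]


def upsample(x, l):
    """Insert l - 1 zeros after each sample; a DT filter would then smooth it."""
    y = [0.0] * (len(x) * l)
    y[::l] = x
    return y


def image_frequencies_hz(f0_hz, fs_hz, count):
    """Images of a tone at f0_hz around the first `count` multiples of fs_hz."""
    images = []
    for k in range(1, count + 1):
        images += [k * fs_hz - f0_hz, k * fs_hz + f0_hz]
    return images


if __name__ == "__main__":
    samples = [0.5, -0.25, 1.0]
    stuffed = upsample(samples, 4)
    print(stuffed)
    print(downsample(stuffed, 4) == samples)
    # Images of a hypothetical 1 kHz tone when played out at each rate.
    print(image_frequencies_hz(1_000, 44_100, 2))
    print(image_frequencies_hz(1_000, 176_400, 1))
```

Played out at 44.1 kHz, the lowest images of the tone sit just above and below 44.1 kHz. After 4× upsampling and DT filtering, the lowest remaining images sit near 176.4 kHz, which a gentle CT filter can remove.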
Conclusion

The compact disc, together with developments in the design of speakers and headphones, has approached the vision of a music reproduction system sought for over a century: to reproduce the sound of a live performance.

Conclusion, cont’d

The CD has a number of attractive attributes.

Duration. About 1 hr, similar to 33 1/3 rpm records.
Ease of use. Compact size, portable, easy to cue.
Media permanence. Rugged medium, does not deteriorate.
Quality of reproduction. Stereophonic, high fidelity recording: low distortion, low noise, minimal artefacts (pops, hiss, etc.).

It has been claimed that the CD is the most successful electronic device ever produced. Furthermore, it is a marvelous example of engineering design that utilizes much of the material taught in 6.003.