Chapter 1

Audio signals and systems

1.1 Some basics of acoustics

Sound originates from elastic vibrations or oscillations of the particles (molecules or groups of molecules) of a medium (gas, liquid or solid). These vibrations propagate from particle to (neighbouring) particle inside the medium, which is called the propagation medium. In this course, we mainly consider the propagation of sound waves in air. Sound waves do not propagate in empty space.

The acoustic source (or emitter) is in fact a transducer: it transforms a non-acoustical energy (for example mechanical or electrical energy) into acoustical energy:

• The vibrations of plucked guitar strings (mechanical energy) are transmitted to the instrument's body, and then to the surrounding air mass (acoustical energy).

• A loudspeaker transforms the electrical energy provided by the audio amplifier into mechanical energy (the vibration of the membrane), and then into acoustical energy (air vibrations created by the membrane).

Acoustic vibrations propagate in the medium from the sources to the receivers. The receiver is either an eardrum or a physical sensor (like a microphone). The human auditory system is able to detect vibrations whose frequency lies between 20 Hz and 20 kHz: the term "sound" is often restricted to those audible vibrations.

Figure 1.1: Positions of some particles during a transverse propagation: A, B, C, ... are their equilibrium (at rest) positions, whereas A', B', C', ... are their positions at a given time t1. The positions are represented by the displacement y of each particle, measured from its equilibrium position (y = 0). λ is the wavelength.

Figure 1.2: Time evolution of the displacement y(t) of the particle A, measured from its equilibrium position (y = 0). T is the period of the oscillation.

1.1.1 Transverse propagation

We can imagine the propagation medium as being composed of a great number of particles (groups of molecules) which are able to oscillate around their equilibrium position. We then choose a particular direction of propagation x and some particles lying along this direction (at A, B, C, ..., N on figure 1.1). We further imagine that a harmonic (sinusoidal) movement is imposed on the particle A by the sound source. The movement is transversal, i.e. perpendicular to the direction of propagation x.

Figure 1.3: Positions of some particles at given times t1 and t2 > t1. The movements of the particles A and N are synchronous (in phase).

Figure 1.2 illustrates the time evolution of the amplitude of this movement, y(t). The time origin t = 0 has been chosen arbitrarily, once the harmonic movement can be considered as stabilized. At times t = 0 and t1, the displacements (positions and directions) of the particle A are identical. The period of oscillation T is the time interval (in seconds) between t = 0 and t1. The number of periods per second is the frequency f in hertz (Hz).

When perturbations are created at a given location in an elastic medium (like air), they are transmitted to other locations, because of the elastic forces linking the neighbouring particles. Therefore, the movement of the particle A causes a similar movement of B, but with a certain delay, because the transmission of forces is not immediate. Similarly, the movement of B causes a displacement of C, and so on.

Figure 1.1 illustrates the positions of the particles along the direction x at a given time.
This can be viewed as a snapshot of the medium at this particular instant. The delay between the movement of any particle and the movement of A increases with the distance between both particles. At the location N (figure 1.3), the delay is exactly equal to one period T, and the movements of particles A and N are therefore in phase. The distance between A and N is called the wavelength λ.

Figure 1.4: Displacement vectors in a longitudinal propagation: A, B, ... are the equilibrium positions of the particles, whereas A', B', ... are their positions at a given time (snapshot). French to English translations: "déplacement longitudinal" = longitudinal displacement, "détente" = rarefaction.

Particle N follows the same movement as particle A: we say that the sound wave propagates from A to N in T seconds. As the distance travelled by the sound wave between A and N is (by definition) λ, the sound speed (or celerity) c (in m/s) is given by:

λ = c · T

The sound speed in air depends slightly on its temperature θ (°C):

c = 331.4 + 0.607 · θ (m/s). At 20 °C, c ≈ 344 m/s.

1.1.2 Longitudinal propagation

The movement of the particle A is now parallel to the direction of propagation x (figure 1.4). B, C and all the particles along that direction follow the same movement ... still after some delay. At a given time, the particle A is in A', B is in B', and so on. If the displacement vectors AA', BB', ... are folded towards the y-axis, we obtain a representation similar to the transverse case.

Transverse waves can occur in liquids and solids, together with longitudinal waves. On the contrary, longitudinal waves are the only mode of propagation in gases, and in particular in air.

1.1.3 Existence of the acoustic pressure

The existence of acoustic pressure will be illustrated in the case of longitudinal waves. Looking at the displacement vectors on figure 1.4, we observe the existence of regions where the air pressure increases (compression: around the particle C, for example, the instantaneous quantity of particles tends to increase), followed by other regions of rarefaction where the pressure decreases (around G). Two successive regions of compression are separated by exactly one wavelength, so the phenomenon is spatially periodic. If the movement is harmonic, it is also periodic in time, since the air pressure continuously increases and decreases at a given location on the direction of propagation x.

The acoustic pressure is defined as the difference between the instantaneous air pressure and the atmospheric pressure existing at the same location, without sound. As the displacement y(x, t) depends on time and space, the acoustic pressure is also a function of time and space: p(x, t).

1.1.4 The acoustic pressure in time and frequency

If the movement of the particles is harmonic (sinusoidal), the acoustic pressure p(x, t) is generally represented by a sine (or cosine) function:

p(x, t) = A(x) · cos(ωt − kx)     (1.1)

• A(x) is the amplitude of the acoustic pressure; it does not depend on time in this case,
• ω = 2πf is the angular frequency,
• k = 2π/λ is the wave number.

Indeed, at a given location x, the acoustic pressure is periodic in time (period T = 2π/ω). Similarly, the term (−kx) represents the propagation delay between this location and the position x = 0: if x = λ, then the delay equals 2π and the acoustic pressure evolves synchronously (in phase) at both locations.
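For the reader who wants to play with these relations, here is a minimal Python sketch (assuming numpy; the amplitude, frequency and observation point are arbitrary examples, not values from the notes) that evaluates the sound speed, a few wavelengths, and the plane-wave pressure of equation (1.1):

import numpy as np

def sound_speed(theta_celsius):
    # c = 331.4 + 0.607 * theta : sound speed in air (m/s) as a function of temperature
    return 331.4 + 0.607 * theta_celsius

c = sound_speed(20.0)                  # about 344 m/s at 20 degrees C
for f in (20.0, 1000.0, 20000.0):      # a few audible frequencies (Hz)
    wavelength = c / f                 # lambda = c * T = c / f
    print(f"f = {f:7.0f} Hz  ->  wavelength = {wavelength:7.3f} m")

# Plane-wave acoustic pressure of equation (1.1), with arbitrary amplitude A = 1 Pa,
# frequency 1 kHz, observed 0.5 m away from the origin over two periods
A, f = 1.0, 1000.0
omega, k = 2 * np.pi * f, 2 * np.pi * f / c
t = np.linspace(0.0, 2.0 / f, 200)
p = A * np.cos(omega * t - k * 0.5)
print("peak pressure at x = 0.5 m:", p.max(), "Pa")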
The amplitude A(x) depends on the type of wave propagation. Plane waves are characterized by their constant amplitude: there is no attenuation of the pressure amplitude with distance. In spherical waves (created by isotropic point sources), the amplitude A(x) is inversely proportional to the distance x between the source and the receiver.

If the function p(x, t) is periodic in time and its period is 1/f0, then it can be expressed as a (Fourier) series of sine and cosine functions whose frequencies are given by 0, ±f0, ±2f0, ..., ±mf0 (m is an integer):

p(x, t) = Σ_{m=−∞}^{+∞} αm(x) · e^{jωm t},   with ωm = 2πmf0     (1.2)

The complex coefficients αm(x) are obtained by the following formula:

αm(x) = (1/T) ∫_{−T/2}^{+T/2} p(x, t) · e^{−jωm t} dt     (1.3)

f0 = 1/T is called the fundamental frequency and the other frequencies (mf0, m = ±2, ±3, ...) are simply called the harmonics. The spectrum of the acoustic pressure is the representation of the coefficients αm, or simply of their amplitude (absolute value), as a function of frequency. Figure 1.5 represents the spectrum of a periodic function.

If the acoustic pressure p(x, t) is not periodic, then the Fourier transform can be used to find the (co-)sine functions that compose the corresponding time signal:

P(x, ω) = ∫_{−∞}^{+∞} p(x, t) · e^{−jωt} dt     (direct transform)     (1.4)

p(x, t) = (1/2π) ∫_{−∞}^{+∞} P(x, ω) · e^{jωt} dω     (inverse transform)     (1.5)

In this case, an infinite number of (co-)sine functions is generally required to compose the time signal p(x, t), and one can find a component at each real frequency.

The frequency (or spectral) representation of a sound event is often used in acoustics and audio engineering, together with the time representation. Both representations are complementary, regarding the type of information that they can reveal. The amplitude of the acoustic pressure is related to the energy of the sound wave at a particular location.

Figure 1.5: Time and frequency representations of the same rectangular oscillation (period T = 4, arbitrary units). In this case, the coefficients αk are real.

The spectral representation of the amplitude |P(x, ω)| therefore gives us an idea of the energy distribution as a function of frequency, which is fundamental information. On the contrary, the phase of P(x, ω) is seldom represented. Figure 1.12 shows the amplitude spectrum of a non-periodic sound signal.

As human hearing is effective (at best) from 20 Hz to 20 kHz, spectral representations are often limited to this frequency range in acoustics. Acoustic waves whose frequency is less than 20 Hz are called infrasound, and those whose frequency exceeds 20 kHz are called ultrasound.

A pure tone is produced by a harmonic (sinusoidal) vibration: the spectrum of a pure tone therefore consists of only one spectral line (cf. figure 1.8) at the frequency f0 (more precisely, one at f0 and another one at −f0). Bass tones correspond to low frequencies and treble to high frequencies. Periodic non-sinusoidal vibrations are composed of harmonics, i.e. frequency components at f0, 2f0, ... (figure 1.9).

A physical definition of noise could be any non-periodic vibration. In psychoacoustics, noise is rather defined as an undesired sound, be it periodic or not. Note that some non-periodic sounds can be judged as pleasant (for example, a friendly voice).
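The link between equations (1.2)-(1.3) and practical computation can be illustrated with a minimal Python sketch (assuming numpy; the sampling frequency and the number of samples per period are arbitrary choices): the FFT of exactly one period of a square wave, similar to the rectangular oscillation of figure 1.5, approximates its Fourier-series coefficients αm.

import numpy as np

fs = 44100.0                      # sampling frequency (Hz), an illustrative choice
N = 100                           # samples in one period -> fundamental f0 = fs/N = 441 Hz
f0 = fs / N

# One period of a square wave (compare with the rectangular oscillation of figure 1.5)
s = np.where(np.arange(N) < N // 2, 1.0, -1.0)

# The FFT of one exact period, divided by N, gives the Fourier-series coefficients alpha_m
alpha = np.fft.fft(s) / N

for m in range(1, 6):
    print(f"m = {m}: f = {m * f0:6.0f} Hz, |alpha_m| = {abs(alpha[m]):.3f}")
# Odd harmonics come out close to 2/(m*pi); even harmonics are (numerically) zero.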
The octave band system

In acoustics, the frequency analysis of sounds is often performed by grouping frequencies in frequency bands. In the octave system, each frequency band has a width ∆fi which is proportional to the band's central frequency fi: the range (20 Hz, 20 kHz) is divided into ten frequency bands (i = 1, 2, ..., 10), the ten octave bands, whose standard central frequencies are listed in table 1.1. These central frequencies have the property that each of them is twice the immediately lower one. Figure 1.14 shows two spectra resulting from an octave band analysis.

Note that each octave band contains three third-octave bands, each of them having (again) a width proportional to its central frequency.

Grouping the frequencies in octaves is inspired by the properties of human hearing, and more precisely by the way we perceive frequency. It is indeed known that (except at very low frequencies) frequency components belonging to (approximately) the same third-octave band tend to be associated in the inner ear (in the cochlea).

Octave number | Central frequency (Hz) | Lower limit (Hz) | Higher limit (Hz)
1             | 31.25                  | 20.0             | 44.2
2             | 62.5                   | 44.2             | 88.4
3             | 125                    | 88.4             | 177
4             | 250                    | 177              | 354
5             | 500                    | 354              | 707
6             | 1000                   | 707              | 1414
7             | 2000                   | 1414             | 2828
8             | 4000                   | 2828             | 5657
9             | 8000                   | 5657             | 11314
10            | 16000                  | 11314            | 20000

Table 1.1: Standard system of octave bands.

1.1.5 Acoustic power and decibels

The energy contained in a sound wave is composed of kinetic energy, related to the particles' velocity dy/dt (if y is their displacement), and of (spring-like) potential energy, related to the acoustic pressure.

The acoustic power (symbol W) is the energy that is measured or observed in one second. For example, the acoustic power of a sound source like a loudspeaker is the energy emitted by the source that propagates through (the outside of) any surface enclosing the source in one second (if we suppose that there is no absorption of energy inside the propagation medium).

In acoustics, the pressure and the power are often expressed or measured in decibels. Here follows the definition of the decibel.

2 · 10⁻⁵ Pa (at 1 kHz) is the approximate hearing threshold: below this level, sound amplitudes are too weak to be heard by humans. At the opposite side of the loudness scale, the threshold of pain corresponds to about 20 Pa. The sound level Lp in decibels (dB) is a logarithmic scale that considerably reduces the dynamics of the pressure amplitude, i.e. the difference between the highest and lowest amplitudes appearing in acoustics problems:

Lp (dB) = 20 · log(prms / p0),   p0 = 2 · 10⁻⁵ Pa     (1.6)

In this definition, prms is the root mean square pressure (defined hereafter), expressed in pascals, and the base of the logarithm is 10. Therefore, the sound pressure level at the hearing threshold is 0 dB, and the threshold of pain is at 120 dB.

RMS value
Let's recall that the RMS value of a function depending on time (the acoustic pressure, for example) is the square root of the mean value of the square of this function, computed on a time interval T. It is a common way of expressing the amplitude of a phenomenon depending on time. The time interval T is either the period for a periodic function, or a sufficiently long and representative interval for non-periodic functions.

Some orders of magnitude:

• quiet levels, in the country, far from any human activity: 20 to 30 dB
• speech level: 50 to 70 dB
• a crowded street: 70 to 80 dB
• inside industrial halls: 60 to 85 dB
• the motor of an aircraft, measured at short distances: 100 to 120 dB.
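Equation (1.6) and the RMS definition above are easy to check numerically. The following minimal Python sketch (assuming numpy; the 1 kHz, 1 Pa RMS tone is an arbitrary example) verifies that an RMS pressure of 1 Pa corresponds to a sound pressure level of about 94 dB:

import numpy as np

P0 = 2e-5                                  # reference pressure (Pa), hearing threshold at 1 kHz

def rms(x):
    # root mean square value of a sampled signal
    return np.sqrt(np.mean(x ** 2))

def spl_db(p_rms, p0=P0):
    # sound pressure level Lp = 20 * log10(p_rms / p0), equation (1.6)
    return 20.0 * np.log10(p_rms / p0)

fs, f = 44100.0, 1000.0
t = np.arange(int(fs)) / fs                      # one second of signal
p = np.sqrt(2.0) * np.sin(2 * np.pi * f * t)     # peak value sqrt(2) Pa -> 1 Pa RMS

print("RMS pressure:", rms(p), "Pa")             # about 1.0
print("SPL:", spl_db(rms(p)), "dB")              # about 94 dB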
The definition of the decibel is in accordance with the way we perceive sound intensities. In acoustics indeed, sensation increases as the logarithm of the excitation. As a consequence, the just noticeable difference between two sound intensities is approximately one decibel, whatever the level of this intensity.

Besides the acoustic pressure, the acoustic power (W in watts) can also be expressed in decibels. For example, the power level of a sound source is defined in dB by:

PWL (dB) = 10 · log(W / W0),   W0 = 10⁻¹² watts     (1.7)

Propagation of spherical waves
In free field (without reflections and obstacles), and at relatively great distance from the source, the pressure amplitude (or its RMS value) decreases as the distance r to the source increases, in an inversely proportional way: prms = A/r, where A is a constant. Expressed in decibels, this law of spherical propagation says that the sound pressure level SPL (or Lp) decreases by 6 dB each time the distance to the source is doubled. Indeed, applying (1.6) gives:

Lp(r) = 20 · log(prms / p0) = Lp(r = 1 m) − 20 · log r     (1.8)

1.1.6 Addition of acoustic pressures

If two sound waves reach the same location in space, their individual contributions to the acoustic pressure are added. At each given time t, the instantaneous pressure at the position r is written as:

p(r, t) = p1(r, t) + p2(r, t)

If both waves are pure tones with the same frequency, then their addition gives rise to interferences: at some locations, depending on the distances travelled by the waves, the individual pressure contributions are in phase and the global amplitude is increased: this is the case of constructive interferences. At other locations, it may happen that the amplitude is attenuated by the combination of out-of-phase contributions: these are destructive interferences.

This example shows that the acoustic powers or intensities (represented by prms²) are not always added at the receiver. But what are the conditions for this "energy summation" to be verified? Generally, it can be said that if the acoustic signals are relatively independent from each other (or uncorrelated), then their individual energetic contributions can be added:

prms,tot² = p1,rms² + p2,rms²   ⇒   Lp,tot = 10 · log(10^(Lp1/10) + 10^(Lp2/10))     (1.9)

In particular, if both signals have the same RMS value (Lp = Lp1 = Lp2), the global level is increased by 3 decibels:

prms,tot² = 2 · p1,rms²   ⇒   10 · log(2 · p1,rms² / p0²) = Lp + 3 dB

More generally, n identical levels Lp give (when added):

Lp,tot = Lp + 10 · log n     (1.10)

Some applications of this formula are listed in table 1.2.

n   | 10 · log(n)
2   | +3 dB
4   | +6 dB
8   | +9 dB
10  | +10 dB
100 | +20 dB

Table 1.2: Increase of the sound pressure level Lp due to the superposition of n uncorrelated sources having the same intensity.

On the contrary, if we want to decrease the sound pressure level by 20 dB, then the acoustic energy must be attenuated by a factor 100! This simple example illustrates the difficult challenge of achieving acoustic isolations (or attenuations) greater than 20 dB.
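The "energy summation" rules (1.9) and (1.10) can be checked with a few lines of Python (a minimal sketch assuming numpy; the 80 dB and 74/68 dB values are arbitrary examples):

import numpy as np

def combine_levels(levels_db):
    # total level of uncorrelated sources: Lp,tot = 10*log10(sum of 10^(Lp/10)), equation (1.9)
    return 10.0 * np.log10(np.sum(10.0 ** (np.asarray(levels_db) / 10.0)))

print(combine_levels([80.0, 80.0]))      # two equal uncorrelated sources: 83 dB (+3 dB)
print(combine_levels([74.0, 68.0]))      # about 75 dB (cf. exercise 5 at the end of the chapter)

# n identical levels give Lp + 10*log10(n), equation (1.10) and table 1.2
for n in (2, 4, 8, 10, 100):
    print(n, "sources:", round(combine_levels([80.0] * n) - 80.0, 2), "dB above one source")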
1.1.7 Some facts on psychoacoustics

The sensation of loudness (our perception of intensity) is one of the components of hearing perception. Figure 1.6 illustrates the standard equal-loudness contours for pure tones. These curves connect those points of the frequency-level plane which cause the same loudness sensation as a given pure tone of 1 kHz frequency. For example, 78 dB at 50 Hz and 52 dB at 4 kHz are situated on the same contour, and are therefore perceived as equally loud. The loudness is expressed in phons: n phons is the loudness level of a pure tone giving the same sensation of intensity as n dB at 1 kHz.

A closer look at figure 1.6 reveals that the 5-phon contour is the hearing threshold (which includes 0 dB at 2 kHz). The frequency components of an acoustic signal which are below this threshold are thus inaudible: they can therefore be neglected when transmitting or recording this signal, without any impact on its audible quality. This property is widely used in the perceptual coding of audio signals, whose goal is to reduce the amount of information during transmission or recording.

Figure 1.6: Standard equal-loudness contours (ISO R 226). French to English translations: "Niveau de pression acoustique" = sound pressure level, "Seuil normal d'audibilité" = standard hearing threshold, "Phones" = phons, "fréquence" = frequency.

Now, if we want to measure loudness and take into account the real sensitivity of human hearing as a function of frequency, then we should correct (e.g. by filtering) the pressure levels measured by a microphone by an amount corresponding to the equal-loudness contours. The attenuation provided by this loudness filter should therefore vary with frequency and with the intensity of the signal (see figure 1.6): this is quite difficult to realize in a simple measuring instrument. Therefore, only three weighting curves have been defined (see figure 1.7):

• the A-weighting curve, representing the equal-loudness contours between 0 and 40 phons (levels measured in dBA),
• the B-weighting curve, representing the equal-loudness contours between 40 and 70 phons (levels measured in dBB),
• the C-weighting curve, representing the equal-loudness contours above 70 phons (levels measured in dBC).

Figure 1.7: Weighting curves A, B and C as a function of frequency: the points on the curves are given at the central frequencies of the standard octave bands: 31.25, 62.5, ..., 16000 Hz ("Fréquence" = frequency).

1.2 Definition and properties of audio signals and systems

1.2.1 Definitions

The audio signal
Installed in a sound field, an ideal (high-fidelity) microphone delivers an electrical voltage v(t) or current i(t) whose time evolution exactly follows the evolution of the acoustic pressure at the same location. By definition, the audio signal is the function v(t) or i(t), generally limited to the bandwidth 20 Hz to 20 kHz. The audio signal can be measured, recorded, digitized, transformed or processed, transmitted (by air or by cable) and finally played back and reproduced by loudspeakers.

The audio signal can be mono- or multichannel, if it results from the simultaneous acquisition by one or several microphones. A stereo signal (stereophony) is the simplest example of a multichannel signal, with its left and right channels. Another example is the 5.1 soundtrack of a movie for "home cinema" (5 full-bandwidth channels + 1 bass channel).

Finally, the audio signal can also be created by other means than microphones. It can be directly synthesized by electronic circuits or by a computer and its soundcard. Musical signals can for example be generated by synthesizers.

RMS value of the audio signal
Square root of the mean value of s²(t) (the power contained in the signal s(t)), evaluated over a sufficiently long interval. For periodic signals, the power is averaged over the period 1/f0.
Peak value
Maximum of the absolute value of the signal, which is particularly interesting to describe short pulses for which the RMS value is weak and not representative of the signal amplitude.

Peak factor
Ratio of the peak value to the RMS value (1.414:1 for a sine function); also called the crest factor.

Power of a signal
It is determined by the voltage measured across a load resistance whose value (in ohms) has been specified (W = V²/R). The power is expressed in one of the following units:

• dBm: 10 · log10(W / 10⁻³ watts); a 600 Ω load was often used in this definition (dBm600), which is less used nowadays.
• dBu: 20 · log10(V / Vref), with Vref = 0.7746 volts. This definition is equivalent to dBm in the case of a 600 Ω load.
• dBV: same definition as dBu, but with the reference voltage Vref = 1 volt.

Figure 1.8: Audio signal corresponding to a 440 Hz pure tone: time and frequency representations.

1.2.2 Time and frequency representations of some typical audio signals

Pure tones
The audio signal corresponding to a pure tone can be represented by a sine or cosine function of time. The spectrum is characterized by only one line at the tone frequency (see figure 1.8).

Figure 1.9: Audio signal of a periodic, non-sinusoidal sound whose fundamental frequency is 440 Hz: time and frequency representations. Harmonics (spectral lines) can be seen at 880 Hz, 1320 Hz, ...

Harmonics
Harmonics are typical of non-sinusoidal periodic signals (period T = 1/f0). The corresponding spectrum is composed of lines at the frequencies f0 (the fundamental frequency), 2f0, 3f0, ... (the harmonics): see figure 1.9. The fundamental frequency gives the (music) note. Many musical instruments produce harmonics.

Beats
The superposition of two pure tones with very close frequencies f1 and f2 creates beats: the sum behaves as a signal at the frequency (f1 + f2)/2 whose amplitude is modulated by a cosine at (f1 − f2)/2, so that the envelope maxima occur at the beat frequency |f1 − f2|:

cos(ω1 t) + cos(ω2 t) = 2 · cos((ω1 − ω2)/2 · t) · cos((ω1 + ω2)/2 · t)     (1.11)

Figure 1.10 shows the audio signal resulting from beats between the pure tones at 440 Hz and 540 Hz. On the time axis, one can indeed measure the periodicity of the signal's envelope (a maximum every 0.01 s). If the beat frequency |f1 − f2| is less than 20 Hz, then the fluctuation of the signal amplitude is clearly audible.

Figure 1.10: Beats between the pure tones at 440 Hz and 540 Hz. Upper diagram: time representation (amplitude in relative units, time in seconds). Lower diagram: frequency representation (amplitude in relative units, frequency in Hz).
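Equation (1.11) and the envelope of figure 1.10 can be reproduced with a few lines of Python (a minimal sketch assuming numpy; the duration and sampling frequency are arbitrary choices). This is essentially what exercise 7 at the end of the chapter asks for:

import numpy as np

fs = 44100.0                              # sampling frequency (Hz), an arbitrary choice
t = np.arange(int(0.05 * fs)) / fs        # 50 ms of signal
f1, f2 = 440.0, 540.0

s = np.cos(2 * np.pi * f1 * t) + np.cos(2 * np.pi * f2 * t)

# Equation (1.11): the same signal written as a 490 Hz tone modulated by a 50 Hz cosine
s_check = 2 * np.cos(2 * np.pi * (f1 - f2) / 2 * t) * np.cos(2 * np.pi * (f1 + f2) / 2 * t)
print("max difference between the two forms:", np.max(np.abs(s - s_check)))   # ~1e-15

# Envelope maxima occur every 1/|f1 - f2| = 0.01 s (the 100 Hz pseudo-period of figure 1.10)
print("envelope period:", 1.0 / abs(f1 - f2), "s")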
Signals produced by musical instruments
Wind and string instruments most often emit sounds containing harmonics. However, these sounds also have some noise components: for example, the breath of the musician blowing into a transverse flute or a panpipe.

The dynamic evolution of the "harmonic" signal produced by a musical instrument is defined by the contour of its envelope (see figure 1.11), the so-called ADSR contour:

A: Attack (for example, the hammer striking the string of a piano: the amplitude reaches its maximum in less than 50 ms);
D: Decay (the piano key remains pressed, the amplitude of the signal begins to decrease);
S: Sustain (the piano key is still pressed, the amplitude keeps a nearly constant value);
R: Release (the key is released and the sound fades, more or less rapidly).

Figure 1.11: ADSR model of the envelope of the audio signal corresponding to a music note played by wind or string instruments.

Figure 1.12 shows the audio signal of a violin playing the note A (440 Hz fundamental frequency).

Figure 1.12: Time and frequency representations of the audio signal produced by a violin playing the note A (440 Hz). French to English translations: "Violon, corde La à vide" = violin, A string played open, "Spectre de module de la corde LA du Violon" = amplitude spectrum of the violin's A string, "Fréquence" = frequency.

Figure 1.13: Speech signal corresponding to the French sentence "Ceci est un message en chambre anéchoïque" (translation: "This message has been uttered in an anechoic chamber"). Amplitudes are in relative units. The time axis is graduated in number of samples (sampling frequency 44.1 kHz).

Speech signal
Speech is neither a pure tone nor a combination of harmonics, but a complex and elaborate sound whose frequency components lie (mainly) between 300 Hz and 3400 Hz (see the example in figure 1.13).

White noise, pink noise
Noise signals are generally non-periodic, rather (pseudo-)random functions of time. The corresponding spectra are continuous in the audible frequency range, meaning that components are present at every frequency, with variable amplitude. For some particular noises, spectral lines indicate the contributions of electrical motors (fans) or of periodic mechanical movements to the sound emission.

Figure 1.14: Octave band spectra of white and pink noise. French to English translations: "Bruit blanc/rose" = white/pink noise, "Octave: fréquence centrale" = octave central frequency.

White noise has a constant intensity (dB) in each 1 Hz frequency interval (it sounds like a blowing "chchchch..."). If the spectrum of a white noise is represented in the system of octave bands, then we obtain an increase of 3 dB per octave (see figure 1.14): try to explain this!

By definition, a pink noise is a white noise that has been (low-pass) filtered to obtain the same intensity in each octave band (see figure 1.14). If the spectrum of a pink noise is represented in a diagram where the frequency axis is linear, then we obtain a decrease of 3 dB each time the frequency is doubled: try to explain this!

1.2.3 Properties of audio systems

An audio system is a system whose main goal is to operate on audio signals; in particular, systems acquiring audio signals at their input, or generating audio signals at their output.

Frequency response and bandwidth
The frequency response is obtained by feeding (exciting) the system with a sinusoidal signal of constant amplitude and variable (increasing) frequency (a sweep signal) and measuring the output signal as a function of frequency. The magnitude of the output signal is analysed to determine the bandwidth of the system (for example, figure 1.15 illustrates a bandwidth at ±1 dB).

Figure 1.15: Frequency response (dBV) and ±1 dB bandwidth of an audio system. French to English translation: "bande passante à +/- 1 dB" = bandwidth at ±1 dB.

The frequency response of the system can also be obtained by measuring its impulse response h(t) and then computing its Fourier transform (the system is of course assumed to be linear and time invariant). Several methods are nowadays available for measuring the impulse responses of audio systems: the most popular use sine sweeps or MLS test signals to:

• feed the system with a wideband signal s_in(t),
• record the corresponding output signal s_out(t) = s_in(t) ⊗ h(t),
• convolve s_out(t) with the "inverse" signal s_in⁻¹(t) to retrieve h(t).

Note: the "inverse" s_in⁻¹(t) is the signal which gives a Dirac pulse when it is convolved with s_in(t). The symbol ⊗ represents the convolution operation.
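As an illustration of this sweep method, here is a minimal Python sketch (assuming numpy and scipy; the sweep parameters and the toy impulse response are arbitrary examples). For simplicity it performs the deconvolution in the frequency domain, which is equivalent to convolving the recorded output with a precomputed inverse sweep:

import numpy as np
from scipy.signal import chirp, fftconvolve

fs = 48000
T = 2.0
t = np.arange(int(T * fs)) / fs

# Wideband excitation: a logarithmic sine sweep from 20 Hz to 20 kHz
s_in = chirp(t, f0=20.0, t1=T, f1=20000.0, method='logarithmic')

# Toy "audio system": an arbitrary short impulse response (a delay plus a weaker echo)
h_true = np.zeros(256)
h_true[10], h_true[100] = 1.0, 0.4

# Recorded output: s_out = s_in convolved with h
s_out = fftconvolve(s_in, h_true)

# Regularized deconvolution in the frequency domain (equivalent to convolving the output
# with an inverse sweep): H = S_out * conj(S_in) / (|S_in|^2 + eps)
n = len(s_out)
S_in = np.fft.rfft(s_in, n)
S_out = np.fft.rfft(s_out, n)
H = S_out * np.conj(S_in) / (np.abs(S_in) ** 2 + 1e-8)
h_est = np.fft.irfft(H, n)[:256]

# The two largest samples of the estimate should sit at (or very near) indices 10 and 100
print("estimated peak positions:", np.argsort(np.abs(h_est))[-2:])

In a real measurement, s_out would of course be recorded at the output of the device under test instead of being simulated by a convolution.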
Measuring the noise generated by an audio system
The noise generated by a system is defined as any undesirable signal added by the system to the "useful" output signal. Noise is measured by connecting a known impedance Zin (0 Ω or any other specified value) at the system's input: see figure 1.16. No input signal is fed to the system. The output voltage therefore only characterizes the noise generated by the system.

Figure 1.16: Measuring the noise at the output of an audio system. "Système audio" = the audio system.

The Signal to Noise Ratio (SNR or S/N) is the ratio of the amplitude of the output signal to the noise amplitude, when the system is fed by a test signal (generally a 1 kHz sine wave at the maximum level specified for this system).

Harmonic distortions
Figure 1.17 shows the typical transfer function of an audio system (Vout as a function of Vin, for a given test signal): this function is roughly linear, except at low levels, where the system's output is contaminated by noise, and at high levels, due to the saturation of the system.

Figure 1.17: Typical transfer function of an audio system, showing the output amplitude (dB) as a function of the amplitude of the input signal (dB). The reference of the decibel scale is not mentioned here: 0 dB could represent the maximum value specified by the system's manufacturer (figure reproduced from the Audio Precision documentation).

Any system whose transfer function is not linear modifies the shape of the input signal (in the time domain). In particular, the saturation of the system limits the amplitude of the output signal and creates harmonic distortion. Indeed, a clipped sinusoid (frequency f0) remains periodic, and therefore its spectrum is composed of lines at multiples of f0.

The total harmonic distortion (THD) of an audio system is measured as follows: a sinusoidal signal (f0) is fed to the system's input and the power of all the harmonic components (2f0, 3f0, ...) is evaluated at the output, with a bandpass filter tuned on each harmonic frequency. This procedure can be rather long and tedious.

Another indicator of distortion is the total harmonic distortion plus noise (THD+N). For this measurement, one only needs a notch filter removing the fundamental frequency from the output signal. In the output voltage (VHDN), there only remain the noise signal and the harmonics. The THD+N is generally expressed as a percentage of the RMS value of the fundamental component V0 (100 · VHDN / V0).

Figure 1.18 shows the typical variation of the THD+N value of an audio system as a function of the amplitude of the input signal. Three different regions are clearly distinguished on this diagram:

• at low amplitudes of the input signal, noise is predominant in the THD+N, but its relative importance decreases as the amplitude of the signal increases;
• the central, rather flat region corresponds to the slowly increasing influence of the harmonics on the THD+N value;
• beyond the "clipping threshold", the saturation of the system results in a dramatic increase of the non-linearities, and therefore of the harmonic distortion.

Figure 1.18: THD+N (in percent) as a function of the amplitude of the input signal (relative units) (figure reproduced from the Audio Precision documentation).

This curve can be used by the manufacturer to fix the maximum input level of the system, i.e. the level corresponding to a given value of the THD+N (for example, 3%).
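These definitions can be illustrated numerically. The following minimal Python sketch (assuming numpy; the 5 kHz tone clipped at half its amplitude anticipates exercise 4 at the end of the chapter) estimates the THD directly from the spectrum instead of using a bank of bandpass filters:

import numpy as np

fs, f0, A = 480000.0, 5000.0, 1.0          # oversampled 5 kHz tone of amplitude A
N = int(fs / f0) * 200                     # an integer number of periods (no leakage)
t = np.arange(N) / fs

s = np.clip(A * np.sin(2 * np.pi * f0 * t), -A / 2, A / 2)   # "saturated" (clipped) sine

spectrum = np.fft.rfft(s) / N
freqs = np.fft.rfftfreq(N, d=1.0 / fs)

def rms_at(f):
    # RMS value of the single spectral line closest to frequency f
    return np.sqrt(2.0) * np.abs(spectrum[np.argmin(np.abs(freqs - f))])

fund = rms_at(f0)
harm = np.sqrt(rms_at(2 * f0) ** 2 + rms_at(3 * f0) ** 2)    # harmonics below 20 kHz
print("THD =", round(100.0 * harm / fund, 1), "%")           # about 22.6 %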
Intermodulation distortion (IMD)
If the audio system is fed with two sinusoidal signals having different frequencies f1 and f2, additional frequency components such as nf1 ± mf2 can be generated at the output (n, m are integers): this is intermodulation. Figure 1.19 shows the result of clipping the sum of two sinusoidal signals, one with frequency 250 Hz and the other at 8 kHz. Frequency components are created at 7750 Hz, 8250 Hz, ... and they clearly appear in the Fourier transform.

Figure 1.19: Intermodulation: the result of clipping the sum of two sinusoidal signals, one with frequency 250 Hz and the other at 8 kHz (see text). Upper figure: time evolution (in seconds). Lower figure: spectral representation (relative amplitudes vs Hz).

The measurement of IMD can be performed by several methods. In particular (in the previous example), one can demodulate the output signal to recover the components created around 8 kHz. Their power is then measured and compared to the power of the 8 kHz signal.

Multichannel systems and channel separation
If we can locate sound sources in space, it is mainly because we have two ears and because (for a point source) the emitted sound waves reaching them are different, depending on their direction of incidence. A monochannel audio system cannot render the richness of a 3D audio environment: at least two signals are needed.

Stereophony is based on two different signals (left and right) played by two loudspeakers placed in front of the listener (at ±30°). The aim is to reproduce in some way the ITD (interaural time differences) and ILD (interaural loudness differences) that are created by the sound contributions at both eardrums. The resulting sound scene is generally limited to the region between the two loudspeakers in this case.

Movies and home cinema sound systems are based on the 5.1 multichannel technology: it contains five full-bandwidth channels (left, center, right, surround left, surround right) and an additional low-frequency channel. They allow for an increased immersion of the moviegoer in the virtual audio scene.

In a multichannel audio system, the signal emitted on one channel can induce weak replicas on the other channels. These cross-channel interferences are for example created by capacitive or inductive coupling between hardware components. The so-called channel separation is evaluated by sending a sinusoidal signal of given amplitude V0 on one channel, and measuring the amplitude of the replica at the output of the other channels. The difference between the amplitude of the replica and V0 is expressed in decibels; it depends on the frequency of the signal.

1.3 Exercises

1.3.1 Exercise 1
Evaluate the RMS value and the crest factor of a periodic square wave signal (period T). The amplitude is noted A (time evolution between −A and A, see figure 1.5). Compare with the sinusoidal signal having the same amplitude and period.

1.3.2 Exercise 2
An audio amplifier generates an output voltage of 1 V RMS when connected to a 600 Ω resistance. Evaluate the output power in dBm, dBu and dBV.

1.3.3 Exercise 3
Prove that the power of a periodic signal (expressed by the square of the RMS value) is equal to the sum of the individual powers contained in the fundamental and each harmonic component.

1.3.4 Exercise 4
A sinusoidal signal A sin(ωt) (5 kHz frequency) is clipped at half its maximum value (between −A/2 and A/2). Calculate the resulting THD value (consider only the harmonics below 20 kHz) and compare with the THD+N value.
1.3.5 Exercise 5
Two loudspeakers emit uncorrelated signals in free field. Each of them is individually tested, which gives 80 dB at 1 m. What is the sound pressure level at a receiving point located 2 m away from the first loudspeaker and 4 m away from the second one, if both loudspeakers are emitting simultaneously?

1.3.6 Exercise 6
A vertical rack of loudspeakers placed in front of the stage in a rock concert produces a sound pressure level (SPL) of 120 dBA at 1 m from its center. If one assumes that the sound propagation is of the spherical type, at which distance from the rack would the SPL be less than 100 dBA? 90 dBA? 80 dBA?

Comment: occupational noise exposure must be limited to 80 dBA (equivalent level during 8 hours' exposure), following European regulations. Beyond this level, actions must be undertaken either to protect the workers (hearing protections) or to reduce noise emissions.

1.3.7 Exercise 7
Try to reproduce figure 1.10 from analytical developments (or with Matlab) and explain the "pseudo-periodicity" of the resulting signal's envelope (at 100 Hz frequency).

1.3.8 Exercise 8
Prove that the RMS value of the sum of several acoustic pressure signals is the square root of the sum of the individual p²rms, if the pressure signals are uncorrelated (or independent).

1.3.9 Exercises' solutions

1.3.9.1 Exercise 1
The RMS value of a periodic square wave is A (crest factor = 1). The RMS value of the sine wave is A/√2 and its crest factor is equal to √2.

1.3.9.2 Exercise 2
The output power is 2.22 dBm = 2.22 dBu = 0 dBV.

1.3.9.3 Exercise 3
If:

s(t) = Σ_{m=−∞}^{+∞} αm · e^{jωm t},   ωm = 2πmf0

then:

(1/T) ∫_0^T s²(t) dt = |α0|² + 2 · Σ_{m=1}^{+∞} |αm|²

This can easily be proved by solving the integral in the previous equation. The contribution of the m-th harmonic component (m > 0) in the Fourier series is:

αm · e^{jωm t} + α−m · e^{−jωm t}

As s(t) is real-valued, α−m = (αm)*, and the previous expression equals:

2 · |αm| · cos(ωm t + phase(αm))

The corresponding power (square of the RMS value) is therefore 2 · |αm|².

1.3.9.4 Exercise 4
The clipped signal remains periodic and therefore it can be expressed as in the previous equations (see exercise 3). Using equation 1.3, the coefficients of the Fourier series of the clipped signal s(t) are:

(for m even): αm = 0

(for m odd, ≠ 1):
αm = −(jA/π) · [ sin((m − 1)π/6)/(m − 1) − sin((m + 1)π/6)/(m + 1) + cos(mπ/6)/m ]

(for m = 1):
α1 = −(jA/π) · [ π/6 − sin(π/3)/2 + cos(π/6) ]

Therefore:

• α1 = −0.305 jA,
• α3 = −0.069 jA,
• α5 = −0.014 jA,
• etc.

Furthermore, the power of the clipped signal is: (1/T) ∫_0^T s²(t) dt = 0.1955 A².

The THD value is then expressed by:

THD = 100 · VRMS(harmonics < 20 kHz) / VRMS(fundamental) = 100 · 0.069A / 0.305A = 22.6%

if the harmonic components are restricted to the frequencies below 20 kHz.

On the other hand, the THD+N value is the ratio of the RMS value of the clipped signal WITHOUT the fundamental component to the RMS value of this fundamental component. From exercise 3, the power of the clipped signal without the fundamental component is the total power of the clipped signal minus 2 · |α1|²:

THD+N = 100 · sqrt(0.1955 A² − 2 · (0.305)² A²) / (0.305A · √2) = 22.5%

1.3.9.5 Exercise 5
The SPL at 2 m from the first loudspeaker is: Lp1 = 80 dB − 20 · log(distance) = 80 dB − 20 · log(2) = 74 dB.
The SPL at 4 m from the second loudspeaker is: Lp2 = 80 dB − 20 · log(4) = 68 dB.
The signals being uncorrelated, the total SPL writes:

Lp,tot = 10 · log((p²rms,1 + p²rms,2) / p0²) = 10 · log(10^(Lp1/10) + 10^(Lp2/10)) = 75 dB.
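The analytical results of exercises 4 and 5 can be cross-checked numerically with a minimal Python sketch (assuming numpy; the number of samples per period is an arbitrary choice):

import numpy as np

A = 1.0
N = 4800                                   # samples over exactly one period
theta = 2 * np.pi * np.arange(N) / N
s = np.clip(A * np.sin(theta), -A / 2, A / 2)

alpha = np.fft.fft(s) / N                  # Fourier-series coefficients alpha_m
print("|alpha_1| =", round(abs(alpha[1]), 3))      # about 0.305 * A
print("|alpha_3| =", round(abs(alpha[3]), 3))      # about 0.069 * A
print("|alpha_5| =", round(abs(alpha[5]), 3))      # about 0.014 * A
print("power     =", round(np.mean(s ** 2), 4))    # about 0.1955 * A^2

# Exercise 5: combining the two uncorrelated contributions of 74 dB and 68 dB
print("total SPL =", round(10 * np.log10(10 ** 7.4 + 10 ** 6.8), 1), "dB")   # about 75 dB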
1.3.9.6 Exercise 6
The spherical law of propagation allows us to write:

Lp = Lp(1 m) − 20 · log(distance) = 120 dBA − 20 · log(distance)

Applying this equation gives: Lp < 100 dBA if distance > 10 m (> 32 m for 90 dBA, > 100 m for 80 dBA).

1.3.9.7 Exercise 8
The n uncorrelated acoustic pressure signals are noted p1(t), p2(t), ..., pn(t). The total square RMS value is the time average (noted ⟨·⟩) of the square of the instantaneous pressure:

p²tot = ⟨(p1 + p2 + ... + pn)²⟩

Therefore, as the average of a sum is the sum of the averages:

p²tot = Σ_{m=1}^{n} ⟨p²m⟩ + Σ_{m=1}^{n} Σ_{l≠m} ⟨pm · pl⟩

Furthermore, one must consider that for uncorrelated (or independent) pressure signals, each of them having a zero mean,

⟨pm · pl⟩ = ⟨pm⟩ · ⟨pl⟩ = 0

which proves the initial proposition.
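The result of exercise 8 can also be illustrated numerically. In the following minimal Python sketch (assuming numpy; the three Gaussian noise signals are arbitrary examples), the cross terms average out and the squared RMS values simply add:

import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000

# Three uncorrelated, zero-mean "pressure" signals with different RMS values
p1 = 0.5 * rng.standard_normal(N)
p2 = 1.0 * rng.standard_normal(N)
p3 = 2.0 * rng.standard_normal(N)

sum_of_squares = np.mean(p1 ** 2) + np.mean(p2 ** 2) + np.mean(p3 ** 2)
square_of_sum = np.mean((p1 + p2 + p3) ** 2)

print("sum of the individual p_rms^2:", sum_of_squares)   # about 0.25 + 1 + 4 = 5.25
print("p_rms^2 of the summed signal: ", square_of_sum)    # nearly the same value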