Auralization of sound fields in auditoria using Wave Field Synthesis

Emmanuelle Bourdillat
M.Sc. Thesis

Supervisors: dr. ir. D. de Vries, drs. E. M. Hulsebos
Laboratory of Acoustical Imaging and Sound Control
Department of Applied Physics
Faculty of Applied Sciences
Delft University of Technology
Delft, July 2001

Graduation Committee:
Prof. dr. ir. A. Gisolf, Laboratory of Acoustic Imaging and Sound Control, Department of Applied Physics, Delft University of Technology
Dr. ir. D. de Vries, Laboratory of Acoustic Imaging and Sound Control, Department of Applied Physics, Delft University of Technology
Drs. E. M. Hulsebos, Laboratory of Acoustic Imaging and Sound Control, Department of Applied Physics, Delft University of Technology
Ir. J. Baan, TNO TPD, Afdeling Beeldbewerking, Delft
Prof. dr. ir. Bilsen, Laboratory of Perceptual Acoustics, Department of Applied Physics, Delft University of Technology
Ir. R. A. Metkemeijer, Adviesbureau Peutz & Associés bv, Zoetermeer

Abstract

This thesis describes a new method of sound reproduction in an enclosure, based on the theory of wave field synthesis (WFS): auralization. WFS is a technique based on Huygens' principle of wave field propagation and was introduced for application in acoustics in [Berkhout, 1987]. Auralization is the synthesis, in a reproduction room, of a wave field recorded elsewhere, with preservation of its acoustical properties. By measuring the impulse responses of a hall with closely spaced microphone arrays, the spatial and temporal structure of the sound field in that hall can be determined. Pressure and particle velocity microphones provide directivity information, which allows the simulation of a detector with a cardioid directivity characteristic to discriminate between signals coming from different directions. These microphone signals can then be extrapolated to loudspeaker arrays at all sides of the reproduction room. When the loudspeakers are driven with these extrapolated signals, the acoustics of (a part of) the original hall are recreated spatially and temporally correctly within the reproduction area.

The auralization system was first evaluated on a simple type of wave field: a single plane wave. Next, auralization of a complex concert hall wave field was performed. The original impulse responses to be auralized were measured in two different concert halls: the Concertgebouw and De Doelen. The auralized wave field was physically compared to the original one. The main features are well reproduced; however, some artefacts due to the limitations of the system were found to be present. From perceptual experiments it can be concluded that some differences are perceived by the human auditory system which should ideally be eliminated.

Samenvatting

This report describes a new method of reproducing sound in an enclosed space, based on the theory of wave field synthesis (WFS): auralization. WFS is a technique based on Huygens' principle of wave propagation and was introduced for application in acoustics in [Berkhout, 1987]. By auralization we mean the synthesis, in a reproduction room, of a wave field recorded elsewhere, with preservation of its acoustical properties. By measuring the impulse responses of a hall with an array of closely spaced microphones, the spatial and temporal structure of the sound field in that hall can be determined.
Pressure and (particle) velocity microphones provide directional information, which makes it possible to simulate detectors with a cardioid directivity in order to separate signals coming from different directions. These measurements can then be extrapolated to loudspeaker arrays at all sides of the reproduction room. When the loudspeakers are subsequently driven with these extrapolated signals, the acoustics of (a part of) the original hall are reproduced spatially and temporally correctly within the reproduction area. The auralization system was first evaluated with a single simple type of wave field: a plane wave. Next, auralization of a complex concert hall wave field was carried out. The original impulse responses for the auralization were measured in two different concert halls: the Concertgebouw and De Doelen. The auralized wave field was objectively compared with the original. The main features are reproduced well, but a number of artefacts, caused by the limitations of the system, are present. From perceptual experiments it can be concluded that there are differences which are noticed by the human auditory system and which should ideally be removed.

Contents

Abstract
Samenvatting

1 Introduction
  1.1 Historical reproduction methods
      Stereophony
      Binaural recording and reproduction
      Reproduction system with array technology
  1.2 The acoustics of a hall
  1.3 Outline of this thesis

2 Wave Field Synthesis Theory
  2.1 Introduction
  2.2 The acoustic wave equation
      2.2.1 Plane wave solution
      2.2.2 Spherical wave solution
  2.3 The Kirchhoff-Helmholtz integral and Rayleigh integrals
      2.3.1 The Kirchhoff-Helmholtz integral
      2.3.2 The 3-D Rayleigh integrals - planar arrays
      2.3.3 The 2-D Rayleigh integrals - linear arrays
      2.3.4 The 2½-D Rayleigh I integral - linear arrays
  2.4 Discretisation
  2.5 Finite arrays

3 Auralization of wave fields
  3.1 Introduction
  3.2 Recording
      Linear array
      Linear array of omnidirectional pressure microphones
      Linear array of pressure and velocity microphones
      Cross array configuration
      Experimental setup
  3.3 Extrapolation
      3.3.1 Forward extrapolator
      3.3.2 Inverse extrapolator
      3.3.3 Extrapolator for 2-D
            Forward extrapolation
            Inverse extrapolation
  3.4 Reproduction

4 Auralization of plane waves
  4.1 Introduction
  4.2 Properties of plane waves
  4.3 Experimental setup
  4.4 Artefacts due to discretisation
  4.5 Artefacts due to finiteness of array
  4.6 Simulation of plane waves
      4.6.1 Reproduction of non-steering synthesized cardioid microphone arrays
      4.6.2 Reproduction of synthesized steering cardioid arrays
      4.6.3 Reproduction of a plane wave coming with an elevation angle different from zero
  4.7 Auralization of plane waves in a non-anechoic room
  4.8 Conclusion

5 Physical comparison
  5.1 Introduction
  5.2 Influence of the reproduction room on the reconstruction of a monopole source signal
  5.3 Auralization of the Concertgebouw impulse responses
      5.3.1 Reproduction in area 1
            Simulation
            Measurements
      5.3.2 Reproduction in area 2
      5.3.3 Reproduction in area 3
  5.4 Auralization of De Doelen impulse responses
      5.4.1 Reproduction in areas 1 and 2
  5.5 Conclusions and discussion

6 Perceptual comparison of original and auralized impulse responses
  6.1 Introduction
  6.2 KEMAR head
  6.3 Spatial impression - Apparent Source Width
      Time window
      Frequency window
  6.4 Coloration
      Removal of coloration using Patterson filters
  6.5 Experiments
      6.5.1 Description of the experiments
            Experiment 1
            Experiment 2
      6.5.2 Results
            Observations from the coloration experiment
            Observations from the Apparent Source Width experiment
      6.5.3 Conclusion

7 Conclusions and recommendations
  7.1 Conclusions
  7.2 Recommendations

Bibliography

Chapter 1  Introduction

1.1 Historical reproduction methods

A few techniques will be reviewed which allow reproduction of the same sound image as an original sound field, i.e. an exact copy of the original sound field. Their advantages and disadvantages will be mentioned.
Stereophony

Since the 1930s sound had been reproduced in mono, causing the listener to localise all sound at the loudspeaker position. In stereophony (a technique introduced in the 1950s) a sound field is recorded with a pair of microphones placed in front of the source area. This method is mainly used for classical music recording, and almost no manipulation is done in order to preserve the natural sound image. The sound field is then reproduced by two loudspeakers placed in front of the listener. A listener seated midway in front of these loudspeakers perceives a sound image similar to that of the original sources. A listener seated off-center, however, receives a distorted stereo image that shifts towards the nearest loudspeaker, which is a severe disadvantage of the stereophonic technique.

Binaural recording and reproduction

Another technique is binaural reproduction. The sound field is recorded with two microphones placed at the entrance of the ear canals of an artificial head. The recorded sound is reproduced through headphones. The listener perceives an exact copy of the sound field present during the recording session. A disadvantage of this technique is that the listener still hears the same signals at the same position in the room when he moves his head or walks around. Besides, sound reproduction through headphones often leads to 'in-head localization', so that a good evaluation of spatial cues becomes impossible.

Since the 1970s, efforts have been made to develop multi-channel reproduction systems that generate a sound field which envelops the listener, as in natural sound fields, and which resembles the original as closely as possible: the quadraphonic and surround sound configurations are examples of this. But the generated sound fields are only correct within a small area. Outside this area, spatial and temporal distortion occurs [Verheijen, 1997].

Reproduction system with array technology

A new method of sound reproduction has been developed at the Laboratory of Acoustical Imaging and Sound Control. When loudspeaker arrays are driven with the correct functions, wave fronts can be synthesized with predefined temporal and spatial properties. Hence, the generated sound fields are correct within the entire listening space. The concept of WFS is depicted in figure 1.1.

Fig. 1.1 Concept of WFS. The wave field, emitted by a primary source, is recorded by a microphone array located at the plane y = y_M. The recorded signals are extrapolated to a loudspeaker array at the plane y = y_L with the operator W. The extrapolated signals are fed to the loudspeakers, yielding a synthesized wave field which is an image of the original sound field.

A prototype of such a reproduction system was built, consisting of 160 loudspeakers disposed in a rectangular shape.

1.2 The acoustics of a hall

Consider a monopole source placed on the stage of a concert hall, emitting a spherical wave that propagates in the enclosed space. Most of the sound energy reflects at boundaries such as the front, back and side walls, the ceiling, the floor, seats and other obstacles, giving rise to a complex wave field. Part of the energy is absorbed by the walls. After successive reflections the wave field vanishes by absorption. The temporal and spatial properties of the direct wave front are modified after each reflection, thus determining the acoustics of the concert hall.
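The buildup of such a response - a direct wave front followed by delayed, attenuated reflections - can be mimicked with a toy model. The following Python sketch is purely illustrative (the delays and amplitudes are invented, not measured hall data) and anticipates the convolution relation given below as equation 1.1:

```python
import numpy as np

fs = 44100                      # sampling rate [Hz]

# toy impulse response: direct sound plus a few discrete reflections,
# each delayed and attenuated (illustrative values only)
h = np.zeros(fs)                # 1 second long
reflections = [(0.020, 1.00),   # (arrival time [s], relative amplitude)
               (0.032, 0.55),
               (0.041, 0.40),
               (0.057, 0.30),
               (0.080, 0.18)]
for t_arr, amp in reflections:
    h[int(t_arr * fs)] += amp

# received pressure = convolution of the source signal with the impulse response
t = np.arange(0, 0.01, 1 / fs)
s = np.sin(2 * np.pi * 500 * t) * np.hanning(t.size)   # short 500 Hz burst
p = np.convolve(s, h)
print(p.shape)
```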
Each concert hall, with its specific (enclosing) geometry and its specific reflecting properties at the boundaries, thus has its own specific acoustics.

The sound pressure of a wave field due to a source on the stage emitting a spherical acoustical pulse can be recorded at a receiver position, yielding the transfer function between source and receiver. We can write

p(t) = \int_{-\infty}^{+\infty} h(\tau)\, s(t-\tau)\, d\tau,   (1.1)

where p(t) represents the pressure at the receiver position, s(t) is the source signal and h(t) is the impulse response or acoustical response. For a causal signal the lower boundary of the integral, -\infty, is replaced by 0. The impulse response h(t), which is the response to a source signal s(t) = \delta(t), describes all properties of the transmission path between source and receiver. In the frequency domain the convolution reduces to a multiplication:

P(\omega) = S(\omega)\, H(\omega),   (1.2)

with H(\omega) the transfer function for the source-receiver pair and S(\omega) the spectrum of the source.

A single impulse response measured in a concert hall shows only the temporal properties of the specific transmission path between the source and one receiver. Measuring the impulse response along an array of microphones with equidistant spacing gives a multi-channel recording with a temporal and spatial distribution, thus characterising the acoustics of the hall. This is sufficient information to allow a reproduction of the acoustical properties in another room.

1.3 Outline of this thesis

The technique of the auralization of a wave field consists of measurement of the impulse responses of a hall on a microphone array, analysis of these measurements, followed by extrapolation and emission by loudspeaker arrays in a reproduction room. The aim of this thesis is to explain this process in more detail and to compare auralized and original signals in a physical and a perceptual evaluation.

In the next chapter, the Rayleigh integrals which describe the theory of WFS will be given, and the simplifications necessary to arrive at a realizable system are discussed. In chapter 3 the three steps necessary to auralize wave fields will be investigated: recording, processing and reproduction. Chapter 4 evaluates the quality of the reproduction system for a simple wave field: a single plane wave. When auralization takes place in a reproduction room with its own acoustical properties, extra reflections are added to the synthesized original wave field; the influence of this reproduction room will also be examined. Chapter 5 gives a physical comparison between an original, a simulated (in a virtual anechoic reproduction room) and a measured auralized signal. To complement the physical description, a subjective appreciation is made in chapter 6. Finally, in chapter 7 some conclusions are drawn and recommendations are made for further research.

Chapter 2  Wave Field Synthesis Theory

2.1 Introduction

In this chapter a summary of the wave field synthesis (WFS) theory is presented. Starting from two fundamental equations to which the names of Newton and Hooke are closely related, the acoustic wave equation will be derived. Its solutions for a plane wave and a monopole source are treated in subsections 2.2.1 and 2.2.2. In section 2.3 the Kirchhoff and Rayleigh integrals are derived, being the fundamentals of WFS.
These integrals describe the reconstruction of a (real or virtual) primary wave field inside a closed surface from a continuous distribution of secondary sources on that surface. In the last sections the effects of finite and discrete arrays will be investigated. For a more elaborate treatment the reader is referred to [Berkhout, 1987].

2.2 The acoustic wave equation

Consider a homogeneous isotropic fluid with zero viscosity. The wave equation for a compressional wave is obtained from two basic equations. The first equation follows Newton's second law of motion. It states the relationship between pressure variations in space and changes in particle velocity over time:

\nabla p(\mathbf{r}, t) = -\rho_0 \frac{\partial \mathbf{v}(\mathbf{r}, t)}{\partial t},   (2.1)

with the scalar p and the vector v the acoustic pressure and the particle velocity, respectively, as a function of the position r = (x, y, z); \rho_0 represents the mass density. In the frequency domain this equation reads

\nabla P(\mathbf{r}, \omega) = -j\omega\rho_0 \mathbf{V}(\mathbf{r}, \omega),   (2.2)

with \omega the radial frequency. The second equation follows Hooke's law for compressional fluids. It gives a relation between particle velocity variations in space and pressure changes in time:

\nabla \cdot \mathbf{v}(\mathbf{r}, t) = -\frac{1}{K} \frac{\partial p(\mathbf{r}, t)}{\partial t},   (2.3)

where K represents the compression modulus. In the frequency domain this equation reads

\nabla \cdot \mathbf{V}(\mathbf{r}, \omega) = -\frac{j\omega}{K} P(\mathbf{r}, \omega).   (2.4)

From these two basic equations the wave equation is derived:

\nabla^2 p(\mathbf{r}, t) - \frac{1}{c^2} \frac{\partial^2 p(\mathbf{r}, t)}{\partial t^2} = 0,   (2.5)

where c = \sqrt{K/\rho_0} represents the propagation velocity of the wave (the density \rho_0 is supposed to be homogeneous). This wave equation describes the distribution of the pressure field in space and time. Applying the temporal Fourier transform, we get the so-called Helmholtz equation

\nabla^2 P(\mathbf{r}, \omega) + k^2 P(\mathbf{r}, \omega) = 0,   (2.6)

where k is the wave number and equals \omega/c. The wave field at all positions can be calculated by solving this wave equation. In the next two subsections we will look at two fundamental solutions of the wave equation: the plane wave and the spherical wave.

2.2.1 Plane wave solution

The pressure field of a plane wave with propagation direction n is described by

p(\mathbf{r}, t) = s\!\left(t - \frac{\mathbf{n} \cdot \mathbf{r}}{c}\right),   (2.7)

where s(t) is the source function. In the frequency domain we get

P(\mathbf{r}, \omega) = S(\omega)\, e^{-jk\,\mathbf{n} \cdot \mathbf{r}}.   (2.8)

This equation satisfies the Helmholtz equation 2.6. The particle velocity of this plane wave can be derived using Newton's equation 2.2:

\mathbf{V}(\mathbf{r}, \omega) = \frac{1}{\rho_0 c} P(\mathbf{r}, \omega)\, \mathbf{n}.   (2.9)

V is proportional to P with a factor 1/(\rho_0 c), without phase difference.

2.2.2 Spherical wave solution

The pressure field of a spherical wave at position r, generated by a monopole source at the origin, is given in the time domain by

p(\mathbf{r}, t) = \frac{s\!\left(t - \frac{|\mathbf{r}|}{c}\right)}{|\mathbf{r}|},   (2.10)

where |r| is the distance from the source to the receiver position and s(t) is the source function. In the frequency domain this equation reads

P(\mathbf{r}, \omega) = S(\omega)\, \frac{e^{-jk|\mathbf{r}|}}{|\mathbf{r}|}.   (2.11)

This equation shows that the sound pressure is inversely proportional to the distance |r| and that it is delayed by |r|/c, due to the finite propagation velocity c of sound waves. The particle velocity (using Newton's equation) can be written as

\mathbf{V}(\mathbf{r}, \omega) = \frac{1}{\rho_0 c}\, \frac{1 + jk|\mathbf{r}|}{jk|\mathbf{r}|}\, P(\mathbf{r}, \omega)\, \frac{\mathbf{r}}{|\mathbf{r}|}.   (2.12)

Note that in the far field (k|r| >> 1) the spherical wave can locally be considered as a plane wave.

2.3 The Kirchhoff-Helmholtz integral and Rayleigh integrals

In this section the theory of WFS is summarized.
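Before these elementary solutions are used as secondary sources in the integrals below, a quick numerical sanity check may help. The following Python sketch is only an illustration (the 1 kHz test frequency and the receiver position are arbitrary assumptions, and the code is not part of the thesis); it implements equations 2.8, 2.9, 2.11 and 2.12 and verifies that in the far field the monopole locally shows the plane-wave impedance \rho_0 c:

```python
import numpy as np

rho0 = 1.21          # mass density of air [kg/m^3]
c = 340.0            # propagation velocity [m/s]
f = 1000.0           # test frequency [Hz]
k = 2 * np.pi * f / c

def plane_wave(S, n, r):
    """Pressure and particle velocity of a plane wave (eqs. 2.8, 2.9)."""
    P = S * np.exp(-1j * k * np.dot(n, r))
    V = P / (rho0 * c) * np.asarray(n)
    return P, V

def spherical_wave(S, r):
    """Pressure and particle velocity of a monopole at the origin (eqs. 2.11, 2.12)."""
    R = np.linalg.norm(r)
    P = S * np.exp(-1j * k * R) / R
    V = (1 + 1j * k * R) / (1j * k * R) * P / (rho0 * c) * np.asarray(r) / R
    return P, V

# far-field check: for k|r| >> 1 the monopole behaves locally like a plane wave,
# i.e. the ratio |P| / |V| approaches the plane-wave value rho0 * c
P, V = spherical_wave(1.0, np.array([0.0, 10.0, 0.0]))
print(abs(P) / np.linalg.norm(V), rho0 * c)
```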
2.3.1 The Kirchhoff-Helmholtz integral

According to the Huygens principle, each element of a wave front coming from a point source can be seen as a secondary source emitting a spherical wave. The contributions of all these secondary sources together form a new wave front (see figure 2.1).

Fig. 2.1 Representation of the Huygens principle: the primary wave front at time t acts as a distribution of secondary sources which form the wave front at time t + Δt.

The quantification of Huygens's principle is given by the Kirchhoff-Helmholtz integral: the wave field in a source-free volume V due to sources outside V can be described by a distribution of sources along the surface S of the volume. The pressure at a point A in volume V is mathematically described by (see e.g. [Berkhout, 1987])

P(\mathbf{r}_A, \omega) = \frac{1}{4\pi} \oint_S \left[ P(\mathbf{r}, \omega)\, \frac{\partial}{\partial n}\!\left(\frac{e^{-jkr}}{r}\right) - \frac{\partial P(\mathbf{r}, \omega)}{\partial n}\, \frac{e^{-jkr}}{r} \right] dS,   (2.13)

where r = |r|, r being the vector from the secondary source to the point of interest A; n = |n|, n being the normal vector to the surface S pointing into the reconstruction volume V; k is the wave number; and P(r, ω) is the Fourier-transformed pressure distribution on S due to primary sources outside S, as shown in figure 2.2.

From this equation it follows that any pressure wave field within a source-free volume V, resulting from a source distribution outside V, may be synthesized by means of monopole and dipole distributions on the closed surface S (figure 2.2). The strength of each monopole (second term in equation 2.13) is determined by the gradient of the pressure on S, which is proportional to the normal component of the particle velocity. The strength of each dipole (first term in equation 2.13) is given by the pressure P on S. Note that in the absence of primary sources inside S the pressure distribution inside V can be exactly synthesized using the correct driving signals for the secondary sources. In the following subsection we will find that in some cases the Kirchhoff-Helmholtz integral can be simplified using only monopole or dipole secondary sources, leading to the Rayleigh integrals.

Fig. 2.2 The pressure field at point A inside volume V, caused by primary sources outside V, can be synthesized from the wave fields of monopole and dipole distributions on S.

2.3.2 The 3-D Rayleigh integrals - planar arrays

Fig. 2.3 Configuration for the Rayleigh integrals using a planar distribution of secondary sources.

Consider the situation where a wave field P(r, ω) is generated by a primary source distribution in one half space. Volume V is bounded by the plane y = 0 (the surface S0) and by a hemisphere S1 of radius R in the other, source-free half space. The situation is depicted in figure 2.3. If we want to calculate the pressure at a point A in the source-free half space for a finite time interval (0 ≤ t ≤ T_max), R can always be chosen such that the integral contribution from the surface S1 has not yet reached point A within this time interval, so that this surface does not contribute to the total integral. According to [Berkhout, 1987], we can then write the following expressions for P_A in the space-frequency domain:

P(\mathbf{r}_A, \omega) = \frac{j\omega\rho_0}{2\pi} \int_{S_0} V_n(\mathbf{r}, \omega)\, \frac{e^{-jkr}}{r}\, dS   (2.14)

and

P(\mathbf{r}_A, \omega) = \frac{jk}{2\pi} \int_{S_0} P(\mathbf{r}, \omega) \cos\varphi\, \frac{1 + jkr}{jkr}\, \frac{e^{-jkr}}{r}\, dS,   (2.15)

where \cos\varphi = |y_A|/r. Equations 2.14 and 2.15 are called the first and second Rayleigh integral, respectively.
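The first Rayleigh integral can be checked numerically by brute force. The Python sketch below is a minimal illustration (the grid size, spacing and 500 Hz test frequency are arbitrary assumptions, not values from the thesis): it replaces the continuous monopole distribution of equation 2.14 by a dense, truncated grid on the plane y = 0, drives it with the normal particle velocity of a primary monopole placed behind that plane, and compares the synthesized pressure at a receiver point with the directly computed primary field:

```python
import numpy as np

rho0, c, f = 1.21, 340.0, 500.0
k = 2 * np.pi * f / c
omega = 2 * np.pi * f

def monopole(r_src, r):
    """Pressure and particle velocity of a unit monopole (eqs. 2.11, 2.12)."""
    d = np.atleast_2d(r) - r_src
    R = np.linalg.norm(d, axis=1)
    P = np.exp(-1j * k * R) / R
    V = ((1 + 1j * k * R) / (1j * k * R) * P / (rho0 * c))[:, None] * d / R[:, None]
    return P, V

r_src = np.array([0.0, -2.0, 0.0])          # primary monopole behind the plane y = 0
r_A = np.array([0.3, 1.5, 0.1])             # receiver point A in the half space y > 0

# dense, truncated grid of secondary monopoles on the plane y = 0
dx = dz = 0.05
X, Z = np.meshgrid(np.arange(-8, 8, dx), np.arange(-8, 8, dz))
sec = np.stack([X.ravel(), np.zeros(X.size), Z.ravel()], axis=1)

# drive each secondary monopole with the normal (y) particle velocity of the
# primary field and sum the contributions at A (discretized eq. 2.14)
_, V = monopole(r_src, sec)
Vn = V[:, 1]
R_A = np.linalg.norm(r_A - sec, axis=1)
P_A = 1j * omega * rho0 / (2 * np.pi) * np.sum(Vn * np.exp(-1j * k * R_A) / R_A) * dx * dz

P_direct, _ = monopole(r_src, r_A)
print(abs(P_A), abs(P_direct[0]))           # should agree closely
```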
The Rayleigh integrals show us that only contributions of monopoles or dipoles located on a plane need to be used for wave field synthesis.

The first Rayleigh integral states that the pressure field P_A at position A can be synthesized from the wave field of a monopole distribution on the plane surface S0. The strength of each monopole is given by the normal component of the particle velocity of the wave field, measured at the position of this particular monopole in the plane y = 0.

The second Rayleigh integral states that the pressure P_A can be synthesized from the wave field of a dipole distribution on S0, the strength of each dipole being given by the pressure of the incident wave field at the specific dipole location.

Equations (2.14) and (2.15) describe forward wave field extrapolation. Using these equations, the wave field at any position A can be synthesized if the wave field at the plane y = 0 is known.

This result is valid for an infinite continuous planar distribution of monopoles or dipoles. In practice two simplifications must be made:

1. the continuous distribution is replaced by a discrete distribution of sources. This means that above a certain frequency aliasing occurs, depending on the spacing of this discrete distribution;

2. the infinite distribution of sources is replaced by a finite distribution of sources. This means that diffraction effects occur at the edges.

Examples of these undesirable effects will be discussed in chapter 4.

In the next section we will see that it is possible to synthesize a wave field with a linear array instead of a planar array. However, the wave field is then no longer controlled throughout the entire volume, but only in a horizontal plane through the array. The shape of the wave front is modified in the vertical direction and becomes circular; in the horizontal plane, however, the shape of the wave front is unchanged.

2.3.3 The 2-D Rayleigh integrals - linear arrays

The 2-D integrals can be obtained by integration of equations 2.14 and 2.15 over the z-axis. The primary source is replaced by a primary line source, such that the integrands V_n and P are independent of z, yielding the 2-D Rayleigh I integral

P(\mathbf{r}_A, \omega) = \frac{j\omega\rho_0}{2} \int_{-\infty}^{+\infty} V_n(x, \omega)\, H_0^{(2)}(kr)\, dx,   (2.16)

and the 2-D Rayleigh II integral

P(\mathbf{r}_A, \omega) = \frac{jk}{2} \int_{-\infty}^{+\infty} P(x, \omega) \cos\varphi\, H_1^{(2)}(kr)\, dx,   (2.17)

with

r = \sqrt{(x - x_A)^2 + y_A^2}.   (2.18)

H_0^{(2)} and H_1^{(2)} are the zeroth-order and first-order Hankel functions of the second kind. These functions can be approximated by an exponential function for kr >> 1 (far-field approximation), such that

P(\mathbf{r}_A, \omega) = \sqrt{\frac{jk}{2\pi}}\, \rho_0 c \int_{-\infty}^{+\infty} V_n(x, \omega)\, \frac{e^{-jkr}}{\sqrt{r}}\, dx   (2.19)

for the 2-D Rayleigh I integral, and similarly the Rayleigh II integral in the far-field approximation is given by

P(\mathbf{r}_A, \omega) = \sqrt{\frac{jk}{2\pi}} \int_{-\infty}^{+\infty} P(x, \omega) \cos\varphi\, \frac{e^{-jkr}}{\sqrt{r}}\, dx.   (2.20)

The infinite secondary source plane S0 is now replaced by an infinite secondary source line.

2.3.4 The 2½-D Rayleigh I integral - linear arrays

Consider again the 3-D Rayleigh configuration of figure 2.3. The primary source and the receiver are positioned in the horizontal (x, y)-plane. The wave field of the primary source is synthesized by a secondary monopole source distribution on the plane S0 according to the 3-D Rayleigh I integral. On the plane S0 the contribution of the secondary sources on each vertical line can be approximated by the contribution of just one source on that line. This point source is taken in the horizontal plane of the primary source and the receiver, on the line y = y_L (figure 2.4). Each point source on y_L should then be driven with an adapted source signal. This approximation is called the stationary phase approximation and can be found in [Start, 1997] and [Verheijen, 1997]. The surface S0 is now transformed to a line y = y_L, since the contributions of the secondary sources on a vertical column L are replaced by one single secondary monopole source.
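The step from equations 2.16/2.17 to the far-field forms 2.19/2.20 relies on the asymptotic behaviour of the Hankel functions for kr >> 1. A quick numerical check of the zeroth-order case is given below (Python with SciPy, illustrative only; the distances and the 1 kHz frequency are arbitrary choices):

```python
import numpy as np
from scipy.special import hankel2

c, f = 343.0, 1000.0
k = 2 * np.pi * f / c

for r in (0.1, 0.5, 2.0, 10.0):          # distances [m]
    exact = hankel2(0, k * r)
    # asymptotic far-field form of H0^(2)(kr)
    approx = np.sqrt(2j / (np.pi * k * r)) * np.exp(-1j * k * r)
    print(r, abs(exact - approx) / abs(exact))
# the relative error drops quickly once kr >> 1
```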
This point source is taken in the horizontal plane of the primary source and the receiver on y yL (figure 2.4). Each point source on yL should then be driven with an adapted source signal. This approximation is called the stationary phase approximation and can be found in [Start, 1997] and [Verheijen, 1997]. The surface S 0 is now transformed to a line y yL since the contribution of secondary sources on column L are replaced by one single secondary monopole source. The 12 D = = 2 2.3 The Kirchhoff-Helmholtz integral and Rayleigh integrals 13 S0 x z y 11 00 00 r 11 11 00 S 00 11 00 11 r 11 00 00A 11 00 11 n r 1 0 0 1 0 1 r0 0 rL yL yS yR L Fig. 2.4 Representation for the Rayleigh I integral. The secondary point source at rL gives the largest contribution to the wave field in point A of all points on L. Rayleigh I integral is given by : P (rA ; ! ) = S (! ) r jk 2 ( Z +1 r 1 ) r0 cos epjkr e jkr dx: r0 + r0 r0 r0 0 0 (2.21) The driving function Qm xL ; ! of the secondary monopole point source is : r Qm (xL ; ! ) = S (! ) p jk 2 ( + ) Z +1 r 1 r0 cos epjkr ; r0 + r0 r0 0 (2.22) where the factor r0 = r0 r0 is a function of x and must be determined for each secondary source-receiver combination. It has been shown in [Start, 1997] that this factor can be replaced by: r r0 = r0 + r0 r yR yR yL ; yS (2.23) 14 Chapter 2: Wave Field Synthesis Theory under the stationary phase assumption. Thus the driving function will be independant of the secondary source-to-receiver distance for receivers on the line y yR . But the amplitude of the synthesized wave field will not be correct at other receiver lines. By choosing an appropriate distance for the reference line, the amplitude errors can be kept small in a large listening area [Sonke, 2000]. = 2.4 Discretisation For practical situations, continuous secondary source distributions will be replaced by discrete arrays of loudspeakers. When a signal is sampled in the spatial domain with sampling distance x, the spectrum in the k x -domain is convolved with a pulse-train with period = x. x should be chosen such that overlap does not occur. For time domain sampling, the maximum angular frequency that can be reproduced without aliasing for a given sampling interval t, referred to as the Nyquist frequency !Nyq , is given by : !Nyq (2.24) 2 = t An analogy can be made for sampling in the space domain: the maximum spatial frequency that can be distinguished is the spatial Nyquist frequency given by (2.25) x : The maximum value of kx for a given k is k sin max where max is the maxikx;Nyq = mum angle with respect to the x-axis of the plane wave components present in the signal. The temporal frequency where spatial aliasing starts to occur is given by fal = c 2x sin max : (2.26) Practical examples will be given in the next chapters. 2.5 Finite arrays When using finite arrays instead of infinite arrays diffraction effects in the synthesized wave field will occur. The approximate reconstruction area of a finite array can be found by drawing lines from the primary source towards the edges of the truncated array to the receiver line. We see that the receiver line can not 2.5 Finite arrays 15 receiver line Fig. 
Fig. 2.5 The correct reconstruction area is defined by the edges of the finite array of secondary sources.

Outside the reconstruction area diffraction is seen, whereas inside this area the diffraction waves and the synthesized wave field interfere. Examples of this truncation effect will be shown in chapter 4. To reduce these effects, a Hanning window can be applied over the length of the array, known as tapering. This causes the diffraction waves to be attenuated inside as well as outside the reconstruction area. Figure 2.6 shows a Hanning window applied to the truncated array.

Fig. 2.6 Reduction of diffraction effects by a Hanning window applied to the amplitude of the driving function of the secondary sources.

Chapter 3  Auralization of wave fields

3.1 Introduction

In this chapter the concept of auralization based on the theory of wave field synthesis is described. Auralization means making audible the acoustics of a room (a concert hall, for example) in another room (a reproduction room); the reproduced sound field should therefore resemble the original one as closely as possible. The three steps of auralization are investigated in the next three sections of this chapter: recording, extrapolation and reproduction. After measurement (or simulation) of the impulse responses which characterize the sound field in (part of) a concert hall with array technology, the spatial and temporal characteristics of the hall are known. These impulse responses can be extrapolated to the positions of the loudspeaker arrays in the reproduction room and convolved with an anechoic (music) signal, yielding this signal with the same temporal and spatial structure, i.e. with the same acoustics, as in the original hall.

3.2 Recording

Linear array

To reproduce the acoustic field of a (sub-)volume of the source-free audience area of a concert hall in a reproduction room (see figure 3.1), the pressure or the particle velocity at the bounding surface of this volume should be known, according to the theory of the previous chapter. This can be done by recording the sound at that surface with planar arrays of pressure or velocity microphones.
The sound field could then be re-created using planar arrays of loudspeakers with monopole or dipole characteristics, according to the Rayleigh integrals, to reconstruct the correct 3-D sound field in the reproduction room.

Fig. 3.1 The acoustics of part of the listening area of a concert hall can be re-created in the reproduction room.

However, this is not realizable because of the enormous number of loudspeakers that would have to be used and for which the driving signals would have to be calculated, leading to a problem of computational power. Therefore, limiting ourselves to a horizontal planar (sub-)area, linear arrays of microphones and loudspeakers will give an approximately correct sound field reconstruction using the 2½-D Rayleigh integral described in chapter 2. Hence, the impulse responses of a source on the stage will be measured with a linear array of microphones, giving a multi-trace offset-traveltime recording that reveals the wave fronts traveling in the enclosure.

Note that just recording at the boundary surface of the sub-area is not flexible, since we can then only reproduce the wave field within that particular sub-area. At the end of this section a more flexible measurement method will be described which allows reproduction of an arbitrary sub-area of the hall considered.

Linear array of omnidirectional pressure microphones

We can measure the pressure of the wave field using a linear array of omnidirectional pressure microphones. However, in order to extrapolate the composing waves to the different sides in the next step, we need to know where they came from. When the pressure is measured with an omnidirectional microphone array, no discrimination between directions of incidence is possible due to the symmetry of the linear array: wave fronts coming from front, back, above and below are all projected onto the same offset-traveltime plane. The different reflections could be identified by taking the hall geometry into account; however, this requires making a detailed model of each hall to be measured and a difficult analysis of the measurements.

Linear array of pressure and velocity microphones

When pressure and particle velocity microphones are used, the elevation angle of incidence can be determined. Directivity information is obtained from the pressure and velocity components. Since the three spatial components of the particle velocity vector are recorded, we can calculate the velocity component in an arbitrary direction.

Consider a plane wave propagating in the direction n. The particle velocity v(r, t) of a plane wave, given in section 2.2.1, reads

\mathbf{v}(\mathbf{r}, t) = \frac{p(\mathbf{r}, t)}{\rho_0 c}\, \mathbf{n}.   (3.1)

Taking a linear combination of the pressure and the velocity component in a direction m synthesizes a directivity pattern with the main lobe in the chosen direction m (m and n are unit vectors):

p_m(\mathbf{r}, t) = a\, p(\mathbf{r}, t) + b\, \mathbf{v}(\mathbf{r}, t) \cdot \mathbf{m} = \left(a + \frac{b}{\rho_0 c} \cos\psi\right) p(\mathbf{r}, t),   (3.2)

where \psi is the angle between n and m. The expression between brackets in equation 3.2 describes the directivity pattern, which depends on the values of a and b. In table 3.1 the values of a and b are given for a monopole, a dipole and a cardioid characteristic.

Table 3.1 Directivity patterns of a monopole, a dipole and a cardioid.

  characteristic    a      b           directivity
  monopole          1      0           1
  dipole            0      ρ0 c        cos ψ
  cardioid          1/2    ρ0 c / 2    (1 + cos ψ)/2

In figure 3.2 the different directivity patterns of table 3.1 are drawn.
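In code, the simulated directional microphone is a single linear combination of the recorded pressure and velocity components. The sketch below (Python; the geometry and amplitudes are illustrative, not the thesis software) applies equation 3.2 with the cardioid coefficients of table 3.1 to plane waves with different propagation directions:

```python
import numpy as np

rho0, c = 1.21, 340.0

def directional_response(p, v, m, a, b):
    """Linear combination of pressure and particle velocity (eq. 3.2):
    p_m = a*p + b*(v . m)."""
    return a * p + b * np.dot(v, m)

def plane_wave_pv(amplitude, angle):
    """Pressure and particle velocity of a plane wave whose propagation
    direction makes an angle 'angle' (radians) with the y-axis."""
    n = np.array([np.sin(angle), np.cos(angle), 0.0])
    p = amplitude
    v = p / (rho0 * c) * n          # eq. 3.1
    return p, v

m = np.array([0.0, 1.0, 0.0])       # cardioid main lobe along +y
a, b = 0.5, 0.5 * rho0 * c          # cardioid coefficients from table 3.1

for deg in (0, 60, 90, 180):
    p, v = plane_wave_pv(1.0, np.radians(deg))
    print(deg, round(directional_response(p, v, m, a, b), 3))
# 0   -> 1.0   full sensitivity for a wave propagating along the main lobe
# 60  -> 0.75
# 90  -> 0.5
# 180 -> 0.0   a wave propagating in the opposite direction is rejected
```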
Fig. 3.2 Directivity patterns of a monopole, a dipole and a cardioid.

Thus we can simulate the response of an array of directional microphones with a cardioid characteristic. If we define γ as the angle between m and the (x, y)-plane and β as the angle between the y-axis and the projection of m on the (x, y)-plane (figure 3.3), we can write

m_x = \sin\beta\cos\gamma, \qquad m_y = \cos\beta\cos\gamma, \qquad m_z = \sin\gamma.

Fig. 3.3 Elevation (γ) and azimuth (β) of a vector m.

Then, from equation 3.2 and table 3.1, the cardioid characteristic gives

p_{\beta,\gamma}(\mathbf{r}, t) = \tfrac{1}{2} p(\mathbf{r}, t) + \tfrac{1}{2}\rho_0 c\, \mathbf{v}(\mathbf{r}, t) \cdot \mathbf{m}
 = \tfrac{1}{2} p + \tfrac{1}{2}\rho_0 c\, [v_x m_x + v_y m_y + v_z m_z]
 = \tfrac{1}{2} p + \tfrac{1}{2}\rho_0 c\, [v_x \sin\beta\cos\gamma + v_y \cos\beta\cos\gamma + v_z \sin\gamma].   (3.3)

First let us have a look at turning the cardioid in the (y, z)-plane (β = 0):

p_\gamma = \tfrac{1}{2} p + \tfrac{1}{2}\rho_0 c\, [v_y \cos\gamma + v_z \sin\gamma].   (3.4)

This allows us to discriminate between waves with different elevation angles of incidence. In this thesis, however, we are not interested in sound waves coming in at an elevation angle different from zero (outside the (x, y)-plane), since the wave fields will be reproduced with loudspeaker arrays in the horizontal plane only (see section 3.4). In this case we obtain

p_\beta = \tfrac{1}{2} p + \tfrac{1}{2}\rho_0 c\, [v_x \sin\beta + v_y \cos\beta].   (3.5)

In the simplest case, we can simulate an array of cardioid microphones oriented to the front (β = 0). See figure 3.4, cardioid number 1, for a drawing of this situation.

Fig. 3.4 Three different orientations of the cardioid for a wave coming in with angle θ.

As said, we want to extrapolate the waves towards the direction from which they came. However, this cardioid will detect, and thus extrapolate to the front, any wave coming in with an angle θ ≠ 0. As we will see in the next chapter, reconstruction using this cardioid is not perfect.

To improve the separation, the angle of incidence of the incoming wave front needs to be known. For a complex wave field θ is not defined. However, by applying a 2-D Fourier transform in the time and space domain, the wave field can be written as a weighted sum of monochromatic plane waves, each with a different angle of incidence, defined by \sin\theta = k_x/k:

P(x, 0, n\Delta\omega) = \frac{\Delta k_x}{2\pi} \sum_m \tilde{P}(m\Delta k_x, 0, n\Delta\omega)\, e^{\,j(m\Delta k_x)x},   (3.6)

where \tilde{P}(m\Delta k_x, 0, n\Delta\omega) represents the discretized two-dimensional Fourier transform of p(x, 0, t). Thus we can select only the waves coming from the front for extrapolation.

Now that the angle of incidence is known, we can turn the cardioid towards this direction to improve its sensitivity. Maximal sensitivity in this direction is obtained when β = θ (cardioid 2 in figure 3.4). For each plane wave with angle θ, we can calculate the contribution by

p_{front} = \tfrac{1}{2} p + \tfrac{1}{2}\rho_0 c\, [v_x \sin\theta + v_y \cos\theta],   (3.7)

where v_x and v_y are the velocities in the x and y direction, respectively. Knowing

v_x = v\cos(\tfrac{\pi}{2} - \theta) = v\sin\theta, \qquad v_y = v\sin(\tfrac{\pi}{2} - \theta) = v\cos\theta,   (3.8)

we obtain

p_{front} = \tfrac{1}{2} p + \tfrac{1}{2}\rho_0 c\, v\, [\sin\theta\sin\theta + \cos\theta\cos\theta]   (3.9)
          = \tfrac{1}{2} p + \tfrac{1}{2}\rho_0 c\, v = p.   (3.10)

A property of the Fourier transform is that waves coming in with angle θ from the front and with angle π − θ from the rear end up with the same k_x. This means that if a complex wave field contains contributions from θ and from π − θ, the component \tilde{P}(m\Delta k_x, 0, n\Delta\omega) used with a cardioid for k_x = k\sin\theta will also contain a contribution from π − θ, which will be picked up by the cardioid. Let us take p = p^+ + p^- and v = v^+ + v^-, where p^+ and v^+ correspond to the part of the wave field due to the wave coming in at angle θ, and p^- and v^- to the part due to a simultaneous wave from π − θ.
Then

p_{front} = \tfrac{1}{2}(p^+ + p^-) + \tfrac{1}{2}\rho_0 c\, [(v_x^+ + v_x^-)\sin\theta + (v_y^+ + v_y^-)\cos\theta]
          = \tfrac{1}{2}p^+ + \tfrac{1}{2}\rho_0 c\, v^+ + \tfrac{1}{2}p^- + \tfrac{1}{2}\rho_0 c\, [v^-\sin^2\theta - v^-\cos^2\theta]
          = p^+ + \tfrac{1}{2}p^-\,(1 - \cos 2\theta).   (3.11)

We can reduce the contribution of the wave coming from π − θ to zero by turning the back of the cardioid towards it (β = −θ, cardioid 3 in figure 3.4):

p_{front} = \tfrac{1}{2}p^+ + \tfrac{1}{2}\rho_0 c\, [-v_x^+\sin\theta + v_y^+\cos\theta] + \tfrac{1}{2}p^- + \tfrac{1}{2}\rho_0 c\, [-v_x^-\sin\theta + v_y^-\cos\theta]
          = \tfrac{1}{2}p^+ + \tfrac{1}{2}\rho_0 c\, v^+ \cos 2\theta + 0
          = \tfrac{1}{2}p^+\,(1 + \cos 2\theta).   (3.12)

Note that this also reduces the sensitivity for the wave coming in with angle +θ. For the rear array we use cardioid 3 mirrored in the x-axis (β = π + θ, cardioid 3' in figure 3.4):

p_{rear} = \tfrac{1}{2} p - \tfrac{1}{2}\rho_0 c\, [v_x \sin\theta + v_y \cos\theta].   (3.13)

Cross array configuration

When θ approaches ±π/2, the cardioid must be turned such that almost no signal is left. In order to better detect waves coming in at these angles, a second array is added, perpendicular to the first. For this second array we find, in the same way (with θ now the angle of incidence with respect to the x-axis):

p_{left} = \tfrac{1}{2} p + \tfrac{1}{2}\rho_0 c\, [v_x \cos\theta - v_y \sin\theta],   (3.14)

p_{right} = \tfrac{1}{2} p - \tfrac{1}{2}\rho_0 c\, [v_x \cos\theta + v_y \sin\theta].   (3.15)

By recording impulse responses along two perpendicular arrays of microphone positions over the length and the width of the hall (as shown in figure 3.5), with a linear array of pressure and velocity microphones, a sound field can be re-created at arbitrary sub-areas using wave field extrapolation. This will be further discussed in sections 3.3 and 3.4. In chapter 5, examples of auralization at different sub-areas will be shown. Using equations 3.12, 3.13, 3.14 and 3.15 we can separate the wave fields coming from the front, rear, left and right. Examples of wave fields separated with a cardioid and a steering cardioid will be shown in chapter 4.

Fig. 3.5 Measuring configuration in the concert hall along two arrays of microphone positions over the full length and width of the hall.

Experimental setup

The impulse responses are measured along a cross array of microphone positions. The measurements were made every 5 cm along a rail, using a remote control. To discriminate between wave fields coming from different directions, pressure and particle velocity were measured. The microphone used for the measurements, the Soundfield MKV microphone, is able to measure the pressure and the three particle velocity vector components. It is composed of four microphones placed on a tetrahedron. Directivity patterns can thus be synthesized by taking a linear combination of the pressure and the velocity components, giving simulated cardioid microphones, as described above.

3.3 Extrapolation

After the separation of the signals coming from front/rear and left/right with a cardioid characteristic, the inverse extrapolation to the loudspeaker arrays takes place, as shown in figure 3.6. The inverse extrapolation is done using the formulas derived in this section.

Fig. 3.6 Inverse extrapolation to the loudspeaker arrays.

In section 2.3 we have seen that:

1. the 3-D Rayleigh integral gives a correct synthesized wave field for all types of sources in a volume, using continuous planar arrays of point sources;

2. the 2-D Rayleigh integral gives a correct synthesized wave field for plane waves and vertical line sources in a horizontal plane, using continuous linear arrays of (vertical) line sources;
3. the 2½-D Rayleigh integral gives an approximately correct synthesized wave field in a horizontal plane, using continuous linear arrays of point sources.

In this thesis we are dealing with linear arrays of point sources, so the 2½-D Rayleigh integral should be used. In equation 2.22, however, an amplitude factor appears which depends on the distance from secondary source to receiver and from primary source to secondary source: the distance from secondary source to receiver is known, but the distance from primary source to secondary source is only known for the direct wave and no longer for the reflected wave fields, which appear to come from virtual mirror sources. In this section the 2-D Rayleigh integral will therefore be used, in which the distance from primary source to secondary source does not appear. Plane waves and vertical line sources will be reproduced correctly, whereas amplitude errors will be present in the reproduction of a point source.

3.3.1 Forward extrapolator

In the space-frequency domain, the pressure at a point A = (x_A, y_1, z_A) is given by the Rayleigh II integral over a surface S at y = y_0, which separates the space into a receiver area and a source area (see section 2.3 and figure 3.7):

P(x_A, y_1, z_A, \omega) = \frac{1}{2\pi} \int_S \frac{1 + jkr}{r^2}\, \cos\varphi\, e^{-jkr}\, P(x, y_0, z, \omega)\, dS,   (3.16)

with r = |\mathbf{r}| = \sqrt{(x_A - x)^2 + (z_A - z)^2 + \Delta y^2}.

Fig. 3.7 The wave field in A resulting from source S can be written as a superposition of wave fields generated by secondary point sources on the recording level y = y_0.

Equation 3.16 can be written with the symbolic convolution notation

P(x, y_1, z, \omega) = W^+(x, \Delta y, z, \omega) ** P(x, y_0, z, \omega),   (3.17)

where '**' denotes convolution in x and z, and Δy = |y_1 − y_0|. W^+ is called the extrapolation operator:

W^+(x, \Delta y, z, \omega) = \frac{1}{2\pi}\, \frac{1 + jkr}{r^2}\, \cos\varphi\, e^{-jkr}.   (3.18)

Equation 3.17 describes the extrapolation of the pressure from the surface y_0 to the surface y_1, as shown in figure 3.8.

Fig. 3.8 Forward extrapolation (W^+) and inverse extrapolation (W^-) between the levels y = y_0 and y = y_1.

Forward extrapolation describes wave propagation away from the source, and inverse extrapolation describes wave propagation towards the source. In the wavenumber-frequency domain the spatial convolution is replaced by a multiplication, yielding

\tilde{P}(k_x, y_1, k_z, \omega) = \tilde{W}^+(k_x, \Delta y, k_z, \omega)\, \tilde{P}(k_x, y_0, k_z, \omega),   (3.19)

the extrapolator \tilde{W}^+ (the spatial Fourier transform of W^+) being given by

\tilde{W}^+(k_x, \Delta y, k_z, \omega) = e^{-j\sqrt{k^2 - (k_x^2 + k_z^2)}\,\Delta y}   for k_x^2 + k_z^2 \le k^2,
\tilde{W}^+(k_x, \Delta y, k_z, \omega) = e^{-\sqrt{(k_x^2 + k_z^2) - k^2}\,\Delta y}    for k_x^2 + k_z^2 > k^2.

For k_x^2 + k_z^2 ≤ k^2 the multiplication operator \tilde{W}^+ describes the phase shift of the waves traveling from the plane y = y_0 towards y = y_1. For k_x^2 + k_z^2 > k^2, \tilde{W}^+ describes the near-field waves, called evanescent waves, the amplitude of which decreases exponentially with increasing Δy.

3.3.2 Inverse extrapolator

The inverse extrapolation is defined in the same way as the forward extrapolation. The wave field extrapolated to y = y_1 can also be inversely extrapolated from y = y_1 back to y = y_0, as shown in figure 3.8. From equation 3.19 the inverse extrapolator in the wavenumber-frequency domain, denoted \tilde{W}^-, is easily found:

\tilde{P}(k_x, y_0, k_z, \omega) = \tilde{W}^-(k_x, \Delta y, k_z, \omega)\, \tilde{P}(k_x, y_1, k_z, \omega).   (3.20)
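Both the forward extrapolator of equation 3.19 and its inverse are simple multiplications in the wavenumber domain. The following Python sketch demonstrates this for a single frequency and one lateral coordinate x (a simplification of the (k_x, k_z) case above; all parameter values are illustrative). The inverse used here is the complex conjugate of the forward operator for the propagating components - the 'matched filter' discussed next:

```python
import numpy as np

c = 340.0
f = 1000.0
k = 2 * np.pi * f / c

dx = 0.05                      # spatial sampling along the array [m]
N = 512
x = (np.arange(N) - N // 2) * dx
kx = 2 * np.pi * np.fft.fftfreq(N, d=dx)

def extrapolate(P_x, dy, inverse=False):
    """Phase-shift extrapolation over a distance dy, single frequency, 1-D in x."""
    P_kx = np.fft.fft(P_x)
    kyy = k**2 - kx**2
    prop = kyy >= 0
    W = np.zeros(N, dtype=complex)
    # propagating waves: pure phase shift
    W[prop] = np.exp(-1j * np.sqrt(kyy[prop]) * dy)
    # evanescent waves: exponential decay (kept decaying for the inverse
    # as well, to avoid the unstable growing exponent)
    W[~prop] = np.exp(-np.sqrt(-kyy[~prop]) * dy)
    if inverse:
        W[prop] = np.conj(W[prop])          # matched filter for propagating waves
    return np.fft.ifft(P_kx * W)

# a plane wave with 30 degrees incidence recorded at y = 0
theta = np.radians(30.0)
P0 = np.exp(-1j * k * np.sin(theta) * x)

P_fwd = extrapolate(P0, dy=3.0)                      # forward to y = 3 m
P_back = extrapolate(P_fwd, dy=3.0, inverse=True)    # and back again
print(np.max(np.abs(P_back - P0)))                   # small residual error
```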
Here we define the inverse wave field extrapolator as the conjugate of W + , the so-called ’matched filter’: ~ ~ ~ (kx; y; kz ; !) = W 1 ~ + ~W +(kx; y; kz ; !) ' W (kx; y; kz ; !); yielding ~ (kx; y; kz ; !) = W + ( p p(k +k ) (3.21) kx2 + kz2 k2 k y ; k 2 + k 2 > k 2 : x z ej k2 (kx2 +kz2 )y ; e 2 2 x z 2 For kx2 kz2 > k 2 the inverse extrapolator equals forward extrapolator. The p(kthe 2 +k 2 ) k 2 y + + x z real inversion of W in this case would be e which is unstable because it has a positive real exponent. Therefore we keep the negative sign. For more details the reader is referred to [Berkhout, 1987]. For kx2 ~ + kz2 k2 the operator is the inverse of the forward operator. In the space-frequency domain: W (x; y; z; ! ) = W + ( x; y; z; ! ): (3.22) 3.3.3 Extrapolator for 2-D Forward extrapolation The far-field approximation for the 2D Rayleight II integral was defined in equation 2.20: P (xA ; y = y; ! ) = r Z jk e jkr P (x; y = 0; ! ) cos p dx; 2 y=0 r where A is an arbitrary point at some plane y (3.23) = y (figure 3.9). The extrapolation operator is defined as: W +(x; y; ! ) = r jk e jkr 2 cos pr (3.24) 30 Chapter 3: Auralization of wave fields microphone array loudspeaker array x z r n r Source A listening area y y y = y y=0 Fig. 3.9 Forward extrapolation. Inverse extrapolation Consider now the situation of inverse extrapolation depicted in figure 3.10. The loudspeaker array is positioned between the source S and the microphone array. The inverse extrapolator can be found by applying equation 3.22 to equation 3.24, resulting in: W (x; y; ! ) = r e+jkr jk cos p : 2 r (3.25) Using W , we can compute the pressure which will need to be emitted by the loudspeaker arrays: P (xA; y; ! ) = r Z jk e+jkr P (x; y = 0; ! ) cos p dx: 2 y=0 r (3.26) The inverse operator is only valid if no sources are positioned between the loudspeaker array and the microphone array. Note that equation 3.26 does not depend on the distance between the source and the microphone array. This equation has been used for the examples that will be shown in the remainder of this thesis. 3.4 Reproduction 31 loudspeaker array x n r Source z r A microphone array y y listening y = y Fig. 3.10 3.4 area y=0 Inverse extrapolation. Reproduction The system used in the laboratory for sound reproduction by wave field synthesis is built of 10 arrays of light-weight electrodynamic loudspeakers which are active in a large frequency range. In conventional reproduction of audio signals, the bandwidth of electrodynamic loudspeakers is usually covered by two or three drivers with complementary frequency ranges, because it is very difficult to construct a single driver over the whole audio spectrum with enough output power. But in array systems the power is generated by many elements. So elements with low-power and large frequency ranges can be used in array systems. The elements of the array used are electrodynamic loudspeakers with an oval shape as shown in figure 3.11 below. This shape gives a higher directivity in the horizontal plane compared to the vertical plane since the larger dimension lies in the horizontal direction. The length of this loudspeaker is 12.6 cm while the width is 5.8 cm. Electrodynamic transducers are placed in large bars in a separate enclosure. The volume of each element determines the resonance frequency. To keep a low resonance frequency, the volume should be large. But too large a volume will increase 32 Chapter 3: Auralization of wave fields the dimensions and also the weight of the array. 
The size of the enclosure volume is about 1.8 dm³ and the resonance frequency lies around 180 Hz. The elements of the array are spaced 12.6 cm apart, set by the horizontal dimension of the loudspeakers. The spatial aliasing frequency is determined by the spacing between the loudspeakers and equals 1360 Hz.

The reproduction system consists of 160 loudspeakers distributed over ten line-array bars in a rectangular configuration, enclosing a listening area of about 24 m²; see figure 3.13 and figure 3.11 (top). The arrays hang at a height of about 1.65 m above the floor. Listeners can thus easily walk through the room to check the spatial properties of the reproduced sound field.

Fig. 3.11 Above: the reproduction room with loudspeaker arrays at the Laboratory of Acoustics; below: array of electrodynamic loudspeakers with oval-shaped diaphragms.

Two DSP (Digital Signal Processor) systems are used to calculate the driving signals for the loudspeakers. One DSP system convolves the early reflections (first 160 ms) of the extrapolated impulse responses with the audio signal, yielding the auralized wave field, and adds the reverberation signals, thus forming the driving signals Q_m for the loudspeakers. The reverberation signals are generated by the other DSP system by real-time convolution of four plane waves with the anechoic audio signal (figure 3.12). DSP 2 calculates the approximate delays and amplitudes for the individual loudspeakers, in such a way that the loudspeaker arrays synthesize the desired plane waves. The four plane waves come from each of the four walls (see figure 3.13(b)), generating the reverberation signals.

Fig. 3.12 Schematic setup of the DSP system: DSP 1 convolves the 64 extrapolated impulse responses (first 160 ms) with the anechoic signal s(t); DSP 2 generates the reverberation signal by convolving 4 plane waves with the anechoic signal; together they form the 64 driving signals Q_m.

The output of the DSPs contains 64 driving signals. As a total of 160 loudspeakers is present and there are only 64 channels in the system, the loudspeakers in the 6 front array bars (numbered 4 through 9 in figure 3.13(a)) are coupled per two: two loudspeakers per output channel. The loudspeakers in the 4 rear array bars (1 through 3 and 10) are coupled per four.

Fig. 3.13 (a) Top view of the arrays of loudspeakers; each array contains 16 loudspeakers. (b) Configuration of the 4 plane waves for the reverberation field.

Chapter 4  Auralization of plane waves

4.1 Introduction

In order to better understand the results of the auralization of a complex wave field measured in a concert hall, we start by studying a very simple wave field: a single plane wave. This allows us to easily identify the limitations of the reproduction system (spatial aliasing, truncation artefacts) and the influence of the reproduction room (reflections). In the first section the properties of a plane wave are discussed. Then the differences between continuous and discrete arrays are shown, as well as the effects of the transition from infinite to finite arrays. In the last two parts, simulation and measurement results for a single plane wave are given.

4.2 Properties of plane waves

The plane wave solution of the homogeneous wave equation described in equation 2.7, for propagation in the positive x direction, is

p(x, t) = p\!\left(t - \frac{x}{c}\right).   (4.1)

The wave front is defined by t − x/c = constant, which gives a straight line in the (x, t)-plane, independent of y (figure 4.1).
So the amplitude along the wave front must be constant. Theoretically the amplitude of a plane wave does not decay in a loss-free medium: there is no attenuation during propagation.

Fig. 4.1 Impulsive plane wave in the space-time domain.

The propagation direction is perpendicular to the wave front. If t is increased by Δt, then x has to be increased by Δx = cΔt to maintain the same amplitude, with c the constant velocity of sound. The specific acoustic impedance, defined by

Z_s(x, \omega) = \frac{P(x, \omega)}{V(x, \omega)},   (4.2)

is real and frequency independent for a plane wave and is given by \rho_0 c; in this special case the relation also holds in the (x, t)-domain:

Z_s(x, t) = \frac{p(x, t)}{v(x, t)}.   (4.3)

4.3 Experimental setup

The source function used in the following sections is a pulse with a bandwidth of 2.2 kHz, depicted in figure 4.2(a) in the time domain, with its amplitude spectrum in figure 4.2(b). The auralization experiment proceeds in the following steps:

1. the pressure and the particle velocity are simulated on a cross array in the horizontal (x, y)-plane to obtain a perfect input signal for the auralization, emulating the recorded signal described in section 3.2. In figure 4.2(c) the pressure field of a plane wave oriented at an angle θ = 30° with respect to the x-axis is depicted in the (x, t)-domain at the microphone positions of array I;

2. cardioid microphones are simulated, allowing the separation of the signals coming from different directions;

3. the separated signals are extrapolated, according to the theory of section 3.3, to the loudspeaker arrays at positions y = 3 m for array I and x = 2 m for array II, as depicted in figure 4.3;

4. the extrapolated signals are fed to the loudspeaker arrays;

5. the pressure of the auralized signals is measured at the position of array I for comparison with the input signal.

Fig. 4.2 (a) Source signal, (b) amplitude spectrum of the source signal and (c) wave field of a plane wave recorded by microphone array I.

Before using these extrapolated results for reconstruction in a real room, we have simulated the reconstruction in a reflection-free virtual room. This allows us to see the quality of an idealised auralization, without artefacts due to distortion from loudspeaker imperfections and reflections of the reproduction room. The results will be discussed in section 4.6. First, some effects of discretisation and of the finite array length will be investigated in the next two sections.

Fig. 4.3 Configuration of the recording microphones and the auralization loudspeakers.

4.4 Artefacts due to discretisation

By sampling a continuous signal in space, spatial aliasing will occur if the maximum frequency in the signal is larger than the aliasing frequency

f_{al} = \frac{c}{2\Delta x \sin\alpha_{max}},   (4.4)

where \alpha_{max} is the maximum angle of the sound emitted by the loudspeakers. The discretisation of continuous secondary source distributions by regularly spaced arrays of point sources leads to a periodic signal in the wave number domain. When the signal is undersampled in the spatial domain, the spectrum of the signal overlaps in the wave number domain, causing spatial aliasing.
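Equation 4.4 is easy to evaluate for the spacings used in this chapter. The small Python sketch below assumes c ≈ 343 m/s, which reproduces both the 1360 Hz quoted for the 12.6 cm loudspeaker spacing in section 3.4 (worst case, α_max = 90°) and the values used for figure 4.4:

```python
import numpy as np

c = 343.0   # speed of sound [m/s]

def f_aliasing(dx, alpha_max_deg):
    """Spatial aliasing frequency of a discretized array (eqs. 2.26 / 4.4)."""
    return c / (2.0 * dx * np.sin(np.radians(alpha_max_deg)))

# loudspeaker spacing of the reproduction system, worst case (90 degrees):
print(f_aliasing(0.126, 90))        # ~1361 Hz, cf. section 3.4
# spacings used in figure 4.4, for a plane wave at 30 degrees:
for dx in (0.063, 0.252, 0.504):
    print(dx, f_aliasing(dx, 30))   # ~5444 Hz, ~1361 Hz, ~680 Hz
```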
To illustrate this effect, simulated plane waves traveling at an angle $\theta = 30°$ with respect to the x-axis are shown in figure 4.4 for different distances between the loudspeakers. Subfigure 4.4(a) represents the reconstructed pressure field as a function of time for a distance of 6.3 cm between the loudspeakers. No spatial aliasing occurs, since $f_{max} < f_{al} = 5444$ Hz; only effects of the finite array length occur, which will be discussed in the next section.

Fig. 4.4 A plane wave traveling under an angle $\theta = 30°$ synthesized by a loudspeaker array for three different sampling distances dx: (a) dx = 12.6/2 = 6.3 cm, no aliasing; (b) dx = 12.6·2 = 25.2 cm, with a little aliasing; (c) dx = 12.6·4 = 50.4 cm, aliasing occurs. For each case the top panel shows the synthesized wave field, the middle panel the signal at the receiver position x = -1.3 m and the bottom panel the spectrum at that position.

In subfigures 4.4(b) and (c) the sampling distance is chosen equal to 25.2 cm and 50.4 cm, respectively. The wave fronts are still visible, but distortion of the wave field occurs: the anti-aliasing criterion is no longer respected, because $f_{al} = 1361$ Hz and 680 Hz, respectively. The lower subfigures give the reconstructed signals at the receiver position x = -1.3 m as a function of time and their spectra. We see that the source signal spreads out in time due to aliasing, and in the amplitude spectrum we see that the reconstructed field is not correct for frequencies above the aliasing frequency. In the remainder of this thesis the sampling distance is 25.2 cm (the loudspeakers are coupled per two at the front of the room), except for the back of the room, where the loudspeakers are coupled per four and the distance is 50.4 cm, as depicted in figure 4.3. So we expect spatial aliasing to be present in our wave field synthesis.

4.5 Artefacts due to finiteness of the array

A loudspeaker array used to reproduce wave fields will always have a finite length, so diffraction effects will occur in the synthesized wave field due to the truncation of the array. Subfigure 4.5(a) shows the wave front with two additional signals starting from the edges of the loudspeaker array. The middle image of subfigure 4.5(a) gives the amplitude of the wave front at position x = -1.3 m, where the diffraction effect is visible. To avoid truncation effects, the driving signals at the edges of the array should be attenuated relative to the center using a tapering function. Subfigure 4.5(b) shows the result of the tapering: the diffraction effects are attenuated. Note that the wave front is no longer constant at the edges; the amplitude has decreased there (subfigure 4.5(b), middle), as has the amplitude spectrum. In the following, a tapering window will be used to avoid diffraction effects.

4.6 Simulation of plane waves

We will now have a look at the simulation of plane wave auralization.
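All simulations below apply such a spatial taper at the array edges. The exact window used is not specified in the text; the sketch below assumes a simple raised-cosine roll-off over the outer elements, purely as an illustration:

# Illustrative spatial tapering window for an N-element loudspeaker array.
# A raised-cosine roll-off over the outer n_edge elements is assumed here;
# the actual window used for the auralization system is not specified.
import numpy as np

def edge_taper(n_speakers, n_edge):
    """Weights in [0, 1] that attenuate the driving signals near the array edges."""
    w = np.ones(n_speakers)
    ramp = 0.5 * (1.0 - np.cos(np.pi * np.arange(n_edge) / n_edge))  # rises 0 -> 1
    w[:n_edge] = ramp
    w[-n_edge:] = ramp[::-1]
    return w

# Example: taper the outer 4 elements of a 16-element array bar and multiply
# each loudspeaker's driving signal by its weight before playback.
weights = edge_taper(n_speakers=16, n_edge=4)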
The driving signals for the loudspeaker arrays are used to simulate the reconstruction in a reflection-free virtual room. The simulation was carried out as follows: at each microphone position $x_m$ the distances to all loudspeaker positions, $d_{ml}$, were calculated. The driving signal of each loudspeaker, $Q_l$, is delayed by the travel time corresponding to this distance, $d_{ml}/c$, and attenuated by $1/d_{ml}$, giving a new signal $Q_{ml}$. The simulated response $S_m$ of the microphone at position $x_m$ was found by summing the signals $Q_{ml}$ over all loudspeakers $l$:

$$S_m(t) = \sum_l Q_{ml}(t) = \sum_l \frac{1}{d_{ml}}\, Q_l\!\left(t - \frac{d_{ml}}{c}\right) \qquad (4.5)$$

Fig. 4.5 Plane wave synthesized by a loudspeaker array (a) without and (b) with a spatial tapering function. For each case the top panel shows the synthesized wave field, the middle panel the signal at receiver position x = -1.3 m and the bottom panel the spectrum at the same position.

To start, we will consider the reconstruction of a plane wave using non-steering cardioid microphone arrays. Next we will simulate steering cardioid microphone arrays to see whether the separation is better than with the (non-steering) cardioid microphone arrays. Then we will consider the case in which the plane wave travels in a plane other than the horizontal one.

4.6.1 Reproduction of non-steering synthesized cardioid microphone arrays

First of all, we look at a cross array of non-steering cardioid microphones. Subfigure 4.6(a) shows the driving signals for the loudspeaker arrays for a plane wave oriented at an angle $\theta = 30°$ with respect to the x-axis. The loudspeaker configuration and its numbering are shown in figure 4.3. The first 16 columns of subfigure (a) correspond to the rear array, the next 24 to the left array, the following 16 to the front array and the last 24 drive the right array. The empty columns at the back ends of the side arrays and for the rear array are caused by the fact that these loudspeakers are coupled per four. We see that the cardioid attributes a driving signal to the rear and right arrays (though smaller than for the front and left arrays), where there should be none. This is because the cardioid oriented towards the rear array is sensitive to all signals coming in at an angle different from zero; similarly, the cardioid oriented towards the right array will pick up any signal not coming from 270° with respect to the x-axis. Subfigure 4.6(b) shows the reconstructed signal between x = -1.7 m and x = +1.7 m at microphone array I. To get a better insight into the reconstruction, the ideal signal that we could expect to obtain, i.e. the original signal, has been added at the sides of the reconstructed signal. The wave front is correctly reproduced: it matches the arrival time and wave length of the original on both sides. We also observe a wave field at a different angle from the original wave front. To better understand this additional signal, we can simulate the reproduction of the plane wave using one array at a time instead of all four loudspeaker arrays. The contributions of each of the four arrays are shown in subfigures 4.6(c) through (f). The grayscale of subfigure (d) has been changed to make the small signal from the rear array visible.
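The single-array contributions of subfigures 4.6(c) through (f) follow directly from equation (4.5) by restricting the sum to the loudspeakers of one array. A minimal sketch of this free-field simulation is given below; the geometry and sampling rate are placeholders, not the actual system parameters:

# Minimal sketch of the virtual (reflection-free) reproduction of eq. (4.5):
# each loudspeaker driving signal is delayed by d_ml / c and attenuated by
# 1 / d_ml, and all contributions are summed at each receiver position.
import numpy as np

def simulate_receivers(Q, spk_pos, mic_pos, fs, c=343.0):
    """Q: (n_spk, n_samples) driving signals; spk_pos, mic_pos: lists of (x, y)
    coordinates in metres; returns the (n_mic, n_samples) simulated pressures."""
    n_spk, n_samp = Q.shape
    S = np.zeros((len(mic_pos), n_samp))
    for m, xm in enumerate(mic_pos):
        for l, xl in enumerate(spk_pos):
            d = np.linalg.norm(np.asarray(xm) - np.asarray(xl))   # distance d_ml
            delay = int(round(d / c * fs))                        # travel time in samples
            if delay < n_samp:
                S[m, delay:] += Q[l, :n_samp - delay] / d         # delayed, 1/d attenuated
    return S

# Restricting the loop over l to the loudspeakers of a single array bar gives
# the per-array contributions shown in subfigures 4.6(c)-(f).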
The reconstruction of the wave front by the front array does not cover the whole width. This can be understood from the fact that the leftmost loudspeaker has no left neighbour that constructively adds to the wave front. A shadow zone of 1.5 m is created on microphone array I, as shown in figure 4.7, where

$$\tan\theta = \frac{x_0}{3} \;\Rightarrow\; x_0 = 1.5\ \mathrm{m}. \qquad (4.6)$$

This shadow zone is compensated for by the contribution of the left array, which covers the complementary area (figure 4.6(e)). In this way the wave front is reconstructed over the whole width. For the right and rear arrays, we can see that the cardioid has attributed the same driving signal as for the left and front arrays, respectively, but with a (much) smaller amplitude. We can conclude that using cardioid microphones oriented perpendicular to the arrays does not allow a perfect reconstruction of the plane wave, because of these spurious signals. Note also the presence of aliasing, as expected.

Fig. 4.6 Reproduction of a plane wave after separation with cardioid microphone arrays. (a) Driving signals for the loudspeaker arrays. (b) Signal reproduced on microphone array I by all four loudspeaker arrays. (c)-(f) Signals reproduced by the loudspeaker arrays at the front, rear, left and right, respectively.

Fig. 4.7 Reproduction by the front array of a plane wave arriving under an angle of 30° with respect to the x-axis.

4.6.2 Reproduction of synthesized steering cardioid arrays

Next, the auralization of a plane wave with the same incidence angle as in the previous subsection has been simulated, but this time the cardioid is oriented in the direction of incidence of the plane wave, i.e. a steering cardioid is used. This direction can be obtained from the two horizontal particle velocity components, as explained in chapter 3. Figure 4.8 shows the results in the same way as figure 4.6.

Fig. 4.8 Reproduction of a plane wave after separation with steering cardioid microphone arrays. (a) Driving signals for the loudspeaker arrays. (b) Signal reproduced on microphone array I by all four loudspeaker arrays. (c)-(f) Signals reproduced by the loudspeaker arrays at the front, rear, left and right, respectively.

In subfigure (a), no signal is attributed to the right and rear arrays, as expected: the separation with the steering cardioid is perfect. The reconstructed signal in subfigure (b) matches the original signal well on both sides. In subfigures 4.8(c) through (f) only the front and left arrays contribute to the reconstruction of the plane wave. Aliasing is still present.
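As an aside, the sketch below shows one common way to synthesize such a steered cardioid from the pressure signal and the two horizontal particle velocity components. This is a generic first-order formulation, given here only for illustration; the exact processing used in chapter 3 may differ in details such as scaling and smoothing:

# Hedged sketch: a first-order cardioid steered from collocated pressure (p)
# and horizontal particle velocity (vx, vy) signals. vx and vy are assumed to
# be scaled by rho_0 * c, so that they equal p for a plane wave propagating
# along the corresponding axis. Not necessarily the exact processing of ch. 3.
import numpy as np

def propagation_azimuth(p, vx, vy):
    """Estimate the propagation direction from the time-averaged intensity <p v>."""
    return np.arctan2(np.mean(p * vy), np.mean(p * vx))

def steered_cardioid(p, vx, vy, prop_azimuth):
    """Cardioid with maximum sensitivity for waves propagating along prop_azimuth
    (i.e. looking back towards the source) and a null for the opposite direction."""
    v_along = np.cos(prop_azimuth) * vx + np.sin(prop_azimuth) * vy
    return 0.5 * (p + v_along)

# Steering the cardioid to the estimated propagation direction suppresses the
# spurious rear and right driving signals seen in figure 4.6(a), consistent
# with the clean separation observed in figure 4.8(a).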
4.6.3 Reproduction of a plane wave coming with an elevation angle different from zero

Consider now the case of a plane wave propagating out of the horizontal plane: a plane wave with an elevation angle of 45° and an azimuthal angle of 30°. The calculated driving signals for the loudspeakers are shown in subfigure 4.9(a). The reconstructed signal, embedded in the ideal plane wave simulation, is depicted in figure 4.9(b).

Fig. 4.9 Reproduction of a plane wave coming out of the horizontal plane. (a) Driving signals for the loudspeaker arrays. (b) Signal reproduced on microphone array I by all four loudspeaker arrays. (c)-(f) Signals reproduced by the loudspeaker arrays at the front, rear, left and right, respectively.

Plane waves propagating with an elevation angle should be recorded and reconstructed with planar arrays. However, as mentioned before, in practice we use linear arrays, which leads to errors in the reproduced signal. We can see that the reproduced signal arrives at a completely wrong time in subfigures (e) and (f) (the grayscale of (d) has been modified), and that the contributions of the front/rear and left/right arrays are not consistent with each other. We can calculate the apparent angle of incidence of a horizontal plane wave corresponding to the observed apparent velocity of the wave front as seen by arrays I and II, either geometrically or by measuring the slope of the wave fronts in the measurements. We find an apparent angle of 21° for array I and 51° for array II. This explains why the contributions from both arrays are not consistent with each other. In the auralization of concert hall acoustics (in the next chapter), no correction was applied to suppress these effects.

4.7 Auralization of plane waves in a non-anechoic room

Now that we have seen the results of the simulation of plane wave reconstruction, we will have a look at the auralization in a real room, as described in chapter 3. The first auralized signal is a plane wave parallel to array I. The measured result is displayed in figure 4.10(a), together with the ideal signal. The plane wave is reconstructed over the whole width with the correct wave length. Since the room in which the auralization was performed is not anechoic, we observe that reflections of the reproduction room are present in the auralized signal.
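Such reflections can be labelled by comparing their arrival times with simple image-source estimates. The sketch below illustrates the idea for first-order floor and ceiling reflections; the room dimensions and source/receiver positions are placeholders, not the actual laboratory geometry:

# Hedged illustration of the identification procedure: mirror the source in a
# boundary (image source) and compute the extra path length relative to the
# direct sound. The geometry below is a placeholder, not the actual room.
import math

c = 343.0                              # speed of sound [m/s]
src = (0.0, 3.0, 1.65)                 # assumed loudspeaker position [m]
rcv = (0.0, 0.0, 1.65)                 # assumed receiver position [m]
room_height = 3.0                      # assumed ceiling height [m]

def extra_delay_ms(image_src):
    """Arrival time of a reflection relative to the direct sound, in ms."""
    return (math.dist(image_src, rcv) - math.dist(src, rcv)) / c * 1e3

ceiling_image = (src[0], src[1], 2.0 * room_height - src[2])   # mirrored in the ceiling
floor_image   = (src[0], src[1], -src[2])                      # mirrored in the floor
print(f"first ceiling reflection: +{extra_delay_ms(ceiling_image):.1f} ms")
print(f"first floor reflection:   +{extra_delay_ms(floor_image):.1f} ms")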
We can identify them by calculating the delays corresponding to reflections from the walls, the floor and the ceiling. This will be useful in order to recognize them in a more complex auralized signal. We have identified the following reflections:

Event 1: the first reflection from the ceiling.

Event 2: the first reflection from the wall behind the front array, together with the first reflection from the floor.

Event 3: the first reflection from the floor followed by a second reflection from the ceiling, and conversely, as well as a first reflection from the wall behind the front array followed by a second reflection from the floor, which explains the stronger event.

Event 4: the reflection from the ceiling, then from the floor, and a third reflection from the ceiling.

Event 5: the first reflection from the wall behind the front array, then from the floor and the ceiling, and a fourth one from the floor.

Event 6: the first reflection from the wall behind the rear array.

The multiple reflections are more difficult to identify because of their complex pathways. After 60 ms it becomes harder to make out the individual reflections in the 'noise' of multiple reflections and diffractions.

Fig. 4.10 Plane waves with different incidence angles auralized in the reproduction room. (a) Plane wave incident parallel to the x-axis, auralized after processing with steering cardioid microphones. (b) Plane wave incident under 30° with respect to the x-axis, auralized after processing with steering cardioid microphones. (c) Plane wave incident under 30°, plotted in the octave band centered around 500 Hz.

Figure 4.10(b) is the result of the auralization of a plane wave incident under 30° with respect to the x-axis. A steering cardioid was used to make the separation between signals coming from front-rear and signals coming from left-right. Again the wave front is perfectly reproduced. We can clearly see aliasing between the arrows denoted 1, caused by the front array, and aliasing caused by the left array, denoted 2. By plotting this signal in the octave band centered around 500 Hz (see subfigure (c)) we can see that the aliasing is no longer present, since the frequencies above the aliasing frequency have been removed. Arrows 3 (in subfigure (b)) show the reflections of the aliasing at the right wall. Other reflections are also visible, but they are difficult to identify because of the complex pathways for a wave incident at an angle.

4.8 Conclusion

We can conclude that:

- a simple wave field can be auralized;
- a steering cardioid gives a perfect suppression of the signals from undesired directions;
- a lot of aliasing is observed;
- the influence of the reproduction room can easily be identified on a plane wave incident parallel to the x-axis.

Chapter 5 Physical comparison

5.1 Introduction

In the previous chapter we have seen that:

1. we can obtain a near perfect reconstruction for a simple signal;
2. the reproduction room has an influence on the reconstructed signal that cannot be neglected.

In this chapter we will see how a total impulse response, consisting of, amongst others, the wave front and the reflections of a complex room (a concert hall), is reconstructed.
In step 1 of the setup of section 4.3, the simple perfect input is now replaced by a complex measured signal, after which the extrapolation was carried out. The resulting driving signals were again fed to loudspeaker arrays in a virtual anechoic and a real (non-anechoic) reproduction room, where the resulting wave field was measured at the microphone positions of array I. In the following sections, physical characteristics of both the simulation and the measurements, will be described. The next chapter will discuss perceptual aspects. The original impulse responses were measured in two different concert halls: the Concertgebouw in Amsterdam and De Doelen in Rotterdam. Figure 5.1 shows the ground plan and the measurement configuration of both halls. The wave field was auralized in different areas in each concert hall. The center of the first area 51 52 Chapter 5: Physical comparison (a) Concertgebouw in Amsterdam (43m*28m*17m) (b) De Doelen in Rotterdam (57m*32m*14m) Fig. 5.1 Ground plan of the Concertgebouw and De Doelen, the microphone cross-array and the position of the areas where the wave field have been auralized. 5.2 Influence of the reproduction room on the reconstruction of a monopole source signal 53 coincides with the center of the concert halls. A second area was chosen 8 meters to the right of the center, and a third one 8 meters to the right and 5 meters to the back (only for the Concertgebouw). 5.2 Influence of the reproduction room on the reconstruction of a monopole source signal We have measured the response of the reproduction room on a wave front signal. This will be helpfull for the identification of reflections which were not present in the original signal. For this we have taken the first 3 ms of the measured impulse response of the Concertgebouw. 35 35 40 40 1 2 45 45 50 50 3 4 38 55 55 60 65 42 65 44 70 70 75 75 80 80 48 85 85 50 90 −4 −2 0 offset x [m] 2 (a) original signal Fig. 5.2 40 60 6 traveltime t [ms] traveltime t [ms] traveltime t [ms] 5 4 90 −4 46 52 −2 0 offset x [m] 2 (b) measured signal 4 −2 −1 0 1 offset x [m] 2 3 (c) zoom Auralization of a wave front in the reproduction room. Figure 5.2(a) shows the wave front of a monopole source and figure (b) gives the 54 Chapter 5: Physical comparison reconstructed wave front in the reproduction room. We find the same reflections already identified in figure 4.10(a). The reader is referred to section 4.7 for the identification of the numbering of the reflections. In figure 5.2(c) we can see the aliasing better. We observe that there is a lot of reverberation signal after the input signal has died off. 5.3 Auralization of the Concertgebouw impulse responses Measurements of the impulse responses of a monopole source positioned on the stage of the concertgebouw have been done by J. Baan and J.J. Sonke, recorded along the microphone position arrays I and II. The offset-traveltime representation of the measurements on array I is shown in figure 5.3 and in figure 5.4(a) for an enlargement. 40 60 traveltime t [ms] 80 100 120 140 160 −10 −5 0 5 10 offset x [m] Fig. 5.3 Impulse responses of the Concertgebouw as a function of position. The vertical axis represents the traveltime t beginning at zero when the source fires. The horizontal axis gives the offset x of the microphone from the center of the hall. 
The figure shows the temporal and spatial structure of the impulse responses. The direct front and the reflections from the side walls are easily identified. Knowing the geometry of the concert hall, we could also identify the other reflections, but this was not part of this study. After about 100 ms the reflections can no longer be identified individually: the reflection density increases and the wave field becomes more complex. Also, the amplitude decreases significantly after multiple reflections, due to both the geometrical spreading of the sound field and the absorption during reflection against the room boundaries. Using these measurements we can reconstruct the sound field in any area covered by the measurements.

Fig. 5.4 Offset-traveltime representation of the impulse responses of the Concertgebouw. (a) Original measured impulse responses. Between x = -1.7 and x = +1.7 m (area 1) the data have been replaced by: (b) the simulated wave field in the virtual reproduction room and (c) the measured wave field in the reproduction room.

5.3.1 Reproduction in area 1

Simulation

As discussed in chapter 4, the wave field on arrays I and II is separated with simulated steering cardioid microphones and inversely extrapolated, giving the driving signals for the loudspeaker arrays. These signals have been used in computer simulations in a virtual reproduction room. The results of the simulation in area 1, at x = [-1.7, +1.7] m, are displayed in figure 5.4(b), replacing that part of the original signal. We see, for example, that the simulated direct front coincides with the original wave front; the same holds for the first reflections from the side walls. Comparison with figure 5.4(a) shows a quite good overall similarity. The weaker signals at travel times of 61 ms, 69 ms and 77 ms are also reproduced almost perfectly. However, we see the presence of a lot of (strong) aliasing; the main artefacts are due to aliasing.

Measurements

The extrapolated signals have been fed to the loudspeaker arrays in the reproduction room, where the pressure of the resulting wave field was measured at the position of array I for comparison with the original sound field. The (x,t)-representation of the measurements can be seen in figure 5.4(c), between x = -1.7 m and x = +1.7 m. The digital processing system was isolated to prevent the generation of additional noise.

Here too the wave front connects to the original one, but it is a little weaker. A reason could be the following: we adapt the scale of the measurements so that its maximum amplitude equals the maximum amplitude of the original. Because the microphones at both ends are located just next to a loudspeaker, they pick up a too strong amplitude; the scale may therefore not be optimal. The reflections of the reproduction room that can be identified have been numbered (with primes) as in the previous figure. Event number 2' is the reflection from the floor and the ceiling and conversely, and also from the wall behind the front array and the floor (equivalent to number 3 of the events identified in section 4.7). The sidewall reflections are well reproduced, but not exactly at the correct time.
The reason is probably that the measurements were done at a different sampling frequency than the original signal and then resampled at a rate which is slightly different from the original one. We observe especially strong aliasing below the cross of the first reflections, which resembles what we have seen in figure 4.7(b) for a plane wave. A few weaker events present in the original signal are hardly noticeable in the measurements. This could be due to (interference with) the quite strong reverberation which we have noted in section 5.2. 5.3.2 Reproduction in area 2 The same experiments have been carried out in the second area (see figure 5.1). Figure 5.5(a) represents the offset-traveltime of the original impulse responses of the Concertgebouw around area 2. We observe that the amplitude of the events is smaller than in area 1 (except for the direct wave). Figure (b) shows the simulation of the reconstructed wave field. This time the direct wave is not well reproduced over the entire length. As shown in [Hulsebos et al., 2001] the area within which good reproduction can be obtained with a cross array configuration is limited. Here we are crossing the limit of this area. In figure 5.6 the results are shown for a smaller reproduction room (4*2 m instead of 4*6 m). We can see that the reproduction of the direct wave is now almost perfect so we conclude that we are now (nearly) within the limits. The other reflections are reasonably well reproduced. Figure (c) gives the measurements of the auralization in area 2. Just like the simulation, the wave field shows artefacts. On top of the ones present in the 58 Chapter 5: Physical comparison 40 40 40 50 50 50 60 80 70 70 traveltime t [ms] traveltime t [ms] 70 traveltime t [ms] 60 60 80 80 90 90 90 100 100 100 110 110 110 120 6 8 10 offset x [m] (a) original impulse responses Fig. 5.5 120 6 8 10 offset x [m] (b) simulated impulse responses 120 6 8 10 offset x [m] (c) measured impulse responses Offset-traveltime representation of the impulse responses of the Concertgebouw. (a) original measured impulse responses around area 2. Between x=6 and x=9.4 m the data have been replaced by: (b) the simulated wave field in the virtual reproduction room and (c) the measured wave field in the reproduction room. 5.3 Auralization of the Concertgebouw impulse responses 59 40 50 traveltime t [ms] 60 70 80 90 100 110 120 Fig. 5.6 6 8 10 offset x [m] Offset-traveltime representation of the impulse responses of the Concertgebouw simulated in a smaller area than area 2 (4*2 m instead of 4*6 m). 60 Chapter 5: Physical comparison simulation, we can see a lot of artefact signals due to the reproduction room. 5.3.3 Reproduction in area 3 Within area 3 no measurements have been made in the Concertgebouw, but from the pressure and the particle velocity recorded on arrays I and II, extrapolation can be carried out everywhere in the concert hall within the limitation mentioned in [Hulsebos et al., 2001], allowing the auralization in the reproduction room of the wave field in area 3 (figure 5.7). No comparison with the original signal is possible, but we observe the same characteristics as in area 2, except that the direct wave is slightly better reproduced which means that we are (more) within the allowed limit for correct reproduction. It also proves that it is possible to auralize in this area, away from the microphone arrays. 5.4 Auralization of De Doelen impulse responses We will now have a look at the results of the auralization of De Doelen (figure 5.1(b)). 
Due to the geometry of De Doelen being different from that of the Concertgebouw, different shapes and arrival times of the reflections are observed. The original impulse responses displayed in figure 5.8 look more homogeneous than the impulse responses of the Concertgebouw; there are fewer events coming from the sides.

5.4.1 Reproduction in areas 1 and 2

Here the same experiments have been conducted. The results for area 1 can be found in figures 5.8(b) and (c). Again, strong aliasing effects are present in the direct arrival and the side reflections. At the side reflections this causes steeply dipping events, similar to what we observed in figure 4.10. Also, as for the Concertgebouw, we see a high reverberation level due to the reflections of the reproduction room, which tends to hide the weaker events of the auralized signal. Simulation and measurement results for area 2 can be seen in figure 5.9. We note that we are again partly beyond the allowed area. As before, there is a lot of aliasing present in the auralized signals.

Fig. 5.7 Offset-traveltime representation of the impulse responses of the Concertgebouw. Between x = 6 and x = 9.4 m (area 3) the data have been replaced by: (a) the simulated wave field and (b) the measured wave field in the reproduction room.

Fig. 5.8 Offset-traveltime representation of the impulse responses of De Doelen at the microphone array positions: (a) original measured impulse responses, (b) simulated wave field and (c) measured wave field.

Fig. 5.9 Offset-traveltime representation of the impulse responses of De Doelen around area 2: (a) original measured impulse responses, (b) simulated wave field and (c) measured wave field.

5.5 Conclusions and discussion

We have seen that the reproduction room has a large effect on the quality of the reproduced signal. The best results would be obtained in an anechoic room, but this is not practical. Although some absorbing materials were applied to the walls, this has not eliminated the acoustics of the reproduction room. Another way to reduce the impact of the reproduction room would be to try to cancel the reflections by emitting signals in counter phase through the loudspeakers; however, this is nearly impossible because of the enormous complexity and computational requirements. The next obvious element to improve would be to reduce the aliasing. This could be achieved by using more loudspeakers, for example by driving all loudspeakers in the current setup individually.
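To indicate the order of magnitude: with equation (4.4) at normal incidence ($\sin\theta_{max} = 1$) and assuming $c \approx 343$ m/s, driving the front-array loudspeakers individually would halve the spacing from 25.2 cm to 12.6 cm and thus double the aliasing limit:

$$f_{al} = \frac{c}{2\,\Delta x}: \qquad \frac{343}{2 \times 0.252} \approx 680\ \mathrm{Hz} \quad\longrightarrow\quad \frac{343}{2 \times 0.126} \approx 1360\ \mathrm{Hz}.$$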
In the results from area 2 and 3, we observe that the area in which good auralization is possible is limited for a cross array setup, and does not cover the whole area defined by the two arrays. An alternative would be to use the contributions from both microphone arrays for the extrapolation to all four sides. Another option could be to use a circular array (see [Hulsebos et al., 2001]). Chapter 6 Perceptual comparison of original and auralized impulse responses 6.1 Introduction We have seen that the auralization system functions physically well for a complex signal and we have drawn some conclusions, but it will be useful to add to this a perceptive aspect. A listener uses perceptual (subjective) criteria like spaciousness (spatial impression), loudness, warmth and coloration to describe the acoustics of a hall. Usually the direct sound and its reflections can not be heard separately; they are integrated by the human auditory system into an overall sound impression: the acoustical perception of the hall. However, the impulse response can be divided into three parts contributing to different aspects of the acoustical perception. 1. Primary sound: the direct part of the sound field (non reflected) and the very early reflections arriving within 20 ms after the direct sound. The direct sound helps for the localization of the source. The energy of the very early reflections contributes to the reinforcement of the direct sound. 2. Early reflections: the reflections arriving at the receiver between 20 and 80 to 100 ms after the direct sound contribute to the loudness and the clarity of sound and the perceived apparent source width. 65 66 Chapter 6: Perceptual comparison of original and auralized impulse responses 3. Reverberation: reflections reaching the listener later than 100 ms after the direct sound create the reverberant field. The reverberation is related to subjective parameters like warmth, brillance, envelopment. 1 Amplitude 0.5 0 −0.5 −1 1 0 3 2 50 100 150 travel time t [ms] Fig. 6.1 Impulse response measured in the Concertgebouw at a receiver position can be divided into three parts: (1) the primary sound, (2) the early reflections and (3) the reverberation. We will define the perceptual cues which will be discussed in the rest of the chapter. Spaciousness: can be divided in two effects: the first is the Apparent Source Width (ASW) which describes how the apparent size of the sound source seems broader when music is performed in a concert hall than the visual width of the actual source. Especially early lateral reflections from side walls contribute to this image broadening. The second effect is the envelopment which describes how much the listener feels himself enveloped by the sound in a hall. Late lateral reflections (reverberant sound) contribute to this envelopment. Coloration: due to interference between direct sound and early reflections and due to frequency dependent reflection of sound by e.g. walls, some frequency components are being amplified while others are attenuated. This results in a change in the coloration of the sound. Strong coloration is usually very undesirable. 6.2 KEMAR head 6.2 67 KEMAR head The artificial head and ear of the dummy-head KEMAR are made to resemble the human ones and to simulate the distortions of the sound fields by the human head and ear. The KEMAR head allows recording of a binaural signal using two transducers placed inside the ear canal of each ear. 
Ideally the recording should be done within the ears of the listener himself, to account for the particular morphology of his auditory system. Also, for an optimal result, the signals should be played back at the same location where they were recorded, i.e. inside the ear canal. In practice, however, the playback is done using headphones at the entrance of the ear canal. The ear canals of KEMAR add coloration to the signals: if a listener listens to a recording made by KEMAR, the spectrum is distorted twice, once by KEMAR's ear canal and once by his own. The distortion by his own ear canal is natural, but the distortion caused by KEMAR's ear canal should be removed; a filter was used to compensate for it. Figure 6.2 shows an example of a binaural recording in the Concertgebouw with the KEMAR head. The reflections coming from the left wall are recorded by the left ear while hardly reaching the right ear, and vice versa. We will compare the original signal measured in the concert hall perceptually with the signal measured in the reproduction room, by looking at two perceptual criteria: coloration and spatial impression.

6.3 Spatial impression - Apparent Source Width

The notion of apparent source width (ASW) stems from the image broadening of the source signal received by a listener. It is caused by the difference between the left and the right ear signals: the ASW increases if the signals are less correlated (i.e. more different from each other), and decreases for more strongly correlated signals. A measure to quantify the ASW is the interaural cross-correlation coefficient (IACC), defined as the maximum of the absolute value of the normalized interaural cross-correlation function in the delay range $|\tau| \le 1$ ms:

$$\mathrm{IACC} = \max_{|\tau| \le 1\,\mathrm{ms}} \left|\Phi_{lr}(\tau)\right|, \qquad (6.1)$$

where

$$\Phi_{lr}(\tau) = \frac{\int_{-\infty}^{+\infty} p_l(t)\, p_r(t+\tau)\, dt}{\sqrt{\int_{-\infty}^{+\infty} |p_l(t)|^2\, dt \,\int_{-\infty}^{+\infty} |p_r(t)|^2\, dt}}. \qquad (6.2)$$

In this formula the numerator is the cross-correlation function and the denominator performs a normalisation by the total energy of the two signals.

Fig. 6.2 Binaural impulse responses measured with the KEMAR head in the Concertgebouw (left and right ear).

Two filters were applied before the calculation of the IACC: a time window and a frequency window.

Time window

Since only the early reflections contribute to the apparent size of the source, only the first 80 ms after the direct sound are taken into account for the left and right ear signals. We therefore apply a time window as defined in [De Vries et al.]:

$$w_{80\,\mathrm{ms}}(t) = \begin{cases} 1 & \text{for } 0 < t < 60\ \mathrm{ms} \\ \cos^2\!\left(\frac{\pi\,(t-60)}{80}\right) & \text{for } 60 < t < 100\ \mathrm{ms} \\ 0 & \text{for } t > 100\ \mathrm{ms} \end{cases}$$

This time window is plotted in figure 6.3(a).

Frequency window

The frequency window used is defined as follows:

$$W(f) = \begin{cases} e^{-(f/300 - 2)^2} & \text{for } f < 600\ \mathrm{Hz} \\ e^{-(f/600 - 1)^2} & \text{for } f \ge 600\ \mathrm{Hz} \end{cases}$$

It takes into account only the dominant frequency components that contribute to the ASW, according to [Raatgever, 1980].

Fig. 6.3 Time and frequency windows used for the calculation of the IACC: (a) time window, (b) frequency window.

Using the binaural signals measured in the Concertgebouw, the IACC can be plotted (figure 6.4(a)), after applying the time and frequency window filters.
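A minimal sketch of the IACC computation of equations (6.1) and (6.2) is given below; the time and frequency windows of this section are assumed to have been applied to the two ear signals beforehand, and the sampling rate fs is a parameter:

# Minimal sketch of eqs. (6.1)-(6.2): normalized interaural cross-correlation,
# maximized over lags |tau| <= 1 ms. The 80 ms time window and the frequency
# weighting described above are assumed to have been applied to pl and pr.
import numpy as np

def iacc(pl, pr, fs):
    """pl, pr: equal-length left/right ear signals; fs: sampling rate in Hz."""
    assert len(pl) == len(pr)
    max_lag = int(round(1e-3 * fs))                    # 1 ms expressed in samples
    norm = np.sqrt(np.sum(pl**2) * np.sum(pr**2))      # energy normalisation
    phi = np.correlate(pl, pr, mode="full") / norm     # phi_lr(tau) for all lags
    zero_lag = len(pl) - 1                             # index of tau = 0
    return np.max(np.abs(phi[zero_lag - max_lag:zero_lag + max_lag + 1]))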
We see (strong) fluctuations in the IACC values due to interference of wave components. Note the strong correlation between the left and right ear signals at the center position. Figure 6.4(b) shows an enlargement of part of figure 6.4(a), with added to it the IACC values for the auralized version of the Concertgebouw wave field at three 70 Chapter 6: Perceptual comparison of original and auralized impulse responses 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 −10 −5 0 offset x [m] 5 10 (a) IACC as a function of dummy head offset in the Concertgebouw. 1 IACC for the reproduction room IACC for the Concertgebouw 0.8 0.6 0.4 0.2 0 −2 0 2 4 6 8 10 offset x [m] (b) IACC as a function of dummy head offset in the Concertgebouw and for its auralized version at 3 different positions. Fig. 6.4 IACC as a function of dummy head offset in the Concertgebouw in Amsterdam and the auralized version. positions where the auralization has been performed. We observe that the measured IACC shows the same fluctuation pattern as the original one, but that the absolute amplitudes are slightly different. For example, the measured IACC at the center of the room stays below 0.6, while the original goes up to 0.9. A reason for this could be the acoustical asymmetry of the reproduction room which causes the left and right reflections to differ. Also [Verheijen, 1997] mentions a decrease in IACC as a result of interference of aliased waves with the actual wave field 6.4 Coloration 71 causing phase distortion between the signals of the two ears. 6.4 Coloration The sound produced by a source at the stage of a concert hall is reflected at the walls, where a part of the sound energy is absorbed. Due to interference between frequency-dependent absorption and direct sound and early reflections (comb effect), some frequencies of the signal will be intensified and others attenuated. The perception of the spectral distortion of the signal due to these effects, is called coloration. In free-field i.e. where there are no reflections, the sound can be perceived coloration free. Figure 6.5 shows the direct sound and a reflection at two different listener’s positions. A and B are positioned at the same distance from the source. For position A, the path difference between direct sound and reflection is smaller than the path difference at position B. The coloration at position A will therefore be different from the coloration in B. A B Fig. 6.5 The paths of the direct sound and a reflection are different for each position, leading to coloration of the sound field. Hence each position in a concert hall has a different coloration. After auralization, the frequency content of the original signal is also distorted by the recording and the reproduction (microphones, amplifiers, loudspeakers), and by reflections in the reproduction room. Hence, the auralized field will have a dif- 72 Chapter 6: Perceptual comparison of original and auralized impulse responses ferent coloration than the original field. Figure 6.6 shows the frequency spectrum of an impulse response measured in the Concertgebouw and an auralized impulse response measured in the reproduction room at the corresponding position. 14 14 12 12 10 10 8 8 6 6 4 4 2 2 0 −2 10 −1 0 (a) Spectrum of the original signal Fig. 6.6 0 10 10 frequency [kHz] −2 10 −1 0 10 10 frequency [kHz] (b) Spectrum of the auralized signal Spectra of an impulse response measured in the Concertgebouw and of an impulse response measured in the reproduction room, in semilog scale. 
Removal of coloration using Patterson filters

We now consider the auditory system, and especially the cochlea, which transforms pressure variations into neural impulses traveling through the auditory nerve to the brain. Each auditory nerve fiber responds over a certain range of frequencies. The auditory filters have been modelled in [Patterson et al., 1986] by a set of parallel band-pass filters with increasing center frequency $f_c$. The shape of these filters is described by:

$$h_{f_c}(f) = \sqrt{1 + \frac{4\,|f-f_c|}{W(f_c)}}\; e^{-\frac{2\,|f-f_c|}{W(f_c)}}, \qquad (6.3)$$

where $f$ is the frequency, $f_c$ the center frequency and $W(f_c)$ the bandwidth of the auditory filter. According to Patterson, the relation between center frequency and bandwidth is given by:

$$W(f_c) = \left(6.23\, f_c^2 + 93.39\, f_c + 28.52\right) \cdot 10^{-3} \qquad (W \text{ and } f_c \text{ in kHz}). \qquad (6.4)$$

It is sufficient to use 24 such bands to cover the audible range (the so-called critical bands), because the ear cannot distinguish between frequencies within one such band. A representation of a set of Patterson auditory filters is shown in figure 6.7.

Fig. 6.7 Normalised Patterson auditory filters in the frequency domain.

In order to remove the difference in coloration between the original and the auralized impulse responses, we have applied a set of Patterson filters to the power spectra of both the original and the auralized signals. This gives a measure of the energy of both signals in each frequency band. We then multiply each of the frequency bands of the auralized signal by the ratio of (the square root of) those energies. This procedure is repeated several times, until both signals have (nearly) the same energy content in all frequency bands. The spectra of the energy in the Patterson bands for the two signals are shown in figure 6.8(a). Subfigures (b) and (c) show the results after applying the energy equalization a few times. We see that the spectral differences are gradually removed, resulting in a decreasing coloration difference. This allows a better comparison of the spaciousness of the original and the auralized signals.

Fig. 6.8 Spectra of the energy in the Patterson bands for the original signal, the auralized signal and the auralized signal after several equalization treatments.

6.5 Experiments

Two different tests will be described in this section: one to test the effect of the minimization of the coloration and the other to compare the ASW. Subsection 6.5.1 describes the experiments; in subsection 6.5.2 the results are given and discussed.

The perceptual aspects have only been judged for the impulse responses of the Concertgebouw, not for those of De Doelen. The reason is that no measurements with the KEMAR head have been made in De Doelen, so a comparison of the auralized impulse responses with the original signal was not easily possible.

6.5.1 Description of the experiments

Experiment 1

Experiment 1 has been done in order to find out whether the added coloration of the auralized signal has been perceptually removed with respect to the original signal. Three signals were presented to the listener. The second signal was the original signal.
The first and last signals were the auralized signal and the auralized signal with the coloration minimized, or the other way around. The listener was asked which of the first and last signals resembled the second most closely. Each signal has a duration of 5 seconds. Experiment 2 This experiment has been done in order to find out whether any difference in apparent source width could be heard between the original signal and the auralized signal with the coloration minimized. These two signals were presented three 6.5 Experiments 75 times with a short pause in between. Each signal had a duration of 3 seconds. The listener was asked which of the two signals sounded broader. The listener was not obliged to make a choice if he did not hear any difference. All experiments were done using both the first 160 ms and the first second of the impulse responses of the Concertgebouw. Further, the experiments were performed twice: once using the impulse responses convolved with white noise, the other time after convolution with a piece of music (a cello recorded in an anechoic room). The experiments were repeated for three different positions in the hall, P1 in the center of the room, P2 approximately half way between the center and the right wall, and P3 near the right wall, all along array I. This makes for a total of 12 different signals for the tests. In order to correct for the ear canal of the KEMAR head recordings, a real time hardware filter was used. The experiments lasted around 20 minutes. Six persons who were not experienced listeners, have done the test. The subjects could repeat the signals as often as they wanted. 6.5.2 Results Observations from the coloration experiment In this test, the subjects were asked to compare the 2 auralized versions with the original one for each of the 12 signals. Each time the signal with the coloration minimized was assessed to resemble the original closest, with two exceptions. Two subjects have ones chosen the untreated auralized signal as closest to the original one, in both cases for the 160 ms impulse response at position P2, for one person when convolved with white noise, for the other when convolved with music. However in general, we can conclude that the subjects can clearly hear the difference and that the impulse response with coloration minimized resembles the original most. Observations from the Apparent Source Width experiment In the second experiment, for each of the four cases (music and noise, on 160 ms and 1s of the impulse response) the subjects were presented twice with the signals 76 Chapter 6: Perceptual comparison of original and auralized impulse responses for the 3 positions, in random order. The order of the auralized and the original signal was not necessarily the same in both cases. Each time the subject had to choose the broader of the two signals. The fact that each signal was judged twice was used to verify whether the choices were made consistently. In the following tables the results of the ASW experiment are summarized. Table 6.1 gives the results for the signals created with the first 160 ms, and table 6.2 for those created with 1 second. The meaning of the symbols is: A: the subject has chosen both times the auralized signal as the broader, O: the subject has chosen both times the original signal as the broader, ?: the subject has chosen one time the auralized and the other the original signal, -: when no choice was made at least one of the two times the signal was presented. 
subject 1 2 3 4 5 6 Table 6.1 P2 noise music A A ? O ? O A O ? ? O ? P3 noise music O A A A O A A ? O A Results of the ASW test on white noise. subject 1 2 3 4 5 6 Table 6.2 P1 noise music A A A ? A O A A A A A A P1 noise music A ? A A A O A A A ? A A P2 noise music A A A A ? A A ? O P3 noise music ? A ? O O ? A ? A A Results of the ASW test on a music signal. The first thing we observe is that there is a lot of variation in choice, although there seems to be a slight tendency towards the auralized signal. 6.5 Experiments 77 In both tables, the choice for a signal convolved with noise and for the same signal convolved with music is often not the same, this indicates that the perception of ASW depends on the type of sound that is heard. Listeners did not find it easier to judge the ASW on a music signal than on a noise signal, since in both cases there are about as many inconsistences (’?’) and blanks (’-’). Further we observe that within each column there is quite a lot of variation, so different persons make different choices for the same signal. Finally, the choice made with a signal of 160 ms and that of 1 s is often not the same. So the addition of a reverberant part may influence the perception of the signal. This may be even more the case since their origins are different: the reverberant part of the auralized signal was artificially created using plane waves. An exception to these observations is position P1 for which the subjects have mostly chosen the auralized signal as the broader one, regardless of the type of signal used. 6.5.3 Conclusion The aim of these experiments was not to do a thorough perceptual analysis but rather to get a global impression. Therefore no hard conclusions can be given but some sort of tends can be indicated. For more reliable statistics subjects should listen more than twice each signal. We can infer from the results of experiment 1 that the decoloration helps getting closer to the original signal, this means that if we are able to identify the sources of the coloration and compensate for them, we can hope to make a better reproduction. It appears from experiment 2 that for position P1 the difference in ASW is significant: the auralized signal is generally perceived broader than the original, while at the other positions the subjects often made inconsistent choices, or no choice at all. This indicates that there the differences are small. We have seen in section 6.3 that aliasing can explain the broader auralized signals. Chapter 7 Conclusions and recommendations 7.1 Conclusions This new concept of auralization allows the reconstruction of a close copy of the original. The listener is not constraint to listen to music with headphones, but can move within the entire reproduction area. 
We have seen that:

- we can record the acoustical properties of the wave field we want to reproduce by using a microphone array;
- to get a good separation between waves coming from different directions, we need a steering cardioid, requiring pressure and particle velocity measurements;
- for a full 2D coverage of the wave field, a cross array can be used;
- the separated signals are extrapolated to the loudspeaker array positions using the 2D Rayleigh integrals, where they are used to drive the loudspeakers;
- the reproduced wave field resembles the original, except for the artefacts due to aliasing and the effects of the reproduction room;
- we cannot extrapolate the signals to the whole area covered by the measurements;
- there is a noticeable difference in coloration between the auralized and the original signals;
- at P1, the ASW seems to be significantly broader for the auralized signal, most likely due to aliasing; a conclusion cannot be drawn for the two other positions.

7.2 Recommendations

- To improve the auralization system, aliasing should be reduced, by increasing the number of loudspeakers in use.
- To reduce the effect of the reproduction room, absorbing materials should be applied on the walls and other obstacles. Another way would be to try to cancel the reflections by emitting signals in counter phase through the loudspeakers.

In recent research in the Laboratory, a number of approaches have been developed that could improve the quality of the auralization. In [Hulsebos et al., 2001] the following is pointed out:

- we are using a 2D formula for the extrapolation, which supposes that the wave amplitude decreases as $1/\sqrt{r}$, while in reality the waves emitted by the loudspeakers decrease as $1/r$ because they are emitted in 3D; a distance dependent amplitude correction should be applied;
- the area in which the extrapolation is possible can be increased by using a weighted sum of the contributions of both microphone arrays, using the Kirchhoff-Helmholtz integrals to create a directivity pattern;
- finally, using a circular array instead of a cross array could reduce the artefacts.

Bibliography

Baan, J. (1997). Array technology for acoustic wavefield analysis in enclosed spaces, Thesis, Delft University of Technology.

Berkhout, A.J. (1987). Applied seismic wave theory, Elsevier, Amsterdam.

Hulsebos, E.M. (1999). Fluctuations in measures for spaciousness, Thesis, Delft University of Technology.

Hulsebos, E.M., de Vries, D. and Bourdillat, E. (2001). Improved microphone array configurations for auralization of sound fields by Wave Field Synthesis, Preprint of the 110th AES Convention, Amsterdam.

Patterson, R.D. and Moore, B.C.J. (1986). Auditory filters and excitation patterns as representations of frequency resolution, Academic Press, London.

Raatgever, J. (1980). On the binaural processing of stimuli with different interaural phase relations, Thesis, Delft University of Technology.

Sonke, J.J. (2000). Variable acoustics by wave field synthesis, Ph.D. Thesis, Delft University of Technology.

Start, E.W. (1997). Direct sound enhancement by wave field synthesis, Ph.D. Thesis, Delft University of Technology.

Verheijen, E.N.G. (1997). Sound reproduction by wave field synthesis, Ph.D. Thesis, Delft University of Technology.

de Vries, D., Berkhout, A.J. and Sonke, J.J. (May 1996). Array technology for measurement and analysis of sound fields in enclosures, Preprint of the 100th AES Convention, Copenhagen.

de Vries, D. and Baan, J. (May 1996).
Auralization of sound fields by wave field synthesis, Preprint of the 106th AES Convention, Munich.

Vogel, P. (1993). Application of wave field synthesis in room acoustics, Ph.D. Thesis, Delft University of Technology.