Auralization of sound fields in auditoria using Wave Field Synthesis

advertisement
Auralization of sound fields in auditoria
using Wave Field Synthesis
Emmanuelle Bourdillat
M.Sc. Thesis
Supervisors: dr. ir. D. de Vries
drs. E. M. Hulsebos
Laboratory of Acoustical Imaging and
Sound Control
Department of Applied Physics
Faculty of Applied Sciences
Delft University of Technology
Delft, July 2001
ii
Graduation Committee:
Prof. dr. ir. A. Gisolf
Laboratory of Acoustic Imaging and Sound Control
Department of Applied Physics
Delft University of Technology
Dr. ir. D. de Vries
Laboratory of Acoustic Imaging and Sound Control
Department of Applied Physics
Delft University of Technology
Drs. E. M. Hulsebos
Laboratory of Acoustic Imaging and Sound Control
Department of Applied Physics
Delft University of Technology
Ir. J. Baan
TNO TPD
Afdeling Beeldbewerking
Delft
Prof. dr. ir. Bilsen
Laboratory of Perceptual Acoustics
Department of Applied Physics
Delft University of Technology
Ir. R. A. Metkemeijer
Adviesbureau Peutz & Associés bv
Zoetermeer
iii
iv
Abstract
In this thesis a new method of sound reproduction in an enclosure is described,
based on the theory of wave field synthesis (WFS): auralization. WFS is a technique based on Huygens’s principle of wave field propagation and was introduced
for application in acoustics in [Berkhout, 1987]. The synthesis in a reproduction
room, of a wave field recorded elsewhere with preservation of its acoustical properties is called auralization.
By measuring the impulse responses of a hall with closely spaced microphone arrays, the spatial and temporal structure of the sound field in that hall can be determined. Pressure and velocity microphones reveal directivity information allowing
the simulation of a detector with a cardioid directivity characteristic to discriminate between signals coming from different directions. These microphone signals
can then be extrapolated to loudspeaker arrays at all sides. Thus, when the loudspeakers are driven with these extrapolated signals, the acoustics of (a part of) the
original hall will spatially and temporally correctly be recreated within the reproduction area.
First of all, the auralization system has been evaluated on a simple type of wave
field: one plane wave. Next, auralization on a complex concert hall wave field
was performed. The original impulse reponses to be auralized were measured in
two different concert halls: the Concertgebouw and De Doelen.
The auralized wave field was physically compared to the original one. The main
features are well reproduced however some artefacts due to the limitations of the
system were found to be present.
From perceptual experiments it can be concluded that there are some differences
which are perceived by the human auditory system that should ideally be eliminated.
v
vi
Samenvatting
In dit verslag wordt een nieuwe methode beschreven om geluid in een besloten
ruimte te reproduceren, gebaseerd op de theorie van de golfveldsynthese (Wave
Field Synthesis): auralisatie. WFS is een techniek gebaseerd op het golf voortplantings principe van Huygens en werd geı̈ntroduceerd voor toepassing in de
akoestiek in [Berkhout, 1987]. Met auralisatie bedoelen we de synthese in een
reproductiekamer van een golfveld dat ergens anders is opgenomen, met behoud
van zijn akoestische eigenschappen.
Door de impulsresponsies van een zaal te meten met een array van dicht
aaneengesloten microfoons kan de ruimtelijke en temporele structuur van het
geluidsveld in die zaal worden bepaald. Druk- en (deeltjes) snelheidmicrofoons
geven richtingsinformatie waarmee het mogelijk wordt om detectoren met een
cardioı̈de richtingsgevoeligheid te simuleren om signalen die uit verschillende
richtingen komen te scheiden.
Deze metingen kunnen dan worden geëxtrapoleerd naar luidsprekerarrays aan alle
zijden van de reproductiekamer. Als vervolgens de luidsprekers worden aangestuurd met deze geëxtrapoleerde signalen wordt de akoestiek van (een deel van)
de originele zaal ruimtelijk en temporeel correct nagebootst binnen de reproductieruimte.
Eerst is het auralisatie systeem geëvalueerd met één eenvoudig soort golfveld:
een vlakke golf. Vervolgens is de auralisatie voor een complex golfveld uit een
concertzaal uitgevoerd. De originele impulsresponsies voor de auralisatie zijn
gemeten in twee verschillende concertzalen: het Concertgebouw en De Doelen.
Het geauraliseerde golfveld is objectief vergeleken met het originele. De belangrijkste elementen worden goed gereproduceerd, maar een aantal artefacten,
vii
veroorzaakt door de beperkingen van het systeem, zijn aanwezig. Uit perceptieve
experimenten kan worden geconcludeerd dat er verschillen zijn die worden opgemerkt door het menselijke gehoorsysteem en die idealiter zouden moeten worden
verwijderd.
viii
Contents
Abstract
v
Samenvatting
vii
1 Introduction
1.1 Historical reproduction methods . . . . . . . . . .
Stereophony . . . . . . . . . . . . . . . .
Binaural recording and reproduction . . . .
Reproduction system with array technology
1.2 The acoustics of a hall . . . . . . . . . . . . . . .
1.3 Outline of this thesis . . . . . . . . . . . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
2 Wave Field Synthesis Theory
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . .
2.2 The acoustic wave equation . . . . . . . . . . . . . . . .
2.2.1 Plane wave solution . . . . . . . . . . . . . . .
2.2.2 Spherical wave solution . . . . . . . . . . . . .
2.3 The Kirchhoff-Helmholtz integral and Rayleigh integrals
2.3.1 The Kirchhoff-Helmholtz integral . . . . . . . .
2.3.2 The 3-D Rayleigh integrals - planar arrays . . . .
2.3.3 The 2-D Rayleigh integrals - linear arrays . . . .
2.3.4 The 2 12 -D Rayleigh I integral - linear arrays . . .
2.4 Discretisation . . . . . . . . . . . . . . . . . . . . . . .
2.5 Finite arrays . . . . . . . . . . . . . . . . . . . . . . . .
3 Auralization of wave fields
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
1
1
1
1
2
3
4
.
.
.
.
.
.
.
.
.
.
.
5
5
5
6
7
7
8
10
11
12
14
14
17
ix
3.1
3.2
.
.
.
.
.
.
.
.
.
.
.
.
.
.
17
17
17
19
19
24
25
25
27
28
29
29
30
31
Auralization of plane waves
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.2 Properties of plane waves . . . . . . . . . . . . . . . . . . . . . .
4.3 Experimental setup . . . . . . . . . . . . . . . . . . . . . . . . .
4.4 Artefacts due to discretisation . . . . . . . . . . . . . . . . . . .
4.5 Artefacts due to finiteness of array . . . . . . . . . . . . . . . . .
4.6 Simulation of plane waves . . . . . . . . . . . . . . . . . . . . .
4.6.1 Reproduction of non-steering synthesized cardioid microphone arrays . . . . . . . . . . . . . . . . . . . . . . . .
4.6.2 Reproduction of synthesized steering cardioid arrays . . .
4.6.3 Reproduction of a plane wave coming with an elevation
angle different from zero . . . . . . . . . . . . . . . . . .
4.7 Auralization of plane waves in a non-anechoic room . . . . . . . .
4.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
35
35
35
36
38
40
40
3.3
3.4
4
5
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . .
Recording . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Linear array . . . . . . . . . . . . . . . . . . . . . . .
Linear array of omnidirectional pressure microphones
Linear array of pressure and velocity microphones . .
Cross array configuration . . . . . . . . . . . . . . . .
Experimental setup . . . . . . . . . . . . . . . . . . .
Extrapolation . . . . . . . . . . . . . . . . . . . . . . . . . .
3.3.1 Forward extrapolator . . . . . . . . . . . . . . . . . .
3.3.2 Inverse extrapolator . . . . . . . . . . . . . . . . . . .
3.3.3 Extrapolator for 2-D . . . . . . . . . . . . . . . . . .
Forward extrapolation . . . . . . . . . . . . . . . . .
Inverse extrapolation . . . . . . . . . . . . . . . . . .
Reproduction . . . . . . . . . . . . . . . . . . . . . . . . . .
Physical comparison
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . .
5.2 Influence of the reproduction room on the reconstruction of
monopole source signal . . . . . . . . . . . . . . . . . . . . .
5.3 Auralization of the Concertgebouw impulse responses . . . . .
5.3.1 Reproduction in area 1 . . . . . . . . . . . . . . . . .
x
.
.
.
.
.
.
.
.
.
.
.
.
.
.
42
44
45
47
49
51
. . 51
a
. . 53
. . 54
. . 56
5.4
5.5
Simulation . . . . . . . . . . . . . .
Measurements . . . . . . . . . . . .
5.3.2 Reproduction in area 2 . . . . . . . .
5.3.3 Reproduction in area 3 . . . . . . . .
Auralization of De Doelen impulse responses
5.4.1 Reproduction in areas 1 and 2 . . . .
Conclusions and discussion . . . . . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. 56
. 56
. 57
. 60
. 60
. 60
. 64
6 Perceptual comparison of original and auralized impulse responses
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.2 KEMAR head . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.3 Spatial impression - Apparent Source Width . . . . . . . . . . . .
Time window . . . . . . . . . . . . . . . . . . . . . . . .
Frequency window . . . . . . . . . . . . . . . . . . . . .
6.4 Coloration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Removal of coloration using Patterson filters . . . . . . .
6.5 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.5.1 Description of the experiments . . . . . . . . . . . . . . .
Experiment 1 . . . . . . . . . . . . . . . . . . . . . . . .
Experiment 2 . . . . . . . . . . . . . . . . . . . . . . . .
6.5.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . .
Observations from the coloration experiment . . . . . . .
Observations from the Apparent Source Width experiment
6.5.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . .
65
65
67
67
68
69
71
72
73
74
74
74
75
75
75
77
7 Conclusions and recommendations
7.1 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7.2 Recommendations . . . . . . . . . . . . . . . . . . . . . . . . . .
79
79
80
Bibliography
81
xi
xii
Chapter 1
Introduction
1.1
Historical reproduction methods
A few techniques will be investigated which allow to reproduce the same sound
image as an original sound field, i.e., to obtain an exact copy of the original sound
field. Advantages and disadvantages will be mentioned.
Stereophony
Since the 1930’s, sound has been reproduced in mono causing the listener to localise all sound at the loudspeaker position. In stereophony (a technique which
started in the 1950s) a sound field is recorded with a pair of microphones placed
in front of the source area. This method is mainly used for classical music recording and almost no manipulation is done to preserve the natural sound image. The
sound field is then reproduced by two loudspeakers placed in front of the listener.
A listener seated midway in front of these loudspeakers perceives a sound image similar to that of the original sources. A listener seated off-center receives a
distorted stereo image that inclines to the nearest loudspeaker, which is a severe
disadvantage of the stereophonic technique.
Binaural recording and reproduction
An other technique is binaural reproduction. The sound field is recorded with two
microphones placed at the entrance of the ear canals of an artificial head. The
1
2
Chapter 1: Introduction
reproduction of the recorded sound is done using a headphone. The listener perceives an exact copy of the original sound field during the recording session. Disadvantages of this technique are that the listener will still hear the same signals at
the same position in the room when he moves his head or walkes around. Besides,
sound reproduction through headphones often leads to ’in-head localization’ such
that a good evaluation of spatial cues becomes impossible.
Since the 1970s, efforts have been made to develop a multi-channel reproduction
system to generate a sound field that envelops the listener as in natural sound
fields and that resembles the original as closely as possible: the quadraphonic or
the surrond sound configurations are examples of this. But the generated sound
fields are only correct within a small area. Outside this area, spatial and temporal
distortion occur [Verheijen, 1997].
Reproduction system with array technology
A new method of sound reproduction has been developed at the laboratory of
Acoustical Imaging and Sound control. When loudspeaker arrays are driven with
the correct functions, wave fronts can be synthesized with predefined temporal
and spatial properties. Hence, the generated sound fields are correct within the
entire space. The concept of WFS is depicted in figure 1.1.
W
y = yM
Fig. 1.1
y = yL
Concept of WFS.
The wave field, emitted by a primary source, is recorded by a microphone array lo-
1.2 The acoustics of a hall
3
=
cated at the plane y yM . The recorded signals are extrapolated to a loudspeaker
array at plane y yL with the operator W. The extrapolated signals are fed to the
loudspeakers yielding a synthesized wave field which is an image of the original
sound field. A prototype of such a reproduction system was built, consisting of
160 loudspeakers disposed in a rectangular shape.
=
1.2
The acoustics of a hall
Consider a monopole source placed on the stage of a concert hall, emitting a
spherical wave propagating in the enclosed space. Most of the sound energy will
reflect at boundaries, such as the front-, back- and side-walls, the ceiling, the floor,
seats and other obstacles, giving rise to a complex wave field. Part of the energy
is absorbed by the walls. After successive reflections the wave field vanishes by
absorption. The temporal and spatial properties of the direct wave front will be
modified after each reflection, thus determining the acoustics of the concert hall.
Hence, each concert hall with its specific (enclosing) geometry and specific reflecting properties at the boundaries will have its own specific acoustics. The
sound pressure of a wave field due to a source on the stage emitting a spherical acoustical pulse can be recorded at a receiver position, yielding the transfer
function between source and receiver. We can write
p(t) =
()
()
Z
+1
1
h( )s(t )d
(1.1)
()
where p t represents the pressure at the receiver position, s t is the source signal and h t is the impulse response or acoustical response. For a causal signal
the boundary of the integral 1 is replaced by . The impulse response h t
which is the response for a source signal s t
Æ t describes all properties of
the transmission path between source and receiver. In the frequency domain the
convolution reduces to a multiplication:
0
()= ()
P (! ) = S (! )H (! )
( )
with H ! the transfer function for the source-receiver pair and S
of the source.
()
(1.2)
(!) the spectrum
A single impulse response measured in a concert hall shows only the temporal
properties of the specific transmission path between the source and one receiver.
4
Chapter 1: Introduction
Measuring the impulse response along an array of microphones with equidistant
spacing between the microphones gives a multi-channel recording with a temporal
and spatial distribution, thus characterising the acoustics of the hall. This is sufficient information to allow a reproduction of the acoustical properties in another
room.
1.3
Outline of this thesis
The technique of the auralization of a wave field consists of measurement of the
impulse responses of a hall on a microphone array, analysis of these measurements, followed by extrapolation and emission by loudspeaker arrays in a reproduction room. The aim of this thesis is to explain this process in more detail
and to give a comparison between auralized and original signals in a physical and
perceptual evaluation.
In the next chapter, the Rayleigh integrals which describe the theory of WFS will
be given and the simplifications necessary to arrive at a realizable system are discussed.
In chapter 3 the three different steps necessary to auralize wave fields will be
investigated: recording, processing and reproduction.
Chapter 4 will evaluate the quality of the reproduction system for a simple wave
field: one plane wave. When auralization takes place in a reproduction room with
its own acoustical properties extra reflections will be added to the synthesized
original wave field. The influence of this reproduction room will be looked at.
Chapter 5 gives a physical comparison between an original, a simulated (in a
virtual anechoical reproduction room) and a measured auralized signal.
To complement the physical description, a subjective appreciation was made in
chapter 6.
Finally, in chapter 7 some conclusions are drawn and recommendations are made
for further research.
Chapter 2
Wave Field Synthesis Theory
2.1
Introduction
In this chapter a summary of the wave field synthesis (WFS) theory is presented.
Starting from two fundamental equations to which the names of Newton and
Hooke are closely related, the acoustic wave equation will be derived. Its solutions for a plane wave and a monopole source are treated in subsections 2.2.1
and 2.2.2. In section 2.3 the Kirchhoff and Rayleigh integrals are derived, being
the fundamentals of WFS. These integrals describe the reconstruction of a (real
or virtual) primary wave field inside a surface from a continuous distribution of
secondary sources on that closed surface. In the last sections the effects of finite
and discrete arrays will be investigated. For a more elaborate treatment the reader
is referred to [Berkhout, 1987].
2.2
The acoustic wave equation
Consider a homogeneous isotropic fluid with zero viscosity. The derivation of the
wave equation for a compressional wave is obtained from two basic equations.
The first equation follows Newton’s second law of motion. It states the relationship between pressure variations in space and changes in particle velocity over
time:
rp r; t 0 @ v r; t ;
(2.1)
( )=
with scalar
( )
@t
p and vector v the acoustic pressure and the particle velocity respec5
6
Chapter 2: Wave Field Synthesis Theory
=(
)
tively as a function of the position r
x; y; z ; 0 represents the mass density. In
the frequency domain, this equation reads
rP (r; !) = j!0V(r; !);
(2.2)
with ! the radial frequency.
The second equation follows Hooke’s law for compressional fluids. It gives a
relation between particle velocity variations in space and pressure changes in time:
r:v(r; t) = K1 @p(@tr; t) ;
(2.3)
where K represents the compression modulus. In the frequency domain this equation reads
r:V(r; !) = j!
P (r; ! ):
K
(2.4)
From these two basic equations the wave equation is derived:
=
p
2
r2p(r; t) c12 @ p@t(r2; t) = 0;
(2.5)
where c
K=0 represents the propagation velocity of the wave (the density
0 is supposed to be homogeneous). This wave equation describes the distribution
of the pressure field in space and time. Applying the temporal Fourier transform,
we get the so-called Helmholtz equation
r2P (r; !) + k2 P (r; !) = 0;
(2.6)
where k is the wave number and equals !=c. The wave field at all positions can
be calculated by solving this wave equation. In the two next subsections we will
look at two fundamendal solutions of the wave equation: the plane wave and the
spherical wave.
2.2.1 Plane wave solution
The pressure field of a plane wave with propagation direction n is described by
p(r; t) = s t
n:r c
;
(2.7)
2.3 The Kirchhoff-Helmholtz integral and Rayleigh integrals
7
(t) is the source function. In the frequency domain we get
P (r; ! ) = S (! )e jkn:r
where s
(2.8)
This equation satisfies the Helmholtz equation 2.6. The particle velocity of this
plane wave can be derived using Newton’s equation 2.2
( ) = 1c P (r; !)n
V r; !
(2.9)
0
V is proportinal to P with a factor 0 c without phase difference.
2.2.2 Spherical wave solution
The pressure field of a spherical wave at position r generated by a monopole
source at the origin is given in the time domain by
p(r; t) =
s t
jrj
jrj
c
where jrj is the distance from the source to the receiver position and
source function. In the frequency domain this equation reads:
P (r; ! ) = S (! )
e
(2.10)
s(t) is the
jkjrj
jrj :
(2.11)
This equation shows that the sound pressure is inversely proportional with the distance jrj and that the sound pressure is delayed by jrj=c, due to the finite propagation velocity c of sound waves. The particle velocity (using Newton’s equation)
can be written by
r
1
1
+
jkjrj
P (r; ! ) :
(2.12)
V(r; ! ) =
0 c
jkjrj
jrj
Note that in the far field (k jrj 1) the spherical wave can locally be considered
as a plane wave.
2.3
The Kirchhoff-Helmholtz integral and Rayleigh integrals
In this section the theory of WFS is summarized.
8
Chapter 2: Wave Field Synthesis Theory
2.3.1 The Kirchhoff-Helmholtz integral
According to the Huygens principle each element of a wave front coming from
a point source can be seen as a secondary source emitting a spherical wave. The
contribution of all these secondary sources form a new wave front (see figure 2.1).
t
t + t
1
0
0
1
1
0
0
1
1
0
0
1
Primary source
1
0
0
1
secondary sources
Fig. 2.1
Representation of the Huygens principle.
The quantification of Huygens’s principle is given by the Kirchhoff-Helmholtz integral: the wave field in a source-free volume V due to sources outside V can be
described by a distribution of sources along the surface S of the volume. The pressure at a point A in volume V is mathematically described by (see e.g. [Berkhout,
1987]),
I 1
@ e jkr
P (rA ; ! ) =
4 S P (r; !) @n r
@P (r; ! ) e jkr
@n
r dS
(2.13)
where
r = jrj, r being the vector from the secondary source to the point of interest
A,
n
= jnj, n being the normal vector to the surface S pointing inward the recon-
struction volume V,
2.3 The Kirchhoff-Helmholtz integral and Rayleigh integrals
9
k is the wave number and
P (r; ! ) is the Fourier transformed pressure distribution on S due to primary
sources outside S , as shown in figure 2.2.
From this equation follows that any pressure wave field within a source-free volume V resulting from a source distribution outside V may be synthesized by
means of monopole and dipole distributions on a closed surface S (figure 2.2).
The strength of each monopole (second term in equation 2.13), is determined by
the gradient of the pressure on S which is proportional to the normal component
of the particle velocity. The strength of each dipole (first term in equation 2.13),
is given by the pressure P on S .
Note that in absence of primary sources the pressure distribution inside V can be
exactly synthesized using correct driving signals for the secondary sources.
In the following subsection we will find that in some cases the Kirchhoff-Helmholtz
integral can be simplified using only monopole or dipole secondary sources, leading to the Rayleigh integrals.
111
000
000
111
000
111
000
111
V
r
A
n
S
Fig. 2.2
The pressure field at point A inside volume V caused by primary sources outside
V can be synthesized from the wave fields of monopole and dipole distributions
on S.
10
Chapter 2: Wave Field Synthesis Theory
2.3.2 The 3-D Rayleigh integrals - planar arrays
S1
R
n
r
A
00
11
S0
Fig. 2.3
z
x
y
Configuration for the Rayleigh integrals using a planar distribution of secondary sources.
( )
Consider the situation where a wave field P r; ! is generated by a primary source
distribution in the infinite half space y > . Volume V is determined by the plane
y
S0 and a hemisphere of radius R in the half space y < S1 . The
situation is depicted in figure 2.3. If we want to calculate the pressure in a point A
in the upper half space for a finite time interval (0 t Tmax , R can always be
chosen such that the integral contribution from the surface S 1 has not yet reached
point A in this time interval, so that this surface does not contribute to the total
integral. According to [Berkhout, 1987], we can write the following equations for
PA in space-frequency domain:
0
= 0( )
0( )
)
Z
1
e jkr
P (rA ; ! ) = j!0 Vn (r; ! )
2
r dS
S
(2.14)
0
and
jk
P (rA ; ! ) =
2
where cos = jyA j=r .
Z
S0
1
+
jkr
e jkr
P (r; ! )
cos
jkr
r dS
(2.15)
Equations 2.14 and 2.15 are called the first and second Rayleigh integral, respectively. The Rayleigh integrals show us that only contributions of monopoles or
dipoles located on a plane need to be used for wave field synthesis.
The first Rayleigh integral states that the pressure field PA at position A can be
synthesized from the wave field of a monopole distribution on the plane surface
2.3 The Kirchhoff-Helmholtz integral and Rayleigh integrals
11
S0 . The strength of each monopole is given by the normal component of the particle velocity of the wave field measured at the position of this particular monopole
.
in the plane y
=0
The second Rayleigh integral states that the pressure PA can be synthesized from
the wave field of a dipole distribution on S 0 , the strength of each dipole being
given by the pressure of the incident wave field at the specific dipole location.
Equations (2.14) and (2.15) describe forward wave field extrapolation. Using
these equations, the wave field at any position A can be synthesized if the wave
field at the plane y
is known.
=0
This result is valid for an infinite continuous planar distribution of monopoles or
dipoles. In practice two simplifications must be made :
1. the continuous distribution will be replaced by a discrete distribution of
sources. This means that above a certain frequency, aliasing will occur,
depending on spacing of this discrete distribution.
2. the infinite distribution of sources is replaced by a finite distribution of
sources. This means that at the edges diffraction effects occur.
Examples of these undesirable effects will be discussed in chapter 4.
In the next section we will see that it is possible to synthesize a wave field with
a linear array instead of a planar array. However the wave field is not controlled
throughout the entire volume anymore, but only in a horizontal plane through
the array. The shape of the wave front is modified in the vertical direction and
becomes circular. However, in the horizontal plane the shape of the wave front is
unchanged.
2.3.3 The 2-D Rayleigh integrals - linear arrays
The 2D integrals can be obtained by integration of equations 2.14 and 2.15 over
the z-axis. The primary source is replaced by a primary line source such that the
integrand Vn and P are independent of z, yielding the 2D Rayleigh I integral:
P (rA ; ! ) =
j!0
2
Z
+1
1
Vn(x; ! )H0(2) (kr)dx;
(2.16)
12
Chapter 2: Wave Field Synthesis Theory
and the 2D Rayleigh II integral:
jk
P (rA ; ! ) =
with
Z
2
+1
P (x; ! ) cos H1(2) (kr)dx;
1
q
r = (x
xA )2 + yA2 :
(2.17)
(2.18)
H0(2) and H1(2) are the zeroth-order and first order Hankel functions of the second
kind. These functions can be approximated by an exponential function for k r 1 (far-field approximation), such that:
r
jk
P (rA ; ! ) = 0 c
2
Z
+1
1
Vn (x; ! )
jkr
e
p
r
dx;
(2.19)
for the 2D Rayleigh I integral, and similarly the Rayleigh II integral in the far-field
approximation is given by:
P (rA ; ! ) =
r
jk
2
Z
+1
1
P (x; ! ) cos e
jkr
p
r
dx:
(2.20)
The infinite secondary source plane S0 is now replaced by an infinite secondary
source line.
2.3.4 The 2 12 -D Rayleigh I integral - linear arrays
Consider again the 3D Rayleigh configuration of figure 2.3. The primary source
and the receiver are positioned in the horizontal (x; y )-plane. The wave field of
the primary source is synthesized by a secondary monopole source distribution on
the plane S0 according to the 3-D Rayleigh I integral.
On the plane S0 the contribution of the secondary sources on each vertical line
can be approximated by the contribution of just one source on that line. This
point source is taken in the horizontal plane of the primary source and the receiver
on y
yL (figure 2.4). Each point source on yL should then be driven with an
adapted source signal. This approximation is called the stationary phase approximation and can be found in [Start, 1997] and [Verheijen, 1997]. The surface S 0
is now transformed to a line y
yL since the contribution of secondary sources
on column L are replaced by one single secondary monopole source. The 12 D
=
=
2
2.3 The Kirchhoff-Helmholtz integral and Rayleigh integrals
13
S0
x
z
y
11
00
00
r 11
11
00
S
00
11
00
11
r
11
00
00A
11
00
11
n
r
1
0
0
1
0
1
r0
0
rL
yL
yS
yR
L
Fig. 2.4
Representation for the Rayleigh I integral. The secondary point source at rL
gives the largest contribution to the wave field in point A of all points on L.
Rayleigh I integral is given by :
P (rA ; ! ) = S (! )
r
jk
2
(
Z
+1 r
1
)
r0 cos epjkr e jkr dx:
r0 + r0
r0 r0
0
0
(2.21)
The driving function Qm xL ; ! of the secondary monopole point source is :
r
Qm (xL ; ! ) = S (! )
p
jk
2
( + )
Z
+1 r
1
r0 cos epjkr ;
r0 + r0
r0
0
(2.22)
where the factor
r0 = r0
r0 is a function of x and must be determined for
each secondary source-receiver combination. It has been shown in [Start, 1997]
that this factor can be replaced by:
r
r0 =
r0 + r0
r
yR
yR
yL
;
yS
(2.23)
14
Chapter 2: Wave Field Synthesis Theory
under the stationary phase assumption. Thus the driving function will be independant of the secondary source-to-receiver distance for receivers on the line y yR .
But the amplitude of the synthesized wave field will not be correct at other receiver
lines. By choosing an appropriate distance for the reference line, the amplitude
errors can be kept small in a large listening area [Sonke, 2000].
=
2.4
Discretisation
For practical situations, continuous secondary source distributions will be replaced by discrete arrays of loudspeakers. When a signal is sampled in the spatial
domain with sampling distance x, the spectrum in the k x -domain is convolved
with a pulse-train with period = x. x should be chosen such that overlap
does not occur.
For time domain sampling, the maximum angular frequency that can be reproduced without aliasing for a given sampling interval t, referred to as the Nyquist
frequency !Nyq , is given by
:
!Nyq
(2.24)
2 = t
An analogy can be made for sampling in the space domain: the maximum spatial
frequency that can be distinguished is the spatial Nyquist frequency given by
(2.25)
x :
The maximum value of kx for a given k is k sin max where max is the maxikx;Nyq =
mum angle with respect to the x-axis of the plane wave components present in the
signal. The temporal frequency where spatial aliasing starts to occur is given by
fal =
c
2x sin max :
(2.26)
Practical examples will be given in the next chapters.
2.5
Finite arrays
When using finite arrays instead of infinite arrays diffraction effects in the synthesized wave field will occur. The approximate reconstruction area of a finite
array can be found by drawing lines from the primary source towards the edges
of the truncated array to the receiver line. We see that the receiver line can not
2.5 Finite arrays
15
receiver line
Fig. 2.5
S
0
1
0
111111
00000
000000000000000
111111111111111
0
000001
11111
000000000000000
111111111111111
00000
11111
000000000000000
111111111111111
00000
11111
000000000000000
00000 111111111111111
11111
000000000000000
111111111111111
00000
11111
000000000000000
111111111111111
00000
11111
000000000000000
111111111111111
00000
11111
000000000000000
111111111111111
00000
11111
000000000000000
00000 111111111111111
11111
000000000000000
111111111111111
00000
11111
000000000000000
111111111111111
A
B secondary sources array
00000
11111
000000000000000
111111111111111
00000
11111
000000000000000
111111111111111
00000
11111
000000000000000
111111111111111
00000
11111
000000000000000
111111111111111
00000 111111111111111
11111
000000000000000
111111111111111
00000
11111
000000000000000
00000
11111
000000000000000
111111111111111
00000
11111
000000000000000
111111111111111
00000
11111
000000000000000
00000 111111111111111
11111
000000000000000
111111111111111
00000
11111
000000000000000
111111111111111
00000
11111
000000000000000
111111111111111
00000
11111
000000000000000
111111111111111
00000
11111
000000000000000
00000 111111111111111
11111
000000000000000
111111111111111
00000
11111
000000000000000
111111111111111
The correct reconstruction area is defined by the edges of the finite array.
be covered completely by the array (figure 2.5). Outside the reconstruction area
diffraction will be seen whereas inside this area diffraction waves and the synthesized wave field will interfere. Examples of this effect of truncation will be shown
in chapter 4.
To reduce these effects a Hanning window can be applied over the length of the
array, known as tapering. This will cause the diffraction waves to be attenuated
inside as well as outside the reconstruction area. Figure 2.6 shows a Hanning
window applied on the truncated array.
tapering window
A
B
secondary sources array
Fig. 2.6
Reduction of diffraction effects by using a Hanning window applied to the amplitude of the driving function of the secondary sources.
Chapter 3
Auralization of wave fields
3.1
Introduction
In this chapter the concept of auralization based on the theory of wave field synthesis will be described. Auralization means making audible the acoustics of a
room (a concert hall for example) in another room (a reproduction room), so the
reproduced sound field should resemble the original one as closely as possible.
The three steps of auralization will be investigated in the next three sections of
this chapter: recording, extrapolation and reproduction.
After measurement (or simulation) of the impulse responses which characterize
the sound field in (part of) a concert hall with array technology, the spatial and
temporal characteristics of the hall are known. These impulse responses can be
extrapolated to the positions of the loudspeaker arrays in the reproduction room
and convolved with an anechoic (music) signal to yield this signal with the same
temporal and spatial structure, i.e. with the same acoustics, as the original hall.
3.2
Recording
Linear array
To reproduce the acoustic field of a (sub-)volume of the source-free audience area
of a concert hall in a reproduction room (see figure 3.1), the pressure or the particle
velocity at the bounding surface of this volume should be known, according to
theory in previous chapter. This can be done by recording the sound at the surface
17
18
Chapter 3: Auralization of wave fields
with planar arrays of pressure or velocity microphones. The sound field could
then be re-created using planar arrays of loudspeakers with monopole or dipole
characteristic according to the Rayleigh integrals to reconstruct the correct 3D
sound field in the reproduction room.
Stage
Audience
Reproduction
room
sub−volume
Fig. 3.1
The acoustics of part of the listening area of a concert hall can be re-created in
the reproduction room.
However, this is not realizable because of the enormous number of loudspeakers
that would have to be used and for which the signals would have to be calculated, leading to a problem of computational power. Therefore, limiting ourselves
to a horizontal planar (sub-)area, linear arrays of microphones and loudspeakers
will contribute to an approximately correct sound field reconstruction using the
Rayleigh 12 -D integral described in chapter 2. Hence, the impulse responses of a
source on the stage will be measured with a linear array of microphones giving a
multi-trace offset-traveltime recording that reveals the wave fronts traveling in the
enclosure. Note that just recording at the boundary surface of the sub-area is not
flexible since we can only reproduce the wave field within that sub-area. At the
end of this section a more flexible measurement-method will be described which
allows to reproduce an arbitrary sub-area of the hall considered.
2
3.2 Recording
19
Linear array of omnidirectional pressure microphones
We can measure the pressure of the wave field using a linear array of omnidirectional pressure microphones. However in order to be able to extrapolate the
composing waves to the different sides in the next step, we need to know where
they came from. When measuring the pressure with an omnidirectional microphone array, no discrimination in the elevation plane can be detected due to the
symmetry of the linear array: wave fronts coming from front, back, above and below are all projected in the same offset-traveltime plane. The different reflections
could be identified taking the hall geometry into account, however this requires
making a detailed model for each hall to be measured and difficult analysis of the
measurements.
Linear array of pressure and velocity microphones
When using pressure and particle velocity microphones, the elevation angle of
incidence can be determined. Directivity information is obtained from the pressure and velocity components. Since the three spatial components of the particle
velocity vector are recorded, we can calculate the velocity component in an arbitrary direction. Consider a plane wave propagating in the direction n. The particle
velocity v r; t for a plane wave given in section 2.2.1 reads:
( )
( ) = p(r;ct)
v r; t
0
n:
(3.1)
Taking a linear combination of the pressure and this velocity component in the
direction m will synthesize a directivity pattern with the main lobe in the chosen
direction m (m and n are unit vectors):
( ) =
=
pm r; t
a p(r; t) + b v(r; t) : m
b
a+
cos
p(r; t)
0 c
(3.2)
where is the angle between n and m. The expression between brackets in equation 3.2 describes the directivity pattern which depends on the values of a and b.
In table 3.1 the values of a and b are given for a monopole, dipole and cardioid
characteristic. In figure 3.2 the different directivity patterns of table 3.1 are drawn.
20
Chapter 3: Auralization of wave fields
characteristic
monopole
dipole
cardioid
Table 3.1
a
1
0
1
2
b
0
0 c
1 0 c
2
directivity
1
cos 1+cos 2
Directivity patterns of a monopole, a dipole and a cardioid.
y
m
n
x
m=0
monopole
dipole
cardioid
Fig. 3.2
Directivity patterns of a monopole, a dipole and a cardioid.
Thus we can simulate the response of an array of directional microphones with a
cardioid characteristic. If we define as the angle between m and the (x; y )-plane
and as the angle between y-axis and the projection of m on the (x; y )-plane
(figure 3.3), we can write:
mx
my
mz
= sin cos ;
= cos cos ;
= sin :
3.2 Recording
21
z
m
y
Fig. 3.3
x
Elevation ( ) and azimuth () of a vector m.
Then from equation 3.2 and table 3.1 the cardioid characteristic gives:
p; (r; t)
=
=
=
1 p(r; t) + 1 c v(r; t) : m
2
20
1 p + 1 c [v m + v m + v m ]
2 20 x x y y z z
1 p + 1 c [v sin cos + v cos cos + v sin ]:
y
z
2 20 x
First let us have a look at turning the cardioid in the (y; z )-plane (
1
1
p = p + 0 c [vy cos + vz sin ]:
2 2
(3.3)
= 0):
(3.4)
This allows us to discriminate between waves from different elevation angles of
incidence. However in this thesis we are not interested in sound waves coming
from another elevation angle than zero (x; y -plane) since the wave fields will be
reproduced with loudspeaker arrays only in the horizontal plane (see section 3.4).
In this case we obtain:
1
2
1
2
p = p + 0 c [vx sin + vy cos ]:
(3.5)
In the simplest case, we can simulate an array of cardioid microphones oriented to
). See figure 3.4, cardioid number 1 for a drawing of this situation.
the front (
=0
22
Chapter 3: Auralization of wave fields
y
1
3
2
= = m=0
v
Fig. 3.4
x
m=0
0
3
Three different orientations for the cardioid for a wave coming in with angle .
As said we want to extrapolate waves to the direction from which they came.
However this cardiod will detect, and thus extrapolate to the front, any wave coming with an angle 6 . As we will see in the next chapter, reconstruction using
this cardiod is not perfect.
=
To improve the separation, the angle of incidence of the incoming wave front
needs to be known. For a complex wave field is not defined. However by applying
a 2D Fourier transform in the time- and space domain, the wave field can be
written as a weighted sum of monochromatic plane waves, each with a different
kkx :
angle of incidence, defined by
sin =
k X P~ (mk ; 0; n!)e j(mkx)x ;
P (x; 0; n! ) = x
(3.6)
x
2 m
where P~ (mkx ; 0; n! ) represents the discretized two-dimentional Fourier transform of p(x; 0; t).
Thus we can select only the waves coming from the front for extrapolation. Now
that the angle of incidence is known we can turn the cardioid towards this direction
3.2 Recording
23
to improve its sensibility. Maximal sensitivity in this direction is obtained when
(cardioid 2 in figure 3.4).
=
For each plane wave with angle , we can calculate the contribution by:
pfront
=
=
1 p + 1 c [v sin( ) + v cos(
y
2 20 x
1 p + 1 c [v sin( ) v cos( )];
y
2 20 x
)]
(3.7)
where vx and vy are the velocities in the x and y direction respectively. Knowing:
vx
vyI
we obtain:
pfront
=
=
=
= v cos( 2 ) = v sin = v sin( 2 ) = v cos ;
(3.8)
1 p + 1 cv [sin sin + cos cos ]
2 20
1 p + 1 cv
2 20
p;
(3.9)
(3.10)
A property of the Fourier transform is that waves coming with angle of from the
front and from the rear ( ) end up with the same kx . This means that if a complex wave field contains contributions from and from , P mkx ; ; n !
kx will also contain a contribution from
used with a cardioid for k
, which will be picked up by the cardioid. Let’s take p p + p and
v v + v , where p+ and v + correspond to the part of the wave field pressure
due to a wave coming an angle and p and v to the part due to a simultaneous
wave from . Then:
+ = arcsin( )
=
+
pfront
+
~(
=
0 )
+
+
=
=
=
1 (p+ + p ) + 1 c [(v+ + v ) sin (v+ + v ) cos ]
y
y
2
20 x x
1 p+ + 1 cv+ + 1 p + 1 c [v sin v cos ]
xI
2 I 2 0 I 2 I 2 0 xI
1 p+ + 1 cv+ + 1 p + 1 c [v sin2 v cos2 ]
I
2I 20 I 2I 20 I
(3.11)
24
Chapter 3: Auralization of wave fields
We can reduce the contribution from
cardioid to the wave coming from pfront
to zero by turning the back of the
( = , cardioid 3 in figure 3.4):
1 p+ + 1 c [ v+ sin v+ cos ] + 0
x
y
2
20
1 p + 1 cv [ sin sin + cos cos ]
2 20
1 p + 1 cv cos(2 );
2 20
=
=
=
Note that this also reduces the sensitivity for a wave coming in with
For the rear array we will use cardiod 3 mirrored in the x-axis (
30 ):
prear
= 12 p + 21 0 c [
vx sin + vy cos ]:
=
(3.12)
+ .
, cardioid
(3.13)
Cross array configuration
When approches 2 , the cardioid must be turned such that almost no signal is
left. In order to detect waves coming with these angles better, we can add a second
array perpendicular to the first. For this second array we find:
pleft
pright
=
=
1 p + 1 c [v sin + v cos ];
y
2 20 x
1 p + 1 c [ v sin + v cos ]:
x
y
2 20
(3.14)
(3.15)
By recording impulse responses along two perpendicular arrays of microphone
positions over the length and the width of the hall (as shown in figure 3.5), with a
linear array of pressure and velocity microphones, a sound field can be re-created
at arbitrary sub-areas using wave field extrapolation. This will be further discussed
in section 3.3 and 3.4. In chapter 5 examples of auralization at different sub-areas
will be shown. Using equations 3.12, 3.13, 3.14 and 3.15 we can separate the
wave fields coming from front, rear, left and right.
Examples of wave fields separated with a cardioid and a steering cardioid will be
shown in chapter 4.
3.3 Extrapolation
25
Listening area
array I
Stage
x
Source
array II
y
Fig. 3.5
z
Measuring configuration in the concert hall along two arrays of microphone
positions over the full length and width of the hall.
Experimental setup
The impulse responses are measured along a cross array of microphone positions.
The measurements were made every 5 cm along a rail, using a remote control.
To discriminate wave fields coming from different directions, pressure and particle velocity were measured. The microphone used for the measurements, the
Soundfield MKV microphone, is able to measure the pressure and the three particle velocity vector components. It is composed of four microphones placed on a
tetrahedron. Thus directivity patterns can be synthesized by taking a linear combination of the pressure and the velocity components giving simulated cardioid
microphones, as described above.
3.3
Extrapolation
After the separation of the signals coming from front/rear and left/right with a
cardioid characteristic, the inverse extrapolation to the loudspeaker arrays takes
place as shown in figure 3.6. The inverse extrapolation is done using the formula
derived in this section.
In section 2.3, we have seen that:
1. the 3D Rayleigh integral gives a correct synthesized wave field for all types
of sources in a volume using continuous planar arrays of point sources
26
Chapter 3: Auralization of wave fields
I
right array
source
x
y
II
front
array
rear
array
left array
Fig. 3.6
Inverse extrapolation to the loudspeaker arrays.
2. the 2D Rayleigh integral gives a correct synthesized wave field for plane
waves and vertical line sources in a horizontal plane using continuous linear
arrays of (vertical) line sources
3. the 2 12 D Rayleigh integral gives an approximately correct synthesized wave
field in a horizontal plane using continuous linear arrays of point sources.
In this thesis we are dealing with linear arrays of point sources so the 2 12 D Rayleigh
integral should be used, but in equation 2.22, an amplitude factor depending on
the distance secondary source-receiver and primary source-secondary source is
present: the distance from secondary source to receiver is known, but the distance from primary source to secondary source is only known for the direct wave
and not any more for the reflected wave fields which appear to come from virtual
mirror sources.
In this section the 2D Rayleigh integral will therefore be used in which the distance primary source to secondary source does not appear. Plane waves and ver-
3.3 Extrapolation
27
tical line sources will be correctly reproduced whereas amplitude errors will be
present for the reproduction of a point source.
3.3.1 Forward extrapolator
=(
)
In the space-frequency domain, the pressure at a point A
xA ; y1; zA is given
y0 which separates the space
by the Rayleigh II integral over a surface S at y
into a receiver area and a source area (see section 2.3 and figure 3.7):
=
S
source area
z
Fig. 3.7
xi
y = y0
y
x
y
y = y1
n
ri
recording level
receiver area
extrapolation level
A
The wave field in A resulting from source S can be written as a superposition of
wave fields generated by secondary point sources on level y = y0 .
Z
1 + jkr cos e jkrdS; (3.16)
1
P (x; y0 ; z; ! )
P (xA ; y1; zA ; ! ) =
2 S
r2
p
with r = jrj = (xA x)2 + (zA z )2 + y 2 . Equation 3.16 can be writ-
ten with the symbolic convolution notation
P (x; y1; z; ! ) = W + (x; y; !; z ) P (x; y0; z; ! );
where ’**’ denotes convolution in x and z and
extrapolation operator:
W + (x; y; z; ! ) =
y = jy1
1 1 + jkr cos e
2 r2
(3.17)
y0 j. W + is called the
jkr :
(3.18)
28
Chapter 3: Auralization of wave fields
Equation 3.17 describes the extrapolation of the pressure from the surface y 0 to
the surface y1 , as shown in figure 3.8.
S
y = y0
y
z
Fig. 3.8
Forward
extrapolation
W+
W
Inverse
extrapolation
x
y = y1
Forward extrapolation describes wave propagation away from the source and
inverse extrapolation describes wave propagation towards the source.
In the wavenumber-frequency domain the spatial convolution is replaced by a
multiplication operation yielding
~ +(kx; y; kz ; !)P~ (kx; y0; kz ; !);
P~ (kx; y1 ; kz ; ! ) = W
~ + (spatial Fourier transform of W +), being given by:
the extrapolator W
~ +(kx; y; kz ; !) =
W
(
e
e
(3.19)
p(k (k +k )y 2 2 2
p(k +k )x k zy ; kx2 + kz2 k2
j
2
2
2
; kx + kz > k :
~ +(kx; y; kz ; !) describes the
For kx2 + kz2 k 2 the multiplication operator W
phase shift of the waves traveling from the plane y = y0 towards y = y1 . For
~ + describes the near field waves called evanescent waves, the
kx2 + kz2 > k2 , W
x2
z2
2
amplitude of which decrease exponentially for increasing y.
3.3.2 Inverse extrapolator
The inverse extrapolation is defined in the same way as the forward extrapolation.
The wave field extrapolated at y
y1 can also be inversely extrapolated from
y y1 to y y0 as shown on figure 3.8.
=
=
=
From equation 3.19 the inverse extrapolator in the wavenumber-frequency domain
noted W is easily found:
~
~ (kx; y; kz ; !)P~ (kx; y1; kz ; !):
P~ (kx; y0 ; kz ; ! ) = W
(3.20)
3.3 Extrapolation
29
~
In the spatial domain the inverse extrapolator W is defined by the inversion of
the forward operator W + . Here we define the inverse wave field extrapolator as
the conjugate of W + , the so-called ’matched filter’:
~
~
~ (kx; y; kz ; !) =
W
1
~ +
~W +(kx; y; kz ; !) ' W (kx; y; kz ; !);
yielding
~ (kx; y; kz ; !) =
W
+
(
p
p(k +k )
(3.21)
kx2 + kz2 k2
k y ; k 2 + k 2 > k 2 :
x
z
ej k2 (kx2 +kz2 )y ;
e
2
2
x
z
2
For kx2 kz2 > k 2 the inverse extrapolator equals
forward extrapolator. The
p(kthe
2 +k 2 ) k 2 y
+
+
x z
real inversion of W in this case would be e
which is unstable
because it has a positive real exponent. Therefore we keep the negative sign. For
more details the reader is referred to [Berkhout, 1987].
For kx2
~
+ kz2 k2 the operator is the inverse of the forward operator.
In the space-frequency domain:
W (x; y; z; ! ) = W + ( x; y; z; ! ):
(3.22)
3.3.3 Extrapolator for 2-D
Forward extrapolation
The far-field approximation for the 2D Rayleight II integral was defined in equation 2.20:
P (xA ; y = y; ! ) =
r
Z
jk
e jkr
P (x; y = 0; ! ) cos p dx;
2 y=0
r
where A is an arbitrary point at some plane y
(3.23)
= y (figure 3.9).
The extrapolation operator is defined as:
W +(x; y; ! ) =
r
jk
e jkr
2 cos pr
(3.24)
30
Chapter 3: Auralization of wave fields
microphone
array
loudspeaker
array
x
z
r
n
r
Source
A
listening area
y
y
y = y
y=0
Fig. 3.9
Forward extrapolation.
Inverse extrapolation
Consider now the situation of inverse extrapolation depicted in figure 3.10. The
loudspeaker array is positioned between the source S and the microphone array.
The inverse extrapolator can be found by applying equation 3.22 to equation 3.24,
resulting in:
W (x; y; ! ) =
r
e+jkr
jk
cos
p :
2
r
(3.25)
Using W , we can compute the pressure which will need to be emitted by the
loudspeaker arrays:
P (xA; y; ! ) =
r
Z
jk
e+jkr
P (x; y = 0; ! ) cos p dx:
2 y=0
r
(3.26)
The inverse operator is only valid if no sources are positioned between the loudspeaker array and the microphone array. Note that equation 3.26 does not depend
on the distance between the source and the microphone array. This equation has
been used for the examples that will be shown in the remainder of this thesis.
3.4 Reproduction
31
loudspeaker
array
x
n
r
Source
z
r
A
microphone
array
y
y
listening
y = y
Fig. 3.10
3.4
area
y=0
Inverse extrapolation.
Reproduction
The system used in the laboratory for sound reproduction by wave field synthesis
is built of 10 arrays of light-weight electrodynamic loudspeakers which are active
in a large frequency range.
In conventional reproduction of audio signals, the bandwidth of electrodynamic
loudspeakers is usually covered by two or three drivers with complementary frequency ranges, because it is very difficult to construct a single driver over the
whole audio spectrum with enough output power. But in array systems the power
is generated by many elements. So elements with low-power and large frequency
ranges can be used in array systems.
The elements of the array used are electrodynamic loudspeakers with an oval
shape as shown in figure 3.11 below. This shape gives a higher directivity in
the horizontal plane compared to the vertical plane since the larger dimension lies
in the horizontal direction. The length of this loudspeaker is 12.6 cm while the
width is 5.8 cm.
Electrodynamic transducers are placed in large bars in a separate enclosure. The
volume of each element determines the resonance frequency. To keep a low resonance frequency, the volume should be large. But too large a volume will increase
32
Chapter 3: Auralization of wave fields
the dimensions and also the weight of the array. The size of the volume is about
1.8 dm3 and the resonance frequency lies around 180 Hz.
The elements of the array are spaced by 12.6 cm, due to the horizontal dimension
of the loudspeakers. The value of the spatial aliasing frequency is determined by
the spacing between the loudspeakers and equals 1360 Hz.
The reproduction system consists of 160 loudspeakers repartitioned over ten line
array bars in a rectanglar configuration, enclosing a listening area of about m 2 ,
see figure 3.13 and 3.11 top. The arrays hang at a height of about 1.65 m above
the floor. Thus listeners can easily walk through the room to check the spatial
properties of the reproduced sound field.
24
Fig. 3.11
Above: the reproduction room with loudspeaker arrays at the Laboratory of
Acoustics; below: array of electrodynamic loudspeakers with oval shaped
diaphragm.
Two DSP (Digital Signal Processor) systems are used to calculate the driving
3.4 Reproduction
33
signals for the loudspeakers.
One DSP system convolves the early reflections (first 160 ms) of the extrapolated impulse response with the audio signal yielding the auralized wave field and
adds them to the reverberation signals thus forming the driving signals Q m for the
loudspeakers.
The reverberation signals are generated by the other DSP system by real-time convolution of four plane waves with the anechoic audio signal (figure 3.12). DSP 2
calculates the approximate delays and amplitudes for the individual loudspeakers,
in such a way that the loudspeaker arrays synthesize the desired plane wave. The
four plane waves are coming from each of the walls (see figure 3.13(b)) generating
the reverberation signals.
64 impulse responses
(first 160 ms)
dsp1
anechoic
anechoic
signal
convolution
signal s(t)
64 driving
signals Qm
4 plane waves
Fig. 3.12
dsp2
Reverberation
signal: 4 plane
waves convolved
with anechoic signal
Schematic setup of the DSP system.
The output of the DSPs contains 64 driving signals. As a total of 160 loudspeakers
are present and there are only 64 channels in the system, loudspeakers in the 6
front array bars (numbered 4 through 9 in figure 3.13(a)) are coupled per two: two
loudspeakers per output channel. Loudspeakers in the 4 rear arrays (1 through 3
and 10) are coupled per 4.
34
Chapter 3: Auralization of wave fields
6
7
5
8
arrays
4
9
3
10
2
(a)
Fig. 3.13
1
(b)
(a) top view of the arrays of loudspeakers. Each array contains 16 loudspeakers. (b) configuration of 4 plane waves for the reverberation field.
Chapter 4
Auralization of plane waves
4.1
Introduction
In order to better understand the results of the auralization of a complex wave
field measured in a concert hall, we will start by studying a very simple wave
field: a single plane wave. It will allow us to easily identify the limitations of the
reproduction system (spatial aliasing, truncation artefacts) and the influence of the
reproduction room (reflections).
In the first section the properties of a plane wave are discussed. Then the differences between continuous and discrete arrays are shown. Also the effects of
transition from infinite to finite arrays are shown. In the last two parts, simulation
and measurement results for a single plane wave will be given.
4.2
Properties of plane waves
The plane wave solution of the homogeneous wave equation described in equation
2.7 in the positive x direction is:
p(x; t) = p(t
=
x
):
c
(4.1)
The wave front is defined by t xc
constant which gives a straight line in the
x; t -plane, independent of y (figure 4.1). So the amplitude of the wave front
must be constant. Theoretically the amplitude of a plane wave does not decay in
a loss-free medium: there is no attenuation during propagation.
( )
35
36
Chapter 4: Auralization of plane waves
x
t
Fig. 4.1
Impulsive plane wave in the space-time domain.
The propagation direction is perpendicular to the wave front. If t is increased by
t then x has to be increased by x c t to maintain the same amplitude, with
the constant velocity of sound c. The specific acoustic impedance, defined by:
= Zs(x; ! ) =
P (x; ! )
;
V (x; ! )
(4.2)
P (x; t)
;
V (x; t)
(4.3)
is real and frequency independent for a plane wave and is given by 0 c and in this
special case also holds in the (x,t) domain:
Zs(x; t) =
4.3
Experimental setup
The source function used in the following sections is a pulse of 2.2 kHz bandwidth, depicted in figure 4.2(a) in the time domain, with its amplitude spectrum in (b). The experiment consists of the following steps:
1. the pressure and the particle velocity are simulated on a cross array in the horizontal plane (x,y) to obtain a perfect input signal for the auralization, emulating the recorded signal described in section 3.2. In figure 4.2(c) the pressure field of a plane wave oriented at an angle $\varphi = 30°$ with respect to the x-axis is depicted in the (x,t)-domain at the microphone positions on array I,
Fig. 4.2   (a) Source signal, (b) amplitude spectrum of the source signal and (c) wave field of a plane wave recorded by microphone array I.
2. cardioid microphones are simulated allowing the separation of the signals
coming from different directions,
3. the separated signals are extrapolated according to the theory of section 3.3 to the loudspeaker arrays at positions y = 3 m for array I and x = 2 m for array II, as depicted in figure 4.3,
4. the extrapolated signals are fed to the loudspeaker arrays,
5. the pressure of the auralized signals is measured at the position of array I
for comparison with the input signal.
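For illustration of step 1, the sketch below (Python; hypothetical names and geometry, not the code used in the thesis) samples the pressure and the two horizontal particle velocity components of a plane wave on an arbitrary set of microphone positions.

```python
import numpy as np

def plane_wave_on_array(mic_xy, s, fs, azimuth_deg, c=343.0, rho0=1.21):
    """Pressure and particle velocity of a plane wave, sampled at the
    microphone positions mic_xy (M, 2), for a source signal s at rate fs."""
    phi = np.deg2rad(azimuth_deg)
    n = np.array([np.cos(phi), np.sin(phi)])      # propagation direction
    t = np.arange(len(s)) / fs
    delays = mic_xy @ n / c                       # arrival time per microphone
    # linear interpolation realises the fractional time shift s(t - delay)
    p = np.stack([np.interp(t - d, t, s, left=0.0, right=0.0) for d in delays])
    vx = p * n[0] / (rho0 * c)                    # plane wave: v = p/(rho0*c) along n
    vy = p * n[1] / (rho0 * c)
    return p, vx, vy
```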
Before using these extrapolated results for reconstruction in a real room, we have
simulated the reconstruction in a reflection-free virtual room. This allows us to
see the quality of an idealised auralization without artefacts due to distortion from
the loudspeaker imperfections and reflections from the reproduction room. The
results will be discussed in section 4.6.
First some effects of discretisation and finite array length will be investigated in
the next two sections.
Fig. 4.3   Configuration of the recording microphones and the auralization loudspeakers.
4.4 Artefacts due to discretisation
By sampling a continuous signal in space, spatial aliasing will occur if the maximum frequency in the signal is larger than the aliasing frequency:

$$f_{al} = \frac{c}{2\,\Delta x \sin\theta_{max}}, \qquad (4.4)$$

where $\theta_{max}$ is the maximal angle of the sound emitted by the loudspeakers.
The discretisation of continuous secondary source distributions by regularly spaced arrays of point sources leads to a periodic signal in the wave number domain. When the signal is undersampled in the spatial domain, the spectrum of the signal overlaps in the wave number domain, causing spatial aliasing. To illustrate this effect, simulated plane waves traveling at an angle $\varphi = 30°$ with the x-axis are shown in figure 4.4 for different distances between the loudspeakers. Subfigure 4.4(a) represents the reconstructed pressure field as a function of time for a distance of 6.3 cm between the loudspeakers. No spatial aliasing occurs, since $f_{max} < f_{al} = 5444$ Hz.
Fig. 4.4   A plane wave traveling under an angle $\varphi = 30°$, synthesized by a loudspeaker array for three different sampling distances dx: (a) dx = 12.6/2 = 6.3 cm (no aliasing), (b) dx = 12.6·2 = 25.2 cm (little aliasing), (c) dx = 12.6·4 = 50.4 cm (aliasing). For each case the upper panel shows the wave field, the middle panel the signal at receiver position x = -1.3 m and the lower panel its spectrum.
Only effects of the finite array length occur, which will be discussed in the next section.
In subfigures 4.4(b) and (c) the sampling distance is chosen equal to 25.2 cm and 50.4 cm, respectively. The wave fronts are still visible, but distortion of the wave field occurs: the anti-aliasing criterion is no longer respected, because $f_{al}$ = 1361 Hz and 680 Hz, respectively.
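These aliasing frequencies follow directly from equation 4.4; a quick check (Python, assuming c = 343 m/s) reproduces the values quoted above.

```python
import numpy as np

c = 343.0                                   # speed of sound [m/s], assumed value
theta_max = np.deg2rad(30.0)                # maximal radiation angle considered here
for dx in (0.063, 0.252, 0.504):            # loudspeaker spacings [m]
    f_al = c / (2 * dx * np.sin(theta_max))
    print(f"dx = {100 * dx:4.1f} cm  ->  f_al = {f_al:6.0f} Hz")
# roughly 5444, 1361 and 680 Hz, matching the values quoted in the text
```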
The subfigures below give the reconstructed signals at the receiver position x =
-1.3 m as a function of time and their spectra, respectively. We see that the source
signal spreads out in time due to aliasing. In the amplitude spectrum, we see that the reconstructed field is not correct for frequencies above the aliasing frequency.
In the remainder of this thesis the sampling distance is 25.2 cm (the loudspeakers
are coupled per 2 at the front of the room) except for the back of the room where
the loudspeakers are coupled per 4 and the distance is 50.4 cm, as depicted in
figure 4.3. So we expect spatial aliasing to be present in our wave field synthesis.
4.5 Artefacts due to finiteness of array
A loudspeaker array used to reproduce wave fields will always have a finite length,
so diffraction effects will occur in the synthesized wave field due to truncation of
the array.
Subfigure 4.5(a) shows the wave front with two additional signals, starting from
the edges of the loudspeaker array. The middle image of subfigure 4.5(a) gives
the amplitude of the wave front at position x=-1.3 m where the diffraction effect
is visible. To avoid truncation effects the driving signals at the edges of the array
should be attenuated relative to the center using a tapering function. Subfigure
4.5(b) shows the results of the tapering.
The diffraction effects are then attenuated. Note that the wave front is no longer constant at the edges: the amplitude has decreased (subfigure 4.5(b), middle), as has the amplitude spectrum. In the following, a tapering window will be used to avoid diffraction effects.
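The thesis does not spell out the exact shape of the taper; a common choice is a half-cosine roll-off over the outer loudspeakers of the array, as in the sketch below (hypothetical names and taper length).

```python
import numpy as np

def spatial_taper(n_speakers, n_taper=8):
    """Spatial tapering window: unity in the middle of the array and a
    half-cosine roll-off over the outer n_taper loudspeakers at each end."""
    w = np.ones(n_speakers)
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(n_taper) / n_taper))
    w[:n_taper] = ramp
    w[-n_taper:] = ramp[::-1]
    return w

# Driving signals Q[l, t] are simply scaled per loudspeaker:
# Q_tapered = spatial_taper(Q.shape[0])[:, None] * Q
```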
4.6 Simulation of plane waves
We will now have a look at the simulation of plane wave auralization. The driving
signals for the loudspeaker arrays are used to simulate the reconstruction in a
reflection-free virtual room.
The simulation was carried out as follows: at each microphone position $x_m$ the distances to all loudspeaker positions, $d_{ml}$, were calculated. The driving signal of each loudspeaker, $Q_l$, is delayed by the travel time corresponding to this distance, $d_{ml}/c$, and attenuated by $1/d_{ml}$, giving a new signal $Q_{ml}$. The simulated response $S_m$ of the microphone at position $x_m$ was found by summing the signals $Q_{ml}$ over all loudspeakers l (equation 4.5 below).
Fig. 4.5   Plane wave synthesized by a loudspeaker array (a) without and (b) with a spatial tapering function. For each case the upper panel shows the wave field, the middle panel the signal at receiver position x = -1.3 m and the lower panel the spectrum at the same position.
$$S_m(t) = \sum_l Q_{ml}(t) = \sum_l \frac{1}{d_{ml}}\, Q_l\!\left(t - \frac{d_{ml}}{c}\right) \qquad (4.5)$$
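Equation 4.5 translates almost literally into code; the sketch below (Python, hypothetical names, linear interpolation for the fractional delays) shows the free-field summation for a single microphone position.

```python
import numpy as np

def simulate_mic(mic_xy, speaker_xy, Q, fs, c=343.0):
    """Free-field response at one microphone position: each loudspeaker
    driving signal Q[l, :] is delayed by d_ml / c and attenuated by
    1 / d_ml, then all contributions are summed (equation 4.5)."""
    t = np.arange(Q.shape[1]) / fs
    s = np.zeros(Q.shape[1])
    for l, xl in enumerate(speaker_xy):
        d = np.linalg.norm(np.asarray(mic_xy) - np.asarray(xl))
        s += np.interp(t - d / c, t, Q[l], left=0.0, right=0.0) / d
    return s
```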
To start we will consider the reconstruction of a plane wave using non-steering
cardioid microphone arrays. Next we will simulate steering cardioid microphone
arrays to see if the separation is better than with (non-steering) cardioid microphone arrays. Then we will consider the case when a plane wave travels in a plane
different than the horizontal one.
4.6.1 Reproduction of non-steering synthesized cardioid microphone arrays
First of all, we will look at a cross array of non-steering cardioid microphones.
Subfigure 4.6(a) shows the driving signals for the loudspeaker arrays for a plane wave oriented at an angle $\varphi = 30°$ with respect to the x-axis. The loudspeaker
configuration and its numbering are shown in figure 4.3. Thus the first 16 columns
of subfigure (a) correspond to the rear array, the next 24 to the left, then 16 for the
front and the last 24 drive the right array. The empty columns at the back ends of
the side arrays and for the rear array are caused by the fact that the speakers are
coupled per four.
We see that the cardioid attributes a driving signal to the rear and right arrays
(though smaller than for front and left), where there should be none. This is
because the cardioid oriented towards the rear array is sensitive to all signals coming in at an angle different from zero. Similarly, the cardioid oriented towards the right array will pick up any signal not coming from 270° with respect to the x-axis.
Subfigure 4.6(b) shows the reconstructed signal between x=-1.7 m and x=+1.7 m
at microphone array I. To get a better insight in the reconstruction, the ideal signal
that we could expect to obtain has been added at the sides of the reconstructed
signal, i.e. the original signal. The wave front is correctly reproduced: it matches the arrival time and wavelength of the original on both sides.
We also observe a wave field at a different angle from the original wave front. To
better understand this additional signal, we can simulate the reproduction of the
plane wave using one array at a time instead of all four loudspeaker arrays. The
contributions of each of the four arrays are shown in subfigures 4.6(c) through
(f). The grayscale on subfigure (d) has been changed to see the small signal from
the rear array. The reconstruction of the wave front by the front array does not
cover the whole width. This can be understood from the fact that the leftmost loudspeaker has no left neighbour that constructively adds to the wave front. A shadow zone of 1.5 m is created on microphone array I, as shown in figure 4.7, where
$$\tan\theta = \frac{x'}{3} \;\Rightarrow\; x' = 1.5\ \mathrm{m} \qquad (4.6)$$
This shadow zone will be compensated for by the contribution of the left array
which covers the complementary area (figure 4.6(e)). In this way the wave front
Fig. 4.6   Reproduction of a plane wave after separation with cardioid microphone arrays: (a) driving signals for the loudspeaker arrays; (b) signal reproduced on microphone array I by all four loudspeaker arrays; (c)–(f) signals reproduced by the loudspeaker arrays at the front, rear, left and right, respectively.
is reconstructed over the whole width. For the right and rear arrays, we can see
that the cardioid has attributed the same driving signal as for the left and front
arrays respectively but with a (much) smaller amplitude. We can conclude that
using a cardioid microphone perpendicular to the arrays does not allow a perfect reconstruction of the plane wave because of these parasitic signals. Note also the presence of aliasing, as expected.
Fig. 4.7   Reproduction by the front array of a plane wave arriving under an angle of 30° with respect to the x-axis.
4.6.2 Reproduction of synthesized steering cardioid arrays
Next, auralization of a plane wave with the same incidence angle as in the previous subsection has been simulated, but this time the cardioid is oriented in the
direction of incidence of the plane wave, i.e. using a steering cardioid. This direction can be obtained from the two horizontal particle velocity components as
explained in chapter 3.
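As a rough illustration of the idea (not the actual processing chain of chapter 3), the sketch below estimates the direction of incidence from the horizontal intensity components and weights the pressure with a cardioid steered towards that direction, evaluated for four assumed wall-normal directions.

```python
import numpy as np

def steered_separation(p, vx, vy):
    """Sketch: estimate the instantaneous direction of incidence from the
    horizontal intensity components p*vx and p*vy, and attribute the
    pressure to the four arrays with a cardioid steered in that direction."""
    phi = np.arctan2(p * vy, p * vx)             # estimated incidence direction
    normals = {"front": 0.0, "left": np.pi / 2,  # assumed wall-normal directions
               "rear": np.pi, "right": -np.pi / 2}
    out = {}
    for name, psi in normals.items():
        w = 0.5 * (1 + np.cos(phi - psi))        # cardioid: 1 on-axis, 0 at the back
        out[name] = w * p
    return out
```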
Figure 4.8 shows the results in the same way as figure 4.6. In subfigure (a), no
signal is attributed to the right and rear array as expected. We can see that the separation with the steering cardioid is perfect. The reconstructed signal in subfigure
(b) matches the original signal well on both sides. In subfigures 4.8(c) through (f)
only the front and left arrays contribute to the reconstruction of the plane wave.
Aliasing is still present.
Fig. 4.8   Reproduction of a plane wave after separation with steering cardioid microphone arrays: (a) driving signals for the loudspeaker arrays; (b) signal reproduced on microphone array I by all four loudspeaker arrays; (c)–(f) signals reproduced by the loudspeaker arrays at the front, rear, left and right, respectively.
4.6.3 Reproduction of a plane wave coming with an elevation angle different from zero
Consider now the case of a plane wave propagating out of the horizontal plane: a plane wave coming in with an elevation angle of 45° and an azimuthal angle of 30°.
The calculated driving signals for the loudspeakers are shown in subfigure 4.9(a).
The reconstructed signal embedded in the ideal plane wave simulation is depicted
in figure 4.9(b).
Fig. 4.9   Reproduction of a plane wave coming out of the horizontal plane: (a) driving signals for the loudspeaker arrays; (b) signal reproduced on microphone array I by all four loudspeaker arrays; (c)–(f) signals reproduced by the loudspeaker arrays at the front, rear, left and right, respectively.
Plane waves propagating with an elevation angle should be recorded and reconstructed with planar arrays. But, as mentioned before, in practice we use linear arrays, leading to errors in the reproduced signal.
We can see that the reproduced signal arrives at a completely wrong time in subfigures (e) and (f) (the grayscale of (d) has been modified), and that the contributions of front/rear and left/right are not consistent with each other. We can calculate the apparent angle of incidence for a horizontal plane wave corresponding to the observed apparent velocity of the wave front as seen by arrays I and II, either geo-
metrically or by measuring the slope of the wave fronts in the measurements. We find an apparent angle of 21° for array I and 51° for array II. This explains why the contributions from both arrays are not consistent with each other. In the auralization of concert hall acoustics (next chapter), no correction was applied to suppress these effects.
4.7 Auralization of plane waves in a non-anechoic room
Now that we have seen the results of the simulation of plane wave reconstruction,
we will have a look at the auralization in a real room as described in chapter 3.
The first auralized signal is a plane wave parallel to array I. The measured result
is displayed in figure 4.10(a), together with the ideal signal.
The plane wave is reconstructed over the whole width with the correct wave
length.
Since the room in which we have done the auralization is not anechoic, we observe
that there are reflections of the reproduction room present in the auralized signal.
We can identify them by calculating the delays corresponding to reflections from
the walls, the floor and the ceiling.
This will be useful in order to recognize them in a more complex auralized signal.
We have identified the following reflections:
Event 1: is the first reflection from the ceiling.
Event 2: is the first reflection from the wall behind the front array together with the
first reflection from the floor.
Event 3: is the first reflection from the floor followed by a second reflection from the
ceiling and conversely and also a first reflection from the wall behind the
front array followed by a second reflection from the floor, which explains
the stronger event.
Event 4: is the reflection from the ceiling, then from the floor and a third reflection
from the ceiling.
Event 5: is the first reflection from the wall behind the front array, then from the floor
and the ceiling and a fourth one from the floor.
48
Chapter 4: Auralization of plane waves
Event 6: is the first reflection from the wall behind the rear array.
The multiple reflections are more difficult to identify because of the complex pathways. After 60 ms it becomes harder to make out the individual reflections in the
’noise’ of multiple reflections and diffractions.
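The delays used to identify these events follow from simple image-source geometry. A minimal sketch (Python, with a hypothetical shoebox geometry for the reproduction room, not the actual room dimensions) of the first-order reflection delays relative to the direct path:

```python
import numpy as np

def first_order_image_delays(src, rcv, room_dims, c=343.0):
    """Delay of the six first-order reflections of a shoebox room relative
    to the direct sound, computed with the image-source construction."""
    src, rcv = np.asarray(src, float), np.asarray(rcv, float)
    direct = np.linalg.norm(rcv - src)
    delays = {}
    for axis, (L, names) in enumerate(
            zip(room_dims, (("front wall", "rear wall"),
                            ("left wall", "right wall"),
                            ("floor", "ceiling")))):
        for k, name in zip((0.0, L), names):
            img = src.copy()
            img[axis] = 2 * k - src[axis]        # mirror the source in the wall plane
            delays[name] = (np.linalg.norm(rcv - img) - direct) / c
    return delays

# e.g. first_order_image_delays(src=(1, 3, 1.5), rcv=(2, 0, 1.5), room_dims=(6, 4, 3))
```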
Fig. 4.10   Plane waves with different incidence angles auralized in the reproduction room: (a) plane wave incident parallel to the x-axis, auralized after processing with steering cardioid microphones; (b) plane wave incident under 30° with respect to the x-axis, auralized after processing with steering cardioid microphones; (c) the plane wave of (b) plotted in the octave band centered around 500 Hz.
Figure 4.10(b) is the result of auralization of a plane wave incident under 30°
with respect to the x-axis. A steering cardioid was used to make the separation
between signals coming from front-rear and signals coming from left-right. Again
the wave front is perfectly reproduced. We can clearly see aliasing between the
arrows denoted 1 caused by the front array, and aliasing caused by the left array
denoted 2. By plotting this signal in the octave band centered around 500 Hz (see
figure (c)) we can see that the aliasing is not present anymore since the frequencies
above the aliasing frequency have been removed.
Arrows 3 (in figure (b)) show the reflections of the aliasing at the right wall. Other reflections are also visible, but they are difficult to identify because of the complex pathways for a wave incident at an angle.
4.8 Conclusion
We can conclude that:
- a simple wave can be auralized,
- a steering cardioid gives a perfect suppression of the signals from undesired directions,
- a lot of aliasing is observed,
- the influence of the reproduction room can easily be identified on a plane wave incident parallel to the x-axis.
Chapter 5
Physical comparison
5.1 Introduction
In the previous chapter, we have seen that:
1. we can obtain a near perfect reconstruction for a simple signal
2. the reproduction room has an influence on the reconstructed signal that cannot be neglected.
In this chapter we will see how a total impulse response of a complex room (a concert hall), formed by, amongst others, the wave front and the reflections, is reconstructed.
In step 1 of the setup of section 4.3, the simple perfect input is now replaced by a
complex measured signal, after which the extrapolation was carried out. The resulting driving signals were again fed to loudspeaker arrays in a virtual anechoic
and a real (non-anechoic) reproduction room, where the resulting wave field was
measured at the microphone positions of array I. In the following sections, physical characteristics of both the simulation and the measurements, will be described.
The next chapter will discuss perceptual aspects.
The original impulse responses were measured in two different concert halls: the
Concertgebouw in Amsterdam and De Doelen in Rotterdam. Figure 5.1 shows
the ground plan and the measurement configuration of both halls. The wave field
was auralized in different areas in each concert hall. The center of the first area
Fig. 5.1   Ground plans of (a) the Concertgebouw in Amsterdam (43 m × 28 m × 17 m) and (b) De Doelen in Rotterdam (57 m × 32 m × 14 m), with the microphone cross-array and the positions of the areas where the wave field has been auralized.
coincides with the center of the concert halls. A second area was chosen 8 meters
to the right of the center, and a third one 8 meters to the right and 5 meters to the
back (only for the Concertgebouw).
5.2 Influence of the reproduction room on the reconstruction of a monopole source signal
We have measured the response of the reproduction room to a wave front signal. This will be helpful for the identification of reflections which were not present in the original signal. For this we have taken the first 3 ms of the measured impulse response of the Concertgebouw.
Fig. 5.2   Auralization of a wave front in the reproduction room: (a) original signal, (b) measured signal, (c) zoom.
Figure 5.2(a) shows the wave front of a monopole source and figure (b) gives the
reconstructed wave front in the reproduction room. We find the same reflections
already identified in figure 4.10(a). The reader is referred to section 4.7 for the
identification of the numbering of the reflections. In figure 5.2(c) we can see the
aliasing better. We observe that there is a lot of reverberation signal after the input
signal has died off.
5.3 Auralization of the Concertgebouw impulse responses
Measurements of the impulse responses of a monopole source positioned on the stage of the Concertgebouw have been done by J. Baan and J.J. Sonke, recorded along the microphone arrays I and II. The offset-traveltime representation of the measurements on array I is shown in figure 5.3, with an enlargement in figure 5.4(a).
Fig. 5.3   Impulse responses of the Concertgebouw as a function of position.
The vertical axis represents the traveltime t, beginning at zero when the source fires. The horizontal axis gives the offset x of the microphone from the center of the hall. The figure shows the temporal and spatial structure of the impulse responses.
Fig. 5.4   Offset-traveltime representation of the impulse responses of the Concertgebouw: (a) original measured impulse responses; between x = -1.7 m and x = +1.7 m (area 1) the data have been replaced by (b) the simulated wave field in the virtual reproduction room and (c) the measured wave field in the reproduction room.
The direct front and the reflections from the sidewalls are easily identified. Knowing the geometry of the concert hall, we can also identify the other reflections, but this was not part of this study. After about 100 ms the reflections cannot be identified individually anymore: the reflection density increases and the wave field becomes more complex. Also the amplitude decreases significantly after multiple reflections, due to both the geometrical spreading of the sound field and the absorption during reflection against the room boundaries.
Using these measurements we can reconstruct the sound field in any area covered
by the measurements.
5.3.1 Reproduction in area 1
Simulation
As discussed in chapter 4, the wave field on arrays I and II is separated with simulated steering cardioid microphones and inversely extrapolated, giving the driving signals for the loudspeaker arrays. These signals have been used in computer simulations in a virtual reproduction room.
The results of the simulation in area 1, at x = [-1.7, +1.7] m, are displayed in figure 5.4(b), replacing that part of the original signal.
We see for example that the simulated direct front coincides with the original wave
front. The same holds for the first reflections from the side walls.
Comparison with figure 5.4(a) shows a quite good overall similarity. The weaker
signals at times 61 ms, 69 ms and 77 ms are also reproduced almost perfectly.
However we see the presence of a lot of (strong) aliasing. The main artefacts are
due to aliasing.
Measurements
The extrapolated signals have been fed to the loudspeaker arrays in the reproduction room, where the pressure of the resulting wave field was measured at the position of array I for comparison with the original sound field. The (x,t) representation of the measurements can be seen in figure 5.4(c), between x = -1.7 m and x = +1.7 m. The digital processing system was isolated to prevent the generation of additional noise.
Here too the wave front connects to the original one, but it is a little bit weaker. A reason could be the following: we scale the measurements so that their maximum amplitude equals the maximum amplitude of the original. Because the microphones at both ends are located just next to a loudspeaker, they pick up a too strong amplitude, so this scaling may not be optimal.
The reflections of the reproduction room that can be identified have been numbered as in the previous figure. Event 2' is, for event 1', the reflection from the floor and ceiling and conversely, and also from the wall behind the front array and the floor (equivalent to event 3 in section 4.7). The sidewall reflections are well reproduced, but not exactly at the correct time. The reason is probably that the measurements were done at a different sampling frequency than the original signal and then resampled at a rate which is slightly different from the original one.
We observe especially strong aliasing below the crossing of the first reflections, which resembles what we have seen in figure 4.10(b) for a plane wave. A few weaker events present in the original signal are hardly noticeable in the measurements. This could be due to (interference with) the quite strong reverberation which we noted in section 5.2.
5.3.2 Reproduction in area 2
The same experiments have been carried out in the second area (see figure 5.1).
Figure 5.5(a) shows the offset-traveltime representation of the original impulse responses of
the Concertgebouw around area 2. We observe that the amplitude of the events is
smaller than in area 1 (except for the direct wave).
Figure (b) shows the simulation of the reconstructed wave field. This time the
direct wave is not well reproduced over the entire length. As shown in [Hulsebos
et al., 2001] the area within which good reproduction can be obtained with a cross
array configuration is limited. Here we are crossing the limit of this area. In figure
5.6 the results are shown for a smaller reproduction room (4*2 m instead of 4*6
m). We can see that the reproduction of the direct wave is now almost perfect so
we conclude that we are now (nearly) within the limits.
The other reflections are reasonably well reproduced.
Figure (c) gives the measurements of the auralization in area 2. Just like the
simulation, the wave field shows artefacts. On top of the ones present in the
Fig. 5.5   Offset-traveltime representation of the impulse responses of the Concertgebouw: (a) original measured impulse responses around area 2; between x = 6 m and x = 9.4 m the data have been replaced by (b) the simulated wave field in the virtual reproduction room and (c) the measured wave field in the reproduction room.
Fig. 5.6   Offset-traveltime representation of the impulse responses of the Concertgebouw, simulated in a smaller area than area 2 (4×2 m instead of 4×6 m).
simulation, we can see a lot of artefact signals due to the reproduction room.
5.3.3 Reproduction in area 3
Within area 3 no measurements have been made in the Concertgebouw, but from the pressure and the particle velocity recorded on arrays I and II, extrapolation can be carried out everywhere in the concert hall within the limitations mentioned in [Hulsebos et al., 2001], allowing the auralization in the reproduction room of the wave field in area 3 (figure 5.7). No comparison with the original signal is possible, but we observe the same characteristics as in area 2, except that the direct wave is slightly better reproduced, which means that we are (more nearly) within the allowed limits for correct reproduction. It also proves that it is possible to auralize in this area, away from the microphone arrays.
5.4 Auralization of De Doelen impulse responses
We will now have a look at the results of the auralization of De Doelen (figure 5.1(b)). Because the geometry of De Doelen differs from that of the Concertgebouw, different shapes and arrival times of the reflections are observed. The original impulse responses displayed in figure 5.8 look more homogeneous than the impulse responses in the Concertgebouw; there are fewer events coming from the sides.
5.4.1 Reproduction in areas 1 and 2
Here the same experiments have been conducted. Results of area 1 can be found in
figures 5.8(b) and (c). Again strong aliasing effects are present in the direct arrival
and the side reflections. At the side reflections this causes steeply dipping events
similar to what we observed in figure 4.10. As for the Concertgebouw, we see a high reverberation level due to the reflections of the reproduction room, which tends to hide the weaker events of the auralized signal.
Simulation and measurement results for area 2 can be seen in figure 5.9. We note that we are again partly beyond the allowed area. Like before, there is a lot of aliasing present in the auralized signals.
Fig. 5.7   Offset-traveltime representation of the impulse responses of the Concertgebouw; between x = 6 m and x = 9.4 m (area 3) the data have been replaced by (a) the simulated wave field and (b) the measured wave field in the reproduction room.
Fig. 5.8   Offset-traveltime representation of the impulse responses of De Doelen at the microphone array positions: (a) original measured impulse responses, (b) simulated wave field and (c) measured wave field.
Fig. 5.9   Offset-traveltime representation of the impulse responses of De Doelen at the microphone array positions: (a) original measured impulse responses, (b) simulated wave field and (c) measured wave field.
5.5 Conclusions and discussion
We have seen that the reproduction room has a large effect on the quality of the reproduced signal. The best results would be obtained in an anechoic room, but this is not practical. Although some absorbing materials were applied to the walls, this has not eliminated the acoustics of the reproduction room.
Another way to reduce the impact of the reproduction room would be to try to cancel the reflections by emitting signals in counter phase through the loudspeakers. However, this is nearly impossible because of the enormous complexity and computation requirements.
The next obvious element to improve would be to reduce the aliasing. This could be achieved by using more loudspeakers, for example by driving all loudspeakers in the current setup individually.
In the results from areas 2 and 3, we observe that the area in which good auralization is possible is limited for a cross array setup and does not cover the whole area defined by the two arrays. An alternative would be to use the contributions from both microphone arrays for the extrapolation to all four sides. Another option could be to use a circular array (see [Hulsebos et al., 2001]).
Chapter 6
Perceptual comparison of original and auralized impulse responses
6.1 Introduction
We have seen that the auralization system functions physically well for a complex signal and we have drawn some conclusions, but it is useful to add a perceptual aspect to this.
A listener uses perceptual (subjective) criteria like spaciousness (spatial impression), loudness, warmth and coloration to describe the acoustics of a hall. Usually
the direct sound and its reflections can not be heard separately; they are integrated
by the human auditory system into an overall sound impression: the acoustical
perception of the hall. However, the impulse response can be divided into three
parts contributing to different aspects of the acoustical perception.
1. Primary sound: the direct part of the sound field (non reflected) and the very
early reflections arriving within 20 ms after the direct sound. The direct
sound helps for the localization of the source. The energy of the very early
reflections contributes to the reinforcement of the direct sound.
2. Early reflections: the reflections arriving at the receiver between 20 and 80
to 100 ms after the direct sound contribute to the loudness and the clarity of
sound and the perceived apparent source width.
3. Reverberation: reflections reaching the listener later than 100 ms after the direct sound create the reverberant field. The reverberation is related to subjective parameters like warmth, brilliance and envelopment.
Fig. 6.1   Impulse response measured in the Concertgebouw at a receiver position, divided into three parts: (1) the primary sound, (2) the early reflections and (3) the reverberation.
We will define the perceptual cues which will be discussed in the rest of the chapter.
Spaciousness: can be divided into two effects. The first is the Apparent Source Width (ASW), which describes how the apparent size of the sound source seems broader when music is performed in a concert hall than the visual width of the
actual source. Especially early lateral reflections from side walls contribute to this
image broadening.
The second effect is the envelopment which describes how much the listener feels
himself enveloped by the sound in a hall. Late lateral reflections (reverberant
sound) contribute to this envelopment.
Coloration: due to interference between direct sound and early reflections and
due to frequency dependent reflection of sound by e.g. walls, some frequency
components are being amplified while others are attenuated. This results in a
change in the coloration of the sound. Strong coloration is usually very undesirable.
6.2 KEMAR head
The artificial head and ear of the dummy-head KEMAR are made to resemble
the human ones and to simulate the distortions of the sound fields by the human
head and ear. The KEMAR head allows recording of a binaural signal using two
transducers placed inside the ear canal of each ear. Ideally the recording should be done within the ears of the listener himself, to account for the particular morphology of his auditory system. Also, for an optimal result, the signals should be played back at the same location where they were recorded, i.e. inside the ear canal.
However, in practice the playback is done using headphones at the entry of the
ear canal. The ear canals of KEMAR add coloration to the signals. If a listener
listens to a recording made by KEMAR the spectrum is distorted twice, once by
KEMAR’s ear canal and once by his own. The distortion by his own ear canal is
natural, but the distortions caused by KEMAR’s ear canal should be removed. A
filter was used to compensate for this distortion.
Figure 6.2 shows an example of a binaural recording in the Concertgebouw with
the KEMAR head. The reflections coming from the left wall are recorded by the
left ear while hardly reaching the right ear and vice-versa.
We will perceptually compare the original signal measured in the concert hall with the signal measured in the reproduction room, by looking at two perceptual criteria: coloration and spatial impression.
6.3 Spatial impression - Apparent Source Width
The notion of apparent source width (ASW) emanates from the image broadening
of the source signal received by a listener. It is caused by the difference between
the left and the right ear signals. The ASW increases if the signals are less correlated (i.e. more different from each other), and for more strongly correlated
signals, the ASW decreases.
A measure to quantify ASW is the 'interaural cross-correlation coefficient' (IACC). The IACC is defined as the maximum of the absolute value of the normalized interaural cross-correlation function in the delay range $|\tau| \le 1$ ms:

$$\mathrm{IACC} = \max_{|\tau| \le 1\,\mathrm{ms}} |\phi_{lr}(\tau)| \qquad (6.1)$$
Fig. 6.2   Binaural impulse responses measured with the KEMAR head in the Concertgebouw (left ear and right ear).
where

$$\phi_{lr}(\tau) = \frac{\displaystyle\int_{-\infty}^{+\infty} p_l(t)\, p_r(t+\tau)\, dt}{\sqrt{\displaystyle\int_{-\infty}^{+\infty} |p_l(t)|^2\, dt \int_{-\infty}^{+\infty} |p_r(t)|^2\, dt}} \qquad (6.2)$$
In this formula, the numerator is the cross-correlation function and the denominator performs a normalisation by the total energy of the two signals.
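For discrete ear signals, equations 6.1 and 6.2 can be evaluated as in the sketch below (Python, hypothetical names; the windowing described next is omitted here).

```python
import numpy as np

def iacc(pl, pr, fs, max_lag_ms=1.0):
    """Interaural cross-correlation coefficient: maximum of the normalised
    cross-correlation of the two ear signals within |tau| <= 1 ms."""
    max_lag = int(round(max_lag_ms * 1e-3 * fs))
    norm = np.sqrt(np.sum(pl ** 2) * np.sum(pr ** 2))
    full = np.correlate(pl, pr, mode="full")     # index len(pl)-1 is tau = 0
    mid = len(pl) - 1
    window = full[mid - max_lag: mid + max_lag + 1]
    return np.max(np.abs(window)) / norm
```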
Two filters were applied before the calculation of the IACC: a time window and a
frequency window.
Time window
Since only early reflections contribute to the apparent size of the source, only the
first 80 ms after the direct sound are taken into account for the left and right ear
signals. We therefore apply a time window as defined in [De Vries et al.]:
$$w_{80\,\mathrm{ms}}(t) = \begin{cases} 1, & 0 < t < 60\ \mathrm{ms} \\[4pt] \cos^2\!\left(\dfrac{\pi\,(t-60)}{80}\right), & 60 < t < 100\ \mathrm{ms} \\[4pt] 0, & t > 100\ \mathrm{ms} \end{cases}$$

This time window is plotted in figure 6.3(a).
Frequency window
The frequency window used is defined as follows:

$$W(f) = \begin{cases} e^{-(f/300 - 2)^2}, & f < 600\ \mathrm{Hz} \\[4pt] e^{-(f/600 - 1)^2}, & f \ge 600\ \mathrm{Hz} \end{cases}$$

It takes into account only the dominant frequency components that contribute to the ASW, according to [Raatgever, 1980].
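A small sketch of both windows as reconstructed above (Python; the exact shape of the cosine roll-off is an assumption):

```python
import numpy as np

def time_window_80ms(t_ms):
    """Time window: unity up to 60 ms, cosine-squared roll-off to zero
    between 60 and 100 ms (t_ms in milliseconds)."""
    t_ms = np.asarray(t_ms, dtype=float)
    roll_off = np.cos(np.pi * (t_ms - 60) / 80) ** 2
    return np.where(t_ms < 60, 1.0, np.where(t_ms < 100, roll_off, 0.0))

def freq_window(f_hz):
    """Frequency window emphasising the components dominant for ASW."""
    f = np.asarray(f_hz, dtype=float)
    return np.where(f < 600, np.exp(-(f / 300 - 2) ** 2),
                             np.exp(-(f / 600 - 1) ** 2))
```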
Fig. 6.3   (a) Time window and (b) frequency window used for the calculation of the IACC.
Using the binaural signals measured in the Concertgebouw, the IACC can be plotted (figure 6.4(a)), after having applied the time and frequency window filters.
We see (strong) fluctuations in the IACC values due to interference of wave components. Note the strong correlation between the left and right ear signals at the
center position.
Figure 6.4(b) shows an enlargement of part of figure 6.4(a), with the IACC values added for the auralized version of the Concertgebouw wave field at the three
Fig. 6.4   IACC as a function of dummy head offset in the Concertgebouw in Amsterdam and for its auralized version: (a) IACC as a function of dummy head offset in the Concertgebouw; (b) enlargement, with the IACC for the auralized version (reproduction room) at 3 different positions.
positions where the auralization has been performed. We observe that the measured IACC shows the same fluctuation pattern as the original one, but that the
absolute amplitudes are slightly different. For example, the measured IACC at the
center of the room stays below 0.6, while the original goes up to 0.9. A reason
for this could be the acoustical asymmetry of the reproduction room which causes
the left and right reflections to differ. Also [Verheijen, 1997] mentions a decrease
in IACC as a result of interference of aliased waves with the actual wave field
causing phase distortion between the signals of the two ears.
6.4 Coloration
The sound produced by a source on the stage of a concert hall is reflected at the walls, where a part of the sound energy is absorbed. Due to interference between the direct sound and the early reflections (comb-filter effect), and due to the frequency-dependent absorption, some frequencies of the signal will be intensified and others attenuated. The perception of the spectral distortion of the signal due to these effects is called coloration. In the free field, i.e. where there are no reflections, the sound is perceived coloration-free.
Figure 6.5 shows the direct sound and a reflection at two different listener’s positions. A and B are positioned at the same distance from the source. For position
A, the path difference between direct sound and reflection is smaller than the path
difference at position B. The coloration at position A will therefore be different
from the coloration in B.
Fig. 6.5   The paths of the direct sound and a reflection are different for each position, leading to coloration of the sound field.
Hence each position in a concert hall has a different coloration.
After auralization, the frequency content of the original signal is also distorted by
the recording and the reproduction (microphones, amplifiers, loudspeakers), and
by reflections in the reproduction room. Hence, the auralized field will have a dif-
ferent coloration than the original field. Figure 6.6 shows the frequency spectrum
of an impulse response measured in the Concertgebouw and an auralized impulse
response measured in the reproduction room at the corresponding position.
Fig. 6.6   Spectra of (a) an impulse response measured in the Concertgebouw (original signal) and (b) an impulse response measured in the reproduction room (auralized signal), on a semilog scale.
Removal of coloration using Patterson filters
We now consider the auditory system and especially the cochlea which transforms pressure variations into neural impulses traveling in the auditory nerve to
the brain. Each auditory nerve fiber responds over a certain range of frequencies.
Auditory filters have been modelled in [Patterson et al., 1986] by a set of parallel
band-pass filters with increasing center frequency fc . The shape of these filters is
described by:
$$h_{f_c}(f) = \sqrt{1 + \frac{4\,|f - f_c|}{W(f_c)}}\;\, e^{-\frac{2\,|f - f_c|}{W(f_c)}} \qquad (6.3)$$

where $f$ is the frequency, $f_c$ the center frequency and $W(f_c)$ the bandwidth of the auditory filter. According to Patterson, the relation between center frequency and bandwidth is given by:

$$W(f_c) = (6.23\, f_c^2 + 93.39\, f_c + 28.52) \cdot 10^{-3} \qquad (W \text{ and } f_c \text{ in kHz}) \qquad (6.4)$$
It is sufficient to use 24 such bands to cover the audible range (the so called critical
bands) because the ear can not distinguish between frequencies within one such
band. A representation of a set of Patterson auditory filters is shown in figure 6.7.
Fig. 6.7   Normalised Patterson auditory filters in the frequency domain.
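Equations 6.3 and 6.4, as reconstructed above, translate directly into a small filter-bank sketch (Python, hypothetical names; the filter shape follows the form given in equation 6.3).

```python
import numpy as np

def patterson_bandwidth(fc_khz):
    """Critical bandwidth W(fc) of equation 6.4 (fc and W in kHz)."""
    return (6.23 * fc_khz ** 2 + 93.39 * fc_khz + 28.52) * 1e-3

def patterson_filter(f_khz, fc_khz):
    """Amplitude response of one Patterson auditory filter (equation 6.3)."""
    g = np.abs(np.asarray(f_khz, dtype=float) - fc_khz) / patterson_bandwidth(fc_khz)
    return np.sqrt(1 + 4 * g) * np.exp(-2 * g)
```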
In order to remove the difference in coloration between the original and the auralized impulse responses, we have applied a set of Patterson filters to the power
spectra of both the original and the auralized signals. This gives a measure of
the energy of both signals in each frequency band. We then multiply each of the
frequency bands of the auralized signal with the ratio of (the square root of) those
energies. This procedure is repeated several times, until both signals have (nearly)
the same energy content in all frequency bands. The spectra of energy in the Patterson bands for the two signals are shown in figure 6.8(a). Figures (b) and (c)
show the results after applying the energy equalization a few times. We see that
the spectral differences are gradually removed, resulting in decreasing coloration
difference. This allows a better comparison of the spaciousness of the original and
the auralized signals.
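As a sketch of this equalization loop (Python, reusing the hypothetical patterson_filter above; the band definition and the number of iterations are assumptions, not the exact procedure used):

```python
import numpy as np

def equalize_coloration(orig_mag, aur_mag, f_khz, fc_list_khz, n_iter=4):
    """Iteratively scale the magnitude spectrum of the auralized signal so
    that its energy per Patterson band approaches that of the original."""
    aur = aur_mag.copy()
    for _ in range(n_iter):
        for fc in fc_list_khz:
            h2 = patterson_filter(f_khz, fc) ** 2        # power weighting of the band
            e_orig = np.sum(h2 * orig_mag ** 2)
            e_aur = np.sum(h2 * aur ** 2) + 1e-12
            band = h2 > 0.5 * h2.max()                   # crude definition of "the band"
            aur[band] *= np.sqrt(e_orig / e_aur)
    return aur
```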
6.5 Experiments
Two different tests will be described in this section: one to test the effect of the
minimization of coloration and the other to compare the ASW. Subsection 6.5.1
will describe the experiments and in subsection 6.5.2 the results will be given and
discussed.
Fig. 6.8   Spectra of the energy in the Patterson bands for the original signal and the auralized signal: (a) original and auralized, (b) original and auralized after 1 and 2 equalization treatments, (c) original and auralized after 3 and 4 equalization treatments.
The perceptual aspects have only been judged on the impulse responses of the Concertgebouw, not on those of De Doelen. The reason for this is that no measurements with the KEMAR head have been done in De Doelen, so a comparison of the auralized impulse responses with the original signal was not easily possible.
6.5.1 Description of the experiments
Experiment 1
Experiment 1 has been done in order to find out whether the added coloration of
the auralized signal has been perceptually removed with respect to the original
signal. Three signals were presented to the listener. The second signal was the
original signal. The first and last signals were the auralized signal and the auralized signal with the coloration minimized, or the other way around. The listener
was asked which of the first and last signals resembled the second most closely.
Each signal had a duration of 5 seconds.
Experiment 2
This experiment has been done in order to find out whether any difference in
apparent source width could be heard between the original signal and the auralized
signal with the coloration minimized. These two signals were presented three
times with a short pause in between. Each signal had a duration of 3 seconds. The
listener was asked which of the two signals sounded broader. The listener was not
obliged to make a choice if he did not hear any difference.
All experiments were done using both the first 160 ms and the first second of
the impulse responses of the Concertgebouw. Further, the experiments were performed twice: once using the impulse responses convolved with white noise, the
other time after convolution with a piece of music (a cello recorded in an anechoic
room). The experiments were repeated for three different positions in the hall, P1
in the center of the room, P2 approximately half way between the center and the
right wall, and P3 near the right wall, all along array I. This makes for a total of
12 different signals for the tests.
In order to correct for the ear canal of the KEMAR head recordings, a real time
hardware filter was used. The experiments lasted around 20 minutes. Six persons, who were not experienced listeners, did the test. The subjects could repeat the signals as often as they wanted.
6.5.2 Results
Observations from the coloration experiment
In this test, the subjects were asked to compare the 2 auralized versions with the
original one for each of the 12 signals. Each time the signal with the coloration
minimized was assessed to resemble the original closest, with two exceptions.
Two subjects have once chosen the untreated auralized signal as closest to the
original one, in both cases for the 160 ms impulse response at position P2, for
one person when convolved with white noise, for the other when convolved with
music. However in general, we can conclude that the subjects can clearly hear the
difference and that the impulse response with coloration minimized resembles the
original most.
Observations from the Apparent Source Width experiment
In the second experiment, for each of the four cases (music and noise, on 160 ms
and 1s of the impulse response) the subjects were presented twice with the signals
for the 3 positions, in random order. The order of the auralized and the original
signal was not necessarily the same in both cases. Each time the subject had to
choose the broader of the two signals. The fact that each signal was judged twice
was used to verify whether the choices were made consistently.
In the following tables the results of the ASW experiment are summarized. Table
6.1 gives the results for the signals created with the first 160 ms, and table 6.2 for
those created with 1 second. The meaning of the symbols is:
A: the subject has chosen both times the auralized signal as the broader,
O: the subject has chosen both times the original signal as the broader,
?: the subject has chosen one time the auralized and the other the original signal,
-: when no choice was made at least one of the two times the signal was presented.
Table 6.1   Results of the ASW test for the signals created with the first 160 ms of the impulse responses (one entry per subject, for positions P1, P2 and P3, each convolved with noise and with music; entries A, O, ? or -).
Table 6.2   Results of the ASW test for the signals created with the first second of the impulse responses (one entry per subject, for positions P1, P2 and P3, each convolved with noise and with music; entries A, O, ? or -).
The first thing we observe is that there is a lot of variation in choice, although
there seems to be a slight tendency towards the auralized signal.
In both tables, the choice for a signal convolved with noise and for the same signal convolved with music is often not the same; this indicates that the perception of ASW depends on the type of sound that is heard.
Listeners did not find it easier to judge the ASW on a music signal than on a noise signal, since in both cases there are about as many inconsistencies ('?') and blanks ('-').
Further we observe that within each column there is quite a lot of variation, so
different persons make different choices for the same signal.
Finally, the choice made with a signal of 160 ms and that of 1 s is often not
the same. So the addition of a reverberant part may influence the perception of
the signal. This may be even more the case since their origins are different: the
reverberant part of the auralized signal was artificially created using plane waves.
An exception to these observations is position P1 for which the subjects have
mostly chosen the auralized signal as the broader one, regardless of the type of
signal used.
6.5.3 Conclusion
The aim of these experiments was not to do a thorough perceptual analysis but rather to get a global impression. Therefore no hard conclusions can be given, but some trends can be indicated. For more reliable statistics, subjects should listen to each signal more than twice.
We can infer from the results of experiment 1 that the decoloration helps getting closer to the original signal. This means that if we are able to identify the sources of the coloration and compensate for them, we can hope to achieve a better reproduction.
It appears from experiment 2 that for position P1 the difference in ASW is significant: the auralized signal is generally perceived as broader than the original, while at the other positions the subjects often made inconsistent choices, or no choice at all. This indicates that the differences there are small. We have seen in section 6.3 that aliasing can explain the broader auralized signals.
Chapter 7
Conclusions and recommendations
7.1 Conclusions
This new concept of auralization allows the reconstruction of a close copy of the original sound field. The listener is not constrained to listening to music over headphones, but can move within the entire reproduction area.
We have seen that:
- we can record the acoustical properties of the wave field we want to reproduce by using a microphone array,
- to get a good separation between waves coming from different directions, we need a steering cardioid, requiring pressure and particle velocity measurements,
- for a full 2D coverage of the wave field, a cross array can be used,
- the separated signals are extrapolated to the loudspeaker array positions using the 2D Rayleigh integrals, where they are used to drive the loudspeakers,
- the reproduced wave field resembles the original except for the artefacts due to aliasing and the effects of the reproduction room,
- we cannot extrapolate the signals to the whole area covered by the measurements,
- there is a noticeable difference in coloration between auralized and original signals,
- at P1, the ASW seems to be significantly broader for the auralized signal, most likely due to aliasing; a conclusion cannot be drawn for the two other positions.
7.2 Recommendations
- to improve the auralization system, aliasing should be reduced by increasing the number of loudspeakers in use,
- to reduce the effect of the reproduction room, absorbing materials should be applied on walls and other obstacles. Another way would be to try to cancel the reflections by emitting signals in counter phase through the loudspeakers.
In recent research in the Laboratory, a number of approaches have been developed that could improve the quality of the auralization. In [Hulsebos et al., 2001], the following is pointed out:
- we are using a 2D formula for the extrapolation, which supposes that the wave amplitude decreases with $1/\sqrt{r}$, while in reality the waves emitted by the loudspeakers decrease with $1/r$ because they are emitted in 3D; a distance-dependent amplitude correction should be applied,
- the area where the extrapolation is possible can be increased by using a weighted sum of the contributions of both microphone arrays, using the Kirchhoff-Helmholtz integrals to create a directivity pattern,
- finally, using a circular array instead of a cross array could reduce the artefacts.
Bibliography
Baan, J. (1997). Array technology for acoustic wavefield analysis in enclosed
spaces, Thesis, Delft University of Technology.
Berkhout, A.J. (1987). Applied seismic wave theory, Elsevier, Amsterdam.
Hulsebos, E.M. (1999). Fluctuations in measures for spaciousness, Thesis, Delft
University of Technology.
Hulsebos, E.M., de Vries, D. and Bourdillat, E. (2001). Improved microphone
array configurations for auralization of sound fields by Wave Field Synthesis,
Preprint of 110th AES Convention Amsterdam.
Patterson, R.D. and Moore, B.C.J. (1986). Auditory filters and excitation patterns
as representations of frequency resolution, Academic Press, London.
Raatgever, J. (1980). On the binaural processing of stimuli with different interaural phase relations, Thesis, Delft University of Technology.
Sonke, J.J. (2000). Variable Acoustics by wave field synthesis, Ph. D. Thesis,
Delft University of Technology.
Start, E.W. (1997). Direct sound enhancement by wave field synthesis, Ph.D.
Thesis, Delft University of Technology.
Verheijen, E.N.G. (1997). Sound reproduction by wave field synthesis, Ph.D.
Thesis, Delft University of Technology.
de Vries, D., Berkhout, A.J. and Sonke, J.J. (May 1996). Array technology for
measurement and analysis of sound fields in enclosures, Preprint of 100th AES
Convention Copenhagen.
de Vries, D., and Baan, J. (May 1996). Auralization of sound fields by wave field
synthesis, Preprint of 106th AES Convention Munich.
Vogel, P. (1993). Application of wave field synthesis in room acoustics, Ph. D.
Thesis, Delft University of Technology.