Digital Audio Signal Processing Lecture-2: Microphone Array Processing - Fixed Beamforming Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be homes.esat.kuleuven.be/~moonen/ Overview • Introduction & beamforming basics – Data model & definitions • Fixed beamforming – Filter-and-sum beamformer design – Matched filtering • White noise gain maximization • Ex: Delay-and-sum beamforming – Superdirective beamforming • Directivity maximization – Directional microphones (delay-and-subtract) • Lecture-3: Adaptive Beamforming Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 2 1 Introduction • Directivity pattern of a microphone – A microphone is characterized by a `directivity pattern’ which specifies the gain (& phase shift) that the microphone gives to a signal coming from a certain direction (i.e. `angle-of-arrival’) – In general the directivity pattern is also a function of frequency (ω) |H(ω,θ)| for 1 frequency – In a 3D scenario `angle-of-arrival’ is azimuth + elevation angle – Will consider only 2D scenarios for simplicity, with one angle-of arrival (θ), hence directivity pattern is H(ω,θ) – Directivity pattern is fixed and defined by physical microphone design Digital Audio Signal Processing Version 2014-2015 p. 3 Lecture-2: Microphone Array Processing Introduction • By weighting or filtering (=freq.dependent weighting) and then summing signals from different microphones, a (software controlled) `virtual’ directivity pattern (=weigthed sum of individual patterns) can be produced F1 (ω ) z[k] + F2 (ω ) y1[k] 1 y2 [k] 0.5 : 0 0 FM (ω ) yM [k] N−1 Fm (ω ) = ∑ fm,n .e− jnω n=0 Fr e 1000 que ncy 2000 ( Hz 3000 ) 0 45 90 135 180 Angle (deg) M H virtual (ω ,θ ) = ∑ Fm (ω ).H m (ω ,θ ) m =1 This assumes all microphones receive the same signals (so are all in the same position). However… Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 4 2 Introduction • However, an additional aspect is that in a microphone array different microphones are in different positions/locations, hence also receive different signals θ • Example : uniform linear array i.e. microphones placed on a line & uniform inter-micr. distances (d) & ideal micr. characteristics (p.8) z[k] Version 2014-2015 + F2 (ω ) : dM FM (ω ) For a far-field source signal (plane waveforms), each microphone receives the same signal, up to an angle-dependent delay fs=sampling rate c=propagation speed Digital Audio Signal Processing y1[k] F1 (ω ) τ m (θ ) = dM cosθ d m cosθ fs c ym [k ] = y1[k + τ m ] d m = (m − 1)d M H virtual (ω, θ ) = ∑ Fm (ω ).e− jωτ m (θ ) m=1 p. 5 Lecture-2: Microphone Array Processing Introduction • Beamforming = `spatial filtering’ based on microphone characteristics (directivity patterns) AND microphone array configuration (`spatial sampling’) z[k] + θ F1 (ω ) y1[k] F2 (ω ) : dM FM (ω ) Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing dM cosθ p. 6 3 Introduction • Background/history: ideas borrowed from antenna array design/ processing for RADAR & (later) wireless communications. • Microphone array processing considerably more difficult than antenna array processing: – narrowband radio signals versus broadband audio signals – far-field (plane wavefronts) versus near-field (spherical wavefronts) – pure-delay environment versus multi-path environment • Classification: – fixed beamforming: data-independent, fixed filters Fm e.g. delay-and-sum, filter-and-sum – adaptive beamforming: data-dependent adaptive filters Fm e.g. LCMV-beamformer, Generalized Sidelobe canceler • Applications: voice controlled systems (e.g. Xbox Kinect), speech communication systems, hearing aids,… Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 7 Beamforming basics Data model: source signal in far-field (see p.12 for near-field) • Microphone signals are filtered versions of source signal S(ω) at angle θ pos.-dep. phase shift dir. pattern !#! " $ ! #! " $ Ym (ω , θ ) = H m (ω ,θ ). e − jωτ m (θ ) . S (ω ) • Stack all microphone signals in a vector Y(ω ,θ ) = d(ω ,θ ).S (ω ) [ d(ω ,θ ) = H1 (ω ,θ ).e − jωτ1 (θ ) ... H M (ω ,θ ).e − jωτ M (θ ) ] T d is `steering vector’ • Output signal after `filter-and-sum’ is H instead of T for convenience (**) M Z (ω ,θ ) = ∑ Fm* (ω ).Ym (ω ,θ ) = F H (ω ).Y(ω ,θ ) = {F H (ω ).d(ω ,θ )}.S (ω ) m =1 Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 8 4 Beamforming basics Data Model: source signal in far-field Y(ω ,θ ) = d(ω ,θ ).S (ω ) • If all microphones have the same directivity pattern Ho(ω,θ), steering vector can be factored as… [ ] T d(ω ,θ ) = H 0 (ω ,θ ). 1 e − jωτ 2 (θ ) ... e − jωτ M (θ ) !#!!!!! " $ !#! " $!!!! dir. pattern spatial positions microphone-1 is used as a reference (=arbitrary) • Will often consider arrays with ideal omni-directional microphones : Ho(ω,θ)=1 Example : uniform linear array, see p.5 Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 9 Beamforming basics Definitions (1) : • In a linear array (p.5) : θ =90o=broadside direction θ = 0o =end-fire direction • Array directivity pattern (compare to p.3) = `transfer function’ for source at angle θ ( -π<θ< π ) H (ω,θ ) = Z (ω,θ ) = F H (ω ).d(ω ,θ ) S (ω ) • Steering direction = angle θ with maximum amplification (for 1 freq.) θ max (ω ) = arg maxθ H (ω,θ ) • Beamwidth (BW) = region around θmax with (e.g.) amplification > -3dB (for 1 freq.) Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 10 5 Beamforming basics Data model: source signal + noise • Microphone signals are corrupted by additive noise Y(ω ,θ ) = d(ω ,θ ).S (ω ) + N(ω ) T N(ω ) = [N1 (ω ) N 2 (ω ) ... N M (ω )] • Define noise correlation matrix as Φ noise (ω ) = E{N(ω ).N(ω ) H } • Will assume noise field is homogeneous, i.e. all diagonal elements of noise correlation matrix are equal : Φii (ω ) = Φnoise (ω ) , • Then noise coherence matrix is Digital Audio Signal Processing Version 2014-2015 ∀i ⎡1 .. ..⎤ 1 Γ noise (ω ) = .Φ noise (ω ) = ⎢⎢.. ! ..⎥⎥ φnoise (ω ) ⎢⎣.. ..p. 11 1⎥⎦ Lecture-2: Microphone Array Processing Beamforming basics Definitions (2) : • Array Gain =improvement in SNR for source at angle θ ( -π<θ< π ) G(ω ,θ ) = SNRoutput SNRinput = F H (ω ).d(ω ,θ ) 2 F H (ω ).Γ noise (ω ).F(ω ) |signal transfer function|^2 |noise transfer function|^2 • White Noise Gain =array gain for spatially uncorrelated noise (=white) WNG (ω ,θ ) = F H (ω ).d(ω , θ ) 2 F H (ω ).F(ω ) white Γnoise =I (e.g. sensor noise) ps: often used as a measure for robustness • Directivity =array gain for diffuse noise (=coming from all directions) DI (ω , θ ) = F H (ω ).d(ω ,θ ) H diffuse noise F (ω ).Γ 2 .F (ω ) ⎛ ω f s (d j − di ) ⎞ ⎟⎟ Γijdiffuse (ω ) = sinc⎜⎜ c ⎝ ⎠ (ignore this formula) DI and WNG evaluated at θmax is often used as a performance criterion Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 12 6 PS: Near-field beamforming • Far-field assumptions not valid for sources close to microphone array – spherical wavefronts instead of planar waveforms – include attenuation of signals – 2 coordinates θ,r (=position q) instead of 1 coordinate θ (in 2D case) • Different steering vector d(ω ,θ ) [ d(ω , q) = a1e − jωτ1 (q ) e e=1 (3D)…2 (2D) (e.g. with Hm(ω,θ)=1 m=1..M) am = q − p ref q − pm τ m (q) = : a2e − jωτ 2 (q ) … aM e − jωτ M (q ) q − p ref − q − p m c ] T fs with q position of source pref position of reference microphone pm position of mth microphone Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 13 PS: Multipath propagation • In a multipath scenario, acoustic waves are reflected against walls, objects, etc.. • Every reflection may be treated as a separate source (near-field or far-field) • A more practical data model is: Y(ω , q) = d(ω , q).S (ω ) + N(ω ) T d(ω , q) = [H1 (ω , q) H 2 (ω , q) ... H M (ω , q)] with q position of source and Hm(ω,q), complete transfer function from source position to m-the microphone (incl. micr. characteristic, position, and multipath propagation) `Beamforming’ aspect vanishes here, see also Lecture-3 (`multi-channel noise reduction’) Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 14 7 Overview • Introduction & beamforming basics – Data model & definitions • Fixed beamforming – Filter-and-sum beamformer design – Matched filtering • White noise gain maximization • Ex: Delay-and-sum beamforming – Superdirective beamforming • Directivity maximization – Directional microphones (delay-and-subtract) • Lecture-3: Adaptive beamforming Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 15 Filter-and-sum beamformer design • Basic: procedure based on H (ω , θ ) = page 9 Z (ω ,θ ) = F H (ω ).d(ω , θ ) S (ω ) N −1 T F(ω ) = [F1 (ω ) ... FM (ω )] , Fm (ω ) = ∑ f m,n .e − jnω n =0 Array directivity pattern to be matched to given (desired) pattern H d (ω,θ ) over frequency/angle range of interest • Non-linear optimization for FIR filter design (=ignore phase response) 2 min f m ,n ,m =1.. M ,n =0.. N −1 ∫ ∫ ( H (ω , θ ) − H d (ω , θ ) ) dω dθ θ ω • Quadratic optimization for FIR filter design (=co-design phase response) 2 min f m ,n ,m=1.. M ,n =0.. N −1 ∫ ∫ H (ω , θ ) − H d (ω ,θ ) dω dθ θ ω Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 16 8 Filter-and-sum beamformer design • Quadratic optimization for FIR filter design (continued) Kronecker product 2 min f m ,n ,m=1.. M ,n=0.. N −1 ∫ ∫ H (ω,θ ) − H d (ω,θ ) dω dθ θ ω f m = [ f m,0 ... f m, N −1 ] T [ , f = f1T ... f MT ] T NMxM '!! ! &!!! % ⎡ e 0. jω ⎤ ⎢ ⎥ H (With ω ,θ ) = F H (ω ).d(ω ,θ ) = F1H (ω ) ... FMH (ω ) .d(ω ,θ ) = f T .( I MxM ⊗ ⎢ ( ⎥ ).d(ω ,θ ) ⎢e ( N −1). jω ⎥ ⎣ !#!! ⎦ !!" $!!! [ ] d (ω ,θ ) optimal solution is foptimal = Q −1. p , Q = ∫ ∫ d(ω,θ ).d θ ω Digital Audio Signal Processing H (ω,θ ) dω dθ , p = ∫ ∫ d(ω,θ ).H θ ω Version 2014-2015 * d (ω,θ ) dω dθ Lecture-2: Microphone Array Processing p. 17 Filter-and-sum beamformer design • Design example M=8 Logarithmic array N=50 fs=8 kHz 1 0.5 0 0 Fre 1000 que 2000 ncy (H 3000 z) Digital Audio Signal Processing 0 Version 2014-2015 45 90 135 180 Angle (deg) Lecture-2: Microphone Array Processing p. 18 9 Overview • Introduction & beamforming basics – Data model & definitions • Fixed beamforming – Filter-and-sum beamformer design – Matched filtering • White noise gain maximization • Ex: Delay-and-sum beamforming – Superdirective beamforming • Directivity maximization – Directional microphones (delay-and-subtract) • Lecture-3: Adaptive beamforming Digital Audio Signal Processing Version 2014-2015 p. 19 Lecture-2: Microphone Array Processing Matched filtering : WNG maximization • Basic: procedure based on page 11 • Maximize White Noise Gain (WNG) for given steering angle ψ MF F (ω ) = arg{maxF (ω ) WNG(ω ,ψ )} = arg{maxF (ω ) F H (ω ).d(ω ,ψ ) F H (ω ).F(ω ) 2 } • A priori knowledge/assumptions: – angle-of-arrival ψ of desired signal + corresponding steering vector – noise scenario = white Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 20 10 Matched filtering : WNG maximization • Maximization in F MF (ω ) = arg{maxF (ω ) F H (ω ).d(ω ,ψ ) F H (ω ).F(ω ) 2 } is equivalent to minimization of noise output power (under white input noise), subject to unit response for steering angle (**) minF (ω ) F H (ω).F(ω), s.t. F H (ω).d(ω,ψ ) = 1 • Optimal solution (`matched filter’) is F MF (ω ) = 1 d(ω ,ψ ) 2 .d(ω ,ψ ) • [FIR approximation] 2 min f m ,n ,m=1.. M ,n=0.. N −1 ∫ F MF (ω ) − F(ω ) dω ω Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 21 Matched filtering example: Delay-and-sum • Basic: Microphone signals are delayed and then summed together 1 M z[kH ](ω=,ψ ) = 1 .∑ ym [k + Δ m ] M m =1 ψ Δ1 1 M Σ Δ2 d Fm (ω ) = d Δm e − jωΔ m M ( m − 1) d cosψ • Fractional delays implemented with truncated interpolation filters (=FIR) • Consider array with ideal omni-directional micr’s Then array can be steered to angle ψ : d(ω ,ψ ) T F(ω ) = Δ m = τ m (ψ ) d(ω ,ψ ) = 1 e − jωτ 2 (ψ ) ... e − jωτ M (ψ ) M [ ] Hence (for ideal omni-dir. micr.’s) this is matched filter solution Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 22 11 Matched filtering example: Delay-and-sum ideal omni-dir. micr.’s • Array directivity pattern H(ω,θ): 1 H d (ω ,ψ ).d(ω , θ ) M H (ω , θ ) ≤ 1 H (ω , θ ) = H (ω , θ = ψ ) = 1 =destructive interference =constructive interference (ψ = θ max ) • White noise gain : WNG (ω ,θ = ψ ) = F H (ω ).d(ω ,ψ ) F H (ω ).F(ω ) 2 = .. = M (independent of ω) For ideal omni-dir. micr. array, delay-and-sum beamformer provides WNG equal to M for all freqs. (in the direction of steering angle ψ). Digital Audio Signal Processing Version 2014-2015 p. 23 Lecture-2: Microphone Array Processing Matched filtering example: Delay-and-sum ideal omni-dir. micr.’s • Array directivity pattern H(ω,θ) for uniform linear array: H (ω ,θ ) = M ∑e − j ( m −1)ω m =1 − jMγ / 2 = e M=5 microphones d=3 cm inter-microphone distance ψ=60° steering angle fs=16 kHz sampling frequency d (cos θ − cosψ ) fs c sin( Mγ / 2) e − jγ / 2 sin(γ / 2) γ H(ω,θ) has sinc-like shape and is frequency-dependent H (ω ,θ = ψ ) = 1 Spatial directivity pattern for f=5000 Hz 90 0 1 -10 0.8 0.6 -20 180 0.4 0 =endfire 0.2 0 2000 4000 Frequency (Hz) 6000 8000 0 Digital Audio Signal Processing wavelength=4cm 45 135 90 Angle (deg) 180 ψ=60° Version 2014-2015 270 Lecture-2: Microphone Array Processing p. 24 12 Matched filtering example: Delay-and-sum ideal omni-dir. micr.’s f ≥ For c d .(1 + cosψ ) an ambiguity, called spatial aliasing, occurs. This is analogous to time-domain aliasing where now the spatial sampling (=d) is too large. c c λ d≤ = = min Aliasing does not occur (for any ψ) if f s 2. f max 2 M=5, ψ=60°, fs=16 kHz, d=8 cm 1 0.8 0.6 0.4 0.2 f = c d .(1 + cosψ ) Details... H (ω ,θ ) = 1 iff γ = 2π.p for integer p 1) γ = 0 for θ = ψ (for all ω ) 0 y enc qu Fre 2000 6000 z) (H c d .(1 + cosψ ) π c 3) if ψ ≥ then γ = 2π occurs for θ = 0 and f = .. = 2 d .(1 − cosψ ) 2) if ψ ≤ 4000 50 8000 0 Digital Audio Signal Processing 100 150 π 2 then γ = 2π occurs for θ = π and f = .. = Angle (deg) Version 2014-2015 Lecture-2: Microphone Array Processing p. 25 Matched filtering example: Delay-and-sum ideal omni-dir. micr.’s • Beamwidth for a uniform linear array: 96(1 −ν ) secψ ωdM BW ≈ c with e.g. ν=1/sqrt(2) (-3 dB) hence large dependence on # microphones, distance (compare p.22 & 23) and frequency (e.g. BW infinitely large at DC) • Array topologies: – Uniformly spaced arrays – Nested (logarithmic) arrays (small d for high ω, large d for small ω) – 2D- (planar) / 3D-arrays 4d d 2d Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 26 13 Overview • Introduction & beamforming basics – Data model & definitions • Fixed beamforming – Filter-and-sum beamformer design – Matched filtering • White noise gain maximization • Ex: Delay-and-sum beamforming – Superdirective beamforming • Directivity maximization – Directional microphones (delay-and-subtract) • Lecture-3: Adaptive beamforming Digital Audio Signal Processing Version 2014-2015 p. 27 Lecture-2: Microphone Array Processing Super-directive beamforming : DI maximization • Basic: procedure based on page 11 • Maximize Directivity (DI) for given steering angle ψ SD F (ω ) = arg{max F (ω ) DI (ω ,ψ )} = arg{max F (ω ) F H (ω ).d(ω ,ψ ) 2 diffuse F H (ω ).Γnoise .F(ω ) } • A priori knowledge/assumptions: – angle-of-arrival ψ of desired signal + corresponding steering vector – noise scenario = diffuse Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 28 14 Super-directive beamforming : DI maximization • Maximization in SD F (ω ) = arg{max F (ω ) 2 F H (ω ).d(ω ,ψ ) diffuse F H (ω ).Γnoise .F(ω ) } is equivalent to minimization of noise output power (under diffuse input noise), subject to unit response for steering angle (**) diffuse minF (ω ) F H (ω).Γnoise (ω).F(ω), s.t. F H (ω).d(ω,ψ ) = 1 • Optimal solution is F SD (ω ) = 1 diffuse (ω )}−1 .d (ω ,ψ ) d (ω ,ψ ) H .{ Γnoise diffuse .{Γnoise (ω )}−1.d(ω ,ψ ) • [FIR approximation] 2 min f m ,n ,m=1.. M ,n =0.. N −1 ∫ FSD (ω ) − F(ω ) dω ω Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 29 Super-directive beamforming : DI maximization ideal omni-dir. micr.’s • Directivity patterns for end-fire steering (ψ=0): Delay-and-sum beamformer (f=3000 Hz) Superdirective beamformer (f=3000 Hz) 90 90 0 0 -10 -10 -20 -20 180 0 180 0 270 M=5 d=3 cm fs=16 kHz 270 Superdirective beamformer has highest DI, but very poor WNG (at low frequencies, where diffuse noise coherence matrix becomes ill-conditioned) 25 Directivity (linear) 20 DI=25 15 10 DI=5 5 0 0 WNG=5 10 Superdirective Delay-and-sum 0 White noise gain (dB) M hence problems with robustness (e.g. sensor noise) ! 2 -10 -20 -30 -40 -50 2000 4000 Frequency (Hz) 6000 8000 -60 0 Superdirective Delay-and-sum 2000 4000 Frequency (Hz) 6000 PS: diffuse noise = white noise for high frequencies 8000 Digital Audio Signal Processing obtained Version Microphone Array Maximum directivity=M.M for2014-2015 end-fire steering Lecture-2: and for frequency->0 (noProcessing proof) p. 30 15 Overview • Introduction & beamforming basics – Data model & definitions • Fixed beamforming – Filter-and-sum beamformer design – Matched filtering • White noise gain maximization • Ex: Delay-and-sum beamforming – Superdirective beamforming • Directivity maximization – Directional microphones (delay-and-subtract) • Lecture-3: Adaptive beamforming Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 31 Differential microphones : Delay-and-subtract • First-order differential microphone = directional microphone 2 closely spaced microphones, where one microphone is delayed (=hardware) and whose outputs are then subtracted from each other θ Σ + _ Digital Audio Signal Processing d H (ω ,θ ) = 1 − e − jω (τ + d cosθ ) c τ Version 2014-2015 Lecture-2: Microphone Array Processing p. 32 16 Differential microphones : Delay-and-subtract • First-order differential microphone = directional microphone 2 closely spaced microphones, where one microphone is delayed (=hardware) and whose outputs are then subtracted from each other θ Σ + _ H (ω ,θ ) = 1 − e d τ − jω (τ + d cosθ ) c ωd/c <<π, ωτ <<π • Array directivity pattern: H (ω ,θ ) ≈ ω .(τ + high -pass angle dependence ! ! d d cosθ ) = ω .(τ + ). P(θ ) c c τ α1 = – First-order high-pass frequency dependence τ +d /c – P(θ) = freq.independent (!) directional response P(θ ) = α1 + (1 − α1 ) cosθ – 0 ≤ α1 ≤ 1 : P(θ) is scaled cosine, shifted up with α1 such that θmax = 0o (=end-fire) and P(θmax )=1 Digital Audio Signal Processing Version 2014-2015 p. 33 Lecture-2: Microphone Array Processing Differential microphones : Delay-and-subtract • Types: dipole, cardioid, hypercardioid, supercardioid (HJ84) Cardioid: α1= 0.5 zero at 180o DI = 4.8 dB Dipole: α1= 0 (τ=0) zero at 90o DI = 4.8 dB =broadside =broadside =endfire Digital Audio Signal Processing Version 2014-2015 =endfire Lecture-2: Microphone Array Processing p. 34 17 Differential microphones : Delay-and-subtract Hypercardioid: α1= 0.25 Supercardioid: α1= ( 3 − 1) / 2 ≈ 0.35 zero at 109o zero at 125o, DI=5.7 dB highest DI=6.0 dB highest front-to-back ratio =endfire Digital Audio Signal Processing Version 2014-2015 =endfire Lecture-2: Microphone Array Processing p. 35 Overview • Introduction & beamforming basics – Data model & definitions • Fixed beamforming – Filter-and-sum beamformer design – Matched filtering • White noise gain maximization • Ex: Delay-and-sum beamforming – Superdirective beamforming • Directivity maximization – Directional microphones (delay-and-subtract) • Lecture-3: Adaptive beamforming Digital Audio Signal Processing Version 2014-2015 Lecture-2: Microphone Array Processing p. 36 18