Automatic Position Calibration of Multiple Microphones
Vikas Chandrakant Raykar | Ramani Duraiswami
Perceptual Interfaces and Reality Lab. | University of Maryland, College Park

Motivation
• Multiple microphones are widely used for applications like source localization, tracking and beamforming.
• Most applications need to know the precise locations of the microphones.
• A small uncertainty in the sensor locations can contribute substantially to the overall localization error.
• In ad-hoc deployed arrays it is tedious and often inaccurate to measure the positions manually using a tape or a laser device.
• In this paper we describe a method to automatically determine the three dimensional positions of multiple microphones.
• Automatically fix a coordinate system.
[Figure: X, Y and Z axes]

If we know the positions of 3 speakers…
[Figure: X and Y axes; unknown position marked "?"]
• Distances are not exact.
• Need at least 3 speakers in 2D.
• Can use more speakers.
• Find the intersection in the least squares sense.

If positions of speakers unknown…
• Consider M microphones and S speakers.
• What can we measure?
• Calibration signal.
• Distance between each speaker and all microphones, or Time Of Flight (TOF).
• M x S TOF matrix.
• Assume the TOF is corrupted by Gaussian noise.
• Can derive the ML estimate.

Nonlinear Least Squares..
• More formally, can derive the ML estimate using a Gaussian noise model.
• Find the coordinates of both the microphones and the speakers which minimize the least squares error in the TOF.
[Equation: error function to be minimized; labeled term: speed of sound]

Maximum Likelihood (ML) Estimate..
• We can define a noise model and derive the ML estimate, i.e. maximize the likelihood.
[Equation labels: observation, model, parameters to be estimated, Gaussian noise]
• If the noise is Gaussian and independent, ML is the same as least squares.

Reference Coordinate System
[Figure labels: origin, X axis, positive Y axis]
Similarly in 3D:
1. Fix the origin (0,0,0)
2. Fix the X axis (x1,0,0)
3. Fix the Y axis (x2,y2,0)
4. Fix the positive Z axis
x1, x2, y2 > 0
Which to choose? Later…

Nonlinear least squares..
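A sketch of the least squares objective implied by the Gaussian noise model. The notation is assumed here for illustration and is not taken from the slides: m_i and s_j are the unknown microphone and speaker positions, c is the speed of sound, tau_ij is the measured TOF between microphone i and speaker j, sigma_ij^2 is its noise variance, and Theta collects all unknown coordinates.

```latex
\tau_{ij} = \frac{\lVert \mathbf{m}_i - \mathbf{s}_j \rVert}{c} + n_{ij},
\qquad n_{ij} \sim \mathcal{N}(0, \sigma_{ij}^2) \ \text{independent}

\hat{\Theta}_{ML} = \arg\min_{\Theta} \sum_{i=1}^{M} \sum_{j=1}^{S}
\frac{\left( \tau_{ij} - \lVert \mathbf{m}_i - \mathbf{s}_j \rVert / c \right)^2}{\sigma_{ij}^2}
```

When all variances are equal the weights drop out and this reduces to ordinary nonlinear least squares over the M x S TOF matrix.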
• Levenberg-Marquardt method.
• Function of a large number of parameters [ 3(M+S)-6 ].
• Unless we have a good initial guess, it may not converge to the global minimum.
• Approximate initial guess required.

If we have M microphones and S speakers
• [ 3M+3S-6 ] parameters to estimate.
• [ MS ] TOF observations.
• Need [ MS ] >= [ 3M+3S-6 ].
• If M=S=K then K>=5.
• Why do we consider M=S? Later..

Closed form Solution..
• Say we are given all pairwise distances between N points. Can we get the coordinates?

     1  2  3  4
1    X  X  X  X
2    X  X  X  X
3    X  X  X  X
4    X  X  X  X

Classical Metric Multi Dimensional Scaling
• Dot product matrix B: symmetric, positive semidefinite, rank 3.
• Given B, can you get X? Singular Value Decomposition.
• Same as Principal Component Analysis.
• One hitch.. we can measure only the pairwise distance matrix.
• How to get the dot products from the pairwise distance matrix… Cosine Law.
[Figure: triangle with vertices i, j, k and sides d_ki, d_ij, d_kj]

MDS...
• If given pairwise distances between cities we can build a map.
• Instead of pairwise distances we can use pairwise "dissimilarities".
• When the distances are Euclidean, MDS is equivalent to PCA.
• E.g. face recognition, wine tasting.
• Can get the significant cognitive dimensions.
Steyvers, M., & Busey, T. (2000). Predicting Similarity Ratings to Faces using Physical Descriptions. In M. Wenger & J. Townsend (Eds.), Computational, geometric, and process perspectives on facial cognition: Contexts and challenges. Lawrence Erlbaum Associates.

Can we use MDS..
1. We do not have the complete pairwise distances (X: measured, ?: unknown).

     s1 s2 s3 s4 m1 m2 m3 m4 m5 m6 m7
s1   ?  ?  ?  ?  X  X  X  X  X  X  X
s2   ?  ?  ?  ?  X  X  X  X  X  X  X
s3   ?  ?  ?  ?  X  X  X  X  X  X  X
s4   ?  ?  ?  ?  X  X  X  X  X  X  X
m1   X  X  X  X  ?  ?  ?  ?  ?  ?  ?
m2   X  X  X  X  ?  ?  ?  ?  ?  ?  ?
m3   X  X  X  X  ?  ?  ?  ?  ?  ?  ?
m4   X  X  X  X  ?  ?  ?  ?  ?  ?  ?
m5   X  X  X  X  ?  ?  ?  ?  ?  ?  ?
m6   X  X  X  X  ?  ?  ?  ?  ?  ?  ?
m7   X  X  X  X  ?  ?  ?  ?  ?  ?  ?

Forming microphone speaker pairs…
• Now we know the locations of the speakers and of the microphones close to them.
• The problem is essentially the same as with the positions of the speakers known.
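The classical MDS step from the slides above (cosine law to get the dot product matrix, then a decomposition to get the coordinates) can be sketched as follows. This is a minimal sketch assuming a complete, noise-free distance matrix. The function name `classical_mds`, the choice of point `k` as the origin, and the use of an eigendecomposition (which coincides with the SVD for a symmetric positive semidefinite B) are assumptions made for illustration, not details from the slides.

```python
import numpy as np

def classical_mds(D, dim=3, k=0):
    """Recover coordinates from a full pairwise distance matrix D (N x N).

    Hypothetical helper: the result is defined only up to rotation and
    reflection, with point k placed at the origin.
    """
    D2 = np.asarray(D, dtype=float) ** 2
    # Cosine law with point k as origin: b_ij = (d_ki^2 + d_kj^2 - d_ij^2) / 2
    B = 0.5 * (D2[k, :][:, None] + D2[k, :][None, :] - D2)
    # B is symmetric, so use eigh; keep the `dim` largest eigenvalues.
    w, V = np.linalg.eigh(B)
    order = np.argsort(w)[::-1][:dim]
    w_top = np.clip(w[order], 0.0, None)  # guard against tiny negative values
    return V[:, order] * np.sqrt(w_top)

# Synthetic check: 6 random points in 3D, exact distances.
rng = np.random.default_rng(0)
X_true = rng.uniform(0.0, 5.0, size=(6, 3))
D = np.linalg.norm(X_true[:, None, :] - X_true[None, :, :], axis=-1)
X_est = classical_mds(D)
D_est = np.linalg.norm(X_est[:, None, :] - X_est[None, :, :], axis=-1)
```

The recovered coordinates agree with the true ones only up to a rotation, reflection and translation, so the check compares the pairwise distances `D` and `D_est` rather than the coordinates themselves.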
• Can get a closed form solution using a least squares technique.
• Can refine all the values further by ML estimation.

The complete algorithm…
[Flowchart boxes, in extraction order: TOF matrix; Approx. distance matrix between microphone speaker pairs; Approximation; MDS; Approx. microphone and speaker locations; Nonlinear minimization; Microphone and speaker locations; Exact microphone and speaker locations; Nonlinear minimization; Approx. microphone locations]

Sample result in 2D…

Algorithm Performance…
• The performance of our algorithm depends on:
  • Noise variance in the estimated distances.
  • Number of microphones and speakers.
  • Microphone and speaker geometry.
• One way to study the dependence is to do a lot of Monte Carlo simulations.
• Else can derive the covariance matrix and bias of the estimator.
• The ML estimate is implicitly defined as the minimum of a certain error function.
• Cannot get an exact analytical expression for the mean and variance.
• Can use the implicit function theorem and a Taylor series expansion to get approximate expressions for bias and variance.

Where to place loudspeakers..

Monte Carlo Simulations…

Calibration Signal…

Time Delay Estimation…
• Compute the cross-correlation between the signals received at the two microphones.
• The location of the peak in the cross-correlation gives an estimate of the delay.
• Task complicated due to two reasons:
  1. Background noise.
  2. Channel multi-path due to room reverberations.
• Use Generalized Cross Correlation (GCC).
• W(w) is the weighting function.
• PHAT (Phase Transform) weighting.

Experimental Setup…

Results

Related Previous work…
J. M. Sachar, H. F. Silverman, and W. R. Patterson III. Position calibration of large-aperture microphone arrays. ICASSP 2002.
Y. Rockah and P. M. Schultheiss. Array shape calibration using sources in unknown locations, Part II: Near-field sources and estimator implementation. IEEE Trans. Acoust., Speech, Signal Processing, ASSP-35(6):724-735, June 1987.
R. Moses, D. Krishnamurthy, and R. Patterson. A self-localization method for wireless sensor networks. EURASIP Journal on Applied Signal Processing, Special Issue on Sensor Networks, 2003(4):348-358, March 2003.
J. Weiss and B. Friedlander. Array shape calibration using sources in unknown locations: a maximum likelihood approach. IEEE Trans. Acoust., Speech, Signal Processing, 37(12):1958-1966, December 1989.

Our Contributions…
• Locations of the speakers need not be known.
• Only constraint is that there should be a microphone close to a loudspeaker.
• In a practical setup attach a microphone to a loudspeaker.
• Derived the theoretical variance of the estimator.
• Where to place the loudspeakers?

Acknowledgements…
• Dr. Dmitry Zotkin for building the microphone array.
• Dr. Elena Grassi and Zhiyun Li for the data capture boards.

Thank You ! | Questions ?
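Backup: Time Delay Estimation…
The GCC with PHAT weighting named on the Time Delay Estimation slide can be sketched as follows. This is a minimal sketch, not the implementation used in the experiments. The function name `gcc_phat_delay`, the sampling rate, the white noise test signal and the 25 sample delay are all assumptions for illustration. The PHAT weighting used is the standard one, W(w) = 1 / |X1(w) X2*(w)|, which keeps only the phase of the cross spectrum.

```python
import numpy as np

def gcc_phat_delay(x1, x2, fs):
    """Estimate by how many seconds x2 lags x1 (hypothetical helper)."""
    n = len(x1) + len(x2)              # zero-pad so the correlation does not wrap
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)           # cross power spectrum
    # PHAT weighting: divide by the magnitude, keep only the phase.
    cross = cross / np.maximum(np.abs(cross), 1e-12)
    cc = np.fft.irfft(cross, n=n)      # generalized cross-correlation
    max_lag = n // 2
    # Reorder so the array covers lags -max_lag .. +max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return (np.argmax(cc) - max_lag) / fs

# Synthetic check: x2 is x1 delayed by a known number of samples.
fs = 16000                              # assumed sampling rate in Hz
true_lag = 25                           # assumed delay in samples
rng = np.random.default_rng(0)
s = rng.standard_normal(4096)
x1 = s
x2 = np.concatenate((np.zeros(true_lag), s[:-true_lag]))
delay = gcc_phat_delay(x1, x2, fs)
```

Multiplying the estimated delay by the speed of sound converts it to a path length difference; real recordings add the background noise and reverberation listed on the slide, which this synthetic check does not model.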