Self Localizing sensors and actuators on Distributed Computing Platforms

Vikas Raykar
Igor Kozintsev
Rainer Lienhart
Intel Labs
Motivation
• Many multimedia applications are emerging that use multiple audio/video sensors and actuators.
[Figure: distributed capture (cameras, microphones), number crunching, and distributed rendering (speakers, displays)]
Applications
• Smart Conference Rooms
• Speech Recognition
• Source Separation and Dereverberation
• Meeting Recording
• Audio/Image Based Rendering
• Hands-free Voice Communication
• MultiChannel Speech Enhancement
• MultiChannel Echo Cancellation
• Audio/Video Surveillance
• Object Localization and Tracking
• Distributed Audio Video Capture
• Interactive Audio Visual Interfaces
• Other Applications
Additional Motivation
• Current work has focused on setting up all the sensors and actuators on a single dedicated computing platform.
• This requires dedicated infrastructure: sensors, multi-channel interface cards, and computing power.
On the other hand…
• Computing devices such as laptops, PDAs, tablets, cellular phones, and camcorders have become pervasive.
• The audio/video sensors on different laptops can be used to form a distributed network of sensors.
Problem formulation
Put all the distributed audio-visual I/O capabilities into a common
time and space.
In this paper:
Focus on providing a common space by means of actively
estimating the 3D positions of the sensors (microphones) and
actuators (speakers).
Account for the errors due to lack of temporal synchronization
among various sensors and actuators (A/Ds and D/As) on
distributed general purpose computing platforms.
Our View of Distributed Sensor Network
[Figure: distributed sensor network in a 3D coordinate system with X, Y, and Z axes]
Localization with known positions of speakers
• Distances are not exact
• There are more speakers
If positions of speakers are unknown…
Consider M Microphones and S speakers.
What can we measure?
[Figure: calibration signal]
Distance between each speaker and all microphones (Time Of Flight): an M×S TOF matrix.
Assume the TOF measurements are corrupted by additive white Gaussian noise (AWGN); the maximum-likelihood (ML) estimate can then be derived.
Nonlinear Least Squares
Find the coordinates that minimize the sum of squared TOF errors.
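The objective itself did not survive in this text. As a sketch, assuming i.i.d. zero-mean Gaussian noise of equal variance on every measured TOF, and introducing for illustration the symbols m_i (microphone positions), s_j (speaker positions) and c (speed of sound), the ML estimate reduces to a least-squares problem of the form:

```latex
\hat{\Theta} = \arg\min_{\Theta} \sum_{i=1}^{M} \sum_{j=1}^{S}
\left( TOF_{ij} - \frac{\lVert m_i - s_j \rVert}{c} \right)^{2},
\qquad \Theta = \{ m_1, \dots, m_M,\; s_1, \dots, s_S \}
```

The distance term is nonlinear in the coordinates, which is why the problem is a nonlinear least-squares one.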
Reference Coordinate System
[Figure: 2D reference frame showing the origin, the X axis, and the positive Y axis]
Similarly in 3D:
1. Fix origin: (0,0,0)
2. Fix X axis: (x1,0,0)
3. Fix Y axis: (x2,y2,0)
4. Fix positive Z axis
with x1, x2, y2 > 0
Which to choose? Later…
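The 2D version of these steps can be sketched as follows. This is an illustration only: the function name and the choice of the first three points as the reference nodes are assumptions, not taken from the slides.

```python
import math

def to_reference_frame(points):
    """Map 2D points so that points[0] is the origin, points[1] lies on the
    positive X axis, and points[2] has a positive Y coordinate."""
    ox, oy = points[0]
    shifted = [(x - ox, y - oy) for x, y in points]       # 1. fix origin
    angle = math.atan2(shifted[1][1], shifted[1][0])
    c, s = math.cos(-angle), math.sin(-angle)
    rotated = [(c * x - s * y, s * x + c * y)             # 2. fix X axis
               for x, y in shifted]
    if rotated[2][1] < 0:                                 # 3. fix positive Y axis
        rotated = [(x, -y) for x, y in rotated]
    return rotated
```

Translation, rotation, and reflection leave all pairwise distances unchanged, so any solution can be brought into this frame without affecting the fit.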
On a synchronized platform all is
well..
However on a Distributed
system..
PC platform overview
[Figure: PC platform block diagram. From top to bottom: multimedia/multistream applications; operating system; CPU, MCH, FSB, memory; ICH, hub, PCI, LAN, etc.; I/O bus (AGP, ATA, AC97, LAN, USB, PCI slots); audio/video I/O devices]
Timing on distributed system
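[Figure: timing diagram on a common time axis t, measured from the time origin. Playback starts and the signal is emitted by source j at ts_j; capture starts at tm_i; the signal is then received by microphone i. TOF_ij is the actual time of flight and $\widehat{TOF}_{ij}$ the estimated one.]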
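A minimal sketch of the timing model this diagram suggests. The relation used here, estimated TOF = TOF_ij + ts_j - tm_i, is an assumption read off the diagram labels (it treats playback and capture as if both started at the time origin); the function names are illustrative.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C

def estimated_tof(true_tof, ts_j, tm_i):
    """TOF seen by microphone i when the start times are ignored.

    Assumed model: the signal leaves speaker j at ts_j, arrives at
    ts_j + true_tof, and the capture of microphone i starts at tm_i
    (all times in seconds, measured from the common time origin).
    """
    return true_tof + ts_j - tm_i

def range_error(ts_j, tm_i, c=SPEED_OF_SOUND):
    """Distance error in metres caused by ignoring the two start times."""
    return c * (ts_j - tm_i)
```

Under this model an unaccounted offset of 1 ms between playback and capture start corresponds to roughly 34 cm of range error, which is why the start times have to be estimated along with the positions.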
Joint Estimation
MS TOF measurements
Unknown parameters:
• Microphone and speaker coordinates: DM + DS - D(D+1)/2
• Speaker emission start times: S
• Microphone capture start times: M - 1 (assume tm_1 = 0)
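The counts above can be compared directly with the number of measurements. The sketch below only checks the necessary condition "at least as many measurements as unknowns"; it says nothing about whether a particular geometry is actually solvable. Function names are illustrative.

```python
def count_unknowns(num_mics, num_speakers, dim=3):
    """Unknowns in the joint estimation: coordinates plus start times."""
    coordinates = dim * num_mics + dim * num_speakers - dim * (dim + 1) // 2
    speaker_start_times = num_speakers
    mic_start_times = num_mics - 1          # tm_1 is fixed to 0
    return coordinates + speaker_start_times + mic_start_times

def count_measurements(num_mics, num_speakers):
    """One TOF measurement per microphone-speaker pair."""
    return num_mics * num_speakers

def min_equal_nodes(dim=3):
    """Smallest K >= 2 with K microphones and K speakers such that the
    measurements are at least as many as the unknowns."""
    for k in range(2, 100):
        if count_measurements(k, k) >= count_unknowns(k, k, dim):
            return k
    return None
```

For example, `count_unknowns(4, 4)` gives 25 unknowns against only 16 measurements, so four microphones and four speakers are not enough for the joint problem in 3D by this count.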
Time Difference of Arrival (TDOA)
The formulation is the same as above but has fewer parameters.
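One way to see the parameter reduction, sketched under the assumed model estimated TOF_ij = TOF_ij + ts_j - tm_i: subtracting the reference microphone's measurement for the same speaker cancels ts_j, so the S speaker start times drop out. The helper names are hypothetical, and row 0 plays the role of microphone 1.

```python
def tdoa_from_tof(tof_hat):
    """Subtract the reference microphone's row (row 0) from every other row.

    tof_hat[i][j] is the estimated TOF from speaker j to microphone i.
    Returns an (M-1) x S list of TDOAs relative to the reference microphone.
    """
    reference = tof_hat[0]
    return [[row[j] - reference[j] for j in range(len(reference))]
            for row in tof_hat[1:]]

def simulate_tof_hat(true_tof, ts, tm):
    """Hypothetical helper: apply the assumed model TOF_ij + ts_j - tm_i."""
    return [[true_tof[i][j] + ts[j] - tm[i] for j in range(len(ts))]
            for i in range(len(tm))]
```

Running `tdoa_from_tof` on two simulations that differ only in `ts` gives the same TDOA values, which is the cancellation the slide relies on.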
Nonlinear least squares
Levenberg-Marquardt method
Multidimensional function.
Unless we have a good initial guess, it may not converge to the global minimum.
An approximate initial guess is required.
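To make the method concrete, here is a deliberately small Levenberg-Marquardt sketch for two parameters with a finite-difference Jacobian. It is not the authors' implementation; the toy problem (one unknown 2D microphone, four speakers at known positions, exact distances) and all names are assumptions chosen to keep the example short.

```python
import math

def levenberg_marquardt_2d(residuals, start, iterations=100, damping=1e-3):
    """Minimise sum(r**2) over two parameters with damped Gauss-Newton steps."""
    def cost(p):
        return sum(r * r for r in residuals(p))

    p, eps = list(start), 1e-6
    for _ in range(iterations):
        r0 = residuals(p)
        cols = []                        # forward-difference Jacobian columns
        for k in range(2):
            q = list(p)
            q[k] += eps
            cols.append([(a - b) / eps for a, b in zip(residuals(q), r0)])
        a11 = sum(u * u for u in cols[0])
        a22 = sum(v * v for v in cols[1])
        a12 = sum(u * v for u, v in zip(cols[0], cols[1]))
        g1 = sum(u * r for u, r in zip(cols[0], r0))
        g2 = sum(v * r for v, r in zip(cols[1], r0))
        # Damped normal equations: (J^T J + damping * diag(J^T J)) step = -J^T r
        d11, d22 = a11 * (1.0 + damping), a22 * (1.0 + damping)
        det = d11 * d22 - a12 * a12
        if det == 0.0:
            break
        step = (-(d22 * g1 - a12 * g2) / det, -(d11 * g2 - a12 * g1) / det)
        trial = [p[0] + step[0], p[1] + step[1]]
        if cost(trial) < cost(p):
            p, damping = trial, damping / 10.0   # accept: behave like Gauss-Newton
        else:
            damping *= 10.0                      # reject: shorter, safer step
    return p

# Toy problem (hypothetical values): known speakers, exact distances.
SPEAKERS = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (4.0, 3.0)]
TRUE_MIC = (1.0, 2.0)
DISTANCES = [math.hypot(TRUE_MIC[0] - sx, TRUE_MIC[1] - sy) for sx, sy in SPEAKERS]

def toy_residuals(p):
    return [math.hypot(p[0] - sx, p[1] - sy) - d
            for (sx, sy), d in zip(SPEAKERS, DISTANCES)]
```

Calling `levenberg_marquardt_2d(toy_residuals, (2.0, 1.5))` recovers the microphone position from a start at the room centre. The full problem has many more parameters and, as the slide notes, a poor start can end in a local minimum.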
Multi Dimensional Scaling
Dot product matrix B: symmetric, positive semidefinite, rank 3
Given B, can you get X? Yes, via Singular Value Decomposition.
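A short sketch of this step using NumPy, assuming B is the N×N matrix of dot products between centred coordinates. For a symmetric positive semidefinite matrix the SVD coincides with the eigendecomposition, so the coordinates follow from the leading singular vectors and values. The function name is illustrative.

```python
import numpy as np

def coordinates_from_dot_products(B, dim=3):
    """Recover X (N x dim) such that X @ X.T approximates B.

    The result is only defined up to rotation and reflection, which is
    why a reference coordinate system has to be fixed afterwards.
    """
    U, s, _ = np.linalg.svd(B)
    return U[:, :dim] * np.sqrt(s[:dim])
```

With noisy distances B is no longer exactly rank 3; keeping only the three largest singular values then gives a least-squares approximation, which is adequate for an initial guess.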
Clustering approximation
[Figure: entries ii, ij, ji, jj of the TOF matrix for a pair of GPCs i and j]
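One plausible reading of the ii, ij, ji, jj labels, sketched under two loudly stated assumptions: the microphone and speaker of the same GPC are close enough that their true TOF is about zero, and the measured TOF follows TOF_ij + ts_j - tm_i. Then the unknown start times cancel when the two "own" entries are subtracted from the two "cross" entries. The function name is hypothetical.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C

def gpc_distance(tof_hat, i, j, c=SPEED_OF_SOUND):
    """Approximate distance in metres between GPC i and GPC j.

    tof_hat[a][b] is the estimated TOF from the speaker of GPC b to the
    microphone of GPC a, including unknown playback/capture offsets.
    """
    cross = tof_hat[i][j] + tof_hat[j][i]   # both directions between the GPCs
    own = tof_hat[i][i] + tof_hat[j][j]     # each GPC hearing its own speaker
    return 0.5 * c * (cross - own)
```

The resulting GPC-to-GPC distance matrix is only approximate, but it needs no knowledge of the start times, which makes it usable for building an initial guess.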
How to get the dot product matrix from the pairwise distance matrix
[Figure: triangle with vertices i, j, k and side lengths d_ki, d_kj, d_ij]
Centroid as the origin; later shift it to our original reference.
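With point k as the origin, the cosine law gives the dot product of i and j as (d_ki² + d_kj² - d_ij²) / 2. Taking the centroid as the origin instead leads to the usual double-centring formula, sketched below in plain Python; the function name is illustrative.

```python
def dot_products_from_distances(d):
    """Dot product matrix of centroid-centred points from pairwise distances.

    d is a symmetric N x N list of distances with zeros on the diagonal.
    Each squared distance has its row mean and column mean subtracted and
    the grand mean added back, then the result is scaled by -1/2.
    """
    n = len(d)
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]     # equals the column means
    grand_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]
```

Using the centroid avoids singling out one possibly noisy node as the origin; the coordinates are moved to the chosen reference frame afterwards.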
Slightly perturb each GPC location into two points to get the initial guess for its microphone and speaker coordinates.
Sample result in 2D
Algorithm
1. TOF matrix
2. Clustering: approx. distance matrix between GPCs, approx. ts, approx. tm
3. Dot product matrix (dimension and coordinate system)
4. MDS to get approx. GPC locations
5. Perturb: approx. microphone and speaker locations
6. TDOA based nonlinear minimization: microphone and speaker locations, tm
Cramer-Rao bound
• Gives the lower bound on the variance of any unbiased estimator.
• Does not depend on the estimator, only on the data and the noise model.
• Tells us to what extent the noise limits our performance, i.e. you cannot get a variance lower than the CR bound.
Rank deficient: remove the known parameters
Jacobian
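As a much smaller illustration than the joint problem on the slides, the sketch below computes the bound for a single unknown 2D microphone with known speaker positions and i.i.d. Gaussian range noise. In that case the Fisher information is JᵀJ / σ², with one Jacobian row per speaker, and the bound is its inverse. There is no rank deficiency here because the speakers already fix the coordinate system. All names and numbers are assumptions.

```python
import math

def crb_microphone_2d(mic, speakers, sigma):
    """2x2 Cramer-Rao bound on the covariance of a 2D microphone estimate.

    Assumes known speaker positions and range measurements corrupted by
    i.i.d. zero-mean Gaussian noise with standard deviation sigma (metres).
    """
    f11 = f12 = f22 = 0.0
    for sx, sy in speakers:
        dx, dy = mic[0] - sx, mic[1] - sy
        rng = math.hypot(dx, dy)
        jx, jy = dx / rng, dy / rng          # Jacobian row: d(range)/d(mic)
        f11 += jx * jx / sigma ** 2
        f12 += jx * jy / sigma ** 2
        f22 += jy * jy / sigma ** 2
    det = f11 * f22 - f12 * f12
    if det <= 0.0:
        raise ValueError("Fisher information is singular for this geometry")
    return [[f22 / det, -f12 / det], [-f12 / det, f11 / det]]
```

Comparing speakers placed around the microphone with speakers bunched on one side shows a much larger bound in the second case: the noise level is the same, only the geometry changed.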
Performance comparison
Dependence on number of nodes
Geometry matters
Experimental setup: bias 0.08 cm, sigma 3.8 cm
[Figure: room layout showing Speakers 1 to 4 and Mics 1 to 4, with the Z axis marked]
Room Length = 4.22 m
Room Width = 2.55 m
Room Height = 2.03 m
Summary
• General purpose computers can be used for distributed array processing.
• It is possible to define a common time and space for a network of distributed sensors and actuators.
• For more information please see our two papers in ACM MM in November or contact igor.v.kozintsev@intel.com or rainer.lienhart@intel.com
• Let us know if you are interested in testing/using our time and space synchronization software for developing distributed algorithms on GPCs (available in November).
Backup
Calibration signal
Results (contd.)