Research Statement Radu Balan Executive Summary

advertisement
Research Statement
Radu Balan
February 2007
Executive Summary
My research interests cover areas of statistical estimation and modeling, with applications to
sensor networks, communications, and biomonitoring systems.
This document is structured in two parts. First part contains description of three research
directions (systems) that I would like to pursue:
• Audio-Video Sensor Networks
• Body Sensor Networks
• Statistical Modeling and Control of Wireless Local Area Networks
In part two I highlight a selected set of problems where I made significant contributions, and are still
open to research in order to advance the three research directions mentioned above. More specifically
I discuss:
• Sparse Signal Estimation
• Nonlinear Signal Processing
• Sensor Fusion
• Analysis of Wireless Communication Channels
• 802.11 WLAN MAC Layer Modeling and Control
• Optimizations in Wireless Networks
• Topics of Applied Mathematics
Over the years, with help from my colleagues and students at Siemens, I developed and built some
of these components. Other components are still waiting for technology, or theory, to develop.
http://folk.uio.no/paalee
Chapter 1
Systems
1.1
Heterogeneous Sensor Networks
A Heterogeneous Sensor Network (AVSN) is represented by a collection of heterogeneous sensors
(such as microphones, videocameras, RFID, radar) that communicate with a central monitoring
system in a wired, or a wireless manner (see Figure 1.1).
Figure 1.1: An Audio-Video Sensor Network deployed in a wired-wireless mixed mode
Typical applications targeted by such a system include:
• Security/Surveillance Systems
• Meeting Transcription Systems
• Scene Understanding
Typically, this system includes the following components:
• Front-end preprocessing of audio/video/radar signals, e.g. noise reduction, source separation,
detection, localization, feature extraction block;
• Communication block;
• Statistically principled sensor fusion;
• Context estimation and
http://www.unik.no/personer/paalee
high-level statistical inference;
• Control and feedback into external world.
In part two of this document I describe in more detail some challenges and issues for some of these
components. Here I only sketch the current status of my research and technology development
concerning these components.
Noise Reduction: I worked extensively on speech enhancement, and noise reduction. In part
two I present two new approaches to this problem: sparse signal estimation, and nonlinear signal
processing.
Signal Separation: Using geometric information known a priorily, or estimated using audio-video
past information, audio source signals can be separated by signal separation algorithms such as the
ones described in Part two of this document.
Pattern Detection, Classification, and Tracking: This component has been used in the Person
Recognition system we developed at Siemens.
Communication block: In the case of wireless communication time synchronization is critical.
Assuming the wireless communication is achieved through 802.11 WLANs, one subject of research
is achieving high accuracy in time stamping wireless data packets. In particular 802.11p promises a
new chipset design that includes a time stamp for each wireless data packet.
Postprocessing: When data is sampled at nonuniform times, resampling is required. My expertise
in applied harmonic analysis is very well suited here. Depending on available information and
computational resources, I will choose an appropriate resampling algorithm.
Sensor Fusion: In the Person Recognition project at Siemens, I developed a joint posterior
distribution estimation based on each component characteristic. The main challenge is to have a
“good” training database to cover all cases of interest.
Statistical Inference: For a specific scenario a Dynamic Bayes Network (DBN) is created to model
the context.
For most of these components I already have an extensive expertise. Some components may be
developed jointly through collaborations within EE Department, or other EE/ECE Departments,
or industry (in particular Siemens). I envison a lot of interest would come from US Government
agencies (e.g. DARPA, ONR, ARL), or industry. Two years ago I had a joint NSF proposal with
Professors Poor and Kulkarni, and Dr. Zhu from Siemens. Currently I plan to develop a joint ARL
grant proposal with Radu Marculescu from CMU.
Recently (January 2007), the ONR launched a SBIR BAA (N07-T039) concerning transfer of
novel signal separation technology and nonlinear methods where my joint research with Professors
Casazza and Edidin is a first referenced paper. The application is the use of a microphone array for
a particular speaker identification and localization in a cocktail party problem setup.
1.2
Body Sensor Networks
A Body Sensor Network (BSN) represents a collection of in-vivo sensors that continuously monitor
electro-chemical signals of a patient. An example of such a system is pictured in Figure 1.2.
Depending on sensors type, and communication mode, a BSN can exhibit the following features:
• Passive (no battery) or Active (battery powered) sensors;
• Fixed, or mobile sensor;
• Wired or wireless communication between nearby reading antenna, and the monitoring system.
Such a system requires components that perform the following functions:
• In-Vivo sensors;
• Sensor reading: Wireless communication between in-vivo sensors and external antenna(s);
• Body surface communication between antenna(s) and the monitoring system;
• Signal detection, estimation, and sensor fusion (if multiple sensors)
As in previous system proposals, I posses expertise on designing some of these components, but I
lack expertise in others. More specifically,
In-vivo sensors: I do not have expertise in building these sensors. I envison collaborating with
bio-sensor groups that may provide in-vivo sensors;
Wireless sensor reading: I have expertise in communication channel modeling for wireless communications, and acoustic environments; Also my applied mathematics background and strong connections with the applied mathematics community would play a major role in solving this problem.
I propose to consider several scenarios: multiple external antennas for more accurate detection and
estimation (higher SNR signals), and for localization of sensor (e.g. sensors through digestive system,
or blood vassels).
Body surface communications: I am familiar with IEEE 11073 standardization activity devoted
to Medical Device Communications, a MAC layer description of communication protocol.
Statistical Signal Processing: I worked on several projects and have extensive expertise on statistical signal detection and estimation: signal separation, source localization, sensor fusion, pattern
recognition and abnormal mode detection (see my CV).
Figure 1.2: A Body Sensor Network with passive sensors read off by an external antenna
Due to multidisciplinary nature of this project, a larger team should collaborate to design,
develop, and test such a system. I am eager to collaborate with medical doctors, and researchers on
medical sciences.
1.3
Statistical Modeling and Control of Wireless Local Area
Networks
Wireless Local Area Networks (802.11 networks) have become as ubiquitous as the internet access.
Due to their widespread in home (personal) use, and enterprise environment, 802.11 WLAN has
become a hot topic of research and development these days. Compared to other more traditional
research areas in Electrical Engineering, such as Signal Processing, Control Theory, Information
Theory, WLAN control (management) theory is quite underdeveloped presently, but it is developing
fast.
Sure, one can argue that WLAN control can be seen as a subset of the cross-layer design theory.
However, due to the standardization process, fewer degrees of freedom are left to vary, and lot of
care should be given to specifics of 802.11 standardized networks.
My specific proposal concerns the physical and MAC layer modeling and control of 802.11
WLANs. The basic unit of a WLAN is composed of one Access Point (AP), and several mobile
stations (STAs), all forming a Basic Service Set (BSS). A typical WLAN setup is depicted in Figure
1.3.
Figure 1.3: A typical Basic Service Set (BSS), with one AP, and several mobile stations.
There are two distinct regimes, completely opposite from one another, that characterize the
behaviour of such a network:
1. Deterministic Regime: when no collision happen, and the initial MAC instance time are sufficiently far apart, then transmission happens in a deterministic mode. Such a case may happen
when only voice data (such as VoIP stations are connected to the AP), or periodic transmitting stations are present. The deterministic regime analysis gives an upper bound on system
performance;
2. Stochastic Regime: once collisions happen, or medium is detected busy during a packet arrival,
the random generation of a Backoff counter happens, and the contention-based mechanism
kicks in.
The deterministic regime is used to compute the maximal performance of a WLAN. In such a
case, performance may be superior even to the PCF (Point Coordination Function) mode, where
an AP acts as a transmission controller. However, in highly loaded networks, collisions are quite
frequent, and the stochastic regime is more likely. Several works proposed stochastic models for this
regime. Each work concerned with one feature or another of the network behavior. In particular
Bianchi [11] was the first to propose the use of Markov Chain in modeling the saturation regime of a
802.11 WLAN. Since his paper, several others considered saturation, and non-saturation modeling
of WLANs, increasing the model complexity, and taking into account more phenomena observed in
experimental setups. However the current state-of-the-art model is not sufficient for several reasons:
1. It does not take into account the deterministic regime, nor does the performance converge to
that upper bound;
2. The Markovianity assumption is not always justifiable; is it possible to introduce a deterministicstochastic hybrid model?
3. Subsequent improvements of the standard are not yet captured by the current model; in
particular the 802.11n draft introduces new MAC enhancements that will have to be accounted
for in future models.
The purpose of modeling is two-fold: to study a network performance, and to design control algorithms that optimize a desired criterion.
During August 2007 I will run a mentoring program at the University of Minnesota (IMA) devoted
to modeling of 802.11 WLAN MAC layer. The program is focused on second year graduate students
with strong applied mathematics background. I plan to use ns2 and Matlab tools to simulate WLAN
behavior in extreme scenarios.
In the second part of this document I present in more details one problem (saturation avoidance)
and the approach I propose to follow.
Chapter 2
Components, and Problems
2.1
2.1.1
Statistical Signal Methods for Estimation and Modeling
Sparse Signal Estimation – Part I
Our research at Siemens Corporate Research during the past six years produced a rich body of papers
concerning blind source separation of speech signals. The key observation enabling our breakthrough
is that very rarely two speech signals use the same time-frequency point. We turned this observation
into a hypothesis, called W-disjoint orthogonality, by postulating that supports of time-frequency
representations of speech signals form disjoint sets. We further extended this hypothesis by allowing
simultaneous use of same time-frequency points by up to N source signals (so called generalized
W-disjoint orthogonality hypothesis) (see [9, 17]). More specifically the mixing model has the form
x(t, ω) = A(ω; θ)s(t, ω) + n(t, ω)
(2.1)
where (t,ω) is the time-frequency point, A(ω; θ) is the frequency dependent mixing matrix defined
by mixing parameters θ, x(t, ω) is the measurements vector, s(t, ω) is the vector of source signals
(to be estimated), and n(t, ω) is the noise. Typically we assume n is a zero-mean Gaussian random
variable with known covariance matrix, e.g. Rn = σ 2 I. A challenging case is when dimension
of s is larger than that of x (i.e. more sources than measurements). This is precisely the case
when our hypothesis is best suited to estimate both the mixing parameters θ and source signals
s from a sequence of observations of x. Under these assumptions the Maximum Likelihood (ML)
estimator of s is the most appropriate statistical signal estimator. It turns out that under appropriate
prior signal models (e.g. generalized Gaussians with subunit exponent) the maximum à posteriori
(MAP) estimator yields also sparse signal values. However the ML estimator requires only very
little information about the model; more specifically it requires only to set the maximum number
of simultaneously active sources, whereas the MAP (and similarly the MMSE) estimators require
the knowledge of the source signal distributions (spectral power, exponent, etc.). A more complex
source signal model may yield a better performance provided it fits well the data. However more
complex models are less robust to mismatches than a simpler model, and may perform worse on real
world data. The difficult art is to find the right balance between deterministic and statistic signal
prior model complexity.
Temporal description is yet another dimension that can be added to our model. Instead of
treating each time-frequency coefficient as an independent random variable, one can use dynamic
models to select among possible descriptions. Given the advent of ever increasing computational
power of today processors, hidden Markov models (HMMs) are a popular choice nowadays. Our
source signal model is of the form s(t, ω) = b(t, ω)G(t, ω), a product between a Bernoulli random
variable b(t, ω) and a continuous random variable G(t, ω). However we would like to increase the
power of source separation particularly when there exists prior knowledge about the sources (see
also [18], [19]). In a very recent preprint we proposed an incremental increase in source model
complexity conforming to our basic belief that models should not be more complicated than what is
really needed in order to solve the problem. For this we allow for statistical dependencies of source
signals across time: we modeled {b(t, ω); t} by a first order Markov model. Thus:
p(b(t, ω)|b(t − 1, ω), b(t − 2, ω), ..., b(1, ω)) = p(b(t, ω)|b(t − 1, ω)) = πω (b(t, ω), b(t − 1, ω))
where πω are the 2x2 transition probability matrices indexed by frequency ω. Then the posterior
distribution for vectors b(t, ω), G(t, ω) and θ assuming G(t, ω) and θ are uniformly distributed and
the noise in (2.1) is Gaussian turns into:
P ({b(t, ω), G(t, ω); t}, θ|X)) ∝
T Y
1
2
C exp − 2 kX(t, ω) − Ar (ω; θ)Gr (t, ω)k
σ
t=1
Qω (b(t, ω), b(t − 1, ω))
Q0ω (b(0, ω))
where Qω is the transition probability between selection vectors b(t − 1, ω) and b(t, ω) obtained
simply by multiplying the corresponding component transition probabilities πω , Q0 is the initial
probability, and Ar , Gr are the reduced matrix, respectively vector, by removing the columns,
entries, corresponding to null entries of b(t, ω). The MAP estimation problem has been reduced
now to maximize this criterion.
Given θ and {b(t, ω)}, {G(t, ω)} can easily be computed from a least square problem. Taking
negative of the logarithm, the optimization turns into
min
{b(t,w);t},θ
T
X
∗
X (I − A∗r (A∗r Ar )−1 Ar )X − σ 2 log Qω (b(t, ω), b(t − 1, ω)) − σ 2 log Q0ω
t=1
The optimization
problem is constrained by the generalized W-disjoint orthogonality hypothesis that
P
reads as k bk = N . It is apparent that the optimum solution would not depend too much on the
initial probability distribution provided we choose a large T .
One can carry out the optimization by alternating two partial optimization steps: one over the
selection variables {b(t, ω)}, the other over the mixing parameters θ. The pleasant thing about the
first optimization problem is that it can be carried out efficiently using a Viterbi decoding scheme.
The second optimization problem reduces to a classic ML source location estimation problem.
The complexity of the problem depends heavily on the sensor array geometry, and the mixing model.
The transition probabilities are learned from a training dataset. The training procedure involves
thresholding the database signals with a threshold proportional to average signal spectral power.
More specifically we assume a signal model of the form S = Scritical + Srest where the “critical”
component is the information carrying component of the signal, and the “rest” is just the rest.
The prior assumption is that the critical part has a sparse distribution, whereas the rest has a
Gaussian distribution. Then the MAP estimator of the critical component is given by thresholding
(hard or soft, depending upon the exponent of the prior distribution). Since the critical component
has a sparse distribution, we apply the product model and thus we estimate the binary selection
variables. Then the transition probabilities can be easily estimated using e.g. a ML criterion.
Preliminary tests (see our recent preprints) in the case of known mixing parameters showed
an improvement of about 1.5 dB of separation SINR gain compared to the DUET algorithm that
would use uniform probabilities of transition. Future work will concern the “blind” case of the
signal separation problem, namely when both the source signals and the mixing parameters are to
be estimated.
2.1.2
Sparse Signal Estimation – Part II
Let use return to the model (2.1) introduced before. Another prior distribution very popular nowadays is given by
p
P (S) = Cµ,p exp(−µ |S| )
where µ, p are adjustable parameters.
For p = 2 we get the Gaussian distribution, and the MAP estimator corresponds to Tikhonov
regularization. In this case the solution is obtained by solving a linear system of equations.
The case p = 1 corresponds to Laplace prior distributions and the MAP estimator is obtained
efficiently by solving a convex optimization problem.
Cases when p < 1 are the most interesting since they yield sparse solutions (that is vectors S
with many vanishing components). However the optimization problem is no longer convex. The
estimator is obtained by solving an optimization of the form:
2
p
arg min kX − ASk + λ kSkp
S
(2.2)
with λ = µσ 2 . In a recent paper [10], we studied this optimization problem for two values of p:
p = 0 and p = 1. In particular we showed that the optimizer of (2.2) for (A, λ, p) = (A1 , λ1 , 1) and
(A, λ, p) = (A0 , λ0 , 0) have
p the same support for a nonempty interior set of input vectors X, when
A1 = (A0 )−T and λ1 = λ0 /a(A0 ), with a(A0 ) a function of A0 .
Next I would be interested to explore algorithms that solve (2.2) based on homotopical connection
between the case p = 1 (when we know how to solve (2.2) efficiently) and p = 0 (which is the one
we are really interested).
2.1.3
Nonlinear Signal Processing
.
A longstanding paradigm of speech signal processing is that frequency domain phase information
is either not critical to the task, or it cannot be further improved by the signal processor and therefore
it is not be touched. More specifically I refer to the following two problems: speech recognition, and
speech enhancement (noise reduction). A speech recognition system typically uses Mel frequency
cepstral coefficients (MFCCs) by which the phase information is discarded. Speech enhancement
systems perform time-to-TF-domain conversion, followed by a processing of the modulus of speech
TF coefficient, followed by a linear reconstruction back into time domain using the same phase of the
noisy signal. In the former case we ask whether there is any loss of information by totally discarding
the phase, whereas in the latter case the problem is to find alternate reconstruction algorithms
(possibly nonlinear) that do not use phase information. Jointly with Peter Casazza, Dan Edidin
(both from Univ. of Missouri), and Gitta Kutyniok (Math. Institute Justus-Liebig, Univ. Giessen,
Germany) we published already several results (see [5, 4, 6]).
In a nut shell the abstract problem can be stated as follows. Assume F = {f1 , f2 , . . . , fn } are n
vectors in a d-dimensional Euclidian space E (Rd or Cd ) that span the space (hence n ≥ d). On E
consider the equivalence relation x ∼ y if there is a scalar z with |z| = 1 so that y = zx (that is, x
and y are essentially the same vector up to a constant phase factor). The problem is to study when
the nonlinear map
M : E/ ∼→ (R+ )n , M (x) = {|hx, fk i|}1≤k≤n
is injective, and in such a case to propose an inversion algorithm.
Our analysis so far proved several necessary and sufficient conditions for injectivity of this map,
and also produced the following important and practical result. Assume the real case (E = Rd ) and
use the following notations:
I
I
a
A=
, α=
, G = f˜1 | · · · | f˜n
I − P −(I − P )
0
where G is the d × n matrix whose columns are the canonical dual frame vectors. Let a = M x.
Then
Theorem[6] M −1 (a) contains only one point if and only if for every 0 ≤ p < 1 the following
optimization problem
minAu=α kukp
admits exactly two solutions u and v independent of p, with u = [uT1 uT2 ]T and v = [uT2 uT1 ]T so
that a = |u1 + u2 | and x = G(u1 − u2 ) or x = −G(u1 − u2 ). 2
We remark the similarity of this statement to the equivalence principle found in [13].
In speech processing, a machine learning approach (HMM based phase estimation) has recently
been considered in [12]. Our approach has been so far purely deterministic. Perhaps combining the
two approaches can be beneficial to advanced speech enhancement techniques.
2.1.4
Sensor Fusion
.
One problem of system integration is how to fuse together overlaping components. The challenge
is to do so in a statistically principled manner. Formally this can be achieved by estimating a joint
distribution of sensor outputs, and then perform a statistical inference based on this estimate.
However, due to lack of sufficient training data, or even to complete lack of training data of some
type, reliable joint estimation may be a hard problem.
An example of the latter case is furnished by the following scenario. Assume a sensor network
that monitors a gas turbine, where the sensors measure temperature, pressure, vibration, acoustic
emissions (ultrasound), etc. The task is to detect any abnormal regime of functioning. It is impractical to generate training data for abnormal regimes, hence the decision system should use non-Bayes
techniques to perform this task. Once such approach is based on novelty detection in machine learning. (One may argue that there still exists a Bayes interpretation for one-class classifiers. I tend to
be agnostic on this issue.)
The particular problem I am mostly concerned with is how to assign probabilities to machine
learning classifiers, and in particular to (kernel) support vector machines (kSVM). To fix the problem,
assume a 2-class classification problem where a linear SVM produces the following decision:
x ∈ Rn , x 7→ y = sign(wT x + a) , y ∈ {+1, −1}
where x ∈ Rn is the input feature vector, w ∈ Rn is the normal to the separation hyperplane, a ∈ R
is an offset, and y ∈ {+1, −1} is the binary decision. It is conceivable that the larger the argument
wT x + a the more probable the true class yt is +1 (and similar for the negative case). But what is
the exact distribution. A popular choice is to use sigmoidal type functions:
P (y = 1|r = wT x + a) =
eαr
eαr
+ e−βr
e−βr
eαr + e−βr
where parameters α, β are fit experimentally. Assume this is the case, then the next question is how
to combine multiple SVM classifiers, and fit parameters.
P (y = −1|r = wT x + a) =
2.2
2.2.1
Wireless Communication Networks
Physical Layer: Analysis of Communication Channels
The RAKE receiver is design to exploit spatial diversity in propagation medium by aligning different paths to increase the effective SNR. Similarly, the time-frequency RAKE receiver introduced
by Sayeed and Aazhang in 1999 exploits the diversity of the Time-Frequency doubly spread communication channel and achieves a higher effective SNR. Recently (in [7, 8]) in joint works with
S.Rickard, V.Poor and S.Verdu, we explored other channel models by taking into account the time
dilation associated with Doppler effects. We proposed two new channel models: the time-scale and
the frequency-scale channel model.
Consider a linear communication channel H whose time-varying impulse response
R is h(t, τ ). Thus
for a transmit signal x(t), the received signal y(t) is given by y(t) = Hx(t) = h(t, t − τ )x(τ )dτ .
Using the spreading function formulation (or Weyl quantization) the channel takes the form
Z Z
y(t) =
S(ω, τ )e2πiωt x(t − τ )dωdτ
(2.3)
Assume the transmit signal is bandlimited to [−Ω/2, Ω/2] and the observation takes place over [0, T ].
Then Sayeed and Aazhang proved the received signal admits an expansion of the form
X m n
t
n
y(t) =
Ŝ( , )e2πim T x(t − )
(2.4)
T Ω
Ω
m,n
where the coefficients are given by sampling
Z Z
Ŝ(u, v) =
S(ω, τ )sinc((v − τ )Ω)sinc((u − ω)T )e−iπ(u−ω)T dωdτ
(2.5)
For some channels (see [7]), the input-output correspondence can be rewritten as
Z Z
1
t−b
)da db
y(t) =
L(a, b) p x(
a
|a|
where the time-scale symbol L(a, b) replaces the spreading function (Weyl symbol) S(ω, τ ). Assume
the transmit signal is bandlimited to [−1/2b0 , 1/2b0 ] as before, but the received signal is passed
through a scale-limited filter of scale band [−1/2ln(a0 ), 1/2ln(a0 )] (the scale band filters are linear filter similar to frequency band filters where the Fourier transform is replaced by the Mellin
transform). Then similar to (2.4), the output admits the following expansion
X
−m/2
y(t) =
L̂(m, n)a0
x(a−m
(2.6)
0 t − nb0 )
m,n
where
Z Z
ln a
b
)sinc(n −
)da db
ln a0
ab0
Expansion (2.6) is called the canonical time-scale channel model. (2.3) can be replaced by
Z Z
t
1
y(t) =
ρ(ω, a)e2πiωt p x( )dω da
|a| a
L̂(m, n) =
L(a, b)sinc(m −
(2.7)
For scale band-limited transmit signals to [−1/2ln(a0 ), 1/2ln(a0 )] and finite observation time limited
to [T1 , T2 ], the received signal admits an expansion of the form
X
t
−n/2
y(t) =
cm,n e2πim T2 −T1 a0
x(a−n
(2.8)
0 t)
m,n
where
cm,n
T +T
1
−imπ T1 −T2
2
1
=
e
(T2 − T1 )2
Z Z
ρ(ω, a)einω(T1 +T2 ) sinc(
ω
ln a
− m)sinc(
− n)da dω
Ω
ln a0
The expansion (2.8) is called the canonical frequency-scale channel model.
Next I propose several directions that ought to be studied:
• Study the performance of the RAKE receivers
• Channel decompositions. This is an operator and functional analysis problem closely related
to the time-scale symbols of bounded operators issue that I present below.
• Comparison between channel models. For a given channel expressed, one can use any of a set
of equivalent representations (e.g. time kernel, time-frequency, time-scale, frequency-scale).
The issue is to compare the corresponding RAKE receiver performance.
• Use of smoother cut-offs. The channel models obtained so far use orthogonal projections:
either time cut-offs, or frequency cut-offs, or scale cut-offs. It would be interesting to replace
these sharp cut-offs by smoother versions. It is likely to obtain localized formulae for channel
coefficients, similar to the oversampling case (instead of reconstruction using sinc functions,
one can use reconstruction using faster decaying prototypes).
2.2.2
802.11 WLAN MAC Layer Modeling and Control
Several studies dealing with 802.11 MAC layer stochastic modeling have been published. The seminal
paper by Bianchi ([11]) produced many off-shots including [14] which I consider the state-of-the-art
in modeling the stochastic regime of a 802.11 WLAN. However not all 802.11 standard mechanisms
are accounted for by this model. In particular, the countdown of backoff counters is empirically
modeled by a 1st order feedback process with an independent return rate. Also the non-saturation
case has been modeled in a simplified manner to minimize computational complexity. To account
for these shortcomings I propose the Markov chain pictured in Figure 2.1.
Figure 2.1: A Markov Chain model for the stochastic regime of a WLAN device.
For highly loaded networks, devices are well modeled by Markov chains, such as the one I propose
above. However for low load, or some QoS scenarios, the WLAN behaves deterministically and
Markov chain modeling is not appropriate. One problem I intend to study is a heterogeneous model,
where at one limit the system behaves deterministically, at the other limit it behaves randomly, as
described by the Markov chain depicted in Figure 2.1.
Another problem of interest is detection of saturation, and network control.
From the AP perspective, during transmission of a voice packet the AC[0] instance (Voice Access
Category) spends the following time:
• Counting down for a total of n slot times Tσ , where n is the sum of all randomly generated
back-off counters (including the value of the post-back-off counter, if applicable)
• Holding down while medium is busy
• Waiting for TAIF S each time after a busy medium event (a countdown deferral)
Let us denote by T the total transmission time, Tidle the time medium was idle during transmission,
and Tbusy , the medium was busy during transmission. Denote further by m the number of deferrals.
Then we have:
T = Tidle + Tbusy , Tidle = nTσ + mTAIF S
Let R denote the number of retransmissions for this packet, and b a Boolean variable recording
whether the last transmissions was successful or not: b = 0 if successful, b = 1 if unsuccessful. Let
Q denote the number of unsuccessful transmissions, let D denote the number of collisions, and let
E denote the number of unsuccessful transmissions due to bad channel, all referring to the same
packet. Then we have:
Q=R+b=D+E
From this equations we notice that, given (T, Tbusy , m, R, b) we can compute (m, n, Q). In the
following we assume (m, n, Q) are known for each packet, and we try to derive an estimator for
p, the probability of an unsuccessful transmission due to collisions. If (D,E) were known for each
packet, then this probability would easily be estimated by:
P
k Dk
p̂ = P
(D
k + Ek )
k
Thus our input data is the sequence (mk , nk , Qk )k , indexed by k.
Consider the compressed time t0 = t−Tbusy (t)−m(t)TAIF S . Where Tbusy (t) denotes the duration
the channel was busy by time t, and m(t) denotes number of deferrals by time t. Essentially t0
increases in increments of slot time Tσ . Now we make the following assumption. We assume that
over-the-air packet transmissions occur as a Poisson arrival process with rate λ in this modified time
t0 , see Figure 2.2.
Figure 2.2: Packet transmission time diagrams, and compressed time t0 .
During current packet transmissions, the medium was busy due to other transmissions for a
number M = m + D of times. Since 0 ≤ D ≤ Q, M is a random variable taking one of the following
Q + 1 possible values: {m, m + 1, . . . , m + Q}. M represents the number of arrivals of our Poisson
process during the Tidle − mTAIF S time. Then our task is to write the likelihood of having this data
(m, n, Q) knowing parameters p and λ. First we have:
Q X
Q
P (m, n, Q|p, λ) =
pd (1 − p)Q−d P rob(D = d|λ)
d
d=0
where the remaining probability represents the probability of having exactly d collisions, hence m+d
arrivals. This probability is expressed as (given the Poisson process hypothesis):
P rob(D = d|λ) =
(λnTσ )m+d −λnTc
e
(m + d)!
Putting these two together, and denoting Λ = λTσ we obtain:
P (m, n, Q|p, Λ) =
Q X
(Λn)m+d −Λn
Q
pd (1 − p)Q−d
e
d
(m + d)!
d=0
In this likelihood we can relate further the two parameters, p and Λ. We make the observation that
p represents the probability of a packet transmission, other than own transmission, in the next slot
time. This implies:
p = 1 − e−Λ
or: Λ = −log(1 − p). Given a sequence of observations (mk , nk , Qk )k we obtain:
#
"Q f
k
mk +d
Y
X
Qk
d
Qk −d+nk (−nk log(1 − p))
P [Observation|p] =
p (1 − p)
d
(mk + d)!
k=1
d=1
The Maximum Likelihood estimator (MLE) of p becomes:
p̂M LE = argmaxp P [Observation|p]
Once such an estimate has been obtained, we can estimate further:
P rob[d collisions in Q unsuccessf ul transmissions] =
2.2.3
Q
d
pd (1 − p)Q−d
Optimizations in Wireless Networks
This research direction concerns optimizations of utility like measures used in wireless communications (see [15, 16]).
Consider a multiple access scenario where a base station receives signals from multiple mobile
stations. One defines the utility function u for each user, as the ratio of user’s goodput by its
transmission power uk = PTkk , where the goodput Tk represents the number of successfully transmited
bits per second of user k, and Pk represents the transmission power. In the MA scenario described
before, Tk = Rk f (γk ) is the product of transmission rate Rk and the probability of successful
k)
transmission f (γk ) that depends on the SINR γk . Since γk = σ2 +PPk Pj , it follows that uk = C f (γ
γk
j6=k
where C depends on the other users. Fixing other users’ transmit powers and rates, the utilitymaximizing strategy for user k (the Nash equilibrium) is given by the solution of the constrained
maximization
maxγk uk subject to fixed γj , ∀j 6= k
Given our previous derivation for utility, it follows that each users’ optimum power is given by
k)
independently. Two conclusions can be drawn:
maximizing f (γ
γk
(i) At optimum utility function, each user achieves the same received SINR
(ii) The user optimum power can be computed independently by each user, providing the base station
informs the user of received total power and user’s gain.
These conclusions are well-known for the non-cooperative game as described above. In [15, 16]
we extended this problem by taking into account QoS constraints. First we considered a M/G/1
queue type service with an automatic-repeat-request transmission for each user. The QoS contraint
is manifested by limiting the average wait time W̄ . Thus the problem becomes:
maxRk ,pk uk
subject to fixed γj , ∀j 6= k and W̄k ≤ τk
In this framework we studied the existence of Nash equilibria, and analyzed different network characteristics (e.g. total goodput, maximum number of users that satisfy QoS constraint, etc.).
Some problems remained to be studied further.
One issue concerns the utility function defined above. The sharp delay constraint can be “softened” by using a fixed cost multiplier. Thus the problem turns into:
maxRk ,pk uk − λk (τk − W̄k )
subject to fixed γj , ∀j 6= k
Other possible variants can be imagined.
Another future issue concerns contention-based communication mechanisms, such as 802.11. For
this networks, the service mechanism is not accurately modeled by a M/G/1 queue. It will be
interesting to see how the WLAN type communication can be modeled. Once such a model is
obtained, one can then look at optimal non-cooperative strategies as before. The interesting issue is
to see what is the minimal information one user needs in order to obtain its own optimal strategy.
In the MA scenario this optimization decouples once the user knows the total interfering power and
its transmission gain. Would such a distributed optimization hold true for other service models?
2.3
2.3.1
Topics of Applied Mathematics
Frames: Redundancy, Density and Measure Theory
Frames are redundant sets of vectors in a Hilbert space. While redundancy and excess are straightforward notions in the case of a finite frame set, the similar concepts in the infinite set case are not
so well understood. My collaboration with Z.Landau and the set of papers jointly with him and
P.Casazza and C.Heil ([1, 2, 3]) are important steps toward a better understanding of these concepts.
The key observation (and belief) is that, for a frame set F = {fi , i ∈ I}, a measure of redundancy
is governed by the partial averages of the form
a(J) =
1 X
hfi , f˜i i
|J|
i∈J
where F̃ = {f˜i , i ∈ I} is the canonical dual frame. In the aforementioned papers we linked these
averages to densities of labels and making them computationally feasible.
To give the “flavor” of results we proved I state just some of the results, by refering to Gabor
frames only; However the statements we proved are much more general.
Consider G = {gλ = Uλ g ; λ ∈ Λ} a Gabor frame with canonical dual frame G̃ = {g˜λ ; λ ∈ Λ},
where Λ ⊂ R2d is the set of time-frequencies parameters, Uλ g(x) = eiωx g(x−t), is the time-frequency
shift with parameter λ = (t, ω). We let SR (c) denote a box of size R centered at c in the phase space
R2d , SR (c) = {λ | kλ − ck ≤ R}. For a set I, we let |I| denote its cardinal. Define
a(R, c) =
1
|Λ ∩ SR (c)|
and
D(R, c) =
X
hgλ , g˜λ i
λ∈Λ∩SN (c)
|Λ ∩ SR (c)|
vol(SR (c))
where vol(K) is the volume of set K. The Beurling densities are D+ (Λ) = lim supR→∞ supc D(R, c),
respectively D− (Λ) = lim inf R→∞ inf c D(R, c). I also recall the modulation space
Z
M 1 = {f ∈ L2 (Rd ) ;
|hγλ , f i|dλ < ∞}
where γ(x) = exp(−x2 /2) is the Gaussian window.
Theorem. Let G be a Gabor frame for L2 (Rd ) with canonical dual G̃.
1. Let (Rn , cn ) be a sequence so that D0 = limn D(Rn , cn ) exists. Then:
lim a(Rn , cn ) =
n
1
D0
(2.9)
2. If g ∈ M 1 , then for all λ ∈ Λ, g˜λ ∈ M 1 and there is an envelope F ∈ L1 (R2d ) so that
|hγµ , g˜λ i| ≤ F (µ − λ);
3. Assume D− > 1 and g ∈ M 1 . Then there is a subset Σ ⊂ Λ of positive uniform measure, that
is D+ (Σ) = D− (Σ) > 0, so that G 0 = {gλ ; λ ∈ Λ \ Σ} is frame for L2 (Rd );
4. Assume D+ > 1 and g ∈ M 1 . Then there is a subset Σ ⊂ Λ so that G 0 = {gλ ; λ ∈ Λ \ Σ} is
frame and D+ (Λ \ Σ) < D+ (Λ).
2
The method applies only to Gabor or Gabor like frames. It would be interesting to explore if
and how these methods extend to other sets of frames, in particular to wavelet sets. Even for Gabor
sets there still remains as an open problem the issue of removing subsets of positive density and
leave the remaining set frame with Beurling densities arbitrarily close to one.
2.3.2
Algebras of Time-Frequency/Time-Scale Shift Operators
Consider the set of Time-Frequency shift operators. It naturally forms a group, and by taking
arbitrary linear combinations with absolutely summable coefficients it gives rise to a Banach algebra
with involution:
X
X
cλ Uλ ; kT kAv :=
v(λ)|cλ | < ∞}
(2.10)
Av = {T =
λ
λ
iωx
where Uλ f (x) = e f (x − t) is the time-frequency shift by λ = (t, ω), and v is an admissible
weight (e.g. polynomial growth). Note we do not assume the support of c has a lattice structure,
supp(c) = {λ ∈ R2d ; cλ 6= 0}. In general the support is a countable subset of R2d , possibly
dense. The closure of Av with respect to the operator norm produces a noncommutative C*-algebra
denoted by C. The closure of Av (or C) with respect to the weak (or strong) operator topology is
the full B(L2 (Rd )) algebra of bounded operators on L2 (Rd ). Regarding these algebras, I proved in
a recent preprint the following results
Theorem.
1. The algebra Av is inverse closed. Thus, if T ∈ Av and T is invertible in B(L2 (Rd )), then
T −1 ∈ Av .
2. For any T ∈ Av its spectral radius with respect to algebra Av is the same as the spectral
radius with respect to algebra B(L2 (Rd )).
P
3. Assume T = λ∈Λ cλ Uλ with |Λ| = N < ∞ and R0 = maxλ∈Λ kλk. Assume T is invertible in
2
2
B(L2 (Rd )), and hence in Av as well. Denote A = T −1 B(L2 (Rd )) , B = kT kB(L2 (Rd )) , and
ρ = max(1, 2R0 ), and assume a polynomial weight w(x) = C(1 + x)m for some C > 0 and
m ∈ N. Then
m+N
−1 Cρm kT kAv
A+B
T ≤
(m
+
N
)!
(2.11)
Av
A
2A
2
Furthermore these algebras admit a faithful tracial state, namely
X
T =
cλ Uλ −→ γ(T ) := c0
λ
This is given explicitely by the following result.
(2.12)
Theorem. Consider now G = {gm,n;α,β := Uβn,2παm g | m, n ∈ Zd } a Gabor frame for L2 (Rd ),
with α, β > 0, αβ ≤ 1, and a dual Gabor frame (not necessarily the canonical dual frame) G̃ =
{g̃m,n;α,β := Uβn,2παm g̃ | m, n ∈ Zd }. Then for any T ∈ C,
γ(T ) =
1
(αβ)d
lim
M,N →∞
1
(2M +
1)d (2N
X
+
1)d
X
hT gm,n;α,β , g̃m,n;α,β i
(2.13)
|m|≤M |n|≤N
is the faithful tracial state (2.12) on C, independent of the choice of the Gabor frame G. 2
Putting all these elements together I was able recently to prove a special case of the HeilRamanthan-Topiwala conjecture (linear independence of finitely many time-frequency shifts of an
L2 function) namely:
P
Theorem. For any finite Λ ⊂ R2d and complex scalars (cλ )λ∈Λ , the operator T = λ∈Λ cλ Uλ
has no finite multiplicity eigenvalue. Hence the pure point spectrum, if exists, can only contain
either eigenvalues with infinite multiplicity, or eigenvalues that belong to the continuum part of the
spectrum as well. 2
I plan to study further properties of these and other similar algebras. More specifically:
• I would like to investigate how to extend the eigenspectrum theorem to the infinite multiplicity
case;
• Another case of interest is furnished by dilation operators. Thus time and scale shift operators
are closely related to wavelet sets. There is also interest in the full wave packet group containing
time, frequency, and scale shifts.
• An application of this theory is to the channel equalization problem. More specifically the
question is to invert an operator T ∈ A that has finite support. Our norm estimates suggest
how to approximate the inverse using finitely many coefficients.
2.3.3
Time-Scale Symbols of Bounded Operators
An off-shot of the project on communication channels analysis ([7, 8]) is the study of integral
operators whose kernels act through time-scale shifts. More specifically the class of operators we are
interested in is given by:
Z Z
x−b
1
T f (x) =
L(a, b) p f (
)da db
a
|a|
where L(a,b) is its kernel. It turns out an object of interest for designing a RAKE receiver is
a “sandwich” of operators PTQ, where P and Q are some orthogonal projectors. For particular
choices of P and Q, we were able to prove that PTQ admits decomposition into a convergent series
of type
P T Q = Σm,n cm,n P U m V n Q,
where U and V are some unitary operators. Define the set
X
X
A = {T =
cm,n P U m V n Q ; kT kA :=
|cmn | < ∞}
m,n
m,n
A is a Banach space, subspace in B(Ran Q, Ran P ) the space of bounded operators from Ran Q to
Ran P . Of interest are the cases when P, Q, U, V are chosen so that P U = U P , QV = V Q, and
there are e0 , f0 ∈ L2 so that {U m e0 ; m ∈ Z} is an orthonormal basis in Ran P , and {V n f0 ; n ∈ Z}
is an orthonormal basis in Ran Q. Denote:
am,n = hV m f0 , U n e0 i , hm,n = hP T QV m f0 , U n e0 i
X
X
A(z1 , z2 ) =
am,n z1n z2n , H(z1 , z2 ) =
hm,n z1m z2n
m,n
m,n
So far I proved the following result
Theorem. Assume P Q and P T Q are Hilbert-Schmidt operators.
1. The sequence a = (amn ) is in l2 (Z2 ). Hence A(z1 , z2 ) is a function in L2 (T 2 ). The same goes
for h = (hm,n ) and H(z1 , z2 ).
2. Assume further that for some a0 > 0 and a1 < ∞,
a0 ≤ |A(e2πiθ1 , e2πiθ2 )| ≤ a1
Then
Z
1/2
Z
1/2
dθ2
dθ1
cm,n =
−1/2
−1/2
H(z1 , z2 )
|
2πiθ1 ,z =e2πiθ2
2
A(z1 , z2 ) z1 =e
(2.14)
P
is in l2 (Z2 ) and the series m,n cm,n P U m V n Q converges strongly to P T Q. 2
The typical cases where we applied this result are given by the translation, modulation , and
dilation operators. However the combination dilation-translation does not yield a Hilbert-Schmidt
operator. I plan to explore further such decompositions for other pair of operators, and to understand
the operator algebras generated by them.
2.3.4
Machine Learning: Data Embeddings into Higher Dimensional Linear Spaces
A redundant set of vectors in an Euclidian space performs a linear embedding of the space vectors
into the higher dimensional space of coefficients: x 7→ {hx, fk i}1≤k≤n . A more complex embedding is
given by the absolute value of frame coefficients map considered in the Nonlinear Signal Processing
problem presented above. Nonlinear embeddings given by reproducing kernel Hilbert spaces (RKHS)
are of high interest in classification problem, e.g. kernel support vector machines (KSVMs). I propose
to consider nonlinear embeddings suggested by the RKHS associated to Gabor and Wavelet analysis.
I would like to explore the mathematical fundations of these embeddings into a series of lectures as
an advanced seminar, or develop a new curriculum on this topic.
Bibliography
[1] R. Balan, P. Casazza, C. Heil, and Z. Landau. Deficits and Excesses of Frames. Advances in
Computational Mathematics, 18:93–116, 2003.
[2] R. Balan, P. Casazza, C. Heil, and Z. Landau. Excesses of Gabor Frames. Appl. Comput.
Harmon. Anal., 14:87–106, 2003.
[3] R. Balan, P. Casazza, C. Heil, and Z. Landau. Excess of Parseval frames. In Proceedings of
SPIE Wavelets XI, August 2005.
[4] R. Balan, P.G. Casazza, and D. Edidin. On signal reconstruction from absolute value of frame
coefficients. In Proceedings of SPIE Wavelets XI, August 2005.
[5] R. Balan, P.G. Casazza, and D. Edidin. On Signal Reconstruction without Noisy Phase.
Appl.Comput.Harmon.Anal., 20:345–356, 2006.
[6] R. Balan, P.G. Casazza, and D. Edidin. Equivalence of Reconstruction from the Absolute
Value of the Frame Coefficients to a Sparse Representation Problem. IEEE Sig.Proc.Letters,
May 2007.
[7] R. Balan, H.V. Poor, S. Rickard, and S. Verdu. Canonical time-frequency, time-scale, and
frequency-scale representations of time-varying channels. J. of Comm. in Infor. Syst., 5(5):1–
30, 2005.
[8] R. Balan, V. Poor, S. Rickard, and S. Verdú. Frequency and Time-Scale Canonical Representations of Doubly Spread Channels. In Proceedings of EUSIPCO 2004, Vienna Austria,
September 2004.
[9] R. Balan, J. Rosca, and S. Rickard. Non-square Blind Source Separation under Coherent Noise
by Beamforming and Time-Frequency Masking. In Proc. ICA, 2003.
[10] R. Balan, J. Rosca, and S. Rickard. Equivalence Principle for Optimization of Sparse versus
Low-Spread Representations for Signal Estimation in Noise. International Journal of Imaging
Systems and Technology, 15(1):10–17, 2005.
[11] G. Bianchi. Performance analysis of the ieee 802.11 distributed coordination function. IEEE
Journal on Selected Areas of Communications, 3(18):535–547, 2000.
[12] K. Chan, S.T. Roweis, and B.J. Frey. Probabilistic inference of speech signals from phaseless
spectrograms. In Proceedings of Neural Information Processing Systems (NIPS03), volume 16,
2003.
[13] D.L. Donohoe and X. Huo. Uncertainty principles and ideal atomic decomposition. IEEE Trans
IT, 47(7):2845–2862, 2001.
[14] P.E. Engelstad and O.N. Osterboro. Non-saturation and saturation analysis of ieee 802.11e edca
with starvation prediction. In Proc. 8th ACM Int.Symp.onModel.Anal.Sim.WirelessMob.Syst.,
MSWiM’05, 2005.
[15] F. Meshkati, H.V. Poor, S.C. Schwartz, and R.V. Balan. Energy-Efficient Power and Rate
Control with QoS Constraints: A Game-Theoretic Approach. In Proc. Int. Comm. and Mobile
Comp. Conf., 2006.
[16] F. Meshkati, H.V. Poor, S.C. Schwartz, and R.V. Balan. Energy-Efficient Resource Allocation
in Wireless Networks with Quality-Of-Service Constraints. to appear in IEEE Trans. in Comm.,
2006.
[17] J. Rosca, C. Borss, and R. Balan. Generalized sparse signal mixing model and application to
noisy blind source separation. In Proc. ICASSP, 2004.
[18] S. T. Roweis. One microphone source separation. In Neural Information Processing Systems
13 (NIPS), pages 793–799, 2000.
[19] P.J Wolfe, S.J. Godsill, and W.J. Ng. Bayesian variable selection and regularization for timefrequency surface estimation. J.R.Statist.Soc.B, 66(Part 3):575–589, 2004.
Download