To appear in Proc. of the 27th Annual Asilomar Conference on Signals, Systems and Computers, Nov. 1-3, 1993.
Orthogonal Matching Pursuit:
Recursive Function Approximation with Applications to Wavelet
Decomposition
Y. C. PATI
Information Systems Laboratory
Dept. of Electrical Engineering
Stanford University, Stanford, CA 94305
R. REZAIIFAR AND P. S. KRISHNAPRASAD
Institute for Systems Research
Dept. of Electrical Engineering
University of Maryland, College Park, MD 20742

Abstract

In this paper we describe a recursive algorithm to compute representations of functions with respect to nonorthogonal and possibly overcomplete dictionaries of elementary building blocks, e.g. affine (wavelet) frames. We propose a modification to the Matching Pursuit algorithm of Mallat and Zhang (1992) that maintains full backward orthogonality of the residual (error) at every step and thereby leads to improved convergence. We refer to this modified algorithm as Orthogonal Matching Pursuit (OMP). It is shown that all additional computation required for the OMP algorithm may be performed recursively.
1 Introduction and Background

Given a collection of vectors $D = \{x_n\}$ in a Hilbert space $\mathcal{H}$, let us define

    $V = \mathrm{Span}\{x_n\}$, and $W = V^\perp$ (in $\mathcal{H}$).

We shall refer to $D$ as a dictionary, and will assume the vectors $x_n$ are normalized ($\|x_n\| = 1$). In [3] Mallat and Zhang proposed an iterative algorithm that they termed Matching Pursuit (MP) to construct representations of the form

    $P_V f = \sum_n a_n x_n,$    (1)

where $P_V$ is the orthogonal projection operator onto $V$. Each iteration of the MP algorithm results in an intermediate representation of the form

    $f = \sum_{n=1}^{k} a_n x_n + R_k f = f_k + R_k f,$

where $f_k$ is the current approximation, and $R_k f$ the current residual (error). Using initial values of $R_0 f = f$, $f_0 = 0$, and $k = 0$, the MP algorithm is comprised of the following steps:

(I) Compute the inner products $\{\langle R_k f, x_n \rangle\}_n$.

(II) Find $n_{k+1}$ such that

    $|\langle R_k f, x_{n_{k+1}} \rangle| \ge \alpha \sup_j |\langle R_k f, x_j \rangle|,$

where $0 < \alpha \le 1$.

(III) Set

    $f_{k+1} = f_k + \langle R_k f, x_{n_{k+1}} \rangle x_{n_{k+1}},$
    $R_{k+1} f = R_k f - \langle R_k f, x_{n_{k+1}} \rangle x_{n_{k+1}}.$

(IV) Increment $k$ ($k \leftarrow k + 1$), and repeat steps (I)-(IV) until some convergence criterion has been satisfied.

The proof of convergence [3] of MP relies essentially on the fact that $\langle R_{k+1} f, x_{n_{k+1}} \rangle = 0$. This orthogonality of the residual to the last vector selected leads to the following "energy conservation" equation:

    $\|R_k f\|^2 = \|R_{k+1} f\|^2 + |\langle R_k f, x_{n_{k+1}} \rangle|^2.$    (2)
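For a finite dictionary in $\mathbb{R}^d$, the MP iteration above can be sketched as follows. This is our own illustrative numpy rendering, not code from the paper; function and variable names are ours, and we take $\alpha = 1$, i.e. the best-matching atom is always selected in step (II).

```python
import numpy as np

def matching_pursuit(f, X, n_iter=50):
    """Matching Pursuit of Mallat and Zhang, steps (I)-(IV) above.

    f : signal vector in R^d.
    X : d x M array whose columns are the unit-norm dictionary vectors x_n.
    Returns the approximation f_k and the residual R_k f.
    """
    Rf = f.astype(float).copy()            # R_0 f = f
    fk = np.zeros_like(Rf)                 # f_0 = 0
    for _ in range(n_iter):
        inner = X.T @ Rf                   # step (I): inner products <R_k f, x_n>
        n_sel = int(np.argmax(np.abs(inner)))  # step (II), alpha = 1
        c = inner[n_sel]
        fk = fk + c * X[:, n_sel]          # step (III): f_{k+1} = f_k + <R_k f, x> x
        Rf = Rf - c * X[:, n_sel]          #             R_{k+1} f = R_k f - <R_k f, x> x
    return fk, Rf
```

Note that $f_k + R_k f = f$ is preserved exactly at every step, and each iteration removes the component of the residual along the selected atom, which is the content of the energy conservation identity (2).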
It has been noted that the MP algorithm may be derived as a special case of a technique known as Projection Pursuit (cf. [2]) in the statistics literature.
A shortcoming of the Matching Pursuit algorithm in its originally proposed form is that although asymptotic convergence is guaranteed, the resulting approximation after any finite number of iterations will in general be suboptimal in the following sense. Let $N < \infty$ be the number of MP iterations performed. Thus we have

    $f_N = \sum_{k=0}^{N-1} \langle R_k f, x_{n_{k+1}} \rangle x_{n_{k+1}}.$

Define $V_N = \mathrm{Span}\{x_{n_1}, \ldots, x_{n_N}\}$. We shall refer to $f_N$ as an optimal $N$-term approximation if $f_N = P_{V_N} f$, i.e. $f_N$ is the best approximation we can construct using the selected subset $\{x_{n_1}, \ldots, x_{n_N}\}$ of the dictionary $D$. (Note that this notion of optimality does not involve the problem of selecting an "optimal" $N$-element subset of the dictionary.) In this sense, $f_N$ is an optimal $N$-term approximation if and only if $R_N f \perp V_N$. As MP only guarantees that $R_N f \perp x_{n_N}$, $f_N$ as generated by MP will in general be suboptimal. The difficulty with such suboptimality is easily illustrated by a simple example in $\mathbb{R}^2$. Let $x_1$ and $x_2$ be two vectors in $\mathbb{R}^2$ and take $f \in \mathbb{R}^2$ as shown in Figure 1(a). Figure 1(b) is a plot of $\|R_k f\|^2$ versus $k$. Hence although asymptotic convergence is guaranteed, after any finite number of steps the error may still be quite large.¹

Figure 1: Matching pursuit example in $\mathbb{R}^2$: (a) dictionary $D = \{x_1, x_2\}$ and a vector $f \in \mathbb{R}^2$; (b) $\|R_k f\|^2$ versus $k$.

In this paper we propose a refinement of the Matching Pursuit (MP) algorithm that we refer to as Orthogonal Matching Pursuit (OMP). For nonorthogonal dictionaries, OMP will in general converge faster than MP. For any finite size dictionary of $N$ elements, OMP converges to the projection onto the span of the dictionary elements in no more than $N$ steps. Furthermore, after any finite number of iterations, OMP gives the optimal approximation with respect to the selected subset of the dictionary. This is achieved by ensuring full backward orthogonality of the error, i.e. at each iteration $R_k f \perp V_k$. For the example in Figure 1, OMP ensures convergence in exactly two iterations. It is also shown that the additional computation required for OMP takes a simple recursive form.

We demonstrate the utility of OMP by example of applications to representing functions with respect to time-frequency localized affine wavelet dictionaries. We also compare the performance of OMP with that of MP on two numerical examples.

2 Orthogonal Matching Pursuit

Assume we have the following $k$th-order model for $f \in \mathcal{H}$,

    $f = \sum_{n=1}^{k} a_n^k x_n + R_k f, \quad$ with $\langle R_k f, x_n \rangle = 0$, $n = 1, \ldots, k$.    (3)

The superscript $k$ in the coefficients $a_n^k$ shows the dependence of these coefficients on the model order. We would like to update this $k$th-order model to a model of order $k+1$,

    $f = \sum_{n=1}^{k+1} a_n^{k+1} x_n + R_{k+1} f, \quad$ with $\langle R_{k+1} f, x_n \rangle = 0$, $n = 1, \ldots, k+1$.    (4)

Since elements of the dictionary $D$ are not required to be orthogonal, to perform such an update we also require an auxiliary model for the dependence of $x_{k+1}$ on the previous $x_n$'s ($n = 1, \ldots, k$). Let
    $x_{k+1} = \sum_{n=1}^{k} b_n^k x_n + \gamma_k, \quad$ with $\langle \gamma_k, x_n \rangle = 0$, $n = 1, \ldots, k$.    (5)

Thus $\sum_{n=1}^{k} b_n^k x_n = P_{V_k} x_{k+1}$ and $\gamma_k = P_{V_k^\perp} x_{k+1}$, so $\gamma_k$ is the component of $x_{k+1}$ which is unexplained by $\{x_1, \ldots, x_k\}$.

Using the auxiliary model (5), it may be shown that the correct update from the $k$th-order model to the model of order $k+1$ is given by

    $a_n^{k+1} = a_n^k - \alpha_k b_n^k, \quad n = 1, \ldots, k, \qquad a_{k+1}^{k+1} = \alpha_k,$    (6)

where

    $\alpha_k = \frac{\langle R_k f, x_{k+1} \rangle}{\langle \gamma_k, x_{k+1} \rangle} = \frac{\langle R_k f, x_{k+1} \rangle}{\|\gamma_k\|^2}, \qquad \|\gamma_k\|^2 = \|x_{k+1}\|^2 - \sum_{n=1}^{k} b_n^k \langle x_n, x_{k+1} \rangle.$

¹ A similar difficulty with the Projection Pursuit algorithm was noted by Donoho et al. [1], who suggested that backfitting may be used to improve the convergence of PPR. Although the technique is not fully described in [1], it appears that it is in the same spirit as the technique we present here.
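As a quick numerical sanity check (ours, not from the paper), the update (6) for the step from order 1 to order 2 can be verified against the coefficients obtained by directly projecting $f$ onto $\mathrm{Span}\{x_1, x_2\}$; the vectors and signal below are illustrative choices.

```python
import numpy as np

x1 = np.array([1.0, 0.0, 0.0])
x2 = np.array([0.6, 0.8, 0.0])           # unit norm, not orthogonal to x1
f  = np.array([1.0, 2.0, 3.0])

# Order-1 model: a_1^1 = <f, x1> (x1 is unit norm), R_1 f = f - a_1^1 x1.
a1 = f @ x1
R1 = f - a1 * x1

# Auxiliary model (5): x2 = b x1 + gamma, with <gamma, x1> = 0.
b = x2 @ x1
gamma = x2 - b * x1

# Update (6): alpha_1 = <R_1 f, x2> / ||gamma||^2, then
# a_1^2 = a_1^1 - alpha_1 b and a_2^2 = alpha_1.
alpha = (R1 @ x2) / (gamma @ gamma)
a_new = np.array([a1 - alpha * b, alpha])

# Direct least-squares coefficients of the projection onto Span{x1, x2}.
a_ls, *_ = np.linalg.lstsq(np.column_stack([x1, x2]), f, rcond=None)
print(np.allclose(a_new, a_ls))          # prints True
```

Here $f_2 = f_1 + \alpha_1 \gamma_1$, so the new residual is orthogonal to both $x_1$ and $x_2$, which is exactly the order-2 condition in (4).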
It also follows that the residual $R_{k+1} f$ satisfies

    $R_k f = R_{k+1} f + \alpha_k \gamma_k, \quad$ and $\quad \|R_k f\|^2 = \|R_{k+1} f\|^2 + |\alpha_k|^2 \|\gamma_k\|^2.$    (7)

2.1 Some Properties of OMP

As in the case of MP, convergence of OMP relies on an energy conservation equation that now takes the form (7). The following theorem summarizes the convergence properties of OMP.

Theorem 2.1 For $f \in \mathcal{H}$, let $\{R_k f\}$ be the residuals generated by OMP. Then

    (i) $\lim_{k \to \infty} \|R_k f - P_W f\| = 0$,
    (ii) $f_N = P_{V_N} f$, $N = 0, 1, 2, \ldots$

Proof: The proof of convergence parallels the proof of Theorem 1 in [3]. The proof of the second property follows immediately from the orthogonality conditions of Equation (3).

The key difference between MP and OMP lies in Property (ii) of Theorem 2.1. Property (ii) implies that at the $N$th step we have the best approximation we can get using the $N$ vectors we have selected from the dictionary. Therefore in the case of finite dictionaries of size $M$, OMP converges in no more than $M$ iterations to the projection of $f$ onto the span of the dictionary elements. As mentioned earlier, Matching Pursuit does not possess this property.

2.2 The OMP Algorithm

The results of the previous section may be used to construct the following algorithm, which we will refer to as Orthogonal Matching Pursuit (OMP).

Initialization: $f_0 = 0$, $R_0 f = f$, $D_0 = \{\}$, $k = 0$.

(I) Compute $\{\langle R_k f, x_n \rangle : x_n \in D \setminus D_k\}$.

(II) Find $x_{n_{k+1}} \in D \setminus D_k$ such that

    $|\langle R_k f, x_{n_{k+1}} \rangle| \ge \alpha \sup_j |\langle R_k f, x_j \rangle|, \quad 0 < \alpha \le 1.$

(III) If $|\langle R_k f, x_{n_{k+1}} \rangle| < \delta$ ($\delta > 0$) then stop.

(IV) Reorder the dictionary $D$ by applying the permutation $k+1 \leftrightarrow n_{k+1}$.

(V) Compute $\{b_n^k\}_{n=1}^{k}$ such that

    $x_{k+1} = \sum_{n=1}^{k} b_n^k x_n + \gamma_k \quad$ and $\quad \langle \gamma_k, x_n \rangle = 0$, $n = 1, \ldots, k$.

(VI) Set

    $\alpha_k = a_{k+1}^{k+1} = \langle R_k f, x_{k+1} \rangle / \|\gamma_k\|^2,$
    $a_n^{k+1} = a_n^k - \alpha_k b_n^k, \quad n = 1, \ldots, k,$

and update the model:

    $f_{k+1} = \sum_{n=1}^{k+1} a_n^{k+1} x_n, \qquad R_{k+1} f = f - f_{k+1}, \qquad D_{k+1} = D_k \cup \{x_{k+1}\}.$

(VII) Set $k \leftarrow k + 1$, and repeat (I)-(VII).

2.3 Some Computational Details

As in the case of MP, the inner products $\{\langle R_k f, x_n \rangle\}$ may be computed recursively. For OMP we may express these recursions implicitly in the formula

    $\langle R_k f, x_n \rangle = \langle f - f_k, x_n \rangle = \langle f, x_n \rangle - \sum_{j=1}^{k} a_j^k \langle x_j, x_n \rangle.$    (8)

The only additional computation required for OMP arises in determining the $b_n^k$'s of the auxiliary model (5). To compute the $b_n^k$'s we rewrite the normal equations associated with (5) as a system of $k$ linear equations,

    $v_k = A_k b_k,$    (9)

where $b_k = [b_1^k, \ldots, b_k^k]^T$, $v_k = [\langle x_{k+1}, x_1 \rangle, \ldots, \langle x_{k+1}, x_k \rangle]^T$, and

    $A_k = \begin{bmatrix} \langle x_1, x_1 \rangle & \langle x_2, x_1 \rangle & \cdots & \langle x_k, x_1 \rangle \\ \langle x_1, x_2 \rangle & \langle x_2, x_2 \rangle & \cdots & \langle x_k, x_2 \rangle \\ \vdots & & \ddots & \vdots \\ \langle x_1, x_k \rangle & \langle x_2, x_k \rangle & \cdots & \langle x_k, x_k \rangle \end{bmatrix}.$

Note that the positive constant $\delta$ used in Step (III) of OMP guarantees nonsingularity of the matrix $A_k$, hence we may write

    $b_k = A_k^{-1} v_k.$    (10)

However, since $A_{k+1}$ may be written as

    $A_{k+1} = \begin{bmatrix} A_k & v_k \\ v_k^* & 1 \end{bmatrix}$    (11)

(where $*$ denotes conjugate transpose), it may be shown using the block matrix inversion formula that

    $A_{k+1}^{-1} = \begin{bmatrix} A_k^{-1} + \beta\, b_k b_k^* & -\beta\, b_k \\ -\beta\, b_k^* & \beta \end{bmatrix},$    (12)

where $\beta = 1/(1 - v_k^* b_k)$. Hence $A_{k+1}^{-1}$, and therefore $b_{k+1}$, may be computed recursively using $A_k^{-1}$ and $b_k$ from the previous step.
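Steps (I)-(VII), together with the recursive Gram-inverse update of Equations (9)-(12), can be sketched in numpy as follows. This is our own illustrative rendering, not code from the paper: names are ours, selection uses $\alpha = 1$, and for clarity the residual is recomputed directly as $R_{k+1} f = f - f_{k+1}$ rather than via the recursion (8).

```python
import numpy as np

def omp(f, X, n_iter, delta=1e-10):
    """Orthogonal Matching Pursuit with the recursive update of Eqs. (9)-(12).

    f : signal in R^d;  X : d x M dictionary with unit-norm columns.
    Returns the selected column indices (the set D_k) and coefficients a^k.
    """
    Rf = f.astype(float).copy()        # R_0 f = f
    idx = []                           # indices of selected atoms
    a = np.zeros(0)                    # coefficients a_n^k
    Ainv = np.zeros((0, 0))            # A_k^{-1}, inverse Gram of selected atoms
    for _ in range(n_iter):
        inner = X.T @ Rf               # step (I)
        inner[idx] = 0.0               # restrict the search to D \ D_k
        n_sel = int(np.argmax(np.abs(inner)))   # step (II), alpha = 1
        if abs(inner[n_sel]) < delta:  # step (III): stop
            break
        x_new = X[:, n_sel]
        if not idx:
            b = np.zeros(0)
            gamma = x_new
            Ainv = np.array([[1.0]])   # A_1 = [<x_1, x_1>] = [1]
        else:
            Xk = X[:, idx]
            v = Xk.T @ x_new           # v_k of Eq. (9)
            b = Ainv @ v               # Eq. (10): b_k = A_k^{-1} v_k
            gamma = x_new - Xk @ b     # unexplained part of the new atom, Eq. (5)
            beta = 1.0 / (1.0 - v @ b)                     # Eq. (12)
            Ainv = np.block([
                [Ainv + beta * np.outer(b, b), -beta * b[:, None]],
                [-beta * b[None, :],            np.array([[beta]])],
            ])
        alpha_k = (Rf @ x_new) / (gamma @ gamma)   # Eq. (6)
        a = np.append(a - alpha_k * b, alpha_k)    # a_n^{k+1} = a_n^k - alpha_k b_n^k
        idx.append(n_sel)
        Rf = f - X[:, idx] @ a                     # R_{k+1} f = f - f_{k+1}
    return idx, a
```

Because the coefficients are the least-squares coefficients at every step, the residual stays orthogonal to all selected atoms, and a signal lying in the span of a few atoms is recovered exactly in finitely many iterations.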
3 Examples

In the following examples we consider representations with respect to an affine wavelet frame constructed from dilates and translates of the second derivative of a Gaussian, i.e. $D = \{\psi_{m,n};\ m, n \in \mathbb{Z}\}$, where

    $\psi_{m,n}(t) = 2^{m/2}\, \psi(2^m t - n),$

and the analyzing wavelet $\psi$ is given by

    $\psi(t) = \left(\tfrac{2}{\sqrt{3}}\right) \pi^{-1/4} \, (t^2 - 1)\, e^{-t^2/2}.$

Note that for wavelet dictionaries, the initial set of inner products $\{\langle f, \psi_{m,n} \rangle\}$ is readily computed by one convolution followed by sampling at each dilation level $m$. The dictionary used in these examples consists of a total of 351 vectors.

In our first example, both OMP and MP were applied to the signal shown in Figure 2(a). We see from Figure 2(b) that OMP clearly converges in far fewer iterations than MP. The squared magnitude of the coefficients $a_n^k$ of the resulting representation is shown in Figure 3.

Figure 2: Example I: (a) Original signal $f$, with OMP approximation superimposed; (b) squared $L^2$ norm of the residual $R_k f$ versus iteration number $k$, for both OMP (solid line) and MP (dashed line).

Figure 3: Distribution of coefficients obtained by applying OMP in Example I. Shading is proportional to squared magnitude of the coefficients $a_n^k$, with dark colors indicating large magnitudes.

We could also compare the two algorithms on the basis of the computational effort required to compute representations of signals to within a prespecified error. However, such a comparison can only be made for a given signal and dictionary, as the number of iterations required for each algorithm depends on both the signal and the dictionary. For example, for the signal of Example I, we see from Figure 4 that it is 3 to 8 times more expensive to achieve a prespecified error using OMP, even though OMP converges in fewer iterations. On the other hand, for the signal shown in Figure 5, which lies in the span of three dictionary vectors, it is approximately 20 times more expensive to apply MP. In this case OMP converges in exactly three iterations.
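A dictionary of this type can be built in outline as follows. This is a sketch under our own assumptions: the sampling grid, the scale range, and the translation ranges below are illustrative choices and do not reproduce the paper's exact 351-vector dictionary.

```python
import numpy as np

def mexican_hat(t):
    # Second derivative of a Gaussian (up to sign and normalization), as above:
    # psi(t) = (2/sqrt(3)) * pi^(-1/4) * (t^2 - 1) * exp(-t^2/2).
    return (2.0 / np.sqrt(3.0)) * np.pi**-0.25 * (t**2 - 1.0) * np.exp(-t**2 / 2.0)

def wavelet_dictionary(t, scales, shifts_per_scale):
    """Columns are sampled psi_{m,n}(t) = 2^{m/2} psi(2^m t - n), renormalized
    to unit norm on the sample grid."""
    atoms = []
    for m in scales:
        for n in shifts_per_scale[m]:
            g = 2.0**(m / 2.0) * mexican_hat(2.0**m * t - n)
            atoms.append(g / np.linalg.norm(g))
    return np.column_stack(atoms)

t = np.linspace(-8, 8, 512)
scales = [-1, 0, 1, 2]
# Keep the translate centers n / 2^m roughly inside the grid.
shifts = {m: range(-int(4 * 2**m), int(4 * 2**m) + 1) for m in scales}
D = wavelet_dictionary(t, scales, shifts)
```

On this grid the initial inner products $\langle f, \psi_{m,n} \rangle$ of step (I) are simply `D.T @ f`; as noted above, they can equivalently be obtained by one convolution per dilation level $m$.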
4 Summary and Conclusions

In this paper we have described a recursive algorithm, which we refer to as Orthogonal Matching Pursuit (OMP), to compute representations of signals with respect to arbitrary dictionaries of elementary functions. The algorithm we have described is a modification of the Matching Pursuit (MP) algorithm of Mallat and Zhang [3] that improves convergence using an additional orthogonalization step at each iteration. The main benefit of OMP over MP is the fact that it is guaranteed to converge in a finite number of steps for a finite dictionary. We also demonstrated that all additional computation that is required for OMP may be performed recursively.

The two algorithms, MP and OMP, were compared on two simple examples of decomposition with respect to a wavelet dictionary. It was noted that although OMP converges in fewer iterations than MP, the computational effort required for each algorithm depends on both the class of signals and the choice of dictionary. Although we do not provide a rigorous argument here, it seems reasonable to conjecture that OMP will be computationally cheaper than MP for very redundant dictionaries, as knowledge of the redundancy is exploited in OMP to reduce the error as much as possible.

Figure 4: Computational cost (FLOPS) versus approximation error for both OMP (solid line) and MP (dashed line) applied to the signal in Example I.

Figure 5: Example II: (a) Original signal $f$; (b) squared $L^2$ norm of the residual $R_k f$ versus iteration number $k$, for both OMP (solid line) and MP (dashed line).

Acknowledgements

The research of Y.C.P. was supported in part by NASA Headquarters, Center for Aeronautics and Space Information Sciences (CASIS) under Grant NAGW419,S6, and in part by the Advanced Research Projects Agency of the Department of Defense monitored by the Air Force Office of Scientific Research under Contract F49620-93-I-Oog5.

The research of R.R. and P.S.K. was supported in part by the Air Force Office of Scientific Research under Contract F49620-92-J-0500, the AFOSR University Research Initiative Program under Grant AFOSR-90-0105, by the Army Research Office under Smart Structures URI Contract no. DAALO3-92-G-0121, and by the National Science Foundation's Engineering Research Centers Program, NSFD CDR 8803012.

References

[1] D. Donoho, I. Johnstone, P. Rousseeuw, and W. Stahel. The Annals of Statistics, 13(2):496-500, 1985. Discussion following article by P. Huber.

[2] P. J. Huber. Projection pursuit. The Annals of Statistics, 13(2):435-475, 1985.

[3] S. Mallat and Z. Zhang. Matching pursuits with time-frequency dictionaries. Preprint. Submitted to IEEE Transactions on Signal Processing, 1992.