Approximation Algorithms for Indefinite Complex Quadratic Maximization Problems Yongwei Huang Shuzhong Zhang

advertisement
Approximation Algorithms for Indefinite Complex Quadratic
Maximization Problems
Yongwei Huang
∗
Shuzhong Zhang
†
July 2005
Abstract
In this paper we consider the following two types of complex quadratic maximization problems:
(i) maximize z H Qz, subject to zkm = 1, k = 1, ..., n, where Q is a Hermitian matrix with tr Q = 0
and z ∈ Cn is the decision vector; (ii) maximize Re y H Az, subject to ykm = 1, k = 1, ..., p, and
zlm = 1, l = 1, ..., q, where A ∈ Cp×q and y ∈ Cp and z ∈ Cq are the decision vectors. In
the real cases (namely m = 2 and the matrices are all real-valued), such problems were recently
considered by Charikar and Wirth [6] and Alon and Naor [1] respectively. In particular, Charikar
and Wirth [6] presented an Ω(1/ log n) approximation algorithm for Problem (i) in the real case,
and Alon and Naor [1] presented a 0.56-approximation algorithm for Problem (ii) in the real
case. In this paper, we propose
Ω(1/ log n) ´approximation algorithm for the general version of
³ an
m2 (1−cos 2π
m)
complex Problem (i), and a
− 1 -approximation algorithm for the general version
4π
of complex Problem (ii). For the limit and continuous version of complex Problem (ii) (m → ∞),
we further show that a 0.7118 approximation ratio can be achieved.
Keywords: indefinite Hermitian matrix, randomized algorithms, approximation ratio, semidefinite programming relaxation.
MSC subject classification: 90C20, 90C22.
∗
Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin,
Hong Kong (ywhuang@se.cuhk.edu.hk).
†
Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin,
Hong Kong (zhang@se.cuhk.edu.hk). Research supported by Hong Kong RGC Earmarked Grant CUHK418505.
1
1
Introduction
Polynomial-time approximation algorithms for NP-hard problems via semidefinite programming (SDP)
have received much attention in the last decade, since, in several important cases, this approach leads
to significant improvements on the worst-case approximation ratios. The pioneering work along this
direction is the famous 0.87856-approximation algorithm of Goemans and Williamson [7] for the
Max-Cut problem. Nesterov [11] and Ye [14] obtained π2 -approximation algorithms for quadratic
maximization problem with a positive semidefinite objective matrix. Actually, the π2 approximation
result can also be derived from the so-called real matrix cube theorem developed by Ben-Tal and
Nemirovski in [3]. Alternatively, Alon and Naor [1] showed that the π2 approximation result can also
be obtained via the so-called Grothendieck’s Inequality. Moreover, Ben-Tal and Nemirovski [3] and
Alon and Naor [1] showed that the π2 bound is essentially tight.
Recently, Goemans and Williamson [8] proposed a randomized approximation algorithm, via complex
SDP relaxation, for solving the Max-3-Cut problem, which is formulated as a quadratic maximization
problem with complex-valued decision variables. In particular, they consider the following model:
maximize z H Qz subject to zk3 = 1, k = 1, ..., n, where Q is the Laplacian of the graph (hence positive semidefinite for a nonnegatively weighted graph). By a SDP relaxation and random hyperplane
method, Goemans and Williamson showed that the algorithm achieves a 0.836 approximation ratio.
Recently, Zhang and Huang [16] extended Goemans and Williamson’s model [8], where they first
developed a closed-form formula for computing the probability of a complex-valued normally distributed bivariate random vector to be in a given angular region, and then applying this formula they
obtained an approximation ratio π4 for the problem: maximize z H Qz subject to |zk | = 1, k = 1, ..., n,
where Q is Hermitian positive semidefinite. Similar as with its real-case counter-part, in fact this π4
approximation ratio can also be obtained in two other ways: either by the so-called complex matrix
cube theorem developed by Ben-Tal, Nemirovski and Roos [4], or by the complex Grothendieck’s Inequality approach developed by Haagerup [9]. However, these approaches shed lights on the problem
from very different angles. For the discrete version of the model: maximize z H Qz subject to zkm = 1,
k = 1, ..., n, So, Zhang and Ye [12] obtained the m2 (1 − cos 2π
m )/8π approximation ratio based on
Grothendieck’s Inequality. We obtained the same approximation ratio in a later version of [16], by
using the probability formula that we developed earlier.
Charikar and Wirth [6] considered the following problem: maximize xT Qx subject to x2k = 1,
k = 1, ..., n, where Q is a symmetric matrix with zero diagonal elements. They presented an
Ω(1/ log n)-approximation algorithm for such problems. The novelty of their approach lies in a
rounding procedure, taking into account the size of the projections of a random vector onto the
solution vectors.
A related paper was due to Alon and Naor [1], where they considered the so-called cut-norm of a real
2
matrix. In fact, they first considered another norm, denoted as ||A||∞7→1 , of a real matrix A ∈ <m×n ,
with ||A||∞7→1 := max{xT Ay | x2i = 1, yj2 = 1, i = 1, ..., m, j = 1, ..., n}. Alon and Naor [1] proposed
a 0.56-approximation algorithm for computing the norm ||A||∞7→1 of a real matrix A, where the
π √
ratio 0.56 is the reciprocal of an upper bound 2 ln(1+
of the real Grothendieck’s constant. Clearly,
2)
||A||∞7→1 can also be viewed as the optimal value of a binary quadratic programming, which in fact
is a subclass of the problem considered by Charikar and Wirth [6].
In this paper, we consider approximation algorithms for indefinite complex quadratic programming
with m-point constellation constraint. In particular, we consider the following two problems. Problem (i): maximize z H Qz, subject to zkm = 1, k = 1, ..., n, where Q is a Hermitian matrix with
tr Q = 0; and Problem (ii): maximize Re y H Az, subject to ykm = 1, k = 1, ..., p, and zlm = 1,
l = 1, ..., q, where A ∈ Cp×q . These models are extensions of Charikar and Wirth [6], and Alon and
Naor
[1] respectively.
We propose an Ω(1/ log n)-approximation algorithm for Problem (i), and a
³
´
m2 (1−cos
4π
2π
)
m
− 1 -approximation algorithm for Problem (ii). Some further extensions of the model
in Problem (i) are also considered. With regard to Problem (ii), we propose a 0.7118-approximation
algorithm in the limiting continuous case (m → ∞), where we make use of several results established
in Zhang and Huang [16], and Haagerup [9].
This paper is organized as follows. In Section 2 we discuss an approximation algorithm for the indefinite complex quadratic maximization problem: Problem (i), and show that the approximation results
can be obtained for quadratic programming with convex constraint on squared modulus and continuous complex quadratic programming. In Section 3, we study Problem (ii), and study approximation
algorithms for both the discrete and continuous cases.
Notation. We denote by ā the conjugate of a complex number a, Arg z the argument of z, |z| the
modulus of z, and by Cn the space of n-dimensional complex vectors. As a convention we assume
Arg z = 0 if z = 0. For a given vector z ∈ Cn , we denote z H the conjugate transpose of z, and
Diag(z) the n×n diagonal matrix with diagonal entries taken from z, and if Z is an n×n matrix, then
diag(Z) denotes an n-dimensional vector formed by the diagonal elements of Z. The space of n × n
real symmetric and the space of complex Hermitian matrices are denoted by S n and Hn , respectively.
For a matrix Z ∈ Hn , we write Re Z and Im Z for the real part and imaginary part of Z, respectively.
Matrix Z being Hermitian implies that Re Z is symmetric and Im Z is skew-symmetric. We denote
n (S n ) and Hn (Hn ) the cones of real symmetric positive semidefinite (positive definite)
by S+
++
+
++
and complex Hermitian positive semidefinite (positive definite) matrices, respectively. The notation
Z º 0 (Â 0) means that Z is positive semidefinite (positive definite). For two complex matrices Y and
£
¤
Z, their inner product Y • Z is defined to be Re (tr Y H Z) = tr (Re Y )T (Re Z) + (Im Y )T (Im Z) ,
where tr denotes the trace of a matrix and T denotes the transpose of a matrix.
3
2
Indefinite complex quadratic maximization
In this section, we consider the following (indefinite) complex quadratic maximization problem
(DQP) max z H Qz
s.t. zk ∈ C and zkm = 1, k = 1, . . . , n,
where Q 6= 0 is an indefinite Hermitian matrix with diag(Q) = 0, and m ≥ 2 is an integer which is a
part of the input parameter of the problem. Clearly, the problem can be more explicitly written as
(DQP) max z H Qz
s.t. zk ∈ {1, ω, . . . , ω m−1 }, k = 1, . . . , n,
2π
where ω = ei m = cos 2π
m + i sin m .
2π
In case Q is real symmetric and m = 2, the above model coincide with the one considered in Charikar
and Wirth [6]. We remark that, as |zk | = 1, the objective function value remains unchanged if we
replace the condition diag(Q) = 0 by tr Q = 0. Zhang and Huang [16] considered approximation
algorithms for such problem when Q is Hermitian positive semidefinite. In that case, applications
of such models arise from solving the Max-3-Cut problem (Goemans and Williamson [8]) and signal
processing (Luo, Luo, and Kisialiou [10]).
We solve the following semidefinite programming as a relaxation of (DQP):
(SDP) max Q • Z
s.t. Zkk = 1, k = 1, . . . , n,
Z º 0.
This relaxed problem (complex SDP) can be solved in polynomial time up to any prescribed precision.
For practical solution methods, see e.g. Sturm [13].
Next we shall discuss what to do with the optimal solution of the relaxed problem, and our method
is inspired by Charikar and Wirth [6].
Suppose that Z ∗ is an optimal solution of (SDP). We draw a random complex vector ξ ∈ Nc (0, Z ∗ ),
where Nc (0, Z ∗ ) stands for the n-dimensional complex-valued normal distribution with mean vector
zero and covariance matrix Z ∗ . (More discussions on complex-valued normal distribution can be
found in [2] and [16]). For k = 1, 2, ..., n, let
(
ξk /|ξk |, if |ξk | > T ;
xk :=
(1)
ξk /T,
if |ξk | ≤ T ,
where T > 0 is a number to be specified later.
4
This process naturally produces a complex vector x ∈ Cn with |xk | ≤ 1, k = 1, ..., n. We then use x
to further randomly generate zk independently from each other as follows:

1,
with probability (1 + Re xk )/m




.

.


 .
zk =
(2)
ωj ,
with probability (1 + Re (ω −j xk ))/m


.

..




 m−1
ω
, with probability (1 + Re (ω −(m−1) xk ))/m
for k = 1, . . . , n. Indeed we note that (1 + Re (ω −j xk ))/m ≥ 0 for all j = 0, 1, ..., m − 1, and that

 
m−1
m−1
X
X
1
(1 + Re (ω −j xk ))/m = 1 + Re 
ω −j  xk  = 1.
m
j=0
j=0
With regard to this second randomization process (from x to z), we have the following general result.
Lemma 2.1 For k 6= l, it holds that
E[zk z̄l ] =
(
E [Re xk Re x̄l ] ,
1
4 E[xk x̄l ],
for m = 2,
for m ≥ 3.
The case m = 2 is easy to see, and is actually used in Charikar and Wirth [6]. For the case m ≥ 3, the
proof of Lemma 2.1 involves some tedious calculations, and we postpone the proof to the appendix.
We remark that there is an immediate consequence of Lemma 2.1 regarding the relationship between
the optimal Max-2-Cut value and the optimal Max-3-Cut value for the same weighted graph. To be
specific, consider a weighted graph (undirected) with n nodes, and the weight on the edge (k, l) is
wkl (k =
6 l). Let Q be the Laplacian matrix of a weighted graph, i.e., Qkl = −wkl for k 6= l, and
P
2π
4π
Qkk = l6=k wkl , k = 1, ..., n. Let xk ∈ {1, −1}, k = 1, ..., n, and zk ∈ {1, ei 3 , ei 3 }. Let X = xxT
and Z = zz H . It is easy to verify that the corresponding 2-Cut value associated with x is 14 Q • X, and
the corresponding 3-Cut value associated with z is 13 Q • Z. Let us denote the sum of all weights be
P
W ∗ := k<l wkl . Now let x ∈ {1, −1}n be corresponding to the optimal Max-2-Cut solution. Based
2π
4π
on x we again generate z ∈ {1, ei 3 , ei 3 } as described in the above procedure. Then we have
v(M3C) ≥
=
=
1
E[Q • Z]
3Ã
!
n
X
1 X
Qkk + 2
Qkl Re E [zk z̄l ]
3
k=1
k<l
Ã
!
X
1
1
2W ∗ +
Qkl xk xl
3
2
k<l
=
1 ∗ 1
W + v(M2C),
2
3
5
(3)
where v(M3C) is the optimal Max-3-Cut value and v(M2C) is the optimal Max-2-Cut value. Note
that in relation (3), no assumption is made regarding the signs of the weights.
In the construction of vector x, its components differ from that of ξ/T only when |ξk | > T . Our next
target is to show that the difference is small if T is large.
For k 6= l, define ∆kl = ξk ξ¯l /T 2 − xk x̄l .
2
Lemma 2.2 For k 6= l and T > 1 it holds that E[|∆kl |] < e−T ( T5 + 4).
Proof. Let us divide the complex space C2 (with overlaps) into
A = {(ξk , ξl ) : |ξk | ≤ T, |ξl | ≤ T }, B = {(ξk , ξl ) : |ξl | > T } and C = {(ξk , ξl ) : |ξk | > T }.
Suppose, without loss of generality due to symmetricity, that EC [|∆kl |] ≤ EB [|∆kl |]. We have
E[|∆kl |] ≤ EA [|∆kl |] + EB [|∆kl |] + EC [|∆kl |] ≤ EA [|∆kl |] + 2EB [|∆kl |].
(4)
Obviously, EA [|∆kl |] = 0, and we need only to treat the term EB [|∆kl |].
∗ = γeiα . Since
Suppose that E[ξk ξ¯l ] = Zkl
à Ã
Ã
!
!!
1
γeiα
ξk
∈ Nc 0,
,
ξl
γe−iα
1
we can express (ξk , ξl ) in terms of (η, λ) as
Ã
where
η
λ
!
p
1 − γ 2 λ, ξl = η,
ξk = γeiα η +
∈ Nc (0, I2 ). Thus, we have
Z
Prob {|ξl | > T } = Prob {|η| > T } =
∞ Z 2π
T
0
r −r2
e dθdr = 2
π
Z
∞
2
2
re−r dr = e−T .
T
Moreover,
2
EB [|xk x̄l |] ≤ EB [1] = Prob {(ξk , ξl ) ∈ B} = Prob {|η| > T } = e−T .
On the other hand, since γ ≤ 1, we have
EB [|ξk ξl |] = EB [γeiα η η̄ +
p
1 − γ 2 λη̄]
≤ EB [|η|2 + |η| · |λ|]
Z ∞ Z 2π 3
Z ∞ Z 2π 2
Z ∞ Z 2π 2
r −r2
r
s −s2
=
e dθdr +
dθ1 dr
e dθ2 ds
π
π
π
T
0
T
0
0
0
√
√
π
π
2
T + 1)e−T + (1 − Φ( 2 T )),
= (T 2 +
2
2
6
where in the last equality we use the facts that
√
Z ∞
Z ∞
√
1 2
T −T 2
π
3 −r2
−T 2
2 −r2
r e dr = (T + 1)e
, and
r e dr = e
+
(1 − Φ( 2 T )),
2
2
2
T
T
and
Z
4
∞
2
s2 e−s ds =
√
π,
0
where Φ(·) is the cumulative distribution function of the real-valued standard normal distribution
N (0, 1).
Therefore,
EB [|∆kl |] ≤ EB [|ξk ξ¯l |/T 2 + |xk x̄l |]
√
√
π
π
−T 2 1
+ 2) +
(1
−
Φ(
2 T )),
≤ e
( 2+
T
2T
2T 2
and so from (4) we have
E[|∆kl |] ≤ e
−T 2
√
√
π
π
2
+ 4) + 2 (1 − Φ( 2 T )).
( 2+
T
T
T
(5)
Further notice that
√ Z ∞
√ Z ∞
√
√
π
π
π −T 2
π
−s2
−s2
e ds ≤ 2
se ds =
(1 − Φ( 2 T )) = 2
e
,
2
T
T
T
2T 2
T
T
and so it follows from (5) that
E[|∆kl |] ≤ e
−T 2
√
√
4+ π
π
2 5
(
+ 4) < e−T ( + 4).
+
2
2T
T
T
Note that in the above derivations we used the fact that T > 1.
Lemma 4 of [6] asserts that
v(DRQP) ≥
1X
|akl |,
n
k,l
where v(DRQP) is the optimal value of
(DRQP) max xT Ax
s.t. xk ∈ {±1}, k = 1, . . . , n,
where A is a real matrix with diag(A) = 0.
Since (SDP) is a relaxation of the following continuous complex quadratic programming (CQP)
(CQP) max z H Qz
s.t. |zk | = 1, k = 1, . . . , n,
7
¤
(6)
it follows that the optimal value of (SDP), v(SDP), is no less than the optimal value of (CQP),
v(CQP). By denoting Re Q = A, Im Q = B, Re z = x and Im z = y, we may write (CQP) as an
equivalent real quadratic program (CRQP):
Ã
!Ã !
A −B
x
T
T
(CRQP) max (x , y )
B A
y
2
2
s.t. xk + yk = 1, k = 1, . . . , n.
That is, v(CQP) = v(CRQP). Obviously, the optimal value of (CRQP), v(CRQP), is no less than
v(SRQP), the optimal value of
Ã
!Ã !
A −B
x
T
T
(SRQP) max (x , y )
B A
y
2
2
s.t. xk = yk = 1/2, k = 1, . . . , n,
which is in fact equivalent to the following problem, by substitution of variables u :=
√
v := 2 y,
Ã
!Ã !
A
−B
u
max 12 (uT , v T )
B A
v
2
2
s.t. uk = vk = 1, k = 1, . . . , n.
√
2 x and
Hence, using (6) one obtains
v(SRQP) ≥
1 X
1 X
(|Akl | + |Bkl |) ≥
|Qkl |.
2n
2n
k,l
k,l
Finally, by noting v(SDP) ≥ v(CQP) = v(CRQP) ≥ v(SRQP), we arrive at the following result:
Lemma 2.3 It holds that v(SDP) ≥
1
2n
P
k,l
|Qkl |.
Theorem 2.4 Suppose that n ≥ 3. By setting T =
8
√
9 log n, we have E[z H Qz] ≥
1
40 log n v(SDP).
Proof. It follows that
E[z H Qz] =
X
Qlk E[zk z̄l ]
k6=l
=
1X
Qlk E[xk x̄l ]
4
(7)
k6=l
=
1X
Qlk E[−∆kl + ξk ξ¯l /T 2 ]
4
k6=l
=
=
≥
≥
X
1 X
(
Qlk E[ξk ξ¯l /T 2 ] −
Qlk E[∆kl ])
4
k6=l
k6=l


X
1
v(SDP)/T 2 −
Qlk E[∆kl ]
4
k6=l


X
1
|Qlk | × |E[∆kl ]|
v(SDP)/T 2 −
4
k6=l
µ
¶
1
2
−T 2 5
v(SDP)/T − e
( + 4) × 2n × v(SDP)
4
T
(8)
2
1 − e−T (4T 2 + 5T )2n
=
v(SDP),
4T 2
where, equality (7) is due to Lemma 2.1, and inequality (8) is due to Lemmas 2.2 and 2.3.
If T =
√
9 log n then
2
1−e−T (4T 2 +5T )2n
T2
>
1
10 log n ,
E[z H Qz] ≥
whenever n ≥ 3. Thus we have
1
v(SDP ).
40 log n
¤
Theorem 2.4 implies that the approximation ratio of our randomized algorithm for (DQP) is at least
Ω(1/ log n), extending the approximation result of Charikar and Wirth [6] to the complex discrete
quadratic maximization model (DQP).
In the next two subsections, we shall see that the result can be further extended.
2.1
Constrained squared modulus
Consider the following constrained model
(DQPC) max z H Qz
1
s.t. Arg zk ∈ {0, m
2π, ..., m−1
m 2π}, k = 1, . . . , n,
2
2
T
(|z1 | , . . . , |zn | ) ∈ F,
9
where Q ∈ Hn with diag(Q) = 0, and F is a closed convex set in <n . In particular, if F is the
singleton e (the all-one vector), then (DQPC) reduces to (DQP).
Consider the following convex SDP relaxation for (DQPC)
(CSDP) max Q • Z
s.t. diag(Z) ∈ F,
Z º 0.
For 0 ≤ d ∈ <n , define its generalized inverse vector d−1 to be
(
1/dk , if dk > 0;
−1
(d )k =
1,
if dk = 0,
where k = 1, . . . , n. Thus the diagonal matrix D−1 := Diag(d−1 ) Â 0. Thus,
(
1, if dk > 0;
(D−1 D)kk =
0, if dk = 0.
p
For a Hermitian positive semidefinite matrix Z, let d := diag(Z) and D := Diag(d), and
Z̃ := Z + (I − D−1 D).
(9)
Then Z̃kl = Zkl for any k 6= l, Z̃kk = Zkk if Zkk > 0, and Z̃kk = 1 if Zkk = 0.
n , and Z̃ be defined as in (9). Then the matrix D −1 Z̃D −1 has the properties:
Lemma 2.5 Let Z ∈ H+
(i) D−1 Z̃D−1 º 0;
(ii) diag(D−1 Z̃D−1 ) is the n-dimensional all-one vector;
(iii) DD−1 Z̃D−1 D = Z.
Proof. The first two assertions follow from the definition, and the last assertion can be seen from the
fact that
DD−1 Z̃D−1 D = DD−1 (Z + (I − D−1 D))D−1 D = DD−1 ZD−1 D = Z.
This completes the proof of (iii).
¤
Now we consider the following randomization method for solving (DQPC).
p
Let Z ∗ be an optimal solution of (CSDP). Let d := diag(Z ∗ ), D := Diag(d), D−1 := Diag(d−1 ),
and Z̃ ∗ is defined according to (9).
10
Draw a random complex vector ξ ∈ Nc (0, D−1 Z̃ ∗ D−1 ), and set x according to (1), and generate z
according to (2). Finally, set z := Dz.
Next we shall analyze the performance of this algorithm. Consider the following complex quadratic
program
max DQD • zz H
s.t. zk ∈ {1, ω, . . . , ω m−1 }, k = 1, . . . , n,
and its SDP relaxation
(S) max DQD • Z
s.t. Zkk = 1, k = 1, . . . , n,
Z º 0.
p ∗ ∗
1 P
By Lemma 2.3 we have v(S) ≥ 2n
k6=l |Qkl | Zkk Zll .
Let Y ∗ be optimal for (S). Since DY ∗ D is feasible for (CSDP),
v(CSDP) ≥ Q • DY ∗ D = DQD • Y ∗ = v(S) ≥
p ∗ ∗
1 X
|Qkl | Zkk
Zll .
2n
k6=l
That is,
−
X
p ∗ ∗
|Qkl | Zkk
Zll ≥ −2n × v(CSDP).
k6=l
Therefore,
E[Q • zz H ] =
p ∗ ∗
1X
Qkl Zkk
Zll E[xk x̄l ]
4
k6=l
=
≥
≥
≥
p ∗ ∗
1X
Qkl Zkk
Zll (E[ξk ξ¯l ]/T 2 − E[∆kl ])
4
k6=l


−1
∗
−1
X
p ∗ ∗
1  DQD • D Z̃ D
Qkl Zkk
−
Zll E[|∆kl |]
2
4
T
k6=l


−1
∗
−1
X
p
1  DQD • D Z̃ D
2 5
∗ Z∗
|Qkl | Zkk
− e−T ( + 4)
ll
2
4
T
T
k6=l
µ
¶
1 v(CSDP)
−T 2 5
−e
( + 4)2n × v(CSDP)
4
T2
T
2
1 − e−T (4T 2 + 5T )2n
v(CSDP)
=
4T 2
1
≥
v(CSDP),
40 log n
√
where in the last step we take T = 9 log n and n ≥ 3. This concludes the theorem below.
11
Theorem 2.6 Let v(DQPC) be the optimal value of (DQPC) and v(CSDP) be the optimal value of
1
(CSDP). Then v(DQPC) ≥ 40 log
n v(CSDP) for n ≥ 3. This implies the approximation ratio for the
algorithm of (DQPC) is at least Ω(1/ log n).
2.2
Continuous indefinite complex quadratic maximization
Consider the following model
(CQP) max z H Qz
s.t. |zk | = 1, k = 1, . . . , n,
where Q ∈ Hn and diag(Q) = 0. We use (SDP) a relaxation for the above problem (CQP). Suppose
that Z ∗ is an optimal solution of (SDP).
Draw a random complex vector ξ ∈ Nc (0, Z ∗ ), and set x according to (1), and then independently
generate zk ∈ Cn as follows:
(
eiArg xk ,
with probability (1 + |xk |)/2,
zk =
i
Arg
x
k
−e
, with probability (1 − |x |)/2,
k
for k = 1, . . . , n.
In that case, we have E[zk z̄l ] = E[xk x̄l ] (see also Charikar and Wirth [6]). This leads to
X
X
Qkl E[xk x̄l ]
Qkl E[zk z̄l ] =
E[z H Qz] =
k6=l
k6=l
≥
1−e
−T 2
(4T 2 + 5T )2n
1
v(SDP),
v(SDP) >
T2
10 log n
√
if T is taken to be 9 log n (with n ≥ 3). Hence the approximation ratio of the above approximation
algorithm is at least Ω(1/ log n). So, Zhang, and Ye [12] obtained the same result for this model.
However, the basic techniques used in [12] are quite different from ours.
3
Discrete complex bilinear maximization
Alon and Naor [1] studied the problem of approximating the cut-norm ||A||C of a real matrix A ∈
<p×q . They in fact worked on approximation method for computing the norm ||A||∞7→1 , which turned
out to provide a scheme to approximate the cut-norm of A. By definition, the value of ||A||∞7→1 is
given by the following bilinear binary optimization problem
max y T Az
s.t. yk , zl ∈ {±1}, k = 1, . . . , p, l = 1, . . . , q,
12
which can be equivalently cast as a binary quadratic maximization problem
Ã
!Ã !
0
A
y
max 21 (y T , z T )
T
A
0
z
s.t. yk , zl ∈ {±1}, k = 1, . . . , p, l = 1, . . . , q.
At this point, we remark that computing ||A||∞7→1 is clearly a special case of Charikar and Wirth [6]’s
model (DRQP). Based on an identity due to Grothendieck, Alon and Naor [1] managed to derive a
0.56-approximation algorithm for computing ||A||∞7→1 .
In this section, we shall consider the discrete complex bilinear maximization problem
max Re y H Qz
s.t. yk , zl ∈ {1, ω, . . . , ω m−1 }, k = 1, . . . , p, l = 1, . . . , q,
or, equivalently, the following complex discrete quadratic program (DBLP):
Ã
!Ã !
0 Q
y
1 H H
(DBLP) max 2 (y , z )
H
Q
0
z
m−1
s.t. yk , zl ∈ {1, ω, . . . , ω
}, k = 1, . . . , p, l = 1, . . . , q,
where Q is an p × q complex matrix, and ω = cos(2π/m) + i sin(2π/m), m ≥ 3. We also consider the
following continuous version of (DBLP),
Ã
!Ã !
0 Q
y
1 H H
(CBLP) max 2 (y , z )
H
Q
0
z
s.t. |yk | = |zl | = 1, k = 1, . . . , p, l = 1, . . . , q.
Clearly, the objective matrix of (DBLP) is Hermitian with zero diagonals. As its real counter-part,
(DBLP) is a subclass of (DQP) and (CBLP) is a subclass of (CQP). The SDP relaxation for (DBLP)
and (CBLP) is
Ã
!
0
Q
(SDBLP) max 12
•W
QH 0
s.t. Wkk = 1, k = 1, . . . , p + q,
W º 0.
Lemma 2.3 asserts that if Q 6= 0 then v(SDBLP) > 0. Also, we note that in the summation form, the
P
P
objective function of (SDBLP) is pk=1 ql=1 Re (Q̄kl Wk,p+l ), and the objective function of (DBLP)
P
P
and (CBLP) is pk=1 ql=1 Re (Q̄kl yk z̄l ).
13
3.1
An approximation algorithm for the discrete complex bilinear program
Let W ∗ be an optimal solution for (SDBLP). We draw a random complex vector as follows:
à !
ξ
∈ Nc (0, W ∗ ),
η
and generate complex vectors y ∈ Cp and z ∈ Cq as follows:
j
For k = 1, ..., p, assign yk := ω j if Arg ξk ∈ [ m
2π, j+1
m 2π) with j ∈ {0, 1, . . . , m − 1};
and
j
For l = 1, ..., q, assign zl := ω j if Arg ηl ∈ [ m
2π, j+1
m 2π) with j ∈ {0, 1, . . . , m − 1}.
In Zhang and Huang [16] we have shown that
E[yk z̄l ] =
m−1
m(2 − ω − ω −1 ) X j
∗
∗
ω (arccos(−Re (ω −j Wk,p+l
)))2 =: Fm (Wk,p+l
), ∀k, l,
8π 2
j=0
and Fm (z) = z for z ∈ {1, ω, . . . , ω m−1 }. Furthermore, in Appendix of [16] we established that
Fm (z) =
with
m2 (1 − cos 2π
m)
z + φ1 (z) + φ2 (z),
8π
∞
2r+1
X
X
m2 (1 − cos 2π
m)
φ1 (z) =
ar
b2r+2−2i z i (z̄)2r+1−i ,
4π
r=1
and
φ2 (z) =
i=0
∞
2s+2t+2
X
X
m2 (1 − cos 2π
m)
a
a
b2s+2t+3−2i z i (z̄)2s+2t+2−i ,
s t
2
4π
s=0,t=0
i=0
where
(2r)!
ar = 4r+1
, bk+1−2i =
2
(r!)2 (2r + 1)
Note that
Ã
k
i
! m−1
X
ei( m ji) .
2π
j=0
Pm−1 i( 2π ji)
m
is either 0 or m.
j=0 e
Let φ(z) := φ1 (z) + φ2 (z).
If Z º 0 then Z̄ º 0. Moreover, the Hadamard product of Hermitian positive semidefinite matrices
remain positive semidefinite. This implies that if Z º 0 then φ(Z) º 0, where φ(Z) := (φ(Zkl ))n×n .
m2 (1−cos
2π
)
m2 (1−cos
2π
)
m
m
On the other hand, since Fm (1) = 1 we have 1 =
+ φ(1). Let βm :=
. We
8π
8π
∗
∗
conclude that (φ(W ))kk /(1 − βm ) = 1, for k = 1, ..., p + q, and so, φ(W )/(1 − βm ) is itself a feasible
14
solution for (SDBLP). Now observe that for any feasible solution of (SDBLP), say W , it necessarily
follows that
Ã
!
0 Q
1
−v(SDBLP) ≤
• W ≤ v(SDBLP).
(10)
2
QH 0
The second inequality is obvious, by definition of the feasibility. To argue that the first inequality
also holds, we consider a decomposition of W º 0, i.e.,
Ã
!
UH
W =
· (U, V ),
VH
where the number of rows in U H is p, and the number of rows in V H is q. Let us now consider another
solution,
Ã
!
UH
W̃ :=
· (U, −V ) º 0.
−V H
Since the diagonal of W̃ remains the all-one vector, it is also a feasible solution for (SDBLP). Therefore
Ã
!
Ã
!
0 Q
0 Q
1
1
v(SDBLP) ≥
• W̃ = −
• W,
2
2
QH 0
QH 0
and so the first inequality in (10) follows. Therefore,
Ã
!
0 Q
1
• φ(W ∗ )/(1 − βm ) ≥ −v(SDBLP).
2
QH 0
Now we are in a position to calculate the expected value of the randomized solution
" p q
#
p X
q
XX
X
∗
E
Re (Q̄kl yk z̄l ) =
Re (Q̄kl Fm (Wk,p+l
))
k=1 l=1
k=1 l=1
=
p X
q
X
∗
∗
Re (Q̄kl (βm Wk,p+l
+ φ(Wk,p+l
)))
k=1 l=1
1
= βm × v(SDBLP) + (1 − βm ) ×
2
Ã
0 Q
QH 0
!
• φ(W ∗ )/(1 − βm )
≥ βm × v(SDBLP) + (βm − 1) × v(SDBLP)
= (2βm − 1) × v(SDBLP)
m2 (1 − cos 2π
m)
= (
− 1) × v(SDBLP).
4π
This leads to the following result.
m2 (1−cos
2π
)
m
Theorem 3.1 There is an approximation algorithm for (DBLP) with the ratio αm :=
−1
4π
for m ≥ 3. In particular, α3 ≥ 0.0742, α4 ≥ 0.2732, α5 ≥ 0.3746, α10 ≥ 0.5198, and α100 ≥ 0.5702.
15
3.2
An improved approximation algorithm for the continuous bilinear problem
Since (CBLP) is the limit of (DBLP) with m → ∞, it is clear that if we let yk = eiArg ξk and
zl = eiArg ηl where (ξ H , η H )H is generated from Nc (0, W ∗ ) and W ∗ is an optimal solution of (SDBLP),
then we will get an approximation algorithm with an approximation ratio of limm→∞
π
2 − 1 ≈ 0.5708.
m2 (1−cos
4π
2π
)
m
−1 =
In this subsection, we shall show that this ratio can be improved if we further exploit particular
structures of Problem (CBLP). Our analysis below makes use of some of the results in our previous
paper [16], and also important insights presented in a paper by Haagerup, [9].
According to the analysis in Sections 3.1 and 3.3 of Zhang and Huang [16], if we generate yk = eiArg ξk
and zl = eiArg ηl with (ξ H , η H )H , then we have
∗
∗
lim Fm (Wk,p+l
) (=: F (Wk,p+l
))
Z 2π
1
eiθ (arccos(−γ cos(θ − α)))2 dθ
=
4π 0
Z π/2
i
α
= e
arcsin(γ sin θ) sin θdθ
E[yk z̄l ] =
m→∞
0
∞
=
π iα X
e
cr γ 2r+1 ,
4
r=0
∗
where we denoted Wk,p+l
as γeiα and cr =
Z
((2r)!)2
.
24r (r!)2 (r+1)
Ã
π/2
arcsin(γ sin θ) sin θdθ
ψ(γ) :=
For γ ∈ [−1, 1], let
Z
π/2
p
=γ
0
0
(cos θ)2
1 − (γ sin θ)2
!
dθ .
In that notation, the transformation function F can be rewritten as
F (z) = eiArg z ψ(|z|).
We remark here that equation (3.9) in Zhang and Huang [16] coincides with Lemma 3.2 in Haagerup [9],
although we were not aware of [9] at the time when we derived that equation.
Important properties of the function ψ(γ) are discussed in [9]. In particular, Theorem 2.1 of [9] states
that the inverse function ψ −1 : [−1, 1] → [−1, 1] of ψ exists and it can be expanded into an absolutely
convergent power series:
∞
X
b2r+1 s2r+1 , s ∈ [−1, 1],
ψ −1 (s) =
r=0
4
π
with b1 = and b2r+1 ≤ 0 for all r ≥ 1. Specifically, b3 = −8/π 3 , b5 = 0, b7 = −16/π 7 , b9 = −80/π 9 ,
b11 = −480/π 11 , b13 = −3136/π 13 and b2r+1 ∼ −4/((2r + 1) log(2r + 1))2 for r → ∞. Moreover, the
following result was shown in [9], which was used to bound the complex Grothendieck constant.
16
Lemma 3.2 There is a unique β ∈ (0, 1) for which
with β ≈ 0.7118.
P∞
r=0 |b2r+1 |β
2r+1
= 1 (i.e., ψ −1 (β) = π8 β − 1),
Now the inverse function of F (z) can be written as
F −1 (z) = eiArg z ψ −1 (|z|) =
∞
X
b2r+1 z r+1 z̄ r .
r=0
p+q
with all-one diagonal elements, let us construct another Hermitian matrix
For a given W ∈ H+
p+q
G(W ) ∈ H
as follows:
∞
Gk,p+l (W ) :=
Gk1 ,k2 (W ) :=
X
4
βWk,p+l −
|b2r+1 |β 2r+1 (Wk,p+l )r+1 (W̄k,p+l )r , k = 1, ..., p, l = 1, ..., q,
π
4
βWk1 ,k2 +
π
r=1
∞
X
|b2r+1 |β 2r+1 (Wk1 ,k2 )r+1 (W̄k1 ,k2 )r , k1 , k2 = 1, ..., p,
r=1
∞
Gp+l1 ,p+l2 (W ) :=
X
4
βWp+l1 ,p+l2 +
|b2r+1 |β 2r+1 (Wp+l1 ,p+l2 )r+1 (W̄p+l1 ,p+l2 )r , l1 , l2 = 1, ..., q.
π
r=1
That is,
Gk,p+l (W ) = F −1 (βWk,p+l ), k = 1, ..., p, l = 1, ..., q,
8
Gk1 ,k2 (W ) =
βWk1 ,k2 − F −1 (βWk1 ,k2 ), k1 , k2 = 1, ..., p,
π
8
βWp+l1 ,p+l2 − F −1 (βWp+l1 ,p+l2 ), l1 , l2 = 1, ..., q.
Gp+l1 ,p+l2 (W ) =
π
By the choice of β (Lemma 3.2), we see that if W has all-one diagonal elements, then so is true for the
T T T
T
Hermitian matrix G(W ). Denote E := (eT
p , −eq ) (ep , −eq ) (º 0), where ep and eq are the all-one
vectors in <p and <q respectively. We can now write G(W ) in a uniform fashion as follows
∞
G(W ) =
X
4
βW +
|b2r+1 |β 2r+1 E ◦ (W )(r+1) ◦ (W T )(r) .
π
r=1
where ‘A ◦ B’ stands for the Hadamard product between A and B, and A(r) is the rth power in the
r
}|
{
z
(r)
Hadamard sense, i.e., A = A ◦ A ◦ · · · ◦ A. If W º 0, then, by the fact that the Hadamard product
of positive semidefinite matrices remains positive semidefinite, we have G(W ) º 0. As a remark, we
note here that combining this and the previous observation (regarding the diagonals of G(W )) leads
to the conclusion that if W is a feasible solution for (SDBLP) then so is true for G(W ).
17
Suppose that W ∗ is an optimal solution of (SDBLP). Let us take yk = eiArg ξk and zl = eiArg ηl with
(ξ H , η H )H randomly generated from Nc (0, G(W ∗ )). In that case, the expected objective value is


X
X
E
Re (Q̄kl yk z̄l ) =
Re (Q̄kl E[yk z̄l ])
k,l
k,l
=
X
∗
Re (Q̄kl F (G(Wk,p+l
)))
k,l
=
X
∗
Re (Q̄kl F (F −1 (βWk,p+l
)))
k,l
=
X
∗
Re (Q̄kl βWk,p+l
)
k,l
= β
X
∗
Re (Q̄kl Wk,p+l
)
k,l
= β × v(SDBLP)
≈ 0.7118 × v(SDBLP).
This proves the following theorem.
Theorem 3.3 The above randomized algorithm has an approximation ratio 0.7118 for the continuous
complex bilinear maximization problem (CBLP).
A
Proof of Lemma 2.1
We only consider the case m ≥ 3 here. Since for the random variables z and x it holds that
E[zk z̄l ] = E[E[zk z̄l | (xk , xl )]], for k 6= l,
we shall first compute E[zk z̄l | (xk = x0k , xl = x0l )]. For simplicity, we drop the superscript naughts
of x0k and x0l , and denote the expectation E[zk z̄l | (xk = x0k , xl = x0l )] simply by E[zk z̄l | (xk , xl )], and
Prob {zk = ω j , zl = ω j−i | (xk = x0k , xl = x0l )} by Prob {zk = ω j , zl = ω j−i | (xk , xl )}. We have
E[zk z̄l | (xk , xl )] = 1 ×
m−1
X
Prob {zk = ω j , zl = ω j | (xk , xl )} + · · ·
j=0
i
+ω ×
m−1
X
Prob {zk = ω j , zl = ω j−i | (xk , xl )} + · · ·
j=0
+ω m−1 ×
m−1
X
Prob {zk = ω j , zl = ω j−m+1 | (xk , xl )}.
j=0
18
Obviously,
m−1
X
Prob {zk = ω j , zl = ω j−i | (xk , xl )}
j=0
=
m−1
X
j=0
1 + Re (ω −j xk ) 1 + Re (ω −j+i xl )
×
m
m
m−1
1
1 X
Re (ω −j xk )Re (ω −j+i xl ).
+
m m2
=
j=0
Thus we further have
E[zk z̄l | (xk , xl )] =
m−1
m−1
m−1
1 X iX
1 X j
ω + 2
ω
Re (ω −j xk )Re (ω −j+i xl )
m
m
j=0
=
i=0
j=0
m−1
m−1
X
1 X
−j
Re
(ω
x
)
ω i Re (ω −j+i xl ).
k
2
m
j=0
i=0
To simplify the expression, let us write xk = a + ib and xl = c + id. Then we have
m−1
X
Re (ω −j xk )
j=0
m−1
X
ω i Re (ω −j+i xl )
i=0
"
m−1
X
"
#
#
m−1
m−1
m−1
X
X
X
j
j
i
j−i
i
j−i
cos 2π
= ac
sin 2π
cos 2π cos
2π + bd
cos 2π sin
2π
m
m
m
m
m
m
j=0
i=0
j=0
i=0
"
#
#
"
m−1
m−1
m−1
m−1
X
X
X
X
j
i
j−i
i
j−i
j
sin 2π
cos 2π sin
2π + bc
cos 2π cos
2π
cos 2π
+ad
m
m
m
m
m
m
i=0
j=0
i=0
j=0

"
"
#
#
m−1
m−1
m−1
m−1
X
X
X
X
j
j
i
j
−
i
i
j
−
i
+i ad
cos 2π
sin 2π sin
2π + bc
sin 2π
sin 2π cos
2π
m
m
m
m
m
m
j=0
i=0
j=0
i=0
"
#
"
#
m−1
m−1
m−1
m−1
X
X
X
X
j
j
i
j−i
i
j−i
cos 2π
sin 2π
+ac
sin 2π cos
2π − bd
sin 2π sin
2π  .
m
m
m
m
m
m
j=0
Observing
Pm−1
j=0
i=0
j=0
i=0
ω 2j = 0 (note here that m ≥ 3), one has
m−1
X
j=0
m−1
X
2j
2j
sin( 2π) = 0.
cos( 2π) =
m
m
j=0
This in turn leads to
m−1
X
j=0
sin
m−1
m−1
X
X
j
j
j
j
(sin 2π)2 =
(cos 2π)2 = m/2.
2π cos 2π = 0, and
m
m
m
m
j=0
19
j=0
It follows that
#
m−1
X
j
i
j−i
cos 2π cos
2π
cos 2π
m
m
m
j=0
i=0
"
¶#
m−1
m−1
X
Xµ
j
j
i
j
i
i
=
cos 2π
cos 2π(cos 2π)2 + sin 2π sin 2π cos 2π
m
m
m
m
m
m
i=0
j=0
"
#
m−1
m−1
m−1
X
X
X
j
i
j
j
i
i
=
(cos 2π)2
(cos 2π)2 + sin 2π cos 2π
sin 2π cos 2π
m
m
m
m
m
m
j=0
i=0
i=0
#
"
m−1
m−1
X
X
j
i
=
(cos 2π)2
(cos 2π)2
m
m
m−1
X
=
j=0
m2
4
"
i=0
.
In a similar way, one further calculates that
"
# m−1 "
#
m−1
m−1
m−1
X
X
X
X
j
j
i
j−i
i
j−i
cos 2π
sin 2π
cos 2π cos
2π =
cos 2π sin
2π
m
m
m
m
m
m
j=0
i=0
j=0
i=0
# m−1 "
#
"
m−1
m−1
m−1
X
X
X
X
j
i
j−i
i
j−i
j
sin 2π
sin 2π sin
2π =
sin 2π cos
2π
cos 2π
= −
m
m
m
m
m
m
=
j=0
2
m
i=0
j=0
i=0
4
and
# m−1 "
#
m−1
m−1
X
X
X
j
j
i
j−i
i
j−i
cos 2π
sin 2π
cos 2π sin
2π =
cos 2π cos
2π
m
m
m
m
m
m
j=0
i=0
j=0
i=0
"
# m−1 "
#
m−1
m−1
m−1
X
X
X
X
j
i
j−i
j
i
j−i
=
cos 2π
sin 2π cos
2π =
sin 2π
sin 2π sin
2π
m
m
m
m
m
m
m−1
X
j=0
"
i=0
j=0
i=0
= 0.
Hence
m−1
X
=
=
=
(Re (ω j xk )
m−1
X
j=0
m2
ω i Re (ω j−i xl ))
i=0
(ac + bd + i(−ad + bc))
4
m2
(a + ib)(c − id)
4
m2
xk x̄l .
4
20
This means that
E [zk z̄l | (xk , xl )] =
m2
1
1
×
xk x̄l = xk x̄l ,
2
m
4
4
£
¤
i.e., E zk z̄l | (xk = x0k , xl = x0l ) = 41 x0k x0l . Therefore
1
E [zk z̄l ] = E [E [zk z̄l | (xk , xl )]] = E [xk x̄l ] .
4
¤
References
[1] N. Alon and A. Naor, Approximating the Cut-Norm via Grothendieck’s inequality, Proceedings
of the Thirty-sixth Annual ACM Symposium on Theory of Computing, Chicago, IL, USA, 2004,
pp. 72 – 80.
[2] H.H. Andersen, M. Højbjerre, D. Sørensen, and P.S. Eriksen, Linear and graphical models for
the multivariate complex normal distribution, Lecture Notes in Statistics 101, Springer-Verlag,
New York, 1995.
[3] A. Ben-Tal and A. Nemirovski, On tractable approximations of uncertain linear matrix inequalities affected by interval uncertainty, SIAM Journal on Optimization, 12 (2002), pp. 811 – 833.
[4] A. Ben-Tal, A. Nemirovski and C. Roos, Extended matrix cube theorems with applications to
µ-theory in control, Mathematics of Operations Research, 28 (2003), pp. 497 – 523.
[5] D. Bertsimas and Y. Ye, Semidefinite relaxations, multivarite normal distribution, and order
statistics, In Handbook of Combinatorial Optimization, vol. 3, D.Z. Du and P.M. Pardalos, eds.,
Kluwer Academic Publishers, 1998, pp. 1 – 19.
[6] M. Charikar and A. Wirth, Maximizing quadratic programs: extending Grothendieck’s inequality,
Proc. 45th FOCS, 2004.
[7] M.X. Goemans and D.P. Williamson, Improved approximation algorithms for maximum cut and
satisfiability problems using semidefinite programming, Journal of the ACM, 42 (1995), pp. 1115
– 1145.
[8] M.X. Goemans and D.P. Williamson, Approximation algorithms for MAX-3-CUT and other
problems via complex semidefinite programming, Journal of Computer and System Sciences, 68
(2004), pp. 442 – 470.
[9] U. Haagerup, A new upper bound for the complex Grothendieck constant, Israel Journal of Mathematics, 60 (1987), pp. 199 – 224.
21
[10] Z.Q. Luo, X.D. Luo, and M. Kisialiou, An efficient quasi-maximum likelihood decoder for PSK
signals, in Proceedings IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’03), 2003, pp. 561 – 564 (vol. 6).
[11] Yu. Nesterov, Semidefinite relaxation and nonconvex quadratic optimization, Optimization Methods and Softwares, 9 (1998), pp. 141 – 160.
[12] A. So, J. Zhang, and Y. Ye, On approximating complex quadratic optimization problems via
semidefinite programming relaxations, in Proceedings of the 11th Conference on Integer Programming and Combinatorial Optimization, Lecture Notes in Computer Science, vol. 3509, M.
Jünger and V. Kaibel, eds., Springer-Verlag, Berlin, 2005, pp. 125 – 135.
[13] J.F. Sturm, Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones,
Optimization Methods and Software, 11-12 (1999), pp. 625 – 653.
[14] Y. Ye, Approximating quadratic programming with bound and quadratic constraints, Mathematical Programming, 84 (1999), pp. 219 – 226.
[15] S. Zhang, Quadratic maximization and semidefinite relaxation, Mathematical Programming, 87
(2000), pp. 453 – 465.
[16] S. Zhang and Y. Huang, Complex quadratic opitmization and semidefinite programming, Technical Report SEEM2004-03, Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, 2004, (accessible at
http://www.se.cuhk.edu.hk/∼zhang/#workingpaper). Accepted for publication in SIAM Journal on Optimization.
22
Download