Approximation Algorithms for Indefinite Complex Quadratic Maximization Problems Yongwei Huang ∗ Shuzhong Zhang † July 2005 Abstract In this paper we consider the following two types of complex quadratic maximization problems: (i) maximize z H Qz, subject to zkm = 1, k = 1, ..., n, where Q is a Hermitian matrix with tr Q = 0 and z ∈ Cn is the decision vector; (ii) maximize Re y H Az, subject to ykm = 1, k = 1, ..., p, and zlm = 1, l = 1, ..., q, where A ∈ Cp×q and y ∈ Cp and z ∈ Cq are the decision vectors. In the real cases (namely m = 2 and the matrices are all real-valued), such problems were recently considered by Charikar and Wirth [6] and Alon and Naor [1] respectively. In particular, Charikar and Wirth [6] presented an Ω(1/ log n) approximation algorithm for Problem (i) in the real case, and Alon and Naor [1] presented a 0.56-approximation algorithm for Problem (ii) in the real case. In this paper, we propose Ω(1/ log n) ´approximation algorithm for the general version of ³ an m2 (1−cos 2π m) complex Problem (i), and a − 1 -approximation algorithm for the general version 4π of complex Problem (ii). For the limit and continuous version of complex Problem (ii) (m → ∞), we further show that a 0.7118 approximation ratio can be achieved. Keywords: indefinite Hermitian matrix, randomized algorithms, approximation ratio, semidefinite programming relaxation. MSC subject classification: 90C20, 90C22. ∗ Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, Hong Kong (ywhuang@se.cuhk.edu.hk). † Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, Hong Kong (zhang@se.cuhk.edu.hk). Research supported by Hong Kong RGC Earmarked Grant CUHK418505. 1 1 Introduction Polynomial-time approximation algorithms for NP-hard problems via semidefinite programming (SDP) have received much attention in the last decade, since, in several important cases, this approach leads to significant improvements on the worst-case approximation ratios. The pioneering work along this direction is the famous 0.87856-approximation algorithm of Goemans and Williamson [7] for the Max-Cut problem. Nesterov [11] and Ye [14] obtained π2 -approximation algorithms for quadratic maximization problem with a positive semidefinite objective matrix. Actually, the π2 approximation result can also be derived from the so-called real matrix cube theorem developed by Ben-Tal and Nemirovski in [3]. Alternatively, Alon and Naor [1] showed that the π2 approximation result can also be obtained via the so-called Grothendieck’s Inequality. Moreover, Ben-Tal and Nemirovski [3] and Alon and Naor [1] showed that the π2 bound is essentially tight. Recently, Goemans and Williamson [8] proposed a randomized approximation algorithm, via complex SDP relaxation, for solving the Max-3-Cut problem, which is formulated as a quadratic maximization problem with complex-valued decision variables. In particular, they consider the following model: maximize z H Qz subject to zk3 = 1, k = 1, ..., n, where Q is the Laplacian of the graph (hence positive semidefinite for a nonnegatively weighted graph). By a SDP relaxation and random hyperplane method, Goemans and Williamson showed that the algorithm achieves a 0.836 approximation ratio. Recently, Zhang and Huang [16] extended Goemans and Williamson’s model [8], where they first developed a closed-form formula for computing the probability of a complex-valued normally distributed bivariate random vector to be in a given angular region, and then applying this formula they obtained an approximation ratio π4 for the problem: maximize z H Qz subject to |zk | = 1, k = 1, ..., n, where Q is Hermitian positive semidefinite. Similar as with its real-case counter-part, in fact this π4 approximation ratio can also be obtained in two other ways: either by the so-called complex matrix cube theorem developed by Ben-Tal, Nemirovski and Roos [4], or by the complex Grothendieck’s Inequality approach developed by Haagerup [9]. However, these approaches shed lights on the problem from very different angles. For the discrete version of the model: maximize z H Qz subject to zkm = 1, k = 1, ..., n, So, Zhang and Ye [12] obtained the m2 (1 − cos 2π m )/8π approximation ratio based on Grothendieck’s Inequality. We obtained the same approximation ratio in a later version of [16], by using the probability formula that we developed earlier. Charikar and Wirth [6] considered the following problem: maximize xT Qx subject to x2k = 1, k = 1, ..., n, where Q is a symmetric matrix with zero diagonal elements. They presented an Ω(1/ log n)-approximation algorithm for such problems. The novelty of their approach lies in a rounding procedure, taking into account the size of the projections of a random vector onto the solution vectors. A related paper was due to Alon and Naor [1], where they considered the so-called cut-norm of a real 2 matrix. In fact, they first considered another norm, denoted as ||A||∞7→1 , of a real matrix A ∈ <m×n , with ||A||∞7→1 := max{xT Ay | x2i = 1, yj2 = 1, i = 1, ..., m, j = 1, ..., n}. Alon and Naor [1] proposed a 0.56-approximation algorithm for computing the norm ||A||∞7→1 of a real matrix A, where the π √ ratio 0.56 is the reciprocal of an upper bound 2 ln(1+ of the real Grothendieck’s constant. Clearly, 2) ||A||∞7→1 can also be viewed as the optimal value of a binary quadratic programming, which in fact is a subclass of the problem considered by Charikar and Wirth [6]. In this paper, we consider approximation algorithms for indefinite complex quadratic programming with m-point constellation constraint. In particular, we consider the following two problems. Problem (i): maximize z H Qz, subject to zkm = 1, k = 1, ..., n, where Q is a Hermitian matrix with tr Q = 0; and Problem (ii): maximize Re y H Az, subject to ykm = 1, k = 1, ..., p, and zlm = 1, l = 1, ..., q, where A ∈ Cp×q . These models are extensions of Charikar and Wirth [6], and Alon and Naor [1] respectively. We propose an Ω(1/ log n)-approximation algorithm for Problem (i), and a ³ ´ m2 (1−cos 4π 2π ) m − 1 -approximation algorithm for Problem (ii). Some further extensions of the model in Problem (i) are also considered. With regard to Problem (ii), we propose a 0.7118-approximation algorithm in the limiting continuous case (m → ∞), where we make use of several results established in Zhang and Huang [16], and Haagerup [9]. This paper is organized as follows. In Section 2 we discuss an approximation algorithm for the indefinite complex quadratic maximization problem: Problem (i), and show that the approximation results can be obtained for quadratic programming with convex constraint on squared modulus and continuous complex quadratic programming. In Section 3, we study Problem (ii), and study approximation algorithms for both the discrete and continuous cases. Notation. We denote by ā the conjugate of a complex number a, Arg z the argument of z, |z| the modulus of z, and by Cn the space of n-dimensional complex vectors. As a convention we assume Arg z = 0 if z = 0. For a given vector z ∈ Cn , we denote z H the conjugate transpose of z, and Diag(z) the n×n diagonal matrix with diagonal entries taken from z, and if Z is an n×n matrix, then diag(Z) denotes an n-dimensional vector formed by the diagonal elements of Z. The space of n × n real symmetric and the space of complex Hermitian matrices are denoted by S n and Hn , respectively. For a matrix Z ∈ Hn , we write Re Z and Im Z for the real part and imaginary part of Z, respectively. Matrix Z being Hermitian implies that Re Z is symmetric and Im Z is skew-symmetric. We denote n (S n ) and Hn (Hn ) the cones of real symmetric positive semidefinite (positive definite) by S+ ++ + ++ and complex Hermitian positive semidefinite (positive definite) matrices, respectively. The notation Z º 0 ( 0) means that Z is positive semidefinite (positive definite). For two complex matrices Y and £ ¤ Z, their inner product Y • Z is defined to be Re (tr Y H Z) = tr (Re Y )T (Re Z) + (Im Y )T (Im Z) , where tr denotes the trace of a matrix and T denotes the transpose of a matrix. 3 2 Indefinite complex quadratic maximization In this section, we consider the following (indefinite) complex quadratic maximization problem (DQP) max z H Qz s.t. zk ∈ C and zkm = 1, k = 1, . . . , n, where Q 6= 0 is an indefinite Hermitian matrix with diag(Q) = 0, and m ≥ 2 is an integer which is a part of the input parameter of the problem. Clearly, the problem can be more explicitly written as (DQP) max z H Qz s.t. zk ∈ {1, ω, . . . , ω m−1 }, k = 1, . . . , n, 2π where ω = ei m = cos 2π m + i sin m . 2π In case Q is real symmetric and m = 2, the above model coincide with the one considered in Charikar and Wirth [6]. We remark that, as |zk | = 1, the objective function value remains unchanged if we replace the condition diag(Q) = 0 by tr Q = 0. Zhang and Huang [16] considered approximation algorithms for such problem when Q is Hermitian positive semidefinite. In that case, applications of such models arise from solving the Max-3-Cut problem (Goemans and Williamson [8]) and signal processing (Luo, Luo, and Kisialiou [10]). We solve the following semidefinite programming as a relaxation of (DQP): (SDP) max Q • Z s.t. Zkk = 1, k = 1, . . . , n, Z º 0. This relaxed problem (complex SDP) can be solved in polynomial time up to any prescribed precision. For practical solution methods, see e.g. Sturm [13]. Next we shall discuss what to do with the optimal solution of the relaxed problem, and our method is inspired by Charikar and Wirth [6]. Suppose that Z ∗ is an optimal solution of (SDP). We draw a random complex vector ξ ∈ Nc (0, Z ∗ ), where Nc (0, Z ∗ ) stands for the n-dimensional complex-valued normal distribution with mean vector zero and covariance matrix Z ∗ . (More discussions on complex-valued normal distribution can be found in [2] and [16]). For k = 1, 2, ..., n, let ( ξk /|ξk |, if |ξk | > T ; xk := (1) ξk /T, if |ξk | ≤ T , where T > 0 is a number to be specified later. 4 This process naturally produces a complex vector x ∈ Cn with |xk | ≤ 1, k = 1, ..., n. We then use x to further randomly generate zk independently from each other as follows: 1, with probability (1 + Re xk )/m . . . zk = (2) ωj , with probability (1 + Re (ω −j xk ))/m . .. m−1 ω , with probability (1 + Re (ω −(m−1) xk ))/m for k = 1, . . . , n. Indeed we note that (1 + Re (ω −j xk ))/m ≥ 0 for all j = 0, 1, ..., m − 1, and that m−1 m−1 X X 1 (1 + Re (ω −j xk ))/m = 1 + Re ω −j xk = 1. m j=0 j=0 With regard to this second randomization process (from x to z), we have the following general result. Lemma 2.1 For k 6= l, it holds that E[zk z̄l ] = ( E [Re xk Re x̄l ] , 1 4 E[xk x̄l ], for m = 2, for m ≥ 3. The case m = 2 is easy to see, and is actually used in Charikar and Wirth [6]. For the case m ≥ 3, the proof of Lemma 2.1 involves some tedious calculations, and we postpone the proof to the appendix. We remark that there is an immediate consequence of Lemma 2.1 regarding the relationship between the optimal Max-2-Cut value and the optimal Max-3-Cut value for the same weighted graph. To be specific, consider a weighted graph (undirected) with n nodes, and the weight on the edge (k, l) is wkl (k = 6 l). Let Q be the Laplacian matrix of a weighted graph, i.e., Qkl = −wkl for k 6= l, and P 2π 4π Qkk = l6=k wkl , k = 1, ..., n. Let xk ∈ {1, −1}, k = 1, ..., n, and zk ∈ {1, ei 3 , ei 3 }. Let X = xxT and Z = zz H . It is easy to verify that the corresponding 2-Cut value associated with x is 14 Q • X, and the corresponding 3-Cut value associated with z is 13 Q • Z. Let us denote the sum of all weights be P W ∗ := k<l wkl . Now let x ∈ {1, −1}n be corresponding to the optimal Max-2-Cut solution. Based 2π 4π on x we again generate z ∈ {1, ei 3 , ei 3 } as described in the above procedure. Then we have v(M3C) ≥ = = 1 E[Q • Z] 3à ! n X 1 X Qkk + 2 Qkl Re E [zk z̄l ] 3 k=1 k<l à ! X 1 1 2W ∗ + Qkl xk xl 3 2 k<l = 1 ∗ 1 W + v(M2C), 2 3 5 (3) where v(M3C) is the optimal Max-3-Cut value and v(M2C) is the optimal Max-2-Cut value. Note that in relation (3), no assumption is made regarding the signs of the weights. In the construction of vector x, its components differ from that of ξ/T only when |ξk | > T . Our next target is to show that the difference is small if T is large. For k 6= l, define ∆kl = ξk ξ¯l /T 2 − xk x̄l . 2 Lemma 2.2 For k 6= l and T > 1 it holds that E[|∆kl |] < e−T ( T5 + 4). Proof. Let us divide the complex space C2 (with overlaps) into A = {(ξk , ξl ) : |ξk | ≤ T, |ξl | ≤ T }, B = {(ξk , ξl ) : |ξl | > T } and C = {(ξk , ξl ) : |ξk | > T }. Suppose, without loss of generality due to symmetricity, that EC [|∆kl |] ≤ EB [|∆kl |]. We have E[|∆kl |] ≤ EA [|∆kl |] + EB [|∆kl |] + EC [|∆kl |] ≤ EA [|∆kl |] + 2EB [|∆kl |]. (4) Obviously, EA [|∆kl |] = 0, and we need only to treat the term EB [|∆kl |]. ∗ = γeiα . Since Suppose that E[ξk ξ¯l ] = Zkl à à à ! !! 1 γeiα ξk ∈ Nc 0, , ξl γe−iα 1 we can express (ξk , ξl ) in terms of (η, λ) as à where η λ ! p 1 − γ 2 λ, ξl = η, ξk = γeiα η + ∈ Nc (0, I2 ). Thus, we have Z Prob {|ξl | > T } = Prob {|η| > T } = ∞ Z 2π T 0 r −r2 e dθdr = 2 π Z ∞ 2 2 re−r dr = e−T . T Moreover, 2 EB [|xk x̄l |] ≤ EB [1] = Prob {(ξk , ξl ) ∈ B} = Prob {|η| > T } = e−T . On the other hand, since γ ≤ 1, we have EB [|ξk ξl |] = EB [γeiα η η̄ + p 1 − γ 2 λη̄] ≤ EB [|η|2 + |η| · |λ|] Z ∞ Z 2π 3 Z ∞ Z 2π 2 Z ∞ Z 2π 2 r −r2 r s −s2 = e dθdr + dθ1 dr e dθ2 ds π π π T 0 T 0 0 0 √ √ π π 2 T + 1)e−T + (1 − Φ( 2 T )), = (T 2 + 2 2 6 where in the last equality we use the facts that √ Z ∞ Z ∞ √ 1 2 T −T 2 π 3 −r2 −T 2 2 −r2 r e dr = (T + 1)e , and r e dr = e + (1 − Φ( 2 T )), 2 2 2 T T and Z 4 ∞ 2 s2 e−s ds = √ π, 0 where Φ(·) is the cumulative distribution function of the real-valued standard normal distribution N (0, 1). Therefore, EB [|∆kl |] ≤ EB [|ξk ξ¯l |/T 2 + |xk x̄l |] √ √ π π −T 2 1 + 2) + (1 − Φ( 2 T )), ≤ e ( 2+ T 2T 2T 2 and so from (4) we have E[|∆kl |] ≤ e −T 2 √ √ π π 2 + 4) + 2 (1 − Φ( 2 T )). ( 2+ T T T (5) Further notice that √ Z ∞ √ Z ∞ √ √ π π π −T 2 π −s2 −s2 e ds ≤ 2 se ds = (1 − Φ( 2 T )) = 2 e , 2 T T T 2T 2 T T and so it follows from (5) that E[|∆kl |] ≤ e −T 2 √ √ 4+ π π 2 5 ( + 4) < e−T ( + 4). + 2 2T T T Note that in the above derivations we used the fact that T > 1. Lemma 4 of [6] asserts that v(DRQP) ≥ 1X |akl |, n k,l where v(DRQP) is the optimal value of (DRQP) max xT Ax s.t. xk ∈ {±1}, k = 1, . . . , n, where A is a real matrix with diag(A) = 0. Since (SDP) is a relaxation of the following continuous complex quadratic programming (CQP) (CQP) max z H Qz s.t. |zk | = 1, k = 1, . . . , n, 7 ¤ (6) it follows that the optimal value of (SDP), v(SDP), is no less than the optimal value of (CQP), v(CQP). By denoting Re Q = A, Im Q = B, Re z = x and Im z = y, we may write (CQP) as an equivalent real quadratic program (CRQP): à !à ! A −B x T T (CRQP) max (x , y ) B A y 2 2 s.t. xk + yk = 1, k = 1, . . . , n. That is, v(CQP) = v(CRQP). Obviously, the optimal value of (CRQP), v(CRQP), is no less than v(SRQP), the optimal value of à !à ! A −B x T T (SRQP) max (x , y ) B A y 2 2 s.t. xk = yk = 1/2, k = 1, . . . , n, which is in fact equivalent to the following problem, by substitution of variables u := √ v := 2 y, à !à ! A −B u max 12 (uT , v T ) B A v 2 2 s.t. uk = vk = 1, k = 1, . . . , n. √ 2 x and Hence, using (6) one obtains v(SRQP) ≥ 1 X 1 X (|Akl | + |Bkl |) ≥ |Qkl |. 2n 2n k,l k,l Finally, by noting v(SDP) ≥ v(CQP) = v(CRQP) ≥ v(SRQP), we arrive at the following result: Lemma 2.3 It holds that v(SDP) ≥ 1 2n P k,l |Qkl |. Theorem 2.4 Suppose that n ≥ 3. By setting T = 8 √ 9 log n, we have E[z H Qz] ≥ 1 40 log n v(SDP). Proof. It follows that E[z H Qz] = X Qlk E[zk z̄l ] k6=l = 1X Qlk E[xk x̄l ] 4 (7) k6=l = 1X Qlk E[−∆kl + ξk ξ¯l /T 2 ] 4 k6=l = = ≥ ≥ X 1 X ( Qlk E[ξk ξ¯l /T 2 ] − Qlk E[∆kl ]) 4 k6=l k6=l X 1 v(SDP)/T 2 − Qlk E[∆kl ] 4 k6=l X 1 |Qlk | × |E[∆kl ]| v(SDP)/T 2 − 4 k6=l µ ¶ 1 2 −T 2 5 v(SDP)/T − e ( + 4) × 2n × v(SDP) 4 T (8) 2 1 − e−T (4T 2 + 5T )2n = v(SDP), 4T 2 where, equality (7) is due to Lemma 2.1, and inequality (8) is due to Lemmas 2.2 and 2.3. If T = √ 9 log n then 2 1−e−T (4T 2 +5T )2n T2 > 1 10 log n , E[z H Qz] ≥ whenever n ≥ 3. Thus we have 1 v(SDP ). 40 log n ¤ Theorem 2.4 implies that the approximation ratio of our randomized algorithm for (DQP) is at least Ω(1/ log n), extending the approximation result of Charikar and Wirth [6] to the complex discrete quadratic maximization model (DQP). In the next two subsections, we shall see that the result can be further extended. 2.1 Constrained squared modulus Consider the following constrained model (DQPC) max z H Qz 1 s.t. Arg zk ∈ {0, m 2π, ..., m−1 m 2π}, k = 1, . . . , n, 2 2 T (|z1 | , . . . , |zn | ) ∈ F, 9 where Q ∈ Hn with diag(Q) = 0, and F is a closed convex set in <n . In particular, if F is the singleton e (the all-one vector), then (DQPC) reduces to (DQP). Consider the following convex SDP relaxation for (DQPC) (CSDP) max Q • Z s.t. diag(Z) ∈ F, Z º 0. For 0 ≤ d ∈ <n , define its generalized inverse vector d−1 to be ( 1/dk , if dk > 0; −1 (d )k = 1, if dk = 0, where k = 1, . . . , n. Thus the diagonal matrix D−1 := Diag(d−1 )  0. Thus, ( 1, if dk > 0; (D−1 D)kk = 0, if dk = 0. p For a Hermitian positive semidefinite matrix Z, let d := diag(Z) and D := Diag(d), and Z̃ := Z + (I − D−1 D). (9) Then Z̃kl = Zkl for any k 6= l, Z̃kk = Zkk if Zkk > 0, and Z̃kk = 1 if Zkk = 0. n , and Z̃ be defined as in (9). Then the matrix D −1 Z̃D −1 has the properties: Lemma 2.5 Let Z ∈ H+ (i) D−1 Z̃D−1 º 0; (ii) diag(D−1 Z̃D−1 ) is the n-dimensional all-one vector; (iii) DD−1 Z̃D−1 D = Z. Proof. The first two assertions follow from the definition, and the last assertion can be seen from the fact that DD−1 Z̃D−1 D = DD−1 (Z + (I − D−1 D))D−1 D = DD−1 ZD−1 D = Z. This completes the proof of (iii). ¤ Now we consider the following randomization method for solving (DQPC). p Let Z ∗ be an optimal solution of (CSDP). Let d := diag(Z ∗ ), D := Diag(d), D−1 := Diag(d−1 ), and Z̃ ∗ is defined according to (9). 10 Draw a random complex vector ξ ∈ Nc (0, D−1 Z̃ ∗ D−1 ), and set x according to (1), and generate z according to (2). Finally, set z := Dz. Next we shall analyze the performance of this algorithm. Consider the following complex quadratic program max DQD • zz H s.t. zk ∈ {1, ω, . . . , ω m−1 }, k = 1, . . . , n, and its SDP relaxation (S) max DQD • Z s.t. Zkk = 1, k = 1, . . . , n, Z º 0. p ∗ ∗ 1 P By Lemma 2.3 we have v(S) ≥ 2n k6=l |Qkl | Zkk Zll . Let Y ∗ be optimal for (S). Since DY ∗ D is feasible for (CSDP), v(CSDP) ≥ Q • DY ∗ D = DQD • Y ∗ = v(S) ≥ p ∗ ∗ 1 X |Qkl | Zkk Zll . 2n k6=l That is, − X p ∗ ∗ |Qkl | Zkk Zll ≥ −2n × v(CSDP). k6=l Therefore, E[Q • zz H ] = p ∗ ∗ 1X Qkl Zkk Zll E[xk x̄l ] 4 k6=l = ≥ ≥ ≥ p ∗ ∗ 1X Qkl Zkk Zll (E[ξk ξ¯l ]/T 2 − E[∆kl ]) 4 k6=l −1 ∗ −1 X p ∗ ∗ 1 DQD • D Z̃ D Qkl Zkk − Zll E[|∆kl |] 2 4 T k6=l −1 ∗ −1 X p 1 DQD • D Z̃ D 2 5 ∗ Z∗ |Qkl | Zkk − e−T ( + 4) ll 2 4 T T k6=l µ ¶ 1 v(CSDP) −T 2 5 −e ( + 4)2n × v(CSDP) 4 T2 T 2 1 − e−T (4T 2 + 5T )2n v(CSDP) = 4T 2 1 ≥ v(CSDP), 40 log n √ where in the last step we take T = 9 log n and n ≥ 3. This concludes the theorem below. 11 Theorem 2.6 Let v(DQPC) be the optimal value of (DQPC) and v(CSDP) be the optimal value of 1 (CSDP). Then v(DQPC) ≥ 40 log n v(CSDP) for n ≥ 3. This implies the approximation ratio for the algorithm of (DQPC) is at least Ω(1/ log n). 2.2 Continuous indefinite complex quadratic maximization Consider the following model (CQP) max z H Qz s.t. |zk | = 1, k = 1, . . . , n, where Q ∈ Hn and diag(Q) = 0. We use (SDP) a relaxation for the above problem (CQP). Suppose that Z ∗ is an optimal solution of (SDP). Draw a random complex vector ξ ∈ Nc (0, Z ∗ ), and set x according to (1), and then independently generate zk ∈ Cn as follows: ( eiArg xk , with probability (1 + |xk |)/2, zk = i Arg x k −e , with probability (1 − |x |)/2, k for k = 1, . . . , n. In that case, we have E[zk z̄l ] = E[xk x̄l ] (see also Charikar and Wirth [6]). This leads to X X Qkl E[xk x̄l ] Qkl E[zk z̄l ] = E[z H Qz] = k6=l k6=l ≥ 1−e −T 2 (4T 2 + 5T )2n 1 v(SDP), v(SDP) > T2 10 log n √ if T is taken to be 9 log n (with n ≥ 3). Hence the approximation ratio of the above approximation algorithm is at least Ω(1/ log n). So, Zhang, and Ye [12] obtained the same result for this model. However, the basic techniques used in [12] are quite different from ours. 3 Discrete complex bilinear maximization Alon and Naor [1] studied the problem of approximating the cut-norm ||A||C of a real matrix A ∈ <p×q . They in fact worked on approximation method for computing the norm ||A||∞7→1 , which turned out to provide a scheme to approximate the cut-norm of A. By definition, the value of ||A||∞7→1 is given by the following bilinear binary optimization problem max y T Az s.t. yk , zl ∈ {±1}, k = 1, . . . , p, l = 1, . . . , q, 12 which can be equivalently cast as a binary quadratic maximization problem à !à ! 0 A y max 21 (y T , z T ) T A 0 z s.t. yk , zl ∈ {±1}, k = 1, . . . , p, l = 1, . . . , q. At this point, we remark that computing ||A||∞7→1 is clearly a special case of Charikar and Wirth [6]’s model (DRQP). Based on an identity due to Grothendieck, Alon and Naor [1] managed to derive a 0.56-approximation algorithm for computing ||A||∞7→1 . In this section, we shall consider the discrete complex bilinear maximization problem max Re y H Qz s.t. yk , zl ∈ {1, ω, . . . , ω m−1 }, k = 1, . . . , p, l = 1, . . . , q, or, equivalently, the following complex discrete quadratic program (DBLP): à !à ! 0 Q y 1 H H (DBLP) max 2 (y , z ) H Q 0 z m−1 s.t. yk , zl ∈ {1, ω, . . . , ω }, k = 1, . . . , p, l = 1, . . . , q, where Q is an p × q complex matrix, and ω = cos(2π/m) + i sin(2π/m), m ≥ 3. We also consider the following continuous version of (DBLP), à !à ! 0 Q y 1 H H (CBLP) max 2 (y , z ) H Q 0 z s.t. |yk | = |zl | = 1, k = 1, . . . , p, l = 1, . . . , q. Clearly, the objective matrix of (DBLP) is Hermitian with zero diagonals. As its real counter-part, (DBLP) is a subclass of (DQP) and (CBLP) is a subclass of (CQP). The SDP relaxation for (DBLP) and (CBLP) is à ! 0 Q (SDBLP) max 12 •W QH 0 s.t. Wkk = 1, k = 1, . . . , p + q, W º 0. Lemma 2.3 asserts that if Q 6= 0 then v(SDBLP) > 0. Also, we note that in the summation form, the P P objective function of (SDBLP) is pk=1 ql=1 Re (Q̄kl Wk,p+l ), and the objective function of (DBLP) P P and (CBLP) is pk=1 ql=1 Re (Q̄kl yk z̄l ). 13 3.1 An approximation algorithm for the discrete complex bilinear program Let W ∗ be an optimal solution for (SDBLP). We draw a random complex vector as follows: à ! ξ ∈ Nc (0, W ∗ ), η and generate complex vectors y ∈ Cp and z ∈ Cq as follows: j For k = 1, ..., p, assign yk := ω j if Arg ξk ∈ [ m 2π, j+1 m 2π) with j ∈ {0, 1, . . . , m − 1}; and j For l = 1, ..., q, assign zl := ω j if Arg ηl ∈ [ m 2π, j+1 m 2π) with j ∈ {0, 1, . . . , m − 1}. In Zhang and Huang [16] we have shown that E[yk z̄l ] = m−1 m(2 − ω − ω −1 ) X j ∗ ∗ ω (arccos(−Re (ω −j Wk,p+l )))2 =: Fm (Wk,p+l ), ∀k, l, 8π 2 j=0 and Fm (z) = z for z ∈ {1, ω, . . . , ω m−1 }. Furthermore, in Appendix of [16] we established that Fm (z) = with m2 (1 − cos 2π m) z + φ1 (z) + φ2 (z), 8π ∞ 2r+1 X X m2 (1 − cos 2π m) φ1 (z) = ar b2r+2−2i z i (z̄)2r+1−i , 4π r=1 and φ2 (z) = i=0 ∞ 2s+2t+2 X X m2 (1 − cos 2π m) a a b2s+2t+3−2i z i (z̄)2s+2t+2−i , s t 2 4π s=0,t=0 i=0 where (2r)! ar = 4r+1 , bk+1−2i = 2 (r!)2 (2r + 1) Note that à k i ! m−1 X ei( m ji) . 2π j=0 Pm−1 i( 2π ji) m is either 0 or m. j=0 e Let φ(z) := φ1 (z) + φ2 (z). If Z º 0 then Z̄ º 0. Moreover, the Hadamard product of Hermitian positive semidefinite matrices remain positive semidefinite. This implies that if Z º 0 then φ(Z) º 0, where φ(Z) := (φ(Zkl ))n×n . m2 (1−cos 2π ) m2 (1−cos 2π ) m m On the other hand, since Fm (1) = 1 we have 1 = + φ(1). Let βm := . We 8π 8π ∗ ∗ conclude that (φ(W ))kk /(1 − βm ) = 1, for k = 1, ..., p + q, and so, φ(W )/(1 − βm ) is itself a feasible 14 solution for (SDBLP). Now observe that for any feasible solution of (SDBLP), say W , it necessarily follows that à ! 0 Q 1 −v(SDBLP) ≤ • W ≤ v(SDBLP). (10) 2 QH 0 The second inequality is obvious, by definition of the feasibility. To argue that the first inequality also holds, we consider a decomposition of W º 0, i.e., à ! UH W = · (U, V ), VH where the number of rows in U H is p, and the number of rows in V H is q. Let us now consider another solution, à ! UH W̃ := · (U, −V ) º 0. −V H Since the diagonal of W̃ remains the all-one vector, it is also a feasible solution for (SDBLP). Therefore à ! à ! 0 Q 0 Q 1 1 v(SDBLP) ≥ • W̃ = − • W, 2 2 QH 0 QH 0 and so the first inequality in (10) follows. Therefore, à ! 0 Q 1 • φ(W ∗ )/(1 − βm ) ≥ −v(SDBLP). 2 QH 0 Now we are in a position to calculate the expected value of the randomized solution " p q # p X q XX X ∗ E Re (Q̄kl yk z̄l ) = Re (Q̄kl Fm (Wk,p+l )) k=1 l=1 k=1 l=1 = p X q X ∗ ∗ Re (Q̄kl (βm Wk,p+l + φ(Wk,p+l ))) k=1 l=1 1 = βm × v(SDBLP) + (1 − βm ) × 2 à 0 Q QH 0 ! • φ(W ∗ )/(1 − βm ) ≥ βm × v(SDBLP) + (βm − 1) × v(SDBLP) = (2βm − 1) × v(SDBLP) m2 (1 − cos 2π m) = ( − 1) × v(SDBLP). 4π This leads to the following result. m2 (1−cos 2π ) m Theorem 3.1 There is an approximation algorithm for (DBLP) with the ratio αm := −1 4π for m ≥ 3. In particular, α3 ≥ 0.0742, α4 ≥ 0.2732, α5 ≥ 0.3746, α10 ≥ 0.5198, and α100 ≥ 0.5702. 15 3.2 An improved approximation algorithm for the continuous bilinear problem Since (CBLP) is the limit of (DBLP) with m → ∞, it is clear that if we let yk = eiArg ξk and zl = eiArg ηl where (ξ H , η H )H is generated from Nc (0, W ∗ ) and W ∗ is an optimal solution of (SDBLP), then we will get an approximation algorithm with an approximation ratio of limm→∞ π 2 − 1 ≈ 0.5708. m2 (1−cos 4π 2π ) m −1 = In this subsection, we shall show that this ratio can be improved if we further exploit particular structures of Problem (CBLP). Our analysis below makes use of some of the results in our previous paper [16], and also important insights presented in a paper by Haagerup, [9]. According to the analysis in Sections 3.1 and 3.3 of Zhang and Huang [16], if we generate yk = eiArg ξk and zl = eiArg ηl with (ξ H , η H )H , then we have ∗ ∗ lim Fm (Wk,p+l ) (=: F (Wk,p+l )) Z 2π 1 eiθ (arccos(−γ cos(θ − α)))2 dθ = 4π 0 Z π/2 i α = e arcsin(γ sin θ) sin θdθ E[yk z̄l ] = m→∞ 0 ∞ = π iα X e cr γ 2r+1 , 4 r=0 ∗ where we denoted Wk,p+l as γeiα and cr = Z ((2r)!)2 . 24r (r!)2 (r+1) à π/2 arcsin(γ sin θ) sin θdθ ψ(γ) := For γ ∈ [−1, 1], let Z π/2 p =γ 0 0 (cos θ)2 1 − (γ sin θ)2 ! dθ . In that notation, the transformation function F can be rewritten as F (z) = eiArg z ψ(|z|). We remark here that equation (3.9) in Zhang and Huang [16] coincides with Lemma 3.2 in Haagerup [9], although we were not aware of [9] at the time when we derived that equation. Important properties of the function ψ(γ) are discussed in [9]. In particular, Theorem 2.1 of [9] states that the inverse function ψ −1 : [−1, 1] → [−1, 1] of ψ exists and it can be expanded into an absolutely convergent power series: ∞ X b2r+1 s2r+1 , s ∈ [−1, 1], ψ −1 (s) = r=0 4 π with b1 = and b2r+1 ≤ 0 for all r ≥ 1. Specifically, b3 = −8/π 3 , b5 = 0, b7 = −16/π 7 , b9 = −80/π 9 , b11 = −480/π 11 , b13 = −3136/π 13 and b2r+1 ∼ −4/((2r + 1) log(2r + 1))2 for r → ∞. Moreover, the following result was shown in [9], which was used to bound the complex Grothendieck constant. 16 Lemma 3.2 There is a unique β ∈ (0, 1) for which with β ≈ 0.7118. P∞ r=0 |b2r+1 |β 2r+1 = 1 (i.e., ψ −1 (β) = π8 β − 1), Now the inverse function of F (z) can be written as F −1 (z) = eiArg z ψ −1 (|z|) = ∞ X b2r+1 z r+1 z̄ r . r=0 p+q with all-one diagonal elements, let us construct another Hermitian matrix For a given W ∈ H+ p+q G(W ) ∈ H as follows: ∞ Gk,p+l (W ) := Gk1 ,k2 (W ) := X 4 βWk,p+l − |b2r+1 |β 2r+1 (Wk,p+l )r+1 (W̄k,p+l )r , k = 1, ..., p, l = 1, ..., q, π 4 βWk1 ,k2 + π r=1 ∞ X |b2r+1 |β 2r+1 (Wk1 ,k2 )r+1 (W̄k1 ,k2 )r , k1 , k2 = 1, ..., p, r=1 ∞ Gp+l1 ,p+l2 (W ) := X 4 βWp+l1 ,p+l2 + |b2r+1 |β 2r+1 (Wp+l1 ,p+l2 )r+1 (W̄p+l1 ,p+l2 )r , l1 , l2 = 1, ..., q. π r=1 That is, Gk,p+l (W ) = F −1 (βWk,p+l ), k = 1, ..., p, l = 1, ..., q, 8 Gk1 ,k2 (W ) = βWk1 ,k2 − F −1 (βWk1 ,k2 ), k1 , k2 = 1, ..., p, π 8 βWp+l1 ,p+l2 − F −1 (βWp+l1 ,p+l2 ), l1 , l2 = 1, ..., q. Gp+l1 ,p+l2 (W ) = π By the choice of β (Lemma 3.2), we see that if W has all-one diagonal elements, then so is true for the T T T T Hermitian matrix G(W ). Denote E := (eT p , −eq ) (ep , −eq ) (º 0), where ep and eq are the all-one vectors in <p and <q respectively. We can now write G(W ) in a uniform fashion as follows ∞ G(W ) = X 4 βW + |b2r+1 |β 2r+1 E ◦ (W )(r+1) ◦ (W T )(r) . π r=1 where ‘A ◦ B’ stands for the Hadamard product between A and B, and A(r) is the rth power in the r }| { z (r) Hadamard sense, i.e., A = A ◦ A ◦ · · · ◦ A. If W º 0, then, by the fact that the Hadamard product of positive semidefinite matrices remains positive semidefinite, we have G(W ) º 0. As a remark, we note here that combining this and the previous observation (regarding the diagonals of G(W )) leads to the conclusion that if W is a feasible solution for (SDBLP) then so is true for G(W ). 17 Suppose that W ∗ is an optimal solution of (SDBLP). Let us take yk = eiArg ξk and zl = eiArg ηl with (ξ H , η H )H randomly generated from Nc (0, G(W ∗ )). In that case, the expected objective value is X X E Re (Q̄kl yk z̄l ) = Re (Q̄kl E[yk z̄l ]) k,l k,l = X ∗ Re (Q̄kl F (G(Wk,p+l ))) k,l = X ∗ Re (Q̄kl F (F −1 (βWk,p+l ))) k,l = X ∗ Re (Q̄kl βWk,p+l ) k,l = β X ∗ Re (Q̄kl Wk,p+l ) k,l = β × v(SDBLP) ≈ 0.7118 × v(SDBLP). This proves the following theorem. Theorem 3.3 The above randomized algorithm has an approximation ratio 0.7118 for the continuous complex bilinear maximization problem (CBLP). A Proof of Lemma 2.1 We only consider the case m ≥ 3 here. Since for the random variables z and x it holds that E[zk z̄l ] = E[E[zk z̄l | (xk , xl )]], for k 6= l, we shall first compute E[zk z̄l | (xk = x0k , xl = x0l )]. For simplicity, we drop the superscript naughts of x0k and x0l , and denote the expectation E[zk z̄l | (xk = x0k , xl = x0l )] simply by E[zk z̄l | (xk , xl )], and Prob {zk = ω j , zl = ω j−i | (xk = x0k , xl = x0l )} by Prob {zk = ω j , zl = ω j−i | (xk , xl )}. We have E[zk z̄l | (xk , xl )] = 1 × m−1 X Prob {zk = ω j , zl = ω j | (xk , xl )} + · · · j=0 i +ω × m−1 X Prob {zk = ω j , zl = ω j−i | (xk , xl )} + · · · j=0 +ω m−1 × m−1 X Prob {zk = ω j , zl = ω j−m+1 | (xk , xl )}. j=0 18 Obviously, m−1 X Prob {zk = ω j , zl = ω j−i | (xk , xl )} j=0 = m−1 X j=0 1 + Re (ω −j xk ) 1 + Re (ω −j+i xl ) × m m m−1 1 1 X Re (ω −j xk )Re (ω −j+i xl ). + m m2 = j=0 Thus we further have E[zk z̄l | (xk , xl )] = m−1 m−1 m−1 1 X iX 1 X j ω + 2 ω Re (ω −j xk )Re (ω −j+i xl ) m m j=0 = i=0 j=0 m−1 m−1 X 1 X −j Re (ω x ) ω i Re (ω −j+i xl ). k 2 m j=0 i=0 To simplify the expression, let us write xk = a + ib and xl = c + id. Then we have m−1 X Re (ω −j xk ) j=0 m−1 X ω i Re (ω −j+i xl ) i=0 " m−1 X " # # m−1 m−1 m−1 X X X j j i j−i i j−i cos 2π = ac sin 2π cos 2π cos 2π + bd cos 2π sin 2π m m m m m m j=0 i=0 j=0 i=0 " # # " m−1 m−1 m−1 m−1 X X X X j i j−i i j−i j sin 2π cos 2π sin 2π + bc cos 2π cos 2π cos 2π +ad m m m m m m i=0 j=0 i=0 j=0 " " # # m−1 m−1 m−1 m−1 X X X X j j i j − i i j − i +i ad cos 2π sin 2π sin 2π + bc sin 2π sin 2π cos 2π m m m m m m j=0 i=0 j=0 i=0 " # " # m−1 m−1 m−1 m−1 X X X X j j i j−i i j−i cos 2π sin 2π +ac sin 2π cos 2π − bd sin 2π sin 2π . m m m m m m j=0 Observing Pm−1 j=0 i=0 j=0 i=0 ω 2j = 0 (note here that m ≥ 3), one has m−1 X j=0 m−1 X 2j 2j sin( 2π) = 0. cos( 2π) = m m j=0 This in turn leads to m−1 X j=0 sin m−1 m−1 X X j j j j (sin 2π)2 = (cos 2π)2 = m/2. 2π cos 2π = 0, and m m m m j=0 19 j=0 It follows that # m−1 X j i j−i cos 2π cos 2π cos 2π m m m j=0 i=0 " ¶# m−1 m−1 X Xµ j j i j i i = cos 2π cos 2π(cos 2π)2 + sin 2π sin 2π cos 2π m m m m m m i=0 j=0 " # m−1 m−1 m−1 X X X j i j j i i = (cos 2π)2 (cos 2π)2 + sin 2π cos 2π sin 2π cos 2π m m m m m m j=0 i=0 i=0 # " m−1 m−1 X X j i = (cos 2π)2 (cos 2π)2 m m m−1 X = j=0 m2 4 " i=0 . In a similar way, one further calculates that " # m−1 " # m−1 m−1 m−1 X X X X j j i j−i i j−i cos 2π sin 2π cos 2π cos 2π = cos 2π sin 2π m m m m m m j=0 i=0 j=0 i=0 # m−1 " # " m−1 m−1 m−1 X X X X j i j−i i j−i j sin 2π sin 2π sin 2π = sin 2π cos 2π cos 2π = − m m m m m m = j=0 2 m i=0 j=0 i=0 4 and # m−1 " # m−1 m−1 X X X j j i j−i i j−i cos 2π sin 2π cos 2π sin 2π = cos 2π cos 2π m m m m m m j=0 i=0 j=0 i=0 " # m−1 " # m−1 m−1 m−1 X X X X j i j−i j i j−i = cos 2π sin 2π cos 2π = sin 2π sin 2π sin 2π m m m m m m m−1 X j=0 " i=0 j=0 i=0 = 0. Hence m−1 X = = = (Re (ω j xk ) m−1 X j=0 m2 ω i Re (ω j−i xl )) i=0 (ac + bd + i(−ad + bc)) 4 m2 (a + ib)(c − id) 4 m2 xk x̄l . 4 20 This means that E [zk z̄l | (xk , xl )] = m2 1 1 × xk x̄l = xk x̄l , 2 m 4 4 £ ¤ i.e., E zk z̄l | (xk = x0k , xl = x0l ) = 41 x0k x0l . Therefore 1 E [zk z̄l ] = E [E [zk z̄l | (xk , xl )]] = E [xk x̄l ] . 4 ¤ References [1] N. Alon and A. Naor, Approximating the Cut-Norm via Grothendieck’s inequality, Proceedings of the Thirty-sixth Annual ACM Symposium on Theory of Computing, Chicago, IL, USA, 2004, pp. 72 – 80. [2] H.H. Andersen, M. Højbjerre, D. Sørensen, and P.S. Eriksen, Linear and graphical models for the multivariate complex normal distribution, Lecture Notes in Statistics 101, Springer-Verlag, New York, 1995. [3] A. Ben-Tal and A. Nemirovski, On tractable approximations of uncertain linear matrix inequalities affected by interval uncertainty, SIAM Journal on Optimization, 12 (2002), pp. 811 – 833. [4] A. Ben-Tal, A. Nemirovski and C. Roos, Extended matrix cube theorems with applications to µ-theory in control, Mathematics of Operations Research, 28 (2003), pp. 497 – 523. [5] D. Bertsimas and Y. Ye, Semidefinite relaxations, multivarite normal distribution, and order statistics, In Handbook of Combinatorial Optimization, vol. 3, D.Z. Du and P.M. Pardalos, eds., Kluwer Academic Publishers, 1998, pp. 1 – 19. [6] M. Charikar and A. Wirth, Maximizing quadratic programs: extending Grothendieck’s inequality, Proc. 45th FOCS, 2004. [7] M.X. Goemans and D.P. Williamson, Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming, Journal of the ACM, 42 (1995), pp. 1115 – 1145. [8] M.X. Goemans and D.P. Williamson, Approximation algorithms for MAX-3-CUT and other problems via complex semidefinite programming, Journal of Computer and System Sciences, 68 (2004), pp. 442 – 470. [9] U. Haagerup, A new upper bound for the complex Grothendieck constant, Israel Journal of Mathematics, 60 (1987), pp. 199 – 224. 21 [10] Z.Q. Luo, X.D. Luo, and M. Kisialiou, An efficient quasi-maximum likelihood decoder for PSK signals, in Proceedings IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’03), 2003, pp. 561 – 564 (vol. 6). [11] Yu. Nesterov, Semidefinite relaxation and nonconvex quadratic optimization, Optimization Methods and Softwares, 9 (1998), pp. 141 – 160. [12] A. So, J. Zhang, and Y. Ye, On approximating complex quadratic optimization problems via semidefinite programming relaxations, in Proceedings of the 11th Conference on Integer Programming and Combinatorial Optimization, Lecture Notes in Computer Science, vol. 3509, M. Jünger and V. Kaibel, eds., Springer-Verlag, Berlin, 2005, pp. 125 – 135. [13] J.F. Sturm, Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones, Optimization Methods and Software, 11-12 (1999), pp. 625 – 653. [14] Y. Ye, Approximating quadratic programming with bound and quadratic constraints, Mathematical Programming, 84 (1999), pp. 219 – 226. [15] S. Zhang, Quadratic maximization and semidefinite relaxation, Mathematical Programming, 87 (2000), pp. 453 – 465. [16] S. Zhang and Y. Huang, Complex quadratic opitmization and semidefinite programming, Technical Report SEEM2004-03, Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, 2004, (accessible at http://www.se.cuhk.edu.hk/∼zhang/#workingpaper). Accepted for publication in SIAM Journal on Optimization. 22