Multivariate exponential integral approximations: a moment approach Dimitris Bertsimas Xuan Vinh Doan

advertisement
Multivariate exponential integral approximations: a moment approach
Dimitris Bertsimas
∗
Xuan Vinh Doan
†
Jean Lasserre
‡
January 2007
Abstract
We propose a method to approximate a class of exponential multivariate integrals using moment
relaxations. Using this approach, both lower and upper bounds of the integrals are obtained and we
show that these bound values asymptotically converge to the real value of the integrals when the
moment degree r increases. We further demonstrate the method by calculating both hypercubic and
order statistic probabilities for multivariate normal distributions.
1
Introduction
Multivariate integrals arise in statistic, physics, engineering and finance applications among other areas.
For example, these integrals are needed to calculate probabilities over compact sets for multivariate
normal random variables. It is therefore important to compute or approximate multivariate integrals.
Usual methods include Monte Carlo schemes (see Niederreiter [7] for details) and cubature formulae as
shown in e.g. de la Harpe and Pache [2]. However, there are still many open problems currently and
research on multivariate integrals is very much active due to its importance as well as its difficulties.
For instance, most cubature formulas are restricted to special sets like boxes and simplices, and even in
this particular context, determination of orthogonal polynomials used to construct a cubature, is not
an easy task.
∗
Boeing Professor of Operations Research, Sloan School of Management, co-director of the Operations Research Center,
Massachusetts Institute of Technology, E40-147, Cambridge, MA 02139-4307, dbertsim@mit.edu.
†
Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA 02139-4307, vanxuan@mit.edu.
‡
LAAS, 7 Avenue Du Colonel Roche, 31077 Toulouse Cédex 4, France, lasserre@laas.fr
1
Contributions and Paper Outline
In this paper, we attempt to approximate a class of exponential integrals, which in particular can be
useful to calculate probabilities of multivariate normal random variables. Specifically, our contribution
and structure of the paper are as follows:
(1) In Section 2, we provide a general framework to calculate lower and upper bounds for the exponential integrals mentioned above. These bounds are calculated by solving specific semidefinite
programming problems constructed from appropriate sequences of moments.
(2) In Section 3, we prove that the two monotone sequences of lower and upper bounds generated
by these semidefinite programming problems will asymptotically converge to the real value of the
integral. The proof is due to some results from the problem of moments.
(3) In Section 4, computational results are reported for probabilities of multivariate random variables
and their order statistics over a hypercube. These results show that the proposed method is indeed
applicable for this class of integrals.
2
General Framework
Before we provide a detailed analysis for general multivariate exponential integrals, and for clarity of
exposition, we briefly describe the approach for simple integrals in two variables x and y on a box
[a, b] × [c, d]. Indeed, the multivariate case n ≥ 3 essentially uses the same machinery but its detailed
exposition is a little tedious as it requires unavoidable heavy notation, whereas the bivariate case on a
box illustrates the approach with minimal effort.
2.1
Simple Integrals
Suppose that one wants to approximate:
Z bZ
d
g(x, y)eh(x,y) dydx,
ρ=
a
(1)
c
where g, h are bivariate polynomials.
Define Ω = [a, b] × [c, d] ∈ R2 and let us consider the measure µ on R2 defined by
Z
eh(x,y) dydx,
µ(B) =
Ω∩B
2
(2)
where B ∈ B(R2 ), and its sequence of moments z = {z(α, β)}:
Z
xα y β eh(x,y) dydx
z(α, β) =
(3)
Ω
for all (α, β) ∈ N2 . Clearly, ρ =
P
(α,β)∈N2
g(α,β) z(α, β) = hg, zi, where g(α,β) is the coefficient of the
monomial xα y β and g is the sequence of those coefficients. Therefore, if we have all necessary moments
z(α, β), we can compute ρ. Using integration by parts, we have:
Z b h
Z bZ d
iy=d
1
1
∂h(x, y) h(x,y)
α
β+1 h(x,y)
z(α, β) =
x y
e
dx −
xα y β+1
e
dydx.
β+1 a
β+1 a c
∂y
y=c
If {h(γ,δ) } is the finite sequence of coefficients of the polynomial h(x, y) then we have:
z(α, β) =
dβ+1
β+1
b
Z
xα eh(x,d) dx −
a
cβ+1
β+1
Z
b
xα eh(x,c) dx −
a
X
(γ,δ)∈N2
δh(γ,δ)
z(α + γ, β + δ).
β+1
If we define two measures ν and η on R with their corresponding sequences of moments:
Z b
Z b
α h(x,d)
v(α) =
x e
dx, w(α) =
xα eh(x,c) dx,
a
(4)
a
for all α ∈ N respectively, then we have:
z(α, β) =
cβ+1
dβ+1
v(α) −
w(α) −
β+1
β+1
X
(γ,δ)∈N2
δh(γ,δ)
z(α + γ, β + δ).
β+1
(5)
Let us define k(x) = h(x, d) and l(x) = h(x, c). Clearly, k(x) and l(x) are univariate polynomials in x.
Using integration by parts again for v(α), we have:
v(α) =
1
1 h α+1 k(x) ix=b
−
x
e
α+1
α+1
x=a
Z
a
b
xα+1
dk(x) k(x)
e
dx,
dx
or
v(α) =
bα+1 ek(b) aα+1 ek(a) X k
−
−
v(α + ),
α+1
α+1
α+1
(6)
∈N
where {k } is the finite sequence of coefficients of the polynomial x 7→ k(x) = Σ k x , of degree kx .
Similarly, we have:
w(α) =
bα+1 el(b) aα+1 el(a) X l
−
−
w(α + ),
α+1
α+1
α+1
(7)
∈N
where {l } is the finite sequence of coefficients of the polynomial x 7→ l(x) = Σ l x , of degree lx .
Also it is clear that in view of (6) and (7), all v(α) and w(α) are affine functions of v0 , . . . , vkx −1 ,
and w0 , . . . , wlx −1 , respectively.
3
Methodology: Notice that in (1), the quantity ρ to approximate, is a linear combination hg, zi
of moments of µ. Then, to compute upper and lower bounds on ρ, we build up two hierarchies of
semidefinite programs (SDP) Pru and Prl , as follows. Consider the vectors of moments v, w and z, up
to order 2r, r ∈ N.
• The linear constraints of both Pru and Prl are coming from those equations (5), (6), and (7), that
contain only moments of order up to 2r.
• The LMI (linear matrix inequalities) constraints of both Pru and Prl (to be defined later), state
necessary conditions on v, w and z, to be moments of some measures supported on [a, b], and
[a, b] × [c, d], respectively.
Then the SDP Pru (resp. Prl ) maximizes (resp. minimizes) the linear criterion hg, zi under the above
constraints. Both are SDP-relaxations of the original problem (1), and so their respective optimal values
provide upper and lower bounds on ρ. In addition, the quality of both bounds increases with r because
more and more constraints (5), (6), and (7), are taken into account.
We will show in the general case n > 2 that the resulting monotone nonincreasing and nondecreasing
sequences of upper and lower bounds both converge to the desired value ρ.
2.2
General Integrals
We now consider the class of more general multivariate exponential integrals:
Z
g(x)eh(x) dx,
ρ=
(8)
Ω
where x ∈ Rn , g, h ∈ R[x], the ring of real polynomials and Ω ⊂ Rn is a compact set defined as
(1)
(2)
(1)
(2)
Ω = {x ∈ Rn : b1 ≤ x1 ≤ b1 , bi (x[i − 1]) ≤ xi ≤ bi (x[i − 1]) ∀i = 2, . . . , n}
(1)
(2)
where x[i] ∈ Ri is the vector of first i elements of x, i = 1, . . . , n, bi , bi
(1)
(9)
∈ R[x[i − 1]], i = 2, . . . , n, and
(2)
b1 , b1 are constants. Similar to the previous section, we can now analyze the general case as follows.
(1)
Define Ωk = {x ∈ Rk : b1
(2)
(1)
(2)
≤ x1 ≤ b1 , bi (x[i − 1]) ≤ xi ≤ bi (x[i − 1]) ∀i = 2, . . . , k},
k = 1, . . . , n, we have: Ωk ∈ Rk and Ωn = Ω. Clearly, these Ωk sets are also compact in Rk for all
(i)
k = 1, . . . , n. Let us consider the measure µh on Ri defined by
(i)
µh (B) =
Z
eh(x[i]) dx[i],
Ωi ∩B
4
(10)
(i)
(i)
where h ∈ R[x[i]], B ∈ B(Ri ), and its sequence of moments z h = {zh (α)}:
Z
(i)
zh (α) =
(x[i])α eh(x[i]) dx[i],
(11)
Ωi
for all α ∈ Ni .
We then have ρ =
(n)
(n)
gα zh (α) = hg, z h i, where gα is the coefficient of monomial xα and g is
P
α∈Nn
(n)
the sequence of those coefficients. Therefore, what we need to do is to calculate zh (α) for all necessary
α ∈ Ni . Using integration by parts, we have:
(n)
zh (α) =
1
(Aα − Bα ),
αn + 1
where
Z
Aα =
Ωn−1
h
ib(2)
n (x[n−1])
dx[n − 1],
(x[n − 1])α[n−1] xαnn +1 eh(x) (1)
bn (x[n−1])
and
Z
Bα =
(x[n − 1])α[n−1] xαnn +1
Ω
∂h(x) h(x)
e
dx.
∂xn
We have:
X
∂h(x)
=
βn hβ (x[n − 1])β[n−1] xβnn −1 .
∂xn
n
β∈N
Therefore,
Bα =
Z
X
(x[n − 1])α[n−1] xαnn +1 (x[n − 1])β[n−1] xβnn −1 eh(x) dx
βn hβ
Ω
β∈Nn
or
Bα =
Z
X
xα+β eh(x) dx =
βn hβ
Ω
β∈Nn
X
(n)
βn hβ zh (α + β).
β∈Nn
Now consider Aα , we have: Aα = A2α − A1α where
Z
(i)
αn +1 h(x[n−1],bn (x[n−1]))
Aiα =
(x[n − 1])α[n−1] [b(i)
e
dx[n − 1],
n (x[n − 1])]
i = 1, 2.
Ωn−1
Let
(n−1)
gi,α
αn +1
(x) = xα[n−1] [b(i)
,
n (x)]
x ∈ Rn−1 ,
i = 1, 2,
and
(n−1)
hi
(x) = h(x, bn(i) (x)),
x ∈ Rn−1 ,
i = 1, 2.
(n−1)
All of these four functions are polynomials in R[x[n − 1]]. Define two measures µ1
(n−1)
µ2
(n−1)
z1
(n−1)
(n−1)
≡µ
(n−1)
h1
and
≡µ
(n−1)
over Ωn−1 on Rn−1 as defined in (10). With their two respective sequences of moments
≡z
(n−1)
and z 2
h2
(n−1)
h1
(n−1)
Aα =
≡z
(n−1)
(n−1)
h2
X β∈Nn−1
as defined in (11), we have:
(n−1)
g2,α
(n−1)
β
z2
(β) −
X β∈Nn−1
5
(n−1)
g1,α
(n−1)
β
z1
(β).
Thus we have:
(n)
X
zh (α) =
(n−1)
g2,α
αn + 1
β∈Nn−1
β (n−1)
z2
(β)
X
−
(n−1)
g1,α
αn + 1
β∈Nn−1
β (n−1)
z1
(β)
−
X βn hβ (n)
z (α + β). (12)
αn + 1 h
n
β∈N
(n)
(n−1)
Equation (12) shows that in order to calculate z h , we need to calculate z i
, i = 1, 2, which can
(n)
then be calculated by some other moment sequences in lower dimensions. Let denote h1
to be the
function h and define
(k)
(k+1)
(2−di/2e+bi/2c)
hi (x) = hdi/2e (x, bk+1
(k)
(k)
For each function hi , a measure µi
(x)),
x ∈ Rk ,
k = 1, . . . , n − 1,
(k)
and its moments sequence z i
i = 1, . . . , 2n−k .
(13)
are also defined over Ωk in Rk as
(k)
in (10) and (11) respectively. We also need to define the function gi,α . Let define
(k)
(i)
gi,α (x) = xα[k] [bk+1 (x)]αk+1 +1 ,
x ∈ Rk ,
α ∈ Nk ,
i = 1, 2.
(14)
Then we have:
(k)
zi (α)
=
X
β∈Nk−1
(k)
(k)
(k−1)
g2,α
αk + 1
(k)
β (k−1)
z2i (β)
X
−
β∈Nk−1
(k−1)
or zi (α) = Ri (z i , z 2i
(k−1)
g1,α
αk + 1
β (k−1)
z2i−1 (β)
X βk (h(k) )β (k)
i
−
z (α + β) (15)
α
+
1 i
k
k
β∈N
(k−1)
, z 2i−1 , α) for all α ∈ Rk , k = 2, . . . , n, and i = 1, . . . , 2n−k .
For k = 1 and i = 1, . . . , 2n−1 , we have:
(2)
(1)
zi (α) =
(1)
(1)
X β(h )β (1)
(b1 )α+1 h(1) (b(2)
(b )α+1 h(1) (b(1)
(1) (1)
i
e i 1 )− 1
e i 1 )−
z (α + β) = Ri (z i , α).
α+1
α+1
α+1 i
(16)
β∈N
(n)
With the general recursive formula presented in (15) and (16), the sequence of moments z h
(n)
(or z 1 )
can now be calculated.
2.3
Moment Relaxation
(k)
Let consider the measure µi
(k)
and its corresponding sequence of moments z i . For any nonnegative
(k)
(k)
(k)
(k)
integer r, the r-moment matrix associated with µi (or equivalently, with z i ) Mr (µi ) ≡ Mr (z i )
α
is a matrix of size k+r
r . Its rows and columns are indexed in the canonical basis {(x[k]) } of R[x[k]],
and its elements are defined as follows:
(k)
(k)
Mr (z i )(α, β) = zi (α + β),
6
α, β ∈ Nk , |α|, |β| ≤ r.
(17)
(k)
(k)
Similarly, given θ ∈ R[x[k]], the localizing matrix Mr (θz i ) associated with z i
(k)
X
Mr (θz i )(α, β) :=
(k)
and θ is defined by
α, β ∈ Nk , |α|, |β| ≤ r,
θγ zi (α + β + γ),
(18)
γ∈Nk
where θ = {θγ } is the vector of coefficients of θ in the canonical basis {(x[k])α }.
(k)
If we define the matrix Mrγ (z i ) with elements
(k)
(k)
Mrγ (z i )(α, β) = zi (α + β + γ),
α, β, γ ∈ Nk , |α|, |β| ≤ r,
(k)
then the localizing matrix can be expressed as Mr (θz i ) =
(k)
P
γ∈Nk
θγ Mrγ (z i ).
Note that for every polynomial f ∈ R[x[k]] of degree at most r with its vector of coefficients denoted
by f = {fγ }, we have:
(k)
hf , Mr (z i )f i
Z
=
f
2
(k)
dµi ,
(k)
hf , Mr (θz i )f i
(k)
Z
=
(k)
θf 2 dµi .
(k)
(19)
(k)
This property shows that necessarily, Mr (z i ) 0 and Mr (θz i ) 0, whenever µi has its support
contained in the level set x ∈ Rk : θ(x) ≥ 0 . If the sequence of moments is restricted to those moments
(k)
used to construct the moment matrix Mr (z i ) (up to moments of degree 2r), then the second necessary
(k)
condition is reduced to Mr−dd/2e (θz i ) 0, where d is the degree of the polynomial θ. For more details
on moment matrices, local matrices, and these necessary conditions, please refer to Laurent [6] and
references therein.
Let us define
h
ih
i
(k)
(2)
(1)
θi (x) = bi (x[i − 1]) − xi xi − bi (x[i − 1]) ,
x ∈ Rk ,
∀k ≥ i,
(20)
for all i = 2, . . . , n. Similarly, define
h
ih
i
(k)
(2)
(1)
θ1 (x) = b1 − x1 x1 − b1 ,
(k)
For a fixed i (i = 1, . . . , n), the polynomials θi
x ∈ Rk ,
∀k ≥ 1.
(21)
depends on the first i variables x1 , . . . , xi exactly the
same for all k ≥ i. Thus they all have the same degree di for all k ≥ i.
(k)
We also have Ωk = {x ∈ Rk : θi (x) ≥ 0
∀i = 1, . . . , k} for all k = 1, . . . , n. Thus necessary
(k)
conditions for moment matrices of all measures µi
(k)
Mr (z i ) 0,
(k) (k)
Mr−ddj /2e (θj z i ) 0,
can be written as follows:
∀k = 1, . . . , n,
7
∀i = 1, . . . , 2n−k ,
∀j = 1, . . . , k.
(22)
(k)
Combining these necessary conditions and the recursive formulae for z i
in (15) and (16), we can find
lower and upper bounds for ρ by solving the two following semidefinite programming problems:


(n)
inf hg, z 1 i




(k)

 s.t. Mr (z i ) 0
k = 1, . . . , n, i = 1, . . . , 2n−k ,


4 

(k) (k)
n−k
Prl = 
Mr−ddj /2e (θj z i ) 0
k = 1, . . . , n, i = 1, . . . , 2
, j = 1, . . . , k, 




(1)
(1) (1)
(1)
n−1


zi (α) = Ri (z i , α)
i = 1, . . . , 2
, α ∈ Ai,r ,


(k)
(k) (k) (k−1) (k−1)
(k)
n−k
zi (α) = Ri (z i , z 2i , z 2i−1 , α) k = 2, . . . , n, i = 1, . . . , 2
, α ∈ Ai,r ,
(23)
and


(n)
sup hg, z 1 i


(k)
 s.t. Mr (z i ) 0

4 
(k) (k)
Pru = 
Mr−ddj /2e (θj z i ) 0


(1)
(1) (1)

zi (α) = Ri (z i , α)

(k)
(k) (k) (k−1) (k−1)
zi (α) = Ri (z i , z 2i , z 2i−1 , α)
k = 1, . . . , n,
i = 1, . . . , 2n−k ,
k = 1, . . . , n,
i = 1, . . . , 2n−k ,
i = 1, . . . , 2n−1 ,
k = 2, . . . , n,
(1)
α ∈ Ai,r ,
i = 1, . . . , 2n−k ,





j = 1, . . . , k, 




(k)
α ∈ Ai,r ,
(24)
(k)
where Ai,r is the set of all α ∈ Rk that the recursive formulae can be expressed by moments of degree
up to 2r (used to construct the moment matrices Mr ).
Clearly, Z(Prl ) ≤ ρ ≤ Z(Pru ) and we will prove that these lower and upper bounds asymptotically
converge to ρ when r tends to infinity in the next section.
3
Convergence
In order to prove the convergence of Z(Prl ) and Z(Pru ), we need to prove the recursive formulae in (15)
(k)
and (16) define moment sequences for all measures µi .
(k)
Lemma 1 Let hi
(k)
(k)
and gi,α be defined as in (13) and (14) respectively, and let z i
sequences of some Borel measures
(k)
dψi
=
Proof.
(k)
dµi
(k)
ψi
be the moment
on the compact sets Ωk , which satisfy (15) and (16). Then
for all k = 1, . . . , n and i = 1, . . . , 2n−k .
We will prove the lemma by induction. Let consider the case k = 1, we have, according to
(16):
(1)
zi (α)
Z
=
(1)
xα dψi
xα+1 h(1) (x)
e i
=
α+1
8
b(2)
1
Z
−
(1)
b1
xα+1 (1) 0
(1)
(hi ) (x)dψi .
α+1
(25)
We also have:
Z
(1)
xα dµi
xα+1 h(1) (x)
=
e i
α+1
(1)
Let consider the signed measure φi
b(2)
1
Z
−
(1)
b1
xα+1 (1) 0
(1)
(hi ) (x)dµi .
α+1
(1)
on Ω1 that satisfies dφi
(1)
= dψi
(26)
(1)
− dµi , from (25) and (26), we
have:
xα+1 (1) 0
(1)
(h ) (x)dφi .
(27)
α+1 i
P
P
j
Consider the polynomial p(x) = dj=1 fj xj , we have: p0 (x) = d−1
j=0 fj+1 (j + 1)x . From Equation (27),
Z
α
x
(1)
dφi
Z
=−
we obtain the following equation:
Z h
i
(1)
(1)
p0 (x) + p(x)(hi )0 (x) dφi = 0.
(28)
We now prove that Equation (28) is also true for all continuous funtion f (x) = xg(x), where g is continuously differentiable on Ω1 . We have, polynomials are dense in the space of continuously differentiable
functions on Ω1 under the sup-norm max{supx∈Ω1 |f (x)|, supx∈Ω1 |f 0 (x)|} (see e.g. Coatmélec [1] and
Hirsch [3]) Therefore for any > 0, there exist p ∈ R[x] such that supx∈Ω1 |g(x) − p (x)| ≤ and
supx∈Ω1 |g 0 (x) − p0 (x)| ≤ . Equation (28) is true for the polynomial p(x) = xp (x), thus
Z h
Z
Z
i
(1)
(1)
(1)
(1)
(1)
f 0 (x) + f (x)(hi )0 (x) dφi = [x(g(x) − p (x))]0 dφi + x [g(x) − p (x)] (hi )0 (x)dφi .
We have: [x(g(x) − p (x))]0 = (g(x) − p (x)) + x(g 0 (x) − p0 (x)), thus
| [x(g(x) − p (x))]0 | ≤ (1 + sup |x|) ∀x ∈ Ω1 .
x∈Ω1
Similarly,
(1)
(1)
|x [g(x) − p (x)] (hi )0 (x)| ≤ sup |x(hi )0 (x)| ∀x ∈ Ω1 .
x∈Ω1
So we have:
|
Z h
Z
i
(1)
(1)
(1)
(1)
f 0 (x) + f (x)(hi )0 (x) dφi | ≤ (1 + sup |x| + sup |x(hi )0 (x)|) |dφi |.
x∈Ω1
(1)
x∈Ω1
(1)
The constant term M = (1+supx∈Ω1 |x|+supx∈Ω1 |x(hi )0 (x)|) |dφi | is finite and the above equation
R
is true for all > 0, thus we have for all f (x) = xg(x):
Z h
i
(1)
(1)
f 0 (x) + f (x)(hi )0 (x) dφi = 0.
For an arbitrary polynomial g(x) =
(1)
−hi (x)
Let consider f (x) = G(x)e
(1)
g(x)e−hi
(x)
Pd
gj j+1
j
, we have G0 (x)
j=0 gj x , define G(x) =
j=0 j+1 x
have: f (x)
x is a continuously differentiable function and
Pd
, we
(29)
= g(x).
f 0 (x) =
(1)
− f (x)(hi )0 (x). Using Equation (29), we obtain the following equation
Z
(1)
(1)
g(x)e−hi (x) dφi = 0, ∀g ∈ R[x].
9
(30)
(1)
If we define dνi
(1)
= e−hi
(x) dφ(1) ,
i
R
we then have
(1)
f (x)dνi
= 0 for all continuous function f in Ω1 since
(1)
polynomials are dense in the space of continuous functions. This implies that νi
(1)
−hi (x)
In addition, e
> 0 for all x ∈ R, thus
(1)
φi
is also a zero measure or
is a zero measure.
(1)
dψi
(1)
= dµi
for all
i = 1, . . . , 2n−1 .
(k−1)
Now assume that dψi
(k)
(k)
dψi
(k−1)
, 2 ≤ k ≤ n, for all i = 1, . . . , 2n−k+1 . We will prove that
= dµi
(k−1)
for all i = 1, . . . , 2n−k . We have: dψ2i
= dµi
(k−1)
2n−k . Thus z 2i
(k−1)
= dµ2i
(k−1)
(k−1)
and dψ2i−1 = dµ2i−1 for i : 1 ≤ i ≤
(k−1)
(k−1)
(k−1)
and z 2i−1 are moment sequences of two measures µ2i
and µ2i−1 respectively.
According to (15), we have:
(k−1)
(k−1)
Z
Z
g2,α
g1,α
(k)
X
X
xα xk ∂hi (x) (k)
β (k−1)
β (k−1)
(k)
α
x dψi =
z2i (β) −
z2i−1 (β) −
dψi . (31)
αk + 1
αk + 1
αk + 1 ∂xk
k−1
k−1
β∈N
β∈N
Similarly,
Z
α
x
(k)
dµi
=
X
β∈Nk−1
(k−1)
g2,α
αk + 1
β (k−1)
z2i (β) −
X
β∈Nk−1
(k)
Then if we consider the signed measure φi
Z
(k)
xα dφi
(k−1)
g1,α
αk + 1
β (k−1)
z2i−1 (β) −
(k)
on Ωk that satisfies dφi
Z
=−
(k)
xα xk ∂hi (x) (k)
dµi . (32)
αk + 1 ∂xk
Z
(k)
= dψi
(k)
− dµi , we have:
(k)
xα xk ∂hi (x) (k)
dφi .
αk + 1 ∂xk
(33)
Using similar arguments with simultaneous approximation of continuous functions, we obtain the following equation for all functions f (x) = xk g(x), x ∈ Rk , where g and
Z "
∂g
∂xk
are continuous:
#
(k)
∂hi (x)
∂f (x)
(k)
+ f (x)
dφi = 0.
∂xk
∂xk
For any polynomial g ∈ R[x], define G(x) = xk P (x), where P (x) ∈ R[x] and
(k)
−hi (x)
f (x) = G(x)e
, we again obtain
Z
(k)
(k)
g(x)e−hi (x) dφi = 0,
(34)
∂G(x)
∂xk
= g(x), then with
∀g ∈ R[x].
(k)
Similar arguments are again used and we have the measure φi
(35)
(k)
is a zero measure or dψi
all i = 1, . . . , 2n−k . So the results are true for all k = 1, . . . , n.
(k)
= dµi
for
With this lemma, we can now prove the following convergence theorem:
Theorem 1 Let g, h ∈ R[x], Ω be the compact set defined in (9), and consider the semidefinite programming problems Prl and Pru defined in (23) and (24) respectively. Then
10
(i) Optimal values Z(Prl ) and Z(Pru ) are finite and in addition, both problems Prl and Pru are solvable
for r large enough.
(ii) As r → ∞, Z(Prl ) ↑ ρ and Z(Pru ) ↓ ρ.
(k)
Proof. Clearly, the collection of truncated sequences of moments of µi
problems Prl and Pru . We have:
Z
(k)
|zi (α)| ≤
(k)
| (x[k])α ehi
(x[k])
(k)
|dx[k] ≤ sup |xα ehi
is a feasible solution for both
(x)
|vol(Ωk ).
x∈Ωk
Ωk
(k)
(k)
Ωk is compact in Rk ; therefore, we have ui (α) = supx∈Ωk |xα ehi
(x) |
(k)
= maxx∈Ωk |xα ehi
(k)
(x) |
is finite.
(k)
Now consider the problems Prl and Pru with additional bound constraints −ui (α) − 1 ≤ zi (α) ≤
(k)
ui (α) + 1 for all k, i and α, we have:
(i) The feasible sets of these two modified problems are bounded and closed. The objective functions
are linear and both problems are feasible. Therefore, they are both solvable and their optimal
values are finite.
(k)
(ii) Let {z i,r } be the optimal solution of Prl (with bound constraints), we extend these truncated
sequences with zeros to make them become infinite sequences. According to Lasserre [5], there
(k)
is a subsequence {rm } and infinite sequences {z i,∗ } such that pointwise convergence holds with
respect to the usual sup-norm. Therefore, we have:
(k)
M (z i,∗ ) 0,
(k) (k)
M (θj z i,∗ ) 0,
∀k, i, j.
We also have Ωk is compact in Rk ; therefore, according to Putinar [8], there exist measures
(k)
ψi
(k)
supported on Ωk such that z i,∗ are their moment sequences assuming the representation
(k)
condition holds. In addition, {z i,∗ } satisfy the recursive formulae (15) and (16) from the pointwise
(k)
convergence. From Lemma 1, we obtain that dψi
(k)
= dµi
for all k = 1, . . . , n and i = 1, . . . , 2n−k .
Thus we have:
(n)
(n)
lim hg, z 1,rm i = hg, z 1,∗ i = ρ.
m→∞
Due to the construction of truncated moment matrices and localizing matrices as well as the sets
(k)
l
Ai,r , clearly, a feasible solution of Pr+1
generates a feasible solution of Prl . Thus with r ≥ deg(g),
(n)
(n)
then hg, z 1,r i ≤ hg, z 1,r+1 i or we have:
(n)
hg, z 1,r i ↑ ρ.
Similar arguments can be applied for the modified problem Pru .
11
So far, the results obtained are for problems Prl and Pru with additional bound constraints. However,
at the limit with respect to the usual sup-norm, none of these bound constraints are tight. Thus, these
constraints can be removed when r is large enough. In other words, the problems Prl and Pru have finite
optimal solutions and solvable when r is large enough and
Z(Prl ) ↑ ρ,
Z(Pru ) ↓ ρ.
4
Computational Results
In order to demonstrate the method, we use two different kinds of Ω sets: hypercubes and order statistic
integrals. The computational results are obtained for integrals over these sets of random multivariate
normal distribution.
4.1
Integrals over Hypercubes
(i)
(i)
If Ω is a hypercube, then bk (x) = bk are constants for all k = 2, . . . , n. We have:
(k)
(i)
gi,α (x) = [bk+1 ]αk+1 +1 xα[k] ,
(k)
The functions hi
(k)
x ∈ Rk ,
α ∈ Nk ,
i = 1, 2.
(36)
are still formulated as in (13):
(k+1)
(2−di/2e+bi/2c)
hi (x) = hdi/2e (x, bk+1
),
x ∈ Rk ,
k = 1, . . . , n − 1,
i = 1, . . . , 2n−k .
The recursive formula simply becomes
(k)
zi (α) =
(2)
(1)
X βk (h(k) )β (k)
[bk ]αk +1 (k−1)
[b ]αk +1 (k−1)
i
z2i (α[k − 1]) − k
z2i−1 (α[k − 1]) −
z (α + β)
αk + 1
αk + 1
α
+
1 i
k
k
(37)
β∈N
for all k = 2, . . . , n, α ∈ Rk , i = 1, . . . , 2n−k while for k = 1, it remains the same as in (16):
(2)
(1)
zi (α) =
(1)
(1)
X β(h )β (1)
(b1 )α+1 h(1) (b(2)
(b )α+1 h(1) (b(1)
i
e i 1 )− 1
e i 1 )−
z (α + β),
α+1
α+1
α+1 i
∀i = 1, . . . , 2n−1 .
β∈N
(k)
h
(2)
The polynomials θi (x) = bi − xi
ih
(1)
xi − bi
i
degree di = 2 for all k ≥ i, i = 1, . . . , n in this case.
12
(1) (2)
= −bi bi
(1)
+ (bi
(2)
+ bi )xi − x2i , x ∈ Rk have the
4.2
Order Statistic Integrals
Order statistic integrals are calculated over the set Ω = {x ∈ Rn : b(1) ≤ x1 ≤ . . . ≤ xn ≤ b(2) }. So we
(2)
(1)
(1)
have: bk = b(2) are constant for all k = 1, . . . , n while bk (x) = xk−1 for all k = 2, . . . , n and b1 = b(1) .
(k)
(k)
The functions g2,α are defined as in (36) while g1,α are formulated as follows
(k)
α
g1,α (x) = xk k+1
+1 α[k]
x
= xα[k]+(αk+1 +1)ek ,
x ∈ Rk ,
α ∈ Nk ,
(38)
where ek is the k th unit vector in Rk .
The recursive formula is then
(k)
zi (α)
X βk (h(k) )β (k)
[b(2) ]αk +1 (k−1)
1
(k−1)
i
=
z
(α[k − 1]) −
z
(α[k − 1] + (αk + 1)ek ) −
z (α + β)
αk + 1 2i
αk + 1 2i−1
α
+
1 i
k
k
β∈N
(39)
for all k = 2, . . . , n, α ∈ Rk , i = 1, . . . , 2n−k . For k = 1, the formula remains the same as in (16).
(k)
Finally, the polynomials θi (x) = b(2) − xi [xi − xi−1 ] = −b(2) xi−1 + b(2) xi + xi−1 xi − x2i , x ∈ Rk
(k)
also have degree di = 2 for all k ≥ i, i = 2, . . . , n. For i = 1, the polynomial θ1 (x) is the same as in
the previous section with degree d1 = 2.
4.3
Computational Results for Normal Distributions
Multivariate normal distributions are used to obtain computational results for integrals over hypercubes
and order statistic integrals. The density function of a multivariate normal distribution with mean µ
and covariance matrix Σ = AA0 is
f (x) =
1
− 12 (x−µ)0 Σ−1 (x−µ)
e
.
(2π)n/2 det(Σ)1/2
(40)
−1
Thus, in this case, g(x) = (2π)n/2 det(Σ)1/2
, a constant, and h(x) = − 21 (x − µ)0 Σ−1 (x − µ), a
quadratic polynomials.
Algorithms for general g and h are implemented in Matlab for both hypercubic and order statistic
integrals. Semidefinite programming problems are solved using SeDuMi routines. For the case of
normal distributions, means and covariance matrices are created randomly using random vectors and
matrices with elements within the range of [−1, 1]. The two Ω sets are defined on the hypercube
{x ∈ Rn : −1 ≤ xi ≤ 1 ∀i = 1, . . . , n}.
The algorithms are run for different normal distributions in n = 1, 2, and 3-dimension space.
The moment degree starts at r = 2 and increases until the tolerance is met. The tolerance =
13
1
2
Z(Pru ) − Z(Prl ) is set to be 5 × 10−5 (4-digit accuracy). The results will be averaged from those
different distributions in each case. All computations are done under a Windows environment on a
Pentium III 1GHz with 256MB RAM.
Dimension
Min. degree
Max. degree
Average degree
Computational time (seconds)
n
rmin
rmax
r̄
r=2
r=3
r=4
r=5
1
2
5
3.5
1.332
1.631
1.505
1.437
2
3
5
4.3
1.994
3.215
11.036
70.107
Table 1: Computational results for hypercubic integrals
Dimension
Min. degree
Max. degree
Average degree
Computational time (seconds)
n
rmin
rmax
r̄
r=2
r=3
r=4
r=5
1
2
5
3.5
1.332
1.631
1.505
1.437
2
3
5
4.3
2.079
3.413
12.294
79.418
Table 2: Computational results for order statistic integrals
Table 1 and 2 show computational results for hypercubic and order statistic integrals respectively.
Minimum degree rmin is the smallest moment degree we need to use to meet the tolerance setting while
rmax is the largest moment degree. r̄ is the average value obtained from all distributions tested for each
value of n. In the case n = 1, there is not much difference between computational times for different
moment degrees and we need roughly r = 3 or r = 4 on average to obtain the desired results. These
observations are for both hypercubic and order statistic integrals even though the latter need slightly
more time to be calculated. For n = 2, the average moment degree is r̄ = 4.3 and computational times
are significantly varied from 2 seconds for r = 2 up to approximately 80 seconds for r = 5. Due to the
large amount of memory needed for SeDuMi routine, we cannot run the algorithms with r = 5 for the
case n = 3 to meet the 4-digit accuracy requirement. Therefore, no result on required moment degrees
for this accuracy setting is reported here for n = 3. However, we will report the real error gaps for this
case up to r = 4 in Table 3. With respect to the computational time, it is not very favorable in the
case n = 3 as we need more than 2000 seconds when r = 4 (as compared to approximately 5 and 100
seconds when r = 2 and r = 3 respectively).
Table 3 shows the real gaps between lower and upper bounds of integral values for some specific
14
Integral
Hypercubic
Order statistic
Error
1
2
Z(Pru ) − Z(Prl )
Dimension
Value
n
ρ̄
r=2
r=3
1
0.2665
1.4047E-04
8.1177E-07
2
0.1085
1.0376E+01
3
0.0736
2
3
r=4
r=5
1.1200E-02
6.7765E-04
4.7855E-05
4.7420E-01
8.0400E-02
2.2000E-03
0.0670
7.8969E+00
6.3000E-03
4.2279E-04
0.0274
4.9390E-01
5.4000E-03
4.2253E-04
3.2933E-05
Table 3: Real errors for specific normal distributions
normal distributions, both hypercubic and order statistic integrals. These values decrease when the
moment order increases. For example, when n = 3, the gap between lower and upper bounds for
r = 4 is already relatively small even though the accuracy requirement has not been satisfied in this
case. The computational times for large n and r are yet very promising; however, we believe that this
method is plausible to be implemented. One may also use the recursive formulae presented in Section
2, to eliminate some moments variables and substitute their expression in the remaining variables in
the LMIs of the SDP-relaxations. This would substantially reduce the total number of variables. In
addition, the framework we proposed could be further developed to calculate integrations not only with
polynomials but also rational functions, which have been recently analyzed by Lasserre [4].
References
[1] C. Coatmélec. Approximation et interpolation des fonctions différentiables de plusieurs variables. Annales
scientifiques de l’É.N.S 3e série, 83(4):271–341, 1966.
[2] P. de la Harpe and C. Pache. Cubature formulas, geometrical designs, reproducing kernels, and Markov
operators. In International Conference on Group Theory, Gaeta, June 2003.
[3] M. W. Hirsch. Differential topology. Springer-Verlag, New York, 1976.
[4] J. B. Lasserre. The K-moment problem with densities. Technical Report 05334, LAAS, June 2005.
[5] J. B. Lasserre. A sum of squares approximation of nonnegative polynomials. SIAM Journal on Optimization,
16(2):610–628, 2005.
[6] M. Laurent. Moment matrices and optimization over polynomials - a survey on selected topics. Preprint,
September 2005.
[7] H. Niederreiter. Random number generator and quasi-Monte Carlo method. SIAM CBMS-NSF Regional
Conference Series in Applied Mathematics, 63, 1992.
[8] M. Putinar. Positive polynomials on compact semi-algebraic sets. Indiana University Mathematics Journal,
42:969–984, 1993.
15
Download