Citation: McEneaney, William M. and L. Jonathan Kluberg. "Convergence Rate for a Curse-of-Dimensionality-Free Method for a Class of HJB PDEs." SIAM J. Control Optim., Vol. 48, No. 5, pp. 3052-3079 (2009). ©2009 Society for Industrial and Applied Mathematics. DOI: 10.1137/070681934.

CONVERGENCE RATE FOR A CURSE-OF-DIMENSIONALITY-FREE METHOD FOR A CLASS OF HJB PDES*

WILLIAM M. MCENEANEY† AND L. JONATHAN KLUBERG‡

Abstract. In previous work of the first author and others, max-plus methods have been explored for the solution of first-order, nonlinear Hamilton–Jacobi–Bellman partial differential equations (HJB PDEs) and corresponding nonlinear control problems. Although max-plus basis-expansion and max-plus finite-element methods can provide substantial computational-speed advantages, they still generally suffer from the curse-of-dimensionality. Here we consider HJB PDEs where the Hamiltonian takes the form of a (pointwise) maximum of linear/quadratic forms. The approach to solution will be rather general, but in order to ground the work, we consider only constituent Hamiltonians corresponding to long-run average-cost-per-unit-time optimal control problems for the development. We consider a previously obtained numerical method not subject to the curse-of-dimensionality.
The method is based on construction of the dual-space semigroup corresponding to the HJB PDE. This dual-space semigroup is constructed from the dual-space semigroups corresponding to the constituent linear/quadratic Hamiltonians. The dual-space semigroup is particularly useful due to its form as a max-plus integral operator with kernel obtained from the originating semigroup. One considers repeated application of the dual-space semigroup to obtain the solution. Although previous work indicated that the method was not subject to the curse-of-dimensionality, it did not indicate any error bounds or convergence rate. Here we obtain specific error bounds.

Key words. partial differential equations, curse-of-dimensionality, dynamic programming, max-plus algebra, Legendre transform, Fenchel transform, semiconvexity, Hamilton–Jacobi–Bellman equations, idempotent analysis

AMS subject classifications. 49LXX, 93C10, 35B37, 35F20, 65N99, 47D99

DOI. 10.1137/070681934

1. Introduction. A robust approach to the solution of nonlinear control problems is through the general method of dynamic programming. For the typical class of problems in continuous time and continuous space, with the dynamics governed by finite-dimensional ordinary differential equations, this leads to a representation of the problem as a first-order, nonlinear partial differential equation—the Hamilton–Jacobi–Bellman (HJB) equation or the HJB PDE. If one has an infinite time-horizon problem, then the HJB PDE is a steady-state equation, and this PDE is over a space (or some subset thereof) whose dimension is the dimension of the state variable of the control problem. Due to the nonlinearity, the solutions are generally nonsmooth, and one must use the theory of viscosity solutions [4], [10], [11], [12], [20].

The most intuitive class of approaches to solution of the HJB PDE consists of grid-based methods (cf. [4], [7], [5], [14], [40], [16], [20], [24] among many others).
These require that one generate a grid over some bounded region of the state space. In particular, suppose the region over which one constructs the grid is rectangular, say, square for simplicity. Further, suppose one uses $N$ grid points per dimension.

*Received by the editors February 5, 2007; accepted for publication (in revised form) July 28, 2009; published electronically December 11, 2009. http://www.siam.org/journals/sicon/48-5/68193.html
†Department of Mechanical and Aerospace Engineering, University of California San Diego, La Jolla, CA 92093-041 (wmceneaney@ucsd.edu). This author's research was partially supported by NSF grant DMS-0307229 and AFOSR grant FA9550-06-1-0238.
‡Operations Research Center, Massachusetts Institute of Technology, 77 Massachusetts Ave., Cambridge, MA 02139 (jonathan.kluberg@gmail.com).

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

If the state dimension is $n$, then one has $N^n$ grid points. Thus, the computations grow exponentially in the state-space dimension $n$, and this is referred to as the curse-of-dimensionality. In [28], [29], [30], [31], a new class of methods for first-order HJB PDEs was introduced, and these methods are not subject to the curse-of-dimensionality. A different class of methods which also utilize the max-plus algebra are those which expand the solution over a max-plus basis and solve for the coefficients in the expansion via max-plus linear algebra (cf. [1], [2], [28], [34], [35]). Although this new approach bears a superficial resemblance to these other methods in that it utilizes the max-plus algebra, it is largely unrelated. Most notably, with this new approach, the computational growth in state-space dimension is on the order of $n^3$. There is of course no "free lunch," and there is exponential computational growth in a certain measure of complexity of the Hamiltonian.
Under this measure, the minimal-complexity Hamiltonian is the linear/quadratic Hamiltonian—corresponding to solution by a Riccati equation. If the Hamiltonian is given as a pointwise maximum or minimum of $M$ linear/quadratic Hamiltonians, then one could say the complexity of the Hamiltonian is $M$. One could also apply this approach to a wider class of HJB PDEs with semiconvex Hamiltonians (by approximation of the Hamiltonian by a finite number of quadratic forms), but that is certainly beyond the scope of this paper.

We will be concerned here with HJB PDEs of the form $0 = -H(x, \nabla V)$, where the Hamiltonians are given or approximated as

(1)    $H(x, \nabla V) = \max_{m \in \mathcal{M}} \{ H^m(x, \nabla V) \},$

where $\mathcal{M} = \{1, 2, \dots, M\}$ and the $H^m$'s have computationally simpler forms. In order to make the problem tractable, we will concentrate on a single class of HJB PDEs—those for long-run average-cost-per-unit-time problems. However, the theory can clearly be expanded to a much larger class.

In [28], [29], a curse-of-dimensionality-free algorithm was developed in the case where each constituent $H^m$ was a quadratic function of its arguments. In particular, we had

(2)    $H^m(x, p) = (A^m x)' p + \tfrac{1}{2} x' D^m x + \tfrac{1}{2} p' \Sigma^m p,$

where $A^m$, $D^m$, and $\Sigma^m$ were $n \times n$ matrices meeting certain conditions which guaranteed existence and uniqueness of a solution within a certain class of functions. First, some existence and uniqueness results were reviewed, and an approximately equivalent form as a fixed point of an operator $S_\tau = \bigoplus_{m \in \mathcal{M}} S^m_\tau$ (where $\oplus$ and $\otimes$ indicate max-plus addition/summation and multiplication, respectively) was discussed. The analysis leading to the curse-of-dimensionality-free algorithm was developed. We briefly indicate the main points here, and more details appear in sections 2 and 3.
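The max-plus operations $\oplus$, $\otimes$ and the pointwise-maximum structure of (1)-(2) can be illustrated with a small scalar ($n = 1$) sketch; the numerical values in the usage below are hypothetical and not tied to any particular control problem.

```python
# Max-plus "addition" and "multiplication": a (+) b = max(a, b), a (x) b = a + b.
NEG_INF = float("-inf")  # the max-plus additive identity

def oplus(a, b):
    """Max-plus sum."""
    return max(a, b)

def otimes(a, b):
    """Max-plus product."""
    return a + b

def hamiltonian(x, p, params):
    """Scalar version of (1)-(2): H(x, p) = max_m H^m(x, p), where each
    constituent is H^m(x, p) = A_m*x*p + 0.5*D_m*x**2 + 0.5*Sigma_m*p**2.
    `params` is a list of (A_m, D_m, Sigma_m) triples (illustrative only)."""
    h = NEG_INF
    for A, D, Sig in params:
        h = oplus(h, A * x * p + 0.5 * D * x * x + 0.5 * Sig * p * p)
    return h
```

For example, with two hypothetical constituents $(A, D, \Sigma) = (-1, 1, \tfrac12)$ and $(-2, 2, \tfrac14)$, one has $H(1, 1) = \max\{-0.25, -0.875\} = -0.25$.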
In one sense, the curse-of-dimensionality-free method computes the solution of $0 = -H(x, \nabla V)$ with Hamiltonian (1) (and boundary condition $V(0) = 0$) through repeated application of $S_\tau$ to some initial function $V^0$, say, $N$ times, yielding the approximation $S_T V^0 = [S_\tau]^N V^0$, where $T = N\tau$ and the superscript $N$ indicates repeated composition. However, the operations are all carried out in the semiconvex-dual space (where semiconvex duality is defined for the reader in section 3.2). Suppose $V^1 = S_\tau V^0$ and that $V^0$ has semiconvex dual $a^0$. Then one may propagate instead in the dual space with $a^1 = \mathcal{B}_\tau a^0$ and recover $V^1$ as the inverse semiconvex dual of $a^1$.

It is natural to propagate in the dual space because one automatically obtains $\mathcal{B}_\tau$ as a max-plus integral operator, that is,

$\mathcal{B}_\tau[a^0](z) = \int^{\oplus}_{\mathbb{R}^n} B_\tau(z, y) \otimes a^0(y)\, dy$

(where the definitions are given below). Importantly, one has $B_\tau(z, y) = \bigoplus_{m \in \mathcal{M}} B^m_\tau(z, y)$, where the $B^m_\tau$'s are dual to the $S^m_\tau$'s. The key to the algorithm is that when the $H^m$'s are quadratic as in (2), then the $B^m_\tau$'s are quadratic functions. If one takes $a^0$ to be a quadratic function, then $a^1(z) = \mathcal{B}_\tau[a^0](z) = \bigoplus_{m \in \mathcal{M}} \int^{\oplus}_{\mathbb{R}^n} B^m_\tau(z, y) \otimes a^0(y)\, dy = \bigoplus_{m \in \mathcal{M}} \hat a^1_m(z)$, where, as the max-plus integral is a supremum operation, the $\hat a^1_m$'s are obtained analytically (modulo a matrix inverse) as maxima of sums of quadratics $B^m_\tau(z, y) + a^0(y)$. At the second step, one obtains $a^2(z) = \bigoplus_{\{m_1, m_2\} \in \mathcal{M}^2} \hat a^2_{m_1, m_2}(z)$, where the $\hat a^2_{m_1, m_2}(z)$'s are again obtained analytically. Thus, the computational growth in space dimension is only cubic (due to the matrix inverse). The rapid growth in the cardinality of the set of $\hat a^k_{\{m_j\}}$ is what we refer to as the curse-of-complexity, and we briefly discuss pruning as a means for complexity attenuation as well (although this is not the focus of this paper).
The method allows us to solve HJB PDEs that would otherwise be intractable. A simple example over $\mathbb{R}^6$ appears in [27]. In [28], [29], the algorithm was explicated. Although it was clear that the method converged to the solution and that the computational-cost growth was only at a rate proportional to $n^3$, no convergence-rate or error analysis was performed. In this paper, we obtain error bounds as a function of the number of iterations. In particular, two parameters define the convergence: the first, $T$, is the time-horizon, which goes to infinity; the second, $\tau$, is the time-step size, which goes to zero. Of course, the number of iterations is $T/\tau$. In sections 5 and 8, we indicate an error bound as a function of these two parameters.

2. Problem class. There are certain conditions which must be satisfied for solutions to exist and to be unique within an appropriate class, and for the method to converge to the solution. In order that the assumptions not be completely abstract, we work with a specific problem class—the long-run average-cost-per-unit-time optimal control problem. This is a problem class for which a great many results already exist, and so less analysis is required. More specifically, we are interested in solving HJB PDEs of the form (1) and, of course equivalently, the corresponding control problems. We refer to the $H^m$'s in (1) as the constituent Hamiltonians. As indicated above, we suppose the individual constituent $H^m$'s are quadratic forms. These constituent Hamiltonians have corresponding HJB PDE problems which take the form

(3)    $0 = -H^m(x, \nabla V), \quad V(0) = 0.$

As the constituent Hamiltonians are given by (2), they are associated, at least formally, with the following (purely quadratic) control problems. In particular, the dynamics take the form

(4)    $\dot\xi^m = A^m \xi^m + \sigma^m w, \quad \xi^m_0 = x \in \mathbb{R}^n,$

where the nature of $\sigma^m$ is specified just below.
Let $w \in \mathcal{W} \doteq L_2^{loc}([0,\infty); \mathbb{R}^k)$, and we recall that $L_2^{loc}([0,\infty); \mathbb{R}^k) = \{w : [0,\infty) \to \mathbb{R}^k : \int_0^T |w_t|^2\, dt < \infty\ \forall T < \infty\}$. The cost functionals in the problems are

(5)    $J^m(x, T; w) \doteq \int_0^T \tfrac{1}{2}(\xi^m_t)' D^m \xi^m_t - \tfrac{\gamma^2}{2}|w_t|^2\, dt,$

where we use $|\cdot|$ to indicate vector and induced matrix norms. The value functions are $V^m(x) = \lim_{T\to\infty} \sup_{w\in\mathcal{W}} J^m(x, T; w)$. Lastly, $\sigma^m$ and $\gamma$ are such that $\Sigma^m = \frac{1}{\gamma^2}\sigma^m(\sigma^m)'$. We remark that a generalization of the second term in the integrand of the cost functional to $\frac{1}{2}w' C^m w$ with $C^m$ symmetric, positive definite is not needed, since this is equivalent to a change in $\sigma^m$ in the dynamics (4).

Obviously, $J^m$ and $V^m$ require some assumptions in order to guarantee their existence. The assumptions will hold throughout the paper. Since these assumptions only appear together, we will refer to this entire set of assumptions as Assumption Block (A.m), which is as follows:

(A.m)  Assume that there exists $c_A \in (0,\infty)$ such that $x' A^m x \le -c_A |x|^2$ for all $x \in \mathbb{R}^n$, $m \in \mathcal{M}$. Assume that there exists $c_\sigma < \infty$ such that $|\sigma^m| \le c_\sigma$ for all $m \in \mathcal{M}$. Assume that all $D^m$ are positive definite and symmetric, and let $c_D$ be such that $x' D^m x \le c_D |x|^2$ for all $x \in \mathbb{R}^n$, $m \in \mathcal{M}$ (which is obviously equivalent to all eigenvalues of the $D^m$ being no greater than $c_D$). Lastly, assume that $\gamma^2 / c_\sigma^2 > c_D / c_A^2$.

We note here that we will take $\sigma^m \doteq \sigma$ (independent of $m$) in the error estimates.

3. Review of the basic concepts. The theory in support of the algorithm can be found in [28], [29] (without error bounds). We summarize it here.

3.1. Solutions and semigroups. First, we indicate the associated semigroups and some existence and uniqueness results. Assumption Block (A.m) guarantees the existence of the $V^m$'s as locally bounded functions which are zero at the origin (cf. [36]).
The corresponding HJB PDEs are

(6)    $0 = -H^m(x, \nabla V) = -\left[ \tfrac{1}{2} x' D^m x + (A^m x)' \nabla V + \max_{w \in \mathbb{R}^k} \left\{ (\sigma^m w)' \nabla V - \tfrac{\gamma^2}{2}|w|^2 \right\} \right] = -\left[ \tfrac{1}{2} x' D^m x + (A^m x)' \nabla V + \tfrac{1}{2} \nabla V' \Sigma^m \nabla V \right], \quad V(0) = 0.$

Let $\mathbb{R}^- \doteq \mathbb{R} \cup \{-\infty\}$. Recall that a function $\phi : \mathbb{R}^n \to \mathbb{R}^-$ is semiconvex if, given any $R \in (0,\infty)$, there exists $k_R \in \mathbb{R}$ such that $\phi(x) + \frac{k_R}{2}|x|^2$ is convex over $\overline{B}_R(0) = \{x \in \mathbb{R}^n : |x| \le R\}$. For a fixed choice of $c_A, c_\sigma, \gamma > 0$ satisfying the above assumptions, and for any $\delta \in (0, \gamma)$, we define

$\mathcal{G}_\delta = \left\{ V : \mathbb{R}^n \to [0,\infty) \mid V \text{ is semiconvex and } 0 \le V(x) \le \frac{c_A(\gamma-\delta)^2}{2c_\sigma^2}|x|^2\ \forall x \in \mathbb{R}^n \right\}.$

From the structure of the running cost and dynamics, it is easy to see (cf. [41], [36]) that each $V^m$ satisfies

(7)    $V^m(x) = \sup_{T<\infty} \sup_{w\in\mathcal{W}} J^m(x, T; w) = \lim_{T\to\infty} \sup_{w\in\mathcal{W}} J^m(x, T; w) \doteq \lim_{T\to\infty} V^{m,f}(x, T)$

and that each $V^{m,f}$ is the unique continuous viscosity solution of (cf. [4], [20])

$0 = V_T - H^m(x, \nabla V), \quad V(x, 0) = 0.$

It is easy to see that these solutions have the form $V^{m,f}(x,t) = \frac{1}{2} x' P^{m,f}_t x$, where each $P^{m,f}$ satisfies the differential Riccati equation

(8)    $\dot P^{m,f} = (A^m)' P^{m,f} + P^{m,f} A^m + D^m + P^{m,f} \Sigma^m P^{m,f}, \quad P^{m,f}_0 = 0.$

By (7) and (8), the $V^m$'s take the form $V^m(x) = \frac{1}{2} x' P^m x$, where $P^m = \lim_{t\to\infty} P^{m,f}_t$. One can show that the $P^m$'s are the smallest symmetric, positive definite solutions of their corresponding algebraic Riccati equations.

The method we will use to obtain the value functions/HJB PDE solutions of the $0 = -H(x, \nabla V)$ problems is through the associated semigroups. For each $m$, define the semigroup

(9)    $S^m_T[\phi] = \sup_{w\in\mathcal{W}} \left\{ \int_0^T \tfrac{1}{2}(\xi^m_t)' D^m \xi^m_t - \tfrac{\gamma^2}{2}|w_t|^2\, dt + \phi(\xi^m_T) \right\},$

where $\xi^m$ satisfies (4). By [36], the domain of $S^m_T$ includes $\mathcal{G}_\delta$ for all $\delta > 0$. It has also been shown that $V^m$ is the unique solution in $\mathcal{G}_\delta$ of $V = S^m_T[V]$ for all $T > 0$ if $\delta > 0$ is sufficiently small, and that $V^{m,f}(x, t+T) = S^m_T[V^{m,f}(\cdot, t)](x)$.
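In the scalar case ($n = 1$), the limit $P^m = \lim_{t\to\infty} P^{m,f}_t$ of (8) can be computed by simple time-marching; the following is a minimal sketch, with forward Euler standing in for whatever integrator one prefers, and with hypothetical coefficient values in the usage note below.

```python
def riccati_limit(A, D, Sigma, dt=1e-3, T=20.0):
    """Integrate the scalar (n = 1) differential Riccati equation (8),
    Pdot = 2*A*P + D + Sigma*P**2 with P(0) = 0, out to a long horizon T,
    approximating P = lim_{t->inf} P_t.  Forward-Euler sketch; assumes the
    stability condition x'Ax <= -c_A*|x|^2 (here simply A < 0) so that the
    flow settles at the stabilizing root of the algebraic Riccati equation."""
    P = 0.0
    for _ in range(int(T / dt)):
        P += dt * (2.0 * A * P + D + Sigma * P * P)
    return P
```

For instance, with $A = -1$, $D = 1$, $\Sigma = \tfrac12$, the algebraic Riccati equation $2AP + D + \Sigma P^2 = 0$ has smallest positive root $2 - \sqrt{2} \approx 0.5858$, which the time-marching recovers.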
Recall that the HJB PDE problem of interest is

(10)    $0 = -H(x, \nabla V) \doteq -\max_{m\in\mathcal{M}} H^m(x, \nabla V), \quad V(0) = 0.$

Below, we show that the corresponding value function is

(11)    $V(x) = \sup_{w\in\mathcal{W}} \sup_{\mu\in\mathcal{D}_\infty} J(x; w, \mu) \doteq \sup_{w\in\mathcal{W}} \sup_{\mu\in\mathcal{D}_\infty} \sup_{T<\infty} \int_0^T l^{\mu_t}(\xi_t) - \tfrac{\gamma^2}{2}|w_t|^2\, dt,$

where $l^{\mu_t}(x) = \frac{1}{2} x' D^{\mu_t} x$, $\mathcal{D}_\infty = \{\mu : [0,\infty) \to \mathcal{M} : \text{measurable}\}$, and $\xi$ satisfies

(12)    $\dot\xi = A^{\mu_t}\xi + \sigma^{\mu_t} w_t, \quad \xi_0 = x.$

The computational complexity, which will be discussed fully in section 4, arises from the switching control $\mu$ in (11). Define the semigroup

(13)    $S_T[\phi] = \sup_{w\in\mathcal{W}} \sup_{\mu\in\mathcal{D}_T} \left\{ \int_0^T l^{\mu_t}(\xi_t) - \tfrac{\gamma^2}{2}|w_t|^2\, dt + \phi(\xi_T) \right\},$

where $\mathcal{D}_T = \{\mu : [0,T) \to \mathcal{M} : \text{measurable}\}$. One has the following.

Theorem 3.1. Value function $V$ is the unique viscosity solution of (10) in the class $\mathcal{G}_\delta$ for sufficiently small $\delta > 0$. Fix any $T > 0$. Value function $V$ is also the unique continuous solution of $V = S_T[V]$ in the class $\mathcal{G}_\delta$ for sufficiently small $\delta > 0$. Further, given any $\widetilde V \in \mathcal{G}_\delta$, $\lim_{T\to\infty} S_T[\widetilde V](x) = V(x)$ for all $x \in \mathbb{R}^n$ (uniformly on compact sets).

We remind the reader that the proofs of the results in this section may be found in [28], [29]. Let $\mathcal{D}^n$ be the set of $n \times n$ symmetric, positive or negative definite matrices. We say $\phi$ is uniformly semiconvex with (symmetric, definite matrix) constant $\beta \in \mathcal{D}^n$ if $\phi(x) + \frac{1}{2} x' \beta x$ is convex over $\mathbb{R}^n$. Let $\mathcal{S}_\beta = \mathcal{S}_\beta(\mathbb{R}^n)$ be the set of functions mapping $\mathbb{R}^n$ into $\mathbb{R}^-$ which are uniformly semiconvex with (symmetric, definite matrix) constant $\beta$. Also note that $\mathcal{S}_\beta$ is a max-plus vector space [19], [28]. (Note that a max-plus vector space is an example from the set of abstract idempotent spaces, which are also known as idempotent semimodules or as moduloids; cf. [8], [25].) We have the following.

Theorem 3.2. There exists $\bar\beta \in \mathcal{D}^n$ such that, given any $\beta$ such that $\beta - \bar\beta > 0$ (i.e., $\beta - \bar\beta$ positive definite), $V \in \mathcal{S}_\beta$ and $V^m \in \mathcal{S}_\beta$ for all $m \in \mathcal{M}$.
Further, one may take $\beta$ negative definite (i.e., $V$, $V^m$ convex). We henceforth assume we have chosen $\beta$ such that $\beta - \bar\beta > 0$.

3.2. Semiconvex transforms. Recall that the max-plus algebra is the commutative semifield over $\mathbb{R}^-$ given by $a \oplus b \doteq \max\{a, b\}$ and $a \otimes b \doteq a + b$; see [3], [23], [28] for more details. Throughout the work, we will employ certain transform kernel functions $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ which take the form $\psi(x, z) = \frac{1}{2}(x - z)' C (x - z)$, with nonsingular, symmetric $C$ satisfying $C + \beta < 0$ (i.e., $C + \beta$ negative definite). The following semiconvex duality result [19], [28], [34] requires only a small modification of convex duality and Legendre/Fenchel transform results; see section 3 of [37] and, more generally, [38].

Theorem 3.3. Let $\phi \in \mathcal{S}_\beta$. Let $C$ and $\psi$ be as above. Then, for all $x \in \mathbb{R}^n$,

(14)-(15)    $\phi(x) = \max_{z\in\mathbb{R}^n} [\psi(x, z) + a(z)] = \int^{\oplus}_{\mathbb{R}^n} \psi(x, z) \otimes a(z)\, dz \doteq \psi(x, \cdot) \odot a(\cdot),$

where, for all $z \in \mathbb{R}^n$,

(16)-(17)    $a(z) = -\max_{x\in\mathbb{R}^n} [\psi(x, z) - \phi(x)] = -\int^{\oplus}_{\mathbb{R}^n} \psi(x, z) \otimes [-\phi(x)]\, dx = -\{\psi(\cdot, z) \odot [-\phi(\cdot)]\},$

which, using the notation of [8], is

(18)    $a(z) = \left[ \psi(\cdot, z) \odot \phi^-(\cdot) \right]^-.$

We will refer to $a$ as the semiconvex dual of $\phi$ (with respect to $\psi$).

Semiconcavity is the obvious analogue of semiconvexity. In particular, a function $\phi : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is uniformly semiconcave with constant $\beta \in \mathcal{D}^n$ if $\phi(x) - \frac{1}{2} x' \beta x$ is concave over $\mathbb{R}^n$. Let $\mathcal{S}^-_\beta$ be the set of functions mapping $\mathbb{R}^n$ into $\mathbb{R} \cup \{+\infty\}$ which are uniformly semiconcave with constant $\beta$.

It will be critical to the method that the functions obtained by application of the semigroups to the $\psi(\cdot, z)$ be semiconvex with less concavity than the $\psi(\cdot, z)$ themselves. This is the subject of the next theorem.

Theorem 3.4. We may choose $C \in \mathcal{D}^n$ such that $V, V^m \in \mathcal{S}_{-C}$. Further, there exist $\bar\tau > 0$ and $\eta > 0$ such that $S_\tau[\psi(\cdot, z)],\ S^m_\tau[\psi(\cdot, z)] \in \mathcal{S}_{-(C+\eta\tau I)}$ for all $\tau \in [0, \bar\tau]$.
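The transform pair (14)-(17) can be checked numerically on a grid in the scalar case; the kernel constant $c = -4$ and the test function in the usage note are assumptions chosen only so that $\psi$ is steep enough for the duality to hold.

```python
def semiconvex_dual(phi, c, xs):
    """Grid-based sketch of (16): a(z) = -max_x [psi(x, z) - phi(x)], with
    scalar kernel psi(x, z) = 0.5*c*(x - z)**2, c < 0.  `xs` is the grid."""
    psi = lambda x, z: 0.5 * c * (x - z) ** 2
    return {z: -max(psi(x, z) - phi(x) for x in xs) for z in xs}

def semiconvex_primal(a, c, xs):
    """Inverse transform (14): phi(x) = max_z [psi(x, z) + a(z)]."""
    psi = lambda x, z: 0.5 * c * (x - z) ** 2
    return {x: max(psi(x, z) + a[z] for z in xs) for x in xs}
```

For example, for $\phi(x) = x^2$ and $c = -4$, one finds $a(z) = \frac{2}{3}z^2$ (so $a(0.75) = 0.375$), and applying the primal transform to $a$ recovers $\phi$ up to grid resolution.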
Henceforth, we suppose $C$, $\bar\tau$, and $\eta$ chosen so that the results of Theorem 3.4 hold. By Theorem 3.3,

(19)    $S_\tau[\psi(\cdot, z)](x) = \int^{\oplus}_{\mathbb{R}^n} \psi(x, y) \otimes B_\tau(y, z)\, dy = \psi(x, \cdot) \odot B_\tau(\cdot, z),$

where, for all $y \in \mathbb{R}^n$,

(20)    $B_\tau(y, z) = -\int^{\oplus}_{\mathbb{R}^n} \psi(x, y) \otimes \left[ -S_\tau[\psi(\cdot, z)](x) \right] dx = \left\{ \psi(\cdot, y) \odot \left[ S_\tau[\psi(\cdot, z)](\cdot) \right]^- \right\}^-.$

It is handy to define the max-plus linear operator with "kernel" $B_\tau$ as $\mathcal{B}_\tau[\alpha](z) \doteq B_\tau(z, \cdot) \odot \alpha(\cdot)$ for all $\alpha \in \mathcal{S}^-_{-C}$.

Proposition 3.5. Let $\phi \in \mathcal{S}_\beta$ with semiconvex dual denoted by $a$. Define $\phi^1 = S_\tau[\phi]$. Then $\phi^1 \in \mathcal{S}_{\beta - \eta\tau I}$, and $\phi^1(x) = \psi(x, \cdot) \odot a^1(\cdot)$, where $a^1(x) = B_\tau(x, \cdot) \odot a(\cdot)$.

Theorem 3.6. Let $V \in \mathcal{S}_\beta$, let $a$ be its semiconvex dual (with respect to $\psi$), and suppose $B_\tau(z, \cdot) \odot a(\cdot) = \mathcal{B}_\tau[a](z) \in \mathcal{S}^-_d$ for some $d$ such that $C + d < 0$. Then $V = S_\tau[V]$ if and only if

$a(z) = \int^{\oplus}_{\mathbb{R}^n} B_\tau(z, y) \otimes a(y)\, dy = B_\tau(z, \cdot) \odot a(\cdot) = \mathcal{B}_\tau[a](z) \quad \forall z \in \mathbb{R}^n.$

Also, for each $m \in \mathcal{M}$ and $z \in \mathbb{R}^n$, $S^m_\tau[\psi(\cdot, z)] \in \mathcal{S}_{-(C+\eta\tau I)}$ and

$S^m_\tau[\psi(\cdot, z)](x) = \psi(x, \cdot) \odot B^m_\tau(\cdot, z) \quad \forall x \in \mathbb{R}^n,$

where

(21)    $B^m_\tau(y, z) = \left\{ \psi(\cdot, y) \odot \left[ S^m_\tau[\psi(\cdot, z)](\cdot) \right]^- \right\}^- \quad \forall y \in \mathbb{R}^n.$

It will be handy to define the max-plus linear operator with "kernel" $B^m_\tau$ as $\mathcal{B}^m_\tau[a](z) \doteq B^m_\tau(z, \cdot) \odot a(\cdot)$ for all $a \in \mathcal{S}^-_{-C}$. Further, one also obtains results analogous to those for $V$ above.

3.3. Discrete-time approximation. We now discretize over time and employ approximate $\mu$ processes which are constant over the length of each time step. We define the operator $\bar S_\tau$ on $\mathcal{G}_\delta$ by

(22)    $\bar S_\tau[\phi](x) = \max_{m\in\mathcal{M}} \sup_{w\in\mathcal{W}} \left\{ \int_0^\tau l^m(\xi^m_t) - \tfrac{\gamma^2}{2}|w_t|^2\, dt + \phi(\xi^m_\tau) \right\} = \max_{m\in\mathcal{M}} S^m_\tau[\phi](x),$

where $\xi^m$ satisfies (4). Let

$\bar B_\tau(y, z) \doteq \max_{m\in\mathcal{M}} B^m_\tau(y, z) = \bigoplus_{m\in\mathcal{M}} B^m_\tau(y, z) \quad \forall y, z \in \mathbb{R}^n.$

The corresponding max-plus linear operator is $\bar{\mathcal{B}}_\tau = \bigoplus_{m\in\mathcal{M}} \mathcal{B}^m_\tau$.

Lemma 3.7. For all $z \in \mathbb{R}^n$, $\bar S_\tau[\psi(\cdot, z)] \in \mathcal{S}_{-(C+\eta\tau I)}$. Further,

(23)    $\bar S_\tau[\psi(\cdot, z)](x) = \psi(x, \cdot) \odot \bar B_\tau(\cdot, z) \quad \forall x \in \mathbb{R}^n.$
With $\tau$ acting as a time-discretization step-size, let

(24)    $\mathcal{D}^\tau_\infty \doteq \{\mu : [0,\infty) \to \mathcal{M} \mid \text{for each } n \in \mathbb{N} \cup \{0\} \text{ there exists } m_n \in \mathcal{M} \text{ such that } \mu(t) = m_n\ \forall t \in [n\tau, (n+1)\tau)\},$

and for $T = \bar n \tau$ with $\bar n \in \mathbb{N}$, define $\mathcal{D}^\tau_T$ similarly but with domain $[0, T)$ rather than $[0,\infty)$. Let $\mathcal{M}^{\bar n}$ denote the outer product of $\mathcal{M}$, $\bar n$ times. Let $T = \bar n \tau$, and define

(25)    $\bar{\bar S}^\tau_T[\phi](x) = \max_{\{m_k\}_{k=0}^{\bar n - 1} \in \mathcal{M}^{\bar n}} \left[ \prod_{k=0}^{\bar n - 1} S^{m_k}_\tau \right][\phi](x) = (\bar S_\tau)^{\bar n}[\phi](x),$

where the product notation indicates operator composition with the ordering given by $\prod_{k=0}^{\bar n - 1} S^{m_k}_\tau = S^{m_{\bar n - 1}}_\tau S^{m_{\bar n - 2}}_\tau \cdots S^{m_1}_\tau S^{m_0}_\tau$, and the superscript in the last expression indicates repeated application of $\bar S_\tau$, $\bar n$ times. We will be approximating $V$ by solving $V = \bar S_\tau[V]$ via its dual problem $a = \bar{\mathcal{B}}_\tau a$ for small $\tau$. In [28], [29], it is shown that there exists a solution to $V = \bar S_\tau[V]$ and that the solution is unique. In particular, for the existence part, we have the following.

Theorem 3.8. Let

(26)    $\bar V(x) \doteq \lim_{N\to\infty} \bar{\bar S}^\tau_{N\tau}[0](x)$

for all $x \in \mathbb{R}^n$, where 0 here represents the zero-function. Then $\bar V$ satisfies

(27)    $\bar V = \bar S_\tau[\bar V], \quad \bar V(0) = 0.$

Further,

$\bar V(x) = \sup_{w\in\mathcal{W}} \sup_{T\in[0,\infty)} \sup_{\mu\in\mathcal{D}^\tau_\infty} \int_0^T l^{\mu_t}(\xi_t) - \tfrac{\gamma^2}{2}|w_t|^2\, dt,$

where $\xi_t$ satisfies (12), and $0 \le V^m \le \bar V \le V$ for all $m \in \mathcal{M}$ (which implies $\bar V \in \mathcal{G}_\delta$). Lastly, with the choice of $\beta$ above (i.e., such that $C + \beta < 0$), one has $\bar V \in \mathcal{S}_\beta \subset \mathcal{S}_{-C}$.

Similar techniques to those used for the $V^m$ and $V$ prove uniqueness for (27) within $\mathcal{G}_\delta$. In particular, we have the following results [28], [29].

Theorem 3.9. $\bar V$ is the unique solution of (27) within the class $\mathcal{G}_\delta$ for sufficiently small $\delta > 0$. Further, given any $\widetilde V \in \mathcal{G}_\delta$, $\lim_{N\to\infty} \bar{\bar S}^\tau_{N\tau}[\widetilde V](x) = \bar V(x)$ for all $x \in \mathbb{R}^n$ (uniformly on compact sets).

Proposition 3.10. Let $\phi \in \mathcal{S}_\beta \subset \mathcal{S}_{-C}$ with semiconvex dual denoted by $a$. Define $\phi^1 = \bar S_\tau[\phi]$. Then $\phi^1 \in \mathcal{S}_{-(C+\eta\tau I)}$, and

$\phi^1(x) = \psi(x, \cdot) \odot a^1(\cdot),$

where

$a^1(y) = \bar B_\tau(y, \cdot) \odot a(\cdot) \quad \forall y \in \mathbb{R}^n.$
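In the scalar case ($n = 1$), the dual-space step of Proposition 3.10 reduces to propagating a finite list of quadratics through $M$ quadratic kernels, the cardinality multiplying by $M$ at each step. The following sketch carries out the closed-form sup of kernel-plus-quadratic; the kernel coefficients in the usage note are hypothetical and not derived from any particular Hamiltonian.

```python
def propagate(quads, kernels):
    """One scalar (n = 1) dual-space step, a^{k+1} = max over kernels of
    B^m applied to a^k.  Each quadratic (Q, zbar, r) encodes
    q(y) = 0.5*Q*(y - zbar)**2 + r with Q < 0, and each kernel
    (M11, M12, M22) encodes B(z, y) = 0.5*(M11*z*z + 2*M12*z*y + M22*y*y).
    The sup over y of B(z, y) + q(y) is again a quadratic in z, computed in
    closed form.  Illustrative sketch only."""
    out = []
    for M11, M12, M22 in kernels:
        for Q, zbar, r in quads:
            D = M22 + Q                      # must be < 0 for the sup to exist
            E = Q * zbar
            Qn = M11 - M12 * M12 / D         # new curvature
            zn = -(M12 * E / D) / Qn         # new center
            rn = r + 0.5 * E * M22 * zbar / D - 0.5 * Qn * zn * zn
            out.append((Qn, zn, rn))
    return out  # len(out) = len(kernels) * len(quads): M-fold growth per step
```

Since the output is the pointwise max of $M \times (\text{current count})$ quadratics, repeated application grows the representation as $M^k$, which is the curse-of-complexity discussed in section 4.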
We will solve the problem $V = S_T[V]$, or equivalently $0 = -H(x, \nabla V)$ with boundary condition $V(0) = 0$, by computing $\bar a^k = \bar{\mathcal{B}}^k_\tau \bar a^0$ with appropriate initial $\bar a^0$ (such as $\bar a^0 = 0$), where the $k$ superscript on the right-hand side represents operator composition $k$ times. From this, one obtains the approximation $\bar V^k(x) \doteq \psi(x, \cdot) \odot \bar a^k$. We will show that, for $k$ sufficiently large and $\tau$ sufficiently small, $\bar V^k$ will approximate $V$ within an error bound of $\varepsilon(1 + |x|^2)$ for as small an $\varepsilon$ as desired.

4. The algorithm. We summarize the mechanics of the algorithm. The full development may be found in [28], [29]. We start by computing $P^m_\tau$, $\Lambda^m_\tau$, and $R^m_\tau$ for each $m \in \mathcal{M}$, where these are defined by

$S^m_\tau[\psi(\cdot, z)](x) = \tfrac{1}{2}(x - \Lambda^m_\tau z)' P^m_\tau (x - \Lambda^m_\tau z) + \tfrac{1}{2} z' R^m_\tau z$

and where the time-dependent $n \times n$ matrices $P^m_t$, $\Lambda^m_t$, and $R^m_t$ satisfy $P^m_0 = C$, $\Lambda^m_0 = I$, $R^m_0 = 0$, and

(28)    $\dot P^m = (A^m)' P^m + P^m A^m + D^m + P^m \Sigma^m P^m,$
        $\dot\Lambda^m = -\left[ (P^m)^{-1} D^m + A^m \right] \Lambda^m,$
        $\dot R^m = (\Lambda^m)' D^m \Lambda^m.$

$P^m_\tau$, $\Lambda^m_\tau$, and $R^m_\tau$ may be computed from these ordinary differential equations via a Runge–Kutta algorithm (or other technique) with initial time $t = 0$ and terminal time $t = \tau$. We remark that each $P^m_\tau$, $\Lambda^m_\tau$, $R^m_\tau$ need only be computed once.

Next, noting that each $B^m_\tau$ is given by (21), one has

$B^m_\tau(x, z) = \tfrac{1}{2}\left[ x' M^m_{1,1} x + x' M^m_{1,2} z + z' (M^m_{1,2})' x + z' M^m_{2,2} z \right],$

where, with the shorthand notation $D_\tau \doteq (P^m_\tau - C)$,

(29)    $M^m_{1,1} = -C D_\tau^{-1} P^m_\tau,$
(30)    $M^m_{1,2} = -C D_\tau^{-1} P^m_\tau \Lambda^m_\tau,$
(31)    $M^m_{2,2} = R^m_\tau - (\Lambda^m_\tau)' C D_\tau^{-1} P^m_\tau \Lambda^m_\tau.$

Note that each $M^m_{1,1}$, $M^m_{1,2}$, $M^m_{2,2}$ need only be computed once.

In order to reduce notational complexity, for the moment we suppose that we initialize the following iteration with an $a^0$ (the dual of $\bar V^0$) which consists of a single quadratic, that is, $a^0(x) = \tfrac{1}{2}(x - z_0)' \widehat Q_0 (x - z_0) + r_0$. Next we note that, recalling $\bar a^k = \bar{\mathcal{B}}_\tau [\bar{\mathcal{B}}_\tau]^{k-1} \bar a^0$ with $\bar{\mathcal{B}}_\tau = \bigoplus_{m\in\mathcal{M}} \mathcal{B}^m_\tau$,

$a^k(x) = \bigoplus_{\{m_i\}_{i=1}^k \in \mathcal{M}^k} a^k_{\{m_i\}_{i=1}^k}(x),$

where each
$a^k_{\{m_i\}_{i=1}^k}(x) \doteq \mathcal{B}^{m_k}_\tau(x, \cdot) \odot a^{k-1}_{\{m_i\}_{i=1}^{k-1}}(\cdot) = \tfrac{1}{2}\left( x - z^k_{\{m_i\}_{i=1}^k} \right)' \widehat Q^k_{\{m_i\}_{i=1}^k} \left( x - z^k_{\{m_i\}_{i=1}^k} \right) + r^k_{\{m_i\}_{i=1}^k},$

where, for each $m_{k+1}$ and each $\{m_i\}_{i=1}^k$,

(32)    $\widehat Q^{k+1}_{\{m_i\}_{i=1}^{k+1}} = M^{m_{k+1}}_{1,1} - M^{m_{k+1}}_{1,2} D^{-1} \left( M^{m_{k+1}}_{1,2} \right)',$
        $z^{k+1}_{\{m_i\}_{i=1}^{k+1}} = -\left( \widehat Q^{k+1}_{\{m_i\}_{i=1}^{k+1}} \right)^{-1} M^{m_{k+1}}_{1,2} D^{-1} E,$
        $r^{k+1}_{\{m_i\}_{i=1}^{k+1}} = r^k_{\{m_i\}_{i=1}^k} + \tfrac{1}{2} E' D^{-1} M^{m_{k+1}}_{2,2} z^k_{\{m_i\}_{i=1}^k} - \tfrac{1}{2} \left( z^{k+1}_{\{m_i\}_{i=1}^{k+1}} \right)' \widehat Q^{k+1}_{\{m_i\}_{i=1}^{k+1}} z^{k+1}_{\{m_i\}_{i=1}^{k+1}},$

with

$D = M^{m_{k+1}}_{2,2} + \widehat Q^k_{\{m_i\}_{i=1}^k}, \qquad E = \widehat Q^k_{\{m_i\}_{i=1}^k} z^k_{\{m_i\}_{i=1}^k}.$

We see that the propagation of each $a^k_{\{m_i\}_{i=1}^k}$ amounts to a set of matrix multiplications and an inverse. (Note that in the case of purely quadratic constituent Hamiltonians here, one will have all $z^k_{\{m_i\}_{i=1}^k} = 0$ and all $r^k_{\{m_i\}_{i=1}^k} = 0$.) Thus, at each step $k$, the semiconvex dual of $\bar V^k$, $a^k$, is represented as the finite set of triples

$\widehat{\mathcal{Q}}^k \doteq \left\{ \left( \widehat Q^k_{\{m_i\}_{i=1}^k}, z^k_{\{m_i\}_{i=1}^k}, r^k_{\{m_i\}_{i=1}^k} \right) \mid m_i \in \mathcal{M}\ \forall i \in \{1, 2, \dots, k\} \right\}.$

(We remark that in actual implementation, it is simpler to index the entries of each $\widehat{\mathcal{Q}}^k$ with integers $i \in \{1, 2, \dots, \#\widehat{\mathcal{Q}}^k\}$ rather than with the sequences $\{m_i\}_{i=1}^k$.)

At any desired stopping time, one can recover a representation of $\bar V^k$ as

$\bar V^k(x) \doteq \max_{\{m_i\}_{i=1}^k} \left[ \tfrac{1}{2}\left( x - \bar x^k_{\{m_i\}_{i=1}^k} \right)' P^k_{\{m_i\}_{i=1}^k} \left( x - \bar x^k_{\{m_i\}_{i=1}^k} \right) + \rho^k_{\{m_i\}_{i=1}^k} \right],$

where

(33)    $P^k_{\{m_i\}_{i=1}^k} = C F \widehat Q^k_{\{m_i\}_{i=1}^k} F C + (F C - I)' C (F C - I),$
        $\bar x^k_{\{m_i\}_{i=1}^k} = -\left( P^k_{\{m_i\}_{i=1}^k} \right)^{-1} \left[ C F \widehat Q^k_{\{m_i\}_{i=1}^k} G + (F C - I)' C F \widehat Q^k_{\{m_i\}_{i=1}^k} \right] z^k_{\{m_i\}_{i=1}^k},$
        $\rho^k_{\{m_i\}_{i=1}^k} = r^k_{\{m_i\}_{i=1}^k} + \tfrac{1}{2}\left( z^k_{\{m_i\}_{i=1}^k} \right)' \left[ \widehat Q^k_{\{m_i\}_{i=1}^k} F C F \widehat Q^k_{\{m_i\}_{i=1}^k} + G' \widehat Q^k_{\{m_i\}_{i=1}^k} G \right] z^k_{\{m_i\}_{i=1}^k} - \tfrac{1}{2}\left( \bar x^k_{\{m_i\}_{i=1}^k} \right)' P^k_{\{m_i\}_{i=1}^k} \bar x^k_{\{m_i\}_{i=1}^k},$

with

$F \doteq \left( \widehat Q^k_{\{m_i\}_{i=1}^k} + C \right)^{-1} \quad \text{and} \quad G \doteq F \widehat Q^k_{\{m_i\}_{i=1}^k} - I.$

Thus, $\bar V^k$ has the representation as the finite set of triples
(34)    $\mathcal{P}^k \doteq \left\{ \left( P^k_{\{m_i\}_{i=1}^k}, \bar x^k_{\{m_i\}_{i=1}^k}, \rho^k_{\{m_i\}_{i=1}^k} \right) \mid m_i \in \mathcal{M}\ \forall i \in \{1, 2, \dots, k\} \right\}.$

We note that the triples which comprise $\mathcal{P}^k$ are analytically obtained from the triples $\left( \widehat Q^k_{\{m_i\}_{i=1}^k}, z^k_{\{m_i\}_{i=1}^k}, r^k_{\{m_i\}_{i=1}^k} \right)$ by matrix multiplications and an inverse. The transference from $\left( \widehat Q^k_{\{m_i\}_{i=1}^k}, z^k_{\{m_i\}_{i=1}^k}, r^k_{\{m_i\}_{i=1}^k} \right)$ to $\left( P^k_{\{m_i\}_{i=1}^k}, \bar x^k_{\{m_i\}_{i=1}^k}, \rho^k_{\{m_i\}_{i=1}^k} \right)$ need only be done once, at the termination of the algorithm propagation. We note that (34) is our approximate solution of the original control problem/HJB PDE.

The curse-of-dimensionality is replaced by another type of rapid computational-cost growth. We refer to this as the curse-of-complexity. If $\#\mathcal{M} = 1$, then all the computations for our algorithm (except the solution of a Riccati equation) are unnecessary, and we informally refer to this as complexity one. When there are $M = \#\mathcal{M}$ such quadratics in the Hamiltonian $H$, we say it has complexity $M$. Note that

$\#\left\{ \bar V^k_{\{m_i\}_{i=1}^k} \mid m_i \in \mathcal{M}\ \forall i \in \{1, 2, \dots, k\} \right\} = M^k.$

For large $k$, this is indeed a large number. In order for the computations to be practical, one must reduce this by pruning and other techniques; see section 9.

Lastly, note that in the above we assumed $a^0$ to consist of a single quadratic. In general, we take $a^0 = \bigoplus_{j\in\mathcal{J}_0} a^0_j(x)$ with $\mathcal{J}_0 = \{1, 2, \dots, J_0\}$, where each $a^0_j(x) = \tfrac{1}{2}(x - z^0_j)' \widehat Q^0_j (x - z^0_j) + r^0_j$. This increases the size of each $\widehat{\mathcal{Q}}^k$ by a factor of $J_0$. Denote the elements of $\mathcal{M}^k = \left\{ \{m_i\}_{i=1}^k \mid m_i \in \mathcal{M}\ \forall i \right\}$ by $m^k \in \mathcal{M}^k$. With the additional quadratics in $a^0$ and the reduced-complexity indexes, one sees that at each step we would have $\widehat{\mathcal{Q}}^k$ in the form

(35)    $\widehat{\mathcal{Q}}^k = \left\{ \left( \widehat Q^k_{(j, m^k)}, z^k_{(j, m^k)}, r^k_{(j, m^k)} \right) \mid j \in \mathcal{J}_0,\ m^k \in \mathcal{M}^k \right\}.$

5. Error bounds and convergence. There are two error sources with this curse-of-dimensionality-free approach to solution of the HJB PDE.
For concreteness, we suppose that the algorithm is initialized with $V^0 = \bigoplus_{m\in\mathcal{M}} V^m$. Letting $T = N\tau$, where $N$ is the number of iterations, the first error source is

(36)    $\varepsilon^F_T(x, N, \tau) = S_T[V^0](x) - \bar{\bar S}^\tau_T[V^0](x).$

This is the error due to the time-discretization of the $\mu$ process. The second error source is

(37)    $\varepsilon^I_T(x, N, \tau) = V(x) - S_T[V^0](x).$

This is the error due to approximating the infinite time-horizon problem by the finite time-horizon problem with horizon $T$. The total error is obviously $\varepsilon^{TE} = \varepsilon^F_T(x, N, \tau) + \varepsilon^I_T(x, N, \tau)$. We begin the error analysis with the former of these two error sources in section 6. In section 7, we consider the latter, and in section 8, the two are combined.

6. Errors from time-discretization. We henceforth assume that all the $\sigma^m$'s used in the dynamics (4) are the same, i.e., $\sigma^m = \sigma$ for all $m \in \mathcal{M}$. The authors do not know whether this is required, but were unable to obtain the (already technically difficult) proofs of the estimates in this section without this assumption. The proof of the following is quite technical. However, we assure the reader that the proof of the time-horizon error estimate in section 7 is much less tedious.

Theorem 6.1. There exists $K_\delta < \infty$ such that, for all sufficiently small $\tau > 0$,

$0 \le S_T[V^m](x) - \bar{\bar S}^\tau_T[V^m](x) \le K_\delta (1 + |x|^2)(\tau + \sqrt{\tau})$

for all $T \in (0, \infty)$, $x \in \mathbb{R}^n$, and $m \in \mathcal{M}$.

Proof. Fix $\delta > 0$ (used in the definition of $\mathcal{G}_\delta$). Fix $m \in \mathcal{M}$. Fix any $T < \infty$ and $x \in \mathbb{R}^n$. Let $\hat\varepsilon > 0$ and $\varepsilon = (\hat\varepsilon/2)(1 + |x|^2)$. Let $w^\varepsilon \in \mathcal{W}$, $\mu^\varepsilon \in \mathcal{D}_\infty$ be $\varepsilon$-optimal for $S_T[V^m](x)$, i.e.,

(38)    $S_T[V^m](x) - \left[ \int_0^T l^{\mu^\varepsilon_t}(\xi^\varepsilon_t) - \tfrac{\gamma^2}{2}|w^\varepsilon_t|^2\, dt + V^m(\xi^\varepsilon_T) \right] \le \varepsilon = \tfrac{\hat\varepsilon}{2}\left( 1 + |x|^2 \right),$

where $\xi^\varepsilon$ satisfies (12) with inputs $w^\varepsilon$, $\mu^\varepsilon$. We will let $\bar\xi^\varepsilon$ satisfy (12) with inputs $w^\varepsilon$ and $\bar\mu^\varepsilon \in \mathcal{D}^\tau_\infty$ (where $\tau$ has yet to be chosen).
Integrating (12), one has

(39)    $\xi^\varepsilon_t = \int_0^t \left[ A^{\mu^\varepsilon_r} \xi^\varepsilon_r + \sigma w^\varepsilon_r \right] dr + x \quad \text{and} \quad \bar\xi^\varepsilon_t = \int_0^t \left[ A^{\bar\mu^\varepsilon_r} \bar\xi^\varepsilon_r + \sigma w^\varepsilon_r \right] dr + x.$

Taking the difference of these, one has

(40)    $\xi^\varepsilon_t - \bar\xi^\varepsilon_t = \int_0^t \left[ A^{\mu^\varepsilon_r} \xi^\varepsilon_r - A^{\bar\mu^\varepsilon_r} \bar\xi^\varepsilon_r \right] dr = \int_0^t \left( A^{\mu^\varepsilon_r} - A^{\bar\mu^\varepsilon_r} \right) \xi^\varepsilon_r\, dr + \int_0^t A^{\bar\mu^\varepsilon_r} \left( \xi^\varepsilon_r - \bar\xi^\varepsilon_r \right) dr.$

Letting $z_t \doteq \frac{1}{2}\left| \xi^\varepsilon_t - \bar\xi^\varepsilon_t \right|^2$ and using (40) yields

$\dot z_t = \left( \xi^\varepsilon_t - \bar\xi^\varepsilon_t \right)' \left( A^{\mu^\varepsilon_t} - A^{\bar\mu^\varepsilon_t} \right) \xi^\varepsilon_t + \left( \xi^\varepsilon_t - \bar\xi^\varepsilon_t \right)' A^{\bar\mu^\varepsilon_t} \left( \xi^\varepsilon_t - \bar\xi^\varepsilon_t \right),$

which, by using Assumption Block (A.m),

$\le \left( \xi^\varepsilon_t - \bar\xi^\varepsilon_t \right)' \left( A^{\mu^\varepsilon_t} - A^{\bar\mu^\varepsilon_t} \right) \xi^\varepsilon_t - 2 c_A z_t.$

Noting that $z_0 = 0$ and solving this differential inequality, one obtains

(41)    $\tfrac{1}{2}\left| \xi^\varepsilon_t - \bar\xi^\varepsilon_t \right|^2 \le \int_0^t e^{-2 c_A (t - r)} \left( \xi^\varepsilon_r - \bar\xi^\varepsilon_r \right)' \left( A^{\mu^\varepsilon_r} - A^{\bar\mu^\varepsilon_r} \right) \xi^\varepsilon_r\, dr.$

Consequently, we now seek a bound on the right-hand side of (41) which goes to zero as $\tau \downarrow 0$, independent of $0 \le t \le T < \infty$. We will use the boundedness of $\xi^\varepsilon$, $\bar\xi^\varepsilon$, and $w^\varepsilon$, which is independent of $T$ for this class of systems [36]. (Precise statements are given in Lemmas 6.6 and 6.7 below.)

For any given $\tau > 0$, we build $\bar\mu^\varepsilon_r$ from $\mu^\varepsilon_r$ over $[0, T]$ in the following manner. Fix $\tau > 0$. Let $N_t$ be the largest integer such that $N_t \tau \le t$. For any Lebesgue measurable subset $I$ of $\mathbb{R}$, let $\mathcal{L}(I)$ be the measure of $I$. For $t \in [0, T]$, $m \in \mathcal{M}$, let

(42)    $I^m_t = \{ r \in [0, t) \mid \mu^\varepsilon_r = m \}, \quad \bar I^m_t = \{ r \in [0, t) \mid \bar\mu^\varepsilon_r = m \}, \quad \lambda^m_t = \mathcal{L}(I^m_t), \quad \bar\lambda^m_t = \mathcal{L}(\bar I^m_t).$

At the end of any time step $n\tau$ (with $n \le N_T$), we pick one $m \in \mathcal{M}$ with the largest (positive) error so far committed and we correct it, i.e., we let

(43)    $\bar m \in \operatorname*{argmax}_{m\in\mathcal{M}} \left[ \lambda^m_{n\tau} - \bar\lambda^m_{(n-1)\tau} \right],$

and we set

(44)    $\bar\mu^\varepsilon_r = \bar m \quad \forall r \in [(n-1)\tau, n\tau).$

Finally, we simply set

(45)    $\bar\mu^\varepsilon_r \in \operatorname*{argmax}_{m\in\mathcal{M}} \left\{ \lambda^m_T - \bar\lambda^m_{N_T \tau} \right\} \quad \forall r \in [N_T \tau, T].$

Obviously, for all $0 \le r \le T$, the sum of the errors in measure is null, that is,

$\sum_{m\in\mathcal{M}} \left( \lambda^m_r - \bar\lambda^m_r \right) = \sum_{m\in\mathcal{M}} \lambda^m_r - \sum_{m\in\mathcal{M}} \bar\lambda^m_r = r - r = 0.$

With this construction we also get the following.

Lemma 6.2. For any $t \in [0, T]$ and any $m \in \mathcal{M}$, one has $\lambda^m_t - \bar\lambda^m_t \ge -\tau$.

Proof.
Let us assume that the result holds at time $n\tau$ (which is certainly true for $n=0$). We first consider the case $(n+1)\tau \le T$. Since $\sum_{m\in\mathcal{M}}\lambda^m_{n\tau} - \bar\lambda^m_{n\tau} = 0$, there exists $\hat m$ such that $\lambda^{\hat m}_{n\tau} - \bar\lambda^{\hat m}_{n\tau} \ge 0$. Hence, $\max_{m\in\mathcal{M}}(\lambda^m_{(n+1)\tau} - \bar\lambda^m_{n\tau}) \ge \max_{m\in\mathcal{M}}(\lambda^m_{n\tau} - \bar\lambda^m_{n\tau}) \ge 0$. Now, choosing $\bar m$ by (43) (for time step $(n+1)\tau$), we have

(46)  $\lambda^{\bar m}_{(n+1)\tau} - \bar\lambda^{\bar m}_{n\tau} \ge 0.$

Now let $t\in[n\tau,(n+1)\tau]$. Then, by (46),

$\lambda^{\bar m}_t - \bar\lambda^{\bar m}_t \ge -(\lambda^{\bar m}_{(n+1)\tau} - \lambda^{\bar m}_t) - (\bar\lambda^{\bar m}_t - \bar\lambda^{\bar m}_{n\tau}),$

which by (44) and the choice of $\bar m$,

$= -(\lambda^{\bar m}_{(n+1)\tau} - \lambda^{\bar m}_t) - (t - n\tau),$

and since $\lambda^m_t - \lambda^m_s \le t-s$ for all $m$ and $t\ge s$,

(47)  $\ge (t-(n+1)\tau) - (t-n\tau) = -\tau.$

Now since $\bar\mu^\varepsilon_r = \bar m$ for all $r\in[n\tau,(n+1)\tau)$, $\bar\lambda^m_r = \bar\lambda^m_{n\tau}$ for all $r\in[n\tau,(n+1)\tau)$ and all $m\ne\bar m$. Consequently, for $t\in[n\tau,(n+1)\tau]$ and $m\ne\bar m$,

$\lambda^m_t - \bar\lambda^m_t = \lambda^m_t - \bar\lambda^m_{n\tau} \ge \lambda^m_{n\tau} - \bar\lambda^m_{n\tau},$

which by the induction assumption

(48)  $\ge -\tau.$

Combining (47) and (48) completes the proof for the case $(n+1)\tau\le T$.

We must still consider the case $n\tau < T < (n+1)\tau$. The proof is essentially identical to that above with the exception that $(n+1)\tau$ is replaced by $T$, $\bar m\in\operatorname{argmax}_{m\in\mathcal{M}}\{\lambda^m_T - \bar\lambda^m_{n\tau}\}$ is used in place of (43), and (45) is used in place of (44). We do not repeat the details.

Using this, we will find the following.

Lemma 6.3. For any $t\in[0,T]$ and any $m\in\mathcal{M}$, one has $\lambda^m_t - \bar\lambda^m_t \le (M-1)\tau$.

Proof. Since $\sum_{m\in\mathcal{M}}\lambda^m_t - \bar\lambda^m_t = 0$, for all $\tilde m\in\mathcal{M}$,

$\lambda^{\tilde m}_t - \bar\lambda^{\tilde m}_t = -\sum_{m\ne\tilde m}(\lambda^m_t - \bar\lambda^m_t),$

which by Lemma 6.2,

$\le (M-1)\tau.$

We now develop some more-delicate machinery, which will allow us to make fine estimates of the difference between $\xi^\varepsilon_t$ and $\bar\xi^\varepsilon_t$. For each $m$ we divide $\bar I^m_t$ into pieces $\bar I^{m,t}_k$ of length $\tau$ as follows. Let $\bar K^m_t = \max\{k\in\mathbb{N}\cup\{0\} \mid \exists$ an integer $n\le t/\tau$ such that $\bar\lambda^m_{n\tau} = k\tau\}$. Then, for $k\le \bar K^m_t$, let $n^{m,t}_k = \min\{n\in\mathbb{N}\cup\{0\} \mid \bar\lambda^m_{n\tau} = k\tau\}$ and $\bar I^{m,t}_k = [(n^{m,t}_k-1)\tau,\, n^{m,t}_k\tau]$.
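The construction (42)-(45) and the bounds of Lemmas 6.2 and 6.3 can be illustrated numerically. The following Python sketch is purely illustrative (the sampled switching signal, grid sizes, and tie-breaking rule are our own choices, not from the paper): it builds the piecewise-constant signal $\bar\mu^\varepsilon$ from a finely sampled $\mu^\varepsilon$ by the greedy largest-deficit rule and checks that the occupation-measure errors stay within $[-\tau, (M-1)\tau]$.

```python
import random

def discretize_mu(mu, h, tau, T):
    """Build a piecewise-constant (step tau) approximation of the switching
    signal mu, following (42)-(45): at the end of each step n*tau, assign the
    whole step to the mode with the largest accumulated occupation-measure
    deficit lambda^m_{n tau} - lambda_bar^m_{(n-1) tau}.
    `mu` is sampled on a fine grid of step h (tau must be a multiple of h)."""
    K = round(tau / h)                 # fine-grid cells per tau-step
    N = round(T / tau)                 # number of tau-steps
    modes = sorted(set(mu))
    steps = []                         # mode chosen for each tau-step
    lam_bar = {m: 0.0 for m in modes}  # lambda_bar^m at the previous step end
    for n in range(1, N + 1):
        lam = {m: h * sum(1 for v in mu[: n * K] if v == m) for m in modes}
        m_bar = max(modes, key=lambda m: lam[m] - lam_bar[m])  # rule (43)
        steps.append(m_bar)                                    # rule (44)
        lam_bar[m_bar] += tau
    return steps

# A random switching signal over [0, T] with M = 3 modes.
random.seed(0)
h, tau, T = 0.01, 0.1, 5.0
mu = [random.randrange(3) for _ in range(round(T / h))]
steps = discretize_mu(mu, h, tau, T)

# Check Lemmas 6.2 and 6.3:  -tau <= lambda^m_t - lambda_bar^m_t <= (M-1)*tau.
M, K = 3, round(tau / h)
for i in range(1, len(mu) + 1):
    t = i * h
    for m in range(M):
        lam = h * sum(1 for v in mu[:i] if v == m)
        full = min(i // K, len(steps))          # completed tau-steps before t
        lam_bar = tau * sum(1 for s in steps[:full] if s == m)
        if full < len(steps):                   # partially elapsed current step
            lam_bar += (t - full * tau) * (steps[full] == m)
        assert -tau - 1e-9 <= lam - lam_bar <= (M - 1) * tau + 1e-9
print("Lemmas 6.2/6.3 bounds hold on this sample")
```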
Let $K^m_t = ]1,\bar K^m_t]$, where for any integers $m\le n$, $]m,n[$ denotes $\{m, m+1, \dots, n\}$. Loosely speaking, we will now let $I^{m,t}_k$ denote a subset of $I^m_t$ of measure $\tau$, corresponding to $\bar I^{m,t}_k$. More specifically, we define $I^{m,t}_k$ as follows. Introduce the functions $\Phi^{m,t}_k(r)$, which are monotonically increasing (hence measurable) functions (that will match $\bar I^{m,t}_k = [(n^{m,t}_k-1)\tau, n^{m,t}_k\tau]$ with $I^{m,t}_k$), given by

$\Phi^{m,t}_k(r) = \inf\{\rho\in[0,t] \mid \lambda^m_\rho = (k-1)\tau + [r-(n^{m,t}_k-1)\tau]\} = \inf\{\rho\in[0,t] \mid \lambda^m_\rho = r + (k-n^{m,t}_k)\tau\},$

where, in particular, we take $\Phi^{m,t}_k(r) = t$ if there does not exist $\rho\in[0,t]$ such that $\lambda^m_\rho = r + (k-n^{m,t}_k)\tau$. We note that the $\Phi^{m,t}_k(r)$ are piecewise translations. Then (neglecting the point $r=t$, which has measure zero anyway) $I^{m,t}_k = \Phi^{m,t}_k(\bar I^{m,t}_k)$.

We also define $\bar I^{m,t}_f$ as the last part of $\bar I^m_t$, with length $\mathcal{L}(\bar I^{m,t}_f)\le\tau$, and $I^{m,t}_f$ as the last part of $I^m_t$ not corresponding to an interval of length $\tau$ of $\bar I^m_t$. That is,

$\bar I^{m,t}_f = \bar I^m_t \setminus \bigcup_{k\in K^m_t}\bar I^{m,t}_k$  and  $I^{m,t}_f = I^m_t \setminus \bigcup_{k\in K^m_t} I^{m,t}_k.$

With this additional machinery, we now return to obtaining the bound on $|\xi^\varepsilon_t - \bar\xi^\varepsilon_t|$. Note that

$e^{-2c_A t}\int_0^t e^{2c_A r}(\xi^\varepsilon_r - \bar\xi^\varepsilon_r)'(A^{\mu^\varepsilon_r} - A^{\bar\mu^\varepsilon_r})\xi^\varepsilon_r\,dr = e^{-2c_A t}\sum_{m\in\mathcal{M}}\left[\int_{I^m_t} e^{2c_A r}(\xi^\varepsilon_r - \bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r\,dr - \int_{\bar I^m_t} e^{2c_A r}(\xi^\varepsilon_r - \bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r\,dr\right].$

Combining this with (41) yields

$\frac12|\xi^\varepsilon_t - \bar\xi^\varepsilon_t|^2 \le e^{-2c_A t}\sum_{m\in\mathcal{M}}\Bigg\{\sum_{k\in K^m_t}\left[\int_{I^{m,t}_k} e^{2c_A r}(\xi^\varepsilon_r-\bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r\,dr - \int_{(n^{m,t}_k-1)\tau}^{n^{m,t}_k\tau} e^{2c_A r}(\xi^\varepsilon_r-\bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r\,dr\right]$
$\qquad + \int_{I^{m,t}_f} e^{2c_A r}(\xi^\varepsilon_r-\bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r\,dr - \int_{\bar I^{m,t}_f} e^{2c_A r}(\xi^\varepsilon_r-\bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r\,dr\Bigg\},$

which by the definition of $\Phi^{m,t}_k$,

(49)  $= e^{-2c_A t}\sum_{m\in\mathcal{M}}\Bigg\{\sum_{k\in K^m_t}\int_{(n^{m,t}_k-1)\tau}^{n^{m,t}_k\tau}\Big[e^{2c_A\Phi^{m,t}_k(r)}\big(\xi^\varepsilon_{\Phi^{m,t}_k(r)} - \bar\xi^\varepsilon_{\Phi^{m,t}_k(r)}\big)'A^m\xi^\varepsilon_{\Phi^{m,t}_k(r)} - e^{2c_A r}(\xi^\varepsilon_r-\bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r\Big]\,dr$
$\qquad + \int_{I^{m,t}_f} e^{2c_A r}(\xi^\varepsilon_r-\bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r\,dr - \int_{\bar I^{m,t}_f} e^{2c_A r}(\xi^\varepsilon_r-\bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r\,dr\Bigg\}.$
Note that

(50)  $e^{2c_A\Phi^{m,t}_k(r)}\big(\xi^\varepsilon_{\Phi^{m,t}_k(r)} - \bar\xi^\varepsilon_{\Phi^{m,t}_k(r)}\big)'A^m\xi^\varepsilon_{\Phi^{m,t}_k(r)} - e^{2c_A r}(\xi^\varepsilon_r - \bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r$
$= \int_r^{\Phi^{m,t}_k(r)}\Big[2c_A e^{2c_A\rho}(\xi^\varepsilon_\rho - \bar\xi^\varepsilon_\rho)'A^m\xi^\varepsilon_\rho + e^{2c_A\rho}(\xi^\varepsilon_\rho - \bar\xi^\varepsilon_\rho)'A^m\big(A^{\mu^\varepsilon_\rho}\xi^\varepsilon_\rho + \sigma w^\varepsilon_\rho\big) + e^{2c_A\rho}\big(A^{\mu^\varepsilon_\rho}\xi^\varepsilon_\rho - A^{\bar\mu^\varepsilon_\rho}\bar\xi^\varepsilon_\rho\big)'A^m\xi^\varepsilon_\rho\Big]\,d\rho$
$\le e^{2c_A t}K_1\left|\int_r^{\Phi^{m,t}_k(r)} |\xi^\varepsilon_\rho|^2 + |\bar\xi^\varepsilon_\rho|^2 + |w^\varepsilon_\rho|^2\,d\rho\right|$

for proper choice of $K_1$ independent of $r, t, x, m, k$, and $\varepsilon > 0$. Substituting (50) into (49) yields

(51)  $\frac12|\xi^\varepsilon_t - \bar\xi^\varepsilon_t|^2 \le \sum_{m\in\mathcal{M}}\Bigg\{\sum_{k\in K^m_t}\int_{(n^{m,t}_k-1)\tau}^{n^{m,t}_k\tau}\left|\int_r^{\Phi^{m,t}_k(r)} K_1\big(|\xi^\varepsilon_\rho|^2 + |\bar\xi^\varepsilon_\rho|^2 + |w^\varepsilon_\rho|^2\big)\,d\rho\right|dr$
$\qquad + \int_{I^{m,t}_f} e^{2c_A(r-t)}(\xi^\varepsilon_r - \bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r\,dr - \int_{\bar I^{m,t}_f} e^{2c_A(r-t)}(\xi^\varepsilon_r - \bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r\,dr\Bigg\}.$

Define

$\Phi^{m,t,+}_k(r) = \begin{cases}\Phi^{m,t}_k(r) & \text{if } \Phi^{m,t}_k(r)\ge r,\\ r & \text{otherwise},\end{cases}\qquad \Phi^{m,t,-}_k(r) = \begin{cases}\Phi^{m,t}_k(r) & \text{if } \Phi^{m,t}_k(r)\le r,\\ r & \text{otherwise}.\end{cases}$

Then

(52)  $\left|\int_r^{\Phi^{m,t}_k(r)} K_1\big(|\xi^\varepsilon_\rho|^2 + |\bar\xi^\varepsilon_\rho|^2 + |w^\varepsilon_\rho|^2\big)\,d\rho\right| = \int_r^{\Phi^{m,t,+}_k(r)} K_1\big(|\xi^\varepsilon_\rho|^2 + |\bar\xi^\varepsilon_\rho|^2 + |w^\varepsilon_\rho|^2\big)\,d\rho + \int_{\Phi^{m,t,-}_k(r)}^r K_1\big(|\xi^\varepsilon_\rho|^2 + |\bar\xi^\varepsilon_\rho|^2 + |w^\varepsilon_\rho|^2\big)\,d\rho,$

where at most one of the integrals on the right is nonzero. We need to evaluate the distance $|\Phi^{m,t}_k(r) - r|$ for $r$ in $[(n^{m,t}_k-1)\tau, n^{m,t}_k\tau]$.

Lemma 6.4. For all $m\in\mathcal{M}$, $\Phi^{m,t,+}_k(r) < n^{m,t}_{k+2}\tau + (r - n^{m,t}_k\tau)$ for all $r\in[(n^{m,t}_k-1)\tau, n^{m,t}_k\tau]$ and $k\in]1,\bar K^m_t - 2[$, and $\Phi^{m,t,-}_k(r) > n^{m,t}_{k-M}\tau + (r - n^{m,t}_k\tau)$ for all $r\in[(n^{m,t}_k-1)\tau, n^{m,t}_k\tau]$ and $k\in]M+1,\bar K^m_t[$.

Proof. We prove each of the two inequalities separately. Suppose there exist $m\in\mathcal{M}$, $r\in[(n^{m,t}_k-1)\tau, n^{m,t}_k\tau]$, and $k\in]1,\bar K^m_t-2[\,\subseteq K^m_t$ such that

(53)  $\Phi^{m,t,+}_k(r) \ge n^{m,t}_{k+2}\tau + r - n^{m,t}_k\tau.$

By definition,

$k\tau + r - n^{m,t}_k\tau = \lambda^m_{\Phi^{m,t}_k(r)},$

which by Lemma 6.2,

$\ge \bar\lambda^m_{\Phi^{m,t}_k(r)} - \tau,$

which by (53) and the monotonicity of $\bar\lambda^m_\cdot$,
(54)  $\ge \bar\lambda^m_{n^{m,t}_{k+2}\tau + r - n^{m,t}_k\tau} - \tau.$

However, note that

(55)  $\bar\lambda^m_\rho = j\tau + \rho - n^{m,t}_j\tau$ for $\rho\in\big[(n^{m,t}_j-1)\tau,\, n^{m,t}_j\tau\big],$

and that $n^{m,t}_{k+2}\tau + r - n^{m,t}_k\tau \in [(n^{m,t}_{k+2}-1)\tau, n^{m,t}_{k+2}\tau]$, and so

$\bar\lambda^m_{n^{m,t}_{k+2}\tau + r - n^{m,t}_k\tau} = (k+2)\tau + n^{m,t}_{k+2}\tau + r - n^{m,t}_k\tau - n^{m,t}_{k+2}\tau = (k+2)\tau + r - n^{m,t}_k\tau.$

Consequently, (54) yields

$k\tau + r - n^{m,t}_k\tau \ge (k+2)\tau + r - n^{m,t}_k\tau - \tau > k\tau + r - n^{m,t}_k\tau,$

which is a contradiction.

Now we turn to the second inequality. Suppose there exist $m\in\mathcal{M}$, $r\in[(n^{m,t}_k-1)\tau, n^{m,t}_k\tau]$, and $k\in]M+1,\bar K^m_t[\,\subset K^m_t$ such that

(56)  $\Phi^{m,t,-}_k(r) \le n^{m,t}_{k-M}\tau + (r - n^{m,t}_k\tau).$

Again, by definition,

$k\tau + r - n^{m,t}_k\tau = \lambda^m_{\Phi^{m,t}_k(r)},$

which by Lemma 6.3,

$\le \bar\lambda^m_{\Phi^{m,t}_k(r)} + (M-1)\tau,$

which by (56) and the monotonicity of $\bar\lambda^m_\cdot$,

(57)  $\le \bar\lambda^m_{n^{m,t}_{k-M}\tau + r - n^{m,t}_k\tau} + (M-1)\tau.$

Noting that $n^{m,t}_{k-M}\tau + r - n^{m,t}_k\tau \in [(n^{m,t}_{k-M}-1)\tau, n^{m,t}_{k-M}\tau]$ and again appealing to (55), one sees that $\bar\lambda^m_{n^{m,t}_{k-M}\tau + r - n^{m,t}_k\tau} = (k-M)\tau + r - n^{m,t}_k\tau$, and so (57) implies

$k\tau + r - n^{m,t}_k\tau \le (k-M)\tau + r - n^{m,t}_k\tau + (M-1)\tau = (k-1)\tau + r - n^{m,t}_k\tau,$

which is a contradiction.

Lemma 6.5. Suppose $f(\cdot)$ is nonnegative and integrable over $[0,T]$. For any $t\in[0,T]$ and any $m\in\mathcal{M}$,

$\sum_{k\in K^m_t}\int_{(n^{m,t}_k-1)\tau}^{n^{m,t}_k\tau}\left|\int_r^{\Phi^{m,t}_k(r)} f(\rho)\,d\rho\right|dr \le (M+2)\tau\int_0^t f(\rho)\,d\rho.$

Proof. Using the first inequality of Lemma 6.4, one has

$\sum_{k\in K^m_t}\int_{(n^{m,t}_k-1)\tau}^{n^{m,t}_k\tau}\int_r^{\Phi^{m,t,+}_k(r)} f(\rho)\,d\rho\,dr \le \sum_{k\in K^m_t}\int_{(n^{m,t}_k-1)\tau}^{n^{m,t}_k\tau}\int_r^{[(\tilde n^{m,t}_{k+2} - n^{m,t}_k)\tau + r]\wedge t} f(\rho)\,d\rho\,dr,$

where $\tilde n^{m,t}_k = n^{m,t}_k$ if $k\le\bar K^m_t$ and $\tilde n^{m,t}_k = +\infty$ otherwise, and with a change of variables, this is

$= \sum_{k\in K^m_t}\int_0^\tau\int_{(n^{m,t}_k-1)\tau+\rho}^{[(\tilde n^{m,t}_{k+2}-1)\tau+\rho]\wedge t} f(s)\,ds\,d\rho$

$= \sum_{k\in K^m_t}\int_0^\tau\Bigg[\int_{[(\tilde n^{m,t}_{k+1}-1)\tau+\rho]\wedge t}^{[(\tilde n^{m,t}_{k+2}-1)\tau+\rho]\wedge t} f(s)\,ds + \int_{(n^{m,t}_k-1)\tau+\rho}^{[(\tilde n^{m,t}_{k+1}-1)\tau+\rho]\wedge t} f(s)\,ds\Bigg]d\rho$

$= \int_0^\tau\sum_{k\in K^m_t}\Bigg[\int_{[(\tilde n^{m,t}_{k+1}-1)\tau+\rho]\wedge t}^{[(\tilde n^{m,t}_{k+2}-1)\tau+\rho]\wedge t} f(s)\,ds + \int_{(n^{m,t}_k-1)\tau+\rho}^{[(\tilde n^{m,t}_{k+1}-1)\tau+\rho]\wedge t} f(s)\,ds\Bigg]d\rho$

(58)  $\le \int_0^\tau 2\int_0^t f(s)\,ds\,d\rho = 2\tau\int_0^t f(s)\,ds.$
To handle the $\Phi^{m,t,-}_k$ terms, one can employ the second inequality of Lemma 6.4. In particular, one finds

$\sum_{k\in K^m_t}\int_{(n^{m,t}_k-1)\tau}^{n^{m,t}_k\tau}\int_{\Phi^{m,t,-}_k(r)}^r f(\rho)\,d\rho\,dr \le \sum_{k\in K^m_t}\int_{(n^{m,t}_k-1)\tau}^{n^{m,t}_k\tau}\int_{[(\hat n^{m,t}_{k-M} - n^{m,t}_k)\tau + r]\vee 0}^r f(\rho)\,d\rho\,dr,$

where $\hat n^{m,t}_k = n^{m,t}_k$ if $k\ge 1$ and $\hat n^{m,t}_k = -\infty$ otherwise, and again with a change of variables, this becomes

$\sum_{k\in K^m_t}\int_0^\tau\int_{[(\hat n^{m,t}_{k-M}-1)\tau+\rho]\vee 0}^{(n^{m,t}_k-1)\tau+\rho} f(s)\,ds\,d\rho = \sum_{k\in K^m_t}\int_0^\tau\sum_{m'=-M}^{-1}\int_{[(\hat n^{m,t}_{k+m'}-1)\tau+\rho]\vee 0}^{[(\hat n^{m,t}_{k+m'+1}-1)\tau+\rho]\vee 0} f(s)\,ds\,d\rho$

$= \int_0^\tau\sum_{m'=-M}^{-1}\sum_{k\in K^m_t}\int_{[(\hat n^{m,t}_{k+m'}-1)\tau+\rho]\vee 0}^{[(\hat n^{m,t}_{k+m'+1}-1)\tau+\rho]\vee 0} f(s)\,ds\,d\rho \le \int_0^\tau\sum_{m'=-M}^{-1}\int_0^t f(s)\,ds\,d\rho$

(59)  $= M\tau\int_0^t f(s)\,ds.$

Applying Lemma 6.5 to (51) yields

(60)  $\frac12|\xi^\varepsilon_t - \bar\xi^\varepsilon_t|^2 \le \sum_{m\in\mathcal{M}}\Bigg\{(M+2)\tau\int_0^t K_1\big(|\xi^\varepsilon_\rho|^2 + |\bar\xi^\varepsilon_\rho|^2 + |w^\varepsilon_\rho|^2\big)\,d\rho$
$\qquad + \int_{I^{m,t}_f} e^{2c_A(r-t)}(\xi^\varepsilon_r - \bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r\,dr - \int_{\bar I^{m,t}_f} e^{2c_A(r-t)}(\xi^\varepsilon_r - \bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r\,dr\Bigg\}.$

Now, by the system structure given by Assumption Block (A.m) and by the fact that the $V^m$'s are in $\mathcal{G}_\delta$, one obtains the following lemmas exactly as in [36].

Lemma 6.6. For any $t<\infty$ and $\varepsilon>0$,

$\|w^\varepsilon\|^2_{L_2(0,t)} \le \frac{2\varepsilon}{\delta} + \frac{1}{\delta}\left(\frac{c_A\gamma^2}{c^2_\sigma} + \frac{c_D}{c_A}\right)|x|^2.$

Lemma 6.7. For any $t<\infty$ and $\varepsilon>0$,

$\int_0^t |\xi^\varepsilon_r|^2\,dr \le \left(\frac{c^2_\sigma c_D}{\delta c^2_A} + \frac{2\varepsilon}{\delta}\frac{c^2_\sigma}{c_A} + \frac{\gamma^2 c^2_\sigma}{c^2_A} + \frac{1}{c_A}\right)(1+|x|^2)$

and

$\int_0^t |\bar\xi^\varepsilon_r|^2\,dr \le \left(\frac{c^2_\sigma c_D}{\delta c^2_A} + \frac{2\varepsilon}{\delta}\frac{c^2_\sigma}{c_A} + \frac{\gamma^2 c^2_\sigma}{c^2_A} + \frac{1}{c_A}\right)(1+|x|^2).$

Applying Lemmas 6.6 and 6.7 to (60), one obtains

(61)  $\frac12|\xi^\varepsilon_t - \bar\xi^\varepsilon_t|^2 \le K_2(1+|x|^2)\tau + \sum_{m\in\mathcal{M}}\left[\int_{I^{m,t}_f} e^{2c_A(r-t)}(\xi^\varepsilon_r - \bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r\,dr - \int_{\bar I^{m,t}_f} e^{2c_A(r-t)}(\xi^\varepsilon_r - \bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r\,dr\right]$

for proper choice of $K_2$ independent of $x$, $t$, and $\hat\varepsilon\le 1$.

Now we deal with the end parts (the integrals over $I^{m,t}_f$ and $\bar I^{m,t}_f$). By Lemma 6.3, for any $m\in\mathcal{M}$, $\lambda^m_t - \bar\lambda^m_t \le (M-1)\tau$.
By the monotonicity of $\lambda^m_\cdot$, this implies

$\lambda^m_t - \bar\lambda^m_{N_t\tau} \le M\tau,$

and this implies $\mathcal{L}(I^{m,t}_f) \le M\tau$ for all $t\in[0,T]$. Also, by the definition of $\bar I^{m,t}_f$, $\mathcal{L}(\bar I^{m,t}_f) \le t - N_t\tau \le \tau$ for all $m\in\mathcal{M}$ and $t\in[0,T]$. Also, note that

$\frac{d}{dt}\big[|\xi^\varepsilon_t|^2\big] \le -2c_A|\xi^\varepsilon_t|^2 + 2c_\sigma|\xi^\varepsilon_t||w^\varepsilon_t| \le -c_A|\xi^\varepsilon_t|^2 + \frac{c^2_\sigma}{c_A}|w^\varepsilon_t|^2,$

which implies

(62)  $|\xi^\varepsilon_t|^2 \le |x|^2 + \frac{c^2_\sigma}{c_A}\|w^\varepsilon\|^2$

for all $t\in[0,T]$. Combining this with Lemma 6.6 implies

(63)  $|\xi^\varepsilon_t|^2 \le K_3(1+|x|^2)$ for all $t\ge 0$

for appropriate choice of $K_3$ independent of $\hat\varepsilon\le 1$ and $t\in[0,\infty)$. Similarly, one obtains

(64)  $|\bar\xi^\varepsilon_t|^2 \le K_3(1+|x|^2)$ for all $t\ge 0$,

independent of $\hat\varepsilon\le 1$ and $t\in[0,\infty)$. Now note that

$\left|\int_{I^{m,t}_f} e^{2c_A(r-t)}(\xi^\varepsilon_r - \bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r\,dr - \int_{\bar I^{m,t}_f} e^{2c_A(r-t)}(\xi^\varepsilon_r - \bar\xi^\varepsilon_r)'A^m\xi^\varepsilon_r\,dr\right|$
$\le \int_{I^{m,t}_f} \frac{d_A}{2}\big(3|\xi^\varepsilon_r|^2 + |\bar\xi^\varepsilon_r|^2\big)\,dr + \int_{\bar I^{m,t}_f} \frac{d_A}{2}\big(3|\xi^\varepsilon_r|^2 + |\bar\xi^\varepsilon_r|^2\big)\,dr$

(where we recall $d_A = \max_{m\in\mathcal{M}}|A^m|$), which by (63) and (64),

$\le \int_{I^{m,t}_f} 2d_A K_3(1+|x|^2)\,dr + \int_{\bar I^{m,t}_f} 2d_A K_3(1+|x|^2)\,dr,$

which by the remarks just above,

(65)  $\le 2(M+1)d_A K_3(1+|x|^2)\tau.$

Combining (61) and (65) yields

(66)  $\frac12|\xi^\varepsilon_t - \bar\xi^\varepsilon_t|^2 \le K_4(1+|x|^2)\tau$

for proper choice of $K_4$ independent of $x$, $t$, and $\hat\varepsilon\le 1$.

Now that we have a bound on $\frac12|\xi^\varepsilon_t - \bar\xi^\varepsilon_t|^2$, we also obtain a bound on $\int_0^t \frac12|\xi^\varepsilon_r - \bar\xi^\varepsilon_r|^2\,dr$ in a similar fashion. Note that

$\frac{d}{dt}\int_0^t \frac12|\xi^\varepsilon_r - \bar\xi^\varepsilon_r|^2\,dr = \frac12|\xi^\varepsilon_t - \bar\xi^\varepsilon_t|^2,$

which, using $\xi^\varepsilon_0 - \bar\xi^\varepsilon_0 = 0$,

$= \int_0^t (\xi^\varepsilon_r - \bar\xi^\varepsilon_r)'\big(A^{\mu^\varepsilon_r}\xi^\varepsilon_r - A^{\bar\mu^\varepsilon_r}\bar\xi^\varepsilon_r\big)\,dr$
$= \int_0^t (\xi^\varepsilon_r - \bar\xi^\varepsilon_r)'\big(A^{\mu^\varepsilon_r} - A^{\bar\mu^\varepsilon_r}\big)\xi^\varepsilon_r\,dr + \int_0^t (\xi^\varepsilon_r - \bar\xi^\varepsilon_r)'A^{\bar\mu^\varepsilon_r}\big(\xi^\varepsilon_r - \bar\xi^\varepsilon_r\big)\,dr$
$\le -2c_A\int_0^t \frac12|\xi^\varepsilon_r - \bar\xi^\varepsilon_r|^2\,dr + \int_0^t (\xi^\varepsilon_r - \bar\xi^\varepsilon_r)'\big(A^{\mu^\varepsilon_r} - A^{\bar\mu^\varepsilon_r}\big)\xi^\varepsilon_r\,dr.$

In other words, letting $z_t = \int_0^t \frac12|\xi^\varepsilon_r - \bar\xi^\varepsilon_r|^2\,dr$, we have $\dot z_t \le -2c_A z_t + \int_0^t (\xi^\varepsilon_r - \bar\xi^\varepsilon_r)'(A^{\mu^\varepsilon_r} - A^{\bar\mu^\varepsilon_r})\xi^\varepsilon_r\,dr$, where $z_0 = 0$. Solving this differential inequality yields

(67)  $\int_0^t \frac12|\xi^\varepsilon_r - \bar\xi^\varepsilon_r|^2\,dr \le \int_0^t e^{-2c_A(t-r)}\int_0^r (\xi^\varepsilon_\rho - \bar\xi^\varepsilon_\rho)'\big(A^{\mu^\varepsilon_\rho} - A^{\bar\mu^\varepsilon_\rho}\big)\xi^\varepsilon_\rho\,d\rho\,dr.$
Now proceeding exactly as above, but without an $e^{-2c_A(r-\rho)}$ term in the integral (which was irrelevant in the above bound on $\frac12|\xi^\varepsilon_t - \bar\xi^\varepsilon_t|^2$), one finds

(68)  $\int_0^r (\xi^\varepsilon_\rho - \bar\xi^\varepsilon_\rho)'\big(A^{\mu^\varepsilon_\rho} - A^{\bar\mu^\varepsilon_\rho}\big)\xi^\varepsilon_\rho\,d\rho \le K_4(1+|x|^2)\tau$

for all $x\in\mathbb{R}^n$, $\hat\varepsilon\le 1$, and $r\in[0,\infty)$. Substituting (68) into (67) yields

(69)  $\int_0^t \frac12|\xi^\varepsilon_r - \bar\xi^\varepsilon_r|^2\,dr \le \int_0^t e^{-2c_A(t-r)}K_4(1+|x|^2)\tau\,dr \le \frac{K_4}{2c_A}(1+|x|^2)\tau.$

Now that we have the above bounds on these differences between $\xi^\varepsilon_t$ and $\bar\xi^\varepsilon_t$, we turn to bounding the difference between $\mathcal{S}_T[V^m]$ and $\bar{\mathcal{S}}^\tau_T[V^m]$. Note that

(70)  $\left[\int_0^T l^{\mu^\varepsilon_t}(\xi^\varepsilon_t) - \frac{\gamma^2}{2}|w^\varepsilon_t|^2\,dt + V^m(\xi^\varepsilon_T)\right] - \left[\int_0^T l^{\bar\mu^\varepsilon_t}(\bar\xi^\varepsilon_t) - \frac{\gamma^2}{2}|w^\varepsilon_t|^2\,dt + V^m(\bar\xi^\varepsilon_T)\right]$
$= \int_0^T \tfrac12\big[(\xi^\varepsilon_t)'D^{\mu^\varepsilon_t}\xi^\varepsilon_t - (\bar\xi^\varepsilon_t)'D^{\bar\mu^\varepsilon_t}\bar\xi^\varepsilon_t\big]\,dt + \tfrac12(\xi^\varepsilon_T)'P^m\xi^\varepsilon_T - \tfrac12(\bar\xi^\varepsilon_T)'P^m\bar\xi^\varepsilon_T.$

We work first with the integral terms on the right-hand side. One has

(71)  $\int_0^T (\xi^\varepsilon_t)'D^{\mu^\varepsilon_t}\xi^\varepsilon_t - (\bar\xi^\varepsilon_t)'D^{\bar\mu^\varepsilon_t}\bar\xi^\varepsilon_t\,dt = \int_0^T (\xi^\varepsilon_t)'\big(D^{\mu^\varepsilon_t} - D^{\bar\mu^\varepsilon_t}\big)\xi^\varepsilon_t\,dt + \int_0^T (\xi^\varepsilon_t)'D^{\bar\mu^\varepsilon_t}\xi^\varepsilon_t - (\bar\xi^\varepsilon_t)'D^{\bar\mu^\varepsilon_t}\bar\xi^\varepsilon_t\,dt.$

Also,

$\int_0^T (\xi^\varepsilon_t)'D^{\bar\mu^\varepsilon_t}\xi^\varepsilon_t - (\bar\xi^\varepsilon_t)'D^{\bar\mu^\varepsilon_t}\bar\xi^\varepsilon_t\,dt = \int_0^T (\xi^\varepsilon_t + \bar\xi^\varepsilon_t)'D^{\bar\mu^\varepsilon_t}(\xi^\varepsilon_t - \bar\xi^\varepsilon_t)\,dt$
$\le \int_0^T |D^{\bar\mu^\varepsilon_t}|\big(|\xi^\varepsilon_t - \bar\xi^\varepsilon_t|^2 + 2|\xi^\varepsilon_t||\xi^\varepsilon_t - \bar\xi^\varepsilon_t|\big)\,dt$
$\le c_D\left[\int_0^T |\xi^\varepsilon_t - \bar\xi^\varepsilon_t|^2\,dt + 2\|\xi^\varepsilon_\cdot\|_{L_2(0,T)}\left(\int_0^T |\xi^\varepsilon_t - \bar\xi^\varepsilon_t|^2\,dt\right)^{1/2}\right],$

which by (69) and Lemma 6.7,

(72)  $\le c_D\left[2K_5(1+|x|^2)\tau + 2\big(K_6(1+|x|^2)\big)^{1/2}\big(2K_5(1+|x|^2)\tau\big)^{1/2}\right] \le K_7(1+|x|^2)(\tau+\sqrt{\tau})$

for appropriate $K_5$, $K_6$, and $K_7$ independent of $x, T, \tau$, and $\hat\varepsilon\le 1$.

Now we turn to the first integral on the right-hand side of (71). Bounding this term is more difficult, and we use techniques similar to those used in bounding $|\xi^\varepsilon_t - \bar\xi^\varepsilon_t|$ above. Note that

$\int_0^T (\xi^\varepsilon_t)'\big(D^{\mu^\varepsilon_t} - D^{\bar\mu^\varepsilon_t}\big)\xi^\varepsilon_t\,dt = \sum_{m\in\mathcal{M}}\left[\int_{I^m_T} (\xi^\varepsilon_t)'D^m\xi^\varepsilon_t\,dt - \int_{\bar I^m_T} (\xi^\varepsilon_t)'D^m\xi^\varepsilon_t\,dt\right]$
$= \sum_{m\in\mathcal{M}}\Bigg\{\sum_{k\in K^m_T}\left[\int_{I^{m,T}_k} (\xi^\varepsilon_t)'D^m\xi^\varepsilon_t\,dt - \int_{(n^{m,T}_k-1)\tau}^{n^{m,T}_k\tau} (\xi^\varepsilon_t)'D^m\xi^\varepsilon_t\,dt\right] + \int_{I^{m,T}_f} (\xi^\varepsilon_t)'D^m\xi^\varepsilon_t\,dt - \int_{\bar I^{m,T}_f} (\xi^\varepsilon_t)'D^m\xi^\varepsilon_t\,dt\Bigg\},$
which as in (49)-(51),

$= \sum_{m\in\mathcal{M}}\Bigg\{\sum_{k\in K^m_T}\int_{(n^{m,T}_k-1)\tau}^{n^{m,T}_k\tau}\Big[\big(\xi^\varepsilon_{\Phi^{m,T}_k(t)}\big)'D^m\xi^\varepsilon_{\Phi^{m,T}_k(t)} - (\xi^\varepsilon_t)'D^m\xi^\varepsilon_t\Big]\,dt + \int_{I^{m,T}_f} (\xi^\varepsilon_t)'D^m\xi^\varepsilon_t\,dt - \int_{\bar I^{m,T}_f} (\xi^\varepsilon_t)'D^m\xi^\varepsilon_t\,dt\Bigg\}$

$= \sum_{m\in\mathcal{M}}\Bigg\{\sum_{k\in K^m_T}\int_{(n^{m,T}_k-1)\tau}^{n^{m,T}_k\tau}\int_t^{\Phi^{m,T}_k(t)} 2(\xi^\varepsilon_r)'D^m\big(A^{\mu^\varepsilon_r}\xi^\varepsilon_r + \sigma w^\varepsilon_r\big)\,dr\,dt + \int_{I^{m,T}_f} (\xi^\varepsilon_t)'D^m\xi^\varepsilon_t\,dt - \int_{\bar I^{m,T}_f} (\xi^\varepsilon_t)'D^m\xi^\varepsilon_t\,dt\Bigg\}$

$\le \sum_{m\in\mathcal{M}}\Bigg\{\sum_{k\in K^m_T}\int_{(n^{m,T}_k-1)\tau}^{n^{m,T}_k\tau}\left|\int_t^{\Phi^{m,T}_k(t)} 2K_8\big(|\xi^\varepsilon_r|^2 + |w^\varepsilon_r|^2\big)\,dr\right|dt + \int_{I^{m,T}_f} (\xi^\varepsilon_t)'D^m\xi^\varepsilon_t\,dt - \int_{\bar I^{m,T}_f} (\xi^\varepsilon_t)'D^m\xi^\varepsilon_t\,dt\Bigg\}$

for appropriate choice of $K_8$ independent of $x, T, \tau$, and $\hat\varepsilon\le 1$. By Lemma 6.5, this implies

$\int_0^T (\xi^\varepsilon_t)'\big(D^{\mu^\varepsilon_t} - D^{\bar\mu^\varepsilon_t}\big)\xi^\varepsilon_t\,dt \le (M+2)\tau\int_0^T K_8\big(|\xi^\varepsilon_r|^2 + |w^\varepsilon_r|^2\big)\,dr + \sum_{m\in\mathcal{M}}\left[\int_{I^{m,T}_f} (\xi^\varepsilon_t)'D^m\xi^\varepsilon_t\,dt - \int_{\bar I^{m,T}_f} (\xi^\varepsilon_t)'D^m\xi^\varepsilon_t\,dt\right],$

which, by Lemmas 6.6 and 6.7,

$\le K_9(1+|x|^2)\tau + \sum_{m\in\mathcal{M}}\left[\int_{I^{m,T}_f} (\xi^\varepsilon_t)'D^m\xi^\varepsilon_t\,dt - \int_{\bar I^{m,T}_f} (\xi^\varepsilon_t)'D^m\xi^\varepsilon_t\,dt\right]$

for appropriate choice of $K_9$ independent of $x, T, \tau$, and $\hat\varepsilon\le 1$. Then, applying the same steps as used in bounding $|\xi^\varepsilon_t - \bar\xi^\varepsilon_t|$, this is

(73)  $\le K_{10}(1+|x|^2)\tau$

for appropriate choice of $K_{10}$ independent of $x, T, \tau$, and $\hat\varepsilon\le 1$. Combining (71), (72), and (73),

(74)  $\int_0^T (\xi^\varepsilon_t)'D^{\mu^\varepsilon_t}\xi^\varepsilon_t - (\bar\xi^\varepsilon_t)'D^{\bar\mu^\varepsilon_t}\bar\xi^\varepsilon_t\,dt \le K_7(1+|x|^2)(\tau+\sqrt{\tau}) + K_{10}(1+|x|^2)\tau.$

Now we turn to the last two terms in (70). As with the above integral terms,

$(\xi^\varepsilon_T)'P^m\xi^\varepsilon_T - (\bar\xi^\varepsilon_T)'P^m\bar\xi^\varepsilon_T \le |P^m|\big(|\xi^\varepsilon_T - \bar\xi^\varepsilon_T|^2 + 2|\xi^\varepsilon_T||\xi^\varepsilon_T - \bar\xi^\varepsilon_T|\big),$

which by (66) and (64),

$\le c_P\left[2K_4(1+|x|^2)\tau + 2\big(K_3(1+|x|^2)\big)^{1/2}\big(2K_4(1+|x|^2)\tau\big)^{1/2}\right],$

where $c_P = \max_{m\in\mathcal{M}}|P^m|$,

(75)  $\le K_{11}(1+|x|^2)(\tau+\sqrt{\tau})$

for appropriate choice of $K_{11}$ independent of $x, T, \tau, m$, and $\hat\varepsilon\le 1$.
Combining (70), (74), and (75),

(76)  $\left[\int_0^T l^{\mu^\varepsilon_t}(\xi^\varepsilon_t) - \frac{\gamma^2}{2}|w^\varepsilon_t|^2\,dt + V^m(\xi^\varepsilon_T)\right] - \left[\int_0^T l^{\bar\mu^\varepsilon_t}(\bar\xi^\varepsilon_t) - \frac{\gamma^2}{2}|w^\varepsilon_t|^2\,dt + V^m(\bar\xi^\varepsilon_T)\right] \le K_\delta(1+|x|^2)(\tau+\sqrt{\tau})$

for appropriate choice of $K_\delta$ independent of $x, T, \tau, m$, and $\hat\varepsilon\le 1$. Combining (38) and (76), one has

$\mathcal{S}_T[V^m](x) - \left[\int_0^T l^{\bar\mu^\varepsilon_t}(\bar\xi^\varepsilon_t) - \frac{\gamma^2}{2}|w^\varepsilon_t|^2\,dt + V^m(\bar\xi^\varepsilon_T)\right] \le \frac{\hat\varepsilon}{2}(1+|x|^2) + K_\delta(1+|x|^2)(\tau+\sqrt{\tau}).$

This implies

$\mathcal{S}_T[V^m](x) - \bar{\mathcal{S}}^\tau_T[V^m](x) \le \frac{\hat\varepsilon}{2}(1+|x|^2) + K_\delta(1+|x|^2)(\tau+\sqrt{\tau}),$

and since this is true for all $\hat\varepsilon\in(0,1]$, we finally obtain

$\mathcal{S}_T[V^m](x) - \bar{\mathcal{S}}^\tau_T[V^m](x) \le K_\delta(1+|x|^2)(\tau+\sqrt{\tau}),$

which completes the proof of Theorem 6.1.

Recall that a reasonable initialization is $\bar V^0 = \bigoplus_{m\in\mathcal{M}} V^m$. Using the max-plus linearity of $\mathcal{S}_\tau$, one easily sees that Theorem 6.1 implies the following.

Corollary 6.8. There exists $K_\delta < \infty$ such that, for all sufficiently small $\tau > 0$,

$0 \le \mathcal{S}_T[\bar V^0](x) - \bar{\mathcal{S}}^\tau_T[\bar V^0](x) \le K_\delta(1+|x|^2)(\tau+\sqrt{\tau})$

for all $T\in(0,\infty)$ and $x\in\mathbb{R}^n$.

7. Finite-time truncation errors. Returning to our main subject, we want to estimate the speed of convergence of $\bar{\mathcal{S}}^\tau_T[\bar V^0](x)$ toward $V(x)$ as $T\to\infty$ and $\tau\downarrow 0$. In the previous section we obtained an estimate of the convergence of $\bar{\mathcal{S}}^\tau_T[\bar V^0](x)$ to $\mathcal{S}_T[\bar V^0](x)$. We now need to evaluate the difference between $\mathcal{S}_T[\bar V^0](x)$ and $V(x)$ as $T\to\infty$.

Theorem 7.1. For all $\delta > 0$ satisfying $V\in\mathcal{G}_\delta$, $\bar V^0\in\mathcal{G}_\delta$, there exists $\bar K_\delta$ such that, for all $T>0$ and all $x\in\mathbb{R}^n$,

(77)  $\big|\mathcal{S}_T[\bar V^0](x) - V(x)\big| \le \frac{\bar K_\delta}{T}(1+|x|^2).$

Proof. The proof is similar to results in [28], [29], [36]. In particular, let $\delta > 0$ be such that, with $\bar\gamma^2 = (\gamma-\delta)^2$, one has $\frac{\bar\gamma^2 c^2_A}{c^2_\sigma c_D} > 1$, and suppose $\bar V^0\in\mathcal{G}_\delta$, i.e.,

(78)  $0 \le \bar V^0(x) \le \frac{c_A\bar\gamma^2}{2c^2_\sigma}|x|^2.$

Recall the integral bound from Lemma 6.7 (which held for all $t<\infty$ and $\varepsilon\in(0,1]$).
As in [28] and [36], applying the bound to the integral between $\bar T/2$ and $\bar T$, one finds that, for any $\bar T > 0$, there exists $T\in[\bar T/2,\bar T]$ such that

(79)  $|\xi^\varepsilon_T|^2 \le \frac{2}{\bar T}\left(\frac{c^2_\sigma c_D}{\delta c^2_A} + \frac{2\varepsilon}{\delta}\frac{c^2_\sigma}{c_A} + \frac{\gamma^2 c^2_\sigma}{c^2_A} + \frac{1}{c_A}\right)(1+|x|^2).$

Also, for all $t\ge\bar T$ one has by (62),

$|\xi^\varepsilon_t|^2 \le |\xi^\varepsilon_T|^2 + \frac{c^2_\sigma}{c_A}\|w^\varepsilon\|^2_{L_2[T,t]},$

which by Lemma 6.6,

$\le \frac{2\varepsilon}{\delta}\frac{c^2_\sigma}{c_A} + \left(1 + \frac{\gamma^2}{\delta} + \frac{c^2_\sigma c_D}{\delta c^2_A}\right)|\xi^\varepsilon_T|^2.$

Replacing $|\xi^\varepsilon_T|^2$ by its bound in (79), we find

(80)  $|\xi^\varepsilon_t|^2 \le \frac{2\varepsilon}{\delta}\frac{c^2_\sigma}{c_A} + C_1 C_2\frac{1}{\bar T}(1+|x|^2),$

where $C_1 = 1 + \frac{\gamma^2}{\delta} + \frac{c^2_\sigma c_D}{\delta c^2_A}$ and $C_2 = 2\left(\frac{c^2_\sigma c_D}{\delta c^2_A} + \frac{2\varepsilon}{\delta}\frac{c^2_\sigma}{c_A} + \frac{\gamma^2 c^2_\sigma}{c^2_A} + \frac{1}{c_A}\right)$.

Finally, for any given $T > 0$ and any $w^\varepsilon,\mu^\varepsilon$ $\varepsilon$-optimal on $[0,T]$,

$\mathcal{S}_T[\bar V^0](x) \le J(x,T,w^\varepsilon,\mu^\varepsilon) + \varepsilon \le \left[\int_0^T l^{\mu^\varepsilon_t}(\xi^\varepsilon_t) - \frac{\gamma^2}{2}|w^\varepsilon_t|^2\,dt + \bar V^0(\xi^\varepsilon_T)\right] + \varepsilon,$

which by (78), (80), and $V\ge 0$,

$\le \left[\int_0^T l^{\mu^\varepsilon_t}(\xi^\varepsilon_t) - \frac{\gamma^2}{2}|w^\varepsilon_t|^2\,dt + V(\xi^\varepsilon_T)\right] + \varepsilon + \frac{c_A\bar\gamma^2}{2c^2_\sigma}\left[\frac{2\varepsilon}{\delta}\frac{c^2_\sigma}{c_A} + C_1C_2\frac{1}{T}(1+|x|^2)\right],$

where the bracketed bound on $\bar V^0(\xi^\varepsilon_T)$ follows from Assumption Block (A.m) and (80). Since $\int_0^T l^{\mu^\varepsilon_t}(\xi^\varepsilon_t) - \frac{\gamma^2}{2}|w^\varepsilon_t|^2\,dt + V(\xi^\varepsilon_T) \le \mathcal{S}_T[V](x) = V(x)$, this is

$\le V(x) + \varepsilon + \frac{c_A\bar\gamma^2}{2c^2_\sigma}\left[\frac{2\varepsilon}{\delta}\frac{c^2_\sigma}{c_A} + C_1C_2\frac{1}{T}(1+|x|^2)\right].$

Since this is true for all $\varepsilon > 0$, one sees that for all $T > 0$,

$\mathcal{S}_T[\bar V^0](x) \le V(x) + \frac{c_A\bar\gamma^2}{2c^2_\sigma}C_1C_2\frac{1}{T}(1+|x|^2),$

which for proper choice of $\bar K_\delta$,

$\le V(x) + \frac{\bar K_\delta}{T}(1+|x|^2).$

We note that we can repeat exactly the same reasoning with $V(x) = \mathcal{S}_T[V](x)$ on the left side of the inequalities and $\mathcal{S}_T[\bar V^0](x)$ on the right side. Hence, we obtain the result

$\big|\mathcal{S}_T[\bar V^0](x) - V(x)\big| \le \frac{\bar K_\delta}{T}(1+|x|^2).$

8. Combined errors. We are now able to give a precise estimate of the convergence of $\bar{\mathcal{S}}^\tau_T[\bar V^0]$ toward $V$ as $\tau\downarrow 0$ and $T\to\infty$.
Indeed, we have

$\bar{\mathcal{S}}^\tau_T[\bar V^0](x) \le V(x) = \mathcal{S}_T[V](x) \le \mathcal{S}_T[\bar V^0](x) + \frac{\bar K_\delta}{T}(1+|x|^2) \le \bar{\mathcal{S}}^\tau_T[\bar V^0](x) + \frac{\bar K_\delta}{T}(1+|x|^2) + K_\delta(1+|x|^2)(\tau+\sqrt{\tau}).$

For example, if we want $0 \le V(x) - \bar{\mathcal{S}}^\tau_T[\bar V^0](x) \le 2\varepsilon(1+|x|^2)$, we can choose
(1) $T \ge \bar K_\delta/\varepsilon$,
(2) $\tau$ such that $\tau + \sqrt{\tau} \le \varepsilon/K_\delta$.
Suppose that such a $\tau$ satisfies $\tau\le 1$. Then item (2) becomes $\tau \le [\varepsilon/(2K_\delta)]^2$. Hence, to get an approximation of order $\varepsilon$, it is sufficient to have $N = T/\tau \propto \varepsilon^{-3}$.

9. The theoretical and the practical. There are two aspects to this work. One is the theoretical result that there exists a numerical method (for this class of problems) which is not subject to the curse-of-dimensionality. The second is the question of the practicality of the new approach. As with the interior-point methods developed for linear programming, where construction of increasingly fast methods required further advances over the initial algorithm concept, it is clear that this will be the case here as well. We will discuss the main theoretical computational-cost bound and then turn to some remarks on the issue of practical implementation.

Suppose one begins the algorithm with an initial $\bar V^0$, or equivalently an initial $a^0$, of the form $\bar V^0 = \bigoplus_{j=1}^{J_0} V^0_j$ for some set of initial quadratic forms $\mathcal{V}^0 = \{V^0_j\}_{j=1}^{J_0}$. Let $\mathcal{J}_0 = ]1,J_0[$. From section 8, one sees that we can obtain an approximate solution $V^\varepsilon$ with error

$0 \le V - V^\varepsilon \le \varepsilon(1+|x|^2),$

where $V^\varepsilon = \bar{\mathcal{S}}^\tau_{N\tau}[\bar V^0]$, in $N = K/\varepsilon^3$ steps, where $K$ may depend on $c_A$, $A$, $c_D$, $c_\sigma$, $\gamma$, and the choice of $\mathcal{V}^0$. Then the number of elements in the initial set of triples is $\#\widetilde{\mathcal{Q}}_0 = J_0$. Recalling the algorithm from section 4, at the $(k+1)$st step, one generates each of the triples in $\widetilde{\mathcal{Q}}_{k+1}$ from the triples in $\widetilde{\mathcal{Q}}_k$ by matrix/vector operations requiring $C(1+n^3)$ operations, where we recall that $n$ is the space dimension and $C$ is a universal constant.
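The parameter selection at the end of section 8 can be sketched directly. In the snippet below (illustrative only; the constants $\bar K_\delta$ and $K_\delta$ are problem-dependent and simply set to 1 here), $T$ and $\tau$ are chosen from $\varepsilon$, and the iteration count $N$ exhibits the $\varepsilon^{-3}$ growth.

```python
import math

def choose_T_tau(eps, Kbar_delta=1.0, K_delta=1.0):
    """Pick horizon T and step tau so each error source is at most
    eps*(1+|x|^2): T >= Kbar_delta/eps handles the finite-horizon
    truncation, and tau <= [eps/(2*K_delta)]**2 (valid once tau <= 1)
    handles the time-discretization, since tau + sqrt(tau) <= 2*sqrt(tau)."""
    T = Kbar_delta / eps
    tau = (eps / (2.0 * K_delta)) ** 2
    assert tau <= 1.0 and tau + math.sqrt(tau) <= eps / K_delta
    N = math.ceil(T / tau)
    return T, tau, N

for eps in (0.5, 0.25, 0.125):
    T, tau, N = choose_T_tau(eps)
    print(f"eps={eps}: T={T:.1f}, tau={tau:.2e}, N={N}")  # N grows like eps**-3
```

Halving $\varepsilon$ multiplies $N$ by eight, as expected from $N \propto \varepsilon^{-3}$.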
Note that there are $J_0 M^k$ triples in $\widetilde{\mathcal{Q}}_k$ at each step $k$. Consequently, we have the following result, which is of theoretical importance as it clearly states the freedom from the curse-of-dimensionality.

Theorem 9.1. Suppose one initializes with $\mathcal{V}^0 = \{V^0_j\}_{j=1}^{J_0}$ (where $\bar V^0 = \bigoplus_{j=1}^{J_0} V^0_j$). Then the number of arithmetic operations required to obtain a solution with error no greater than $\varepsilon(1+|x|^2)$ over the entire space is bounded by

$\sum_{k=0}^{N_\varepsilon} C J_0 M^k (1+n^3),$

where $N_\varepsilon = K/\varepsilon^3$, $K$ may depend on $c_A$, $A$, $c_D$, $c_\sigma$, $\gamma$, and the choice of $\mathcal{V}^0$, and $C$ is universal.

Two remarks regarding practicality are appropriate here. First, this approach is clearly most appropriate when one is not attempting to obtain a solution with extremely small errors, due to the curse-of-complexity (i.e., the exponential growth with base $M$). Second, direct implementation of the algorithm of section 4 is likely not reasonable without some technique for mitigation of the exponential growth rate. We note the following techniques for reduction of the computational complexity, without which the algorithm is not practical.

9.1. Initialization. In practice, we have found that the initialization $\bar V^0 = \bigoplus_{m=1}^{M} V^m$ (where we recall that each $V^m(x) = \frac12 x'P^m x$ and $P^m$ is the solution of the Riccati equation for the $m$th constituent Hamiltonian) greatly reduces the computation over the use of the initialization $\bar V^0 \equiv 0$. The savings are due to the reduction of $N_\varepsilon$.

9.2. Pruning. We have also found that quite often the overwhelming majority of the $J_0 M^k$ triples at the $k$th step do not contribute at all to $V^k$. That is, they never achieve the maximum at any point $x\in\mathbb{R}^n\setminus\{0\}$, and this provides an opportunity for reduction of computational cost.

Recall that with initialization $a^0 = \bigoplus_{j\in\mathcal{J}_0} a^0_j$, the solution at each step had the representation given by (35).
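The operation-count bound of Theorem 9.1 is simple to tabulate. The following sketch (with illustrative values of $M$, $J_0$, $N_\varepsilon$, and $C$) shows the cost growing only cubically in the state dimension $n$, while growing exponentially in the number of iterations, i.e., the curse-of-complexity.

```python
def op_bound(n, M, J0, N_eps, C=1.0):
    """Upper bound of Theorem 9.1 on arithmetic operations:
    sum_{k=0}^{N_eps} C * J0 * M**k * (1 + n**3).
    The cost is cubic in the state dimension n (no curse-of-dimensionality)
    but exponential in the number of iterations (curse-of-complexity)."""
    return C * J0 * (1 + n**3) * sum(M**k for k in range(N_eps + 1))

# Doubling the dimension scales the cost by roughly n^3, not exponentially:
for n in (2, 4, 8, 16):
    print(n, op_bound(n, M=3, J0=3, N_eps=10))
```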
We now introduce the additional notation

$q^k_{(j,m^k)} = \big(\mathcal{Q}^k_{(j,m^k)},\, z^k_{(j,m^k)},\, r^k_{(j,m^k)}\big)$

for all $j\in\mathcal{J}_0$ and $m^k\in\mathcal{M}^k$. Given any $q^k_{(\bar j,\bar m^k)}\in\widetilde{\mathcal{Q}}_k$, let the set of all its progeny at step $k+\kappa$ be denoted by

$\widetilde{\mathcal{Q}}^{k,\kappa}_{(\bar j,\bar m^k)} = \big\{q^{k+\kappa}_{(\bar j,\{\bar m^k,m^\kappa\})} \mid m^\kappa\in\mathcal{M}^\kappa\big\}.$

Let

$a^k_{(j,m^k)}(x) = \tfrac12\big(x - z^k_{(j,m^k)}\big)'\mathcal{Q}^k_{(j,m^k)}\big(x - z^k_{(j,m^k)}\big) + r^k_{(j,m^k)}.$

The following is obvious from the definition of the propagation.

Theorem 9.2. Suppose $(\bar j,\bar m^k)\in\mathcal{J}_0\times\mathcal{M}^k$ is such that

(81)  $a^k_{(\bar j,\bar m^k)}(x) < a^k(x)$ for all $x\in\mathbb{R}^n\setminus\{0\}$.

Then, for all $\kappa > 0$ and all $m^\kappa\in\mathcal{M}^\kappa$,

$a^{k+\kappa}_{(\bar j,\{\bar m^k,m^\kappa\})}(x) < a^{k+\kappa}(x)$ for all $x\in\mathbb{R}^n\setminus\{0\}$.

Let $\widetilde{\mathcal{Q}}^*_k \subset \widetilde{\mathcal{Q}}_k$. We say $\widetilde{\mathcal{Q}}^*_k \prec \widetilde{\mathcal{Q}}_k$ if (81) holds for all $q^k_{(j,m^k)}\in\widetilde{\mathcal{Q}}^*_k$.

Corollary 9.3. Suppose $\widetilde{\mathcal{Q}}^*_k\subset\widetilde{\mathcal{Q}}_k$ satisfies $\widetilde{\mathcal{Q}}^*_k\prec\widetilde{\mathcal{Q}}_k$. Let

$\widetilde{\mathcal{Q}}^*_{k+\kappa} = \bigcup_{q^k_{(\bar j,\bar m^k)}\in\widetilde{\mathcal{Q}}^*_k}\widetilde{\mathcal{Q}}^{k,\kappa}_{(\bar j,\bar m^k)}$

and $\widetilde{\mathcal{Q}}^1_{k+\kappa} = \widetilde{\mathcal{Q}}_{k+\kappa}\setminus\widetilde{\mathcal{Q}}^*_{k+\kappa}$. Let $V^{1,k+\kappa}$ be the solution generated by $\widetilde{\mathcal{Q}}^1_{k+\kappa}$. Then

$V^{1,k+\kappa}(x) = V^{k+\kappa}(x)$ for all $x\in\mathbb{R}^n$.

Let us say that $q^k_{(\bar j,\bar m^k)}$ is strictly inactive if (81) holds. Then the corollary implies that one may prune any and all strictly inactive elements at any and all steps $k$ with no repercussions. To give some idea of the computational savings with such pruning, suppose that one happened to remove the same fraction $f$ at each step of the iteration. Then the fractional computational reduction from that given by Theorem 9.1 would be

$\frac{\sum_{k=0}^{N_\varepsilon}[(1-f)M]^k}{\sum_{k=0}^{N_\varepsilon}M^k},$

which is (very) roughly $(1-f)^{N_\varepsilon}$. In some of the examples tested so far, $1-f$ was small, and so this type of pruning has been useful in practical implementation of the approach.

It should be noted that condition (81) is not obviously easy to check. Instead we utilize the simple, conservative test where we prune $q^k_{(\bar j,\bar m^k)}$ if there exists $(j,m^k)$ such that

$a^k_{(\bar j,\bar m^k)}(x) < a^k_{(j,m^k)}(x)$ for all $x\in\mathbb{R}^n\setminus\{0\}.$

(Note that this test may be done analytically.)
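The pairwise test above amounts to asking whether the difference of two quadratics is positive on $\mathbb{R}^n\setminus\{0\}$, which can be decided from its Hessian and its minimum. The following Python sketch is an illustration of one such sufficient check, not the authors' implementation; the tolerance and the handling of borderline cases are our own (conservative) choices.

```python
import numpy as np

def quad(Q, z, r):
    """Return a(x) = 0.5*(x-z)' Q (x-z) + r as a callable."""
    return lambda x: 0.5 * (x - z) @ Q @ (x - z) + r

def strictly_dominated(Q1, z1, r1, Q2, z2, r2, tol=1e-10):
    """Decide (sufficiently, not necessarily) whether a1(x) < a2(x) for all
    x != 0, where a_i(x) = 0.5*(x-z_i)' Q_i (x-z_i) + r_i.  The difference
    d = a2 - a1 is itself a quadratic; we require its Hessian Q2 - Q1 to be
    positive definite and its global minimum to be positive (or zero,
    attained only at x = 0)."""
    D = Q2 - Q1
    if np.min(np.linalg.eigvalsh(D)) <= tol:
        return False                       # Hessian of d not positive definite
    # Gradient of d: D x - (Q2 z2 - Q1 z1) = 0 at the minimizer.
    b = Q2 @ z2 - Q1 @ z1
    xstar = np.linalg.solve(D, b)
    dmin = quad(Q2, z2, r2)(xstar) - quad(Q1, z1, r1)(xstar)
    return dmin > tol or (abs(dmin) <= tol and np.linalg.norm(xstar) <= tol)

# a1 = 0.5|x|^2 sits strictly below a2 = |x|^2 away from the origin:
I = np.eye(2)
print(strictly_dominated(I, np.zeros(2), 0.0, 2 * I, np.zeros(2), 0.0))  # True
print(strictly_dominated(2 * I, np.zeros(2), 0.0, I, np.zeros(2), 0.0))  # False
```

A `True` result licenses pruning the first quadratic; a `False` result is inconclusive, in keeping with the conservative nature of the test.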
In spite of the apparent conservativeness of this test, it has produced excellent computational savings over implementation without pruning in some (but not all) cases.

9.3. Overpruning. Note that the algorithm converges to $V$ from any $\bar V^0 = \bigoplus_{j\in\mathcal{J}_0} V^0_j \in\mathcal{G}_\delta$ for sufficiently small $\delta$ (with quadratic $V^0_j$, of course). Suppose that for steps $k\in\{1,2,\dots,K_p\}$ (for some finite $K_p$), one pruned elements of $\widetilde{\mathcal{Q}}_k$ which were not necessarily strictly inactive. By viewing the resulting $V^{K_p}$ as a new initialization, we see that the algorithm will nonetheless converge to the correct solution. Noting the curse-of-complexity, we see that this overpruning (removing potentially useful triples) may be an attractive approach.

In practice, we have employed this approach in a heuristic fashion by, for a fixed number of steps, removing all triples whose corresponding $a^k_{(j,m^k)}$ did not achieve the maximum over $(j,m^k)\in\mathcal{J}_0\times\mathcal{M}^k$ on some fixed, pre-specified, finite set $\mathcal{X}_p\subset\mathbb{R}^n$. For example, in some tests, we chose $\mathcal{X}_p$ to consist of the corners of the unit hypercube. Note that in this case $\#\mathcal{X}_p = 2^n$, and so by performing this overpruning for a fixed number of steps we are introducing a curse-of-dimensionality-dependent component to the computations. (However, the growth rate of $2^n$ is extremely slow relative to that for grid-based techniques.) In the examples so far tested, employing this purely heuristic pruning for quite a few steps led to tremendous improvements in computation time or, equivalently, multiple orders of magnitude reduction in solution error $\varepsilon(1+|x|^2)$.

This leads to the question of whether some approach that overpruned the $\widetilde{\mathcal{Q}}_k$, reducing the number of the (not strictly inactive) elements of $\widetilde{\mathcal{Q}}_k$ by some fraction going to zero as $k$ increased, might be a highly effective approach.
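The hypercube-corner heuristic just described can be sketched as follows (illustrative code; the example triples and the tolerance are our own choices). Only quadratics achieving the pointwise maximum at some corner of the unit hypercube are retained.

```python
import itertools
import numpy as np

def overprune(quads, n, tol=1e-12):
    """Heuristic overpruning sketch: keep only those quadratics
    a(x) = 0.5*(x-z)' Q (x-z) + r that achieve the pointwise maximum on the
    2^n corners of the unit hypercube (the test set X_p used in some of the
    experiments described above; `quads` is a list of (Q, z, r) triples)."""
    corners = [np.array(c) for c in itertools.product((0.0, 1.0), repeat=n)]
    def val(t, x):
        Q, z, r = t
        return 0.5 * (x - z) @ Q @ (x - z) + r
    keep = set()
    for x in corners:
        vals = [val(t, x) for t in quads]
        vmax = max(vals)
        keep.update(i for i, v in enumerate(vals) if v >= vmax - tol)
    return [quads[i] for i in sorted(keep)]

# Three quadratics in R^2; the middle one is dominated at every corner.
I = np.eye(2)
quads = [(I, np.zeros(2), 0.0),
         (0.5 * I, np.zeros(2), -5.0),   # never the max on the corners
         (2 * I, np.ones(2), 0.1)]
print(len(overprune(quads, 2)))  # 2
```

Note that a triple discarded here need not be strictly inactive; as discussed above, this is exactly the overpruning trade-off.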
Of course, one would use a pruning criterion that was not curse-of-dimensionality dependent.

Acknowledgments. The authors thank the referees for their substantial comments, which have led to a greatly improved paper.

REFERENCES

[1] M. Akian, S. Gaubert, and A. Lakhoua, The max-plus finite element method for solving deterministic optimal control problems: Basic properties and convergence analysis, SIAM J. Control Optim., 47 (2008), pp. 817-848.
[2] M. Akian, S. Gaubert, and A. Lakhoua, A max-plus finite element method for solving finite horizon deterministic optimal control problems, in Proceedings of the 16th International Symposium on Mathematical Theory of Networks and Systems, Leuven, Belgium, 2004.
[3] F. L. Baccelli, G. Cohen, G. J. Olsder, and J.-P. Quadrat, Synchronization and Linearity, John Wiley, New York, 1992.
[4] M. Bardi and I. Capuzzo-Dolcetta, Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations, Birkhäuser, Boston, 1997.
[5] M. Boué and P. Dupuis, Markov chain approximations for deterministic control problems with affine dynamics and quadratic cost in the control, SIAM J. Numer. Anal., 36 (1999), pp. 667-695.
[6] F. Camilli, M. Falcone, P. Lanucara, and A. Seghini, A domain decomposition method for Bellman equations, Contemp. Math., 180 (1994), pp. 477-483.
[7] E. Carlini, M. Falcone, and R. Ferretti, An efficient algorithm for Hamilton-Jacobi equations in high dimensions, Comput. Vis. Sci., 7 (2004), pp. 15-29.
[8] G. Cohen, S. Gaubert, and J.-P. Quadrat, Duality and separation theorems in idempotent semimodules, Linear Algebra Appl., 379 (2004), pp. 395-422.
[9] G. Collins and W. M. McEneaney, Min-plus eigenvector methods for nonlinear H∞ problems with active control, in Optimal Control, Stabilization and Nonsmooth Analysis, Lecture Notes in Control and Inform. Sci. 301, M. S. de Queiroz, M. Malisoff, and P. Wolenski, eds., Springer, New York, 2004, pp. 101-120.
[10] M. G. Crandall, H. Ishii, and P.-L.
Lions, Uniqueness of viscosity solutions of Hamilton-Jacobi equations revisited, J. Math. Soc. Japan, 39 (1987), pp. 581-595.
[11] M. G. Crandall and P.-L. Lions, Viscosity solutions of Hamilton-Jacobi equations, Trans. Amer. Math. Soc., 277 (1983), pp. 1-42.
[12] M. G. Crandall and P.-L. Lions, On existence and uniqueness of solutions of Hamilton-Jacobi equations, Nonlinear Anal. Theory Appl., 10 (1986), pp. 353-370.
[13] R. A. Cuninghame-Green, Minimax Algebra, Lecture Notes in Econom. Math. Systems 166, Springer, New York, 1979.
[14] P. Dupuis and A. Szpiro, Convergence of the optimal feedback policies in a numerical method for a class of deterministic optimal control problems, SIAM J. Control Optim., 40 (2001), pp. 393-420.
[15] L. C. Evans, Partial Differential Equations, Springer, New York, 1998.
[16] M. Falcone and R. Ferretti, Convergence analysis for a class of high-order semi-Lagrangian advection schemes, SIAM J. Numer. Anal., 35 (1998), pp. 909-940.
[17] W. H. Fleming, Max-plus stochastic processes, Appl. Math. Optim., 49 (2004), pp. 159-181.
[18] W. H. Fleming, Functions of Several Variables, Springer, New York, 1977.
[19] W. H. Fleming and W. M. McEneaney, A max-plus-based algorithm for a Hamilton-Jacobi-Bellman equation of nonlinear filtering, SIAM J. Control Optim., 38 (2000), pp. 683-710.
[20] W. H. Fleming and H. M. Soner, Controlled Markov Processes and Viscosity Solutions, Springer, New York, 1993.
[21] F. John, Partial Differential Equations, Springer, New York, 1978.
[22] J. W. Helton and M. R. James, Extending H∞ Control to Nonlinear Systems: Control of Nonlinear Systems to Achieve Performance Objectives, Adv. Des. Control 1, SIAM, Philadelphia, 1999.
[23] V. N. Kolokoltsov and V. P. Maslov, Idempotent Analysis and Its Applications, Kluwer, Norwell, MA, 1997.
[24] H. J. Kushner and P.
Dupuis, Numerical Methods for Stochastic Control Problems in Continuous Time, Springer, New York, 1992.
[25] G. L. Litvinov, V. P. Maslov, and G. B. Shpiz, Idempotent functional analysis: An algebraic approach, Math. Notes, 69 (2001), pp. 696-729.
[26] V. P. Maslov, On a new principle of superposition for optimization problems, Russian Math. Surveys, 42 (1987), pp. 43-54.
[27] W. M. McEneaney, A. Deshpande, and S. Gaubert, Curse-of-complexity attenuation in the curse-of-dimensionality-free method for HJB PDEs, in Proceedings of the American Control Conference 2008, Seattle, 2008.
[28] W. M. McEneaney, Max-Plus Methods for Nonlinear Control and Estimation, Systems Control Found. Appl., Birkhäuser Boston, Boston, 2006.
[29] W. M. McEneaney, A curse-of-dimensionality-free numerical method for solution of certain HJB PDEs, SIAM J. Control Optim., 46 (2007), pp. 1239-1276.
[30] W. M. McEneaney, Curse-of-dimensionality free method for Bellman PDEs with Hamiltonian written as maximum of quadratic forms, in Proceedings of the 44th IEEE Conference on Decision and Control, 2005.
[31] W. M. McEneaney, A curse-of-dimensionality-free numerical method for a class of HJB PDEs, in Proceedings of the 16th IFAC World Congress, 2005.
[32] W. M. McEneaney, Max-plus summation of Fenchel-transformed semigroups for solution of nonlinear Bellman equations, Systems Control Lett., 56 (2007), pp. 255-264.
[33] W. M. McEneaney and P. M. Dower, A max-plus affine power method for approximation of a class of mixed l∞/l2 value functions, in Proceedings of the 42nd IEEE Conference on Decision and Control, Maui, 2003, pp. 2573-2578.
[34] W. M. McEneaney, Max-plus eigenvector methods for nonlinear H∞ problems: Error analysis, SIAM J. Control Optim., 43 (2004), pp. 379-412.
[35] W. M. McEneaney, Error analysis of a max-plus algorithm for a first-order HJB equation, in Stochastic Theory and Control, Proceedings of a Workshop held in Lawrence, Kansas, 2001, Lecture Notes in Control and Inform.
Sci., B. Pasik-Duncan, ed., Springer, 2002, pp. 335-352.
[36] W. M. McEneaney, A uniqueness result for the Isaacs equation corresponding to nonlinear H∞ control, Math. Control Signals Systems, 11 (1998), pp. 303-334.
[37] R. T. Rockafellar, Conjugate Duality and Optimization, CBMS-NSF Regional Conf. Ser. Appl. Math. 16, SIAM, Philadelphia, 1974.
[38] R. T. Rockafellar and R. J. Wets, Variational Analysis, Springer, New York, 1997.
[39] P. Soravia, H∞ control of nonlinear systems: Differential games and viscosity solutions, SIAM J. Control Optim., 34 (1996), pp. 1071-1097.
[40] A. Szpiro and P. Dupuis, Second order numerical methods for first order Hamilton-Jacobi equations, SIAM J. Numer. Anal., 40 (2002), pp. 1136-1183.
[41] A. J. van der Schaft, L2-gain analysis of nonlinear systems and nonlinear state feedback H∞ control, IEEE Trans. Automat. Control, 37 (1992), pp. 770-784.