Reduced Basis Approximation of Large Scale Parametric

advertisement
A. Schmidta · B. Haasdonka
Reduced Basis Approximation of Large Scale
Parametric Algebraic Riccati Equations
Stuttgart, June 2015
a
Institute for Applied Analysis and Numerical Simulation, University of Stuttgart,
Pfaffenwaldring 57, 70569 Stuttgart/ Germany
{schmidta,haasdonk}@mechbau.uni-stuttgart.de
www.agh.ians.uni-stuttgart.de
Abstract Optimal feedback control problems for partial differential equations (PDEs) can be tackled
by applying the linear quadratic regulator (LQR) theory to the spatially discretized equations. The
solution to the control problem is then obtained by solving an Algebraic Riccati Equation (ARE).
However, especially in multi-query scenarios for parametric problems, this procedure can become very
expensive since the ARE is a quadratic matrix equation of the same dimension as the underlying
discretized system. In this paper, we present the application of the Reduced Basis (RB) method for
parametric AREs (RB-ARE). We focus on the basis generation where we propose a new algorithm
called “Low Rank Factor Greedy” algorithm and on the residual based error estimation. Furthermore,
we investigate the error induced by using the reduced ARE solution as a surrogate for the expensive
high dimensional solution in the LQR problem by giving rigorous bounds on the cost functional, state
and outputs. An offline/online decomposition of the RB-ARE procedure will result in a computationally
efficient scheme.
Keywords Reduced Basis Method, Optimal Feedback Control, Algebraic Riccati Equation, Low
Rank Approximation
Stuttgart Research Centre for Simulation Technology (SRC SimTech)
SimTech – Cluster of Excellence
Pfaffenwaldring 5a
70569 Stuttgart
publications@simtech.uni-stuttgart.de
www.simtech.uni-stuttgart.de
A. Schmidta , B. Haasdonka
2
1 Introduction
The Algebraic Riccati Equation (ARE) is an important equation that frequently arises in control,
filtering and state estimation problems. The most popular application is certainly in the field of optimal
feedback control for linear and time invariant (LTI) systems, with the special case of the linear quadratic
regulator (LQR), see e.g. [32]. The goal to find an optimal control that transforms feedback information
of the system into suitable control signals for steering the system into a desired state can be solved
by finding the solution of an ARE. Other applications, like the optimal sensor placement or linear
quadratic Gaussian balanced truncation also require the solution of AREs, see [22, 37].
Many physical, biological or other technical phenomena are modeled in terms of partial differential
equations (PDEs). Often they depend on one or more parameters describing for instance material
coefficients, geometric properties or quantifying uncertainties. For computational purposes one usually
discretizes the PDE in the first place, which typically results in large discrete systems. Formulating the
AREs for those systems leads to quadratic matrix equations with the same dimension as the underlying
discrete system. Hence, solving this equation can be a very challenging task, especially in multi-query
scenarios like in real-time contexts or on small computing devices such as micro controllers. In the
recent decades a lot of very efficient solvers for the ARE have been developed. We refer to [6] for a
review of the state-of-the-art solvers.
The focus of this paper is to apply model reduction techniques to the ARE in order to obtain
approximate solutions rapidly and with reliable error bounds. One reduction technique that has proved
to work very well for parametric partial differential equations is the Reduced Basis (RB) method,
cf. [16, 17, 40]. Extensions to the open loop optimal control of PDEs have been carried out in [28]
or [38]. The RB method is divided in a computationally expensive offline part where a problem adapted
low dimensional solution space is constructed from high dimensional solutions for suitably selected
parameter values. The original problem is then projected onto this low dimensional space, which
results in small equations that can be solved rapidly online for different parameters. This expensive
procedure pays off during the online phase where solutions to the problem can be obtained rapidly with
a complexity ideally completely independent of the high dimension. A crucial ingredient in all certified
RB methods are a-posteriori error estimators that can be cheaply evaluated along with the reduced
solution. Such error estimators are used in the online step for the verification of the approximation
quality as well as in the offline step for choosing the worst approximated element in a greedy loop,
cf. [12, 14].
Although we are not directly dealing with partial differential equations, we adopt the basic RB
methodology for the parametric ARE scenario (RB-ARE). In the following section we briefly present
the basic setup and establish the notation used throughout this paper. In Section 3 the reduction
is explained and a new algorithm for the basis generation is presented, followed by a discussion of
approximation error statements in Section 4. We will focus on an a posteriori error estimator between
the true and the approximated solution of the ARE, but we will also establish a link to the feedback
control problem by providing rigorous bounds for the cost functional and the state if the approximation
to the ARE is used as a surrogate for the true solution. The online performance is ensured by a complete
offline/online decomposition of the procedure, which is explained in Section 5. Numerical examples in
Section 6 show the performance of our approach. We conclude in Section 7 with final remarks.
2 Problem Setting
Throughout this article we are considering the following parametric Algebraic Riccati Equation for the
unknown solution P (µ) ∈ Rn×n :
A(µ)T P (µ)E(µ) + E(µ)T P (µ)A(µ) − E(µ)T P (µ)B(µ)B(µ)T P (µ)E(µ) + C(µ)T C(µ) = 0.
(1)
The matrices E(µ), A(µ) ∈ Rn×n , B(µ) ∈ Rn×m , C(µ) ∈ Rp×n are given and the solution matrix
P (µ) is sought. We allow parameter dependency on all data matrices, with a parameter vector µ from
a bounded parameter set P ⊂ Rq . Furthermore, we require that E(µ) is invertible for all parameters.
Due to the direct link of the data matrices in equation (1) to dynamical systems, we call n the state
dimension, m the number of inputs and p the number of outputs of the system, see also Example 2.1
below.
Reduced Basis Approximation of Large Scale Parametric Algebraic Riccati Equations
3
Equation (1) consists of n quadratic equations for n2 unknowns. Hence the solvability is not clear
unless additional assumptions and requirements are posed. One possibility is to require the following
properties for the matrices A(µ), B(µ) and C(µ), cf. [32, 34]:
rk[λE(µ) − A(µ), B(µ)] = n,
T
T
T
rk[λE(µ) − A(µ) , C(µ) ] = n,
∀λ ∈ C with <(λ) ≥ 0,
(Stabilizability)
(2)
∀λ ∈ C with <(λ) ≥ 0,
(Detectability)
(3)
for all parameters µ ∈ P where rk(S) denotes the rank of the matrix S. As proven in [34], condition (2)
and (3) guarantee the existence of a unique solution P (µ) ∈ Rn×n of the ARE (1) with the properties
P (µ) = P (µ)T ,
E(µ)−1 (A(µ) − B(µ)B(µ)T P (µ)E(µ)) is Hurwitz.
P (µ) 0,
(4)
Here, the expression P 0 denotes positive semidefiniteness of the matrix P . A matrix is called
stable (or Hurwitz), when the real parts of all eigenvalues are strictly less than zero, i.e. all eigenvalues
are located in the open left complex half plane. The unique solution with the properties (4) is called
“stabilizing solution” of the ARE.
For improved readability we omit the explicit notion of the parameter dependence in the following
sections and come back to the full parametric problem in Section 5, where the offline/online decomposition is explained.
One of the main application for the ARE, which is also our main interest, is the feedback stabilization of LTI control systems.
Example 2.1 Consider the following LTI system in descriptor form:
E ẋ(t) = Ax(t) + Bu(t),
y(t) = Cx(t)
(5)
with matrices E, A ∈ Rn×n , B ∈ Rn×m and C ∈ Rp×n . Systems of this form typically arise after
the spatial discretization of time dependent partial differential equations, see for example [13, 27]. The
function u(t) is called “control” of the system and serves as an input variable that can be chosen to
alter the dynamics in a specific way. The function y(t) is called “output”.
One possible scenario that is used in many applications is the linear feedback control (also called
LQR) of (5), see e.g. [3, 5]. This means that one wants to find a linear control law of the form
u(t) = Kx(t) with the feedback gain matrix K ∈ Rm×n that steers the system from an initial state
x(0) = x0 to the origin x∗ = 0. Since there are potentially many different ways to achieve this, one
additionally requires the minimization of a cost functional
Z ∞
J(u) :=
(y(t)T Qy(t) + u(t)T Ru(t))dt,
(6)
0
where the matrices Q ∈ Rm×m , R ∈ Rp×p are symmetric and positive definite weighting matrices.
Without loss of generality, Q and R can be set to the identity matrix Im respectively Ip , because
otherwise a transformation ỹ = Q1/2 y and ũ = R1/2 u can be applied. Therefore, in all of the following
we assume Q = Im and R = Ip .
It can be shown, that the problem of minimizing (6) can be solved by setting u(t) = −B T P Ex(t)
where P is the unique symmetric and positive semidefinite solution to the ARE
AT P E + E T P A − E T P BB T P E + C T C = 0.
(7)
Since spatial discretizations like the finite element method for linear PDEs lead to large LTI systems
and hence to large AREs, a fast and reliable approximation of (7) is the key ingredient for solving
parametric LQR problems in multi-query or real-time scenarios.
Remark 2.2 Note that we are discussing the ARE in its generalized form (1) and (7), i.e. with explicit
dependence on the mass matrix E. This is due to the fact that the underlying control systems result
from spatial semidiscretizations of time dependent partial differential equations (PDE), which naturally
leads to mass matrices in front of the time derivatives in the ordinary differential equations (ODEs) for
A. Schmidta , B. Haasdonka
4
the state. Since we are assuming invertibility of E, equation (5) can be left multiplied by E −1 , resulting
in
−1
−1
ẋ = E
| {z A} x + |E {z B} u,
Â
y = Cx.
(8)
B̂
Formulating the ARE for this equation yields the standard form of the Riccati equation
ÂT P + P Â − P B̂ B̂ T P + C T C = 0
(9)
Many solvers for systems in the generalized form (7) internally transform the system to the standard
form (9). Since the multiplication of A with E −1 would destroy important properties like sparsity or
symmetry, usually a Cholesky factorization (or more generally an LU-factorization) E = LLT is used
together with a state transformation x̃ := LT x.
x̃˙ = L−1 AL−T x̃ + L−1 Bu,
y = CL−T x̃.
The explicit formation of the matrix product L−1 AL−T is avoided in favor of the solution of linear
systems by forward and backward substitution. We refer to the overview article [6] and the thesis [46]
for further information.
We will assume that the systems under consideration are stable, i.e. E −1 A has only eigenvalues in the
left open complex half plane. Unstable systems can be preprocessed by finding a matrix H such that
E −1 (A − BH) is stable and by replacing A with A − BH in the remainder of this article. Note that
this is always possible due to the controllability assumption (2).
If not otherwise stated, we will denote k · k = k · k2 as the Euclidean 2-norm for vectors and induced
operator norm of an operator. Further, for a matrix A = (aij )ni,j=1 ∈ Rn×n , we define the Frobenius
p
Pn
norm kAkF = tr (AT A), where tr(A) := i=1 aii is the trace function. The eigenvalues of A are
denoted by λi (A) where we will impose a descending ordering of the real parts
<(λn (A)) ≤ · · · ≤ <(λ1 (A)).
Another important property for matrices is the logarithmic norm, cf. [9, 49]. The logarithmic norm of
A is defined as
ν[A] := lim
h→0+
kIn + hAk − 1
.
h
(10)
In the case k · k = k · k2 , the logarithmic norm can be calculated as ν[A] = λ1 ( 12 (A + AT )). The
logarithmic norm has the property, that it is directly related to the decay of the norm of the matrix
exponential eAt , i.e. for any induced norm k · k it holds
keAt k ≤ eν[A]t ,
t ≥ 0.
(11)
A proof for this well known inequality can for example be found in [19]. We will make use of this
property in Section 4.1.1, for proving the stability of the reduced controller in the full dimensional LTI
system. A matrix A with ν[A] < 0 is called dissipative. Usually we consider systems with mass matrices
E. The logarithmic norm for those systems is defined as ν[E −1 A]. For symmetric E one can employ
the Cholesky factorization to show that the calculation of ν[E −1 A] can be equivalently performed by
finding the eigenvalue of the generalized eigenvalue problem 21 (A + AT )x = λEx with the largest real
part, see also [39].
Reduced Basis Approximation of Large Scale Parametric Algebraic Riccati Equations
5
3 Reduction Approach
In cases where the number of inputs m and the number of outputs p is small compared to the number
of degrees of freedom n of the underlying system, the solution matrix P to the ARE (1) often can be
approximated by a low rank factorization of the form P ≈ ZZ T with Z ∈ Rn×k and k n. This is
possible due to a fast decay in the singular values of P . In the contributions [2, 35, 42, 53] theoretical
justifications for this observation are carried out, especially for the Lyapunov equation with low rank
right hand side. Lyapunov equations are closely linked to the ARE, since for example a classical Newton
approach for solving the nonlinear equation (1) results in a loop where Lyapunov equations with a low
rank structure have to be solved, cf. [4, 6]. Hence, P inherits this property.
The fact, that the solution can be efficiently represented by low rank factors implies that the solution manifold of the parametric low rank factors Z can be approximated by linear spaces of much lower
dimension than the original system dimension n. Hence, a projection of Equation (1) on the relevant
subspace might lead to a much smaller problem that can easily be solved by standard techniques. Projection based methods for the Lyapunov equation are discussed for example in [44] and were extended
to Krylov and rational Krylov subspace methods in [24, 48]. Articles concerned with projection based
solution algorithms for the ARE are for instance [20, 25, 26]. For the parametric case, in [31] the RB
method for large and sparse Lyapunov equations has been presented.
A general class of projections can be described by a pair of biorthogonal matrices V, W ∈ Rn×N with
T
W V = IN and N n. The columns of W span the N dimensional subspace W := colspan(W ) ⊂ Rn
and hence any matrix P̂ ∈ Rn×n with colspan(P̂ ) ⊂ W can be written as W G for some matrix
G ∈ RN ×n . Since the solution matrices of the ARE are always symmetric, one has the additional
requirement P̂ = W G = GT W T = P̂ T . Hence, the matrix P̂ can also be written as P = W ḠW T for
a symmetric (and positive semidefinite) matrix Ḡ ∈ RN ×N . In the following, we will write Ḡ = PN ,
since this will be the N dimensional approximation to the full solution P . Substituting the solution P
by its approximation
P̂ := W PN W T
(12)
in the left hand side of equation (1), we arrive at
R(W PN W T ) = AT W PN W T E + E T W PN W T A − E T W PN W T BB T W PN W T E + C T C,
where we introduced the residual R(·) of the ARE. Requiring a so called Galerkin condition1 V T R(P̂ )V =
0 leads to the reduced ARE
T
T
T
T
ATN PN EN + EN
PN AN − EN
PN BN BN
PN EN + CN
CN = 0,
(13)
where the reduced matrices are defined as
EN = W T EV,
AN := W T AV,
B = W T B,
CN = CV.
(14)
Equation (13) is of dimension N × N and can be easily solved by using standard methods like a
Hamiltonian approach, which is implemented in the function care in MATLAB.
The projection based factorization naturally defines a low rank approximation for the full dimensional solution matrix: We can calculate the low rank factor Ẑ by employing an eigenvalue decompoT
sition of the small matrix PN = UN SN UN
and by setting
1/2
Ẑ := W UN SN ∈ Rn×N ,
(15)
which yields P̂ = Ẑ Ẑ T , since
1/2
1/2
T
Ẑ Ẑ T = W UN SN SN UN
W T = W PN W T .
Instead of forming the whole and usually dense n × n matrix P̂ through (12), the low rank factor (15)
is used.
1
See for example [44]
A. Schmidta , B. Haasdonka
6
Remark 3.1 We want to briefly compare the presented RB-ARE reduction approach to the classical
projection based model reduction techniques for dynamical systems such as Krylov-subspace methods,
balanced truncation or proper orthogonal decomposition, cf. [1]. We will again consider systems of the
form (5).
All these model reduction approaches are based on the projection of the state to a subspace, whose
basis is defined by the columns of a projection matrix V ∈ Rn×N . If we now introduce a reduced
state xN (t) ∈ RN and approximate x(t) ≈ V xN (t), we can substitute the high dimensional state
in equation (5). Restricting the residual to be orthogonal to another test space, defined by a matrix
W ∈ Rn×N , e.g. after introducing a Petrov-Galerkin condition, we end up with the reduced system
(
W T EV ẋN (t) = W T AV xN (t) + W T Bu(t)
yN (t) = CV xN .
(16)
Abbreviating EN := W T EV , AN := W T AV , BN = W T B and CN := CV reveals the link to the
reduced ARE (13). Defining the minimization target
Z
min
∞
kyN (t)k2 + ku(t)k2 dt
(17)
0
shows, that equation (13) solves the reduced LQR problem (17),(16). Although the reduced closed loop
system
T
EN ẋN = (AN − BN BN
PN EN )xN
is optimally controlled by chosing PN as the solution to the reduced ARE (13), the reconstruction
P̂ := W PN W T must not necessarily yield a good approximation to the solution P of the full dimensional
ARE (1) if only state snapshots are used for the generation of the basis, see for example [7, 47]. This
will be the motivation for our basis generation procedure, see 3.1.
We finish this section with a useful property that allows verifying zero error whenever the projection
matrix contains the whole (low rank) solution space of the ARE.
Proposition 3.2 (Reproduction of Solutions) Let V, W ∈ Rn×N with W T V = IN be given. Let
the symmetric and positive semidefinite matrix P ∈ Rn×n be the unique solution to the ARE (1) and
assume colspan(P ) ⊂ colspan(W ). We assume that (AN , BN ) is stabilizable and (AN , CN ) from (14)
is detectable. Then it follows
P = W PN W T
where PN solves the reduced ARE (13).
Proof Due to symmetry of P and the assumption that colspan(P ) ⊂ colspan(W ), there exists a
symmetric and positive semidefinite matrix G ∈ RN ×N with P = W GW T . Using the definition of the
ARE, we get
AT W GW T E + E T W GW T A − E T W GW T BB T W GW T E + C T C = 0.
Multiplying this equation with V T from left and V from right yields
T
T
T
T
ATN GEN + EN
GAN − EN
GBN BN
GEN + CN
CN = 0,
(18)
where the matrices are defined as in (14). Due to the assumption of stabilizability and detectability
of the reduced matrices, the reduced equation (13) has a unique positive semidefinite and symmetric
solution PN . Since both G and PN satisfy the same Riccati equation, we can conclude G = PN and
hence P = W PN W T .
t
u
Reduced Basis Approximation of Large Scale Parametric Algebraic Riccati Equations
7
3.1 Low Rank Factor Greedy
Besides biorthogonality W T V = IN we have not yet made any further assumptions on the matrices
W, V . There are many possible ways how a basis can be constructed. In a parametric scenario, the
reduced basis (RB) approach has proven to work well for the approximation of partial differential
equations, see for example [12, 40, 43, 55]. The goal in all RB methods is to split the calculation
in an offline/online scheme, where the online computation time ideally does not depend on the high
dimension n. This is achieved by accepting a possibly expensive offline phase, where a problem adapted
solution space is constructed which then allows the fast solution in the online phase. Rigorous error
bounds that are likewise fast to evaluate are the key ingredient for justifying the approach online.
The standard approach for constructing the reduced basis works in a greedy fashion as it has
been introduced for RB methods in [55]. The greedy procedure is an iterative algorithm that tries
to uniformly minimize the approximation error on a training set Ptrain ⊂ P by subsequently adding
vectors constructed from the worst approximated element to the basis. A crucial ingredient in all
greedy approximations is the choice of a suitable error indicator for selecting the worst approximated
element. There are several possible choices, each with its own advantages and disadvantages. One
possibility would certainly be the true approximation error. However, this is often not possible, since
the evaluation in each iteration would be too expensive as the true solution for all training parameters
would be required. A better approach makes use of error indicators that are cheaply computable for
all parameters in the training set Ptrain .
We propose a new algorithm for the construction of a subspace for the RB-ARE method which we
denote Low Rank Factor Greedy (LRFG) procedure. The pseudocode is listed in Algorithm 1. The
basic structure of the algorithm follows the greedy-principle: In a loop the error indicator is evaluated
for all elements µ in the training set Ptrain ⊂ P and for the current reduced basis, that is spanned by
the columns of V . If the algorithm determines a maximum error indicator value beneath a given desired
tolerance ε, the loop terminates and the basis construction is complete. The biorthogonal counterpart
W is chosen as V , which is a suitable choice since V itself has orthogonal columns. Otherwise, the
parameter µ∗ ∈ Ptrain of the worst approximated solution for the current basis is selected by finding
the parameter µ∗ that maximizes the error indicator and the high dimensional low rank factor Z(µ∗ )
is obtained by a suitable solution algorithm for the ARE. Although the low rank factorization itself
is only an approximation to the true solution, we will assume that the error is negligible compared to
the model reduction error.
The next step consists in finding the relevant information in the low rank factor Z(µ∗ ). As a
preprocessing step, only the part which is perpendicular to the current basis Z := (In − V V T )Z(µ∗ )
is considered. Afterwards the remaining information in the column space of Z is compressed and
added to the basis. We propose two different ways for extracting the relevant information. Both rely
on the proper orthogonal decomposition (POD, sometimes also called principal component analysis)
technique which also appears for example in the basis building procedure for time-dependent problems,
see e.g. [18, 56]. The idea behind POD is to find orthogonal vectors z̃l ∈ Rn that capture the essential
information in the columns of Z ∈ Rn×k = (z1 , . . . , zk ). Mathematically speaking, we seek the solution
of the following constrained maximization problem:
max
z̃1 ,...,z̃l ∈Rn
k
l X
X
|hzj , z̃i i|2 ,
subject to hz̃i , z̃j i = δij , for i, j = 1, . . . , l
(19)
i=1 j=1
where h·, ·i denotes the standard inner product in Rn . Problem (19) can be solved by setting z̃i = ui
for i = 1, . . . , l, where ui ∈ Rn are the first columns of the matrix U1 in the thin singular value
decomposition Z = U1 SU2T , with S = diag(σ1 , . . . , σk ) ∈ Rk×k and U1 ∈ Rn×k with orthogonal
columns. In the following we will call z̃i POD modes. Often one does not want to prescribe the number
of elements that should be returned by the POD, but a tolerance εI for the error for the approximation
by the first POD modes. This can for instance be implemented by determining the smallest natural
number l that satisfies 1 − E(l) < εI , where
Pl
σ2
E(l) := Pri=1 i2 ,
i=1 σi
r := rk(Z).
(20)
A. Schmidta , B. Haasdonka
8
This function represents the “energy” of the snapshots captured by the first l POD modes. We will
denote by (z̃1 , . . . , z̃l ) = Z̃ = POD(Z, εI ) the POD method applied to the matrix Z with a prescribed
inner tolerance εI and POD1 (Z) denotes the first dominant mode returned by (19) for l = 1. We refer
to [56] for more information on POD in the context of model reduction.
The first possibility that we propose is to only add the first dominant mode z̃1 = POD1 (Z) in
each iteration of the greedy loop (add one). This clearly results in a very good basis but comes with
high computational costs, since in each iteration the error indicator must be evaluated and a POD
must be performed. We hence propose an alternative that will lead to a potentially larger basis but
results in a faster offline phase: In each iteration of the greedy loop, we add more information to the
basis by extending it with the vectors resulting from the POD technique with prescribed tolerance
εI (add many). This basis generation approach will in most cases result in larger bases but lower the
offline computation time. This theoretical expectation is confirmed by our numerical experiments, see
Section 6.2.
In practice, the algorithm must certainly be extended at some points. For example it must be
ensured that the algorithm terminates with a reasonably sized basis. This can be achieved by prescribing
a maximum number of basis elements or by using techniques such as adaptive basis enrichment or ppartitioning, cf. [15].
Algorithm 1: Low Rank Factor Greedy Algorithm (LRFG) for the basis generation.
Data: Initial basis matrix V0 , training set Ptrain ⊂ P, tolerance ε, mode ∈ {add one, add many}, error
indicator ∆(V, µ) (, inner tolerance εI )
Set V := V0 .
while maxµ∈Ptrain ∆(V, µ) > ε do
µ∗ := arg maxµ∈Ptrain ∆(V, µ)
Solve the full dimensional ARE for the low rank factor Z(µ∗ )
Set Z⊥ := (In − V V T )Z(µ∗ )
if mode == add one then
Z̃ = POD1 (Z⊥ )
else
Z̃ = POD(Z⊥ , εI )
Extend the current basis matrix V = (V, Z̃)
Return V .
4 Error Estimation
A crucial ingredient in certified RB methods is the validation of the procedure by means of error
bounds. They should be rigorous and cheaply computable, which means that in the ideal case the
evaluation complexity of the bound is independent of the high dimension n. The main result in this
section is a computable a-posteriori error bound for the error between the approximated and the true
solution to the ARE. The application of the bound however requires the knowledge of several constants
that are very expensive to calculate. As a remedy we state bounds for those constants and show how
interpolation based methods can be used to achieve fast (but nonrigorous) approximations to those
constants. Based on the bound for the ARE we discuss suboptimality results for the cost functional, the
state and the outputs of the optimal feedback control problem, if the approximation to the ARE solution
is used as a surrogate for the full dimensional solution. A complete offline/online decomposition of the
procedure allows the calculation of the approximation and the corresponding bounds in a complexity
independent of the high dimension n and thus ensures a good online performance.
Reduced Basis Approximation of Large Scale Parametric Algebraic Riccati Equations
9
4.1 A-Posteriori Error Estimator for the Parametrized ARE
In this section we present a new residual based error estimator for approximations to the generalized
ARE (1). We recall the definition of the residual of the ARE for P ∈ Rn×n :
R(P ) := AT P E + E T P A − E T P BB T P E + C T C.
(21)
n×n
The ARE can be seen as a nonlinear problem defined on the Banach space R
. We make use of
a generic theorem that can be found in [8], and which can be used to derive a-posteriori residual
based error bounds for nonlinear problems. We begin with a rigorous statement and later relax the
assumptions to allow online efficient computations.
Theorem 4.1 (Residual Based Error Bound (Rigorous)) Let P̂ ∈ Rn×n be an approximation
to the unique positive semidefinite and stabilizing solution to the ARE (1). Set
Y (P̂ ) := A − BB T P̂ E
and assume that the matrix E −1 Y (P̂ ) is stable. Define the linear operator LP̂ := DR|P̂ :
LP̂ : Rn×n → Rn×n :
N 7→ Y (P̂ )T N E + E T N Y (P̂ )
(22)
and the constants γ := kL−1
k = supkSk=1 kL−1
(S)k and ε := kR(P̂ )k. Define the function
P̂
P̂
L(α) :=
sup
kLP̂ − LX k,
(23)
X∈B̄α (P̂ )
where B̄α (P̂ ) ⊂ Rn×n denotes the closed ball of radius α around P̂ in Rn×n . If the inequality
2γL(2γε) ≤ 1
(24)
is valid, the following upper bound for the error between the true solution P and the approximated
solution P̂ holds
γε
≤ 2γε.
(25)
kP − P̂ k ≤
1 − γL(2γε)
Proof Since the residual (21) is quadratic in P it is Fréchet differentiable with derivative DR|P̂ = LP̂ .
This operator is an isomorphism whenever the matrix E −1 Y (P̂ ) is stable, which is true due to the
assumptions. This follows from the solution theory for Lyapunov equations, see e.g. [21]. We hence can
define the constant γ = kL−1
k. The proof then follows the lines in [8, Theorem 2.1]. Note that the final
P̂
bound stated in [8] is the inequality kP − P̂ k ≤ 2γε. The potentially sharper bound kP − P̂ k ≤
occurs in the proof, but the assumption 2γL(2γε) < 1 is used to get the upper bound 2γε.
γε
1−γL(2γε)
t
u
The application of Theorem 4.1 requires the knowledge of the constants γ, L(·) and the norm of the
residual ε, all depending on the n dimensional data. In order to get an error bound that only depends on
the RB dimension N , we state the following theorem, where the constants are replaced by computable
upper bounds.
Proposition 4.2 (Computable Error Bound) Let the assumptions of Theorem 4.1 be valid. Assume further, that rapidly computable upper bounds γ ≤ γN , ε ≤ εN are available. Then it holds:
1. The function L(α) in Theorem 4.1 can be bounded by L(α) ≤ 2αkBk2 kEk2
2. If the condition
2
8kEk2 kBk2 εN γN
≤1
(26)
is valid, the following computable bound holds
kP − P̂ k ≤ ∆P :=
1−
γN εN
2
4γN εN kEk2 kBk2
≤ 2γN εN .
(27)
A. Schmidta , B. Haasdonka
10
Proof We start by having a closer look at the definition of L(α):
L(α) :=
kLP̂ − LX k
sup
X∈B̄α (P̂ )
=
sup
sup k(E T BB T (X − P̂ )S)T E + E T S(BB T (X − P̂ )E)k
X∈B̄α (P̂ ) kSk=1
≤
sup
sup 2kEk2 kBk2 kX − P̂ kkSk
X∈B̄α (P̂ ) kSk=1
≤ 2αkBk2 kEk2 .
Here B̄α (P̂ ) is the closed ball with radius α around P̂ . The application of Theorem 4.1 requires a
validity criterion (24), which is satisfied due the upper bound for L(α) derived before and ε ≤ εN ,
γ ≤ γN :
2γL(2γε) ≤ 2γN 2(2γN εN )kBk2 kEk2
(26)
2
= 8γN
εN kBk2 kEk2 ≤ 1
Hence Theorem (4.1) is applicable and results in the upper bound
kP − P̂ k ≤
1−
γN εN
2
4γN εN kBk2 kEk2
(26)
≤ 2γN εN .
t
u
In the following sections we see how suitable upper bounds εN and γN can be obtained, such that an
online complexity independence of n is guaranteed.
We want to emphasize, that the above bound can theoretically be made arbitrarily sharp, as the
residual norm ε is directly dependent on the quality of the reduced space. In particular, if the validity
criterion (26) is not valid, the reduced space should be improved (e.g. by some additional greedy steps
or by rerunning the greedy procedure with lower tolerances). A simple but important consequence of
the Propositions 3.2 and 4.2 is the correct prediction of zero error when the reduced space contains
the complete low rank factor for a given parameter:
Corollary 4.3 (Zero Error Prediction) Let the assumptions of Proposition 3.2 hold, i.e. in particular
colspan(P ) ⊂ colspan(W ). Assume that εN := ε. Then ∆P = 0.
Proof From Proposition 3.2 we can conclude P̂ = P and hence R(P̂ ) = 0 resulting in ε = 0 and hence
∆P = 0 from Proposition 4.2.
t
u
We want to provide a short comparison of existing error bounds for the ARE and want to highlight
the difference to the bound stated in Proposition 4.2. All existing error bounds are restricted to the
case where E = In , hence our bound generalizes the existing work. Thus, to be able to compare our
result to theorems stated in the literature, we set the mass matrix to the identity. Articles devoted to
residual based error bounds for the ARE can for example be found in [11, 29, 30, 52]. We will briefly
discuss two existent bounds and compare them with the result obtained in Proposition 4.2.
In [30] an error bound is provided for answering the question whether an additional Newton refinement will improve the approximative solution to the ARE. Their resulting error bound has a very
similar structure to our bound.
Theorem 1 (Bound 2’ in [30]) Let A − BB T P̂ be stable and assume that kP − P̂ k < 1/(3kBk2 γ)
and 4γ 2 kBk2 ε < 1, where γ and ε is defined as in Theorem 4.1. Then
kP − P̂ k ≤
2γε
1+
p
1 − 4kBk2 γ 2 ε
≤ 2γε.
Reduced Basis Approximation of Large Scale Parametric Algebraic Riccati Equations
11
The application of this theorem requires the a-priori bound on the difference of the unknown correct
solution and the approximated solution. Since we want to find a-posteriori error bounds that provide
rigorous predictions to the true error, this assumption is not feasible in our setting. However we note
that the resulting bound has similar structure to the bound in Proposition 4.2.
A potentially better and sharper bound has been developed in [52].
Theorem 2 (Theorem 5.1 in [52]) Let P̂ be the approximate unique Hermitian positive semi-definite
solution P to the ARE. Assume that A−BB T P̂ is stable. Define the linear operator L : Hn×n → Hn×n
by
LH := (A − BB T P̂ )T H + H(A − BB T P̂ ),
H ∈ Hn×n
where H = {H ∈ Cn×n : H = H̄ T } is the set of Hermitian matrices. Let R̂ := R(P̂ ) be the residual
and let k · k be any unitarily invariant norm. If
4kL−1 kkL−1 R̂kkBk2 < 1
then
kP − P̂ k ≤ 2kL−1 R̂k.
This bound is potentially smaller than our bound. This can be seen after splitting the norm kL−1 R̂k ≤
kL−1 kkR̂k. However the application of this theorem requires the calculation of the norm kL−1 R̂k which
(to our best knowledge) cannot be performed in an online efficient manner.
4.1.1 Stability and Closed Loop Behaviour
A crucial requirement for the applicability of Proposition 4.2 is the stability of the matrix E −1 Y =
E −1 (A − BB T P̂ E). In terms of the LQR problem this is equivalent to the asymptotic stability of the
closed loop system
E ẋ = (A − BB T P̂ E)x,
(28)
which should be ensured either by a-priori or a-posteriori analysis. Furthermore, we will see in the next
chapter that the decay of the LTI system (28) is closely linked to the calculation of upper bounds for
the constant γ.
We begin with a simple consequence of the properties of the logarithmic norm of the matrix E −1 Y .
Proposition 4.4 (Sufficient Condition for Stability by Exponential Bounds) Let the matrix
E −1 A be stable and set Y = A − BB T P̂ E. Then the matrix E −1 Y is stable whenever one of the
following requirements is valid
1. ν[E −1 Y ] < 0
2. ν[E −1 A] + ν[−E −1 BB T P̂ E] < 0
3. ν[E −1 A] + kE −1 BB T P̂ EkF < 0.
−1
−1
Proof If ν[E −1 Y ] < 0, the stability follows from keE Y t k ≤ eν[E Y ]t → 0 as t → ∞. The inequality
ν[E −1 Y ] = ν[E −1 A − E −1 BB T P̂ E] ≤ ν[E −1 A] + ν[−E −1 BB T P̂ E] implies the second statement.
Finally, the bound ν[E −1 BB T P̂ E] ≤ kE −1 BB T P̂ Ek ≤ kE −1 BB T P̂ EkF proves the last result.
t
u
For more details and proofs of the used properties for the logarithmic norm ν[·] we refer to [49].
Applying this Lemma requires an efficient way to calculate the logarithmic norm ν[E −1 A]. This can
for example be implemented efficiently by employing the inverse power iteration. Furthermore, more
sophisticated methods like the successive constraint method or the min-Θ approach allow the rapid
calculation of lower bounds of stability factors with a complexity independent of n which could as well
be applied to our situation, see for example [23, 40]. Especially the last criterion of Proposition 4.4 is
A. Schmidta , B. Haasdonka
12
easy to check a-posteriori since it only involves the calculation of the norm of kE −1 BB T P̂ EkF which
can be efficiently performed due to a possible offline/online decomposition, see Section 5.
Note that the above Lemma has only limited applicability, since the logarithmic norm of a stable
matrix E −1 Y can be positive. A more detailed analysis indeed shows, that the value ν[E −1 Y ] is an
indicator for the transient growth at time t = 0, since
d
E −1 Y t ke
= ν[E −1 Y ],
k
dt+
t=0
where dtd+ is the upper right-hand derivative, see for instance [21, Prop. 5.5.8]. Hence ν[E −1 Y ] > 0 only
shows that the norm of the matrix exponential initially grows and it does not make any predictions
−1
for the long term behaviour of keE Y t k.
Proposition 4.4 furthermore indicates, that the scaling of the matrix B could be a way to ensure
stability for the closed loop system whenever the requirement ν[E −1 A] + kE −1 BB T P̂ Ek for the application of Proposition 4.4 is not fulfilled. However, for example in the LQR application the matrix
B directly determines the influence of the control. Decreasing the norm of B thus leads to weaker
control impact, which is not always desirable when for instance a fast stabilization should be achieved
by feedback control. We will discuss this phenomenon by an example in Section 6.2.
4.1.2 The Constant γ
One of the main ingredients in the error estimation theory shown in the previous section is the constant
γ. It is crucial to have a good upper bound or a suitable approximation γN since it enters the validity
criterion and the resulting error bound for the ARE. We will see that a direct calculation of γ is not
feasible in an online context, since it would involve the solution of a large scale Lyapunov equation
with high rank right hand side, what can only be realized with high computational effort.
We recall the definition of γ:
−1
−1
γ = k DR|P̂ k = sup k DR|P̂ (X)k.
(29)
kXk=1
The derivative of the ARE at the point P̂ is a generalized Lyapunov operator (see also the definition
in Equation (22)):
DR|P̂ (X) = LP̂ (X) := Y (P̂ )T XE + E T XY (P̂ ),
(30)
where Y (P̂ ) = A − BB T P̂ E. If not otherwise stated we will omit the explicit dependence on P̂ and
write L(X) and Y . Theory and numerics have been extensively studied for the generalized Lyapunov
equation, cf. [24, 35, 41, 42, 50, 51]. The following lemma gives an explicit formula and computational
procedure for the norm of the inverse operator L−1 .
Lemma 4.5 (Calculation of γ) Let E −1 Y be stable. It holds γ = kL−1 k = kHk where H is the
unique solution of the generalized Lyapunov equation
Y T HE + E T HY = −In .
t
u
Proof See for example [51].
Note that this Lyapunov equation is of dimension n. Thus the computational demands are high as the
right hand side has no low-rank structure which indicates that the solution cannot be approximated
efficiently by low rank factors as in the ARE case, and projection based model reduction techniques
and solution algorithms will likely fail. Since γ = kL−1 (−In )k, an explicit representation of the inverse
operator L−1 is a good starting point for deriving upper bounds for γ:
Lemma 4.6 Let Y, E ∈ Rn×n such that E is nonsingular and E −1 Y is stable. Then the inverse
operator for (30) is explicitly given by
Z ∞
−1
T
−1
L−1 (X) = −E −T
e(E Y ) t Xe(E Y )t dtE −1 .
(31)
0
Reduced Basis Approximation of Large Scale Parametric Algebraic Riccati Equations
13
Proof The proof exploits the equivalence of the generalized Lyapunov equation (30) and the Lyapunov
equation in the case where E is nonsingular. The latter is defined as
L1 (X) := (E −1 Y )T X + X(E −1 Y ).
For this representation, a result from [33] shows, that the inverse operator is explicitly given by
Z ∞
−1
T
−1
L−1
e(E Y ) t Xe(E Y )t dt.
(X)
=
−
1
0
On the other hand, we can easily verify L(X) = L1 (E T XE) and hence we have
Z ∞
−1
T
−1
−1
−T −1
−1
−T
L (X) = E L1 (X)E = −E
e(E Y ) t Xe(E Y )t dtE −1
0
t
u
which proves the claimed result.
Remark 4.7 A more detailed discussion of γ reveals deeper insights to interesting properties of the
closed loop LTI system with E = In
ẋ = (A − BB T P )x,
x(0) = x0 .
(32)
The discussion in [29] shows that there is a close link between the “damping” of the solution of (32)
and γ:
√
γ = max kx(·, x0 )kL2 ([0,T ],Rn )
kx0 k=1
This equivalence shows that large γ values indicate a bad damping of the closed loop solution.
In the following we will discuss different possibilities to approximate γ. We will divide this in
rigorous bounds and non-rigorous estimates, where the latter will lack an analytic justification but still
may be accurate and are rapidly computable. The explicit formula (31) in Lemma 4.6 shows that γ
−1
can be bounded once a bound for the matrix exponential eE Y t is available. We state one possibility
which is based on the logarithmic norm.
Lemma 4.8 Let ν = ν[E −1 Y ]. If ν < 0, the upper bound
γ ≤ γN,1 := −
kE −1 k2
2ν
(33)
is valid.
Proof After bounding the explicit solution formula (31), it follows
Z ∞
(E −1 Y )T t (E −1 Y )t γ = kL−1 (−In )k ≤ kE −1 k2 e
e
dt
0
Z ∞
−1
≤ kE −1 k2
keE Y t k2 dt
0
Z ∞
kE −1 k2
≤ kE −1 k2
e2νt dt = −
.
2ν
0
t
u
The calculation of kE −1 k can be done without an explicit formation of the inverse matrix, by using
1
the equality kE −1 k = σmin
with σmin denoting the smalles eigenvalue of E. Note that the logarith−1
−1
mic norm ν[E Y ] = ν[E (A − BB T P̂ E)] is not available in the online phase without solving an
expensive n dimensional eigenvalue problem. However, by using the same arguments as in the proof of
Proposition 4.4, ν[E −1 Y ] can be replaced by the upper bound ν[E −1 A] + kE −1 BB T P̂ EkF as long as
the latter is less than 0, which can be calculated in O(N ):
A. Schmidta , B. Haasdonka
14
Corollary 4.9 Let P̂ ∈ Rn×n . If ν[E −1 A] + kE −1 BB T P̂ EkF < 0, the upper bound
γ ≤ γN,2 := −
kE −1 k2
2(ν[E −1 A] + kE −1 BB T P̂ EkF )
(34)
is valid.
Although many problems that describe physically stable phenomena yield LTI systems with ν[E −1 A] <
0, the closed loop system with the applied feedback control must not be dissipative. In this case the
bounds in Lemma 4.8 and Corollary 4.9 cannot be applied and other approaches must be used.
We furthermore note that this bound might lead to bad approximations when E 6= In . Usually
the mass matrices E are sparse and have band structure, but the inverse often is dense and has a
large norm. As the inverse enters the bound quadratically one would expect a bad estimation. Indeed,
the numerical experiments in Section 6.2 confirm the expectation. Hence, other approaches must be
employed for estimating or approximating γ.
Besides the rigorous error bounds presented above, one can also use interpolation methods that may
not lead to rigorous bounds but provide very efficient and often quite accurate approximations. Let
{µ1 , . . . , µM } = Ptrain,γ ⊂ P be a finite set of training parameters with corresponding values (γ(µi ))M
i=1
computed by solving the generalized Lyapunov equation of Lemma (4.5). The goal is then to find a
function Γ (µ) : P → R that interpolates between the calculated γ values. If the points in Ptrain,γ are
located on a grid in the parameter domain, techniques like linear, cubic or spline interpolation can
be employed to construct Γ (µ). Large parameter dimensions or a fine grid resolution for Ptrain,γ can
result in a very large number of training points such that the procedure might become too expensive,
even for the offline phase.
A better and more flexible approach is to use scattered data interpolation techniques such as kernel
interpolation. With those methods, the sampling points must not be located on a structured set but
can be chosen for example as random elements from the parameter domain. The idea behind those
techniques is to define a kernel function k(x, y) and weights α = (αi )M
i=1 , such that the interpolation
takes the form
Γ (µ) :=
M
X
αi k(µ, µi ),
with
µi ∈ Ptrain,γ .
i=1
Once suitable weights αi are calculated, the interpolation can be evaluated rapidly for new values
µ. We use so called thin plate splines as kernel functions k(x, y) = kx − yk ln(kx − yk) as they turn
2
out to provide good results. Other possible choices
are Gaussian functions k(x, y) = e−kx−yk/σ or
p
multiquadratic radial basis functions k(x, y) = 1 + akx − yk2 .
The weights can be determined by solving the linear equation Kα = g, where K = (k(µi , µj ))i,j=1,...M ,
g := (γ(µi ))i=1,...,M and α = (αi )i=1,...,M . For more details concerning the scattered data approximation by using radial basis functions we refer to the monograph [57] and to [36, 58].
4.2 Error Bounds for the LQR Problem
The optimal feedback control problem for LTI systems is certainly among the most interesting applications of the ARE. We recall the LQR problem as it was introduced in Example 2.1:
Z ∞
J := min
(y(t)2 + u(t)2 )dt, E ẋ(t) = Ax(t) + Bu(t), y(t) = Cx(t), x(0) = x0 .
(35)
u
0
As it was shown, this problem can be solved by setting u(t) = −B T P Ex(t) with P being the solution
to the ARE (1). Clearly, using the reduced controller û(t) := −B T P̂ Ex(t) in the full dimensional
problem (35) results in suboptimal behaviour but may represent a viable solution in practice. Denoting
ˆ and ŷ, we derive rigorous bounds for
the corresponding quantities for the reduced controller by J,x̂
ˆ
J − J, kx(t) − x̂(t)k and ky(t) − ŷ(t)k. All results are based on the a-posteriori error bound for the
reduced solution to the ARE.
Reduced Basis Approximation of Large Scale Parametric Algebraic Riccati Equations
15
4.2.1 Suboptimality-Bounds for the Cost Functional
The optimal feedback control u = −B T P Ex minimizes the cost functional
Z ∞
x(t; P )T C T Cx(t; P ) + u(t)2 dt,
J(P ) :=
Z 0∞
=
x(t; P )T (C T C + E T P BB T P E)x(t; P )dt,
(36)
0
where x(t; P ) is the solution of the closed loop LTI system with the feedback control law u(t) =
−B T P Ex(t; P ):
E ẋ(t; P ) = (A − BB T P E)x(t; P ),
x(0; P ) = x0 .
(37)
The goal will be to quantify the error J(P̂ ) − J(P ). For that purpose, the following result shows how
the value of J(P ) can be efficiently calculated without the numerical quadrature of the cost functional.
Lemma 4.10 (Calculation of Cost Functional) Let the matrix E −1 Y (X) with Y (X) := A −
BB T XE be stable. The value of J(X) can be calculated through
J(X) = (Ex0 )T Q(Ex0 )
(38)
where Q is the unique positive definite solution of the generalized Lyapunov equation
Y (X)T QE + E T QY (X) = −G(X),
(39)
with G(X) := C T C + E T XBB T XE. If X is the true solution P of the ARE, it furthermore holds
P = Q.
−1
Proof Rewriting equation (36) with the explicit solution x(t; P ) = eE Y (X)t x0 yields
Z ∞
T
(E −1 Y (X))T t
E −1 Y (X)t
J(X) = x0
e
Ge
dt x0 .
{z
}
| 0
=Q̃
The matrix Q̃ is the unique solution to the Lyapunov equation
(E −1 Y (X))T Q̃ + Q̃E −1 Y (X) = −G.
Substituting Q̃ = E T QE yields equation (39) and (38).
t
u
The following proposition gives an upper bound for the difference between the values of the cost
functional for the true controller and the reduced controller:
Proposition 4.11 (Error Bound for the Cost Functional) Let P be the correct solution to the
ARE (1) and let P̂ be an approximation to P such that the bound from Proposition 4.2 is valid. Then
the following bound for the cost function values holds
J(P̂ ) − J(P ) ≤ ∆J := J(P̂ ) − xT0 E T P̂ Ex0 + kEx0 k2 ∆P .
Proof The knowledge of the solution P allows a simple representation of J.
J(P ) = xT0 E T P Ex0 .
This yields
J(P̂ ) − J(P ) = J(P̂ ) − xT0 E T P Ex0
= J(P̂ ) − xT0 E T P̂ Ex0 − xT0 E T (P − P̂ )Ex0
≤ J(P̂ ) − xT0 E T P̂ Ex0 + kEx0 k2 ∆P .
t
u
A. Schmidta , B. Haasdonka
16
Observing J(P̂ ) = xT0 E T P̂ Ex0 and ∆P = 0 from Corollary 4.3 in the case where the approximation
is exact, the zero error prediction follows:
Corollary 4.12 Let all assumptions of Corollary 4.3 hold. If colspan(P ) ⊂ colspan(W ) the error
indicator correctly predicts zero error ∆J = 0.
Proof According to 4.3 it holds ∆P = 0. Furthermore, by Lemma 4.10 it follows J(P ) = J(P̂ ) =
xT0 E T P̂ Ex0 what proofs ∆J = 0.
t
u
In order to apply the bound stated in Proposition 4.11 the value of the cost functional J(P̂ ) for the
reduced controller must be calculated. This can be done in several ways. The naive approach involves
the integration of the closed loop ODE system with the reduced controller up to a final time T ,
where the solution is almost zero and does not contribute to the cost functional anymore. However,
this approach requires a very fine time discretization due to the fast initial solution behaviour in the
closed loop LTI system. Furthermore, the online independence of n is violated. However, the high
dimensional system can be replaced by a suitable reduced order model for the closed loop state ODE,
and the cost function value can be approximated by using the reduced surrogate model. We have tested
this approach and ended with bad results, since the quality of the reduced order model must be very
good to capture the state behaviour for the cost functional in a satisfactory way.
Looking again at the right hand side of the generalized Lyapunov equation (39) reveals a low rank
structure. The matrix G can be factorized as
C
T
G = F F, F :=
∈ R(p+m)×n .
(40)
BT P E
Hence, as already pointed out in the introduction, the solution Q is likely to have a low rank structure
as well, allowing the efficient approximation as in the ARE case. We employ the same technique as in
Theorem 4.1 for deriving an a-posteriori error estimator for the generalized Lyapunov equation:
Theorem 4.13 (RB-Approximation of Generalized Lyapunov Equations) Let E, Y, G ∈ Rn×n ,
and assume that E −1 Y is stable2 . Let N ∈ Rn×n be the unique and positive definite solution to the
generalized Lyapunov equation
L(N ) = −G,
where
L(N ) := Y T N E + E T N Y.
Assume N̂ ∈ Rn×n is an approximation to N . Define the residual εL and γL by
εL := kY T Q̂E + E T Q̂Y + Gk,
γL := kL−1 k.
Then the following bound
kN − N̂ k ≤ ∆L := γL εL
is valid.
Proof The proof follows the lines in Theorem 4.1. Note that the validity criterion for applying the
theory from [8] in the linear case is always fulfilled, since L(α) = 0 and hence 2γL L(2γL εL ) = 0.
t
u
We will not discuss the theory in more detail since it would be a wide reproduction of the theory
developed for the ARE. Especially we emphasize that a computable version of this theorem can be
established completely analogously to Proposition 4.2 by replacing the quantities with their corresponding computable upper bounds. In [31] a very similar result to Theorem 4.13 for the RB approximation
of generalized Lyapunov Equations is developed. Especially the stated error bound in [31] follows from
Theorem 4.13 when E and A are assumed to be symmetric, hence Theorem 4.13 can be understood
as a generalization.
By using Theorem 4.13, the quantity J(P̂ ) in Theorem 4.11 can be replaced by its approximation
together with the error from the a-posteriori error estimator 4.13.
2
Note that Y is now an arbitrary matrix and not the closed loop matrix A − BB T P̂ E.
Reduced Basis Approximation of Large Scale Parametric Algebraic Riccati Equations
17
Corollary 4.14 (Cost Functional Error Bound) Let Y = A − BB T P̂ E such that E −1 Y is stable
and G = F F T as in (40). Let Q̂ ∈ Rn×n be an approximation to the solution of the generalized
Lyapunov equation (39). Let the assumptions of Theorem 4.13 and Proposition 4.2 be valid. Then the
following bound is valid:
J(P̂ ) − J(P ) ≤ ∆J,2 := xT0 E T (Q̂ − P̂ )Ex0 + kEx0 k2 (∆P + ∆L ).
(41)
Proof Using the same arguments as in 4.11 it follows:
J(P̂ ) − J(P ) ≤ J(P̂ ) − xT0 E T P̂ x0 + kEx0 k2 ∆P .
By using J(P̂ ) = xT0 E T QEx0 with Q being the solution of the generalized Lyapunov equation (39)
and by choosing a suitable approximation Q̂ together with its error bound ∆L it follows
J(P̂ ) = xT0 E T Q̂Ex0 + xT0 E T (Q − Q̂)Ex0 ≤ xT0 E T Q̂Ex0 + kEx0 k2 ∆L ,
t
u
which proofes the claimed bound (41).
The question remains, how a suitable approximation for Q can be found. If we define Q̂ := W QN W T
we can follow the lines as in Section 3 and conclude with a reduced Lyapunov equation
T
YNT QN EN + EN
QN YN = GN ,
(42)
where YN := W T Y V , EN := W T EV and GN := W T GW . It turns out, that the basis constructed for
the approximation of the ARE is also a good choice for approximating the Lyapunov equation for the
cost functional value J(P̂ ).
4.2.2 State and Output Bounds
Usually one is not only interested in the value of the cost functional but also in the outputs of the
system. We use the error bound for the approximate solution to the ARE to define the following
state error bound between the controlled system with the optimal LQR controller and the suboptimal
controller.
Proposition 4.15 (State Error Bound) Let E, A, B, C define a controllable and observable LTI
system with ν := ν[E −1 A] < 0. Let P be the solution to the corresponding ARE and let P̂ be an
approximation to P such that the validity criterion (26) from Proposition 4.2 is satisfied. Define the
systems
E ẋ(t) = (A − BB T P E)x(t),
x(0) = x0
and
˙
E x̂(t)
= (A − BB T P̂ E)x(t),
x̂(0) = x0 . (43)
Let C̃ := kExk∞ = supt≥0 kEx(t)k be known. Then the error e(t) = x(t) − x̂(t) can be bounded by
1 νt
1 νt
ke(t)k ≤ ∆x (t) := C̃kE −1 BB T k∆P
e − 1 exp
e − 1 kE −1 BB T P̂ Ek , ∀t ≥ 0. (44)
ν
ν
Proof The explicit solutions for x(t) and x̂(t) are given through
Z t
−1
−1
x(t) = eE At x0 −
eE A(t−s) E −1 BB T P Ex(s)ds,
0
Z t
−1
E −1 At
x̂(t) = e
x0 −
eE A(t−s) E −1 BB T P̂ E x̂(s)ds.
0
Hence the error fulfills
Z t
−1
e(t) =
eE A(t−s) E −1 BB T (P Ex(s) − P̂ E x̂(s))ds
0
A. Schmidta , B. Haasdonka
18
Z
=
t
eE
−1
A(t−s)
E −1 BB T (P Ex(s) − P̂ Ex(s) + P̂ Ex(s) − P̂ E x̂(s))ds.
0
Taking the norm on both sides and application of the triangle inequality gives
Z t
−1
keE A(t−s) kkE −1 BB T k kP − P̂ k kEx(s)k ds
ke(t)k ≤
| {z } | {z }
0
≤∆P
t
Z
keE
+
−1
A(t−s)
≤C̃
kkE −1 BB T P̂ Ekke(s)kds.
0
Applying Gronwall’s inequality results in
Z t
Z t
−1
−1
keE A(t−s) kdskE −1 BB T P̂ Ek ∆P .
keE A(t−s) kds exp
ke(t)k ≤ C̃kE −1 BB T k
0
0
If we now use the dissipativity of the matrix A, we find
Z t
Z t
1 νt
A(t−s)
eν(t−s) ds =
ke
kds ≤
e −1 ,
ν
0
0
which completes the stated error bound.
Clearly, this bound greatly depends on the behaviour of the uncontrolled systems. Especially if ν[E −1 A]
is close to zero, the bound can become very large and not useful anymore. Finally, we can use this
state bound to define an error bound for the outputs of the system:
Corollary 4.16 (Output Error Bound) Let the assumptions of Proposition 4.15 hold. Define the
outputs y(t) = Cx(t) and ŷ(t) = C x̂(t). Then the difference can be bounded by
ky(t) − ŷ(t)k ≤ ∆y (t) := kCk∆x .
Furthermore, each individual error in the output components can be bounded by
|yi (t) − ŷi (t)| ≤ k(cij )nj=1 k∆x (t),
i = 1, . . . , m.
5 Offline/Online Decomposition
The key to the efficient calculation of the important quantities is an offline/online strategy which relies
on the parameter separability of the data and system matrices, this means we assume representations
of the form
A(µ) =
C(µ) =
QA
X
qA
ΘA
(µ)AqA ,
B(µ) =
qA
qB
QC
X
QE
X
qC
ΘC
(µ)CqC ,
E(µ) =
qC
Qx0
x0 (µ) =
QB
X
X
qB
ΘB
(µ)BqB
qE
ΘE
(µ)EqE .
(45)
qE
q
Θxx00 (µ)x0qx .
0
qx0 =1
This structure allows the precalculation of many parts of the residual norm and other quantities in an
offline step. The online step then only requires matrix operations with complexity depending on the
quantities QA , QB , QC , QE , Qx0 and the dimension of the reduced system N . Hence, those numbers
should be preferably small.
Reduced Basis Approximation of Large Scale Parametric Algebraic Riccati Equations
19
Calculating the Residual Norm
The calculation of the residual norm ε(µ) can take an excessive amount of time for large n. This is due
to the matrix multiplications that are necessary for forming the residual matrix and the calculation of
the norm. We will now show how the residual can be obtained in a complexity completely independent
of the high dimension.
Clearly the induced 2-norm of R(P̂ (µ)) is not feasible for large scale applications, since this would
require the knowledge of the largest eigenvalue of an n × n matrix. A better choice involves the
Frobenius norm, which is an upper bound
√
√ εN (µ) for the induced 2-norm of a matrix. Indeed, the norm
equivalence inequality kAk ≤ kAkF ≤ nkAk indicates, that it does not differ moreqthan a factor n
from the 2 norm of A ∈ Rn×n . The calculation of the Frobenius norm kR(P̂ )kF =
yields
tr(R(P̂ )T R(P̂ ))
kR(P̂ ))k2F = 4 tr(C T CAT P̂ E) − 4 tr(E T P̂ BB T P̂ EAT P̂ E)
− 2 tr(E T P̂ BB T P̂ EC T C) + tr(E T P̂ BB T P̂ EE T P̂ BB T P̂ E)
+ 2 tr(E T P̂ AAT P̂ E) + 2 tr(E T P̂ AE T P̂ A)
(46)
+ tr(C T CC T C)
We here made extensive use of the identity tr(ABC) = tr(CAB) and tr(AT B) = tr(AB T ). If we now
use the definition of the reconstruction of the reduced solution P̂ = W PN W T , we can exploit the
structure further and identify matrices that can be treated by an offline/online decomposition:
tr(C T CAT P̂ E) = tr(W T EC T CAT W PN )
tr(E T P̂ BB T P̂ EAT P̂ E) = tr(W T EE T W PN W T BB T W PN W T EAT W PN )
tr(E T P̂ BB T P̂ EC T C) = tr(W T EC T CE T W PN W T BB T W PN )
tr(E T P̂ BB T P̂ EE T P̂ BB T P̂ E) = tr(W T EE T W PN W T BB T W PN W T EE T W PN W T BB T W PN )
tr(E T P̂ AAT P̂ E) = tr(W T EE T W PN W T AAT W PN )
tr(E T P̂ AE T P̂ A) = tr(W T AE T W PN W T AE T W PN ).
All underlined matrices are of dimension N ×N and parameter separable and can be assembled rapidly.
For instance the product W T EC T CAT W can be treated in the following way3 :
W T E(µ)C T (µ)C(µ)A(µ)W =
QE X
QC X
QA
X
j
i
k l
(ΘE
ΘC
ΘC
ΘA )(µ) W T Ei CjT Ck Al W .
|
{z
}
i=1 j,k=1 l=1
=:Mi,j,k,l
Note that this sum only requires matrix operations on N ×N matrices once the parameter independent
components Mi,j,k,l are calculated. All other parts in the Frobenius norm containing P̂ can be treated
in a similar way. The last summand tr(C T CC T C) is decomposeable into
tr(C(µ)T C(µ)C(µ)T C(µ)) =
QC
X
j
i
k l
(ΘC
ΘC
ΘC
ΘC )(µ) tr(CiT Cj CkT Cl ),
i,j,k,l=1
where again a precalculation of all parameter independent traces tr(CiT Cj CkT Cl ) allows the rapid
calculation. We see that the overall complexity for the calculation of the Frobenius norm in the online
step is
O(Q4C + N + N 3 (QE QA Q2C + Q2E + Q2B + Q2A + QE QA + Q2E Q2C )).
(47)
Hence, if the parameter independent components are precalculated and stored in an offline phase, the
Frobenius norm of the residual can be assembled in a complexity independent of n. The calculation
time depends quartic on the number of terms in the expansion of C(µ) and quadratically on C(µ),
E(µ) and B(µ).
3
We use the abbreviation (Θ1 Θ2 )(µ) := Θ1 (µ)Θ2 (µ).
A. Schmidta , B. Haasdonka
20
Calculating the Error Bound for the Parametric ARE
The computable error bound for approximations to the ARE as stated in Proposition 4.2 can be
calculated online in a complexity independent of n. We have already seen that the Frobenius norm of
the residual can be calculated online. Furthermore, we assume that a rapidly computable approximation
γN (µ) is available. An upper bound on the validity check 8kE(µ)k2 kB(µ)k2 εN (µ)γN (µ) ≤ 1 can also
be decomposed in an offline/online fashion. To see this, we again make use the fact that the Frobenius
norm is an upper bound for the Euclidean norm:
kE(µ)k2 ≤ kE(µ)k2F =
QE
X
j
i
(ΘE
ΘE
)(µ) tr(EiT Ej ),
i,j=1
kB(µ)k2 ≤ kB(µ)k2F =
QB
X
j
i
(ΘB
ΘB
)(µ) tr(BiT Bj ).
i,j=1
Hence, the bound can be calculated online in a complexity independent of n once tr(EiT Ej ) and
tr(BiT Bj ) is precalculated and stored offline. If QE = 1 or QB = 1, the direct calculation of kB(µ)k2 =
1
1
(ΘB
(µ))2 kB1 k2 and kE(µ)k2 = (ΘE
(µ))2 kE1 k2 might yield sharper
validity criteria, since the Frobe√
nius norm can differ from the Euclidean norm by a factor of n.
Calculating the LQR Bounds Online
We will now see how the bounds for the error in the cost functional can be calculated efficiently
online, see Proposition 4.11. We will assume for simplicity that an RB approximation for Q̂(µ) takes
the form Q̂(µ) = W QN (µ)W T , with the same basis matrix W as for the approximation of the ARE
P̂ (µ) = W PN (µ)W T . Then it follows
Qx0
xT0 (µ)E(µ)T (Q̂(µ)
T
− P̂ (µ))E(µ) x0 (µ) =
QE
X X
k l
(Θxi 0 Θxj 0 ΘE
ΘE )(µ)JTi,k QN Jj,l ,
i,j=1 k,l=1
where Ji,k := W T Ei x0k ∈ RN ×1 . Furthermore, the norm kE(µ)x0 (µ)k can also be calculated efficiently:
Qx0
2
T
T
kE(µ)x0 (µ)k = x0 (µ) E(µ) E(µ)x0 (µ) =
QE
X X
k l
(Θxi 0 Θxj 0 ΘE
ΘE )(µ)xT0i EkT El x0j ,
i,j=1 k,l=1
where again a precalculation and storage of all components xT0i EkT El x0j can accelerate the computation.
The online calculation of ∆x and ∆y can be performed efficiently when QE = 1. Under this
assumption, the matrix norm kE(µ)−1 B(µ)B(µ)T k can be bounded by the Frobenius norm:
kE(µ)−1 B(µ)B(µ)T k2 ≤ kE(µ)−1 B(µ)B(µ)T k2F = tr(E(µ)−1 B(µ)B(µ)T E(µ)−T B(µ)T B(µ))
=
1
1
ΘE (µ)2
QB
X
j
i
k l
(ΘB
ΘB
ΘB
ΘB )(µ) tr(Bi BjT E1−T E1−1 Bk BlT ).
i,j,k,l=1
The other term kE(µ)−1 B(µ)B(µ)T P̂ (µ)E(µ)k can also be estimated similarly, by again using the
Frobenius norm and the trace identities used also for the residual.
Reduced Basis Approximation of Large Scale Parametric Algebraic Riccati Equations
21
6 Numerical Examples
We will now present three examples that will show the performance of the RB-ARE approach and
the bounds developed in Section 3 and 4. All examples were performed in Matlab with the toolbox
RBmatlab4 . The solution to the AREs in the offline phase and for comparison purposes were calculated
by using the package MESS5 .
6.1 Overview of the Examples
Our examples for parametric ARE scenarios are all related to optimal feedback control problems for
parabolic partial differential equations that model heat transfer phenomena. We give a brief description
of the models and show how they can be transformed to standard LTI systems, which then should be
controlled by applying a feedback control with weighting matrices Q = Ip and R = Im . As pointed out
in example 2.1, the problem can be handled by solving the corresponding ARE, which will be subject
to our numerical investigation.
The purpose behind the first two examples is to highlight the different performance and behaviour
of the theoretical bounds developed in Section 4 for different discretization techniques, namely the
finite difference (FD) method and the finite element (FE) method and different model complexities.
FD discretizations yield LTI systems with E = In . This allows us to employ the rigorous bounds
for the constant γ, developed in Section 4.1.2 and to prove the stability of the reduced controller, if
employed as a controller for the full dimensional system, by Proposition 4.4. On the other hand, the
FE discretization is characterized by E 6= In which technically will still allow the computation of an
upper bound for γ, but as we will see in the experiments the values are very pessimistic due to kE −1 k
entering the bound. For small systems, the elimination of E by inversion or by employing suitable
state space transformations based on Cholesky (or LU) decompositions would be a possbile remedy.
However this is not always recommended or even impossible due to fill in effects unless the explicit
inversion is avoided by adjusting the algorithms to make use of forward and backward substitution.
Hence we must rely on interpolation techniques that lack rigorosity for the resulting error bounds but
provide very rapid and accurate approximations. Finally, the last example has a very high dimension
and serves as a benchmark example for demonstrating the runtime benefit when using the RB-ARE
approach. An overview of all models is given in Table 1.
Model Name
FDM
Thermalblock
Rail
n
3600
3721
20209
m
1
2
6
p
1
2
7
dim P
3
4
3
|PT |
729
600
400
E = In
yes
no
no
tfull [s]
1.7
6.6
243.5
∅ rk(Z)
80
203
1578
Table 1: Overview of the experiments with corresponding dimensions and properties. State dimension
n, number of inputs m, number of outputs p and dimension of parameter space dim P. Furthermore
the average time for the solution of the detailed ARE tfull and the average low rank factor rank is
given, based on 10 calculations for different parameters.
Two Dimensional Heat Transfer with Distributed Control
The first “simple” model describes a two dimensional heat transfer with distributed control and homogeneous Dirichlet zero values. Denoting the spatial coordinates as ξ = (ξ1 , ξ2 ) ∈ Ω := (0, 1)2 , for
given µ ∈ P and T > 0 we seek a function w(·, ·; µ) : Ω × [0, T ] → R solving
wt (ξ, t; µ) − µdiff ∆w(ξ, t; µ) + µadv · ∂ξ1 w(ξ, t; µ) = B(ξ; µ)u(t)
4
5
in Ω × (0, T ],
(48a)
RBmatlab can be freely obtained via www.morepas.org.
MESS (Matrix Equation Sparse Solver) has been gratefully provided by J. Saak from the MPI Magdeburg.
A. Schmidta , B. Haasdonka
22
ΩB
ΩC
(a) Sketch of the setting for the fi- (b) Initial value. Dark red repre- (c) Time t = 0.5. The parameters
sents values close to 1 and dark blue in the upper row are (0.4, 0, 1), in
nite difference model.
is 0.
the lower (0.4, 5, 1). Left is uncontrolled, right controlled.
Fig. 1: Setup for the finite difference model (FDM).
w(ξ, t; µ) = 0
w(ξ, 0; µ) = w0 (ξ; µ),
on ∂Ω × (0, T ]
on Ω.
(48b)
(48c)
The function B(ξ; µ) = µB 1ΩB (ξ) determines the influence of the controller u(t) on the control domain
ΩB , where 1ΩB (ξ) is the characteristic function on the set ΩB := [0.4, 0.6]2 . Finally, we will consider
a partial measurement of the state on a subdomain ΩC = [0.8, 0.85] × [0.05, 0.95]:
Z
s(t; µ) =
w(ξ, t; µ)dξ, t ∈ [0, T ].
(49)
ΩC
The initial value was set to w0 (ξ; µ) = sin(πξ1 ) sin(πξ2 ). The system depends on three parameters
µ = (µdiff , µadv , µB ), stemming from the parameter set P = [0.1, 2]×[0, 10]×[1, 15]. The first parameter
µdiff determines the heat coefficient. Small values lead to a slow thermal diffusion which makes the
problem harder to control. The second parameter is the velocity of the heat transport in ξ1 direction.
We will use the last parameter µB to change the control influence. In Figure 1 the setup is sketched
and the controlled state is shown for two different parameter choices.
The differential equation is semidiscretized in space by using central finite differences on a uniform
grid with equidistant step size in each spatial dimension. This yields an n = n2ξ = 3600 dimensional
ODE system, where nξ denotes the number of discrete points in the interior of the domain in one
spatial dimension. The boundary values are not explicitly mapped by the discrete representation.
2
X
d
qA
1
ΘA
(µ)AqA x(t; µ) + ΘB
x(t; µ) =
(µ)Bu(t)
dt
q =1
(50a)
A
x(0; µ) = x0 (µ),
y(t; µ) = Cx(t; µ),
(50b)
(50c)
1
2
1
with ΘA
(µ) = µdiff , ΘA
(µ) = µadv and ΘB
(µ) = µB .
Thermalblock Model with Finite Element Discretization
The second “complex” model describes a heat conduction in the two dimensional domain Ω = (0, 1)2
with insulation, i.e. homogeneous Neumann boundary values, on the boundary part Γ1 = ({0} ×
[0, 0.5]) ∪ ([0, 1] × {0}) ∪ ({1} × [0, 0.5]). On the remaining edges Γ2 = ∂Ω \ Γ1 we impose homogeneous
Dirichlet boundary values. In this example we control the system with distributed controls on two
subsets ΩB1 and ΩB2 by using two distinct control functions u1 (t) and u2 (t). For a fixed parameter
µ ∈ P and T > 0 we seek the solution w(ξ, t; µ) of the following PDE:
∂t w(ξ, t; µ) − ∇ · (a(ξ; µ)∇w(ξ, t; µ)) = 5(1ΩB1 (ξ)u1 (t) + 1ΩB2 (ξ)u2 (t))
on Ω × (0, T ],
(51a)
Reduced Basis Approximation of Large Scale Parametric Algebraic Riccati Equations
10
Ω2
8
Γ2
7
6
ΩB1
100
y1 (t,µ)
y2 (t,µ)
9
ΩB2
5
4
Output yi (t,µ)
Ω1
23
80
60
40
20
3
Γ1
2
ΩC1
ΩC2
1
0
0
0
0.2
0.4 0.6
Time t
0.8
1
(b) Controlled state at time t = (c) The value of the uncontrolled
(a) Setup for the thermalblock ex- 0.05.
(solid) and controlled (dashed) outperiment.
puts.
Fig. 2: Setting for the thermalblock experiment. The state plot and the output are plotted for µ =
(1, 2, 10, 4).
∂ν w(ξ, t; µ) = 0
w(ξ, t; µ) = 0
w(ξ, 0; µ) = w0 (ξ; µ).
on Γ1 × (0, T ],
on Γ2 × (0, T ],
in Ω.
(51b)
(51c)
(51d)
The function a(ξ; µ) = µ1 1Ω1 (ξ) + µ2 1Ω2 (ξ) determines the heat coefficient in the two subdomains
Ω1 and Ω2 with values µ1 , µ2 ∈ [1, 5]. The measurements are taken near the lower boundary on two
regions ΩC,1 = [0.2, 0.3] × [0.05, 0.15], ΩC,2 = [0.7, 0.8] × [0.05, 0.15] and can be individually weighted
by coefficients µCi :
Z
1
µC w(ξ, t; µ)dξ, t ∈ [0, T ], for i = 1, 2.
si (t; µ) :=
|ΩC,i | ΩC,i i
The complete setting is depicted in Figure 2 together with a controlled example realization. Overall,
the system depends on four parameters µ = (µ1 , µ2 , µC1 , µC2 ) ∈ [1, 5]2 × [1, 20]2 .
The spatial discretization is now performed by using finite elements with linear ansatz functions
on a regular triangular grid, resulting in an n = 3721 dimensional system of the form
E
3
X
d
qA
x(t; µ) =
ΘA
(µ)AqA x(t; µ) + B(u1 (t) u2 (t))T ,
dt
q =1
(52a)
A
x(0; µ) = x0 (µ)
y(t; µ) =
2
X
qC
ΘC
(µ)C qC x(t; µ)
(52b)
(52c)
qC =1
1
2
3
1
where the coefficient functions are given as ΘA
(µ) = µ1 , ΘA
(µ) = µ2 , ΘA
(µ) = 1, ΘC
(µ) = µC1 and
2
ΘC
(µ) = µc2 . In particular the boundary values are now explicitly included in the expansion of A(µ).
In this example, the initial value is chosen to be 10 everywhere.
Optimal Cooling of Steel Rail Profiles
The last example under consideration has been developed to optimize the active cooling of hot steel
profiles. This problem arises in rolling mills and is considered in order to achieve high production rates,
where a fast cooling is crucial. The cooling is accelerated by spraying water through 7 nozzles onto
the hot steel rod. Measurements at 6 different positions on the outside of the steel profile are used to
ensure even temperature distribution throughout the cooling process to avoid unwanted deformations
or brittleness. We refer to [5, 45, 54] for more information about the modeling and the exact description
A. Schmidta , B. Haasdonka
24
of the control problem. The system is semidiscretized by applying linear finite elements, which results
in the following system of ordinary differential equations.6
E ẋ(t; µ) = µA Ax(t; µ) + Bu(t),
x(0; µ) = x0 ,
y(t) = µC Cx(t; µ).
We artifically parametrize the model by introducing parameters µA and µC . The first parameter
µA ∈ [0.1, 10] describes different thermal conductivities of the material and µC ∈ [1, 100] can be used
to adjust the cooling speed.
(a) Coarse grid of the cross section
of the rail profile.
(b) Examples of the temperature distribution and the isolines in the uncontrolled case. The left plot shows the initial temperature distribution,
the right at time t ≈ 45s. The rightmost plots shows a finer grid.
Fig. 3: Grid and example of temperature distribution for the rail model. The plots were taken from [45].
6.2 Numerical Results
Basis Generation Results
We first examine the results from the LRFG basis generation algorithm, where we compare the methods
add one and add many as they were proposed in Section 3.1. The algorithm was initialized with the
desired tolerance ε = 10−6 and the inner tolerance was set to εI = 10−4 . We use the normalized
residual
∆(V, µ) :=
kR(P̂ (µ))kF
.
kC(µ)T C(µ)kF
(53)
as error indicator. This measure is also frequently used to quantify the approximation quality in
iterative solution algorithms for the ARE, cf. [6, 46]. Due to the offline/online decomposition, ∆(V, µ)
is cheaply evaluable, even in cases where the number of training elements |PT | is quite large. The
resulting basis sizes in Table 2 show that the basis for the add one mode is smaller but the basis
generation times confirm the expectation that the smaller basis comes with much higher offline costs.
The basis construction with the add many also requires less solutions of high dimensional AREs. The
smaller basis size may become crucial in applications where only limited storage is available or where
the performance is very critical.
Figure 4 shows two insights to the greedy procedure. We see in Figure 4a a nice exponential decay
for the residual based error indicator (53) for all examples under consideration and for both modes.
Especially for the rail model, the results are hard to distinguish. Comparing the very similar decay in
the true relative errors in Figure 4c to the relative residuals used as an error indicator during the basis
building procedure indicates that (53) indeed is a good indicator for the true error.
Figure 4b shows the compression achieved by the POD in the inner iterations. Due to the fast
singular value decay, the low rank factors can be efficiently represented by only ≈ 20 − 30 POD modes
6
The system matrices can be downloaded on
https://portal.uni-freiburg.de/imteksimulation/downloads/benchmark/Steel Profiles (2838881).
FDM
Thermalblock
Rail
101
10−2
10
25
200
Elements
Max. Normalized Residual
Reduced Basis Approximation of Large Scale Parametric Algebraic Riccati Equations
100
−5
10−8
0
0
50
100
Basis Size
1
150
2
3
4
5
6
Outer Greedy Iteration
7
(b) Low rank factor sizes and the number of vectors returned by the POD in the greedy iterations.
(a) Decay of error indicator during the basis building procedure. The solid lines are for the add one
mode, dashed lines are add many.
Maximum Relative Error
100
True Error
Error Indicator
10−2
10−4
10−6
0
50
100
Basis Size
(c) True relative error kP (µ) − P̂ (µ)kF /kP (µ)kF
for the Thermalblock model plotted over increasing basis sizes, calculated on a random test set
containing 10 parameters.
Fig. 4: Basis generation results. The dashed line is the algorithm with the mode add many, the solid
line is add one.
for reaching the desired tolerance εI . Note that the basis generation times in Table 2 do not include
the time for preparing the calculation of the γ values in cases where interpolation technique is applied.
We will see later on in this section that this significantly increases the offline time. We will use the
bases generated by the “add one” method in the following experiments.
Model Name
FDM
Mode
add one
add many
Basis Size
131
158
Gen. Time [s]
439.4
47.8
#ARE Solves
71
11
Thermalblock
add one
add many
122
141
813.8
91.4
31
7
Rail
add one
add many
109
158
15187.1
856.1
50
3
Table 2: Basis generation results for all experiments.
A. Schmidta , B. Haasdonka
26
5
10
4
−2
µ1
Stability criterion
0
3
0
−4
−6
2
1
0
5
10
µB
15
20
−10
5
10
15
µC 1
(a) Stability criterion for the FDM model with µdiff =
0.1 (solid), µdiff = 0.3 (dashed) and µadv = 0.
(b) Stability criterion plotted for varying parameters
µC1 and µ1 . The other parameters were set to µ2 =
µC2 = 1.
Fig. 5: Investigation of the sufficient stability criterion (54) that was stated in Proposition 4.4.
Stability Considerations
One property that is desired not only in applications but also for theoretical purposes is the stability
of the matrix G = E −1 (A − BB T P̂ E), where P̂ = W PN W T with PN being the solution to the reduced
ARE. The stability of G is a crucial ingredient for the error estimator developed in Section 4, since it
ensures the solvability of the generalized Lyapunov equation that serves as a basis for the calculation
of γ. Note that by using the full dimensional solution to the ARE (1), i.e. P̂ = P , the stability is
always guaranteed. In Proposition 4.4 we have developed a strategy for proving the stability of G by
using the dissipativity of E −1 A. The resulting computable condition
ν[E −1 A] + kE −1 BB T P̂ EkF < 0
(54)
is related to the controller influence on the dynamics of the system. In order to see this, we plot
the value of the left hand side of the stability criterion (54) for varying parameters for the FDM
and the Thermalblock model in Figure 5. For the FDM model the parameter µB scales the input,
i.e. the control. Large values lead to more influence of the controller and make the stability criterion
invalid. A similiar behaviour can be seen in Figure 5b, where the value of (54) is plotted for the
Thermalblock model for varying parameters µ1 and µC1 . Increasing µ1 causes faster heat conduction
and hence a fast stabilization whereas the value of µC1 changes the controller influence again. However,
we see that for large subdomains of P we can conclude stability by using the criterion (54) in both
models. Furthermore, all of our calculations led to stable closed loop systems, also in cases where the
criterion (54) was not applicable. This was checked by calculating the largest eigenvalue of G. Note
that (54) is only a sufficient condition that must not hold for stable systems.
Calculation of Gamma
For the error estimation, the factor γ or suitable approximations to it are required. Though, its calculation is a tough problem. We have seen in Section 4.1.2 that the exact value can be calculated by
solving a generalized Lyapunov equation with high rank right hand side and by taking the 2-norm
of the solution. However, this approach is very expensive and cannot be performed online. We hence
have stated two alternatives for the calculation: An upper bound γN,2 (Corollary 4.9) and a kernel
interpolation based on radial basis functions. We begin by checking the applicability and the results
of the upper bound in Corollary 4.9 for the FD model. The bound greatly relies on the dissipativity
of the system, since we can only apply the upper bound when ν[E −1 A] + kE −1 BB T P̂ EkF < 0. Since
this term occurs in the denominator of the bound, values close to zero make the bound useless. Nevertheless, the FD model has nice stability properties as we have seen before and the bound predicts
Reduced Basis Approximation of Large Scale Parametric Algebraic Riccati Equations
27
the correct values of γ very accurately as indicated in Table 3. Applying the bound for the thermalblock model with FE discretization yields a large factor kE −1 k2 in the numerator of the upper bound,
which makes it useless for practical applications although it still may be rigorous. We hence focus on
an interpolation based method for calculating approximations to γ. We calculated the kernel weights
offline for a varying number of random sample points in the parameter domain P and validate the
interpolation on a test parameter set Ptest with 50 random points. As radial basis functions we choose
so called “thin-plate splines”: k(x, y) = kx − yk2 ln(kx − yk). The results are given in Table 4. The
decreasing error indicates a good approximation for increasing sample sizes, but this comes with a very
high computational effort in the offline phase.
We furthermore note that due to memory limitations a direct calculation of γ(µ) for the rail model
was not possible. Note that the solution to the Lyapunov equation for calculating the constant has no
low rank structure and is dense. The development of other bounds or good approximations for very
large applications will be part of our future research. The offline phase for the interpolation method
was performed on a random but fixed training set. This can be improved by using techniques like
Kriging or by making use of heuristic strategies for choosing smart sampling parameters, cf. [36].
γ
FDM
Thermalblock
(µ)
Effectivity N,2
γ(µ)
Mean
Max
Min
1.1524
1.9263
1.0002
1036.1761
2757.3814
261.7055
Table 3: Effectivities for the upper bound for γ(µ).
Training Set Size
meanµ∈Ptest ρ(µ)
maxµ∈Ptest ρ(µ)
Offline Time [s]
10
0.80701
0.98885
16922
20
0.086894
0.17583
35033
50
0.077703
0.2057
87608
70
0.068529
0.13961
122820
100
0.048615
0.12525
170233
N (µ)|
Table 4: Relative error ρ(µ) := |γ(µ)−γ
and calculation time for the kernel interpolation of the γ
γ(µ)
values in the thermalblock setting.
Error Bounds for the ARE
The main theorem in this paper is the a-posteriori error bound for reduced solutions of the ARE,
stated in Theorem 4.1 and its computable version in Proposition 4.2. We will now test the quality of
the bound and define therefor a test set Ptest ⊂ P consisting of 50 randomly chosen parameter values.
Instead of checking the absolute error bound, we define the relative error estimator
∆P,rel (µ) :=
∆P (µ)
kP̂ (µ)kF
,
for ease of comparison of the results for the different models. For the calculation of the γ values we
choose the exponential bound 4.4 for the FD model and the interpolation for the Thermalblock model.
Note that the latter does not yield rigorous results but provides excellent approximations to the error
bound values when the correct γ value is used.
We begin by checking the validity and the decay of the error estimator for the FD model and
the Thermalblock model for increasing basis sizes, see Figure 6a. The maximum relative error is
nicely decreasing with increasing basis size. Furthermore we see, that the validity criterion (26) is
fulfilled even for relatively small basis sizes. Looking at Figure 6b however reveals, that the bound
overestimates the true error by a factor of ≈ 200, but the characteristics in the true error are nicely
reproduced by the error estimator. The relative true errors in the ARE solutions for the rail model
A. Schmidta , B. Haasdonka
10−5
FDM
Thermalblock
103
10−7
100
10−9
10−3
10−6
kP (µ) − P̂ (µ)kF
∆P (µ)
Error
maxµ∈Ptest ∆P,rel (µ)
28
10−11
0
20
40
60
80
Basis Size
100
120
(a) Relative error estimator for increasing basis
sizes (thick) and corresponding validity criterion
(thin). The estimator is valid when the validity
criterion is beneath the black dotted line.
0
10
20
30
Parameter Index
40
50
(b) Absolute value of the true error and the estimated error for the FD model over a random
subset of parameters.
Fig. 6: Results for the a-posteriori residual based error bounds.
are all in the range of 10−4 to 10−5 . Note that the true error in the Frobenius norm kP (µ) − P̂ (µ)kF
can be calculated efficiently by only using the low rank factors, such that the formation of the high
dimensional matrices P (µ) and P̂ (µ) can be avoided. Unfortunatelly, it was not possible to apply the
rigorous error estimation procedure, since the calculation of the corresponding γ values exceeded the
memory capacities.
The benefit of using the RB-ARE approach for parametric scenarios becomes clear when regarding
the calculation times and the speedup. All simulation times for the reduced ARE vary in a range
between 10ms for the FD model and 20ms for the Thermalblock example, including the error estimation
and residual calculation. Hence, the speedup factor is approximately 160 for the FD model and 360
for the more complex Thermalblock example. Large problems like the rail model yield large speedups
in the magnitude of 500 − 1.000.
LQR Error Bounds
Finally we discuss the results that were developed for the LQR framework if the approximation from
the RB-ARE approach is used as a surrogate for the true solution. The bound for the cost functional
turns out to be too pessimistic. A basis size of ≈ 40 already results in a numerically zero relative
error (J(P̂ ) − J(P ))/J(P ) for the FDM and Thermalblock model, whereas the bound 4.14 predicts a
guaranteed relative error of 10−3 . One possible explanation for this behaviour is the diffusivity in the
models, because errors in the suboptimal controllers are smoothed by the dynamics of the system and
hence lead to almost identical behaviour.
We lastly discuss the state error bound in Proposition 4.15. It is increasing monotone and tends
1
−1
BB T k∆P . Applying this bound for the FD model
to the limit ∆∞
x = limt→∞ ∆x (t) = −ν[E −1 A] C̃kE
∞
−6
with µ = (0.5, 1.5, 3) results in ∆x = 0.39 · 10 . The true maximum difference maxt≥0 kx(t) − x̂(t)k is
1.22 · 10−10 . Hence also this bound can be pessimistic but still is a rigorous upper bound. Particulary
the long term behaviour is not captured well by this bound, since the true error exponentially decays to
zero. However, our results show that the RB-ARE approach yields very accurate results in all examples
when comparing the true errors.
7 Conclusion
In this paper we have studied the application of the RB approach for the approximation of large
scale AREs in a parametric scenario. After developing a suitable reduced surrogate equation a new
algorithm was proposed for the basis building procedure. The quality of the approximation is validated
by rigorous and non-rigorous error estimations for the solution to the ARE. Furthermore a performance
Reduced Basis Approximation of Large Scale Parametric Algebraic Riccati Equations
29
bound has been developed, proving the feasibility of the reduction approach in the optimal feedback
control (LQR) setting. Splitting the procedure in an online/offline phase allows an online complexity
completely independent of the dimension of the original problem. Experiments with two dimensional
heat conduction problems show the benefit of applying the reduction technique in multi-query scenarios.
The whole procedure greatly relies on the assumption of low rank approximability of the solution
to the ARE, which is feasible in many scenarios where the number of inputs and outputs is very small
compared to the number of states. This assumption is crucial, but an extension of the approximation
to systems with high dimensional controls or outputs is certainly of great interest. Another interesting
application is the combination of the presented RB-ARE approach together with an approximated
state estimation for example by using a Kalman filter, see [10]. Applying model reduction for the
parametric discrete ARE is also of great practical interest.
8 Acknowledgement
The authors acknowledges funding by the Landesstiftung Baden Württemberg gGmbH and would like
to thank the German Research Foundation (DFG) for financial support within the Cluster of Excellence
in Simulation Technology (EXC 310/1) at the University of Stuttgart.
References
1. A.C. Antoulas. Approximation of large-scale dynamical systems, volume 6. Society for Industrial
and Applied Mathematics, 2005.
2. A.C. Antoulas, D.C. Sorensen, and Y. Zhou. On the decay rate of Hankel singular values and
related issues. Systems & Control Letters, 46(5):323–342, 2002.
3. J.A. Atwell, J.T. Borggaard, and B.B. King. Reduced order controllers for Burgers’ equation
with a nonlinear observer. International Journal of Applied Mathematics and Computer Science,
11:1311–1330, 2001.
4. P. Benner, J.-R. Li, and T. Penzl. Numerical solution of large-scale Lyapunov equations, Riccati equations, and linear-quadratic optimal control problems. Numerical Linear Algebra with
Applications, 15(9):755–777, 2008.
5. P. Benner and J. Saak. Linear-quadratic regulator design for optimal cooling of steel profiles. 2005.
6. P. Benner and J. Saak. Numerical solution of large and sparse continuous time algebraic matrix
Riccati and Lyapunov equations: a state of the art survey. GAMM-Mitteilungen, 36(1):32–52, 2013.
7. J. Borggaard, M. Stoyanov, and L. Zietsman. Linear feedback control of a von Kármán street
by cylinder rotation. In Proceedings of the 2010 American Control Conference, pages 5674–5681.
IEEE, 2010.
8. G. Caloz and J. Rappaz. Numerical analysis for nonlinear and bifurcation problems. Handbook of
Numerical Analysis, 5:487–637, 1997.
9. G. Dahlquist. Convergence and stability in the numerical integration of ordinary differential equations. Mathematica Scandinavia, 4(1):33–53, 1956.
10. M. Dihlmann and B. Haasdonk. A reduced basis Kalman filter for parametrized partial differential
equations. ESAIM: Control, Optimisation and Calculus of Variations, 2015.
11. A.R. Ghavimi and A.J. Laub. Backward error, sensitivity, and refinement of computed solutions
of algebraic Riccati equations. Numerical Linear Algebra with Applications, 2(1):29–49, 1995.
12. M.A. Grepl and A.T. Patera. A posteriori error bounds for reduced-basis approximations of
parametrized parabolic partial differential equations. ESAIM: Mathematical Modelling and Numerical Analysis, 39(1):157–181, 2005.
13. C. Großmann and H.G. Roos.
Numerische Behandlung partieller Differentialgleichungen.
Vieweg+Teubner Verlag, 2005.
14. B. Haasdonk. Convergence rates of the POD–Greedy method. ESAIM: Mathematical Modelling
and Numerical Analysis, 47(3):859–873, 2013.
15. B. Haasdonk, M. Dihlmann, and M. Ohlberger. A training set and multiple basis generation approach for parametrized model reduction based on adaptive grids in parameter space. Mathematical
and Computer Modelling of, 17:423–442, 2011.
30
A. Schmidta , B. Haasdonka
16. B. Haasdonk and M. Ohlberger. Reduced basis method for finite volume approximations of
parametrized linear evolution equations. ESAIM: Mathematical Modelling and Numerical Analysis,
42(2):277–302, 2008.
17. B. Haasdonk and M. Ohlberger. Efficient reduced models and a posteriori error estimation for
parametrized dynamical systems by offline/online decomposition. Mathematical and Computer
Modelling of Dynamical Systems, 17(2):145–161, 2011.
18. B. Haasdonk, M. Ohlberger, and G. Rozza. A reduced basis method for evolution schemes with
parameter-dependent explicit operators. Electron. Trans. Numer. Anal., 32:145–161, 2008.
19. E. Heinz. Beiträge zur Störungstheorie der Spektralzerleung. Mathematische Annalen, 123(1):415–
438, 1951.
20. M. Heyouni and K. Jbilou. An extended block Arnoldi algorithm for large-scale solutions of
the continuous-time algebraic Riccati equation. Electronic Transactions on Numerical Analysis,
33:53–62, 2009.
21. D. Hinrichsen and A.J. Pritchard. Mathematical Systems Theory I: Modelling, State Space Analysis, Stability and Robustness. Springer, 2005.
22. K. Hiramoto, H. Doki, and G. Obinata. Optimal sensor/actuator placement for active vibration
control using explicit solution of algebraic Riccati equations. Journal of Sound and Vibration,
229(5):1057–1075, 2000.
23. D.B.P. Huynh, D.J. Knezevic, Y. Chen, J.S. Hesthaven, and A.T. Patera. A natural-norm successive constraint method for inf-sup lower bounds. Computer Methods in Applied Mechanics and
Engineering, 199(2932):1963–1975, 2010.
24. I.M. Jaimoukha and E.M. Kasenally. Krylov subspace methods for solving large Lyapunov equations. SIAM Journal on Numerical Analysis, 31(1):227–251, 1994.
25. K. Jbilou. Block Krylov subspace methods for large algebraic Riccati equations. Numerical Algorithms, 34(2-4):339–353, 2003.
26. K. Jbilou. An Arnoldi based algorithm for large algebraic Riccati equations. Applied Mathematics
Letters, 19(5):437 – 444, 2006.
27. C. Johnson. Numerical solution of partial differential equations by the finite element method.
Courier Corporation, 1989.
28. M. Kärcher and M.A. Grepl. A certified reduced basis method for parametrized elliptic optimal
control problems. ESAIM: Control, Optimisation and Calculus of Variations, 20(2):416–441, 2014.
29. C. Kenney and G. Hewer. The sensitivity of the algebraic and differential Riccati equations. SIAM
Journal on Control and Optimization, 28(1):50–69, 1990.
30. C. Kenney, A.J. Laub, and M. Wette. Error bounds for Newton refinement of solutions to algebraic
Riccati equations. Mathematics of Control, Signals and Systems, 3(3):211–224, 1990.
31. D. Kressner, P. Sirkovic, N. Thanh, and T. Stykel. A low-rank reduced basis method for parameterdependent lyapunov equations. In 21st International Symposium on Mathematical Theory of Networks and Systems, 2014.
32. H. Kwakernaak and R. Sivan. Linear optimal control systems, volume 1. Wiley-interscience New
York, 1972.
33. P. Lancaster. Explicit solutions of linear matrix equations. SIAM Review, 12(4):544–566, 1970.
34. P. Lancaster and L. Rodman. Algebraic Riccati equations. Oxford University Press, 1995.
35. J.-R. Li and J. White. Low rank solution of Lyapunov equations. SIAM Journal on Matrix Analysis
and Applications, 46(4):693–713, 2004.
36. A. Manzoni and F. Negri. Heuristic strategies for the approximation of stability factors in quadratically nonlinear parametrized PDEs. Advances in Computational Mathematics, 2015.
37. D. Mustafa and K. Glover. Controller reduction by h∞ balanced truncation. IEEE Transactions
on Automatic Control, 36(6):668–682, 1991.
38. F. Negri, G. Rozza, A. Manzoni, and A. Quarteroni. Reduced basis method for parametrized
elliptic optimal control problems. SIAM Journal on Scientific Computing, 35(5):A2316–A2340,
2013.
39. H. Panzer, T. Wolf, and B. Lohmann. A strictly dissipative state space representation of second
order systems. at - Automatisierungstechnik, 60(7):392–397, 2012.
40. A.T. Patera and G. Rozza. Reduced Basis Approximation and a Posteriori Error Estimation for
Parametrized Partial Differential Equations. MIT, 2007. Version 1.0, Copyright MIT 2006-2007,
to appear in (tentative rubric) MIT Pappalardo Graduate Monographs in Mechanical Engineering.
Reduced Basis Approximation of Large Scale Parametric Algebraic Riccati Equations
31
41. T. Penzl. Numerical solution of generalized Lyapunov equations. Advances in Computational
Mathematics, 8(1-2):33–48, 1998.
42. T. Penzl. Eigenvalue decay bounds for solutions of Lyapunov equations: the symmetric case.
Systems & Control Letters, 40(2):139–144, 2000.
43. G. Rozza, D.B.P. Huynh, and A.T. Patera. Reduced basis approximation and a posteriori error
estimation for affinely parametrized elliptic coercive partial differential equations. Archives of
Computational Methods in Engineering, 15(3):1–47, 2007.
44. Y. Saad and X.V. Gv. Numerical solution of large Lyapunov equations. In Signal Processing, Scattering and Operator Theory, and Numerical Methods, Proc. MTNS-89, pages 503–511. Birkhauser,
1990.
45. J. Saak. Effiziente numerische Lösung eines Optimalsteuerungsproblems fr die Abkühlung von
Stahlprofilen. Diploma thesis, Universitt Bremen, 2003.
46. J. Saak. Efficient numerical solution of large scale algebraic matrix equations in PDE control and
model order reduction. PhD thesis, Chemnitz University of Technology, 2009.
47. A. Schmidt, M. Dihlmann, and B. Haasdonk. Basis generation approaches for a reduced basis
linear quadratic regulator. In Proceedings of MATHMOD 2015, pages 713–718, 2014.
48. V. Simoncini. A new iterative method for solving large-scale Lyapunov matrix equations. SIAM
Journal on Scientific Computing, 29(3):1268–1288, 2007.
49. G. Söderlind. The logarithmic norm. history and modern theory. BIT Numerical Mathematics,
46(3):631–652, 2006.
50. T. Stykel. Analysis and numerical solution of generalized Lyapunov equations. PhD thesis, Institut
für Mathematik, Technische Universität, Berlin, 2002.
51. T. Stykel. Numerical solution and perturbation theory for generalized Lyapunov equations. Linear
Algebra and its Applications, 349(1-3):155–185, 2002.
52. J.-G. Sun. Residual bounds of approximate solutions of the algebraic Riccati equation. Numerische
Mathematik, 76(2):249–263, 1997.
53. N. Truhar and K. Veselić. Bounds on the trace of a solution to the lyapunov equation with a
general stable matrix. Systems & Control Letters, 56(7-8):493–503, 2007.
54. A. Unger and F. Tröltzsch. Fast solution of optimal control problems in the selective cooling
of steel. ZAMM - Journal of Applied Mathematics and Mechanics / Zeitschrift für Angewandte
Mathematik und Mechanik, 81(7):447–456, 2001.
55. K. Veroy, C. Prud’homme, D.V. Rovas, and A.T. Patera. A posteriori error bounds for reducedbasis approximation of parametrized noncoercive and nonlinear elliptic partial differential equations. In 16th AIAA Computational Fluid Dynamics Conference. American Institute of Aeronautics
and Astronautics, 2003. Paper 2003-3847.
56. S. Volkwein. Model reduction using proper orthogonal decomposition. Lecture notes, University
of Konstanz, 2013.
57. H. Wendland. Scattered Data Approximation. Cambridge University Press, 2004. Cambridge
Books Online.
58. D. Wirtz. Model reduction for nonlinear systems: Kernel methods and error estimation. PhD
Thesis, University of Stuttgart, 2013.
Download