Adjoint Broyden à la GMRES

Andreas Griewank¹*, Sebastian Schlenkrich², and Andrea Walther²

¹ Institut für Mathematik, HU Berlin
² Institut für Wissenschaftliches Rechnen, TU Dresden

* Corresponding author: e-mail: griewank@mathematik.hu-berlin.de, Fax: +49-30-2093-5859.
Partially supported by the DFG Research Center Matheon "Mathematics for Key Technologies", Berlin.

October 21, 2007
Abstract

It is shown here that a compact storage implementation of a quasi-Newton method based on the adjoint Broyden update reduces in the affine case exactly to the well established GMRES procedure. Generally, storage and linear algebra effort per step are small multiples of n·k, where n is the number of variables and k the number of steps taken in the current cycle. In the affine case the storage is exactly (n + k)·k, and in the nonlinear case the same bound can be achieved if adjoints, i.e. transposed Jacobian-vector products, are available. A transpose-free variant that relies exclusively on Jacobian-vector products (or possibly their approximation by divided differences) requires roughly twice the storage and turns out to be somewhat slower in our numerical experiments reported at the end.

Keywords: nonlinear equations, quasi-Newton methods, adjoint based update, compact storage, generalized minimal residual, Arnoldi process, automatic differentiation
1 Introduction and Motivation

As shown in [SGW06, GSW06, SW06] the adjoint Broyden method described below has some very nice properties, which lead to strong theoretical convergence properties and good experimental results. A standard objection to low rank updating of approximate Jacobians is that their storage and manipulation involve O(n²) locations and operations per step, respectively, since sparsity and other structure seem immediately lost. In the case of unconstrained optimization this drawback has been overcome by very successful limited memory variants [NW99] of the quasi-Newton method BFGS, which in the case of quadratic objectives and thus affine gradients reduce to conjugate gradients, the method of choice for positive definite linear systems. Since GMRES has a similar status with respect to the iterative solution of nonsymmetric systems, it is a natural idea to implement a nonlinear solver that reduces automatically to GMRES on affine systems. As it turns out, this is the case for a suitable implementation of the adjoint Broyden method. The insight gained from the affine scenario also helps us in dealing with singularities and other contingencies in the general case.

The paper is organized as follows. In Section 2 we describe the adjoint Broyden scheme and its main properties. In Section 3 we give a smooth formulation of the step computation via the adjugate. In Section 4 we develop a compact storage implementation with several variants depending on the derivative vectors that are available. These are all equivalent in the affine case, for which we show in Section 5 that the iterates are identical to the ones produced by GMRES, provided a linearly exact line-search is employed. Nevertheless,
our methods are geared towards the general, nonlinear scenario, where the basic scheme is guaranteed to converge [Sch07, Sec. 4.3.2], provided singularity of the actual Jacobian is excluded. Finally, in Section 6 we report comparative numerical results, mostly on nonlinear problems.
2 Description of the quasi-Newton method
We consider the iterative solution of a system of nonlinear equations

    F(x) = 0,

assuming that F: R^n -> R^n has a Lipschitz continuously differentiable Jacobian F'(x) ∈ R^{n×n} in some neighborhood N ⊂ R^n of interest. Given an initial estimate x_0 reasonably close to some root x_* ∈ F^{-1}(0) ∩ N and an easily invertible approximation A_{-1} to F'(x_*), we may apply our algorithms to the transformed problem

    0 = F̃(x̃) ≡ F(x_0 + A_{-1}^{-1} x̃) .

Therefore we will assume without loss of generality that the original problem has been rewritten such that for some scaling factor 0 ≠ γ ∈ R

    A_{-1} = γ I   and   x_0 = 0 .

This assumption on A_{-1} greatly simplifies the notation without affecting the mathematical relations for any sensible algorithm.
Throughout the paper we use the convention that the subscript k labels all quantities related to the iterate x_k as well as all quantities concerning the step from x_{k-1} to x_k. Hence the basic iteration is

    x_k = x_{k-1} + α_k s_k   with   A_{k-1} s_k = −δ_k F_{k-1}   and   α_k ∈ R ∋ δ_k

where F_{k-1} ≡ F(x_{k-1}). After each iteration the Jacobian approximation A_{k-1} is updated to a new version A_k in a way that distinguishes various quasi-Newton methods and is our principal concern in this section. The scalar δ_k allows for (near) singularity of the approximate Jacobian A_{k-1}, and α_k represents the line-search multiplier; both will be discussed below. Whenever discrepancies are computed or symbolically represented, we subtract the (more) exact quantity from the (more) approximate quantity. This is just a notational convention concerning the selection of signs in forming differences.
The Rank-one Update

Our methods are based on the following update formula.

Definition 1 (Adjoint Broyden update)
For a given matrix A_{k-1} ∈ R^{n×n} and a current point x_k ∈ R^n set

    A_k = A_{k-1} − v_k v_k^T (A_{k-1} − F'_k)   with   F'_k ≡ F'(x_k)        (1)

where v_k = σ_k/||σ_k|| is chosen according to one of the three options:

    'Tangent':  σ_k = (A_{k-1} − F'_k) s_k                  for some s_k ∈ R^n \ {0}
    'Residual': σ_k = F_k ∈ R^n \ {0}
    'Secant':   σ_k = A_{k-1} s_k − (F_k − F_{k-1})/α_k     for some α_k ∈ R \ {0}
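For concreteness, the update (1) with its three choices of σ_k can be sketched in a few lines of NumPy. This is an illustration only, not the authors' Matlab/C code; for simplicity it assumes the Jacobian F'_k is available as a dense matrix, although only the two products v_k^T F'_k and F'_k s_k are actually required.

    import numpy as np

    def adjoint_broyden_update(A, Fprime, s, F_new, F_old, alpha, option="tangent"):
        """One adjoint Broyden update A_k = A_{k-1} - v v^T (A_{k-1} - F'_k)."""
        if option == "tangent":
            sigma = A @ s - Fprime @ s
        elif option == "residual":
            sigma = F_new
        else:  # secant
            sigma = A @ s - (F_new - F_old) / alpha
        nrm = np.linalg.norm(sigma)
        if nrm == 0.0:                  # sigma_k = 0 signals termination
            return A
        v = sigma / nrm
        # rank-one correction; v @ A and v @ Fprime are the row-vector products
        # v^T A_{k-1} and v^T F'_k (an adjoint product in the matrix-free setting)
        return A - np.outer(v, v @ A - v @ Fprime)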
It can be easily seen that the formula represents the least change update with respect to the Frobenius matrix norm in order to satisfy the adjoint tangent condition

    A_k^T v_k = F'_k{}^T v_k .

The residual choice has the nice property that after the update

    A_k^T F_k = F'_k{}^T F_k = ∇f(x_k)   for   f(x) ≡ ||F(x)||²/2

so that the gradient of the squared residual norm is reproduced exactly. Throughout the paper ||·|| denotes the Euclidean norm of vectors and the corresponding induced 2-norm of matrices.
When σ_k is selected according to the tangent or secant option, the primal tangent condition A_k s_k = F'_k s_k is satisfied approximately in that

    ||(A_k − F'_k) s_k|| / ||s_k|| = O(||x_k − x_{k-1}||) .

When a full quasi-Newton step s_k = −A_{k-1}^{-1} F_{k-1} with α_k = 1 = δ_k has been taken, the residual and the secant options are identical. The secant option reduces to the tangent option as α_k -> 0 or when F is affine in the first place.

Throughout the paper we will allow the choice α_k = 0, which amounts to a pure tangent update step on the Jacobian without any change in the iterate x_k itself. Several such primally stationary iterations may be interpreted as part of an inexact Newton method, which approximately solves the linearization of the given vector function at the current primal point x_k.
Heredity Properties

In the case of an affine function F(x) ≡ Ax − b the tangent and secant options yield identically

    σ_k = (A_{k-1} − A) s_k = D_{k-1} s_k   with   D_{k-1} ≡ A_{k-1} − A ∈ R^{n×n} .

Then it follows from (1) that the discrepancy matrices D_k satisfy the recurrence

    D_k = D_{k-1} − D_{k-1} s_k s_k^T D_{k-1}^T D_{k-1} / ||D_{k-1} s_k||² = (I − v_k v_k^T) D_{k-1} .

From this projective identity one sees immediately that the nullspaces of D_k and its transpose D_k^T grow monotonically with each update and must encompass the whole space R^n after at most n updates that are well defined in that their denominator does not vanish. In other words, in the affine case the tangent and secant updates exhibit direct and adjoint heredity in that

    A_k s_j = A s_j   and   A_k^T v_j = A^T v_j   for 0 ≤ j ≤ k .

When the residual update is applied intermittently with σ_k ∉ range(D_{k-1}), and thus v_k ∉ range(D_{k-1}), the direct heredity is maintained but adjoint heredity may be lost. Such updates can be viewed as a reset and are expected to be particularly useful in the nonlinear case.
Jacobian Initialization

It is well known for unconstrained optimization by some variant of BFGS that, starting from an initial Hessian approximation of the form γI, the performance may be strongly dependent on the choice of the scalar γ ≠ 0. This is so in general, even though on quadratic problems with exact line-searches the iterates are mathematically invariant with respect to γ ≠ 0. Hence we will also look here for a suitable initial scaling.

Another aspect of the initialization is that, in order to agree with GMRES on affine problems, we have to begin with a residual update using σ_0 = F_0 before the very first iteration. This implies in the affine case that all subsequent residual gradients satisfy ∇f(x_k) = F'_k{}^T F_k = A_k^T F_k, which ensures for the quasi-Newton steps s_{k+1} = −δ_k A_k^{-1} F_k that

    ∇f(x_k)^T s_{k+1} = F_k^T F'_k s_{k+1} = −δ_k ||F_k||² .        (2)

For δ_k > 0 we have therefore descent, a property that need not hold in the nonlinear situation as we will discuss below. Starting from A_{-1} = γI with any γ we obtain by the initial residual update

    A_0 = γI − v_0 v_0^T (γI − F'_0)   with   det(A_0) = γ^{n−1} v_0^T F'_0 v_0 .

A reasonable idea for choosing γ seems to be minimizing the Frobenius norm of the resulting update from A_{-1} to A_0. This criterion leads to γ = v_0^T F'_0 v_0, a number that may be positive or negative but unfortunately also zero. That exceptional situation arises exactly if det(A_0) = 0, with the null vector being v_0 irrespective of the choice of γ. In any case we have by the Cauchy-Schwarz inequality

    0 ≤ |v_0^T F'_0 v_0| ≤ ||F'_0 v_0||

where the right hand side does not vanish provided F'_0 is nonsingular, as we will assume throughout. Hence we conclude that

    γ = sign(v_0^T F'_0 v_0) ||F'_0 v_0||

can be used as initial scaling. Should the first factor be zero, the sign can be selected arbitrarily from {+1, −1}. We could be a little bit more sophisticated here and choose the size |γ| as the Frobenius norm of the first extended Hessenberg matrix H̄_0 ∈ R^{2×1} generated by GMRES, but that complicates matters somewhat in requiring some look-ahead, especially in the nonlinear situation.
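A minimal sketch of this initialization, assuming the dense Jacobian J0 = F'(x_0) is available (the function name and interface are ours, not from the paper):

    import numpy as np

    def initial_scaling_and_A0(F0, J0):
        """gamma = sign(v0^T F'_0 v0) * ||F'_0 v0|| (sign(0) taken as +1) and
        the residual update A_0 = gamma I - v0 v0^T (gamma I - F'_0)."""
        v0 = F0 / np.linalg.norm(F0)
        w = J0 @ v0                              # directional derivative F'_0 v0
        gamma = np.copysign(np.linalg.norm(w), v0 @ w)
        A0 = gamma * np.eye(F0.size) - np.outer(v0, gamma * v0 - v0 @ J0)
        return gamma, A0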
Occurrence and Handling of Singularity

As we have seen above, the contingency det(A_k) = 0 may arise theoretically already when k = 0. In practice we are much more likely to encounter nearly singular A_k, for which the full quasi-Newton directions s_{k+1} = −A_k^{-1} F_k become excessively large and strongly affected by round-off. Provided we update along a null vector whenever F_{k-1} is not in the range of A_{k-1}, we have even theoretically at most one null direction according to the following lemma.

Lemma 2 (Rank Drop at most One)
If A_{k-1} s_k = −δ_k F_{k-1} for δ_k ∈ R with s_k ≠ 0 and F'_k s_k ≠ 0, then the tangent option σ_k = (A_{k-1} − A)s_k, i.e. σ_k = (A_{k-1} − F'_k) s_k, ensures for the update (1) that

    rank(A_k) − rank(A_{k-1})  { = 1           if δ_k = 0 and F'_k s_k ∉ range(A_{k-1})
                               { = 0           if δ_k = 0 and F'_k s_k ∈ range(A_{k-1})
                               { ∈ {0, 1}      if δ_k ≠ 0 and F'_k s_k ∉ range(A_{k-1})
                               { ∈ {−1, 0}     if δ_k ≠ 0 and F'_k s_k ∈ range(A_{k-1})
Proof:
The tangent update always takes the explicit form

    A_k = A_{k-1} − (A_{k-1} − F'_k) s_k s_k^T (A_{k-1} − F'_k)^T (A_{k-1} − F'_k) / ||(A_{k-1} − F'_k) s_k||² .

If F'_k s_k ∈ range(A_{k-1}) the range of A_k is contained in that of A_{k-1}, so that the rank cannot go up, which implies immediately the fourth case, as a rank-one update can only change the rank by one up or down. If F'_k s_k ∉ range(A_{k-1}), then multiplication of the above equation from the right by a prospective null vector v shows that the coefficient of F'_k s_k and thus the whole rank-one term must vanish. Hence v must already be a null vector of A_{k-1}, and thus the rank cannot go down, which implies in particular the third case. When δ_k = 0 and thus A_{k-1} s_k = 0, the update simplifies to

    A_k = A_{k-1} + F'_k s_k s_k^T F'_k{}^T F'_k / ||F'_k s_k||²

so that s_k is a null vector of A_{k-1} but not a null vector of A_k. Hence we have also proven the assertion for the first case, as there can be no new null vector as observed above. In the remaining second case all null vectors of A_{k-1} that are orthogonal to (F'_k s_k)^T F'_k are also null vectors of A_k, and there is exactly one additional null vector, which we may construct as follows. Let F'_k s_k = A_{k-1} ṽ_k. Then there is one value β ∈ R such that

    A_k (ṽ_k + β s_k) = F'_k s_k (1 + β + s_k^T F'_k{}^T F'_k ṽ_k / ||F'_k s_k||²) = 0 .
The lemma has the following algorithmic consequences. If A_0 has at least rank n−1 and we select s_k as a null vector, i.e. set δ_k = 0, whenever A_{k-1} is singular, then the rank of the approximations A_k can never drop below n−1. We will call this approach of setting δ_k = 0 as soon as A_{k-1} is singular the full rank strategy. Exactly which value δ_k ≠ 0 we choose when A_{k-1} is nonsingular does not make much difference in the affine case, but is of course quite important in the nonlinear case, unless we perform an exact line-search such that the scaling of s_k becomes irrelevant. We can only deviate from the full rank strategy when the approximate Jacobian A_{k-1} is singular but F_{k-1} still happens to be in its range. Then we might still choose δ_k ≠ 0 and determine s_k as some solution of the consistent linear system A_{k-1} s_k = −F_{k-1} δ_k. This choice of s_k is even theoretically nonunique and practically subject to severe numerical instability, especially in the nonlinear scenario.
3 Smooth Formulation via Adjugate

In the affine situation we will see that the singularly consistent linear systems can never occur and that the resulting property rank(A_k) ≥ n−1 is related to the well-known fact that the Hessenberg matrix H_k in the Arnoldi process never suffers a rank drop of more than one, provided the system matrix itself is nonsingular. To define s_{k+1} uniquely as a smooth function of A_k and F_k we may set δ_{k+1} = det(A_k) and use the adjugate adj(A_k), defined as the continuous solution of the identity

    A_k adj(A_k) = det(A_k) I = adj(A_k) A_k .

The entries of adj(A_k) may be defined as the co-factors of A_k. Then we may define the steps consistently and nicely bounded via

    s_{k+1} = −adj(A_k) F_k   ⇒   A_k s_{k+1} = −det(A_k) F_k .

If rank(A_k) = n−1 there exist nonzero null vectors u_k and w_k ∈ R^n such that

    adj(A_k) = w_k u_k^T ≠ 0   with   A_k w_k = 0   and   u_k^T A_k = 0 .

Then the above formula yields the step

    s_{k+1} = −w_k u_k^T F_k

so that we have

    s_{k+1} = 0   ⇔   F_k ∈ kern(adj(A_k))   ⇔   F_k = 0  or  0 ≠ u_k ⊥ F_k ≠ 0
where the second possibility can only occur when A_k is singular. The first represents regular termination because the system is solved, whereas the second indicates premature breakdown of the method if it is indeed defined in terms of the adjugate. It means that the linear system A_k s_{k+1} = −F_k is singular but still consistent, as F_k happens to lie in the range {u_k}^⊥ of A_k. Hence nonzero solutions s_{k+1} would exist but not be unique and, in the presence of round-off, possibly very large. Fortunately this contingency cannot occur in the affine scenario, as we will see in Section 5. If it does occur in the nonlinear case, we may define s_{k+1} as some nonzero null vector of A_k, which is essentially unique as long as rank(A_k) = n−1, irrespective of whether F_k is in its range or not. Alternatively we may reset A_k to A_0 as discussed above with F_0 = F_k, which certainly ensures that the subsequent step is well defined.
The use of the adjugate is more of an aesthetic device in view of the affine scenario that is of particular interest in this paper. It does however alleviate the need to distinguish the cases rank(A_k) = n and rank(A_k) = n−1 in proofs and other developments. The numerical computation of s_{k+1} = −adj(A_k) F_k can be performed simply and stably on the basis of an LU or QR factorization of A_k. To have a better chance of obtaining a descent direction, one may multiply the step by sign(det(A_k)), which guarantees descent according to (2) in the affine case. More reliable for the nonlinear case would be to always evaluate the directional derivative ∇f(x_k)^T s_{k+1} and, if necessary, switch the sign of s_{k+1} before entering the line-search.
Line-Search Requirements

The line-search from [Gri86] sketched below makes no assumption regarding the directional derivative and thus may produce negative step multipliers. Moreover, if s_k ≠ 0 is selected as an arbitrary null vector of A_{k-1} whenever det(A_{k-1}) = 0, then that line-search ensures convergence from within level sets of f in which the actual Jacobian F'(x) has no singularities. That is true even if A_0 is initialized to the null matrix, which would leave a lot of indeterminacy for the first n step selections.

The least-squares calculation at the heart of the GMRES procedure may be effected in our quasi-Newton method through an appropriate line-search. Since for affine F(x) = Ax − b the function

    f̃_k(α) ≡ f(x_{k-1} + α s_k) = ||F_{k-1} + α A s_k||² / 2

is quadratic, just three values of f̃_k, or two values and one directional derivative, are enough to compute the exact minimizer α*_k ∈ R. Alternatively, we may interpolate the vector function itself by

    F̃_k(α) ≡ (1 − α) F_{k-1} + α F(x_{k-1} + s_k)

on the basis of F_{k-1} and F(x_{k-1} + s_k) alone. In the affine situation we have exactly f̃_k(α) = ||F̃_k(α)||²/2, so that the two approaches are equivalent and yield the optimal multiplier

    α*_k = −F_{k-1}^T A s_k / ||A s_k||² = −s_k^T A^T F_{k-1} / ||A s_k||² .

The multiplier α*_k may be negative or even zero, but it always renders the new residual F_k = F_{k-1} + α*_k A s_k exactly orthogonal to A s_k. This orthogonality is crucial to proving the equivalence with GMRES, and we will call any line-search yielding such an α_k in the affine case linearly exact. Throughout we will refer to the step x_k − x_{k-1} = α_k s_k as

    trivial: s_k = 0 ;   full: α_k = 1/δ_k ;   singular: det(A_{k-1}) = 0 ;   exact: α_k = α*_k .

In the nonlinear situation we may have to perform several interpolations as described in [Gri86] before an acceptable α_k is reached. As we will see in the final section, our line-search based on vector interpolation rarely requires more than one readjustment of α_k from the initial estimate α_k = 1/δ_k. Of course in the affine case the initial guess does not matter at all if at least one interpolation is performed so that α*_k is reached.
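A linearly exact line-search step can thus be sketched from two residual vectors alone; the helper below is ours and simply minimizes ||F̃_k(α)||² over α:

    import numpy as np

    def linearly_exact_alpha(F_old, F_trial):
        """Minimize ||(1 - alpha) F_old + alpha F_trial||^2 over alpha, where
        F_old = F(x_{k-1}) and F_trial = F(x_{k-1} + s_k).  In the affine case
        F_trial - F_old = A s_k, so this returns alpha*_k exactly."""
        d = F_trial - F_old
        nrm2 = d @ d
        if nrm2 == 0.0:
            return 1.0            # degenerate direction; keep the full step
        return -(F_old @ d) / nrm2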
Algorithmic Specification

Putting the pieces together we get the following algorithm.

Algorithm 3 (Adjoint Broyden)
Initialize: Set x_0 = 0 and A_0 = γI − v_0 v_0^T (γI − F'_0) with v_0 = F_0/||F_0|| and γ = sign(v_0^T F'_0 v_0) ||F'_0 v_0||; set k = 1.
Iterate:    Compute s_k = −adj(A_{k-1}) F_{k-1} and define σ_k by the tangent or secant option.
Terminate:  If ||σ_k|| ≤ ε return x_* = x_{k-1} + s_k/δ_k and stop.
Update:     Increment x_k = x_{k-1} + α_k s_k for some α_k ∈ R; set v_k = σ_k/||σ_k||, update A_k = A_{k-1} − v_k v_k^T (A_{k-1} − F'_k), and continue with Iterate for k = k+1.
The algorithm involves at each iteration one evaluation of F_{k-1}, one of v_k^T F'_k, and a few trial values of F during the line-search. In terms of linear algebra we have to compute the step s_k by solving a system in the approximate Jacobian A_{k-1} and then update an appropriate representation of it to that of A_k. This means that both linear algebra subtasks require O(n²) operations, and the storage requirement is n² or 1.5 n² floating point numbers for a QR and LU version, respectively.
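As a dense illustration of the overall loop, the following NumPy sketch implements a simplified Algorithm 3 with δ_k = 1, full steps α_k = 1, and the tangent option, omitting the adjugate-based singularity handling and the line-search; Jprod and JTprod stand for Jacobian-vector and transposed Jacobian-vector products supplied by the user:

    import numpy as np

    def adjoint_broyden_solve(F, Jprod, JTprod, x0, tol=1e-12, kmax=200):
        """Simplified sketch of Algorithm 3 (full steps, tangent option).
        Jprod(x, u) = F'(x) u and JTprod(x, u) = F'(x)^T u."""
        x = np.asarray(x0, dtype=float)
        Fx = F(x)
        # initial residual update A_0 = gamma I - v0 v0^T (gamma I - F'_0)
        v = Fx / np.linalg.norm(Fx)
        w = Jprod(x, v)
        gamma = np.copysign(np.linalg.norm(w), v @ w)
        A = gamma * np.eye(x.size) - np.outer(v, gamma * v - JTprod(x, v))
        for _ in range(kmax):
            s = np.linalg.solve(A, -Fx)          # A_{k-1} s_k = -F_{k-1}
            x = x + s                            # full step
            Fx = F(x)
            sigma = A @ s - Jprod(x, s)          # tangent choice (A_{k-1} - F'_k) s_k
            if np.linalg.norm(Fx) <= tol or np.linalg.norm(sigma) <= tol:
                break
            v = sigma / np.linalg.norm(sigma)
            A -= np.outer(v, v @ A - JTprod(x, v))   # adjoint Broyden update (1)
        return x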
4 Compact Storage Implementation

In order to reduce storage and linear algebra at least for early iterations, we consider the additive expansion

    A_k = γI − Σ_{j=0}^{k} v_j v_j^T (A_{j-1} − F'_j) .

Abbreviating

    V_k ≡ [v_0, v_1, …, v_k] ∈ R^{n×(k+1)}   and   W_k ≡ [F'_0{}^T v_0, …, F'_k{}^T v_k] ∈ R^{n×(k+1)}
we obtain the following representation of A_k and its inverse.

Lemma 4 (Factorized Representation)
With L_k^{-1} ∈ R^{(k+1)×(k+1)} the lower triangular part of V_k^T V_k, including its diagonal, we have

    A_k = γI − V_k L_k (γ V_k − W_k)^T   and   det(A_k) = det(H_k) γ^{n−k−1}

where

    H_k ≡ W_k^T V_k − γ R_k ∈ R^{(k+1)×(k+1)}   and   R_k ≡ V_k^T V_k − L_k^{-1}

with R_k being strictly upper triangular. Sherman-Morrison-Woodbury yields the inverse

    A_k^{-1} = I/γ + V_k H_k^{-1} (V_k − W_k/γ)^T

if det(A_k) ≠ 0, and in any case the adjugate

    adj(A_k) = det(A_k) I/γ + γ^{n−k−1} V_k adj(H_k) (V_k − W_k/γ)^T .
Proof: For k = −1 the first assertion holds trivially with all matrices other than A_{-1} vanishing completely. The induction from k−1 to k works as follows:

    A_k = A_{k-1} − v_k v_k^T (A_{k-1} − F'_k) = (I − v_k v_k^T) A_{k-1} + v_k v_k^T F'_k
        = γI − (I − v_k v_k^T) V_{k-1} L_{k-1} (γ V_{k-1} − W_{k-1})^T − v_k v_k^T (γI − F'_k)
        = γI − [V_{k-1}, v_k] [ L_{k-1}, 0 ; −v_k^T V_{k-1} L_{k-1}, 1 ] [ γ V_{k-1} − W_{k-1}, γ v_k − F'_k{}^T v_k ]^T
        = γI − V_k L_k (γ V_k − W_k)^T   with   L_k ≡ [ L_{k-1}, 0 ; −v_k^T V_{k-1} L_{k-1}, 1 ] .

Hence we have proven the representation of A_k, provided L_k is shown to be the inverse of the lower triangular part of V_k^T V_k, assuming this relation holds for L_{k-1}. That last part of the induction holds since

    [ L_{k-1}^{-1}, 0 ; v_k^T V_{k-1}, 1 ] [ L_{k-1}, 0 ; −v_k^T V_{k-1} L_{k-1}, 1 ] = [ I, 0 ; 0, 1 ]

so that the matrix in the middle of the representation is indeed L_k.
Assuming first det(H_k) ≠ 0, we obtain according to the Sherman-Morrison-Woodbury formula the inverse

    A_k^{-1} = (1/γ) [ I + V_k (I − L_k (V_k − W_k/γ)^T V_k)^{-1} L_k (V_k − W_k/γ)^T ]
             = (1/γ) [ I + V_k (L_k^{-1} − V_k^T V_k + W_k^T V_k/γ)^{-1} (V_k − W_k/γ)^T ]

which can obviously be rewritten in the asserted form using the matrices R_k and H_k. The adjugate is obtained by multiplying both sides with det(A_k) = det(H_k) γ^{n−k−1}.
Since L_k is not needed explicitly, we can implement the adjoint Broyden method storing the two n×(k+1) matrices V_k, W_k and the matrix H_k ∈ R^{(k+1)×(k+1)} in factorized or inverted form. For small k this is certainly much less than the usual dense LU or QR implementation of A_k. However, as k approaches n it is significantly more, even if we do not store V_k^T V_k, which is only needed for the application of A_k itself.
Limited Memory Strategy

Since we have managed to eliminate the intermediate approximations A_j from the representation of A_k and its inverse or adjugate, it is in fact quite easy to throw out or amalgamate older directions v_j and the corresponding adjoints v_j^T F'_j from V_k and W_k, respectively. Then the corresponding rows and columns of V_k^T V_k and, most importantly, H_k disappear or are merged as well, which amounts to a rank-two correction that is easily incorporated into the inverse or a factorization. Hence we have the capacity to always use only a window of m comparatively recent pieces of direct and adjoint secant information, a strategy that is used very successfully in limited memory BFGS. In a first test implementation we simply choose a fixed maximum m and (over)write v_k for k > m into the [(k−1) mod m]+1-th column of V_m; see the index sketch below. Obviously W_m and H_m are treated accordingly.
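The overwriting rule itself is a one-liner; the helper name is ours:

    def lm_slot(k, m):
        """1-based column of V_m into which v_k is written; for k > m this is
        the [(k - 1) mod m] + 1 rule quoted in the text."""
        return k if k <= m else (k - 1) % m + 1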
As we will show below, we find in the affine case that V_k is orthogonal, so that L_k = I, R_k = 0 and H_k is actually upper Hessenberg, i.e., has only one nonvanishing subdiagonal. In the limited memory variant the orthogonality of V_k is maintained, but the Hessenberg property of H_k is lost.
Step Calculation Variants

Using some temporary (k+1)-vector t, the actual computation of the next quasi-Newton step s_{k+1} = −δ_k A_k^{-1} F_k can be broken down into the subtasks

    (i)   Multiply t <- (δ_k/γ) W_k^T F_k
    (ii)  Multiply and decrement t <- δ_k V_k^T F_k − t
    (iii) Solve H_k t = t
    (iv)  Multiply and decrement s_{k+1} <- −(δ_k/γ) F_k − V_k t
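In code, the four subtasks map directly onto matrix-vector products with V_k and W_k and one small solve with H_k; a sketch of ours (with a dense solve in place of the QR factorization of H_k kept by the actual implementation):

    import numpy as np

    def compact_step(Fk, V, W, H, gamma, delta):
        """s_{k+1} = -delta_k A_k^{-1} F_k from the compact factors of Lemma 4;
        V, W are n x (k+1) and H is (k+1) x (k+1)."""
        t = (delta / gamma) * (W.T @ Fk)        # (i)   t <- (delta/gamma) W^T F_k
        t = delta * (V.T @ Fk) - t              # (ii)  t <- delta V^T F_k - t
        t = np.linalg.solve(H, t)               # (iii) solve H t = t
        return -(delta / gamma) * Fk - V @ t    # (iv)  s <- -(delta/gamma) F_k - V t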
The most promising savings are possible in the first step since we have

    W_k^T F_k = [v_j^T F'_j F_k]_{j=0…k} ≈ [v_j^T F'_k F_k]_{j=0…k} = V_k^T (F'_k F_k) .

The approximation holds as equality exactly in the linear case, where F' is constant, and thus very nearly in the smooth case. The vector on the right hand side in fact represents newer derivative information than the original one on the left. So we can get by without storing W_k at all, which pretty much halves the total storage requirement as long as k << n.
However, there is another critical issue, namely how we build up the matrix H_k. Its new k-th column and row, compared to H_{k-1}, are given by

    [v_j^T F'_j v_k]_{j=0…k} ≈ V_k^T F'_k v_k ∈ R^{k+1}   and   v_k^T F'_k V_k ≈ [v_k^T F'_j v_j]_{j=0…k} ∈ R^{k+1} .

For the column we may simply use the approximation based on the single, current directional derivative F'_k v_k. For the row we have at least three different choices. Firstly, we can compute the adjoint v_k^T F'_k but not store it any longer. Secondly, we store all the directional derivatives F'_j v_j for j = 0…k. Finally, we can rely on the near upper Hessenberg property of H_k and only compute the last two entries v_k^T F'_{k-1} v_{k-1} and v_k^T F'_k v_k. The third option requires virtually no extra storage other than that of V_k and H_k in Hessenberg form. In that way the whole calculation would reduce almost exactly to the GMRES procedure, except for the strictly upper triangular correction R_k, which is theoretically zero in the linear case.
For the solution of the nonlinear test problems in Section 6 we used the following three variants of the adjoint Broyden method:

(0) original adjoint Broyden update method storing V_k, W_k, and the QR factorization of H_k. This requires evaluation of F(x_k) and v_k^T F'(x_k) at each iterate.

(1) minimal storage implementation using only V_k, the QR factorization of H_k, and the approximation W_k^T v ≈ V_k^T (F'_k v). Requires evaluation of F(x_k), F'(x_k) s_k, and v_k^T F'(x_k).

(2) forward mode based implementation using V_k, Z_k = [F'_j v_j]_{j=0…k}, the QR factorization of H_k, and the approximations W_k^T v ≈ V_k^T (F'_k v) and v_k^T F'_k V_k ≈ v_k^T Z_k. Requires F(x_k) and F'(x_k) v_k.

Of course, it is also possible to implement method (2) based on finite difference approximations to the directional derivatives F'(x) v. However, preliminary numerical tests showed that convergence of this variant is rather unreliable. For affine problems the Jacobian of F is constant, and hence the variants (0) to (2) yield, up to round-off, identical numerical results.
5 Reduction to GMRES/FOM in the Affine Case

For the following result we assume that the adjoint Broyden method is applied with virtually arbitrary step multipliers α_k. Naturally, whenever α_k = 0 we have to apply the tangent update, which could however also be approximated by a divided difference. Now we obtain the main theorem of this paper.

Theorem 5 Suppose Algorithm 3 is applied in exact arithmetic with stopping tolerance ε = 0 to an affine system F(x) = Ax − b with det(A) ≠ 0. Then:

(i) If γ > 0 the iteration performs exactly the Arnoldi process, irrespective of the choice of the α_k ∈ R, until it arrives at a first k for which σ_k = 0. If γ < 0 the v_k and the corresponding entries in H_k may differ in sign.

(ii) With k̂ ≤ n the first index such that σ_k̂ = 0, we have x_* = x_{k̂−1} + s_k̂/δ_k̂ = A^{-1} b. This final step is well defined and must be taken, as F_{k̂−1} ≠ 0 ≠ s_k̂ and δ_k̂ = det(A_{k̂−1}) ≠ 0.

(iii) For k < k̂ all full steps x_k = x_{k−1} + s_k/δ_k with δ_k = det(A_{k−1}) (would) lead to points that coincide with the k-th iterate of the full orthogonalization method (FOM).

(iv) If (linearly) exact line-searches are used throughout, the resulting iterates x_k coincide with those generated by GMRES.
Proof: In the affine case we may always use the tangent option for σ_k, so that the only impact of the step size choices α_k on the principal quantities A_k and v_k appears to be via the residuals F_k = A x_k − b. As we will see, there is in fact no such dependence, but we can certainly state already now that for any particular sequence of α_k values there must be a certain first k̂ for which σ_k̂ = 0. The adjoint heredity property discussed in Section 2 implies that for k̂ > k > j ≥ 0

    σ_k^T v_j = s_k^T D_{k-1}^T v_j = s_k^T (A_{k-1} − A)^T v_j = s_k^T · 0 = 0

so that V_k^T V_k = I and consequently L_k = I, R_k = 0 in the representations of A_k and adj(A_k). Assuming that F(0) = −b ≠ 0 and det(A) ≠ 0, we find that 1 ≤ k̂ ≤ n since no more than n orthogonal directions v_k can exist in R^n.
Now we establish the following relations by induction on k = 1, 2, …, k̂:

    (v)   v_{k-1} ∈ K_k ≡ span{b, Ab, …, A^{k-1} b} = span{v_0, v_1, …, v_{k-1}} ⊂ R^n
    (vi)  F_{k-1} ∈ K_k
    (vii) s_k ∈ K_k ∋ x_k

All three assertions hold clearly for k = 1, where the Krylov subspace K_1 is just the span of F(0) = −b and v_0 = σ_0/||σ_0|| is selected by the residual option σ_0 = F(0) = −b. To progress from k to k+1 we note that

    σ_k = D_{k-1} s_k = A_{k-1} s_k − A s_k = −δ_k F_{k-1} − A s_k ∈ K_k + A K_k = K_{k+1}

which proves (v), since v_k is collinear to σ_k and orthogonality proves that their span is the whole of K_{k+1}. Similarly we have

    F_k = F_{k-1} + α_k A s_k ∈ K_k + A K_k = K_{k+1}

which proves (vi). From the representation of adj(A_k) in Lemma 4 we see by (v), with V_k containing the columns up to and including v_k, that

    s_{k+1} = −adj(A_k) F_k ∈ K_{k+1} + range(V_k) ⊂ K_{k+1}

which proves (vii), as the assertion for x_{k+1} = x_k + α_{k+1} s_{k+1} is obvious. Since the v_j are orthogonal and span successively the Krylov subspaces K_k, they must be identical (up to sign changes) to the bases generated by the Arnoldi process. As a consequence it is well known that each A v_{i-1} ∈ K_{i+1} is a linear combination of the v_j with j = 0…i, so that there is an upper Hessenberg (k+2)×(k+1) matrix H̄_k such that

    A V_k = V_{k+1} H̄_k   and   V_k^T A V_k = H_k .
Here H_k is for any k < k̂ exactly the (k+1)×(k+1) matrix occurring in Lemma 4 and can be obtained from H̄_k by simply leaving off the last row. It would be nice to show that the subdiagonal elements of H_k are positive, to have complete coincidence with Arnoldi, but that is not an essential property. In any case it follows from our Lemma 2, in agreement with [Saa03], that det(A) ≠ 0 implies that the rectangular matrix H̄_k always has full column rank k+1, and H_k has therefore at least rank k. Hence the adjugates adj(A_k) and adj(H_k) are always nontrivial. Moreover, since the elements in the subdiagonal of H_k are all nonzero, we know that a left null vector t_k^T of H_k must have a nontrivial first component if it exists at all.
Now we can prove the remaining assertions in an explicit fashion. Firstly we obtain for the step s_{k+1}, using the factorized representation of the adjugate from Lemma 4 and the identity W_k = A^T V_k, with δ_{k+1} = det(A_k) = γ^{n−k−1} det(H_k):

    s_{k+1} = −adj(A_k) F_k = −[ det(A_k) I/γ + γ^{n−k−1} V_k adj(H_k) V_k^T (I − A/γ) ] F_k
            = −γ^{n−k−1} V_k adj(H_k) V_k^T F_k

where we have used that F_k = V_k V_k^T F_k and V_k^T A F_k = H_k V_k^T F_k, so that the terms proportional to F_k/γ cancel out. Moreover it follows that

    σ_{k+1} = (A_k − A) s_{k+1} = −δ_{k+1} F_k − A s_{k+1}
            = −δ_{k+1} F_k + γ^{n−k−1} A V_k adj(H_k) V_k^T F_k
            = −γ^{n−k−1} [ det(H_k) I − A V_k adj(H_k) V_k^T ] F_k
            = −γ^{n−k−1} [ det(H_k) I − A V_k adj(H_k) V_k^T ] F_0
            = −γ^{n−k−1} [ det(H_k) F_0 − A V_k adj(H_k) e_0 ||F_0|| ]

where e_0 is the first Cartesian basis vector. The last simplifications come about because F_k − F_0 ∈ A K_k belongs to the null space of the matrix in square brackets and v_0 = F_0/||F_0||. Hence we see that indeed the σ_k, and thus the v_k and A_k for k < k̂, are completely independent of the choice of the α_k, which may produce an arbitrary residual F_k ∈ F_0 + A K_k. Moreover it follows from Cramer's rule that the last component of the vector adj(H_k) e_0 is exactly the product of the k subdiagonal elements of the Hessenberg matrix H_k, which are well known to be positive in the Arnoldi process. Hence this property is maintained by induction if γ > 0.
Now let us consider the final situation σ_k̂ = 0. By definition the previous σ_k for k < k̂, and thus the F_k for k < k̂ − 1 and the corresponding subdiagonals of H_k, cannot vanish. Thus we must have

    0 = σ_k̂ = −γ^{n−k̂} [ det(H_{k̂−1}) F_{k̂−1} − A V_{k̂−1} adj(H_{k̂−1}) e_0 ||F_0|| ] .

Since the columns of A V_{k̂−1} are linearly independent and adj(H_{k̂−1}) e_0 ||F_0|| cannot be zero, neither F_{k̂−1} nor δ_k̂ can vanish, so that the last step s_k̂ is neither zero nor singular. That implies that

    F_k̂ = F_{k̂−1} + A s_k̂/δ_k̂ = −σ_k̂/δ_k̂ = 0 .

Generally we have after each full step that F_k is a multiple of σ_k, which belongs to the orthogonal complement of K_k. That is exactly the defining property of an FOM iterate, so that we have now proven (ii) and (iii).

Since α*_k is obtained by a line-search minimizing ||F_{k−1} + α A s_k||²/2, we must have exactly F_k^T A s_k = 0. We now prove by induction on k < k̂ the defining property of GMRES, namely that F_k^T A s_j = 0 for all 0 < j ≤ k. It does hold for k = 1 = j as we have just shown. Since for k > 1

    F_k = F_{k−1} + α_k A s_k = −α_k D_{k−1} s_k + (1 − α_k δ_k) F_{k−1} = −α_k σ_k + (1 − α_k δ_k) F_{k−1} ,

the orthogonality of F_k to all A s_j for j < k follows from the induction hypothesis F_{k−1}^T A s_j = 0 and the fact that σ_k ⊥ K_k ∋ A s_j.
To illustrate the above result in an extreme situation, let us consider the case where A, or more generally A A_0^{-1}, is equal to the cyclic right shift matrix, so that for any vector u = (u_1, u_2, …, u_{n-1}, u_n)^T ∈ R^n

    A (u_1, u_2, …, u_{n-1}, u_n)^T = (u_n, u_1, …, u_{n-1})^T .

In other words, A is zero except for 1's in the subdiagonal and the (1, n) element. Since A A^T = I this cyclic permutation matrix is orthogonal and thus certainly normal, which according to the usual linear algebra folklore suggests that GMRES should not do too badly. In fact we find for the right hand side b = (1, 0, …, 0)^T and x_0 = 0 that GMRES also yields x_k = 0 for k = 1 … n−1, and only the very last, namely the n-th, step leads to the solution x_n = x_* = (0, 0, …, 0, 1)^T. Moreover the v_k are the Cartesian basis vectors e_k, and all matrices H_k = V_k^T A V_k have the null vectors s_{k+1} = e_{k+1}, which means in particular that FOM is never defined.
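This stagnation is easy to reproduce numerically; the following sketch of ours forms the Krylov basis explicitly and solves the GMRES least-squares problem by brute force, so the printed residual norm stays at 1 until the very last step:

    import numpy as np

    n = 6
    A = np.diag(np.ones(n - 1), -1)     # ones on the subdiagonal ...
    A[0, -1] = 1.0                      # ... plus the (1, n) element
    b = np.zeros(n); b[0] = 1.0
    K = np.column_stack([np.linalg.matrix_power(A, j) @ b for j in range(n)])
    for k in range(1, n + 1):           # min over x in K_k of ||b - A x||
        c = np.linalg.lstsq((A @ K)[:, :k], b, rcond=None)[0]
        print(k, np.linalg.norm(b - (A @ K)[:, :k] @ c))   # 1.0, ..., 1.0, 0.0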
6 Numerical Results

The adjoint Broyden methods are applied to several nonlinear equation problems. The subset of nonlinear equation problems with variable dimension of the Moré test set [MGH81] is selected. The results for these test problems should give an overview of the performance of the variants of the adjoint Broyden method. Additionally, three specific test problems are selected to investigate the convergence properties of the adjoint Broyden methods in more detail. For that purpose the problem dimensions and initial states are varied. The iteration is globalized by a derivative-free line search in the range of F. This line search was proposed in [Gri86] to prove global convergence of Broyden's method, and it is adapted to the adjoint Broyden method in [Sch07, Sec. 4.3.2].

The compact storage variants of the adjoint Broyden method are implemented in the code abrnlq2, given as Matlab and C routines. For the considered test problems and the Matlab code, derivatives are evaluated by applying AD by hand. The application of the C code uses the AD tool ADOL-C. As proposed in Section 4, we consider three variants of the algorithm. These variants are either applied to the original function or to the preconditioned function, choosing A_{-1} = F'(x_0) or A_{-1} = (F(x_0)^T F'(x_0) F(x_0) / F(x_0)^T F(x_0)) · I.

The nonlinear equation problems with scalable dimension of the Moré test set are given in Table 1.
Table 1: Nonlinear equation problems of the Moré test set

    Number  Name                                 Reference
    (21)    Extended Rosenbrock function         [Spe75]
    (22)    Extended Powell singular function    [Spe75]
    (26)    Trigonometric function               [Spe75]
    (27)    Brown almost-linear function         [Bro69]
    (28)    Discrete boundary value function     [MC79]
    (29)    Discrete integral equation function  [MC79]
    (30)    Broyden tridiagonal function         [Bro65]
    (31)    Broyden banded function              [Bro71]
The column Number gives the number of the problem in [MGH81]. Additionally, the performance of the adjoint Broyden updates is examined in more detail for three specific test problems:
Test function 1: The discrete integral equation function (29) of the Moré test set, given by x = (x_{(i)})_{i=1…n}, F(x) = (f_i(x))_{i=1…n}, and

    f_i(x) = x_{(i)} + (h/2) [ (1 − t_i) Σ_{j=1}^{i} t_j (x_{(j)} + t_j + 1)³ + t_i Σ_{j=i+1}^{n} (1 − t_j)(x_{(j)} + t_j + 1)³ ] .

Here h = 1/(n+1) and t_i = i h. The function F is differentiable and its Jacobian is dense. The default initial iterate is chosen as x_0 = (t_i (t_i − 1))_{i=1…n}.
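A vectorized NumPy sketch of this test function and its default initial iterate (function names are ours):

    import numpy as np

    def discrete_integral_equation(x):
        n = x.size
        h = 1.0 / (n + 1)
        t = h * np.arange(1, n + 1)
        cube = (x + t + 1.0) ** 3
        s1 = np.cumsum(t * cube)                 # sum_{j<=i} t_j (x_j + t_j + 1)^3
        tail = (1.0 - t) * cube
        s2 = np.cumsum(tail[::-1])[::-1] - tail  # sum_{j>i} (1 - t_j)(x_j + t_j + 1)^3
        return x + 0.5 * h * ((1.0 - t) * s1 + t * s2)

    def integral_x0(n):
        t = np.arange(1, n + 1) / (n + 1.0)      # default iterate x0 = (t_i (t_i - 1))
        return t * (t - 1.0)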
Test function 2: The extended Rosenbrock function (21) of the Moré test set, given by x = (x_{(i)})_{i=1…n}, F(x) = (f_i(x))_{i=1…n}, and

    f_i(x) = { 10 (x_{(i+1)} − x_{(i)}²)   if i odd
             { 1 − x_{(i−1)}               if i even .

The function is differentiable and its Jacobian is tridiagonal. The default initial iterate is chosen as x_0 = (−1.2, 1, −1.2, 1, …).
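A corresponding sketch (again with names of our choosing; n must be even):

    import numpy as np

    def extended_rosenbrock(x):
        F = np.empty_like(x)
        F[0::2] = 10.0 * (x[1::2] - x[0::2] ** 2)   # odd i (1-based): 10 (x_{i+1} - x_i^2)
        F[1::2] = 1.0 - x[0::2]                     # even i (1-based): 1 - x_{i-1}
        return F

    def rosenbrock_x0(n):
        x = np.ones(n)
        x[0::2] = -1.2                              # x0 = (-1.2, 1, -1.2, 1, ...)
        return x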
Test function 3: A matrix X ∈ R^{d×d} is sought as the matrix cube root of a given real diagonalizable matrix Z ∈ R^{d×d}, i.e.,

    X³ = X · X · X = Z .        (3)

The eigenvalue decomposition Z = T D T^{-1} yields the diagonal matrix D = diag{λ_1, …, λ_d}. Denoting

    D^{1/3} = diag{λ_1^{1/3}, …, λ_d^{1/3}},

one obtains for X = T D^{1/3} T^{-1} the identity

    X³ = T D T^{-1} = Z .

Thus problem (3) has a solution and can be formulated as a nonlinear equation problem by

    F(X) = X³ − Z = 0 ∈ R^{d×d}

with dimension n = d². In the implementation the matrix X is associated row-wise with the state vector x = (x_{(i)}), where

    x_{([k−1]d + l)} = X_{k,l}   for k, l = 1, …, d .

Here we choose Z = tridiag(−1, 2, −1). As default initial iterate the identity matrix X_0 = I ∈ R^{d×d} is used. Note that the (i, j)-th entry of X impacts all elements in the i-th row as well as the j-th column of X². Consequently the same entry impacts all elements of X³, which means that the Jacobian of this test function F(X) is dense and thus has d⁴ nonzero entries.
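A sketch of the residual map and the problem data (names are ours):

    import numpy as np

    def cube_root_residual(x, Z):
        d = Z.shape[0]
        X = x.reshape(d, d)              # row-wise association x_{([k-1]d+l)} = X_{k,l}
        return (X @ X @ X - Z).ravel()

    def tridiag_Z(d):
        return 2.0 * np.eye(d) - np.eye(d, k=1) - np.eye(d, k=-1)

    # default initial iterate: x0 = np.eye(d).ravel()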
Convergence results for the Moré test set functions. To illustrate the performance of the adjoint Broyden update methods, the numbers of iterations needed to reach convergence with a reasonable tolerance are compared. Additionally, the run times required for the whole iteration process are stated. For that purpose, the C version of the program was compiled using gcc 4.1 and executed on a PC with an AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ with 2 GHz and 512 KB cache size.

The results for the higher dimensional nonlinear equation problems of the Moré test set with default initial iterates are displayed in Table 2. The compact storage representation of the adjoint Broyden method is compared to the full storage representation based on updating an LU factorization of A_k. The update is evaluated by an algorithm of Bennett [Ben65]. The numbers in the first column refer to the number of the test problem in [MGH81]. If not otherwise stated, these tests are performed for dimension n = 1000 using the initial iterates as proposed in the test set. The iteration is performed up to a tolerance of tolF = 10^{-14} in the residual ||F(x_i)||_2 and at most 500 iterations.
Table 2: Results for the Moré test set with default initial iterates

    Test problem        full adj. Broy.    (0)        (1)        (2)
    (21)        (a)           —            183        190         —
                (b)                        0.64       0.59
                (c)                        177;0      184;0
    (21) (P1)   (a)           15           14         20          —
                (b)           0.36         0.40       0.78
                (c)           24;24        24;24      26;26
    (22)        (a)           —            44         44          —
                (b)                        0.05       0.05
                (c)                        9;4        14;7
    (22) (P1)   (a)           28           28         28          —
                (b)           0.68         0.79       1.10
                (c)           0;0          0;0        0;0
    (26)¹       (a)           14           13         14         116
                (b)           0.36         0.03       0.04       0.41
                (c)           3;1          3;0        2;0        117;7
    (26)¹ (P1)  (a)           17           17         21          —
                (b)           0.43         0.49       0.85
                (c)           1;0          1;0        4;1
    (27)²       (a)           9            9          9          226
                (b)           3.1e-4       3.9e-4     4.1e-4     0.13
                (c)           1;0          0;0        0;0        0;0
    (27)² (P1)  (a)           237          237        276         —
                (b)           9.4e-3       0.17       0.27
                (c)           464;464      464;464    547;545
    (28) (P1)   (a)           4            4          4           4
                (b)           0.10         0.13       0.16       0.08
                (c)           0;0          0;0        0;0        0;0
    (29)        (a)           8            7          8           8
                (b)           5.39         5.13       8.75       5.27
                (c)           0;0          0;0        0;0        0;0
    (29) (P1)   (a)           5            5          6           6
                (b)           3.25         3.98       6.86       4.18
                (c)           0;0          0;0        0;0        0;0
    (30)        (a)           51           51         53          89
                (b)           1.26         0.09       0.09       0.14
                (c)           2;0          2;0        1;0        1;0
    (30) (P1)   (a)           15           15         15          18
                (b)           0.37         0.44       0.61       0.35
                (c)           0;0          0;0        0;0        0;0
    (31)³       (a)           55           42         30          70
                (b)           1.42         0.10       0.09       0.17
                (c)           18;0         10;0       3;0        58;0
    (31)³ (P1)  (a)           19           19         18          36
                (b)           0.49         0.54       0.72       0.68
                (c)           0;0          0;0        0;0        0;0

(P1) preconditioned problem with A_{-1} = F'(x_0); (a) iteration counts, (b) run times in seconds, (c) additional line-search trials; sign changes in step multiplier; default problem dimension n = 1000.
¹ Initial iterate chosen as x_0 = (1/2) x̂_0 with x̂_0 as proposed in the test set; otherwise no convergence is achieved for dimension n = 1000.
² Dimension is n = 10, tolerance is tolF = 10^{-12}.
³ Tolerance is tolF = 10^{-12}.
As one can see, nothing is gained by the compact storage implementations when the initial Jacobian F'(x_0) is evaluated, factorized and then used as a preconditioner, which is mathematically equivalent to starting the adjoint Broyden method with A_0 = F'(x_0). Then there is essentially no saving with regards to the linear algebra effort. However, on the test problems 21 and 22 our dense implementation of full adjoint Broyden does not work at all, whereas the first two compact storage versions work quite nicely. Judging by our experience so far, the trouble of evaluating adjoint vectors, i.e. row-vector Jacobian products, seems to pay off, since the third version based exclusively on Jacobian-vector products performs significantly worse on these smooth but nonlinear problems. All three versions generate identical iterates
on affine problems, of course. On test problem 26 with diagonal preconditioning the first two compact storage versions generate virtually the same iterates as the full storage version, but the run time is reduced by a factor of ten, which is not surprising since n = 1000. Similar benefits are obtained for problems 30 and 31 when the preconditioner is a multiple of the identity. Here preconditioning by the initial Jacobian reduces the number of iterations but does prolong the run time significantly. Hence we may conclude that the compact storage implementation is indeed quite efficient, especially when the overall number of steps is only a fraction of the problem dimension.
In addition, the problems of the Moré test set are solved for initial iterates further away from the solution. The approach of multiplying the initial iterate by a scalar factor to test the performance of a method is suggested in [MGH81]. Table 3 displays the required iterations and run times. The choice of the dimension n and the tolerance tolF for these test problems is the same as before.
Table 3: Results for the Moré test set with distant initial iterates

    Test problem              full adj. Broy.    (0)        (1)        (2)
    (21) (P1), 100 x_0   (a)        6            4          12          —
                         (b)        0.15         0.13       0.46
                         (c)        2;2          2;2        4;4
    (22), 100 x_0        (a)        —            45         65          —
                         (b)                     0.05       0.09
                         (c)                     9;5        27;12
    (22) (P1), 100 x_0   (a)        34           34         34          —
                         (b)        0.81         0.92       1.30
                         (c)        0;0          0;0        0;0
    (26), 10 x_0         (a)        —            34         24         130
                         (b)                     0.08       0.08       0.49
                         (c)                     18;10      5;2        180;89
    (26) (P1), 10 x_0    (a)        47           47         48         185
                         (b)        1.21         1.34       1.94       3.88
                         (c)        1;0          1;0        1;0        11;7
    (27)⁴, 20 x_0        (a)        18           18         148         —
                         (b)        5.9e-4       8.4e-4     0.04
                         (c)        7;3          3;1        164;54
    (27)⁴ (P1), 20 x_0   (a)        58           58         93          —
                         (b)        2.1e-3       4.7e-3     0.01
                         (c)        63;44        40;18      110;92
    (28) (P1), 100 x_0   (a)        24           24         24          —
                         (b)        0.59         0.69       0.96
                         (c)        1;0          1;0        2;0
    (29), 100 x_0        (a)        20           16         14          38
                         (b)        18.08        10.57      14.87      25.17
                         (c)        1;0          0;0        0;0        9;0
    (29) (P1), 100 x_0   (a)        26           26         26          —
                         (b)        16.71        17.51      28.22
                         (c)        0;0          0;0        0;0
    (30)⁵, 100 x_0       (a)        —            —          67          —
                         (b)                                0.12
                         (c)                                8;0
    (30)⁵ (P1), 100 x_0  (a)        56           57         57          —
                         (b)        1.37         1.62       2.28
                         (c)        20;0         20;0       17;0
    (31)⁵, 10 x_0        (a)        —            —          48          —
                         (b)                                0.16
                         (c)                                7;0
    (31)⁵ (P1), 10 x_0   (a)        35           35         35          —
                         (b)        0.90         1.03       1.44
                         (c)        8;0          8;0        10;0

(P1) preconditioned problem with A_{-1} = F'(x_0); (a) iteration counts, (b) run times in seconds, (c) additional line-search trials; sign changes in step multiplier; default problem dimension n = 1000.
⁴ Dimension is n = 10, tolerance is tolF = 10^{-12}.
⁵ Tolerance is tolF = 10^{-12}.
From remoter initial points the difference between the first two versions (0) and (1) of the compact storage implementation becomes more marked. The latter requires only about half the storage but seems to do a better job at discarding older information, as described at the end of Section 4. Hence it succeeds on problems 30 and 31 with diagonal preconditioning, where the original method fails. Obviously, some kind of restart must be developed, especially in view of problem 27, where the iteration count exceeds the dimension. In such cases one also needs a transition from the compact to the full storage scheme, which is yet to be developed.
Performance on the special test problems 1-3. For the specific test problem functions, varying problem dimensions and initial iterates, the Matlab version of the adjoint Broyden variants is compared to the built-in Matlab function fsolve for the solution of nonlinear equations. For the test functions 1 and 3 tolF = 10^{-14}, and for test function 2 tolF = 10^{-12}. The maximal number of iterations allowed is again i_max = 500. Here the run times for the preconditioned problems include the run time to evaluate and factorize the initial Jacobian. Apparently Matlab uses some version of the Levenberg-Marquardt method, which leads to significantly smaller iteration counts compared to the diagonally preconditioned adjoint Broyden method. However, the total run times are always significantly larger, presumably because a lot of effort goes into the differencing for Jacobian approximations. For remote initial points the preconditioning may not pay, even in terms of the iteration number, and certainly not with respect to the run time. Obviously the very cheap diagonal preconditioning approach is a good idea and sometimes makes the difference between success and failure. Note that we have fixed the diagonal scaling at the initial point, whereas the compact storage representation would easily allow readjusting it repeatedly at virtually no extra cost.
Table 4: Results for test function 1

    Test problem          (0)        (1)        (2)       fsolve
    default initial iterate 100 x_0, varying problem dimension
    n = 10         (a)    16         14         37
                   (b)    1.5e-2     1.2e-2     2.5e-2
    n = 10 (P1)    (a)    21         21          —          12
                   (b)    2.3e-2     2.2e-2                0.21
    n = 10 (P2)    (a)    22         21          —
                   (b)    2.2e-2     2.0e-2
    n = 100        (a)    16         14         37
                   (b)    5.5e-2     5.4e-2     0.14
    n = 100 (P1)   (a)    26         26          —          13
                   (b)    0.14       0.15                  3.38
    n = 100 (P2)   (a)    22         22          —
                   (b)    9.1e-2     0.10
    n = 1000       (a)    16         14         38
                   (b)    1.48       1.72       3.41
    n = 1000 (P1)  (a)    26         26          —          15
                   (b)    34.89      36.53                 515
    n = 1000 (P2)  (a)    22         22          —
                   (b)    1.83       2.78
    default problem dimension n = 100, varying scaling of the initial iterate
    10 x_0         (a)    8          8          12
                   (b)    3.3e-2     3.5e-2     4.3e-2
    10 x_0 (P1)    (a)    8          8          10           8
                   (b)    8.2e-2     8.6e-2     8.4e-2      1.9
    10 x_0 (P2)    (a)    9          9          10
                   (b)    4.2e-2     4.3e-2     4.2e-2
    500 x_0        (a)    49         18         98
                   (b)    0.21       6.8e-2     0.51
    500 x_0 (P1)   (a)    75         75          —          19
                   (b)    0.40       0.46                  4.6
    500 x_0 (P2)   (a)    32         32          —
                   (b)    0.12       0.14
    1000 x_0       (a)    81         20         101
                   (b)    0.42       7.4e-2     0.54
    1000 x_0 (P1)  (a)    121        121         —          23
                   (b)    0.74       0.82                  5.6
    1000 x_0 (P2)  (a)    36         36          —
                   (b)    0.15       0.17

(P1) preconditioned problem with A_{-1} = F'(x_0), (P2) preconditioned problem with A_{-1} = (F(x_0)^T F'(x_0) F(x_0) / F(x_0)^T F(x_0)) · I; (a) iteration counts, (b) run times in seconds.
Table 5: Results for test function 2

    Test problem          (0)        (1)        fsolve
    default initial iterate x_0, varying problem dimension n, tolF = 10^{-12}
    n = 10         (a)    188        206
                   (b)    0.38       0.50
    n = 10 (P1)    (a)    14         19          17
                   (b)    1.4e-2     1.4e-2      0.19
    n = 10 (P2)    (a)    146        148
                   (b)    0.19       0.19
    n = 100        (a)    182        189
                   (b)    0.34       0.36
    n = 100 (P1)   (a)    14         19          23
                   (b)    2.6e-2     3.0e-2      0.40
    n = 100 (P2)   (a)    144        145
                   (b)    0.22       0.21
    n = 1000       (a)    183        189
                   (b)    2.0        1.7
    n = 1000 (P1)  (a)    14         19          18
                   (b)    1.3        1.9         12.0
    n = 1000 (P2)  (a)    144        145
                   (b)    1.5        1.1
    default problem dimension n = 100, varying scaling of the initial iterate
    2 x_0          (a)    28         371
                   (b)    1.8e-2     3.3
    2 x_0 (P1)     (a)    5          11          25
                   (b)    1.8e-2     2.2e-2      0.28
    2 x_0 (P2)     (a)    181        174
                   (b)    0.41       0.34
    10 x_0         (a)    —          497
                   (b)               9.0
    10 x_0 (P1)    (a)    4          10          10
                   (b)    1.7e-2     2.1e-2      0.16
    10 x_0 (P2)    (a)    437        387
                   (b)    6.9        4.3
    100 x_0        (a)    —          —
                   (b)
    100 x_0 (P1)   (a)    4          11          13
                   (b)    1.7e-2     2.1e-2      0.14
    100 x_0 (P2)   (a)    —          —
                   (b)

(P1) preconditioned problem with A_{-1} = F'(x_0), (P2) preconditioned problem with A_{-1} = (F(x_0)^T F'(x_0) F(x_0) / F(x_0)^T F(x_0)) · I; (a) iteration counts, (b) run times in seconds.
Table 6: Results for test function 3

    Test problem          (0)        (1)        (2)       fsolve
    default initial iterate x_0, varying problem dimension
    n = 100        (a)    23         32          —
                   (b)    3.1e-2     5.6e-2
    n = 100 (P1)   (a)    18         19          —           7
                   (b)    8.2e-2     9.3e-2                 0.24
    n = 100 (P2)   (a)    18         19          —
                   (b)    3.1e-2     3.8e-2
    n = 1024       (a)    42         48          —
                   (b)    0.94       1.7
    n = 1024 (P1)  (a)    39         40          —           9
                   (b)    15.0       15.9                   52.4
    n = 1024 (P2)  (a)    39         40          —
                   (b)    0.91       1.36
    n = 4900       (a)    74         79          —
                   (b)    16.2       27.1
    n = 4900 (P1)  (a)    65         67          —          10
                   (b)    704        724                    2772
    n = 4900 (P2)  (a)    65         67          —
                   (b)    14.8       21.9
    default problem dimension n = 100, varying scaling of the initial iterate
    10 x_0         (a)    65         27         145
                   (b)    9.0e-2     4.4e-2     0.33
    10 x_0 (P1)    (a)    35         34          —          15
                   (b)    0.11       0.13                   0.52
    10 x_0 (P2)    (a)    35         34          —
                   (b)    5.5e-2     6.7e-2
    100 x_0        (a)    210        36         212
                   (b)    0.84       5.7e-2     0.80
    100 x_0 (P1)   (a)    49         49          —          24
                   (b)    0.14       0.17                   0.82
    100 x_0 (P2)   (a)    49         50          —
                   (b)    7.3e-2     0.10
    1000 x_0       (a)    —          46         175
                   (b)               7.2e-2     0.57
    1000 x_0 (P1)  (a)    60         60          —          31
                   (b)    0.16       0.20                   1.08
    1000 x_0 (P2)  (a)    60         60          —
                   (b)    9.1e-2     0.12

(P1) preconditioned problem with A_{-1} = F'(x_0), (P2) preconditioned problem with A_{-1} = (F(x_0)^T F'(x_0) F(x_0) / F(x_0)^T F(x_0)) · I; (a) iteration counts, (b) run times in seconds.
Comparison of limited memory approaches. Finally, we report some preliminary results on the limited memory implementation sketched in Section 4, where the columns of V_k, W_k and H_k are periodically overwritten once k exceeds a certain limit m. We use a linear test problem, so that there is no mathematical difference between our various compact storage versions, and we may use the one that has almost exactly the same memory requirement as GMRES, namely the one that stores the m×n matrix V and the m×m matrix H.

Figure 1 compares Matlab's GMRES solver with restart to the limited memory implementation of the adjoint Broyden method. For this test the 2D Poisson equation with Dirichlet boundary conditions on a square domain and five-point discretization is solved. Although this yields a symmetric linear problem, which could be tackled by a CG method, we use it here to compare the general nonsymmetric solvers. The dimension of the test problem is n = 100 and it is solved up to a tolerance of tolF = 10^{-12}.
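A sketch of the test matrix, assuming the standard five-point stencil on an m×m interior grid, so that m = 10 gives the n = 100 problem used here:

    import numpy as np

    def poisson_2d(m):
        """Dense five-point 2D Poisson matrix (Dirichlet), n = m^2."""
        I = np.eye(m)
        T = 4.0 * I - np.eye(m, k=1) - np.eye(m, k=-1)   # 1D part: tridiag(-1, 4, -1)
        S = np.eye(m, k=1) + np.eye(m, k=-1)
        return np.kron(I, T) - np.kron(S, I)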
Without limiting the memory, GMRES and adjoint Broyden are mathematically equivalent Krylov space methods and reach the required tolerance at the 15th iteration. By restricting the number m we destroy the Krylov subspace property and the convergence becomes significantly slower. As one can see from the plot in Figure 1, GMRES(m) takes about twice as many steps as our 'periodic' adjoint Broyden version. That may be explained by the fact that GMRES with restart every m steps utilizes on average information in only m/2 directions about the problem function, whereas adjoint Broyden uses m of them throughout. Strictly speaking this means also that, as far as the linear algebra is concerned, the GMRES(m) iterations cost only about half as much, though there is probably a lot of common overhead, especially if m is small. Our main concern is of course the number of iterations, since we assume that each function evaluation is quite expensive.
Figure 1: Comparison of limited memory adjoint Broyden and GMRES
[Plot: total number of required iterations (y-axis, 0 to 250) versus the number of allowed inner iterations in GMRES and the number of stored updates (plus initial update) in the adjoint Broyden method (x-axis, 2 to 16); curves for Matlab's GMRES function with restart and for the limited memory adjoint Broyden method; both reach the tolerance at iteration 15 without memory limitation.]
7 Conclusion and Outlook

In this paper we have developed several compact storage implementations of the adjoint Broyden method and shown that on affine problems they all yield iterates identical to GMRES. For that result we assumed exact line-searches, which is quite natural and realistic in the affine case. From a numerical linear algebra point of view our treatment is somewhat unsatisfactory in that we have barely given any consideration to issues of round-off propagation. In particular we have not worried about the fact that applying the compact representation of the inverse of A_k or its adjugate to the current residual amounts to orthogonalization by unmodified Gram-Schmidt. From a more nonlinear point of view, getting approximating Jacobians right to a couple of digits is already a quite satisfactory achievement, so that numerical effects at the level of the machine precision or even its root are of little concern. Nevertheless, it should be investigated whether one may design an implementation for general nonlinear problems that automatically reduces to the standard GMRES procedure on affine problems.

Of greater importance for the nonlinear scenario are issues related to the diagonal (re)scaling of the initial Jacobian and the thorny issue of if and how to reset the procedure when the storage limits are reached or older information appears to become obsolete. For that purpose one might monitor the subdiagonal entries in the projected Jacobian H_k or the entries in R_k. They must all vanish exactly in the affine case and should therefore be rather small near the roots of smooth functions. Their relative size might also allow a smarter selection of the directions to be discarded.
8 Acknowledgements

The first author performed his research for this paper at IRISA Rennes, where he greatly benefited from the hospitality and the GMRES expertise of Bernard Philippe and his colleagues.
References

[Ben65] J. M. Bennett. Triangular factors of modified matrices. Numerische Mathematik, 7:217-221, 1965.

[Bro65] C. G. Broyden. A class of methods for solving nonlinear simultaneous equations. Math. Comp., 19:577-593, 1965.

[Bro69] K. M. Brown. A quadratically convergent Newton-like method based upon Gaussian elimination. SIAM J. Numer. Anal., 6:560-569, 1969.

[Bro71] C. G. Broyden. The convergence of an algorithm for solving sparse nonlinear systems. Math. Comp., 25:285-294, 1971.

[Gri86] A. Griewank. The "global" convergence of Broyden-like methods with a suitable line search. J. Austral. Math. Soc. Ser. B, 28:75-92, 1986.

[GSW06] A. Griewank, S. Schlenkrich, and A. Walther. A quasi-Newton method with optimal R-order without independence assumption. MATHEON Preprint 340, 2006. Submitted to Opt. Meth. and Soft.

[MC79] J. J. Moré and M. Y. Cosnard. Numerical solution of nonlinear equations. TOMS, 5:64-85, 1979.

[MGH81] J. J. Moré, B. S. Garbow, and K. E. Hillstrom. Testing unconstrained optimization software. TOMS, 7:17-41, 1981.

[NW99] J. Nocedal and S. J. Wright. Numerical Optimization. Springer-Verlag, 1999.

[Saa03] Y. Saad. Iterative Methods for Sparse Linear Systems. SIAM, 2003.

[Sch07] S. Schlenkrich. Adjoint-based Quasi-Newton Methods for Nonlinear Equations. Sierke Verlag, in press, 2007.

[SGW06] S. Schlenkrich, A. Griewank, and A. Walther. Local convergence analysis of TR1 updates for solving nonlinear equations. MATHEON Preprint 337, 2006.

[Spe75] E. Spedicato. Computational experience with quasi-Newton algorithms for minimization problems of moderately large size. Rep. CISE-N-175, Segrate (Milano), 1975.

[SW06] S. Schlenkrich and A. Walther. Global convergence of quasi-Newton methods based on adjoint Tangent Rank-1 updates. TU Dresden Preprint MATH-WR-02-2006, 2006. Submitted to Applied Numerical Mathematics.