Convergence Results of A Local Minimax Method for
Finding Multiple Critical Points

Yongxin Li∗ and Jianxin Zhou†
Abstract
In [14], a new local minimax method that characterizes a saddle point as a solution to a local minimax problem was established. Based on this local characterization, a numerical minimax algorithm was designed for finding multiple saddle points. Numerical computations for many examples of semilinear elliptic PDEs have been carried out successfully to solve for multiple solutions. One important issue remained unsolved, namely the convergence of the numerical minimax method. In this paper, Step 5 of the algorithm is first modified with the design of a new stepsize rule that is practically easier to implement and with which convergence results of the numerical minimax method are established for isolated and non-isolated critical points. The convergence results show that the algorithm indeed exceeds the scope of a minimax principle. In the last section, numerical multiple solutions to Henon's equation and to a sublinear elliptic equation subject to the zero Dirichlet boundary condition are presented to show their numerical convergence and profiles.
Keywords. Multiple critical points, Local peak selection, Minimax algorithm, Convergence
AMS(MOS) subject classifications. 58E05, 58E30, 35B38, 35A15, 65K15
Abbreviated title. Convergence Results of A Minimax Method
∗ IBM T. J. Watson Research Center, Yorktown Hts, NY 10598.
† Department of Mathematics, Texas A&M University, College Station, TX 77843. Supported in part by NSF Grant DMS 96-10076.
1 Introduction
Let H be a Hilbert space and J : H → R a Frechet differentiable functional, called a generic energy functional. Denote by J′ or ∇J its Frechet derivative and by J″ its second Frechet derivative if it exists. A point û ∈ H is a critical point of J if

J′(û) = 0.

A number c ∈ R is called a critical value of J if J(û) = c for some critical point û. For a critical value c, the set J⁻¹(c) is called a critical level. A critical point û is said to be non-degenerate if J″(û) exists and is invertible as a linear operator J″(û) : H → H; otherwise û is said to be degenerate. The first candidates for a critical point are the local maxima and minima, to which the classical critical point theory in the calculus of variations was devoted. Traditional numerical methods, e.g., variational methods and monotone iteration (sub-super solution) methods, focus on finding such stable solutions. Critical points that are not local extrema are unstable and are called saddle points, that is, critical points u* of J for which any neighborhood of u* in H contains points v, w s.t. J(v) < J(u*) < J(w). In physical systems, saddle points appear as unstable equilibria or transient excited states.

According to Morse theory, the Morse index (MI) of a critical point û of a real-valued functional J is the maximal dimension of a subspace of H on which the operator J″(û) is negative definite. Thus for a non-degenerate critical point, if its MI = 0, then it is a local minimizer of the generic energy functional J and therefore a stable solution, and if its MI > 0, then it is a min-max type saddle point and an unstable solution.
Multiple solutions with different performance and instability indices exist in many nonlinear problems in the natural and social sciences [23, 21, 17, 25, 16]. Stability is one of the main concerns in system design and control theory. However, in many applications, performance or maneuverability is more desirable, in particular in the design or control of emergency or combat machinery. Meanwhile, unstable solutions may have much higher maneuverability or performance indices. To provide a choice or balance between stability and maneuverability or performance, it is important to solve for multiple solutions. When such problems are variational, they become multiple critical point problems. Thus it is important for both theory and applications to numerically solve for multiple critical points in a stable way. So far, little is known in the literature on devising such a feasible numerical algorithm. One might mention Newton's method. Without knowing or using the local structure of a (degenerate) critical point, the usual Newton's method will not be effective or stable. When a local minimization is used in a quasi-Newton method, it will lead to a local minimum, a stable solution.
The minimax principle, which characterizes a critical point as a solution to the two-level optimization problem

min_{A∈𝒜} max_{v∈A} J(v)

for some collection 𝒜 of subsets A of H, is one of the most popular approaches in critical point theory. However, most minimax theorems in the literature (see [1], [19], [20], [21], [22], [25]), such as the mountain pass, various linking and saddle point theorems, require one to solve a two-level global minimax problem and are therefore not suitable for algorithm implementation.
In [14], motivated by the numerical works of Choi-McKenna [7] and Ding-Costa-Chen [12], by Morse theory and by the idea of defining a solution submanifold, a new local minimax theorem which characterizes a saddle point as a solution to a two-level local minimax problem was developed. Based on this local characterization, a new numerical minimax algorithm for finding multiple saddle points was devised. The numerical algorithm was implemented successfully to solve a class of semilinear elliptic PDEs on various domains for multiple solutions [14]. Due to the limitation of time and the more profound analysis required, one very important issue remained unsolved in [14], namely the convergence of the numerical local minimax method, a paramount issue for any numerical method. The main objective of this paper is to establish some convergence results for the algorithm.

The organization of this paper is as follows. We first modify Step 5 of the algorithm by developing a new stepsize rule in Section 2. This is great progress in the development of the numerical algorithm. First, the algorithm becomes practically easier to implement. Some properties of the algorithm are verified therewith. Secondly, the new rule enables us to establish some convergence results for the algorithm in Section 3. To the best of our knowledge, such convergence results are the first to be established in the literature of critical point theory for saddle points. The convergence analysis in Section 3 shows that our algorithm can actually be used to find a more general, non-minimax type critical point. In the last section we display numerical convergence data and profiles of multiple solutions to Henon's equation and to a sublinear elliptic equation, which have several features distinct from all those examples previously computed in [7, 12, 6, 14]. For instance, Henon's equation has an explicit dependence on the space variable x, and for the sublinear equation, its "gradient flow" exists only locally near certain directions and its sign-changing solution is not a minimax solution, which is beyond the original expectation of our local minimax method. In the rest of this section, we introduce some notations and theorems from [14] for future use. In particular, a new local existence theorem is established.
For any subspace H′ ⊂ H, let S_{H′} = {v ∈ H′ : ‖v‖ = 1} be the unit sphere in H′. Let L be a closed subspace of H, called a base space, and let H = L ⊕ L⊥ be the orthogonal decomposition where L⊥ is the orthogonal complement of L in H. For each v ∈ S_{L⊥}, we define a closed half space [L, v] = {tv + w : w ∈ L, t ≥ 0}.

Definition 1.1 A set-valued mapping P : S_{L⊥} → 2^H is called the peak mapping of J w.r.t. H = L ⊕ L⊥ if for any v ∈ S_{L⊥}, P(v) is the set of all local maximum points of J on [L, v]. A single-valued mapping p : S_{L⊥} → H is a peak selection of J w.r.t. L if

p(v) ∈ P(v) ∀v ∈ S_{L⊥}.

For a point v ∈ S_{L⊥}, we say that J has a local peak selection w.r.t. L at v if there is a neighborhood N(v) of v and a mapping p : N(v) ∩ S_{L⊥} → H such that

p(u) ∈ P(u) ∀u ∈ N(v) ∩ S_{L⊥}.
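For intuition, the following is a minimal Python sketch of a peak selection in a toy finite-dimensional setting; the quadratic-minus-quartic energy and the matrix A are hypothetical stand-ins (not the paper's PDE problems), and L = {0} so that [L, v] reduces to the ray {tv : t ≥ 0}:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Toy energy J(u) = 1/2 <Au,u> - 1/4 sum u_i^4, a hypothetical
# finite-dimensional analogue of the energy of -Delta u = u^3; A is SPD.
A = np.diag([1.0, 4.0])          # its eigenvalues play the role of lambda_i

def J(u):
    return 0.5 * u @ A @ u - 0.25 * np.sum(u**4)

def peak_selection(v):
    """p(v): local maximizer of J on the ray [L, v] = {t v : t >= 0}, L = {0}."""
    v = v / np.linalg.norm(v)    # v in S_{L^perp}
    res = minimize_scalar(lambda t: -J(t * v), bounds=(1e-8, 10.0), method="bounded")
    return res.x * v

v = np.array([np.cos(0.3), np.sin(0.3)])
p_v = peak_selection(v)
# for this toy energy there is a closed form t* = sqrt(<Av,v> / sum v_i^4)
t_star = np.sqrt((v @ A @ v) / np.sum(v**4))
print(p_v, t_star * v)
```

With L = {0} the maximization is one-dimensional; for L = span{w_1, . . . , w_{n−1}} it becomes a local maximization over the half space [L, v], as in Step 2 of the algorithm below.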
Most minimax theorems in critical point theory require one to solve a two-level global minimax problem and are therefore not suitable for algorithm implementation, while our local minimax algorithm requires one to solve only a two-level local minimax problem. This is a great advance in numerical algorithm implementation. However, as pointed out in [14], three major theoretical problems may arise: (a) for some v ∈ S_{L⊥}, P(v) may contain multiple local maxima in [L, v]; in particular, P may contain multiple branches, even U-turn or bifurcation points; (b) p may not be defined at some points in S_{L⊥}; (c) the limit of a sequence of local maximum points may not be a local maximum point. Although these problems may not arise for the superlinear elliptic PDEs studied in [14], in a more general setting all three problems have to be resolved. Thus the analysis involved becomes much more complicated. We have devoted great effort to solving these three problems. We solve (a) and (b) by using a local peak selection. Numerically this is done by following a certain negative gradient flow and developing consistent strategies to avoid jumps between different branches of P. As for Problem (c), in this paper we show that as long as a sequence generated by the algorithm converges, the limit still yields a saddle point. In a future paper, we will generalize the definition of p by developing a new approach and further solve the problem. Since that approach is beyond the scope of a minimax principle, more profound analysis will be involved.
Lemma 2.1 in [14] is an important result in the development of the local minimax characterization. However, it normalizes the steepest descent direction d = −J′(p(vδ))/‖J′(p(vδ))‖. Since our objective is to find a critical point, i.e., J′(p(vδ)) = 0, the algorithm will generate a sequence {v_n} such that J′(p(v_n)) → 0. Normalizing the steepest descent direction −J′(p(v_n)) would introduce an extra error term into the error analysis. This is also confirmed by our numerical computation. Thus we will not normalize the steepest descent direction −J′(p(vδ)) in the modified algorithm in this paper. Therefore some modifications in the theory are required.
Lemma 1.1 Let vδ ∈ S_{L⊥} be a point. If J has a local peak selection p w.r.t. L at vδ such that p is continuous at vδ and dis(p(vδ), L) > α > 0 for some α > 0, then either J′(p(vδ)) = 0 or, for any δ > 0 with ‖J′(p(vδ))‖ > δ, there exists s₀ > 0 such that

J(p(v(s))) − J(p(vδ)) < −αδ‖v(s) − vδ‖

for any 0 < s < s₀, where

v(s) = (vδ + sd)/‖vδ + sd‖,   d = −J′(p(vδ)).
Proof: See Appendix.
The above lemma indicates that v(s) defined in the lemma represents a direction of a certain negative gradient flow of J(p(·)) from vδ. Hence it is clear that if p(v₀) is a local minimum point of J on any subset containing the path p(v₀(s)) for some small s > 0, then J′(p(v₀)) = 0. In particular, when we define a solution manifold

M = {p(v) : v ∈ S_{L⊥}},

we have p(v(s)) ∈ M. A solution submanifold was first introduced by Nehari in the study of a dynamical system [18] and then used by Ding-Ni in the study of semilinear elliptic PDEs [20]. They prove that a global minimum point of a generic energy function J on their solution submanifold yields a saddle point, basically with MI = 1. It is easy to see that when we set L = {0}, our solution submanifold M coincides with theirs; thus our definition generalizes the notion of a solution (stable) submanifold. This is also the reason why we call M a solution submanifold. Note that a saddle point is not a stable solution. However, if a saddle point is characterized as a local minimum of the generic energy function J on the solution submanifold, the saddle point becomes a stable solution relative to the solution manifold M. That is why we sometimes call a solution submanifold M a stable submanifold.
Theorem 1.1 [14] Suppose J has a local peak selection p w.r.t. L at a point v₀ ∈ S_{L⊥} s.t.

(i) p is continuous at v₀,

(ii) dis(p(v₀), L) > 0 and

(iii) v₀ is a local minimum point of J(p(v)) on S_{L⊥}.

Then p(v₀) is a critical point of J.
Since a peak selection may exist only locally, in the next existence theorem we localize the conditions of Theorem 2.2 in [14]. This result is motivated by Example 4.2 in Section 4. The following (PS) condition is used to replace the usual compactness condition.

Definition 1.2 A function J ∈ C¹(H) is said to satisfy the Palais-Smale (PS) condition if any sequence {u_n} ⊂ H with J(u_n) bounded and J′(u_n) → 0 has a convergent subsequence.
Theorem 1.2 Assume that J : H → R is C¹ and satisfies the (PS) condition, and that L ⊂ H is a closed subspace. If there exist an open set O ⊂ L⊥ and a local peak selection p of J w.r.t. L defined on Ō ∩ S_{L⊥} such that

(i) c = inf_{v∈O∩S_{L⊥}} J(p(v)) > −∞,

(ii) J(p(v)) > c ∀v ∈ ∂Ō ∩ S_{L⊥},

(iii) p(v) is continuous on Ō ∩ S_{L⊥},

(iv) d(p(v), L) ≥ α for some α > 0 and all v ∈ O ∩ S_{L⊥},

then c is a critical value, i.e., there exists v* ∈ O ∩ S_{L⊥} such that

J′(p(v*)) = 0,   J(p(v*)) = c = min_{v∈O∩S_{L⊥}} J(p(v)).

Proof: Define

J̄(p(v)) = J(p(v)) if v ∈ Ō ∩ S_{L⊥},   J̄(p(v)) = +∞ if v ∉ Ō ∩ S_{L⊥}.

It can easily be checked that J̄(p(v)) so defined is lower semicontinuous and bounded from below on the metric space S_{L⊥}. Let u_n ∈ O ∩ S_{L⊥} be chosen such that

(1.1)   J̄(p(u_n)) = J(p(u_n)) ≤ c + 1/n².

By Ekeland's variational principle, there exists v_n ∈ S_{L⊥} such that

J̄(p(v_n)) − J̄(p(v)) ≤ (1/n)‖v_n − v‖ ∀v ∈ S_{L⊥},
J̄(p(v_n)) − J̄(p(u_n)) ≤ −(1/n)‖v_n − u_n‖.

By the definition of J̄(p(v)) and Assumptions (ii) and (iii), we must have v_n ∈ O ∩ S_{L⊥} and then

(1.2)   J(p(v_n)) − J(p(v)) ≤ (1/n)‖v_n − v‖ ∀v ∈ O ∩ S_{L⊥},

(1.3)   J(p(v_n)) − J(p(u_n)) ≤ −(1/n)‖v_n − u_n‖.

It follows that

c ≤ J(p(v_n)) ≤ c + 1/n² − (1/n)‖v_n − u_n‖,

or ‖v_n − u_n‖ ≤ 1/n.
For those n with J′(p(v_n)) ≠ 0, by our Lemma 1.1 there exists s_n > 0 such that for s_n > s > 0 we have v_n(s) ∈ O ∩ S_{L⊥} and

(1.4)   J(p(v_n(s))) − J(p(v_n)) ≤ −(α/2)‖v_n(s) − v_n‖ ‖J′(p(v_n))‖,

or

(1.5)   ‖J′(p(v_n))‖ ≤ 2(J(p(v_n)) − J(p(v_n(s))))/(α‖v_n(s) − v_n‖).

If (iv) is satisfied, by (1.2) the right hand side of (1.5) is at most 2/(nα). Thus we have

J(p(v_n)) → c,   ‖J′(p(v_n))‖ → 0.

By the (PS) condition, there exists a subsequence {v_{n_k}} ⊂ {v_n} such that

v_{n_k} → v* ∈ Ō ∩ S_{L⊥}.

It leads to

J(p(v*)) = c,   J′(p(v*)) = 0.

By Assumption (ii), v* ∈ O ∩ S_{L⊥}. Hence

c = J(p(v*)) = min_{v∈O∩S_{L⊥}} J(p(v)).
Although the above local existence result has been established, it should be understood that our main concern in this paper is to develop numerical algorithms for solving for multiple solutions in a stable way and to prove their convergence.
2 A Modified Local Minimax Algorithm
In this section, we modify the numerical local minimax algorithm developed in [14]. We will not normalize the gradient J′(w^k), and in particular, in Step 5 of the algorithm we design a new stepsize rule to replace the original one, which was itself a finite-dimensional two-level min-max process. The new stepsize rule is much easier to implement, and it is this stepsize rule that enables us to establish the convergence results in Section 3.
Definition 2.1 A point v ∈ H is called a descent (ascent) direction of J at a critical point û if there exists δ > 0 such that

J(û + tv) < (>) J(û)   ∀ 0 < |t| < δ.
A Numerical Local Minimax Algorithm

Step 1: Given ε > 0, λ > 0 and n − 1 previously found critical points w_1, w_2, . . . , w_{n−1} of J, set the base space L = span{w_1, w_2, . . . , w_{n−1}}, where w_{n−1} is the one with the highest critical value. Let v⁰ ∈ S_{L⊥} be an ascent direction at w_{n−1}. Set k = 0, t_0^* = 1 and v_L^k = 0;

Step 2: Solve for

w^k ≡ t_0^* v^k + v_L^k ≡ arg max_{u∈[L,v^k], u≈t_0^* v^k+v_L^k} J(u);

Step 3: Compute the steepest descent vector d^k = −J′(w^k);

Step 4: If ‖d^k‖ ≤ ε then output w_n = w^k and stop; else go to Step 5;

Step 5: Set

v^k(s) = (v^k + sd^k)/‖v^k + sd^k‖

and find

s^k = max{ λ/2^m : m ∈ N, 2^m > ‖d^k‖, J(p(v^k(λ/2^m))) − J(w^k) ≤ −(t_0^*/2)‖d^k‖ ‖v^k(λ/2^m) − v^k‖ }.

The initial guess u = t_0^* v^k(λ/2^m) + v_L^k is used to find the local maximum point p(v^k(λ/2^m)) in [L, v^k(λ/2^m)], where t_0^* and v_L^k are found in Step 2.

Step 6: Set v^{k+1} = v^k(s^k), update k = k + 1 and go to Step 2.
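To fix ideas, here is a minimal Python sketch of the modified iteration. J, grad_J and peak_selection (Step 2: a local maximization of J on [L, v] started from a given initial guess) are hypothetical problem-supplied callables, not part of the paper, and for simplicity the factor t_0^*/2 of Step 5 is taken as 1/2 (i.e., t_0^* = 1 is assumed):

```python
import numpy as np

def local_minimax(J, grad_J, peak_selection, v0, eps=1e-5, lam=1.0, max_iter=200):
    """Sketch of the modified local minimax iteration with the dyadic stepsize rule."""
    v = v0 / np.linalg.norm(v0)
    w = peak_selection(v, u_init=v)           # Step 2 with t0* = 1, v_L = 0
    for _ in range(max_iter):
        d = -grad_J(w)                        # Step 3: steepest descent vector
        if np.linalg.norm(d) <= eps:          # Step 4: stopping criterion
            return w
        # Step 5: search s = lam/2^m from the largest trial stepsize downward,
        # so the first accepted s is the max over the dyadic candidates
        accepted = False
        for m in range(64):
            s = lam / 2.0**m
            if 2.0**m <= np.linalg.norm(d):   # enforces lam > s ||d^k||
                continue
            v_s = (v + s * d) / np.linalg.norm(v + s * d)
            w_s = peak_selection(v_s, u_init=w)
            if J(w_s) - J(w) <= -0.5 * np.linalg.norm(d) * np.linalg.norm(v_s - v):
                accepted = True               # stepsize s^k found
                break
        if not accepted:
            return w                          # stepsize search failed; bail out
        v, w = v_s, w_s                       # Step 6
    return w
```

Note how the initial guess passed to peak_selection traces the previous maximizer w^k, mirroring the branch-tracking strategy of Steps 2 and 5.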
Definition 2.2 At any point v ∈ S_{L⊥} with J′(p(v)) ≠ 0, we define the stepsize at v by

(2.1)   s(v) = max_{λ≥s≥0} { s : λ > s‖d‖, J(p(v(s))) − J(p(v)) ≤ −(1/2)dis(L, p(v))‖d‖ ‖v(s) − v‖ },

where

v(s) = (v + sd)/‖v + sd‖,   d = −J′(p(v)).
In addition to the remarks made for the original algorithm in [14], we should make some new remarks on the above modified numerical local minimax algorithm.

Remark 2.1
(1) The purpose of introducing the new stepsize rule is twofold. First, the upper bound λ is used to control the stepsize so that the search will not go too far away from the solution (stable) submanifold M and lose stability. λ can be any positive constant, e.g., λ = 1. Secondly, the maximization in the stepsize rule is used to prevent the stepsize of the search from becoming too small. It can be verified that if we denote

s̄^k = max_{λ≥s≥0} { s : λ > s‖d^k‖, J(p(v(s))) − J(w^k) ≤ −(1/2)t_0^*‖d^k‖ ‖v(s) − v^k‖ },

then we have

(1/2)s̄^k ≤ s^k ≤ s̄^k.

(2) Due to the nature of multi-level optimization, our stepsize rule is composite and quite different from the one in usual optimization. In a usual numerical optimization scheme, each iteration v_{n+1} = v_n + α_n d_n is a linear variation, while in our numerical local minimax algorithm each iteration

v_{n+1} = (v_n + s(v_n)d_n)/‖v_n + s(v_n)d_n‖

is a nonlinear variation and is also composite with the mapping p. Note that d^k ⊥ v^k and

(2.2)   s‖d^k‖/√(1 + s²‖d^k‖²) < ‖v^k(s) − v^k‖ < √2 s‖d^k‖/√(1 + s²‖d^k‖²),

so the term on the RHS of the inequality in (2.1) is bounded as the parameter s → +∞, while in usual stepsize rules, e.g., the well-known Armijo or Goldstein stepsize rules in optimization, this term goes to −∞ as s → +∞. It is known that the Armijo and Goldstein stepsize rules imply the Zoutendijk convergence condition, while our stepsize rule fails to satisfy the Zoutendijk convergence condition. Thus our convergence analysis will be more difficult.

(3) The number 1/2 in Definition 2.2 can be replaced by any positive number less than 1.
(4) The initial guesses u = t_0^* v^k + v_L^k in Step 2 and u = t_0^* v^k(λ/2^m) + v_L^k in Step 5 have been specified. These initial guesses closely and consistently trace the position of the previous point w^k = t_0^* v^k + v_L^k. This strategy also prevents the algorithm from possibly oscillating between different branches of the peak mapping P.

(5) In Step 3, computing the steepest descent direction d^k = −J′(w^k) (or the negative gradient) at a known point w^k is usually carried out by solving a corresponding linearized problem (ODE or PDE).

(6) The modified algorithm has been tested and confirmed on all numerical examples carried out in [14]. The performance is better, and differences in the numerical solutions are invisible. Thus those examples will not be repeated here. Instead, we present numerical convergence data and profiles of multiple solutions to Henon's equation and to a sublinear elliptic equation, which have some features distinct from the numerical examples previously computed in [7, 12, 6, 14].
Lemma 2.1 If p is a local peak selection of J w.r.t. L at a point v ∈ S_{L⊥} s.t.

(i) p is continuous at v,

(ii) dis(L, p(v)) > 0 and

(iii) ‖J′(p(v))‖ > 0,

then s(v) > 0.

Proof Apply Lemma 1.1 with vδ = v.
Theorem 2.1 Let v^k and w^k = p(v^k), k = 0, 1, 2, . . ., be the sequence generated by the algorithm such that p is continuous at each v^k and w^k ∉ L. Then the algorithm is strictly decreasing.

Proof If J′(w^k) = 0, we are done; otherwise, by Lemma 2.1, the stepsize s(v^k) > 0 at v^k and

(2.3)   J(w^{k+1}) − J(w^k) ≤ −(1/2)dis(L, w^k)‖J′(w^k)‖ ‖v^{k+1} − v^k‖ < 0.

Thus the algorithm is strictly decreasing.

Since the sequence generated by the algorithm makes the generic energy function J strictly decrease, the algorithm is stable.
Lemma 2.2 If p is a local peak selection of J w.r.t. L at a point v̄ ∈ S_{L⊥} s.t.

(i) p is continuous at v̄,

(ii) dis(L, p(v̄)) > 0 and

(iii) ‖J′(p(v̄))‖ > 0,

then there exist an open neighborhood V of v̄ and a positive number s₀ such that the stepsize s(v) ≥ s₀ for any v ∈ V ∩ S_{L⊥}.

Proof Take v̄ as vδ in Lemma 1.1; then there exists s₀ > 0 s.t. λ > s₀‖J′(p(v̄(s₀)))‖ and

J(p(v̄(s₀))) − J(p(v̄)) < −(1/2)dis(L, p(v̄))‖J′(p(v̄))‖ ‖v̄(s₀) − v̄‖.

Since dis(L, p(v̄))‖J′(p(v̄))‖ ‖v̄(s₀) − v̄‖ > 0, fix s₀ and let v̄ vary: as a function of v,

(J(p(v(s₀))) − J(p(v))) / (dis(L, p(v))‖J′(p(v))‖ ‖v(s₀) − v‖)

is continuous around v̄, where v(s) = (v + sd)/‖v + sd‖ and d = −J′(p(v)). Thus we can find an open neighborhood V contained in N(v̄), where N(v̄) ∩ S_{L⊥} is a neighborhood of v̄ on which the local peak selection p is defined, s.t. v̄ ∈ V ∩ S_{L⊥} and for any v ∈ V ∩ S_{L⊥} we have λ > s₀‖J′(p(v(s₀)))‖ and

(J(p(v(s₀))) − J(p(v))) / (dis(L, p(v))‖J′(p(v))‖ ‖v(s₀) − v‖) < −1/2,

or

J(p(v(s₀))) − J(p(v)) < −(1/2)dis(L, p(v))‖J′(p(v))‖ ‖v(s₀) − v‖.

Thus s(v) ≥ s₀ for any v ∈ V ∩ S_{L⊥}.
Lemma 2.3 Assume p is a continuous peak selection of J w.r.t. L with dis(p(v), L) ≥ α > 0 for any v ∈ S_{L⊥}. Then p is a homeomorphism onto its image.

Proof For each v ∈ S_{L⊥}, p(v) can be uniquely expressed as p(v) = t_v^0 v + v_L for some t_v^0 ≥ α > 0 and v_L ∈ L. It is clear that p is one-to-one. The map p(v) ↦ t_v^0 v is the (continuous) orthogonal projection of p(v) onto L⊥, and hence the inverse map p(v) ↦ v is continuous. Therefore p is a homeomorphism.
Remark 2.2 Theorem 2.1 verifies that the algorithm is a descent method. When J is bounded below on p(S_{L⊥}), J converges to a finite number along the sequence {w^k} generated by the algorithm. But Lemma 2.1 and Theorem 2.1 alone cannot rule out the possibility that the sequence accumulates around a noncritical point. Lemma 2.2 gives a locally uniform stepsize estimate that essentially guarantees that this cannot happen. Therefore we can establish some convergence results in the next section.
3 Convergence Results

In this section, we establish several convergence results for the algorithm given in Section 2. Most conditions in these results can be localized. We use the same notation as in the flow-chart of the algorithm. Let {w^k} be the sequence generated by the algorithm. Since p is a homeomorphism, {v^k} is the sequence in S_{L⊥} such that w^k = p(v^k).
Theorem 3.1 Let p be a peak selection of J w.r.t. L and let J satisfy the (PS) condition. If

(i) p is continuous,

(ii) dis(L, w^k) > α > 0, ∀k = 0, 1, 2, . . ., and

(iii) inf_{v∈S_{L⊥}} J(p(v)) > −∞,

then {v^k}_{k=0}^∞ possesses a subsequence {v^{k_i}} such that w^{k_i} = p(v^{k_i}) converges to a critical point of J.

Proof With the notation of the flow chart, we have w^k = p(v^k), d^k = −J′(w^k), s^k is the stepsize at v^k, and

(3.1)   J(w^{k+1}) − J(w^k) ≤ −(α/2)‖d^k‖ ‖v^{k+1} − v^k‖.

Suppose that there exists a positive δ such that ‖d^k‖ > δ for every k. From (3.1) we have

(3.2)   J(w^{k+1}) − J(w^k) ≤ −(αδ/2)‖v^{k+1} − v^k‖ for k = 0, 1, 2, . . . .

Adding up (3.2), we get

(3.3)   Σ_{k=0}^∞ [J(w^{k+1}) − J(w^k)] ≤ −(αδ/2) Σ_{k=0}^∞ ‖v^{k+1} − v^k‖.

Since J(w^k) is monotonically decreasing and bounded from below, the left hand side of (3.3) converges to lim_{k→∞} J(w^k) − J(w⁰), and thus the right hand side of (3.3) must be finite. That is, {v^k}_{k=0}^∞ is a Cauchy sequence. Therefore v^k → v̄ for some v̄ ∈ S_{L⊥}. By continuity, ‖J′(p(v̄))‖ ≥ δ. Then Lemma 2.2 states that there exists s₀ > 0 such that if v is close enough to v̄, then s(v) ≥ s₀; in particular s^k ≡ s(v^k) ≥ s₀ when k is sufficiently large. On the other hand, since

v^{k+1} = (v^k + s^k d^k)/‖v^k + s^k d^k‖  and  d^k ⊥ v^k,

we see that ‖v^{k+1} − v^k‖ → 0 and ‖d^k‖ > δ imply s^k → 0 as k → ∞. This leads to a contradiction. Therefore there exists a subsequence {w^{k_i}}_i^∞ such that ‖J′(w^{k_i})‖ → 0 as i → ∞ and J(w^{k_i}) converges. By the (PS) condition, {w^{k_i}}_i^∞ possesses a subsequence, denoted by {w^{k_i}} again, that converges to a critical point w₀. As p is a homeomorphism, v^{k_i} converges to a point v₀ ∈ S_{L⊥} with w₀ = p(v₀).
Theorem 1.1 serves as a mathematical justification of our local minimax method. It states that a solution to the minimax problem is a critical point. Thus if our numerical minimax method generates a convergent sequence, it should converge to a critical point. However, in the implementation of the algorithm we use the steepest descent search to approximate the minimization at the second level, and the steepest descent search can approximate not only a minimum point but also a saddle point. Thus our algorithm can generate a sequence that converges to a critical point that is not necessarily of minimax type. This case is not covered by any minimax principle. However, the following theorem justifies our algorithm. It states that any convergent subsequence generated by the algorithm converges to a critical point.
Theorem 3.2 Let p be a peak selection of J w.r.t. L. If

(i) p is continuous,

(ii) dis(L, w^k) > α > 0, ∀k = 0, 1, 2, . . ., and

(iii) inf_{v∈S_{L⊥}} J(p(v)) > −∞,

then any limit point of {w^k} is a critical point of J. In particular, any convergent subsequence {w^{k_i}} of {w^k} converges to a critical point.
Proof Suppose that w^{k_i} → w̄ but w̄ is not a critical point. Since p is a homeomorphism, write w̄ = p(v̄) and w^{k_i} = p(v^{k_i}) for some v̄ and v^{k_i} in S_{L⊥}. Then there exist a subsequence of {w^{k_i}}, denoted by {w^{k_i}} again, and a constant δ > 0 such that ‖d^{k_i}‖ ≡ ‖J′(w^{k_i})‖ > δ for all i = 1, 2, . . . . By the stepsize rule,

(3.4)   J(w^{k_i+1}) − J(w^{k_i}) ≤ −(1/2)dis(w^{k_i}, L)‖J′(w^{k_i})‖ ‖v^{k_i+1} − v^{k_i}‖
                                  ≤ −(αδ/2)‖v^{k_i+1} − v^{k_i}‖.

Lemma 2.2 states that there exist a neighborhood N(v̄) and a constant s₀ > 0 such that s(v) ≥ s₀ for any v ∈ N(v̄) ∩ S_{L⊥}. Thus there exists an integer I such that for i > I we have v^{k_i} ∈ N(v̄) ∩ S_{L⊥} and s^{k_i} = s(v^{k_i}) ≥ s₀. Then by (2.2) and the stepsize rule s^{k_i}‖d^{k_i}‖ < λ, we get ‖v^{k_i+1} − v^{k_i}‖ > cs₀δ for some constant c > 0 and all i > I. Inequality (3.4) becomes

J(w^{k_i+1}) − J(w^{k_i}) ≤ −(c/2)αδ²s₀ < 0,   i > I.

Since J(w^k) is monotonically decreasing and bounded from below by Theorem 2.1 and Condition (iii), the left hand side of the above inequality converges to 0 as i → +∞, which leads to a contradiction. Thus w̄ is a critical point.
Let us denote the set of all critical points of J by

K = {w ∈ H : J′(w) = 0},

and the set of all critical points of J at level c by

K_c = {w ∈ H : J′(w) = 0, J(w) = c}.

By the (PS) condition, it is clear that K_c is a compact set. As the examples in [14] and Example 4.2 in Section 4 show, degenerate (non-isolated) critical points arise naturally in many applications. When a critical point is not isolated, the convergence analysis of the algorithm is much more difficult. So far we can only prove point-to-set convergence. It also seems to us that in this case point-to-point convergence is meaningless, since we really do not know which solution the algorithm will head for. It is known that "most" cases are nondegenerate in the sense that any degenerate problem can be made nondegenerate by adding a small perturbation to the problem or the domain. When a critical point is isolated, we then prove point-to-point convergence, the usual notion of convergence. In the next two convergence results we assume that J satisfies the usual (PS) condition, which implies that the set K_c is compact. Note that Assumption (ii) in Theorem 3.3 is similar to those commonly used in proving the existence of a critical point in various linking and saddle point theorems [21, 25] to prevent a "negative gradient flow" from escaping. Under the assumptions of Theorem 3.4, this condition is automatically satisfied.
Theorem 3.3 Assume that V₁ and V₂ are open sets in H with ∅ ≠ (V₂ ∩ S_{L⊥}) ⊂ (V₁ ∩ S_{L⊥}). Set V₁′ = V₁ ∩ S_{L⊥} and V₂′ = V₂ ∩ S_{L⊥}. Suppose a peak selection p of J with respect to L determined by the algorithm is continuous in V₁′ and satisfies

(i) p(V₁′) ∩ K ⊂ K_c, where c = inf_{v∈V₁′} J(p(v)),

(ii) there is d > 0 such that

inf{J(p(v)) : v ∈ V₁′, dis(v, ∂V₁′) ≤ d} = a > b = sup{J(p(v)) : v ∈ V₂′} > c.

Let λ < d be chosen in the stepsize rule of the algorithm and let {w^k}_{k=0}^∞ be the sequence generated by the algorithm starting from any v⁰ ∈ V₂′ with dis(L, w^k) > α > 0 ∀k = 0, 1, 2, . . . . Then for any ε > 0, there exists an N such that for any k > N,

dis(w^k, K_c) < ε.
Proof Since λ < d, the maximal stepsize is less than d. It follows that if v^k ∈ V₁′ and dis(v^k, ∂V₁′) > d, then v^{k+1} ∈ V₁′. Since the algorithm is strictly decreasing, by selecting an initial v⁰ ∈ V₂′ we see that J(w^k) < b for any k > 0. Now Assumption (ii) implies that if v^k ∈ V₁′, then dis(v^k, ∂V₁′) > d; thus v^{k+1} ∈ V₁′. By induction, v^k ∈ V₁′ and dis(v^k, ∂V₁′) > d for every k.

Theorem 3.1 states that {w^k}_{k=0}^∞ possesses a subsequence that converges to a critical point. With Assumption (i), this subsequence converges to some w̄ ∈ K_c, which shows that K_c is nonempty. The monotonicity of the algorithm then yields

J(p(v^k)) → c = min_{v∈V₁′} J(p(v))  as k → ∞.

Now suppose the conclusion is not valid. Then there exist an ε > 0 and a subsequence {w^{k_i}}_{i=0}^∞ such that dis(w^{k_i}, K_c) ≥ ε for every i. Since J(p(v^k)) → c, for any m with 1/m < d we can find v^{k_m} ∈ V₁′ such that

J(p(v^{k_m})) < c + 1/m² < b.

Applying Ekeland's variational principle to J(p(·)) on V̄₁′, we can find ṽ^{k_m} ∈ V̄₁′ such that

(3.5)   J(p(ṽ^{k_m})) ≤ J(p(v^{k_m})) < c + 1/m² < b,

(3.6)   ‖ṽ^{k_m} − v^{k_m}‖ < 1/m, and dis(L, ṽ^{k_m}) > α,

(3.7)   J(p(v)) − J(p(ṽ^{k_m})) ≥ −(1/m)‖v − ṽ^{k_m}‖, ∀v ∈ V̄₁′.

From (3.5) and (3.6), ṽ^{k_m} ∈ V₁′ and dis(ṽ^{k_m}, ∂V₁′) ≥ d. Applying Lemma 1.1 leads to

J(p(v)) − J(p(ṽ^{k_m})) < −(α/2)‖J′(p(ṽ^{k_m}))‖ ‖v − ṽ^{k_m}‖

for some v near ṽ^{k_m} in V₁′, e.g., v = ṽ^{k_m}(s). Thus we get

‖J′(p(ṽ^{k_m}))‖ ≤ 2/(mα).

By the (PS) condition, {p(ṽ^{k_m})} has a convergent subsequence, and since p is a homeomorphism, {ṽ^{k_m}} has a convergent subsequence as well. Denote this subsequence by {ṽ^{k_m}} again and assume it converges to ṽ. Due to (3.6), {v^{k_m}} converges to ṽ as well. Then {p(v^{k_m})} converges to p(ṽ). Since p(ṽ) ∈ K_c, we obtain dis(p(v^{k_m}), K_c) → 0, i.e., dis(w^{k_m}, K_c) → 0 as m → ∞. This contradicts the assumption that dis(w^{k_i}, K_c) ≥ ε. Thus for any ε > 0, there exists an integer N such that

dis(w^k, K_c) < ε ∀k > N.
Theorem 3.4 Assume the peak selection p determined by the algorithm is continuous. If J(p(v̄)) = loc min_{v∈S_{L⊥}} J(p(v)) and p(v̄) is an isolated critical point with dis(p(v̄), L) > α > 0, then there exists an open set V in H with v̄ ∈ V ∩ S_{L⊥} such that, starting from any v⁰ ∈ V ∩ S_{L⊥}, the sequence {w^k}_{k=0}^∞ generated by the algorithm converges to p(v̄).

Proof By our assumption, we can find an open set V₁ in H, v̄ ∈ V₁′ ≡ V₁ ∩ S_{L⊥}, such that

(i) J(p(v̄)) = min{J(p(v)) : v ∈ V₁′},

(ii) p(v̄) is the only critical point of J on p(V₁′),

(iii) dis(p(v), L) > α > 0 ∀v ∈ V₁′.

Denote d = inf_{v∈∂V₁′} dis(v̄, v) > 0, where ∂V₁′ is the boundary of V₁′. Set

V₂ = {v ∈ V₁′ : dis(v, ∂V₁′) ≥ d/3, ‖v − v̄‖ ≥ d/3}.

Thus V₂ is closed and v̄ ∉ V₂. Denote c = J(p(v̄)) and a = inf{J(p(v)) : v ∈ V₂}. Obviously c ≤ a. Suppose c = a. Then we can find a sequence {v^k} ⊂ V₂ such that J(p(v^k)) < c + d/(4k). Applying Ekeland's variational principle to J(p(·)) on V̄₁′, the closure of V₁′, we can find ṽ^k ∈ V₁′ with ‖ṽ^k − v^k‖ ≤ d/k, dis(L, ṽ^k) > α, and for any v ∈ V̄₁′,

J(p(v)) − J(p(ṽ^k)) > −(1/k)‖v − ṽ^k‖.

Combining the above inequality with Lemma 1.1, we obtain that for some v ∈ V₁′ near ṽ^k,

J(p(v)) − J(p(ṽ^k)) < −(α/2)‖J′(p(ṽ^k))‖ ‖v − ṽ^k‖.

Thus ‖J′(p(ṽ^k))‖ < 2/(kα). By the (PS) condition, {p(ṽ^k)} has a subsequence converging to a critical point w₀ in p(V̄₁′). From Lemma 2.3, {ṽ^k} possesses a subsequence that converges to a point v₀ in V̄₁′. Since ‖ṽ^k − v^k‖ < d/k, v^k ∈ V₂ and V₂ is closed, we have v₀ ∈ V₂. From Lemma 2.3 again, we see that w₀ = p(v₀) ∈ p(V₂) must be a critical point. Since v̄ ∉ V₂, we have p(v₀) ≠ p(v̄). This contradicts (ii).

When a > c, we can find an open neighborhood V of v̄ such that

sup{J(p(v)) : v ∈ V ∩ S_{L⊥}} = b < a.

All the assumptions of Theorem 3.3 are satisfied. By applying Theorem 3.3, we conclude that w^k converges to K_c ∩ p(V₁). In this case K_c ∩ p(V₁) = {p(v̄)} is a singleton. Thus w^k converges to p(v̄).
We have to acknowledge that our convergence results are based on functional analysis, not on numerical or error analysis. It is assumed that at each step of the algorithm the computation is exact. When a discretization of the boundary (domain) is used, an error will be introduced. Note that the nonlinear problem has been linearized to find the gradient. It is known that at each step of the algorithm, numerical approximation schemes can be used such that when the discretization is refined, the error goes to zero. However, how an error propagates from step to step is an important issue and is still unsolved. Since our problem setting is very general, this will be a very difficult problem. Also, Example 4.2 in the next section makes clear that in a general setting only local convergence can be expected, simply because a "negative gradient flow" may exist only locally.
4 Numerical Examples
In this section we display numerical convergence data and profiles of multiple solutions to Henon's equation and to a sublinear elliptic equation, which have several features distinct from all those examples previously computed in [7, 12, 6, 14].

Many numerical positive solutions to the Lane-Emden equation

(4.8)   Δu(x) + u^p(x) = 0, x ∈ Ω;   u(x) = 0, x ∈ ∂Ω,

with p = 3 on various domains Ω were computed and visualized with their profiles in [14]. In particular, when Ω is a dumbbell-shaped domain, the existence among the numerical solutions of a solution concentrated on the central corridor (see Figure 4 in [14]) generated warm discussions among theoretical analysts, since the profile of this solution is similar to that of a solution to the same Lane-Emden equation on the domain of the central corridor alone, i.e., the domain formed from the dumbbell-shaped domain by cutting off the left and right disks. One might thus think that such a numerical solution is caused by error in the numerical computation, not by the nature of the problem. To clarify this suspicion, let us consider the following example.
Example 4.1 As a contrast, here we compute numerical multiple solutions of Henon's equation

(4.9)   Δu(x) + |x|^ℓ u^p(x) = 0, x ∈ Ω;   u(x) = 0, x ∈ ∂Ω,

with ℓ = 2 and p = 3 on the same dumbbell-shaped domain Ω. Henon's equation is a generalization of the Lane-Emden equation in astrophysics used to study rotating stellar structures. It is known that the main result of Gidas-Ni-Nirenberg in [13, p.221, Theorem 1'] does not apply even when the domain is a disk: since the factor |x|^ℓ multiplies the nonlinear term u^p, equation (4.9) has an explicit x-dependence. This factor weights the nonlinear term according to the distance of the point from the origin, and therefore pulls an unstable solution away from the origin. As a result, the solution concentrated on the central corridor disappears and the highest peak of the solutions shifts away from the center of "a local open space". Since we use exactly the same domain, discretization, numerical algorithm and the same initial guess, this example provides evidence that the numerical solution of the Lane-Emden equation concentrated on the central corridor is caused by the nature of the equation, not by error in the numerical computation.
Although the nonlinear term f(x, u(x)) = |x|^ℓ u^p(x) explicitly depends on x, it still satisfies the monotone condition f′_t(x, t) > f(x, t)/t for all x ∈ Ω and t > 0. Thus for L = {0}, any direction is an increasing direction, and along each direction there is exactly one maximum point of J. Here we simply use the following three "mound" functions as initial ascent directions in our numerical computation:

v₀^i(x) = cos(|x − x_i|π/(2d_i)) if |x − x_i| ≤ d_i,   v₀^i(x) = 0 otherwise,

where x₁ = (2, 0), d₁ = 1; x₂ = (−1, 0), d₂ = 0.5; x₃ = (0.25, 0), d₃ = 0.2. It is worth noting that when we use v₀³(x) above as the initial ascent direction, the final numerical solution appears to be the same as the one obtained with v₀¹ as the initial ascent direction; i.e., a numerical solution concentrated on the central corridor, like that of the Lane-Emden equation, does not appear.
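The mound functions are straightforward to code; a minimal Python sketch mirroring the centers x_i and radii d_i above:

```python
import numpy as np

# The three "mound" initial ascent directions v0^i of Example 4.1.
CENTERS = [np.array([2.0, 0.0]), np.array([-1.0, 0.0]), np.array([0.25, 0.0])]
RADII = [1.0, 0.5, 0.2]

def v0(i, x):
    """i-th mound function (i = 1, 2, 3) evaluated at a point x in R^2."""
    xi, di = CENTERS[i - 1], RADII[i - 1]
    r = np.linalg.norm(x - xi)
    return np.cos(r * np.pi / (2.0 * di)) if r <= di else 0.0

print(v0(1, np.array([2.0, 0.0])))   # 1.0 at the center x1
print(v0(1, np.array([3.0, 0.0])))   # ~0 on the rim |x - x1| = d1
```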
Since in the local minimax method a linear inhomogeneous elliptic equation

Δw(x) = −Δu(x) − f(x, u(x)), x ∈ Ω;   w(x) = 0, x ∈ ∂Ω,

is used to solve for the negative gradient direction w(x) = −J′(u(x)) at a point u, we use a boundary element method to solve this linear problem. Thus a boundary discretization has to be used, and the Gauss quadrature formula is utilized to approximate domain integrals. As a benchmark, we apply Green's formula

(4.10)   ∫_Ω ΔE(x, ξ) dξ = ∫_{∂Ω} (∂/∂n_ξ)E(x, ξ) dσ_ξ,

where we choose

E(x, ξ) = (1/8π)|x − ξ|²(ln|x − ξ| − 1)

and x is a point outside the domain Ω. The difference |V_int − B_int| between the domain integral V_int on the left hand side of (4.10) and the boundary integral B_int on the right hand side of (4.10) serves as an indicator of the error tolerance. In our numerical computation of Example 4.1, a total of 408 piecewise constant boundary elements on the boundary and 993 Gauss points in the domain were used. We obtain |V_int − B_int| = 0.0007.
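The benchmark is easy to reproduce. The following minimal sketch checks (4.10) on the unit disk with simple midpoint quadrature (the dumbbell domain and the paper's element counts are not reproduced here); it uses ΔE(x, ξ) = ln|x − ξ|/(2π) and ∇_ξE = (2 ln|x − ξ| − 1)(ξ − x)/(8π), which follow from the choice of E above:

```python
import numpy as np

x = np.array([2.0, 0.0])    # evaluation point outside the unit disk

def lap_E(xi):              # Delta_xi E(x, xi) = ln|x - xi| / (2 pi)
    return np.log(np.linalg.norm(x - xi)) / (2.0 * np.pi)

def dE_dn(xi, n):           # normal derivative of E on the boundary
    r = np.linalg.norm(xi - x)
    return (2.0 * np.log(r) - 1.0) / (8.0 * np.pi) * np.dot(xi - x, n)

# domain integral V_int over the unit disk (midpoint rule in polar coordinates)
nr, nt = 200, 400
rs = (np.arange(nr) + 0.5) / nr
ts = (np.arange(nt) + 0.5) * 2.0 * np.pi / nt
V_int = sum(lap_E(r * np.array([np.cos(t), np.sin(t)])) * r * (1.0 / nr) * (2.0 * np.pi / nt)
            for r in rs for t in ts)

# boundary integral B_int over the unit circle (outward normal n = xi there)
B_int = sum(dE_dn(np.array([np.cos(t), np.sin(t)]), np.array([np.cos(t), np.sin(t)]))
            * (2.0 * np.pi / nt) for t in ts)

print(abs(V_int - B_int))   # should be near zero, analogous to |V_int - B_int| above
```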
In order to observe the numerical convergence of the algorithm, in the following tables we present numerical data for each iteration of the numerical computation by displaying the values of J(u_n), ‖J′(u_n)‖ and max_{x∈Ω}|Δu_n(x) + f(x, u_n(x))|. Here ‖J′(u_n)‖ indicates how close u_n is to being a saddle point and max_{x∈Ω}|Δu_n(x) + f(x, u_n(x))| tells how close u_n is to being a solution of the PDE; n-It is the iteration number, and for max_{x∈Ω}|Δu_n(x) + f(x, u_n(x))| we take the largest value over the Gauss points of the domain.
Table 1: The numerical data in computing the ground state solution to Example 4.1, whose profile and contours are shown in Figure 1.

n-It   J(u_n)       ‖J′(u_n)‖    max_{x∈Ω}|Δu_n(x) + f(x, u_n(x))|
1      .3533E+01    .2893E+01    .9686E+01
2      .2784E+01    .1380E+01    .6691E+01
3      .2550E+01    .6947E+00    .4609E+01
4      .2481E+01    .3507E+00    .2990E+01
5      .2462E+01    .1743E+00    .1730E+01
6      .2456E+01    .8791E−01    .8938E+00
7      .2454E+01    .4843E−01    .4953E+00
8      .2453E+01    .3063E−01    .3077E+00
9      .2453E+01    .2092E−01    .1966E+00
10     .2453E+01    .1432E−01    .1323E+00
11     .2453E+01    .9531E−02    .8560E−01
12     .2453E+01    .6126E−02    .5323E−01
13     .2453E+01    .3792E−02    .3177E−01
14     .2453E+01    .2257E−02    .1819E−01
15     .2453E+01    .1290E−02    .9971E−02
16     .2453E+01    .7065E−03    .5227E−02
17     .2453E+01    .3702E−03    .2615E−02
18     .2453E+01    .1851E−03    .1246E−02
19     .2453E+01    .8817E−04    .5636E−03
20     .2453E+01    .3990E−04    .2417E−03
21     .2453E+01    .1710E−04    .9783E−04
Stop   .2453E+01    .6926E−05
Table 2: The numerical data in computing the second solution to Example 4.1, whose profile and contours are shown in Figure 2.

n-It   J(u_n)       ‖J′(u_n)‖    max_{x∈Ω}|Δu_n(x) + f(x, u_n(x))|
1      .5635E+02    .1119E+02    .1497E+03
2      .4070E+02    .5486E+01    .8395E+02
3      .3938E+02    .1567E+01    .5820E+02
4      .3887E+02    .9112E+00    .3740E+02
5      .3869E+02    .5142E+00    .2170E+02
6      .3862E+02    .2993E+00    .1246E+02
7      .3859E+02    .1911E+00    .7787E+01
8      .3857E+02    .1324E+00    .5275E+01
9      .3856E+02    .9385E−01    .3544E+01
10     .3856E+02    .6548E−01    .2379E+01
11     .3855E+02    .4434E−01    .1572E+01
12     .3855E+02    .2902E−01    .9989E+00
13     .3855E+02    .1833E−01    .6102E+00
14     .3855E+02    .1116E−01    .3582E+00
15     .3855E+02    .6533E−02    .2018E+00
16     .3855E+02    .3676E−02    .1090E+00
17     .3855E+02    .1984E−02    .5632E−01
18     .3855E+02    .1025E−02    .2780E−01
19     .3855E+02    .5058E−03    .1309E−01
20     .3855E+02    .2380E−03    .5857E−02
21     .3855E+02    .1065E−03    .2487E−02
22     .3855E+02    .4523E−04    .9985E−03
23     .3855E+02    .1816E−04    .3780E−03
Stop   .3855E+02    .6875E−05
Table 3: The numerical data in computing the third solution to Example 4.1, whose profile and contours are shown in Figure 3.

n-It   J(u_n)       ‖J′(u_n)‖    max_{x∈Ω}|Δu_n(x) + f(x, u_n(x))|
1      .4208E+02    .2893E+01    .9686E+01
2      .4134E+02    .1380E+01    .6691E+01
3      .4110E+02    .6947E+00    .4609E+01
4      .4103E+02    .3507E+00    .2990E+01
5      .4101E+02    .1743E+00    .1730E+01
6      .4101E+02    .8791E−01    .8938E+00
7      .4101E+02    .4843E−01    .4953E+00
8      .4100E+02    .3063E−01    .3077E+00
9      .4100E+02    .2092E−01    .1967E+00
10     .4100E+02    .1432E−01    .1323E+00
11     .4100E+02    .9531E−02    .8562E−01
12     .4100E+02    .6126E−02    .5324E−01
13     .4100E+02    .3792E−02    .3179E−01
14     .4100E+02    .2257E−02    .1821E−01
15     .4100E+02    .1290E−02    .9989E−02
16     .4100E+02    .7065E−03    .5243E−02
17     .4100E+02    .3701E−03    .2631E−02
18     .4100E+02    .1851E−03    .1263E−02
19     .4100E+02    .8818E−04    .5805E−03
20     .4100E+02    .3999E−04    .2588E−03
21     .4100E+02    .1743E−04    .1148E−03
Stop   .4100E+02    .7763E−05
Figure 1: The ground state solution w₁¹ (MI = 1) and its contours. v₀ = v₀¹, L = {0}, ε = 10⁻⁵, J = 2.453, u_max = 1.687 at (2.223, 0.011).
Figure 2: The second solution w₁² (MI = 1) and its contours. v₀ = v₀³, L = {0}, ε = 10⁻⁵, J = 38.55, u_max = 6.711 at (−1.111, −0.013).
Figure 3: A solution w₂ with MI = 2 and its contours. v₀ = v₀¹, L = {w₁²}, ε = 10⁻⁵, J = 41.00, u_max = 6.711 at (−1.111, −0.013).
When the definition of a peak selection was extended to that of a local peak selection in [14], we did so purely for mathematical generality, with no particular model problem in mind. The next example is a perfect one to show that such a generalization is not a luxury but a necessity.
Example 4.2 Consider the semilinear elliptic equation

(4.11)   Δu(x) + f(u(x)) = 0 for x ∈ Ω;   u(x) = 0 for x ∈ ∂Ω,

where Ω is a smooth bounded region in R² and Δ is the Laplacian operator. Let λ₁ < λ₂ < λ₃ < · · · be the sequence of eigenvalues of −Δ with the zero Dirichlet boundary condition on Ω. In [4], Castro-Cossio proved

Theorem 4.1 If f(0) = 0, f′(0) < λ₁, f′(∞) ∈ (λ_k, λ_{k+1}) with k ≥ 2, and f′(t) ≤ γ < λ_{k+1} for some scalar γ, then the boundary-value problem (4.11) has at least five solutions.
Theorem A of Castro-Cossio in [4] also predicts that, in addition to the trivial solution 0, two of the solutions are sign-preserving and the other two are sign-changing. Numerically, such solutions had not been computed, due to the difficulties involved. This case is quite different from all the cases previously computed by us in [14], by Choi-McKenna in [7], by Ding-Costa-Chen in [12] or by Chen-Ni-Zhou in [6], where all the nonlinear terms f(t) are superlinear. In the current example the nonlinear term is not superlinear, due to the condition f′(∞) ∈ (λ_k, λ_{k+1}). In this example, as Castro-Cossio suggested¹, we choose

(4.12)   f(t) = γt − (γ − λ₁/2)ln(1 + t) for t ≥ 0;   f(t) = γt + (γ − λ₁/2)ln(1 − t) for t < 0.

Here γ is chosen such that λ₂ < γ < λ₃. It can be directly verified that

f(0) = 0,   f′(0) = λ₁/2 < λ₁,   f′(∞) = γ ∈ (λ₂, λ₃),   f′(t) ≤ γ < λ₃.

¹ The authors would like to thank Professors A. Castro and J. Cossio for providing us with such a wonderful example.

Thus all the conditions in Castro-Cossio's theorem are satisfied, and therefore by their theorem the boundary value problem (4.11) has at least five solutions: the trivial solution, two sign-preserving solutions and two sign-changing solutions. Why are those nontrivial solutions numerically so difficult to capture? First, those solutions correspond to saddle points and are thus unstable solutions. In particular, a sign-changing solution is more unstable than a sign-preserving solution, and therefore more elusive to capture. Numerically, a "negative gradient flow" along a solution submanifold will be used to search for those saddle points. However, from the analysis below we can see that such a "negative gradient flow" exists only locally near certain directions. A numerical computation will certainly fail if the search leaves such a narrow region. Our definition of a local peak selection fits perfectly in carrying out such a mission.
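Before turning to the analysis, the conditions on f in (4.12) are easy to check numerically; a minimal sketch using the unit-disk values λ₁ = (2.4048)² and γ = 20 quoted later in this section:

```python
import numpy as np

lam1, gamma = 2.4048**2, 20.0

def f(t):
    # the nonlinearity (4.12)
    if t >= 0:
        return gamma * t - (gamma - lam1 / 2) * np.log(1 + t)
    return gamma * t + (gamma - lam1 / 2) * np.log(1 - t)

def fprime(t):
    # f'(t) = gamma - (gamma - lambda_1/2) / (1 + |t|) for both branches
    return gamma - (gamma - lam1 / 2) / (1 + np.abs(t))

print(f(0.0))                      # f(0) = 0
print(fprime(0.0), lam1 / 2)       # f'(0) = lambda_1 / 2 < lambda_1
print(fprime(1e8))                 # f'(infinity) -> gamma = 20
print(np.max(fprime(np.linspace(-50, 50, 10001))) <= gamma)  # f'(t) <= gamma
```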
The generic energy function is of the form

J(u) = ∫_Ω [ (1/2)|∇u(x)|² − F(x, u(x)) ] dx ≡ ∫_Ω [ −(1/2)Δu(x)u(x) − F(x, u(x)) ] dx,

where F(x, t) = ∫₀ᵗ f(x, ξ) dξ. For all the cases numerically computed in [7, 12, 6, 14], f(x, t) ≥ 0 is superlinear in t and satisfies the monotone condition f′_t(x, t) > f(x, t)/t ∀t > 0. Thus F(x, t) ≥ 0 is super-quadratic. Near u = 0, the quadratic term |∇u(x)|² dominates F(x, u(x)) in J; thus u = 0 is a local minimum point of J. However, when ‖u‖ gets larger, the superlinear term F(x, u(x)) takes over and dominates the other terms in J. Hence J goes to negative infinity as ‖u‖ → ∞ in any finite-dimensional subspace of H. Therefore, along any increasing direction or in any finite-dimensional subspace, J attains its local maximum and the peak selection p is well defined. Due to the monotone condition, along each direction J has exactly one maximum point. The case (4.11) with (4.12) is quite different. Let us observe that for (4.12),

(4.13)   F(t) = (γ/2)t² − sig(t)(γ − λ₁/2)(ln(1 + |t|) − |t| + t ln(1 + |t|)).

With such an F(u(x)) in J, near u = 0 the lowest order term in J is

(γ − λ₁/2)ln(1 + |u(x)|),

which is always nonnegative if we choose γ such that λ₁ < λ₂ < γ. Thus u = 0 is a local minimum point of J. Similarly, every direction is an increasing direction of J at u = 0.
However, when ‖u‖ increases, the quadratic term, the highest order term in J,

(1/2)|∇u(x)|² − (γ/2)u²(x),

takes over and dominates the other terms in J. But this term may have different signs for different u. Consider the eigenfunctions u₁(x), u₂(x), . . . of −Δ corresponding to the eigenvalues λ₁ < λ₂ < · · ·. When u = u_i, we have

(1/2)|∇u(x)|² − (γ/2)u²(x) ≡ −(1/2)Δu(x)u(x) − (γ/2)u²(x) = (1/2)(λ_i − γ)u²(x).

If λ_i > γ, then J → +∞ along the direction u_i, so J has no local maximum along u_i. On the other hand, if λ_i < γ, then J → −∞ along the direction u_i and J attains a local maximum along u_i. That is, a local peak selection exists only near u_i/‖u_i‖. To see this, observe that for t > 0,

J(tu) = ∫_Ω [ (t²/2)|∇u(x)|² − F(tu(x)) ] dx.

Thus

(d/dt)J(tu) = ∫_Ω [ t|∇u(x)|² − f(tu(x))u(x) ] dx
            = ∫_Ω [ t(|∇u(x)|² − γu²(x)) + (γ − λ₁/2)ln(1 + t|u(x)|)|u(x)| ] dx = 0

if and only if

∫_Ω (|∇u(x)|² − γu²(x)) dx = −∫_Ω (γ − λ₁/2) [ln(1 + t|u(x)|)/(t|u(x)|)] u²(x) dx.

The right hand side of this equality is always negative, and for t > 0, ln(1 + t)/t is continuous and monotonically decreasing in t with

ln(1 + t)/t → 1 as t → 0⁺   and   ln(1 + t)/t → 0 as t → +∞.

Moreover, ∫_Ω |∇u(x)|² dx ≥ λ₁∫_Ω u²(x) dx > (λ₁/2)∫_Ω u²(x) dx, so the left hand side always exceeds −∫_Ω (γ − λ₁/2)u²(x) dx, the infimum of the right hand side. We conclude that (d/dt)J(tu) = 0 has a unique solution t_u > 0 if and only if

∫_Ω (|∇u(x)|² − γu²(x)) dx ≡ −∫_Ω (Δu(x) + γu(x))u(x) dx < 0.

Since J attains its local minimum at 0, t_u must be a local maximum point of J(tu). Thus along such a u, our local peak selection p w.r.t. L = {0} is well defined with p = P.
For an eigenfunction u_i of −Δ corresponding to the eigenvalue λ_i, (d/dt)J(tu_i) = 0 has a (unique) solution t_i > 0 if and only if

∫_Ω (λ_i − γ)u_i²(x) dx < 0, or λ_i < γ.

Thus for such an eigenfunction u_i, there is a neighborhood N_i of u_i/‖u_i‖ such that for each u ∈ N_i with ‖u‖ = 1, p(u) = t_u u is uniquely defined. It is clear that our definition of a local peak selection fits this case perfectly. This information is also very important for carrying out the numerical computation successfully. Since we do not have an explicit expression for p, it is very difficult to check whether p is continuous. A new approach will be developed to address this issue; since that approach is beyond the scope of a minimax principle and more profound analysis is required, it will be discussed in a future paper.
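For illustration, the peak t_u along the first eigenfunction can be computed numerically. The following is a minimal sketch assuming Ω is the unit disk and u = u₁ = J₀(2.4048r) (the setting used below), using the eigenfunction identity ∫|∇u|² dx = λ₁∫u² dx:

```python
import numpy as np
from scipy.special import j0
from scipy.integrate import quad
from scipy.optimize import brentq

lam1, gamma = 2.4048**2, 20.0
u = lambda r: j0(2.4048 * r)     # first radial eigenfunction on the unit disk

A = quad(lambda r: u(r)**2 * 2 * np.pi * r, 0.0, 1.0)[0]   # int u^2 dx

def B(t):
    # int ln(1 + t|u|) |u| dx, computed as a radial integral
    return quad(lambda r: np.log(1 + t * np.abs(u(r))) * np.abs(u(r)) * 2 * np.pi * r,
                0.0, 1.0)[0]

def dJdt(t):
    # d/dt J(tu) = t (lambda_1 - gamma) A + (gamma - lambda_1/2) B(t)
    return t * (lam1 - gamma) * A + (gamma - lam1 / 2) * B(t)

t_u = brentq(dJdt, 1e-6, 100.0)  # unique root since lambda_1 < gamma
print(t_u)                       # p(u1) = t_u * u1
```

The sign pattern of dJdt (positive for small t, negative for large t, since λ₁ < γ) is exactly the uniqueness argument above.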
Here we verify that when L = {0}, p is differentiable at a point v as long as p is defined at v. To see this, first note that ln(1 + t)/t is monotonically decreasing for t > 0, which implies that

1/(1 + t) < ln(1 + t)/t   ∀t > 0.

Next, from the above analysis, for a given u near u_i/‖u_i‖ where λ_i < γ and ‖u‖ = 1, J attains its unique local maximum along u at t_u u if and only if ⟨J′(t_u u), u⟩ = 0. Define

G(t, u) ≡ ⟨J′(tu), u⟩

and write G(t(u), u) ≡ ⟨J′(t_u u), u⟩ ≡ 0. Then by the implicit function theorem, t(u), and hence p, is defined near u and differentiable at u if G′_t(t_u, u) ≡ ⟨J″(t_u u)u, u⟩ ≠ 0, which is guaranteed by

⟨J″(t_u u)u, u⟩ = ∫_Ω [ |∇u(x)|² − γu²(x) + (γ − λ₁/2) u²(x)/(1 + t_u|u(x)|) ] dx
               < ∫_Ω [ |∇u(x)|² − γu²(x) + (γ − λ₁/2) (ln(1 + t_u|u(x)|)/(t_u|u(x)|)) u²(x) ] dx
               = (1/t_u)⟨J′(t_u u), u⟩ = 0.
In our numerical computation, we choose Ω to be the unit disk in R², on which the negative Laplacian −Δ has the eigenvalues

λ₁ = (2.4048)², λ₂ = (3.8317)², λ₃ = (5.1356)², . . .

and the corresponding eigenfunctions

u₁(r, θ) = J₀(2.4048r), u₂(r, θ) = J₁(3.8317r)e^{±iθ}, u₃(r, θ) = J₂(5.1356r)e^{±i2θ}, . . . ,

where J_k is the Bessel function of order k. We also choose γ = 20. Thus λ₁ < λ₂ < γ < λ₃, and a local peak selection p w.r.t. L = {0} exists only near u₁ and u₂. As a result, convergence of the algorithm can be expected only when the initial guess is chosen near u₁ or u₂. Therefore, in a general setting, only local convergence of the local minimax method can be expected. Choosing L = {0} and the initial increasing direction v₀(r, θ) = J₀(2.4048r), we obtain the ground state solution w₁ shown in Figure 4. Next, choosing L = {w₁} and the initial increasing direction v₀(r, θ) = J₁(3.8317r)cos(θ), we obtain the second solution w₂ plotted in Figure 5.
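These eigenvalues can be reproduced from the zeros of the Bessel functions; a minimal check with scipy:

```python
from scipy.special import jn_zeros

# Eigenvalues of -Delta on the unit disk are squares of Bessel zeros:
# lambda = j_{k,m}^2 where J_k(j_{k,m}) = 0.
j01 = jn_zeros(0, 1)[0]   # 2.4048...
j11 = jn_zeros(1, 1)[0]   # 3.8317...
j21 = jn_zeros(2, 1)[0]   # 5.1356...
print(j01**2, j11**2, j21**2)   # lambda_1, lambda_2, lambda_3
```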
What happens if we choose L = {0} and the initial increasing direction v₀(r, θ) = J₁(3.8317r)cos(θ), which is relatively close to w₂? According to our numerical computation, the first few iterations of the algorithm lead v₀ to a state even closer to w₂; then the algorithm finds a "negative gradient flow" along which the search turns towards w₁, and finally the search stops at the same w₁. This also indicates that w₂ is more unstable than w₁.

Since w₂ is a degenerate solution with MI = 2 according to our numerical algorithm, so far we do not know how a quasi-Newton method or other numerical algorithms could capture such a highly degenerate and unstable solution.
In our numerical computation of Example 4.2, a total of 384 piecewise constant boundary elements on the boundary and 1537 Gauss points in the domain were used. The error between the two sides of (4.10) is |V_int − B_int| = 0.00012. Again, in the following tables ‖J′(u_n)‖ indicates how close u_n is to being a saddle point and max_{x∈Ω}|Δu_n(x) + f(u_n(x))| tells how close u_n is to being a solution of the PDE; n-It is the iteration number, and for max_{x∈Ω}|Δu_n(x) + f(u_n(x))| we take the largest value over the Gauss points of the domain.
Table 4: The numerical data in computing a sign-preserving solution to Example 4.2, whose profile and contours are shown in Figure 4.

n-It   J(u_n)       ‖J′(u_n)‖    max_{x∈Ω}|Δu_n(x) + f(u_n(x))|
1      .1296E+00    .7078E−01    .1548E+00
2      .1294E+00    .1604E−01    .3013E−01
3      .1294E+00    .2603E−02    .2782E−02
4      .1294E+00    .2288E−03    .1961E−04
Stop   .1294E+00    .2829E−05
Table 5: The numerical data in computing a sign-changing solution to Example 4.2, whose profile and contours are shown in Figure 5.

n-It   J(u_n)       ‖J′(u_n)‖    max_{x∈Ω}|Δu_n(x) + f(u_n(x))|
1      .5770E+02    .1330E+01    .3444E+01
2      .5767E+02    .2341E+00    .4456E+00
3      .5767E+02    .3753E−01    .8739E−01
4      .5767E+02    .9672E−02    .1744E−01
5      .5767E+02    .2143E−02    .4104E−02
6      .5767E+02    .3534E−03    .2013E−02
7      .5767E+02    .4962E−04    .4217E−03
8      .5767E+02    .2104E−04    .9781E−04
9      .5767E+02    .1129E−04    .5365E−04
Stop   .5767E+02    .8200E−05
Figure 4: The profile of the ground state, a sign-preserving solution w₁ with MI = 1, and its contours, with ε = 10⁻⁵, J = 0.1294, u_max = 0.626 at (0, 0). −w₁ is also a sign-preserving solution.
Figure 5: The profile of a sign-changing solution w₂ with MI ≥ 1 and its contours, with ε = 10⁻⁵, J = 57.67, u_max = 10.04 at (0.483, 0.009). Due to the symmetry of the PDE and the domain, a rotation of the solution w₂ by any angle is still a solution. Thus w₂ belongs to a one-parameter family of solutions and is therefore a degenerate saddle point.
It is interesting to note that the above sign-changing solution is known to be of non-minimax type. Its existence was established by Castro-Cossio in [4] using the Lyapunov-Schmidt reduction method, not a minimax approach. This is the first time such a non-minimax solution has been numerically computed. As our convergence analysis has shown, the local minimax algorithm can actually find non-minimax type solutions; this numerical example provides excellent evidence.

As a final remark, we point out that although we have printed the Morse index for each numerical solution computed here, it is estimated from the local minimax algorithm by which we numerically compute the solution. Its mathematical verification involves more profound analysis, in particular when a degenerate solution is involved. Results on this topic will be reported in subsequent papers [15, 26].
Appendix

Proof of Lemma 1.1 Suppose that J′(p(vδ)) ≠ 0. Let δ > 0 be a number such that ‖J′(p(vδ))‖ > δ and let p be the local peak selection of J w.r.t. L at vδ. By our assumption,

⟨J′(p(vδ)), d⟩ < −δ‖d‖.

Since p(vδ) ∈ [L, vδ], p(vδ) = xδ + tδvδ, where xδ ∈ L and tδ > α. Since p(vδ) is a local maximum point of J on [L, vδ], we have d ⊥ [L, vδ]. By continuity, there exist positive numbers ε₁, ε₂, t₁, t₂ with 0 < t₁ < tδ < t₂ and open balls

B_{ε₁,L} = {x ∈ L : ‖x − xδ‖ < ε₁},   B_{ε₂,L⊥} = {x ∈ L⊥ : ‖x − tδvδ‖ < ε₂},

s.t.

a) p(vδ) is a local maximum point of J on {tvδ + x₁ : t ∈ (t₁, t₂), x₁ ∈ B_{ε₁,L}};

b) ⟨J′(x₁ + x₂), d⟩ < −δ‖d‖, ∀x₁ ∈ B_{ε₁,L}, x₂ ∈ B_{ε₂,L⊥};

c) tvδ + sd ∈ B_{ε₂,L⊥}, ∀t ∈ (t₁, t₂), 0 < s < ε₂/2.

For any x₁ ∈ B_{ε₁,L} and t ∈ [t₁, t₂], by the mean value theorem there is ξ with 0 < ξ < s < ε₂/2 such that

(4.14)   J(tvδ + x₁ + sd) − J(tvδ + x₁) = ⟨J′(tvδ + x₁ + ξd), sd⟩ < −sδ‖d‖.

By a), J(tvδ + x₁) ≤ J(p(vδ)); thus we have

(4.15)   J(tvδ + x₁ + sd) − J(p(vδ)) < −sδ‖d‖.

On the other hand, since d ⊥ vδ, we have

(4.16)   ‖(tδvδ + sd)/‖tδvδ + sd‖ − vδ‖ < s‖d‖/tδ < s‖d‖/α.

Following our notation,

v(s/tδ) = (tδvδ + sd)/‖tδvδ + sd‖,

thus we have

(4.17)   α‖v(s/tδ) − vδ‖ < s‖d‖.

Combining (4.15) with (4.17), we can find ε₃ > 0 such that for 0 < s < ε₃,

(4.18)   J(tvδ + x₁ + sd) − J(p(vδ)) < −αδ‖v(s/tδ) − vδ‖,   ∀t ∈ (t₁, t₂), x₁ ∈ B_{ε₁,L}.

Denote

(4.19)   D = {tvδ + x₁ + sd : s < ε₁, t ∈ (t₁, t₂), x₁ ∈ B_{ε₁,L}}.

D is an open neighborhood of p(vδ) = tδvδ + xδ in the subspace spanned by L, vδ and d. By the local continuity of p at vδ, there exists ε > 0 s.t.

v(s/tδ) ∈ N(vδ) ∩ S_{L⊥}  and  p(v(s/tδ)) ∈ D,   ∀ s/tδ < ε,

where N(vδ) is a neighborhood on which the local peak selection p is defined at vδ. Since

p(v(s/tδ)) ∈ [L, v(s/tδ)],

we can write

p(v(s/tδ)) = ctδvδ + x₁ + csd

uniquely for some number c and x₁ ∈ B_{ε₁,L}, where we can choose ε₁ > 0 sufficiently small s.t. ctδ ∈ (t₁, t₂) and cs < ε₃, ∀s < ε₁/2; thus (4.18) is satisfied for t = ctδ and cs < ε₃. Therefore

(4.20)   J(p(v(s/tδ))) − J(p(vδ)) = J(ctδvδ + x₁ + csd) − J(p(vδ))
(4.21)                             < −αδ‖v(cs/(ctδ)) − vδ‖
(4.22)                             = −αδ‖v(s/tδ) − vδ‖.
References

[1] A. Ambrosetti and P. Rabinowitz, Dual variational methods in critical point theory and applications, J. Funct. Anal. 14 (1973), 349-381.

[2] T. Bartsch and Z.Q. Wang, On the existence of sign-changing solutions for semilinear Dirichlet problems, Topol. Methods Nonlinear Anal. 7 (1996), 115-131.

[3] H. Brezis and L. Nirenberg, Remarks on finding critical points, Comm. Pure Appl. Math. XLIV (1991), 939-963.

[4] A. Castro and J. Cossio, Multiple solutions for a nonlinear Dirichlet problem, SIAM J. Math. Anal. 25 (1994), 1554-1561.

[5] K.C. Chang, Infinite Dimensional Morse Theory and Multiple Solution Problems, Birkhäuser, Boston, 1993.

[6] G. Chen, W. Ni and J. Zhou, Algorithms and visualization for solutions of nonlinear elliptic equations Part I: Dirichlet problems, Int. J. Bifurcation & Chaos, 7(2000), 1565-1612.

[7] Y.S. Choi and P.J. McKenna, A mountain pass method for the numerical solution of semilinear elliptic problems, Nonlinear Analysis, Theory, Methods and Applications 20 (1993), 417-437.

[8] C.V. Coffman, A nonlinear boundary value problem with many positive solutions, J. Diff. Eq. 54 (1984), 429-437.

[9] E.N. Dancer, The effect of domain shape on the number of positive solutions of certain nonlinear equations, J. Diff. Eq. 74 (1988), 120-156.

[10] Y. Deng, G. Chen, W.M. Ni and J. Zhou, Boundary element monotone iteration scheme for semilinear elliptic partial differential equations, Math. Comp. 65 (1996), 943-982.

[11] W.Y. Ding and W.M. Ni, On the existence of positive entire solutions of a semilinear elliptic equation, Arch. Rational Mech. Anal. 91 (1986), 238-308.

[12] Z. Ding, D. Costa and G. Chen, A high linking method for sign changing solutions for semilinear elliptic equations, Nonlinear Analysis 38 (1999), 151-172.

[13] B. Gidas, W.M. Ni and L. Nirenberg, Symmetry and related properties via the maximum principle, Comm. Math. Phys. 68 (1979), 209-243.

[14] Y. Li and J. Zhou, A minimax method for finding multiple critical points and its applications to semilinear PDE, submitted.

[15] Y. Li and J. Zhou, Local characterization of saddle points and their Morse indices, in Control of Nonlinear Distributed Parameter Systems, Marcel Dekker, New York, pp. 233-252, to appear.

[16] F. Lin and T. Lin, Minimax solutions of the Ginzburg-Landau equations, Selecta Math. (N.S.) 3 (1997), no. 1, 99-113.

[17] J. Mawhin and M. Willem, Critical Point Theory and Hamiltonian Systems, Springer-Verlag, New York, 1989.

[18] Z. Nehari, On a class of nonlinear second-order differential equations, Trans. Amer. Math. Soc. 95 (1960), 101-123.

[19] W.M. Ni, Some Aspects of Semilinear Elliptic Equations, Dept. of Math., National Tsing Hua Univ., Hsinchu, Taiwan, Rep. of China, 1987.

[20] W.M. Ni, Recent progress in semilinear elliptic equations, in RIMS Kokyuroku 679, Kyoto University, Kyoto, Japan, 1989, 1-39.

[21] P. Rabinowitz, Minimax Methods in Critical Point Theory with Applications to Differential Equations, CBMS Reg. Conf. Series in Math., No. 65, AMS, Providence, 1986.

[22] M. Schechter, Linking Methods in Critical Point Theory, Birkhäuser, Boston, 1999.

[23] M. Struwe, Variational Methods, Springer, 1996.

[24] Z. Wang, On a superlinear elliptic equation, Ann. Inst. H. Poincaré 8 (1991), 43-57.

[25] M. Willem, Minimax Theorems, Birkhäuser, Boston, 1996.

[26] J. Zhou, Instability indices of saddle points by a local minimax method, preprint.