LIDS-P-1124
August 1981
PROJECTED NEWTON METHODS FOR
OPTIMIZATION PROBLEMS WITH SIMPLE CONSTRAINTS*
Dimitri P. Bertsekas
Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology
Cambridge, Massachusetts 02139

* Work supported by Grant NSF ECS-79-20834 and by Alphatech, Inc. through DOE Contract No. DE-AC01-79ET29243. To be presented at the 20th IEEE Conference on Decision and Control, San Diego, CA, December 1981.

Abstract

We consider the problem $\min\{f(x) \mid x \ge 0\}$ and algorithms of the form $x_{k+1} = [x_k - \alpha_k D_k \nabla f(x_k)]^+$, where $[\cdot]^+$ denotes projection on the positive orthant, $\alpha_k$ is a stepsize chosen by an Armijo-like rule, and $D_k$ is a positive definite symmetric matrix which is partly diagonal. We show that $D_k$ can be calculated simply on the basis of second derivatives of $f$ so that the resulting Newton-like algorithm has a typically superlinear rate of convergence. With other choices of $D_k$ convergence at a typically linear rate is obtained. The algorithms are almost as simple as their unconstrained counterparts. They are well suited for problems of large dimension such as those arising in optimal control while being competitive with existing methods for low-dimensional problems.

1. Introduction

We consider the problem

$$\text{minimize } f(x) \quad \text{subject to } x \ge 0, \tag{1}$$

where $f: R^n \to R$ is a continuously differentiable function and the vector inequality $x \ge 0$ is meant to be componentwise [i.e. for $x = (x^1,\ldots,x^n)$, we write $x \ge 0$ if $x^i \ge 0$ for all $i = 1,\ldots,n$]. This type of problem arises very often in applications; for example when $f$ is a dual functional relative to an original inequality constrained primal problem, and $x$ represents a vector of nonnegative Lagrange multipliers corresponding to the inequality constraints, and when $f$ represents an Augmented Lagrangian or exact penalty function taking into account other possibly nonlinear equality and inequality constraints. The analysis and algorithms that follow apply also with minor modifications to problems with rectangle constraints such as

$$\text{minimize } f(x) \quad \text{subject to } b_1 \le x \le b_2, \tag{2}$$

where $b_1$ and $b_2$ are given vectors. Problems (1) and (2) are referred to as simply constrained problems, and their algorithmic solution is the primary subject of this paper.

In view of the simplicity of the constraints, one would expect that solution of problem (1) is almost as easy as unconstrained minimization of $f$. This expectation is partly justified in that the first order necessary condition for a vector $\bar{x} = (\bar{x}^1,\ldots,\bar{x}^n)$ to be a local minimum of problem (1) takes the simple form

$$\frac{\partial f(\bar{x})}{\partial x^i} \ge 0, \quad i = 1,\ldots,n, \tag{3a}$$

$$\frac{\partial f(\bar{x})}{\partial x^i} = 0 \quad \text{if } \bar{x}^i > 0, \quad i = 1,\ldots,n. \tag{3b}$$

Furthermore the direct analog of the method of steepest descent takes the simple form

$$x_{k+1} = [x_k - \alpha_k \nabla f(x_k)]^+, \quad k = 0,1,\ldots \tag{4}$$

where $\alpha_k$ is a positive scalar stepsize and for any vector $z = (z^1,\ldots,z^n) \in R^n$ we denote

$$[z]^+ = \big(\max\{0,z^1\},\ldots,\max\{0,z^n\}\big)'.$$

The stepsize $\alpha_k$ may be chosen in a number of ways. In the original proposal of Goldstein [1], and Levitin and Poljak [2], $\alpha_k$ is taken to be a constant $\bar{\alpha}$ (i.e. $\alpha_k = \bar{\alpha}$ for all $k$) and a convergence result is shown under the assumption that $\bar{\alpha}$ is sufficiently small and $\nabla f$ is Lipschitz continuous. In general a proper value for $\bar{\alpha}$ can be found only through experimentation.
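As an illustration of iteration (4) with a constant stepsize, the following is a minimal sketch in plain Python. The test function, the starting point, the stepsize value 0.25, and the helper names `project`, `is_critical` and `gradient_projection` are assumptions chosen for this example and do not come from the paper. The sketch also checks the first order conditions (3a)-(3b) up to a tolerance.

```python
def project(z):
    """[z]^+ : componentwise projection on the positive orthant."""
    return [max(0.0, zi) for zi in z]


def is_critical(x, grad, tol=1e-8):
    """Check conditions (3a)-(3b) up to the tolerance tol."""
    g = grad(x)
    return all(
        gi >= -tol and (xi <= tol or abs(gi) <= tol)
        for xi, gi in zip(x, g)
    )


def gradient_projection(grad, x0, alpha, iters=200):
    """Iteration (4) with a constant stepsize alpha (assumed small enough)."""
    x = project(x0)
    for _ in range(iters):
        g = grad(x)
        x = project([xi - alpha * gi for xi, gi in zip(x, g)])
    return x


# Hypothetical test problem: f(x) = (x1 - 1)^2 + (x2 + 2)^2 over x >= 0.
# Its constrained minimum is (1, 0); the second constraint is binding there.
def grad_f(x):
    return [2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)]


x_star = gradient_projection(grad_f, [3.0, 3.0], alpha=0.25)
```

On this example the second coordinate is driven to the boundary and stays there, while the first converges at a linear rate, which is the typical behavior of the unscaled method.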
An alternative suggested by McCormick [3] is to choose $\alpha_k$ by function minimization along the arc of points $x_k(\alpha)$, $\alpha \ge 0$, where

$$x_k(\alpha) = [x_k - \alpha \nabla f(x_k)]^+, \quad \alpha \ge 0. \tag{5}$$

Thus $\alpha_k$ is chosen so that

$$f[x_k(\alpha_k)] = \min_{\alpha \ge 0} f[x_k(\alpha)]. \tag{6}$$

Unfortunately the minimization above is very difficult to carry out, particularly for problems of large dimension, since $f[x_k(\alpha)]$ need not be differentiable, convex, or unimodal as a function of $\alpha$ even if $f$ is convex. For most problems we prefer the Armijo-like stepsize rule first proposed in Bertsekas [4] whereby $\alpha_k$ is given by

$$\alpha_k = \beta^{m_k} s, \tag{7a}$$

where $m_k$ is the first nonnegative integer $m$ satisfying

$$f(x_k) - f[x_k(\beta^m s)] \ge \sigma \nabla f(x_k)'[x_k - x_k(\beta^m s)]. \tag{7b}$$

Here the scalars $s$, $\beta$, and $\sigma$ are fixed and are chosen so that $s > 0$, $\beta \in (0,1)$, and $\sigma \in (0,\tfrac{1}{2})$. In addition to being easily implementable and convergent, the algorithm (4), (7) has the advantage that when it converges to a local minimum $x^*$ satisfying the standard second order sufficiency conditions for optimality (including strict complementarity) it identifies the active constraints at $x^*$ in a finite number of iterations in the sense that there exists $\bar{k}$ such that

$$B(x_k) = B(x^*), \quad \forall\, k \ge \bar{k}, \tag{8}$$

where, for every $x \in R^n$, $B(x)$ denotes the set of indices of binding constraints at $x$:

$$B(x) = \{ i \mid x^i = 0,\ i = 1,\ldots,n \}. \tag{9}$$

Minor modifications of the proofs given in [4] show that the results stated above hold also for the algorithm

$$x_{k+1} = [x_k - \alpha_k D_k \nabla f(x_k)]^+, \tag{10}$$

where $D_k$ is a diagonal positive definite matrix and $\alpha_k$ is chosen by (7) where now $x_k(\alpha)$ is given by

$$x_k(\alpha) = [x_k - \alpha D_k \nabla f(x_k)]^+. \tag{11}$$

For this it is necessary to assume that the diagonal elements $d_k^i$, $i = 1,\ldots,n$, of the matrices $D_k$ satisfy

$$\underline{d} \le d_k^i \le \bar{d}, \quad k = 0,1,\ldots, \quad i = 1,\ldots,n,$$

where $\underline{d}$ and $\bar{d}$ are some positive scalars.

While it is often possible to achieve substantial computational savings by proper diagonal scaling of $\nabla f$ as in (10), the resulting algorithm is typically characterized by linear convergence rate. Any attempt to construct a superlinearly convergent algorithm must by necessity involve a nondiagonal scaling matrix $D_k$ which is an adequate approximation of the inverse Hessian $\nabla^2 f(x_k)^{-1}$ at least along a suitable subspace. At this point we find that the algorithms available at present are far more complicated than their unconstrained counterparts, particularly when the problem has large dimension. Thus the most straightforward extension of Newton's method is given by

$$x_{k+1} = x_k + \alpha_k(\bar{x}_k - x_k), \tag{12}$$

where $\bar{x}_k$ is a solution of the quadratic program

$$\text{minimize } \nabla f(x_k)'(x - x_k) + \tfrac{1}{2}(x - x_k)'\nabla^2 f(x_k)(x - x_k) \quad \text{subject to } x \ge 0, \tag{13}$$

and $\alpha_k$ is a stepsize parameter. There are convergence and superlinear rate of convergence results in the literature regarding this type of method (Levitin and Poljak [2], Dunn [5]) and its Quasi-Newton versions (Garcia-Palomares and Mangasarian [6]); however, its effectiveness is strongly dependent upon the computational requirements of solving the quadratic program (13). For problems of small dimension, problem (13) can be solved rather quickly by standard pivoting or manifold suboptimization methods, but when the problem is of large dimension solution of the quadratic program (13) by standard methods can be very time consuming. Indeed, there are large-scale quadratic programming problems arising in optimal control the solution of which by pivoting methods is unthinkable. In any case the facility or lack thereof of solving the quadratic program (13) must be accounted for when comparing method (12) against other alternatives.

Another possible approach for constructing superlinearly convergent algorithms for solving problem (1) stems from the original gradient projection proposal of Rosen [7], and is based on manifold suboptimization and active set strategies as in Gill and Murray [8], Goldfarb [9], Luenberger [10] and other sources (see Lenard [11] for an up-to-date evaluation of various alternatives). Methods of this type are quite efficient for problems of relatively small dimension, but are typically unattractive for large-scale problems with a large number of constraints binding at a solution. The main reason is that typically at most one constraint can be added to the active set at each iteration, so if, for example, 1,000 constraints are binding at the point of convergence and an interior starting point is selected, then the method will require at least 1,000 iterations (and possibly many more) to converge. While several authors [8], [10] have alluded to the possibility of bending the direction of search along the constraint boundary, the only specific proposal known to the author that has been made in the context of the manifold suboptimization approach is the one of McCormick [12] and it does not seem particularly attractive for large-scale problems. (The Quasi-Newton methods proposed by Brayton and Cullum [13] incorporate bending but simultaneously require the solution of quadratic programming subproblems). Manifold suboptimization methods require also additional computation overhead in deciding which constraint to
drop from the currently active set. For the apparently most successful strategies (Lenard [11]) which attempt to drop as many constraints as possible, this overhead can be significant and must be taken into account when comparing the manifold suboptimization approach with other alternatives.

The algorithms proposed in this paper attempt to combine the basic simplicity of the steepest descent iteration (4), (7) with the sophistication and fast convergence of the constrained Newton's method (12), (13). They do not involve solution of a quadratic program thereby avoiding the associated computational overhead, and there is no bound to the number of constraints that can be added to the currently active set thereby bypassing a serious inherent limitation of manifold suboptimization methods. The basic form of the method is

$$x_{k+1} = x_k(\alpha_k), \tag{14}$$

where

$$x_k(\alpha) = [x_k - \alpha D_k \nabla f(x_k)]^+, \quad \alpha \ge 0. \tag{15}$$

$D_k$ is a positive definite symmetric matrix which is partly diagonal, and $\alpha_k$ is a stepsize determined by an Armijo-like rule similar to (7) that will be described later. The convergence and rate of convergence properties of this method are discussed in Section 2. A key property of the method is that under mild assumptions it identifies the manifold of binding constraints at a solution in a finite number of iterations in the sense of (8). This means that eventually the method is reduced to an unconstrained method on this manifold and brings to bear the extensive methodology and analysis relating to unconstrained minimization algorithms.

In reference [17] we discuss how the method (14), (15) can form the basis for constructing algorithms for general linearly constrained problems of the form

$$\text{minimize } f(x) \quad \text{subject to } b_1 \le Ax \le b_2. \tag{16}$$

The main idea here is to view problem (16) locally as a simply constrained problem via a transformation of variables. For example, if the matrix $A$ is square and invertible, problem (16) is equivalent to the problem

$$\text{minimize } h(y) = f(A^{-1}y) \quad \text{subject to } b_1 \le y \le b_2$$

via the transformation $y = Ax$. A similar approach based on an active set strategy is employed when $A$ is not square and invertible. The ideas are similar to those involved in manifold suboptimization methods where a linear manifold is selected as a "local universe" for the purposes of the current iteration. Our algorithms take a suitably chosen rectangle (i.e. a set described by upper and lower bounds on the variables) as a "local universe" instead of a manifold.

Throughout the paper we emphasize Newton-like methods as prototypes for broad classes of superlinearly converging algorithms that fit the framework of the paper. We often make positive definiteness assumptions on the Hessian matrix of $f$ in order to avoid getting bogged down in technical details relating to modifications of Newton's method such as those employed in unconstrained minimization [14]-[16] to account for the possibility that $\nabla^2 f$ is not positive definite. Quasi-Newton versions of the Newton-like methods presented are possible but the discussion of specific implementations is beyond the scope of the paper. More generally it may be said that the nature of the algorithms proposed is such that almost every useful idea from unconstrained minimization can be fruitfully adapted within the constrained minimization framework considered here; however, the precise details of how this should be done may involve considerable further research and experimentation.

The notation employed throughout the paper is as follows. All vectors are considered to be column vectors. A prime denotes transposition. The standard norm in $R^n$ is denoted by $|\cdot|$, i.e. for $x = (x^1,\ldots,x^n)$ we write $|x| = \big[\sum_{i=1}^n (x^i)^2\big]^{1/2}$. The gradient and Hessian of a function $f: R^n \to R$ are denoted by $\nabla f$ and $\nabla^2 f$ respectively. Proofs of all results stated may be found in reference [17].

2. Algorithms for Minimization Subject to Simple Constraints

We consider first the problem $\min\{f(x) \mid x \ge 0\}$ of (1). Any vector $x$ satisfying the first order necessary condition (3) will be referred to as a critical point with respect to problem (1). We focus attention at iterations of the form

$$x_{k+1} = [x_k - \alpha_k D_k \nabla f(x_k)]^+,$$

where $D_k$ is a positive definite symmetric matrix and $\alpha_k$ is chosen by search along the arc of points

$$x_k(\alpha) = [x_k - \alpha D_k \nabla f(x_k)]^+, \quad \alpha \ge 0.$$

It is easy to construct examples where an arbitrary positive definite choice of the matrix $D_k$ leads to situations where it is impossible to reduce the value of the objective by suitable choice of the stepsize $\alpha$ (i.e. $f[x_k(\alpha)] \ge f(x_k)$, $\forall\, \alpha \ge 0$). The following proposition identifies a class of matrices $D_k$ for which an objective function reduction is possible. Define for all $x \ge 0$
$$I^+(x) = \Big\{ i \;\Big|\; x^i = 0,\ \frac{\partial f(x)}{\partial x^i} > 0 \Big\}. \tag{17}$$

We say that a symmetric $n \times n$ matrix $D$ with elements $d^{ij}$ is diagonal with respect to a subset of indices $I \subset \{1,2,\ldots,n\}$ if

$$d^{ij} = 0, \quad \forall\, i \in I,\ j = 1,2,\ldots,n,\ j \ne i. \tag{18}$$

Proposition 1: Let $x \ge 0$ and $D$ be a positive definite symmetric matrix which is diagonal with respect to $I^+(x)$ and denote

$$x(\alpha) = [x - \alpha D \nabla f(x)]^+, \quad \forall\, \alpha \ge 0. \tag{19}$$

a) The vector $x$ is a critical point with respect to problem (1) if and only if $x = x(\alpha)$, $\forall\, \alpha \ge 0$.

b) If $x$ is not a critical point with respect to problem (1) there exists a scalar $\bar{\alpha} > 0$ such that

$$f[x(\alpha)] < f(x), \quad \forall\, \alpha \in (0,\bar{\alpha}]. \tag{20}$$

Based on Proposition 1 we are led to the conclusion that the matrix $D_k$ in the iteration

$$x_{k+1} = [x_k - \alpha_k D_k \nabla f(x_k)]^+$$

should be chosen diagonal with respect to a subset of indices that contains $I^+(x_k)$. Unfortunately the set $I^+(x_k)$ exhibits an undesirable discontinuity at the boundary of the constraint set whereby given a sequence $\{x_k\}$ of interior points that converges to a boundary point $\bar{x}$ the set $I^+(x_k)$ may be strictly smaller than the set $I^+(\bar{x})$. This causes difficulties in proving convergence of the algorithm and may have an adverse effect on its rate of convergence. (This phenomenon is quite common in feasible direction algorithms and is referred to as zigzagging or jamming). For this reason we will employ certain enlargements of the sets $I^+(x_k)$ with the aim of bypassing these difficulties.

The algorithm that we describe utilizes a scalar $\epsilon > 0$ (typically small), a fixed† diagonal positive definite matrix $M$ (for example the identity), and two parameters $\beta \in (0,1)$ and $\sigma \in (0,\tfrac{1}{2})$ that will be used in connection with an Armijo-like stepsize rule. An initial vector $x_0 \ge 0$ is chosen and at the $k$th iteration of the algorithm we have a vector $x_k \ge 0$. Denote

$$w_k = \big| x_k - [x_k - M \nabla f(x_k)]^+ \big|, \qquad \epsilon_k = \min\{\epsilon, w_k\}.$$

† Actually the results that follow can be shown also for the case where $M$ is changed from one iteration to the next in a way that its diagonal elements are bounded above and away from zero.

(k+1)st Iteration of the Algorithm

We select a positive definite symmetric matrix $D_k$ which is diagonal with respect to the set $I_k^+$ given by

$$I_k^+ = \Big\{ i \;\Big|\; 0 \le x_k^i \le \epsilon_k,\ \frac{\partial f(x_k)}{\partial x^i} > 0 \Big\}. \tag{21}$$

Denote

$$p_k = D_k \nabla f(x_k), \tag{22}$$

$$x_k(\alpha) = [x_k - \alpha p_k]^+, \quad \forall\, \alpha \ge 0. \tag{23}$$

Then $x_{k+1}$ is given by

$$x_{k+1} = x_k(\alpha_k), \tag{24}$$

where

$$\alpha_k = \beta^{m_k} \tag{25}$$

and $m_k$ is the first nonnegative integer $m$ such that

$$f(x_k) - f[x_k(\beta^m)] \ge \sigma \Big\{ \beta^m \sum_{i \notin I_k^+} \frac{\partial f(x_k)}{\partial x^i}\, p_k^i + \sum_{i \in I_k^+} \frac{\partial f(x_k)}{\partial x^i}\, [x_k^i - x_k^i(\beta^m)] \Big\}. \tag{26}$$

The stepsize rule (25), (26) may be viewed as a combination of the Armijo-like rule (7) and the Armijo rule usually employed in unconstrained minimization (see e.g. Polak [18]). When $I_k^+$ is empty the right-hand side of (26) becomes $\sigma \beta^m \nabla f(x_k)' p_k$ and is identical to the corresponding expression of the Armijo rule in unconstrained optimization, while if $I_k^+ = \{1,2,\ldots,n\}$ then inequality (26) is identical with (7). Note that, for all $k$, we have

$$I^+(x_k) \subset I_k^+, \tag{27}$$

so the matrix $D_k$ is diagonal with respect to $I^+(x_k)$. It is possible to show that, for all $m \ge 0$, the right-hand side of (26) is nonnegative, and it is positive if and only if $x_k$ is not a critical point. In conclusion the algorithm is well defined, decreases the value of the objective function at each iteration $k$ for which $x_k$ is not a critical point, and essentially terminates if $x_k$ is critical.
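The iteration (21)-(26) can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the paper's implementation: $D_k$ is taken to be a fixed diagonal matrix `diag(d)` (which is trivially diagonal with respect to every index set), $M$ is the identity, and the function names, parameter values, and the test problem are hypothetical.

```python
import math


def project(z):
    """[z]^+ : componentwise projection on the positive orthant."""
    return [max(0.0, zi) for zi in z]


def step(f, grad, x, d, eps=1e-3, beta=0.5, sigma=1e-4, max_m=60):
    """One iteration of (21)-(26), assuming D_k = diag(d) and M = identity."""
    g = grad(x)
    # w_k = |x_k - [x_k - M grad f(x_k)]^+| and eps_k = min(eps, w_k)
    w = math.sqrt(sum((xi - max(0.0, xi - gi)) ** 2 for xi, gi in zip(x, g)))
    eps_k = min(eps, w)
    # (21): nearly binding indices with positive partial derivative
    i_plus = {i for i, (xi, gi) in enumerate(zip(x, g))
              if 0.0 <= xi <= eps_k and gi > 0.0}
    p = [di * gi for di, gi in zip(d, g)]                      # (22)
    fx = f(x)
    for m in range(max_m):
        a = beta ** m                                          # (25)
        xa = project([xi - a * pi for xi, pi in zip(x, p)])    # (23)
        rhs = sigma * sum(
            g[i] * (x[i] - xa[i]) if i in i_plus else a * g[i] * p[i]
            for i in range(len(x))
        )
        if fx - f(xa) >= rhs:                                  # test (26)
            return xa, i_plus
    # No trial stepsize passed (26) within max_m trials; in floating point
    # this can happen when x is already numerically optimal.
    return x, i_plus


def minimize(f, grad, x0, d, iters=50):
    x, i_plus = project(x0), set()
    for _ in range(iters):
        x, i_plus = step(f, grad, x, d)
    return x, i_plus


# Hypothetical test problem: f(x) = (x1 - 1)^2 + (x2 + 2)^2 over x >= 0.
def f(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


def grad_f(x):
    return [2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)]


x_final, active = minimize(f, grad_f, [3.0, 3.0], d=[0.25, 0.25])
```

With a diagonal $D_k$ the sketch only exhibits the linear-rate variant; it is meant to show how $\epsilon_k$, the set $I_k^+$, and the two sums in (26) fit together. Indices are zero-based in the code, so the binding constraint on the second variable appears as index 1.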
We proceed to analyze its convergence and rate of convergence properties. To this end we will make use of the following two assumptions:

(A) The gradient $\nabla f$ is Lipschitz continuous on each bounded set of $R^n$, i.e. given any bounded set $S \subset R^n$ there exists a scalar $L$ (depending on $S$) such that

$$|\nabla f(x) - \nabla f(y)| \le L|x - y|, \quad \forall\, x, y \in S. \tag{28}$$

(B) There exist positive scalars $\lambda_1$, $\lambda_2$ and nonnegative scalars $q_1$, $q_2$ such that

$$\lambda_1 w_k^{q_1} |z|^2 \le z' D_k z \le \lambda_2 w_k^{q_2} |z|^2, \quad \forall\, z \in R^n,\ k = 0,1,\ldots, \tag{30}$$

where $w_k = \big| x_k - [x_k - M \nabla f(x_k)]^+ \big|$.

Note that when $q_1 = q_2 = 0$, relation (30) takes the form

$$\lambda_1 |z|^2 \le z' D_k z \le \lambda_2 |z|^2, \quad \forall\, z \in R^n,\ k = 0,1,\ldots, \tag{31}$$

and simply says that the eigenvalues of $D_k$ are uniformly bounded above and away from zero.

Proposition 2: Under Assumptions (A) and (B) above, every limit point of a sequence $\{x_k\}$ generated by iteration (24) is a critical point with respect to problem (1).

We now focus attention at a local minimum $x^*$ satisfying the following second order sufficiency conditions. For all $x \ge 0$ we denote by $B(x)$ the set of indices of binding constraints at $x$, i.e.

$$B(x) = \{ i \mid x^i = 0 \}, \quad \forall\, x \ge 0. \tag{32}$$

(C) The local minimum $x^*$ of problem (1) is such that for some $\delta > 0$, $f$ is twice continuously differentiable in the open sphere $\{x \mid |x - x^*| < \delta\}$ and there exist positive scalars $m_1$, $m_2$ such that

$$m_1 |z|^2 \le z' \nabla^2 f(x) z \le m_2 |z|^2, \quad \forall\, x \text{ such that } |x - x^*| < \delta \text{ and } z \ne 0 \text{ such that } z^i = 0,\ \forall\, i \in B(x^*). \tag{33}$$

Furthermore

$$\frac{\partial f(x^*)}{\partial x^i} > 0, \quad \forall\, i \in B(x^*). \tag{34}$$

The following proposition demonstrates an important property of the algorithm, namely that under mild conditions it is attracted by a local minimum $x^*$ satisfying Assumption (C) and identifies the set of active constraints at $x^*$ in a finite number of iterations. Thus if the algorithm converges to $x^*$ then after a finite number of iterations it is equivalent to an unconstrained optimization method restricted on the subspace of binding constraints at $x^*$. This property is instrumental in proving superlinear convergence of the algorithm when the portion of $D_k$ corresponding to the indices $i \notin B(x^*)$ is chosen in a way that approximates the inverse of the portion of the Hessian of $f$ corresponding to these same indices.

Proposition 3: Let $x^*$ be a local minimum of problem (1) satisfying Assumption (C). Assume also that (B) holds in the stronger form whereby, in addition to (30), the diagonal elements $d_k^{ii}$ of the matrices $D_k$ satisfy for some scalar $\bar{\lambda}_1 > 0$

$$d_k^{ii} \ge \bar{\lambda}_1, \quad \forall\, k = 0,1,\ldots,\ i \in I_k^+. \tag{35}$$

Then there exists a scalar $\bar{\delta} > 0$ such that if $\{x_k\}$ is a sequence generated by iteration (24) and for some index $\bar{k}$ we have

$$|x_{\bar{k}} - x^*| \le \bar{\delta}, \tag{36}$$

then $\{x_k\}$ converges to $x^*$ and we have

$$I_k^+ = B(x_k) = B(x^*), \quad \forall\, k \ge \bar{k} + 1. \tag{37}$$

Under the assumptions of Proposition 3 we see that if the algorithm converges to a local minimum $x^*$ satisfying Assumption (C) then it reduces eventually to an unconstrained minimization method restricted on the subspace

$$T = \{ x \mid x^i = 0,\ \forall\, i \in B(x^*) \}. \tag{38}$$

Furthermore, for some index $\bar{k}$ we will have

$$I_k^+ = B(x_k) = B(x^*), \quad \forall\, k \ge \bar{k}. \tag{39}$$

This shows that if the portion of the matrix $D_k$ corresponding to the indices $i \notin B(x^*)$ is chosen to be the inverse of the Hessian of $f$ with respect to these indices then the algorithm eventually reduces to Newton's method restricted on the subspace $T$.

More specifically, by rearranging indices if necessary, assume without loss of generality that

$$I_k^+ = \{ r_k + 1, \ldots, n \}, \tag{40}$$

where $r_k$ is some integer. Then $D_k$ has the form

$$D_k = \begin{bmatrix} \bar{D}_k & 0 \\ 0 & \mathrm{diag}(d_k^{r_k+1}, \ldots, d_k^n) \end{bmatrix}, \tag{41}$$

where $d_k^i > 0$, $i = r_k+1,\ldots,n$, and $\bar{D}_k$ can be an arbitrary positive definite matrix. Suppose we choose $\bar{D}_k$ to be the inverse of the Hessian of $f$ with respect to the indices $i = 1,\ldots,r_k$, i.e. the elements of $\bar{D}_k^{-1}$ are
$$[\bar{D}_k^{-1}]_{ij} = \frac{\partial^2 f(x_k)}{\partial x^i \partial x^j}, \quad i, j = 1,\ldots,r_k. \tag{42}$$

By Assumption (C), $\nabla^2 f(x^*)$ is positive definite on $T$ so it follows from (39) that this choice is well defined and satisfies the assumption of Proposition 3 for $k$ sufficiently large. Since the conclusion of this proposition asserts that the method eventually reduces to Newton's method restricted on the subspace $T$ a superlinear convergence rate result follows. This type of argument can be used to construct a number of Newton-like and Quasi-Newton methods and prove corresponding convergence and rate of convergence results.
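The Newton-like choice (40)-(42) amounts to solving a reduced Newton system on the indices outside $I_k^+$ and scaling the remaining components by positive diagonal entries. The following is a minimal sketch in plain Python, assuming a small dense Hessian. The diagonal entries $1/H_{ii}$ for the indices in $I_k^+$ are one admissible positive choice made here for illustration, and the helper names and the quadratic test problem are hypothetical.

```python
def solve(A, b):
    """Solve A y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            t = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= t * M[c][j]
    y = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][j] * y[j] for j in range(r + 1, n))
        y[r] = (M[r][n] - s) / M[r][r]
    return y


def newton_direction(g, H, i_plus):
    """p = D grad f with D as in (41)-(42).

    Free indices use the inverse of the reduced Hessian; indices in
    i_plus use the diagonal scaling 1 / H[i][i] (an assumed choice).
    """
    n = len(g)
    free = [i for i in range(n) if i not in i_plus]
    p = [0.0] * n
    if free:
        H_ff = [[H[i][j] for j in free] for i in free]
        for i, yi in zip(free, solve(H_ff, [g[i] for i in free])):
            p[i] = yi
    for i in i_plus:
        p[i] = g[i] / H[i][i]
    return p


# Hypothetical quadratic: f(x) = 0.5 x'Hx - b'x over x >= 0.
H = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
b = [3.0, 3.0, -1.0]
x = [2.0, 0.5, 0.0]
g = [sum(H[i][j] * x[j] for j in range(3)) - b[i] for i in range(3)]
i_plus = {i for i in range(3) if x[i] == 0.0 and g[i] > 0.0}   # here eps_k -> 0
p = newton_direction(g, H, i_plus)
x_next = [max(0.0, xi - pi) for xi, pi in zip(x, p)]           # unit stepsize
```

Because the objective in this example is quadratic and the binding set is already identified, a single unit step reaches the constrained minimizer $(1, 1, 0)$, in line with the reduction to Newton's method on the subspace $T$.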
References
[1] A.A. Goldstein, "Convex Programming in Hilbert Space", Bull. Amer. Math. Soc., Vol. 70, 1964, pp. 709-710.
[2] E.S. Levitin and B.T. Poljak, "Constrained Minimization Problems", U.S.S.R. Comput. Math. Phys., Vol. 6, 1966, pp. 1-50.
[3] G.P. McCormick, "Anti-Zig-Zagging by Bending", Management Science, Vol. 15, 1969, pp. 315-319.
[4] D.P. Bertsekas, "On the Goldstein-Levitin-Poljak Gradient Projection Method", IEEE Transactions on Aut. Control, Vol. AC-21, 1976, pp. 174-184.
[5] J.C. Dunn, "Newton's Method and the Goldstein Step Length Rule for Constrained Minimization Problems", SIAM J. on Control and Optimization, Vol. 18, 1980, pp. 659-674.
[6] U.M. Garcia-Palomares and O.L. Mangasarian, "Superlinearly Convergent Quasi-Newton Algorithms for Nonlinearly Constrained Optimization Problems", Mathematical Programming, Vol. 11, 1976, pp. 1-13.
[7] J.B. Rosen, "The Gradient Projection Method for Nonlinear Programming, Part I: Linear Constraints", J. SIAM, Vol. 8, 1960, pp. 181-217.
[8] P.E. Gill and W. Murray (eds.), Numerical Methods for Constrained Optimization, Academic Press, N.Y., 1974, Ch. 2 and 3.
[9] D. Goldfarb, "Extension of Davidon's Variable Metric Algorithm to Maximization Under Linear Inequality and Equality Constraints", SIAM J. of Applied Math., Vol. 17, 1969, pp. 739-
[10] D.G. Luenberger, Introduction to Linear and Nonlinear Programming, Addison-Wesley, Reading, Mass., 1973, Ch. 11.
[11] M.L. Lenard, "A Computational Study of Active Set Strategies in Nonlinear Programming with Linear Constraints", Mathematical Programming, Vol. 16, 1979, pp. 81-97.
[12] G.P. McCormick, "The Variable Reduction Method for Nonlinear Programming", Management Science, Vol. 17, 1970, pp. 146-160.
[13] R.K. Brayton and J. Cullum, "An Algorithm for Minimizing a Differentiable Function Subject to Box Constraints and Errors", Journal of Optimization Theory and Applications, Vol. 29, 1979, pp. 521-558.
[14] W. Murray, "Second Derivative Methods", in Numerical Methods for Unconstrained Optimization, W. Murray (ed.), Academic Press, N.Y., 1972, pp. 57-71.
[15] R. Fletcher and T.L. Freeman, "A Modified Newton Method for Minimization", Journal of Optimization Theory and Applications, Vol. 23, 1977, pp. 357-372.
[16] J.J. More and D.C. Sorensen, "On the Use of Directions of Negative Curvature in a Modified Newton Method", Mathematical Programming, Vol. 16, 1979, pp. 1-20.
[17] D.P. Bertsekas, "Projected Newton Methods for Optimization Problems with Simple Constraints", SIAM J. on Control and Optimization, (to appear).
[18] E. Polak, Computational Methods in Optimization: A Unified Approach, Academic Press, 1971.