Steepest Descent and Conjugate Gradient Methods
MATH 450
October 6, 2008
Steepest Descent Method
Recall the minimization of the function p(x) = ⟨x, Ax⟩ − 2⟨x, b⟩:

p(x + t̂ y) = p(x) − ⟨y, b − Ax⟩² / ⟨y, Ay⟩.
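Here t̂ denotes the optimal step size along y. For completeness, the standard derivation (not spelled out on the slide, but using only the symmetry of A): expanding p at x + ty gives

p(x + ty) = p(x) − 2t ⟨y, b − Ax⟩ + t² ⟨y, Ay⟩,

a quadratic in t whose derivative vanishes at

t̂ = ⟨y, b − Ax⟩ / ⟨y, Ay⟩.

Substituting t̂ back yields the decrease formula above.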
Thus we can construct an iterative method with step size

t^(k) = ⟨y^(k), b − Ax^(k)⟩ / ⟨y^(k), Ay^(k)⟩

along the direction y^(k) at x^(k), i.e.,

x^(k+1) = x^(k) + t^(k) · y^(k).
For the steepest descent method we choose

y^(k) = b − Ax^(k),

which is the steepest descent direction, i.e., the negative gradient of p(x) at x^(k). This is the residual vector, and we can show that ⟨y^(k), e^(k)⟩ ≥ 0 for positive definite A, with equality only when Ax^(k) = b; here e^(k) denotes the error x − x^(k), where x solves Ax = b (see Problem 4 of this section).
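As a concrete illustration, here is a minimal NumPy sketch of the steepest descent iteration (the function name, tolerance, and test matrix are illustrative choices, not from the lecture):

import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, max_iter=10000):
    """Minimize p(x) = <x, Ax> - 2<x, b> for symmetric positive definite A."""
    x = x0.astype(float)
    for k in range(max_iter):
        r = b - A @ x                 # residual = steepest descent direction y^(k)
        if np.linalg.norm(r) <= tol:
            return x, k
        t = (r @ r) / (r @ (A @ r))   # optimal step size t^(k)
        x = x + t * r
    return x, max_iter

# Example: a small symmetric positive definite system
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x, iters = steepest_descent(A, b, np.zeros(2))
print(x, iters)   # x approximates the solution of Ax = b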
Steepest Descent Method
What is the number of iterations?
General Questions for Non-stationary Iterative Methods
1. What is the direction of the step from x^(k) to x^(k+1)?
2. What is the step size?
3. How many iterations are needed?
A-orthonormal System
For a set of nonzero vectors {y_k}:

1. Orthogonal: ⟨y_i, y_j⟩ = 0 if i ≠ j.
2. Orthonormal: ⟨y_i, y_j⟩ = δ_ij.
3. A-orthonormal: ⟨y_i, Ay_j⟩ = δ_ij, for A symmetric and positive definite.
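An A-orthonormal system can be built from any linearly independent set by Gram–Schmidt with respect to the inner product ⟨u, v⟩_A = ⟨u, Av⟩. A minimal NumPy sketch, assuming A is symmetric positive definite (the function name and test matrix are illustrative):

import numpy as np

def a_orthonormalize(V, A):
    """Gram-Schmidt on the columns of V under the inner product <u, v>_A = u^T A v."""
    U = []
    for v in V.T:
        w = v.astype(float)
        for u in U:
            w = w - (u @ (A @ w)) * u   # remove the A-projection onto u
        w = w / np.sqrt(w @ (A @ w))    # normalize so that <w, Aw> = 1
        U.append(w)
    return np.column_stack(U)

# Check: U^T A U should be (numerically) the identity matrix
A = np.array([[4.0, 1.0], [1.0, 3.0]])
U = a_orthonormalize(np.eye(2), A)
print(U.T @ A @ U)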
A-orthonormal System
Let {u^(1), u^(2), . . . , u^(n)} be an A-orthonormal system. Define

x^(k) = x^(k−1) + ⟨b − Ax^(k−1), u^(k)⟩ u^(k),    (1 ≤ k ≤ n)

in which x^(0) is an arbitrary point in R^n. Then Ax^(n) = b.
Proof. Note the step size is t^(k) = ⟨b − Ax^(k−1), u^(k)⟩ and the iteration is

x^(k) = x^(k−1) + t^(k) u^(k),

and thus

Ax^(k) = Ax^(k−1) + t^(k) Au^(k).    (1)

But then

Ax^(k−1) = Ax^(k−2) + t^(k−1) Au^(k−1),

and so on. So

Ax^(n) = Ax^(0) + t^(1) Au^(1) + t^(2) Au^(2) + · · · + t^(n) Au^(n).

Taking the inner product of this vector with any u^(k), 1 ≤ k ≤ n, and using A-orthonormality (⟨Au^(i), u^(k)⟩ = δ_ik), we get

⟨Ax^(n), u^(k)⟩ = ⟨Ax^(0), u^(k)⟩ + t^(k).
Or

⟨Ax^(n) − b, u^(k)⟩ = ⟨Ax^(0) − b, u^(k)⟩ + t^(k).

In order to show that ⟨Ax^(n) − b, u^(k)⟩ = 0 we need to show that the right-hand side is 0. By definition,

t^(k) = ⟨b − Ax^(k−1), u^(k)⟩
      = ⟨b − Ax^(0), u^(k)⟩ + ⟨Ax^(0) − Ax^(1), u^(k)⟩ + · · · + ⟨Ax^(k−2) − Ax^(k−1), u^(k)⟩
      = ⟨b − Ax^(0), u^(k)⟩ + ⟨−t^(1) Au^(1), u^(k)⟩ + · · · + ⟨−t^(k−1) Au^(k−1), u^(k)⟩    (use Eq. (1))
      = ⟨b − Ax^(0), u^(k)⟩,

where the last step uses ⟨Au^(i), u^(k)⟩ = 0 for i < k. Thus ⟨Ax^(n) − b, u^(k)⟩ = 0 for every u^(k), and since the n vectors u^(k) form a basis of R^n, Ax^(n) − b must be 0 in R^n.
Normalization factors are needed if {u^(i)} is an A-orthogonal system rather than A-orthonormal; see Theorem 2 on page 237.
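A quick numerical check of the theorem: starting from an arbitrary x^(0) and stepping along an A-orthonormal basis, n steps recover the solution of Ax = b up to roundoff. The sketch below builds the basis from an eigendecomposition (if A = QΛQᵀ, the columns of U = QΛ^(−1/2) satisfy UᵀAU = I); the matrix and starting point are illustrative:

import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
lam, Q = np.linalg.eigh(A)
U = Q / np.sqrt(lam)              # column j of Q scaled by lam_j^(-1/2); U^T A U = I

x = np.array([5.0, -7.0])         # arbitrary starting point x^(0)
for k in range(len(b)):           # n steps, one per basis vector
    u = U[:, k]
    x = x + ((b - A @ x) @ u) * u # x^(k) = x^(k-1) + <b - Ax^(k-1), u^(k)> u^(k)
print(x, np.linalg.solve(A, b))   # the two agree up to roundoff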
How to produce an A-orthogonal system → the Conjugate Gradient Method
Input: x^(0), A, M, b, ε
r^(0) ← b − Ax^(0)
u^(0) ← r^(0)
while not convergent and k < M
    if u^(k) = 0 then stop
    t^(k) ← ⟨r^(k), r^(k)⟩ / ⟨u^(k), Au^(k)⟩
    x^(k+1) ← x^(k) + t^(k) u^(k)
    r^(k+1) ← r^(k) − t^(k) Au^(k)
    if ‖r^(k+1)‖ ≤ ε then stop
    s^(k) ← ⟨r^(k+1), r^(k+1)⟩ / ⟨r^(k), r^(k)⟩
    u^(k+1) ← r^(k+1) + s^(k) u^(k)
Major idea: use residual vectors to generate a set of orthogonal vectors
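A runnable NumPy version of the pseudocode above (the function name, convergence tolerance, and test problem are illustrative):

import numpy as np

def conjugate_gradient(A, b, x0, M=1000, eps=1e-10):
    """Conjugate gradient for symmetric positive definite A, following the pseudocode above."""
    x = x0.astype(float)
    r = b - A @ x                       # r^(0)
    u = r.copy()                        # u^(0)
    for k in range(M):
        if not np.any(u):               # u^(k) = 0: stop
            break
        Au = A @ u
        t = (r @ r) / (u @ Au)          # t^(k) = <r,r> / <u,Au>
        x = x + t * u                   # x^(k+1)
        r_new = r - t * Au              # r^(k+1)
        if np.linalg.norm(r_new) <= eps:
            return x
        s = (r_new @ r_new) / (r @ r)   # s^(k)
        u = r_new + s * u               # u^(k+1)
        r = r_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b, np.zeros(2)))   # ≈ np.linalg.solve(A, b)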
How to produce an A-orthogonal system → the Conjugate Gradient Method
• Prove that {u^(k)} is an A-orthogonal set
• Prove that {r^(k)} is an orthogonal set
• Prove that r^(i) = b − Ax^(i)
Please see Theorem 3 of Section 4.7 for the full proof. These properties can also be checked numerically, as in the sketch below.
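A self-contained check on a random symmetric positive definite system, running the iteration while storing the search directions and residuals (the matrix, size, and seed are illustrative):

import numpy as np

rng = np.random.default_rng(0)
n = 5
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)       # symmetric positive definite
b = rng.standard_normal(n)

x = np.zeros(n)
r = b - A @ x
u = r.copy()
xs, us, rs = [x], [u], [r]
for k in range(n):                # n steps; in exact arithmetic r^(n) = 0
    Au = A @ u
    t = (r @ r) / (u @ Au)
    x = x + t * u
    r_new = r - t * Au
    s = (r_new @ r_new) / (r @ r)
    u = r_new + s * u
    r = r_new
    xs.append(x); us.append(u); rs.append(r)

U = np.column_stack(us[:n])
R = np.column_stack(rs[:n])
off = lambda M: M - np.diag(np.diag(M))          # off-diagonal part
print(np.max(np.abs(off(U.T @ A @ U))))          # ≈ 0: {u^(k)} is A-orthogonal
print(np.max(np.abs(off(R.T @ R))))              # ≈ 0: {r^(k)} is orthogonal
print(np.max(np.abs(rs[2] - (b - A @ xs[2]))))   # ≈ 0: r^(i) = b - Ax^(i)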