Math 1321
Week 8 Lab Worksheet
Due Thursday 03/07
1. Find ∇f(x, y, z) if f(x, y, z) = x² + 3xz + z²y.
Solution:
∇f(x, y, z) = ⟨∂f/∂x, ∂f/∂y, ∂f/∂z⟩
            = ⟨2x + 3z, z², 3x + 2zy⟩
            = (2x + 3z)i + (z²)j + (3x + 2zy)k
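As a quick sanity check (not part of the original worksheet), the gradient can be recomputed symbolically. A minimal sketch using SymPy:

    import sympy as sp

    # Symbolic check of the gradient computed above.
    x, y, z = sp.symbols('x y z')
    f = x**2 + 3*x*z + z**2 * y

    grad = [sp.diff(f, var) for var in (x, y, z)]
    print(grad)  # expect [2*x + 3*z, z**2, 3*x + 2*y*z]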
2. Consider Newton's second law F = ma. Suppose that the force is F = −∇U, where U = U(r). Let r = r(t) = [x(t), y(t), z(t)] be the trajectory along which Newton's law is satisfied (i.e. a(t) = r″(t)). Prove that the quantity E = mv²/2 + U(r) is constant with respect to t, that is dE/dt = 0, where v = ‖r′(t)‖. This constant is called the total energy of the particle. (Notice that mv²/2 represents the kinetic energy and U represents the potential energy of the particle.)
Solution: We want to show dE/dt = 0, where

E = mv²/2 + U(r)

Differentiating,

dE/dt = (m/2) d(v²)/dt + dU/dt
Now notice that, writing v = r′(t) for the velocity vector, we can write the following:

v² = v · v
Hence,
d(v²)/dt = d(v · v)/dt = v′ · v + v · v′ = 2v · v′ = 2v · a
Also,
dU/dt = (∂U/∂x) x′(t) + (∂U/∂y) y′(t) + (∂U/∂z) z′(t) = r′ · ∇U = v · ∇U
This gives,
dE/dt = mv · a + v · ∇U = v · (ma − F) = 0
So the total energy is conserved for the trajectory of the motion.
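The proof can also be checked numerically. The sketch below is my own illustration (the potential, mass, and integrator are assumptions, not from the worksheet): take m = 1 and the harmonic potential U(r) = ‖r‖²/2, so that F = −∇U = −r, integrate Newton's law with velocity Verlet steps, and watch E = mv²/2 + U(r):

    import numpy as np

    # Check energy conservation for m = 1 and U(r) = ||r||^2 / 2 (so F = -r).
    m = 1.0
    r = np.array([1.0, 0.0, 0.0])   # initial position
    v = np.array([0.0, 1.0, 0.0])   # initial velocity
    dt = 1e-3

    def force(r):
        return -r                    # F = -grad U

    def energy(r, v):
        return 0.5 * m * (v @ v) + 0.5 * (r @ r)   # E = m v^2/2 + U(r)

    E0 = energy(r, v)
    a = force(r) / m
    for _ in range(10_000):
        r = r + v * dt + 0.5 * a * dt**2    # velocity Verlet: position update
        a_new = force(r) / m
        v = v + 0.5 * (a + a_new) * dt      # velocity Verlet: velocity update
        a = a_new

    print(abs(energy(r, v) - E0))   # small drift, shrinking as dt -> 0

The printed drift is the integrator's truncation error rather than an exact zero; the exact statement dE/dt = 0 holds for the true trajectory.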
3. Gradient Descent: Gradient descent (also known as the steepest descent method) is an iterative method used to find the minimum of a function F. The method is given an initial point r₀, and it follows the negative of the gradient in order to move the point toward a critical point, which is hopefully the desired local minimum. The equation below is the general formulation of the iterative procedure (update equation):
rₙ₊₁ = rₙ − µ∇F(rₙ)
Gradient descent is popular for very large-scale optimization problems because it is easy
to implement and each iteration is computationally cheap. In the following questions,
you will be verifying the gradient descent algorithm for a simple quadratic bowl function
and making inferences about the gradient descent step size µ.
(a) Consider the convex function f(x, y) = x² + y². We can rewrite the gradient descent update equation in terms of x and y by using the fact that rₙ = (xₙ, yₙ). This gives the following scalar update equations:

xₙ₊₁ = xₙ − µ (∂f/∂x)(xₙ, yₙ)
yₙ₊₁ = yₙ − µ (∂f/∂y)(xₙ, yₙ)
Derive the gradient descent update equations for the x and y components for the function f(x, y) given above.
Solution:
xₙ₊₁ = xₙ − 2µxₙ = (1 − 2µ)xₙ
yₙ₊₁ = yₙ − 2µyₙ = (1 − 2µ)yₙ

since ∂f/∂x = 2x and ∂f/∂y = 2y.
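A minimal Python sketch of this update (the names grad_f and gradient_descent are mine, not the worksheet's), specializing the general update equation to ∇f(x, y) = (2x, 2y):

    import numpy as np

    def grad_f(r):
        """Gradient of f(x, y) = x^2 + y^2 at r = (x, y)."""
        return 2.0 * r

    def gradient_descent(r0, mu, n_steps):
        """Iterate r_{n+1} = r_n - mu * grad_f(r_n); return all iterates."""
        r = np.asarray(r0, dtype=float)
        path = [r.copy()]
        for _ in range(n_steps):
            r = r - mu * grad_f(r)   # for this f: r <- (1 - 2*mu) * r
            path.append(r.copy())
        return path

For this f, each step just multiplies both coordinates by 1 − 2µ, which is exactly the scalar form derived above.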
(b) Let r₀ = (x₀, y₀) = (5, 3) be the initial values. Compute the first 5 iterations of the x and y updates (i.e. n = 0, 1, 2, 3, 4) when µ = 0.25. Can you guess what the minimum of the function is from your calculations?
Solution:
n      xₙ       yₙ
0   5.0000   3.0000
1   2.5000   1.5000
2   1.2500   0.7500
3   0.6250   0.3750
4   0.3125   0.1875
5   0.1563   0.0938
6   0.0781   0.0469
7   0.0391   0.0234
8   0.0195   0.0117
From the values computed, we can infer that the iterates (xₙ, yₙ) converge to (0, 0), so the minimum of f(x, y) occurs at the origin.
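The table above can be reproduced with a short script (my own sketch, not worksheet code); the printed values agree with the table up to the rounding of the last digit:

    # Iterate the part (a) updates for f(x, y) = x^2 + y^2,
    # starting from (x0, y0) = (5, 3) with step size mu = 0.25.
    mu = 0.25
    x, y = 5.0, 3.0
    for n in range(9):
        print(f"{n}  {x:.4f}  {y:.4f}")
        x, y = (1 - 2 * mu) * x, (1 - 2 * mu) * y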
(c) Repeat (b) for when µ = 0.45 and µ = 0.75. What can you say about the choice of
the step size µ?
Solution: For µ = 0.45, we have

n      xₙ       yₙ
0   5.0000   3.0000
1   0.5000   0.3000
2   0.0500   0.0300
3   0.0050   0.0030
4   0.0005   0.0003
5   0.0000   0.0000
6   0.0000   0.0000
7   0.0000   0.0000
8   0.0000   0.0000
9   0.0000   0.0000
For µ = 0.75, we have

n      xₙ       yₙ
0   5.0000   3.0000
1  -2.5000  -1.5000
2   1.2500   0.7500
3  -0.6250  -0.3750
4   0.3125   0.1875
5  -0.1563  -0.0938
6   0.0781   0.0469
7  -0.0391  -0.0234
8   0.0195   0.0117
9  -0.0098  -0.0059
Increasing the step size µ can speed up convergence, but if the step size is too large the iterates overshoot the minimum and oscillate around it. For this function each step multiplies both coordinates by 1 − 2µ, so µ = 0.45 contracts by a factor of 0.1 per step, while µ = 0.75 gives a factor of −0.5: the magnitude still shrinks, but the sign flips at every step. Gradient descent is therefore quite sensitive to the choice of µ.
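The effect of each choice of µ can be read off directly from the per-step multiplier 1 − 2µ, as this short sketch illustrates (my framing, not the worksheet's):

    # Per-step multiplier 1 - 2*mu for the three step sizes considered above.
    # |multiplier| < 1: iterates shrink toward the minimum;
    # multiplier < 0: the sign flips every step (oscillation).
    for mu in (0.25, 0.45, 0.75):
        k = 1 - 2 * mu
        print(f"mu = {mu:.2f}: multiplier = {k:+.2f}")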