Lecture 23 18.086

Report due date
• Please note: the report has to be handed in by Monday, May 16, noon.
• Course evaluation: Please submit your feedback (you should have gotten an email).

From the weak form to finite elements
• The finite element method has big advantages over finite differences for complex geometries that do not fit a regular mesh discretization, such as airplanes, turbines, or moving pistons. We will discuss the finite element method for a simple model problem.
• Electrostatics: The electrostatic potential φ and the charge distribution ρ fulfill the Poisson equation (1D):
  φ''(x) = −4πρ(x)  on the interval Ω = [a, b]   (3.29)
• Alternative system: ρ the mass density, φ the gravitational potential.
• Dirichlet BCs to simplify the analysis (but not necessary):
  φ(a) = φ_a,  φ(b) = φ_b   (3.30)
• We have seen that this can be expressed in its weak form: for all test functions v,
  ∫_a^b (φ''v + 4πρv) dx = 0
or, after integration by parts,
  ∫_a^b (−φ'v' + 4πρv) dx + φ'v|_a^b = 0   for all test functions v.
• We expand the solution φ(x) in terms of basis functions {v_i}:
  φ(x) = Σ_{i=1}^∞ a_i v_i(x)   (3.31)
• In practice the infinite basis set needs to be truncated, choosing N linearly independent, but not necessarily orthogonal, functions satisfying the BCs.

FE approximation
• Plugging φ(x) = Σ_{i=1}^N φ_i v_i(x) into the weak form we get:
  Find coefficients φ_i such that
  ∫_a^b [ −(Σ_{i=1}^N φ_i v_i'(x)) v'(x) + 4πρ(x)v(x) ] dx = 0   for all v ∈ H¹_D(Ω)
• What about those v's? Since we have a basis and the problem is linear in v and v', we can just write the equivalent problem:
  Find coefficients φ_i such that
  ∫_a^b [ −(Σ_{i=1}^N φ_i v_i'(x)) v_j'(x) + 4πρ(x)v_j(x) ] dx = 0   for j = 1, ..., N
• Rearranging the terms a bit, the equation can be written as:
  Solve A φ = b  with  φ = (φ_1, φ_2, ..., φ_N),
  A_ij = ∫_a^b v_i'(x) v_j'(x) dx,   b_i = 4π ∫_a^b ρ(x) v_i(x) dx
• Since we need to invert a matrix, it makes sense to choose the basis such that most entries in A are zero! This is the case if different v_i have only very small support (hence the name "finite elements").

Nonlinear finite elements
• The finite element method can also be applied to nonlinear partial differential equations without big changes. Let us consider a simple example:
  Solve φ(x) φ''(x) = −4πρ(x)   (3.41)
• Equivalent statement (weak form): Find φ such that for all v_i
  ∫_a^b [ φ(x)φ''(x) + 4πρ(x) ] v_i(x) dx = 0   (3.42)
• Making the same ansatz as before and minimizing the residuals, the same derivation now results in coupled nonlinear equations instead of a linear system:
  Σ_{i,j} A_ijk φ_i φ_j = b_k   (1)   (3.43)
with b_k as before but
  A_ijk = ∫_a^b v_i(x) v_j''(x) v_k(x) dx   (3.44)
• We now need a nonlinear root-finding algorithm to solve (1) for the φ_i. This is the main difference between the case of linear and nonlinear partial differential equations.

Nonlinear FE
• How to solve Σ_{i,j} A_ijk φ_i φ_j = b_k ?
• Write it as r_k(φ) = Σ_{i,j} A_ijk φ_i φ_j − b_k and minimize |r(φ)|²₂.
• We thus need methods to solve nonlinear optimization problems.

Optimization problems
• Quite general: Let f : Rⁿ → (−∞, ∞], find x* s.t. f(x*) = min_{x∈Rⁿ} f(x).
• Hard in general, but some cases, like f convex, are fairly solvable.
• Simplest problem: How about f : Rⁿ → R, differentiable?
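To make the linear FE system concrete, here is a minimal Python sketch that assembles and solves A φ = b for the 1D Poisson problem. It assumes piecewise-linear "hat" basis functions on a uniform grid, homogeneous Dirichlet BCs, and a simple one-point quadrature for the load vector; the function names and grid choices are illustrative, not taken from the lecture.

```python
import numpy as np

def solve_poisson_fem(rho, a=0.0, b=1.0, N=100):
    """FE solution of phi'' = -4*pi*rho on [a, b] with phi(a) = phi(b) = 0,
    using piecewise-linear 'hat' basis functions on a uniform grid (illustrative sketch)."""
    x = np.linspace(a, b, N + 2)          # grid including the two boundary nodes
    h = x[1] - x[0]
    # Stiffness matrix A_ij = int v_i' v_j' dx: tridiagonal for hat functions
    A = (np.diag(2.0 * np.ones(N))
         - np.diag(np.ones(N - 1), 1)
         - np.diag(np.ones(N - 1), -1)) / h
    # Load vector b_i = 4*pi * int rho(x) v_i(x) dx, approximated here by rho(x_i)*h
    bvec = 4.0 * np.pi * rho(x[1:-1]) * h
    phi_interior = np.linalg.solve(A, bvec)   # in practice: sparse/tridiagonal solver
    return x, np.concatenate(([0.0], phi_interior, [0.0]))

# Example: uniform charge density
x, phi = solve_poisson_fem(lambda x: np.ones_like(x))
```

Because each hat function overlaps only its two neighbors, A is tridiagonal, which is exactly the sparsity argument made above.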
• Let's assume f is differentiable… Then the problem becomes a root-finding problem:
  find x* s.t. ∇f(x*) = 0
• We have a reasonable shot at this, especially if f is twice differentiable.

The simplest we can get
• Let's first consider a particular form of f, quadratic optimization:
  f(x) = ½ xᵀAx − xᵀb + c
• Very common (actually universal). Interpretation: an energy, quadratic in x. We have seen these kinds of energies in the context of CG.
• Taylor expansion:
  f(x + Δx) = f(x) + (Δx)ᵀ∇f(x) + ½ (Δx)ᵀ∇∇f(x)Δx + ···
• Finding ∇f(x) = 0:
  ∇f(x) = Ax − b = 0   ⟹   x* = A⁻¹b
• In other words, quadratic optimization amounts to a single matrix inversion (using direct methods or iterative methods such as CG).
• Does this mean A has to be invertible? Is this all we need? But A has to be more than invertible!

Max, min, saddle, or what? Positive definiteness
• This requires A to be positive definite, why? A not only needs to be invertible, but also positive definite.
• Otherwise, ∇f(x) = 0 could also be a maximum, a saddle point, or a "line"!
• [Figure: quadratic surfaces showing a minimum, a maximum, a saddle point, and a degenerate "line" of minima]
• Positive definiteness is crucial to guarantee that we found a minimum by matrix inversion for quadratic energies!
• For general energies, positive definiteness of the Hessian requires convex f(x).
(image source: Ross A. Lippert, D. E. Shaw Research)

Nonlinear root-finding
• What do we do if f(x) is more complicated?
• If you have no clue, Taylor you can do!
• To start, let's consider the 1D case: Minimize f(x). If x* is the solution, we can write
  f(x*) = f(x) + f'(x)(x* − x) + ½ f''(x)(x* − x)² + ...
• We need to find x* such that f'(x*) = 0, thus
  Δx = x* − x ≈ −f'(x)/f''(x),   x_{n+1} = x_n − f'(x_n)/f''(x_n)
• Instead of minimizing f, we could also search for a root of g(x) = f'(x):
  x_{n+1} = x_n − g(x_n)/g'(x_n)

Nonlinear root-finding in higher dimensions
• The Taylor expansion is now
  f(x*) ≈ f(x) + (x* − x)ᵀ∇f(x) + ½ (x* − x)ᵀ∇∇f(x)(x* − x) + ...
• ∇∇f(x) is the matrix of second derivatives (the Hessian, i.e. the Jacobian of ∇f).
• Again, min f(x) means ∇f(x*) = 0 for the solution x*:
  Δx = x* − x ≈ −(∇∇f(x))⁻¹ ∇f(x)

Naive nonlinear root solver (Newton-Raphson method)
• Algorithm: Newton's method for finding x s.t. ∇f(x) = 0. Start with an initial guess x_0, then iterate
  Δx_i = −(∇∇f(x_i))⁻¹ ∇f(x_i)
  x_{i+1} = x_i + Δx_i
until |Δx_i| < tolerance or a maximum number of steps is reached (see the sketch below).
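A minimal Python sketch of the Newton-Raphson iteration just described; the function names, tolerance, and the quadratic test energy are illustrative assumptions, not from the lecture.

```python
import numpy as np

def newton_minimize(grad, hess, x0, tol=1e-10, max_steps=50):
    """Naive Newton-Raphson iteration for solving grad f(x) = 0 (illustrative sketch).
    grad(x) returns the gradient vector, hess(x) the Hessian matrix."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_steps):
        dx = -np.linalg.solve(hess(x), grad(x))   # solve a linear system at every step
        x = x + dx
        if np.linalg.norm(dx) < tol:              # stop when the update is small enough
            break
    return x

# Example: quadratic energy f(x) = 1/2 x^T A x - b^T x converges in a single step
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x_min = newton_minimize(lambda x: A @ x - b, lambda x: A, np.zeros(2))
```

For the quadratic energy the first Newton step already lands on x* = A⁻¹b, which is exactly the "single matrix inversion" statement above; the iteration only matters for general nonlinear f.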
Properties of the Newton step, and where it fails…
• If ∇∇f(x_i) is positive definite, (∇f(x_i))ᵀ(x_{i+1} − x_i) < 0, so Δx_i is a direction of decrease (could overshoot).
• If ∇∇f(x_i) is not positive definite, Δx_i might be in an increasing direction.
• If f is convex, f(x_{i+1}) ≤ f(x_i), so these problems go away.
• This means we need to invert the Hessian, i.e. solve again a linear problem! However (as in 1D), we have to do this many times (at each iteration!).

Example in 1D
• 1D example of trouble: f(x) = x⁴ − 2x² + 12x
• [Figure: plot of f(x) on [−2, 1.5]]
• Has one local minimum.
• Is not convex (note the concavity near x = 0).
• Derivative of trouble: f'(x) = 4x³ − 4x + 12
• [Figure: plot of f'(x) on [−2, 1.5]]
• f'' is negative around x = 0: this region repels the iterates and is a barrier to reaching the solution!
(source: Ross A. Lippert, D. E. Shaw Research)

Line search methods
• Algorithm: Try to enforce f(x_{i+1}) < f(x_i) by going in the descent direction only "far enough":
  Δx_i = −(∇∇f(x_i))⁻¹ ∇f(x_i)
  x_{i+1} = x_i + α_i Δx_i
• Pick α_i such that f(x_i + α_i Δx_i) ≤ f(x_i). If Δx_i is a direction in which f decreases, there will be some α_i > 0 such that f decreases, so such an α_i exists.
• But since f is nonlinear, we would in principle need to solve the nonlinear 1D minimization problem
  min_{α_i ∈ (0, β]} f(x_i + α_i Δx_i)
• Armijo search uses this rule instead: α_i = ρμⁿ for some n, where s = ρμⁿ satisfies
  f(x_i + sΔx_i) − f(x_i) ≤ νs (Δx_i)ᵀ∇f(x_i),
with ρ, μ, ν fixed (e.g. ρ = 2, μ = ν = ½).
• It's getting complicated… Convergence very much depends on the quality of the initial guess.

Iterative methods
• Instead of solving ∇f = 0, it is often easier to just minimize f, using e.g. gradient descent and a good guess for the step length (see the sketch after this list):
  1. Start with an initial guess x_0.
  2. Search direction: r_i = −∇f(x_i)
  3. Search step: x_{i+1} = x_i + α_i r_i
  4. Pick alpha (depends on what's cheap):
     1. Find the optimal α_i for the linearized problem: α_i = r_iᵀr_i / (r_iᵀ(∇∇f)r_i)
     2. 1D minimization by approximation of f(x_i + αr_i) (danger: low quality)
     3. Zero-finding: r_iᵀ∇f(x_i + αr_i) = 0
     etc.
• Can be extended to nonlinear CG as well.
• Do not need second-derivative evaluations!
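A minimal gradient-descent sketch in Python with a backtracking step length in the spirit of the Armijo rule above; the parameter values (alpha0, mu, nu) and the test function are illustrative choices, not prescribed by the lecture.

```python
import numpy as np

def gradient_descent(f, grad, x0, alpha0=1.0, mu=0.5, nu=1e-4, tol=1e-8, max_iter=1000):
    """Gradient descent with a backtracking (Armijo-style) step length (illustrative sketch)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = -grad(x)                      # search direction: steepest descent
        if np.linalg.norm(r) < tol:
            break
        alpha = alpha0
        # Shrink alpha until the sufficient-decrease condition holds
        while f(x + alpha * r) - f(x) > -nu * alpha * (r @ r):
            alpha *= mu
        x = x + alpha * r
    return x

# Example: the non-convex 1D "trouble" function f(x) = x^4 - 2x^2 + 12x
f = lambda x: x[0]**4 - 2 * x[0]**2 + 12 * x[0]
grad = lambda x: np.array([4 * x[0]**3 - 4 * x[0] + 12])
x_min = gradient_descent(f, grad, np.array([1.0]))   # heads to the single local minimum
```

Note that this variant uses only gradient evaluations: the backtracking rule replaces both the exact 1D minimization and the Hessian-based step-length formula.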
What if f has many local minima, is not differentiable, or x is discrete?
• Traveling salesman problem: Find the shortest route that visits all cities once!
• Here, f = length of the route, x = ordered indices of the cities to visit.
• Many different methods available: Monte Carlo, simulated annealing, genetic algorithms, etc.

What if f has many local minima?
• Rough energy landscape: gradient descent fails.
• General idea: Make sure we don't get stuck in a local minimum!

Genetic algorithm
• Sketch: Start with a population of "genes" (for instance, random values of x in f(x)).
• Calculate the cost for each. Then:
  • Generate new x by crossing "parents" of good fitness.
  • Incorporate random mutations into the children (=> overcomes local minima).
• Run long enough until most of the population converges to an optimal set of "genes" (=> best values of the vector x).

Monte Carlo
• Idea: Find a way to statistically sample all states according to a well-defined probability (lower energy states = higher probability).
• Need: A way to walk through phase space such that there is a finite probability to visit each state in a finite number of steps.
• Example: Particle in an energy landscape, coupled to a heat bath (temperature T).
• Statistical physics: the probability of a state with energy E is given by
  p(E) = (1/Z) e^{−βE},   β = 1/(k_B T)
• In order to visit all possible states with the correct probability, the Metropolis algorithm can be used (see the sketch below).

Metropolis algorithm
• Make a trial move from the current state x to a new randomly selected state x'.
• Calculate the energy change dE = E' − E.
• If dE < 0, accept the new state x' and repeat.
• If dE > 0, accept the state with probability e^{−β dE}:
  • Draw a random number r in [0, 1).
  • Accept the state if e^{−β dE} > r.

Application: Packing problems (see class)
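A minimal sketch of the Metropolis algorithm in Python, assuming a generic energy function E(x), Gaussian trial moves, and a fixed inverse temperature β; the step size and the example landscape are illustrative assumptions.

```python
import numpy as np

def metropolis(energy, x0, beta=1.0, step=0.5, n_steps=10000, rng=None):
    """Metropolis sampling of states x with probability ~ exp(-beta * E(x)) (illustrative sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    E = energy(x)
    best_x, best_E = x.copy(), E
    for _ in range(n_steps):
        x_new = x + step * rng.normal(size=x.shape)   # trial move to a nearby random state
        dE = energy(x_new) - E
        # Accept if the energy decreases, otherwise with probability exp(-beta*dE)
        if dE < 0 or np.exp(-beta * dE) > rng.random():
            x, E = x_new, E + dE
            if E < best_E:
                best_x, best_E = x.copy(), E
    return best_x, best_E

# Example: rough 1D energy landscape with many local minima
energy = lambda x: np.sin(5 * x[0]) + 0.1 * x[0]**2
x_best, E_best = metropolis(energy, np.array([3.0]), beta=2.0)
```

The occasional uphill acceptance is what lets the walker escape local minima; lowering the temperature over time (simulated annealing) gradually turns the sampler into a minimizer.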