Lecture 23 18.086

From the weak form to the finite element method

The finite element method has big advantages over finite differences for irregular mesh discretization and complex geometries such as moving pistons.

• Electrostatics: The electrostatic potential and the charge distribution fulfill the Poisson equation (1D):

$\phi''(x) = -4\pi\rho(x)$ on the interval $\Omega = [a, b]$ (3.29)

• Alternative system: $\rho$ the mass density, $\phi$ the gravitational potential

• Dirichlet BCs to simplify the analysis (but not necessary): $\phi(0) = \phi(1) = 0$ (3.30); on a general interval,

$\phi(a) = \phi_a, \quad \phi(b) = \phi_b$

• We have seen that this can be expressed in its weak form as

$\int_a^b \left( \phi'' v + 4\pi\rho v \right) dx = 0$ for all test functions $v$

or

$\int_a^b \left( -\phi' v' + 4\pi\rho v \right) dx + \phi' v \,\big|_a^b = 0$ for all test functions $v$

and the solution $\phi(x)$ in terms of basis functions $\{v_i\}$, $i = 1, \dots, \infty$:

$\phi(x) = \sum_{i=1}^{\infty} a_i v_i(x)$. (3.31)

In practice the infinite basis set needs to be truncated, choosing a finite set of N linearly independent, but not necessarily orthogonal, basis functions that satisfy the BCs. Below, the expansion coefficients are written $\phi_i$.

FE approximation

• Plugging $\phi(x) = \sum_{i=1}^{N} \phi_i v_i(x)$ into the weak form we get: Find coefficients $\phi_i$ such that

$\int_a^b \left( -\left( \sum_{i=1}^{N} \phi_i v_i'(x) \right) v'(x) + 4\pi\rho(x) v(x) \right) dx = 0$ for all $v \in H^1_D(\Omega)$

(the boundary term drops out because these test functions vanish at a and b).

• What about those v's? Since we have a basis and the problem is linear in v and v', we can just write the equivalent problem: Find coefficients $\phi_i$ such that

$\int_a^b \left( -\left( \sum_{i=1}^{N} \phi_i v_i'(x) \right) v_j'(x) + 4\pi\rho(x) v_j(x) \right) dx = 0$ for $j = 1, \dots, N$

• Rearranging the terms a bit, the equation can be written as: Solve $A\vec{\phi} = \vec{b}$ with

$\vec{\phi} = (\phi_1, \phi_2, \dots, \phi_N), \qquad A_{ij} = \int_a^b v_i'(x) v_j'(x)\, dx, \qquad b_i = 4\pi \int_a^b \rho(x) v_i(x)\, dx$

• Since we need to invert a matrix, it makes sense to choose the basis such that most entries in A are zero! This is the case if the different $v_i$ have only very small, barely overlapping support (hence the name "finite elements")

Nonlinear finite elements

The finite element method can also be applied to non-linear partial differential equations without any big changes. Let us consider a simple example:

• Solve

$\phi(x) \frac{d^2\phi}{dx^2}(x) = -4\pi\rho(x)$ (3.41)

• Equivalent statement: Find $\phi$ such that for all $v_i$:

$\int_a^b \left[ \phi(x)\phi''(x) + 4\pi\rho(x) \right] v_i(x)\, dx = 0$

• The same derivation as before now results in nonlinear coupled equations instead of a linear equation:

$\sum_{i,j} A_{ijk} \phi_i \phi_j = b_k$ (1)

with $b_k$ as before but

$A_{ijk} = -\int_a^b v_i(x) v_j''(x) v_k(x)\, dx$

• We now need a nonlinear root-finding algorithm to solve (1) for $\phi_i$.
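Looking back at the linear system $A\vec{\phi} = \vec{b}$ from the FE approximation, the following is a minimal sketch of how it could be assembled and solved. Everything specific here is an assumption made for illustration and not part of the lecture: piecewise-linear "hat" basis functions on a uniform grid, homogeneous Dirichlet BCs, the load integral approximated by $\rho(x_i)\,h$, a dense matrix, and the function name `solve_poisson_fe`.

```python
import numpy as np

def solve_poisson_fe(rho, a=0.0, b=1.0, n_interior=50):
    """Sketch: solve phi'' = -4*pi*rho on [a, b] with phi(a) = phi(b) = 0.

    Assumes hat functions v_i centred on the interior nodes of a uniform grid.
    """
    h = (b - a) / (n_interior + 1)
    x = a + h * np.arange(1, n_interior + 1)  # interior nodes x_1 .. x_N

    # A_ij = integral of v_i' v_j': only neighbouring hats overlap, so A is
    # tridiagonal with 2/h on the diagonal and -1/h next to it.
    A = (2.0 * np.eye(n_interior)
         - np.eye(n_interior, k=1)
         - np.eye(n_interior, k=-1)) / h

    # b_i = 4*pi * integral of rho * v_i, approximated by 4*pi*rho(x_i)*h
    # (assumption: rho varies slowly over one element).
    rhs = 4.0 * np.pi * rho(x) * h

    phi = np.linalg.solve(A, rhs)  # dense solve; a sparse solver would exploit the zeros
    return x, phi

# Usage: constant charge density rho = 1, exact solution phi = 2*pi*x*(1 - x).
x, phi = solve_poisson_fe(lambda s: np.ones_like(s), n_interior=20)
```

Because the hat functions overlap only with their nearest neighbours, almost all entries of A vanish, which is exactly the sparsity argument made above.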
Nonlinear FE

• How to solve $\sum_{i,j} A_{ijk} \phi_i \phi_j = b_k$ ?
• Write as $r_k(\vec{\phi}) = \sum_{i,j} A_{ijk} \phi_i \phi_j - b_k$
• Minimize $|\vec{r}(\vec{\phi})|_2^2$

We thus need methods to solve nonlinear problems.

• Quite general: given f, we need to find $\min_{x \in R^n} \{f(x)\}$, i.e. for $f : R^n \to (-\infty, \infty]$ find $x^*$ s.t. $f(x^*) = \min_{x \in R^n} \{f(x)\}$
• Quite general, but some cases, like f convex, are fairly solvable.
• How about $f : R^n \to R$, differentiable? Let's assume f is differentiable… Then the problem becomes a root-finding problem: find $x^*$ s.t. $\nabla f(x^*) = 0$
• We have a reasonable shot at this, especially if f is twice differentiable.

Nonlinear root-finding

• Let's first consider a particular form of f. Quadratic optimization:

$f(x) = \tfrac{1}{2} x^t A x - x^t b + c$

• Very common (actually universal), because of the Taylor expansion

$f(x + \Delta x) = f(x) + (\Delta x)^t \nabla f(x) + \tfrac{1}{2} (\Delta x)^t \nabla\nabla f(x) \Delta x + \cdots$

• Interpretation: Energy, quadratic in x. We have seen these kinds of energies in the context of CG.
• Finding $\nabla f(x) = 0$ (for symmetric A):

$\nabla f(x) = Ax - b = 0 \quad \Rightarrow \quad x^* = A^{-1} b$

• In other words, quadratic optimization amounts to a single matrix inversion (using direct methods or iterative methods such as CG).
• Does this mean A has to be invertible? Is this all we need?
• But A has to be more than invertible!

(R. A. Lippert, Non-linear optimization)

Positive definiteness: max, min, saddle, or what?

Quadratic optimization: $f(x) = \tfrac{1}{2} x^t A x - x^t b + c$. Requires A to be positive definite. Why?

• A not only needs to be invertible, but also positive definite

[Figure: surface plots of quadratic functions f(x) for different matrices A.]

Otherwise, ∇f (x) = 0 could also be a maximum or saddle point or a "line"!
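The role of definiteness can be checked numerically. The sketch below classifies the critical point of $f(x) = \tfrac{1}{2} x^t A x - x^t b + c$ from the eigenvalues of a symmetric A and only solves $Ax = b$ when A is positive definite. The function names, the tolerance `tol`, and the label "degenerate" for the semi-definite ("line") case are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def classify_quadratic(A, tol=1e-12):
    """Classify the critical point of 0.5*x^T A x - x^T b + c (A symmetric)."""
    eig = np.linalg.eigvalsh(A)  # real eigenvalues of a symmetric matrix
    if np.all(eig > tol):
        return "minimum"         # positive definite
    if np.all(eig < -tol):
        return "maximum"         # negative definite
    if np.any(eig > tol) and np.any(eig < -tol):
        return "saddle"          # indefinite
    return "degenerate"          # some eigenvalue is (numerically) zero

def minimize_quadratic(A, b):
    """Return x* = A^{-1} b, but only if that point really is a minimum."""
    if classify_quadratic(A) != "minimum":
        raise ValueError("A is not positive definite: no unique minimum")
    return np.linalg.solve(A, b)

# Usage with a small positive definite example.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x_star = minimize_quadratic(A, b)
```

For large sparse matrices one would not compute all eigenvalues; attempting a Cholesky factorization is a common cheaper test, mentioned here only as a design note.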

• Positive definiteness is crucial to guarantee that we found a minimum by matrix inversion for quadratic energies!

• For general energies, positive definiteness requires convex f(x)

Nonlinear root-finding

• What do we do if f(x) is more complicated?
• If you have no clue, Taylor you can do!
• To start, let's consider the 1D case:
• Minimize f(x): If x* is the solution, we can write

$f(x^*) = f(x) + f'(x)(x^* - x) + \tfrac{1}{2} f''(x)(x^* - x)^2 + \dots$

• Need to find x* such that f'(x*) = 0. Differentiating the expansion with respect to x* and setting the result to zero gives

$\Delta x = x^* - x \approx -\frac{f'(x)}{f''(x)}, \qquad x_{n+1} = x_n - \frac{f'(x_n)}{f''(x_n)}$

• Instead of minimizing f, we could also search for a root of g(x) = f'(x):

$x_{n+1} = x_n - \frac{g(x_n)}{g'(x_n)}$

• Now in higher dimensions: Taylor is now

$f(x^*) \approx f(x) + (x^* - x)^T \nabla f(x) + \tfrac{1}{2} (x^* - x)^T \nabla\nabla f(x) (x^* - x) + \dots$

• $\nabla\nabla f(x)$ is the matrix of second derivatives (the Hessian of f, i.e. the Jacobian of $\nabla f$)
• Again, min f(x) means $\nabla f(x^*) = 0$ for the solution x*:

$\Delta x = x^* - x \approx -(\nabla\nabla f(x))^{-1} \nabla f(x)$

Naive nonlinear root solver (Newton-Raphson method)

• Algorithm (Newton's method, finding x s.t. $\nabla f(x) = 0$): Start with an initial guess $x_0$. Then iterate

$\Delta x_i = -(\nabla\nabla f(x_i))^{-1} \nabla f(x_i), \qquad x_{i+1} = x_i + \Delta x_i$

until $|\Delta x_i| <$ tolerance or the max. number of steps is reached
• This means we need to invert the Hessian, i.e. solve again a linear problem!
• However (as in 1D), we have to do this many times (at each iteration!)
• If $\nabla\nabla f(x_i)$ is posdef, $(\nabla f(x_i))^t (x_{i+1} - x_i) < 0$, so $\Delta x_i$ is a direction of decrease (could overshoot)
• If $\nabla\nabla f(x_i)$ is not posdef, $\Delta x_i$ might be in an increasing direction
• If f is convex, $f(x_{i+1}) \le f(x_i)$, so problems go away
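The Newton-Raphson iteration above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the name `newton_minimize`, the tolerance, the step limit and the convex test function $f(x) = \tfrac{1}{2} x^t A x - b^t x + \tfrac{1}{4} \sum_k x_k^4$ are all made up for the example. Note that the Hessian is never inverted explicitly; each step solves a linear system instead.

```python
import numpy as np

def newton_minimize(grad, hess, x0, tol=1e-10, max_steps=50):
    """Newton-Raphson search for a point with grad f(x) = 0."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_steps):
        # Solve (Hessian) dx = -gradient: one linear problem per iteration.
        dx = np.linalg.solve(hess(x), -grad(x))
        x = x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x

# Assumed convex test function: f(x) = 0.5 x^T A x - b^T x + 0.25 * sum(x^4)
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])

def grad(x):
    return A @ x - b + x**3

def hess(x):
    return A + 3.0 * np.diag(x**2)

x_star = newton_minimize(grad, hess, x0=np.zeros(2))
```

For this convex example the Hessian is positive definite everywhere, so every step is a descent direction and the iteration settles after a handful of steps.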

1D example of trouble: $f(x) = x^4 - 2x^2 + 12x$

[Figure: plot of f(x).]
Has one local minimum
Is not convex (note the concavity near x = 0)

1D example of trouble: $f'(x) = 4x^3 - 4x + 12$

[Figure: plot of f'(x).]
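To see the trouble concretely, here is a small sketch that applies the 1D Newton update $x_{n+1} = x_n - f'(x_n)/f''(x_n)$ to this example from two starting points. The helper name `newton_1d`, the starting points and the number of steps are choices made for illustration only.

```python
def fprime(x):
    # f'(x) for f(x) = x^4 - 2x^2 + 12x
    return 4.0 * x**3 - 4.0 * x + 12.0

def fsecond(x):
    # f''(x), negative for |x| < 1/sqrt(3)
    return 12.0 * x**2 - 4.0

def newton_1d(x0, n_steps):
    """Return the list of Newton iterates x_0, x_1, ..., x_n."""
    xs = [x0]
    for _ in range(n_steps):
        x = xs[-1]
        xs.append(x - fprime(x) / fsecond(x))
    return xs

good = newton_1d(-2.0, 8)  # starts on the convex side of the minimum
bad = newton_1d(0.0, 8)    # starts where f'' < 0
```

Starting at x = -2 the iterates settle quickly on the minimum near x = -1.67. Starting at x = 0, where f'' = -4, the first step goes uphill to x = 3, and over the eight steps taken here the iterates keep returning to the neighbourhood of x = 0 without converging.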

• The negative f'' region around x = 0 repels the iterates: negative f'' around x = 0 is a barrier to reaching the solution!
• If $\Delta x_i$ is a direction in which f decreases, there will be some $\alpha_i > 0$ such that $f(x_i + \alpha_i \Delta x_i) \le f(x_i)$.
• 1D minimization: since f is nonlinear, we would need to solve the nonlinear 1D optimization problem

$\min_{\alpha_i \in (0, \beta]} f(x_i + \alpha_i \Delta x_i)$

• Armijo search: use the rule $\alpha_i = \rho\mu^n$ for some n such that, with $s = \alpha_i$,

$f(x_i + s\Delta x_i) - f(x_i) \le \nu s (\Delta x_i)^t \nabla f(x_i)$

with $\rho, \mu, \nu$ fixed (e.g. $\rho = 2$, $\mu = \nu = \tfrac{1}{2}$)
• It's getting complicated…
• Convergence very much depends on the quality of the initial guess

Iterative methods: alternatives

• Instead of solving $\nabla f = 0$, it is often easier to just minimize f, using e.g. gradient descent and a good guess for the step length:
1. Start with initial guess $x_i$
2. Search direction: $r_i = -\nabla f(x_i)$
3. Search step: $x_{i+1} = x_i + \alpha_i r_i$
4. Pick alpha (depends on what's cheap), e.g. by
   1. linearized approximation: $\alpha_i = \frac{r_i^t r_i}{r_i^t (\nabla\nabla f) r_i}$
   2. 1D minimization of $f(x_i + \alpha r_i)$ (danger: low quality)
   3. zero-finding: $r_i^t \nabla f(x_i + \alpha r_i) = 0$
• Can be extended to nonlinear CG as well
• Do not need second derivative evaluations!

What if f has many local minima, is not differentiable, or x is discrete?

• Traveling salesman problem: find the shortest route to visit all cities once!
• Here, f = length of route, x = ordered indices of cities to visit
• Many different methods available: Monte Carlo, simulated annealing, genetic algorithms etc.

What if f has many local minima?

• Rough energy landscape: Gradient descent fails.
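To make both the descent recipe and its failure mode concrete, here is a minimal sketch of gradient descent with an Armijo-style backtracking step on a made-up rough 1D landscape, $f(x) = 0.1x^2 + \sin(3x)$. The test function, the parameter values ($\rho = 1$, $\mu = \nu = 0.5$), the tolerance and all names are assumptions chosen for illustration.

```python
import math

def f(x):
    # Assumed "rough" landscape with several local minima
    return 0.1 * x**2 + math.sin(3.0 * x)

def fprime(x):
    return 0.2 * x + 3.0 * math.cos(3.0 * x)

def gradient_descent(x0, rho=1.0, mu=0.5, nu=0.5, tol=1e-6, max_iter=1000):
    """Steepest descent; step length alpha = rho * mu**n from an Armijo test."""
    x = x0
    for _ in range(max_iter):
        g = fprime(x)
        if abs(g) < tol:
            break
        r = -g                       # search direction r = -grad f
        alpha = rho
        for _ in range(60):          # shrink alpha until the decrease is sufficient
            if f(x + alpha * r) - f(x) <= nu * alpha * r * g:
                break
            alpha *= mu
        x = x + alpha * r
    return x

x_left = gradient_descent(-1.0)   # ends in the minimum near x = -0.5
x_right = gradient_descent(4.0)   # ends in a higher local minimum near x = 3.6
```

Both runs stop at a point with vanishing gradient, but only the first one reaches the lowest valley. Which minimum is found depends entirely on the starting point.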
• General idea: Make sure we don't get stuck in a local minimum!

Genetic algorithm

• Sketch: Start with a population of "genes" (for instance, random values of x in f(x))
• Calculate cost for each. Then: XXX (no drugs) and rock & roll:
• Generate new x by crossing "parents" of good fitness
• Incorporate random mutations into children (=> overcomes minima)
• Run long enough until most populations converge to an optimal set of "genes" (=> best values of vector x)

Monte Carlo

• Idea: Find a way to statistically sample all states according to a well-defined probability (lower energy states = higher probability)
• Need: A way to walk through phase space such that there is a finite probability to visit each state in finite steps
• Ex.: Particle in energy landscape, coupled to heat bath (temperature T)
• Statistical physics: PDF for states with energy E is given by

$p(E) \propto e^{-\beta E}, \qquad \beta = \frac{1}{k_B T}$

• In order to visit all possible states with the correct probability, the Metropolis algorithm can be used

Metropolis algorithm

• Make a trial move from the current state x to a new randomly selected state x'
• Calculate the energy change dE = E' - E
• If dE < 0, accept the new state x' and repeat.
• If dE > 0, accept the state with probability $e^{-\beta\, dE}$:
  • Draw a random number r in [0,1)
  • Accept the state if $e^{-\beta\, dE} > r$; otherwise keep x
• Application: Packing problems (see class)
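The Metropolis steps listed above can be sketched as follows. This is an illustrative sketch, not the lecture's implementation: the function name `metropolis`, the uniform trial move of width `step`, the fixed `seed`, and the harmonic test energy $E(x) = x^2/2$ with $k_B T = 1$ are all assumptions.

```python
import math
import random

def metropolis(energy, x0, n_steps, kT=1.0, step=2.0, seed=0):
    """Sample states x with probability proportional to exp(-E(x)/kT)."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    samples = []
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)  # trial move to a random nearby state
        e_new = energy(x_new)
        dE = e_new - e
        # Downhill moves are always accepted; uphill moves with prob. exp(-dE/kT).
        if dE < 0 or math.exp(-dE / kT) > rng.random():
            x, e = x_new, e_new
        samples.append(x)                     # a rejected move repeats the old state
    return samples

# Usage: harmonic energy, for which the sampled x should be Gaussian with variance kT.
samples = metropolis(lambda x: 0.5 * x * x, x0=0.0, n_steps=100000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Lowering `kT` gradually during the run turns the same loop into simulated annealing, one of the alternatives named earlier for rough energy landscapes.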