QR Decomposition

When solving an overdetermined system by projection (i.e. computing a least squares solution), the following method is often used:

• Factorize A = Q·R with R upper triangular and Q orthogonal, i.e. Q^T·Q = I.
• Compute y = Q^T·b.
• Solve Rx = y by back substitution, ignoring the rows of R that do not correspond to columns of the original A (only the top square block of R is used).

Q can be obtained by applying Gram-Schmidt orthogonalization to the columns of A and extending the result to an orthonormal basis of the full space; R holds the coefficients of the Gram-Schmidt process. (In practice not Gram-Schmidt but another, numerically more stable process – "Householder transformations" – is used.)

Eigenvalues

Computing the characteristic polynomial as a determinant and finding its roots is a very unstable process. Instead, eigenvalues are computed by transforming the matrix:

• The matrix is converted by orthogonal transformations to "almost upper triangular" form (upper Hessenberg form).
• The matrix is then iteratively transformed to upper triangular form.
• The eigenvalues are the diagonal entries.

This process can be performed by the LAPACK routines sgeev/dgeev.

Nonlinear equations

We are given a function f : R → R and want to find (one or all) z with f(z) = 0. Typically, methods work by iteration: starting at a point x₀, they iteratively approximate a zero z. If there are several zeroes, it might be necessary to work with several start values. The three main methods are:

• Bisection
• Newton's method (using tangents)
• Secant method

In general, the problems are:

• How to select good start values.
• How to enforce convergence for "bad" start values.
• How long to iterate.

Quadratic, Cubic, Quartic

We have seen the formula for the solutions of a quadratic equation. Similar formulas exist for equations of degree 3 and 4, but they are numerically unstable. Furthermore, one can show (this is done in an abstract algebra course) that there cannot be such a formula (in radicals) for polynomials of higher degree.

Newton's method

We have that

0 = f(z) ≈ f(x) + f'(x)·(z − x).

Solving for z gives the iteration (replace x by the zero of the tangent line):

x → x − f(x)/f'(x)

This method converges if x₀ is chosen close enough to z (and f' has no zeroes in the interval; in particular, z must not be a double zero of f). If we let e_k = x_k − z be the error, we obtain

e_{k+1} = x_{k+1} − z = x_k − z − f(x_k)/f'(x_k)
        = −(f(x_k) − f'(x_k)·e_k)/f'(x_k)
        = (1/2)·(f''(ξ_k)/f'(x_k))·e_k²

for some ξ_k in the interval (Taylor approximation of 0 = f(z) by a degree 1 polynomial with remainder term around x_k). As x_k → z we get approximately

e_{k+1} ≈ (f''(z)/(2·f'(z)))·e_k²,

i.e. each step roughly doubles the number of correct digits.

Problem: Bad (or no) convergence if f'(z) = 0.

As a stop criterion, check:

• The step width (change between consecutive iterates) is smaller than some tolerance.
• A given upper limit for the number of iterations is reached.

Generalizations of Newton's method exist for multidimensional systems.
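To make the iteration and the two stop criteria concrete, here is a minimal sketch in Python (not part of the original notes; the function name, tolerance, and iteration limit are assumptions chosen for illustration):

def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    # Newton iteration x -> x - f(x)/f'(x).
    x = x0
    for _ in range(max_iter):          # stop criterion: iteration limit
        step = f(x) / fprime(x)        # breaks down if f'(x) is (near) zero
        x = x - step
        if abs(step) < tol:            # stop criterion: small step width
            return x
    raise ArithmeticError("no convergence within max_iter iterations")

# Example: sqrt(2) as the positive zero of f(x) = x^2 - 2
print(newton(lambda x: x*x - 2, lambda x: 2*x, x0=1.0))

Starting from x0 = 1.0 the step widths shrink quadratically; a start value at a zero of f' (here x0 = 0) makes the division fail, matching the problem noted above.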
Systems of polynomial equations

Consider a system of polynomial equations in several variables:

f_1(x_1, ..., x_n) = 0
f_2(x_1, ..., x_n) = 0
...
f_m(x_1, ..., x_n) = 0

To solve this system we want to eliminate variables in a similar way as when solving a system of linear equations.

Problem: Which term should be eliminated first – for example xy versus yz?

Convention: For x_1^{α_1}·x_2^{α_2}···x_n^{α_n} write x^α.

Gröbner basis approach

We define an ordering (lex ordering) on monomials: x^α ≺ x^β if α < β lexicographically. (One can define "admissible" orderings in more generality; one main variant compares the total degrees first.)

This way, we identify in every polynomial p a leading term lt(p).

If S = {p_1, ..., p_m} is a set of polynomials, we say that a polynomial f reduces modulo S if q = lt(p_i)·r for a monomial q occurring in f, some monomial r, and some i. The reduction of f modulo S is the polynomial obtained by subtracting such multiples of the p_i until no leading term divides any monomial any longer.

Note: In this process the monomials in f become smaller (with respect to ≺), so the process can take only finitely many steps.

S-polynomial

To define some measure of "reduction", we define for two polynomials p, q their S-polynomial as

S(p, q) = (l/lt(p))·p − (l/lt(q))·q, where l = lcm(lt(p), lt(q)).

Observation 1: Common zeroes of p and q are zeroes of S(p, q).

Observation 2: We can also reduce the S-polynomial modulo p and q and get a "smaller" polynomial without losing common zeroes.

Example: Let p = x²y³ + 3xy⁴ and q = 3xy⁴ + 2x³y, so lt(p) = x²y³ and lt(q) = 2x³y. Then lcm(lt(p), lt(q)) = 2x³y³ and

S(p, q) = 2x·p − y²·q = 6x²y⁴ − 3xy⁶.

We now can reduce S(p, q) modulo p and get:

S(p, q) − 6y·p = −3xy⁶ − 18xy⁵.

Buchberger's Algorithm

Given a set F of polynomials, we now iterate this process.

Require: F = (f_1, ..., f_s).
Ensure: A set G = (g_1, ..., g_t).
begin
  G := F;
  repeat
    G' := G;
    for every pair {p, q} with p ≠ q in G' do
      S := S(p, q); (S-polynomial)
      S := the reduction of S modulo G';
      if S ≠ 0 then G := G ∪ {S}; fi;
    end for;
  until G = G';
end

Gröbner bases

The resulting set G is called a Gröbner basis of F. (One can in addition reduce the basis polynomials against each other and this way obtain a reduced Gröbner basis.)

Observation: Common zeroes of the polynomials in F are common zeroes of the polynomials in G.

Note: One might get different performance/results for a different ordering of the variables.

Theorem: If one can obtain from F polynomials that involve only the last variable, this process will find them. One can thus use a back-substitution approach to solve for common zeroes.

Example

Consider the equations

x² + y² + z² = 1,  x² + y² = z,  x = y;

respectively the set of polynomials

{x² + y² + z² − 1, x² + y² − z, x − y}.

The (reduced) Gröbner basis calculation in Maple proceeds like this:

> with(Groebner);
> f:=[x^2+y^2+z^2-1,x^2+y^2-z,x-y];
> g:=gbasis(f,plex(x,y,z));

g := [2y² − z, z² + z − 1, x − y]

We now solve first for z, then for x and y.

Application

Suppose we want to find the maximum value of the function f(x, y, z) = x³ + 2xyz − z² subject to the constraint (points on a sphere)

x² + y² + z² = 1.

By the method of Lagrange multipliers, we know that ∇f = λ∇g, with g(x, y, z) = x² + y² + z² − 1, at a local maximum or minimum. The three partial derivatives and the constraint give the equations:

3x² + 2yz = 2xλ
2xz = 2yλ
2xy − 2z = 2zλ
x² + y² + z² = 1

We now compute a Gröbner basis for z ≺ y ≺ x ≺ λ and get:

λ − (3/2)x − (3/2)yz − (167616/3835)z⁶ + (36717/590)z⁴ − (134419/7670)z²,
x² + y² + z² − 1,
xy − (19584/3835)z⁵ + (1999/295)z³ − (6403/3835)z,
xz + yz² − (1152/3835)z⁵ − (108/295)z³ + (2556/3835)z,
y³ + yz² − y − (9216/3835)z⁵ + (906/295)z³ − (2562/3835)z,
y²z − (6912/3835)z⁵ + (827/295)z³ − (3839/3835)z,
yz³ − yz − (576/59)z⁶ + (1605/118)z⁴ − (453/118)z²,
z⁷ − (1763/1152)z⁵ + (655/1152)z³ − (11/288)z

Solving the last polynomial for z yields

z = 0, ±1, ±2/3, ±√11/(8√2),

and from this one can solve, for each z-value, for the corresponding x and y values and finally test for maxima/minima.

Observation: This process can be done in an exact way, or even using variables as coefficients. There are many issues with making this process effective, for example using different orderings.
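As a cross-check of the Maple example above, the same Gröbner basis computation can be sketched in Python with SymPy (an assumption of this writeup; the original notes use Maple):

from sympy import symbols, groebner

x, y, z = symbols('x y z')
F = [x**2 + y**2 + z**2 - 1, x**2 + y**2 - z, x - y]
# lex order with x > y > z, corresponding to Maple's plex(x, y, z)
G = groebner(F, x, y, z, order='lex')
print(list(G))  # expected: [x - y, 2*y**2 - z, z**2 + z - 1]

Back substitution then proceeds as described: z² + z − 1 = 0 gives z = (−1 ± √5)/2, of which only the positive root leads to real solutions of 2y² = z, and finally x = y.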