Using Galois Theory to Prove Structure from Motion Algorithms are Optimal
By David Nister, Richard Hartley and Henrik Stewenius

Two Structure from Motion Problems

1. Five-point Calibrated Relative Orientation

Two calibrated cameras (i.e., the intrinsic parameters are known).
Problem: Determine the relative orientation of camera R with respect to camera L from five corresponding image points.

Five-point Calibrated Relative Orientation

L and R are the two camera projection matrices, and the world frame is the camera frame of L:
L = K [ I | 0 ],  R = K' [ R | -R T ]
K and K' are the calibration matrices (3x3 and upper triangular). R and T are the rotation and translation between the two camera frames.
[Figure: world point p, its images a and b, the rays p_l and p_r, and the camera centers O_L and O_R separated by T.]

Essential Matrix

S, the skew-symmetric matrix built from T, has rank two. E is the essential matrix, the product of R and S.
Knowing E, we can recover R but not the magnitude of T (why?).
Given two corresponding image points a and b, multiplying by the inverses of K and K' yields p_l and p_r. Each such pair places one constraint on E.

Five-point Calibrated Relative Orientation

Given two views, the problem is to determine the essential matrix E between the two calibrated cameras from corresponding image points.
E is defined only up to scale, and it has rank two (i.e., det(E) = 0). It has five degrees of freedom, so we need five corresponding image points.
Furthermore, E's two nonzero singular values are equal.

Five-point Calibrated Relative Orientation

The five points provide a system of five linear equations A E = 0, with E written as a 9-vector. A is 5x9 and has a four-dimensional null space, so the solution is E = x X + y Y + z Z + w W for some x, y, z, w (set w = 1 or 0).
Substituting E back into the two constraints gives a set of 10 cubic equations in x, y, z. These 10 cubic equations can be written as a linear system whose coefficient matrix GG is a 10x10 matrix in z.

Five-point Calibrated Relative Orientation

For example, det(GG) = 0 gives a tenth-degree polynomial in z.
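As a numerical sanity check of the constraints above, the following sketch builds an essential matrix from a rotation and a translation and evaluates ten cubic expressions in its entries. It assumes the commonly used form of the two constraints, det(E) = 0 and 2 E E^T E - trace(E E^T) E = 0 (one plus nine cubic equations), and the convention E = [t]_x R. The slides do not spell these out, and all function names and the sample pose are illustrative.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) v equals t x v."""
    x, y, z = t
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def cubic_residuals(E):
    """Return 10 cubic expressions: det(E) and the 9 entries of
    2 E E^T E - trace(E E^T) E (assumed form of the two constraints)."""
    EEt = matmul(E, transpose(E))
    trace = EEt[0][0] + EEt[1][1] + EEt[2][2]
    EEtE = matmul(EEt, E)
    return [det3(E)] + [2.0 * EEtE[i][j] - trace * E[i][j]
                        for i in range(3) for j in range(3)]

# Hypothetical relative pose: rotation about the z axis plus a translation.
E = matmul(skew([0.3, -1.2, 0.5]), rotation_z(0.4))
worst_essential = max(abs(r) for r in cubic_residuals(E))

# A matrix that is not essential (here the identity) violates the constraints.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
worst_identity = max(abs(r) for r in cubic_residuals(identity))
```

For a true essential matrix all ten residuals vanish up to rounding, while a generic 3x3 matrix leaves some of them far from zero.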
Five-point Calibrated Relative Orientation

Question: Can the problem be solved (in general) with polynomials of lower degree?

2nd Problem: L2-optimal Two-view Triangulation

Given a pair of corresponding image points a = (x_L, y_L), b = (x_R, y_R) and camera matrices (L, R), find a 3D point P that minimizes the L2 re-projection error.
[Figure: point p, its images a and b, and the camera centers O_L and O_R.]

2nd Problem: L2-optimal Two-view Triangulation

Finding the optimal P requires the solution of a sixth-degree polynomial (this is explained on page 317 of the book by Hartley and Zisserman).
Question: Can the problem be solved (in general) with polynomials of lower degree?

The answers to both questions are NO. In this sense, the current solutions to these two problems are optimal.

Computer vision stops here ... The rest is abstract algebra.

Groups

A group G is a set with an operation (product) G x G -> G such that:
1. Associativity: a (b c) = (a b) c.
2. Identity: there is an e such that a e = e a = a for all a.
3. Inverse: for each a, there is an a^{-1} such that a a^{-1} = a^{-1} a = e.
The operation is commutative if a b = b a for all a, b in G. In this case, G is called an abelian group.
A subgroup S of G is a subset of G that is itself a group (with respect to G's operation).

Groups: Examples

The set of integers with addition as the operation is a group, and it is abelian. It is not a group with respect to multiplication (no inverses).
The set of nonsingular n x n matrices with multiplication is a group. This is an example of a non-abelian group.
The group (Z_n, +): the elements are 0, 1, 2, ..., n-1, and a + b = c (mod n). (Z_n, +) is an example of a cyclic group: it can be generated by one element a, and every other element in the group is a multiple of a.
If A and B are groups, their Cartesian product A x B is also a group.
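The group axioms and the examples above can be checked by brute force for small finite sets. The sketch below is only an illustration; the helper names `is_group` and `is_abelian` are made up for this example.

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of closure, associativity, identity and inverses."""
    elements = list(elements)
    if not all(op(a, b) in elements for a, b in product(elements, repeat=2)):
        return False
    if not all(op(a, op(b, c)) == op(op(a, b), c)
               for a, b, c in product(elements, repeat=3)):
        return False
    identities = [e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)

def is_abelian(elements, op):
    elements = list(elements)
    return all(op(a, b) == op(b, a) for a, b in product(elements, repeat=2))

n = 6
add_mod = lambda a, b: (a + b) % n
mul_mod = lambda a, b: (a * b) % n

z6_is_group = is_group(range(n), add_mod)           # (Z_6, +) is a group
z6_is_abelian = is_abelian(range(n), add_mod)       # and it is abelian
z6_mult_is_group = is_group(range(n), mul_mod)      # 0 has no inverse

# Cyclic: every element of Z_6 is a multiple of the generator 1.
multiples_of_one = {(k * 1) % n for k in range(n)}

# Cartesian product Z_2 x Z_3 with componentwise addition.
pairs = [(a, b) for a in range(2) for b in range(3)]
pair_add = lambda x, y: ((x[0] + y[0]) % 2, (x[1] + y[1]) % 3)
product_is_group = is_group(pairs, pair_add)
```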
Homomorphism

A homomorphism f between groups G and H is a mapping from G to H that preserves their respective operations:
f(ab) = f(a) f(b), and consequently f(e_G) = e_H.
Two groups are isomorphic if there is a bijective (one-to-one and onto) homomorphism between them.

Permutation Groups

Permutation group: The set of permutations of n objects is a (non-abelian, for n > 2) group S_n.
[Figure: S_3]
S_n has n! elements.
A permutation is a transposition if it swaps two objects and fixes the rest. Every permutation can be written as a product of either an even number or an odd number of transpositions (never both).
A_n, the alternating group, is the subgroup of S_n consisting of the even permutations.

Normal Subgroups and Quotient Groups

A subgroup N of G is called normal if for every g in G and h in N, there is an h' in N such that g h g^{-1} = h', or equivalently g h = h' g. Every subgroup of an abelian group is normal.
Given a normal subgroup N, we can partition G into disjoint subsets such that g and g' are in the same subset if g' = g h for some h in N (g and g' are connected through N). The set of these subsets has a group structure, and it is the quotient group G/N of G by N.
The product is well defined: if x' = x n and y' = y m with n, m in N, then x' y' = x n y m = x y w for some w in N, so x' y' lies in the same subset as x y.

Example

Take the group Z of integers and the subgroup 7Z (the integers which are multiples of 7). 7Z is a normal subgroup of Z because Z is abelian. Z / 7Z is the group Z_7 discussed before.
A_n is the only normal subgroup of S_n other than the trivial group and S_n itself, for n > 4.

The kernel Ker f of a homomorphism f between two groups G and H is the subset of elements x of G such that f(x) = e_H.
Ker f is a normal subgroup of G. If f : G -> H is surjective (onto), then H is isomorphic to the quotient group G / Ker f.
For example, take the homomorphism f : Z -> Z_7, f(x) = x % 7. What is Ker f?

Fields

Examples: Q (the field of rational numbers), R (the field of real numbers) and C (the field of complex numbers).
A field K is a set with two associative and commutative operations (+, x) with identities 0 and 1.
The two operations satisfy the distributive law: a (b + c) = a b + a c.
Every element in K has an additive inverse, and every element except 0 has a multiplicative inverse. (0 cannot have one because of the distributive law: a b = a (b + 0) = a b + a 0, therefore a 0 = 0.)

Field Extension

A field L is called a field extension of K (written L : K) if K is a subfield of L.
Key idea: Treat L as a vector space (with multiplication) with coefficients in K.
Vector space: we can add and subtract vectors, and for every real number a and vector v, a v is another vector.
L : K: we can add and subtract elements in L, and for every element k in K and x in L, k x is another element in L.
The degree of the extension [L : K] is the dimension of L as a vector space with coefficients in K.

Examples

C is a degree-two extension of R.
Take the field Q of rational numbers and the irrational number u = sqrt(2). Q(u) denotes the smallest field (in R) containing both Q and u.
What is the degree [Q(u) : Q]? Two, because u is a root of the polynomial x^2 - 2 (with coefficients in Q), which has no rational root.
Every element in Q(u) is of the form a + b u. Q(u) is closed under multiplication and addition: (a + b u) (a' + b' u) = A + B u.
Is it closed under taking quotients (inverses)?

Examples (cont.)

In fact, the elements of Q(u) are the values f(u), where f is a polynomial in Q[x].
Let q(x) = x^2 - 2 and let p(x) be any polynomial in Q[x]. We know that p(x) = w(x) q(x) + r(x), where the remainder r(x) has degree strictly smaller than the degree of q(x). That is, r(x) has degree at most one.
Since q(u) = 0, p(u) = w(u) q(u) + r(u) = r(u).
Let r(x) be any nonzero polynomial of degree at most one. Since q(x) is irreducible, r(x) and q(x) are relatively prime, so there are polynomials A(x), B(x) such that A(x) r(x) + B(x) q(x) = 1, and hence A(u) r(u) = 1. That is, r(u) has an inverse.

Theorems

Theorem 1. Let P(x) be an irreducible polynomial in Q[x] and let u be a root of P(x). Then [Q(u) : Q] = the degree of P(x).
Theorem 2. If L : Q is a field extension of finite degree, then every element in L is algebraic over Q.
That is, for every u in L, there is a nonzero polynomial p(x) in Q[x] such that p(u) = 0.
Theorem 3. If M : L and L : K are two field extensions of finite degree, then the extension M : K has degree [M : K] = [M : L] [L : K].

C is a degree-2 extension of R. [R : Q] is infinite: there are elements u in R (transcendental numbers) that do not satisfy p(u) = 0 for any nonzero polynomial with coefficients in Q.

Ruler and Compass Construction

1. It is impossible to trisect an arbitrary angle using ruler and compass (in fact, a 60° angle cannot be trisected).
2. It is impossible to duplicate a cube of side length 1 using ruler and compass.
These answer two famous problems raised by the ancient Greeks.
A real number r is said to be constructible if we can locate r on the x or y axis using only ruler and compass, starting from the points 0 and 1.
All integers are constructible.

Ruler and Compass Construction

If a and b are constructible, so are 1/a and ab. So all rational numbers are constructible. In particular, if a number u is constructible, we can construct any number in Q(u).
[Figure: constructing ab and 1/a from a, b and 1 using the line Y = aX.]

Ruler and Compass Construction

We introduce a "new number" by taking the intersection of a circle with a line. Suppose F is a field whose elements are constructible, and A, B, a, b, c, d, e, f are in F.
Line: a x + b y = c
Circle: x^2 + y^2 + d x + e y + f = 0
The X coordinates of the two intersection points satisfy a quadratic equation X^2 + A X + B = 0, which after completing the square reads (X + alpha)^2 + beta = 0 with alpha, beta in F.
That is, X is in the field F(u), where u = sqrt(-beta) is a square root of an element of F.

Ruler and Compass Construction

Therefore, if a number v is constructible, then there exists a sequence of field extensions (all inside R)
Q < Q(u_1) < Q(u_1, u_2) < ... < Q(u_1, u_2, ..., u_n)
such that
1. Q(u_1, u_2, ..., u_i) = Q(u_1, u_2, ..., u_{i-1})(u_i)
2. u_i^2 is in Q(u_1, u_2, ..., u_{i-1})
3. v, and hence Q(v), is in Q(u_1, u_2, ..., u_n)
Consequently [Q(u_1, u_2, ..., u_n) : Q] = 2^p for some p (why?).

Suppose we can trisect a 60° angle using ruler and compass. This means that v = cos 20° is constructible.
Ruler and Compass Construction

We have the formula cos 3a = 4 cos^3 a - 3 cos a. With a = 20°, that is 1/2 = 4 v^3 - 3 v, so v is a root of the polynomial 8 x^3 - 6 x - 1, an irreducible polynomial in Q[x].
This is a contradiction!
1) We have the inclusions Q < Q(v) < Q(u_1, u_2, ..., u_n).
2) [Q(u_1, u_2, ..., u_n) : Q] = [Q(v) : Q] [Q(u_1, u_2, ..., u_n) : Q(v)].
3) What is the degree [Q(u_1, u_2, ..., u_n) : Q]?
4) What is the degree [Q(v) : Q]?
Where is the contradiction?

If we can duplicate the unit cube, then s, the side length of a cube of volume 2, is constructible. But s is a root of the polynomial x^3 - 2, which is also irreducible in Q[x], so the same argument applies.

Ruler and Compass Construction

Note that this does not imply that no angle can be trisected, only that there is no general method that trisects every angle.
We can easily trisect a 135° angle using ruler and compass: construct a right angle first and then bisect it.

Radical Extension

What does it mean (mathematically) that a polynomial equation p(x) = 0 can be solved by a formula?
Definition: Let p(x) be a polynomial. Its splitting field SF(p) is the smallest field in C that contains all roots of p(x) = 0. That is, in the splitting field SF(p), the polynomial splits into linear factors:
p(x) = (x - a_1) (x - a_2) ... (x - a_n), with each a_i in SF(p).
Suppose you have a formula for solving a polynomial equation a x^5 + b x^4 + c x^3 + d x^2 + e x + f = 0. You may not be able to evaluate this formula in Q.

Radical Extension

Suppose, for example, that the formula involves the radicals
u_1 = (ab + c)^(1/3), u_2 = (ab + be + cd)^(1/4), u_3 = (d u_1)^(1/5).
The formula can then be evaluated in the field Q(u_1, u_2, u_3), i.e., the splitting field satisfies SF(p) < Q(u_1, u_2, u_3).
Q(u_1) : Q, with u_1^3 in Q
Q(u_1, u_2) : Q(u_1), with u_2^4 in Q(u_1)
Q(u_1, u_2, u_3) : Q(u_1, u_2), with u_3^5 in Q(u_1, u_2)
This is an example of a radical extension.

Radical Extension

A polynomial p(x) is solvable by a formula if its splitting field SF(p) is contained in a radical extension of Q.
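To make the idea of evaluating a formula inside a tower of radicals concrete, here is a small sketch using a cubic rather than the hypothetical quintic above. It applies Cardano's formula to x^3 + 6x - 20 = 0 (an illustrative choice, not an equation from the slides), adjoining first a square root u1 and then a cube root u2. Floating-point numbers stand in for exact field elements.

```python
import math

# Depressed cubic x^3 + p x + q = 0 with rational coefficients (illustrative).
p, q = 6.0, -20.0

# Step 1: adjoin u1 with u1^2 in Q.
u1_squared = q * q / 4.0 + p ** 3 / 27.0    # 100 + 8 = 108, a rational number
u1 = math.sqrt(u1_squared)

# Step 2: adjoin u2 with u2^3 in Q(u1).
u2_cubed = -q / 2.0 + u1                    # 10 + sqrt(108), an element of Q(u1)
u2 = u2_cubed ** (1.0 / 3.0)

# Cardano's formula: the real root u2 - p / (3 u2) lies in Q(u1, u2).
root = u2 - p / (3.0 * u2)
residual = root ** 3 + p * root + q
```

Here the tower is Q < Q(u1) < Q(u1, u2), and the computed root is 2 up to rounding. The other two roots are complex and also require a cube root of unity.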
Definition: A field extension E : Q is called a radical extension if there is a tower of field extensions
E = Q(u_1, u_2, ..., u_n) : Q(u_1, u_2, ..., u_{n-1}) : ... : Q(u_1) : Q
such that
Q(u_1, u_2, ..., u_i) = Q(u_1, u_2, ..., u_{i-1})(u_i), and
u_i^m is in Q(u_1, u_2, ..., u_{i-1}) for some m.

Five-point Calibrated Relative Orientation

We have a tenth-degree polynomial p(x). Let
E = Q(u_1, u_2, ..., u_n) : Q(u_1, u_2, ..., u_{n-1}) : ... : Q(u_1) : Q
be a field extension E : Q such that
1. Q(u_1, u_2, ..., u_i) = Q(u_1, u_2, ..., u_{i-1})(u_i)
2. u_i^m is in Q(u_1, u_2, ..., u_{i-1}) for some m, or u_i is a root of a polynomial of degree < 10.
Question: Is the splitting field SF(p) of p contained in such an extension E?
If there is a general algorithm involving only polynomials of degree < 10, then SF(p) should be contained in such an extension field E for any tenth-degree polynomial arising from a five-point calibrated relative orientation problem.

Math Problem

1. A specific type of field extension E : E_1 : E_2 : ... : Q, specified by the problem.
2. A field extension L : Q defined by an instance of the problem (e.g., the 60° angle).
We want to know whether L can be included in E, i.e., is L < E?
For the two classical problems, we use the simplest invariant of a field extension, the degree, to show that the inclusion is not possible.
For other problems (including the ones studied in the paper), a more refined invariant (the Galois group) is required.
To show these negative results, one comes up with one instance of L : Q and shows, using the invariant, that L cannot be included in E.

Galois Group of Field Extension L : K

Let L be a field. An automorphism s of L is a bijective mapping s : L -> L that preserves the field structure:
s(a b) = s(a) s(b)
s(a + b) = s(a) + s(b)
s(0) = 0, s(1) = 1.
The set of automorphisms of L forms a group under composition.
The Galois group Gal(L/K) of L over K consists of the automorphisms of L that fix every element of K. That is, s is in Gal(L/K) if and only if s(x) = x for every x in K.
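A small sketch of this definition for the extension Q(sqrt(2)) : Q: it represents a + b sqrt(2) as a pair of exact rationals and checks, on sample elements only, that conjugation preserves sums and products and fixes Q. The pair representation and the function names are assumptions made for the illustration.

```python
from fractions import Fraction

# An element a + b*sqrt(2) of Q(sqrt(2)) is stored as the pair (a, b).
def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def mul(x, y):
    a, b = x
    c, d = y
    # (a + b r)(c + d r) = (ac + 2bd) + (ad + bc) r, where r^2 = 2
    return (a * c + 2 * b * d, a * d + b * c)

def conj(x):
    """Candidate automorphism s: a + b*sqrt(2) -> a - b*sqrt(2)."""
    return (x[0], -x[1])

x = (Fraction(1, 2), Fraction(3))
y = (Fraction(-2), Fraction(5, 7))
rational = (Fraction(4, 9), Fraction(0))

preserves_sum = conj(add(x, y)) == add(conj(x), conj(y))
preserves_product = conj(mul(x, y)) == mul(conj(x), conj(y))
fixes_rationals = conj(rational) == rational
is_involution = conj(conj(x)) == x
```

A check on a few samples is not a proof, but the same identities hold symbolically for all a, b, c, d.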
Consequently, s(xy) = x s(y) for every x in K, y in L. (Another notation for Gal(L/K) is Aut_K L.)
Analogy with linear algebra: each element of Gal(L/K) is in particular a K-linear map from L to itself.

Examples of Galois Group

Consider the extension F(u) : F, where u is a root of a polynomial p(x) with coefficients in F.
Note that any s in Gal(F(u)/F) is determined by its action on u. This is because F(u) has a basis consisting of powers of u.
Furthermore, 0 = s(0) = s(p(u)) = p(s(u)). That is, s(u) is a root of the polynomial p(x) as well!
If F(u) contains all the roots of p(x), then s simply permutes the roots.
Examples:
C = R(i). The Galois group Gal(C/R) is Z_2.
Q(sqrt(2)) : Q. The Galois group is again Z_2.
Q(sqrt(2), sqrt(3)) = Q(sqrt(2))(sqrt(3)). The Galois group is Z_2 x Z_2: the elements 1, sqrt(2), sqrt(3), sqrt(6) form a basis, and any s in the Galois group is determined by its values s(sqrt(2)) and s(sqrt(3)).

Galois Group of a Polynomial

Definition: Let P(x) be a polynomial with rational coefficients and SF(P) its splitting field. The Galois group of P(x) is the group Gal(SF(P)/Q).
Assume that P(x) is of degree n and has n distinct roots. SF(P) contains every root of P(x), and it is generated by these roots: SF(P) = Q(u_1, u_2, ..., u_n), where the u_i are the roots of P(x).
Therefore, Gal(SF(P)/Q) can be considered as a subgroup of the permutation group S_n on n objects.
There are many polynomials with the maximal possible Galois group, the full permutation group S_n.

Examples

What is the Galois group of the polynomial x^2 - 2?
The Galois group of the polynomial x^3 - 4x + 2 can be shown to be S_3.
What about the Galois group of the polynomial x^3 - 1? Is it S_3 or something else (Z_2)? The three roots are 1, a, a^2.
What about the Galois group of the polynomial x^4 - 1?
What about the Galois group of the polynomial x^5 - 1?

Cyclic Extension

Let F be a field containing all the m-th roots of unity, i.e., the roots a^0, a^1, a^2, a^3, ..., a^{m-1} of p(x) = x^m - 1.
What is the Galois group Gal(F(u)/F), where u is a root of the polynomial q(x) = x^m - f for some element f in F?
F(u) is a splitting field for q(x), because the roots of q(x) are all in F(u): u a^0, u a^1, u a^2, u a^3, ..., u a^{m-1}.
Gal(F(u)/F) is a subgroup of Z_m (all of Z_m when q(x) is irreducible over F), and in particular it is abelian.

Galois Group of a Field Extension L : K

Let G be the Galois group of a field extension L : K. Let E be an intermediate field, L : E : K, and H a subgroup of G.
Let E' denote the subset of G consisting of the elements fixing E: E' is a subgroup of G.
Let H' denote the subset of L consisting of the elements fixed by H: H' is a subfield of L.
[Figure: the tower K < E < H' < L next to the chain of subgroups e < H < E' < G.]

Simple Exercise

Let's show the following: let E' denote the subset of G consisting of the elements fixing E. Then E' is a subgroup of G.
Proof:
1. Is the set E' closed under composition? If r and s are elements in E', is rs in E'?
2. If s is in E', is its inverse in E'?
3. Is the identity e in E'?
We have E'' > E and H'' > H.

Galois Correspondences (Fundamental Theorem of Galois Theory)

Given a field extension L : K and intermediate fields E_1 and E_2 (disregarding some technical details), fields correspond to subgroups as follows:
L <-> 1 (the trivial group)
E_2 <-> Gal(L/E_2)
E_1 <-> Gal(L/E_1)
K <-> Gal(L/K)
The structure of the field extension L : K is encoded in the Galois group Aut_K L = Gal(L/K).
Gal(E_1/K) = Gal(L/K) / Gal(L/E_1)

Why is there no formula for solving polynomials of degree > 4?

E is a radical extension containing the splitting field SF(P) of P(x), and G is the Galois group of the extension E : Q.
Fields: Q < E_1 < E_2 < ... < E
Groups: G > G_1 > G_2 > ... > G_N = {e}
Each G_i is a normal subgroup of G_{i-1}, and the quotient G_i / G_{i+1} is abelian for all i. A group G with such a chain is called a solvable group.
If the polynomial P(x) is solvable by radicals (by a formula), then SF(P) is contained in E and the Galois group Gal(P) is a quotient group of G. Hence Gal(P) must be solvable as well.
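Solvability of a small finite group can be checked by brute force. The sketch below uses the derived series (repeatedly taking the subgroup generated by all commutators), a standard equivalent test that is swapped in here for the chain condition described above: a finite group is solvable exactly when this series reaches the trivial group. Permutations are stored as tuples, and all function names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Composition p after q: the result maps i to p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def generated_subgroup(generators, identity):
    """Closure of the generators under multiplication (enough in a finite group)."""
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def derived_subgroup(G, identity):
    """Subgroup generated by all commutators a b a^-1 b^-1 with a, b in G."""
    commutators = {compose(compose(a, b), compose(inverse(a), inverse(b)))
                   for a in G for b in G}
    return generated_subgroup(commutators, identity)

def derived_series_orders(n):
    """Orders of S_n, its derived subgroup, and so on, until the series stalls."""
    identity = tuple(range(n))
    G = set(permutations(range(n)))
    orders = [len(G)]
    while True:
        D = derived_subgroup(G, identity)
        if len(D) == len(G):
            break
        orders.append(len(D))
        G = D
    return orders

def is_solvable_symmetric_group(n):
    return derived_series_orders(n)[-1] == 1
```

With these definitions, `derived_series_orders(4)` gives `[24, 12, 4, 1]`, so S_4 is solvable, while `derived_series_orders(5)` gives `[120, 60]`: the series stalls at A_5, so S_5 is not solvable.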
Why is there no formula for solving polynomials of degree > 4?

The groups S_n and A_n are known to be not solvable precisely for n > 4.
Therefore, if the Galois group of a polynomial P(x) is S_n or A_n for some n > 4, the roots of P(x) cannot all be computed from the coefficients of P(x) by additions, subtractions, multiplications, divisions and taking radicals.
That is, one cannot have a general formula for the roots of polynomials of degree > 4.
Note that this does not forbid formulae for some restricted classes of polynomials.

Back to the paper: Lemma 4.7

Lemma 4.7: Let F_p and F_q be the splitting fields of two polynomials p and q, respectively, over a base field F, and let F_pq be the smallest field containing both F_p and F_q. Then Gal(F_pq / F_p) is isomorphic to a normal subgroup of Gal(F_q / F).
[Figure: diamond of inclusions F < F_p < F_pq and F < F_q < F_pq; arrows are inclusions.]
Gal(F_pq / F) / Gal(F_pq / F_q) = Gal(F_q / F).
Gal(F_pq / F_p) is a normal subgroup of Gal(F_pq / F).
Every element of Gal(F_pq / F_p) survives the quotient: if s in Gal(F_pq / F_p) maps to e in Gal(F_q / F), then s must be in Gal(F_pq / F_q). So s fixes both F_p and F_q, which implies that s is the identity.

Theorem 4.8

Consider a sequence of field extensions (minus some details) F_0 < F_1 < ... < F_{N-1} < F_N. Let P be a polynomial of degree n > 4 over F_0 with Galois group S_n (or A_n).
If F_N is the first field in this sequence containing one of the roots of P, then it contains all the roots of P. Furthermore, Gal(F_N / F_{N-1}) has a quotient group isomorphic to S_n or A_n.

Proof sketch: Let F_i(P) be the splitting field of P(x) over F_i.
[Figure: the chain F_0 < F_1 < F_2 < ... < F_{N-1} < F_N, with F_i(P) drawn above each F_i.]
By Lemma 4.7, Gal(F_i(P) / F_i) is isomorphic to a normal subgroup of Gal(F_{i-1}(P) / F_{i-1}).
In particular, Gal(F_N(P) / F_N) and Gal(F_{N-1}(P) / F_{N-1}) are each isomorphic to a normal subgroup of Gal(F_0(P) / F_0) = S_n. So each of them is either the identity, A_n or S_n.
If F_N contains one root, then it contains every root.
F_{N-1} does not contain any root, therefore Gal(F_{N-1}(P) / F_{N-1}) cannot be the identity.
One more step ... (F_{N-1} -> F_{N-1}(P) -> F_N). QED

Five-point Calibrated Relative Orientation

We have a tenth-degree polynomial p(x). Recall the tower E : Q in which each u_i is either a radical over the previous field (u_i^m is in Q(u_1, u_2, ..., u_{i-1}) for some m) or a root of a polynomial of degree < 10.
Question: Is the splitting field SF(p) of p contained in such an extension E?
Answer: No, if the Galois group of p is S_10.
If there is a general algorithm involving only polynomials of degree < 10, then SF(p) should be contained in such an extension field E for any tenth-degree polynomial arising from a five-point calibrated relative orientation problem.

Proof of Impossibility

Let F_N be the splitting field of a tenth-degree polynomial P(x) whose Galois group is S_10. We got F_N by adjoining a root of a polynomial of degree < 10 to F_{N-1}.
Gal(F_N / F_{N-1}) is then a subgroup of S_m for some m < 10. By Theorem 4.8, it has a quotient group that is S_10 (or A_10). This is a contradiction: a subgroup of S_m with m < 10 has at most 9! elements, fewer than either S_10 or A_10.

Putting Everything Together

What is left is to come up with an instance of the five-point calibrated relative orientation problem that requires solving for the roots of a tenth-degree polynomial P(x) whose Galois group is S_10.
Using a computer algebra program (MAGMA), it can be checked that the Galois group of the resulting polynomial is S_10!