On the Complexity of the Descartes Method when using Approximate Arithmetic

Michael Sagraloff
MPI for Informatics, Saarbrücken, Germany

Abstract

In this paper, we introduce a variant of the Descartes method to isolate the real roots of a square-free polynomial $F(x) = \sum_{i=0}^{n} A_i x^i$ with arbitrary real coefficients. It is assumed that each coefficient of $F$ can be approximated to any specified error bound. Our algorithm uses approximate arithmetic only; nevertheless, it is certified, complete, and deterministic. We further provide a bound on the complexity of our method which depends exclusively on the geometry of the roots and not on the complexity of the coefficients of $F$. For the special case where $F$ is a polynomial of degree $n$ with integer coefficients of maximal bitsize $\tau$, our bound on the bit complexity reads $\tilde{O}(n^3\tau^2)$. Compared to the complexity of the classical Descartes method from Collins and Akritas (based on ideas dating back to Vincent), which uses exact rational arithmetic, this constitutes an improvement by a factor of $n$. The improvement mainly stems from the fact that the maximal precision needed for isolating the roots of $F$ is lower by a factor of $n$ than the precision needed when using exact arithmetic.

Key words: Root isolation, Descartes method, subdivision methods, numerical computation, complexity bounds, approximate coefficients

1. Introduction

Computing the roots of a univariate polynomial is one of the fundamental problems in computational algebra, and numerous approaches have been proposed in the last decades to solve it. In this paper, we focus on the problem of isolating the real roots of a square-free polynomial $F \in \mathbb{R}[x]$ with arbitrary real coefficients. More precisely, given approximations of the coefficients of $F$ to an arbitrary precision, we aim to compute disjoint intervals $J_1, \ldots, J_m$ such that each $J_i$ contains exactly one root of $F$ and such that their union contains all real roots of $F$.
Email address: msagralo@mpi-inf.mpg.de (Michael Sagraloff).
Preprint submitted to Elsevier, 20 January 2014.

For polynomials with integer coefficients, the so-called Descartes method (or "Vincent-Collins-Akritas" method)$^{1}$, first introduced by Collins and Akritas [10], constitutes one of the simplest and most efficient algorithms. In order to better understand the contribution of this paper, we briefly review the algorithm: It starts with an interval $\mathcal{I}$ containing all real roots of $F$ and recursively proceeds as follows: For an interval $I = (a, b) \subset \mathcal{I}$, Descartes' Rule of Signs is used to test $I$ for roots of $F$. If it yields that the number $m$ of roots contained in $I$ equals zero, $I$ is discarded. If it yields that $m = 1$, then $I$ is stored as an isolating interval. In all other cases, $I$ is subdivided into two equally sized subintervals $I_\ell := (a, m(I))$ and $I_r := (m(I), b)$, where $m(I)$ denotes the midpoint of $I$. For a polynomial $F$ of degree $n$ with integer coefficients of bit-size $\tau$, the Descartes method induces a recursion tree of size $O(n(\tau + \log n))$, and this bound has been shown to be optimal [15]. For Descartes' Rule of Signs, we need to compute the polynomial$^{2}$

$$F_{I,\mathrm{rev}}(x) := (x+1)^n \cdot F\left(\frac{ax+b}{x+1}\right). \qquad (1.1)$$

Using asymptotically fast Taylor shifts [16, 45, 39], the cost for this computation is bounded by

$$\tilde{O}\bigl(n^2(\log^+ \max(|a|,|b|) + \log^+ |b-a|^{-1})\bigr) = \tilde{O}(n^3\tau) \qquad (1.2)$$

bit operations,$^{3}$ where we define $\log^+(x) := \log \max(2, |x|) \ge 1$ for all $x \in \mathbb{C}$ and $\log := \log_2$. The bound in (1.2) follows from the fact that we have to perform $\tilde{O}(n)$ arithmetic operations and that $F_{I,\mathrm{rev}}$ has rational coefficients of bit-size $O(n(\log^+ \max(|a|,|b|) + \log^+ |b-a|^{-1})) = \tilde{O}(n^2\tau)$. Multiplying the bound on the size of the recursion tree by the bound (1.2) on the bit complexity of the computations at each node yields the bound $\tilde{O}(n^4\tau^2)$ on the overall bit complexity of the Descartes method.
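The subdivision scheme reviewed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses exact rational arithmetic via `fractions.Fraction`, a simple quadratic-time Taylor shift rather than the asymptotically fast shifts of [16, 45, 39], and all function names (`sign_variations`, `compose_affine`, `taylor_shift_one`, `var_on_interval`, `descartes_isolate`) are hypothetical and not taken from the paper. It also treats a midpoint that happens to be a root as a trivial isolating interval, which is one common convention.

```python
from fractions import Fraction


def sign_variations(coeffs):
    """Number of sign changes in a coefficient sequence, ignoring zeros."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def compose_affine(coeffs, a, w):
    """Coefficients (low degree first) of p(a + w*x), via Horner's scheme."""
    result = [Fraction(0)]
    for c in reversed(coeffs):
        new = [Fraction(0)] * (len(result) + 1)
        for k, r in enumerate(result):  # new = result * (a + w*x)
            new[k] += r * a
            new[k + 1] += r * w
        new[0] += c
        result = new
    return result[:len(coeffs)]


def taylor_shift_one(coeffs):
    """Coefficients of p(x + 1), using O(n^2) additions."""
    c = list(coeffs)
    n = len(c)
    for i in range(n - 1):
        for j in range(n - 2, i - 1, -1):
            c[j] += c[j + 1]
    return c


def var_on_interval(coeffs, a, b):
    """Sign variations of (x+1)^n * p((a*x + b)/(x + 1)): reverse p_I, shift by 1."""
    p_i = compose_affine(coeffs, a, b - a)
    return sign_variations(taylor_shift_one(p_i[::-1]))


def evaluate(coeffs, x):
    """Horner evaluation of p at x."""
    value = Fraction(0)
    for c in reversed(coeffs):
        value = value * x + c
    return value


def descartes_isolate(coeffs, a, b):
    """Isolating intervals for the real roots of a square-free p inside (a, b)."""
    coeffs = [Fraction(c) for c in coeffs]
    isolating, active = [], [(Fraction(a), Fraction(b))]
    while active:
        lo, hi = active.pop()
        v = var_on_interval(coeffs, lo, hi)
        if v == 0:
            continue                       # no root in (lo, hi)
        if v == 1:
            isolating.append((lo, hi))     # exactly one root in (lo, hi)
            continue
        mid = (lo + hi) / 2
        if evaluate(coeffs, mid) == 0:
            isolating.append((mid, mid))   # the midpoint itself is a root
        active += [(lo, mid), (mid, hi)]
    return sorted(isolating)
```

For example, for $9x^2 - 9x + 2 = (3x-1)(3x-2)$ the call `descartes_isolate([2, -9, 9], 0, 1)` counts two sign variations on $(0, 1)$, bisects once, and returns the intervals $(0, \frac12)$ and $(\frac12, 1)$.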
Footnote 1. There exist numerous discussions (e.g. [1]) about whether "Descartes method" is the correct term since Descartes did not introduce any algorithm to isolate the roots but (only) a method to estimate the number of positive roots of a univariate polynomial (i.e. Descartes' Rule of Signs). However, because the algorithm from Collins and Akritas (based on ideas dating back to Vincent) exclusively uses this rule as inclusion and exclusion predicate, it is reasonable to name the algorithm after Descartes without using the possessive "s" following his name.

Footnote 2. Descartes' Rule of Signs states that the number $m$ of roots contained in $I$ is upper bounded by the number $v$ of sign changes in the coefficient sequence of $F_{I,\mathrm{rev}}$ and that $v \equiv m \bmod 2$. For more details, we refer to Section 2.6.

Footnote 3. According to Cauchy's Root Bound (see e.g. [47]), we can assume that $\mathcal{I} \subset (-1 - 2^\tau, 1 + 2^\tau)$, and thus $\max(1, |a|, |b|) \le 1 + 2^\tau$. In addition, the Descartes method does not subdivide intervals of size less than half of the minimal distance between two distinct roots of $F$ (i.e. the separation $\sigma_F$ of $F$), and $\log \max(1, \sigma_F^{-1}) = O(n(\tau + \log n))$; see Section 2.6 for details.

The advantages of the Descartes method are its simplicity and the fact that the size of the recursion tree adapts well to the geometric locations of the roots, that is, the recursion tree becomes large if and only if some of the roots are clustered. A disadvantage of the Descartes method is that the exact computation of the polynomials $F_{I,\mathrm{rev}}$ needs a precision of $\tilde{\Theta}(n^2\tau)$ in the worst case, whereas separating the roots from each other needs only $\tilde{O}(n\tau)$ bits. In fact, the binary representation of the endpoints of all isolating intervals returned by the algorithm needs no more than $\tilde{O}(n\tau)$ bits. This raises the question whether approximate computation of the polynomials $F_{I,\mathrm{rev}}$ yields any improvement with respect to the precision demand during the computation and, thus, also with respect to the bit complexity of the Descartes method.

This question has been addressed in a series of previous papers: Johnson and Krandick [19] introduced a hybrid method that uses interval arithmetic based on floating point computation (up to a certain fixed precision) to compute the polynomials $F_{I,\mathrm{rev}}$. This makes it possible to determine the signs of the coefficients of $F_{I,\mathrm{rev}}$ (and, thus, to use Descartes' Rule of Signs) for most of the intervals considered within the subdivision process by using approximate arithmetic, whereas, for the remaining intervals, the method falls back to exact computation. Hence, floating point arithmetic is used as a filter which decreases the precision demand for most intervals; however, no improvement is achieved with respect to the worst case bit complexity. Rouillier and Zimmermann [34] modified the latter approach by arbitrarily increasing the working precision at each stage of the algorithm. It is currently one of the fastest algorithms in practice (e.g. the univariate solver in MAPLE is based upon this method); however, no result on its precision demand and its computational complexity is known, and we expect that, without further modifications, there is no improvement upon the bound $\tilde{O}(n^4\tau^2)$ in the worst case. There also exist "approximate versions" of the Descartes method for which complexity results are known: In [14], Eigenwillig et al. proposed a randomized algorithm which is similar to the one from Rouillier and Zimmermann in the sense that it computes interval approximations of the polynomials $F_{I,\mathrm{rev}}$; in fact, the method works with the Bernstein representation and not with the monomial representation of $F$.
However, the main difference is that the subdivision points are randomly chosen in order to avoid unnecessarily large working precisions. The algorithms from [25, 35] are both deterministic, and they both start with a specific rational approximation $\tilde{F}$ of $F$ for which isolating intervals are computed. Eventually, the isolating intervals for $F$ are obtained by enlarging the isolating intervals for $\tilde{F}$. It has been shown that, for integer polynomials, all of the latter methods (i.e. [14, 25, 35]) need $\tilde{O}(n^4\tau^2)$ bit operations to isolate all real roots. In summary, there exists no theoretical proof for the improved efficiency of an "approximate" Descartes method as observed in practice.

Main Results. A main contribution of this paper is to close the gap between theory and practice described above by introducing a modified Descartes method, denoted RISOLATE, which combines the Descartes and the Bolzano method [9, 38, 46]. More precisely, for discarding intervals that do not contain any root, our method mainly uses Descartes' Rule of Signs, whereas an interval is confirmed to be isolating via a sign-change test at the endpoints of the interval and Rouché's Theorem. RISOLATE is guaranteed to succeed (i.e. the method returns an exact result for any given $F$) with a working precision bounded by $\tilde{O}(n\tau)$ in the worst case, and the size of the recursion tree is bounded by $\tilde{O}(n\tau)$. This eventually yields the improved bound $\tilde{O}(n^3\tau^2)$ for isolating all real roots of $F$. Before we give more details, we briefly sketch how this improvement is possible: For an interval $I = (a, b)$, let

$$F_I(x) := F(a + (b-a)\cdot x). \qquad (1.3)$$

Then, it holds that $F_{I,\mathrm{rev}}(x) = (x+1)^n \cdot F_I(1/(x+1))$. Hence, from an approximation $\tilde{F}_I$ of $F_I$ to $L + n$ bits after the binary point$^{4}$, we can directly compute an approximation $\tilde{F}_{I,\mathrm{rev}}(x)$ of $F_{I,\mathrm{rev}}(x)$ to $L$ bits after the binary point (see Lemma 1 (c)).
Thus, in essence, we can restrict to the computation of sufficiently good approximations of the polynomials $F_I$. In Section 2.3, we show that, for an arbitrary approximation of $F$ to $\rho_0 = \tilde{O}(n\tau)$ bits after the binary point, corresponding roots of $F$ and its approximation are almost at the same location with respect to their separations; see Theorem 3 and Appendix 6.2 for a more precise result. We conclude that, for isolating the roots of $F$, it should also suffice to consider approximations $\tilde{F}_I$ of $F_I$ to $\rho_0$ bits after the binary point. But how can we compute such approximations $\tilde{F}_I$ efficiently? Let $h_0$, with $h_0 = \tilde{O}(n\tau)$, denote an upper bound on the depth of the recursion tree induced by the (modified) Descartes method. If we start with an approximation of $F$ to $\rho_0 + 2h_0 = \tilde{O}(n\tau)$ bits after the binary point, then we can recursively compute approximations $\tilde{F}_I$ of $F_I$ such that the approximation error at most quadruples in each bisection step (Lemma 1), and thus each $F_I$ is approximated to at least $\rho_0$ bits after the binary point. The polynomials $\tilde{F}_I$ have bitsize $\tilde{O}(n\tau)$ (instead of $\tilde{O}(n^2\tau)$ for the exact counterpart $F_I$), which reduces the cost at each node by a factor of $n$; see also Figure 1.1 for an example in the more general setting, where $F$ has arbitrary real coefficients.

Footnote 4. More precisely, each coefficient of $F_I$ is approximated to an absolute error of less than $2^{-L-n}$.

[Figure 1.1: recursion tree. Root $I_0 = (-\frac12, \frac12)$ with $F_{I_0} = 16\sqrt{2}x^2 - 16\sqrt{2}x - 8x + 4 + \frac{\pi}{8} + 4\sqrt{2} \approx \tilde{F}_{I_0} = \frac{11585}{512}x^2 - \frac{31363}{1024}x + \frac{5145}{512}$. Children $I_1 = (-\frac12, 0)$ and $I_2 = (0, \frac12)$. Children of $I_2$: $(0, \frac14)$ and $I = (\frac14, \frac12)$ with $F_I = \sqrt{2}x^2 + (2\sqrt{2} - 2)x + \sqrt{2} - 2 + \frac{\pi}{8} \approx \tilde{F}_I = \frac{181}{128}x^2 + \frac{53}{64}x - \frac{25}{128}$. Leaves: $(0, \frac18)$, $(\frac18, \frac14)$, $(\frac14, \frac38)$, $(\frac38, \frac12)$.]

Fig. 1.1. Consider the recursion tree induced by the Descartes method when applied to $F(x) := 16\sqrt{2}\cdot x^2 - 8\cdot x + \frac{\pi}{8}$ (with roots $x_1 = 0.06\ldots$ and $x_2 = 0.29\ldots$). For each interval $I = (a, b)$ in the subdivision process, we have to compute the polynomial $F_I(x)$. For instance, for $I = (\frac14, \frac12)$, it holds that $F_I(x) = F(\frac14 + \frac{x}{4}) = \sqrt{2}x^2 + (2\sqrt{2} - 2)x + \sqrt{2} - 2 + \frac{\pi}{8}$. We start with an approximation $\tilde{F}_{I_0}$ of $F_{I_0}$ to a certain number $\rho_{I_0} = \rho$ of bits after the binary point. Then, we recursively compute approximations $\tilde{F}_I$ of $F_I$ to $\rho_I$ bits, where $\rho_I$ is updated in each step. Notice that the polynomials $\tilde{F}_I$ do not necessarily correspond to a specific initial approximation $G$ of $F$, that is, there might exist no polynomial $G$ such that $G_I = \tilde{F}_I$ for all considered $I$. In the above example, we start with $\tilde{F}_{I_0}(x) = \frac{11585}{512}x^2 - \frac{31363}{1024}x + \frac{5145}{512}$, which approximates $F_{I_0}(x) = F(-\frac12 + x) = 16\sqrt{2}x^2 - 16\sqrt{2}x - 8x + 4 + \frac{\pi}{8} + 4\sqrt{2}$ to $\rho_{I_0} = 10$ bits. Then, $\tilde{F}_{I_0}(\frac{x}{2})$ and $\tilde{F}_{I_0}(\frac12 + \frac{x}{2})$ are evaluated and the result is rounded to 9 bits after the binary point. The resulting polynomials are then approximations of $F_{I_1}(x) = F(-\frac12 + \frac{x}{2})$ and $F_{I_2}(x) = F(\frac{x}{2})$ to $\rho_{I_1} = \rho_{I_2} = 8$ bits, respectively; see Lemma 1 for details. In the following bisection steps, we proceed in exactly the same manner. For instance, for the interval $I = (\frac14, \frac12)$, we obtain $\tilde{F}_I(x) = \frac{181}{128}x^2 + \frac{53}{64}x - \frac{25}{128}$, which approximates $F_I$ to $\rho_I = 6$ bits after the binary point.

The main difficulty is that the values $h_0$ and $\rho_0$ are not known in advance and that considering worst case bounds for these values (e.g. for integer polynomials) yields an algorithm which might achieve an improved worst case complexity bound but is not practical at all; see also Section 4.4.1 for more details. In contrast, we propose an adaptive algorithm which succeeds with a working precision comparable to the precision that is actually needed for the given input. In addition, the size of the recursion tree directly depends on the actual separations of the roots; cf. the complexity bounds in (1.6) and (1.7) for a more precise result. In the above considerations, we mainly focused on polynomials with integer coefficients.
However, the proposed algorithm RISOLATE applies not only to integer polynomials but also to arbitrary square-free polynomials$^{5}$

$$F(x) := \sum_{i=0}^{n} A_i x^i \in \mathbb{R}[x], \qquad \frac{1}{4} \le |A_n| \le 1, \qquad (1.4)$$

with real valued coefficients $A_i$, where we assume the existence of a coefficient oracle that provides arbitrarily good approximations of the coefficients at the cost of reading the approximation. In this setting, the complexity results are stated exclusively in terms of the geometry of the roots and not in terms of the complexity of the coefficients. More precisely, let

- $\xi_1, \ldots, \xi_n \in \mathbb{C}$ denote the (complex) roots of $F$,
- $\Gamma_F := \log^+ \max_i |\xi_i|$ the logarithmic root bound of $F$,
- $\sigma_i := \sigma(\xi_i, F) := \min_{j \ne i} |\xi_i - \xi_j|$ the separation of the root $\xi_i$,
- $\sigma_F := \min_i \sigma_i$ the separation of $F$, and
- $\Sigma_F := \sum_{i=1}^{n} \log^+ \sigma_i^{-1}$, (1.5)

then RISOLATE induces a recursion tree of size

$$\tilde{O}(n\Gamma_F + \Sigma_F), \qquad (1.6)$$

and it needs

$$\tilde{O}(n(n\Gamma_F + \Sigma_F)^2) \qquad (1.7)$$

bit operations to isolate all real roots of $F$. The coefficients of $F$ have to be approximated to $\tilde{O}(n\Gamma_F + \Sigma_F)$ bits after the binary point. We remark that the bound in (1.7) factorizes into the bound (1.6) on the size of the recursion tree, the precision $\tilde{O}(n\Gamma_F + \Sigma_F)$ to carry out the computations, and a factor $\tilde{O}(n)$ for the number of arithmetic operations needed to process an interval $I$ in the recursion tree.

Footnote 5. The additional requirement for the leading coefficient $A_n$ yields a simpler overall presentation. Notice that, for general values $A_n$, we first have to multiply the polynomial $F$ by some $2^t$, with $t \in \mathbb{Z}$, such that $2^t \cdot |A_n|$ is contained in $[1/4, 1]$.

Furthermore, for a polynomial $F$ with integer coefficients of bit size $\tau$ or less, we can first divide $F$ by its leading coefficient (to meet the requirements in (1.4)) and then apply RISOLATE to the polynomial $F/|A_n|$.
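To make the parameters in (1.5) concrete, the following minimal Python sketch evaluates $\Gamma_F$, the separations $\sigma_i$, $\sigma_F$ and $\Sigma_F$ for a given list of roots. It assumes that the roots are already known numerically, which is of course not the situation of the algorithm itself; the helper names `log_plus` and `root_geometry` are hypothetical, and floating point arithmetic is used purely for illustration.

```python
import math


def log_plus(x):
    """log+(x) := log2(max(2, |x|)), as defined in the introduction."""
    return math.log2(max(2.0, abs(x)))


def root_geometry(roots):
    """Return (Gamma_F, [sigma_i], sigma_F, Sigma_F) for at least two distinct roots."""
    gamma = log_plus(max(abs(z) for z in roots))
    seps = [
        min(abs(z - w) for j, w in enumerate(roots) if j != i)
        for i, z in enumerate(roots)
    ]
    sigma_f = min(seps)
    big_sigma = sum(log_plus(1.0 / s) for s in seps)
    return gamma, seps, sigma_f, big_sigma
```

For the roots $0.25$, $0.5$ and $4$, this gives $\Gamma_F = 2$, separations $0.25$, $0.25$ and $3.5$, and $\Sigma_F = 2 + 2 + 1 = 5$; two clustered roots contribute most of $\Sigma_F$, whereas the isolated root contributes only the minimal value $1$.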
For this special case of integer coefficients, the bounds on the size of the recursion tree and the bit complexity simplify to $\tilde{O}(n\tau)$ and $\tilde{O}(n^3\tau^2)$, respectively, because $\Gamma_F = O(\tau)$, $\log \mathrm{Mea}(F) = O(\tau + \log n)$, and $\Sigma_F = \tilde{O}(n\tau)$; see also Appendix 6.2.

Related Work. The literature on root finding mainly distinguishes between numerical methods that use approximate computation and methods that use exact arithmetic. Many numerical algorithms (e.g. based on Newton-Raphson iteration, the Weierstrass-Durand-Kerner method, (inverse) power iteration, eigenvalue computation, etc.) are widely used and effective in practice$^{6}$ but lack a guarantee on the global behavior. A prominent example is the Weierstrass-Durand-Kerner method, for which there is still no known proof that the method converges for arbitrary given starting values. For a more detailed discussion, we refer to [32]. In parallel, there is steady ongoing research on subdivision algorithms which perform rational operations on the input coefficients. Algorithms of the latter kind are the Descartes method (e.g. [10, 13, 34]), the Bolzano method [9, 38], the Sturm method [11, 46], and the continued fraction method [2, 24, 41, 44].$^{7}$ Many of these methods have been integrated into computer algebra systems, and experiments have shown their practical relevance [18, 34]. In addition, their computational complexity has been well studied [11, 15, 38, 44]. Current experimental data shows that an approximate variant of the Descartes method [34] performs best for most polynomials, whereas, for some particularly hard instances (e.g. Mignotte polynomials), the continued fraction approach is more efficient. From a theoretical complexity point of view, the first benchmark was set by A. Schönhage [39] in 1982. He combines a newly introduced concept, the splitting circle method, with techniques from numerical analysis (Newton iteration, Graeffe's method, discrete Fourier transforms) and fast algorithms for polynomial and integer multiplication.
With respect to the benchmark problem (i.e. isolating all roots of a polynomial $F$ of degree $n$ with integer coefficients of bit size $\tau$ or less), his method achieves the bit complexity bound $\tilde{O}(n^3\tau)$. Pan and others [32, 33] gave theoretical improvements which yield record bounds with respect to bit complexity and arithmetic complexity. In particular, [33, Theorem 2.1.1] implies that isolating all complex roots of $F$ needs no more than $\tilde{O}(n^2\tau)$ bit operations. Very recent work [26] turns Pan's factorization algorithm into a root isolation method that achieves a bit complexity bound which adapts directly to the geometry of the roots. That is, similar to the bound in (1.7), the bound depends exclusively on the absolute values of the roots and the pairwise distances between them. However, the main drawback of the asymptotically fast algorithms above is that they are rather involved and difficult to implement. In fact, Pan's method has not been implemented, whereas Schönhage's method has not proven to be efficient in practice so far; see [17] for a "proof of concept" implementation of the splitting circle method within the computer algebra system Pari/GP. All of the exact subdivision algorithms mentioned above (i.e. Sturm, Bolzano, Descartes, or the continued fraction method) need $\tilde{O}(n^4\tau^2)$ bit operations to isolate all real roots of $F$; thus they lag behind the (asymptotically) fastest method by a factor of $n^2\tau$.

Footnote 6. E.g. MPSOLVE [8] is a highly efficient implementation of the Aberth-Ehrlich method.

Footnote 7. The literature on root solving is extensive; hence, we decided to restrict ourselves to a selection of representative papers and refer the reader to the references given therein.

In a very recent work [37], we introduced a variant of the Descartes method which uses Newton iteration to speed up convergence. The method exclusively performs exact rational arithmetic and has bit complexity $\tilde{O}(n^3\tau)$, which is still worse by a factor of $n$ than the method from Pan.
When compared to other exact subdivision methods, the improvement in [37] mainly stems from the fact that it achieves quadratic convergence in most iterations, which yields a recursion tree of almost optimal size. In contrast, the improvement with respect to bit complexity (i.e. from $\tilde{O}(n^4\tau^2)$ to $\tilde{O}(n^3\tau^2)$) achieved by the algorithm RISOLATE in this paper is due to the use of approximate arithmetic with a considerably smaller working precision than needed for the exact counterpart.

A first version [36] of this paper appeared on arXiv in November 2010. Since that time, the algorithm RISOLATE has been implemented as a core function in MATHEMATICA (see [42]) and the complexity results have been applied in a series of papers (e.g. [20, 21, 37, 42, 43]). In this context, we would also like to remark that our complexity results are already confirmed by experiments [42, 43] showing that the complexity of RISOLATE is exclusively related to the geometry of the roots. Furthermore, the adaptiveness of our bound has turned out to be very useful in the analysis [21] of an algorithm to compute the topology of an algebraic curve, which makes extensive use of amortization. For the future, we expect that there will be a series of further complexity results based on our adaptive complexity bound for real root isolation.

Acknowledgments

Special thanks go to all anonymous reviewers for their constructive and detailed criticism, which has helped to improve the quality and exposition of this contribution.

2. Preliminaries

2.1. Notations

In addition to the definitions from (1.4) and (1.5), we define $w(I) := b - a$ the width, $m(I) := \frac{a+b}{2}$ the center, and $r(I) := \frac{w(I)}{2}$ the radius of an interval $I = (a, b)$. Furthermore,

$$I^+ = (a^+, b^+) := \left(a - \frac{w(I)}{4n},\ b + \frac{w(I)}{4n}\right) \quad\text{and}\quad \tilde{I} = (\tilde{a}, \tilde{b}) := \left(a - \frac{w(I)}{2n},\ b + \frac{w(I)}{2n}\right)$$

denote extensions of $I$ by $\frac{w(I)}{4n}$ and $\frac{w(I)}{2n}$ (to both sides), respectively.
We will need these intervals for our modified version of the Descartes method as presented in Section 3. For an arbitrary point $m \in \mathbb{C}$ and a positive real value $r$, we define $\Delta = \Delta_r(m)$ to be the open disk with center $m$ and radius $r$. $\bar{\Delta}$ and $\bar{I}$ denote the closure of a disk $\Delta$ and an interval $I$, respectively.

2.2. Scaling the Polynomial

Instead of isolating the roots of the given polynomial $F(x) = \sum_{i=0}^{n} A_i x^i$ as defined in (1.4), we consider the equivalent task of isolating the roots of a "scaled" polynomial

$$f(x) = \sum_{i=0}^{n} a_i x^i := \sum_{i=0}^{n} (A_i \cdot 2^{i\cdot\Gamma}) \cdot x^i = F(2^\Gamma \cdot x), \qquad (2.1)$$

where $\Gamma \in \mathbb{N}$ is an integer approximation of the exact logarithmic root bound $\Gamma_F = \log^+(\max_i |\xi_i|)$ of $F$ such that

$$\Gamma_F + 1 \le \Gamma \le \Gamma_F + 8\log n + 1. \qquad (2.2)$$

According to Appendix 6.1, we can compute such a $\Gamma$ with $\tilde{O}(n^2\Gamma_F)$ bit operations from an approximation of $F$ to $\tilde{O}(n\Gamma_F)$ bits after the binary point. From the definition of $\Gamma$, it follows that the roots $z_1 := \xi_1 \cdot 2^{-\Gamma}, \ldots, z_n := \xi_n \cdot 2^{-\Gamma}$ of $f$ are contained within the disk $\Delta_{1/2}(0)$. Furthermore, the absolute value of each coefficient $a_i$ is upper bounded by $2^{O(n\Gamma)}$ since $|A_i| \le \binom{n}{i} \mathrm{Mea}(F) \le 2^{n + n\Gamma_F} \le 2^{n\Gamma}$ for all $i$. We further remark that the separations of corresponding roots of $F$ and $f$ scale by a factor of $2^\Gamma$ (i.e. $\sigma(\xi_i, F) = 2^\Gamma \cdot \sigma(z_i, f)$). Thus, we have

$$\Sigma_f = \sum_{i=1}^{n} \log^+ \sigma(z_i, f)^{-1} \le \Sigma_F + n\Gamma = O(n\Gamma_F + n\log n + \Sigma_F) = \tilde{O}(n\Gamma_F + \Sigma_F). \qquad (2.3)$$

2.3. Approximating Polynomials

We assume the existence of a coefficient oracle which, for a given $\rho \in \mathbb{N}$, provides approximations of the coefficients of $F$ to $\rho$ bits after the binary point. More precisely, each coefficient $A_i$ is approximated by a binary fraction $\tilde{A}_i = m_i \cdot 2^{-\rho}$ with $m_i \in \mathbb{Z}$ and $|A_i - \tilde{A}_i| \le 2^{-\rho}$, e.g., $\tilde{A}_i = \mathrm{sign}(A_i) \cdot \lfloor |A_i \cdot 2^\rho| \rfloor \cdot 2^{-\rho}$. We call a polynomial $\tilde{F} \in \mathbb{Q}[x]$ obtained in this way a $\rho$-binary approximation of $F$. We only consider the cost for reading (i.e. $O(n(n\Gamma_F + \rho))$) but not for computing such an approximation.
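The scaling (2.1) and the truncation rule $\tilde{A}_i = \mathrm{sign}(A_i) \cdot \lfloor |A_i \cdot 2^\rho| \rfloor \cdot 2^{-\rho}$ can be illustrated with the short Python sketch below. As a simplifying assumption, the coefficients are given as exact rationals, which stand in for the real-valued coefficients delivered by a coefficient oracle; the function names `binary_approx` and `scale_polynomial` are hypothetical.

```python
from fractions import Fraction
from math import floor


def binary_approx(a, rho):
    """rho-binary approximation sign(a) * floor(|a| * 2^rho) * 2^(-rho) of a rational a."""
    a = Fraction(a)
    sign = (a > 0) - (a < 0)
    return Fraction(sign * floor(abs(a) * 2**rho), 2**rho)


def scale_polynomial(coeffs, gamma):
    """Coefficients a_i = A_i * 2^(i*gamma) of f(x) = F(2^gamma * x), low degree first."""
    return [Fraction(c) * 2**(i * gamma) for i, c in enumerate(coeffs)]
```

For instance, $F(x) = x^2 - \frac52 x + \frac32$ has the roots $1$ and $\frac32$, so $\Gamma_F = 1$ and $\Gamma = 2$ satisfies (2.2); `scale_polynomial([Fraction(3, 2), Fraction(-5, 2), 1], 2)` yields the coefficients of $f(x) = 16x^2 - 10x + \frac32$, whose roots $\frac14$ and $\frac38$ lie in $\Delta_{1/2}(0)$.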
Notice that, in order to obtain a $\rho$-binary approximation of the scaled polynomial $f$, we have to approximate $F$ to $n\Gamma + \rho$ bits after the binary point since the $i$-th coefficient of $F$ is shifted by $i \cdot \Gamma$ bits.

For an arbitrary polynomial $g(x) := \sum_{i=0}^{n} g_i x^i \in \mathbb{C}[x]$ with complex coefficients and an arbitrary non-negative real number $\mu \in \mathbb{R}_{\ge 0}$, we define

$$[g]_\mu := \left\{ \tilde{g}(x) = \sum_{i=0}^{n} \tilde{g}_i x^i \in \mathbb{C}[x] : |g_i - \tilde{g}_i| \le \mu \text{ for all } i = 0, \ldots, n \right\}$$

the set of all $\mu$-approximations of $g$. We remark that, since the coefficients of modulus less than $\mu$ can be approximated by zero, a $\mu$-approximation $\tilde{g}$ of $g$ might have lower degree than $g$.

Example. For $g(x) := \frac{12256}{65589}x^{10} - 2x^2 + \frac{1}{243}x - \frac{9}{16}$, the polynomial $\tilde{g}(x) := \frac{11}{64}x^{10} - 2x^2 - \frac{9}{16}$ constitutes a 6-binary approximation and $\tilde{g}(x) := -2x^2 - \frac{3}{4}$ a 2-binary approximation of $g$.

2.4. Taylor Shifts

The following lemma provides error bounds on how the absolute approximation error $\mu$ of a polynomial $\tilde{g} \in [g]_\mu$ scales under the transformation $x \mapsto m + \lambda \cdot x$ for some special values of $m \in \mathbb{C}$ and $\lambda \in \mathbb{R}\setminus\{0\}$:

Lemma 1. For $\mu \in \mathbb{R}^+_0$ and $\tilde{g} \in [g]_\mu$ an arbitrary $\mu$-approximation of a polynomial $g \in \mathbb{C}[x]$ of degree $n$, it holds that
(a) $\tilde{g}(\frac12 + \frac12 \cdot x) \in [g(\frac12 + \frac12 \cdot x)]_{2\mu}$,
(b) $\tilde{g}(-\frac{1}{4n} + (1 + \frac{1}{2n}) \cdot x) \in [g(-\frac{1}{4n} + (1 + \frac{1}{2n}) \cdot x)]_{4\mu}$,
(c) $\tilde{g}(-\frac12 + x) \in [g(-\frac12 + x)]_{2^n\mu}$, and $\tilde{g}(1 + x) \in [g(1 + x)]_{2^n\mu}$.

Proof. For $\mu(x) := (g - \tilde{g})(x) = \mu_n x^n + \ldots + \mu_1 x + \mu_0$, the absolute value of each coefficient $\mu_i$ is bounded by $\mu$. Let $m \in \mathbb{C}$ and $\lambda \in \mathbb{R}\setminus\{0\}$ be arbitrary values, then

$$\mu(m + \lambda x) = \sum_{i=0}^{n} \mu_i (m + \lambda x)^i = \sum_{i=0}^{n} \mu_i \sum_{k=0}^{i} \binom{i}{k} x^k \lambda^k m^{i-k} = \sum_{k=0}^{n} x^k \sum_{i=k}^{n} \binom{i}{k} \mu_i m^{i-k} \lambda^k. \qquad (2.4)$$

Thus, for $|m| < 1$, the absolute value of the coefficient of $x^k$ is bounded by

$$\mu|\lambda|^k \cdot \sum_{i \ge k} \binom{i}{k} |m|^{i-k} = \mu|\lambda|^k \cdot \sum_{i \ge 0} \binom{k+i}{k} |m|^i = \mu|\lambda|^k \cdot \frac{1}{(1 - |m|)^{k+1}}, \qquad (2.5)$$

where we used

$$(1 - |m|)^{-(k+1)} = \sum_{i \ge 0} \binom{-(k+1)}{i} (-1)^i |m|^i = \sum_{i \ge 0} \binom{k+i}{i} |m|^i = \sum_{i \ge 0} \binom{k+i}{k} |m|^i.$$

For $m = \lambda = 1/2$, it follows that the absolute value of all coefficients of $\mu(\frac12 + \frac12 x)$ is bounded by $2\mu$. This shows (a). For $m = -\frac{1}{4n}$ and $\lambda = 1 + \frac{1}{2n}$, (2.5) implies that

$$\tilde{g}\left(-\tfrac{1}{4n} + (1 + \tfrac{1}{2n}) \cdot x\right) \in \left[g\left(-\tfrac{1}{4n} + (1 + \tfrac{1}{2n}) \cdot x\right)\right]_{\mu \cdot \frac{8}{7} \cdot \left(\frac{1 + 1/(2n)}{1 - 1/(4n)}\right)^n} \subset \left[g\left(-\tfrac{1}{4n} + (1 + \tfrac{1}{2n}) \cdot x\right)\right]_{4\mu}$$

because $\frac{8}{7} \cdot \left(\frac{1 + 1/(2n)}{1 - 1/(4n)}\right)^n \le 4$. Hence, (b) follows. The first part of (c) is also a direct implication of (2.5). The second claim in (c) follows from the computation in (2.4) since the absolute value of the coefficient of $x^k$ is then ($m = \lambda = 1$) bounded by

$$\mu \cdot \sum_{i=k}^{n} \binom{i}{k} = \mu \cdot \sum_{i=k}^{n} \binom{i}{i-k} = \mu \cdot \sum_{i=0}^{n-k} \binom{i+k}{i} \le \mu \cdot \sum_{i=0}^{n-k} \binom{n}{i} \le 2^n \cdot \mu. \qquad \Box$$

2.5. On Sufficiently Good Approximation

In the next step, we derive a bound on how well $f$ has to be approximated by some $\tilde{f}$ such that, for all $i$, the distance of corresponding roots $z_i$ and $\tilde{z}_i$ of $f$ and $\tilde{f}$ is small with respect to the separation $\sigma(z_i, f)$. There exist general worst-case perturbation bounds (e.g. [40, Thm. 2.7] or [23, Chapter 15]) that apply to polynomials with multiple roots and which only depend on the distance $\|f - \tilde{f}\|_1$ between $f$ and $\tilde{f}$.$^{8}$ For polynomials with roots of very large multiplicity, these bounds are nearly optimal. However, they often constitute vast overestimations of the amount of perturbation, in particular for polynomials with well separated roots. In contrast, we provide a more adaptive, but implicit, bound depending on parameters, such as the separations of the roots and the absolute values of the derivatives at the roots, which cannot be derived directly from the coefficients of $f$ (or $\tilde{f}$). However, our algorithm as presented in Section 4 is designed in such a way that it eventually succeeds with a working precision that is related to our adaptive bound. We further remark that our bound cannot be directly derived from the bound in [40, Thm. 2.7] and vice versa.

Footnote 8. In the context of real root isolation, the bound in [40, Thm. 2.7] has been used in [25] in order to derive isolating intervals for the roots of a real polynomial $f \in \mathbb{R}[x]$ from corresponding isolating intervals for the roots of a rational approximation $\tilde{f} \in \mathbb{Q}[x]$.

The following considerations are mainly adopted from our studies in [35]. For the sake of comprehensibility, we decided to briefly review the results in this paper as well. We start with the following definition:

Definition 2. For $t$, with $t \ge 1$, an arbitrary real value and $f$ a polynomial as in (2.1), we define

$$\mu(f, t) := \frac{1}{t} \cdot \min_{i=1,\ldots,n} \frac{\sigma(z_i, f) \cdot |f'(z_i)|}{8n^2}. \qquad (2.6)$$

We call a $\rho \in \mathbb{N}$ sufficiently large with respect to $f$ if$^{9}$

$$\rho \ge \rho_f := \lceil -\log \mu(f, 64n^2) \rceil. \qquad (2.7)$$

Footnote 9. This definition is motivated by our results in Theorem 3 and Section 4.1.

Notice that $\rho_f = O(\Sigma_f + \log n - \log |a_n|)$ and $\rho_f = O(\Sigma_F + \log n)$ because of

$$\sigma(z_i, f) \cdot |f'(z_i)| = \sigma(z_i, f) \cdot |a_n| \cdot \prod_{j \ne i} |z_i - z_j| \ge \sigma(z_i, f) \cdot |a_n| \cdot \prod_{j \ne i} \sigma(z_j, f) = \frac{|a_n|}{2^{n\Gamma}}\, \sigma(\xi_i, F) \prod_{j \ne i} \sigma(\xi_j, F) \ge \frac{1}{4} \cdot 2^{-\Sigma_F}. \qquad (2.8)$$

The following theorem answers our initial question of how well $f$ has to be approximated by some $\tilde{f}$ in order to ensure that corresponding roots stay at almost "the same place" with respect to their separations:

Theorem 3. Let $f$ be the polynomial as defined in (2.1), $t \ge 1$ and $\tilde{f} \in [f]_{\mu(f,t)}$.
(a) For all $i = 1, \ldots, n$, the disk $\Delta_i := \Delta_{\frac{\sigma(z_i, f)}{tn}}(z_i)$ contains the root $z_i$ of $f$ and a corresponding root $\tilde{z}_i$ of $\tilde{f}$.
(b) For each $z \in \mathbb{C} \setminus \bigcup_{i=1}^{n} \Delta_i$, it holds that $|f(z)| > (n+1)\mu(f, t)$.
(c) If $\rho \ge \rho_f$, then each root $z_i$ moves by at most $\frac{\sigma(z_i, f)}{64n^3}$ when passing from $f$ to an arbitrary $\tilde{f} \in [f]_{2^{-\rho}}$. In particular, real roots of $f$ stay real and non-real roots stay non-real. Furthermore, for any $z \in \mathbb{C}$ with $|z - z_i| \ge \frac{\sigma(z_i, f)}{64n^3}$ for all $i$, it holds that $|f(z)| > (n+1)2^{-\rho_f}$.

Proof. Since all roots of $f$ are contained within $\Delta_{1/2}(0)$, it follows that $\sigma(z_i, f) < 1$ for all $i$ and, thus, each disk $\Delta_i$ is completely contained within the unit disk. For an arbitrary point $z \in \partial\Delta_i$ on the boundary of $\Delta_i$, we have

$$|f(z)| = |a_n| \prod_{j=1}^{n} |z - z_j| = \frac{\sigma(z_i, f)}{tn} \cdot |a_n| \cdot \Bigl(\prod_{1 \le j \le n,\, j \ne i} |z_i - z_j|\Bigr) \Bigl(\prod_{1 \le j \le n,\, j \ne i} \left|\frac{z - z_j}{z_i - z_j}\right|\Bigr)$$
$$= \frac{\sigma(z_i, f) \cdot |f'(z_i)|}{tn} \prod_{1 \le j \le n,\, j \ne i} \left|\frac{z - z_j}{z_i - z_j}\right| \ge \frac{\sigma(z_i, f) \cdot |f'(z_i)|}{tn} \prod_{1 \le j \le n,\, j \ne i} \frac{|z_i - z_j| - |z - z_i|}{|z_i - z_j|}$$
$$\ge \frac{\sigma(z_i, f) \cdot |f'(z_i)|}{tn} \left(1 - \frac{1}{tn}\right)^{n-1} > \frac{\sigma(z_i, f) \cdot |f'(z_i)|}{2.72 \cdot tn} > (n+1)\mu(f, t).$$

In addition, since $\tilde{f} \in [f]_{\mu(f,t)}$ and $|z| < 1$, we have $|(f - \tilde{f})(z)| < (n+1)\mu(f, t) < |f(z)|$. Hence, (a) follows from Rouché's Theorem applied to the disks $\Delta_i$ and the functions $f$ and $\tilde{f}$. For (b), we remark that $f$ is a holomorphic function on $\mathbb{C} \setminus \bigcup_{i=1}^{n} \Delta_i$ and, thus, $|f(z)|$ becomes minimal for a point $z$ on the boundary of one of the disks $\Delta_i$. (c) follows directly from (a), (b) and the definition of $\rho_f$ in (2.7). $\Box$

We conclude from the last theorem that it suffices to approximate the coefficients of $f$ to $\rho$ bits after the binary point, for some $\rho = O(\Sigma_f + \log n - \log |a_n|)$, to guarantee that each approximation $\tilde{f} \in [f]_{2^{-\rho}}$ has its roots at almost the same location as $f$.

2.6. The Descartes Method

We first summarize some basic facts about the Descartes method for isolating the real roots of a polynomial $f(x) = \sum_{i=0}^{n} a_i x^i \in \mathbb{R}[x]$. Descartes' Rule of Signs states that the number $\mathrm{var}(f)$ of sign changes in the coefficient sequence of $f$, that is, the number of pairs $(i, j)$ with $i < j$, $a_i a_j < 0$, and $a_{i+1} = \ldots = a_{j-1} = 0$, is not smaller than and of the same parity as the number of positive real roots of $f$. If $\mathrm{var}(f) = 0$, then $f$ has no positive real root, and if $\mathrm{var}(f) = 1$, $f$ has exactly one positive real root. The rule easily extends to an arbitrary open interval $I = (a, b)$ via a suitable coordinate transformation: The mapping $x \mapsto a + (b - a)x$ maps $(0, 1)$ bijectively onto $I$, that is, the roots of $f$ in $I$ exactly correspond to those of

$$f_I(x) := f(a + w(I)x) = f(a + (b - a)x) \qquad (2.9)$$

in $(0, 1)$.
Hence, the composition of x 7→ a + (b − a) · x and x 7→ 1/(1 + x) constitutes a bijective map from (0, ∞) to I. It follows that the positive real roots of ax + b 1 n n = (1 + x) · f fI,rev (x) := (1 + x) fI x+1 x+1 correspond bijectively to the real roots of f in I. The factor (1+x)n in the definition of fI,rev clears denominators and guarantees that fI,rev is a polynomial. fI,rev is computed from fI by reversing the coefficients (i.e. the i-th coefficient is replaced by the (n − i)-th coefficient) followed by a Taylor shift by 1 (i.e. x 7→ x + 1). We now define var( f , I) := var( fI,rev ). Based on Descartes’ Rule of Sign, Collins and Akritas introduced a bisection algorithm 10 for isolating the roots of f in an interval I0 (here, we assume that I0 = (−1/2, 1/2)). We refer the reader to [3, 4, 5, 6, 10, 13] for extensive treatments and references. V CA. The algorithm requires that the real roots of f in I0 are simple, otherwise it diverges. In each step, a set A of active intervals is maintained. Initially, A contains I0 , and the algorithm stop as soon as A becomes empty. In each iteration, some interval I ∈ A is processed; If var( f , I) = 0, then I contains no root of f and we discard I. If var( f , I) = 1, then I contains exactly one root of f and, hence, is an isolating interval for it. We add I to a list O of isolating intervals. If there is more than one sign change, we divide I at its midpoint m(I) and add the subintervals to the set of active intervals. If m(I) is a root of f , we add the trivial interval [m(I), m(I)] to the list of isolating intervals. Correctness of the algorithm follows immediately from Descartes’ Rule of Signs. Termination and complexity analysis of V CA rest on the following theorem: 10 Based on the fact that Collins and Akritas used ideas dating back to Vincent (see [5]), the algorithm has been named Vincent-Collins-Akritas method (or V CA for short). 
In this paper, we use both names, VCA and Descartes method, interchangeably.

Theorem 4 ([28, 31]). For a polynomial $f \in \mathbb{R}[x]$ and an interval $I = (a, b)$, let $v := \operatorname{var}(f, I)$.
(a) (One-Circle Theorem) If the open disk bounded by the circle centered at $m(I)$ and passing through the endpoints of $I$ contains no root of $f(x)$, then $v = 0$.
(b) (Two-Circle Theorem) If the union of the open disks bounded by the two circles centered at $m(I) \pm i\,\frac{1}{2\sqrt{3}}\,w(I)$ and passing through the endpoints of $I$ contains exactly one root of $f(x)$, then $v = 1$.

Proofs of the one- and two-circle theorems can be found in [3, 13, 22, 28, 29, 30, 31]. Theorem 4 implies that no interval $I$ of length $\sigma_f/2$ or less is split. Such an interval cannot contain two real roots and its two-circle region cannot contain any nonreal root. Thus, $\operatorname{var}(f, I) \le 1$ by Theorem 4. We conclude that the depth of the recursion tree is bounded by $1 + \log \sigma_f^{-1}$. Furthermore, the following holds (see [13, Cor. 2.27] or [27, Prop. 3.1] for self-contained proofs):

Theorem 5. Let $I$ be an interval and $I_1$ and $I_2$ be two disjoint subintervals of $I$. Then,

\[ \operatorname{var}(f, I_1) + \operatorname{var}(f, I_2) \le \operatorname{var}(f, I). \]

According to the above theorem, there cannot be more than $n/2$ intervals $I$ with $\operatorname{var}(f, I) \ge 2$ at any level of the recursion. Therefore, the size of the recursion tree $T_{\mathrm{VCA}}$ is bounded by $n(1 + \log \sigma_f^{-1})$. For polynomials with integer coefficients of maximal bitsize $\tau$, it has been shown that $-\log \sigma_f = O(n(\log n + \tau))$; thus, the latter bound becomes $\tilde O(n^2 \tau)$. However, a more refined argument [13] shows that $|T_{\mathrm{VCA}}|$ is even bounded by $\tilde O(n\tau)$, which is due to amortization effects over the separations of all roots; see Appendix 6.2.

The computation of $f_{I,\mathrm{rev}}$ at each node of the tree is costly. It is better to store with every interval $I = (a, b)$ the polynomial $f_I(x) = f(a + x \cdot (b - a))$.
If $I$ is split at its midpoint $m(I)$ into $I_\ell = (a, m(I))$ and $I_r = (m(I), b)$, the polynomials associated with the subintervals are $f_{I_\ell}(x) = f_I\!\left(\frac{x}{2}\right)$ and $f_{I_r}(x) = f_I\!\left(\frac{1+x}{2}\right) = f_{I_\ell}(1 + x)$. Also, $f_{I,\mathrm{rev}}(x) = (1 + x)^n f_I\!\left(\frac{1}{1+x}\right)$. If the coefficients of $f$ are integers (or dyadic fractions) of bitsize $\tau$, then the coefficients grow by $n$ bits in every bisection step. Thus, for a node $I$ of depth $h$, the bitsize $\tau_h$ of the coefficients of $f_I$ is bounded by $\tau_h \le \tau + nh$. Hence, using asymptotically fast Taylor shift (see [45, 16]), the number of bit operations needed to compute $f_{I_\ell}$, $f_{I_r}$ and $f_{I,\mathrm{rev}}$ from $f_I$ is $\tilde O(n(nh + \tau))$. Since the depth of the recursion tree is $\tilde O(n\tau)$, each $f_I$ has coefficients of bitsize $\tilde O(n^2 \tau)$ and, thus, the cost at each node is bounded by $\tilde O(n^3 \tau)$. Eventually, the total cost for VCA is in $\tilde O(n^3 \tau) \cdot \tilde O(n\tau) = \tilde O(n^4 \tau^2)$.

3. A Modified Descartes Method

In a computational model where exact operations on real numbers are assumed to be available at unit cost, the Descartes method can be directly used to isolate the real roots of the polynomial $f$ as defined in (2.1). Namely, in such a model, we can compute the number of sign variations for the polynomial $f_{I,\mathrm{rev}}$ and the sign of $f$ at the midpoint $m(I)$ for each node $I$ of the recursion tree, no matter whether $f$ has rational, algebraic, or transcendental coefficients. However, for an actual implementation, these computations turn out to be hard, or even infeasible, in general. Namely, if one of the coefficients of $f_{I,\mathrm{rev}}$ equals zero (e.g. this is the case if one of the endpoints of $I$ is a root of $f$), then deciding the sign of this coefficient becomes infeasible since we can only ask for approximations of $f$. The decision problem becomes hard if one of the coefficients has a very small absolute value because, in this case, we have to run our computations with a very large working precision.
We further remark that, even for algebraic coefficients (with known algebraic representation), the decision problem might be hard because it amounts to comparing algebraic numbers of large degree. In order to overcome these issues, we do not consider the original version of the Descartes method but a modified variant which completely avoids such difficult decision problems. More precisely, we will show that our method always succeeds with a working precision comparable to $\rho_f$. A crucial step in our approach is to replace the inclusion predicate $\operatorname{var}(f, I) = 1$, which is used in the Descartes method to confirm an interval to be isolating, by a predicate used in the Bolzano method. Section 3.1 summarizes some useful results adopted from our studies on the Bolzano method [38], whereas, in Section 3.2, our modified Descartes method is formulated.

3.1. The $T[g, K](\cdot)$-Test: Existence of Roots

For $g \in \mathbb{C}[x]$, $m \in \mathbb{C}$ and positive real values $K$ and $r$, we consider the following test, which has already been introduced in [46] in a less general form (see footnote 11):

\[ T[g, K](m, r): \quad t[g, K](m, r) := |g(m)| - K \sum_{k \ge 1} \left|\frac{g^{(k)}(m)}{k!}\right| r^k > 0. \tag{3.1} \]

In order to simplify notation, we also write $T[g, K](\Delta)$ or $T[g, K](I)$ instead of $T[g, K](m, r)$, where $\Delta = \Delta_r(m)$ is a disk or $I = (a, b)$ an interval with midpoint $m = m(I)$ and radius $r = r(I)$. If the polynomial $g$ is fixed and no mix-up is possible, we further omit the "$g$" and write $T[K](m, r)$ for $T[g, K](m, r)$ and $T'[K](m, r)$ for $T[g', K](m, r)$. We mainly use $K = 3/2$. Therefore, whenever the "$K$" is suppressed (i.e. we write $T[g](m, r)$ instead of $T[g, 3/2](m, r)$), we consider $K = 3/2$. Before presenting the main technical lemmata, we first summarize the following useful properties of $T[g, K](\cdot)$:

• If $T[g, K](m, r)$ holds, then $T[g, K'](m, r')$ holds for all $K' \le K$ and all $r' \le r$.
• For arbitrary values $m$, $r$ and $\lambda \ne 0$, the tests $T[g, K](m, r)$ and $T[g(m + \lambda \cdot x), K](0, \frac{r}{\lambda})$ are equivalent since $t[g(m + \lambda x), K](0, \frac{r}{\lambda}) = t[g, K](m, r)$. In particular, for an interval $I = (a, b)$, the test $T[g_I, K](0, r)$ is equivalent to $T[g, K](a, r \cdot w(I))$, where $g_I(x) = g(a + w(I) \cdot x)$.
• For $\lambda \in \mathbb{R}^+$, it holds that $t[g, K](m, r) = t[\lambda g, K](m, r) \cdot \lambda^{-1}$ and, thus, $T[g, K](m, r)$ is equivalent to $T[\lambda g, K](m, r)$. Hence, $T[(g')_I, K](m, r)$ and $T[(g_I)', K](m, r)$ are equivalent since $(g_I)' = (g(a + w(I) \cdot x))' = w(I) \cdot (g')_I$.

The $T[g, K](\cdot)$-test serves as an exclusion predicate but might also guarantee that a certain disk contains at most one root. We refer to [7, Theorem 3.2] for a proof of the following lemma.

Lemma 6. Consider a disk $\Delta = \Delta_r(m) \subset \mathbb{C}$ and a polynomial $g \in \mathbb{R}[x]$:
(a) If $T[K](\Delta)$ holds for some $K \ge 1$, then $\bar\Delta$ contains no root of $g$ and

\[ \left(1 - \frac{1}{K}\right) \cdot |g(m)| < |g(z)| < \left(1 + \frac{1}{K}\right) \cdot |g(m)| \]

for all $z$ in the closure $\bar\Delta$ of $\Delta$.
(b) If $T'[3/2](\Delta)$ holds, then $\bar\Delta$ contains at most one root of $g$.

Footnote 11: In [46], Yakoubsohn introduces a quadtree (Weyl) construction for computing the complex roots of an analytic function, where the test $T[g, 1](\cdot)$ is exclusively used as an exclusion predicate. In [46, Section 9], he also provides bounds on the arithmetic complexity and the precision that is needed by his algorithm to isolate all complex roots of a square-free polynomial $f$. In particular, the bound on the precision is stated in terms of the degree of $f$, the absolute value of the roots, and the distance of $f$ to the variety of all polynomials that have a multiple root, and thus, it is similar to our bound.

The $T'(\cdot)$-test now easily applies as an inclusion predicate:

Corollary 7. Let $I = (a, b)$ be an interval and $r \ge 1$ such that $T[g_I'](0, r)$ holds. Then, $I$ contains a root $\xi$ of $g$ if and only if $g(a) \cdot g(b) < 0$. In the latter case, the disk $\Delta_{r \cdot w(I)}(a)$ is isolating for $\xi$.

Proof.
If $T[g_I'](0, r)$ holds, then $T[g'](a, r \cdot w(I))$ holds as well according to the above properties of $T[g, K](\cdot)$. It follows that the disk $\Delta_{r \cdot w(I)}(a)$ and, thus, $I$ contains no root of the derivative $g'$. Now, since $g$ is monotone on $I$, it suffices to check for a sign change of $g$ at the endpoints of $I$. Namely, there exists a root $\xi$ of $g$ in $I$ if and only if $g(a) \cdot g(b) < 0$. In case of existence, $\Delta_{r \cdot w(I)}(a)$ is isolating for $\xi$ due to Lemma 6. □

In order to show that the $T[g'](m, r)$-test in combination with sign evaluation is an efficient inclusion predicate, we give lower bounds on $r$ in terms of $\sigma_g$ such that the predicate is guaranteed to succeed.

Lemma 8. For $g$ a polynomial of degree $n$, a disk $\Delta = \Delta_r(m) \subset \mathbb{C}$, an interval $I = (a, b)$ and $I^+ = (a - \frac{w(I)}{4n}, b + \frac{w(I)}{4n})$, it holds:
(a) If $r \le \frac{\sigma_g}{4n^2}$, then $T(\Delta)$ or $T'(\Delta)$ holds.
(b) If $\Delta$ contains a root $\xi$ of $g$ and $r \le \frac{\sigma(\xi, g)}{4n^2}$, then $T'(\Delta)$ holds.
(c) If $\operatorname{var}(g, I^+) > 0$ and $T[g_I'](0, 2)$ fails, $\Delta_{2w(I)}(a)$ contains a root $\xi$ of $g$ with $\sigma(\xi, g) < 8n^2 w(I)$.
(d) If $\operatorname{var}(g', I) > 0$ and $T[g_I](0, 1)$ fails, $\Delta_{2nw(I)}(a)$ contains a root $\xi$ of $g$ with $\sigma(\xi, g) < 4n^2 w(I)$.

Proof. For the proof of (a) and (b), we refer to [35, Lemma 5]. For (c), suppose that $\operatorname{var}(g, I^+) > 0$ and $T[g_I'](0, 2)$ does not hold. Then, according to Theorem 4 (a), the disk $\Delta_{r(I^+)}(m(I)) \subset \Delta_{2w(I)}(a)$ contains a root $\xi$ of $g$. With (b), it follows that $2w(I) > \frac{\sigma(\xi, g)}{4n^2}$ and, thus, $\sigma(\xi, g) < 8n^2 w(I)$. For (d), we first argue by contradiction that the disk $\Delta_{2nw(I)}(a)$ contains a root $\xi$ of $g$: If $|a - x_i| \ge 2nw(I)$ for all roots $x_i$ of $g$, then

\[ \left|\frac{g^{(k)}(a)}{g(a)}\right| = \left|{\sum_{i_1, \ldots, i_k}}' \frac{1}{(a - x_{i_1}) \cdots (a - x_{i_k})}\right| \le \left(\sum_{i=1}^{n} \frac{1}{|a - x_i|}\right)^{k} \le \left(\frac{1}{2w(I)}\right)^{k}, \]

where the prime means that the $i_j$'s ($j = 1, \ldots, k$) are chosen to be distinct. It follows that $T(a, w(I))$ holds because of

\[ \sum_{k=1}^{n} \left|\frac{g^{(k)}(a)}{k!\, g(a)}\right| w(I)^k \le \sum_{k=1}^{n} \frac{2^{-k}}{k!} < e^{1/2} - 1 < \frac{2}{3}, \]

contradicting the assumption that $T[g_I](0, 1)$ fails. In addition, Theorem 4 guarantees the existence of a root $\xi' \in \Delta_{r(I)}(m(I))$ of $g'$.
Hence, we have $|\xi - \xi'| < 2nw(I) + w(I) < 4nw(I)$, which implies $\sigma(\xi, g) < 4n^2 w(I)$ due to the fact [12, 47] that there exists no root of the derivative $g'$ in $\Delta_{\sigma(\xi, g)/n}(\xi)$. □

3.2. DCM: A Modified Descartes Algorithm

We introduce our modified Descartes method DCM (short for "Descartes modified") to isolate the real roots of a polynomial $f$. We formulate the algorithm in the REAL-RAM model; thus, it still does not directly apply to bitstream polynomials. However, in Section 4.1, we will present a corresponding version DCM$^\rho$ of DCM which resolves this issue; see also Appendix, Algorithm 1 for pseudo-code of DCM.

DCM. DCM maintains a list $\mathcal{A}$ of active nodes and a list $\mathcal{O}$ of isolating intervals, where we initially set $\mathcal{O} = \emptyset$ and $\mathcal{A} := \{(I_0, f_{I_0})\}$, with $I_0 := (-\frac{1}{2}, \frac{1}{2})$. For each active node $(I, f_I)$ from $\mathcal{A}$, we proceed as follows. We first remove $(I, f_I)$ from the list $\mathcal{A}$. Then, we compute the number $v_{I^+} := \operatorname{var}(f, I^+) = \operatorname{var}(f_{I^+,\mathrm{rev}})$ of sign variations for $f$ on the extended interval $I^+$ (notice that $f_{I^+}(x) = f_I(-\frac{1}{4n} + (1 + \frac{1}{2n})x)$ and $f_{I^+,\mathrm{rev}}(x) = (1 + x)^n f_{I^+}(\frac{1}{1+x})$; see footnote 12). If $v_{I^+} = 0$, we do nothing, that is, $I$ is discarded. If $v_{I^+} \ge 1$, we consider the test $T[f_I'](0, 2)$, which is equivalent to $T[f'](a, 2w(I))$. If it fails, then $I$ is subdivided into $I_\ell = (a, m(I))$ and $I_r = (m(I), b)$ and we add $(I_\ell, f_{I_\ell}) = (I_\ell, f_I(\frac{x}{2}))$ and $(I_r, f_{I_r}) = (I_r, f_{I_\ell}(x + 1))$ to $\mathcal{A}$. Otherwise, we evaluate the sign $s$ of $f(a^+) \cdot f(b^+) = f_{I^+}(0) \cdot f_{I^+}(1)$. If $s < 0$ and $I^+$ is disjoint from any other interval in $\mathcal{O}$, we add $I^+$ to $\mathcal{O}$. If $s \ge 0$ or $I^+$ intersects an interval in $\mathcal{O}$, we do nothing (i.e. $I$ is discarded). The algorithm stops when $\mathcal{A}$ becomes empty.

Theorem 9. For the polynomial $f$ as defined in (2.1), the algorithm DCM terminates and returns a list $\mathcal{O} = \{I_1, \ldots, I_m\}$ of disjoint isolating intervals for all real roots of $f$.

Proof.
If the width $w(I)$ of an interval $I = (a, b)$ is smaller than or equal to $\frac{\sigma_f}{8n^2}$, then, according to Theorem 4 and Lemma 8 (c), $\operatorname{var}(f, I^+) = 0$ or $T[f_I'](0, 2)$ holds. Thus, $I$ is not further subdivided. This shows termination of DCM. From our construction and Corollary 7, each interval in $\mathcal{O}$ is isolating for a real root of $f$ and all intervals in $\mathcal{O}$ are pairwise disjoint. It remains to show that, for each real root $\xi$ of $f$, there exists a corresponding isolating interval in $\mathcal{O}$. Since all roots of $f$ have absolute value bounded by $1/2$ and DCM terminates, there must be an interval $I = (a, b)$ of minimal (positive) length whose closure $\bar I$ contains $\xi$. Since $v_{I^+} > 0$, $I$ cannot be discarded in the first step of DCM. Hence, $T[f_I'](0, 2)$ holds and, thus, $f$ is monotone on $I^+$. Since $I^+$ contains the root $\xi$, we have $f(a^+) \cdot f(b^+) < 0$. It follows that either $I^+$ is added to the list of isolating intervals or $I^+$ intersects an interval $J^+ = (c^+, d^+) \in \mathcal{O}$ which has been added to $\mathcal{O}$ before. Let $J = (c, d)$ be the corresponding smaller interval for $J^+$. Since the $\frac{w(I)}{4n}$-neighborhood of $I$ intersects the $\frac{w(J)}{4n}$-neighborhood of $J$, the following Lemma 10 shows that one of the disks $\Delta_{2w(I)}(a)$ or $\Delta_{2w(J)}(c)$ contains both intervals $I^+$ and $J^+$. Since both $T[f_I'](0, 2)$ and $T[f_J'](0, 2)$ hold, each of the latter two disks contains at most one root due to Corollary 7. It follows that $J^+ \in \mathcal{O}$ already isolates $\xi$. □

Lemma 10 (see footnote 13). Let $I = (a, b)$ and $J = (c, d)$ be two intervals (not necessarily of equal length) of the form $\left(-\frac{1}{2} + i2^{-h}, -\frac{1}{2} + (i + 1)2^{-h}\right)$, where $h \in \mathbb{N}$ and $i \in \{0, \ldots, 2^h - 1\}$. If the $\frac{w(I)}{2n}$-neighborhood $U_{w(I)/(2n)}(I)$ of $I$ intersects the $\frac{w(J)}{2n}$-neighborhood $U_{w(J)/(2n)}(J)$ of $J$, then one of the disks $\Delta_{2w(I)}(a)$ or $\Delta_{2w(J)}(c)$ contains the intervals $(a - w(I), b + w(I))$ and $(c - w(J), d + w(J))$.

Proof. W.l.o.g., we can assume that $w(J) \ge w(I)$ and, thus, $w(J) = 2^l w(I)$ with an $l \in \mathbb{N}_0$. Let $\delta$ denote the distance between $I$ and $J$.
If $\delta = 0$, then $\Delta_{2w(J)}(c)$ contains $(a - w(I), b + w(I))$ and $(c - w(J), d + w(J))$. If $\delta \ne 0$, then $\delta = 2^k w(I)$ with a $k \in \mathbb{N}_0$. Since $U_{w(I)/(2n)}(I) \cap U_{w(J)/(2n)}(J) \ne \emptyset$, we must have $\frac{w(J)}{2n} > \frac{\delta}{2}$. In particular, we have $\frac{w(J)}{4} > \frac{\delta}{2} = 2^{k-1} w(I)$. Since $w(I)$ and $w(J)$ differ by a power of 2, it follows that $w(J) \ge 2^{k+2} w(I) = 4\delta$ and, thus, $2w(J) = w(J) + \frac{w(J)}{2} + \frac{w(J)}{2} \ge w(J) + 2w(I) + 2\delta$. From the latter inequality our claim follows. □

Footnote 12: Remember that the polynomial $f_{I^+}$ is obtained from $f$ by mapping all roots of $f$ in $I^+$ one-to-one and onto the interval $(0, 1)$. When further mapping the roots one-to-one and onto the positive real axis, we obtain $f_{I^+,\mathrm{rev}}$.

Footnote 13: Lemma 10 proves a slightly stronger result than necessary for the proof of Theorem 9. The stronger result applies in the proof of Theorem 14 in Section 4.2.

Fig. 3.1. W.l.o.g., we can assume that $w(J) \ge w(I)$. $w(I)$, $w(J)$ and the distance $\delta$ between $I$ and $J$ differ by a power of 2. For $\delta = 0$, the disk $\Delta := \Delta_{2w(J)}(c)$ certainly contains $\tilde I$ and $\tilde J$. If $\delta \ne 0$, then $w(J) \ge 2w(I)$ and $w(J) \ge 4\delta$, hence $\tilde I, \tilde J \subset \Delta$.

Theorem 11. For a polynomial $f$ as in (2.1), DCM induces a subdivision tree $T_{\mathrm{DCM}}$ of height $h(T_{\mathrm{DCM}}) = O(\log n - \log \sigma_f)$ and size $|T_{\mathrm{DCM}}| = O(\Sigma_f + n \log n)$.

Proof. The result on the height of $T_{\mathrm{DCM}}$ follows directly from the proof of Theorem 9. Namely, we have shown that DCM never subdivides an interval of width less than or equal to $\frac{\sigma_f}{8n^2}$. For the bound on $|T_{\mathrm{DCM}}|$, we use a similar argument as in [15] and [25]. Namely, for a root $\xi$ of $f$ and a certain $h \in \mathbb{N}_0$, we say that $I = (-\frac{1}{2} + i2^{-h}, -\frac{1}{2} + (i + 1)2^{-h})$, $i \in \{0, \ldots, 2^h - 1\}$, is a canonical interval for $\xi$ if the real part of $\xi$ is contained in $[-\frac{1}{2} + i2^{-h}, -\frac{1}{2} + (i + 1)2^{-h})$ and $\sigma(\xi, f) < 8n^2 2^{-h} = 8n^2 w(I)$. We denote by $T_c$ the canonical tree, which consists of all canonical intervals. We remark that, for a canonical interval $I$, the parent interval of $I$ is canonical as well.
The following considerations will show that $|T_{\mathrm{DCM}}| = O(|T_c|)$ and $|T_c| = O(\Sigma_f + n \log n)$: For the size of the canonical tree, consider a leaf $I \in T_c$ and let $\xi_I$ be a root of $f$ corresponding to this leaf. If there are several, then $\xi_I$ is the root with minimal separation. Then, $\sigma(\xi_I, f) < 8n^2 2^{-h}$ and, thus, $h \le 2 \log n + 4 - \log \sigma(\xi_I, f)$. Since each root of $f$ is associated with at most one leaf of the canonical tree, we conclude that $|T_c| = O(n \log n + \Sigma_f)$.

It remains to show that $|T_{\mathrm{DCM}}| = O(|T_c|)$. Consider the following mapping of internal nodes (intervals) of $T_{\mathrm{DCM}}$ to canonical nodes (intervals) in $T_c$: Let $I$ be a non-terminal interval of width $w(I) = 2^{-h}$. Then, $\operatorname{var}(f, I^+) > 0$ and $T[f_I'](0, 2)$ does not hold. According to Lemma 8 (c), the disk $\Delta_{2w(I)}(a)$ contains a root $\xi$ of $f$ with $\sigma(\xi, f) < 8n^2 w(I) = 8n^2 2^{-h}$. Hence, one of the four intervals $I_1 := (a - 2w(I), a - w(I))$, $I_2 := (a - w(I), a)$, $I_3 := I$ or $I_4 := (b, b + (b - a))$ is canonical for $\xi$. We map $I$ to the corresponding interval. This defines a mapping from the internal nodes of $T_{\mathrm{DCM}}$ to the nodes of the canonical tree $T_c$. Furthermore, each node in the canonical tree has at most four preimages in $T_{\mathrm{DCM}}$ and, thus, the number of internal nodes of $T_{\mathrm{DCM}}$ is bounded by $O(n \log n + \Sigma_f)$. Since $T_{\mathrm{DCM}}$ is a binary tree, the bound on the number of internal nodes applies to the whole tree as well. □

4. Algorithm

We first outline our algorithm RISOLATE to isolate the roots of $f$. RISOLATE decomposes into two subroutines DCM$^\rho$ and CERTIFY$^\rho$, where $\rho$ indicates the actual working precision. We proceed in rounds: In the first round, we start with a low working precision (e.g. $\rho = 16$). If our algorithm does not succeed in a certain round, the precision is doubled in the next round. Following this approach, we obtain an adaptive behavior of our method, that is, it eventually succeeds for a working precision which is at most twice the precision actually needed.
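The round structure outlined above can be sketched as a small driver loop. This is only an illustration under stated assumptions: `risolate`, `dcm`, `certify`, `max_rho` and the toy stand-ins below are hypothetical names, the two subroutines are passed in as callables rather than implemented, and a return value of `None` from `dcm` is assumed to signal "insufficient precision".

```python
def risolate(dcm, certify, rho=16, max_rho=1 << 20):
    """Run rounds with doubling precision until both subroutines succeed.

    dcm(rho) is assumed to return a list of intervals, or None on
    "insufficient precision"; certify(intervals, rho) is assumed to return
    True once it can guarantee that no root has been missed.
    """
    while rho <= max_rho:
        intervals = dcm(rho)
        if intervals is not None and certify(intervals, rho):
            return intervals, rho
        rho *= 2  # next round: double the working precision
    raise ValueError("precision cap reached without certification")


def make_toy_subroutines(needed_bits):
    """Toy stand-ins that succeed once rho reaches a fixed threshold."""
    def dcm(rho):
        return None if rho < needed_bits // 2 else [(-0.25, 0.0)]

    def certify(intervals, rho):
        return rho >= needed_bits

    return dcm, certify


dcm, certify = make_toy_subroutines(needed_bits=40)
intervals, rho = risolate(dcm, certify)
print(intervals, rho)
```

With a needed precision of 40 bits, the rounds use $\rho = 16, 32, 64$, so the loop stops at a precision below twice the needed one, which is the adaptivity property stated above.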
The first subroutine DCM$^\rho$ is essentially identical to DCM with the main difference that, at each node $I = (a, b)$ of the recursion tree, we only consider approximations $\tilde f_I(x)$ of $f_I(x) = f(a + w(I) \cdot x)$ to a certain number $\rho_I$ of bits after the binary point, where $\rho + 2 \log w(I) \le \rho_I \le \rho$. We remark that we process $I$ in such a way that $I$ is not subdivided by DCM$^\rho$ if it is not subdivided by the exact counterpart DCM. This ensures that, for any $\rho$, DCM$^\rho$ induces a subtree $T_{\mathrm{DCM}^\rho}$ of $T_{\mathrm{DCM}}$ and, thus, $|T_{\mathrm{DCM}^\rho}| = O(\Sigma_f + n \log n)$ due to Theorem 11. We further show that, for a precision $\rho \ge \rho_f^{\max} = O(\Sigma_f + n)$, DCM$^\rho$ returns isolating intervals for all real roots of $f$; see Theorem 14 for the definition of $\rho_f^{\max}$ and further details. However, for smaller $\rho$, DCM$^\rho$ may return isolating intervals only for some roots, without any information on whether all real roots are captured. In order to overcome such an undesirable situation, we consider an additional subdivision method CERTIFY$^\rho$, similar to DCM$^\rho$, which aims to certify that all roots are captured. We further show that CERTIFY$^\rho$ also induces a recursion tree of size $O(\Sigma_f + n \log n)$ and that it succeeds if $\rho \ge \rho_f^{\max}$.

4.1. DCM$^\rho$: An Approximate Version of DCM

We present our first subroutine DCM$^\rho$. Comments to support the approach are in italic and marked by a "//" at the beginning.

DCM$^\rho$. Let $I_0 = (-\frac{1}{2}, \frac{1}{2})$ be the starting interval which, by construction of $f$, contains all real roots of $f$. In a first step, we choose a $(\rho + n + 1)$-binary approximation $\tilde f$ of $f$ and evaluate $\tilde f(-\frac{1}{2} + x)$. Then, the resulting polynomial is approximated by a $(\rho + 1)$-binary approximation $\tilde f_{I_0} \in [\tilde f(-\frac{1}{2} + x)]_{2^{-\rho-1}}$ and, according to Lemma 1, we have $\tilde f_{I_0} \in [f_{I_0}]_{2^{-\rho}}$. DCM$^\rho$ maintains a list $\mathcal{A}$ of active nodes $(I, \tilde f_I, \rho_I)$, where $I = (a, b) \subset I_0$ is an interval, $\tilde f_I$ approximates $f_I$ to $\rho_I$ bits after the binary point and $\rho + 2 \log w(I) \le \rho_I \le \rho$.
DCM$^\rho$ eventually returns a list $\mathcal{O}$ of tuples $(J, s_{J,\ell}, s_{J,r}, B_J)$, where $J = (c, d)$ is an isolating interval for a root of $f$, $s_{J,\ell} = \operatorname{sign}(f(c))$, $s_{J,r} = \operatorname{sign}(f(d))$ and $0 < B_J \le \min(|f(c)|, |f(d)|)$. We initially start with $\mathcal{A} := \{(I_0, \tilde f_{I_0}, \rho)\}$ and $\mathcal{O} := \emptyset$. For each active node, we proceed as follows:

(1) Remove $(I, \tilde f_I, \rho_I)$ from $\mathcal{A}$.

(2) Compute the polynomials

\[ \tilde f_{I^+}(x) = \tilde f_I\!\left(-\frac{1}{4n} + \left(1 + \frac{1}{2n}\right) \cdot x\right) \quad \text{and} \quad \tilde h(x) = \sum_{i=0}^{n} \tilde h_i x^i := (1 + x)^n \cdot \tilde f_{I^+}\!\left(\frac{1}{1 + x}\right). \tag{4.1} \]

(3) If $\tilde h_i > -2^{n+2-\rho_I}$ for all $i$ or $\tilde h_i < 2^{n+2-\rho_I}$ for all $i$, do nothing (i.e., $I$ is discarded).

// A simple computation (see the following Lemma 12 (a)) shows that $\tilde h$ is an approximation of $f_{I^+,\mathrm{rev}}(x) = (x + 1)^n f_{I^+}(\frac{1}{1+x})$ to $\rho_I - n - 2$ bits after the binary point. Thus, if $\operatorname{var}(f, I^+) = 0$, all coefficients $\tilde h_i$ are either smaller than $2^{n+2-\rho_I}$ or larger than $-2^{n+2-\rho_I}$. Since we want to induce a subtree of the recursion tree $T_{\mathrm{DCM}}$ induced by $f$, we discard $I$ if all coefficients of $\tilde h$ are larger than $-2^{n+2-\rho_I}$ (or smaller than $2^{n+2-\rho_I}$).

(4) If there exist $\tilde h_i$ and $\tilde h_j$ with $\tilde h_i \le -2^{n+2-\rho_I}$ and $\tilde h_j \ge 2^{n+2-\rho_I}$, consider the test $T[(\tilde f_I)'](0, 2)$, that is, evaluate $t[(\tilde f_I)'](0, 2) = t[(\tilde f_I)', 3/2](0, 2)$.

// Due to Lemma 12 (a), it holds that $|t[(f_I)'](0, 2) - t[(\tilde f_I)'](0, 2)| < n2^{n+1-\rho_I}$. Hence, if $T[(f_I)'](0, 2)$ holds, then $t[(\tilde f_I)'](0, 2) > -n2^{n+1-\rho_I}$. Thus, we proceed as follows:

(a) If $t[(\tilde f_I)'](0, 2) > -n2^{n+1-\rho_I}$, consider the polynomial

\[ \hat f_I(x) := \tilde f_I(x) + \operatorname{sign}((\tilde f_I)'(0)) \cdot n \cdot 2^{n+1-\rho_I} \cdot x, \tag{4.2} \]

// Then, $T[(\hat f_I)'](0, 2)$ holds and, in particular, $\hat f_I$ is monotone on $(-2, 2)$.
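evaluate, with $s := \operatorname{sign}((\tilde f_I)'(0))$ as in (4.2),

\[ \lambda^- := \hat f_I\!\left(-\frac{1}{4n}\right) = \tilde f_{I^+}(0) - s \cdot 2^{n-1-\rho_I}, \tag{4.3} \]
\[ \lambda^+ := \hat f_I\!\left(1 + \frac{1}{4n}\right) = \tilde f_{I^+}(1) + s \cdot (4n + 1)\, 2^{n-1-\rho_I}, \tag{4.4} \]
\[ \lambda := \hat f_I\!\left(-\frac{1}{n}\right) = \tilde f_I\!\left(-\frac{1}{n}\right) - s \cdot 2^{n+1-\rho_I}, \tag{4.5} \]

and check whether the following conditions are fulfilled:

\[ \tilde I = (\tilde a, \tilde b) = \left(a - \frac{w(I)}{2n},\, b + \frac{w(I)}{2n}\right) \text{ intersects no } J \text{ for any } (J, s_{J,\ell}, s_{J,r}, B_J) \in \mathcal{O}, \tag{4.6} \]
\[ \lambda^- \cdot \lambda^+ < 0, \tag{4.7} \]
\[ \min(|\lambda^-|, |\lambda^+|) > 2^{n+3-\rho_I}\, n, \text{ and} \tag{4.8} \]
\[ |\lambda| > 2^{\deg \hat f_I + n + 7 - \rho_I}\, n^2. \tag{4.9} \]

If any of the conditions (4.6)-(4.9) fails, do nothing. If all conditions are fulfilled, then add $(\tilde I, \operatorname{sign}(\lambda^-), \operatorname{sign}(\lambda^+), \min(|\lambda^-|, |\lambda^+|) - 2^{n+3-\rho_I} n)$ to $\mathcal{O}$.

// If (4.7)-(4.9) hold, $\tilde I$ is isolating for a root $\xi$ of $f$ (Lemma 12 (c)). Furthermore, since $\hat f_I$ is monotone on $(-2, 2)$, we have $|\hat f_I(-\frac{1}{2n})| > |\lambda^-|$ and $|\hat f_I(1 + \frac{1}{2n})| > |\lambda^+|$. Then, inequality (4.8) and Lemma 12 (b) yield that $\operatorname{sign}(f(\tilde a)) = \operatorname{sign}(\lambda^-)$, $\operatorname{sign}(f(\tilde b)) = \operatorname{sign}(\lambda^+)$, and $\min(|f(\tilde a)|, |f(\tilde b)|) > \min(|\lambda^-|, |\lambda^+|) - 2^{n+3-\rho_I} n$.

(b) If $t[(\tilde f_I)'](0, 2) \le -n2^{n+1-\rho_I}$, subdivide $I$ into $I_\ell := (a, m_I)$ and $I_r := (m_I, b)$. Compute a $\rho_I$-binary approximation $\tilde f_{I_\ell}$ of $\tilde f_I(\frac{x}{2})$ and a $(\rho_I - 1)$-binary approximation $\tilde f_{I_r}$ of $\tilde f_I(\frac{x+1}{2})$, and add $(I_\ell, \tilde f_{I_\ell}, \rho_I - 1)$ and $(I_r, \tilde f_{I_r}, \rho_I - 2)$ to $\mathcal{A}$. If $\rho_I < 2$, return "insufficient precision".

// Due to Lemma 1, we have $\tilde f_{I_\ell} \in [f_{I_\ell}]_{2^{-(\rho_I - 1)}}$ and $\tilde f_{I_r} \in [f_{I_r}]_{2^{-(\rho_I - 2)}}$. Hence, by induction, it follows that $\rho + 2 \log w(I) \le \rho_I \le \rho$ for all active nodes.

Fig. 4.1. If $\lambda^- \cdot \lambda^+ = \hat f_I(-\frac{1}{4n}) \cdot \hat f_I(1 + \frac{1}{4n}) < 0$, then there exists a root $\gamma \in (-\frac{1}{4n}, 1 + \frac{1}{4n})$ of $\hat f_I$. Furthermore, $\Delta_2(0)$ contains no further root of $\hat f_I$. A computation shows that $|\hat f_I(z)| > |f_I(z) - \hat f_I(z)|$ for all $z$ on the boundary of the $\frac{1}{n}$-neighborhood $U := U_{1/n}((0, 1))$ of $(0, 1)$ if the inequality (4.9) holds. Then, due to Rouché's Theorem, $U$ isolates a root of $f_I$.

DCM$^\rho$ stops when $\mathcal{A}$ becomes empty.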
It may either return "insufficient precision" (in Step 4 (b)) or a list $\mathcal{O}$ of isolating intervals $\tilde I$ for some of the roots of $f$ together with the signs of $f$ and a lower bound on $|f|$ at the endpoints of $\tilde I$.

Lemma 12. Let $f$ be a polynomial as in (2.1), $I = (a, b)$ an interval considered by DCM$^\rho$ and $\tilde h$ the polynomial as defined in (4.1). Then,
(a) $\tilde h(x) \in [f_{I^+,\mathrm{rev}}]_{2^{n+2-\rho_I}}$ and $|t[(f_I)'](0, 2) - t[(\tilde f_I)'](0, 2)| < n \cdot 2^{n+1-\rho_I}$.
(b) For an arbitrary value $t$ with $|t| \le 1 + \frac{1}{n}$, it holds that $|f(a + t \cdot w(I)) - \hat f_I(t)| < 2^{n+3-\rho_I} n$, with $\hat f_I$ as defined in (4.2). In particular,

\[ |f(a^+) - \lambda^-|,\ \left|f\!\left(a - \frac{w(I)}{n}\right) - \lambda\right|,\ |f(b^+) - \lambda^+| < 2^{n+3-\rho_I}\, n, \]

with $\lambda^-$, $\lambda^+$ and $\lambda$ as defined in (4.3)-(4.5).
(c) Suppose that $t[(\tilde f_I)'](0, 2) > -n2^{n+1-\rho_I}$ and the inequalities (4.7)-(4.9) hold. Then, $I^+$ contains a real root $\xi$ of $f$ and the $\frac{w(I)}{n}$-neighborhood of $I$ is isolating for $\xi$.
(d) For any tuple $(J, s_{J,\ell}, s_{J,r}, B_J) \in \mathcal{O}$, the endpoints of $J$ are located outside the union of the disks $\Delta_i := \Delta_{\sigma(z_i, f)/(64n^3)}(z_i)$, where $i = 1, \ldots, n$.

Proof. Since $\tilde f_I \in [f_I]_{2^{-\rho_I}}$, we have $\tilde f_{I^+} \in [f_{I^+}]_{2^{-\rho_I+2}}$ due to Lemma 1 (b). Reversing the coefficients and replacing $x$ by $x + 1$ increases the error by a factor of at most $2^n$ (see Lemma 1 (c)), thus $\tilde h \in [f_{I^+,\mathrm{rev}}]_{2^{-\rho_I+2+n}}$. For the second part of (a), consider the following simple computation:

\[ |t[(f_I)'](0, 2) - t[(\tilde f_I)'](0, 2)| \le \frac{3}{2} \cdot n \cdot 2^{-\rho_I} \sum_{i=0}^{n-1} 2^i = \frac{3}{2} \cdot n \cdot 2^{-\rho_I} (2^n - 1) < n2^{n+1-\rho_I}, \]

where the first inequality uses $(\tilde f_I)' \in [(f_I)']_{n \cdot 2^{-\rho_I}}$. For (b), we have

\[
\begin{aligned}
|f(a + t \cdot w(I)) - \hat f_I(t)| &= |f_I(t) - \hat f_I(t)| \le |f_I(t) - \tilde f_I(t)| + |t| \cdot 2^{n+1-\rho_I}\, n \\
&\le 2^{-\rho_I} \sum_{i=0}^{n} |t|^i + \left(1 + \frac{1}{n}\right) \cdot 2^{n+1-\rho_I}\, n \\
&\le n2^{-\rho_I} \left(1 + \frac{1}{n}\right)^{n+1} + n2^{n+2-\rho_I} < n2^{n+3-\rho_I}.
\end{aligned}
\]

Now, if the inequalities (4.7) and (4.8) hold, then $\operatorname{sign}(f(a^+)) = \operatorname{sign}(\lambda^-)$, $\operatorname{sign}(f(b^+)) = \operatorname{sign}(\lambda^+)$ and $f(a^+) \cdot f(b^+) < 0$; hence, $f$ has a real root in $I^+$. We next show that (4.9) implies the uniqueness of this root.
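From $t[(\tilde f_I)'](0, 2) > -n2^{n+1-\rho_I}$, it follows that $T[(\hat f_I)'](0, 2)$ succeeds and, thus, $\Delta_2(0)$ contains at most one root of $\hat f_I$. Since $\lambda^- = \hat f_I(-\frac{1}{4n})$ and $\lambda^+ = \hat f_I(1 + \frac{1}{4n})$ have different signs, the interval $(-\frac{1}{4n}, 1 + \frac{1}{4n})$ contains a root $\gamma$ of $\hat f_I$. We consider the $\frac{1}{n}$-neighborhood $U \subset \mathbb{C}$ of $(0, 1)$ and an arbitrary point $z$ on its boundary; see Figure 4.1. It holds that $|-\frac{1}{n} - \gamma| / |z - \gamma| < (1 + \frac{5}{4n}) / (\frac{1}{4n}) = 4n + 5 < 8n$ and, for any root $\tilde\gamma \ne \gamma$ of $\hat f_I$, we have

\[ \frac{|-\frac{1}{n} - \tilde\gamma|}{|z - \tilde\gamma|} \le \frac{|-\frac{1}{n} - z| + |z - \tilde\gamma|}{|z - \tilde\gamma|} \le 1 + \frac{1 + \frac{2}{n}}{1 - \frac{1}{n}} = 2 \cdot \frac{1 + \frac{1}{2n}}{1 - \frac{1}{n}}. \]

Hence, it follows that

\[
\begin{aligned}
\left|\frac{\lambda}{\hat f_I(z)}\right| = \left|\frac{\hat f_I(-\frac{1}{n})}{\hat f_I(z)}\right| &= \frac{|-\frac{1}{n} - \gamma|}{|z - \gamma|} \prod_{\tilde\gamma \ne \gamma:\ \hat f_I(\tilde\gamma) = 0} \frac{|-\frac{1}{n} - \tilde\gamma|}{|z - \tilde\gamma|} \\
&< (4n + 5) \cdot 2^{\deg \hat f_I - 1} \left(1 + \frac{1}{2n}\right)^{\deg \hat f_I - 1} \left(1 - \frac{1}{n}\right)^{-\deg \hat f_I + 1} \\
&< (4n + 5) \cdot 2^{\deg \hat f_I - 1} \cdot \sqrt{2.72} \cdot 2.72 < n2^{\deg \hat f_I + 4}
\end{aligned}
\]

and, thus, $|\hat f_I(z)| > |\lambda| \cdot 2^{-\deg \hat f_I - 4}\, n^{-1}$. Since $|z| \le 1 + \frac{1}{n}$, we have $|f_I(z) - \hat f_I(z)| < n2^{n+3-\rho_I}$ according to (b). Then, from Rouché's Theorem, it follows that $f_I$ has exactly one root within $U$ if (4.9) holds. This shows (c).

It remains to prove (d): Let $\tilde I = (\tilde a, \tilde b)$ and $I = (a, b)$ the corresponding smaller interval. From our construction and (c), $I^+$ contains a root $\xi = z_{i_0}$ of $f$ and the $\frac{w(I)}{n}$-neighborhood of $I$ is isolating for this root. Thus, $|\tilde a - z_i| > \frac{w(I)}{4n}$ for all $i$. If there exists an $i \ne i_0$ with $\tilde a \in \Delta_i$, then $w(I) < 4n|\tilde a - z_i| < \sigma(z_i, f)/(16n^2)$. Hence, we obtain

\[
\begin{aligned}
|\xi - z_i| \le |\xi - \tilde a| + |\tilde a - z_i| &< \left(1 + \frac{1}{n}\right) \cdot w(I) + \frac{\sigma(z_i, f)}{64n^3} \\
&< \left(1 + \frac{1}{n}\right) \cdot \frac{\sigma(z_i, f)}{16n^2} + \frac{\sigma(z_i, f)}{64n^3} < \sigma(z_i, f),
\end{aligned}
\]

a contradiction. It remains to show that $\tilde a \notin \Delta_{i_0}$. If $\tilde a \in \Delta_{i_0}$, then $w(I) < \frac{\sigma(\xi, f)}{16n^2}$. According to Lemma 8 (b), $T[(f_J)'](0, 2)$ already holds for a parent node $J$ of $I$ and, thus, $t[(\tilde f_J)'](0, 2) > -n2^{n+1-\rho_J}$ because of (a). This contradicts the fact that $J$ is not terminal. In a completely analogous manner, one shows that $\tilde b$ is also not contained in any $\Delta_i$. This proves (d).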
□

We close this section with a result on the size of the recursion tree induced by DCM$^\rho$ and the bit complexity of DCM$^\rho$:

Theorem 13. Let $f$ be a polynomial as in (2.1) and $\rho \in \mathbb{N}$ an arbitrary positive integer. Then, the recursion tree $T_{\mathrm{DCM}^\rho}$ induced by DCM$^\rho$ is a subtree of the tree $T_{\mathrm{DCM}}$ induced by DCM; thus,

\[ |T_{\mathrm{DCM}^\rho}| \le |T_{\mathrm{DCM}}| = O(\Sigma_f + n \log n). \]

Furthermore, DCM$^\rho$ requires a number of bit operations bounded by

\[ \tilde O\big(n(\Sigma_f + n \log n)(n\Gamma + \rho - \log \sigma_f)\big). \]

Proof. For the first claim, we remark that DCM$^\rho$ never splits an interval $I$ which is not split by DCM when applied to the exact polynomial $f$. Namely, if $I$ is terminal for DCM, then either $t[(f_I)'](0, 2) > 0$ or $\operatorname{var}(f, I^+) = \operatorname{var}(f_{I^+,\mathrm{rev}}) = 0$. In the first case, we must have $t[(\tilde f_I)'](0, 2) > -n2^{n+1-\rho_I}$ whereas, in the second case, all coefficients $\tilde h_i$ of $\tilde h(x) = (1 + x)^n \cdot \tilde f_{I^+}(\frac{1}{1+x})$ are either larger than $-2^{n+2-\rho_I}$ or smaller than $2^{n+2-\rho_I}$; see Lemma 12 (a). Thus, $I$ is terminal for DCM$^\rho$ as well. The result on the size of $T_{\mathrm{DCM}^\rho}$ then follows directly from Theorem 11.

For the bit complexity, we first consider the cost in each iteration: For an active node $(I, \tilde f_I, \rho_I) \in \mathcal{A}$, $I = (a, b)$, the polynomial $\tilde f_I$ approximates $f_I$ to $\rho_I \le \rho$ bits after the binary point. The absolute value of each coefficient of $f_I$ is bounded by $2^{O(n\Gamma)}$ because the shift operation $x \mapsto a + (b - a) \cdot x$ does not increase the coefficients of $f$ by a factor of more than $2^n$ and the absolute value of the coefficients of $f$ is bounded by $2^{n\Gamma}$; see Section 2.2. It follows that the bitsize of the coefficients of $\tilde f_I$ is bounded by $O(n\Gamma) + \rho$. Hence, the cost for computing $\tilde h(x)$, $\tilde f_{I_\ell}$ and $\tilde f_{I_r}(x)$ is bounded by $\tilde O(n(n\Gamma + \rho))$. Namely, the latter constitutes a bound on the cost of an asymptotically fast Taylor shift by an $O(\log n)$-bit number. The cost for evaluating $t[(\tilde f_I)'](0, 2)$, $\lambda^-$, $\lambda^+$ and $\lambda$ matches the same bound because all these computations are evaluations of a polynomial of bitsize $O(n\Gamma + \rho)$ at an $O(\log n)$-bit number.
We further remark that, in each iteration, $\mathcal{O}$ contains disjoint isolating intervals $J$ for some of the real roots of $f$ and, thus, $|\mathcal{O}| \le n$. Hence, the endpoints of the interval $\tilde I$ have to be compared with those of at most $n$ intervals stored in $\mathcal{O}$. Since DCM$^\rho$ does not produce any interval of size less than $\frac{\sigma_f}{8n^2}$, these comparisons require at most $O(n(\log n - \log \sigma_f))$ bit operations. It follows that the total cost at each node is bounded by $\tilde O(n(n\Gamma + \rho - \log \sigma_f))$ bit operations. The bound on the total cost then follows from our result on the size of the recursion tree. □

4.2. Known $\rho_f$ and $\sigma_f$

From Lemma 3 (c), we already know that, for $\rho \ge \rho_f$, each root $z_i$ of $f$ moves by at most $\frac{\sigma(z_i, f)}{64n^3}$ when passing from $f$ to an arbitrary approximation $\tilde f \in [f]_{2^{-\rho_f}}$; see Definition 2 for the definition of $\rho_f$. Hence, we expect it to be possible to isolate the roots of $f$ by only considering approximations of $f$ (and the intermediate results $f_I$) to $\rho_f$ bits after the binary point. The following theorem proves a corresponding result.

Theorem 14. Let $f$ be a polynomial as in (2.1) and $\rho \in \mathbb{N}$ an integer with

\[ \rho \ge \rho_f^{\max} := \rho_f - 3 \log \sigma_f + 16n = O(\Sigma_f + n). \]

Then, DCM$^\rho$ isolates all real roots of $f$ and $B_J > 2^{-\rho_f}$ for all $(J, s_{J,\ell}, s_{J,r}, B_J) \in \mathcal{O}$.

Proof. Due to Theorems 11 and 13, the height $h(\mathrm{DCM}^\rho)$ of $T_{\mathrm{DCM}^\rho}$ is bounded by

\[ h(\mathrm{DCM}^\rho) \le \log \frac{16n^2}{\sigma_f} = 2 \log n + 4 - \log \sigma_f \le 4n - \log \sigma_f. \tag{4.10} \]

Then, for any interval $I = (a, b)$ produced by DCM$^\rho$, we have

\[ \rho_I \ge \rho + 2 \log w(I) \ge \rho - 2h(\mathrm{DCM}^\rho) \ge \rho_f^{\min} := \rho_f + 8n - \log \sigma_f > 0. \tag{4.11} \]

The latter inequality guarantees that DCM$^\rho$ does not return "insufficient precision". Now let $I$ be an interval whose closure $\bar I$ contains a root $\xi = z_{i_0}$ of $f$. We aim to show the following facts:
(1) $I$ is not discarded in Step 3 of DCM$^\rho$.
(2) If $t[(\tilde f_I)'](0, 2) > -n2^{n+1-\rho_I}$, then all inequalities (4.7)-(4.9) are fulfilled.
(3) In the latter case, either $\tilde I = (a - \frac{w(I)}{2n}, b + \frac{w(I)}{2n})$ is added to $\mathcal{O}$ or $\tilde I$ only intersects intervals $J$, with a corresponding $(J, s_{J,\ell}, s_{J,r}, B_J) \in \mathcal{O}$, which already isolate $\xi$.

If (1)-(3) hold, then DCM$^\rho$ outputs isolating intervals for all real roots of $f$. Namely, DCM$^\rho$ starts subdividing $I_0 = (-\frac{1}{2}, \frac{1}{2})$, which contains all real roots of $f$. Thus, for each root $\xi$ of $f$, we eventually obtain an interval $I$ such that $\bar I$ contains $\xi$ and $t[(\tilde f_I)'](0, 2) > -n2^{n+1-\rho_I}$. Then, either $\tilde I$ is added to the list of isolating intervals or $\mathcal{O}$ already contains an isolating interval for $\xi$.

For the proof of (1), we have already shown that $w(I) > \frac{\sigma(\xi, f)}{16n^2}$. Lemma 3 (c) then ensures that an arbitrary $g \in [f]_{2^{-\rho_f}}$ has a root $\xi' \in I^+$. Namely, the root $\xi \in \bar I$ stays real and moves by at most $\frac{\sigma(\xi, f)}{64n^3} < \frac{w(I)}{4n}$ when passing from $f$ to $g$. Now, suppose that all coefficients $\tilde h_i$ of $\tilde h(x) = (1 + x)^n \cdot \tilde f_{I^+}(\frac{1}{1+x})$ are larger than $-2^{n+2-\rho_I}$; see (4.1) for definitions. Since $|h_i - \tilde h_i| < 2^{n+2-\rho_I}$ for all coefficients $h_i$ of $f_{I^+,\mathrm{rev}} = \sum_{i=0}^{n} h_i x^i$ (see Lemma 12 (a)), it follows that $h_i > -2^{n+3-\rho_I}$ for all $i$. Hence, for the polynomial $g(x) := f(x) + 2^{n+3-\rho_I} \in [f]_{2^{-\rho_f}}$, we have $g_{I^+,\mathrm{rev}}(x) = f_{I^+,\mathrm{rev}}(x) + 2^{n+3-\rho_I}(x + 1)^n$ and, thus, $g_{I^+,\mathrm{rev}}$ has only positive coefficients. In the case where $\tilde h_i < 2^{n+2-\rho_I}$ for all $i$, we consider $g(x) := f(x) - 2^{n+3-\rho_I} \in [f]_{2^{-\rho_f}}$ and, thus, $g_{I^+,\mathrm{rev}}$ has only negative coefficients. Hence, in both cases, there exists a $g \in [f]_{2^{-\rho_f}}$ which has no root in $I^+$, a contradiction. It follows that $I$ cannot be discarded in Step 3.

For (2), suppose that $t[(\tilde f_I)'](0, 2) > -n2^{n+1-\rho_I}$. Due to Lemma 12 (a), we have $t[(f_I)'](0, 2) > -n2^{n+2-\rho_I}$, and since $\log \frac{n2^{n+2-\rho_I}}{w(I)} \le 6 + 3 \log n + n - \rho_I - \log \sigma_f < -\rho_f$, it follows that

\[ g(x) := f(x) + x \cdot \operatorname{sign}((f_I)'(0)) \cdot \frac{n2^{n+2-\rho_I}}{w(I)} \in [f]_{2^{-\rho_f}}. \]

Hence, $g$ has a root $\xi'$ in $I^+$. Since $t[(g_I)'](0, 2) = t[(f_I)'](0, 2) + n2^{n+2-\rho_I} > 0$, the disk $\Delta_{2w(I)}(a)$ is isolating for $\xi'$.
The following argument shows that $\Delta_{\frac{3w(I)}{2}}(a)$ isolates $\xi$: Suppose that $\Delta_{\frac{3w(I)}{2}}(a)$ contains an additional root $z_j \neq \xi$ of $f$. Then, $\sigma(\xi, f) < 3w(I)$ and, thus, $\xi$ and $z_j$ would move by at most $\frac{3w(I)}{64n^3} < \frac{w(I)}{2}$ when passing from $f$ to $g$. It follows that $g$ would have at least two roots within $\Delta_{2w(I)}(a)$, a contradiction. Now, since $\Delta_{\frac{3w(I)}{2}}(a)$ is isolating for $\xi \in I$, we have
$$\frac{\sigma(\xi, f)}{16n^2} < w(I) < 2\sigma(\xi, f).$$
The left inequality implies that the distance of $\xi$ to any of the points $a^+ = a - \frac{w(I)}{4n}$, $b^+ = b + \frac{w(I)}{4n}$ and $c := a - \frac{w(I)}{n}$ is larger than or equal to $\frac{w(I)}{4n} > \frac{\sigma(\xi, f)}{64n^3}$. Let $d_i$ denote the distance between a root $z_i \neq \xi$ and the disk $\Delta_{\frac{3w(I)}{2}}(a)$. Then,
$$\frac{\sigma(z_i, f)}{64n^3} \le \frac{|z_i - \xi|}{64n^3} \le \frac{d_i + 3w(I)}{64n^3} < d_i + \frac{w(I)}{4}.$$
It follows that the points $a^+, b^+, c \in \Delta_{\frac{5w(I)}{4}}(a)$ are located outside the disk $\Delta_i := \Delta_{\frac{\sigma(z_i, f)}{64n^3}}(z_i)$. In summary, none of the disks $\Delta_i$, $i = 1, \ldots, n$, contains any of the points $a^+$, $b^+$ and $c$. Hence, due to Lemma 3 (c), it follows that each of the values $|f(c)|$, $|f(a^+)|$ and $|f(b^+)|$ is larger than $(n + 1)2^{-\rho_f}$. A simple computation now shows that $(n + 1)2^{-\rho_f} > 2^{2n+8-\rho_I} n^2$. Thus, according to Lemma 12 (b), each of the absolute values $|\lambda|$, $|\lambda^-|$ and $|\lambda^+|$ is larger than
$$(n + 1)2^{-\rho_f} - 2^{n+3-\rho_I} n > 2^{2n+8-\rho_I} n^2 - 2^{n+3-\rho_I} n > 2^{2n+7-\rho_I} n^2. \tag{4.12}$$
It follows that the inequalities (4.8) and (4.9) hold. Since $I^+$ is isolating for $\xi$, $f(a^+)$ and $f(b^+)$ must have different signs and, thus, the same holds for $\lambda^-$ and $\lambda^+$. Hence, the inequality (4.7) holds as well. In addition, we have $B_{\tilde{I}} = \min(|\lambda^-|, |\lambda^+|) - 2^{n+3-\rho_I} n > 2^{-\rho_f}$ because of (4.12).

It remains to show (3): If $t[(\tilde{f}_I)'](0, 2) > -n2^{n+1-\rho_I}$, then due to (2) and Lemma 12 (b), the interval $\tilde{I}$ and the $\frac{w(I)}{n}$-neighborhood of $I$ are isolating for $\xi$. If $\tilde{I}$ does not intersect any other interval in $\mathcal{O}$, then $\tilde{I}$ is added to $\mathcal{O}$ and, thus, $\textsc{Dcm}^{\rho}$ outputs an isolating interval for $\xi$.
We still have to consider the case where $\tilde{I}$ intersects an interval $J$ from $\mathcal{O}$. From the construction of $\mathcal{O}$, $J$ is the extension $(\tilde{c}, \tilde{d})$ of an interval $J' = (c, d)$. Now, suppose that $J$ is isolating for a root $\gamma \neq \xi$. The roots $\xi$ and $\gamma$ move by at most $\frac{w(I)}{4n}$ and $\frac{w(J')}{4n}$, respectively, when passing from $f$ to an arbitrary $g \in [f]_{2^{-\rho_f}}$ (see the proof of (1)). Hence, it follows that the union of $(a - w(I), b + w(I))$ and $(c - w(J'), d + w(J'))$ contains at least two roots of any $g \in [f]_{2^{-\rho_f}}$. Due to Lemma 10, one of the disks $\Delta_{2w(I)}(a)$ or $\Delta_{2w(J')}(c)$ then also contains at least two roots of $g$, contradicting the fact that $t[(p_I)'](0, 2) > 0$ for $p(x) := f(x) + x \cdot \operatorname{sign}((f_I)'(0)) \cdot \frac{n2^{n+2-\rho_I}}{w(I)} \in [f]_{2^{-\rho_f}}$ and $t[(q_{J'})'](0, 2) > 0$ for $q(x) := f(x) + x \cdot \operatorname{sign}((f_{J'})'(0)) \cdot \frac{n2^{n+2-\rho_{J'}}}{w(J')} \in [f]_{2^{-\rho_f}}$. It follows that $J$ already isolates $\xi$. □

4.3. Unknown $\rho_f$ and $\sigma_f$

For unknown $\rho_f$ and $\sigma_f$, we proceed as follows: We start with an initial precision $\rho$ (e.g. $\rho = 16$) and run $\textsc{Dcm}^{\rho}$. If $\textsc{Dcm}^{\rho}$ returns "insufficient precision", we double $\rho$ and start over. Otherwise, $\textsc{Dcm}^{\rho}$ returns a list $\mathcal{O} = \{(J_k, s_{k,\ell}, s_{k,r}, B_k)\}_{k=1,\ldots,m}$, where each interval $J_k = (c_k, d_k)$ isolates a real root of $f$, $s_{k,\ell} = \operatorname{sign} f(c_k)$, $s_{k,r} = \operatorname{sign} f(d_k)$ and $0 < B_k < \min(|f(c_k)|, |f(d_k)|)$. As already mentioned, there is no guarantee that all roots of $f$ are captured. Hence, in a second step, we use the subsequently described method $\textsc{Certify}^{\rho}$ to check whether the region of uncertainty
$$R := \left[-\frac{1}{2}, \frac{1}{2}\right] \setminus \bigcup_{k=1}^{m} J_k$$
contains a root of $f$. If we can guarantee that $f(x) \neq 0$ for all $x \in R$, we return the list $L = \{J_k\}_{k=1,\ldots,m}$ of isolating intervals. Otherwise, we double $\rho$ and start over the entire algorithm. We have already proven in Theorem 14 that $\textsc{Dcm}^{\rho}$ isolates all real roots of $f$ if $\rho \ge \rho_f^{\max}$ (i.e., $\rho$ fulfills the inequality (4.10)). The following considerations will show that, for $\rho \ge \rho_f^{\max}$, $\textsc{Certify}^{\rho}$ succeeds as well.
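The precision-doubling strategy described above can be sketched as a small driver loop. The following Python fragment is only an illustration under stated assumptions: the callbacks `run_dcm` and `run_certify` are hypothetical stand-ins for $\textsc{Dcm}^{\rho}$ and $\textsc{Certify}^{\rho}$ (they are not implemented here), `run_dcm` is assumed to return `None` for "insufficient precision", and the cap `rho_limit` is an added safeguard that is not part of the method.

```python
from typing import Callable, List, Optional, Tuple

Interval = Tuple[float, float]


def isolate_with_doubling(
    run_dcm: Callable[[int], Optional[List[Interval]]],
    run_certify: Callable[[int, List[Interval]], bool],
    rho0: int = 16,
    rho_limit: int = 1 << 20,
) -> Tuple[int, List[Interval]]:
    """Double the working precision rho until both phases succeed.

    run_dcm(rho) is a hypothetical stand-in for Dcm^rho: it returns a list
    of candidate isolating intervals, or None for "insufficient precision".
    run_certify(rho, intervals) is a hypothetical stand-in for Certify^rho:
    it returns True if the region of uncertainty is certified root-free.
    """
    rho = rho0
    while rho <= rho_limit:
        intervals = run_dcm(rho)
        # Accept the result only if the certification step succeeds as well.
        if intervals is not None and run_certify(rho, intervals):
            return rho, intervals
        rho *= 2  # either phase failed: double rho and start over
    raise RuntimeError("precision limit reached without certification")
```

Since $\rho$ is doubled in every round, the loop runs only a logarithmic number of times before $\rho$ exceeds any fixed threshold, which is the property the later cost analysis relies on.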
How can we guarantee that $f$ does not vanish on $R$? The crucial idea is to consider a decomposition of $[-\frac{1}{2}, \frac{1}{2}]$ into subintervals $I$ and corresponding $\mu_I$-approximations $g$ of $f_I$ such that $g$ is monotone on $[0, 1]$ or $T[g](0, 1)$ holds. Namely, for such an interval $I$, we can easily estimate the image $g([0, 1])$ and, thus, conclude that $f$ either contains no root in $I \cap R$ or that $\rho < \rho_f^{\max}$ because $g(t)$ and $f_I(t)$ differ by at most $(n + 1)\mu_I$ for all $t \in [0, 1]$. More precisely, we have:

Fig. 4.2. $\textsc{Dcm}^{\rho}$ returns a list $\mathcal{O} = \{(J_k, s_{k,\ell}, s_{k,r}, B_k)\}_k$, where $J_k$ is isolating for a real root of $f$, $s_{k,\ell} = \operatorname{sign} f(c_k)$, $s_{k,r} = \operatorname{sign} f(d_k)$ and $\min(|f(c_k)|, |f(d_k)|) > B_k > 0$. The intervals in between define the region of uncertainty $R$. In $\textsc{Certify}^{\rho}$, we subdivide $(-1/2, 1/2)$ into intervals $I$ such that, for a $\mu$-approximation $g$ of $f_I$, either $T[g](0, 1)$ holds or $g$ is monotone on $(0, 1)$. If $T[g](0, 1)$ holds and $|g(0)| > 8n\mu$, then $I$ contains no root of $f$; see Lemma 15 (a). If $g$ is monotone on $(0, 1)$, we consider all intervals $L_i$ in the intersection of $I$ with $R$ and check whether the conditions in Lemma 15 (b) are fulfilled. If they are fulfilled, then $f$ has no root in $L_i$; otherwise, we must have $\rho < \rho_f^{\max}$.

Lemma 15. Let $I = (a, b)$ be an interval and $g(x)$ a $\mu$-binary approximation of $f_I$ with
$$-\log \mu \ge \rho - 2(4n - \log \sigma_f). \tag{4.13}$$
(a) Suppose that $T[g](0, 1)$ holds and $I$ is not entirely contained in one of the $J_k$. If
$$|g(0)| > 8n\mu, \tag{4.14}$$
then $\bar{I}$ contains no root of $f$. Otherwise, $\rho < \rho_f^{\max}$.
(b) Suppose that $g$ is monotone on $[0, 1]$ and let $\bar{I} \cap R = \bigcup_{i=1}^{s} L_i$ be the intersection of $\bar{I}$ and $R$. For each endpoint $q$ of an arbitrary $L_i$, we define
$$\lambda(q) := \begin{cases} s_{k,\ell} \cdot B_k, & \text{if } q \notin \{a, b\} \text{ and } q \text{ is the left endpoint of an interval } J_k \\ s_{k,r} \cdot B_k, & \text{if } q \notin \{a, b\} \text{ and } q \text{ is the right endpoint of an interval } J_k \\ g(0), & \text{if } q = a \\ g(1), & \text{if } q = b. \end{cases} \tag{4.15}$$
If, for all $L_i = [q_\ell, q_r]$, $\min(|\lambda(q_\ell)|, |\lambda(q_r)|) > 4n\mu$ and $\lambda(q_\ell) \cdot \lambda(q_r) > 0$, then $\bar{I} \cap R$ contains no root of $f$. Otherwise, we have $\rho < \rho_f^{\max}$.

Proof. If $T[g](0, 1)$ holds, then $\frac{1}{3}|g(0)| < |g(t)| < \frac{5}{3}|g(0)|$ for all $t \in [0, 1]$ according to Lemma 6. Suppose that $|g(0)| > 8n\mu$, then it follows that
$$|f_I(t)| \ge |g(t)| - |g(t) - f_I(t)| > \frac{1}{3}|g(0)| - |g(t) - f_I(t)| > \frac{8}{3}n\mu - (n + 1)\mu > 0,$$
hence, $f$ has no root in $\bar{I}$. Now suppose that $|g(0)| \le 8n\mu$. Since $I$ is not contained in any $J_k$, there exists a $t \in [0, 1]$ with $x = a + t(b - a) \in R$ and
$$|f(x)| = |f_I(t)| \le |g(t)| + (n + 1)\mu \le \frac{5}{3}|g(0)| + (n + 1)\mu < 16n\mu.$$
If $\rho \ge \rho_f^{\max}$, then from (4.13) and the definition of $\rho_f^{\max}$, it follows that $-\log \mu \ge \rho_f^{\min} = \rho_f + 8n - \log \sigma_f$; see the computation in (4.11). Hence, we have $|f(x)| < 2^{-\rho_f}$. In addition, Lemma 12 (d) and Theorem 14 guarantee that $\textsc{Dcm}^{\rho}$ returns isolating intervals for all real roots of $f$, and each point in $R$ has distance $\ge \sigma(z_i, f)/(64n^3)$ from each root $z_i$. Thus, $|f(x)| > (n + 1)2^{-\rho_f}$ due to Lemma 3 (c), a contradiction. This proves (a).

For (b), we consider an arbitrary interval $L_i = [q_\ell, q_r]$. Let $t_\ell$ and $t_r$ be corresponding values in $[0, 1]$ with $q_\ell = a + t_\ell \cdot w(I)$ and $q_r = a + t_r \cdot w(I)$. If $\min(|\lambda(q_\ell)|, |\lambda(q_r)|) > 4n\mu$, then
$$\min(|g(t_\ell)|, |g(t_r)|) \ge \min(|\lambda(q_\ell)|, |\lambda(q_r)|) - (n + 1)\mu > 2n\mu.$$
Namely, for $q_\ell = a$, we obviously have $|g(t_\ell)| = |\lambda(q_\ell)|$; otherwise, $|g(t_\ell)| \ge |f_I(t_\ell)| - (n + 1)\mu \ge |\lambda(q_\ell)| - (n + 1)\mu$. For $q_r$, an analogous argument applies. If, in addition, $\lambda(q_\ell) \cdot \lambda(q_r) > 0$, then $g(t_\ell) \cdot g(t_r) > 0$ as well because $\lambda(q_\ell)$ and $\lambda(q_r)$ have the same sign as $g(t_\ell)$ and $g(t_r)$, respectively.
Since we assumed that $g$ is monotone on $[0, 1]$, it follows that $|g(t)| > 2n\mu$ for all $t \in [t_\ell, t_r]$. This shows that $|f_I(t)| \ge |g(t)| - (n + 1)\mu > 0$ for all $t \in [t_\ell, t_r]$, thus the first part of (b) follows. For the second part, suppose that $\rho \ge \rho_f^{\max}$. Then, $B_k > 2^{-\rho_f} > 4n\mu$ for all $k$ and $|f(x)| > 2^{-\rho_f}(n + 1)$ for all $x \in R$ according to Lemma 3 (c) and Theorem 14. Thus, if $a \in R$, we have
$$|g(0)| \ge |f_I(0)| - (n + 1)\mu = |f(a)| - (n + 1)\mu > 2^{-\rho_f} \cdot (n + 1) - (n + 1)\mu > 4n\mu.$$
An analogous argument applies to $b$. It follows that $|\lambda(q)| > 4n\mu$ for all endpoints $q$ of an arbitrary interval $L_i = [q_\ell, q_r]$. It remains to show that $\lambda(q_\ell) \cdot \lambda(q_r) > 0$. We have already shown that $|\lambda(q)| > 4n\mu$ for each endpoint $q$, thus, $f(q)$ must have the same sign as $\lambda(q)$. Namely, if $q \in \{a, b\}$, then $f(q)$ differs from $\lambda(q)$, whose absolute value exceeds $4n\mu$, by at most $(n + 1)\mu < 4n\mu$, and, for $q \notin \{a, b\}$, we have $\operatorname{sign}(\lambda(q)) = s_{k,\ell}$ or $\operatorname{sign}(\lambda(q)) = s_{k,r}$ depending on whether $q$ is the left or the right endpoint of an interval $J_k$. Since $\rho \ge \rho_f^{\max}$, $R$ contains no root of $f$, thus, $f(q_\ell) \cdot f(q_r) > 0$ and, since the signs agree, $\lambda(q_\ell) \cdot \lambda(q_r) > 0$ as well. □

We can now formulate the subroutine $\textsc{Certify}^{\rho}$ (see Algorithm 3 in the Appendix for pseudocode). $\textsc{Certify}^{\rho}$ is similar to $\textsc{Dcm}^{\rho}$ in the sense that we recursively subdivide $I_0 = (-\frac{1}{2}, \frac{1}{2})$ into intervals $I$ and consider corresponding $\rho_I$-binary approximations $\tilde{f}_I$ of $f_I$. Then, in each iteration, we aim to apply Lemma 15 in order to certify that $\bar{I} \cap R$ contains no root of $f$ or $\rho < \rho_f^{\max}$. Throughout the following consideration, we assume that
$$\textsc{Certify}^{\rho} \text{ never produces an interval } I \text{ of width } w(I) \le \frac{\sigma_f}{8n^2}. \tag{4.16}$$
We will prove this fact in Theorem 16 (a). Again, we mark comments which should help to follow the approach by a "//" at the beginning.

$\textsc{Certify}^{\rho}$. In a first step, we choose a $(\rho + n + 1)$-binary approximation $\tilde{f}$ of $f$ and evaluate $\tilde{f}(-\frac{1}{2} + x)$.
Then, the resulting polynomial is approximated by a $(\rho + 1)$-binary approximation $\tilde{f}_{I_0} \in [\tilde{f}(-\frac{1}{2} + x)]_{2^{-\rho-1}}$, thus, $\tilde{f}_{I_0} \in [f_{I_0}]_{2^{-\rho}}$ according to Lemma 1. $\textsc{Certify}^{\rho}$ maintains a list $\mathcal{A}$ of active nodes $(I, \tilde{f}_I, \rho_I)$, where $I = (a, b) \subset I_0$ is an interval, $\tilde{f}_I$ approximates $f_I$ to $\rho_I$ bits after the binary point and $\rho + 2\log w(I) \le \rho_I \le \rho$. We initially start with $\mathcal{A} := \{(I_0, \tilde{f}_{I_0}, \rho)\}$. For each active node, we proceed as follows:

(1) Remove $(I, \tilde{f}_I, \rho_I)$ from $\mathcal{A}$.

(2) If $I \cap R = \emptyset$, do nothing (i.e., discard $I$). Otherwise, compute $t[\tilde{f}_I](0, 1)$.
// If $I \cap R = \emptyset$, $I$ is contained in one of the intervals $J_k$, and thus we can discard $I$.

(3) If $t[\tilde{f}_I](0, 1) > -2^{-\rho_I+2} n$, check whether
$$|\tilde{f}_I(0) + \operatorname{sign}(\tilde{f}_I(0)) \cdot 2^{-\rho_I+2} n| > 2^{-\rho_I+6} n^2. \tag{4.17}$$
If (4.17) holds, do nothing (i.e., discard $I$); otherwise, return "insufficient precision".
// For $g(x) := \tilde{f}_I(x) + \operatorname{sign}(\tilde{f}_I(0)) \cdot 2^{-\rho_I+2} n \in [f_I]_{2^{-\rho_I+3} n}$, the predicate $T[g](0, 1)$ holds. From our assumption on $w(I)$, we further have $\rho_I - 3 - \log n \ge \rho + 2\log w(I) - 3 - \log n \ge \rho - 2(3 + 2\log n - \log \sigma_f) - 3 - \log n \ge \rho + 2\log \sigma_f - 8n$, and thus $2^{-\rho_I+3} n \le 2^{-\rho+2(4n-\log \sigma_f)}$. It follows that $g$ fulfills the condition (4.13) from Lemma 15 and, therefore, $\bar{I}$ contains no root of $f$ if (4.17) holds; otherwise, $\rho < \rho_f^{\max}$.

(4) If $t[\tilde{f}_I](0, 1) \le -2^{-\rho_I+2} n$, compute $\tilde{h}(x) = \sum_{i=0}^{n} \tilde{h}_i x^i := (1 + x)^n \cdot (\tilde{f}_I)'\left(\frac{1}{1+x}\right)$ and consider the following distinct cases:

(a) If $\tilde{h}_i > -n2^{n-\rho_I}$ for all $i$ (or $\tilde{h}_i < n2^{n-\rho_I}$ for all $i$), consider $g(x) := \tilde{f}_I(x) + n2^{n-\rho_I} \cdot x \in [f_I]_{n2^{n+1-\rho_I}}$ ($g(x) := \tilde{f}_I(x) - n2^{n-\rho_I} \cdot x$, respectively). Then, for each interval $L_i = [q_\ell, q_r]$, determine $\lambda(q_\ell)$ and $\lambda(q_r)$ as defined in (4.15). If $\min(|\lambda(q_\ell)|, |\lambda(q_r)|) > n2^{n+3-\rho_I}$ and $\lambda(q_\ell) \cdot \lambda(q_r) > 0$ for all $L_i$, discard $I$; otherwise, return "insufficient precision".
// Suppose $\tilde{h}_i > -n2^{n-\rho_I}$ for all $i$ and define $g(x) := \tilde{f}_I(x) + n2^{n-\rho_I} x$. Then, the polynomial $(1 + x)^n \cdot g'\left(\frac{1}{1+x}\right) = (1 + x)^n \cdot (\tilde{f}_I)'\left(\frac{1}{1+x}\right) + n2^{n-\rho_I}(1 + x)^n$ has only positive coefficients. It follows that $\operatorname{var}(g', (0, 1)) = 0$ and, therefore, $g$ is monotone on $[0, 1]$. In addition, from our assumption on $w(I)$, we have $n2^{n+1-\rho_I} \le 2^{-\rho+2(4n-\log \sigma_f)+1}$ (see the computation above). Hence, we can apply Lemma 15 (b) to $g$: That is, $\bar{I} \cap R$ contains no root of $f$ if $\min(|\lambda(q_\ell)|, |\lambda(q_r)|) > n2^{n+3-\rho_I}$ and $\lambda(q_\ell) \cdot \lambda(q_r) > 0$ for all $L_i = [q_\ell, q_r]$. If one of the latter two inequalities does not hold, then $\rho < \rho_f^{\max}$. The case $\tilde{h}_i < n2^{n-\rho_I}$ for all $i$ is treated in exactly the same manner.

(b) If there exist $\tilde{h}_i$ and $\tilde{h}_j$ with $\tilde{h}_i \le -n2^{n-\rho_I}$ and $\tilde{h}_j \ge n2^{n-\rho_I}$, then $I$ is subdivided into $I_\ell := (a, m_I)$ and $I_r := (m_I, b)$. We add $(I_\ell, \tilde{f}_{I_\ell}, \rho_I - 1)$ and $(I_r, \tilde{f}_{I_r}, \rho_I - 2)$ to $\mathcal{A}$, where $\tilde{f}_{I_\ell}$ is a $\rho_I$-binary approximation of $\tilde{f}_I(\frac{x}{2})$ and $\tilde{f}_{I_r}$ a $(\rho_I - 1)$-binary approximation of $\tilde{f}_I(\frac{x+1}{2})$; see Step 4 (b) of $\textsc{Dcm}^{\rho}$ for details. If $\rho_I < 2$, return "insufficient precision".
// Due to Lemma 1, $\tilde{f}_{I_\ell} \in [f_{I_\ell}]_{2^{-\rho_I+1}}$ and $\tilde{f}_{I_r} \in [f_{I_r}]_{2^{-\rho_I+2}}$. Hence, by induction, it follows that $\rho + 2\log w(I) \le \rho_I \le \rho$ for all active nodes.

$\textsc{Certify}^{\rho}$ stops when $\mathcal{A}$ becomes empty. If $\textsc{Certify}^{\rho}$ returns "insufficient precision", we know for sure that $\rho < \rho_f^{\max}$. Otherwise, the region of uncertainty $R$ contains no root of $f$.

The following theorem proves that our assumption (4.16) for the intervals produced by $\textsc{Certify}^{\rho}$ is correct. Furthermore, we show that $\textsc{Certify}^{\rho}$ is also efficient with respect to bit complexity, matching the worst case bound obtained for $\textsc{Dcm}^{\rho}$; see Theorem 13.

Theorem 16. For a polynomial $f$ as defined in (2.1) and an arbitrary $\rho \in \mathbb{N}$,
(a) $\textsc{Certify}^{\rho}$ does not produce an interval $I$ of width $w(I) \le \sigma_f/(8n^2)$ and it induces a recursion tree of size $O(\Sigma_f + n \log n)$.
(b) $\textsc{Certify}^{\rho}$ needs no more than $\tilde{O}(n(\Sigma_f + n \log n)(n\Gamma + \rho - \log \sigma_f))$ bit operations.
(c) For $\rho \ge \rho_f^{\max}$, $\textsc{Certify}^{\rho}$ succeeds.

Proof.
An interval $I$ is only subdivided if $t[\tilde{f}_I](0, 1) \le -2^{-\rho_I+2} n$ (Step 3) and if there exist coefficients $\tilde{h}_i$ and $\tilde{h}_j$ of $\tilde{h}(x) = \sum_{i=0}^{n} \tilde{h}_i x^i = (1 + x)^n \cdot (\tilde{f}_I)'\left(\frac{1}{1+x}\right)$ with $\tilde{h}_i < -n2^{n-\rho_I}$ and $\tilde{h}_j > n2^{n-\rho_I}$ (Step 4 (b)). In the first case, we must have $t[f_I](0, 1) < 0$ since $|t[f_I](0, 1) - t[\tilde{f}_I](0, 1)| < 2^{-\rho_I+2} n$. Hence, $T[f_I](0, 1)$ does not hold. For the second case, we have $\operatorname{var}((f_I)', (0, 1)) \neq 0$ since corresponding coefficients of $\tilde{h}(x) = (1 + x)^n \cdot (\tilde{f}_I)'\left(\frac{1}{1+x}\right)$ and $(1 + x)^n \cdot (f_I)'\left(\frac{1}{1+x}\right)$ differ by at most $n2^{n-\rho_I}$, and thus $\operatorname{var}(f', I) = \operatorname{var}((f')_I, (0, 1)) = \operatorname{var}((f_I)', (0, 1)) \neq 0$. Hence, the first part of (a) follows from Lemma 8 (d). This proves that our assumption (4.16) is always fulfilled, and thus Lemma 15 applies in Step 3 and Step 4 (a) of $\textsc{Certify}^{\rho}$. It follows that the algorithm only fails (i.e. it returns "insufficient precision") if $\rho < \rho_f^{\max}$, which shows (c). For the second part of (a), we remark that, due to the above argument, an interval $I$ is terminal if the disk $\Delta_{2nw(I)}(m(I))$ does not contain a root $\xi$ of $f$ with $\sigma(\xi, f) < 4n^2 w(I)$. In [35, Section 4.2], it has been shown that the recursion tree $T(f')$ induced by the latter property$^{14}$ has size $O(\Sigma_f + n \log n)$. Hence, the same holds for the recursion tree induced by $\textsc{Certify}^{\rho}$, which is a subtree of $T(f')$. Finally, (b) follows in a completely analogous manner as the result on the bit complexity for $\textsc{Dcm}^{\rho}$ as shown in the proof of Theorem 13. □

Eventually, we present our overall root isolation method $\textsc{RIsolate}$. It applies to a polynomial $F$ as given in (1.4) and returns isolating intervals for all real roots of $F$.

$\textsc{RIsolate}$: Choose a starting precision $\rho \in \mathbb{N}$ (e.g., $\rho = 16$) and run $\textsc{Dcm}^{\rho}$ on the polynomial $f$ as defined in (2.1). If $\textsc{Dcm}^{\rho}$ returns "insufficient precision", we double $\rho$ and start over again. Otherwise, $\textsc{Dcm}^{\rho}$ returns a list $\mathcal{O} = \{(J_k, s_{k,\ell}, s_{k,r}, B_k)\}_{k=1,\ldots,m}$ with isolating intervals $J_k$ for some of the real roots of $f$.
If $\textsc{Certify}^{\rho}$ returns "insufficient precision", we double $\rho$ and start over the algorithm. If $\textsc{Certify}^{\rho}$ succeeds, the intervals $J_k = (c_k, d_k)$ isolate all real roots of $f$. Hence, we return the intervals $(2^{\Gamma} c_k, 2^{\Gamma} d_k)$, $k = 1, \ldots, m$, which isolate the real roots of $F$. The following theorem summarizes our results:

Theorem 17. Let $F$ be a polynomial as given in (1.4). Then, $\textsc{RIsolate}$ determines isolating intervals $J_1, \ldots, J_m$ for all real roots of $F$ and, for each interval $J \in \{J_1, \ldots, J_m\}$ containing a root $\xi$ of $F$, it holds that
$$\frac{\sigma(\xi, F)}{16n^2} < w(J) < 2n\sigma(\xi, F).$$
$\textsc{RIsolate}$ demands for coefficient approximations of $F$ to $\tilde{O}(\Sigma_F + n\Gamma_F)$ bits after the binary point and the total cost is bounded by $\tilde{O}(n(\Sigma_F + n\Gamma_F)^2) = \tilde{O}(n(\Sigma_F + n\tau)^2)$ bit operations. For $F \in \mathbb{Z}[x]$ a polynomial with integer coefficients of bitsize $\tau$, $\textsc{RIsolate}$ computes isolating intervals with $\tilde{O}(n^3 \tau^2)$ bit operations.

$^{14}$ In [35, Section 4.2], $T(f')$ is defined as the subdivision tree obtained by recursive bisection of the interval $(-\frac{1}{4}, \frac{1}{4})$ in accordance with the following rule: At depth $h \in \mathbb{N}_0$, an interval $I = (-\frac{1}{4} + i2^{-h-1}, -\frac{1}{4} + (i + 1)2^{-h-1})$ is subdivided if and only if $\operatorname{var}(f', I) \neq 0$ and $\Delta_{2^8 n^5 w(I)}(m(I))$ contains a root $\xi$ of $f$ with separation $\sigma(\xi, f) < 2^7 n^5 w(I)$. For the given situation, we can alternatively define $T(f')$ as the (even smaller) tree obtained by recursive bisection of $(-\frac{1}{2}, \frac{1}{2})$, where an interval $I$ is subdivided if $\operatorname{var}(f', I) \neq 0$ and $\Delta_{2nw(I)}(m(I))$ contains a root $\xi$ of $f$ with $\sigma(\xi, f) < 4n^2 w(I)$.

Proof. It remains to prove the complexity bounds and the claim on the width of the isolating intervals. According to Appendix 6.1, the computation of an approximate logarithmic root bound $\Gamma \in \mathbb{N}$ as defined in Section 2.2 needs $\tilde{O}(n^2 \Gamma_F)$ bit operations.
For a fixed precision $\rho$, the total cost for running $\textsc{Dcm}^{\rho}$ and $\textsc{Certify}^{\rho}$ is bounded by
$$\tilde{O}(n(\Sigma_f + n \log n)(n\Gamma + \rho - \log \sigma_f)) = \tilde{O}(n(\Sigma_F + n\Gamma)(n\Gamma + \rho - \log \sigma_F))$$
bit operations; see Theorem 13 and Theorem 16. Since we double $\rho$ in each step and succeed for $\rho \ge \rho_f^{\max}$, $\rho$ is always bounded by $2\rho_f^{\max} = O(\Sigma_f + n) = O(\Sigma_F + n\Gamma)$ and we need at most a logarithmic number of rounds. Hence, it follows that (up to logarithmic factors) the total cost is dominated by the cost for the last run, which is $\tilde{O}(n(\Sigma_F + n\Gamma)^2)$. Furthermore, we have to approximate the coefficients of $f$ to $O(\Sigma_f + n) = O(\Sigma_F + n\Gamma)$ bits after the binary point. Hence, the coefficients of $F$ have to be approximated to $O(\Sigma_F + n\Gamma)$ bits after the binary point; see Section 2.3 for more details. From our construction of $f$ and $\Gamma$, it holds that $\Gamma < 8 \log n + 1 + \Gamma_F$. Hence, we can replace $\Gamma$ by $\Gamma_F$ in the above complexity bounds. For the special case where $F = A_n \cdot x^n + \cdots + A_0$ is a polynomial with integer coefficients of bit size $\tau$, we first divide $F$ by its leading coefficient to meet the requirements in (1.4). Then, the bound on the bit complexity follows from the above general bound (applied to $F(x)/A_n$) and the fact that $\Sigma_F = \tilde{O}(n\tau)$; see Appendix 6.2.

The estimate on the size of the isolating intervals is due to the following consideration: An interval $I$ which contains the root $z = \frac{\xi}{2^{\Gamma+1}}$ of $f$ is not subdivided by $\textsc{Dcm}^{\rho}$ if $w(I) \le \sigma(z, f)/(8n^2)$. Hence, any interval $J_k$ which is returned by $\textsc{Dcm}^{\rho}$ as an isolating interval for $z$ is the extension $\tilde{I} = (a - \frac{w(I)}{2n}, b + \frac{w(I)}{2n})$ of an interval $I = (a, b)$ with $w(I) > \sigma(z, f)/(16n^2)$, and thus $w(J) = 2^{\Gamma+1} w(I) > \sigma(\xi, F)/(16n^2)$. From our construction, the $\frac{w(I)}{n}$-neighborhood of $I$ isolates $z$ as well. This shows that $w(J) = 2^{\Gamma+1} w(I) < 2^{\Gamma+2} n \sigma(z, f) = 2n\sigma(\xi, F)$. □

4.4. Some Remarks

4.4.1. On the Complexity Analysis for Integer Polynomials

We remark that, in order to achieve the complexity bound $\tilde{O}(n^3 \tau^2)$ for integer polynomials, the subroutine $\textsc{Certify}^{\rho}$ and its analysis are actually not needed. Namely, due to our considerations in Appendix 6.2, we can compute explicit upper bounds for $\Sigma_f$ (in terms of $n$ and $\tau$) and, thus, also an explicit upper bound $\rho^*(n, \tau)$ for $\rho_f^{\max}$ which matches $\rho_f^{\max}$ at least with respect to worst case complexity. Then, Theorem 14 guarantees that $\textsc{Dcm}^{\rho^*(n,\tau)}$ computes isolating intervals for all real roots of $f$. Unfortunately, this approach cannot be considered practical at all because such upper bounds usually tend to be much larger than the actual $\rho_f^{\max}$. We would like to emphasize that our algorithm is output sensitive in the sense that it demands a precision which is not much larger than $\rho_f$.

At this point, we even conjecture that, for any bisection method, the bound on the bit complexity as achieved by our algorithm is optimal (up to log-factors). We are not aware of any lower bounds for the bit complexity of root isolation algorithms, and we also cannot provide a rigorous proof for the optimality of our bound. However, the intuition behind our claim is that, for a Mignotte polynomial $F$, the bounds on the precision demand and the size of the recursion tree seem to be optimal. Namely, any bisection algorithm needs $\Theta(n\tau)$ steps to separate two roots of $F$ with pairwise distance $2^{-\Theta(n\tau)}$. In addition, if we perturb the coefficients of $F$ by more than $2^{-\Theta(n\tau)}$, the two ordinary roots can move by more than their initial separations, hence it seems that a precision of size $\Theta(n\tau)$ is needed to isolate these roots from each other.

We finally remark that, in practice, it may be advantageous to start with the exact representation of the rational polynomial $F/A_n$ and not with an (artificial) approximation to a certain number $\rho$ of bits.
In this case, we propose to use the classical Descartes method with exact arithmetic for the first iterations, and only if the exact representation of the polynomials constructed during the subdivision process exceeds the given working precision $\rho$, we switch to our modified Descartes method, where approximations are considered. However, as already argued above, we do not expect that this approach yields any improvement with respect to worst case bit complexity.

4.4.2. On Efficient Implementation

We formulated our algorithm in a way that makes it accessible to the complexity analysis but still feasible and efficient for an implementation. Nevertheless, we recommend considering a slight modification of our algorithm when actually implementing it. For our certification step $\textsc{Certify}^{\rho}$, the most obvious modification is to only subdivide the region $R$ instead of the entire interval $(-\frac{1}{2}, \frac{1}{2})$. More precisely, $R$ decomposes into intervals $L_j$ "in between" the isolating intervals $J_k$. Then, we approximate the polynomials $f_{L_j}$ to $\rho$ bits after the binary point and process each $L_j$ recursively in a similar way as proposed in $\textsc{Certify}^{\rho}$. An experimental implementation of our algorithm in Maple has shown that, following this approach, the running time for the certification step is almost negligible, whereas, for the original formulation, it is approximately of the same magnitude as the running time for $\textsc{Dcm}^{\rho}$.

Furthermore, we propose to also use the inclusion predicate based on Descartes' Rule of Signs. With respect to complexity, our inclusion predicate based on the $T'[3/2](\cdot)$-test (see Corollary 7) is comparable to Descartes' Rule of Signs, where we check whether $f$ has exactly one sign variation for a certain interval. However, in practice, this subtle difference is crucial because already $\log n$ additional bisection steps for each root may render an algorithm inefficient.
As an alternative, for an interval $I$, we propose to check whether there exists a "good" approximation $g$ of $f_I$ with $\operatorname{var}(g, (0, 1)) = 1$.$^{15}$ Namely, if there exists such a $g$, we can proceed with $\tilde{f}_I := g$ which has exactly one root in $I$. Thus, it is easy to refine $I$ (via simple bisection or quadratic interval refinement) such that $T[g'](0, 2)$ holds as well.

$^{15}$ Since $\tilde{f}_I$ approximates $f_I$ to $\rho_I$ bits after the binary point, we can derive an interval approximation $[f_{I,\mathrm{rev}}] := [a_0^-, a_0^+] + \cdots + [a_n^-, a_n^+] \cdot x^n$ of $f_{I,\mathrm{rev}}$ to $\rho_I - n$ bits after the binary point. We can then easily check whether there exists a polynomial $h \in [f_{I,\mathrm{rev}}]$ with $\operatorname{var}(h) = 1$, and transform $h$ back to $g(x) = (x - 1)^n \cdot h(1/(x - 1))$. The so-obtained polynomial $g$ approximates $f_I$ to at least $\rho_I - 2(n + 1)$ bits after the binary point and $\operatorname{var}(g, [0, 1]) = 1$. Notice that, following this approach, the precision $\rho_I$ has to be updated accordingly.

We finally report on an interesting behavior of the proposed method. It is easy to see that, for small intervals $I = (a, b)$, the leading coefficients of $f_I(x) = f(a + w(I) \cdot x)$ are considerably smaller than the first-order coefficients. Since we only consider a certain number $\rho_I \le \rho$ of bits after the binary point, the approximations $\tilde{f}_I$ can be chosen of a considerably lower degree than $f_I$. As a consequence, the cost at such an interval is considerably reduced because we have to compute the polynomial $\tilde{f}_{I^+} = \tilde{f}_I\left(-\frac{1}{4n} + (1 + \frac{1}{2n}) \cdot x\right)$, which is expensive for large degrees. In particular, for a polynomial with two very nearby roots (such as Mignotte polynomials), this behavior can be clearly observed. More precisely, when refining an interval $I$ that contains two nearby roots, the degree of $\tilde{f}_I$ decreases in each bisection step and eventually equals 2 for $I$ small enough.
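The degree reduction described here can be illustrated with a small exact computation. The sketch below is an assumption-laden illustration, not the paper's procedure: the helper names `local_polynomial` and `effective_degree` are hypothetical, exact rational arithmetic replaces the approximate arithmetic of the method, and "rounding each coefficient to $\rho$ bits after the binary point" is used as one simple way to pick a $\rho$-bit approximation.

```python
from fractions import Fraction
from math import comb


def local_polynomial(coeffs, a, w):
    """Coefficients (lowest degree first) of f_I(x) = f(a + w*x)."""
    n = len(coeffs) - 1
    a, w = Fraction(a), Fraction(w)
    out = []
    for j in range(n + 1):
        # j-th Taylor coefficient of f at a ...
        c = sum(Fraction(coeffs[i]) * comb(i, j) * a ** (i - j)
                for i in range(j, n + 1))
        # ... scaled by w^j, which shrinks high-order terms for small w
        out.append(c * w ** j)
    return out


def effective_degree(coeffs, rho):
    """Largest index whose coefficient does not round to zero when only
    rho bits after the binary point are kept (-1 if all round to zero)."""
    scale = 1 << rho
    deg = -1
    for i, c in enumerate(coeffs):
        if round(Fraction(c) * scale) != 0:
            deg = i
    return deg
```

For example, for $f(x) = 1 + x + x^2 + x^3$, $a = 0$ and $w(I) = 1/16$, the local coefficients are $1, 2^{-4}, 2^{-8}, 2^{-12}$, so a 6-bit approximation is already linear while a 13-bit approximation keeps the full degree.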
We consider this behavior as quite natural because $f_I$ implicitly captures the information on the location of the roots in a neighborhood of $I$, whereas the influence of all other roots becomes almost negligible.

5. Conclusion

We presented a novel deterministic algorithm to isolate the real roots of a square-free polynomial $F$ with arbitrary real coefficients. Our analysis shows that the hardness of isolating the real roots exclusively depends on the location of the roots and not on the coefficient type of $F$. Furthermore, the overall running time is significantly reduced by considering approximations at each node of the recursion tree. In particular, for integer polynomials, we achieve an improvement with respect to worst case bit complexity by a factor $n = \deg F$ compared to other practical methods such as the Descartes method, the continued fraction method, or the Sturm method. The improvement stems from the fact that exact arithmetic produces too much information for the task of root isolation and, thus, a significant overhead of computation. Hence, for the main part, we consider our result to be the missing theoretical proof of a fact that has already been observed in practice, namely, that using approximate but certified arithmetic instead of exact arithmetic yields a significant improvement. We are confident that this result does not only hold for the Descartes method but also for a majority of the known real root solvers and encourage other researchers to develop corresponding approximate variants.

Very recent work [37] shows how to combine the Descartes method with Newton iteration to improve upon the linear convergence of the original Descartes method. The algorithm $\textsc{NewDsc}$ from [37] can only be used to isolate the real roots of an integer polynomial $F$, and it achieves a worst case bit complexity of $\tilde{O}(n^3 \tau)$.
Hence, the major remaining research question is whether combining the two approaches, that is, using approximate arithmetic as proposed in this paper and using Newton iteration as proposed in [37], yields a practical algorithm that achieves record complexity bounds comparable to the bounds achieved by Pan's method.

References

[1] A. G. Akritas. There is no "Descartes' method". In Computer Algebra in Education, pages 19–35, 2008.
[2] A. G. Akritas and A. Strzeboński. A comparative study of two real root isolation methods. Nonlinear Analysis: Modelling and Control, 10(4):297–304, 2005.
[3] A. Alesina and M. Galuzzi. A new proof of Vincent's theorem. L'Enseignement Mathematique, 44:219–256, 1998.
[4] A. Alesina and M. Galuzzi. Addendum to the paper "A new proof of Vincent's theorem". L'Enseignement Mathematique, 45:379–380, 1999.
[5] A. Alesina and M. Galuzzi. Vincent's theorem from a modern point of view. Categorical Studies in Italy 2000, Rendiconti del Circolo Matematico di Palermo, Serie II, 64:179–191, 2000.
[6] S. Basu, R. Pollack, and M.-F. Roy. Algorithms in Real and Algebraic Geometry. Springer, 2nd edition, 2006.
[7] E. Berberich, P. Emeliyanenko, and M. Sagraloff. An elimination method for solving bivariate polynomial systems: Eliminating the usual drawbacks. In ALENEX: Algorithm Engineering and Experiments, pages 35–47, 2011.
[8] D. Bini and G. Fiorentino. Design, analysis and implementation of a multiprecision polynomial rootfinder. Numerical Algorithms, 23:127–173, 2000.
[9] M. Burr and F. Krahmer. SqFreeEVAL: An (almost) optimal real-root isolation algorithm. Journal of Symbolic Computation, 47(2):153–166, 2012.
[10] G. E. Collins and A. G. Akritas. Polynomial real root isolation using Descarte's rule of signs. In SYMSAC: Symposium on Symbolic and Algebraic Computation, pages 272–275, 1976.
[11] Z. Du, V. Sharma, and C. Yap. Amortized bounds for root isolation via Sturm sequences. In SNC: Symbolic and Numeric Computation, pages 113–130, 2007.
[12] A. Eigenwillig. On multiple roots in Descartes' rule and their distance to roots of higher derivatives. Journal of Computational and Applied Mathematics, 200(1):226–230, 2007.
[13] A. Eigenwillig. Real Root Isolation for Exact and Approximate Polynomials using Descartes' Rule of Signs. PhD thesis, Universität des Saarlandes, May 2008.
[14] A. Eigenwillig, L. Kettner, W. Krandick, K. Mehlhorn, S. Schmitt, and N. Wolpert. An exact Descartes algorithm with approximate coefficients. In CASC: Computer Algebra in Scientific Computing, pages 138–149, 2005.
[15] A. Eigenwillig, V. Sharma, and C. K. Yap. Almost tight recursion tree bounds for the Descartes method. In ISSAC: International Symposium on Symbolic and Algebraic Computation, pages 71–78, 2006.
[16] J. Gerhard. Modular algorithms in symbolic summation and symbolic integration. LNCS, Springer, 3218, 2004.
[17] X. Gourdon. Combinatoire, Algorithmique et Géométrie des Polynômes. Thèse, École polytechnique, 1996.
[18] M. Hemmer, E. P. Tsigaridas, Z. Zafeirakopoulos, I. Z. Emiris, M. I. Karavelas, and B. Mourrain. Experimental evaluation and cross-benchmarking of univariate real solvers. In SNC: Symbolic Numeric Computation, pages 45–54, 2009.
[19] J. R. Johnson and W. Krandick. Polynomial real root isolation using approximate arithmetic. In ISSAC: International Symposium on Symbolic and Algebraic Computation, pages 225–232, 1997.
[20] M. Kerber and M. Sagraloff. Efficient real root approximation. In ISSAC: International Symposium on Symbolic and Algebraic Computation, pages 209–216, 2011.
[21] M. Kerber and M. Sagraloff. A worst-case bound for topology computation of algebraic curves. Journal of Symbolic Computation, 47(3):239–258, 2012.
[22] W. Krandick and K. Mehlhorn. New bounds for the Descartes method. Journal of Symbolic Computation, 41(1):49–66, 2006.
[23] J. McNamee and V. Pan. Numerical Methods for Roots of Polynomials. Number 2 in Studies in Computational Mathematics. Elsevier Science, 2013.
[24] K. Mehlhorn and S. Ray. Faster algorithms for computing Hong's bound on absolute positiveness. Journal of Symbolic Computation, 45(6):677–683, 2010.
[25] K. Mehlhorn and M. Sagraloff. A deterministic algorithm for isolating real roots of a real polynomial. Journal of Symbolic Computation, 46(1):70–90, 2011.
[26] K. Mehlhorn, M. Sagraloff, and P. Wang. From approximate factorization to root isolation. In ISSAC: International Symposium on Symbolic and Algebraic Computation, pages 283–290, 2013.
[27] B. Mourrain, F. Rouillier, and M.-F. Roy. Bernsteins basis and real root isolation. Combinatorial and Computational Geometry, 52:459–478, 2005.
[28] N. Obreschkoff. Über die Wurzeln von algebraischen Gleichungen. Jahresbericht der Deutschen Mathematiker-Vereinigung, 33:52–64, 1925.
[29] N. Obreschkoff. Verteilung und Berechnung der Nullstellen reeller Polynome. VEB Deutscher Verlag der Wissenschaften, 1963.
[30] N. Obreschkoff. Zeros of Polynomials. Marina Drinov, Sofia, 2003. Translation of the Bulgarian original.
[31] A. M. Ostrowski. Note on Vincent's theorem. Annals of Mathematics, Second Series, 52(3):702–707, 1950. Reprinted in: Alexander Ostrowski, Collected Mathematical Papers, vol. 1, Birkhäuser Verlag, 1983, pp. 728–733.
[32] V. Y. Pan. Solving a polynomial equation: some history and recent progress. SIAM Review, 39(2):187–220, 1997.
[33] V. Y. Pan. Univariate polynomials: Nearly optimal algorithms for numerical factorization and root-finding. Journal of Symbolic Computation, 33(5):701–733, 2002.
[34] F. Rouillier and P. Zimmermann. Efficient isolation of a polynomial's real roots. Journal of Computational and Applied Mathematics, pages 33–50, 2004.
[35] M. Sagraloff. A general approach to isolating roots of a bitstream polynomial. Mathematics in Computer Science, 4(4):481–506, 2010.
[36] M. Sagraloff. On the complexity of real root isolation. CoRR, abs/1011.0344, 2010.
[37] M. Sagraloff. When Newton meets Descartes: a simple and fast algorithm to isolate the real roots of a polynomial. In ISSAC: International Symposium on Symbolic and Algebraic Computation, pages 297–304, 2012.
[38] M. Sagraloff and C.-K. Yap. A simple but exact and efficient algorithm for complex root isolation. In ISSAC: International Symposium on Symbolic and Algebraic Computation, pages 353–360, 2011.
[39] A. Schönhage. The fundamental theorem of algebra in terms of computational complexity, 1982. Manuscript, Department of Mathematics, University of Tübingen. Updated 2004.
[40] A. Schönhage. Quasi-GCD computations. Journal of Complexity, 1(1):118–137, 1985.
[41] V. Sharma. Complexity of real root isolation using continued fractions. Theoretical Computer Science, 409(2):292–310, 2008.
[42] A. W. Strzebonski and E. P. Tsigaridas. Univariate real root isolation in an extension field. In ISSAC: International Symposium on Symbolic and Algebraic Computation, pages 321–328, 2011.
[43] A. W. Strzebonski and E. P. Tsigaridas. Univariate real root isolation in multiple extension fields. In ISSAC: International Symposium on Symbolic and Algebraic Computation, pages 343–350, 2012.
[44] E. Tsigaridas and I. Emiris. On the complexity of real root isolation using continued fractions. Theoretical Computer Science, pages 158–173, 2008.
[45] J. von zur Gathen and J. Gerhard. Fast algorithms for Taylor shifts and certain difference equations. In ISSAC: International Symposium on Symbolic and Algebraic Computation, pages 40–47, New York, NY, USA, 1997. ACM.
[46] J.-C. Yakoubsohn. Numerical analysis of a bisection-exclusion method to find zeros of univariate analytic functions. Journal of Complexity, 21(5):652–690, 2005.
[47] C. Yap. Fundamental Problems in Algorithmic Algebra. Oxford University Press, 2000.

6. Appendix

6.1. Approximating $\Gamma_F$

Theorem 1. An integer $\Gamma \in \mathbb{N}$ with
$$\Gamma_F \le \Gamma < 8 \log n + \Gamma_F \tag{6.1}$$
can be computed with $\tilde{O}(n^2 \Gamma_F)$ bit operations.
The computation uses an approximation of $F$ to $L = O(n \Gamma_F)$ bits after the binary point.

Proof. Consider the Cauchy polynomial
\[ \bar{F}(x) := |A_n| x^n - \sum_{i=0}^{n-1} |A_i| x^i \]
of $F$. Then, according to [13, Proposition 2.51], $\bar{F}$ has a unique positive real root $\xi \in \mathbb{R}^+$, and the following inequality holds:
\[ \max_{i=1,\ldots,n} |z_i| \le \xi < \frac{n}{\ln 2} \cdot \max_{i=1,\ldots,n} |z_i| < 2n \cdot \max_{i=1,\ldots,n} |z_i|. \]
It follows that $\bar{F}(x) > 0$ for all $x > \xi$ and $\bar{F}(x) < 0$ for all $0 < x < \xi$. Furthermore, since $\bar{F}$ coincides with its own Cauchy polynomial, each complex root of $\bar{F}$ has absolute value less than or equal to $\xi$. Let $k_0$ be the smallest non-negative integer $k$ with $\bar{F}(2^k) > 0$ (which is equal to the smallest $k$ with $2^k > \xi$). Our goal is to compute an integer $\Gamma$ with $k_0 \le \Gamma \le k_0 + 1$. Namely, if $\Gamma$ fulfills the latter inequality, then
\[ \max(1, \max_i |z_i|) \le \max(1, \xi) \le 2^{\Gamma} < 4 \max(1, \xi) < 8n \cdot \max(\max_i |z_i|, 1), \]
and thus $\Gamma$ fulfills inequality (6.1).

In order to compute a $\Gamma$ with $k_0 \le \Gamma \le k_0 + 1$, we use exponential and binary search (try $k = 1, 2, 4, 8, \ldots$ until $\bar{F}(2^k) > 0$ and then perform binary search on the interval from $k/2$ to $k$) together with approximate evaluation of $\bar{F}$ at the points $2^k$. More precisely, we evaluate $\bar{F}(2^k)$ using interval arithmetic with a precision $\rho$ (using fixed point arithmetic) which guarantees that the width $w$ of $B(\bar{F}(2^k), \rho)$ is smaller than $1/4$, where $B(E, \rho)$ is the interval obtained by evaluating a polynomial expression $E$ via interval arithmetic with precision $\rho$ for the basic arithmetic operations; see [20, Section 4] for details. We use [20, Lemma 3] to estimate the cost of each such evaluation: Since $\bar{F}$ has coefficients of size less than $2^n \mathrm{Mea}(F)$, we have to choose $\rho$ such that
\[ 2^{-\rho+2} (n+1)^2 \, \mathrm{Mea}(F) \, 2^{n+nk} < 1/4 \]
in order to ensure that $w < 1/4$. Hence, $\rho$ is bounded by $O(\log \mathrm{Mea}(F) + nk)$ and, thus, each interval evaluation needs $\tilde{O}(n(\log \mathrm{Mea}(F) + nk))$ bit operations. We now use exponential plus binary search to find the smallest $k$ such that $B(\bar{F}(2^k), \rho)$ contains only positive values.
The following argument then shows that $k_0 \le k \le k_0 + 1$: Obviously, we must have $k \ge k_0$ since $\bar{F}(2^k) \le 0$ and $\bar{F}(2^k) \in B(\bar{F}(2^k), \rho)$ for all $k < k_0$. Furthermore, the point $x = 2^{k_0+1}$ has distance more than 1 to each of the roots of $\bar{F}$, and thus $|\bar{F}(2^{k_0+1})| \ge |A_n| \ge 1/4$. Hence, it follows that $B(\bar{F}(2^{k_0+1}), \rho)$ contains only positive values. For the search, we need $O(\log k_0) = O(\log \log \xi) = O(\log(\log n + \Gamma_F))$ iterations, and the cost for each of these iterations is bounded by $\tilde{O}(n(\log \mathrm{Mea}(F) + n k_0)) = \tilde{O}(n^2 \Gamma_F)$ bit operations. □

6.2. Integer Polynomials

For a polynomial $F$ with integer coefficients of bit size $\tau$, we aim to show that $\Sigma_F = \tilde{O}(n\tau)$. We proceed in two steps: First, we cluster the roots $\xi_i$ of $F$ into subsets consisting of nearby roots. Second, we apply the generalized Davenport-Mahler bound [11, 13] to the roots of $F$.

W.l.o.g., we can assume that $\sigma(\xi_1, F) \le \cdots \le \sigma(\xi_n, F)$. For $h \in \mathbb{N}$, we denote by $i(h)$ the maximal index $i$ with $\sigma(\xi_i, F) \le 2^{-h}$ and by $R(h) := \{\xi_1, \ldots, \xi_{i(h)}\}$ the corresponding set of roots. If $h \le \log(1/\sigma_F)$, then $R(h)$ contains at least two roots. For a fixed $h$, we are interested in a partition of $R := R(h)$ into disjoint subsets $R_1, \ldots, R_l$ that consist of nearby points only.

Lemma 18. Suppose that $h \le \log(1/\sigma_F)$. Then, there exists a partition of $R := R(h)$ into disjoint sets $R_1, \ldots, R_l$ such that $|R_i| \ge 2$ for all $i \in \{1, \ldots, l\}$ and $|\xi - \xi'| \le n 2^{-h}$ for all $\xi, \xi' \in R_i$.

Proof. We initially set $R_1 := \{\xi_1\}$. Then, we add all roots $\xi_i$ to $R_1$ that satisfy $|\xi_i - \xi_1| \le 2^{-h}$. For each root in $R_1$, we proceed in the same way. More precisely, for each $\xi \in R_1$, we add those roots $\xi' \in R$ to $R_1$ with $|\xi - \xi'| \le 2^{-h}$. If no further root can be added to $R_1$, we consider the set $R \setminus R_1$ of the remaining roots and treat it in exactly the same manner. Finally, we end up with a partition $R_1, \ldots, R_l$ of $R$ such that, for any two points in any $R_i$, their distance is less than or equal to $(|R_i| - 1) 2^{-h} < n \cdot 2^{-h}$.
Furthermore, each of the sets $R_i$ must contain at least two roots since $\sigma(\xi_i, F) \le 2^{-h}$ for all $i = 1, \ldots, i(h)$: the root nearest to such a $\xi_i$ has distance at most $2^{-h}$ to $\xi_i$, hence its own separation is at most $2^{-h}$ as well, and so it also belongs to $R$. □

We now fix an $h > 4 \log n$ and consider a partitioning of $R := R(h)$ as in the above lemma. Then, we define $G_i$ to be the directed graph on each $R_i$ which connects consecutive roots of $R_i$ in ascending order of their absolute values. We further define $G := (R, E)$ as the union of all $G_i$. Then, $G$ is a directed graph on $R$ with the following properties: (1) each edge $(\alpha, \beta) \in E$ satisfies $|\alpha| \le |\beta|$, (2) $G$ is acyclic, and (3) the in-degree of each node is at most 1. Hence, we can apply the generalized Davenport-Mahler bound [11, 13] to $G$:
\[ \prod_{(\alpha,\beta) \in E} |\alpha - \beta| \ge \frac{1}{(\sqrt{n+1} \cdot 2^{\tau})^{n-1}} \cdot \left( \frac{\sqrt{3}}{n} \right)^{\#E} \cdot \left( \frac{1}{n} \right)^{n/2}. \]
Since each set $R_i$ contains at least 2 roots, we must have $i(h) > \#E \ge i(h)/2$. Furthermore, for each edge $(\alpha, \beta) \in E$, we have $|\alpha - \beta| \le n \cdot 2^{-h}$. Thus, it follows that (notice that $n \cdot 2^{-h} < 1$)
\[ (n \cdot 2^{-h})^{i(h)/2} > \frac{1}{(\sqrt{n+1} \cdot 2^{\tau})^{n-1}} \cdot \left( \frac{\sqrt{3}}{n} \right)^{i(h)} \cdot \left( \frac{1}{n} \right)^{n/2} > \frac{1}{(n+1)^n \cdot 2^{n\tau}} \cdot \left( \frac{3}{n^2} \right)^{i(h)/2}, \]
and thus
\[ i(h) < \frac{2n(\tau + \log(n+1))}{\log 3 - 3 \log n + h} < \frac{2n(\tau + \log(n+1))}{h/4} = \frac{8n(\tau + \log(n+1))}{h}. \]
From the latter inequality, we conclude that $\log(1/\sigma_F) < 4n(\tau + \log(n+1)) + 1$ since, otherwise, there exists an $h$ with $4n(\tau + \log(n+1)) \le h \le \log(1/\sigma_F)$ and $i(h) < 2$, which is not possible. For the bound on $\Sigma_F$, it suffices to consider only the roots $\xi_1, \ldots, \xi_k$ with separation less than $1/n^4$ since each root with a larger separation contributes at most $\max(\tau + 1, 4 \log n)$ to $\Sigma_F$. Thus, $\Sigma_F = \tilde{O}(n\tau)$ follows from
\[ -\sum_{i=1}^{k} \log \sigma(\xi_i, F) < \sum_{h=1}^{\lceil 4n(\tau+\log(n+1)) \rceil} i(h) < 8n(\tau + \log(n+1)) \sum_{h=1}^{\lceil 4n(\tau+\log(n+1)) \rceil} \frac{1}{h} = O(n\tau \log(n\tau)). \]

6.3. Algorithms

Algorithm 1 DCM
Require: polynomial $f = \sum_{0 \le i \le n} a_i x^i \in \mathbb{R}[x]$ as defined in (2.1)
Ensure: returns a list $O$ of disjoint isolating intervals for all real roots of $f$ {only in the REAL-RAM model}
  $I_0 := (-\frac{1}{2}, \frac{1}{2})$
  $f_{I_0}(x) := f(-\frac{1}{2} + x)$
  $A := \{(I_0, f_{I_0})\}$; $O := \emptyset$ {list of active and isolating intervals}
  repeat
    $(I, f_I)$ some element in $A$ with $I = (a, b)$; delete $(I, f_I)$ from $A$
    $f_{I^+}(x) := f_I\left(-\frac{1}{4n} + \left(1 + \frac{1}{2n}\right) x\right)$ and $f_{I^+,\mathrm{rev}}(x) = \sum_{i=0}^{n} h_i x^i := (1+x)^n \cdot f_{I^+}\left(\frac{1}{1+x}\right)$
    if $\mathrm{var}(f_{I^+,\mathrm{rev}}) = 0$ then
      do nothing
    else
      if $t[(f_I)', 3/2](0, 2) > 0$ then
        $s := \mathrm{sign}(f_{I^+}(0) \cdot f_{I^+}(1))$
        if $s \ge 0$ then
          do nothing
        else
          if $I^+$ does not intersect any interval in $O$ then
            add $I^+$ to $O$
          else
            do nothing
          end if
        end if
      else
        subdivide $I$ into $I_\ell := (a, m_I)$ and $I_r := (m_I, b)$
        $f_{I_\ell}(x) := f_I\left(\frac{x}{2}\right)$ and $f_{I_r}(x) := f_I\left(\frac{x+1}{2}\right) = f_{I_\ell}(x+1)$
        add $(I_\ell, f_{I_\ell})$ and $(I_r, f_{I_r})$ to $A$
      end if
    end if
  until $A$ is empty
  return $O$

Algorithm 2 DCM$^\rho$
Require: polynomial $f = \sum_{0 \le i \le n} a_i x^i \in \mathbb{R}[x]$ as in (2.1) and a $\rho \in \mathbb{N}$
Ensure: returns "insufficient precision" or a list $O = \{(J_k, s_{k,\ell}, s_{k,r}, B_k)\}$ of disjoint isolating intervals $J_k = (c_k, d_k)$ for some of the real roots of $f$ (and $s_{k,\ell} = \mathrm{sign}(f(c_k))$, $s_{k,r} = \mathrm{sign}(f(d_k))$ and $0 < B_k \le \min(|f(c_k)|, |f(d_k)|)$).
  $I_0 := (-\frac{1}{2}, \frac{1}{2})$
  $\tilde{f}$ a $(\rho + n + 1)$-binary approximation of $f$
  $\tilde{f}_{I_0}$ a $(\rho + 1)$-binary approximation of $\tilde{f}(-\frac{1}{2} + x)$ {$\Rightarrow \tilde{f}_{I_0} \in [f_{I_0}]_{2^{-\rho}}$}
  $A := \{(I_0, \tilde{f}_{I_0}, \rho)\}$; $O := \emptyset$ {list of active and isolating intervals}
  repeat
    $(I, \tilde{f}_I, \rho_I)$, where $I := (a, b)$, some element in $A$; delete $(I, \tilde{f}_I, \rho_I)$ from $A$
    $\tilde{f}_{I^+}(x) := \tilde{f}_I\left(-\frac{1}{4n} + \left(1 + \frac{1}{2n}\right) x\right)$ and $\tilde{h}(x) = \sum_{i=0}^{n} \tilde{h}_i x^i := (1+x)^n \cdot \tilde{f}_{I^+}\left(\frac{1}{1+x}\right)$
    if $\tilde{h}_i > -2^{n+2-\rho_I}$ for all $i$ or $\tilde{h}_i < 2^{n+2-\rho_I}$ for all $i$ then
      do nothing
    else
      if $t[(\tilde{f}_I)', 3/2](0, 2) > -n \cdot 2^{n+1-\rho_I}$ then
        $\hat{f}_I(x) := \tilde{f}_I(x) + \mathrm{sign}((\tilde{f}_I)'(0)) \cdot n \cdot 2^{n+1-\rho_I} \cdot x$
        $\lambda^- := \tilde{f}_{I^+}(0) - 2^{n-1-\rho_I}$, $\lambda^+ := \tilde{f}_{I^+}(1) + (4n+1) 2^{n-1-\rho_I}$ and $\lambda := \tilde{f}_I(-1/n) - 2^{n+1-\rho_I}$.
        if $\tilde{I} := \left(a - \frac{w(I)}{2n}, b + \frac{w(I)}{2n}\right)$ intersects no $J$ for all $(J, s_{J,\ell}, s_{J,r}, B_J) \in O$ and $\lambda^- \cdot \lambda^+ < 0$ and $\min(|\lambda^-|, |\lambda^+|) > n 2^{n+3-\rho_I}$ and $|\lambda| > n^2 2^{\deg(\hat{f}_I)+7+n-\rho_I}$ then
          add $(\tilde{I}, \mathrm{sign}(\lambda^-), \mathrm{sign}(\lambda^+), \min(|\lambda^-|, |\lambda^+|) - 2^{n+3-\rho_I} n)$ to $O$
          {$\Rightarrow \tilde{I}$ contains a root $\xi$ of $f$ and the $\frac{w(I)}{n}$-neighborhood of $I$ is isolating for $\xi$}
        else
          do nothing {$\tilde{J}$ is already isolating for $\xi$}
        end if
      else
        do nothing
      end if
    else
      if $\rho_I < 0$ then
        return "insufficient precision"
      else
        if $\rho_I < 2$ then
          return "insufficient precision"
        else
          Subdivide $I$ into $I_\ell := (a, m_I)$ and $I_r := (m_I, b)$
          $\tilde{f}_{I_\ell}$ a $\rho_I$-binary approximation of $\tilde{f}_I\left(\frac{x}{2}\right)$ {$\Rightarrow \tilde{f}_{I_\ell} \in [f_{I_\ell}]_{2^{-(\rho_I - 1)}}$}
          $\tilde{f}_{I_r}$ a $(\rho_I - 1)$-binary approximation of $\tilde{f}_I\left(\frac{1+x}{2}\right)$ {$\Rightarrow \tilde{f}_{I_r} \in [f_{I_r}]_{2^{-(\rho_I - 2)}}$}
          Add $(I_\ell, \tilde{f}_{I_\ell}, \rho_I - 1)$ and $(I_r, \tilde{f}_{I_r}, \rho_I - 2)$ to $A$
        end if
      end if
    end if
  until $A$ is empty
  return $O$

Algorithm 3 CERTIFY$^\rho$
Require: polynomial $f = \sum_{0 \le i \le n} a_i x^i \in \mathbb{R}[x]$ as defined in (2.1), an integer $\rho \in \mathbb{N}$, and the list $O = \{(J_k, s_{k,\ell}, s_{k,r}, B_k)\}_{k=1,\ldots,s}$ returned by DCM$^\rho$.
Ensure: returns "insufficient precision" or the list $L = \{J_k\}_{k=1,\ldots,s}$ of isolating intervals with the guarantee that, for each real root of $f$, there exists a corresponding interval in $L$.
  $I_0 := (-\frac{1}{2}, \frac{1}{2})$
  $\tilde{f}$ a $(\rho + n + 1)$-binary approximation of $f$
  $\tilde{f}_{I_0}$ a $(\rho + 1)$-binary approximation of $\tilde{f}(-\frac{1}{2} + x)$ {$\Rightarrow \tilde{f}_{I_0} \in [f_{I_0}]_{2^{-\rho}}$}
  $A := \{(I_0, \tilde{f}_{I_0}, \rho)\}$ {list of active intervals}
  repeat
    $(I, \tilde{f}_I, \rho_I)$, where $I := (a, b)$, some element in $A$; delete $(I, \tilde{f}_I, \rho_I)$ from $A$.
    if $\bar{I} \cap R = \bigcup_{i=1}^{s} L_i = \emptyset$ then
      do nothing
    else
      if $t[\tilde{f}_I, 3/2](0, 1) > -n \cdot 2^{-\rho_I+2}$ then
        if $|\tilde{f}_I(0) + \mathrm{sign}(\tilde{f}_I(0)) \cdot 2^{-\rho_I+2} n| > n^2 \cdot 2^{-\rho_I+6}$ then
          do nothing {$I$ contains no root of $f$}
        else
          return "insufficient precision" {$\rho < \rho_f^{\max}$}
        end if
      else
        $\tilde{h}(x) := \sum_{i=0}^{n} \tilde{h}_i x^i = (1+x)^n \cdot (\tilde{f}_I)'\left(\frac{1}{1+x}\right)$
        if $\tilde{h}_i < n \cdot 2^{n-\rho_I}$ for all $i$ (or $\tilde{h}_i > -n \cdot 2^{n-\rho_I}$ for all $i$) then
          $g(x) := \tilde{f}_I(x) - n \cdot 2^{n-\rho_I}$ (or $g(x) := \tilde{f}_I(x) + n \cdot 2^{n-\rho_I}$, respectively);
          if for each $L_i = [q_\ell, q_r]$, $\min(|\lambda(q_\ell)|, |\lambda(q_r)|) > n \cdot 2^{n+3-\rho_I}$ and $\lambda(q_\ell) \cdot \lambda(q_r) < 0$ then
            do nothing {$I \cap R$ contains no root of $f$; $\lambda(q_\ell)$, $\lambda(q_r)$ defined as in (4.15)}
          else
            return "insufficient precision" {$\rho < \rho_f^{\max}$}
          end if
        else
          if $\rho_I < 2$ then
            return "insufficient precision"
          else
            Subdivide $I$ into $I_\ell := (a, m_I)$ and $I_r := (m_I, b)$
            $\tilde{f}_{I_\ell}$ a $\rho_I$-binary approximation of $\tilde{f}_I\left(\frac{x}{2}\right)$ {$\Rightarrow \tilde{f}_{I_\ell} \in [f_{I_\ell}]_{2^{-(\rho_I - 1)}}$}
            $\tilde{f}_{I_r}$ a $(\rho_I - 1)$-binary approximation of $\tilde{f}_I\left(\frac{1+x}{2}\right)$ {$\Rightarrow \tilde{f}_{I_r} \in [f_{I_r}]_{2^{-(\rho_I - 2)}}$}
            Add $(I_\ell, \tilde{f}_{I_\ell}, \rho_I - 1)$ and $(I_r, \tilde{f}_{I_r}, \rho_I - 2)$ to $A$
          end if
        end if
      end if
    end if
  until $A$ is empty
  return "certification successful" {The region of uncertainty $R$ contains no root of $f$}
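
The subdivision scheme shared by the algorithms above (pick an active interval, count sign variations of $(1+x)^n \cdot q(\frac{1}{1+x})$ for the local polynomial $q$, then discard, report, or bisect) can be illustrated with a minimal Python sketch. It is not DCM itself: it is the classical Descartes bisection with exact rational arithmetic, it tests $I$ directly instead of the extended interval $I^+$, and it replaces the $t$-test by the plain criterion "exactly one sign variation". The function names (`taylor_shift`, `sign_variations`, `descartes_count`, `isolate_real_roots`) and the convention of reporting a root that falls on a subdivision point as a degenerate interval are assumptions made for this illustration only.

```python
from fractions import Fraction


def taylor_shift(coeffs, s):
    """Return the ascending coefficients of p(x + s)."""
    out = list(coeffs)
    n = len(out)
    for i in range(n - 1):
        for j in range(n - 2, i - 1, -1):
            out[j] += s * out[j + 1]
    return out


def sign_variations(coeffs):
    """Count sign changes in a coefficient sequence, ignoring zeros."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(a != b for a, b in zip(signs, signs[1:]))


def descartes_count(local):
    """var((1 + x)^n * q(1 / (1 + x))) for a local polynomial q on (0, 1).

    Reversing the coefficients gives x^n * q(1/x); shifting by 1 then
    gives (1 + x)^n * q(1 / (1 + x)).
    """
    return sign_variations(taylor_shift(local[::-1], 1))


def isolate_real_roots(f, a=Fraction(-1, 2), b=Fraction(1, 2)):
    """Isolate the real roots of a square-free f in the open interval (a, b).

    f holds ascending integer or rational coefficients. Returns a sorted
    list of pairs (lo, hi); a pair with lo == hi marks an exact root that
    was hit by a subdivision point (a convention of this sketch).
    """
    a, b = Fraction(a), Fraction(b)
    # Local polynomial f_I(x) = f(a + (b - a) * x), whose roots in (0, 1)
    # correspond to the roots of f in (a, b).
    local = taylor_shift([Fraction(c) for c in f], a)
    local = [c * (b - a) ** i for i, c in enumerate(local)]
    active = [(a, b, local)]
    isolating = []
    while active:
        lo, hi, q = active.pop()
        v = descartes_count(q)
        if v == 0:
            continue  # no root in (lo, hi)
        if v == 1:
            isolating.append((lo, hi))  # exactly one root in (lo, hi)
            continue
        mid = (lo + hi) / 2
        left = [c / 2 ** i for i, c in enumerate(q)]  # q(x / 2)
        right = taylor_shift(left, 1)                 # q((x + 1) / 2)
        if right[0] == 0:
            isolating.append((mid, mid))  # the midpoint itself is a root
        active.append((lo, mid, left))
        active.append((mid, hi, right))
    return sorted(isolating)
```

For instance, `isolate_real_roots([-1, 1, 11, 1, 12])` treats $12x^4 + x^3 + 11x^2 + x - 1 = (4x - 1)(3x + 1)(x^2 + 1)$ and separates the two real roots $-1/3$ and $1/4$ after a single bisection of $(-\frac{1}{2}, \frac{1}{2})$. Exact rationals keep the sketch short and avoid any precision management, which is precisely the part that DCM$^\rho$ and CERTIFY$^\rho$ handle through the thresholds in $\rho_I$.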
