Example of Reducing Cancellation Errors: Quadratic Formula

The quadratic equation
\[ ax^2 + bx + c = 0, \qquad a, b, c \in \mathbb{R}, \quad a \neq 0, \]
has two roots
\[ x_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a}, \qquad x_2 = \frac{-b - \sqrt{b^2 - 4ac}}{2a}. \]

Example: Consider the quadratic equation
\[ x^2 + 76.3x - 1.71 = 0, \]
whose roots are approximately (to 10 digits)
\[ x_1 = 0.02240495436, \qquad x_2 = -76.32240495. \]

Using 4-digit rounding arithmetic:
\[ fl\big(\sqrt{b^2 - 4ac}\big) = fl\Big( \big[ fl(76.3^2) \oplus fl(4 \otimes 1.71) \big]^{1/2} \Big) = fl\big( [5822 \oplus 6.840]^{1/2} \big) = fl\big(5829^{1/2}\big) = 76.35 \]
\[ \Longrightarrow \quad fl(x_1) = fl\left( \frac{-76.30 \oplus 76.35}{2 \cdot 1} \right) = fl\left( \frac{0.05000}{2.000} \right) = 0.02500. \]
Relative error in $x_1$:
\[ \frac{|x_1 - fl(x_1)|}{|x_1|} = \frac{0.002595\cdots}{0.02240\cdots} \approx 0.12 \]
(only one significant digit of accuracy).

Similarly:
\[ fl(x_2) = fl\left( \frac{-76.30 \ominus 76.35}{2.000} \right) = fl\left( \frac{-152.7}{2.000} \right) = -76.35 \]
with relative error
\[ \frac{|-76.322\cdots + 76.35|}{|-76.322\cdots|} \approx 3.6 \times 10^{-4}. \]

What happened to cause the large relative error in $x_1$?

We have $b > 0$ and $b^2 \gg |4ac|$, so $(b^2 - 4ac)^{1/2} \approx b$. The calculation of
\[ x_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \]
therefore subtracts nearly equal numbers $\Longrightarrow$ loss of significance.

However, computing $x_2$ (when $b > 0$) involves addition of the nearly equal numbers $-b$ and $-(b^2 - 4ac)^{1/2}$: no problem.

We can reduce the round-off error in $x_1$ by rationalizing the numerator in the quadratic formula:
\[ x_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \cdot \frac{-b - \sqrt{b^2 - 4ac}}{-b - \sqrt{b^2 - 4ac}} = \frac{b^2 - (b^2 - 4ac)}{2a\big(-b - \sqrt{b^2 - 4ac}\big)} \]
\[ \Longrightarrow \quad x_1 = \frac{-2c}{b + \sqrt{b^2 - 4ac}} \]
This is the alternative quadratic formula (use it if $b > 0$ to avoid subtractive cancellation).

Example (continued): Applying this formula to our previous example:
\[ fl(x_1) = fl\left( \frac{-2(-1.710)}{76.30 + 76.35} \right) = fl\left( \frac{3.420}{152.7} \right) = 0.02240 \]
(correct to 4 digits).

Similarly, in the case $b < 0$, the other root $x_2$ is susceptible to subtractive cancellation, which can be mitigated by rationalizing the numerator (as above) to give the alternative form
\[ x_2 = \frac{-2c}{b - \sqrt{b^2 - 4ac}}. \]

SFU MACM 316, Fall 2021. 1: Preliminaries & Computer Arithmetic

(b) Amplification of Round-off Error

The floating-point representation of numbers introduces relative errors $\le u$ (e.g. for double precision computer arithmetic with rounding, we have seen (p.19)
\[ \frac{|x - fl(x)|}{|x|} \le 2^{-53} \approx 1.1 \times 10^{-16} \text{)} \]
$\Longrightarrow$ the absolute error in representing $x$ is $\le u|x|$.
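The representation bound just stated can be checked directly. The following is a minimal sketch, assuming Python floats are IEEE double precision with round-to-nearest; the helper name `representation_errors` and the sample values are illustrative choices, not part of the course material. Exact rational arithmetic is used so that the measured errors are not themselves contaminated by rounding.

```python
from fractions import Fraction

# Unit roundoff for IEEE double precision with rounding (assumed): u = 2^-53
U = Fraction(1, 2**53)

def representation_errors(decimal_text):
    """Return (absolute, relative) error of storing decimal_text as a float.

    Hypothetical helper for illustration only.
    """
    exact = Fraction(decimal_text)            # exact rational value of the string
    stored = Fraction(float(decimal_text))    # exact value of the nearest double
    abs_err = abs(exact - stored)
    return abs_err, abs_err / abs(exact)

for text in ["0.1", "76.3", "1.71", "1e-5"]:
    abs_err, rel_err = representation_errors(text)
    print(f"{text:>5}: abs = {float(abs_err):.3e}, "
          f"rel = {float(rel_err):.3e}, rel <= u: {rel_err <= U}")
```

The relative error stays below $u$ for every sample, while the absolute error scales with $|x|$, which is why it is the absolute error that later operations can magnify.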
Subsequent division by a small number (or equivalently, multiplication by a large number) can magnify the absolute error. In general, if $fl(x) = x + \delta$ for some $\delta$, and $z = 10^{-n}$, then
\[ \frac{fl(x)}{z} = (x + \delta) \times 10^{n} = \frac{x}{z} + \delta \times 10^{n}. \]
In this case, the absolute error in floating-point division is $|\delta| \times 10^{n}$, which (for large $n$) can be much larger than $|\delta|$.

Introductory Example (revisited): (pp.7–9 of Introduction)

Recall the numerical derivative calculation of $f(x) = \sin x$ at $x = 1$:
• $f'(1) \approx D_+ f(1) = \dfrac{f(1 + h) - f(1)}{h}$
• the computer returns $fl\left( \dfrac{fl\big( fl(\sin(1 + h)) - fl(\sin(1)) \big)}{fl(h)} \right)$
• the error in $fl(\sin(1 + h))$ and $fl(\sin(1))$ is roughly $u = \tfrac{1}{2}\varepsilon_{\text{mach}} \approx 10^{-16}$
• $\Longrightarrow$ for $h = 10^{-p}$, the error in the final answer is approximately $\varepsilon_{\text{mach}}/h = 10^{p}\,\varepsilon_{\text{mach}} \approx 10^{p-16}$: amplification of round-off error for small $h$ (large $p$)!

We say that the differentiation formula $D_+ f$ is not numerically stable: it is substantially affected by round-off error, so we cannot get an arbitrarily accurate result merely by decreasing $h$.

In fact, one can show for this problem (HW 2):
• use that the truncation error is $O(h)$ and the round-off error is $O(\varepsilon_{\text{mach}}/h)$;
• the optimal step size (minimizing the total error) is $h = O(\varepsilon_{\text{mach}}^{1/2})$;
• hence the minimum expected total error is $O(\varepsilon_{\text{mach}}^{1/2})$.

E.g. for double precision ($\tfrac{1}{2}\varepsilon_{\text{mach}} \approx 10^{-16}$), we can expect at best only about 8 digits of accuracy in a numerical derivative calculated in this way. To improve the attainable accuracy, we need to use a better algorithm (one with smaller truncation error) or higher precision.
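The trade-off between truncation error and amplified round-off error can be observed by sweeping the step size. The sketch below assumes IEEE double precision and uses $\cos(1)$ from the math library as the reference derivative; the names `forward_difference` and `errors` are illustrative, not taken from the course code.

```python
import math

def forward_difference(f, x, h):
    """Forward-difference approximation D_+ f(x) = (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h

exact = math.cos(1.0)  # reference value of d/dx sin(x) at x = 1

# errors[p] is the absolute error of D_+ f(1) with step size h = 10^-p
errors = {}
for p in range(1, 16):
    h = 10.0 ** (-p)
    errors[p] = abs(forward_difference(math.sin, 1.0, h) - exact)
    print(f"h = 1e-{p:02d}   error = {errors[p]:.3e}")

best_p = min(errors, key=errors.get)
print(f"smallest error at h = 1e-{best_p:02d}")
```

The error first shrinks roughly in proportion to $h$ (truncation dominates), then grows again as $h$ decreases further (round-off dominates), with the minimum near $h \approx 10^{-8}$, consistent with $h = O(\varepsilon_{\text{mach}}^{1/2})$.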