Numerical Error AE = error = xt - xa,

advertisement
Numerical Error: Atkinson defines the absolute error to be
AE = error = xt - xa,
where xa approximates some exact value xt, and the relative error as
RE = relative error = (xt - xa)/xt.
We will also have occasion to use these logarithmic measures of error:
d-acc = “decimal places of accuracy” = - log10|error|
-acc = “digits of accuracy” = - log10|RE|
Let’s try and justify the d-acc & -acc labeling...
Suppose xa is accurate to at least the dth decimal place (i.e. |AE|  10-d),
where d is an integer. Then d-acc  d, and conversely. In particular, since
d-acc  [d-acc], with [ ] denoting the ‘greatest integer function’, it holds
that xa is accurate to at least [d-acc] decimal places.
If |AE|  10-d, then since |xt|  10e, |RE|  10-(e+d)  -acc  e+d.
Here, we define xa to have at least k digits of accuracy* with respect to xt if
the magnitude of the error, |AE|, is less than or equal to 1 unit in the (k+1)st
digit counting rightward from xt’s first nonzero digit. More precisely,
|AE|  10e-k, for e satisfying |xt| = 10e (d1.d2d3...dk...) with d1  1.
Note: This also implies |RE|  10-k so that -acc  k.
*This is a bit more stringent than Atkinson’s definition (1 unit vs. 5 units).
Claim: If -acc   (a positive integer), then xa is accurate wrt xt to at least
the (- 1)st place counting rightward from xt’s first nonzero digit.
Proof: Suppose that -acc  . Then -log10|RE|    |RE|  10- 
|AE|  |xt| 10- < 10e-(-1) and xa has the claimed accuracy.
Note: We can’t do better than the (- 1)st place. Try xt = 5.000 & xa = 4.995.
Then |AE| = 5 10-3 & -acc = 3, but |AE| isn’t less than or equal to10-3.
Claim: Suppose xa is accurate wrt xt to at least the th place counting
rightward from xt’s first nonzero digit, i.e. assume |AE|  10e-,
where |xt| = (d1.d2d3...) 10e with d1  1. Then -acc  .
Proof: Since |RE| = |AE|/|xt|  10-, this follows immediately.
Atkinson’s #7 on pg. 43, but using the round-to-even rule & an arbitrary :
Claim: For x in the range of the arithmetic, that is if ßL  |x|  ßU( – ß1-n),
|(x - fl(x))/x|  .5ß1-n/(1 + .5ß1-n) < .5ß1-n,
where “fl” is the round-to-even floating point operation.
Proof: Let x = sgn(x) ße (d1.d2d3...dndn+1...)ß, z = (d1.d2d3...dn)ß and
y = (.dn+1dn+2dn+3…)ß.
Then we know that fl(x) (and note that fl(-x) = -fl(x)) is either...
(1) sgn(x) ße z + sgn(x) ße ß1-n if y > .5 or if y = .5 & dn is odd,
or
(2) sgn(x) ße z if y < .5 or if y = .5 and dn is even.
Therefore,
(1)  |x - fl(x)| = ße ß1-n (1 - y) and (2) |x - fl(x)| = ße ß1-n y.
Accordingly, in case (1),
|(x - fl(x))/x|  ße ß1-n (1 - y)/(ße(1 + ß1-n y))
 .5 ß1-n /(1 + .5 ß1-n)
< .5 ß1-n,
while in case (2),
|(x - fl(x))/x|  ß1-n y/(1 + ß1-n y)  .5ß1-n/(1 + .5ß1-n) < .5 ß1-n.
*
Since g(y) = ß1-n y/(1 + ß1-n y) is strictly increasing for y > 0.
Rounding-up-fives (aka simple rounding) yields the same inequality while
chopping yields |x - fl(x)| = ße ß1-n y < ße+1-n & therefore |(x - fl(x))/x| < ß1-n.
Regarding Addition: Suppose we want to compute S = 1in ai, where we
can ignore the input errors occurring when the ai's are floated, but not the
errors associated with the intermediate sums, i.e. the values
a1 + a2, a1 + a2 + a3, a1 + a2 + a3 + ... + an.
Set S2 = fl(a1 + a2) = (a1 + a2)(1 + 2), S3 = fl(S2 + a3) = (S2 + a3)(1 + 3), ...,
Sn = fl(Sn-1 + an) = (Sn-1 + an)(1 + n). Then since
Sn = (a1 + a2) 2in (1 + i) + a3 3in (1 + i) + a4 4in (1 + i) +
… + an-1 (1 + n-1) (1 + n) + an (1 + n)
= (a1 + a2) (1 + 2in i + r2) + a3 (1 + 3in i + r3) +
… + an-1 (1 + n-1+ n + rn) + an (1 + n),
where r2, r3, r4,…, are higher power expressions in 2, 3, …, it follows
that if these expressions are ignored, Sn – S is approximately equal to
(a1 + a2) 2in i + a3 3in i + … + an-1(n-1+ n) + an n.
This suggests that we might minimize the error if we first sort the ai's from
least to greatest in magnitude, i.e. |a1|  |a2|  |a3|  . . .  |an|, before adding.
Example: Add 1.21, 10.4, 30.4, 50.6, & 70.1. The exact value is 162.71,
but using 3-digit arithmetic with a round-to-even, yields the following least to greatest: 163.
rel. error = .00178...
vs.
greatest to least: 162.
rel. error = .00436...
Download