Towards a Modern Floating-Point Environment

advertisement
Towards a Modern Floating-Point Environment
PhD thesis defense
Olga Kupriianova
PEQUAN team, CalSci Dept, LIP6, UPMC
advisors: Jean-Claude Bajard, Christoph Lauter
i
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
1/ 44
What is Floating-Point Arithmetic
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
2/ 44
What is Floating-Point Arithmetic
Continuous
R numbers
3/2 = 1.5
π
=
3.141592653589
...
√
2 = 1.414213562373 . . .
Olga Kupriianova
Discrete
Floating point (FP) numbers
3/2 = 1.5e0
π
√ = 3.14159e0
2 = 1.41421e0
Towards a Modern Floating-Point Environment
Dec 11, 2015
2/ 44
What is Floating-Point Arithmetic
Continuous
R numbers
3/2 = 1.5
π
=
3.141592653589
...
√
2 = 1.414213562373 . . .
Discrete
Floating point (FP) numbers
3/2 = 1.5e0
π
√ = 3.14159e0
2 = 1.41421e0
± m0 .m1 m2 . . . mk−1 ·β E
|
{z
}
significand
E is exponent, k precision, β radix
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
2/ 44
Rounding
How to represent a real number in machine?
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
3/ 44
Rounding
How to represent a real number in machine?
π ≈ 3.14
e ≈ 2.71
π ≈ 3.141
e ≈ 2.718
π ≈ 3.1415
e ≈ 2.7182
π ≈ 3.14159
e ≈ 2.71828
π ≈ 3.141592
e ≈ 2.718281
π ≈ 3.1415926
e ≈ 2.7182818
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
3/ 44
Rounding
How to represent a real number in machine?
π ≈ 3.14
e ≈ 2.71
π ≈ 3.141
e ≈ 2.718
π ≈ 3.1415
e ≈ 2.7182
π ≈ 3.14159
e ≈ 2.71828
π ≈ 3.141592
e ≈ 2.718281
π ≈ 3.1415926
e ≈ 2.7182818
± m0 .m1 m2 . . . mk−1 ·β E ,
|
{z
}
significand
E is exponent, k precision
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
3/ 44
How to Round
Rounding modes:
Rounding Down (RD)
Rounding Up (RU)
Rounding toward Zero (RZ)
Rounding to the Nearest : ties to even (RN), ties to away (RA)
3.14
Olga Kupriianova
3.15
Towards a Modern Floating-Point Environment
Dec 11, 2015
4/ 44
How to Round
Rounding modes:
Rounding Down (RD)
Rounding Up (RU)
Rounding toward Zero (RZ)
Rounding to the Nearest : ties to even (RN), ties to away (RA)
x = 3.14159
3.14
Olga Kupriianova
3.15
Towards a Modern Floating-Point Environment
Dec 11, 2015
4/ 44
How to Round
Rounding modes:
Rounding Down (RD)
Rounding Up (RU)
Rounding toward Zero (RZ)
Rounding to the Nearest : ties to even (RN), ties to away (RA)
x = 3.14159
3.14
RD(x)
RZ(x)
RN (x), RA(x)
Olga Kupriianova
3.15
RU (x)
Towards a Modern Floating-Point Environment
Dec 11, 2015
4/ 44
How to Round
Rounding modes:
Rounding Down (RD)
Rounding Up (RU)
Rounding toward Zero (RZ)
Rounding to the Nearest : ties to even (RN), ties to away (RA)
x = 3.145
3.14
Olga Kupriianova
3.15
Towards a Modern Floating-Point Environment
Dec 11, 2015
4/ 44
How to Round
Rounding modes:
Rounding Down (RD)
Rounding Up (RU)
Rounding toward Zero (RZ)
Rounding to the Nearest : ties to even (RN), ties to away (RA)
x = 3.145
3.14
RD(x)
RZ(x)
RN (x)
Olga Kupriianova
3.15
RU (x)
RA(x)
Towards a Modern Floating-Point Environment
Dec 11, 2015
4/ 44
Accuracy and Rounding
Accuracy
Mathematical x → x̂ in FP arithmetic
Due to roundings, |x − x̂| > 0
2E m
Olga Kupriianova
2E (m + 1)
Towards a Modern Floating-Point Environment
Dec 11, 2015
5/ 44
Accuracy and Rounding
Accuracy
Mathematical x → x̂ in FP arithmetic
Due to roundings, |x − x̂| > 0
x
2E m
Olga Kupriianova
2E (m + 1)
Towards a Modern Floating-Point Environment
Dec 11, 2015
5/ 44
Accuracy and Rounding
Accuracy
Mathematical x → x̂ in FP arithmetic
Due to roundings, |x − x̂| > 0
x̂
2E m
Olga Kupriianova
2E (m + 1)
Towards a Modern Floating-Point Environment
Dec 11, 2015
5/ 44
Accuracy and Rounding
Accuracy
Mathematical x → x̂ in FP arithmetic
Due to roundings, |x − x̂| > 0
RN
RN (x)
2E m
Olga Kupriianova
x̂
2E (m + 1)
Towards a Modern Floating-Point Environment
Dec 11, 2015
5/ 44
Accuracy and Rounding
Accuracy
Mathematical x → x̂ in FP arithmetic
Due to roundings, |x − x̂| > 0
x̂
2E m
Olga Kupriianova
2E (m + 1)
Towards a Modern Floating-Point Environment
Dec 11, 2015
5/ 44
Accuracy and Rounding
Accuracy
Mathematical x → x̂ in FP arithmetic
Due to roundings, |x − x̂| > 0
RN
x̂
2E m
Olga Kupriianova
2E (m + 1)
Towards a Modern Floating-Point Environment
Dec 11, 2015
5/ 44
Accuracy and Rounding
Accuracy
Mathematical x → x̂ in FP arithmetic
Due to roundings, |x − x̂| > 0
RN
x̂
2E m
Olga Kupriianova
2E (m + 1)
Towards a Modern Floating-Point Environment
Dec 11, 2015
5/ 44
Accuracy and Rounding
Accuracy
Mathematical x → x̂ in FP arithmetic
Due to roundings, |x − x̂| > 0
RN
x̂
2E m
2E (m + 1)
Table Maker’s Dilemma (TMD)
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
5/ 44
Accuracy and Rounding
Accuracy
Mathematical x → x̂ in FP arithmetic
Due to roundings, |x − x̂| > 0
RN
x̂
2E m
2E (m + 1)
Table Maker’s Dilemma (TMD)
How to determine the accuracy
of the approximation x̂
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
5/ 44
Floating-Point Environment
IEEE754-1985: binary FP numbers, rounding modes,
arithmetic operations
In total ∼ 70 operations
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
6/ 44
Floating-Point Environment
IEEE754-1985: binary FP numbers, rounding modes,
arithmetic operations
In total ∼ 70 operations
IEEE754-2008 : decimal FP numbers and operations,
recommendations for mathematical function implementations,
heterogeneous operations: mix precisions
In total ∼ 350 operations for binary arithmetic
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
6/ 44
Floating-Point Environment
IEEE754-1985: binary FP numbers, rounding modes,
arithmetic operations
In total ∼ 70 operations
IEEE754-2008 : decimal FP numbers and operations,
recommendations for mathematical function implementations,
heterogeneous operations: mix precisions
In total ∼ 350 operations for binary arithmetic
Current situation: decimal and binary worlds,
intersect only in conversion
...next revisions (2018?): even more operations?
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
6/ 44
Floating-Point Environment
IEEE754-1985: binary FP numbers, rounding modes,
arithmetic operations
In total ∼ 70 operations
IEEE754-2008 : decimal FP numbers and operations,
recommendations for mathematical function implementations,
heterogeneous operations: mix precisions
In total ∼ 350 operations for binary arithmetic
Current situation: decimal and binary worlds,
intersect only in conversion
...next revisions (2018?): even more operations?
mix binary and decimals in one operation
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
6/ 44
Floating-Point Environment
IEEE754-1985: binary FP numbers, rounding modes,
arithmetic operations
In total ∼ 70 operations
IEEE754-2008 : decimal FP numbers and operations,
recommendations for mathematical function implementations,
heterogeneous operations: mix precisions
In total ∼ 350 operations for binary arithmetic
Current situation: decimal and binary worlds,
intersect only in conversion
...next revisions (2018?): even more operations?
mix binary and decimals in one operation
provide more implementations of mathematical functions
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
6/ 44
Implementation of Mathematical Functions
Why do we need more implementations of mathematical functions?
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
7/ 44
Implementation of Mathematical Functions
Why do we need more implementations of mathematical functions?
Mathematics
Computer Science
exp(x) = ex = lim
n→∞
1+
logb (x) = y; by = x
Z x
2
erf(x) =
e−t dt
x n
n
Several implementations for each:
exp - correctly rounded
exp - faithfully rounded
exp - with accuracy ≤ 2−45
...
0
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
7/ 44
Implementation of Mathematical Functions
Why do we need more implementations of mathematical functions?
Mathematics
Computer Science
exp(x) = ex = lim
n→∞
1+
logb (x) = y; by = x
Z x
2
erf(x) =
e−t dt
x n
n
Several implementations for each:
exp - correctly rounded
exp - faithfully rounded
exp - with accuracy ≤ 2−45
...
0
The existing libraries give only few implementations per function
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
7/ 44
Contributions/Structure
1. Mixed-Radix Arithmetic and Arbitrary Precision Base Conversion
2. Automatic Generation of Mathematical Functions (Metalibm)
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
8/ 44
Mixed-Radix Arithmetic
Radix conversion algorithm for Mixed-Radix Arithmetic
Correctly-rounded conversion from decimal character sequence to a
binary FP number (scanf analogue)
Research for fused multiply-add operation xy ± z
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
9/ 44
Mixed-Radix Arithmetic
Radix Conversion
Two-steps algorithm:
1
2
Exponent computation is errorless
Compute mantissa with specified accuracy
Integer-based computations
Memory consumption is known in advance
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
9/ 44
Mixed-Radix Arithmetic
Radix conversion algorithm for Mixed-Radix Arithmetic
Correctly-rounded conversion from decimal character sequence to a
binary FP number (scanf analogue)
Research for fused multiply-add operation xy ± z
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
9/ 44
Mixed-Radix Arithmetic
Our scanf version
User input is arbitrarily long
No way to precompute the worst cases for the TMD
Algorithm with integer computations, no memory allocations
Two-ways scheme is used: fast easy rounding or slow hard
Two conversions: decimal-binary and binary-decimal
Gives correctly-rounded (CR) result for all the rounding modes
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
9/ 44
Mixed-Radix Arithmetic
Radix conversion algorithm for Mixed-Radix Arithmetic
Correctly-rounded conversion from decimal character sequence to a
binary FP number (scanf analogue)
Research for fused multiply-add operation (FMA) xy ± z
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
9/ 44
Mixed-Radix Arithmetic
Mixed-radix FMA
xy ± z
2−4 · 139 × 10−16 · 346 − 23 · 42 = 10E m
Addition and multiplication within only one rounding
Worst cases search: reduce the iterations number with the use of
continued fraction theory and variable relations
Iterations reduced at ∼ 99%
Still about ∼ 16 000 000 000 iterations
First results obtained in ∼ 10 weeks of computations
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
9/ 44
Code Generation for Mathematical Functions
1. Introduction & Motivation for Code Generation
2. Manual Function Implementation. Background
3. Metalibm: General Overview
4. Metalibm: Domain Splitting
5. Metalibm: Reconstruction for Vectorizable Code
6. Conclusion and Perspectives
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
10/ 44
Existing Mathematical Libraries (libm)
Existing implementations
Intel’s Math Kernel Libraries
glibc libm
AMD’s libm
CRLibm by ENS Lyon
ARM’s mathlib
newlib
libmcr by Sun
OpenLibm for Julia
...
Yeppp!
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
11/ 44
Existing Mathematical Libraries (libm)
Existing implementations
Intel’s Math Kernel Libraries
glibc libm
AMD’s libm
CRLibm by ENS Lyon
ARM’s mathlib
newlib
libmcr by Sun
OpenLibm for Julia
...
Yeppp!
All these libms are static
Only few implementations per function
Limited dictionary of functions
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
11/ 44
Existing Mathematical Libraries (libm)
Existing implementations
Intel’s Math Kernel Libraries
glibc libm
AMD’s libm
CRLibm by ENS Lyon
ARM’s mathlib
newlib
libmcr by Sun
OpenLibm for Julia
...
Yeppp!
All these libms are static
Only few implementations per function
Limited dictionary of functions
Olga Kupriianova
How to get
exp(x) with 40 accurate bits
Towards a Modern Floating-Point Environment
Dec 11, 2015
11/ 44
Existing Mathematical Libraries (libm)
Existing implementations
Intel’s Math Kernel Libraries
glibc libm
AMD’s libm
CRLibm by ENS Lyon
ARM’s mathlib
newlib
libmcr by Sun
OpenLibm for Julia
...
Yeppp!
All these libms are static
Only few implementations per function
Limited dictionary of functions
Olga Kupriianova
How to get
exp(x)
R x sin t with 40 accurate bits
t dt with 25 bits
0
Towards a Modern Floating-Point Environment
Dec 11, 2015
11/ 44
One Size Does not Fit All
Current request
Several performance/accuracy options
(“quick and dirty”, faithful, correctly-rounded)
Several portability options (SIMD instructions)
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
12/ 44
One Size Does not Fit All
Current request
Several performance/accuracy options
(“quick and dirty”, faithful, correctly-rounded)
Several portability options (SIMD instructions)
Some people are still not happy
More performance, less compliance
degraded accuracy
reduced domain
Functions not from standard libm (e.g.
Olga Kupriianova
Rx
0
sin t
t dt,
Towards a Modern Floating-Point Environment
dilogarithm)
Dec 11, 2015
12/ 44
One Size Does not Fit All
Current request
Several performance/accuracy options
(“quick and dirty”, faithful, correctly-rounded)
Several portability options (SIMD instructions)
Some people are still not happy
More performance, less compliance
degraded accuracy
reduced domain
Functions not from standard libm (e.g.
Rx
0
sin t
t dt,
dilogarithm)
Who is going to write all these variants?
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
12/ 44
Task
Rough Computations
3 precisions from IEEE754
∼ 50 functions in libm
∼ 5 accuracies for each precision
∼ 5 various approximations
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
13/ 44
Task
Rough Computations
3 precisions from IEEE754
∼ 50 functions in libm
∼ 5 accuracies for each precision
∼ 5 various approximations
I have to implement 1300 functions...
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
13/ 44
Task
Rough Computations
3 precisions from IEEE754
∼ 50 functions in libm
∼ 5 accuracies for each precision
∼ 5 various approximations
I have to implement 1300 functions...
Each function implementation takes ∼ 1 man-month
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
13/ 44
Task
Rough Computations
3 precisions from IEEE754
∼ 50 functions in libm
∼ 5 accuracies for each precision
∼ 5 various approximations
I have to implement 1300 functions...
Each function implementation takes ∼ 1 man-month
It will take me more then 100 years to rewrite the libm
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
13/ 44
Task
Rough Computations
3 precisions from IEEE754
∼ 50 functions in libm
∼ 5 accuracies for each precision
∼ 5 various approximations
I have to implement 1300 functions...
Each function implementation takes ∼ 1 man-month
It will take me more then 100 years to rewrite the libm
Generate parametrized implementations of mathematical functions
Metalibm
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
13/ 44
Code Generation for Mathematical Functions
1. Introduction & Motivation for Code Generation
2. Manual Function Implementation. Background
3. Metalibm: General Overview
4. Metalibm: Domain Splitting
5. Metalibm: Reconstruction for Vectorizable Code
6. Conclusion and Perspectives
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
14/ 44
How to Implement a Mathematical Function
Elementary functions have transcendental values
exp(1) = 2.71828182845904523536028747135266 . . .
log2 (10) = 3.32192809488736234787031942948939 . . .
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
15/ 44
How to Implement a Mathematical Function
Elementary functions have transcendental values
exp(1) = 2.71828182845904523536028747135266 . . .
log2 (10) = 3.32192809488736234787031942948939 . . .
Compute polynomial approximations
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
15/ 44
How to Implement a Mathematical Function
Elementary functions have transcendental values
exp(1) = 2.71828182845904523536028747135266 . . .
log2 (10) = 3.32192809488736234787031942948939 . . .
Compute polynomial approximations
Lagrange, Newton, Chebyshev
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
15/ 44
How to Implement a Mathematical Function
Elementary functions have transcendental values
exp(1) = 2.71828182845904523536028747135266 . . .
log2 (10) = 3.32192809488736234787031942948939 . . .
Compute polynomial approximations
Lagrange, Newton, Chebyshev
error
bounds
Remez algorithm
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
15/ 44
Approximation Polynomials
f = exp
Compute different approximations on [0, 5]
deg(p1 ) = 7, deg(p2 ) = 8, deg(p3 ) = 9
p1
p2
p3
0.005
0.004
0.003
0.002
0.001
0
-0.001
-0.002
-0.003
-0.004
-0.005
Olga Kupriianova
0
1
2
3
Towards a Modern Floating-Point Environment
4
5
Dec 11, 2015
16/ 44
Approximation Polynomials
f = exp
Compute different approximations on [0, 5]
deg(p1 ) = 7, deg(p2 ) = 8, deg(p3 ) = 9
Larger domain, larger degree
Conclusion: approximate only on small domains
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
16/ 44
Argument Reduction. Example
Example
Implement f (x) = ex
Exploit the algebraic property ea+b = ea · eb
x
j
ex = 2 log 2 = 2
Olga Kupriianova
x
log 2
m
x
j
−
x
m
· 2 log 2 log 2 = 2E · ex−E log 2 = 2E · er
log(2) log(2)
r∈ −
,
2
2
Towards a Modern Floating-Point Environment
Dec 11, 2015
17/ 44
Argument Reduction. Example
Example
Implement f (x) = ex
Exploit the algebraic property ea+b = ea · eb
P. Tang’s table-based method
log(2) log(2)
x
m
i/2t
r∗
∗
,
e =2 ·2
·e , r ∈ −
2 · 2t 2 · 2t
m ∈ Z,
Olga Kupriianova
t ∈ Z,
0 ≤ i ≤ 2t − 1,
t
2i/2 in table
Towards a Modern Floating-Point Environment
Dec 11, 2015
17/ 44
Argument Reduction. Does It Always Work?
Useful algebraic properties:
exp(a + b) = exp(a) · exp(b)
log(ab) = log(a) + log(b)
sin(a + 2πk) = sin(a)
...
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
18/ 44
Argument Reduction. Does It Always Work?
Useful algebraic properties:
exp(a + b) = exp(a) · exp(b)
log(ab) = log(a) + log(b)
sin(a + 2πk) = sin(a)
...
How to implement Dickman’s function?
uρ0 (u) + ρ(u − 1) = 0,
Olga Kupriianova
ρ(u) = 1 for 0 ≤ u ≤ 1
Towards a Modern Floating-Point Environment
Dec 11, 2015
18/ 44
Argument Reduction. Does It Always Work?
Useful algebraic properties:
exp(a + b) = exp(a) · exp(b)
log(ab) = log(a) + log(b)
sin(a + 2πk) = sin(a)
...
How to implement Dickman’s function?
uρ0 (u) + ρ(u − 1) = 0,
ρ(u) = 1 for 0 ≤ u ≤ 1
Piecewise-polynomial approximation
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
18/ 44
General Scheme
x
x
Argument Reduction
Reduced argument r
Tabulation index i
Polynomial
Table
Reconstruction
f (x)
Olga Kupriianova
Domain Splitting
x
Subdomain index i
Polynomial
Table
Reconstruction
f (x)
Towards a Modern Floating-Point Environment
Dec 11, 2015
19/ 44
General Scheme
x
x
Argument Reduction
Reduced argument r
Tabulation index i
Polynomial
Table
Reconstruction
f (x)
Domain Splitting
x
Subdomain index i
Polynomial
Table
Reconstruction
f (x)
And filtering of special cases
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
19/ 44
Code Generation for Mathematical Functions
1. Introduction & Motivation for Code Generation
2. Manual Function Implementation. Background
3. Metalibm: General Overview
4. Metalibm: Domain Splitting
5. Metalibm: Reconstruction for Vectorizable Code
6. Conclusion and Perspectives
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
20/ 44
Structure
parameters
Metalibm
implementation
function
domain
final accuracy
max poly degree
table size
...
Olga Kupriianova
Towards a Modern Floating-Point Environment
C code for function
Dec 11, 2015
21/ 44
What is Metalibm
Academic Prototype
Open-Source code generator for parametrized libm functions
Part of ANR Metalibm Project
Available at http://metalibm.org/
Objectives
Push-button tool to implement functions f : R → R
automatic argument reduction
automatic polynomial approximation
automatic domain splitting
Support black-box functions
no function dictionary
specify function by an expression or external code
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
22/ 44
What is Metalibm
Academic Prototype
Open-Source code generator for parametrized libm functions
Part of ANR Metalibm Project
Available at http://metalibm.org/
Objectives
Push-button tool to implement functions f : R → R
automatic argument reduction
automatic polynomial approximation
automatic domain splitting
Support black-box functions
no function dictionary
specify function by an expression or external code
⇒ use only C 3 functions that can be evaluated over intervals
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
22/ 44
Generation Levels
parameters
C 3 function
domain
final accuracy
max poly degree
table size
Metalibm
1
implementation
2
C code for function
3
1 - Properties detection
2 - Domain splitting
3 - Approximation
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
23/ 44
Property Detection: Example on Exponential
Task
Generate f (x) on [a, b] with accuracy ε̄
Generation hypothesis
f (x) is of type β x , unknown β
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
24/ 44
Property Detection: Example on Exponential
Task
Generate f (x) on [a, b] with accuracy ε̄
Generation hypothesis
f (x) is of type β x , unknown β
Finding the base
β = exp
log(f (ξ))
ξ
, for some ξ ∈ [a, b]
Decision of acceptance
x
[a,b]
ε = fβ(x) − 1
∞
Olga Kupriianova
|ε| < |ε̄|
Towards a Modern Floating-Point Environment
Dec 11, 2015
24/ 44
Property Detection: Example on Exponential
error
2*eps
eps
-eps
-2eps
1.5e-14
1e-14
5e-15
0
-5e-15
-1e-14
-1.5e-14
-100
Olga Kupriianova
-50
0
50
Towards a Modern Floating-Point Environment
100
Dec 11, 2015
25/ 44
Property Detection: Example on Exponential
error
2*eps
eps
-eps
-2eps
1.5e-14
1e-14
5e-15
0
-5e-15
-1e-14
-1.5e-14
-100
Olga Kupriianova
-50
0
50
Towards a Modern Floating-Point Environment
100
Dec 11, 2015
25/ 44
Property Detection: Example on Exponential
error
2*eps
eps
-eps
-2eps
1.5e-14
1e-14
5e-15
0
-5e-15
-1e-14
-1.5e-14
-100
Olga Kupriianova
-50
0
50
Towards a Modern Floating-Point Environment
100
Dec 11, 2015
25/ 44
Property Detection
Currently detectable properties
Exponential f (a + b) = f (a)f (b)
Logarithmic f (ab) = f (a) + f (b)
Periodic f (x + C) = f (x)
Symmetric f (x) = f (−x) and f (x) = −f (x)
Sinh-like family f (x) = β x − β −x
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
26/ 44
When property detection does not work
Argument reduction
Step is based on mathematical properties:
na+b = na · nb , sin(x + 2π) = sin(x), log(a · b) = log(a) + log(b), . . .
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
27/ 44
When property detection does not work
Argument reduction
Step is based on mathematical properties:
na+b = na · nb , sin(x + 2π) = sin(x), log(a · b) = log(a) + log(b), . . .
What properties do we know for erf(x), or a black-box function?
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
27/ 44
When property detection does not work
Argument reduction
Step is based on mathematical properties:
na+b = na · nb , sin(x + 2π) = sin(x), log(a · b) = log(a) + log(b), . . .
What properties do we know for erf(x), or a black-box function?
When argument reduction does not work
Piecewise-polynomial approximation
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
27/ 44
When property detection does not work
Argument reduction
Step is based on mathematical properties:
na+b = na · nb , sin(x + 2π) = sin(x), log(a · b) = log(a) + log(b), . . .
What properties do we know for erf(x), or a black-box function?
When argument reduction does not work
Piecewise-polynomial approximation
Algorithm to split the domain
Reconstruction becomes an execution of if-statements
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
27/ 44
Code Generation for Mathematical Functions
1. Introduction & Motivation for Code Generation
2. Manual Function Implementation. Background
3. Metalibm: General Overview
4. Metalibm: Domain Splitting
5. Metalibm: Reconstruction for Vectorizable Code
6. Conclusion and Perspectives
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
28/ 44
Problem Statement
Task:
Split the domain [a, b] into I0 , I1 , . . . , IN
Approximation degrees on Ik : dk ≤ dmax
Olga Kupriianova
Towards a Modern Floating-Point Environment
function
initial domain
accuracy
degree bound
Dec 11, 2015
f
[a, b]
ε̄
dmax
29/ 44
Problem Statement
Task:
Split the domain [a, b] into I0 , I1 , . . . , IN
Approximation degrees on Ik : dk ≤ dmax
function
initial domain
accuracy
degree bound
f
[a, b]
ε̄
dmax
Naive Solution:
Split domain into N equal parts, N is large.
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
29/ 44
Problem Statement
f = asin, [a, b] = [0, 0.85], dmax ≤ 8, ε̄ = 2−52
1.2
splitting
asin(x)
9
degrees
8
1
7
0.8
6
5
0.6
4
0.4
3
2
0.2
1
0
0
45 intervals
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
29/ 44
Problem Statement
f = asin, [a, b] = [0, 0.85], dmax ≤ 8, ε̄ = 2−52
1.2
splitting
asin(x)
9
degrees
8
1
7
0.8
6
5
0.6
4
0.4
3
2
0.2
1
0
0
45 intervals
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
29/ 44
Problem Statement
The quantity of intervals N → min
Classic problem:
having f , [a, b] and d
find a polynomial p and accuracy ε
We need:
having f , [a, b], bounds ε̄ and dmax
find I and polynomial p
Solution:
compute Remez polynomials and check the constraints?
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
29/ 44
Problem Statement
Remez
Test
OK
Olga Kupriianova
Bisect
Towards a Modern Floating-Point Environment
Dec 11, 2015
29/ 44
Domain Splitting
Theorem of de la Vallée-Poussin:
a continuous function f on [a, b],
its approximating polynomial p
find the bounds for optimal error
error
bounds
6e-05
0
-6e-05
Olga Kupriianova
0
0.2
0.4
0.6
0.8
Towards a Modern Floating-Point Environment
1
Dec 11, 2015
30/ 44
Domain Splitting
Interpolation on Chebyshev nodes
Test
OK
Remez
OK
Olga Kupriianova
Bisect
Bisect
Towards a Modern Floating-Point Environment
Dec 11, 2015
30/ 44
Domain Splitting
error
bounds
target acc
6e-05
0
-6e-05
0
0.2
0.4
0.6
0.8
1
Need a split
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
30/ 44
Domain Splitting
error
bounds
target acc
6e-05
0
-6e-05
0
0.2
0.4
0.6
0.8
1
No split
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
30/ 44
Domain Splitting
error
bounds
target acc
6e-05
0
-6e-05
0
0.2
0.4
0.6
0.8
1
Need Remez
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
30/ 44
Domain Splitting. Our method
Bisection
f = asin, [a, b] = [0, 0.85], dmax ≤ 8, ε̄ = 2−52
2
9
splitting
asin(x)
degrees
8
7
1.5
6
5
1
4
3
0.5
2
1
5
0.8
5
0.8
0.7
5
0.7
0.6
5
0.6
0.5
5
0.5
0.4
5
0.4
0.3
5
0.3
0.2
5
0.2
0.1
5
0.0
0
0.1
0
0
24 intervals
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
31/ 44
Domain Splitting. Our method
Our optimized method
f = asin, [a, b] = [0, 0.85], dmax ≤ 8, ε̄ = 2−52
2
9
splitting
asin(x)
degrees
8
7
1.5
6
5
1
4
3
0.5
2
1
5
0.8
5
0.8
0.7
5
0.7
0.6
5
0.6
0.5
5
0.5
0.4
5
0.4
0.3
5
0.3
0.2
5
0.2
0.1
5
0.0
0
0.1
0
0
18 intervals
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
31/ 44
Some results
measure
subdomains in bisection
subdomains in our method
subdomains saved
coefficients saved
memory saved (bytes)
memory saved (%)
name
f1
f2
f3
f4
Olga Kupriianova
f
asin
asin
erf
erf
ε̄
2−52
2−45
2−51
2−45
f1
24
18
25%
42
336
23%
f2
15
10
30%
34
272
35%
domain I
[0, 0.85]
[−0.75, 0.75]
[−0.75, 0.75]
[−0.75, 0.75]
Towards a Modern Floating-Point Environment
f3
9
5
44 %
30
240
38.5%
f4
12
8
30%
28
224
45%
dmax
8
8
9
7
Dec 11, 2015
32/ 44
Code Generation for Mathematical Functions
1. Introduction & Motivation for Code Generation
2. Manual Function Implementation. Background
3. Metalibm: General Overview
4. Metalibm: Domain Splitting
5. Metalibm: Reconstruction for Vectorizable Code
6. Conclusion and Perspectives
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
33/ 44
Problem Statement
Reconstruction gets tricky with arbitrary domain splitting:
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
34/ 44
Problem Statement
Reconstruction gets tricky with arbitrary domain splitting:
f (x) ≈ pk (x), x ∈ Ik , k ∈ [0, N ] ∩ Z
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
34/ 44
Problem Statement
Reconstruction gets tricky with arbitrary domain splitting:
f (x) ≈ pk (x), x ∈ Ik , k ∈ [0, N ] ∩ Z
∀x ∈ [a, b] find k such that x ∈ Ik
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
34/ 44
Problem Statement
Reconstruction gets tricky with arbitrary domain splitting:
f (x) ≈ pk (x), x ∈ Ik , k ∈ [0, N ] ∩ Z
f ≈ p2 (x)
∀x ∈ [a, b] find k such that x ∈ Ik
x
a
Olga Kupriianova
Towards a Modern Floating-Point Environment
01
2 3 4 5
6
7 8 9
Dec 11, 2015
b
34/ 44
Problem Statement
/∗ compute i so that a[i] < x < a[i+1] ∗/
i=31;
if (x < arctan_table[i][A].d) i−= 16;
else i+=16;
if (x < arctan_table[i][A].d) i−= 8;
else i+= 8;
if (x < arctan_table[i][A].d) i−= 4;
else i+= 4;
if (x < arctan_table[i][A].d) i−= 2;
else i+= 2;
if (x < arctan_table[i][A].d) i−= 1;
else i+= 1;
if (x < arctan_table[i][A].d) i−= 1;
xmBihi = x−arctan_table[i][B].d;
xmBilo = 0.0;
Listing 1: Code sample for arctan function from crlibm library
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
34/ 44
Vectorizable Reconstruction
f ≈ p2 (x)
x
Splitting of
the domain [a, b]:
a
Olga Kupriianova
01
Towards a Modern Floating-Point Environment
2 3 4 5
6
Dec 11, 2015
7 8 9
b
35/ 44
Vectorizable Reconstruction
To determine the subinterval we
build a mapping function
P (x) = k, x ∈ Ik
10
9
8
7
6
5
4
3
2
1
0
a
Olga Kupriianova
I0
I1
I2
I3
Towards a Modern Floating-Point Environment
I4
I5
I6
I7
I8
Dec 11, 2015
I9 b
35/ 44
Vectorizable Reconstruction
To determine the subinterval we
build a mapping function
P (x) = k, x ∈ Ik
p(x) : bp(x)c = k, x ∈ Ik
10
9
8
7
6
5
4
3
2
1
0
a
Olga Kupriianova
I0
I1
I2
I3
Towards a Modern Floating-Point Environment
I4
I5
I6
I7
I8
Dec 11, 2015
I9 b
35/ 44
Vectorizable Reconstruction
To determine the subinterval we
build a mapping function
P (x) = k, x ∈ Ik
p(x) : bp(x)c = k, x ∈ Ik
10
9
8
7
6
5
4
3
2
1
0
a
Olga Kupriianova
I0
I1
I2
I3
Towards a Modern Floating-Point Environment
I4
I5
I6
I7
I8
Dec 11, 2015
I9 b
35/ 44
Vectorizable Reconstruction
Interpolation polynomial
with a posteriori condition check
10
9
8
7
6
5
4
3
2
1
0
a
Olga Kupriianova
I0
I1
I2
I3
Towards a Modern Floating-Point Environment
I4
I5
I6
I7
I8
Dec 11, 2015
I9 b
35/ 44
Vectorizable Reconstruction
mapping function
P (x) = k, x ∈ Ik
p(x) : bp(x)c = k, x ∈ Ik
10
p(x) 9
bp(x)c 8
7
6
5
4
3
2
1
0
I0
a I1
Olga Kupriianova
I2
I3
I4
Towards a Modern Floating-Point Environment
I5
I6
I7
I8
I9 b
Dec 11, 2015
35/ 44
Example
p
7
6
5
4
3
2
1
0
-0.7
-0.6
-0.5
-0.4
-0.3
-0.2
-0.1
0
f = asin(x); dom = [-0.75; 0];
ε̄ = 2−48 ; dmax = 10;
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
36/ 44
A Posteriori Condition Check Fails
p
7
6
5
4
3
2
1
0
-1.4
-1.2
-1
-0.8
-0.6
-0.4
-0.2
0
f = atan(x); dom = [-π/2; 0];
ε̄ = 2−40 ; dmax = 8;
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
37/ 44
How to Improve our Method
Interval Arithmetic approach:
10
9
8
7
6
5
4
3
2
1
0
a
Olga Kupriianova
b
Towards a Modern Floating-Point Environment
Dec 11, 2015
38/ 44
How to Improve our Method
Interval Arithmetic approach:
10
9
8
7
6
5
4
3
2
1
0
a
Olga Kupriianova
b
Towards a Modern Floating-Point Environment
Dec 11, 2015
38/ 44
How to Improve our Method
Interval Arithmetic approach:
10
9
8
7
6
5
4
3
2
1
0
a
Olga Kupriianova
b
Towards a Modern Floating-Point Environment
Dec 11, 2015
38/ 44
How to Improve our Method
Interval Arithmetic approach:
Interval system of linear algebraic
equations
10
9
8
7
6
5
4
3
2
1
0
a
Olga Kupriianova
    
c0
b0
1 a0 · · · an0
1 a1 · · · an   c1   b1 
1  

 
 .. .. . .
..  ·  ..  =  .. 
. .
. . .  . 
cn
bn
1 an · · · ann

b
Towards a Modern Floating-Point Environment
Dec 11, 2015
38/ 44
How to Improve our Method
Interval Arithmetic approach:
Interval system of linear algebraic
equations
10
9
8
7
6
5
4
3
2
1
0
    
c0
b0
1 a0 · · · an0
1 a1 · · · an   c1   b1 
1  

 
 .. .. . .
..  ·  ..  =  .. 
. .
. . .  . 
cn
bn
1 an · · · ann

a
b
tolerance solution:
united solution:
Ξtol = c ∈ RN | ∀a ∈ a, ∀b ∈ b, Ac = b Ξuni = c ∈ RN | ∃a ∈ a, ∃b ∈ b, Ac = b
S. Shary HdR thesis “Интервальные алгебраические задачи и их численное решение”
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
38/ 44
Achievements
Generate the specified function versions in few minutes
Detect algebraic properties automatically
Use the improved (optimized) domain splitting algorithm
Generation of the vectorizable implementations has started
Extra bonuses: composite functions and a set of function variants
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
39/ 44
Code Generation for Mathematical Functions
1. Introduction & Motivation for Code Generation
2. Manual Function Implementation. Background
3. Metalibm: General Overview
4. Metalibm: Domain Splitting
5. Metalibm: Reconstruction for Vectorizable Code
6. Conclusion and Perspectives
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
40/ 44
Conclusion
Paving the road for mixed-radix arithmetic:
Novel algorithm for radix conversion
Algorithm for conversion from decimal character sequence to a
binary FP number
Worst cases search for FMA
Code generation for mathematical functions:
Novel algorithm for domain splitting
Novel approach to generate vectorizable implementations
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
41/ 44
Future Work
Mixed-Radix arithmetic
Finish the worst cases search for FMA
Start its implementation
Research on other arithmetic operations
Obtain test/comparison results for our scanf
Code generation of mathematical functions
Filtering of special cases
Improve vectorizable reconstruction:
Interval arithmetic for a priori approach
Overlapping intervals
Connection between reconstruction and domain splitting
Add more parameters
Integrate with N.Brunie’s version
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
42/ 44
Perspectives
Long-term Future
Integrate Metalibm to the glibc/gcc
Apply the similar algorithms for filter generation
Formal proof for scanf algorithm
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
43/ 44
Questions
N. Brunie, F. de Dinechin, O. Kupriianova, and C. Lauter. Code generators for
mathematical functions. In ARITH22 - June 2015, Lyon, France. Proceedings, pp
66-73. Best paper award.
O. Kupriianova and C. Q. Lauter. A domain splitting algorithm for the mathematical
functions code generator. In Asilomar Conference on Signals, Systems and
Computers, Pacific Grove, CA, USA, November, 2014, pp 1271-1275.
O. Kupriianova and C. Q. Lauter. Replacing branches by polynomials in vectorizable
elementary functions. 16th GAMM-IMACS International Symposium on Scientific
Computing, Computer Arithmetic and Validated Numerics, September 2014,
Würzburg, Germany.
O. Kupriianova and C. Q. Lauter. Metalibm: A mathematical functions code
generator. In Mathematical Software - ICMS 2014 - 4th International Congress,
Seoul, South Korea, August 2014. Proceedings, pp 713–717.
O. Kupriianova, C. Q. Lauter, and J.-M. Muller. Radix conversion for IEEE754-2008
mixed radix floating-point arithmetic. In Asilomar Conference on Signals, Systems
and Computers, Pacific Grove, CA, USA, November 2013, pp 1134–1138.
Olga Kupriianova
Towards a Modern Floating-Point Environment
Dec 11, 2015
44/ 44
Download