Towards a Tool for Forward and Reverse Mode Source-to

advertisement
Towards a Tool for Forward and Reverse Mode
Source-to-Source Transformation in C/C++
Michael Voßbeck, Ralf Giering, and Thomas Kaminski
FastOpt, Schanzenstr. 36, 20357 Hamburg, Germany, http://www.FastOpt.com
Summary. The paper presents a feasibility study for TAC++, the C/C++equivalent of Transformation of Algorithms in Fortran (TAF). The goal of this
study is to design an AD-tool capable of generating the adjoint of a simple but
non-trivial C code. The purpose of this new tool is threefold: First, we demonstrate
the feasibility of reverse mode differentiation in C. Second, we will use the
experience from this study in the design of TAC++. Third, the tool is valuable
in itself, as it can differentiate simple C codes and, thus, can support hand-coders
of large adjoint C/C++ codes. We have transfered a subset of TAF algorithms to
the new tool, including the Efficient Recomputation Algorithm (ERA). Our test
code is a C version of the Roe solver in the CFD code EULSOLDO. For a gradient
evaluation, the automatically generated adjoint takes the CPU time of about 3.8
function evaluations.
Keywords: automatic differentiation, reverse mode, adjoint model, recomputation,
source-to-source transformation, C++, CFD, flow solver
1 Introduction
Automatic Differentiation (AD, [1]) is a technique that yields accurate derivative information for functions defined by numerical programmes. Such a programme is decomposed into elementary functions defined by operations such
as addition or division and intrinsics such as cosine or logarithm. On the level
of these elementary functions, the corresponding derivatives are derived automatically, and application of the chain rule results in an evaluation of a
multiple matrix product, which is automated, too.
The two principal implementations of AD are operator overloading and
source-to-source transformation. The former exploits the overloading capability of modern object-oriented programming languages such as Fortran-90
[2, 3, 4] or C++ [5]. All relevant operations are extended by corresponding
derivative operations. Source-to-source transformation takes the function code
2
Voßbeck et al.
as input and generates a second code that evaluates the function’s derivative.
This derivative code is then compiled and executed. Hence, differentiation
and derivative evaluation are separated. The major disadvantage is that any
code analysis for the differentiation process has to rely on information that is
available at compile time. On the other hand, once generated, the derivative
code can be conserved, and the derivative evaluation can be carried out any
time on any platform, independently from the AD-tool. Also, extended derivative code optimisations by a compiler (and even by hand) can be applied. This
renders source-to-source transformation the ideal approach for large-scale and
run-time-critical applications.
The forward mode of AD propagates derivatives in the execution order
defined by the function evaluation, while the reverse mode operates in the
opposite order. FastOpt’s AD-tool Transformation of Algorithms in Fortran
(TAF, [6, 7]) has generated highly efficient forward and reverse mode derivative codes of a number of large (5,000 - 100,000 lines excluding comments)
Fortran 77-95 codes [8, 9, 10, 11, 12, 13]. Many further applications are in
progress.
Regarding source-to-source transformation for C, to our knowledge, ADIC
[14, 15] is the only tool that is currently available. However, ADIC is restricted
to the forward mode of AD. Hence, ADIC is not well-suited for differentiation
of functions with a large number of independent and a small number of dependent variables. However, such functions are typical for optimisation problems,
where the gradient of a scalar-valued function of many control variables is
required. In this context, the restriction to the forward mode usually constitutes a serious drawback. Typically such optimisation problems are rendered
computationally feasible by an artificial reduction of the number of control
variables.
This paper addresses reverse mode AD for C codes in the sense that it
presents a feasibility study for TAC++, a C++-equivalent of TAF. The goal
of the study is to implement an AD-tool capable of generating the adjoint of
a simple but non-trivial C test-code. The purpose of the new tool is threefold:
• Demonstrate the feasibility of reverse mode differentiation in C and investigate the performance of the generated code.
• Gain experience for design of TAC++.
• Build a tool that can differentiate simple codes and, thus, provides valuable
support to hand-coders of large adjoint C codes (and C++ codes with
sections in plain C syntax). This is the same approach we have taken for
Fortran: The adjoint model compiler (AMC [16]), the pre-predecessor of
TAF, was supporting hand-coders from the beginning, although, initially,
the tool was only capable of working on the level of basic code blocks.
TAC++
3
2 C Version of EULSOLDO
As starting point for our test code we selected the Roe Solver [17] of the
CFD code EULSOLDO [18]. As a test object Roe’s solver has become popular with AD-tool developers [19, 20]. The original solver routine has the four
components of the residual as the dependent variables. As we are mainly interested in the scalar reverse mode, we use the sum over these four components
as scalar dependent variable (fc), as shown in file 1. We also concatenate
all independent variables into a single array (x) of eight components, such
that the subroutine conveniently interfaces with our standard drivers. EULSOLDO’s original Fortran code has been transformed to C code (141 lines
without comments and one statement per line) by means of the tool f2c [21]
with command-line options -A (generate ANSI-C), -a (storage class of local
variables is automatic), and -r8 (promote real to double precision). f2c
also inserts an include directive for its header file f2c.h and uses pointer
types for all formal parameters, in order to preserve Fortran subroutine properties (call by reference) as shown in file 1. The transformed code is basic in
the sense that it consists of the following language elements:
• Selected datatype: int and double in scalar, array, typedef, and pointer
form
• Basic arithmetics: addition, subtraction, multiplication, division
• Basic pointer arithmetics
• One intrinsic: sqrt
• A few control flow elements: for, if, conditional expression
• A function call
3 The AD-tool
In the design process of the new tool, our approach has been to implement
well-proved and reliable TAF algorithms. Our AD tool essentially consists of
the four components shown in Fig. 1.
The normalisation replaces certain language constructs by equivalent
canonical code that is more appropriate to the transformation phase. For
example the comma-expression in the C version of EULSOLDO that is shown
in file 2 is normalised to the code segment shown in file 3.
Neither TAF’s global data flow analysis nor its local data dependence
analysis have been transferred yet. One of the consequences is that all variables
of type double are active.
The main challenge in reverse mode AD is to provide required values, i.e.
values from the function evaluation that are needed in the derivative code (for
details see [6, 22, 23]). The new tool uses recomputation for providing required
values. As in TAF, the Efficient Recomputation Algorithm (ERA [23]) avoids
unnecessary recomputations, which is essential for generating efficient derivative code. For instance the adjoint statement (see file 5) of the if-statement
4
Voßbeck et al.
/* roemodel call.f -- translated by f2c (version 20000121)...*/
#include "f2c.h"
int flux ( doublereal *ql, doublereal *qr,
doublereal *sinal, doublereal *cosal, doublereal *ds,
doublereal *r , doublereal *lambda)
{
... omitting body ...
}
int model (integer *n, doublereal *x, doublereal *fc)
{
... omitting declarations ...
/* Parameter adjustments */
--x;
/* Function Body */
sinal = .2307;
cosal = .3458;
ds = .25;
ql[0] = x[1];
ql[1] = x[2];
ql[2] = x[3];
ql[3] = x[4];
qr[0] = x[5];
qr[1] = x[6];
qr[2] = x[7];
qr[3] = x[8];
flux (ql, qr, &sinal, &cosal, &ds, r , &lambda);
/* Calculate Cost value */
*fc = 0.;
for (nvar = 1; nvar <= 4; ++nvar)
*fc += r [nvar - 1];
return 0;
} /* model */
File 1: Function code interface generated by f2c (slightly edited to save space).
l[0] = (d 1 = uhat - ahat, ((d 1) >= 0 ? (d 1) : -(d 1)));
File 2: Comma-expression in the C version of EULSOLDO
d 1=uhat-ahat;
l[0]=(d 1 >= 0 ? d 1 : -d 1);
File 3: file 2 in normalised form
TAC++
5
source.c
Front end
Normalisation
Transformation
Back end
source_ad.c
Fig. 1. Component structure of the new tool. The front end translates C code into
the new tool’s internal representation and comprises scanner, parser, and semantic analysis. The normalisation replaces certain language constructs by equivalent
canonical code that is more appropriate to the transformation phase. The transformation generates the derivative code (in the internal representation), and finally the
back end translates the internal representation back to C code.
from file 4 has the required variables d 1 and l[0]. While dl1star is still
available from an earlier recomputation (not shown), l[0] may be overwritten
by the if-statement itself. Hence, only recomputations for l[0] have to be
generated.
In its current form, the new tool implements an equivalent of TAF’s
pure mode, i.e. it generates code that evaluates the derivative without the
function. The new tool accepts preprocessed ANSI-C code, i.e. a preprocessor
such as cpp has to be invoked first. For the test code this preprocessing
step resolves the include directive for f2c.h. The f2c generated code also
contains simple pointer arithmetics, as a consequence of different conventions
of addressing elements of arrays in C and Fortran: While Fortran addresses
the first element of the array x containing the independent variables by x(1),
the C version addresses this element by x[0]. In order to use the index values
from the Fortran version, f2c includes a shift operation on the pointer to
the array x, i.e. the statement - -x is inserted. File 6 shows the generated
adjoint of the code section from File 1. The new tool generates an adjoint
6
Voßbeck et al.
/* Absolute eigenvalues, acoustic waves with entropy fix. */
l[0] = (d 1 = uhat - ahat, abs(d 1));
dl1 = qrn[0] / qr[0] - ar - qln[0] / ql[0] + al;
/* Computing MAX */
d 1 = dl1 * 4.;
dl1star = max(d 1,0.);
if (l[0] < dl1star * .5) {
l[0] = l[0] * l[0] / dl1star + dl1star * .25;
}
File 4: if-statement including code the if-clause depends on (from C version
of EULSOLDO)
/* RECOMP============== begin */
d 1=uhat-ahat;
l[0]=(d 1 >= 0 ? d 1 : -d 1);
/* RECOMP============== end */
if( l[0] < dl1star*0.500000 )
{
dl1star ad+=l ad[0]*(-(l[0]*l[0]/(dl1star*dl1star))+0.250000);
l ad[0]=l ad[0]*(2*l[0]/dl1star);
}
File 5: Recomputations for adjoint statement of if-statement from File 4
comprising 560 lines of C code with one statement and one declaration per
line. This excludes comments and the code generated from the include file.
The complete processing chain is depicted in the right branch of Fig. 2.
In its current form, the new tool handles the subset of ANSI-C used by
EULSOLDO (see sect. 2). In addition it can handle all intrinsics as usually
defined in math.h.
4 Performance of Generated Code
We have tested the performance of the generated code in a number of test
environments, i.e. for different combinations of processor, compiler, and level
of compiler optimisation.
Our first test environment consists of a 3GHz Pentium 4 and the Intel
compiler (icc, Version 8.0) with flags “-O3 -ip -xN”. This environment achieves
the fastest CPU-time for the function code. We have called it standard as it
reflects the starting point of typical users, i.e. they are running a function code
in production mode (as fast as possible) and need fast derivative code. In an
attempt to isolate the impact of the individual factors processor, compiler,
and optimisation level, further test environments have been constructed:
TAC++
7
void flux ad( doublereal *ql, doublereal *ql ad, doublereal *qr,
doublereal *qr ad, doublereal *sinal, doublereal *sinal ad,
doublereal *cosal, doublereal *cosal ad, doublereal *ds,
doublereal *ds ad, doublereal *r , doublereal *r ad,
doublereal *lambda, doublereal *lambda ad )
{
... omitting body ...
}
void model ad( integer *n, doublereal *x, doublereal *x ad,
doublereal *fc, doublereal *fc ad )
{
... omitting declarations ...
--x;
--x ad;
sinal=0.230700;
cosal=0.345800;
ds=0.250000;
ql[0]=x[1];
ql[1]=x[2];
ql[2]=x[3];
ql[3]=x[4];
qr[0]=x[5];
qr[1]=x[6];
qr[2]=x[7];
qr[3]=x[8];
for( nvar=1; nvar <= 4; ++nvar )
r ad[nvar-1]+=*fc ad;
*fc ad=0;
flux ad(ql, ql ad, qr, qr ad, &sinal, &sinal ad, &cosal, &cosal ad,
&ds, &ds ad, r , r ad, &lambda, &lambda ad);
...omitting further adjoint statements...
}
File 6: Adjoint of File 1 (slightly edited to save space).
• The environment g++ differs from standard in that it uses the GNU
C/C++ compiler (g++, Version 3.3.3) with option “-O3”
• The environment AMD differs from standard in that it uses another processor, namely the 1600 MHz Athlon XP1900+ and the corresponding fast
compiler flags “-O3 -ip -tpp6”.
• The environment lazy differs from standard in that it does not use compiler
flags at all. In terms of compiler optimisation this is equivalent to the iccflag “-O2”.
For each environment, Table 1 lists the CPU time for a function evaluation, a
gradient evaluation, and their ratio. We used the timing module provided by
ADOL-C. Each code has been run five times, and the fastest result has been
recorded.
8
Voßbeck et al.
f2c
roemodel.f
roemodel_f2c.c
cpp
new tool
taf
f2c
roemodel_ad.f
roemodel_ad_f2c.c
icc
ifort
roemodel_ad
roemodel_f2c_ad.c
roemodel_ad_f2c
icc
roemodel_f2c_ad
Fig. 2. Processing chain for test code. Oval boxes denote the stages of the processes.
Rectangular boxes contain the files that are input/output to the individual stages of
the process. Names of executables are printed in bold face letters. The right branch
shows processing with f2c and new AD tool, the middle branch shows processing
with TAF and f2c, and the left branch shows processing with TAF.
Table 1. Performance of code generated by new tool (CPUs: seconds of CPU time)
Version
standard
g++
AMD
lazy
CPUs Function
5.4e-07
7.7e-07
6.2e-07
5.8e-07
CPUs Adjoint(Gradient) Ratio
2.1e-06 3.8
2.8e-06 3.6
2.4e-06 3.9
2.4e-06 4.1
In the absence of another source-to-source tool for reverse mode AD, we
have used the operator overloading tool ADOL-C, version 1.8.7 [24], in reverse
mode as a first benchmark for performance. We have changed the type of all
active variables in EULSOLDO’s C-version by hand, which was easy for a
code of that size. Next EULSOLDO’s derivative was evaluated by ADOL-C,
which went well and straightforward.
We repeated the performance tests for the same environments as above (see
Table 1). In addition we tested two ADOL-C specifics with high impact on
performance: First, ADOL-C offers an active vector class, to speed up deriva-
TAC++
9
tive propagation for active arrays. While our environment standard employs
the active vector class (type adoublev(n)), the additional test scalar uses
regular arrays (adouble[n]). The second ADOL-C specific relates to the tape
that it uses to store all operations and operands. As our test code represents a
small AD problem, ADOL-C can keep the tapes [24] in memory, and we used
this default in our environment standard. For larger AD-applications, the tape
needs to be written to disk, though. Hence, Table 2 shows the additional test
disk with the tape written to disk instead of memory.
As ADOL-C does not offer a pure mode (see sect. 3), each gradient evaluation includes a function evaluation. The ratios in Table 2 demonstrate, for
our test code, a considerably higher performance for the adjoint generated by
source-to-source transformation. It is worth noting that ADOL-C has a number of advantages such as its capabilities of handling the full C/C++ language
standard and providing higher order derivatives.
Table 2. Performance of ADOL-C reverse mode (CPUs: seconds of CPU time)
Version
standard
g++
AMD
lazy
scalar
disk
CPUs Function
5.4e-07
8.3e-07
7.2e-07
5.8e-07
5.4e-07
5.4e-07
CPUs Adjoint(Function+Gradient)
12.0e-06
11.0e-06
16.0e-06
12.0e-06
12.0e-06
23.0e-06
Ratio
22.5
13.0
22.0
21.0
23.0
43.0
As the current tool is only basic, we were curious to get an idea of the
potential performance of TAC++. Hence, we have applied our Fortran tool
TAF (with command-line options “-split -replaceintr”) to EULSOLDO’s initial Fortran-version. A previous study on AD of EULSOLDO [20] found this
combination of TAF command-line options for generating the most efficient
adjoint code. The TAF-generated adjoint has then been compiled with the Intel Fortran compiler (ifort, Version 8.0) and flags “-O3 -ip -xN”. This process
is shown as left branch in Fig. 2. Table 3 compares the performance of the new
tool’s adjoint (in the environment standard, first row) with that of the TAFgenerated adjoint (second row). Our performance ratio is in the range reported
by [20] for a set of different environments. Besides the better performance of
ifort-generated code, the second row also suggests that TAF-generated code
is more efficient by about 26%. In this comparison, both the AD tool and the
compiler differ. To isolate their respective effects on the performance, we have
carried out an additional test: We have taken the TAF-generated adjoint code,
have applied f2c, and have compiled in the environment standard as depicted
by the middle branch in Fig. 2. The resulting performance is shown in row 3 of
Table 3. The value of 2.9 suggests that the superiority of the Fortran branch
10
Voßbeck et al.
(row 2) over the C branch (row 1) cannot be attributed to the difference in
compilers. It rather indicates some scope for improvement of the current tool
in terms of performance of the generated code.
Two immediate candidates for improving this performance are the two
TAF command-line options identified by [20]. The option “-replaceintr” makes
TAF’s normalisation phase replace intrinsics such as abs, min, max by ifthen-else structures. In the C branch (row 1 of Table 3) this is already
done by f2c, i.e. EULSOLDO’s C version does not use any of these intrinsics.
The TAF command-line option “-split” is not available in the new tool yet.
Here might be some potential for improving the performance of the generated
code. Another difference to TAF is the lack of an activity analysis in the new
tool, which currently simply assumes all reals and doubles are active. Consequently it generates superfluous code segments propagating derivatives of
passive variables (which are all zero). In EULSOLDO this affects four passive
variables. Hence, transferring TAF’s activity analysis to the new tool, will
certainly enhance the performance of the generated code.
Table 3. Performance of function and adjoint codes generated by new tool and TAF
Version
f2c → new tool → icc
TAF → ifort
TAF → f2c → icc
CPUs Function
5.4e-07
5.1e-07
5.4e-07
CPUs Adjoint(Gradient) Ratio
2.1e-06 3.8
1.5e-06 2.8
1.6e-06 2.9
5 Conclusions and Outlook
Our study demonstrates the feasibility of reverse mode source-to-source transformation in C for a short test code (proof of concept). The usefulness of such
a tool is stressed by the favourable performance comparison to ADOL-C’s
adjoint of our test code. We will use the experience from this study for the
design of TAC++. This development will be application-driven, i.e. we will
tackle challenges as they arise in applications. Hence, TAC++’s functionality
will be enhanced application by application. Fortunately, many of the C++
challenges occur also in Fortran-90. Examples are handling of dynamic memory, structured types, operator overloading, overloaded functions, or accessing
of private variables. This allows us to port well-proved TAF algorithms to
TAC++. Other challenges are specific to C/C++ and require a solution that
is independent from TAF.
TAC++
11
Acknowledgements
The authors thank Paul Cusdin and Jens-Dominik Müller for providing EULSOLDO in its original Fortran 77 version as well as Andrea Walther, Olaf
Vogel, and Andreas Griewank for providing ADOL-C, and four anonymous
reviewers for helpful comments.
References
1. Griewank, A.: Evaluating Derivatives: Principles and Techniques of Algorithmic
Differentiation. Number 19 in Frontiers in Appl. Math. SIAM, Philadelphia, PA
(2000)
2. Shiriaev, D.: ADOL–F automatic differentiation of Fortran codes. In Berz, M.,
Bischof, C.H., Corliss, G.F., Griewank, A., eds.: Computational Differentiation:
Techniques, Applications, and Tools. SIAM, Philadelphia, PA (1996) 375–384
3. Rhodin, A.: IMAS - Integrated Modeling and Analysis System for the solution
of optimal control problems. Computer Physics Communications 107 (1997)
21–38
4. Pryce, J.D., Reid, J.K.: AD01, a Fortran 90 code for automatic differentiation. Technical Report RAL-TR-1998-057, Rutherford Appleton Laboratory, Chilton, Didcot, Oxfordshire, OX11 OQX, England (1998) See
ftp://matisa.cc.rl.ac.uk/pub/reports/prRAL98057.ps.gz.
5. Griewank, A., Juedes, D., Utke, J.: ADOL–C, a package for the automatic
differentiation of algorithms written in C/C++. ACM Trans. Math. Software
22 (1996) 131–167
6. Giering, R., Kaminski, T.: Recipes for Adjoint Code Construction. ACM Trans.
Math. Software 24 (1998) 437–474
7. Giering, R., Kaminski, T., Slawig, T.: Applying TAF to a Navier-Stokes solver
that simulates an Euler flow around an airfoil. To appear in Future Generation
Computer Systems (2005)
8. P. Heimbach and C. Hill and R. Giering: An efficient exact adjoint of the
parallel MIT general circulation model, generated via automatic differentiation.
To appear in Future Generation Computer Systems (2005)
9. Galanti, E., Tziperman, E., Harrison, M., Rosati, A., Giering, R., Sirkes, Z.: The
equatorial thermocline outcropping - a seasonal control on the tropical pacific
ocean-atmosphere instability. Journal of Climate 15 (2002) 2721–2739
10. Kaminski, T., Giering, R., Scholze, M., Rayner, P., Knorr, W.: An example of
an automatic differentiation-based modelling system. In Kumar, V., Gavrilova,
L., Tan, C.J.K., L’Ecuyer, P., eds.: Computational Science – ICCSA 2003, International Conference Montreal, Canada, May 2003, Proceedings, Part II. Volume
2668 of Lecture Notes in Computer Science., Berlin, Springer (2003) 95–104
11. Thomas, J.P., Hall, K.C., Dowell, E.H.: A discrete adjoint approach for modeling unsteady aerodynamic design sensitivities. 41st AIAA Aerospace Sciences
Meeting, Reno, Nevada (2003)
12. Giering, R., Kaminski, T., Todling, R., Errico, R., Gelaro, R., Winslow, N.: Generating tangent linear and adjoint versions of NASA/GMAO’s Fortran-90 global
weather forecast model. In Bücker, H.M., Corliss, G., Hovland, P., Naumann,
12
13.
14.
15.
16.
17.
18.
19.
20.
21.
22.
23.
24.
Voßbeck et al.
U., Norris, B., eds.: Automatic Differentiation: Applications, Theory, and Tools.
Lecture Notes in Computational Science and Engineering. Springer (2005)
Xiao, Y., Xue, M., Martin, W., Gao, J.: Development of an adjoint for a complex
atmospheric model, the ARPS, using TAF. In Bücker, H.M., Corliss, G., Hovland, P., Naumann, U., Norris, B., eds.: Automatic Differentiation: Applications,
Theory, and Tools. Lecture Notes in Computational Science and Engineering.
Springer (2005)
Bischof, C.H., Roh, L., Mauer, A.: ADIC — An extensible automatic differentiation tool for ANSI-C. Software–Practice and Experience 27 (1997) 1427–1456
Hovland, P., Norris, B., Smith, B.: Making automatic differentiation truly automatic: Coupling PETSc with ADIC. In Sloot, P.M.A., Tan, C.J.K., Dongarra,
J.J., Hoekstra, A.G., eds.: Computational Science – ICCS 2002, Proceedings
of the International Conference on Computational Science, Amsterdam, The
Netherlands, April 21–24, 2002. Part II. Volume 2330 of Lecture Notes in Computer Science., Berlin, Springer (2002) 1087–1096
Giering, R.: Adjoint code generation. In Bischof, C., Griewank, A., Khademi,
P., eds.: Workshop Report on First Theory Institut on Computational Differentiation, Technical Report ANL/MCS-TM-183 (1993) 11–12
Roe, P.L.: Approximate Riemann solvers, parameter vectors, and difference
schemes. J. Comput. Phys. 135 (1997) 250–258
Cusdin, P., Müller, J.D.: EULSOLDO. Technical Report QUB-SAE-03-02, QUB
School of Aeronautical Engineering (2003)
Tadjouddine, M., Forth, S.A., Pryce, J.D.: AD tools and prospects for optimal
AD in CFD flux Jacobian calculations. In Corliss, G., Faure, C., Griewank, A.,
Hascoët, L., Naumann, U., eds.: Automatic Differentiation of Algorithms: From
Simulation to Optimization. Computer and Information Science. Springer, New
York, NY (2002) 255–261
Cusdin, P., Müller, J.D.: Improving the performance of code generated by automatic differentiation. Technical Report QUB-SAE-03-04, QUB School of Aeronautical Engineering (2003) Submitted to Optimization Methods and Software.
Feldman, S.I., Weinberger, P.J.: A portable Fortran 77 compiler. In: UNIX Vol.
II: research system (10th ed.). W. B. Saunders Company (1990) 311–323
Faure, C.: Adjoining strategies for multi-layered programs. Optimisation Methods and Software 17 (2002) 129–164 To appear. Also appeared as INRIA Rapport de recherche no. 3781, BP 105-78153 Le Chesnay Cedex, FRANCE, 1999.
Giering, R., Kaminski, T.: Recomputations in reverse mode AD. In Corliss,
G., Griewank, A., Fauré, C., Hascoet, L., Naumann, U., eds.: Automatic Differentiation of Algorithms: From Simulation to Optimization. Springer Verlag,
Heidelberg (2002) 283–291
Griewank, A., Juedes, D., Mitev, H., Utke, J., Vogel, O., Walther, A.: ADOL-C:
A package for the automatic differentiation of algorithms written in C/C++.
Technical report, Technical University of Dresden, Institute of Scientific Computing and Institute of Geometry (1999) Updated version of the paper published
in ACM Trans. Math. Software 22, 1996, 131–167.
Download