Compilers, Libraries and Other Strange Beasts
Mario Antonioletti, Applications Consultant
Advanced Programming: Tools and Techniques - Session 4

Compilers, Libraries and Others
* Compilers, preprocessors, linkers
  – What do they do and how do they do it?
  – Why should I care how they do it?
  – Macros and conditional compilation
* Libraries
  – What are they?
  – Shared, static, .so, .a, .lib, .dll, .ocx
  – Building them and looking at them
  – Standard libraries: black boxes and NIH
* Hairy things
  – Language crosscalling - a can of worms

Code compilers
* Fundamental tools of the trade:
  – Code preprocessors (e.g. cpp, fpp, gpp)
  – Code compilers (e.g. f77, f90, cc, CC, c++, gcc, javac)
  – Code linkers (e.g. ld, ild)
* Knowing how these things work is valuable
  – Understand optimisation
  – Understand how to use them correctly in, e.g., makefiles
  – Understand how your program will work on strange architectures
* Joe BankDBProgrammer may not need to know
* Sally HPCWizard does

What do they do?
* Translate high-level language into machine runnable code
[Figure: .F / .c / .cpp → preprocessor → .f / .c / .cpp → compiler → .s → assembler → .o → linker (together with .a and .so libraries) → a.out]

Preprocessors
* Preprocess source code before compilation
  – Use to define constants and macros (e.g. #define TRUE 1)
  – Also to include other code files (e.g. #include <stdio.h>)
  – Also for conditional compilation (e.g. #ifdef PAR_MPI)
* Most common is the C preprocessor, cpp
  – Run implicitly by C compilers
  – Also run implicitly by most Fortran compilers for .F / .F90 files
    • Though some use slightly different ‘Fortranny’ versions (e.g. fpp)
  – Can be used explicitly for any language
    • A makefile is a good place to define preprocessing rules like this:
        .texs.tex:
                cpp $< > ${<:.texs=.tex}

Preprocessors
    void s_fact(void)
    {
    #ifdef MFACT_TIMES
        double t0, t1;
        t0 = mfact_clock();
    #endif
        .... body of function in here....
    #ifdef MFACT_TIMES
        t1 = mfact_clock();
        fprintf(logfile_p, "s_fact time %f\n", t1 - t0);
    #endif
    }

Preprocessors
* Preprocessors are a powerful and useful tool
* But: avoid overuse...
* Don’t use macros for the sake of it
  – Macros should enhance readability as well as (possibly) performance
* Don’t use macros that extend over many lines
  – Can confuse debuggers as to correct line numbering
* Don’t use too many #ifdefs in one function
  – If you’re conditionally switching a lot of code, you should probably write it as a completely separate function

Preprocessors
* Be careful with magic numbers:
    #define IN_BUFFER_SIZE 20
    #define OUT_BUFFER_SIZE 30
    :
    char in_buffer[IN_BUFFER_SIZE];
* This is fine, but no typechecking is done for #define’d constants
    enum { IN_BUFFER_SIZE = 20, OUT_BUFFER_SIZE = 30 };
* This is better (plus debuggers can see them!)
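The two preprocessor habits above (conditional timing code and enum constants instead of #define) can be combined in one small sketch. This is only an illustration: sum_squares, in_buffer_bytes and demo_clock are hypothetical names, and demo_clock stands in for the slide's mfact_clock using the standard clock() function.

```c
#include <stdio.h>
#include <time.h>

/* Enumerated constants: typed, and visible to a debugger. */
enum { IN_BUFFER_SIZE = 20, OUT_BUFFER_SIZE = 30 };

#ifdef MFACT_TIMES
/* Hypothetical stand-in for mfact_clock(): CPU seconds used so far. */
static double demo_clock(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}
#endif

/* Sum of squares 1..n. The timing code exists only when the file is
 * compiled with -DMFACT_TIMES; otherwise the preprocessor removes it. */
long sum_squares(int n)
{
    long total = 0;
    int i;
#ifdef MFACT_TIMES
    double t0 = demo_clock();
#endif
    for (i = 1; i <= n; i++)
        total += (long)i * i;
#ifdef MFACT_TIMES
    fprintf(stderr, "sum_squares time %f\n", demo_clock() - t0);
#endif
    return total;
}

/* Size in bytes of a buffer declared with the enum constant. */
int in_buffer_bytes(void)
{
    char in_buffer[IN_BUFFER_SIZE];
    return (int)sizeof in_buffer;
}
```

Compiling once with and once without -DMFACT_TIMES, and comparing the preprocessed output, shows exactly which lines the compiler proper gets to see.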

Compilers
* Take (preprocessed) source code and generate assembly language
* Usually work in two distinct stages
  – Syntactic and semantic analysis (front end)
    • Checks your code for “correctness”
    • Generates intermediate language representation
  – Assembly code generation (back end)
    • Takes intermediate language and generates assembler
* Can incorporate optimisation in either or both stages

Assemblers
* Turn assembly language into object code
* Are (naturally) very architecture specific
* Again, can incorporate optimisation
* Assembly is usually a hidden stage in compilation
  – Unless you have to write explicit assembler!

Linkers
* Put all the pieces together to make a program
* Unite all the object files and join them to libraries
  – Library locations defined by flags (e.g. -L) or by “library path” environment variable (e.g. LD_LIBRARY_PATH)
* Add operating system “hooks” for calling dynamic libraries
* Bind all symbolic references to memory addresses
  – Identify the “main” routine as the initial entry point
* Some linkers work incrementally (e.g. ild)

Compiler systems
* All the preceding phases are generally presented as one “compilation process”
  – e.g. f90 mycode.F90 actually does all this for you (Solaris):
    • fpp mycode.F90 → mycode.for.f90
      f90 -s mycode.for.f90 → mycode.s
      as mycode.s → mycode.o
      ld mycode.o <libraries> → a.out
* Specify compile options for any of these phases
* Understanding the options available for your compiler is essential, especially when optimising
  – Read The Flippin’ Manual!
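To watch the separate phases at work, a tiny C file can be pushed through them one step at a time. The sketch below assumes GCC rather than the Solaris f90 driver on the slide; the file name phases.c, the function scaled and the second object main.o are hypothetical.

```c
/* phases.c: a tiny translation unit for watching each compilation phase.
 *
 * Assuming GCC, each phase can be run by hand:
 *   gcc -E phases.c -o phases.i     preprocess only
 *   gcc -S phases.i -o phases.s     compile to assembly language
 *   gcc -c phases.s -o phases.o     assemble to object code
 *   gcc phases.o main.o -o phases   link (main.o is a hypothetical
 *                                   second object that supplies main)
 */
#define SCALE 2   /* substituted by the preprocessor; absent from phases.i */

/* Multiply x by the compile-time constant SCALE. */
int scaled(int x)
{
    return SCALE * x;
}
```

Looking at phases.i shows the macro already expanded, and phases.s shows what the back end made of the multiplication.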

Object code formats
* COFF – Unix System V R3 Common Object File Format
  • SGI/MIPS, DEC (ECOFF), IBM, Macintosh and Be (XCOFF)
* ELF – Unix System V R4 Executable and Linking Format
  • Solaris, Linux
* OMF – Intel Object Module Format
* PE – MS Windows 95/NT Portable Executable
  – Also a flavour of COFF

Libraries
* A library is just a collection of object files
  – A useful (or essential!) collection of functions, subroutines or classes
* Libraries are A Good Thing
  – They promote clean, well-behaved interfaces between logically separate chunks of code
  – They promote reuse and discourage the reinvention of the wheel
  – Special-purpose libraries are often optimised for the current architecture
    • Never write your own FFT...
    • Break away from that Black-Box-Not-Invented-Here feeling...

Libraries
* Libraries have two broad types:
* Static libraries (.a or .lib)
  – Library code copied into your program
  – Large executable file but can give marginally faster code
  – Wasteful: each user’s program has its own copy of the library
* Dynamic libraries (.so or .dll)
  – Library linked dynamically, i.e. at runtime
  – Smaller executable file
  – Single library copy can be used by multiple running programs
  – Pick up latest (improved!) versions when you run
    • Though can lead to non-bit reproducibility
  – Changes of library locations or versions → cryptic runtime failures

Building (Unix) libraries
* Static libraries (*.a): the ar command
* ar collects object files together into a static lib
  – A Solaris example:
      $ ar ruv libmfact.a s_factor.o n_factor.o...
  – The r & u options mean replace existing files if they are older than the ones specified on the command line
* Note the naming convention: libmfact.a can be referred to when compiling/linking with the flag -lmfact
  – Generally, -lxxx looks for libxxx.a on the library path

Building (Unix) libraries
* Shared libraries (*.so): the ld command
* Use the standard linker command with flags
  – E.g. under Solaris use the -G flag
      $ ld -o libmfact.so -G s_factor.o n_factor.o...
* Use linker flag -lmfact to reference, as before

Windows/NT libs & components
* LIBs – Static object libraries
  – Behave just like Unix archive libraries (*.a)
  – Can be built through IDE project “wizards”
* DLLs – Dynamic Link Libraries
  – Used in same way as Unix shared object (*.so) libraries
  – Again, usually built through IDE project “wizards”

Windows/NT libs & components
* OCXs – OLE Custom Control
  – Independent code component, used like a shared library
* ActiveX controls
  – Supersede OCX (though use same file extension)
  – A marriage of OLE and Component Object Model
  – Can run as stand-alone programs in a “container” environment (a little like Java applets)

Examining object code
* Some standard (Unix) tools for looking inside object code, libraries, executables:
* nm: prints the name list of an object file
* dis: the opposite of as :-)
* ldd: lists dynamic library dependencies
* file: prints basic file information
* strings: extracts text strings
* od: examines arbitrary binary files

nm
* Displays the “symbol table” (if not stripped)
    $ nm s_factor.o
    [Index]   Value   Size  Type  Bind  Other  Shname   Name
    [6]     |     0|     0|NOTY |GLOB |0     |UNDEF   |.mul
    [3]     |     0|     0|NOTY |LOCL |0     |.bss    |Bbss.bss
    [4]     |     0|     0|NOTY |LOCL |0     |.data   |Ddata.data
    [5]     |     0|     0|NOTY |LOCL |0     |.rodata |Drodata.rodata
    [1]     |     0|     0|FILE |LOCL |0     |ABS     |s_factor.c
    [42]    |    16|   288|FUNC |GLOB |0     |.text   |s_fact
    [25]    |  1216|    68|FUNC |GLOB |0     |.text   |s_fact_free
    [9]     |     4|     4|OBJT |GLOB |0     |COMMON  |stack_ptr
    [13]    |     0|     0|NOTY |GLOB |0     |UNDEF   |N
    [30]    |     0|     0|NOTY |GLOB |0     |UNDEF   |chain_index
    [40]    |     0|     0|NOTY |GLOB |0     |UNDEF   |double_size

dis
* Disassembles object code
    $ dis s_factor.o
    disassembly for s_factor.o
    section .text
             0:  00 01 00 00   unimp   0x10000
             4:  00 01 00 00   unimp   0x10000
             8:  00 01 00 00   unimp   0x10000
             c:  00 01 00 00   unimp   0x10000
    s_fact()
            10:  9d e3 bf 90   save    %sp, -112, %sp
            14:  a2 10 20 04   mov     4, %l1
            18:  21 00 00 00   sethi   %hi(Bbss.bss), %l0
            1c:  e2 24 20 00   st      %l1, [%l0 + Bbss.bss]
            20:  a2 10 20 08   mov     8, %l1
            24:  21 00 00 00   sethi   %hi(Bbss.bss), %l0
            28:  e2 24 20 00   st      %l1, [%l0 + Bbss.bss]

ldd
* Lists dynamic dependencies of executables
    $ ldd testcode.exe
    libm.so.1 =>         /opt/SUNWspro/lib/libm.so.1
    libfui.so.1 =>       /opt/SUNWspro/lib/libfui.so.1
    libfai.so.1 =>       /opt/SUNWspro/lib/libfai.so.1
    libfsumai.so.1 =>    /opt/SUNWspro/lib/libfsumai.so.1
    libfprodai.so.1 =>   /opt/SUNWspro/lib/libfprodai.so.1
    libfsu.so.1 =>       /opt/SUNWspro/lib/libfsu.so.1
    libsunmath.so.1 =>   /opt/SUNWspro/lib/libsunmath.so.1
    libc.so.1 =>         /usr/lib/libc.so.1
    libdl.so.1 =>        /usr/lib/libdl.so.1

file
* Determines file type by reading file headers
  – Works for many non-object file types
  – Doesn’t rely on the ‘.xyz’ filename extension :-)
    $ file s_factor.o
    s_factor.o: ELF 32-bit MSB relocatable SPARC Version 1
    $ file testcode.exe
    testcode.exe: ELF 32-bit MSB executable SPARC Version 1 dynamically linked, not stripped
    $ file s_factor.c
    s_factor.c: English text

strings
* Finds printable strings in object or binary files
    $ strings s_factor.o
    s_fact_allocate_vecs
    s_fact_allocate_vecs
  – These correspond to the program lines
      exit_err("s_fact_allocate_vecs", malloc_err)
* Can be used to embed CVS/RCS ids in objects
  – e.g. in C: static char *cvsid = "$Id$";
  – CVS/RCS expands $Id$ to current revision info
  – This revision info is embedded in compiled code as a string!

od
* Stands for ‘octal dump’ (it’s an old function :-)
* Displays contents of binary files
  – Flags specify display format: chars, ints, floats, doubles,...
    $ od -c unknown-file
    [od -c output garbled in extraction: nine rows at octal offsets 0000000 to 0000200, mostly octal escapes such as 377 and \0; the readable fragments "I F", "8 9 a", "M a d e w i t" and "I M P" survive, which suggests the unknown file is a GIF image]

Standard (numeric) libraries
* Reiterating: libraries are A Good Thing
* Know what libraries are available and what they do
* Use them!
  – They will be better than code you can write yourself...
* Some common libraries of note
  – MPI
  – BLAS
  – LAPACK
  – PETSc
  – NAG

MPI
* Message Passing Interface
* Very good example of reusable library
  – You could reimplement all that interprocess communication at the sockets/hardware/native level if you wanted to...
* Frequently optimised for HPC hardware
  – e.g. T3D/T3E MPI (written by EPCC!)
    uses Cray’s shmem model
* Fortran77 and C interfaces
  – Also C++ and Fortran90 in MPI-2
* Generic Unix versions available from http://www-unix.mcs.anl.gov/mpi/index.html

BLAS
* Basic Linear Algebra Subprograms
  – Fast, efficient kernels for building numeric codes
* Level 1: vector-vector operations
  – xCOPY, xAXPY, xDOT, ... (x indicates type)
* Level 2: matrix-vector operations
  – xGEMV, xHEMV, xSYMV, ...
* Level 3: matrix-matrix operations
  – xGEMM, xHEMM, xSYMM, ...
* Optimised BLAS exist for many HPC systems
  – This is why you should use them!

LAPACK
* Linear Algebra PACKage
  – Routines for solving simultaneous linear systems, eigenvalue problems etc.
  – Includes matrix factorisation routines (Cholesky, LU, SVD, ...)
* Uses BLAS kernels where possible
* Designed to run efficiently on cache-based HPC
  – ScaLAPACK for distributed memory systems
* Primarily Fortran77
  – Fortran90 & C++ interfaces do exist
* http://www.netlib.org/lapack/index.html

PETSc
* Portable, Extensible Toolkit for Scientific Computation
  – Linear and non-linear equation solvers
  – GMRES, CG, other popular “Krylov subspace methods”
* Built on MPI for parallel system portability
  – Also uses LAPACK and BLAS
* C, C++, Fortran interfaces
* http://www-fp.mcs.anl.gov/petsc/

NAG
* Numerical Algorithms Group Ltd
  – Optimisation, PDEs, ODEs, FFTs, etc...
  – NAG libraries are not free!
  – Widely recognised as good
  – Widely used in industry
* Fortran77, Fortran90, C versions
  – Windows, Unix and Linux
  – Also HPC versions for SMP and MPI
  – Even a DLL version for Windows VB programmers :-)
* http://www.nag.co.uk/numerical_libraries.asp

Language crosscalling
* Sometimes programs are written in more than one language
  – e.g.
    Numerical Fortran code with an i/o library in C
* This is either Very Hard or Merely Tricky:
  – If the two compilers produce different object file formats, this is Very Hard
  – If the two compilers share a common object file format (e.g. ELF 32-bit MSB under Solaris), this is Merely Tricky
* We’ll look at some of the Mere Tricks for Fortran - C crosscalling…

Fortran - C crosscalling
* Three main gotchas:
  1. Datatypes may not match up the way you hope
  2. C passes function arguments by value; Fortran passes arguments by reference
  3. Functions and subroutines have different naming conventions

Fortran - C datatypes
* Generally, sizes (in bytes) of program datatypes are compiler and system dependent
  – Can use explicit Fortran types (e.g. INTEGER*4) and C’s sizeof operator to make things easier
    • Best thing to do is write test programs
  – Types which are usually safe:
      Fortran             C        under Solaris
      INTEGER             int      32-bit integer
      REAL                float    32-bit floating point
      DOUBLE PRECISION    double   64-bit floating point
      CHARACTER           char     8-bit ASCII value
  – Harder things:
    • Fortran LOGICAL, COMPLEX, strings
    • C structs, pointers

Arguments: Fortran → C
* call croutine(x, y, n)
  – Fortran passes a scalar (say integer n)
  – C actually receives a pointer to that scalar (say int *n)
  – Passing arrays is OK
    • Fortran array appears as a C pointer, as you expect
    • BUT: the axes of multidimensional arrays are reversed!
    • Thus: integer x(i,j) → int x[j][i]
    • Also, don’t forget Fortran indexes from 1, C from 0!
  – Passing strings is very hairy
    • Fortran strings have a defined, fixed length
    • C strings are null terminated (character ‘\0’)
    • Fortran needs to pass the string length explicitly
    • C must then add a ‘\0’ to the end

Arguments: C → Fortran
* froutine(x, y, &n);
  – Because Fortran expects to receive a pointer, C must pass scalars using the “address of” operator ‘&’
  – Passing arrays is fine, since C passes arrays as pointers
    • NB: previous notes on axes and indexes!
  – Strings are hairy
    • Fortran expects a fixed-length string, right-padded with spaces if necessary
    • C must fix this up and remove the ‘\0’ null characters before passing

Naming conventions
* Compilers with the same object file format do not necessarily use the same naming conventions for functions and subroutines
* Usual variation is to prepend or append an ‘_’ to the function name
  – Sometimes both
* This varies system to system...
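The string fix-ups described on the two Arguments slides can be sketched as a pair of C helpers. This is a minimal illustration, not a library interface: fstr_to_cstr and cstr_to_fstr are hypothetical names, and the sketch assumes the Fortran side passes the string length explicitly as an int, as the slide recommends.

```c
#include <stddef.h>
#include <string.h>

/* Fortran → C: copy a fixed-length, space-padded Fortran string into a
 * C buffer, dropping trailing spaces and appending '\0'.
 * Returns the resulting C length, or -1 if cbuf is too small. */
int fstr_to_cstr(const char *fstr, int flen, char *cbuf, size_t cbufsize)
{
    int n = flen;
    while (n > 0 && fstr[n - 1] == ' ')
        n--;                          /* strip the right-hand padding */
    if ((size_t)n + 1 > cbufsize)
        return -1;                    /* no room for text plus '\0' */
    memcpy(cbuf, fstr, (size_t)n);
    cbuf[n] = '\0';
    return n;
}

/* C → Fortran: copy a C string into a fixed-length Fortran buffer,
 * right-padding with spaces and writing no '\0'.
 * Silently truncates if the C string is longer than flen. */
void cstr_to_fstr(const char *cstr, char *fbuf, int flen)
{
    size_t n = strlen(cstr);
    if (n > (size_t)flen)
        n = (size_t)flen;
    memcpy(fbuf, cstr, n);
    memset(fbuf + n, ' ', (size_t)flen - n);
}
```

A C routine called from Fortran would run fstr_to_cstr on each incoming string before handing it to ordinary C string functions, and cstr_to_fstr on anything it hands back.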

Naming conventions: Solaris
* Under Solaris
  – f77/f90 append ‘_’ to the name of each function or subroutine
  – cc does not
* Therefore
  – C functions to be called from Fortran must add ‘_’ to their name definition
  – C programs must add ‘_’ to the name of any Fortran routines they call
* Here’s a place that nm is useful for checking actual symbolic names in object code

Solaris: Fortran → C example
          PROGRAM main
          INTEGER n, n2
          EXTERNAL csquare
          n = 9
          CALL csquare(n, n2)
          PRINT*, 'Result: ', n, ' squared = ', n2
          END

Solaris: Fortran → C example
    void csquare_(int *n, int *n2)
    {
        *n2 = (*n) * (*n);
    }

Solaris: C → Fortran example
    #include <stdio.h>

    int main(void)
    {
        extern int fsquare_(int *n);   /* Fortran function, note the ‘_’ */
        int n, n2;

        n = 9;
        n2 = fsquare_(&n);             /* scalars are passed by address */
        printf("Result: %d squared = %d\n", n, n2);
        return 0;
    }

Solaris: C → Fortran example
          INTEGER FUNCTION fsquare(n)
          INTEGER n
          fsquare = n*n
          RETURN
          END

Summary
* Understanding what your compilers are doing can be very important for HPC
* Always read up on the compilers you will use
* Know what libraries are available and when you should use them
* If writing generic, “useful” code, build it into a library so others can benefit
* If using mixed-language programs, watch for the gotchas!

Makefile practical
* An exercise in writing good Makefiles, using libraries and other good things
* Point your browser at http://www.epcc.ed.ac.uk/~softdev/day1/makeTest.html
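As a closing illustration of the array gotcha from the Fortran → C slides (reversed axes, indexing from 1), here is a minimal sketch of reading a Fortran array from C. The names fortran_elem and ctrace_ are hypothetical, and the trailing underscore assumes the Solaris naming convention described above.

```c
#include <stddef.h>

/* Element (i, j), both 1-based, of a Fortran array declared
 * INTEGER x(ni, nj), seen from C as a flat pointer. Fortran stores
 * columns contiguously, so the first index varies fastest. */
int fortran_elem(const int *x, int ni, int i, int j)
{
    return x[(size_t)(j - 1) * (size_t)ni + (size_t)(i - 1)];
}

/* Sum of the diagonal of an n-by-n Fortran array.
 * Intended to be callable from Fortran as CALL ctrace(x, n, t):
 * every argument arrives by reference, hence the pointers. */
void ctrace_(const int *x, const int *n, int *t)
{
    int k;
    *t = 0;
    for (k = 1; k <= *n; k++)
        *t += fortran_elem(x, *n, k, k);
}
```

Keeping the index arithmetic in one small function means the rest of the C code can go on using the Fortran (i, j) numbering, and nm can confirm that the object file really exports the symbol ctrace_.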