COP4020 Programming Languages
Compilation and Interpretation
Prof. Xin Yuan

Overview
• Compilation and interpretation
• Virtual machines
• Static linking and dynamic linking
• Compiler in action (g++)
• Integrated development environments

3/19/2016 COP4020 Spring 2014

Compilation and interpretation
• A program written in a high-level language can run in two ways:
  • Compiled into a program in the native machine language and then run on the target machine
  • Directly interpreted, with the execution simulated within an interpreter
• Example: how can the following statement be executed?
    A[i][j] = 1;
• Approach 1:
  • Create a software environment that understands two-dimensional arrays (and the language)
  • To execute the statement, just put 1 in the array entry A[i][j]
  • This is interpretation, since the software environment understands the language and performs the operations specified by interpreting the statements
• Approach 2:
  • Translate the statements into native machine code (or assembly language) and then run the program
  • This is compilation
  • g++ produces the following assembly for this statement:
      salq  $2, %rax
      addq  %rcx, %rax
      leaq  0(,%rax,4), %rdx
      addq  %rdx, %rax
      salq  $2, %rax
      addq  %rsi, %rax
      movl  $1, A(,%rax,4)

Compilation and interpretation
• How is a C++ program executed on linprog?
    g++ try.cpp      compiles the program into machine code
    ./a.out          runs the machine code
• How is a Python program executed?
    python try.py    the program just runs, no compilation phase
  • The program python is the software environment that understands the Python language
  • The program try.py is executed (interpreted) within that environment
• In general, which approach is more efficient?
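As a rough reading of the g++ assembly shown earlier for A[i][j] = 1, the shifts and adds appear to compute the row-major element index 100*i + j before the final store. The sketch below replays that arithmetic step by step in C++. It assumes (the slides do not say this) that %rax and %rcx both start out holding i, that %rsi holds j, and that A is an int array with 100 columns. The names flat_index, row_major_index, check_indices, and kCols are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: A has 100 columns (inferred from the constants in the assembly).
constexpr std::int64_t kCols = 100;

// Replays the shift/add sequence; each line names the instruction it mirrors.
std::int64_t flat_index(std::int64_t i, std::int64_t j) {
    assert(i >= 0 && j >= 0 && j < kCols);
    std::int64_t rax = i;        // assumed initial register contents
    std::int64_t rcx = i;
    std::int64_t rsi = j;
    rax <<= 2;                   // salq $2, %rax            -> 4*i
    rax += rcx;                  // addq %rcx, %rax          -> 5*i
    std::int64_t rdx = rax * 4;  // leaq 0(,%rax,4), %rdx    -> 20*i
    rax += rdx;                  // addq %rdx, %rax          -> 25*i
    rax <<= 2;                   // salq $2, %rax            -> 100*i
    rax += rsi;                  // addq %rsi, %rax          -> 100*i + j
    return rax;                  // movl $1, A(,%rax,4) then stores at A + 4*rax
}

// The same index written the way a programmer would state it.
std::int64_t row_major_index(std::int64_t i, std::int64_t j) {
    return i * kCols + j;
}

// Checks that both computations agree on a few sample indices.
void check_indices() {
    assert(flat_index(0, 0) == row_major_index(0, 0));
    assert(flat_index(3, 7) == row_major_index(3, 7));
    assert(flat_index(42, 99) == row_major_index(42, 99));
}
```

The final movl scales the index by 4, the size of an int, so under these assumptions the store lands at byte offset 4*(100*i + j) from the start of A.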
Compilation and interpretation
In general, which approach is more efficient?
    A[i][j] = 1;
• Compilation:
    salq  $2, %rax
    addq  %rcx, %rax
    leaq  0(,%rax,4), %rdx
    addq  %rdx, %rax
    salq  $2, %rax
    addq  %rsi, %rax
    movl  $1, A(,%rax,4)
• Interpretation:
  • Create a software environment that understands the language
  • Put 1 in the array entry A[i][j]
• For the machine to put 1 in the array entry A[i][j], that code sequence still needs to be executed.
• Most interpreters do a little more than the bare-bones “real work.”
• Compiled code is therefore generally more efficient than interpreting the same program.
• Interpretation provides more functionality, e.g. for debugging: one can modify the value of a variable during execution.
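The extra work an interpreter does can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not how any particular interpreter is built: the Environment struct, interpret_store, compiled_store, demo, and the 4x4 array size are all hypothetical. The interpreted path has to look up A, i, and j by name and check bounds at run time, while the compiled path reduces to the address arithmetic and the store.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

constexpr int kRows = 4;  // hypothetical array size for the demo
constexpr int kCols = 4;

// Hypothetical interpreter state: every variable is found by name at run time.
struct Environment {
    std::map<std::string, int> scalars;
    std::map<std::string, std::vector<std::vector<int>>> arrays;
};

// Interprets "array[row_var][col_var] = value": three name lookups and two
// bounds checks (at() throws on failure) happen before the actual store.
void interpret_store(Environment& env, const std::string& array,
                     const std::string& row_var, const std::string& col_var,
                     int value) {
    int i = env.scalars.at(row_var);
    int j = env.scalars.at(col_var);
    std::vector<std::vector<int>>& a = env.arrays.at(array);
    a.at(i).at(j) = value;
}

// Compiled-style path: the locations of a, i, and j were fixed at compile
// time, so only the index arithmetic and the store remain.
void compiled_store(int (&a)[kRows][kCols], int i, int j) {
    a[i][j] = 1;
}

// Runs both paths and checks that they store the same value.
void demo() {
    Environment env;
    env.scalars["i"] = 1;
    env.scalars["j"] = 2;
    env.arrays["A"] = std::vector<std::vector<int>>(kRows, std::vector<int>(kCols, 0));
    interpret_store(env, "A", "i", "j", 1);

    int compiled[kRows][kCols] = {};
    compiled_store(compiled, 1, 2);

    assert(env.arrays.at("A")[1][2] == 1);
    assert(compiled[1][2] == 1);
}
```

The same lookups that slow the interpreted path down are what make it easy to inspect or change a variable by name while the program runs.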
Compilation
• Compilation is the conceptual process of translating source code into CPU-executable binary target code
• The compiler runs on the same platform X as the target code
  Source Program -> Compiler -> Target Program   (compile on X, debug on X)
  Input -> Target Program -> Output              (run on X)

Cross Compilation
• The compiler runs on platform X; the target code runs on platform Y
  Source Program -> Cross Compiler -> Target Program   (compile on X; debug on X by emulating Y)
  Copy the target program to Y
  Input -> Target Program -> Output                    (run on Y)

Interpretation
• Interpretation is the conceptual process of running high-level code by an interpreter
  Source Program + Input -> Interpreter -> Output

Compilers versus Interpreters
• Compilers “try to be as smart as possible”: they fix every decision that can be made at compile time, so that no code has to be generated to make that decision at run time
  • Type checking at compile time vs. run time
  • Static allocation
  • Static linking
  • Code optimization
• Compilation leads to better performance in general
  • Allocation of variables without variable lookup at run time
  • Aggressive code optimization to exploit hardware features

Compilers versus Interpreters
Benefit of interpretation?
• Interpretation facilitates interactive debugging and testing
  • Interpretation leads to better diagnostics of a programming problem
  • Procedures can be invoked from the command line by a user
  • Variable values can be inspected and modified by a user
• Some programming languages cannot be purely compiled into machine code alone
  • Some languages allow programs to rewrite/add code to the code base dynamically
  • Some languages allow programs to translate data to code for execution (interpretation)

Compilers versus Interpreters
• The distinction between compiler and interpreter implementations is often fuzzy
  • One can view an interpreter as a virtual machine that executes high-level code
  • Java is compiled to bytecode
  • Java bytecode is interpreted by the Java virtual machine (JVM) or translated to machine code by a just-in-time (JIT) compiler
  • A processor (CPU) can be viewed as a hardware implementation of a virtual machine (e.g. bytecode can be executed in hardware)

Virtual Machines
• A virtual machine executes an instruction stream in software
• Adopted by Pascal, Java, Smalltalk-80, C#, functional and logic languages, and some scripting languages
  • Pascal compilers generate P-code that can be interpreted or compiled into object code
  • Java compilers generate bytecode that is interpreted by the Java virtual machine (JVM)
  • The JVM may translate bytecode into machine code by just-in-time (JIT) compilation

Compilation and Execution on Virtual Machines
• The compiler generates an intermediate program
• The virtual machine interprets the intermediate program
  Source Program -> Compiler -> Intermediate Program          (compile on X)
  Intermediate Program + Input -> Virtual Machine -> Output   (run on VM; run on X, Y, Z, …)

Pure Compilation and Static Linking
• Adopted by the typical Fortran systems
• Library routines are separately linked (merged) with the object code of the program
  Source Program -> Compiler -> Incomplete Object Code -> Linker
  Static Library Object Code -> Linker
  Linker -> Binary Executable
  (diagram annotations: “extern printf();” on the program side; “_printf _fget _fscan …” on the library side)

Compilation, Assembly, and Static Linking
• Facilitates debugging of the compiler
  Source Program -> Compiler -> Assembly Program (extern printf();) -> Assembler -> Linker -> Binary Executable
  Static Library Object Code (_printf, _fget, _fscan, …) -> Linker

Compilation, Assembly, and Dynamic Linking
• Dynamic libraries (DLL, .so, .dylib) are linked at run time by the OS (via stubs in the executable)
  Source Program -> Compiler -> Assembly Program (extern printf();) -> Assembler -> Incomplete Executable
  Input -> Incomplete Executable + Shared Dynamic Libraries (_printf, _fget, _fscan, …) -> Output

Static linking and dynamic linking in action
• Try ‘g++ -static a1.cpp’ and ‘g++ a1.cpp’
• Try to run the program on linprog and cetus
• What are the relative executable file sizes for static and dynamic linking?
• Why is dynamic linking now much more common?
• Which linking mechanism is more portable?

Preprocessing
• Most C and C++ compilers use a preprocessor to import header files and expand macros (try ‘cpp a.cpp’ and ‘cpp a1.cpp’)
  Source Program -> Preprocessor -> Modified Source Program -> Compiler -> Assembly or Object Code
  Source program:
      #include <stdio.h>
      #define N 99
      …
      for (i=0; i<N; i++)
  Modified source program:
      for (i=0; i<99; i++)

The CPP Preprocessor
• Early C++ compilers used a C++ preprocessor (a C++-to-C translator, not the cpp macro preprocessor of the previous slide) to generate C code for compilation
  C++ Source Code -> C++ Preprocessor -> C Source Code -> C Compiler -> Assembly or Object Code

g++ phases
• When run, g++ invokes the preprocessor, compiler, assembler, and linker
• Try ‘g++ -v a.cpp’ to see the commands executed
• Stop after preprocessing: ‘g++ -v -E a1.cpp’
• Stop after compilation (assembly code): ‘g++ -v -S a1.cpp’

Integrated Development Environments
• Programming tools function together in concert
  • Editors
  • Compilers/preprocessors/interpreters
  • Debuggers
  • Emulators
  • Assemblers
  • Linkers
• Advantages
  • Tools and compilation stages are hidden
  • Automatic source-code dependency checking
  • Debugging made simpler
  • Editor with search facilities
• Examples
  • Smalltalk-80, Eclipse, MS Visual Studio, Borland
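Looking back at the virtual machine slides (a compiler generates an intermediate program, and a virtual machine interprets it), the idea can be sketched as a tiny stack machine in C++. This is a minimal sketch under stated assumptions: the instruction set (Push, Add, Mul, Halt) and the names Instr, run, and sample_program are hypothetical and are not P-code or Java bytecode.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction set for the intermediate program.
enum class Op { Push, Add, Mul, Halt };

struct Instr {
    Op op;
    int operand;  // used only by Push
};

// The "virtual machine": executes the instruction stream in software and
// returns the value left on top of the stack.
int run(const std::vector<Instr>& program) {
    std::vector<int> stack;
    for (std::size_t pc = 0; pc < program.size(); ++pc) {
        const Instr& in = program[pc];
        switch (in.op) {
        case Op::Push:
            stack.push_back(in.operand);
            break;
        case Op::Add:
        case Op::Mul: {
            assert(stack.size() >= 2);
            int rhs = stack.back(); stack.pop_back();
            int lhs = stack.back(); stack.pop_back();
            stack.push_back(in.op == Op::Add ? lhs + rhs : lhs * rhs);
            break;
        }
        case Op::Halt:
            assert(!stack.empty());
            return stack.back();
        }
    }
    assert(!stack.empty());
    return stack.back();
}

// What a compiler might emit for the source expression (2 + 3) * 4.
std::vector<Instr> sample_program() {
    return {{Op::Push, 2}, {Op::Push, 3}, {Op::Add, 0},
            {Op::Push, 4}, {Op::Mul, 0}, {Op::Halt, 0}};
}
```

Calling run(sample_program()) evaluates (2 + 3) * 4. The same instruction stream would run unchanged on any platform that has an implementation of run, which is the portability point of the "run on X, Y, Z, …" diagram.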