LECTURE 16
Intermediate Code

INTERMEDIATE CODE
• Generating a machine-independent intermediate form decouples the back end from the front end and facilitates retargeting.
• Machine-independent code optimizations can be applied here.

Compiler phases: Parser → Static Semantic Analysis → Intermediate Code Generation → Machine-Independent Code Optimization → Target Code Generation → Machine-Specific Code Optimization

INTERMEDIATE LANGUAGES
Many kinds, for different purposes.
• High-level representation, for source-to-source translation that keeps the program structure.
  • Abstract syntax tree (AST)
• Low-level representation, for compiling to a target machine.
  • The intermediate form is close to low-level machine language.
  • Three-address code (more on this later).
  • gcc uses RTL, a variation of three-address code.
• Other commonly used intermediate representations
  • Control flow graph, program dependence graph (PDG), directed acyclic graph (DAG).

THREE-ADDRESS CODE
A sequence of statements of the form

    x = y op z

Three-address statements closely resemble assembly statements (OP src1 src2 dst).

Example: a = b * -c + b * -c

    t1 = -c
    t2 = b * t1
    t3 = -c
    t4 = b * t3
    t5 = t2 + t4
    a = t5

or, computing the common subexpression b * -c only once,

    t1 = -c
    t2 = b * t1
    t3 = t2 + t2
    a = t3

Some three-address statements that will be used later:

Assignment statements
• With a binary operation: x = y op z
• With a unary operation: x = op y
• With no operation (copy): x = y

Branch statements
• Unconditional jump: goto L
• Conditional jump: if x relop y goto L

Statements for procedure calls
• param x: set a parameter for a procedure call
• call p, n: call procedure p with n parameters
• return y: return from a procedure with return value y

Instructions for the procedure call p(x1, x2, x3, …, xn):

    param x1
    param x2
    …
    param xn
    call p, n

Indexed assignments: x = y[i] and x[i] = y

Address and pointer assignments: x = &y, x = *y
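
The translation of a = b * -c + b * -c and the param/call sequence for p(x1, …, xn) can be sketched in a few lines of Python. This is only a minimal illustration, not the method used by any particular compiler. It assumes expressions arrive as nested tuples such as ("*", "b", ("neg", "c")). The names gen_tac, gen_call, and the reuse flag are hypothetical and chosen just for this example.

```python
from itertools import count


def gen_tac(target, expr, reuse=False):
    """Return three-address statements that assign expr to target.

    expr is either a variable name (str) or a nested tuple:
    ("neg", e) for unary minus, or (op, left, right) for a binary op.
    This tuple encoding is an assumption made for the sketch.
    """
    code = []
    temps = count(1)   # numbering for fresh temporaries t1, t2, ...
    cache = {}         # subexpression -> temporary that already holds it

    def visit(node):
        if isinstance(node, str):       # a plain variable needs no code
            return node
        if reuse and node in cache:     # common subexpression: reuse its temp
            return cache[node]
        if node[0] == "neg":
            rhs = f"-{visit(node[1])}"
        else:
            left = visit(node[1])
            right = visit(node[2])
            rhs = f"{left} {node[0]} {right}"
        temp = f"t{next(temps)}"
        code.append(f"{temp} = {rhs}")
        cache[node] = temp
        return temp

    result = visit(expr)
    code.append(f"{target} = {result}")
    return code


def gen_call(proc, args):
    """Return the param/call sequence for the call proc(args...)."""
    return [f"param {a}" for a in args] + [f"call {proc}, {len(args)}"]


if __name__ == "__main__":
    product = ("*", "b", ("neg", "c"))
    expr = ("+", product, product)          # b * -c + b * -c
    print("\n".join(gen_tac("a", expr)))              # six statements, t1..t5
    print("\n".join(gen_tac("a", expr, reuse=True)))  # four statements, t1..t3
    print("\n".join(gen_call("p", ["x1", "x2", "x3"])))
```

With reuse=False the sketch emits one temporary per operator, which gives the longer sequence in the example. With reuse=True it remembers each subexpression it has already computed, which gives the shorter sequence and is the idea behind the DAG representation.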