COP4020 Programming Languages
Subroutines and Parameter Passing
Prof. Xin Yuan
5/29/2016, COP4020 Spring 2014

Today's topics
- Implementing routine calls
- Calling sequences
- Hardware/language support for efficient execution of subroutines

Subroutine frame (activation record)
- The activation record (subroutine frame) stores all information related to the execution of a subroutine.
- Before a subroutine is executed, the frame must be set up and some of its fields must be initialized.
  - Formal arguments must be replaced with actual arguments.
  - This is done in the calling sequence, a sequence of instructions executed before and after a subroutine, to set up the frame.
- Figure (frame contents): temporary storage (e.g. for expression evaluation), local variables, bookkeeping (e.g. saved CPU registers), return address, subroutine arguments and returns.

Stack layout
- With stack allocation, the frame of the currently executing subroutine is on top of the stack.
  - sp: top of the stack
  - fp: an address within the top frame
- To access non-local variables, a static link (for static scoping) or a dynamic link (for dynamic scoping) is maintained in the frame.

Calling sequences
- Calling sequences maintain the subroutine call stack.
- Calling sequences in general have three components:
  - The code executed by the caller immediately before and after a subroutine call
  - Prologue: code executed at the beginning of the subroutine
  - Epilogue: code executed at the end of the subroutine
- Figure: for a call such as foo(100+I, 20+j), the caller contains calling sequence code, the call to foo, and more calling sequence code; foo contains prologue code, the body of foo, and epilogue code.

Calling sequences
- What needs to be done in the calling sequence?
- Figure (frame contents): temporary storage (e.g. for expression evaluation), local variables, bookkeeping (e.g.
saved CPU registers), return address, subroutine arguments and returns.
- Before the subroutine code can be executed (caller code before the routine + prologue): set up the subroutine frame
  - Compute and pass the parameters
  - Save the return address
  - Save registers
  - Change sp and fp (to add a frame for the subroutine)
  - Change pc (to start running the subroutine code)
  - Execute initialization code when needed

Calling sequences
- What needs to be done in the calling sequence?
- After the subroutine code is executed (caller code after the routine + epilogue): remove the subroutine frame
  - Pass the return result or function value
  - Run finalization code for local objects
  - Deallocate the stack frame (restore fp and sp to their previous values)
  - Restore saved registers and pc
- Some of these operations must be performed by the caller; others can be done by either the caller or the callee.

Saving and restoring registers: the problem
- Some registers, such as sp and fp, clearly hold different values inside and outside a subroutine.
- General-purpose registers may be used by the caller, the callee, or both, so the callee can overwrite a value the caller still needs.

    main() {
        …
        R1 = 10
        …
        call foo();
        …
        R2 = R1
    }

    foo() {
        R1 = 20
    }

- To execute correctly, R1 needs to be saved before foo() and restored after.

Saving and restoring registers: the solution
- Solution 1: save/restore in the calling sequence at the caller

    main() {
        …
        R1 = 10
        …
        T1 = R1        // calling sequence: save
        call foo();
        R1 = T1        // calling sequence: restore
        …
        R2 = R1
    }

- Solution 2: save/restore in the prologue and epilogue at the callee

    foo() {
        T1 = R1        // prologue: save
        …
        R1 = 20
        …
        R1 = T1        // epilogue: restore
        return
    }

Saving and restoring registers
- The compiler should generate code to save and restore only the registers that matter.
- Ideally, we should save only registers that are used in both the caller and the callee.
  - Example: if a subroutine does not use R3, R3 does not need to be saved in the calling sequence.
  - Difficult due to separate compilation: no information about the callee is available when compiling the caller, and vice versa.
- Simple solution (with unnecessary save/restore):
  - Option 1: the caller saves/restores all registers it uses
  - Option 2: the callee saves/restores all registers it uses
- Compromise solution: partition the registers into two sets, one caller-saved and one callee-saved.

A typical calling sequence
- Caller before the call:
  1. Saves any caller-saves registers whose values will be needed after the call.
  2. Computes the values of the arguments and moves them onto the stack or into registers.
  3. Computes the static link and passes it as an extra hidden argument.
  4. Uses a special subroutine call instruction to jump to the subroutine, simultaneously passing the return address on the stack or in a register.
- Prologue in the callee:
  1. Allocates a frame (sp = sp - offset).
  2. Saves the old fp onto the stack and updates fp.
  3. Saves callee-saves registers that may be overwritten in the routine.
- Epilogue in the callee:
  1. Moves the return value into a register or a location on the stack.
  2. Restores callee-saves registers.
  3. Restores fp and sp.
  4. Jumps back to the return address.
- Caller after the call:
  1. Moves the return value to wherever it is needed.
  2. Restores caller-saves registers.

A question
- Why do local variables typically not have a default value (while globals do)?

    #include <iostream>
    using namespace std;

    int I;              // global variable

    int main() {
        int j;          // local variable, never assigned
        cout << I << j;
    }

Hardware support for efficient subroutine execution
- Calling sequences add overhead to every subroutine call.
- Register windows
  - Introduced in Berkeley RISC machines
  - Also used in Sun SPARC and Intel Itanium processors
  - Basic idea:
    - Maintain multiple sets (windows) of registers.
    - Use a new mapping (set) of registers when making a subroutine call.
    - Setting and resetting a mapping is cheaper than saving and restoring registers.
    - The new and old register mappings overlap to allow parameter passing.

Language support for efficient subroutine execution
- In-line functions

    inline int max(int a, int b) { return a > b ? a : b; }

- How is it different from ordinary functions?
  - Such functions are not real functions; the routine body is expanded in-line at the point of call.
  - A copy of the routine body becomes part of the caller.
  - No actual routine call occurs.
- Will an inline function always improve performance?
  - Maybe, maybe not.
  - Many other factors matter, e.g. code size affecting cache/memory behavior.
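
The expansion described above can be sketched by hand. This is only an illustration of what a compiler may do, not what any particular compiler is guaranteed to produce: `inline` is a request, and the compiler decides whether to honor it. The names `max_inline`, `largest_of_three`, `largest_of_three_expanded`, and `check_inlining_sketch` are hypothetical and introduced here for the example; only the body of the max function comes from the slide.

```cpp
#include <cassert>

// Inline candidate; body taken from the slide, renamed (hypothetical name)
// so it cannot be confused with std::max.
inline int max_inline(int a, int b) { return a > b ? a : b; }

// Caller as the programmer writes it: two calls to the inline function.
int largest_of_three(int x, int y, int z) {
    return max_inline(max_inline(x, y), z);
}

// Hand-written approximation of the caller after in-line expansion:
// a copy of the routine body replaces each call, so no calling
// sequence (argument passing, frame setup, return) is executed.
int largest_of_three_expanded(int x, int y, int z) {
    int t = x > y ? x : y;   // first expanded copy of the body
    return t > z ? t : z;    // second expanded copy of the body
}

// Small self-check: both versions must compute the same result.
void check_inlining_sketch() {
    assert(largest_of_three(1, 5, 3) == 5);
    assert(largest_of_three_expanded(1, 5, 3) == 5);
}
```

Note that the expanded caller contains two copies of the body. That is the code-size cost mentioned above: each call site grows, which can hurt instruction-cache behavior even though the call overhead disappears.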