Chapter 10
The Stack

Stack data structure
Interrupt I/O
Arithmetic using a stack

College of Charleston, Computer Science
Dr. Anderson
CS 250 Comp. Org. & Assembly

Stack Data Structure
Abstract data structures
  – are defined simply by the rules for inserting and extracting data
The rule for a stack is LIFO (Last In, First Out)
  – Operations:
      Push (enter item at top of stack)
      Pop (remove item from top of stack)
  – Error conditions:
      Underflow (trying to pop from an empty stack)
      Overflow (trying to push onto a full stack)
  – We only have to keep track of the address of the top of stack (TOS)

A "physical" stack
A coin holder as a stack (figure)

A hardware stack
Implemented in hardware (e.g. registers)
  – Previous data entries move up to accommodate each new data entry
Note that the Top Of Stack is always in the same place.

A software stack
Implemented in memory
  – The Top Of Stack moves as new data is entered
Here R6 is the TOS register, a pointer to the Top Of Stack.

Push & Pop
Push
  – Decrement the TOS pointer (our stack grows toward lower addresses)
  – then write the data in R0 to the new TOS
      PUSH    ADD   R6, R6, #-1
              STR   R0, R6, #0
Pop
  – Read the data at the current TOS into R0
  – then increment the TOS pointer
      POP     LDR   R0, R6, #0
              ADD   R6, R6, #1

Push & Pop (cont.)
What if the stack is already full or empty?
  – Before pushing, we have to test for overflow
  – Before popping, we have to test for underflow
  – In both cases, we use R5 to report success or failure
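The push and pop rules above, including the overflow and underflow tests, can be modeled in a few lines of Python. This is only an illustrative sketch: the Stack class and its field names are hypothetical, and the stack limits (x3FFB to x3FFF, with R6 = x4000 meaning "empty") are taken from the LC-3 example used in this chapter.

```python
# Minimal model of the software stack (an illustrative sketch, not LC-3 code).
# Assumed: the stack occupies x3FFB..x3FFF, grows toward lower addresses,
# and an empty stack has R6 = x4000.

BASE = 0x3FFF          # first location used by the stack
MAX = 0x3FFB           # last location the stack may use
SUCCESS, FAIL = 0, 1   # values reported in R5


class Stack:
    def __init__(self):
        self.mem = {}        # sparse memory: address -> value
        self.r6 = BASE + 1   # TOS pointer; x4000 means "empty"

    def push(self, r0):
        """Return the R5 status; on success r0 is stored at the new TOS."""
        if self.r6 == MAX:       # overflow: no room left
            return FAIL
        self.r6 -= 1             # decrement the TOS pointer first ...
        self.mem[self.r6] = r0   # ... then write the data
        return SUCCESS

    def pop(self):
        """Return (R5 status, value); value is None on underflow."""
        if self.r6 == BASE + 1:  # underflow: stack is empty
            return FAIL, None
        r0 = self.mem[self.r6]   # read the data first ...
        self.r6 += 1             # ... then increment the TOS pointer
        return SUCCESS, r0
```

With these limits the stack holds five items: a sixth push reports failure, and items come back in last-in, first-out order.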
PUSH & POP in LC-3
Stack region: x3FFB (MAX) … x3FFF (BASE)

PUSH          ST    R2, Sv2         ; save R2, needed by PUSH
              ST    R1, Sv1         ; save R1, needed by PUSH
              LD    R1, MAX         ; MAX has -x3FFB
              ADD   R2, R6, R1      ; compare SP to x3FFB
              BRz   fail_exit       ; branch if stack is full
              ADD   R6, R6, #-1     ; adjust stack pointer
              STR   R0, R6, #0      ; the actual 'push'
              BRnzp success_exit

POP           ST    R2, Sv2         ; save R2, needed by POP
              ST    R1, Sv1         ; save R1, needed by POP
              LD    R1, BASE        ; BASE has -x3FFF
              ADD   R1, R1, #-1     ; R1 now has -x4000
              ADD   R2, R6, R1      ; compare SP to x4000
              BRz   fail_exit       ; branch if stack is empty
              LDR   R0, R6, #0      ; the actual 'pop'
              ADD   R6, R6, #1      ; adjust stack pointer
              BRnzp success_exit

PUSH & POP in LC-3 (cont.)

success_exit  LD    R1, Sv1         ; restore register values
              LD    R2, Sv2
              AND   R5, R5, #0      ; R5 <-- success
              RET

fail_exit     LD    R1, Sv1         ; restore register values
              LD    R2, Sv2
              AND   R5, R5, #0
              ADD   R5, R5, #1      ; R5 <-- fail
              RET

BASE          .FILL xC001           ; BASE has -x3FFF
MAX           .FILL xC005           ; MAX has -x3FFB
Sv1           .FILL x0000
Sv2           .FILL x0000

Memory-mapped I/O revisited (figure)

Interrupt-driven I/O
Just one device (figure: CPU, I/O device, and Memory, with IRQ and IACK lines)
When IRQ goes active, jump to a special memory location: the ISR, or interrupt service routine. For now, let's say it exists at address x1000.
Activate IACK to tell the device that the interrupt is being serviced, so it can stop asserting the IRQ line.
Generating the Interrupt
Using the Status Register
  – The peripheral sets a Ready bit in SR[15] (as with polling)
  – The CPU sets an Interrupt Enable bit in SR[14]
  – These two bits are ANDed to generate the interrupt request.
In this way, the CPU has the final say in who gets to interrupt it!

Processing an interrupt: one device
Device generates an IRQ
CPU signals IACK – "OK, I'm on it."
Switch to Supervisor mode
CPU saves its current state
  – What and how?
Address of the ISR is loaded into the PC
  – x1000
Continue: process the interrupt
When finished, return to the running program
  – How?

Supervisor Mode
Bit 15 of the PSR is the privilege bit (0 = Supervisor mode, 1 = User mode)
      PSR field     Bits
      Priv          15
      Priority      10–8
      N, Z, P       2, 1, 0
Only the Operating System can access device addresses
Why?

Interrupts and program state
We need to save the PC, the PSR, and all registers
  – We could require that ISRs save all relevant registers (callee save)
  – The callee would ALWAYS have to save the contents of the PC and PSR
In most computers these values (and possibly all register contents) are stored on a stack
  – Remember, there might be nested interrupts, so simply saving them to a register or a reserved memory location might not work.

The Supervisor Stack
The LC-3 has two stacks
  – The User stack
      Used by the programmer for subroutine calls and other stack functions
  – The Supervisor stack
      Used by programs in Supervisor mode (interrupt processing)
Each stack is in a separate region of memory
The stack pointer for the current stack is always R6.
  – If the current program is in privileged mode, R6 points to the Supervisor stack; otherwise it points to the User stack.
Two special-purpose registers, Saved.SSP and Saved.USP, hold the stack pointer that is not currently in use, so the two stacks can share push/pop subroutines without interfering with each other.

Saving State
When the CPU receives an INT signal …
  – If the system was previously in User mode, the User stack pointer is saved and the Supervisor stack pointer is loaded
      Saved.USP <= (R6)
      R6 <= (Saved.SSP)
  – PC and PSR are pushed onto the Supervisor stack
  – Set the system to Supervisor mode
      PSR[15] <= 0
Jump to the interrupt service routine

Processing an interrupt: details
Device generates an IRQ
CPU signals IACK – "OK, I'm on it."
CPU saves its current state
  – PC and PSR are saved on the Supervisor stack
Switch to Supervisor mode
  – Change the S bit in the PSR (PSR[15]) to 0.
Address of the ISR is loaded into the PC
  – For now we assume just one ISR, at x1000
Continue: process the interrupt
When finished, return to the running program
  – Pop the PC and PSR from the Supervisor stack

More than one device
Who sent the interrupt?
One way is to have a unified ISR that checks the status bits of every device in the system
  – This is a hybrid between interrupt-driven I/O and polling
  – Every new device requires a change to the ISR
  – The ISR will be large and complex
(Figure: CPU, Memory, and I/O devices 1–4 sharing one IRQ line)

Vectored Interrupts
If we have multiple devices, we need a very large ISR that knows how to deal with all of them!
Using vectored interrupts, we can have a different ISR for each device.
Each I/O device has a special register where it keeps a special number called the interrupt vector.
  – The vector tells the CPU where to look for the ISR.
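Looking back at the Saving State steps, the stack-pointer swap and the push and pop of PC and PSR can be sketched in Python as below. Everything named here is hypothetical and for illustration only: the CPU class, its field names, and the initial PC and stack-pointer values are assumptions, and the single ISR address x1000 is the slides' "for now" simplification. The sketch pushes PSR before PC and pops in reverse order; the slides list the two together without fixing an order.

```python
# Sketch of interrupt entry and return (hypothetical model, not LC-3 hardware).

ISR_ADDRESS = 0x1000      # single ISR, as assumed on the slides
PRIV_BIT = 1 << 15        # PSR[15]: 0 = Supervisor mode, 1 = User mode


class CPU:
    def __init__(self):
        self.mem = {}                 # sparse memory: address -> value
        self.pc = 0x3000              # assumed address of the running program
        self.psr = PRIV_BIT | 0b010   # User mode, Z condition code set
        self.r6 = 0xFE00              # assumed User stack pointer
        self.saved_usp = None         # Saved.USP
        self.saved_ssp = 0x3000       # Saved.SSP (assumed, empty stack)

    def _push(self, value):
        self.r6 -= 1
        self.mem[self.r6] = value

    def _pop(self):
        value = self.mem[self.r6]
        self.r6 += 1
        return value

    def interrupt(self):
        if self.psr & PRIV_BIT:       # previously in User mode: switch stacks
            self.saved_usp = self.r6  # Saved.USP <= (R6)
            self.r6 = self.saved_ssp  # R6 <= (Saved.SSP)
        self._push(self.psr)          # PSR and PC go on the Supervisor stack
        self._push(self.pc)
        self.psr &= ~PRIV_BIT         # PSR[15] <= 0 (Supervisor mode)
        self.pc = ISR_ADDRESS         # jump to the ISR

    def return_from_interrupt(self):
        self.pc = self._pop()         # restore in reverse order
        self.psr = self._pop()
        if self.psr & PRIV_BIT:       # returning to User mode: switch back
            self.saved_ssp = self.r6
            self.r6 = self.saved_usp
```

Because a second interrupt that arrives in Supervisor mode simply pushes another PC/PSR pair onto the same stack, nested interrupts unwind in the correct order, which is why a single reserved save location would not be enough.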
A vectored-interrupt device
Device Controller
      Address   Register
      x8000     Input register
      x8002     Output register
      x8004     Status register
      x8006     Interrupt Vector Register (contains 67)
When I trigger an interrupt, look up entry number 67 in the vector table, and jump to the address stored there.

Getting the interrupt vector
(Figure: CPU, Memory, and I/O devices 1–4, with a shared IRQ line and an INTA line)
INTA tells a device to put its interrupt vector on the bus
INTA is daisy-chained so only one device will respond

Initial state of the ISR
Vectored interrupts
  – Along with the INT signal, the I/O device transmits an 8-bit vector (INTV).
  – If the interrupt is accepted, INTV is expanded to a 16-bit address:
      The Interrupt Vector Table resides in locations x0100 to x01FF and holds the starting addresses of the various Interrupt Service Routines (similar to the Trap Vector Table and the Trap Service Routines).
      INTV is an index into the Interrupt Vector Table, i.e. the starting address of the relevant ISR is the contents of location x0100 + ZEXT(INTV).
  – The address of the ISR is loaded into the PC
  – The PSR is set as follows:
      PSR[15] <= 0 (Supervisor mode)
      PSR[2:0] <= 000 (no condition codes set)
Now we wait while the interrupt is processed
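The vector lookup described above amounts to one indexed memory read. The Python sketch below assumes a dictionary standing in for memory; the function name isr_address, the reading of the example vector 67 as a decimal number, and the ISR address x2000 are all made up for illustration.

```python
IVT_BASE = 0x0100   # Interrupt Vector Table occupies x0100..x01FF


def isr_address(memory, intv):
    """Return the ISR starting address selected by an 8-bit vector INTV."""
    if not 0 <= intv <= 0xFF:
        raise ValueError("INTV must fit in 8 bits")
    table_entry = IVT_BASE + intv   # x0100 + ZEXT(INTV)
    return memory[table_entry]      # the table entry holds the ISR's address


# Hypothetical setup: the device's vector is 67 and its ISR starts at x2000.
memory = {IVT_BASE + 67: 0x2000}
pc = isr_address(memory, 67)        # value that would be loaded into the PC
```

Adding a device then means filling in one more table entry rather than editing a shared ISR.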
Interrupt sequence: >1 device
Device generates an IRQ
CPU switches to SSP if necessary (hardware)
Current PC and PSR are saved to the Supervisor stack (hardware)
Switch to Supervisor mode (S = 0; hardware)
CPU sends IACK, which is daisy-chained to the device (hardware)
Device sends its vector number (hardware)
Vector is looked up in the interrupt vector table, and the address of the ISR is loaded into the PC (hardware)
ISR saves any registers that it will use (software)
ISR runs, then restores register values (software)
ISR executes the RTI instruction, which restores PSR and PC (software)
  – Note that this restores the previous Supervisor/User mode

Multiple devices: priority
What happens if another interrupt occurs while the system is processing an interrupt?
Can devices be "starved" in this system?

Priority
Each task has an assigned priority level
  – The LC-3 has levels PL0 (lowest) to PL7 (highest).
  – If a higher-priority task requests access, then a lower-priority task will be suspended.
Likewise, each device has an assigned priority
  – The highest-priority interrupt is passed on to the CPU only if it has higher priority than the currently executing task.
If an INT is present at the start of the instruction cycle, then an extra step is inserted:
  – The CPU saves its state information so that it can later return to the current task.
  – The PC is loaded with the starting address of the Interrupt Service Routine.
  – The FETCH phase of the cycle continues as normal.

Priority of the current program
Remember those extra bits in the PSR?
      PSR field     Bits
      Priv          15
      Priority      10–8
      N, Z, P       2, 1, 0

Device Priority (figure)
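The priority rule above (a request reaches the CPU only if its level is higher than that of the running program) can be written as a small check against PSR[10:8]. This is a sketch in Python; the function names are hypothetical.

```python
def current_priority(psr):
    """Extract the priority level (PL0..PL7) from PSR[10:8]."""
    return (psr >> 8) & 0b111


def accepts(psr, device_pl):
    """True if a request at level device_pl may interrupt the running program."""
    return device_pl > current_priority(psr)


# Example: a User-mode program (PSR[15] = 1) running at PL4.
psr = (1 << 15) | (4 << 8)
higher = accepts(psr, 5)   # a PL5 device may interrupt
equal = accepts(psr, 4)    # a PL4 device must wait
```

The comparison is strict, so a device at the same level as the running program does not interrupt it.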
Returning from the Interrupt
The last instruction of an ISR is RTI (ReTurn from Interrupt)
  – Return from Interrupt (opcode 1000)
  – Pops PSR and PC from the Supervisor stack
  – Restores the condition codes from the PSR
  – If necessary (i.e. if the restored privilege mode is User), restores the User stack pointer to R6 from Saved.USP
Essentially this restores our program to exactly the state it had prior to the interrupt
  – The program continues running as if nothing had happened!
How does this enable multiprogramming environments?

The entire interrupt sequence
Device generates an IRQ at a specific PL
IF requesting PL > current process priority:
  – CPU switches to SSP if necessary (hardware)
  – Current PC and PSR are saved to the Supervisor stack (hardware)
  – Switch to Supervisor mode (S = 0; hardware)
  – Set process priority to the requested interrupt PL (hardware)
  – CPU sends IACK, which is daisy-chained to the device (hardware)
  – Device sends its vector number (hardware)
  – Vector is looked up in the interrupt vector table, and the address of the ISR is loaded into the PC (hardware)
  – ISR saves any registers that it will use (software)
  – ISR runs, then restores register values (software)
  – ISR executes the RTI instruction, which restores PSR and PC (software)
Note that this restores the previous Supervisor/User mode and process priority

Execution flow for a nested interrupt (figure)

Supervisor Stack & PC during INT (figure)

Interrupts: Not just for I/O
Interrupts are also used for:
  – Errors (divide by zero, etc.)
  – TRAPs
  – Operating system events (quanta for multitasking, etc.)
  – User-generated events (Ctrl-C, Ctrl-Z, Ctrl-Alt-Del, etc.)
  – …and more.

DMA – Direct Memory Access
DMA
  – A device specialized in transferring data between memory and an I/O device (e.g. a disk).
  – The CPU writes the source and destination starting addresses and the size of the region of memory to be copied.
  – DMA does the transfer in the background.
  – It accesses the memory only when the CPU is not accessing it (cycle stealing).
(Figure: CPU, memory, DMA, and I/O device)

Stack-based instruction sets
Three-address vs zero-address
  – The LC-3 explicitly specifies the location of each operand: it is a three-address machine
      e.g. ADD R0, R1, R2
  – Some machines use a stack data structure for all temporary data storage: these are zero-address machines
      The instruction ADD would simply pop the top two values from the stack, add them, and push the result back on the stack
Most calculators use a stack to do arithmetic; most general-purpose microprocessors use a register bank
Two-address machines
  – This has nothing to do with stacks… but the x86 is a two-address machine
  – The DR is always SR1
  – So ADD R0, R1 in x86 is equivalent to ADD R0, R0, R1 in LC-3
  – Implications?

Practice problems
10.8, 10.10 (this is a good one!), 10.12 (long, but good), 10.13 (also long, but good)
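To make the zero-address idea from the stack-based instruction sets slide concrete, here is a small Python sketch of a stack machine. The run function and the tuple-based instruction format are hypothetical, and MUL is added alongside the slides' ADD only so that the example expression is more interesting.

```python
def run(program):
    """Execute a list of zero-address instructions and return the final stack."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "PUSH":                # the only instruction with an operand
            stack.append(instr[1])
        elif op in ("ADD", "MUL"):      # operands are implicit: top two items
            if len(stack) < 2:
                raise IndexError("stack underflow")
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b if op == "ADD" else a * b)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack


# (2 + 3) * 4 written as a zero-address program
result = run([("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)])
```

Note that ADD and MUL name no registers at all, whereas the three-address LC-3 form ADD R0, R1, R2 names every operand explicitly.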