CSE 243: Introduction to Computer Architecture and Hardware/Software Interface
Fall 2002
Swapna S. Gokhale

Solved example problems.

1. Registers R1 and R2 contain the values 1800 and 3800, respectively. The word length of the processor is 4 bytes. What is the effective address of the memory operand in each of the following cases?
   a) ADD 100(R2), R6
   b) LOAD R6, 20(R1,R2)
   c) STORE -(R2), R6
   d) SUBTRACT (R2)+, R6

Solution:
   a) Effective address = 100 + Contents of R2 = 100 + 3800 = 3900.
   b) Effective address = 20 + Contents of R1 + Contents of R2 = 20 + 1800 + 3800 = 5620.
   c) Effective address = Contents of R2 - 4 = 3800 - 4 = 3796. (In the autodecrement mode, R2 is decremented by the word length before it is used as the address.)
   d) Effective address = Contents of R2 = 3800. (In the autoincrement mode, R2 is incremented only after it has been used as the address.)

2. Consider two vectors A and B. The number of elements in each vector is N. The dot product of the two vectors is defined as:

$$D = \sum_{i=0}^{N-1} A(i) \times B(i)$$

where A(i) denotes the ith element of vector A, and B(i) denotes the ith element of vector B. Write an assembly language program to compute the dot product of the two vectors A and B. The first element of vector A is stored at the memory location vector_A, the first element of vector B is stored at the memory location vector_B, and the number of elements in each vector is stored at memory location N. Store the dot product in location DOTPROD. The word length of the processor is 4 bytes. The processor allows only two-operand instructions, and in a two-operand instruction only one of the operands may be in memory. The other operand must be in a processor register.

Solution:
        MOVE      #vector_A, R1     # Move the address of vector A to R1.
        MOVE      #vector_B, R2     # Move the address of vector B to R2.
        MOVE      N, R3             # Move the number of elements in each vector to R3.
        CLEAR     R0                # Clear the contents of register R0.
LOOP:   MOVE      (R1), R4          # Move one of the operands to a processor register.
        MULTIPLY  (R2), R4          # Multiply the ith element of A with the ith element of B.
        ADD       R4, R0            # Accumulate the sum.
        ADD       #4, R1            # Get the next element in A.
        ADD       #4, R2            # Get the next element in B.
        DECREMENT R3                # Decrement the counter.
        BRANCH>0  LOOP              # Repeat the loop while the counter is greater than 0.
        MOVE      R0, DOTPROD       # Store the result in DOTPROD.

3. Compute the number of memory accesses for the program in problem 2. Assume that each instruction occupies one word. Express the number of memory accesses in the form k1 + k2*N, where k1 is the fixed number of memory accesses independent of N, and k2 is the number of memory accesses in each pass through the loop.

Solution: Since each instruction occupies a single word, one memory access is needed to fetch each instruction. If the instruction involves a memory operand as a source or as a destination, then one additional memory access is needed to read or write the operand. If the instruction involves a memory operand both as a source and as a destination, then two additional memory accesses are needed.

                                    Memory accesses
        MOVE      #vector_A, R1     1 (k1)
        MOVE      #vector_B, R2     1 (k1)
        MOVE      N, R3             2 (k1)
        CLEAR     R0                1 (k1)
LOOP:   MOVE      (R1), R4          2 (k2)
        MULTIPLY  (R2), R4          2 (k2)
        ADD       R4, R0            1 (k2)
        ADD       #4, R1            1 (k2)
        ADD       #4, R2            1 (k2)
        DECREMENT R3                1 (k2)
        BRANCH>0  LOOP              1 (k2)
        MOVE      R0, DOTPROD       2 (k1)

The memory accesses that belong to k1 and to k2 have been indicated. Thus, k1 = 1 + 1 + 2 + 1 + 2 = 7 and k2 = 2 + 2 + 1 + 1 + 1 + 1 + 1 = 9, so the program makes 7 + 9*N memory accesses in total.

4. For the following 68000 instructions, determine how many bytes each instruction occupies. How many memory accesses do the fetching and the execution of each instruction require?
   a) SUB.L D1, (A1)
   b) SUB.L (A2,D1), D1
   c) SUB.L #$1234, (A1)

Solution: Referring to Page 766, in the 68000 the OP code of each instruction occupies 16 bits. If the instruction has an immediate operand, and if the operand is 8 or 16 bits, then the instruction occupies one extension word in addition to the OP code word. If the immediate operand is 32 bits, then it occupies two extension words, that is, the 2nd and 3rd words of the instruction. For the indexed and relative addressing modes, the required index value (either as an immediate operand, or in a register) is given in the word that follows the OP code.
That is, an instruction that uses the indexed or the relative addressing mode occupies two 16-bit words.

The number of memory accesses required to fetch an instruction is equal to the number of words in the instruction. The number of memory accesses required to execute an instruction depends on the size of the operand and on whether the memory operand is used as a source, as a destination, or as both. The 68000 transfers 16 bits per memory access, so a long word takes two accesses to read or to write. If the operand size is a long word and the memory operand is a source operand, two memory accesses are required. If the operand size is a long word and the memory operand is a destination operand, two memory accesses are required. If the operand size is a long word and the memory operand is both a source and a destination operand, four memory accesses are required.

   a) This instruction occupies 2 bytes (one word), so one memory access is needed to fetch the instruction. Its memory operand (A1) serves as both a source and the destination, and the operand size is a long word, so four memory accesses are needed to execute the instruction.
   b) This instruction occupies 4 bytes, or two 16-bit words, since it uses the indexed addressing mode. As a result, two memory accesses are required to fetch the instruction. It uses one memory operand as a source operand and the operand size is a long word, so two memory accesses are required to execute it.
   c) This instruction occupies 6 bytes, or three 16-bit words, since it uses an immediate operand of length 32 bits. Three memory accesses are needed to fetch the instruction. It uses a memory operand as both a source and a destination and the operand size is a long word, so four memory accesses are required to execute it.

Note: Please do not confuse 16-bit and 24-bit addressing. It is sufficient to identify the number of memory accesses needed to fetch and to execute each instruction.

5. What does this program in 68000 assembly language do?
        MOVEA.L   #LIST, A0
LOOP:   MOVE.L    (A0), D1
        ADDI      #4, A0
        CMPI      #20, D1
        BLT       LOOP
        SUBI      #4, A0
        MOVE.L    (A0), RSLT

Solution: In order to determine what task an assembly language program performs, it usually helps to determine the task performed by each instruction. It then helps to consider a small example and determine what the program produces as output.

In the above program, the first instruction initializes the address register A0 with the address of the memory location labeled LIST. The second instruction moves the number that A0 points to (initially the first element at LIST) into D1, and the third instruction increments A0 by 4 so that it points to the next number in the list. The next instruction compares the contents of register D1 with the number 20. The CMPI instruction subtracts the number 20 from register D1, that is, it computes D1 - 20 and sets the condition code flags without changing D1. If the result is negative, that is, if D1 < 20, then the N flag is set. The BLT instruction tests the N flag. (Strictly, BLT branches when N and V differ; no overflow occurs in this example, so V = 0 and only N matters.) If N is set to 1, then the branch is taken and the instructions beginning at the memory address LOOP are executed. On the other hand, if D1 >= 20, then the branch is not taken, and the instruction following the branch instruction is executed. The instruction following BLT decrements the contents of register A0 by 4 so that it again points to the number that was last loaded into register D1. The final instruction then transfers the contents of that memory location to RSLT.

Let us consider 10 numbers stored starting at memory address LIST as shown:

LIST: 1 10 25 5 36 48 7 20 25 11

During the first pass through the loop, D1 has the value 1. Register A0 points to the 2nd number. D1 - 20 is negative, so the N flag is set to 1 and BLT results in the branch being taken, leading to a second pass through the loop.

During the second pass through the loop, D1 has the value 10. Register A0 points to the 3rd number. D1 - 20 is negative, so the N flag is set to 1 and BLT results in the branch being taken, leading to a third pass through the loop.
During the third pass through the loop, D1 has the value 25. Register A0 points to the 4th number. D1 - 20 is positive, so the N flag is set to 0 and the branch is not taken. Register A0 is decremented to point to the third number, and the third number is stored in location RSLT.

Thus, this program identifies the first number in the list that is greater than or equal to 20 and stores it in the memory location RSLT. (A value of exactly 20 also stops the loop, because BLT branches only when D1 is strictly less than 20.)

The condition code flags that are checked by each branch instruction are summarized in Appendix C, Table C.6.
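
The trace for problem 5 can also be checked with a short simulation. The Python sketch below is only an illustration under stated assumptions, not part of the original solution: the function name first_not_less_than is hypothetical, A0 is modeled as an element index rather than a byte address (so each "+4" or "-4" becomes a step of 1), and BLT is modeled as a plain signed comparison with no overflow.

```python
def first_not_less_than(values, threshold=20):
    """Mimic the 68000 loop from problem 5 (hypothetical helper name).

    Assumption: A0 is modeled as an element index instead of a byte
    address. Like the assembly program, this runs past the end of the
    list (here: raises IndexError) if no element qualifies.
    """
    a0 = 0                        # MOVEA.L #LIST, A0
    while True:
        d1 = values[a0]           # LOOP: MOVE.L (A0), D1
        a0 += 1                   # ADDI #4, A0  (next element)
        if d1 - threshold < 0:    # CMPI #20, D1 sets N when D1 < 20
            continue              # BLT LOOP: branch taken
        break                     # branch not taken when D1 >= 20
    a0 -= 1                       # SUBI #4, A0  (back to the loaded element)
    return values[a0]             # MOVE.L (A0), RSLT


LIST = [1, 10, 25, 5, 36, 48, 7, 20, 25, 11]
print(first_not_less_than(LIST))  # prints 25
```

With the example list the sketch stops on the third pass and returns 25, in agreement with the trace. Calling it on a list such as [1, 20, 30] returns 20, which shows why the stopping condition is "not less than 20".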