Emulating ‘MUL’
How multiplication of unsigned integers can be performed in software

Hardware support absent?
• The earliest microprocessors (from Intel, Zilog, and others) did not implement multiplication instructions, so that operation had to be done in software
• Even with our Core-2 Quad processor, there is an ultimate limit on the sizes of integers that can be multiplied using the processor’s built-in ‘multiply’ instructions

Multiplying with Base Ten
• Here’s how we learn to multiply multi-digit integers using ordinary ‘decimal’ notation

          8765    multiplicand
    x     4321    multiplier
    ----------
          8765    partial product
        17530     partial product
       26295      partial product
    + 35060       partial product
    ----------
      37873565    product

Analogy with Base Ten
• When multiplying multi-digit integers using ‘binary’ notation, we apply the same idea

         1001    multiplicand (=9)
    x    1011    multiplier (=11)
    ---------
         1001    partial product
        1001     partial product
       0000      partial product
    + 1001       partial product
    ---------
      1100011    product (=99)

Some observations…
• With ‘binary’ multiplication of two N-digit values, the product can require 2N digits
• Each ‘partial product’ is either zero or is equal to the value of the ‘multiplicand’
• Each succeeding partial product is ‘shifted’ one place further to the left
• So if we want to ‘emulate’ multiplication of unsigned binary integers using software, we must implement these observations

8-bit case
• The smallest case to consider is using the ‘mul’ instruction to compute the product of 8-bit values, say in registers AL and BL

            .section    .data
    number1:    .byte   100
    number2:    .byte   200
    product:    .word   0

            .section    .text
            mov     number1, %al
            mov     number2, %bl
            mul     %bl
            mov     %ax, product

Doing it by hand

    100 = 0x64 = 01100100 (binary)    AL (multiplier)
    200 = 0xC8 = 11001000 (binary)    BL (multiplicand)

              00000000    (BL x 0)
             00000000     (BL x 0)
            11001000      (BL x 1)
           00000000       (BL x 0)
          00000000        (BL x 0)
         11001000         (BL x 1)
        11001000          (BL x 1)
    +  00000000           (BL x 0)
    ------------------
      0100111000100000    product

    20000 = 0x4E20 = 0100111000100000 (binary)    AX (AH = 01001110, AL = 00100000)

Using x86 instructions

            .section    .text
    softmulb:
            push    %rcx            # save caller's count-register
            sub     %ah, %ah        # zero-extend AL to (CF:AH:AL)
            mov     $9, %rcx        # number of bits in (CF:AH)
    nxbit8:
            rcr     $1, %ax         # next multiplier-bit to Carry-Flag
            jnc     noadd8          # skip addition if CF-bit is zero
            add     %bl, %ah        # else add the multiplicand
    noadd8:
            loop    nxbit8          # go back to shift in next CF-bit
            sub     %ah, %cl        # set CF-bit if 8 bits exceeded (CL is zero here)
            pop     %rcx            # recover caller's count-register
            ret

Visual depiction

    RCR $1, %AX

    CF     AH            AL
    [0]    [00000000]    [multiplier]

    BL
    [multiplicand]

• The 17-bit value (CF:AH:AL) gets ‘rotated’ 1 place to the right
• Then the multiplicand in BL is added to AH (unless CF = 0)

Exhaustive testing
• We can insert ‘inline assembly language’ in a C++ program to construct a loop that checks our software multiply operation against every case of the CPU’s ‘mul’
• Our test-program is named ‘multest.cpp’
• You can compile it like this:

    $ g++ multest.cpp softmulb.s -o multest

In-class exercises
• Can you write a ‘software’ emulation for the CPU’s 16-bit multiply operation?

    mul %bx

• Can you write a ‘software’ emulation for the CPU’s 32-bit multiply operation?

    mul %ebx
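
C++ sketch of the shift-and-add idea
• The same shift-and-add method used by ‘softmulb’ can be sketched in C++, which may help when planning the wider cases. This is only a minimal sketch, not the course’s ‘multest.cpp’: the names soft_mul8, soft_mul16, check_worked_example, and count_mismatches8 are hypothetical, and the comparison uses the C++ ‘*’ operator rather than an inline-assembly ‘mul’. It also adds before shifting, whereas ‘softmulb’ rotates first, so it is an analogue of the routine rather than a line-by-line translation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the slides): shift-and-add emulation of
// the 8-bit 'mul'. The low 8 bits of 'acc' start out as the multiplier
// (AL); the upper bits accumulate the product, much as (CF:AH) does in
// softmulb.
std::uint16_t soft_mul8(std::uint8_t al, std::uint8_t bl)
{
    std::uint32_t acc = al;
    for (int i = 0; i < 8; ++i) {
        if (acc & 1u)                                    // current multiplier bit is 1?
            acc += static_cast<std::uint32_t>(bl) << 8;  // add multiplicand to the high half
        acc >>= 1;                                       // shift product and multiplier right
    }
    return static_cast<std::uint16_t>(acc);
}

// Hypothetical 16-bit analogue: same loop with wider fields. A 64-bit
// accumulator leaves room for the carry out of the high half.
std::uint32_t soft_mul16(std::uint16_t ax, std::uint16_t bx)
{
    std::uint64_t acc = ax;
    for (int i = 0; i < 16; ++i) {
        if (acc & 1u)
            acc += static_cast<std::uint64_t>(bx) << 16;
        acc >>= 1;
    }
    return static_cast<std::uint32_t>(acc);
}

// Checks the example worked by hand in the slides: 100 x 200 = 20000.
void check_worked_example()
{
    assert(soft_mul8(100, 200) == 20000);
}

// Counts disagreements with the built-in multiply over all 65536
// pairs of 8-bit inputs.
unsigned count_mismatches8()
{
    unsigned bad = 0;
    for (unsigned a = 0; a < 256; ++a)
        for (unsigned b = 0; b < 256; ++b) {
            unsigned soft = soft_mul8(static_cast<std::uint8_t>(a),
                                      static_cast<std::uint8_t>(b));
            if (soft != a * b)
                ++bad;
        }
    return bad;
}
```

• If the emulation agrees with the built-in multiply, count_mismatches8() should return 0. The 16-bit version has about four billion input pairs, so spot-checking selected values is a more practical starting point than an exhaustive loop.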