Introduction to x86 Computer Architecture

What is an x86 Processor
• "x86 processor" is now a general term for a class of processors that share a common subset of instructions
• Some x86 microprocessors are: Athlon (AMD), Opteron (AMD), Pentium IV (Intel), Xeon (Intel), Pentium Core Duo (Intel), Athlon X2 (AMD)
  – Most popular desktop and server processors
  – Knowing the instructions for one, you can program for any other processor
    • Some processors have additional instructions that cannot be used on others!

x86 Philosophy on Instructions
• x86 processors follow the Complex Instruction Set Computer (CISC) philosophy
• It is the opposite of the Reduced Instruction Set Computer (RISC) philosophy
  – The data path we discussed in class falls under the category of RISC
  – x86 processors provide a lot of (relatively) complex instructions
    • When you look at the manuals, the number of instructions will strike you.
    • Each instruction has its own peculiarity to watch out for.

CISC vs. RISC
CISC | RISC
Larger number of instructions. | Fewer instructions.
Instructions have different formats. | All instructions have the same format.
Instructions vary in size (some instructions are just 1 byte while others are 9 bytes long!). | Instructions have the same size.
Data path is very complex and involves multiple clock cycles. | Data path is simpler and typically uses a single clock cycle.
Code for software is smaller (for example, you can multiply or divide two registers). | Code for software is larger (have to write code for multiplication or division!).
Puts burden on hardware. | Puts burden on software.

CISC vs. RISC (Contd.)
CISC | RISC
Can extract more Instruction Level Parallelism (ILP): complex instructions are broken internally into smaller pieces and run in parallel. | Harder to extract ILP.
Complex instructions enable better use of other parts of the CPU such as caches. | Software developer may need to handle some gritty details.
Compiler development is harder. | Streamlines compiler development.
Athlon, Pentium, Opteron, Xeon. | ARM, MIPS, Ultra Sparc, Ultra Sparc T1.

Microprogramming
• Microprogramming is a computer architecture design strategy in which complex instructions are actually implemented as a sequence of micro (or smaller) instructions.
  – A sequence of microinstructions is called a micro program.
    • They are carefully designed and optimized for fastest execution.
• The CPU is actually executing micro programs rather than the main code and consequently has a RISC type computing core!
  – CISC architectures often use microprogramming.

Micro program Example
• Example of a micro program to perform the operation Add R1, R2 (R2 = R2 + R1)
  – Select register R1 as 1st input to ALU
  – Select register R2 as 2nd input to ALU
  – Set carry input of ALU to zero
  – Select ALU to perform 2's complement addition
  – Direct result from ALU to register R2
  – Update flags as necessary

Micro program storage
• Micro programs need to be retrieved and processed at high speed!
  – They are stored directly on the CPU and are hardwired to the data path.
    • The storage location is called a control store.
  – Micro programs are typically just bits that are applied to inputs of various devices in the data path.
    • Like selection bits for multiplexers and de-multiplexers

Micro instruction implementations
• Microprogramming may be implemented using two different strategies
  – Vertical micro codes
    • One micro instruction at a time, just like a conventional program.
  – Horizontal micro codes
    • Several or all micro instructions are processed simultaneously.

Vertical Micro Codes
• This is the older, simpler approach.
  – In this strategy micro codes are executed one after another in a serial fashion.
    • This type is similar to conventional program execution (one instruction at a time)

    00110001
    00000001
    00001111
    11110000
    Figure: Sequence of short micro codes to implement a macro instruction (vertical organization)

Horizontal Micro Codes
• This is a newer and faster approach
  • Adopted as transistors started becoming cheaper
  – Micro codes are typically executed in parallel, making micro instructions much faster!
  – The drawback is that microinstructions become much wider, as they have to accommodate all bits (even if they are not used) in each instruction.
    • All instructions have microinstructions of the same size

    00110001 00000001 00001111 11110000
    Figure: Horizontal organization of micro codes to make a single microinstruction!

x86 Architecture
• Most x86 processors use little microprogramming.
• Most instructions are directly executed by a complex data path
• Some advanced instructions and looping instructions use microprogramming
  – The x86 manual typically provides pseudocode, similar in spirit to microinstructions, to explain the working of each instruction
  – You really don't have to worry about all the details when writing assembly language code for a given processor.

Assembly programming with x86
• In order to do assembly programming with x86 you need the following:
  – Registers provided by x86 processors
  – Instruction set
    • Refer to x86 instruction set manuals available from Intel.
      – http://www.intel.com/design/pentium4/manuals/253666.htm
      – http://www.intel.com/design/pentium4/manuals/253667.htm
  – A text editor to type out your assembly program
  – An assembler & linker to convert assembly to binary

x86 Registers
• x86 registers are grouped into two categories
  – General purpose registers
    • Used for doing arithmetic and logic operations
    • Used like temporary variables in programs
  – Reserved (or specialized) registers
    • These are used by the microprocessor internally for performing certain operations
    • The values in these registers are typically not modified by programs
      – They are set and handled by the operating system that loads and runs your programs.

General purpose x86 Registers
• 4 basic 32-bit registers that can be reused as 16-bit or 8-bit registers
• 32-bit registers are called: eax, ebx, ecx, and edx
• Low 16-bit parts of the 32-bit registers are correspondingly called: ax, bx, cx, and dx
• 8-bit parts (the high and low bytes of the 16-bit parts) are called: ah, al, bh, bl, ch, cl, dh, & dl

Register | Bits 31-16 | Bits 15-8 | Bits 7-0 | Name for low 16 bits
EAX | (no name) | AH | AL | AX
EBX | (no name) | BH | BL | BX
ECX | (no name) | CH | CL | CX
EDX | (no name) | DH | DL | DX
Note: Cannot use the high 16 bits independently!

Using parts of registers
• Note that registers eax, ax, ah, and al all share the same register space!
  – They are names for parts of the same register!
  – Altering any one correspondingly alters the other registers as well!
• Example (bytes of EAX shown in hex, most significant first):
  – Set EAX to 1:    EAX = 00 00 00 01
  – Change AH to 1:  EAX = 00 00 01 01
  – Now AH = 1, AL = 1 but EAX and AX = 257!

Other general purpose registers
• The x86 processor also provides other general purpose registers.
  – They cannot be accessed as bytes!
    • Smallest unit is the lower 16 bits

Register | Name for low 16 bits
EBP (Base Pointer) | BP
ESI (Source Index) | SI
EDI (Destination Index) | DI
ESP (Stack Pointer) | SP
Note: Cannot use the high 16 bits independently!

Segment Registers
• Segment registers (16 bits) are special registers
  – 4 special segment registers
    • CS – Code Segment register
      – Indicates an area of memory in which instructions are stored.
    • DS – Data Segment register
      – Indicates an area of memory in which R/W data is stored.
    • SS – Stack Segment register
      – Area of memory in which the stack for a program is stored.
      – Stack is used for calling methods/functions/subprograms
    • ES – Extra Segment register
      – Used for copying data from one segment to another.
  – They are typically set by the operating system
  – You will never change them (in this course)
  – We will discuss segment registers further in the course

EFLAGS Register
• The EFLAGS register is a special (32-bit) register that contains flags (bits) generated from results of ALU operations.
  – Typically, the only way to change the values is to perform an ALU operation.
    • Usually a set of individual flags (or bits) is selectively changed by each instruction.
    • You have to know which instruction changes which bit!
      – Initially it is hard, but with some practice you will get the hang of it.
• Each bit can be indirectly inspected using a suitable instruction
  – Typically a conditional jump instruction!

Flags in EFLAGS
• Certain flags in EFLAGS are frequently used
  – ZF: Zero Flag
    • Set if output from ALU is zero. Cleared otherwise.
  – CF: Carry Flag
    • Set if an arithmetic operation generates a carry or borrow.
    • This flag indicates overflow for unsigned arithmetic.
  – PF: Parity Flag
    • Set if the least significant byte (8 bits) of the result from the ALU has an even number of 1s.
  – SF: Sign Flag
    • Set to the most significant bit of the result (1 indicates a negative result while 0 indicates a positive result)

Instruction Pointer (IP)
• It is a special 32-bit register that indicates the address of the next instruction to be executed by the microprocessor.
  – The register is called EIP in x86 processors
  – Value is typically initialized by the OS when programs are loaded into memory.
  – Changed by conditional or unconditional jumps
  – Also changed by function/method calls

Reading the Manual
• Now that you know about registers, it is time to start exploring the instructions supported by x86 processors.
  – For this task you have to refer to the Intel manuals for details of instructions.
  – First you need to understand the notations used
    • I will explain some of the notation using an example.
    • You have to learn other notations by reading the manual.

The MOV instruction
• One of the most overloaded instructions in the x86 manual; it performs the assignment operation
  – Copies a value from one location to another
    • Location can be memory, register, or constant (a constant can only be the source)!
  – Handles the various combinations using different op-codes
    • Mnemonic is the same but op-codes are different!
  – Refer to page 635 of the Instruction Set reference
    • http://download.intel.com/design/Pentium4/manuals/25366620.pdf

Snippet from Manual
Opcode | Instruction | Description
88 /r | Mov r/m8, r8 | Mov r8 to r/m8.
89 /r | Mov r/m16, r16 | Mov r16 to r/m16.
89 /r | Mov r/m32, r32 | Mov r32 to r/m32.
C6 /0 | Mov r/m8, imm8 | Mov imm8 to r/m8.
C7 /0 | Mov r/m16, imm16 | Mov imm16 to r/m16.
All you need to know is what the notations (r/m32 or imm8) actually imply.

Understanding notation
• Review section 3.1.1.2 (Page 51) of the instruction manual
  • http://download.intel.com/design/Pentium4/manuals/25366620.pdf
  – rN: An N-bit register (EAX etc. for 32-bit, AX etc. for 16-bit, AL/AH etc. for 8-bit)
  – immN: An N-bit immediate or constant value (represented as unsigned or 2's complement)
  – mN: An N-bit operand in memory; the instruction reads or writes N/8 bytes starting at the given memory address
  – r/mN (r/m8, r/m16, r/m32): An N-bit register or an N-bit operand in memory to read/write data to.
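
The notation can be tried out with one mov per form from the manual snippet. The following is only a minimal sketch, written in the assembler syntax used in the rest of these slides (destination operand last, % before register names, $ before constants). It assumes the GNU assembler on 32-bit x86 Linux, and the variable names flag and count are hypothetical, chosen just for this illustration.

```asm
/* Sketch: one mov per notation form (GNU assembler, 32-bit x86 Linux assumed) */
.text
.global _start
_start:
    movb %al, %ah        /* Mov r/m8, r8     : r/m8 is the register AH */
    movb %bl, flag       /* Mov r/m8, r8     : r/m8 is the byte in memory at flag */
    movw %cx, count      /* Mov r/m16, r16   : r/m16 is the word in memory at count */
    movl %ebx, %eax      /* Mov r/m32, r32   : r/m32 is the register EAX */
    movb $5, flag        /* Mov r/m8, imm8   : imm8 is the constant 5 */
    movw $300, count     /* Mov r/m16, imm16 : imm16 is the constant 300 */

    movl $1, %eax        /* eax = 1, system call code for exit (Linux assumption) */
    movl $0, %ebx        /* exit code 0 */
    int  $0x80           /* transfer control to the OS */

.data
flag:  .byte 0           /* hypothetical 1-byte variable */
count: .word 0           /* hypothetical 2-byte variable */
```

Assuming the file is saved as mov.s (a made-up name), it could be built with "as --32 mov.s -o mov.o" followed by "ld -m elf_i386 mov.o -o mov". Note how the same r/m8 slot is filled once by a register and once by a memory variable; the mnemonic stays the same in both cases.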
Manual to Assembly Conversion
• The assembler we are going to use uses a different notation
  – Instructions are written with the destination operand last
  – Register names are prefixed with a % sign
  – Constant values are prefixed with a $ sign
    • Example: $31, $3.142
    • Hexadecimal constants are written as $0x7F
      – Hexadecimal constants cannot use fractions like $0x7F.A
    • Octal constants have a leading 0, as in $071
    • Binary constants have a leading 0b, as in $0b01110

Example Translation
• Examples of various MOV instructions:

Manual | Assembly | Comments
Mov r/m8, r8 | movb %al, %ah | AH = AL
Mov r8, r/m8 | movb k, %bl | BL = k, loads 1 byte from k *
Mov r/m32, r32 | movl %ebx, %eax | EAX = EBX
Mov r32, r/m32 | movl k, %eax | EAX = k, loads 4 bytes! *
Mov r/m32, imm32 | movl $10, %eax | EAX = 10 (4 bytes!)
Mov r/m32, imm32 | movl $-20, %ebx | EBX = -20 (4 bytes!)
* where k is a symbol (or variable) for a memory address.

Suffix | Interpretation
b | byte
w | word (2 bytes)
l | int/long (4 bytes)

Variables
• In assembly, symbols (or variables) are typically used to refer to addresses
  – Variables have a type associated with them which defines the following attributes:
    • Size (in bytes): Most important in assembly
      – Once defined you cannot change the size!
    • Data type: Weak concept in assembly
      – You can always reinterpret values in different ways
  – Valid types are:
    • byte (1 byte)
    • word (2 bytes)
    • int or long (4 bytes)
    • float (4 bytes)
    • string (array of bytes)

Defining Variables
• Variables are defined by defining a symbol
  – With a given type
  – And a default initial value.

    var1: .int -32        /* Java: int var1 = -32; */
    var2: .byte 0         /* Java: byte var2 = 0; */
    var3: .float -3.142   /* Java: float var3 = -3.142f; */

• Comments are delimited by the /* and */ character sequences!

Putting it all together!
• The first almost complete assembly:

    /* Program to swap values of variables val1 & val2 */
    .text                    /* Everything below goes in Code Segment */
    .global _start           /* Global entry point into program */
    _start:                  /* Indicate where program starts */
        movl val1, %eax      /* eax = val1 */
        movl val2, %ebx      /* ebx = val2 */
        movl %eax, val2      /* val2 = eax */
        movl %ebx, val1      /* val1 = ebx */
    .data                    /* Everything below goes in data segment */
    val1: .int 10
    val2: .int 20

Problem with earlier code
• The previous assembly is actually valid
  – It will assemble, link, run, and swap the integers
  – However, it does not terminate cleanly!
• How to make the program terminate?
• One way is to stop the microprocessor from processing further instructions.
  – Have to tell the OS to stop the program (or process) from running further
    • Need to interact with the OS for this task.

OS Interactions
• Example code to stop a process (or program) from running
  – Invoke an OS routine or "system call"

    .text                    /* Everything below goes in Code Segment */
    .global _start           /* Global entry point into program */
    _start:                  /* Indicate where program starts */
        /* Rest of the program actually goes here */
        movl $1, %eax        /* Set eax=1, SysCall Code for exit */
        movl $0, %ebx        /* Exit code value set to zero */
        int $0x80            /* Transfer control to OS */
    .data

  – int $0x80 is read as "call interrupt 80 hex" (transfers control to the OS).
  – This is OS specific! It will not work on Windows!

The Final Assembly

    /* Program to swap values of variables val1 & val2 */
    .text                    /* Everything below goes in Code Segment */
    .global _start           /* Global entry point into program */
    _start:                  /* Indicate where program starts */
        movl val1, %eax      /* eax = val1 */
        movl val2, %ebx      /* ebx = val2 */
        movl %eax, val2      /* val2 = eax */
        movl %ebx, val1      /* val1 = ebx */

        movl $1, %eax        /* Set eax=1, SysCall Code for exit */
        movl $0, %ebx        /* Exit code value set to zero */
        int $0x80            /* Transfer control to OS */
    .data                    /* Everything below goes in data segment */
    val1: .int 10
    val2: .int 20

More Instructions
• I will cover some basic, core instructions in class.
  – You are expected to review the instruction manuals to obtain a comprehensive list of instructions!

Add Instruction
• The add instruction performs integer addition
  – It is applicable for both signed & unsigned numbers
  – Sets OF, CF, SF, ZF, and PF based on the result of the addition

Instruction | Result
addb $8, %al | AL += 8
addl %ebx, %eax | EAX += EBX
addl k, %eax | EAX += k
addl %eax, k | k += EAX
* Where k is an int variable

Sub Instruction
• The sub instruction performs integer subtraction
  – It is applicable for both signed & unsigned numbers
  – Sets OF, CF, SF, ZF, and PF based on the result of the subtraction

Instruction | Result
subb $8, %al | AL -= 8
subl %ebx, %eax | EAX -= EBX
subl k, %eax | EAX -= k
subl %eax, k | k -= EAX
* Where k is an int variable

MUL Instruction
• Note: Multiplying two n-bit numbers generates a 2n-bit result
  – Example: Multiplying two 32-bit numbers generates a 64-bit result.
• The MUL instruction comes in 2 flavors
  – MUL: Performs unsigned multiplication
  – IMUL: Performs signed integer multiplication
  – Source and destination registers are implied!
    • Source: Uses AL, AX, or EAX depending on size of operand
    • Destination: Uses AX, DX:AX, or EDX:EAX to store the 16-bit, 32-bit, or 64-bit result!

Instruction | Result
mulb %bl | AX = AL * BL
mull %ebx | EDX:EAX = EAX * EBX
imull k | EDX:EAX = EAX * k
* Where k is an int variable

MUL (Continued)
Operand Size | Operand 1 (Implicit) | Operand 2 | Destination
Byte | AL | r/m8 | AX
Word | AX | r/m16 | DX:AX
Int | EAX | r/m32 | EDX:EAX

• OF and CF are set to 0 if the upper half of the result is 0 (for MUL; otherwise they are set to 1)
  – Values in SF, ZF, PF are undefined at the end of the instruction!

DIV Instruction
• The DIV instruction comes in 2 flavors
  – DIV: Performs unsigned division
  – IDIV: Performs signed integer division
  – Source and destination registers are implied!
    • Source: Uses AX, DX:AX, or EDX:EAX as the dividend, depending on size of divisor
    • Destination: Uses AH:AL, DX:AX, or EDX:EAX to store the remainder (high part) and quotient (low part)

Instruction | Result
divb %bl | AL = (AX / BL); AH = (AX % BL)
divl %ebx | EAX = (EDX:EAX / EBX); EDX = (EDX:EAX % EBX)
divl k | EAX = (EDX:EAX / k); EDX = (EDX:EAX % k)
* Where k is an int variable

DIV (Continued)
Operand Size (dividend/divisor) | Dividend | Divisor | Quotient | Remainder
Word/Byte | AX | r/m8 | AL | AH
Int/Word | DX:AX | r/m16 | AX | DX
64-bit/Int | EDX:EAX | r/m32 | EAX | EDX

• Values of CF, OF, SF, ZF, and PF are undefined!
• Before performing idivl, use the cdq instruction to sign extend the eax value into edx, so that EDX:EAX holds the signed dividend. Otherwise your division will give incorrect results for negative numbers!
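
The cdq rule above can be seen in a small, complete program. This is only a sketch under the same assumptions as the earlier swap program (GNU assembler syntax, 32-bit x86 Linux, exit through int $0x80); the variable names dividend, divisor, quotient, and remainder are hypothetical and chosen just for this illustration.

```asm
/* Sketch: signed division of -7 by 2 using cdq followed by idivl */
.text
.global _start
_start:
    movl  dividend, %eax     /* eax = -7 (low half of the dividend) */
    cdq                      /* sign extend eax into edx, so edx:eax = -7 */
    idivl divisor            /* eax = edx:eax / divisor, edx = edx:eax % divisor */
    movl  %eax, quotient     /* quotient = -3 (idiv truncates toward zero) */
    movl  %edx, remainder    /* remainder = -1 (same sign as the dividend) */

    movl  $1, %eax           /* eax = 1, system call code for exit */
    movl  $0, %ebx           /* exit code 0 */
    int   $0x80              /* transfer control to the OS */

.data
dividend:  .int -7
divisor:   .int 2
quotient:  .int 0
remainder: .int 0
```

Without the cdq, edx would hold whatever value was left there earlier, and idivl would treat that stale value as the upper 32 bits of the dividend. For unsigned division with divl, the analogous step would be to clear edx (for example with movl $0, %edx) instead of sign extending.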