Other Processors • Having learnt MIPS, we can learn other major processors. • Not going to be able to cover everything; will pick on the interesting aspects. ARM • Advanced RISC Machine • The major processor for mobile and embedded electronics, like iPad • Simple, low power, low cost ARM • One of the most interesting features is the conditional execution. • That is, an instruction will execute if some condition is true, otherwise it will not do anything (turned into a nop). ARM • A set of flags, showing the relation of two numbers : gt, equal, lt. – cmp Ri, Rj # set the flags depending on the values in Ri and Rj – subgt Ri, Ri, Rj # i = i – j if flag is gt – sublt Ri, Ri, Rj # i = i – j if flag is lt – bne Label # goto Label if flag is not equal ARM • How to implement while (i != j) { if (i > j) i -= j; else j -= i; } ARM • In MIPS, assume i is in $s0, j in $s1: Loop: beq $s0, $s1, Done slt $t0, $s0, $s1 beq $t0, $0, L1 sub $s0, $s0, $s1 j Loop L1: sub $s1, $s1, $s0 L2: j Loop ARM • In ARM, Loop: cmp Ri, Rj subgt Ri, Ri, Rj sublt Rj, Rj, Ri bne Loop ARM • Discussion: Given the MIPS hardware setup, can we support conditional execution? Introduction to Intel IA-32 and IA-64 Instruction Set Architectures History 9/27/2007 11:23:26 PM week06-3.ppt 10 Recent Intel Processors • • • • The Intel® Pentium® 4 Processor Family (2000-2006) The Intel® Xeon® Processor (2001-2006) The Intel® Pentium® M Processor (2003-Current) The Intel® Pentium® Processor Extreme Edition (20052007) • The Intel® Core™ Duo and Intel® Core™ Solo Processors (2006-Current) • The Intel® Xeon® Processor 5100 Series and Intel® Core™2 Processor Family (2006-Current) 9/27/2007 11:23:26 PM week06-3.ppt 11 History 9/27/2007 11:23:27 PM week06-3.ppt 12 Recent Intel Processors 9/27/2007 11:23:27 PM week06-3.ppt 13 Intel Core 2 Duo Processors 9/27/2007 11:23:28 PM week06-3.ppt 14 Intel Core 2 Quad Processors 9/27/2007 11:23:28 PM week06-3.ppt 15 Bit and Byte Ordering 9/27/2007 11:23:29 PM week06-3.ppt 16 Intel Assembly • Each instruction is represented by – Where label presents the line – A mnemonic is a reserved name for a class of instruction opcodes which have the same function. – The operands argument1, argument2, and argument3 are optional. There may be from zero to three operands, depending on the instruction 9/27/2007 11:23:29 PM week06-3.ppt 17 Memory Modes 9/27/2007 11:23:30 PM week06-3.ppt 18 Addressing • The processors use byte addressing • Intel processors support segmented addressing – Each address is specified by a segment register and byte address within the segment 9/27/2007 11:23:30 PM week06-3.ppt 19 9/27/2007 11:23:30 PM week06-3.ppt 20 Intel Registers 9/27/2007 11:23:31 PM week06-3.ppt 21 Basic Program Execution Registers • General purpose registers – There are eight registers (note that they are not quite general purpose as some instructions assume certain registers) • Segment registers – They define up to six segment selectors • EIP register – Effective instruction pointer • EFLAGS – Program status and control register 9/27/2007 11:23:31 PM week06-3.ppt 22 General Purpose and Segment Registers 9/27/2007 11:23:32 PM week06-3.ppt 23 General Purpose Registers • • • • • EAX — Accumulator for operands and results data EBX — Pointer to data in the DS segment ECX — Counter for string and loop operations EDX — I/O pointer ESI — Pointer to data in the segment pointed to by the DS register; source pointer for string operations • EDI — Pointer to data (or destination) in the segment pointed to by the ES register; destination pointer for string operations • ESP — Stack pointer (in the SS segment) • EBP — Pointer to data on the stack (in the SS segment) 24 Alternative General Purpose Register Names 25 Registers in IA-64 26 Segment Registers 27 Operand Addressing • Immediate addressing – Maximum value allowed varies among instructions and it can be 8-bit, 16-bit, or 32-bit • Register addressing – Register addressing depends on the mode (IA-32 or IA-64) 28 Register Addressing 29 Memory Operand • Memory operand is specified by a segment and offset 30 Offset • Displacement - An 8-, 16-, or 32-bit value. • Base - The value in a general-purpose register. • Index — The value in a general-purpose register. • Scale factor — A value of 2, 4, or 8 that is multiplied by the index value. 31 Effective Address 32 Effective Address • Common combinations – Displacement – Base – Base + displacement – (Index * scale) + displacement – Base + index + displacement – Base + (Index * scale) + displacement 33 Addressing Mode Encoding 34 Fundamental Data Types 35 Example 36 Pointer Data Types • Near pointer • Far pointer 37 128-Bit SIMD Data Types 38 BCD Integers • Intel also supports BCD integers, where each digit (0-9) is represented by 4 bits 39 Floating Point Numbers 40 General Purpose Instructions • Data transfer instructions 41 Data Transfer Instructions 42 Data Transfer Instructions 43 Binary Arithmetic Instructions 44 Decimal Arithmetic Instructions 45 Logical Instructions 46 Shift and Rotate Instructions 47 Bit and Byte Instructions 48 Bit and Byte Instructions 49 Control Transfer Instructions 50 51 String Instructions 52 I/O Instructions • These instructions move data between the processor’s I/O ports and a register or memory 53 Enter and Leave Instructions • These instructions provide machine-language support for procedure calls in block structured languages 54 Segment Register Instructions • The segment register instructions allow far pointers (segment addresses) to be loaded into the segment registers 55 Procedure Call Types • The processor supports procedure calls in the following two different ways: – CALL and RET instructions. – ENTER and LEAVE instructions, in conjunction with the CALL and RET instructions 56 Stack 57 Calling Procedures Using CALL and RET • Near call (within the current code segment) • Near return 58 Far Call and Far Return • Far call • Far return 59 Stack During Call and Return 60 Parameter Passing • Passing parameters through the general-purpose registers – Can pass up to six parameters by copying the parameters to the general-purpose registers • Passing parameters on the stack – Stack can be used to pass a large number of parameters and also return a large number of values • Passing parameters in an argument list – Place the parameters in an argument list – A pointer to the argument list can then be passed to the called procedure 61 Saving Procedure State Information • The processor does not save general purpose registers – A calling procedure should explicitly save the values in any of the general-purpose registers that it will need when it resumes execution after a return – One can use PUSHA and POPA to save and restore all the general purpose registers (except ESP) 62 Calls to Other Privilege Levels 63 Stack For Calling and Called Procedure 64 Procedure Calls For Block-structured Languages • ENTER and LEAVE instructions automatically create and release, respectively, stack frames for called procedures – The ENTER instruction creates a stack frame compatible with the scope rules typically used in block-structured languages – The LEAVE instruction, which does not have any operands, reverses the action of the previous ENTER instruction 65 ENTER Instruction 66 IA-32 and IA-64 Instruction Format 67 Examples of Instruction Formats 68 ADD Instructions 69 ADD Instructions 70 Add Instruction Description 71 SCAS/SCASB/SCASW/SCASD—Scan String 72 SIMD in IA-32 and IA-64 • To improve performance, Intel adopted SIMD (single instruction multiple data) instructions – MMX technology (Pentium II processor family) – SSE 10/7/2007 9:37:48 PM week07-1.ppt 73 MMX • MMX introduced – Eight new 64-bit data registers, called MMX registers – Three new packed data types: • 64-bit packed byte integers (signed and unsigned) • 64-bit packed word integers (signed and unsigned) • 64-bit packed double word integers (signed and unsigned) – Instructions that support the new data types 10/7/2007 9:37:59 PM week07-1.ppt 74 MMX • Packed integer types allow operations to be applied on multiple integers 10/7/2007 9:38:00 PM week07-1.ppt 75 SSE • SSE introduced eight 128-bit data registers (called XMM registers) – In 64-bit modes, they are available as 16 64-bit registers – The 128-bit packed single-precision floating-point data type, which allows four single-precision operations to be performed simultaneously • They can be used in parallel with MMX registers 10/7/2007 9:38:04 PM week07-1.ppt 76 SSE Execution Environment 10/7/2007 9:38:04 PM week07-1.ppt 77 XMM Registers • In certain modes, additional eight 64 bit registers are also available (XMM8 - XMM15) 10/7/2007 9:38:05 PM week07-1.ppt 78 SSE Data Type • SSE extensions introduced one new data type – 128-Bit Packed Single-Precision Floating-Point Data Type – SSE 2 introduced five data types 10/7/2007 9:38:05 PM week07-1.ppt 79 Packed and Scalar Double-Precision Floating-Point Operations 10/7/2007 9:38:07 PM week07-1.ppt 80 SSE Instructions • SSE Data Transfer Instructions 10/7/2007 9:38:10 PM week07-1.ppt 81 SSE Packed Arithmetic Instructions 10/7/2007 9:38:11 PM week07-1.ppt 82 SSE Packed Arithmetic Instructions 10/7/2007 9:38:12 PM week07-1.ppt 83 SSE Comparison, Logical, and Shuffle Instructions 10/7/2007 9:38:12 PM week07-1.ppt 84 SSE2 Instructions 10/7/2007 9:38:14 PM week07-1.ppt 85 SSE2 128-Bit SIMD Integer Instructions 10/7/2007 9:38:14 PM week07-1.ppt 86 Horizontal Addition/Subtraction 10/7/2007 9:38:16 PM week07-1.ppt 87 Horizontal Data Movements 10/7/2007 9:38:17 PM week07-1.ppt 88 Conversion Between Different Types 10/7/2007 9:38:17 PM week07-1.ppt 89 Using MMX/SSE Instructions in C/C++ Programs • Data types for MMX and SSE instructions – These types are defined in C/C++ • /usr/lib/gcc/i386-redhat-linux/3.4.3/include/mmintrin.h • /usr/lib/gcc/i386-redhat-linux/3.4.3/include/pmmintrin.h • /usr/lib/gcc/i386-redhat-linux/3.4.3/include/emmintrin.h 10/7/2007 9:38:18 PM week07-1.ppt 90 Built-in Functions • Built-in functions are C-style functional interfaces to MMX/SSE instructions See http://developer.apple.com/documentation/DeveloperTools/gcc-4.0.1/gcc/X86-Built_002din-Functions.html 10/7/2007 9:38:29 PM week07-1.ppt 91 Intel MMX/SSE Intrinsics • Intrinsics are C/C++ functions and procedures for MMX/SSE instructions – With instrinsics, one can program using these instructions indirectly using the provided intrinsics – In general, there is a one-to-one correspondence between MMX/SSE instructions and intrinsics 10/7/2007 10:01:41 PM week07-1.ppt 92 GCC Inline Assembly • GCC inline assembly allows us to insert inline functions written in assembly – GCC provides the utility to specify input and output operands as C variables – Basic inline – Extended inline assembly 10/7/2007 10:01:43 PM week07-1.ppt 93 GCC Inline Assembly • Some examples 10/7/2007 10:03:01 PM week07-1.ppt 94 GCC Inline Assembly 10/7/2007 10:03:04 PM week07-1.ppt 95 GCC Inline Assembly 10/7/2007 10:03:06 PM week07-1.ppt 96