Dual Core Architecture: The Itanium 2 (9000 series) Intel Processor

COE 305: Microcomputer System Design [071]

Mohd Adnan Khan (246812)
Noor Bilal Mohiuddin (237873)
Faisal Arafsha (232083)

DATE: 27th November 2007

Table of Contents
I. INTRODUCTION
II. WHY WE USE DUAL CORE
III. INTEL ITANIUM 2 9000 SERIES
1. INTRODUCTION
2. HISTORY
3. ARCHITECTURE
4. RELIABILITY
5. SOFTWARE SUPPORT
6. COMPETITION
IV. REFERENCES

I) Introduction

Dual-core refers to a Central Processing Unit (CPU) that has two complete execution cores in a single processor. The two cores, their caches, and their cache controllers are all built together on a single IC. They can be thought of as two processors working side by side and sharing the processing load.

A dual-core CPU is not the same as a dual-processor system. Dual-core means two separate cores that reside side by side on the same chip, while dual-processor means two processors that are not necessarily on the same chip, or even on the same motherboard.

II) Why we use dual core

There are several reasons why dual-core processing is used today and has, so far, been widely successful. Some say it was the flattening of the clock speed curve that forced AMD and Intel to adopt this technology. The two microprocessor giants hit clock speed barriers and chose an alternate route for continued performance gains and to "stay top of mind with new product releases."

The problem with rapidly escalating clock speed is heat. The heat is given off as a result of power dissipation, and higher clock speeds produce more heat, which in turn leads to more errors. Operating a processor at a high clock speed requires a large amount of current to flow around the die, making it more susceptible to noise.
Since the pathways in a processor are microscopically close together, current leaking from one pathway into another can corrupt the data on that pathway, and the corrupted data generates errors.

A dual-core processor is a cross between a single-core processor and a dual-processor system. "A dual core processor won't be twice as fast as a single core processor nor will it be as fast as a dual processor system." Its performance falls somewhere between the two, but it is significantly better than that of a single-core processor. Since there are two pipelines, two instruction streams can be executed simultaneously. Also, two processor caches keep more data on the processor die for quicker access. The main drawback is that the two cores have to share a single bus and the same memory.

In conclusion, dual-core technology provides a large performance increase over single-core processors. It is especially advantageous for users who multitask. Dual-core processors give manufacturers an inexpensive way to produce products that advance the performance curve.

III) Itanium

1) Introduction

The Itanium is a 64-bit Intel microprocessor that implements the Intel Itanium architecture. The architecture has two processor families: Itanium and Itanium 2. These processors are mostly used in high-performance computing systems. The architecture was initially developed at HP; later, HP and Intel collaborated to build the Itanium series of processors. The first Itanium microprocessor was released in 2001, and more powerful Itanium processors have been released frequently over the past few years. HP produces most Itanium-based systems, but several other manufacturers have also developed systems based on Itanium. As of 2007, Itanium is the fourth-most deployed microprocessor architecture for enterprise-class systems.
2) History

In the late 1980s HP determined that reduced instruction set computer (RISC) architectures were reaching a processing limit of one instruction per cycle, and it developed a new architecture called Explicitly Parallel Instruction Computing (EPIC) that allows the processor to execute more than one instruction in one clock cycle. HP and Intel became partners in 1994 and developed the IA-64 architecture. Intel was willing to undertake a large development effort on IA-64 in the expectation that the resulting microprocessor would be used by the majority of enterprise systems manufacturers. HP and Intel began a large joint development effort with the goal of delivering the first product, codenamed Merced, in 1998. Intel announced the official name of the processor, Itanium, on October 4, 1999.

Original Itanium processor: 2001–2002

By the time Itanium was released in June 2001, its performance was no longer superior to that of competing RISC and CISC processors. Only a few thousand of the original Itaniums were sold, because of limited availability caused by poor yields, relatively poor performance, and high cost. However, these machines were useful for developing software for the Itanium 2 processors that followed. IBM delivered a supercomputer based on this processor.

Itanium 2 processors: 2002–present

The Itanium 2 was released in 2002 and was marketed mainly for enterprise servers. The initial Itanium 2 was codenamed McKinley. McKinley used a 180 nm process, but it relieved many of the performance problems of the original Itanium.

In 2003, AMD released the Opteron, which implemented its 64-bit x86-64 architecture. Opteron gained rapid acceptance in the enterprise server space because it provided an easy upgrade from x86. Intel responded by implementing x86-64 in its Xeon microprocessors in 2004.

Intel released a new Itanium 2 family member, named Madison, in 2003. Madison used a 130 nm process and was the basis of all new Itaniums until Montecito was released in June 2006.
Itanium is not a high-volume product for Intel. Intel does not release production numbers, but one industry analyst estimated that the production rate was 200,000 processors per year in 2007.

3) Architecture

A. The Intel Itanium architecture.

• Dual-Core: The Intel Itanium processor provides two complete 64-bit processing cores on a single processor, with up to 24 MB of low-latency L3 cache that provides high bandwidth for both cores. This large cache, together with the Hyper-Threading (HT) feature, provides twice the performance of earlier Itanium processors.
• EPIC: Explicitly Parallel Instruction Computing provides advanced implementations of parallelism, predication, and speculation to achieve a high degree of instruction-level parallelism (ILP). This feature helps to address the requirements of high-end enterprise and technical workloads.
• Hyper-Threading: With the HT technology provided in the Itanium, each core presents two hardware threads to the operating system, so a dual-core processor offers four times as many threads as a single-core, single-thread processor. This gives much higher performance than the old single-thread implementations. HT was not available in the original Itanium family; it was first introduced in the Intel Itanium 2 processor.
• Wide, Parallel Hardware for High Performance: The Itanium contains 128 general and 128 floating-point registers that support register rotation. A register stack engine is also used to improve the management of processor resources. Another feature of the Itanium 2 is support for predication and speculation, which helps improve processing performance.
• Energy Efficiency: The Intel Itanium 2 uses 20% less power than previous Itanium processors and delivers 2.5 times higher performance per watt, lowering energy requirements while substantially improving performance.
• High-Bandwidth System Bus for Scalability: The system bus provides up to 8.53 GB/s of bandwidth. It has a 128-bit data bus (64 bits dedicated to each core).
It also provides 50 bits of physical memory addressing and 64 bits of virtual addressing. The buses, which run at 400-533 MHz, are expandable to systems with multiple system buses.
• Features to support flexible platform environments: An IA-32 execution layer is available in the Itanium 2 to support IA-32 application binaries. The processor contains an abstraction layer that eliminates processor dependencies.
• Pinout Specifications: The Intel Itanium 2 has 611 pins that are used for input, output, or both input and output. The DC specifications are shown in the table below:

B. Instruction Execution

The instruction word is 128 bits wide and contains three instructions. The instruction fetch unit can feed two of these instruction words per clock cycle from the L1 cache into the pipeline, which allows the processor to execute up to 6 instructions per clock cycle. The processor has thirty execution units, each of which can execute a particular subset of the instruction set. The execution units are divided into eleven groups, and each unit can execute one instruction per clock cycle. These groups are:

• Six general-purpose ALUs, two integer units, one shift unit
• Four data cache units
• Six multimedia units, two parallel shift units, one parallel multiply, one population count
• Two floating-point multiply-accumulate units, two "miscellaneous" floating-point units
• Three branch units

C. Memory architecture

The Itanium 2 processors have 3 levels of cache. The Level 1 cache is 16 KB for both instructions and data. The Level 2 cache is 256 KB for both instructions and data. The Level 3 cache varies from 1.5 MB to 24 MB. The Itanium 2 bus, called the Scalability Port, is capable of transferring 2x128 bits per clock cycle at speeds of up to 533 MHz (17.056 GB/s).

D. Chipsets

The Itanium bus uses a chipset to interface with the rest of the system.
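As a rough cross-check of the figures quoted in this section, the arithmetic can be sketched in a few lines of Python. The function and variable names are illustrative only, and the sketch assumes that 1 GB/s means 10^9 bytes per second; none of it is taken from Intel's documentation. It also makes visible that the two bus figures quoted in this report, 8.53 GB/s and 17.056 GB/s, differ by a factor of two, which is what results from counting one or two 128-bit transfers per 533 MHz cycle.

```python
# Cross-check of figures quoted in the Architecture section.
# Names are illustrative; 1 GB/s is assumed to mean 10**9 bytes per second.

# The eleven groups of execution units listed under "Instruction Execution".
EXECUTION_UNITS = {
    "general-purpose ALU": 6,
    "integer": 2,
    "shift": 1,
    "data cache": 4,
    "multimedia": 6,
    "parallel shift": 2,
    "parallel multiply": 1,
    "population count": 1,
    "floating-point multiply-accumulate": 2,
    "miscellaneous floating-point": 2,
    "branch": 3,
}


def instructions_per_cycle(words_per_cycle=2, instructions_per_word=3):
    """Peak issue rate: instruction words fetched per cycle times slots per word."""
    return words_per_cycle * instructions_per_word


def logical_processors(cores=2, threads_per_core=2):
    """Hardware threads one dual-core processor with HT presents to the OS."""
    return cores * threads_per_core


def bus_bandwidth_gb_per_s(bus_bits, rate_mhz, transfers_per_cycle=1):
    """Peak bandwidth in GB/s of a bus that is bus_bits wide at rate_mhz."""
    bytes_per_transfer = bus_bits // 8
    megabytes_per_s = bytes_per_transfer * transfers_per_cycle * rate_mhz
    return megabytes_per_s / 1000


if __name__ == "__main__":
    print("execution units:", sum(EXECUTION_UNITS.values()),
          "in", len(EXECUTION_UNITS), "groups")
    print("instructions per cycle:", instructions_per_cycle())
    print("logical processors:", logical_processors())
    print("bus, one transfer per cycle (GB/s):", bus_bandwidth_gb_per_s(128, 533))
    print("bus, two transfers per cycle (GB/s):", bus_bandwidth_gb_per_s(128, 533, 2))
```

The unit counts add up to the thirty execution units in eleven groups stated above, two three-instruction words give 6 instructions per cycle, and 8.528 GB/s rounds to the quoted 8.53 GB/s.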
Enterprise server manufacturers differentiate their systems by designing and developing chipsets that interface the processor to memory, interconnections, and peripheral controllers. The chipset is the heart of the system-level architecture for each system design.

4) Reliability:

A. Features: Important features of the Itanium include Intel Cache Safe technology, which provides automatic cache recovery. Its enhanced machine check architecture provides wide-ranging error detection and correction capabilities.

B. Functions: When cache errors occur, the Itanium automatically disables the affected cache lines, which prevents the system from freezing. The address path error correction system offers automatic error detection, logging, and correction. Also, power can be serviced while the system is running, and the servers can be controlled remotely.

C. Benefits: The Itanium has several benefits. These include an improved capability to survive cache errors and higher levels of computing uptime. Furthermore, it can detect bit errors and manage data corruption. Together, these features make Itanium servers highly reliable, manageable, and easy to service.

5) Software support

To increase the amount of software that can run on the Itanium, Intel developed effective compilers for its platform. The Itanium is supported by Windows Server 2003, a number of Linux distributions (such as Debian, Red Hat, and Novell SuSE), and HP-UX. It is also supported by the mainframe environment GCOS, and a number of IA-32 operating systems run on it through an instruction set simulator. According to Intel, over 10,000 applications are available for Itanium-based systems.

6) Competition

The Itanium competes with other enterprise server processors such as Sun Microsystems' UltraSPARC IV, Fujitsu's SPARC64, IBM's POWER6, AMD's Opteron, and Intel's own Xeon. The Itanium offers the strongest floating-point performance relative to fixed-point performance of any general-purpose microprocessor.
This strength is not required for most enterprise server workloads.

IV) References

1) http://en.wikipedia.org/wiki/Itanium
2) Dual-Core Intel® Itanium® Processor 9000 and 9100 Series Datasheet
3) Dual-Core Intel® Itanium® 2 Processor 9000 Series
4) http://icrontic.com/articles/dual_core