Pentium 4 Processor
Adam Mittner, Timothy Paetz, Henry Olschofka

Pentium 4
• 7th Generation Intel processor series
• Hyper-threading
• Hyper-pipelining
• Advanced System Buses
  – 800, 533, 400 MHz
• Advanced Cache
• NetBurst Microarchitecture

Hyper-pipeline Technology
• New hyper-pipelined technology increases the depth of the pipeline, delivering increased performance, frequency, and scalability for the processor. One major improvement was to one of the key pipelines, the branch prediction/recovery pipeline. It is now implemented in 31 stages, compared with 20 stages on the 0.13 micron Pentium 4 processor.

System Bus
• Supports a faster system bus at 800, 533, or 400 megahertz. The 800 megahertz bus delivers 6.4 gigabytes of data per second into and out of the processor. This is accomplished by quad-pumping the data over a system bus clocked at 200 megahertz. The same quad-pumping is used for the 533 and 400 megahertz system buses.

Level 1 Execution Trace Cache
• The Pentium 4 features a 16 kilobyte data cache, compared with the 8 kilobyte data cache of the 0.13 micron processor. The processor also includes an Execution Trace Cache, which stores up to 12K decoded micro-ops in program execution order. Because the decoder is removed from the main execution loop, performance increases, and because instructions that are branched around are not stored, the cache storage space is used more efficiently. The result is a high volume of instructions delivered to the processor and less time spent recovering from mispredicted branches.

Level 2 Cache
• The Advanced Transfer Cache delivers a much higher data bandwidth between the Level 2 Cache and the processor core. The Advanced Transfer Cache consists of a 256 bit interface that transfers data on each core clock.
The Pentium 4 transfer rate is approximately 108 gigabytes per second, much faster than the Pentium III, which could only transfer 16 gigabytes per second.

Level 3 Cache
• The Level 3 cache is integrated on the processor and is available in a 2 megabyte size. It is used with the 800 megahertz system bus to provide a faster data path to main memory. This design gives faster access to large data sets held in the processor's own cache, which reduces the average memory latency and increases speed for larger workloads.

RAID
• RAID = Redundant Array of Inexpensive Disks
• Developed at the University of California 15 years ago
• Chipsets that support RAID for the Pentium 4:
  – 875P
  – 865PE
  – 865P
  – 865G

RAID
• RAID 0 uses striping
• Read and write speeds are increased

RAID
• RAID 1 uses mirroring
• Information is identical on both disks of each pair, so each disk serves as a backup copy of the other.

Hyper-Threading
• One physical processor turns into two logical processors
• Two threads can be executed at once
• The two logical processors share the same resources
• This causes problems with cache memory if the logical processors have to dump and reload the cache each time one of them tries to process a thread
• Not as powerful as two physical processors, but more powerful than a processor without Hyper-Threading

The 90nm Process
• higher-performance, lower-power transistors
• strained silicon
• high-speed copper interconnects
• new “low-k” dielectric material

The 90nm Process
• Higher-performance, lower-power transistors
  – size of a single transistor is 50 nanometers, or 50 billionths of a meter
  – hundreds of these transistors could fit inside one red blood cell
  – best size comparison is that of a virus
  – true nanotechnology

The 90nm Process
• Strained silicon
  – enables faster transistors
  – silicon lattice: a grid pattern of silicon atoms
  – the lattice is stretched
  – increases transistor performance by 10%-25%, with only a 2 percent increase in manufacturing costs
The 90nm Process
• High-speed copper interconnects
  – seven layers, one more layer than the older process generations
• “low-k” dielectric insulator material
  – between each interconnect
  – reduces wire-to-wire capacitance
  – speeds up the intra-chip communication and reduces chip power

The 90nm Process
• SRAM chip of 52 megabits
  – 52 million individual bits of information
  – 330 million transistors in 109 square millimeters (about the size of a fingernail)

Supported Chipsets
• 800 MHz system bus
  – 875P, 865PE, 865G, 865GV, and 848P
  – 2.4 GHz to 3.4 GHz
• 533/400 MHz system bus
  – 865P, 850 family, 850E, 845PE, 845GE, 845GV, 845E, 845G
  – 2.26 GHz to 3.06 GHz
• 400 MHz system bus
  – 845GL and 845
  – 2 GHz to 2.6 GHz

Supported Chipsets
• Differences
  – Main memory support
  – Hyper-threading support
  – Graphics technology
  – RAID control
  – other smaller differences

QUESTIONS?
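
The bandwidth figures quoted on the System Bus and Level 2 Cache slides can be checked with a little arithmetic. The sketch below is only an illustration, not anything from Intel's documentation: the function names are made up for this example, the 64 bit width of the system bus is an assumption (the slides do not state it), and the 3.4 GHz core clock is assumed from the top of the range on the Supported Chipsets slide.

```python
def peak_bandwidth_gb_per_s(clock_mhz, transfers_per_clock, width_bits):
    """Peak transfer rate in gigabytes per second (1 GB = 10**9 bytes)."""
    bytes_per_transfer = width_bits / 8
    return clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9


def system_bus_bandwidth(effective_mhz, width_bits=64):
    """Bandwidth of a quad-pumped bus given its effective (marketing) speed.

    Assumption: the data bus is 64 bits wide; the slides do not say so.
    """
    base_clock_mhz = effective_mhz / 4  # e.g. 800 MHz effective = 200 MHz clock
    return peak_bandwidth_gb_per_s(base_clock_mhz, 4, width_bits)


def l2_cache_bandwidth(core_clock_mhz, width_bits=256):
    """Bandwidth of a 256 bit interface that transfers once per core clock."""
    return peak_bandwidth_gb_per_s(core_clock_mhz, 1, width_bits)


if __name__ == "__main__":
    for bus_mhz in (800, 533, 400):
        print(f"{bus_mhz} MHz system bus: {system_bus_bandwidth(bus_mhz):.1f} GB/s")

    # Assumption: 3.4 GHz core clock, the fastest speed listed in the deck.
    print(f"L2 cache at 3.4 GHz: {l2_cache_bandwidth(3400):.1f} GB/s")
```

Under these assumptions the 800 MHz bus works out to 6.4 GB/s and the Level 2 interface to about 108.8 GB/s, which agrees with the figures on the slides. A lower core clock gives a proportionally lower Level 2 transfer rate.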