SSG EMEA Software Enabling

Intel Multi-Core: the Road Intel Is Committed To
Bjoern Bruecher
Sr. SW Application Engineer, SSG/DRD/EMEA Enterprise Team
Intel GmbH, Munich
Bjoern.Bruecher@Intel.com

Intel: Total Platform Approach
Platform Validation • Chipsets/Comms • Boards • Systems • Intel Capital • Industry Standards • Solutions Blueprints • Software Tools • Intel® Solutions Services • Developer Services • Software Vendor Alliances
Unparalleled industry reach and ecosystem breadth

SSG EMEA Software Enabling for CST GmbH; Bjoern.Bruecher@Intel.com

The Intel Behind Intel Software
Intel Software and Solutions Group
1 group • 6 divisions • 14 time zones • 24 major sites • 2,500 employees
170,000 registered developers, 20+ Operating Systems, Thousands of Applications
Software Development Tools • Software IP Strategy • Managed Run Time Enabling • Architecting Solutions & Services • Strategic Enabling • Parallel & Distributed Solutions • Software Enabling & Developer Services • Platform Planning
Intel Software Tools, Engineers, Expertise

Moore's Law is Alive and Well
[Chart: transistor count (log scale, 1,000 to 10,000,000,000) vs. year (1970 to 2010). Processors shown: 4004, 8008, 8080, 8086, 286, 386™ Processor, 486™ DX Processor, Pentium® Processor, Pentium® II Processor, Pentium® III Processor, Pentium® 4 Processor, Pentium® M Processor, Itanium® Family DC Processor (Montecito) = 1.7 billion transistors]
"Eventually one billion transistors may crowd a single chip – 1,000 times more than possible today." National Geographic, 1982
* Other brands and names may be claimed as the property of others.

…thru Intel's Nanotechnology Leadership
[Chart: feature size (1000 nm, 100 nm, 10 nm) vs. year (1980 to 2020)]
Intel Keeps Introducing a New Si Generation Every 2 Years
www.intel.com/research/silicon
All features and dates specified are targets provided for planning purposes only and are subject to change without notice.
45 nm chips currently in production

Challenge 1: Power
[Chart: Power Density Race. Power density (W/cm2, log scale, 1 to 10,000) vs. year ('70 to '10) for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486 and Pentium® processors, with reference levels for Hot Plate, Nuclear Reactor, Rocket Nozzle and Sun's Surface]
Exponential growth in power will not allow us to keep pushing frequency

Challenge 2: Memory Latency
[Chart: Memory Latency. Logic speed vs. memory speed (MHz, 0 to 3000), 1992 to 2002, with a widening gap between the two]
The CPU has to wait longer and longer when accessing memory

Challenge 3: RC Delays
[Chart: Delay (ps, log scale, 1 to 10,000) vs. process technology (0.35, 0.25, 180nm, 130nm, 90nm, 65nm). Series: interconnect RC delay (RC delay of 1 mm copper interconnect) and clock period]
• i486™: 16 clocks for a RAM access roundtrip
• 65 nm IC: 15 clocks to cross the chip
Signal propagation delay increases

Doubling Performance on Tight Power Budget
• Baseline: one core at frequency f, P0 ~ f²
• One core at 2f: P ~ 4f², so P = 4P0 (4x power increase)
• Two cores at f on one die/socket: P ~ 2f², so P = 2P0 ('only' 2x)
Double the Cores to Double Performance
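The scaling argument on the "Doubling Performance on Tight Power Budget" slide can be checked with a few lines of Python. This is only a minimal sketch under the slide's simplified model: power per core is taken as proportional to f², and performance as proportional to cores × frequency (that is, perfectly parallel software). The function names are illustrative and not from the original deck.

```python
def relative_power(cores: int, freq_scale: float) -> float:
    """Power relative to P0 (one core at base frequency f).

    Assumption taken from the slide: power per core scales as f^2,
    and total power adds up linearly over cores.
    """
    return cores * freq_scale ** 2


def relative_performance(cores: int, freq_scale: float) -> float:
    """Throughput relative to one core at base frequency f.

    Assumption: the workload is 100% parallel, so throughput scales
    linearly with both core count and frequency.
    """
    return cores * freq_scale


if __name__ == "__main__":
    # Option A: keep one core, double the frequency.
    print("1 core @ 2f :", relative_performance(1, 2.0), "x perf,",
          relative_power(1, 2.0), "x power")   # 2.0 x perf, 4.0 x power
    # Option B: keep the frequency, double the cores.
    print("2 cores @ f :", relative_performance(2, 1.0), "x perf,",
          relative_power(2, 1.0), "x power")   # 2.0 x perf, 2.0 x power
```

Both options double performance in this model, but the two-core option does it for half the power of the frequency-doubling option.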
Multi-Core Paradigm
• Old Paradigm: Do One Task Really, Really Fast (Single Core Performance)
• New Paradigm: Do Many Tasks Simultaneously (System Performance)
[Charts: relative single-core performance and relative system performance for Large, Medium and Small cores, assuming 100% parallel software; surviving data labels: 1, 0.5, 0.3]
Frequency is NO Longer the Best Indicator of Performance

Innovative Microarchitecture
Quad-Core Intel® Xeon® Processor 5300 Series features …
Intel® NetBurst® Microarchitecture + Mobile Microarchitecture + New Innovations = Intel® Core™ Microarchitecture
• Wide Dynamic Execution
• Advanced Digital Media Boost
• Smart Memory Access
• Advanced Smart Cache
• Intelligent Power Capability
Also used in Dual-Core Intel® Xeon® Processor 5100 Series
Common Architecture for Intel Servers, Desktop, Mobile

New Intel® Core™ Microarchitecture
• FASTER: Intel® Wide Dynamic Execution Engine, 4 instructions per cycle
• EFFICIENT: Intel® Intelligent Power Capability, ultra fine grained power control
• SMARTER: Intel® Advanced Smart Cache (2x cache size, shared) and Intel® Smart Memory Access (advanced pre-fetching)
1 Compared to previous generation Dual-Core Intel® Xeon® Processor based servers
How do these features work? Learn more at intel.com: http://www.intel.com/technology/architecture/coremicro/demo/demo.htm
Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance.
Delivering a Performance and Power Advantage for IT

New Intel® Core™ Microarchitecture
Intel® Wide Dynamic Execution (WIDER)
• NetBurst/Yonah pipeline: three lanes
• Conroe/Woodcrest/Merom pipeline: four-lane superscalar pipeline
Conroe/Woodcrest/Merom decodes, executes and retires instructions at a sustained rate of 4 instructions per clock cycle

New Intel® Core™ Microarchitecture
Intel® Advanced Smart Cache (Efficient Data Sharing)
• Dynamic Cache Allocation: the shared cache adapts to mismatched loads on each core
• Dynamic Bandwidth Allocation: a high-bandwidth application can borrow L1 to L2 bandwidth from the other core's application
[Diagrams: Core 1 and Core 2 sharing one L2 cache; 2X L2 to L1 bandwidth]
2X bandwidth to L1 caches and up to 4MB L2 cache

New Intel® Core™ Microarchitecture
Intel® Advanced Digital Media Boost (Improved perf. for Multimedia apps)
Improved performance for Streaming SIMD Extensions (SSE, SSE2, SSE3): 128 bits every single clock cycle vs. 64 in previous Intel processors

New Intel® Core™ Microarchitecture
Intel® Intelligent Power Capability; Power Gating
[Diagram: power shut off]
Intel Intelligent Power Capability optimizes energy usage, delivering more performance per watt

Infrastructure Refresh with Multi-Core
Quad-Core Boosts Application Performance Across the Board
[Chart: relative performance (1.0, 2x, 3x, 4x, 5x; labels "up to 5x" and "4.5x") for Integer, Floating pt, SAP 2 tier, Java apps and database workloads. Series: 64-bit Intel Xeon single core (baseline 1.0, fastest single-core Intel Xeon servers), Dual-Core Intel® Xeon® 5160, Quad-Core Intel® Xeon® E5345. Best published or Intel measured; data as of Nov 7, 2006]
• Intel® Core™ Microarchitecture
• Large 8MB On-Die Cache
• Dual & Quad-Core
• Dedicated High Speed Buses
• Advanced FBD Memory
• Intel® QuickData Technology
Tremendous Capability – Broad IT Usages

Quad-Core for Workstations
• Simultaneous design and analysis
• Interact seamlessly with large models & assemblies
• Improved Performance for Rendering
• Faster Analytics for Financial Services
What Can Quad-Core Do For You?
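One rough way to put numbers on that question: the Multi-Core Paradigm slide assumes 100% parallel software, and real applications rarely meet that assumption. The sketch below uses Amdahl's law, a standard textbook model that is not part of the original slides, to estimate the speedup on four cores when only part of the work runs in parallel. The function name and the sample parallel fractions are illustrative assumptions, not measured data.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Estimated speedup over one core under Amdahl's law.

    parallel_fraction: share of the runtime that can run in parallel (0..1).
    cores: number of cores the parallel part is spread over.
    Assumption: the parallel part scales perfectly and there is no
    synchronization or memory-bandwidth overhead.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical parallel fractions, chosen only for illustration.
    for fraction in (1.0, 0.9, 0.5):
        speedup = amdahl_speedup(fraction, cores=4)
        print(f"{fraction:.0%} parallel on 4 cores: {speedup:.2f}x")
```

Under this model only fully parallel code reaches the ideal 4x on a quad-core; the serial part of an application caps the benefit, which is why software enabling matters as much as the core count.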
The Magic of 45 nm
Spectrum of Intel Architecture
[Chart: performance vs. power]
• System on a Chip: Ultra-low Cost and Power Optimized Architecture
• Notebook, Desktop and Server: Intel® Core™ 2 and Nehalem Architecture
• Visual Computing and HPC Optimized Architecture
• Integration Options (Example: Graphics)

Conclusion
• Frequency is NO longer the best indicator of performance
• Intel's innovative Core Microarchitecture leads to outstanding performance on a tight power budget
• Intel's Dual-Core and Quad-Core processors boost application performance across the board
• Intel is committed to the roadmap with a clear Multi-/Many-Core direction

Intel® Workstation Platform Roadmap
Timeline: 2H'2006, 2007, 2008, Future (platforms listed from earliest to latest within each segment). Legend: Dual Core, 4+ Cores.

Xeon® DP Workstation
• Glidewell Platform: Quad-Core Xeon® processor 5300 series (4+ cores), Intel® Xeon® processor 5100 series (dual core); Intel® 5000X Chipset
• Stoakley Platform: Harpertown (4+ cores), Wolfdale-DP (dual core); Seaburg Chipset
• Thurley Platform: Gainestown; Tylersburg Chipset

Intel® UP Workstation
• Wyloway Platform: Intel® Core™ 2 Extreme and Intel® Core™ 2 Quad (4+ cores), Intel® Core™ 2 Extreme and Intel® Core™ 2 Duo (dual core); Intel® 975X Chipset
• Garlow Platform: Yorkfield (4+ cores), Wolfdale (dual core); Bearlake-X Chipset
• Foxhollow Platform: Bloomfield (4+ cores), Bloomfield-DC (dual core); Tylersburg Chipset

Intel® Mobile Workstation
• Napa Platform: Intel® Core™ 2 Duo; Mob. Intel® 945PM Chipset
• Santa Rosa Platform: Intel® Core™ 2 Duo; Mob. Intel® PM965 Chipset