Tom Presotto - October 2010

A long walk … together

IBM Vimercate S/38 manufacturing plant, February 79
S/38, August 79

Back to the basis
• 1979 – Mainframe and “dumb” terminals
• 1990 – 2010 – Windows client-server model
• 2010 – Cloud Computing and browser (SOA & Cloud)

Processor roadmap
Processor    Process    Year
POWER4™      180 nm     2001
POWER5™      130 nm     2004
POWER6™      65 nm      2007
POWER7       45 nm      2010
POWER8

Processor comparison
               POWER5           POWER5+™         POWER6         POWER7
Technology     130 nm           90 nm            65 nm          45 nm
Size           389 mm²          245 mm²          341 mm²        567 mm²
Transistors    276 M            276 M            790 M          1.2 B
Cores          2                2                2              4/6/8
Frequencies    1.65 GHz         1.9 GHz          3-5 GHz        3-4 GHz
L2 Cache       1.9 MB Shared    1.9 MB Shared    4 MB / Core    256 KB / Core
L3 Cache       36 MB            36 MB            32 MB          4 MB / Core
Memory Cntrl   1                1                2/1            2/1
LPAR           10 / Core        10 / Core        10 / Core      10 / Core

Performance per Watt
Processor    System       Frequency    rPerf     KWatts    rPerf per KWatt
POWER4™      p670         1.1 GHz      24.46     6.71      3.64
POWER4+™     p670         1.5 GHz      46.79     6.71      6.97
POWER5™      p5-570       1.65 GHz     68.4      5.2       13.15
POWER5+™     p570         1.9 GHz      85.20     5.2       16.38
POWER6™      Power 570    4.7 GHz      134.35    5.6       23.99
POWER6™      Power 570    4.2 GHz      193.25    5.6       34.56
POWER7™      Power 780    3.8 GHz      685.09    6.4       107.04

POWER7 Core
Eight processor cores
• 4 Way SMT per core – up to 4 threads per core
• 32 Threads per chip
• L1: 32 KB I Cache / 32 KB D Cache
• L2: 256 KB per core
• L3: Shared 32MB on chip eDRAM
Binary Compatibility with POWER6
Transistors: 1.2 B

POWER7 Core
[Chip diagram: eight POWER7 cores, each with its own L2 cache, around the L3 cache and chip interconnect, with memory controllers MC0 and MC1]
12 Execution Units
• 2 Fixed Point Units
• 2 Load Store Units
• 4 Double Precision Floating Point Units
• 1 Branch
• 1 Condition Register
• 1 Vector Unit
• 1 Decimal Floating Point Unit
64-bit PowerPC architecture v2.07
Modes: POWER6, POWER6+ and POWER7

L3 Cache
[Chip diagram: POWER7 cores with L2 caches and a fast local L3 region per core; local SMP links]
• 6-to-1 latency improvement
On chip cache benefits
[Chip diagram, continued: memory controllers MC0 and MC1, L3 cache and chip interconnect, remote SMP & I/O links]
• eDRAM
• 20% energy of SRAM
• Fast local regions
• Shared L3 cache
• Intelligent cache management
• No off-chip drivers and receivers
• 2x bandwidth improvement

POWER6 and POWER7
[Chip diagrams. POWER6: two cores with AltiVec, 4 MB L2 per core, L3 and L3 controller, memory controller, fabric bus controller, GX bus controller; attached Memory+, Memory++ and GX+ bridge. POWER7: eight cores with L2 caches around the shared L3 cache, memory interface, GX bus, SMP fabric.]
POWER7 delivers up to 3 - 4X the performance with less energy than POWER6

Memory
Processor    Memory                   Effective Bandwidth
POWER5       DDR2 @ 553 MHz           1.1 GB/s
POWER6       DDR2 @ 553 / 667 MHz     2.6 GB/s
POWER7       DDR3 @ 1066 MHz          6.4 GB/s
POWER7: 2 memory controllers, up to 256GB of memory

eDRAM Cell
• Enables POWER7 to provide 32MB of internal L3 Cache
• Less power requirements
• Fewer soft errors
• Better performance
• Greater density
• 1/5 the standby power
• 1/3 the space of a conventional 6T SRAM implementation
• 1.5 Billion reduction in transistors

POWER7 TurboCore™ Mode
Power 780 TurboCore
[Chip diagram: 4 TurboCore chips, each with eight cores and L2 caches around a 32 MB L3 cache, memory interface, GX bus]
• TurboCore chips: 4 available cores
• Aggregation of L3 caches of unused cores
• TurboCore chips have 2X the L3 cache per chip available
• 4 TurboCore chips, L3 = 32 MB
• Provides up to 1.5X per core to core
• Chips run at higher frequency
• Power reduction of unused cores
• With “Reboot”, System can be reconfigured to 8 core mode.
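The TurboCore cache claim can be checked with a little arithmetic. The sketch below is only an illustration, not anything from IBM tooling: it assumes the 2X figure refers to the L3 capacity available to each active core when the 32 MB on-chip L3 is shared by 4 cores instead of 8. The function and constant names are hypothetical.

```python
# Figures quoted on the slides: 32 MB of on-chip L3, 8 cores per chip,
# 4 cores available in TurboCore mode.
L3_PER_CHIP_MB = 32
CORES_PER_CHIP = 8
TURBOCORE_ACTIVE_CORES = 4


def l3_per_active_core(l3_mb: float, active_cores: int) -> float:
    """Return the L3 capacity (MB) each active core gets if L3 is split evenly."""
    if active_cores <= 0:
        raise ValueError("active_cores must be positive")
    return l3_mb / active_cores


# Assumption: an even split of L3 across active cores is a simplification;
# the real cache is shared, with fast local regions per core.
standard_mb = l3_per_active_core(L3_PER_CHIP_MB, CORES_PER_CHIP)
turbocore_mb = l3_per_active_core(L3_PER_CHIP_MB, TURBOCORE_ACTIVE_CORES)
cache_ratio = turbocore_mb / standard_mb

print(f"Standard mode : {standard_mb:.0f} MB L3 per core")
print(f"TurboCore mode: {turbocore_mb:.0f} MB L3 per core")
print(f"Ratio         : {cache_ratio:.1f}x")
```

Under that assumption the standard mode matches the 4 MB / core listed in the processor comparison, and halving the active cores doubles the share of each remaining core.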
[Diagram legend: TurboCores, unused core]

POWER7 Core / Cache options
• 6-Core Chip: 24 MB L3 Cache (Power 750 / Power 770)
• 4-Core Chip: 16 MB L3 Cache (PS700)
[Chip diagrams: cores with L2 caches around the L3 cache, memory interface, GX bus, SMP fabric]

Multi-threading Evolution
• Single thread, out of order
• S80: hardware multi-thread
• POWER5: 2 Way SMT
• POWER7: 4 Way SMT
[Diagram: occupancy of the execution units FX0, FX1, FP0, FP1, LS0, LS1, BRX, CRL for each design; legend: no thread executing, thread 0 executing, thread 1 executing, thread 2 executing, thread 3 executing]

POWER7 4 way SMT
[Chart: relative values for SMT1, SMT2 and SMT4 on an axis from 0 to 2]
Requires POWER7 Mode
• POWER6 Mode supports SMT1 and SMT2
Operating System Support
• AIX 6.1 and AIX 7.1
• IBM i 6.1 and 7.1
• Linux
Dynamic Runtime SMT scheduling
• Spread work among cores to execute in appropriate threaded mode
• Can dynamically shift between modes as required: SMT1 / SMT2 / SMT4

Standard Cache Option: all cores active

Active Memory Expansion
[Diagram: partitions shown as true memory plus expanded memory]
• Expand memory beyond physical limits
• More effective server consolidation
  – Run more application workload / users per partition
  – Run more partitions and more workload per server
• Effectively up to 100% more memory

Active Memory Sharing
• Moves memory from one partition to another
• AIX, IBM i, and Linux partitions
[Charts of Memory Usage (GB) over Time, axis 0 to 15: “Around the World” (Asia, Americas, Europe), “Day and Night” (Night, Day), “Infrequent Use” (partitions #1 to #10)]

EnergyScale™
EnergyScale is used to dynamically optimize processor performance versus processor power and system workload.
IBM Systems Director is also required to
manage AEM functions and supports the following functions:
• Power Trending
• Thermal Reporting
• Static Energy Saver Mode
• Dynamic Energy Saver Mode
• Energy Capping
• Soft Energy Capping
• Processor Nap
• Energy Optimized Fan Control
• Altitude Input
• Processor Folding

TPMD: Thermal Power Management Device
• TPMD monitors power usage and temperatures in real time.
• It can adjust the processor power and performance in real time.
• The TPMD card is part of the base hardware configuration.
• The TPMD function comprises a RISC processor and data acquisition.
• If the temperature exceeds an upper (functional) threshold, TPMD actively reduces power consumption by reducing processor voltage and frequency or throttling memory as needed.

POWER7 “Over Clock”
If the temperature is lower than the upper (functional) threshold, TPMD allows POWER7 cores to “over clock” when workload demand is present.
[Chart: nominal frequency versus over clock uplift, axis from 3 to 4,4]

Offerings - April 2010
• PS700 Express
• PS701 Express
• PS702 Express
• Power 750 Express
• Power 755
• Power 770
• Power 780

New models August 2010
• Power 795
• Power 780
• Power 770
• Power 750
• Power 720/740
• HPC Power 755
• Power 710/730
• PS Blades

Power 710
4-6-8 core, 1 socket, 2U
Processor module – pick ONE (can NOT change module)
• 4-core: #8350 3 GHz
• 6-core: #8349 3.7 GHz
• 8-core: #8359 3.55 GHz
For 4-core (1 socket)
– Zero 12X I/O loops
– Max 1 4X IB Adapter
– Max 64 GB memory
– Disk-only drawers
– Fibre Channel cards ok
– IBM i P05 tier (users)
– AIX small tier
For 6-/8-core (1 socket)
– Zero 12X I/O loops
– Max 1 4X IB Adapter
– Max 64 GB memory
– Disk-only drawers
– Fibre Channel cards ok
– IBM i P10 tier (users)
– AIX small tier

Power 730
8-12-16 core, 2 sockets, 2U
Processor module – pick TWO of the same feature (can NOT change module)
• 4-core: #8350 3 GHz
• 4-core: #8348 3.7 GHz
• 6-core: #8349 3.7 GHz
• 8-core: #8359 3.55 GHz
For 8-/12-/16-core
– Zero 12X I/O loops
– Max 2 4X IB Adapters
– Max 128 GB memory
– Disk-only drawers ok
– Fibre Channel cards ok
– IBM i P20 tier (5250)
– AIX small tier

Power 720
4-6-8 core, 1 socket, 4U
Processor module – pick ONE (can NOT change module)
• 4-core: #8350 3 GHz
• 6-core: #8351 3 GHz
• 8-core: #8352 3 GHz
For 4-core
– Zero 12X I/O loops
– Max 64 GB memory
– Zero disk-only drawers
– Fibre Channel cards ok
– IBM i P05 tier (users)
– AIX small tier
For 6-/8-core
– Max 1 12X I/O loop
– Max 128 GB memory
– Disk-only drawers
– Fibre Channel cards ok
– IBM i P10 tier (users)
– AIX small tier

Power 740
4-6-8-12-16 core, 1 or 2 sockets, 4U
Pick processor modules
• 1 or 2: 4-core: #8353 3.3 GHz
• 1 or 2: 4-core: #8347 3.7 GHz
• 1 or 2: 6-core: #8354 3.7 GHz
• 2: 8-core: #8355 3.55 GHz
For 4-,6-core (1 socket)
– Max 1 12X I/O loop
– Max 128 GB memory
– IBM i P20 tier 5250 Entitlements
– AIX small tier
For 8-,12-,16-core (2 socket)
– Max 2 12X I/O loops
– Max 256 GB memory
– IBM i P20 tier 5250 Entitlements
– AIX small tier

POWER7 delivers outstanding performance
[Chart: CPW, axis from 20,000 to 80,000, for 525 POWER5, 550 POWER5, 520 POWER6, 550 POWER6, 720 POWER7 and 740 POWER7; single core CPW values shown: 3800, 4700, 5950]
NB: CPW measured in maximum system and I/O configuration

Power 750
• 4 Socket 4U
• 6 or 8 cores per socket
• 3.0 to 3.55 GHz
• Energy-Star Qualified

Power 750 System Overview
[Annotated system photo]
• TPMD
• 3 PCIe & 2 PCI-X slots
• Dual power supplies
• Half-high bay (tape or removable disk)
• Up to 4 processor / memory cards
• DVD
• Fans
• 8 SFF bays (disk or SSD)

Power 750 System
POWER7 Architecture: 6 Cores @ 3.3 GHz; 8 Cores @ 3.0, 3.3, 3.55 GHz; Max: 4 Sockets
DDR3 Memory: Up to 512 GB
System Unit SAS SFF Bays: Up to 8 Drives (HDD or SSD); 73 / 146 / 300GB @ 15k (2.4 TB); (Opt: cache & RAID-5/6)
System Unit IO Expansion Slots: PCIe x8: 3 Slots (2 shared); PCI-X DDR: 2 Slots; 1 GX+ & Opt 1 GX++ 12X cards
Integrated SAS / SATA: Yes
System Unit Integrated Ports: 3 USB, 2 Serial, 2 HMC
Integrated Virtual Ethernet: Quad 10/100/1000; Optional: Dual 10 Gb
System Unit Media Bays: 1 Slim-line DVD & 1 Half Height
IO Drawers w/ PCI slots: PCIe = 4 max; PCI-X = 8 max
Cluster: 12X SDR / DDR (IB technology)
Redundant Power and Cooling: Yes (AC or DC Power); Single phase 240 VAC or -48 VDC
Certification (SoD): NEBS / ETSI for harsh environments
EnergyScale: Active Thermal Power Management; Dynamic Energy Save & Capping

Power 755
Power 755 and Power 750 hardware is very similar, but the Power 755 offering is customized for High Performance Computing environments.

Power 755
POWER7 Architecture: 4 Processor Sockets = 32 Cores; 8 Core @ 3.3 GHz
DDR3 Memory: 128 GB / 256 GB, 32 DIMM Slots
System Unit SAS SFF Bays: Up to 8 disk or SSD; 73 / 146 / 300GB @ 15K (up to 2.4TB)
System Unit Expansion: PCIe x8: 3 Slots (1 shared); PCI-X DDR: 2 Slots; GX++ Bus
Integrated Ports: 3 USB, 2 Serial, 2 HMC
Integrated Ethernet: Quad 1Gb Copper (Opt: Dual 10Gb Copper or Fiber)
System Unit Media Bay: 1 DVD-RAM (no supported tape bay)
Cluster: Up to 64 nodes; Ethernet or IB-DDR
Redundant Power: Yes (AC or DC Power); Single phase 240 VAC or -48 VDC
Certifications (SoD): NEBS / ETSI for harsh environments
EnergyScale: Active Thermal Power Management; Dynamic Energy Save & Capping

The highest performing 4-socket system on the planet
POWER7 continues to break the rules with more performance
[Chart: SPECint_rate for Itanium HP rx6600, SPARC Sun T5440, x86 HP DL585, and POWER7 Power 755 with PowerVM]

The most energy efficient 4-socket system on the planet
Most energy efficient systems
[Chart: Performance Per Watt for Itanium HP rx6600, SPARC Sun T5440, x86 HP DL585, and POWER7 Power 755 with PowerVM]
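As a worked example of the performance-per-watt metric, the sketch below recomputes the ratios from the rPerf and KWatts figures on the earlier “Performance per Watt” slide. It assumes the unlabeled numbers on that slide are rPerf divided by KWatts; the class and function names are hypothetical, and the last digit may differ slightly from the slide because of rounding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    processor: str
    model: str
    rperf: float   # relative performance figure quoted on the slide
    kwatts: float  # power draw in kilowatts quoted on the slide

    @property
    def rperf_per_kwatt(self) -> float:
        # Assumption: the slide's efficiency figure is simply rPerf / KWatts.
        return self.rperf / self.kwatts


# Data copied from the "Performance per Watt" slide.
SYSTEMS = [
    System("POWER4", "p670 1.1 GHz", 24.46, 6.71),
    System("POWER4+", "p670 1.5 GHz", 46.79, 6.71),
    System("POWER5", "p5-570 1.65 GHz", 68.4, 5.2),
    System("POWER5+", "p570 1.9 GHz", 85.20, 5.2),
    System("POWER6", "Power 570 4.7 GHz", 134.35, 5.6),
    System("POWER6", "Power 570 4.2 GHz", 193.25, 5.6),
    System("POWER7", "Power 780 3.8 GHz", 685.09, 6.4),
]


def improvement(old: System, new: System) -> float:
    """Return how many times more rPerf per KWatt `new` delivers than `old`."""
    return new.rperf_per_kwatt / old.rperf_per_kwatt


for s in SYSTEMS:
    print(f"{s.processor:8} {s.model:18} {s.rperf_per_kwatt:7.2f} rPerf/KWatt")

gain = improvement(SYSTEMS[0], SYSTEMS[-1])
print(f"POWER7 vs POWER4: {gain:.1f}x the performance per watt")
```

The same helper can compare any two rows, for example the two POWER6 configurations, which draw the same power but differ in rPerf.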
Power 770
• 12 or 16 core 4U Nodes
• Up to 4 Nodes per system
• 3.1 and 3.5 GHz
• Capacity on Demand
• Enterprise RAS

Power 770
Processor Technology: 6 Cores @ 3.55 GHz; 8 Cores @ 3.1 GHz
L3 Cache: On Chip
Enclosure: 4U x 32 inches depth
Redundant Power & Cooling: Yes
Redundant Service Processor: Yes / Two Enclosure minimum
Redundant Clock: Yes / Two Enclosure minimum
Hot Add Support: Yes
Hot Service: Yes
System Unit (Single Enclosure / 4 Enclosures)
• Processors: Up to 2 Sockets / 8 Sockets
• DDR3 Memory (Buffered): Up to 512 GB / Up to 2 TB
• SAS/SSD SFF Bays: 6 / 24
• DVD-RAM Media Bays: 1 Slim-line / 4 Slim-line
• SAS / SATA Controller: 2/1 / 8/4
• PCIe bays: 6 PCIe / 24 PCIe
• GX++ Slots (12X DDR): 2 / 8
• Integrated Ethernet: Std: Quad 1Gb, Opt: Dual 10Gb + Dual 1 Gb / Std: Four Quad 1Gb, Opt: Four x Dual 10Gb + Dual 1 Gb
• USB: 3 / 12
• 12X I/O Drawers w/ PCI slots: Max: 4 PCIe, 8 PCI-X / Max: 16 PCIe, 32 PCI-X

Power 780
• New Modular High-End
• Up to 64 Cores
• TurboCore
• 3.86 or 4.14 GHz
• Capacity on Demand
• Enterprise RAS
• 24x7 Warranty
• PowerCare

Power 780
Processor Technology: 4 Cores @ 4.1 GHz (TurboCore); 8 Cores @ 3.8 GHz
L3 Cache: On Chip
Redundant Power & Cooling: Yes
Redundant Service Processor: Yes / Two Enclosure minimum
Redundant Clock: Yes / Two Enclosure minimum
Hot Add Support: Yes
Hot Service: Yes
System Unit (Single Enclosure / 4 Enclosures)
• Processors: 2 Sockets / 8 Sockets
• DDR3 Memory (Buffered): Up to 512 GB / Up to 2 TB
• SAS/SSD SFF Bays (CEC): 6 / 24
• DVD-RAM Media Bays: 1 Slim-line / 4 Slim-line
• SAS / SATA Controller: 2/1 / 8/4
• PCIe (CEC): 6 PCIe / 24 PCIe
• GX++ Slots (12X DDR): 2 / 8
• Integrated Ethernet: Std: Quad 1Gb, Opt: Dual 10Gb + Dual 1 Gb / Std: Four Quad 1Gb, Opt: Four x Dual 10Gb + Dual 1 Gb
• USB: 3 / 12
• 12X I/O Drawers w/ PCI slots: Max: 4 PCIe, 8 PCI-X / Max: 16 PCIe, 32 PCI-X

Power 795
✓ New High-end
✓ 24 to 256 Cores
✓ 8 TB memory
✓ TurboCore
✓ 3.7, 4.0 or 4.25 GHz
✓ 1,000 VMs* with PowerVM
✓ Capacity on Demand
✓ Enterprise RAS
✓ 24x7 Warranty
✓ PowerCare

On October 7, IBM published a new SAP 2-tier Sales and Distribution benchmark result
on the Power 795. The result is 70,032 users on a 128-core Power 795 running AIX and DB2. This is the highest result ever attained on this benchmark.

Power Systems Blades
✓ PS700: 1 socket, 4 core
✓ PS701: 1 socket, 8 core
✓ PS702: 2 socket, 16 core
✓ 3.0 GHz

POWER7 PS700 Blade, 4 Cores
Architecture: 4 Core, Single Socket
L2 & L3 Cache: On Chip
DDR3 Memory: Up to 64 GB
DASD / Bays: 0 - 2 SAS (300/600GB)
Daughter Card Options: CIOv & CFFh (PCIe Adapters)
Integrated Options (Ethernet, USB): Dual Port Gbit Ethernet
Fiber Support: Yes (via BladeCenter chassis)
Media Bays: 1 BladeCenter chassis
Redundant Power: Yes, BladeCenter chassis
Redundant Cooling: Yes, BladeCenter chassis
Service Processor: Yes
Power & Thermal: POWER Save / Power Cap

POWER7 PS701 Blade, 8 Cores
Architecture: 8 Core, Single Socket
L2 & L3 Cache: On Chip
DDR3 Memory: Up to 128 GB
DASD / Bays: 0 - 1 SAS (300/600GB)
Daughter Card Options: CIOv & CFFh (PCIe Adapters)
Integrated Options (Ethernet, USB): Dual Port Gbit Ethernet
Fiber Support: Yes (via BladeCenter chassis)
Media Bays: 1 BladeCenter chassis
Redundant Power: Yes, BladeCenter chassis
Redundant Cooling: Yes, BladeCenter chassis
Service Processor: Yes
Power & Thermal: POWER Save / Power Cap

POWER7 PS702 Blade, 16 Cores
Architecture: 8 Cores/Socket, Two Socket
L2 & L3 Cache: On Chip
DDR3 Memory: Up to 256 GB
DASD / Bays: 0 - 2 SAS (300/600GB)
Daughter Card Options: CIOv & CFFh (PCIe Adapters)
Integrated Options (Ethernet, USB): Quad Port Gbit Ethernet
Fiber Support: Yes (via BladeCenter chassis)
Media Bays: 1 BladeCenter chassis
Redundant Power: Yes, BladeCenter chassis
Redundant Cooling: Yes, BladeCenter chassis
Service Processor: Yes
Power & Thermal: POWER Save / Power Cap

i Edition Express for BladeCenter S
• BladeCenter PS700 or JS12
• IBM i
• PowerVM Express
• BladeCenter S
• IBM i preloaded
The i Edition Express for BladeCenter S is the perfect alternative to a traditional rack or tower server with comparable starting prices and enables clients to run their i
applications and consolidate x86 servers into a single BladeCenter S chassis that supports up to six blades and over 7 terabytes of disk storage.

Power Systems Virtualization
• Hypervisor: Support for multiple operating environments
• Dynamic Logical Partitioning: Micro-partitioning, resource movement
• Multiple Shared Processor Pools: Cap processor resources for a group of partitions
• Virtual I/O Server: Virtualizes resources for client partitions
• Integrated Virtualization Manager: Simplifies partition management for entry systems
• Lx86: Supports x86 Linux applications
• Live Partition Mobility: Move running AIX and Linux partitions
• Active Memory Sharing: Share a memory pool among partitions
[Diagram: partitions and VIOS on top of the Power Hypervisor]

IBM i 7.1 Highlights
• DB2: Support for XML and column level encryption
• PowerHA: Async Geographic Mirroring & LUN-level switching
• Virtualization: IBM i 6.1 virtualization for i 7.1 partitions
• Solid State Drives: Automatic movement of hot data to SSDs
• Workload Capping: Limit # of cores used by middleware within a partition
• Open Access for RPG: Extend application reach to pervasive devices
• Zend Server Community Edition: PHP environment preloaded with IBM i
• Systems Director: Richer management of IBM i via Systems Director
[Diagram: VIOS and IBM i 6.1 partitions hosted by IBM i 7.1 on Power Systems]

Traditional IBM i Workload management
IBM i Workload Management
• Subsystems provide workload isolation
• Priorities are used to schedule work
• No way to cap a given application to a subset of the processor resources in a partition
• All workloads can access the full number of cores in the partition
IBM i System / Partition
• Application 1 = 8 Cores
• Application 2 = 8 Cores
• Application 3 = 8 Cores

IBM i Workload Capping
IBM i workload capping can control workloads by limiting the number of cores that can be used by an application
IBM i System / Partition
• Application 1 = 3 Cores
• Application 2 = 6 Cores
• Application 3 = 8 Cores

Back to the basis
Back to the future

IBM i and Cloud Computing
• Best platform for “private
cloud”
• Centralized model
• Server consolidation
• Bring back the complexity into the “computer room”
• No more “personal” workstations and company’s data stored on users’ disks
• Low-TCO terminals
• SOA approach to integrate third party “SaaS” solutions

IBM refreshes CloudBurst line with Power7 chips - 14 October 2010
A rack with a single Power 750 server and 32 processor cores can run up to 160 virtual machines, while the top end system, with 11 Power 750 servers in five racks, can run up to 2,960 virtual machines.

tom.presotto@evog.it