Power 7 - RegOnline

Tom Presotto - October 2010
A long walk … together
IBM Vimercate
S/38 manufacturing plant.
February 79
S/38 August 79
Back to basics
1979 – Mainframe and “dumb” terminals
1990 – 2010 – Windows client-server model
2010 – Cloud Computing and browser (SOA & Cloud)
• POWER4™ – 2001, 180 nm
• POWER5™ – 2004, 130 nm
• POWER6™ – 2007, 65 nm
• POWER7 – 2010, 45 nm
• POWER8
               POWER5           POWER5+™         POWER6          POWER7
Technology     130 nm           90 nm            65 nm           45 nm
Size           389 mm²          245 mm²          341 mm²         567 mm²
Transistors    276 M            276 M            790 M           1.2 B
Cores          2                2                2               4/6/8
Frequencies    1.65 GHz         1.9 GHz          3-5 GHz         3-4 GHz
L2 Cache       1.9 MB Shared    1.9 MB Shared    4 MB / Core     256 KB / Core
L3 Cache       36 MB            36 MB            32 MB           4 MB / Core
Memory Cntrl   1                1                2/1             2/1
LPAR           10 / Core        10 / Core        10 / Core       10 / Core
Performance per Watt
Processor   System      Frequency   rPerf    KWatts   rPerf / KWatt
POWER4™     p670        1.1 GHz     24.46    6.71     3.64
POWER4+™    p670        1.5 GHz     46.79    6.71     6.97
POWER5™     p5-570      1.65 GHz    68.4     5.2      13.15
POWER5+™    p570        1.9 GHz     85.20    5.2      16.38
POWER6™     Power 570   4.7 GHz     134.35   5.6      23.99
POWER6™     Power 570   4.2 GHz     193.25   5.6      34.56
POWER7™     Power 780   3.8 GHz     685.09   6.4      107.04
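The last figure for each system on this slide appears to be rPerf divided by KWatts. The sketch below recomputes that ratio from the slide's own numbers; the function name and the tuple layout are illustrative assumptions, not part of the presentation.

```python
# Figures copied from the slide: (system, rPerf, KWatts, ratio shown on the slide).
SYSTEMS = [
    ("POWER4 p670 1.1 GHz", 24.46, 6.71, 3.64),
    ("POWER4+ p670 1.5 GHz", 46.79, 6.71, 6.97),
    ("POWER5 p5-570 1.65 GHz", 68.4, 5.2, 13.15),
    ("POWER5+ p570 1.9 GHz", 85.20, 5.2, 16.38),
    ("POWER6 Power 570 4.7 GHz", 134.35, 5.6, 23.99),
    ("POWER6 Power 570 4.2 GHz", 193.25, 5.6, 34.56),
    ("POWER7 Power 780 3.8 GHz", 685.09, 6.4, 107.04),
]


def rperf_per_kw(rperf: float, kilowatts: float) -> float:
    """Return relative performance delivered per kilowatt."""
    return rperf / kilowatts


if __name__ == "__main__":
    for name, rperf, kw, slide_value in SYSTEMS:
        computed = rperf_per_kw(rperf, kw)
        print(f"{name:28s} computed {computed:7.2f}   slide {slide_value}")
```

The recomputed values land within a few hundredths of the slide's figures, so small differences are probably rounding on the slide.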
POWER7 Core
• Eight processor cores
• 4 Way SMT per core – up to 4 threads per core
• 32 Threads per chip
• L1: 32 KB I Cache / 32 KB D Cache
• L2: 256 KB per core
• L3: Shared 32MB on chip eDRAM
Binary Compatibility with POWER6
Transistors: 1.2 B
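The per-chip totals follow directly from the per-core figures above. A minimal sketch of that arithmetic, with constant and function names that are purely illustrative:

```python
# Per-core and per-chip figures taken from the slide.
CORES_PER_CHIP = 8
SMT_THREADS_PER_CORE = 4
L2_KB_PER_CORE = 256
L3_MB_SHARED = 32


def chip_totals(cores: int = CORES_PER_CHIP) -> dict:
    """Derive per-chip totals from the per-core numbers."""
    return {
        "threads_per_chip": cores * SMT_THREADS_PER_CORE,
        "l2_total_mb": cores * L2_KB_PER_CORE / 1024,
        "l3_mb_per_core": L3_MB_SHARED / cores,
    }


if __name__ == "__main__":
    print(chip_totals())
```

With eight cores this gives the 32 threads per chip quoted above and 4 MB of L3 per core.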
POWER7 Core
[Chip diagram: eight POWER7 cores, each with its own L2 cache, around the L3 Cache and Chip Interconnect, with memory controllers MC0 and MC1]
12 Execution Units
• 2 Fixed Point Units
• 2 Load Store Units
• 4 Double Precision Floating Point Units
• 1 Branch
• 1 Condition Register
• 1 Vector Unit
• 1 Decimal Floating Point Unit
64-bit Power ISA v2.06
Modes: POWER6, POWER6+ and POWER7
L3 Cache
[Chip diagram: eight POWER7 cores with L2 caches, each next to a fast L3 region, plus MC0, MC1, Local SMP Links, and Remote SMP & I/O Links]
eDRAM
• Fast Local Regions
• Shared L3 Cache
• Intelligent cache management
On chip cache benefits
• 6-to-1 latency improvement
• 2x bandwidth improvement
• No off-chip drivers and receivers
• 20% energy of SRAM
[Diagram: POWER6 chip layout (cores with AltiVec, 4 MB L2, L3 Ctrl, Memory Cntrl, Fabric Bus Controller, GX Bus Cntrl) beside POWER7 chip layout (cores with L2, shared L3 Cache, Memory Interface, GX+ Bridge)]
POWER7 delivers up to 3 - 4X the performance with less energy than POWER6
• POWER5: DDR2 @ 553 MHz, effective bandwidth 1.1 GB/s
• POWER6: DDR2 @ 553 / 667 MHz, effective bandwidth 2.6 GB/s
• POWER7: DDR3 @ 1066 MHz, effective bandwidth 6.4 GB/s
• 2 memory controllers, up to 256 GB of memory
eDRAM Cell
Enables POWER7 to provide 32 MB of internal L3 Cache
• Lower power requirements
• Fewer soft errors
• Better performance
• Greater density
• 1/5 the standby power
• 1/3 the space of a conventional 6T SRAM implementation
• 1.5 billion reduction in transistors
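A rough sanity check of the transistor saving can be done by counting storage cells only. This sketch assumes six transistors per SRAM bit (the "6T" on the slide) and one transistor per eDRAM bit. The one-transistor figure is an assumption, and tags, ECC and peripheral circuitry are ignored, so the result is not expected to match the slide's 1.5 billion exactly.

```python
def storage_transistors(capacity_mb: int, transistors_per_bit: int) -> int:
    """Count transistors used by the storage cells alone (no tags, ECC or periphery)."""
    bits = capacity_mb * 1024 * 1024 * 8
    return bits * transistors_per_bit


L3_CAPACITY_MB = 32             # from the slide
SRAM_TRANSISTORS_PER_BIT = 6    # "6T SRAM" on the slide
EDRAM_TRANSISTORS_PER_BIT = 1   # assumption for illustration

sram_total = storage_transistors(L3_CAPACITY_MB, SRAM_TRANSISTORS_PER_BIT)
edram_total = storage_transistors(L3_CAPACITY_MB, EDRAM_TRANSISTORS_PER_BIT)
saved = sram_total - edram_total

if __name__ == "__main__":
    print(f"6T SRAM cells : {sram_total / 1e9:.2f} billion transistors")
    print(f"eDRAM cells   : {edram_total / 1e9:.2f} billion transistors")
    print(f"Difference    : {saved / 1e9:.2f} billion transistors")
```

Under these assumptions the saving is of the same order as the slide's figure, which is consistent with the claim even though the exact accounting behind 1.5 billion is not given in the deck.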
POWER7 TurboCore™ Mode
Power 780 TurboCore
• TurboCore chips: 4 available cores
• Aggregation of the L3 caches of unused cores
• TurboCore chips have 2X the L3 cache available per core (4 TurboCore cores, L3 = 32 MB)
• Provides up to 1.5X per core (core-to-core comparison)
• Chips run at higher frequency
• Power reduction of unused cores
• With “Reboot”, the system can be reconfigured to 8 core mode
[Chip diagram: eight cores with L2 caches around a 32 MB L3 Cache; four are marked as TurboCores and four as unused cores]
POWER7 Core / Cache options
• 6-Core Chip: 24 MB L3 Cache (Power 750 / Power 770)
• 4-Core Chip: 16 MB L3 Cache (PS700)
Multi-threading Evolution
• Single thread Out of Order
• S80 Hardware Multi-thread
• POWER5 2 Way SMT
• POWER7 4 Way SMT
[Figure: occupancy of execution units FX0, FX1, FP0, FP1, LS0, LS1, BRX, CRL for each design. Legend: No Thread Executing, Thread 0 Executing, Thread 1 Executing, Thread 2 Executing, Thread 3 Executing]
POWER7 4 way SMT
Requires POWER7 Mode
• POWER6 Mode supports SMT1 and SMT2
Operating System Support
• AIX 6.1 and AIX 7.1
• IBM i 6.1 and 7.1
• Linux
Dynamic Runtime SMT scheduling
• Spread work among cores to execute in appropriate threaded mode
• Can dynamically shift between modes as required: SMT1 / SMT2 / SMT4
[Chart: bars for SMT1, SMT2 and SMT4 on a scale from 0 to 2]
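The relationship between processor mode and available SMT modes can be written down as a small lookup. This is a sketch based only on what the slide states (POWER6 mode: SMT1 and SMT2; SMT4 requires POWER7 mode); the dictionary and function names are hypothetical.

```python
# SMT modes available in each processor compatibility mode, as stated on the slide.
SMT_MODES_BY_PROCESSOR_MODE = {
    "POWER6": (1, 2),
    "POWER7": (1, 2, 4),
}


def max_hardware_threads(cores: int, processor_mode: str) -> int:
    """Return the largest number of hardware threads a partition can run."""
    return cores * max(SMT_MODES_BY_PROCESSOR_MODE[processor_mode])


if __name__ == "__main__":
    for mode in SMT_MODES_BY_PROCESSOR_MODE:
        print(mode, max_hardware_threads(8, mode))
```

For an 8-core partition this gives 16 threads in POWER6 mode and 32 in POWER7 mode.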
Standard Cache Option
All cores active
Active Memory Expansion
[Diagram: True Memory extended by Expanded Memory]
Expand memory beyond physical limits
• Effectively up to 100% more memory
More effective server consolidation
• Run more application workload / users per partition
• Run more partitions and more workload per server
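"Up to 100% more memory" can be read as an expansion factor between 1.0 and 2.0 applied to the true memory. The sketch below illustrates that reading only; the function name and the idea of a single numeric factor are assumptions made for the example, not a description of how the feature is configured.

```python
def expanded_memory_gb(true_memory_gb: float, expansion_factor: float) -> float:
    """Effective memory seen by the partition for a given expansion factor."""
    # 2.0 corresponds to the slide's "up to 100% more memory".
    if not 1.0 <= expansion_factor <= 2.0:
        raise ValueError("expansion factor must be between 1.0 and 2.0")
    return true_memory_gb * expansion_factor


if __name__ == "__main__":
    for factor in (1.0, 1.5, 2.0):
        print(f"64 GB true memory x {factor} = {expanded_memory_gb(64, factor)} GB")
```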
Active Memory Sharing
Moves memory from one partition to another
AIX, IBM i, and Linux partitions
[Charts: Memory Usage (GB), 0 to 15, over Time for three scenarios: Around the World (Asia, Americas, Europe), Day and Night, and Infrequent Use (partitions #1 to #10)]
EnergyScale™
EnergyScale dynamically optimizes processor performance versus processor power and system workload.
IBM Systems Director is also required to manage AEM functions and supports the following functions:
• Power Trending
• Thermal Reporting
• Static Energy Saver Mode
• Dynamic Energy Saver Mode
• Energy Capping
• Soft Energy Capping
• Processor Nap
• Energy Optimized Fan Control
• Altitude Input
• Processor Folding
TPMD: Thermal Power Management Device
• TPMD monitors power usage and temperatures in real time
• Can adjust the processor power and performance in real time
• TPMD card is part of the base hardware configuration
• TPMD function is comprised of a RISC processor and data acquisition
If the temperature exceeds an upper (functional) threshold, TPMD actively reduces power consumption by reducing processor voltage and frequency or throttling memory as needed.
POWER7 “Over Clock” Uplift
If the temperature is lower than the upper (functional) threshold, TPMD will allow POWER7 cores to “over clock” if workload demands are present.
[Chart: Nominal vs. Over Clock, axis from 3 to 4,4]
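The two TPMD rules (throttle above the upper threshold, allow over-clocking below it when the workload demands it) can be sketched as a single decision step. Every number below is hypothetical and chosen only for illustration, as are the class and function names; the real TPMD firmware logic is not described in this deck.

```python
from dataclasses import dataclass


@dataclass
class SensorReading:
    temperature_c: float    # measured temperature
    workload_demand: bool   # True when the workload asks for more performance


# Hypothetical limits for illustration only (frequencies in MHz).
UPPER_THRESHOLD_C = 85.0
NOMINAL_MHZ = 3800
MAX_OVERCLOCK_MHZ = 4100
MIN_MHZ = 3000
STEP_MHZ = 100


def next_frequency_mhz(current_mhz: int, reading: SensorReading) -> int:
    """Pick the next core frequency from one temperature/workload sample."""
    if reading.temperature_c > UPPER_THRESHOLD_C:
        # Too hot: reduce frequency to cut power consumption.
        return max(MIN_MHZ, current_mhz - STEP_MHZ)
    if reading.workload_demand:
        # Thermal headroom and demand present: allow "over clock".
        return min(MAX_OVERCLOCK_MHZ, current_mhz + STEP_MHZ)
    # Cool and no demand: run at nominal frequency.
    return NOMINAL_MHZ


if __name__ == "__main__":
    freq = NOMINAL_MHZ
    for sample in (SensorReading(60.0, True), SensorReading(60.0, True),
                   SensorReading(90.0, True), SensorReading(60.0, False)):
        freq = next_frequency_mhz(freq, sample)
        print(sample, "->", freq, "MHz")
```

Integer MHz values are used so the stepping stays exact. Voltage scaling and memory throttling, which the slide also mentions, are left out of the sketch.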
Offerings - April 2010
PS700 Express
PS701 Express
PS702 Express
Power 770
Power 780
Power 750 Express
Power 755
New models - August 2010
Power 795
Power 780
Power 770
Power 750
Power 720/740
HPC
Power 755
Power 710/730
PS Blades
Power 710 4-6-8 core
• 1 socket, 2U
• Processor module – pick ONE
  – 4-core: #8350 3 GHz
  – 6-core: #8349 3.7 GHz
  – 8-core: #8359 3.55 GHz
• Can NOT change module
For 4-core (1 socket)
– Zero 12X I/O loops
– Max 1 4X IB Adapter
– Max 64 GB memory
– Disk-only drawers
– Fibre Channel cards ok
– IBM i P05 tier (users)
– AIX small tier
For 6-/8-core (1 socket)
– Zero 12X I/O loops
– Max 1 4X IB Adapter
– Max 64 GB memory
– Disk-only drawers
– Fibre Channel cards ok
– IBM i P10 tier (users)
– AIX small tier
Power 730 8-12-16 core
• 2 sockets, 2U
• Processor module – pick TWO of the same feature
  – 4-core: #8350 3 GHz
  – 4-core: #8348 3.7 GHz
  – 6-core: #8349 3.7 GHz
  – 8-core: #8359 3.55 GHz
• Can NOT change module
For 8-/12-/16-core
– Zero 12X I/O loops
– Max 2 4X IB Adapters
– Max 128 GB memory
– Disk-only drawers ok
– Fibre Channel cards ok
– IBM i P20 tier (5250)
– AIX small tier
Power 720 4-6-8 core
• 1 socket, 4U
• Processor module – pick ONE
  – 4-core: #8350 3 GHz
  – 6-core: #8351 3 GHz
  – 8-core: #8352 3 GHz
• Can NOT change module
For 4-core
– Zero 12X I/O loops
– Max 64 GB memory
– Zero disk-only drawers
– Fibre Channel cards ok
– IBM i P05 tier (users)
– AIX small tier
For 6-/8-core
– Max 1 12X I/O loop
– Max 128 GB memory
– Disk-only drawers
– Fibre Channel cards ok
– IBM i P10 tier (users)
– AIX small tier
Power 740 4-6-8-12-16 core
• 1 or 2 sockets, 4U
• Pick processor modules
  – 1 or 2: 4-core: #8353 3.3 GHz
  – 1 or 2: 4-core: #8347 3.7 GHz
  – 1 or 2: 6-core: #8354 3.7 GHz
  – 2: 8-core: #8355 3.55 GHz
For 4-,6-core (1 socket)
– Max 1 12X I/O loop
– Max 128 GB memory
– IBM i P20 tier (5250 Entitlements)
– AIX small tier
For 8-,12-,16-core (2 socket)
– Max 2 12X I/O loops
– Max 256 GB memory
– IBM i P20 tier (5250 Entitlements)
– AIX small tier
POWER7 delivers outstanding performance
[Chart: CPW, axis 20,000 to 80,000, for 525 and 550 (POWER5), 520 and 550 (POWER6), 720 and 740 (POWER7)]
Single core CPW: 3800 (POWER5), 4700 (POWER6), 5950 (POWER7)
NB: CPW measured in maximum system and I/O configuration
Power 750
4 Socket 4U
6 or 8 cores per socket
3.0 to 3.55 GHz
Energy-Star Qualified
Power 750 System Overview
• TPMD
• 3 PCIe & 2 PCI-X Slots
• Dual Power Supplies
• Up to 4 Processor / Memory Cards
• Half-High Bay (tape or removable disk)
• DVD
• Fans
• 8 SFF Bays (Disk or SSD)
Power 750 System
POWER7 Architecture: 6 Cores @ 3.3 GHz; 8 Cores @ 3.0, 3.3, 3.55 GHz; Max: 4 Sockets
DDR3 Memory: Up to 512 GB
System Unit SAS SFF Bays: Up to 8 Drives (HDD or SSD); 73 / 146 / 300 GB @ 15k (2.4 TB); (Opt: cache & RAID-5/6)
System Unit IO Expansion Slots: PCIe x8: 3 Slots (2 shared); PCI-X DDR: 2 Slots; 1 GX+ & Opt 1 GX++ 12X cards
Integrated SAS / SATA: Yes
System Unit Integrated Ports: 3 USB, 2 Serial, 2 HMC
Integrated Virtual Ethernet: Quad 10/100/1000; Optional: Dual 10 Gb
System Unit Media Bays: 1 Slim-line DVD & 1 Half Height
IO Drawers w/ PCI slots: PCIe = 4 Max; PCI-X = 8 Max
Cluster: 12X SDR / DDR (IB technology)
Redundant Power and Cooling: Yes (AC or DC Power); Single phase 240 VAC or -48 VDC
Certification (SoD): NEBS / ETSI for harsh environments
EnergyScale: Active Thermal Power Management; Dynamic Energy Save & Capping
Power 755
Power 755 and Power 750 hardware are very similar, but the Power 755 offering is customized for High Performance Computing environments.
Power 755
POWER7 Architecture: 4 Processor Sockets = 32 Cores; 8 Core @ 3.3 GHz
DDR3 Memory: 128 GB / 256 GB, 32 DIMM Slots
System Unit SAS SFF Bays: Up to 8 disk or SSD; 73 / 146 / 300 GB @ 15K (up to 2.4 TB)
System Unit Expansion: PCIe x8: 3 Slots (1 shared); PCI-X DDR: 2 Slots; GX++ Bus
Integrated Ports: 3 USB, 2 Serial, 2 HMC
Integrated Ethernet: Quad 1Gb Copper (Opt: Dual 10Gb Copper or Fiber)
System Unit Media Bay: 1 DVD-RAM (No supported tape bay)
Cluster: Up to 64 nodes; Ethernet or IB-DDR
Redundant Power: Yes (AC or DC Power); Single phase 240 VAC or -48 VDC
Certifications (SoD): NEBS / ETSI for harsh environments
EnergyScale: Active Thermal Power Management; Dynamic Energy Save & Capping
The highest performing 4-socket system on the planet
POWER7 continues to break the rules with more performance
[Chart: SPECint_rate for Itanium (HP rx6600), SPARC (Sun T5440), x86 (HP DL585), and POWER7 (Power 755 with PowerVM)]
The most energy efficient 4-socket system on the planet
Most energy efficient systems
[Chart: Performance Per Watt for Itanium (HP rx6600), SPARC (Sun T5440), x86 (HP DL585), and POWER7 (Power 755 with PowerVM)]
Power 770
12 or 16 core 4U Nodes
Up to 4 Nodes per system
3.1 and 3.5 GHz
Capacity on Demand
Enterprise RAS
Power 770
4U x 32 inches Depth
Processor Technology          6 Cores @ 3.55 GHz, 8 Cores @ 3.1 GHz
L3 Cache                      On Chip
Redundant Power & Cooling     Yes
Redundant Server Processor    Yes / Two Enclosure minimum
Redundant Clock               Yes / Two Enclosure minimum
Hot Add Support               Yes
Hot Service                   Yes

System Unit                   Single Enclosure                 4 Enclosures
Processors                    Up to 2 Sockets                  8 Sockets
DDR3 Memory (Buffered)        Up to 512 GB                     Up to 2 TB
SAS/SSD SFF Bays              6                                24
DVD-RAM Media Bays            1 Slim-line                      4 Slim-line
SAS / SATA Controller         2/1                              8/4
PCIe bays                     6 PCIe                           24 PCIe
GX++ Slots (12X DDR)          2                                8
Integrated Ethernet           Std: Quad 1Gb                    Std: Four Quad 1Gb
                              Opt: Dual 10Gb + Dual 1 Gb       Opt: Four x Dual 10Gb + Dual 1 Gb
USB                           3                                12
12X I/O Drawers w/ PCI slots  Max: 4 PCIe, 8 PCI-X             Max: 16 PCIe, 32 PCI-X
Power 780
New Modular High-End
Up to 64 Cores
TurboCore
3.86 or 4.14 GHz
Capacity on Demand
Enterprise RAS
24x7 Warranty
PowerCare
Power 780
Processor Technology          4 Cores @ 4.1 GHz (TurboCore), 8 Cores @ 3.8 GHz
L3 Cache                      On Chip
Redundant Power & Cooling     Yes
Redundant Server Processor    Yes / Two Enclosure minimum
Redundant Clock               Yes / Two Enclosure minimum
Hot Add Support               Yes
Hot Service                   Yes

System Unit                   Single Enclosure                 4 Enclosures
Processors                    2 Sockets                        8 Sockets
DDR3 Memory (Buffered)        Up to 512 GB                     Up to 2 TB
SAS/SSD SFF Bays (CEC)        6                                24
DVD-RAM Media Bays            1 Slim-line                      4 Slim-line
SAS / SATA Controller         2/1                              8/4
PCIe (CEC)                    6 PCIe                           24 PCIe
GX++ Slots (12X DDR)          2                                8
Integrated Ethernet           Std: Quad 1Gb                    Std: Four Quad 1Gb
                              Opt: Dual 10Gb + Dual 1 Gb       Opt: Four x Dual 10Gb + Dual 1 Gb
USB                           3                                12
12X I/O Drawers w/ PCI slots  Max: 4 PCIe, 8 PCI-X             Max: 16 PCIe, 32 PCI-X
Power 795
✓New High-end
✓24 to 256 Cores
✓8 TB memory
✓TurboCore
✓3.7, 4.0 or 4.25 GHz
✓1,000 VMs* with PowerVM
✓Capacity on Demand
✓Enterprise RAS
✓24x7 Warranty
✓PowerCare
On October 7, IBM published a new SAP 2-tier Sales
and Distribution benchmark result on the
Power 795. The result is 70,032 users on a 128-core
Power 795 running AIX and DB2. This is
the highest result ever attained on this benchmark.
Power Systems Blades
✓PS700 1 socket 4 core
✓PS701 1 socket 8 core
✓PS702 2 socket 16 core
✓3.0 GHz
POWER7 PS700 Blade 4 Cores
Architecture: 4 Core Single Socket
L2 & L3 Cache: On Chip
DDR3 Memory: Up to 64 GB
DASD / Bays: 0 - 2 SAS (300/600GB)
Daughter Card Options: CIOv & CFFh (PCIe Adapters)
Integrated Options: Dual Port Gbt Ethernet; Ethernet, USB
Fiber Support: Yes (via BladeCenter chassis)
Media Bays: 1 BladeCenter chassis
Redundant Power: Yes BladeCenter chassis
Redundant Cooling: Yes BladeCenter chassis
Service Processor: Yes
Power & Thermal: POWER Save / Power Cap
POWER7 PS701 Blade 8 Cores
Architecture: 8 Core Single Socket
L2 & L3 Cache: On Chip
DDR3 Memory: Up to 128 GB
DASD / Bays: 0 - 1 SAS (300/600GB)
Daughter Card Options: CIOv & CFFh (PCIe Adapters)
Integrated Options: Dual Port Gbt Ethernet; Ethernet, USB
Fiber Support: Yes (via BladeCenter chassis)
Media Bays: 1 BladeCenter chassis
Redundant Power: Yes BladeCenter chassis
Redundant Cooling: Yes BladeCenter chassis
Service Processor: Yes
Power & Thermal: POWER Save / Power Cap
POWER7 PS702 Blade 16 Cores
Architecture: 8 Cores/Socket Two Socket
L2 & L3 Cache: On Chip
DDR3 Memory: Up to 256 GB
DASD / Bays: 0 - 2 SAS (300/600GB)
Daughter Card Options: CIOv & CFFh (PCIe Adapters)
Integrated Options: Quad Port Gbt Ethernet; Ethernet, USB
Fiber Support: Yes (via BladeCenter chassis)
Media Bays: 1 BladeCenter chassis
Redundant Power: Yes BladeCenter chassis
Redundant Cooling: Yes BladeCenter chassis
Service Processor: Yes
Power & Thermal: POWER Save / Power Cap
i Edition Express for BladeCenter S
• BladeCenter PS700 or JS12
• IBM i
• PowerVM Express
• BladeCenter S
• IBM i preloaded
The i Edition Express for BladeCenter S is the perfect alternative to a traditional rack or tower server with comparable starting prices and enables clients to run their i applications and consolidate x86 servers into a single BladeCenter S chassis that supports up to six blades and over 7 terabytes of disk storage.
Power Systems Virtualization
Hypervisor
• Support for multiple operating environments
Dynamic Logical Partitioning
• Micro-partitioning, resource movement
Multiple Shared Processor Pools
• Cap processor resources for a group of partitions
Virtual I/O Server
• Virtualizes resources for client partitions
Integrated Virtualization Manager
• Simplifies partition management for entry systems
Lx86
• Supports x86 Linux applications
Live Partition Mobility
• Move running AIX and Linux partitions
Active Memory Sharing
• Share a memory pool among partitions
[Diagram labels: VIOS, Power Hypervisor]
IBM i 7.1 Highlights
• DB2: Support for XML and column level encryption
• PowerHA: Async Geographic Mirroring & LUN-level switching
• Virtualization: IBM i 6.1 virtualization for i 7.1 partitions
• Solid State Drives: Automatic movement of hot data to SSDs
• Workload Capping: Limit # of cores used by middleware within a partition
• Open Access for RPG: Extend application reach to pervasive devices
• Zend Server Community Edition: PHP environment preloaded with IBM i
• Systems Director: Richer management of IBM i via Systems Director
[Diagram labels: VIOS, IBM i 6.1, IBM i 7.1, Power Systems]
Traditional IBM i Workload management
IBM i Workload Management
• Subsystems provide workload isolation
• Priorities are used to schedule work
• No way to cap a given application to a subset of the processor resources in a partition
All workloads can access the full number of Cores in the Partition
IBM i System / Partition
• Application 1 = 8 Cores
• Application 2 = 8 Cores
• Application 3 = 8 Cores
IBM i Workload Capping
IBM i workload capping can control workloads by limiting the number of cores that can be used by an application
IBM i System / Partition
• Application 1 = 3 Cores
• Application 2 = 6 Cores
• Application 3 = 8 Cores
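The effect of a cap can be modelled as taking the smaller of the application's cap and the partition size. The sketch below uses the numbers from the two slides (an 8-core partition with caps of 3 and 6 cores); the function and dictionary names are hypothetical and this is not the IBM i interface for defining caps.

```python
def usable_cores(partition_cores: int, caps: dict, application: str) -> int:
    """Cores an application may use: its cap if one is set, else the whole partition."""
    cap = caps.get(application, partition_cores)
    return min(cap, partition_cores)


PARTITION_CORES = 8
# Caps from the slide; Application 3 is left uncapped.
CAPS = {"Application 1": 3, "Application 2": 6}

if __name__ == "__main__":
    for app in ("Application 1", "Application 2", "Application 3"):
        print(app, "=", usable_cores(PARTITION_CORES, CAPS, app), "cores")
```

With an empty `CAPS` dictionary every application sees all 8 cores, which corresponds to the traditional behaviour described on the previous slide.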
Back to basics
Back to the future
IBM i and Cloud Computing
• Best platform for “private cloud”
• Centralized model
• Server consolidation
• Bring the complexity back into the “computer room”
• No more “personal” workstations and company data stored on users’ disks
• Low-TCO terminals
• SOA approach to integrate third party “SaaS” solutions
IBM refreshes CloudBurst line with Power7 chips - 14 October 2010
A rack with a single Power 750 server and 32 processor cores can run up to 160 virtual machines, while the top-end system, with 11 Power 750 servers in five racks, can run up to 2,960 virtual machines.
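The quoted figures can be turned into a virtual-machines-per-core density. The single-server case uses only numbers from the text; for the 11-server case the sketch assumes each Power 750 also has 32 cores, which the text does not state.

```python
def vms_per_core(virtual_machines: int, servers: int, cores_per_server: int) -> float:
    """Average number of virtual machines per processor core."""
    return virtual_machines / (servers * cores_per_server)


# Single Power 750 with 32 cores and up to 160 virtual machines (from the text).
entry_density = vms_per_core(160, servers=1, cores_per_server=32)

# Top-end system: 11 servers and 2,960 virtual machines (from the text);
# 32 cores per server is an assumption for illustration.
top_density = vms_per_core(2960, servers=11, cores_per_server=32)

if __name__ == "__main__":
    print(f"Entry configuration  : {entry_density:.2f} VMs per core")
    print(f"Top-end configuration: {top_density:.2f} VMs per core")
```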
tom.presotto@evog.it