Datacenter Basics
Fred Chong
290N Green Computing
Storage Hierarchy
Figure: Storage hierarchy of a Warehouse-scale computer
Performance Variations
Figure: Latency, bandwidth, and capacity of a Warehouse-scale computer
Server Comparison
                                 HP Integrity Superdome      HP ProLiant ML350 G5
                                 (Itanium2)
Processor                        64 sockets, 128 cores       1 socket, quad-core,
                                 (dual-threaded), 1.6 GHz    2.66 GHz X5355,
                                 Itanium2, 12 MB             8 MB last-level cache
                                 last-level cache
Memory                           2,048 GB                    24 GB
Disk storage                     320,974 GB, 7,056 drives    3,961 GB, 105 drives
TPC-C price/performance          $2.93/tpmC                  $0.73/tpmC
Price/performance                $1.28/tpm                   $0.10/tpm
(server HW only)
Price/performance                $2.39/tpm                   $0.12/tpm
(server HW only, no discounts)
tpm = TPC-C transactions per minute
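The last three rows are the point of the comparison. A minimal sketch, using only the table's numbers, that makes the gap explicit:

```python
# Sketch: compare TPC-C price/performance from the table above.
# $/tpmC values are taken directly from the table.
servers = {
    "HP Integrity Superdome (Itanium2)": 2.93,  # $/tpmC
    "HP ProLiant ML350 G5": 0.73,               # $/tpmC
}

baseline = servers["HP ProLiant ML350 G5"]
for name, dollars_per_tpmc in servers.items():
    print(f"{name}: ${dollars_per_tpmc:.2f}/tpmC, "
          f"{dollars_per_tpmc / baseline:.1f}x the low-end server")
# The high-end SMP costs about 4x more per transaction per minute.
```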
Cost proportional to Power
• Cost proportional to power delivered
• Typically $10-20/W
• Power delivery
– 60-400kV transmission lines
– 10-20kV medium voltage
– 110-480V low voltage
UPS
• Uninterruptible Power Supply
• Batteries or flywheel
• AC-DC-AC conversion
• Conditions the power feed
– Removes spikes or sags
– Removes harmonic distortions
• Housed in a separate UPS room
• Sizes range from hundreds of kW to 2MW
PDUs
• Power Distribution Units
• Breaker panels
– Input 200-480V
– Output many 110V or 220V
– 75-225 kW per PDU, split into 20-30 A circuits (max ~6 kW each)
• Redundancy from two independent power sources
Paralleling
• Multiple generators or UPSs
– Feed a shared bus
• N+1 (survives one failure)
• N+2 (one unit in maintenance plus one failure)
• 2N (fully redundant pairs; see the sizing sketch below)
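A minimal sketch of how these schemes translate into unit counts. The 4 MW load and 1 MW unit size are hypothetical, and `units_needed` is an illustrative helper, not a standard API:

```python
import math

def units_needed(load_kw, unit_kw, scheme="N+1"):
    """Generators/UPS units on the shared bus for a redundancy scheme.

    N   = minimum units that cover the load
    N+1 = survive one failure
    N+2 = survive one failure during maintenance
    2N  = fully redundant pairs
    """
    n = math.ceil(load_kw / unit_kw)
    return {"N": n, "N+1": n + 1, "N+2": n + 2, "2N": 2 * n}[scheme]

# Hypothetical example: 4 MW critical load, 1 MW units.
for scheme in ("N", "N+1", "N+2", "2N"):
    print(scheme, units_needed(4000, 1000, scheme))
```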
Cooling
Cooling Steps
• 12-14°C coolant from the chiller
• 16-20°C air at the CRAC (Computer Room AC)
• 18-22°C at the server intake
• Then back to the chiller
“Free Cooling”
• Pre-cool coolant before chiller
• Water-based cooling towers use evaporation
– Works in moderate climates; freezes if too cold
• Glycol-based radiator outside the building
– Works in cold climates
Cooling is Critical
• Datacenter would fail in minutes without cooling
– Cooling is backed up by generators and UPSs
• Adds > 40% to the critical electrical load
Airflow
• 100 cfm (cubic feet per minute) per server
• 10 servers would require 1,000 cfm from perforated tiles
• Typically no more than 150-200 W/sq ft power density (see the sketch below)
• Recirculation from one server's hot air into the intake of a neighbor
– Some avoid this with overhead ducts
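A quick check of these rules of thumb. `tile_airflow` and `max_servers` are hypothetical helpers built from the slide's 100 cfm/server and 150-200 W/sq ft figures; the 500 W server and 100 sq ft floor area are assumptions:

```python
# Sketch: airflow and power-density checks from the slide's rules of thumb.
CFM_PER_SERVER = 100   # cubic feet per minute of cool air per server
MAX_W_PER_SQFT = 200   # typical upper bound on floor power density

def tile_airflow(n_servers):
    """Total cfm the perforated tiles must deliver."""
    return n_servers * CFM_PER_SERVER

def max_servers(floor_sqft, watts_per_server):
    """How many servers the power-density limit allows on this floor area."""
    return floor_sqft * MAX_W_PER_SQFT // watts_per_server

print(tile_airflow(10))        # 1000 cfm, matching the slide
print(max_servers(100, 500))   # hypothetical: 40 x 500 W servers on 100 sq ft
```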
Variations
• In-rack cooling
– Water-cooled coils next to the server
– Cost of plumbing
– Damage from leaks (earthquake zones!)
• Container-based datacenters
– Shipping container, 8' x 8.5' x 40'
– Similar to in-rack cooling, but for the whole container
– Higher power densities
Power Efficiency
• PUE – power usage effectiveness
– Total facility power / power delivered to IT equipment
– Measures datacenter power infrastructure overhead (see the sketch below)
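A minimal sketch of the definition; the 1,650 kW facility breakdown is hypothetical:

```python
def pue(total_facility_kw, it_kw):
    """Power usage effectiveness: total facility power / IT equipment power."""
    return total_facility_kw / it_kw

# Hypothetical facility: 1 MW of IT load plus 650 kW of chillers,
# CRACs, UPS losses, and other infrastructure.
print(pue(total_facility_kw=1650, it_kw=1000))   # -> 1.65
```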
Poor PUEs
• 85% of datacenters have PUE > 3
• Only 5% have PUE = 2.0
• Chillers: 30-50% overhead
• CRAC: 10-30% overhead
• UPS: 7-12% overhead (AC-DC-AC conversion)
• Humidifiers, PDUs, lighting
• EPA "achievable" PUE of 1.4 by 2011
Improvements
• Evaporative cooling
• Efficient air movement
• Eliminate power conversion losses
• Google PUE = 1.21
• Several companies PUE = 1.3
A more comprehensive metric
Efficiency = Computation / Total Energy
           = (1 / PUE) x (1 / SPUE) x (Computation / Total Energy to Electronic Components)
                (a)          (b)                          (c)

• (a) PUE – facility power usage effectiveness
• (b) SPUE – server power usage effectiveness
• (c) computation energy efficiency
SPUE
• Power delivered to components directly
involved in computation:
– Motherboard, disks, CPUs, DRAM, I/O cards
• Losses due to power supplies, fans, voltage
regulators
• SPUE of 1.6-1.8 common
– Power supplies less than 80% efficient
– Voltage regulators less than 70% efficient
• EPA feasible SPUE < 1.2 in 2011
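A sketch of how the quoted conversion efficiencies imply the common SPUE range. The `spue` helper is illustrative and ignores fan losses:

```python
def spue(psu_eff, vr_eff):
    """Server PUE: server input power / power reaching compute components.

    Losses come from the power supply and voltage regulators
    (fan losses are ignored in this simplified sketch).
    """
    return 1.0 / (psu_eff * vr_eff)

# Slide's figures: power supplies < 80% efficient, regulators < 70%.
print(round(spue(0.80, 0.70), 2))   # ~1.79, inside the common 1.6-1.8 range
```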
TPUE
• Total PUE = TPUE = PUE * SPUE
• Average of 3.2 today (2.2 Watts wasted for every Watt of computation)
• PUE 1.2 and SPUE 1.2 would give a 2X benefit
• TPUE of 1.25 is probably the limit of what is economically feasible (see the sketch below)
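A sketch of the multiplication. PUE 2.0 x SPUE 1.6 is one hypothetical combination that yields today's 3.2 average:

```python
def tpue(pue, spue):
    """Total PUE: facility and server overheads multiply."""
    return pue * spue

for p, s in [(2.0, 1.6), (1.2, 1.2)]:
    t = tpue(p, s)
    print(f"PUE {p} x SPUE {s} -> TPUE {t:.2f} "
          f"({t - 1:.2f} W wasted per W of computation)")
# 3.20 vs 1.44: roughly the 2X benefit the slide cites.
```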
Computing Efficiency
• Area of greatest potential
• Hardest to measure
• SPECpower
• JouleSort
• Storage Networking Industry Association (SNIA)
SPECpower Example
Figure: Server load vs. efficiency (SPECpower) — power curves (Pwr50, Pwr10, Pwr10sub) and efficiency curves (Eff50, Eff10, Eff10sub) plotted against load level (% of peak).
Human Dynamic Range
Figure: Human power output in Watts (0-2,000) by activity, from sleep, knitting, playing cards, and typing up to soccer, cycling (race), jumping rope, running, and ax chopping.
Component Efficiency
Figure: Server power (% of peak) vs. compute load (%) broken down by CPU, DRAM, disk, and other components, from idle to 100% load.
CPU Voltage Scaling
Figure: Power (% of peak) vs. compute load (%) at 2.4 GHz, 1.8 GHz, and 1 GHz clock rates, with the DVS savings (%) between them.
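A rough sketch of the dynamic-power model usually cited for DVS, P_dyn ∝ C·V²·f. The assumption that voltage scales linearly with frequency is a simplification of this sketch, not something the slides state:

```python
# Sketch: classic dynamic-power model behind voltage scaling.
# P_dyn ~ C * V^2 * f; if V can track f, power falls roughly as f^3.

def relative_dynamic_power(freq_ghz, base_ghz=2.4):
    f = freq_ghz / base_ghz
    v = f                  # simplifying assumption: V scales with f
    return v * v * f       # P ~ V^2 * f, normalized to peak

for f_ghz in (2.4, 1.8, 1.0):
    print(f"{f_ghz} GHz: {relative_dynamic_power(f_ghz):.0%} "
          f"of peak dynamic power")
```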
Disks
• As much as 70% power to keep drives spinning
• 1000X penalty to spin up and access
• Multiple head, low RPM drives [Gurumurthi]
Server Power Supplies
Power Provisioning
• $10-22 per deployed IT Watt
• Given 10 year depreciation cycle
– $1-2.20 per Watt per year
• Assume $0.07 per kilowatt-hour and PUE 2.0
– 8766 hours in a year
– (8766 / 1000) x $0.07 x 2.0 ≈ $1.23 per Watt per year
• Up to 2X cost in provisioning
– e.g., a 50% full datacenter = 2X provisioning cost per deployed Watt (see the sketch below)
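The slide's arithmetic as a sketch, so the provisioning and energy costs can be compared per Watt-year:

```python
# Sketch: provisioning cost vs. energy cost per deployed IT Watt,
# using the slide's numbers.
HOURS_PER_YEAR = 8766

def energy_cost_per_watt_year(dollars_per_kwh=0.07, pue=2.0):
    """Electricity cost of running 1 W of IT load for a year."""
    return (HOURS_PER_YEAR / 1000) * dollars_per_kwh * pue

print(round(energy_cost_per_watt_year(), 5))   # 1.22724, as on the slide

# Provisioning: $10-22 per deployed IT Watt over a 10-year depreciation.
print([c / 10 for c in (10, 22)])              # $1.00-$2.20 per Watt-year
```

Provisioning capacity thus costs about as much per Watt-year as the electricity itself, which is why unused capacity is expensive.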
Time at Power Level
Figure: Time spent at each power level for groups of 80, 800, and 8000 servers.
Oversubscription Opportunity
• 7% headroom for racks (80 servers)
• 22% for PDUs (800 servers)
• 28% for clusters (8000 servers)
– Could have hosted almost 40% more machines (see the sketch below)
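A sketch of where "almost 40% more machines" comes from: if a cluster's observed peak sits 28% below its provisioned power, then 1 / (1 − 0.28) ≈ 1.39 machines fit per machine provisioned today:

```python
# Sketch: extra machines the same provisioned capacity could host, from
# the slide's gaps between observed group peak and provisioned power.
gaps = {
    "rack (80 servers)": 0.07,
    "PDU (800 servers)": 0.22,
    "cluster (8000 servers)": 0.28,
}

for level, gap in gaps.items():
    extra = 1 / (1 - gap) - 1
    print(f"{level}: {extra:.0%} more machines under the same peak")
# cluster: ~39%, i.e. "almost 40% more machines"
```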
Underdeployment
• New facilities plan for growth
• Also discretization of capacity
– E.g., a 2.5 kW circuit may hold four 520 W servers (2,080 W)
• 17% of the circuit is underutilized, but one more server would not fit (see the sketch below)
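The slide's example as a sketch; the 2.5 kW circuit and 520 W servers are the slide's numbers:

```python
# Sketch: capacity stranded by circuit discretization.
circuit_w = 2500
server_w = 520

n_fit = circuit_w // server_w                    # 4 servers fit
stranded = 1 - n_fit * server_w / circuit_w
print(f"{n_fit} servers, {stranded:.0%} of the circuit stranded")  # 17%
# A fifth server (2,600 W total) would exceed the 2.5 kW breaker.
```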