Data Centre Metrics and Design Standards
Presented by Lee Smith (CDCE® ATD®)
Director: Data Centre Services & Training
© 2014 - Dee Smith and Associates
No duplication without written permission

About us

Dee Smith Consulting
• Founded in 2010
• Provides IT consulting, training and other professional services to the African market
• First company in SA to launch data centre design and management training courses
• Now offering
• Other services:
  – Turnkey projects
  – Data centre assessments and Certification management
  – Data centre migration
  – Digital security

Lee Smith (CDCE® ATD®)
• In IT since 1989
• First Certified Data Centre Expert® in Africa
• Only CDCE® instructor in Africa
• Uptime Institute Accredited Tier Designer®
• Certified TIA-942 Design Consultant®
• Independent Data Centre Consultant
• Panel judge for Brill Awards for Efficient IT
• Passionate about all matters data centre

• www.deesmith.co.za

Agenda
• Why are data centre metrics necessary?
• Some metrics in use today
• Exploring PUE™
• Data Centre Design Principles and Standards
• Approaching and managing your metrics & standards
• Conclusion and Questions

Why are data centre metrics necessary?
• Metrics are an essential tool for operational improvement
• Give a better understanding of sustainability & efficiency in your data centre
• Provide the ability to measure change (good or bad)
• Enable better-informed decisions for new data centre deployments/refurbishments
• Increased efficiency can improve the triple bottom line
• Used for comparison (not in the first year)
• No measurement/metrics → No baseline → No analysis → No understanding → No improvement → Stagnation
[Image: © Greenbly.com, LLC., Spokane, WA]

Some data centre metrics in use today
• PUE™ – Power Usage Effectiveness
• CUE™ – Carbon Usage Effectiveness
• WUE™ – Water Usage Effectiveness
• ERF™ – Energy Reuse Factor
• DCeP™ – Data Centre energy Productivity
• DCcE™ – Data Centre compute Efficiency
Groupings: PUE, CUE and WUE are the Sustainability (xUE) Metrics; DCeP and DCcE are the Data Centre Productivity metrics.

PUE™ – Power Usage Effectiveness
• Introduced by The Green Grid (TGG) in 2007
• Measures infrastructure energy efficiency
• Helps to improve & report energy efficiency
• Globally adopted and industry-preferred
• Different calculation levels: PUE1, PUE2, PUE3
• TGG and ASHRAE joint publication March 2014

$$\mathrm{PUE} = \frac{\text{Total Facility Energy}}{\text{IT Equipment Energy}}$$

• Total Facility Energy: energy dedicated solely to the data centre (dedicated building or mixed-use facility)
• pPUE™: Partial PUE

CUE™ – Carbon Usage Effectiveness
• Introduced by The Green Grid (TGG) in 2011
• The 2nd metric from TGG in the xUE family
• Measures CO2 (Greenhouse Gas) emissions associated with the data centre
• Why? Operational carbon usage has an impact
• How?
Factors affect design, location & operation
Source: eBay

$$\mathrm{CUE} = \frac{\text{Total CO}_2\text{ emissions caused by Total Data Centre Energy}}{\text{IT Equipment Energy}}$$

$$\mathrm{CUE} = \mathrm{CEF} \times \mathrm{PUE} = \frac{\text{CO}_2\text{ emitted (kgCO}_2\text{eq)}}{\text{Unit of Energy (kWh)}} \times \frac{\text{Total Data Centre Energy}}{\text{IT Equipment Energy}}$$

– CEF: Carbon Emission Factor (kgCO2eq/kWh)

WUE™ – Water Usage Effectiveness
• Introduced by The Green Grid (TGG) in 2011
• The 3rd metric from TGG in the xUE family
• Measures water usage in the data centre
• Why? Water usage also has an impact
• How? Factors affect design, location & operation

$$\mathrm{WUE}_{\text{site}} = \frac{\text{Annual Site Water Usage}}{\text{IT Equipment Energy}}$$

$$\mathrm{WUE}_{\text{source}} = \frac{\text{Annual Site Water Usage} + \text{Annual Source Energy Water Usage}}{\text{IT Equipment Energy}}$$

ERF™ – Energy Reuse Factor
• Introduced by The Green Grid (TGG)
• The portion of energy exported for reuse outside of the data centre
• Examples of correct energy reuse:
  – Warm air (waste heat) reused elsewhere on campus or in the neighbourhood
  – Heat to run an absorption chiller that is not used for the data centre

$$\mathrm{ERF} = \frac{\text{Energy Reused}}{\text{Total Energy Consumed}}$$

– Energy Reused: measured as it leaves the data centre control volume
– Total Energy Consumed: energy dedicated solely for the data centre

DCeP™ – Data Centre energy Productivity
• Introduced by The Green Grid (TGG)
• Allows each user to define the meaningful work produced by/in its data centre, relative to the energy consumed
  – Retail institution uses number/value of sales
  – Financial institution uses number of transactions completed
  – Search company uses number of searches completed
• Only applicable to tracking improvements within a single data centre

$$\mathrm{DCeP} = \frac{\text{Useful Work Produced}}{\text{Total Energy Consumed}}$$

– Useful Work Produced: defined as applicable to user’s business
– Total Energy Consumed: energy dedicated solely for the
data centre

PUE™: Measuring the data points
• Building load: demand from the grid
• Total Data Centre Facility Energy [measured at facility’s utility meter(s)]
  – Power: switchgear, UPS, battery backup, etc.
  – Cooling: chillers, CRACs, pumps, etc.
• IT Equipment Energy [best measured at output of computer room PDUs]
  – IT load: servers, storage, telecoms equipment, etc.
Source: The Green Grid (2014)

$$\mathrm{PUE} = \frac{\text{Total Facility Energy}}{\text{IT Equipment Energy}}$$

– Total Facility Energy: energy dedicated solely for the data centre
– IT Equipment Energy: energy dedicated only to IT equipment that manages, processes, stores or routes data in the compute space
• PUE cannot be less than 1

PUE measurement levels (for illustration purposes only; source: The Green Grid, 2014)
Required Measurement | Level 1 (PUE1) L1 Basic | Level 2 (PUE2) L2 Intermediate | Level 3 (PUE3) L3 Advanced
Total Facility Energy | Utility Input | Utility Input | Utility Input
IT Equipment Energy | UPS Outputs (kWh) | PDU Outputs (kWh) | IT Equipment Input [Rack PDU] (kWh)
Measurement Interval | Monthly / Weekly | Daily / Hourly | Continuous (15 mins or less)
Additional recommended measurements shown on the slide: PDU outputs, UPS inputs/outputs, mechanical inputs.

[Diagram: L1, L2 and L3 measurement points marked on the power path. Labelled elements include the 15 kV substation, 480 V / 277 V distribution, BMS and facility power, glycol pumps, CRAC units, maintenance bypass (2500 A, 480 V), panel (800 A, 480 V), UPS A-side (400 kW / 500 kVA), 100 kVA PDUs, and IT loads such as servers, switches, routers, KVM/console, printers, PCs and workstations.]
PUE: Critical Power-Path measurement points
• Redundant feeds for IT Equipment
• Additional measurement points provide further insight into energy efficiency of the infrastructure
[Diagram: L1, L2 and L3 measurement points marked on the power path. Labelled elements include utility and telco feeds, backup generators (3 x 2 MW), generator switchgear, ATS units, standby switchboard (2500 A, 480 V), mechanical switchgear and equipment, UPS A-side (400 kW / 500 kVA), UPS B-side (320 kW / 400 kVA), and IT loads such as storage, backup devices and encryption.]

Some basic principles
1. Classify the subcomponents correctly
   • Total Facility Energy
   • IT Equipment Energy
2. If the NOC supports the data centre, include it
3. Use a consistent method to obtain data inputs
4. The most useful data are obtained when equipment is grouped by system (especially at L3)
5. Measure the energy – kWh (not kVA or kW)
6. Use actual energy measurements (i.e. no estimations)
7. Don’t measure during maintenance or operational anomalies
8. Best practice: real-time and automated (<= 15 mins.)
• Measure IT equipment energy as close to the IT kit as possible

How to get this going
1. Include metrics during the planning and design phases
2. Obtain executive support/commitment
3. Select one metric to begin with (it’s usually PUE)
4. Determine whether you can collect the data
5. Define the method for consistent data capture
6. Put the plan into action and evaluate data collection
7. Refine if required
8. To publish the results you’ll require data for a full year
• Use the average PUE over the full year when reporting a PUE value

Efficiency improvements for data centres
• Pick the low-hanging fruit
  – Hot-aisle / cold-aisle layout (at a minimum)
  – Blanking plates
  – Close those holes
  – Neat cabling
  – Remove all unwanted cabling
• Good airflow management principles
  – Isolate the cold and hot air (containment, plastic curtains, etc.)
  – Apply ASHRAE’s thermal recommendations
  – Economised cooling; VFD/EC fans
  – New and more efficient technology requires a solid business case
• Merge Facilities with IT so they report to the same CIO/CTO

Efficiency improvements for IT
• Don’t just focus on the data centre facility
  – Efficiency requires a holistic approach which includes IT
• Target servers running at less than 5% utilisation
  – Thereafter go for servers with less than 20% utilisation
• Remove comatose/zombie servers
  – Barclays:
    Removed almost 15,000 servers (globally)
    Equivalent to filling up almost 600 server racks
    Freed up 20,000 network ports and 3,000 SAN ports
    Eliminated an estimated 2.5 MW of power usage
    US$10M total savings over the 2-year programme
• “The most efficient data centre is the one you never build”

Food for thought
• The cost of utilities is increasing
• Efficiency improvement leads to improved TCO and ROI
  – No solid business
case, no investment in efficiency
• Be realistic in your energy efficiency expectations
• The xUE family does not measure digital output efficiency
  – For this there is DCeP (i.e. how well is the data centre actually performing?)
• Metrics are only useful if you do something with the data
• Holistic approach
  – From “chip to chiller” (Facilities and IT)
  – Continuous operational improvement, leadership and executive support
• Fundamentally it’s a management challenge, not a technology one

Data Centre Design Principles and Standards

The Bad News…
There is NO world-wide accredited/official Data Centre Design Standard

The Good News…
• Best practices / “semi-standards”
  o Uptime Institute guidelines
  o ANSI/TIA-942
  o ANSI/BICSI-002
  o Singapore Standard SS507 (BC/DR)
  o ISO 24762 (International guideline for BC/DR)
  o European Code of Conduct (best practices focused more on Green)
• Continued refinement/improvement of existing publications
• Possibility of new publications/standards in future
All logos are trademarks of their respective organisations

What Standards/Guidelines are there?
 | BICSI-002 | European Code of Conduct | ISO 24762 | TIA-942 | Uptime Institute
Origin | USA | Europe | International | USA | USA
Official Standard | Yes (ANSI) | No (EN 50600) | Yes (ISO) | Yes (ANSI) | No (Reputational)
Ownership | Public | Public | Public | Public | Private
Detailed Specs Publicly Available | Yes | Yes | Yes | Yes | No
Type of Compliance | Class (F0 - F4) | Implement / Endorse | Pass / Fail | Rated/Rating (1, 2, 3, 4) | Tier (I, II, III, IV)
Main coverage | Telecoms, Electrical, Architectural, Mechanical, Fire Safety | Architectural, Mech-Elec, Data floor, Racks, IT kit, a few more | Telecoms, Electrical, Architectural, Mechanical, Process | Telecoms, Electrical, Architectural, Mechanical, Fire Safety | Electrical, Mechanical, other considerations in “Operational Sustainability” (TUI)

Prevalent in RSA market

Uptime Institute Tier Topology
• Outcomes-based design requirements
• Design levels: Tier-I, II, III, IV
• TCDD – Tier Certification of Design Documentation (valid for 24 months)
• TCCF – Tier Certification of Constructed Facility (compulsory since January 2014)
• Conducted only by The Uptime Institute
• Adaptations as and when published by TUI
• Infrequent reviews

TIA-942 (2014)
• More prescriptive in design requirements
• Design levels: Rated-1, 2, 3, 4
• Design Validation with Corrective Action Report (CAR)
• Final onsite certification audit (compulsory)
• Recertification Audit every third year
• Conducted by External Auditor
  – EPI are the global leaders in TIA-942 audits
• Formal review every 5 years

Redundancy Design – high level only
TIA-942: Rated-1 (Basic) / TUI: Tier-I (Basic Capacity)
TIA-942:
• Single path for power, cooling and network distribution
• No redundant components
• Generator must only support UPS. Mech.
load support not required
TUI:
• Single distribution path
• Non-redundant capacity components
  – Critical environment power
  – Cooling Systems
TUI not able to provide example schematic for this presentation
Source: EPI. *Example only! Does not indicate minimum or only option

Redundancy Design – high level only
TIA-942: Rated-2 (Redundant) / TUI: Tier-II (Redundant Components)
TIA-942:
• Single path for power, cooling and network distribution
• Redundant components
• Generator config. install is typically N+1
  – Only N is required by the Standard
TUI:
• Single distribution path
• Redundant capacity components (N + R)
  – Engine Generator
  – UPS Modules
  – IT & UPS cooling
TUI not able to provide example schematic for this presentation
Source: EPI. *Example only! Does not indicate minimum or only option

Redundancy Design – high level only
TIA-942: Rated-3 (Concurrently Maintainable) / TUI: Tier-III (Concurrently Maintainable)
TIA-942:
• Multiple power, cooling and network distribution paths – only one path is active
• Redundant components
• Concurrently Maintainable (CM)
• Compartmentalised
TUI:
• Redundant capacity components and independent distribution paths
• Elements of a distribution path may be inactive
• No runtime limits on engine-generator capacity at design load
• Assumes dual-corded IT equipment
TUI not able to provide example schematic for this presentation
Source: EPI. *Example only!
Does not indicate minimum or only option

Redundancy Design – high level only
TIA-942: Rated-4 (Fault Tolerant) / TUI: Tier-IV (Fault Tolerant)
TIA-942:
• Multiple power, cooling and network distribution paths (at least two paths active)
• 2N; no SPoFs along essential load
• Cooling as per ASHRAE limits
• Gens sized for total building load – 2N redundancy
• Automated/Autonomous failure response
TUI:
• Redundant active distribution paths and capacity components
• Compartmentalisation (components & paths)
• “N” after any failure (at any point in the system)
  – during single failure
  – Single event with consequential impact (e.g. loss of DB board)
• No runtime limits on gen capacity at design load
• Continuous cooling for critical IT and UPS systems
• Autonomous response to failure
TUI not able to provide example schematic for this presentation
Source: EPI. *Example only! Does not indicate minimum or only option

Commonalities of TUI Design Topology and TIA-942
• Both are vendor neutral/agnostic
• You don’t need two separate power feed-paths onto your site
• Data centres can be designed and built against measurable redundancy criteria
  – TUI is outcomes-based; TIA is more prescriptive
• Certification of the facility itself is the recognised objective
  – A design certification/validation does stand on its own
• Certifications: both do it commercially
  – TUI for itself; EPI in terms of TIA-942
  – Although TIA is a non-profit organisation, EPI is a commercial venture
• Both have a beneficial impact on the data centre industry
• Emphasis on operational sustainability is also prevalent
  – TUI more so than TIA

Approaching and managing your certification
standard
• Understand why you want to apply a data centre standard/guideline
  – Market demand (i.e. RFx from potential client)
  – Independently ensure that the data centre complies with requirements
  – Marketing and advertising (competitive advantage or at least “equality” with competitors)
• TUI certified data centres are much more prevalent in South Africa
• The process of obtaining certification requires sustained commitment
• No matter which certification you select (if any):
  – Your team must have experience and understanding in this regard
  – You need people who understand the processes involved
  – Once you begin, you must complete it or you’ll have nothing to show
• The local AHJ always overrules any standard/guideline

Food for thought
• Understand the decision drivers in terms of the standard/guideline selected
• Your data centre must live up to your design and efficiency claims
• Design philosophy is to prevent failure and SPoFs anyway
• Standards/guidelines have the same objective: availability
• Design/Build is one aspect – Data Centre management is equally important
• PUE is the most widely used/reported metric (but usually unaudited)
• Metrics identify opportunities to improve operational efficiency
• Obtaining metrics is the initial step – improving on them is the objective
• Executive sponsorship and buy-in are critical for success

Thank you! Questions?
Lee Smith (CDCE® ATD®)
Director: Data Centre Services & Training
lee@deesmith.co.za
www.deesmith.co.za
@the_dc_guy
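
Worked example (illustrative)
The ratio metrics covered in this deck (PUE, CUE as CEF x PUE, site WUE and ERF) are simple enough to sketch in a few lines of Python. The sketch below assumes monthly kWh readings and reports one annual PUE from the year's energy totals, which is only one possible reading of "use the average PUE over the full year". The class and function names, the sample readings, the litre unit for water and the CEF value of 0.9 kgCO2eq/kWh are all hypothetical and not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonthlyReading:
    """One month of metered energy in kWh (hypothetical sample structure)."""
    total_facility_kwh: float  # measured at the facility's utility meter(s)
    it_equipment_kwh: float    # measured at UPS, PDU or rack-PDU outputs


def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """PUE = Total Facility Energy / IT Equipment Energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        # PUE cannot be less than 1, so this points to a metering error.
        raise ValueError("total facility energy is below IT energy")
    return total_facility_kwh / it_equipment_kwh


def annual_pue(readings: list[MonthlyReading]) -> float:
    """Annual PUE from twelve monthly readings, using annual energy totals.

    Assumption: the yearly figure is the ratio of the yearly totals, not the
    arithmetic mean of twelve monthly PUE values.
    """
    if len(readings) != 12:
        raise ValueError("a full year (12 monthly readings) is required")
    total = sum(r.total_facility_kwh for r in readings)
    it = sum(r.it_equipment_kwh for r in readings)
    return pue(total, it)


def cue(cef_kg_per_kwh: float, pue_value: float) -> float:
    """CUE = CEF x PUE, in kgCO2eq per kWh of IT equipment energy."""
    return cef_kg_per_kwh * pue_value


def wue_site(annual_site_water_litres: float, it_equipment_kwh: float) -> float:
    """WUE(site) = Annual Site Water Usage / IT Equipment Energy."""
    return annual_site_water_litres / it_equipment_kwh


def erf(energy_reused_kwh: float, total_energy_kwh: float) -> float:
    """ERF = Energy Reused / Total Energy Consumed."""
    return energy_reused_kwh / total_energy_kwh


# Hypothetical year: 150,000 kWh facility energy and 100,000 kWh IT energy
# in every month.
year = [MonthlyReading(150_000.0, 100_000.0) for _ in range(12)]
site_pue = annual_pue(year)
print(f"Annual PUE: {site_pue:.2f}")
print(f"CUE at an assumed CEF of 0.9 kgCO2eq/kWh: {cue(0.9, site_pue):.2f}")
```

Working from annual energy totals keeps a low-load month from carrying the same weight as a high-load one. If the reporting rules you follow define the yearly average differently, `annual_pue` is the place to change it.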