Australian Government Data Centre Strategy 2010-2025
Better Practice Guide: Data Centre Power
July 2013

Contents

1. Introduction
   Scope
   Policy Framework
   Related documents
2. Definition and Discussion
   A Conceptual Model of a Data Centre
   Observations about Power in Data Centres
3. Better practices
   Organisation of the Better Practices
   Implementing the Better Practices
   Better Practices – Fundamentals
   Better Practices – Power Systems
   Better Practices – Power Consumption
4. Conclusion
   Summary of Better Practices

1. Introduction

The purpose of this guide is to advise Australian Government agencies on ways to improve data centre operations. Many government functions are critically dependent upon information and communication technology (ICT) systems based in data centres.

Achieving and maintaining optimal data centre performance is challenging. The ICT equipment, power, cooling and other systems dynamically interact in many ways. By linking the data centre to the agency’s business outcomes in an integrated approach, managers can better support business operations, control costs, and manage risks.

Each better practice guide focuses on a different data centre attribute. Electricity [1] is one of the key resources for a data centre. Safe, reliable and efficient electricity use is essential to data centre performance.

The aim is to assist Australian Government agencies to:
- Assess data centre performance relating to power.
- Improve practices relating to power, leading to greater efficiency, reliability and safety.
- Meet legislative and policy requirements.

The consequences of unsafe, inefficient or unreliable power use in data centres are significant. An unsafe data centre poses risks to people and surrounding buildings. An inefficient data centre can waste over 70 per cent of the electricity it uses. Unreliable data centres can disrupt the agency’s business operations.
This guide on power forms part of a set of better practice guides for data centres.

Scope

This guide applies to the data centre infrastructure supplying power to the ICT equipment and the supporting systems. This includes switchboards, switchgear, controlgear, transformers, uninterruptible power supplies (UPS), backup power generation, distribution boards, equipment racks and ICT equipment.

This guide considers the power used by the ICT equipment and cooling systems in a data centre, but does not address in detail the better practices for ICT equipment efficiency or cooling a data centre [2].

[1] Throughout this guide, the terms power, electricity and electrical power are equivalent, although power is preferred.
[2] The Data Centre Optimisation Targets Guidance document addresses ICT equipment and cooling efficiency.

Policy Framework

The guide has been developed within the context of the Australian public sector’s data centre policy framework. This framework applies to agencies subject to the Financial Management and Accountability Act 1997 (FMA Act). The data centre policy framework seeks financial, technical and environmental outcomes.

The Australian Government Data Centre Strategy 2010 – 2025 (data centre strategy) describes actions that will avoid $1 billion in future data centre costs. Using power in data centres more efficiently will contribute significantly to the $1 billion goal.

The Australian Government ICT Sustainability Plan 2010 – 2015 describes actions that agencies are to take to improve environmental outcomes. The data centre strategy and the ICT sustainability plan have the same targets and objectives for data centres.

The Data Centre Optimisation Targets (DCOT) policy sets minimum efficiency standards that data centres used by FMA agencies are to reach by June 2015. The DCOT policy sets a power efficiency target for data centres of a Power Usage Effectiveness (PUE) [3] below 1.9 [4].
The DCOT Guidance document describes methods to improve efficiency, including virtualisation, consolidation, cooling, procurement and server settings.

In February 2013, the National Australian Built Environment Rating System (NABERS) for data centres was released. NABERS is an initiative led by the NSW Government and supported by the Australian Government. In the Australian public sector’s data centre policy framework, PUE was chosen as an interim metric until NABERS for data centres was available. Agencies should anticipate that NABERS will replace PUE as the preferred metric over time.

[3] The PUE metric is the power used by the ICT equipment and overheads divided by the power used by the ICT equipment in a single data centre. The Green Grid developed the PUE metric and more information can be found at the website www.thegreengrid.org.
[4] A PUE of 1.9 means that for every watt powering the ICT equipment, 0.9 watts is being used by overheads.

Related documents

Information about the data centre strategy, and DCOT targets and guidance, can be obtained from the Data Centre section (datacentres@finance.gov.au).

The data centre better practice guides also cover:
- Cooling: the mechanical and electrical systems that provide conditioned air at the optimum temperature, humidity and pressure.
- Data Centre Infrastructure Management: the system that monitors and reports the state of the data centre. Also known as the building management system.
- Fire protection: the detection and suppression systems that minimise the effect of fire on people and the equipment in the data centre.
- Security: the physical security arrangements for the data centre. This includes access controls, surveillance and logging throughout the building, as well as perimeter protection.
- Equipment racks: this guide brings together aspects of power, cooling, cabling, monitoring, fire protection, security and structural design to achieve optimum performance for the ICT equipment.
- Structure: the physical building design provides for movement of people and equipment through the site, floor loading capacity, and reticulation of cable, air and water. The design also complements the fire protection and security better practices.
- Environment: this guide examines data centre sustainability, including packaging, electronic waste, water use and greenhouse gas generation.

2. Definition and Discussion

A Conceptual Model of a Data Centre

A typical data centre consists of a number of different components. This complexity can be reduced to a collection of subsystems. The key data centre subsystems and power distribution networks are shown as a conceptual model in Figure 1.

This conceptual model describes the key functions used in a wide range of data centre designs, from computer rooms in converted office space to purpose-built, standalone data centres. While a data centre applying better practices will have these elements in its power subsystems, the physical connections and arrangements are likely to be different.

The data centre industry has regional variations in the terms used to describe a common set of data centre concepts. This section describes the conceptual data centre model, using terms in common use in Australia.

[Figure 1: Key Subsystems in a Data Centre. Elements shown: local distribution lines to the building, main switchboard, backup generator, UPS, DB, equipment racks, ICT equipment, office area, fire protection system, HVAC system, security system, and Data Centre Infrastructure Management.]

Power enters the building site via the local distribution lines. Very high availability data centres will have multiple lines from several suppliers and substations.

The main switchboard has several functions.
The main switchboard connects to the various power networks in the building, and the backup generators. Each of the key subsystems is usually on its own power network. If the external power fails, static transfer switches automatically switch over to use power from the backup generator. The metering device that measures the total electricity consumption for the building is located at the point of supply between the local distribution lines and the main switchboard. The main switchboard has a residual current device, which reduces the chance of electrocution by shutting off power if a fault is detected.

The backup generators are the source of power when the supply through the local distribution lines fails. Diesel generators are most commonly used to provide backup power, although other alternatives are becoming available.

The heating, ventilation and air conditioning (HVAC) system cools and humidifies the data centre. The HVAC system is the largest overhead in a data centre.

The uninterruptible power supply (UPS) ensures that there is continuity of power to the ICT equipment, regardless of disruptions to the building power supply. The length of time the UPS supplies power varies from a few seconds, long enough for the backup generator to take over, up to a few minutes, to allow the ICT equipment to shut down gracefully.

The distribution board (DB) is an electrical wiring junction that connects a single power line from the UPS to multiple power lines to the equipment racks. The distribution board converts three phase power into single phase power.

The equipment racks provide an enclosure for housing the ICT equipment. Inside the equipment rack the power rails provide power outlets for the ICT equipment. Modern equipment racks have a range of sensors for power and temperature.

The ICT equipment consists of the servers, disk drives, storage area networks, tape drives, switches, routers and so on.
The model does not show the power factor correction equipment, which is used on a case by case basis to improve the efficiency and quality of the power supply.

The office area is necessary for people who must work at the data centre location. Data centres usually have several hazards for long term staff, including too much noise, uncomfortable temperatures and very low light levels.

The fire protection system detects and suppresses fires or other abnormal hot spots.

The security system provides the physical security arrangements. It may have video cameras, secure doors, or biometric access controls.

The Data Centre Infrastructure Management (DCIM) system provides command and control functions to assist in managing a data centre. Older data centres may have a building management system that provides more limited functions, and is a predecessor to DCIM.

High Voltage, Low Voltage

Electrical power is distributed across Australia at very high voltages, typically 11,000 to 115,000 volts (V). The power must be converted to a lower voltage, around 240V, to be usable by most ICT equipment. However, 415V is commonly used by HVAC systems and larger ICT equipment such as mainframes.

A substation with transformers converts high voltage power delivered to a data centre site to lower voltage, typically 415V. The local distribution lines take the power from the substation to the main switchboard. The substation and transformers are managed by the electricity provider, not the building owner.

Common examples of power distribution voltages are shown in Figure 2.
[Figure 2: Voltage Conversion Examples. Elements shown: local distribution lines, main switchboard, backup generator, UPS, DB, equipment racks and ICT equipment. Voltage labels: 11 kV, 415 V 3 phase and 240 V single phase.]

Observations about Power in Data Centres

Fit for Purpose - Design and Operations

The design of a data centre sets the limits of the operational achievements. Excellence in data centre operations will not overcome limits in the design. Conversely, poor operations performance will degrade any design, causing agencies to get less value for money than expected.

The core design criteria for power in a data centre are safety, reliability, efficiency, flexibility and scalability. The core responsibilities for data centre operations are safety, reliability and efficiency.

An agency will periodically evaluate the fitness for purpose of the data centre and the ICT equipment for the agency’s outcomes. Broadly, either the data centre will meet business expectations (possibly with justifiable investment), or the data centre cannot meet expectations and an alternative must be found.

Many older data centres have been neglected, and so are less reliable and efficient. Minor capital improvements to design or operational capability are likely to be value for money. Examples are ‘mining’ unused cables and selling them for scrap, removing idle ICT equipment providing little business value, or using power monitoring to find and replace equipment with high power costs.

Major data centre upgrades are generally very expensive and disruptive to business operations. In most cases, moving to a more modern data centre will be a better option than refurbishing an existing data centre. Examples of major upgrades are installing a backup generator, doubling the power supplied to the data centre or installing redundant power supply to equipment racks with essential ICT systems.
Safety

The Work Health and Safety Act 2011 imposes mandatory obligations and roles on all people working in a data centre. Power in a data centre poses risks including electrocution, fire and explosion. These hazards can cause injury or death.

The legal obligations include:
- All incidents are recorded, assessed and treated.
- The safety management plan is maintained.
- People know how to behave safely in and around the data centre.

In addition to the legal obligations, agencies should understand that a likely consequence of a serious safety incident is a significant disruption to normal agency operations. The process of investigating, repairing and confirming that the data centre is safe may prevent business as usual operations for days or weeks. Agency business continuity planning should include the partial or total loss of the data centre for safety reasons.

Several better practices provide risk controls that could improve safety, and agencies should review these for applicability. These controls include:
- Ensuring that Australian Standards and applicable building regulations are satisfied.
- Employing suitably licensed and accredited people to work on power systems.
- Verifying that the capacity of the power system is not exceeded by ICT equipment changes.
- Routinely inspecting the power and temperature throughout the power system, including thermal imaging of switchboards, UPS and PDUs.
- Analysing the monitoring logs for trends and accounting for all variations.

The advice in this better practice guide does not affect the obligations of staff or agencies to maintain a safe workplace.

Power Uses in Data Centres

The power consumed in a data centre should be considered as having either productive or overhead (unproductive) uses. The productive power is used by the data centre ICT equipment. The overhead power is used by the supporting systems, which consume power to maintain the ICT equipment.
Achieving and sustaining better data centre power practices requires regular monitoring, analysis and reporting of both productive and overhead power. Monitoring, ideally automated, records the rate of consumption, fluctuations, failures, and the efficiency of all power subsystems. This baseline will allow managers to improve efficiency and reliability.

Electricity Billing

The other measurement device available in all data centres is the usage meter, which is used to calculate the electricity bill. This is positioned on the local distribution line before the main switchboard. The usage meter records two equally correct perspectives of data centre power: demand and consumption.

The demand, measured in kilovolt amperes (kVA), is the amount of power that the data centre requires to be delivered by the generators and transmission network. The consumption, measured in kilowatt hours (kWh), is the amount of power used by the equipment inside the data centre.

The typical data centre electricity bill is about 80 per cent consumption, 20 per cent demand and a small fee for the regulatory agencies to run the energy markets. The most effective way of reducing the bill is to reduce consumption, which is the power used by the ICT equipment and overhead systems.

Measuring Power Use

When planning the data collection, thought should be given to how much data will be collected and how often it will be collected. Setting the time interval for data collection measurements is a matter of judgment. A one minute interval is often used for systems that are essential to maintaining reliability, such as the UPS, and five or ten minutes for other systems.

Small data centres may only generate kilobytes of data, easily analysed by a standard spreadsheet tool. Very large data centres can generate gigabytes of data, requiring a specialised software package.
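The effect of the collection interval on data volume can be estimated before monitoring is set up. The following is a minimal sketch only: the function names and the figure of 64 bytes per stored sample are illustrative assumptions, not values taken from this guide or from any monitoring product.

```python
# Rough estimate of monitoring data volume for a chosen collection interval.
# ASSUMPTION: 64 bytes per stored sample is an illustrative figure only.

MINUTES_PER_DAY = 24 * 60


def samples_per_day(interval_minutes: float) -> int:
    """Number of measurements one device produces in a day."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    return int(MINUTES_PER_DAY / interval_minutes)


def daily_volume_bytes(device_count: int,
                       interval_minutes: float,
                       bytes_per_sample: int = 64) -> int:
    """Approximate bytes of monitoring data collected per day."""
    return device_count * samples_per_day(interval_minutes) * bytes_per_sample


if __name__ == "__main__":
    # One minute interval for critical systems, five minutes for the rest.
    critical = daily_volume_bytes(device_count=4, interval_minutes=1)
    other = daily_volume_bytes(device_count=200, interval_minutes=5)
    print(f"Critical systems: {critical} bytes per day")
    print(f"Other systems: {other} bytes per day")
```

An estimate of this kind can help decide whether a spreadsheet will suffice or whether a specialised package is likely to be needed.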
A baseline should be maintained of the power consumed by each device in the data centre. Analysis of the baseline can provide significant insights into the state of the data centre. High and low power consumption thresholds should be identified for all equipment, in particular the HVAC and ICT equipment. Examples of using this analysis are:
- ICT servers that remain under the low usage threshold will have low or zero utilisation. These are candidates for being reallocated or removed.
- Computer Room Air Conditioning (CRAC) units in close proximity that exceed the high usage threshold may have conflicting settings, resulting in one CRAC unit working to reach a particular temperature or humidity, and the other CRAC unit working to reach a different state. Correcting the settings of the CRAC units in this circumstance will save power and reduce costs.

Metric – PUE or NABERS

The PUE metric [5] is widely used in the data centre industry.

Good     Better   Best [6]
< 1.9    < 1.7    < 1.5

The advantage of PUE is that it provides a simple, meaningful measure of power efficiency that addresses productive and overhead uses of power. Agencies should note that in October 2012 The Green Grid clarified how PUE is to be measured and reported. This may require changes in existing reports.

[5] http://www.thegreengrid.org/en/Global/Content/white-papers/WP49-PUEAComprehensiveExaminationoftheMetric
[6] The values are taken from the Uptime Institute survey.

NABERS for data centres measures greenhouse gas emissions across a period of time. NABERS collects information on the power used by the data centre systems, and the utilisation of the ICT equipment in the data centre including the servers, storage and networks. NABERS has a star rating: 3 stars Average, 4 stars Good, 5 stars Excellent and 6 stars Market Leading.

Agencies should pick only one metric when beginning to monitor and improve data centre efficiency. There is no equivalence between the two metrics.
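The PUE ratio described in this guide (total power used by the ICT equipment and overheads, divided by the power used by the ICT equipment) can be illustrated with a short calculation. This is a minimal sketch: the function names and the sample readings are illustrative assumptions, and it does not reflect The Green Grid's detailed measurement rules.

```python
# Minimal PUE calculation from two power readings taken over the same period.
# ASSUMPTION: function names and sample values are illustrative only.


def pue(total_facility_kw: float, ict_kw: float) -> float:
    """PUE = (ICT power + overhead power) / ICT power."""
    if ict_kw <= 0:
        raise ValueError("ict_kw must be positive")
    if total_facility_kw < ict_kw:
        raise ValueError("total power cannot be less than ICT power")
    return total_facility_kw / ict_kw


def overhead_per_ict_watt(pue_value: float) -> float:
    """Watts of overhead consumed for every watt delivered to ICT equipment."""
    return pue_value - 1.0


if __name__ == "__main__":
    value = pue(total_facility_kw=190.0, ict_kw=100.0)
    print(f"PUE: {value:.2f}")
    print(f"Overhead per ICT watt: {overhead_per_ict_watt(value):.2f} W")
```

With these sample readings the overhead is 0.9 watts for every watt powering the ICT equipment, which matches the explanation of a PUE of 1.9 given in the policy framework footnote.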
The data centre strategy and DCOT used PUE as it was available and recognised, with the expectation of switching to NABERS once it became available. As an Australian standard, NABERS for data centres is better aligned with the local data centre industry.

ICT Efficiency

Reducing the power needed by the ICT equipment (the productive use) is often the most effective way to maximise power efficiency. Reducing the ICT equipment’s power load also means that less overhead power is needed; for example, less heat is generated, so less cooling is needed.

Actions that reduce the power needed by ICT equipment include:
- Virtualisation: moving workloads from dedicated ICT equipment (including servers, storage and networks) to shared ICT equipment can reduce the amount of power required by 10% to 40%.
- Decommissioning: disused ICT equipment is sometimes left powered on; decommissioning and removing it eliminates that power use.
- Modernising: the latest models of ICT hardware use much less power for equivalent performance. Gartner advises that server power requirements have dropped by two thirds over the past two generations.
- Consolidation: physical and logical consolidation projects can rationalise the data centre ICT equipment.

Cooling Efficiency

The cooling systems are usually the major source of overhead power consumption, and so there is usually value in making cooling more efficient. There is a wide range of data centre cooling technology, which gives agencies great flexibility in investing in an optimum solution. Common techniques to minimise power use include:
- Free air cooling brings the cooler air outside the data centre into the data centre through dust and particle filters. In most Australian cities free air cooling can be used over 50 per cent of the time, and in Canberra over 80 per cent of the time.
- Hot or cold aisle containment is a technique that aligns all the ICT equipment in the racks so that all of the cold air arrives on one side of the rack and leaves on the other side of the rack. This means that the chilled air produced by the cooling system is delivered to the ICT equipment without mixing with the warmer exhaust air.
- Raising the data centre temperature exploits the capability of modern ICT equipment to operate reliably at higher temperatures. Data centres can now operate at between 23 and 28 degrees Celsius, rather than the traditional 18 to 21 degrees Celsius. Operating at higher temperatures means much less power is needed for cooling, and free air cooling becomes even more effective.

The American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) publishes guidance on maintaining optimum air temperatures in data centres.

Agencies should also evaluate the environmental impact of cooling solutions. The environmental impact of cooling systems is typically excessive water use; however, some cooling systems use hazardous chemicals.

The investment case for cooling systems is quite different to that for ICT equipment. The asset life is usually 7 to 15 years. During the life of the cooling systems, the ICT equipment can be expected to change between two and five times. The amount of cooling required will vary significantly as the ICT equipment changes. This variability means that agencies should seek cooling solutions that can adjust as the demand for cooling rises and falls.

Losses in Power Systems and Cables

Converting power from high voltage into usable voltages and then distributing this power throughout the data centre results in inefficiencies, or losses. Losses occur in the power equipment, such as the transformer, distribution board, and UPS, and in the wiring (copper loss) that distributes the power throughout the data centre.
The power factor is a measure of how efficiently power is distributed through the data centre. Efficient data centres have a power factor between 0.90 and 0.99, while inefficient data centres have a power factor below 0.80. Minimising power losses generally requires capital investment in new subsystems or auxiliary conditioning equipment.

Power can be distributed efficiently over long distances as 3 phase power. The common alternative is to distribute the power as single phase power. In nearly all circumstances, single phase power will have higher losses than 3 phase power. However, these losses depend on the length and cross section of the copper cable. The difference between single and 3 phase power is negligible for distances of less than 100 metres.

Generally, purpose built commercial data centres will use 3 phase power to reach the equipment racks. The 3 phase power is then converted to single phase power to be distributed to the racks.

If 3 phase power is used to distribute power throughout the data centre, then the circuits should be balanced. When the phases are balanced, three phase power is distributed with very small losses throughout the data centre. Balancing 3 phase power circuits requires planning and monitoring, particularly to ensure that the UPS has capacity for the total load and for each circuit. Ensuring that the 3 phase power circuits are balanced should be a routine maintenance task.

Continuity of Power and Reliability

The reliability of the power supplied by the data centre to the ICT equipment is a major factor in the overall reliability of the ICT system. It is common to mitigate the effects of failures by duplicating key components. This imposes higher capital costs, and generally the power systems are less efficient.

Typically, the business continuity planning process produces an analysis of business goals that can be mapped simply to reliability targets for the data centre.
In general terms, requirements for power continuity will fall into one of three categories:
- Essential that there are no outages. The ICT systems should be operational at all times.
- Important that there are no unplanned outages. The ICT systems are useful to normal operations. These systems can be turned off with prior planning.
- Of little or no importance. The ICT systems can be allowed to turn off in the event of a loss of power. These systems are not essential to normal operations, or there are other controls in place, such as an alternate system that will take over in another location.

Organisations that maintain high efficiency metrics and high reliability for their data centres are increasingly relying on virtualisation and cloud technology. In this model, the data centre is highly reliable, the ICT hardware is based on identical, inexpensive components that are not redundant, and the virtualised operating environment protects the applications and business processes against individual hardware component failures. This also works to mitigate the effect of partial failures in the power supply in the data centre.

Trade-off between Efficiency and Reliability

There is a trade-off between efficiency and reliability of power distribution. The various components (transformers, UPS, PDUs) operate most efficiently at high utilisation rates, typically above 90%. However, highly reliable power distribution designs typically provide two paths to the ICT equipment. In normal operations, each path carries less than half the power to the ICT equipment. If a failure occurs in one path, then the other path is capable of carrying the full power load to the equipment.

This means that in normal operations the components are operating at less than 50% capacity. Most power systems are less efficient in this range. Newer equipment typically loses 2% to 4%, while older equipment can lose as much as 50%.
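The utilisation figures in this trade-off follow from simple arithmetic. The sketch below assumes the ICT load is shared equally between the paths and that each path is sized to carry the full load alone; the function name and sample values are illustrative assumptions, not design guidance.

```python
# Utilisation of each path in a redundant power distribution design.
# ASSUMPTIONS: load is shared equally between paths, and each path is sized
# to carry the full ICT load if the other path fails.


def path_utilisation(ict_load_kw: float,
                     path_capacity_kw: float,
                     paths: int = 2) -> float:
    """Fraction of a path's capacity in use during normal operations."""
    if paths < 1:
        raise ValueError("paths must be at least 1")
    if path_capacity_kw < ict_load_kw:
        raise ValueError("each path must be able to carry the full load")
    return (ict_load_kw / paths) / path_capacity_kw


if __name__ == "__main__":
    normal = path_utilisation(ict_load_kw=80.0, path_capacity_kw=100.0)
    failed = path_utilisation(ict_load_kw=80.0, path_capacity_kw=100.0, paths=1)
    print(f"Normal operations: {normal:.0%} per path")
    print(f"After one path fails: {failed:.0%} on the remaining path")
```

Under these assumptions a two path design can never exceed 50% utilisation per path in normal operations, which is why it sits below the range where the components are most efficient.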
Another advantage of highly reliable power distribution designs is that the amount of power sent from a UPS to a group of racks can be adjusted while the power supply to other racks continues unaffected. This allows ICT equipment to be replaced in some racks without disrupting other ICT equipment. In older power distribution designs, all of the racks connected to the UPS would have to be powered down.

Inspection and Maintenance

Every component in a power system is subject to wear and tear. Regular inspection and maintenance is essential.

Inspections can be aided by analysing the data collected for efficiency reporting. Variations, especially power losses, usually indicate a potential failure. A thermal imaging camera can detect hot spots, which are also signs of an incipient failure. Early detection and remediation is less disruptive and costly.

Maintenance work is essential. All components, including cables, should be maintained according to the manufacturer’s schedules. There is an increased risk to business operations when the maintenance work is carried out. This business risk should be assessed and controlled by business, ICT and data centre staff.

Some data centres are designed so that maintenance work requires the ICT equipment to be shut down while the work is being done. Provided this shutdown is planned and agreed with the business, and the length of the shutdown is within the reliability target, the better practice is being followed.

Other data centre designs allow for maintenance work to be carried out while the ICT equipment continues normal operation. The maintenance work still elevates the level of risk, as the level of protection provided by the data centre design has been lowered. It remains better practice to review this risk with the business prior to the maintenance work.

Conclusion

Keeping a data centre performing optimally is challenging, due to the multiple interacting systems.
It is essential that power use is monitored, and that there are clear expectations of the reliability of the power supply in a data centre.

Agencies must decide how much to invest in their data centres to obtain value for money and to support achieving the agency outcomes. The better practices will assist agencies to reach these objectives. The DCOT policy describes the minimum performance measures for APS data centres and ICT equipment.

If an agency decides that the data centre performance is inadequate, the first point to review is the data centre operations. If the operations are satisfactory (that is, delivering the full capability of the design), then the data centre design must change. Generally, changing the data centre design means moving to a different data centre.

As part of the data centre strategy two whole of government panels have been established. The Data Centre Migration Services panel allows agencies to obtain any mix of data centre project services, from design services to establish requirements and assess existing data centres, through commissioning and moving to another data centre, to cleaning and decommissioning the old data centre. The Data Centre Facility panel provides access to quality commercial data centre facilities.

3. Better practices

Data centres are very diverse in size and operational purposes. The guide focuses on key principles and processes, enabling agencies to gauge the operational practices in a data centre. Applying the better practices will require pragmatic judgements by agencies.

Organisation of the Better Practices

The better practices have been grouped into related topics:
- Fundamental: the most important practices are safety, targets and monitoring. If these practices are not being followed, the other practices will deliver limited benefits or unnecessary costs.
- Power System: practices relating to the subsystems that transform, condition, protect and distribute power throughout the data centre. Also includes continuity of supply of power to the ICT equipment, electricity billing and environmental considerations.
- Power Consumption: practices relating to the power used by other data centre subsystems, in particular the ICT equipment and HVAC.

[Figure 3: Separation of the Better Practices for Power Systems and Power Consumption. Elements shown: local distribution lines to the building, main switchboard, backup generator, UPS, DB, equipment racks, ICT equipment and HVAC, grouped under Power System and Power Consumption.]

Implementing the Better Practices

The successful application of the better practices will require planning, analysis and integration with existing capabilities. Agencies should expect to:
- Plan a phased implementation.
- Analyse how each better practice can be applied to their data centre.
- Create a standard operating procedure and supporting training material.
- Integrate the standard operating procedures into the existing ICT and data centre operations.
- Extend existing processes, such as capacity management, configuration control and availability monitoring, to include the power better practices.
- Monitor the first uses of the procedures, and adjust the procedures and training as needed.

Better Practices – Fundamentals

Business Alignment – Metrics and Reporting

The data centre performance must be linked to the agency’s goals by clear metrics. Reliability and efficiency are the two key metrics of data centre power. The targets must be endorsed by the agency’s senior responsible officer.

The reliability target should be based on the agency’s business continuity and disaster recovery planning. The reliability metric is commonly expressed in terms of continuity of the power supply to ICT systems.
For example, critical systems must always have power, and useful systems must shut down gracefully and be able to be restarted within 2 hours of power being restored.

The efficiency target can be measured and reported in one of two ways: Power Usage Effectiveness (PUE) or NABERS for data centres. The better practice standard for existing data centres is either a PUE less than 1.7 [7], or a NABERS rating of 4 Stars [8] or better.

[7] The Uptime Institute’s 2012 Data Centre Survey reports the average PUE as between 1.8 and 1.89. The survey was conducted in March and April 2012, and over 1,100 organisations responded from around the world. “The 21st Century Data Center: An overview”, ZDNet, http://www.zdnet.com/the-21st-century-data-center-an-overview-7000012996/ (confirmed 16 April 2013).
[8] NABERS website – 4 stars is a rating of ‘Good Performance’. http://www.nabers.gov.au/public/WebPages/ContentStandard.aspx?module=10&template=3&include=6starrating.htm&side=factsheets.htm

Monitoring, Analysis and Reporting

Improving and sustaining the data centre performance requires regular monitoring and reporting. The monitoring system should record the state of all data centre power subsystems, and the power consumed by the ICT equipment and overhead systems.

Agencies should set up an automated monitoring and data collection process as early as possible. Reporting should begin once the monitoring is in place and collecting data. The following conditions should be reported:
Voltage: transients, interruptions, sag / undervoltage, swell / overvoltage, waveform distortion, voltage fluctuations, frequency variations.
Current: over-current, under-current, idle equipment.

Unexplained variations from the baseline usually indicate either failing components or unplanned configuration changes. Early detection of, and response to, these variations assists in maintaining reliability and containing costs.
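The efficiency metric and the baseline check described above can be sketched in a few lines. This is a minimal illustration only, not part of the guide: it assumes PUE is calculated as total facility power divided by ICT equipment power, and the function names, the sample readings and the three standard deviation threshold are all hypothetical choices made for the example.

```python
from statistics import mean, stdev

PUE_TARGET = 1.7  # better practice target for existing data centres in this guide


def pue(total_facility_kw: float, ict_kw: float) -> float:
    """Power Usage Effectiveness: total facility power divided by ICT power."""
    if ict_kw <= 0:
        raise ValueError("ICT power must be positive")
    return total_facility_kw / ict_kw


def meets_target(value: float, target: float = PUE_TARGET) -> bool:
    """True when the measured PUE is below the target."""
    return value < target


def flag_variations(baseline, readings, threshold=3.0):
    """Return (index, reading) pairs that differ from the baseline mean
    by more than `threshold` baseline standard deviations.

    The threshold of 3.0 is an assumption for illustration, not a
    value taken from the guide.
    """
    centre = mean(baseline)
    limit = threshold * stdev(baseline)
    return [(i, r) for i, r in enumerate(readings) if abs(r - centre) > limit]


if __name__ == "__main__":
    # Hypothetical readings in kW, for illustration only.
    measured = pue(total_facility_kw=165.0, ict_kw=100.0)
    print("PUE:", round(measured, 2), "meets target:", meets_target(measured))

    baseline_kw = [100.0, 102.0, 98.0, 101.0, 99.0]
    new_kw = [100.5, 99.0, 112.0, 101.0]
    print("Readings to investigate:", flag_variations(baseline_kw, new_kw))
```

A flagged reading is only a prompt to investigate. Whether it reflects a failing component or an unplanned configuration change still needs to be established by data centre staff.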
Electricity Billing

Information from the power bill should be combined with other reporting to report the consumption patterns of all equipment in the data centre, including the power cost for each ICT system. The analysis of the billing information, power consumption and demand should be available to all teams involved in the data centre. When reducing the power bill, data centre and ICT managers should focus on reducing consumption, as this has proven the most effective approach.

Better Practices – Power Systems

This set of better practices focuses on the subsystems that distribute the power through the data centre.

Just enough power

Using just enough power throughout a data centre can markedly improve efficiency. Any excess power usually becomes excess heat, which uses yet more power to remove it.

Efficient data centres have a power factor between 0.95 and 0.99. Better practice is to monitor the power factor, and make value for money decisions to raise the power factor.

If 3 phase power is used to distribute power then the circuits should be balanced. Confirming this and taking corrective action should be a routine operation.

The optimum supply voltage for the power supply units (PSUs) in the ICT equipment will be determined. It is likely to be 230V for servers. If practical, the power supply will be adjusted to reduce the excess power and so remove excess heat.

Inspection

Normal fluctuations in power cause expansion and contraction, particularly at connections in switchboards, PDUs and equipment racks. Over time this can lead to components failing, disrupting the power supply. The better practice is to conduct regular inspections using a thermal imaging camera.

Maintenance

Maintenance is essential to keep the data centre operating. The data centre equipment must be maintained according to the manufacturer’s specifications.
However, maintenance activities increase risks to business operations during the time the maintenance is being done. The increased risk to business during maintenance activity should be assessed and controlled by business, ICT and data centre staff.

Efficiency and Economy Modes

Modern data centre equipment offers flexible operating modes that reduce power consumption. For example, HVAC systems have economisers that draw in cooler outside air, variable speed fans and more controlled responses to changes in conditions. The better practice is to ensure that the economy modes are being used to deliver optimum efficiency without compromising reliability.

Power failure mitigation and continuity provisioning

The better practice for power failure mitigation and continuity provisioning is that the agency’s declared business needs determine the level of protection for each ICT system. This then determines the investment in protecting each ICT system against power failure. The procedures and systems to ensure continuity of power should be practised regularly.

Sustainable Water Use

Agencies optimising sustainability should estimate, report and optimise water use [9] of the data centre. This includes on-site and off-site water use. Major on-site uses of water are cooling and humidifying. The key off-site use of water is electricity generation [10].

Better Practices – Power Consumption

This set of better practices focuses on the subsystems that are the major consumers of power.

Data Halls

A data hall is a room with dedicated PDUs and CRAC units within a data centre. Data halls should be in one of three states:
Full: should be operating at peak efficiency.
Changing: all ICT moves, adds and changes should be in this data hall.
Empty: should be powered off, and may not have power or cooling systems installed.

The air temperature should be raised, consistent with ASHRAE recommendations and ICT equipment tolerances.

Data centre staff must plan the power capacity provided to racks and areas of the data centre floor. As equipment is installed and removed, it is necessary to consider the impact on the power demand and make suitable adjustments.

Better practice also requires that power cables are fit for purpose, catalogued and arranged in conduits or trays. The power cables must not obstruct airflow, or create hazards for staff. There must be no domestic power boards or extension cords in the data hall.

[9] The Green Grid, “WP#35-Water Usage Effectiveness (WUE™): A Green Grid Data Center Sustainability Metric”, http://www.thegreengrid.org/en/Global/Content/white-papers/WUE
[10] http://www.ret.gov.au/energy/Documents/sustainbility-and-climatechange/Water%20and%20the%20Electricity%20Generation%20Industry%20Report.pdf

Equipment Racks

The power load to be carried by the circuits, and the circuit breakers, will be confirmed during the installation planning step for new ICT equipment. If intelligent power rails are installed, then power is turned on to each outlet only once change control is approved. The racks and equipment are earthed according to the electrical design for the data centre.

ICT Equipment

The DCOT targets are to be met by June 2015. Facilities and ICT staff should identify and agree what to do with:
Idle ICT equipment that may be able to be powered off.
Obsolete and powered down equipment that may be able to be removed from the data centre.

Power costs should be attributed to ICT equipment. Power losses should be attributed to ICT equipment.

When power is being restored to the data centre following a partial or complete power down, the ICT equipment is restarted in stages. This is to protect the power systems and shorten the ICT system restoration.
The start up phase of most ICT equipment draws more power than normal operations. This peak demand may cause circuit breakers to trip unnecessarily, resulting in extended outages. Modern ICT systems in data centres are complex and typically need to be started in a particular sequence. Restarting the ICT equipment out of sequence can cause avoidable problems. Power should be restored to the ICT equipment in a staged manner, allowing enough time for software to load and configure itself, and become available for other systems.

Fundamentals
⊠ The safety plan has the goal of no injuries due to data centre power. The procedures and training for staff and visitors support this goal.
⊠ The safety incident log is actively used to identify, develop and apply preventive actions to improve the effectiveness of the safety program.
⊠ The reliability targets and associated metrics are defined, based on the agency’s business continuity plan or equivalent. The reliability targets and metrics are signed off by the senior responsible officer.
⊠ The efficiency targets and associated metrics are defined, based on the agency’s business requirements. The efficiency targets and metrics are signed off by the senior responsible officer. The measure is either PUE < 1.7 or NABERS rating of 4 stars or better.
⊠ Power systems are being automatically monitored.
⊠ Regular reports on safety, reliability and efficiency are provided to the senior responsible officer, ICT and facility managers.

Power Systems
⊠ Data centre equipment is maintained according to manufacturers’ schedules.
⊠ Regular inspections of all switchboards, switchgear and controlgear are conducted using a thermal imaging camera.
⊠ Efficiency and economy modes have been evaluated and used appropriately.
⊠ Three phase power circuits have balanced phases.
⊠ Power factor is as high as value for money will permit.
⊠ Power cables are in marked single purpose trays or conduits, are not hazards, and do not block airflow.
⊠ Domestic grade electrical equipment, including double adapters, extension cords and power strips, is not used in the equipment racks, data halls or data centre. Domestic grade equipment may be used in the office area only.
⊠ The contingency procedures established to meet the reliability targets and manage risks are practised regularly.
⊠ Risk management plan is current, jointly developed and agreed with business owners.
⊠ Operations check the contingency plan and systems monthly for readiness and completeness.
⊠ Report data centre water use regularly.

Power Consumption
⊠ DCOT virtualisation targets are met by June 2015.
⊠ Air temperature in data hall has been raised, to reduce power use while maintaining ICT equipment reliability and warranties.
⊠ ICT equipment is powered down when idle. ICT equipment is removed from the data centre when not required.
⊠ Power cables in equipment racks do not impede airflow and cannot be caught in doors.
⊠ The voltage to PSUs has been tuned to reduce heat.
⊠ Power loss (inefficiency) in the ICT equipment is tracked and reported.
⊠ Power costs are reported monthly, and attributed to significant ICT systems and overhead systems.
⊠ Equipment racks are earthed, according to the electrical and earthing design.
⊠ Procedures and features to maintain continuity of power to ICT systems are tested regularly.
⊠ ICT equipment installation considers the impact of the increased power demand at the equipment rack, circuit breakers, distribution boards and UPS.
⊠ If intelligent power rails are installed in the equipment racks, these are being used to control equipment installation (turn off all outlets except for authorised equipment) and to control surges due to large scale power restarts.
⊠ Restoration of power following a power down is staged.

4.
Conclusion

Agencies that use better practices in their data centres can expect lower costs, better reliability and improved safety. Implementing the better practices will give managers more information about data centre power, enabling better decisions. Overall, the data centre will become more efficient, and better aligned to the agency’s strategic objectives.

Agencies will also find it simpler and easier to report against the mandatory objectives of the data centre strategy. The key metric is avoided costs, that is, the costs that agencies did not incur as a result of improvements in their data centres. Capturing avoided costs is most effective when done by an agency in the context of a completed project that has validated the original business case.

Summary of Better Practices

Data centre operations are aligned to business expectations. A reliability target, based on the business continuity plan, has been set, and an efficiency target, either PUE < 1.7 or NABERS 4 star, has been set. The work health and safety plans with respect to risks from data centre power have a goal of zero injuries.

The agency is routinely:
Tracking and reporting power use in the data centre.
Analysing power efficiency of installed components.
Reviewing the power bills for evidence that planned actions are reducing electricity use and power costs.
Responding to losses in power efficiency, by finding root causes and taking corrective action.
Examining the power path and subsystems for signs of heating and other early warnings of failure.
Investing to improve or maintain data centre operations performance to reach or maintain the planned targets.