6 steps to better data centers

Review existing data centers for improvement opportunities like power consumption and effective heating and cooling.

By Daniel Hallett, Arup, Los Angeles | April 12, 2011

Management of data storage and processing is part of every business, and the need for data centers and IT facilities is common across nearly all business types. Data centers provide centralized IT systems, with power, cooling, and operational requirements well beyond typical design parameters. This high density of power and cooling drives the need for continuous improvement; the goal of any system design or redesign should be to optimize the performance of existing equipment and to prioritize replacement and reorganization of outdated systems.

This article provides a series of steps to guide the evaluation of an existing facility and proposes targeted improvements for reducing energy use and CO2 emissions.

Why improve performance of an existing data center? There are several reasons.

Operational enhancement: Improving the performance of data center systems will offer great benefits to the bottom line and allow for greater flexibility in future expansion:

  • Decreased operating and energy costs
  • Decreased greenhouse gas emissions, critical in anticipation of a future carbon economy.

Increased reliability and resilience: Continuity of supply and zero downtime in IT services result in greater resilience for the business. Improving resilience and reliability results in increased accessibility and facility use, and provides for adaptability into the future.

Consider how critical the data center applications and services are to an operation: What will it cost if no one can send e-mail, access an electronic funds transfer system, or use Web applications? How will other aspects of the business be affected if the facility fails?

Greater system dynamics: Assessment of an existing facility will lead to increased integration of all system components. Increasing data processing potential cannot be considered without understanding the implications on cooling and power demand, and the management systems behind the processes. All aspects of the data center system must be looked at holistically to achieve the greatest results.

Review and improve

Compared to similar-sized office spaces, data center facilities typically consume 35 to 50 times as much energy in normal operation, with correspondingly higher CO2 emissions. Power demand for IT equipment greater than 100 W/sq ft is not uncommon, and the requirement for data storage and transfer capability is only going to rise.

Whether the driver for improvements is overloaded servers, a programmed budget, or corporate energy-saving policy, an analysis of energy use and system management will benefit the business. The assessment process should first establish where energy is being used and how the system currently operates, and then identify where the supply systems, infrastructure, and management of the facility can be optimized.

1: Review computer rack use

Levels of data storage and frequency of application use fluctuate in a data center as users turn on computers and access e-mail, the Internet, and local servers. The supporting power and cooling systems are typically sized with no diversity in IT demand, so their full capacity is rarely required.

Figure 3 illustrates typical server activity across a normal office week. At different times of the day each server may approach maximum use, but for the majority of the time, rack utilization may be only 10% to 20%.

Low server utilization results in inefficient, redundant power consumption for the facility. For many server and rack combinations, the power consumed at 50% utilization is similar to that consumed at 100%. For racks at or above 3 kW, this wasted energy can be substantial once the associated cooling and other facility loads are also considered. To improve the system’s energy use, a higher level of utilization should be achieved in fewer racks.

Consolidation of the servers allows multiple applications and data to be stored on fewer racks, consuming less “redundant” power. Physical consolidation of servers can be further improved when implemented with virtualization software. Virtualization separates the computer hardware (servers) from the software running on it, eliminating the physical bond between particular applications and dedicated servers. Applied effectively, virtualization leads to markedly improved utilization rates.
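
To illustrate the arithmetic behind consolidation, the Python sketch below compares a set of lightly loaded servers with the same workload virtualized onto fewer hosts. The power figures and utilization levels are hypothetical, and the linear power model is a simplification, not measured data.

    # Illustrative only: estimate power saved by consolidating lightly loaded
    # servers onto fewer hosts. Power figures and utilization levels are hypothetical.

    IDLE_POWER_W = 200.0   # assumed draw of a server at very low utilization
    FULL_POWER_W = 300.0   # assumed draw of the same server near 100% utilization

    def server_power(utilization):
        """Linear interpolation between idle and full-load power (simplified model)."""
        return IDLE_POWER_W + (FULL_POWER_W - IDLE_POWER_W) * utilization

    # Before: 10 physical servers, each averaging 15% utilization
    before = 10 * server_power(0.15)

    # After: the same workload virtualized onto 2 hosts at ~75% utilization
    after = 2 * server_power(0.75)

    print(f"Before consolidation: {before:.0f} W")
    print(f"After consolidation:  {after:.0f} W")
    print(f"IT power saved:       {(1 - after / before):.0%}")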

2: Review power consumption, supply

Reducing the power consumed by a facility requires an understanding of where and how energy is being used and supplied. There are many potential sources of inefficiency, and they must be identified before the data center’s energy performance can be improved.

Power that enters a data center can be divided into two components:

  • IT equipment (servers for data storage, processing, and applications)
  • Supporting infrastructure, including cooling, UPS and switchgear, power distribution units (PDUs), lighting, and other loads.

Figure 6 provides an example of the split in power demand across a facility. In this example, 45% of total data center power is consumed by supporting infrastructure and therefore is not used for the core data processing applications. If a facility is operating at 100 W/sq ft of IT power demand, the supporting infrastructure alone would add roughly 80 W/sq ft of power demand, with the associated energy costs and CO2 emissions.
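
The overhead figure follows directly from the split; the short Python calculation below reproduces it using the 45%/55% example from Figure 6.

    # If 45% of total facility power goes to supporting infrastructure, then for
    # every 100 W/sq ft of IT load the infrastructure adds roughly 80 W/sq ft.

    infrastructure_share = 0.45          # fraction of total facility power
    it_share = 1.0 - infrastructure_share

    it_power_density = 100.0             # W/sq ft of IT equipment power
    overhead_density = it_power_density * infrastructure_share / it_share

    print(f"Supporting infrastructure adds ~{overhead_density:.0f} W/sq ft")  # ~82 W/sq ft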

To compare performance of one data center’s power usage to another’s, a useful metric is the power usage effectiveness, or PUE. This provides a ratio of total facility power to the IT equipment power:

                                               PUE = total facility power / IT equipment power

The optimal use of power in a data center is achieved as the PUE approaches 1. Studies show that data centers average a PUE of 2.0 to 2.5, with goals of 1.5 and even 1.1 for state-of-the-art facilities. For example, a facility with a PUE near 3.0 will consume more than twice the total power of a facility operating at a PUE of around 1.3 for the same IT load.
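
The Python sketch below applies the PUE definition to two hypothetical facilities serving the same IT load, reproducing the comparison above.

    # PUE = total facility power / IT equipment power.
    # Compare two facilities serving the same 1 MW of IT load (illustrative numbers).

    def total_power_kw(it_power_kw, pue):
        return it_power_kw * pue

    it_load_kw = 1000.0
    poor = total_power_kw(it_load_kw, 3.0)    # inefficient facility
    good = total_power_kw(it_load_kw, 1.3)    # near state-of-the-art facility

    print(f"PUE 3.0 facility: {poor:.0f} kW total")   # 3000 kW
    print(f"PUE 1.3 facility: {good:.0f} kW total")   # 1300 kW
    print(f"Ratio: {poor / good:.1f}x the total power for the same IT load")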

Effective metering of a data center should be implemented to accurately understand the inputs and outputs of the facility. Accurate data measurement will allow continuous monitoring of the PUE and also allow effective segregation of power used by the data center from other facilities in the building.

To improve the total system efficiency and PUE of a site, the first step is to reduce the demand for power. Table 1 highlights the strategies for demand reduction with a basic description of each.

Following the reduction of power consumption, the second step toward improving the facility’s efficiency and performance is to improve the supply of power. Power supply for data center systems typically relies on many components, each with an associated efficiency of transmission (or generation). As power is transferred from the grid through the UPS and PDUs to the racks, efficiency is lost at each stage, so every component matters to the overall system efficiency.
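
Because component efficiencies multiply along the supply chain, small losses compound. The Python sketch below illustrates this with purely illustrative efficiency values, not manufacturer data.

    # Component efficiencies multiply along the supply chain, so losses compound.
    # The efficiency values below are illustrative, not manufacturer data.

    chain = {
        "transformer/switchgear": 0.98,
        "UPS": 0.92,
        "PDU": 0.97,
        "rack power supplies": 0.90,
    }

    overall = 1.0
    for component, efficiency in chain.items():
        overall *= efficiency
        print(f"after {component:<24} cumulative efficiency = {overall:.1%}")

    # Roughly this fraction of the power drawn from the grid is lost as heat
    # before it ever reaches the IT load.
    print(f"Power lost in distribution: {1 - overall:.1%}")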

A review of the manufacturer’s operational data will indicate the supply equipment’s efficiency, but note that as equipment ages, its efficiency declines.

An effective power supply system should ensure that supply is always available, even in the event of equipment failure. Resilience of the supply system is determined by the level of redundancy in place and the elimination of single points of failure. The Uptime Institute’s four-tier classification system should be consulted, with the most suitable level selected for the site.

In most locations, reducing demand on the grid supply will result in higher efficiency and reduced greenhouse gas emissions. The constant cooling and electrical load required by the site can also favor a centralized energy hub, possibly using a cogeneration/trigeneration system, in which waste heat from the production of electrical power provides cooling via an absorption chiller.

3: Review room heat gains

As with power consumption, any heat gains not directly due to IT server equipment represent an additional energy cost that must be minimized. Improvements from reducing unnecessary heat gains can often be implemented at little cost, resulting in short payback periods from the energy savings.

Computing systems convert nearly all incoming electrical energy into heat. For server racks, every 1 kW of power generally requires 1 kW of cooling; this equates to very large heat loads, typically 100 W/sq ft and higher. These average heat loads are rarely distributed evenly across the room, allowing excessive hot spots to form.
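
The arithmetic behind these heat loads is straightforward; the Python sketch below uses hypothetical rack counts, rack powers, and room area to show how quickly heat density reaches the 100 W/sq ft range.

    # Every kW of IT power becomes roughly a kW of heat to be removed.
    # Rack count, rack power, and room area below are hypothetical.

    racks = 40
    avg_rack_power_kw = 3.0
    room_area_sqft = 1200.0

    heat_load_kw = racks * avg_rack_power_kw                 # ~= required cooling, kW
    heat_density_w_per_sqft = heat_load_kw * 1000 / room_area_sqft

    print(f"Total heat load / required cooling: {heat_load_kw:.0f} kW")
    print(f"Average heat density: {heat_density_w_per_sqft:.0f} W/sq ft")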

The layout of the racks in the data center must be investigated and any excessively hot zones identified. Isolated hot spots can result in over- or under-cooling and need to be managed. For a typical system with room-wide control, any excessively hot servers should be evenly spaced out by physically or virtually moving the servers (Figure 8). If the room’s control provides rack-level monitoring and targeted cooling, then isolated hot spots may be less of an issue.

Room construction

A data center does not have the aesthetic and comfort requirements of an office space. It should be constructed with materials offering the greatest insulation against heat transmission, both internally and externally.

Solar heat gains through windows should be eliminated, and any gaps that allow unnecessary infiltration/exfiltration need to be sealed.

Switchgear, UPS, other heat gains

Associated electrical supply and distribution equipment in the space will add to the heat gain in the room, due to transmission losses and inefficiency in the units. Selection of any new equipment needs to take this heat gain into account, and placement should be managed to minimize infringement on core IT systems for cooling.

4: Review cooling system

Data center cooling is as critical to the facility’s operation as the main power supply. If the cooling system fails, the high heat loads from the server racks will drive room and equipment temperatures above critical levels within minutes.

Ventilation and cooling equipment

Ventilation is required in the data center space for the following reasons only:

  • Provide positive air pressure and replace exhausted air
  • Allow minimum outside airflow rates for maintenance personnel, as per ASHRAE 62.1
  • Provide smoke extraction in the event of fire.

Ventilation rates in the facility do not need to exceed the minimum requirement and should be reduced if they do, to avoid unnecessary treatment of excess makeup air.

The performance of the cooling system largely determines the facility’s total energy consumption and CO2 emissions. There are various configurations of ventilation and cooling equipment, with many different types of systems for room control, and improvements to an existing facility may be restricted by the larger building’s infrastructure and the room’s location. Recent changes to ASHRAE 90.1 now include minimum efficiency requirements for computer room air conditioning units, providing a baseline for equipment performance.

After reviewing the heat gains from the facility (step 3), the required cooling for the room will be evident.

  • Can the existing cooling system meet this demand or does it need to be upgraded?
  • Is cooling provided by chilled water or direct expansion (DX)? Chilled water will typically offer greater efficiency but is restricted by location and plant space for chillers.
  • Does the site’s climate allow energy savings from economizers and indirect cooling?
  • What type of heat removal system is used in the data center space? Can it effectively remove the heat from the servers and provide conditioned air to the front of the racks as required?

Effective cooling: removing server heat

Mixing of hot and cold air should be minimized. There should be a clear path for cold air to flow to the servers, with minimal intersection with the hot return air. The most effective separation of hot and cold air will depend on the type of air distribution system installed.

Underfloor supply systems rely on a raised floor with computer room air conditioning (CRAC) located around the perimeter of the room. Conditioned air is supplied to the racks via floor-mounted grilles; the air passes through the racks, then returns to the CRAC units at high level. To minimize interaction of the hot return air with cold supply air, a hot and cold aisle configuration will provide the most effective layout. The hot air should be drawn to CRAC units located in line with the hot aisles with minimal contact with the cold supply air at the rack front.

An in-row or in-rack cooling system provides localized supply and will also benefit from hot and cold aisle configuration. Airflow mixing is less likely because this type of system will supply conditioned air directly to the front of the rack and draw hot air from the rear of the rack. If the system does not use an enclosed rack, the implementation of hot aisle and cold aisle containment will ensure that airflows do not mix.

For data center facilities created within existing office buildings and fit-outs, the cooling system may not be stand-alone but may instead rely on central air handling units, wall-mounted split air conditioners, or even exhaust fans. These non-dedicated cooling systems typically will not offer the same efficiency as a dedicated CRAC system or in-row cooling and will be limited in their potential for improvement.

To optimize this type of cooling system and ensure that conditioned air is delivered to the rack inlet, consider the following:

  • Racks should be arranged into hot and cold aisles.
  • Air distribution units should be placed in line with the cold aisles.

Reduce short circuiting

Improving airflow through the racks and reducing opportunities for “short circuiting” of conditioned air into the hot aisles will enable better control of server temperatures.

  • Provide blanking plates at any empty sections of server cabinets to prevent direct mixing of hot and cold air.
  • Use server racks with a large open area for cold air intake at the front, with a clear path for the hot air to draw through at the rear.
  • Cable penetrations should be positioned to minimize obstruction of supply air passing through the racks. Any penetrations in raised floor systems should be sealed with brushes or pillows.
  • Use cable trays and cable ties to manage cabling so that it does not impede airflow through or around the racks.

Associated equipment

Any pumps or fans for cooling in the data center should be as efficient as possible. Installation of variable speed drives (VSDs) will reduce the power consumed by the electric motors when operating at part load. If large numbers of VSDs are selected, a harmonics analysis is recommended for the site’s power supply.
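
To illustrate why VSDs pay off at part load, the Python sketch below applies the idealized fan affinity relationship (power varies with the cube of speed). The fan rating is hypothetical, and real savings will be smaller once static pressure, motor, and drive losses are included.

    # Idealized fan affinity laws: airflow scales with speed, and power scales
    # with the cube of speed. Real savings are smaller once static pressure,
    # motor, and drive losses are included.

    rated_fan_power_kw = 7.5   # hypothetical CRAC fan motor rating

    for speed_fraction in (1.0, 0.8, 0.6, 0.5):
        power_kw = rated_fan_power_kw * speed_fraction ** 3
        print(f"{speed_fraction:.0%} speed -> {power_kw:.2f} kW "
              f"({power_kw / rated_fan_power_kw:.0%} of rated power)")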

Temperature, humidity control

Without appropriate control, data center equipment performance will suffer whenever room conditions fall outside its tolerance. However, studies have shown that the tolerance of data communication equipment is wider than the ranges proposed for offices and human comfort.

According to ASHRAE, the recommended inlet conditions for IT equipment are:

  • Dry bulb temperature: 64.4 to 80.6 F
  • Dew point: 41.9 to 59 F.

Temperature and humidity sensors need to be placed effectively around the room to actively measure the real conditions and adjust the cooling supply accordingly. Optimal placement for measurement and monitoring points is at the front of the rack, to actively measure the inlet condition.
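
As a minimal illustration, the Python sketch below flags rack-inlet readings that fall outside the envelope quoted above; the sensor values and rack names are hypothetical.

    # Flag rack-inlet sensor readings that fall outside the quoted envelope
    # (dry bulb 64.4-80.6 F, dew point 41.9-59 F). Sensor values are made up.

    DRY_BULB_RANGE_F = (64.4, 80.6)
    DEW_POINT_RANGE_F = (41.9, 59.0)

    readings = [
        {"rack": "A01", "dry_bulb_f": 72.0, "dew_point_f": 50.0},
        {"rack": "A02", "dry_bulb_f": 83.5, "dew_point_f": 52.0},
        {"rack": "B07", "dry_bulb_f": 75.0, "dew_point_f": 39.0},
    ]

    def in_range(value, limits):
        low, high = limits
        return low <= value <= high

    for r in readings:
        ok = (in_range(r["dry_bulb_f"], DRY_BULB_RANGE_F)
              and in_range(r["dew_point_f"], DEW_POINT_RANGE_F))
        status = "OK" if ok else "OUT OF RANGE"
        print(f"Rack {r['rack']}: {status}")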

CRAC units and in-row coolers can be controlled with various sensor locations, setpoints, and strategies. Control strategies for regulation of cooling load and fan speed can be based on air conditions entering or leaving the unit.

  • Generally, supply air control systems will allow higher air temperatures in the room, resulting in improved efficiency for the cooling system.
  • Improvements in CRAC unit fans also reduce energy use: fan speeds can be modulated in response to underfloor static pressure monitoring, lowering power demand at part load (see the sketch after this list).
  • Ensure effective communication from all sensors and equipment with the building management system (BMS) for monitoring and analysis.
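
As an illustration only, the Python sketch below shows a simple proportional adjustment of fan speed toward an underfloor static pressure setpoint. The setpoint, gain, and limits are arbitrary, and a real BMS loop would typically use a tuned PID controller.

    # Illustrative proportional control of CRAC fan speed from underfloor static
    # pressure. Setpoint, gain, and limits are arbitrary.

    SETPOINT_PA = 20.0     # target underfloor static pressure, Pa
    GAIN = 0.02            # fan speed change per Pa of error
    MIN_SPEED, MAX_SPEED = 0.4, 1.0

    def next_fan_speed(current_speed, measured_pa):
        error = SETPOINT_PA - measured_pa          # positive if pressure is low
        speed = current_speed + GAIN * error       # speed up when pressure is low
        return max(MIN_SPEED, min(MAX_SPEED, speed))

    speed = 0.8
    for measured in (18.0, 19.5, 21.0, 24.0):      # simulated pressure readings
        speed = next_fan_speed(speed, measured)
        print(f"measured {measured:.1f} Pa -> fan speed {speed:.2f}")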

5: Optimize monitoring and maintenance

Designing an energy-efficient data center is not the final step in ensuring an efficient system. A data center may have been designed to operate within tight boundaries and conditions, yet during construction, commissioning, and hand-over, inefficiencies in power distribution, heat gains, and cooling provision can easily arise from poor training and ineffective communication of design intent.

Studies of existing facilities have shown that many data centers have faults and alarms that facility management is not aware of. Effective monitoring allows operation of the system to be optimized, providing ideal room conditions and IT equipment utilization.

Rack-centric monitoring

Traditionally, control of the data center’s heat loads (cooling requirement) and power supply has been largely centralized, with the BMS connecting the components (Figure 12).

This type of system allows an easy response to any center-wide faults and alarms but limits the control and management of individual equipment. To improve the efficiency of the data center, management of temperature and power distribution should move toward a rack-centric approach, with sensors and meters at each rack, maximizing the operator’s ability to analyze how the system is performing at a micro scale.
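
As a minimal sketch of what rack-centric monitoring enables, the Python example below aggregates hypothetical per-rack power and inlet temperature readings and flags racks that need attention; the thresholds and field names are illustrative, not part of any particular BMS.

    # Minimal sketch of rack-centric monitoring: collect per-rack power and inlet
    # temperature, then surface the racks that need attention. Thresholds and
    # field names are hypothetical.

    rack_telemetry = [
        {"rack": "A01", "power_kw": 2.8, "inlet_temp_f": 73.0},
        {"rack": "A02", "power_kw": 4.1, "inlet_temp_f": 82.5},
        {"rack": "B07", "power_kw": 0.4, "inlet_temp_f": 68.0},
    ]

    MAX_INLET_F = 80.6      # upper end of the recommended dry bulb range
    MIN_POWER_KW = 0.5      # below this, the rack may be a consolidation candidate

    total_it_power = sum(r["power_kw"] for r in rack_telemetry)
    print(f"Total IT load: {total_it_power:.1f} kW")

    for r in rack_telemetry:
        if r["inlet_temp_f"] > MAX_INLET_F:
            print(f"Rack {r['rack']}: inlet {r['inlet_temp_f']} F exceeds limit")
        if r["power_kw"] < MIN_POWER_KW:
            print(f"Rack {r['rack']}: low load, candidate for consolidation")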

Systematic checking and maintenance

The power usage, heat loads, and IT equipment should be reviewed on a regular basis to ensure the data center is operating as designed. The prevention of future failures in either the power or cooling systems will save the business large amounts of time and money.

6: Implement

This article has identified a range of performance measures and improvements to lead toward greater system efficiency.

The level of improvement that an existing facility can achieve depends on a number of factors, including available funding, the total space available, expectations of future growth, and the age of the existing system. The assessment and improvement process should therefore consider which level of improvement to pursue. Table 2 highlights the options for each aspect of assessment and ranks each based on the ease and cost of implementation (low to high).

Hallett is a mechanical engineer with Arup. He has been involved in the design and review of a number of significant data center projects, with energy and environmental footprint reduction a major part of the design process. Hallett’s experience includes design of greenfield sites and refurbishments, using modeling and simulation to optimize energy use and cooling applications.


References

ASHRAE. 2007. Ventilation for Acceptable Indoor Air Quality. ANSI/ASHRAE Standard 62.1-2007.

ASHRAE. 2009. Thermal Guidelines for Data Processing Environments, Second Edition. ASHRAE Technical Committee 9.9.

ASHRAE. 2010. Energy Standard for Buildings Except Low-Rise Residential Buildings. ANSI/ASHRAE/IES Standard 90.1-2010.

Dunlap, K. 2006. Cooling Audit for Identifying Potential Cooling Problems in Data Centers. APC White Paper #40.

Ebbers, M., A. Galea, M. Schaefer, and M.T.D. Khiem. 2008. The Green Data Center: Steps for the Journey. IBM Redpaper.

Emerson Network Power. 2009. Energy Logic: Reducing Data Center Energy Consumption by Creating Savings That Cascade Across Systems. White Paper.

Green Grid. 2008. Green Grid Data Center Efficiency Metrics: PUE and DCiE. White Paper #6.