Enterprises have an unprecedented volume of data at their disposal, and that data has become the lifeblood and currency of the digital economy. In turn, artificial intelligence (AI) and machine learning (ML) have matured into powerful tools for processing this data to produce insights, gain competitive advantage, reduce costs, grow the business and improve efficiency. In fact, data, AI/ML and the high-performance computing (HPC) systems companies require are central to making strong business decisions.
Computer hardware and chip manufacturers have kept pace with the need for data-driven competitiveness by improving their hardware to meet the performance requirements of AI/ML, with AI model training demanding ever-greater processing power. Thanks to Moore's Law, modern processors pack more efficiency and computing power into a single unit, so we can run heavier workloads in a smaller footprint. However, Moore's Law is approaching its limits: the power requirements of servers and storage are steadily increasing, and so is the accompanying heat, driving data centre operators and enterprises to explore new cooling strategies to accommodate greater power densities.
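To see why rising power density strains air cooling, consider a rough sketch of the airflow a rack needs at a given power draw. The figures below (air density, specific heat, a 12 K inlet-to-outlet temperature rise) are typical textbook values chosen for illustration, not vendor specifications:

```python
# Illustrative sketch: airflow needed to air-cool a rack at a given power draw.
# Assumed values (air density, specific heat, allowable temperature rise) are
# typical room-temperature figures, not vendor specifications.

AIR_DENSITY = 1.2   # kg/m^3 at ~20 C, sea level
AIR_CP = 1005.0     # J/(kg*K), specific heat of air at constant pressure

def airflow_m3_per_s(rack_power_w: float, delta_t_k: float) -> float:
    """Volumetric airflow required to carry rack_power_w of heat away
    with an inlet-to-outlet temperature rise of delta_t_k."""
    return rack_power_w / (AIR_DENSITY * AIR_CP * delta_t_k)

# A 10 kW rack with a 12 K rise needs roughly 0.7 m^3/s of air;
# a 40 kW AI training rack at the same rise needs four times as much.
print(f"{airflow_m3_per_s(10_000, 12):.2f} m^3/s")  # ~0.69
print(f"{airflow_m3_per_s(40_000, 12):.2f} m^3/s")  # ~2.76 (~5,900 CFM)
```

Moving several cubic metres of air per second through a single rack quickly becomes impractical, which is what pushes operators towards liquid.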
Liquid cooling is a re-emerging technology for supporting high-density data centres. While air cooling has been the dominant approach, companies are exploring liquid cooling thanks to its ability to transfer heat more efficiently than air. In fact, Gartner says, "Liquid conducts more than 3,000 times as much heat as air and requires less energy to do so, allowing increased data centre densities." Here, we're using the term "liquid cooling" rather than "water cooling" since a variety of technical cooling fluids are used, depending on the cooling approach. Generally, our customers and partners have been testing and evaluating three approaches that bring liquid closer to the rack to enable more efficient cooling: augmented air cooling, immersion cooling and direct-to-chip liquid cooling.
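The order of magnitude in the Gartner quote can be sanity-checked from standard material properties. Comparing the volumetric heat capacity of water and air (density times specific heat, using ordinary room-temperature values):

```python
# Back-of-envelope comparison: how much heat a given volume of water vs air
# can absorb per degree of temperature rise. Property values are standard
# room-temperature figures.

# density (kg/m^3) * specific heat (J/(kg*K)) = volumetric heat capacity (J/(m^3*K))
water = 998.0 * 4186.0   # ~4.18 MJ/(m^3*K)
air = 1.2 * 1005.0       # ~1.2 kJ/(m^3*K)

# The ratio is on the order of 3,500 -- the same order of magnitude as the
# "more than 3,000 times" figure quoted above.
print(f"Water absorbs ~{water / air:,.0f}x more heat per unit volume than air")
```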
Augmented air cooling
Standard air-cooling technologies in the data centre already employ some chilled water to function. For example, computer room air handlers (CRAHs) have a chilled water coil inside. The augmented air approach is about bringing that existing technology closer to the rack and closer to the heat source. A rear-door heat exchanger (RDHx) is an increasingly popular way to achieve this: chilled liquid runs through a coil in the rear door of the rack, the coil captures the heat from the equipment, and cool air is delivered back to the data centre. Technically, an RDHx isn't true liquid cooling, because the chip at the server level is still air-cooled, but it does bring liquid closer to the rack to harness greater air-cooling effectiveness.
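Sizing the chilled-water supply for a rear-door coil comes down to the basic heat balance Q = ṁ·c·ΔT. The 30 kW rack load and 10 K water temperature rise below are illustrative assumptions, not figures from any particular RDHx product:

```python
# Hypothetical sizing sketch for a rear-door heat exchanger: how much chilled
# water flow is needed to absorb a rack's heat load? Uses Q = m_dot * c_p * dT.
# The 30 kW load and 10 K water temperature rise are illustrative assumptions.

WATER_CP = 4186.0   # J/(kg*K), specific heat of water

def water_flow_l_per_min(heat_w: float, delta_t_k: float) -> float:
    kg_per_s = heat_w / (WATER_CP * delta_t_k)  # mass flow of water
    return kg_per_s * 60.0                      # ~1 kg of water per litre

# A 30 kW rack with a 10 K water temperature rise needs roughly 43 L/min.
print(f"~{water_flow_l_per_min(30_000, 10):.0f} L/min")
```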
Considerations - As heat output grows at the server and chip level, many companies have sought ways to enhance existing air cooling. Reducing the heat that equipment expels into the data centre enables greater power density, allowing more power-hungry hardware to be packed into a smaller space. For organisations, this is a great first step: it delivers efficiency benefits with relatively simple changes to the environment.
What changes are required? No server-level changes are required when implementing an RDHx; typically, facility water is simply extended to the rack. It's important to work with your facility provider to ensure compatibility.
Immersion cooling
Immersion cooling is an emerging technology with many approaches, and it's exactly what it sounds like: servers are immersed in a large vat of technical cooling fluid. Think of it like a big bathtub. In single-phase immersion cooling, the fluid stays in a liquid state. In two-phase immersion cooling, it changes to gas when it draws heat from the computer chips and then returns to liquid within the cooling loop.
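The two-phase principle can be sketched numerically: heat leaves the chip as latent heat of vaporisation rather than as a temperature rise in the fluid. The latent heat value below (~90 kJ/kg) is a rough figure in the range typical of engineered dielectric fluids, used purely for illustration; check your fluid's datasheet for real numbers:

```python
# Sketch of two-phase immersion cooling: heat is absorbed as latent heat of
# vaporisation. The latent heat below is an assumed, fluid-dependent figure
# chosen for illustration only.

LATENT_HEAT_J_PER_KG = 90_000.0  # assumed; varies by technical cooling fluid

def boil_off_rate_g_per_s(chip_power_w: float) -> float:
    """Mass of fluid vaporised per second to absorb chip_power_w of heat."""
    return chip_power_w / LATENT_HEAT_J_PER_KG * 1000.0

# A hypothetical 700 W accelerator vaporises fluid at roughly 7.8 g/s.
# In steady state the vapour condenses in the cooling loop and returns to
# the bath, so no fluid is consumed.
print(f"~{boil_off_rate_g_per_s(700):.1f} g/s")
```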
Considerations - While immersion cooling can allow organisations to achieve high power densities within the data centre, it requires the most substantial changes to server technology and data centre architecture. Being a radical departure from traditional methods of deploying IT equipment, immersion cooling can often have substantial upfront costs and considerations, so we recommend working closely with your immersion vendor and OEMs if you're contemplating a deployment. Immersion tanks are typically large and heavy, often occupying the equivalent of three cabinet spaces. Depending on the approach, removing servers from the immersion container can be challenging and messy, so this cooling method may not suit all applications, such as those requiring frequent server moves, adds and changes.
What changes are required? Both server and data centre changes are needed:
- There are several considerations for all the moving parts and components within an immersed server. Compatibility of components, plastics and tapes with the immersion liquid is not guaranteed.
- Immersion in liquid can distort the refractive index of optical fibre, while copper connectivity options remain largely unaffected in the current generation of systems. Signal and power integrity in the copper of next-generation hardware will likely require designs customised for immersion.
- Networking hardware such as switches and routers is often kept in a separate, non-immersed environment.
- Some single-phase immersion cooling systems integrate a coolant distribution unit (CDU), essentially a pump that circulates the working fluid and controls its temperature. The CDU connects to the facility water feed, pushing the heat from the tub out into the facility.
- Since servers are removed from immersion tanks vertically, it's recommended that you implement lifting infrastructure to assist with inserting and removing servers from immersion vessels.
- The data centre also needs to manage the fluid and maintain its stability over time, preventing spills, evaporation and precipitation onto equipment.
Direct-to-chip liquid cooling
For direct-to-chip liquid cooling (DLC), a cold plate sits on top of the chip inside the server. The cold plate is enabled with liquid supply and return channels, allowing technical cooling fluid to run through the plate, drawing heat away from the chip. As with immersion cooling, direct-to-chip can be single-phase or two-phase, depending on whether the cooling fluid changes phase during the heat removal process.
Considerations - DLC is a unique approach that involves an interior augmentation of the IT equipment with minimal changes to the server exterior. This allows DLC-enabled servers to be installed in a standard IT cabinet like legacy air-cooled equipment even while being cooled in an innovative way.
Though direct-to-chip fits in a standard footprint, it requires architectural changes and additional equipment to deliver liquid to the cabinet and distribute it to the individual servers, typically more so than with RDHx but less than immersion cooling.
What changes are required? Some server and data centre changes are needed:
- On the server side, a cold plate must be retrofitted in place of the heat sink with piping that runs through the inside of the server and into ports accessible from the outside.
- A CDU is typically implemented to control liquid temperatures and flow pressure to the cold plate.
- In the rack itself, you need a manifold: a distribution unit that delivers cooling fluid to each server in the rack.
- You need additional power strips for the increase in power density. Selecting 415V 3-phase power delivery can ease deployment pains.
- DLC supports high-density racks but benefits from wider (800mm) racks to support the additional components.
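The power-delivery point above can be illustrated with a hypothetical comparison of capacity per rack feed. The 32 A circuit rating and unity power factor are assumptions for the sake of arithmetic; real feeds are derated per local electrical codes:

```python
# Why 415 V three-phase power eases high-density deployment: a hypothetical
# comparison of power available per rack feed. The 32 A rating and unity
# power factor are illustrative assumptions; real circuits are derated per
# local electrical codes.
import math

def three_phase_kw(v_line_to_line: float, amps: float, pf: float = 1.0) -> float:
    return math.sqrt(3) * v_line_to_line * amps * pf / 1000.0

def single_phase_kw(volts: float, amps: float, pf: float = 1.0) -> float:
    return volts * amps * pf / 1000.0

# A single 415 V three-phase feed carries roughly three times the power of a
# 230 V single-phase feed at the same current, so fewer feeds per rack.
print(f"230 V single-phase, 32 A: {single_phase_kw(230, 32):.1f} kW")  # ~7.4
print(f"415 V three-phase, 32 A: {three_phase_kw(415, 32):.1f} kW")    # ~23.0
```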
Innovating in the data centre for high-compute business solutions
There are many reasons why companies choose one or another option for liquid cooling. Often, the servers they're using, the vendors they work with and the needs of their specific workloads drive the decision.
For several years, we've been working with customers and partners on cooling innovations. We found that many started with augmented air cooling, saw success with it, and then advanced to DLC. As more enterprises gravitate to liquid cooling, we continue to innovate, co-create solutions and invest in technologies that optimise efficiency in the data centre. We're actively exploring and testing the supporting data centre infrastructure with our liquid cooling technology vendors. We're excited about the opportunities liquid cooling brings to the data centre, and we'll continue to evolve our facilities to support in-demand business solutions like AI/ML and HPC.