IT Brief Australia - Technology news for CIOs & IT decision-makers
AI success—and what data storage has to do with it

Thu, 22nd Aug 2024

We are living in an era where the ubiquity of the cloud and the emergence of AI use cases have driven up the value of massive data sets.
  
For many cloud service providers, the recent emphasis has been on building high-performance compute infrastructure and then saturating it with data in a race to train the most intelligent and valuable language models possible. However, as these models are deployed and commercialised at scale, the emphasis will rebalance and demand for storage infrastructure – the backbone of AI innovation – will surge.
 
Data that is not merely consumed but also generated by AI will grow at an unprecedented rate as new media-rich use cases and experiences are invented. Identifying the right storage technologies to process and retain that data efficiently and sustainably will become more important than ever to data centre operators.

In recent times, much has been made of the relationship between hard drive and flash technology.

Some more extreme perspectives have suggested that hard drives will soon be a thing of the past and will be entirely replaced by flash-only technology in the data centre.

Predictions like this have persisted for well over a decade but have not aged well.

The reality is that hard drives continue to store the majority of data centre exabytes (EB) today. Flash is a critical but altogether different technology and complements the role of hard drives in cloud and AI use-cases. 

Let's examine some of the common misconceptions that seem to fuel the debate.

Myth: "SSD prices are expected to soon be on par with hard drive prices."

Without question, flash storage is well suited to applications that require high performance and speed. Flash demand, including demand for all-flash arrays (AFAs), is growing, but not at the expense of hard drives.

One reason is that hard drives offer a firm cost-per-terabyte (TB) advantage over SSDs, which makes them an indispensable component of storage infrastructure for data centre operators. The price-per-TB difference between enterprise SSDs and enterprise hard drives is projected to remain at or above 6 to 1 long into the future.

This price-per-TB differential is particularly evident in the data centre, where device acquisition costs dominate total cost of ownership (TCO) models. Despite the performance advantages of SSDs, hard drives will remain the primary choice for storing data due to their reliability, cost-effectiveness and wide adoption.

Myth: "SSDs can replace all hard drive capacity."

The reality is that replacing hard drives entirely with NAND-based SSDs would require untenable CapEx investment. Transitioning from hard drives to NAND isn't solely about producing more units. Even if it were possible from an architectural perspective, the idea that the NAND industry could or would rapidly increase its supply to replace all hard drive capacity is not just optimistic; attempting it would likely lead to financial ruin.

The Q4 2023 NAND Market Monitor report from industry analyst Yole Intelligence shows that from 2015 to 2023, the entire NAND industry shipped 3.1 zettabytes (ZB) while having to invest a staggering $208 billion in CapEx, approximately 47% of its combined revenue.

In contrast, the hard drive industry addresses almost 90% of large-scale data centre storage needs in a highly capital-efficient manner, with CapEx of about 5% of revenue. To replace that capacity, the NAND industry would face significant financial and logistical challenges, requiring investment in a market that is unprepared for such a dramatic shift in data centre architecture.
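
To put those figures side by side, the arithmetic can be sketched as follows. This is a rough illustration using only the numbers quoted above; the function names are made up for this example, decimal units (1 ZB = 10^9 TB) are assumed, and the implied revenue is a back-of-the-envelope derivation, not a figure from the Yole report.

```python
# Figures quoted in the article (Yole Intelligence, Q4 2023 NAND Market Monitor).
NAND_CAPEX_USD = 208e9    # NAND industry CapEx, 2015 to 2023
NAND_SHIPPED_ZB = 3.1     # zettabytes shipped by the NAND industry, 2015 to 2023
NAND_CAPEX_SHARE = 0.47   # NAND CapEx as a share of revenue (approximate)
HDD_CAPEX_SHARE = 0.05    # hard drive CapEx as a share of revenue (approximate)

TB_PER_ZB = 1e9           # assumption: decimal units, 1 ZB = 10^9 TB


def capex_per_tb(capex_usd: float, shipped_zb: float) -> float:
    """CapEx dollars invested per terabyte shipped."""
    return capex_usd / (shipped_zb * TB_PER_ZB)


def implied_revenue(capex_usd: float, capex_share: float) -> float:
    """Revenue implied by a CapEx total and its share of revenue."""
    return capex_usd / capex_share


def capital_intensity_ratio(share_a: float, share_b: float) -> float:
    """How many times more of each revenue dollar A reinvests than B."""
    return share_a / share_b


if __name__ == "__main__":
    per_tb = capex_per_tb(NAND_CAPEX_USD, NAND_SHIPPED_ZB)
    revenue = implied_revenue(NAND_CAPEX_USD, NAND_CAPEX_SHARE)
    ratio = capital_intensity_ratio(NAND_CAPEX_SHARE, HDD_CAPEX_SHARE)
    print(f"NAND CapEx per TB shipped: ${per_tb:.2f}")
    print(f"Implied NAND revenue:      ${revenue / 1e9:.0f} billion")
    print(f"NAND vs HDD CapEx share:   {ratio:.1f}x")
```

On these inputs, the NAND industry invested roughly $67 of CapEx for every terabyte it shipped, and reinvested a share of revenue more than nine times larger than the hard drive industry's.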

Myth: "Only all-flash arrays can meet the performance requirements of modern enterprise workloads."

All-flash vendors advise enterprises to "simplify" and "future-proof" their storage by going all-in on flash for high performance. They argue that otherwise, enterprises risk falling behind in meeting the performance demands of modern workloads. 

This zero-sum logic fails for three reasons.

Firstly, most of the world's data resides in the cloud and large data centres. Here, only a small percentage of workloads requires a large share of the available performance. This is why, according to IDC, hard drives have accounted for almost 90% of the storage install base in this segment over the last five years.* The idea that a single-tier storage architecture is "simpler" than adopting a mix of media types in a tiered architecture is a solution in search of a problem. Enterprise storage architecture should mix media types to optimise for the cost, capacity and performance needs of each workload.

Secondly, TCO considerations are key to most data centre infrastructure decisions. Optimal TCO is achieved by aligning the most cost-effective media (hard drive, flash or tape) to the workload requirements. Hard drives and hybrid arrays (comprising hard drives and SSDs) are a great fit for most enterprise and cloud storage and application use cases.

Of course, one might choose to use SSDs or AFAs for workloads that are typically suited for hard drives. However, this approach becomes increasingly less economical as capacity scales.
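
A minimal sketch of that scaling effect, assuming the 6-to-1 price-per-TB premium cited earlier. The hard drive price of $15 per TB and the 10% flash tier are hypothetical placeholders chosen only to make the arithmetic concrete, and the model covers device acquisition cost alone, not power, cooling or other TCO inputs.

```python
SSD_PREMIUM = 6.0  # price-per-TB premium of enterprise SSDs over hard drives, per the article


def acquisition_cost(capacity_tb: float, flash_fraction: float,
                     hdd_price_per_tb: float,
                     ssd_premium: float = SSD_PREMIUM) -> float:
    """Device acquisition cost of a storage pool split between flash and hard drives."""
    if not 0.0 <= flash_fraction <= 1.0:
        raise ValueError("flash_fraction must be between 0 and 1")
    flash_tb = capacity_tb * flash_fraction
    hdd_tb = capacity_tb - flash_tb
    ssd_price_per_tb = hdd_price_per_tb * ssd_premium
    return hdd_tb * hdd_price_per_tb + flash_tb * ssd_price_per_tb


if __name__ == "__main__":
    HYPOTHETICAL_HDD_PRICE = 15.0   # USD per TB; placeholder, not a quoted price
    HYPOTHETICAL_FLASH_TIER = 0.10  # share of capacity kept on flash in the hybrid design

    for capacity_tb in (1_000, 10_000, 100_000):
        hybrid = acquisition_cost(capacity_tb, HYPOTHETICAL_FLASH_TIER, HYPOTHETICAL_HDD_PRICE)
        all_flash = acquisition_cost(capacity_tb, 1.0, HYPOTHETICAL_HDD_PRICE)
        print(f"{capacity_tb:>7} TB  hybrid ${hybrid:>12,.0f}  "
              f"all-flash ${all_flash:>12,.0f}  gap ${all_flash - hybrid:>12,.0f}")
```

Under these assumptions the all-flash design costs four times as much as the hybrid one at every size, so the absolute gap grows in step with capacity.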

Additionally, while technologies like triple-level cell (TLC) and quad-level cell (QLC) flash allow SSDs to handle data-heavy workloads as hard drives do, hard drives, with their increasing areal density, offer a far more cost-effective solution.

Thirdly, many hybrid storage systems employ a well-proven and finely tuned software-defined architecture that seamlessly integrates diverse media types and harnesses their strengths in a single unit. In scale-out private or public cloud data centre architectures, file systems or software-defined storage is used to manage data storage workloads across data centre locations and regions. While AFAs and SSDs are a great fit for high-performance, read-intensive workloads, assuming they suit all use cases can be misleading. Especially in large deployments, AFAs often prove to be unnecessarily costly compared to hard drives, which offer a much lower TCO.

And so, to the future

Data from IDC and TRENDFOCUS predicts a nearly 250% increase in hard drive exabytes by 2028, with this trend expected to continue into the next decade.
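
For a sense of scale, a percentage increase can be converted into a growth multiple and an implied annual rate. The sketch below reads "a 250% increase" literally as 3.5 times the starting volume; the article does not state the base year, so the five-year horizon used here is an assumption, and a different horizon gives a different annual rate.

```python
def growth_multiple(percent_increase: float) -> float:
    """Convert a percentage increase into an end-to-start multiple."""
    return 1.0 + percent_increase / 100.0


def implied_cagr(percent_increase: float, years: float) -> float:
    """Compound annual growth rate implied by a total increase over `years`."""
    if years <= 0:
        raise ValueError("years must be positive")
    return growth_multiple(percent_increase) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    INCREASE_PERCENT = 250.0  # figure quoted in the article
    ASSUMED_YEARS = 5         # assumption: the article does not give the base year

    print(f"Growth multiple: {growth_multiple(INCREASE_PERCENT):.1f}x")
    print(f"Implied annual growth over {ASSUMED_YEARS} years: "
          f"{implied_cagr(INCREASE_PERCENT, ASSUMED_YEARS):.1%}")
```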

The truth is that hard drives and flash have always complemented each other in the data centre and will continue to co-exist. In fact, in the age of AI, compute clusters closely coupled with flash technology indirectly drive the increased need for hard drive storage, as masses of generated content need cost-effective data lakes to land on. Both storage media are critical to the ecosystem, with hard drives continuing to lead in exabytes stored.

*IDC, Multi-Client Study, Cloud Infrastructure Index 2023: Compute and Storage Consumption by 100 Service Providers, November 2023.
