Scalability has always been a core part of the mindset of technology leaders. What’s changed in recent times is the way scalability is achieved.
When organisations ran their own enterprise data centres, the technology needs of a particular project were designed and specified to handle peak demand, or some multiplier of it. That often led to applications running on hardware that was far larger and more expensive than it needed to be from day one. The extra capacity acted as a kind of ‘insurance’ that the system could scale to meet sudden spikes in demand, if and when they happened.
Cloud, and specifically the ‘pay for what you use’ type model, altered that. It theoretically allowed organisations to buy enough compute capacity for today, safe in the knowledge they could tap into more capacity immediately, should they require it. However, some organisations still overprovisioned their cloud servers and systems, with idle capacity leading organisations and teams to overspend on cloud resources.
While not every team has gotten the sizing of their cloud environment right from day one, their actions show that scalability continues to be a front-of-mind challenge.
No organisation wants an unforeseen event - such as a change in consumer habits caused by an influencer recommendation - to send demand for a digital system, such as their web store, skyrocketing, only for the underlying infrastructure to fail to keep up.
The answer to creating scalability isn’t in hosting workloads on cloud alone. Instead, the whole application, from the way it is architected and hosted to the way it is monitored and managed, has to be seamless, digital-first and cloud-native.
One of the ways that teams have sought to address this is by architecting for ‘hyperscale’. Hyperscale, in its simplest form, refers to an architecture’s ability to scale appropriately as demand on the system increases.
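Scaling in proportion to demand is often implemented as target tracking: grow or shrink the number of replicas so that a per-replica metric converges on a target. As an illustration, the formula below is the one Kubernetes’ Horizontal Pod Autoscaler documents, sketched in Python (the metric names and values are hypothetical):

```python
import math

def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Target-tracking scale-out: size the replica count so the per-replica
    metric (e.g. average CPU utilisation) converges on the target value.
    Mirrors the HPA formula: desired = ceil(current * metric / target)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    return max(1, math.ceil(current_replicas * current_metric / target_metric))

# Demand doubles: average CPU hits 90% against a 45% target, so replicas double.
print(desired_replicas(4, 90.0, 45.0))  # 8
```

The same formula scales back in when demand falls, which is what makes ‘pay for what you use’ work in practice.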
Organisations can tap hyperscale capabilities from cloud providers to uniformly scale out applications, customise environments to match their exact requirements and exercise a high degree of control over every element and policy of the computing experience.
However, at this scale, systems and operations get naturally more complex.
Multicloud and cloud-native architectures are critical to helping organisations achieve their digital transformation goals. While organisations benefit from the flexibility and scale that these technologies bring, the explosion of observability and security data they produce is increasingly hard to manage and analyse.
Every click, swipe, or tap from a user, every cloud instance that spins up, and every attempted cyber attack generates more data. Operations, development, and security teams must use this data to identify the necessary insights to optimise services and resolve problems effectively.
This is currently where the (hyper)scale ambitions of many teams and organisations become unstuck.
A survey of Australian CIOs and IT leaders late last year found that in more than 78% of cases, cloud-native technology stacks produce more data than a human can parse or manage.
These stacks run technology environments and applications that are increasingly dynamic, adjusting to internet-based conditions, customer demand and other variables on-the-fly. Over three-quarters (77%) of CIOs globally say their IT environment changes once every minute or less.
Due to the increasing complexity of these dynamic environments, log monitoring and log analytics are essential in a hyperscale scenario.
But not all teams are coping: in the survey, 51% of Australian CIOs said it was too costly to manage the large volume of observability and security data using existing analytics solutions, “so they only keep what is most critical”. On average, that means only about 10% of observability data is captured for querying and analytics. What’s in the other 90% could still be a cause for concern, it’s just not immediately recognisable as such.
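Keeping ‘what is most critical’ usually means severity-based filtering plus sampling of routine records. A minimal, hypothetical sketch: errors and warnings are always retained, while everything else is deterministically sampled at 10% by hashing the trace ID, so all records belonging to a sampled trace survive together:

```python
import hashlib

KEEP_ALWAYS = {"ERROR", "WARN"}
SAMPLE_RATE = 0.10  # keep roughly 10% of routine records

def should_keep(severity: str, trace_id: str) -> bool:
    """Retain all high-severity records; deterministically sample the rest
    by hashing the trace ID so a given trace is kept or dropped as a unit."""
    if severity in KEEP_ALWAYS:
        return True
    bucket = int(hashlib.sha256(trace_id.encode()).hexdigest(), 16) % 100
    return bucket < SAMPLE_RATE * 100

print(should_keep("ERROR", "abc123"))  # True
```

The trade-off is exactly the one the survey describes: whatever falls outside the retained severities and the sample is invisible to later analysis.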
Given the volume of log data that hyperscale environments generate, it’s important to find ways to automate the influx of data and centralise logs, metrics, and trace data. This is particularly the case for environments that make use of ephemeral cloud services such as serverless functions or ‘spot’ instances; data needs to be captured before the instance is torn down.
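Capturing data before teardown typically means flushing buffered telemetry when the platform signals impending shutdown. Spot instances and serverless runtimes generally deliver SIGTERM shortly before an instance dies, so one common pattern (sketched here with a placeholder collector endpoint) is to register a flush handler for both the signal and normal interpreter exit:

```python
import atexit
import signal
import sys

log_buffer: list[str] = []

def ship_logs() -> None:
    """Flush buffered records to a central collector before the instance dies.
    (The endpoint is a placeholder; a real deployment would POST to a log
    pipeline, e.g. an OTLP or syslog ingest endpoint.)"""
    if log_buffer:
        # requests.post("https://collector.example.com/ingest", json=log_buffer)
        log_buffer.clear()

def handle_sigterm(signum, frame) -> None:
    ship_logs()
    sys.exit(0)

# Cover both the platform's shutdown signal and clean interpreter exit.
signal.signal(signal.SIGTERM, handle_sigterm)
atexit.register(ship_logs)

log_buffer.append("request handled in 12ms")
ship_logs()
print(len(log_buffer))  # 0 after flush
```

In practice most teams delegate this to an agent or sidecar collector rather than application code, but the principle - flush before teardown - is the same.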
Automation can handle the scale of every component in an enterprise ecosystem, as well as the interdependencies. This alleviates manual tasks and shifts the focus to driving substantial business results.
Leading teams and organisations are moving in this direction: 91% of CIOs identified automation, and specifically AIOps, as increasingly vital to helping teams oversee the complexity of modern cloud and development environments.
So the solution to achieving scalability - and hyperscalability - is three-pronged.
It’s having infrastructure with the capacity to scale, applications architected so they can scale (and take advantage of infrastructure-scaling features), and finally automatic and intelligent observability at scale, so the environment can be understood and kept in its optimal state.