
Why let data centre maintenance keep you up at night?

01 Jun 18

Article written by HPE South-Pacific vice president and general manager for hybrid IT Raj Thakur

Uptime in a data centre is essential to business success, and ensuring uninterrupted service requires constant vigilance and maintenance. This need for upkeep, and the reliance on infrastructure behind it, looks set to increase as organisations deploy more business-critical applications.

While there is continuous innovation to introduce new infrastructure management tools, many still fall short of achieving the enhanced automation and lowered maintenance requirements that the industry covets. As a result, many IT professionals are still wasting days and nights – possibly even missing important birthdays and anniversaries – to deal with issues that require manual tuning.

A major pain point that repeatedly surfaces in conversations with customers is that maintenance cycles still require human intervention. Maintenance is also a large drain on operating budgets, with data centre operators spending a huge proportion of their budgets on keeping the lights on.

This raises the question of why maintenance is still keeping operators up at night despite the constant introduction of new tools to deal with the problem. What are we really missing?

The shortfalls of traditional infrastructure tools

Truly removing the burden of managing infrastructure requires the foresight to predict problems before they occur, along with deep insight into underlying workloads and resources for better infrastructure optimisation.

Lose no more sleep over data centre maintenance. Consider these four factors to determine whether your tools are falling short in overcoming frustrating maintenance problems:

They don’t learn from others

Analytics that simply report on local system metrics tend to offer limited value. Instead, what you should look for in a tool is the ability to learn from the behaviour of thousands of peer systems, which aids the detection and diagnosis of developing issues. If two minds are better than one, a thousand are better still.

A holistic approach to data collection and analysis can pool observations from an immense variety of workloads. This allows rare events identified at one site to be pre-emptively avoided at another, and more common events to be detected more quickly and with greater accuracy.
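As a rough illustration of learning from peers rather than from local metrics alone, the sketch below compares one metric across a fleet and flags systems that sit far from the peer baseline. The function name `peer_outliers`, the sample latency figures and the 3.5 threshold are all hypothetical choices for this example, and the modified z-score is just one simple technique; it is not a description of how any particular product works.

```python
from statistics import median


def peer_outliers(metric_by_system, threshold=3.5):
    """Return the systems whose metric deviates strongly from their peers.

    Uses a modified z-score built on the median and the median absolute
    deviation (MAD), so a few extreme systems do not distort the baseline.
    All names and thresholds here are illustrative assumptions.
    """
    values = list(metric_by_system.values())
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        # At least half the fleet reports the same value, so the score
        # is undefined; report nothing rather than divide by zero.
        return []
    flagged = []
    for name, value in metric_by_system.items():
        score = 0.6745 * (value - centre) / mad
        if abs(score) > threshold:
            flagged.append(name)
    return flagged


# Hypothetical read latencies (ms) reported by five peer systems.
fleet = {"site-a": 2.0, "site-b": 2.1, "site-c": 1.9, "site-d": 2.2, "site-e": 9.0}
print(peer_outliers(fleet))
```

A system that looks acceptable in isolation can still stand out once it is compared with thousands of similar deployments, which is the point of pooling observations.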

They fail to see the whole picture

Traditional tools often provide analytics only in a siloed fashion, reporting system status per device, which is just one part of the overall story. Because problems that disrupt applications can pop up anywhere in the infrastructure stack, it is important to be able to conduct cross-stack analytics across multiple layers to get the bigger picture. This requires visibility into every crucial component: applications, compute, virtualisation, databases, networks and storage.

They don’t know enough

Predictive modelling requires deep domain experience – understanding all the operating, environmental, and telemetry parameters within each system in the infrastructure stack. General-purpose analytics can only go so deep. However, pairing domain experts with AI can enable machine-learning algorithms to identify causation from historical events, and in turn, predict the most complex and damaging problems.

They can’t act without you

Perhaps the biggest drawback of traditional tools is their inability to act. In the ideal state of autonomous operations, the data centre would be self-managing, self-healing and self-optimising. In essence, the tools should be able to avoid a problem or improve the environment without intervention from an administrator. Achieving this level of automation requires a proven history of automated recommendations that builds the necessary level of trust and confidence.

The future of data centre maintenance

To overcome the limitations of traditional tools, convincingly reduce maintenance requirements and better automate a data centre, one would have to embrace a new generation of AI solutions. This means leveraging tools that are able to observe, learn, predict, recommend and, ultimately, automate.

Through observation, AI will be able to develop a steady-state understanding of ideal operating environments for various workloads and applications. Deep system telemetry coupled with global connectivity allows for rapid cloud-enabled machine learning, resulting in AI tools being able to quickly predict problems through pattern-matching algorithms. Application performance can even be modelled and tuned for new infrastructure based on historical configurations and workload patterns.

Based on these predictive analytics, AI solutions can determine the appropriate responses required to improve the data centre environment. The pressure is then taken off IT teams, who no longer have to work through the night to find the source of a problem when managing infrastructure. More importantly, once the AI has proven to be effective, its recommendations can be applied automatically without the intervention of IT administrators. That, to me, is the holy grail of automation.
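The step from recommending to automating can be pictured as a simple trust gate: recommendations are applied by hand until their track record is long enough and good enough, and only then are they applied automatically. The sketch below is a minimal illustration under assumed numbers; the class name `RecommendationRecord` and the thresholds of 20 outcomes and a 95 per cent success rate are hypothetical, not figures from any vendor.

```python
from dataclasses import dataclass


@dataclass
class RecommendationRecord:
    """Track record for one type of automated recommendation (illustrative)."""

    helped: int = 0      # applied by an administrator and improved things
    not_helped: int = 0  # declined, rolled back, or made no difference

    def record(self, outcome_was_good: bool) -> None:
        """Log the outcome of one manually reviewed recommendation."""
        if outcome_was_good:
            self.helped += 1
        else:
            self.not_helped += 1

    def can_auto_apply(self, min_samples: int = 20,
                       min_success_rate: float = 0.95) -> bool:
        """Allow automation only after a sufficiently long, good history."""
        total = self.helped + self.not_helped
        if total < min_samples:
            return False  # not enough evidence yet; keep a human in the loop
        return self.helped / total >= min_success_rate


history = RecommendationRecord()
for _ in range(20):
    history.record(True)
print(history.can_auto_apply())
```

The same gate also works in reverse: if later outcomes pull the success rate below the threshold, the recommendation type drops back to manual review.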

Furthermore, with technological advancements set to invigorate all sectors of the Asia Pacific economy, the highly diverse region is expected to experience a talent shortage of 2 million IT professionals by 2030. I’m certainly looking forward to the not-so-distant future where automation will be the next frontier in data centre management and, of course, to getting a good night’s rest.
