Observability: A new focus for cloud-native businesses

‘Observability’ is the word on everybody’s lips across the enterprise, whether you’re in IT Operations, DevOps, Agile, or Site Reliability Engineering (SRE). Let’s take a closer look at what observability means and how it applies both to web-scale businesses and to traditional enterprises.

What is Observability?

As with many new concepts in IT (such as DevOps), the industrial world was the first to coin the term observability. In this case, observability describes an attribute of systems that are internally instrumented, allowing equipment operators to see inside the otherwise hidden processes of their systems.

For example, if an operator at a water treatment plant can’t gain visibility of the inside of opaque water pipes, they have no way of determining if the water is flowing, which way it’s flowing, or whether the water is dirty or clean – a lack of observability.

What the operator could do is add flow gauges and sensors inside the pipes. These would be connected by telemetry to a dashboard, giving the operator full visibility, or observability, of the status of the water in the pipes.

Observability in Software Applications and Services

As in the industrial world, observability can be applied to software services. When developers write code today, they include measurement and telemetry that make their applications observable.

This allows operations teams to:

  • Detect, contain, and alert sooner on critical incidents and events.
  • Investigate the root causes of problems more efficiently.
  • Fix incidents faster with real-time feedback on remediation efforts.
  • Undertake more accurate post-incident reviews and post-mortems.
  • Better understand the problem history and prevent recurrence.
  • Close feedback loops with requirements for continuous improvement.
  • Use analytics and machine learning to predict and prevent problems.

Observability in the Real World

Observability is becoming the norm for cloud-native businesses, unhindered by decades of success and the ‘legacy’ of systems and applications that come with that success. Large traditional enterprises that do have this history can still implement observability in their existing services:

  • With no code changes – by streaming system-level data directly from infrastructure components (e.g. throughput, utilisation and capacity of servers, storage, virtual machines (VMs), cloud services, containers, etc.)
  • With minimal code changes – by deploying collectd to measure and forward specific infrastructure attributes (e.g. CPU workload, memory usage, I/O rates, or storage utilisation)
  • With some code changes – by deploying statsd to collect and forward metric data from inside your application (e.g. counters and timers for transaction time, round-trip time, etc.)
  • With major code changes – by implementing semantic logging to instrument any application activity, from ‘speeds and feeds’ to business metrics (e.g. revenue, click-through rate (CTR), customer experience, etc.)
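As a rough illustration of the last two approaches in the list above, the sketch below formats counter and timer metrics in the plain-text StatsD line format (`name:value|c` and `name:value|ms`) and emits a semantic log event as structured JSON. The metric names, the event fields, and the local listener address on UDP port 8125 are assumptions made for the example, not details from any particular deployment.

```python
import json
import socket

# Assumed address of a local StatsD-compatible listener (8125 is the usual default port).
STATSD_ADDR = ("127.0.0.1", 8125)


def counter_line(name, value=1):
    """Format a counter increment in the StatsD line format."""
    return f"{name}:{value}|c"


def timer_line(name, millis):
    """Format a timing sample (in milliseconds) in the StatsD line format."""
    return f"{name}:{millis}|ms"


def send_metric(line, addr=STATSD_ADDR):
    """Send one metric line over UDP; fire-and-forget, no reply is expected."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), addr)


def semantic_event(event, **fields):
    """Build a structured log record: a named event plus key/value context."""
    record = {"event": event, **fields}
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical metric and event names, for illustration only.
    send_metric(counter_line("checkout.completed"))
    send_metric(timer_line("checkout.duration", 42))
    print(semantic_event("checkout_completed", revenue=19.99, currency="USD"))
```

The metric lines carry the ‘speeds and feeds’, while the semantic event attaches business context (here a hypothetical revenue figure) that a log platform could later search and aggregate.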

While each of these approaches is valuable in itself, each additional level of effort adds further value. For example, data from legacy data centre infrastructure management (DCIM) or application performance management (APM) tools will help to detect and triage technical problem events and answer IT questions.

Actioning Observability with AIOps

Possessing new data, graphs, KPIs and dashboards alone will not allow your business to succeed. Observability has to be actioned to unlock its true value, whether that means triaging problems and incidents in real time, closing DevOps feedback loops, or proactively preventing problems.

This means collecting observability data and aligning it with other monitoring outputs, processing it with analytics, and using machine learning to begin producing automated responses. Once you have combined monitoring with observability, machine learning, predictive analytics and advanced data integration, you will have what Gartner dubs ‘Artificial Intelligence for IT Operations’ or ‘AIOps.’
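To make the idea of processing observability data with analytics more concrete, here is a deliberately simple sketch that flags unusual metric samples using a rolling z-score. It is a basic statistical stand-in for the machine learning an AIOps platform would apply, not a description of any product; the window size, threshold, and sample latencies are assumptions chosen for the example.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return the indexes of samples that deviate strongly from recent history.

    Each sample is compared with the mean and standard deviation of the
    `window` samples before it; a z-score above `threshold` is flagged.
    """
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history gives no spread to score against; skip this sample.
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    # Hypothetical response times in milliseconds, with one obvious spike.
    latencies = [100, 102, 98, 101, 99, 100, 250, 101]
    print(find_anomalies(latencies))  # prints [6]
```

In practice a flagged index would feed an automated response, such as opening an incident or triggering a remediation script, rather than simply being printed.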

True Business-Technology Alignment

For cloud-based startups delivering web-based services, observability is an exciting new concept in IT. For traditional IT Ops it can still seem difficult to achieve, but it is achievable for any business, even large enterprises. As an addition to traditional monitoring, observability marks a new era in IT Ops and software service delivery, moving businesses towards true business-technology alignment.

By Andi Mann, Chief Technology Advocate, Splunk
