Datadog's $1bn investment in unifying data silos
Fri, 20th Mar 2026
"In today's incident response world, it's sheer chaos," Datadog's CPO told its Sydney Summit, where the company unveiled a new Gen-AI tool to automate incident detection, diagnostics and remediation.
"We're in the business of breaking down silos and solving complexity," Chief Product Officer Yanbing Li said at the Datadog Summit on 17 March.
Amidst the exponential growth in cloud spend, she suggested that AI is enhancing the capabilities of software businesses rather than threatening their viability.
"Many of you are probably worried about the AI disruption to Saas businesses and what's going to happen to them. The interesting truth is, all today's modern AI companies actually use Datadog as a way for them to observe their system end-to-end and to power their exponential trajectory… they teach us how to stay at the frontier of the technology," Li said.
$1 billion investment
Datadog said its R&D spend surpassed US$1 billion in 2025 alone, with the company continuing to reinvest 30 per cent of revenue into R&D.
Li noted this investment is being directed into five priority areas: AI-led productivity gains, infrastructure optimisation, connected front and backend visibility, security, and support for developers.
Part of this investment went into building its Bits AI SRE, the company's generative AI-powered engineer designed to streamline incident detection and remediation.
"When an incident happens, you gather a large number of people. Most of the people are still trying to piece together basic information; most people only have partial knowledge of the system or the services or the infrastructure or the business process. It's really difficult to bring that altogether and come to a quick resolution," Li said.
"What's interesting also is that most of the people attending those war rooms are far more motivated by proving it's not their problem versus actually finding the problem."
In a bid to address this bottleneck, Datadog has integrated Bits AI SRE across its entire platform. The tool can autonomously flag potential incidents, investigate the root cause and propose relevant code fixes, then use the learnings from each investigation to improve future accuracy.
Datadog is also moving from static thresholds to anomaly detection, where the LLM adapts to seasonal patterns and issues alerts only when abnormalities arise, with the aim of reducing 'alert fatigue' among staff.
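To illustrate the difference between a static threshold and seasonality-aware anomaly detection, here is a minimal Python sketch. It is not Datadog's implementation, which the company describes as LLM-driven; it swaps in a simple per-hour z-score baseline, and the function names, the threshold of 500 and the sample traffic figures are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def seasonal_baseline(history):
    """Build a per-hour baseline from (hour_of_day, value) samples.

    Returns {hour: (mean, standard deviation)} for hours with >= 2 samples.
    """
    buckets = defaultdict(list)
    for hour, value in history:
        buckets[hour].append(value)
    return {h: (mean(v), stdev(v)) for h, v in buckets.items() if len(v) >= 2}


def is_anomalous(baseline, hour, value, z_limit=3.0):
    """Flag a value that deviates strongly from what is normal for that hour."""
    mu, sigma = baseline[hour]
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_limit


def static_alert(value, threshold=500):
    """Classic static rule: alert whenever the value exceeds a fixed number."""
    return value > threshold


# Hypothetical requests-per-minute samples: busy at 09:00, quiet at 03:00.
history = [(9, v) for v in (900, 950, 1000, 1050, 1100)]
history += [(3, v) for v in (40, 45, 50, 55, 60)]
baseline = seasonal_baseline(history)

# A normal morning peak trips the static rule but not the seasonal check.
print(static_alert(1000), is_anomalous(baseline, 9, 1000))
# An unusual overnight spike slips under the static rule but is flagged.
print(static_alert(400), is_anomalous(baseline, 3, 400))
```

In this toy data, the fixed threshold fires on every ordinary morning peak, which is the kind of noise that feeds alert fatigue, while missing an overnight spike that is small in absolute terms but far outside the usual range for that hour.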
Other developments showcased at the Summit included:
- expanded Data Streams Monitoring, in a bid to provide end-to-end visibility into complex, asynchronous pipelines.
- evolution of Cloud Cost Management, to align engineering and finance teams by providing real-time visibility into cloud spend alongside performance metrics.
- deeper integration of Application Security Management (ASM) with code-level vulnerabilities.
'We know before it breaks'
Among the customers sharing their experience of Datadog's platform was Fitness Passport, which offers employers a corporate health and fitness program that lets members attend any of the 2,500 fitness centres across its Australia and New Zealand network.
"The company didn't have a lot of observability," Engineering Director Rob Mitchell told TechDay.
"I've always had this view of 'I've got to visualise things, we've got to know things'. The first thing was just to surface it, to get the visibility, do a deep dive and some analysis."
In the 15 months since migrating to the platform, Fitness Passport has been able to proactively identify and resolve numerous issues, such as a coding glitch around partner sign-ups that had been "creating real headaches" for users and the customer support team alike. However, the ultimate goal is to transition from troubleshooting to prevention.
"We had an incident just yesterday, I was at lunch and got a ping on my phone… people are not able to sign the Ts and Cs," Mitchell recalled, adding that his dev team believed it pertained to a database migration deployment.
"We got back into the office with our lunch, and it was fixed. I pinged our customer support team to let them know we have an issue, and they were surprised. That for me was the 'aha moment': they didn't know about it yet. Any previous incidents we've had, we got told by customer support because customers were complaining. This one didn't work that way."
Silos of tech
Speaking with TechDay on the Summit's sidelines, Datadog's Regional Vice President for ANZ, Roz Gregory, suggested it is the implementation of technology, rather than the technology itself, that is holding back fully automated prevention.
"From a solution delivery perspective, we're already there. It's in the application of that technology in complex organisations, that are often literally silos of tech aligned to sometimes different processes… That level of siloing is more the barrier than the technology," she said.
"Humans still have to take responsibility for all that decision-making at some point, so the monitoring of it is key. But the ability to be able to automate it is there."
According to Gregory, some of the greatest commercial rewards from a culture shift away from data silos will come from unifying security and observability.
"Up until now, observability and security have definitely been considered separate domains. There is a power in bringing security and observability together that hasn't been done before," Gregory said.
"Having a fully integrated, correlated view of all logs, metrics and traces – so all of the mini-events that are happening in your business, from infrastructure all the way to the frontend and data – and combining that with the ability to detect anomalies, is incredibly powerful."