
Companies spread AI across multiple models amid failures

Wed, 22nd Apr 2026

Datadog research shows organisations are increasingly using multiple AI models in production, with 69% of companies now running three or more.

OpenAI remains the most widely used provider at 63%, while use of Google Gemini and Anthropic Claude rose by 20 and 23 percentage points respectively over the past year.

Based on anonymised telemetry from thousands of customers running large language models in production, the findings suggest a shift away from reliance on a single provider as companies expand AI services and agent-based workflows. Broader model portfolios are becoming more common as teams handle a wider range of workloads and system requirements.

At the same time, the report highlights growing operational strain. Around 5% of AI model requests fail in production, and nearly 60% of those failures are linked to capacity limits. The figures suggest that infrastructure and workflow management are becoming central issues as businesses deploy AI services more widely.
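
The report stops short of prescribing fixes, but the standard mitigation for capacity-limit failures is to retry with exponential backoff and, if a provider stays saturated, fail over to an alternative model. A minimal Python sketch of that pattern; the provider callables and CapacityError are hypothetical stand-ins for whatever a real SDK raises on a 429-style response:

    import random
    import time

    class CapacityError(Exception):
        """Raised when a provider signals a capacity or rate limit (e.g. HTTP 429)."""

    def call_with_fallback(prompt, providers, max_retries=3, base_delay=1.0):
        # `providers` is an ordered list of callables wrapping real model SDKs;
        # each takes a prompt and raises CapacityError when saturated.
        for provider in providers:
            for attempt in range(max_retries):
                try:
                    return provider(prompt)
                except CapacityError:
                    # Exponential backoff with jitter before retrying this provider.
                    time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.5))
            # Provider stayed saturated across retries; fall over to the next one.
        raise RuntimeError("all providers exhausted their capacity retries")

Routing across providers in this way is exactly what broadens a model portfolio: the fallback order becomes a policy decision rather than an accident of whichever SDK a team adopted first.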

Model Mix

The move to multiple models has coincided with a sharp rise in the use of agent frameworks. Adoption doubled year on year, helping teams build AI systems more quickly but also adding more moving parts to production environments.

Usage intensity is also rising. The number of tokens sent to AI models per request more than doubled for the median team and quadrupled for heavy users. That trend increases compute demand and may raise the risk of delays or failed requests when systems are under pressure.
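
For teams that want to track this metric themselves, per-request token counts can be computed before a call is sent. A minimal sketch using the open-source tiktoken tokenizer; the encoding name is an assumption and should be matched to the model actually being called:

    import tiktoken

    # cl100k_base is the encoding used by several recent OpenAI models;
    # pick the encoding that corresponds to your target model.
    enc = tiktoken.get_encoding("cl100k_base")

    def tokens_in_request(prompt: str) -> int:
        """Count the tokens a prompt will consume before sending it."""
        return len(enc.encode(prompt))

    print(tokens_in_request("Summarise the incident report for the on-call channel."))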

Yanbing Li, Chief Product Officer at Datadog, drew a parallel with an earlier shift in enterprise computing.

"AI is starting to look a lot like the early days of cloud," Li said. "The cloud made systems programmable but much more complex to manage. AI is now doing the same thing to the application layer. The companies that win won't just build better models - they'll build operational control around them. In this new era, AI observability becomes as essential as cloud observability was a decade ago."

Operational Strain

Datadog argues that the main obstacle to scaling AI is increasingly operational rather than a matter of model quality. As organisations push deployments into production, failures are more often being driven by system design issues, including fragmented workflows, repeated retries and inefficient routing between models.

That assessment reflects a broader industry debate over how to manage AI systems once initial pilots move into customer-facing products and internal business tools. While attention has often centred on model performance, the report argues that practical concerns such as throughput, reliability and monitoring are carrying greater weight.

Guillermo Rauch, Chief Executive Officer at Vercel, said those concerns are becoming more visible as agent-based systems spread.

"The next wave of agent failures won't be about what agents can't do but what teams can't observe," Rauch said. "We built agentic infrastructure at Vercel because agents need the same production feedback loops as great software. Unlike traditional software, agents have control flow driven by the LLM itself, making observability not just useful, but essential."

Scaling Questions

The findings come as both start-ups and large companies face pressure to release AI products quickly. Datadog argues that faster deployment without matching oversight can expose businesses to reliability and governance problems, especially when systems depend on several models and connected agent workflows.

The analysis covered customers across industries and geographies using anonymised production usage data. That gives the report a view into live deployments rather than pilots or survey responses, offering a snapshot of how companies are managing AI systems once they are integrated into routine operations.

Li said the challenge now is gaining visibility across the stack as AI systems grow more complex.

"Innovation alone isn't enough," Li said. "To scale AI with confidence, organizations need real-time visibility across the entire stack - from GPU utilization to model behavior to agent workflows. Visibility and operational control are what allow teams to move fast without sacrificing reliability or governance. At scale, how you operate AI may matter more than the models you choose."