Vultr unveils Cloud Inference for global AI model deployment
Thu, 21st Mar 2024

Cloud computing platform Vultr has announced its latest offering, Vultr Cloud Inference. The new serverless platform gives customers global AI model deployment and AI inference capabilities. The launch promises seamless scalability, reduced latency, and enhanced cost efficiency for users' AI deployments by taking advantage of Vultr's existing infrastructure, the company states.

The rapidly changing digital landscape is pressuring businesses across sectors to deploy and manage AI models effectively. Demand for inference-optimised cloud infrastructure with global reach and scalability has shifted organisational priorities: companies increasingly scrutinise inference expenditure as they move their models into production. This also leaves developers with the complicated task of optimising AI models for different regions while managing distributed server infrastructure and ensuring low latency and high availability.

In response to this challenge, Vultr developed the Cloud Inference platform to speed up the time-to-market of AI-based features such as predictive and real-time decision-making, and to deliver an engaging user experience across regions. Customers can bring a model trained on any platform, in the cloud or on-premises, and deploy it on Vultr's global NVIDIA GPU-powered infrastructure.
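The article does not describe the API itself, but a serverless inference call of this kind typically reduces to a single authenticated HTTP request against a hosted model. The sketch below is illustrative only; the endpoint URL, request path, model name, and payload shape are assumptions, not Vultr's documented interface:

    # Hypothetical sketch of calling a serverless inference endpoint.
    # The URL, path, model name, and payload shape are assumptions for
    # illustration; consult Vultr's documentation for the actual API.
    import requests

    API_KEY = "your-api-key"  # placeholder credential

    response = requests.post(
        "https://inference.example.com/v1/chat/completions",  # hypothetical endpoint
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "my-deployed-model",  # a model brought to the platform
            "messages": [{"role": "user", "content": "Summarise this quarter's sales."}],
        },
        timeout=30,
    )
    response.raise_for_status()
    print(response.json())

Because the platform is serverless, the caller sees only the endpoint; routing requests to nearby GPU capacity and scaling that capacity up or down happen behind it.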

J.J. Kardwell, CEO of Constant, Vultr's parent company, explained: "As an increasing number of AI models move from training into production, the volume of inference workloads is exploding, but the majority of AI infrastructure is not optimised to meet the world's inference needs. The launch of Vultr Cloud Inference enables AI innovations to have maximum impact by simplifying AI deployment and delivering low-latency inference around the world through a platform designed for scalability, efficiency, and global reach."

Among the perks of Vultr's Cloud Inference is its ability to self-optimise and auto-scale globally in real time, ensuring that AI applications deliver consistent, cost-effective, and low-latency experiences to users worldwide. Moreover, its serverless architecture removes the complexity of managing and scaling infrastructure.

With Vultr Cloud Inference, firms gain access to an isolated environment for sensitive or high-demand workloads, which provides heightened security and performance for critical applications, aligning with objectives around data protection and regulatory compliance while maintaining high performance under peak loads.

“Demand is rapidly increasing for cutting-edge AI technologies that can power AI workloads worldwide," commented Matt McGrigg, director of global business development, cloud partners at NVIDIA. "The introduction of Vultr Cloud Inference will empower businesses to seamlessly integrate and deploy AI models trained on NVIDIA GPU infrastructure, helping them scale their AI applications globally."

As AI continues to break barriers and reshape cloud and edge computing, the scale of infrastructure required to train large AI models and support global inference needs has reached unprecedented heights, the company states. Following the launch of Vultr CDN, Vultr Cloud Inference further establishes Vultr as a key technology provider enabling innovation, cost efficiency, and global reach for companies worldwide.