IT Brief Australia - Technology news for CIOs & IT decision-makers

Elastic unveils multilingual embeddings for search

Wed, 25th Feb 2026

Elastic has launched two multilingual text-embedding models for semantic search and related tasks. They are available through the Elastic Inference Service and for self-hosted deployment.

The models, jina-embeddings-v5-text-small and jina-embeddings-v5-text-nano, are positioned as Elasticsearch-native, small models built for search and semantic workloads.

Embeddings sit at the core of many modern search systems, converting text into numerical representations that systems can compare. This approach underpins semantic search, where users can query in natural language and retrieve documents that match meaning rather than exact keywords.
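As a minimal sketch of that comparison step (toy four-dimensional vectors stand in for real model output, which typically has hundreds of dimensions), documents can be ranked by the cosine similarity between their embeddings and a query embedding:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Compare two embedding vectors by the angle between them."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Illustrative vectors, not output from Elastic's models.
query = [0.9, 0.1, 0.3, 0.0]
doc_same_topic = [0.8, 0.2, 0.4, 0.1]
doc_other_topic = [0.1, 0.9, 0.0, 0.7]

# A document about the same topic scores higher even with no shared keywords.
assert cosine_similarity(query, doc_same_topic) > cosine_similarity(query, doc_other_topic)
```

In a real deployment, the vectors would come from an embedding model and the comparison would run inside a vector index rather than in a loop.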

The jina-embeddings-v5-text family includes two models at roughly 0.2B and 0.6B parameters. Elastic lists the nano model at 239M parameters and the small model at 677M parameters.

Elastic reports strong results on the MMTEB benchmark for multilingual text embeddings and says the models outperform larger 7B to 14B parameter models on key search and semantic tasks.

Search workloads

Elastic expects the models to be used for four tasks: retrieval (natural-language queries that return relevant documents), text matching (duplicate detection and similarity across paraphrases or translations), classification (assigning categories, detecting sentiment, or flagging anomalies), and clustering (grouping content by topic or meaning).
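The text-matching and clustering tasks above can be illustrated with a simple greedy pass that groups items whose embedding similarity exceeds a threshold. The vectors and the 0.9 threshold below are toy values for illustration, not Elastic's implementation:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def greedy_cluster(embeddings: dict[str, list[float]], threshold: float = 0.9) -> list[list[str]]:
    """Assign each item to the first cluster whose leading member it resembles."""
    clusters: list[list[str]] = []
    for name, vec in embeddings.items():
        for cluster in clusters:
            if cosine(vec, embeddings[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters

items = {
    "invoice_en": [0.9, 0.1, 0.0],
    "invoice_de": [0.88, 0.12, 0.05],  # a translation should land nearby
    "support_ticket": [0.1, 0.9, 0.2],
}
print(greedy_cluster(items))
# → [['invoice_en', 'invoice_de'], ['support_ticket']]
```

With a multilingual model, a document and its translation land close together in the embedding space, which is what makes duplicate detection work across languages.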

These tasks are common in enterprise search deployments and feature in retrieval-augmented generation workflows, where a system retrieves documents and passes them to a language model. Agent-style software also uses retrieval and matching as part of multi-step interactions.

Distribution options

Elastic is distributing the models through several channels. One is the Elastic Inference Service, a GPU-accelerated managed service designed to reduce the setup complexity of running model inference.

The models are available on Elastic Cloud Serverless and Elastic Cloud Hosted through the Elastic Inference Service. Elastic Cloud Trials also include access to the inference service.

A second channel is open-weight distribution through Hugging Face. Users can self-host the models through vLLM, llama.cpp, or MLX. Elastic also referenced access through an online API.

Platform positioning

Elastic framed the launch as part of its broader push around search and AI, positioning the offering as a combined stack. In this setup, the embedding models sit alongside a vector database in Elasticsearch and other parts of its data platform across cloud and on-premises environments.

Embedding models have become a competitive battleground among search and database vendors. Smaller models can reduce compute demands during indexing and querying, and they can broaden where organisations deploy semantic search, including environments with constrained memory or limited access to dedicated accelerators.

Elastic says the models' small size reduces infrastructure costs and improves response times for hybrid search. Hybrid search blends traditional keyword-based approaches with vector-based semantic techniques, which can improve recall and relevance when content varies in phrasing or language.

Executive comment

Elastic tied the release to retrieval quality for search and AI systems.

"Vector search, RAG, and AI agents depend on high-quality retrieval," said Steve Kearns, General Manager, Search, Elastic. "With the addition of the Jina v5's multilingual embeddings, Elasticsearch continues to be the platform of choice for end-to-end context engineering."

The models add another option for teams building multilingual search experiences. Elastic says they target semantic tasks across languages, a common requirement for global enterprises and content platforms.

The jina-embeddings-v5-text models are available now through the Elastic Inference Service on Elastic's cloud platforms, with self-hosted deployment options also available through open-weight distribution.