AI Models

Perplexity Releases Open-Source Embedding Models to Compete with Google and Alibaba

Perplexity has released two open-source embedding models that rival offerings from Google and Alibaba while requiring far less memory to store and search.

Tags: Perplexity, open-source, embedding models, AI, search, RAG

AI search company Perplexity has announced the release of two new open-source text embedding models, `pplx-embed-v1` and `pplx-embed-context-v1`. These models are engineered for high-performance, web-scale retrieval tasks and are positioned to compete with established models from industry giants like Google and Alibaba.


The models are available in two sizes: a 0.6 billion parameter version designed for lightweight, low-latency applications, and a more powerful 4 billion parameter version that maximizes retrieval quality. One of the most significant innovations is the models' native support for INT8 and binary quantization. This allows for a 4x and 32x reduction in storage requirements, respectively, compared to the standard FP32 format, without a substantial loss in performance. This efficiency makes storing and searching through billions of embeddings far more practical and cost-effective.
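The storage arithmetic behind those figures is straightforward: INT8 replaces each 4-byte FP32 value with 1 byte (4x), and binary quantization keeps only the sign of each dimension, packing 8 dimensions into 1 byte (32x). A minimal NumPy sketch, using an assumed 1,024-dimensional vector purely for illustration (the article does not state the models' actual output dimension):

```python
import numpy as np

# Hypothetical FP32 embedding; the 1024-dim size is an assumption for illustration.
rng = np.random.default_rng(0)
emb = rng.standard_normal(1024).astype(np.float32)

# INT8 quantization: scale each value into the [-127, 127] integer range.
scale = np.abs(emb).max() / 127.0
emb_int8 = np.clip(np.round(emb / scale), -127, 127).astype(np.int8)

# Binary quantization: keep only the sign bit, packed 8 dimensions per byte.
emb_bin = np.packbits(emb > 0)

print(emb.nbytes)       # 4096 bytes in FP32
print(emb_int8.nbytes)  # 1024 bytes -> 4x smaller
print(emb_bin.nbytes)   # 128 bytes  -> 32x smaller
```

At a billion vectors, that is the difference between roughly 4 TB of raw FP32 embeddings and about 128 GB in binary form, which is why the article calls web-scale search "far more practical and cost-effective."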


A key technical differentiator is the use of a diffusion-based pretraining process. Unlike traditional decoder-only models that use causal attention (only processing text in one direction), Perplexity's models can understand bidirectional context. This enables a more nuanced comprehension of text passages by considering information from both preceding and succeeding tokens, which is crucial for accurate retrieval. Furthermore, the models are designed for ease of use and do not require special instruction prefixes, simplifying their integration into existing AI pipelines.
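The causal-versus-bidirectional distinction described above can be pictured as a difference in attention masks. A toy NumPy illustration for a 4-token input (this is a generic sketch of the two attention patterns, not Perplexity's implementation):

```python
import numpy as np

n = 4  # toy sequence length

# Causal (decoder-only) mask: token i may attend only to positions <= i,
# so information flows in one direction.
causal_mask = np.tril(np.ones((n, n), dtype=bool))

# Bidirectional mask: every token attends to every other token, so a passage's
# meaning can incorporate both preceding and succeeding context.
bidir_mask = np.ones((n, n), dtype=bool)

print(int(causal_mask.sum()))  # 10 visible token pairs
print(int(bidir_mask.sum()))   # 16 visible token pairs
```

For retrieval, the practical upshot is that an early token's representation can be disambiguated by words that appear later in the passage, which a causal mask forbids.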


In terms of performance, the `pplx-embed` family has demonstrated state-of-the-art results across a range of public benchmarks, including MTEB, BERGEN, and ConTEB. The 4B model, `pplx-embed-v1-4B`, has been shown to match the performance of Alibaba's Qwen3-Embedding-4B and surpass Google's `gemini-embedding-001` on the MTEB benchmark, all while offering significant storage advantages. The release of these high-performance, efficient, and open-source models is a major contribution to the AI community, empowering developers to build more advanced and scalable semantic search and Retrieval-Augmented Generation (RAG) systems.
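One reason binary quantization matters for RAG at this scale is that similarity search over sign-binarized vectors reduces to cheap Hamming distance on packed bits. A minimal sketch with synthetic data (the random embeddings and the 256-dimension size are assumptions; real vectors would come from a pplx-embed-style model):

```python
import numpy as np

# Synthetic "document" embeddings standing in for model output.
rng = np.random.default_rng(1)
dim, n_docs = 256, 1000
doc_emb = rng.standard_normal((n_docs, dim)).astype(np.float32)

# A query that is a lightly perturbed copy of document 42.
query = doc_emb[42] + 0.1 * rng.standard_normal(dim).astype(np.float32)

# Binarize by sign and pack 8 dimensions per byte.
doc_bits = np.packbits(doc_emb > 0, axis=1)   # shape: (n_docs, dim // 8)
query_bits = np.packbits(query > 0)

# Hamming distance = number of differing bits in the XOR; lower = more similar.
dist = np.unpackbits(doc_bits ^ query_bits, axis=1).sum(axis=1)
best = int(dist.argmin())
print(best)  # recovers document 42 from the 32x-compressed index
```

Production systems would layer approximate-nearest-neighbor indexing and optional FP32 re-ranking on top, but the core trick (XOR plus popcount over packed bits) is what makes billion-scale semantic search affordable.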
