Snowflake Arctic Embed L v2.0

Version: 2.0 · Family: Snowflake Arctic
Snowflake Arctic Embed L v2.0 is a 568M-parameter multilingual embedding model optimized for semantic retrieval and RAG pipelines. Apache 2.0, 1024-dimensional vectors, 8192-token context window.
Status: Active · Public access · Open weights · Embedding model · Snowflake Arctic
Context window: 8192 tokens
Parameters: 568M
Release date: 4 December 2024
Access: API, Download
Deployment: Cloud, Local

Overview

Snowflake Arctic Embed L v2.0 is a multilingual text embedding model released by Snowflake in December 2024. Built on BAAI/bge-m3-retromae, it features 568M total parameters (303M non-embedding) and generates 1024-dimensional embedding vectors with a context window of up to 8192 tokens via RoPE.

The model reports strong retrieval quality on English (MTEB Retrieval: 55.6 NDCG@10) and multilingual benchmarks (MIRACL: 55.8 NDCG@10; CLEF: 52.9 NDCG@10); all three scores are self-reported on the Hugging Face model card. Embeddings can be compressed by truncating them to 256 dimensions via Matryoshka Representation Learning (MRL) and quantizing each dimension to 4 bits, which gives 128 bytes per vector (256 × 4 bits = 1024 bits), retaining over 97% of baseline quality.
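The storage arithmetic behind the 128-byte figure can be checked with a short sketch. It truncates a 1024-dimensional vector to its first 256 components, re-normalizes it, and packs each component into 4 bits. The random stand-in vector, the helper names, and the uniform quantizer with its clipping range are illustrative assumptions; this is not Snowflake's quantization scheme, only a demonstration of the size calculation.

```python
import math
import random

FULL_DIM = 1024   # output dimensionality stated above
MRL_DIM = 256     # truncated dimensionality stated above
BITS = 4          # bits per dimension after quantization


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def truncate_mrl(vec, dim=MRL_DIM):
    """Keep the first `dim` components, then re-normalize."""
    return l2_normalize(vec[:dim])


def quantize_4bit(vec, lo=-0.25, hi=0.25):
    """Uniform scalar quantization to 16 levels, packed two codes per byte.

    The clipping range [lo, hi] is an assumption made for this sketch.
    """
    levels = (1 << BITS) - 1  # 15
    codes = []
    for x in vec:
        clipped = min(max(x, lo), hi)
        codes.append(round((clipped - lo) / (hi - lo) * levels))
    packed = bytearray()
    for i in range(0, len(codes), 2):
        packed.append((codes[i] << 4) | codes[i + 1])
    return bytes(packed)


# Stand-in for a real model output: a random unit vector of 1024 floats.
rng = random.Random(0)
full_vector = l2_normalize([rng.gauss(0.0, 1.0) for _ in range(FULL_DIM)])

compact = quantize_4bit(truncate_mrl(full_vector))

float32_bytes = FULL_DIM * 4  # 4096 bytes for the uncompressed float32 vector
print(len(compact), float32_bytes // len(compact))  # 128 32
```

Under these assumptions the compressed vector is 32 times smaller than the full float32 embedding. The quality retained after compression depends on the model's training and cannot be inferred from this sketch.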

Available on Snowflake Cortex AI as `snowflake-arctic-embed-l-v2.0`, on Hugging Face, and via Sentence Transformers and Transformers.js. Licensed under Apache 2.0 for free commercial use.
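A minimal sketch of local use through Sentence Transformers might look like the following. The Hugging Face repository id `Snowflake/snowflake-arctic-embed-l-v2.0` and the `"query"` prompt name are assumptions not stated on this page and should be verified against the model card; `load_model`, `embed`, `cosine`, and `rank` are hypothetical helper names. The library is imported lazily, so the ranking helper works without it installed.

```python
import math


def load_model(model_name="Snowflake/snowflake-arctic-embed-l-v2.0"):
    """Load the model; the weights are downloaded on first use."""
    # Lazy import: only needed when a model is actually loaded.
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(model_name)


def embed(model, texts, is_query=False):
    """Return one embedding (a list of floats) per input text."""
    if is_query:
        # Assumption: the model configuration defines a "query" prompt.
        return model.encode(texts, prompt_name="query").tolist()
    return model.encode(texts).tolist()


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs):
    """Return document indices sorted by descending cosine similarity."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Example usage (requires sentence-transformers and a network connection):
# model = load_model()
# query = embed(model, ["how long is the context window?"], is_query=True)[0]
# docs = embed(model, ["The context window is 8192 tokens.", "Apache 2.0 license."])
# print(rank(query, docs))
```

Loading the model once and reusing it across calls avoids repeated initialization. On Snowflake Cortex AI the same model is addressed by the name `snowflake-arctic-embed-l-v2.0` instead.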

Classification
Embedding model
Access & deployment
Access: API, Download
Deployment: Cloud, Local
Weights: Open weights
Key parameters
Context: 8192 tokens
Parameters: 568M
Fine-tuning: supported
Input: text

Technical specification

Context window: 8192 tokens
Parameters: 568M
License: Apache 2.0
Features: Fine-tuning
Modalities
Input: text
Output: structured_data (embedding vectors)

Capabilities and applications

Native model capabilities
Multilingual: embeds text written in many languages. Category: language.
Long context: handles long inputs of up to 8192 tokens. Category: language.

Benchmark results

3 benchmarks
MTEB Retrieval (BEIR-15): 55.6 points. Metric: NDCG@10, averaged across 15 BEIR datasets. Source: Hugging Face model card (self-reported).
MIRACL (4 languages): 55.8 points. Metric: NDCG@10, multilingual retrieval, averaged across 4 languages. Source: Hugging Face model card (self-reported).
CLEF (Focused): 52.9 points. Metric: NDCG@10, CLEF Focused benchmark (multilingual). Source: Hugging Face model card (self-reported).
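All three results use NDCG@10, which scores the top 10 retrieved documents for a query against the best possible ordering of the judged documents. The sketch below computes it for a single query with linear gain; the function names and the toy relevance labels are illustrative. It is a simplified stand-in for the evaluation tooling the benchmarks actually use, and it assumes the reported points are the 0 to 1 score multiplied by 100 and averaged over queries and datasets.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k positions (linear gain)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))


def ndcg_at_k(ranked_relevances, judged_relevances, k=10):
    """NDCG@k for one query.

    ranked_relevances: relevance labels of the retrieved documents, in rank order.
    judged_relevances: relevance labels of every judged document for the query,
        used to build the ideal ranking.
    """
    ideal = sorted(judged_relevances, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        return 0.0
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


# Toy query: two relevant documents exist; the system returns them at ranks 1 and 3.
score = ndcg_at_k([1, 0, 1], [1, 1])
print(round(score, 4))  # 0.9197
```

A perfect ranking scores 1.0, and placing a relevant document lower in the list reduces the score through the logarithmic discount.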

Deployment and security

Security / Enterprise (verified enterprise information)

Models hosted in Snowflake Cortex AI use Snowflake's full security infrastructure. Data does not leave the platform's security boundary.

Updated: 28 Apr 2026