DBRX MoE-A

DBRX MoE-A · Family: DBRX

Smallest DBRX family member: 7.7B total parameters, 2.2B active. Used internally by Databricks to study MoE training efficiency. Reaches 30.5% on the Databricks Gauntlet with 3.7x fewer training FLOPs than MPT-7B needed to reach 30.9%.

Research only · LLM · DBRX

Context window: 32K tokens

Parameters: 7.7B total / 2.2B active

Release date: 27 March 2024

Producer: Databricks

Access: API · Deployment: Cloud

Overview

DBRX MoE-A is the smallest member of the DBRX family: an internal Databricks research model with 7.7B total parameters, of which 2.2B are active for any given token. It was not released publicly. Databricks used it to study the training efficiency of Mixture of Experts (MoE) architectures and to validate the final DBRX training recipe.
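The gap between total and active parameters is what makes an MoE model cheaper per token than a dense model of the same size. The short sketch below works out the active share from the two figures quoted above; the variable names are illustrative and not taken from any Databricks code.

```python
# Parameter counts quoted in the overview (total vs. active per token).
TOTAL_PARAMS = 7.7e9
ACTIVE_PARAMS = 2.2e9

# Share of the weights that take part in processing a single token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Weights held in the model but not used for a given token.
inactive_params = TOTAL_PARAMS - ACTIVE_PARAMS

print(f"Active share per token: {active_fraction:.1%}")
print(f"Inactive per token: {inactive_params / 1e9:.1f}B parameters")
```

Under these figures, roughly 29% of the weights are used for each token, while the remaining 5.5B parameters still have to be stored.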

Purpose

DBRX MoE-A provides a head-to-head comparison with MPT-7B (the Databricks/Mosaic model from May 2023). The results show how the new DBRX training stack (MoE architecture, better data, GPT-4 tokenizer) improves training efficiency over dense 7B-class models.

Results

On the Databricks Model Gauntlet v0.3 it scores 30.5%, comparable to MPT-7B (30.9%), while using 3.7x fewer training FLOPs. This supports the claim that the full DBRX training recipe is roughly 4x more compute-efficient than the previous-generation MPT pipeline.
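To make the efficiency claim concrete, the sketch below restates the quoted figures as a relative compute budget and a score gap. It uses only the numbers given above; the helper names are hypothetical, and absolute FLOP counts are not stated in the source, so only ratios are computed.

```python
# Figures quoted in the results above (Databricks DBRX blog, 2024-03-27).
MOE_A_SCORE = 30.5   # Gauntlet v0.3 score, percent
MPT_7B_SCORE = 30.9  # Gauntlet v0.3 score, percent
FLOPS_RATIO = 3.7    # MPT-7B training FLOPs divided by DBRX MoE-A training FLOPs


def relative_compute(flops_ratio: float) -> float:
    """Fraction of the baseline's training FLOPs used by the cheaper model."""
    return 1.0 / flops_ratio


def score_gap(score_a: float, score_b: float) -> float:
    """Absolute difference between two scores, in percentage points."""
    return abs(score_a - score_b)


moe_a_compute = relative_compute(FLOPS_RATIO)
gap = score_gap(MOE_A_SCORE, MPT_7B_SCORE)

print(f"DBRX MoE-A used {moe_a_compute:.1%} of MPT-7B's training FLOPs")
print(f"Score gap: {gap:.1f} percentage points")
```

In other words, DBRX MoE-A lands within 0.4 points of MPT-7B on about 27% of the training compute.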

Status

The model is not publicly available, either on Hugging Face or through an API. Weights and checkpoints remain internal to Databricks. The public DBRX blog (March 2024) mentions it only as a validation artifact of the training pipeline.


Applications

Model evaluation

Access & deployment

API

Cloud

Weights: Closed


Technical specification


License

Databricks internal / research

Hardware requirements

Internal Databricks research model; no public checkpoint available.

Modalities

Input: text

Output: text, code


Benchmark results

Databricks Model Gauntlet v0.3 (average score · composite average of 30+ tasks): 30.5%

Source: Databricks DBRX blog (2024-03-27)

Technical architecture

Core Architecture

Transformer · MoE

Model Form

LLM

Sources and related pages

Blog: Introducing DBRX, Databricks Blog (databricks.com)
