DBRX MoE-B

DBRX MoE-B · Family: DBRX
Mid-size member of the DBRX family: 23.5B total parameters, 6.6B active. Used to study MoE training efficiency. It reaches 45.5% on the Databricks Gauntlet with 1.7x fewer FLOPs than LLaMA2-13B, a dense model with all 13B parameters active.
Research · Research only · LLM · DBRX
Context window: 32K tokens
Parameters: 23.5B total / 6.6B active
Release date: 27 March 2024
Access: API
Deployment: Cloud

Overview

DBRX MoE-B is an internal Databricks research model in the DBRX family, sized between MoE-A (7.7B) and the flagship DBRX (132B). It has 23.5B total parameters, of which 6.6B are active per token. It was not released publicly; its role is to validate that the Mixture of Experts (MoE) architecture scales across growing model sizes.
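
As a rough illustration of the total versus active split, the sketch below computes the share of parameters that are active for a given token. The function name `active_fraction` is a hypothetical helper introduced only for this example; the two numbers are the ones quoted above.

```python
def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters that are active per token.

    Both arguments are parameter counts in billions.
    `active_fraction` is a hypothetical helper, not part of any DBRX tooling.
    """
    if total_b <= 0 or not (0 < active_b <= total_b):
        raise ValueError("expected 0 < active_b <= total_b")
    return active_b / total_b


# Figures quoted in this entry: 23.5B total, 6.6B active.
share = active_fraction(total_b=23.5, active_b=6.6)
print(f"Active share of parameters: {share:.1%}")  # prints 28.1%
```

In other words, a little over a quarter of the weights take part in processing any single token, which is where the compute savings of a sparse model come from.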

Purpose

DBRX MoE-B provides a direct comparison point against LLaMA2-13B and the full DBRX-132B. It shows how the training efficiency of the DBRX family scales into the 20B+ parameter range, and how fine-grained MoE (16 experts, 4 active) behaves at mid-scale.
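
One way to see what "fine-grained" means here is to count how many distinct sets of experts a router can pick per token. The sketch below does this for the 16-expert, 4-active layout stated above. The 8-expert, 2-active configuration is an assumption added purely as a coarser point of comparison and is not taken from this entry; `expert_subsets` is a hypothetical helper name.

```python
from math import comb


def expert_subsets(num_experts: int, active_experts: int) -> int:
    """Number of distinct expert sets a router can select for one token."""
    return comb(num_experts, active_experts)


# Layout stated in this entry: 16 experts, 4 active per token.
fine_grained = expert_subsets(16, 4)

# Assumed coarser layout, used only for comparison (not from this entry).
coarse = expert_subsets(8, 2)

print(fine_grained)            # 1820
print(coarse)                  # 28
print(fine_grained // coarse)  # 65
```

Under these assumptions, the finer layout gives the router 65 times as many expert combinations to choose from.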

Results

On the Databricks Model Gauntlet v0.3 it scores 45.5%, ahead of LLaMA2-13B (43.8%), while using 1.7x fewer FLOPs. This is the key evidence that the DBRX recipe (MoE architecture + curriculum learning + high-quality data + GPT-4 tokenizer) scales smoothly with model size.
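
To make the comparison concrete, the short sketch below converts the "1.7x fewer FLOPs" figure quoted in the summary into a relative compute budget and prints the score gap. It only rearranges numbers already given in this entry; `relative_flops` is a hypothetical helper name.

```python
def relative_flops(times_fewer: float) -> float:
    """Compute budget as a fraction of the baseline, given an 'N x fewer' factor."""
    if times_fewer <= 0:
        raise ValueError("times_fewer must be positive")
    return 1.0 / times_fewer


# Figures quoted in this entry.
MOE_B_SCORE = 45.5      # Gauntlet score, percent
LLAMA2_13B_SCORE = 43.8  # Gauntlet score, percent
FLOPS_FACTOR = 1.7       # "1.7x fewer FLOPs"

budget = relative_flops(FLOPS_FACTOR)
print(f"Compute relative to LLaMA2-13B: {budget:.1%}")
print(f"Compute saved: {1 - budget:.1%}")
print(f"Score gap: {MOE_B_SCORE - LLAMA2_13B_SCORE:.1f} points")
```

Read this way, MoE-B reaches a slightly higher score on roughly three fifths of the baseline's compute.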

Status

The model is not publicly available, either on Hugging Face or through an API. Weights remain internal to Databricks. The public DBRX blog (March 2024) lists it, together with MoE-A, only as a validation artifact of the full DBRX-132B training recipe.

Classification: LLM
Family: DBRX
Access & deployment: API, Cloud
Weights: Closed
Key parameters
Context: 32K
Parameters: 23.5B total / 6.6B active
Input: text

Technical specification

Context window: 32K tokens
Parameters: 23.5B total / 6.6B active
License: Databricks internal / research
Hardware requirements: Internal Databricks research model; no public checkpoint available.
Modalities
Input: text
Output: text, code

Benchmark results

1 benchmark
Databricks Model Gauntlet v0.3: 45.5% (avg score · composite avg of 30+ tasks)
Source: Databricks DBRX blog (2024-03-27)

Technical architecture

Core Architecture
Model Form