
Gated DeltaNet
Linear-transformer architecture that combines the delta rule with gating, presented as an improvement over Mamba2 and DeltaNet (NVIDIA Research and MIT CSAIL, ICLR 2025).
Research | Research only | LLM
Parameters
0.4B to 1.3B (research scale)
parameters
Release date
9 December 2024
Access: Download
Deployment: Local
Overview
Classification
LLM
Access & deployment
Download
Local
Weights: Closed
Key parameters
Parameters: 0.4B to 1.3B (research scale)
Input: text
Technical specification
Parameters
0.4B to 1.3B (research scale)
parameters
License
NVIDIA Source Code License-NC
Modalities
Input
text
Output
text
Capabilities and applications
Native model capabilities
Language modeling
Ability to predict subsequent tokens and generate coherent natural-language text based on the preceding context.
Category: language
Long context
The model's ability to handle long context and maintain coherence over a large amount of input data.
Category: reasoning
Technical architecture
Model Form
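The card describes the architecture as the delta rule combined with gating. A minimal sketch of one recurrent step under that description is given below, assuming the commonly cited update S_t = alpha_t * S_{t-1} (I - beta_t k_t k_t^T) + beta_t v_t k_t^T with output o_t = S_t q_t. The function name gated_delta_step, the scalar gates, and the toy values are illustrative assumptions, not the released implementation; in a trained model the gates would be computed per token from the input, and training would typically use a chunked parallel form rather than this token-by-token loop.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def gated_delta_step(S: Matrix, q: Vector, k: Vector, v: Vector,
                     alpha: float, beta: float) -> Tuple[Matrix, Vector]:
    """One step of a gated delta-rule memory (illustrative sketch).

    S     : state matrix of shape (d_v, d_k), maps keys to values
    q, k  : query and key vectors of length d_k (k assumed L2-normalized)
    v     : value vector of length d_v
    alpha : decay gate in [0, 1]; smaller values forget more of the old state
    beta  : write strength in [0, 1]; 1 fully replaces the value stored under k
    """
    d_v, d_k = len(S), len(k)
    # Value the memory currently returns for key k: S_{t-1} k
    old_v = [sum(S[i][j] * k[j] for j in range(d_k)) for i in range(d_v)]
    # S_{t-1}(I - beta k k^T) equals S_{t-1} - beta (S_{t-1} k) k^T.
    # Decay that by alpha, then write the new association beta v k^T.
    S_new = [
        [alpha * (S[i][j] - beta * old_v[i] * k[j]) + beta * v[i] * k[j]
         for j in range(d_k)]
        for i in range(d_v)
    ]
    # Read out with the query: o_t = S_t q_t
    o = [sum(S_new[i][j] * q[j] for j in range(d_k)) for i in range(d_v)]
    return S_new, o


# Toy usage: write a value, overwrite it under the same key, then decay it.
S = [[0.0, 0.0], [0.0, 0.0]]
k = [1.0, 0.0]
S, o1 = gated_delta_step(S, k, k, [3.0, 4.0], alpha=1.0, beta=1.0)  # store (k -> [3, 4])
S, o2 = gated_delta_step(S, k, k, [5.0, 6.0], alpha=1.0, beta=1.0)  # delta rule replaces it
S, o3 = gated_delta_step(S, k, k, [0.0, 0.0], alpha=0.5, beta=0.0)  # gate halves the memory
```

In this sketch the delta term removes the value previously stored under a key before writing the new one, while the gate alpha uniformly decays everything in memory. The state S keeps the same d_v by d_k shape no matter how many tokens have been processed, which is the property usually associated with the long-context capability listed above.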
Training Techniques