Transformer from Scratch · Decoder-Only Transformer

Language Modeling Head and Logits

Introduction

You will learn how the language modeling head maps each token's representation to a vector of logits over the vocabulary, and how logits, softmax, and cross-entropy relate to one another.
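
To make the relationship concrete, here is a minimal NumPy sketch rather than a reference implementation. It assumes the head is a single linear projection with no bias; the names `lm_head`, `log_softmax`, and `cross_entropy`, and the toy sizes, are illustrative choices and do not come from the course material.

```python
import numpy as np

def lm_head(hidden, W):
    """Project hidden states (T, d_model) to logits (T, vocab_size)."""
    return hidden @ W

def log_softmax(logits):
    """Numerically stable log-softmax over the vocabulary axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def cross_entropy(logits, targets):
    """Mean negative log-probability assigned to the target token ids."""
    logp = log_softmax(logits)
    return -logp[np.arange(len(targets)), targets].mean()

# Toy sizes, chosen only for illustration.
rng = np.random.default_rng(0)
T, d_model, vocab_size = 4, 8, 10

hidden = rng.normal(size=(T, d_model))             # one vector per position
W = rng.normal(size=(d_model, vocab_size)) * 0.02  # head weights (assumed shape)
targets = np.array([1, 3, 5, 7])                   # next-token ids per position

logits = lm_head(hidden, W)          # unnormalized scores, shape (T, vocab_size)
probs = np.exp(log_softmax(logits))  # softmax turns each row into a distribution
loss = cross_entropy(logits, targets)
```

Logits are unnormalized scores, softmax normalizes each row into a probability distribution, and cross-entropy is the average negative log-probability of the correct next token. If all logits in a row are equal, the loss is exactly log(vocab_size), which is a handy sanity check for an untrained model.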