Transformer from Scratch · Transformer Foundations
Encoder, Decoder and Decoder-Only Models
Introduction
Transformers come in three main layouts: encoder-only, decoder-only, and encoder-decoder. In this lesson you will learn to distinguish them and understand why GPT uses a causally masked decoder-only setup while BERT uses a bidirectional encoder.
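As a first intuition for the masked decoder-only setup mentioned above, the difference between the two attention patterns can be sketched with a plain boolean mask. This is a minimal illustration in pure Python, not how GPT or BERT implement attention; the helper name `attention_mask` and the sequence length of 4 are assumptions made up for this example.

```python
def attention_mask(seq_len, causal):
    """Build a seq_len x seq_len mask (hypothetical helper for illustration).

    mask[i][j] is True when position i is allowed to attend to position j.
    """
    return [
        [(j <= i) if causal else True for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Encoder-style (as in BERT): every token may attend to every token.
encoder_mask = attention_mask(4, causal=False)

# Decoder-style (as in GPT): token i may attend only to tokens 0..i.
decoder_mask = attention_mask(4, causal=True)

# Print the causal mask: "x" = attention allowed, "." = blocked.
for row in decoder_mask:
    print("".join("x" if allowed else "." for allowed in row))
```

The printed pattern is lower-triangular: each position sees itself and everything before it, but nothing after it. That restriction is what lets a decoder-only model be trained to predict the next token, while the all-True encoder mask lets each token use context from both directions.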