Neural Networks: From Fundamentals to Modern AI · From Neuron to MLP: Architecture and Forward Pass
The Universal Approximation Theorem — why non-linearity is necessary
Introduction
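The Universal Approximation Theorem (UAT), proved by Cybenko (1989) for sigmoidal activations and generalized by Hornik (1991), is a core theoretical justification for neural networks: a network with a single hidden layer, enough neurons, and a suitable non-linear activation (for example a sigmoid) can approximate any continuous function on a compact set to arbitrary accuracy. The theorem guarantees that such a network exists; it does not say how many neurons are needed or how to find the weights by training. This lesson covers the precise statement of the theorem, the difference between width universality and depth universality, practical limitations (the required neuron count can grow exponentially with the input dimension), and why modern networks are deep rather than just wide. You will see a concrete bump-function construction that illustrates the idea behind the proof, and understand why an MLP with only linear activations collapses to a single linear layer.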
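As a preview of the two ideas named above, here is a minimal NumPy sketch; it is an illustration, not the formal proof. The first part checks numerically that two stacked layers without an activation equal one affine map. The second part builds a one-hidden-layer sigmoid network by hand: each "bump" is the difference of two steep sigmoids, and a weighted sum of bumps approximates a target function on [0, 1]. The names `bump` and `approximate`, the target `sin(2πx)`, and the steepness value are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1) Two stacked layers with no activation equal a single affine map.
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
x = rng.normal(size=3)

two_layers = W2 @ (W1 @ x + b1) + b2
one_layer = (W2 @ W1) @ x + (W2 @ b1 + b2)  # merged weights and bias
collapse_ok = np.allclose(two_layers, one_layer)

# 2) Bump construction with sigmoid units (hypothetical helper names).
def sigmoid(z):
    # Numerically stable form of 1 / (1 + exp(-z)).
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def bump(x, left, right, steepness):
    # Close to 1 inside [left, right] and close to 0 outside.
    return sigmoid(steepness * (x - left)) - sigmoid(steepness * (x - right))

def approximate(f, xs, n_bumps, steepness=2000.0):
    # One hidden layer with two sigmoid units per bump; the output weight
    # of each bump is the target value at the centre of its interval.
    edges = np.linspace(0.0, 1.0, n_bumps + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    out = np.zeros_like(xs)
    for left, right, c in zip(edges[:-1], edges[1:], centers):
        out += f(c) * bump(xs, left, right, steepness)
    return out

def target(x):
    # Assumed example target; any continuous function on [0, 1] would do.
    return np.sin(2.0 * np.pi * x)

xs = np.linspace(0.0, 1.0, 2001)
err_coarse = np.max(np.abs(approximate(target, xs, 10) - target(xs)))
err_fine = np.max(np.abs(approximate(target, xs, 50) - target(xs)))
print(collapse_ok, err_coarse, err_fine)
```

Using more bumps lowers the maximum error in this one-dimensional example. Tiling a d-dimensional input the same way would need on the order of n^d bumps, which is the kind of exponential growth in neuron count mentioned above.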