Neural Networks: From Fundamentals to Modern AI · From Neuron to MLP: Architecture and Forward Pass

The Universal Approximation Theorem — why non-linearity is necessary

Introduction

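The Universal Approximation Theorem (UAT), due to Cybenko (1989) and Hornik (1991), is a central theoretical result behind neural networks: a network with a single hidden layer, enough neurons, and a suitable non-linear activation (sigmoidal in Cybenko's proof) can approximate any continuous function on a compact set to arbitrary accuracy. The theorem is an existence result. It guarantees that suitable weights exist, not that training will find them, and it says nothing about how many neurons are needed. This lesson covers the precise statement of the theorem, the difference between width universality and depth universality, the practical limitations (the number of neurons required can grow exponentially), and why modern networks are deep rather than just wide. You will see a concrete bump-function construction that illustrates how the approximation works, and understand why an MLP with no non-linear activation collapses into a single affine map regardless of how many layers it has.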
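The two ideas above can be previewed in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the construction used in the original proofs: the names bump_network, max_error, target, linear_mlp and collapse, the steepness heuristic, the choice of sin(2πx) as the target, and the weight values W1, b1, W2, b2 are all illustrative. The first part builds a one-hidden-layer sigmoid network on [0, 1] by pairing steep sigmoids into bumps. The second part checks that two layers with no activation between them compute the same output as one affine map.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def bump_network(f, n_bumps, steepness=50.0):
    """One-hidden-layer sigmoid network approximating f on [0, 1].

    Each bump is the difference of two steep sigmoids, so the hidden
    layer has 2 * n_bumps units. Returns a callable net(x).
    """
    width = 1.0 / n_bumps
    k = steepness * n_bumps  # assumed heuristic: steps sharpen as bins shrink
    units = []  # one (input weight, bias, output weight) triple per hidden unit
    for i in range(n_bumps):
        # Push the outermost edges outside [0, 1] so the domain boundary
        # sits on a plateau rather than in the middle of a sigmoid step.
        left = i * width if i > 0 else -1.0
        right = (i + 1) * width if i < n_bumps - 1 else 2.0
        height = f((i + 0.5) * width)  # target value at the bin midpoint
        units.append((k, -k * left, height))    # step up at the left edge
        units.append((k, -k * right, -height))  # step down at the right edge

    def net(x):
        return sum(v * sigmoid(w * x + b) for w, b, v in units)
    return net

def max_error(f, net, n_points=1001):
    """Largest absolute error on an evenly spaced grid over [0, 1]."""
    xs = (j / (n_points - 1) for j in range(n_points))
    return max(abs(f(x) - net(x)) for x in xs)

def target(x):
    return math.sin(2.0 * math.pi * x)

def matvec(W, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def linear_mlp(x, W1, b1, W2, b2):
    """Two layers with no activation between them."""
    h = [hi + bi for hi, bi in zip(matvec(W1, x), b1)]
    return [yi + bi for yi, bi in zip(matvec(W2, h), b2)]

def collapse(W1, b1, W2, b2):
    """Fold both layers into one affine map: W = W2 W1, b = W2 b1 + b2."""
    cols = list(zip(*W1))
    W = [[sum(a * c for a, c in zip(row, col)) for col in cols] for row in W2]
    b = [wb + bi for wb, bi in zip(matvec(W2, b1), b2)]
    return W, b

# Illustrative weights: 2 inputs -> 3 hidden units -> 1 output.
W1 = [[1.0, -2.0], [0.5, 0.0], [3.0, 1.0]]
b1 = [0.1, -0.2, 0.3]
W2 = [[2.0, -1.0, 0.5]]
b2 = [1.0]

# Part 1: more bumps (a wider hidden layer) should give a smaller error.
for n in (10, 100):
    err = max_error(target, bump_network(target, n))
    print(f"{2 * n:4d} hidden units: max error = {err:.4f}")

# Part 2: the two-layer linear network and the folded map agree.
W, b = collapse(W1, b1, W2, b2)
x = [2.5, 4.0]
print(linear_mlp(x, W1, b1, W2, b2)[0], matvec(W, x)[0] + b[0])
```

In this sketch the error is governed mainly by the bin width, so a tenfold increase in hidden units buys roughly a tenfold improvement for a smooth one-dimensional target. The same grid-based idea needs a bin per cell in every input dimension, which is where the exponential neuron counts mentioned above come from.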