Neural Networks: From Fundamentals to Modern AI · Backpropagation: How a Network Learns
Forward pass vs backward pass — symmetry and gradient flow
Introduction
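Every layer in a network is traversed twice, in mirror-image passes: the forward pass propagates values from the input to the loss, and the backward pass propagates gradients from the loss back to the parameters and inputs. This lesson shows the mathematical symmetry between the two flows. An operation's local Jacobian J in the forward pass appears as its transpose J^T in the backward pass. A "+" node passes the upstream gradient unchanged to both inputs. A "*" node swaps its inputs: for z = x · y, the gradient with respect to x is upstream · y and the gradient with respect to y is upstream · x. A "copy" (fan-out) node in the forward pass becomes a "+" node in the backward pass, because the gradients coming from all consumers are summed. You will understand why the computational cost of the backward pass is comparable to that of the forward pass (FLOPs of the same order, roughly twice as many for a matrix multiplication, since gradients are needed for both the weights and the inputs), why memory for the backward pass grows with depth (forward activations must be cached until the backward pass consumes them), and how this symmetry extends to the elementary network operations: matmul, broadcast, reduction, and indexing.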
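The rules above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: every function name (`add_backward`, `mul_backward`, `copy_backward`, `matvec_backward_x`, `grad_f`) is a hypothetical name chosen for this example, and the test function f(x, y) = (x + y) · x is an assumption picked because it uses x twice, so all three scalar rules appear at once.

```python
# Hand-written forward/backward pairs. All names are illustrative.

def add_forward(x, y):
    return x + y

def add_backward(upstream):
    # "+" passes the upstream gradient unchanged to both inputs.
    return upstream, upstream

def mul_forward(x, y):
    return x * y

def mul_backward(upstream, x, y):
    # "*" swaps its cached inputs: d(xy)/dx = y, d(xy)/dy = x.
    return upstream * y, upstream * x

def copy_backward(upstream_a, upstream_b):
    # A value copied to two consumers in the forward pass
    # sums the two incoming gradients in the backward pass.
    return upstream_a + upstream_b

def matvec_forward(W, x):
    # y = W x; the Jacobian of y with respect to x is W.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matvec_backward_x(W, upstream):
    # The backward pass multiplies by the transpose: grad_x = W^T upstream.
    rows, cols = len(W), len(W[0])
    return [sum(W[i][j] * upstream[i] for i in range(rows)) for j in range(cols)]

def f(x, y):
    # f(x, y) = (x + y) * x, so x fans out to two consumers.
    return mul_forward(add_forward(x, y), x)

def grad_f(x, y):
    s = add_forward(x, y)                   # forward activation, cached for backward
    g_s, g_x_mul = mul_backward(1.0, s, x)  # upstream of the loss itself is 1
    g_x_add, g_y = add_backward(g_s)
    g_x = copy_backward(g_x_add, g_x_mul)   # merge the two gradient paths into x
    return g_x, g_y

print(grad_f(3.0, 2.0))  # (8.0, 3.0), matching df/dx = 2x + y and df/dy = x

W = [[1, 2, 3],
     [4, 5, 6]]
print(matvec_backward_x(W, [1.0, -1.0]))  # [-3.0, -3.0, -3.0]
```

Note that `grad_f` has to keep `s` from the forward pass in order to run `mul_backward`; this is the activation cache in miniature. The backward pass visits the same nodes as the forward pass in reverse order, which is why the two costs are of the same order.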