Neural Networks: From Fundamentals to Modern AI · Math and tools: tensors, gradients, Python, NumPy

Derivative and chain rule — intuition of the direction of fastest growth

Math and tools: tensors, gradients, Python, NumPy

Introduction

The derivative of a function at a point answers one question: "if I nudge x a bit right now, by how much and in which direction will y change?". It is a measure of the "steepness" of the graph at that specific point — a large positive derivative means a sharp climb, a large negative one a sharp drop, and zero means an instant of flatness (a peak, a valley, or an inflection point). The chain rule is the recipe for the derivative of a function composed of other functions — and it is exactly what lets us compute the gradient inside a neural network, where the output is a complicated composition of layers. No epsilon-delta in this lesson — only the intuition of "how fast something grows" plus an example showing that a chain of functions yields a chain of derivatives via multiplication.