Gradient Descent Intuition

Build intuition for gradient descent, learning rate, and unstable updates without turning the lesson into a math lecture.

25 min · How Models Learn · easy · 100 XP

Key Ideas

Work through these sections in order. Each one builds the mental model you need before the checkpoint questions will feel easy.

A common mental model is a landscape where height represents loss. Gradient descent tries to move the model downhill toward a region with lower error. You do not need to picture the exact mathematics to understand the logic of repeatedly stepping toward better performance.

The important part is that each update is local. The optimizer looks at the current situation and takes a step that should help from there, then checks again after the step.
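
The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a real training setup: the one-parameter loss `(w - 3) ** 2`, the starting point, and the names `loss`, `gradient`, and `gradient_descent` are all assumptions chosen so the "landscape" is a simple bowl with its lowest point at `w = 3`. The `learning_rate` argument is the size of each step.

```python
def loss(w):
    # Hypothetical loss landscape: a bowl whose lowest point is at w = 3.
    return (w - 3.0) ** 2


def gradient(w):
    # Slope of the bowl at the current position (derivative of the loss).
    return 2.0 * (w - 3.0)


def gradient_descent(start, learning_rate, steps):
    # Repeatedly look at the local slope, step downhill, then check again.
    w = start
    history = [loss(w)]
    for _ in range(steps):
        w = w - learning_rate * gradient(w)  # local update from where we are now
        history.append(loss(w))
    return w, history


final_w, losses = gradient_descent(start=0.0, learning_rate=0.1, steps=50)
print("final w:", final_w)
print("first losses:", losses[:3])
print("last loss:", losses[-1])
```

Each pass through the loop uses only the slope at the current `w`, which is the sense in which every update is local. Rerunning the sketch with a different `learning_rate` is one way to see how step size changes the path.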

Check Your Understanding

These checkpoints reinforce the lesson you just read. If one feels fuzzy, reopen the relevant section above before trying again.

3 checkpoints

Task 1

Spot the overshoot

Choose the outcome that best matches a learning rate that is too high.

What is the most likely result of a learning rate that is much too high?

Task 2

Classify the training behavior

Match each training trace to the most likely learning-rate behavior.

For each situation, choose the best interpretation.

After 100 steps, the loss is almost unchanged.

The loss rises and falls sharply between successive updates.

The loss trends downward and then gradually plateaus.

Task 3

Explain learning rate in builder language

Write a short definition of learning rate without using formal notation.

In plain language, what does the learning rate control?
