A common mental model is a landscape where height represents loss. Gradient descent tries to move the model downhill toward a region with lower error. You do not need to picture the exact mathematics to understand the logic of repeatedly stepping toward better performance.
The important part is that each update is local. The optimizer looks at the current situation and takes a step that should help from there, then checks again after the step.