How it works
Fit y = mx + b by minimizing mean squared error — the average of (pred − actual)² across all points.
Start with a flat guess, compute how wrong it is, nudge m and b downhill. The gradient tells you which way is down. Repeat until it converges.