Stochastic gradient descent
= an iterative ⚙️ Algorithm for optimizing an objective function with suitable smoothness properties; "stochastic" because each step estimates the gradient from a randomly sampled example (or mini-batch) instead of the full dataset
How?
- Gradient computation: the gradient is the vector pointing in the direction of the steepest increase of the function; in SGD it is estimated on a single example or mini-batch
- Gradient descent: move in the direction opposite to the gradient, i.e. θ ← θ − η ∇L(θ)
- η: learning rate, hyperparameter → chosen by the algorithm designer; controls the step size
Example
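A minimal sketch in Python/NumPy: SGD fitting a one-dimensional linear model y = w·x + b with squared loss. The synthetic data, learning rate η = 0.1, and epoch count are illustrative assumptions, not part of this note.

```python
import numpy as np

# Synthetic data for y = 3.0*x + 0.5 plus noise (assumed for illustration).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=200)
y = 3.0 * X + 0.5 + rng.normal(0, 0.1, size=200)

w, b = 0.0, 0.0   # parameters to learn
eta = 0.1         # learning rate η, a hyperparameter
epochs = 20

for epoch in range(epochs):
    for i in rng.permutation(len(X)):   # shuffled single examples: the "stochastic" part
        error = (w * X[i] + b) - y[i]   # prediction error on one example
        grad_w = error * X[i]           # gradient of 0.5*error^2 w.r.t. w
        grad_b = error                  # gradient of 0.5*error^2 w.r.t. b
        w -= eta * grad_w               # step opposite to the gradient
        b -= eta * grad_b

print(f"learned w={w:.3f}, b={b:.3f}")  # should approach w=3.0, b=0.5
```

Updating on one example at a time makes each step cheap but noisy; averaging the gradient over a mini-batch trades a little cost per step for a smoother descent.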
🔗 Links