Stochastic gradient descent

= an iterative ⚙️ Algorithm for optimizing an objective function with suitable smoothness properties (e.g. differentiable), where the gradient is estimated from a randomly sampled example or mini-batch instead of the full dataset

How?

  1. Gradient determination: compute the gradient, the vector pointing in the direction of the steepest increase of the function → in SGD estimated on a single random sample or mini-batch
  2. Gradient descent: take a step in the opposite direction of the gradient (see the update rule below)
    • η: Learning rate, hyperparameter → chosen by the algorithm designer; controls the step size
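
Update rule (standard notation, not spelled out in this note; θ = parameters, J = objective evaluated on the sampled example/mini-batch):

  θ ← θ − η · ∇θ J(θ)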

Example
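
A minimal sketch in Python, assuming a linear regression objective with squared error; the function name, hyperparameter defaults, and toy data are illustrative, not from this note:

```python
import numpy as np

def sgd_linear_regression(X, y, lr=0.01, epochs=100, seed=0):
    """Fit y ≈ X @ w + b with SGD on squared error (one sample per step)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(n):      # visit samples in random order each epoch
            pred = X[i] @ w + b
            err = pred - y[i]
            grad_w = 2 * err * X[i]       # gradient of (pred - y)² w.r.t. w
            grad_b = 2 * err              # ... and w.r.t. b
            w -= lr * grad_w              # step opposite to the gradient (η = lr)
            b -= lr * grad_b
    return w, b

# Toy usage: recover w ≈ 3, b ≈ -1 from noisy data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = 3 * X[:, 0] - 1 + 0.1 * rng.normal(size=200)
w, b = sgd_linear_regression(X, y)
print(w, b)
```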

🔗 Links

🚶 Logistic Regression