Gradient Descent
An optimisation algorithm that iteratively updates model parameters to reduce prediction error.
What Is Gradient Descent
Gradient descent is a foundational optimisation method used to improve machine learning models. It updates weights in the direction that most reduces the loss function, helping models learn patterns that better match expected outputs.
How Gradient Descent Works
During training, the model computes its prediction error, then calculates gradients that indicate how sensitive that error is to each parameter. Each parameter is then moved a small step in the direction opposite its gradient, scaled by the learning rate. Repeating this process over many iterations progressively lowers the overall loss.
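The update loop described above can be sketched in a few lines. This is a minimal illustration (the data, learning rate, and iteration count are hypothetical, not from the source): fitting y = w * x by minimising mean squared error.

```python
import numpy as np

# Illustrative toy data: the true weight is 2.0.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x

w = 0.0    # initial parameter
lr = 0.05  # learning rate (an assumed value)

for _ in range(200):
    pred = w * x
    error = pred - y
    grad = 2.0 * np.mean(error * x)  # d(MSE)/dw
    w -= lr * grad                   # step opposite the gradient
```

After enough iterations, w settles close to the true value of 2.0; shrinking the learning rate slows convergence, while too large a value makes the updates diverge.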
Types of Gradient Descent Algorithms
- Batch gradient descent: uses the full dataset for each update.
- Stochastic gradient descent (SGD): updates after each sample.
- Mini-batch gradient descent: updates using small subsets of data.
- Adaptive variants: methods such as Adam and RMSprop adjust update behaviour dynamically.
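The first three variants differ only in how much data feeds each update. One way to see this is a mini-batch loop on the same kind of one-dimensional least-squares problem (all values here are illustrative assumptions, not from the source):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=100)
y = 3.0 * x  # true weight is 3.0

w, lr, batch_size = 0.0, 0.1, 10

for epoch in range(50):
    # Shuffle each epoch so batches vary between passes.
    order = rng.permutation(len(x))
    for start in range(0, len(x), batch_size):
        idx = order[start:start + batch_size]
        # Gradient of MSE computed on this mini-batch only.
        grad = 2.0 * np.mean((w * x[idx] - y[idx]) * x[idx])
        w -= lr * grad
```

Setting batch_size to len(x) recovers batch gradient descent, while batch_size = 1 recovers SGD; mini-batches sit between the two, trading gradient noise against per-update cost.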
Role in Training Neural Networks
Neural networks with millions or billions of parameters rely on gradient descent and backpropagation to converge toward useful representations. Without this optimisation cycle, modern deep learning systems for translation, speech, and language understanding could not be trained effectively.
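To make the pairing of backpropagation and gradient descent concrete, here is a hedged sketch of a one-hidden-layer network trained from scratch. The architecture, target function, and hyperparameters are all assumptions chosen for illustration, not taken from the source:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(64, 1))
Y = np.sin(np.pi * X)  # illustrative target function

# One hidden layer of 8 tanh units (an assumed, tiny architecture).
W1 = rng.normal(0.0, 0.5, (1, 8)); b1 = np.zeros(8)
W2 = rng.normal(0.0, 0.5, (8, 1)); b2 = np.zeros(1)
lr = 0.1

for _ in range(2000):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - Y
    # Backward pass: chain rule, layer by layer.
    dpred = 2.0 * err / len(X)
    dW2 = h.T @ dpred; db2 = dpred.sum(0)
    dh = (dpred @ W2.T) * (1.0 - h ** 2)  # tanh derivative
    dW1 = X.T @ dh; db1 = dh.sum(0)
    # Gradient descent step on every parameter.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

pred = np.tanh(X @ W1 + b1) @ W2 + b2
loss = float(np.mean((pred - Y) ** 2))
```

The same forward/backward/update cycle scales, in principle, from this toy network to models with billions of parameters; frameworks simply automate the backward pass.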
Importance for Machine Learning Systems
Gradient descent directly affects model quality, training stability, and compute efficiency. Choosing suitable learning rates, batch sizes, and optimiser strategies is critical for achieving reliable performance in production AI systems, including document-level language and translation models.
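Among the optimiser strategies mentioned above, Adam is a common choice. The sketch below shows its standard update rule on a simple quadratic objective; the objective, learning rate, and iteration count are illustrative assumptions:

```python
import math

# Minimise f(w) = (w - 4)^2 with the Adam update rule.
w, m, v = 0.0, 0.0, 0.0
lr, b1, b2, eps = 0.1, 0.9, 0.999, 1e-8  # typical defaults

for t in range(1, 1001):
    g = 2.0 * (w - 4.0)              # gradient of the objective
    m = b1 * m + (1.0 - b1) * g      # first-moment (momentum) estimate
    v = b2 * v + (1.0 - b2) * g * g  # second-moment estimate
    mhat = m / (1.0 - b1 ** t)       # bias correction
    vhat = v / (1.0 - b2 ** t)
    w -= lr * mhat / (math.sqrt(vhat) + eps)
```

Because the step size is rescaled by the running second-moment estimate, Adam is less sensitive to the raw gradient scale than plain gradient descent, which is one reason adaptive methods are popular defaults in production training pipelines.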