Gradient Descent (GD)

Shivam Singh
2 min read · Nov 19, 2024


AI-generated image of gradient descent in 3D space

Gradient Descent (GD) is fundamental to training neural networks: it enables them to learn from data by updating the weights of the network to minimise the error between the predicted output and the actual target [RHW86]. GD is an optimisation algorithm that minimises the loss of a predictive model by iterating over a training dataset. Back-propagation, an automatic differentiation algorithm, computes the gradient of the loss with respect to each weight in the network. Together, GD and back-propagation train neural network models by iteratively updating the model’s parameters to reduce errors.
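To make the update rule concrete, here is a minimal sketch (not from the original post) of GD minimising a simple quadratic loss. The loss function, learning rate, and starting point are illustrative choices:

```python
# Minimal gradient descent sketch: minimise L(w) = (w - 3)^2.
# The analytic gradient is dL/dw = 2 * (w - 3).

def loss(w):
    return (w - 3.0) ** 2

def gradient(w):
    return 2.0 * (w - 3.0)

w = 0.0              # arbitrary starting point
learning_rate = 0.1  # step size (eta)

for step in range(50):
    w -= learning_rate * gradient(w)  # w <- w - eta * dL/dw

print(w, loss(w))  # w converges towards 3, where the loss is minimal
```

Each step moves the parameter a small distance downhill along the negative gradient; the same rule, applied to every weight at once, is what trains a neural network.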

Goodfellow et al. [GBC16] explain the learning process of neural networks as follows. Consider a simple feedforward neural network with an input layer, one hidden layer, and an output layer. Backpropagation computes the gradient of the loss function with respect to each weight. The process can be divided into two phases: the forward pass and the backward pass.

Mathematical calculation during the forward pass
Mathematical calculation during the backward pass
Weight and bias update after the backward pass
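The three figures above are not reproduced as text, so here is a sketch of what those computations typically look like for a one-hidden-layer network, using assumed notation: x is the input, W1, b1 and W2, b2 are the weights and biases, σ is the activation, ℓ is the loss, and η is the learning rate.

```latex
% Forward pass, backward pass (chain rule), and parameter update
\begin{aligned}
% Forward pass:
\mathbf{h} &= \sigma(W_1 \mathbf{x} + \mathbf{b}_1), \qquad
\hat{\mathbf{y}} = W_2 \mathbf{h} + \mathbf{b}_2, \qquad
L = \ell(\hat{\mathbf{y}}, \mathbf{y}) \\[4pt]
% Backward pass:
\frac{\partial L}{\partial W_2} &= \frac{\partial L}{\partial \hat{\mathbf{y}}}\,\mathbf{h}^{\top}, \qquad
\frac{\partial L}{\partial \mathbf{b}_2} = \frac{\partial L}{\partial \hat{\mathbf{y}}}, \qquad
\frac{\partial L}{\partial \mathbf{h}} = W_2^{\top}\,\frac{\partial L}{\partial \hat{\mathbf{y}}} \\[4pt]
\frac{\partial L}{\partial W_1} &= \left(\frac{\partial L}{\partial \mathbf{h}} \odot \sigma'(W_1\mathbf{x}+\mathbf{b}_1)\right)\mathbf{x}^{\top}, \qquad
\frac{\partial L}{\partial \mathbf{b}_1} = \frac{\partial L}{\partial \mathbf{h}} \odot \sigma'(W_1\mathbf{x}+\mathbf{b}_1) \\[4pt]
% Weight and bias update with learning rate eta:
W_i &\leftarrow W_i - \eta\,\frac{\partial L}{\partial W_i}, \qquad
\mathbf{b}_i \leftarrow \mathbf{b}_i - \eta\,\frac{\partial L}{\partial \mathbf{b}_i}, \quad i \in \{1, 2\}
\end{aligned}
```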

GD combined with back-propagation is a powerful and efficient method for training neural networks. By iteratively updating the network’s weights to minimise the loss function, it enables the network to learn complex patterns in data.
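To tie the two ideas together, the following NumPy sketch runs the full loop (forward pass, backward pass, weight update) for the same one-hidden-layer setup. The layer sizes, tanh activation, and squared-error loss are assumptions for illustration, not details from the article:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 4 inputs, 8 hidden units, 1 output.
W1, b1 = rng.normal(size=(8, 4)) * 0.1, np.zeros((8, 1))
W2, b2 = rng.normal(size=(1, 8)) * 0.1, np.zeros((1, 1))
eta = 0.05  # learning rate

x = rng.normal(size=(4, 1))  # one training example
y = np.array([[1.0]])        # its target

for step in range(200):
    # Forward pass: compute activations and the loss.
    z1 = W1 @ x + b1
    h = np.tanh(z1)               # sigma = tanh
    y_hat = W2 @ h + b2
    loss = 0.5 * ((y_hat - y) ** 2).item()

    # Backward pass: propagate gradients with the chain rule.
    dL_dyhat = y_hat - y                       # dL/dyhat for squared error
    dL_dW2 = dL_dyhat @ h.T
    dL_db2 = dL_dyhat
    dL_dh = W2.T @ dL_dyhat
    dL_dz1 = dL_dh * (1.0 - np.tanh(z1) ** 2)  # tanh'(z) = 1 - tanh(z)^2
    dL_dW1 = dL_dz1 @ x.T
    dL_db1 = dL_dz1

    # Gradient descent update on every parameter.
    W1 -= eta * dL_dW1; b1 -= eta * dL_db1
    W2 -= eta * dL_dW2; b2 -= eta * dL_db2

print(f"final loss: {loss:.6f}")  # should be close to zero
```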

Happy coding ..!!


Written by Shivam Singh

AI Researcher - Generative AI, Deep Learning, Computer Vision