What is Backpropagation? 

Related articles:
Explain the basic architecture of a Neural Network, model training and key hyper-parameters
What is an activation function? What are the different types? Discuss their pros and cons
Top 20 Deep Learning Interview Questions with detailed Answers (All free)

[Figure: Backpropagation, Deep Learning Training (Source: AIML.com)]

Backpropagation

Backpropagation, short for "backward propagation of errors," is a fundamental concept in the field of artificial neural networks, particularly in the context of training deep learning models. During backpropagation, the parameters of a neural network (i.e., its weights and biases) are updated using a gradient descent optimization algorithm, so that on each iteration the parameters move one step closer to a minimum of the cost function. This reduces the error between the predicted and the actual output, thereby improving the predictive accuracy of the network.

Backpropagation consists of two main steps:

  1. Calculate gradients of the loss with respect to all model parameters
    In the backward pass, the algorithm works backward through the network to compute the gradients of the error with respect to the weights and biases. This is done using the chain rule from calculus, which allows for the calculation of how changes in the network's parameters affect the error.
  2. Update the network weights using an optimization algorithm
    The computed gradients are used to adjust the weights and biases in the network, aiming to reduce the error (or loss). This is typically done using optimization algorithms such as stochastic gradient descent (SGD) or its variants, such as Adam or RMSprop. Both steps are illustrated in the sketch below.
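
Here is a minimal NumPy sketch of one backpropagation step, assuming a hypothetical two-layer network with a sigmoid hidden layer and a mean-squared-error loss. All names (X, y, W1, b1, W2, b2, lr) are illustrative, not from the article:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))          # 4 samples, 3 input features
y = rng.normal(size=(4, 1))          # 4 targets

W1, b1 = rng.normal(size=(3, 5)), np.zeros(5)   # hidden layer (5 units)
W2, b2 = rng.normal(size=(5, 1)), np.zeros(1)   # output layer
lr = 0.1                                        # learning rate

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# ---- forward pass ----
z1 = X @ W1 + b1
a1 = sigmoid(z1)
y_hat = a1 @ W2 + b2                 # linear output
loss = np.mean((y_hat - y) ** 2)     # MSE loss

# ---- step 1: backward pass, chain rule from output layer to input layer ----
d_yhat = 2 * (y_hat - y) / len(X)    # dL/dy_hat
dW2 = a1.T @ d_yhat                  # dL/dW2
db2 = d_yhat.sum(axis=0)             # dL/db2
d_a1 = d_yhat @ W2.T                 # propagate the error back through W2
d_z1 = d_a1 * a1 * (1 - a1)          # ...and through the sigmoid derivative
dW1 = X.T @ d_z1                     # dL/dW1
db1 = d_z1.sum(axis=0)               # dL/db1

# ---- step 2: gradient-descent update ----
W1 -= lr * dW1; b1 -= lr * db1
W2 -= lr * dW2; b2 -= lr * db2
```

In practice, a framework such as PyTorch or TensorFlow computes these gradients automatically, but the bookkeeping it performs mirrors the manual chain-rule steps above.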
[Figure: Backpropagation in a Neural Network (Source: AIML.com Research)]

This process is referred to as Backpropagation because it starts at the output layer and uses the chain rule to compute derivatives as it works its way backward to the input layer.
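
To make the chain rule concrete, the gradient of the loss with respect to the first-layer weights in the two-layer sketch above decomposes as follows (using that snippet's illustrative notation, where $z_1 = XW_1 + b_1$, $a_1 = \sigma(z_1)$, and $\hat{y} = a_1 W_2 + b_2$):

$$
\frac{\partial L}{\partial W_1}
= \frac{\partial L}{\partial \hat{y}}
\cdot \frac{\partial \hat{y}}{\partial a_1}
\cdot \frac{\partial a_1}{\partial z_1}
\cdot \frac{\partial z_1}{\partial W_1}
$$

Each factor is a simple local derivative, and the leading factors are shared with the gradients of later layers, so computing them once per backward pass keeps the whole computation efficient.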

After each step of backpropagation, another step of forward propagation occurs, during which the input data is passed through the network using the updated weights and biases. If gradient descent is functioning correctly (for example, with a suitably chosen learning rate), the value of the cost function should be lower after each update than it was before. A minimal training loop that alternates these two phases is sketched below.
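
This is a self-contained sketch of that alternation, assuming the same kind of hypothetical two-layer sigmoid network and a toy classification target; all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 3))
y = (X.sum(axis=1, keepdims=True) > 0).astype(float)   # toy targets

W1, b1 = rng.normal(size=(3, 8)) * 0.5, np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)) * 0.5, np.zeros(1)
lr = 0.5
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for step in range(200):
    # forward pass with the current (most recently updated) weights
    a1 = sigmoid(X @ W1 + b1)
    y_hat = sigmoid(a1 @ W2 + b2)
    loss = np.mean((y_hat - y) ** 2)
    # backward pass: chain rule from the output back to the input
    d_out = 2 * (y_hat - y) / len(X) * y_hat * (1 - y_hat)
    dW2, db2 = a1.T @ d_out, d_out.sum(axis=0)
    d_h = (d_out @ W2.T) * a1 * (1 - a1)
    dW1, db1 = X.T @ d_h, d_h.sum(axis=0)
    # update, then the next iteration repeats the forward pass
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
    if step % 50 == 0:
        print(step, round(float(loss), 4))   # the loss should trend downward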

For a complete picture of neural network training, please refer to this article: Describe briefly the training process of a Neural Network model

Video Explanation

  • The three-part Backpropagation video series by Deep Lizard explains the intuition and mathematics behind backpropagation succinctly (Total Runtime: 33 mins)
[YouTube video: Backpropagation video series by Deep Lizard]
  • The three-part Backpropagation video series by 3Blue1Brown explains the concept in more depth, starting from gradient descent and moving through the mechanics of backprop and the math behind it (Total Runtime: 43 mins)
[YouTube video: Backpropagation video series by 3Blue1Brown]
  • If you want a hands-on understanding of Backpropagation, check out the video by Andrej Karpathy, former AI Director at Tesla. In this video, Andrej explains the basics of backpropagation by working out an example in a Jupyter Notebook. You'll never forget backprop once you do this. (Total Runtime: 2.5 hrs)
[YouTube video: Backpropagation in a Jupyter Notebook, by Andrej Karpathy, former AI Director at Tesla]

Help us improve this post by suggesting in comments below:

– modifications to the text, and infographics
– video resources that offer clear explanations for this question
– code snippets and case studies relevant to this concept
– online blogs, and research publications that are a “must read” on this topic
