The softmax function is widely used in deep learning, but what does its gradient look like? In this tutorial we will discuss this topic for deep learning beginners.
The equation of the softmax function
The formula of the softmax function is:

ai = exp(zi) / (exp(z1) + exp(z2) + … + exp(zn)), for i = 1, 2, …, n

where z1, z2, …, zn are the inputs (logits) and a1 + a2 + … + an = 1.
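As a concrete illustration, here is a minimal NumPy sketch of the softmax function (the names z, a and softmax are our own choices, not from the original post); subtracting the maximum value before exponentiating is a common trick for numerical stability.

import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; it does not change the result.
    e = np.exp(z - np.max(z))
    return e / np.sum(e)

z = np.array([2.0, 1.0, 0.1])
a = softmax(z)
print(a)        # approximately [0.659 0.242 0.099]
print(a.sum())  # 1.0, matching a1 + a2 + … + an = 1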
The gradient of the softmax function
The gradient of the softmax function is:

∂ai/∂zi = ai(1 − ai) for the diagonal terms, and ∂ai/∂zj = −ai·aj for i ≠ j.
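These partial derivatives form a Jacobian matrix. Below is a small sketch, assuming the softmax output a from the previous example, that builds this matrix with NumPy:

import numpy as np

def softmax_jacobian(a):
    # J[i, j] = ai * (1 - aj) when i == j, and -ai * aj otherwise,
    # which can be written compactly as diag(a) - outer(a, a).
    return np.diag(a) - np.outer(a, a)

a = np.array([0.659, 0.242, 0.099])
J = softmax_jacobian(a)
print(J)  # diagonal entries are ai * (1 - ai), off-diagonal entries are -ai * aj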
From the gradient above, we can see that the softmax function may cause a vanishing gradient problem.

For example, if ai ≈ 1 or ai ≈ 0, the gradient ai(1 − ai) will be close to 0, so the weights feeding into the softmax will barely be updated during backpropagation. If you plan to use the softmax function, you should keep this point in mind.
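To see this numerically, we can evaluate the diagonal term ai(1 − ai) for a few illustrative output values (chosen for this example, not from the original post):

for ai in (0.999, 0.5, 0.001):
    print(ai, ai * (1.0 - ai))
# 0.999 -> 0.000999  (saturated output, almost no gradient)
# 0.5   -> 0.25      (largest gradient)
# 0.001 -> 0.000999  (saturated output, almost no gradient)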