Understand Softmax Function Gradient: A Beginner Guide – Deep Learning Tutorial

By | January 15, 2020

The softmax function is widely used in deep learning, but what does its gradient look like? This tutorial discusses that question for deep learning beginners.

[Figure: softmax function examples]

The equation of softmax function

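The formula of the softmax function is:

$$a_i = \frac{e^{z_i}}{\sum_{j=1}^{n} e^{z_j}}, \quad i = 1, \dots, n$$

where $z_1, \dots, z_n$ are the inputs to softmax, $a_1, \dots, a_n$ are its outputs, and $a_1 + a_2 + \dots + a_n = 1$.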
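As a minimal sketch, softmax (each output is the exponential of one input divided by the sum of the exponentials of all inputs) can be written in plain Python as follows. The function name `softmax` and the argument name `z` are assumptions chosen for illustration. Subtracting the maximum input first is a common numerical-stability trick and does not change the result.

```python
import math

def softmax(z):
    """Return the softmax outputs for a list of real-valued inputs z."""
    # Shift by the largest input so math.exp never overflows;
    # the shift cancels in the ratio, so the outputs are unchanged.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

a = softmax([1.0, 2.0, 3.0])
print(a)       # three positive values, the largest one for the input 3.0
print(sum(a))  # very close to 1.0
```

The outputs are all positive and sum to 1, which is why softmax is often read as a probability distribution over classes.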

The gradient of softmax function

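The gradient of the softmax function with respect to its inputs is:

$$\frac{\partial a_i}{\partial z_j} = \begin{cases} a_i (1 - a_i), & i = j \\ -a_i a_j, & i \neq j \end{cases}$$

where $z_j$ is the $j$-th input to softmax and $a_i$ is its $i$-th output.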
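To make the gradient concrete, here is a small sketch that builds the full matrix of partial derivatives (the Jacobian) from the rule "a_i (1 - a_i) when i = j, and -a_i a_j otherwise", then checks it against a finite-difference estimate. The helper names `softmax_jacobian` and `numeric_jacobian` are hypothetical, and the check is only a sanity test, not a proof.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian(z):
    """J[i][j] = d a_i / d z_j = a_i * (delta_ij - a_j)."""
    a = softmax(z)
    n = len(a)
    return [[a[i] * ((1.0 if i == j else 0.0) - a[j]) for j in range(n)]
            for i in range(n)]

def numeric_jacobian(z, eps=1e-6):
    """Central finite-difference estimate of the same Jacobian."""
    n = len(z)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        z_plus = list(z)
        z_plus[j] += eps
        z_minus = list(z)
        z_minus[j] -= eps
        a_plus, a_minus = softmax(z_plus), softmax(z_minus)
        for i in range(n):
            J[i][j] = (a_plus[i] - a_minus[i]) / (2 * eps)
    return J

z = [1.0, 2.0, 3.0]
J = softmax_jacobian(z)
J_num = numeric_jacobian(z)
max_err = max(abs(J[i][j] - J_num[i][j])
              for i in range(len(z)) for j in range(len(z)))
print(max_err)  # small: the analytic and numeric Jacobians agree closely
```

Each column of the Jacobian sums to zero, because the outputs always sum to 1: raising one input can only move probability between outputs.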

From the above, we can see that softmax may cause the vanishing gradient problem.

For example, if $a_i \approx 1$ or $a_i \approx 0$, the gradient of softmax is close to 0, because the term $a_i (1 - a_i)$ is close to 0. In that case, the weights that feed into the softmax are barely updated during backpropagation. If you plan to use the softmax function, keep this point in mind.
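The saturation effect can be illustrated with a short sketch that compares the diagonal gradient terms a_i (1 - a_i) for balanced inputs and for inputs where one value dominates. The input values and the helper name `diag_gradient` are made up for illustration.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def diag_gradient(z):
    """Diagonal gradient terms d a_i / d z_i = a_i * (1 - a_i)."""
    return [ai * (1.0 - ai) for ai in softmax(z)]

balanced = [0.0, 0.0, 0.0]     # every output equals 1/3
saturated = [20.0, 0.0, 0.0]   # first output is almost 1, the others almost 0

print(diag_gradient(balanced))   # each term is 2/9, about 0.222
print(diag_gradient(saturated))  # every term is tiny, on the order of 1e-9
```

With balanced inputs the gradient is comfortably large, while with one dominant input every term shrinks toward zero, which is the vanishing behavior described above.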
