Understand Why the Value of tf.gradients() on tf.nn.softmax() is 0 – TensorFlow Tutorial

January 17, 2020

The softmax function is differentiable; however, if you compute its gradient with tf.gradients(), you will get 0. In this tutorial, we explain the reason for TensorFlow beginners.

Look at the example code below:

import tensorflow as tf
import numpy as np

# A 2x2 variable; softmax is applied along axis 1, so each row of y sums to 1.
z = tf.Variable(np.array([[1, 2], [3, 2]]), dtype=tf.float32)
y = tf.nn.softmax(z, axis=1)

# tf.gradients() returns a list containing the gradient of y with respect to z.
r = tf.gradients(y, z)

init = tf.global_variables_initializer()
init_local = tf.local_variables_initializer()

with tf.Session() as sess:
    sess.run([init, init_local])
    print(sess.run([y]))
    print(sess.run([r]))

Run this Python code and you will get a result like this:

[array([[0.26894143, 0.7310586 ],
       [0.7310586 , 0.26894143]], dtype=float32)]
[[array([[0., 0.],
       [0., 0.]], dtype=float32)]]

Every element of the gradient returned by tf.gradients() is 0.

Why is the value of tf.gradients() 0?

To understand the reason, you should understand these two topics:

How to compute the gradient of the softmax function.

How tf.gradients() computes its return value in TensorFlow (see the sketch below).
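On the second point: tf.gradients(y, z) does not return the full Jacobian of y. When y has more than one element, it returns the gradient of the sum of all elements of y with respect to z. Here is a minimal sketch (assuming TensorFlow 1.x graph mode, like the example above) showing that tf.gradients(y, z) and tf.gradients(tf.reduce_sum(y), z) give the same result:

import tensorflow as tf
import numpy as np

z = tf.Variable(np.array([[1, 2], [3, 2]]), dtype=tf.float32)
y = tf.nn.softmax(z, axis=1)

# Gradient of a multi-element y: TensorFlow differentiates the sum of its elements.
r1 = tf.gradients(y, z)
r2 = tf.gradients(tf.reduce_sum(y), z)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(r1))  # [array([[0., 0.], [0., 0.]], dtype=float32)]
    print(sess.run(r2))  # the same: [array([[0., 0.], [0., 0.]], dtype=float32)]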

As to the code above, the softmax along axis 1 can be written as (the second row is analogous):

y_00 = exp(z_00) / (exp(z_00) + exp(z_01))
y_01 = exp(z_01) / (exp(z_00) + exp(z_01))

The gradient with respect to z_00 computed by tf.gradients() is the sum of the partial derivatives of all elements of y with respect to z_00. The second row (y_10 and y_11) does not depend on z_00, so only the first row contributes:

∂y_00/∂z_00 + ∂y_01/∂z_00 = y_00*(1 - y_00) - y_00*y_01 = y_00*(1 - (y_00 + y_01)) = y_00*(1 - 1) = 0

The same cancellation happens for every element of z, because each row of the softmax output always sums to 1.
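As a quick numeric check (not part of the original example), we can plug the softmax values printed above into this formula:

# First row of the softmax output printed above
y00, y01 = 0.26894143, 0.7310586

# d(y00)/d(z00) = y00 * (1 - y00),  d(y01)/d(z00) = -y00 * y01
grad_z00 = y00 * (1 - y00) - y00 * y01
print(grad_z00)  # about 0.0, because y00 + y01 = 1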

So the value of tf.gradients() on tf.nn.softmax() is 0.
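Note that softmax itself is still differentiable: if you ask tf.gradients() for the gradient of a single output element, such as y[0][0], the result is not zero. A minimal sketch under the same setup as above:

import tensorflow as tf
import numpy as np

z = tf.Variable(np.array([[1, 2], [3, 2]]), dtype=tf.float32)
y = tf.nn.softmax(z, axis=1)

# Gradient of one softmax output element, not of the whole tensor.
grad_y00 = tf.gradients(y[0][0], z)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(grad_y00))
    # approximately [[0.19661193, -0.19661193], [0., 0.]]
    # The first row sums to 0, which is why summing over all outputs
    # gives a zero gradient.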
