Understand Why the Value of tf.gradients() on tf.nn.softmax() is 0 – TensorFlow Tutorial

January 17, 2020

The softmax function is differentiable; however, if you compute its gradient with tf.gradients(), you will get 0. In this tutorial, we explain the reason for TensorFlow beginners.

Look at the example code below:

import tensorflow as tf
import numpy as np

# A 2x2 variable; softmax is applied along axis 1, so each row of y sums to 1.
z = tf.Variable(np.array([[1, 2], [3, 2]]), dtype=tf.float32)
y = tf.nn.softmax(z, axis=1)

# tf.gradients() returns a list containing the gradient of y with respect to z.
r = tf.gradients(y, z)

init = tf.global_variables_initializer()
init_local = tf.local_variables_initializer()

with tf.Session() as sess:
    sess.run([init, init_local])
    print(sess.run([y]))
    print(sess.run([r]))

Run this Python code and you will get a result like this:

[array([[0.26894143, 0.7310586 ],
       [0.7310586 , 0.26894143]], dtype=float32)]
[[array([[0., 0.],
       [0., 0.]], dtype=float32)]]

Every element of the gradient returned by tf.gradients() is 0.

Why is the value of tf.gradients() 0?

To understand the reason, you should understand these two topics:

How to compute the gradient of the softmax function.

How tf.gradients() computes its return value in TensorFlow (see the sketch below).
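On the second point: tf.gradients(y, z) does not return the full Jacobian of y. When y has more than one element, it returns the gradient of the sum of all elements of y with respect to z. Here is a minimal sketch (assuming TensorFlow 1.x graph mode, like the example above) showing that tf.gradients(y, z) and tf.gradients(tf.reduce_sum(y), z) give the same result:

import tensorflow as tf
import numpy as np

z = tf.Variable(np.array([[1, 2], [3, 2]]), dtype=tf.float32)
y = tf.nn.softmax(z, axis=1)

# Gradient of a multi-element y: TensorFlow differentiates the sum of its elements.
r1 = tf.gradients(y, z)
r2 = tf.gradients(tf.reduce_sum(y), z)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(r1))  # [array([[0., 0.], [0., 0.]], dtype=float32)]
    print(sess.run(r2))  # the same: [array([[0., 0.], [0., 0.]], dtype=float32)]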

As to the code above, the softmax along axis 1 can be written as (the second row is analogous):

y_00 = exp(z_00) / (exp(z_00) + exp(z_01))
y_01 = exp(z_01) / (exp(z_00) + exp(z_01))

The gradient with respect to z_00 computed by tf.gradients() is the sum of the partial derivatives of all elements of y with respect to z_00. The second row (y_10 and y_11) does not depend on z_00, so only the first row contributes:

∂y_00/∂z_00 + ∂y_01/∂z_00 = y_00*(1 - y_00) - y_00*y_01 = y_00*(1 - (y_00 + y_01)) = y_00*(1 - 1) = 0

The same cancellation happens for every element of z, because each row of the softmax output always sums to 1.
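As a quick numeric check (not part of the original example), we can plug the softmax values printed above into this formula:

# First row of the softmax output printed above
y00, y01 = 0.26894143, 0.7310586

# d(y00)/d(z00) = y00 * (1 - y00),  d(y01)/d(z00) = -y00 * y01
grad_z00 = y00 * (1 - y00) - y00 * y01
print(grad_z00)  # about 0.0, because y00 + y01 = 1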

So the value of tf.gradients() on tf.nn.softmax() is 0.
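Note that softmax itself is still differentiable: if you ask tf.gradients() for the gradient of a single output element, such as y[0][0], the result is not zero. A minimal sketch under the same setup as above:

import tensorflow as tf
import numpy as np

z = tf.Variable(np.array([[1, 2], [3, 2]]), dtype=tf.float32)
y = tf.nn.softmax(z, axis=1)

# Gradient of one softmax output element, not of the whole tensor.
grad_y00 = tf.gradients(y[0][0], z)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(grad_y00))
    # approximately [[0.19661193, -0.19661193], [0., 0.]]
    # The first row sums to 0, which is why summing over all outputs
    # gives a zero gradient.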
