Understand tf.gradients(): Compute Tensor Gradient for TensorFlow Beginners – TensorFlow Tutorial

admin

5 years ago

The TensorFlow tf.gradients() function returns the gradient of one tensor with respect to another. How should we interpret its result? In this tutorial, we will use some examples to help TensorFlow beginners understand and use it.

Syntax

tf.gradients(
    ys,
    xs,
    grad_ys=None,
    name='gradients',
    colocate_gradients_with_ops=False,
    gate_gradients=False,
    aggregation_method=None,
    stop_gradients=None
)

It constructs symbolic derivatives of the sum of ys with respect to each x in xs, where ys and xs are each a tensor or a list of tensors. The return value is a list of tensors with the same length as xs.
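To make the "sum of ys" part concrete, here is a minimal sketch in plain Python, without TensorFlow. The function f and the helper grad_of_sum are hypothetical names introduced only for this illustration. The helper estimates the derivative of the summed outputs with respect to each input by central differences, which is the quantity tf.gradients() builds symbolically.

```python
def f(x):
    """Hypothetical function mapping three inputs to two outputs."""
    x1, x2, x3 = x
    y1 = x1 * x2          # y1 depends on x1 and x2
    y2 = x2 + 3.0 * x3    # y2 depends on x2 and x3
    return [y1, y2]


def grad_of_sum(func, x, eps=1e-6):
    """Central-difference estimate of d(sum(func(x))) / dx_i for each i."""
    grads = []
    for i in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[i] += eps
        lo[i] -= eps
        grads.append((sum(func(hi)) - sum(func(lo))) / (2 * eps))
    return grads


x = [2.0, 5.0, 7.0]
g = grad_of_sum(f, x)

# Analytically, y1 + y2 = x1*x2 + x2 + 3*x3, so the gradient is
# [x2, x1 + 1, 3] = [5, 3, 3] at this point.
print([round(v, 4) for v in g])
```

Note that the result has one entry per input, not one per output: the contributions of y1 and y2 to x2 are added together.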

How to understand the result of tf.gradients()?

Suppose

y = [y1, y2], x = [x1, x2, x3]

y = f(x)

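To compute the gradient of y with respect to x, we can write:

g = tf.gradients(y, x)

which is the same as:

g = tf.gradients([y1, y2], [x1, x2, x3])

The result is a list g = [g1, g2, g3], with one gradient per element of x, where

g_i = ∂(y1 + y2) / ∂x_i,  for i = 1, 2, 3

In other words, the ys are summed before differentiating, so each g_i has the same shape as x_i.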

Here is an example:

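# Written for TensorFlow 1.x (graph mode); tf.Session does not exist in TensorFlow 2.x.
import tensorflow as tf
import numpy as np

# x has shape (2, 3) and z has shape (3, 2)
x = tf.Variable(np.array([[1, 1.2, 1.3],[1, 1.5, 1.2]]), dtype = tf.float32)
z = tf.Variable(np.array([[1, 1],[1.2, 1.2], [2, 1.7]]), dtype = tf.float32)

# y has shape (2, 2)
y = tf.matmul(x, z)

# g is a list with one tensor, the gradient of sum(y) with respect to x
g = tf.gradients(y, x)

init = tf.global_variables_initializer()
init_local = tf.local_variables_initializer()

with tf.Session() as sess:
    sess.run([init, init_local])
    # Wrapping g in another list adds the outer brackets seen in the output
    print(sess.run([g]))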

Running this code prints:

[[array([[2. , 2.4, 3.7],
       [2. , 2.4, 3.7]], dtype=float32)]]
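
The gradient has the same shape as x, and both rows are identical. This can be cross-checked by hand: since y = x · z and tf.gradients() differentiates the sum of all entries of y, the derivative with respect to x[i][k] is the sum of row k of z, regardless of i. The following is a minimal sketch of that check in plain Python, independent of TensorFlow; the variable names are chosen only for this illustration.

```python
# Same values as in the TensorFlow example above.
x = [[1.0, 1.2, 1.3], [1.0, 1.5, 1.2]]   # shape (2, 3)
z = [[1.0, 1.0], [1.2, 1.2], [2.0, 1.7]] # shape (3, 2)

# y[i][j] = sum_k x[i][k] * z[k][j], so
# d(sum of all y) / d x[i][k] = sum_j z[k][j], the sum of row k of z.
row_sums_of_z = [sum(row) for row in z]

# The same row of sums repeats for every row of x.
g = [list(row_sums_of_z) for _ in x]
print(g)
```

Up to floating-point rounding, the printed values agree with the array returned by sess.run() above.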