SVD Gradient May Be Different in NumPy and TensorFlow – TensorFlow Tutorial

By | January 7, 2020

SVD (Singular Value Decomposition) is commonly used in recommender systems. However, if you use it in your deep learning model, you should be aware that the gradient of SVD may be different in NumPy and TensorFlow.

As an example, we will compute the gradient of the SVD of a matrix in both libraries and compare the results.

Import libraries

import tensorflow as tf
import autograd.numpy as np
from autograd import grad
from autograd import elementwise_grad as egrad 
np.set_printoptions(precision=4)

Create a matrix

np_w = np.array([[[1,2,3,4,5],[6,7,8,9,0],[1,2,3,4,5],[6,7,8,9,0],[1,2,3,4,5]]], dtype = np.float32)
w = tf.convert_to_tensor(np_w)

Here np_w will be used in NumPy and w in TensorFlow. Their values are the same.
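Before running the SVD, it is worth noting a property of this particular matrix: its rows repeat, so it is rank-deficient. A quick NumPy-only check (a standalone sketch that redefines np_w locally) confirms this:

```python
import numpy as np

np_w = np.array([[[1, 2, 3, 4, 5],
                  [6, 7, 8, 9, 0],
                  [1, 2, 3, 4, 5],
                  [6, 7, 8, 9, 0],
                  [1, 2, 3, 4, 5]]], dtype=np.float32)

# Rows 0, 2, 4 are identical and rows 1, 3 are identical,
# so the 5x5 matrix has only two independent rows: rank 2.
print(np.linalg.matrix_rank(np_w[0]))  # 2

# Consequently, three of the five singular values are zero
# up to floating-point rounding noise.
s = np.linalg.svd(np_w[0], compute_uv=False)
print(s)
```

This rank deficiency is what makes the gradient comparison below interesting: in exact arithmetic, three singular values are exactly zero.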

Compute svd by numpy and tensorflow

svd in numpy

def computeSVD(w):
    # Only the singular values are needed here (compute_uv=False).
    S = np.linalg.svd(w, compute_uv=False)
    print(S)
    return S

More detail: Calculate Singular Value Decomposition (SVD) using Numpy

svd in tensorflow

s = tf.svd(w, compute_uv=False)

(Note: tf.svd and tf.gradients are TensorFlow 1.x APIs; in TensorFlow 2 the equivalents are tf.linalg.svd and tf.GradientTape.)

More detail: Compute Singular Value Decomposition (SVD) with TensorFlow

Compute gradient of svd in numpy and tensorflow

gradient of svd in numpy

grad_svd = egrad(computeSVD)
print(grad_svd(np_w))

gradient of svd in tensorflow

tf_svd_grad = tf.gradients(s, w)

Output all gradients

The result is:

Autograd ArrayBox with value [[2.3617e+01 8.1995e+00 1.0725e-15 2.0245e-16 2.8529e-33]]
[[[-0.7682  0.3316  0.2661  0.297   0.3752]
  [ 0.412   0.7662  0.4684 -0.1456 -0.0505]
  [ 0.359   0.2421 -0.5396  0.3314  0.6415]
  [ 0.2086 -0.1071  0.2293  0.8818 -0.339 ]
  [ 0.2601 -0.4825  0.605  -0.0565  0.5747]]]
Tensorflow s:

[[2.3617e+01 8.1995e+00 1.6554e-07 1.2132e-07 1.0287e-07]]
Tensorflow gradient:

[array([[[-0.7078,  0.4885,  0.0953,  0.2866,  0.4112],
        [ 0.0401, -0.1484,  0.6581,  0.645 , -0.3567],
        [ 0.326 , -0.263 , -0.3958,  0.6185,  0.5342],
        [ 0.5805,  0.8075,  0.0395,  0.0912, -0.0329],
        [ 0.2327, -0.1343,  0.6321, -0.3332,  0.6459]]], dtype=float32)]

The gradient of SVD may be different in NumPy and TensorFlow

From the result we can see:

  • The singular values agree up to floating-point noise: both give 2.3617e+01 and 8.1995e+00, and the remaining three values are numerically zero (NumPy reports values around 1e-15, TensorFlow around 1e-7).
  • The gradients of the singular values with respect to the input matrix are not the same.
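Why does this happen? For distinct, nonzero singular values, matrix calculus gives a closed form: the gradient of the sum of singular values with respect to W is U @ Vᵀ, where each pair of singular vectors (u_i, v_i) contributes u_i v_iᵀ. When a singular value is zero or repeated, the singular vectors are not unique, so this gradient is not well defined, and different implementations can legitimately return different matrices. The following standalone sketch (NumPy only, using the second test matrix from further below, where all singular values are distinct and nonzero) checks the closed form against a central finite difference:

```python
import numpy as np

w = np.array([[2, 2, 3, 4, 5],
              [6, 7, 2, 9, 0],
              [1, 2, 2, 4, 5],
              [6, 2, 8, 9, 0],
              [1, 2, 3, 4, 5]], dtype=np.float64)

u, s, vh = np.linalg.svd(w)
# For distinct nonzero singular values, d(sum of s)/dW = U @ Vh.
analytic = u @ vh

# Verify against a central finite difference of sum(s).
eps = 1e-6
fd = np.zeros_like(w)
for i in range(5):
    for j in range(5):
        wp, wm = w.copy(), w.copy()
        wp[i, j] += eps
        wm[i, j] -= eps
        fd[i, j] = (np.linalg.svd(wp, compute_uv=False).sum()
                    - np.linalg.svd(wm, compute_uv=False).sum()) / (2 * eps)

print(np.abs(analytic - fd).max())  # small: the closed form matches
```

This is why both frameworks return the same gradient for the second matrix: with all singular values distinct and nonzero, there is only one correct answer.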

However, if we set np_w to:

np_w = np.array([[[2,2,3,4,5],[6,7,2,9,0],[1,2,2,4,5],[6,2,8,9,0],[1,2,3,4,5]]], dtype = np.float32)

The result is:

Autograd ArrayBox with value [[20.6259  7.8468  5.5151  0.6618  0.3809]]
[[[ 0.6523  0.0744  0.159  -0.2673  0.6872]
  [ 0.3692  0.7379 -0.2595  0.4647 -0.1895]
  [-0.1738 -0.3197 -0.3244  0.6852  0.5411]
  [ 0.3227 -0.2817  0.7171  0.4887 -0.2518]
  [-0.5512  0.5181  0.5365  0.0651  0.3683]]]
Tensorflow s:

[[20.6259  7.8468  5.5151  0.6618  0.3809]]
Tensorflow gradient:

[array([[[ 0.6523,  0.0744,  0.159 , -0.2673,  0.6872],
        [ 0.3692,  0.7379, -0.2595,  0.4647, -0.1895],
        [-0.1738, -0.3197, -0.3244,  0.6852,  0.5411],
        [ 0.3227, -0.2817,  0.7171,  0.4887, -0.2518],
        [-0.5512,  0.5181,  0.5365,  0.0651,  0.3683]]], dtype=float32)]

This time the singular values and the gradients are the same.
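For the rank-deficient first matrix, only the part of the gradient coming from the two nonzero singular values is well defined; the contribution of the zero singular values depends on an arbitrary choice of singular vectors, which plausibly explains why NumPy and TensorFlow diverge there. A standalone NumPy sketch (the split into a well-defined and an arbitrary part is our illustration, not part of the original code) isolates the well-defined part:

```python
import numpy as np

w = np.array([[1, 2, 3, 4, 5],
              [6, 7, 8, 9, 0],
              [1, 2, 3, 4, 5],
              [6, 7, 8, 9, 0],
              [1, 2, 3, 4, 5]], dtype=np.float64)

u, s, vh = np.linalg.svd(w)

# Gradient of s[0] + s[1] only: unique, because these two singular
# values are distinct and nonzero.
well_defined = u[:, :2] @ vh[:2, :]

# Verify against a central finite difference of s[0] + s[1].
eps = 1e-6
fd = np.zeros_like(w)
for i in range(5):
    for j in range(5):
        wp, wm = w.copy(), w.copy()
        wp[i, j] += eps
        wm[i, j] -= eps
        sp = np.linalg.svd(wp, compute_uv=False)
        sm = np.linalg.svd(wm, compute_uv=False)
        fd[i, j] = (sp[:2].sum() - sm[:2].sum()) / (2 * eps)

print(np.abs(well_defined - fd).max())  # small: this part is stable
```

The practical takeaway: if your model differentiates through an SVD, make sure the singular values are distinct and bounded away from zero (for example by adding a small regularizer), or the gradient will depend on implementation details.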
