SVD (Singular Value Decomposition) is commonly used in recommender systems. However, if you use it inside a deep learning model, you should be aware that the gradient of SVD may differ between NumPy and TensorFlow.
In this example, we will compute the gradient of the SVD of a matrix in both libraries and compare the results.
Import libraries
import tensorflow as tf
import autograd.numpy as np
from autograd import grad
from autograd import elementwise_grad as egrad

np.set_printoptions(precision=4)
Create a matrix
np_w = np.array([[[1,2,3,4,5],[6,7,8,9,0],[1,2,3,4,5],[6,7,8,9,0],[1,2,3,4,5]]], dtype=np.float32)
w = tf.convert_to_tensor(np_w)
Here np_w will be processed with NumPy (autograd) and w will be processed with TensorFlow. Their values are the same.
Compute SVD with NumPy and TensorFlow
SVD in NumPy
def computeSVD(w):
    S = np.linalg.svd(w, compute_uv=False)
    print(S)
    return S
More detail: Calculate Singular Value Decomposition (SVD) using Numpy
SVD in TensorFlow
s = tf.svd(w, compute_uv = False)
More detail: Compute Singular Value Decomposition (SVD) with TensorFlow
Compute the gradient of SVD in NumPy and TensorFlow
Gradient of SVD in NumPy
grad_svd = egrad(computeSVD)
print(grad_svd(np_w))
Gradient of SVD in TensorFlow
tf_svd_grad = tf.gradients(s, w)
Output all gradients
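The code that prints the TensorFlow results is not shown above; here is a minimal sketch of how they could be produced, assuming TensorFlow 1.x graph mode (where tf.svd and tf.gradients build graph ops that are evaluated in a tf.Session):

# Assumption: TensorFlow 1.x, matching the tf.svd / tf.gradients calls above
with tf.Session() as sess:
    tf_s, tf_grad = sess.run([s, tf_svd_grad])
    print("Tensorflow s:", tf_s)
    print("Tensorflow gradient:", tf_grad)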
The result is:
Autograd ArrayBox with value [[2.3617e+01 8.1995e+00 1.0725e-15 2.0245e-16 2.8529e-33]]
[[[-0.7682  0.3316  0.2661  0.297   0.3752]
  [ 0.412   0.7662  0.4684 -0.1456 -0.0505]
  [ 0.359   0.2421 -0.5396  0.3314  0.6415]
  [ 0.2086 -0.1071  0.2293  0.8818 -0.339 ]
  [ 0.2601 -0.4825  0.605  -0.0565  0.5747]]]

Tensorflow s:
[[2.3617e+01 8.1995e+00 1.6554e-07 1.2132e-07 1.0287e-07]]

Tensorflow gradient:
[array([[[-0.7078,  0.4885,  0.0953,  0.2866,  0.4112],
         [ 0.0401, -0.1484,  0.6581,  0.645 , -0.3567],
         [ 0.326 , -0.263 , -0.3958,  0.6185,  0.5342],
         [ 0.5805,  0.8075,  0.0395,  0.0912, -0.0329],
         [ 0.2327, -0.1343,  0.6321, -0.3332,  0.6459]]], dtype=float32)]
From the result, we can see:
- The singular values are effectively the same in NumPy and TensorFlow: the two large values 2.3617e+01 and 8.1995e+00 match, and the remaining three are numerically zero (they differ only by floating-point noise, e.g. 1.0725e-15 in NumPy vs 1.6554e-07 in TensorFlow).
- The gradients of the singular values with respect to the input matrix are not the same in NumPy and TensorFlow.

This happens because the matrix is rank-deficient: its rows repeat, so three singular values are zero. When singular values are repeated or zero, the corresponding singular vectors are not unique, and the gradient of the SVD is not uniquely defined, so the two implementations can legitimately return different gradients. A quick check of the degeneracy is sketched below.
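A small NumPy check (a sketch, reusing the np_w defined above) that shows the rank deficiency directly:

sv = np.linalg.svd(np_w, compute_uv=False)
print(sv)                 # two large singular values, three numerically zero
print(np.sum(sv > 1e-6))  # 2: only two linearly independent rows, so rank 2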
However, if we set np_w to:
np_w = np.array([[[2,2,3,4,5],[6,7,2,9,0],[1,2,2,4,5],[6,2,8,9,0],[1,2,3,4,5]]], dtype = np.float32)
The result is:
Autograd ArrayBox with value [[20.6259  7.8468  5.5151  0.6618  0.3809]]
[[[ 0.6523  0.0744  0.159  -0.2673  0.6872]
  [ 0.3692  0.7379 -0.2595  0.4647 -0.1895]
  [-0.1738 -0.3197 -0.3244  0.6852  0.5411]
  [ 0.3227 -0.2817  0.7171  0.4887 -0.2518]
  [-0.5512  0.5181  0.5365  0.0651  0.3683]]]

Tensorflow s:
[[20.6259  7.8468  5.5151  0.6618  0.3809]]

Tensorflow gradient:
[array([[[ 0.6523,  0.0744,  0.159 , -0.2673,  0.6872],
         [ 0.3692,  0.7379, -0.2595,  0.4647, -0.1895],
         [-0.1738, -0.3197, -0.3244,  0.6852,  0.5411],
         [ 0.3227, -0.2817,  0.7171,  0.4887, -0.2518],
         [-0.5512,  0.5181,  0.5365,  0.0651,  0.3683]]], dtype=float32)]
Now all singular values are distinct, and both the singular values and the gradients are the same in NumPy and TensorFlow.
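As a sanity check (a sketch, not from the original post): when all singular values are distinct, the gradient of the sum of the singular values with respect to the matrix has the closed form U Vᵀ, so both frameworks should match that value. Assuming np_w is the second matrix defined above:

# drop the batch dimension and compute the full SVD
u, sv, vt = np.linalg.svd(np_w[0])
# d(sum of singular values)/dW = U @ V^T when all singular values are distinct;
# this reproduces the Autograd and TensorFlow gradients printed above
print(np.dot(u, vt))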