SoftArgmax is a differentiable approximation of Argmax. In this tutorial, we will introduce it for deep learning beginners.
Why do we need SoftArgmax?
We know the argmax operation does not support backprop and gradient computation in TensorFlow. Here is the tutorial:
TensorFlow tf.argmax() does not Support Backprop and Gradient Operation
In order to get an argmax-like operation that supports backprop and gradient computation, we need softargmax.
What is SoftArgmax?
The SoftArgmax operation is defined as:

$$\mathrm{softargmax}(x) = \sum_{i} i \cdot \frac{e^{\beta x_i}}{\sum_{j} e^{\beta x_j}}$$

where x = [x1, x2, x3, …, xn] and β ≥ 1.
You should notice: the output of softargmax is a float, not an integer index.
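To make the formula concrete, here is a minimal NumPy sketch of the same computation (the helper name softargmax_np is ours, for illustration):

import numpy as np

def softargmax_np(x, beta=1.0):
    # Softmax of beta * x, then the expected index under those weights
    e = np.exp(beta * np.asarray(x, dtype=np.float64))
    weights = e / e.sum()
    return np.dot(weights, np.arange(len(x)))

x = [1.0, 3.0, 2.0]                  # the hard argmax index is 1
print(softargmax_np(x, beta=1.0))    # ~1.1547, a soft estimate between index 1 and 2
print(softargmax_np(x, beta=10.0))   # ~1.0000, much closer to the hard argmax

A larger β makes the softmax weights closer to a one-hot vector, so the result approaches the true argmax index.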
We will use a TensorFlow example to show you how to implement softargmax.
Import libraries
import tensorflow as tf
import numpy as np
Create a softargmax function
We will create a TensorFlow softargmax function for a tensor with 2 dimensions, for example of shape (32, 25), computing softargmax along axis = 1.
def softargmax(alpha, time_step, beta=100.0):
    # Scale the logits by beta to sharpen the softmax distribution
    alpha = beta * alpha
    alpha = tf.nn.softmax(alpha, axis=1)
    # Index vector [0, 1, ..., time_step - 1], reshaped to (time_step, 1)
    indices = tf.range(time_step, dtype=tf.float32)
    indices_x = tf.reshape(indices, [-1, 1])
    # Expected index under the softmax weights, shape (batch_size, 1)
    outputs = tf.matmul(alpha, indices_x)
    return outputs
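Notice the default beta = 100.0: multiplying the logits by a large β pushes the softmax toward a one-hot vector, so the weighted index sum lands very close to the hard argmax. A smaller β blends neighboring indices more strongly but tends to give smoother gradients.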
Create a tensor to compute
v1 = tf.Variable(np.array([[5, 2, 1, 2], [3.9, 4, 1, 2], [2, 5.9, 6, 2]]), dtype=tf.float32, name='w1')
Here the shape of v1 is (3, 4).
Get the index of the maximum value in v1 along axis = 1.
indices = softargmax(v1, 4)
indices2 = tf.argmax(v1, axis=1)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    print(sess.run(indices))
    print(sess.run(indices2))
Running this code, we can find:
The softargmax value is:
[[0.       ]
 [0.9999546]
 [1.9999546]]
The argmax value is:
[0 1 2]
We can find that the softargmax values are floats and they are almost equal to the argmax indices.
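Because every step in softargmax (scaling, softmax, and matmul) is differentiable, we can backprop through it, which is the whole point. Here is a quick sketch of a gradient check, reusing v1 and the indices output from above:

# Gradient of the softargmax output with respect to v1.
# Note: argmax itself has no gradient, which is why we use softargmax instead.
grads = tf.gradients(indices, v1)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(grads))  # a (3, 4) float gradient array, confirming backprop works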