
Swish (SiLU) Activation Function in TensorFlow: An Introduction – TensorFlow Tutorial

In this tutorial, we will introduce how to implement the swish (SiLU) activation function in TensorFlow. There are some tips you should notice.

Swish (SiLU) activation function

This function is defined as:

\(f(x) = x \cdot \mathrm{sigmoid}(\beta x)\)

[Figure: graph of the swish activation function]
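
To see the definition in action, here is a minimal NumPy sketch (the helper name swish and the sample inputs are our own, for illustration only):

import numpy as np

def swish(x, beta=1.0):
    # x * sigmoid(beta * x), written as x / (1 + exp(-beta * x))
    return x / (1.0 + np.exp(-beta * x))

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
print(swish(x))            # beta = 1 is the SiLU case
print(swish(x, beta=2.0))  # a larger beta pushes swish toward ReLU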

The first derivative is:

\(f'(x) = \beta f(x) + \mathrm{sigmoid}(\beta x)\,(1 - \beta f(x))\)
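
As a quick sanity check, this formula can be compared against a numerical derivative (a finite-difference sketch with our own helper names, assuming NumPy):

import numpy as np

def swish(x, beta=1.0):
    return x / (1.0 + np.exp(-beta * x))

def swish_grad(x, beta=1.0):
    # f'(x) = beta * f(x) + sigmoid(beta * x) * (1 - beta * f(x))
    s = 1.0 / (1.0 + np.exp(-beta * x))
    f = x * s
    return beta * f + s * (1.0 - beta * f)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
eps = 1e-6
numeric = (swish(x + eps) - swish(x - eps)) / (2.0 * eps)
print(np.allclose(swish_grad(x), numeric, atol=1e-5))  # True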

How to implement swish in TensorFlow?

If you are using TensorFlow 2.x (2.4 or later), you can use tf.nn.silu() to compute it; if you are using TensorFlow 1.x, you can use tf.nn.swish().

Note that tf.nn.swish() can only compute x * sigmoid(x); the scaling factor β is fixed to 1.
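
For example, in TensorFlow 2.4 or later (an assumption; eager mode), both functions can be called directly on a tensor:

import tensorflow as tf  # assumes TensorFlow 2.4+ (tf.nn.silu available)

x = tf.constant([-2.0, -1.0, 0.0, 1.0, 2.0])
print(tf.nn.silu(x).numpy())   # x * sigmoid(x)
print(tf.nn.swish(x).numpy())  # same result: beta is fixed to 1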

However, we can also build a swish function with a configurable β as follows:

def silu(x, beta=1.0):
    return x * tf.nn.sigmoid(beta * x)

We will use an example to evaluate it.

import tensorflow as tf
import numpy as np

# Create a 5 x 10 variable with small random values (TensorFlow 1.x API).
g = tf.Variable(tf.truncated_normal([5, 10], stddev=0.1), name="S")

# Built-in swish: x * sigmoid(x).
g1 = tf.nn.swish(g)

# Our own silu with a configurable beta; beta = 1.0 matches tf.nn.swish.
def silu(x, beta=1.0):
    return x * tf.nn.sigmoid(beta * x)

g2 = silu(g)

init = tf.global_variables_initializer()
init_local = tf.local_variables_initializer()
with tf.Session() as sess:
    sess.run([init, init_local])
    np.set_printoptions(precision=4, suppress=True)
    a = sess.run(g1)
    b = sess.run(g2)
    print(a)
    print(b)

Running this code, we get:

[[-0.078   0.0335  0.0191  0.0617 -0.0464 -0.0279  0.0534  0.003   0.0769
   0.0233]
 [ 0.1009 -0.031   0.0492  0.025   0.0719  0.0536  0.0074  0.0611  0.0198
  -0.0523]
 [-0.0166 -0.029  -0.0513 -0.0181 -0.0227 -0.0599 -0.0526  0.0456  0.0662
   0.0024]
 [-0.0127  0.0219  0.0654 -0.0276  0.0015  0.0697 -0.0611  0.0101 -0.0409
  -0.0018]
 [-0.0873 -0.043   0.0551  0.0507 -0.0393 -0.0582  0.0226  0.0496 -0.0262
  -0.0333]]
[[-0.078   0.0335  0.0191  0.0617 -0.0464 -0.0279  0.0534  0.003   0.0769
   0.0233]
 [ 0.1009 -0.031   0.0492  0.025   0.0719  0.0536  0.0074  0.0611  0.0198
  -0.0523]
 [-0.0166 -0.029  -0.0513 -0.0181 -0.0227 -0.0599 -0.0526  0.0456  0.0662
   0.0024]
 [-0.0127  0.0219  0.0654 -0.0276  0.0015  0.0697 -0.0611  0.0101 -0.0409
  -0.0018]
 [-0.0873 -0.043   0.0551  0.0507 -0.0393 -0.0582  0.0226  0.0496 -0.0262
  -0.0333]]

We find that g1 = g2, which confirms that tf.nn.swish() computes x * sigmoid(x), the same as our silu() with β = 1.
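
If you are on TensorFlow 2.x, where tf.Session() no longer exists, the same comparison can be run in eager mode (a minimal sketch, assuming TensorFlow 2.x):

import tensorflow as tf  # assumes TensorFlow 2.x (eager mode)
import numpy as np

def silu(x, beta=1.0):
    return x * tf.nn.sigmoid(beta * x)

g = tf.random.truncated_normal([5, 10], stddev=0.1)
# tf.nn.swish and our silu with beta = 1 should agree elementwise.
print(np.allclose(tf.nn.swish(g).numpy(), silu(g).numpy()))  # True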