Add forget_bias for Your Custom LSTM Using TensorFlow: A Beginner Guide – TensorFlow Tutorial

July 20, 2020

If you have used TensorFlow's tf.nn.rnn_cell.LSTMCell(), you may have noticed a forget_bias parameter in its constructor:

__init__(
    num_units,
    use_peepholes=False,
    cell_clip=None,
    initializer=None,
    num_proj=None,
    proj_clip=None,
    num_unit_shards=None,
    num_proj_shards=None,
    forget_bias=1.0,
    state_is_tuple=True,
    activation=None,
    reuse=None,
    name=None
)

The source code is here:

https://github.com/tensorflow/tensorflow/blob/r1.8/tensorflow/python/ops/rnn_cell_impl.py

Note that forget_bias defaults to 1.0.

Why should we add forget_bias to an LSTM?

The standard LSTM formula does not contain a separate forget bias term.

The formula of LSTM
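The original figure is not available here, so the following is only a common textbook formulation of the LSTM and may differ in notation from the figure. It assumes the W, U, b naming and the row-vector ordering used in the custom example later in this tutorial. The symbols \sigma (sigmoid) and \odot (element-wise product) are conventions chosen for this sketch.

```latex
\begin{aligned}
i_t &= \sigma(x_t W_i + h_{t-1} U_i + b_i) \\
f_t &= \sigma(x_t W_f + h_{t-1} U_f + b_f) \\
o_t &= \sigma(x_t W_o + h_{t-1} U_o + b_o) \\
\tilde{c}_t &= \tanh(x_t W_c + h_{t-1} U_c + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

In this form the forget gate f_t has only its learned bias b_f. There is no extra constant added to it.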

The TensorFlow LSTM source code documents the reason for adding a forget bias to the forget gate:

We add forget_bias (default: 1) to the biases of the forget gate in order to reduce the scale of forgetting in the beginning of the training.
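To see why this reduces forgetting early in training, here is a small sketch in plain Python. It assumes that the forget gate pre-activation is close to zero at the start of training, which is typical when weights and biases are initialized near zero. The helper names sigmoid and retained_after are illustrative and do not come from TensorFlow.

```python
import math


def sigmoid(x):
    """Logistic sigmoid for a single float."""
    return 1.0 / (1.0 + math.exp(-x))


# Assumption: early in training the forget-gate pre-activation is about 0.
pre_activation = 0.0

gate_without_bias = sigmoid(pre_activation)       # 0.5
gate_with_bias = sigmoid(pre_activation + 1.0)    # about 0.731


def retained_after(steps, gate):
    """Fraction of the initial cell state left after `steps` time steps,
    if the forget gate stayed constant and nothing new were written."""
    return gate ** steps


print(gate_without_bias, gate_with_bias)
print(retained_after(10, gate_without_bias), retained_after(10, gate_with_bias))
```

With a forget bias of 1.0 the gate starts closer to 1, so more of the previous cell state survives each step before the network has learned anything.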

In TensorFlow's BasicLSTMCell (in the same source file), the forget bias is added to the forget gate as follows:

    forget_bias_tensor = constant_op.constant(self._forget_bias, dtype=f.dtype)
    # Note that using `add` and `multiply` instead of `+` and `*` gives a
    # performance improvement. So using those at the cost of readability.
    add = math_ops.add
    multiply = math_ops.multiply
    new_c = add(multiply(c, sigmoid(add(f, forget_bias_tensor))),
                multiply(sigmoid(i), self._activation(j)))
    new_h = multiply(self._activation(new_c), sigmoid(o))

Following the same approach, we can add a forget bias to our own custom LSTM.

Here is an example:

            # Forget gate: add a constant forget_bias (default 1.0) to the
            # pre-activation before applying the sigmoid.
            forget_bias = 1.0
            f = tf.sigmoid(
                tf.matmul(x, self.Wf) +
                tf.matmul(previous_hidden_state, self.Uf) + self.bf +
                forget_bias
            )
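To check the effect in isolation, the snippet below is a minimal sketch of one LSTM step for a single unit, written with scalars in plain Python rather than TensorFlow tensors. The function lstm_step, the params dictionary, and the gate names are hypothetical and only mirror the structure of the TensorFlow code shown earlier: the forget bias is added to the forget gate pre-activation before the sigmoid.

```python
import math


def sigmoid(x):
    """Logistic sigmoid for a single float."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params, forget_bias=1.0):
    """One step of a single-unit (scalar) LSTM.

    params maps a gate name ("i", "f", "o", "j") to a tuple (W, U, b).
    This is an illustrative sketch, not the TensorFlow implementation.
    """
    def pre_activation(gate):
        w, u, b = params[gate]
        return w * x + u * h_prev + b

    i = sigmoid(pre_activation("i"))                # input gate
    f = sigmoid(pre_activation("f") + forget_bias)  # forget gate with bias
    o = sigmoid(pre_activation("o"))                # output gate
    j = math.tanh(pre_activation("j"))              # candidate cell value

    c = f * c_prev + i * j
    h = o * math.tanh(c)
    return h, c


# With all-zero parameters the new cell state is just f * c_prev,
# so the effect of forget_bias is easy to see.
zero_params = {gate: (0.0, 0.0, 0.0) for gate in ("i", "f", "o", "j")}
_, c_with_bias = lstm_step(0.5, 0.0, 1.0, zero_params, forget_bias=1.0)
_, c_without_bias = lstm_step(0.5, 0.0, 1.0, zero_params, forget_bias=0.0)
print(c_with_bias, c_without_bias)
```

Passing forget_bias as an argument instead of hard-coding 1.0 makes it easy to compare runs with and without the bias.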

To learn how to build a custom LSTM with TensorFlow, you can read this tutorial:

Build Your Own LSTM Model Using TensorFlow: Steps to Create a Customized LSTM
