If you have used TensorFlow's tf.nn.rnn_cell.LSTMCell(), you may have noticed a forget_bias parameter in its __init__() function.
__init__(
    num_units,
    use_peepholes=False,
    cell_clip=None,
    initializer=None,
    num_proj=None,
    proj_clip=None,
    num_unit_shards=None,
    num_proj_shards=None,
    forget_bias=1.0,
    state_is_tuple=True,
    activation=None,
    reuse=None,
    name=None
)
The source code is here:
https://github.com/tensorflow/tensorflow/blob/r1.8/tensorflow/python/ops/rnn_cell_impl.py
There, the default value is forget_bias=1.0.
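For example, a cell can be created with the forget bias set explicitly (a minimal sketch using the TF 1.x API; num_units=128 is an arbitrary choice for illustration):

import tensorflow as tf

# Create an LSTM cell; forget_bias defaults to 1.0, shown explicitly here.
cell = tf.nn.rnn_cell.LSTMCell(num_units=128, forget_bias=1.0)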
Why should we add a forget_bias for LSTM?
In the standard LSTM formulas, there is no forget bias term.
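For reference, the standard forget gate is usually written as follows (x is the current input, h_prev the previous hidden state); note there is no extra constant term:

f = sigmoid(W_f * x + U_f * h_prev + b_f)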
However, in the TensorFlow LSTM source code we can find the reason for adding a forget bias to the forget gate. As the comment in the source explains, forget_bias (default: 1.0) is added to the biases of the forget gate in order to reduce the scale of forgetting at the beginning of training.
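To see why this matters, consider the gate value at initialization. With zero-initialized biases, the forget gate starts at sigmoid(0) = 0.5, so half of the cell state is discarded at every step. A quick check (plain NumPy, not from the TensorFlow source):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# With zero-initialized biases, the forget gate starts at 0.5,
# i.e. half of the cell state is discarded at every step.
print(sigmoid(0.0))  # 0.5

# Shifting the pre-activation by forget_bias=1.0 makes the gate
# start closer to 1, so the cell state is mostly retained early on.
print(sigmoid(1.0))  # ~0.731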
In TensorFlow, the forget bias is added to the LSTM forget gate as follows:
forget_bias_tensor = constant_op.constant(self._forget_bias, dtype=f.dtype)
# Note that using `add` and `multiply` instead of `+` and `*` gives a
# performance improvement. So using those at the cost of readability.
add = math_ops.add
multiply = math_ops.multiply
new_c = add(multiply(c, sigmoid(add(f, forget_bias_tensor))),
            multiply(sigmoid(i), self._activation(j)))
new_h = multiply(self._activation(new_c), sigmoid(o))
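Note that forget_bias here is a plain Python float turned into a constant tensor at graph-construction time, not a trainable variable: it is added on top of the gate's pre-activation f, so the gate starts biased toward keeping the cell state while the real bias is still learned as usual.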
Following this approach, we can add a forget bias to our own custom LSTM.
Here is an example:
# Forget gate
# forget_bias, default is 1.0
f = tf.sigmoid(
    tf.matmul(x, self.Wf)
    + tf.matmul(previous_hidden_state, self.Uf)
    + self.bf
    + 1.0
)
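For completeness, here is a minimal self-contained sketch of one custom LSTM step with a forget bias. The variable names (Wf, Uf, bf, ...) follow the fragment above; the dimensions, initializers, and the single-step function are assumptions for illustration, not the tutorial's exact code:

import tensorflow as tf  # TF 1.x style, matching the snippets above

input_dim, hidden_dim = 64, 128  # assumed sizes for illustration
forget_bias = 1.0

# Parameters for the forget, input, output gates and candidate state.
def gate_params(name):
    W = tf.get_variable(name + "_W", [input_dim, hidden_dim])
    U = tf.get_variable(name + "_U", [hidden_dim, hidden_dim])
    b = tf.get_variable(name + "_b", [hidden_dim],
                        initializer=tf.zeros_initializer())
    return W, U, b

Wf, Uf, bf = gate_params("f")
Wi, Ui, bi = gate_params("i")
Wo, Uo, bo = gate_params("o")
Wc, Uc, bc = gate_params("c")

def lstm_step(x, h_prev, c_prev):
    # Forget gate with forget_bias added before the sigmoid.
    f = tf.sigmoid(tf.matmul(x, Wf) + tf.matmul(h_prev, Uf) + bf + forget_bias)
    i = tf.sigmoid(tf.matmul(x, Wi) + tf.matmul(h_prev, Ui) + bi)
    o = tf.sigmoid(tf.matmul(x, Wo) + tf.matmul(h_prev, Uo) + bo)
    c_tilde = tf.tanh(tf.matmul(x, Wc) + tf.matmul(h_prev, Uc) + bc)
    c = f * c_prev + i * c_tilde  # new cell state
    h = o * tf.tanh(c)            # new hidden state
    return h, c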
To learn how to create a custom LSTM using TensorFlow, you can read this tutorial:
Build Your Own LSTM Model Using TensorFlow: Steps to Create a Customized LSTM