In this tutorial, we will show you how to build your own BiLSTM model using TensorFlow. You can modify our code to build a customized model.
Preliminary
In order to build a custom BiLSTM, you should read these tutorials first.
Build Your Own LSTM Model Using TensorFlow: Steps to Create a Customized LSTM – TensorFlow Tutorial
Understand TensorFlow tf.reverse(): Reverse a Tensor Based on Axis – TensorFlow Tutorial
Understand TensorFlow tf.reverse_sequence(): Reverse a Tensor by Length – TensorFlow Tutorial
Then we can start to build our own BiLSTM model.
Build our own BiLSTM model using TensorFlow
The source code of our BiLSTM model is below:

class BiLSTM():
    def __init__(self, inputs, emb_dim, hidden_dim, sequence_length):
        forword = LSTM(inputs, emb_dim, hidden_dim, sequence_length)
        backword = LSTM(inputs, emb_dim, hidden_dim, sequence_length, revers=True)
        self.forword_output = forword.output()    # batch_size x seq_length x hidden_dim, e.g. 64 x 40 x 200
        self.backword_output = backword.output()  # batch_size x seq_length x hidden_dim, e.g. 64 x 40 x 200

    def output(self):
        # concatenate the two directions on the feature axis (axis 2)
        output = tf.concat([self.forword_output, self.backword_output], 2)
        return output
Where
LSTM: it is our own LSTM model from the previous tutorial.
inputs: a batch_size * seq_length * emb_dim tensor, such as 64 * 40 * 200
emb_dim: the dimension of the word embeddings, such as 200
hidden_dim: the hidden dimension of the LSTM
sequence_length: the valid length of each sequence in the batch; it is a list or 1-D tensor.
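Because the BiLSTM concatenates the two directions on axis 2, its output dimension is twice the hidden dimension. A minimal shape check, using random tensors to stand in for the two LSTM outputs (the LSTM class itself comes from the previous tutorial):

```python
import tensorflow as tf

# Stand-ins for the forward and backward LSTM outputs: both are
# batch_size x seq_length x hidden_dim.
batch_size, seq_length, hidden_dim = 64, 40, 200
forword_output = tf.random.normal([batch_size, seq_length, hidden_dim])
backword_output = tf.random.normal([batch_size, seq_length, hidden_dim])

# Concatenating on axis 2 doubles the feature dimension: 200 + 200 = 400.
output = tf.concat([forword_output, backword_output], 2)
print(output.shape)  # (64, 40, 400)
```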
As the code above shows, a BiLSTM contains two LSTM models: a forward one and a backward one. The backward LSTM should reverse its input, which means we should modify our traditional LSTM.
Modify LSTM to support the feature of reversing inputs
We can modify the initialization function of LSTM as follows:

def __init__(self, inputs, emb_dim, hidden_dim, sequence_length, revers=False):
    self.emb_dim = emb_dim
    self.hidden_dim = hidden_dim
    self.sequence_length = sequence_length
    self.batch_size = tf.shape(inputs)[0]
    self.revers = revers
    if revers:
        if sequence_length is not None:
            # reverse only the valid part of each sequence
            inputs = tf.reverse_sequence(inputs, seq_lengths=sequence_length,
                                         seq_axis=1, batch_axis=0)
        else:
            # reverse the whole time axis
            inputs = tf.reverse(inputs, axis=[1])
    self.inputs = tf.transpose(inputs, perm=[1, 0, 2])  # time-major: seq_length x batch_size x emb_dim
When sequence_length is None, we use tf.reverse() to reverse the inputs; otherwise, we use tf.reverse_sequence() to reverse each sequence by its valid length.
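The difference between the two matters when sequences are padded. A small sketch with a toy batch (one sequence of length 4 whose last two steps are padding):

```python
import tensorflow as tf

# 1 x 4 x 1 batch: only the first 2 steps are valid, the rest is padding.
inputs = tf.constant([[[1.0], [2.0], [0.0], [0.0]]])
sequence_length = tf.constant([2])

# tf.reverse flips the whole time axis, dragging the padding to the front.
full = tf.reverse(inputs, axis=[1])
# -> [[[0.], [0.], [2.], [1.]]]

# tf.reverse_sequence flips only the first sequence_length[i] steps of
# sequence i, leaving the padding where it was.
valid = tf.reverse_sequence(inputs, seq_lengths=sequence_length,
                            seq_axis=1, batch_axis=0)
# -> [[[2.], [1.], [0.], [0.]]]
```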
Finally, we will use output() to return the output of the BiLSTM.
Notice: we should reverse the output again if the LSTM is a backward one.
def output(self):
    if self.revers:
        if self.sequence_length is not None:  # it is a tensor
            self.outputs = tf.reverse_sequence(self.outputs, seq_lengths=self.sequence_length,
                                               seq_axis=1, batch_axis=0)
        else:
            self.outputs = tf.reverse(self.outputs, axis=[1])
    return self.outputs
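The reason this second reversal works: applying tf.reverse_sequence twice with the same lengths is the identity, so the backward LSTM's outputs line up step-for-step with the forward LSTM's outputs before concatenation. A quick sanity check:

```python
import tensorflow as tf

# 1 x 4 x 1 batch, last step is padding (valid length 3).
x = tf.constant([[[1.0], [2.0], [3.0], [0.0]]])
lengths = tf.constant([3])

# Reverse the valid part, then reverse it again: we get the original back.
rev = tf.reverse_sequence(x, seq_lengths=lengths, seq_axis=1, batch_axis=0)
restored = tf.reverse_sequence(rev, seq_lengths=lengths, seq_axis=1, batch_axis=0)
print(bool(tf.reduce_all(tf.equal(x, restored))))  # True
```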