Build a Custom BiLSTM Model Using TensorFlow: A Step Guide – TensorFlow Tutorial

By | July 22, 2020

In this tutorial, we will show you how to build your own BiLSTM model using TensorFlow. You can modify our code to build a customized model.

Preliminary

To build a custom BiLSTM, you should read these tutorials first.

Build Your Own LSTM Model Using TensorFlow: Steps to Create a Customized LSTM – TensorFlow Tutorial

Understand TensorFlow tf.reverse(): Reverse a Tensor Based on Axis – TensorFlow Tutorial

Understand TensorFlow tf.reverse_sequence(): Reverse a Tensor by Length – TensorFlow Tutorial

Then we can start to build our own BiLSTM model.

Build our own BiLSTM model using TensorFlow

The structure of BiLSTM

The source code of the BiLSTM model is below:

import tensorflow as tf

# LSTM is the custom class from the previous tutorial, extended with a `revers` flag.
class BiLSTM():
    def __init__(self, inputs, emb_dim, hidden_dim, sequence_length):
        forword = LSTM(inputs, emb_dim, hidden_dim, sequence_length)
        backword = LSTM(inputs, emb_dim, hidden_dim, sequence_length, revers=True)

        self.forword_output = forword.output()  # batch_size x seq_length x hidden_dim
        self.backword_output = backword.output()  # batch_size x seq_length x hidden_dim

    def output(self):
        # Concatenate on the feature axis: batch_size x seq_length x (2 * hidden_dim)
        output = tf.concat([self.forword_output, self.backword_output], 2)
        return output

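Where:

LSTM: our own LSTM model from the previous tutorial.

inputs: a tensor with the shape batch_size * seq_length * emb_dim, such as 64 * 40 * 200

emb_dim: the dimension of the word embeddings, such as 200

hidden_dim: the hidden dimension of the LSTM

sequence_length: the valid (unpadded) length of each sequence in the batch. It is a list or 1-D tensor with one entry per sequence, or None if all sequences have the full length.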

As the code above shows, a BiLSTM contains two LSTM models, a forward one and a backward one. The backward LSTM must reverse its input, which means we need to modify our original LSTM.
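To make the shapes concrete, here is a minimal framework-free sketch of the concatenation step. It uses nested Python lists as stand-ins for tensors; the helper names (make_output, concat_last_axis, shape) and the small sizes are hypothetical and chosen only for illustration, they are not part of the BiLSTM code.

```python
# Framework-free sketch: nested lists stand in for tensors.
batch_size, seq_length, hidden_dim = 2, 3, 4  # small illustrative sizes

def make_output(fill):
    """Build a batch_size x seq_length x hidden_dim nested list filled with `fill`."""
    return [[[fill] * hidden_dim for _ in range(seq_length)]
            for _ in range(batch_size)]

def concat_last_axis(a, b):
    """Join two 3-D nested lists on the last axis, like tf.concat([a, b], 2)."""
    return [[step_a + step_b for step_a, step_b in zip(seq_a, seq_b)]
            for seq_a, seq_b in zip(a, b)]

def shape(x):
    """Return the (batch, time, feature) sizes of a 3-D nested list."""
    return (len(x), len(x[0]), len(x[0][0]))

forward_output = make_output(1.0)    # pretend forward LSTM output
backward_output = make_output(-1.0)  # pretend backward LSTM output
bilstm_output = concat_last_axis(forward_output, backward_output)

print(shape(bilstm_output))  # (2, 3, 8)
```

Each time step ends up with 2 * hidden_dim features: the first half comes from the forward LSTM and the second half from the backward LSTM.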

Modify LSTM to support the feature of reversing inputs

We can modify the initialization function of LSTM as follows:

    def __init__(self, inputs, emb_dim, hidden_dim, sequence_length, revers=False):
        self.emb_dim = emb_dim
        self.hidden_dim = hidden_dim
        self.sequence_length = sequence_length

        self.batch_size = tf.shape(inputs)[0]
        self.revers = revers
        # For the backward LSTM, reverse the inputs along the time axis (axis 1).
        if revers:
            if sequence_length is not None:
                # Reverse only the valid part of each sequence; padding stays at the end.
                inputs = tf.reverse_sequence(inputs, seq_lengths=sequence_length, seq_axis=1, batch_axis=0)
            else:
                inputs = tf.reverse(inputs, axis=[1])

        # Switch to time-major layout: seq_length x batch_size x emb_dim
        self.inputs = tf.transpose(inputs, perm=[1, 0, 2])

When sequence_length is None, we use tf.reverse() to reverse the whole input along the time axis. Otherwise, we use tf.reverse_sequence() to reverse only the valid part of each sequence, as given by sequence_length, so the padding stays at the end.
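The difference between the two kinds of reversal is easy to see on a small padded batch. The sketch below imitates the behavior in plain Python; reverse_by_length is a hypothetical helper written for this illustration, not a TensorFlow function, and 0 is assumed to be the padding value.

```python
def reverse_by_length(batch, lengths=None):
    """Reverse each sequence in a batch along the time axis.

    With `lengths`, only the first lengths[i] steps of sequence i are
    reversed and the remaining steps stay in place, which mirrors what
    tf.reverse_sequence() does. Without `lengths`, every sequence is
    reversed completely, like tf.reverse() on the time axis.
    """
    if lengths is None:
        return [seq[::-1] for seq in batch]
    return [seq[:n][::-1] + seq[n:] for seq, n in zip(batch, lengths)]

batch = [[1, 2, 3, 0, 0],
         [4, 5, 6, 7, 0]]  # 0 is padding
lengths = [3, 4]           # valid length of each sequence

print(reverse_by_length(batch, lengths))  # [[3, 2, 1, 0, 0], [7, 6, 5, 4, 0]]
print(reverse_by_length(batch))           # [[0, 0, 3, 2, 1], [0, 7, 6, 5, 4]]
```

With the lengths, the backward LSTM starts on the last real token. Without them, it would start on the padding.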

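Finally, we will use output() to return the output of each LSTM in the BiLSTM.

Note that if the LSTM is the backward one, we must reverse its output again, so that each time step lines up with the same time step of the forward output.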

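    def output(self):
        # Work on a local variable so that calling output() twice
        # does not reverse self.outputs a second time.
        outputs = self.outputs  # expected shape: batch_size x seq_length x hidden_dim
        if self.revers:
            if self.sequence_length is not None:  # a list or tensor of valid lengths
                outputs = tf.reverse_sequence(outputs, seq_lengths=self.sequence_length, seq_axis=1, batch_axis=0)
            else:
                outputs = tf.reverse(outputs, axis=[1])
        return outputs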
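To see why the backward output has to be reversed again, consider this toy sketch. It replaces the LSTM cell with a running sum, which is an assumption made purely for illustration; reverse_valid and running_sum are hypothetical helpers, not part of the tutorial's classes.

```python
def reverse_valid(seq, n):
    """Reverse the first n steps of seq and keep the padded tail in place."""
    return seq[:n][::-1] + seq[n:]

def running_sum(seq):
    """Toy stand-in for a recurrent cell: output at step t sums inputs 0..t."""
    total, out = 0, []
    for x in seq:
        total += x
        out.append(total)
    return out

seq, n = [1, 2, 3, 0, 0], 3  # three real tokens followed by padding

forward = running_sum(seq)                         # [1, 3, 6, 6, 6]
reversed_in = reverse_valid(seq, n)                # [3, 2, 1, 0, 0]
backward_raw = running_sum(reversed_in)            # [3, 5, 6, 6, 6]
backward_aligned = reverse_valid(backward_raw, n)  # [6, 5, 3, 6, 6]

print(forward[:n])           # [1, 3, 6]
print(backward_aligned[:n])  # [6, 5, 3]
```

After the second reversal, position t of the backward output summarizes tokens t through the last real token, while position t of the forward output summarizes tokens 0 through t. Only then does concatenating the two at each position make sense. The values at the padded positions carry no meaning.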
