How Long a Sequence Can Be Processed Effectively by LSTM? – LSTM Tutorial

March 13, 2021

LSTM is a good method for processing sequences in NLP; however, how long a sequence can it handle effectively? In this tutorial, we will discuss this topic.

Let us look at two papers:

(1) Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN

In this paper, we can find:

Experiments have demonstrated that an IndRNN can well process sequences over 5000 steps while LSTM could only process less than 1000 steps.

(2) Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context

In this paper, we can find:

we have empirically shown that a standard LSTM language model can effectively use about 200 tokens of context on two benchmark datasets, regardless of hyperparameter settings such as model size.

The effective sequence length reported differs between these two papers, but we can still be sure of the following:

LSTM can handle long sequences, but the sequence length can not be too long; for example, we can limit an LSTM to processing fewer than 200 words in a sentence.
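As a concrete illustration, here is a minimal PyTorch sketch that truncates token id sequences to at most 200 steps and pads shorter ones before feeding them to an LSTM. The vocabulary size, embedding dimension and hidden dimension below are illustrative assumptions, not values from the papers above.

```python
import torch
import torch.nn as nn

# A minimal sketch: truncate/pad token id sequences to at most 200 steps
# before feeding them to an LSTM. vocab_size, embed_dim and hidden_dim
# are hypothetical values chosen only for this example.
MAX_LEN = 200
vocab_size, embed_dim, hidden_dim = 10000, 128, 256

embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

def truncate_and_pad(seqs, max_len=MAX_LEN, pad_id=0):
    """Cut each sequence to max_len tokens and pad shorter ones with pad_id."""
    seqs = [s[:max_len] for s in seqs]  # keep at most max_len steps
    width = max(len(s) for s in seqs)
    return torch.tensor([s + [pad_id] * (width - len(s)) for s in seqs])

# The second sentence has 499 tokens and is cut down to 200 steps.
batch = truncate_and_pad([[5, 7, 2, 9], list(range(1, 500))])
output, (h_n, c_n) = lstm(embedding(batch))
print(output.shape)  # torch.Size([2, 200, 256])
```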

In order to enhance the ability of an LSTM to handle long sequences, we can use a residual LSTM. Here is a paper:

Residual LSTM: Design of a Deep Recurrent Architecture for Distant Speech Recognition
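The core idea, sketched below in PyTorch, is to add a shortcut connection around each LSTM layer so that gradients can flow more easily through a deep stack. Note that this is only a simplified sketch of the idea under that assumption; the paper's actual design (a shortcut on the output projection of each layer) is more specific, and the class name and dimensions here are hypothetical.

```python
import torch
import torch.nn as nn

class ResidualLSTMBlock(nn.Module):
    """A simplified residual connection around one LSTM layer (a rough
    sketch of the idea, not the exact architecture from the paper)."""

    def __init__(self, dim):
        super().__init__()
        # Input and hidden sizes match so the shortcut can be a plain sum.
        self.lstm = nn.LSTM(dim, dim, batch_first=True)

    def forward(self, x):
        out, _ = self.lstm(x)
        return x + out  # residual shortcut eases gradient flow over depth

# Stack two residual blocks and run a 300-step batch through them.
model = nn.Sequential(ResidualLSTMBlock(64), ResidualLSTMBlock(64))
x = torch.randn(2, 300, 64)  # (batch, steps, features)
print(model(x).shape)        # torch.Size([2, 300, 64])
```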
