Can We Remove the Reset Gate in GRU? Does It Decrease the Performance of GRU?

A GRU cell contains a reset gate, r^t, alongside its update gate, z^t.
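For reference, one common way to write a GRU cell is sketched below. The weight names W, U and b, the input x^t, the hidden state h^t and the candidate state \tilde{h}^t are notation assumed here. Some references swap the roles of z^t and 1 - z^t in the last line.

```latex
\begin{aligned}
z^t &= \sigma(W_z x^t + U_z h^{t-1} + b_z) \\
r^t &= \sigma(W_r x^t + U_r h^{t-1} + b_r) \\
\tilde{h}^t &= \tanh(W_h x^t + U_h (r^t \odot h^{t-1}) + b_h) \\
h^t &= (1 - z^t) \odot h^{t-1} + z^t \odot \tilde{h}^t
\end{aligned}
```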

Here r^t is the reset gate of GRU and z^t is the update gate. Since z^t already controls how much of the information passed into the current GRU cell is forgotten, can we remove the reset gate r^t?

The answer is yes: the reset gate can be removed.

Why can we remove the reset gate in GRU?

Jos van der Westhuizen and Joan Lasenby proposed JANET in the paper 'The Unreasonable Effectiveness of the Forget Gate'. The paper derives JANET as a forget-gate-only LSTM, but the resulting cell can also be viewed as a GRU with the reset gate r^t removed.

Compare GRU and JANET

GRU	JANET
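As a sketch of this comparison, written in an assumed textbook notation (x^t is the input, h^{t-1} the previous hidden state, \tilde{h}^t the candidate state, and W_h, U_h, b_h the candidate weights), the only GRU equation that involves r^t is the candidate state. Fixing r^t to 1 removes the gate:

```latex
\begin{aligned}
\text{GRU:} \quad \tilde{h}^t &= \tanh(W_h x^t + U_h (r^t \odot h^{t-1}) + b_h) \\
\text{without } r^t\text{:} \quad \tilde{h}^t &= \tanh(W_h x^t + U_h h^{t-1} + b_h)
\end{aligned}
```

The gated update that mixes h^{t-1} and \tilde{h}^t through z^t is unchanged. The JANET equations in the paper use the paper's own notation and may differ in details from this simplified form.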

From the formula of JANET, we can see that if we remove r^t from GRU, the cell essentially becomes JANET.
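The effect of removing r^t can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation from the paper: the function names (gru_step, init_params, count_params), the use_reset flag, the weight layout and the toy sizes are all hypothetical, and the cell follows a common textbook form of the GRU.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def affine(W, x, U, h, b):
    """Return W x + U h + b for list-of-lists matrices and list vectors."""
    return [
        sum(w * xi for w, xi in zip(W_row, x))
        + sum(u * hi for u, hi in zip(U_row, h))
        + bi
        for W_row, U_row, bi in zip(W, U, b)
    ]


def init_params(input_size, hidden_size, seed=0):
    """Small random weights for the z, r and h (candidate) transforms."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    params = {}
    for gate in ("z", "r", "h"):
        params["W_" + gate] = mat(hidden_size, input_size)
        params["U_" + gate] = mat(hidden_size, hidden_size)
        params["b_" + gate] = [0.0] * hidden_size
    return params


def gru_step(x, h_prev, p, use_reset=True):
    """One GRU step; use_reset=False fixes r^t to 1 (assumed reset-free variant)."""
    z = [sigmoid(a) for a in affine(p["W_z"], x, p["U_z"], h_prev, p["b_z"])]
    if use_reset:
        r = [sigmoid(a) for a in affine(p["W_r"], x, p["U_r"], h_prev, p["b_r"])]
    else:
        r = [1.0] * len(h_prev)  # reset gate removed: h_prev passes through unscaled
    gated_h = [ri * hi for ri, hi in zip(r, h_prev)]
    h_cand = [math.tanh(a) for a in affine(p["W_h"], x, p["U_h"], gated_h, p["b_h"])]
    # z^t decides how much of the old state is kept versus overwritten.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, h_cand)]


def count_params(p, use_reset=True):
    """Count the weights that are actually used by the chosen variant."""
    gates = ("z", "r", "h") if use_reset else ("z", "h")
    total = 0
    for g in gates:
        total += sum(len(row) for row in p["W_" + g])
        total += sum(len(row) for row in p["U_" + g])
        total += len(p["b_" + g])
    return total


params = init_params(input_size=3, hidden_size=4)
x = [1.0, 1.0, 1.0]
h0 = [0.5] * 4
h_full = gru_step(x, h0, params, use_reset=True)
h_no_reset = gru_step(x, h0, params, use_reset=False)
print(count_params(params, True), count_params(params, False))  # 96 64
```

With these toy sizes, dropping the reset gate removes one of the three weight groups, so the reset-free cell uses two thirds of the parameters of the full GRU. Whether accuracy holds up on a given task still has to be checked empirically.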

According to the paper, JANET performs slightly better than LSTM on synthetic memory tasks and on the MNIST, pMNIST, and MIT-BIH arrhythmia datasets.

This suggests that removing the reset gate from GRU does not necessarily decrease its performance, and may even improve it on some tasks.
