Understand Smoothing Normalization in Speech Recognition – Deep Learning Tutorial

March 29, 2022

Smoothing normalization was proposed in the paper Attention-Based Models for Speech Recognition. In this tutorial, we will introduce how to implement it in TensorFlow.

Smoothing Normalization

It is defined as:

\(a_{i,j} = \dfrac{\mathrm{sigmoid}(e_{i,j})}{\sum_{j} \mathrm{sigmoid}(e_{i,j})}\)

Here \(e_{i,j}\) is the attention score (energy) produced by the attention mechanism. Because the sigmoid is bounded in (0, 1), the normalized weights are less peaky than those of softmax, which allows the model to attend to multiple memory time steps at once.
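To see the difference from softmax concretely, here is a minimal sketch (the energy values are made up for illustration) that normalizes the same toy score vector both ways:

    import tensorflow as tf

    # Toy attention scores for a single decoder step (made-up values).
    e = tf.constant([[0.0, 2.0, 4.0]])

    # Softmax: exponentiation makes the distribution peaky.
    softmax_weights = tf.nn.softmax(e, axis=-1)
    print(softmax_weights.numpy())  # approx [[0.016, 0.117, 0.867]]

    # Smoothing normalization: sigmoid is bounded, so the weights stay
    # flatter and several time steps receive noticeable attention.
    sig = tf.nn.sigmoid(e)
    smooth_weights = sig / tf.reduce_sum(sig, axis=-1, keepdims=True)
    print(smooth_weights.numpy())  # approx [[0.212, 0.373, 0.415]]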

How to implement smoothing normalization in TensorFlow?

It is easy to implement; here is example code:

    import tensorflow as tf

    def _smoothing_normalization(e):
        """Applies a smoothing normalization function instead of softmax.

        Introduced in:
            J. K. Chorowski, D. Bahdanau, D. Serdyuk, K. Cho, and Y. Bengio,
            "Attention-based models for speech recognition," in Advances in
            Neural Information Processing Systems, 2015, pp. 577-585.

        Smoothing normalization function:
            a_{i, j} = sigmoid(e_{i, j}) / sum_j(sigmoid(e_{i, j}))

        Args:
            e: matrix [batch_size, max_time(memory_time)]: expected to be
                energy (score) values of an attention mechanism.

        Returns:
            matrix [batch_size, max_time]: [0, 1] normalized alignments with
                possible attendance to multiple memory time steps.
        """
        # Compute the sigmoid once, then normalize over the memory time axis.
        sig = tf.nn.sigmoid(e)
        return sig / tf.reduce_sum(sig, axis=-1, keepdims=True)
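As a quick sanity check, here is a minimal usage sketch, assuming TensorFlow 2 with eager execution and random energies for illustration. Like softmax, each row of the output sums to 1:

    # Random energies for a batch of 2 decoder steps over 5 memory time steps.
    e = tf.random.normal([2, 5])
    alignments = _smoothing_normalization(e)

    print(alignments.numpy())
    # Each row sums to 1, so the alignments form a valid distribution.
    print(tf.reduce_sum(alignments, axis=-1).numpy())  # approx [1. 1.]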
