There are two outputs in the BERT model: get_sequence_output() and get_pooled_output(). In this tutorial, we will introduce them for BERT beginners.
get_sequence_output()
The get_sequence_output() function returns the value of the variable self.sequence_output. Here is the source code:
def get_sequence_output(self):
    """Gets final hidden layer of encoder.

    Returns:
      float Tensor of shape [batch_size, seq_length, hidden_size] corresponding
      to the final hidden of the transformer encoder.
    """
    return self.sequence_output
get_pooled_output()
The get_pooled_output() function returns the value of self.pooled_output. Here is the source code:
def get_pooled_output(self):
    return self.pooled_output
self.sequence_output and self.pooled_output
From the source code, we can find:
self.sequence_output is the output of the last encoder layer in BERT. Its shape is batch_size * max_length * hidden_size, where max_length is the seq_length named in the docstring.
hidden_size is set in the file bert_config.json.
For example, self.sequence_output may have the shape 32 * 50 * 768, where batch_size is 32, the maximum sequence length is 50, and hidden_size is 768.
self.pooled_output is obtained by passing the first token of self.sequence_output (the [CLS] token) through a dense layer. Its shape is batch_size * hidden_size. It can be used for classification.
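To make the shapes concrete, here is a minimal sketch in plain Python (no TensorFlow) of how a pooled output can be computed from a sequence output. The function name pool_first_token, the toy sizes, and the random weights are assumptions made only for illustration. The tanh activation follows the pooler in the original BERT code, so treat it as an assumption if your version differs.

```python
import math
import random


def pool_first_token(sequence_output, weights, bias):
    """Apply a tanh-activated dense layer to the first token of each sequence.

    sequence_output: nested list of shape [batch_size, seq_length, hidden_size]
    weights:         nested list of shape [hidden_size, hidden_size]
    bias:            list of length hidden_size
    Returns a nested list of shape [batch_size, hidden_size].
    """
    pooled = []
    for sequence in sequence_output:
        first_token = sequence[0]  # hidden vector of the first token
        row = []
        for j in range(len(bias)):
            z = sum(first_token[i] * weights[i][j]
                    for i in range(len(first_token))) + bias[j]
            row.append(math.tanh(z))
        pooled.append(row)
    return pooled


random.seed(0)
# Toy sizes for illustration only; the example in the text uses 32 * 50 * 768.
batch_size, seq_length, hidden_size = 2, 4, 3

sequence_output = [[[random.uniform(-1, 1) for _ in range(hidden_size)]
                    for _ in range(seq_length)]
                   for _ in range(batch_size)]
# Hypothetical dense-layer parameters; in BERT these are learned.
weights = [[random.uniform(-1, 1) for _ in range(hidden_size)]
           for _ in range(hidden_size)]
bias = [0.0] * hidden_size

pooled_output = pool_first_token(sequence_output, weights, bias)
print(len(pooled_output), len(pooled_output[0]))  # 2 3
```

Only the first token of each sequence contributes to the pooled output, so changing any other token leaves it unchanged.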
To understand tf.layers.dense(), you can read:
Understand tf.layers.Dense(): How to Use and Regularization
Understand tf.contrib.layers.fully_connected(): How to Use and Regularization
The relationship between self.sequence_output and self.pooled_output
It can be viewed as: