Understand Keras binary_crossentropy() Loss – Keras Tutorial

September 23, 2021

In Keras, we can use keras.losses.binary_crossentropy() to compute the loss value for binary classification. In this tutorial, we will discuss how to use this function correctly.

Keras binary_crossentropy()

Keras binary_crossentropy() is defined as:

@tf_export('keras.metrics.binary_crossentropy',
           'keras.losses.binary_crossentropy')
def binary_crossentropy(y_true, y_pred):
  return K.mean(K.binary_crossentropy(y_true, y_pred), axis=-1)

It calls the keras.backend.binary_crossentropy() function, which is defined as:

@tf_export('keras.backend.binary_crossentropy')
def binary_crossentropy(target, output, from_logits=False):
  # Note: nn.sigmoid_cross_entropy_with_logits
  # expects logits, Keras expects probabilities.
  if not from_logits:
    # transform back to logits
    epsilon_ = _to_tensor(epsilon(), output.dtype.base_dtype)
    output = clip_ops.clip_by_value(output, epsilon_, 1 - epsilon_)
    output = math_ops.log(output / (1 - output))
  return nn.sigmoid_cross_entropy_with_logits(labels=target, logits=output)

From the code above, we can see that this function calls tf.nn.sigmoid_cross_entropy_with_logits() to compute the loss value. When from_logits=False, it first clips the probabilities to [epsilon, 1 - epsilon] and then converts them back to logits with log(output / (1 - output)).
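To make this conversion concrete, here is a minimal NumPy sketch of the clip-and-log-odds step (our own illustration, not the Keras source; it assumes the default fuzz factor keras.backend.epsilon() = 1e-7):

import numpy as np

epsilon = 1e-7  # default value of keras.backend.epsilon()

def probs_to_logits(p):
    # Clip probabilities away from 0 and 1 so the log stays finite,
    # then apply the inverse sigmoid (log-odds) transform.
    p = np.clip(p, epsilon, 1 - epsilon)
    return np.log(p / (1 - p))

logits = np.array([-2.0, 0.0, 3.0])
probs = 1.0 / (1.0 + np.exp(-logits))  # sigmoid(logits)
print(probs_to_logits(probs))  # ~[-2.  0.  3.]: the original logits are recovered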

Understand tf.nn.sigmoid_cross_entropy_with_logits(): A Beginner Guide – TensorFlow Tutorial

How to understand the from_logits parameter?

We will use an example to show you how to understand it.

import tensorflow as tf
from keras import backend as K

batch_size = 5
# Raw logits (not passed through tf.sigmoid())
output = tf.Variable(tf.truncated_normal([batch_size, 5], stddev=0.1), name="output")
# One-hot labels; the depth happens to equal batch_size here
label = tf.one_hot([0, 1, 1, 0, 1], batch_size)
# Loss computed by TensorFlow
tf_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=label, logits=output))

# Loss computed by Keras
# output = tf.sigmoid(output)  # uncomment for Part 2 below
keras_loss = tf.reduce_mean(K.binary_crossentropy(target=label, output=output, from_logits=True))

init = tf.global_variables_initializer()
init_local = tf.local_variables_initializer()
with tf.Session() as sess:
    sess.run([init, init_local])
    a = sess.run([tf_loss, keras_loss])
    print(a)

In this example, we compute the loss value with both tf.nn.sigmoid_cross_entropy_with_logits() and K.binary_crossentropy() so that we can compare the results.
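Note that this example uses the TensorFlow 1.x session API. If you are on TensorFlow 2.x, a similar comparison can be written eagerly; the following is a sketch (it assumes tf.keras.backend instead of the standalone keras package):

import tensorflow as tf

# TensorFlow 2.x sketch: eager execution, no sessions
output = tf.random.truncated_normal([5, 5], stddev=0.1)  # raw logits
label = tf.one_hot([0, 1, 1, 0, 1], 5)

tf_loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=label, logits=output))
keras_loss = tf.reduce_mean(
    tf.keras.backend.binary_crossentropy(label, output, from_logits=True))

print(tf_loss.numpy(), keras_loss.numpy())  # the two values should match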

Part 1: If the output is not computed by tf.sigmoid()

We will set from_logits=True.

Run this code and you will see:

[0.71272296, 0.71272296]

They are the same.
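Why are they the same? With from_logits=True, K.binary_crossentropy() passes the logits straight to tf.nn.sigmoid_cross_entropy_with_logits(), whose documented formula for logits x and labels z is max(x, 0) - x * z + log(1 + exp(-|x|)). A small NumPy check with made-up values shows this formula matches the naive definition:

import numpy as np

def stable_bce_with_logits(x, z):
    # Numerically stable formula documented for
    # tf.nn.sigmoid_cross_entropy_with_logits()
    return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))

x = np.array([1.5, -0.3, 0.0])  # example logits
z = np.array([1.0, 0.0, 1.0])   # example labels

# Naive definition: -[z*log(sigmoid(x)) + (1-z)*log(1-sigmoid(x))]
p = 1.0 / (1.0 + np.exp(-x))
naive = -(z * np.log(p) + (1 - z) * np.log(1 - p))

print(stable_bce_with_logits(x, z))  # [0.20141328 0.55435524 0.69314718]
print(naive)                         # same values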

Part 2: If the output is computed by tf.sigmoid()

We will set from_logits=False and pass the output through tf.sigmoid() first:

output = tf.sigmoid(output)
keras_loss = tf.reduce_mean(K.binary_crossentropy(target=label, output=output, from_logits=False))

We will see that the losses are:

[0.71148753, 0.71148753]

They are also the same.
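This is because clipping the output to [epsilon, 1 - epsilon] and converting it back to logits reproduces the textbook cross-entropy -[z*log(p) + (1-z)*log(1-p)] on probabilities. A quick NumPy round-trip check (our own example values, default epsilon of 1e-7):

import numpy as np

epsilon = 1e-7  # default keras.backend.epsilon()
p = np.clip(np.array([0.9, 0.2, 0.5]), epsilon, 1 - epsilon)  # probabilities
z = np.array([1.0, 0.0, 1.0])                                 # labels

# Textbook binary cross-entropy on probabilities
direct = -(z * np.log(p) + (1 - z) * np.log(1 - p))

# Round trip: probabilities -> logits -> stable logits formula
x = np.log(p / (1 - p))
round_trip = np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))

print(direct)      # [0.10536052 0.22314355 0.69314718]
print(round_trip)  # same values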

We should notice: tf.nn.sigmoid_cross_entropy_with_logits() expects raw logits, so the output must not be passed through tf.sigmoid() before it is given to this function.
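To see why this matters, the sketch below (again with made-up values) treats sigmoid outputs as if they were logits; the loss comes out different, which is exactly the bug this caution is about:

import numpy as np

def bce_with_logits(x, z):
    # Same stable formula as tf.nn.sigmoid_cross_entropy_with_logits()
    return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([2.0, -1.0, 0.5])  # logits
z = np.array([1.0, 0.0, 1.0])   # labels

print(bce_with_logits(x, z))           # correct: raw logits passed in
print(bce_with_logits(sigmoid(x), z))  # wrong: sigmoid applied twice in effect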
