In TensorFlow, we can use tf.trainable_variables() to list all trainable weights and implement L2 regularization manually. We covered that approach in this tutorial:
Multi-layer Neural Network Implements L2 Regularization in TensorFlow – TensorFlow Tutorial
However, that approach can be awkward if you use TensorFlow's built-in layer functions. In this tutorial, we will introduce another way: using GraphKeys.REGULARIZATION_LOSSES to implement L2 regularization.
Preliminary
In TensorFlow, a regularizer parameter exists in many functions. For example:
tf.compat.v1.get_variable(
    name,
    shape=None,
    dtype=None,
    initializer=None,
    regularizer=None,
    trainable=None,
    collections=None,
    caching_device=None,
    partitioner=None,
    validate_shape=True,
    use_resource=None,
    custom_getter=None,
    constraint=None,
    synchronization=tf.VariableSynchronization.AUTO,
    aggregation=tf.compat.v1.VariableAggregation.NONE
)
Here the default is regularizer=None.
tf.layers.conv2d(
    inputs,
    filters,
    kernel_size,
    strides=(1, 1),
    padding='valid',
    data_format='channels_last',
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer=None,
    bias_initializer=tf.zeros_initializer(),
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    trainable=True,
    name=None,
    reuse=None
)
Here kernel_regularizer=None and bias_regularizer=None by default.
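These parameters expect a callable. For instance, tf.contrib.layers.l2_regularizer(scale) (TensorFlow 1.x) returns a function that maps a weight tensor w to scale * tf.nn.l2_loss(w). A minimal sketch (the tensor and scale value here are only illustrative):

import tensorflow as tf

# Sketch (TF 1.x assumed): l2_regularizer returns a callable.
regularizer = tf.contrib.layers.l2_regularizer(scale=1e-4)
w = tf.ones([3, 3])
penalty = regularizer(w)  # equivalent to 1e-4 * tf.nn.l2_loss(w)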
If we pass tf.contrib.layers.l2_regularizer(0.0001) as the regularizer for these weights, how do we compute their regularization loss?
For example:
x = tf.layers.conv2d(
    input_tensor,
    filters1,
    (1, 1),
    kernel_initializer=tf.orthogonal_initializer(),
    use_bias=False,
    trainable=True,
    kernel_regularizer=tf.contrib.layers.l2_regularizer(weight_decay),
    name=conv_name_1
)
We have set kernel_regularizer=tf.contrib.layers.l2_regularizer(weight_decay). How do we get the regularization loss of this kernel?
How to get regularization loss with GraphKeys.REGULARIZATION_LOSSES?
We should notice: when a variable is created with a regularizer, TensorFlow adds the resulting penalty tensor (a scalar) to the GraphKeys.REGULARIZATION_LOSSES collection. We can use this collection to get the regularization loss.
For example:
import tensorflow as tf
import numpy as np

weight_decay = 1e-4
regularizer = tf.contrib.layers.l2_regularizer(weight_decay)

# Create a variable with a regularizer; its penalty tensor is collected automatically.
input_tensor = tf.get_variable(shape=[64, 40, 200, 1], regularizer=regularizer, dtype=tf.float32, name="w1")
x = tf.layers.conv2d(
    input_tensor,
    64,
    (3, 3),
    kernel_initializer=tf.orthogonal_initializer(),
    use_bias=True,
    trainable=True,
    kernel_regularizer=regularizer,
    name="conv"
)
att = tf.nn.relu(x, name="relu")

keys = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
print("list variables in tf.GraphKeys.REGULARIZATION_LOSSES")
for k in keys:
    print(k)

init = tf.global_variables_initializer()
init_local = tf.local_variables_initializer()
with tf.Session() as sess:
    sess.run([init, init_local])
    np.set_printoptions(precision=4, suppress=True)
    a = sess.run(att)
    print(a.shape)
    print("list all trainable variables:")
    for n in tf.trainable_variables():
        print(n.name)
This code lists all penalty tensors stored in tf.GraphKeys.REGULARIZATION_LOSSES (one scalar per regularized variable) and all trainable variables.
Run this code and you will get:
list variables in tf.GraphKeys.REGULARIZATION_LOSSES
Tensor("w1/Regularizer/l2_regularizer:0", shape=(), dtype=float32)
Tensor("conv/kernel/Regularizer/l2_regularizer:0", shape=(), dtype=float32)
list all trainable variables:
w1:0
conv/kernel:0
conv/bias:0
To get the L2 regularization loss, we can use two methods.
If we use tf.GraphKeys.REGULARIZATION_LOSSES, we can compute it as follows:
keys = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
print("list variables in tf.GraphKeys.REGULARIZATION_LOSSES")
for k in keys:
    print(k)

# compute l2 loss using tf.GraphKeys.REGULARIZATION_LOSSES
loss = tf.add_n(keys)
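As a side note, TensorFlow 1.x also provides tf.losses.get_regularization_loss(), which sums the same collection into one scalar, so this sketch should give the same result as the tf.add_n() line above:

# Sketch: a built-in shortcut that sums GraphKeys.REGULARIZATION_LOSSES.
reg_loss = tf.losses.get_regularization_loss()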
If we use tf.trainable_variables(), we can do it like this:
# compute l2 loss using tf.trainable_variables()
l2_loss = weight_decay * tf.reduce_sum(
    [tf.nn.l2_loss(n) for n in tf.trainable_variables() if 'bias' not in n.name]
)
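Note why these two methods agree: tf.contrib.layers.l2_regularizer(weight_decay) computes weight_decay * tf.nn.l2_loss(w) for each regularized variable, which is exactly what this list comprehension computes for w1 and conv/kernel. The 'bias' not in n.name filter matches the graph above, where only kernel_regularizer was set, so conv/bias contributes nothing to the collection either.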
We can compare the results computed by these two methods. Here is the example:
import tensorflow as tf
import numpy as np

weight_decay = 1e-4
regularizer = tf.contrib.layers.l2_regularizer(weight_decay)

input_tensor = tf.get_variable(shape=[64, 40, 200, 1], regularizer=regularizer, dtype=tf.float32, name="w1")
x = tf.layers.conv2d(
    input_tensor,
    64,
    (3, 3),
    kernel_initializer=tf.orthogonal_initializer(),
    use_bias=True,
    trainable=True,
    kernel_regularizer=regularizer,
    name="conv"
)
att = tf.nn.relu(x, name="relu")

keys = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
print("list variables in tf.GraphKeys.REGULARIZATION_LOSSES")
for k in keys:
    print(k)

# compute l2 loss using tf.GraphKeys.REGULARIZATION_LOSSES
loss = tf.add_n(keys)

# compute l2 loss using tf.trainable_variables()
l2_loss = weight_decay * tf.reduce_sum(
    [tf.nn.l2_loss(n) for n in tf.trainable_variables() if 'bias' not in n.name]
)

init = tf.global_variables_initializer()
init_local = tf.local_variables_initializer()
with tf.Session() as sess:
    sess.run([init, init_local])
    np.set_printoptions(precision=4, suppress=True)
    a = sess.run(att)
    print(a.shape)
    loss_values = sess.run([loss, l2_loss])
    print(loss_values)
    print("list all trainable variables:")
    for n in tf.trainable_variables():
        print(n.name)
Run this code and you will find the two losses are:
[0.0005492301, 0.0005492301]
The two loss values are the same, which confirms that both methods compute the same L2 regularization loss.
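Finally, to actually apply the regularization during training, the collected loss has to be added to your task loss before optimization. A minimal sketch (task_loss, the learning rate, and the choice of optimizer are illustrative, not from the example above):

# Sketch: add the collected regularization loss to the data loss.
# task_loss is assumed to be your data loss (e.g., cross entropy).
reg_loss = tf.add_n(tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES))
total_loss = task_loss + reg_loss
train_op = tf.train.AdamOptimizer(1e-3).minimize(total_loss)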