Sometimes, we need to train a deep learning model with different learning rates for different layers. In this tutorial, we will show you how to do this in TensorFlow.
For example:
There are 12 layers in our model; we plan to train layer 1 – layer 10 with a learning rate of 2e-5, and layer 11 – layer 12 with a learning rate of 1e-3.
This is a common question when you plan to fine-tune a model.
How do we train a model with different learning rates?
We will introduce how to do it step by step.
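Throughout this tutorial, we assume the 12 layers are built under variable scopes such as layer_1 … layer_12, so that their variables can be matched by name in Step 2. Here is a minimal sketch of such a model; the scope names, layer size and activation are illustrative:

import tensorflow as tf

def build_model(x):
    # Hypothetical 12-layer network: each layer lives in its own
    # variable scope, so substrings like 'layer_11' appear in the
    # names of its variables (e.g. 'layer_11/dense/kernel:0').
    for i in range(1, 13):
        with tf.variable_scope('layer_%d' % i):
            x = tf.layers.dense(x, 128, activation=tf.nn.relu)
    return x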
Step 1: Create two global step variables
# One step counter per optimizer: each apply_gradients() call in Step 3
# increments the counter it is given, so sharing a single variable
# would advance it twice per training step.
global_step = tf.Variable(0, name="g1", trainable=False)
pretrained_model_global_step = tf.Variable(0, name="g2", trainable=False)
Step 2: Get different variables for different learning rates
First, we should split the trainable variables into two groups, one for each learning rate.
all_variables = tf.trainable_variables()
# Match variables by name. We assume the last two layers live under
# scopes named 'layer_11' and 'layer_12'; adjust the substrings to
# match your own variable names.
pretrained_var_list = [x for x in all_variables if 'layer_11' not in x.name and 'layer_12' not in x.name]
normal_var_list = [x for x in all_variables if 'layer_11' in x.name or 'layer_12' in x.name]
Here pretrained_var_list will be trained with a learning rate of 2e-5, while normal_var_list will be trained with 1e-3.
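To verify the split before training, you can print the variable names in each group. This is a quick sanity check, not part of the training graph:

for v in pretrained_var_list:
    print('pretrained (2e-5):', v.name)
for v in normal_var_list:
    print('normal (1e-3):', v.name)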
Step 3: Create the training operation
Here is an example:
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    # Optimizer for the new layers (layer 11 - layer 12), learning rate 1e-3.
    normal_optimizer = tf.train.AdamOptimizer(1e-3, name='normal_adam')
    normal_grads_and_vars = normal_optimizer.compute_gradients(model.loss, var_list=normal_var_list)
    train_normal_op = normal_optimizer.apply_gradients(normal_grads_and_vars, global_step=global_step)

    # Optimizer for the pretrained layers (layer 1 - layer 10), learning rate 2e-5.
    pretrained_optimizer = tf.train.AdamOptimizer(2e-5, name='pretrained_adam')
    pretrained_grads_and_vars = pretrained_optimizer.compute_gradients(model.loss, var_list=pretrained_var_list)
    train_pretrained_op = pretrained_optimizer.apply_gradients(pretrained_grads_and_vars, global_step=pretrained_model_global_step)

    # Run both updates together as one training op.
    train_op = tf.group(train_normal_op, train_pretrained_op)
model.loss is the model's loss function; tf.group() merges the two training operations into a single train_op.
Finally, we can use sess.run() to train our model.
sess.run(tf.global_variables_initializer())
_, step, loss, accuracy = sess.run(
    [train_op, global_step, model.loss, model.accuracy],
    feed_dict)
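Put together, a training loop might look like this. This is a minimal sketch: batch_iter() and the placeholders model.input_x and model.input_y are hypothetical names standing in for your own data pipeline and model inputs:

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for input_x, input_y in batch_iter():  # hypothetical batch generator
        feed_dict = {
            model.input_x: input_x,  # hypothetical model placeholders
            model.input_y: input_y,
        }
        _, step, loss, accuracy = sess.run(
            [train_op, global_step, model.loss, model.accuracy],
            feed_dict)
        print('step %d: loss = %.4f, accuracy = %.4f' % (step, loss, accuracy))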
However, if you are fine-tuning an existing model and have used saver.restore() to restore it, you should use sess.run(tf.global_variables_initializer()) carefully: running the initializer after restoring will overwrite the restored weights with their initial values.
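A minimal sketch of a safe ordering: run the initializer first, then restore, so the restored weights win. The checkpoint path here is illustrative:

saver = tf.train.Saver(var_list=pretrained_var_list)  # restore only the pretrained layers
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initialize everything first
    saver.restore(sess, 'model/pretrained.ckpt')  # then overwrite with checkpoint values (illustrative path)
    # ... training loop ...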
Here is a tutorial:
Steps to Load TensorFlow Model Using saver.restore() Correctly – TensorFlow Tutorial