We often build a new model on top of an existing one. For example, you may add a BiLSTM on top of the Bert model.
To train your BiLSTM, there are several questions you need to address.
Question 1: How to load an existing model
In tensorflow, we can use saver.restore() to load an existing model. Here is the tutorial:
Steps to Load TensorFlow Model Using saver.restore() Correctly – TensorFlow Tutorial
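As a minimal sketch of the restore step: the code below saves a one-variable stand-in for the existing model and then restores it into a fresh graph. The `tensorflow.compat.v1` import, the `bert` scope name, and the temporary checkpoint path are assumptions made for illustration; they are not taken from the tutorial.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumption: TF1-style graph mode, as saver.restore() implies

# Hypothetical checkpoint prefix in a temporary directory.
ckpt_prefix = os.path.join(tempfile.mkdtemp(), "existing_model")

# Stand-in for the existing model: one variable, initialized and saved.
with tf.Graph().as_default():
    with tf.variable_scope("bert"):
        tf.get_variable("w", initializer=tf.constant([1.0, 2.0]))
    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver.save(sess, ckpt_prefix)

# Later: rebuild variables with the same names, then restore instead of initializing.
with tf.Graph().as_default():
    with tf.variable_scope("bert"):
        w = tf.get_variable("w", shape=[2])
    saver = tf.train.Saver()
    with tf.Session() as sess:
        saver.restore(sess, ckpt_prefix)
        restored = sess.run(w).tolist()
```

Note that saver.restore() matches variables by name, so the new graph must create variables with the same names as those stored in the checkpoint.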
Question 2: How to initialize new variables
As in the example above, where we have created a BiLSTM on top of Bert, we should initialize only the variables in the BiLSTM. The variables in Bert must not be re-initialized, because that would overwrite the restored weights.
Here is the tutorial:
Only Initialize New Variables When Using an Existing Model for Fine-tuning – TensorFlow Tutorial
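One possible way to do this is to split the variables by scope, restore the existing ones from the checkpoint, and pass only the new ones to tf.variables_initializer(). In this sketch the scope names `bert` and `bilstm`, the single-variable stand-ins, and the checkpoint path are all hypothetical.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumption: TF1-style graph mode

ckpt_prefix = os.path.join(tempfile.mkdtemp(), "bert_stub")  # hypothetical path

# Save a stand-in "bert" checkpoint to restore from.
with tf.Graph().as_default():
    with tf.variable_scope("bert"):
        tf.get_variable("w", initializer=tf.constant([3.0]))
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        tf.train.Saver().save(sess, ckpt_prefix)

# New graph: existing "bert" variables plus new "bilstm" variables.
with tf.Graph().as_default():
    with tf.variable_scope("bert"):
        bert_w = tf.get_variable("w", shape=[1])
    with tf.variable_scope("bilstm"):
        new_w = tf.get_variable("w", initializer=tf.zeros([1]))

    bert_vars = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope="bert")
    new_vars = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope="bilstm")

    with tf.Session() as sess:
        # Restore only the existing variables from the checkpoint ...
        tf.train.Saver(var_list=bert_vars).restore(sess, ckpt_prefix)
        # ... and initialize only the new ones.
        sess.run(tf.variables_initializer(new_vars))
        bert_value, new_value = sess.run([bert_w, new_w])
```

In a real training graph, optimizers such as Adam also create their own variables, and those would need to be initialized together with the new ones.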
Question 3: How to get the output and weights in existing model
As in the example above, the output of Bert is the input of the BiLSTM, so we need to get the output tensor of Bert from the existing graph.
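A common way to fetch a tensor from an existing graph is graph.get_tensor_by_name(), while weights can be found by filtering tf.global_variables() on the variable name. The sketch below uses a tiny matmul as a stand-in for Bert; the names `bert/output:0`, `bert/w:0`, and `bilstm/proj` are hypothetical and depend on how the real graph was built.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumption: TF1-style graph mode

with tf.Graph().as_default() as graph:
    # Stand-in for the existing model; the real Bert graph has its own names.
    with tf.variable_scope("bert"):
        inputs = tf.placeholder(tf.float32, [None, 4], name="inputs")
        w = tf.get_variable("w", initializer=tf.ones([4, 3]))
        tf.matmul(inputs, w, name="output")

    # Look up the output tensor and the weight variable by name.
    bert_output = graph.get_tensor_by_name("bert/output:0")
    bert_w = [v for v in tf.global_variables() if v.name == "bert/w:0"][0]

    # Feed the existing output into the new layer (a projection stands in for the BiLSTM).
    with tf.variable_scope("bilstm"):
        proj = tf.get_variable("proj", initializer=tf.ones([3, 1]))
        logits = tf.matmul(bert_output, proj)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        out, weights = sess.run(
            [logits, bert_w], feed_dict={inputs: [[1.0, 1.0, 1.0, 1.0]]}
        )
```

If you do not know the tensor names, listing `[op.name for op in graph.get_operations()]` is one way to discover them.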
Question 4: How to train models with different learning rate
You may plan to train your BiLSTM with a learning rate of 1e-3 and fine-tune Bert with a learning rate of 1e-5. To implement this strategy, you can read:
Train Multiple Neural Layers with Different Learning Rate – TensorFlow Tutorial
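One way to sketch this strategy is to create two optimizers, give each its own var_list, and group the two training ops. Plain gradient descent is used here only to keep the update easy to check; the scope names and the toy loss are assumptions, and any other optimizer could be swapped in.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumption: TF1-style graph mode

with tf.Graph().as_default():
    with tf.variable_scope("bert"):
        bert_w = tf.get_variable("w", initializer=tf.constant(1.0))
    with tf.variable_scope("bilstm"):
        lstm_w = tf.get_variable("w", initializer=tf.constant(1.0))

    # Toy loss whose gradient is 1.0 with respect to each variable.
    loss = bert_w + lstm_w

    bert_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="bert")
    lstm_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="bilstm")

    # One optimizer per variable group, each with its own learning rate.
    train_bert = tf.train.GradientDescentOptimizer(1e-5).minimize(
        loss, var_list=bert_vars
    )
    train_lstm = tf.train.GradientDescentOptimizer(1e-3).minimize(
        loss, var_list=lstm_vars
    )
    train_op = tf.group(train_bert, train_lstm)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(train_op)
        # How far each variable moved after one step.
        bert_step, lstm_step = sess.run([1.0 - bert_w, 1.0 - lstm_w])
```

After one step, each variable has moved by its own learning rate times the gradient, which shows that the two groups are updated independently.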
Question 5: How to get stable results
In tensorflow, we can set a random seed to make the results reproducible. Here is the tutorial:
A Beginner Guide to Get Stable Result in TensorFlow – TensorFlow Tutorial
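As a small illustration, the sketch below sets a graph-level seed with tf.set_random_seed() and checks that two runs produce the same initial weights. The helper name `init_weights` and the seed value are hypothetical.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumption: TF1-style graph mode


def init_weights(seed):
    """Build a fresh graph with a fixed seed and return the initial weights."""
    with tf.Graph().as_default():
        # Graph-level seed: must be set before the random ops are created.
        tf.set_random_seed(seed)
        w = tf.get_variable(
            "w", shape=[2, 3], initializer=tf.random_normal_initializer()
        )
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            return sess.run(w)


first = init_weights(42)
second = init_weights(42)
same = bool((first == second).all())  # identical initial weights across runs
```

A fixed seed covers TensorFlow's own random ops only. If your input pipeline shuffles data with numpy or Python's random module, those need their own seeds, and some GPU operations may still be non-deterministic.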
Question 6: How to save tensorflow model
In tensorflow, we can use saver.save() to save a tensorflow model. Here is the example code:
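# The checkpoint prefix is <checkpoint_dir><model_name>-<test_acc>; saver.save() appends "-<global_step>" to it.
path = saver.save(sess, checkpoint_dir + model_name + "-" + str(round(test_acc, 4)), global_step=current_step)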