Understand tf.get_variable(): A Beginner Guide – TensorFlow Tutorial

By | November 8, 2019

tf.get_variable() is often used to get or create a TensorFlow variable. In this tutorial, we will discuss how TensorFlow beginners can use it correctly.

Syntax of tf.get_variable()

tf.get_variable(
    name,
    shape=None,
    dtype=None,
    initializer=None,
    regularizer=None,
    trainable=True,
    collections=None,
    caching_device=None,
    partitioner=None,
    validate_shape=True,
    use_resource=None,
    custom_getter=None,
    constraint=None
)

tf.get_variable() is a Python function, not a Python class.

It gets an existing variable with these parameters or creates a new one.

Notice: tf.get_variable() only returns an existing variable if that variable was created by tf.get_variable() in the same variable scope, defined by tf.variable_scope(scope_name), and the scope is entered with reuse=True or reuse=tf.AUTO_REUSE.
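The rule in the notice above can be checked with a short sketch. It assumes TensorFlow 1.14+ or 2.x, so it imports the 1.x API through tensorflow.compat.v1 and builds everything in a fresh graph. The scope name 'layer' and the flag names are chosen only for illustration.

```python
import tensorflow.compat.v1 as tf

with tf.Graph().as_default():
    # Default scope (reuse not enabled): the first call creates 'layer/w'.
    with tf.variable_scope('layer'):
        a = tf.get_variable('w', shape=[2, 2])
        try:
            # Asking for the same name again without reuse is an error.
            tf.get_variable('w', shape=[2, 2])
            second_call_failed = False
        except ValueError:
            second_call_failed = True

    # Re-entering the same scope with reuse=True returns the existing variable.
    with tf.variable_scope('layer', reuse=True):
        b = tf.get_variable('w', shape=[2, 2])

    same_variable = a is b

print(second_call_failed)  # True
print(same_variable)       # True
print(b.name)              # layer/w:0
```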

Parameters explained

For tf.get_variable(), the following parameters are the most important.

name, trainable and dtype have the same meaning as in tf.Variable(), and initializer plays the role of initial_value in tf.Variable().

Understand tf.Variable(): A Beginner Guide

Like tf.Variable(), if you use tf.get_variable() to create a new variable, this variable is added to the tf.GraphKeys.GLOBAL_VARIABLES collection. If you set trainable = True, it is also added to tf.GraphKeys.TRAINABLE_VARIABLES. If you set trainable = False, it is only added to tf.GraphKeys.GLOBAL_VARIABLES.
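As a small check of the collection behavior described above, the sketch below creates one trainable and one non-trainable variable and lists both collections. It assumes TensorFlow 1.14+ or 2.x (hence the tensorflow.compat.v1 import). The variable names 'kernel' and 'step' are made up for this example.

```python
import tensorflow.compat.v1 as tf

with tf.Graph().as_default():
    # trainable=True: goes into GLOBAL_VARIABLES and TRAINABLE_VARIABLES.
    kernel = tf.get_variable('kernel', shape=[2, 2],
                             initializer=tf.zeros_initializer(), trainable=True)
    # trainable=False: goes into GLOBAL_VARIABLES only.
    step = tf.get_variable('step', shape=[],
                           initializer=tf.zeros_initializer(), trainable=False)

    global_names = [v.name for v in tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)]
    trainable_names = [v.name for v in tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)]

print(global_names)     # ['kernel:0', 'step:0']
print(trainable_names)  # ['kernel:0']
```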

For example:

v = tf.get_variable('w', initializer=tf.random_normal(shape=[2,2], mean=0, stddev=1))

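First, tf.get_variable() will try to find a variable named 'w' among the variables that were created by tf.get_variable() in the current variable scope.

If it cannot find an existing one, it will create a new TensorFlow variable named 'w' (unless the scope was entered with reuse=True, in which case it raises a ValueError).

If it finds one and the scope allows reuse, it will return this variable. If it finds one but reuse is not enabled, it raises a ValueError.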
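The get-or-create process can be pictured with a toy model in plain Python. This is not TensorFlow's implementation. ToyVariableStore and its reuse values (False, True, or the string 'auto' standing in for tf.AUTO_REUSE) are hypothetical names used only to illustrate the lookup logic.

```python
class ToyVariableStore:
    """Simplified, hypothetical stand-in for the bookkeeping behind tf.get_variable()."""

    def __init__(self):
        self._vars = {}  # full variable name -> stored value

    def get_variable(self, name, initial_value=None, reuse=False):
        if name in self._vars:
            if reuse is False:
                # Found, but reuse is not allowed in this scope.
                raise ValueError('Variable %s already exists' % name)
            # Found and reuse is allowed: return it, ignore initial_value.
            return self._vars[name]
        if reuse is True:
            # reuse=True means "only look up, never create".
            raise ValueError('Variable %s does not exist' % name)
        # Not found: create and remember it.
        self._vars[name] = initial_value
        return initial_value


store = ToyVariableStore()
w2 = store.get_variable('v/w', [1.0, 2.0], reuse='auto')  # not found: created
w3 = store.get_variable('v/w', [9.0, 9.0], reuse='auto')  # found: returned as-is
print(w2 is w3)  # True
print(w3)        # [1.0, 2.0]
```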

We will use an example to show this process.

Create two TensorFlow variables

import tensorflow as tf
import numpy as np

# w1 is created directly with tf.Variable()
w1 = tf.Variable(tf.random_normal(shape=[2,2], mean=0, stddev=1), name='w')
# w2 is requested by name; the initial value must be passed as initializer=
w2 = tf.get_variable('w', initializer=tf.random_normal(shape=[3,3], mean=0, stddev=1))

In this code, we have created a variable w1 whose name is 'w'. Then, we try to get a variable by the name 'w'.

Are w1 and w2 the same?

Display w1 and w2

with tf.Session() as sess:  
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
   
    print("w1 = ")
    print(w1.name)
    print(w1.eval())
    print("w2 = ")
    print(w2.name)
    print(w2.eval())

The output is:

w1 = 
w:0
[[-1.96469784  0.63282955]
 [ 0.28358859  0.86019534]]
w2 = 
w_1:0
[[ 0.49888042  1.6399796  -0.58938736]
 [-0.11313031 -0.52823818 -0.5171479 ]
 [-0.29777986  0.93977857  0.44501412]]

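From the result, we can see that w1 and w2 are not the same variable.

Why are w1 and w2 different, even though both were given the name 'w' when they were created?

Because w1 was not created by tf.get_variable(), so tf.get_variable() cannot find it, and we have not defined a variable scope that allows reuse. tf.get_variable() therefore creates a new variable, and TensorFlow renames it to w_1 to keep names in the graph unique.

Look at the example below, where we try to create three TensorFlow variables.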
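To see that a variable made with tf.Variable() is invisible to tf.get_variable(), the sketch below asks for 'w' in lookup-only mode (reuse=True). It assumes TensorFlow 1.14+ or 2.x and uses the tensorflow.compat.v1 import. The flag name found is only illustrative.

```python
import tensorflow.compat.v1 as tf

with tf.Graph().as_default():
    # Created directly, so it is not registered for tf.get_variable() lookups.
    w1 = tf.Variable(tf.zeros([2, 2]), name='w')

    # Re-enter the current (root) scope in lookup-only mode.
    with tf.variable_scope(tf.get_variable_scope(), reuse=True):
        try:
            tf.get_variable('w', shape=[2, 2])
            found = True
        except ValueError:
            # 'w' exists in the graph, but not among get_variable() variables.
            found = False

print(w1.name)  # w:0
print(found)    # False
```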

import tensorflow as tf
import numpy as np   


with tf.variable_scope('v', reuse=tf.AUTO_REUSE):
    w1 = tf.Variable(tf.random_normal(shape=[2,2], mean=0, stddev=1), name='w')
    w2 = tf.get_variable(name = 'w',initializer = tf.random_normal(shape=[3,3], mean=0, stddev=1))
    w3 = tf.get_variable(name = 'w',initializer = tf.random_normal(shape=[4,4], mean=0, stddev=1))

Are w1, w2 and w3 the same?

The answer is: w1 ≠ w2 and w2 = w3.

Why?

1. w1 is created by tf.Variable() and w2 is created by tf.get_variable(), so w2 cannot reuse w1.

2. The variable scope uses reuse=tf.AUTO_REUSE and w2 has already been created by tf.get_variable(). When w3 is requested with the same name, tf.get_variable() returns the existing variable instead of creating a new one, so w2 = w3. The [4,4] initializer passed for w3 is ignored.

with tf.Session() as sess:  
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    
    print("w1 = ")
    print(w1.name)
    print(w1.eval())
    print("w2 = ")
    print(w2.name)
    print(w2.eval())
   
    print("w3 = ")
    print(w3.name)
    print(w3.eval())

The output confirms this:

w1 = 
v/w:0
[[-1.75778663  0.67832196]
 [ 0.27338189 -1.11548007]]
w2 = 
v/w_1:0
[[ 1.28037763 -1.75218618  0.46354058]
 [-0.3378973   1.29323447  0.26210344]
 [-0.95706284 -0.41852772 -0.34168029]]
w3 = 
v/w_1:0
[[ 1.28037763 -1.75218618  0.46354058]
 [-0.3378973   1.29323447  0.26210344]
 [-0.95706284 -0.41852772 -0.34168029]]

w1 ≠ w2 and w2 = w3

In general, if you want to use tf.get_variable() correctly, you should use it together with tf.variable_scope() and an appropriate reuse setting.
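One common way to combine the two is to wrap the variables of a layer in a function, so that every call shares the same weights. The sketch below is one possible pattern, not the only one. It assumes TensorFlow 1.14+ or 2.x (tensorflow.compat.v1 import), and the helper dense() and the scope name 'dense' are hypothetical.

```python
import tensorflow.compat.v1 as tf


def dense(x, units):
    """Hypothetical helper: a linear layer whose weights live in the 'dense' scope."""
    with tf.variable_scope('dense', reuse=tf.AUTO_REUSE):
        # Created on the first call, reused on every later call.
        w = tf.get_variable('w', shape=[int(x.shape[-1]), units])
        b = tf.get_variable('b', shape=[units], initializer=tf.zeros_initializer())
    return tf.matmul(x, w) + b


with tf.Graph().as_default():
    x1 = tf.placeholder(tf.float32, shape=[None, 3])
    x2 = tf.placeholder(tf.float32, shape=[None, 3])
    y1 = dense(x1, 4)  # creates dense/w and dense/b
    y2 = dense(x2, 4)  # reuses the same two variables
    variable_names = sorted(v.name for v in tf.global_variables())

print(variable_names)  # ['dense/b:0', 'dense/w:0']
```

Calling dense() twice leaves only two variables in the graph, which is the sharing behavior that tf.AUTO_REUSE is meant to provide.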
