Understand torch.nn.init.xavier_uniform_() and torch.nn.init.xavier_normal_() with Examples – PyTorch Tutorial

By | May 6, 2022

Xavier initialization comes in two variants: uniform and normal. In PyTorch, they are:

  • uniform: torch.nn.init.xavier_uniform_()
  • normal: torch.nn.init.xavier_normal_()

Both fill a PyTorch tensor in place, but they draw the values from different distributions.

Note: if you are using TensorFlow, you can also read:

Initialize TensorFlow Weights Using Xavier Initialization : A Beginner Guide – TensorFlow Tutorial


torch.nn.init.xavier_uniform_()

It is defined as:

torch.nn.init.xavier_uniform_(tensor, gain=1.0)

This function fills the tensor in place with values sampled from \(U(-a, a)\),

where

\(a = \text{gain} \times \sqrt{\frac{6}{\text{fan\_in} + \text{fan\_out}}}\)

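gain is an optional scaling factor. It can be computed for a given nonlinearity with torch.nn.init.calculate_gain().

fan_in and fan_out are the numbers of input and output units of the tensor. For the weight of torch.nn.Linear, which has shape (out_features, in_features), fan_in is in_features and fan_out is out_features.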
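As a quick check of the bound, the sketch below initializes a tensor and compares its extreme values with a computed by hand from \(a = \text{gain} \times \sqrt{6 / (\text{fan\_in} + \text{fan\_out})}\), the formula given in the PyTorch documentation. The shape (50, 100), the ReLU gain and the variable names are chosen only for illustration.

```python
import math
import torch

# Illustrative 2-D tensor with shape (fan_out, fan_in) = (50, 100),
# the same layout as the weight of a Linear layer.
w = torch.empty(50, 100)

# Gain for ReLU is sqrt(2); any other supported nonlinearity would work too.
gain = torch.nn.init.calculate_gain("relu")
torch.nn.init.xavier_uniform_(w, gain=gain)

# Bound computed by hand from the formula above.
fan_out, fan_in = w.shape
a = gain * math.sqrt(6.0 / (fan_in + fan_out))

print(a)                                # the expected bound
print(w.min().item(), w.max().item())   # both should lie inside [-a, a]
```

If the formula is applied correctly, the printed minimum and maximum should never fall outside the interval from -a to a.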

torch.nn.init.xavier_normal_()

This function is defined as:

torch.nn.init.xavier_normal_(tensor, gain=1.0)

Like torch.nn.init.xavier_uniform_(), this function fills the tensor in place, but the values are sampled from \(N(0, \text{std}^2)\),

where

\(\text{std} = \text{gain} \times \sqrt{\frac{2}{\text{fan\_in} + \text{fan\_out}}}\)
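The standard deviation can be checked empirically in the same way. This is a minimal sketch that assumes \(\text{std} = \text{gain} \times \sqrt{2 / (\text{fan\_in} + \text{fan\_out})}\), as given in the PyTorch documentation. The tensor shape and the seed are arbitrary choices; the tensor is made large only so that the sample statistics land close to the target.

```python
import math
import torch

torch.manual_seed(0)  # only to make the run repeatable

# Large illustrative tensor with shape (fan_out, fan_in) = (500, 1000).
w = torch.empty(500, 1000)
torch.nn.init.xavier_normal_(w, gain=1.0)

# Target standard deviation computed by hand (gain is 1.0 here).
fan_out, fan_in = w.shape
std = math.sqrt(2.0 / (fan_in + fan_out))

print(std)                              # target standard deviation
print(w.mean().item(), w.std().item())  # sample mean and sample std
```

The sample mean should be close to 0 and the sample standard deviation close to the computed std. They will not match exactly, because the values are random draws.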

How to use torch.nn.init.xavier_uniform_() and torch.nn.init.xavier_normal_()?

Both functions are easy to use. Here is an example:

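import torch

in_dim = 100
out_dim = 50
linear_layer = torch.nn.Linear(in_dim, out_dim, bias=True)
# Weight as created by the default initialization of torch.nn.Linear
print(linear_layer.weight)
# Re-initialize the weight in place; the gain for "linear" is 1.0
torch.nn.init.xavier_uniform_(
    linear_layer.weight,
    gain=torch.nn.init.calculate_gain("linear"))
# Weight after Xavier uniform initialization
print(linear_layer.weight)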

Running this code prints output similar to the following (the exact values differ from run to run):

Parameter containing:
tensor([[-0.0030, -0.0031,  0.0433,  ..., -0.0645,  0.0269,  0.0951],
        [-0.0520,  0.0277, -0.0528,  ..., -0.0348,  0.0114, -0.0755],
        [ 0.0827,  0.0281, -0.0810,  ..., -0.0709,  0.0492, -0.0836],
        ...,
        [-0.0325, -0.0521, -0.0286,  ...,  0.0039,  0.0762,  0.0317],
        [ 0.0968,  0.0857, -0.0362,  ...,  0.0049,  0.0345,  0.0693],
        [-0.0134,  0.0613, -0.0942,  ...,  0.0269,  0.0531,  0.0927]],
       requires_grad=True)
Parameter containing:
tensor([[-0.0781,  0.0376,  0.0442,  ...,  0.1334,  0.1347,  0.0445],
        [ 0.0806, -0.0572, -0.0770,  ..., -0.1106, -0.0826, -0.1175],
        [-0.0429,  0.0161,  0.0430,  ..., -0.1173,  0.1779,  0.0881],
        ...,
        [ 0.1457,  0.1077, -0.0646,  ..., -0.1134,  0.1504,  0.0044],
        [ 0.0993,  0.0520,  0.0589,  ...,  0.1021,  0.0407, -0.1307],
        [-0.1907,  0.0018,  0.0224,  ...,  0.1684,  0.0456,  0.1235]],
       requires_grad=True)

As the output shows, torch.nn.init.xavier_uniform_() changes the values of the weight of torch.nn.Linear in place.
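The two printed tensors also differ in range, which the numbers explain. With in_dim = 100, out_dim = 50 and a gain of 1.0, the Xavier bound is \(\sqrt{6 / 150} = 0.2\). According to the PyTorch documentation for torch.nn.Linear, the default weights come from \(U(-\sqrt{k}, \sqrt{k})\) with \(k = 1 / \text{in\_features}\), which gives a bound of 0.1 here. The sketch below repeats the example and compares the largest absolute weight before and after; the variable names are illustrative only.

```python
import math
import torch

in_dim, out_dim = 100, 50
layer = torch.nn.Linear(in_dim, out_dim, bias=True)

# Largest absolute weight under the default initialization.
default_max = layer.weight.abs().max().item()

# Re-initialize in place with Xavier uniform (gain for "linear" is 1.0).
torch.nn.init.xavier_uniform_(
    layer.weight,
    gain=torch.nn.init.calculate_gain("linear"))
xavier_max = layer.weight.abs().max().item()

# Xavier bound computed by hand: sqrt(6 / (100 + 50)) = 0.2
a = math.sqrt(6.0 / (in_dim + out_dim))

print(default_max, xavier_max, a)
```

Only the weight is re-initialized in this sketch. The bias keeps its default values, since Xavier initialization needs a tensor with at least two dimensions to compute fan_in and fan_out.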
