Implement Reproducibility in PyTorch Lightning

In PyTorch Lightning, it is easy to make a deep learning model reproducible. This tutorial shows how.

How to implement reproducibility in PyTorch Lightning?

Here is an example:

from pytorch_lightning import Trainer, seed_everything

# Seeds python.random, numpy and torch, and sets the PL_GLOBAL_SEED environment variable.
seed_everything(42, workers=True)
model = Model()  # Model is your own LightningModule subclass
trainer = Trainer(deterministic=True)

With workers=True in seed_everything(), Lightning derives unique seeds across all dataloader workers and processes for the torch, numpy and stdlib random number generators. This ensures that, for example, data augmentations are not repeated across workers.
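To see why per-worker seeds matter, here is a minimal sketch that uses only the Python standard library. It is not how Lightning derives its worker seeds. The helper names worker_seed and sample_augmentations, and the rule "base seed plus worker index", are assumptions made up for this illustration.

```python
import random

BASE_SEED = 42  # the same value passed to seed_everything() above


def worker_seed(base_seed: int, worker_id: int) -> int:
    """Hypothetical derivation: offset the base seed by the worker index."""
    return base_seed + worker_id


def sample_augmentations(base_seed: int, worker_id: int, n: int = 3) -> list:
    """Draw n random 'augmentation' parameters from the worker's own generator."""
    rng = random.Random(worker_seed(base_seed, worker_id))
    return [round(rng.uniform(0.0, 1.0), 4) for _ in range(n)]


# Two workers that start from the same seed draw identical parameters,
# so the same augmentations are repeated.
shared = [sample_augmentations(BASE_SEED, 0) for _ in range(2)]

# Two workers with their own derived seeds draw different parameters,
# and the result is still fully determined by BASE_SEED.
per_worker = [sample_augmentations(BASE_SEED, wid) for wid in range(2)]

print(shared[0] == shared[1])
print(per_worker[0] == per_worker[1])
```

The first comparison is true and the second is false: the run stays reproducible from one base seed, but the workers no longer duplicate each other.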

Also note deterministic=True in the Trainer. It tells PyTorch to use deterministic algorithms, so the same seed gives the same result on every run.

However, if you only want to set the random seed for Python, NumPy and PyTorch, and you are not training the model with PyTorch Lightning, seed_everything() alone is enough.

This is because seed_everything() is essentially implemented as follows:

os.environ["PL_GLOBAL_SEED"] = str(seed)
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed_all(seed)
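If you do not want to depend on PyTorch Lightning at all, the lines above can be wrapped in a small helper of your own. This is only a sketch. The name set_seed is hypothetical, and skipping NumPy or PyTorch when they are not installed is a choice made for this example, not something taken from Lightning.

```python
import os
import random


def set_seed(seed: int) -> None:
    """Seed Python's random module, plus NumPy and PyTorch when installed."""
    os.environ["PL_GLOBAL_SEED"] = str(seed)  # mirrors the excerpt above
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is not installed, nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored without CUDA
    except ImportError:
        pass  # PyTorch is not installed, nothing to seed


# Seeding twice with the same value reproduces the same random numbers.
set_seed(42)
first = [random.random() for _ in range(3)]
set_seed(42)
second = [random.random() for _ in range(3)]
print(first == second)  # True
```

Call set_seed() once at the start of the script, before creating the model or the data loaders.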