Fix: A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set padding_side='left' when initializing the tokenizer

By | April 17, 2024

When running inference with an LLM, you may see this warning: A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set padding_side='left' when initializing the tokenizer. This tutorial shows how to fix it.

The warning looks like this:

A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set padding_side='left' when initializing the tokenizer

How to fix it?

First, set padding_side='left' when initializing the tokenizer. This is especially important if you plan to run batch inference.

For example:

[Image: tokenizer initialized with padding_side='left']
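As a minimal sketch of this step, the helper below loads a tokenizer with left padding through the Hugging Face transformers library. The function name load_left_padded_tokenizer is hypothetical, and reusing the EOS token as the pad token is an assumption that suits many decoder-only checkpoints but may not suit yours.

```python
def load_left_padded_tokenizer(model_name: str):
    """Load a tokenizer that pads on the left, as decoder-only generation expects."""
    # Imported lazily so that defining this helper does not require loading the library.
    from transformers import AutoTokenizer

    # padding_side="left" puts pad tokens before the prompt instead of after it.
    tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side="left")

    # Assumption: if the checkpoint defines no pad token, fall back to the EOS token.
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    return tokenizer
```

For instance, you might call load_left_padded_tokenizer("gpt2") (the model name is only an example) and then encode a batch with tokenizer(prompts, padding=True, return_tensors="pt") before passing it to generate.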

Related: LLM Train and Inference vs Right Padding and Left Padding
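To see why the padding side matters for generation, here is a small library-free illustration. A decoder-only model continues from the last position of each row, so with right padding a short prompt ends in pad tokens rather than in its own last real token. The pad_batch function and the token ids are made up for this sketch; a real tokenizer does this work for you.

```python
def pad_batch(sequences, pad_id, side="left"):
    """Pad lists of token ids to a common length on the given side."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    max_len = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        padding = [pad_id] * (max_len - len(seq))
        # Left padding keeps the real tokens at the end of the row.
        padded.append(padding + list(seq) if side == "left" else list(seq) + padding)
    return padded


batch = [[11, 12, 13], [21]]  # made-up token ids for two prompts
left = pad_batch(batch, pad_id=0, side="left")    # [[11, 12, 13], [0, 0, 21]]
right = pad_batch(batch, pad_id=0, side="right")  # [[11, 12, 13], [21, 0, 0]]
```

With left padding every row ends in a real token, so generated tokens follow the prompt directly. With right padding the second row ends in pad tokens, which is the situation the warning describes.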

However, in our case the warning still appeared after this change.

Once you have confirmed that the tokenizer's padding side is left, which is correct for inference, you can hide the warning.

Add the code below to your script.

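from transformers.utils import logging
# Only show errors from transformers. Note: this hides all of its warnings, not just the padding one.
logging.get_logger("transformers").setLevel(logging.ERROR)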
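The logging call above raises the level of the whole transformers logger. If you would rather keep other warnings visible, one possible alternative is a standard-library logging filter that drops only this message. This is a sketch under assumptions: the class name DropMessageFilter is hypothetical, and the logger name "transformers.generation.utils" is a guess at where the warning is emitted, which may differ between library versions.

```python
import logging


class DropMessageFilter(logging.Filter):
    """Drop log records whose message contains a given text fragment."""

    def __init__(self, fragment):
        super().__init__()
        self.fragment = fragment

    def filter(self, record):
        # Returning False tells the logger to discard the record.
        return self.fragment not in record.getMessage()


PADDING_WARNING = "right-padding was detected"

# Assumption: the warning comes from this logger; adjust the name for your version.
logging.getLogger("transformers.generation.utils").addFilter(
    DropMessageFilter(PADDING_WARNING)
)
```

A filter attached to a logger only applies to records logged directly on that logger, so if the warning still shows up, the logger name is the first thing to check.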

After this, the warning no longer appears.