When fine-tuning an LLM, we may encounter this error: use_cache=True is incompatible with gradient checkpointing. In this tutorial, we will show you how to fix it.
What is this error?
The error looks like this:
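When training with Hugging Face Transformers, the message is typically reported along these lines (the exact wording may vary by version):

`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...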
How to fix this error?
First, we should set use_cache=False when loading the LLM.
For example:
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    args.model_name_or_path,
    device_map=device_map,
    load_in_4bit=True,
    torch_dtype=torch.float16,
    trust_remote_code=True,
    use_cache=False,  # disable the KV cache so it does not conflict with gradient checkpointing
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
        bnb_4bit_use_double_quant=True,
        bnb_4bit_quant_type="nf4",
        llm_int8_threshold=6.0,
        llm_int8_has_fp16_weight=False,
    ),
)
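If the model has already been loaded, you can also turn the cache off on its config before enabling gradient checkpointing. Here is a minimal sketch, assuming your model object is named model:

model.config.use_cache = False         # disable the KV cache used for generation
model.gradient_checkpointing_enable()  # trade extra compute for lower memory during training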
Second, if you have installed flash-attn, you can uninstall it.
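For example, assuming flash-attn was installed with pip, you can remove it like this:

pip uninstall flash-attn

Then restart your training script so the change takes effect.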
After these changes, you will find the error is fixed.