Use the same init settings (attention implementation and dtype) for the LlamaForCausalLM instantiation (B):
# B) Load with LlamaForCausalLM + config, using the same settings as in A)
config._attn_implementation = "eager"  # same attention backend as in A)
model_llama = LlamaForCausalLM(config).to(dtype).cuda()
If you don't apply the same settings at instantiation time, the model falls back to the library defaults (for example a different attention backend and the default float32 dtype), so the two models won't behave identically.
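For reference, here is a minimal end-to-end sketch of the two loading paths with matching settings. The checkpoint name and the dtype value are placeholders for illustration, and setting the attention backend via the config's `_attn_implementation` attribute assumes a recent transformers version; adjust to your setup.

import torch
from transformers import AutoConfig, LlamaForCausalLM

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder checkpoint; use your own
dtype = torch.float16                  # placeholder; use the dtype from A)

# A) Load pretrained weights with an explicit attention backend and dtype
model_a = LlamaForCausalLM.from_pretrained(
    model_id, attn_implementation="eager", torch_dtype=dtype
).cuda()

# B) Instantiate from the config with the same settings
# (note: this path initializes weights from scratch, it does not load the checkpoint)
config = AutoConfig.from_pretrained(model_id)
config._attn_implementation = "eager"  # match A); otherwise the library default is used
model_b = LlamaForCausalLM(config).to(dtype).cuda()

# Both models should now report the same attention backend and parameter dtype
print(model_a.config._attn_implementation, model_b.config._attn_implementation)
print(next(model_a.parameters()).dtype, next(model_b.parameters()).dtype)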