Dig into the code for the Init Flux LoRA Training node and look at where it builds the blocks when blocks_to_swap > 0. Somewhere in there it's probably creating a tensor (or loading weights) without moving it to the GPU via .to("cuda"). You can try manually forcing .to("cuda") on any tensors or models it creates, especially right after blocks_to_swap gets used. If that doesn't help, wrap that section in a device check like if tensor.device != target_model.device: tensor = tensor.to(target_model.device) just to be safe, so everything ends up on the same device as the model.
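
As a rough sketch of the pattern (I haven't looked at the node's actual internals, so ensure_on_device, target_model, and swap_blocks below are placeholder names, not the node's real variables), something like this is what you'd be aiming for:

```python
import torch

def ensure_on_device(obj, device: torch.device):
    """Move a tensor or module to `device` only if it isn't already there."""
    if isinstance(obj, torch.Tensor):
        return obj if obj.device == device else obj.to(device)
    if isinstance(obj, torch.nn.Module):
        # Modules move in place; check the first parameter's device to avoid a redundant copy.
        param = next(obj.parameters(), None)
        if param is not None and param.device != device:
            obj.to(device)
        return obj
    return obj

# Hypothetical usage inside the block-building section that runs when
# blocks_to_swap > 0 (assuming target_model is the Flux model being trained
# and swap_blocks is whatever list of blocks/tensors the node creates there):
#
# target_device = next(target_model.parameters()).device
# for i, block in enumerate(swap_blocks):
#     swap_blocks[i] = ensure_on_device(block, target_device)
```

The idea is just to normalize everything onto the model's device at the point where the swap blocks get built, rather than hardcoding "cuda", so it also behaves sanely if the model happens to live on CPU or a different GPU index.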