Background
- The error comes from your multi-GPU training setup, not from the training_step() method, which looks fine at a glance.
- More precisely, you are probably training on more than one GPU, and some part of the model has parameters that do not take part in computing the loss, so they receive no gradients to synchronise across the GPUs. Lightning is saying 'this is unexpected behaviour, so if you mean it, please pass the right argument to Trainer'.
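If you want to check which parameters are involved before changing the Trainer, one rough way is to run a single forward and backward pass on CPU and list the parameters whose gradient is still None. The sketch below uses plain PyTorch; TwoHeadModel and find_unused_parameters are made-up names for illustration, not part of Lightning or of your code.

```python
import torch
from torch import nn


class TwoHeadModel(nn.Module):
    """Hypothetical model: `unused_head` never contributes to the output."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(4, 8)
        self.used_head = nn.Linear(8, 1)
        self.unused_head = nn.Linear(8, 1)  # defined but never called in forward()

    def forward(self, x):
        return self.used_head(torch.relu(self.backbone(x)))


def find_unused_parameters(model, loss):
    """Return names of trainable parameters that received no gradient from `loss`."""
    model.zero_grad(set_to_none=True)  # clear any stale gradients first
    loss.backward()
    return [
        name
        for name, param in model.named_parameters()
        if param.requires_grad and param.grad is None
    ]


model = TwoHeadModel()
batch = torch.randn(2, 4)  # dummy input, only used to build the graph
loss = model(batch).mean()
unused = find_unused_parameters(model, loss)
print(unused)
```

With your own model, build the loss the same way training_step() does. Anything the helper lists is a parameter that DDP would flag.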
Solution
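- Thus, the easiest solution is to tell DDP to look for unused parameters through the strategy argument:
import lightning as L  # assuming the Lightning 2.x package, as the L alias suggests

trainer = L.Trainer(
... # whatever arguments you've set up
strategy="ddp_find_unused_parameters_true",
)
trainer.fit(model=your_model, datamodule=your_datamodule) # substitute your own model and datamodule here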
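The string above is a shortcut. If you would rather be explicit, the same flag can be set on the strategy object itself. This is a minimal sketch assuming Lightning 2.x installed as the lightning package; every other Trainer argument is left out and should stay as you already have it.

```python
import lightning as L
from lightning.pytorch.strategies import DDPStrategy

# Explicit form of the "ddp_find_unused_parameters_true" shortcut.
strategy = DDPStrategy(find_unused_parameters=True)

trainer = L.Trainer(
    strategy=strategy,
    # accelerator, devices, etc.: keep whatever you already pass
)
```

Note that this flag makes DDP traverse the autograd graph on every iteration, which adds some overhead. If the unused parameters are not intentional, removing them from the model (or making sure they contribute to the loss) is the cleaner fix.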