I ran into the same issue and @rok's answer worked for me. However, I wanted to avoid dropping the last batch. According to this thread, the issue seems to be related to parallel and distributed/multi-GPU training. Removing this call to nn.DataParallel worked for me without needing to set drop_last=True in the DataLoader:
model = nn.DataParallel(model)
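
For context, here is a minimal sketch of the two alternatives, using a placeholder model and dataset (the names and sizes are just for illustration, not from the original question):

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset

    # Placeholder model and dataset for illustration only
    model = nn.Linear(10, 2)
    dataset = TensorDataset(torch.randn(101, 10), torch.randint(0, 2, (101,)))

    # Option 1 (what worked for me): keep every batch, including the smaller
    # last one, but do NOT wrap the model in nn.DataParallel
    # model = nn.DataParallel(model)   # <-- this line removed
    loader = DataLoader(dataset, batch_size=8)

    # Option 2 (@rok's answer): keep nn.DataParallel, but drop the last,
    # smaller batch so every batch splits evenly across the GPUs
    # model = nn.DataParallel(model)
    # loader = DataLoader(dataset, batch_size=8, drop_last=True)

The trade-off is that option 1 gives up multi-GPU training, while option 2 silently discards the final, incomplete batch each epoch.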