Check whether the loss requires grad before calling `loss.backward()` by printing `loss.requires_grad`. If it doesn't, check inside the loss calculation function whether `pred_conf[i]` requires grad. From what I can see, your function in `detect.py` converts tensors to NumPy arrays and plain Python values, which breaks the gradient chain. That is most likely why your loss doesn't require grad.
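
As a quick illustration (a minimal sketch, not your actual `detect.py` code), here is how a NumPy round-trip detaches a tensor from the autograd graph, compared with keeping the computation in torch:

```python
import torch

x = torch.randn(4, requires_grad=True)

# Going through NumPy/Python drops the autograd history.
# (Calling .numpy() on a tensor that requires grad raises an error,
# so code usually calls .detach() first, which cuts the graph.)
np_vals = x.detach().numpy() * 2.0
broken = torch.from_numpy(np_vals)
print(broken.requires_grad)   # False -> a loss built from this can't backprop

# Staying in torch keeps the graph intact:
loss = (x * 2.0).sum()
print(loss.requires_grad)     # True
loss.backward()               # gradients flow back to x
print(x.grad)
```

So the fix is to keep the predictions as torch tensors all the way through the loss computation and only convert to NumPy/Python after the backward pass (e.g. for logging or visualization).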