I'd like to provide a rule of thumb based on Satya Prakash Dash's answer:
Prefer torch.inference_mode() for pure inference scenarios, where it gives you the best performance.
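Use torch.no_grad() when working with custom autograd functions, when the resulting tensors need to take part in autograd-tracked computations later on, or when dealing with older code (torch.inference_mode() only exists from PyTorch 1.9 onward).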
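To make the difference concrete, here is a minimal sketch. The toy Linear model, the input shapes, and the variable names are my own assumptions for illustration and are not taken from the referenced answer. It shows that both context managers turn off gradient tracking, but only tensors produced under torch.no_grad() can later be fed into an autograd-tracked computation:

```python
import torch

# Hypothetical toy model and input, used only for illustration.
model = torch.nn.Linear(4, 2)
x = torch.randn(3, 4)

# Pure inference: tracking is off and the result is an "inference tensor".
with torch.inference_mode():
    y_inf = model(x)

# Gradient-free block whose output may re-enter autograd later.
with torch.no_grad():
    y_ng = model(x)

print(y_inf.requires_grad, y_inf.is_inference())  # False True
print(y_ng.requires_grad, y_ng.is_inference())    # False False

# A no_grad output can be used in a later autograd-tracked computation.
w = torch.ones(2, requires_grad=True)
(y_ng * w).sum().backward()

# An inference_mode output cannot: autograd refuses to save it for backward.
rejected = False
try:
    (y_inf * w).sum().backward()
except RuntimeError as err:
    rejected = True
    print("inference tensor rejected:", err)
```

If you do need to reuse an inference tensor in autograd, cloning it outside the inference_mode block gives you a normal tensor.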