I'd like to provide a rule of thumb based on Satya Prakash Dash's answer:
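Prefer torch.inference_mode() for pure inference, where nothing computed inside the block will be needed by autograd later. It skips more autograd bookkeeping than torch.no_grad(), so it is usually the faster option.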
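Use torch.no_grad() when tensors produced inside the block must later take part in autograd (for example, when they are passed into custom autograd functions), or when maintaining older code written before inference_mode existed (it was added in PyTorch 1.9).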
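
As a rough illustration of the difference, here is a minimal sketch. The tiny Linear model, the random input, and the weight tensor w are made up for the example; only the two context managers come from the discussion above.

```python
import torch

# Hypothetical toy model and input, used only for illustration.
model = torch.nn.Linear(4, 2)
x = torch.randn(3, 4)

# Pure inference: the output is an "inference tensor".
with torch.inference_mode():
    y_inf = model(x)

# Gradient tracking disabled, but the output is an ordinary tensor.
with torch.no_grad():
    y_ng = model(x)

print(y_inf.requires_grad, y_inf.is_inference())
print(y_ng.requires_grad, y_ng.is_inference())

# A no_grad result can be fed back into a computation that autograd records.
w = torch.ones(2, requires_grad=True)
(y_ng * w).sum().backward()
print(w.grad)

# An inference_mode result cannot be saved for backward, so this should fail.
w2 = torch.ones(2, requires_grad=True)
try:
    (y_inf * w2).sum().backward()
    inference_failed = False
except RuntimeError as err:
    inference_failed = True
    print("inference tensor rejected by autograd:", err)
```

If you do need to reuse an inference_mode result in autograd, cloning it outside the block (y_inf.clone()) gives you a normal tensor, at the cost of a copy.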