This error was resolved after I reworked my sampling method (see PyTorch Checkpointing Error: Recomputed Tensor Metadata Mismatch in Global Representation with Extra Sampling). The root cause was that I wasn't computing a separate global representation through the forward pass.
I could never have figured this out from the traceback I got, since it mentioned nothing related to the sampling part of the code.