I understand your problem and, tbh, I don't have a clear-cut answer. However, repeatedly using the same validation set, as you propose, is not desirable either, since you risk overfitting your hyperparameters to that specific part of the data.
You could pre-specify the folds and, for each split, impute the validation fold using only what was learned from the training folds. You could also accept that there is a small chance of leakage due to one or two observations ending up in the validation fold. Of course, this depends entirely on your data and your method of imputation. Wildly varying performance across the CV folds is an indicator that leakage is a problem.
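
To make the first option concrete, here is a minimal sketch of per-fold imputation with pre-specified folds. It assumes plain mean imputation and marks missing values with `None`; the helper names (`column_means`, `fill`, `impute_per_fold`) and the toy data are made up for illustration and are not from your setup. The point is only that the imputation statistics are recomputed from the training folds on every split and then applied to the held-out fold.

```python
from statistics import mean


def column_means(rows):
    """Mean of each column, ignoring missing values (None).

    Raises statistics.StatisticsError if a column has no observed values.
    """
    n_cols = len(rows[0])
    return [
        mean(r[j] for r in rows if r[j] is not None)
        for j in range(n_cols)
    ]


def fill(rows, means):
    """Return a copy of rows with None replaced by the given column means."""
    return [
        [means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


def impute_per_fold(rows, fold_ids):
    """Yield (fold, train_filled, valid_filled) for each pre-specified fold."""
    for fold in sorted(set(fold_ids)):
        train = [r for r, f in zip(rows, fold_ids) if f != fold]
        valid = [r for r, f in zip(rows, fold_ids) if f == fold]
        # The means come from the training folds only, so nothing from the
        # validation fold leaks into the imputation.
        means = column_means(train)
        yield fold, fill(train, means), fill(valid, means)


# Toy data: six observations, two features, three pre-specified folds.
rows = [
    [1.0, 10.0],
    [2.0, None],
    [None, 30.0],
    [4.0, 40.0],
    [5.0, None],
    [6.0, 60.0],
]
fold_ids = [0, 0, 1, 1, 2, 2]

results = list(impute_per_fold(rows, fold_ids))
for fold, train_filled, valid_filled in results:
    print(fold, valid_filled)
```

If you use scikit-learn, I believe you get the same effect by putting the imputer and the model in a single `Pipeline` and passing that to the cross-validation routine, since the whole pipeline is then refit on the training part of each split.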