The way batch_size works is still hard for me to predict without digging through the source code, which I would like to avoid for now. I supply 63 configurations (9 values of num.trees × 7 values of mtry), each resampled with 3-fold CV, for a total of 189 resampling iterations. The terminator is trm("none"), and I run the job on 30 cores. If the batch_size parameter determines exactly how many evaluations (configuration–fold pairs) run in parallel, then setting it to, say, 50 should split the 189 iterations into four batches. When I call this (code below), the returned info says that I actually get two batches, evaluating 33/31 configurations and 96/93 resampling iterations. Any other batch_size also leads to a split of iterations I cannot predict. How does this load balancing actually work?
library(mlr3verse)  # loads mlr3, mlr3tuning, mlr3learners, and paradox

tune(
  task = task,
  tuner = tnr("grid_search", batch_size = 50),
  learner = lrn("regr.ranger", importance = "permutation", num.threads = 8),
  resampling = rsmp("cv", folds = 3),
  measures = msr("regr.mae"),
  terminator = trm("none"),
  search_space = ps(
    num.trees = p_fct(seq(100, 500, 50)),  # 9 levels
    mtry = p_fct(seq(3, 9, 1))             # 7 levels
  )
)
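
For reference, this is the arithmetic behind my expectation versus what I observe. It is only a sketch of my own assumptions about what batch_size might count (configurations or resampling iterations), not of how mlr3tuning actually chunks the work; the numbers mirror the setup above:

n_configs  = 9 * 7                 # 63 grid configurations
n_folds    = 3                     # 3-fold CV
n_iters    = n_configs * n_folds   # 189 resampling iterations
batch_size = 50

ceiling(n_iters / batch_size)      # 4 batches, if batch_size counted resampling iterations
ceiling(n_configs / batch_size)    # 2 batches, if batch_size counted configurations

# Observed: 2 batches with 96 and 93 resampling iterations.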