I have not been able to find a satisfying answer to submitting tasks on multiple nodes using job steps. However, I found that in my case (multiple identical runs), what works really well is to submit a single job step split into many tasks. The batch script would then look like:
#!/bin/sh
#SBATCH --partition parallel
#SBATCH --ntasks=100
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=100M
#SBATCH --job-name test
#SBATCH --output test.out
srun -n100 -u exec.sh
with the executable script exec.sh containing expressions with the variable $SLURM_PROCID to differentiate between the tasks. For example:
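#!/bin/sh
# SLURM_PROCID is the rank of this task within the job step (0 to ntasks-1)
echo "$SLURM_PROCID"
# Placeholder for the actual work of the task
sleep 1200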
This results in the desired behavior, but as far as I understand, it has some drawbacks compared to submitting separate job steps when it comes to controlling each task independently. Until a better alternative is found, though, this is the only approach that seems to work for this use case.
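To make the role of $SLURM_PROCID a bit more concrete, here is a minimal sketch of how exec.sh might turn the task rank into per-task settings. The helper name task_seed, the base value 1000, and the result_N.txt file names are made up for illustration and are not part of Slurm; the only thing taken from Slurm is the SLURM_PROCID variable itself, and the script falls back to rank 0 if it is run outside of srun.

```shell
#!/bin/sh
# Hypothetical sketch: derive per-task settings from the task rank.

# task_seed RANK
# Maps a task rank to a distinct integer seed (illustrative helper, not a Slurm feature).
task_seed() {
    echo $((1000 + $1))
}

# srun sets SLURM_PROCID to 0 .. ntasks-1; default to 0 when run by hand.
rank="${SLURM_PROCID:-0}"
seed="$(task_seed "$rank")"

# Hypothetical per-task output file, so that tasks do not overwrite each other.
outfile="result_${rank}.txt"

echo "task ${rank}: seed=${seed}, output=${outfile}"
```

Launched with the srun line from the batch script, each of the 100 tasks would then get its own seed and output file name, which is usually enough to tell otherwise identical runs apart.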