Restart Jobs and Preemption
Updated: October 27, 2021
The restart partition will allow you unlimited use of idle cores, with the caveat that jobs running in this partition will be terminated if priority jobs (jobs in the campus partition) require the use of those cores. This process is called "preemption": the job in the restart partition is preempted by a job in the higher-priority campus partition.
If your workflow can handle jobs being terminated in-flight, this can be a good option for increasing job throughput.
How to Use the Restart Partition
For historical reasons, the restart partition is currently called “restart-new”. It is also necessary to specify the “restart” QOS when submitting these jobs.
An example (using sbatch) would be:
sbatch --partition=restart-new --qos=restart ...
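In a batch script, the same options can be given as #SBATCH directives. A minimal sketch (the job name, walltime, CPU request, and command below are placeholders, not site defaults):
#!/bin/bash
#SBATCH --partition=restart-new   # preemptible partition
#SBATCH --qos=restart             # required QOS for the restart partition
#SBATCH --job-name=restart-demo   # placeholder job name
#SBATCH --time=04:00:00           # placeholder walltime
#SBATCH --cpus-per-task=1         # placeholder CPU request

# the actual workload goes here
srun ./my_analysis.sh             # placeholder command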
If you omit the "qos" option, your jobs will not be eligible to run and will be held; you will see messages similar to "partition not available" in the "reason" field of squeue.
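To check the reason field for your pending jobs, a squeue invocation along these lines can help (the output format string is just one illustrative choice):
squeue -u $USER -t PENDING -o "%.18i %.9P %.8T %.30R"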
Finding Preempted Jobs
The sacct command can be used to find preempted jobs:
sacct -S 2021-10-01 -E 2021-10-26 -s PR
This command shows your jobs that were preempted (the PR state) between the specified dates (note that the dates are necessary).
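If you want more detail than the default columns, sacct's --format option can be added; the fields below are one example selection:
sacct -S 2021-10-01 -E 2021-10-26 -s PR --format=JobID,JobName,Partition,State,Start,End,Elapsed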
Managing Restart Jobs
For the restart partition to be an effective solution, you need to be able to recover from (or possibly not care about) jobs being terminated early. This is highly dependent on the nature of your work.
Workflow managers like Cromwell, NextFlow, and snakemake are good ways to manage restart jobs. These systems have features for managing the outputs of individual steps and can automatically re-run steps that are killed prematurely.
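As one illustrative sketch, a Snakemake workflow might submit its jobs to the restart partition through the --cluster option (the job count and submission string here are examples, and the exact invocation depends on your Snakemake version and workflow):
snakemake --jobs 100 --cluster "sbatch --partition=restart-new --qos=restart"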
When using the restart partition it’s important to note a few things:
Slurm doesn’t necessarily remove intermediate files
If your task is writing a file into fast, scratch, or some other non-transient storage, that partial output isn’t removed when the job is preempted. Restarting that task later with the same output file specified may cause problems.
This can be circumvented by using the local job temporary directory specified by $TMPDIR in the job environment. That directory and its contents are removed when the job is preempted.
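A rough sketch of that pattern, with placeholder commands and paths, might look like this:
#!/bin/bash
#SBATCH --partition=restart-new
#SBATCH --qos=restart

# work in the per-job temporary directory; its contents are removed if the job is preempted
cd "$TMPDIR"
./my_tool --output intermediate.dat    # placeholder command

# copy only the finished result to persistent storage (placeholder path)
cp intermediate.dat /path/to/persistent/storage/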
Jobs aren’t actually restarted
The "restart" name is a bit of a misnomer, since your jobs aren't actually restarted by Slurm. You can have Slurm requeue a job if it is preempted by adding --requeue to the job submission arguments.
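For example (the script name here is a placeholder):
sbatch --partition=restart-new --qos=restart --requeue my_job.sh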
Note that a job that fails will not be requeued.