We strongly encourage all Quest users to review this summary of post-maintenance changes, as they may immediately affect your workflow after the Quest March 2025 downtime.
For more detailed information, see the Slurm sbatch documentation.
Job Submission Options
| Option | Slurm (sbatch) |
| --- | --- |
| Script directive | #SBATCH |
| Job name | --job-name=<name> or -J <name> |
| Account | --account=<account> or -A <account> |
| Queue | --partition=<queue> |
| Wall time limit | --time=<hh:mm:ss> or -t <hh:mm:ss> |
| Node count | --nodes=<count> or -N <count> |
| Core count | --ntasks=<count> or -n <count> |
| Process count per node | --ntasks-per-node=<count> |
| Core count (per process) | --cpus-per-task=<cores> |
| Memory limit | --mem=<limit> (memory per node, in MB by default) |
| Minimum memory per processor | --mem-per-cpu=<memory> |
| Request GPUs | --gres=gpu:<count> |
| Request specific nodes | -w, --nodelist=<node>[,node2[,...]] or -F, --nodefile=<node file> |
| Job array | --array=<array indices> or -a <array indices> |
| Standard output file | --output=<file path> (path must exist) |
| Standard error file | --error=<file path> (path must exist) |
| Combine stdout/stderr to stdout | --output=<combined out and err file path> |
| Architecture constraint | --constraint=<architecture> or -C <architecture> |
| Copy environment | --export=ALL (default); --export=NONE (do not export the environment) |
| Copy environment variable | --export=<variable[=value][,variable2=value2[,...]]> |
| Job dependency | --dependency=after:jobID[:jobID...]; --dependency=afterok:jobID[:jobID...]; --dependency=afternotok:jobID[:jobID...]; --dependency=afterany:jobID[:jobID...] |
| Request event notification | --mail-type=<events> (multiple events may be given as a comma-separated list, e.g. --mail-type=BEGIN,END,FAIL) |
| Email address | --mail-user=<email address> |
| Defer job until the specified time | --begin=<date/time> |
| Node exclusive job | --exclusive |
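To show how these directives fit together, here is a minimal sketch of a batch script. The account name, partition name, email address, and logs/ directory are illustrative placeholders, not actual Quest values; substitute your own.

```bash
#!/bin/bash
#SBATCH --job-name=example_job          # job name (-J)
#SBATCH --account=p12345                # placeholder allocation; replace with your own account
#SBATCH --partition=short               # placeholder queue name; replace with a real partition
#SBATCH --nodes=1                       # number of nodes (-N)
#SBATCH --ntasks-per-node=4             # processes per node
#SBATCH --mem=8G                        # memory per node (plain numbers are interpreted as MB)
#SBATCH --time=01:00:00                 # wall time limit (hh:mm:ss)
#SBATCH --output=logs/%x_%j.out         # stdout; the logs/ directory must already exist
#SBATCH --error=logs/%x_%j.err          # stderr; the logs/ directory must already exist
#SBATCH --mail-type=BEGIN,END,FAIL      # email on job start, completion, and failure
#SBATCH --mail-user=netid@example.edu   # placeholder email address

srun hostname                           # replace with your actual application command
```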
Common Job Commands
| Task | Slurm command |
| --- | --- |
| Submit a job | sbatch <job script> |
| Delete a job | scancel <job ID> |
| Job status (all) | squeue |
| Job status (by job) | squeue -j <job ID> |
| Job status (by user) | squeue -u <netID> |
| Job status (detailed) | scontrol show job -dd <job ID> or checkjob <job ID> |
| Show expected start time | squeue -j <job ID> --start |
| Queue list / info | scontrol show partition [queue] |
| Hold a job | scontrol hold <job ID> |
| Release a job | scontrol release <job ID> |
| Start an interactive job | salloc <args> or srun --pty <args> |
| X forwarding | srun --pty <args> --x11 |
| Monitor or review a job's resource usage | sacct -j <job_num> --format JobID,jobname,NTasks,nodelist,CPUTime,ReqMem,Elapsed (see sacct for all format options) |
| View job batch script | scontrol write batch_script <jobID> [filename] |
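As a quick illustration of these commands in sequence, the session below submits a script, checks its status, and reviews its usage. The job ID 1234567, netID jdoe, and script name submit.sh are placeholders.

```bash
sbatch submit.sh               # submit; Slurm replies "Submitted batch job 1234567"
squeue -u jdoe                 # list your pending and running jobs
squeue -j 1234567 --start      # scheduler's expected start time for the job
scontrol show job -dd 1234567  # detailed job information
scancel 1234567                # cancel the job if it is no longer needed
sacct -j 1234567 --format JobID,jobname,NTasks,nodelist,CPUTime,ReqMem,Elapsed   # review usage after the job finishes
```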
Script Variables
| Info | Slurm | Notes |
| --- | --- | --- |
| Version | – | Can be extracted from sbatch --version |
| Job name | $SLURM_JOB_NAME | |
| Job ID | $SLURM_JOB_ID | |
| Submit directory | $SLURM_SUBMIT_DIR | Slurm jobs start from the submit directory by default |
| Node list | $SLURM_JOB_NODELIST | The Slurm variable has a different format from the PBS one; to get an expanded list of nodes, use: scontrol show hostnames $SLURM_JOB_NODELIST |
| Job array index | $SLURM_ARRAY_TASK_ID | |
| Queue name | $SLURM_JOB_PARTITION | |
| Number of nodes allocated | $SLURM_JOB_NUM_NODES or $SLURM_NNODES | |
| Number of processes | $SLURM_NTASKS | |
| Number of processes per node | $SLURM_TASKS_PER_NODE | |
| Requested tasks per node | $SLURM_NTASKS_PER_NODE | |
| Requested CPUs per task | $SLURM_CPUS_PER_TASK | |
| Scheduling priority | $SLURM_PRIO_PROCESS | |
| Job user | $SLURM_JOB_USER | |
| Hostname | $HOSTNAME == $SLURM_SUBMIT_HOST | Unless a shell is invoked on an allocated resource, the $HOSTNAME variable is propagated (copied) from the submit machine environment to all allocated nodes. |
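To see these variables in context, the short sketch below prints several of them from inside a batch job. The node and task counts are illustrative, and directives such as --account or --partition, which may be required depending on your allocation, are omitted here.

```bash
#!/bin/bash
#SBATCH --job-name=env_check
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --time=00:05:00

echo "Job ${SLURM_JOB_ID} (${SLURM_JOB_NAME}) in partition ${SLURM_JOB_PARTITION}"
echo "Submitted from ${SLURM_SUBMIT_DIR} by ${SLURM_JOB_USER}"
echo "Nodes: ${SLURM_JOB_NUM_NODES}  Total tasks: ${SLURM_NTASKS}"

# Expand the compact node list (e.g. node[001-002]) into one hostname per line
scontrol show hostnames "${SLURM_JOB_NODELIST}"
```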