SLURM

|  |  |
|---|---|
| Affiliation | OMNI Cluster |
| Costs | Free of charge |
| Documentation |  |
| Availability | Pre-installed on the cluster |
| Version | SLURM 22.05.8 |
The job scheduler SLURM, version 22.05.8, is installed on the OMNI cluster. SLURM distributes the users' computing jobs across the cluster so that it is used as efficiently as possible and waiting times are kept as short as possible. More information on submitting jobs can be found here. The SLURM documentation can be found here.
Tip: SLURM is only available if the SLURM module is loaded (module load slurm). By default, the SLURM module is always loaded at login, but it can be unloaded accidentally or intentionally by user actions (e.g. with module purge). If the SLURM commands are not available, use module list to check whether the module is loaded. Further information on modules can be found here.
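Checking for and reloading the module looks like this (a session sketch, assuming the module system described above):

```bash
module list         # check whether slurm appears among the loaded modules
module load slurm   # reload it if it is missing
squeue --version    # the SLURM commands should now be available again
```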
A SLURM job obtains its settings from various sources. These are in descending order of priority:
- Command line parameters of the command used to submit the job.
- Environment variables, if set.
- The parameters at the start of a job script, if specified.
- SLURM's built-in defaults.
This means, for example, that you can specify in a job script which settings it should normally run with, and if you need different settings just once, you can simply override them with a command line parameter when submitting.
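For example, a job script can carry its usual settings as #SBATCH lines (the script and program names here are placeholders):

```bash
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=0-01:00:00
#SBATCH --mem=3750

# The actual computation; "my_program" is a placeholder.
srun ./my_program
```

Submitted with sbatch --time=0-02:00:00 jobscript.sh, the two-hour limit from the command line overrides the one-hour limit in the script, because command line parameters have the highest priority.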
Terms
In principle, SLURM recognizes and takes into account the division of the cluster into nodes with individual cores. When submitting jobs, there are several ways of requesting resources. In SLURM terminology:
- A job is a self-contained calculation that can consist of several tasks and is assigned (allocated) certain resources, such as individual CPU cores, a certain amount of RAM, or complete nodes.
- A task is the run of a single process. By default, one task is started per node and one CPU is allocated per task.
- A partition (often also called a queue) is a waiting queue into which jobs are placed.
- A CPU in SLURM refers to a single core. This differs from the usual terminology, in which a CPU (a microprocessor chip) consists of several cores. SLURM speaks of "sockets" when referring to CPU chips.
Console commands and options
Here are the most common commands that you will need as a SLURM user.
| Command | Function |
|---|---|
| squeue | Displays jobs. |
| sinfo | Displays information on occupied and free nodes. |
| sbatch | Submits a batch job. |
| srun | Outside a job: submits a job that runs a Linux command. Inside a job: executes a command in each task. |
| spartition | Displays information about the partitions (queues). ZIMT addition. |
| scancel | Deletes a job. |
| sview | Starts a graphical user interface. |
| scontrol | Shows more detailed information (not all of it available to normal users). |
Many of these commands have options (command line parameters) that you can specify when calling them. The most important ones are briefly listed here:
| Option | Function | OMNI |
|---|---|---|
| --partition, -p | Defines the partition on which the job is to run. | spartition shows the available partitions. |
| --nodes, -N | Defines the number of nodes on which the job should run. | 1 (or more for MPI applications) |
| --ntasks, -n | Defines the number of tasks for the job. | 1-64 (depending on the use case) |
| --ntasks-per-node | Defines the maximum number of tasks per node. Usually used with MPI programs. | |
| --cpus-per-task | Defines the number of CPUs per task. Usually important when using OpenMP. | |
| --mem | Defines the memory limit per node. The job is aborted if the limit is exceeded (no swapping). The numerical value is in megabytes. | Default: 3750 MB; max. 240 GB (hpc node) and 480 GB (fat node), please use sparingly. For applications with particularly high memory requirements, the nodes of the smp partition can be used, with 1530 GB RAM each. |
| --time, -t | Defines the time limit of the job. If the limit is exceeded, the job is aborted. Format: D-HH:MM:SS | spartition shows the default and min/max runtimes of the various partitions. |
| --gpus, -G | Number of GPUs to be used. Equivalent: --gres=gpu:X, where X is the number of GPUs. | Note that the OMNI cluster has nodes with different numbers of GPUs. |
| --output= | For the sbatch command, specifies the log file to which the stdout stream is directed. By default, a file named slurm- is created in the folder in which sbatch was executed. | |
| --error= | For the sbatch command, specifies the log file to which the stderr stream is directed. By default, stderr is written to the same location as stdout, see above. | |
| --mail-type | Specifies the events for which an e-mail is sent to the address given with --mail-user. Possible values are BEGIN, END, FAIL, REQUEUE, ALL. | |
| --mail-user= | Specifies the recipient of the e-mail. | |
| --no-requeue | Deactivates the automatic restart of a job. | Default: Requeue=1 (--requeue) |
You can find a full list of options for each command via its man page and in the SLURM documentation of sbatch.
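A typical submission combining several of these options might look like this (the script name, partition, and e-mail address are placeholders; spartition shows the partitions actually available):

```bash
sbatch --partition=short --ntasks=4 --time=0-02:00:00 \
       --mem=8000 --mail-type=END --mail-user=user@example.com \
       jobscript.sh
```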
Environment variables
SLURM uses a number of environment variables. Some of them can be set before submission to give a job its settings (unless you override them with command line parameters when submitting the job); they are described in the section on input environment variables in the sbatch documentation. Other variables are set by SLURM when the job starts; you can query these within the job to read the job's settings. Only a few examples are listed here; the section on output environment variables in the sbatch documentation contains a complete list.
| Variable | Function |
|---|---|
| SLURM_CPUS_PER_TASK | CPUs per task. |
| SLURM_JOB_ID | ID number of the job. |
| SLURM_JOB_NAME | Name of the job; for sbatch, by default the name of the job script. |
| SLURM_JOB_NUM_NODES | Number of nodes that have been assigned. |
| SLURM_JOB_PARTITION | Partition in which the job is running. |
| SLURM_NTASKS | Number of tasks. |
| SLURM_TASK_PID | Linux process ID of the corresponding task. |