Slurm Job Submission at SU-HPC Cluster
On the Tosun cluster you can find Slurm submission script templates in the folder /cta/share/jobscripts.
Copy the one you need to your work folder and modify it as required:
mkdir $HOME/workfolder
cd $HOME/workfolder/
cp /cta/share/jobscripts/example_submit.sh $HOME/workfolder/my_experiment.sh
emacs my_experiment.sh
OR
vim my_experiment.sh
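After copying, edit the script to describe your job. A minimal template might look like the following sketch; the program name and flag values here are illustrative, so check the templates in /cta/share/jobscripts for the cluster's actual defaults:

```shell
#!/bin/bash
#SBATCH --job-name=my_experiment    # name shown in squeue
#SBATCH --partition=short           # partition (queue) to submit to
#SBATCH --qos=short
#SBATCH --account=users
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4         # number of cores
#SBATCH --time=01:00:00             # wall-clock limit (HH:MM:SS)
#SBATCH --output=my_experiment.out  # stdout goes here

# Commands below run on the allocated compute node.
echo "Job started on $(hostname) at $(date)"
./my_program input.dat              # placeholder executable
```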
Submitting jobs to the queue
Jobs are submitted to the system with the command below:
sbatch myscript.sh
See the page about Slurm Queueing System Commands for more information on creating job submission scripts.
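A typical submit-and-check exchange looks roughly like this (the job ID shown is whatever Slurm assigns):

```shell
$ sbatch my_experiment.sh
Submitted batch job 123456
$ squeue -u $USER    # the ST column shows PD (pending) or R (running)
```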
Slurm Partitions (Job Queues)
The Slurm resource manager organizes jobs into partitions, which act as job queues. Each partition has its own limits and member nodes. You can see the active partitions and their limits with the sinfo command on the cluster.
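For example (the exact column layout may differ slightly between Slurm versions):

```shell
$ sinfo
# PARTITION AVAIL  TIMELIMIT  NODES  STATE  NODELIST
$ sinfo -o "%P %l %D %t"    # partition, time limit, node count, node state
```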
The Slurm Cheat Sheet
Essential Slurm Commands
Command | Description | Example |
---|---|---|
sbatch [script] | Submit a batch job | $ sbatch job.sub |
scancel [job_id] | Kill a running job or cancel a queued one | $ scancel 1234 |
squeue | List running or pending jobs | $ squeue |
squeue -u [userid] | List a specific user's running or pending jobs | $ squeue -u mdemirkol |
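Put together, a typical job lifecycle with these commands looks like this (the job ID and user name are illustrative):

```shell
$ sbatch job.sub         # prints "Submitted batch job 123456"
$ squeue -u mdemirkol    # watch the job while it is queued or running
$ scancel 123456         # cancel it if something went wrong
```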
Submitting a Slurm Job Script
Job flags are passed to the sbatch command. Inside a script, the syntax for a Slurm directive is #SBATCH followed by the flag. Some of the flags can also be used with the srun and salloc commands for interactive jobs.
Resource | Flag Syntax | Description | Notes |
---|---|---|---|
partition | --partition=short | Partition is a queue for jobs. | default on |
qos | --qos=short | QOS is a quality of service value (limits or priority boost). | default on |
time | --time=01:00:00 | Time limit for the job. | 1 hour; default is 2 hours |
nodes | --nodes=1 | Number of compute nodes for the job. | default is 1 |
cpus/cores | --ntasks-per-node=4 | Number of cores on the compute node. | default is 1 |
resource feature | --gres=gpu:1 | Request use of GPUs on compute nodes. | default is no feature |
memory | --mem=15500 | Memory limit per compute node for the job. Do not use with the --mem-per-cpu flag. | default limit is 15500 MB per core on beegfs[101-108] nodes |
memory | --mem-per-cpu=4000 | Per-core memory limit. Do not use with the --mem flag. | default limit is 15500 MB per core on beegfs[101-108] nodes |
account | --account=users | Users may belong to groups or accounts. | default is the user's primary group |
job name | --job-name="hello_test" | Name of the job. | default is the JobID |
constraint | --constraint=gpu | Request compute nodes with a given feature. | see AVAIL_FEATURES in the sinfo output |
output file | --output=test.out | Name of the file for stdout. | default is the JobID |
email address | --mail-user=username@sabanciuniv.edu | User's email address. | required |
email notification | --mail-type=ALL --mail-type=END | When email is sent to the user. | omit for no email |
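Several of the flags above combined into one script; this is a sketch, assuming a hypothetical GPU executable named my_gpu_program and the account/QOS values shown in the table:

```shell
#!/bin/bash
#SBATCH --job-name="hello_test"
#SBATCH --partition=short
#SBATCH --qos=short
#SBATCH --account=users
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --gres=gpu:1           # request one GPU
#SBATCH --mem-per-cpu=4000     # MB per core; do not combine with --mem
#SBATCH --time=01:00:00
#SBATCH --output=test.out
#SBATCH --mail-user=username@sabanciuniv.edu
#SBATCH --mail-type=END

srun ./my_gpu_program          # placeholder executable
```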
Running a GUI on the Cluster
Some applications provide the capability to interact through a graphical user interface (GUI). Large-memory applications and computationally steered applications can offer such a capability. With Slurm, once a resource allocation is granted for an interactive session (or for a batch job, provided the submitting terminal is left logged in), srun can provide X11 graphical forwarding all the way from the compute nodes to your desktop.
For example, to run an X terminal:
srun --x11 -A users -p short -n1 --qos=users --pty $SHELL
Note that the user must have X11 forwarded to the login node for this to work -- this can be checked by running xclock
at the command line.
Additionally, the --x11 argument can be augmented as --x11=[batch|first|last|all], to the following effects:
- --x11=first This is the default, and provides X11 forwarding to the first of the compute hosts allocated.
- --x11=last This provides X11 forwarding to the last of the compute hosts allocated.
- --x11=all This provides X11 forwarding from all allocated compute hosts, which can be quite resource heavy and is an extremely rare use-case.
- --x11=batch This supports use in a batch job submission, and will provide X11 forwarding to the first node allocated to a batch job. The user must leave open the X11 forwarded login node session where they submitted the job.
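For instance, to open an X terminal (xterm) on the first of the allocated hosts, reusing the flags from the interactive example above:

```shell
srun --x11=first -A users -p short -n1 --qos=users --pty xterm
```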
Job State and Reason Codes
These codes identify the state of a job and, for pending jobs, the reason the job is waiting for execution. A job may be waiting for more than one reason, in which case only one of those reasons is displayed.
State | Code | Meaning |
---|---|---|
PENDING | PD | Job is awaiting resource allocation. |
RUNNING | R | Job currently has an allocation. |
SUSPENDED | S | Job has an allocation, but execution has been suspended. |
COMPLETING | CG | Job is in the process of completing. Some processes on some nodes may still be active. |
COMPLETED | CD | Job has terminated all processes on all nodes. |
CONFIGURING | CF | Job has been allocated resources, but is waiting for them to become ready for use. |
CANCELED | CA | Job was explicitly cancelled by the user or a system administrator. The job may or may not have been initiated. |
FAILED | F | Job terminated with a non-zero exit code or other failure condition. |
TIMEOUT | TO | Job terminated upon reaching its time limit. |
PREEMPTED | PR | Job has been suspended by a higher-priority job on the same resource. |
NODE_FAIL | NF | Job terminated due to failure of one or more allocated nodes. |
Reason Code | Meaning |
---|---|
InvalidQOS | The job's QOS is invalid. |
PartitionNodeLimit | The number of nodes required by this job is outside its partition's current limits. Can also indicate that required nodes are DOWN or DRAINED. |
PartitionTimeLimit | The job's time limit exceeds its partition's current time limit. |
QOSJobLimit | The job's QOS has reached its maximum job count. |
QOSResourceLimit | The job's QOS has reached some resource limit. |
QOSTimeLimit | The job's QOS has reached its time limit. |
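The reason for a pending job can be read from squeue; the %r format specifier prints it explicitly (the job ID is illustrative):

```shell
$ squeue -u $USER                                # NODELIST(REASON) column shows the reason code
$ squeue -j 123456 -o "%.10i %.9P %.8T %.20r"    # job ID, partition, state, reason
```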
QoS settings
QOS | Limits | Typical flags | Applies to |
---|---|---|---|
users | MaxSubmitJobsPerUser=10 MaxTRES=cpu=40 MaxNodes=2 | --account=users --qos=short | All users, all partitions |
cuda | MaxSubmitJobsPerUser=10 MaxTRES=cpu=40 MaxNodes=2 | --account=cuda --qos=cuda --gres=gpu:1 | All users; includes the cuda partitions |
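The full QOS definitions and their limits can be inspected with sacctmgr, e.g.:

```shell
$ sacctmgr show qos                    # list all QOS records and their limits
$ sacctmgr show qos where name=short   # limits for a single QOS
```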
Software
System software
- Various operating systems are in use on our systems; if you need a particular OS, please let us know.
- Slurm resource manager
Compilers and parallel programming libraries
- GNU Compiler (GCC, GFortran)
- Java, Python, Perl, Ruby
- OpenMPI - library for MPI message passing for use in parallel programming over InfiniBand and Ethernet
- ...and more.
Libraries
- Please run the module avail command from your ssh console to view a list of available applications.
Application software
- Gaussian, Blast, Namd, Gromacs and many more.
- Please run the module avail command from your ssh console to view a list of available applications.
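A typical module workflow looks like this (the module name is illustrative; pick one from the module avail listing):

```shell
$ module avail            # list all installed software modules
$ module load gromacs     # load a module into the current shell environment
$ module list             # show currently loaded modules
$ module unload gromacs   # remove it again
```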