Batch job tutorial - Serial jobs
In this tutorial we'll get familiar with the basic usage of the Slurm batch queue system at CSC
- The goal is to learn how to request resources that match the needs of a job
💬 A batch job consists of two parts: resource requests and the job step(s)
☝🏻 Examples are done on Puhti. If using the web interface, open a login node shell.
Serial jobs
💬 A serial program can only use one core (CPU)
- One should request only a single core from Slurm
- The job does not benefit from additional cores
- Requesting extra cores only wastes resources: the excess cores are reserved for the job but sit idle, unavailable to other users
💬 Within the job (or allocation), the actual program is launched using the command `srun`
☝🏻 If you use software that is pre-installed by CSC, check its documentation page; it might include a batch job example with useful default settings.
Launching a serial job
- Go to the `/scratch` directory of your project:

```bash
cd /scratch/<project>   # replace <project> with your CSC project, e.g. project_2001234
```

- Now your input (and output) will be on a shared disk that is accessible to the compute nodes.
💡 You can list your projects with `csc-projects`
💡 Note! If you're using a project with other members (like the course project), first make a subdirectory for yourself (e.g. `mkdir $USER`) and then move there (`cd $USER`) to avoid cluttering the `/scratch` root of your project
- Create a file called `my_serial.bash`, e.g. with the `nano` text editor:

```bash
nano my_serial.bash
```
- Copy the following batch script there and change `<project>` to the CSC project you actually want to use:

```bash
#!/bin/bash
#SBATCH --account=<project>   # Choose the billing project. Has to be defined!
#SBATCH --time=00:02:00       # Maximum duration of the job. Upper limit depends on the partition.
#SBATCH --partition=test      # Job queue: test, interactive, small, large, longrun, hugemem, hugemem_longrun
#SBATCH --ntasks=1            # Number of tasks. Upper limit depends on the partition. For a serial job this should be set to 1!

srun hostname   # Run the hostname command
srun sleep 60   # Run the sleep command
```
- Submit the job to the batch queue and check its status with the commands:

```bash
sbatch my_serial.bash
squeue -u $USER
```
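`sbatch` prints the ID of the newly submitted job, and `squeue` then shows its state. The output looks roughly like the following (the job ID, node name, and timing here are made up for illustration):

```text
Submitted batch job 1234567

  JOBID PARTITION     NAME     USER ST   TIME  NODES NODELIST(REASON)
1234567      test my_seria username  R   0:05      1 r07c01
```

The `ST` column shows the job state: `PD` means the job is still pending in the queue, `R` means it is running.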
💬 In the batch job example above we are requesting
- one core (`--ntasks=1`)
- for two minutes (`--time=00:02:00`)
- from the test queue (`--partition=test`)
💬 We want to run the program `hostname`, which prints the name of the Puhti compute node that has been allocated for this particular job
💬 In addition, we run the `sleep` program to keep the job running for an additional 60 seconds, in order to have time to monitor the job
Checking the output and the efficiency
- By default, the output is written to a file named `slurm-<jobid>.out`, where `<jobid>` is a unique job ID assigned to the job by Slurm
- Check the efficiency of the job compared to the reserved resources by issuing the command `seff <jobid>` (replace `<jobid>` with the actual Slurm job ID)
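For the two-minute example job above, the `seff` report looks something like this (all numbers are illustrative; a job that mostly sleeps shows a very low CPU efficiency):

```text
$ seff 1234567
Job ID: 1234567
Cluster: puhti
User/Group: username/username
State: COMPLETED (exit code 0)
Cores: 1
CPU Utilized: 00:00:01
CPU Efficiency: 1.64% of 00:01:01 core-walltime
Job Wall-clock time: 00:01:01
Memory Utilized: 1.00 MB
Memory Efficiency: 0.05% of 2.00 GB
```

Low efficiency like this is expected for a toy job; for real workloads, `seff` helps you spot over-requested cores or memory.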
📝 You can get a list of all your jobs that are running or queuing with the command `squeue -u $USER`
🎯 A submitted job can be cancelled using the command `scancel <jobid>`
More information
💡 FAQ on CSC batch jobs in Docs CSC