Using sacct and seff to understand resource usage of finished jobs

In this tutorial we look at the seff and sacct commands. The tutorial should be done on Puhti.
💭 seff shows detailed data on used resources in an easy-to-read format, but can only show one job at a time.
💭 sacct is useful when you want to look at a listing of jobs, but by default it only shows minimal data.
1. Run sacct, which by default shows the jobs you have run on the current date (i.e. since last midnight):
sacct
2. To see earlier jobs, set the start date with the -S option. Don't query overly long time intervals, since this causes significant load on the system (the maximum queryable interval is three months).
sacct -S YYYY-MM-DD # replace YYYY-MM-DD
3. To look at a specific job, use the -j option (if you can't think of one, you can use 18622472):
sacct -j <slurmjobid> # replace <slurmjobid> with a valid job ID
4. To see all the data recorded for the job, add the -l option:
sacct -l -j <slurmjobid> # replace <slurmjobid> with a valid job ID
5. To choose which fields are shown, use the -o option. For example, to see job name, job ID, used memory, job state and elapsed wall-clock time, try:
sacct -o jobname,jobid,maxrss,state,elapsed -j <slurmjobid> # replace <slurmjobid> with a valid job ID
6. To list all the available fields, run:
sacct -e
‼️ Note: running sacct is heavy on the batch queue system.
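The start date for -S can also be computed instead of typed by hand. The sketch below is one way to do that, assuming GNU date (as found on Linux systems). The helper name sacct_start_date and the 90-day cap, used here as a rough stand-in for the three-month limit, are illustrative and not part of Slurm. The example prints the sacct command instead of running it.

```shell
#!/bin/bash
# Hypothetical helper: print the date (YYYY-MM-DD) a given number of days ago.
# The 90-day cap is an assumed approximation of the three-month query limit.
sacct_start_date() {
    days="${1:-0}"
    if [ "$days" -gt 90 ]; then
        echo "Refusing to look back more than 90 days" >&2
        return 1
    fi
    date -d "$days days ago" +%Y-%m-%d   # GNU date syntax
}

# Example: build (but do not run) a query for the last seven days.
start="$(sacct_start_date 7)" && echo "sacct -S $start"
```

Keeping the look-back window short also keeps the load on the batch queue system low.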
💬 Run a simple array job to practice using seff and sacct.
☝🏻 If you have limited time, you can skip to Examining the finished job and use the job ID 18648826 (it is the same job).
Create a file called array.sh and paste the following contents in it.
#!/bin/bash
#SBATCH --account=<project> # Choose the billing project. Has to be defined!
#SBATCH --time=00:01:00 # Maximum duration of the job. Max: depends on the partition.
#SBATCH --partition=small # Job queues: test, interactive, small, large, longrun, hugemem, hugemem_longrun
#SBATCH --job-name=array_job # Name of the job visible in the queue.
#SBATCH --output=out_%A_%a.txt # Name of the output-file.
#SBATCH --error=err_%A_%a.txt # Name of the error-file.
#SBATCH --ntasks=1 # Number of tasks. Max: depends on partition.
#SBATCH --cpus-per-task=1 # How many processors work on one task. Max: Number of CPUs per node.
#SBATCH --mem=1000 # How much RAM is reserved for job per node. Unit: MiB
#SBATCH --array=1-6 # The indices of the array jobs.
/appl/soft/bio/course/sacct_exercise/test-a ${SLURM_ARRAY_TASK_ID}
Replace <project> with your actual project name, e.g. project_2001234.
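In the --output and --error patterns of the script, %A expands to the job ID of the array job and %a to the index of the sub job, so each sub job writes its own pair of files. The loop below is a small sketch that lists the file names to expect. expected_array_files is a made-up helper, and 123456 is only an example job ID.

```shell
#!/bin/bash
# Hypothetical helper: list the output and error files an array job should create.
# Arguments: job ID, first array index, last array index.
expected_array_files() {
    job_id="$1"
    task_id="$2"
    last="$3"
    while [ "$task_id" -le "$last" ]; do
        # Mirrors out_%A_%a.txt and err_%A_%a.txt from the batch script.
        echo "out_${job_id}_${task_id}.txt err_${job_id}_${task_id}.txt"
        task_id=$((task_id + 1))
    done
}

# Example for --array=1-6 with an example job ID.
expected_array_files 123456 1 6
```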
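Submit the job:
sbatch array.sh
The command prints the ID of the submitted job, e.g.:
Submitted batch job 123456
Monitor the job in the queue:
squeue -u $USER
💭 How is an array job listed in the queue?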
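Examining the finished job

Once the job has finished (i.e. it is no longer listed by squeue), you can use sacct to study it:
sacct -j <slurmjobid> # replace <slurmjobid> with the actual job ID
To show only one line per sub job, without the individual job steps, add the -X option:
sacct -X -j <slurmjobid> # replace <slurmjobid> with the actual job ID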
💬 sacct is especially handy here, because it is easy to spot the failed sub jobs.
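With more sub jobs than fit on one screen, the failed ones can also be picked out with a filter. The sketch below assumes pipe-separated input of the form JobID|State, which is what sacct prints with the -P (parsable) and -n (no header) options. The function name list_unsuccessful and the sample lines are invented for illustration and are not real output from this exercise.

```shell
#!/bin/bash
# Hypothetical filter: read "JobID|State" lines and print every job
# whose state is something other than COMPLETED.
list_unsuccessful() {
    awk -F'|' '$2 != "COMPLETED" { print $1 ": " $2 }'
}

# Invented sample data in the assumed format (not real sacct output).
sample='123456_1|COMPLETED
123456_2|COMPLETED
123456_3|TIMEOUT
123456_4|OUT_OF_MEMORY'
printf '%s\n' "$sample" | list_unsuccessful

# On a system with Slurm, the same filter could be fed from sacct:
# sacct -X -n -P -o jobid,state -j <slurmjobid> | list_unsuccessful
```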
Use seff to look at individual sub jobs, e.g.:
seff <slurmjobid>_5 # replace <slurmjobid> with the actual job ID
Run sacct again with the -o option (discussed above). This time add the fields reqmem (requested memory) and timelimit (requested time):
sacct -o jobname,jobid,reqmem,maxrss,timelimit,elapsed,state -j <slurmjobid> # replace <slurmjobid> with the actual job ID
💭 Note that in this case we cannot use the -X option, as we want to see the memory usage for each step.
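The reqmem and maxrss columns are often printed with different unit suffixes, which makes them awkward to compare by eye. The sketch below converts such values to MiB. It assumes the values end in K, M, G or T, and that some Slurm versions append an extra n or c to reqmem. The helper name to_mib is made up, and the suffix handling should be checked against the output you actually see.

```shell
#!/bin/bash
# Hypothetical helper: convert a value such as 204800K, 1000M or 2G to MiB.
# Assumes a K/M/G/T suffix; a trailing n or c is dropped before converting.
to_mib() {
    awk -v v="$1" 'BEGIN {
        sub(/[nc]$/, "", v)                 # drop per-node / per-core marker
        unit = substr(v, length(v), 1)
        num = substr(v, 1, length(v) - 1)
        if (unit == "K") factor = 1 / 1024
        else if (unit == "M") factor = 1
        else if (unit == "G") factor = 1024
        else if (unit == "T") factor = 1024 * 1024
        else exit 1                         # unknown format: give up
        printf "%.1f\n", num * factor
    }'
}

# Example: a requested 1000M next to a measured 204800K.
to_mib 1000M
to_mib 204800K
```

A sub job whose converted maxrss comes close to its reqmem is a candidate for a larger memory reservation.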
☝🏻 If you have limited time, you can skip to step 4 and use the job ID 18648849 (it is the same job with adjusted resource requests).
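1. Raise the time limit in array.sh:
#SBATCH --time=00:05:00
2. Raise the memory reservation:
#SBATCH --mem=2000
3. Restrict the array to the sub jobs you want to re-run, then submit the script again:
#SBATCH --array=3,5 # Specify which ones to run
4. Use seff and sacct to look at the jobs. How much memory and time did they use?

💡 You can read more about array jobs and seff and sacct in Docs CSC.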
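To answer the time question in numbers, the elapsed and timelimit fields can be converted to seconds and compared. The sketch below assumes the usual Slurm time notation, [DD-]HH:MM:SS or MM:SS. The helper name to_seconds is invented, and special values such as UNLIMITED are rejected rather than converted.

```shell
#!/bin/bash
# Hypothetical helper: convert [DD-]HH:MM:SS or MM:SS to seconds.
to_seconds() {
    awk -v t="$1" 'BEGIN {
        days = 0
        if (split(t, d, "-") == 2) { days = d[1]; t = d[2] }
        n = split(t, p, ":")
        if (n == 3) secs = p[1] * 3600 + p[2] * 60 + p[3]
        else if (n == 2) secs = p[1] * 60 + p[2]
        else exit 1                      # e.g. UNLIMITED: not a time value
        print days * 86400 + secs
    }'
}

# Example: how much of a five-minute limit a 1 min 30 s run used.
used="$(to_seconds 00:01:30)"
limit="$(to_seconds 00:05:00)"
echo "Used $((100 * used / limit)) % of the time limit"
```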