The goal of this short blog post is to share some simple tips on profiling your (to be) submitted jobs on high performance computing resources. Profiling your jobs can give you information about how efficiently you are using your computational resources, i.e., your CPUs and your allocated memory. Typically you would perform these checks on your experiment at a smaller scale, ensuring that everything is working as it should, before expanding to more tasks and CPUs.
Your first check is squeue, typically paired with your user ID on a cluster. Here’s an example:
(base) [ah986@login02 project_dir]$ squeue -u ah986
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
5688212 shared <job_name> ah986 R 0:05 1 exp-4-55
This tells me that my submitted job is utilizing 1 node in the shared partition of this cluster. If your cluster is using the SLURM scheduler, you can also use sacct, which can display accounting data for all jobs you are currently running or have run in the past. There are many pieces of information available with sacct, which you can specify using the --format flag. Here’s an example for the same job:
(base) [ah986@login02 project_dir]$ sacct --format=JobID,partition,state,time,start,end,elapsed,nnodes,ncpus,nodelist,AllocTRES%32 -j 5688212
JobID Partition State Timelimit Start End Elapsed NNodes NCPUS NodeList AllocTRES
------------ ---------- ---------- ---------- ------------------- ------------------- ---------- -------- ---------- --------------- --------------------------------
5688212 shared RUNNING 20:00:00 2021-09-08T10:55:40 Unknown 00:19:47 1 100 exp-4-55 billing=360000,cpu=100,mem=200G+
5688212.bat+ RUNNING 2021-09-08T10:55:40 Unknown 00:19:47 1 100 exp-4-55 cpu=100,mem=200G,node=1
5688212.0 RUNNING 2021-09-08T10:55:40 Unknown 00:19:47 1 100 exp-4-55 cpu=100,mem=200G,node=1
In this case I can see the number of nodes (1) and the number of cores (100) utilized by my job as well as the resources allocated to it (100 CPUs and 200G of memory on 1 node). This information is useful in cases where a task launches other tasks and you’d like to diagnose whether the correct number of cores is being used.
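For that kind of diagnosis, a narrower sacct query that lists the task and CPU counts per job step can be easier to read. A minimal sketch, assuming a SLURM version where the NTasks field is available (you can check the field names offered on your cluster with sacct --helpformat):
sacct -j 5688212 --format=JobID,NTasks,NCPUS,Elapsed,State
Comparing NTasks and NCPUS across the batch step and any srun steps shows whether each step is actually getting the cores you intended.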
Another useful tool is seff, which is actually a wrapper around sacct and summarizes your job’s overall performance. It is a little unreliable while the job is still running, but after the job is finished you can run:
(base) [ah986@login02 project_dir]$ seff 5688212
Job ID: 5688212
Cluster: expanse
User/Group: ah986/pen110
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 100
CPU Utilized: 1-01:59:46
CPU Efficiency: 68.16% of 1-14:08:20 core-walltime
Job Wall-clock time: 00:22:53
Memory Utilized: 38.25 GB
Memory Efficiency: 19.13% of 200.00 GB
The information here is very useful if you want to find out how efficiently you’re using your resources. For this example I had 100 separate tasks to perform, and I requested 100 cores on 1 node and 200 GB of memory. These results tell me that my job completed in about 23 minutes, the total time spent on the CPUs (CPU Utilized) was 1-01:59:46 (roughly 26 hours of aggregate core time), and, most importantly, they report the efficiency of my CPU use. CPU Efficiency is calculated “as the ratio of the actual core time from all cores divided by the number of cores requested divided by the run time”, in this case 68.16% (the arithmetic behind this number is spelled out below). What this means is that I could be utilizing my cores more efficiently by allocating fewer cores to the same number of tasks, especially when scaling up to a larger number of nodes/cores. Additionally, my allocated memory is underutilized, and I could request a smaller memory allocation without inhibiting my runs.
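As a quick sanity check, you can reproduce that efficiency number yourself from the seff output: CPU Utilized is 1-01:59:46, i.e., 93,586 core-seconds; the wall-clock time of 00:22:53 is 1,373 seconds; and 100 cores were requested:
# 93,586 core-seconds used out of 100 cores x 1,373 s = 137,300 core-seconds allocated
echo "scale=4; 93586 / (100 * 1373)" | bc
# prints .6816, i.e., 68.16%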
Finally, while your job is still running, you can log in to the node(s) executing the job to look at live data. To do so, you simply ssh to one of the nodes listed under NODELIST (not all clusters allow this). From there, you can run the top command like below (with your own username), which will start the live task manager:
(base) [ah986@r143 ~]$ top -u ah986
top - 15:17:34 up 25 days, 19:55, 1 user, load average: 0.09, 12.62, 40.64
Tasks: 1727 total, 2 running, 1725 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.3 us, 0.1 sy, 0.0 ni, 99.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 257662.9 total, 249783.4 free, 5561.6 used, 2317.9 buff/cache
MiB Swap: 716287.0 total, 716005.8 free, 281.2 used. 250321.1 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
78985 ah986 20 0 276212 7068 4320 R 0.3 0.0 0:00.62 top
78229 ah986 20 0 222624 3352 2936 S 0.0 0.0 0:00.00 slurm_script
78467 ah986 20 0 259464 8128 4712 S 0.0 0.0 0:00.00 srun
78468 ah986 20 0 54520 836 0 S 0.0 0.0 0:00.00 srun
78481 ah986 20 0 266404 19112 4704 S 0.0 0.0 0:00.24 parallel
78592 ah986 20 0 217052 792 720 S 0.0 0.0 0:00.00 sleep
78593 ah986 20 0 217052 732 660 S 0.0 0.0 0:00.00 sleep
78594 ah986 20 0 217052 764 692 S 0.0 0.0 0:00.00 sleep
78595 ah986 20 0 217052 708 636 S 0.0 0.0 0:00.00 sleep
78596 ah986 20 0 217052 708 636 S 0.0 0.0 0:00.00 sleep
78597 ah986 20 0 217052 796 728 S 0.0 0.0 0:00.00 sleep
78598 ah986 20 0 217052 732 660 S 0.0 0.0 0:00.00 sleep
Memory and CPU usage can be tracked from the RES and %CPU columns, respectively. In this case, for the sake of an example, I just assigned all my cores to sleep for a certain number of minutes each (which uses essentially no CPU or memory). Similar information can also be obtained using the ps command, with memory being tracked under the RSS column.
(base) [ah986@r143 ~]$ ps -u$USER -o %cpu,rss,args
%CPU RSS COMMAND
0.0 3352 /bin/bash /var/spool/slurm/d/job3509431/slurm_script
0.0 8128 srun --export=all --exclusive -N1 -n1 parallel -j 100 sleep {}m ::: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
0.0 836 srun --export=all --exclusive -N1 -n1 parallel -j 100 sleep {}m ::: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
0.1 19112 /usr/bin/perl /usr/bin/parallel -j 100 sleep {}m ::: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
0.0 792 sleep 3m
0.0 732 sleep 4m
0.0 764 sleep 5m
0.0 708 sleep 6m
0.0 708 sleep 7m
0.0 796 sleep 8m
0.0 732 sleep 9m
0.0 712 sleep 10m
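If you would rather watch these numbers evolve over time than take a single snapshot, one option (a small sketch; adjust the interval to whatever makes sense for your job) is to wrap the same ps call in watch and sort by resident memory:
watch -n 10 'ps -u$USER -o %cpu,rss,args --sort=-rss'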