Checking Processor and Memory Utilization for Jobs on Quest

This page provides commands for checking processor (core) and memory (RAM) utilization for jobs that run on Quest.

Understanding the memory and CPU requirements of your jobs will help you to utilize Quest resources more efficiently. Below are some methods you can apply to measure a job's CPU and RAM usage.

How to check resource utilization for completed jobs?

Slurm provides a tool called seff to check the memory utilization and CPU efficiency of completed jobs. Note that for running and failed jobs, the efficiency numbers reported by seff are not reliable, so please use this tool only for successfully completed jobs.
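If you do not remember the ID of a completed job, Slurm's sacct command can list your recent jobs so that you can pick the one to pass to seff. The fields and start date below are just one reasonable choice; adjust them as needed:

[abc123@quser24 ~]$ sacct -X -u $USER -S 2024-10-01 -o JobID,JobName,State,Elapsed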

We are going to look at a finished job that was submitted using the script below:

#!/bin/bash
#SBATCH --account=p12345
#SBATCH --job-name=lmp-cpu
#SBATCH --ntasks=10
#SBATCH --ntasks-per-node=10
#SBATCH --mem-per-cpu=100M
#SBATCH --time=01:02:00

module purge
module load lammps/lammps-22Aug18

mpirun -n 10 lmp -in in.fcc

This job submission script requests 10 tasks on a single node. The scheduler assigns one core to each task, so 10 cores were allocated for this job. The script also requests 100 megabytes of memory per core, so 1000 megabytes were reserved for the job in total.
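Since this is a single-node job, the same 1000 megabytes could equivalently be requested with a per-node memory limit; this alternative line is shown only for illustration and is not part of the original script:

#SBATCH --mem=1000M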

After the job has completed, we can examine the utilization report produced by the seff <jobid> command.

[abc123@quser24 ~]$ seff 549437
Job ID: 549437
Cluster: quest
User/Group: abc123/abc123
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 10
CPU Utilized: 01:06:22
CPU Efficiency: 103.16% of 01:04:20 core-walltime
Job Wall-clock time: 00:06:26
Memory Utilized: 287.50 MB
Memory Efficiency: 28.75% of 1000.00 MB

CPU Efficiency is calculated as the total CPU time used by all cores divided by the product of the number of cores requested and the wall-clock run time. Here, we see that the CPU Efficiency is about 103%, which means that the job fully utilized all 10 cores during the run.

Memory Efficiency is calculated as the high-water mark of memory used by all tasks divided by the memory requested for the job. The total memory request for this job was 1000 megabytes, and only 287.5 megabytes were used, so the Memory Efficiency is 28.75%.
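For reference, here is how those two numbers work out for this example job:

CPU Efficiency    = 01:06:22 / (10 cores x 00:06:26) = 3982 s / 3860 s = 103.16%
Memory Efficiency = 287.50 MB / 1000.00 MB = 28.75%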

Profiling your processes for memory and CPU usage before production

The time command is provided with Quest's operating system. You can launch a program with /usr/bin/time in front of it, and the system will monitor your program and report statistics about its CPU and RAM usage.

To test your code, it is recommended that you start an interactive session that reserves a compute node, as shown in the sketch below.
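For example, an interactive session can be requested with srun. The account, partition, and resource values below are placeholders; replace them with your own allocation and the resources you plan to test with:

[abc123@quser21 ~]$ srun --account=p12345 --partition=short -N 1 -n 10 --mem=1G --time=01:00:00 --pty bash -l

Once the session starts on a compute node, here is an example test for an MPI-parallelized code (namely lmp):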

[abc123@qnode4233 ~]$ /usr/bin/time -v mpirun -n 10 lmp -in in.fcc > lmp.out

The output of this command is as follows:

Command being timed: "mpirun -n 10 lmp -in in.fcc"
User time (seconds): 3682.19
System time (seconds): 1.41
Percent of CPU this job got: 999%
Elapsed (wall clock) time (h:mm:ss or m:ss): 6:25.64
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 31050
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 51
Minor (reclaiming a frame) page faults: 408429
Voluntary context switches: 11770
Involuntary context switches: 2159
Swaps: 0
File system inputs: 0
File system outputs: 181592
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0

The Percent of CPU this job got line reports core utilization. A value around 100% means that the job ran on 1 core; in this case, 999% means that the job used about 10 cores. The number reported for Maximum resident set size tells you the most RAM a single process used. Since the example program uses MPI parallelization with 10 tasks and one task used at most 31050 kilobytes of memory, we can estimate that the whole job needs a bit more than 310 megabytes.

The information gathered from this test will be helpful in estimating the memory/CPU needs of similar jobs in the future. When submitting a job, a good practice is to request 10-15% more memory than your profiling indicates, as a safety factor.
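For example, since the profiling above suggests that the whole job needs a bit more than 310 megabytes, a similar run might request roughly 350 megabytes in total. The exact figures below are only an illustration of that 10-15% safety margin:

#SBATCH --ntasks=10
#SBATCH --mem-per-cpu=35M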

Here is another profiling example for a program (pi_red) that is parallelized using OpenMP threads:

[abc123@qnode4233 ~]$ /usr/bin/time -v ./pi_red
Command being timed: "./pi_red"
User time (seconds): 22547.96
System time (seconds): 0.79
Percent of CPU this job got: 996%
Elapsed (wall clock) time (h:mm:ss or m:ss): 37:42.85
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 832
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 305
Voluntary context switches: 19
Involuntary context switches: 67224
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0

The program used 832 kilobytes of memory and about 10 CPU cores with its threads. Unlike MPI tasks, threads share a common memory space, so the whole job used only 832 kilobytes in this case.
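Because a threaded program needs all of its cores on a single node, a submission script for a job like pi_red would typically request one task with multiple CPUs rather than multiple tasks. The script below is a minimal sketch of that pattern, assuming the program reads OMP_NUM_THREADS; adjust the account, core count, memory, and time to match your own profiling:

#!/bin/bash
#SBATCH --account=p12345
#SBATCH --job-name=pi-omp
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=10
#SBATCH --mem=100M
#SBATCH --time=01:00:00

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./pi_red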

How to check resource utilization for running jobs?

While your job is running, you can examine its instantaneous memory and CPU utilization directly on the compute node(s). You can find out which node(s) your job is running on with the squeue -j <jobID> command. Let's look at the MPI-parallelized lmp program, which was submitted as a batch job:

[abc123@quser21 ~]$ squeue -j 549437
   JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
  549437     short  lmp-cpu   abc123   R       2:34      1 qnode4233

Once you have identified the name of the node that your job is running on, you can connect to this node directly:

[abc123@quser21 ~]$ ssh qnode4233
[abc123@qnode4233 ~]$

You will see that your command prompt changes from the login node (quser21 in this example) to the compute node (qnode4233 in this case). There are two useful commands that will inform you about the resource utilization of your program. The first is ps, which gives a snapshot of the running processes on the node.

[abc123@qnode4233 ~]$ ps -u$USER -o %cpu,rss,args

In the command above, we ask ps to report the CPU percentage (%cpu), resident set size (rss), and command line (args) for your running processes. The output looks like this:

%CPU   RSS COMMAND
 0.0  1576 /bin/bash /var/spool/slurmd/job549437/slurm_script
 0.4  5504 mpirun -n 10 lmp -in in.fcc
 100 30248 lmp -in in.fcc
 100 28384 lmp -in in.fcc
 100 28524 lmp -in in.fcc
 100 28632 lmp -in in.fcc
 100 28512 lmp -in in.fcc
 100 28264 lmp -in in.fcc
 100 28064 lmp -in in.fcc
 100 28084 lmp -in in.fcc
 100 28140 lmp -in in.fcc
 100 28668 lmp -in in.fcc
 0.0  2124 sshd: abc123@pts/10
 0.1  2140 -bash
 0.0  1572 ps -uabc123 -o %cpu,rss,args

The output shows 10 lmp tasks, each using 100% of a CPU core and about 30 megabytes of memory (the default unit for rss in the ps command is kilobytes). In total, this job was using 10 CPU cores and around 300 megabytes of memory when the ps command was issued.
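If you would rather not add the numbers by hand, you can sum the rss column for all of your processes on the node. This one-liner is a minimal sketch; it totals every process you own there, including your login shell:

[abc123@qnode4233 ~]$ ps -u$USER -o rss= | awk '{sum+=$1} END {print sum, "kB"}'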

A similar result can be obtained from the top command (short for "table of processes"), which shows live data instead of a snapshot.

[abc123@qnode4233 ~]$ top -u$USER

This command starts the live task manager. Memory and CPU usage can be tracked in the RES and %CPU columns, respectively. We see 10 lmp tasks, each consuming around 30000 kilobytes of memory and 99.9% of one CPU. Once you have gathered the necessary information, press q to quit the task manager.

top - 01:07:31 up 168 days, 18:08,  1 user,  load average: 7.15, 2.42, 1.84
Threads: 1332 total,  11 running, 1321 sleeping,   0 stopped,   0 zombie
%Cpu(s): 50.3 us,  0.2 sy,  0.0 ni, 49.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 13183529+total, 11912448+free,  6916132 used,  5794680 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 12324249+avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
16234 abc123    20   0  613580  28668  11948 R 99.9  0.0   1:15.64 lmp
16223 abc123    20   0  629792  30700  12108 R 99.9  0.0   1:15.64 lmp
16225 abc123    20   0  613220  28796  12276 R 99.9  0.0   1:15.67 lmp
16229 abc123    20   0  612824  28560  12260 R 99.9  0.0   1:15.65 lmp
16230 abc123    20   0  612824  28332  12232 R 99.9  0.0   1:15.66 lmp
16232 abc123    20   0  612812  28116  11872 R 99.9  0.0   1:15.66 lmp
16233 abc123    20   0  612804  28152  11920 R 99.9  0.0   1:15.66 lmp
16226 abc123    20   0  612836  28536  12312 R 99.7  0.0   1:15.66 lmp
16228 abc123    20   0  613184  28636  11988 R 99.7  0.0   1:15.66 lmp
16231 abc123    20   0  612416  28088  11968 R 99.7  0.0   1:15.63 lmp

Now, let's examine what we see for another job running the pi_red program. Note that this program uses threading for parallelization. Here, ps reports 999% CPU (which translates to utilizing about 10 cores) and 620 kilobytes of memory (RSS).

[abc123@qnode4233 ~]$ ps -u$USER -o %cpu,rss,args
%CPU   RSS COMMAND
 0.0  1468 /bin/bash /var/spool/slurmd/job549441/slurm_script
 999   620 ./pi_red
 0.0  2120 sshd: abc123@pts/0
 0.0  2192 -bash
 0.0  1572 ps -uabc123 -o %cpu,rss,args

The same information can be obtained from the RES and %CPU columns of the top command for the pi_red program.

[abc123@qnode4233 ~]$ top -u$USER
top - 15:49:20 up 123 days,  8:50,  1 user,  load average: 9.30, 4.14, 1.61
Tasks: 279 total,   2 running, 277 sleeping,   0 stopped,   0 zombie
%Cpu(s): 50.0 us,  0.0 sy,  0.0 ni, 49.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 13183529+total, 10491147+free,  6206644 used, 20717172 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 12455683+avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2068 abc123    20   0   29220    620    516 R  1000  0.0  27:01.04 pi_red
 2357 abc123    20   0  157836   2376   1560 R   0.7  0.0   0:00.07 top
 1876 abc123    20   0  113120   1468   1212 S   0.0  0.0   0:00.00 slurm_script
 2082 abc123    20   0  130612   2120    936 S   0.0  0.0   0:00.00 sshd
 2083 abc123    20   0  115516   2192   1672 S   0.0  0.0   0:00.02 bash
