There are two flavors of A100 GPUs on Quest: PCIe and SXM. With the submission script in the block above, there is no way of knowing which type of A100 GPU Slurm will assign to the job. However, you can request a specific type with the --constraint flag: pcie for the 40GB A100s or sxm for the 80GB A100s.
Whether you want your job to land on a PCIe or an SXM A100 depends largely on the kind of job you are running; in particular, whether you will use a single GPU card or multiple GPU cards.
Considerations for Using a Single GPU Card
If you only need one GPU card, consider how much GPU memory you will need on that card. If your memory needs are under 40GB, request a PCIe A100; if you need more than 40GB on a single card, request an SXM A100.
Considerations for Using Multiple GPU Cards
If you plan to use multiple GPU cards for your job, keep in mind that sharing data between two, three, or four SXM cards is much faster than sharing data between the two cards on a PCIe A100 node.
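As a sketch of a multi-GPU request, the directives below ask for four SXM A100 cards on a single node. This follows the same pattern as the single-card example later in this section; the GPU count of 4 and the one-hour walltime are illustrative assumptions, and <allocationID> and XXG are placeholders you must fill in yourself:

#SBATCH -A <allocationID>
#SBATCH -p gengpu
#SBATCH --gres=gpu:a100:4
#SBATCH --constraint=sxm
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -t 1:00:00
#SBATCH --mem=XXG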
The following example submission script requests one 80GB SXM A100 card.
#SBATCH -A <allocationID>
#SBATCH -p gengpu
#SBATCH --gres=gpu:a100:1
#SBATCH --constraint=sxm
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -t 1:00:00
#SBATCH --mem=XXG
Replace sxm with pcie and you will receive a 40GB A100 instead.
If you don't specify any constraint, you will be assigned an A100 at random. Whichever GPU you receive, you automatically get 100% of its on-board memory (40GB or 80GB). GPU memory is separate from the system memory you request with the --mem flag; they are not the same thing.
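If you let Slurm pick at random and want to confirm which A100 type your job actually received, you can add a check near the top of your job script. nvidia-smi is NVIDIA's standard monitoring tool and reports the card name and total GPU memory (this command only works on a node with an NVIDIA GPU and driver installed):

nvidia-smi --query-gpu=name,memory.total --format=csv

An 80GB card in the output indicates an SXM A100; a 40GB card indicates a PCIe A100.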