Using RStudio Server on a Quest compute node

Summary

This article demonstrates how a user can run RStudio Server on a Quest compute node through a Slurm interactive or batch job.

Body

RStudio Server is offered to all Quest users on the Analytics Nodes and is a convenient, browser-based option for R users. Because the Analytics Nodes are a shared resource, only some workflows are appropriate to run there. Moreover, RStudio Server may be slow or unstable depending on the activity of other users. In particular, if you are working with large datasets, your RStudio session on the Analytics Nodes may crash.

As an alternative, you can run RStudio Server (preferred) or RStudio Desktop on the Quest compute nodes, where you can reserve memory and CPU resources that are not shared with other users.

RStudio Server

RStudio Server can be launched and run on a Quest compute node through an interactive job or batch job on Quest.

To schedule the interactive job from the command line on Quest, ssh into a login node and type:

srun -A <allocation_name> -p <queue_name> -N 1 --ntasks-per-node=1 --mem-per-cpu=4G --time=04:00:00 --pty bash -l

This example requests a single core with 4 GB of memory for a 4-hour job. Substitute an active allocation name and queue name. For example, with allocation p12345 and the short queue, the command would be:

srun -A p12345 -p short -N 1 --ntasks-per-node=1 --mem-per-cpu=4G --time=04:00:00 --pty bash -l

Note that the more cores requested, the longer the wait for the interactive session to start. Do not request more than 1 node for RStudio Server sessions.
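
If you start these sessions often, the request above can be wrapped in a small shell helper. The following is a minimal sketch and not part of Quest's tooling: the function name quest_rstudio_srun_cmd is hypothetical, and it only prints the srun command so that you can review it before running it yourself.

```shell
# Hypothetical helper: print (do not run) the single-node srun request shown above.
# Arguments: allocation, queue, memory per CPU (default 4G), walltime (default 04:00:00).
quest_rstudio_srun_cmd() {
    allocation="$1"
    queue="$2"
    mem="${3:-4G}"
    walltime="${4:-04:00:00}"
    printf 'srun -A %s -p %s -N 1 --ntasks-per-node=1 --mem-per-cpu=%s --time=%s --pty bash -l\n' \
        "$allocation" "$queue" "$mem" "$walltime"
}

# Prints the same command as the p12345 example above.
quest_rstudio_srun_cmd p12345 short
```

The node count is fixed at 1 in the sketch, in line with the note above about not requesting more than one node.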

Once the session begins, get the name of the compute node the session has landed on by running the command hostname, e.g.,

$ hostname
qnode0372

Next, load the version of R you would like to run and any additional modules that you need for installing or using certain R packages, e.g.,

[qnode0372 ~]$ module purge
[qnode0372 ~]$ module load R/4.2.0
[qnode0372 ~]$ module load geos/3.8.1

After you have loaded these modules, load the rstudio-server/2023.06 module, which displays a shorthand version of the instructions given here.

[qnode0372 ~]$ module load rstudio-server/2023.06
If you have not already done so, make sure you are running and interactive or batch
job.
Before launching RStudio Server, load the version of R you would like to run and any
additional modules that you need for installing or using certain R packages, e.g.,
module load R/4.2.0
module load geos/3.8.1
Once this is done, call the command `rserver <port_number>` where <port_number>
should be a value between 8000 and 9000.
Based on the port number and the compute node that RStudio Server is running on, you
will then tunnel to the server using the command:
`ssh -L <port_number>:localhost:<port_number> <your_netID>@quest.northwestern.edu ssh
-N -L <port_number>:localhost:<port_number> qnode<number>` filling in the appropriate
value for <port_number> and <compute_node>
Finally, in your *local browser* you can then put in the URL
`localhost:<port_number>` and connect to your RStudio Server session. 
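
The module message above asks for a port number between 8000 and 9000 but leaves the choice to you. One way to pick a value is sketched below. The function name pick_port is hypothetical, and the sketch does not check whether the port is already in use on the compute node, so you may need to pick again if rserver reports a conflict.

```shell
# Hypothetical helper: pick a pseudo-random port in the 8000-9000 range
# requested by the module message. It does not test whether the port is free.
pick_port() {
    awk 'BEGIN { srand(); print 8000 + int(rand() * 1001) }'
}

port=$(pick_port)
echo "Selected port: $port"
# On the compute node you would then run: rserver "$port"
```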

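As the module message explains, launch RStudio Server on the compute node by running rserver <port_number>, choosing a port number between 8000 and 9000. Once RStudio Server is running on the compute node, open a new terminal window on your local computer and type: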

ssh -L <port_number>:localhost:<port_number> <your_netID>@quest.northwestern.edu ssh -g -N -L <port_number>:localhost:<port_number> qnode<number>

In the command template above, be sure to replace <your_netID> with your netID, replace qnode<number> with the name of the compute node, and replace all <port_number> instances with the port number between 8000 and 9000 that you selected. You will be prompted for your Quest password. After you enter it, the terminal will not return a prompt; this is expected. Leave this terminal window open for as long as you are using RStudio Server.
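
Because the template has five placeholders to fill in consistently, it can help to generate the command rather than edit it by hand. The sketch below is illustrative only: the function name quest_tunnel_cmd and the netID abc1234 are hypothetical, and the function prints the command without running it.

```shell
# Hypothetical helper: print the two-hop tunnel command for a netID, node, and port.
quest_tunnel_cmd() {
    netid="$1"
    node="$2"
    port="$3"
    # Reject anything that is not a plain number.
    case "$port" in
        ''|*[!0-9]*) echo "port must be a number" >&2; return 1 ;;
    esac
    # Reject ports outside the 8000-9000 range used in this article.
    if [ "$port" -lt 8000 ] || [ "$port" -gt 9000 ]; then
        echo "port must be between 8000 and 9000" >&2
        return 1
    fi
    printf 'ssh -L %s:localhost:%s %s@quest.northwestern.edu ssh -g -N -L %s:localhost:%s %s\n' \
        "$port" "$port" "$netid" "$port" "$port" "$node"
}

# Example with a placeholder netID and the node name from the hostname step.
quest_tunnel_cmd abc1234 qnode0372 8787
```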

On your local computer, open up your browser and connect to http://localhost:<port_number>/. Your browser is now connected to the RStudio Server session running on Quest.

Note that your RStudio Server session will quit abruptly when the walltime of the interactive job comes to an end. Save often and be aware of walltime to avoid losing your work.
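
This article walks through the interactive route only. For the batch route mentioned at the start of this section, the same steps could be collected into a job script, as in the sketch below. This is an untested outline rather than an official Quest template: the file name rstudio_server_job.sh and the job name are hypothetical, the resource requests simply mirror the interactive example, and it assumes that rserver keeps running in the foreground for the life of the job. The angle-bracket placeholders must be filled in before the script is submitted.

```shell
# Write a hypothetical batch script to the current directory.
# Fill in the angle-bracket placeholders before submitting it.
cat > rstudio_server_job.sh <<'EOF'
#!/bin/bash
#SBATCH -A <allocation_name>
#SBATCH -p <queue_name>
#SBATCH -N 1
#SBATCH --ntasks-per-node=1
#SBATCH --mem-per-cpu=4G
#SBATCH --time=04:00:00
#SBATCH --job-name=rstudio-server

module purge
module load R/4.2.0
module load rstudio-server/2023.06

# Record the compute node name in the job output so you know where to tunnel.
echo "RStudio Server node: $(hostname)"

# Assumption: rserver stays in the foreground until the job ends.
rserver <port_number>
EOF

# After filling in the placeholders, submit with: sbatch rstudio_server_job.sh
```

Once the job starts, the node name in the job's output file takes the place of the hostname step, and the tunnel and browser steps are the same as for the interactive job. The walltime caveat above applies here as well.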

Details

Article ID: 2175
Created: Tue 2/21/23 4:37 PM
Modified: Thu 1/23/25 3:32 PM

Related Articles

Learn how you can run VS Code (Server) on a Quest Compute Node