Quest FAQ

Quest and Kellogg Linux Cluster Downtime, December 14 - 18.

Quest, including the Quest Analytics Nodes, the Genomics Compute Cluster (GCC), the Kellogg Linux Cluster (KLC), and Quest OnDemand, will be unavailable for scheduled maintenance starting at 8 A.M. on Saturday, December 14, and ending at approximately 5 P.M. on Wednesday, December 18. During the maintenance window, you will not be able to log in to Quest, the Quest Analytics Nodes, the GCC, KLC, or Quest OnDemand; submit new jobs; run jobs; or access files stored on Quest in any way, including through Globus. For details on this maintenance, please see the Status of University IT Services page.

Quest RHEL8 Pilot Environment - November 18.

Starting November 18, all Quest users are invited to test and run their workflows in a RHEL8 pilot environment to prepare for Quest moving completely to RHEL8 in March 2025. Researchers can send feedback during the pilot to the Research Computing and Data Services team at quest-help@northwestern.edu. The pilot environment will consist of 24 H100 GPU nodes and 72 CPU nodes, and it will expand with additional nodes through March 2025. Details on how to access this pilot environment will be published in a KB article on November 18.

This page contains frequently asked questions about the Quest high-performance computing cluster at Northwestern.

Allocations And Accounts

How do I get an account on Quest?

To access Quest, you will need to be part of an active research allocation. There are two ways to obtain one:

  • Submit a new allocation request form for either a Research I or Research II allocation.
  • Join an existing research allocation.

The application forms can be found at Request Research Allocation Forms.

A Research I allocation is suitable for projects requiring 100,000 compute hours or less and provides a 1 TB project directory. The resources provided by Research I fit the needs of the majority of our users. A Research II allocation is suitable for projects requiring up to 500,000 compute hours and provides 2 TB of project storage. Research II allocations are for projects with a large computational need, and they require a detailed proposal. Both Research I and II allocations are available free of charge, but we do request that you provide a chartstring for your research if possible. This chartstring is used only for internal tracking, so that we can see the impact Quest is having on research done on campus.

Am I currently in an active allocation on Quest?

To see which allocations you belong to on Quest, log in to Quest and run the groups command:
$ groups
p31014 p31182 w10001

Note that this list includes all allocations you have belonged to on Quest, some of which may no longer be active. To check the status of an allocation you belong to, use the checkproject command:

$ checkproject p12345
====================================

Reporting for project p12345
------------------------------------
768 GB in 141942 files (76.80% of 1000 GB quota)
Allocation Type: Allocation I
Expiration Date: 2023-03-01
Status: ACTIVE
Compute and storage allocation - when status is ACTIVE, this allocation has compute node access and can submit jobs
------------------------------------

====================================

The “Status” of your allocation will be either “ACTIVE” or “EXPIRED”. You may apply to renew an expired allocation using the form appropriate for your allocation type. In addition, checkproject displays the percentage of the storage quota currently used by your allocation, the date when the allocation expires, and whether the allocation can submit jobs or is storage only.
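To check every allocation at once, the two commands above can be combined. The following is a minimal sketch, not an official Quest tool: the function names allocation_status and list_allocation_status are hypothetical, and the parsing assumes that checkproject prints a line beginning with "Status:" exactly as in the sample output above, and that every group returned by groups is an allocation ID that checkproject accepts.

```shell
# Read checkproject output on standard input and print the value of its
# "Status:" line (ACTIVE or EXPIRED in the sample output above).
allocation_status() {
    awk -F': *' '/^Status:/ { print $2 }'
}

# Print one "allocation: status" line for every group reported by `groups`.
# Assumption: every group name is an allocation ID that checkproject accepts.
list_allocation_status() {
    local alloc
    for alloc in $(groups); do
        printf '%s: %s\n' "$alloc" "$(checkproject "$alloc" | allocation_status)"
    done
}
```

After defining both functions in your shell session on Quest, run list_allocation_status to get a one-line summary per allocation, or run checkproject p12345 | allocation_status for a single allocation.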

How do I retain access to Quest after I leave the University?

As long as you have a valid Northwestern NetID and password, you can access Quest. When your NetID is deactivated by the University, your Quest access will also end. If you want to continue using Quest, you should obtain a guest or affiliate NetID. Designated individuals in each department or school can request guest NetIDs; you or your supervisor can reach out to them.

Please review the following documentation for more information: Affiliate, Departmental, and Guest NetIDs.

Resources

How do I use GPUs on Quest?

There are two ways to use GPUs on Quest. You can either purchase your own GPU nodes or use a general access allocation (i.e. Research I, Research II, and Education) to access shared GPU nodes. Please see GPUs on Quest for more information.

Data Transfer

How do I transfer files to and from OneDrive/Sharepoint on Quest?

Use Globus to transfer data between the Northwestern Quest Globus Endpoint and the Northwestern OneDrive/Sharepoint Endpoint.

How do I get access to the Globus endpoint for RDSS (RESFILES/FSMRESFILES)?

To use Globus to transfer data to and from RDSS (also known as RESFILES or FSMRESFILES), open a service request by emailing quest-help@northwestern.edu.

How do I transfer files to and from RDSS (RESFILES/FSMRESFILES) via Globus?

Use Globus to transfer data between the Northwestern Quest Globus Endpoint and the Northwestern RDSS/FSMRES Endpoint.

How do I share the data on Quest with my collaborators outside Northwestern?

You can use Globus to share data with external collaborators who do not have a Northwestern affiliation. Please follow the instructions here: Sharing files with Globus. Log in to Globus with your Northwestern affiliation. To access your files and folders on Quest, select the collection (i.e., the endpoint) named "Northwestern Quest". You can share data with collaborators who have personal Globus IDs or accounts through their institution's subscription.

Scheduler and Job Submissions

I get the error "Unable to allocate resources: Requested time limit is invalid (missing or exceeds some limit)" when trying to submit a job.

This error indicates that you have requested a longer run time for your job than the selected queue allows. To allow the job to run, either reduce the job's walltime so that it falls within the selected queue's limit, or select a different queue with a higher walltime limit.

You can find a list of all queues and their walltime limits at Quest Partitions/Queues.
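As an illustration, the walltime and the queue are both set in the header of the job submission script. The sketch below uses standard Slurm sbatch directives; the allocation ID p12345, the queue name short, and the resource values are placeholders chosen for this example, so take the real queue names and limits from Quest Partitions/Queues.

```shell
#!/bin/bash
# Placeholder allocation ID; replace with your own.
#SBATCH --account=p12345
# Assumed queue name; pick one whose walltime limit covers --time below.
#SBATCH --partition=short
# Requested walltime as HH:MM:SS; must not exceed the queue's limit.
#SBATCH --time=01:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem=4G
#SBATCH --job-name=walltime_example

# The job's actual commands go here; this line only reports the node name.
echo "Job started on $(uname -n)"
```

If the value given to --time is larger than the limit of the queue given to --partition, Slurm rejects the submission with the error quoted above.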

My job was killed on a login node.

From time to time, we encounter errors on the login nodes that require killing all running jobs on that node to prevent the node from crashing. Unfortunately, your job may have been one of those that were killed.

To avoid such cancellations, we recommend submitting interactive or batch jobs to the compute nodes. Login nodes are shared resources intended as entry points to Quest for all users, so running CPU- or memory-heavy work on them affects everyone trying to access Quest. Please see Submitting a Job On Quest for more information about submitting interactive jobs.
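For example, an interactive shell on a compute node can be requested with Slurm's srun command instead of working on the login node. This is a sketch under stated assumptions: the wrapper name quest_interactive is hypothetical, and the node, task, and memory values are illustrative defaults rather than Quest recommendations.

```shell
# Hypothetical helper: open an interactive shell on a compute node via Slurm.
# Usage: quest_interactive ALLOCATION_ID PARTITION WALLTIME
quest_interactive() {
    local alloc="$1" partition="$2" walltime="$3"
    # --pty attaches your terminal to the shell started on the compute node.
    srun --account="$alloc" --partition="$partition" --time="$walltime" \
         --nodes=1 --ntasks-per-node=1 --mem=4G --pty bash -l
}
```

A call such as quest_interactive p12345 short 01:00:00 (placeholder allocation ID and queue name) waits until the scheduler grants the resources and then opens a shell on the assigned compute node.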

I get the error "sbatch: error: Batch job submission failed: Invalid qos specification" when trying to submit a job.

This error is commonly observed if your allocation has expired. Slurm does not allow job submission if you are using an expired allocation. You can run the command checkproject <allocation ID> to see the expiration date of your allocation. If your research project continues and you want to continue using the same allocation, you will need to renew it by Requesting a Research Allocation.
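One way to catch an expired allocation before submitting is to check its status first. The wrapper below is a minimal sketch: the name submit_if_active is hypothetical, and it assumes that checkproject prints a line reading "Status: ACTIVE" for an active allocation, as in the sample output earlier on this page.

```shell
# Hypothetical wrapper: submit a batch script only if the allocation is ACTIVE.
# Usage: submit_if_active ALLOCATION_ID JOB_SCRIPT
submit_if_active() {
    local alloc="$1" script="$2"
    # Assumption: checkproject reports "Status: ACTIVE" at the start of a line.
    if checkproject "$alloc" | grep -q '^Status: ACTIVE'; then
        sbatch --account="$alloc" "$script"
    else
        echo "Allocation $alloc is not ACTIVE; renew it before submitting." >&2
        return 1
    fi
}
```

For instance, submit_if_active p12345 job.sh submits job.sh under allocation p12345 when it is active; otherwise it prints a reminder and returns a non-zero exit status without contacting the scheduler.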

Software

The software I want to use is not available on Quest.

If the software you require for your research is not available on Quest, there are a few options you can try. The first is to install the software locally in your home directory or project directory, following the instructions specific to that software. If a local install does not work, you can fill out the Software Installation Request Form, and we will assist you with the installation.

How can I install and use TensorFlow with GPUs on Quest?

Please see the What GPU software is available on QUEST? section of GPUs on Quest for more information.

 

Details

Article ID: 1802
Created: Thu 5/12/22 1:39 PM
Modified: Fri 10/25/24 3:12 PM