Transferring data from Google Cloud Storage to Quest using the gcloud CLI




How to transfer data between Quest and Google Cloud Storage 

Setup

To use the Google Cloud command line interface (gcloud CLI) to transfer data between Quest and a Google Cloud Storage bucket, you will need

Using the gcloud CLI in Quest 

First, log in to Quest. For detailed instructions, see the Logging in to Quest help article.

The gcloud CLI is installed system-wide on Quest. To load this module, run: 

module load gcloud/379.0.0 
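
To confirm that loading the module made both tools visible in your shell, a quick check along the following lines may help. This is a minimal sketch: the function name check_gcloud_tools is illustrative and is not part of Quest or the gcloud CLI.

```shell
# Report whether gcloud and gsutil are visible in the current shell.
# Returns non-zero if either is missing (for example, if the module is not loaded).
check_gcloud_tools() {
  status=0
  for tool in gcloud gsutil; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: found"
    else
      echo "$tool: missing"
      status=1
    fi
  done
  return "$status"
}
```

Run check_gcloud_tools after the module load command. If both tools are found, gcloud --version prints the installed version.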

Once the module is loaded, you can use the gcloud CLI to copy data between Quest and Google Cloud Storage. To initialize the gcloud CLI, run:

gcloud init 

This command guides you through selecting or setting up your Google Cloud configuration and authenticating with your user credentials. For details on authenticating the gcloud CLI with the Google Cloud Platform, see Google’s Initializing the gcloud CLI documentation.

After authenticating, you will be prompted to select or create the cloud project that contains your Google Cloud Storage bucket. To find out which project your bucket belongs to, open the Google Cloud Console and select Cloud Storage from the Quick access menu or from the hamburger menu in the upper left corner of the page. On the Cloud Storage page, choose a project you have access to from the drop-down to list the buckets in that project. When you see the bucket you want to access, you know which project contains it; select that project at the gcloud CLI prompt.
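
After initialization, it can be useful to confirm which account and project the gcloud CLI is using before moving any data. The sketch below assumes the module is loaded and gcloud init has completed. The function names show_gcloud_context and use_project are illustrative, and my-project-id is a placeholder rather than a real project.

```shell
# Show which account and project the gcloud CLI will use.
show_gcloud_context() {
  gcloud auth list                  # credentialed accounts; the active one is marked
  gcloud config get-value project   # project currently selected in the configuration
}

# Switch to a different project without rerunning gcloud init.
# $1 is a project ID, for example the placeholder my-project-id.
use_project() {
  gcloud config set project "$1"
}
```

For example, show_gcloud_context prints the current settings, and use_project my-project-id changes the selected project.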

After selecting the appropriate project, you can transfer data between Quest storage and your Google Cloud Storage bucket with the gsutil tool, which is part of the gcloud CLI. Google provides documentation on the gsutil tool; some commonly used commands are:

  • To list all the Google Cloud Storage buckets you have access to under your selected project: 

gsutil ls 

  • To list the objects and subdirectory names at the top level of a bucket:

gsutil ls gs://bucket_name 

  • To copy a local file on Quest to your Google Cloud Storage bucket: 

gsutil cp quest_file.txt gs://bucket_name/quest_file.txt 

  • To copy a local directory to your Google Cloud Storage bucket: 

gsutil cp -r Source_Directory gs://bucket_name/Destination_Directory 

  • To sync a directory, copying only the files or objects that are missing or whose data has changed:

gsutil rsync -r Source_Directory gs://bucket_name/Destination_Directory 
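
The commands above all upload from Quest to a bucket. The same gsutil commands work in the download direction when the gs:// URL is given as the source and a Quest path as the destination. The following is a minimal sketch of that direction: the function names are illustrative, and bucket_name and the file and directory names are the same placeholders used above. The sync functions create the local destination first, on the assumption that gsutil rsync expects it to exist.

```shell
# Download direction: the bucket is the source, a Quest path is the destination.

# Copy one object from a bucket to a local path on Quest.
# $1 = object URL such as gs://bucket_name/quest_file.txt, $2 = local path
pull_file() {
  gsutil cp "$1" "$2"
}

# Copy a directory tree from a bucket; -m runs the transfers in parallel.
# $1 = URL such as gs://bucket_name/Source_Directory, $2 = local directory
pull_dir() {
  gsutil -m cp -r "$1" "$2"
}

# Report what a sync would copy without transferring anything (-n is a dry run).
# $1 = bucket directory URL, $2 = local directory (created if missing)
preview_sync() {
  mkdir -p "$2"
  gsutil rsync -n -r "$1" "$2"
}

# Copy only the objects that are missing locally or have changed in the bucket.
sync_dir() {
  mkdir -p "$2"
  gsutil rsync -r "$1" "$2"
}
```

For example, preview_sync gs://bucket_name/Source_Directory Destination_Directory shows what would be transferred, and sync_dir with the same arguments performs the transfer.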

 

For more information on the gcloud CLI, see the Google Cloud documentation for the gcloud CLI.

For more information on gsutil commands, see the gsutil tool documentation provided by Google.

Details


Article ID: 2443
Created: Mon 9/18/23 1:54 PM
Modified: Fri 10/25/24 3:13 PM

Related Services / Offerings


Quest, Quest Analytics Nodes, Kellogg Linux Cluster (KLC), and Genomics Compute Cluster (GCC).
Northwestern IT offers consultations on using cloud resources (AWS, Azure, GCP, etc.) for research at Northwestern.
Northwestern IT offers support, training, and workshops on research data management topics.