Using Amazon S3 Storage

When and how to use Amazon Web Services (AWS) S3 storage to store data in the cloud.

Amazon S3 is a cloud-based "object storage" service that allows files to be securely stored and accessed from anywhere. It can scale to handle any amount of data, and pricing is based solely on usage, with multiple service tiers available for flexibility and cost optimization. More information about Amazon S3 can be found at https://aws.amazon.com/s3/.

When to Use Amazon S3

  • Backups and data archiving. S3 is a relatively low-cost solution for data storage ($0.023/GB/month) and can be configured to automatically transfer data to the even lower-cost AWS Glacier service for long-term archiving of data that will not be accessed frequently.
  • Data analysis using Amazon Web Services. If you will be using Amazon's other services (AWS EC2, AWS Batch, AWS Elastic MapReduce, etc.) to analyze your data, storing it in S3 will provide significantly faster data access than transferring from other services.
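
The automatic transfer to Glacier mentioned above is typically set up with an S3 lifecycle rule. The following is a minimal sketch, not an official Northwestern procedure: the function name, the `backups/` prefix, the 90-day threshold, and the bucket name in the comments are all hypothetical. The code only builds and prints the rule; applying it would require the boto3 library and valid AWS credentials.

```python
import json


def glacier_lifecycle_config(prefix: str, days: int = 90) -> dict:
    """Build a lifecycle configuration that moves objects under `prefix`
    to the GLACIER storage class once they are `days` days old."""
    if days < 0:
        raise ValueError("days must not be negative")
    return {
        "Rules": [
            {
                "ID": f"archive-after-{days}-days",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical prefix and threshold, chosen only for illustration.
    config = glacier_lifecycle_config("backups/", days=90)
    print(json.dumps(config, indent=2))

    # To apply the rule (requires boto3 and AWS credentials; the bucket
    # name below is a placeholder):
    #   import boto3
    #   boto3.client("s3").put_bucket_lifecycle_configuration(
    #       Bucket="my-research-backups",
    #       LifecycleConfiguration=config,
    #   )
```

Keeping the rule as a plain dictionary makes it easy to review before it is applied to a bucket.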

The Northwestern Globus S3 connector service can access S3 buckets in the us-east-1 (Northern Virginia), us-east-2 (Ohio), and us-west-2 (Oregon) regions.
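
Before creating a bucket, it can help to confirm that its region is one the connector can reach. The sketch below simply encodes the three regions listed above; the helper name `is_supported_region` is an assumption for illustration and is not part of any Globus or AWS API.

```python
# Regions the Northwestern Globus S3 connector can access, per this article.
SUPPORTED_REGIONS = {
    "us-east-1": "Northern Virginia",
    "us-east-2": "Ohio",
    "us-west-2": "Oregon",
}


def is_supported_region(region: str) -> bool:
    """Return True if `region` is one of the regions listed in this article."""
    return region.strip().lower() in SUPPORTED_REGIONS


if __name__ == "__main__":
    for region in ("us-east-2", "eu-west-1"):
        status = "supported" if is_supported_region(region) else "not supported"
        print(f"{region}: {status}")
```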

Amazon S3 Pricing and Cost Considerations

Amazon S3 pricing can be found at https://aws.amazon.com/s3/pricing/. For the regions supported by this service, the standard storage tier costs $0.023/GB/month. Requests (that is, API commands that interact with the S3 service) are also charged, but this usually amounts to only a few cents per month unless a very large number of requests are issued.
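
As a rough planning aid, the storage charge can be estimated by multiplying the amount of data by the standard-tier rate quoted above. This is only a sketch: the function name and the example sizes are hypothetical, request charges are ignored, and the rate should be checked against the current AWS pricing page.

```python
# Standard storage tier rate quoted in this article, in USD per GB per month.
STANDARD_RATE_PER_GB_MONTH = 0.023


def monthly_storage_cost(size_gb: float,
                         rate: float = STANDARD_RATE_PER_GB_MONTH) -> float:
    """Estimate the monthly storage charge in USD, rounded to cents.

    Request charges are not included.
    """
    if size_gb < 0:
        raise ValueError("size_gb must not be negative")
    return round(size_gb * rate, 2)


if __name__ == "__main__":
    # Example sizes chosen only for illustration.
    for size_gb in (100, 1000, 10000):
        monthly = monthly_storage_cost(size_gb)
        print(f"{size_gb:>6} GB: ${monthly:,.2f}/month, "
              f"${monthly * 12:,.2f}/year")
```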

Data transfer *in* to Amazon S3 is always free. However, data transfer out to the internet costs $0.09/GB, and data transfer from one S3 region to another costs $0.01/GB or $0.02/GB, depending on the region. Therefore, it is important to consider which region you create your buckets in to minimize the cost and latency of data transfer.
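
The transfer rates above can be compared the same way. The sketch below uses only the per-GB figures quoted in this article; the category names and the 200 GB example are hypothetical, and the estimate does not account for the waiver that may apply to accounts under Northwestern's consolidated billing.

```python
# Per-GB transfer rates quoted in this article, in USD.
TRANSFER_RATES_PER_GB = {
    "in": 0.00,                 # transfer in to S3 is free
    "out_to_internet": 0.09,    # transfer out to the internet
    "cross_region_low": 0.01,   # region to region, lower rate
    "cross_region_high": 0.02,  # region to region, higher rate
}


def transfer_cost(size_gb: float, kind: str) -> float:
    """Estimate a one-time transfer charge in USD, rounded to cents."""
    if size_gb < 0:
        raise ValueError("size_gb must not be negative")
    if kind not in TRANSFER_RATES_PER_GB:
        raise ValueError(f"unknown transfer kind: {kind!r}")
    return round(size_gb * TRANSFER_RATES_PER_GB[kind], 2)


if __name__ == "__main__":
    size_gb = 200  # example size chosen only for illustration
    for kind in TRANSFER_RATES_PER_GB:
        print(f"{kind:<18} ${transfer_cost(size_gb, kind):.2f}")
```

Comparing the rows side by side shows why bucket region matters: moving the same data out to the internet costs several times more than moving it between regions.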

Northwestern has a consolidated billing account with Amazon Web Services, and accounts created within this billing structure have their data transfer out costs waived (provided certain conditions are met). More information about accounts can be found at http://www.cloud.northwestern.edu/aws/.

 

Details

Article ID: 2024
Created: Wed 11/16/22 3:18 PM
Modified: Fri 10/25/24 3:18 PM

Related Services / Offerings

Northwestern IT offers support, training, and workshops on research data management topics.