Troubleshooting Installing R Packages on Quest and Quest Analytics

Installing R packages in an RStudio Server or R instance on Quest log-in nodes, Quest Analytics nodes, or Quest OnDemand.

Introduction

Due to technical limitations of RStudio Server, some R packages that install and load correctly on the Quest log-in nodes may not work when loaded in RStudio Server on the Quest Analytics nodes. If you want to use the RStudio Server graphical interface but cannot load all the R packages you need on Quest Analytics, launch RStudio Server through Quest OnDemand instead.

The R environment in RStudio Server on the Quest Analytics nodes is equivalent to loading the following modules on the Quest log-in or compute nodes:

R/4.4.0
hdf5/1.14.1-2-gcc-12.3.0
gsl/2.7.1-gcc-12.3.0
fftw/3.3.10-gcc-12.3.0
gdal/3.7.0-gcc-12.3.0
nlopt/2.7.1-gcc-12.3.0
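
Because the same module list is reused throughout this article, it may be convenient to wrap it in a shell function. The following is a minimal sketch, not a Quest-provided tool: the function name load_quest_r_env and the variable QUEST_R_MODULES are hypothetical, and the sketch assumes the module command is available in your shell, as it is on Quest nodes.

```shell
# Modules listed above as matching the RStudio Server R environment.
QUEST_R_MODULES="R/4.4.0 hdf5/1.14.1-2-gcc-12.3.0 gsl/2.7.1-gcc-12.3.0 fftw/3.3.10-gcc-12.3.0 gdal/3.7.0-gcc-12.3.0 nlopt/2.7.1-gcc-12.3.0"

# Hypothetical helper: purge the current modules, then load each one in order.
load_quest_r_env() {
  # The module command only exists on systems with environment modules.
  if ! command -v module >/dev/null 2>&1; then
    echo "module command not found; run this on a Quest node" >&2
    return 1
  fi
  module purge all
  for m in $QUEST_R_MODULES; do
    # Stop at the first module that fails to load.
    module load "$m" || return 1
  done
}
```

After defining the function in a terminal session, running load_quest_r_env followed by R should give the same environment as typing the module commands one by one.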

For posterity, we keep the previous installation instructions from older versions of R.

Before starting these instructions, check any files in which you may have hardcoded paths at some point. These include ~/.bash_profile, ~/.bashrc, ~/.local/share/rstudio/Renviron, ~/.local/share/rstudio/.Rprofile, and ~/.R/Makevars.
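
One way to scan those files is sketched below. The function name check_hardcoded_paths is hypothetical, and the search pattern is an assumption about what a hardcoded path usually looks like (R library variables, LD_LIBRARY_PATH, module load lines, and -L/ or -I/ compiler flags). Adjust it to your own setup.

```shell
# Hypothetical helper: print file name, line number, and text of every line
# that matches a pattern commonly associated with hardcoded paths.
check_hardcoded_paths() {
  for f in "$@"; do
    # Skip files that do not exist in this account.
    [ -f "$f" ] || continue
    grep -nH -E -e 'R_LIBS|R_HOME|LD_LIBRARY_PATH|module load|-[LI]/' "$f" || true
  done
  return 0
}

# Example call with the files named in this article:
#   check_hardcoded_paths ~/.bash_profile ~/.bashrc \
#     ~/.local/share/rstudio/Renviron ~/.local/share/rstudio/.Rprofile \
#     ~/.R/Makevars
```

A match is not necessarily a problem. It only marks a line worth reviewing before you install packages.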

If issues arise, you can delete the folder where R/4.4.x packages get installed, which is (usually) ~/R/x86_64-pc-linux-gnu-library/4.4, and start again.
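
If you would rather not delete the folder outright, a more cautious variant is to rename it so it can be restored later. The sketch below does that. The function name reset_r_library is hypothetical, and the default path is the usual location given above, which may differ on your account.

```shell
# Hypothetical helper: move the R 4.4 package library aside instead of deleting it.
# Prints the backup location on success.
reset_r_library() {
  lib_dir="${1:-$HOME/R/x86_64-pc-linux-gnu-library/4.4}"
  if [ ! -d "$lib_dir" ]; then
    echo "No library folder found at $lib_dir" >&2
    return 1
  fi
  # A timestamp suffix keeps repeated backups from colliding.
  backup="${lib_dir}.bak.$(date +%Y%m%d%H%M%S)"
  mv "$lib_dir" "$backup" && echo "$backup"
}
```

R recreates the library folder the next time you install a package. Once the new installation works, the backup can be removed to free space in your home directory.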

Seurat

Seurat is an R package designed for QC, analysis, and exploration of single-cell RNA-seq data.

In the RStudio Server simply copy and paste the command below into your R command window:

install.packages("Seurat", repos="https://cloud.r-project.org/")

On the Quest log-in node, run the following commands to get the R command line and then run the command above to install Seurat.

module purge all
module load R/4.4.0
module load hdf5/1.14.1-2-gcc-12.3.0
module load gsl/2.7.1-gcc-12.3.0
module load fftw/3.3.10-gcc-12.3.0
module load gdal/3.7.0-gcc-12.3.0
module load nlopt/2.7.1-gcc-12.3.0
R

Note: if you are installing Seurat for the first time and do not have any of its dependencies, the installation can take upwards of an hour because all dependencies are built from source. If a particular dependency fails to install, install that dependency on its own and then retry the Seurat install. Some users have reported an error during installation that asks them to remove a particular file from their home directory. If you get this error, remove the file in question and retry the installation. If you have questions, contact us at quest-help@northwestern.edu.
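
When a long install fails, it can be hard to spot which dependency broke. The sketch below assumes you saved the installation output to a log file and that R reported failures with its usual "installation of package ... had non-zero exit status" warning. The function name failed_packages is hypothetical, and other failure messages are not covered.

```shell
# Hypothetical helper: list the packages that R reported as failed in a saved log.
failed_packages() {
  grep 'had non-zero exit status' "$1" \
    | sed -E 's/.*installation of package (.*) had non-zero exit status.*/\1/' \
    | sed 's/[^A-Za-z0-9.]//g' \
    | LC_ALL=C sort -u
}

# Example call, assuming the log was saved as install.log:
#   failed_packages install.log
```

The second sed call strips the quotation marks R prints around package names, whether they are straight or curly. Each name printed is a candidate for a separate install.packages() call before retrying Seurat.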

Monocle3

The monocle3 package provides a toolkit for analyzing single-cell gene expression experiments. Single-cell transcriptome sequencing (sc-RNA-seq) experiments help discover new cell types and understand how they arise in development.

In RStudio Server, copy and paste the commands below into your R command window:

install.packages("remotes", repos="https://cloud.r-project.org/")
install.packages("sf", configure.args = c(sf = "--with-sqlite3-lib=/hpc/software/spack_v20d1/spack/opt/spack/linux-rhel7-x86_64/gcc-12.3.0/sqlite-3.40.1-gzayqyouerp6yxtxcd35gxeorakrlsg4/lib"))
remotes::install_github("cole-trapnell-lab/monocle3")

On the Quest log-in node, run the following commands to get the R command line and then run the commands above to install monocle3.

module purge all
module load R/4.4.0
module load hdf5/1.14.1-2-gcc-12.3.0
module load gsl/2.7.1-gcc-12.3.0
module load fftw/3.3.10-gcc-12.3.0
module load gdal/3.7.0-gcc-12.3.0
module load nlopt/2.7.1-gcc-12.3.0
R

SF

The sf package provides support for simple features in R. It binds to 'GDAL' for reading and writing data, to 'GEOS' for geometrical operations, and to 'PROJ' for projection conversions and datum transformations. This package can also use the 's2' package for spherical geometry operations on geographic coordinates.

In the RStudio Server simply copy and paste the command below into your R command window:

install.packages("sf", configure.args = c(sf = "--with-sqlite3-lib=/hpc/software/spack_v20d1/spack/opt/spack/linux-rhel7-x86_64/gcc-12.3.0/sqlite-3.40.1-gzayqyouerp6yxtxcd35gxeorakrlsg4/lib"))

On the Quest log-in node, run the following commands to get the R command line and then run the command above to install sf.

module purge all
module load R/4.4.0
module load hdf5/1.14.1-2-gcc-12.3.0
module load gsl/2.7.1-gcc-12.3.0
module load fftw/3.3.10-gcc-12.3.0
module load gdal/3.7.0-gcc-12.3.0
module load nlopt/2.7.1-gcc-12.3.0
R
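
The sf install command above passes a long, site-specific SQLite path to configure.args. Before starting the build, it may be worth confirming that the directory is still present on the node you are using. The sketch below does this. The function name has_sqlite_lib is hypothetical, and it assumes the library files follow the usual libsqlite3.* naming.

```shell
# Path copied from the install.packages() call in this section.
SQLITE_LIB_DIR="/hpc/software/spack_v20d1/spack/opt/spack/linux-rhel7-x86_64/gcc-12.3.0/sqlite-3.40.1-gzayqyouerp6yxtxcd35gxeorakrlsg4/lib"

# Hypothetical helper: succeed only if a libsqlite3 file exists in the directory.
has_sqlite_lib() {
  dir="${1:-$SQLITE_LIB_DIR}"
  ls "$dir"/libsqlite3.* >/dev/null 2>&1
}

# Example call:
#   has_sqlite_lib && echo "SQLite library found" || echo "SQLite library missing"
```

If the check fails, the configure step is likely to fail as well, and the path should be confirmed with quest-help@northwestern.edu rather than guessed.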

Terra

Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).

In the RStudio Server simply copy and paste the command below into your R command window:

install.packages("terra", configure.args = c(terra = "--with-sqlite3-lib=/hpc/software/spack_v20d1/spack/opt/spack/linux-rhel7-x86_64/gcc-12.3.0/sqlite-3.40.1-gzayqyouerp6yxtxcd35gxeorakrlsg4/lib"))

On the Quest log-in node, run the following commands to get the R command line and then run the command above to install terra.

module purge all
module load R/4.4.0
module load hdf5/1.14.1-2-gcc-12.3.0
module load gsl/2.7.1-gcc-12.3.0
module load fftw/3.3.10-gcc-12.3.0
module load gdal/3.7.0-gcc-12.3.0
module load nlopt/2.7.1-gcc-12.3.0
R

RGDAL

Provides bindings to the 'Geospatial' Data Abstraction Library ('GDAL') (>= 1.11.4) and access to projection/transformation operations from the 'PROJ' library. Use is made of classes defined in the 'sp' package. Raster and vector map data can be imported into R, and raster and vector 'sp' objects exported. The 'GDAL' and 'PROJ' libraries are external to the package, and, when installing the package from source, must be correctly installed first; it is important that 'GDAL' < 3 be matched with 'PROJ' < 6. From 'rgdal' 1.5-8, when installed with 'GDAL' >= 3, 'PROJ' >= 6 and 'sp' >= 1.4, coordinate reference systems use 'WKT2_2019' strings, not 'PROJ' strings.

In the RStudio Server simply copy and paste the command below into your R command window:

install.packages("rgdal", repos="https://cloud.r-project.org/")

On the Quest log-in node, run the following commands to get the R command line and then run the command above to install rgdal.

module purge all
module load R/4.4.0
module load hdf5/1.14.1-2-gcc-12.3.0
module load gsl/2.7.1-gcc-12.3.0
module load fftw/3.3.10-gcc-12.3.0
module load gdal/3.7.0-gcc-12.3.0
module load nlopt/2.7.1-gcc-12.3.0
R

hdf5r

hdf5r is an R interface to the HDF5 library. It is implemented using R6 classes based on the HDF5-C-API. The package supports all data-types as specified by HDF5 (including references) and provides many convenience functions yet also an extensive selection of the native HDF5-C-API functions.

In the RStudio Server simply copy and paste the command below into your R command window:

install.packages("hdf5r", repos="https://cloud.r-project.org/")

On the Quest log-in node, run the following commands to get the R command line and then run the command above to install hdf5r.

module purge all
module load R/4.4.0
module load hdf5/1.14.1-2-gcc-12.3.0
module load gsl/2.7.1-gcc-12.3.0
module load fftw/3.3.10-gcc-12.3.0
module load gdal/3.7.0-gcc-12.3.0
module load nlopt/2.7.1-gcc-12.3.0
R

JPEG

The jpeg package provides an easy and simple way to read, write, and display bitmap images stored in the JPEG format. It can read and write both files and in-memory raw vectors.

Note: the commands below can only be run in a terminal session on a Quest login node.

On the Quest log-in node, create a script called build.R which contains the following lines:

Sys.setenv(JPEG_LIBS="-L/hpc/software/spack_v20d1/spack/opt/spack/linux-rhel7-x86_64/gcc-12.3.0/libjpeg-turbo-2.1.5-rersiv4gdrubpy3or46q2vjjdho7mc7y/lib64")
install.packages("jpeg", repos="https://cloud.r-project.org/")

On the Quest log-in node, run the following commands on the command line to install jpeg.

module purge all
module load R/4.4.0
module load libjpeg-turbo/2.1.5-gcc-12.3.0
Rscript --vanilla "build.R"

RJAGS

The rjags package provides an interface from R to the JAGS library for Bayesian data analysis. JAGS uses Markov Chain Monte Carlo (MCMC) to generate a sequence of dependent samples from the posterior distribution of the parameters.

Note: the commands below can only be run in a terminal session on a Quest login node.

On the Quest log-in node, create a script called build.R which contains the following lines:

install.packages("Rcpp", repos="https://cloud.r-project.org/")
install.packages("rjags", repos="https://cloud.r-project.org/")

On the Quest log-in node, run the following commands on the command line to install rjags.

module purge all
module load R/4.4.0
module load jags
Rscript --vanilla "build.R"
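
After a scripted install like this one, a quick way to confirm the result is to ask R whether the package can be loaded. The sketch below is one possible check. The function name check_r_package is hypothetical, and it assumes the same modules used for the install (here R/4.4.0 and jags) are still loaded in the session.

```shell
# Hypothetical helper: exit status 0 if the named R package can be loaded,
# 1 if it cannot, and 2 if Rscript is not available in this session.
check_r_package() {
  if ! command -v Rscript >/dev/null 2>&1; then
    echo "Rscript not found; load the R module first" >&2
    return 2
  fi
  Rscript --vanilla -e "if (!requireNamespace('$1', quietly = TRUE)) quit(status = 1)"
}

# Example call:
#   check_r_package rjags && echo "rjags is installed"
```

requireNamespace() returns FALSE instead of raising an error when a package is missing or fails to load, which makes it convenient for a yes-or-no check like this.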

V8/RSTANARM/RSTAN

The rstan package is the R interface to Stan. User-facing R functions are provided to parse, compile, test, estimate, and analyze Stan models by accessing the header-only Stan library provided by the 'StanHeaders' package. The Stan project develops a probabilistic programming language that implements full Bayesian statistical inference via Markov Chain Monte Carlo (MCMC), rough Bayesian inference via 'variational' approximation, and (optionally penalized) maximum likelihood estimation via optimization. In all three cases, automatic differentiation is used to quickly and accurately evaluate gradients without burdening the user with the need to derive the partial derivatives.

Note: the commands below can only be run in a terminal session on a Quest login node.

On the Quest log-in node, create a script called build.R which contains the following lines:

install.packages("V8", repos="https://cloud.r-project.org/")
install.packages(c("rstanarm", "rstan"), repos="https://cloud.r-project.org/")

On the Quest log-in node, run the following commands on the command line to install V8, rstanarm, and rstan.

module purge all
module load R/4.4.0
DOWNLOAD_STATIC_LIBV8=1 Rscript --vanilla "build.R"
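
If you prefer not to open an editor, the build.R script for this section can also be written from the shell with a here-document. This is a sketch only. The function name write_build_script is hypothetical, and the two R lines are copied from the script shown in this section.

```shell
# Hypothetical helper: write the two install lines from this section
# to the file given as the first argument.
write_build_script() {
  cat > "$1" <<'EOF'
install.packages("V8", repos="https://cloud.r-project.org/")
install.packages(c("rstanarm", "rstan"), repos="https://cloud.r-project.org/")
EOF
}

# On a Quest log-in node, after the module commands shown in this section:
#   write_build_script build.R
#   DOWNLOAD_STATIC_LIBV8=1 Rscript --vanilla build.R
```

Quoting the here-document delimiter ('EOF') stops the shell from expanding anything inside the R code, so the script is written exactly as shown.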

DiffBind

DiffBind is an R package that is used for identifying sites that are differentially enriched between two or more sample groups. It helps to compute differentially bound sites from multiple ChIP-seq experiments using affinity (quantitative) data and allows occupancy (overlap) analysis and plotting functions.

Note: the commands below can only be run in a terminal session on a Quest login node.

On the Quest log-in node, create a script called build.R which contains the following lines:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager", repos="https://cloud.r-project.org/")
BiocManager::install("V8")
BiocManager::install("DiffBind")

On the Quest log-in node, run the following commands on the command line to install DiffBind.

module purge all
module load R/4.4.0
DOWNLOAD_STATIC_LIBV8=1 Rscript --vanilla "build.R"

Details

Article ID: 1834
Created: Thu 5/12/22 1:39 PM
Modified: Fri 10/25/24 3:07 PM