Troubleshooting Installing R Packages on Quest and Quest Analytics

This article covers installing R packages in an RStudio Server or R session on Quest log-in nodes, Quest Analytics nodes, or Quest OnDemand.

Introduction

Due to technical limitations of RStudio Server, some R packages that can be installed and loaded on the Quest log-in nodes may not work when loaded in RStudio Server on the Quest Analytics nodes. If you want the graphical interface of RStudio Server but cannot load all the R packages you need on Quest Analytics, launch RStudio Server through Quest OnDemand instead.

The R environment in RStudio Server on the Quest Analytics nodes is equivalent to loading the following modules on the Quest log-in or compute nodes:

R/4.4.0
hdf5/1.14.1-2-gcc-12.3.0
gsl/2.7.1-gcc-12.3.0
fftw/3.3.10-gcc-12.3.0
gdal/3.7.0-gcc-12.3.0
nlopt/2.7.1-gcc-12.3.0
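
When working on a log-in or compute node, it can be useful to check that the current shell has these same modules loaded. The sketch below assumes the module system exports the colon-separated LOADEDMODULES variable, as Lmod and Environment Modules normally do. The function name missing_modules is hypothetical.

```shell
# Print each expected module that is not in the current shell's loaded set.
# Assumption: LOADEDMODULES is the colon-separated list kept by the module system.
missing_modules() {
    local m
    for m in "$@"; do
        case ":${LOADEDMODULES:-}:" in
            *":$m:"*) ;;        # already loaded, nothing to report
            *) echo "$m" ;;     # not loaded
        esac
    done
}

# Report which modules from the list above are still missing
missing_modules \
    R/4.4.0 \
    hdf5/1.14.1-2-gcc-12.3.0 \
    gsl/2.7.1-gcc-12.3.0 \
    fftw/3.3.10-gcc-12.3.0 \
    gdal/3.7.0-gcc-12.3.0 \
    nlopt/2.7.1-gcc-12.3.0
```

No output means every listed module is already loaded in the shell.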

For posterity, we keep the previous installation instructions from older versions of R.

Before starting these instructions, check the following files for paths you may have hardcoded at some point: ~/.bash_profile, ~/.bashrc, ~/.local/share/rstudio/Renviron, ~/.local/share/rstudio/.Rprofile, and ~/.R/Makevars. Stale paths in these files can point R at libraries or compilers that do not match the modules listed above.
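
A quick way to review those files is to search them for lines that set R or library paths. This is a minimal sketch: the function name find_hardcoded_paths is hypothetical, and the search patterns are assumptions about what a hardcoded path might look like, not an exhaustive list.

```shell
# List lines in common startup files that set R- or library-related paths.
# The patterns searched for are assumptions, not a complete list.
find_hardcoded_paths() {
    local home_dir="${1:-$HOME}"
    local f
    for f in \
        "$home_dir/.bash_profile" \
        "$home_dir/.bashrc" \
        "$home_dir/.local/share/rstudio/Renviron" \
        "$home_dir/.local/share/rstudio/.Rprofile" \
        "$home_dir/.R/Makevars"
    do
        # Skip files that do not exist
        [ -f "$f" ] || continue
        # -H prints the file name, -n the line number, -E enables alternation
        grep -HnE 'R_LIBS|R_HOME|LD_LIBRARY_PATH|PKG_CONFIG_PATH|\.libPaths|/hpc/software' "$f" || true
    done
}

find_hardcoded_paths
```

Any line it prints is worth reading by hand; a match is not necessarily a problem.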

If issues arise, you can delete the folder where R 4.4.x packages are installed (usually ~/R/x86_64-pc-linux-gnu-library/4.4) and start again.
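
A more cautious variant is to rename the folder instead of deleting it, so the old packages can be restored if needed. The sketch below assumes the default library location given above; reset_r_library is a hypothetical helper name.

```shell
# Rename the R 4.4 user library so packages reinstall into a fresh folder.
# Pass a different path as the first argument if your library lives elsewhere.
reset_r_library() {
    local lib_dir="${1:-$HOME/R/x86_64-pc-linux-gnu-library/4.4}"
    if [ -d "$lib_dir" ]; then
        # Timestamped backup name avoids clobbering an earlier backup
        local backup="${lib_dir}.bak.$(date +%Y%m%d%H%M%S)"
        mv "$lib_dir" "$backup"
        echo "Moved $lib_dir to $backup"
    else
        echo "No library found at $lib_dir"
    fi
}
```

Run reset_r_library with no arguments to move the default folder aside, then remove the backup once the new installation works.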

Seurat

Seurat is an R package designed for QC, analysis, and exploration of single-cell RNA-seq data.

In the RStudio Server simply copy and paste the command below into your R command window:

install.packages("Seurat", repos="https://cloud.r-project.org/")

On the Quest log-in node, run the following commands to get the R command line and then run the command above to install Seurat.

module purge all
module load R/4.4.0
module load hdf5/1.14.1-2-gcc-12.3.0
module load gsl/2.7.1-gcc-12.3.0
module load fftw/3.3.10-gcc-12.3.0
module load gdal/3.7.0-gcc-12.3.0
module load nlopt/2.7.1-gcc-12.3.0
module load cmake/3.26.3-gcc-12.3.0
R

Note: if you are installing Seurat for the first time and have none of its dependencies installed, the installation can take upwards of an hour because every dependency is built from source. If a particular dependency fails to install, try installing that dependency on its own and then retry the Seurat install. Some users have reported an error during installation that asks them to remove a particular file from their home directory. If you get this error, remove the file in question and retry the installation. If you have questions, let us know at quest-help@northwestern.edu.
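
If the install output has been saved to a file, the packages that failed to build can be pulled out of that log before retrying them one at a time. This is a minimal sketch: the function name failed_packages and the log file name are hypothetical, and it relies on the "had non-zero exit status" warning that install.packages() prints for a package that fails to install.

```shell
# Print the unique names of packages that R reported as failed in a saved log.
failed_packages() {
    local log_file="$1"
    # The bracket expressions around the capture group absorb either ASCII
    # or typographic quotes around the package name.
    grep -E 'installation of package .* had non-zero exit status' "$log_file" \
        | sed -E 's/.*installation of package [^A-Za-z0-9]*([A-Za-z0-9.]+)[^A-Za-z0-9]* had non-zero exit status.*/\1/' \
        | sort -u
}

# Example (hypothetical log file name):
#   failed_packages seurat_install.log
```

Each reported name can then be installed on its own from the R command line before retrying the Seurat install.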

Monocle3

The monocle3 package provides a toolkit for analyzing single-cell gene expression experiments. Single-cell transcriptome sequencing (sc-RNA-seq) experiments help discover new cell types and understand how they arise in development.

In the RStudio Server simply copy and paste the commands below into your R command window:

install.packages("remotes", repos="https://cloud.r-project.org/")
install.packages("sf", configure.args = c(sf = "--with-sqlite3-lib=/hpc/software/spack_v20d1/spack/opt/spack/linux-rhel7-x86_64/gcc-12.3.0/sqlite-3.40.1-gzayqyouerp6yxtxcd35gxeorakrlsg4/lib"))
remotes::install_github("cole-trapnell-lab/monocle3")

On the Quest log-in node, run the following commands to get the R command line and then run the commands above to install monocle3.

module purge all
module load R/4.4.0
module load hdf5/1.14.1-2-gcc-12.3.0
module load gsl/2.7.1-gcc-12.3.0
module load fftw/3.3.10-gcc-12.3.0
module load gdal/3.7.0-gcc-12.3.0
module load nlopt/2.7.1-gcc-12.3.0
module load cmake/3.26.3-gcc-12.3.0
R

SF

The sf package provides support for simple features in R. It binds to 'GDAL' for reading and writing data, to 'GEOS' for geometrical operations, and to 'PROJ' for projection conversions and datum transformations. This package can also use the 's2' package for spherical geometry operations on geographic coordinates.

In the RStudio Server simply copy and paste the command below into your R command window:

install.packages("sf", configure.args = c(sf = "--with-sqlite3-lib=/hpc/software/spack_v20d1/spack/opt/spack/linux-rhel7-x86_64/gcc-12.3.0/sqlite-3.40.1-gzayqyouerp6yxtxcd35gxeorakrlsg4/lib"))

On the Quest log-in node, run the following commands to get the R command line and then run the command above to install sf.

module purge all
module load R/4.4.0
module load hdf5/1.14.1-2-gcc-12.3.0
module load gsl/2.7.1-gcc-12.3.0
module load fftw/3.3.10-gcc-12.3.0
module load gdal/3.7.0-gcc-12.3.0
module load nlopt/2.7.1-gcc-12.3.0
module load cmake/3.26.3-gcc-12.3.0
R
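
Because the sf install points at a fixed SQLite library directory, a quick check that the directory still contains the library can save a failed build. This is a minimal sketch, assuming the library files in that directory are named libsqlite3.*; lib_present is a hypothetical helper.

```shell
# Succeed if the directory holds at least one file named <stem>.<anything>.
lib_present() {
    local dir="$1" stem="$2"
    local f
    for f in "$dir"/"$stem".*; do
        # An unmatched glob stays literal, so test that the path exists
        [ -e "$f" ] && return 0
    done
    return 1
}

# Path copied from the install.packages() command above
SQLITE_LIB=/hpc/software/spack_v20d1/spack/opt/spack/linux-rhel7-x86_64/gcc-12.3.0/sqlite-3.40.1-gzayqyouerp6yxtxcd35gxeorakrlsg4/lib
if lib_present "$SQLITE_LIB" libsqlite3; then
    echo "SQLite library found in $SQLITE_LIB"
else
    echo "No libsqlite3 files in $SQLITE_LIB"
fi
```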

Terra

Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).

In the RStudio Server simply copy and paste the command below into your R command window:

install.packages("terra", configure.args = c(terra = "--with-sqlite3-lib=/hpc/software/spack_v20d1/spack/opt/spack/linux-rhel7-x86_64/gcc-12.3.0/sqlite-3.40.1-gzayqyouerp6yxtxcd35gxeorakrlsg4/lib"))

On the Quest log-in node, run the following commands to get the R command line and then run the command above to install terra.

module purge all
module load R/4.4.0
module load hdf5/1.14.1-2-gcc-12.3.0
module load gsl/2.7.1-gcc-12.3.0
module load fftw/3.3.10-gcc-12.3.0
module load gdal/3.7.0-gcc-12.3.0
module load nlopt/2.7.1-gcc-12.3.0
module load cmake/3.26.3-gcc-12.3.0
R

RGDAL

The rgdal package provides bindings to the 'Geospatial' Data Abstraction Library ('GDAL') (>= 1.11.4) and access to projection/transformation operations from the 'PROJ' library. It uses classes defined in the 'sp' package. Raster and vector map data can be imported into R, and raster and vector 'sp' objects exported. The 'GDAL' and 'PROJ' libraries are external to the package and, when installing the package from source, must be correctly installed first; it is important that 'GDAL' < 3 be matched with 'PROJ' < 6. From 'rgdal' 1.5-8, when installed with 'GDAL' >= 3, 'PROJ' >= 6 and 'sp' >= 1.4, coordinate reference systems use 'WKT2_2019' strings, not 'PROJ' strings.

In the RStudio Server simply copy and paste the command below into your R command window:

install.packages("rgdal", repos="https://cloud.r-project.org/")

On the Quest log-in node, run the following commands to get the R command line and then run the command above to install rgdal.

module purge all
module load R/4.4.0
module load hdf5/1.14.1-2-gcc-12.3.0
module load gsl/2.7.1-gcc-12.3.0
module load fftw/3.3.10-gcc-12.3.0
module load gdal/3.7.0-gcc-12.3.0
module load nlopt/2.7.1-gcc-12.3.0
module load cmake/3.26.3-gcc-12.3.0
R

hdf5r

hdf5r is an R interface to the HDF5 library. It is implemented using R6 classes based on the HDF5-C-API. The package supports all data-types as specified by HDF5 (including references) and provides many convenience functions yet also an extensive selection of the native HDF5-C-API functions.

In the RStudio Server simply copy and paste the command below into your R command window:

install.packages("hdf5r", repos="https://cloud.r-project.org/")

On the Quest log-in node, run the following commands to get the R command line and then run the command above to install hdf5r.

module purge all
module load R/4.4.0
module load hdf5/1.14.1-2-gcc-12.3.0
module load gsl/2.7.1-gcc-12.3.0
module load fftw/3.3.10-gcc-12.3.0
module load gdal/3.7.0-gcc-12.3.0
module load nlopt/2.7.1-gcc-12.3.0
R

JPEG

Note: the commands below can only be run in a terminal session on a Quest log-in node.

On the Quest log-in node, create a script called build.R which contains the following lines:

Sys.setenv(JPEG_LIBS="-L/hpc/software/spack_v20d1/spack/opt/spack/linux-rhel7-x86_64/gcc-12.3.0/libjpeg-turbo-2.1.5-rersiv4gdrubpy3or46q2vjjdho7mc7y/lib64")
install.packages("jpeg", repos="https://cloud.r-project.org")

On the Quest log-in node, run the following commands on the command line to install jpeg.

module purge all
module load R/4.4.0
module load libjpeg-turbo/2.1.5-gcc-12.3.0
Rscript --vanilla "build.R"
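
One way to create build.R without opening an editor is a shell here-document. This sketch writes the same two R lines shown above; the function name write_jpeg_build_script is hypothetical, and the quoted 'EOF' delimiter keeps the shell from altering the R code.

```shell
# Write the jpeg build script to the given path (default: build.R).
write_jpeg_build_script() {
    local out="${1:-build.R}"
    # Quoting 'EOF' disables variable and command expansion in the body
    cat > "$out" <<'EOF'
Sys.setenv(JPEG_LIBS="-L/hpc/software/spack_v20d1/spack/opt/spack/linux-rhel7-x86_64/gcc-12.3.0/libjpeg-turbo-2.1.5-rersiv4gdrubpy3or46q2vjjdho7mc7y/lib64")
install.packages("jpeg", repos="https://cloud.r-project.org")
EOF
}

# Create build.R in the current directory
write_jpeg_build_script build.R
```

The same pattern works for the other build.R scripts in this article by swapping the lines between the EOF markers.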

RJAGS

The rjags package provides an interface from R to the JAGS library for Bayesian data analysis. JAGS uses Markov Chain Monte Carlo (MCMC) to generate a sequence of dependent samples from the posterior distribution of the parameters.

Note: the commands below can only be run in a terminal session on a Quest log-in node.

On the Quest log-in node, create a script called build.R which contains the following lines:

install.packages("Rcpp", repos="https://cloud.r-project.org/")
install.packages("rjags", repos="https://cloud.r-project.org/")

On the Quest log-in node, run the following commands on the command line to install rjags.

module purge all
module load R/4.4.0
module load jags
Rscript --vanilla "build.R"

V8/RSTANARM/RSTAN

The rstan package is the R interface to Stan. User-facing R functions are provided to parse, compile, test, estimate, and analyze Stan models by accessing the header-only Stan library provided by the 'StanHeaders' package. The Stan project develops a probabilistic programming language that implements full Bayesian statistical inference via Markov Chain Monte Carlo (MCMC), rough Bayesian inference via 'variational' approximation, and (optionally penalized) maximum likelihood estimation via optimization. In all three cases, automatic differentiation is used to quickly and accurately evaluate gradients without burdening the user with the need to derive the partial derivatives.

Create a script called build.R which contains the following lines:

install.packages("V8", repos="https://cloud.r-project.org/")
install.packages(c("rstanarm", "rstan"), repos="https://cloud.r-project.org/")

Start an interactive job on a Quest compute node with at least 10GB of RAM:

# replace <allocation> with an allocation you are a member of and run this from a log-in node to land on a compute node
srun -N 1 -n 1 --time=02:00:00 --mem=10G --account=<allocation> --partition=short --pty bash -l
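
Once the interactive job starts, it can help to confirm that the shell really is inside a job with enough memory before running the build. This is a minimal sketch, assuming Slurm sets SLURM_MEM_PER_NODE (in megabytes) when --mem is requested; has_job_memory is a hypothetical helper.

```shell
# Succeed when the current Slurm job was granted at least the given memory in MB.
# Assumption: SLURM_MEM_PER_NODE is set by Slurm when --mem is requested.
has_job_memory() {
    local need_mb="$1"
    local have_mb="${SLURM_MEM_PER_NODE:-0}"
    [ "$have_mb" -ge "$need_mb" ]
}

# 10G requested with --mem corresponds to 10240 MB
if has_job_memory 10240; then
    echo "Job has at least 10 GB of memory"
else
    echo "Less than 10 GB granted, or not inside a Slurm job"
fi
```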

On the Quest compute node, run the following commands on the command line to install V8, rstanarm, and rstan.

module purge all
module load R/4.4.0
module load nlopt/2.7.1-gcc-12.3.0
DOWNLOAD_STATIC_LIBV8=1 Rscript --vanilla "build.R"

DiffBind

DiffBind is an R package that is used for identifying sites that are differentially enriched between two or more sample groups. It helps to compute differentially bound sites from multiple ChIP-seq experiments using affinity (quantitative) data and allows occupancy (overlap) analysis and plotting functions.

Note: the commands below can only be run in a terminal session on a Quest log-in node.

On the Quest log-in node, create a script called build.R which contains the following lines:

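# repos is set explicitly because Rscript --vanilla starts without a default CRAN mirror
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager", repos="https://cloud.r-project.org/")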
BiocManager::install("V8")
BiocManager::install("DiffBind")

On the Quest log-in node, run the following commands on the command line to install DiffBind.

module purge all
module load R/4.4.0
DOWNLOAD_STATIC_LIBV8=1 Rscript --vanilla "build.R"