I have two urgent questions, as I need to support R users soon.
How do I bake R libraries into the single-user image with a Dockerfile?
I am using the following Dockerfile. When building the image, the R package installation line takes so long that I have to interrupt it. Does it look fine?
# Make sure to match your JupyterHub application version
FROM quay.io/jupyter/datascience-notebook:hub-5.2.1
USER root
# Install OS packages, dependencies, packages, ...
# For example:
#RUN pip install jupyter-ai[all]
RUN pip install minio pandas duckdb xgboost prophet plotly polars
### Install R packages
RUN R -e "install.packages(c('tidyverse','data.table','janitor','ggplot2','plotly','gganimate','caret','mlr3','xgboost','glmnet','torch','keras','tidyquant','lubridate','knitr','shiny'), repos='https://cloud.r-project.org')"
USER jovyan
How can I install R libraries from within the notebook itself?
Please have a look at the upstream Dockerfile. Wherever possible, docker-stacks uses the mamba package manager to provision a virtual environment, which is activated by default. This directly supports the conda(-forge) ecosystem and, indirectly, PyPI via an extra environment.yml file; using such a file in a container has advantages (better tooling support, e.g. renovate) and disadvantages (another COPY in the Dockerfile).
Using the equivalent of sudo pip or sudo R to install packages may have unintended side effects on the packages already installed with mamba, which are accessible to (and changeable by) the user.
Consider the following:
don’t drop to root
use mamba install and clean up afterwards (mamba caches a couple hundred megabytes of intermediate data)
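A minimal sketch of that advice, applied to the Dockerfile from the question. The base tag and package list come from the question; the r- prefixed conda-forge package names are an assumption worth checking on anaconda.org before building (on conda-forge, CRAN packages are published with an r- prefix, and not every CRAN package is available):

```dockerfile
# Match your JupyterHub application version
FROM quay.io/jupyter/datascience-notebook:hub-5.2.1

# No USER root needed: mamba installs into the default, user-owned environment.
# Python packages and r-* R packages install in a single solver run.
RUN mamba install --yes \
      minio pandas duckdb xgboost prophet plotly polars \
      r-tidyverse r-data.table r-janitor r-caret r-glmnet r-shiny && \
    mamba clean --all -f -y && \
    fix-permissions "${CONDA_DIR}" && \
    fix-permissions "/home/${NB_USER}"
```

fix-permissions, CONDA_DIR, and NB_USER are helpers and variables provided by the docker-stacks base images; the mamba clean call discards the download cache so it does not bloat the image layer.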
I’m back with another question: how do I create one or more additional Python environments baked into the image? I need to support users with different Python versions, and when a user creates a new environment with conda, the pod does not come back up after it is shut down for idleness.
Well, when users create new conda environments, they do so in their own persistent storage. In that case, you can use nb_conda_kernels, which will pick up kernels installed inside conda environments at runtime and add them to the Launcher.
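If you would rather bake an extra environment into the image instead, one possible sketch is below. The environment name py310 and the Python version are illustrative, not prescribed anywhere in this thread:

```dockerfile
FROM quay.io/jupyter/datascience-notebook:hub-5.2.1

# nb_conda_kernels in the default env discovers kernels in other conda envs at runtime
RUN mamba install --yes nb_conda_kernels && \
    mamba clean --all -f -y

# An additional environment with a different Python; installing ipykernel in it
# is what makes it show up as a kernel in the Launcher
RUN mamba create --yes -n py310 python=3.10 ipykernel && \
    mamba clean --all -f -y && \
    fix-permissions "${CONDA_DIR}"
```

Because the environment lives in the image rather than in user storage, it survives pod restarts without any user action.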