Custom Docker Image for JupyterHub on Kubernetes: Issues with Multi-Stage Build

Hi everyone,

I’m working on a custom Docker image for a specific use case with JupyterHub on Kubernetes. Initially, I used the jupyterhub/singleuser image to spin up notebooks, and everything worked fine. However, my use case requires a custom Docker image that installs several additional packages. Here’s a simplified version of my Dockerfile:

FROM ubuntu:20.04

ENV DEBIAN_FRONTEND=noninteractive
ENV USER=root

WORKDIR "/"

RUN apt-get update && \
    apt-get install -y build-essential tmux libsm6 nano libcupti-dev libgl1-mesa-glx libglib2.0-0 libfreetype6-dev git graphviz texlive texlive-latex-extra pandoc python3 python3-dev python3-tk python3-pip && \
    pip3 install --upgrade pip

COPY requirements.txt /requirements.txt
RUN pip3 install --no-cache-dir -r /requirements.txt

RUN mkdir -p /myapp
WORKDIR /myapp

COPY . /myapp

EXPOSE 8888

ENTRYPOINT ["sh", "-c", "jupyter lab --notebook-dir=/ --ip=0.0.0.0 --no-browser --allow-root --port=8888 --NotebookApp.token='' --NotebookApp.tornado_settings='{\"websocket_max_message_size\": 104857600}' --NotebookApp.password='' --NotebookApp.allow_origin='*' --NotebookApp.base_url=${NB_PREFIX}"]

When I built and ran the above image, it worked fine. However, when I switched to jupyterhub/singleuser as the base image, the apt-get commands failed, which I assumed was due to the lack of a full Linux environment. To work around this, I created a multi-stage Docker build:

# Stage 1: Use an Ubuntu image to install the necessary packages
FROM ubuntu:20.04 as builder

ENV DEBIAN_FRONTEND=noninteractive
ENV USER=root

WORKDIR "/"

RUN apt-get update && \
    apt-get install -y build-essential tmux libsm6 nano libcupti-dev libgl1-mesa-glx libglib2.0-0 libfreetype6-dev git graphviz texlive texlive-latex-extra pandoc python3 python3-dev python3-tk python3-pip && \
    pip3 install --upgrade pip

COPY requirements.txt /requirements.txt
RUN pip3 install --no-cache-dir -r /requirements.txt

# Stage 2: Use the single-user base image
FROM jupyterhub/singleuser

ENV DEBIAN_FRONTEND=noninteractive
ENV USER=root

# Use sh instead of bash if bash is not available
SHELL ["/bin/sh", "-c"]

# Copy Python packages installed in builder stage
COPY --from=builder /usr/local/lib/python3.8/dist-packages/ /usr/local/lib/python3.8/dist-packages/
COPY --from=builder /usr/local/bin/ /usr/local/bin/

# Copy additional files and directories
RUN mkdir -p /myapp
WORKDIR /myapp
COPY . /myapp

EXPOSE 8888

ENTRYPOINT ["sh", "-c", "jupyter lab --notebook-dir=/ --ip=0.0.0.0 --no-browser --allow-root --port=8888 --NotebookApp.token='' --NotebookApp.tornado_settings='{\"websocket_max_message_size\": 104857600}' --NotebookApp.password='' --NotebookApp.allow_origin='*' --NotebookApp.base_url=${NB_PREFIX}"]

Despite this, I encountered the following error:

Traceback (most recent call last):
  File "/usr/local/bin/jupyterhub-singleuser", line 5, in <module>
    from jupyterhub.singleuser import main
  File "/usr/local/lib/python3.10/dist-packages/jupyterhub/singleuser/__init__.py", line 18, in <module>
    from .mixins import HubAuthenticatedHandler, make_singleuser_app
  File "/usr/local/lib/python3.10/dist-packages/jupyterhub/singleuser/mixins.py", line 56, in <module>
    from ._disable_user_config import _disable_user_config, _exclude_home
  File "/usr/local/lib/python3.10/dist-packages/jupyterhub/singleuser/_disable_user_config.py", line 25, in <module>
    from jupyter_core import paths
ModuleNotFoundError: No module named 'jupyter_core'

It seems like I’m missing something simple. I’ve seen similar discussions on this forum, but I’m not sure of the best way to customize the image so that it installs all the necessary packages and includes my files in the user environment.

Any guidance or suggestions on how to resolve this issue would be greatly appreciated!

Thank you in advance!

The jupyterhub/singleuser image is based on the Docker Stacks base image. I think it would be easier to take one of the Docker Stacks image definitions as your base and add the custom packages you need.
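
As a rough illustration of that suggestion, a minimal sketch might look like the following. It assumes the quay.io/jupyter/base-notebook image and uses numpy and pandas purely as placeholder packages. The mamba command and the fix-permissions helper are ones that ship with the Docker Stacks images.

```dockerfile
# Sketch only: the base image and the package names are assumptions.
FROM quay.io/jupyter/base-notebook

# Install into the image's existing conda environment so the new packages
# sit alongside the preinstalled Jupyter stack.
RUN mamba install --yes 'numpy' 'pandas' && \
    mamba clean --all -f -y && \
    fix-permissions "${CONDA_DIR}" && \
    fix-permissions "/home/${NB_USER}"
```

No ENTRYPOINT is declared here, so the one inherited from the base image stays in effect.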

Yes, I tried doing that. Here is what I did:

# Start from the Jupyter base-notebook image
# (I used the correct image reference here)
FROM quay/jupyter/base-notebook

# Set the working directory
WORKDIR "/"


USER root


# Upgrade pip
RUN pip3 install --upgrade pip

# Install Python packages from requirements.txt
ADD requirements.txt ./requirements.txt
RUN pip3 install --no-cache-dir -r ./requirements.txt

# Create working directory as root
RUN mkdir -p /somefolder

# Switch to non-root user
USER ${NB_UID}

# Add project files
WORKDIR "/somefolder"
# (plus a bunch of other files, added the same way)
ADD ./presentation ./presentation


# Set Jupyter Notebook environment variable
ENV NB_PREFIX=/

# Expose Jupyter Notebook port
EXPOSE 8888

# Set the entrypoint to start Jupyter Lab
ENTRYPOINT ["sh", "-c", "jupyter lab --notebook-dir=/ --ip=0.0.0.0 --no-browser --allow-root --port=8888 --NotebookApp.token='' --NotebookApp.tornado_settings='{\"websocket_max_message_size\": 104857600}' --NotebookApp.password='' --NotebookApp.allow_origin='*' --NotebookApp.base_url=${NB_PREFIX}"]

But I am still getting the same error:

Defaulted container "notebook" out of: notebook, block-cloud-metadata (init)
Traceback (most recent call last):
  File "/usr/local/bin/jupyterhub-singleuser", line 5, in <module>
    from jupyterhub.singleuser import main
  File "/usr/local/lib/python3.10/dist-packages/jupyterhub/singleuser/__init__.py", line 18, in <module>
    from .mixins import HubAuthenticatedHandler, make_singleuser_app
  File "/usr/local/lib/python3.10/dist-packages/jupyterhub/singleuser/mixins.py", line 56, in <module>
    from ._disable_user_config import _disable_user_config, _exclude_home
  File "/usr/local/lib/python3.10/dist-packages/jupyterhub/singleuser/_disable_user_config.py", line 25, in <module>
    from jupyter_core import paths
ModuleNotFoundError: No module named 'jupyter_core'

Any insights?

Could you post your requirements.txt? The base-notebook image installs the whole Jupyter stack into a mamba (conda) environment, whereas in your image definition you are installing packages under /usr/local/lib. The error might be due to missing dependencies in your requirements.txt.
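
One way to check this is to ask the image itself where its interpreter and Jupyter packages live. The fragment below is only a sketch, assuming the base image is quay.io/jupyter/base-notebook and that jupyterhub-singleuser is on its PATH. If the printed paths point into the conda prefix while the traceback points under /usr/local/lib, the pod is running a different Python from the one that has the Jupyter stack installed.

```dockerfile
# Sketch only: the base image is an assumption.
FROM quay.io/jupyter/base-notebook

# Build-time sanity checks: each command must succeed or the build stops.
# The first three print the resolved paths of the executables; the last one
# prints the interpreter prefix and the location of jupyter_core.
RUN command -v python && \
    command -v pip && \
    command -v jupyterhub-singleuser && \
    python -c "import sys, jupyter_core; print(sys.prefix, jupyter_core.__file__)"
```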

pandas
scikit-learn
matplotlib
numpy

There is an image in Docker Stacks that already ships all of those packages: scipy-notebook. You can use that image definition as your base and try adding a new entrypoint.

I think that is one of the things I am having a hard time with. Can you recommend a good entrypoint that works when we use base-notebook as the base image, create an additional directory (say, ‘folder1’), and then add some files to it? Forget about all the packages to install. For this use case, what would the entrypoint (or maybe the Dockerfile) be?

You don’t have to redefine the entrypoint; base-notebook already has everything configured. For instance, based on your first post and using scipy-notebook, here is a sample Dockerfile:

ARG REGISTRY=quay.io
ARG OWNER=jupyter
ARG BASE_CONTAINER=$REGISTRY/$OWNER/scipy-notebook
FROM $BASE_CONTAINER

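# The base image runs as a non-root user, so create the directory as root
# and hand ownership to the notebook user before switching back.
USER root
RUN mkdir -p /myapp && chown "${NB_UID}:${NB_GID}" /myapp
USER ${NB_UID}

WORKDIR /myapp

COPY --chown=${NB_UID}:${NB_GID} . /myapp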
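
If the extra packages from the first post are still needed on top of this, one possible extension is sketched below. It assumes a requirements.txt next to the Dockerfile, and it uses graphviz and tmux only as examples of system packages taken from the original apt-get list. The sketch installs system packages as root, then switches back to the notebook user for the Python packages, and it leaves ENTRYPOINT and CMD untouched, in line with the advice above.

```dockerfile
# Sketch only: adjust the base image and package lists to your needs.
FROM quay.io/jupyter/scipy-notebook

# System packages need root; the Docker Stacks images default to a non-root user.
USER root
RUN apt-get update --yes && \
    apt-get install --yes --no-install-recommends graphviz tmux && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Switch back to the notebook user before installing Python packages,
# so they land in the same environment as the Jupyter stack.
USER ${NB_UID}
COPY --chown=${NB_UID}:${NB_GID} requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir --requirement /tmp/requirements.txt && \
    fix-permissions "${CONDA_DIR}" && \
    fix-permissions "/home/${NB_USER}"
```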