Kubernetes cannot create container for centos7-based jupyterhub notebook

Hello, I made a modified version of jupyter/base-notebook that is based on centos7 instead of ubuntu:focal. The image runs successfully in Docker on my local machine.
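
For reference, building and smoke-testing it locally was roughly:

docker build -t axionhub/base-notebook:latest .
docker run -it --rm -p 8888:8888 axionhub/base-notebook:latest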

When I installed JupyterHub using Helm 3 with my image, the hook-image-puller pod failed to start its container.

Here's the output of kubectl describe pods hook-image-puller-*:

Events:
  Type     Reason          Age                 From               Message
  ----     ------          ----                ----               -------
  Normal   Scheduled       58m                 default-scheduler  Successfully assigned jupyterhub/hook-image-puller-smjnr to axion-k8s-node-1
  Normal   AddedInterface  58m                 multus             Add eth0 [10.233.118.100/32]
  Normal   Created         58m                 kubelet            Created container image-pull-metadata-block
  Normal   Started         58m                 kubelet            Started container image-pull-metadata-block
  Normal   Pulled          58m                 kubelet            Container image "jupyterhub/k8s-network-tools:1.1.3" already present on machine
  Normal   Pulled          36m                 kubelet            Successfully pulled image "axionhub/base-notebook:latest" in 20.943385991s
  Warning  Failed          36m (x6 over 55m)   kubelet            Error: failed to start container "image-pull-singleuser": Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: chdir to cwd ("/home/jovyan") set in config.json failed: permission denied: unknown
  Normal   Created         31m (x7 over 55m)   kubelet            Created container image-pull-singleuser
  Normal   Pulled          31m                 kubelet            Successfully pulled image "axionhub/base-notebook:latest" in 16.117170092s
  Normal   Pulled          20m                 kubelet            Successfully pulled image "axionhub/base-notebook:latest" in 1m2.87055963s
  Warning  BackOff         17m (x2 over 29m)   kubelet            Back-off restarting failed container
  Normal   Pulling         70s (x11 over 55m)  kubelet            Pulling image "axionhub/base-notebook:latest"

The process failed with Error: failed to start container "image-pull-singleuser": Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: chdir to cwd ("/home/jovyan") set in config.json failed: permission denied: unknown
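
The ownership and mode of that directory, and the default user the image runs as, can be checked straight from the image with something like:

docker run --rm axionhub/base-notebook:latest ls -ld /home/jovyan
docker run --rm axionhub/base-notebook:latest id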

All tests run by pytest passed, and I could launch JupyterHub with the image.

Expected behaviour

The image pull should succeed and I should be able to launch a centos7-based JupyterHub. This is necessary for the applications I’m using.

Actual behaviour

The image-pull-singleuser container failed to start.

How to reproduce

Change the helm chart config to:

singleuser:
  image:
    name: axionhub/base-notebook
    tag: latest

and run helm install
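
The install itself was along these lines (config.yaml is the values file with the snippet above; the release and namespace names are just what I chose):

helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update
helm upgrade --install jhub jupyterhub/jupyterhub \
  --namespace jupyterhub --create-namespace \
  --version 1.1.3 --values config.yaml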

Your personal set up

I’m using a private OpenStack cloud with centos 7 as the operating system, and Z2JH 1.1.3.

  • OS:
    centos 7
  • Version(s):
Client: Docker Engine - Community
Version:           20.10.5
API version:       1.41
Go version:        go1.13.15
Git commit:        55c4c88
Built:             Tue Mar  2 20:33:55 2021
OS/Arch:           linux/amd64
Context:           default
Experimental:      true
kubectl version:
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.1", GitCommit:"632ed300f2c34f6d6d15ca4cef3d3c7073412212", GitTreeState:"clean", BuildDate:"2021-08-19T15:45:37Z", GoVersion:"go1.16.7", Compiler:"gc", Platform:"linux/amd64"}

I’m using the default spawner (KubeSpawner) and OAuth2, but these shouldn’t affect the ability to use the image.

The Dockerfile I used to create the image:

ARG ROOT_CONTAINER=centos:centos7

FROM $ROOT_CONTAINER

ARG NB_USER="jovyan"
ARG NB_UID="1000"
ARG NB_GID="100"

# Fix DL4006
SHELL ["/bin/bash", "-o", "pipefail", "-c"]

USER root

RUN yum -y install epel-release && \
    yum repolist && \
    yum -y install \
    tini \
    wget \
    ca-certificates \
    sudo \
    which \
    vim \
    bash-completion && \
    yum -y clean all

# Configure environment
ENV CONDA_DIR=/opt/conda \
    SHELL=/bin/bash \
    NB_USER="${NB_USER}" \
    NB_UID=${NB_UID} \
    NB_GID=${NB_GID} \
    LC_ALL=en_US.UTF-8 \
    LANG=en_US.UTF-8 \
    LANGUAGE=en_US.UTF-8
ENV PATH="${CONDA_DIR}/bin:${PATH}" \
    HOME="/home/${NB_USER}"

# Copy a script that we will use to correct permissions after running certain commands
COPY fix-permissions /usr/local/bin/fix-permissions
RUN chmod a+rx /usr/local/bin/fix-permissions

# Enable prompt color in the skeleton .bashrc before creating the default NB_USER
# hadolint ignore=SC2016
RUN echo 'force_color_prompt=yes' >> /etc/skel/.bashrc && \
    echo 'export PS1="\e[1;32m[\u@\h \W]\$ \e[m "' >> /etc/skel/.bashrc && \
    echo 'alias vi=vim' >> /etc/skel/.bashrc && \
    # Add call to conda init script, see https://stackoverflow.com/a/58081608/4413446
    echo 'eval "$(command conda shell.bash hook 2> /dev/null)"' >> /etc/skel/.bashrc

# Create NB_USER with name jovyan user with UID=1000 and in the 'users' group
# and make sure these dirs are writable by the `users` group.
RUN echo "auth requisite pam_deny.so" >> /etc/pam.d/su && \
    touch /etc/sudoers && \
    sed -i.bak -e 's/^%admin/#%admin/' /etc/sudoers && \
    sed -i.bak -e 's/^%sudo/#%sudo/' /etc/sudoers && \
    useradd -l -m -s /bin/bash -N -u "${NB_UID}" "${NB_USER}" && \
    mkdir -p "${CONDA_DIR}" && \
    chown "${NB_USER}:${NB_GID}" "${CONDA_DIR}" && \
    chmod g+w /etc/passwd && \
    fix-permissions "${HOME}" && \
    fix-permissions "${CONDA_DIR}"



USER ${NB_UID}
ARG PYTHON_VERSION=default

# Setup work directory for backward-compatibility
RUN mkdir "/home/${NB_USER}/work" && \
    fix-permissions "/home/${NB_USER}"

# Install conda as jovyan and check the sha256 sum provided on the download site
WORKDIR /tmp

# ---- Miniforge installer ----
# Check https://github.com/conda-forge/miniforge/releases
# Package Manager and Python implementation to use (https://github.com/conda-forge/miniforge)
# We're using Mambaforge installer, possible options:
# - conda only: either Miniforge3 to use Python or Miniforge-pypy3 to use PyPy
# - conda + mamba: either Mambaforge to use Python or Mambaforge-pypy3 to use PyPy
# Installation: conda, mamba, pip
RUN set -x && \
    # Miniforge installer
    miniforge_arch=$(uname -m) && \
    miniforge_installer="Mambaforge-Linux-${miniforge_arch}.sh" && \
    wget --quiet "https://github.com/conda-forge/miniforge/releases/latest/download/${miniforge_installer}" && \
    /bin/bash "${miniforge_installer}" -f -b -p "${CONDA_DIR}" && \
    rm "${miniforge_installer}" && \
    # Conda configuration see https://conda.io/projects/conda/en/latest/configuration.html
    conda config --system --set auto_update_conda false && \
    conda config --system --set show_channel_urls true && \
    if [[ "${PYTHON_VERSION}" != "default" ]]; then mamba install --quiet --yes python="${PYTHON_VERSION}"; fi && \
    mamba list python | grep '^python ' | tr -s ' ' | cut -d ' ' -f 1,2 >> "${CONDA_DIR}/conda-meta/pinned" && \
    # Using conda to update all packages: https://github.com/mamba-org/mamba/issues/1092
    conda update --all --quiet --yes && \
    conda clean --all -f -y && \
    rm -rf "/home/${NB_USER}/.cache/yarn" && \
    fix-permissions "${CONDA_DIR}" && \
    fix-permissions "/home/${NB_USER}"

# Install Jupyter Notebook, Lab, and Hub
# Generate a notebook server config
# Cleanup temporary files
# Correct permissions
# Do all this in a single RUN command to avoid duplicating all of the
# files across image layers when the permissions change
RUN mamba install --quiet --yes \
    'notebook' \
    'jupyterhub' \
    'jupyterlab' && \
    mamba clean --all -f -y && \
    npm cache clean --force && \
    jupyter notebook --generate-config && \
    jupyter lab clean && \
    rm -rf "/home/${NB_USER}/.cache/yarn" && \
    fix-permissions "${CONDA_DIR}" && \
    fix-permissions "/home/${NB_USER}"

EXPOSE 8888


# Configure container startup
ENTRYPOINT ["tini", "-g", "--"]
CMD ["start-notebook.sh"]

# Copy local files as late as possible to avoid cache busting
COPY start.sh start-notebook.sh start-singleuser.sh /usr/local/bin/
# Currently need to have both jupyter_notebook_config and jupyter_server_config to support classic and lab
COPY jupyter_notebook_config.py /etc/jupyter/

# Fix permissions on /etc/jupyter as root
USER root

# Prepare upgrade to JupyterLab V3.0 #1205
RUN sed -re "s/c.NotebookApp/c.ServerApp/g" \
    /etc/jupyter/jupyter_notebook_config.py > /etc/jupyter/jupyter_server_config.py && \
    fix-permissions /etc/jupyter/

# Switch back to jovyan to avoid accidental container runs as root
USER ${NB_UID}
RUN fix-permissions "/home/${NB_USER}"

WORKDIR "${HOME}"

To help narrow down the problem, please could you:

  • Try the pre-puller with a standard docker-stacks base image
  • Try tagging your image instead of using latest
  • Disable the pre-puller, and see if the image launches (you may need to increase the timeout since the image will have to be pulled); see the sketch after this list
  • Tell us how you set up your Kubernetes cluster, since it sounds like you’ve installed it yourself
  • Show us your full Z2JH configuration with secrets redacted?
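
For the pre-puller, passing these overrides to helm should disable both the hook and continuous pullers (the release name and values file below are placeholders; use whatever you already pass to helm):

helm upgrade --install jhub jupyterhub/jupyterhub \
  --version 1.1.3 --values config.yaml \
  --set prePuller.hook.enabled=false \
  --set prePuller.continuous.enabled=false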

Thanks!

Hi manics,

My install of Z2JH was working perfectly with the docker-stacks base image. I even rebuilt that image on my own system for testing and it worked fine. As there is only one image in the repo, latest works fine for now, but I’ll retag it just to be sure. Using latest also worked for my rebuilt version of the Ubuntu base-notebook.

Just thought I should mention that axionhub/base-notebook is on Docker Hub right now, so anyone can pull it for a test.

I set up Kubernetes with Kubespray on a private OpenStack cloud; everything from dynamic PVCs to ingress has been made to work, so my guess is that the issue is some incompatibility between the ubuntu and centos7 base images?

Here’s my Z2JH config:


hub:
  config:
    GlobusOAuthenticator:
      client_id: ********************
      client_secret: ********************
      oauth_callback_url: ********************
    JupyterHub:
      authenticator_class: globus                                             

singleuser:
  image:
    name: axionhub/base-notebook
    tag: latest
  defaultUrl: "/lab"
  storage:
    extraVolumes:
      - name: cvmfs-fermilab
        persistentVolumeClaim:
          claimName: jupyterhub-cvmfs-fermilab
      - name: cvmfs-dune
        persistentVolumeClaim:
          claimName: jupyterhub-cvmfs-dune
      - name: cvmfs-larsoft
        persistentVolumeClaim:
          claimName: jupyterhub-cvmfs-larsoft
    extraVolumeMounts:
      - name: cvmfs-fermilab
        mountPath: /cvmfs/fermilab.opensciencegrid.org/
      - name: cvmfs-larsoft
        mountPath: /cvmfs/larsoft.opensciencegrid.org/
      - name: cvmfs-dune
        mountPath: /cvmfs/dune.opensciencegrid.org/

ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/rewrite-target: /
  hosts: 
    - *******************
proxy:
  service:
    type: ClusterIP

Thanks!

I just tested a minimal-ish config with your image on a throwaway K8s cluster (no storage, DummyAuthenticator) and it worked fine:

hub:
  db:
    type: sqlite-memory

proxy:
  service:
    type: NodePort
    nodePorts:
      http: 31080

singleuser:
  image:
    name: axionhub/base-notebook
    tag: latest
  defaultUrl: "/lab"
  storage:
    type: none

Could you try something like this to rule out issues related to your storage controller?
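
I applied it with something like this (the jupyterhub chart repo was already added; the file and release names are arbitrary) and then watched the pods come up:

helm upgrade --install jhub jupyterhub/jupyterhub --version 1.1.3 --values minimal.yaml
kubectl get pods --watch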

Hi manics,
I tried your config and still got the same error:

Events:
  Type     Reason          Age                From               Message
  ----     ------          ----               ----               -------
  Normal   Scheduled       70s                default-scheduler  Successfully assigned jupyterhub/hook-image-puller-lr6jk to axion-k8s-node-2
  Normal   AddedInterface  70s                multus             Add eth0 [10.233.92.246/32]
  Normal   Pulled          70s                kubelet            Container image "jupyterhub/k8s-network-tools:1.1.3" already present on machine
  Normal   Created         70s                kubelet            Created container image-pull-metadata-block
  Normal   Started         70s                kubelet            Started container image-pull-metadata-block
  Normal   Pulling         16s (x2 over 70s)  kubelet            Pulling image "axionhub/base-notebook:latest"
  Normal   Pulled          16s                kubelet            Successfully pulled image "axionhub/base-notebook:latest" in 53.556316525s
  Normal   Created         16s                kubelet            Created container image-pull-singleuser
  Warning  Failed          16s                kubelet            Error: failed to start container "image-pull-singleuser": Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: chdir to cwd ("/home/jovyan") set in config.json failed: permission denied: unknown

Did you use Helm, and which version of JupyterHub did you use?

I used the latest version of Helm (3.7.1) and Z2JH 1.1.3. I tested it on Katacoda (a free-to-use playground for learning infrastructure, but also handy for quickly testing things like this when you don’t feel like spinning up a new server/cluster):