Creating a JupyterLab DockerSpawner image (launched by JupyterHub) that employs Fedora systemd(5)

nmvega · December 16, 2020, 2:43am

Hello Friends:

I run JupyterHub and it launches JupyterLab image containers via DockerSpawner when someone logs in. But I need to build a JupyterLab image container that runs Fedora systemd (for various reasons not necessarily related to JupyterLab).

I already designed a Dockerfile for this, and from the CLI it boots with systemd just fine, as I demonstrate now.

Here are the Dockerfile's ENTRYPOINT and CMD parameters:

# ==========================================================================
# Set default ENTRYPOINT, COMMAND, and EXPOSE (publish) attributes:
# ==========================================================================
EXPOSE 8888
ENTRYPOINT ["/usr/sbin/init"]
CMD ["start-singleuser.sh"]
# ==========================================================================

Now let’s launch that image container and log into it via the CLI:

RUN:   root# docker run --name my-container --detach --privileged --volume /sys/fs/cgroup:/sys/fs/cgroup:ro --publish 18888:8888 acme/lab-with-systemd:1.0
LOGIN: root# docker exec --user root -it my-container /bin/bash

And inside that container we see:

user@container$ ps -ef
UID          PID    PPID  C STIME TTY          TIME CMD
root           1       0  0 02:07 ?        00:00:00 /usr/sbin/init /usr/local/bin/start-singleuser.sh
root          24       1  0 02:07 ?        00:00:00 /usr/lib/systemd/systemd-journald
root          37       1  0 02:07 ?        00:00:00 /usr/lib/systemd/systemd-logind
dbus          39       1  0 02:07 ?        00:00:00 /usr/bin/dbus-broker-launch --scope system --audit
dbus          40      39  0 02:07 ?        00:00:00 dbus-broker --log 4 --controller 9 --machine-id af44103f2675
root          42       0  0 02:07 pts/0    00:00:00 /bin/bash
root          60      42  0 02:07 pts/0    00:00:00 ps -ef

This appears fine. Notice that the command start-singleuser.sh is indeed running (as the argument to /usr/sbin/init), but that parameters to it are missing. This is expected because the container was launched via the CLI as opposed to via JupyterHub.

So let’s launch it via JupyterHub login. When I kill the running container, and try to have it launch by logging into JupyterHub, the container immediately exits. Here is what docker ps -a --no-trunc shows (which I reformatted for visual clarity):

cb13f5b87d5195a2b7648c7cfb14ac2d75e369eb537167068131dd1b47957e76  acme/lab-with-systemd:1.0
  "/usr/sbin/init /usr/local/bin/start-singleuser.sh
      --ip=0.0.0.0
      --port=8888
      --notebook-dir=/home/user/work
      --SingleUserNotebookApp.default_url=/lab
      --debug --disable-user-config"
    Exited (255) 9 secs ago jupyter-jdoe
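
For reference, Docker builds a container's argument vector by concatenating the image's ENTRYPOINT with either the image's CMD or the command supplied when the container is created. The sketch below models that rule; it assumes DockerSpawner supplies the create-time command (the start script plus its options), which is consistent with the `docker ps` output above. `effective_argv` is a hypothetical helper, not part of Docker or DockerSpawner, and the sketch does not establish that these extra arguments are what causes exit code 255.

```python
def effective_argv(entrypoint, image_cmd, command=None):
    """Return ENTRYPOINT followed by the create-time command, or by CMD if none was given."""
    tail = image_cmd if command is None else command
    return list(entrypoint) + list(tail)


entrypoint = ["/usr/sbin/init"]        # from the Dockerfile above
image_cmd = ["start-singleuser.sh"]    # from the Dockerfile above

# Plain `docker run` with no command: init receives only the image CMD.
cli = effective_argv(entrypoint, image_cmd)

# Launch through the hub: init receives the script plus every option (as seen in `docker ps`).
hub = effective_argv(
    entrypoint,
    image_cmd,
    command=[
        "/usr/local/bin/start-singleuser.sh",
        "--ip=0.0.0.0",
        "--port=8888",
        "--notebook-dir=/home/user/work",
        "--SingleUserNotebookApp.default_url=/lab",
        "--debug",
        "--disable-user-config",
    ],
)

print(" ".join(cli))
print(" ".join(hub))
```

In both cases /usr/sbin/init is PID 1 and everything after it is passed to init as arguments rather than run as a separate process.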

A couple of notes before ending this question:

I use user instead of jovyan.
Even if the above had worked, it may be a better idea to have /usr/sbin/init run by itself, and then start start-singleuser.sh somehow as a sub-process (rather than being an argument to /usr/sbin/init). But I’ll take anything that works for now. Just saying.
I tried creating a jupyterlab.service systemd service, but that didn’t work out well. Admittedly, I rushed that implementation, so the approach may be sound and the errors mine. I’ll revisit that.
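
One way to picture the "init runs alone, the start script runs as a service" idea is to generate the unit file text. This is only a sketch under assumptions: `render_unit` is a hypothetical helper, the path `/etc/jupyterlab.env` is made up and stands for wherever the hub-supplied environment would be saved, and the arguments are assumed to contain no whitespace or `%` characters (systemd treats both specially in ExecStart=).

```python
def render_unit(exec_start, user="user", env_file="/etc/jupyterlab.env"):
    """Render the text of a hypothetical jupyterlab.service unit."""
    lines = [
        "[Unit]",
        "Description=JupyterLab single-user server",
        "After=network.target",
        "",
        "[Service]",
        f"User={user}",
        # Services do not inherit the container's environment, so it is loaded from a file.
        f"EnvironmentFile={env_file}",
        f"ExecStart={' '.join(exec_start)}",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
    ]
    return "\n".join(lines) + "\n"


print(render_unit([
    "/usr/local/bin/start-singleuser.sh",
    "--ip=0.0.0.0",
    "--port=8888",
]))
```

The open question is how the per-launch options and environment chosen by JupyterHub would reach such a unit, since they are only known at container start.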

Any ideas how to make this work? Thank you!

manics · December 17, 2020, 10:06am

Can you share your full jupyterhub config with secrets redacted?

nmvega · December 17, 2020, 5:33pm

Hi @manics How delightful to hear from you.

Because I launch JupyterHub using a ./docker-compose.yml file, I have included that file along with the ./.env file it references.

As a quick follow-up, I don’t think using a jupyterlab.service file would work because systemd will (by default) launch that service without supplying context information passed to it by JupyterHub. So I kind of gave up on that version of a solution (i.e. using systemd).
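
To make the missing-context problem concrete: JupyterHub hands a spawned server part of its context through environment variables such as JUPYTERHUB_API_TOKEN, and systemd starts services with a clean environment rather than PID 1's. A possible workaround, sketched here under assumptions, is to save those variables in `KEY=value` form so a unit could load them with EnvironmentFile=. `render_env_file` is a hypothetical helper, and values are assumed to contain no newlines.

```python
import os


def render_env_file(environ, prefixes=("JUPYTERHUB_",)):
    """Return KEY=value lines for every variable whose name starts with one of the prefixes."""
    lines = []
    for key in sorted(environ):
        if not key.startswith(tuple(prefixes)):
            continue
        value = environ[key]
        if "\n" in value:
            # Multi-line values would need quoting rules this sketch does not attempt.
            raise ValueError(f"{key} contains a newline")
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print what would be saved for the current process environment.
    print(render_env_file(os.environ), end="")
```

A small wrapper could write this output to a file before handing control to /usr/sbin/init. The command-line options that JupyterHub appends would still need to be captured separately.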

And truthfully, I won’t need to solve this question / problem at all if this much easier problem is solved HERE.

Thank you. See below.

jupyterhub_config.py:

# ===========================================================================
# SEE: https://jupyterhub.readthedocs.io/en/stable/api/app.html
# Configuration file for JupyterHub. This file is evaluated inside the
# JupyterHub container, once it boots. All environment variables referenced
# below are injected by way of docker-compose.yml(5). (See next comment).
# ===========================================================================
import os, secrets
c = get_config()
# ===========================================================================


# ===========================================================================
# Initialize dictionaries here that we update later on, rather than
# initialize them somewhere deep in the code.
# ===========================================================================
c.DockerSpawner.extra_host_config   = dict()
c.DockerSpawner.extra_create_kwargs = dict()
# ===========================================================================


# ===========================================================================
# We use 'environment:' and 'env_file:' directives of docker-compose.yml(5) to
# inject UNIX environment variables into the JupyterHub container; which, in
# turn are referenced by this 'jupyterhub_config.py' file (via Python's
# os.environ[...]). This helps us avoid having to rebuild the JupyterHub
# Docker image every time we change a configuration parameter. Instead, after
# editing this file, we do: `docker-compose down [-v]` && `docker-compose up -d`.
# ===========================================================================



# ===========================================================================
# Spawn single-user servers as Docker containers
# ===========================================================================
c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'
# ===========================================================================


# ===========================================================================
notebook_dir = os.environ.get('DOCKER_NOTEBOOK_DIR') or '/home/user'
c.DockerSpawner.notebook_dir = notebook_dir
# ===========================================================================



# ===========================================================================
# REF: https://github.com/jupyter/docker-stacks
# ===========================================================================
# -- 'c.DockerSpawner.image' specifies the default Docker Notebook image to
#     launch when a drop-down menu of user-selectable images is not offered
#     to the user at login time. It is defined in the './.env' file.
# ===========================================================================
# -- 'c.DockerSpawner.image_whitelist' specifies the Docker Notebook images
#     (i.e. list items) that appear in user-selectable drop-down menu.
#     It is a Python dict() generated below from the 'SPAWN_IMAGE_WHITELIST'
#     UNIX environment variable defined and maintained in the './.env' file.
# ===========================================================================
c.DockerSpawner.image = 'jupyter/minimal-notebook:latest'
c.DockerSpawner.image = os.getenv('DOCKER_NOTEBOOK_IMAGE', c.DockerSpawner.image)
SPAWN_IMAGE_WHITELIST = os.getenv('SPAWN_IMAGE_WHITELIST').strip('"')
c.DockerSpawner.image_whitelist = dict(
    x.split(',') for x in "".join(SPAWN_IMAGE_WHITELIST.split()).strip(';').split(';'))
# ===========================================================================



# ===========================================================================
# Allows users to launch multiple instances of the same Notebook image; each
# of which they can give a friendly name to. We explicitly prohibit that. =:)
# ===========================================================================
c.JupyterHub.allow_named_servers = False
# ===========================================================================



# ===========================================================================
# JupyterHub requires a single-user instance of the Notebook server, so we
# default to using the `start-singleuser.sh` script included in the
# jupyter/docker-stacks *-notebook images as the Docker run command when
# spawning containers. Optionally, you can override the Docker run command
# using the DOCKER_SPAWN_CMD environment variable.
# ===========================================================================
#spawn_cmd = "start-singleuser.sh --SingleUserNotebookApp.default_url=/lab"
spawn_cmd = " ".join([
    "tini -g -- start-notebook.sh",
    "--ip=0.0.0.0",
    "--port=8888",
    f"--notebook-dir={notebook_dir}",  # interpolate the value, not the attribute name
    "--SingleUserNotebookApp.default_url=/lab --debug",
    "--disable-user-config",
])

#spawn_cmd += " --SingleUserNotebookApp.disable_user_config=True"
#spawn_cmd = os.environ.get('DOCKER_SPAWN_CMD', spawn_cmd)
#c.DockerSpawner.extra_create_kwargs.update({'command': spawn_cmd})
# ===========================================================================


# ===========================================================================
# Connect containers to this Docker network. Pass the network name as
# argument to spawned Notebook containers.
# ===========================================================================
network_name = os.environ['DOCKER_NETWORK_NAME']
c.DockerSpawner.use_internal_ip = True
c.DockerSpawner.network_name = network_name
c.DockerSpawner.extra_host_config.update({'network_mode': network_name})
# ===========================================================================


# ===========================================================================
# https://discourse.jupyter.org/t/how-do-you-run-a-jupterlab-docker-container-in-privileged-mode/5352
# https://docker-py.readthedocs.io/en/latest/api.html
# ===========================================================================
# The attribute 'c.DockerSpawner.extra_host_config' allows one to TUNE the
# CLI parameters that JupyterHUB passes to the docker(1) command when it
# launches a JupyterLAB instance. Because we need to run full Fedora O/S
# with systemd(5) included, instead of Fedora Minimal (which doesn't include
# systemd(5)), the below launch parameters are needed. See the following:
#    https://hub.docker.com/r/jrei/systemd-fedora/dockerfile
# ===========================================================================
c.DockerSpawner.extra_host_config.update({
            "privileged" : True,
            "devices"    : ["/sys/fs/cgroup:/sys/fs/cgroup:ro",],
            "tmpfs"      : {"/tmp":"", "/run":"", "/run/lock":""}, })

#c.DockerSpawner.extra_host_config = {
#            "privileged" : True,
#            "devices"    : ["/sys/fs/cgroup:/sys/fs/cgroup:ro",], }
# ===========================================================================


# ===========================================================================
# Mount the real user's Docker volume on the host to the notebook user's
# notebook directory in the container.
# ===========================================================================
c.DockerSpawner.volumes = { 'jupyterhub-user-{username}': notebook_dir }
# ===========================================================================


# ===========================================================================
# volume_driver is no longer a keyword argument to create_container()
# c.DockerSpawner.extra_create_kwargs.update({ 'volume_driver': 'local' })
# ===========================================================================


# ===========================================================================
# Remove containers once they are stopped.
# ===========================================================================
c.DockerSpawner.remove_containers = True
# ===========================================================================


# ===========================================================================
# For debugging arguments passed to spawned containers.
# ===========================================================================
c.DockerSpawner.debug = True
# ===========================================================================


# ===========================================================================
# User Notebook containers will access jupyterhub by container-name on
# backend Docker network.
# ===========================================================================
from jupyter_client.localinterfaces import public_ips
# ===========================================================================
c.JupyterHub.hub_port = 8080
c.JupyterHub.hub_ip = 'jupyterhub'
c.JupyterHub.hub_connect_ip = 'jupyterhub'
#c.JupyterHub.hub_ip = '0.0.0.0'
#c.JupyterHub.hub_connect_ip = '0.0.0.0'
#c.JupyterHub.hub_ip = public_ips()[0]
#c.JupyterHub.hub_connect_ip = public_ips()[0]
# ===========================================================================


# ===========================================================================
# TLS config
# NOTE: 'c.JupyterHub.port = 443' refers to the Container-side port, not the
# Host-side port. The Host-side requires mapping to 8443 port, which we do
# within 'docker-compose.yml'. Reason: Port 443 is taken by GitLab nginx.
# ===========================================================================
c.JupyterHub.port = 443
c.JupyterHub.ssl_key = os.environ['SSL_KEY']
c.JupyterHub.ssl_cert = os.environ['SSL_CRT']
# ===========================================================================


# ===========================================================================
from oauthenticator.gitlab import GitLabOAuthenticator
c.JupyterHub.authenticator_class = GitLabOAuthenticator
c.GitLabOAuthenticator.oauth_callback_url = os.environ['OAUTH_CALLBACK_URL']
c.GitLabOAuthenticator.client_id          = os.environ['OAUTH_CLIENT_ID']
c.GitLabOAuthenticator.client_secret      = os.environ['OAUTH_CLIENT_SECRET']
# ===========================================================================


# ===========================================================================
# Persist jupyterhub data on volume mounted inside container.
# ===========================================================================
data_dir = os.environ.get('DATA_VOLUME_CONTAINER', '/data')
# ===========================================================================


# ===========================================================================
# TBD...
# ===========================================================================
#c.JupyterHub.cookie_secret_file = os.path.join(data_dir, 'jupyterhub_cookie_secret')
os.environ['JPY_COOKIE_SECRET'] = secrets.token_hex(16)
c.JupyterHub.cookie_secret = bytes.fromhex(os.environ['JPY_COOKIE_SECRET'])
# ===========================================================================


# ===========================================================================
# - POSTGRES_HOST is injected via docker-compose.yml (environment: section).
# - POSTGRES_USER, POSTGRES_PASSWORD and POSTGRES_DB are also injected via
#   docker-compose.yml (env_file: -./etc/secrets.d/postgres.env)
# ===========================================================================
#c.JupyterHub.db_url = 'postgresql://postgres:{password}@{host}/{db}'.format(
c.JupyterHub.db_url = 'postgresql://{user}:{password}@{host}/{db}'.format(
    user=os.environ['POSTGRES_USER'],
    password=os.environ['POSTGRES_PASSWORD'],
    host=os.environ['POSTGRES_HOST'],
    db=os.environ['POSTGRES_DB'],)
# ===========================================================================


# ================================================================================
# Restrict access to only members of certain GitLab Projects or Groups.
# Note: Using this causes extra API calls, which incurs performance penalty.
# ================================================================================
##c.GitLabOAuthenticator.gitlab_project_id_whitelist = [ ... ]
##c.GitLabOAuthenticator.gitlab_group_whitelist = [ ... ]
# ================================================================================


# ===========================================================================
# Whitelist users and admins.
# ================================================================================
c.Authenticator.whitelist = whitelist = set()
c.Authenticator.admin_users = admin = set()
c.JupyterHub.admin_access = True
userlist = os.environ['USERLIST_FILE']
# ===========================================================================
with open(userlist) as f:
    for line in f:
        if not line: continue
        parts = line.split()
        # In case of newline at the end of userlist.txt file.
        if len(parts) >= 1:
            name = parts[0]
            whitelist.add(name)
            if len(parts) > 1 and parts[1] == 'admin':
                admin.add(name)
# ===========================================================================


# ================================================================================
# Uncommented-out settings (i.e. to explicitly set them).
# ================================================================================
c.JupyterHub.active_server_limit = 20
c.Spawner.disable_user_config = True # Increases security.
c.JupyterHub.cookie_max_age_days = 14 # Maybe change to 7.
c.JupyterHub.shutdown_on_logout = True
c.Authenticator.admin_users = {'janeDoe'} # Set literal; set('janeDoe',) would yield single characters.
c.Spawner.default_url = '/lab' # Starts JupyterLab by default. \o/
# ================================================================================
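
One difference between the CLI launch and this config is worth spelling out. The `docker run` command mounts /sys/fs/cgroup with `--volume`, whereas the config lists it under `devices`, which in docker-py's host config corresponds to `--device`. A minimal sketch of settings that mirror the CLI flags more closely follows. It assumes DockerSpawner's `read_only_volumes` option is used for the mount; `systemd_spawner_settings` is a hypothetical helper, and whether this change alone fixes the exit code 255 has not been verified.

```python
CGROUP_PATH = "/sys/fs/cgroup"


def systemd_spawner_settings():
    """Return DockerSpawner settings intended to mirror the `docker run` flags shown earlier."""
    return {
        # Equivalent of: --volume /sys/fs/cgroup:/sys/fs/cgroup:ro
        "read_only_volumes": {CGROUP_PATH: CGROUP_PATH},
        "extra_host_config": {
            # Equivalent of: --privileged
            "privileged": True,
            # Kept from the config above; the CLI example did not set tmpfs mounts.
            "tmpfs": {"/tmp": "", "/run": "", "/run/lock": ""},
        },
    }


# In jupyterhub_config.py, where `c` exists, this could be applied as:
#   settings = systemd_spawner_settings()
#   c.DockerSpawner.read_only_volumes = settings["read_only_volumes"]
#   c.DockerSpawner.extra_host_config.update(settings["extra_host_config"])
```

The mount is kept out of `extra_host_config` on purpose, so that it does not compete with the bind mounts DockerSpawner derives from `c.DockerSpawner.volumes`.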

docker-compose.yml:

version: '3.7'

networks:
  backend-net:
    external:
      name: ${DOCKER_NETWORK_NAME}
 #frontend-net:

volumes:
  data:
    external:
      name: ${DATA_VOLUME_HOST}
  db:
    external:
      name: ${DB_VOLUME_HOST}

services:
  jupyterhub-db:
    image: postgres:latest
    container_name: ${DATABASE_DOCKER_MACHINE_NAME}
    restart: always
    environment:
      POSTGRES_DB: ${POSTGRES_DB}
      PGDATA: ${DB_VOLUME_CONTAINER}
    env_file:
      - ./secrets.d/postgres.env
    volumes:
      - "db:${DB_VOLUME_CONTAINER}"
    networks:
      backend-net:
        aliases:
          - "postgres01"
    ports:
      - "15432:5432"

  jupyterhub:
    depends_on:
     - jupyterhub-db
    build:
      context: .
      dockerfile: Dockerfile
      args:
        FEDORA_VERSION: ${FEDORA_VERSION}
    restart: always
    image: ${JUPYTERHUB_IMAGE_FQ_NAME}
    container_name: ${JUPYTERHUB_DOCKER_MACHINE_NAME}
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:rw"
      - "data:${DATA_VOLUME_CONTAINER}"
    networks:
      backend-net:
        aliases:
          - "jupyterhub01"
    ports:
     #- "8443:443"
      - "443:443"
    links:
     - jupyterhub-db
    environment:
      POSTGRES_HOST: jupyterhub-db
    env_file:
      - ./.env
      - ./secrets.d/postgres.env
      - ./secrets.d/oauth.env

    command: >
      /opt/jupyterhub.d/usr/bin/jupyterhub.sh
   #command: >
   #  sleep 6000000000000
   # For testing.
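
Since the config reads many of these injected variables with `os.environ[...]`, a single missing entry in an env file stops the hub with a KeyError on the first one it hits. A small pre-flight check, sketched below, would report all missing names at once. `missing_env` is a hypothetical helper; the variable names are the ones jupyterhub_config.py reads.

```python
import os

# Variables that jupyterhub_config.py reads with os.environ[...].
REQUIRED_ENV = [
    "DOCKER_NETWORK_NAME",
    "SSL_KEY",
    "SSL_CRT",
    "OAUTH_CALLBACK_URL",
    "OAUTH_CLIENT_ID",
    "OAUTH_CLIENT_SECRET",
    "USERLIST_FILE",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_HOST",
    "POSTGRES_DB",
]


def missing_env(required, environ):
    """Return the required variable names that are unset or empty, in the given order."""
    return [name for name in required if not environ.get(name)]


# Near the top of jupyterhub_config.py this could be used as:
#   missing = missing_env(REQUIRED_ENV, os.environ)
#   if missing:
#       raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```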

.env (which is read by the above docker-compose.yml):

# ====================================================================
# ENVs: Friendly names given to JupyterHub and Postgres containers ...
# ====================================================================
DATABASE_DOCKER_MACHINE_NAME=jupyterhub-db
JUPYTERHUB_DOCKER_MACHINE_NAME=jupyterhub
# ====================================================================


# ====================================================================
# ENVs: JupyterHub container volume mappings ...
# ====================================================================
# root@vps10# docker volume create --name=volName
# root@vps10# docker volume inspect --format '{{ .Mountpoint }}' volName
# ====================================================================
DATA_VOLUME_CONTAINER=/data
DATA_VOLUME_HOST=jupyterhub-data
#
DB_VOLUME_CONTAINER=/var/lib/postgresql/data
DB_VOLUME_HOST=jupyterhub-db-data
# ====================================================================


# ====================================================================
# ENVs: JupyterHub container backend network ...
# ====================================================================
# All containers will join network specified in DOCKER_NETWORK_NAME.
# ====================================================================
# root@vps10# docker network create --driver bridge \
#   --ipam-driver default --subnet 172.10.0.0/16 jupyterhub-backend
# ====================================================================
DOCKER_SUBNET_CIDR='172.10.0.0/16'
DOCKER_NETWORK_NAME=jupyterhub-backend
# ====================================================================


# ====================================================================
DOCKER_NOTEBOOK_IMAGE=jupyter/minimal-notebook:latest
# ====================================================================


# ====================================================================
# A Map of user-selectable Notebook images that appear (by name) upon
# login. Each should appear in the output of `docker image ls` on the
# JupyterHub Docker HOST (or an attempt will be made to 'pull' it down
# from hub.docker.com, which will likely not be the image you want).
# The below data-structure will be converted into a Python dict() named
# 'c.DockerSpawner.image_whitelist' via Python code in 'jupyterhub_config.py'.
# ====================================================================
SPAWN_IMAGE_WHITELIST="minimal,jupyter/minimal-notebook:latest; all-spark,jupyter/all-spark-notebook:latest; single-user,jupyterhub/singleuser:1.2; acme_base,acme/lab-with-systemd:1.0"
# ====================================================================


# ====================================================================
# The Notebook's Dockerfile(5) (not JupyterHub's Dockerfile) specifies
# the command to launch via a combination of "ENTRYPOINT [ ... ]" and
# "CMD [...]" directives. This can be overridden here, as long as the
# command specified is valid (i.e. works) inside the container.
# ====================================================================
#DOCKER_SPAWN_CMD="tini -g -- start-singleuser.sh --SingleUserNotebookApp.default_url=/lab"
#DOCKER_SPAWN_CMD="start-singleuser.sh --SingleUserNotebookApp.default_url=/lab"
JUPYTER_ENABLE_LAB=yes
# ====================================================================


# ====================================================================
# For the JupyterLab Docker instance named, 'jupyter-nmvega', this can be
# checked in the output of: nmvega@HOST$ docker inspect jupyter-nmvega
# ====================================================================
# I believe this also sets the initial Current Working Directory
# (CWD) in JupyterLab's navigation pane.
# ====================================================================
DOCKER_NOTEBOOK_DIR=/home/user
# ====================================================================


# ====================================================================
# ENV variables related to: JupyterHub site SSL Cert and Userlist location.
# ====================================================================
SSL_CRT=/opt/jupyterhub.d/etc/ssl.d/ide.example.com.crt
SSL_KEY=/opt/jupyterhub.d/etc/ssl.d/ide.example.com.key
USERLIST_FILE=/opt/jupyterhub.d/etc/conf.d/userlist.txt
# ====================================================================


# ====================================================================
# ENV variable for: 'ARG FEDORA_VERSION' in Dockerfile.
# ====================================================================
# At build-time (i.e. docker-compose(1M) up [--build] ...),
# docker-compose.yml(5) sets it in the jupyterhub Service like this:
#   args:
#     FEDORA_VERSION: ${FEDORA_VERSION}
# which would be equivalent to specifying this on the CLI:
#   docker-compose(1M) build --build-arg FEDORA_VERSION=${FEDORA_VERSION} ...
# making it available to the 'ARG FEDORA_VERSION' statement in Dockerfile.
# ====================================================================
FEDORA_VERSION=32
# ====================================================================


# ====================================================================
# ENVs: (1) Name for JupyterHub built image and
# (2) Name of Postgres DB/Schema to create.
# ====================================================================
JUPYTERHUB_IMAGE_FQ_NAME=acme/jupyterhub:1.0
POSTGRES_DB=jupyterhub
# ====================================================================
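
The SPAWN_IMAGE_WHITELIST format above (entries separated by `;`, each entry a `name,image` pair) is converted into a dict by a one-liner in jupyterhub_config.py. A more defensive version of that conversion could look like the sketch below. `parse_image_whitelist` is a hypothetical helper that follows the format shown in this .env file and additionally rejects malformed entries.

```python
def parse_image_whitelist(raw):
    """Parse 'name,image; name,image; ...' into a {name: image} dict."""
    images = {}
    for entry in raw.strip().strip('"').split(";"):
        entry = entry.strip()
        if not entry:
            # Tolerate a trailing ';' or an empty value.
            continue
        name, sep, image = entry.partition(",")
        name, image = name.strip(), image.strip()
        if not sep or not name or not image:
            raise ValueError(f"expected 'name,image', got {entry!r}")
        images[name] = image
    return images


raw = (
    '"minimal,jupyter/minimal-notebook:latest; '
    "all-spark,jupyter/all-spark-notebook:latest; "
    "single-user,jupyterhub/singleuser:1.2; "
    'acme_base,acme/lab-with-systemd:1.0"'
)
for name, image in parse_image_whitelist(raw).items():
    print(f"{name:12s} {image}")
```

Unlike the one-liner, this returns an empty dict for an empty value and names the offending entry when a pair is incomplete.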
