JupyterHub on K8s: IPython startup file gets deleted

We use a custom Docker image for singleuser that contains a COPY command, which copies an IPython startup file for the jovyan user:

ARG BASE_CONTAINER=$OWNER/pyspark-notebook:spark-3.2.0

...

COPY --chown="${NB_UID}:${NB_GID}" init_spark.py "${HOME}/.ipython/profile_default/startup/init_spark.py"

However, this file does not exist at all in the resulting singleuser container, and the directories above it appear to have been deleted too. Is something wiping one of these intermediate directories at some point in the process? More importantly, what's the correct and most idiomatic way to drop init_spark.py into the image?

jovyan@jupyter-xxxxxx:~$ ls -la ~/.ipython
total 8
drwxr-sr-x 2 jovyan users 4096 Dec 14 17:39 .
drwxrwsr-x 7 root   users 4096 Dec 14 17:39 ..

(Side note, it also seems strange that /home/jovyan is owned by root.)

We can see that the Docker image itself doesn’t seem to have issues; when run in isolation with k8s out of the equation, the file exists:

$ docker container run -it --rm --entrypoint=bash jupyterhub-singleuser:0.8.0
(base) jovyan@cce1d31f07fa:~$ ls -la ~/.ipython/profile_default/startup/
total 4
drwxr-sr-x 2 jovyan users  27 Dec 14 17:03 .
drwxr-sr-x 3 jovyan users  21 Dec 14 17:03 ..
-rw-r--r-- 1 jovyan users 612 Dec 14 17:01 init_spark.py

(This file defines sc and spark just as the startup script for pyspark does.)

Edit: I see that there is a volume mount at /home/jovyan, which would explain why it looks wiped clean: the mounted volume hides whatever the image placed at that path.
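
One way to confirm this from inside the running pod is to ask which mount point a path actually lives on. The following is a minimal sketch using only the standard library; the helper name nearest_mount_point is made up for illustration, and the result you get depends on how your cluster mounts volumes.

```python
import os


def nearest_mount_point(path: str) -> str:
    """Walk upward from `path` until a mount point is found and return it."""
    current = os.path.realpath(path)
    while not os.path.ismount(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return current


if __name__ == "__main__":
    # Assumption: run this in the singleuser container as jovyan.
    print(nearest_mount_point(os.path.expanduser("~/.ipython")))
```

If this prints the home directory rather than /, the home directory is its own mount, so files the image stored under it are hidden by the volume.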

So perhaps we want singleuser.extraFiles here? But is it safe to put that under /home/jovyan since that is already an existing VolumeMount? It seems like that might conflict with singleuser.homeMountPath.

Another edit: singleuser.extraFiles looked like it would work, but it did not, because the surrounding directories are created with a hardcoded mode (defaultMode 420 in the pod spec, which is decimal for octal 0644) and are owned by root.

  extraFiles:
    init_spark.py:
      mountPath: "/home/jovyan/.ipython/profile_default/startup/init_spark.py"
      # Use 664 to make writable by `users` group (owned by root)
      mode: 0664
      stringData: |
        FOO = "bar"

This results in the following warning:

/opt/conda/lib/python3.9/site-packages/IPython/paths.py:59: UserWarning: IPython dir '/home/jovyan/.ipython' is not a writable location, using a temp directory.

As this message indicates, this is because the surrounding directories, specifically /home/jovyan/.ipython, are not writeable by jovyan/users.

jovyan@jupyter-xxxx:~$ ls -dl /home/jovyan/.ipython
drwxr-sr-x 3 root users 4096 Dec 14 20:27 /home/jovyan/.ipython
jovyan@jupyter-xxxx:~$ ls -la /home/jovyan/.ipython/profile_default/startup/init_spark.py
-rw-rw-r-- 1 root users 612 Dec 14 20:26 /home/jovyan/.ipython/profile_default/startup/init_spark.py

As a result, IPython moves its startup directory under /tmp, so init_spark.py is no longer read, which defeats the original goal:

In [1]: get_ipython().profile_dir.startup_dir
   ...: 
Out[1]: '/tmp/tmpatviwsy_/profile_default/startup'
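
To narrow down which directory triggers the fallback, a small check like the one below can be run in the same session. It is only a sketch: the helper name writable_report is hypothetical, and IPython's own writability test may differ in detail from this one.

```python
import os
from pathlib import Path


def writable_report(ipython_dir: str) -> dict:
    """Map each directory IPython needs to True if it exists and is writable."""
    base = Path(ipython_dir)
    candidates = [
        base,
        base / "profile_default",
        base / "profile_default" / "startup",
    ]
    return {
        str(path): path.is_dir() and os.access(path, os.W_OK | os.X_OK)
        for path in candidates
    }


if __name__ == "__main__":
    for path, ok in writable_report(os.path.expanduser("~/.ipython")).items():
        print(f"{'writable' if ok else 'NOT writable'}: {path}")
```

With the root-owned directories shown above, /home/jovyan/.ipython would be expected to show up as not writable for jovyan.
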

The pod spec shows the modes applied to the mounted secret:

$ k get po -n xxx jupyter-xxxx -o yaml

  - name: files
    secret:
      defaultMode: 420
      items:
      - key: init_spark.py
        mode: 436
        path: init_spark.py
      secretName: jupyter-singleuser
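
The modes in the pod spec are printed in decimal, which makes them hard to compare with the ls output. A short sketch of the conversion (the helper name describe_mode is made up for illustration):

```python
import stat


def describe_mode(decimal_mode: int) -> tuple:
    """Return the octal string and ls-style string for a regular file mode."""
    return oct(decimal_mode), stat.filemode(stat.S_IFREG | decimal_mode)


if __name__ == "__main__":
    for label, value in [("defaultMode", 420), ("mode", 436)]:
        octal, symbolic = describe_mode(value)
        print(f"{label}: {value} -> {octal} ({symbolic})")
```

So 436 is the 0664 requested in extraFiles and matches the -rw-rw-r-- shown for init_spark.py, while 420 is 0644.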

This is related to the thread "`extraFiles` makes folder unwritable?" (post #5 by consideRatio).

What we have ended up doing is putting the startup script under /tmp and setting the IPYTHONDIR environment variable to point to the intermediate /tmp/xxx/.ipython directory. Doesn't feel great, but it works.

You could use a lifecycle hook to copy the file at startup: Customizing User Environment — Zero to JupyterHub with Kubernetes documentation


Thanks @manics, that worked like a charm:

Dockerfile:

COPY --chown="${NB_UID}:${NB_GID}" init_spark.py /tmp/init_spark.py

Helm:

singleuser:
  ...
  lifecycleHooks:
    postStart:
      exec:
        command:
          - "sh"
          - "-c"
          - >
            mkdir -p /home/jovyan/.ipython/profile_default/startup/;
            cp -a /tmp/init_spark.py /home/jovyan/.ipython/profile_default/startup/init_spark.py
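
For anyone who prefers to do the copy from Python rather than sh, the same step can be sketched with the standard library. This is an illustrative equivalent, not what the hook above runs; the function name install_startup_file is hypothetical, and unlike cp -a, shutil.copy2 keeps mode and timestamps but not ownership.

```python
import shutil
from pathlib import Path


def install_startup_file(source: str, ipython_dir: str) -> Path:
    """Create the IPython startup directory if needed and copy `source` into it."""
    startup_dir = Path(ipython_dir) / "profile_default" / "startup"
    startup_dir.mkdir(parents=True, exist_ok=True)
    target = startup_dir / Path(source).name
    shutil.copy2(source, target)  # overwrites an existing file, like cp
    return target


if __name__ == "__main__":
    # Paths taken from the Dockerfile and Helm snippet above.
    print(install_startup_file("/tmp/init_spark.py", "/home/jovyan/.ipython"))
```

Because the directories are created by the user running the hook, they end up writable by that user, which avoids the temp-directory fallback seen with extraFiles.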