However, it appears that this file does not exist at all in the final singleuser container, and the directories above it appear to have been deleted too. Is something happening at some point in the process to wipe one of these intermediate directories? More importantly, what's the correct and most idiomatic way to drop init_spark.py into the image?
jovyan@jupyter-xxxxxx:~$ ls -la ~/.ipython
total 8
drwxr-sr-x 2 jovyan users 4096 Dec 14 17:39 .
drwxrwsr-x 7 root users 4096 Dec 14 17:39 ..
(Side note, it also seems strange that /home/jovyan is owned by root.)
We can see that the Docker image itself doesn’t seem to have issues; when run in isolation with k8s out of the equation, the file exists:
$ docker container run -it --rm --entrypoint=bash jupyterhub-singleuser:0.8.0
(base) jovyan@cce1d31f07fa:~$ ls -la ~/.ipython/profile_default/startup/
total 4
drwxr-sr-x 2 jovyan users 27 Dec 14 17:03 .
drwxr-sr-x 3 jovyan users 21 Dec 14 17:03 ..
-rw-r--r-- 1 jovyan users 612 Dec 14 17:01 init_spark.py
(This file defines sc and spark just as the startup script for pyspark does.)
Edit: I see that there is a volume mount at /home/jovyan, which would explain why it appears wiped clean: the mounted volume hides whatever the image placed under that path.
So perhaps we want singleuser.extraFiles here? But is it safe to put that under /home/jovyan since that is already an existing VolumeMount? It seems like that might conflict with singleuser.homeMountPath.
Another edit: it seemed like singleuser.extraFiles would work, but it did not, because the surrounding directories are hardcoded to mode 420 (decimal, i.e. octal 0644).
extraFiles:
  init_spark.py:
    mountPath: "/home/jovyan/.ipython/profile_default/startup/init_spark.py"
    # Use 664 to make writable by `users` group (owned by root)
    mode: 0664
    stringData: |
      FOO = "bar"
This results in the following warning:
/opt/conda/lib/python3.9/site-packages/IPython/paths.py:59: UserWarning: IPython dir '/home/jovyan/.ipython' is not a writable location, using a temp directory.
As this message indicates, this is because the surrounding directories, specifically /home/jovyan/.ipython, are not writable by jovyan/users.
jovyan@jupyter-xxxx:~$ ls -dl /home/jovyan/.ipython
drwxr-sr-x 3 root users 4096 Dec 14 20:27 /home/jovyan/.ipython
jovyan@jupyter-xxxx:~$ ls -la /home/jovyan/.ipython/profile_default/startup/init_spark.py
-rw-rw-r-- 1 root users 612 Dec 14 20:26 /home/jovyan/.ipython/profile_default/startup/init_spark.py
As a result, the startup dir moves to a location under /tmp, so init_spark.py is no longer read, defeating the initial goal:
In [1]: get_ipython().profile_dir.startup_dir
...:
Out[1]: '/tmp/tmpatviwsy_/profile_default/startup'
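To make the fallback concrete, here is a rough sketch of the behaviour described above. It is not IPython's actual code: `writable_dir` and `resolve_startup_dir` are hypothetical names, and the logic is a simplified approximation based only on the warning text and the paths shown in this post.

```python
import os
import tempfile


def writable_dir(path):
    """True if path is an existing directory the current user can write to."""
    return os.path.isdir(path) and os.access(path, os.W_OK)


def resolve_startup_dir(home, environ):
    """Approximate how the startup directory can end up under /tmp.

    Simplified on purpose: this only models the IPYTHONDIR override and
    the "not a writable location" fallback mentioned in the warning.
    """
    ipython_dir = environ.get("IPYTHONDIR") or os.path.join(home, ".ipython")
    if os.path.exists(ipython_dir) and not writable_dir(ipython_dir):
        # Same situation as the warning: fall back to a fresh temp directory.
        ipython_dir = tempfile.mkdtemp()
    return os.path.join(ipython_dir, "profile_default", "startup")


if __name__ == "__main__":
    # Inside the pod this would show which startup directory is expected.
    print(resolve_startup_dir(os.path.expanduser("~"), os.environ))
```

With a root-owned /home/jovyan/.ipython that lacks group write, the check fails for jovyan and the sketch returns a path under the temp directory, which matches the `Out[1]` value above in shape.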
$ k get po -n xxx jupyter-xxxx -o yaml
  - name: files
    secret:
      defaultMode: 420
      items:
      - key: init_spark.py
        mode: 436
        path: init_spark.py
      secretName: jupyter-singleuser
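The modes in the pod spec are printed in decimal, which makes them hard to compare with the octal value in the Helm config. A small sketch to convert them (`describe_mode` is just an illustrative helper name):

```python
import stat


def describe_mode(decimal_mode):
    """Return the octal form and the ls-style string for a file mode."""
    # S_IFREG marks a regular file so filemode() renders a leading "-".
    return oct(decimal_mode), stat.filemode(stat.S_IFREG | decimal_mode)


if __name__ == "__main__":
    for mode in (420, 436):
        print(mode, *describe_mode(mode))
```

So `mode: 436` is the 0664 requested for init_spark.py (matching the `-rw-rw-r--` listing above), while `defaultMode: 420` is 0644, which has no group write bit.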
What we have ended up doing is putting the startup script under /tmp and setting the IPYTHONDIR environment variable to point to the intermediate /tmp/xxx/.ipython directory. It doesn't feel great, but it works.