How are dockerspawner mounts supposed to work?

I have jupyterhub running with my own oauth implementation. I’m really struggling to get persistent volumes to work though as the dockerspawner documentation is not clear to me. Currently getting a permission denied error when I try to create a file.

I’d like each user to have their own mounted volume on my machine under /data, for example. So the file structure would be /data/user1@gmail.com, /data/user2@gmail.com…

I really don’t understand how this is supposed to be accomplished in the config though. I can’t find the user’s files when browsing the docker container either. Where are they stored?

For instance, in my config I have the following.

import os
from dockerspawner import DockerSpawner

c.JupyterHub.spawner_class = DockerSpawner
c.DockerSpawner.network_name = 'jupyterhub'
c.DockerSpawner.remove = True
# Notebook images are run as user jovyan…
notebook_dir = os.environ.get('DOCKER_NOTEBOOK_DIR') or "/home/jovyan/notebook"
c.DockerSpawner.notebook_dir = notebook_dir
c.DockerSpawner.volumes = {'jupyterhub-user-{username}': notebook_dir}
c.DockerSpawner.image = "jupyter/datascience-notebook:latest"
c.DockerSpawner.mounts = [{'source': '/jupyterhub-user-{username}', 'target': notebook_dir, 'type': 'bind'}]

What volume should I mount in my compose file? Do I mount /data:/jupyterhub-user-{username}?

How are the mounts and volumes settings related? And how are these related to the docker volume I mount to jupyterhub?

Do I mount /jupyterhub-user-{username}:/data???

Having three sets of mounts/volumes is really confusing!

none of this makes sense to me… I’d really appreciate some help.

Thank you very much for any help you can give.

Lads, I’ve still not got this. It’s not clicking. What the hell am I supposed to be mounting?

If you’re not familiar with Docker we’ve got a full example in

that should be helpful. There are different types of volume mounts; using c.DockerSpawner.volumes = {"jupyterhub-user-{username}": notebook_dir} alone is probably the easiest.
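To make the relationship between the options a bit more concrete, here is a rough sketch. As far as I understand it, volumes maps a source to a target path inside the single-user container, and {username} is filled in per user. A source that is an absolute path is a bind mount of a directory on the Docker host (not a path inside the Hub container), while any other name is a named Docker volume. The helpers expand_volumes and mount_kind are made up for illustration, and they only approximate the expansion with str.format; DockerSpawner also escapes usernames, which matters for names like user1@gmail.com.

```python
def expand_volumes(volumes, username):
    """Fill the {username} placeholder in every source, roughly as the spawner does per user."""
    return {source.format(username=username): target for source, target in volumes.items()}


def mount_kind(source):
    """Docker treats an absolute path as a host bind mount and any other name as a named volume."""
    return "bind" if source.startswith("/") else "volume"


notebook_dir = "/home/jovyan/work"
volumes = {"jupyterhub-user-{username}": notebook_dir}

# Show what a single user ("user1" is a placeholder) would end up with.
for source, target in expand_volumes(volumes, "user1").items():
    print(f"{mount_kind(source)}: {source} -> {target}")
```

With a named volume like this, nothing extra needs to be mounted into the Hub container in the compose file; the Docker daemon creates and manages the volume itself.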

Hey! Thank you so much for getting back to me.

I’m quite familiar with Docker, running a huge number of services on my homelab using it.

Persistent data is not working for me unfortunately.

When using the simpler docker volume option, the volume is created and retained; however, when a server is stopped, only the files the user created are kept.

I as a user for instance would like to add multiple kernels, multiple conda environments, and have this persist even when the server restarts…

This is unfortunately not working with c.DockerSpawner.volumes.

The docker mount option is not clear to me, as mounting in docker normally creates any missing directories. However, dockerspawner doesn’t seem to be able to do this.

I would like all user data to persist on reboot. Is it more appropriate to have the target of the volume mount be /?

You could try building a custom image where everything is owned by root apart from wherever the user’s volume is mounted, so it’s impossible for anything to be changed outside that directory. You may also need some Conda or Python configuration to force them to use that directory, but you can try without first.

DockerSpawner uses the standard Docker API, same as the Docker command-line client, so you can try out different volume configurations locally without JupyterHub using just docker run, then work out what the equivalent DockerSpawner config is.
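As a sketch of that workflow, the snippet below turns a volumes-style mapping into the docker run command you could try by hand for one user. The function name docker_run_command, the test username, the port mapping, and the /home/jovyan/work target are assumptions for illustration, not anything DockerSpawner itself provides.

```python
import shlex


def docker_run_command(image, volumes, username):
    """Build a `docker run` command that mounts the same volumes for one user."""
    args = ["docker", "run", "--rm", "-it", "-p", "8888:8888"]
    for source, target in volumes.items():
        # Same source:target pairs as in c.DockerSpawner.volumes, with {username} filled in.
        args += ["-v", f"{source.format(username=username)}:{target}"]
    args.append(image)
    return shlex.join(args)


print(
    docker_run_command(
        "jupyter/datascience-notebook:latest",
        {"jupyterhub-user-{username}": "/home/jovyan/work"},
        "testuser",
    )
)
```

If the printed command gives you a writable directory when run manually, the same source and target should behave the same way under DockerSpawner.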

Thank you for sharing your thoughts.

So, just to query, the only way of having packages, environments etc… persist is by building a custom image? Isn’t persistence of environments etc… quite important? What’s the point of packaging the jupyterlab images with conda and mamba if people don’t get to really use them?

I’m sorry if this is quite naive. I’m looking to put this together for a small research team, and this is 100% a requirement. I can’t really think of a circumstance where you wouldn’t want their environments etc… to persist.

Persistent environments and docker containers are kind of opposite things. I’m not sure whether you can bind mount the entire /home/jovyan instead of just /home/jovyan/work. Otherwise you might have to rig up some persistent container overlay.

I am a fool!

I had the spawner’s remove option on (c.DockerSpawner.remove = True), so the containers were deleted instead of simply being stopped whenever a server shut down.

This is much more acceptable as it persists at least within the container.

I can just run a script now for each user for docker to create a weekly and monthly backup of the image. It also means that if team members new to environments botch up their setup or get overwhelmed I can just strip it back to the base environment again.

I am thinking it would be easier to use bind mounts and rsync for this to save space, but I haven’t been able to get dockerspawner to create directories the same way I can on my own machine. Needs some investigation on my part.

Just an update.

Whilst having containers persist like this is a temporary solution, ideally I could use docker volumes or docker mounts to store the data.

Unfortunately the docker volumes created by following the dockerspawner documentation are stuck at root:root ownership, with no apparent way for dockerspawner to force a change in ownership. I’ve tried c.DockerSpawner.cmd, c.DockerSpawner.environment, c.DockerSpawner.mounts and a tremendous number of different configurations of existing variables etc…, with no success.

This is a question that has been asked multiple times independently with substantial discussion. It was marked as a bug in 2017, but has yet to be resolved. (I can’t find this discussion again unfortunately.)

Here are some of the relevant discussions off hand.

This is a recurring issue people are facing. People like to have their environments and other folders backed up. Yes, some people have created scripts as a workaround. However, being able to change folder and user permissions of software, containers etc… is, at least from my experience, quite basic functionality.

Fortunately it seems that the issue only occurs on newly mounted folders, i.e. if the image already contains the file/folder, then this is a non-issue. However, the conda/pkgs and conda/envs folders do not exist in the base image, making backup and sharing of packages and environments between users quite difficult. This is something worthwhile to have in a team, as it means I can forward my working environment, scripts, and packages to a colleague for troubleshooting, rather than having to get admin approval to create a new image each time…

I wonder if this could be put forward as a feature request. At the very least the envs and pkgs folders can be put into the base image, as virtual environments are a staple, and backing up packages has other applications also.

DockerSpawner uses the standard Docker API, which is also used by the docker command line client.

This means DockerSpawner can only do what the docker API provides. For example, if you create a named Docker volume and mount it to an existing directory in the container, the permissions should be correct. However, if the directory doesn’t exist in the container, or you are using a host volume, you are responsible for setting the permissions.
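For the host volume case, one common workaround is a pre_spawn_hook that creates the per-user directory and sets its ownership before the container starts. The following is only a sketch under some assumptions: the /data base path comes from the original question, UID 1000 and GID 100 are the defaults for jovyan in the docker-stacks images, and ensure_user_dir and create_dir_hook are names made up here.

```python
import os

NB_UID = 1000  # default jovyan UID in the docker-stacks images
NB_GID = 100   # default "users" GID in the docker-stacks images


def ensure_user_dir(base_dir, username, uid=NB_UID, gid=NB_GID):
    """Create base_dir/username if needed and hand ownership to uid:gid."""
    path = os.path.join(base_dir, username)
    os.makedirs(path, mode=0o755, exist_ok=True)
    os.chown(path, uid, gid)  # changing to another user's UID needs root
    return path


def create_dir_hook(spawner):
    """pre_spawn_hook: runs in the Hub process just before the user's container starts."""
    ensure_user_dir("/data", spawner.user.name)


# In jupyterhub_config.py (not executed here):
# c.DockerSpawner.pre_spawn_hook = create_dir_hook
# c.DockerSpawner.volumes = {"/data/{username}": "/home/jovyan/work"}
```

Two caveats. The hook runs wherever the Hub runs, so if the Hub is itself in a container, /data on the host would need to be mounted into the Hub container at the same path (for example /data:/data in the compose file) for the hook and the Docker daemon to agree on the directory. Also, the directory name the hook creates has to match whatever {username} expands to in your DockerSpawner version, since usernames may be escaped.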

Did you try the suggestion from @markperri of mounting /home/jovyan? Assuming this folder exists and is owned by the default image user, and you’re using Docker named volumes (not a host mounted volume) I think it should be writeable.

If you’re using a docker-stacks image you can configure it to chown the home directory:
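A rough sketch of what that could look like, assuming the container is started through the docker-stacks start script: the CHOWN_HOME, CHOWN_HOME_OPTS, CHOWN_EXTRA and CHOWN_EXTRA_OPTS variables are docker-stacks options, and the container has to start as root for the chown to be possible. The helper chown_home_settings is a made-up name, and the values are illustrative rather than a tested configuration.

```python
def chown_home_settings(extra_dirs=()):
    """Return (environment, create_kwargs) asking the start script to chown the home directory."""
    environment = {
        "CHOWN_HOME": "yes",
        "CHOWN_HOME_OPTS": "-R",  # recurse into the mounted home directory
    }
    if extra_dirs:
        # Optional: additional comma-separated directories to chown as well.
        environment["CHOWN_EXTRA"] = ",".join(extra_dirs)
        environment["CHOWN_EXTRA_OPTS"] = "-R"
    # The start script can only chown when the container starts as root;
    # it then switches to the notebook user before launching the server.
    create_kwargs = {"user": "root"}
    return environment, create_kwargs


environment, create_kwargs = chown_home_settings()

# In jupyterhub_config.py (not executed here):
# c.DockerSpawner.environment = environment
# c.DockerSpawner.extra_create_kwargs = create_kwargs
```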

Can you provide a docker run ... command that works for a standalone JupyterLab container? If so we can work out the equivalent configuration for DockerSpawner.

The rest of what you require, i.e. persisting changes to the user environment, should be possible but you’ll need to configure Conda and Python to install additional packages to the mounted volume. My earlier suggestion of rebuilding the image but making all non-volume directories owned by root will help with that, since if it’s owned by root it’s impossible for a user to write to it, so any changes will be limited to the mounted volume. You may still need additional configuration.
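As one possible starting point for that configuration, conda reads the CONDA_ENVS_PATH and CONDA_PKGS_DIRS environment variables, so new environments and the package cache can be pointed at directories inside the mounted volume. This sketch assumes the volume is mounted at /home/jovyan; the .conda/envs and .conda/pkgs locations and the helper name persistent_env_vars are arbitrary choices for illustration.

```python
import posixpath


def persistent_env_vars(volume_root="/home/jovyan"):
    """Environment variables pointing conda at directories inside the mounted volume."""
    return {
        # Where `conda create -n NAME` puts new environments.
        "CONDA_ENVS_PATH": posixpath.join(volume_root, ".conda", "envs"),
        # Where conda caches downloaded packages.
        "CONDA_PKGS_DIRS": posixpath.join(volume_root, ".conda", "pkgs"),
    }


conda_environment = persistent_env_vars()

# In jupyterhub_config.py (not executed here):
# c.DockerSpawner.environment = conda_environment
```

Kernels registered with python -m ipykernel install --user end up under ~/.local/share/jupyter/kernels, so they would live on the same volume when the whole home directory is mounted.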

Otherwise maybe Docker isn’t the right solution?

Are you sure you want to use containers for this? You might want to look at systemdspawner or slurmspawner.
