DockerSpawner undocumented behaviour: `remove=False` does not replace the existing image A container with a new image B container

As per GitHub issue. Seems (at least to me) like weird and undocumented behaviour so posting (as requested) to see what the community thinks.


Description

A previously stopped container 1, derived from image A, is not removed when the user chooses a different image B from the options form. Container 1 should be removed so that container 2 can be created from image B. Instead, container 1 is simply restarted.

Expected behaviour

A user stopping their server and selecting image B should cause the existing stopped container 1 (based on image A) to be removed and a new container 2 to be created from image B.

Actual behaviour

A user stops their server and selects image B, but the existing container 1 is restarted.

How to reproduce

With c.Spawner.remove = False, c.Spawner.allowed_images = {image_name_a: image_a, image_name_b: image_b}, and JupyterLab as the IDE:

  1. User clicks Start Server and selects image A in the options form.
  2. User waits for the server to load.
  3. User goes to File > Hub Control Panel > Stop My Server.
  4. User clicks Start Server and selects image B in the options form.
  5. User waits for the server to load.
  6. User is presented with container 1 as before (the original container 1 is not removed).
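For anyone trying to reproduce this, the setup above might look roughly like the following jupyterhub_config.py fragment. This is only a sketch: the image names are hypothetical placeholders, and it uses the c.DockerSpawner config section rather than c.Spawner as written above.

```python
# jupyterhub_config.py (fragment, not a standalone script).
# JupyterHub provides get_config() when it loads this file.
c = get_config()  # noqa: F821

# Spawn single-user servers with DockerSpawner
c.JupyterHub.spawner_class = "docker"

# Hypothetical image names standing in for image A and image B;
# the keys are what the user sees in the options form.
c.DockerSpawner.allowed_images = {
    "image-a": "example/image-a:latest",
    "image-b": "example/image-b:latest",
}

# Keep stopped containers around instead of deleting them on stop.
# With this setting, the behaviour reported in this post is that the
# stopped container is restarted even when another image is selected.
c.DockerSpawner.remove = False
```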

Personally, I’d argue that if a configuration option is offered, then the system should handle the resulting behaviour/state in a logical way, rather than showing weird undocumented behaviour like image B never spawning when image B is what the user specifically requested.

Seems like a simple fix, but I appreciate there could be additional complexity. For example, how does one define whether the current image is the same as the previous one: by image hash or by name? Should one warn the user that the container from their previous image will be deleted permanently? What about stale Hub configuration, like new container volume mounts?
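To make the "hash or name?" question concrete, here is a small standalone sketch. It is not DockerSpawner code: ImageRef and same_image are hypothetical names invented for illustration, and the IDs are made up. It shows why the two definitions disagree when a tag such as latest is rebuilt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRef:
    """Hypothetical record of the image a container was created from."""
    name: str      # human-readable reference, e.g. "example/image-a:latest"
    image_id: str  # content-addressed ID, e.g. "sha256:..."


def same_image(previous: ImageRef, requested: ImageRef, by: str = "id") -> bool:
    """Return True if the two references count as 'the same image'.

    by="name" compares the reference string only.
    by="id" compares the content-addressed ID only.
    """
    if by == "name":
        return previous.name == requested.name
    if by == "id":
        return previous.image_id == requested.image_id
    raise ValueError(f"unknown comparison mode: {by!r}")


# Made-up IDs: same tag rebuilt with new content, and a second name
# pointing at the rebuilt content.
old = ImageRef("example/image-a:latest", "sha256:" + "a" * 64)
rebuilt = ImageRef("example/image-a:latest", "sha256:" + "b" * 64)
other = ImageRef("example/image-b:latest", "sha256:" + "b" * 64)

name_says_same = same_image(old, rebuilt, by="name")  # same tag
id_says_same = same_image(old, rebuilt, by="id")      # different content
```

Comparing by name would treat a rebuilt latest as unchanged, while comparing by ID would treat every rebuild as a new image and so trigger removal more often. Either choice has consequences for when user data in the old container would be destroyed.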

Still, it’s weird for users when remove = False and they can only ever access the first image they ever spawn, whereas “hitting everything with a hammer”, a.k.a. remove = True, means things like user apt package installations cannot be persisted and encourages users to keep their containers alive for as long as humanly possible.

Short answer: it’s ~never right to use remove=False except for testing out a JupyterHub deployment, even though it’s the default. It’s ~always the right thing to use remove=True and make appropriate, deliberate choices for persistent storage using volumes.
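As a rough illustration of that advice, a jupyterhub_config.py fragment could look like the one below. It is a sketch under assumptions: the home directory is taken to be /home/jovyan (true for the Jupyter docker-stacks images, not for images in general), and the volume name pattern is just one possible choice.

```python
# jupyterhub_config.py (fragment, not a standalone script).
# JupyterHub provides get_config() when it loads this file.
c = get_config()  # noqa: F821

c.JupyterHub.spawner_class = "docker"

# Delete the container when the user's server stops, so the next
# spawn always starts from the currently selected image and config.
c.DockerSpawner.remove = True

# Persist user files in a per-user named Docker volume.
# Assumption: the image's home directory is /home/jovyan.
# {username} is expanded by DockerSpawner for each user.
c.DockerSpawner.volumes = {
    "jupyterhub-user-{username}": "/home/jovyan",
}
```

Only what lives under the mounted path survives a restart. System-level changes such as apt installs are still lost with the container, so those belong in the image itself.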

It is true that remove=False means that containers will not be removed by JupyterHub, which leaves it to admins to decide if and when containers should be deleted. One consequence, as you are seeing, is that config changes will not be reflected in stopped containers until those containers are manually deleted. This can mean old containers won’t work at all (e.g. changed connection info for the Hub) or won’t be up to date (image changes). It’s ambiguous, however, what the right thing to do is when faced with a config change such as the image:

  1. use outdated config (old image) and preserve user data, or
  2. destroy user data and get more up-to-date config

Automatically picking the destructive second option doesn’t seem desirable because someone’s data is sure to be lost as a result. Option 1 is recoverable with intervention, while option 2 is unconditional permanent data loss without user input. So when we are talking about default behavior, I think option 2 is not appropriate.

The only reason remove=False is the default is to avoid data loss: for remove=True to work as most folks want, user data should be persisted in volumes, but we don’t have a default persistent volume, in part because there’s no obvious, general default (how do you determine a user’s home directory for images in general?).

All that said, I think the right thing to do is:

  1. warn when using remove=False about its pitfalls, and that it’s not really what should be used
  2. persist a user home directory volume by default (use /home/jovyan by default, such that it’s easily overridable)
  3. switch to remove=True as the default

I think it’s probably not worth the effort to make remove=False more robust and more complex so that it sometimes removes containers, when it generally shouldn’t be used once you are up and running.
