Short answer: it’s ~never right to use remove=False except for testing out JupyterHub deployment, even though it’s the default. It’s ~always the right thing to do to use remove=True, and make appropriate deliberate choices for persistent storage using volumes.
It is true that remove=False means that containers will not be removed by JupyterHub, which leaves it to admins to decide if and when containers should be deleted. One consequence, as you are seeing, is that a stopped container keeps the config it was created with, so config changes are not picked up until the container is manually deleted. This can mean old containers won’t work at all (e.g. changed connection info for the Hub) or won’t be up to date (image changes). It’s ambiguous, however, what the right thing to do is when faced with a config change such as the image:
- use outdated config (old image) and preserve user data, or
- destroy user data and get more up-to-date config
Automatically picking the destructive second option doesn’t seem desirable because someone’s data is sure to be lost as a result. Option 1 is recoverable with intervention, while option 2 is unconditional permanent data loss without user input. So when we are talking about default behavior, I think option 2 is not appropriate.
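The intervention for option 1 starts with finding the stopped containers that still carry old config. Here is a minimal sketch, assuming the Docker CLI is on the PATH and that containers are named with a jupyter- prefix (an assumption; adjust it if your deployment names containers differently). The function names are hypothetical. It only lists containers, so whether to delete any of them stays a human decision.

```python
import subprocess


def stopped_user_containers_cmd(name_prefix="jupyter-"):
    """Build a `docker ps` command listing exited containers whose name matches the prefix."""
    return [
        "docker", "ps", "--all",
        "--filter", "status=exited",
        # Docker's name filter matches any container name containing this string.
        "--filter", f"name={name_prefix}",
        # Go template understood by the Docker CLI: print only the container name.
        "--format", "{{.Names}}",
    ]


def list_stopped_user_containers(name_prefix="jupyter-"):
    """Run the command and return the container names, one per entry."""
    result = subprocess.run(
        stopped_user_containers_cmd(name_prefix),
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line]
```

Calling list_stopped_user_containers() on the Docker host returns the names to review before anything is removed by hand.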
The only reason remove=False is the default is to avoid data loss: for remove=True to work as most folks want, user data should be persisted in volumes, but we don’t have a default persistent volume, in part because there’s no obvious, general default (how do you determine a user’s home directory for arbitrary images?).
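A minimal sketch of that combination in jupyterhub_config.py, assuming DockerSpawner and an image whose home directory is /home/jovyan (true of the Jupyter docker-stacks images, an assumption for anything else). The volume name jupyterhub-user-{username} is an illustrative choice, not a default.

```python
# jupyterhub_config.py (loaded by JupyterHub, which provides get_config)
c = get_config()  # noqa: F821

c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"

# Delete each container when the user's server stops, so the next start
# always picks up the current image and Hub connection info.
c.DockerSpawner.remove = True

# Keep user data in a named Docker volume, which outlives the container.
# Assumption: the image's home directory is /home/jovyan.
c.DockerSpawner.volumes = {"jupyterhub-user-{username}": "/home/jovyan"}
c.DockerSpawner.notebook_dir = "/home/jovyan"
```

With this, only what lives under the mounted path survives a restart; anything written elsewhere in the container is discarded along with it.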
All that said, I think the right thing to do is:
- warn when using remove=False about its pitfalls, and that it’s not really what should be used
- persist a user home directory volume by default (use /home/jovyan by default, such that it’s easily overridable)
- switch to remove=True as the default
I think it’s probably not worth the effort to make remove=False more robust and complex to remove containers sometimes, when it generally shouldn’t be used once you are up and running.