Launch additional docker containers from a Jupyter Notebook instance

Hi, I was wondering if there's a way to launch additional containers from within a Jupyter notebook. I'm guessing it would need access to a Docker CLI or API to do this. Has anybody done anything like this before?

I think you could do it when you spawn the notebook container by mounting /var/run/docker.sock and making sure the container is on a manager node.

Option 1: mount

I haven’t seen an example, but @markperri’s suggestion would work.

Assuming you are using DockerSpawner, then you can grant users access to the docker API with configuration like:

# mount the docker socket into user containers
c.DockerSpawner.volumes = {
    "/var/run/docker.sock": "/var/run/docker.sock",
}

# ensure the users have permission for the docker socket.
# this can also be done in the image
# gid may differ depending on host configuration,
# but this is what I need for a VM created by `docker-machine`
c.DockerSpawner.extra_host_config = {
    "group_add": [999],
}

Then you can install the docker CLI client and/or `pip install docker` for a Python client, and you are off to the races.
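As a rough sketch of that last step, assuming the socket mount above and the docker CLI installed in the user image (`docker_run_cmd` and `launch_sibling` are illustrative names, not an existing API):

```python
import shutil
import subprocess

def docker_run_cmd(image, command=None, network=None):
    """Build a `docker run -d` invocation (pure function, easy to test)."""
    cmd = ["docker", "run", "-d"]
    if network:
        cmd += ["--network", network]
    cmd.append(image)
    if command:
        cmd += command.split()
    return cmd

def launch_sibling(image, **kwargs):
    """Start a detached container on the host's docker daemon.

    Note: containers started this way are *siblings* of the notebook
    container, not children -- volume paths and published ports are
    resolved on the host, not inside the notebook container.
    """
    if shutil.which("docker") is None:
        raise RuntimeError("docker CLI not found in this image")
    out = subprocess.run(docker_run_cmd(image, **kwargs),
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()  # container id printed by `docker run -d`
```

The same thing works with the `docker` Python package (`docker.from_env().containers.run(image, detach=True)`), which talks to the mounted socket directly.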

However, granting users access to the docker socket like this can be a huge security issue, depending on the relationship between your users and the JupyterHub deployment: access to the socket is effectively root on the host. If JupyterHub itself runs in docker on the same host, then every user has full admin access to JupyterHub itself and to all other users. If it’s already a shared machine where everybody’s an admin anyway, this doesn’t change anything.

Option 2. jupyterhub-authenticated service

A more controlled approach, with a bit more work, is to run an additional service that can do only what you want it to. Maybe you are talking about dask or spark workers in containers, etc.

For that, you can build a hub-authenticated service with a REST API to take the specific actions you need. In this case:

  1. only your service talks directly to docker
  2. it is authenticated with the Hub so you know which Hub user is making requests
  3. you can launch containers with ‘link’ or ‘network’ arguments so that the requesting-user’s container can talk to the containers it has spawned, but not the containers it hasn’t
  4. you can implement quotas/limits so that users don’t hog all the resources
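The ownership and quota parts (points 3 and 4 above) can be sketched independently of any web framework or docker itself. This is only illustrative bookkeeping, not an existing API; a real service would wrap something like this in a hub-authenticated REST handler:

```python
class LaunchPolicy:
    """Track which user owns which sibling containers, with a per-user quota."""

    def __init__(self, per_user_limit=3):
        self.per_user_limit = per_user_limit
        self.owned = {}  # user name -> set of container ids

    def may_launch(self, user):
        # point 4: don't let one user hog all the resources
        return len(self.owned.get(user, set())) < self.per_user_limit

    def record_launch(self, user, container_id):
        self.owned.setdefault(user, set()).add(container_id)

    def may_access(self, user, container_id):
        # point 3: users can only talk to containers they spawned
        return container_id in self.owned.get(user, set())

    def cleanup(self, user):
        # release everything a user owns, e.g. when their server stops
        return self.owned.pop(user, set())
```

For the "authenticated with the Hub" part (point 2), JupyterHub ships `jupyterhub.services.auth.HubAuthenticated` for Tornado handlers, so the service knows which Hub user each request comes from.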

This is more work, because you need to:

  1. implement the service itself
  2. specify the REST API and/or implement and document a client installed in your user environments
  3. potentially add cleanup hooks to shut down sibling containers when the user’s server stops
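For the cleanup hooks in step 3, JupyterHub's `Spawner.post_stop_hook` is a natural place to notify the service. A minimal `jupyterhub_config.py` sketch, where the service URL and DELETE endpoint are assumptions about your hypothetical service, not a real API:

```python
import urllib.request

def cleanup_siblings(spawner):
    """Ask the launcher service to remove this user's sibling containers."""
    user = spawner.user.name
    # hypothetical endpoint exposed by your launcher service
    req = urllib.request.Request(
        f"http://launcher-service:8585/containers/{user}",
        method="DELETE",
    )
    try:
        urllib.request.urlopen(req, timeout=10)
    except Exception as e:
        # don't block server shutdown on cleanup failures
        spawner.log.warning("sibling cleanup failed for %s: %s", user, e)

c.Spawner.post_stop_hook = cleanup_siblings
```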

But at this point, you’ve clearly defined and have control over what users are able to do, and don’t need to worry about them having arbitrary access to docker itself.


Hi, thanks for the reply. As you say, Option 2 is probably the preferred way for us. Would you or anybody here be interested in helping us build this type of solution? The work would be paid, and the generic parts contributed to the OSS community.