Design advice for a %%docker magic to run commands in containers

Hi all,

I’m exploring a design for my project Dockyter, an IPython/Jupyter extension that adds a %%docker magic to run shell commands inside Docker containers while keeping the base Python environment light.

The intended user-facing flow looks roughly like this:


%%docker -v /home/jovyan/data:/input image:latest
!tool --input /input/file.txt

Internally this would translate !tool ... into something like:


docker run --rm -v /home/jovyan/data:/input image:latest \
  bash -lc "tool --input /input/file.txt"

after %%docker has configured the image and args.
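The translation itself is just argv assembly, so it can be prototyped without IPython or Docker at all. A minimal sketch — the helper name `build_docker_argv` and the image-comes-last convention are my assumptions, not Dockyter's API:

```python
import shlex

def build_docker_argv(magic_line: str, cell_line: str) -> list[str]:
    """Hypothetical helper: turn a `%%docker <docker-args> <image>` header
    plus one `!command` cell line into a `docker run` argv."""
    docker_args = shlex.split(magic_line)   # flags and image from the magic line
    command = cell_line.lstrip()
    if command.startswith("!"):             # strip IPython's shell-escape prefix
        command = command[1:]
    return ["docker", "run", "--rm"] + docker_args + ["bash", "-lc", command]

argv = build_docker_argv("-v /home/jovyan/data:/input image:latest",
                         "!tool --input /input/file.txt")
# argv == ['docker', 'run', '--rm', '-v', '/home/jovyan/data:/input',
#          'image:latest', 'bash', '-lc', 'tool --input /input/file.txt']
```

Building an argv list (rather than a shell string) also sidesteps one class of quoting/injection bugs before the command ever reaches `bash -lc`.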

Architecture questions

I’m unsure where Dockyter should live in the Jupyter stack (kernel process + protocol, as described in kernel architecture docs).

  • IPython extension only: %load_ext dockyter, patch InteractiveShell.system so that ! goes through Docker when enabled.

  • Integrated into ipykernel: same behaviour, but configurable via traitlets and enabled/disabled per kernel.

  • Separate “Docker-aware” kernel: a dedicated kernel that always loads and configures this behaviour.
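For the extension-only option, the core move is wrapping `InteractiveShell.system`. Here's a hedged sketch of that patching pattern, shown on a stand-in class so it runs anywhere; a real extension would patch `InteractiveShell.system` from `load_ipython_extension` and execute via `subprocess` instead of returning strings:

```python
class FakeShell:
    """Stand-in for IPython's InteractiveShell (for illustration only)."""
    def system(self, cmd):
        return f"host: {cmd}"               # pretend to run on the host

def load_dockyter(shell_cls, image):
    """Wrap `system` so `!cmd` is routed through `docker run` when enabled."""
    original = shell_cls.system

    def docker_system(self, cmd):
        if image is None:                   # disabled: fall through to the host
            return original(self, cmd)
        return f"docker run --rm {image} bash -lc {cmd!r}"

    shell_cls.system = docker_system
    return original                         # saved so unload can restore it

load_dockyter(FakeShell, "image:latest")
print(FakeShell().system("tool --input /input/file.txt"))
# → docker run --rm image:latest bash -lc 'tool --input /input/file.txt'
```

Keeping a reference to the original method matters in practice: `%unload_ext dockyter` should restore plain `!` behaviour.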

From the point of view of people running JupyterHub/Binder, which approach fits real-world deployments best?

Security questions

I realise notebooks already allow arbitrary Python, but a magic that makes it easy to launch containers raises extra concerns. I’m currently thinking:

  • basic safeguards in the kernel (filtering dangerous Docker flags, allow/deny lists for images).

  • real isolation from the surrounding infrastructure (containers/pods per user, resource limits, network/volume policies).
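The first bullet could start as a simple argument check. A sketch under assumed conventions — the image comes last, and the deny/allow sets here are purely illustrative; deny-lists are easy to bypass, so the image allow-list would have to do most of the work:

```python
import shlex

# Illustrative policy, not Dockyter's actual safeguards.
DENY_FLAGS = {"--privileged", "--pid", "--network", "--cap-add", "--device"}
ALLOWED_IMAGES = {"image:latest", "jupyter/scipy-notebook:latest"}

def check_magic_args(magic_line: str) -> bool:
    """Validate a `%%docker` argument line; the image is assumed to come last."""
    args = shlex.split(magic_line)
    if not args or args[-1] not in ALLOWED_IMAGES:
        return False
    # Reject both `--flag value` and `--flag=value` spellings.
    return not any(a.split("=", 1)[0] in DENY_FLAGS for a in args[:-1])

check_magic_args("-v /home/jovyan/data:/input image:latest")   # True
check_magic_args("--privileged image:latest")                  # False
```

Either way, this only hardens the happy path; it doesn't substitute for the real isolation in the second bullet.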

Questions:

  • Does this threat model sound reasonable, or does it present too many problems?

  • Would you expect such a feature to be “off” by default and only enabled explicitly (config or kernelspec), or is a plain %load_ext dockyter extension acceptable?

  • Are there examples of similar projects that have worked well (or badly) in JupyterHub/Binder environments?

Thank you in advance for your help!

That seems like a great project idea! Looking forward to a nice POC

1 Like

A couple feelings about mixing containers into a Jupyter session:

  • not just docker (e.g. podman, apptainer, minikube, etc.)
  • not just one container (e.g. docker-compose, podman-compose)
  • multiple registries, auth, etc.
  • ways to find/prune unused images

Overloading ! is somewhat fraught, and it would likely be more readable as a % magic, perhaps something more general like %oci. Auto-enabling extensions is also fraught: it’s more readable to rely on an explicit %load_ext something and then see %something later on.

Typographically, if basing this off ipykernel/ipython/python, one approach to getting some of the magic would be to use a context manager:

with some_container:
    # scoped magic stuff here 
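One runnable shape for that idea, assuming a module-level slot that records the active image and a stand-in `run` for the patched shell hook (`container` and `run` are hypothetical names, not anyone's actual API):

```python
import contextlib

_current_image = None          # which container, if any, commands route to

@contextlib.contextmanager
def container(image):
    """Route shell commands to `image` for the duration of the `with` block."""
    global _current_image
    previous, _current_image = _current_image, image
    try:
        yield
    finally:
        _current_image = previous       # restore even if the block raises

def run(cmd):
    """Stand-in for the patched `!` handler."""
    if _current_image is None:
        return f"host: {cmd}"
    return f"{_current_image}: {cmd}"

with container("image:latest"):
    print(run("tool --input /input/file.txt"))   # image:latest: tool --input /input/file.txt
print(run("echo done"))                          # host: echo done
```

The `try/finally` restore is the point: the scoping stays correct even when a command inside the block fails.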

As for running any of the above inside hub/binder: probably just isn’t going to work in most cases. Cluster operators really don’t like running arbitrary workloads with rando containers and the docker/k8s socket bound. This is a particularly sound argument for podman(-compose), which can be configured to run containers-in-containers without elevated privileges.

3 Likes

I think it’s a really interesting idea.

I wouldn’t overload the magic commands, as I might want to use others, making the cells difficult to read. Instead, I would prefer to have the names of the services to send the code to in a dropdown menu, in each cell.

As for security, the constraints should be stated explicitly somewhere, as should at least the available kernel type (Python, R, etc.) for each available service.

Also, you may want to have a look at the Jupyter Enterprise Gateway to get an idea of how kernel execution can be managed across the many layers of Jupyter.