Envkernel: manage kernels in different environments (venv, conda, Lmod, docker)

I run several JupyterHub clusters and have to deploy kernels for users (and help them deploy their own). Doing this manually doesn’t scale, and I eventually ran into the limits of scripting with the existing interfaces. So I created envkernel, which isn’t a kernel itself: it activates an environment, then starts some other kernel inside it.

It should be especially useful to anyone who manages many kernels on clusters, HPC systems, or shared filesystems with many different environments (Lmod, conda, Singularity, etc.) and kernel types.

Example

You would often install a kernel from a virtualenv globally using:

source venv/bin/activate
python -m ipykernel install --name=X --prefix=/path/to/jh-prefix

… but in this case, $PATH isn’t set up for the venv, because the kernel runs the venv’s Python without activating the environment. If you’re only running Python in the notebook, you won’t notice, but if you are doing more (e.g. spawning other processes found via $PATH), you will.
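A quick way to see the difference from inside a notebook is to check whether the interpreter’s own bin directory is on $PATH. The helper below is only a sketch for diagnosis; the name env_bin_on_path is made up for this post and is not part of envkernel or ipykernel.

```python
import os
import sys


def env_bin_on_path(prefix=None, path=None):
    """Return True if the environment's bin directory is an entry of PATH.

    prefix defaults to sys.prefix (the venv directory when running a venv's
    Python); path defaults to the current $PATH.
    """
    prefix = sys.prefix if prefix is None else prefix
    path = os.environ.get("PATH", "") if path is None else path
    # venvs keep executables in "bin" on POSIX and "Scripts" on Windows.
    bin_dir = os.path.join(prefix, "Scripts" if os.name == "nt" else "bin")
    entries = [os.path.normpath(p) for p in path.split(os.pathsep) if p]
    return os.path.normpath(bin_dir) in entries


# In a kernel installed without activation, this is expected to print False.
print(env_bin_on_path())
```

If this prints False, anything the notebook spawns through $PATH will not come from the venv.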

I have envkernel installed in the /path/to/jh-prefix environment, so I can do this instead:

source /path/to/jh-prefix/bin/activate
envkernel virtualenv --name=X --sys-prefix venv/
# --> create a new kernel X in jh-prefix that wraps ipykernel in venv/

This sets the kernel argv to ['path/to/envkernel', 'virtualenv', 'run', '/path/to/venv/', '--', 'python', '-m', 'ipykernel_launcher', '-f', '{connection_file}']. envkernel starts, activates the venv, then runs whatever arguments follow the --. Thus, nothing here is specific to ipykernel; it can be used with any kernel.
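To make the structure of that argv concrete, here is a small sketch that splits it at the -- and prints what the resulting kernel.json could look like. The helper name split_wrapper_argv and the display_name/language values are assumptions for illustration, not envkernel’s actual code or output.

```python
import json

# The argv from the example above, as it would appear in kernel.json.
argv = [
    "path/to/envkernel", "virtualenv", "run", "/path/to/venv/", "--",
    "python", "-m", "ipykernel_launcher", "-f", "{connection_file}",
]


def split_wrapper_argv(argv):
    """Split argv at the first "--" into (wrapper part, wrapped kernel command)."""
    sep = argv.index("--")
    return argv[:sep], argv[sep + 1:]


wrapper, kernel_cmd = split_wrapper_argv(argv)

# display_name and language are assumed values for this illustration.
kernel_json = {"argv": argv, "display_name": "X", "language": "python"}
print(json.dumps(kernel_json, indent=2))
```

Everything after the -- is an ordinary kernel command line, which is why swapping in a different kernel only changes that second half.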

Everything could be done by editing the argv manually, but I grew tired of this.

Other modes

Conda, same as virtualenv:

# conda
envkernel conda --name=X /path/to/condaenv

Docker gets interesting, because it takes non-trivial work to get a kernel running inside a container and talking to the outside. This mode runs ipykernel (etc.) in the Docker image and sets up the communication with the outside.

envkernel docker --name=X [docker args] [image name]
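To give an idea of the kind of work involved, the sketch below builds a docker run command line from a Jupyter connection file: it publishes the five kernel ports and bind-mounts the connection file into the container. This is a guess at the general approach, not envkernel’s implementation; docker_argv and the image name are hypothetical, and a real setup would also need the kernel inside the container to listen on an address the published ports can reach.

```python
import os

# The port fields found in a Jupyter kernel connection file.
PORT_KEYS = ("shell_port", "iopub_port", "stdin_port", "control_port", "hb_port")


def docker_argv(connection, connection_path, image, kernel_cmd):
    """Build a `docker run` argument list for a kernel (illustrative only).

    connection is the parsed connection file (a dict), connection_path is
    where that file lives on the host, and kernel_cmd is the kernel command
    with a literal "{connection_file}" placeholder.
    """
    path = os.path.abspath(connection_path)
    cmd = ["docker", "run", "--rm"]
    for key in PORT_KEYS:
        port = connection[key]
        # Publish each port on localhost only, same number inside and outside.
        cmd += ["-p", f"127.0.0.1:{port}:{port}"]
    # Make the connection file visible at the same path inside the container.
    cmd += ["--mount", f"type=bind,source={path},destination={path}"]
    cmd.append(image)
    cmd += [arg.replace("{connection_file}", path) for arg in kernel_cmd]
    return cmd
```

Note that this sketch does not start anything; it only shows which pieces have to line up between the host and the container.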

Singularity, often used on HPC systems:

envkernel singularity --name=X [singularity args] /path/to/image.simg

Lmod, the original reason I wrote this. Lmod is an environment module system; with it, you sometimes have to properly activate a module, and you would rather have modules loaded dynamically than hard-code paths. In the command below, the first anaconda3 is the name under which the kernel is saved, the second is the name of the module to load, and --purge will be familiar if you use Lmod (unload all other modules).

envkernel lmod --name=anaconda3 --purge anaconda3
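For comparison, the hand-written way to load modules at kernel start is to make the kernel argv a login shell that runs module load and then execs the kernel. The sketch below builds such an argv. It is not how envkernel works internally; lmod_bash_argv is a made-up name, and it assumes that module is defined in login shells on your cluster and that python resolves to the right interpreter once the modules are loaded.

```python
import shlex


def lmod_bash_argv(modules, purge=False):
    """Build a kernel argv that loads Lmod modules in a login shell.

    The connection file is passed as the shell's $0 so that paths with
    spaces survive quoting.
    """
    steps = []
    if purge:
        steps.append("module purge")
    for name in modules:
        steps.append("module load " + shlex.quote(name))
    steps.append('exec python -m ipykernel_launcher -f "$0"')
    return ["bash", "-l", "-c", " && ".join(steps), "{connection_file}"]


print(lmod_bash_argv(["anaconda3"], purge=True))
```

Writing and maintaining strings like this for every kernel is the sort of manual editing the lmod mode is meant to replace.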

Other kernels

envkernel only knows about IPython, but by using --kernel-cmd and --kernel one can have it wrap any other kernel. --kernel-template can clone an existing kernel, to get other support files such as kernel.js or the icons in IRkernel.
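As a rough picture of what cloning an existing kernel involves, the sketch below copies a kernelspec directory (so support files come along) and prepends a wrapper to its argv. It is my illustration of the idea, not what --kernel-template actually does internally; clone_kernelspec is a hypothetical helper, and the demo uses a throwaway directory with an IRkernel-style argv.

```python
import json
import shutil
import tempfile
from pathlib import Path


def clone_kernelspec(template_dir, target_dir, wrapper_prefix):
    """Copy a kernelspec directory and wrap its argv (illustrative only)."""
    template_dir, target_dir = Path(template_dir), Path(target_dir)
    # Copies kernel.json plus any support files (kernel.js, logos, ...).
    shutil.copytree(template_dir, target_dir)
    spec_path = target_dir / "kernel.json"
    spec = json.loads(spec_path.read_text())
    # New argv: wrapper, then "--", then the original kernel command.
    spec["argv"] = list(wrapper_prefix) + ["--"] + spec["argv"]
    spec_path.write_text(json.dumps(spec, indent=2))
    return spec


# Demo in a temporary directory, so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    template = Path(tmp) / "ir"
    template.mkdir()
    (template / "kernel.js").write_text("// support file\n")
    (template / "kernel.json").write_text(json.dumps({
        "argv": ["R", "--slave", "-e", "IRkernel::main()",
                 "--args", "{connection_file}"],
        "display_name": "R",
        "language": "R",
    }))
    spec = clone_kernelspec(
        template, Path(tmp) / "ir-wrapped",
        ["path/to/envkernel", "virtualenv", "run", "/path/to/venv/"],
    )
    copied_js = (Path(tmp) / "ir-wrapped" / "kernel.js").exists()

print(spec["argv"])
```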

Future and see also

It works for me and lets me more easily deploy lots of kernels on an HPC cluster, and provide simple instructions for anyone to put their own environments on the cluster using the --user option. See my user instructions.

This also makes nbgrader secure autograding possible by isolating the execution inside of docker/singularity.

It will undoubtedly need improvements once others start using it, but you can discuss here or file issues.

I’ve listed similar projects I know of under “See Also” in the readme. If you know of anything else similar, I will add it there. The most similar one to look at is a2km.

Any comments, feedback, or use cases?