A TLJH Plugin to build user environments with repo2docker

Hey folks!

Just wanted to share something we’ve been working on as part of a wider Jupyter-based project for a university.

tljh-repo2docker is a plugin for The Littlest JupyterHub to create multiple user environments with repo2docker.

The plugin starts a JupyterHub service to manage the user environments.

The Environments page shows the list of built environments, as well as the ones currently being built:

Just like on Binder, new environments can be added by clicking on the Add New button and providing a URL to the repository. Optional name, memory, and CPU limits can also be set for the environment:

Once ready, the environments can be selected from the JupyterHub spawn page:

This will spawn a new user server using DockerSpawner.

Overall it’s a little bit similar to a Binder running on a single machine :slight_smile:

There is also this TLJH issue to find a way to extend the plugin system and enable these kinds of use cases more easily. The idea would be to skip the creation of the default TLJH user environment if another one is provided in a plugin.

If anyone has any input on this that would be very welcome!

Thanks!


This looks great - thank you for sharing! It has some things in common with my Repo2DockerSpawner. You’ve managed to simplify a lot of the process, and make the build part admin-only whereas mine worked directly as a version of DockerSpawner where the user can specify their own Binder-ready repo URLs whenever they make a new server. It probably makes sense in most cases for the available Binders to be admin-controlled as they are in tljh-repo2docker.

Anyway, the most enlightening thing is that I am now aware that there is such a thing as a TLJH plugin!

To what extent does that plugin system need to be tied to TLJH (as opposed to JupyterHub in general)? I’ve taken a look but maybe failed to fully understand what the plugins are allowed to do…

I have essentially built a JupyterHub plugin in my ContainDS Dashboards project. As you can see in the Installation instructions, the way it fits into JupyterHub is through manual jupyterhub_config.py imports from my package: template_paths, tornado_settings, and extra_handlers.

And then optionally template_vars - plus some requirements around how you configure DockerSpawner and allow_named_servers

The extension overrides some of the core JupyterHub templates too, injecting its own UI into specific places (e.g. an extra section on the home.html page).

Setting these multiple entrypoints works, assuming an otherwise plain JH configuration… but certainly it would cause problems if you also installed a similar extension that had its own ideas about how these entrypoints should be extended.

I think my approach is very different to what you mean by a ‘plugin’ here, but it would be interesting to know if there is any overlap in the long term vision…

Standardising some of what I’ve tried to do could allow multiple extensions to co-exist without conflict.
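To make the conflict concrete, here is a minimal sketch of two extensions that both want to set template_paths. Everything in it is hypothetical: the helper names, the paths, and the SimpleNamespace stand-in for the `c` config object are illustrations only, not the API of JupyterHub or of any existing extension.

```python
from types import SimpleNamespace


def install_by_overwriting(c, paths):
    """Assign the setting outright, discarding whatever was there before."""
    c.JupyterHub.template_paths = list(paths)


def install_by_extending(c, paths):
    """Append to the existing setting so earlier extensions keep their entries."""
    existing = list(getattr(c.JupyterHub, "template_paths", []))
    c.JupyterHub.template_paths = existing + [p for p in paths if p not in existing]


def new_config():
    """Stand-in for the `c` object in jupyterhub_config.py (illustration only)."""
    return SimpleNamespace(JupyterHub=SimpleNamespace())


# Two hypothetical extensions that each assign the setting: the second one wins.
clobbered = new_config()
install_by_overwriting(clobbered, ["/opt/ext_a/templates"])
install_by_overwriting(clobbered, ["/opt/ext_b/templates"])

# The same two extensions, each adding to whatever is already configured.
merged = new_config()
install_by_extending(merged, ["/opt/ext_a/templates"])
install_by_extending(merged, ["/opt/ext_b/templates"])

print(clobbered.JupyterHub.template_paths)
print(merged.JupyterHub.template_paths)
```

With plain assignment only the last extension's templates survive, which is the kind of clash a standardised extension mechanism would have to prevent.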


Thanks! Yes Repo2DockerSpawner was definitely an option. But indeed the idea is to let admins prepare the environments in the background and expose them to the users only when they are ready.

It starts to be tied / coupled in some way when the plugin makes assumptions about the setup where it is installed. TLJH runs on a single machine and has a clear list of opinionated choices, and the plugin takes those as requirements.

But it’s true that for this particular plugin, the JupyterHub service part of the plugin could also be used in other JupyterHub deployments that run on a single machine (since it builds Docker images locally). Or the service could be extracted into its own package, so that the plugin only imports it and defines the hooks.
In such a deployment, jupyterhub_config.py could contain something like the following, after adding tljh_repo2docker to the requirements:

import sys

from tljh_repo2docker import Repo2DockerSpawner

# Spawn user servers with the spawner provided by the plugin
c.JupyterHub.spawner_class = Repo2DockerSpawner

# Run the environments manager as a JupyterHub service
c.JupyterHub.services.append(
    {
        "name": "environments",
        "admin": True,
        "url": "http://127.0.0.1:9988",
        "command": [
            sys.executable,
            "-m",
            "tljh_repo2docker.images",
        ],
    }
)

The ContainDS Dashboards project looks really cool, thanks for sharing!

Maybe what we need in the end is a minimal repo2docker API that could be started as a JupyterHub service? Other services or plugins could then depend on it and provide different frontends if they want to. It sounds like this has already been discussed on Discourse; I would need to find the relevant topics.

Another example of a TLJH plugin that uses repo2docker to build local environments: https://github.com/voila-dashboards/tljh-voila-gallery

This was spearheaded by @yuvipanda and tljh-repo2docker is heavily inspired by it!

@jtp Thank you very much for this explanation. Since then I have spent some time playing with TLJH and the plugin system. It all makes sense - e.g. as you say, the idea of having a TLJH plugin system is that you have some relatively simple assumptions about the base environment.

Maybe some hooks would be useful for z2jh, but maybe the config possibilities are just too varied with k8s.

I think the discussion on Twitter has already overtaken us here really!

It would certainly be useful to have a base DockerSpawner plugin that ideally installs docker itself, but definitely allows tljh-config to be used to set the base image or image whitelist.

Then your repo2docker plugin could add the r2d web handlers and image building on top of that, perhaps.

The idea of a standardised repo2docker (core) service could be useful, but to be honest it is a relatively small part of your plugin codebase, since a Docker container takes care of calling r2d. So it might not be worth spinning off in itself yet.

I might leave it to the experts for a bit, but I will come back to the idea of a dockerspawner plugin in a while if no-one else has had a chance to look at that side of things.

Thanks again for all your pointers here!


Z2JH is indeed another beast and provides a lot of hooks!

Ideally Discourse would be the place to discuss such topics in depth :slight_smile:

Based on the discussion from the Twitter thread, it looks like we could easily create a tljh-docker plugin following the idea you mentioned:

  1. Use DockerSpawner as the spawner
  2. Admins whitelist images with: tljh-config add-item docker.images jupyter/datascience-notebook, so the state stays in the TLJH config.yaml
  3. Fetch the list of available Docker images and show them in the Server Options Form
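As a rough starting point, step 1 of that outline could look like the sketch below. It relies on the TLJH hooks tljh_extra_hub_pip_packages and tljh_custom_jupyterhub_config; the tljh-docker plugin itself does not exist yet, so the module contents are hypothetical. The fallback for hookimpl is only there so the sketch can be imported on a machine without TLJH, and steps 2 and 3 are deliberately left out.

```python
try:
    from tljh.hooks import hookimpl
except ImportError:
    # Fallback so the sketch can be imported without TLJH installed.
    def hookimpl(func):
        return func


@hookimpl
def tljh_extra_hub_pip_packages():
    """Ask TLJH to install DockerSpawner into the hub environment."""
    return ["dockerspawner"]


@hookimpl
def tljh_custom_jupyterhub_config(c):
    """Step 1 of the outline: use DockerSpawner as the spawner."""
    c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
```

Installing the Docker daemon itself is not covered here and would still have to happen separately.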

Indeed, abstracting too early might add some overhead. Let’s keep an eye on the use cases and maybe then we could consider having such a core service.

Thank you as always for your thoughts.

Yes, that outline looks roughly like what I had in mind.

Ideally, there would also be a step:
0. Install docker daemon

Just to reiterate my understanding of DockerSpawner’s behavior, in case you’re not aware:

tljh-config add-item docker.images should set:

  • c.DockerSpawner.image if only one image is provided
  • c.DockerSpawner.image_whitelist if multiple images are specified

and then I think you’ll find that the existing functionality in DockerSpawner will take care of the Server Options form if relevant based on image_whitelist being non-empty. (i.e. I believe that step 3 doesn’t need any work in this plugin, provided step 2 behaves as described.)
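The mapping described above could be sketched roughly as follows. The function names are hypothetical, and how the plugin would read docker.images from the TLJH config is not shown; the sketch only assumes that it ends up with a plain list of image names.

```python
def docker_image_settings(images):
    """Map a list of image names to DockerSpawner setting names and values."""
    images = list(images)
    if not images:
        # Nothing configured: leave DockerSpawner's defaults untouched.
        return {}
    if len(images) == 1:
        # A single image: users get it directly, with no options form.
        return {"image": images[0]}
    # Several images: let DockerSpawner offer the choice to the user.
    return {"image_whitelist": images}


def apply_docker_images(c, images):
    """Copy the computed settings onto the `c.DockerSpawner` config section."""
    for name, value in docker_image_settings(images).items():
        setattr(c.DockerSpawner, name, value)
```

For example, docker_image_settings(["jupyter/datascience-notebook"]) yields only an image entry, while two or more images yield an image_whitelist entry instead.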

Another future consideration, that I think we should come back to rather than completely thrash out now… and it’s possible that TLJH plugins already have this built in: plugin dependencies or at least ordering…

I would like to build a TLJH plugin to install and configure ContainDS Dashboards. This would in turn require the Docker plugin above to be installed and active first. It’s not asking too much of the user to tell them to specify both plugins, and of course my plugin can probably work out for itself and complain if docker isn’t enabled. Again, I can see what happens when the time comes, but I thought it might be helpful to plant it in the back of your mind :slight_smile:
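For the "work it out for itself and complain" part, one possible check is sketched below. It is only a guess at how a dependent plugin might detect the Docker setup: both function names are hypothetical, and it assumes that looking at the configured spawner_class (a class, an import string, or the "docker" entry point name) is enough to decide.

```python
def uses_docker_spawner(spawner_class):
    """Return True if the configured spawner looks like DockerSpawner or a subclass."""
    if isinstance(spawner_class, str):
        # Either the entry point name or a dotted import path.
        last_part = spawner_class.split(".")[-1]
        return spawner_class == "docker" or last_part.endswith("DockerSpawner")
    if isinstance(spawner_class, type):
        # A class object: look for DockerSpawner anywhere in its ancestry.
        return any(base.__name__ == "DockerSpawner" for base in spawner_class.__mro__)
    return False


def require_docker_spawner(spawner_class):
    """Fail early with a readable message if the Docker plugin is not active."""
    if not uses_docker_spawner(spawner_class):
        raise RuntimeError(
            "This plugin needs DockerSpawner; install and enable the Docker plugin first."
        )
```

The string check is a name-based heuristic, so a spawner with an unrelated name that wraps DockerSpawner would be missed, and a real plugin would probably want something stricter.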