A way to tell a JupyterHub to run user servers with a specific image
This can be done with just docker, and would enable us to build a true ‘Littlest BinderHub’, providing the full functionality of BinderHub within a single VM. This has a couple of advantages:
Makes it much easier to run a BinderHub that you don’t expect a lot of traffic for (especially when used with auth)
Enables an abstraction that would make BinderHub usable in broader contexts in the future, such as on HPC machines
I’ve laid out a pathway for how to get there in the issue, and am slowly working on it. Please comment here on Discourse if this is something you would be interested in, and on the GitHub issue if you have opinions on how this should be accomplished.
Yes - this will be running the BinderHub software (that powers mybinder.org) directly. tljh-repo2docker uses repo2docker directly and has its own UI instead.
For a distribution, I’d imagine this would get deployed as two docker containers. Should be pretty easy to wrap the container deploys in ansible.
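To make the shape of that concrete, here’s a purely hypothetical docker-compose sketch of a two-container deployment (the service names, image names, and ports are all my invention, not published artifacts — this is a guess at the shape, not a tested config):

```yaml
# Hypothetical sketch of a two-container 'Littlest BinderHub' deployment.
# Image names and ports are illustrative only.
services:
  jupyterhub:
    image: example/littlest-binderhub-hub:latest      # hypothetical image
    ports:
      - "8000:8000"
    volumes:
      # the hub would need the docker socket to launch user containers
      - /var/run/docker.sock:/var/run/docker.sock
  binderhub:
    image: example/littlest-binderhub-builder:latest  # hypothetical image
    ports:
      - "8585:8585"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
```

Wrapping `docker compose up -d` in an ansible task would then be a one-liner.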
For a lot of edu use cases, I could see this being really handy in orgs where it might be institutionally difficult to get k8s support but achievable to run a service from a single container.
With such a server, it’d presumably be easy enough to lock it down to run just a single whitelisted repo, or one of a few whitelisted repos, ideally configured via a really simple list of whitelisted repo URLs?
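For instance (purely a sketch — none of these names exist in BinderHub), the lock-down could be as simple as checking each launch request against an allow-list of repo URLs:

```python
# Hypothetical allow-list check for a locked-down single-VM BinderHub.
# Nothing here is real BinderHub API; it just illustrates the idea.

ALLOWED_REPOS = {
    "https://github.com/example-org/course-materials",
    "https://github.com/example-org/demo-notebooks",
}

def is_launch_allowed(repo_url: str) -> bool:
    """Allow a build/launch only for whitelisted repo URLs."""
    # normalize trailing slashes and a trailing ".git" before comparing
    cleaned = repo_url.rstrip("/")
    if cleaned.endswith(".git"):
        cleaned = cleaned[: -len(".git")]
    return cleaned in ALLOWED_REPOS

print(is_launch_allowed("https://github.com/example-org/course-materials.git"))  # True
print(is_launch_allowed("https://github.com/someone-else/random-repo"))          # False
```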
Hmmm… what permissions are required to launch a container from another docker container? The same permissions would presumably apply to launching a set of linked containers using docker-compose? I think docker-compose has just been bumped up to a first-class member of the docker CLI (docs). BinderHub doesn’t do docker-compose, I think? I’m guessing repo2docker doesn’t either? But if a docker-compose.yml file referred to separate images built from different subdirs, I guess you’d just call repo2docker on each and then run them via docker-compose. So repo2docker would have to parse the docker-compose.yml to look for build: steps?
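To illustrate that last point: assuming the docker-compose.yml has already been parsed into a dict (e.g. with PyYAML), finding the build: contexts that repo2docker would need to build might look like this sketch (the compose data here is made up):

```python
# Sketch: pull the build contexts out of a parsed docker-compose.yml.
# In practice the dict would come from yaml.safe_load(); here it is inlined.
compose = {
    "services": {
        "web": {"build": "./web"},                      # short form: just a path
        "worker": {"build": {"context": "./worker"}},   # long form: mapping
        "db": {"image": "postgres:13"},                 # prebuilt image, no build step
    }
}

def build_contexts(compose_dict):
    """Return {service: context_dir} for every service with a build: key."""
    contexts = {}
    for name, svc in compose_dict.get("services", {}).items():
        build = svc.get("build")
        if isinstance(build, str):        # short form: "build: ./dir"
            contexts[name] = build
        elif isinstance(build, dict):     # long form: "build: {context: ./dir}"
            contexts[name] = build.get("context", ".")
    return contexts

print(build_contexts(compose))  # {'web': './web', 'worker': './worker'}
```

Each context dir could then be handed to repo2docker in turn, with the resulting image names substituted back before `docker compose up`.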
Indeed… there’s a lot of stuff that could just be done by ansible, though I’ve struggled to keep track of the current state of things with their package reorg.
On the repo front, an in-house binder could be much more aware of where it’s pulling stuff, e.g. actually use API calls in the UI to autocomplete repo/branch names.
I think this again raises the question of whether repo2x might be solved for more than x=docker. In some HPC settings, for example, the docker play might be substantially more complicated, where they have slurm|whatever and that’s the way they like it. Somebody mentioned a packer-based approach, which has a lot of legs, as it can target just about anything (including docker).
Alternately, and not to bang the old conda drum too much, but with a bit of time in the conda-forge mine for some hub deps (I gave up when traefik needed bzr to build from source), conda-packs could be a compelling target. Packs are dumb tarballs that can carry anything that fits in an as-installed conda PREFIX, which could include pip|npm|gem-installed stuff, or just ./configure --prefix $PREFIX && make && make install. Instead of running a docker registry, these tarballs could be stored by their input commits, e.g. /opt/repo2pack/<sha>, potentially with some salt from e.g. the builder major version, or something content-addressable.
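The storage scheme sketched above might look like the following (the path layout and the salt choice are my invention, just to make the idea concrete):

```python
# Sketch: content-addressable storage path for a conda-pack tarball,
# keyed on the input commit plus the builder major version as salt.
import hashlib
from pathlib import PurePosixPath

def pack_path(repo_commit: str, builder_version: str = "1") -> PurePosixPath:
    """Deterministic location for the pack built from a given commit."""
    digest = hashlib.sha256(f"{builder_version}:{repo_commit}".encode()).hexdigest()
    return PurePosixPath("/opt/repo2pack") / digest

# Same inputs always map to the same path; bumping the builder version
# changes the salt and therefore forces a rebuild under a new path.
print(pack_path("0123abc", builder_version="2"))
```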
The win here is, provided ingress to 443 was handled, nothing else would probably need to run with elevated permissions.
All it does is convert the internal Dockerfile generated by repo2docker into a shell-script, and then adds an optional packer template. I think we’re stuck with using the Dockerfile as the interface for now, though longer term there may be other options, such as CNCF buildpacks:
I’ve been playing with the setup in binderhub/testing/local-binder-local-hub to achieve the same thing. Would a LittlestBinderHub be quite close to this? What would be the main differences?