Creating a new Binder-at-home tool

For a while I’ve been thinking about how to “take your binder home”. A site like mybinder.org lets you run stuff but you are limited in terms of CPU/RAM and can’t make your changes persistent.

What about a new tool that you run on your laptop that just uses Docker and Python (no Kubernetes), so that you could take a Binder link like https://mybinder.org/v2/gh/binder-examples/r, convert it to http://localhost/v2/gh/binder-examples/r, and have it start locally?
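A minimal sketch of the link conversion described above, using only the standard library. The function name `binder_to_local` and the `local_netloc` parameter are made up for illustration; nothing here is an existing Binder API.

```python
from urllib.parse import urlsplit, urlunsplit


def binder_to_local(url, local_netloc="localhost"):
    """Point a Binder launch URL at a local host, keeping path and query.

    Hypothetical helper: only the scheme and host are swapped, so the
    /v2/gh/... part of the link is passed through unchanged.
    """
    parts = urlsplit(url)
    return urlunsplit(("http", local_netloc, parts.path, parts.query, parts.fragment))


print(binder_to_local("https://mybinder.org/v2/gh/binder-examples/r"))
```

Keeping the path untouched means whatever answers on localhost can reuse the same URL layout as mybinder.org.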

We could add an option to the UI to add some persistent storage to your docker container so that you can save your changes.

This is a bit related to BinderHub for HPC.

From a technical point of view I’d start with writing a small web server that can respond at that URL. It would use repo2docker to build and launch the container for you, then redirect you there.
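As a rough sketch of the first thing such a web server would have to do, here is one way to turn the path part of the URL into something repo2docker can build. It assumes only GitHub-style paths of the form /v2/gh/<org>/<repo>[/<ref>]; the function name `parse_binder_path` is hypothetical, and the build, launch and redirect steps are left out.

```python
def parse_binder_path(path):
    """Split '/v2/gh/<org>/<repo>[/<ref>]' into (repo_url, ref).

    Assumption: only the 'gh' (GitHub) provider is handled, and the ref
    is optional. Anything else raises ValueError.
    """
    parts = [p for p in path.split("/") if p]
    if len(parts) < 4 or parts[0] != "v2" or parts[1] != "gh":
        raise ValueError(f"unsupported Binder path: {path!r}")
    org, repo = parts[2], parts[3]
    # Return None when no ref is given so the caller can pick a default.
    ref = parts[4] if len(parts) > 4 else None
    return f"https://github.com/{org}/{repo}", ref


print(parse_binder_path("/v2/gh/binder-examples/r"))
```

A real server would wrap this in a request handler and hand the result to repo2docker.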

You could also run this on a remote machine (though I’d say multi-user and auth are out of scope for v1), like a powerful machine in a corner of your office or even an AWS/GCE VM that was spun up “just now” and that is billed to your credit card.

If the initial use is for running locally, v0 could be even simpler: a command-line Python script that takes the Binder URL, builds and starts the server, and opens the browser.

This also opens up options for things like putting a custom tag on the Docker image for full reproducibility. It would also be really useful for figuring out why a new repo fails during startup on mybinder.
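A sketch of what the core of such a script might look like, assuming the Binder URL has already been resolved to a repository URL and optional ref. The helper name `repo2docker_command`, the ref `master` and the image tag are placeholders for illustration; `--ref`, `--image-name` and `--no-run` are repo2docker command-line options.

```python
import shlex


def repo2docker_command(repo_url, ref=None, image_name=None, run=True):
    """Build the argument list for a repo2docker invocation.

    Hypothetical helper: it only assembles the command and does not
    execute anything.
    """
    cmd = ["repo2docker"]
    if ref:
        cmd += ["--ref", ref]
    if image_name:
        # A fixed image name/tag makes it possible to find the image again.
        cmd += ["--image-name", image_name]
    if not run:
        # Build only, e.g. when debugging a repo that fails to build.
        cmd.append("--no-run")
    cmd.append(repo_url)
    return cmd


cmd = repo2docker_command(
    "https://github.com/binder-examples/r",
    ref="master",  # placeholder ref
    image_name="binder-examples-r:master",  # placeholder tag
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)`; opening the browser would still need the URL that repo2docker prints once the server is up.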

That is what you get with repo2docker already, except for the browser opening part. I think we should add that.

That’s what I thought, but I’ve never managed to successfully run repo2docker so I assumed there must be some other configuration required which could go into the wrapper script.

If you have a working dockerd (running docker images or docker info in a terminal prints something that isn’t an error), you should be able to get started with:

$ repo2docker https://github.com/binder-examples/requirements

Going via a web server would give us a chance to add some additional UI (beyond just a CLI).

In terms of strapline, this reminded me of O’Reilly LaunchBot (now deprecated?): https://blog.ouseful.info/2017/08/30/oreilly-launchbot-like-a-personal-binder-app-for-running-jupyter-notebooks-and-other-containerised-browser-apps/

BTW, in terms of less reinventing of wheels, doing that as a JupyterHub extension of some sort would satisfy quite a few requirements implicitly, like user management and “having a web server”.

Did you manage to run repo2docker locally? I’d first confirm that Docker works locally, then let’s tackle repo2docker not working. What does docker info say, and does the whalesay test work?

$ docker info
$ docker run docker/whalesay cowsay Hey

If the two previous commands work, this should get you a running “local” binder:

$ repo2docker https://github.com/binder-examples/requirements
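If this check ended up in a wrapper script, it might look roughly like the sketch below. The function name `docker_works` and the injectable `run` parameter are made up for illustration; the only real command involved is `docker info`.

```python
import subprocess


def docker_works(run=subprocess.run):
    """Return True if `docker info` exits with status 0.

    `run` is injectable (hypothetical design choice) so the check can be
    exercised without a Docker installation.
    """
    try:
        result = run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # Covers the docker binary being missing or not executable.
        return False
    return result.returncode == 0
```

Calling `docker_works()` before invoking repo2docker lets the script print a clear "Docker is not running" message instead of a build error.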

I would not make binder-at-home something that knows about multiple users etc., precisely because that is JupyterHub. Instead you’d write a new spawner for JupyterHub (this is how Everware, a predecessor of Binder, worked; I’d recommend against it) or possibly a new kind of frontend (like BinderHub is) to orchestrate your hub. It would run as a JupyterHub service and first build the image, then use a standard Docker spawner to launch it.

Basically BinderHub without Kubernetes :) Given that we have BinderHub as a multi-user, auth’ed, etc. tool, I am less excited about building a copy of it minus k8s. Getting something individual users can use seems more exciting; others can just use BinderHub (and we can focus resources there).

Just thought I’d let you know I haven’t forgotten this! Been a bit busy with other things.

My original failure was on Fedora Linux. I’ve tried repo2docker https://github.com/binder-examples/requirements just now with Docker-for-Mac (18.09.2) and it’s eaten up all the allocated disk space!

Before running repo2docker:

$ docker system df
TYPE                TOTAL               ACTIVE              SIZE                RECLAIMABLE
Images              6                   1                   3.068GB             3.068GB (100%)
Containers          1                   1                   25.54MB             0B (0%)
Local Volumes       0                   0                   0B                  0B
Build Cache         0                   0                   0B                  0B

When repo2docker gets stuck (after printing out Creating home directory ... Copying files from /etc/skel ...):

$ docker system df
TYPE                TOTAL               ACTIVE              SIZE                RECLAIMABLE
Images              7                   2                   3.186GB             3.068GB (96%)
Containers          2                   0                   782.5GB             782.5GB (100%)
Local Volumes       0                   0                   0B                  0B
Build Cache         0                   0                   0B                  0B

I’ll keep on investigating.

Exciting :) What does id say on your machine? We have had a problem with users who have a “centrally managed” login on their OSX laptop: running repo2docker never completes (within the patience of the users) and seems to hang on copying /etc/skel. We’ve never tracked down exactly why it happens. The giveaway is that id reports a very, very large number as their user ID.

The fix is to run repo2docker --user-id 1000 --user-name jovyan ... to use those details instead of your local user’s ID and name.
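A wrapper script could apply this workaround automatically, roughly as sketched below. The cutoff `MAX_SAFE_UID` is an arbitrary assumption (the thread only establishes that "very large" IDs cause trouble) and the helper name `user_flags` is hypothetical; `--user-id` and `--user-name` are the repo2docker flags from the command above.

```python
FALLBACK_UID = 1000
FALLBACK_NAME = "jovyan"
# Arbitrary cutoff for illustration only; the real limit is not known.
MAX_SAFE_UID = 65535


def user_flags(uid, max_safe_uid=MAX_SAFE_UID):
    """Return extra repo2docker flags when the local UID looks too large."""
    if uid > max_safe_uid:
        return ["--user-id", str(FALLBACK_UID), "--user-name", FALLBACK_NAME]
    # Otherwise let repo2docker use the local user's ID and name.
    return []


print(user_flags(2000000001))
```

On Linux or macOS the local UID can be read with `os.getuid()` and the returned flags added to the repo2docker command line.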

That was it! UID was >2000000000. I’ve added a comment to https://github.com/jupyter/repo2docker/issues/223#issuecomment-490041298

Related to this, ContainDS is a Windows or Mac app that now provides a mybinder-like GUI that you can run locally to create a Docker image and container from any Binder-ready URL. Announcement is here: A new Binder GUI for launching local containers

FYI Renku (see “Renku on your Own Machine” in the Renku documentation) already does basically this. They are also using repo2docker, but their CLI tool wraps starting the session. If you run renku session start in a Renku project on your local system with Docker installed, it will pull the container image if one has already been built, or build it locally if not. You can then click a link to open the session on localhost in the web UI served from the container: JupyterHub, RStudio or even a Linux desktop session. There might be something to learn from their approach that could be applied to Binder.
