Using JupyterHub & BinderHub to launch arbitrary containers

Whilst jupyter-server-proxy allows a notebook server to proxy other services within a container, is there any requirement that JupyterHub:

  • launches a container that runs a Jupyter notebook server?
  • launches a container that runs a service on a particular port?

With JupyterHub now allowing users to launch containers from a selection of images, I wonder whether, for example, they could launch one that just exposes a standalone RStudio application, or an XPRA-mediated desktop, or even just an RDP endpoint that could be connected to from an RDP client?

Whilst these use cases sit outside the Jupyter context, the idea of JupyterHub as a server that can manage multiple users and launch containers from a selection of images is a very powerful one. It is particularly useful for educators who want to make containerised applications available to learners and who aren’t sys admins or multi-user app developers, but have managed to get to grips with running a JupyterHub server.

In BinderHub, I see there is a related issue about launching arbitrary containers, not just ones that run a notebook server.

I think JupyterHub only cares that something starts responding on a TCP port. It doesn’t care what “the thing that is listening” is. So you can start RStudio directly and have it be “the thing that is listening”.

For my education: what is RDP?

However, because users talk directly to “the thing that is listening”, it should implement the JupyterHub auth setup; otherwise anyone can connect to your session (and run arbitrary code in your name).

There is an example in the JupyterHub repo of how to retrofit a Flask server to use the JupyterHub auth setup. You’d want to copy this or be inspired by it to implement auth for “the thing that is listening”.

If you can’t modify the code of “the thing that is listening”, you need some form of authentication proxy. My hunch would be that you quickly end up with something that looks like jupyter-server-proxy. Maybe a good way forward is to extend jupyter-server-proxy so that it can run “standalone” and take care of the authentication, instead of running as a notebook server extension?
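
To make the auth requirement a bit more concrete, here is a rough sketch of the check such an authenticating front end would need to make, using only the Python standard library. It assumes the Hub REST API's `GET /user` endpoint (which returns the user that owns the token presented) and the `JUPYTERHUB_API_URL` and `JUPYTERHUB_USER` environment variables that JupyterHub sets for spawned servers. The function names are made up for illustration, and the cookie/OAuth login flow that a browser session would need is left out entirely.

```python
import json
import os
import urllib.error
import urllib.request


def user_for_token(token, api_url):
    """Ask the Hub who owns `token`.

    Returns the user model (a dict) or None if the Hub rejects the token.
    Connection failures are not caught here and will raise.
    """
    req = urllib.request.Request(
        api_url.rstrip("/") + "/user",
        headers={"Authorization": f"token {token}"},
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return json.load(resp)
    except urllib.error.HTTPError:
        # 401/403 and similar: treat as "not authenticated"
        return None


def is_allowed(token, api_url=None, owner=None):
    """True only if `token` belongs to the user who owns this server."""
    api_url = api_url or os.environ["JUPYTERHUB_API_URL"]
    owner = owner or os.environ["JUPYTERHUB_USER"]
    if not token:
        return False
    user = user_for_token(token, api_url)
    return user is not None and user.get("name") == owner
```

A proxy sitting in front of “the thing that is listening” would call something like `is_allowed(...)` on each request and only forward the request when it returns `True`. Comparing against the server owner matters: a token that is valid for some other Hub user should still be refused.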

Hi Tim

Thanks for clarifying that. I guess I need to try a couple of tests…

RDP is Remote Desktop Protocol, which I think was a Microsoft thing (they certainly make cross-platform clients for it).

Rather than using something like noVNC to expose a Linux desktop via a browser window, with xrdp you can connect an RDP client to an RDP service running in a Linux VM and work on the desktop that way. One big plus for it is that sound works (which I don’t think it does in noVNC?).

Example here of accessing MS VS Code that way (yes, I know, I’ve since realised you can run VS Code as a service in its own right). Another example here, which I need to write a blog post about, runs a desktop Windows app under Wine in a container that can be accessed via RDP.

So I tried to do a “quick” demo using OpenRefine, but didn’t get very far. JupyterHub seems to start the Java server bit of the OpenRefine server, but then it falls over… (Just running the OpenRefine container on its own works fine.)

This has got me wondering what Docker command DockerSpawner actually generates when it tries to launch a container. Is that easily found in a log somewhere?
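
A note on that question, offered tentatively: as far as I know DockerSpawner does not build a `docker run` command line at all. It talks to the Docker API through the Python Docker client, so there is no literal command to find in a log. A sketch of a `jupyterhub_config.py` fragment that at least makes the Hub log more of what it does while spawning (the choice of settings is my assumption about what is useful here, not something confirmed in this thread):

```python
# jupyterhub_config.py (fragment)
c = get_config()  # noqa: F821  (get_config is provided by JupyterHub when it loads this file)

# Spawn single-user servers as Docker containers
c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"

# Log Hub and spawner activity at debug level
c.JupyterHub.log_level = "DEBUG"

# Keep stopped containers around so they can be examined after a failed start
c.DockerSpawner.remove = False
```

With the container kept around, `docker inspect <container>` shows the command and environment it was started with, and `docker logs <container>` shows why it fell over.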

There is one additional requirement: the service running on the given ip/port must be able to run on a URL prefix, e.g. /user/abc123/. Note that for BinderHub, this excludes authentication, etc. With the switch to traefik-proxy coming in the future, activity tracking, which is critical for Binder, will stop working unless we take action.
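
To illustrate the URL prefix requirement, here is a minimal sketch of a standalone web service that only answers under a prefix, written with the Python standard library. It assumes the prefix is read from the `JUPYTERHUB_SERVICE_PREFIX` environment variable that JupyterHub sets for spawned servers, and that the container is expected to listen on port 8888; `make_server` and `PrefixHandler` are hypothetical names, and this is not how jupyterhub-singleuser itself is implemented.

```python
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_server(prefix=None, port=8888, host="0.0.0.0"):
    """Build an HTTP server that only serves paths under `prefix`."""
    if prefix is None:
        # Assumption: JupyterHub exports the per-user prefix, e.g. /user/abc123/
        prefix = os.environ.get("JUPYTERHUB_SERVICE_PREFIX", "/")
    stripped = prefix.strip("/")
    # Normalise to a leading and trailing slash, e.g. "/user/abc123/"
    prefix = f"/{stripped}/" if stripped else "/"

    class PrefixHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path.startswith(prefix):
                rest = self.path[len(prefix):]
                body = f"hello from {prefix} (subpath: {rest!r})\n".encode()
                status = 200
            else:
                body = b"not under the service prefix\n"
                status = 404
            self.send_response(status)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the sketch quiet

    return ThreadingHTTPServer((host, port), PrefixHandler)
```

Inside a container this would be started with `make_server().serve_forever()`. An application that cannot be told to live under a prefix like this (many desktop-style web apps assume they own `/`) is the kind of thing that will not work behind the Hub's proxy without some rewriting layer in front of it.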

So I would say that JupyterHub assumes that it’s running a jupyterhub-singleuser process, while BinderHub assumes that it’s at least a jupyter-notebook for now.

Seems like it’s now possible: New package to run arbitrary web service in JupyterHub (jhsingle-native-proxy)