New package to run an arbitrary web service in JupyterHub (jhsingle-native-proxy)

Just in case it is useful to anyone else, I wanted to let you know about an alternative to jupyter-server-proxy that I’ve been working on.

Whereas jupyter-server-proxy allows you to run an arbitrary web service within a Jupyter notebook environment (which can also work inside a JupyterHub singleuser notebook), my jhsingle-native-proxy package runs entirely without a dependency on Jupyter notebook.

So this must run within a JupyterHub environment, but it serves the arbitrary web service as a direct (‘native’) replacement for the jupyter-singleuser image. There is no notebook server started at all.

It enforces OAuth web authentication based on the usual JUPYTERHUB_* env vars.
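
As a rough illustration of what "based on the usual JUPYTERHUB_* env vars" can mean, here is a minimal sketch that collects the values JupyterHub normally sets for a single-user server. The helper name, the dictionary keys, and the example values are hypothetical, and this is not taken from the jhsingle-native-proxy source.

```python
import os

# The variable names are ones JupyterHub sets for a single-user server.
# The dictionary keys and the helper below are hypothetical.
HUB_ENV_VARS = {
    "client_id": "JUPYTERHUB_CLIENT_ID",
    "api_token": "JUPYTERHUB_API_TOKEN",
    "api_url": "JUPYTERHUB_API_URL",
    "user": "JUPYTERHUB_USER",
    "service_prefix": "JUPYTERHUB_SERVICE_PREFIX",
}


def read_hub_settings(environ=os.environ):
    """Collect the JUPYTERHUB_* values and list any variables that are unset."""
    settings = {key: environ.get(var) for key, var in HUB_ENV_VARS.items()}
    missing = sorted(
        var for key, var in HUB_ENV_VARS.items() if settings[key] is None
    )
    return settings, missing


# Made-up values, for illustration only (the API token is left out on purpose).
example_env = {
    "JUPYTERHUB_CLIENT_ID": "jupyterhub-user-dan",
    "JUPYTERHUB_API_URL": "http://hub:8081/hub/api",
    "JUPYTERHUB_USER": "dan",
    "JUPYTERHUB_SERVICE_PREFIX": "/user/dan/",
}
settings, missing = read_hub_settings(example_env)
print(settings["service_prefix"])
print(missing)
```

A check like this could fail fast with a clear message when the process is started outside JupyterHub, where these variables would be absent.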

More details, and examples for running e.g. a streamlit server, are on the GitHub page.

Please let me know your thoughts if this is useful to you, or if you have questions about how to get it running.

Nice! Do you have an example that can be launched on mybinder.org?

Good question… I have tried with a basic Dockerfile here: https://github.com/danlester/jupyterhub-singleuser-streamlit-native

But in the first instance it seems that between them repo2docker and BinderHub are installing notebook and running that!

In my particular use case, I’m building the Docker images manually (but similar to the repo linked above) and then launching them from JupyterHub just with the image set in DockerSpawner.

If I get a chance to look closer I’ll see if I can pick apart the reasons why Binder is not respecting this. Or anyone else please let me know!

I think mybinder and repo2docker override the CMD in the Docker image. You might be able to get around this by figuring out what command repo2docker runs and overwrite that command with your own script in the image, e.g. install a “fake” Jupyter notebook command that actually runs your application.

I was confused why notebook was being installed in the image at all, but it turned out to be an obscure dependency via ipywidgets in streamlit (the arbitrary web service I was using as an example).

And then as you suggested, BinderSpawner forces a CMD of jupyter notebook, so the latest version of the test Binder repo https://github.com/danlester/jupyterhub-singleuser-streamlit-native wraps everything in a new entrypoint that just looks for the --port argument and passes that to the underlying jhsingle-native-proxy process.
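
The wrapper idea can be sketched as follows. The real repo uses a shell entrypoint; this Python version is only an illustration, and the function names and the example argument list are hypothetical. The jhsingle-native-proxy arguments are copied from the streamlit command shown later in this thread.

```python
def extract_port(argv, default="8888"):
    """Return the value of --port, accepting both '--port N' and '--port=N'."""
    port = default
    for i, arg in enumerate(argv):
        if arg.startswith("--port="):
            port = arg.split("=", 1)[1]
        elif arg == "--port" and i + 1 < len(argv):
            port = argv[i + 1]
    return port


def build_proxy_command(argv):
    """Ignore the forced notebook command and keep only the port it carries."""
    return [
        "jhsingle-native-proxy",
        "--destport", "8506",
        "streamlit", "hello",
        "{--}server.port", "{port}",
        "--port", extract_port(argv),
    ]


# Made-up example of a command line forced onto the container.
forced = ["jupyter", "notebook", "--ip=0.0.0.0", "--port=8888"]
print(build_proxy_command(forced))
```

A real entrypoint would then replace the current process with the resulting command, for example via `os.execvp`.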

That gets us a bit further but the OAuth doesn’t seem to work - I just get a 403 from JupyterHub itself I think. Maybe some env vars are lost, or there’s something else specific to BinderHub that I’m missing. I’ll take another look!

Thank you for your thoughts.

This repo now works on mybinder.org:

The problem was that JupyterHub in BinderHub doesn’t allow authentication at all, let alone via OAuth. So that’s why I was getting 403 compared to a standard JupyterHub which OAuthed successfully.

Jupyter Notebook just relies on the ?token query parameter for authenticating the session, but of course doesn’t actually need the concept of a named user. So it is public other than requiring the query token, i.e. is not restricted in terms of JupyterHub auth.

So I have updated jhsingle-native-proxy to allow an --authtype=none argument just to completely turn off the OAuth. This now works with mybinder.org but of course it would be even better to implement a ?token GET parameter for protection.
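
For reference, a `?token` check of the kind mentioned here could look roughly like the sketch below. It is not implemented in jhsingle-native-proxy as far as this thread says; the function name and the example token are hypothetical, and a real implementation would also need to decide how the token is generated and handed to the user.

```python
import hmac
from urllib.parse import parse_qs, urlsplit


def token_is_valid(request_url, expected_token):
    """Check the ?token query parameter of a request URL against a known token."""
    query = parse_qs(urlsplit(request_url).query)
    supplied = query.get("token", [""])[0]
    if not supplied or not expected_token:
        # A missing or empty token never authenticates.
        return False
    # Constant-time comparison, so timing does not leak how much matched.
    return hmac.compare_digest(
        supplied.encode("utf-8"), expected_token.encode("utf-8")
    )


print(token_is_valid("/user/dan/?token=abc123", "abc123"))
print(token_is_valid("/user/dan/", "abc123"))
```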

Cool!

For my education: what do the {--} in the CMD of the Dockerfile do?

Great question!

Actually, the CMD in the Dockerfile is overridden by BinderHub when run that way. So the relevant line is actually from entrypoint.sh:

I’ll work on a simplified version of this command.

Let’s say that the underlying service will be:

streamlit hello --server.port 8506

You might attempt to run jhsingle-native-proxy like this:

jhsingle-native-proxy --destport 8506 --port 8888 streamlit hello --server.port {port}

Where {port} will be substituted as 8506 by jhsingle-native-proxy when it runs the command.

That doesn’t work because jhsingle-native-proxy thinks that --server.port {port} is intended for an argument for jhsingle-native-proxy rather than its underlying command. It will say that --server.port is not a valid option.

So the usual solution is a double dash to signify the end of arguments and start of the command:

jhsingle-native-proxy --destport 8506 --port 8888 -- streamlit hello --server.port {port}

That works fine.

BUT when JupyterHub attempts to run jupyter notebook, it appends a --port argument to the very end of the command it runs. Really jhsingle-native-proxy needs to pick that up as its own argument, but because it comes after the double dash it appears to be intended for the underlying streamlit command.
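
The problem with the plain double dash can be seen in a small sketch of the usual convention, where everything after the first `--` belongs to the underlying command. The helper below is hypothetical and is not the parser jhsingle-native-proxy actually uses.

```python
def split_at_double_dash(argv):
    """Split argv into (own options, underlying command) at the first '--'."""
    if "--" not in argv:
        return list(argv), []
    i = argv.index("--")
    return argv[:i], argv[i + 1:]


base = ["--destport", "8506", "--", "streamlit", "hello", "--server.port", "{port}"]

# JupyterHub appends --port at the very end, i.e. after the double dash.
own, command = split_at_double_dash(base + ["--port", "8888"])
print(own)
print(command)
```

Here `--port 8888` ends up in `command` rather than in `own`, so the proxy would never see it as its own option.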

So the solution instead is that {--} is simply substituted with -- to inform jhsingle-native-proxy to pass it on to the underlying command. Any further --options (without {--}) may still be picked up by jhsingle-native-proxy itself.

Thus, this will work and be equivalent to the previous one above, but --port is allowed to come at the end and still be picked up by jhsingle-native-proxy rather than passed on to the underlying command:

jhsingle-native-proxy --destport 8506 streamlit hello {--}server.port {port} --port 8888

This tells jhsingle-native-proxy that we want a destport of 8506, and an incoming port of 8888. So requests to 8888 will be proxied to port 8506, and the process that we hope to receive those will be run as: streamlit hello --server.port 8506
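
The substitution step described above can be sketched in a few lines. This is only an illustration of the idea, not the package's actual code, and the function name is hypothetical.

```python
import shlex


def expand_placeholders(command, destport):
    """Replace {--} with -- and {port} with the destination port in each argument."""
    replacements = {"{--}": "--", "{port}": str(destport)}
    expanded = []
    for arg in command:
        for placeholder, value in replacements.items():
            arg = arg.replace(placeholder, value)
        expanded.append(arg)
    return expanded


# The underlying command as written on the jhsingle-native-proxy command line.
template = ["streamlit", "hello", "{--}server.port", "{port}"]
print(shlex.join(expand_placeholders(template, 8506)))
```

Because `{--}server.port` does not start with a dash, an option parser has no reason to treat it as one of its own options, which is why a trailing `--port 8888` can still be claimed by jhsingle-native-proxy.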

I hope that makes sense - maybe I can bring some of this into the readme…

Thanks @danlester for sharing, this looks great!

Have you been able to try it with other examples than streamlit? Wondering if that would work with a voila standalone app as well.

Yes, definitely! I have added a standalone Voila Dockerfile example which works as part of an authenticated JupyterHub:

There’s just a bit of fiddling to make sure voila knows it’s running at the /user/dan URL subfolder (or similar).

I’ve pushed this example to Docker Hub at ideonate/jupyterhub-singleuser-voila-native if you want to try it out.

Please let me know your thoughts!

Thanks @danlester, I’ll check it out!

I guess this could also be running on Binder as well? (similar to the streamlit Binder example)

Go on then, since you mentioned it… I’ve put together an example using Voila in Binder:

There is a direct link to mybinder.org on the GitHub readme homepage.

But, yes, it’s much the same as the Streamlit example.

I think it’s worth pointing out that the Binder examples are highly theoretical… There is no need to use jhsingle-native-proxy at all in this case since we aren’t doing any auth. You can just run voila/streamlit directly, as in this example:

It would make sense to use jhsingle-native-proxy within Binder IF it implemented the ?token GET param style of authentication to wrap a layer of protection around voila or whichever web app is running. Although in voila’s case I think it would make sense to implement the ?token auth directly in voila (if it isn’t already there - I couldn’t see it).

Maybe the Binder examples have taken us off track - the point of jhsingle-native-proxy is really the OAuth wrapper for use in authenticated JupyterHub environments.

For me, where the use case is in an edu / teaching setting, though it equally applies to research, the current situation is broadly as follows:

  • instructor on course1 wants to provide a Jupyter notebook fronted Python environment to students;
  • fights with IT and eventually gets a JupyterHub service they can use, partly because:
  • instructor on course2 also wants to provide students with a notebook fronted R environment;
  • instructor on course3 wants students to use RStudio with an R environment, but reads around a bit and finds they can use jupyter-server-proxy to give access to RStudio via the Jupyter notebook UI. It’s a bit clunky, but they can work with it, and once students bookmark the proxied RStudio URL, it’s less clunky.

What jhsingle-native-proxy allows is for instructor3 to ask their IT folk for just an RStudio container, and IT can serve it to them via the JupyterHub layer, which is already the students’ access point to containerised (notebook) services.

JupyterHub is now a provider of arbitrary containerised web apps, selected by the user from a predefined dropdown list, to multiple authenticated users, not just of notebook server fronted containers.
