Setting up JupyterHub users to launch a virtual framebuffer (such as Xvfb) at environment startup

I am experimenting with JupyterHub so I can properly document for my users how to set up a JupyterHub server for my project. In the TLJH setup process, I am unable to figure out how to replicate what Binder’s start script does.

The end goal is to set up JupyterHub with a working installation of py5. This Python library is unique in that it requires a JVM and a display of some kind. A virtual framebuffer such as Xvfb works just fine. I can get all of this working using Binder. There is a working repo here:

However, when I follow the TLJH setup instructions, I have difficulty connecting those instructions with what I see in the working Binder setups.

I have installed all of the necessary system and Python packages. The problem has to do with duplicating whatever the start script does for Binder (binder/start in the previous repo):

#!/bin/bash

/usr/bin/Xvfb :0 -screen 0 1024x768x24 &
export DISPLAY=":0"
exec "$@"

I can hack it by editing /home/jupyter-<user>/.bashrc to launch Xvfb, but that is not a solution. It does confirm that my problem has to do with launching Xvfb and setting the DISPLAY environment variable.

I believe I have to change or add something in /opt/tljh/ but I can’t find where. Any advice?

Also, are there any security issues with using something like this? I want to make sure each user has their own Xvfb instance rather than everybody sharing the same one.

There is also this other repo created by others in the JupyterHub community that works:

And if it helps, here are the relevant setup instructions for py5.

Can anyone help with this? Am I in the wrong forum for this question? Do I need to add any information?

Hi! I suspect the lack of an answer is because no-one else has tried what you’re doing with TLJH. I’m not aware of any examples of using an X server with JupyterHub that aren’t in a container.

I guess I was too specific in how I phrased the question. I really just need to know how to get TLJH to run an arbitrary shell script for each user, much like Binder does with its start script. The Xvfb stuff is just extra detail explaining what this problem is a part of. Am I scaring off readers with the discussion thread title?

That clarifies things :smiley:
TLJH lets you add arbitrary configuration, though this is not supported: Custom configuration snippets — The Littlest JupyterHub documentation

For example, you can override the spawn command to run a script, e.g.
/opt/tljh/config/jupyterhub_config.d/custom.py:

c.Spawner.cmd = "/usr/local/bin/custom-start.sh"

/usr/local/bin/custom-start.sh:

#!/bin/sh
set -x

date >> /tmp/start.txt
id >> /tmp/start.txt
env >> /tmp/start.txt

exec /opt/tljh/user/bin/jupyterhub-singleuser "$@"

Make sure the script is executable, and reload JupyterHub:

tljh-config reload hub

If you run cat /tmp/start.txt in a terminal inside your server you should see the expected output.

TLJH uses the Systemd Spawner, which means the environment isn’t the same as that of a standard login user. This may affect whether an X server will work. I don’t know whether Xvfb is secured from other users.

Thank you for your detailed reply. I created the script and was able to get it to work.

The script I created is as follows:

#!/bin/sh
set -x

/usr/bin/Xvfb :0 -screen 0 1024x768x24 &
export DISPLAY=":0"

exec /opt/tljh/user/bin/jupyterhub-singleuser "$@"

I restarted TLJH and it worked. There seems to be a bug with this script, though: when a second user logs in, the script attempts to start a second instance of Xvfb on :0, which fails. The script then continues with the export command and the final command that starts jupyterhub-singleuser. Everything still works, but now both users are sharing the same Xvfb instance. The script could be improved by picking a new, unused display number each time it runs, but then I’d have to come up with a way to shut down each Xvfb instance when its user logs out. Is that even possible? This could quickly become more trouble than it is worth.
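One way that improvement could be sketched, under some assumptions: X servers such as Xvfb accept a -displayfd option that makes the server pick a free display number itself and write it to a file descriptor, and a wrapper that stays alive as the parent of jupyterhub-singleuser (instead of using exec) can stop Xvfb once the server exits. The Python wrapper and its function names below are hypothetical, not part of TLJH, and this is untested under the Systemd Spawner.

```python
#!/usr/bin/env python3
"""Hypothetical wrapper helpers: one private Xvfb per single-user server."""
import os
import signal
import subprocess
import sys

# Paths taken from the scripts earlier in this thread.
XVFB = "/usr/bin/Xvfb"
SINGLEUSER = "/opt/tljh/user/bin/jupyterhub-singleuser"


def start_xvfb(xvfb_cmd=XVFB):
    """Start Xvfb on a free display; return (process, display string)."""
    read_fd, write_fd = os.pipe()
    proc = subprocess.Popen(
        [xvfb_cmd, "-displayfd", str(write_fd), "-screen", "0", "1024x768x24"],
        pass_fds=(write_fd,),  # let the child inherit the write end
    )
    os.close(write_fd)  # only the child keeps the write end open now
    with os.fdopen(read_fd) as pipe:
        number = pipe.readline().strip()
    if not number:
        proc.wait()
        raise RuntimeError("Xvfb exited without reporting a display number")
    return proc, ":" + number


def run_with_private_display(command, xvfb_cmd=XVFB):
    """Run command with DISPLAY set to a fresh Xvfb; stop Xvfb afterwards."""
    # Turn SIGTERM into SystemExit so the finally block below still runs.
    signal.signal(signal.SIGTERM, lambda signum, frame: sys.exit(143))
    xvfb, display = start_xvfb(xvfb_cmd)
    try:
        env = dict(os.environ, DISPLAY=display)
        return subprocess.call(command, env=env)
    finally:
        xvfb.terminate()
        xvfb.wait()
```

A wrapper file built on this could end with sys.exit(run_with_private_display([SINGLEUSER, *sys.argv[1:]])) and be set as c.Spawner.cmd in place of the shell script.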

For what I am trying to use TLJH for, everyone sharing the same display would work just fine for basic users writing Python code from the notebook interface. But any user with knowledge of Linux would be able to take screenshots of that shared screen and could see what everyone else is doing, or at least whichever windows were on top. I just demoed this using ffmpeg and x11grab. And if I improved the script to give every user their own Xvfb instance, they could still use ffmpeg and x11grab to take screenshots of the other screens by looking up the display numbers of the other running Xvfb instances, so there is no security improvement from starting multiple instances.

Perhaps I misunderstood some things about TLJH. I imagined TLJH would be more like Binder, where the Docker containers strictly partition users from each other and no user can access anything beyond what is in their own container. That doesn’t seem to be the case here, and it is probably not the situation that TLJH is built for.

I think the best choice for running my project on TLJH is to have just one instance of Xvfb running, which can launch from cron or something when the machine starts up. Then I’d just need to make sure each user has the same DISPLAY environment variable that points to the Xvfb instance. This is really only good for situations where the users aren’t worried about other hostile users monitoring what they are doing though. If that level of security is required, probably the Kubernetes JupyterHub installation is more appropriate. What do you think?

If I only want to set a DISPLAY environment variable for each user, is there a simpler way to do that than that custom script? A way that is supported?
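For only setting the variable, one possibility is JupyterHub's Spawner.environment setting, which adds extra environment variables to each single-user server's process. A minimal sketch, assuming a single shared Xvfb is already running on :0 and that the snippet lives in its own file next to the custom.py example above (the file name display.py is hypothetical). It is still a custom configuration snippet, so the same caveat about support applies:

```python
# /opt/tljh/config/jupyterhub_config.d/display.py (hypothetical file name)
# Assumes one shared Xvfb instance is already listening on display :0.
# update() merges DISPLAY into whatever environment is already configured.
c.Spawner.environment.update({"DISPLAY": ":0"})
```

After tljh-config reload hub, each user's server would need to be restarted before it sees the new variable, since the environment is applied when the server is spawned.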

Also this page on TLJH Security Considerations was helpful. It seems what I am doing here with Xvfb weakens the security of TLJH by trading security for a simpler setup. For some use-cases that will not be OK, so this issue will need to be called out in the documentation.

TLJH doesn’t use containers for running singleuser servers, everything runs on a single machine.

A few people have managed to get DockerSpawner working with TLJH, but this isn’t currently supported. For example, see Using DockerSpawner with The Littlest JupyterHub – Ideonate – Tools for Data Scientists