Configuring JupyterHub to use two VMs

Hello,

I have a beginner’s question.
I’ve installed JupyterHub on a Red Hat virtual machine (RHEL 9) with Python 3.9.16.
I followed this tutorial for the installation: tuto
I don’t use Conda, only pip.

My pip list shows the following versions:
jupyter_client 7.4.9
jupyter_core 5.3.1
jupyter-events 0.7.0
jupyter-lsp 2.2.0
jupyter_server 2.7.2
jupyter_server_fileid 0.9.0
jupyter-server-mathjax 0.2.6
jupyter_server_terminals 0.4.4
jupyter_server_ydoc 0.8.0
jupyter-telemetry 0.1.0
jupyter-ydoc 0.2.5
jupyterhub 4.0.2
jupyterhub-idle-culler 1.2.1
jupyterhub-ldapauthenticator 1.3.2
jupyterlab 3.6.6
jupyterlab_git 0.44.0
jupyterlab-pygments 0.2.2
jupyterlab_server 2.24.0
jupyterlab-widgets 3.0.8

That virtual machine has one GPU, which is used by the developers.
They need a second one to run more of their work in parallel.
The infrastructure doesn’t allow us to add a second GPU to the VM, but the person in charge tells me he could create a second VM with its own GPU.
I can’t deploy JupyterHub on Kubernetes, because the hardware is not connected to the Kubernetes infrastructure for the moment.

That is the situation.

My questions:

  • How can I configure JupyterHub to use two VMs without having two JupyterHub installations?
  • If it’s done with configurable-http-proxy, how do I do it? (I’m not a network specialist and I don’t know how to configure a load balancer.)
  • If it’s done with the spawner, which one should I use? For the moment I use the default, jupyterhub.spawner.LocalProcessSpawner.

I hope I’ve described my issue clearly enough.
I can’t install Docker on this VM, but Podman might be possible.

The spawner is responsible for launching the single-user server, including deciding where to launch it on a multi-node system. I can’t think of an easy way to do it without Kubernetes or some other scheduling system, so you’ll need to write a custom spawner.
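
As a rough illustration of the “deciding where to launch” part, a custom spawner could track how many single-user servers each VM is running and pick the least loaded one. The sketch below is plain Python, not JupyterHub API; the host names `vm1` and `vm2` and the function `pick_host` are hypothetical.

```python
from typing import Dict


def pick_host(running_servers: Dict[str, int]) -> str:
    """Return the host currently running the fewest single-user servers."""
    if not running_servers:
        raise ValueError("no hosts configured")
    # Iterate over sorted names so that ties are resolved deterministically
    return min(sorted(running_servers), key=lambda host: running_servers[host])


# Hypothetical state: vm1 already runs three servers, vm2 runs one
chosen = pick_host({"vm1": 3, "vm2": 1})
```

A real spawner would still have to start the process on the chosen host and report its address back to the Hub; this only covers the placement decision.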

It’s probably easier to install JupyterHub independently on both VMs. If you use OAuthenticator for logins it’ll be relatively seamless for the user to log in to both servers.

Thanks Manics for your answer. I understand it’s not easy with the constraints I have.
I’ll look into the different spawners available to see what to do.

In the meantime I asked ChatGPT, and it told me this:

To add a second virtual machine (VM) to your JupyterHub deployment on Red Hat 9, you will need to configure JupyterHub to support multiple Jupyter server instances. Here’s a general approach to doing this:

Install and configure a reverse proxy: Use a reverse proxy such as Nginx or Apache to redirect traffic to different JupyterHub instances based on the requested URL.

Configure user and server management: Configure JupyterHub to manage multiple users and assign specific Jupyter servers to each user or user group.

Configure Jupyter server spawning: Configure JupyterHub to be able to start and manage multiple Jupyter servers based on demand.

Configure communication between VMs: Ensure that VMs can communicate with each other if necessary, for example by configuring appropriate firewall rules.

Testing and monitoring: Once configured, test the deployment to ensure users can access the Jupyter servers on the different VMs and monitor the system for possible issues.

Please note that these steps are general and may require specific configurations depending on your infrastructure and needs. Refer to the JupyterHub documentation and resources specific to your environment for detailed instructions on configuring JupyterHub to support multiple VMs.

“in the same time I ask chatGPT to see, and it tells me that”

Remember, ChatGPT doesn’t know anything about anything, and LLMs can never be reliably used as an information retrieval system. The best it can ever hope to do is a first guess at what an answer might look like, which you can then attempt to confirm with a trustworthy source. None of the points it provided to you are specific enough to be helpful, and those that are specific are wrong.

  • You don’t need multiple hubs. A reverse proxy may be fine, but it doesn’t contribute to a solution to this problem
  • The rest of what it says is essentially “configure JupyterHub to do what you want by configuring it to do what you want”, which is no help at all

For a real answer:

The Spawner itself can be used to handle the communication between the VMs. For example, sshspawner (which hasn’t been updated in some time, but should show you roughly what you need) launches users on one or more machines via ssh.

In short, JupyterHub doesn’t need to know how your Spawner launches processes, it only cares about the URL where it can be connected in the end.

There is already a proxy involved, so as long as the two VMs can talk to each other, the only hurdle you have is to start processes on one machine from the other. SSH is one way to do that. Docker Swarm might be the next choice, as it works to span machines, too, and is not too hard to set up on a few VMs, at least compared to something like Kubernetes.
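
To make the SSH idea a bit more concrete, here is a minimal sketch of the two pieces such a spawner needs: a command that starts the single-user server on the other VM, and the URL that is reported back to the Hub. It assumes `jupyterhub-singleuser` is installed on the remote VM and that key-based SSH works without a prompt. The user name, host name and port are placeholders, and the helper names `remote_start_command` and `server_url` are hypothetical, not part of sshspawner or JupyterHub.

```python
import shlex
from typing import List


def remote_start_command(user: str, host: str, port: int) -> List[str]:
    """Build the ssh argv that launches a single-user server on `host`."""
    remote = ["jupyterhub-singleuser", "--ip=0.0.0.0", f"--port={port}"]
    # ssh hands the remote command to a shell on the other side, so quote it
    return ["ssh", f"{user}@{host}", shlex.join(remote)]


def server_url(host: str, port: int) -> str:
    """URL where the Hub's proxy can reach the remote single-user server."""
    return f"http://{host}:{port}"


# Hypothetical values for illustration only
cmd = remote_start_command("alice", "vm2", 8901)
url = server_url("vm2", 8901)
```

A real spawner would also have to forward the Hub’s environment (API token and related `JUPYTERHUB_*` variables) to the remote process and implement stopping and polling; those parts are left out here.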


“You don’t need multiple hubs, a reverse proxy may be fine, but doesn’t contribute to a solution to this problem”

You are right. In my head, I wanted load balancing so that both GPUs could be used.

I tested the reverse proxy changes but they didn’t work, so I looked at the spawner, and with the LocalProcessSpawner it will be very hard.

Thanks for the names of the other spawners. I’m currently reading about the Docker spawner to see if it’s compatible with Podman.

I’ll check the sshspawner too; it may be better suited to my case.

As @minrk suggested, the SSH spawner would be a good fit for your use case if you have SSH access from VM1 to VM2. We use an SSH spawner for our deployment, which uses user-specific SSH certificates instead of a generic SSH key pair for authentication. You can look into the original sshspawner and our implementation, which should give you an idea of how to implement your case.

I’ve tested the original sshspawner and a modified version that uses a YAML file to define the VM for each user.
JupyterHub starts well, but when I log in as the admin user on the two VMs and the two JupyterHub instances, the servers for this user fail to start and throw a “permission denied” error.
I don’t understand why, because the SSH connection between the two VMs works and the permissions are the same everywhere.
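
A per-user VM mapping of the kind described above could be reduced to a lookup like the following. This is only a sketch: the real file is YAML and its layout is not shown here, so the dictionary contents, the `default` fallback and the name `host_for_user` are all assumptions.

```python
from typing import Dict


def host_for_user(mapping: Dict[str, str], username: str, default: str) -> str:
    """Return the VM assigned to `username`, or `default` if none is listed."""
    return mapping.get(username, default)


# Hypothetical content, as it might look after loading the YAML file
user_hosts = {"alice": "vm1", "bob": "vm2"}

assigned = host_for_user(user_hosts, "bob", default="vm1")
fallback = host_for_user(user_hosts, "carol", default="vm1")
```

Falling back to a default host means a user missing from the file still gets a server instead of an error; whether that is desirable depends on the deployment.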

I’ll try your solution to see if it works better for my case.

Otherwise, I’ll be forced to develop my own spawner, and that will take some time.