Hi, I have a JupyterHub instance deployed on an OpenShift (Kubernetes) cluster, let’s call it `cluster_cpu`. I also have another cluster, `cluster_gpu`, and I want to spawn pods on `cluster_gpu` directly from the JupyterHub running on `cluster_cpu`.
Is there a way to achieve this without deploying a separate JupyterHub instance on `cluster_gpu`? If so, could you guide me on how to configure it?
manics
September 16, 2024, 10:39pm
2
JupyterHub is designed to be very customisable, so you could write your own spawner, perhaps using KubeSpawner as inspiration.
I think this would be a cool feature if KubeSpawner could be extended to do that.
I saw an open issue about that subject:
opened 03:45PM - 12 Jul 21 UTC
enhancement
### Proposed change
Right now, the kubernetes pod is spawned in the same cluster as the hub pod. It would be great if we can configure it to be spawned in other remote clusters. One hub can then spawn into different cloud regions, which is very helpful when dealing with cloud datasets.
The kubernetes API can easily be accessed remotely, but the hub and proxy pod need to find a way to send traffic to the user pod. We can find ways to tunnel this traffic through without much work. My favorite way is to use `kubectl port-forward`, also used by my earlier experiments with [accessing dask-kubernetes remotely](https://words.yuvi.in/post/dask-local-kubernetes/) and now [dask-kubernetes](https://github.com/dask/dask-kubernetes/blob/6da9413cdfeac686217ccae3ae29189e04e16144/dask_kubernetes/utils.py#L85) itself.
### Alternative options
1. Deploy one hub per cluster users want to spawn into. This is more complicated logistically, and for the user.
2. Make a `Service` object for each pod, and expose it to the internet via a `LoadBalancer`. This can receive traffic from the hub and proxy pod.
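
To make option 2 more concrete, here is a minimal sketch of what a per-pod `Service` manifest of type `LoadBalancer` could look like, built as a plain Python dict. The function name, the `-lb` name suffix, the `example.org/pod-name` selector label, and port 8888 are all assumptions for illustration; the issue does not specify any of them, and the user pod would need to carry whatever label the selector uses.

```python
import json


def loadbalancer_service_manifest(pod_name: str, namespace: str, port: int = 8888) -> dict:
    """Build a Service manifest that exposes a single user pod via a LoadBalancer.

    Hypothetical helper: the selector label below is an assumption, not a
    label that KubeSpawner is known to set.
    """
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": f"{pod_name}-lb", "namespace": namespace},
        "spec": {
            "type": "LoadBalancer",
            # Route traffic only to the one pod carrying this (assumed) label.
            "selector": {"example.org/pod-name": pod_name},
            "ports": [{"protocol": "TCP", "port": port, "targetPort": port}],
        },
    }


if __name__ == "__main__":
    # Print the manifest as JSON, which `kubectl apply -f -` also accepts.
    print(json.dumps(loadbalancer_service_manifest("jupyter-alice", "jhub"), indent=2))
```

A manifest like this would have to be created and deleted alongside each user pod, which is part of why this option is heavier than tunnelling.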
### Who would use this feature?
Anyone interested in accessing compute near datasets stored across multiple cloud providers or regions.
### (Optional): Suggest a solution
- [ ] Override [`get_pod_url`](https://github.com/jupyterhub/kubespawner/blob/fd62c917bc3b429cbc81ba5e70fccb2f938748d6/kubespawner/spawner.py#L1741) to start a `kubectl port-forward` on a free port, to the pod IP on the remote cluster
- [ ] Make sure that `c.JupyterHub.hub_connect_url` is something that the pod can connect to. This could be over https on the public internet, or something else.
- [ ] Figure out how to specify which kubernetes cluster the API will need to connect to
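
The first item in the checklist above could be sketched roughly as follows: pick a free local port, then start `kubectl port-forward` against the remote cluster's kubeconfig context. This is only an illustration under assumptions, not KubeSpawner's implementation. The helper names, the context name `cluster_gpu`, the namespace `jhub`, and remote port 8888 are hypothetical, and it assumes `kubectl` is installed where the hub runs with a kubeconfig that contains the remote cluster.

```python
import socket
import subprocess


def find_free_port() -> int:
    """Ask the OS for a local TCP port that is unused right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 lets the OS choose
        return s.getsockname()[1]


def port_forward_command(pod_name, namespace, context, local_port, remote_port=8888):
    """Build the kubectl command that tunnels local_port to the remote pod."""
    return [
        "kubectl",
        "--context", context,      # selects the remote cluster in kubeconfig
        "--namespace", namespace,
        "port-forward",
        f"pod/{pod_name}",
        f"{local_port}:{remote_port}",
    ]


def start_port_forward(pod_name, namespace, context, remote_port=8888):
    """Start the tunnel; return the process and the URL it should serve on."""
    local_port = find_free_port()
    cmd = port_forward_command(pod_name, namespace, context, local_port, remote_port)
    proc = subprocess.Popen(cmd)  # caller must terminate this when the pod stops
    return proc, f"http://127.0.0.1:{local_port}"


if __name__ == "__main__":
    # Only print the command here; starting it needs a reachable cluster.
    print(" ".join(port_forward_command("jupyter-alice", "jhub", "cluster_gpu", find_free_port())))
```

A custom spawner could return the resulting URL from its `get_pod_url` override and stop the process when the pod is deleted. Note that `kubectl port-forward` listens on localhost by default, so if the proxy runs in a different pod from the hub, the tunnel would also need the `--address` flag and a reachable hub address. There is also a small window between choosing the port and `kubectl` binding it in which another process could take it.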