Remote execution of code using GPUs in Jupyter


We are using JupyterHub in an OpenShift cluster where we create a separate Jupyter server for each user, and they can spawn notebooks after selecting the required configuration. We provide an option to choose whether or not to attach a GPU to the notebook.
Currently, when a user chooses to use a GPU, a whole GPU is allocated to their notebook, which in our observation is more than they need most of the time. Also, since the GPUs in the cluster are limited, users sometimes have to wait for GPUs that are in use by other users to become free.

So we were thinking of a way to reduce this waste, and we came across the idea of remote execution of code in Jupyter, i.e. running a piece of code (maybe a single notebook cell) on a remote GPU while the rest of the notebook runs on a CPU. This would keep the GPUs free most of the time and available to all users.

I have gone through a few similar topics here but could not find a stable, working solution, or maybe I have missed a topic that has one.
Can someone please help me find a solution to this?

Dear @Harvy,

We have the same problems and are currently looking into two ways of solving it:

  1. Time-slicing a GPU: Time-Slicing GPUs in Kubernetes — NVIDIA Cloud Native Technologies documentation

  2. Submitting notebooks to a GPU queue with something like Papermill: Home - papermill 2.4.0 documentation
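For (1), a sketch of the sharing configuration the NVIDIA device plugin consumes, based on the linked docs. The `replicas` value here is illustrative: with `replicas: 4`, each physical GPU is advertised to Kubernetes as four schedulable `nvidia.com/gpu` resources, so four notebooks can share one card:

```yaml
version: v1
sharing:
  timeSlicing:
    resources:
      - name: nvidia.com/gpu
        replicas: 4  # one physical GPU appears as 4 schedulable GPUs
```

Note that time-slicing provides no memory isolation between the sharing pods, so it suits trusted, bursty notebook workloads better than hard multi-tenancy.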

Keen to hear how others solved this problem



Without inventing anything new here, one could consider running a persistent, GPU-enabled workload scheduler which Jupyter kernels (and other clients) could use more or less transparently.

My go-to example of this would be Dask, which can be deployed in many ways. The advantage of the Dask API is that one can write code that works in the browser (almost) all the way up to humorously-large clusters.
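A minimal sketch of that idea, assuming the `distributed` package is installed. The in-process cluster here is for illustration; in production, `Client("tcp://scheduler:8786")` would point the same unchanged code at a persistent, GPU-backed Dask cluster (the scheduler address is hypothetical):

```python
from dask.distributed import Client

# In-process cluster for illustration; swap the argument for the address
# of a shared GPU-enabled scheduler and the rest of the code is unchanged.
client = Client(processes=False)

# Work runs wherever the scheduler's workers live; the notebook only holds
# a future until .result() pulls the value back.
future = client.submit(sum, [1, 2, 3])
result = future.result()
print(result)

client.close()
```

The notebook's CPU-only kernel stays lightweight, and the GPU workers are shared across all users instead of being pinned to one notebook.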
