GPU-enabled JupyterHub with Kubernetes Cluster

Hello,

I have access to GPU-enabled hardware (NSF Jetstream2 cloud), and I am able to launch VMs and run NVIDIA-based Docker containers, such as the one below, without issue on those GPU VMs.

docker run --gpus all cschranz/gpu-jupyter:v1.4_cuda-11.6_ubuntu-20.04 nvidia-smi
Wed Sep 21 17:16:22 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.85.02    Driver Version: 510.85.02    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GRID A100X-8C       On   | 00000000:00:06.0 Off |                    0 |
| N/A   N/A    P0    N/A /  N/A |      1MiB /  8192MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

My aim is to launch a JupyterHub-on-Kubernetes cluster with those GPU VMs as worker nodes. (I've launched and run many of these clusters with CPU nodes in the past, so I am comfortable in that area.) This time, however, I would like a GPU-enabled Zero to JupyterHub (Z2JH) cluster. To do that, I need the equivalent of passing --gpus all to the Docker image above, but I don't know how to achieve that. Searching around I did find this reference, but I am not sure how to make use of that code snippet. I did try the code below in the configuration YAML file, but it failed with import errors, and I believe it is the wrong approach anyway.

hub:
  extraConfig:
    cuda: |
        import docker

        c.DockerSpawner.extra_host_config = {
            "device_requests": [
                docker.types.DeviceRequest(
                    count=-1,
                    capabilities=[["gpu"]],
                ),
            ],
        }

Does anyone know how to supply the equivalent of --gpus all when launching a singleuser Docker image via JupyterHub on Kubernetes?

Z2JH uses KubeSpawner, not DockerSpawner, which explains the import errors: the docker Python package isn't installed in the hub image, and DockerSpawner settings would have no effect there anyway. In Kubernetes, GPUs are requested as pod resources rather than via a Docker CLI flag.
There’s some information in the Z2JH docs on enabling GPUs with some public cloud providers.
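
The examples there boil down to a resource limit on the singleuser pods, which isn't specific to any one cloud. A minimal sketch for your values.yaml (untested; the image is just the one from your docker run test, and the nvidia.com/gpu resource only exists once the cluster knows about its GPUs):

singleuser:
  # A CUDA-enabled user image, e.g. the one you already tested with docker run
  image:
    name: cschranz/gpu-jupyter
    tag: v1.4_cuda-11.6_ubuntu-20.04
  profileList:
    - display_name: "GPU server"
      description: "Spawns a notebook server with access to one GPU"
      kubespawner_override:
        # KubeSpawner turns this into a pod-level resources.limits entry,
        # the Kubernetes-side equivalent of docker run --gpus
        extra_resource_limits:
          nvidia.com/gpu: "1"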

If that doesn’t work, then I think it’s best to try manually creating a GPU-enabled pod in your K8s cluster, then copy the required settings over to Z2JH/KubeSpawner.
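
For example, something like this smoke-test pod, adapted from the Kubernetes docs on scheduling GPUs (the image tag is just an example CUDA base image):

apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:11.6.2-base-ubuntu20.04
      command: ["nvidia-smi"]
      resources:
        limits:
          # Only schedulable on a node that advertises this resource
          nvidia.com/gpu: 1

If kubectl logs gpu-smoke-test prints the same nvidia-smi table you got from docker run, the cluster side is working, and that same resources limit is exactly what the extra_resource_limits setting above generates.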

Thanks @manics. I examined option #1, but the only clouds covered are the commercial providers, and I am running on OpenStack (NSF Jetstream2). I tried anyway but got 0/2 nodes are available: 2 Insufficient nvidia.com/gpu. I think there is a Kubernetes piece missing on Jetstream2 that AWS / GCE / Azure provide out of the box, presumably whatever makes the nodes advertise the nvidia.com/gpu resource to the scheduler. Option #2 might be a possibility, though what I was really hoping for was an escape hatch for passing extra Docker CLI arguments as a dictionary (e.g., {"--gpus" : "all"}), and I don't think that option exists. Anyway, I may skip the Kubernetes route for now until the technology matures. In general, I've found getting the NVIDIA software to access the necessary GPU hardware quite challenging; I've only really had luck with the NVIDIA-supplied Docker containers (or containers that use those as base images).
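
For the record, my current understanding is that the missing piece is NVIDIA's Kubernetes device plugin, which the managed clouds deploy for you and which is what makes nodes advertise nvidia.com/gpu. If I revisit this, installing it by hand would look roughly like the following (untested on Jetstream2; the release tag is only an example, the nodes also need the NVIDIA driver and nvidia-container-toolkit set up first, and the NVIDIA/k8s-device-plugin README has the current instructions):

# Deploy the device plugin DaemonSet so that nodes advertise nvidia.com/gpu
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.12.2/nvidia-device-plugin.yml

# Verify that the resource now shows up on the GPU nodes
kubectl describe nodes | grep nvidia.com/gpu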
