GPU not detected in Jupyter notebook on a GPU-enabled Kubernetes cluster

I have a GPU enabled Kubernetes cluster in which I can run pods that actually use the GPU. This has been verified via GPU test pods, running ML workflows that use the GPU etc.

In other words, I’m confident that the cluster is configured properly since when running a GPU enabled workload, I can monitor the GPU activity/processes, etc via nvidia-smi.

I have JupyterHub installed on the cluster and I’ve followed the instructions in the zero 2 hero docs, as shown below.

The problem is, when I launch a notebook and run the basic tests such as:
!pip install torch
import torch
torch.cuda.is_available()

It returns False.
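A False result from torch alone does not say whether the GPU is missing from the container or whether only the PyTorch build is at fault. The following is a minimal sketch of a few extra checks that can be run in a notebook cell. It assumes a Linux container; the function name gpu_visibility_report is made up for illustration, and the NVIDIA_VISIBLE_DEVICES variable is only commonly (not always) set when the NVIDIA container runtime is in use.

```python
import glob
import os
import shutil


def gpu_visibility_report():
    """Collect basic signals about whether this container can see an NVIDIA GPU."""
    return {
        # Path to nvidia-smi if it is on PATH inside the container, else None.
        "nvidia_smi_path": shutil.which("nvidia-smi"),
        # Device nodes such as /dev/nvidia0 and /dev/nvidiactl, if present.
        "device_nodes": sorted(glob.glob("/dev/nvidia*")),
        # Often set when the NVIDIA container runtime is used (assumption).
        "NVIDIA_VISIBLE_DEVICES": os.environ.get("NVIDIA_VISIBLE_DEVICES"),
    }


report = gpu_visibility_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

If nvidia-smi is not found and there are no /dev/nvidia* device nodes, the GPU is probably not being exposed to the container at all, which points at the pod configuration rather than at PyTorch.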

On this particular cluster, I have one NVIDIA GPU. When I launch a notebook/server and request a GPU, the notebook/server spins up, but the above test fails.

If I try to launch a second notebook/server while the first is running, I see an error stating that no GPUs are available. This is expected, since the cluster only has one GPU and it’s allocated to the first notebook/server that was spun up.

Therefore, due to the above, I believe my jupyterhub-config.yaml is accurate.

Once again, the problem is that basic commands within the notebook to detect a GPU return False.

Note, I’ve tried restarting the notebook kernel to no avail.

I’m likely missing something simple and any help is GREATLY appreciated.

Once again, the Kubernetes cluster is configured properly, as my other pods that use GPUs are able to do so… just not the notebook ??

This is what I used from the zero 2 hero guide:

singleuser:
  profileList:
    - display_name: "GPU Server"
      description: "Spawns a notebook server with access to a GPU"
      kubespawner_override:
        extra_resource_limits:
          nvidia.com/gpu: "1"

I’m posting my solution here in the hope that it will help someone else.

In addition to modifying the singleuser: attribute as in the above post, I needed to add the following:

hub:
  revisionHistoryLimit:
  config:
    KubeSpawner:
      extra_pod_config:
        runtimeClassName: nvidia

IMO, it would be GREAT if the above config/KubeSpawner reference were added to the section of the zero 2 hero docs on deploying and using JupyterHub on a GPU-enabled k8s instance.