I created a node with a GPU. Overall, everything is configured correctly: when I create a plain static pod and add the limit nvidia.com/gpu: 1, everything works as expected.
But when I try to do the same thing in the profileList of JupyterHub, adding a GPU request/limit like this:
extra_resource_requests:
  nvidia.com/gpu: 1
I get the following error in kubectl describe:
Warning FailedScheduling 2m14s default-scheduler 0/8 nodes are available: 1 Insufficient nvidia.com/gpu.
preemption: 0/8 nodes are available: 1 No preemption victims found for incoming pod, 7 Preemption is not helpful for scheduling.
To clarify: the resources are definitely there; as I already mentioned, creating pods directly works fine. This behavior only appears when the pod is spawned through JupyterHub.
P.S. I already tried the obvious thing: switching from jupyter-scheduler to default-scheduler. It didn't help.
Hey there! KubeSpawner doesn't have an extra_resource_requests option; the options are extra_resource_guarantees and extra_resource_limits. Setting those instead may resolve your issue.
Hi! Yeah, sorry, my fault! I meant that I'm trying this:
- display_name: "ML with GPU"
  description: "Ubuntu 22.04"
  kubespawner_override:
    scheduler_name: default-scheduler
    image: some_image_with_nvidia_smi
    extra_resource_limits:
      nvidia.com/gpu: 1
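For reference, in a Zero to JupyterHub deployment this block normally sits under singleuser.profileList in the Helm values; a minimal sketch (the image name is a placeholder):
singleuser:
  profileList:
    - display_name: "ML with GPU"
      description: "Ubuntu 22.04"
      kubespawner_override:
        image: some_image_with_nvidia_smi  # placeholder image
        extra_resource_limits:
          nvidia.com/gpu: 1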
Hmm, that's odd. Did you take a look at the Customizing User Resources — Zero to JupyterHub with Kubernetes documentation and verify that the command to check GPU availability succeeds?
Yes! The command to check GPU availability succeeds:
kubectl get nodes -o=custom-columns=NAME:.metadata.name,GPUs:.status.capacity.'nvidia\.com/gpu'
NAME GPUs
gpu-node 2
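One more check that may help (a sketch, using the same key-escaping convention): compare capacity with allocatable, since allocatable can drop to 0 if the NVIDIA device plugin is unhealthy on that node, which also produces Insufficient nvidia.com/gpu:
kubectl get nodes -o=custom-columns=NAME:.metadata.name,CAPACITY:.status.capacity.'nvidia\.com/gpu',ALLOCATABLE:.status.allocatable.'nvidia\.com/gpu'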
manics
January 21, 2026, 11:00am
Can you show us the full YAML for the pod?
Full YAML of the test pod:
apiVersion: v1
kind: Pod
metadata:
name: gpu-test
spec:
containers:
- name: cuda
image: nvidia/cuda:13.0.0-base-ubuntu22.04
command: ["nvidia-smi", "-l", "5"]
resources:
limits:
nvidia.com/gpu: 1
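For completeness, a quick way to exercise this test pod (assuming the manifest above is saved as gpu-test.yaml):
kubectl apply -f gpu-test.yaml
kubectl logs -f gpu-test   # should print nvidia-smi output every 5 seconds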
manics
January 21, 2026, 1:01pm
Sorry, I meant the full YAML of the failing jupyter pod (kubectl get pod jupyter-… -o yaml)
apiVersion: v1
kind: Pod
metadata:
annotations:
hub.jupyter.org/jupyterhub-version: 5.4.3
hub.jupyter.org/kubespawner-version: 7.0.0
hub.jupyter.org/username: ikol006
creationTimestamp: "2026-01-22T03:06:14Z"
labels:
app: jupyterhub
app.kubernetes.io/component: singleuser-server
app.kubernetes.io/instance: jupyterhub
app.kubernetes.io/managed-by: kubespawner
app.kubernetes.io/name: jupyterhub
chart: jupyterhub-4.3.2
component: singleuser-server
helm.sh/chart: jupyterhub-4.3.2
hub.jupyter.org/network-access-hub: "true"
hub.jupyter.org/servername: ""
hub.jupyter.org/username: ikol006
release: jupyterhub
name: jupyter-ikol006
namespace: jupyterhub
resourceVersion: "930369902"
uid: ccc01307-09bb-4b2a-b870-88c82764d874
spec:
affinity:
nodeAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- preference:
matchExpressions:
- key: hub.jupyter.org/node-purpose
operator: In
values:
- user
weight: 100
automountServiceAccountToken: false
containers:
- args:
- jupyterhub-singleuser
env:
- name: JPY_API_TOKEN
value: 66sdfe0e9e7af47a91b43205aa6dc8c4
- name: JUPYTERHUB_ACTIVITY_URL
value: http://hub:8081/hub/api/users/ikol006/activity
- name: JUPYTERHUB_ADMIN_ACCESS
value: "1"
- name: JUPYTERHUB_API_TOKEN
value: 66sdfe0e9e7af47a91b43205aa6dc8c4
- name: JUPYTERHUB_API_URL
value: http://hub:8081/hub/api
- name: JUPYTERHUB_BASE_URL
value: /
- name: JUPYTERHUB_CLIENT_ID
value: jupyterhub-user-ikol006
- name: JUPYTERHUB_COOKIE_HOST_PREFIX_ENABLED
value: "0"
- name: JUPYTERHUB_DEBUG
value: "1"
- name: JUPYTERHUB_HOST
- name: JUPYTERHUB_OAUTH_ACCESS_SCOPES
value: '["access:servers!server=ikol006/", "access:servers!user=ikol006"]'
- name: JUPYTERHUB_OAUTH_CALLBACK_URL
value: /user/ikol006/oauth_callback
- name: JUPYTERHUB_OAUTH_CLIENT_ALLOWED_SCOPES
value: '[]'
- name: JUPYTERHUB_OAUTH_SCOPES
value: '["access:servers!server=ikol006/", "access:servers!user=ikol006"]'
- name: JUPYTERHUB_PUBLIC_HUB_URL
- name: JUPYTERHUB_PUBLIC_URL
- name: JUPYTERHUB_SERVER_NAME
- name: JUPYTERHUB_SERVICE_PREFIX
value: /user/ikol006/
- name: JUPYTERHUB_SERVICE_URL
value: http://0.0.0.0:8888/user/ikol006/
- name: JUPYTERHUB_USER
value: ikol006
- name: JUPYTER_IMAGE
value: ml-with-gpu-notebook:x86_64-ubuntu-22.04
- name: JUPYTER_IMAGE_SPEC
value: ml-with-gpu-notebook:x86_64-ubuntu-22.04
- name: MEM_GUARANTEE
value: "1073741824"
- name: PIP_INDEX_URL
value: https://nexus.local/repository/pypi-proxy/simple
- name: PIP_TIMEOUT
value: "60"
- name: PIP_TRUSTED_HOST
value: nexus.local
- name: SSL_CERT_FILE
value: /etc/ssl/certs/ca-certificates.crt
image: ml-with-gpu-notebook:x86_64-ubuntu-22.04
imagePullPolicy: Always
lifecycle:
postStart:
exec:
command:
- sh
- -c
- |
cp /etc/ssl/certs/ca-certificates.crt /opt/conda/lib/python3.11/site-packages/certifi/cacert.pem
name: notebook
ports:
- containerPort: 8888
name: notebook-port
protocol: TCP
resources:
limits:
nvidia.com/gpu: "1"
requests:
memory: "1073741824"
nvidia.com/gpu: "1"
securityContext:
allowPrivilegeEscalation: false
runAsUser: 1000
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /etc/ssl/certs/ca-certificates.crt
name: files
subPath: ca-certificates.crt
- mountPath: /home/jovyan
name: volume-ikol006
dnsPolicy: ClusterFirst
enableServiceLinks: true
initContainers:
- command:
- iptables
- --append
- OUTPUT
- --protocol
- tcp
- --destination
- 111.127.222.127
- --destination-port
- "80"
- --jump
- DROP
image: quay.io/jupyterhub/k8s-network-tools:4.3.2
imagePullPolicy: IfNotPresent
name: block-cloud-metadata
resources: {}
securityContext:
capabilities:
add:
- NET_ADMIN
privileged: true
runAsUser: 0
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
preemptionPolicy: PreemptLowerPriority
priority: 1000
priorityClassName: develop
restartPolicy: OnFailure
schedulerName: default-scheduler
securityContext:
fsGroup: 100
serviceAccount: default
serviceAccountName: default
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoSchedule
key: hub.jupyter.org/dedicated
operator: Equal
value: user
- effect: NoSchedule
key: hub.jupyter.org_dedicated
operator: Equal
value: user
- effect: NoSchedule
key: nvidia.com/gpu
operator: Exists
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
- effect: NoSchedule
key: node.kubernetes.io/memory-pressure
operator: Exists
volumes:
- name: files
secret:
defaultMode: 420
items:
- key: ca-certificates.crt
mode: 420
path: ca-certificates.crt
secretName: singleuser
- name: volume-ikol006
persistentVolumeClaim:
claimName: claim-ikol006
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2026-01-22T03:06:15Z"
message: '0/8 nodes are available: 1 Insufficient nvidia.com/gpu. preemption:
0/8 nodes are available: 1 No preemption victims found for incoming pod, 7 Preemption
is not helpful for scheduling..'
reason: Unschedulable
status: "False"
type: PodScheduled
phase: Pending
qosClass: Burstable
manics
January 22, 2026, 7:35pm
Can you try creating your gpu-test pod with these requests as well as limits?
From JupyterHub, or just as a plain standalone pod?
manics
January 23, 2026, 3:23pm
The test pod that you said was working (the one from #7 by metaIhead above).
Try creating that pod immediately after your JupyterHub pod fails to spawn
The test pod is created successfully under any circumstances, even right after a Jupyter pod fails to spawn or crashes.
Creating a test pod with only requests and no limits doesn't work:
The Pod "gpu-test" is invalid: spec.containers[0].resources.limits: Required value: Limit must be set for non overcommitable resources
If I specify only limits in the test pod, the requests section is filled in automatically and appears in the manifest obtained via kubectl get pods -o yaml.
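That matches how extended resources behave in Kubernetes: they cannot be overcommitted, so the request must equal the limit, and when only the limit is set the API server defaults the request to match. A minimal sketch of the container fragment:
resources:
  limits:
    nvidia.com/gpu: 1
  # after admission, the stored manifest also shows:
  # requests:
  #   nvidia.com/gpu: "1"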
I recall some complexity around this: declare only the nvidia limit, not the request, and see if it works.
Yes, that's correct: I specify only limits, and the requests are added automatically in the pod manifest.
manics
January 28, 2026, 11:02am
This will be quite tedious, but can you manually (not via JupyterHub) create a pod based on the full YAML of the failing jupyter pod that you posted above, but:
Change args to jupyterlab, which should allow the pod to run as a standalone JupyterLab server
Delete the fields managed by Kubernetes (e.g. creationTimestamp, status, etc.)
Presumably that should fail with the same error as in your first post (Insufficient nvidia.com/gpu). Then try removing fields until eventually you approach your working example:
Full YAML of the test pod:
apiVersion: v1
kind: Pod
metadata:
name: gpu-test
spec:
containers:
- name: cuda
image: nvidia/cuda:13.0.0-base-ubuntu22.04
command: ["nvidia-smi", "-l", "5"]
resources:
limits:
nvidia.com/gpu: 1
I removed the Kubernetes-managed metadata such as status, creationTimestamp, and so on, and also removed the PVC attachment. After that, the pod started on the required node where GPU access is available.
So in the end, the manifest looked like this:
apiVersion: v1
kind: Pod
metadata:
annotations:
hub.jupyter.org/jupyterhub-version: 5.4.3
hub.jupyter.org/kubespawner-version: 7.0.0
hub.jupyter.org/username: ikol006
labels:
app: jupyterhub
app.kubernetes.io/component: singleuser-server
app.kubernetes.io/instance: jupyterhub
app.kubernetes.io/managed-by: kubespawner
app.kubernetes.io/name: jupyterhub
chart: jupyterhub-4.3.2
component: singleuser-server
helm.sh/chart: jupyterhub-4.3.2
hub.jupyter.org/network-access-hub: "true"
hub.jupyter.org/servername: ""
hub.jupyter.org/username: ikol006
release: jupyterhub
name: jupyter-ikol006
namespace: test
spec:
affinity:
nodeAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- preference:
matchExpressions:
- key: hub.jupyter.org/node-purpose
operator: In
values:
- user
weight: 100
automountServiceAccountToken: false
containers:
- args:
- jupyterhub-singleuser
env:
- name: JPY_API_TOKEN
value: Laiv5Jie0Fie9Ue3auf5ohnge
- name: JUPYTERHUB_ACTIVITY_URL
value: http://hub:8081/hub/api/users/ikol006/activity
- name: JUPYTERHUB_ADMIN_ACCESS
value: "1"
- name: JUPYTERHUB_API_TOKEN
value: Laiv5Jie0Fie9Ue3auf5ohnge
- name: JUPYTERHUB_API_URL
value: http://hub:8081/hub/api
- name: JUPYTERHUB_BASE_URL
value: /
- name: JUPYTERHUB_CLIENT_ID
value: jupyterhub-user-ikol006
- name: JUPYTERHUB_COOKIE_HOST_PREFIX_ENABLED
value: "0"
- name: JUPYTERHUB_DEBUG
value: "1"
- name: JUPYTERHUB_HOST
- name: JUPYTERHUB_OAUTH_ACCESS_SCOPES
value: '["access:servers!server=ikol006/", "access:servers!user=ikol006"]'
- name: JUPYTERHUB_OAUTH_CALLBACK_URL
value: /user/ikol006/oauth_callback
- name: JUPYTERHUB_OAUTH_CLIENT_ALLOWED_SCOPES
value: '[]'
- name: JUPYTERHUB_OAUTH_SCOPES
value: '["access:servers!server=ikol006/", "access:servers!user=ikol006"]'
- name: JUPYTERHUB_PUBLIC_HUB_URL
- name: JUPYTERHUB_PUBLIC_URL
- name: JUPYTERHUB_SERVER_NAME
- name: JUPYTERHUB_SERVICE_PREFIX
value: /user/ikol006/
- name: JUPYTERHUB_SERVICE_URL
value: http://0.0.0.0:8888/user/ikol006/
- name: JUPYTERHUB_USER
value: ikol006
- name: JUPYTER_IMAGE
value: registry.mycompany.com/internals/jupyterhub/ml-with-gpu-notebook:x86_64-ubuntu-22.04
- name: JUPYTER_IMAGE_SPEC
value: registry.mycompany.com/internals/jupyterhub/ml-with-gpu-notebook:x86_64-ubuntu-22.04
- name: MEM_GUARANTEE
value: "1073741824"
- name: PIP_INDEX_URL
value: https://repo.mycompany.com/repository/pypi-proxy/simple
- name: PIP_TIMEOUT
value: "60"
- name: PIP_TRUSTED_HOST
value: repo.mycompany.com
- name: SSL_CERT_FILE
value: /etc/ssl/certs/ca-certificates.crt
image: jupyterhub/ml-with-gpu-notebook:x86_64-ubuntu-22.04
imagePullPolicy: Always
lifecycle:
postStart:
exec:
command:
- sh
- -c
- |
cp /etc/ssl/certs/ca-certificates.crt /opt/conda/lib/python3.11/site-packages/certifi/cacert.pem
name: notebook
ports:
- containerPort: 8888
name: notebook-port
protocol: TCP
resources:
limits:
nvidia.com/gpu: "1"
requests:
memory: "1073741824"
nvidia.com/gpu: "1"
securityContext:
allowPrivilegeEscalation: false
runAsUser: 1000
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
# volumeMounts:
# - mountPath: /etc/ssl/certs/ca-certificates.crt
# name: files
# subPath: ca-certificates.crt
# - mountPath: /home/jovyan
# name: volume-ikol006
dnsPolicy: ClusterFirst
enableServiceLinks: true
initContainers:
- command:
- iptables
- --append
- OUTPUT
- --protocol
- tcp
- --destination
- 111.222.333.444
- --destination-port
- "80"
- --jump
- DROP
image: jupyterhub/k8s-network-tools:4.3.2
imagePullPolicy: IfNotPresent
name: block-cloud-metadata
resources: {}
securityContext:
capabilities:
add:
- NET_ADMIN
privileged: true
runAsUser: 0
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
preemptionPolicy: PreemptLowerPriority
priority: 1000
priorityClassName: develop
restartPolicy: OnFailure
schedulerName: default-scheduler
securityContext:
fsGroup: 100
serviceAccount: default
serviceAccountName: default
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoSchedule
key: hub.jupyter.org/dedicated
operator: Equal
value: user
- effect: NoSchedule
key: hub.jupyter.org_dedicated
operator: Equal
value: user
- effect: NoSchedule
key: nvidia.com/gpu
operator: Exists
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
- effect: NoSchedule
key: node.kubernetes.io/memory-pressure
operator: Exists
# volumes:
# - name: files
# secret:
# defaultMode: 420
# items:
# - key: ca-certificates.crt
# mode: 420
# path: ca-certificates.crt
# secretName: singleuser
# - name: volume-ikol006
# persistentVolumeClaim:
# claimName: claim-ikol006
manics
February 3, 2026, 10:53am
What storage provider are you using? Are persistent volumes always mountable from all nodes?
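One thing worth checking (a sketch; claim-ikol006 is the claim name from the manifest above): a bound PV can carry node affinity that excludes the GPU node, which would keep the pod from scheduling there:
kubectl -n jupyterhub get pvc claim-ikol006 -o jsonpath='{.spec.volumeName}'
kubectl get pv <volume-name-from-above> -o yaml   # look for spec.nodeAffinity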
Is there a DaemonSet related to storage that runs on most nodes, but not on the GPU node because it doesn't tolerate a taint? If so, make it tolerate the taint and try again!
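A quick way to check (a sketch; the CSI namespace and DaemonSet names vary by driver, so treat them as placeholders):
kubectl get ds -A | grep -i csi   # find the storage/CSI node DaemonSet
kubectl get pods -A -o wide | grep -i csi   # is one running on the GPU node?
kubectl -n <csi-namespace> get ds <csi-node-daemonset> -o jsonpath='{.spec.template.spec.tolerations}'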
I’m using the vSphere CSI driver, but I removed the PVC from the manifest since I didn’t create any PVCs for the pod.