This might be because I don't understand how the single-user server works (or Kubernetes scheduling, for that matter…), but I exhaust the GPU limit as soon as I start a single server:
My config looks like this:
```yaml
singleuser:
  storage:
    dynamic:
      storageClass: openebs-zfspv
  nodeSelector:
    node-role.kubernetes.io/worker: worker
  image:
    name: my-singleuser-gpu-image
    tag: v1.5.0
  extraResource:
    limits:
      nvidia.com/gpu: 1
```
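To rule out the chart dropping the setting somewhere, I also checked what the spawned pod actually requests. A quick sketch, assuming a `jhub` namespace and the usual `jupyter-<username>` pod naming (adjust both to your release):

```bash
# Show the resource requests/limits the spawner actually put on the pod
# (namespace and pod name are placeholders):
kubectl -n jhub get pod jupyter-<username> \
  -o jsonpath='{.spec.containers[0].resources}'
```

The output confirmed the pod carries `nvidia.com/gpu: 1` in its limits.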
`nvidia-smi` within a notebook for the first user shows only one of the eight GPUs, so that server should not be using all of them.
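For reference, this is how I verified it from the notebook's terminal (assuming the NVIDIA container runtime is in use, which sets `NVIDIA_VISIBLE_DEVICES` on the container):

```bash
# Inside the single-user container: the device plugin should expose only
# the one allocated GPU, not all eight.
nvidia-smi -L                    # expect a single "GPU 0: ..." line
echo "$NVIDIA_VISIBLE_DEVICES"   # UUID/index of the GPU granted to this pod
```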
Yet the resources on the node look OK:
```
Allocatable:
  cpu:                256
  ephemeral-storage:  397152651836
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             2101193248Ki
  nvidia.com/gpu:     8
  pods:               110
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests         Limits
  --------           --------         ------
  cpu                100m (0%)        100m (0%)
  memory             1126170624 (0%)  50Mi (0%)
  ephemeral-storage  0 (0%)           0 (0%)
  hugepages-1Gi      0 (0%)           0 (0%)
  hugepages-2Mi      0 (0%)           0 (0%)
  nvidia.com/gpu     1                1
```
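To double-check where the other seven GPUs supposedly went, one can sum the GPU limits of every pod scheduled on the node. A sketch, assuming `jq` is installed (`<node>` is a placeholder for the node name):

```bash
# Sum nvidia.com/gpu limits over all containers scheduled on the node:
kubectl get pods --all-namespaces \
  --field-selector spec.nodeName=<node> -o json \
  | jq '[.items[].spec.containers[].resources.limits["nvidia.com/gpu"]
         // "0" | tonumber] | add'
```

In my case this also summed to 1, which matches the `Allocated resources` output above.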
Does anyone have an idea what I am missing here?
I restarted the master node after reading this:
(Quoted below: a kubernetes/kubernetes issue, opened 29 Sep 2016, closed 14 Apr 2021; labels: sig/scheduling, area/nodecontroller, lifecycle/rotten.)
**Kubernetes version** (use `kubectl version`):
```
Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.7", GitCommit:"a2cba278cba1f6881bb0a7704d9cac6fca6ed435", GitTreeState:"clean", BuildDate:"2016-09-12T23:15:30Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.7", GitCommit:"a2cba278cba1f6881bb0a7704d9cac6fca6ed435", GitTreeState:"clean", BuildDate:"2016-09-12T23:08:43Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
```
**Environment**:
- **Cloud provider or hardware configuration**: AWS, masters (Count: 3, Size: m3.medium), minions (Count: 5, Size: m4.xlarge)
- **OS** (e.g. from /etc/os-release): Ubuntu 14.04.5 LTS (Trusty Tahr)
- **Kernel** (e.g. `uname -a`): Master: 3.13.0-95-generic Minion: 4.4.0-38-generic
- **Install tools**: Ansible using modified contrib playbooks: https://github.com/kubernetes/contrib/tree/master/ansible
- **Others**:
**What happened**: When scheduling pods with a low CPU resource request (15m), we receive the message "Insufficient CPU" from every node attempting to schedule the pod. We are using multi-container pods, and `kubectl describe` shows nodes with resources available to schedule the pods. However, k8s refuses to schedule on any node.
[kubectl_output.txt](https://github.com/kubernetes/kubernetes/files/501431/kubectl_output.txt)
**What you expected to happen**:
**How to reproduce it** (as minimally and precisely as possible):
Below is a sample manifest that we can use to reproduce the problem.
[manifest.txt](https://github.com/kubernetes/kubernetes/files/501443/manifest.txt)
Scheduling succeeds up to about 10-14 pods, and then we run into this problem. See the graph below:
![screen shot 2016-09-29 at 11 30 56 am](https://cloud.githubusercontent.com/assets/5752233/18967327/9a73acde-8639-11e6-9b93-1e545fa3d94f.png)
It works now after the restart, so this was not related to Jupyter.
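Follow-up for anyone who lands here with the same symptom: before restarting the master, it is worth checking why the scheduler leaves the pod Pending, since the reason usually appears in the pod's events. A minimal sketch (pod name and namespace are placeholders):

```bash
# Print only the Events section of the pending pod's description:
kubectl -n jhub describe pod jupyter-<username> | sed -n '/Events:/,$p'
```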