GPU usage from all nodes while using JupyterHub on Kubernetes

Hello folks,
I have a Kubernetes cluster with 1 master and 3 worker nodes. All 3 worker nodes have NVIDIA GPUs. I am planning to use DockerSpawner to spin up new notebook containers on this cluster. Now I am interested in how GPU resource allocation works here: if I run a deep learning model in one of the notebooks, it should be able to use the GPUs from the other nodes as well. I want to distribute my ML workload here.

My question is: will the combination of (Kubernetes + JupyterHub + DockerSpawner) work for my use case, or should I look for an alternative architecture? Please also let me know what other alternatives could help with my use case.

cc: @mahendrapaipuri @markperri @consideRatio

If you use Kubernetes, you need KubeSpawner and not DockerSpawner, and in practice you should probably use the JupyterHub Helm chart.
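For the per-user GPU allocation part, KubeSpawner can request GPUs for each user pod through the standard Kubernetes device-plugin resources. Here is a minimal sketch of a `jupyterhub_config.py`, assuming the NVIDIA device plugin is installed on the workers so they advertise `nvidia.com/gpu` (the image name below is a hypothetical placeholder):

```python
# jupyterhub_config.py -- minimal sketch, not a complete Hub configuration.
c = get_config()  # provided by JupyterHub when it loads this file

# Spawn each user's notebook server as a pod on the Kubernetes cluster.
c.JupyterHub.spawner_class = "kubespawner.KubeSpawner"

# Hypothetical single-user image with the CUDA stack installed.
c.KubeSpawner.image = "my-registry/notebook-gpu:latest"

# Ask Kubernetes for one GPU per user pod; the scheduler places the pod on
# whichever worker node has a free GPU, and the pod sees only that node's GPU.
c.KubeSpawner.extra_resource_guarantees = {"nvidia.com/gpu": "1"}
c.KubeSpawner.extra_resource_limits = {"nvidia.com/gpu": "1"}
```

The Helm chart exposes equivalent GPU requests under its `singleuser` configuration, so you don't have to write the spawner config by hand.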

If I run a deep learning model in one of the notebooks, it should be able to use the GPUs from the other nodes as well. I want to distribute my ML workload here.

This is hard: a notebook pod can only use the GPUs on the node it is scheduled to, so you need some software strategy involving k8s awareness to handle that.
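To make that concrete, a quick check inside the spawned notebook pod (assuming PyTorch is installed in the image) only reports the GPUs attached to that pod on its own node:

```python
import torch

# Counts only the GPUs the Kubernetes scheduler assigned to this pod on its
# own node; GPUs on the other worker nodes are not visible from here.
print(torch.cuda.device_count())
```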


Yes, you will need to use a k8s Job (or something similar) that launches pods on each node to distribute your ML workload. I don't think KubeSpawner supports k8s Jobs.
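As a rough illustration of that idea, here is a sketch that submits a Job with one pod per GPU node using the official Kubernetes Python client. The image, command, Job name, and namespace are hypothetical placeholders, and it assumes the notebook's service account is allowed to create Jobs and that the NVIDIA device plugin exposes `nvidia.com/gpu`:

```python
# Sketch: launch a 3-pod training Job from inside a notebook pod.
from kubernetes import client, config

config.load_incluster_config()  # we are running inside the cluster

container = client.V1Container(
    name="trainer",
    image="my-registry/trainer:latest",   # hypothetical training image
    command=["python", "train.py"],       # hypothetical entry point
    resources=client.V1ResourceRequirements(
        limits={"nvidia.com/gpu": "1"}    # one GPU per worker pod
    ),
)

job = client.V1Job(
    api_version="batch/v1",
    kind="Job",
    metadata=client.V1ObjectMeta(name="distributed-train"),
    spec=client.V1JobSpec(
        parallelism=3,   # one worker pod per GPU node
        completions=3,
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                containers=[container],
                restart_policy="Never",
            )
        ),
    ),
)

client.BatchV1Api().create_namespaced_job(namespace="jhub", body=job)
```

Note this only covers scheduling the pods; the training code inside them still needs a distributed framework (e.g. PyTorch distributed or Horovod) to actually coordinate work across the nodes.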

Kubeflow might be an option for your use case, and it supports Jupyter notebooks.