Hello folks,
I have a Kubernetes cluster with 1 master and 3 worker nodes. All 3 worker nodes have NVIDIA GPUs. I am planning to use DockerSpawner to spin up new notebook containers on this cluster. Now I am interested in how GPU resources get allocated here. If I run a deep learning model in one of the notebooks, it should be able to use the GPUs from the other nodes as well. I want to distribute my ML workload.
My question is: will the combination of (Kubernetes + JupyterHub + DockerSpawner) work for my use case, or should I look for an alternative architecture? Also, please let me know what other alternatives could help with my use case.
> If I run a deep learning model in one of the notebooks, it should be able to use the GPUs from the other nodes as well. I want to distribute my ML workload.
This is hard, and you will need a software strategy that is Kubernetes-aware to handle it.
Yes, you will need to use a Kubernetes Job (or something similar) that launches pods across the nodes to distribute your ML workload. I don't think KubeSpawner supports Kubernetes Jobs.
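As a rough sketch of what that could look like (the name, image, and entrypoint below are placeholders, not from this thread): a Kubernetes Job can request one GPU per pod via the `nvidia.com/gpu` resource and fan out across the GPU workers with `parallelism`, assuming the NVIDIA device plugin is installed on each node:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: train-job                 # hypothetical name
spec:
  parallelism: 3                  # run 3 pods at once, one per GPU worker
  completions: 3
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: trainer
          image: my-training-image:latest   # placeholder image
          command: ["python", "train.py"]   # placeholder entrypoint
          resources:
            limits:
              nvidia.com/gpu: 1   # requires the NVIDIA device plugin on each node
```

Note that each pod only sees its own node's GPU; your training code still has to coordinate the distributed work (e.g. via a framework's distributed training support), which is exactly the part that plain JupyterHub spawners don't handle for you.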
Kubeflow might be an option for your use case, and it supports Jupyter notebooks.