Newer administrator of a JupyterHub cluster here. We are running a few different JupyterHub clusters, one of which is in AWS using Amazon’s hosted Kubernetes service. There’s an issue we’ve begun to hit with nodes that autoscale: the hub-db-dir persistent volume attached to the hub container gets created in one availability zone, and then the hub can’t spin up on any of the available nodes, since they may be in a different availability zone. Here’s a GitHub thread where this is discussed:
Has anyone experimented with a more persistent storage option, such as pointing this at an NFS share the way user storage can be?
Had the same desire but for a different reason: Azure disks are slow to provision and I got tired of waiting a minute or two for it to mount each time I upgraded the cluster. Turns out the trick is to create a persistent volume, persistent volume claim, and override the hub-db-dir value in config.yaml.
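For anyone following along, a minimal sketch of what that override can look like with the zero-to-jupyterhub chart’s hub.db.pvc values (the exact keys depend on your chart version, and the size here is a placeholder):

```yaml
# config.yaml (excerpt) -- illustrative only.
# Setting storageClassName to "" keeps Kubernetes from dynamically
# provisioning a disk, so the hub-db-dir claim binds to a pre-created PV.
hub:
  db:
    type: sqlite-pvc
    pvc:
      storageClassName: ""
      storage: 1Gi
```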
Thanks for the reply @akaszynski! Great to see that I’m not trying to reinvent the wheel. Are you deploying this on Azure using Helm? Looking at your post on GitHub, I have something similar set up; however, I’m still having issues with the deployment.
I have two config files that I pass to run the deployment. First is the PVC for Kubernetes:
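Roughly along these lines (this is an illustrative sketch rather than my exact file; the NFS server address, path, namespace, and size are placeholders):

```yaml
# pvc.yaml (sketch) -- a pre-created PV backed by an NFS/EFS export,
# plus a claim for the hub database directory.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: hub-db-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: fs-12345678.efs.us-east-1.amazonaws.com
    path: /hub-db
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: hub-db-dir        # must match the name the chart expects
  namespace: jhub
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""    # bind to the pre-created PV, don't provision
  volumeName: hub-db-pv
  resources:
    requests:
      storage: 1Gi
```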
I’ve included just the hub portion to keep things tidy. When I go to deploy this, Helm has an issue because hub-db-dir is already specified in the jupyterhub/jupyterhub Helm chart. Did you happen to overcome this, or did you deploy in a different fashion to circumvent it?
@akaszynski I was able to get around this after some troubleshooting throughout the day. The workflow I found is that you need to do the initial deployment without changing hub-db-dir at all, and then do an upgrade to override it. So for me it looked like this:
Deploy basic values chart
Update pvc.yaml with PV and PVC for shared data, and PV and PVC for hub-db-dir
Mount hub-db-dir on an EC2 instance and create the directory. Set permissions to 777 (haven’t had a chance to narrow down the least permissions needed yet); see the sketch after this list.
Run helm upgrade --install deployment jupyterhub/jupyterhub --namespace namespace -f values.yaml --debug and it should just move the hub over to the updated PVC.
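For the mount-and-permissions step, a rough sketch (assuming an NFS/EFS-backed volume; the endpoint, mount point, and directory name are placeholders, not my actual values):

```bash
# Illustrative only -- mount the share backing hub-db-dir, create the
# directory the hub database will live in, and open up permissions.
sudo mkdir -p /mnt/hub-db
sudo mount -t nfs4 -o nfsvers=4.1 fs-12345678.efs.us-east-1.amazonaws.com:/ /mnt/hub-db
sudo mkdir -p /mnt/hub-db/hub-db
sudo chmod 777 /mnt/hub-db/hub-db   # TODO: narrow down to least privilege
```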
It would be amazing if we didn’t have to deploy and then redeploy in the future, but it’s such a small addition to the workflow that it’s well worth it compared to the problems we had before.
That makes sense now. On my second cluster deployment I ran into the issue, but when developing on the first cluster I didn’t, since I was upgrading rather than installing.