Current advice for NFS home directories on GKE?

I am just setting up a JupyterHub / Kubernetes cluster for our University, and I’m considering NFS for home directory storage.

I found the DataHub writeup, and that made me wonder: what is the current best advice for using NFS on Google Cloud? Is it the data8x approach of using Google Cloud Filestore? Or a hand-maintained server? And what is the best client connection method? Where would I start in compiling a recipe?

I also read this thread - thanks to dirkcgrunwald for posting that.


In my experience Google Cloud Filestore works well. The biggest disadvantage is the high starting price for small teams or small disk usage.
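
For what it's worth, from the cluster's side a Filestore share is just a static NFS PersistentVolume pointing at the instance's IP. A rough sketch, where the IP, export path, and names are placeholders rather than real values:

```yaml
# Sketch only: a PersistentVolume backed by a Filestore share.
# The server IP, export path, and names are placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: filestore-homes
spec:
  capacity:
    storage: 1Ti              # Filestore minimum size
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.0.0.2          # IP reported by the Filestore instance (placeholder)
    path: /homes              # file share name configured on the instance (placeholder)
```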

Yes, good point, I’ve just been exploring. I see the minimum size is 1TB, which translates to a minimum cost to us of $240 / month.

On the other hand, I see that a dedicated g1-small instance, with 350GB of standard zonal disk, would cost about $35 per month.

Exploring more - I think I will need a separate NFS server, perhaps installed via Helm [1, 2].
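
From what I can tell, the Helm route in [1] and [2] runs an NFS server / provisioner inside the cluster, backed by an ordinary persistent disk. A minimal values sketch might look like the following - the exact key names vary between chart versions, so treat this as an assumption to check against whichever chart you install:

```yaml
# Sketch of values for an in-cluster NFS server/provisioner chart
# (e.g. nfs-server-provisioner); key names may differ between chart versions.
persistence:
  enabled: true               # back the NFS export with a real disk...
  storageClass: standard      # ...a standard zonal persistent disk on GKE
  size: 350Gi
storageClass:
  name: nfs                   # StorageClass that PVCs can request for ReadWriteMany volumes
```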

But - I’m struggling to orient myself, and I suspect this morass is one that many of y’all have already escaped. Just for example, where should I look for the meaning of parts of the datahub setup, such as the nfsPVC section and the storage section?

Do y’all set up the NFS server by hand, or do you use Helm? If you use Helm, how do y’all reclaim the storage when restarting the cluster?

[1] https://github.com/dirkcgrunwald/zero-to-jupyterhub-k3s/tree/master/basic-with-nfs-volumes
[2] https://www.padok.fr/en/blog/readwritemany-nfs-kubernetes

To answer my own questions:

The nfsPVC section is specific to the datahub setup; it abstracts out the creation of the NFS PersistentVolume and PersistentVolumeClaim.
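
Concretely, those parameters end up as an ordinary PersistentVolume plus a PersistentVolumeClaim. The claim side is roughly the following (names and namespace are illustrative):

```yaml
# Sketch of the claim that an nfsPVC-style helper generates; names are illustrative.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: home-nfs
  namespace: jhub
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""        # empty string: bind to a pre-created NFS PV, skip dynamic provisioning
  resources:
    requests:
      storage: 1Ti
```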

Adding NFS proved relatively straightforward. Our config is here:

with some notes on getting storage working in storage.md. The actual NFS setup is in init_nfs.sh and used in the config.yaml.
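
For anyone following a similar path, the part of config.yaml that points the single-user servers at a shared NFS claim is the Zero to JupyterHub singleuser storage section. Roughly like this - the claim name and subPath below are illustrative, not copied from our config:

```yaml
# Sketch of the Zero to JupyterHub storage config for a shared NFS claim;
# the claim name and subPath are illustrative.
singleuser:
  storage:
    type: static
    static:
      pvcName: home-nfs             # the ReadWriteMany NFS-backed claim
      subPath: "home/{username}"    # one subdirectory per user on the shared export
```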