Request for Implementation: Archive user home directories transparently to object storage for massive cost savings

When running JupyterHubs, storage is often a bigger driver of cost than compute. Compute scales with how many users are currently active, while storage scales with how many users have ever used the hub. So storage cost just grows and grows.

A simple way to make this cheaper is to not provide a regular POSIX filesystem (a traditional disk drive) to users at all, and instead only provide object storage, which is much cheaper. However, a lot of code relies on access to a POSIX filesystem (git, for example), and most users don’t want to port their code to use only object storage.

A decent compromise is to move user home directories to object storage when the user hasn’t logged in for a while, and fetch them transparently back to a POSIX filesystem when the user logs in again. This lets us treat the POSIX filesystem (exposed over NFS, perhaps) almost as a ‘hot cache’, and operate with a much smaller POSIX filesystem than we otherwise could. I think this is reasonably easy to build, and can be built to be fairly generic and safe.

We would need code to do two things.

The archiving process

This should be a background job that runs in a loop, finds user home directories that haven’t been modified for a while and aren’t currently in use, archives them into object storage, and records somewhere (a database?) that each of those users’ home directories now lives in object storage. This has to be a background job because there is no reliable way to run something whenever a user’s pod stops.
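As a very rough illustration, here is a minimal sketch of what that loop could look like in Python. The bucket name, paths, idle threshold, and SQLite state file are all made up for the example, and a real implementation would also need to verify that the user has no running pod (and handle errors, locking, etc.) before deleting anything on disk.

```python
import sqlite3
import tarfile
import time
from pathlib import Path

import boto3  # any S3-compatible object store

# Hypothetical values, for illustration only.
HOME_ROOT = Path("/export/home")              # NFS-backed POSIX home directories
BUCKET = "hub-home-archive"                   # object storage bucket
IDLE_SECONDS = 60 * 60 * 24 * 30              # archive after ~30 days without modification
STATE_DB = "/var/lib/home-archiver/state.sqlite"

s3 = boto3.client("s3")
db = sqlite3.connect(STATE_DB)
db.execute("CREATE TABLE IF NOT EXISTS archived (username TEXT PRIMARY KEY, key TEXT)")


def latest_mtime(path: Path) -> float:
    """Most recent modification time of anything under path."""
    return max((p.stat().st_mtime for p in path.rglob("*")), default=path.stat().st_mtime)


def archive_user(home: Path) -> None:
    username = home.name
    key = f"homes/{username}.tar.gz"
    tarball = Path(f"/tmp/{username}.tar.gz")
    with tarfile.open(tarball, "w:gz") as tar:
        tar.add(home, arcname=username)
    s3.upload_file(str(tarball), BUCKET, key)
    # Record the archive only after the upload succeeded. Deleting the on-disk
    # home directory (shutil.rmtree) would happen here, guarded by a check that
    # the user has no running pod.
    db.execute("INSERT OR REPLACE INTO archived VALUES (?, ?)", (username, key))
    db.commit()


while True:
    now = time.time()
    for home in HOME_ROOT.iterdir():
        if home.is_dir() and now - latest_mtime(home) > IDLE_SECONDS:
            archive_user(home)
    time.sleep(3600)  # re-scan every hour
```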

The unarchiving process

When a user logs in, we check the database (or wherever the archiver keeps its state) to see if they have a home directory in need of extraction. This is done as an init container in z2jh, so the user pod is not started until the restore has completed. A small script in the init container fetches the tarball from object storage and extracts it onto the POSIX filesystem; once it finishes, the user pod starts as usual. The user should notice no difference, other than a delay while the extraction happens. Some form of authentication is needed to make sure that users can’t access each other’s home directories.
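For the restore side, here is a sketch of the script that init container could run, again in Python, reusing the made-up bucket and state-database names from the archiver sketch above. It assumes the username is injected into the init container’s environment and that the archiver’s state is reachable from the pod; a small API, or per-user credentials on the bucket, would be a cleaner way to keep users from fetching each other’s archives.

```python
import os
import sqlite3
import tarfile
from pathlib import Path

import boto3

# Hypothetical names, matching the archiver sketch above.
BUCKET = "hub-home-archive"
STATE_DB = "/var/lib/home-archiver/state.sqlite"  # assumed reachable, e.g. on a shared volume
HOME_MOUNT = Path("/home")                        # the user's volume as mounted in the init container

# Assumed to be passed into the init container's environment by the spawner config.
username = os.environ["JUPYTERHUB_USER"]

db = sqlite3.connect(STATE_DB)
row = db.execute("SELECT key FROM archived WHERE username = ?", (username,)).fetchone()

if row is not None:
    tarball = Path("/tmp/restore.tar.gz")
    boto3.client("s3").download_file(BUCKET, row[0], str(tarball))
    with tarfile.open(tarball) as tar:
        tar.extractall(HOME_MOUNT)                # recreates /home/<username>
    # Clear the record so the next login doesn't re-extract over newer files.
    db.execute("DELETE FROM archived WHERE username = ?", (username,))
    db.commit()
# No record means the user is new or still "hot": do nothing and let the pod start.
```

In z2jh, something like this could be wired in through the `singleuser.initContainers` option of the Helm chart, with the home directory volume mounted into the init container.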

Would love for someone to build this! <3


An intermediate compromise (between native object storage and your proposal) is to mount an object store with s3fs or goofys. They’re only partially POSIX-compatible, but that might be good enough? I just tried using git locally with a Minio server mounted with s3fs and it seemed to work.

I have never trusted FUSE file systems. POSIX file semantics are synchronous (open, write, etc.) while all network calls are asynchronous in nature, so network failures manifest in extremely weird ways in the file system. Performance is also hit or miss. But most importantly, I don’t think there is currently an easy way to do FUSE in Kubernetes! I would love to explore that option for limited use cases.


AWS has an S3 storage gateway that offers NFS:

I think it spins up an EC2 instance behind the scenes to act as an NFS server in front of S3, with some local storage as a cache.

Azure seems to have something similar:

Obviously it means you’re dependent on the services offered by a particular cloud provider.