Possible to move volumes between clusters?

vputz · July 11, 2019, 4:41pm

My JupyterHub cluster started floundering, and in an attempt to upgrade Kubernetes and increase the number and size of master nodes, the cluster got… Severely Borked (etcd can’t mount its volume on 2/3 masters, and they have different PKI keys than the working master, which is only working because I ssh’d into it and changed its API server manifest. It’s a mess.)

I feel like the only option now is to take off and nuke the whole site from orbit, but the last time I had to do that it clobbered the (AWS EBS) user storage volumes. Is there a way to save those, perhaps by starting a new cluster and moving them over, or some such? I’d hate for people to lose their work… And as much as I love Kubernetes, I suspect the ability to recreate the cluster without losing user storage would be a really valuable skill…

minrk · July 19, 2019, 6:23am

I’m sure this is possible, but I’m not sure how manual the steps have to be. The main thing is that this is a generic Kubernetes-on-AWS question, not specific to JupyterHub, so if you’re Googling for ideas, probably omit anything Jupyter-related. I have more experience with GKE, where I’ve done things like create snapshots of volumes before taking actions that might destroy them, so that I have the ability to restore data later, even if that means mounting the snapshot and a new empty volume on a new node and copying files with rsync.

mulliganp · July 31, 2019, 1:36pm

OK, from a generic AWS standpoint, you can snapshot the currently attached EBS volumes and then restore those snapshots to new volumes.
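As a rough illustration of the snapshot-and-restore step, here is a minimal sketch in Python. It assumes `ec2` is a boto3 EC2 client (for example `boto3.client("ec2")`). The function names, the description text, the `gp2` default, and the example IDs in the comments are hypothetical, not anything from the original cluster.

```python
def snapshot_volumes(ec2, volume_ids, description="pre-rebuild backup"):
    """Start a snapshot of every volume, then wait for all of them.

    `ec2` is assumed to be a boto3 EC2 client.
    Returns a dict mapping volume ID -> snapshot ID.
    """
    snapshots = {}
    for volume_id in volume_ids:
        response = ec2.create_snapshot(VolumeId=volume_id, Description=description)
        snapshots[volume_id] = response["SnapshotId"]
    if snapshots:
        # Block until AWS reports the snapshots as completed.
        waiter = ec2.get_waiter("snapshot_completed")
        waiter.wait(SnapshotIds=list(snapshots.values()))
    return snapshots


def restore_snapshot(ec2, snapshot_id, availability_zone, volume_type="gp2"):
    """Create a new EBS volume from a snapshot in the chosen AZ.

    Returns the ID of the new volume.
    """
    response = ec2.create_volume(
        SnapshotId=snapshot_id,
        AvailabilityZone=availability_zone,
        VolumeType=volume_type,
    )
    return response["VolumeId"]


# Example use (hypothetical IDs; needs boto3 and AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2")
#   snaps = snapshot_volumes(ec2, ["vol-0123456789abcdef0"])
#   new_id = restore_snapshot(ec2, snaps["vol-0123456789abcdef0"], "us-east-1a")
```

Because the restore call takes the availability zone as an argument, a snapshot taken in one AZ can be brought back as a volume in another. Waiting for large snapshots can take a while, so it is worth running this before touching the cluster.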

As volumes are AZ-specific, I don’t think there is a straightforward way to remount these new volumes onto a Kubernetes JupyterHub cluster: when new containers are provisioned they will be spread across AZs, so the new single-user container might end up in a different AZ than the previous container. I would therefore suggest you mount each volume and sync the data into an S3 bucket (for example with `aws s3 sync`), creating a separate folder for each user. You will then need a sync script that runs as new containers are created, syncing the data back down to the new volume.
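The per-user S3 layout described above could be sketched roughly as follows. This assumes `s3` is a boto3 S3 client (for example `boto3.client("s3")`). The function names and the `<username>/<relative path>` key layout are illustrative choices, not an established convention.

```python
import os


def upload_user_dir(s3, local_dir, bucket, username):
    """Upload every file under local_dir to s3://bucket/<username>/...

    Returns the sorted list of object keys that were written.
    """
    uploaded = []
    for root, _dirs, files in os.walk(local_dir):
        for name in files:
            path = os.path.join(root, name)
            relative = os.path.relpath(path, local_dir)
            # S3 keys always use forward slashes, whatever the local OS uses.
            key = username + "/" + relative.replace(os.sep, "/")
            s3.upload_file(path, bucket, key)
            uploaded.append(key)
    return sorted(uploaded)


def download_user_dir(s3, bucket, username, local_dir):
    """Copy everything under s3://bucket/<username>/ back into local_dir.

    Returns the list of local file paths that were written.
    """
    prefix = username + "/"
    downloaded = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from the page when nothing matches the prefix.
        for obj in page.get("Contents", []):
            relative = obj["Key"][len(prefix):]
            if not relative or relative.endswith("/"):
                continue  # skip folder placeholder objects
            target = os.path.join(local_dir, *relative.split("/"))
            os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
            s3.download_file(bucket, obj["Key"], target)
            downloaded.append(target)
    return downloaded
```

This is a plain copy rather than a true sync: it does not detect unchanged files, remove deleted ones, or preserve ownership and permissions, so file ownership may need fixing after the download into a new user volume.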

Lots of this work can be done using Python scripts and boto, but you may need to tag the resources to ensure you only work on the JupyterHub volumes.
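For picking out only the JupyterHub volumes, something like the sketch below might be a starting point. It rests on two assumptions that should be checked against the real cluster: that the volumes were dynamically provisioned and carry the `kubernetes.io/created-for/pvc/...` tags, and that user claims follow a `claim-<name>` naming pattern. If the tags are missing, you would need to add your own first. `ec2` is assumed to be a boto3 EC2 client, and the function name is hypothetical.

```python
# Tag keys assumed to be set by Kubernetes on dynamically provisioned volumes.
PVC_NAME_TAG = "kubernetes.io/created-for/pvc/name"
PVC_NAMESPACE_TAG = "kubernetes.io/created-for/pvc/namespace"


def find_user_volumes(ec2, namespace, claim_prefix="claim-"):
    """Map each user claim to its EBS volume ID and availability zone.

    Only volumes whose PVC namespace tag equals `namespace` and whose PVC
    name tag starts with `claim_prefix` are returned.
    """
    filters = [{"Name": "tag:" + PVC_NAMESPACE_TAG, "Values": [namespace]}]
    users = {}
    paginator = ec2.get_paginator("describe_volumes")
    for page in paginator.paginate(Filters=filters):
        for volume in page["Volumes"]:
            tags = {t["Key"]: t["Value"] for t in volume.get("Tags", [])}
            claim = tags.get(PVC_NAME_TAG, "")
            if not claim.startswith(claim_prefix):
                continue  # e.g. the hub's own database volume
            users[claim[len(claim_prefix):]] = {
                "VolumeId": volume["VolumeId"],
                "AvailabilityZone": volume["AvailabilityZone"],
            }
    return users
```

The keys of the returned dict are whatever follows the prefix in the claim name, which may be an escaped form of the username rather than the username itself.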

A quick question about your current cluster: how do you currently get around volumes being tied to an AZ? Do you simply over-provision nodes in your cluster?

It may be worth thinking about EFS instead of EBS volumes, but that depends entirely on your use case.
