Has anyone investigated, coded, or thought about using the activity information of JupyterHub users to remove the PVCs and PVs of users who haven’t been active for more than 30 days (or some other configurable period)?
The background is the JupyterHub of the OpenHumans foundation. They offer storage to everyone who logs in, but as usual many people only log in once or twice and then never again. Over time (in this case more than 2 years) you accumulate a lot of PVs that will never be used again. Automating the clean-up would save money and remove a manual task.
Another use case could be hubs used for teaching, where students come for a course lasting a few weeks, a semester, or several semesters. Eventually they stop using the hub, but their PVs continue to cost money.
An alternative approach is to add an NFS server (or use your cloud vendor’s managed equivalent) to provide shared storage to users. This removes the need to clean up PVs but adds complexity, or increases cost if you don’t need the minimum storage size your cloud vendor sets.
The current idea is to create a JupyterHub service similar to the “cull idle users” service. The service would check when each user was last active and remove their PV after a long period of inactivity. Deleting data is never nice, so maybe the service could even notify users by email before it deletes their data.
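To make the idea a bit more concrete, here is a minimal sketch of what such a service could look like, not a tested implementation. It assumes the service runs as a hub-managed service (so `JUPYTERHUB_API_URL` and `JUPYTERHUB_API_TOKEN` are set), that the user list from the hub’s REST API carries a `last_activity` timestamp, and that PVCs are named `claim-<username>` as in a default Zero to JupyterHub setup. The function names, the `jhub` namespace, and the PVC naming are my assumptions and would need adjusting; in particular, usernames with special characters are escaped in real PVC names, which this sketch ignores.

```python
import json
import os
import urllib.request
from datetime import datetime, timedelta, timezone


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp such as '2021-03-04T12:00:00.000000Z'."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def find_inactive_users(users, now, max_age=timedelta(days=30)):
    """Return the names of users whose last activity is older than max_age.

    Users with no recorded activity are skipped rather than selected.
    """
    inactive = []
    for user in users:
        last_activity = user.get("last_activity")
        if last_activity is None:
            continue
        if now - parse_timestamp(last_activity) > max_age:
            inactive.append(user["name"])
    return inactive


def fetch_users():
    """List users via the JupyterHub REST API (GET /users).

    Newer JupyterHub versions paginate this endpoint; this sketch
    ignores pagination and only reads the first response.
    """
    api_url = os.environ["JUPYTERHUB_API_URL"]
    token = os.environ["JUPYTERHUB_API_TOKEN"]
    request = urllib.request.Request(
        api_url.rstrip("/") + "/users",
        headers={"Authorization": f"token {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def delete_pvc(username, namespace):
    """Delete the user's PVC with the official Kubernetes Python client."""
    # Imported here so the selection logic above works without the client.
    from kubernetes import client, config

    config.load_incluster_config()
    client.CoreV1Api().delete_namespaced_persistent_volume_claim(
        name=f"claim-{username}",  # assumed naming scheme, see above
        namespace=namespace,
    )


def main(namespace="jhub", dry_run=True):
    """Report (and, if dry_run is False, delete) PVCs of inactive users."""
    now = datetime.now(timezone.utc)
    for name in find_inactive_users(fetch_users(), now):
        if dry_run:
            print(f"would delete PVC claim-{name}")
        else:
            delete_pvc(name, namespace)
```

Calling `main()` on a schedule with `dry_run=True` would only print the candidates, which seems like a sensible default for something that deletes data. Whether deleting the PVC also removes the underlying PV depends on the reclaim policy of the storage class, and the email notification step is left out entirely.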
If you know of existing code or discussions, I’d be happy to hear about them.