What’s the best way to have a scheduled job run on a user pod for Z2JH?
The example use-case would be syncing user directories based on user groups.
This is currently accompanied by a Kubernetes CronJob, which uses the hub REST API and an external API to sync internal user groups with Jupyter groups. It runs successfully and on schedule, as expected.
Additionally, a shared directory is mounted into all user pods under /mnt/Shared/*, with one directory per Jupyter group, named after the group. A sync.sh script fires on pod start, iterates through the user's groups, and runs ln /mnt/Shared/$i /home/[username]/$i for each one. This runs successfully and syncs user directories on first login, as expected.
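For illustration, here is a minimal sketch of what a script like sync.sh might look like. It is not the actual script from this setup: the function name sync_group_links and the way the group list is passed in (as arguments) are assumptions, and it uses ln -s because hard links to directories are generally not permitted on Linux. It skips anything that already exists, so it should be safe to run repeatedly, whether at pod start or on a schedule.

```shell
#!/bin/sh
# Hypothetical sketch of a sync.sh-style helper (names are assumptions).
# Usage: sync_group_links SHARED_ROOT HOME_DIR GROUP [GROUP...]

sync_group_links() {
    shared_root=$1
    home_dir=$2
    shift 2

    for group in "$@"; do
        src="$shared_root/$group"
        dest="$home_dir/$group"

        # Skip groups that have no shared directory.
        [ -d "$src" ] || continue

        # Leave existing files, directories, and links untouched.
        if [ -e "$dest" ] || [ -L "$dest" ]; then
            continue
        fi

        # Symlink the shared group directory into the home directory.
        ln -s "$src" "$dest"
    done
}
```

A call might look like sync_group_links /mnt/Shared "$HOME" group-a group-b, where the group names would come from wherever the existing script already gets them.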
Ideally, I’d like to run this same script on a schedule.
Just adding this didn’t seem to work as expected. After I ran kubectl exec -it into a user pod, I found that the cron package wasn’t included in the image. I added it, but I’m still not getting any results or job runs.
I believe singleuser.extraContainers may be a possible solution, but I’m not so sure. Because this uses its own image, it would have its own filesystem, which I don’t believe can manage the user pod’s filesystem.
Is there a better way to run these jobs inside of a user pod on a schedule? Or is there a secret to getting cron to work normally in an image?
Yes. In the first case you’re running multiple services inside a single container, mimicking a VM. In the latter you’re running separate processes/services in separate containers (but in the same K8s pod), with a shared volume.
Not in JupyterHub. However, JupyterHub and Jupyter server/lab/notebook are highly extensible, so for example you could write a Jupyter server extension that runs scheduled jobs. This is for scheduling notebooks rather than jobs, but it illustrates what’s possible:
For any future visitors, I’ve found what I think may be an easier way (if you already create your own singleuser image) - closer to the ‘“VM-like” container’ solution. This uses supercronic.
Add the command that executes it to singleuser.lifecycleHooks.postStart
** I add the & to send the process to the background, so it doesn’t hang the loading of the jupyterserver container
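As a rough sketch of the idea (not the exact commands from the post above), a small helper script that the postStart hook calls could write a crontab and start supercronic in the background. The paths /tmp/sync-crontab, /usr/local/bin/sync.sh, and /tmp/supercronic.log, and the 15-minute schedule, are all assumptions for illustration. It also assumes supercronic is already installed in the singleuser image.

```shell
#!/bin/sh
# Hypothetical helper for a postStart hook; all paths and the schedule are assumptions.
CRONTAB_FILE="${CRONTAB_FILE:-/tmp/sync-crontab}"
SYNC_SCRIPT="${SYNC_SCRIPT:-/usr/local/bin/sync.sh}"
LOG_FILE="${LOG_FILE:-/tmp/supercronic.log}"

# Standard five-field cron schedule: run the sync script every 15 minutes.
printf '%s\n' "*/15 * * * * $SYNC_SCRIPT" > "$CRONTAB_FILE"

if command -v supercronic >/dev/null 2>&1; then
    # The trailing & backgrounds supercronic so the hook returns
    # immediately instead of blocking container startup.
    supercronic "$CRONTAB_FILE" >> "$LOG_FILE" 2>&1 &
else
    echo "supercronic not found on PATH; skipping scheduler start" >&2
fi
```

Redirecting output to a log file keeps the background process from holding the hook's output open, and gives you somewhere to look if jobs don't appear to run.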