Best way for scheduled job to run in a user pod?

What’s the best way to have a scheduled job run on a user pod for Z2JH?

The example use-case would be syncing user directories based on user groups.
This is currently handled by a Kubernetes CronJob, which uses the hub REST API and an external API to sync internal user groups with JupyterHub groups. It runs successfully, and runs on schedule as expected.
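For reference, such a CronJob might look like the sketch below. This is an illustration only: the image name, schedule, and Secret name are assumptions, not the actual setup.

```yaml
# Hypothetical sketch of the group-sync CronJob described above.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: group-sync
spec:
  schedule: "*/10 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: group-sync
              image: my-registry/group-sync:latest  # assumed image with the sync logic
              env:
                - name: JUPYTERHUB_API_TOKEN
                  valueFrom:
                    secretKeyRef:
                      name: hub-api-token           # assumed Secret holding a hub API token
                      key: token
```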

Additionally, a shared directory is mounted to all user pods under /mnt/Shared/* with a unique directory name that matches each Jupyter group name. A sync.sh script fires on pod start to iterate through the user's groups and ln /mnt/Shared/$i /home/[username]/$i. This runs successfully, and syncs user directories on first login as expected.
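The core of that script might look like the sketch below. The function name and arguments are hypothetical, and symlinks (ln -s) are assumed since directories cannot be hard-linked.

```shell
#!/bin/bash
# Hypothetical sketch of the sync.sh logic described above: link each
# shared group directory into the user's home. Paths and group names
# are passed as arguments so the function stays testable.
sync_dirs() {
  local shared_root="$1" home_dir="$2"
  shift 2
  local group
  for group in "$@"; do
    # Skip groups with no shared directory, and don't clobber
    # anything already present in the home directory.
    if [ -d "$shared_root/$group" ] && [ ! -e "$home_dir/$group" ]; then
      ln -s "$shared_root/$group" "$home_dir/$group"
    fi
  done
}
```

On pod start it would be invoked with something like `sync_dirs /mnt/Shared "/home/$USER" $GROUPS`, where $GROUPS holds the group names fetched for that user.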

Ideally, I’d like to run this same script on a schedule.

I’ve tried adding the following to the Z2JH config:

singleuser:
  extraFiles:
    syncdirs_cronjob:
      mountPath: /etc/cron.d/syncdirs
      stringData: |
        */10 * * * * root /usr/local/etc/jupyter/sync.sh >> /var/log/sync.log 2>&1
      mode: 0644

Just adding this didn’t seem to work as expected. After I kubectl exec -it into a user pod, I found that the cron package wasn’t included in the image. I added it, but I’m still not getting any results/job runs.

I believe singleuser.extraContainers may be a possible solution, but I’m not so sure. Because an extra container uses its own image, it has a separate filesystem, which I don’t believe can manage the user pod’s filesystem.

Is there a better way to run these jobs inside of a user pod on a schedule? Or is there a secret to getting cron to work normally in an image?

Containers don’t usually contain a full operating system, so system services like cron aren’t set up.

It’s possible to achieve a “VM-like” container, but you’ll need to build it yourself to include JupyterLab/notebook.

I think extraContainers should work if you’re using volumes, since volumes can be shared across all containers in the pod.
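As a sketch of that suggestion (the container name, image, and volume reference below are assumptions; the home volume's actual name depends on your storage configuration):

```yaml
# Hypothetical sketch: a sidecar container sharing the user's home volume.
singleuser:
  extraContainers:
    - name: dir-sync
      image: my-registry/dir-sync:latest  # assumed image containing cron + sync.sh
      volumeMounts:
        - name: home                      # must match the home volume's actual name
          mountPath: /home/jovyan
```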

This should work, as I’m already using a custom image (not the Z2JH images).

This suggestion is either a “VM-like” container OR an extra container, not both (a “VM-like” extra container), correct?

Are there any other services or routes that are Jupyter/JupyterHub native for running schedules on user servers?

Yes. In the first case you’re running multiple services inside a single container, mimicking a VM. In the second you’re running separate processes/services in separate containers (but in the same K8s pod), with a shared volume.

Not in JupyterHub. However, JupyterHub and Jupyter server/lab/notebook are highly extensible, so for example you could write a Jupyter server extension that runs scheduled jobs. This one is for scheduling notebooks rather than arbitrary jobs, but it illustrates what’s possible.

For any future visitors, I’ve found what I think may be an easier way (if you already build your own singleuser image), closer to the ‘“VM-like” container’ solution. It uses supercronic.

  • Add the install “stanzas” for the release build to your Dockerfile, as described in supercronic’s install instructions.
  • Add the supercronic invocation to singleuser.lifecycleHooks.postStart.
    ** The trailing & sends the process to the background, so it doesn’t hang the startup of the Jupyter server container.
singleuser:
  ...
  lifecycleHooks:
    postStart:
      exec:
        command: ["/bin/bash", "-c", "/usr/local/bin/supercronic /etc/crontab &"]
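Note that with /bin/bash -c the whole command must be one string; additional list items become positional parameters rather than part of the command. For completeness, the crontab that supercronic reads can be supplied with the same extraFiles mechanism shown earlier (a sketch; supercronic’s crontab format takes no user field):

```yaml
singleuser:
  extraFiles:
    supercronic_crontab:
      mountPath: /etc/crontab
      stringData: |
        */10 * * * * /usr/local/etc/jupyter/sync.sh >> /var/log/sync.log 2>&1
      mode: 0644
```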