Using a CIFS share

Hi all,

I’m completely new to JupyterHub and am a Linux / Kubernetes admin. A developer of ours set up a hub on one of our Kubernetes clusters with the instructions provided here:

This works great, with Ingress, TLS, ADFS and all, except for one thing: connecting a CIFS mount. For the storage of the home folders my colleague has used our rook/ceph in-cluster storage. For sharing specific files he connected the hub to our NFS storage. Both work completely fine. But when trying to attach the external, shared CIFS storage, we get a timeout after 300 seconds and a rather general error:
Unable to attach or mount volumes: unmounted volumes=[use], unattached volumes=[use]: timed out waiting for the condition

He’s running version 0.9.0 and we are on a very recent Kubernetes version, 1.18.6. He set up the config with the Helm chart and the following storage options for in-cluster and NFS:

  storage:
    type: dynamic
    capacity: 10Gi
    dynamic:
      pvcNameTemplate: claim-{username}
      volumeNameTemplate: volume-{username}
      storageAccessModes: [ ReadWriteMany ]
    extraVolumes:
      - name: jupyterhub-workspace
        persistentVolumeClaim:
          claimName: jupyterhub-workspace
    extraVolumeMounts:
      - name: jupyterhub-workspace
        mountPath: /mnt/WORKSPACE

And now we’re trying more or less the same for CIFS. Looking at the values.yml file for the helm chart, these seem to be our options:

  storage:
    type: dynamic
    extraLabels: {}
    extraVolumes: []
    extraVolumeMounts: []
    static:
      pvcName:
      subPath: '{username}'
    capacity: 10Gi
    homeMountPath: /home/jovyan
    dynamic:
      storageClass:
      pvcNameTemplate: claim-{username}{servername}
      volumeNameTemplate: volume-{username}{servername}
      storageAccessModes: [ReadWriteOnce]

We tried both static and dynamic configurations (the documentation at zero-to-jupyterhub.readthedocs.io is lacking here), but we always end up with the timeout and error above.
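
For what it’s worth, the static variant we experimented with was along these lines (just a sketch based on the options above; the pvcName here is illustrative, not necessarily the exact value we used):

  storage:
    type: static
    static:
      pvcName: jupyterhub-use
      subPath: '{username}'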

So to me, this seems like the most logical configuration:

  storage:
    type: dynamic
    capacity: 10Gi
    dynamic:
      pvcNameTemplate: claim-{username}
      volumeNameTemplate: volume-{username}
      storageAccessModes: [ ReadWriteMany ]
    extraVolumes:
      - name: jupyterhub-workspace
        persistentVolumeClaim:
          claimName: jupyterhub-workspace
      - name: jupyterhub-use
        persistentVolumeClaim:
          claimName: jupyterhub-use
    extraVolumeMounts:
      - name: jupyterhub-workspace
        mountPath: /mnt/WORKSPACE
      - name: jupyterhub-use
        mountPath: /mnt/USE

For reference, this is the way we create the object for our CIFS PV (which works fine with other deployments):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: jupyterhub-use-share
  namespace: jupyterhub-tst
spec:
  capacity:
    storage: 1000Gi
  flexVolume:
    driver: mydomain.io/cifs
    options:
      opts: rw,noperm,nounix,vers=3.0
      server: data.storage.mydomain.io
      share: /data/use
    secretRef:
      name: jupyterhub-use-secret
  accessModes:
    - ReadWriteMany
---
apiVersion: v1
kind: Secret
metadata:
  name: jupyterhub-use-secret
  namespace: jupyterhub-tst
type: mydomain.io/cifs
data:
  username: 'redacted base64 encoded'
  password: 'redacted base64 encoded'
  domain: 'redacted base64 encoded'
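
For that extraVolumes entry to resolve, there also needs to be a PVC named jupyterhub-use in the hub namespace, roughly like this (a sketch only; the explicit volumeName binding and the empty storageClassName are my assumptions, not necessarily what is deployed):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jupyterhub-use
  namespace: jupyterhub-tst
spec:
  accessModes:
    - ReadWriteMany
  # empty storageClassName so the claim binds to the pre-created PV
  # instead of triggering dynamic provisioning
  storageClassName: ""
  # bind explicitly to the CIFS PV defined above
  volumeName: jupyterhub-use-share
  resources:
    requests:
      storage: 1000Gi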

With other deployments we claim the above created CIFS space with something like this:

      volumes:
      - name: use
        persistentVolumeClaim:
          claimName: myapp-use

We re-use this same code in other deployments, mounting the same CIFS share with the same config: no problems. With JupyterHub this does not work.

To cut a long story short: who has this working with CIFS and how? :slight_smile:

Just discovered and tested that apparently we are hit by this bug/behavior:

How can this be avoided?

I have not seen a fix for it, but I didn’t try the latest version. Even if you fix the immediate problem, trying to open a folder with too many files will crash the file browser in JupyterLab. Be aware that lots of tools don’t work right if there are too many files in the same folder: any process that must iterate over all the files will not work.

In my case, my files are mounted inside the container as the user, but it was much trickier to implement. It works, but I’m not happy with it, as I maintain a credentials file separate from the user’s login to the notebook environment.