Zero to JupyterHub k8s 3.2.1: EFS static provisioning volume mounting error

Hello,

I am currently upgrading my Zero to JupyterHub k8s deployment to 3.2.1 so that it is compatible with my EKS 1.27 cluster. However, when user pods spin up, they try to mount the 'home' volume and then time out. I mount it through a StorageClass, PV, and PVC (see below for the config). Every time I start a single-user server, the pod gets scheduled to one of my nodes but then times out with:

2024-01-25T19:33:49.383049Z [Normal] Successfully assigned jhub-blue/jupyter-sidhant-2eswami to <REDACTED>

2024-01-25T19:35:52Z [Warning] Unable to attach or mount volumes: unmounted volumes=[home], unattached volumes=[], failed to process volumes=[]: timed out waiting for the condition

Spawn failed: pod jhub-blue/jupyter-sidhant-2eswami did not start in 300 seconds!
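
When comparing several spawn attempts, it can help to pull the timing and the stuck volume names out of these event lines. The following is a minimal sketch, assuming the events are in exactly the `<timestamp> [<type>] <message>` layout quoted above; the function names are illustrative and not part of any Kubernetes tooling.

```python
import re
from datetime import datetime

# Assumed layout: "<ISO timestamp> [<Type>] <message>", as in the lines quoted above.
EVENT_RE = re.compile(r"^(?P<ts>\S+) \[(?P<type>\w+)\] (?P<msg>.*)$")
UNMOUNTED_RE = re.compile(r"unmounted volumes=\[([^\]]*)\]")


def parse_event(line):
    """Split one event line into (timestamp, type, message)."""
    match = EVENT_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised event line: {line!r}")
    # Replace the trailing "Z" so older Python versions can parse the timestamp.
    ts = datetime.fromisoformat(match["ts"].replace("Z", "+00:00"))
    return ts, match["type"], match["msg"]


def unmounted_volumes(message):
    """Return the volume names listed after 'unmounted volumes=' (space separated)."""
    match = UNMOUNTED_RE.search(message)
    return match.group(1).split() if match else []


scheduled = parse_event(
    "2024-01-25T19:33:49.383049Z [Normal] Successfully assigned "
    "jhub-blue/jupyter-sidhant-2eswami to <REDACTED>"
)
failed = parse_event(
    "2024-01-25T19:35:52Z [Warning] Unable to attach or mount volumes: "
    "unmounted volumes=[home], unattached volumes=[], "
    "failed to process volumes=[]: timed out waiting for the condition"
)

elapsed = (failed[0] - scheduled[0]).total_seconds()
stuck = unmounted_volumes(failed[2])
print(f"mount wait before first warning: {elapsed:.1f}s, stuck volumes: {stuck}")
```

For the two events above, the first warning arrives roughly 123 seconds after scheduling, well inside the 300 second spawn timeout, and `home` is the only volume still waiting.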

Unusually, when I run a custom pod to test the mount, it works fine. Only my jhub pods hit this error. I have been sifting through the config, but nothing seems out of the ordinary that would explain the mount timing out. I am not sure how to fix this, since the EFS CSI driver (efs.csi.aws.com) controller and node pods have no logs about the timeout either. Apart from this single-user pod, other pods that access the EFS do so without problems.

I am attempting to use static provisioning for a PersistentVolume in Kubernetes by specifying an access point on an existing AWS EFS file system, which is referenced in the PersistentVolume configuration. Here is the config for the StorageClass, PV, and PVC:

Storage Class:

Name:                  efs-sc
IsDefaultClass:        No
Annotations:           meta.helm.sh/release-name=aws-efs-csi-driver,meta.helm.sh/release-namespace=kube-system,storageclass.kubernetes.io/is-default-class=false
Provisioner:           efs.csi.aws.com
Parameters:            directoryPerms=700
AllowVolumeExpansion:  <unset>
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     Immediate
Events:                <none>

Persistent Volume:

Name:            efs-pv-blue
Labels:          <none>
Annotations:     pv.kubernetes.io/bound-by-controller: yes
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    efs-sc
Status:          Bound
Claim:           jhub-blue/efs-claim
Reclaim Policy:  Retain
Access Modes:    RWX
VolumeMode:      Filesystem
Capacity:        5Gi
Node Affinity:   <none>
Message:         
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            efs.csi.aws.com
    FSType:            
    VolumeHandle:      fs-123456::fsap-1234abcdedfghijk
    ReadOnly:          false
    VolumeAttributes:  <none>
Events:                <none>

Persistent Volume Claim:

Name:          efs-claim
Namespace:     jhub-blue
StorageClass:  efs-sc
Status:        Bound
Volume:        efs-pv-blue
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      5Gi
Access Modes:  RWX
VolumeMode:    Filesystem
Used By:       efs-app
Events:        <none>
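
For readers who want to reproduce the setup, the describe output above corresponds roughly to the manifests built below. This is a sketch reconstructed from the fields shown, not the original files: the file system and access point IDs are the placeholders from the post, and the helper names are hypothetical. It also assumes the `[FileSystemId]:[Subpath]:[AccessPointId]` layout that the EFS CSI driver documents for `volumeHandle`, which is why the handle above has an empty middle field.

```python
import json

# Placeholder IDs copied from the describe output above, not real resources.
FILE_SYSTEM_ID = "fs-123456"
ACCESS_POINT_ID = "fsap-1234abcdedfghijk"


def make_volume_handle(fs_id, access_point_id="", subpath=""):
    """Join the three fields of an EFS CSI volumeHandle (assumed layout)."""
    return f"{fs_id}:{subpath}:{access_point_id}"


def split_volume_handle(handle):
    """Split a volumeHandle into (file system ID, subpath, access point ID)."""
    parts = handle.split(":")
    parts += [""] * (3 - len(parts))  # tolerate the short "fs-..." form
    return tuple(parts[:3])


pv = {
    "apiVersion": "v1",
    "kind": "PersistentVolume",
    "metadata": {"name": "efs-pv-blue"},
    "spec": {
        "capacity": {"storage": "5Gi"},
        "volumeMode": "Filesystem",
        "accessModes": ["ReadWriteMany"],
        "persistentVolumeReclaimPolicy": "Retain",
        "storageClassName": "efs-sc",
        "csi": {
            "driver": "efs.csi.aws.com",
            "volumeHandle": make_volume_handle(FILE_SYSTEM_ID, ACCESS_POINT_ID),
        },
    },
}

pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "efs-claim", "namespace": "jhub-blue"},
    "spec": {
        "accessModes": ["ReadWriteMany"],
        "storageClassName": "efs-sc",
        "resources": {"requests": {"storage": "5Gi"}},
    },
}

# kubectl accepts JSON as well as YAML, so this output can be saved to a file.
manifest = {"apiVersion": "v1", "kind": "List", "items": [pv, pvc]}
print(json.dumps(manifest, indent=2))
```

The PVC deliberately omits `volumeName`, since the `bound-by-controller` annotations above suggest the binding was chosen by the controller from the matching storage class, access mode, and capacity.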

Here is the user pod that is failing:

Name:                 jupyter-sidhant-2eswami
Namespace:            jhub-blue
Priority:             0
Priority Class Name:  jhub-blue-default-priority
Service Account:      default
Node:                 <REDACTED>
Start Time:           Thu, 25 Jan 2024 14:33:49 -0500
Labels:               app=jupyterhub
                      chart=jupyterhub-3.2.1
                      component=singleuser-server
                      heritage=jupyterhub
                      hub.jupyter.org/network-access-hub=true
                      hub.jupyter.org/servername=
                      hub.jupyter.org/username=jupyter-sidhant-2eswami
                      release=jhub-blue
Annotations:          hub.jupyter.org/username: sidhant.swami
Status:               Pending
IP:                   
IPs:                  <none>
Init Containers:
  block-cloud-metadata:
    Container ID:  
    Image:         quay.io/jupyterhub/k8s-network-tools:3.2.1
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      iptables
      --append
      OUTPUT
      --protocol
      tcp
      --destination
      <REDACTED>
      --destination-port
      80
      --jump
      DROP
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:         <none>
Containers:
  notebook:
    Container ID:  
    Image:         <REDACTED>
    Image ID:      
    Port:          8888/TCP
    Host Port:     0/TCP
    Args:
      jupyterhub-singleuser
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     2
      memory:  2147483648
    Requests:
      cpu:     50m
      memory:  536870912
    Environment:
      CPU_GUARANTEE:                           0.05
      CPU_LIMIT:                               2.0
    Mounts:
      /home/jovyan/work/ from home (rw,path="home/jupyter-sidhant-2eswami")
Conditions:
  Type              Status
  Initialized       False 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  home:
    Type:        PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:   efs-claim
    ReadOnly:    false
QoS Class:       Burstable
Node-Selectors:  type=main
Tolerations:     hub.jupyter.org/dedicated=user:NoSchedule
                 hub.jupyter.org_dedicated=user:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
                 type=main:NoSchedule
Events:
  Type     Reason       Age    From                      Message
  ----     ------       ----   ----                      -------
  Normal   Scheduled    2m14s  jhub-blue-user-scheduler  Successfully assigned jhub-blue/jupyter-sidhant-2eswami to <REDACTED>
  Warning  FailedMount  11s    kubelet                   Unable to attach or mount volumes: unmounted volumes=[home], unattached volumes=[], failed to process volumes=[]: timed out waiting for the condition
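
One detail in this output that is easy to misread is the `-2e` in the pod name and in the mounted sub-path. The hub username is `sidhant.swami` (see the annotation), and the `.` is escaped before it is used in Kubernetes names. The sketch below approximates that escaping with the standard library only. It assumes the scheme of keeping lowercase ASCII letters and digits and replacing every other byte with `-` plus its hex code, which is how I understand KubeSpawner to behave in this chart generation; the `jupyter-` and `home/` prefixes are simply copied from the output above, since the actual templates are not shown in the post.

```python
import string

# Assumption: only lowercase ASCII letters and digits pass through unescaped.
SAFE_CHARS = set(string.ascii_lowercase + string.digits)


def escape_username(name, escape_char="-"):
    """Escape a username so it is safe in pod names and volume sub-paths."""
    out = []
    for ch in name:
        if ch in SAFE_CHARS:
            out.append(ch)
        else:
            # Each UTF-8 byte of an unsafe character becomes "-" + two hex digits.
            out.extend(f"{escape_char}{byte:02x}" for byte in ch.encode("utf-8"))
    return "".join(out)


escaped = escape_username("sidhant.swami")
pod_name = f"jupyter-{escaped}"   # prefix as seen in the pod name above
sub_path = f"home/{pod_name}"     # matches the path= value in the Mounts line
print(pod_name, sub_path)
```

This matters when testing the mount by hand: a test pod that mounts the claim without this sub-path, or with the unescaped username, is not exercising quite the same mount as the user pod.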

For context, efs-app is the test pod I mentioned, which can access the EFS.

Happy to provide any more information needed. Has anyone encountered this issue before? Any help or pointers are much appreciated :)

I think you started needing an EKS plugin after upgrading EKS to some minor version. In the 2i2c-org/infrastructure repository on GitHub (infrastructure for configuring and deploying our community JupyterHubs), you can find some templates of eksctl config files that install the relevant plugin etc. in the eksctl/ folder.

Thanks for the quick response!

Just to understand, what is the context of that project and its documentation? I can't seem to find an 'EKS plugin' anywhere. How would I integrate this into my existing zero-to-jupyterhub-k8s project? I have already upgraded to a 1.27 EKS cluster; is this for additional changes that need to be made for a cluster hosting jhub?

Once again, thanks for your help!