Hello,
I am currently trying to upgrade my Zero to JupyterHub for Kubernetes deployment to chart version 3.2.1 so that it is compatible with my EKS cluster running Kubernetes 1.27. However, user pods time out when they spin up and try to mount the ‘home’ volume. I am using a StorageClass, PV, and PVC for this (see below for the config). Every time a single-user server is spawned, the pod gets scheduled to one of my nodes but then times out with:
2024-01-25T19:33:49.383049Z [Normal] Successfully assigned jhub-blue/jupyter-sidhant-2eswami to <REDACTED>
2024-01-25T19:35:52Z [Warning] Unable to attach or mount volumes: unmounted volumes=[home], unattached volumes=[], failed to process volumes=[]: timed out waiting for the condition
Spawn failed: pod jhub-blue/jupyter-sidhant-2eswami did not start in 300 seconds!
Oddly, when I run a custom pod to test the mount, it works fine; only my JupyterHub pods hit this error. I have sifted through the config, but nothing looks out of the ordinary that would explain the mount timing out. I am not sure how to fix this, since the EFS CSI driver (efs.csi.aws.com) controller and node pods log nothing about the timeout either. Apart from this single-user pod, other pods that access the EFS do so without problems.
I am using static provisioning for a PersistentVolume in Kubernetes: the PersistentVolume configuration references an access point on an existing AWS EFS file system. Here is the config for the PVC, PV, and StorageClass:
Storage Class:
Name: efs-sc
IsDefaultClass: No
Annotations: meta.helm.sh/release-name=aws-efs-csi-driver,meta.helm.sh/release-namespace=kube-system,storageclass.kubernetes.io/is-default-class=false
Provisioner: efs.csi.aws.com
Parameters: directoryPerms=700
AllowVolumeExpansion: <unset>
MountOptions: <none>
ReclaimPolicy: Delete
VolumeBindingMode: Immediate
Events: <none>
Persistent Volume:
Name: efs-pv-blue
Labels: <none>
Annotations: pv.kubernetes.io/bound-by-controller: yes
Finalizers: [kubernetes.io/pv-protection]
StorageClass: efs-sc
Status: Bound
Claim: jhub-blue/efs-claim
Reclaim Policy: Retain
Access Modes: RWX
VolumeMode: Filesystem
Capacity: 5Gi
Node Affinity: <none>
Message:
Source:
Type: CSI (a Container Storage Interface (CSI) volume source)
Driver: efs.csi.aws.com
FSType:
VolumeHandle: fs-123456::fsap-1234abcdedfghijk
ReadOnly: false
VolumeAttributes: <none>
Events: <none>
Persistent Volume Claim:
Name: efs-claim
Namespace: jhub-blue
StorageClass: efs-sc
Status: Bound
Volume: efs-pv-blue
Labels: <none>
Annotations: pv.kubernetes.io/bind-completed: yes
pv.kubernetes.io/bound-by-controller: yes
Finalizers: [kubernetes.io/pvc-protection]
Capacity: 5Gi
Access Modes: RWX
VolumeMode: Filesystem
Used By: efs-app
Events: <none>
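For reference, here is a minimal sketch of manifests that would roughly correspond to the describe output above. It is reconstructed from that output, not copied from my actual manifest files, so treat the field layout as an assumption. The helper names (parse_volume_handle, build_manifests) are made up for illustration, and the IDs are the redacted placeholders shown above. The handle parser assumes the EFS CSI driver's documented handle layout of file system ID, optional subpath, and optional access point ID separated by colons.

```python
import json

# Placeholder handle exactly as it appears in the PV description above.
VOLUME_HANDLE = "fs-123456::fsap-1234abcdedfghijk"


def parse_volume_handle(handle: str) -> dict:
    """Split a handle of the form <fs-id>[:<subpath>[:<access-point-id>]]."""
    parts = handle.split(":")
    if not 1 <= len(parts) <= 3 or not parts[0]:
        raise ValueError(f"unexpected volume handle: {handle!r}")
    parts += [""] * (3 - len(parts))  # pad missing optional fields
    return {
        "file_system_id": parts[0],
        "subpath": parts[1],
        "access_point_id": parts[2],
    }


def build_manifests() -> list:
    """Return PV and PVC manifests mirroring the describe output (a sketch)."""
    pv = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": "efs-pv-blue"},
        "spec": {
            "capacity": {"storage": "5Gi"},
            "volumeMode": "Filesystem",
            "accessModes": ["ReadWriteMany"],
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": "efs-sc",
            "csi": {"driver": "efs.csi.aws.com", "volumeHandle": VOLUME_HANDLE},
        },
    }
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": "efs-claim", "namespace": "jhub-blue"},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": "efs-sc",
            "resources": {"requests": {"storage": "5Gi"}},
        },
    }
    return [pv, pvc]


if __name__ == "__main__":
    # JSON is valid YAML, so this output can be compared against the real manifests.
    print(json.dumps(build_manifests(), indent=2))
    print(parse_volume_handle(VOLUME_HANDLE))
```

With the placeholder handle, the parser yields an empty subpath and the access point ID, which matches the intent of mounting through the access point rather than a path on the file system root.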
Here is the user pod that is failing:
Name: jupyter-sidhant-2eswami
Namespace: jhub-blue
Priority: 0
Priority Class Name: jhub-blue-default-priority
Service Account: default
Node: <REDACTED>
Start Time: Thu, 25 Jan 2024 14:33:49 -0500
Labels: app=jupyterhub
chart=jupyterhub-3.2.1
component=singleuser-server
heritage=jupyterhub
hub.jupyter.org/network-access-hub=true
hub.jupyter.org/servername=
hub.jupyter.org/username=jupyter-sidhant-2eswami
release=jhub-blue
Annotations: hub.jupyter.org/username: sidhant.swami
Status: Pending
IP:
IPs: <none>
Init Containers:
block-cloud-metadata:
Container ID:
Image: quay.io/jupyterhub/k8s-network-tools:3.2.1
Image ID:
Port: <none>
Host Port: <none>
Command:
iptables
--append
OUTPUT
--protocol
tcp
--destination
<REDACTED>
--destination-port
80
--jump
DROP
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment: <none>
Mounts: <none>
Containers:
notebook:
Container ID:
Image: <REDACTED>
Image ID:
Port: 8888/TCP
Host Port: 0/TCP
Args:
jupyterhub-singleuser
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Limits:
cpu: 2
memory: 2147483648
Requests:
cpu: 50m
memory: 536870912
Environment:
CPU_GUARANTEE: 0.05
CPU_LIMIT: 2.0
Mounts:
/home/jovyan/work/ from home (rw,path="home/jupyter-sidhant-2eswami")
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
home:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: efs-claim
ReadOnly: false
QoS Class: Burstable
Node-Selectors: type=main
Tolerations: hub.jupyter.org/dedicated=user:NoSchedule
hub.jupyter.org_dedicated=user:NoSchedule
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
type=main:NoSchedule
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m14s jhub-blue-user-scheduler Successfully assigned jhub-blue/jupyter-sidhant-2eswami to <REDACTED>
Warning FailedMount 11s kubelet Unable to attach or mount volumes: unmounted volumes=[home], unattached volumes=[], failed to process volumes=[]: timed out waiting for the condition
For context, efs-app is the test pod I mentioned above, which can access the EFS.
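To make the comparison with the test pod easier, here is a small sketch that restates the volume wiring of the failing pod as pod-spec fragments. It is derived only from the describe output above. The function names are hypothetical, and the escaping rule (characters outside a-z and 0-9 become a dash followed by the hex of each UTF-8 byte) is an assumption chosen because it reproduces the observed name sidhant-2eswami from sidhant.swami; it is not taken from the spawner's source. The "home/jupyter-" prefix is likewise read off the mount line, not from my Helm values.

```python
import string

# Characters assumed to pass through unescaped.
SAFE_CHARS = set(string.ascii_lowercase + string.digits)


def escape_username(name: str) -> str:
    """Replace each unsafe character with '-' plus the hex of its UTF-8 bytes."""
    out = []
    for ch in name:
        if ch in SAFE_CHARS:
            out.append(ch)
        else:
            out.extend(f"-{byte:02x}" for byte in ch.encode("utf-8"))
    return "".join(out)


def home_volume_fragment(username: str) -> dict:
    """Return the pod-level volume and container-level mount for 'home'."""
    escaped = escape_username(username)
    return {
        # Goes under spec.volumes in the pod.
        "volumes": [
            {"name": "home", "persistentVolumeClaim": {"claimName": "efs-claim"}}
        ],
        # Goes under spec.containers[].volumeMounts for the notebook container.
        "volumeMounts": [
            {
                "name": "home",
                "mountPath": "/home/jovyan/work/",
                "subPath": f"home/jupyter-{escaped}",
            }
        ],
    }


if __name__ == "__main__":
    print(home_volume_fragment("sidhant.swami"))
```

The fragment makes the subPath on the mount explicit, which the describe output only shows as path="home/jupyter-sidhant-2eswami". The spec of efs-app is not shown here, so whether it mounts the claim with or without a subPath is not something this sketch can confirm.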
Happy to provide any more information needed. Has anyone encountered this issue before? Any help or pointers would be much appreciated.