Trouble understanding behavior of autoscaling support

I enabled autoscaling, and I see the 3 user-placeholder pods (as configured). However, when I log in as a new user while my cluster is at its limit (no more pods can be allocated), I don’t see any of the placeholder pods being evicted to make room for the real user. Is this expected?

I’m using AWS EKS. Here’s my configuration for JupyterHub:

    scheduling:
      userScheduler:
        enabled: true
      podPriority:
        enabled: true
        userPlaceholderPriority: 5
      userPlaceholder:
        enabled: true
        replicas: 3

Here are some relevant bits in the log:

Jun 03 15:41:48 jhub-user-scheduler-74954df8b-gn769 user-scheduler: I0603 22:41:47.769570       1 factory.go:1181] About to try and schedule pod jupyter-user15
Jun 03 15:41:48 jhub-user-scheduler-74954df8b-gn769 user-scheduler: I0603 22:41:47.769590       1 scheduler.go:447] Attempting to schedule pod: jhub/jupyter-user15
Jun 03 15:41:48 jhub-user-scheduler-74954df8b-gn769 user-scheduler: I0603 22:41:47.769702       1 scheduler.go:194] Failed to schedule pod: jhub/jupyter-user15
Jun 03 15:41:48 jhub-user-scheduler-74954df8b-gn769 user-scheduler: I0603 22:41:47.769760       1 factory.go:1303] Unable to schedule jhub jupyter-user15: no fit: 0/8 nodes are available: 8 Insufficient cpu.; waiting

By giving the placeholder pods a priority of five, you have made them able to evict the real user pods, which have a priority of 0; it should be the other way around.

Let them have the default priority of -10.
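As a rough sketch, the corrected scheduling section could look like this (field names are from the chart’s scheduling section; the replica count is illustrative):

```yaml
scheduling:
  podPriority:
    enabled: true
    # Either omit userPlaceholderPriority entirely, or set it below the
    # real user pods' priority of 0 (the chart's default is -10), so that
    # real user pods can preempt placeholders and not vice versa.
    userPlaceholderPriority: -10
  userPlaceholder:
    enabled: true
    replicas: 3
```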

Making changes to this priority may require you to pass --force to your helm upgrade command, if I recall correctly.
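For example, the upgrade might look something like this (the release name, namespace, chart version, and values file name here are assumptions; substitute your own):

```shell
# Hypothetical release name, namespace, and version -- replace with yours.
RELEASE=jhub
NAMESPACE=jhub

helm upgrade $RELEASE jupyterhub/jupyterhub \
  --namespace $NAMESPACE \
  --version=0.8.2 \
  --values config.yaml \
  --force   # may be needed when the PriorityClass changes
```

Note that --force deletes and recreates resources that cannot be updated in place, which briefly disrupts the affected pods.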

Related documentation:


Yeah, I suspected that had something to do with it, so I updated the deployment with the priority setting removed (so it defaults to -10). However, I’m observing similar behavior. For example, see the following user pod stuck in Pending state while there are placeholder pods older than it. If I understand the expectation correctly, shouldn’t the jupyter-mXXXn-2eXXXX user pod take over one of the running user-placeholder-X pods?

rai-notebooks-20190524-2333   jupyter-mXXXn-2eXXXX                         0/1     Pending             0          4m23s
rai-notebooks-20190524-2333   user-placeholder-0                                  1/1     Running             0          10m
rai-notebooks-20190524-2333   user-placeholder-1                                  0/1     Pending             0          6m38s
rai-notebooks-20190524-2333   user-placeholder-2                                  1/1     Running             0          21h

It should!

Debugging steps:

    1. Ensure you have Kubernetes 1.11 or higher and Helm 2.12 or higher (a requirement for using the chart, so I figure you do).
    2. Run kubectl get priorityclass (and a describe as well) to ensure a PriorityClass is installed that the pods can reference to get their priority from. The placeholder pods should do this and get the priority of -10.
    3. Run kubectl describe pod user-placeholder-0 and verify that these pods have picked up the new priority. You should find priorityClassName referencing the custom PriorityClass and a priority field set to -10 rather than the default of 0.
    4. Hmmm, I realize now you may need to manually restart these pods: if you changed the PriorityClass after the pods were created, they may still carry the old priority… Yeah, I think the priority is only assigned at pod creation.

NOTE: I left out -n your-namespace in the kubectl commands above; add it.
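The steps above might look like this in practice (the namespace and the PriorityClass name placeholder are assumptions; the pod names come from the listing above):

```shell
NS=your-namespace

# Step 2: check that a PriorityClass exists and inspect its value
kubectl get priorityclass
kubectl describe priorityclass <priorityclass-name>   # name from the output above

# Step 3: verify a placeholder pod actually picked up the priority
kubectl describe pod user-placeholder-0 -n $NS | grep -i priority
# You want to see Priority: -10 and a Priority Class Name referencing that class

# Step 4: recreate the placeholder pods so they pick up the changed PriorityClass
kubectl delete pod user-placeholder-0 user-placeholder-1 user-placeholder-2 -n $NS
```

Deleting the placeholder pods should be safe: they are managed by a controller (a StatefulSet, if I remember right), so they get recreated automatically, this time with the current priority.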