Hub/proxy node affinity

i’m running two jupyterhub instances on aws with identical settings. they both use cluster autoscaler and have three autoscaling groups: one ondemand and two with spot pricing. all groups have a minSize of 0.

in one of the instances the hub and proxy have ended up a spot pricing worker. i would like for them to stay on the ondemand group. is there a known way of forcing the hub/proxy to stay in a certain nodetype?

Here is the full deployment setup if anyone is interested. The configuration information for one instance is here.

Have you looked at using a nodeSelector?
https://zero-to-jupyterhub.readthedocs.io/en/latest/resources/reference.html?highlight=nodeselector#hub

I recommend configuring scheduling.userPods.nodeAffinity.matchNodePurpose=require, like that, you can force user pods and user placeholder pods to schedule on nodes labelled like: hub.jupyter.org/node-purpose=user.

With that said, there is a bug reported about this, so I’m not 100% confident this works as intended for the user pods: Are the docs regarding dedicated node pools correct? · Issue #2561 · jupyterhub/zero-to-jupyterhub-k8s · GitHub

Related

thank @manics . just to clarify. if for one of my node spec’s i add those labels and taints from the core, is that sufficient. i’m listing all the different labels that exists in the spec for different parts. is metadata --> labels and spec --> taints the correct places?

metadata:
  labels:
    kops.k8s.io/cluster: {{ namespace }}.k8s.local
    hub.jupyter.org/node-purpose: core
  name: nodes-us-east-2a
spec:
  cloudLabels:
    k8s.io/cluster-autoscaler/enabled: ""
    k8s.io/cluster-autoscaler/{{ namespace }}.k8s.local: ""
    k8s.io/cluster-autoscaler/node-template/label: ""
...
  nodeLabels:
    on-demand: "true"
...
  taints:
  - on-demand=true:PreferNoSchedule
  - hub.jupyter.org/dedicated=core:NoSchedule

It appears that the “bug” was just user-error :sweat_smile:

1 Like

just a note that after i put the label and taints in, the kubernetes cluster would not finish applying a rolling update. i suspect i’m missing something something else. will continue investigating.

just an update here that i had to change some of the node taints independent of jupyterhub and increase the number of nodes. this keeps the hub/proxy pods on these lighterweight nodes, allowing the spot nodes to spin down. specifically i was setting on-demand=False:preferNoSchedule on these nodes.