Just wondering: has anyone written (or tried to write) a custom Kubernetes scheduler? I’m interested in finding examples and guides on doing this for BinderHub, so any pointers and experience would be welcome.
I’ve found https://banzaicloud.com/blog/k8s-custom-scheduler/ which looks interesting.
Some background on why I am interested in this:
We currently use the default kube-scheduler with a custom config for user pods. The main effect this has is to schedule user pods on the “fullest” node (that has room to spare). This helps a lot with keeping our nodes busy and allows us to scale down nodes that are not needed. This in turn lets us spend less on compute.
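To make the “fullest node that still has room” idea concrete, here is a minimal Python sketch of that selection rule. It is not our actual kube-scheduler config and not how kube-scheduler is implemented; the `Node` class, its fields, and `pick_fullest_node` are hypothetical names made up for illustration, and it only looks at CPU requests.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical, simplified view of a node: CPU only, measured in cores.
    name: str
    cpu_capacity: float
    cpu_requested: float  # sum of CPU requests of pods already on the node


def pick_fullest_node(nodes: List[Node], pod_cpu_request: float) -> Optional[Node]:
    """Return the most-utilised node that can still fit the pod, or None."""
    # Keep only nodes with enough unrequested CPU left for the new pod.
    candidates = [
        n for n in nodes
        if n.cpu_capacity - n.cpu_requested >= pod_cpu_request
    ]
    if not candidates:
        return None
    # Among those, prefer the node with the highest fraction already requested.
    return max(candidates, key=lambda n: n.cpu_requested / n.cpu_capacity)


if __name__ == "__main__":
    nodes = [
        Node("a", cpu_capacity=8, cpu_requested=7.95),  # too full for the pod
        Node("b", cpu_capacity=8, cpu_requested=6.0),
        Node("c", cpu_capacity=8, cpu_requested=1.0),
    ]
    chosen = pick_fullest_node(nodes, pod_cpu_request=0.1)
    print(chosen.name if chosen else "no node fits")
```

Because the rule always prefers the same “fullest” node until it runs out of room, pods launched in quick succession tend to land together.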
One drawback of this strategy is that if a group of, say, 30 people launch the same repo at the same time, there is a high chance they all end up on the same node. Because mybinder.org is used a lot for courses and workshops, chances are that all 30 of them will do things in unison, including running “heavy compute”. That is when we see a high load on the node they are on because, in the interest of keeping our node utilisation up, we over-commit our nodes.
Most of the time no one notices that ~90 users are sharing the 8 cores of a node, because humans spend a lot of time reading and thinking about what the computer just computed for them. The exception is classroom settings with some “heavy” compute.
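As a rough back-of-the-envelope illustration of why the classroom case hurts, the sketch below divides a node’s cores evenly among the users who are computing at the same moment. The even split is a simplifying assumption (real CPU sharing depends on requests, limits and the workload), and `cores_per_active_user` is a hypothetical helper, not something from BinderHub.

```python
def cores_per_active_user(node_cores: float, active_users: int) -> float:
    """Cores available to each user if the node is split evenly among them."""
    if active_users <= 0:
        raise ValueError("active_users must be a positive integer")
    return node_cores / active_users


if __name__ == "__main__":
    node_cores = 8
    # A class of 30 running the same heavy cell at once on one node.
    print(round(cores_per_active_user(node_cores, 30), 2))
    # Worst case: all ~90 users on the node computing simultaneously.
    print(round(cores_per_active_user(node_cores, 90), 2))
```

With 8 cores, a class of 30 working in lockstep gets well under a third of a core each, which is when people start to notice.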