This post documents the resources used by various BinderHub deployments, to make it easier for anyone planning a deployment to estimate how many resources they will need. @nuest recently asked on Gitter what kind of resources he should request for a hub at his institute, so I thought it would be a good idea to write up my response here.
If you operate a BinderHub (private or public), or know of one, and have an estimate of the resources it uses, please post here.
We operate a setup that autoscales, so the number of nodes (computers) in the cluster changes over the course of a day and a week. You can check the size of the cluster and the number of concurrent pods on our public Grafana dashboard.
On average we have around five user nodes, each with 8 CPUs and 52 GB of RAM. The size of this pool of nodes can scale all the way down to zero.
We also operate a “core node” that has 4 CPUs and 26 GB of RAM. This node runs the support services, the BinderHub pods and the JupyterHub pods. It is always up, and user pods are not allowed to run on it.
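To turn the node sizes above into totals you can put in a resource request, here is a minimal Python sketch. The `NodePool` class and its field names are made up for illustration; the only numbers taken from this post are the node counts and sizes, and the user pool count is the rough average, not a fixed value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePool:
    """Hypothetical helper describing one pool of identical nodes."""
    name: str
    nodes: int
    cpus_per_node: int
    ram_gb_per_node: int

    @property
    def total_cpus(self) -> int:
        return self.nodes * self.cpus_per_node

    @property
    def total_ram_gb(self) -> int:
        return self.nodes * self.ram_gb_per_node


# Average figures from this post: about five user nodes plus one core node.
user_pool = NodePool("user", nodes=5, cpus_per_node=8, ram_gb_per_node=52)
core_pool = NodePool("core", nodes=1, cpus_per_node=4, ram_gb_per_node=26)

total_cpus = user_pool.total_cpus + core_pool.total_cpus
total_ram_gb = user_pool.total_ram_gb + core_pool.total_ram_gb

print(f"user pool: {user_pool.total_cpus} CPUs, {user_pool.total_ram_gb} GB RAM")
print(f"whole cluster: {total_cpus} CPUs, {total_ram_gb} GB RAM")
```

Because the user pool autoscales, treat these totals as a typical value rather than a peak; swap in your own node counts to size a smaller hub.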
Our Kubernetes cluster is hosted on GKE. We have been very happy with the service we get: there are hardly ever any issues with the infrastructure. One thing we are becoming more and more convinced of is that operating your own Kubernetes cluster is a full-time job in itself, so outsourcing that is worth the cost.
We typically have hundreds of concurrent user pods running at any given moment, and total nearly 100,000 launches per week.
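For a rough feel of what that launch volume means, and how you might scale the numbers down to your own hub, here is a small Python sketch. The weekly launch figure is the one quoted above, rounded up to 100,000. The `nodes_needed` function and the example inputs to it (50 concurrent users, 2 GB of RAM per user pod) are assumptions chosen purely for illustration, not measurements from our deployment. It also sizes by memory only, ignoring CPU and per-node overhead.

```python
import math

LAUNCHES_PER_WEEK = 100_000  # "nearly 100,000" from this post, rounded up
HOURS_PER_WEEK = 7 * 24

launches_per_hour = LAUNCHES_PER_WEEK / HOURS_PER_WEEK
launches_per_minute = launches_per_hour / 60

print(f"about {launches_per_hour:.0f} launches per hour")
print(f"about {launches_per_minute:.1f} launches per minute")


def nodes_needed(concurrent_pods: int, ram_gb_per_pod: float, ram_gb_per_node: float) -> int:
    """Hypothetical memory-only estimate of how many user nodes a hub needs."""
    pods_per_node = math.floor(ram_gb_per_node / ram_gb_per_pod)
    if pods_per_node < 1:
        raise ValueError("a single pod does not fit on one node")
    return math.ceil(concurrent_pods / pods_per_node)


# Assumed inputs for a small institutional hub; replace with your own estimates.
example_nodes = nodes_needed(concurrent_pods=50, ram_gb_per_pod=2, ram_gb_per_node=52)
print(f"example hub: {example_nodes} user node(s) of 52 GB")
```

The launch rate is an average over the whole week, so expect the busy hours to be noticeably higher than this.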