This is a post describing what the people who operate mybinder.org actually do. My hope in writing this down is to make that work less opaque and more accessible.
The single most frequent thing I do as an operator is visit the binder launch success rate chart (slightly different view of same metric). We want the green line (hidden behind the yellow one) to be at 100%. If it is significantly below that, or zigzaggy, something is wrong. Then I check how many user pods are running. All lines except for the blue one should be “basically zero”; otherwise something is wrong.
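The two checks above can be written down as a small rule of thumb. This is only a sketch of the reasoning, not how mybinder.org is actually monitored: the thresholds, the `Snapshot` type, and the idea that the blue line corresponds to pods in the "Running" phase are all assumptions made for illustration, and the numbers would be read off the charts by hand.

```python
from dataclasses import dataclass

# Hypothetical thresholds: the post only says "100%", "zigzaggy" and "basically zero".
MIN_SUCCESS_RATE = 0.95   # lowest acceptable launch success rate (fraction)
MAX_SPREAD = 0.10         # largest acceptable swing between recent samples
MAX_OTHER_PODS = 5        # what "basically zero" might mean in practice


@dataclass
class Snapshot:
    """Values an operator would read off the two charts (hypothetical type)."""
    recent_success_rates: list[float]  # fractions between 0 and 1, oldest first
    pods_by_phase: dict[str, int]      # assumed: the blue line is "Running"


def problems(snapshot: Snapshot) -> list[str]:
    """Return a list of human-readable warnings; empty means all looks fine."""
    found = []
    rates = snapshot.recent_success_rates
    if rates:
        if min(rates) < MIN_SUCCESS_RATE:
            found.append(f"launch success rate dropped to {min(rates):.0%}")
        if max(rates) - min(rates) > MAX_SPREAD:
            found.append("launch success rate is zigzagging")
    for phase, count in snapshot.pods_by_phase.items():
        # Every line except the running pods should stay close to zero.
        if phase != "Running" and count > MAX_OTHER_PODS:
            found.append(f"{count} pods in phase {phase}")
    return found


if __name__ == "__main__":
    snapshot = Snapshot([1.0, 0.99, 1.0], {"Running": 300, "Pending": 1})
    print(problems(snapshot) or "looks healthy")
```

An empty list means there is nothing to do; anything else is a cue to start looking more closely.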
Another common task is noticing that a new Pull Request has been merged in the BinderHub or repo2docker repositories. We try to deploy changes from these two repositories “as soon as reasonably possible”, which means within a few days. To update mybinder.org we follow the SRE guide on deploying changes. Usually “nothing happens” when we do this. Life just goes on. Sometimes, however, things go wrong or there is a bug in the new version we just deployed. In that case you need to decide whether you have a bit of time to debug it. If you don’t, revert the Pull Request with your changes. Breakage is rare, and when things do go wrong, reverting the PR recovers things. This means that the risk of breaking mybinder.org is small. (Which is a great thing to know considering about 100k binders are launched each week…)
There are more tasks, like updating the version of Kubernetes we use, but they are less frequent so I won’t describe them here (yet).
Becoming an Operator
If you enjoy the feeling of running a service “at scale” or want to learn more about the inner workings of mybinder.org, you can become one of The Operators (yes, I think we should capitalise it like that).
To get started, all you need is a GitHub account and familiarity with making Pull Requests. Everything is controlled from https://github.com/jupyterhub/mybinder.org-deploy/ and you won’t have to type any commands into a terminal, as all of that is automated.
If you have questions, please post in this thread and I’ll try to answer them.