Hey all - I’ve updated the binder data repository with a few enhancements:
- I’ve removed old/outdated data sources (like Google Analytics)
- I’ve added a new folder that scrapes and aggregates Binder data from archive.analytics.mybinder.org
- I’ve added a GitHub workflow that runs each week and publishes the aggregated data as a GitHub release
- I’ve added another folder that has a notebook to provide some simple visualizations of this data to show it off
- I’ve also set this up as a MyST site so that people can view the Binder launch data quickly.
I updated the READMEs etc of the repository so that it’s easy enough to follow along, but please let me know if you have questions. Hopefully this makes the repository a little bit more useful for Binder!
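For anyone curious what the aggregation step looks like in spirit, here is a minimal sketch. It is not the repository's actual code. It assumes each archive file is JSON Lines with one launch event per line, and that each event has an ISO 8601 `timestamp` field; the function names and the sample events are made up for illustration.

```python
import json
from collections import Counter
from datetime import datetime


def parse_events(lines):
    """Yield one dict per non-empty JSON Lines record."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def launches_per_week(events):
    """Count launch events per ISO week, keyed as (iso_year, iso_week)."""
    counts = Counter()
    for event in events:
        # Assumption: "timestamp" is an ISO 8601 string with a UTC offset.
        ts = datetime.fromisoformat(event["timestamp"])
        iso = ts.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return dict(sorted(counts.items()))


# Hypothetical sample standing in for the contents of one downloaded file.
sample = [
    '{"timestamp": "2025-01-06T12:00:00+00:00", "provider": "GitHub"}',
    '{"timestamp": "2025-01-07T09:30:00+00:00", "provider": "GitHub"}',
    "",
    '{"timestamp": "2025-01-13T08:00:00+00:00", "provider": "GitLab"}',
]

weekly = launches_per_week(parse_events(sample))
print(weekly)
```

In practice you would feed `parse_events` the lines of each downloaded file instead of the hard-coded sample, and then merge the per-file counts.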
Repository: GitHub - jupyterhub/binder-data: A place to store data for Binder
JB2/MyST site: MyBinder Analytics Data Analysis - MyBinder Analytics Report
In case anybody is curious why there’s a drop in traffic, here’s a very rough timeline of the launches:
Essentially, we’ve seen big drops in traffic because Google Cloud cut our credit allocation, and as a result we had to significantly scale back the capacity of mybinder.org. We moved many repositories over to JupyterLite and also capped the number of sessions in general. This significantly shrank the overall capacity of Binder and also reduced its perceived reliability, which lowered demand. In 2025 we spun up two Hetzner hubs, and this has stabilized Binder quite a lot.
Also, SHOUT OUT TO GESIS, who have been persistently supporting Binder with capacity for many years now. They are unsung heroes of this project.
I did this work because I wanted to write a little 2i2c blog post to highlight the launches on the 2i2c Binder federation hub. It was hard to scrape this data with our current tooling, so I decided to improve this in the binder-data repository so others can benefit from it too!