If you are planning on using mybinder.org as part of your tutorial, workshop, live demo, etc., please do post here. Awesome! Why post here? PyCon is huge and we’d like to make sure we are prepared for any thundering herds so that you, the participants, and the operators of mybinder.org have a great time (and keep the stress low in everyone’s life).
Some things to keep in mind:
There is a limit of 100 live instances per repo. Once you reach it, new users will get a “sorry we are full” message.
For every 70 (or so) additional live instances we need to scale up our cluster, which takes about 10 minutes. For example, if the cluster is at 400 live instances in total right now, then at 471 we will need to scale the cluster up. During scale-up everyone has to join the waiting queue.
PyCon has wifi, but we all know how well conference wifi works when you really need it. Have a backup plan.
mybinder.org is generally a stable service. However, we have little experience with events where a keynote speaker asks everyone to launch a binder. Those are the black swan events. The operators are volunteers and might be sleeping, on a plane, or generally not at their computers. Have a plan for when it takes a while for service to be restored in case of an outage.
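The limits above lend themselves to a quick back-of-the-envelope check before a session. The sketch below is only an illustration: it uses the numbers quoted in this post (100 per repo, roughly 70 instances per scale-up, about 10 minutes each), and it assumes that scale-ups happen one after another and that the cluster has a full step of headroom unless you say otherwise. The function and parameter names are made up for this example and are not part of any Binder tooling.

```python
import math

PER_REPO_LIMIT = 100      # from the post: live instances allowed per repo
INSTANCES_PER_STEP = 70   # from the post: roughly one scale-up per 70 extra instances
SCALE_UP_MINUTES = 10     # from the post: approximate duration of one scale-up


def estimate_launch(current_live, attendees, headroom=INSTANCES_PER_STEP):
    """Rough estimate of what happens when `attendees` launch the same repo.

    `headroom` is a hypothetical parameter: how many more instances fit
    before the next scale-up. The example in the post implies a full step.
    """
    admitted = min(attendees, PER_REPO_LIMIT)
    turned_away = attendees - admitted
    # Instances beyond the remaining headroom trigger scale-ups.
    overflow = max(0, admitted - headroom)
    # Assumption: scale-ups run one after another, 70 instances at a time.
    scale_ups = math.ceil(overflow / INSTANCES_PER_STEP)
    return {
        "total_live_after": current_live + admitted,
        "admitted": admitted,
        "turned_away": turned_away,
        "scale_ups": scale_ups,
        "worst_case_wait_minutes": scale_ups * SCALE_UP_MINUTES,
    }


if __name__ == "__main__":
    # A 50-person tutorial fits in the headroom; 71 launches tip the cluster over.
    print(estimate_launch(current_live=400, attendees=50))
    print(estimate_launch(current_live=400, attendees=71))
```

If the numbers look tight for your session, that is a good reason to post your plans here ahead of time.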
If you let us know your plans we can work something out together or provision a cluster with more headroom during PyCon or …
Be awesome and thanks for helping us help you help others learn more about Python!
I’ll be at PyCon Wed-Sun (and I have Jupyter stickers). I’ll be teaching Thursday morning, 6-9:20 Pacific, 9-12:20 local time; not planning on using Binder.
I’m presenting a tutorial 1-4 on Wednesday:
“Analyzing census data with pandas”.
I plan on using Binder for it, fingers crossed for the wifi.
It’s 50 people max.
I have a backup plan, but I really, really want to use mybinder.
It went great for me! The wifi was great in my room, so no hiccups there. Everyone seemed to like it, especially not having to install anything. Some people didn’t know about mybinder and I took a minute to explain how great it is.
Mine went fantastic, no hiccups, and I think 90% of people had everything loaded about 5 minutes into the tutorial. Binder is such an awesome service and I’m happy to tell any funding agencies the same thing if you ask.
Most of the problems I had were people using Jupyter locally, which was fixed by “try opening the binder”. I forgot to suggest that they try downloading their work materials, but either people figured it out or they didn’t care.
The only time I had a problem I couldn’t solve was someone whose corporate laptop was configured to block both github and mybinder.org, and there wasn’t much I could do about it.
Thursday afternoon (1:20pm-4:40pm EDT) at PyCon, I’ll be teaching Data Science Best Practices with pandas. My repo has a Binder link, though I assume usage will be relatively low since I’ve been reminding people again and again to make sure everything is installed ahead of time! (I didn’t want to assume good wifi or require people to use Jupyter notebook for the tutorial, thus Binder is what I’m recommending as a backup option.)
A small extension to install/try out for that might be nbzip (GitHub: data-8/nbzip), which zips and downloads all the contents of a Jupyter notebook. It gives you a button to get a big old zip file. There is also a discussion/idea brewing to make it so that you can save/download your notebook as a notebook for as long as your tab is open (even if the binder instance has been culled). It needs some help/someone with spare time to implement it. Would be huge to have this.
I think I’ll keep following the PyCon tutorials “live” by watching our monitoring then.
Love this thread! Can’t wait to go through the links and see what I missed.
General question about Binder in regard to this (planning for the future): what is the difficulty/cost of setting up an instance for a weekend to handle heavy demos?
Like, if I had a budget of $500-1000 and a day or two to set it up so that 500 simultaneous users could get a demo with almost no lag (assuming good wifi), do you think that would be possible?
@betatim I’m just wondering about what kinds of options open up when Jupyter doesn’t have to foot the Google Cloud bill for things like this. If I’m planning a conference, can I budget for something like this to handle the capacity? Would it have to be more? Just trying to get a sense for what’s possible and what it would require.
I think, order-of-magnitude-wise, it costs a couple of hundred dollars per day in cloud costs to run mybinder.org. This serves between 300 and 450 concurrent sessions “at all times”, so at any moment in time you’d find about that many people using mybinder.org.
How much more or less compute resources you’d need depends heavily on what it is people are actually doing. So it is a bit tricky to say.
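To put very rough numbers on the 500-user weekend question, here is a small sketch of the arithmetic. It assumes that “a couple of hundred dollars” means about $200 per day and that cost scales linearly with concurrent sessions, which, as said above, depends heavily on what people actually run. The function name and defaults are invented for this example.

```python
def weekend_cost_range(users, days, daily_cost=200.0,
                       sessions_low=300, sessions_high=450):
    """Return a (low, high) cloud cost estimate in dollars.

    Assumptions, not measurements: `daily_cost` is a guess at "a couple
    of hundred dollars", and cost is taken to scale linearly with the
    number of concurrent sessions.
    """
    # Cheapest case: the daily bill was spread over the most sessions.
    per_session_day_low = daily_cost / sessions_high
    # Priciest case: the daily bill was spread over the fewest sessions.
    per_session_day_high = daily_cost / sessions_low
    session_days = users * days
    return (session_days * per_session_day_low,
            session_days * per_session_day_high)


if __name__ == "__main__":
    low, high = weekend_cost_range(users=500, days=2)
    print(f"estimated cloud cost: ${low:.0f} to ${high:.0f}")
```

With these assumptions, 500 concurrent users for two days comes out at roughly $440 to $670 in cloud costs alone, so a $500-1000 budget looks like the right ballpark, not counting the setup time.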
Time to set up a BinderHub from scratch? If you’ve been doing it regularly and it is for a “throwaway” hub (no customisations, minimal monitoring, skipping all the things you need for long-term ops): around a day. If you haven’t done it before, maybe a few days? I think we could improve the documentation and tooling a bit if we had more people setting up and tearing down hubs.
The limit of 100 users per repo is mostly arbitrary. We have to draw a line somewhere to stop runaway trains, and most of the time no one hits this limit. It seems no one hit the limit in this case either. A while back we did a load test where 200 simulated users clicked the same launch link at the same time. It mostly went fine, but because the cluster had to add a new machine it took ~10 min for everyone to have an instance. With a bit of prior warning/coordination you could provision the machines ahead of time so that it is faster.
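For anyone curious what a “many simulated users at once” test looks like in shape, here is a minimal sketch. It is not the script used for the load test mentioned above: `fake_launch` is a hypothetical stub that only sleeps, so the example runs without touching mybinder.org or any other service. A real test would replace the stub with an actual launch request, and should be coordinated with the operators first.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_launch(user_id):
    """Hypothetical stand-in for one user clicking a launch link.

    A real test would request the launch URL here; this stub only sleeps
    so the sketch can run without contacting any service.
    """
    started = time.monotonic()
    time.sleep(0.01)  # pretend the instance takes a moment to start
    return user_id, time.monotonic() - started


def run_load_test(n_users, launch=fake_launch):
    """Start n_users launches at once and report how long they took."""
    t0 = time.monotonic()
    # One thread per simulated user so all launches begin together.
    with ThreadPoolExecutor(max_workers=n_users) as pool:
        results = list(pool.map(launch, range(n_users)))
    total = time.monotonic() - t0
    slowest = max(duration for _, duration in results)
    return {
        "users": len(results),
        "slowest_seconds": slowest,
        "total_seconds": total,
    }


if __name__ == "__main__":
    print(run_load_test(200))
```

The number to watch is the time until the slowest user has an instance, since that is what grows when the cluster has to add a machine.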