PyCon 2019 and mybinder.org

Hello Cleveland! Hello PyCon 2019!

If you are planning on using mybinder.org as part of your tutorial, workshop, live demo, etc., please do post here. Awesome! Why post here? PyCon is huge, and we’d like to make sure we are prepared for any thundering herds so that you, the participants, and the operators of mybinder.org all have a great time (and keep the stress low in everyone’s life).

Some things to keep in mind:

  • there is a limit of 100 live instances per repo; once you reach it, new users will get a “sorry, we are full” message
  • every 70 (or so) additional live instances we need to scale up our cluster, which takes about 10 minutes. If the cluster is at 400 live instances in total right now, then at 471 we will need to scale it up. During a scale-up, everyone has to join the waiting queue.
  • PyCon has wifi, but we all know how well conference wifi works when you really need it. Have a backup plan.
  • mybinder.org is generally a stable service. However, we have little experience with “keynote speaker asks everyone to launch a Binder”-type events; those are the black swan events. The operators are volunteers and might be asleep, on a plane, or generally away from their computers. Have a plan for the case where it takes a while for service to be restored after an outage.

If you let us know your plans we can work something out together or provision a cluster with more headroom during PyCon or …

Be awesome and thanks for helping us help you help others learn more about Python! :smiley:


I’ll be at PyCon Wed-Sun, and I have Jupyter stickers. I’ll be teaching Thursday morning, 6-9:20 Pacific (9-12:20 local time); not planning on using Binder.

See you there !


I’m presenting a tutorial 1-4 on Wednesday: “Analyzing census data with pandas”. I plan on using Binder for it; fingers crossed for the wifi. It’s 50 people max.

I have a back up plan but I really really want to use mybinder


Do you have a link to the repo already?

I’m planning to use Binder with my tutorial Dealing with Datetimes, Wednesday, May 1st from 13:00-16:00 EDT (UTC-04:00).

There are 51 people registered; I’m not sure whether more people will register.

Thanks to the binder team for providing this excellent service, and for being so helpful!




How did it go @pganssle and @chekos? Any thoughts and feedback welcome.


It went great for me! The wifi was great in my room, so no hiccups there. Everyone seemed to like it: not having to install anything, and so on. Some people didn’t know about mybinder, and I took a minute to explain how great it is :fire:


Mine went fantastically: no hiccups, and I think 90% of people had everything loaded about 5 minutes into the tutorial. Binder is such an awesome service, and I’m happy to tell any funding agencies the same thing if you ask :wink:.

Most of the problems I had were people using Jupyter locally, which was fixed by “try opening the binder”. I forgot to suggest that they try downloading their work materials, but either people figured it out or they didn’t care.

The only problem I couldn’t solve was someone whose corporate laptop was configured to block both GitHub and mybinder.org, and there wasn’t much I could do about that.


hi there! Eric Ma & I taught a workshop today and a lot of the learners used Binder and it blew their minds! I also used Binder as an instructor.

repo here:

pycon tutorial page here:


Thursday afternoon (1:20pm-4:40pm EDT) at PyCon, I’ll be teaching Data Science Best Practices with pandas. My repo has a Binder link, though I assume usage will be relatively low, since I’ve been reminding people again and again to make sure everything is installed ahead of time! (I didn’t want to assume good wifi or require people to use Jupyter notebook for the tutorial, so Binder is what I’m recommending as a backup option.)
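For anyone adding a Binder link to their repo like the posts above describe: a mybinder.org launch link follows a fixed URL pattern, so a short sketch can build one from repo coordinates (the owner/repo/branch values below are placeholders, not any repo from this thread):

```shell
#!/usr/bin/env sh
# Build a mybinder.org launch URL from GitHub repo coordinates so you can
# hand participants one short link. OWNER/REPO/REF are placeholder values.
OWNER=owner
REPO=repo
REF=master
echo "https://mybinder.org/v2/gh/${OWNER}/${REPO}/${REF}"
```

Putting the same URL behind a badge in the repo’s README makes it easy for participants to find during a tutorial.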


Fantastic to hear that all went well! Thank you :smiley:

A small extension to install/try out for that might be GitHub - data-8/nbzip: it zips and downloads all the contents of a Jupyter notebook server, giving you a button to get a big old zip file. There is also a discussion/idea brewing to let you save/download your notebook for as long as your tab is open (even if the Binder instance has been culled). It needs some help/someone with spare time to implement it. It would be huge to have this.
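If you want to try nbzip in a Binder repo, a `postBuild` file along these lines should do it — a sketch based on nbzip’s own install instructions (versions left unpinned here for brevity; pin them in a real repo):

```shell
#!/bin/bash
# postBuild sketch: install nbzip and enable its server extension and
# notebook extension, so the running notebook gets a button that
# downloads everything as one zip file.
pip install nbzip
jupyter serverextension enable --py nbzip --sys-prefix
jupyter nbextension install --py nbzip --sys-prefix
jupyter nbextension enable --py nbzip --sys-prefix
```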

I think I’ll keep following the PyCon tutorials “live” by watching our monitoring then :smiley:

Thanks for the praise!


Might pop around later and say hi :wave:t3:

This thread makes me happy :slight_smile:


love this thread! can’t wait to go through the links and see what I missed.

General question about Binder in regards to this (planning for the future), what is the difficulty/cost of setting up an instance for a weekend to handle heavy demos?

Like, if I had a budget of $500-1000 and a day or two to set it up so that 500 simultaneous users could get a demo with almost no lag (assuming good wifi), do you think that would be possible?

@betatim I’m just wondering about what kinds of options open up when Jupyter doesn’t have to foot the Google Cloud bill for things like this. If I’m planning a conference, can I budget for something like this to handle the capacity? Would it have to be more? Just trying to get a sense for what’s possible and what it would require.

I think, order-of-magnitude wise, it costs a couple of hundred dollars per day in cloud costs to run. This serves between 300 and 450 concurrent sessions “at all times”, so at any moment in time you’d find about that many people using mybinder.org.

How much more or less compute resources you’d need depends heavily on what it is people are actually doing. So it is a bit tricky to say.

Time to set up a BinderHub from scratch? If you’ve been doing it regularly and it is for a “throw away” hub (no customisations, minimal monitoring, skipping all the things you need for long-term ops): around a day. If you haven’t done it before, maybe a few days. I think we could improve the documentation and tooling a bit if we had more people setting up and tearing down hubs.
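The rough shape of such a from-scratch deploy, assuming a Kubernetes cluster and Helm are already in place — the release/namespace names, version, and config files below are placeholders, so treat this as a sketch and follow the BinderHub deployment docs for the real steps:

```shell
# Add the JupyterHub/BinderHub Helm chart repository and deploy a hub.
# secret.yaml and config.yaml hold your registry credentials and
# BinderHub settings; the chart version shown is a placeholder.
helm repo add jupyterhub https://jupyterhub.github.io/helm-chart
helm repo update
helm install jupyterhub/binderhub \
  --version=<chart-version> \
  --name=my-binderhub \
  --namespace=my-binderhub \
  -f secret.yaml -f config.yaml
```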

The limit of 100 users per repo is mostly arbitrary. We have to draw a line somewhere to stop runaway trains, and most of the time no one hits this limit; it seems no one hit it in this case either. A while back we did a load test where 200 simulated users clicked the same launch link at the same time. It mostly went fine, but because the cluster had to add a new machine, it took ~10 minutes for everyone to have an instance. With a bit of prior warning/coordination you could provision the machines ahead of time so that it is faster.
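A load test like the one described above can be sketched as a dry run. The `/build/` path is BinderHub’s event-stream launch endpoint; the repo coordinates and user count here are placeholders, and if you ever run it for real, coordinate with the operators first:

```shell
#!/usr/bin/env sh
# Dry-run sketch of 200 simulated users hitting the same launch link at
# once. It prints the curl commands (and counts them) rather than
# executing anything; drop the echo to actually fire the requests.
BUILD_URL="https://mybinder.org/build/gh/owner/repo/master"
N=200
for i in $(seq 1 "$N"); do
  echo curl -sN --max-time 600 "$BUILD_URL"
done | wc -l
```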

TL;DR: yes that would work.
