IPython Parallel

Thanks to Bodo (a new distributed compute framework for data processing and ML based on MPI), I have some funding for the rest of this year to improve IPython Parallel and bring the repo back to a healthier state. My focus will be on features related to steerable MPI-style code, fault tolerance, and scaling: for example, interrupting and restarting engines, and managing clusters from a Python API instead of the ipcluster script. A big revamp of the docs is also planned. Some of this has already landed!

A first pre-release is available. It includes a new prototype Broadcast Scheduler, developed as Tom-Olav Bøyum’s Master’s thesis at the University of Oslo last year, which significantly improves scaling for “do on all” (SPMD-style) tasks.
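To give a rough feel for why “do on all” is a scaling problem, here is a toy, standard-library-only sketch. It does not use IPython Parallel and is not the Broadcast Scheduler's implementation. It assumes, purely for illustration, that a message can either be sent by one scheduler directly to every engine or be relayed through a tree of forwarders. The function names and the `branching` parameter are hypothetical.

```python
def flat_fanout(n_engines):
    """One scheduler sends the task directly to every engine."""
    return {"max_sends_per_node": n_engines, "hops": 1}


def tree_fanout(n_engines, branching=2):
    """Hypothetical relay tree: each node forwards to at most `branching` children.

    Assumption for illustration only; the post does not describe how the
    prototype Broadcast Scheduler distributes messages.
    """
    hops = 0
    reach = 1  # number of recipients reachable after `hops` levels
    while reach < n_engines:
        reach *= branching
        hops += 1
    return {
        "max_sends_per_node": min(branching, n_engines),
        "hops": max(hops, 1),
    }


# Compare the busiest node's workload as the cluster grows.
for n in (16, 256, 1024):
    print(n, flat_fanout(n), tree_fanout(n))
```

Under these assumptions, the busiest node in the flat layout does work proportional to the number of engines, while in the tree layout the per-node work stays constant and only the number of hops grows, logarithmically.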

If you’ve been using IPython Parallel, or want to help out, review, or give feedback, now is a great time to get involved. We currently meet weekly on Mondays at 15:00 CEST / 09:00 EDT, and you are welcome to join.

About Bodo, in their own words: Bodo is the next generation distributed data and AI compute platform for extreme scale and speed. Bodo’s powerful compute engine combines the simplicity and flexibility of native Python with true parallelism and high efficiency for large-scale data analytics and machine learning applications. Bodo’s vision for simple and efficient interactive computing at large scale aligns with IPython Parallel perfectly.