Load testing and telemetry recommendation

kevin-bates · January 2, 2023, 11:44pm

Hi @luong-komorebi - Thank you for posting these questions. I had been hoping someone (besides me) would respond by now but, at this point, I feel you deserve some kind of response, although it’s probably not the one you were hoping for. Short answer: we (EG) don’t have any tools to measure scalability, but could certainly use them!

Do we (the community, the maintainers) have any performance benchmark or statistics for jupyter enterprise gateway?

No. As you reference in your link, that post morphed into an issue that included a possible scale test, but it never made its way into a contribution.

What is the biggest scale that you (a member of the community, a maintainer) have run jupyter enterprise gateway on? What kind of bottlenecks have you seen and how do you resolve them?

I’ve been made aware of at least two deployments hosting thousand-plus active instances. One, running 1500 instances, had encountered thread and ZMQ socket limitations that were addressed via environment variable configurations. I don’t know whether these changes improved the situation, but that was before other server instances were introduced.

What kinds of tools and patterns can we reliably use to load test and find bottlenecks with jupyter enterprise gateway?

In non-open-source settings, this kind of thing is typically handled by a QA/Performance team. I personally don’t have experience with these kinds of tools, so I’m unable to answer this question at this time.

I completely agree that we could use improvements in this area. As noted in your linked post, we should also try to tackle burst-request situations. As is typically the case for these kinds of topics, they become a time-resource issue whose priority tends to get minimized in the grand scheme of things, and they fall off the back of the wagon. One of the hurdles to overcome is that these kinds of “tuning” exercises vary by deployment, so it may be that organizations have applied scalability tests and improvements to suit their specific environments and we never hear back.
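
To make the burst-request case a bit more concrete, here is a rough sketch of what a minimal probe could look like. This is not an existing EG tool, just an illustration using the Python standard library. It assumes the gateway exposes the usual Jupyter `POST /api/kernels` endpoint without authentication, and the names `start_kernel` and `burst`, the `python3` kernelspec, and the base URL in the usage note are all placeholders rather than anything EG ships.

```python
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def start_kernel(base_url, kernel_name="python3", timeout=60.0):
    """Ask the gateway to start one kernel and return its id.

    Assumes an unauthenticated POST /api/kernels endpoint; the kernelspec
    name is a placeholder for whatever the deployment provides.
    """
    body = json.dumps({"name": kernel_name}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/kernels",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["id"]


def burst(start_fn, n_requests, max_workers=None):
    """Call start_fn n_requests times concurrently and summarize the outcome."""

    def timed_call(_):
        # Time a single call; keep the exception instead of aborting the burst.
        t0 = time.perf_counter()
        try:
            start_fn()
            return time.perf_counter() - t0, None
        except Exception as exc:
            return time.perf_counter() - t0, exc

    # One worker per request by default, so all requests are issued together.
    with ThreadPoolExecutor(max_workers=max_workers or n_requests) as pool:
        results = list(pool.map(timed_call, range(n_requests)))

    latencies = sorted(t for t, err in results if err is None)
    failed = sum(1 for _, err in results if err is not None)
    return {
        "requested": n_requests,
        "succeeded": len(latencies),
        "failed": failed,
        "median_s": median(latencies) if latencies else None,
        "max_s": latencies[-1] if latencies else None,
    }
```

Something like `burst(lambda: start_kernel("http://localhost:8888"), 50)` would then fire 50 simultaneous kernel-start requests at a hypothetical local gateway and report how many succeeded and how long they took. Note that the sketch does not shut the kernels down afterwards, so a real test would also need a cleanup step (for example `DELETE /api/kernels/{id}`), and the numbers will depend heavily on the deployment.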

It would be great if we could spend some time and focus on this. I think opening an issue or a discussion item while tagging some of the previous authors could prove useful - assuming they’d be willing to further share their experiences.

Thanks again for raising this discussion!

Kevin.
