Mars Insight Mission needs help with a problem

We are looking for information concerning performance problems that have emerged recently.

We have a complicated application used to analyze data from the Insight mission. It is written as a Jupyter Notebook and served through JupyterHub. Both Notebook and JupyterHub are old versions dating from around 2016. Recently the application has slowed down to the point that it is unusable. We have tested with several client browsers (Chrome, Safari, Firefox, etc.) and find that the JavaScript served from Jupyter is running slowly.

We are considering upgrading the Jupyter software but don’t want to do so unless necessary. If anyone has experienced a similar problem (and particularly if you know the cure) please let us know.

Thanks,

Steve

Love to help!

served from Jupyter

As in, they are slow to be served or slow once in the browser?

upgrading

Welp, we’re just an open source project trying to tap dance along… but there have been some significant performance improvements to both the Jupyter Server (which is now more capable at handling simultaneous requests) and JupyterLab (which can be coaxed into only rendering visible on-page content) released within the last year.

  • What kernel(s) are in play?
  • Are they making a lot of outputs? Videos?
  • What’s the scale of the notebook? hundreds of cells? thousands of cells?
  • Are any nbextensions in play? A lot of them have somewhat… undefined behavior, especially when next to other extensions.

Feel free to reach out if you’d want to arrange a rubber duck session. Heck, as SciPy sprints are this weekend, I’d wager “support active extraplanetary mission” might get some interest if we can work up some representative examples.

Nicholas,

Thanks. We have found a problem with our database that might be the cause of the slow execution so we are running that down. If that doesn’t fix the problem we’ll dig into the Jupyter code and that’s where we will need some help.

Initially I’m interested in knowing whether others are experiencing performance problems that are related to browser security or other technology changes to the Internet in recent years. This is because I am weighing whether we should attempt an upgrade. Please let me know if others have reported problems of this nature.

Steve

———————————

A little bit about us

Our segment of the Insight mission is called the Mars Weather Service (MWS), and we collect and analyze data such as wind speed and direction, atmospheric pressure and the intensity of light received at the surface. There are many scientists who use our software but there are only two developers. I can say without a doubt that if we had not based the application on Jupyter Notebook and JupyterHub the job would have been impossible for such a small group.

My initial interest in Jupyter was to fulfill a requirement to allow users to specify methods to filter data, including transforming and combining streams of data. There was also an implicit requirement that the user be able to accomplish this without doing any programming. I thought I would be able to use the kernel communication protocol to accomplish this and expected to write a front end from scratch, but I discovered there were many features of Notebook that I could leverage to make my task easier. As it turns out, nobody used the filtering functionality, but nonetheless the project benefited greatly from using Jupyter.


Screenshot of GUI

The top pane with the green “Load” button shows data for 10 Sols (Martian days) beginning with Sol 700 (the seven hundredth day after landing). Note the periodic nature of the data, which shows diurnal variations. Toward the center of the pane is a brush widget (just above the 7 on 700); this is used to select a subset of the data. Below that is another pane that shows the data that was selected from the top pane. This pane contains another brush used to select the data shown in the lower panes. Data selection was done this way to allow the user to maintain a sense of the data in its entirety while zooming in on the finest detail. There are many other functions which are hidden under the venetian blind widget such as “Filter Controls”, “Log Viewer”,…, etc.

Thanks for the screenshot!

Are those SVG or PNG-based output? I could imagine 933 martian sols in, there might be more data…

Nicholas,

We generate the graphics on the fly from data stored in the DB.

BTW: I had some trouble posting by email and so have edited the posting, so it may have more content than when you first viewed it. Most of the text and one of the graphics were deleted initially. Something about “new users are only allowed to post one emblem”. Below is another graphic that was in the original message.

logo

Very exciting to see that use of Jupyter! And I love that logo :smiley:

If fixing the DB slowness doesn’t resolve the issue, you may want to open up the websocket in your browser’s network tab and check whether content is being sent to the browser in a timely manner, to rule out pressure on the server / kernel side of things slowing it down. With that many visuals it might be a bit noisy, so you may want to identify a particular graph that’s acting slow and see when the data for just that element appears on the wire. My guess is that the worsening of performance is either data reaching the browser later or data growth reaching memory boundaries in the kernel or the browser.

In my experience, I’d not expect web tech changes in the past few years to significantly slow down notebook code that has been running smoothly up til now, though it’s certainly possible that a certain graphic type could be to blame as it gets used more.

Right, but are the graphics sent to the browser in the notebook:

  • bitmaps, e.g. png (which should always be the same size)
  • vectors, e.g. svg (which would get bigger every day)
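To make the bitmap vs vector distinction concrete, here is a rough sketch (not the mission's code, and not how bqplot builds its output). It assumes a made-up sine-wave series and a hypothetical 800x200 plot, and compares the byte size of a bare SVG polyline against the uncompressed size of a raster of the same dimensions. A real PNG is compressed, so its size varies somewhat with content, but it stays bounded by the pixel dimensions rather than by the number of data points.

```python
import math


def svg_polyline(points, width=800, height=200):
    """Return a minimal SVG document with one polyline through `points`."""
    coords = " ".join(f"{x:.2f},{y:.2f}" for x, y in points)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f'<polyline fill="none" stroke="black" points="{coords}"/>'
        "</svg>"
    )


def raw_bitmap_bytes(width=800, height=200, bytes_per_pixel=4):
    """Uncompressed RGBA size of a raster; depends only on pixel dimensions."""
    return width * height * bytes_per_pixel


def synthetic_series(n_points, width=800, height=200, cycles=10):
    """Made-up periodic signal standing in for real sensor data (an assumption)."""
    return [
        (
            width * i / max(n_points - 1, 1),
            height / 2 * (1 + math.sin(2 * math.pi * cycles * i / n_points)),
        )
        for i in range(n_points)
    ]


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        svg_size = len(svg_polyline(synthetic_series(n)).encode("utf-8"))
        print(f"{n:>7} points: svg={svg_size} bytes, raster={raw_bitmap_bytes()} bytes")
```

The SVG payload grows roughly linearly with the number of points, while the raster figure does not change at all.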

Nicholas,

It uses SVG as far as I can tell, see screenshot below.

Steve

Nice. So yeah: svg, even if drawn in the browser (vs being rendered and sent over the pipe) can definitely get “too big” for the browser. bqplot has recently (within the last two years) added some canvas-based rendering, but again that puts you in an upgrade situation… and not every plot supports the canvas renderer, anyway.

If your environment already includes bokeh, it’s plausible that would be able to provide basically the same features (brushing, zooming), but the interoperability with the ipywidgets ecosystem was basically nonexistent in the 2016 timeframe, and still not perfect today… but might be able to be shoe-horned in, at worst through raw script banging.

Matthew,

I created the logo because we wanted to acknowledge the Jupyter team and not take all the credit for ourselves. I got a great deal of help from the community so there really was a third person on our team.

Thanks for the debugging tips. We have never profiled the code and so don’t know definitively where the bottlenecks are. We move massive amounts of data to the browser and were surprised the performance was so good in light of that. During development we had trouble due to a data transfer limit built into Notebook that was intended to thwart denial of service attacks. I hacked my way around the problem and since then we have had no problems with data transfer, except possibly the current problem. I was told that in future versions the data limit would be made a configuration parameter, but we are still running the old version. In our case there wasn’t enough time to improve the system, especially while it was working adequately (of course I state the obvious).

Steve

Is there a way to switch bqplot so it uses PNG? That sounds like the easiest way to gain some performance. We make such heavy use of ipywidgets that I shudder to think of all the bugs we might inject by tinkering with things. Our architecture is based on ipywidgets; they were often difficult to use, but well worth the trouble.

Is there a way to switch bqplot so it uses PNG

nope.

Hi, I am Brian, Steve’s colleague on this project.

When the slowness occurs, the [Web Content] process* on the browser host goes to 100%+ (and I’ve seen it over 2400% yee haa!) in the CPU% column of [top], while the CPU usage on the server drops once the database server and ZMQ processes finish their work. So I am pretty sure the cause of the slowdown is in the browser, e.g. a memory constraint/performance. For example, we have checkboxes** that turn markers (dots) on and off at each datum in a plot. With only a few hundred or thousand points in a plot, the marker change is nearly instantaneous, although for data streams with a higher cadence noticeably less so. With a few days of data (tens or hundreds of thousands of data points) in a plot, pressing the [Mark] checkbox sends [Web Content] to 100% and it stays there for a while. This makes sense, considering that the browser is interpreting SVG.

* in my case this is a sub-process of firefox
** called [Mark]; you can see them in the image Steve posted.
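A back-of-the-envelope sketch of why the marker toggle would hurt so much more than the line itself. The assumption here (labeled as such, since I have not inspected the actual DOM bqplot produces) is that the line is a single SVG element while each marker becomes its own element. `build_plot` and `count_elements` are hypothetical helper names for illustration only.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def build_plot(points, markers):
    """Build an SVG tree: one polyline, plus one circle per datum if `markers`."""
    root = ET.Element(f"{{{SVG_NS}}}svg", width="800", height="200")
    ET.SubElement(
        root,
        f"{{{SVG_NS}}}polyline",
        points=" ".join(f"{x:.2f},{y:.2f}" for x, y in points),
        fill="none",
        stroke="black",
    )
    if markers:
        # Assumption: every marker is a separate element the browser must
        # create, style, lay out and paint.
        for x, y in points:
            ET.SubElement(
                root, f"{{{SVG_NS}}}circle", cx=f"{x:.2f}", cy=f"{y:.2f}", r="2"
            )
    return root


def count_elements(root):
    """Count every element in the tree, including the root."""
    return sum(1 for _ in root.iter())


if __name__ == "__main__":
    for n in (1_000, 100_000):
        pts = [(float(i), float(i % 50)) for i in range(n)]
        print(
            n,
            "points:",
            count_elements(build_plot(pts, markers=False)),
            "elements without markers,",
            count_elements(build_plot(pts, markers=True)),
            "with markers",
        )
```

Under that assumption the element count goes from a constant to one per datum the moment markers are switched on, which would match the jump in [Web Content] CPU described above.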

Hi @drbitboya,

Would you be able to record a profile when the usage goes high using the developer tools “Performance” tab? The instructions for recording and saving the profile are here for Firefox and here for Chrome.

Yes, that would be the next step.

However, one of the database tables has crashed**, and we are currently into the second day of the repair, so nothing is going to happen at the moment. I am beginning to think it might be quicker to re-ingest the data from scratch.

We’ll keep you posted.

** 115GB of indices .MYI; 38GB data .MYD; yeah, maybe we could have designed that better, IIRC Steve wanted to split the data into multiple files, but I was worried about SELECT…WHERE and JOIN performance if we did ;-/.

In the past (a few years ago), we saw performance problems with rendering SVG in a browser when we had tens of thousands of elements in a bqplot plot, which is part of the motivation for doing the webgl work in bqplot for high numbers of data points. I would not be surprised if profiling pointed to browser rendering performance with SVG elements.
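If profiling does point at the number of SVG elements, one generic way to bound it without changing the plotting library is to thin the data in the kernel before it is handed to the plot. Nobody in this thread has proposed this for the MWS application, so treat the following as an illustrative sketch only: `minmax_decimate` is a hypothetical helper that keeps the minimum and maximum of each bucket, so spikes survive while the point count is capped.

```python
import math


def minmax_decimate(values, n_buckets):
    """Return (index, value) pairs, at most 2 * n_buckets of them.

    Each bucket of consecutive samples contributes its minimum and its
    maximum, in original order, so the envelope of the signal is preserved.
    """
    if n_buckets <= 0:
        raise ValueError("n_buckets must be positive")
    n = len(values)
    if n <= 2 * n_buckets:
        # Already small enough; return everything unchanged.
        return list(enumerate(values))
    out = []
    for b in range(n_buckets):
        start = b * n // n_buckets
        stop = (b + 1) * n // n_buckets
        idx = range(start, stop)
        lo = min(idx, key=values.__getitem__)
        hi = max(idx, key=values.__getitem__)
        # A set avoids emitting the same sample twice in a flat bucket.
        for i in sorted({lo, hi}):
            out.append((i, values[i]))
    return out


if __name__ == "__main__":
    # Made-up signal standing in for a high-cadence data stream.
    signal = [math.sin(i / 500.0) + 0.1 * math.sin(i / 3.0) for i in range(100_000)]
    reduced = minmax_decimate(signal, n_buckets=400)  # roughly 1 bucket per 2 px
    print(len(signal), "->", len(reduced), "points")
```

The brushing design described earlier in the thread would fit this reasonably well: the overview pane could show a decimated series while the zoomed panes request full resolution only for the selected range. Whether that is practical depends on how the existing widgets are wired, which I can't judge from here.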