Slow notebook PDF download - any tips?

We’ve noticed that downloading even basic notebooks as PDF is slow in our JupyterHub installation (deployed with z2jh, if that matters).

For example, I’ve got a notebook with a single cell that just does !pip freeze and it can take upwards of 40 seconds to get the download prompt for the PDF.

Downloading the same notebook as Markdown or reStructuredText is a lot faster, so I’m guessing the slowness comes from the LaTeX conversion.

I noticed this thread [1], which is intriguing if it gives us the same basic functionality without LaTeX, but I’d need to test it out in our hub deployment.

I also see that nbconvert has a lot of configuration options, but I’m not sure whether any of them could be used to tune performance or to help with profiling. Does anyone have thoughts or experience there? I’m thinking of setting logging to debug to see where the time is spent, or maybe using py-spy.
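As a first step before reaching for a profiler, a rough sketch like the following could time the same notebook across export formats from inside the notebook server container. The helper names (time_command, compare_exports) and the notebook path are hypothetical, and it assumes the jupyter command is on the PATH. If the latex export is fast and only the pdf export is slow, the time is most likely going to the xelatex run rather than to nbconvert itself.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run a command, discard its output, and return (elapsed_seconds, return_code)."""
    start = time.perf_counter()
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, result.returncode


def compare_exports(notebook, formats=("markdown", "latex", "pdf")):
    """Time `jupyter nbconvert --to <format>` for each format on one notebook.

    Assumes `jupyter` is on the PATH; raises FileNotFoundError otherwise.
    """
    timings = {}
    for fmt in formats:
        cmd = ["jupyter", "nbconvert", "--to", fmt, notebook]
        timings[fmt] = time_command(cmd)
    return timings


def report(timings):
    """Print one line per format with the elapsed time and exit code."""
    for fmt, (elapsed, returncode) in timings.items():
        print(f"{fmt:10s} {elapsed:8.2f}s  exit={returncode}")
```

Calling report(compare_exports("test.ipynb")) would print one timing per format, where test.ipynb stands in for the single-cell notebook described above. For more detail on the slow step, the same nbconvert command can be run by hand with --debug added to get debug-level logging.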

For reference these are the relevant packages we’re installing for nbconvert in the notebook server image:

RUN apt-get -y install --reinstall texlive-xetex texlive-fonts-recommended texlive-generic-recommended texlive-latex-extra texlive-publishers texlive-science texlive-pstricks texlive-pictures pandoc
...
RUN conda update nbconvert

jupyter/scipy-notebook is the base container image.

[1] Notebook-as-PDF, save notebooks as PDFs


I have the same problem with a similar setup (JupyterHub + Docker, Ubuntu 20.04, TeX Live, ARMv8.2): PDF exports are very slow. My solution is to disable PDF exporting and use the native (OS-dependent) “Print to PDF” functionality through the print menu. This works quite nicely and is very quick.


Is this limited to PDF download, or are all downloads slow? I have just confirmed some user reports that loading non-trivially sized (35 MB) notebooks is taking tens of minutes. I found that simply downloading a 35 MB file of random text is slow: about 10 kB/s. We are using z2jh on GKE (launched via daskhub).