Is headless browsing possible from a Binder Jupyter notebook?

I’m trying to use PyQt5 inside of a Jupyter notebook loaded from a GitHub repository with Binder so I can browse headlessly and scrape a website. However, when I try to create a QApplication object with app = QApplication(sys.argv), I get this message from the webpage: “The kernel appears to have died. It will restart automatically.”

This does not happen when I use PyQt5 in a local Jupyter notebook, so I’m not sure if I’m doing something wrong or if this just isn’t possible in notebooks loaded with Binder. I’ve also run into errors with other packages meant for headless browsing in Binder-loaded notebooks, but I’ll stick to asking about PyQt5 for now.

I’ve included PyQt5 in my repository’s requirements.txt file and the Qt dependencies listed in sections 2.1 and 2.2 on this page in my apt.txt file. I don’t have any problems importing QApplication from PyQt5.QtWidgets.

Does anyone have ideas on how to troubleshoot?

The first step: debugging complex issues with Binder is often easier with repo2docker to do the builds and tests locally. You will get a much faster turnaround time and more debug information. This is the same code Binder uses, so if it works locally it will probably work on mybinder.org.

My guess is that it’s crashing due to the lack of an X server. I think you need to set up some environment variables and possibly install xvfb. See this SO question for a possibly similar situation.
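One way to check that guess without killing the kernel each time is to create the QApplication in a child process and read what it prints on exit. This is only a rough sketch: try_qapplication and CHILD_CODE are names made up for illustration, and it assumes PyQt5 is installed in the same Python environment as the notebook.

```python
import os
import subprocess
import sys

# Code run in the child process; it only tries to create a QApplication.
CHILD_CODE = (
    "import sys\n"
    "from PyQt5.QtWidgets import QApplication\n"
    "app = QApplication(sys.argv)\n"
    "print('QApplication created')\n"
)


def try_qapplication(extra_env=None):
    """Create a QApplication in a child process and report how it exited.

    extra_env is an optional dict of environment variables to add or override.
    Returns (exit code, stdout, stderr) of the child process.
    """
    env = dict(os.environ)
    env.update(extra_env or {})
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # An unset DISPLAY is a strong hint that no X server is reachable.
    print("DISPLAY =", os.environ.get("DISPLAY"))
    code, out, err = try_qapplication()
    print("exit code:", code)
    print(err or out)
```

If the child exits with a non-zero code, stderr usually says why (for example, that Qt could not connect to a display). Calling try_qapplication({"QT_QPA_PLATFORM": "offscreen"}) is another thing to try, on the assumption that the installed Qt build ships the offscreen platform plugin.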

I haven’t used PyQt5; however, headless browsing is possible from a Binder Jupyter notebook. I have used it myself here, adapting these resources:

Grabbing Screengrab Images Using Selenium (Also Works in MyBinder)

Example of Using Headless Firefox to Grab a Screenshot from a Downloaded Page

“This is a really neat example - helped me understand how the binder/apt.txt and binder/postBuild mechanisms work” https://simonwillison.net/2019/Nov/4/selenium-demoscraper/
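For anyone who wants the general shape of the Selenium approach without opening the links, here is a minimal sketch. It is not copied from the resources above; grab_screenshot is a hypothetical helper, and it assumes selenium, Firefox, and geckodriver are all installed in the Binder image.

```python
def grab_screenshot(url, out_path="screenshot.png"):
    """Load url in headless Firefox and save a screenshot to out_path."""
    # Imported here so the function can be defined even where selenium is absent.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("-headless")  # run Firefox without a visible window
    driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)
        driver.save_screenshot(out_path)
    finally:
        # Always shut the browser down, even if the page load fails.
        driver.quit()
    return out_path
```

In a notebook cell this would be called as grab_screenshot("https://example.com"), where the URL is just a placeholder.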

Thank you for the tip. I wrestled with repo2docker over the last few days and found a working solution. I did need to install X11 (xorg and openbox instead of xvfb). Then, once repo2docker finishes the build, I can call docker run with -v /tmp/.X11-unix:/tmp/.X11-unix to mount a volume and -e DISPLAY=unix$DISPLAY to set up an environment variable and everything works locally.
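In case it helps anyone reproduce the local setup, the docker run call can be assembled like this. It is a sketch only: x11_docker_command is a hypothetical helper, the image name is a placeholder for whatever repo2docker built, and the ":0" fallback for DISPLAY is an assumption.

```python
import os
import shlex


def x11_docker_command(image, display=None):
    """Build a docker run command that shares the host X11 socket."""
    if display is None:
        # Assumption: fall back to ":0" when DISPLAY is not set on the host.
        display = os.environ.get("DISPLAY", ":0")
    return [
        "docker", "run",
        "-v", "/tmp/.X11-unix:/tmp/.X11-unix",  # mount the host X11 socket
        "-e", "DISPLAY=unix" + display,          # point the container at it
        image,
    ]


if __name__ == "__main__":
    # "my-repo2docker-image" is a placeholder image name.
    cmd = x11_docker_command("my-repo2docker-image")
    print(" ".join(shlex.quote(part) for part in cmd))
```

The script only prints the command; the resulting list can be passed to subprocess.run to actually start the container.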

Transitioning back to Binder, it seems that I can set the DISPLAY environment variable in a postBuild file, but I haven’t found which type of configuration file I can use to mount a volume. Do you know? I’m trying to avoid environment.yml due to conda’s longer installation times compared to pip but am otherwise type agnostic.

This binder that does qt and web stuff worked… at some point:

It waits to use xvfb-run interactively in notebooks/index.ipynb, though I’m sure it could be done at postBuild somehow.

I don’t see anything in the environment.yml directly related to X, so it would probably work with requirements.txt.
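Roughly, calling xvfb-run from a notebook cell could look like the sketch below. This is not taken from that binder: xvfb_command and run_under_xvfb are made-up names, scrape.py is a placeholder script, and it assumes xvfb is listed in apt.txt so that xvfb-run is on the PATH.

```python
import shutil
import subprocess
import sys


def xvfb_command(script_path):
    """Build the command that runs a Python script under a virtual X server."""
    # -a asks xvfb-run to pick a free display number automatically.
    return ["xvfb-run", "-a", sys.executable, script_path]


def run_under_xvfb(script_path):
    """Run script_path under xvfb-run and return the completed process."""
    if shutil.which("xvfb-run") is None:
        raise RuntimeError("xvfb-run not found; is xvfb listed in apt.txt?")
    return subprocess.run(
        xvfb_command(script_path),
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # "scrape.py" is a placeholder for the script that creates the QApplication.
    print(" ".join(xvfb_command("scrape.py")))
```

Running the Qt code as a separate script keeps a crash from taking the notebook kernel down with it.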
