Automated test of custom images and notebooks

For full end-to-end deployment testing, I can recommend the general-purpose tool robotframework, and putting the entire system (or as much of it as is rational) under test. It has custom libraries for all kinds of things but, at the end of the day, it is “Just Python,” so it is pretty easy to extend.
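To give a feel for the “Just Python” part: a keyword library can be a plain class whose public methods become keywords. This is only a minimal sketch; the class name `DeploymentKeywords` and the keyword `wait_until_port_is_open` are made up for illustration and are not part of any existing library.

```python
import socket
import time


class DeploymentKeywords:
    """Hypothetical library: Robot Framework exposes each public method as a keyword."""

    # One shared instance for the whole test run.
    ROBOT_LIBRARY_SCOPE = "GLOBAL"

    def wait_until_port_is_open(self, host, port, timeout=30.0):
        """Usable in a suite as: Wait Until Port Is Open    localhost    8000"""
        # Arguments may arrive from a suite as strings, so convert defensively.
        port, timeout = int(port), float(timeout)
        deadline = time.monotonic() + timeout
        while True:
            try:
                with socket.create_connection((host, port), timeout=1.0):
                    return
            except OSError:
                if time.monotonic() >= deadline:
                    raise AssertionError(
                        f"{host}:{port} did not open within {timeout}s"
                    )
                time.sleep(0.2)
```

Saved as `DeploymentKeywords.py` next to the suite, this should be importable with `Library    DeploymentKeywords`. A keyword fails the test by raising an exception, which is why the timeout case raises `AssertionError`.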

Specifically for browser-based Jupyter clients, it has a purpose-built library (disclaimer: I'm the author). The distinctions here vs galata:

  • works off a user-level, “black box” model of the application
    • instead of a “white box”, tightly-bound to a specific version of e.g. JupyterLab with access to the underlying JS APIs
  • looks more like (weird) plain language than TypeScript
  • requires a lot more jumping jacks to get videos than a custom browser

In my specific case, which is, alas, not open source, the most horror-show test suite does the following:

  • start up a couple VMs with virtualbox
  • deploy a previous, known version of jupyterhub (specifically, TLJH, but whatever)
    • this could be replaced with e.g. minikube and helm, but the principle is the same
  • spawn a user’s environment
  • start some representative interactive computing, meanwhile…
    • upgrade the jupyterhub process to a new known version
      • handle a rollback case
  • continue to assess the user environment
    • see a message about an updated environment available
  • start a new user environment
  • verify the interactive computing stuff still works (e.g. against the same in-flight data files)
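The “upgrade to a new known version” step needs some way to tell when the hub is back and which version it is running. Here is a rough sketch, assuming the hub's REST API root (`/hub/api/`) reports a `version` field; the function names, timeouts, and the injectable `fetch`/`sleep` parameters are my own illustration, not what the suite above actually does.

```python
import json
import time
import urllib.request


def fetch_hub_version(base_url):
    """Read the version reported at the hub's REST API root (assumed endpoint)."""
    url = f"{base_url.rstrip('/')}/hub/api/"
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))["version"]


def wait_for_hub_version(base_url, expected, timeout=120.0, interval=2.0,
                         fetch=fetch_hub_version, sleep=time.sleep):
    """Poll until the hub reports `expected`, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            last = fetch(base_url)
        except (OSError, ValueError, KeyError):
            # The hub is restarting, or answered with something unexpected.
            last = None
        if last == expected:
            return last
        if time.monotonic() >= deadline:
            raise TimeoutError(f"hub reports {last!r}, expected {expected!r}")
        sleep(interval)
```

The same helper covers the rollback case: after rolling back, wait for the previous version string instead of the new one.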

By using VM snapshots and the like, this was actually pretty reasonable to run as part of normal CI.
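For the snapshot part, resetting a VM to a known state before each run can be as small as a few `VBoxManage` calls. This is a sketch under assumptions: the VM and snapshot names are placeholders, and the helper names are hypothetical.

```python
import subprocess


def reset_commands(vm_name, snapshot_name):
    """Build the VBoxManage calls that return a VM to a saved snapshot."""
    return [
        ["VBoxManage", "controlvm", vm_name, "poweroff"],
        ["VBoxManage", "snapshot", vm_name, "restore", snapshot_name],
        ["VBoxManage", "startvm", vm_name, "--type", "headless"],
    ]


def reset_vm(vm_name, snapshot_name, run=subprocess.run):
    """Power off, restore the snapshot, and boot headless."""
    for index, command in enumerate(reset_commands(vm_name, snapshot_name)):
        # poweroff fails if the VM is already stopped, so only the
        # restore and start steps are required to succeed.
        run(command, check=index != 0)
```

Something like `reset_vm("hub-vm", "clean-install")` (both names made up) at the start of a suite keeps every run starting from the same disk state, which is most of what makes this tolerable in CI.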

As for assessing some of this stuff (especially over time), there are some nice tools in the OpenTelemetry stack… having some frame-level awareness of the path from “user clicks button” to “new environment created” to “results of compute appear” is quite nice, with all of the database logging, etc. put in its appropriate place.

To look at scale, one can also drop locust on it, which is, again, Python, but can test anything with an HTTP endpoint. Some previous discussions:
