Run Linux Desktop Apps in mybinder.org / your JupyterHub

What if we set it up so that, any time a user requests a ‘workspace’ (aka ‘server’), the following happens:

Two containers get spun up, ideally as two separate pods (rather than two containers in one pod):

  • Pod A: A lightweight Linux desktop environment with limited tools and a modern browser, wrapped in Guacamole.
  • Pod B: The usual single user server with a choice of interfaces running on customisable compute, etc.
  • Pod A and Pod B can potentially be scheduled to run on different node pools with different resources, etc.
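To make the two-pod idea concrete, here is a rough sketch that builds the two Pod manifests as plain Python dicts and pins each one to its own node pool via `nodeSelector`. Everything specific in it is an assumption for illustration: the `workspace-role` and `workspace-user` labels, the image names, the pool names, and the `agentpool` node label (which is how pools are labelled on our AKS nodes; other clusters will differ).

```python
import json


def build_pod(name, role, username, image, node_pool):
    """Return a minimal Pod manifest pinned to a single node pool."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            # Hypothetical labels, used only to tell the two pods apart.
            "labels": {"workspace-role": role, "workspace-user": username},
        },
        "spec": {
            # Assumed node label; adjust to whatever your cluster uses.
            "nodeSelector": {"agentpool": node_pool},
            "containers": [{"name": role, "image": image}],
        },
    }


def build_workspace(username):
    """Return [Pod A (desktop), Pod B (single-user server)] for one user."""
    desktop = build_pod(
        f"desktop-{username}", "desktop", username,
        "example/guacamole-desktop:latest",  # placeholder image
        "desktoppool",                       # placeholder pool name
    )
    notebook = build_pod(
        f"jupyter-{username}", "notebook", username,
        "example/singleuser:latest",         # placeholder image
        "computepool",                       # placeholder pool name
    )
    return [desktop, notebook]


if __name__ == "__main__":
    print(json.dumps(build_workspace("alice"), indent=2))
```

In a real hub the spawner would create these rather than a hand-rolled script; the point is only that the two pods carry different labels and different node selectors.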

Pod B is only accessible from Pod A, enforced through a network policy (Calico?).
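As a sketch of what that restriction could look like, the snippet below generates a standard Kubernetes `NetworkPolicy` (which Calico can enforce) that only admits ingress to a user's Pod B from the same user's Pod A. The `workspace-role` / `workspace-user` labels and the policy name are hypothetical, and port 8888 is assumed to be where the single-user server listens. It deliberately ignores hub and proxy traffic, which a real JupyterHub deployment would most likely need extra rules for.

```python
import json


def build_network_policy(username, port=8888):
    """Return a NetworkPolicy allowing ingress to Pod B only from Pod A."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"notebook-from-desktop-{username}"},
        "spec": {
            # The policy applies to this user's Pod B (hypothetical labels).
            "podSelector": {
                "matchLabels": {
                    "workspace-role": "notebook",
                    "workspace-user": username,
                }
            },
            # Once a pod is selected for Ingress, only listed sources get in.
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [
                        {
                            "podSelector": {
                                "matchLabels": {
                                    "workspace-role": "desktop",
                                    "workspace-user": username,
                                }
                            }
                        }
                    ],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_network_policy("alice"), indent=2))
```

The printed JSON can be fed to `kubectl apply -f -` for a quick experiment in a test namespace.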

Persistent storage is disabled on Pod A. So even if the user downloads data from JupyterLab (served via Pod B), it ends up in Pod A, which is entirely ephemeral, and is lost when the user has to shut down the server to access a different workspace. Pod A (Guacamole) prevents removal of data from the desktop environment.

Pod B gets its persistent storage through a PV/PVC, in our case Azure Files storage.
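The storage split could be sketched roughly like this: Pod A only ever gets an `emptyDir` volume (which Kubernetes deletes along with the pod), while Pod B mounts a PersistentVolumeClaim. The claim name, the `10Gi` size, and the volume names are made up for the example, and `azurefile` is assumed to be the storage class backing Azure Files on the cluster.

```python
import json


def desktop_volume():
    """Scratch volume for Pod A: an emptyDir is removed with the pod."""
    return {"name": "scratch", "emptyDir": {}}


def notebook_claim(username, size="10Gi", storage_class="azurefile"):
    """PVC for Pod B; storage class and size are assumptions."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": f"claim-{username}"},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


def notebook_volume(username):
    """Volume entry for Pod B that refers to the claim above."""
    return {
        "name": "home",
        "persistentVolumeClaim": {"claimName": f"claim-{username}"},
    }


if __name__ == "__main__":
    manifests = {
        "pod_a_volume": desktop_volume(),
        "pod_b_claim": notebook_claim("alice"),
        "pod_b_volume": notebook_volume("alice"),
    }
    print(json.dumps(manifests, indent=2))
```

The key property is that nothing in Pod A's spec refers to a claim, so there is nowhere for downloaded data to outlive the pod.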

This prevents moving data from one workspace to another. Let’s discuss when we talk on Wednesday.

Let’s drag one more person to the party 🙂

@yuvipanda - I have been looking at this → [Request for Implementations] Disabling downloads from a JupyterHub in the context of this → Secure data environment for NHS health and social care data - policy guidelines - GOV.UK.

It would be great to get your thoughts. NHS England has budgeted £100m for Secure Data Environments and, seeing that a major chunk of data science happens within the Jupyter universe, it would be good to discover what is possible and what has already been done. I love z2jh, and it would be great to see what others are doing in this space.