Can single-user instances authenticate access to shared notebooks?

johninbaltimore · October 10, 2022, 1:26pm

I’m new to this and just trying to figure out how to set up JupyterHub with per-user subdomains and suchlike.

Two important questions: are we able to restrict notebooks in single-user instances to authenticated users? If so, can we limit the type of access?

In certain environments, the use case for JupyterHub would require either no sharing of notebooks or the ability for a user to set any notebook to be shared only to specific users. Beyond simple yes-no access, I’d expect four levels of access to be possible:

Read: A user can read the notebook (with cached output from prior run by a privileged user?)
Read and Run: A user can read and execute the notebook
Read-Write: A user can read and edit the notebook, but not execute it
Read-Write-Run: A user can edit and execute the notebook, a dangerous and highly-trusted permission

This would be useful for notes regarding highly-controlled information such as trade secrets. In the most stringent environment, two separate JupyterHub installations would be created, each with a sticky banner at the top indicating in large, friendly letters which instance the user is in (“NO PROPRIETARY INFORMATION” vs “PROPRIETARY INFORMATION, DO NOT SHARE”) and a link to the opposite instance for fast switching. Those considerations are on the administrator’s side, and JupyterHub is already fully capable of that kind of labeling.

JupyterHub is attractive in these situations because users tend to take notes wherever is convenient: in Notepad, in a word processor, on a network share, on cloud services such as Microsoft 365, or even in some cases in their own private Google Docs account, to which they keep access even after leaving the company! In 15 years of information security experience, I’ve seen it all. I’ve seen notes taken in Gmail and saved as drafts. This is behavioral, and a centralized notebook application would be an excellent solution, hence my interest in what kind of access control is currently available.

manics · October 11, 2022, 2:18pm

Could you clarify: do you want users to have limited access to other users’ notebook servers, or do you have a set of notebooks that multiple authorised users can simultaneously read/write?

For the first case you can partially do this using JupyterHub RBAC permissions to control access to other users’ servers:
https://jupyterhub.readthedocs.io/en/stable/rbac/use-cases.html#group-admin-roles
in conjunction with JupyterLab RTC (real time collaboration) which allows multiple users to edit/run the same notebook in the same JupyterLab instance:
https://jupyterlab.readthedocs.io/en/stable/user/rtc.html

Note that this is all-or-nothing for a given singleuser server: you can’t share a server as read-only.
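As a rough sketch of the RBAC side (not a tested deployment), roles granting the access:servers scope filtered to one owner could be generated like this. The helper name sharing_roles, the role naming scheme, and the usernames are made up for illustration; only the load_roles format and the access:servers!user=... scope come from JupyterHub RBAC (JupyterHub 2.0 or later is assumed).

```python
def sharing_roles(shares):
    """Build JupyterHub role definitions from {owner: [collaborators]}.

    Hypothetical helper: each role lets the listed users open the
    owner's single-user server. Access is all-or-nothing per server.
    """
    roles = []
    for owner, collaborators in sorted(shares.items()):
        roles.append(
            {
                # Role name scheme is an assumption; it presumes
                # lowercase usernames that are valid in role names.
                "name": f"access-{owner}-server",
                "description": f"Open {owner}'s single-user server",
                # Scope filtered to a single user's servers.
                "scopes": [f"access:servers!user={owner}"],
                "users": sorted(collaborators),
            }
        )
    return roles


# In jupyterhub_config.py, where `c` is provided by JupyterHub:
# c.JupyterHub.load_roles = sharing_roles({"alice": ["bob"]})
```

With this, bob could open alice’s server (and, with RTC enabled there, edit and run her notebooks), but nothing here distinguishes read from write.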

For finer-grained control you’ll probably need new permissions at the singleuser server level. There’s some discussion in

github.com/jupyterlab/jupyterlab
Next steps on RTC
opened 08:46PM - 10 Nov 21 UTC by hbcarlos
Labels: enhancement, tag:Real Time Collaboration

I'm opening this issue to keep track of the next steps we plan for real-time collaboration in JupyterLab 4.0.

### Deduplicate content

Currently, we have the document's content triplicated in the view, ModelDB and YJS. I.e., the same content is mirrored in different data structures. This means that every time one of the models changes, we must update all other models to keep them in sync. This approach is very complex and prone to bugs.

As commented several times, we need to remove ModelDB and use only the shared models to manipulate content. This will allow us to simplify the existing code. For extension authors, this means that they will no longer have access to ModelDB. Instead, they should use the new shared-models API, which provides better guarantees. The switch should be fairly straight-forward.

* #11602

### Add tests

* To the shared models
* To the username: https://github.com/jupyterlab/jupyterlab/pull/11852#issuecomment-1011994165

### Changes on the Shared-models

* Undo manager by cell #10791
* #11640

### ICurrentUser

* #10965
* #11443
* #11678

### UI features

* List of collaborators: #11528
* Current focused cell: #11555
* Rename anonymous users

### Add RBAC

* #11355

### ¿Clean up RTC code from previous attempts?

### Use the python port of YJS on the backend

The locking approach used now does not work well when there are many collaborators, we could improve it to make it work decently, but it will always be prone to fail. The best approach is to adopt the new python bindings for YJS on the backend. This implies removing the content REST API and accessing the document's content through YJS. At the same time, we would need to remove the save button/command since the document's content would always be synced to disk; a la google docs.

Using YJS on the backend has some other benefits that could be tackled later by extension, like using YJS versions to see the list of changes between versions; a la git.

* #11599

In addition, it allows to tackle the following issues:

* #2833
* #9621
* #10544

cc @dmonad

Short term, operating system permissions (with extended attributes if necessary) are probably your best option for controlling read/write access. Preventing execution isn’t possible this way: anyone with read access can always copy the notebook to obtain a writeable version that they can execute and save.
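For the plain file-permission route, a minimal sketch could look like the following, assuming a POSIX filesystem shared between the single-user servers and one Unix group per project. The function name restrict_to_group is made up, and changing the group normally needs suitable privileges.

```python
import os
import shutil
import stat


def restrict_to_group(path, group=None, writable=False):
    """Limit a notebook file to its owner and one Unix group.

    Hypothetical helper. The owner keeps read/write, the group gets
    read (plus write if `writable`), and everyone else gets nothing.
    """
    if group is not None:
        # Hand the file to the project group; this needs the
        # privileges to change group ownership.
        shutil.chown(path, group=group)
    mode = stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP
    if writable:
        mode |= stat.S_IWGRP
    os.chmod(path, mode)
    # Return the resulting permission bits for inspection.
    return stat.S_IMODE(os.stat(path).st_mode)
```

This only covers read vs. read-write. As noted above, it cannot stop a reader from copying the notebook and running the copy.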

If your use-case is to present information, have a look at some of the dashboarding or presentation frameworks for notebooks, e.g. Voila and others that I can’t think of right now…

johninbaltimore · October 11, 2022, 4:17pm

Interesting. Yeah I was thinking more along the lines of users wanting to share notebooks, but not letting them share with everyone. Think about if you have legal or policy obligations to restrict information to need-to-know, and two people in the company are sharing notes on a shared project. You don’t want person A on projects X and Y to share notebook X with person B on project X, and share notebook Y with person C on project Y, but have all three able to access both notebooks X and Y.

As to running a notebook, someone can copy it to their own single-user server, which may be in a Docker container via DockerSpawner and may have access to different data (e.g. you can import a separate Python script that contains credentials instead of entering them into the notebook). That the user doesn’t have execution access to any notebook would protect against arbitrary execution, but as soon as you give them any execution access to any notebook in your environment they obviously have access to your entire environment anyway.

I’ll have a look in the other thread. This isn’t necessarily about RTC, but when two users have write access it most certainly is about RTC.

manics · October 11, 2022, 8:34pm

Standard file system permissions with users and groups already handle some of this, so I think it’d be worthwhile to write down the additional Jupyter requirements.

There’s an abstraction around the “filesystem”:
https://jupyter-server.readthedocs.io/en/latest/developers/contents.html
The default is to use the local filesystem, but you can implement a remote ContentsManager with custom permissions. For example, if you used an S3 object store you could give each singleuser server a set of API credentials tied to the user, and set fine-grained access permissions on each object.
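To make the “custom permissions” idea a bit more concrete, here is a small sketch of a per-notebook access table that a custom ContentsManager could consult. It is pure Python and does not import jupyter_server. Access and NotebookACL are hypothetical names, and the wiring into get() and save() is described only in comments as an assumption, not as existing Jupyter functionality.

```python
from enum import Flag, auto


class Access(Flag):
    """Hypothetical permission bits for a single notebook."""
    NONE = 0
    READ = auto()
    WRITE = auto()


class NotebookACL:
    """Hypothetical in-memory table of (path, user) -> Access."""

    def __init__(self):
        self._grants = {}

    def grant(self, path, user, access):
        self._grants[(path, user)] = access

    def check(self, path, user, needed):
        # Users with no entry get no access at all.
        granted = self._grants.get((path, user), Access.NONE)
        return (granted & needed) == needed


# Assumed wiring (not shown): a ContentsManager subclass would call
#   acl.check(path, user, Access.READ)   inside get(...)
#   acl.check(path, user, Access.WRITE)  inside save(...)
# and refuse the request (e.g. with an HTTP 403) when check() is False.
acl = NotebookACL()
acl.grant("projectX/notes.ipynb", "bob", Access.READ)
acl.grant("projectX/notes.ipynb", "alice", Access.READ | Access.WRITE)
```

There is deliberately no “run” bit here, since read access already allows copying a notebook and executing the copy elsewhere.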

This still doesn’t prevent someone downloading a notebook (or viewing and copying the raw JSON), and sending it to someone else. For that you’ll probably need another layer e.g. a virtual desktop interface that prevents downloading to the local machine… but a user can still take a screenshot.
