In particular, we are looking to enable RTC (real-time collaboration) on our JupyterHub / JupyterLab deployment. Reading the “Enabling user subdomains” section, it appears that subdomains are strongly encouraged, but the documentation seems to be written mostly about subdomains for sharing access to servers. However, as the photo above shows, the final important note appears to be a more blanket statement encouraging subdomains in general. Are subdomains required for security if we use our own OAuth procedure and have semi-trusted users (and no RTC)? I’ve seen other JupyterHub instances without such measures and I’m not sure how best to proceed. Our solution has to be maximally secure.
We’ve done some naive testing, and it appears that separate logged-in users cannot access another person’s server, but our testing is certainly very limited in scope: it amounts to brute-force attempts.
If someone could provide some guidance on the best architecture, docs, and general thoughts (especially as we want to incorporate RTC), that would be immensely helpful. Thank you so much!
JupyterHub will never promise to fully protect users from each other on a single-domain deployment. We’ll always do our best, but browsers’ Same Origin Policy makes it very hard to be rigorous. As of 4.1.5, I don’t know how users can attack each other if users do not have any permission to install server extensions, customize server config, etc., but I would be surprised if it truly cannot be done by sufficiently determined malicious users.
Note that it’s not about whether users can access each other’s servers directly. All the same-origin problems are XSS-style, of the form:
1. user A creates a malicious page on their own server
2. user B clicks a link crafted by A to a page on A’s server
3. the page on user A’s server makes cross-site requests to user B’s server via user B’s browser, authenticated with user B’s cookies
i.e. it requires user B to be logged into their own server and click a link provided by user A. The XSRF cookie is what we use to attempt to prevent these requests, but it is not perfect. The same-origin deployment does not have any additional vulnerability where user A can do anything to user B’s server without the participation of user B.
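To make the same-origin point concrete, here is a minimal sketch of how a browser decides whether two URLs share an origin (scheme, host, and port). The hostnames and the `origin` / `same_origin` helpers are hypothetical, for illustration only; they are not part of JupyterHub.

```python
from urllib.parse import urlsplit

# Default ports, so "https://host" and "https://host:443" compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple[str, str, int]:
    """Return the (scheme, host, port) triple that defines a web origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme, -1)
    return (scheme, host, port)


def same_origin(a: str, b: str) -> bool:
    """True if the browser would treat both URLs as the same origin."""
    return origin(a) == origin(b)


# Hypothetical single-domain deployment: users differ only by URL path.
single_a = "https://hub.example.com/user/alice/lab"
single_b = "https://hub.example.com/user/bob/api/contents"

# Hypothetical per-user subdomain deployment: users differ by host.
sub_a = "https://alice.hub.example.com/user/alice/lab"
sub_b = "https://bob.hub.example.com/user/bob/api/contents"

print(same_origin(single_a, single_b))  # True: paths are not part of the origin
print(same_origin(sub_a, sub_b))        # False: different hosts, different origins
```

On a single domain, the browser sees user A’s page and user B’s server as one origin, so its cross-origin protections do not separate them and the XSRF check has to do that work. With per-user subdomains, the hosts differ and the browser’s own cross-origin rules apply between the two servers.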
Definitely, the best architecture if you enable RTC is to use per-user domains, because then all standard cross-origin protections from the browser come into effect between user servers, and I can make much more confident statements about what can be done from one server to another.
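As a rough sketch, per-user domains are enabled through the `subdomain_host` option in `jupyterhub_config.py`. The hostname below is a placeholder, and the deployment also needs wildcard DNS and a wildcard TLS certificate for the user subdomains; check the JupyterHub documentation for your version before relying on this.

```python
# jupyterhub_config.py (fragment; `get_config` is provided by JupyterHub at load time)
c = get_config()  # noqa

# Placeholder hostname (an assumption for illustration). With this set, each
# user's server is served from its own subdomain of this host instead of a
# path on the shared Hub domain.
c.JupyterHub.subdomain_host = "https://hub.example.com"
```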
“semi-trusted” is also a key term - it takes effort and malice and tricking users to perform these XSS attacks, so if after-the-fact policy enforcement is adequate for your deployment, then single-origin may be okay.
@minrk Well, once we have jupyter-server-proxy installed, users can proxy any web app, right? I mean, they don’t need to have that web app installed in the single-user environment; they can have it installed in their own environments. In that case, limiting users from manipulating PATH, PYTHONPATH, or the Jupyter server config is not going to be enough, right?
Yes, if you install extensions that grant users the ability to serve arbitrary things that they control, they can bypass some of these protections. This is a choice you make each time you install an extension in the user environment (and why users must not be allowed to do this if you want to prevent this kind of thing). Jupyter Server Proxy’s arbitrary port forwarding is governed by the host_allowlist, and if you set it to an empty list, it won’t allow arbitrary port forwarding.
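For reference, a minimal sketch of that setting, assuming it is placed in a `jupyter_server_config.py` that users cannot modify; the file location is an assumption about your deployment, and the effect is as described in the reply above.

```python
# jupyter_server_config.py in the single-user environment
# (fragment; `get_config` is provided by Jupyter at load time)
c = get_config()  # noqa

# An empty allowlist means no host is permitted as a target for
# jupyter-server-proxy's arbitrary port forwarding.
c.ServerProxy.host_allowlist = []
```

This only helps if users cannot override the server config themselves, which is the same precondition discussed earlier in the thread.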
If you want arbitrary port forwarding, you are also explicitly granting users permission to serve arbitrary resources which may be vulnerable or malicious.
This is the reason per-user domains are strongly encouraged, and it only becomes more important the more freedom you give your users and the more you permit users to interact with each other.