We are using AnyWidget from Jupyter notebooks, where we need to access local user files from JavaScript. The files are accessed using the URL /user/name/files/..., which works if used directly, e.g., from a Markdown cell. However, this does not seem to work from JavaScript, for example, when the request is made from the javascript magic.
From the JupyterHub log, it appears that this request results in an OAuth redirect loop:
[I 2024-11-28 14:16:25.321 ServerApp] 302 GET /user/bob/files/text.txt -> /hub/api/oauth2/authorize?client_id=jupyterhub-user-bob&redirect_uri=%2Fuser%2Fbob%2Foauth_callback&response_type=code&state=[secret] (@::1) 2.75ms
[I 2024-11-28 14:16:25.335 JupyterHub log:192] 302 GET /hub/api/oauth2/authorize?client_id=jupyterhub-user-bob&redirect_uri=%2Fuser%2Fbob%2Foauth_callback&response_type=code&state=[secret] -> /user/bob/oauth_callback?code=[secret]&state=[secret] (bob@::1) 5.87ms
[I 2024-11-28 14:16:25.347 JupyterHub log:192] 200 POST /hub/api/oauth2/token (bob@127.0.0.1) 6.58ms
[I 2024-11-28 14:16:25.351 JupyterHub log:192] 200 GET /hub/api/user (bob@127.0.0.1) 1.99ms
[I 2024-11-28 14:16:25.351 ServerApp] Logged-in user bob
[I 2024-11-28 14:16:25.352 ServerApp] Setting new xsrf cookie for b'494dc1f0c5c748459120a76135b48cb8:d3e9995339aa0d9231c7229057b56387f82aa7e6ff7de72b9a29c58c1cdac787' {'path': '/user/bob/'}
[I 2024-11-28 14:16:25.352 ServerApp] 302 GET /user/bob/oauth_callback?code=[secret]&state=[secret] -> /user/bob/files/text.txt (@::1) 13.36ms
[I 2024-11-28 14:16:25.356 ServerApp] 302 GET /user/bob/files/text.txt -> /hub/api/oauth2/authorize?client_id=jupyterhub-user-bob&redirect_uri=%2Fuser%2Fbob%2Foauth_callback&response_type=code&state=[secret] (@::1) 0.72ms
[I 2024-11-28 14:16:25.364 JupyterHub log:192] 302 GET /hub/api/oauth2/authorize?client_id=jupyterhub-user-bob&redirect_uri=%2Fuser%2Fbob%2Foauth_callback&response_type=code&state=[secret] -> /user/bob/oauth_callback?code=[secret]&state=[secret] (bob@::1) 4.49ms
[I 2024-11-28 14:16:25.376 JupyterHub log:192] 200 POST /hub/api/oauth2/token (bob@127.0.0.1) 6.42ms
[I 2024-11-28 14:16:25.379 JupyterHub log:192] 200 GET /hub/api/user (bob@127.0.0.1) 1.80ms
[I 2024-11-28 14:16:25.380 ServerApp] Logged-in user bob
[I 2024-11-28 14:16:25.380 ServerApp] Setting new xsrf cookie for b'494dc1f0c5c748459120a76135b48cb8:8840358aea5f3629f5d0de6157ba485d3c88339ead26617cd1df8e5dd94e39ad' {'path': '/user/bob/'}
[I 2024-11-28 14:16:25.380 ServerApp] 302 GET /user/bob/oauth_callback?code=[secret]&state=[secret] -> /user/bob/files/text.txt (@::1) 12.57ms
[I 2024-11-28 14:16:25.385 ServerApp] 302 GET /user/bob/files/text.txt -> /hub/api/oauth2/authorize?client_id=jupyterhub-user-bob&redirect_uri=%2Fuser%2Fbob%2Foauth_callback&response_type=code&state=[secret] (@::1) 1.08ms
[I 2024-11-28 14:16:25.393 JupyterHub log:192] 302 GET /hub/api/oauth2/authorize?client_id=jupyterhub-user-bob&redirect_uri=%2Fuser%2Fbob%2Foauth_callback&response_type=code&state=[secret] -> /user/bob/oauth_callback?code=[secret]&state=[secret] (bob@::1) 5.16ms
[I 2024-11-28 14:16:25.405 JupyterHub log:192] 200 POST /hub/api/oauth2/token (bob@127.0.0.1) 6.51ms
[I 2024-11-28 14:16:25.410 JupyterHub log:192] 200 GET /hub/api/user (bob@127.0.0.1) 2.37ms
[I 2024-11-28 14:16:25.410 ServerApp] Logged-in user bob
[I 2024-11-28 14:16:25.411 ServerApp] Setting new xsrf cookie for b'494dc1f0c5c748459120a76135b48cb8:fe442871b68024a53aa499f0d7fe15d1470e28fd19f0ada978eb06224cccd1e0' {'path': '/user/bob/'}
[I 2024-11-28 14:16:25.411 ServerApp] 302 GET /user/bob/oauth_callback?code=[secret]&state=[secret] -> /user/bob/files/text.txt (@::1) 14.52ms
Is this expected behavior?
The same code worked previously with JupyterHub 4.0.2. It also works if used from JupyterLab directly. It also works with the URL obtained from the context menu “Copy Download Link” /user/bob/files/text.txt?xsrf=.... Is there a way of obtaining this URL programmatically from the notebook?
I did some further tests, and the first version of JupyterHub that is affected is 4.1.0. I looked at the release log, but I do not see anything that could be relevant.
As far as I understand, CVE-2024-28233 affects running the code from other domains, which is not the case here. Or could it be #4630?
Also, shouldn’t the xsrf cookie from the browser be used for javascript requests? Even if the cookie is required, it is unclear to me why the hub does not redirect to the login page, as it does for the Python code.
Well, when you make a request from the browser, the request forwards the session cookie, which is a valid one, and thus authentication is already done. So, it should not redirect you to the login page.
In your example using Python’s urllib, there is no valid session when you make the request, and hence it redirects you to the login page for auth. Does that make sense?
Also, shouldn’t the xsrf cookie from the browser be used for javascript requests?
Yes, you are right that _xsrf token is stored in cookies and they are being forwarded with the request. But from my understanding JupyterHub expects them as a query parameter or a header.
The difference I see between regular API requests made by JupyterLab and a JS fetch request like the one you are attempting is the lack of the X-XSRFToken header in the request.
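A minimal sketch of what adding that header to a fetch call could look like. The helper name xsrfFetchOptions and the way the token is obtained are assumptions for illustration, and this has not been verified against JupyterHub 4.1:

```javascript
// Hypothetical helper: build fetch() options that carry the XSRF token
// in the X-XSRFToken header, the header named above.
function xsrfFetchOptions(token) {
  return {
    method: "GET",
    credentials: "same-origin", // send the session cookies along with the request
    headers: { "X-XSRFToken": token },
  };
}

// Usage in the browser (token is assumed to be the _xsrf value for /user/bob/):
// fetch("/user/bob/files/text.txt", xsrfFetchOptions(token))
//   .then((response) => response.text())
//   .then(console.log);
```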
Thanks for your explanation @mahendrapaipuri! That clarifies a lot!
Yes, you are right that _xsrf token is stored in cookies and they are being forwarded with the request. But from my understanding JupyterHub expects them as a query parameter or a header.
Are they expected by JupyterHub or Jupyter Server? Because from the log it looks like the _xsrf cookies are issued by the ServerApp and also the ServerApp redirects to the hub oauth, not JupyterHub. Could it be that the JupyterHub proxy does not properly forward the cookies (in headers) from the browser requests to the ServerApp?
Based on our discussion, I have now tried extracting the _xsrf cookie from JavaScript and appending it to the URL.
This indeed worked; however, there appear to be several _xsrf cookies set in my browser, and it is not very clear which one should be appended. Anyway, I am marking this workaround as a solution for now.
Unfortunately, I don't remember the details. I remember we had to change a few internal services due to this security enhancement. Maybe @manics or @minrk can chip in here on what exactly is happening!
This indeed worked; however, there appear to be several _xsrf cookies set in my browser, and it is not very clear which one should be appended.
There should be one _xsrf token for each path. So, I assume you have one for /hub/, one for /user/bob, etc. For your use case, you should use the one for /user/bob.
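A rough sketch of the workaround discussed above, not the original poster's code. The helper names getXsrfTokens and withXsrf are hypothetical, and the query parameter name _xsrf is an assumption based on how Tornado's XSRF check reads the token. Note that document.cookie does not expose cookie paths; browsers generally list cookies with longer paths first, so the first entry is likely the one for /user/bob/, but that ordering is worth verifying in your setup:

```javascript
// Hypothetical helper: collect every _xsrf value from a cookie string
// such as document.cookie, in the order the browser lists them.
function getXsrfTokens(cookieString) {
  return cookieString
    .split(";")
    .map((part) => part.trim())
    .filter((part) => part.startsWith("_xsrf="))
    .map((part) => part.slice("_xsrf=".length));
}

// Hypothetical helper: append the token to a URL as the _xsrf query
// parameter (parameter name is an assumption, see the note above).
function withXsrf(url, token) {
  const separator = url.includes("?") ? "&" : "?";
  return url + separator + "_xsrf=" + encodeURIComponent(token);
}

// Usage in the browser:
// const [token] = getXsrfTokens(document.cookie);
// fetch(withXsrf("/user/bob/files/text.txt", token));
```

Keeping the cookie parsing separate from the URL building makes it easy to swap the query parameter for a request header if that turns out to be preferable.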