Accessing local files on the user server from JavaScript

We are using AnyWidget in Jupyter notebooks and need to access the user's local files from JavaScript. The files are served at the URL /user/name/files/..., which works when used directly, e.g., from a Markdown cell. However, this does not seem to work from JavaScript, for example, using the JavaScript magic:

%%js
fetch('/user/bob/files/text.txt')
  .then( response => response.text() )
  .then( text => console.log(text) );

From the JupyterHub log, it appears that this request results in an OAuth redirect loop:

[I 2024-11-28 14:16:25.321 ServerApp] 302 GET /user/bob/files/text.txt -> /hub/api/oauth2/authorize?client_id=jupyterhub-user-bob&redirect_uri=%2Fuser%2Fbob%2Foauth_callback&response_type=code&state=[secret] (@::1) 2.75ms
[I 2024-11-28 14:16:25.335 JupyterHub log:192] 302 GET /hub/api/oauth2/authorize?client_id=jupyterhub-user-bob&redirect_uri=%2Fuser%2Fbob%2Foauth_callback&response_type=code&state=[secret] -> /user/bob/oauth_callback?code=[secret]&state=[secret] (bob@::1) 5.87ms
[I 2024-11-28 14:16:25.347 JupyterHub log:192] 200 POST /hub/api/oauth2/token (bob@127.0.0.1) 6.58ms
[I 2024-11-28 14:16:25.351 JupyterHub log:192] 200 GET /hub/api/user (bob@127.0.0.1) 1.99ms
[I 2024-11-28 14:16:25.351 ServerApp] Logged-in user bob
[I 2024-11-28 14:16:25.352 ServerApp] Setting new xsrf cookie for b'494dc1f0c5c748459120a76135b48cb8:d3e9995339aa0d9231c7229057b56387f82aa7e6ff7de72b9a29c58c1cdac787' {'path': '/user/bob/'}
[I 2024-11-28 14:16:25.352 ServerApp] 302 GET /user/bob/oauth_callback?code=[secret]&state=[secret] -> /user/bob/files/text.txt (@::1) 13.36ms
[I 2024-11-28 14:16:25.356 ServerApp] 302 GET /user/bob/files/text.txt -> /hub/api/oauth2/authorize?client_id=jupyterhub-user-bob&redirect_uri=%2Fuser%2Fbob%2Foauth_callback&response_type=code&state=[secret] (@::1) 0.72ms
[I 2024-11-28 14:16:25.364 JupyterHub log:192] 302 GET /hub/api/oauth2/authorize?client_id=jupyterhub-user-bob&redirect_uri=%2Fuser%2Fbob%2Foauth_callback&response_type=code&state=[secret] -> /user/bob/oauth_callback?code=[secret]&state=[secret] (bob@::1) 4.49ms
[I 2024-11-28 14:16:25.376 JupyterHub log:192] 200 POST /hub/api/oauth2/token (bob@127.0.0.1) 6.42ms
[I 2024-11-28 14:16:25.379 JupyterHub log:192] 200 GET /hub/api/user (bob@127.0.0.1) 1.80ms
[I 2024-11-28 14:16:25.380 ServerApp] Logged-in user bob
[I 2024-11-28 14:16:25.380 ServerApp] Setting new xsrf cookie for b'494dc1f0c5c748459120a76135b48cb8:8840358aea5f3629f5d0de6157ba485d3c88339ead26617cd1df8e5dd94e39ad' {'path': '/user/bob/'}
[I 2024-11-28 14:16:25.380 ServerApp] 302 GET /user/bob/oauth_callback?code=[secret]&state=[secret] -> /user/bob/files/text.txt (@::1) 12.57ms
[I 2024-11-28 14:16:25.385 ServerApp] 302 GET /user/bob/files/text.txt -> /hub/api/oauth2/authorize?client_id=jupyterhub-user-bob&redirect_uri=%2Fuser%2Fbob%2Foauth_callback&response_type=code&state=[secret] (@::1) 1.08ms
[I 2024-11-28 14:16:25.393 JupyterHub log:192] 302 GET /hub/api/oauth2/authorize?client_id=jupyterhub-user-bob&redirect_uri=%2Fuser%2Fbob%2Foauth_callback&response_type=code&state=[secret] -> /user/bob/oauth_callback?code=[secret]&state=[secret] (bob@::1) 5.16ms
[I 2024-11-28 14:16:25.405 JupyterHub log:192] 200 POST /hub/api/oauth2/token (bob@127.0.0.1) 6.51ms
[I 2024-11-28 14:16:25.410 JupyterHub log:192] 200 GET /hub/api/user (bob@127.0.0.1) 2.37ms
[I 2024-11-28 14:16:25.410 ServerApp] Logged-in user bob
[I 2024-11-28 14:16:25.411 ServerApp] Setting new xsrf cookie for b'494dc1f0c5c748459120a76135b48cb8:fe442871b68024a53aa499f0d7fe15d1470e28fd19f0ada978eb06224cccd1e0' {'path': '/user/bob/'}
[I 2024-11-28 14:16:25.411 ServerApp] 302 GET /user/bob/oauth_callback?code=[secret]&state=[secret] -> /user/bob/files/text.txt (@::1) 14.52ms

Is this expected behavior?

The same code worked previously with JupyterHub 4.0.2. It also works if used from JupyterLab directly. It also works with the URL obtained from the context menu “Copy Download Link” /user/bob/files/text.txt?xsrf=.... Is there a way of obtaining this URL programmatically from the notebook?

See also the GitHub issue for further details.

@manics

Regarding your comment on GitHub:

if you’re making calls from within a notebook cell you’ll need to add authentication headers.

I am not entirely sure that the problem is with authentication. If I write equivalent code in Python:

import urllib.request
with urllib.request.urlopen('http://localhost:8000/user/bob/files/text.txt') as f:
    print(f.read().decode('utf-8'))

then the hub redirects to the login page, and there is no redirect loop:

[I 2024-11-28 16:40:57.707 ServerApp] 302 GET /user/bob/files/text.txt -> /hub/api/oauth2/authorize?client_id=jupyterhub-user-bob&redirect_uri=%2Fuser%2Fbob%2Foauth_callback&response_type=code&state=[secret] (@::1) 1.24ms
[I 2024-11-28 16:40:57.710 JupyterHub log:192] 302 GET /hub/api/oauth2/authorize?client_id=jupyterhub-user-bob&redirect_uri=%2Fuser%2Fbob%2Foauth_callback&response_type=code&state=[secret] -> /hub/login?next=%2Fhub%2Fapi%2Foauth2%2Fauthorize%3Fclient_id%3Djupyterhub-user-bob%26redirect_uri%3D%252Fuser%252Fbob%252Foauth_callback%26response_type%3Dcode%26state%3DowDHMQbdIZ1sCuN0h3THNA (@::1) 0.64ms
[I 2024-11-28 16:40:57.713 JupyterHub _xsrf_utils:125] Setting new xsrf cookie for b'None:B2vvw5njN-vr9DUC1ldCBYxx-vOkrOJ9SsGShqD8qpI=' {'path': '/hub/', 'max_age': 3600}
[I 2024-11-28 16:40:57.715 JupyterHub log:192] 200 GET /hub/login?next=%2Fhub%2Fapi%2Foauth2%2Fauthorize%3Fclient_id%3Djupyterhub-user-bob%26redirect_uri%3D%252Fuser%252Fbob%252Foauth_callback%26response_type%3Dcode%26state%3DowDHMQbdIZ1sCuN0h3THNA (@::1) 1.88ms

This does not seem to be the case for the call from JavaScript.

I think you need to pass the _xsrf token along with the request to fetch the files. This requirement was added somewhere in JupyterHub 4.x for security reasons.

I did some further tests, and the first version of JupyterHub that is affected is 4.1.0. I looked at the release log, but I do not see anything that could be relevant.
As far as I understand, CVE-2024-28233 affects running code from other domains, which is not the case here. Or could it be #4630?

Also, shouldn’t the xsrf cookie from the browser be used for javascript requests? Even if the cookie is required, it is unclear to me why the hub does not redirect to the login page, as it does for the Python code.


Well, when you make a request from the browser, the request forwards the session cookie, which is valid, so authentication is already done. Hence, it should not redirect you to the login page.

In your example using Python’s urllib, there is no valid session when you make the request, and hence it redirects you to the login page for authentication. Does that make sense?

Also, shouldn’t the xsrf cookie from the browser be used for javascript requests?

Yes, you are right that the _xsrf token is stored in cookies and that they are forwarded with the request. But from my understanding, JupyterHub expects the token as a query parameter or a header.

The difference I see between the regular API requests made by JupyterLab and a JS fetch request like the one you are attempting is the lack of an X-XSRFToken header in the request.
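
As a minimal sketch of that idea, the token could be read from document.cookie and sent as an X-XSRFToken header. The helper names readXsrfTokens and xsrfHeaders are hypothetical, and whether the files endpoint accepts the header form in a given JupyterHub or Jupyter Server version is an assumption that would need to be checked:

```javascript
// Hypothetical helper: collect the values of all cookies named "_xsrf"
// from a cookie string such as document.cookie.
function readXsrfTokens(cookieString) {
  return cookieString
    .split(';')
    .map(c => c.trim())
    .filter(c => c.startsWith('_xsrf='))
    .map(c => c.slice('_xsrf='.length));
}

// Hypothetical helper: build request headers that carry the token.
// Assumption: the server reads the token from the X-XSRFToken header.
function xsrfHeaders(token) {
  return { 'X-XSRFToken': token };
}

// Browser-only usage (assumes the code runs in the notebook page, e.g. via %%js):
// const [token] = readXsrfTokens(document.cookie);
// fetch('/user/bob/files/text.txt', { headers: xsrfHeaders(token) })
//   .then(response => response.text())
//   .then(text => console.log(text));
```

If more than one _xsrf cookie is visible to the page, the first value returned here is not necessarily the one the server expects.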


Thanks for your explanation @mahendrapaipuri! That clarifies a lot!

Yes, you are right that the _xsrf token is stored in cookies and that they are forwarded with the request. But from my understanding, JupyterHub expects the token as a query parameter or a header.

Are they expected by JupyterHub or Jupyter Server? Because from the log it looks like the _xsrf cookies are issued by the ServerApp and also the ServerApp redirects to the hub oauth, not JupyterHub. Could it be that the JupyterHub proxy does not properly forward the cookies (in headers) from the browser requests to the ServerApp?

Based on our discussion, I tried extracting the _xsrf cookie in JavaScript and appending it to the URL:

%%js
const xsrfCookies = document.cookie.split(';')
    .map(c => c.trim())
    .filter(c => c.startsWith('_xsrf='))
;
fetch('/user/bob/files/text.txt?' + xsrfCookies[0])
  .then( response => response.text() )
  .then( text => console.log(text) );

This indeed worked; however, there appear to be several _xsrf cookies set in my browser, and it is not clear which one should be appended. Anyway, I am marking this workaround as the solution for now.


Unfortunately, I don't remember the details. I remember we had to change a few internal services because of this security enhancement. Maybe @manics or @minrk can chip in here on what exactly is happening.

This indeed worked; however, there appear to be several _xsrf cookies set in my browser, and it is not clear which one should be appended.

There should be one _xsrf token for each path. So, I assume you must have one for /hub/, one for /user/bob, etc. So, for your use case you should use one for /user/bob.

Unfortunately, it does not appear to be possible to get the paths (or other cookie attributes) from document.cookie.
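
Since the path cannot be read back, one possible workaround is to try each _xsrf value in turn and keep the first response that is not a redirect. This is only a sketch: the helper names xsrfCandidates, withXsrf, and fetchWithAnyXsrf are hypothetical, and it assumes that a wrong token leads to a redirect (as in the log above) rather than to an error page served with status 200.

```javascript
// Hypothetical helper: distinct values of all "_xsrf" cookies in a cookie string.
function xsrfCandidates(cookieString) {
  const values = cookieString
    .split(';')
    .map(c => c.trim())
    .filter(c => c.startsWith('_xsrf='))
    .map(c => c.slice('_xsrf='.length));
  return [...new Set(values)];
}

// Hypothetical helper: append the token as an "_xsrf" query parameter.
function withXsrf(path, token) {
  const sep = path.includes('?') ? '&' : '?';
  return path + sep + '_xsrf=' + encodeURIComponent(token);
}

// Try each candidate until one request succeeds. redirect: 'manual' stops
// fetch from following the OAuth redirect, so a rejected token shows up as
// a response that is not ok instead of a redirect loop.
// fetchFn is injectable so the logic can be exercised outside a browser.
async function fetchWithAnyXsrf(path, cookieString, fetchFn = fetch) {
  for (const token of xsrfCandidates(cookieString)) {
    const response = await fetchFn(withXsrf(path, token), { redirect: 'manual' });
    if (response.ok) return response;
  }
  throw new Error('no _xsrf cookie was accepted for ' + path);
}

// Browser-only usage (assumes the code runs in the notebook page, e.g. via %%js):
// fetchWithAnyXsrf('/user/bob/files/text.txt', document.cookie)
//   .then(response => response.text())
//   .then(text => console.log(text));
```

Encoding the token with encodeURIComponent keeps characters such as "=" or "|" in the cookie value from corrupting the query string.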
