Strange issue with RTC in JupyterHub behind Apache HTTP reverse proxy

I’m trying to upgrade a small JupyterHub installation (JupyterHub 4.1.5, JupyterLab 3.6.7, jupyterlab-link-share 0.3.0) to recent versions, and use RTC (JupyterHub 5.0.0, JupyterLab 4.2.4, jupyter_collaboration 2.1.2).

It’s a very small user community, so the basic trust model of incorporating users is acceptable. The configurable-http-proxy is running behind a (company mandated) Apache HTTPD reverse proxy which terminates TLS. The hub is running under the URI prefix /APP/jupyter.

Most things work as expected: hub and lab function properly, and RTC works partially. But if I open a notebook that is not in the root folder of JupyterLab, the notebook doesn’t even open properly and the spinner in the UI spins forever. On the backend, I get the following errors:

[I 2024-07-29 15:58:39.233 YDocExtension] Creating FileLoader for: it-notebooks%2FSomeNotebook.ipynb
[I 2024-07-29 15:58:39.237 YDocExtension] Watching file: it-notebooks%2FSomeNotebook.ipynb
[I 2024-07-29 15:58:39.238 ServerApp] 101 GET /APP/jupyter/user/UUU/api/collaboration/room/json:notebook:xxx?sessionId=xxx (t603016@127.0.0.1) 23.38ms
[I 2024-07-29 15:58:39.238 ServerApp] Initializing room json:notebook:xxx
[E 2024-07-29 15:58:39.249 ServerApp] File it-notebooks%2FSomeNotebook.ipynb not found.
    HTTPError()
    Traceback (most recent call last):
      File "/opt/fqbjupyter/jupyter/lib/python3.11/site-packages/jupyter_server_ydoc/handlers.py", line 234, in open
        await self.room.initialize()
      File "/opt/fqbjupyter/jupyter/lib/python3.11/site-packages/jupyter_server_ydoc/rooms.py", line 110, in initialize
        model = await self._file.load_content(self._file_format, self._file_type)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/opt/fqbjupyter/jupyter/lib/python3.11/site-packages/jupyter_server_ydoc/loaders.py", line 114, in load_content
        model = await ensure_async(
                ^^^^^^^^^^^^^^^^^^^
      File "/opt/fqbjupyter/jupyter/lib/python3.11/site-packages/jupyter_core/utils/__init__.py", line 198, in ensure_async
        result = await obj
                 ^^^^^^^^^
      File "/opt/fqbjupyter/jupyter/lib/python3.11/site-packages/jupyter_server/services/contents/filemanager.py", line 910, in get
        raise web.HTTPError(404, "No such file or directory: %s" % path)
    tornado.web.HTTPError: HTTP 404: Not Found (No such file or directory: it-notebooks%2FSomeNotebook.ipynb)
[E 2024-07-29 15:58:39.252 ServerApp] Failed to write message
    Traceback (most recent call last):
      File "/opt/fqbjupyter/jupyter/lib/python3.11/site-packages/jupyter_server_ydoc/handlers.py", line 268, in send
        self.write_message(message, binary=True)
      File "/opt/fqbjupyter/jupyter/lib/python3.11/site-packages/tornado/websocket.py", line 332, in write_message
        raise WebSocketClosedError()
    tornado.websocket.WebSocketClosedError
[I 2024-07-29 15:58:39.253 ServerApp] Deleting Y document from memory: json:notebook:xxx
[I 2024-07-29 15:58:39.253 ServerApp] Room json:notebook:xxx deleted
[I 2024-07-29 15:58:39.259 ServerApp] Deleting file it-notebooks%2FSomeNotebook.ipynb
[E 2024-07-29 15:58:39.264 ServerApp] Document Room Exception, (room_id=json:notebook:xxx):
      + Exception Group Traceback (most recent call last):
      |   File "/opt/fqbjupyter/jupyter/lib/python3.11/site-packages/pycrdt_websocket/yroom.py", line 216, in start
      |     async with create_task_group() as self._task_group:
      |   File "/opt/fqbjupyter/jupyter/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 680, in __aexit__
      |     raise BaseExceptionGroup(
      | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
      +-+---------------- 1 ----------------
        | Traceback (most recent call last):
        |   File "/opt/fqbjupyter/jupyter/lib/python3.11/site-packages/jupyter_server_ydoc/rooms.py", line 205, in _broadcast_updates
        |     await super()._broadcast_updates()
        |   File "/opt/fqbjupyter/jupyter/lib/python3.11/site-packages/pycrdt_websocket/yroom.py", line 142, in _broadcast_updates
        |     await self._task_group.start(self.ystore.start)
        |           ^^^^^^^^^^^^^^^^^^^^^^
        | AttributeError: 'NoneType' object has no attribute 'start'
        +------------------------------------
[E 2024-07-29 15:58:39.265 ServerApp] Exception in callback functools.partial(<function WebSocketProtocol._run_callback.<locals>.<lambda> at 0x7f7530d01440>, <Task finished name='Task-2991' coro=<YDocWebSocketHandler.on_message() done, defined at /opt/fqbjupyter/jupyter/lib/python3.11/site-packages/jupyter_server_ydoc/handlers.py:279> exception=AttributeError("'YDocWebSocketHandler' object has no attribute 'room'")>)
    Traceback (most recent call last):
      File "/opt/fqbjupyter/jupyter/lib/python3.11/site-packages/tornado/ioloop.py", line 750, in _run_callback
        ret = callback()
              ^^^^^^^^^^
      File "/opt/fqbjupyter/jupyter/lib/python3.11/site-packages/tornado/websocket.py", line 640, in <lambda>
        self.stream.io_loop.add_future(result, lambda f: f.result())
                                                         ^^^^^^^^^^
      File "/opt/fqbjupyter/jupyter/lib/python3.11/site-packages/jupyter_server_ydoc/handlers.py", line 288, in on_message
        changes = self.room.awareness.get_changes(message[1:])
                  ^^^^^^^^^
    AttributeError: 'YDocWebSocketHandler' object has no attribute 'room'

Notice that the slash / is somehow encoded to %2F. In my opinion this is the cause of the error. When I take out the Apache HTTPD reverse proxy, everything works as expected, and the above messages in the log read as expected ‘it-notebooks/SomeNotebook.ipynb’
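To illustrate why an undecoded %2F ends in "No such file or directory", here is a minimal Python sketch. It is not the jupyter_server code path; the `lookup` helper and the temporary directory layout are made up for the example. It only shows that a path containing a literal `%2F` names a different (nonexistent) file than the decoded path does.

```python
import tempfile
from pathlib import Path
from urllib.parse import unquote


def lookup(root: Path, api_path: str) -> bool:
    """Hypothetical helper: True if api_path names an existing file under root."""
    return (root / api_path).is_file()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Recreate the layout from the log: a notebook inside a subfolder.
    (root / "it-notebooks").mkdir()
    (root / "it-notebooks" / "SomeNotebook.ipynb").write_text("{}")

    encoded = "it-notebooks%2FSomeNotebook.ipynb"
    # The still-encoded path is treated as a single file name containing "%2F".
    found_encoded = lookup(root, encoded)
    # After percent-decoding, the path points into the subfolder.
    found_decoded = lookup(root, unquote(encoded))

print(found_encoded, found_decoded)
```

A notebook in the root folder has no slash in its path, so nothing gets encoded, which would be consistent with root-level notebooks opening fine.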

So I’m pretty sure the problem stems from the combination of reverse proxy and JupyterHub. Nothing has changed in the RP config compared to the previous setup with JupyterHub 4.1.5. And the problem only happens when I have RTC turned on. When I disable it, everything works.

Does anybody have a tip as to what the root cause of this behaviour could be?

I can’t think of anything in JupyterHub that would modify the URL only when proxied via Apache. Can you share your full Apache configuration, or perhaps a minimal configuration example that you can run in an Apache Docker container that shows the problem?

I came up with a minimal Apache configuration that shows the issue (minimal httpd.conf sample for JupyterHub · GitHub). To check, I toggle the JupyterHub config between the following two lines:

c.JupyterHub.bind_url = 'http://127.0.0.1:8000/APP/jupyter'
#c.JupyterHub.bind_url = 'https://jupyterhubhost:8001/APP/jupyter'

When using the direct https connection on port 8001, I can open all notebooks and share them. When using Apache HTTPD as a proxy, it works with notebooks in the top-level directory, but everything below it gets mangled with the %2F.

I dug somewhat deeper, and I’m pretty sure it doesn’t have anything to do with JupyterHub. Sorry for raising it here.

The URL that causes the problem is issued by the jupyter-collaboration JS itself. It’s a PUT request to the collaboration API, and the JS does an explicit encodeURIComponent() of ‘it-notebooks/SomeNotebook.ipynb’, which generates the %2F. The request sent by the browser looks like this:

Request URL: https://jupyterhubhost:8001/APP/jupyter/user/xxx/api/collaboration/session/it-notebooks%2FSomeNotebook.ipynb?1722410953279
Request Method: PUT

When talking directly to JupyterHub without Apache HTTPD proxy, the Tornado server somehow manages to handle this correctly and return 200. When talking via HTTPD, a 404 Not Found response is returned and the UI spinner spins endlessly.
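The encoding step described above can be reproduced outside the browser. The following is a rough Python analogue of encodeURIComponent(), not the extension's actual code; the `encode_uri_component` helper and `base` variable are names invented for this sketch, and the URL is the one from the request shown above.

```python
from urllib.parse import quote, unquote


def encode_uri_component(value: str) -> str:
    """Approximate JavaScript's encodeURIComponent with urllib.parse.quote.

    quote() always leaves letters, digits and "_.-~" alone; the extra safe
    characters below are the others that encodeURIComponent does not escape.
    The slash is deliberately NOT in the safe set, so it becomes %2F.
    """
    return quote(value, safe="!*'()")


path = "it-notebooks/SomeNotebook.ipynb"
encoded = encode_uri_component(path)

# Base URL taken from the request above (user name elided as in the post).
base = "https://jupyterhubhost:8001/APP/jupyter/user/xxx"
session_url = f"{base}/api/collaboration/session/{encoded}"

print(encoded)
print(session_url)
# Decoding once gives back the original path.
print(unquote(encoded) == path)
```

The whole notebook path is sent as a single URL path segment, so whatever sits between the browser and Jupyter has to pass the `%2F` through untouched.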

Still have no idea which part of HTTPD mangles this request so that Jupyter cannot understand it anymore. I tried to add the ‘nocanon’ option to the ProxyPass command, but this didn’t help

ProxyPass http://127.0.0.1:8000/APP/jupyter nocanon

Apache debugging is telling me indeed that it found this strange element in the URL

[Wed Jul 31 09:54:47.680008 2024] [core:info] [pid 4171902:tid 140491123287808] [client 10.94.253.61:60612] AH00026: found %2f (encoded '/') in URI (decoded='/APP/jupyter/user/xxx/api/collaboration/session/it-notebooks/SomeNotebook.ipynb'), returning 404, referer: https://jupyterhubhost/APP/jupyter/user/xxx/lab
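The AH00026 message suggests that Apache itself answers with the 404 before the request ever reaches Jupyter. Below is a simplified Python model of that check, assuming Apache's default of AllowEncodedSlashes Off. It is only a sketch of the observable behaviour, not Apache's implementation, and the function name is hypothetical.

```python
import re
from urllib.parse import urlsplit

# An encoded slash may be written as %2F or %2f.
ENCODED_SLASH = re.compile(r"%2f", re.IGNORECASE)


def would_be_rejected(url: str, allow_encoded_slashes: bool = False) -> bool:
    """Model of the proxy's decision: reject if the raw path contains %2F.

    urlsplit() does not percent-decode, so .path is the raw request path.
    """
    raw_path = urlsplit(url).path
    return not allow_encoded_slashes and bool(ENCODED_SLASH.search(raw_path))


prefix = "https://jupyterhubhost/APP/jupyter/user/xxx/api/collaboration/session/"
nested = prefix + "it-notebooks%2FSomeNotebook.ipynb"
top_level = prefix + "SomeNotebook.ipynb"

print(would_be_rejected(nested))      # notebook in a subfolder
print(would_be_rejected(top_level))   # notebook in the root folder
print(would_be_rejected(nested, allow_encoded_slashes=True))
```

In this model only the subfolder notebook is rejected, which matches the symptom that notebooks in the top-level directory open normally.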

I’ll try the current beta 3.0.0 of jupyter-collaboration (jupyterlab/jupyter-collaboration on GitHub). The issue has already been reported there as "Cant´t open Notebook in a directory after activating jupyterlab-collaboration" (Issue #271), but there is no solution yet.

I’ll move the discussion there, because my issue clearly has nothing to do with JupyterHub.

Hello,

I have a similar issue. The Apache server rejects %2F in the URL, which the collaboration extension uses. Maybe this topic can help you.

Adding this to your Apache configuration file may solve your problem:

AllowEncodedSlashes On
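To confirm that the directive actually ended up in the configuration being served, a small script can list every AllowEncodedSlashes setting in a config file. This is only a convenience sketch: the `sample_conf` text and the function name are made up for the example, and it does a plain text scan rather than evaluating Apache's scoping rules.

```python
import re

# Match the directive at the start of a line; commented-out lines start
# with "#" and are therefore skipped.
DIRECTIVE = re.compile(
    r"^\s*AllowEncodedSlashes\s+(\S+)", re.IGNORECASE | re.MULTILINE
)


def encoded_slash_settings(conf_text: str) -> list[str]:
    """Return the value of every active AllowEncodedSlashes directive."""
    return [match.group(1) for match in DIRECTIVE.finditer(conf_text)]


# Hypothetical config excerpt, loosely modelled on the setup in this thread.
sample_conf = """
<VirtualHost *:443>
    # AllowEncodedSlashes Off
    AllowEncodedSlashes On
    <Location /APP/jupyter>
        ProxyPass http://127.0.0.1:8000/APP/jupyter nocanon
    </Location>
</VirtualHost>
"""

settings = encoded_slash_settings(sample_conf)
print(settings)
```

For a real check, read the file contents with `Path("...").read_text()` instead of using the inline sample; an empty result means the directive is not set in that file.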

Also try JupyterLab 4.1.6 with jupyter-collaboration 2.0.11.

These versions work well.

With the AllowEncodedSlashes On addition, things work perfectly for me :)

Even with JupyterHub 5.0.0, Lab 4.2.4 and jupyter-collaboration 2.1.2

I’ll give 3.0.0b1 another shot, just to be sure that it also works with that.

Can you show me your JupyterLab image? I couldn’t get it to work with JupyterLab 4.2.4.