Download of S3 files from JupyterLab fails

When using JupyterLab with S3ContentsManager, we are currently unable to download .ipynb notebook files stored on S3. Opening, reading, editing, and saving files all work. The failure occurs only when a file is downloaded through the /files/ route: JupyterLab does not resolve the file path correctly, and the request returns a 404 error.

For example, clicking the download button requests a link under the /files path, but /files is not mapped to the storage path of the S3ContentsManager.

https://url.domain.com/files/pinotebook.ipynb?_xsrf=*****

However, the same file can be accessed if the link uses the /notebooks/ path instead. So it appears that the route mapping for /files is not being established.

https://url.domain.com/notebooks/pinotebook.ipynb?_xsrf=***

My current approach is to implement an extension that dynamically redirects /files requests to the appropriate S3 bucket and prefix path. It uses a custom handler that retrieves the bucket and prefix from S3ContentsManager and constructs the correct file path.
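
To make the path construction concrete, here is a minimal sketch of the mapping step only, not the actual handler. The function name files_url_to_s3_key is hypothetical, and it assumes the server is mounted at the root of the domain (no base_url in front of /files/) and that the prefix is whatever S3ContentsManager.prefix is configured to.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def files_url_to_s3_key(url: str, prefix: str = "", route: str = "/files/") -> str:
    """Map a /files/<path> request URL to an S3 object key under `prefix`.

    Hypothetical helper for illustration; it is not part of s3contents.
    """
    # urlsplit separates the path from the query string (?_xsrf=...).
    request_path = urlsplit(url).path
    if not request_path.startswith(route):
        raise ValueError(f"not a {route} URL: {url!r}")
    # Decode percent-escapes, then collapse '.' and '..' segments.
    relative = posixpath.normpath(unquote(request_path[len(route):]))
    # Reject empty paths and anything that would climb above the prefix.
    if relative in (".", "..") or relative.startswith(("../", "/")):
        raise ValueError(f"path escapes the storage root: {url!r}")
    # S3 keys have no leading slash, so strip slashes from the prefix first.
    return posixpath.join(prefix.strip("/"), relative)
```

For example, files_url_to_s3_key("https://url.domain.com/files/pinotebook.ipynb?_xsrf=abc", prefix="team") gives "team/pinotebook.ipynb", which is the key a handler would then look up in the configured bucket.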

However, one problem remains, and I have some questions about the root cause:

  1. With the /files redirect in place, downloading .ipynb notebooks through S3ContentsManager works. However, downloading any other type of file results in an endless redirect loop between /files and /notebooks. I briefly reviewed JupyterLab’s source code and found that notebooks and other file types are handled by different logic branches.
  2. Could you help analyze the root cause of this issue with downloading files from S3? Specifically:
    • How does JupyterLab’s redirection mechanism work?
    • Is there a way to resolve this issue while keeping changes minimal?
    • What would be the best approach to fix this problem without introducing unnecessary complexity?

Current Configuration of S3ContentsManager

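import os
from s3contents import S3ContentsManager

c = get_config()  # noqa: F821 (provided by Jupyter when it loads the config file)
# Use S3 as the contents backend instead of the local filesystem.
c.ServerApp.contents_manager_class = S3ContentsManager
# Bucket, key prefix, and region are read from environment variables.
c.S3ContentsManager.bucket = os.environ.get('S3_BUCKET')
c.S3ContentsManager.prefix = os.environ.get('S3_PREFIX', '')
c.S3ContentsManager.region_name = os.environ.get('REGION')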