File access on HybridContents through CLI

Hi,

I am using HybridContentsManager to access notebooks and other files from S3, as well as local files, within a pod in a Z2JH setup.

Currently, creating notebooks in S3 works. However, the kernel (as well as the terminal and git) is not aware of the notebooks in the S3 bucket. Is it possible for the kernel to be aware of the notebooks in S3?

  • /home/jovyan/local_directory shows files local to the pod.
  • /home/jovyan shows files in the bucket

Running ls or cd in the terminal does not show any of the S3 objects.

import os

from hybridcontents import HybridContentsManager
from s3contents import S3ContentsManager
from jupyter_server.services.contents.largefilemanager import LargeFileManager

HOME = os.environ['HOME']  # /home/jovyan directory

c = get_config()

c.ServerApp.contents_manager_class = HybridContentsManager

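c.HybridContentsManager.manager_classes = {
    # Root of the file browser is served from the S3 bucket
    '': S3ContentsManager,
    # The 'local_directory' path is served from the pod's local disk
    'local_directory': LargeFileManager,
}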

c.HybridContentsManager.manager_kwargs = {
    '': {
        'access_key_id':'',
        'secret_access_key':'',
        'endpoint_url':'',
        'bucket':'',
        'prefix':'',
        'skip_tls_verify':True,
        'root_dir': HOME
    },
    'local_directory': {
        'root_dir': HOME,
    }
}

def no_spaces(path):
    return ' ' not in path

c.HybridContentsManager.path_validators = {
    '': no_spaces,
}

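Jupyter Server has a ContentsManager that defaults to using the local filesystem. It can be overridden:
https://jupyter-server.readthedocs.io/en/latest/developers/contents.html
but the override only applies to components that manage files through the Contents API, such as the file browser. The kernel, the terminal, and git work directly on the local filesystem of the pod, so they never see the objects in S3.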

If you want to access files on S3 from the kernel, you could use the S3 API, e.g. with the boto3 library if you’re using Python. Otherwise, you can try to mount S3 using FUSE, e.g. with s3fs-fuse (a FUSE-based file system backed by Amazon S3), but be aware that although the files will appear to be local, it’s not a POSIX filesystem, so you may run into problems with some operations.
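
As a rough sketch of the boto3 route, something like the following could be run from a notebook cell. The helper names (make_s3_client, list_keys, read_notebook) are made up for illustration, and the endpoint, credentials, bucket and prefix are assumed to be the same values passed to S3ContentsManager in the config above. It is a minimal example, not a drop-in solution.

```python
import json


def make_s3_client(endpoint_url, access_key_id, secret_access_key, verify=True):
    """Build a boto3 S3 client (hypothetical helper; boto3 must be installed)."""
    import boto3  # imported lazily so the other helpers work with any client object

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key_id,
        aws_secret_access_key=secret_access_key,
        verify=verify,  # pass False only if you also skip TLS verification in the config
    )


def list_keys(client, bucket, prefix=""):
    """Return object keys under a prefix (first page only, at most 1000 keys)."""
    response = client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    # "Contents" is absent from the response when nothing matches the prefix
    return [obj["Key"] for obj in response.get("Contents", [])]


def read_notebook(client, bucket, key):
    """Download one object and parse it as notebook JSON."""
    response = client.get_object(Bucket=bucket, Key=key)
    return json.loads(response["Body"].read())
```

Usage would look roughly like client = make_s3_client(endpoint, key_id, secret), then list_keys(client, "my-bucket") and read_notebook(client, "my-bucket", "Untitled.ipynb"), where the bucket and key names are placeholders. For buckets with more than 1000 objects, the paginator returned by client.get_paginator("list_objects_v2") would be needed instead of a single call.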