How to convert JupyterLab S3 Contents Manager to use a custom API instead?

Question:

I’m working with a JupyterLab extension that currently uses AWS S3 for file storage via the AWS SDK. For security reasons, I want to replace the direct S3 access with API calls to my backend server, which will handle the S3 operations server-side.

Here’s my current approach:

In index.ts, I’ve modified the authFileBrowser plugin to use placeholder credentials:

const authFileBrowser: JupyterFrontEndPlugin<IS3Auth> = {
  id: 'jupydrive-s3:auth-file-browser',
  description: 'The default file browser auth/credentials provider',
  provides: IS3Auth,
  activate: (): IS3Auth => {
    return {
      factory: async () => {
        console.log('Setting up S3/R2 proxy via server API...');
        
        // Since we're using API endpoints for all S3 operations,
        // we just need a minimal configuration with the bucket name
        const config = {
          bucket: 'my-bucket', // This is just for display purposes
          root: '',
          config: {
            forcePathStyle: true,
            // These are placeholder values since actual S3 operations
            // will be handled by the server
            endpoint: 'https://api-proxy',
            region: 'auto',
            credentials: {
              accessKeyId: 'proxy-auth',
              secretAccessKey: 'proxy-auth'
            }
          }
        };
        
        console.log('S3/R2 proxy setup complete');
        return config;
      }
    };
  }
};

And in s3contents.ts, I’m replacing the S3 operations with API calls:

async get(
  path: string,
  options?: Contents.IFetchOptions
): Promise<Contents.IModel> {
  path = path.replace(this._name + '/', '');

  // format root the first time contents are retrieved
  if (!this._isRootFormatted) {
    this._root = await this.formatRoot(this._root ?? '');
    this._isRootFormatted = true;
  }

  try {
    // Use API endpoint instead of direct S3 access
    const response = await fetch(`/api/s3/contents?path=${encodeURIComponent(path)}&root=${encodeURIComponent(this._root)}`, {
      method: 'GET'
    });

    if (!response.ok) {
      throw new Error(`Failed to fetch contents: ${response.statusText}`);
    }

    const data = await response.json();
    Contents.validateContentsModel(data);
    return data;
  } catch (error) {
    console.error('Error fetching contents:', error);
    throw error;
  }
}

// Similar changes for other methods like save, delete, rename, etc.
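
For context, here is a rough sketch of what I have in mind for save, following the same pattern (the PUT semantics of /api/s3/contents are just an assumption about my own backend, not something jupydrive-s3 provides):

async save(
  path: string,
  options: Partial<Contents.IModel> = {}
): Promise<Contents.IModel> {
  path = path.replace(this._name + '/', '');

  try {
    // Send the model to the backend, which performs the S3 write server-side
    const response = await fetch(
      `/api/s3/contents?path=${encodeURIComponent(path)}&root=${encodeURIComponent(this._root)}`,
      {
        method: 'PUT',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(options)
      }
    );

    if (!response.ok) {
      throw new Error(`Failed to save contents: ${response.statusText}`);
    }

    // Optionally validate the returned model the same way as in get()
    const data = await response.json();
    return data;
  } catch (error) {
    console.error('Error saving contents:', error);
    throw error;
  }
}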

My questions are:

  1. Is this the correct approach to replace S3 SDK operations with API calls?

  2. What specific API endpoints do I need to implement on my backend to fully support JupyterLab’s contents manager functionality?

  3. Are there any special considerations for handling binary files, large files, or streaming content?

  4. How should I handle authentication and authorization for these API calls? (One option I'm considering is sketched after this list.)

  5. Are there any examples or reference implementations of a custom API-based contents manager for JupyterLab?
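
For question 4, one option I'm considering (sketched below, assuming the /api/s3/contents proxy is implemented as a Jupyter Server extension; fetchContents is just an illustrative helper, not part of jupydrive-s3) is to route the requests through ServerConnection from @jupyterlab/services, so the usual Jupyter token/XSRF handling is applied automatically:

import { URLExt } from '@jupyterlab/coreutils';
import { ServerConnection } from '@jupyterlab/services';

// Sketch: if the proxy lives on the Jupyter server, ServerConnection adds the
// token/XSRF headers for us instead of relying on a bare fetch().
async function fetchContents(path: string, root: string): Promise<any> {
  const settings = ServerConnection.makeSettings();
  const url =
    URLExt.join(settings.baseUrl, 'api/s3/contents') +
    `?path=${encodeURIComponent(path)}&root=${encodeURIComponent(root)}`;

  const response = await ServerConnection.makeRequest(url, { method: 'GET' }, settings);
  if (!response.ok) {
    throw new ServerConnection.ResponseError(response);
  }
  return response.json();
}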

I’m trying to maintain all the functionality of the S3 contents manager but with improved security by keeping S3 credentials on the server side.


It would be great to get an answer to this. @fomightez @jtp, any thoughts on this? It's based on jupydrive-s3/src/index.ts at 7cb48f727d7a47ac5cca01df021a1c6b84465cf7 · QuantStack/jupydrive-s3 · GitHub. Help is most appreciated.

In this case, you may want to have a look at GitHub - QuantStack/jupyter-drives: Jupyter Server supporting JupyterLab IDrive, which can be configured on the backend.
