How can I get the notebook name from a kernel's runtime .json file or OS process ID?

We run JupyterHub with multiple users and want to quickly identify a notebook that a user has left running in the background consuming resources, so we can ask whether it is still needed or can be shut down.
Is it possible to run a command or use an API to get the notebook name associated with an OS process ID or a runtime/kernel-*.json file?

Thanks!

You should be able to find the user fairly easily. If you’re running JupyterHub on a single shared server with multiple operating system users, you can easily see who owns a process using tools such as ps. For something like Docker or Kubernetes, you can obtain the container’s CPU/memory usage through the platform.

I don’t know whether you can easily go from the PID to the notebook/kernel though. Hopefully someone else can help!

What information are you starting with? Do you know which user’s server you are interested in, at least?

It’s not easy in general because some of these bits of information aren’t part of the Jupyter spec, but you can probably get close enough in practice based on some conventions and assumptions.

I made this notebook as an example of looking up documents from PIDs and connection files.

The key bits of information:

  1. The server’s Sessions API is where documents are associated with kernels.
  2. You can’t get the PID from any Jupyter APIs, but the IPython kernel does expose an experimental message for retrieving it.
  3. If you have a connection file, its filename contains the kernel id, so you can map back to the kernel id and thereby to the document without needing to talk to the kernels themselves.
  4. You can talk to servers deployed by JupyterHub if you have a JupyterHub token with the access:servers scope.
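As a rough illustration of point 3, here is a minimal sketch of pulling the kernel id out of a connection file name. It assumes the usual kernel-<uuid>.json naming used for kernels started by the Jupyter server; the function name is my own, not part of any Jupyter API.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: connection files are named "kernel-<uuid>.json".
_KERNEL_ID_RE = re.compile(
    r"kernel-([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})\.json$"
)


def kernel_id_from_connection_file(path: str) -> Optional[str]:
    """Return the kernel id embedded in a connection file name, or None."""
    match = _KERNEL_ID_RE.search(Path(path).name)
    return match.group(1) if match else None
```

Files that don't follow that convention (for example, ones named after a PID rather than a UUID) simply return None, so they would need a different lookup.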

Summarizing the simpler approach to map a connection file name to a document:

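import requests

# Assumes server_url, token, and connection_file_name are already defined.
# The Sessions API lists each open document together with its kernel.
session_list = requests.get(
    f"{server_url}/api/sessions",
    headers={"Authorization": f"token {token}"},
).json()

for session in session_list:
    kernel_id = session["kernel"]["id"]
    # Connection file names contain the kernel id, so a substring match is enough.
    if kernel_id in connection_file_name:
        print(f"{connection_file_name}: {session['type']} {session['name']!r}")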
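If you are starting from a PID rather than a connection file, one way to bridge the gap on Linux is to read the process's command line, since the default IPython kernel is launched with "-f <connection file>". This is only a sketch under that assumption: the helper names are hypothetical, other kernels may be launched differently, and reading /proc for another user's process may need elevated privileges.

```python
from pathlib import Path
from typing import List, Optional


def connection_file_from_argv(argv: List[str]) -> Optional[str]:
    """Return the argument following "-f" in a kernel's argv, if present."""
    for i, arg in enumerate(argv[:-1]):
        if arg == "-f":
            return argv[i + 1]
    return None


def argv_for_pid(pid: int) -> List[str]:
    """Read a process's arguments from /proc/<pid>/cmdline (Linux only)."""
    raw = Path(f"/proc/{pid}/cmdline").read_bytes()
    # Arguments are NUL-separated; drop the empty trailing piece.
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]
```

Something like connection_file_from_argv(argv_for_pid(pid)) would then give a value to use as connection_file_name in the loop above.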

Awesome, thanks for sharing the gist, very useful. In my case we are using JupyterHub, so it took me a bit to realize that I needed to set server_url = 'https://hostname:port/user/username'.
The other interesting point is that I am using VS Code to connect to a remote kernel and run my workload. In that case the API does not seem to have the information: the session name does not reflect the final notebook name if the notebook was created in VS Code, even though in VS Code I can see the list of filenames when I click on the kernel dropdown menu.

Any suggestions?

I don’t know how VS Code’s notebook + remote kernel setup works, but it sounds like the notebook doesn’t have to be on the server at all. The mapping may live in VS Code only, rather than anywhere on the Jupyter server, i.e. VS Code maps notebook document → VS Code uuid → Jupyter session uuid → Jupyter kernel uuid.

You can try checking the full contents of the session output (mainly the path variable) to see if there’s anything more recognizable.
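For example, a small helper like this (a sketch; describe_session is a hypothetical name, and the field names are the ones the Sessions API returns for each session) prints the path next to the name and kernel id so you can eyeball what VS Code registered:

```python
from typing import Any, Dict


def describe_session(session: Dict[str, Any]) -> str:
    """Summarize one entry from /api/sessions, including its path."""
    kernel = session.get("kernel", {})
    return (
        f"kernel={kernel.get('id')} type={session.get('type')} "
        f"name={session.get('name')!r} path={session.get('path')!r}"
    )
```

Calling print(describe_session(session)) for each entry in the session list shows every field side by side; missing fields show up as None rather than raising.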