Troubleshooting JLab server shutdowns

Multiple users are reporting that their JLab servers shut down unexpectedly. Culling is disabled, and all I can see in the logs is the server stopping with exit code 1. I was hoping to get some suggestions on how to troubleshoot this. I am attaching logs from the JLab server and the JHub server, along with my Helm cull settings. TIA!

Can you turn on debug logging, and show us the full hub logs, from several minutes before the server is stopped to after it’s stopped?

Missed your reply, but here is an example. From what I see, there is an issue writing to ~/.local, but I’m not sure why or how.

It certainly looks like it! Have you checked the permissions on that directory? Are you using a custom image, or a standard one from jupyter/docker-stacks? Can you show us your Z2JH config?

I am using a custom image that I built starting from jupyter/scipy-notebook. It essentially just adds some Python packages.

I did some digging and exec’d into one of the k8s nodes and found where the EFS volume I am using was mounted. I was able to see all the users’ home directories, named after their usernames. The interesting thing was that the directories of the users who had a problem were owned by root:root. As a test, I changed the ownership of one of those folders and was then able to start that user’s JupyterLab (JL).
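For anyone wanting to repeat that check without eyeballing `ls -l` output, here is a rough Python sketch that lists the top-level directories under a mount point that are owned by a given UID (root by default). The function name `find_misowned_homes` and the example path in the usage note are made up for illustration; the real mount path will differ per cluster.

```python
from pathlib import Path


def find_misowned_homes(mount_root, bad_uid=0):
    """Return (name, uid, gid) for each directory directly under
    mount_root that is owned by bad_uid (0 is root)."""
    results = []
    for entry in sorted(Path(mount_root).iterdir()):
        # Only look at real directories, not files or symlinks.
        if entry.is_symlink() or not entry.is_dir():
            continue
        st = entry.stat()
        if st.st_uid == bad_uid:
            results.append((entry.name, st.st_uid, st.st_gid))
    return results
```

Calling something like `find_misowned_homes("/mnt/efs/home")` on the node (hypothetical path) would return one tuple per root-owned home directory, which makes it easy to compare against the list of users reporting problems.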

I am now confused about what is setting the ownership of the users’ directories in our EFS. Is it a script that runs at startup of a JL instance?

JupyterLab/JupyterHub don’t modify permissions by default.

It’s possible to enable it, though, by starting the container as root and running chown on the user’s home directory.
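As far as I know, the docker-stacks images can do this in their startup script when the container is started as root with `CHOWN_HOME=yes` set. The sketch below is not that script, just a minimal Python illustration of the same idea: walk a home directory and chown anything not owned by the expected UID/GID. The name `fix_ownership` and its parameters are hypothetical. It defaults to a dry run, and actually changing ownership to another user requires root.

```python
import os


def fix_ownership(home, uid, gid, dry_run=True):
    """Return every path under home (including home itself) that is not
    owned by uid:gid. If dry_run is False, also chown those paths."""
    # Collect home plus everything below it; symlinked dirs are not followed.
    paths = [home]
    for dirpath, dirnames, filenames in os.walk(home):
        paths.extend(os.path.join(dirpath, name) for name in dirnames + filenames)

    changed = []
    for path in paths:
        st = os.lstat(path)  # inspect the link itself, not its target
        if (st.st_uid, st.st_gid) != (uid, gid):
            changed.append(path)
            if not dry_run:
                os.chown(path, uid, gid, follow_symlinks=False)
    return changed
```

Running it with `dry_run=True` first shows which paths would be touched, which is a reasonable sanity check before changing anything on a shared volume.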

If you’re not doing this, then something else is changing the EFS permissions. For example, maybe the user folders are created by some other process? Or maybe there’s another container running as root that creates a root-owned directory? Or could it be left over from testing something?
