Hello. I am running The Littlest JupyterHub (TLJH) for my class, hosted on Google Cloud. Usually, JupyterHub works fine. However, sometimes when a student has an error in their code, their kernel gets stuck, which eventually takes down the entire Google Cloud instance hosting JupyterHub. Is there a way to prevent this from happening?
Sorry if the question is naive. I am a relative novice with JupyterHub.
Have you looked at your system’s resource usage?
If it’s due to someone using too much memory or CPU, you can add per-user limits:
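For example, TLJH lets you set per-user CPU and memory limits with `tljh-config`. A minimal sketch (the `2G` and `2` values here are placeholders to adjust for your machine):

```shell
# Cap each user's memory at 2 GB and CPU at 2 cores (example values)
sudo tljh-config set limits.memory 2G
sudo tljh-config set limits.cpu 2

# Apply the new configuration
sudo tljh-config reload
```

The limits apply to newly started user servers, so users who are already running need to restart their server to pick them up.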
Thanks. I didn’t know you could set these limits. I’ll give that a try. What are some recommended limits for CPU and memory per user?
It’s not in any way trivial, but here is one example based on a 4 CPU / 32 GB machine.
- As a lower bound on what users need on average, I think you can go with 0.05 CPU / 250 MB, which would let about 80 users work on the same 4 CPU / 32 GB machine.
- As an upper limit on what any individual user should be allowed to consume, I think you can go with 1 CPU and 1 GB of memory. No single user can then drain all CPU or memory, and a user whose memory use peaks gets informed by an “OOMKilled” notification or similar, instead of the entire machine slowly running out of memory.
In short, I’d recommend:
- a preliminary 1 CPU / 1 GB limit per user
- setting up TLJH on machines with a relatively high amount of memory compared to CPU, such as the 1:8 ratio of a 4 CPU / 32 GB machine, since users typically run out of memory rather than CPU
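To make the sizing reasoning above concrete, here is a small sketch of the capacity arithmetic (the machine size and per-user averages are the example figures from this thread; substitute your own):

```python
# Example machine from the discussion: 4 CPUs, 32 GB memory.
cpus = 4
mem_gb = 32

# Assumed average per-user needs: 0.05 CPU and 250 MB (0.25 GB).
avg_cpu_per_user = 0.05
avg_mem_gb_per_user = 0.25

max_users_by_cpu = cpus / avg_cpu_per_user        # how many users CPU allows
max_users_by_mem = mem_gb / avg_mem_gb_per_user   # how many users memory allows

# Capacity is bounded by whichever resource runs out first.
capacity = int(min(max_users_by_cpu, max_users_by_mem))
print(capacity)  # → 80
```

Here CPU is the binding resource (4 / 0.05 = 80 users, versus 32 / 0.25 = 128 by memory), which matches the ~80-user figure above.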