Best practices for monitoring a JupyterHub?

1kastner · March 29, 2020, 1:06pm

Hello there,

I was thinking of setting up a JupyterHub with semi-trusted users. From previous projects, we have already decided how to isolate users appropriately. Now I was wondering whether you know of a solid monitoring approach. I mainly thought about two cases of abuse by users: using the Hub as a source for network attacks such as DDoS (making me as the admin partly responsible for something), or mining cryptocurrencies (slowing down the hub for others). My first idea was to check log files with Kibana, to monitor which processes each user spawns with general system administration tools, and to have some kind of alarm on “abnormal” activity. But which tools have been tested in practice? I would love to get some insights from somebody who has designed or even currently operates a somewhat similar monitoring system. Thank you so much for sharing!

Best regards

PS: I know no system can be perfect and there will always be some uncovered (security) issues. I only want to stop the most obvious cases.

choldgraf · March 30, 2020, 3:02am

One quick thought on mining, that we have learned from running mybinder.org:

Jupyter environments are generally for interactive computing. A very common pattern you see with interactive computing is that the CPU is usually doing nothing. AKA, the person is typing, browsing, looking at stuff, etc. (all of that is being done on the client side). So if we see a session on mybinder.org that is running at 100% CPU for an extended period of time, it’s very often a mining bot.
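To make the "100% CPU for an extended period" heuristic concrete, here is a minimal sketch. It assumes you already collect periodic CPU readings per session from somewhere (for example, your metrics stack); the function name, the threshold, and the sample format are all made up for illustration and are not what mybinder.org runs.

```python
from collections import defaultdict


def find_sustained_cpu(samples, threshold=95.0, min_consecutive=3):
    """Return session ids whose CPU stayed at or above `threshold`
    for at least `min_consecutive` consecutive samples.

    `samples` is an iterable of (session_id, cpu_percent) tuples in
    chronological order. Both the format and the defaults are assumptions.
    """
    streak = defaultdict(int)  # current run of high readings per session
    flagged = set()
    for session_id, cpu_percent in samples:
        if cpu_percent >= threshold:
            streak[session_id] += 1
            if streak[session_id] >= min_consecutive:
                flagged.add(session_id)
        else:
            # A single idle reading resets the run: interactive use is bursty.
            streak[session_id] = 0
    return flagged


samples = [
    ("alice", 3.0), ("bob", 99.0),
    ("alice", 100.0), ("bob", 100.0),
    ("alice", 2.0), ("bob", 98.5),
]
print(find_sustained_cpu(samples))  # {'bob'}
```

In practice `min_consecutive` would be chosen so that the window covers many minutes rather than three samples, and a flagged session is only a candidate for a closer look, not proof of mining.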

betatim · March 30, 2020, 7:32pm

Grafana is what mybinder.org uses to keep an eye on things.

I second what @choldgraf said about finding miners. They are usually not at all subtle and stick out like a sore thumb. A simple top on the node tells you which processes are consuming a lot of CPU, and the offenders are usually easy to spot. We use Grafana to have a web-based dashboard to keep an eye on the whole cluster.

The other thing to do, which will also limit opportunities for denial-of-service attacks starting from your cluster, is to fairly strictly limit which remote ports and hosts users can connect to. mybinder.org essentially only allows HTTP(S), rsync and git. A lot of hostnames related to crypto mining pools are blacklisted. We limit the bandwidth per instance as well.

Most of this is done via network policies in Kubernetes. I think cgroups (aka Docker) can also do some of this. And then there is also trusty iptables.
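As a rough illustration of the rule itself (not of how it is enforced, which would be network policies or iptables as mentioned above), the allow/deny decision could be sketched like this. The port numbers are the standard ones for HTTP, HTTPS, rsync and the git protocol and are my assumption about what "HTTP(S), rsync and git" maps to; the blocked domains are placeholders, not a real list.

```python
# Standard ports: HTTP, HTTPS, rsync daemon, git protocol (assumed mapping).
ALLOWED_PORTS = {80, 443, 873, 9418}

# Hypothetical placeholder entries, not real mining pools.
BLOCKED_DOMAINS = {"pool.example", "miner.example"}


def is_egress_allowed(host, port,
                      allowed_ports=ALLOWED_PORTS,
                      blocked_domains=BLOCKED_DOMAINS):
    """Return True if an outbound connection to host:port passes the policy."""
    if port not in allowed_ports:
        return False
    host = host.lower().rstrip(".")
    for domain in blocked_domains:
        # Block the domain itself and any of its subdomains.
        if host == domain or host.endswith("." + domain):
            return False
    return True


print(is_egress_allowed("github.com", 443))       # True
print(is_egress_allowed("github.com", 25))        # False
print(is_egress_allowed("eu.pool.example", 443))  # False
```

Matching on the domain suffix with a leading dot avoids accidentally blocking unrelated hosts whose names merely end in the same letters.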

Overall it depends a bit on how you’ve set up your hub.

arnim · March 30, 2020, 9:07pm

One more thing you may want to consider is how many resources you want to provide to your users. Allowing only up to 2 CPUs per user makes mining less attractive in many cases. We have been running an openly accessible JHub setup (that can be configured by the user via the binder mechanism) at http://notebooks.gesis.org/ for over 2 years now, and this resource-limiting strategy has worked quite well (so far). A further strategy (not tried so far) that might also allow for more resources is to let people write a short mini proposal before they can use your Hub (after all, you are also giving something to them ;)).

1kastner · March 31, 2020, 8:14am

@arnim the resource restriction sounds reasonable! Our Hub would also be used for machine learning. Therefore, sometimes a lot of computing resources could be needed!

1kastner · March 31, 2020, 9:18am

@choldgraf we would also have some machine learning tasks running on the Hub, so that could be another source of heavy CPU load. Hence, it might be a bit more difficult to differentiate between the two.

Do you think it is possible to check which processes a user runs, to tell apart Python processes from, let’s say, compiled programs or scripts in other languages such as shell?

betatim · March 31, 2020, 9:44am

These tools must exist but I haven’t been able to find them. If you do please share.

The problem is that abusers fairly quickly realise that they are easy prey if their process is named mining.sh or xminer.exe. So they rename their script/executable to python. This means it is harder than just looking at the name of the process :-/
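One partial heuristic, offered only as a sketch and not as one of the tools I was looking for: on Linux, /proc/<pid>/exe points at the binary that is actually running, so a process called python whose binary lives in a home or temp directory can be told apart from the system interpreter. The trusted directories below are assumptions about a typical image, and the function name is made up.

```python
from pathlib import PurePosixPath

# Assumed locations of legitimate interpreters in the user image.
TRUSTED_DIRS = ("/usr/bin", "/usr/local/bin", "/opt/conda/bin")


def is_suspicious(exe_path, trusted_dirs=TRUSTED_DIRS):
    """Return True if the executable does not live under a trusted directory.

    `exe_path` is the resolved path of the running binary, e.g. what
    /proc/<pid>/exe points to on Linux.
    """
    exe = PurePosixPath(exe_path)
    # Compare whole path components, so "/usr/binx" does not match "/usr/bin".
    return not any(PurePosixPath(d) in exe.parents for d in trusted_dirs)


print(is_suspicious("/usr/bin/python3"))    # False
print(is_suspicious("/home/jovyan/python")) # True
```

This is easy to get wrong in both directions: users who install their own environment in their home directory will be flagged, and a miner launched from the real Python interpreter will not be, so it can only narrow down where to look.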

1kastner · March 31, 2020, 9:59am

That is a fair point! I could imagine it to be cumbersome since we don’t want to hinder proper Python libraries (which might be implemented in another language). I will let you know as soon as I have results on this!
