[Feature request] More useful metrics collected by JupyterHub

Hi everyone, I would like to create a feature request and help in developing it.

As a developer and maintainer of our organization's JupyterHub, I am interested in collecting more useful data about our Hub to help justify the development and cloud costs dedicated to the project. So far I have found two very useful metrics collected by JupyterHub: the total number of registered users and the number of currently running servers. I would also like to know, for example, the number of currently active (logged-in) users, the distribution of server lifetimes, and more. I can also see that there are plans on the roadmap for more resource tracking inside servers.

In that vein, I would like to ask the community whether someone is already working on this and, if not, to get some pointers to help me start developing it. In particular, I would be glad of any recommendations on how to start implementing a metric for the number of currently active users.

JupyterHub is very flexible and highly customisable, so some of those stats depend on what’s offered by the underlying platform. JupyterHub could gather some standard metrics, but for now I think the easiest solution is to leverage your base infrastructure.

For example, if you’re using Kubernetes you can use Prometheus/Grafana to build dashboards visualising the current pods/user, resource consumption, storage, etc, in much more detail than would be possible with a generic set of JupyterHub metrics. If you’re using JupyterHub on HPC you should be able to get stats from your cluster management system.

@manics thanks for the reply! If I'm using kubespawner on k8s with a Prometheus/Grafana stack, how would I get a gauge for the number of unique active users, given that each user can run multiple servers at a time? I can only think of running a regexp on the pod names and counting the unique usernames in the list of Jupyter pods.
If someone has solved this problem, I would be interested to know the approach.

Yes, going through the pod metrics and matching/grouping/aggregating by labels will get you the most useful information, especially as you’ll see the resource usage of each pod.
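
As a rough illustration of grouping by labels rather than parsing pod names, here is a minimal Python sketch. It assumes the single-user pods carry a username label (the key `hub.jupyter.org/username` is an assumption here, so check the labels on your own pods) and that the input has the shape of the `items` list from `kubectl get pods -o json`. The function name `servers_per_user` and the sample data are hypothetical.

```python
from collections import Counter

# Assumed label key for the pod owner; verify it against your own pods.
USERNAME_LABEL = "hub.jupyter.org/username"


def servers_per_user(pods, label=USERNAME_LABEL):
    """Count running pods per user.

    `pods` is a list of dicts shaped like the `items` list returned by
    `kubectl get pods -o json`. Pods without the label are ignored.
    """
    counts = Counter()
    for pod in pods:
        # Skip pods that are pending, failed, or already finished.
        if pod.get("status", {}).get("phase") != "Running":
            continue
        username = pod.get("metadata", {}).get("labels", {}).get(label)
        if username is not None:
            counts[username] += 1
    return counts


# Hypothetical sample data: alice runs two servers, bob runs one,
# and carol's pod has not started yet.
sample_pods = [
    {"metadata": {"labels": {USERNAME_LABEL: "alice"}}, "status": {"phase": "Running"}},
    {"metadata": {"labels": {USERNAME_LABEL: "alice"}}, "status": {"phase": "Running"}},
    {"metadata": {"labels": {USERNAME_LABEL: "bob"}}, "status": {"phase": "Running"}},
    {"metadata": {"labels": {USERNAME_LABEL: "carol"}}, "status": {"phase": "Pending"}},
    {"metadata": {"labels": {}}, "status": {"phase": "Running"}},
]

counts = servers_per_user(sample_pods)
unique_active_users = len(counts)  # users with at least one running server
print(unique_active_users, dict(counts))
```

Counting the keys gives the number of unique users with at least one running server, however many servers each of them has. The same grouping could be done in a Prometheus query if the pod labels are exported as metric labels, which depends on how your monitoring stack is configured.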

There are some example Grafana dashboards you could extend in