Configuring PAMAuthenticator executor

Is it possible to configure which executor the PAMAuthenticator uses? I’ve tried the obvious approach of adding c.PAMAuthenticator.executor = ThreadPoolExecutor(10) to my jupyterhub_config.py, with no luck.

Context:
I’ve been debugging an issue with our JupyterHub install (~100 active users at any time): if too many users (more than 5, so not that many) attempt to log in at the same time, authentication hangs forever (or at least for 5 minutes; I tend to get bored waiting after that long). The only resolution we’ve found so far is restarting the hub.

From my testing, I’ve discovered that the PAMAuthenticator essentially processes authentication requests sequentially, with no parallelism, because its executor is set to use only 1 thread here: jupyterhub/jupyterhub/auth.py at f4fa229645e007f3c3ea9d2ed440855f7b19595f · jupyterhub/jupyterhub · GitHub

Our PAM authentication flow uses RADIUS and can take 30+ seconds per user, so my best guess for what’s happening is that these long requests stack up until a timeout somewhere is hit, breaking authentication on our hub.

So it seems there is a bug somewhere in the code that causes this hang to never resolve, but being able to configure some level of parallelism will likely solve our problem, or at least mean I can stop restarting the hub so often.
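To illustrate the queueing effect described above, here is a minimal sketch that uses only the standard library. It assumes each login is a blocking call; `fake_pam_auth` is a hypothetical stand-in for the real PAM/RADIUS round trip, and the delay is shortened to 0.1 s instead of 30 s so the script finishes quickly.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_pam_auth(user, delay=0.1):
    """Hypothetical stand-in for a blocking PAM/RADIUS call."""
    time.sleep(delay)
    return user


def time_logins(max_workers, n_users=6):
    """Run n_users blocking logins through a pool; return total wall-clock time."""
    users = [f"user{i}" for i in range(n_users)]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers) as pool:
        results = list(pool.map(fake_pam_auth, users))
    assert len(results) == n_users
    return time.perf_counter() - start


# With one worker the logins run back to back; with ten they overlap.
serial = time_logins(max_workers=1)
parallel = time_logins(max_workers=10)
print(f"1 thread: {serial:.2f}s, 10 threads: {parallel:.2f}s")
```

With a single worker the total time grows roughly as the number of users times the per-login delay, so at 30 seconds per login the sixth user in a burst would wait around three minutes before getting a result.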

I’d have expected c.PAMAuthenticator.executor=ThreadPoolExecutor(10) to do what you want. Can you run JupyterHub with debug logging, and show us the logs when multiple people attempt to login?

Seems like PAMAuthenticator.executor is not a configurable traitlet.

Looking through my logs from when I tried setting the executor I found this line

[W 2024-08-06 16:34:15.607 JupyterHub configurable:214] Config option `executor` not recognized by `PAMAuthenticator`.

So I guess that answers the question of whether the executor is configurable.

I’d prefer not to muck about with our production environment if I can help it, but I do have an easy repro case here: GitHub - Will-Shanks/jupyterhub_auth_repro: repro PAMAuthenticator auth issue

I can generate logs from this setup if you like, or if you really need it I can have a go at collecting some from our production environment.

Good point! I can’t think why it couldn’t be made configurable, or at least the number of threads could be configurable.

In the meantime you could try creating an in-line subclass in your jupyterhub_config.py file, something like:

from concurrent.futures import ThreadPoolExecutor
from jupyterhub.auth import PAMAuthenticator
from traitlets import Any, default

class CustomPAMAuthenticator(PAMAuthenticator):
    executor = Any()  # I'm not sure if this is necessary, or if overriding `_default_executor` is enough without defining `executor` again

    # Replace the default single-thread executor with a 10-thread pool
    @default('executor')
    def _default_executor(self):
        return ThreadPoolExecutor(10)

c.JupyterHub.authenticator_class = CustomPAMAuthenticator

Thanks! With a little tweak that does the job!

from concurrent.futures import ThreadPoolExecutor
from jupyterhub.auth import PAMAuthenticator
from traitlets import default

class CustomPAMAuthenticator(PAMAuthenticator):
    @default('executor')
    def _default_executor(self):
        return ThreadPoolExecutor(10)

c.JupyterHub.authenticator_class = CustomPAMAuthenticator

It might just be my lack of understanding of how the configuration magic works, but it looks like making executor itself a config option might be tricky. Adding a new executor_thread_pool_size config option to initialize it with, on the other hand, should be trivial. Is that something that would be accepted as a PR? I’d be happy to submit it if so. I have to imagine this would be useful to more than just our site, and it would have been a very helpful clue for our debugging if it had existed already.
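As a rough sketch of what such an option could look like, the example below uses a stand-in traitlets `Configurable` rather than the real PAMAuthenticator, so it runs without JupyterHub installed. The class name `SketchAuthenticator` is hypothetical, and `executor_thread_pool_size` is the name proposed in this post, not necessarily the name that JupyterHub adopted.

```python
from concurrent.futures import ThreadPoolExecutor

from traitlets import Any, Integer, default
from traitlets.config import Config, Configurable


class SketchAuthenticator(Configurable):
    """Hypothetical stand-in for an authenticator with a sized thread pool."""

    # Proposed option name from the thread; an assumption, not JupyterHub's API.
    executor_thread_pool_size = Integer(
        1, help="Number of threads used for blocking authentication calls."
    ).tag(config=True)

    executor = Any()

    @default("executor")
    def _default_executor(self):
        # Built lazily on first access, after config has been applied.
        return ThreadPoolExecutor(self.executor_thread_pool_size)


# Equivalent of a line in a config file, using a plain Config object.
cfg = Config()
cfg.SketchAuthenticator.executor_thread_pool_size = 10

auth = SketchAuthenticator(config=cfg)
future = auth.executor.submit(sum, [1, 2, 3])
print(future.result())
auth.executor.shutdown()
```

Because the pool is created from a plain integer, the setting can live in a config file as a number, which avoids having to construct an executor object inside the configuration.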


A PR to make the pool size configurable seems reasonable to me. I’ve found one example in OAuthenticator:


Thank you for all the help! We’ve been working on this for a while, so it’s frustrating to see how simple the fix was, but great to have finally found it.

Just opened a PR: PAMAuthenticator: make executor threads configurable by Will-Shanks · Pull Request #4863 · jupyterhub/jupyterhub · GitHub
