Configuring PAMAuthenticator executor

Will-Shanks · August 6, 2024, 1:41am

Is it possible to configure which executor the PAMAuthenticator uses? I’ve tried the obvious approach of adding c.PAMAuthenticator.executor=ThreadPoolExecutor(10) to my jupyterhub_config.py, with no luck.

Context:
I’ve been debugging an issue with our JupyterHub install (~100 active users at any time): if too many users (>5, so not that many) attempt to log in at the same time, authentication hangs forever (or at least 5 min; I tend to get bored waiting after that long). The only resolution we’ve found so far is restarting the hub.

From my testing, I’ve discovered that the PAMAuthenticator essentially processes authentication requests sequentially, with no parallelism, because its executor is set to use only 1 thread here: jupyterhub/jupyterhub/auth.py at f4fa229645e007f3c3ea9d2ed440855f7b19595f (jupyterhub/jupyterhub on GitHub)

Our PAM authentication flow uses RADIUS and can take 30+ seconds per user, so my best guess for what’s happening is that these long requests stack up and cause a timeout somewhere to be hit, breaking authentication on our hub.
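To illustrate the stacking effect, here is a minimal sketch that uses a sleep as a stand-in for a slow PAM/RADIUS round trip. The names `fake_pam_auth` and `time_logins` are hypothetical and the delay is shortened; this is not JupyterHub code, only a model of a fixed-size thread pool serving simultaneous logins.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_pam_auth(username, delay):
    """Hypothetical stand-in for a blocking PAM call (e.g. a slow RADIUS lookup)."""
    time.sleep(delay)
    return username


def time_logins(max_workers, n_users=6, delay=0.2):
    """Submit n_users simultaneous logins to a pool and time the whole batch."""
    users = [f"user{i}" for i in range(n_users)]
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers) as pool:
        results = list(pool.map(lambda u: fake_pam_auth(u, delay), users))
    return time.monotonic() - start, results


if __name__ == "__main__":
    for workers in (1, 10):
        elapsed, _ = time_logins(workers)
        print(f"{workers} worker(s): {elapsed:.2f}s for 6 logins")
```

With a single worker the last user waits for roughly n_users × delay, so at 30+ seconds per request even a handful of simultaneous logins adds up to minutes of queueing. With enough workers the batch takes about one delay.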

So it seems there is a bug somewhere in the code that keeps this hang from ever resolving, but being able to configure some level of parallelism will likely solve our problem, or at least mean I won’t have to restart the hub so often.

manics · August 6, 2024, 3:04pm

I’d have expected c.PAMAuthenticator.executor=ThreadPoolExecutor(10) to do what you want. Can you run JupyterHub with debug logging, and show us the logs when multiple people attempt to log in?
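For reference, debug logging can be turned on from jupyterhub_config.py. This is a small config fragment, assuming the usual setup where JupyterHub provides `c` when it loads the file:

```python
# jupyterhub_config.py
# `c` is the config object JupyterHub injects when it loads this file.
c.JupyterHub.log_level = 'DEBUG'
```

Starting the hub with `jupyterhub --debug` should have the same effect for a one-off run.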

mahendrapaipuri · August 6, 2024, 4:49pm

Seems like PAMAuthenticator.executor is not a configurable traitlet.

Will-Shanks · August 6, 2024, 4:50pm

Looking through my logs from when I tried setting the executor, I found this line:

[W 2024-08-06 16:34:15.607 JupyterHub configurable:214] Config option `executor` not recognized by `PAMAuthenticator`.

So I guess that answers the question of whether the executor is configurable.
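The warning comes from how traitlets handles configuration: only traits tagged with config=True are loaded from config. A minimal sketch with a hypothetical `DemoAuthenticator` class (not the real PAMAuthenticator) shows the difference, assuming a reasonably recent traitlets:

```python
from traitlets import Any, Integer
from traitlets.config import Config, Configurable


class DemoAuthenticator(Configurable):
    """Hypothetical class, only here to contrast tagged and untagged traits."""

    # Tagged config=True: can be set from a config file.
    pool_size = Integer(1, config=True)
    # Not tagged: config values for it are not applied.
    executor = Any()


cfg = Config()
cfg.DemoAuthenticator.pool_size = 10
cfg.DemoAuthenticator.executor = "ignored"

auth = DemoAuthenticator(config=cfg)

# Names of the traits that can actually be set through config.
configurable_names = sorted(DemoAuthenticator.class_traits(config=True))
```

Here `auth.pool_size` picks up the configured value while `auth.executor` keeps its default, and recent traitlets versions report the unrecognized option much like the log line above. Checking `PAMAuthenticator.class_traits(config=True)` the same way should list what the real class accepts.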

I’d prefer not to muck about with our production environment if I can help it, but I do have an easy repro case here: GitHub - Will-Shanks/jupyterhub_auth_repro: repro PAMAuthenticator auth issue

I can generate logs from this setup if you like, or if you really need it I can have a go at collecting some from our production environment.

manics · August 6, 2024, 5:06pm

Good point! I can’t think why it couldn’t be made configurable, or at least the number of threads could be configurable.

In the meantime you could try creating an in-line subclass in your jupyterhub_config.py file, something like:

from concurrent.futures import ThreadPoolExecutor

from jupyterhub.auth import PAMAuthenticator
from traitlets import Any, default

class CustomPAMAuthenticator(PAMAuthenticator):
    # I'm not sure if this is necessary, or if overriding `_default_executor`
    # is enough without defining `executor` again
    executor = Any()

    # Replace the default single-thread executor with a 10-thread pool
    @default('executor')
    def _default_executor(self):
        return ThreadPoolExecutor(10)

c.JupyterHub.authenticator_class = CustomPAMAuthenticator

Will-Shanks · August 6, 2024, 5:26pm

Thanks! With a little tweak that does the job!

from concurrent.futures import ThreadPoolExecutor
from jupyterhub.auth import PAMAuthenticator
from traitlets import default

class CustomPAMAuthenticator(PAMAuthenticator):
    @default('executor')
    def _default_executor(self):
        return ThreadPoolExecutor(10)

c.JupyterHub.authenticator_class = CustomPAMAuthenticator

It might just be my lack of understanding of how the configuration magic works, but making executor itself a config option looks tricky, whereas adding a new executor_thread_pool_size config option to initialize it with should be trivial. Is that something that would be accepted as a PR? I’d be happy to submit it if so. I have to imagine this would be useful to more than just our site, and it would have been a very helpful clue for our debugging if it had existed already.
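Until something like that exists upstream, the same idea can be sketched as a subclass in jupyterhub_config.py. The option name `executor_thread_pool_size` is just the one proposed above and is an assumption here, not an existing JupyterHub setting; the default of 10 is arbitrary.

```python
# jupyterhub_config.py
from concurrent.futures import ThreadPoolExecutor

from jupyterhub.auth import PAMAuthenticator
from traitlets import Integer, default


class CustomPAMAuthenticator(PAMAuthenticator):
    # Hypothetical option name, taken from the proposal in this thread.
    executor_thread_pool_size = Integer(
        10,
        config=True,
        help="Number of threads used for blocking PAM calls.",
    )

    # The default is computed on first access, after config has been loaded,
    # so it sees the configured pool size.
    @default('executor')
    def _default_executor(self):
        return ThreadPoolExecutor(self.executor_thread_pool_size)


c.JupyterHub.authenticator_class = CustomPAMAuthenticator
c.CustomPAMAuthenticator.executor_thread_pool_size = 20
```

Because the new trait is tagged config=True, the last line is accepted rather than producing the "not recognized" warning.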

manics · August 6, 2024, 9:06pm

A PR to make the pool size configurable seems reasonable to me. I’ve found one example in OAuthenticator:

github.com/jupyterhub/oauthenticator/blob/cd00a8d6e5a8a6a97088b4a1e73ccb419f197257/oauthenticator/mediawiki.py#L96

        return os.environ.get("LOGIN_SERVICE", "MediaWiki")

    mw_index_url = Unicode(
        os.environ.get('MW_INDEX_URL', 'https://meta.wikimedia.org/w/index.php'),
        config=True,
        help="""
        Full path to index.php of the MW instance to use to log in
        """,
    )

    executor_threads = Integer(
        12,
        config=True,
        help="""
        Number of executor threads.

        MediaWiki OAuth requests happen in this thread,
        so it is mostly waiting for network replies.
        """,
    )

Will-Shanks · August 6, 2024, 10:16pm

Thank you for all the help! We’ve been working on this for a while, so it’s frustrating to see how simple the fix was, but great to have finally found it.

Just opened a PR: PAMAuthenticator: make executor threads configurable by Will-Shanks · Pull Request #4863 · jupyterhub/jupyterhub · GitHub
