Pre_spawn_hook locks JupyterHub

#1

Hi everyone, I’m encountering a problem specific to my program flow using the azuread oauthenticator.
I need to create an aad system user for everyone that logs in and authenticates through my organisation’s Azure AD.
For that there is a Debian package called aadlogin. Basically, using its aaduseradd function I can create a system user that is registered remotely with the AD.
I don’t think that function is ideal at the moment, as adding a user this way can take up to half an hour.

Currently, I have a pre_spawn_hook that calls a shell script that makes the aad user, mounts a file share, and creates symlinks.
Because the aaduseradd function takes so long, JupyterHub locks up inside the pre_spawn_hook.
Basically, the hub doesn’t work while inside that shell script.
I’ve tried making the pre_spawn_hook a coroutine, to no avail.

How can I make sure the hub isn’t interrupted while spawning a user?

Update:
I’ve found out the reason for the command aaduseradd being so slow; instead of using symlinked directories like in /etc/skel it downloads the full data (260MB -> 11GB). This can take a long time.
If I can fix that I suspect the problem will be of a smaller magnitude, but I’d still like to know how I can make the pre_spawn_hook non-blocking.

#2

How are you calling aaduseradd? With the subprocess module? I’d recommend making pre_spawn_hook an async function and using https://docs.python.org/3/library/asyncio-subprocess.html. That should hopefully stop blocking JupyterHub.
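
A rough sketch of what I mean, in case it helps. The script path and the run_command helper are made up for illustration (I don’t know what your script is called or which arguments it takes), so treat this as a starting point rather than a drop-in config:

```python
import asyncio
import subprocess
import sys

# Hypothetical path; stands in for the shell script described in post #1.
SETUP_SCRIPT = "/usr/local/bin/setup_aad_user.sh"


async def run_command(*cmd):
    """Run a command without blocking the event loop; raise on non-zero exit."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    # communicate() is awaitable, so other coroutines can run while we wait.
    output, _ = await proc.communicate()
    if proc.returncode != 0:
        # Mirror subprocess.check_call by raising on failure.
        raise subprocess.CalledProcessError(proc.returncode, cmd, output=output)
    return output.decode()


async def pre_spawn_hook(spawner):
    # Assumption: the script takes the username as its only argument.
    await run_command(SETUP_SCRIPT, spawner.user.name)


# In jupyterhub_config.py:
# c.Spawner.pre_spawn_hook = pre_spawn_hook
```

The spawn for that one user will still wait until the script finishes, but the hub should be able to keep serving everyone else in the meantime.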

#3

I called it using subprocess.check_call, and did try making it an async function.
I will look into using asyncio-subprocess and report back!
Thanks!

#4

An important thing to know about async: async def doesn’t make a function asynchronous. It allows it to be asynchronous. Other operations can only run at the points where the function awaits something. So if an “async” function has no await statements, it’s fully blocking and nothing else can run while it’s working. This is what the asyncio-subprocess module does: it makes waiting for a subprocess awaitable so it doesn’t block other code, unlike the stdlib subprocess module.
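
Here is a small toy example of the difference, with nothing JupyterHub-specific in it. The names are invented for the demo: time.sleep stands in for a blocking call like subprocess.check_call, and the ticker stands in for whatever else the hub would like to be doing in the meantime.

```python
import asyncio
import time


async def blocking_hook():
    # No await: the event loop is stuck here for the full duration.
    time.sleep(0.3)


async def cooperative_hook():
    # await hands control back to the event loop while waiting.
    await asyncio.sleep(0.3)


async def count_ticks(hook):
    """Count how often a background task gets to run while hook() is in progress."""
    ticks = 0
    done = False

    async def ticker():
        nonlocal ticks
        while not done:
            ticks += 1
            await asyncio.sleep(0.05)

    task = asyncio.create_task(ticker())
    await hook()
    done = True
    await task
    return ticks


if __name__ == "__main__":
    print("blocking:", asyncio.run(count_ticks(blocking_hook)))
    print("cooperative:", asyncio.run(count_ticks(cooperative_hook)))
```

With the blocking version the ticker never gets a turn, even though the hook is declared with async def. With the cooperative version it runs several times while the hook is waiting.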

#5

Thanks for the info!
I solved the issue of aaduseradd taking so long, so currently there is no pressing need to switch to asyncio-subprocess.
As the server is up and running, I will test this for the next release.