Tailoring spawn options and server configuration to certain users

Background

If you want to customize something based on information about the logged-in user who is about to start a server, perhaps using custom Python logic, you can!

For example, perhaps you know that the user is part of a group with access to GPUs, and you want that to influence some spawn options and server configuration.

This post is about how you could go about that practically.

Problem 1 - Adjust presented server options based on user

What if you want to present different server options to different users, based on some property of the user? Just setting c.KubeSpawner.profile_list / singleuser.profileList won’t do because then all users would get the same options.

The solution is to set c.KubeSpawner.options_form to a custom function that first updates the_users_spawner_instance.profile_list based on logic that involves the user information.

Common section for examples below

The examples below assume we can request a “scope” called “gpu_access”, which is supposed to return a “claim” with the same name as a true or false value.

# This snippet assumes an OAuthenticator based Authenticator. The scopes will
# vary between setups, so treat this as an example only.
#
# - openid is common for OpenID Connect based authentication
# - profile, email are both commonly available scopes to request
# - gpu_access is entirely custom
#
c.OAuthenticator.scopes = ["openid", "profile", "email", "gpu_access"]

Solution to problem 1

To present JupyterHub server spawn options based on the user:

# To adjust the spawn options presented to the user, we must create a custom
# options_form function, and this example demonstrates how!
#
# profile_list (KubeSpawner class) can be configured as a convenience to
# generate the HTML for the options_form configuration (Spawner class).
#
# If options_form is set (or indirectly set through profile_list), it is the
# HTML that users are presented with when users have signed in and want to start
# a server.
#
# While options_form is allowed to be an HTML string, it can also be a callable
# function that generates HTML when called. If the callable returns a falsy
# value, no form will be rendered.
#
# In this custom options_form function, we will make a decision based on user
# information, update profile_list, and rely on the profile_list logic to render
# the HTML for us.
#
async def custom_options_form(spawner):
    # See the pre_spawn_hook example for more ways to get information about the
    # user
    auth_state = await spawner.user.get_auth_state()
    user_details = auth_state["oauth_user"]
    gpu_access = user_details.get("gpu_access", False)

    # Declare the common profile list for all users
    spawner.profile_list = [
        {
            'display_name': 'CPU server',
            'slug': 'cpu',
            'default': True,
        },
    ]

    # Dynamically update profile_list based on user
    if gpu_access:
        spawner.log.info(f"GPU access options added for {spawner.user.name}.")
        spawner.profile_list.extend(
            [
                {
                    'display_name': 'GPU server',
                    'slug': 'gpu',
                    'kubespawner_override': {
                        'image': 'training/datascience:my_tag',
                    },
                },
            ]
        )

    # Let KubeSpawner inspect profile_list and decide what to return. It
    # will return a falsy blank string if there is no profile_list, in
    # which case no options form is presented.
    #
    # ref: https://github.com/jupyterhub/kubespawner/blob/37a80abb0a6c826e5c118a068fa1cf2725738038/kubespawner/spawner.py#L1885-L1935
    return spawner._options_form_default()

# Don't forget to ask KubeSpawner to make use of this custom hook
c.KubeSpawner.options_form = custom_options_form
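
The branching inside custom_options_form can be pulled out into a plain function, which makes it easy to check without a running hub. This is only a minimal sketch, assuming the same two profiles as above; build_profile_list is a hypothetical helper name and not part of KubeSpawner.

```python
def build_profile_list(gpu_access):
    """Return the profiles to offer, given whether the user may use GPUs.

    build_profile_list is a hypothetical helper for illustration; it is not
    a KubeSpawner API.
    """
    # Profiles every user gets
    profiles = [
        {"display_name": "CPU server", "slug": "cpu", "default": True},
    ]
    # Profiles only offered to users with GPU access
    if gpu_access:
        profiles.append(
            {
                "display_name": "GPU server",
                "slug": "gpu",
                "kubespawner_override": {"image": "training/datascience:my_tag"},
            }
        )
    return profiles


cpu_only = build_profile_list(False)
with_gpu = build_profile_list(True)
print([p["slug"] for p in cpu_only])
print([p["slug"] for p in with_gpu])
```

Inside custom_options_form you would then assign spawner.profile_list = build_profile_list(gpu_access) before returning the rendered form.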

Problem 2 - Adjust spawned server config based on user and/or user’s chosen option

What if you wanted to add a label, environment variable, container, or similar to a user pod based on the user, the user’s “auth state”, and/or the chosen profile among the profile_list profiles? Setting c.KubeSpawner.extra_labels or similar would not cut it, because it would influence all users, not just a specific user.

The solution is to set c.KubeSpawner.pre_spawn_hook to a custom function that updates the_users_spawner_instance.<some option> based on logic that involves the user information.

Solution to problem 2

To adjust the server based on the user, the user’s “auth state”, and/or the chosen profile:

async def custom_pre_spawn_hook(spawner):
    # Here are examples of information available to us:
    #
    # 1. Info about the user from the JupyterHub User object
    #
    username = spawner.user.name
    # ... more User object properties are available, for more information, see:
    # https://jupyterhub.readthedocs.io/en/latest/_static/rest-api/index.html#/definitions/User
    
    # 2. Info about the user's profile if profile_list was configured
    #
    #    This value will be either the "slug" or "display_name". In our example,
    #    "cpu" or "gpu".
    #
    chosen_profile = spawner.user_options.get("profile", "")

    # 3. Info about the user from the user's authentication state
    #
    #    The authentication state must be enabled for this to be accessible, so
    #    set hub.config.Authenticator.enable_auth_state to true.
    #
    #    Depending on what authenticator you use and how it is configured, you
    #    can get access to different things. By configuring the "scope" of an
    #    OAuthenticator based class to include "email", you should have the
    #    "email" claim available. You request scopes, and may be returned claims.
    #
    auth_state = await spawner.user.get_auth_state()
    user_details = auth_state["oauth_user"]
    gpu_access = user_details.get("gpu_access", False)

    # Here are examples on how to update settings based on user information:
    #
    # 1. Setting a label
    #
    #    With .update on this dict, we don't remove all other extra_labels
    #
    spawner.extra_labels.update(
        {
            "sundellopensource.se/gpu-access": str(gpu_access).lower(),
        }
    )

    # 2. Add an init_container
    #
    #    With .insert on this list, we don't remove all other init_containers
    #
    init_container_for_gpu_users = {} # FIXME: not a complete example
    spawner.init_containers.insert(0, init_container_for_gpu_users)

# Don't forget to ask KubeSpawner to make use of this custom hook
c.KubeSpawner.pre_spawn_hook = custom_pre_spawn_hook
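
A hook like this can be dry-run outside the hub with stand-in objects. The sketch below is an assumption-heavy illustration: FakeUser and the SimpleNamespace spawner are hypothetical stand-ins that mimic only the attributes the hook touches, and the label key example.org/gpu-access is made up. It shows that .update keeps labels that were already set.

```python
import asyncio
from types import SimpleNamespace


class FakeUser:
    """Hypothetical stand-in for a JupyterHub User; only what the hook needs."""

    name = "alice"

    async def get_auth_state(self):
        # Pretend the identity provider returned a gpu_access claim
        return {"oauth_user": {"gpu_access": True}}


async def label_hook(spawner):
    # Same pattern as custom_pre_spawn_hook above, reduced to the label part
    auth_state = await spawner.user.get_auth_state()
    gpu_access = auth_state["oauth_user"].get("gpu_access", False)
    # .update adds our label and leaves existing labels untouched
    spawner.extra_labels.update(
        {"example.org/gpu-access": str(gpu_access).lower()}
    )


# A stand-in spawner that already has one label configured
spawner = SimpleNamespace(user=FakeUser(), extra_labels={"team": "research"})
asyncio.run(label_hook(spawner))
print(spawner.extra_labels)
```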

Great post, thanks a lot!

Any reason why you didn’t use the custom pre-spawn hook for both solutions? I can’t see why you wouldn’t use it for problem 1.

It just makes more sense to me to use the pre-spawn hook since options_form is meant to return HTML.

Problem 1 is about customizing the options presented to a user, and you can’t do that in the pre_spawn_hook. The pre_spawn_hook is invoked after the user has been presented with the server options and has made a choice among them.

The relevant order of events is:

  1. A user is presented with server options

    Providing a custom options_form function makes these options render based on custom logic that can be user specific. It would be too late to use a pre_spawn_hook in this case, because it is triggered after the user has been presented with the options_form and has provided input via it.

  2. A user has opted to start a server with some chosen entry from the options_form.

    Providing a pre_spawn_hook function enables you to take further actions based on the choice(s) made in the options_form.

To conclude:

  • options_form is suited to customize what options to present based on the user
  • pre_spawn_hook is suited to customize what final settings to configure based on the choices the user made in the presented options_form.
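
The ordering described above can be illustrated with a toy simulation. This is not JupyterHub's actual code path; simulate_spawn and the SimpleNamespace spawner are hypothetical stand-ins that only mimic the order in which the two callbacks run.

```python
import asyncio
from types import SimpleNamespace

events = []


async def options_form(spawner):
    # Step 1: runs before the user has chosen anything
    events.append("options_form")
    return "<form>...</form>"


async def pre_spawn_hook(spawner):
    # Step 2: runs after the choice is stored in user_options
    events.append(f"pre_spawn_hook saw profile={spawner.user_options['profile']}")


async def simulate_spawn(spawner, chosen):
    """Hypothetical driver that mimics the order of events only."""
    await options_form(spawner)
    spawner.user_options = {"profile": chosen}  # the user submits the form
    await pre_spawn_hook(spawner)


asyncio.run(simulate_spawn(SimpleNamespace(user_options={}), "gpu"))
print(events)
```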

Hi @consideRatio,
Thanks for this great snippet!

Had 2 questions:

  1. What are the benefits here of using “options_form” instead of the “auth_state_hook” (shown below)?
        # auth_state may be an input to options form,
        # so resolve the auth state hook here
        auth_state = await user.get_auth_state()
        await spawner.run_auth_state_hook(auth_state)
  2. I’m using LDAP authentication and getting None in auth_state,
    any tips? :slight_smile:

Edit:
Regarding #2
I read the source code, looks like I was missing the “auth.state.enabled” config which causes auth_state to be returned as None if not configured :slight_smile:
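
Since auth_state can come back as None, the hooks in this thread would fail on auth_state["oauth_user"] in that case. A minimal defensive sketch, assuming the claims live under the "oauth_user" key as in the examples above (other authenticators may use a different key); get_claim is a hypothetical helper name:

```python
def get_claim(auth_state, claim, default=None, user_key="oauth_user"):
    """Read a claim from auth_state, tolerating a missing auth_state.

    Hypothetical helper. user_key="oauth_user" matches the examples in this
    thread; other authenticators may store user details under another key.
    """
    if not auth_state:
        # auth_state is None when auth state has not been enabled
        return default
    # The user details entry may itself be missing or None
    return (auth_state.get(user_key) or {}).get(claim, default)


print(get_claim(None, "gpu_access", False))
print(get_claim({"oauth_user": {"gpu_access": True}}, "gpu_access", False))
```

In a hook this would replace the direct indexing, for example gpu_access = get_claim(auth_state, "gpu_access", False).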

The key difference is that auth_state_hook doesn’t have anything to do with manual user input, but options_form does.

The spawner’s options_form renders a HTML input form for the user, and that means the user is able to provide input.

The Spawner’s auth_state_hook lets you run custom logic whenever the auth state is about to be used, without any user input.

As already explained on GitHub, I’m trying to map volumes based on the profile selection.
Therefore I created the following snippet. Sadly, it does not work: KubeSpawner does not create the required PVC.

  extraConfig:
    volume.py: |    
      def profile_pvc(spawner):          
          chosen_profile = spawner.user_options.get("profile", "")
          pvc_name_template = "claim-{username}{servername}-"+chosen_profile
          spawner.pvc_name_template = pvc_name_template
          volume_name_template = "volume-{username}{servername}-"+chosen_profile
          spawner.storage_pvc_ensure = True
          spawner.storage_access_modes = ["ReadWriteOnce"]
          spawner.storage_capacity = "4G"
          # Add volumes to singleuser pods
          spawner.volumes = [
              {
                  "name": volume_name_template,
                  "persistentVolumeClaim": {"claimName": pvc_name_template},
              }
          ]
          spawner.volume_mounts = [
              {
                  "mountPath": get_config('singleuser.storage.homeMountPath'),
                  "name": volume_name_template,
              }
          ]
      c.KubeSpawner.pre_spawn_hook = profile_pvc

Jhub Log: (I would expect that the spawner tries to create a pvc for claim-ritterho-40hm-2eedu-dev)

[I 2021-12-18 19:19:44.620 JupyterHub proxy:347] Checking routes
[I 2021-12-18 19:19:44.621 JupyterHub app:2869] JupyterHub is now running at http://:8000
[I 2021-12-18 19:19:44.793 JupyterHub log:189] 200 GET /hub/api/users (cull-idle@::1) 44.00ms
[I 2021-12-18 19:19:49.989 JupyterHub log:189] 302 GET / -> /hub/ (@-redacted-) 1.45ms
[I 2021-12-18 19:19:50.043 JupyterHub log:189] 302 GET /hub/ -> /hub/spawn (ritterho@hm.edu@-redacted-) 24.22ms
[I 2021-12-18 19:19:50.133 JupyterHub log:189] 200 GET /hub/spawn (ritterho@hm.edu@-redacted-) 25.70ms
[I 2021-12-18 19:19:52.061 JupyterHub provider:574] Creating oauth client jupyterhub-user-ritterho%40hm.edu
[I 2021-12-18 19:19:52.085 JupyterHub spawner:2344] Attempting to create pvc claim-ritterho-40hm-2eedu, with timeout 3
[I 2021-12-18 19:19:52.130 JupyterHub spawner:2361] PVC claim-ritterho-40hm-2eedu already exists, so did not create new pvc.
[I 2021-12-18 19:19:52.134 JupyterHub spawner:2302] Attempting to create pod jupyter-ritterho-40hm-2eedu, with timeout 3
[I 2021-12-18 19:19:53.033 JupyterHub log:189] 302 POST /hub/spawn -> /hub/spawn-pending/ritterho@hm.edu (ritterho@hm.edu@-redacted-) 1003.82ms
[I 2021-12-18 19:19:53.071 JupyterHub pages:402] ritterho@hm.edu is pending spawn
[I 2021-12-18 19:19:53.074 JupyterHub log:189] 200 GET /hub/spawn-pending/ritterho@hm.edu (ritterho@hm.edu@-redacted-) 6.48ms
[W 2021-12-18 19:20:02.033 JupyterHub base:1008] User ritterho@hm.edu is slow to start (timeout=10)

Ok. I’ve figured out that the template cannot be overwritten by the pre_spawn_hook, as it is only used while initializing the spawner.

This is probably incorrect, I’m not sure what you mean. If you configure c.KubeSpawner.volumes, it means you provide a default so that when a KubeSpawner object is initialized for a specific server to be spawned, it will use those defaults at that point in time. If you set some configuration on a spawner instance, you will adjust that spawner, and as long as that is done “pre spawn” that will influence how the server is started - the k8s pod hasn’t been created yet, so you can tweak how it will be created!

Watch out, you are overriding volumes here. Instead, extend the volumes that have been initialized. The whole point of using a pre_spawn_hook was that kubespawner_override would override this list rather than extend it, so it is crucial not to override it here. This was discussed in Binding other storage to profile · Issue #2520 · jupyterhub/zero-to-jupyterhub-k8s · GitHub.

          spawner.volumes.extend([
              {
                  "name": volume_name_template,
                  "persistentVolumeClaim": {"claimName": pvc_name_template},
              }
          ])
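
If there is any chance the hook runs more than once against the same list (an assumption worth checking for your setup), a plain extend would add the same volume twice. A minimal sketch of an idempotent variant; add_volume_once is a hypothetical helper, and the volume names are made up for illustration:

```python
def add_volume_once(volumes, volume):
    """Append volume unless one with the same name is already in the list.

    Hypothetical helper; returns True if the volume was added.
    """
    if any(existing.get("name") == volume["name"] for existing in volumes):
        return False
    volumes.append(volume)
    return True


# Pretend this is the list already initialized on the spawner
volumes = [{"name": "home", "persistentVolumeClaim": {"claimName": "claim-alice"}}]
extra = {"name": "volume-alice-gpu", "persistentVolumeClaim": {"claimName": "claim-alice-gpu"}}

first = add_volume_once(volumes, extra)
second = add_volume_once(volumes, extra)  # no duplicate is added
print(first, second, [v["name"] for v in volumes])
```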

Hi @consideRatio,

To clarify: the problem I stumbled over was that the pvc_name_template is only used here (kubespawner/spawner.py at d913841afb59f1b9a4e1f8c69ada2189ee34729f · jupyterhub/kubespawner · GitHub), and thus while initializing the spawner and before the pre_spawn_hook. Even though I change it before spawning, it has no effect because it is not used inside the spawn method :wink:

Hello, @consideRatio

I am not sure I got it! I’m facing an issue very close to this one. I use LDAP authentication and I want to create another profileList entry with a custom Spark configuration that differs from the default.

Which pieces should I create to change the parameters inside spark-defaults.conf? I want to provide a profile with pre-defined executor and driver settings, and it is not clear to me which way I should go. Could you give me a hint?

In my case I have configured something similar to what @flori-uni did. Some PVs exist on my cluster, and those are mounted to a user’s notebook when this user is a member of a group.

This works fine, but when a pod stops due to culling and the user spawns it anew, those volumes are not mounted.
When the pod is manually evicted and spawns again the volumes are mounted.

So I’m guessing that cull overrides KubeSpawner configuration in my case at least.

async def custom_options_form(spawner):
    # Retrieve the user's authentication state and details
    auth_state = await spawner.user.get_auth_state()
    user_details = auth_state["auth0_user"]

    if user_details["app"] is not None and "groups" in user_details["app"]:
        # Retrieve the "app" scope and groups from the user's details
        groups = user_details["app"]["groups"]

        # Initialize a list to store the lifecycle hook commands
        lifecycle_hook_command = ""
        lifecycle_hook_commands_list = []

        # Define the base commands for chown and chmod for the lifecycle_hook_commands_list
        chown = "sudo chown -R jovyan:users /home/jovyan/"
        chmod_readWrite = "sudo chmod -R 777 /home/jovyan/"
        chmod_readOnly = "sudo chmod -R 555 /home/jovyan/"

        for group in groups:
            # Retrieve the group name and initialize a list to store subdirectories (if any)
            group_name = group["name"]
            group_subdirectories = []

            # Extend the spawner's volumes and volume_mounts with the group-specific details
            spawner.volumes.extend([
                {
                    "name": "pv-group-jupyterhub-" + group_name,
                    "persistentVolumeClaim": {"claimName": "pvc-group-jupyterhub-" + group_name}
                }
            ])

            spawner.volume_mounts.extend([
                {
                    "mountPath": "/home/jovyan/" + group_name,
                    "name": "pv-group-jupyterhub-" + group_name
                }
            ])

            lifecycle_hook_commands_list.append(chown + group_name + " && " + chmod_readWrite + group_name)

            if "subdirectories" in group:
                # If subdirectories exist, add them to the group_subdirectories list
                group_subdirectories.extend(group["subdirectories"])

                # Add chown and chmod commands for the group's base directory
                lifecycle_hook_commands_list.append(chmod_readWrite + group_name + '/')
                lifecycle_hook_commands_list.append(chown + group_name + '/')

                for i in range(0, len(group_subdirectories)):
                    # Iterate over the subdirectories and add chown and chmod commands for each
                    subdirectory_name = group_subdirectories[i]["name"]
                    subdirectory_permission = group_subdirectories[i]["permission"]

                    lifecycle_hook_commands_list.append(chown + group_name + '/' + subdirectory_name)

                    if subdirectory_permission == "readOnly":
                        lifecycle_hook_commands_list.append(chmod_readOnly + group_name + '/' + subdirectory_name)
                    elif subdirectory_permission == "readWrite":
                        lifecycle_hook_commands_list.append(chmod_readWrite + group_name + '/' + subdirectory_name)

        # Form the bash command that will be executed on the lifecycle hook 
        lifecycle_hook_command = " && ".join(lifecycle_hook_commands_list)

        if lifecycle_hook_command:
            # Update the spawner's lifecycle_hooks with the postStart hook and the generated lifecycle_hook_command. This will be executed after user's pod spawns.
            print(lifecycle_hook_command)
            spawner.lifecycle_hooks.update({
                "postStart": {
                    "exec": {
                        "command": ["/bin/sh", "-c", lifecycle_hook_command]
                    }
                }
            })        
    else:
        # If no groups are found for the user, print a message and let the spawn proceed
        print("No groups found")
        
# Set c.KubeSpawner.pre_spawn_hook to the custom_options_form function
c.KubeSpawner.pre_spawn_hook = custom_options_form

Hi Erik Sundell,
We have just started working with JupyterHub, and a question came up. We want to restrict JupyterHub kernels for some users, the same way as the servers. If there are any steps for this, please let me know.
Thanks

Regards
Saran

This is a great post, and thanks! I have a couple of silly questions about problem #1, and I’m sorry for the naive questions:

  1. Where is the custom_pre_spawn_hook defined? Is it in the config.yaml file? If so, what’s the YAML key?
  2. What does profileList get set to?

I am using LDAPAuthenticator. How do I restrict a specific profile to a specific LDAP group? Is there an example or guide?
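
One way to approach the group question with the options_form technique from problem 1 is to filter the profiles by group membership. The sketch below is only an illustration under stated assumptions: how the user's groups are obtained depends on the authenticator and its configuration and is not shown here, so they are passed in as a plain list of strings. profiles_for_groups, the slug-to-groups mapping, and the group name "gpu-users" are all hypothetical.

```python
def profiles_for_groups(profiles, allowed_groups_by_slug, user_groups):
    """Keep only the profiles the user's groups give access to.

    Hypothetical helper. Profiles whose slug has no entry in
    allowed_groups_by_slug are offered to everyone.
    """
    user_groups = set(user_groups)
    visible = []
    for profile in profiles:
        required = allowed_groups_by_slug.get(profile["slug"])
        if required is None or user_groups & set(required):
            visible.append(profile)
    return visible


all_profiles = [
    {"display_name": "CPU server", "slug": "cpu", "default": True},
    {"display_name": "GPU server", "slug": "gpu"},
]
# Hypothetical mapping: only members of "gpu-users" may see the gpu profile
restrictions = {"gpu": ["gpu-users"]}

for_student = profiles_for_groups(all_profiles, restrictions, ["students"])
for_gpu_user = profiles_for_groups(all_profiles, restrictions, ["students", "gpu-users"])
print([p["slug"] for p in for_student])
print([p["slug"] for p in for_gpu_user])
```

The result would be assigned to spawner.profile_list inside a custom options_form function, as in the solution to problem 1.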