Arguments passed to spawner do not seem to propagate to user pods

Summary of our components:

Environment: Kubernetes, five nodes, pods obey affinity and schedule on desired nodes.

Spawner: Kubespawner

Chart version: standard jupyterhub/jupyterhub chart version 0.8.2

Hub image: standard, unspecified in values.

Singleuser image: custom, built from jupyter/datascience-notebook. Nothing terribly interesting happens in my custom image; I just added a Python 2.7 environment, cloned a few private repositories, and did a bit of housekeeping.

Authenticator: oauthenticator.generic.GenericOAuthenticator (from the oauthenticator package) https://github.com/jupyterhub/oauthenticator/blob/5d8c7b810bfc977164f4cb94b8ec744f5d532a52/oauthenticator/generic.py

Database: Postgres, with persisted auth state.

Goal:

I would like to pass an auth token from a hub pod into a user pod that just authenticated. There are several loosely defined examples of this floating around GitHub and the official docs, but none of these implementations have worked for me.

I need to add that I do not want to fork this project or have to maintain a codebase locally for chart deployment; I'd like to stay within the scope of what we can do with chart config values and what we can pass in via extraConfig. The docs seem to suggest that token injection into a user pod should be possible from within the extraConfig block of chart values.

Current Approach:

The most promising solution seems to be to override the pre_spawn_start method of whichever authenticator class is in use. According to the Authenticators section of the JupyterHub documentation, all authenticators call this method upon completion.

The documented example also overrides authenticate(), which we do not need, as GenericOAuthenticator implements it for us and natively returns an auth_state dictionary of the form:

    return {
        'name': resp_json.get(self.username_key),
        'auth_state': {
            'access_token': access_token,
            'refresh_token': refresh_token,
            'oauth_user': resp_json,
            'scope': scope,
        }
    }

I do have auth working correctly and user pods spawning with names corresponding to their user data returned from our endpoint. Because of this, I know, or at least I hope, that this dictionary still lives somewhere in scope at the time that our spawner is invoked.

My extraConfig (in my chart values) looks something like this:

extraConfig:
  00-pass_auth_token_to_user_pod.py: |
    from oauthenticator.generic import GenericOAuthenticator
    from tornado import gen

    class StatefulAuthenticator(GenericOAuthenticator):
      @gen.coroutine
      def pre_spawn_start(self, user, spawner):
        auth_state = yield user.get_auth_state()
        if not auth_state:
          print("Auth state disabled.")
          return
        spawner.environment['UPSTREAM_TOKEN'] = auth_state['access_token']

    c.JupyterHub.authenticator_class = StatefulAuthenticator
    c.Authenticator.enable_auth_state = True
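
As an aside, on JupyterHub 0.9 and later the same hook can be written as a native coroutine instead of using the tornado decorator; a minimal sketch of the same idea:

    from oauthenticator.generic import GenericOAuthenticator

    class StatefulAuthenticator(GenericOAuthenticator):
        async def pre_spawn_start(self, user, spawner):
            # get_auth_state() resolves to None when auth state is disabled
            auth_state = await user.get_auth_state()
            if not auth_state:
                return
            spawner.environment['UPSTREAM_TOKEN'] = auth_state['access_token']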

And my auth config (also in chart values) looks like this:

auth:
  state:
    cryptoKey: 'state_key12345'
    enabled: true
  type: custom
  custom:
    className: oauthenticator.generic.GenericOAuthenticator
    config:
      login_service: 'MyOauth'
      extra_params:
        client_id: 'id12345'
        client_secret: 'secret12345'
      client_id: 'id12345'
      client_secret: 'secret12345'

The Problem:

In this current configuration

className: oauthenticator.generic.GenericOAuthenticator

from the auth block overrides:

c.JupyterHub.authenticator_class = StatefulAuthenticator

from the extraConfig block, and the behavior is identical to simply omitting the portion of extraConfig where the new class is defined.

I can confirm that extraConfig is executing the Python code it contains, so at the very least, I know this config block is not being skipped.

If I were to change

className: oauthenticator.generic.GenericOAuthenticator

to something like:

className: StatefulAuthenticator

the chart will deploy successfully, but the hub pod will fail to initialize, because this class name cannot be resolved. From that I am guessing that the auth config block is evaluated before the extraConfig one. Either that, or I simply don't have a way of bringing a class definition from extraConfig into scope in a way that would allow me to use it as auth.custom.className.

I have also tried simply extending the Authenticator class, in case there is some kind of magic at work. That setup would change my extraConfig to look something like this:

extraConfig:
  00-pass_auth_token_to_client_pod.py: |
    from jupyterhub.auth import Authenticator
    from tornado import gen

    class StatefulAuthenticator(Authenticator):
      @gen.coroutine
      def pre_spawn_start(self, user, spawner):
        auth_state = yield user.get_auth_state()
        if not auth_state:
          print("Auth state disabled.")
          return
        spawner.environment['UPSTREAM_TOKEN'] = 'TEST12345'

    c.JupyterHub.authenticator_class = StatefulAuthenticator
    c.Authenticator.enable_auth_state = True

This also doesn’t work, because we can’t actually use this class here either.

Having been unsuccessful so far, I have attempted a few purely diagnostic approaches to see if I can get SOMETHING of my choosing into a newly spawned user pod.

singleuser:
  environment: {'HARDCODED_VALUE': 'ABCD12345'}

After doing this, I see nothing when checking the environment variables in the user pod.

c.Spawner.environment.update({'TEST': 'VALUE'})

c.Spawner.environment = {'TEST': 'VALUE'}

Still nothing in user pod environments when attempting to directly alter the spawner’s environment parameters.
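
For what it's worth, Spawner.environment values can also be callables, which JupyterHub invokes with the spawner instance at spawn time; a minimal diagnostic sketch (the variable names here are just illustrations):

    c.Spawner.environment = {
        'HARDCODED_VALUE': 'ABCD12345',
        # callables are called with the spawner instance when the pod is built
        'SPAWNED_FOR': lambda spawner: spawner.user.name,
    }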

Another approach I’ve seen is to override the spawner’s user_env method:

def user_env(self):
    env = super().user_env()
    env['access_token'] = access_token_for_user(self.user.name)
    return env

Or implemented in hub.extraConfig:

    class SpawnerWithEnv(Spawner):
        def user_env(self):
            env = super().user_env()
            env['access_token'] = access_token_for_user(self.user.name)
            return env

Once again, all I have is my hub.extraConfig, so this suffers from the same issue as the override for pre_spawn_start(): the class can't be resolved, so we effectively can't use it.
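
A side note on user_env(): that hook belongs to LocalProcessSpawner, while KubeSpawner assembles the pod environment from the base class's get_env(). If the class-resolution problem were solved, a KubeSpawner subclass would override get_env() instead; a minimal sketch, with access_token_for_user standing in for whatever lookup is actually available:

    from kubespawner import KubeSpawner

    class SpawnerWithEnv(KubeSpawner):
        def get_env(self):
            env = super().get_env()
            # access_token_for_user is a hypothetical helper, not a real API
            env['access_token'] = access_token_for_user(self.user.name)
            return env

    # assumes the chart does not set spawner_class again afterwards
    c.JupyterHub.spawner_class = SpawnerWithEnv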

I suspect my use case is fairly common, as the preservation of an auth token would allow notebooks to invoke REST/RPC methods exposed by the issuer of said token.

If I have no choice other than to fork/branch the repo and maintain my own JupyterHub codebase, I suppose I will. But having this feature while staying in line with JupyterHub master would be greatly preferable.

If anyone has worked through this problem before, or has any insight, I would greatly appreciate some second hand wisdom.

Summary:

I would like to pass an auth token, generated by the oauthenticator package's GenericOAuthenticator class, to the user pod created by the spawner following a successful auth.

I would like to do this purely by using the tools provided by the chart's values.yaml, and in particular, the evaluated Python code in hub.extraConfig.

Is this possible?

If so, how can one use an authenticator class defined in hub.extraConfig at the time the hub is initialized?


I tried with the latest z2jh Helm chart, version 0.9.0-beta.2, and was able to get it working.

  auth:
    type: custom
    custom:
      #className: "oauthenticator.azuread.AzureAdOAuthenticator"
      className: "CustomAzureAdOAuthenticator"

And extraConfig:

   extraConfig:
     authpasstoken: |
       from oauthenticator.azuread import AzureAdOAuthenticator
       from tornado import gen

       class CustomAzureAdOAuthenticator(AzureAdOAuthenticator):
         @gen.coroutine
         def pre_spawn_start(self, user, spawner):
           auth_state = yield user.get_auth_state()
           if not auth_state:
             # user has no auth state
             return
           # define some environment variables from auth_state
           print(auth_state)
           spawner.environment['AD_TOKEN'] = auth_state['access_token']

       c.JupyterHub.authenticator_class = CustomAzureAdOAuthenticator
       # Need to persist auth state in the database.
       c.Authenticator.enable_auth_state = True
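
Once a pod spawns with this in place, the token should be visible from inside the notebook, for example:

    import os

    # None here means the variable never made it into the pod
    token = os.environ.get('AD_TOKEN')
    print(token is not None)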

I am not sure why, but you also need to enable auth-state persistence to make this work, which means that if you want to send the token to KubeSpawner as an environment variable, you have to persist the token in the database.

   auth:
     state:
        enabled: true
        cryptoKey: "xxxxxxxxxx"

The cryptoKey is generated with the command:

openssl rand -hex 32
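
If openssl is not at hand, the same 32 random bytes can be generated from Python's standard library:

    import secrets

    # 64 hex characters == 32 random bytes, same shape as `openssl rand -hex 32`
    print(secrets.token_hex(32))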

Two quick lines to warmly thank you for having posted this solution.
I've tried it on our k8s-based JupyterHub installation and it works great!
-marco