How to extract Azure groups_id after login with AzureAdOAuthenticator?

Good morning, I am working on a project with JupyterHub on Kubernetes.
I have already configured an application in Microsoft Entra to handle authentication via Azure AD in the hub. In the Azure configuration I added a custom claim that exposes the user's group_id. I need to read that information after login so that I can customize the user experience depending on the Azure group membership of the logged-in user.
This is the configuration I currently use:

hub:    
  extraConfig:
    auth-config: |
      from oauthenticator.azuread import AzureAdOAuthenticator
      import logging
      ## Timeout (in seconds) before giving up on a spawned HTTP server
      #
      #  Once a server has successfully been spawned, this is the amount of time we
      #  wait before assuming that the server is unable to accept connections.
      c.Spawner.http_timeout = 600

      
      #------------------------------------------------------------------------------
      # Application configuration
      #------------------------------------------------------------------------------

      # This is an application.

      # The date format used by logging formatters for %(asctime)s
      c.Application.log_datefmt = '%Y-%m-%d %H:%M:%S'

      # The Logging format template
      c.Application.log_format = ' [%(name)s] %(message)s'

      # Set the log level by value or name.
      c.Application.log_level = 'TRACE'

      
      #------------------------------------------------------------------------------
      # Auth configuration
      #------------------------------------------------------------------------------

      c.JupyterHub.log_level = 'TRACE'
      c.LocalProcessSpawner.debug= True
      c.OAuthenticator.allow_all = True
      c.JupyterHub.shutdown_on_logout = True
      c.JupyterHub.authenticator_class = "azuread"
      c.AzureAdOAuthenticator.scope = "api://x/extract_groupsid"
      c.AzureAdOAuthenticator.enable_auth_state = True
      c.AzureAdOAuthenticator.manage_groups = True

      #------------------------------------------------------------------------------
      # Print auth
      #------------------------------------------------------------------------------

      async def auth_state_hook(spawner, auth_state):
          spawner.log.info("Auth state received: %s", auth_state)
      c.KubeSpawner.auth_state_hook = auth_state_hook


  config:
    ################## Auth for AAD ##############
    JupyterHub:
      authenticator_class: azuread
    Authenticator:
      enable_auth_state: true
    AzureAdOAuthenticator:
      client_id: "x"
      client_secret: "y"
      oauth_callback_url: "http://localhost/hub/oauth_callback"
      tenant_id: "z"
      username_claim: unique_name
      allow_all: true
      scope: "api://x/extract_groupsid"
    ################## Auth for AAD ##############
debug:
  enabled: true    

singleuser:
  image:
    name: jupyter/all-spark-notebook
    tag: latest
  storage:
    type: none
  cpu:
    limit: 2
    guarantee: 0.05
  memory:
    limit: 2G
    guarantee: 512M
  networkPolicy:
    egressAllowRules:
      privateIPs: true

scheduling:
  userScheduler:
    enabled: true
  podPriority:
    enabled: true
  userPlaceholder:
    enabled: true
    replicas: 4
  userPods:
    nodeAffinity:
      matchNodePurpose: require

cull:
  enabled: true
  timeout: 3600
  every: 300

Below are the hub logs. I noticed that the initial GET uses my custom scope (for the group_id claim), but when I then print the auth_state, the returned tokens list only the other scopes and exclude the custom one I configured.


[D 2025-01-16 10:33:35.463 JupyterHub log:192] 200 GET /hub/health (@192.168.58.3) 1.42ms

[I 2025-01-16 10:33:35.574 JupyterHub log:192] 302 GET / -> /hub/ (@::ffff:10.244.171.192) 0.97ms

[I 2025-01-16 10:33:35.587 JupyterHub log:192] 302 GET /hub/ -> /hub/login?next=%2Fhub%2F (@::ffff:10.244.171.192) 1.04ms

[I 2025-01-16 10:33:35.649 JupyterHub log:192] 200 GET /hub/login?next=%2Fhub%2F (@::ffff:10.244.171.192) 50.41ms

[D 2025-01-16 10:33:37.463 JupyterHub log:192] 200 GET /hub/health (@192.168.58.3) 0.79ms

[I 2025-01-16 10:33:38.930 JupyterHub oauth2:99] OAuth redirect: http://localhost/hub/oauth_callback

[D 2025-01-16 10:33:38.931 JupyterHub base:668] Setting cookie oauthenticator-state: {'httponly': True, 'expires_days': 1}

[I 2025-01-16 10:33:38.932 JupyterHub log:192] 302 GET /hub/oauth_login?next=%2Fhub%2F -> https://login.microsoftonline.com/TENANT_ID/oauth2/authorize?response_type=code&redirect_uri=http%3A%2F%2Flocalhost%2Fhub%2Foauth_callback&client_id=X&state=[secret]&scope=api%3A%2F%2FTENANT_ID%2Fextract_groupsid (@::ffff:10.244.171.192) 2.33ms

[D 2025-01-16 10:33:39.461 JupyterHub log:192] 200 GET /hub/health (@192.168.58.3) 0.99ms

[D 2025-01-16 10:33:41.461 JupyterHub log:192] 200 GET /hub/health (@192.168.58.3) 0.81ms

[D 2025-01-16 10:33:41.824 JupyterHub roles:282] Assigning default role to User USER

[D 2025-01-16 10:33:41.849 JupyterHub base:668] Setting cookie jupyterhub-session-id: {'httponly': True, 'path': '/'}

[D 2025-01-16 10:33:41.850 JupyterHub base:672] Setting cookie for USER: jupyterhub-hub-login

[D 2025-01-16 10:33:41.850 JupyterHub base:668] Setting cookie jupyterhub-hub-login: {'httponly': True, 'path': '/hub/'}

[I 2025-01-16 10:33:41.851 JupyterHub base:937] User logged in: USER

[I 2025-01-16 10:33:41.851 JupyterHub log:192] 302 GET /hub/oauth_callback?code=[secret]&state=[secret]&session_state=[secret] -> /hub/ (USER@::ffff:10.244.171.192) 358.96ms

[D 2025-01-16 10:33:41.896 JupyterHub user:431] Creating <class 'kubespawner.spawner.KubeSpawner'> for USER:

[I 2025-01-16 10:33:41.901 JupyterHub log:192] 302 GET /hub/ -> /hub/spawn (USER@::ffff:10.244.171.192) 41.80ms

[D 2025-01-16 10:33:41.915 JupyterHub scopes:884] Checking access to /hub/spawn via scope servers

[D 2025-01-16 10:33:41.915 JupyterHub scopes:697] Argument-based access to /hub/spawn via servers

[I 2025-01-16 10:33:41.917 JupyterHub <string>:50] Auth state received: {'access_token': 'Z', 'refresh_token': 'Y', 'id_token': 'X', 'scope': ['Directory.Read.All', 'Group.Read.All', 'GroupMember.Read.All', 'openid', 'profile', 'User.Read', 'User.Read.All', 'User.ReadBasic.All'], 'token_response': {'token_type': 'Bearer', 'scope': 'Directory.Read.All Group.Read.All GroupMember.Read.All openid profile User.Read User.Read.All User.ReadBasic.All', 'expires_in': '4487', 'ext_expires_in': '4487', 'expires_on': '1737028109', 'not_before': '1737023321', 'resource': '00000002-0000-0000-c000-000000000000', 'access_token': 'Z', 'refresh_token': 'Y', 'id_token': 'X'}, 'user': {'aud': 'FERE', 'iss': 'https://sts.windows.net/CC', 'iat': 1, 'nbf': 2, 'exp': 1737027221, 'amr': ['pwd', 'mfa'], 'family_name': 'PLUTO', 'given_name': 'PIPPO', 'ipaddr': '213.174.178.226', 'name': 'XY', 'oid': 'R', 'rh': 'V', 'sub': 'AQ7esI5E6DnjmGOFADl73LySnAdQRkN3Tetz6gb87hM', 'tid': 'P', 'unique_name': 'USER', 'upn': 'USER', 'ver': '1.0', 'wids': ['X']}}

[I 2025-01-16 10:33:41.917 JupyterHub <string>:52] User configuration: <User(USER 0/1 running)>
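
The mismatch described above (custom scope in the authorize redirect, but not in the scopes reported in auth_state) can also be checked programmatically. The following is only a minimal sketch: the helper names `requested_scopes` and `missing_scopes` are hypothetical, and it assumes auth_state carries a 'scope' list as in the log line above.

```python
from urllib.parse import parse_qs, urlsplit


def requested_scopes(authorize_url):
    """Return the set of scopes found in the query string of an authorize URL."""
    query = parse_qs(urlsplit(authorize_url).query)
    scopes = set()
    for value in query.get("scope", []):
        # Several scopes are sent as one space-separated value.
        scopes.update(value.split())
    return scopes


def missing_scopes(authorize_url, auth_state):
    """Return scopes that were requested but are not listed in auth_state['scope']."""
    granted = set(auth_state.get("scope", []))
    return requested_scopes(authorize_url) - granted
```

Calling `missing_scopes` with the redirect URL from the log and the printed auth_state would return the custom scope, which matches what was observed by hand.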

In Azure, this is the configuration:


How can I solve this?

PS: If I open the URL of the GET made by JupyterHub (with the custom scope) in a browser, copy the resulting code, and then make a call to the Microsoft login endpoint ("Sign in to your account") passing the extracted code value and grant_type authorization_code, I do get the groups_id value back.

Does the Azure console give you access to any logs? For example, can you check whether Azure actually received and parsed the custom scope? If you deliberately add a non-existent scope like api://TENANT_ID/non_existent, does Azure give you an error, or does it ignore it?

Which response/token does Azure put the groups_id in? There are several in OAuth.
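
One way to answer this is to decode the payload of each token in auth_state and look for the claim. This is a rough sketch for inspection only: it does NOT verify signatures, the helper names are hypothetical, and the claim name (for example `groups`) is an assumption that depends on how the claim was configured in Azure.

```python
import base64
import json


def decode_jwt_claims(token):
    """Decode the payload of a JWT WITHOUT verifying its signature (debugging only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a three-part JWT")
    payload = parts[1]
    # base64url payloads usually come without padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if not isinstance(claims, dict):
        raise ValueError("JWT payload is not a JSON object")
    return claims


def find_claim(auth_state, claim):
    """Return {location: value} for every place in auth_state where the claim appears."""
    found = {}
    for key in ("id_token", "access_token"):
        token = auth_state.get(key)
        if not token:
            continue
        try:
            claims = decode_jwt_claims(token)
        except ValueError:
            # Opaque or malformed tokens are skipped.
            continue
        if claim in claims:
            found[key] = claims[claim]
    # The hub log also shows already-decoded claims under the 'user' key.
    user = auth_state.get("user") or {}
    if claim in user:
        found["user"] = user[claim]
    return found
```

Calling `find_claim(auth_state, "groups")` inside the auth_state_hook and logging the result would show whether the claim arrives in the ID token, the access token, or not at all.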

Thank you for your help. I found the solution: the openid scope was missing. After adding it, everything works!
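
For anyone landing here later, the fix could look roughly like the sketch below. It is not the exact configuration used in this thread: the helper names, the `groups` claim name, and the `USER_GROUP_IDS` environment variable are all hypothetical, and it assumes the group claim ends up under auth_state["user"].

```python
import asyncio


def with_openid(scopes):
    """Return the scopes as a list that starts with 'openid', without duplicates."""
    scopes = scopes.split() if isinstance(scopes, str) else list(scopes)
    result = ["openid"]
    for scope in scopes:
        if scope not in result:
            result.append(scope)
    return result


def group_ids_from_auth_state(auth_state, claim="groups"):
    """Read group IDs from auth_state['user']; the claim name is an assumption."""
    user = (auth_state or {}).get("user") or {}
    value = user.get(claim, [])
    return [value] if isinstance(value, str) else list(value)


async def auth_state_hook(spawner, auth_state):
    """Pass the group IDs to the user server through a (hypothetical) env variable."""
    groups = group_ids_from_auth_state(auth_state)
    spawner.environment["USER_GROUP_IDS"] = ",".join(groups)


# In hub.extraConfig one would then set something like:
# c.AzureAdOAuthenticator.scope = with_openid(["api://x/extract_groupsid"])
# c.KubeSpawner.auth_state_hook = auth_state_hook

if __name__ == "__main__":
    class FakeSpawner:
        """Stand-in for a real spawner, only for trying the hook locally."""
        def __init__(self):
            self.environment = {}

    spawner = FakeSpawner()
    asyncio.run(auth_state_hook(spawner, {"user": {"groups": ["id-1", "id-2"]}}))
    print(spawner.environment)
```

The group IDs could just as well drive other spawner settings (image, resources, profile list); the environment variable is only the simplest thing to demonstrate.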