Hi,
I’ve been setting up a JupyterHub instance as a POC for my company. So far things are looking really amazing; however, I’m struggling a bit with mapping group membership to resource allocation. The core of my challenge is maintaining the group membership information. At present, I’ve figured out how to get the full list of groups from Microsoft and then update the profiles and mounts based on group membership; however, JupyterHub forgets the groups after some time, making it all for naught. Can anybody provide any insight into how to prevent the group membership information from being forgotten?
Details:
I happen to be a member of several hundred groups, so Microsoft will not return the full list of groups to which I belong in the response token. One workaround is for my IT department to define a list of groups in the SSO configuration so that the list of groups returned to JupyterHub is the intersection of the pre-configured list and the groups the user is a member of. This is painful, as we have high turnover in IT and it takes days of re-explaining what they need to do every time. The other option suggested by Microsoft is that, after I get the access token, I make a second call to the Graph API to get the full list of groups. I’ve figured out a recipe that works for this second path, and with that information I can update the profiles and mounts.
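Stripped of the JupyterHub specifics, the second path boils down to following `@odata.nextLink` until Graph stops returning one. Here is a minimal sketch of just that paging loop. The `collect_group_names` helper, the injected `fetch_page` callable and the fake pages are names I made up for illustration so the loop can be exercised without a real token; in the hub, `fetch_page` would be whatever HTTP call returns the decoded JSON for a URL.

```python
from typing import Callable, Optional

GRAPH_MEMBER_OF_URL = "https://graph.microsoft.com/v1.0/me/memberOf?$select=displayName"


def collect_group_names(
    fetch_page: Callable[[str], dict],
    first_url: str = GRAPH_MEMBER_OF_URL,
) -> list[str]:
    """Collect displayName values from every page of a memberOf response.

    fetch_page is a hypothetical callable: it takes a URL and returns the
    decoded JSON body as a dict.
    """
    names: list[str] = []
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url)
        # Skip entries that come back without a display name.
        names.extend(
            entry["displayName"]
            for entry in page.get("value", [])
            if entry.get("displayName")
        )
        # Graph includes '@odata.nextLink' only when more pages remain.
        url = page.get("@odata.nextLink")
    return sorted(names)


# Offline stand-in for the Graph API: two pages linked by '@odata.nextLink'.
FAKE_PAGES = {
    GRAPH_MEMBER_OF_URL: {
        "value": [{"displayName": "TEAM-B"}, {"displayName": "TEAM-A"}],
        "@odata.nextLink": "https://graph.microsoft.com/v1.0/next-page",
    },
    "https://graph.microsoft.com/v1.0/next-page": {
        "value": [{"displayName": "TEAM-C"}],
    },
}

all_groups = collect_group_names(FAKE_PAGES.__getitem__)
```

Keeping the HTTP call behind a parameter also makes it easy to swap in an asynchronous client later without touching the paging logic.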
values.yaml (fragment)
...
hub:
  config:
    JupyterHub:
      authenticator_class: azuread
    AzureAdOAuthenticator:
      admin_groups:
        - TEAM-JUPYTERHUB-ADMIN
      allowed_groups: ORG-ALL-PEOPLE
      auth_state_groups_key: user.roles
      authorize_url: https://login.microsoftonline.com/${redacted}/oauth2/v2.0/authorize
      auto_login: yes
      client_id: ${redacted}
      client_secret: ${redacted}
      enable_auth_state: yes
      login_service: https://login.microsoftonline.com
      manage_groups: yes
      oauth_callback_url: https://${redacted}/hub/oauth_callback
      tenant_id: ${redacted}
      username_claim: samAccountname
      scope:
        - openid
        - email
        - profile
        - GroupMember.Read.All
        - Group.Read.All
        - User.Read
...
Note 1: I did need to work with my IT organization to enable the GroupMember.Read.All and Group.Read.All scopes and get access pre-approved by the admin, and then they had to do something on the back end to inject the samAccountName in the auth response. They didn’t share any details about what they did, so I’m unable to provide that information.
Note 2: The auth_state_groups_key is set to ‘user.roles’, which is a list of groups IT pre-configured for my app. I was originally using the Authenticator ‘post_auth_hook’ method and found that it ran too late to discover I was an admin. I figured it would not be too painful to have IT set up that one group as a specially blessed group that would always be present in the token.
jupyterhub_config.py (fragments)
...
import logging
...
from kubespawner import KubeSpawner
from kubernetes_asyncio.client import V1Pod
from oauthenticator.azuread import AzureAdOAuthenticator
from requests import get  # used below for the Graph API calls

logger: logging.Logger = logging.getLogger(__name__)
...
async def insert_microsoft_entra_id_groups(
        authenticator: AzureAdOAuthenticator,
        auth_state: dict) -> dict:
    '''Add the user's Microsoft Entra ID groups to the auth state.

    :param authenticator: The authenticator instance.
    :type authenticator: AzureAdOAuthenticator
    :param auth_state: The auth state instance.
    :type auth_state: dict
    :returns: The auth state with the AD groups added.
    :rtype: dict
    '''
    if not isinstance(authenticator, AzureAdOAuthenticator):
        return auth_state
    get_members_url: str = 'https://graph.microsoft.com/v1.0/me/memberOf?$select=displayName'
    groups_user_is_a_member_of: list[str] = []
    access_token: str = auth_state['token_response']['access_token']
    get_more_data: bool = True
    while get_more_data:
        # Logger methods do not accept print()'s end= keyword.
        logger.debug(f"Get data from {get_members_url}...")
        # Note: get() is synchronous, so the hub's event loop waits on each call.
        get_members_response: dict = get(
            get_members_url,
            headers={'Authorization': f"Bearer {access_token}"}
        ).json()
        received_groups: list[str] = [d['displayName'] for d in get_members_response.get('value', [])]
        logger.debug(f"got {len(received_groups)} groups.")
        groups_user_is_a_member_of.extend(received_groups)
        # Graph returns '@odata.nextLink' only while more pages remain.
        if '@odata.nextLink' in get_members_response:
            get_members_url = get_members_response['@odata.nextLink']
        else:
            get_more_data = False
    groups_user_is_a_member_of.sort()
    auth_state['entra_id_groups'] = groups_user_is_a_member_of
    if auth_state.get('entra_id_groups'):
        logger.debug('\n'.join(auth_state['entra_id_groups']))
    else:
        logger.warning("Microsoft Entra ID Groups Not Found")
    return auth_state
async def custom_options_form(spawner: KubeSpawner) -> str:
    '''Custom options form for the spawner.

    :param spawner: The KubeSpawner instance.
    :type spawner: kubespawner.KubeSpawner
    :returns: The html form
    :rtype: str
    '''
    groups = await get_groups(spawner)
    if not hasattr(spawner, 'default_profile_list'):
        # Keep an untouched copy of the configured profiles.
        spawner.default_profile_list = list(spawner.profile_list)
    # Start from a fresh copy so repeated renders do not append duplicates.
    spawner.profile_list = list(spawner.default_profile_list)
    if 'TEAM-JUPYTERHUB-GPU-ACCESS' in groups:
        spawner.profile_list.extend([
            {
                'display_name': 'AI Team GPU server',
                'description': 'Dynamically added for members of "TEAM-JUPYTERHUB-GPU-ACCESS"',
                'default': False,
            }
        ])
    return spawner._options_form_default()
async def pod_customization(
        spawner: KubeSpawner,
        pod: V1Pod) -> V1Pod:
    '''
    This is a hook that can be used to modify the pod before it is created.

    :param spawner: The KubeSpawner instance.
    :type spawner: kubespawner.KubeSpawner
    :param pod: The pod to modify.
    :type pod: kubernetes_asyncio.client.V1Pod
    :returns: The modified pod.
    :rtype: kubernetes_asyncio.client.V1Pod
    '''
    groups = await get_groups(spawner)
    if 'TEAM-JUPYTERHUB-AI-TEAM-SHARE-ACCESS' in groups:
        inject_mount(pod, 'ai-team-share')
    return pod
async def get_groups(spawner: KubeSpawner) -> list[str]:
    '''
    Get the groups the user is a member of.

    :param spawner: The KubeSpawner instance.
    :type spawner: kubespawner.KubeSpawner
    :returns: The list of groups the user is a member of.
    :rtype: list[str]
    '''
    groups: list[str] = []
    # get_auth_state() returns None when no auth state is stored for the user.
    auth_state = await spawner.user.get_auth_state() or {}
    if auth_state.get('entra_id_groups'):
        groups = auth_state['entra_id_groups']
    elif auth_state.get('user', {}).get('roles'):
        groups = auth_state['user']['roles']
    if groups:
        logger.debug(f"User \"{spawner.user.name}\" is a member of {len(groups)} groups.")
    else:
        logger.debug(f"User \"{spawner.user.name}\" is not a member of any groups")
    return groups
def inject_mount(
        pod: V1Pod,
        name: str = 'ai-team-share'
        ) -> None:
    '''Inject a mount to the pod.

    :param pod: The pod to modify.
    :type pod: kubernetes_asyncio.client.V1Pod
    :param name: The name of the mount.
    :type name: str
    :returns: Nothing
    :rtype: None
    '''
    pod.spec.volumes.append({
        'name': name,
        'persistentVolumeClaim': {
            'claimName': name
        }
    })
    pod.spec.containers[0].volume_mounts.append({
        'name': name,
        'mountPath': f"/mnt/{name}"
    })
...
c.Authenticator.modify_auth_state_hook = insert_microsoft_entra_id_groups
c.KubeSpawner.options_form = custom_options_form
c.KubeSpawner.modify_pod_hook = pod_customization
...
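One side note on the options form: since it can be rendered many times for the same user, it seems safest to build the profile list from an untouched copy of the defaults on every render instead of extending a shared list. Below is a minimal sketch of that idea as a pure function. The `build_profile_list` helper, the `extras_by_group` mapping and the "Default server" profile are hypothetical names for illustration only; the GPU profile is the one from my config above.

```python
import copy

GPU_GROUP = "TEAM-JUPYTERHUB-GPU-ACCESS"

GPU_PROFILE = {
    "display_name": "AI Team GPU server",
    "description": f'Dynamically added for members of "{GPU_GROUP}"',
    "default": False,
}


def build_profile_list(
    default_profiles: list[dict],
    groups: list[str],
    extras_by_group: dict[str, list[dict]],
) -> list[dict]:
    """Return a new profile list for a user without mutating the inputs."""
    # Deep-copy so later edits to a user's list never leak into the defaults.
    profiles = copy.deepcopy(default_profiles)
    for group, extras in extras_by_group.items():
        if group in groups:
            profiles.extend(copy.deepcopy(extras))
    return profiles


# Hypothetical default profile, standing in for whatever is configured.
defaults = [{"display_name": "Default server", "default": True}]
extras = {GPU_GROUP: [GPU_PROFILE]}

first = build_profile_list(defaults, [GPU_GROUP], extras)
second = build_profile_list(defaults, [GPU_GROUP], extras)
nobody = build_profile_list(defaults, [], extras)
```

Calling it twice for the same user gives two equal lists of the same length, and `defaults` stays at its original size, which is the property the options form needs.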