How to set up LTI 1.3 authentication for Jupyterhub

We want to upgrade our Jupyterhub integration in our university LMS from LTI 1.1 to 1.3. Currently, I am trying to set up ltiauthenticator.lti13.auth.LTI13Authenticator (1.3.0) with the saltire LTI platform simulator, to rule out issues coming from the LMS we are using and to make debugging easier.

The relevant helm config (z2jh 1.2.0) looks currently like this:

hub:
  config:
    JupyterHub:
      authenticator_class: ltiauthenticator.lti13.auth.LTI13Authenticator
    LTI13Authenticator:
      username_key: "lms_user_id"
      authorize_url: "https://saltire.lti.app/platform/auth"
      client_id: "saltire.lti.app"
      endpoint: "https://<MYHUB.COM>/hub/oauth_callback"
      token_url: "https://saltire.lti.app/platform/token/<SOME_HASH>"
  extraEnv:
    # full path to the RSA private key in PEM format, required by LTI13JWKSHandler.
    # https://github.com/jupyterhub/ltiauthenticator/blob/aa769c2cc9fc40703fd3e71d2afda2be4a741f95/ltiauthenticator/lti13/handlers.py#L261
    LTI13_PRIVATE_KEY: <FULL_PATH_TO_PRIVATE_KEY_PEM>

while on the platform side, I am required to provide:

  • Initiate login URL: https://<MYHUB.COM>/hub/oauth_login
  • Redirection URI(s): [https://<MYHUB.COM>/hub/oauth_callback,]
  • Public keyset URL: https://<MYHUB.COM>/hub/lti13/jwks
  • Public key: <PUBLIC_KEY_OF_LTI13_PRIVATE_KEY_SET_ABOVE>

This information is extracted from the /hub/lti13/config endpoint response, which an LMS may use to obtain the tool's configuration.
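For anyone translating between deployments: outside of z2jh, the same settings would go in a plain jupyterhub_config.py. This is only a sketch mirroring the helm values above (same placeholders kept as-is, not checked against a specific ltiauthenticator release):

```python
# jupyterhub_config.py sketch of the helm config above.
# Placeholder values (<MYHUB.COM>, <SOME_HASH>, key path) are kept as-is.
c.JupyterHub.authenticator_class = "ltiauthenticator.lti13.auth.LTI13Authenticator"
c.LTI13Authenticator.username_key = "lms_user_id"
c.LTI13Authenticator.authorize_url = "https://saltire.lti.app/platform/auth"
c.LTI13Authenticator.client_id = "saltire.lti.app"
c.LTI13Authenticator.endpoint = "https://<MYHUB.COM>/hub/oauth_callback"
c.LTI13Authenticator.token_url = "https://saltire.lti.app/platform/token/<SOME_HASH>"
# The private key is picked up from the environment, as in extraEnv above:
#   LTI13_PRIVATE_KEY=<FULL_PATH_TO_PRIVATE_KEY_PEM>
```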

My problem is that the LTI13Authenticator seems to register only the /hub/lti13/config endpoint. All the others, i.e. /hub/oauth_login, /hub/oauth_callback and /hub/lti13/jwks, respond with 404: Not Found. This also seems to make sense, since the authenticator overrides the get_handlers method of its superclass, thereby preventing the mentioned routes from being registered.

So either my understanding of how the authentication is supposed to work is totally wrong, or LTI13Authenticator is fundamentally broken. Any help is very much appreciated.
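To make the suspected failure mode concrete: JupyterHub mounts exactly the (path, handler) pairs returned by the authenticator's get_handlers(), so an override that returns only the config route leaves the rest unregistered. A toy sketch of this (dummy handler classes and illustrative paths, not the real ltiauthenticator code):

```python
# Toy illustration: JupyterHub mounts exactly the (path, handler) pairs an
# authenticator returns from get_handlers(); handler classes here are dummies.
class ConfigHandler: ...
class JWKSHandler: ...
class LoginHandler: ...
class CallbackHandler: ...

def broken_get_handlers(app):
    # A subclass returning only this pair leaves the login, callback
    # and JWKS routes unregistered, so they respond 404.
    return [("/lti13/config", ConfigHandler)]

def expected_get_handlers(app):
    # What a working LTI 1.3 authenticator would need to expose.
    return [
        ("/lti13/config", ConfigHandler),
        ("/lti13/jwks", JWKSHandler),
        ("/oauth_login", LoginHandler),
        ("/oauth_callback", CallbackHandler),
    ]

missing = {p for p, _ in expected_get_handlers(None)} - {p for p, _ in broken_get_handlers(None)}
print(sorted(missing))  # → ['/lti13/jwks', '/oauth_callback', '/oauth_login']
```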

(@consideRatio @yuvipanda ping)

1 Like

Hi,

I would like to use LTI as well. Do we need to install the authenticator only on the hub?

best

1 Like

Yes, you install the authenticator only on the hub side.

2 Likes

Hi @martinclaus … working on adding an LTI 1.3 integration to JupyterHub from a small LMS I’ve created. Did you get any farther on your own effort to establish an LTI 1.3 connection with a JupyterHub server?

1 Like

@martinclaus has done amazing work in the ltiauthenticator project! I think it is functional now and documented for use with LTI 1.3.

2 Likes

Working with ltiauthenticator (thanks @martinclaus!) to integrate JupyterHub as a tool for a custom LMS acting as the platform.

For initial development I'm running my LMS on localhost:8001 and configuring ltiauthenticator to run on localhost:8000.

I’ve gotten through the preflight request and am able to launch JupyterHub as a tool via the redirect URL, but it’s responding with "Fail to fetch data from the url, err: 'HTTP Error 404: Not Found'" when I send the required information as a POST to the /hub/lti13/oauth_callback URL.

…but I’m thinking the app can’t access :8001 from inside Docker, as I see this in the log:

jupyterhub-1  | 05:24:25.434 [ConfigProxy] warn: 404 GET /lti/security/jwks/ 
jupyterhub-1  | [W 2024-04-24 05:24:25.435 JupyterHub web:1873] 400 POST /hub/lti13/oauth_callback (::ffff:192.168.65.1): Fail to fetch data from the url, err: "HTTP Error 404: Not Found"

So that 404 really means the /lti/security/jwks/ URL can’t be found.

Any suggestions on how to configure JupyterHub running in Docker on port 8000 to be able to reach out to localhost:8001?

1 Like

Hi @danielmcquillen, sorry I can’t be very useful, but I am also working on creating a small LMS and I need to configure JupyterHub to identify my LMS as a client.

Can I ask, what configuration have you used to get started? I am a bit lost in terms of getting started from a custom LMS.

Thanks in advance!

1 Like

@IvanYingX Sorry for the delay! Only now getting back to working on this issue. Nice to hear someone else is trying the same thing :smiley:

If it’s interesting to you, you can see my work on the LMS LTIv1.3 side of things in the open-source LMS I’ve written (repo is here : GitHub - ScienceCommunicationLab/KinesinLMS: A simple, open-source Learning Management System.). It’s a Django project and most of the relevant code is in the lti app folder.

Still trying to figure out the particulars of the OpenID connect passive login flow but I think I’ve got most of it.

As far as getting JupyterHub set up…gah. I’ve switched to trying to install “The littlest JupyterHub” on a Digital Ocean droplet and seeing if I can get it to work with my LMS :thinking:

I’m still stuck with getting a 404 when the LMS tries to connect with it (not sure yet whether it’s in the initial login call or the second oauth redirect call).

I’ll post more here if I can make progress. Still not clear to me how one designates a particular JupyterHub notebook as the final resource that’s meant to be shown once the whole LTIv1.3 authorization is complete.

Note that it appears the conventional way to configure tljh is to use a command-line call for each property you want to set, rather than editing a config file. More here: Configuring TLJH with tljh-config — The Littlest JupyterHub documentation

So I think the trick is to just use that command to set the properties expected by ltiauthenticator

More soon…

1 Like

With the properties expected by ltiauthenticator listed here: Configuration reference — LTI Authenticator for JupyterHub documentation

1 Like

Hi @danielmcquillen and thanks for your reply!

In my case we are not using TLJH, so not sure if this will be useful. I set everything up with two different methods:

  • For production it was using k8s in multiple EC2 instances
  • and for local dev I am using docker compose, where one docker image replicates the k8s cluster.

The main difference is that in local dev, the hub is configured through jupyterhub_config.py, while for k8s everything is inside the helm chart. Another difference is that local uses DockerSpawner, whereas k8s uses KubeSpawner, but they are very similar configuration-wise.

I finally managed to make the authenticator work using FusionAuth (https://fusionauth.io/), but there are a few tweaks I had to make. Not sure if it will be relevant to your authentication tool, but here’s what I could find:

The LTI13Authenticator class expects a few claims in the id_token, such as “version”, “deployment_id” and other parameters. I couldn’t add these properties from the LMS itself, so I used a trick in FusionAuth that lets you populate the token by first passing it through a Lambda function:

function populate(jwt, user, registration) {
  jwt["https://purl.imsglobal.org/spec/lti/claim/version"] = "1.3.0";
  jwt["https://purl.imsglobal.org/spec/lti/claim/deployment_id"] = "<your deployment id (I randomly created one)>";
  jwt["https://purl.imsglobal.org/spec/lti/claim/message_type"] = "LtiResourceLinkRequest";
  jwt["https://purl.imsglobal.org/spec/lti/claim/roles"] = [
    "http://purl.imsglobal.org/vocab/lis/v2/institution/person#Student",
    "http://purl.imsglobal.org/vocab/lis/v2/membership#Learner",
    "http://purl.imsglobal.org/vocab/lis/v2/membership#Mentor"
  ];
  jwt["https://purl.imsglobal.org/spec/lti/claim/target_link_uri"] = "<your jupyterhub url>";
  jwt["https://purl.imsglobal.org/spec/lti/claim/resource_link"] = {
    id: user.data.resource_link_id
  };
  jwt.email = user.email;
}
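For reference, the same claim set can be written out in Python; the claim URIs are the standard LTI 1.3 ones, while the deployment id, URL and email below are made-up values:

```python
# Illustrative id_token payload carrying the claims the validator expects;
# the deployment id, target URL and email are placeholders, not real values.
LTI = "https://purl.imsglobal.org/spec/lti/claim/"

id_token = {
    LTI + "version": "1.3.0",
    LTI + "deployment_id": "my-random-deployment-id",
    LTI + "message_type": "LtiResourceLinkRequest",
    LTI + "roles": [
        "http://purl.imsglobal.org/vocab/lis/v2/membership#Learner",
    ],
    LTI + "target_link_uri": "https://hub.example.com",
    LTI + "resource_link": {"id": "resource-link-1"},
    "email": "student@example.com",
}

# Quick sanity check mirroring what a validator would require
required = ["version", "deployment_id", "message_type", "roles",
            "target_link_uri", "resource_link"]
assert all(LTI + claim in id_token for claim in required)
```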

Then, in the authenticator tool you also need to add the allowed redirect URLs and authorized request URLs.
Another thing that kept me stuck for a long time was that your authentication tool and your LMS have to be on the same domain; from what I can see, in your case it should be fine.

In case it helps: I spent a lot of time debugging, so I wrote this custom authenticator simply to see the logs and to check whether some of the methods were called. In my helm chart, under hub → extraConfig, I added this:

    customLTIAuthenticator: |
      from ltiauthenticator.lti13.auth import LTI13Authenticator
      from ltiauthenticator.lti13.handlers import LTI13LoginInitHandler, LTI13CallbackHandler, LTI13ConfigHandler
      from ltiauthenticator.lti13.validator import LTI13LaunchValidator
      from ltiauthenticator.lti13.error import ValidationError
      from ltiauthenticator.utils import convert_request_to_dict
      from traitlets import List, Set, Unicode
      from typing import Any, Dict, Optional, cast
      from tornado.httputil import url_concat
      from tornado.web import HTTPError, RequestHandler


      class MyLTI13LoginHandler(LTI13LoginInitHandler):
          def __init__(self, *args, **kwargs):
              super().__init__(*args, **kwargs)
              print("Custom LTI Login Handler init")
              print(self.__dict__)
          
          def check_xsrf_cookie(self):
              print("=====Attempting to check cookies from Login====")
              """
              Do not attempt to check for xsrf parameter in POST requests. LTI requests are
              meant to be cross-site, so it must not be verified.
              """
              return

          def authorize_redirect(
              self,
              redirect_uri: str,
              login_hint: str,
              nonce: str,
              client_id: str,
              state: str,
              purl_args: Optional[Dict[str, str]] = None,
              lti_message_hint: Optional[str] = None,
          ) -> None:
              """
              Overrides the OAuth2Mixin.authorize_redirect method to initiate the LTI 1.3 / OIDC
              login flow with the required and optional arguments.

              User Agent (browser) is redirected to the platform's authorization url for further
              processing.

              References:
              https://openid.net/specs/openid-connect-core-1_0.html#AuthRequest
              http://www.imsglobal.org/spec/lti/v1p3/#additional-login-parameters-0

              Args:
                redirect_uri: redirect url specified during tool installation (callback url) to
                  which the user will be redirected from the platform after attempting authorization.
                login_hint: opaque value used by the platform for user identity
                nonce: unique value sent to allow recipients to protect themselves against replay attacks
                client_id: used to identify the tool's installation with a platform
                state: opaque value for the platform to maintain state between the request and
                  callback and provide Cross-Site Request Forgery (CSRF) mitigation.
                lti_message_hint: similarly to the login_hint parameter, lti_message_hint value is opaque to the tool.
                  If present in the login initiation request, the tool MUST include it back in
                  the authentication request unaltered.
              """
              handler = cast(RequestHandler, self)
              # Required parameter with values specified by LTI 1.3
              # https://www.imsglobal.org/spec/security/v1p0/#step-2-authentication-request
              args = {
                  "response_type": "id_token",
                  "scope": "openid",
                  "response_mode": "form_post",
                  "prompt": "none",
              }
              # Dynamically computed required parameter values
              args["client_id"] = client_id
              args["redirect_uri"] = redirect_uri
              args["login_hint"] = login_hint
              args["nonce"] = nonce
              args["state"] = state

              if lti_message_hint is not None:
                  args["lti_message_hint"] = lti_message_hint
              if purl_args:
                  args.update(purl_args)
              print("Args for authenticate")
              url = self.authenticator.authorize_url
              handler.redirect(url_concat(url, args))

          def get_purl_args(self, args: Dict[str, str]) -> Dict[str, str]:
              """
              Collect the optional PURL claim arguments present in the request and return them as a dict.
              """
              PURL_ARGS = [
                  "https://purl.imsglobal.org/spec/lti/claim/launch_presentation",
                  "https://purl.imsglobal.org/spec/lti/claim/tool_platform",
                  "https://purl.imsglobal.org/spec/lti/claim/deployment_id",
                  "https://purl.imsglobal.org/spec/lti/claim/message_type",
                  "https://purl.imsglobal.org/spec/lti/claim/version",
                  "https://purl.imsglobal.org/spec/lti/claim/resource_link",
                  "https://purl.imsglobal.org/spec/lti/claim/context",
              ]
              values = {}
              for purl_arg in PURL_ARGS:
                  value = self._get_optional_arg(args, purl_arg)
                  if value:
                      values[purl_arg] = value
              
              return values
          
          def post(self):
              """
              Validates required login arguments sent from platform and then uses the authorize_redirect() method
              to redirect users to the authorization url.
              """
              print("------POST method called-------")
              validator = LTI13LaunchValidator()
              args = convert_request_to_dict(self.request.arguments)
              print("------POST method called-------", args)
              self.log.debug(f"Initial login request args are {args}")

              # Raises HTTP 400 if login request arguments are not valid
              try:
                  validator.validate_login_request(args)
              except ValidationError as e:
                  raise HTTPError(400, str(e))

              login_hint = args["login_hint"]
              self.log.debug(f"login_hint is {login_hint}")

              lti_message_hint = self._get_optional_arg(args, "lti_message_hint")
              client_id = self._get_optional_arg(args, "client_id")

              # lti_deployment_id is not used anywhere. It may be used in the future to influence the
              # login flow depending on the deployment settings. A configurable hook, similar to `Authenticator`'s `post_auth_hook`
              # would be a good way to implement this.
              # lti_deployment_id = self._get_optional_arg(args, "lti_deployment_id")
              purl_args = self.get_purl_args(args)

              redirect_uri = self.get_redirect_uri()
              self.log.debug(f"redirect_uri is: {redirect_uri}")

              # to prevent CSRF
              state = self.generate_state()

              # to prevent replay attacks
              nonce = self.generate_nonce()
              self.log.debug(f"nonce value: {nonce}")

              # Set cookies with appropriate attributes
              handler = cast(RequestHandler, self)
              handler.set_secure_cookie('oauth_state', state, secure=True, httponly=True, samesite='None')
              handler.set_secure_cookie('oauth_nonce', nonce, secure=True, httponly=True, samesite='None')

              self.authorize_redirect(
                  client_id=client_id,
                  login_hint=login_hint,
                  lti_message_hint=lti_message_hint,
                  nonce=nonce,
                  redirect_uri=redirect_uri,
                  state=state,
                  purl_args=purl_args,
              )
              


          # GET requests are also allowed by the OpenID Connect launch flow:
          # https://www.imsglobal.org/spec/security/v1p0/#fig_oidcflow
          #
          get = post

      class MyLTI13CallbackHandler(LTI13CallbackHandler):
          def __init__(self, *args, **kwargs):
              super().__init__(*args, **kwargs)
              print("Custom LTI CallBack Handler init")
              print(self.__dict__)
          
          async def get(self):
              print("----------Get method used------------")
              await self.post()

          def decode_and_validate_launch_request(self) -> Dict[str, Any]:
              """Decrypt, verify and validate launch request parameters.

              Raises subclasses of `ValidationError` of `HTTPError` if anything fails.

              References:
              https://openid.net/specs/openid-connect-core-1_0.html#IDToken
              https://openid.net/specs/openid-connect-core-1_0.html#ImplicitIDTValidation
              """
              print("Callback self.request", self.request)
              print("Callback self.request.arguments", self.request.arguments)
              validator = LTI13LaunchValidator()

              args = convert_request_to_dict(self.request.arguments)
              self.log.debug(f"Initial launch request args are {args}")

              validator.validate_auth_response(args)

              # Check that state is the same as in the authorization request
              # constructed in `LTI13LoginInitHandler.post`; prevents CSRF
              self.check_state()
              print("______Verifying and decoding jwt______")
              id_token = validator.verify_and_decode_jwt(
                  encoded_jwt=args.get("id_token"),
                  issuer=self.authenticator.issuer,
                  audience=self.authenticator.client_id,
                  jwks_endpoint=self.authenticator.jwks_endpoint,
                  jwks_algorithms=self.authenticator.jwks_algorithms,
              )
              print("Id Token: ", id_token)
              print("______Validating id token______")
              validator.validate_id_token(id_token)
              validator.validate_azp_claim(id_token, self.authenticator.client_id)

              # Check nonce matches the one that has been used in the authorization request.
              # A nonce is a hash of random state which is stored in a session cookie before
              # redirecting to make authorization request. This mitigates replay attacks.
              #
              # References:
              # https://openid.net/specs/openid-connect-core-1_0.html#NonceNotes
              # https://auth0.com/docs/get-started/authentication-and-authorization-flow/mitigate-replay-attacks-when-using-the-implicit-flow
              self.check_nonce(id_token)
              print("Checks bypassed")
              print("=============Returning Id Token=========")
              return id_token

          def check_xsrf_cookie(self):
              print("=====Attempting to check cookies from callback====")
              """
              Do not attempt to check for xsrf parameter in POST requests. LTI requests are
              meant to be cross-site, so it must not be verified.
              """
              return
      
      class MyLTI13ConfigHandler(LTI13ConfigHandler):
          def check_xsrf_cookie(self):
              print("=====Attempting to check cookies from config====")
              """
              Do not attempt to check for xsrf parameter in POST requests. LTI requests are
              meant to be cross-site, so it must not be verified.
              """
              return

      class MyAuthenticator(LTI13Authenticator):
          login_handler = MyLTI13LoginHandler
          callback_handler = MyLTI13CallbackHandler
          config_handler = MyLTI13ConfigHandler
          auto_login = True
          login_service = "LTI 1.3"
          username_key = Unicode("email")
          client_id = Set({"b5b6ba99-a446-4fdb-806e-0d642f3eb6c5"})
          authorize_url = Unicode("<authorization_tool_URL>/oauth2/authorize")
          jwks_endpoint = Unicode("<authorization_tool_URL>/.well-known/jwks.json")
          issuer = Unicode("acme.com")

          async def authenticate(self, handler, data=None):
              print("Custom LTI Authenticator authenticate")
              print(handler)
              print(data)
              return await super().authenticate(handler, data)
          
          async def pre_spawn_start(self, user, spawner):
              print("Custom Pre Spawn Start called")
              print(user)
              print(spawner)

      c.JupyterHub.authenticator_class = MyAuthenticator

Most methods are just copy-pasted from the original codebase; I added them in case I wanted to add print statements eventually.

Then, from the LMS, I tell it to open localhost:8080 with some parameters:

const launchParams = {
  iss: 'acme.com', // or the issuer you used in your authenticator tool
  sub: '<user_id in your LMS>',
  name: '<user name>',
  email: '<user email>',
  exp: Math.floor(Date.now() / 1000) + 60,
  iat: Math.floor(Date.now() / 1000),
  target_link_uri: '<your JupyterHub URL>',
  client_id: '<id of your auth tool application>',
  login_hint: '<user email - or what you want the user to use as the authentication login>'
};

And these will be used as the searchParams (for example localhost:8080?iss=acme.com&sub=user_id…)
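That query string can be sanity-checked with a few lines of Python; the hostname, login path and parameter values below are the illustrative ones from above, not a verified endpoint:

```python
from urllib.parse import urlencode
import time

# Illustrative launch parameters mirroring the object above (values are fake)
now = int(time.time())
launch_params = {
    "iss": "acme.com",
    "sub": "user_id",
    "login_hint": "student@example.com",
    "iat": now,
    "exp": now + 60,
}
# Assumed login-initiation path, matching the launchUrl mentioned later in the thread
login_url = "http://localhost:8080/hub/lti13/oauth_login?" + urlencode(launch_params)
```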

I am afraid I can’t help more, I haven’t used TLJH, so I don’t know what config it requires, but I hope this really helps

Good luck with your project!

1 Like

@IvanYingX Thanks so much for the thorough breakdown of your attempts. I’ll probably be mining that for a while, but it looks like most of your comments are concerned with the second part of the OpenID connect sequence (the “redirect url”), while I’m still stuck on the first call (the “login url”).

I’m giving the K8S approach a go … I’ve managed to get a cluster running on DigitalOcean with JupyterHub, but I’m confused: how does one log in as an “Admin” to JupyterHub to e.g. create notebooks for students to use, when the LTI13Authenticator docs say that once you use it, you can no longer log into JupyterHub the conventional way; you have to use the LTIv1.3 connection from your LMS?

Also, I guess I don’t have a good mental model for :

  1. how an instructor/admin manages which students (e.g. usernames?) JupyterHub should expect to be logging in to access a notebook.
  2. how an instructor/admin would set up that series of notebooks for those students to use for a given course. Let’s say there are 10 notebook “exercises”; how would the instructor map those to the students who should have access to them for a given course?
1 Like

@danielmcquillen no worries! I’m glad I can help at least a bit. I am not sure about the login url. If I understand correctly, you can connect to the hub using your browser, but when you try to do so through the LMS, it throws a 404? If that’s the case, would you mind sharing some screenshots of the Network tab in your browser inspector?

Regarding the notebooks, maybe someone with more expertise can correct me, but I think JupyterHub doesn’t have a native way to share notebooks that you create from the admin account. AFAIK, the admin account allows you to spawn or delete user containers. But I know there are libraries that allow users to share notebooks with the rest of the cluster, like nbgrader (nbgrader — nbgrader 0.9.3 documentation).

In my case, what I am doing to share the notebooks is using nbgitpuller (GitHub - jupyterhub/nbgitpuller: Jupyter server extension to sync a git repository one-way to a local path), which essentially makes a git pull from a repo of your choice every time a container is spawned. So you can upload your notebooks to a repo, and they will be shared. Just as a note, users will be able to modify the notebook, but that won’t affect the original notebook in the repo.
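As a concrete illustration of the nbgitpuller approach: such links are ordinary URLs with repo, branch and urlpath query parameters. A sketch (the repo and notebook names are made up, and the hub hostname is the placeholder used earlier in the thread):

```python
from urllib.parse import urlencode

# Building an nbgitpuller link; repo and notebook names below are hypothetical
params = {
    "repo": "https://github.com/example/course-notebooks",
    "branch": "main",
    "urlpath": "tree/course-notebooks/lesson1.ipynb",
}
link = "https://<MYHUB.COM>/hub/user-redirect/git-pull?" + urlencode(params)
```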

For the two points you mentioned:

  1. Usually auth tools grant access to a user only if they are already registered in that application. You can grant access through OpenID to the LMS, but then block them from JupyterHub by simply not adding them to the list of JupyterHub registrations.

If you want to be more specific about which notebooks are accessible to a user, you can tweak that in jupyterhub_config.py (if you are using DockerSpawner) or in your helm chart (if you are using K8s).
When you spawn a container, you can pass parameters to it, which the spawner can read. These can set the notebook_dir env variable (so the user is forced into a specific path without being able to move up; however, they can still inspect other paths from inside a notebook, so it might not be ideal), or you can run a command to grant read access to that user using chmod (I haven’t tried this one, so I’m not sure it would work, tbh).

  2. I think the point above should answer this, but to give a bit more detail: you can store info about the path to be opened in the database, so whenever you click a link in the LMS, the path is sent to the spawner.

Maybe I am going too deep in the weeds, so I’ll stop giving ideas now :sweat_smile:

Hope that’s helpful, and yeah, if you can send the screenshots I might be able to help a bit more

1 Like

Thanks @IvanYingX.

To recap then: you’re using nbgitpuller to point at a notebook you want the student to access, and then, in the “launch” LTIv1.3 redirect request from the LMS to JupyterHub, you’re including information about that notebook in the OpenID JSON web token.

If you have time (which you probs don’t!!) could you post the details of the web token you’re sending, which would show how you’re telling JupyterHub which notebook you want the student to be using? Or perhaps it’s an extra GET variable outside the token…

My mental model is that if an instructor has an extensive, say, three month course, there will be maybe say 5 or 10 different notebooks for different units in the course as the student progresses.

So I imagine in one unit your web token will contain info like (informally) “show the first-concept-notebook” … and then another request would be “show the second-concept-notebook” and so on.

Although I haven’t gotten past the Helm-driven setup stage yet (sigh), once I do, I’m still not sure how to think about organizing multiple notebooks for multiple courses with multiple instructors, and how that’s kept synchronized with the LMS. It could be there’s just a lot of busy work keeping the two in sync via manual input in configuration screens.

I don’t think there’s any guidance on this from the LTI Authenticator docs directly… so it looks like a lot of tacit knowledge I need to build up by asking experts (like you!).

I noticed the docs mention a “Data 8” course at UC Berkeley that used JupyterHub extensively…I wonder how they managed the admin via an LMS, or if they didn’t even use an LMS and therefore didn’t have these kinds of things to worry about.

1 Like

Hey @danielmcquillen

I will try to be as detailed as possible. There is a lot of context and I don’t want to saturate the message with irrelevant information, but if anything is not crystal clear, just let me know.

Ok, so I am currently creating an app using nextjs and React. When I click a button to open the hub, it calls this function:

  const launchLtiInIframe = async () => {
    if (!launchParams) {
      return;
    }
    if (clientId === LTI_JUPYTERHUB_APPLICATION_ID) {
      await launchJupyterHub();
    }
    setIframeUrl(
      `${launchUrl}?${Object.entries(launchParams)
        .map(([key, value]) => `${key}=${value}`)
        .join('&')}`
    );
  };

where launchUrl is <your_jupyterhub_url>/hub/lti13/oauth_login and launchParams are the parameters appended to the URL. In my case, they are:

const session = auth(); // This comes from nextauth.js, and essentially has info about the name, id and email of the logged-in user. You can use any name and email, but make sure they are registered in your auth tool

const launchParams = {
    iss: 'acme.com',
    sub: session?.user.id,
    name: session?.user.name,
    email: session?.user.email,
    exp: Math.floor(Date.now() / 1000) + 60,
    iat: Math.floor(Date.now() / 1000),
    target_link_uri: activity?.url,
    client_id: '<id of your auth tool application>',
    login_hint: session?.user.email
  };

All this info is just to open jupyterhub in an iframe. However, you can see I haven’t mentioned anything about the notebook that I want to open. This is done in the launchJupyterHub() function:

const launchJupyterHub = async () => {
    await fetch(`/api/jupyterhub/server`, {
      method: 'POST',
      body: JSON.stringify({
        lessonName: activity.title,
        lessonId: activity.sectionId,
      }),
    });
  }

This endpoint looks like this:

const launchJupyterHubServer = async (
  lessonName: string,
  lessonId: string,
  email: string
) => {
  const url = `${JUPYTERHUB_URL}/users/${email}/server`;
  const body = {
    "lesson_name": lessonName,
    "lesson_id": lessonId,
    "lesson_dir": lessonName,
  };
  const response = await fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      "Authorization": `token ${JUPYTERHUB_API_TOKEN}`
    },
    body: JSON.stringify(body),
  });
  return response;

}

export async function POST(
  request: NextRequest,
) {
  const session = await auth();
  const email = session?.user.email;
  if (!email || !session) {
    return Response.error();
  }
  const { lessonName, lessonId } = await request.json();
  const launchResponse = await launchJupyterHubServer(lessonName, lessonId, email);

  return Response.json({
    data: { launchResponse },
  });
}

This is coded as a nextjs route, but you can create your own endpoints on AWS, for example. Make sure you have created a JupyterHub token with admin privileges (I will explain this shortly). You can see that I am sending information about lesson_name, lesson_id, and lesson_dir to the JupyterHub endpoint. This endpoint is part of JupyterHub, and you can check more endpoints here: JupyterHub REST API — JupyterHub documentation. The one that I am using here is this: JupyterHub REST API — JupyterHub documentation.
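The call above corresponds to JupyterHub's REST endpoint for starting a user's server. A minimal Python sketch that only constructs the request (host, token, and lesson values are placeholders; nothing is actually sent):

```python
import json
from urllib.request import Request

JUPYTERHUB_URL = "http://localhost:8000/hub/api"  # placeholder hub API base
JUPYTERHUB_API_TOKEN = "<admin-scoped token>"     # placeholder token
email = "student@example.com"                     # placeholder username

# Same payload shape as the nextjs example above
body = {"lesson_name": "Intro", "lesson_id": "1", "lesson_dir": "Intro"}

req = Request(
    f"{JUPYTERHUB_URL}/users/{email}/server",
    data=json.dumps(body).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"token {JUPYTERHUB_API_TOKEN}",
    },
    method="POST",
)
print(req.get_method(), req.full_url)
```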

So, once you make a POST request to your hub, you can read the payload you sent. Let’s look at the configuration file to see how this is read. This is how I did it for KubeSpawner, but it’s very easy to adapt to DockerSpawner (in case you want to use a Docker container to replicate a k8s cluster; ideal for a local dev env).

Just a disclaimer, I wrote this in a very debugging-friendly way, but there are more elegant ways to write it.

Inside the config.yaml file for my helm chart, I have this:

hub:
  extraConfig:
    customSpawner: |
      import os
      import logging
      from kubespawner.spawner import KubeSpawner

      class CustomSpawner(KubeSpawner):
          def start(self):
              lesson_id = self.user_options.get("lesson_id", "")
              lesson_name = self.user_options.get("lesson_name", "")
              lesson_dir = self.user_options.get("lesson_dir", "")
              self.environment['LESSON_ID'] = lesson_id
              self.environment['LESSON_NAME'] = lesson_name
              self.notebook_dir = 'lessons/' + lesson_dir
              return super().start()

      c.JupyterHub.spawner_class = CustomSpawner

NOTE: I am using LESSON_ID and LESSON_NAME as env variables so they can be used by some of my extensions, but you can omit them.

You can see that whatever I passed in the payload can be read under self.user_options. One of the variables is lesson_dir, which points to the directory I want to open. In this particular case, nbgitpuller is pulling a repo that contains different notebooks into a folder called lessons. In case you need to know how to do it, this is the command I am using inside config.yaml:

singleuser:
  lifecycleHooks:
    postStart:
      exec:
        command:
          - "gitpuller"
          - "https://{your Github Token}@github.com/{your repo}"
          - "main"  # or the branch you are using
          - "lessons"

I think that’s everything I can help with for organizing notebooks. A couple of important things I should share:

  • If the notebook_dir you pass doesn’t exist, the spawn will fail.
  • notebook_dir should end with a /, so it is recognized as a directory. Otherwise the spawn can also fail.
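Both caveats can be handled defensively before handing the value to the spawner; a hypothetical helper (note that checking the directory actually exists can only happen inside the user’s pod, after gitpuller has run):

```python
import os


def normalize_notebook_dir(lesson_dir, root="lessons"):
    """Join the requested lesson directory under the lessons root and
    make sure it ends with a slash, so it is treated as a directory."""
    path = os.path.join(root, lesson_dir.strip("/"))
    if not path.endswith("/"):
        path += "/"
    return path


# In CustomSpawner.start() one could then write (sketch):
#     self.notebook_dir = normalize_notebook_dir(lesson_dir)
```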

Sorry if the post is getting too long, I know it’s a lot of information.
So, answering the first question:

It’s not in the token itself, but in the body of the POST request to spawn a new container.

I haven’t reached any part where I need a particular instructor to deal with certain notebooks, but I guess one thing you can do is creating a list of relationships between instructor and notebooks in your LMS, so these relationships are handled upstream rather than in the notebooks themselves.

Also, related to that topic, one idea I had for syncing completion status (for example) between the notebook and the LMS is passing the lesson_id (that’s why you could see it in the earlier examples). A user marks something as completed in the notebook, and that change shows up in the LMS.

The lesson_id is passed to the spawner, which sets it as an env variable. Then, an extension can send a request to my endpoint to modify that value in the LMS database. It is a bit convoluted, so you might prefer an easier alternative (I just couldn’t think of any :sweat_smile: )
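A sketch of what that extension-side call could look like. The LMS endpoint and payload shape here are entirely hypothetical; the only part taken from the setup above is that LESSON_ID comes from the environment the spawner set:

```python
import json
import os
from urllib import request


def build_completion_update(lms_endpoint, status="completed"):
    """Build the request a server-side extension could send to the LMS
    to update the completion status of the current lesson.

    LESSON_ID is read from the environment set by the spawner; the
    endpoint URL and payload shape are made up for illustration.
    """
    payload = {"lesson_id": os.environ["LESSON_ID"], "status": status}
    return request.Request(
        lms_endpoint,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )


os.environ.setdefault("LESSON_ID", "7")  # normally set by CustomSpawner
req = build_completion_update("https://lms.example.com/api/progress")
```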

I think that’s pretty much everything I could think of. Not sure if this was more confusing (I hope not!)

Let me know how that goes, and good luck!


@IvanYingX Thanks. Great stuff. I’m looking forward to going through it once I get my cluster set up and spawning notebooks correctly (hoping for expert help in Spawn failed: Timeout even when start_timeout is set to 3600 seconds - #18 by danielmcquillen)


@IvanYingX My (poor) understanding is that the default workflow for the second part of the LTIv1.3 interaction with JupyterHub would be to use the LTIAuthenticator’s oauth_callback endpoint, as explained here: LMS Integration — LTI Authenticator for JupyterHub documentation

My go-to video for trying to understand the LTIv1.3 OIDC steps is the tutorial video by Claude Vervoort.

Just curious: why did you instead choose to make an API call to a different endpoint in JupyterHub rather than make the final redirect call back to https://<my-hub.domain.com>/hub/lti13/oauth_callback?

Since the end result of an LTI connection to JupyterHub as a “tool” would be to show a particular notebook to a particular student, I would have thought the LTIAuthenticator plugin would handle taking the student to a particular notebook as part of that oauth_callback method.

However, I can’t seem to find anywhere in the LTIAuthenticator docs what information needs to be sent as part of the call to JupyterHub’s oauth_callback to show the intended notebook.

Seems strange that you have to diverge from the usual LTIv1.3 workflow and implement it yourself in a custom API call. But I could be misunderstanding all kinds of things!

@martinclaus can you confirm that LTIAuthenticator does indeed have logic to open and show a particular notebook as part of its handling requests to the oauth_callback endpoint?

I see this block of code in the LTI13CallbackHandler:

    def get_next_url(self, user=None):
        """Get the redirect target from the state field"""
        state = self._get_state_from_url()
        next_url = _deserialize_state(state).get("next_url")
        if next_url:
            return next_url
        # JupyterHub 0.8 adds default .get_next_url for a fallback
        return super().get_next_url(user)

…but I’m not sure if “next_url” is meant to point to a notebook (or notebook-spawning URL) and, if so, how “next_url” is introduced into the “state” part of the OIDC workflow between platform (my LMS) and tool (JupyterHub). Thanks for any thoughts!!
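For what it’s worth, the OAuth state in this flow is packed as URL-safe base64-encoded JSON, so the round trip can be sketched like this (a simplification of the actual `_serialize_state`/`_deserialize_state` helpers, not their exact implementation):

```python
import base64
import json


def serialize_state(state):
    """Sketch: pack a state dict as URL-safe base64 of its JSON form."""
    return base64.urlsafe_b64encode(json.dumps(state).encode("utf8")).decode("ascii")


def deserialize_state(b64_state):
    """Sketch: reverse of serialize_state."""
    return json.loads(base64.urlsafe_b64decode(b64_state))


# If next_url ends up in the state, get_next_url() above will honor it:
state = serialize_state(
    {"next_url": "/hub/user-redirect/lab/tree/lessons/Dictionaries/Notebook.ipynb"}
)
```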


Heya @danielmcquillen! Not sure if I understand the question correctly, but if I do, then you are right: the oauth_callback can handle taking a user to a specific notebook. In fact, you can use LTI Deep Linking by adding a claim to the token like this:
jwt["https://purl.imsglobal.org/spec/lti/claim/message_type"] = "LtiDeepLinkingRequest";
Instead of:
jwt["https://purl.imsglobal.org/spec/lti/claim/message_type"] = "LtiResourceLinkRequest";

And then you can add the deep linking URL under:
https://purl.imsglobal.org/spec/lti-dl/claim/deep_linking_settings.
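To make those two message types concrete, here is a minimal sketch of the claims involved. The claim URIs are the standard ones from the IMS specs; the deep_linking_settings fields shown are a typical subset, and JWT signing/encoding is omitted:

```python
# Standard LTI 1.3 claim URIs (from the IMS LTI Core and Deep Linking specs).
MESSAGE_TYPE = "https://purl.imsglobal.org/spec/lti/claim/message_type"
DL_SETTINGS = "https://purl.imsglobal.org/spec/lti-dl/claim/deep_linking_settings"


def deep_linking_claims(return_url):
    """Claims for a deep linking request, where the user picks a resource."""
    return {
        MESSAGE_TYPE: "LtiDeepLinkingRequest",
        DL_SETTINGS: {
            # Where the tool posts the chosen resource back to the platform.
            "deep_link_return_url": return_url,
            "accept_types": ["ltiResourceLink"],
            "accept_presentation_document_targets": ["iframe", "window"],
        },
    }


def resource_link_claims():
    """Claims for a plain resource link launch (the usual case)."""
    return {MESSAGE_TYPE: "LtiResourceLinkRequest"}
```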

However, there is one thing I noticed when doing it that way: the user can still navigate through parent directories. So let’s say, for example, that in your LMS you want the user to only access one lesson at a time, but you have this tree in your repo:

lessons
|
|----Python Basics
            |
            |--- Sets
            |       |--- Notebook.ipynb
            |       |--- Test.txt
            |
            |---- Dictionaries
                    |--- Notebook.ipynb
                    |--- Test.txt               

If you want the user to only see the notebooks inside lessons/Python Basics/Dictionaries, you should block any access to the parent directories.

I noticed that the way to set a notebook directory is by sending the details to the spawner and modifying the spawner’s notebook_dir, whereas Deep Linking will redirect you to the directory but will still allow the user to “move up” from that directory.

Just to make sure we are on the same page: the call I am making (the POST to the JupyterHub API) does not open JupyterHub, it “simply” spawns a container for a user. The actual workflow for opening JupyterHub in the LMS is the LTI launch itself.

Once again, not sure if that was the actual question, if not let me know :slight_smile:


Hi @IvanYingX . Back again seeking more guidance! :star_struck:

I think I grok your basic approach: 1) spawning the server first via a direct JupyterHub API call, then 2) launching that notebook through a traditional LTI1.3 ‘callback’ URL.

However, what’s still unclear to me is where in the LTI1.3 JWT you refer to the exact notebook you spawned behind the scenes. You mentioned LtiDeepLinkingRequest, but I don’t think you’re actually using that, are you? (I thought that was more for workflows where the user needs to pick a resource from many options.) I guess another way to ask this would be: given the example above, where does the string “Dictionaries” appear in your JWT?

I would have thought the standard LTI1.3 target_link_uri field would be what you used.

But I’m still trying to figure out if LTIAuthenticator expects one to use this field or not, and automatically gets a particular notebook based on the value of this field and some logic or rules, or whether you needed a different approach to launch the particular notebook you spawned for the user in the previous step.

BTW, I would think using the JupyterHub API directly with a stored token is a tighter coupling than what LTI1.3 would promise by itself, e.g. the setup for an individual LMS would have to be more customized and involve more data entry (the JupyterHub API key) than a standard LTI connection UI. I wonder how Moodle or Open edX would be able to connect to JupyterHub using just LTI1.3 (because it’s missing that extra API bit). That’s why I’m surprised the LTIAuthenticator doesn’t do all of this for you. But again, still learning…

Many thanks for your generous sharing! I hope to contribute back some useful thoughts at some point, rather than keep bumbling around blindly.


Answering my own question: LTIAuthenticator indeed respects the “target_link_uri” key in the ID token, as shown in this block of code from its LTI13CallbackHandler class:

    async def post(self):
        """
        Overrides the upstream post handler.
        """
        try:
            id_token = self.decode_and_validate_launch_request()
        except InvalidAudienceError as e:
            raise HTTPError(401, str(e))
        except ValidationError as e:
            raise HTTPError(400, str(e))

        try:
            user = await self.login_user(id_token)
        except LoginError as e:
            raise HTTPError(400, str(e))
        self.log.debug(f"user logged in: {user}")
        if user is None:
            raise HTTPError(403, "User missing or null")
        await self.redirect_to_next_url(user)

…where “next_url” is set from “target_link_uri” in another part of the code (if the platform hasn’t included “next_url” in the initial request).
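In other words, the callback’s redirect target can be derived from the standard claim; a distilled, hypothetical version (the claim URI is the standard LTI 1.3 one, the fallback URL is made up):

```python
# Standard LTI 1.3 claim naming the launch target (from the IMS spec).
TARGET_LINK_URI = "https://purl.imsglobal.org/spec/lti/claim/target_link_uri"


def next_url_from_id_token(id_token, fallback="/hub/home"):
    """Return the launch's target_link_uri claim, or a fallback if absent."""
    return id_token.get(TARGET_LINK_URI) or fallback


# A launch whose target is a specific notebook location (hypothetical URL):
launch = {TARGET_LINK_URI: "https://myhub.example.com/hub/user-redirect/lab"}
```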

But it’s still unclear why I would need to “pre-spin up” notebooks via a more tightly coupled direct API call after doing the tool login request but before doing the tool launch callback. If we’re sticking with the LTIv1.3 standard, the tool should do that during the final launch (otherwise, how would LTIAuthenticator work with more generic installations like Moodle or Open edX?).

@IvanYingX you probably thought of this and the API call is still required… I’ll just have to discover the reasons and slowly prove it to myself :laughing: I think I understand that your main motivation is to prevent students from moving through the directory tree, but I’m not sure if that’s the only reason.
