Notebook spawned in Yarn but killed after timeout

Hi all,
I’ve followed the instructions above as closely as I can. Namely, on the Hub side, I modified

  • yarnspawner/singleuser.py as per post #3
  • jupyterhub/spawner.py as specified in post #26.

I’ve also followed the instructions in PR #23 from Dec 22, as per post #32. I still can’t get the notebook (i.e. the JupyterLab in the container) to start up properly. The container itself starts up but shows the following errors in the YARN log:

a) an error in jupyterhub/services/auth.py at line 460/472: "unexpected keyword argument ‘json’"
b) the log says "Jupyter Server 1.15.6 is running at: http://xxx:yyy/user/jupyterhub-sandbox/lab" (I’m using HDFS for user files), but when I copied that URL and tried to access it, I got a 404 error
c) the spawn timed out after the configured expiry time of 300 seconds (5 minutes), and the container was terminated because it did not start in time.

This sounds silly but I see no option to attach text files… or I would have posted the config, edited code and logs…

I’m sorry @bigtapp_quek but the posts here don’t seem to have numbers. Can you use the permalinks instead? You can just insert your config, etc. into the post itself. That’s what I did. It makes for a long post but it works.

Not to be a downer, but I will tell you that I was unable to replicate elsewhere the setup that worked for me in a test site and I have given up on running JupyterHub over Hadoop. It seems clear that the code is not maintained and I’m frankly surprised that the functionality is still shown as supported in the documentation.

Let me try. I’ll respond on my own message.

Hi, let me try. The config is at: jupyterhub_config_20240216 - Pastebin.com. I’ve cut out the defaults and commented-out values. If there is any value that you need, I can post it here.

The Yarn container log is at: container_e80_1706853697397_0119_01_000001 - Google Drive (3MB in size)

As per my point a) above, I’ve modified it as per PR 23

        self.io_loop.add_callback(self.hub_auth._api_request, method='POST',
                                  url=url_path_join(self.hub_api_url, 'yarnspawner'),
                                  body=json.dumps({'port': self.port}))

Can someone assist? Appreciate it @ajs6f

For starters, you are using Jupyter Server 1.x, which might not work well with JupyterHub 4.x. If possible, try to upgrade your JupyterLab to 4.x, which will upgrade your Jupyter Server to 2.x.

Where did you make this change?

The error is coming from json.dumps({'port': self.port}) which is being used without importing.

Hi @mahendrapaipuri, I will take note of the Jupyter Server version. Let me try to upgrade it. The change that you mentioned is from PR #23, mentioned in post 31 above.

Also here is the list of jupyter* stuff installed via conda.

jupyter-hdfscm            0.2.0            py39hf3d152e_3    conda-forge
jupyter-server-mathjax    0.2.6              pyh5bfe37b_1    conda-forge
jupyter-server-proxy      4.1.0              pyhd8ed1ab_0    conda-forge
jupyter_client            8.6.0              pyhd8ed1ab_0    conda-forge
jupyter_core              5.7.1            py39hf3d152e_0    conda-forge
jupyter_server            1.15.6             pyhd8ed1ab_1    conda-forge
jupyter_telemetry         0.1.0              pyhd8ed1ab_1    conda-forge
jupyterhub                4.0.2              pyh31011fe_0    conda-forge
jupyterhub-base           4.0.2              pyh31011fe_0    conda-forge
jupyterhub-ldapauthenticator 1.3.2                      py_0    conda-forge
jupyterhub-yarnspawner    0.4.0            py39hf3d152e_2    conda-forge
jupyterlab                3.3.4              pyhd8ed1ab_0    conda-forge
jupyterlab-git            0.44.0             pyhd8ed1ab_0    conda-forge
jupyterlab_server         2.16.6             pyhd8ed1ab_0    conda-forge
jupyterlab_widgets        3.0.9              pyhd8ed1ab_0    conda-forge

Instead can you try this patch

def start(self):
    self.oauth_callback_handler_class = HubOAuthCallbackHandler

    # Tell the Hub's yarnspawner API endpoint which port this server uses.
    hub_auth = HubOAuth()
    url = url_path_join(hub_auth.api_url, "yarnspawner")
    headers = {"Authorization": f"token {hub_auth.api_token}"}
    r = requests.post(
        url,
        headers=headers,
        json={"port": self.port},
    )
    super().start()

And also update your JupyterLab to 4.x

I’ve patched what you suggested on the jupyterhub side, not the node/notebook. The one I patched was on the node/notebook. Do I need to patch both or only yours on the jupyterhub hub side? On the node/notebook side, I tried the patch you suggested but it cannot find HubOAuth IIRC as the Hub class is not imported into the node/notebook side.

You will need to apply the patch I suggested to yarnspawner/singleuser.py, as in this post. This singleuser.py is used only in the single-user server (notebook) when the notebook starts. Nothing needs to be changed on the JupyterHub side. Sorry for not being clearer!

In singleuser.py, add the import "from jupyterhub.services.auth import HubOAuth" before using HubOAuth.

Thanks. I was also puzzled, but let me try yours. BTW, I’ve installed Jupyter Server version 2.16, as seen in the output above, so why does the error (in the container log) tell me I’m using 1.15?!?

jupyter_server is the backend server of JupyterLab/notebook, and it is the one that needs to be bumped to 2.x. I think you are referring to jupyterlab_server, which is a different package.
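
To check which of the two similarly named packages is at which version inside the environment the container actually runs, a small script like the following may help. This is only a minimal sketch: the helper name installed_versions is made up for illustration, and it relies solely on the standard-library importlib.metadata module (Python 3.8+).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # jupyter_server is the backend; jupyterlab_server is a separate package.
    report = installed_versions(["jupyter_server", "jupyterlab_server"])
    for name, ver in report.items():
        print(f"{name}: {ver or 'not installed'}")
```

Running it with the same Python interpreter that launches the single-user server should show the two versions side by side.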

Oooh thanks for the info.
Edit: I’m not the one doing the conda packaging, but I remember it may be due to jupyter-hdfscm, which I think is only compatible up to version 5 of the Jupyter notebook. I got an error about the HDFSCM class not being found when using the latest versions of the notebook, server, etc.

If you cannot upgrade to JupyterLab 4.x and jupyter_server 2.x, make sure you set the JUPYTERHUB_SINGLEUSER_EXTENSION=0 env variable in your single-user server’s config. This makes the singleuser implementation use legacy mode instead of the jupyter_server extension mode, which is only compatible with jupyter_server 2.x.
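
One way to set that variable might be through the spawner's environment option in jupyterhub_config.py. This is a sketch under an assumption that has not been verified in this thread: that YarnSpawner, which builds on JupyterHub's base Spawner, forwards the base Spawner's environment setting into the YARN container.

```python
# jupyterhub_config.py (fragment; `c` is provided by JupyterHub when it loads this file)
# Assumption: YarnSpawner passes Spawner.environment through to the container.
c.Spawner.environment = {
    # Use legacy single-user mode instead of the jupyter_server 2.x extension mode.
    "JUPYTERHUB_SINGLEUSER_EXTENSION": "0",
}
```

If other environment variables are already configured there, the new key would need to be merged into the existing dictionary rather than replacing it.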

Thanks. Let me try with this as well. Thanks everyone!

Hey @mahendrapaipuri, I managed to get everything working using the following patch (jupyter_labhub.py) with JupyterHub 4.1.5 and JupyterLab 4.2.4:

import os
import json
import requests

from jupyterhub.services.auth import HubOAuthCallbackHandler
from jupyterhub.services.auth import HubOAuth
from jupyterhub.utils import random_port, url_path_join
from traitlets import default

try:
    from jupyterlab.labhubapp import SingleUserLabApp
except ImportError:
    raise ImportError("You must have jupyterlab installed for this to work")

# Borrowed and modified from jupyterhub/batchspawner:
# https://github.com/jupyterhub/batchspawner/blob/d1052385f245a3c799c5b81d30c8e67f193963c6/batchspawner/singleuser.py
class YarnSingleUserNotebookApp(SingleUserLabApp):
    @default('port')
    def _port(self):
        return random_port()

    def start(self):
        self.oauth_callback_handler_class = HubOAuthCallbackHandler

        hub_auth = HubOAuth()
        url = url_path_join(hub_auth.api_url, "yarnspawner")
        headers = {"Authorization": f"token {hub_auth.api_token}"}
        r = requests.post(
            url,
            headers=headers,
            json={"port": self.port},
        )
        super().start()


def main(argv=None):
    # Set configuration directory to something local if not already set
    for var in ['JUPYTER_RUNTIME_DIR', 'JUPYTER_DATA_DIR']:
        if os.environ.get(var) is None:
            if not os.path.exists('.jupyter'):
                os.mkdir('.jupyter')
            os.environ[var] = './.jupyter'
    for var in ['JUPYTERHUB_OAUTH_ACCESS_SCOPES', 'JUPYTERHUB_OAUTH_SCOPES']:
        if os.environ.get(var):
            os.environ[var] = json.dumps(var)
    return YarnSingleUserNotebookApp.launch_instance(argv)


if __name__ == "__main__":
    main()

I converted the environment variables JUPYTERHUB_OAUTH_ACCESS_SCOPES and JUPYTERHUB_OAUTH_SCOPES into JSON-formatted strings using json.dumps. This way the relevant function in jupyterhub/services/auth.py was able to load the environment variables using json.loads. Without changing the environment variables in jupyter_labhub.py, the following error occurs:

        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1)

Additionally, I had to change this line directly in auth.py:

from:

            scopes = self.hub_auth.check_scopes(self.hub_scopes, model)

to:

            scopes = self.hub_scopes

I did not find out why, but the function check_scopes in auth.py did not work despite having the correct scopes object in self.hub_scopes. Do you know why?
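
One possible explanation, offered as a guess rather than a confirmed diagnosis: in the main() loop of the patch above, json.dumps is called on var, which holds the variable's name, not on the value read from os.environ. The sketch below uses an invented scope value to show the difference. Whether a set of single characters is really what ends up reaching check_scopes has not been verified against JupyterHub itself.

```python
import json

name = "JUPYTERHUB_OAUTH_ACCESS_SCOPES"
# Hypothetical scope list, standing in for whatever the Hub would send.
scopes = ["access:servers!user=someone"]

# What the loop stores: the JSON encoding of the variable's *name*.
stored = json.dumps(name)
parsed = set(json.loads(stored))
# json.loads returns a str here, and set() of a str is a set of characters.
print(sorted(parsed)[:5])

# Encoding the list of scopes instead round-trips to the same scopes.
round_tripped = set(json.loads(json.dumps(scopes)))
print(round_tripped)
```

So the JSONDecodeError disappears because the stored string is valid JSON, but the parsed result no longer contains any real scope names, which could make a later scope comparison fail.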

JupyterLab environment: CommunityLab/collections/ansible_collections/jupyter/lab/roles/setup/files/jupyterlabenvironment.txt at 0a21eb7f052bb4dde56e59fa7137c20ab1ee30cc · GeorgSchulz/CommunityLab · GitHub

All the changes I needed to make, using Ansible:

scopes = self.hub_scopes

I don’t think this is correct. You are effectively skipping the scopes check by doing so. The problem with the env vars JUPYTERHUB_OAUTH_ACCESS_SCOPES and JUPYTERHUB_OAUTH_SCOPES is the lack of “extra” quotes. See this comment: you need to enclose the string in another set of quotes. The base spawner already dumps these env vars before setting them in the environment of the single-user server.
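
To illustrate the quoting problem, here is a small sketch. It assumes the value passes through a POSIX shell on its way into the container (not confirmed for YARN in this thread) and uses shlex.split from the standard library to imitate shell word splitting. The scope strings are invented examples.

```python
import json
import shlex

# Invented example scopes; the Hub serialises the real list with json.dumps.
raw = json.dumps(["access:servers!user=someone", "read:users:name!user=someone"])

# Without extra quoting, shell-style parsing removes the double quotes.
unquoted = " ".join(shlex.split(raw))
print(unquoted)

error = None
try:
    json.loads(unquoted)
except json.JSONDecodeError as exc:
    error = str(exc)
    print("unquoted value fails:", error)

# With an extra layer of quotes, the JSON text survives intact.
quoted = shlex.quote(raw)
(survived,) = shlex.split(quoted)
print(json.loads(survived))
```

The failing case produces "Expecting value: line 1 column 2 (char 1)", the same message reported in this thread, which is consistent with (though not proof of) the inner quotes having been stripped before the single-user server reads the variable.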

I tried to enclose the string of the env vars JUPYTERHUB_OAUTH_ACCESS_SCOPES and JUPYTERHUB_OAUTH_SCOPES in another set of quotes using the approach in this comment. But that did not work; this error still occurs:

        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1)

Maybe the function check_scopes is failing because the value for my env var JUPYTERHUB_OAUTH_CLIENT_ALLOWED_SCOPES is empty.

Could you give the stack trace of this error? Where is it coming from? I would suggest patching YarnSpawner as in this comment instead of patching SingleUserLabApp, because the problem is with YarnSpawner.

The stack trace points to line 460 of jupyterhub/services/auth.py, where env_scopes is parsed with json.loads from the env var JUPYTERHUB_OAUTH_ACCESS_SCOPES.

    @default('access_scopes')
    def _default_scopes(self):
        env_scopes = os.getenv('JUPYTERHUB_OAUTH_ACCESS_SCOPES')
        if not env_scopes:
            # deprecated name (since 3.0)
            env_scopes = os.getenv('JUPYTERHUB_OAUTH_SCOPES')
        if env_scopes:
            return set(json.loads(env_scopes))

This is the stack trace:

[E 2024-08-03 08:45:35.447 YarnSingleUserNotebookApp web:1875] Uncaught exception GET /user/gschulz/lab? (::ffff:95.217.23.222)
    HTTPServerRequest(protocol='https', host='hub1.click-your-it.de', method='GET', uri='/user/gschulz/lab?', version='HTTP/1.1', remote_ip='::ffff:95.217.23.222')
    Traceback (most recent call last):
      File "/opt/miniforge/miniforge/envs/jupyterlab/lib/python3.9/site-packages/tornado/web.py", line 1769, in _execute
        result = await result  # type: ignore
      File "/opt/miniforge/miniforge/envs/jupyterlab/lib/python3.9/site-packages/jupyter_server/base/handlers.py", line 620, in prepare
        _user = self.identity_provider.get_user(self)
      File "/opt/miniforge/miniforge/envs/jupyterlab/lib/python3.9/site-packages/jupyter_server/auth/identity.py", line 703, in get_user
        user = self.login_handler_class.get_user(handler)  # type:ignore[attr-defined]
      File "/opt/miniforge/miniforge/envs/jupyterlab/lib/python3.9/site-packages/jupyterhub/singleuser/mixins.py", line 125, in get_user
        return handler.get_current_user()
      File "/opt/miniforge/miniforge/envs/jupyterlab/lib/python3.9/site-packages/jupyterhub/services/auth.py", line 1408, in get_current_user
        self._hub_auth_user_cache = self.check_hub_user(user_model)
      File "/opt/miniforge/miniforge/envs/jupyterlab/lib/python3.9/site-packages/jupyterhub/services/auth.py", line 1335, in check_hub_user
        if self.allow_all:
      File "/opt/miniforge/miniforge/envs/jupyterlab/lib/python3.9/site-packages/jupyterhub/services/auth.py", line 1270, in allow_all
        self.hub_scopes is None
      File "/opt/miniforge/miniforge/envs/jupyterlab/lib/python3.9/site-packages/jupyterhub/services/auth.py", line 1262, in hub_scopes
        return self.hub_auth.access_scopes or None
      File "/opt/miniforge/miniforge/envs/jupyterlab/lib/python3.9/site-packages/traitlets/traitlets.py", line 687, in __get__
        return t.cast(G, self.get(obj, cls))  # the G should encode the Optional
      File "/opt/miniforge/miniforge/envs/jupyterlab/lib/python3.9/site-packages/traitlets/traitlets.py", line 635, in get
        default = obj.trait_defaults(self.name)
      File "/opt/miniforge/miniforge/envs/jupyterlab/lib/python3.9/site-packages/traitlets/traitlets.py", line 1897, in trait_defaults
        return t.cast(Sentinel, self._get_trait_default_generator(names[0])(self))
      File "/opt/miniforge/miniforge/envs/jupyterlab/lib/python3.9/site-packages/traitlets/traitlets.py", line 1241, in __call__
        return self.func(*args, **kwargs)
      File "/opt/miniforge/miniforge/envs/jupyterlab/lib/python3.9/site-packages/jupyterhub/services/auth.py", line 460, in _default_scopes
        return set(json.loads(env_scopes))
      File "/opt/miniforge/miniforge/envs/jupyterlab/lib/python3.9/json/__init__.py", line 346, in loads
        return _default_decoder.decode(s)
      File "/opt/miniforge/miniforge/envs/jupyterlab/lib/python3.9/json/decoder.py", line 337, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "/opt/miniforge/miniforge/envs/jupyterlab/lib/python3.9/json/decoder.py", line 355, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1)

This occurs when patching spawner.py of YarnSpawner like here and also patching singleuser.py of YarnSpawner like here.

In my case it does not occur when using json.dumps on the env var JUPYTERHUB_OAUTH_ACCESS_SCOPES in singleuser.py of yarnspawner.