Notebook spawned in Yarn but killed after timeout

I’m afraid not. It is a fairly complex environment using Ambari-managed Hadoop. I don’t really know how I would containerize it and I can no longer afford to spend the time experimenting. I’ll have to move on and perhaps JupyterHub is something I can pursue again later for our site. My whole interest in YarnSpawner was to avoid having to build up a completely different and independent infrastructure using container tech alongside the extensive Hadoop investment we have already made, but that doesn’t seem possible.

I would like to suggest that the YarnSpawner repo be marked as unmaintained and the documentation be updated to say so. I’ve spent quite a lot of time on this with no results and it seems that YarnSpawner doesn’t work now, or at least not without JupyterHub knowledge well beyond the newbie stage. I’ll make that suggestion in a separate thread.

Fair enough!! As a final try, could you make sure that you have notebook and jupyterlab installed in your single-user environments? Also try to spawn a JupyterLab interface rather than the classic notebook, using c.YarnSpawner.default_url = '/lab', to see if it changes anything.

Holy moly! It did indeed change things. Now the missing cookie-token is being found! But there is another problem… Here’s what I’m seeing now (below). It looks like auth.py is looking for a value in the JSON of the default scope env var JUPYTERHUB_OAUTH_ACCESS_SCOPES that isn’t there? (Please be aware that some of the line numbers in this stacktrace may be a few off from the shared code because of logging statements I have inserted.)

[D 2023-07-19 11:22:38.469 YarnSingleUserNotebookApp auth:746] token found
[D 2023-07-19 11:22:38.469 YarnSingleUserNotebookApp auth:676] got user_model from cookie: {'name': 'sorokaa', 'admin': False, 'kind': 'user', 'groups': [], 'session_id': '2aa3b4492f3d40fb9127eb8b964f6d21', 'scopes': ['access:servers!server=sorokaa/', 'read:users:groups!user=sorokaa', 'read:users:name!user=sorokaa']}
[D 2023-07-19 11:22:38.470 YarnSingleUserNotebookApp auth:368] env_scopes = os.getenv('JUPYTERHUB_OAUTH_ACCESS_SCOPES' resulted in [access:servers!server=sorokaa/, access:servers!user=sorokaa]
[E 2023-07-19 11:22:38.470 YarnSingleUserNotebookApp http1connection:67] Uncaught exception
    Traceback (most recent call last):
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/tornado/http1connection.py", line 273, in _read_message
        delegate.finish()
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/tornado/httpserver.py", line 387, in finish
        self.delegate.finish()
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/tornado/routing.py", line 268, in finish
        self.delegate.finish()
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/tornado/web.py", line 2290, in finish
        self.execute()
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/tornado/web.py", line 2309, in execute
        self.handler = self.handler_class(
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/tornado/web.py", line 227, in __init__
        self.clear()
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/tornado/web.py", line 328, in clear
        self.set_default_headers()
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/jupyter_server/base/handlers.py", line 314, in set_default_headers
        elif self.token_authenticated and "Access-Control-Allow-Origin" not in self.settings.get(
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/jupyter_server/base/handlers.py", line 159, in token_authenticated
        return self.login_handler.is_token_authenticated(self)
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/jupyterhub/singleuser/mixins.py", line 99, in is_token_authenticated
        handler.get_current_user()
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/jupyterhub/services/auth.py", line 1165, in get_current_user
        self._hub_auth_user_cache = self.check_hub_user(user_model)
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/jupyterhub/services/auth.py", line 1090, in check_hub_user
        if self.allow_all:
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/jupyterhub/services/auth.py", line 1036, in allow_all
        self.hub_scopes is None
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/jupyterhub/services/auth.py", line 1028, in hub_scopes
        return self.hub_auth.access_scopes or None
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/traitlets/traitlets.py", line 703, in __get__
        return self.get(obj, cls)
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/traitlets/traitlets.py", line 659, in get
        default = obj.trait_defaults(self.name)
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/traitlets/traitlets.py", line 1872, in trait_defaults
        return self._get_trait_default_generator(names[0])(self)
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/traitlets/traitlets.py", line 1233, in __call__
        return self.func(*args, **kwargs)
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/jupyterhub/services/auth.py", line 374, in _default_scopes
        return set(json.loads(env_scopes))
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/json/__init__.py", line 346, in loads
        return _default_decoder.decode(s)
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/json/decoder.py", line 337, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "/cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/json/decoder.py", line 355, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1)

Actually, now that I look more closely, it looks like the value in that env var: JUPYTERHUB_OAUTH_ACCESS_SCOPES = [access:servers!server=sorokaa/, access:servers!user=sorokaa] really isn’t valid JSON: it’s missing some quotes around the terms…
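
As a quick check of that reading, and assuming the variable reaches the single-user server exactly as the debug line above prints it, both forms can be fed to the standard-library JSON parser. The `parse_scopes` helper is made up for this sketch; the real parsing happens in JupyterHub's auth.py, as the traceback shows.

```python
import json

# The value exactly as it appears in the debug log: no quotes around the scopes.
logged_value = "[access:servers!server=sorokaa/, access:servers!user=sorokaa]"

# What a JSON-encoded list of the same two scopes would look like.
expected_value = json.dumps(
    ["access:servers!server=sorokaa/", "access:servers!user=sorokaa"]
)


def parse_scopes(raw):
    """Hypothetical helper mirroring set(json.loads(...)) from the traceback."""
    return set(json.loads(raw))


try:
    parse_scopes(logged_value)
    error_message = None
except json.JSONDecodeError as exc:
    error_message = str(exc)

# The unquoted form fails with the same message as the last traceback line.
print(error_message)
# The properly quoted form parses into the two scopes.
print(sorted(parse_scopes(expected_value)))
```

The unquoted form fails at the first character after the opening bracket, which is consistent with "line 1 column 2 (char 1)" in the traceback.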

That's great news. So, notebook was missing from the single-user environment? If so, we need to raise an issue with the core developers, as the error is quite misleading.

Well, the env var JUPYTERHUB_OAUTH_ACCESS_SCOPES holds a JSON-encoded list of strings. I think this issue must be Hadoop-specific, as the env vars are set here, and you need to make sure that this list of strings is properly formatted.

Excellent, anything YarnSpawner-specific is a much smaller zone for fixes! I will get to that right away and see what I can figure out. Do you think it’s worth waiting to fix everything I can fix before filing that issue about the absence of notebook? I want to make sure I can give the best possible picture to the devs.

Thank you again @mahendrapaipuri for all the help you have been giving me for this!

Hello @mahendrapaipuri , I can report success!

With one more edit I was able to successfully launch and connect to a JupyterLab notebook (if that is the correct terminology). In YarnSpawner’s spawner.py I overrode the default value for JUPYTERHUB_OAUTH_ACCESS_SCOPES as follows:

    @default("oauth_access_scopes")
    def _default_access_scopes(self):
        self.log.debug("Now using YarnSpawner oauth_access_scopes override")
        return [
            f'"access:servers!server={self.user.name}/{self.name}"',
            f'"access:servers!user={self.user.name}"',
        ]

This overrides the out-of-the-box default value from Spawner in JupyterHub itself, which is:

    @default("oauth_access_scopes")
    def _default_access_scopes(self):
        return [
            f"access:servers!server={self.user.name}/{self.name}",
            f"access:servers!user={self.user.name}",
        ]

Notice the quotation marks I have added. Now, I can’t begin to tell you why my setup is “picky” about this. I don’t think it has anything to do with Hadoop, or at least I cannot see why it would. That having been said, that’s what worked.

I would very much like to contribute these fixes back to JupyterHub. Do you think the best move is to open issues and PRs on YarnSpawner? Or is there a better way to get the attention of someone who can take responsibility for merging fixes?

Thank you again for all your help!

Well, the JupyterHub spawner exports this env var as a string after serializing the list to JSON. So, generally, you should not need to escape the quotes again. I thought it might be some sort of side effect of the skein library that YarnSpawner uses to spawn single-user servers.
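
Neither explanation is confirmed in this thread. One way to see why the extra quotes could help is to assume that some layer between the Hub and the container writes the variable as `export NAME="value"` without escaping the inner double quotes, and that a POSIX shell then performs one round of quote removal. The sketch below emulates only that assumption with `shlex`; the `through_unescaped_export` helper and the user name `alice` are made up, and nothing here is taken from skein or Hadoop source.

```python
import json
import shlex


def through_unescaped_export(value):
    """Hypothetical: emulate `export X="<value>"` as read by a POSIX shell
    when the inner double quotes were NOT escaped by whoever wrote the line."""
    tokens = shlex.split('"' + value + '"')
    if len(tokens) != 1:
        raise ValueError("value was split into several shell words")
    return tokens[0]


plain = ["access:servers!server=alice/", "access:servers!user=alice"]
# Workaround from this thread: wrap every scope in literal double quotes.
quoted = ['"' + scope + '"' for scope in plain]

plain_seen = through_unescaped_export(json.dumps(plain))
quoted_seen = through_unescaped_export(json.dumps(quoted))

print(plain_seen)   # quotes are gone, so this is no longer JSON
print(quoted_seen)  # one layer of quotes survives
print(json.loads(quoted_seen) == plain)
```

Under that assumption the plain list arrives in the same unquoted shape as the logged value, while the doubly quoted list arrives as valid JSON. If the assumption holds, escaping at the layer that writes the environment would be a cleaner fix than quoting the scopes themselves.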

Yes, you can submit a PR for YarnSpawner and hope someone will respond.

[I 2023-07-11 14:21:58.981 YarnSingleUserNotebookApp mixins:609] Starting jupyterhub single-user server version 3.1.1
[I 2023-07-11 14:21:58.981 YarnSingleUserNotebookApp mixins:623] Extending __main__.YarnSingleUserNotebookApp from __main__ 
[I 2023-07-11 14:21:58.981 YarnSingleUserNotebookApp mixins:623] Extending jupyter_server.serverapp.ServerApp from jupyter_server 1.23.6
[D 2023-07-11 14:21:58.999 YarnSingleUserNotebookApp application:190] Searching ['/home/.jupyter', '/home/.local/etc/jupyter', '/cm/shared/datalake/jupyterhub/miniconda/etc/jupyter', '/usr/local/etc/jupyter', '/etc/jupyter'] for config files
[D 2023-07-11 14:21:58.999 YarnSingleUserNotebookApp application:902] Looking for jupyter_config in /etc/jupyter
[D 2023-07-11 14:21:58.999 YarnSingleUserNotebookApp application:902] Looking for jupyter_config in /usr/local/etc/jupyter
[D 2023-07-11 14:21:59.000 YarnSingleUserNotebookApp application:902] Looking for jupyter_config in /cm/shared/datalake/jupyterhub/miniconda/etc/jupyter
[D 2023-07-11 14:21:59.000 YarnSingleUserNotebookApp application:902] Looking for jupyter_config in /home/.local/etc/jupyter
[D 2023-07-11 14:21:59.000 YarnSingleUserNotebookApp application:902] Looking for jupyter_config in /home/.jupyter
[D 2023-07-11 14:21:59.000 YarnSingleUserNotebookApp application:902] Looking for jupyter_server_config in /etc/jupyter
[D 2023-07-11 14:21:59.001 YarnSingleUserNotebookApp application:902] Looking for jupyter_server_config in /usr/local/etc/jupyter
[D 2023-07-11 14:21:59.001 YarnSingleUserNotebookApp application:902] Looking for jupyter_server_config in /cm/shared/datalake/jupyterhub/miniconda/etc/jupyter
[D 2023-07-11 14:21:59.001 YarnSingleUserNotebookApp application:902] Looking for jupyter_server_config in /home/.local/etc/jupyter
[D 2023-07-11 14:21:59.001 YarnSingleUserNotebookApp application:902] Looking for jupyter_server_config in /home/.jupyter
[D 2023-07-11 14:21:59.003 YarnSingleUserNotebookApp config_manager:93] Paths used for configuration of jupyter_server_config: 
    	/etc/jupyter/jupyter_server_config.json
[D 2023-07-11 14:21:59.003 YarnSingleUserNotebookApp config_manager:93] Paths used for configuration of jupyter_server_config: 
    	/usr/local/etc/jupyter/jupyter_server_config.json
[D 2023-07-11 14:21:59.003 YarnSingleUserNotebookApp config_manager:93] Paths used for configuration of jupyter_server_config: 
    	/cm/shared/datalake/jupyterhub/miniconda/etc/jupyter/jupyter_server_config.json
[D 2023-07-11 14:21:59.003 YarnSingleUserNotebookApp config_manager:93] Paths used for configuration of jupyter_server_config: 
    	/home/.local/etc/jupyter/jupyter_server_config.json
[D 2023-07-11 14:21:59.004 YarnSingleUserNotebookApp config_manager:93] Paths used for configuration of jupyter_server_config: 
    	/home/.jupyter/jupyter_server_config.json
[I 2023-07-11 14:21:59.080 YarnSingleUserNotebookApp mixins:670] Starting jupyterhub-singleuser server version 3.1.1
[D 2023-07-11 14:21:59.085 YarnSingleUserNotebookApp _version:74] jupyterhub and jupyterhub-singleuser both on version 3.1.1
[I 2023-07-11 14:21:59.086 YarnSingleUserNotebookApp serverapp:2686] Serving notebooks from local directory: /hadoop/yarn/local/usercache/sorokaa/appcache/application_1687889389675_0015/container_1687889389675_0015_01_000001
[I 2023-07-11 14:21:59.086 YarnSingleUserNotebookApp serverapp:2686] Jupyter Server 1.23.6 is running at:
[I 2023-07-11 14:21:59.086 YarnSingleUserNotebookApp serverapp:2686] http://dl-test-01:39199/user/sorokaa/tree/
[I 2023-07-11 14:21:59.086 YarnSingleUserNotebookApp serverapp:2686]  or http://127.0.0.1:39199/user/sorokaa/tree/
[I 2023-07-11 14:21:59.086 YarnSingleUserNotebookApp serverapp:2687] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 2023-07-11 14:21:59.092 YarnSingleUserNotebookApp mixins:591] Updating Hub with activity every 300 seconds
[D 2023-07-11 14:21:59.093 YarnSingleUserNotebookApp mixins:553] Notifying Hub of activity 2023-07-11T18:21:59.014371Z
[I 2023-07-11 14:21:59.240 YarnSingleUserNotebookApp log:186] 302 GET /user/sorokaa/ -> /user/sorokaa/tree/? (@192.168.60.11) 0.64ms
[I 2023-07-11 14:21:59.351 YarnSingleUserNotebookApp log:186] 302 GET /user/sorokaa/ -> /user/sorokaa/tree/? (@::ffff:10.254.127.246) 0.59ms
[I 2023-07-11 14:21:59.399 YarnSingleUserNotebookApp log:186] 302 GET /user/sorokaa/tree/? -> /user/sorokaa/tree? (@::ffff:10.254.127.246) 0.55ms
[D 2023-07-11 14:21:59.440 YarnSingleUserNotebookApp auth:673] No user identified
[D 2023-07-11 14:21:59.441 YarnSingleUserNotebookApp handlers:273] Using contents: services/contents
[D 2023-07-11 14:21:59.476 YarnSingleUserNotebookApp handlers:881] Path favicon.ico served from /cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/jupyter_server/static/favicon.ico
[D 2023-07-11 14:21:59.478 YarnSingleUserNotebookApp handlers:881] Path style/bootstrap.min.css served from /cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/jupyter_server/static/style/bootstrap.min.css
[D 2023-07-11 14:21:59.479 YarnSingleUserNotebookApp handlers:881] Path style/bootstrap-theme.min.css served from /cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/jupyter_server/static/style/bootstrap-theme.min.css
[D 2023-07-11 14:21:59.480 YarnSingleUserNotebookApp handlers:881] Path style/index.css served from /cm/shared/datalake/jupyterhub/miniconda/lib/python3.10/site-packages/jupyter_server/static/style/index.css
[W 2023-07-11 14:21:59.481 YarnSingleUserNotebookApp log:186] 404 GET /user/sorokaa/tree? (@::ffff:10.254.127.246) 42.27ms
[D 2023-07-11 14:21:59.538 YarnSingleUserNotebookApp auth:673] No user identified
[D 2023-07-11 14:21:59.540 YarnSingleUserNotebookApp log:186] 200 GET /user/sorokaa/static/style/bootstrap.min.css?v=0e8a7fbd6de23ad6b27ab95802a0a0915af6693af612bc304d83af445529ce5d95842309ca3405d10f538d45c8a3a261b8cff78b4bd512dd9effb4109a71d0ab (@::ffff:10.254.127.246) 2.96ms
[D 2023-07-11 14:21:59.546 YarnSingleUserNotebookApp auth:673] No user identified
[D 2023-07-11 14:21:59.547 YarnSingleUserNotebookApp auth:673] No user identified
[D 2023-07-11 14:21:59.549 YarnSingleUserNotebookApp log:186] 200 GET /user/sorokaa/static/style/bootstrap-theme.min.css?v=8b2f045cb5b4d5ad346f6e816aa2566829a4f5f2783ec31d80d46a57de8ac0c3d21fe6e53bcd8e1f38ac17fcd06d12088bc9b43e23b5d1da52d10c6b717b22b3 (@::ffff:10.254.127.246) 2.86ms
[D 2023-07-11 14:21:59.550 YarnSingleUserNotebookApp log:186] 200 GET /user/sorokaa/static/style/index.css?v=30372e3246a801d662cf9e3f9dd656fa192eebde9054a2282449fe43919de9f0ee9b745d7eb49d3b0a5e56357912cc7d776390eddcab9dac85b77bdb17b4bdae (@::ffff:10.254.127.246) 2.82ms
[D 2023-07-11 14:21:59.769 YarnSingleUserNotebookApp auth:673] No user identified
[D 2023-07-11 14:21:59.771 YarnSingleUserNotebookApp log:186] 200 GET /user/sorokaa/static/favicon.ico?v=50afa725b5de8b00030139d09b38620224d4e7dba47c07ef0e86d4643f30c9bfe6bb7e1a4a1c561aa32834480909a4b6fe7cd1e17f7159330b6b5914bf45a880 (@::ffff:10.254.127.246) 1.85ms
End of LogType:application.driver.log.This log file belongs to a running container (container_1687889389675_0015_01_000001) and so may not be complete.

@manics Do you happen to know why we get this No user identified error when notebook is missing from the single-user environment? I mean, it is sort of misleading, right? Can we check for the presence of notebook/jupyterlab before starting the single-user server? Cheers!!

Do you happen to know why we get this No user identified error when notebook is missing from the single-user environment? I mean, it is sort of misleading, right?

notebook shouldn’t be required, though you will need something that provides the singleuser server. Notebook or JupyterLab are the common ones, but there are other alternative implementations.

I’ve just tested the latest JupyterHub with only jupyter_server (no notebook, nor jupyterlab), and it’s working. Is it possible this is another bug in yarnspawner?

I agree that there is no “strict” requirement of having notebook or jupyterlab in the single user environment. But when we configure default_url as /tree and there is no notebook in the single user environment, it seems like we might end up in this situation for JupyterHub 3.

I will try to replicate this behaviour and if I can, I will post my environment here.

Cheers!!

Just a note that a fortnight ago I filed an issue for the original problem with an async method:

with no response as of yet. There is a PR from last December that appears to address the same issue that also appears to have been ignored.

Thank you so much for this advice.

@brezzsent, did you also have difficulty getting YarnSpawner to work? If so, please comment on the issue I raised and perhaps we can get it addressed so that others won’t have the same problem.

Hey @ajs6f,

I have created an account just to say thank you for your findings. I have managed to run the POC-est of all POCs of JupyterHub on YARN, after banging my head against the wall for a week until I found this thread and your contribution to it.

However, not being a programmer by nature, I wish I understood what you meant by saying:

In YarnSpawner’s spawner.py I overrode the default value for JUPYTERHUB_OAUTH_ACCESS_SCOPES as follows

I have only managed to patch JupyterHub's spawner.py directly. Is there an easy way to move this patch out of JupyterHub's main spawner.py and into YarnSpawner’s spawner.py file? I would like to leave JupyterHub's spawner.py file vanilla.

Thanks!

Hi @dariuss, I’m so glad you were able to make progress! This is clearly an unmaintained section of the codebase and it seems we are largely on our own with respect to the core team. I couldn’t have gotten anywhere myself but for help from @mahendrapaipuri, so many thanks to them.

The alteration I made to YarnSpawner’s spawner.py looks like:

    @default("oauth_access_scopes")
    def _default_access_scopes(self):
        self.log.debug("Now using YarnSpawner oauth_access_scopes override")
        return [
            f'"access:servers!server={self.user.name}/{self.name}"',
            f'"access:servers!user={self.user.name}"',
        ]

and I put it right at the top of the YarnSpawner class body (not at module level) for convenience. There is probably a more readable place.

Let me know if that works for you: it was the final step for me in getting a POC to work. If you have time, please do comment on the relevant issue (see above) and perhaps we can get some attention and get one of the various fixes that have already been sent as PRs merged.
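
A way to keep both spawner.py files untouched might be to subclass YarnSpawner in `jupyterhub_config.py`. This is an untested sketch: it assumes `yarnspawner.spawner.YarnSpawner` is importable on the Hub and that the installed JupyterHub exposes the `oauth_access_scopes` trait as in the snippets above. The class name `QuotedScopesYarnSpawner` is made up.

```python
# jupyterhub_config.py (fragment; `c` is provided by JupyterHub when it loads this file)
from traitlets import default
from yarnspawner.spawner import YarnSpawner


class QuotedScopesYarnSpawner(YarnSpawner):
    """Hypothetical subclass carrying the quoting workaround from this thread."""

    @default("oauth_access_scopes")
    def _default_access_scopes(self):
        # Same workaround as above: each scope carries its own literal quotes.
        return [
            f'"access:servers!server={self.user.name}/{self.name}"',
            f'"access:servers!user={self.user.name}"',
        ]


# Tell the Hub to use the subclass instead of the stock YarnSpawner.
c.JupyterHub.spawner_class = QuotedScopesYarnSpawner
```

With this approach the workaround lives in configuration, so upgrading either package does not silently drop it.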

Hi @ajs6f, thanks for your reply. I am still struggling to understand how you managed to get it working inside YarnSpawner's spawner.py file. I simply tried pasting your code at the top of the file, and it failed with indentation errors:

      File "/opt/miniconda/envs/py310_github/lib/python3.10/site-packages/yarnspawner/spawner.py", line 1
        @default("oauth_access_scopes")
    IndentationError: unexpected indent

I am not sure if this is the correct way. Would you mind sharing the whole modified files please? Thanks!

I can and will later today, but if you are not able to independently correct that miscopy error, you may face a great deal of trouble supporting Jupyter. It is written in Python and Python is whitespace-sensitive.
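
For anyone hitting the same message, the error can be reproduced without JupyterHub at all. The sketch below only compiles source text, so the undefined `default` decorator is never evaluated; the `compile_error` helper and the stub class header are made up for illustration.

```python
import textwrap

# The method as posted, still carrying its class-body indentation.
pasted = (
    '    @default("oauth_access_scopes")\n'
    "    def _default_access_scopes(self):\n"
    "        return []\n"
)


def compile_error(source):
    """Return the IndentationError message for `source`, or None if it compiles."""
    try:
        compile(source, "spawner.py", "exec")
    except IndentationError as exc:
        return exc.msg
    return None


# Pasted at the very top of a module: the leading spaces have no enclosing block.
at_module_top = compile_error(pasted)

# Placed inside a class body, the same indentation is what Python expects.
inside_class = compile_error("class YarnSpawner:\n" + pasted)

# textwrap.dedent strips the common indentation if module level is really intended.
dedented = compile_error(textwrap.dedent(pasted))

print(at_module_top, inside_class, dedented)
```

The snippet is a method, so it belongs inside the class body at the same indentation as the other members rather than at the top of the file.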

Hello @dariuss, I’m sorry for the delay, but below you will find the whole of my Yarnspawner spawner.py file. I would emphasize again that if you cannot correct indentation errors on your own, you are not likely to have a good time supporting JupyterHub, but good luck in any case!

import skein
from jupyterhub.spawner import Spawner
from jupyterhub.traitlets import Command, ByteSpecification
from traitlets import Unicode, Dict, Integer, default
from tornado import gen


_STOPPED_STATES = {'FAILED', 'KILLED', 'FINISHED'}


class YarnSpawner(Spawner):
    """A spawner for starting singleuser instances in a YARN container."""

    @default("oauth_access_scopes")
    def _default_access_scopes(self):
        self.log.debug("Now using YarnSpawner oauth_access_scopes override")
        return [
            f'"access:servers!server={self.user.name}/{self.name}"',
            f'"access:servers!user={self.user.name}"',
        ]

    start_timeout = Integer(
        300,
        help="Timeout (in seconds) before giving up on starting of singleuser server.",
        config=True
    )

    ip = Unicode(
        "0.0.0.0",
        help="The IP address (or hostname) the singleuser server should listen on.",
        config=True
    )

    principal = Unicode(
        None,
        help='Kerberos principal for JupyterHub user',
        allow_none=True,
        config=True,
    )

    keytab = Unicode(
        None,
        help='Path to kerberos keytab for JupyterHub user',
        allow_none=True,
        config=True,
    )

    queue = Unicode(
        'default',
        help='The YARN queue to submit applications under',
        config=True,
    )

    localize_files = Dict(
        help="""
        Extra files to distribute to the singleuser server container.

        This is a mapping from ``local-name`` to ``resource``. Resource paths
        can be local, or in HDFS (prefix with ``hdfs://...`` if so). If an
        archive (``.tar.gz`` or ``.zip``), the resource will be unarchived as
        directory ``local-name``. For finer control, resources can also be
        specified as ``skein.File`` objects, or their ``dict`` equivalents.

        This can be used to distribute conda/virtual environments by
        configuring the following:

        .. code::

            c.YarnSpawner.localize_files = {
                'environment': {
                    'source': 'hdfs:///path/to/archived/environment.tar.gz',
                    'visibility': 'public'
                }
            }
            c.YarnSpawner.prologue = 'source environment/bin/activate'

        These archives are usually created using either ``conda-pack`` or
        ``venv-pack``. For more information on distributing files, see
        https://jcrist.github.io/skein/distributing-files.html.
        """,
        config=True,
    )

    prologue = Unicode(
        '',
        help='Script to run before singleuser server starts.',
        config=True,
    )

    cmd = Command(
        ['python -m yarnspawner.singleuser'],
        allow_none=True,
        help='The command used for starting the singleuser server.',
        config=True
    )

    mem_limit = ByteSpecification(
        '2 G',
        help="""
        Maximum number of bytes a singleuser notebook server is allowed to
        use. Allows the following suffixes:

        - K -> Kibibytes
        - M -> Mebibytes
        - G -> Gibibytes
        - T -> Tebibytes
        """,
        config=True)

    cpu_limit = Integer(
        1,
        min=1,
        help="""
        Maximum number of cpu-cores a singleuser notebook server is allowed to
        use. Unlike other spawners, this must be an integer amount >= 1.
        """,
        config=True)

    epilogue = Unicode(
        '',
        help='Script to run after singleuser server ends.',
        config=True,
    )

    script_template = Unicode(
        ("{prologue}\n"
         "{singleuser_command}\n"
         "{epilogue}"),
        help="""
        Template for application script.

        Filled in by calling ``script_template.format(**variables)``. Variables
        include the following attributes of this class:

        - prologue
        - singleuser_command
        - epilogue
        """,
        config=True,
    )

    # A cache of clients by (principal, keytab). In most cases this will only
    # be a single client. These should persist for the lifetime of jupyterhub.
    clients = {}

    async def _get_client(self):
        key = (self.principal, self.keytab)
        client = type(self).clients.get(key)
        if client is None:
            kwargs = dict(principal=self.principal,
                          keytab=self.keytab,
                          security=skein.Security.new_credentials())
            client = await gen.IOLoop.current().run_in_executor(
                None, lambda: skein.Client(**kwargs)
            )
            type(self).clients[key] = client
        return client

    @property
    def singleuser_command(self):
        """The full command (with args) to launch a singleuser server"""
        return ' '.join(self.cmd + self.get_args())

    def _build_specification(self):
        script = self.script_template.format(
            prologue=self.prologue,
            singleuser_command=self.singleuser_command,
            epilogue=self.epilogue
        )

        resources = skein.Resources(
            memory='%d b' % self.mem_limit,
            vcores=self.cpu_limit
        )

        security = skein.Security.new_credentials()

        # Support dicts as well as File objects
        files = {k: skein.File.from_dict(v) if isinstance(v, dict) else v
                 for k, v in self.localize_files.items()}

        master = skein.Master(
            resources=resources,
            files=files,
            env=self.get_env(),
            script=script,
            security=security
        )

        return skein.ApplicationSpec(
            name='jupyterhub',
            queue=self.queue,
            user=self.user.name,
            master=master
        )

    def load_state(self, state):
        super().load_state(state)
        self.app_id = state.get('app_id', '')

    def get_state(self):
        state = super().get_state()
        if self.app_id:
            state['app_id'] = self.app_id
        return state

    def clear_state(self):
        super().clear_state()
        self.app_id = ''

    async def start(self):
        loop = gen.IOLoop.current()

        spec = self._build_specification()

        client = await self._get_client()
        # Set app_id == 'PENDING' to signal that we're starting
        self.app_id = 'PENDING'
        try:
            self.app_id = app_id = await loop.run_in_executor(None, client.submit, spec)
        except Exception as exc:
            # We errored, no longer pending
            self.app_id = ''
            self.log.error(
                "Failed to submit application for user %s. Original exception:",
                self.user.name,
                exc_info=exc
            )
            raise

        # Wait for application to start
        while True:
            report = await loop.run_in_executor(
                None, client.application_report, app_id
            )
            state = str(report.state)
            if state in _STOPPED_STATES:
                raise Exception("Application %s failed to start, check "
                                "application logs for more information"
                                % app_id)
            elif state == 'RUNNING':
                self.current_ip = report.host
                break
            else:
                await gen.sleep(0.5)

        # Wait for port to be set
        while getattr(self, 'current_port', 0) == 0:
            await gen.sleep(0.5)

            report = await loop.run_in_executor(
                None, client.application_report, app_id
            )
            if str(report.state) in _STOPPED_STATES:
                raise Exception("Application %s failed to start, check "
                                "application logs for more information"
                                % app_id)

        return self.current_ip, self.current_port

    async def poll(self):
        if self.app_id == '':
            return 0
        elif self.app_id == 'PENDING':
            return None

        client = await self._get_client()
        report = await gen.IOLoop.current().run_in_executor(
            None, client.application_report, self.app_id
        )
        status = str(report.final_status)
        if status in {'SUCCEEDED', 'KILLED'}:
            return 0
        elif status == 'FAILED':
            return 1
        else:
            return None

    async def stop(self, now=False):
        if self.app_id == 'PENDING':
            # The application is in the process of being submitted. Wait for a
            # reasonable amount of time until we have an application id
            for i in range(20):
                if self.app_id != 'PENDING':
                    break
                await gen.sleep(0.1)
            else:
                self.log.warning("Application has been PENDING for an "
                                 "unreasonable amount of time, there's likely "
                                 "something wrong")

        # Application not submitted, or submission errored out, nothing to do.
        if self.app_id == '':
            return

        client = await self._get_client()
        await gen.IOLoop.current().run_in_executor(
            None, client.kill_application, self.app_id
        )
