Need Exact documentation to build own Docker image spawnable with Jupyterhub

Until now I have not been able to find clear documentation on how to set up a Dockerfile that creates a Docker image that is spawnable with jupyterhub.

We use:

  • Ubuntu 18.04 LTS headless server
  • Jupyterhub 1.0.0
  • PAM authentication behind LDAP.
  • c.JupyterHub.spawner_class = 'dockerspawner.SystemUserSpawner'

Can anyone please help out here?
Thanks and regards!

What things have you tried so far?

I would start by reading https://github.com/jupyterhub/jupyterhub/tree/master/singleuser for an example of a docker image that can run a user’s server.

There is also https://repo2docker.readthedocs.io/en/latest/howto/jupyterhub_images.html which is what I would try first for creating a user image. However, I don’t know when someone last used this feature of repo2docker, so if you run into problems, report back here sooner rather than later, as bitrot might have set in.

Hello betatim,

Thanks a lot for your response! Before checking your links, I want to answer your question: I have tried to take jupyter/base-notebook and modify it for my purposes, which didn’t work out.

I got my first idea of how to set up docker images that conform to a standard that lets a jupyterhub spawn them from this page:

https://github.com/jupyterhub/dockerspawner#picking-or-building-a-docker-image

Further down on that page is a section that says your image

  • must have python > 3.4 installed,
  • must have jupyterhub installed,
  • must have notebook installed,
  • and must run CMD ["jupyterhub-singleuser"]
    as the last command in the Dockerfile, to be spawnable with jupyterhub.
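
As a rough way to check the items in the list above from inside a container, here is a minimal Python sketch. It only tests the four points quoted above and should not be read as an official or complete list. The function names `check_singleuser_requirements` and `format_report` are made up for illustration, and treating Python 3.4 itself as acceptable is an assumption. The script has to run with the same interpreter and PATH that the container's default command sees, otherwise a second conda env could hide a missing package.

```python
import importlib.util
import shutil
import sys


def check_singleuser_requirements():
    """Map each requirement quoted in the list above to True or False.

    The checks apply to the interpreter and PATH that run this script.
    """
    return {
        # Assumption: 3.4 itself counts as new enough.
        "python >= 3.4": sys.version_info >= (3, 4),
        "jupyterhub importable": importlib.util.find_spec("jupyterhub") is not None,
        "notebook importable": importlib.util.find_spec("notebook") is not None,
        # The CMD from the list above must resolve to an executable.
        "jupyterhub-singleuser on PATH": shutil.which("jupyterhub-singleuser") is not None,
    }


def format_report(results):
    """Render the results as one line per requirement."""
    lines = []
    for name, ok in results.items():
        lines.append("{:<32} {}".format(name, "ok" if ok else "MISSING"))
    return "\n".join(lines)


print(format_report(check_singleuser_requirements()))
```

One possible way to run it, assuming the script is saved as check_image.py (a hypothetical file name) and copied into the image's working directory, is `docker run --rm hhn/dll-notebook:latest python check_image.py`.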

But this didn’t work out for me.

My question is: what are the actual requirements a docker image has to meet to be spawnable with jupyterhub? Is there a list?

The image built from my Dockerfile (shown below) does run with

docker run -it hhn/dll-notebook:latest /bin/bash

But it is not spawnable, and if I try to spawn it, the logs don’t say anything helpful:

[W 2019-08-06 16:28:51.053 JupyterHub dockerspawner:976] Removing container that should have been cleaned up: jupyter-rschaufler (id: 86256de)
[I 2019-08-06 16:28:51.053 JupyterHub dockerspawner:815] Removing container 86256deb2ef72e74a196a36b6d707d0a30ca13013ec435824eaf04b03df0be97
[I 2019-08-06 16:28:51.177 JupyterHub dockerspawner:990] Created container jupyter-rschaufler (id: 94622ff) from image hhn/dll-notebook:latest
[I 2019-08-06 16:28:51.177 JupyterHub dockerspawner:1013] Starting container jupyter-rschaufler (id: 94622ff)
[E 2019-08-06 16:29:01.029 JupyterHub pages:209] Failed to spawn single-user server with form
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/handlers/pages.py", line 206, in post
        await self.spawn_single_user(user, server_name=server_name, options=options)
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/handlers/base.py", line 922, in spawn_single_user
        % (status, spawner._log_name),
    tornado.web.HTTPError: HTTP 500: Internal Server Error (Spawner failed to start [status=ExitCode=1, Error='', FinishedAt=2019-08-06T14:28:53.407630275Z]. The logs for rschaufler may contain details.)

[I 2019-08-06 16:29:01.031 JupyterHub log:174] 200 POST /hub/spawn (rschaufler@127.0.0.1) 10019.67ms
[W 2019-08-06 16:29:24.923 JupyterHub user:678] rschaufler's server never showed up at http://127.0.0.1:32871/user/rschaufler/ after 30 seconds. Giving up
[E 2019-08-06 16:29:24.948 JupyterHub gen:593] Exception in Future <Task finished coro=<BaseHandler.spawn_single_user.<locals>.finish_user_spawn() done, defined at /usr/local/lib/python3.6/dist-packages/jupyterhub/handlers/base.py:800> exception=TimeoutError("Server at http://127.0.0.1:32871/user/rschaufler/ didn't respond in 30 seconds",)> after timeout
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/tornado/gen.py", line 589, in error_callback
        future.result()
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/handlers/base.py", line 807, in finish_user_spawn
        await spawn_future
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/user.py", line 654, in spawn
        await self._wait_up(spawner)
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/user.py", line 701, in _wait_up
        raise e
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/user.py", line 669, in _wait_up
        http=True, timeout=spawner.http_timeout, ssl_context=ssl_context
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/utils.py", line 234, in wait_for_http_server
        timeout=timeout,
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/utils.py", line 177, in exponential_backoff
        raise TimeoutError(fail_message)
    TimeoutError: Server at http://127.0.0.1:32871/user/rschaufler/ didn't respond in 30 seconds

[I 2019-08-06 16:32:36.800 JupyterHub proxy:319] Checking routes

Dockerfile:

FROM jupyter/base-notebook
USER root
RUN apt-get update
RUN apt-get install -y software-properties-common
RUN add-apt-repository -y ppa:ubuntu-toolchain-r/test
RUN apt-get update
RUN apt-get install -y make
RUN apt-get install -y cmake
RUN apt-get install -y gcc g++
USER $NB_UID
WORKDIR $HOME
RUN python -m pip install jupyterhub==1.0.0
RUN npm install -g configurable-http-proxy
RUN python -m pip install notebook
ADD DeepLearningLecture.yml .
#RUN conda activate
RUN conda env create -f DeepLearningLecture.yml
#RUN conda activate DeepLearningLecture
# Pull the environment name out of the environment.yml
RUN echo "source activate $(head -1 DeepLearningLecture.yml | cut -d' ' -f2)" > ~/.bashrc
ENV PATH /opt/conda/envs/$(head -1 DeepLearningLecture.yml | cut -d' ' -f2)/bin:$PATH
RUN conda install --yes ipykernel
RUN python -m ipykernel install --user --name DeepLearningLecture --display-name "DeepLearningLecture"

And last but not least, the YAML file that sets up the deep learning env in the container:

name: DeepLearningLecture
channels:
  - conda-forge
  - defaults
dependencies:
  - blas=1.0=mkl
  - bzip2=1.0.6=h14c3975_5
  - ca-certificates=2019.5.15=0
  - cairo=1.14.12=h8948797_3
  - certifi=2019.3.9=py36_0
  - ffmpeg=4.0=hcdf2ecd_0
  - fontconfig=2.13.0=h9420a91_0
  - freeglut=3.0.0=hf484d3e_5
  - freetype=2.9.1=h8a8886c_1
  - glib=2.56.2=hd408876_0
  - graphite2=1.3.13=h23475e2_0
  - harfbuzz=1.8.8=hffaf4a1_0
  - hdf5=1.10.2=hba1933b_1
  - icu=58.2=h9c2bf20_1
  - intel-openmp=2019.4=243
  - jasper=2.0.14=h07fcdf6_1
  - jpeg=9b=h024ee3a_2
  - libedit=3.1.20181209=hc058e9b_0
  - libffi=3.2.1=hd88cf55_4
  - libgcc-ng=8.2.0=hdf63c60_1
  - libgfortran-ng=7.3.0=hdf63c60_0
  - libglu=9.0.0=hf484d3e_1
  - libopencv=3.4.2=hb342d67_1
  - libopus=1.3=h7b6447c_0
  - libpng=1.6.37=hbc83047_0
  - libstdcxx-ng=8.2.0=hdf63c60_1
  - libtiff=4.0.10=h2733197_2
  - libuuid=1.0.3=h1bed415_2
  - libvpx=1.7.0=h439df22_0
  - libxcb=1.13=h1bed415_1
  - libxml2=2.9.9=he19cac6_0
  - mkl=2019.4=243
  - mkl_fft=1.0.12=py36ha843d7b_0
  - mkl_random=1.0.2=py36hd81dba3_0
  - ncurses=6.1=he6710b0_1
  - numpy-base=1.16.4=py36hde5b4d6_0
  - opencv=3.4.2=py36h6fd60c2_1
  - openssl=1.1.1c=h7b6447c_1
  - pcre=8.43=he6710b0_0
  - pip=19.1.1=py36_0
  - pixman=0.38.0=h7b6447c_0
  - py-opencv=3.4.2=py36hb342d67_1
  - python=3.6.8=h0371630_0
  - readline=7.0=h7b6447c_5
  - setuptools=41.0.0=py36_0
  - sqlite=3.27.2=h7b6447c_0
  - tk=8.6.8=hbc83047_0
  - wheel=0.33.1=py36_0
  - xz=5.2.4=h14c3975_4
  - zlib=1.2.11=h7b6447c_3
  - zstd=1.3.7=h0b5b093_0
  - pip:
    - absl-py==0.7.1
    - astor==0.7.1
    - attrs==19.1.0
    - backcall==0.1.0
    - bleach==3.1.0
    - chardet==3.0.4
    - conda==4.3.16
    - cycler==0.10.0
    - decorator==4.4.0
    - defusedxml==0.5.0
    - entrypoints==0.3
    - gast==0.2.2
    - grpcio==1.20.0
    - h5py==2.9.0
    - idna==2.8
    - imageio==2.5.0
    - ipykernel==5.1.0
    - ipympl==0.2.1
    - ipython==7.4.0
    - ipython-genutils==0.2.0
    - ipywidgets==7.4.2
    - jedi==0.13.3
    - jinja2==2.10.1
    - jsonschema==3.0.1
    - jupyterhub==1.0.0
    - jupyter-client==5.2.4
    - jupyter-core==4.4.0
    - keras==2.1.4
    - keras-applications==1.0.7
    - keras-preprocessing==1.0.9
    - kiwisolver==1.0.1
    - markdown==3.1
    - markupsafe==1.1.1
    - matplotlib==3.0.3
    - mistune==0.8.4
    - nbconvert==5.4.1
    - nbformat==4.4.0
    - networkx==2.3
    - notebook==5.7.8
    - numpy==1.16.2
    - pandas==0.24.2
    - pandocfilters==1.4.2
    - parso==0.4.0
    - pexpect==4.7.0
    - pickleshare==0.7.5
    - pillow==6.0.0
    - prometheus-client==0.6.0
    - prompt-toolkit==2.0.9
    - protobuf==3.7.1
    - ptyprocess==0.6.0
    - pygments==2.3.1
    - pyparsing==2.4.0
    - pyrsistent==0.14.11
    - python-dateutil==2.8.0
    - pytz==2019.1
    - pywavelets==1.0.3
    - pyyaml==5.1
    - pyzmq==18.0.1
    - requests==2.21.0
    - ruamel-yaml==0.15.92
    - scikit-image==0.15.0
    - scikit-learn==0.20.3
    - scipy==1.2.1
    - seaborn==0.9.0
    - send2trash==1.5.0
    - six==1.12.0
    - sklearn==0.0
    - style==1.1.0
    - tensorboard==1.12.2
    - tensorflow==1.12.0
    - termcolor==1.1.0
    - terminado==0.8.2
    - testpath==0.4.2
    - tflearn==0.3.2
    - tornado==6.0.2
    - tqdm==4.31.1
    - traitlets==4.3.2
    - update==0.0.1
    - urllib3==1.24.2
    - wcwidth==0.1.7
    - webencodings==0.5.1
    - werkzeug==0.15.2
    - widgetsnbextension==3.4.2
prefix: /home/jovyan/miniconda3/envs/DeepLearningLecture

I am not sure whether this has something to do with the different envs, and whether something is missing in one of them. But I have installed notebook and jupyterhub in the base installation (see Dockerfile) as well as in the DeepLearningLecture env…

Ah, the docker containers from docker-stacks are actually spawnable without any problem; only my own derivative is not.
Do you have any suggestions?
Thanks and regards!

Hi betatim,

after checking your two links…

The second link shows a completely new and different, but as far as I can see great, way to generate images! This looks awesome! But for now, please understand: it must be possible to create my own docker image that is spawnable with our system-user-spawner. Once this works, I will of course also try Binder and repo2docker…

The first link you shared is the link to the jupyterhub/singleuser image. This one already runs well on our jupyterhub server. It is spawnable when unmodified. What went wrong was my modification of one of the docker-stacks Dockerfiles. One problem seems to be that spawnable images are very sensitive to conda and its configuration inside the image. As soon as you install additional environments within the image, it doesn’t work well anymore, even though I carefully installed all the modules that have to be present for the spawner, like notebook, jupyterhub and so forth. Maybe there are more problems I do not know about…?

As you can see in my preceding post, my Dockerfile is based on jupyter/base-notebook. I additionally install a conda environment from a yml file. Unless I have forgotten or overlooked something that needs to be installed or configured inside the env, it should be spawnable… It doesn’t spawn even when I just install the env and its kernel without activating it… (BTW, where can I find really informative logs? The message always says the logs for … may contain details, but the jupyterhub log contains exactly the same as the shell output. Am I missing a special log?)
The other question here is: do different types of spawners need different modules/configurations installed inside the image?

Any suggestions as to what is missing or wrong in the Dockerfile?
Warmest regards!

This is trying to say “look at the logs of the spawn for user rschaufler”. Where exactly these logs are depends on the spawner you use. With docker spawner you need to look at the log output of the container started for that user, for systemd spawner you need to track down which unit (I think that is the word for it?) was started and use journalctl to look at its logs. In these logs you will (or should) find the output of the command that the spawner runs. My guess is that that command crashes or isn’t the right command.

“TimeoutError: Server at http://127.0.0.1:32871/user/rschaufler/ didn’t respond in 30 seconds” says that the hub waited for the user’s spawn to start a command but that it never heard back from that command. It also tells you on which IP and port it thinks this server should be, which might be part of the problem (network unreachable from the hub?).
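
To test the network part of this by hand, a small sketch like the following can be run on the hub machine while the container is up. It only checks that a TCP connection to the host and port from the error message can be opened, which is a weaker test than the HTTP check the hub performs. `port_is_open` is a hypothetical helper name, and the URL is the one from the log above.

```python
import socket
from urllib.parse import urlparse


def port_is_open(url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable network and timeouts.
        return False


# URL copied from the TimeoutError in the hub log.
print(port_is_open("http://127.0.0.1:32871/user/rschaufler/"))
```

If this prints False while the container is still running, the address or port mapping is the first thing to look at; if the container has already exited, the container's own log is the better starting point.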

No idea from reading your Dockerfile what could be the problem. I’d start with one that only contains one extra line and keep building it up till it stops working, then think about why that line breaks it.

@RSchauf Looking at your Dockerfile, I can’t tell what’s wrong with it. But here’s a custom single user image that works with JupyterHub - https://gitlab.com/gitlab-org/jupyterhub-user-image/blob/master/Dockerfile

This one is also derived from base-notebook (same as yours). You can plug this Dockerfile in, check that it works, and then change it step by step to see what breaks it.

One thing I’d recommend is to specify a version of the base-notebook image instead of relying on the latest tag.
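
A small sketch of how one could check whether an image reference is pinned, for example before putting it into a FROM line or the spawner configuration. The helper name `is_pinned` is hypothetical, the tag `1.0` in the example is made up, and the parsing is a simplified assumption about image references (name, optional tag, optional digest), not a full implementation of Docker's reference grammar.

```python
def is_pinned(image):
    """Return True if the reference has a digest or a tag other than 'latest'."""
    if "@" in image:
        # Digest reference such as name@sha256:...
        return True
    # Only look at the part after the last "/", so that a registry port
    # like localhost:5000/image is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        # No tag given: Docker falls back to "latest".
        return False
    tag = last_segment.rsplit(":", 1)[1]
    return tag not in ("", "latest")


for ref in ("jupyter/base-notebook", "hhn/dll-notebook:latest", "hhn/dll-notebook:1.0"):
    print(ref, is_pinned(ref))
```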

@betatim: thanks a lot!

Your first hint about the logs actually helped a lot. As I use

c.JupyterHub.spawner_class = 'dockerspawner.SystemUserSpawner'

the docker spawner variant you mentioned applies to my case. I.e., I was able to find the logs for a started container from the command line with the command

docker logs <container-id>

Even for the case where the container had crashed, I figured that a

docker ps -a

could help find the logs, which actually was the case. As I found out, as soon as the container is started, it may create a log that docker logs can still read later, even after the container has crashed.
I will now experiment with this newly gained access to better log info! Thanks a lot!
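
The two commands can be wrapped in a small Python sketch. The container name pattern `jupyter-<username>` is taken from the hub log earlier in this thread (`jupyter-rschaufler`); whether that prefix holds for other setups is an assumption, and the helper names are hypothetical. The hub log also shows the old container being removed when the next spawn starts, so the logs are best read before retrying.

```python
import subprocess


def container_name(username, prefix="jupyter"):
    """Build the container name as it appears in the hub log above."""
    return "{}-{}".format(prefix, username)


def docker_logs_command(username, tail=100):
    """Return the `docker logs` command line for a user's container."""
    return ["docker", "logs", "--tail", str(tail), container_name(username)]


def fetch_logs(username, tail=100):
    """Run `docker logs` and return stdout and stderr as one string.

    This also works for a container that has exited, as long as it
    has not been removed yet. It requires the docker CLI on PATH.
    """
    proc = subprocess.run(
        docker_logs_command(username, tail),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.stdout


# Show the command that would be run for the user from the log.
print(" ".join(docker_logs_command("rschaufler")))
```

Calling `fetch_logs("rschaufler")` on the Docker host right after a failed spawn should then return whatever the command inside the container printed before it exited.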

Concerning the second hint, I will follow @amirathi’s suggestion to start from a working container and add new stuff step by step to see where it gets stuck. Maybe that way I can also get to the bottom of the server timeout!

Ok, let’s see how far I get now… thanks a lot!

@amirathi, thanks for this reply. I already picked up your first hint in my recent reply to @betatim.

Your second hint is also wise: if I stay with the :latest tag, the whole image may stop working later on because the base image changed slightly! Thanks for this hint, I will try to find a tag that is stable.

I will also check out the link to your derived notebook image.

Thanks for your great help, and also to @betatim for his help. I appreciate it very much.
I’ll let you know the results.
Warmest regards…

@betatim @amirathi

I was able to set up a container by stepwise derivation from a base container. Thanks for your support, this works now.
But my original goal, to understand how to build my own container from scratch and what it has to contain to be spawnable with the Jupyterhub docker spawner (SystemUserSpawner), is still open.
Do you have a list of preconditions here that I can rely on? The people who set up the docker-stacks must have such a list, otherwise they couldn’t have made those containers spawnable…?!
Thanks and regards!