Looking for a Dockerfile to start with

Hi,

I’m confused by all the Dockerfiles. I’m looking for a Dockerfile that I can customize. Most of the Dockerfiles produce images you can’t log into: there is no default password, and when I run docker exec -it image bash, I don’t get a root shell. Using a normal /etc/passwd would be good enough; nothing special is needed. The problem is that the images build (except the Alpine one, which is completely buggy), but there is no explanation of how to follow up. For example, the docker demo is OK.

Any advice on where to start? The JupyterHub documentation did not help me at all. The purpose is to create a plain Docker image that I can customize for use in Kubernetes. And no, the Helm chart is not an option; it’s buggy as well.

Thanks to everybody who can give me some hints.

Yes, it’s not completely trivial to get going with a custom Docker image.

Here’s one of the repos that I use to build on top of the default jupyter/base-notebook image:

In particular, have a look at the Dockerfile in nipraxis/nipraxis-images on GitHub, at commit 502ac0add95b09216856b74888c20daf22ea88f6.

It’s true that, by default, you can’t start a shell as root, but see the instructions in:

to start a sudo-capable shell in the built Docker image.


First, thank you very much for sharing this information. I will try it out.
I’m running a private Docker registry in a custom Kubernetes cluster, so no cloud provider is required. This is tested and works.

The JupyterHub Helm chart is not able to support that, and it is thinly documented, so that’s why I work with handmade YAML files.

I also have a handmade manifest. Before your comment, the problem was the spawner, which did not work out. Anyway, I’m happy to try your solution.

Okay, so the Dockerfile of nipraxis-hub explains the game.

You take:

  1. an Ubuntu base image
  2. and install a base Jupyter notebook on it

This is the base the project starts from. Then:

  a. As user root, extend the Ubuntu system and install:
  • some system tools: dns, curl, etc.
  • npm and yarn, which are required for JupyterHub
  • what is called Theia, by copying package.json and installing it with yarn
  b. Change to user $NB_USER, which is often jovyan, and:
  • install Python locally for this user
  • install Jupyter extensions (so here you may add additional ones)
  • install jupyter-server-proxy as user $NB_USER (the notebook user), so the proxy runs as user $NB_USER

This server proxy is a separate component from outside the project. It’s not the configurable-http-proxy.

Ok, I can build this, and the outcome runs:

cd nipraxis-hub/
../make_image.sh .
docker images | grep matthewbrett

So the image is matthewbrett/nipraxis-hub 502ac0a 44ebe9a2ef4d 10 minutes ago 3.19GB

Starting it up works:

docker run -it --rm -p 8888:8888 -e JUPYTER_ENABLE_LAB=true matthewbrett/nipraxis-hub:502ac0a

The notebook image requires the URL with the token: http://127.0.0.1:8888/lab?token=e347b42370420160778ea6f5b2eb6f1c408bde4e14bce791

Congrats, nice job!

I’ve tested a few things:

  • the Theia IDE causes a 500 : Internal Server Error
  [E 2022-05-08 19:07:01.917 ServerApp] Uncaught exception GET /theia/ (172.17.0.1)
   HTTPServerRequest(protocol='http', host='127.0.0.1:8888', method='GET', uri='/theia/', version='HTTP/1.1', remote_ip='172.17.0.1')
   Traceback (most recent call last):
     File "/opt/conda/lib/python3.9/site-packages/tornado/web.py", line 1704, in _execute
       result = await result
     File "/opt/conda/lib/python3.9/site-packages/jupyter_server_proxy/websocket.py", line 91, in get
       return await self.http_get(*args, **kwargs)
     File "/opt/conda/lib/python3.9/site-packages/jupyter_server_proxy/handlers.py", line 668, in http_get
       return await ensure_async(self.proxy(self.port, path))
     File "/opt/conda/lib/python3.9/site-packages/jupyter_server/utils.py", line 182, in ensure_async
       result = await obj
     File "/opt/conda/lib/python3.9/site-packages/jupyter_server_proxy/handlers.py", line 662, in proxy
       await self.ensure_process()
     File "/opt/conda/lib/python3.9/site-packages/jupyter_server_proxy/handlers.py", line 627, in ensure_process
       cmd = self.get_cmd()
     File "/opt/conda/lib/python3.9/site-packages/jupyter_server_proxy/config.py", line 65, in get_cmd
       return self._realize_rendered_template(command)
     File "/opt/conda/lib/python3.9/site-packages/jupyter_server_proxy/config.py", line 60, in _realize_rendered_template
       call_with_asked_args(attribute, self.process_args)
     File "/opt/conda/lib/python3.9/site-packages/jupyter_server_proxy/utils.py", line 33, in call_with_asked_args
       return callback(*asked_arg_values)
     File "/opt/conda/lib/python3.9/site-packages/jupyter_theia_proxy/__init__.py", line 15, in _theia_command
       raise FileNotFoundError('Can not find theia executable in $PATH')
   FileNotFoundError: Can not find theia executable in $PATH
[E 2022-05-08 19:07:01.930 ServerApp] {
     "Host": "127.0.0.1:8888",
     "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
     "Referer": "http://127.0.0.1:8888/lab/workspaces/auto-B",
     "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:100.0) Gecko/20100101 Firefox/100.0"
   }
[E 2022-05-08 19:07:01.930 ServerApp] 500 GET /theia/ (172.17.0.1) 15.15ms referer=http://127.0.0.1:8888/lab/workspaces/auto-B

Anyway, that is not so important for me right now
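
For what it’s worth, the last frame of the traceback says that jupyter_theia_proxy raised FileNotFoundError because it could not find a theia executable on $PATH. Here is a minimal sketch of a check one could run inside the container (from a notebook cell or the JupyterLab terminal) to see whether that is the case. The helper name `find_executable` is hypothetical and not part of any Jupyter package:

```python
import shutil
from typing import Optional


def find_executable(name: str) -> Optional[str]:
    """Return the full path of `name` if it is on $PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable("theia")
    if path is None:
        # Same condition that the traceback above reports.
        print("theia is not on $PATH, so the proxy cannot launch it")
    else:
        print(f"theia found at {path}")
```

If this reports that theia is missing, the 500 error would appear regardless of whether the page is served over HTTP or HTTPS.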

  • A lot of JupyterLab extensions are installed, which is nice :wink:
  • The terminal works nicely
  • JupyterLab starts
  • The kernel connects

I’m happy

The next step for me is still unclear. Using tokens for authentication in Kubernetes might be a bit complicated, because the token is created during startup. Parsing the logs is possible, but not nice. Having a proper login screen with user/password would be better.
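
As a stopgap for the single-user image, the token does not have to be random per start. The sketch below generates a token ahead of time and builds a docker run command line that passes it in. It assumes that the server in the image honours the JUPYTER_TOKEN environment variable (recent Jupyter Server and Notebook releases read it, but I have not checked this particular image). The function names are hypothetical:

```python
import secrets
from typing import List


def make_token(n_bytes: int = 24) -> str:
    """Return a random hex token (24 bytes gives 48 hex characters)."""
    return secrets.token_hex(n_bytes)


def docker_run_args(image: str, token: str, port: int = 8888) -> List[str]:
    """Build a `docker run` command that passes the token via JUPYTER_TOKEN."""
    return [
        "docker", "run", "--rm",
        "-p", f"{port}:{port}",
        # Assumption: the server inside the image reads JUPYTER_TOKEN.
        "-e", f"JUPYTER_TOKEN={token}",
        image,
    ]


if __name__ == "__main__":
    token = make_token()
    print(" ".join(docker_run_args("matthewbrett/nipraxis-hub:502ac0a", token)))
    print(f"http://127.0.0.1:8888/lab?token={token}")
```

In Kubernetes the same variable could be set in the pod spec from a Secret, so the token is known before the pod starts and no log parsing is needed.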

So I want to:

  • log in with user and password (good enough for now) and not with a token.
  • I saw customizations done by adding a line like this to the Dockerfile:
ADD jupyterhub_config.py /srv/jupyterhub/jupyterhub_config.py

So how is this done? I’m struggling with the documentation, and could not make it work. Should this be the PAM module? It is unclear to me how to actually do it.

Any advice or examples? Is there something I’m missing?

Thank you very much for any advice. I have been trying to figure this out for days now.

For the Theia error: yes, Theia requires HTTPS, so it’s difficult to test locally with HTTP.

I’m not entirely sure what you are asking here. Are you looking for a different way to authenticate, other than the options in JupyterHub authenticators?

What else do you need, from your authentication?

OK, that explains everything, so it should be resolved when changing to HTTPS.

Thank you very much

Well, regarding the authenticators, there is one simple problem: I would like to log in with user/password. I just want to get rid of the tokens, because later I would like to deploy this image in the k3s/Kubernetes cluster. Having autogenerated tokens just doesn’t work out.

So no OAuth or anything like that. Any authenticator that does not require external infrastructure would be fine.
That’s why I tried the dummy authenticator as well as using /etc/passwd. I went through https://jupyterhub.readthedocs.io/en/stable/getting-started/authenticators-users-basics.html which should explain that. It does not work.

The reason seems to be that JupyterHub should run as root and not as jovyan.

  • The LocalAuthenticator uses adduser, which writes /etc/passwd, /etc/shadow and more, so it requires root. OK, this won’t work.

  • This suggests that PAM should be configured in the Docker image, or possibly the PAM authenticator will handle that. That’s what I’m currently trying.

  • DummyAuthenticator, setting c.DummyAuthenticator.password = "some_password".
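
For the last option, a jupyterhub_config.py could look roughly like the sketch below. It assumes the jupyterhub-dummyauthenticator package is installed in the image (newer JupyterHub releases also ship a built-in dummy authenticator), and the password is a placeholder. This is only meant for testing, since every username is accepted:

```python
# jupyterhub_config.py (sketch for testing only)
c = get_config()  # noqa: F821  (provided by JupyterHub when it loads this file)

# Assumption: class path of the separately installed jupyterhub-dummyauthenticator package.
c.JupyterHub.authenticator_class = "dummyauthenticator.DummyAuthenticator"

# Any username is accepted, but only together with this shared password.
c.DummyAuthenticator.password = "some_password"
```

JupyterHub would then be started with -f pointing at wherever this file is placed in the image.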

Maybe the last one is the first starting point. But where is jupyterhub_config.py? OK, one can argue that:

  • the image that was built is one step in a multi-stage Docker build process, and authentication is the next build step;
  • or, for testing, one can just extend the existing Dockerfile.

In any case, jupyterhub_config.py is not part of the current image (this is the result of running find / -name jupyterhub_config.py -print in a root shell). So it needs to be created first.

I’m really not sure what authenticator will do the trick, so that’s why I am in the process of testing authenticators. But this is hard.

Any advice? And thank you again for your really great help and for the time you are spending on this.

Ok, so after some work, a more precise question.

First, again, thank you for the starting point. I found out that the image still starts the Jupyter notebook server with its own authentication.

The next step is just to start JupyterHub. This is done by deriving a new Docker image from the previous one.

FROM matthewbrett/nipraxis-hub:87a8955
 
USER $NB_USER
CMD ["jupyterhub", "--debug", "--ip=0.0.0.0"]

This does pretty much nothing except change the CMD, so don’t expect a working login.

But now JupyterHub comes up and asks for credentials (which are not configured).

Next, getting the easiest authentication working: the dummy authenticator. This one just accepts any user!

To configure it, a vanilla jupyterhub_config.py was created and copied into the project. It was then modified with one line: c.JupyterHub.authenticator_class = 'dummyauthenticator.DummyAuthenticator'

FROM matthewbrett/nipraxis-hub:87a8955

USER $NB_USER
RUN pip install jupyterhub-dummyauthenticator

ADD jupyterhub_config.py  /etc/jupyterhub/jupyterhub_config.py

CMD ["jupyterhub", "--debug", "-f", "/etc/jupyterhub/jupyterhub_config.py",  "--ip=0.0.0.0"]

Then build and start.

The result is:

  • login as jovyan works and the spawner is started, but it fails. So next I will take a look at the spawner.
  • login with any other user results in an error, 500 : Internal Server Error. This is absolutely right, because the DummyAuthenticator allows authentication with a user that does not exist on the system, so the spawn must fail. This error just shows that the system behaves as expected.
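
To confirm that the second case is really about missing system accounts, one could check inside the container whether a given username exists. JupyterHub’s default spawner starts the single-user server as the system user with the same name as the login, so a name without a system account cannot be spawned. A minimal sketch (the helper name and the example user alice are hypothetical):

```python
import pwd


def system_user_exists(username: str) -> bool:
    """Return True if `username` has an entry in the system user database."""
    try:
        pwd.getpwnam(username)
    except KeyError:
        return False
    return True


if __name__ == "__main__":
    # jovyan is the notebook user of the image; alice is an arbitrary example.
    for name in ("jovyan", "alice"):
        print(name, system_user_exists(name))
```

Any name for which this prints False would be accepted by the DummyAuthenticator and then fail at spawn time.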

So I got a JupyterHub running, but from the jupyterhub/jupyterhub image, with the native authenticator. Here is the Dockerfile. This solution derives from a JupyterHub image and adds JupyterLab, so it’s the other way around compared to Matthew’s approach.

# FROM matthewbrett/nipraxis-hub:87a8955
FROM jupyterhub/jupyterhub:latest


USER root

RUN apt-get update && \
  sleep 4 && \
  apt-get install -y npm nodejs python3 python3-pip git nano vim && \
  python3 -m pip install jupyterhub notebook jupyterlab && \
  npm install -g configurable-http-proxy


RUN pip install jupyterhub-nativeauthenticator

RUN mkdir -p /etc/jupyterhub && \
  cd /etc/jupyterhub && \
  jupyterhub --generate-config -f jupyterhub_config.py

ADD jupyterhub_config.py  /etc/jupyterhub/jupyterhub_config.py


# jupyterhub -f /etc/jupyterhub/jupyterhub_config.py
# CMD ["jupyterhub", "--ip=0.0.0.0" ]
CMD ["jupyterhub", "--debug", "-f", "/etc/jupyterhub/jupyterhub_config.py",  "--ip=0.0.0.0"]

The NativeAuthenticator allows you to sign up first with the user admin; then you click on login and can log in as admin. The restriction I see with the NativeAuthenticator is that you can only log in with one user on the machine. Changing to a new user is not possible. I don’t know why.
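
For reference, here is a sketch of the kind of jupyterhub_config.py that goes with this Dockerfile. It is not the exact file I used, and the settings are assumptions based on the NativeAuthenticator documentation. One possible reason why other users cannot log in is that, according to that documentation, new signups have to be authorized by an admin unless open signup is enabled. I have not verified that this is what happens here, and a missing system account for the new user could be another cause:

```python
# jupyterhub_config.py (sketch, assuming jupyterhub-nativeauthenticator is installed)
c = get_config()  # noqa: F821  (provided by JupyterHub when it loads this file)

c.JupyterHub.authenticator_class = "nativeauthenticator.NativeAuthenticator"

# The account that signs up with this name is treated as an admin.
c.Authenticator.admin_users = {"admin"}

# Optional: let new signups log in without admin authorization.
# c.NativeAuthenticator.open_signup = True
```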

So the JupyterLab notebook comes up and is functional. That was the goal.

What about the nipraxis image? Well, here is a Docker image that also runs the NativeAuthenticator, but it crashes in the spawner, so I need to check that.

FROM matthewbrett/nipraxis-hub:87a8955


USER root

RUN apt-get update && \
  sleep 4 && \
  apt-get install -y python3 python3-pip git nano vim && \
  python3 -m pip install jupyterhub notebook jupyterlab && \
  npm install -g configurable-http-proxy


RUN pip install jupyterhub-nativeauthenticator

RUN mkdir -p /etc/jupyterhub && \
  cd /etc/jupyterhub && \
  jupyterhub --generate-config -f jupyterhub_config.py

ADD jupyterhub_config.py  /etc/jupyterhub/jupyterhub_config.py


# jupyterhub -f /etc/jupyterhub/jupyterhub_config.py
# CMD ["jupyterhub", "--ip=0.0.0.0" ]
CMD ["jupyterhub", "--debug", "-f", "/etc/jupyterhub/jupyterhub_config.py",  "--ip=0.0.0.0"]

So this needs deeper investigation…