Hi, I am trying to work out how to get the right format for the dependencies when converting my repo (https://github.com/johnjarmitage/percolation-inversion) with mybinder. I include the following in the environment.yml:
name: percolation
channels:
- conda-forge
- pytorch
dependencies:
- python
- numpy
- fenics
- mshr
- pytorch-cpu
- pip:
  - matplotlib
And mybinder.org does its stuff up until this point:
Successfully pushed binder-registry.mybinder.ovh/binder-ovh/r2d-f18835fd-johnjarmitage-2dpercolation-2dinversion-b3384b:01114b9e8cabad46d3d1af352b238c678f9660c7
Built image, launching...
Launching server...
Launch attempt 1 failed, retrying...
Launch attempt 2 failed, retrying...
Launch attempt 3 failed, retrying...
Internal Server Error
Could anyone point out what I am doing wrong?
Thanks,
John
I think this has nothing to do with you but more to do with the issue discussed here. We recently added a second cluster to mybinder.org and there are still some things that need straightening out :-/.
I suspect that if you temporarily tie yourself to the old cluster and try https://gke.mybinder.org/v2/gh/johnjarmitage/percolation-inversion/master (note the gke in the domain name; you shouldn’t generally rely on it or hard-code it), it’ll work.
I was wrong. This problem has nothing to do with the registry issues.
Using my admin powers to look at the log from the container I see:
$ kubectl logs -f jupyter-johnjarmitage-2dp-2dation-2dinversion-2dmhd6xpcd
/srv/conda/envs/notebook/compiler_compat/ld: cannot find -lpthread
/srv/conda/envs/notebook/compiler_compat/ld: cannot find -lc
collect2: error: ld returned 1 exit status
Traceback (most recent call last):
File "/srv/conda/envs/notebook/bin/jupyter-notebook", line 7, in <module>
from notebook.notebookapp import main
File "/srv/conda/envs/notebook/lib/python3.6/site-packages/notebook/notebookapp.py", line 47, in <module>
from zmq.eventloop import ioloop
File "/srv/conda/envs/notebook/lib/python3.6/site-packages/zmq/__init__.py", line 47, in <module>
from zmq import backend
File "/srv/conda/envs/notebook/lib/python3.6/site-packages/zmq/backend/__init__.py", line 40, in <module>
reraise(*exc_info)
File "/srv/conda/envs/notebook/lib/python3.6/site-packages/zmq/utils/sixcerpt.py", line 34, in reraise
raise value
File "/srv/conda/envs/notebook/lib/python3.6/site-packages/zmq/backend/__init__.py", line 27, in <module>
_ns = select_backend(first)
File "/srv/conda/envs/notebook/lib/python3.6/site-packages/zmq/backend/select.py", line 28, in select_backend
mod = __import__(name, fromlist=public_api)
File "/srv/conda/envs/notebook/lib/python3.6/site-packages/zmq/backend/cython/__init__.py", line 6, in <module>
from . import (constants, error, message, context,
ImportError: /srv/conda/envs/notebook/lib/python3.6/site-packages/zmq/backend/cython/../../../../.././libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /srv/conda/envs/notebook/lib/python3.6/site-packages/zmq/backend/cython/../../../../../libzmq.so.5)
which means something has been seriously “broken” by the packages you installed. To be investigated. The easiest question is: do you really need pytorch, and do you really need it from its own conda channel instead of from conda-forge?
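For example, a minimal sketch of what that would look like, assuming a CPU build of pytorch is available on conda-forge (the exact package name on that channel may differ, so check it):

name: percolation
channels:
  - conda-forge
dependencies:
  - python
  - numpy
  - fenics
  - mshr
  - pytorch  # from conda-forge rather than the pytorch channel
  - pip:
    - matplotlib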
Thanks Tim! I am testing random notebooks on mybinder to see how it works and how the dependencies get dealt with, as part of the reproducible-journal idea I discussed earlier this week. Therefore, while I don’t need pytorch, I thought I would put it in there, as it is a typical machine-learning dependency. I took the conda channel from the pytorch install instructions. I will see if there is another way.
I have found that fenics has caused timeout problems on Read the Docs and Travis, so I also thought it might be an issue here…
I switched the pytorch dependency from conda to pip, but it looks like it is still failing. The error message is as before. Is it possible that the notebook dependencies are too demanding? It takes a long time to build, and I had a similar problem with Read the Docs (although their machines have little memory).
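For reference, the environment.yml now looks something like this (torch being the package name on PyPI):

name: percolation
channels:
  - conda-forge
dependencies:
  - python
  - numpy
  - fenics
  - mshr
  - pip
  - pip:
    - matplotlib
    - torch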
I’ve not looked into it yet, but the error message I posted above points more at a discrepancy at the C-library level. So it’s not something related to binder/repo2docker but the result of mixing libraries that aren’t compatible.
In general, if you want to see a bit more from the build process, I’d install https://repo2docker.readthedocs.io/ locally. I find it much easier than trying to keep track of the build log on mybinder.org, and it is exactly the same tool used there. So no disadvantage really.
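A minimal sketch of doing that, assuming Docker is already running locally (the PyPI package is called jupyter-repo2docker):

$ pip install jupyter-repo2docker
$ repo2docker https://github.com/johnjarmitage/percolation-inversion
# or build from a local clone to iterate faster:
$ repo2docker .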
Same problem locally in repo2docker with pytorch installed using pip. Could it be a problem related to mixing conda and pip in the environment.yml?
Success! I found the guidelines for making an environment.yml in the repo2docker docs: https://repo2docker.readthedocs.io/en/latest/howto/export_environment.html. I followed them, added my dependencies, and the binder image works. (I guess it is not straightforward to automate the manual process of creating the precise dependencies for a conda install?)
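For anyone else who lands here, the workflow boils down to roughly this (a sketch; see the linked page for the exact steps, and the Python version here is just illustrative):

$ conda create -n percolation python=3.6
$ conda activate percolation
$ conda install -c conda-forge numpy fenics mshr matplotlib
$ conda env export -n percolation -f environment.yml
# the exported file pins exact versions; you may want to delete the machine-specific prefix: line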
There are a few projects which try to guess, by analysing Python code, what dependencies it has and what the packages for those dependencies could be called (it isn’t a one-to-one mapping). So for now I’d say the original author should know what libraries they installed when they originally wrote the code, and getting that written down explicitly is better than having a script try to guess.
You don’t need to list all dependencies, in the sense of your dependencies and the dependencies of your dependencies and so on. It is enough to add matplotlib and scikit-learn to your list of dependencies if those are the two packages you use directly. They then pull in the packages they depend on.
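For example, a file as small as this sketch is enough in that case (the name is arbitrary):

name: example
channels:
  - conda-forge
dependencies:
  - matplotlib
  - scikit-learn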
If you do want to create the “full” set of tightly pinned dependencies as described in https://repo2docker.readthedocs.io/en/latest/howto/export_environment.html … yeah, we’d love to have a repo2docker freeze-like command to automate some of those steps. Alas, no one has yet written that tool.