Binder & Tesseract?

I found your MyBinder session launch here (that Binder-ready repo) and see that it no longer works as written.
Change the wget line to:

!wget https://github.com/tesseract-ocr/tessdata/raw/main/eng.traineddata

Or

!curl -OL https://github.com/tesseract-ocr/tessdata/raw/main/eng.traineddata

This is because main is now the preferred name for a repository's default branch, and it looks like the tessdata repo has switched over. (See Adeoy’s comment from Feb 18 2022 here.)

… Still testing if anything else is needed because I think there is a permissions issue with /srv/conda/envs/notebook/share/. …
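While that is being sorted out, one possible way to sidestep writing into the environment's share directory is to leave eng.traineddata where the wget/curl line saved it and point Tesseract at that folder instead. This is only a sketch and I have not confirmed it on Binder: it assumes the file was downloaded into the notebook's working directory, and that the Tesseract from conda-forge is version 4 or later, where TESSDATA_PREFIX is expected to name the folder that holds the .traineddata files.

```python
import os
from pathlib import Path

# Assumption: eng.traineddata was downloaded into the notebook's working directory.
tessdata_dir = Path.cwd()

# Tesseract consults TESSDATA_PREFIX when looking for *.traineddata files.
# Processes started from this kernel (such as the tesseract binary) inherit it.
os.environ["TESSDATA_PREFIX"] = str(tessdata_dir)

# Quick sanity check that the language file is where we just told Tesseract to look.
has_eng = (tessdata_dir / "eng.traineddata").exists()
print(f"TESSDATA_PREFIX={tessdata_dir} (eng.traineddata present: {has_eng})")
```

This has to run before the first pytesseract call in the session, since the variable is only picked up when the tesseract process starts.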

In case someone wants this to launch with Tesseract already installed: I note that at one time the suggestion here was to install it using apt.txt, so that it gets installed by apt-get as the environment is built. I suspect, though, that the same thing can be accomplished without apt.txt by using the proper conda commands and then adding the trained data via postBuild.
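To make the postBuild idea a bit more concrete, here is a rough sketch of the download step written in Python. Several things here are assumptions on my part rather than something I have tested: that postBuild may be a Python script when it starts with a Python shebang, that the python it finds is the one from the notebook environment, that conda-forge's Tesseract reads language files from share/tessdata under that environment's prefix, and that the build step is allowed to write there. The helper names tessdata_dir and fetch_traineddata are made up for illustration.

```python
#!/usr/bin/env python3
"""Sketch of a postBuild helper that adds eng.traineddata to the conda env."""
import sys
from pathlib import Path
from urllib.request import urlretrieve

# Same URL as in the corrected wget/curl line (note the main branch).
TESSDATA_URL = "https://github.com/tesseract-ocr/tessdata/raw/main/eng.traineddata"


def tessdata_dir(prefix=sys.prefix):
    """Return the folder where Tesseract is assumed to look for language data."""
    # Assumption: <env prefix>/share/tessdata, e.g. under /srv/conda/envs/notebook.
    return Path(prefix) / "share" / "tessdata"


def fetch_traineddata(url=TESSDATA_URL, dest_dir=None):
    """Download the trained data file into dest_dir and return its path."""
    dest_dir = Path(dest_dir) if dest_dir is not None else tessdata_dir()
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / url.rsplit("/", 1)[-1]  # keeps the name eng.traineddata
    urlretrieve(url, dest)
    return dest
```

Used as a postBuild file, the script would end with a call to fetch_traineddata(), and Tesseract itself would come from conda-forge via the repo's environment.yml rather than from apt.txt.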


Minor thing: you’ll note that I suggest your install should be:

%conda install -c conda-forge -y tesseract
%conda install -c conda-forge pytesseract

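The reason is that these use the more modern magics; see here. The %conda magic installs into the environment backing the running kernel, which the exclamation point form does not guarantee. Using the exclamation point for installs is no longer best practice, and you are actually better off with no symbol at all, because with automagics a bare conda install is treated as %conda install.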