OpenSSL mismatch between RStudio and conda environments

This is a short write-up of the issues we found while trying to deploy Keras + TensorFlow in a custom Python + R environment/image. Hopefully, it helps others in the community!

tl;dr

RStudio uses the “system” version of OpenSSL. Conda also installs OpenSSL. If you use RStudio to run a conda-installed package that calls OpenSSL, there is a good chance that it won’t work due to an OpenSSL “mismatch”. This is because RStudio forces the use of a system version of OpenSSL, while conda expects its own version of OpenSSL. To fix it, either call the function that requires OpenSSL from a Jupyter interface, or separate your conda and RStudio environments entirely.

Introduction

Recently, 2i2c received a request to install TensorFlow and Keras in an image containing conda environments along with several R packages, including RStudio: GitHub - 2i2c-org/utoronto-image: User image for the UToronto Hub.

We were able to install the Python TensorFlow package and its R counterparts by following the corresponding documentation.

We also needed to set the RETICULATE_PYTHON environment variable so that the R packages could find the Python ones: utoronto-image/Rprofile.site at d422076c3695e44f09a959b14cb10f89b6e5f538 · 2i2c-org/utoronto-image · GitHub

Problem

Our users began reporting issues when trying to download example datasets from within RStudio. For example:

Error in py_call_impl(callable, dots$args, dots$keywords) : 
  Exception: URL fetch failure on https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz: None -- unknown url type: https

which suggested some underlying OpenSSL-related issues.
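
For context: in CPython, "unknown url type: https" is what urllib raises when no HTTPS handler is registered, which happens when the ssl module could not be imported. The following is a minimal diagnostic sketch (the function name https_support_report is ours, not part of any library). It can be run through reticulate in RStudio and again in a Jupyter notebook to compare the two sessions:

```python
import importlib
import urllib.request


def https_support_report():
    """Report whether this Python process can speak HTTPS via urllib."""
    report = {}
    try:
        ssl = importlib.import_module("ssl")
    except ImportError as exc:
        # The _ssl extension failed to load, for example because an
        # incompatible libssl is already mapped into the process.
        report["ssl_import_ok"] = False
        report["ssl_import_error"] = str(exc)
    else:
        report["ssl_import_ok"] = True
        # Version string of the OpenSSL library loaded by the interpreter.
        report["openssl_version"] = ssl.OPENSSL_VERSION
    # urllib only defines HTTPSHandler when the ssl import succeeded.
    report["https_handler_available"] = hasattr(urllib.request, "HTTPSHandler")
    return report


print(https_support_report())
```

If ssl_import_ok is False in RStudio but True in Jupyter, the recorded import error should point at the library that failed to load.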

Investigation

After several rounds of debugging, we found that RStudio seems to load the “system” OpenSSL libraries when it starts. Our hypothesis for what is happening:

  • when RStudio starts the session, it loads the OpenSSL “system” components
  • when you try to load a dataset, the TensorFlow + Keras libraries (and underlying packages) try to use this OpenSSL “system” version
  • but the conda (python) libraries are (somehow) expecting the conda-installed OpenSSL version
  • This then fails!
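
A rough way to probe this hypothesis on Linux is to list which libssl/libcrypto files are actually mapped into the running process. The sketch below parses /proc/self/maps; the helper names are hypothetical and it assumes a Linux host with /proc available. Running it via reticulate inside RStudio and again in a Jupyter kernel should show whether the two sessions end up with different libraries:

```python
import os


def openssl_paths_from_maps(lines):
    """Extract libssl/libcrypto paths from /proc/<pid>/maps-style lines."""
    found = set()
    for line in lines:
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        if os.path.basename(path).startswith(("libssl", "libcrypto")):
            found.add(path)
    return sorted(found)


def loaded_openssl_libraries(maps_path="/proc/self/maps"):
    """Return the OpenSSL shared objects mapped into the current process."""
    if not os.path.exists(maps_path):
        return []  # not Linux, or /proc is unavailable
    with open(maps_path) as handle:
        return openssl_paths_from_maps(handle)


print(loaded_openssl_libraries())
```

An empty list only means that no OpenSSL library has been loaded yet in that process, so it is worth triggering the failing download first and then checking again.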

To test if this specific mismatch was causing the problem, we tried a symbolic link hack:

mv /opt/conda/lib/libssl.so.1.1 /opt/conda/lib/libssl.so.1.1.backup
ln -s /usr/lib/x86_64-linux-gnu/libssl.so.1.1 /opt/conda/lib/libssl.so.1.1

and then, it worked!

This confirmed our suspicion, but it is not a long-term solution: swapping the library underneath conda is a brittle fix that will likely break in unexpected ways.

Possible workarounds

After spending many hours on this issue, we decided to stop digging, look for reasonable workarounds, and post here to share the information we collected.

We have verified the OpenSSL mismatch does NOT happen when you use the Jupyter Notebook application with the R-kernel. So the problem seems to be an RStudio-specific issue when you have multiple co-existing environments (most likely caused by RStudio somehow loading the “system” OpenSSL libraries instead of the conda one). Hence, an immediate workaround is to use a Jupyter interface to download the dataset and then return to RStudio for the rest of your task.
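
The Jupyter-side download can be sketched as follows. This assumes Keras looks for cached datasets under ~/.keras/datasets/ (its default location; it may differ if KERAS_HOME is set), and prefetch_dataset is a hypothetical helper, not a Keras function. The URL is the one from the error message reported by our users:

```python
import shutil
import urllib.request
from pathlib import Path

# URL taken from the error message in the Problem section.
MNIST_URL = "https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz"


def prefetch_dataset(url=MNIST_URL, cache_dir=None):
    """Download `url` into the (assumed) Keras dataset cache and return its path."""
    # Assumption: Keras reads cached datasets from ~/.keras/datasets by default.
    cache_dir = Path(cache_dir) if cache_dir else Path.home() / ".keras" / "datasets"
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / url.rsplit("/", 1)[-1]
    if target.exists():
        return target  # already cached, nothing to do
    # Download to a temporary name first so a failed transfer
    # does not leave a truncated file in the cache.
    partial = target.with_name(target.name + ".part")
    with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)
    partial.replace(target)
    return target
```

Calling prefetch_dataset() once from a Jupyter notebook should leave mnist.npz in the cache, so a later dataset load from RStudio can pick it up without opening an HTTPS connection.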

Another alternative would be to create a different image without a conda environment to run your RStudio workflows, so that any python package (including TF or Keras) actually uses the “system” OpenSSL library instead of a conflicting one.

There might be other options involving fixes or enhancements at the RStudio level, but those are outside our expertise. If others have experience with RStudio and an idea for how to resolve this, please share!

Hopefully, all this information is useful for future readers!

Appendix

Things we tried (that did not work!)

First, we tried syncing the OpenSSL versions in both environments (“system” and conda) by downgrading the conda one: Try downgrading the openssl version on the conda environment by damianavila · Pull Request #28 · 2i2c-org/utoronto-image · GitHub. That approach did NOT work!

Then we tried pointing the LD_LIBRARY_PATH environment variable to the conda-specific OpenSSL paths (to “force” RStudio to load the expected OpenSSL libraries): A different approach trying to make RStudio lo load the proper libssl stuff by damianavila · Pull Request #29 · 2i2c-org/utoronto-image · GitHub. That approach also failed, and it triggered other potential issues!

Checking OpenSSL versions

To check the OpenSSL “system” version being used, we ran the ldd command:

$ ldd /usr/lib/rstudio-server/bin/rserver | grep ssl
        libssl.so.1.1 => /usr/lib/x86_64-linux-gnu/libssl.so.1.1 (0x00007f0413833000)

and the “system” version was 1.1.1f. We also confirmed the installed version with the dpkg -l command:

$ dpkg -l | grep openssl
ii  libcurl4-openssl-dev:amd64           7.68.0-1ubuntu2.7                   amd64        development files and documentation for libcurl (OpenSSL flavour)
ii  openssl                              1.1.1f-1ubuntu2.12                  amd64        Secure Sockets Layer toolkit - cryptographic utility

To check the OpenSSL conda-associated version, we listed the openssl conda package and also directly checked the version:

$ conda list openssl
# packages in environment at /opt/conda:
#
# Name                    Version                   Build  Channel
openssl                   1.1.1l               h7f98852_0    conda-forge
pyopenssl                 19.1.0                     py_1    conda-forge
$ openssl version
OpenSSL 1.1.1l  24 Aug 2021

and the conda main environment had version 1.1.1l.
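
To compare the two version strings programmatically, a small sketch like the following can help. The helper names are hypothetical, and the sample strings are copied from the openssl version and dpkg -l output shown in this appendix:

```python
import re

# Matches OpenSSL-style releases such as "1.1.1l" or "3.0.2".
_RELEASE_RE = re.compile(r"\d+\.\d+\.\d+[a-z]?")


def openssl_release(text):
    """Return the first OpenSSL-style release found in `text`, or None."""
    match = _RELEASE_RE.search(text)
    return match.group(0) if match else None


def same_release(first, second):
    """True only if both strings contain the same release identifier."""
    a, b = openssl_release(first), openssl_release(second)
    return a is not None and a == b


conda_version = "OpenSSL 1.1.1l  24 Aug 2021"   # from `openssl version`
system_version = "1.1.1f-1ubuntu2.12"           # from `dpkg -l`
print(openssl_release(conda_version), openssl_release(system_version))
print(same_release(conda_version, system_version))
```

This is only a diagnostic aid: as noted in this appendix, aligning the two versions did not resolve the problem for us.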

Thanks for the investigation, it sounds like it could affect other libraries too e.g.
