Connect to selenium server

I am preparing some lectures using binder. In these, students should use library(RSelenium) in R. Selenium is a browser automation suite. Using Selenium outside of binder requires launching a Selenium server in a Docker container:

docker run -d -p 4445:4444 selenium/standalone-firefox

How can I achieve this functionality within binder? I cannot launch docker within binder. Can I run it on the host and have students connect to it?

I am just starting out with binder so please forgive my ignorance. I’ve used it for teaching once and was so happy with it that I’m trying to move everything there as it spares me from dealing with computer configuration issues from students.

Some time ago I popped together this recipe for myself for using Selenium in Mybinder: https://github.com/simonw/selenium-demoscraper

Thanks, I’ll give it a try!

Hi @hliebert ,

Did you manage to get RSelenium to work within binder? I tried the solution by @psychemedia but could not get it to work for me.

Thanks

The requirements.txt configuration file in the current working version of the binder-ready Selenium example works with pip. If I recall correctly, configurations that involve R aren’t compatible with the requirements.txt approach. I imagine the easiest route these days would be to start from the ‘r conda’ example and build into its configuration what is presently in the Selenium example’s requirements.txt. You’d then include the apt.txt from the binder-ready Selenium example repo, or add its contents to the existing apt.txt if the ‘r conda’ example already has one, and adapt the postBuild from the Selenium example in the same way. Wait. Actually, because conda handles things better, I think I was able to remove the apt.txt and postBuild in the example that uses an environment.yml file for configuration, like the ‘r conda’ example does, here. That latter repo of mine may serve as a better guide for adapting the ‘r conda’ example to install Selenium, too.

To get much more specific guidance, you’d need to include a link to your repo.

Hi @fomightez ,

Thanks for jumping in to help! Essentially, I’m trying to deploy a data scraping app with RShiny and RSelenium. When I try to open it in the Shinyapp format, it disconnects from the server. However, when I run it from RStudio (Binder), it looks like RSelenium does start (which suggests that it’s installed), but it’s missing some dependencies (Java), since I get a “Warning: Error in java_check: PATH to JAVA not found. Please check JAVA is installed.” error message.
I tried the solution suggested here (How to install jdk 17 in mybinder.org) and added ca-certificates-java and openjdk-17-jre-headless to my apt.txt file, but I got a “returned a non-zero code: 100” error (which is why they are no longer in that file).
Any help will be greatly appreciated.
Thanks

That route to installing Java via apt.txt looks more complex than the one I learned and have been using with success for years. See the top two apt.txt files listed here. I just verified that typing java in a terminal in sessions launched from the top one of those associated repos works presently. Launches from this one, which uses a variation, also work when you type java in a session launched from there…
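
As a rough sketch of that "type java in a terminal" check, something like the following could be run in a session (or at the end of a postBuild) to see which tools are actually on the PATH. The function name check_cmds is made up for illustration, and the list of commands you pass it is up to you.

```shell
# Hypothetical helper: report whether each named command is on PATH.
# Returns non-zero if at least one of them is missing.
check_cmds() {
  local missing=0
  local cmd
  for cmd in "$@"; do
    if command -v "$cmd" > /dev/null 2>&1; then
      echo "found: $cmd"
    else
      echo "missing: $cmd"
      missing=1
    fi
  done
  return "$missing"
}
```

For the setup discussed in this thread you might call it as check_cmds java firefox geckodriver. Any "missing" line points at something the apt.txt or environment.yml still needs to provide.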

Thanks @fomightez ,

Those solutions worked! Unfortunately, this just unveiled the next series of issues with getting RSelenium to work properly.
Now I get an error in curl::curl_fetch_disk: Unrecognized content encoding type. I can temporarily avoid this by looping over the rsDriver command and setting the Selenium (3.141.59) and geckover (‘0.31.0’) versions explicitly instead of letting them default to the ‘latest’ version, but ultimately the connection is refused and it cannot connect to pretty much any port.
From what I’ve seen (e.g. RSelenium::rsDriver() not working as expected. - #3 by lazycipher - General - RStudio Community or Webscraping Aliexpress with Rselenium | R-bloggers), people seem to attribute this to a corrupted jar file (and indeed, when running wdman::selenium(), I get the same error message). The solution seems to be to download the jar file from the original website and place it in the project’s directory. Would you happen to know how to do this in Binder?

If I follow correctly, you’d probably do something like that with a postBuild configuration file. I have a vaguely similar process going on in this example postBuild or this one or this one.
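
For the jar file case specifically, a minimal postBuild sketch might look like the one below. It is not taken from any of the linked repos. The function names are hypothetical, and the download URL assumes the jar is published as a release asset in the SeleniumHQ GitHub repository, so verify it before relying on it. The version number is the one mentioned earlier in the thread.

```shell
#!/bin/bash
# postBuild sketch: fetch a Selenium standalone jar while the image is built.
# Version taken from the thread; everything else here is illustrative.
SELENIUM_VERSION="3.141.59"

# Hypothetical helper: file name of the standalone jar for a given version.
selenium_jar_name() {
  echo "selenium-server-standalone-$1.jar"
}

# Hypothetical helper: download URL for a given version.
# ASSUMPTION: the jar is hosted as a GitHub release asset under SeleniumHQ/selenium.
selenium_jar_url() {
  echo "https://github.com/SeleniumHQ/selenium/releases/download/selenium-$1/$(selenium_jar_name "$1")"
}

# Download the jar for version $1 into directory $2 (default: current directory).
fetch_selenium_jar() {
  local dest_dir="${2:-.}"
  curl -fsSL -o "${dest_dir}/$(selenium_jar_name "$1")" "$(selenium_jar_url "$1")"
}

# In a real postBuild, uncomment the next line so the download runs during the build:
# fetch_selenium_jar "$SELENIUM_VERSION" "$HOME"
```

Because curl is called with -f, a wrong URL makes the download fail, which should show up as a failed build step and not as a silently broken jar in the image.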

I haven’t read the other posts after yours. I didn’t spend much more time trying to get docker to work; I just used a Selenium server standalone jarfile, and that worked smoothly.
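
For readers wondering what the jarfile route could look like in practice, here is a hedged sketch. It is not the exact setup used above. It assumes a Selenium 3 standalone jar (which accepts a -port option and answers on /wd/hub/status), that java and curl are available in the session, and that the function names retry and start_selenium are made up for illustration.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts, and succeed as soon as the command succeeds.
retry() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Hypothetical helper: start the standalone server in the background
# ($1: path to the jar, $2: port) and wait until it responds.
# ASSUMPTION: a Selenium 3 standalone jar, whose status endpoint is /wd/hub/status.
start_selenium() {
  java -jar "$1" -port "$2" > selenium.log 2>&1 &
  retry 30 1 curl -fs "http://localhost:$2/wd/hub/status" > /dev/null
}
```

The server has to be started inside the running session, for example with start_selenium "$HOME/selenium-server-standalone-3.141.59.jar" 4444 from a terminal, because postBuild only runs while the image is built. After that, RSelenium::remoteDriver(remoteServerAddr = "localhost", port = 4444L, browserName = "firefox") should be able to reach it from R, provided Firefox and geckodriver are installed as well.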

Thanks for the responses. I tried using a Selenium server standalone jarfile through postBuild, while also installing Firefox and Java through apt.txt, but I still can’t get Selenium to work. @hliebert would you mind sharing a link to your repository for this project? If not, could you maybe list the dependencies you used and how you implemented them?
Thanks!