Accessing data using binder and a URL to get the data

Hi @choldgraf and @betatim !!

I’m struggling a bit with adding data to my Binder.

I have a repo here and we set up an environment. I did see the binder examples… but am confused as to the best way of downloading data.
We are using an approach like this, where we open a netCDF file using a link to the data (an OPeNDAP server). Our notebooks generate the paths on the fly to download.

How would I modify this approach so the data can be downloaded in the mybinder environment?

import xarray as xr

# The (online) URL for the MACAv2 data (an OPeNDAP endpoint on port 8080)
data_path = "http://thredds.northwestknowledge.net:8080/thredds/dodsC/agg_macav2metdata_tasmax_BNU-ESM_r1i1p1_historical_1950_2005_CONUS_monthly.nc"

# Open the data using a context manager
with xr.open_dataset(data_path) as file_nc:
    max_temp_xr = file_nc

I’m essentially getting this error:

OSError: [Errno -68] NetCDF: I/O failure: b'http://thredds.northwestknowledge.net:8080/thredds/dodsC/agg_macav2metdata_tasmax_BNU-ESM_r1i1p1_historical_1950_2005_CONUS_monthly.nc'

So it is an I/O error. Any suggestions are greatly appreciated!!

mybinder.org is a completely free and open service, so to reduce the risk of abuse the outgoing ports are limited. Port 8080 isn’t permitted. Is your data available on a webserver using the standard HTTP/HTTPS ports (80/443)?
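
Since the notebooks build their URLs on the fly, it may help to check each URL's port before trying to open it. The following is a minimal sketch, assuming only the standard HTTP and HTTPS ports are reachable from mybinder.org (this is not an official list of allowed ports). The helper names `effective_port` and `uses_standard_port` are hypothetical and exist only for illustration.

```python
from urllib.parse import urlsplit

# Default ports for plain web traffic. Treating these as the only allowed
# outgoing ports is an assumption, not an official mybinder.org list.
STANDARD_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the port a URL would connect to, or None if it cannot be determined."""
    parts = urlsplit(url)
    if parts.port is not None:
        # An explicit port in the URL (e.g. ":8080") always wins.
        return parts.port
    # Otherwise fall back to the default port for the scheme, if known.
    return STANDARD_PORTS.get(parts.scheme)


def uses_standard_port(url):
    """Return True if the URL connects on port 80 or 443."""
    return effective_port(url) in STANDARD_PORTS.values()


data_path = "http://thredds.northwestknowledge.net:8080/thredds/dodsC/agg_macav2metdata_tasmax_BNU-ESM_r1i1p1_historical_1950_2005_CONUS_monthly.nc"

if not uses_standard_port(data_path):
    print(f"Port {effective_port(data_path)} is likely blocked on mybinder.org")
```

A check like this only inspects the URL text, so it works without any network access and can give students a clearer message than the NetCDF I/O failure.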

I am not sure. These are the standard MACA v2 climate data. The URL does specify 8080, so it sounds like that port is blocked? Is the only option then to use a wget call after the image loads? I wanted to create a lesson where students could modify the URL and download the data that they want; however, it sounds to me like this may not be possible.

wget will also fail since it runs inside the container; the port restrictions are enforced at the network level. Could you speak to the data provider and see if the data is mirrored anywhere else?
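
To confirm from inside a running Binder session that the problem is the port rather than xarray or the server, a plain TCP connection attempt can be tried. This is a rough sketch using only the standard library; `can_connect` is a hypothetical helper, and the 5 second timeout is an arbitrary choice.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


# Example (run inside the Binder session; the result depends on the network):
# can_connect("thredds.northwestknowledge.net", 8080)
```

If this returns False on mybinder.org but True on a local machine, the block is at the network level, and no download tool running inside the container will get around it.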