Creating a future infrastructure for notebooks to be submitted and peer-reviewed

@somedave Is there a recent-ish write up anywhere of how your org makes use of nbgallery?

Is the peer review infrastructure mostly about a traditional workflow, or would you also support citation/source validation (e.g. ensuring adequate citation, and that citations are appropriate to the notebook)? I ask because I’ve been working on data source citation/validation at https://github.com/whythawk/whyqd which provides an audit trail for wrangled source data.

1 Like

Is the peer review infrastructure mostly about a traditional workflow, or would you also support citation/source validation (e.g. ensuring adequate citation, and that citations are appropriate to the notebook)?

The citation/source validation at this point would be manual - the reviewer could check whatever they want.

@psychemedia - I can look to work up something more recent, but this JupyterCon talk of mine from September '18 details our use of nbgallery and highlights the recommendation and health monitoring efforts within nbgallery (the curation/review framework came later).

Also possibly of interest would be this repo with our thoughts on dashboards, and this previous discourse post detailing our experience using Jupyter in a large enterprise setting.

3 Likes

Just to clarify: the notebook is replacing the paper, not supplementing it? Would the reviewer possibly re-execute it and therefore need access to more than just the notebook (e.g., environment, dependencies, à la Binder)? Given the GitHub/OpenJournal approach, I assume this would be fully open peer review where all are comfortable with Git issues and PRs (à la JOSS)?

With a fully-open review process managed via Git and OpenJournal, couldn’t you just make the “repo” a Binder? Once it’s through the review process, it gets published to Zenodo, same as the JOSS software artifacts, and can easily be re-executed later or possibly integrated into the OpenJournal interface via some sort of widget.

There are certainly examples of journals with alternative approaches to this type of peer review, but these typically involve traditional papers with supplemental computational artifacts, non-open journals, some degree of blindness, and integration with commercial review tools and publishing infrastructure.

3 Likes

Just to clarify: the notebook is replacing the paper, not supplementing it?

There is no requirement for a paper at this point. This is just for review of some computational work.

Would the reviewer possibly re-execute it and therefore need access to more than just the notebook (e.g., environment, dependencies, à la Binder)?

Yes.

Given the GitHub/OpenJournal approach, I assume this would be fully open peer review where all are comfortable with Git issues and PRs (à la JOSS)?

That would be one option, but maybe there are others too?

With a fully-open review process managed via Git and OpenJournal, couldn’t you just make the “repo” a Binder?

That’s the first comment in today’s Guidelines for submitting a notebook for peer review thread.

Once it’s through the review process, it gets published to Zenodo, same as the JOSS software artifacts, and can easily be re-executed later or possibly integrated into the OpenJournal interface via some sort of widget.

I think this is possible - if we choose this, a next question could be what tooling would be needed to make it work.

This was a great read. I now wish I had buckets of spare time to prepare and get community buy-in to run a study on mybinder.org following your ideas 🙂

3 Likes

Lots of great ideas here! I have two side projects which may help in the process:

  • data-vault - by introducing a single ZIP (“vault”) for data and embedding hash sums and timestamps when reading and saving files, I aim to increase the reproducibility of analyses. Used properly (with git and nbdime), it lets you trace when the data changed, and when submitting for review one could just share the ZIP (in addition to a git repository). It may not work for researchers who deal with more complex data types, but I think the idea of keeping a central data store and adding hash sums/timestamps may be a useful one (see the sketch at the end of this post).
    Edit: I am now aware that this is similar to an existing solution - nteract/scrapbook.

  • nbpipeline - a proof of concept for a reproducible pipeline of notebooks:

While this repository is not of the quality I would normally share with anyone, I think it addresses an important issue: going over code published on GitHub is often like navigating a maze where you cannot tell how different pieces connect to each other or in what order things were executed. Sometimes even finding where the actual results are is challenging!

I have seen repositories using 01_Data_cleaning.ipynb, 02_Analysis_A.ipynb, etc., which might be just fine for smaller repositories - but definitely, enabling users to specify how different notebooks relate to each other would be very helpful!
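To make the hash sum/timestamp idea above concrete, here is a minimal sketch - not data-vault's actual API, just the underlying idea - of recording a hash and timestamp every time a notebook reads a data file; the audit-log file name is hypothetical:

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

AUDIT_LOG = Path("data_audit.json")  # hypothetical log file, not part of data-vault

def read_with_audit(path):
    """Read a data file and append its SHA-256 hash plus a UTC timestamp to a
    simple audit log, so a reviewer can check that the shared data matches
    what the notebook actually consumed."""
    data = Path(path).read_bytes()
    record = {
        "file": str(path),
        "sha256": hashlib.sha256(data).hexdigest(),
        "read_at": datetime.now(timezone.utc).isoformat(),
    }
    history = json.loads(AUDIT_LOG.read_text()) if AUDIT_LOG.exists() else []
    history.append(record)
    AUDIT_LOG.write_text(json.dumps(history, indent=2))
    return data
```

Committing the log alongside the notebooks (and diffing it with git/nbdime) is what makes changes to the data visible during review.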

1 Like

I appreciate this coming up here, because we’re also working on this at Gigantum. At this point we have a few models for how this can work.

At the core of our approach is a desire to be more accessible than GitHub and more financially sustainable by ensuring broad portability of both data and compute (as opposed to a single cloud or national / institutional infrastructure). I will point out that Binder has a similar decentralized model, but I think the decentralization is more for administrators (or at least developers) than for end-users. That’s not bad - just different (and I’d be happy to discuss finer points - but for now I’m focused on an overview of my perspective from my experience at Gigantum).

There is a clear and vocal contingent that wants to put stuff on GitHub. But there are also lots of people who are intimidated by GitHub, or who simply find it and the related requirements burdensome. Support for interop with external git repositories is a medium-term goal for us because of this demand, and I guess a GitHub option is important for any review tool.

I still believe that folks underestimate the impact of cognitive burden on “open science best practices” (and there is reasonable empirical evidence for this underestimation effect in general) - so it’s better for the actual science itself if review systems provide a scaffolded or even automated process. For example, the workflow of the PLOS or Frontiers review systems is far more universal than anything I can imagine achieving directly with GitHub. (Presumably Open Journal too - I’ve not used it! And if I’m behind the times on GitHub-based review, I would appreciate pointers!)

Relatedly, if things are easier to set up, the author and/or reviewer can use any extra bandwidth to improve the quality of the work and communication itself.

But I think it’s especially important to make things accessible hands-on (not just to look at, but to use). I think Randal Burns did a pretty good job with this project:

https://gigantum.com/randal/forestpacking-sdm2019

This involves benchmarking first on a local machine, then on a standardized AWS reference. Anyone can poke around with these benchmarking results by clicking the “launch JupyterLab” button, but they can also paste that URL into an application running locally and get a “launch JupyterLab” button there. This makes it far easier to reproduce a benchmark than it would be on Kubernetes (which is what we, and at least Binder, are using). Or you could just look through our complete record of every command sent to the Jupyter kernel, see what the person did, and trust their benchmarks. The reviewer can move directly to subjecting the author to whatever level of scrutiny is desired.

The large data inputs are in an attached forestpacking dataset, so if you just want to pull the project onto your laptop to review results on a plane or in the woods, you can conserve space and leave the datasets behind, or just grab one file, etc.

Anyway, in terms of tools for review per se, I wonder if a review system could be de-coupled from the publishing side of the Open Journal system?

Foundationally, my hope is that we get a variety of projects that have different focus (e.g., empowering end-user, making administration by institutions easier, hard-core developer mode, etc.). This translates into a desire for a review system that’s not tightly coupled to the systems for actually running or inspecting code and data!

1 Like

I realized there’s a question implicit in the above - is Randal’s project a good example of what a review process might help steer authors towards? What are other examples of “good” and (perhaps only sketched in the abstract) “bad” code projects that could be targets or things to avoid in the review process?

I have asked this question before - “good” examples included the re-analysis in Jupyter of the LIGO data (which I won’t link directly because I’m unsure which is the “right” one - but if anyone has trouble finding it, feel free to ask me).

You should take a look at JOSS if you haven’t - all reviews are open and available to read, so you can see how this works in practice. Also see Journal of Open Source Software (JOSS): design and first-year review and Publish Your Software: Introducing the Journal of Open Source Software (JOSS) for more discussion about it.

I absolutely love JOSS, and I think it nails a number of things. Most importantly, it created a category for what it is - a low-cost, community-moderated publication for authors of scientific / research software. Because I’ve trained in certain ways, I’ve enjoyed using the excellent GitHub interface to do reviews for JOSS.

BUT, a hard-learned lesson for me is that the majority of researchers I’ve worked with do not benefit from GitHub - but rather find it confusing.

So that’s why I was more curious about something like OJS (even though I’ve never used it for real): it’s decluttered and accessible for folks who struggle (or perhaps simply lack the patience) to navigate GitHub. There are also some rather spiffy CMS approaches that use GitHub as a backend - perhaps a system like that would be the best of both worlds? (In case you have no idea what I’m talking about: the first such system I was aware of was prose.io, but more recently, systems like netlifyCMS have been popular.) BUT, perhaps a simpler app that’s closer to a typical review form would be far less of a headache, and be accessible to almost anyone already.

I hope I’m not belaboring the point too much! But after working on accessibility and inclusivity in a variety of situations, I’ve found that it’s a hard point to drive home. And indeed, maybe an important part of the design process will be talking to less technical users (the sort who aren’t terribly inclined to be on the Jupyter Discourse!)

2 Likes

Perhaps it is worth mentioning the assumptions that we have for our users. My assumptions are:

  • Users are familiar with Python / R / etc enough to write analysis scripts in their papers
  • Users are already familiar with the Jupyter Notebook, and have used it before
  • Users are motivated enough to want to submit a notebook along with their work

It seems that this is a reasonable kind of user to build infrastructure for at first, because they’re the most likely to actually use and benefit from this infrastructure. In all likelihood, the people who would actually participate in a pilot of this kind are going to be those who are fairly familiar with notebooks (which is fortunately also a pretty large group of people).

That could be a test case to build interest, prototypes, and eventually to make a case that it’s worthwhile to build infrastructure / UI / etc around users that aren’t as familiar or motivated with coding practices.

I think this kind of thing would most naturally go in waves of development. Start off building infrastructure that makes it possible under the hood, perhaps relying on more power-user types to test and use the infrastructure. Try not to make any decisions that totally cut us off from extending functionality later. Then, if it’s got enough interest, start building out a more user-friendly UI for those who don’t want to use GitHub.

1 Like

Is it worth us forming a working group around this? I’ve been working on similar efforts (Kubeflow for reproducible pipelines/declarative services, MLSpec for schemas, MLBox for execution layer), but I would prefer to do this as a unified effort.

Just let me know!

2 Likes

I like the assumptions in your bullet points for a target user model, but I’m a bit worried about:

In general, I think you’re proposing a pretty good plan, @choldgraf. I would argue that it should be considered to still be in the brainstorming phase until you can get folks of the not-using-GitHub type actually using it. If you have even a “draft” design that’s not inclusive, I don’t think that’s setting us up for the kind of thing we all eventually want!

Hi @danielskatz (and thanks @labarba for the shoutout) - a bit late to this party but we built a simple and minimal workflow for document submission, review and publication inside Authorea. (Note: an Authorea document can include and execute Jupyter Notebooks). You can see the workflow in action in this video: https://www.youtube.com/watch?v=YQO0FDk4BDE

A couple of things to note: (1) a DOI can (optionally) be minted upon notebook publication, and (2) peer review reports (signed or anonymous) are published as well (transparent peer review).

2 Likes

Very cool! Is there a place where we can see the backend details of how that is handled, or code that others could use to build upon?

In addition to krassowski/data-vault, krassowski/nbpipeline, and nteract/scrapbook: pachyderm and quiltdata/quilt do (1) data versioning and (2) data analysis pipelines as sequences of container image invocations.

2 Likes

What is a Journal, what value do Journals provide, how can Journals and Notebooks merge to become supreme Notebook-hosting Journals?

What is a Journal? What value do Journals provide?

  • Document hosting: PostScript, LaTeX, PDF, HTML → HTML+RDF (RDFa), HTML+JSONLD
  • Document-level bibliographic metadata (see the sketch after this list):
    Title, Authors (Organizations, Funding), Abstract
  • Comments / Threaded Comments
  • Search: Documents, Comments, Datasets, Code
  • Premises: Inputs and Outputs
    • Citations as (typed) graph edges (already parsed into JSON-LD)
    • Code repositories with version control
    • Data repositories with version control
    • Image hosting: charts and figures (CDN: Content Delivery Network)
  • Recommended/similar articles
  • #LinkedResearch (linkeddata/dokieli)
  • Expert Community
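To make “document-level bibliographic metadata” and “citations as typed graph edges” concrete, here is a purely illustrative schema.org/JSON-LD record written as a Python dict; every title, name, DOI, and URL below is a placeholder, not a finished metadata profile:

```python
# Illustrative JSON-LD for a notebook-backed ScholarlyArticle, using
# schema.org vocabulary. All values are placeholders.
article_jsonld = {
    "@context": "https://schema.org",
    "@type": "ScholarlyArticle",
    "name": "Example notebook article",
    "author": [{"@type": "Person", "name": "A. Researcher"}],
    "abstract": "One-paragraph summary of the analysis.",
    # A citation expressed as a typed edge to another article
    "citation": [
        {"@type": "ScholarlyArticle", "identifier": "https://doi.org/10.xxxx/placeholder"}
    ],
    # Link to the code/data repository backing the article
    "isBasedOn": "https://github.com/example/notebook-article",
}
```

Embedding something like this in the published HTML (as RDFa or a JSON-LD script block) is what would let search and recommendation features work over the hosted notebooks.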

How can Journals and Notebooks merge to become supreme Notebook-hosting Journals?

  • Accept and host Jupyter notebook ScholarlyArticle(s):

    • nbviewer – read-only (nbconvert --to html notebook.ipynb); see the sketch after this list

    • BinderHub (JupyterHub) – read-write (repo2docker github.com/repo/name)

    • List of solutions for hosting Notebooks as/with Journals

  • Integrate with the https://mybinder.org/ BinderHub instance?
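For the read-only path, the conversion that nbviewer performs can also be scripted; a minimal sketch using nbconvert’s Python API (the filenames are hypothetical):

```python
# Convert a submitted notebook to standalone HTML - the same conversion
# `nbconvert --to html` does on the command line.
from nbconvert import HTMLExporter

exporter = HTMLExporter()
body, resources = exporter.from_filename("notebook.ipynb")  # hypothetical filename

with open("notebook.html", "w", encoding="utf-8") as f:
    f.write(body)
```

A journal could run this on submission to get a stable, read-only rendering, while the read-write (BinderHub/repo2docker) path stays available for reviewers who want to re-execute.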

1 Like

I’m quite late to this thread (thanks @danielskatz for pointing me to it), but I thought I’d share the notebook publishing solution we have developed for Pangeo Gallery. This is far from a complete / finished solution, but there may be some elements in our workflow that can be remixed / reused in other ways.

The main elements of Pangeo Gallery are:

  • The gallery is organized into repos. Each repo contains notebooks, a shared environment, and a simple configuration file. The repos can live in any organization.
  • Binderbot, a CLI which uses the Binder API to execute notebooks from within a running binder (a rough sketch of that kind of API call follows this list). This is the key ingredient that allows us to “build” notebooks in the cloud, in the user-specified environment.
  • A GitHub workflow, run on each repo in the gallery, which calls binderbot to build the notebooks and commits them to a separate branch in the same repo.
  • A Sphinx website which builds http://gallery.pangeo.io/ statically from the built notebooks. Each repo in the gallery is added as a submodule to the pangeo-gallery repo.
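To illustrate the kind of call Binderbot makes, here is a rough sketch against the public BinderHub build endpoint - this is not Binderbot’s own code, and the org/repo names are placeholders:

```python
import json
import requests

def launch_binder(org, repo, ref="HEAD"):
    """Ask mybinder.org to build a repo and stream build events until a
    notebook server is ready, then return its URL and token. Sketch only;
    error handling and retries are omitted."""
    url = f"https://mybinder.org/build/gh/{org}/{repo}/{ref}"
    with requests.get(url, stream=True) as resp:
        for line in resp.iter_lines():
            if not line or not line.startswith(b"data:"):
                continue  # the endpoint streams server-sent events
            event = json.loads(line[len(b"data:"):])
            if event.get("phase") == "ready":
                # The running server's URL and token can now be used to
                # execute notebooks through the Jupyter REST API.
                return event["url"], event["token"]
            if event.get("phase") == "failed":
                raise RuntimeError(event.get("message", "Binder build failed"))

# server_url, token = launch_binder("some-org", "some-gallery-repo")  # placeholders
```

Once the server is up, the returned URL and token can drive notebook execution through the standard Jupyter APIs, which is essentially what binderbot automates before the built notebooks are committed back to the repo.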

This combination of tools provides a fairly simple and lightweight way to continuously integrate notebooks and build them into a nice website, using all open-source tools and platforms. By using binder, we get interactive execution for free.

Going forward, this could conceivably form the basis of a peer-review / publication pipeline, similar to JOSS, in which the review occurs in the author’s repo itself, via comments, PRs, etc.

To achieve archivability, one would want to store the built notebooks in a more permanent repository with DOIs: either something custom-made for this purpose or, the easier path, Zenodo or Figshare.

2 Likes