This lesson is being piloted (Beta version)

Share and Publish Jupyter notebooks

Overview

Teaching: 30 min
Exercises: 30 min
Questions
  • How do you use MyBinder?

  • How do you make your GitHub repository citable with Zenodo?

Objectives
  • Learn about MyBinder

  • Learn about Zenodo

  • Learn how to make your GitHub repository citable with Zenodo

MyBinder reproducible environment

Sharing your GitHub repository along with your Jupyter notebooks and your LaTeX publications is an important step towards making your research reproducible. However, anyone who wants to rerun your programs or notebooks needs the same computational environment (Python, LaTeX, additional Python packages, etc.).

The next section (using Binder) shows how to make your research “fully” reproducible by offering users the same computational environment as the one we used during this workshop, with very little extra effort.

Important notice

This lesson has been taken from https://reproducible-science-curriculum.github.io/sharing-RR-Jupyter/ and is distributed under the Creative Commons Attribution (CC BY 4.0) license.

Reproducible computing environments with Binder

A short intro on Binder

Authors: Chris Holdgraf, M Pacer

Turn your GitHub repository into a reproducible environment with MyBinder

Preparing your GitHub repository for Binder

We would like to publish all the code in our repository with Binder. To be Binder-compliant, the repository needs one or more configuration files (plain text files) that specify the requirements for building and running the project’s code.

Sharing our Python environment (environment.yml)

This approach is recommended when all the additional packages and libraries you need are available from conda. Be aware that conda is an open-source package management system that is not limited to Python: many packages and libraries that are independent of Python or R are distributed via conda, so it is best to check online first whether your package is already available there. An example environment.yml:

name: resbaz
channels:
  - nordicesmhub
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - python=3.6
  - numpy
  - xarray
  - cartopy
  - matplotlib
  - netcdf4
  - geopandas
  - tqdm
  - rasterio
  - pandoc
  - html5lib
  - ipython_genutils
  - ipyvolume
  - ipywidgets
  - jupyter_latex_envs
  - jupyter_contrib_nbextensions
  - nbconvert
  - nbformat
  - jupyterlab
  - nodejs
  - scikit-image
  - ipypublish
  - pip:
      - jupyterlab_latex
      - jupyterlab-git

This file must be placed in the root directory of your repository on GitHub.
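
If you prefer to generate the file rather than write it by hand, the layout above can be sketched in a few lines of Python. This is only an illustration: the function name render_environment_yml and the small package list in the usage note are hypothetical, and the extra "pip" entry is an addition that is not in the example above (conda warns when a pip section is used without pip itself being listed as a dependency).

```python
def render_environment_yml(name, channels, conda_packages, pip_packages=()):
    """Return the text of a conda environment.yml file (hypothetical helper)."""
    lines = ["name: " + name, "channels:"]
    lines += ["  - " + channel for channel in channels]
    lines.append("dependencies:")
    lines += ["  - " + package for package in conda_packages]
    if pip_packages:
        # Assumption: list pip itself so that conda installs it before
        # processing the nested pip section.
        lines.append("  - pip")
        lines.append("  - pip:")
        lines += ["      - " + package for package in pip_packages]
    return "\n".join(lines) + "\n"
```

For example, render_environment_yml("resbaz", ["conda-forge", "defaults"], ["python=3.6", "numpy"], ["jupyterlab_latex"]) returns text that can be saved as environment.yml in the root directory of the repository.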

Sharing our complete workflow

With environment.yml alone, we can run Jupyter notebooks as long as they do not require additional system libraries. We cannot run LaTeX, however, because it is not available by default. To share our complete computational environment, additional system packages (LaTeX, etc.) need to be installed.

These packages are not available as conda packages, but Binder can install them with apt-get install when they are listed, one per line, in a file named apt.txt in the root directory of the repository:

texlive-latex-base
texlive-latex-recommended
texlive-science
texlive-latex-extra
texlive-fonts-recommended
dvipng
ghostscript
latexmk
texlive
vim

Commands that must run after the packages are installed, such as installing and enabling JupyterLab extensions, go in a script named postBuild, also in the root directory of the repository:

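#!/bin/bash

# Install the JupyterLab LaTeX server extension and enable it
pip install jupyterlab_latex
jupyter serverextension enable --sys-prefix jupyterlab_latex

# Install the JupyterLab front-end extensions (scoped npm packages need the leading @)
jupyter labextension install @jupyterlab/jupyterlab-drawio @jupyterlab/latex @jupyterlab/git
jupyter labextension install @jupyter-widgets/jupyterlab-manager

# Enable the server side of the Git extension
jupyter serverextension enable --py jupyterlab_git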

Note

The postBuild file must be executable to be used with repo2docker, the tool Binder uses to build your environment. To make it executable, run the following on Linux or macOS:

chmod +x postBuild

On Windows, run the following after adding the file with git add and before you commit it:

git update-index --chmod=+x postBuild
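
On Linux or macOS, you can also check the executable bit from Python before committing. The sketch below is a minimal illustration using only the standard library; the function names is_executable and make_executable are hypothetical. It does not apply to Windows, where the Git command above is the way to record the permission.

```python
import os
import stat


def is_executable(path):
    """Return True if the owner's execute bit is set on the file."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)


def make_executable(path):
    """Set the execute bit for owner, group and others, similar to chmod +x."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

A typical use is to call is_executable("postBuild") from the root of the repository and, if it returns False, call make_executable("postBuild") before committing.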

Launch your computational environment on Binder

  • Start your complete computational environment on Binder
  • Try to execute your notebook
  • Check your notebook can run in your Binder environment
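
A Binder environment for a GitHub repository is started by opening a launch URL on mybinder.org. The sketch below shows how such a URL is put together, assuming the public mybinder.org service; the function name binder_launch_url and the owner and repository names in the usage note are hypothetical placeholders.

```python
from urllib.parse import quote, urlencode


def binder_launch_url(owner, repo, ref="HEAD", urlpath=None):
    """Build a mybinder.org launch URL for a GitHub repository.

    ref is a branch, tag or commit; urlpath is an optional path to open
    in the running server, for example "lab" for JupyterLab.
    """
    base = "https://mybinder.org/v2/gh/{}/{}/{}".format(
        quote(owner, safe=""), quote(repo, safe=""), quote(ref, safe="")
    )
    if urlpath:
        return base + "?" + urlencode({"urlpath": urlpath})
    return base
```

For example, binder_launch_url("my-user", "my-repo", "master", urlpath="lab") gives a link that builds the repository (if needed) and opens it in JupyterLab.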

Get a shareable Binder Badge

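A Binder badge is a small image linked to the Binder launch URL of your repository. Adding it to your README lets others open your computational environment on Binder with one click.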
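
The Markdown for such a badge can be written by hand or generated. The following is a minimal sketch, assuming the badge image served by mybinder.org; the function name binder_badge_markdown is hypothetical, and the launch URL you pass in must be the one for your own repository.

```python
def binder_badge_markdown(launch_url):
    """Return Markdown that shows the Binder badge and links it to launch_url."""
    badge_image = "https://mybinder.org/badge_logo.svg"
    return "[![Binder]({})]({})".format(badge_image, launch_url)
```

Paste the returned line into the README.md of your repository so that the badge is displayed on its GitHub front page.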

Make your GitHub repository citable with Zenodo

Make your GitHub repository citable (DOI)

Your GitHub repository contains your scientific workflow, your programs and software, your datasets (or links to them) and your Jupyter dashboards. To make the work you share on GitHub citable, archive the repository to get a DOI (Digital Object Identifier). Your university may have its own data archive, or you can use the data archiving service Zenodo.

Login to Zenodo

Source: https://guides.github.com/activities/citable-code/zenodo-authorize.png

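Get a DOI for your GitHub repository

On Zenodo, enable the repository you want to archive, then create a release of it on GitHub.

Your GitHub repository is now linked to Zenodo, and each release is archived automatically and receives a DOI.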

Add your DOI to your GitHub repository
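
One common way to do this is to put a DOI badge in your README. The sketch below builds the Markdown for such a badge, assuming the badge images served by zenodo.org; the function name zenodo_doi_badge is hypothetical, and the DOI in the usage note is a made-up placeholder that you must replace with the DOI Zenodo assigned to your repository.

```python
def zenodo_doi_badge(doi):
    """Return Markdown for a Zenodo DOI badge that links to the DOI resolver."""
    doi = doi.strip()
    return "[![DOI](https://zenodo.org/badge/DOI/{0}.svg)](https://doi.org/{0})".format(doi)
```

For example, zenodo_doi_badge("10.5281/zenodo.0000000") returns a line that can be pasted into README.md next to the Binder badge.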

Key Points

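  • MyBinder turns a GitHub repository into a reproducible computational environment

  • Configuration files such as environment.yml and postBuild in the root directory of your GitHub repository tell Binder how to build that environment

  • Archiving your GitHub repository on Zenodo gives it a DOI and makes it citable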