IT Enables Research: Create Interactive Code Repositories using BinderHub

30 September, 2020

BinderHub builds and registers a Docker image from a Git repository, then connects it to JupyterHub and exposes a public IP address where users can interact with the code and its environment in a live JupyterHub instance. You can select a specific branch name, commit, or tag to serve.

BinderHub ties together:

  • JupyterHub, which supplies a scalable system for authenticating users and spawning single-user Jupyter Notebook servers, and

  • repo2docker, which generates a Docker image from a Git repository hosted online.

BinderHub runs on shared resources, which IT Research Computing manages efficiently by hosting the service on its production Kubernetes cluster.

WHAT IS BINDERHUB?

The primary goal of BinderHub is to create custom computing environments for remote users. An end user specifies the desired computing environment in a Git repo, and BinderHub then serves that environment at a URL which users can access remotely.

BinderHub allows researchers to quickly create the computational environment needed to interact with research code and data shared online. To interact with someone else’s work, you simply click a URL, and you are redirected to a live environment where you can run the code in the cloud. In this case, the cloud is KAUST’s on-premises Kubernetes cluster.
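As a rough illustration of what such a URL looks like, the sketch below assembles a launch link for a public GitHub repository at a given branch, tag, or commit. It assumes the KAUST deployment follows the upstream BinderHub path layout (`/v2/gh/<org>/<repo>/<ref>`); the function name and the `example-org/example-repo` repository are hypothetical.

```python
from urllib.parse import quote

# Base URL taken from this article. The path layout is an assumption based on
# the upstream BinderHub convention for GitHub repositories.
BINDER_BASE = "https://binder.kaust.edu.sa"


def binder_launch_url(org, repo, ref="HEAD", base=BINDER_BASE):
    """Build a launch URL; ref may be a branch name, a tag, or a commit hash."""
    org_part = quote(org, safe="")
    repo_part = quote(repo, safe="")
    ref_part = quote(ref)  # slashes are kept so refs like "feature/x" stay readable
    return f"{base.rstrip('/')}/v2/gh/{org_part}/{repo_part}/{ref_part}"


# Hypothetical repository, pinned to a tag.
print(binder_launch_url("example-org", "example-repo", "v1.0"))
```

Pinning the link to a tag or commit rather than a branch means everyone who opens it gets the same version of the code.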

BINDERHUB MAIN GOALS

  1. Let anyone in the world with an internet connection easily run the code that is in public repositories.

  2. Empower authors of code or computational content to quickly create interactive versions of their work.

  3. Make it easy to create Binder services anywhere.

  4. Develop Binder as a 100% open source, community-driven project.

HOW YOU CAN USE BINDERHUB

  • Share and publish scientific results: where academic research publications have accompanying code and data, BinderHub gives researchers a simple way to share those resources in a reproducible and interactive manner. This complements recent integrations of executable protocols and figures into journals, as well as efforts to formalize and share workflows between collaborators.

  • Showcase research and analytics software features: several organizations include Jupyter notebooks and a Binder badge with their open-source software packages to let users interact with and understand the functionality of their research and analytics tools.

  • Enhance online content with interactive computation: BinderHub makes it easy for anyone with a bit of Python or R knowledge to create interactive online experiences. This is similar to the experience that Shiny offers the R community, but it works across any programming language and user interface (including RStudio).

  • Distribute interactive educational materials: in many online and in-person educational settings, students and teachers need quick access to materials that allow them to explore and practice computational examples. BinderHub enables rapid, scalable creation and deployment of interactive computational curricula, and it is already being used in education.
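The Binder badge mentioned above is usually just a Markdown image link placed in a repository's README. A minimal sketch, assuming the commonly used badge image hosted at mybinder.org and a hypothetical launch URL (the helper name is also illustrative):

```python
# Assumption: the public badge image served by mybinder.org; a deployment may
# host its own copy, in which case pass a different badge_url.
DEFAULT_BADGE = "https://mybinder.org/badge_logo.svg"


def binder_badge_markdown(launch_url, badge_url=DEFAULT_BADGE):
    """Return Markdown for a clickable Binder badge pointing at launch_url."""
    return f"[![Binder]({badge_url})]({launch_url})"


# Hypothetical launch URL for illustration only.
print(binder_badge_markdown(
    "https://binder.kaust.edu.sa/v2/gh/example-org/example-repo/HEAD"
))
```

Pasting the printed line into a README renders a badge that opens the repository in a live session when clicked.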

WHAT IS A BINDER-READY REPOSITORY?

A Binder (also called a Binder-ready repository) is a code repository that has at least two things:

  1. Code or content that you would like people to run (e.g. Jupyter Notebooks)

  2. Configuration files for your environment (e.g. a Dockerfile)

WHAT ARE BINDER “CONFIGURATION FILES”?

These are files used by BinderHub to build the environment needed to run your code. For example, if you want to build a Jupyter Notebook server, you must tell BinderHub how to do it. For a list of all configuration files available to create an environment, see the Configuration Files page. You can also check the examples hosted on IT Research Computing’s GitLab or those hosted on GitHub.
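To check whether a local clone already contains something BinderHub can build from, a small script along these lines may help. It is a sketch under two assumptions drawn from the upstream repo2docker documentation: the file names listed are the recognized configuration files, and a `binder/` or `.binder/` folder, when present, takes precedence over the repository root. The function name is hypothetical.

```python
from pathlib import Path

# File names that the repo2docker documentation lists as configuration files.
CONFIG_FILES = frozenset({
    "environment.yml", "requirements.txt", "Pipfile", "Pipfile.lock",
    "setup.py", "Project.toml", "REQUIRE", "install.R", "DESCRIPTION",
    "apt.txt", "runtime.txt", "postBuild", "start", "default.nix",
    "Dockerfile",
})


def find_binder_config(repo_root):
    """Return the sorted names of configuration files a build would likely use."""
    root = Path(repo_root)
    # Assumption: binder/ or .binder/ overrides files in the repository root.
    for name in ("binder", ".binder"):
        if (root / name).is_dir():
            folder = root / name
            break
    else:
        folder = root
    return sorted(
        p.name for p in folder.iterdir()
        if p.is_file() and p.name in CONFIG_FILES
    )


if __name__ == "__main__":
    found = find_binder_config(".")
    print(found if found else "No configuration files found")
```

An empty result suggests the repository still needs at least one configuration file before it is Binder-ready.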

IT RESEARCH COMPUTING BINDERHUB SPECIFICATIONS

Each Jupyter Notebook server runs on two cores and 4 GB of RAM. No GPU is currently available. You will be automatically logged out after thirty (30) minutes of inactivity; if this happens, you will have to start over.

GitOps@KAUST

IT Research Computing would like to thank David Pugh of KVL, who road-tested BinderHub during his Data Science Workshop Series. Many improvements to this service are due to his valued (and patient!) feedback. Thank you, David!

IT Research Computing manages almost all of its products and services using GitOps principles, and BinderHub is no different. Automated GitLab pipelines push changes to all Kubernetes workers, and alerting is set up to monitor both Kubernetes and BinderHub. Our goal is to detect issues before users do. We try to avoid downtime; that’s why BinderHub runs on Kubernetes.

Run and share your Jupyter Notebooks at https://binder.kaust.edu.sa today!

CONTACT US

KAUST Information Technology Department

We make IT happen!