"When to use JupyterHub?"

yuvipanda · May 19, 2019, 7:57pm

Continuing the discussion from Is running JupyterHub as root a requirement for deployment?:

This is a question that has come up in many contexts many times. I think it would be useful to have a page laying out ‘when to use JupyterHub’.

What do people think? What kinda stuff should be in this page?

psychemedia · May 19, 2019, 8:29pm

I keep wondering if a cartoony flow chart might be useful?

Cartoony style would mean you could use informal language, express doubt on edges out of decision boxes, not be too serious / intimidating but still be useful?

minrk · May 19, 2019, 8:31pm

My perspective on this is: JupyterHub is for when you have or have access to computational resources that you want to make available to some users via Jupyter UI. It enables things like taking away the burden of package installation, etc. for courses, researchers, etc.

The short answer: “I have computers and users and I want to make it easy for those users to log in and use these computers”

My perspective on the main reasons to use JupyterHub vs rolling your own, from that starting point:

- JupyterHub implements login pages, authentication
- JupyterHub tracks activity for shutting down unused resources, etc.
- JupyterHub handles proxying lots of Jupyter servers under one URL
- lots of folks use JupyterHub already, so there’s a decent chance someone has seen the same problems you are facing
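
To make the "one URL" point concrete, here is a toy sketch of longest-prefix routing, which is roughly how a proxy in front of a hub can send each request to the right backend. This is not JupyterHub's own code. The `resolve` helper, the usernames, and the backend addresses are all hypothetical and only there for illustration.

```python
# Illustrative only: a toy longest-prefix routing table, not JupyterHub's proxy code.
# The usernames and backend addresses below are hypothetical.


def resolve(routes: dict[str, str], path: str) -> str:
    """Return the backend registered for the longest prefix that matches path."""
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    if not matches:
        raise LookupError(f"no route for {path}")
    return routes[max(matches, key=len)]


routes = {
    "/": "http://127.0.0.1:8081",              # default route: the hub (login, spawning)
    "/user/alice/": "http://127.0.0.1:50001",  # alice's single-user server
    "/user/bob/": "http://127.0.0.1:50002",    # bob's single-user server
}

print(resolve(routes, "/user/alice/lab"))  # goes to alice's server
print(resolve(routes, "/hub/login"))       # falls through to the hub
```

Every user sees the same hostname; only the path prefix decides which server answers.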

The main reason to roll your own, to me, is if you already have existing solutions to some or all of these and JupyterHub’s implementation only gets in the way. I’d use KubeFlow as an example - they have their own systems for deploying resources for users and authenticating access, and are experts in deploying services on Kubernetes, so JupyterHub’s abstractions have ended up more in the way than helpful. I think KubeFlow is better off rolling their own notebook deployment implementation than using JupyterHub. Tmpnb is borderline: tmpnb itself is much simpler and much faster than an equivalent JupyterHub configuration.

gnestor · May 20, 2019, 7:33pm

Thanks @minrk!

A few follow-up questions:

“It enables things like taking away the burden of package installation”

Can users still override this trivially? Some users may want to install their own extensions, etc.

“JupyterHub handles proxying lots of Jupyter servers under one URL”

Can one user access another user’s notebooks trivially?

psychemedia · May 20, 2019, 10:29pm

Re: package installation, via @yuvipanda here, there are TLJH plugins, for which the docs include a pattern that shows how to create a plugin to install additional conda packages.

gnestor · May 20, 2019, 10:57pm

What I’m asking specifically is if the user can customize their JupyterLab instance by persisting settings, editing the config, and installing labextensions. I assume that the first two are possible (as they can be stored in jupyter’s user directory) and the third is probably not (because the lab application exists outside the user directory).

fm75 · May 22, 2019, 4:27am

Great question.
There are so many alternatives. Perhaps we should build a matrix of the possibilities and the pros and cons, server side and client side. We have a spectrum of options. Below are some somewhat unrefined ideas based on my experience trying to create or provide services.

Candidates

tljh
jh on various platforms
binder - pretty awesome
personal jupyter lab/notebook
- local
- vm
- virtualenv

Barrier to entry

server side
- security
- resource management
- education, marketing
client side
- education - how to set up personal server securely
- configuration
- virtualenv on shared resources
- anaconda vs crafted

Other dimensions

gpu
personal
private cloud
public cloud

minrk · May 24, 2019, 8:08am

Re: whether users can still install their own packages: yes, generally. Most deployments are made in such a way that things like pip install --user can add packages specified by the user. conda install also typically works in a container-based deployment (docker, kubernetes), but users don’t usually have permissions to install packages in a shared environment in a case like TLJH. This is all about permissions and how the user environment is specified. JupyterHub doesn’t do anything to prevent or enable users installing packages; all the standard mechanisms for users to install packages work (or don’t).
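
As a rough way to see which of those mechanisms would work in a given environment, a user could run something like the sketch below from a notebook. It uses only the standard library; the `install_targets` helper name is made up for this example, and the result depends entirely on how the deployment set up permissions.

```python
import os
import site
import sys


def install_targets() -> dict:
    """Report where packages could be installed and whether this user may write there."""
    return {
        # The shared environment the kernel runs from (what a plain install targets).
        "shared_env": sys.prefix,
        "shared_env_writable": os.access(sys.prefix, os.W_OK),
        # The per-user location that `pip install --user` targets.
        "user_site": site.getusersitepackages(),
        # False/None when user site-packages are disabled (e.g. inside some virtualenvs).
        "user_site_enabled": bool(site.ENABLE_USER_SITE),
    }


for key, value in install_targets().items():
    print(f"{key}: {value}")
```

If `shared_env_writable` is false but `user_site_enabled` is true, a user-level install is the likely route.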

Re: accessing another user’s notebooks: this depends on storage/permissions. With a default shared filesystem, as in TLJH, this works the same as it would for any shared file on the filesystem - a shared directory and directory/file permissions govern who can read/write files.
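
For the shared-filesystem case, a minimal sketch of what "file permissions govern who can read" means in practice is below. The `who_can_read` helper is a made-up name for this example; it only decodes POSIX mode bits and ignores ACLs and the permissions on parent directories, which also matter.

```python
import stat


def who_can_read(mode: int) -> dict:
    """Decode the read bits of a POSIX file mode for owner, group, and others."""
    return {
        "owner": bool(mode & stat.S_IRUSR),
        "group": bool(mode & stat.S_IRGRP),
        "others": bool(mode & stat.S_IROTH),
    }


print(who_can_read(0o600))  # private notebook: only the owner can read it
print(who_can_read(0o644))  # world-readable notebook: any user on the machine can read it
```

For a real file, the mode would come from `os.stat(path).st_mode`.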

For a container-based deployment, this can be more complicated.

Re: persisting settings, editing the config, and installing labextensions: yes, these should all be possible in ~all JupyterHub deployments. Labextensions are the only thing that might be an issue, if jupyter lab build must go in $PREFIX instead of a user directory, but again this can be governed by permissions. If that’s the case, there should be an issue in jupyterlab to fix it. User-installed extensions should absolutely not need write permissions to sys.prefix. In a container-based deployment, users typically (though it is not a requirement) have permission to modify the env, so this works.

hayden · September 25, 2020, 1:39am

@minrk I would like to ask: is it necessary to install JupyterHub with Kubernetes in order to make your computing resources shareable?
