Is there some way to specify the "schema" to install the Jupyterhub database into as part of a configuration property?

mcberma · March 1, 2021, 11:34pm

We are running JupyterHub’s database in Postgres 11+. Our DBAs have a requirement that all database tables must live in their own schema and not in the default “public” schema. Is there some way to specify the schema to install the JupyterHub database into (along with the alembic tables) as part of a configuration property?

Really do not relish the idea of having to modify code to accomplish this.

minrk · March 2, 2021, 3:19pm

I’m not sure if this is quite what you are after, but from the sqlalchemy docs, you can dump the current creation sql to a string with:

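from sqlalchemy import create_engine
from jupyterhub.orm import Base

def dump(sql, *multiparams, **params):
    # print each DDL statement instead of executing it
    print(sql.compile(dialect=engine.dialect))

# strategy='mock' is the SQLAlchemy 1.3 spelling;
# SQLAlchemy 1.4+ provides create_mock_engine(url, executor) instead
engine = create_engine('postgresql://', strategy='mock', executor=dump)

# emit CREATE statements for every JupyterHub table
Base.metadata.create_all(engine, checkfirst=False)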

which gives this output for jupyterhub 1.3.0. That should be everything except the alembic table itself. That can be seen with:

python -m jupyterhub.dbutil alembic stamp --sql 4dc2d5a8c53c

Note: in general, python -m jupyterhub.dbutil alembic is a shortcut for running alembic that loads the db url from your jupyterhub_config.py and locates the alembic revisions bundled with jupyterhub.

Any alembic subcommand or argument can be used there.

Assuming you are in a working directory with a jupyterhub_config.py that has:

c.JupyterHub.db_url = "postgresql://"

the stamp command above gives:

CREATE TABLE alembic_version (
    version_num VARCHAR(32) NOT NULL,
    CONSTRAINT alembic_version_pkc PRIMARY KEY (version_num)
);

That last bit is static and does not depend on the jupyterhub version, so autogenerating it may be more work than it’s worth. You can instead add it by hand to the auto-generated jupyterhub schema.
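As a rough sketch of that manual step (not something JupyterHub provides), the dumped DDL could be wrapped so that it lands in a dedicated schema. The function name wrap_in_schema and the schema name jupyterhub are made up for illustration, and the sketch assumes the dumped statements are already terminated with semicolons. CREATE SCHEMA and SET search_path are standard Postgres statements, but whether the Hub then finds its tables at runtime depends on the search_path of the role it connects as, which this does not cover.

```python
import re

# The static alembic table definition quoted above.
ALEMBIC_VERSION_DDL = """CREATE TABLE alembic_version (
    version_num VARCHAR(32) NOT NULL,
    CONSTRAINT alembic_version_pkc PRIMARY KEY (version_num)
);"""

# Only accept plain, unquoted identifiers so the schema name
# can be interpolated into SQL safely.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def wrap_in_schema(ddl: str, schema: str) -> str:
    """Return a script that creates `schema`, makes it the default for
    the session, then runs the given DDL plus the alembic_version table.

    Hypothetical helper: `ddl` is assumed to be the semicolon-terminated
    output of the create_all dump.
    """
    if not _IDENTIFIER.fullmatch(schema):
        raise ValueError(f"not a plain SQL identifier: {schema!r}")
    parts = [
        f"CREATE SCHEMA IF NOT EXISTS {schema};",
        f"SET search_path TO {schema};",
        ddl.strip(),
        ALEMBIC_VERSION_DDL,
    ]
    return "\n\n".join(parts) + "\n"
```

Calling wrap_in_schema(dumped_sql, "jupyterhub") returns one script that a DBA could review and run with psql.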

Once you are up and running, you can do upgrades offline by having alembic emit the SQL instead of running it, via python -m jupyterhub.dbutil alembic upgrade from:to --sql:

# get the current revision of your db
$ python -m jupyterhub.dbutil alembic current
896818069c98
# get the latest version your db needs to upgrade to
$ python -m jupyterhub.dbutil alembic heads
4dc2d5a8c53c (head)
# prepare the upgrade offline (emit sql, don't run it)
$ python -m jupyterhub.dbutil alembic upgrade 896818069c98:4dc2d5a8c53c --sql

However, I’ve never done offline upgrades, so it’s entirely possible that our upgrade/downgrade scripts make assumptions that do not hold when running offline (e.g. checking for the existence of tables), in which case the alembic scripts may need to be modified to do the upgrades.
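If you want to script the three steps above, a minimal sketch might look like the following. The helper names first_revision and offline_upgrade_command are hypothetical, and the parsing assumes the output of alembic current and alembic heads looks like the samples shown above (revision id as the first word of the first non-empty line). It only builds the command; it does not run it.

```python
import shlex


def first_revision(output: str) -> str:
    """Return the first word of the first non-empty line of `output`."""
    for line in output.splitlines():
        line = line.strip()
        if line:
            return line.split()[0]
    raise ValueError("no revision found in output")


def offline_upgrade_command(current_output: str, heads_output: str) -> list:
    """Build the argv for an offline (--sql) upgrade from the captured
    output of `alembic current` and `alembic heads`."""
    current = first_revision(current_output)
    head = first_revision(heads_output)
    if current == head:
        raise ValueError("database is already at the head revision")
    return [
        "python", "-m", "jupyterhub.dbutil", "alembic",
        "upgrade", f"{current}:{head}", "--sql",
    ]


def as_shell(argv: list) -> str:
    """Render the argv as a copy-pasteable shell command."""
    return shlex.join(argv)
```

The returned list can be passed to subprocess.run, with stdout redirected to a file for review before anything is applied to the database.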

mcberma · March 2, 2021, 4:00pm

This is awesome.

(THANK YOU SO MUCH)^1000.
