JupyterHub uses a single session instance throughout the application's life cycle. That makes sense for a synchronous application, but JupyterHub is built on Tornado, which is asynchronous.
As far as I know, in Tornado it is better to create a session when a request comes in and close it when that request finishes. That way each request keeps its own session instance, and requests never get mixed up with each other.
Here is a tiny script that reproduces the issue:
```python
import json

import tornado.gen
import tornado.ioloop
import tornado.web
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from jupyterhub import orm

engine = create_engine('postgresql+psycopg2://xxxx:xxxx@localhost:5432/hub_test')
Session = sessionmaker(bind=engine)


class MainHandler(tornado.web.RequestHandler):
    @property
    def db(self):
        return self.settings['db']

    async def get(self):
        delay = self.get_argument('sleep', 0)
        name = self.get_argument('name')
        new_user = orm.User(name=name)
        self.db.add(new_user)
        # executing some heavy task
        await tornado.gen.sleep(int(delay))
        # record users to be added in this request
        new_instances = self.db.new
        self.db.commit()
        self.write(json.dumps(
            list(map(lambda u: u.name, new_instances))
        ))


def make_app():
    settings = dict(
        db=Session()  # one shared session for the whole app
    )
    return tornado.web.Application([
        (r"/", MainHandler),
    ], **settings)


if __name__ == "__main__":
    app = make_app()
    app.listen(8888)
    tornado.ioloop.IOLoop.current().start()
```
After starting it up, run the following commands in sequence:
```shell
curl "http://localhost:8888/?name=JoJo1&sleep=10"
# output [] after 10 seconds

# started about 2 seconds after the first request
curl "http://localhost:8888/?name=JoJo2&sleep=1"
# output ["JoJo1", "JoJo2"] after 1 second
```
Here is where the mess occurs: JoJo1 is actually committed by the second request (whose output contains both names), and the first request ends up adding nothing.
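To show what I mean by per-request sessions, here is a minimal sketch of that pattern. It stands in for the real app: it uses asyncio instead of a running Tornado server, an in-memory SQLite database, and a hypothetical `User` model instead of jupyterhub's `orm.User` (all of these are assumptions for illustration). Each simulated request opens its own session and closes it when done, so overlapping requests no longer see each other's pending objects.

```python
import asyncio

import sqlalchemy as sa
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class User(Base):
    # stand-in for jupyterhub's orm.User, for illustration only
    __tablename__ = 'users'
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)


engine = sa.create_engine('sqlite://')
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)


async def handle_request(name, delay, results):
    # one session per "request", closed when the request finishes
    session = Session()
    try:
        session.add(User(name=name))
        await asyncio.sleep(delay)  # simulated heavy task
        pending = [u.name for u in session.new]
        session.commit()
        results[name] = pending
    finally:
        session.close()


async def main():
    results = {}
    # two overlapping requests, like the curl commands above
    await asyncio.gather(
        handle_request('JoJo1', 0.2, results),
        handle_request('JoJo2', 0.05, results),
    )
    return results


results = asyncio.run(main())
print(results)  # each request reports only its own pending user
```

With a shared session, the faster request would have committed and reported both users; with a session per request, each one commits and reports only its own.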
I'm not sure whether this is an actual problem for JupyterHub. Can anybody explain? Thanks.