Random failure to connect when initializing or restarting a notebook kernel

A few days ago, I started having major issues with notebooks. Half the time, I can open a notebook just fine: the kernel initializes, connects, and then displays “idle”. The other half, it says “initializing”, then “connecting”, and either stays at that point forever or moves on to “unknown”. Either way, I cannot run cells. Meanwhile, the server log fills up with “Replacing stale connection” and “Nudge: attempt xx” messages. I can wait any amount of time in these situations and it will never connect, yet manually shutting down the kernel and launching a new one finishes in a few seconds (when it works). Whether it works seems to be pure chance. I also never have problems reconnecting to an already-running notebook after disconnecting, which suggests my internet connection is not the real culprit.

[I 2021-09-03 00:30:27.475 ServerApp] Kernel started: afd46c7f-1ac1-4e23-adb6-6b6be9747f01
/fast/jamesn8/anaconda3/envs/torch3/lib/python3.9/json/encoder.py:257: UserWarning: date_default is deprecated since jupyter_client 7.0.0. Use jupyter_client.jsonutil.json_default.
  return _iterencode(o, 0)
[W 2021-09-03 00:32:47.882 ServerApp] Notebook Mechanical/mechanical/TestNetwork.ipynb is not trusted
[I 2021-09-03 00:32:55.766 ServerApp] Kernel started: a7131b3d-54cf-4da7-acf7-44aa93807db2
[W 2021-09-03 00:33:16.269 ServerApp] Replacing stale connection: a7131b3d-54cf-4da7-acf7-44aa93807db2:15eaca3b-b97c-4214-957f-79652f5b84a1
[W 2021-09-03 00:33:37.276 ServerApp] Replacing stale connection: a7131b3d-54cf-4da7-acf7-44aa93807db2:15eaca3b-b97c-4214-957f-79652f5b84a1
[W 2021-09-03 00:33:56.043 ServerApp] Timeout waiting for kernel_info reply from a7131b3d-54cf-4da7-acf7-44aa93807db2
[I 2021-09-03 00:33:56.546 ServerApp] Starting buffering for a7131b3d-54cf-4da7-acf7-44aa93807db2:15eaca3b-b97c-4214-957f-79652f5b84a1
[I 2021-09-03 00:33:56.547 ServerApp] Restoring connection for a7131b3d-54cf-4da7-acf7-44aa93807db2:15eaca3b-b97c-4214-957f-79652f5b84a1
[W 2021-09-03 00:34:01.058 ServerApp] Nudge: attempt 10 on kernel a7131b3d-54cf-4da7-acf7-44aa93807db2
[W 2021-09-03 00:34:06.069 ServerApp] Nudge: attempt 20 on kernel a7131b3d-54cf-4da7-acf7-44aa93807db2
[W 2021-09-03 00:34:11.082 ServerApp] Nudge: attempt 30 on kernel a7131b3d-54cf-4da7-acf7-44aa93807db2
[W 2021-09-03 00:34:16.094 ServerApp] Nudge: attempt 40 on kernel a7131b3d-54cf-4da7-acf7-44aa93807db2
[W 2021-09-03 00:34:21.105 ServerApp] Nudge: attempt 50 on kernel a7131b3d-54cf-4da7-acf7-44aa93807db2
[W 2021-09-03 00:34:26.117 ServerApp] Nudge: attempt 60 on kernel a7131b3d-54cf-4da7-acf7-44aa93807db2
[W 2021-09-03 00:34:31.126 ServerApp] Nudge: attempt 70 on kernel a7131b3d-54cf-4da7-acf7-44aa93807db2
[W 2021-09-03 00:34:36.137 ServerApp] Nudge: attempt 80 on kernel a7131b3d-54cf-4da7-acf7-44aa93807db2
[W 2021-09-03 00:34:41.147 ServerApp] Nudge: attempt 90 on kernel a7131b3d-54cf-4da7-acf7-44aa93807db2
[W 2021-09-03 00:34:46.158 ServerApp] Nudge: attempt 100 on kernel a7131b3d-54cf-4da7-acf7-44aa93807db2
[W 2021-09-03 00:34:51.169 ServerApp] Nudge: attempt 110 on kernel a7131b3d-54cf-4da7-acf7-44aa93807db2
[W 2021-09-03 00:34:56.183 ServerApp] Nudge: attempt 120 on kernel a7131b3d-54cf-4da7-acf7-44aa93807db2
[E 2021-09-03 00:34:56.549 ServerApp] Uncaught exception GET /api/kernels/a7131b3d-54cf-4da7-acf7-44aa93807db2/channels?session_id=15eaca3b-b97c-4214-957f-79652f5b84a1 (::1)
    HTTPServerRequest(protocol='http', host='localhost:8089', method='GET', uri='/api/kernels/a7131b3d-54cf-4da7-acf7-44aa93807db2/channels?session_id=15eaca3b-b97c-4214-957f-79652f5b84a1', version='HTTP/1.1', remote_ip='::1')
    Traceback (most recent call last):
      File "/fast/jamesn8/anaconda3/envs/torch3/lib/python3.9/site-packages/tornado/websocket.py", line 956, in _accept_connection
        await open_result
    tornado.util.TimeoutError: Timeout
[W 2021-09-03 00:34:57.736 ServerApp] Replacing stale connection: a7131b3d-54cf-4da7-acf7-44aa93807db2:15eaca3b-b97c-4214-957f-79652f5b84a1
[W 2021-09-03 00:35:18.816 ServerApp] Replacing stale connection: a7131b3d-54cf-4da7-acf7-44aa93807db2:15eaca3b-b97c-4214-957f-79652f5b84a1

This continues until I am finally forced to give up. Sometimes shutting down the kernel and starting it again manually works. But this is incredibly frustrating, and I want to know what I can change to get Jupyter notebooks working again.
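To compare a good startup with a bad one, it can help to condense a server log like the one above into one record per kernel. The following is only a rough sketch: it assumes the log line layout shown in this post (`[LEVEL timestamp AppName] message`) and matches the message texts literally, so it may need adjusting for other Jupyter Server versions. The function name `summarize_kernel_log` is made up for illustration.

```python
import re

# Assumed line layout, taken from the log excerpts in this thread:
# [W 2021-09-03 00:33:16.269 ServerApp] message text
LOG_LINE = re.compile(
    r"^\[(?P<level>[A-Z]) "
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\w+\] (?P<message>.*)$"
)
# Kernel and session ids are UUIDs; the kernel id comes first in every
# message type handled below.
UUID = re.compile(r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}")
NUDGE = re.compile(r"Nudge: attempt (\d+) on kernel")


def summarize_kernel_log(log_text):
    """Return {kernel_id: summary dict} for the kernels mentioned in log_text."""
    summary = {}
    for line in log_text.splitlines():
        line_match = LOG_LINE.match(line)
        if line_match is None:
            continue  # traceback lines, blank lines, warnings from other tools
        message = line_match.group("message")
        id_match = UUID.search(message)
        if id_match is None:
            continue  # server messages that do not refer to a kernel
        entry = summary.setdefault(
            id_match.group(0),
            {
                "started": None,
                "stale_connections": 0,
                "kernel_info_timeout": False,
                "max_nudge_attempt": 0,
            },
        )
        if message.startswith("Kernel started:"):
            entry["started"] = line_match.group("timestamp")
        elif message.startswith("Replacing stale connection:"):
            entry["stale_connections"] += 1
        elif message.startswith("Timeout waiting for kernel_info reply"):
            entry["kernel_info_timeout"] = True
        else:
            nudge = NUDGE.match(message)
            if nudge is not None:
                attempt = int(nudge.group(1))
                entry["max_nudge_attempt"] = max(entry["max_nudge_attempt"], attempt)
    return summary
```

Passing the text of a saved server log to `summarize_kernel_log` gives one small dict per kernel id, which makes it easier to see which kernels never answered the `kernel_info` request and how many stale-connection warnings each one produced.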

The versions of the software I am using are:

jupyter core     : 4.7.1
jupyter-notebook : 6.4.3
qtconsole        : not installed
ipython          : 7.26.0
ipykernel        : 6.2.0
jupyter client   : 7.0.1
jupyter lab      : 3.1.9
nbconvert        : 6.1.0
ipywidgets       : 7.6.3
nbformat         : 5.1.3
traitlets        : 5.0.5
tornado          : 6.1
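A version list like the one above can also be collected from inside the environment the server runs in, which makes it easier for others in the thread to compare setups. This is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+). The distribution names in `PACKAGES` are my assumption about what is relevant here (including `pyzmq` and `jupyter-server`, which are not in the list above), and the helper name `installed_versions` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed set of distributions worth reporting for kernel connection
# problems; adjust the list for your own environment.
PACKAGES = [
    "jupyter-core",
    "jupyter-client",
    "jupyter-server",
    "notebook",
    "jupyterlab",
    "ipython",
    "ipykernel",
    "tornado",
    "pyzmq",
    "traitlets",
]


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or 'not installed'."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = "not installed"
    return result


if __name__ == "__main__":
    # Print the versions in the same "name : version" layout as the list above.
    for name, found in installed_versions().items():
        print(f"{name:<17}: {found}")
```

Run it with the same Python interpreter that launches Jupyter (for example, from the activated conda environment); otherwise it reports the versions of a different environment.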

I am also having the same problem here:

[I 2022-04-13 14:18:40.240 ServerApp] jupyterlab | extension was successfully linked.
[I 2022-04-13 14:18:40.253 ServerApp] nbclassic | extension was successfully linked.
[I 2022-04-13 14:18:41.066 ServerApp] notebook_shim | extension was successfully linked.
[W 2022-04-13 14:18:41.150 ServerApp] WARNING: The Jupyter server is listening on all IP addresses and not using encryption. This is not recommended.
[I 2022-04-13 14:18:41.152 ServerApp] notebook_shim | extension was successfully loaded.
[I 2022-04-13 14:18:41.154 LabApp] JupyterLab extension loaded from /home/chois7/miniconda3/envs/py39/lib/python3.9/site-packages/jupyterlab
[I 2022-04-13 14:18:41.154 LabApp] JupyterLab application directory is /juno/home/chois7/miniconda3/envs/py39/share/jupyter/lab
[I 2022-04-13 14:18:41.159 ServerApp] jupyterlab | extension was successfully loaded.
[I 2022-04-13 14:18:41.172 ServerApp] nbclassic | extension was successfully loaded.
[I 2022-04-13 14:18:41.173 ServerApp] Serving notebooks from local directory: /juno/home/chois7/documents
[I 2022-04-13 14:18:41.173 ServerApp] Jupyter Server 1.16.0 is running at:
[I 2022-04-13 14:18:41.173 ServerApp] http://localhost:8889/lab
[I 2022-04-13 14:18:41.174 ServerApp]  or http://127.0.0.1:8889/lab
[I 2022-04-13 14:18:41.174 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 2022-04-13 14:18:50.659 LabApp] Build is up to date
[I 2022-04-13 14:18:57.178 ServerApp] Kernel started: 5f982651-86b8-4993-994a-e16ec6fc1dd8

[W 2022-04-13 14:19:57.223 ServerApp] Timeout waiting for kernel_info reply from 5f982651-86b8-4993-994a-e16ec6fc1dd8
[W 2022-04-13 14:20:01.749 ServerApp] Nudge: attempt 10 on kernel 5f982651-86b8-4993-994a-e16ec6fc1dd8
[W 2022-04-13 14:20:01.750 ServerApp] Nudge: attempt 10 on kernel 5f982651-86b8-4993-994a-e16ec6fc1dd8
[W 2022-04-13 14:20:01.766 ServerApp] Nudge: attempt 10 on kernel 5f982651-86b8-4993-994a-e16ec6fc1dd8
[W 2022-04-13 14:20:01.766 ServerApp] Nudge: attempt 10 on kernel 5f982651-86b8-4993-994a-e16ec6fc1dd8
[W 2022-04-13 14:20:06.770 ServerApp] Nudge: attempt 20 on kernel 5f982651-86b8-4993-994a-e16ec6fc1dd8
[W 2022-04-13 14:20:06.782 ServerApp] Nudge: attempt 20 on kernel 5f982651-86b8-4993-994a-e16ec6fc1dd8
[W 2022-04-13 14:20:06.783 ServerApp] Nudge: attempt 20 on kernel 5f982651-86b8-4993-994a-e16ec6fc1dd8
[W 2022-04-13 14:20:06.783 ServerApp] Nudge: attempt 20 on kernel 5f982651-86b8-4993-994a-e16ec6fc1dd8
[W 2022-04-13 14:20:57.159 ServerApp] Nudge: attempt 120 on kernel 5f982651-86b8-4993-994a-e16ec6fc1dd8
[W 2022-04-13 14:20:57.159 ServerApp] Nudge: attempt 120 on kernel 5f982651-86b8-4993-994a-e16ec6fc1dd8
[W 2022-04-13 14:20:57.159 ServerApp] Nudge: attempt 120 on kernel 5f982651-86b8-4993-994a-e16ec6fc1dd8
[E 2022-04-13 14:20:57.229 ServerApp] Uncaught exception GET /api/kernels/5f982651-86b8-4993-994a-e16ec6fc1dd8/channels?session_id=5f771960-a0ef-4caf-9766-6f40f36e238d (10.0.202.15)
    HTTPServerRequest(protocol='http', host='localhost:8889', method='GET', uri='/api/kernels/5f982651-86b8-4993-994a-e16ec6fc1dd8/channels?session_id=5f771960-a0ef-4caf-9766-6f40f36e238d', version='HTTP/1.1', remote_ip='10.0.202.15')
    Traceback (most recent call last):
      File "/home/chois7/miniconda3/envs/py39/lib/python3.9/site-packages/tornado/websocket.py", line 956, in _accept_connection
        await open_result
    tornado.util.TimeoutError: Timeout
[E 2022-04-13 14:20:57.257 ServerApp] Uncaught exception GET /api/kernels/5f982651-86b8-4993-994a-e16ec6fc1dd8/channels?session_id=fb2a6371-4113-4d43-9266-fad5c8104287 (10.0.202.15)
    HTTPServerRequest(protocol='http', host='localhost:8889', method='GET', uri='/api/kernels/5f982651-86b8-4993-994a-e16ec6fc1dd8/channels?session_id=fb2a6371-4113-4d43-9266-fad5c8104287', version='HTTP/1.1', remote_ip='10.0.202.15')
    Traceback (most recent call last):
      File "/home/chois7/miniconda3/envs/py39/lib/python3.9/site-packages/tornado/websocket.py", line 956, in _accept_connection
        await open_result
    tornado.util.TimeoutError: Timeout
[E 2022-04-13 14:20:57.257 ServerApp] Uncaught exception GET /api/kernels/5f982651-86b8-4993-994a-e16ec6fc1dd8/channels?session_id=438f6c45-8a92-4112-8493-7a48eb7ede29 (10.0.202.15)
    HTTPServerRequest(protocol='http', host='localhost:8889', method='GET', uri='/api/kernels/5f982651-86b8-4993-994a-e16ec6fc1dd8/channels?session_id=438f6c45-8a92-4112-8493-7a48eb7ede29', version='HTTP/1.1', remote_ip='10.0.202.15')
    Traceback (most recent call last):
      File "/home/chois7/miniconda3/envs/py39/lib/python3.9/site-packages/tornado/websocket.py", line 956, in _accept_connection
        await open_result
      File "/home/chois7/miniconda3/envs/py39/lib/python3.9/asyncio/tasks.py", line 328, in __wakeup
        future.result()
    tornado.util.TimeoutError: Timeout
[E 2022-04-13 14:20:57.283 ServerApp] Uncaught exception GET /api/kernels/5f982651-86b8-4993-994a-e16ec6fc1dd8/channels?session_id=f913f2dc-f618-4b5d-8943-646ad305daec (10.0.202.15)
    HTTPServerRequest(protocol='http', host='localhost:8889', method='GET', uri='/api/kernels/5f982651-86b8-4993-994a-e16ec6fc1dd8/channels?session_id=f913f2dc-f618-4b5d-8943-646ad305daec', version='HTTP/1.1', remote_ip='10.0.202.15')
    Traceback (most recent call last):
      File "/home/chois7/miniconda3/envs/py39/lib/python3.9/site-packages/tornado/websocket.py", line 956, in _accept_connection
        await open_result
      File "/home/chois7/miniconda3/envs/py39/lib/python3.9/asyncio/tasks.py", line 328, in __wakeup
        future.result()
    tornado.util.TimeoutError: Timeout

This is driving me crazy.
Even after running pip install pyzmq==19, the same thing happens.