Comment 18 for bug 1817484

Dmitrii Shcherbakov (dmitriis) wrote :

Also to #15, #16 and #17:

It appears that the regiond DB listener only noticed that the connection was lost at 19:02:27 (almost an hour after the failover, which happened at ~18:12:50):

https://pastebin.canonical.com/p/g8FDBCtst5/

2019-02-25 18:02:24 maasserver.listener: [info] Listening for database notifications.
2019-02-25 19:02:27 maasserver.listener: [debug] Connection lost.
2019-02-25 19:02:30 maasserver.listener: [info] Listening for database notifications.
2019-02-25 20:03:54 maasserver.listener: [debug] Connection lost.
2019-02-25 20:03:54 maasserver.listener: [debug] Connection lost.
2019-02-25 20:03:54 maasserver.listener: [debug] Connection lost.
2019-02-25 20:03:57 maasserver.listener: [info] Listening for database notifications.
2019-02-25 20:03:57 maasserver.listener: [info] Listening for database notifications.
2019-02-25 20:03:57 maasserver.listener: [info] Listening for database notifications.
2019-02-25 20:10:19 maasserver.listener: [debug] Connection lost.
2019-02-25 20:10:22 maasserver.listener: [info] Listening for database notifications.
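
For reference, the delay between the failover and the first "Connection lost" line can be worked out from the timestamps. This is only a rough sketch: the failover time is the approximate ~18:12:50 estimate, and the variable names are illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Approximate failover time (assumption: the "~18:12:50" estimate).
failover = datetime.strptime("2019-02-25 18:12:50", FMT)
# First "Connection lost" line logged by maasserver.listener.
detected = datetime.strptime("2019-02-25 19:02:27", FMT)

# Time that passed before the listener noticed the lost connection.
delay = detected - failover
print(delay)  # 0:49:37
```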

Some client requests received 500 errors (because of the patched code):
https://pastebin.canonical.com/p/Q47fpgTfkf/

Then the listener reported a couple of times that it had started listening for notifications:
https://pastebin.canonical.com/p/Q47fpgTfkf/

The log also periodically gets messages like this:

2019-02-25 21:02:26 maasserver.bootresources: [critical] Importing boot resources failed.

        Traceback (most recent call last):
# ...
            listener.register("sys_stop_import", stop_import)
          File "/usr/lib/python3/dist-packages/maasserver/listener.py", line 223, in register
            "System channel '%s' has already been registered." % channel)
        maasserver.listener.PostgresListenerRegistrationError: System channel 'sys_stop_import' has already been registered.

This behavior started at 2019-02-25 19:02:27, when the listener first reported that it had lost the connection to the DB, and it is still continuing:

https://pastebin.canonical.com/p/KTHVjYGXHg/

The registration error comes from here: https://github.com/maas/maas/blob/2.5.0/src/maasserver/listener.py#L209-L223

          File "/usr/lib/python3/dist-packages/maasserver/listener.py", line 223, in register
            "System channel '%s' has already been registered." % channel)
        maasserver.listener.PostgresListenerRegistrationError: System channel 'sys_stop_import' has already been registered.
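
To illustrate the failure mode, here is a minimal sketch of how this kind of error can arise, assuming a register() that refuses a second handler on a system channel. SketchListener, RegistrationError and the "sys_" prefix check are hypothetical stand-ins, not the actual MAAS code; the real logic is at the link above.

```python
from collections import defaultdict


class RegistrationError(Exception):
    """Hypothetical stand-in for PostgresListenerRegistrationError."""


class SketchListener:
    """Toy listener: a system channel accepts only one handler."""

    def __init__(self):
        # Maps channel name -> list of registered handlers.
        self.handlers = defaultdict(list)

    def register(self, channel, handler):
        # Assumption: system channels are identified by a "sys_" prefix.
        if channel.startswith("sys_") and self.handlers[channel]:
            raise RegistrationError(
                "System channel '%s' has already been registered." % channel)
        self.handlers[channel].append(handler)


def stop_import():
    """Placeholder handler for the stop-import notification."""


listener = SketchListener()
listener.register("sys_stop_import", stop_import)

# A second registration without an intervening unregister is rejected.
try:
    listener.register("sys_stop_import", stop_import)
except RegistrationError as exc:
    message = str(exc)
    print(message)
```

If the handler from an earlier import attempt is still registered when the import is retried, every retry would hit this branch, which would match the periodic log messages. Whether that is what actually happens after the reconnect still needs to be confirmed.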