MAAS

Comment 9 for bug 1817484

Dmitrii Shcherbakov (dmitriis) wrote on 2019-02-25:

Andres,

Doesn't MAAS need to return something like "500 Internal Server Error" if it does not have proper connectivity to the database?

We can retry the update request based on that. However, if 200 is returned for the DNS update and the entry is in the DB, then as a client we consider the operation successful and proceed to checking the resolution against the desired result.
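To make the client-side logic concrete, here is a minimal sketch of what I mean. It is not the actual resource agent code: the function name, its parameters and the two injected callables (`send_update`, `resolve`) are hypothetical, and it assumes MAAS would answer with a 5xx status when it cannot reach the database.

```python
import time
from typing import Callable


def update_dns_with_retry(
    send_update: Callable[[], int],   # hypothetical: sends the update, returns the HTTP status
    resolve: Callable[[], str],       # hypothetical: resolves the name, returns an IP string
    expected_ip: str,
    max_attempts: int = 5,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if the update was accepted and resolution matches expected_ip."""
    for attempt in range(1, max_attempts + 1):
        status = send_update()
        if 500 <= status < 600:
            # Assumed server-side failure (e.g. no DB connectivity): retry.
            if attempt < max_attempts:
                sleep(delay_s)
            continue
        if status != 200:
            # Any other status is treated as non-retryable in this sketch.
            return False
        # 200: the client can only assume success and move on to
        # checking the resolution against the desired result.
        return resolve() == expected_ip
    return False
```

With this shape, a 200 that was not actually applied on the MAAS side shows up only as a failed resolution check, because the client has no status code to retry on.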

> in what situation does res_maas_region_hostname executes the update

We wait until the DB VIP comes up, which means that the master is up, and then start the DNS resource, which sends an update to MAAS.

In general, the resource agent that updates the record is no different from any other API client - so it cannot track MAAS internal DB connectivity state and issue requests based on that.

Could the issue arise because the DB update is executed in a different thread (or threads) from the one that updates bind9 and listens for postgres notifications? Maybe different sockets are used for these, so they are closed separately, which results in multiple failures? (I can see that many connections to the DB are maintained: https://pastebin.canonical.com/p/z3KRMmMCRC/).

From what I can see, the DNS update is done asynchronously:

https://github.com/maas/maas/blob/2.5.0/src/maasserver/region_controller.py#L160-L189