1. Yes, it gets created at first (successfully), then a pacemaker resource agent updates it on failover of the node the record points to.
In this case, the record points to maas-vhost1, which also happens to be the postgres DB master.
When maas-vhost1 is killed, the record is updated from the node where pacemaker decides to start the "dns record resource" (called res_maas_region_hostname), which is maas-vhost3 in this case.
2. The DB fails over to maas-vhost2, and there is an ordering constraint on res_maas_region_hostname such that pacemaker does not attempt to start it (on maas-vhost3 or maas-vhost2) until the DB VIP resource starts. The DB VIP only starts after a postgres master is available.
So: (1) the DB failover completes, (2) the DNS update is issued from maas-vhost3.
https://pastebin.canonical.com/p/ZZKMvRWDQN/
order ord_promote inf: ms_pgsql:promote res_pgsql_vip:start symmetrical=false
order ord_hostname_start inf: res_pgsql_vip:start res_maas_region_hostname:start symmetrical=false
3. The DB contains the new record, so either the notifications from postgres somehow do not get delivered from the new master, or MAAS does not process them correctly.
> do the region controllers get restarted
No, we do not restart them - we just expect that DB notifications will arrive and trigger a bind9 reload.
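For illustration, the start ordering described in point 2 can be sketched as a small dependency chain. This is only a toy model of the two `order` lines quoted above, not how pacemaker actually computes its transitions; the names `ORDER_CONSTRAINTS` and `action_sequence` are made up for this sketch, and `symmetrical=false` is not modelled (only the start/promote direction is considered).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each (first, then) pair mirrors one "order ... inf:" line from the
# configuration quoted above: `then` may only run once `first` has completed.
# Hypothetical representation, for illustration only.
ORDER_CONSTRAINTS = [
    ("ms_pgsql:promote", "res_pgsql_vip:start"),
    ("res_pgsql_vip:start", "res_maas_region_hostname:start"),
]


def action_sequence(constraints):
    """Return the actions in an order that satisfies every constraint."""
    sorter = TopologicalSorter()
    for first, then in constraints:
        sorter.add(then, first)  # `then` depends on `first`
    return list(sorter.static_order())


if __name__ == "__main__":
    for step, action in enumerate(action_sequence(ORDER_CONSTRAINTS), start=1):
        print(step, action)
```

Under these assumptions the hostname (DNS record) resource is always the last action in the chain, which matches the observed sequence: the DB failover completes first, and only then is the DNS update issued.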
Andres,
Logs from maas-vhost2 (current db master):
https://private-fileshare.canonical.com/~dima/maas-dumps/2019-02-25-maas-vhost2-etc-var-log.tar.gz
Logs from maas-vhost1 (failed master):
https://private-fileshare.canonical.com/~dima/maas-dumps/2019-02-25-maas-vhost1-etc-var-log.tar.gz
I used libguestfs to extract the logs from the killed machine. It is offline, and you can see garbage at the end of the region log file there because the page cache was obviously not flushed to disk before the forced shutoff.