Comment 25 for bug 1874719

Andreas Hasenack (ahasenack) wrote:

> However, hacluster charm should be handling this situation. There's nothing special here -
> corosync has specific behaviour out of the box. Charms should handle it.

I'm trying to understand the sequence of steps that led to this situation on Focal. From what I understand:

- install corosync
- you get a cluster named "debian" with a single node named "node1" (which does not match the hostname) in each unit; if you deployed 3 principals and 3 hacluster subordinates, each unit is still its own isolated single-node island (roughly the stock config sketched after this list)
- time passes
- some relation gets added and the charm reacts: it renames each node to the hostname (juju-xxxx), and I believe even the node id is changed; corosync.conf and other files are propagated to all nodes, which then try to form a cluster (I don't know what it does to the cluster name), and services get restarted
- at that point, some nodes (all?) still have "node1" stored somewhere, which was their old name before the rename
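
For reference, the stock corosync.conf shipped by the Debian/Ubuntu package looks roughly like this (from memory, trimmed; exact crypto options and comments may differ):

    totem {
        version: 2
        # this is the default cluster name the package ships with
        cluster_name: debian
        crypto_cipher: none
        crypto_hash: none
    }

    nodelist {
        node {
            # default single-node entry, not tied to the real hostname
            name: node1
            nodeid: 1
            ring0_addr: 127.0.0.1
        }
    }

    quorum {
        provider: corosync_votequorum
    }

If I read the steps above correctly, the charm then replaces that nodelist with the juju-xxxx names, real addresses and new nodeids, and pushes the result to every unit, which is where the renaming question below comes from.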

So is this a scenario of renaming a node while it's part of a cluster, even when that cluster is the single-node one from the default package installation? I would expect that default single-node cluster to be destroyed when the charm takes over. Maybe it's hard to distinguish that scenario from a real multi-node cluster that is merely degraded.
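
To make the "node1 stored somewhere" part concrete, these are the places I would look on an affected unit (the query commands are real, but the last line is only a hypothetical cleanup for illustration, not something the charm does today):

    # what corosync currently has in its runtime nodelist
    sudo corosync-cmapctl | grep nodelist.node

    # what pacemaker has cached for node ids/names
    sudo crm_node -l

    # hypothetical cleanup: forget the stale default node after the rename
    sudo crm_node -R node1 --force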

In general, I prefer the idea of using the hostname as the node name by default. The hostname at least has a chance of being unique right after install, whereas "node1" definitely does not, so it's the less surprising behavior. For an SRU that changes this back to the hostname, though, the pro/con balance isn't clear: fresh installs would be better after the SRU, but existing ones would get a dpkg conffile prompt and risk a wrong answer leaving the cluster broken. All in all, everyone deploying a more-than-one-node cluster will *definitely* change the conf file anyway for other reasons (or will they?), which also makes the "node1" argument kind of moot.