Corosync: Assertion 'sender_node != NULL' failed when bind iface is ready after corosync boots
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
corosync (Ubuntu) | Fix Released | Undecided | Unassigned |
Trusty | Fix Released | Medium | Victor Tapia |
Xenial | Fix Released | Medium | Victor Tapia |
Zesty | Fix Released | Undecided | Unassigned |
Artful | Fix Released | Undecided | Unassigned |
Bug Description
[Impact]
Corosync aborts (SIGABRT) if it starts before the interface it has to bind to is ready.
On boot, if no interface in the bindnetaddr range is up/configured, corosync binds to lo (127.0.0.1). Once an applicable interface is up, corosync crashes with the following error message:
corosync: votequorum.c:2019: message_
Aborted (core dumped)
The last log entries show that the interface is trying to join the cluster:
Dec 19 11:36:05 [22167] xenial-pacemaker corosync debug [TOTEM ] totemsrp.c:2089 entering OPERATIONAL state.
Dec 19 11:36:05 [22167] xenial-pacemaker corosync notice [TOTEM ] totemsrp.c:2095 A new membership (169.254.
During the quorum calculation, corosync uses the generated nodeid (704573706) for the node instead of the nodeid specified in the configuration file (1). The assert fails because the generated nodeid is not present in the member list. Corosync should use the configured nodeid and continue running after the interface is up, as shown in the boot log of a fixed corosync:
Dec 19 11:50:56 [4824] xenial-corosync corosync notice [TOTEM ] totemsrp.c:2095 A new membership (169.254.241.10:80) was formed. Members joined: 1
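The generated nodeid in the crash is not arbitrary: 704573706 equals the bind address 169.254.241.10 read as a 32-bit integer with the top bit cleared. The sketch below only reproduces that arithmetic and the failed membership lookup; it is not taken from corosync's source, and the function name and the member set (nodeid 2 in particular) are hypothetical.

```python
import ipaddress


def generated_nodeid(ipv4, clear_high_bit=True):
    """Turn an IPv4 address into a nodeid-like integer (illustrative only)."""
    value = int(ipaddress.IPv4Address(ipv4))  # 169.254.241.10 -> 0xA9FEF10A
    if clear_high_bit:
        value &= 0x7FFFFFFF                   # drop the most significant bit
    return value


# Hypothetical member list keyed by configured nodeids (assumed, not from the report).
configured_members = {1, 2}

nodeid = generated_nodeid("169.254.241.10")
print(nodeid)                        # 704573706
print(nodeid in configured_members)  # False: the lookup that the assert guards
```

With the configured nodeid (1) the lookup succeeds, which matches the "Members joined: 1" line in the fixed boot log.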
[Environment]
Xenial 16.04.3
Packages:
ii corosync 2.3.5-3ubuntu1 amd64 cluster engine daemon and utilities
ii libcorosync-
[Test Case]
Config:
totem {
    version: 2
    member {
    }
    member {
    }
    transport: udpu
    nodeid: 1
    interface {
    }
}
quorum {
    provider: corosync_votequorum
}
nodelist {
    node {
    }
    node {
    }
}
1. ifdown interface (169.254.241.10)
2. start corosync (/usr/sbin/corosync -f)
3. ifup interface
[Regression Potential]
This patch affects corosync boot, so any regression would most likely show up as other problems during corosync startup and/or configuration parsing.
[Other info]
# Upstream corosync commit :
https:/
# git describe aab55a004bb12eb
v2.3.5-45-gaab55a0
# rmadison corosync
corosync | 2.3.3-1ubuntu1 | trusty | source, amd64, arm64, armhf, i386, powerpc, ppc64el
corosync | 2.3.3-1ubuntu3 | trusty-updates | source, amd64, arm64, armhf, i386, powerpc, ppc64el
corosync | 2.3.5-3ubuntu1 | xenial | source, amd64, arm64, armhf, i386, powerpc, ppc64el, s390x
corosync | 2.4.2-3build1 | zesty | source, amd64, arm64, armhf, i386, ppc64el, s390x
corosync | 2.4.2-3build1 | artful | source, amd64, arm64, armhf, i386, ppc64el, s390x
corosync | 2.4.2-3build1 | bionic | source, amd64, arm64, armhf, i386, ppc64el, s390x
description: updated
description: updated
tags: added: sts sts-sponsor-ddstreet
Changed in corosync (Ubuntu):
status: New → Fix Released
Changed in corosync (Ubuntu Trusty):
assignee: nobody → Victor Tapia (vtapia)
Changed in corosync (Ubuntu Xenial):
assignee: nobody → Victor Tapia (vtapia)
status: New → In Progress
Changed in corosync (Ubuntu Trusty):
status: New → In Progress
importance: Undecided → Medium
Changed in corosync (Ubuntu Xenial):
importance: Undecided → Medium
description: updated
tags: added: sts-sponsor-ddstreet-done; removed: sts-sponsor-ddstreet
Changed in corosync (Ubuntu Xenial):
status: Incomplete → In Progress
Changed in corosync (Ubuntu Zesty):
status: New → Fix Released
Changed in corosync (Ubuntu Artful):
status: New → Fix Released
description: updated
description: updated
tags: added: verification-done verification-done-trusty verification-done-xenial; removed: verification-needed verification-needed-trusty verification-needed-xenial
I just noticed that the member{} group should be inside interface{}; placing it outside triggers the bug described by this commit: https://github.com/corosync/corosync/commit/aab55a004bb12ebe78db341dc56759dfe710c1b2
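A quick way to spot this misplacement in a config file is to track block nesting and flag any member{} whose parent is not interface{}. The following is a minimal sketch, not a corosync tool: the function name is hypothetical, and it assumes each block opens with "name {" on its own line and closes with "}" on its own line.

```python
def misplaced_member_blocks(conf_text):
    """Return line numbers of member{} blocks not directly inside interface{}."""
    stack = []      # names of the currently open blocks, outermost first
    misplaced = []
    for lineno, raw in enumerate(conf_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line.endswith("{"):
            name = line[:-1].strip()
            if name == "member" and (not stack or stack[-1] != "interface"):
                misplaced.append(lineno)
            stack.append(name)
        elif line == "}" and stack:
            stack.pop()
    return misplaced


# Shape of the test config from this report, with one correctly nested member{}.
sample = """totem {
member {
}
interface {
member {
}
}
}
"""
print(misplaced_member_blocks(sample))  # [2]
```

Only the member{} on line 2, which sits directly under totem{}, is reported; the one nested in interface{} passes.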