[bionic] fence_scsi not working properly with Pacemaker 1.1.18-2ubuntu1.1
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
pacemaker (Ubuntu) | Fix Released | Undecided | Unassigned |
Bionic | Fix Released | Undecided | Unassigned |
Bug Description
Note: This bug was originally part of LP: #1865523 but was split out.
#### SRU: pacemaker
[Impact]
* fence_scsi is currently not working in a shared disk environment
* all clusters relying on fence_scsi and/or fence_scsi + watchdog won't be able to start the fencing agents OR, in the worst case, the fence_scsi agent might start but won't make SCSI reservations on the shared SCSI disk.
* this bug addresses the pacemaker 1.1.18 issues with fence_scsi, since the latter was fixed in LP: #1865523.
[Test Case]
* with a 3-node setup (nodes called "clubionic01", "clubionic02" and "clubionic03"), a shared SCSI disk /dev/sda that fully supports persistent reservations, and corosync and pacemaker operational and running, one might try:
rafaeldtinoco@
crm(live)configure# property stonith-enabled=on
crm(live)configure# property stonith-action=off
crm(live)configure# property no-quorum-
crm(live)configure# property have-watchdog=true
crm(live)configure# commit
crm(live)configure# end
crm(live)# end
rafaeldtinoco@
stonith:
pcmk_
devices=
meta provides=unfencing
And see the following errors:
Failed Actions:
* fence_clubionic
last-
* fence_clubionic
last-
* fence_clubionic
last-
and corosync.log will show:
warning: unpack_
[Regression Potential]
* LP: #1865523 shows fence_scsi fully operational after SRU for that bug is done.
* LP: #1865523 used pacemaker 1.1.19 (vanilla) in order to fix fence_scsi.
* There are changes to: the cluster resource manager daemon, the local resource manager daemon and the policy engine. Of all the changes, the policy engine fix is the biggest, but it is still not big for an SRU. It could cause the policy engine, and thus cluster decisions, to malfunction.
* All patches are based on upstream fixes made right after Pacemaker-1.1.18, the version used by Ubuntu Bionic, and were tested with fence_scsi to make sure they fixed the issues.
[Other Info]
* Original Description:
While trying to set up a cluster with an iSCSI shared disk, using fence_scsi as the fencing mechanism, I realized that fence_scsi is not working in Ubuntu Bionic. I first thought it was related to the Azure environment (LP: #1864419), where I was trying this setup, but then, trying locally, I figured out that somehow pacemaker 1.1.18 is not fencing the shared SCSI disk properly.
Note: I was able to "backport" vanilla 1.1.19 from upstream and fence_scsi worked. I then tried 1.1.18 without any of the quilt patches and it didn't work either. I think that bisecting 1.1.18 <-> 1.1.19 might tell us which commit fixed the behaviour needed by the fence_scsi agent.
(k)rafaeldtinoc
node 1: clubionic01.private
node 2: clubionic02.private
node 3: clubionic03.private
primitive fence_clubionic stonith:fence_scsi \
params pcmk_host_
meta provides=unfencing
property cib-bootstrap-
----
(k)rafaeldtinoc
Stack: corosync
Current DC: clubionic01.private (version 1.1.18-2b07d5c5a9) - partition with quorum
Last updated: Mon Mar 2 15:55:30 2020
Last change: Mon Mar 2 15:45:33 2020 by root via cibadmin on clubionic01.private
3 nodes configured
1 resource configured
Online: [ clubionic01.private clubionic02.private clubionic03.private ]
Active resources:
fence_clubionic (stonith:
----
(k)rafaeldtinoc
LIO-ORG cluster.bionic. 4.0
Peripheral device type: disk
PR generation=0x0, there are NO registered reservation keys
(k)rafaeldtinoc
LIO-ORG cluster.bionic. 4.0
Peripheral device type: disk
PR generation=0x0, there is NO reservation held
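The two outputs above are the symptom in a nutshell: the fencing resource is reported as active, yet the shared disk has no registered keys and no reservation. As a minimal sketch of how that check could be scripted, the Python below only parses text of the shape shown in this report. The function names are hypothetical, the sample strings are copied from the output above, and the command that produced them is truncated in the report, so it is deliberately not reproduced or guessed here.

```python
import re
from typing import Optional

# Sample text copied verbatim from the report (illustrative input only).
KEYS_OUTPUT = """LIO-ORG cluster.bionic. 4.0
Peripheral device type: disk
PR generation=0x0, there are NO registered reservation keys"""

RESERVATION_OUTPUT = """LIO-ORG cluster.bionic. 4.0
Peripheral device type: disk
PR generation=0x0, there is NO reservation held"""


def pr_generation(output: str) -> Optional[int]:
    """Return the PR generation counter as an int, or None if absent."""
    match = re.search(r"PR generation=0x([0-9a-fA-F]+)", output)
    return int(match.group(1), 16) if match else None


def reports_no_keys(output: str) -> bool:
    """True if the text says that no reservation keys are registered."""
    return "NO registered reservation keys" in output


def reports_no_reservation(output: str) -> bool:
    """True if the text says that no reservation is held."""
    return "NO reservation held" in output


def unfencing_looks_missing(keys_output: str, reservation_output: str) -> bool:
    """True when neither keys nor a reservation are present on the disk."""
    return reports_no_keys(keys_output) and reports_no_reservation(reservation_output)


if __name__ == "__main__":
    print("generation:", pr_generation(KEYS_OUTPUT))
    print("unfencing missing:", unfencing_looks_missing(KEYS_OUTPUT, RESERVATION_OUTPUT))
```

A generation of 0 together with both "NO" messages is consistent with no node ever having registered a key, which matches the behaviour described in this report.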
Related branches
- Christian Ehrhardt (community): Approve
- Canonical Server packageset reviewers: Pending requested
- Canonical Server: Pending requested
Diff: 388 lines (+348/-0), 6 files modified
debian/changelog (+14/-0)
debian/patches/lp1866119-Fix-attrd-ensure-node-name-is-broadcast.patch (+116/-0)
debian/patches/lp1866119-Fix-crmd-avoid-double-free.patch (+27/-0)
debian/patches/lp1866119-Fix-pengine-unfence-before-probing.patch (+107/-0)
debian/patches/lp1866119-Refactor-pengine-functionize.patch (+80/-0)
debian/patches/series (+4/-0)
Changed in pacemaker (Ubuntu): | |
assignee: | nobody → Rafael David Tinoco (rafaeldtinoco) |
assignee: | Rafael David Tinoco (rafaeldtinoco) → nobody |
status: | New → Fix Released |
Changed in pacemaker (Ubuntu Bionic): | |
status: | New → Confirmed |
description: | updated |
Changed in pacemaker (Ubuntu Bionic): | |
status: | Confirmed → In Progress |
description: | updated |
A PPA can currently be found at: https://launchpad.net/~ubuntu-server-ha/+archive/ubuntu/staging
I'm adjusting the SRU but, meanwhile, that PPA provides a working version for Ubuntu Bionic.