Merge ~lucaskanashiro/ubuntu/+source/pacemaker:lp1896223-focal into ubuntu/+source/pacemaker:ubuntu/focal-devel
Status: | Merged |
---|---|
Approved by: | Lucas Kanashiro |
Approved revision: | aeae35a3b6001c04dab691a33ebac179e6fb4b49 |
Merged at revision: | aeae35a3b6001c04dab691a33ebac179e6fb4b49 |
Proposed branch: | ~lucaskanashiro/ubuntu/+source/pacemaker:lp1896223-focal |
Merge into: | ubuntu/+source/pacemaker:ubuntu/focal-devel |
Diff against target: | 1293 lines (+1199/-0) 15 files modified |
- debian/changelog (+21/-0)
- debian/patches/series (+16/-0)
- debian/patches/ubuntu-2.0.3-demote/lp1896223-01-f1f71b3-Refactor-scheduler-functionize-comparing-on-fail.patch (+181/-0)
- debian/patches/ubuntu-2.0.3-demote/lp1896223-02-ef246ff-Fix-scheduler-disallow-on-fail-stop-for-stop.patch (+53/-0)
- debian/patches/ubuntu-2.0.3-demote/lp1896223-03-8dceba7-Refactor-scheduler-use-more-appropriate-types.patch (+43/-0)
- debian/patches/ubuntu-2.0.3-demote/lp1896223-04-a4d6a20-Low-libpacemaker-don-t-force-stop-when-skipping.patch (+45/-0)
- debian/patches/ubuntu-2.0.3-demote/lp1896223-05-98c3b64-Log-libpacemaker-check-for-re-promotes-specifically.patch (+46/-0)
- debian/patches/ubuntu-2.0.3-demote/lp1896223-06-2f1e2df-Feature-xml-add-on-fail-demote-option-to-resources.patch (+38/-0)
- debian/patches/ubuntu-2.0.3-demote/lp1896223-07-874f75e-Feature-scheduler-new-on-fail-demote-recovery-policy.patch (+355/-0)
- debian/patches/ubuntu-2.0.3-demote/lp1896223-08-7eec572-Build-libcrmcommon-bump-CRM-feature-set.patch (+51/-0)
- debian/patches/ubuntu-2.0.3-demote/lp1896223-09-204961e-Doc-Pacemaker-Explained-document-new-on-fail.patch (+67/-0)
- debian/patches/ubuntu-2.0.3-demote/lp1896223-10-015b5c0-Doc-Pacemaker-Explained-document-no-quorum.patch (+26/-0)
- debian/patches/ubuntu-2.0.3-demote/lp1896223-11-0b68344-Refactor-scheduler-functionize-checking-quorum.patch (+61/-0)
- debian/patches/ubuntu-2.0.3-demote/lp1896223-12-b1ae359-Feature-scheduler-support-demote-choice-for.patch (+163/-0)
- debian/patches/ubuntu-2.0.3-demote/lp1896223-13-d4b9117-Doc-Pacemaker-Explained-correct-on-fail-default.patch (+33/-0)
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Christian Ehrhardt (community) | Approve | ||
Canonical Server | Pending | ||
Review via email: mp+395032@code.launchpad.net |
Commit message
Description of the change
Backport the no-quorum-
https:/
There was a security update in the meantime and I needed to rebase the changes.
I still want to run the regression tests before moving forward but a review for the packaging work is appreciated :)
Lucas Kanashiro (lucaskanashiro) wrote:
You are right, I resolved the conflicts incorrectly while cherry-picking the upstream commits. Rafael's solution is the right one. I am going to redo everything to make sure the content of the patches is the same as in Rafael's patches.
The extra/missing lines in the patch headers are probably caused by the tooling I use to manage patches (the git-buildpackage patch queue). I do not think this is a big deal.
The SRU bug description and the regression tests are the next steps; they are already on my to-do list.
Christian Ehrhardt (paelzer) wrote:
Now things LGTM - thanks.
The need for SRU-template addition is known.
+1 for this once the tests have completed.
Lucas Kanashiro (lucaskanashiro) wrote:
I imported version 2.0.3-3ubuntu4.2, which was released after I started working on this, into this branch. The only change I made was setting the version to 2.0.3-3ubuntu4.3; everything else is the same. So I still consider it approved based on Christian's last comment.
FWIW, the MS folks tested the package and said it works as they expected. I'll be uploading the package.
Lucas Kanashiro (lucaskanashiro) wrote:
Uploaded:
$ git push pkg upload/
Enumerating objects: 29, done.
Counting objects: 100% (29/29), done.
Delta compression using up to 32 threads
Compressing objects: 100% (24/24), done.
Writing objects: 100% (24/24), 19.60 KiB | 1.78 MiB/s, done.
Total 24 (delta 8), reused 0 (delta 0)
To ssh://git.
* [new tag] upload/
$ dput ubuntu ../pacemaker_
Checking signature on .changes
gpg: ../pacemaker_
Checking signature on .dsc
gpg: ../pacemaker_
Uploading to ubuntu (via ftp to upload.ubuntu.com):
Uploading pacemaker_
Uploading pacemaker_
Uploading pacemaker_
Successfully uploaded packages.
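The first patch in the series (lp1896223-01) adds cmp_on_fail() because the action_fail_response enum values are no longer in severity order. As a reading aid only, here is a minimal sketch of the same comparison done with a rank table, assuming the severity numbers given in that patch's "@TODO" comments. The names severity_rank() and cmp_on_fail_by_rank() are hypothetical and are not part of Pacemaker or of this upload; the patch itself uses nested switch statements instead.

```c
/* Enum order copied from the patched include/crm/pengine/common.h:
 * the last three values are out of severity order for API compatibility. */
enum action_fail_response {
    action_fail_ignore,
    action_fail_recover,
    action_fail_migrate,
    action_fail_block,
    action_fail_stop,
    action_fail_standby,
    action_fail_fence,
    action_fail_restart_container,
    action_fail_reset_remote,
    action_fail_demote
};

/* Hypothetical helper: map each value to the severity number that the
 * patch's "@TODO" comments propose for a future renumbering. */
static int
severity_rank(enum action_fail_response value)
{
    switch (value) {
        case action_fail_ignore:            return 10;
        case action_fail_demote:            return 20;
        case action_fail_recover:           return 30;
        case action_fail_reset_remote:      return 40;
        case action_fail_restart_container: return 50;
        case action_fail_migrate:           return 60;
        case action_fail_block:             return 70;
        case action_fail_stop:              return 80;
        case action_fail_standby:           return 90;
        case action_fail_fence:             return 100;
    }
    return 0; /* unknown value: treated as least severe in this sketch */
}

/* Hypothetical comparison with the same return convention as the patch's
 * cmp_on_fail(): negative if second is more severe than first, zero if
 * equal, positive if first is more severe than second. */
static int
cmp_on_fail_by_rank(enum action_fail_response first,
                    enum action_fail_response second)
{
    return severity_rank(first) - severity_rank(second);
}
```

A caller would keep the more severe of two policies with a check such as `if (cmp_on_fail_by_rank(current, candidate) < 0) current = candidate;`, which has the same shape as the cmp_on_fail() call the patch adds in unpack_rsc_op_failure().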
Preview Diff
1 | diff --git a/debian/changelog b/debian/changelog |
2 | index e7ec315..c4ffd80 100644 |
3 | --- a/debian/changelog |
4 | +++ b/debian/changelog |
5 | @@ -1,3 +1,24 @@ |
6 | +pacemaker (2.0.3-3ubuntu4.3) focal; urgency=medium |
7 | + |
8 | + [ Rafael David Tinoco ] |
9 | + * Post 2.0.3 features: on-fail=demote & no-quorum-policy=demote |
10 | + (LP: #1896223). Added debian/patches/ubuntu-2.0.3-demote/*: |
11 | + - lp1896223-01-f1f71b3-Refactor-scheduler-functionize-comparing-on-fail.patch |
12 | + - lp1896223-02-ef246ff-Fix-scheduler-disallow-on-fail-stop-for-stop.patch |
13 | + - lp1896223-03-8dceba7-Refactor-scheduler-use-more-appropriate-types.patch |
14 | + - lp1896223-04-a4d6a20-Low-libpacemaker-don-t-force-stop-when-skipping.patch |
15 | + - lp1896223-05-98c3b64-Log-libpacemaker-check-for-re-promotes-specifically.patch |
16 | + - lp1896223-06-2f1e2df-Feature-xml-add-on-fail-demote-option-to-resources.patch |
17 | + - lp1896223-07-874f75e-Feature-scheduler-new-on-fail-demote-recovery-policy.patch |
18 | + - lp1896223-08-7eec572-Build-libcrmcommon-bump-CRM-feature-set.patch |
19 | + - lp1896223-09-204961e-Doc-Pacemaker-Explained-document-new-on-fail.patch |
20 | + - lp1896223-10-015b5c0-Doc-Pacemaker-Explained-document-no-quorum.patch |
21 | + - lp1896223-11-0b68344-Refactor-scheduler-functionize-checking-quorum.patch |
22 | + - lp1896223-12-b1ae359-Feature-scheduler-support-demote-choice-for.patch |
23 | + - lp1896223-13-d4b9117-Doc-Pacemaker-Explained-correct-on-fail-default.patch |
24 | + |
25 | + -- Lucas Kanashiro <kanashiro@ubuntu.com> Wed, 09 Dec 2020 10:27:00 -0300 |
26 | + |
27 | pacemaker (2.0.3-3ubuntu4.2) focal; urgency=medium |
28 | |
29 | * d/rules: Rebuild with QB_KILL_ATTRIBUTE_SECTION to overcome a problem in |
30 | diff --git a/debian/patches/series b/debian/patches/series |
31 | index c02030b..944751a 100644 |
32 | --- a/debian/patches/series |
33 | +++ b/debian/patches/series |
34 | @@ -32,3 +32,19 @@ CVE-2020-25654-4.patch |
35 | CVE-2020-25654-5.patch |
36 | CVE-2020-25654-6.patch |
37 | CVE-2020-25654-7.patch |
38 | +# |
39 | +# https://bugs.launchpad.net/bugs/1896223 |
40 | +# |
41 | +ubuntu-2.0.3-demote/lp1896223-01-f1f71b3-Refactor-scheduler-functionize-comparing-on-fail.patch |
42 | +ubuntu-2.0.3-demote/lp1896223-02-ef246ff-Fix-scheduler-disallow-on-fail-stop-for-stop.patch |
43 | +ubuntu-2.0.3-demote/lp1896223-03-8dceba7-Refactor-scheduler-use-more-appropriate-types.patch |
44 | +ubuntu-2.0.3-demote/lp1896223-04-a4d6a20-Low-libpacemaker-don-t-force-stop-when-skipping.patch |
45 | +ubuntu-2.0.3-demote/lp1896223-05-98c3b64-Log-libpacemaker-check-for-re-promotes-specifically.patch |
46 | +ubuntu-2.0.3-demote/lp1896223-06-2f1e2df-Feature-xml-add-on-fail-demote-option-to-resources.patch |
47 | +ubuntu-2.0.3-demote/lp1896223-07-874f75e-Feature-scheduler-new-on-fail-demote-recovery-policy.patch |
48 | +ubuntu-2.0.3-demote/lp1896223-08-7eec572-Build-libcrmcommon-bump-CRM-feature-set.patch |
49 | +ubuntu-2.0.3-demote/lp1896223-09-204961e-Doc-Pacemaker-Explained-document-new-on-fail.patch |
50 | +ubuntu-2.0.3-demote/lp1896223-10-015b5c0-Doc-Pacemaker-Explained-document-no-quorum.patch |
51 | +ubuntu-2.0.3-demote/lp1896223-11-0b68344-Refactor-scheduler-functionize-checking-quorum.patch |
52 | +ubuntu-2.0.3-demote/lp1896223-12-b1ae359-Feature-scheduler-support-demote-choice-for.patch |
53 | +ubuntu-2.0.3-demote/lp1896223-13-d4b9117-Doc-Pacemaker-Explained-correct-on-fail-default.patch |
54 | diff --git a/debian/patches/ubuntu-2.0.3-demote/lp1896223-01-f1f71b3-Refactor-scheduler-functionize-comparing-on-fail.patch b/debian/patches/ubuntu-2.0.3-demote/lp1896223-01-f1f71b3-Refactor-scheduler-functionize-comparing-on-fail.patch |
55 | new file mode 100644 |
56 | index 0000000..796101c |
57 | --- /dev/null |
58 | +++ b/debian/patches/ubuntu-2.0.3-demote/lp1896223-01-f1f71b3-Refactor-scheduler-functionize-comparing-on-fail.patch |
59 | @@ -0,0 +1,181 @@ |
60 | +From: Ken Gaillot <kgaillot@redhat.com> |
61 | +Date: Thu, 28 May 2020 08:22:00 -0500 |
62 | +Subject: Refactor: scheduler: functionize comparing on-fail values |
63 | + |
64 | +The action_fail_response enum values used for the "on-fail" operation |
65 | +meta-attribute were initially intended to be in order of severity. |
66 | +However as new values were added, they were added to the end (out of severity |
67 | +order) to preserve API backward compatibility. |
68 | + |
69 | +This resulted in a convoluted comparison of values that will only get worse as |
70 | +more values are added. |
71 | + |
72 | +This commit adds a comparison function to isolate that complexity. |
73 | + |
74 | +Author: Ken Gaillot <kgaillot@redhat.com> |
75 | +Origin: upstream, https://github.com/ClusterLabs/pacemaker/commit/f1f71b3 |
76 | +Bug-Ubuntu: https://bugs.launchpad.net/bugs/1896223 |
77 | +Reviewed-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> |
78 | +Last-Update: 2020-10-05 |
79 | +--- |
80 | + include/crm/pengine/common.h | 32 ++++++++++++------ |
81 | + lib/pengine/unpack.c | 80 +++++++++++++++++++++++++++++++++++++++++--- |
82 | + 2 files changed, 97 insertions(+), 15 deletions(-) |
83 | + |
84 | +diff --git a/include/crm/pengine/common.h b/include/crm/pengine/common.h |
85 | +index e497f9c..450206e 100644 |
86 | +--- a/include/crm/pengine/common.h |
87 | ++++ b/include/crm/pengine/common.h |
88 | +@@ -29,18 +29,29 @@ extern "C" { |
89 | + extern gboolean was_processing_error; |
90 | + extern gboolean was_processing_warning; |
91 | + |
92 | +-/* order is significant here |
93 | +- * items listed in order of accending severeness |
94 | +- * more severe actions take precedent over lower ones |
95 | ++/* The order is (partially) significant here; the values from action_fail_ignore |
96 | ++ * through action_fail_fence are in order of increasing severity. |
97 | ++ * |
98 | ++ * @COMPAT The values should be ordered and numbered per the "TODO" comments |
99 | ++ * below, so all values are in order of severity and there is room for |
100 | ++ * future additions, but that would break API compatibility. |
101 | ++ * @TODO For now, we just use a function to compare the values specially, but |
102 | ++ * at the next compatibility break, we should arrange things properly. |
103 | + */ |
104 | + enum action_fail_response { |
105 | +- action_fail_ignore, |
106 | +- action_fail_recover, |
107 | +- action_fail_migrate, /* recover by moving it somewhere else */ |
108 | +- action_fail_block, |
109 | +- action_fail_stop, |
110 | +- action_fail_standby, |
111 | +- action_fail_fence, |
112 | ++ action_fail_ignore, // @TODO = 10 |
113 | ++ // @TODO action_fail_demote = 20, |
114 | ++ action_fail_recover, // @TODO = 30 |
115 | ++ // @TODO action_fail_reset_remote = 40, |
116 | ++ // @TODO action_fail_restart_container = 50, |
117 | ++ action_fail_migrate, // @TODO = 60 |
118 | ++ action_fail_block, // @TODO = 70 |
119 | ++ action_fail_stop, // @TODO = 80 |
120 | ++ action_fail_standby, // @TODO = 90 |
121 | ++ action_fail_fence, // @TODO = 100 |
122 | ++ |
123 | ++ // @COMPAT Values below here are out of order for API compatibility |
124 | ++ |
125 | + action_fail_restart_container, |
126 | + |
127 | + /* This is reserved for internal use for remote node connection resources. |
128 | +@@ -51,6 +62,7 @@ enum action_fail_response { |
129 | + */ |
130 | + action_fail_reset_remote, |
131 | + |
132 | ++ action_fail_demote, |
133 | + }; |
134 | + |
135 | + /* the "done" action must be the "pre" action +1 */ |
136 | +diff --git a/lib/pengine/unpack.c b/lib/pengine/unpack.c |
137 | +index d337758..514207e 100644 |
138 | +--- a/lib/pengine/unpack.c |
139 | ++++ b/lib/pengine/unpack.c |
140 | +@@ -2724,6 +2724,78 @@ last_change_str(xmlNode *xml_op) |
141 | + return ((when_s && *when_s)? when_s : "unknown time"); |
142 | + } |
143 | + |
144 | ++/*! |
145 | ++ * \internal |
146 | ++ * \brief Compare two on-fail values |
147 | ++ * |
148 | ++ * \param[in] first One on-fail value to compare |
149 | ++ * \param[in] second The other on-fail value to compare |
150 | ++ * |
151 | ++ * \return A negative number if second is more severe than first, zero if they |
152 | ++ * are equal, or a positive number if first is more severe than second. |
153 | ++ * \note This is only needed until the action_fail_response values can be |
154 | ++ * renumbered at the next API compatibility break. |
155 | ++ */ |
156 | ++static int |
157 | ++cmp_on_fail(enum action_fail_response first, enum action_fail_response second) |
158 | ++{ |
159 | ++ switch (first) { |
160 | ++ case action_fail_reset_remote: |
161 | ++ switch (second) { |
162 | ++ case action_fail_ignore: |
163 | ++ case action_fail_recover: |
164 | ++ return 1; |
165 | ++ case action_fail_reset_remote: |
166 | ++ return 0; |
167 | ++ default: |
168 | ++ return -1; |
169 | ++ } |
170 | ++ break; |
171 | ++ |
172 | ++ case action_fail_restart_container: |
173 | ++ switch (second) { |
174 | ++ case action_fail_ignore: |
175 | ++ case action_fail_recover: |
176 | ++ case action_fail_reset_remote: |
177 | ++ return 1; |
178 | ++ case action_fail_restart_container: |
179 | ++ return 0; |
180 | ++ default: |
181 | ++ return -1; |
182 | ++ } |
183 | ++ break; |
184 | ++ |
185 | ++ default: |
186 | ++ break; |
187 | ++ } |
188 | ++ switch (second) { |
189 | ++ case action_fail_reset_remote: |
190 | ++ switch (first) { |
191 | ++ case action_fail_ignore: |
192 | ++ case action_fail_recover: |
193 | ++ return -1; |
194 | ++ default: |
195 | ++ return 1; |
196 | ++ } |
197 | ++ break; |
198 | ++ |
199 | ++ case action_fail_restart_container: |
200 | ++ switch (first) { |
201 | ++ case action_fail_ignore: |
202 | ++ case action_fail_recover: |
203 | ++ case action_fail_reset_remote: |
204 | ++ return -1; |
205 | ++ default: |
206 | ++ return 1; |
207 | ++ } |
208 | ++ break; |
209 | ++ |
210 | ++ default: |
211 | ++ break; |
212 | ++ } |
213 | ++ return first - second; |
214 | ++} |
215 | ++ |
216 | + static void |
217 | + unpack_rsc_op_failure(resource_t * rsc, node_t * node, int rc, xmlNode * xml_op, xmlNode ** last_failure, |
218 | + enum action_fail_response * on_fail, pe_working_set_t * data_set) |
219 | +@@ -2783,10 +2855,7 @@ unpack_rsc_op_failure(resource_t * rsc, node_t * node, int rc, xmlNode * xml_op, |
220 | + } |
221 | + |
222 | + action = custom_action(rsc, strdup(key), task, NULL, TRUE, FALSE, data_set); |
223 | +- if ((action->on_fail <= action_fail_fence && *on_fail < action->on_fail) || |
224 | +- (action->on_fail == action_fail_reset_remote && *on_fail <= action_fail_recover) || |
225 | +- (action->on_fail == action_fail_restart_container && *on_fail <= action_fail_recover) || |
226 | +- (*on_fail == action_fail_restart_container && action->on_fail >= action_fail_migrate)) { |
227 | ++ if (cmp_on_fail(*on_fail, action->on_fail) < 0) { |
228 | + pe_rsc_trace(rsc, "on-fail %s -> %s for %s (%s)", fail2text(*on_fail), |
229 | + fail2text(action->on_fail), action->uuid, key); |
230 | + *on_fail = action->on_fail; |
231 | +@@ -3630,7 +3699,8 @@ unpack_rsc_op(pe_resource_t *rsc, pe_node_t *node, xmlNode *xml_op, |
232 | + |
233 | + record_failed_op(xml_op, node, rsc, data_set); |
234 | + |
235 | +- if (failure_strategy == action_fail_restart_container && *on_fail <= action_fail_recover) { |
236 | ++ if ((failure_strategy == action_fail_restart_container) |
237 | ++ && cmp_on_fail(*on_fail, action_fail_recover) <= 0) { |
238 | + *on_fail = failure_strategy; |
239 | + } |
240 | + |
241 | diff --git a/debian/patches/ubuntu-2.0.3-demote/lp1896223-02-ef246ff-Fix-scheduler-disallow-on-fail-stop-for-stop.patch b/debian/patches/ubuntu-2.0.3-demote/lp1896223-02-ef246ff-Fix-scheduler-disallow-on-fail-stop-for-stop.patch |
242 | new file mode 100644 |
243 | index 0000000..72c7abc |
244 | --- /dev/null |
245 | +++ b/debian/patches/ubuntu-2.0.3-demote/lp1896223-02-ef246ff-Fix-scheduler-disallow-on-fail-stop-for-stop.patch |
246 | @@ -0,0 +1,53 @@ |
247 | +From: Ken Gaillot <kgaillot@redhat.com> |
248 | +Date: Thu, 28 May 2020 08:27:47 -0500 |
249 | +Subject: Fix: scheduler: disallow on-fail=stop for stop operations |
250 | + |
251 | +because it would loop infinitely as long as the stop continued to fail |
252 | + |
253 | + [Backport] |
254 | + |
255 | + This pacemaker version did not use pcmk__config_err() function for |
256 | + configuration warnings. It used crm_config_err() still. |
257 | + |
258 | +Signed-off-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> |
259 | + |
260 | +Author: Ken Gaillot <kgaillot@redhat.com> |
261 | +Origin: backport, https://github.com/ClusterLabs/pacemaker/commit/ef246ff |
262 | +Bug-Ubuntu: https://bugs.launchpad.net/bugs/1896223 |
263 | +Reviewed-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> |
264 | +Last-Update: 2020-10-05 |
265 | +--- |
266 | + lib/pengine/utils.c | 14 ++++++++++++-- |
267 | + 1 file changed, 12 insertions(+), 2 deletions(-) |
268 | + |
269 | +diff --git a/lib/pengine/utils.c b/lib/pengine/utils.c |
270 | +index 3fc072f..ad5a09b 100644 |
271 | +--- a/lib/pengine/utils.c |
272 | ++++ b/lib/pengine/utils.c |
273 | +@@ -666,14 +666,24 @@ custom_action(resource_t * rsc, char *key, const char *task, |
274 | + return action; |
275 | + } |
276 | + |
277 | ++static bool |
278 | ++valid_stop_on_fail(const char *value) |
279 | ++{ |
280 | ++ return safe_str_neq(value, "standby") |
281 | ++ && safe_str_neq(value, "stop"); |
282 | ++} |
283 | ++ |
284 | + static const char * |
285 | + unpack_operation_on_fail(action_t * action) |
286 | + { |
287 | + |
288 | + const char *value = g_hash_table_lookup(action->meta, XML_OP_ATTR_ON_FAIL); |
289 | + |
290 | +- if (safe_str_eq(action->task, CRMD_ACTION_STOP) && safe_str_eq(value, "standby")) { |
291 | +- crm_config_err("on-fail=standby is not allowed for stop actions: %s", action->rsc->id); |
292 | ++ if (safe_str_eq(action->task, CRMD_ACTION_STOP) |
293 | ++ && !valid_stop_on_fail(value)) { |
294 | ++ crm_config_err("Resetting '" XML_OP_ATTR_ON_FAIL "' for %s stop " |
295 | ++ "action to default value because '%s' is not " |
296 | ++ "allowed for stop", action->rsc->id, value); |
297 | + return NULL; |
298 | + } else if (safe_str_eq(action->task, CRMD_ACTION_DEMOTE) && !value) { |
299 | + /* demote on_fail defaults to master monitor value if present */ |
300 | diff --git a/debian/patches/ubuntu-2.0.3-demote/lp1896223-03-8dceba7-Refactor-scheduler-use-more-appropriate-types.patch b/debian/patches/ubuntu-2.0.3-demote/lp1896223-03-8dceba7-Refactor-scheduler-use-more-appropriate-types.patch |
301 | new file mode 100644 |
302 | index 0000000..303d874 |
303 | --- /dev/null |
304 | +++ b/debian/patches/ubuntu-2.0.3-demote/lp1896223-03-8dceba7-Refactor-scheduler-use-more-appropriate-types.patch |
305 | @@ -0,0 +1,43 @@ |
306 | +From: Ken Gaillot <kgaillot@redhat.com> |
307 | +Date: Thu, 28 May 2020 08:50:33 -0500 |
308 | +Subject: Refactor: scheduler: use more appropriate types in a couple places |
309 | + |
310 | +Author: Ken Gaillot <kgaillot@redhat.com> |
311 | +Origin: upstream, https://github.com/ClusterLabs/pacemaker/commit/8dceba7 |
312 | +Bug-Ubuntu: https://bugs.launchpad.net/bugs/1896223 |
313 | +Reviewed-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> |
314 | +Last-Update: 2020-10-05 |
315 | +--- |
316 | + lib/pengine/unpack.c | 5 ++--- |
317 | + 1 file changed, 2 insertions(+), 3 deletions(-) |
318 | + |
319 | +diff --git a/lib/pengine/unpack.c b/lib/pengine/unpack.c |
320 | +index 514207e..8f1ac1b 100644 |
321 | +--- a/lib/pengine/unpack.c |
322 | ++++ b/lib/pengine/unpack.c |
323 | +@@ -2206,7 +2206,7 @@ unpack_lrm_rsc_state(node_t * node, xmlNode * rsc_entry, pe_working_set_t * data |
324 | + xmlNode *rsc_op = NULL; |
325 | + xmlNode *last_failure = NULL; |
326 | + |
327 | +- enum action_fail_response on_fail = FALSE; |
328 | ++ enum action_fail_response on_fail = action_fail_ignore; |
329 | + enum rsc_role_e saved_role = RSC_ROLE_UNKNOWN; |
330 | + |
331 | + crm_trace("[%s] Processing %s on %s", |
332 | +@@ -2237,7 +2237,6 @@ unpack_lrm_rsc_state(node_t * node, xmlNode * rsc_entry, pe_working_set_t * data |
333 | + |
334 | + /* process operations */ |
335 | + saved_role = rsc->role; |
336 | +- on_fail = action_fail_ignore; |
337 | + rsc->role = RSC_ROLE_UNKNOWN; |
338 | + sorted_op_list = g_list_sort(op_list, sort_op_by_callid); |
339 | + |
340 | +@@ -3331,7 +3330,7 @@ int pe__target_rc_from_xml(xmlNode *xml_op) |
341 | + static enum action_fail_response |
342 | + get_action_on_fail(resource_t *rsc, const char *key, const char *task, pe_working_set_t * data_set) |
343 | + { |
344 | +- int result = action_fail_recover; |
345 | ++ enum action_fail_response result = action_fail_recover; |
346 | + action_t *action = custom_action(rsc, strdup(key), task, NULL, TRUE, FALSE, data_set); |
347 | + |
348 | + result = action->on_fail; |
349 | diff --git a/debian/patches/ubuntu-2.0.3-demote/lp1896223-04-a4d6a20-Low-libpacemaker-don-t-force-stop-when-skipping.patch b/debian/patches/ubuntu-2.0.3-demote/lp1896223-04-a4d6a20-Low-libpacemaker-don-t-force-stop-when-skipping.patch |
350 | new file mode 100644 |
351 | index 0000000..baad36a |
352 | --- /dev/null |
353 | +++ b/debian/patches/ubuntu-2.0.3-demote/lp1896223-04-a4d6a20-Low-libpacemaker-don-t-force-stop-when-skipping.patch |
354 | @@ -0,0 +1,45 @@ |
355 | +From: Ken Gaillot <kgaillot@redhat.com> |
356 | +Date: Tue, 2 Jun 2020 12:05:57 -0500 |
357 | +Subject: Low: libpacemaker: don't force stop when skipping reload of failed |
358 | + resource |
359 | + |
360 | +Normal failure recovery will apply, which will stop if needed. |
361 | + |
362 | +(The stop was forced as of 2558d76f.) |
363 | + |
364 | +Author: Ken Gaillot <kgaillot@redhat.com> |
365 | +Origin: upstream, https://github.com/ClusterLabs/pacemaker/commit/a4d6a20 |
366 | +Bug-Ubuntu: https://bugs.launchpad.net/bugs/1896223 |
367 | +Reviewed-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> |
368 | +Last-Update: 2020-10-05 |
369 | +--- |
370 | + lib/pacemaker/pcmk_sched_native.c | 16 +++++++++++++--- |
371 | + 1 file changed, 13 insertions(+), 3 deletions(-) |
372 | + |
373 | +diff --git a/lib/pacemaker/pcmk_sched_native.c b/lib/pacemaker/pcmk_sched_native.c |
374 | +index bbf3eb7..04552c4 100644 |
375 | +--- a/lib/pacemaker/pcmk_sched_native.c |
376 | ++++ b/lib/pacemaker/pcmk_sched_native.c |
377 | +@@ -3270,9 +3270,19 @@ ReloadRsc(resource_t * rsc, node_t *node, pe_working_set_t * data_set) |
378 | + pe_rsc_trace(rsc, "%s: unmanaged", rsc->id); |
379 | + return; |
380 | + |
381 | +- } else if (is_set(rsc->flags, pe_rsc_failed) || is_set(rsc->flags, pe_rsc_start_pending)) { |
382 | +- pe_rsc_trace(rsc, "%s: general resource state: flags=0x%.16llx", rsc->id, rsc->flags); |
383 | +- stop_action(rsc, node, FALSE); /* Force a full restart, overkill? */ |
384 | ++ } else if (is_set(rsc->flags, pe_rsc_failed)) { |
385 | ++ /* We don't need to specify any particular actions here, normal failure |
386 | ++ * recovery will apply. |
387 | ++ */ |
388 | ++ pe_rsc_trace(rsc, "%s: preventing reload because failed", rsc->id); |
389 | ++ return; |
390 | ++ |
391 | ++ } else if (is_set(rsc->flags, pe_rsc_start_pending)) { |
392 | ++ /* If a resource's configuration changed while a start was pending, |
393 | ++ * force a full restart. |
394 | ++ */ |
395 | ++ pe_rsc_trace(rsc, "%s: preventing reload because start pending", rsc->id); |
396 | ++ stop_action(rsc, node, FALSE); |
397 | + return; |
398 | + |
399 | + } else if (node == NULL) { |
400 | diff --git a/debian/patches/ubuntu-2.0.3-demote/lp1896223-05-98c3b64-Log-libpacemaker-check-for-re-promotes-specifically.patch b/debian/patches/ubuntu-2.0.3-demote/lp1896223-05-98c3b64-Log-libpacemaker-check-for-re-promotes-specifically.patch |
401 | new file mode 100644 |
402 | index 0000000..7620a7c |
403 | --- /dev/null |
404 | +++ b/debian/patches/ubuntu-2.0.3-demote/lp1896223-05-98c3b64-Log-libpacemaker-check-for-re-promotes-specifically.patch |
405 | @@ -0,0 +1,46 @@ |
406 | +From: Ken Gaillot <kgaillot@redhat.com> |
407 | +Date: Mon, 13 Apr 2020 12:23:22 -0500 |
408 | +Subject: Log: libpacemaker: check for re-promotes specifically |
409 | + |
410 | +If a promotable clone instance is being demoted and promoted on its current |
411 | +node, without also stopping and starting, it previously would be logged as |
412 | +"Leave" indicating unchanged, because the current and next role are the same. |
413 | + |
414 | +Now, check for this situation specifically, and log it as "Re-promote". |
415 | + |
416 | +Currently, the scheduler is not capable of generating this situation, but |
417 | +upcoming changes will. |
418 | + |
419 | +Author: Ken Gaillot <kgaillot@redhat.com> |
420 | +Origin: upstream, https://github.com/ClusterLabs/pacemaker/commit/98c3b64 |
421 | +Bug-Ubuntu: https://bugs.launchpad.net/bugs/1896223 |
422 | +Reviewed-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> |
423 | +Last-Update: 2020-10-05 |
424 | +--- |
425 | + lib/pacemaker/pcmk_sched_native.c | 12 ++++++++++-- |
426 | + 1 file changed, 10 insertions(+), 2 deletions(-) |
427 | + |
428 | +diff --git a/lib/pacemaker/pcmk_sched_native.c b/lib/pacemaker/pcmk_sched_native.c |
429 | +index 04552c4..7c193fa 100644 |
430 | +--- a/lib/pacemaker/pcmk_sched_native.c |
431 | ++++ b/lib/pacemaker/pcmk_sched_native.c |
432 | +@@ -2466,9 +2466,17 @@ LogActions(resource_t * rsc, pe_working_set_t * data_set, gboolean terminal) |
433 | + } else if (is_set(rsc->flags, pe_rsc_reload)) { |
434 | + LogAction("Reload", rsc, current, next, start, NULL, terminal); |
435 | + |
436 | ++ |
437 | + } else if (start == NULL || is_set(start->flags, pe_action_optional)) { |
438 | +- pe_rsc_info(rsc, "Leave %s\t(%s %s)", rsc->id, role2text(rsc->role), |
439 | +- next->details->uname); |
440 | ++ if ((demote != NULL) && (promote != NULL) |
441 | ++ && is_not_set(demote->flags, pe_action_optional) |
442 | ++ && is_not_set(promote->flags, pe_action_optional)) { |
443 | ++ LogAction("Re-promote", rsc, current, next, promote, demote, |
444 | ++ terminal); |
445 | ++ } else { |
446 | ++ pe_rsc_info(rsc, "Leave %s\t(%s %s)", rsc->id, |
447 | ++ role2text(rsc->role), next->details->uname); |
448 | ++ } |
449 | + |
450 | + } else if (start && is_set(start->flags, pe_action_runnable) == FALSE) { |
451 | + LogAction("Stop", rsc, current, NULL, stop, |
452 | diff --git a/debian/patches/ubuntu-2.0.3-demote/lp1896223-06-2f1e2df-Feature-xml-add-on-fail-demote-option-to-resources.patch b/debian/patches/ubuntu-2.0.3-demote/lp1896223-06-2f1e2df-Feature-xml-add-on-fail-demote-option-to-resources.patch |
453 | new file mode 100644 |
454 | index 0000000..d6aeb97 |
455 | --- /dev/null |
456 | +++ b/debian/patches/ubuntu-2.0.3-demote/lp1896223-06-2f1e2df-Feature-xml-add-on-fail-demote-option-to-resources.patch |
457 | @@ -0,0 +1,38 @@ |
458 | +From: Ken Gaillot <kgaillot@redhat.com> |
459 | +Date: Tue, 26 May 2020 17:50:48 -0500 |
460 | +Subject: Feature: xml: add on-fail="demote" option to resources schema |
461 | + |
462 | +We don't need an XML schema version bump because it was already bumped since |
463 | +the last release, for the rsc_expression/op_expression feature. |
464 | + |
465 | + [Backport] |
466 | + |
467 | + Original patch changes xml/resources-3.4.rng. As we're backporting |
468 | + features to Ubuntu 2.0.4 release, which only defines schema up to |
469 | + xml/resources-3.2.rng, and using a new minor version only, for this |
470 | + new Ubuntu only feature set (3.3.1), this patch adds the feature to |
471 | + the 3.2 resources schema instead of a 3.4. |
472 | + |
473 | +Signed-off-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> |
474 | + |
475 | +Author: Ken Gaillot <kgaillot@redhat.com> |
476 | +Origin: backport, https://github.com/ClusterLabs/pacemaker/commit/2f1e2df |
477 | +Bug-Ubuntu: https://bugs.launchpad.net/bugs/1896223 |
478 | +Reviewed-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> |
479 | +Last-Update: 2020-10-05 |
480 | +--- |
481 | + xml/resources-3.2.rng | 1 + |
482 | + 1 file changed, 1 insertion(+) |
483 | + |
484 | +diff --git a/xml/resources-3.2.rng b/xml/resources-3.2.rng |
485 | +index 44656d6..1930508 100644 |
486 | +--- a/xml/resources-3.2.rng |
487 | ++++ b/xml/resources-3.2.rng |
488 | +@@ -388,6 +388,7 @@ |
489 | + <choice> |
490 | + <value>ignore</value> |
491 | + <value>block</value> |
492 | ++ <value>demote</value> |
493 | + <value>stop</value> |
494 | + <value>restart</value> |
495 | + <value>standby</value> |
496 | diff --git a/debian/patches/ubuntu-2.0.3-demote/lp1896223-07-874f75e-Feature-scheduler-new-on-fail-demote-recovery-policy.patch b/debian/patches/ubuntu-2.0.3-demote/lp1896223-07-874f75e-Feature-scheduler-new-on-fail-demote-recovery-policy.patch |
497 | new file mode 100644 |
498 | index 0000000..ae96daf |
499 | --- /dev/null |
500 | +++ b/debian/patches/ubuntu-2.0.3-demote/lp1896223-07-874f75e-Feature-scheduler-new-on-fail-demote-recovery-policy.patch |
501 | @@ -0,0 +1,355 @@ |
502 | +From: Ken Gaillot <kgaillot@redhat.com> |
503 | +Date: Thu, 28 May 2020 08:29:37 -0500 |
504 | +Subject: Feature: scheduler: new on-fail="demote" recovery policy for |
505 | + promoted resources |
506 | + |
507 | +Author: Ken Gaillot <kgaillot@redhat.com> |
508 | +Origin: upstream, https://github.com/ClusterLabs/pacemaker/commit/874f75e |
509 | +Bug-Ubuntu: https://bugs.launchpad.net/bugs/1896223 |
510 | +Reviewed-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> |
511 | +Last-Update: 2020-10-05 |
512 | +--- |
513 | + include/crm/pengine/pe_types.h | 1 + |
514 | + lib/pacemaker/pcmk_sched_native.c | 25 +++++++++++++++---- |
515 | + lib/pengine/common.c | 3 +++ |
516 | + lib/pengine/unpack.c | 51 ++++++++++++++++++++++++++++++++++++--- |
517 | + lib/pengine/utils.c | 34 ++++++++++++++++++++++---- |
518 | + 5 files changed, 101 insertions(+), 13 deletions(-) |
519 | + |
520 | +diff --git a/include/crm/pengine/pe_types.h b/include/crm/pengine/pe_types.h |
521 | +index 23e1c46..6e5cbcc 100644 |
522 | +--- a/include/crm/pengine/pe_types.h |
523 | ++++ b/include/crm/pengine/pe_types.h |
524 | +@@ -235,6 +235,7 @@ struct pe_node_s { |
525 | + # define pe_rsc_allocating 0x00000200ULL |
526 | + # define pe_rsc_merging 0x00000400ULL |
527 | + |
528 | ++# define pe_rsc_stop 0x00001000ULL |
529 | + # define pe_rsc_reload 0x00002000ULL |
530 | + # define pe_rsc_allow_remote_remotes 0x00004000ULL |
531 | + |
532 | +diff --git a/lib/pacemaker/pcmk_sched_native.c b/lib/pacemaker/pcmk_sched_native.c |
533 | +index 7c193fa..3ce75b8 100644 |
534 | +--- a/lib/pacemaker/pcmk_sched_native.c |
535 | ++++ b/lib/pacemaker/pcmk_sched_native.c |
536 | +@@ -1122,6 +1122,7 @@ native_create_actions(resource_t * rsc, pe_working_set_t * data_set) |
537 | + node_t *chosen = NULL; |
538 | + node_t *current = NULL; |
539 | + gboolean need_stop = FALSE; |
540 | ++ bool need_promote = FALSE; |
541 | + gboolean is_moving = FALSE; |
542 | + gboolean allow_migrate = is_set(rsc->flags, pe_rsc_allow_migrate) ? TRUE : FALSE; |
543 | + |
544 | +@@ -1226,8 +1227,15 @@ native_create_actions(resource_t * rsc, pe_working_set_t * data_set) |
545 | + need_stop = TRUE; |
546 | + |
547 | + } else if (is_set(rsc->flags, pe_rsc_failed)) { |
548 | +- pe_rsc_trace(rsc, "Recovering %s", rsc->id); |
549 | +- need_stop = TRUE; |
550 | ++ if (is_set(rsc->flags, pe_rsc_stop)) { |
551 | ++ need_stop = TRUE; |
552 | ++ pe_rsc_trace(rsc, "Recovering %s", rsc->id); |
553 | ++ } else { |
554 | ++ pe_rsc_trace(rsc, "Recovering %s by demotion", rsc->id); |
555 | ++ if (rsc->next_role == RSC_ROLE_MASTER) { |
556 | ++ need_promote = TRUE; |
557 | ++ } |
558 | ++ } |
559 | + |
560 | + } else if (is_set(rsc->flags, pe_rsc_block)) { |
561 | + pe_rsc_trace(rsc, "Block %s", rsc->id); |
562 | +@@ -1261,10 +1269,16 @@ native_create_actions(resource_t * rsc, pe_working_set_t * data_set) |
563 | + |
564 | + |
565 | + while (rsc->role <= rsc->next_role && role != rsc->role && is_not_set(rsc->flags, pe_rsc_block)) { |
566 | ++ bool required = need_stop; |
567 | ++ |
568 | + next_role = rsc_state_matrix[role][rsc->role]; |
569 | ++ if ((next_role == RSC_ROLE_MASTER) && need_promote) { |
570 | ++ required = true; |
571 | ++ } |
572 | + pe_rsc_trace(rsc, "Up: Executing: %s->%s (%s)%s", role2text(role), role2text(next_role), |
573 | +- rsc->id, need_stop ? " required" : ""); |
574 | +- if (rsc_action_matrix[role][next_role] (rsc, chosen, !need_stop, data_set) == FALSE) { |
575 | ++ rsc->id, (required? " required" : "")); |
576 | ++ if (rsc_action_matrix[role][next_role](rsc, chosen, !required, |
577 | ++ data_set) == FALSE) { |
578 | + break; |
579 | + } |
580 | + role = next_role; |
581 | +@@ -2527,7 +2541,8 @@ LogActions(resource_t * rsc, pe_working_set_t * data_set, gboolean terminal) |
582 | + |
583 | + free(key); |
584 | + |
585 | +- } else if (stop && is_set(rsc->flags, pe_rsc_failed)) { |
586 | ++ } else if (stop && is_set(rsc->flags, pe_rsc_failed) |
587 | ++ && is_set(rsc->flags, pe_rsc_stop)) { |
588 | + /* 'stop' may be NULL if the failure was ignored */ |
589 | + LogAction("Recover", rsc, current, next, stop, start, terminal); |
590 | + STOP_SANITY_ASSERT(__LINE__); |
591 | +diff --git a/lib/pengine/common.c b/lib/pengine/common.c |
592 | +index da39c99..fcd7cf0 100644 |
593 | +--- a/lib/pengine/common.c |
594 | ++++ b/lib/pengine/common.c |
595 | +@@ -198,6 +198,9 @@ fail2text(enum action_fail_response fail) |
596 | + case action_fail_ignore: |
597 | + result = "ignore"; |
598 | + break; |
599 | ++ case action_fail_demote: |
600 | ++ result = "demote"; |
601 | ++ break; |
602 | + case action_fail_block: |
603 | + result = "block"; |
604 | + break; |
605 | +diff --git a/lib/pengine/unpack.c b/lib/pengine/unpack.c |
606 | +index 8f1ac1b..e690c4e 100644 |
607 | +--- a/lib/pengine/unpack.c |
608 | ++++ b/lib/pengine/unpack.c |
609 | +@@ -100,6 +100,7 @@ pe_fence_node(pe_working_set_t * data_set, node_t * node, const char *reason) |
610 | + */ |
611 | + node->details->remote_requires_reset = TRUE; |
612 | + set_bit(rsc->flags, pe_rsc_failed); |
613 | ++ set_bit(rsc->flags, pe_rsc_stop); |
614 | + } |
615 | + } |
616 | + |
617 | +@@ -109,6 +110,7 @@ pe_fence_node(pe_working_set_t * data_set, node_t * node, const char *reason) |
618 | + "and guest resource no longer exists", |
619 | + node->details->uname, reason); |
620 | + set_bit(node->details->remote_rsc->flags, pe_rsc_failed); |
621 | ++ set_bit(node->details->remote_rsc->flags, pe_rsc_stop); |
622 | + |
623 | + } else if (pe__is_remote_node(node)) { |
624 | + resource_t *rsc = node->details->remote_rsc; |
625 | +@@ -1898,6 +1900,7 @@ process_rsc_state(resource_t * rsc, node_t * node, |
626 | + */ |
627 | + if (pe__is_guest_node(node)) { |
628 | + set_bit(rsc->flags, pe_rsc_failed); |
629 | ++ set_bit(rsc->flags, pe_rsc_stop); |
630 | + should_fence = TRUE; |
631 | + |
632 | + } else if (is_set(data_set->flags, pe_flag_stonith_enabled)) { |
633 | +@@ -1940,6 +1943,11 @@ process_rsc_state(resource_t * rsc, node_t * node, |
634 | + /* nothing to do */ |
635 | + break; |
636 | + |
637 | ++ case action_fail_demote: |
638 | ++ set_bit(rsc->flags, pe_rsc_failed); |
639 | ++ demote_action(rsc, node, FALSE); |
640 | ++ break; |
641 | ++ |
642 | + case action_fail_fence: |
643 | + /* treat it as if it is still running |
644 | + * but also mark the node as unclean |
645 | +@@ -1976,12 +1984,14 @@ process_rsc_state(resource_t * rsc, node_t * node, |
646 | + case action_fail_recover: |
647 | + if (rsc->role != RSC_ROLE_STOPPED && rsc->role != RSC_ROLE_UNKNOWN) { |
648 | + set_bit(rsc->flags, pe_rsc_failed); |
649 | ++ set_bit(rsc->flags, pe_rsc_stop); |
650 | + stop_action(rsc, node, FALSE); |
651 | + } |
652 | + break; |
653 | + |
654 | + case action_fail_restart_container: |
655 | + set_bit(rsc->flags, pe_rsc_failed); |
656 | ++ set_bit(rsc->flags, pe_rsc_stop); |
657 | + |
658 | + if (rsc->container && pe_rsc_is_bundled(rsc)) { |
659 | + /* A bundle's remote connection can run on a different node than |
660 | +@@ -2000,6 +2010,7 @@ process_rsc_state(resource_t * rsc, node_t * node, |
661 | + |
662 | + case action_fail_reset_remote: |
663 | + set_bit(rsc->flags, pe_rsc_failed); |
664 | ++ set_bit(rsc->flags, pe_rsc_stop); |
665 | + if (is_set(data_set->flags, pe_flag_stonith_enabled)) { |
666 | + tmpnode = NULL; |
667 | + if (rsc->is_remote_node) { |
668 | +@@ -2054,8 +2065,17 @@ process_rsc_state(resource_t * rsc, node_t * node, |
669 | + } |
670 | + |
671 | + native_add_running(rsc, node, data_set); |
672 | +- if (on_fail != action_fail_ignore) { |
673 | +- set_bit(rsc->flags, pe_rsc_failed); |
674 | ++ switch (on_fail) { |
675 | ++ case action_fail_ignore: |
676 | ++ break; |
677 | ++ case action_fail_demote: |
678 | ++ case action_fail_block: |
679 | ++ set_bit(rsc->flags, pe_rsc_failed); |
680 | ++ break; |
681 | ++ default: |
682 | ++ set_bit(rsc->flags, pe_rsc_failed); |
683 | ++ set_bit(rsc->flags, pe_rsc_stop); |
684 | ++ break; |
685 | + } |
686 | + |
687 | + } else if (rsc->clone_name && strchr(rsc->clone_name, ':') != NULL) { |
688 | +@@ -2549,6 +2569,7 @@ unpack_migrate_to_success(pe_resource_t *rsc, pe_node_t *node, xmlNode *xml_op, |
689 | + } else { |
690 | + /* Consider it failed here - forces a restart, prevents migration */ |
691 | + set_bit(rsc->flags, pe_rsc_failed); |
692 | ++ set_bit(rsc->flags, pe_rsc_stop); |
693 | + clear_bit(rsc->flags, pe_rsc_allow_migrate); |
694 | + } |
695 | + } |
696 | +@@ -2739,9 +2760,21 @@ static int |
697 | + cmp_on_fail(enum action_fail_response first, enum action_fail_response second) |
698 | + { |
699 | + switch (first) { |
700 | ++ case action_fail_demote: |
701 | ++ switch (second) { |
702 | ++ case action_fail_ignore: |
703 | ++ return 1; |
704 | ++ case action_fail_demote: |
705 | ++ return 0; |
706 | ++ default: |
707 | ++ return -1; |
708 | ++ } |
709 | ++ break; |
710 | ++ |
711 | + case action_fail_reset_remote: |
712 | + switch (second) { |
713 | + case action_fail_ignore: |
714 | ++ case action_fail_demote: |
715 | + case action_fail_recover: |
716 | + return 1; |
717 | + case action_fail_reset_remote: |
718 | +@@ -2754,6 +2787,7 @@ cmp_on_fail(enum action_fail_response first, enum action_fail_response second) |
719 | + case action_fail_restart_container: |
720 | + switch (second) { |
721 | + case action_fail_ignore: |
722 | ++ case action_fail_demote: |
723 | + case action_fail_recover: |
724 | + case action_fail_reset_remote: |
725 | + return 1; |
726 | +@@ -2768,9 +2802,13 @@ cmp_on_fail(enum action_fail_response first, enum action_fail_response second) |
727 | + break; |
728 | + } |
729 | + switch (second) { |
730 | ++ case action_fail_demote: |
731 | ++ return (first == action_fail_ignore)? -1 : 1; |
732 | ++ |
733 | + case action_fail_reset_remote: |
734 | + switch (first) { |
735 | + case action_fail_ignore: |
736 | ++ case action_fail_demote: |
737 | + case action_fail_recover: |
738 | + return -1; |
739 | + default: |
740 | +@@ -2781,6 +2819,7 @@ cmp_on_fail(enum action_fail_response first, enum action_fail_response second) |
741 | + case action_fail_restart_container: |
742 | + switch (first) { |
743 | + case action_fail_ignore: |
744 | ++ case action_fail_demote: |
745 | + case action_fail_recover: |
746 | + case action_fail_reset_remote: |
747 | + return -1; |
748 | +@@ -3381,7 +3420,11 @@ update_resource_state(resource_t * rsc, node_t * node, xmlNode * xml_op, const c |
749 | + clear_past_failure = TRUE; |
750 | + |
751 | + } else if (safe_str_eq(task, CRMD_ACTION_DEMOTE)) { |
752 | +- /* Demote from Master does not clear an error */ |
753 | ++ |
754 | ++ if (*on_fail == action_fail_demote) { |
755 | ++ // Demote clears an error only if on-fail=demote |
756 | ++ clear_past_failure = TRUE; |
757 | ++ } |
758 | + rsc->role = RSC_ROLE_SLAVE; |
759 | + |
760 | + } else if (safe_str_eq(task, CRMD_ACTION_MIGRATED)) { |
761 | +@@ -3409,6 +3452,7 @@ update_resource_state(resource_t * rsc, node_t * node, xmlNode * xml_op, const c |
762 | + |
763 | + case action_fail_block: |
764 | + case action_fail_ignore: |
765 | ++ case action_fail_demote: |
766 | + case action_fail_recover: |
767 | + case action_fail_restart_container: |
768 | + *on_fail = action_fail_ignore; |
769 | +@@ -3669,6 +3713,7 @@ unpack_rsc_op(pe_resource_t *rsc, pe_node_t *node, xmlNode *xml_op, |
770 | + * that, ensure the remote connection is considered failed. |
771 | + */ |
772 | + set_bit(node->details->remote_rsc->flags, pe_rsc_failed); |
773 | ++ set_bit(node->details->remote_rsc->flags, pe_rsc_stop); |
774 | + } |
775 | + |
776 | + // fall through |
777 | +diff --git a/lib/pengine/utils.c b/lib/pengine/utils.c |
778 | +index ad5a09b..e57d858 100644 |
779 | +--- a/lib/pengine/utils.c |
780 | ++++ b/lib/pengine/utils.c |
781 | +@@ -670,6 +670,7 @@ static bool |
782 | + valid_stop_on_fail(const char *value) |
783 | + { |
784 | + return safe_str_neq(value, "standby") |
785 | ++ && safe_str_neq(value, "demote") |
786 | + && safe_str_neq(value, "stop"); |
787 | + } |
788 | + |
789 | +@@ -677,6 +678,11 @@ static const char * |
790 | + unpack_operation_on_fail(action_t * action) |
791 | + { |
792 | + |
793 | ++ const char *name = NULL; |
794 | ++ const char *role = NULL; |
795 | ++ const char *on_fail = NULL; |
796 | ++ const char *interval_spec = NULL; |
797 | ++ const char *enabled = NULL; |
798 | + const char *value = g_hash_table_lookup(action->meta, XML_OP_ATTR_ON_FAIL); |
799 | + |
800 | + if (safe_str_eq(action->task, CRMD_ACTION_STOP) |
801 | +@@ -685,14 +691,10 @@ unpack_operation_on_fail(action_t * action) |
802 | + "action to default value because '%s' is not " |
803 | + "allowed for stop", action->rsc->id, value); |
804 | + return NULL; |
805 | ++ |
806 | + } else if (safe_str_eq(action->task, CRMD_ACTION_DEMOTE) && !value) { |
807 | + /* demote on_fail defaults to master monitor value if present */ |
808 | + xmlNode *operation = NULL; |
809 | +- const char *name = NULL; |
810 | +- const char *role = NULL; |
811 | +- const char *on_fail = NULL; |
812 | +- const char *interval_spec = NULL; |
813 | +- const char *enabled = NULL; |
814 | + |
815 | + CRM_CHECK(action->rsc != NULL, return NULL); |
816 | + |
817 | +@@ -715,10 +717,28 @@ unpack_operation_on_fail(action_t * action) |
818 | + continue; |
819 | + } else if (crm_parse_interval_spec(interval_spec) == 0) { |
820 | + continue; |
821 | ++ } else if (safe_str_eq(on_fail, "demote")) { |
822 | ++ continue; |
823 | + } |
824 | + |
825 | + value = on_fail; |
826 | + } |
827 | ++ } else if (safe_str_eq(value, "demote")) { |
828 | ++ name = crm_element_value(action->op_entry, "name"); |
829 | ++ role = crm_element_value(action->op_entry, "role"); |
830 | ++ on_fail = crm_element_value(action->op_entry, XML_OP_ATTR_ON_FAIL); |
831 | ++ interval_spec = crm_element_value(action->op_entry, |
832 | ++ XML_LRM_ATTR_INTERVAL); |
833 | ++ |
834 | ++ if (safe_str_neq(name, CRMD_ACTION_PROMOTE) |
835 | ++ && (safe_str_neq(name, CRMD_ACTION_STATUS) |
836 | ++ || safe_str_neq(role, "Master") |
837 | ++ || (crm_parse_interval_spec(interval_spec) == 0))) { |
838 | ++ crm_config_err("Resetting '" XML_OP_ATTR_ON_FAIL "' for %s %s " |
839 | ++ "action to default value because 'demote' is not " |
840 | ++ "allowed for it", action->rsc->id, name); |
841 | ++ return NULL; |
842 | ++ } |
843 | + } |
844 | + |
845 | + return value; |
846 | +@@ -1097,6 +1117,10 @@ unpack_operation(action_t * action, xmlNode * xml_obj, resource_t * container, |
847 | + value = NULL; |
848 | + } |
849 | + |
850 | ++ } else if (safe_str_eq(value, "demote")) { |
851 | ++ action->on_fail = action_fail_demote; |
852 | ++ value = "demote instance"; |
853 | ++ |
854 | + } else { |
855 | + pe_err("Resource %s: Unknown failure type (%s)", action->rsc->id, value); |
856 | + value = NULL; |
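Review note: the validity rule that this patch adds to `unpack_operation_on_fail()` can be read in isolation as the small predicate below. This is only a sketch under stated assumptions: `demote_on_fail_allowed()` and `demote_on_fail_examples()` are hypothetical names that do not exist in Pacemaker, plain strings stand in for the `CRMD_ACTION_PROMOTE` and `CRMD_ACTION_STATUS` constants, and the interval is assumed to be already parsed into milliseconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the check added to unpack_operation_on_fail():
 * on-fail="demote" is accepted only for promote actions, and for recurring
 * (nonzero-interval) monitors configured with role "Master".
 */
static bool
demote_on_fail_allowed(const char *name, const char *role,
                       unsigned int interval_ms)
{
    if (name == NULL) {
        return false;
    }
    if (strcmp(name, "promote") == 0) {
        return true;
    }
    return (strcmp(name, "monitor") == 0)
           && (role != NULL) && (strcmp(role, "Master") == 0)
           && (interval_ms > 0);
}

/* Usage checks mirroring the cases described in the patch */
void
demote_on_fail_examples(void)
{
    assert(demote_on_fail_allowed("promote", NULL, 0));
    assert(demote_on_fail_allowed("monitor", "Master", 10000));
    assert(!demote_on_fail_allowed("monitor", "Master", 0)); /* probe */
}
```

In the patch itself, any action that fails this test has its `on-fail` reset to the default and a configuration error is logged.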
857 | diff --git a/debian/patches/ubuntu-2.0.3-demote/lp1896223-08-7eec572-Build-libcrmcommon-bump-CRM-feature-set.patch b/debian/patches/ubuntu-2.0.3-demote/lp1896223-08-7eec572-Build-libcrmcommon-bump-CRM-feature-set.patch |
858 | new file mode 100644 |
859 | index 0000000..95033ed |
860 | --- /dev/null |
861 | +++ b/debian/patches/ubuntu-2.0.3-demote/lp1896223-08-7eec572-Build-libcrmcommon-bump-CRM-feature-set.patch |
862 | @@ -0,0 +1,51 @@ |
863 | +From: Ken Gaillot <kgaillot@redhat.com> |
864 | +Date: Fri, 5 Jun 2020 10:02:05 -0500 |
865 | +Subject: Build: libcrmcommon: bump CRM feature set |
866 | + |
867 | +... for op_expression/rsc_expression rules, on-fail=demote, and |
868 | +no-quorum-policy=demote |
869 | + |
870 | + [Backport] |
871 | + |
872 | + The features op_expression/rsc_expression are not included in this |
873 | + Ubuntu pacemaker release. The features being backported are only |
874 | + on-fail/no-quorum-policy=demote. For this reason, instead of using the |
875 | + upstream feature set version (3.4.0), I'm using Ubuntu's own feature set |
876 | + version 3.2.1 (using a minor-minor version). |
877 | + |
878 | + Note: I have done the same thing in Pacemaker for Ubuntu Groovy, but, |
879 | + because that pacemaker had feature set version 3.3.0, I used version |
880 | + 3.3.1. |
881 | + |
882 | + There is no problem in having both 3.2.1 and 3.3.1 support the |
883 | + feature, as the minor-minor version exists for exactly that purpose: |
884 | + backporting features to distribution versions of pacemaker. When a |
885 | + cluster is upgraded from Focal (3.2.1) to Groovy (3.3.1), the feature |
886 | + will exist there as well. |
887 | + |
888 | +Signed-off-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> |
889 | + |
890 | +Author: Ken Gaillot <kgaillot@redhat.com> |
891 | +Origin: backport, https://github.com/ClusterLabs/pacemaker/commit/7eec572 |
892 | +Bug-Ubuntu: https://bugs.launchpad.net/bugs/1896223 |
893 | +Reviewed-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> |
894 | +Last-Update: 2020-10-05 |
895 | +--- |
896 | + include/crm/crm.h | 3 ++- |
897 | + 1 file changed, 2 insertions(+), 1 deletion(-) |
898 | + |
899 | +diff --git a/include/crm/crm.h b/include/crm/crm.h |
900 | +index cbf72d3..35928bb 100644 |
901 | +--- a/include/crm/crm.h |
902 | ++++ b/include/crm/crm.h |
903 | +@@ -50,8 +50,9 @@ extern "C" { |
904 | + * XML v2 patchsets are created by default |
905 | + * >=3.0.13: Fail counts include operation name and interval |
906 | + * >=3.2.0: DC supports PCMK_LRM_OP_INVALID and PCMK_LRM_OP_NOT_CONNECTED |
907 | ++ * >=3.2.1: UBUNTU: on-fail=demote + no-quorum-policy=demote (3.4.0 backport) |
908 | + */ |
909 | +-# define CRM_FEATURE_SET "3.2.0" |
910 | ++# define CRM_FEATURE_SET "3.2.1" |
911 | + |
912 | + # define EOS '\0' |
913 | + # define DIMOF(a) ((int) (sizeof(a)/sizeof(a[0])) ) |
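Review note: the backport rationale relies on feature set strings ordering as 3.2.1 < 3.3.1 < 3.4.0. The sketch below shows one way such dotted versions can be compared component by component. It is an illustration only and does not reproduce Pacemaker's own comparison code; `compare_feature_set()` and `feature_set_examples()` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: compare dotted numeric versions such as "3.2.1".
 * Returns -1, 0 or 1. Missing trailing components are treated as 0.
 */
static int
compare_feature_set(const char *a, const char *b)
{
    while ((*a != '\0') || (*b != '\0')) {
        char *end_a = NULL;
        char *end_b = NULL;
        long na = strtol(a, &end_a, 10);
        long nb = strtol(b, &end_b, 10);

        if (na != nb) {
            return (na < nb)? -1 : 1;
        }
        if ((end_a == a) && (end_b == b)) {
            break; /* neither side has digits left to parse; stop here */
        }
        a = (*end_a == '.')? end_a + 1 : end_a;
        b = (*end_b == '.')? end_b + 1 : end_b;
    }
    return 0;
}

/* Usage checks for the versions mentioned in the backport note */
void
feature_set_examples(void)
{
    assert(compare_feature_set("3.2.1", "3.2.0") > 0); /* Focal, after bump */
    assert(compare_feature_set("3.2.1", "3.3.1") < 0); /* Focal vs. Groovy */
    assert(compare_feature_set("3.3.1", "3.4.0") < 0); /* Groovy vs. upstream */
}
```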
914 | diff --git a/debian/patches/ubuntu-2.0.3-demote/lp1896223-09-204961e-Doc-Pacemaker-Explained-document-new-on-fail.patch b/debian/patches/ubuntu-2.0.3-demote/lp1896223-09-204961e-Doc-Pacemaker-Explained-document-new-on-fail.patch |
915 | new file mode 100644 |
916 | index 0000000..d993858 |
917 | --- /dev/null |
918 | +++ b/debian/patches/ubuntu-2.0.3-demote/lp1896223-09-204961e-Doc-Pacemaker-Explained-document-new-on-fail.patch |
919 | @@ -0,0 +1,67 @@ |
920 | +From: Ken Gaillot <kgaillot@redhat.com> |
921 | +Date: Tue, 26 May 2020 18:04:32 -0500 |
922 | +Subject: Doc: Pacemaker Explained: document new on-fail="demote" option |
923 | + |
924 | +Author: Ken Gaillot <kgaillot@redhat.com> |
925 | +Origin: upstream, https://github.com/ClusterLabs/pacemaker/commit/204961e |
926 | +Bug-Ubuntu: https://bugs.launchpad.net/bugs/1896223 |
927 | +Reviewed-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> |
928 | +Last-Update: 2020-10-05 |
929 | +--- |
930 | + doc/Pacemaker_Explained/en-US/Ch-Resources.txt | 36 ++++++++++++++++++++++++++ |
931 | + 1 file changed, 36 insertions(+) |
932 | + |
933 | +diff --git a/doc/Pacemaker_Explained/en-US/Ch-Resources.txt b/doc/Pacemaker_Explained/en-US/Ch-Resources.txt |
934 | +index d8e7115..9df9243 100644 |
935 | +--- a/doc/Pacemaker_Explained/en-US/Ch-Resources.txt |
936 | ++++ b/doc/Pacemaker_Explained/en-US/Ch-Resources.txt |
937 | +@@ -676,6 +676,10 @@ a|The action to take if this action ever fails. Allowed values: |
938 | + * +ignore:+ Pretend the resource did not fail. |
939 | + * +block:+ Don't perform any further operations on the resource. |
940 | + * +stop:+ Stop the resource and do not start it elsewhere. |
941 | ++* +demote:+ Demote the resource, without a full restart. This is valid only for |
942 | ++ +promote+ actions, and for +monitor+ actions with both a nonzero +interval+ |
943 | ++ and +role+ set to +Master+; for any other action, a configuration error will |
944 | ++ be logged, and the default behavior will be used. |
945 | + * +restart:+ Stop the resource and start it again (possibly on a different node). |
946 | + * +fence:+ STONITH the node on which the resource failed. |
947 | + * +standby:+ Move _all_ resources away from the node on which the resource failed. |
948 | +@@ -714,6 +718,38 @@ indexterm:[Action,Property,on-fail] |
949 | + |
950 | + |========================================================= |
951 | + |
952 | ++[NOTE] |
953 | ++==== |
954 | ++When +on-fail+ is set to +demote+, recovery from failure by a successful demote |
955 | ++causes the cluster to recalculate whether and where a new instance should be |
956 | ++promoted. The node with the failure is eligible, so if master scores have not |
957 | ++changed, it will be promoted again. |
958 | ++ |
959 | ++There is no direct equivalent of +migration-threshold+ for the master role, but |
960 | ++the same effect can be achieved with a location constraint using a |
961 | ++<<ch-rules,rule>> with a node attribute expression for the resource's fail |
962 | ++count. |
963 | ++ |
964 | ++For example, to immediately ban the master role from a node with any failed |
965 | ++promote or master monitor: |
966 | ++[source,XML] |
967 | ++---- |
968 | ++<rsc_location id="loc1" rsc="my_primitive"> |
969 | ++ <rule id="rule1" score="-INFINITY" role="Master" boolean-op="or"> |
970 | ++ <expression id="expr1" attribute="fail-count-my_primitive#promote_0" |
971 | ++ operation="gte" value="1"/> |
972 | ++ <expression id="expr2" attribute="fail-count-my_primitive#monitor_10000" |
973 | ++ operation="gte" value="1"/> |
974 | ++ </rule> |
975 | ++</rsc_location> |
976 | ++---- |
977 | ++ |
978 | ++This example assumes that there is a promotable clone of the +my_primitive+ |
979 | ++resource (note that the primitive name, not the clone name, is used in the |
980 | ++rule), and that there is a recurring 10-second-interval monitor configured for |
981 | ++the master role (fail count attributes specify the interval in milliseconds). |
982 | ++==== |
983 | ++ |
984 | + [[s-resource-monitoring]] |
985 | + === Monitoring Resources for Failure === |
986 | + |
987 | diff --git a/debian/patches/ubuntu-2.0.3-demote/lp1896223-10-015b5c0-Doc-Pacemaker-Explained-document-no-quorum.patch b/debian/patches/ubuntu-2.0.3-demote/lp1896223-10-015b5c0-Doc-Pacemaker-Explained-document-no-quorum.patch |
988 | new file mode 100644 |
989 | index 0000000..2573479 |
990 | --- /dev/null |
991 | +++ b/debian/patches/ubuntu-2.0.3-demote/lp1896223-10-015b5c0-Doc-Pacemaker-Explained-document-no-quorum.patch |
992 | @@ -0,0 +1,26 @@ |
993 | +From: Ken Gaillot <kgaillot@redhat.com> |
994 | +Date: Thu, 28 May 2020 12:13:20 -0500 |
995 | +Subject: Doc: Pacemaker Explained: document no-quorum-policy=demote |
996 | + |
997 | +Author: Ken Gaillot <kgaillot@redhat.com> |
998 | +Origin: upstream, https://github.com/ClusterLabs/pacemaker/commit/015b5c0 |
999 | +Bug-Ubuntu: https://bugs.launchpad.net/bugs/1896223 |
1000 | +Reviewed-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> |
1001 | +Last-Update: 2020-10-05 |
1002 | +--- |
1003 | + doc/Pacemaker_Explained/en-US/Ch-Options.txt | 2 ++ |
1004 | + 1 file changed, 2 insertions(+) |
1005 | + |
1006 | +diff --git a/doc/Pacemaker_Explained/en-US/Ch-Options.txt b/doc/Pacemaker_Explained/en-US/Ch-Options.txt |
1007 | +index f864987..d344ecd 100644 |
1008 | +--- a/doc/Pacemaker_Explained/en-US/Ch-Options.txt |
1009 | ++++ b/doc/Pacemaker_Explained/en-US/Ch-Options.txt |
1010 | +@@ -181,6 +181,8 @@ What to do when the cluster does not have quorum. Allowed values: |
1011 | + * +ignore:+ continue all resource management |
1012 | + * +freeze:+ continue resource management, but don't recover resources from nodes not in the affected partition |
1013 | + * +stop:+ stop all resources in the affected cluster partition |
1014 | ++* +demote:+ demote promotable resources and stop all other resources in the |
1015 | ++ affected cluster partition |
1016 | + * +suicide:+ fence all nodes in the affected cluster partition |
1017 | + |
1018 | + | batch-limit | 0 | |
1019 | diff --git a/debian/patches/ubuntu-2.0.3-demote/lp1896223-11-0b68344-Refactor-scheduler-functionize-checking-quorum.patch b/debian/patches/ubuntu-2.0.3-demote/lp1896223-11-0b68344-Refactor-scheduler-functionize-checking-quorum.patch |
1020 | new file mode 100644 |
1021 | index 0000000..ac2e23d |
1022 | --- /dev/null |
1023 | +++ b/debian/patches/ubuntu-2.0.3-demote/lp1896223-11-0b68344-Refactor-scheduler-functionize-checking-quorum.patch |
1024 | @@ -0,0 +1,61 @@ |
1025 | +From: Ken Gaillot <kgaillot@redhat.com> |
1026 | +Date: Tue, 2 Jun 2020 15:05:56 -0500 |
1027 | +Subject: Refactor: scheduler: functionize checking quorum policy in effect |
1028 | + |
1029 | +... for readability and ease of future changes |
1030 | + |
1031 | +Author: Ken Gaillot <kgaillot@redhat.com> |
1032 | +Origin: upstream, https://github.com/ClusterLabs/pacemaker/commit/0b68344 |
1033 | +Bug-Ubuntu: https://bugs.launchpad.net/bugs/1896223 |
1034 | +Reviewed-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> |
1035 | +Last-Update: 2020-10-05 |
1036 | +--- |
1037 | + lib/pengine/utils.c | 18 ++++++++++++++---- |
1038 | + 1 file changed, 14 insertions(+), 4 deletions(-) |
1039 | + |
1040 | +diff --git a/lib/pengine/utils.c b/lib/pengine/utils.c |
1041 | +index e57d858..b842481 100644 |
1042 | +--- a/lib/pengine/utils.c |
1043 | ++++ b/lib/pengine/utils.c |
1044 | +@@ -451,6 +451,17 @@ sort_rsc_priority(gconstpointer a, gconstpointer b) |
1045 | + return 0; |
1046 | + } |
1047 | + |
1048 | ++static enum pe_quorum_policy |
1049 | ++effective_quorum_policy(pe_resource_t *rsc, pe_working_set_t *data_set) |
1050 | ++{ |
1051 | ++ enum pe_quorum_policy policy = data_set->no_quorum_policy; |
1052 | ++ |
1053 | ++ if (is_set(data_set->flags, pe_flag_have_quorum)) { |
1054 | ++ policy = no_quorum_ignore; |
1055 | ++ } |
1056 | ++ return policy; |
1057 | ++} |
1058 | ++ |
1059 | + action_t * |
1060 | + custom_action(resource_t * rsc, char *key, const char *task, |
1061 | + node_t * on_node, gboolean optional, gboolean save_action, |
1062 | +@@ -554,6 +565,7 @@ custom_action(resource_t * rsc, char *key, const char *task, |
1063 | + |
1064 | + if (rsc != NULL) { |
1065 | + enum action_tasks a_task = text2task(action->task); |
1066 | ++ enum pe_quorum_policy quorum_policy = effective_quorum_policy(rsc, data_set); |
1067 | + int warn_level = LOG_TRACE; |
1068 | + |
1069 | + if (save_action) { |
1070 | +@@ -625,13 +637,11 @@ custom_action(resource_t * rsc, char *key, const char *task, |
1071 | + crm_trace("Action %s requires only stonith", action->uuid); |
1072 | + action->runnable = TRUE; |
1073 | + #endif |
1074 | +- } else if (is_set(data_set->flags, pe_flag_have_quorum) == FALSE |
1075 | +- && data_set->no_quorum_policy == no_quorum_stop) { |
1076 | ++ } else if (quorum_policy == no_quorum_stop) { |
1077 | + pe_action_set_flag_reason(__FUNCTION__, __LINE__, action, NULL, "no quorum", pe_action_runnable, TRUE); |
1078 | + crm_debug("%s\t%s (cancelled : quorum)", action->node->details->uname, action->uuid); |
1079 | + |
1080 | +- } else if (is_set(data_set->flags, pe_flag_have_quorum) == FALSE |
1081 | +- && data_set->no_quorum_policy == no_quorum_freeze) { |
1082 | ++ } else if (quorum_policy == no_quorum_freeze) { |
1083 | + pe_rsc_trace(rsc, "Check resource is already active: %s %s %s %s", rsc->id, action->uuid, role2text(rsc->next_role), role2text(rsc->role)); |
1084 | + if (rsc->fns->active(rsc, TRUE) == FALSE || rsc->next_role > rsc->role) { |
1085 | + pe_action_set_flag_reason(__FUNCTION__, __LINE__, action, NULL, "quorum freeze", pe_action_runnable, TRUE); |
1086 | diff --git a/debian/patches/ubuntu-2.0.3-demote/lp1896223-12-b1ae359-Feature-scheduler-support-demote-choice-for.patch b/debian/patches/ubuntu-2.0.3-demote/lp1896223-12-b1ae359-Feature-scheduler-support-demote-choice-for.patch |
1087 | new file mode 100644 |
1088 | index 0000000..c7fc280 |
1089 | --- /dev/null |
1090 | +++ b/debian/patches/ubuntu-2.0.3-demote/lp1896223-12-b1ae359-Feature-scheduler-support-demote-choice-for.patch |
1091 | @@ -0,0 +1,163 @@ |
1092 | +From: Ken Gaillot <kgaillot@redhat.com> |
1093 | +Date: Tue, 2 Jun 2020 15:06:32 -0500 |
1094 | +Subject: Feature: scheduler: support "demote" choice for no-quorum-policy |
1095 | + option |
1096 | + |
1097 | +If quorum is lost, promotable resources in the master role will be demoted but |
1098 | +left running, and all other resources will be stopped. |
1099 | + |
1100 | + [Backport] |
1101 | + |
1102 | + The existing pacemaker version did not yet have common/options.c, |
1103 | + and did not have a detailed pengine/pe_output.c text output function. |
1104 | + I have changed the html and xml output functions (as in the original |
1105 | + patch) and kept the same changes everywhere else. |
1106 | + |
1107 | +Signed-off-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> |
1108 | + |
1109 | +Author: Ken Gaillot <kgaillot@redhat.com> |
1110 | +Origin: backport, https://github.com/ClusterLabs/pacemaker/commit/b1ae359 |
1111 | +Bug-Ubuntu: https://bugs.launchpad.net/bugs/1896223 |
1112 | +Reviewed-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> |
1113 | +Last-Update: 2020-10-05 |
1114 | +--- |
1115 | + daemons/controld/controld_control.c | 2 +- |
1116 | + include/crm/pengine/pe_types.h | 3 ++- |
1117 | + lib/common/utils.c | 3 +++ |
1118 | + lib/pengine/common.c | 2 +- |
1119 | + lib/pengine/unpack.c | 7 +++++++ |
1120 | + lib/pengine/utils.c | 14 ++++++++++++++ |
1121 | + tools/crm_mon_output.c | 9 +++++++++ |
1122 | + 7 files changed, 37 insertions(+), 3 deletions(-) |
1123 | + |
1124 | +diff --git a/daemons/controld/controld_control.c b/daemons/controld/controld_control.c |
1125 | +index 6c7f97c..132b059 100644 |
1126 | +--- a/daemons/controld/controld_control.c |
1127 | ++++ b/daemons/controld/controld_control.c |
1128 | +@@ -587,7 +587,7 @@ static pe_cluster_option crmd_opts[] = { |
1129 | + { "stonith-max-attempts",NULL,"integer",NULL,"10",&check_positive_number, |
1130 | + "How many times stonith can fail before it will no longer be attempted on a target" |
1131 | + }, |
1132 | +- { "no-quorum-policy", NULL, "enum", "stop, freeze, ignore, suicide", "stop", &check_quorum, NULL, NULL }, |
1133 | ++ { "no-quorum-policy", NULL, "enum", "stop, freeze, ignore, demote, suicide", "stop", &check_quorum, NULL, NULL }, |
1134 | + }; |
1135 | + /* *INDENT-ON* */ |
1136 | + |
1137 | +diff --git a/include/crm/pengine/pe_types.h b/include/crm/pengine/pe_types.h |
1138 | +index 6e5cbcc..baa9160 100644 |
1139 | +--- a/include/crm/pengine/pe_types.h |
1140 | ++++ b/include/crm/pengine/pe_types.h |
1141 | +@@ -61,7 +61,8 @@ enum pe_quorum_policy { |
1142 | + no_quorum_freeze, |
1143 | + no_quorum_stop, |
1144 | + no_quorum_ignore, |
1145 | +- no_quorum_suicide |
1146 | ++ no_quorum_suicide, |
1147 | ++ no_quorum_demote |
1148 | + }; |
1149 | + |
1150 | + enum node_type { |
1151 | +diff --git a/lib/common/utils.c b/lib/common/utils.c |
1152 | +index cb0bc1f..b114b44 100644 |
1153 | +--- a/lib/common/utils.c |
1154 | ++++ b/lib/common/utils.c |
1155 | +@@ -140,6 +140,9 @@ check_quorum(const char *value) |
1156 | + } else if (safe_str_eq(value, "ignore")) { |
1157 | + return TRUE; |
1158 | + |
1159 | ++ } else if (safe_str_eq(value, "demote")) { |
1160 | ++ return TRUE; |
1161 | ++ |
1162 | + } else if (safe_str_eq(value, "suicide")) { |
1163 | + return TRUE; |
1164 | + } |
1165 | +diff --git a/lib/pengine/common.c b/lib/pengine/common.c |
1166 | +index fcd7cf0..d134d79 100644 |
1167 | +--- a/lib/pengine/common.c |
1168 | ++++ b/lib/pengine/common.c |
1169 | +@@ -75,7 +75,7 @@ check_placement_strategy(const char *value) |
1170 | + /* *INDENT-OFF* */ |
1171 | + static pe_cluster_option pe_opts[] = { |
1172 | + /* name, old-name, validate, default, description */ |
1173 | +- { "no-quorum-policy", NULL, "enum", "stop, freeze, ignore, suicide", "stop", &check_quorum, |
1174 | ++ { "no-quorum-policy", NULL, "enum", "stop, freeze, ignore, demote, suicide", "stop", &check_quorum, |
1175 | + "What to do when the cluster does not have quorum", NULL }, |
1176 | + { "symmetric-cluster", NULL, "boolean", NULL, "true", &check_boolean, |
1177 | + "All resources can run anywhere by default", NULL }, |
1178 | +diff --git a/lib/pengine/unpack.c b/lib/pengine/unpack.c |
1179 | +index e690c4e..3306662 100644 |
1180 | +--- a/lib/pengine/unpack.c |
1181 | ++++ b/lib/pengine/unpack.c |
1182 | +@@ -243,6 +243,9 @@ unpack_config(xmlNode * config, pe_working_set_t * data_set) |
1183 | + } else if (safe_str_eq(value, "freeze")) { |
1184 | + data_set->no_quorum_policy = no_quorum_freeze; |
1185 | + |
1186 | ++ } else if (safe_str_eq(value, "demote")) { |
1187 | ++ data_set->no_quorum_policy = no_quorum_demote; |
1188 | ++ |
1189 | + } else if (safe_str_eq(value, "suicide")) { |
1190 | + if (is_set(data_set->flags, pe_flag_stonith_enabled)) { |
1191 | + int do_panic = 0; |
1192 | +@@ -271,6 +274,10 @@ unpack_config(xmlNode * config, pe_working_set_t * data_set) |
1193 | + case no_quorum_stop: |
1194 | + crm_debug("On loss of quorum: Stop ALL resources"); |
1195 | + break; |
1196 | ++ case no_quorum_demote: |
1197 | ++ crm_debug("On loss of quorum: " |
1198 | ++ "Demote promotable resources and stop other resources"); |
1199 | ++ break; |
1200 | + case no_quorum_suicide: |
1201 | + crm_notice("On loss of quorum: Fence all remaining nodes"); |
1202 | + break; |
1203 | +diff --git a/lib/pengine/utils.c b/lib/pengine/utils.c |
1204 | +index b842481..fd06e9e 100644 |
1205 | +--- a/lib/pengine/utils.c |
1206 | ++++ b/lib/pengine/utils.c |
1207 | +@@ -458,6 +458,20 @@ effective_quorum_policy(pe_resource_t *rsc, pe_working_set_t *data_set) |
1208 | + |
1209 | + if (is_set(data_set->flags, pe_flag_have_quorum)) { |
1210 | + policy = no_quorum_ignore; |
1211 | ++ |
1212 | ++ } else if (data_set->no_quorum_policy == no_quorum_demote) { |
1213 | ++ switch (rsc->role) { |
1214 | ++ case RSC_ROLE_MASTER: |
1215 | ++ case RSC_ROLE_SLAVE: |
1216 | ++ if (rsc->next_role > RSC_ROLE_SLAVE) { |
1217 | ++ rsc->next_role = RSC_ROLE_SLAVE; |
1218 | ++ } |
1219 | ++ policy = no_quorum_ignore; |
1220 | ++ break; |
1221 | ++ default: |
1222 | ++ policy = no_quorum_stop; |
1223 | ++ break; |
1224 | ++ } |
1225 | + } |
1226 | + return policy; |
1227 | + } |
1228 | +diff --git a/tools/crm_mon_output.c b/tools/crm_mon_output.c |
1229 | +index c27aa83..bb6b8b8 100644 |
1230 | +--- a/tools/crm_mon_output.c |
1231 | ++++ b/tools/crm_mon_output.c |
1232 | +@@ -472,6 +472,11 @@ cluster_options_html(pcmk__output_t *out, va_list args) { |
1233 | + out->list_item(out, NULL, "No Quorum policy: Stop ALL resources"); |
1234 | + break; |
1235 | + |
1236 | ++ case no_quorum_demote: |
1237 | ++ out->list_item(out, NULL, "No Quorum policy: Demote promotable " |
1238 | ++ "resources and stop all other resources"); |
1239 | ++ break; |
1240 | ++ |
1241 | + case no_quorum_ignore: |
1242 | + out->list_item(out, NULL, "No Quorum policy: Ignore"); |
1243 | + break; |
1244 | +@@ -526,6 +531,10 @@ cluster_options_xml(pcmk__output_t *out, va_list args) { |
1245 | + xmlSetProp(node, (pcmkXmlStr) "no-quorum-policy", (pcmkXmlStr) "stop"); |
1246 | + break; |
1247 | + |
1248 | ++ case no_quorum_demote: |
1249 | ++ xmlSetProp(node, (pcmkXmlStr) "no-quorum-policy", (pcmkXmlStr) "demote"); |
1250 | ++ break; |
1251 | ++ |
1252 | + case no_quorum_ignore: |
1253 | + xmlSetProp(node, (pcmkXmlStr) "no-quorum-policy", (pcmkXmlStr) "ignore"); |
1254 | + break; |
1255 | diff --git a/debian/patches/ubuntu-2.0.3-demote/lp1896223-13-d4b9117-Doc-Pacemaker-Explained-correct-on-fail-default.patch b/debian/patches/ubuntu-2.0.3-demote/lp1896223-13-d4b9117-Doc-Pacemaker-Explained-correct-on-fail-default.patch |
1256 | new file mode 100644 |
1257 | index 0000000..f70ac52 |
1258 | --- /dev/null |
1259 | +++ b/debian/patches/ubuntu-2.0.3-demote/lp1896223-13-d4b9117-Doc-Pacemaker-Explained-correct-on-fail-default.patch |
1260 | @@ -0,0 +1,33 @@ |
1261 | +From: Ken Gaillot <kgaillot@redhat.com> |
1262 | +Date: Tue, 26 May 2020 18:10:33 -0500 |
1263 | +Subject: Doc: Pacemaker Explained: correct on-fail default |
1264 | + |
1265 | +Author: Ken Gaillot <kgaillot@redhat.com> |
1266 | +Origin: upstream, https://github.com/ClusterLabs/pacemaker/commit/d4b9117 |
1267 | +Bug-Ubuntu: https://bugs.launchpad.net/bugs/1896223 |
1268 | +Reviewed-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> |
1269 | +Last-Update: 2020-10-05 |
1270 | +--- |
1271 | + doc/Pacemaker_Explained/en-US/Ch-Resources.txt | 9 +++++++-- |
1272 | + 1 file changed, 7 insertions(+), 2 deletions(-) |
1273 | + |
1274 | +diff --git a/doc/Pacemaker_Explained/en-US/Ch-Resources.txt b/doc/Pacemaker_Explained/en-US/Ch-Resources.txt |
1275 | +index 9df9243..88892db 100644 |
1276 | +--- a/doc/Pacemaker_Explained/en-US/Ch-Resources.txt |
1277 | ++++ b/doc/Pacemaker_Explained/en-US/Ch-Resources.txt |
1278 | +@@ -669,8 +669,13 @@ XML attributes take precedence over +nvpair+ elements if both are specified. |
1279 | + indexterm:[Action,Property,timeout] |
1280 | + |
1281 | + |on-fail |
1282 | +-|restart '(except for +stop+ operations, which default to' fence 'when |
1283 | +- STONITH is enabled and' block 'otherwise)' |
1284 | ++a|Varies by action: |
1285 | ++ |
1286 | ++* +stop+: +fence+ if +stonith-enabled+ is true or +block+ otherwise |
1287 | ++* +demote+: +on-fail+ of the +monitor+ action with +role+ set to +Master+, if |
1288 | ++ present, enabled, and configured to a value other than +demote+, or +restart+ |
1289 | ++ otherwise |
1290 | ++* all other actions: +restart+ |
1291 | + a|The action to take if this action ever fails. Allowed values: |
1292 | + |
1293 | + * +ignore:+ Pretend the resource did not fail. |
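For reviewers re-digesting patch 12 above, the `no_quorum_demote` branch added to `effective_quorum_policy()` can be restated as a small standalone sketch. This is only an illustration. The enums and the `struct resource` type below are simplified, hypothetical stand-ins rather than Pacemaker's real definitions. The role ordering (slave below master) and the initial value of `policy` are assumptions, since the hunk does not show them.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for Pacemaker's enums; the real
 * definitions live in Pacemaker's headers and contain more values. */
enum quorum_policy { POLICY_IGNORE, POLICY_STOP, POLICY_DEMOTE };

/* Assumed ordering: a promoted (master) role compares greater than slave. */
enum role { ROLE_UNKNOWN, ROLE_STOPPED, ROLE_STARTED, ROLE_SLAVE, ROLE_MASTER };

/* Hypothetical stand-in for the two pe_resource_t fields the hunk touches. */
struct resource {
    enum role role;       /* current role */
    enum role next_role;  /* role the scheduler intends to move to */
};

/* Sketch of the decision in the hunk: with quorum, the policy is ignored;
 * without quorum and with "demote" configured, promotable instances are
 * capped at the slave role and kept running, everything else is stopped. */
static enum quorum_policy
effective_policy(struct resource *rsc, bool have_quorum,
                 enum quorum_policy configured)
{
    /* Assumption: the policy starts out as the configured cluster option. */
    enum quorum_policy policy = configured;

    if (have_quorum) {
        policy = POLICY_IGNORE;

    } else if (configured == POLICY_DEMOTE) {
        switch (rsc->role) {
            case ROLE_MASTER:
            case ROLE_SLAVE:
                if (rsc->next_role > ROLE_SLAVE) {
                    rsc->next_role = ROLE_SLAVE;  /* demote, do not stop */
                }
                policy = POLICY_IGNORE;
                break;
            default:
                policy = POLICY_STOP;
                break;
        }
    }
    return policy;
}
```

In this sketch, a resource currently in the master role without quorum gets `next_role` lowered to slave and is otherwise left alone, while a plain started resource falls back to the stop policy.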
The content of the MP has changed a lot (some patch headers, a whitespace change).
Seems like a re-export from git with other git options after a rebase, but mostly nothing important - yet a lot to re-digest :-)
---
There are also other changes like crm_config_err -> pcmk__config_err, action_t -> pe_action_t, and so on
I re-reviewed these compared to the changes that formerly went in.
The patches still apply and didn't change semantically (mostly variable names and similar context updates).
Well, the code is complex and "applying" isn't everything.
Could be all good, or we have just lost everything we formerly identified during the backport, because the changes closely match the former backport notes.
Let me ask one question as an example - and depending on the outcome I'd ask to regenerate ALL patches with that in mind - or I'm happy as is and we can go on.
File: lp1896223-02-ef246ff-Fix-scheduler-disallow-on-fail-stop-for-stop.patch
Rafael: crm_config_err
Lucas: pcmk__config_err
But Lucas' file still says at the top:
7 [Backport]
8
9 This pacemaker version did not use pcmk__config_err() function for
10 configuration warnings. It used crm_config_err() still.
So either update all the patch headers so that the backport notes match the content,
or, if I've found changes that were made unintentionally, please fix these.
I have checked the example above for pcmk__config_err, but it isn't touched in the patches that went in since the last merge by Rafael. So one of you must be wrong, I guess?
---
One more thing that remains as a todo:
"One bit, on SRUs I usually proof-read the SRU details on the bugs description before we go to the SRU team to be denied. There is nothing there yet, please make sure to add it before uploading."
That I asked on the old MP and I'd still ask for it - yet your SRU experience is good and you won't make a bad bug description (so I don't need to review it) - just keep in mind that you need to be complete before uploading.
---
TL;DR:
- Please explain if the patch differences were intentional (or fix them up if not)
- Please add the SRU description before upload
- Please complete the tests before upload