Merge lp:~chad.smith/landscape-client/ha-manager-skeleton into lp:~landscape/landscape-client/trunk
- ha-manager-skeleton
- Merge into trunk
Status: | Merged |
---|---|
Approved by: | Chad Smith |
Approved revision: | 636 |
Merged at revision: | 628 |
Proposed branch: | lp:~chad.smith/landscape-client/ha-manager-skeleton |
Merge into: | lp:~landscape/landscape-client/trunk |
Diff against target: |
597 lines (+547/-3) 5 files modified
landscape/manager/config.py (+1/-1) landscape/manager/haservice.py (+205/-0) landscape/manager/tests/test_config.py (+2/-1) landscape/manager/tests/test_haservice.py (+331/-0) landscape/message_schemas.py (+8/-1) |
To merge this branch: | bzr merge lp:~chad.smith/landscape-client/ha-manager-skeleton |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Jerry Seutter (community) | Approve | ||
Christopher Armstrong (community) | Approve | ||
Review via email: mp+148593@code.launchpad.net |
Commit message
Initial HA service manager plugin for landscape-client to better enable OpenStack live upgrades. This manager expects generic HA enablement and health scripts (add_to_cluster, remove_from_cluster and a health_checks.d directory) delivered by a charm at /var/lib/juju/units/<unit-name>/charm/.
This plugin only activates upon receipt of a change-ha-service message from landscape-server. It will only take action on haproxy-configured charms that deliver the above-mentioned scripts. Without scripts or a health_checks.d dir, this plugin will log success and continue on with any package maintenance or updates.
Description of the change
Initial HA service manager plugin for landscape-client. Please take a good look over the deferreds and callbacks I'm using to make sure I'm not overusing callbacks.
This is round 1, a skeleton that depends on charm-delivered scripts which allow for local service health checks and setting the HA cluster online or standby. There will likely be iterations on this as the server team (HA charm writers) add functionality to the OpenStack HA charms.
The manager will allow Landscape server to send change-ha-service messages to request service-state: "online" or "standby".
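For concreteness, a representative payload looks like the following (field names match the tests and message schema in the preview diff; the values are illustrative):

```python
# A representative change-ha-service payload, matching the field names used
# by the tests and the CHANGE_HA_SERVICE schema (values are illustrative).
message = {"type": "change-ha-service",
           "service-name": "keystone",   # the juju service
           "unit-name": "keystone-0",    # the specific unit on this machine
           "service-state": "standby",   # "online" or "standby"
           "operation-id": 1}

# The handler rejects any other service-state value.
assert message["service-state"] in (u"online", u"standby")
```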
When change-ha-service requests the "standby" state:
The manager will run the charm's remove_from_cluster script and validate the exit code, returning an operation-result message with SUCCEEDED or FAILED status. If the charm doesn't deliver a remove_from_cluster script, a SUCCEEDED result is returned.
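A minimal sketch of that standby path, using subprocess in place of Twisted's process APIs (the `standby` function and its returned tuples are illustrative, not the plugin's actual interface):

```python
import os
import subprocess

def standby(unit_dir):
    """Run the charm's remove_from_cluster script if present; the exit
    code decides the operation-result status (illustrative sketch)."""
    script = os.path.join(unit_dir, "remove_from_cluster")
    if not os.path.exists(script):
        # No script delivered by the charm: report success anyway.
        return ("SUCCEEDED", "No cluster script; no settings changed.")
    code = subprocess.run([script], env=os.environ).returncode
    if code != 0:
        return ("FAILED",
                "Failed charm script: %s exited with return code %d."
                % (script, code))
    return ("SUCCEEDED", "%s succeeded." % script)
```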
When change-ha-service requests the "online" state:
The manager will first run and validate any charm health check scripts delivered at /var/lib/juju/units/<unit-name>/charm/health_checks.d, then run the charm's add_to_cluster script.
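The health-check pass mirrors run-parts semantics: every script in the directory must exit 0, and an absent or empty directory is treated as success. A self-contained sketch (the plugin itself shells out to run-parts via Twisted; subprocess is used here only to keep the example runnable):

```python
import os
import subprocess

def run_health_checks(health_dir):
    """Every executable in health_checks.d must exit 0 (illustrative
    sketch of run-parts semantics, not the plugin's actual code)."""
    if not os.path.isdir(health_dir) or not os.listdir(health_dir):
        # No scripts, no problem: the plugin logs and succeeds.
        return ("Skipping juju charm health checks. No scripts at %s."
                % health_dir)
    for name in sorted(os.listdir(health_dir)):
        path = os.path.join(health_dir, name)
        proc = subprocess.run([path], env=os.environ)
        if proc.returncode != 0:
            raise RuntimeError(
                "Failed charm script: %s exited with return code %d."
                % (path, proc.returncode))
    return "All health checks succeeded."
```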
To test:
In my local copy of landscape/trunk I hacked the RebootComputer message to instead send a static change-ha-service message so I could test from a local client. I'll attach the trunk patch I was using, which might better enable integration testing.
Chad Smith (chad.smith) wrote:
- 629. By Chad Smith
  reintroduce CharmScriptError and RunPartsError to return as twisted deferred fails.
- 630. By Chad Smith
  fixed config unit test, adding HAService to the ALL_PLUGINS test
Christopher Armstrong (radix) wrote:
[1] + def run_parts(
You have an inner run_parts method that doesn't need to exist; you can just inline the code.
[2] + def _respond_failure
You should use landscape.lib.log.log_failure instead of logging.error here.
[3] _format_exception and the following code at the end of handle_change_ha_service:
+ except Exception, e:
+     self._respond_failure(...)
Instead, do
except:
    self._respond_failure(Failure(), operation_id)
Failure(), when constructed with no arguments, automatically grabs the "current" exception and traceback.
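That capture behaviour can be illustrated without Twisted: a no-argument Failure() snapshots sys.exc_info(), so the exception and traceback survive past the except block. MiniFailure below is a toy stand-in invented for this sketch, not Twisted's implementation:

```python
import sys

class MiniFailure:
    """Toy stand-in for twisted.python.failure.Failure (illustration only)."""

    def __init__(self):
        # Grab the "current" exception, as a no-argument Failure() does.
        self.type, self.value, self.tb = sys.exc_info()

    def getErrorMessage(self):
        return str(self.value)

try:
    raise ValueError("boom")
except:
    f = MiniFailure()

# The captured exception is still available after the handler exits.
assert f.type is ValueError
assert f.getErrorMessage() == "boom"
```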
[4] I recommend separating _respond_failure into two different functions, one for handling failure instances and another for handling string messages.
[5] In the places where you invoke getProcessValue, I think you'll still need to provide an environment in case the script relies on basic things like PATH, etc. It should be reasonable to just pass through os.environ like you do for the getProcessOutputAndValue call.
[6]
+ def validate_exit_code(code, script):
+     if code != 0:
+         return fail(CharmScriptError(script, code))
+     else:
+         return succeed("%s succeeded." % script)
This could be rewritten a bit nicer as
if code != 0:
    raise CharmScriptError(script, code)
else:
    return "%s succeeded." % script
[7] Same for parse_output.
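The reason the rewrite works: inside a callback chain, raising an exception and returning a failed Deferred are equivalent, because the chain catches the exception and converts it into a failure result. A toy illustration (the Chain class is invented for this sketch and is not a Twisted API):

```python
class Chain:
    """Toy stand-in for a Deferred callback chain (illustration only)."""

    def __init__(self, value):
        self.result, self.failed = value, False

    def add_callback(self, fn, *args):
        if not self.failed:
            try:
                self.result = fn(self.result, *args)
            except Exception as e:
                # A raised exception becomes a failure result, just as
                # Twisted wraps it in a Failure for the errback chain.
                self.result, self.failed = e, True
        return self

def validate_exit_code(code, script):
    # The reviewer's suggested shape: raise on failure, return plain value.
    if code != 0:
        raise RuntimeError("%s exited with return code %d." % (script, code))
    return "%s succeeded." % script

ok = Chain(0).add_callback(validate_exit_code, "add_to_cluster")
bad = Chain(2).add_callback(validate_exit_code, "add_to_cluster")
assert ok.result == "add_to_cluster succeeded."
assert bad.failed and "return code 2" in str(bad.result)
```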
- 631. By Chad Smith
  per review comments:
  - drop unneeded run_parts method in favor of inlining the code
  - use log_failure instead of logging.error
  - drop format_exception and use Failure() instead
  - on success, return "some string" instead of succeed("some string")
  - raise CharmScriptError instead of returning fail(CharmScriptError)
- 632. By Chad Smith
  - add _respond_failure_string to handle failure strings using logging.error
  - _respond_failure handles any raised exceptions using log_failure
Chad Smith (chad.smith) wrote:
Thanks Chris for the input. I worked in those changes you suggested.
Christopher Armstrong (radix) wrote:
[8]
+ failure_string = "%s" % (failure.value)
You should probably just use failure.getErrorMessage().
Looks good!
- 633. By Chad Smith
  use failure.getErrorMessage() instead of str(failure.value)
Jerry Seutter (jseutter) wrote:
+1 looks good
Regarding "opid" - it looks like operator process id to me. I have been told to stop over-abbreviating variable names in the past. It would probably be best to use operation_id instead.
- 634. By Chad Smith
  charm directory needs to tack on the 'charm' subdir for housing all charm deliverables. An example: /var/lib/juju/units/keystone-2/charm/
- 635. By Chad Smith
  opid -> operation_id
- 636. By Chad Smith
  lint fixes
Preview Diff
1 | === modified file 'landscape/manager/config.py' |
2 | --- landscape/manager/config.py 2013-01-24 20:15:36 +0000 |
3 | +++ landscape/manager/config.py 2013-02-22 00:26:21 +0000 |
4 | @@ -6,7 +6,7 @@ |
5 | |
6 | ALL_PLUGINS = ["ProcessKiller", "PackageManager", "UserManager", |
7 | "ShutdownManager", "Eucalyptus", "AptSources", "HardwareInfo", |
8 | - "CephUsage", "KeystoneToken"] |
9 | + "CephUsage", "KeystoneToken", "HAService"] |
10 | |
11 | |
12 | class ManagerConfiguration(Configuration): |
13 | |
14 | === added file 'landscape/manager/haservice.py' |
15 | --- landscape/manager/haservice.py 1970-01-01 00:00:00 +0000 |
16 | +++ landscape/manager/haservice.py 2013-02-22 00:26:21 +0000 |
17 | @@ -0,0 +1,205 @@ |
18 | +import logging |
19 | +import os |
20 | + |
21 | +from twisted.python.failure import Failure |
22 | +from twisted.internet.utils import getProcessValue, getProcessOutputAndValue |
23 | +from twisted.internet.defer import succeed |
24 | + |
25 | +from landscape.lib.log import log_failure |
26 | +from landscape.manager.plugin import ManagerPlugin, SUCCEEDED, FAILED |
27 | + |
28 | + |
29 | +class CharmScriptError(Exception): |
30 | + """ |
31 | + Raised when a charm-provided script fails with a non-zero exit code. |
32 | + |
33 | + @ivar script: the name of the failed script |
34 | + @ivar code: the exit code of the failed script |
35 | + """ |
36 | + |
37 | + def __init__(self, script, code): |
38 | + self.script = script |
39 | + self.code = code |
40 | + Exception.__init__(self, self._get_message()) |
41 | + |
42 | + def _get_message(self): |
43 | + return ("Failed charm script: %s exited with return code %d." % |
44 | + (self.script, self.code)) |
45 | + |
46 | + |
47 | +class RunPartsError(Exception): |
48 | + """ |
49 | + Raised when a charm-provided health script run-parts directory contains |
50 | + a health script that fails with a non-zero exit code. |
51 | + |
52 | + @ivar stderr: the stderr from the failed run-parts command |
53 | + """ |
54 | + |
55 | + def __init__(self, stderr): |
56 | + self.message = ("%s" % stderr.split(":")[1].strip()) |
57 | + Exception.__init__(self, self._get_message()) |
58 | + |
59 | + def _get_message(self): |
60 | + return "Failed charm script: %s." % self.message |
61 | + |
62 | + |
63 | +class HAService(ManagerPlugin): |
64 | + """ |
65 | + Plugin to manage this computer's active participation in a |
66 | + high-availability cluster. It depends on charms delivering both health |
67 | + scripts and cluster_add cluster_remove scripts to function. |
68 | + """ |
69 | + |
70 | + JUJU_UNITS_BASE = "/var/lib/juju/units" |
71 | + CLUSTER_ONLINE = "add_to_cluster" |
72 | + CLUSTER_STANDBY = "remove_from_cluster" |
73 | + HEALTH_SCRIPTS_DIR = "health_checks.d" |
74 | + STATE_STANDBY = u"standby" |
75 | + STATE_ONLINE = u"online" |
76 | + |
77 | + def register(self, registry): |
78 | + super(HAService, self).register(registry) |
79 | + registry.register_message("change-ha-service", |
80 | + self.handle_change_ha_service) |
81 | + |
82 | + def _respond(self, status, data, operation_id): |
83 | + message = {"type": "operation-result", |
84 | + "status": status, |
85 | + "operation-id": operation_id} |
86 | + if data: |
87 | + if not isinstance(data, unicode): |
88 | + # Let's decode result-text, replacing non-printable |
89 | + # characters |
90 | + message["result-text"] = data.decode("utf-8", "replace") |
91 | + else: |
92 | + message["result-text"] = data.decode("utf-8", "replace") |
93 | + return self.registry.broker.send_message(message, True) |
94 | + |
95 | + def _respond_success(self, data, message, operation_id): |
96 | + logging.info(message) |
97 | + return self._respond(SUCCEEDED, data, operation_id) |
98 | + |
99 | + def _respond_failure(self, failure, operation_id): |
100 | + """Handle exception failures.""" |
101 | + log_failure(failure) |
102 | + return self._respond(FAILED, failure.getErrorMessage(), operation_id) |
103 | + |
104 | + def _respond_failure_string(self, failure_string, operation_id): |
105 | + """Only handle string failures.""" |
106 | + logging.error(failure_string) |
107 | + return self._respond(FAILED, failure_string, operation_id) |
108 | + |
109 | + def _run_health_checks(self, unit_name): |
110 | + """ |
111 | + Exercise any discovered health check scripts, will return a deferred |
112 | + success or fail. |
113 | + """ |
114 | + health_dir = "%s/%s/charm/%s" % ( |
115 | + self.JUJU_UNITS_BASE, unit_name, self.HEALTH_SCRIPTS_DIR) |
116 | + if not os.path.exists(health_dir) or len(os.listdir(health_dir)) == 0: |
117 | + # No scripts, no problem |
118 | + message = ( |
119 | + "Skipping juju charm health checks. No scripts at %s." % |
120 | + health_dir) |
121 | + logging.info(message) |
122 | + return succeed(message) |
123 | + |
124 | + def parse_output((stdout_data, stderr_data, status)): |
125 | + if status != 0: |
126 | + raise RunPartsError(stderr_data) |
127 | + else: |
128 | + return "All health checks succeeded." |
129 | + |
130 | + result = getProcessOutputAndValue( |
131 | + "run-parts", [health_dir], env=os.environ) |
132 | + return result.addCallback(parse_output) |
133 | + |
134 | + def _change_cluster_participation(self, _, unit_name, service_state): |
135 | + """ |
136 | + Enables or disables a unit's participation in a cluster based on |
137 | + running charm-delivered CLUSTER_ONLINE and CLUSTER_STANDBY scripts |
138 | + if they exist. If the charm doesn't deliver scripts, return succeed(). |
139 | + """ |
140 | + |
141 | + unit_dir = "%s/%s/charm/" % (self.JUJU_UNITS_BASE, unit_name) |
142 | + if service_state == u"online": |
143 | + script = unit_dir + self.CLUSTER_ONLINE |
144 | + else: |
145 | + script = unit_dir + self.CLUSTER_STANDBY |
146 | + |
147 | + if not os.path.exists(script): |
148 | + logging.info("Ignoring juju charm cluster state change to '%s'. " |
149 | + "Charm script does not exist at %s." % |
150 | + (service_state, script)) |
151 | + return succeed( |
152 | + "This computer is always a participant in its high-availabilty" |
153 | + " cluster. No juju charm cluster settings changed.") |
154 | + |
155 | + def run_script(script): |
156 | + result = getProcessValue(script, env=os.environ) |
157 | + |
158 | + def validate_exit_code(code, script): |
159 | + if code != 0: |
160 | + raise CharmScriptError(script, code) |
161 | + else: |
162 | + return "%s succeeded." % script |
163 | + return result.addCallback(validate_exit_code, script) |
164 | + |
165 | + return run_script(script) |
166 | + |
167 | + def _perform_state_change(self, unit_name, service_state, operation_id): |
168 | + """ |
169 | + Handle specific state change requests through calls to available |
170 | + charm scripts like C{CLUSTER_ONLINE}, C{CLUSTER_STANDBY} and any |
171 | + health check scripts. Assume success in any case where no scripts |
172 | + exist for a given task. |
173 | + """ |
174 | + d = succeed(None) |
175 | + if service_state == self.STATE_ONLINE: |
176 | + # Validate health of local service before we bring it online |
177 | + # in the HAcluster |
178 | + d = self._run_health_checks(unit_name) |
179 | + d.addCallback( |
180 | + self._change_cluster_participation, unit_name, service_state) |
181 | + return d |
182 | + |
183 | + def handle_change_ha_service(self, message): |
184 | + """Parse incoming change-ha-service messages""" |
185 | + operation_id = message["operation-id"] |
186 | + try: |
187 | + error_message = u"" |
188 | + |
189 | + service_name = message["service-name"] # keystone |
190 | + unit_name = message["unit-name"] # keystone-0 |
191 | + service_state = message["service-state"] # "online" | "standby" |
192 | + change_message = ( |
193 | + "%s high-availability service set to %s" % |
194 | + (service_name, service_state)) |
195 | + |
196 | + if service_state not in [self.STATE_STANDBY, self.STATE_ONLINE]: |
197 | + error_message = ( |
198 | + u"Invalid cluster participation state requested %s." % |
199 | + service_state) |
200 | + |
201 | + unit_dir = "%s/%s/charm" % (self.JUJU_UNITS_BASE, unit_name) |
202 | + if not os.path.exists(self.JUJU_UNITS_BASE): |
203 | + error_message = ( |
204 | + u"This computer is not deployed with juju. " |
205 | + u"Changing high-availability service not supported.") |
206 | + elif not os.path.exists(unit_dir): |
207 | + error_message = ( |
208 | + u"This computer is not juju unit %s. Unable to " |
209 | + u"modify high-availability services." % unit_name) |
210 | + |
211 | + if error_message: |
212 | + return self._respond_failure_string( |
213 | + error_message, operation_id) |
214 | + |
215 | + d = self._perform_state_change( |
216 | + unit_name, service_state, operation_id) |
217 | + d.addCallback(self._respond_success, change_message, operation_id) |
218 | + d.addErrback(self._respond_failure, operation_id) |
219 | + return d |
220 | + except: |
221 | + self._respond_failure(Failure(), operation_id) |
222 | + return d |
223 | |
224 | === modified file 'landscape/manager/tests/test_config.py' |
225 | --- landscape/manager/tests/test_config.py 2013-01-24 18:22:32 +0000 |
226 | +++ landscape/manager/tests/test_config.py 2013-02-22 00:26:21 +0000 |
227 | @@ -13,7 +13,8 @@ |
228 | """By default all plugins are enabled.""" |
229 | self.assertEqual(["ProcessKiller", "PackageManager", "UserManager", |
230 | "ShutdownManager", "Eucalyptus", "AptSources", |
231 | - "HardwareInfo", "CephUsage", "KeystoneToken"], |
232 | + "HardwareInfo", "CephUsage", "KeystoneToken", |
233 | + "HAService"], |
234 | ALL_PLUGINS) |
235 | self.assertEqual(ALL_PLUGINS, self.config.plugin_factories) |
236 | |
237 | |
238 | === added file 'landscape/manager/tests/test_haservice.py' |
239 | --- landscape/manager/tests/test_haservice.py 1970-01-01 00:00:00 +0000 |
240 | +++ landscape/manager/tests/test_haservice.py 2013-02-22 00:26:21 +0000 |
241 | @@ -0,0 +1,331 @@ |
242 | +import os |
243 | + |
244 | +from twisted.internet.defer import Deferred |
245 | + |
246 | + |
247 | +from landscape.manager.haservice import HAService |
248 | +from landscape.manager.plugin import SUCCEEDED, FAILED |
249 | +from landscape.tests.helpers import LandscapeTest, ManagerHelper |
250 | +from landscape.tests.mocker import ANY |
251 | + |
252 | + |
253 | +class HAServiceTests(LandscapeTest): |
254 | + helpers = [ManagerHelper] |
255 | + |
256 | + def setUp(self): |
257 | + super(HAServiceTests, self).setUp() |
258 | + self.ha_service = HAService() |
259 | + self.ha_service.JUJU_UNITS_BASE = self.makeDir() |
260 | + self.unit_name = "my-service-9" |
261 | + |
262 | + self.health_check_d = os.path.join( |
263 | + self.ha_service.JUJU_UNITS_BASE, self.unit_name, "charm", |
264 | + self.ha_service.HEALTH_SCRIPTS_DIR) |
265 | + # create entire dir path |
266 | + os.makedirs(self.health_check_d) |
267 | + |
268 | + self.manager.add(self.ha_service) |
269 | + |
270 | + unit_dir = "%s/%s/charm" % ( |
271 | + self.ha_service.JUJU_UNITS_BASE, self.unit_name) |
272 | + cluster_online = file( |
273 | + "%s/add_to_cluster" % unit_dir, "w") |
274 | + cluster_online.write("#!/bin/bash\nexit 0") |
275 | + cluster_online.close() |
276 | + cluster_standby = file( |
277 | + "%s/remove_from_cluster" % unit_dir, "w") |
278 | + cluster_standby.write("#!/bin/bash\nexit 0") |
279 | + cluster_standby.close() |
280 | + |
281 | + os.chmod( |
282 | + "%s/add_to_cluster" % unit_dir, 0755) |
283 | + os.chmod( |
284 | + "%s/remove_from_cluster" % unit_dir, 0755) |
285 | + |
286 | + service = self.broker_service |
287 | + service.message_store.set_accepted_types(["operation-result"]) |
288 | + |
289 | + def test_invalid_server_service_state_request(self): |
290 | + """ |
291 | + When the landscape server requests a C{service-state} other than |
292 | + 'online' or 'standby' the client responds with the appropriate error. |
293 | + """ |
294 | + logging_mock = self.mocker.replace("logging.error") |
295 | + logging_mock("Invalid cluster participation state requested BOGUS.") |
296 | + self.mocker.replay() |
297 | + |
298 | + self.manager.dispatch_message( |
299 | + {"type": "change-ha-service", "service-name": "my-service", |
300 | + "unit-name": self.unit_name, "service-state": "BOGUS", |
301 | + "operation-id": 1}) |
302 | + |
303 | + service = self.broker_service |
304 | + self.assertMessages( |
305 | + service.message_store.get_pending_messages(), |
306 | + [{"type": "operation-result", "result-text": |
307 | + u"Invalid cluster participation state requested BOGUS.", |
308 | + "status": FAILED, "operation-id": 1}]) |
309 | + |
310 | + def test_not_a_juju_computer(self): |
311 | + """ |
312 | + When not a juju charmed computer, L{HAService} reponds with an error |
313 | + due to missing JUJU_UNITS_BASE dir. |
314 | + """ |
315 | + self.ha_service.JUJU_UNITS_BASE = "/I/don't/exist" |
316 | + |
317 | + logging_mock = self.mocker.replace("logging.error") |
318 | + logging_mock("This computer is not deployed with juju. " |
319 | + "Changing high-availability service not supported.") |
320 | + self.mocker.replay() |
321 | + |
322 | + self.manager.dispatch_message( |
323 | + {"type": "change-ha-service", "service-name": "my-service", |
324 | + "unit-name": self.unit_name, |
325 | + "service-state": self.ha_service.STATE_STANDBY, |
326 | + "operation-id": 1}) |
327 | + |
328 | + service = self.broker_service |
329 | + self.assertMessages( |
330 | + service.message_store.get_pending_messages(), |
331 | + [{"type": "operation-result", "result-text": |
332 | + u"This computer is not deployed with juju. Changing " |
333 | + u"high-availability service not supported.", |
334 | + "status": FAILED, "operation-id": 1}]) |
335 | + |
336 | + def test_incorrect_juju_unit(self): |
337 | + """ |
338 | + When not the specific juju charmed computer, L{HAService} reponds |
339 | + with an error due to missing the JUJU_UNITS_BASE/$JUJU_UNIT dir. |
340 | + """ |
341 | + logging_mock = self.mocker.replace("logging.error") |
342 | + logging_mock("This computer is not juju unit some-other-service-0. " |
343 | + "Unable to modify high-availability services.") |
344 | + self.mocker.replay() |
345 | + |
346 | + self.manager.dispatch_message( |
347 | + {"type": "change-ha-service", "service-name": "some-other-service", |
348 | + "unit-name": "some-other-service-0", "service-state": "standby", |
349 | + "operation-id": 1}) |
350 | + |
351 | + service = self.broker_service |
352 | + self.assertMessages( |
353 | + service.message_store.get_pending_messages(), |
354 | + [{"type": "operation-result", "result-text": |
355 | + u"This computer is not juju unit some-other-service-0. " |
356 | + u"Unable to modify high-availability services.", |
357 | + "status": FAILED, "operation-id": 1}]) |
358 | + |
359 | + def test_wb_no_health_check_directory(self): |
360 | + """ |
361 | + When unable to find a valid C{HEALTH_CHECK_DIR}, L{HAService} will |
362 | + succeed but log an informational message. |
363 | + """ |
364 | + self.ha_service.HEALTH_SCRIPTS_DIR = "I/don't/exist" |
365 | + |
366 | + def should_not_be_called(result): |
367 | + self.fail( |
368 | + "_run_health_checks failed on absent health check directory.") |
369 | + |
370 | + def check_success_result(result): |
371 | + self.assertEqual( |
372 | + result, |
373 | + "Skipping juju charm health checks. No scripts at " |
374 | + "%s/%s/charm/I/don't/exist." % |
375 | + (self.ha_service.JUJU_UNITS_BASE, self.unit_name)) |
376 | + |
377 | + result = self.ha_service._run_health_checks(self.unit_name) |
378 | + result.addCallbacks(check_success_result, should_not_be_called) |
379 | + |
380 | + def test_wb_no_health_check_scripts(self): |
381 | + """ |
382 | + When C{HEALTH_CHECK_DIR} exists but, no scripts exist, L{HAService} |
383 | + will log an informational message, but succeed. |
384 | + """ |
385 | + # In setup we created a health check directory but placed no health |
386 | + # scripts in it. |
387 | + def should_not_be_called(result): |
388 | + self.fail( |
389 | + "_run_health_checks failed on empty health check directory.") |
390 | + |
391 | + def check_success_result(result): |
392 | + self.assertEqual( |
393 | + result, |
394 | + "Skipping juju charm health checks. No scripts at " |
395 | + "%s/%s/charm/%s." % |
396 | + (self.ha_service.JUJU_UNITS_BASE, self.unit_name, |
397 | + self.ha_service.HEALTH_SCRIPTS_DIR)) |
398 | + |
399 | + result = self.ha_service._run_health_checks(self.unit_name) |
400 | + result.addCallbacks(check_success_result, should_not_be_called) |
401 | + |
402 | + def test_wb_failed_health_script(self): |
403 | + """ |
404 | + L{HAService} runs all health check scripts found in the |
405 | + C{HEALTH_CHECK_DIR}. If any script fails, L{HAService} will return a |
406 | + deferred L{fail}. |
407 | + """ |
408 | + |
409 | + def expected_failure(result): |
410 | + self.assertEqual( |
411 | + str(result.value), |
412 | + "Failed charm script: %s/%s/charm/%s/my-health-script-2 " |
413 | + "exited with return code 1." % |
414 | + (self.ha_service.JUJU_UNITS_BASE, self.unit_name, |
415 | + self.ha_service.HEALTH_SCRIPTS_DIR)) |
416 | + |
417 | + def check_success_result(result): |
418 | + self.fail( |
419 | + "_run_health_checks succeded despite a failed health script.") |
420 | + |
421 | + for number in [1, 2, 3]: |
422 | + script_path = ( |
423 | + "%s/my-health-script-%d" % (self.health_check_d, number)) |
424 | + health_script = file(script_path, "w") |
425 | + if number == 2: |
426 | + health_script.write("#!/bin/bash\nexit 1") |
427 | + else: |
428 | + health_script.write("#!/bin/bash\nexit 0") |
429 | + health_script.close() |
430 | + os.chmod(script_path, 0755) |
431 | + |
432 | + result = self.ha_service._run_health_checks(self.unit_name) |
433 | + result.addCallbacks(check_success_result, expected_failure) |
434 | + return result |
435 | + |
436 | + def test_missing_cluster_standby_or_cluster_online_scripts(self): |
437 | + """ |
438 | + When no cluster status change scripts are delivered by the charm, |
439 | + L{HAService} will still return a L{succeeded}. |
440 | + C{HEALTH_CHECK_DIR}. If any script fails, L{HAService} will return a |
441 | + deferred L{fail}. |
442 | + """ |
443 | + |
444 | + def should_not_be_called(result): |
445 | + self.fail( |
446 | + "_change_cluster_participation failed on absent charm script.") |
447 | + |
448 | + def check_success_result(result): |
449 | + self.assertEqual( |
450 | + result, |
451 | + "This computer is always a participant in its high-availabilty" |
452 | + " cluster. No juju charm cluster settings changed.") |
453 | + |
454 | + self.ha_service.CLUSTER_ONLINE = "I/don't/exist" |
455 | + self.ha_service.CLUSTER_STANDBY = "I/don't/exist" |
456 | + |
457 | + result = self.ha_service._change_cluster_participation( |
458 | + None, self.unit_name, self.ha_service.STATE_ONLINE) |
459 | + result.addCallbacks(check_success_result, should_not_be_called) |
460 | + |
461 | + # Now test the cluster standby script |
462 | + result = self.ha_service._change_cluster_participation( |
463 | + None, self.unit_name, self.ha_service.STATE_STANDBY) |
464 | + result.addCallbacks(check_success_result, should_not_be_called) |
465 | + return result |
466 | + |
467 | + def test_failed_cluster_standby_or_cluster_online_scripts(self): |
468 | + def expected_failure(result, script_path): |
469 | + self.assertEqual( |
470 | + str(result.value), |
471 | + "Failed charm script: %s exited with return code 2." % |
472 | + (script_path)) |
473 | + |
474 | + def check_success_result(result): |
475 | + self.fail( |
476 | + "_change_cluster_participation ignored charm script failure.") |
477 | + |
478 | + # Rewrite both cluster scripts as failures |
479 | + unit_dir = "%s/%s/charm" % ( |
480 | + self.ha_service.JUJU_UNITS_BASE, self.unit_name) |
481 | + for script_name in [ |
482 | + self.ha_service.CLUSTER_ONLINE, self.ha_service.CLUSTER_STANDBY]: |
483 | + |
484 | + cluster_online = file("%s/%s" % (unit_dir, script_name), "w") |
485 | + cluster_online.write("#!/bin/bash\nexit 2") |
486 | + cluster_online.close() |
487 | + |
488 | + result = self.ha_service._change_cluster_participation( |
489 | + None, self.unit_name, self.ha_service.STATE_ONLINE) |
490 | + result.addCallback(check_success_result) |
491 | + script_path = ("%s/%s" % (unit_dir, self.ha_service.CLUSTER_ONLINE)) |
492 | + result.addErrback(expected_failure, script_path) |
493 | + |
494 | + # Now test the cluster standby script |
495 | + result = self.ha_service._change_cluster_participation( |
496 | + None, self.unit_name, self.ha_service.STATE_STANDBY) |
497 | + result.addCallback(check_success_result) |
498 | + script_path = ("%s/%s" % (unit_dir, self.ha_service.CLUSTER_STANDBY)) |
499 | + result.addErrback(expected_failure, script_path) |
500 | + return result |
501 | + |
502 | + def test_run_success_cluster_standby(self): |
503 | + """ |
504 | + When receives a C{change-ha-service message} with C{STATE_STANDBY} |
505 | + requested the manager runs the C{CLUSTER_STANDBY} script and returns |
506 | + a successful operation-result to the server. |
507 | + """ |
508 | + message = ({"type": "change-ha-service", "service-name": "my-service", |
509 | + "unit-name": self.unit_name, |
510 | + "service-state": self.ha_service.STATE_STANDBY, |
511 | + "operation-id": 1}) |
512 | + deferred = Deferred() |
513 | + |
514 | + def validate_messages(value): |
515 | + cluster_script = "%s/%s/charm/%s" % ( |
516 | + self.ha_service.JUJU_UNITS_BASE, self.unit_name, |
517 | + self.ha_service.CLUSTER_STANDBY) |
518 | + service = self.broker_service |
519 | + self.assertMessages( |
520 | + service.message_store.get_pending_messages(), |
521 | + [{"type": "operation-result", |
522 | + "result-text": u"%s succeeded." % cluster_script, |
523 | + "status": SUCCEEDED, "operation-id": 1}]) |
524 | + |
525 | + def handle_has_run(handle_result_deferred): |
526 | + handle_result_deferred.chainDeferred(deferred) |
527 | + return deferred.addCallback(validate_messages) |
528 | + |
529 | + ha_service_mock = self.mocker.patch(self.ha_service) |
530 | + ha_service_mock.handle_change_ha_service(ANY) |
531 | + self.mocker.passthrough(handle_has_run) |
532 | + self.mocker.replay() |
533 | + self.manager.add(self.ha_service) |
534 | + self.manager.dispatch_message(message) |
535 | + |
536 | + return deferred |
537 | + |
538 | + def test_run_success_cluster_online(self): |
539 | + """ |
540 | + When receives a C{change-ha-service message} with C{STATE_ONLINE} |
541 | + requested the manager runs the C{CLUSTER_ONLINE} script and returns |
542 | + a successful operation-result to the server. |
543 | + """ |
544 | + message = ({"type": "change-ha-service", "service-name": "my-service", |
545 | + "unit-name": self.unit_name, |
546 | + "service-state": self.ha_service.STATE_ONLINE, |
547 | + "operation-id": 1}) |
548 | + deferred = Deferred() |
549 | + |
550 | + def validate_messages(value): |
551 | + cluster_script = "%s/%s/charm/%s" % ( |
552 | + self.ha_service.JUJU_UNITS_BASE, self.unit_name, |
553 | + self.ha_service.CLUSTER_ONLINE) |
554 | + service = self.broker_service |
555 | + self.assertMessages( |
556 | + service.message_store.get_pending_messages(), |
557 | + [{"type": "operation-result", |
558 | + "result-text": u"%s succeeded." % cluster_script, |
559 | + "status": SUCCEEDED, "operation-id": 1}]) |
560 | + |
561 | + def handle_has_run(handle_result_deferred): |
562 | + handle_result_deferred.chainDeferred(deferred) |
563 | + return deferred.addCallback(validate_messages) |
564 | + |
565 | + ha_service_mock = self.mocker.patch(self.ha_service) |
566 | + ha_service_mock.handle_change_ha_service(ANY) |
567 | + self.mocker.passthrough(handle_has_run) |
568 | + self.mocker.replay() |
569 | + self.manager.add(self.ha_service) |
570 | + self.manager.dispatch_message(message) |
571 | + |
572 | + return deferred |
573 | |
574 | === modified file 'landscape/message_schemas.py' |
575 | --- landscape/message_schemas.py 2013-02-21 13:35:54 +0000 |
576 | +++ landscape/message_schemas.py 2013-02-22 00:26:21 +0000 |
577 | @@ -125,6 +125,12 @@ |
578 | "data": Any(String(), Constant(None)) |
579 | }) |
580 | |
581 | +CHANGE_HA_SERVICE = Message( |
582 | + "change-ha-service", |
583 | + {"service-name": String(), # keystone |
584 | + "unit-name": String(), # keystone-9 |
585 | + "state": String()}) # online or standby |
586 | + |
587 | MEMORY_INFO = Message("memory-info", { |
588 | "memory-info": List(Tuple(Float(), Int(), Int())), |
589 | }) |
590 | @@ -445,5 +451,6 @@ |
591 | CUSTOM_GRAPH, REBOOT_REQUIRED, APT_PREFERENCES, EUCALYPTUS_INFO, |
592 | EUCALYPTUS_INFO_ERROR, NETWORK_DEVICE, NETWORK_ACTIVITY, |
593 | REBOOT_REQUIRED_INFO, UPDATE_MANAGER_INFO, CPU_USAGE, |
594 | - CEPH_USAGE, SWIFT_DEVICE_INFO, KEYSTONE_TOKEN]: |
595 | + CEPH_USAGE, SWIFT_DEVICE_INFO, KEYSTONE_TOKEN, |
596 | + CHANGE_HA_SERVICE]: |
597 | message_schemas[schema.type] = schema |
Hmmm, no attach functionality on a merge proposal. Let's try a pastebin (which will probably need updating):
https://pastebin.canonical.com/84692/