Merge lp:~stub/charms/trusty/cassandra/spike into lp:charms/trusty/cassandra
Status: Merged
Merged at revision: 362
Proposed branch: lp:~stub/charms/trusty/cassandra/spike
Merge into: lp:charms/trusty/cassandra
To merge this branch: bzr merge lp:~stub/charms/trusty/cassandra/spike

Diff against target: 7203 lines (+2676/-3234), 41 files modified:

  Makefile (+10/-6)
  README.md (+4/-11)
  charm-helpers.yaml (+2/-3)
  config.yaml (+1/-18)
  hooks/actions.py (+380/-191)
  hooks/charmhelpers/contrib/benchmark/__init__.py (+124/-0)
  hooks/charmhelpers/contrib/charmsupport/nrpe.py (+3/-1)
  hooks/charmhelpers/contrib/peerstorage/__init__.py (+0/-150)
  hooks/charmhelpers/contrib/unison/__init__.py (+0/-298)
  hooks/charmhelpers/coordinator.py (+607/-0)
  hooks/charmhelpers/core/hookenv.py (+230/-14)
  hooks/charmhelpers/core/host.py (+1/-1)
  hooks/charmhelpers/core/services/base.py (+41/-17)
  hooks/charmhelpers/core/strutils.py (+2/-2)
  hooks/charmhelpers/fetch/giturl.py (+7/-5)
  hooks/cluster-relation-changed (+20/-0)
  hooks/cluster-relation-departed (+20/-0)
  hooks/config-changed (+20/-0)
  hooks/coordinator.py (+35/-0)
  hooks/data-relation-changed (+20/-0)
  hooks/data-relation-departed (+20/-0)
  hooks/database-admin-relation-changed (+20/-0)
  hooks/database-relation-changed (+20/-0)
  hooks/definitions.py (+51/-62)
  hooks/helpers.py (+194/-517)
  hooks/hooks.py (+22/-7)
  hooks/install (+20/-0)
  hooks/leader-elected (+20/-0)
  hooks/leader-settings-changed (+20/-0)
  hooks/nrpe-external-master-relation-changed (+20/-0)
  hooks/relations.py (+19/-3)
  hooks/rollingrestart.py (+0/-257)
  hooks/stop (+20/-0)
  hooks/upgrade-charm (+20/-0)
  templates/cassandra_maintenance_cron.tmpl (+1/-1)
  tests/test_actions.py (+524/-515)
  tests/test_definitions.py (+37/-21)
  tests/test_helpers.py (+93/-589)
  tests/test_integration.py (+27/-8)
  tests/test_rollingrestart.py (+0/-536)
  tests/tests.yaml (+1/-1)
Reviewer | Review Type | Date Requested | Status
---|---|---|---
Tim Van Steenburgh | community | | Approve
charmers | | | Pending
Commit message
Description of the change
Rework the Cassandra charm to rely on leadership, and maintain the unit and service status while we are at it.
By using leadership, we increase reliability and are able to remove a large amount of code that was needed to cope with flaky coordination.
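
The moving parts are easiest to see condensed. Below is a minimal sketch of the pattern hooks/actions.py adopts in this branch: a `leader_only` guard built on `hookenv.is_leader()`, `leader_set()` for publishing cluster-wide state such as the seed list, and the charm-helpers coordinator lock that serializes restarts across peers. The guard and action bodies here are illustrative placeholders, not the charm's real logic.

```python
from functools import wraps

from charmhelpers.core import hookenv

# The charm's BaseCoordinator instance (hooks/coordinator.py in this
# branch). Its require() decorator runs the decorated action only once
# the named lock has been granted to this unit.
from coordinator import coordinator


def leader_only(func):
    '''Decorated function is only run on the leader.'''
    @wraps(func)
    def wrapper(*args, **kw):
        if hookenv.is_leader():
            return func(*args, **kw)
        return None
    return wrapper


@leader_only
def maintain_seeds():
    # Cluster-wide state is published via leader settings instead of
    # being inferred from unit numbers or echoed through peerstorage.
    hookenv.leader_set(seeds=hookenv.unit_private_ip())


def needs_restart():
    # Illustrative guard; the real charm checks config changes, seed
    # list changes and pending mounts before requesting the lock.
    return False


@coordinator.require('restart', needs_restart)
def maybe_restart():
    # Runs only when the coordinator grants this unit the 'restart'
    # lock, so peers restart one at a time instead of racing.
    hookenv.status_set('maintenance', 'Restarting Cassandra')
```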

Stuart Bishop (stub) wrote:
I've also taken the opportunity to update the default versions. With 2.1.7, the 2.1 series is finally the recommended Cassandra release, and DataStax has released DSE 4.7 (now based on Cassandra 2.1).

Tim Van Steenburgh (tvansteenburgh) wrote:
Hey stub, there are some failures in the tests. You may already be aware, but I wanted to point it out b/c the test runner is, for reasons unknown to me at this point, exiting 0 in spite of the failures.
Check out the console logs here: http://

Stuart Bishop (stub) wrote:
On 26 June 2015 at 22:37, Tim Van Steenburgh <email address hidden> wrote:
> Check out the console logs here: http://
These seem to be for an earlier revision of the charm. In particular,
the Test21Deployment tests should be failing because that TestCase no
longer exists, yet they are being run. This broke with my work on the
23rd, so I've fixed the Makefile and tests.yaml and pushed.
--
Stuart Bishop <email address hidden>

Stuart Bishop (stub) wrote:
I have fixed the error codes bug, so failing tests will be reported as failed in Jenkins.
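
For context, the fix is the `SHELL=bash -o pipefail` line added to the Makefile (visible in the preview diff below). The test targets pipe nosetests output through ts(1), and by default a shell pipeline exits with the status of its last command, so ts masked the test failures. A standalone sketch (not charm code) demonstrating the difference:

```python
import subprocess

# Default shell semantics: the pipeline exits with the LAST command's
# status, so `nosetests ... | ts` reports 0 even when the tests fail.
masked = subprocess.call(['bash', '-c', 'false | cat'])

# With pipefail, any failing stage fails the whole pipeline, which is
# what SHELL=bash -o pipefail gives every recipe in the Makefile.
propagated = subprocess.call(['bash', '-o', 'pipefail', '-c', 'false | cat'])

print(masked, propagated)  # prints: 0 1
```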

Stuart Bishop (stub) wrote:
Another run at http://
http://

Stuart Bishop (stub) wrote:
http://

Stuart Bishop (stub) wrote:
I've also opened Bug #1470024 on the sentry.
362. By Tim Van Steenburgh

[stub] Rework the Cassandra charm to rely on leadership, and maintain
the unit and service status while we are at it.

By using leadership, we increase reliability and are able to remove a
large amount of code that was needed to cope with flaky coordination.

Tim Van Steenburgh (tvansteenburgh) wrote:
LGTM, merging on the merit of tests passing on AWS. Other failures are clearly tooling-related, not problems in the charm. Thanks for filing the bug against amulet.
Preview Diff
1 | === modified file 'Makefile' |
2 | --- Makefile 2015-04-28 16:20:05 +0000 |
3 | +++ Makefile 2015-06-30 07:36:26 +0000 |
4 | @@ -34,6 +34,10 @@ |
5 | PIP=.venv3/bin/pip3.4 -q |
6 | NOSETESTS=.venv3/bin/nosetests-3.4 -sv |
7 | |
8 | +# Set pipefail so we can get sane error codes while tagging test output |
9 | +# with ts(1) |
10 | +SHELL=bash -o pipefail |
11 | + |
12 | deps: packages venv3 |
13 | |
14 | lint: deps |
15 | @@ -41,16 +45,17 @@ |
16 | free --human |
17 | charm proof $(CHARM_DIR) |
18 | flake8 \ |
19 | - --ignore=E402 \ |
20 | + --ignore=E402,E265 \ |
21 | --exclude=charmhelpers,.venv2,.venv3 hooks tests testing |
22 | + @echo OK: Lint free `date` |
23 | |
24 | unittest: lint |
25 | $(NOSETESTS) \ |
26 | tests.test_actions --cover-package=actions \ |
27 | tests.test_helpers --cover-package=helpers \ |
28 | - tests.test_rollingrestart --cover-package=rollingrestart \ |
29 | tests.test_definitions --cover-package=definitions \ |
30 | --with-coverage --cover-branches |
31 | + @echo OK: Unit tests pass `date` |
32 | |
33 | test: unittest |
34 | AMULET_TIMEOUT=3600 \ |
35 | @@ -62,11 +67,11 @@ |
36 | AMULET_TIMEOUT=5400 \ |
37 | $(NOSETESTS) tests.test_integration:Test1UnitDeployment 2>&1 | ts |
38 | |
39 | -21test: unittest Test21Deployment |
40 | -Test21Deployment: deps |
41 | +20test: unittest Test20Deployment |
42 | +Test20Deployment: deps |
43 | date |
44 | AMULET_TIMEOUT=5400 \ |
45 | - $(NOSETESTS) tests.test_integration:Test21Deployment 2>&1 | ts |
46 | + $(NOSETESTS) tests.test_integration:Test20Deployment 2>&1 | ts |
47 | |
48 | 3test: unittest Test3UnitDeployment |
49 | Test3UnitDeployment: deps |
50 | @@ -95,7 +100,6 @@ |
51 | $(NOSETESTS) \ |
52 | tests.test_actions --cover-package=actions \ |
53 | tests.test_helpers --cover-package=helpers \ |
54 | - tests.test_rollingrestart --cover-package=rollingrestart \ |
55 | tests.test_definitions --cover-package=definitions \ |
56 | --with-coverage --cover-branches \ |
57 | --cover-html --cover-html-dir=coverage \ |
58 | |
59 | === modified file 'README.md' |
60 | --- README.md 2015-03-07 09:27:02 +0000 |
61 | +++ README.md 2015-06-30 07:36:26 +0000 |
62 | @@ -14,9 +14,9 @@ |
63 | # Editions |
64 | |
65 | This charm supports Apache Cassandra 2.0, Apache Cassandra 2.1, and |
66 | -Datastax Enterprise 4.6. The default is Apache Cassandra 2.0. |
67 | +Datastax Enterprise 4.7. The default is Apache Cassandra 2.1. |
68 | |
69 | -To use Apache Cassandra 2.1, specify the Apache Cassandra 2.1 archive source |
70 | +To use Apache Cassandra 2.0, specify the Apache Cassandra 2.0 archive source |
71 | in the `install_sources` config setting when deploying. |
72 | |
73 | To use Datastax Enterprise, set the `edition` config setting to `dse` |
74 | @@ -83,23 +83,16 @@ |
75 | `install_keys` configuration item. Place the DataStax packages in a |
76 | local archive to avoid downloading from datastax.com. |
77 | |
78 | -The Cassandra Python driver and some dependencies are installed using |
79 | -`pip(1)`. Set the `http_proxy` config item to direct its downloads via |
80 | -a proxy server, such as squid or devpi. |
81 | - |
82 | |
83 | ## Oracle Java SE |
84 | |
85 | -Cassandra recommends using Oracle Java SE 7. Unfortunately, this |
86 | +Cassandra recommends using Oracle Java SE 8. Unfortunately, this |
87 | software is accessible only after accepting Oracle's click-through |
88 | license making deployments using it much more cumbersome. You will need |
89 | -to download the Oracle Java SE 7 Server Runtime for Linux, and place the |
90 | +to download the Oracle Java SE 8 Server Runtime for Linux, and place the |
91 | tarball at a URL accessible to your deployed units. The config item |
92 | `private_jre_url` needs to be set to this URL. |
93 | |
94 | -The recommended Java version is expected to change, as Java SE 7 public |
95 | -releases cease as of April 2015. |
96 | - |
97 | |
98 | # Usage |
99 | |
100 | |
101 | === modified file 'charm-helpers.yaml' |
102 | --- charm-helpers.yaml 2015-04-03 15:42:21 +0000 |
103 | +++ charm-helpers.yaml 2015-06-30 07:36:26 +0000 |
104 | @@ -18,11 +18,10 @@ |
105 | #branch: lp:charm-helpers |
106 | branch: lp:~stub/charm-helpers/integration |
107 | include: |
108 | - - __init__ |
109 | + - coordinator |
110 | - core |
111 | - fetch |
112 | - contrib.charmsupport |
113 | - contrib.templating.jinja |
114 | - - contrib.peerstorage |
115 | - contrib.network.ufw |
116 | - - contrib.unison |
117 | + - contrib.benchmark |
118 | |
119 | === modified file 'config.yaml' |
120 | --- config.yaml 2015-03-07 09:27:02 +0000 |
121 | +++ config.yaml 2015-06-30 07:36:26 +0000 |
122 | @@ -35,7 +35,7 @@ |
123 | If you are using Datastax Enterprise, you will need to |
124 | override one defaults with your own username and password. |
125 | default: | |
126 | - - deb http://www.apache.org/dist/cassandra/debian 20x main |
127 | + - deb http://www.apache.org/dist/cassandra/debian 21x main |
128 | # - deb http://debian.datastax.com/community stable main |
129 | # DSE requires you to register and add your username/password here. |
130 | # - deb http://un:pw@debian.datastax.com/enterprise stable main |
131 | @@ -207,15 +207,6 @@ |
132 | default: 256 |
133 | description: Number of tokens per node. |
134 | |
135 | - # Not implemented. |
136 | - # force_seed_nodes: |
137 | - # type: string |
138 | - # default: "" |
139 | - # description: > |
140 | - # A comma separated list of seed node ip addresses. This is |
141 | - # useful if the cluster being created in this juju environment |
142 | - # is part of a larger cluser and the seed nodes are remote. |
143 | - |
144 | # Topology of the service in the cluster. |
145 | datacenter: |
146 | type: string |
147 | @@ -312,11 +303,3 @@ |
148 | performance problems and even exaust the server heap. Adjust |
149 | the thresholds here if you understand the dangers and want |
150 | to scan more tombstones anyway. |
151 | - |
152 | - post_bootstrap_delay: |
153 | - type: int |
154 | - default: 120 |
155 | - description: > |
156 | - Testing option - do not change. When adding multiple new |
157 | - nodes to a cluster, you are supposed to wait two minutes |
158 | - between nodes. I set this to zero to speed up test runs. |
159 | |
160 | === modified file 'hooks/actions.py' |
161 | --- hooks/actions.py 2015-04-12 16:20:50 +0000 |
162 | +++ hooks/actions.py 2015-06-30 07:36:26 +0000 |
163 | @@ -23,20 +23,22 @@ |
164 | import shlex |
165 | import subprocess |
166 | from textwrap import dedent |
167 | +import time |
168 | import urllib.request |
169 | |
170 | from charmhelpers import fetch |
171 | from charmhelpers.contrib.charmsupport import nrpe |
172 | from charmhelpers.contrib.network import ufw |
173 | from charmhelpers.contrib.templating import jinja |
174 | -from charmhelpers.contrib import unison |
175 | from charmhelpers.core import hookenv, host |
176 | from charmhelpers.core.fstab import Fstab |
177 | -from charmhelpers.core.hookenv import ERROR, WARNING |
178 | - |
179 | +from charmhelpers.core.hookenv import DEBUG, ERROR, WARNING |
180 | + |
181 | +import cassandra |
182 | + |
183 | +from coordinator import coordinator |
184 | import helpers |
185 | import relations |
186 | -import rollingrestart |
187 | |
188 | |
189 | # These config keys cannot be changed after service deployment. |
190 | @@ -54,7 +56,6 @@ |
191 | 'native_transport_port', |
192 | 'partitioner', |
193 | 'num_tokens', |
194 | - 'force_seed_nodes', |
195 | 'max_heap_size', |
196 | 'heap_newsize', |
197 | 'authorizer', |
198 | @@ -83,8 +84,7 @@ |
199 | 'nagios_heapchk_warn_pct', |
200 | 'nagios_heapchk_crit_pct', |
201 | 'nagios_disk_warn_pct', |
202 | - 'nagios_disk_crit_pct', |
203 | - 'post_bootstrap_delay']) |
204 | + 'nagios_disk_crit_pct']) |
205 | |
206 | |
207 | def action(func): |
208 | @@ -103,6 +103,17 @@ |
209 | return wrapper |
210 | |
211 | |
212 | +def leader_only(func): |
213 | + '''Decorated function is only run on the leader.''' |
214 | + @wraps(func) |
215 | + def wrapper(*args, **kw): |
216 | + if hookenv.is_leader(): |
217 | + return func(*args, **kw) |
218 | + else: |
219 | + return None |
220 | + return wrapper |
221 | + |
222 | + |
223 | @action |
224 | def set_proxy(): |
225 | config = hookenv.config() |
226 | @@ -113,13 +124,14 @@ |
227 | |
228 | @action |
229 | def revert_unchangeable_config(): |
230 | - if hookenv.hook_name() == 'install': |
231 | - # config.previous() only becomes meaningful after the install |
232 | - # hook has run. During the first run on the unit hook, it |
233 | - # reports everything has having None as the previous value. |
234 | + config = hookenv.config() |
235 | + |
236 | + # config.previous() only becomes meaningful after the install |
237 | + # hook has run. During the first run on the unit hook, it |
238 | + # reports everything as having None as the previous value. |
239 | + if config._prev_dict is None: |
240 | return |
241 | |
242 | - config = hookenv.config() |
243 | for key in UNCHANGEABLE_KEYS: |
244 | if config.changed(key): |
245 | previous = config.previous(key) |
246 | @@ -231,48 +243,56 @@ |
247 | helpers.ensure_package_status(helpers.get_cassandra_packages()) |
248 | |
249 | |
250 | -@action |
251 | -def install_oracle_jre(): |
252 | - if helpers.get_jre() != 'oracle': |
253 | - return |
254 | - |
255 | +def _fetch_oracle_jre(): |
256 | config = hookenv.config() |
257 | url = config.get('private_jre_url', None) |
258 | - if not url: |
259 | - hookenv.log('private_jre_url not set. Unable to continue.', ERROR) |
260 | - raise SystemExit(1) |
261 | - |
262 | - if config.get('retrieved_jre', None) != url: |
263 | - filename = os.path.join('lib', url.split('/')[-1]) |
264 | + if url and config.get('retrieved_jre', None) != url: |
265 | + filename = os.path.join(hookenv.charm_dir(), |
266 | + 'lib', url.split('/')[-1]) |
267 | if not filename.endswith('-linux-x64.tar.gz'): |
268 | - hookenv.log('Invalid JRE URL {}'.format(url), ERROR) |
269 | - raise SystemExit(1) |
270 | + helpers.status_set('blocked', |
271 | + 'Invalid private_jre_url {}'.format(url)) |
272 | + raise SystemExit(0) |
273 | + helpers.status_set(hookenv.status_get(), |
274 | + 'Downloading Oracle JRE') |
275 | + hookenv.log('Oracle JRE URL is {}'.format(url)) |
276 | urllib.request.urlretrieve(url, filename) |
277 | config['retrieved_jre'] = url |
278 | |
279 | - pattern = 'lib/server-jre-?u*-linux-x64.tar.gz' |
280 | + pattern = os.path.join(hookenv.charm_dir(), |
281 | + 'lib', 'server-jre-?u*-linux-x64.tar.gz') |
282 | tarballs = glob.glob(pattern) |
283 | - if not tarballs: |
284 | - hookenv.log('Oracle JRE tarball not found ({})'.format(pattern), |
285 | - ERROR) |
286 | - # We could fallback to OpenJDK, but the user took the trouble |
287 | - # to specify the Oracle JRE and it is recommended for Cassandra |
288 | - # so lets hard fail instead. |
289 | - raise SystemExit(1) |
290 | + if not (url or tarballs): |
291 | + helpers.status_set('blocked', |
292 | + 'private_jre_url not set and no local tarballs.') |
293 | + raise SystemExit(0) |
294 | + |
295 | + elif not tarballs: |
296 | + helpers.status_set('blocked', |
297 | + 'Oracle JRE tarball not found ({})'.format(pattern)) |
298 | + raise SystemExit(0) |
299 | |
300 | # Latest tarball by filename/version num. Lets hope they don't hit |
301 | # 99 (currently at 76). |
302 | tarball = sorted(tarballs)[-1] |
303 | - |
304 | + return tarball |
305 | + |
306 | + |
307 | +def _install_oracle_jre_tarball(tarball): |
308 | # Same directory as webupd8 to avoid surprising people, but it could |
309 | # be anything. |
310 | - dest = '/usr/lib/jvm/java-7-oracle' |
311 | + if 'jre-7u' in str(tarball): |
312 | + dest = '/usr/lib/jvm/java-7-oracle' |
313 | + else: |
314 | + dest = '/usr/lib/jvm/java-8-oracle' |
315 | |
316 | if not os.path.isdir(dest): |
317 | host.mkdir(dest) |
318 | |
319 | jre_exists = os.path.exists(os.path.join(dest, 'bin', 'java')) |
320 | |
321 | + config = hookenv.config() |
322 | + |
323 | # Unpack the latest tarball if necessary. |
324 | if config.get('oracle_jre_tarball', '') == tarball and jre_exists: |
325 | hookenv.log('Already installed {}'.format(tarball)) |
326 | @@ -293,6 +313,15 @@ |
327 | |
328 | |
329 | @action |
330 | +def install_oracle_jre(): |
331 | + if helpers.get_jre() != 'oracle': |
332 | + return |
333 | + |
334 | + tarball = _fetch_oracle_jre() |
335 | + _install_oracle_jre_tarball(tarball) |
336 | + |
337 | + |
338 | +@action |
339 | def emit_java_version(): |
340 | # Log the version for posterity. Could be useful since Oracle JRE |
341 | # security updates are not automated. |
342 | @@ -355,82 +384,153 @@ |
343 | host.write_file(rackdc_path, rackdc_properties.encode('UTF-8')) |
344 | |
345 | |
346 | +def needs_reset_auth_keyspace_replication(): |
347 | + '''Guard for reset_auth_keyspace_replication.''' |
348 | + num_nodes = helpers.num_nodes() |
349 | + n = min(num_nodes, 3) |
350 | + datacenter = hookenv.config()['datacenter'] |
351 | + with helpers.connect() as session: |
352 | + strategy_opts = helpers.get_auth_keyspace_replication(session) |
353 | + rf = int(strategy_opts.get(datacenter, -1)) |
354 | + hookenv.log('system_auth rf={!r}'.format(strategy_opts)) |
355 | + # If the node count has increased, we will change the rf. |
356 | + # If the node count is decreasing, we do nothing as the service |
357 | + # may be being destroyed. |
358 | + if rf < n: |
359 | + return True |
360 | + return False |
361 | + |
362 | + |
363 | +@leader_only |
364 | @action |
365 | +@coordinator.require('repair', needs_reset_auth_keyspace_replication) |
366 | def reset_auth_keyspace_replication(): |
367 | - # This action only lowers the system_auth keyspace replication |
368 | - # values when a node has been decommissioned. The replication settings |
369 | - # are also updated during rolling restart, which takes care of when new |
370 | - # nodes are added. |
371 | - helpers.reset_auth_keyspace_replication() |
372 | + # Cassandra requires you to manually set the replication factor of |
373 | + # the system_auth keyspace, to ensure availability and redundancy. |
374 | + # The replication factor in this service's DC can be no higher than |
375 | + # the number of bootstrapped nodes. We also cap this at 3 to ensure |
376 | + # we don't have too many seeds. |
377 | + num_nodes = helpers.num_nodes() |
378 | + n = min(num_nodes, 3) |
379 | + datacenter = hookenv.config()['datacenter'] |
380 | + with helpers.connect() as session: |
381 | + strategy_opts = helpers.get_auth_keyspace_replication(session) |
382 | + rf = int(strategy_opts.get(datacenter, -1)) |
383 | + hookenv.log('system_auth rf={!r}'.format(strategy_opts)) |
384 | + if rf != n: |
385 | + strategy_opts['class'] = 'NetworkTopologyStrategy' |
386 | + strategy_opts[datacenter] = n |
387 | + if 'replication_factor' in strategy_opts: |
388 | + del strategy_opts['replication_factor'] |
389 | + helpers.set_auth_keyspace_replication(session, strategy_opts) |
390 | + helpers.repair_auth_keyspace() |
391 | + helpers.set_active() |
392 | |
393 | |
394 | @action |
395 | def store_unit_private_ip(): |
396 | + '''Store the unit's private ip address, so we can tell if it changes.''' |
397 | hookenv.config()['unit_private_ip'] = hookenv.unit_private_ip() |
398 | |
399 | |
400 | -@action |
401 | -def set_unit_zero_bootstrapped(): |
402 | - '''Unit #0 is used as the first node in the cluster. |
403 | - |
404 | - Unit #0 is implicitly flagged as bootstrap, and thus befores the |
405 | - first node in the cluster and providing a seed for other nodes to |
406 | - bootstrap off. We can change this when we have juju leadership, |
407 | - making the leader the first node in the cluster. Until then, don't |
408 | - attempt to create a multiunit service if you have removed Unit #0. |
409 | - ''' |
410 | - relname = rollingrestart.get_peer_relation_name() |
411 | - if helpers.unit_number() == 0 and hookenv.hook_name().startswith(relname): |
412 | - helpers.set_bootstrapped(True) |
413 | - |
414 | - |
415 | -@action |
416 | -def maybe_schedule_restart(): |
417 | - '''Prepare for and schedule a rolling restart if necessary.''' |
418 | +def needs_restart(): |
419 | + '''Return True if Cassandra is not running or needs to be restarted.''' |
420 | + if helpers.is_decommissioned(): |
421 | + # Decommissioned nodes are never restarted. They remain up |
422 | + # telling everyone they are decommissioned. |
423 | + helpers.status_set('blocked', 'Decommissioned node') |
424 | + return False |
425 | + |
426 | if not helpers.is_cassandra_running(): |
427 | - # Short circuit if Cassandra is not running to avoid log spam. |
428 | - rollingrestart.request_restart() |
429 | - return |
430 | - |
431 | - if helpers.is_decommissioned(): |
432 | - hookenv.log("Decommissioned") |
433 | - return |
434 | - |
435 | - # If any of these config items changed, a restart is required. |
436 | + if helpers.is_bootstrapped(): |
437 | + helpers.status_set('waiting', 'Waiting for permission to start') |
438 | + else: |
439 | + helpers.status_set('waiting', |
440 | + 'Waiting for permission to bootstrap') |
441 | + return True |
442 | + |
443 | config = hookenv.config() |
444 | - restart = False |
445 | - for key in RESTART_REQUIRED_KEYS: |
446 | - if config.changed(key): |
447 | - hookenv.log('{} changed. Restart required.'.format(key)) |
448 | - restart = True |
449 | + |
450 | + # If our IP address has changed, we need to restart. |
451 | + if config.changed('unit_private_ip'): |
452 | + helpers.status_set('waiting', |
453 | + 'IP address changed. Waiting for restart permission.') |
454 | + return True |
455 | |
456 | # If the directory paths have changed, we need to migrate data |
457 | - # during a restart. Directory config items have already been picked |
458 | - # up in the previous check. |
459 | + # during a restart. |
460 | storage = relations.StorageRelation() |
461 | if storage.needs_remount(): |
462 | - hookenv.log('Mountpoint changed. Restart and migration required.') |
463 | - restart = True |
464 | + helpers.status_set(hookenv.status_get(), |
465 | + 'New mounts. Waiting for restart permission') |
466 | + return True |
467 | |
468 | - # If our IP address has changed, we need to restart. |
469 | - if config.changed('unit_private_ip'): |
470 | - hookenv.log('Unit IP address changed. Restart required.') |
471 | - restart = True |
472 | + # If any of these config items changed, a restart is required. |
473 | + for key in RESTART_REQUIRED_KEYS: |
474 | + if config.changed(key): |
475 | + hookenv.log('{} changed. Restart required.'.format(key)) |
476 | + for key in RESTART_REQUIRED_KEYS: |
477 | + if config.changed(key): |
478 | + helpers.status_set(hookenv.status_get(), |
479 | + 'Config changes. ' |
480 | + 'Waiting for restart permission.') |
481 | + return True |
482 | |
483 | # If we have new seeds, we should restart. |
484 | - new_seeds = helpers.seed_ips() |
485 | - config['configured_seeds'] = sorted(new_seeds) |
486 | - if config.changed('configured_seeds'): |
487 | + new_seeds = helpers.get_seed_ips() |
488 | + if config.get('configured_seeds') != sorted(new_seeds): |
489 | old_seeds = set(config.previous('configured_seeds') or []) |
490 | changed = old_seeds.symmetric_difference(new_seeds) |
491 | # We don't care about the local node in the changes. |
492 | changed.discard(hookenv.unit_private_ip()) |
493 | if changed: |
494 | - hookenv.log('New seeds {!r}. Restart required.'.format(new_seeds)) |
495 | - restart = True |
496 | - |
497 | - if restart: |
498 | - rollingrestart.request_restart() |
499 | + helpers.status_set(hookenv.status_get(), |
500 | + 'Updated seeds {!r}. ' |
501 | + 'Waiting for restart permission.' |
502 | + ''.format(new_seeds)) |
503 | + return True |
504 | + |
505 | + hookenv.log('Restart not required') |
506 | + return False |
507 | + |
508 | + |
509 | +@action |
510 | +@coordinator.require('restart', needs_restart) |
511 | +def maybe_restart(): |
512 | + '''Restart sequence. |
513 | + |
514 | + If a restart is needed, shut down Cassandra, perform all pending operations |
515 | + that cannot be done while Cassandra is live, and restart it. |
516 | + ''' |
517 | + helpers.status_set('maintenance', 'Stopping Cassandra') |
518 | + helpers.stop_cassandra() |
519 | + helpers.remount_cassandra() |
520 | + helpers.ensure_database_directories() |
521 | + if helpers.peer_relid() and not helpers.is_bootstrapped(): |
522 | + helpers.status_set('maintenance', 'Bootstrapping') |
523 | + else: |
524 | + helpers.status_set('maintenance', 'Starting Cassandra') |
525 | + helpers.start_cassandra() |
526 | + |
527 | + |
528 | +@action |
529 | +def post_bootstrap(): |
530 | + '''Maintain state on if the node has bootstrapped into the cluster. |
531 | + |
532 | + Per documented procedure for adding new units to a cluster, wait 2 |
533 | + minutes if the unit has just bootstrapped to ensure other units |
534 | + do not attempt bootstrap too soon. |
535 | + ''' |
536 | + if not helpers.is_bootstrapped(): |
537 | + if coordinator.relid is not None: |
538 | + helpers.status_set('maintenance', 'Post-bootstrap 2 minute delay') |
539 | + hookenv.log('Post-bootstrap 2 minute delay') |
540 | + time.sleep(120) # Must wait 2 minutes between bootstrapping nodes. |
541 | + |
542 | + # Unconditionally call this to publish the bootstrapped flag to |
543 | + # the peer relation, as the first unit was bootstrapped before |
544 | + # the peer relation existed. |
545 | + helpers.set_bootstrapped() |
546 | |
547 | |
548 | @action |
549 | @@ -443,66 +543,90 @@ |
550 | helpers.start_cassandra() |
551 | |
552 | |
553 | +@leader_only |
554 | @action |
555 | -def ensure_unit_superuser(): |
556 | - helpers.ensure_unit_superuser() |
557 | +def create_unit_superusers(): |
558 | + # The leader creates and updates accounts for nodes, using the |
559 | + # encrypted password they provide in relations.PeerRelation. We |
560 | + # don't end up with unencrypted passwords leaving the unit, and we |
561 | + # don't need to restart Cassandra in no-auth mode which is slow and |
562 | + # I worry may cause issues interrupting the bootstrap. |
563 | + if not coordinator.relid: |
564 | + return # No peer relation, no requests yet. |
565 | + |
566 | + created_units = helpers.get_unit_superusers() |
567 | + uncreated_units = [u for u in hookenv.related_units(coordinator.relid) |
568 | + if u not in created_units] |
569 | + for peer in uncreated_units: |
570 | + rel = hookenv.relation_get(unit=peer, rid=coordinator.relid) |
571 | + username = rel.get('username') |
572 | + pwhash = rel.get('pwhash') |
573 | + if not username: |
574 | + continue |
575 | + hookenv.log('Creating {} account for {}'.format(username, peer)) |
576 | + with helpers.connect() as session: |
577 | + helpers.ensure_user(session, username, pwhash, superuser=True) |
578 | + created_units.add(peer) |
579 | + helpers.set_unit_superusers(created_units) |
580 | |
581 | |
582 | @action |
583 | def reset_all_io_schedulers(): |
584 | - helpers.reset_all_io_schedulers() |
585 | + dirs = helpers.get_all_database_directories() |
586 | + dirs = (dirs['data_file_directories'] + [dirs['commitlog_directory']] + |
587 | + [dirs['saved_caches_directory']]) |
588 | + config = hookenv.config() |
589 | + for d in dirs: |
590 | + if os.path.isdir(d): # Directory may not exist yet. |
591 | + helpers.set_io_scheduler(config['io_scheduler'], d) |
592 | + |
593 | + |
594 | +def _client_credentials(relid): |
595 | + '''Return the client credentials used by relation relid.''' |
596 | + relinfo = hookenv.relation_get(unit=hookenv.local_unit(), rid=relid) |
597 | + username = relinfo.get('username') |
598 | + password = relinfo.get('password') |
599 | + if username is None or password is None: |
600 | + for unit in hookenv.related_units(coordinator.relid): |
601 | + try: |
602 | + relinfo = hookenv.relation_get(unit=unit, rid=relid) |
603 | + username = relinfo.get('username') |
604 | + password = relinfo.get('password') |
605 | + if username is not None and password is not None: |
606 | + return username, password |
607 | + except subprocess.CalledProcessError: |
608 | + pass # Assume the remote unit has not joined yet. |
609 | + return None, None |
610 | + else: |
611 | + return username, password |
612 | |
613 | |
614 | def _publish_database_relation(relid, superuser): |
615 | - # Due to Bug #1409763, this functionality is as action rather than a |
616 | - # provided_data item. |
617 | - # |
618 | # The Cassandra service needs to provide a common set of credentials |
619 | - # to a client unit. Juju does not yet provide a leader so we |
620 | - # need another mechanism for determine which unit will create the |
621 | - # client's account, with the remaining units copying the lead unit's |
622 | - # credentials. For the purposes of this charm, the first unit in |
623 | - # order is considered the leader and creates the user with a random |
624 | - # password. It then tickles the peer relation to ensure the other |
625 | - # units get a hook fired and the opportunity to copy and publish |
626 | - # these credentials. If the lowest numbered unit is removed before |
627 | - # all of the other peers have copied its credentials, then the next |
628 | - # lowest will have either already copied the credentials (and the |
629 | - # remaining peers will use them), or the process starts again and |
630 | - # it will generate new credentials. |
631 | - node_list = list(rollingrestart.get_peers()) + [hookenv.local_unit()] |
632 | - sorted_nodes = sorted(node_list, key=lambda unit: int(unit.split('/')[-1])) |
633 | - first_node = sorted_nodes[0] |
634 | - |
635 | - config = hookenv.config() |
636 | - |
637 | - try: |
638 | - relinfo = hookenv.relation_get(unit=first_node, rid=relid) |
639 | - except subprocess.CalledProcessError: |
640 | - if first_node == hookenv.local_unit(): |
641 | - raise |
642 | - # relation-get may fail if the specified unit has not yet joined |
643 | - # the peer relation, or has just departed. Try again later. |
644 | - return |
645 | - |
646 | - username = relinfo.get('username') |
647 | - password = relinfo.get('password') |
648 | - if hookenv.local_unit() == first_node: |
649 | - # Lowest numbered unit, at least for now. |
650 | - if 'username' not in relinfo: |
651 | - # Credentials unset. Generate them. |
652 | - username = 'juju_{}'.format( |
653 | - relid.replace(':', '_').replace('-', '_')) |
654 | + # to a client unit. The leader creates these, if none of the other |
655 | + # units are found to have published them already (a previously elected |
656 | + # leader may have done this). The leader then tickles the other units, |
657 | + # firing a hook and giving them the opportunity to copy and publish |
658 | + # these credentials. |
659 | + username, password = _client_credentials(relid) |
660 | + if username is None: |
661 | + if hookenv.is_leader(): |
662 | + # Credentials not set. The leader must generate them. We use |
663 | + # the service name so that database permissions remain valid |
664 | + # even after the relation is dropped and recreated, or the |
665 | + # juju environment rebuilt and the database restored from |
666 | + # backups. |
667 | + username = 'juju_{}'.format(helpers.get_service_name(relid)) |
668 | + if superuser: |
669 | + username += '_admin' |
670 | password = host.pwgen() |
671 | - # Wake the other peers, if any. |
672 | - hookenv.relation_set(rollingrestart.get_peer_relation_id(), |
673 | - ping=rollingrestart.utcnow_str()) |
674 | - # Create the account if necessary, and reset the password. |
675 | - # We need to reset the password as another unit may have |
676 | - # rudely changed it thinking they were the lowest numbered |
677 | - # unit. Fix this behavior once juju provides real |
678 | - # leadership. |
679 | - helpers.ensure_user(username, password, superuser) |
680 | + pwhash = helpers.encrypt_password(password) |
681 | + with helpers.connect() as session: |
682 | + helpers.ensure_user(session, username, pwhash, superuser) |
683 | + # Wake the peers, if any. |
684 | + helpers.leader_ping() |
685 | + else: |
686 | + return # No credentials yet. Nothing to do. |
687 | |
688 | # Publish the information the client needs on the relation where |
689 | # they can find it. |
690 | @@ -511,6 +635,7 @@ |
691 | # - cluster_name, so clients can differentiate multiple clusters |
692 | # - datacenter + rack, so clients know what names they can use |
693 | # when altering keyspace replication settings. |
694 | + config = hookenv.config() |
695 | hookenv.relation_set(relid, |
696 | username=username, password=password, |
697 | host=hookenv.unit_public_ip(), |
698 | @@ -546,53 +671,13 @@ |
699 | |
700 | |
701 | @action |
702 | -def emit_describe_cluster(): |
703 | - '''Spam 'nodetool describecluster' into the logs.''' |
704 | +def emit_cluster_info(): |
705 | helpers.emit_describe_cluster() |
706 | - |
707 | - |
708 | -@action |
709 | -def emit_auth_keyspace_status(): |
710 | - '''Spam 'nodetool status system_auth' into the logs.''' |
711 | helpers.emit_auth_keyspace_status() |
712 | - |
713 | - |
714 | -@action |
715 | -def emit_netstats(): |
716 | - '''Spam 'nodetool netstats' into the logs.''' |
717 | helpers.emit_netstats() |
718 | |
719 | |
720 | @action |
721 | -def shutdown_before_joining_peers(): |
722 | - '''Shutdown the node before opening firewall ports for our peers. |
723 | - |
724 | - When the peer relation is first joined, the node has already been |
725 | - setup and is running as a standalone cluster. This is problematic |
726 | - when the peer relation has been formed, as if it is left running |
727 | - peers may conect to it and initiate replication before this node |
728 | - has been properly reset and bootstrapped. To avoid this, we |
729 | - shutdown the node before opening firewall ports to any peers. The |
730 | - firewall can then be opened, and peers will not find a node here |
731 | - until it starts its bootstrapping. |
732 | - ''' |
733 | - relname = rollingrestart.get_peer_relation_name() |
734 | - if hookenv.hook_name() == '{}-relation-joined'.format(relname): |
735 | - if not helpers.is_bootstrapped(): |
736 | - helpers.stop_cassandra(immediate=True) |
737 | - |
738 | - |
739 | -@action |
740 | -def grant_ssh_access(): |
741 | - '''Grant SSH access to run nodetool on remote nodes. |
742 | - |
743 | - This is easier than setting up remote JMX access, and more secure. |
744 | - ''' |
745 | - unison.ssh_authorized_peers(rollingrestart.get_peer_relation_name(), |
746 | - 'juju_ssh', ensure_local_user=True) |
747 | - |
748 | - |
749 | -@action |
750 | def configure_firewall(): |
751 | '''Configure firewall rules using ufw. |
752 | |
753 | @@ -632,13 +717,9 @@ |
754 | |
755 | # Rules for peers |
756 | for relinfo in hookenv.relations_of_type('cluster'): |
757 | - for port in peer_ports: |
758 | - desired_rules.add((relinfo['private-address'], 'any', port)) |
759 | - |
760 | - # External seeds also need access. |
761 | - for seed_ip in helpers.seed_ips(): |
762 | - for port in peer_ports: |
763 | - desired_rules.add((seed_ip, 'any', port)) |
764 | + if relinfo['private-address']: |
765 | + for port in peer_ports: |
766 | + desired_rules.add((relinfo['private-address'], 'any', port)) |
767 | |
768 | previous_rules = set(tuple(rule) for rule in config.get('ufw_rules', [])) |
769 | |
770 | @@ -683,8 +764,7 @@ |
771 | description="Check Cassandra Heap", |
772 | check_cmd="check_cassandra_heap.sh {} {} {}" |
773 | "".format(hookenv.unit_private_ip(), cassandra_heap_warn, |
774 | - cassandra_heap_crit) |
775 | - ) |
776 | + cassandra_heap_crit)) |
777 | |
778 | cassandra_disk_warn = conf.get('nagios_disk_warn_pct') |
779 | cassandra_disk_crit = conf.get('nagios_disk_crit_pct') |
780 | @@ -699,7 +779,116 @@ |
781 | description="Check Cassandra Disk {}".format(disk), |
782 | check_cmd="check_disk -u GB -w {}% -c {}% -K 5% -p {}" |
783 | "".format(cassandra_disk_warn, cassandra_disk_crit, |
784 | - disk) |
785 | - ) |
786 | - |
787 | + disk)) |
788 | nrpe_compat.write() |
789 | + |
790 | + |
791 | +@leader_only |
792 | +@action |
793 | +def maintain_seeds(): |
794 | + '''The leader needs to maintain the list of seed nodes''' |
795 | + seed_ips = helpers.get_seed_ips() |
796 | + hookenv.log('Current seeds == {!r}'.format(seed_ips), DEBUG) |
797 | + |
798 | + bootstrapped_ips = helpers.get_bootstrapped_ips() |
799 | + hookenv.log('Bootstrapped == {!r}'.format(bootstrapped_ips), DEBUG) |
800 | + |
801 | + # Remove any seeds that are no longer bootstrapped, such as dropped |
802 | + # units. |
803 | + seed_ips.intersection_update(bootstrapped_ips) |
804 | + |
805 | + # Add more bootstrapped nodes, if necessary, to get to our maximum |
806 | + # of 3 seeds. |
807 | + potential_seed_ips = list(reversed(sorted(bootstrapped_ips))) |
808 | + while len(seed_ips) < 3 and potential_seed_ips: |
809 | + seed_ips.add(potential_seed_ips.pop()) |
810 | + |
811 | + # If there are no seeds or bootstrapped nodes, start with the leader. Us. |
812 | + if len(seed_ips) == 0: |
813 | + seed_ips.add(hookenv.unit_private_ip()) |
814 | + |
815 | + hookenv.log('Updated seeds == {!r}'.format(seed_ips), DEBUG) |
816 | + |
817 | + hookenv.leader_set(seeds=','.join(sorted(seed_ips))) |
818 | + |
819 | + |
820 | +@leader_only |
821 | +@action |
822 | +def reset_default_password(): |
823 | + if hookenv.leader_get('default_admin_password_changed'): |
824 | + hookenv.log('Default admin password already changed') |
825 | + return |
826 | + |
827 | + # Cassandra ships with well known credentials, rather than |
828 | + # providing a tool to reset credentials. This is a huge security |
829 | + # hole we must close. |
830 | + try: |
831 | + # We need a big timeout here, as the cassandra user actually |
832 | + # springs into existence some time after Cassandra has started |
833 | + # up and is accepting connections. |
834 | + with helpers.connect('cassandra', 'cassandra', |
835 | + timeout=120, auth_timeout=120) as session: |
836 | + # But before we close this security hole, we need to use these |
837 | + # credentials to create a different admin account for the |
838 | + # leader, allowing it to create accounts for other nodes as they |
839 | + # join. The alternative is restarting Cassandra without |
840 | + # authentication, which this charm will likely need to do in the |
841 | + # future when we allow Cassandra services to be related together. |
842 | + helpers.status_set('maintenance', |
843 | + 'Creating initial superuser account') |
844 | + username, password = helpers.superuser_credentials() |
845 | + pwhash = helpers.encrypt_password(password) |
846 | + helpers.ensure_user(session, username, pwhash, superuser=True) |
847 | + helpers.set_unit_superusers([hookenv.local_unit()]) |
848 | + |
849 | + helpers.status_set('maintenance', |
850 | + 'Changing default admin password') |
851 | + helpers.query(session, 'ALTER USER cassandra WITH PASSWORD %s', |
852 | + cassandra.ConsistencyLevel.ALL, (host.pwgen(),)) |
853 | + except cassandra.AuthenticationFailed: |
854 | + hookenv.log('Default superuser account already reset') |
855 | + try: |
856 | + with helpers.connect(): |
857 | + hookenv.log("Leader's superuser account already created") |
858 | + except cassandra.AuthenticationFailed: |
859 | + # We have no known superuser credentials. Create the account |
860 | + # the hard, slow way. This will be the normal method |
861 | + # of creating the service's initial account when we allow |
862 | + # services to be related together. |
863 | + helpers.create_unit_superuser_hard() |
864 | + |
865 | + hookenv.leader_set(default_admin_password_changed=True) |
866 | + |
867 | + |
868 | +@action |
869 | +def set_active(): |
870 | + # If we got this far, the unit is active. Update the status if it is |
871 | + # not already active. We don't do this unconditionally, as the charm |
872 | + # may be active but doing stuff, like active but waiting for restart |
873 | + # permission. |
874 | + if hookenv.status_get() != 'active': |
875 | + helpers.set_active() |
876 | + else: |
877 | + hookenv.log('Unit status already active', DEBUG) |
878 | + |
879 | + |
880 | +@action |
881 | +def request_unit_superuser(): |
882 | + relid = helpers.peer_relid() |
883 | + if relid is None: |
884 | + hookenv.log('Request deferred until peer relation exists') |
885 | + return |
886 | + |
887 | + relinfo = hookenv.relation_get(unit=hookenv.local_unit(), |
888 | + rid=relid) |
889 | + if relinfo and relinfo.get('username'): |
890 | + # We must avoid blindly setting the pwhash on the relation, |
891 | + # as we will likely get a different value every time we |
892 | + # encrypt the password due to the random salt. |
893 | + hookenv.log('Superuser account request previously made') |
894 | + else: |
895 | + # Publish the requested superuser and hash to our peers. |
896 | + username, password = helpers.superuser_credentials() |
897 | + pwhash = helpers.encrypt_password(password) |
898 | + hookenv.relation_set(relid, username=username, pwhash=pwhash) |
899 | + hookenv.log('Requested superuser account creation') |
900 | |
901 | === added directory 'hooks/charmhelpers/contrib/benchmark' |
902 | === added file 'hooks/charmhelpers/contrib/benchmark/__init__.py' |
903 | --- hooks/charmhelpers/contrib/benchmark/__init__.py 1970-01-01 00:00:00 +0000 |
904 | +++ hooks/charmhelpers/contrib/benchmark/__init__.py 2015-06-30 07:36:26 +0000 |
905 | @@ -0,0 +1,124 @@ |
906 | +# Copyright 2014-2015 Canonical Limited. |
907 | +# |
908 | +# This file is part of charm-helpers. |
909 | +# |
910 | +# charm-helpers is free software: you can redistribute it and/or modify |
911 | +# it under the terms of the GNU Lesser General Public License version 3 as |
912 | +# published by the Free Software Foundation. |
913 | +# |
914 | +# charm-helpers is distributed in the hope that it will be useful, |
915 | +# but WITHOUT ANY WARRANTY; without even the implied warranty of |
916 | +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
917 | +# GNU Lesser General Public License for more details. |
918 | +# |
919 | +# You should have received a copy of the GNU Lesser General Public License |
920 | +# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>. |
921 | + |
922 | +import subprocess |
923 | +import time |
924 | +import os |
925 | +from distutils.spawn import find_executable |
926 | + |
927 | +from charmhelpers.core.hookenv import ( |
928 | + in_relation_hook, |
929 | + relation_ids, |
930 | + relation_set, |
931 | + relation_get, |
932 | +) |
933 | + |
934 | + |
935 | +def action_set(key, val): |
936 | + if find_executable('action-set'): |
937 | + action_cmd = ['action-set'] |
938 | + |
939 | + if isinstance(val, dict): |
940 | + for k, v in iter(val.items()): |
941 | + action_set('%s.%s' % (key, k), v) |
942 | + return True |
943 | + |
944 | + action_cmd.append('%s=%s' % (key, val)) |
945 | + subprocess.check_call(action_cmd) |
946 | + return True |
947 | + return False |
948 | + |
949 | + |
950 | +class Benchmark(): |
951 | + """ |
952 | + Helper class for the `benchmark` interface. |
953 | + |
954 | + :param list actions: Define the actions that are also benchmarks |
955 | + |
956 | + From inside the benchmark-relation-changed hook, you would |
957 | + Benchmark(['memory', 'cpu', 'disk', 'smoke', 'custom']) |
958 | + |
959 | + Examples: |
960 | + |
961 | + siege = Benchmark(['siege']) |
962 | + siege.start() |
963 | + [... run siege ...] |
964 | + # The higher the score, the better the benchmark |
965 | + siege.set_composite_score(16.70, 'trans/sec', 'desc') |
966 | + siege.finish() |
967 | + |
968 | + |
969 | + """ |
970 | + |
971 | + required_keys = [ |
972 | + 'hostname', |
973 | + 'port', |
974 | + 'graphite_port', |
975 | + 'graphite_endpoint', |
976 | + 'api_port' |
977 | + ] |
978 | + |
979 | + def __init__(self, benchmarks=None): |
980 | + if in_relation_hook(): |
981 | + if benchmarks is not None: |
982 | + for rid in sorted(relation_ids('benchmark')): |
983 | + relation_set(relation_id=rid, relation_settings={ |
984 | + 'benchmarks': ",".join(benchmarks) |
985 | + }) |
986 | + |
987 | + # Check the relation data |
988 | + config = {} |
989 | + for key in self.required_keys: |
990 | + val = relation_get(key) |
991 | + if val is not None: |
992 | + config[key] = val |
993 | + else: |
994 | + # We don't have all of the required keys |
995 | + config = {} |
996 | + break |
997 | + |
998 | + if len(config): |
999 | + with open('/etc/benchmark.conf', 'w') as f: |
1000 | + for key, val in iter(config.items()): |
1001 | + f.write("%s=%s\n" % (key, val)) |
1002 | + |
1003 | + @staticmethod |
1004 | + def start(): |
1005 | + action_set('meta.start', time.strftime('%Y-%m-%dT%H:%M:%SZ')) |
1006 | + |
1007 | + """ |
1008 | + If the collectd charm is also installed, tell it to send a snapshot |
1009 | + of the current profile data. |
1010 | + """ |
1011 | + COLLECT_PROFILE_DATA = '/usr/local/bin/collect-profile-data' |
1012 | + if os.path.exists(COLLECT_PROFILE_DATA): |
1013 | + subprocess.check_output([COLLECT_PROFILE_DATA]) |
1014 | + |
1015 | + @staticmethod |
1016 | + def finish(): |
1017 | + action_set('meta.stop', time.strftime('%Y-%m-%dT%H:%M:%SZ')) |
1018 | + |
1019 | + @staticmethod |
1020 | + def set_composite_score(value, units, direction='asc'): |
1021 | + """ |
1022 | + Set the composite score for a benchmark run. This is a single number |
1023 | + representative of the benchmark results. This could be the most |
1024 | + important metric, or an amalgamation of metric scores. |
1025 | + """ |
1026 | + return action_set( |
1027 | + "meta.composite", |
1028 | + {'value': value, 'units': units, 'direction': direction} |
1029 | + ) |
1030 | |
1031 | === modified file 'hooks/charmhelpers/contrib/charmsupport/nrpe.py' |
1032 | --- hooks/charmhelpers/contrib/charmsupport/nrpe.py 2015-04-03 15:42:21 +0000 |
1033 | +++ hooks/charmhelpers/contrib/charmsupport/nrpe.py 2015-06-30 07:36:26 +0000 |
1034 | @@ -247,7 +247,9 @@ |
1035 | |
1036 | service('restart', 'nagios-nrpe-server') |
1037 | |
1038 | - for rid in relation_ids("local-monitors"): |
1039 | + monitor_ids = relation_ids("local-monitors") + \ |
1040 | + relation_ids("nrpe-external-master") |
1041 | + for rid in monitor_ids: |
1042 | relation_set(relation_id=rid, monitors=yaml.dump(monitors)) |
1043 | |
1044 | |
1045 | |
1046 | === removed directory 'hooks/charmhelpers/contrib/peerstorage' |
1047 | === removed file 'hooks/charmhelpers/contrib/peerstorage/__init__.py' |
1048 | --- hooks/charmhelpers/contrib/peerstorage/__init__.py 2015-01-26 13:07:31 +0000 |
1049 | +++ hooks/charmhelpers/contrib/peerstorage/__init__.py 1970-01-01 00:00:00 +0000 |
1050 | @@ -1,150 +0,0 @@ |
1051 | -# Copyright 2014-2015 Canonical Limited. |
1052 | -# |
1053 | -# This file is part of charm-helpers. |
1054 | -# |
1055 | -# charm-helpers is free software: you can redistribute it and/or modify |
1056 | -# it under the terms of the GNU Lesser General Public License version 3 as |
1057 | -# published by the Free Software Foundation. |
1058 | -# |
1059 | -# charm-helpers is distributed in the hope that it will be useful, |
1060 | -# but WITHOUT ANY WARRANTY; without even the implied warranty of |
1061 | -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
1062 | -# GNU Lesser General Public License for more details. |
1063 | -# |
1064 | -# You should have received a copy of the GNU Lesser General Public License |
1065 | -# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>. |
1066 | - |
1067 | -import six |
1068 | -from charmhelpers.core.hookenv import relation_id as current_relation_id |
1069 | -from charmhelpers.core.hookenv import ( |
1070 | - is_relation_made, |
1071 | - relation_ids, |
1072 | - relation_get, |
1073 | - local_unit, |
1074 | - relation_set, |
1075 | -) |
1076 | - |
1077 | - |
1078 | -""" |
1079 | -This helper provides functions to support use of a peer relation |
1080 | -for basic key/value storage, with the added benefit that all storage |
1081 | -can be replicated across peer units. |
1082 | - |
1083 | -Requirement to use: |
1084 | - |
1085 | -To use this, the "peer_echo()" method has to be called form the peer |
1086 | -relation's relation-changed hook: |
1087 | - |
1088 | -@hooks.hook("cluster-relation-changed") # Adapt the to your peer relation name |
1089 | -def cluster_relation_changed(): |
1090 | - peer_echo() |
1091 | - |
1092 | -Once this is done, you can use peer storage from anywhere: |
1093 | - |
1094 | -@hooks.hook("some-hook") |
1095 | -def some_hook(): |
1096 | - # You can store and retrieve key/values this way: |
1097 | - if is_relation_made("cluster"): # from charmhelpers.core.hookenv |
1098 | - # There are peers available so we can work with peer storage |
1099 | - peer_store("mykey", "myvalue") |
1100 | - value = peer_retrieve("mykey") |
1101 | - print value |
1102 | - else: |
1103 | - print "No peers joind the relation, cannot share key/values :(" |
1104 | -""" |
1105 | - |
1106 | - |
1107 | -def peer_retrieve(key, relation_name='cluster'): |
1108 | - """Retrieve a named key from peer relation `relation_name`.""" |
1109 | - cluster_rels = relation_ids(relation_name) |
1110 | - if len(cluster_rels) > 0: |
1111 | - cluster_rid = cluster_rels[0] |
1112 | - return relation_get(attribute=key, rid=cluster_rid, |
1113 | - unit=local_unit()) |
1114 | - else: |
1115 | - raise ValueError('Unable to detect' |
1116 | - 'peer relation {}'.format(relation_name)) |
1117 | - |
1118 | - |
1119 | -def peer_retrieve_by_prefix(prefix, relation_name='cluster', delimiter='_', |
1120 | - inc_list=None, exc_list=None): |
1121 | - """ Retrieve k/v pairs given a prefix and filter using {inc,exc}_list """ |
1122 | - inc_list = inc_list if inc_list else [] |
1123 | - exc_list = exc_list if exc_list else [] |
1124 | - peerdb_settings = peer_retrieve('-', relation_name=relation_name) |
1125 | - matched = {} |
1126 | - if peerdb_settings is None: |
1127 | - return matched |
1128 | - for k, v in peerdb_settings.items(): |
1129 | - full_prefix = prefix + delimiter |
1130 | - if k.startswith(full_prefix): |
1131 | - new_key = k.replace(full_prefix, '') |
1132 | - if new_key in exc_list: |
1133 | - continue |
1134 | - if new_key in inc_list or len(inc_list) == 0: |
1135 | - matched[new_key] = v |
1136 | - return matched |
1137 | - |
1138 | - |
1139 | -def peer_store(key, value, relation_name='cluster'): |
1140 | - """Store the key/value pair on the named peer relation `relation_name`.""" |
1141 | - cluster_rels = relation_ids(relation_name) |
1142 | - if len(cluster_rels) > 0: |
1143 | - cluster_rid = cluster_rels[0] |
1144 | - relation_set(relation_id=cluster_rid, |
1145 | - relation_settings={key: value}) |
1146 | - else: |
1147 | - raise ValueError('Unable to detect ' |
1148 | - 'peer relation {}'.format(relation_name)) |
1149 | - |
1150 | - |
1151 | -def peer_echo(includes=None): |
1152 | - """Echo filtered attributes back onto the same relation for storage. |
1153 | - |
1154 | - This is a requirement to use the peerstorage module - it needs to be called |
1155 | - from the peer relation's changed hook. |
1156 | - """ |
1157 | - rdata = relation_get() |
1158 | - echo_data = {} |
1159 | - if includes is None: |
1160 | - echo_data = rdata.copy() |
1161 | - for ex in ['private-address', 'public-address']: |
1162 | - if ex in echo_data: |
1163 | - echo_data.pop(ex) |
1164 | - else: |
1165 | - for attribute, value in six.iteritems(rdata): |
1166 | - for include in includes: |
1167 | - if include in attribute: |
1168 | - echo_data[attribute] = value |
1169 | - if len(echo_data) > 0: |
1170 | - relation_set(relation_settings=echo_data) |
1171 | - |
1172 | - |
1173 | -def peer_store_and_set(relation_id=None, peer_relation_name='cluster', |
1174 | - peer_store_fatal=False, relation_settings=None, |
1175 | - delimiter='_', **kwargs): |
1176 | - """Store passed-in arguments both in argument relation and in peer storage. |
1177 | - |
1178 | - It functions like doing relation_set() and peer_store() at the same time, |
1179 | - with the same data. |
1180 | - |
1181 | - @param relation_id: the id of the relation to store the data on. Defaults |
1182 | - to the current relation. |
1183 | - @param peer_store_fatal: Set to True, the function will raise an exception |
1184 | - should the peer sotrage not be avialable.""" |
1185 | - |
1186 | - relation_settings = relation_settings if relation_settings else {} |
1187 | - relation_set(relation_id=relation_id, |
1188 | - relation_settings=relation_settings, |
1189 | - **kwargs) |
1190 | - if is_relation_made(peer_relation_name): |
1191 | - for key, value in six.iteritems(dict(list(kwargs.items()) + |
1192 | - list(relation_settings.items()))): |
1193 | - key_prefix = relation_id or current_relation_id() |
1194 | - peer_store(key_prefix + delimiter + key, |
1195 | - value, |
1196 | - relation_name=peer_relation_name) |
1197 | - else: |
1198 | - if peer_store_fatal: |
1199 | - raise ValueError('Unable to detect ' |
1200 | - 'peer relation {}'.format(peer_relation_name)) |
1201 | |
1202 | === removed directory 'hooks/charmhelpers/contrib/unison' |
1203 | === removed file 'hooks/charmhelpers/contrib/unison/__init__.py' |
1204 | --- hooks/charmhelpers/contrib/unison/__init__.py 2015-04-03 15:42:21 +0000 |
1205 | +++ hooks/charmhelpers/contrib/unison/__init__.py 1970-01-01 00:00:00 +0000 |
1206 | @@ -1,298 +0,0 @@ |
1207 | -# Copyright 2014-2015 Canonical Limited. |
1208 | -# |
1209 | -# This file is part of charm-helpers. |
1210 | -# |
1211 | -# charm-helpers is free software: you can redistribute it and/or modify |
1212 | -# it under the terms of the GNU Lesser General Public License version 3 as |
1213 | -# published by the Free Software Foundation. |
1214 | -# |
1215 | -# charm-helpers is distributed in the hope that it will be useful, |
1216 | -# but WITHOUT ANY WARRANTY; without even the implied warranty of |
1217 | -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
1218 | -# GNU Lesser General Public License for more details. |
1219 | -# |
1220 | -# You should have received a copy of the GNU Lesser General Public License |
1221 | -# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>. |
1222 | - |
1223 | -# Easy file synchronization among peer units using ssh + unison. |
1224 | -# |
1225 | -# From *both* peer relation -joined and -changed, add a call to |
1226 | -# ssh_authorized_peers() describing the peer relation and the desired |
1227 | -# user + group. After all peer relations have settled, all hosts should |
1228 | -# be able to connect to on another via key auth'd ssh as the specified user. |
1229 | -# |
1230 | -# Other hooks are then free to synchronize files and directories using |
1231 | -# sync_to_peers(). |
1232 | -# |
1233 | -# For a peer relation named 'cluster', for example: |
1234 | -# |
1235 | -# cluster-relation-joined: |
1236 | -# ... |
1237 | -# ssh_authorized_peers(peer_interface='cluster', |
1238 | -# user='juju_ssh', group='juju_ssh', |
1239 | -# ensure_user=True) |
1240 | -# ... |
1241 | -# |
1242 | -# cluster-relation-changed: |
1243 | -# ... |
1244 | -# ssh_authorized_peers(peer_interface='cluster', |
1245 | -# user='juju_ssh', group='juju_ssh', |
1246 | -# ensure_user=True) |
1247 | -# ... |
1248 | -# |
1249 | -# Hooks are now free to sync files as easily as: |
1250 | -# |
1251 | -# files = ['/etc/fstab', '/etc/apt.conf.d/'] |
1252 | -# sync_to_peers(peer_interface='cluster', |
1253 | -# user='juju_ssh, paths=[files]) |
1254 | -# |
1255 | -# It is assumed the charm itself has setup permissions on each unit |
1256 | -# such that 'juju_ssh' has read + write permissions. Also assumed |
1257 | -# that the calling charm takes care of leader delegation. |
1258 | -# |
1259 | -# Additionally files can be synchronized only to an specific unit: |
1260 | -# sync_to_peer(slave_address, user='juju_ssh', |
1261 | -# paths=[files], verbose=False) |
1262 | - |
1263 | -import os |
1264 | -import pwd |
1265 | - |
1266 | -from copy import copy |
1267 | -from subprocess import check_call, check_output |
1268 | - |
1269 | -from charmhelpers.core.host import ( |
1270 | - adduser, |
1271 | - add_user_to_group, |
1272 | - pwgen, |
1273 | -) |
1274 | - |
1275 | -from charmhelpers.core.hookenv import ( |
1276 | - log, |
1277 | - hook_name, |
1278 | - relation_ids, |
1279 | - related_units, |
1280 | - relation_set, |
1281 | - relation_get, |
1282 | - unit_private_ip, |
1283 | - INFO, |
1284 | - ERROR, |
1285 | -) |
1286 | - |
1287 | -BASE_CMD = ['unison', '-auto', '-batch=true', '-confirmbigdel=false', |
1288 | - '-fastcheck=true', '-group=false', '-owner=false', |
1289 | - '-prefer=newer', '-times=true'] |
1290 | - |
1291 | - |
1292 | -def get_homedir(user): |
1293 | - try: |
1294 | - user = pwd.getpwnam(user) |
1295 | - return user.pw_dir |
1296 | - except KeyError: |
1297 | - log('Could not get homedir for user %s: user exists?' % (user), ERROR) |
1298 | - raise Exception |
1299 | - |
1300 | - |
1301 | -def create_private_key(user, priv_key_path): |
1302 | - if not os.path.isfile(priv_key_path): |
1303 | - log('Generating new SSH key for user %s.' % user) |
1304 | - cmd = ['ssh-keygen', '-q', '-N', '', '-t', 'rsa', '-b', '2048', |
1305 | - '-f', priv_key_path] |
1306 | - check_call(cmd) |
1307 | - else: |
1308 | - log('SSH key already exists at %s.' % priv_key_path) |
1309 | - check_call(['chown', user, priv_key_path]) |
1310 | - check_call(['chmod', '0600', priv_key_path]) |
1311 | - |
1312 | - |
1313 | -def create_public_key(user, priv_key_path, pub_key_path): |
1314 | - if not os.path.isfile(pub_key_path): |
1315 | - log('Generating missing ssh public key @ %s.' % pub_key_path) |
1316 | - cmd = ['ssh-keygen', '-y', '-f', priv_key_path] |
1317 | - p = check_output(cmd).strip() |
1318 | - with open(pub_key_path, 'wb') as out: |
1319 | - out.write(p) |
1320 | - check_call(['chown', user, pub_key_path]) |
1321 | - |
1322 | - |
1323 | -def get_keypair(user): |
1324 | - home_dir = get_homedir(user) |
1325 | - ssh_dir = os.path.join(home_dir, '.ssh') |
1326 | - priv_key = os.path.join(ssh_dir, 'id_rsa') |
1327 | - pub_key = '%s.pub' % priv_key |
1328 | - |
1329 | - if not os.path.isdir(ssh_dir): |
1330 | - os.mkdir(ssh_dir) |
1331 | - check_call(['chown', '-R', user, ssh_dir]) |
1332 | - |
1333 | - create_private_key(user, priv_key) |
1334 | - create_public_key(user, priv_key, pub_key) |
1335 | - |
1336 | - with open(priv_key, 'r') as p: |
1337 | - _priv = p.read().strip() |
1338 | - |
1339 | - with open(pub_key, 'r') as p: |
1340 | - _pub = p.read().strip() |
1341 | - |
1342 | - return (_priv, _pub) |
1343 | - |
1344 | - |
1345 | -def write_authorized_keys(user, keys): |
1346 | - home_dir = get_homedir(user) |
1347 | - ssh_dir = os.path.join(home_dir, '.ssh') |
1348 | - auth_keys = os.path.join(ssh_dir, 'authorized_keys') |
1349 | - log('Syncing authorized_keys @ %s.' % auth_keys) |
1350 | - with open(auth_keys, 'w') as out: |
1351 | - for k in keys: |
1352 | - out.write('%s\n' % k) |
1353 | - |
1354 | - |
1355 | -def write_known_hosts(user, hosts): |
1356 | - home_dir = get_homedir(user) |
1357 | - ssh_dir = os.path.join(home_dir, '.ssh') |
1358 | - known_hosts = os.path.join(ssh_dir, 'known_hosts') |
1359 | - khosts = [] |
1360 | - for host in hosts: |
1361 | - cmd = ['ssh-keyscan', '-H', '-t', 'rsa', host] |
1362 | - remote_key = check_output(cmd, universal_newlines=True).strip() |
1363 | - khosts.append(remote_key) |
1364 | - log('Syncing known_hosts @ %s.' % known_hosts) |
1365 | - with open(known_hosts, 'w') as out: |
1366 | - for host in khosts: |
1367 | - out.write('%s\n' % host) |
1368 | - |
1369 | - |
1370 | -def ensure_user(user, group=None): |
1371 | - adduser(user, pwgen()) |
1372 | - if group: |
1373 | - add_user_to_group(user, group) |
1374 | - |
1375 | - |
1376 | -def ssh_authorized_peers(peer_interface, user, group=None, |
1377 | - ensure_local_user=False): |
1378 | - """ |
1379 | - Main setup function, should be called from both peer -changed and -joined |
1380 | - hooks with the same parameters. |
1381 | - """ |
1382 | - if ensure_local_user: |
1383 | - ensure_user(user, group) |
1384 | - priv_key, pub_key = get_keypair(user) |
1385 | - hook = hook_name() |
1386 | - if hook == '%s-relation-joined' % peer_interface: |
1387 | - relation_set(ssh_pub_key=pub_key) |
1388 | - elif hook == '%s-relation-changed' % peer_interface: |
1389 | - hosts = [] |
1390 | - keys = [] |
1391 | - |
1392 | - for r_id in relation_ids(peer_interface): |
1393 | - for unit in related_units(r_id): |
1394 | - ssh_pub_key = relation_get('ssh_pub_key', |
1395 | - rid=r_id, |
1396 | - unit=unit) |
1397 | - priv_addr = relation_get('private-address', |
1398 | - rid=r_id, |
1399 | - unit=unit) |
1400 | - if ssh_pub_key: |
1401 | - keys.append(ssh_pub_key) |
1402 | - hosts.append(priv_addr) |
1403 | - else: |
1404 | - log('ssh_authorized_peers(): ssh_pub_key ' |
1405 | - 'missing for unit %s, skipping.' % unit) |
1406 | - write_authorized_keys(user, keys) |
1407 | - write_known_hosts(user, hosts) |
1408 | - authed_hosts = ':'.join(hosts) |
1409 | - relation_set(ssh_authorized_hosts=authed_hosts) |
1410 | - |
1411 | - |
1412 | -def _run_as_user(user, gid=None): |
1413 | - try: |
1414 | - user = pwd.getpwnam(user) |
1415 | - except KeyError: |
1416 | - log('Invalid user: %s' % user) |
1417 | - raise Exception |
1418 | - uid = user.pw_uid |
1419 | - gid = gid or user.pw_gid |
1420 | - os.environ['HOME'] = user.pw_dir |
1421 | - |
1422 | - def _inner(): |
1423 | - os.setgid(gid) |
1424 | - os.setuid(uid) |
1425 | - return _inner |
1426 | - |
1427 | - |
1428 | -def run_as_user(user, cmd, gid=None): |
1429 | - return check_output(cmd, preexec_fn=_run_as_user(user, gid), cwd='/') |
1430 | - |
1431 | - |
1432 | -def collect_authed_hosts(peer_interface): |
1433 | - '''Iterate through the units on the peer interface to find all that |
1434 | - have the calling host in their authorized hosts list''' |
1435 | - hosts = [] |
1436 | - for r_id in (relation_ids(peer_interface) or []): |
1437 | - for unit in related_units(r_id): |
1438 | - private_addr = relation_get('private-address', |
1439 | - rid=r_id, unit=unit) |
1440 | - authed_hosts = relation_get('ssh_authorized_hosts', |
1441 | - rid=r_id, unit=unit) |
1442 | - |
1443 | - if not authed_hosts: |
1444 | - log('Peer %s has not authorized *any* hosts yet, skipping.' % |
1445 | - (unit), level=INFO) |
1446 | - continue |
1447 | - |
1448 | - if unit_private_ip() in authed_hosts.split(':'): |
1449 | - hosts.append(private_addr) |
1450 | - else: |
1451 | - log('Peer %s has not authorized *this* host yet, skipping.' % |
1452 | - (unit), level=INFO) |
1453 | - return hosts |
1454 | - |
1455 | - |
1456 | -def sync_path_to_host(path, host, user, verbose=False, cmd=None, gid=None, |
1457 | - fatal=False): |
1458 | - """Sync path to an specific peer host |
1459 | - |
1460 | - Propagates exception if operation fails and fatal=True. |
1461 | - """ |
1462 | - cmd = cmd or copy(BASE_CMD) |
1463 | - if not verbose: |
1464 | - cmd.append('-silent') |
1465 | - |
1466 | - # removing trailing slash from directory paths, unison |
1467 | - # doesn't like these. |
1468 | - if path.endswith('/'): |
1469 | - path = path[:(len(path) - 1)] |
1470 | - |
1471 | - cmd = cmd + [path, 'ssh://%s@%s/%s' % (user, host, path)] |
1472 | - |
1473 | - try: |
1474 | - log('Syncing local path %s to %s@%s:%s' % (path, user, host, path)) |
1475 | - run_as_user(user, cmd, gid) |
1476 | - except: |
1477 | - log('Error syncing remote files') |
1478 | - if fatal: |
1479 | - raise |
1480 | - |
1481 | - |
1482 | -def sync_to_peer(host, user, paths=None, verbose=False, cmd=None, gid=None, |
1483 | - fatal=False): |
1484 | - """Sync paths to an specific peer host |
1485 | - |
1486 | - Propagates exception if any operation fails and fatal=True. |
1487 | - """ |
1488 | - if paths: |
1489 | - for p in paths: |
1490 | - sync_path_to_host(p, host, user, verbose, cmd, gid, fatal) |
1491 | - |
1492 | - |
1493 | -def sync_to_peers(peer_interface, user, paths=None, verbose=False, cmd=None, |
1494 | - gid=None, fatal=False): |
1495 | - """Sync all hosts to an specific path |
1496 | - |
1497 | - The type of group is integer, it allows user has permissions to |
1498 | - operate a directory have a different group id with the user id. |
1499 | - |
1500 | - Propagates exception if any operation fails and fatal=True. |
1501 | - """ |
1502 | - if paths: |
1503 | - for host in collect_authed_hosts(peer_interface): |
1504 | - sync_to_peer(host, user, paths, verbose, cmd, gid, fatal) |
1505 | |
1506 | === added file 'hooks/charmhelpers/coordinator.py' |
1507 | --- hooks/charmhelpers/coordinator.py 1970-01-01 00:00:00 +0000 |
1508 | +++ hooks/charmhelpers/coordinator.py 2015-06-30 07:36:26 +0000 |
1509 | @@ -0,0 +1,607 @@ |
1510 | +# Copyright 2014-2015 Canonical Limited. |
1511 | +# |
1512 | +# This file is part of charm-helpers. |
1513 | +# |
1514 | +# charm-helpers is free software: you can redistribute it and/or modify |
1515 | +# it under the terms of the GNU Lesser General Public License version 3 as |
1516 | +# published by the Free Software Foundation. |
1517 | +# |
1518 | +# charm-helpers is distributed in the hope that it will be useful, |
1519 | +# but WITHOUT ANY WARRANTY; without even the implied warranty of |
1520 | +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
1521 | +# GNU Lesser General Public License for more details. |
1522 | +# |
1523 | +# You should have received a copy of the GNU Lesser General Public License |
1524 | +# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>. |
1525 | +''' |
1526 | +The coordinator module allows you to use Juju's leadership feature to |
1527 | +coordinate operations between units of a service. |
1528 | + |
1529 | +Behavior is defined in subclasses of coordinator.BaseCoordinator. |
1530 | +One implementation is provided (coordinator.Serial), which allows an |
1531 | +operation to be run on a single unit at a time, on a first come, first |
1532 | +served basis. You can trivially define more complex behavior by |
1533 | +subclassing BaseCoordinator or Serial. |
1534 | + |
1535 | +:author: Stuart Bishop <stuart.bishop@canonical.com> |
1536 | + |
1537 | + |
1538 | +Services Framework Usage |
1539 | +======================== |
1540 | + |
1541 | +Ensure a peer relation is defined in metadata.yaml. Instantiate a |
1542 | +BaseCoordinator subclass before invoking ServiceManager.manage(). |
1543 | +Ensure that ServiceManager.manage() is wired up to the leader-elected, |
1544 | +leader-settings-changed, peer relation-changed and peer |
1545 | +relation-departed hooks in addition to any other hooks you need, or your |
1546 | +service will deadlock. |
1547 | + |
1548 | +Ensure calls to acquire() are guarded, so that locks are only requested |
1549 | +when they are really needed (and thus hooks only triggered when necessary). |
1550 | +Failing to do this and calling acquire() unconditionally will put your unit |
1551 | +into a hook loop. Calls to granted() do not need to be guarded. |
1552 | + |
1553 | +For example:: |
1554 | + |
1555 | + from charmhelpers.core import hookenv, host, services |
1556 | + from charmhelpers import coordinator |
1557 | + |
1558 | + def maybe_restart(servicename): |
1559 | + serial = coordinator.Serial() |
1560 | + if needs_restart(): |
1561 | + serial.acquire('restart') |
1562 | + if serial.granted('restart'): |
1563 | + host.service_restart(servicename) |
1564 | + |
1565 | + service_defs = [dict(service='servicename', |
1566 | + data_ready=[maybe_restart])] |
1567 | + |
1568 | + if __name__ == '__main__': |
1569 | + _ = coordinator.Serial() # Must instantiate before manager.manage() |
1570 | + manager = services.ServiceManager(service_defs) |
1571 | + manager.manage() |
1572 | + |
1573 | + |
1574 | +You can implement a similar pattern using a decorator. If the lock has |
1575 | +not been granted, an attempt to acquire() it will be made if the guard |
1576 | +function returns True. If the lock has been granted, the decorated function |
1577 | +is run as normal:: |
1578 | + |
1579 | + from charmhelpers.core import hookenv, host, services |
1580 | + from charmhelpers import coordinator |
1581 | + |
1582 | + serial = coordinator.Serial() # Global, instantiated on module import. |
1583 | + |
1584 | + def needs_restart(): |
1585 | + [ ... Introspect state. Return True if restart is needed ... ] |
1586 | + |
1587 | + @serial.require('restart', needs_restart) |
1588 | + def maybe_restart(servicename): |
1589 | + host.service_restart(servicename) |
1590 | + |
1591 | + service_defs = [dict(service='servicename', |
1592 | + data_ready=[maybe_restart])] |
1593 | + |
1594 | + if __name__ == '__main__': |
1595 | + manager = services.ServiceManager(service_defs) |
1596 | + manager.manage() |
1597 | + |
1598 | + |
1599 | +Traditional Usage |
1600 | +================= |
1601 | + |
1602 | +Ensure a peer relation is defined in metadata.yaml. |
1603 | + |
1604 | +If you are using charmhelpers.core.hookenv.Hooks, ensure that a |
1605 | +BaseCoordinator subclass is instantiated before calling Hooks.execute. |
1606 | + |
1607 | +If you are not using charmhelpers.core.hookenv.Hooks, ensure |
1608 | +that a BaseCoordinator subclass is instantiated and its handle() |
1609 | +method called at the start of all your hooks. |
1610 | + |
1611 | +For example:: |
1612 | + |
1613 | + import sys |
1614 | + from charmhelpers.core import hookenv, host |
1615 | + from charmhelpers import coordinator |
1616 | + |
1617 | + hooks = hookenv.Hooks() |
1618 | + |
1619 | + def maybe_restart(): |
1620 | + serial = coordinator.Serial() |
1621 | + if serial.granted('restart'): |
1622 | + host.service_restart('myservice') |
1623 | + |
1624 | + @hooks.hook |
1625 | + def config_changed(): |
1626 | + update_config() |
1627 | + serial = coordinator.Serial() |
1628 | + if needs_restart(): |
1629 | + serial.acquire('restart') |
1630 | + maybe_restart() |
1631 | + |
1632 | + # Cluster hooks must be wired up. |
1633 | + @hooks.hook('cluster-relation-changed', 'cluster-relation-departed') |
1634 | + def cluster_relation_changed(): |
1635 | + maybe_restart() |
1636 | + |
1637 | + # Leader hooks must be wired up. |
1638 | + @hooks.hook('leader-elected', 'leader-settings-changed') |
1639 | + def leader_settings_changed(): |
1640 | + maybe_restart() |
1641 | + |
1642 | + [ ... repeat for *all* other hooks you are using ... ] |
1643 | + |
1644 | + if __name__ == '__main__': |
1645 | + _ = coordinator.Serial() # Must instantiate before execute() |
1646 | + hooks.execute(sys.argv) |
1647 | + |
1648 | + |
1649 | +You can also use the require decorator. If the lock has not been granted, |
1650 | +an attempt to acquire() it will be made if the guard function returns True. |
1651 | +If the lock has been granted, the decorated function is run as normal:: |
1652 | + |
1653 | + import sys |
1653 | + from charmhelpers.core import hookenv, host |
1653 | + from charmhelpers import coordinator |
1654 | + |
1655 | + hooks = hookenv.Hooks() |
1656 | + serial = coordinator.Serial() # Must instantiate before execute() |
1657 | + |
1658 | + @serial.require('restart', needs_restart) |
1659 | + def maybe_restart(): |
1660 | + host.service_restart('myservice') |
1661 | + |
1662 | + @hooks.hook('install', 'config-changed', 'upgrade-charm', |
1663 | + # Peer and leader hooks must be wired up. |
1664 | + 'cluster-relation-changed', 'cluster-relation-departed', |
1665 | + 'leader-elected', 'leader-settings-changed') |
1666 | + def default_hook(): |
1667 | + [...] |
1668 | + maybe_restart() |
1669 | + |
1670 | + if __name__ == '__main__': |
1671 | + hooks.execute(sys.argv) |
1672 | + |
1673 | + |
1674 | +Details |
1675 | +======= |
1676 | + |
1677 | +A simple API is provided similar to traditional locking APIs. A lock |
1678 | +may be requested using the acquire() method, and the granted() method |
1679 | +may be used to check if a lock previously requested by acquire() has |
1680 | +been granted. It doesn't matter how many times acquire() is called in a |
1681 | +hook. |
1682 | + |
1683 | +Locks are released at the end of the hook they are acquired in. This may |
1684 | +be the current hook if the unit is leader and the lock is free. It is |
1685 | +more likely a future hook (probably leader-settings-changed, possibly |
1686 | +the peer relation-changed or departed hook, potentially any hook). |
1687 | + |
1688 | +Whenever a charm needs to perform a coordinated action it will acquire() |
1689 | +the lock and perform the action immediately if acquisition is |
1690 | +successful. It will also need to perform the same action in every other |
1691 | +hook if the lock has been granted. |
1692 | + |
1693 | + |
1694 | +Grubby Details |
1695 | +-------------- |
1696 | + |
1697 | +Why do you need to be able to perform the same action in every hook? |
1698 | +If the unit is the leader, then it may be able to grant its own lock |
1699 | +and perform the action immediately in the source hook. If the unit is |
1700 | +the leader and cannot immediately grant the lock, then its only |
1701 | +guaranteed chance of acquiring the lock is in the peer relation-joined, |
1702 | +relation-changed or peer relation-departed hooks when another unit has |
1703 | +released it (the only channel to communicate to the leader is the peer |
1704 | +relation). If the unit is not the leader, then it is unlikely the lock |
1705 | +is granted in the source hook (a previous hook must have also made the |
1706 | +request for this to happen). A non-leader is notified about the lock via |
1707 | +leader settings. These changes may be visible in any hook, even before |
1708 | +the leader-settings-changed hook has been invoked. Or the requesting |
1709 | +unit may be promoted to leader after making a request, in which case the |
1710 | +lock may be granted in leader-elected or in a future peer |
1711 | +relation-changed or relation-departed hook. |
1712 | + |
1713 | +This could be simpler if leader-settings-changed was invoked on the |
1714 | +leader. We could then never grant locks except in |
1715 | +leader-settings-changed hooks giving one place for the operation to be |
1716 | +performed. Unfortunately this is not the case with Juju 1.23 leadership. |
1717 | + |
1718 | +But of course, this doesn't really matter to most people, as they |
1719 | +seem to prefer the Services Framework or similar reset-the-world |
1720 | +approaches, rather than the twisty maze of attempting to deduce what |
1721 | +should be done based on what hook happens to be running (which always |
1722 | +seems to evolve into reset-the-world anyway when the charm grows beyond |
1723 | +the trivial). |
1724 | + |
1725 | +I chose not to implement a callback model, where a callback was passed |
1726 | +to acquire to be executed when the lock is granted, because the callback |
1727 | +may become invalid between making the request and the lock being granted |
1728 | +due to an upgrade-charm being run in the interim. And it would create |
1729 | +restrictions, such as no lambdas and callbacks defined at the top |
1730 | +level of a module, etc. Still, we could implement it on top of what |
1731 | +is here, e.g. with a defer decorator that pickles the callback to disk |
1732 | +and has BaseCoordinator unpickle and execute it once the lock is granted. |
1733 | +''' |
1734 | +from datetime import datetime |
1735 | +from functools import wraps |
1736 | +import json |
1737 | +import os.path |
1738 | + |
1739 | +from six import with_metaclass |
1740 | + |
1741 | +from charmhelpers.core import hookenv |
1742 | + |
1743 | + |
1744 | +# We make BaseCoordinator and subclasses singletons, so that if we |
1745 | +# need to spill to local storage then only a single instance does so, |
1746 | +# rather than having multiple instances stomp over each other. |
1747 | +class Singleton(type): |
1748 | + _instances = {} |
1749 | + |
1750 | + def __call__(cls, *args, **kwargs): |
1751 | + if cls not in cls._instances: |
1752 | + cls._instances[cls] = super(Singleton, cls).__call__(*args, |
1753 | + **kwargs) |
1754 | + return cls._instances[cls] |
1755 | + |
1756 | + |
1757 | +class BaseCoordinator(with_metaclass(Singleton, object)): |
1758 | + relid = None # Peer relation-id, set by __init__ |
1759 | + relname = None |
1760 | + |
1761 | + grants = None # self.grants[unit][lock] == timestamp |
1762 | + requests = None # self.requests[unit][lock] == timestamp |
1763 | + |
1764 | + def __init__(self, relation_key='coordinator', peer_relation_name=None): |
1765 | + '''Instantiate a Coordinator. |
1766 | + |
1767 | + Data is stored on the peer relation and in leadership storage |
1768 | + under the provided relation_key. |
1769 | + |
1770 | + The peer relation is identified by peer_relation_name, and defaults |
1771 | + to the first one found in metadata.yaml. |
1772 | + ''' |
1773 | + # Most initialization is deferred, since invoking hook tools from |
1774 | + # the constructor makes testing hard. |
1775 | + self.key = relation_key |
1776 | + self.relname = peer_relation_name |
1777 | + hookenv.atstart(self.initialize) |
1778 | + |
1779 | + # Ensure that handle() is called, without placing that burden on |
1780 | + # the charm author. They still need to do this manually if they |
1781 | + # are not using a hook framework. |
1782 | + hookenv.atstart(self.handle) |
1783 | + |
1784 | + def initialize(self): |
1785 | + if self.requests is not None: |
1786 | + return # Already initialized. |
1787 | + |
1788 | + assert hookenv.has_juju_version('1.23'), 'Needs Juju 1.23+' |
1789 | + |
1790 | + if self.relname is None: |
1791 | + self.relname = _implicit_peer_relation_name() |
1792 | + |
1793 | + relids = hookenv.relation_ids(self.relname) |
1794 | + if relids: |
1795 | + self.relid = sorted(relids)[0] |
1796 | + |
1797 | + # Load our state, from leadership, the peer relationship, and maybe |
1798 | + # local state as a fallback. Populates self.requests and self.grants. |
1799 | + self._load_state() |
1800 | + self._emit_state() |
1801 | + |
1802 | + # Save our state if the hook completes successfully. |
1803 | + hookenv.atexit(self._save_state) |
1804 | + |
1805 | + # Schedule release of granted locks for the end of the hook. |
1806 | + # This needs to be the last of our atexit callbacks to ensure |
1807 | + # it will be run first when the hook is complete, because there |
1808 | + # is no point mutating our state after it has been saved. |
1809 | + hookenv.atexit(self._release_granted) |
1810 | + |
1811 | + def acquire(self, lock): |
1812 | + '''Acquire the named lock, non-blocking. |
1813 | + |
1814 | + The lock may be granted immediately, or in a future hook. |
1815 | + |
1816 | + Returns True if the lock has been granted. The lock will be |
1817 | + automatically released at the end of the hook in which it is |
1818 | + granted. |
1819 | + |
1820 | + Do not mindlessly call this method, as it triggers a cascade of |
1821 | + hooks. For example, if you call acquire() every time in your |
1822 | + peer relation-changed hook you will end up with an infinite loop |
1823 | + of hooks. It should almost always be guarded by some condition. |
1824 | + ''' |
1825 | + unit = hookenv.local_unit() |
1826 | + ts = self.requests[unit].get(lock) |
1827 | + if not ts: |
1828 | + # If there is no outstanding request on the peer relation, |
1829 | + # create one. |
1830 | + self.requests.setdefault(unit, {}) |
1831 | + self.requests[unit][lock] = _timestamp() |
1832 | + self.msg('Requested {}'.format(lock)) |
1833 | + |
1834 | + # If the leader has granted the lock, yay. |
1835 | + if self.granted(lock): |
1836 | + self.msg('Acquired {}'.format(lock)) |
1837 | + return True |
1838 | + |
1839 | + # If the unit making the request also happens to be the |
1840 | + # leader, it must handle the request now. Even though the |
1841 | + # request has been stored on the peer relation, the peer |
1842 | + # relation-changed hook will not be triggered. |
1843 | + if hookenv.is_leader(): |
1844 | + return self.grant(lock, unit) |
1845 | + |
1846 | + return False # Can't acquire lock, yet. Maybe next hook. |
1847 | + |
1848 | + def granted(self, lock): |
1849 | + '''Return True if a previously requested lock has been granted''' |
1850 | + unit = hookenv.local_unit() |
1851 | + ts = self.requests[unit].get(lock) |
1852 | + if ts and self.grants.get(unit, {}).get(lock) == ts: |
1853 | + return True |
1854 | + return False |
1855 | + |
1856 | + def requested(self, lock): |
1857 | + '''Return True if we are in the queue for the lock''' |
1858 | + return lock in self.requests[hookenv.local_unit()] |
1859 | + |
1860 | + def request_timestamp(self, lock): |
1861 | + '''Return the timestamp of our outstanding request for lock, or None. |
1862 | + |
1863 | + Returns a datetime.datetime() UTC timestamp, with no tzinfo attribute. |
1864 | + ''' |
1865 | + ts = self.requests[hookenv.local_unit()].get(lock, None) |
1866 | + if ts is not None: |
1867 | + return datetime.strptime(ts, _timestamp_format) |
1868 | + |
1869 | + def handle(self): |
1870 | + if not hookenv.is_leader(): |
1871 | + return # Only the leader can grant requests. |
1872 | + |
1873 | + self.msg('Leader handling coordinator requests') |
1874 | + |
1875 | + # Clear our grants that have been released. |
1876 | + for unit in self.grants.keys(): |
1877 | + for lock, grant_ts in list(self.grants[unit].items()): |
1878 | + req_ts = self.requests.get(unit, {}).get(lock) |
1879 | + if req_ts != grant_ts: |
1880 | + # The request timestamp does not match the granted |
1881 | + # timestamp. Several hooks on 'unit' may have run |
1882 | + # before the leader got a chance to make a decision, |
1883 | + # and 'unit' may have released its lock and attempted |
1884 | + # to reacquire it. This will change the timestamp, |
1885 | + # and we correctly revoke the old grant, putting the |
1886 | + # request at the end of the queue. |
1887 | + ts = datetime.strptime(self.grants[unit][lock], |
1888 | + _timestamp_format) |
1889 | + del self.grants[unit][lock] |
1890 | + self.released(unit, lock, ts) |
1891 | + |
1892 | + # Grant locks |
1893 | + for unit in self.requests.keys(): |
1894 | + for lock in self.requests[unit]: |
1895 | + self.grant(lock, unit) |
1896 | + |
1897 | + def grant(self, lock, unit): |
1898 | + '''Maybe grant the lock to a unit. |
1899 | + |
1900 | + The decision to grant the lock or not is made for $lock |
1901 | + by a corresponding method grant_$lock, which you may define |
1902 | + in a subclass. If no such method is defined, the default_grant |
1903 | + method is used. See Serial.default_grant() for details. |
1904 | + ''' |
1905 | + if not hookenv.is_leader(): |
1906 | + return False # Not the leader, so we cannot grant. |
1907 | + |
1908 | + # Set of units already granted the lock. |
1909 | + granted = set() |
1910 | + for u in self.grants: |
1911 | + if lock in self.grants[u]: |
1912 | + granted.add(u) |
1913 | + if unit in granted: |
1914 | + return True # Already granted. |
1915 | + |
1916 | + # Ordered list of units waiting for the lock. |
1917 | + reqs = set() |
1918 | + for u in self.requests: |
1919 | + if u in granted: |
1920 | + continue # In the granted set. Not wanted in the req list. |
1921 | + for l, ts in self.requests[u].items(): |
1922 | + if l == lock: |
1923 | + reqs.add((ts, u)) |
1924 | + queue = [t[1] for t in sorted(reqs)] |
1925 | + if unit not in queue: |
1926 | + return False # Unit has not requested the lock. |
1927 | + |
1928 | + # Locate custom logic, or fallback to the default. |
1929 | + grant_func = getattr(self, 'grant_{}'.format(lock), self.default_grant) |
1930 | + |
1931 | + if grant_func(lock, unit, granted, queue): |
1932 | + # Grant the lock. |
1933 | + self.msg('Leader grants {} to {}'.format(lock, unit)) |
1934 | + self.grants.setdefault(unit, {})[lock] = self.requests[unit][lock] |
1935 | + return True |
1936 | + |
1937 | + return False |
1938 | + |
1939 | + def released(self, unit, lock, timestamp): |
1940 | + '''Called on the leader when it has released a lock. |
1941 | + |
1942 | + By default, does nothing but log messages. Override if you |
1943 | + need to perform additional housekeeping when a lock is released, |
1944 | + for example recording timestamps. |
1945 | + ''' |
1946 | + interval = _utcnow() - timestamp |
1947 | + self.msg('Leader released {} from {}, held {}'.format(lock, unit, |
1948 | + interval)) |
1949 | + |
1950 | + def require(self, lock, guard_func, *guard_args, **guard_kw): |
1951 | + """Decorate a function to be run only when a lock is acquired. |
1952 | + |
1953 | + The lock is requested if the guard function returns True. |
1954 | + |
1955 | + The decorated function is called if the lock has been granted. |
1956 | + """ |
1957 | + def decorator(f): |
1958 | + @wraps(f) |
1959 | + def wrapper(*args, **kw): |
1960 | + if self.granted(lock): |
1961 | + self.msg('Granted {}'.format(lock)) |
1962 | + return f(*args, **kw) |
1963 | + if guard_func(*guard_args, **guard_kw) and self.acquire(lock): |
1964 | + return f(*args, **kw) |
1965 | + return None |
1966 | + return wrapper |
1967 | + return decorator |
1968 | + |
1969 | + def msg(self, msg): |
1970 | + '''Emit a message. Override to customize log spam.''' |
1971 | + hookenv.log('coordinator.{} {}'.format(self._name(), msg), |
1972 | + level=hookenv.INFO) |
1973 | + |
1974 | + def _name(self): |
1975 | + return self.__class__.__name__ |
1976 | + |
1977 | + def _load_state(self): |
1978 | + self.msg('Loading state') |
1979 | + |
1980 | + # All responses must be stored in the leadership settings. |
1981 | + # The leader cannot use local state, as a different unit may |
1982 | + # be leader next time. Which is fine, as the leadership |
1983 | + # settings are always available. |
1984 | + self.grants = json.loads(hookenv.leader_get(self.key) or '{}') |
1985 | + |
1986 | + local_unit = hookenv.local_unit() |
1987 | + |
1988 | + # All requests must be stored on the peer relation. This is |
1989 | + # the only channel units have to communicate with the leader. |
1990 | + # Even the leader needs to store its requests here, as a |
1991 | + # different unit may be leader by the time the request can be |
1992 | + # granted. |
1993 | + if self.relid is None: |
1994 | + # The peer relation is not available. Maybe we are early in |
1995 | + # the unit's lifecycle. Maybe this unit is standalone. |
1996 | + # Fallback to using local state. |
1997 | + self.msg('No peer relation. Loading local state') |
1998 | + self.requests = {local_unit: self._load_local_state()} |
1999 | + else: |
2000 | + self.requests = self._load_peer_state() |
2001 | + if local_unit not in self.requests: |
2002 | + # The peer relation has just been joined. Update any state |
2003 | + # loaded from our peers with our local state. |
2004 | + self.msg('New peer relation. Merging local state') |
2005 | + self.requests[local_unit] = self._load_local_state() |
2006 | + |
2007 | + def _emit_state(self): |
2008 | + # Emit this unit's lock status. |
2009 | + for lock in sorted(self.requests[hookenv.local_unit()].keys()): |
2010 | + if self.granted(lock): |
2011 | + self.msg('Granted {}'.format(lock)) |
2012 | + else: |
2013 | + self.msg('Waiting on {}'.format(lock)) |
2014 | + |
2015 | + def _save_state(self): |
2016 | + self.msg('Publishing state') |
2017 | + if hookenv.is_leader(): |
2018 | + # sort_keys to ensure stability. |
2019 | + raw = json.dumps(self.grants, sort_keys=True) |
2020 | + hookenv.leader_set({self.key: raw}) |
2021 | + |
2022 | + local_unit = hookenv.local_unit() |
2023 | + |
2024 | + if self.relid is None: |
2025 | + # No peer relation yet. Fallback to local state. |
2026 | + self.msg('No peer relation. Saving local state') |
2027 | + self._save_local_state(self.requests[local_unit]) |
2028 | + else: |
2029 | + # sort_keys to ensure stability. |
2030 | + raw = json.dumps(self.requests[local_unit], sort_keys=True) |
2031 | + hookenv.relation_set(self.relid, relation_settings={self.key: raw}) |
2032 | + |
2033 | + def _load_peer_state(self): |
2034 | + requests = {} |
2035 | + units = set(hookenv.related_units(self.relid)) |
2036 | + units.add(hookenv.local_unit()) |
2037 | + for unit in units: |
2038 | + raw = hookenv.relation_get(self.key, unit, self.relid) |
2039 | + if raw: |
2040 | + requests[unit] = json.loads(raw) |
2041 | + return requests |
2042 | + |
2043 | + def _local_state_filename(self): |
2044 | + # Include the class name. We allow multiple BaseCoordinator |
2045 | + # subclasses to be instantiated, and they are singletons, so |
2046 | + # this avoids conflicts (unless someone creates and uses two |
2047 | + # BaseCoordinator subclasses with the same class name, so don't |
2048 | + # do that). |
2049 | + return '.charmhelpers.coordinator.{}'.format(self._name()) |
2050 | + |
2051 | + def _load_local_state(self): |
2052 | + fn = self._local_state_filename() |
2053 | + if os.path.exists(fn): |
2054 | + with open(fn, 'r') as f: |
2055 | + return json.load(f) |
2056 | + return {} |
2057 | + |
2058 | + def _save_local_state(self, state): |
2059 | + fn = self._local_state_filename() |
2060 | + with open(fn, 'w') as f: |
2061 | + json.dump(state, f) |
2062 | + |
2063 | + def _release_granted(self): |
2064 | + # At the end of every hook, release all locks granted to |
2065 | + # this unit. If a hook neglects to make use of what it |
2066 | + # requested, it will just have to make the request again. |
2067 | + # Implicit release is the only way this will work, as |
2068 | + # if the unit is standalone there may be no future triggers |
2069 | + # called to do a manual release. |
2070 | + unit = hookenv.local_unit() |
2071 | + for lock in list(self.requests[unit].keys()): |
2072 | + if self.granted(lock): |
2073 | + self.msg('Released local {} lock'.format(lock)) |
2074 | + del self.requests[unit][lock] |
2075 | + |
2076 | + |
2077 | +class Serial(BaseCoordinator): |
2078 | + def default_grant(self, lock, unit, granted, queue): |
2079 | + '''Default logic to grant a lock to a unit. Unless overridden, |
2080 | + only one unit may hold the lock and it will be granted to the |
2081 | + earliest queued request. |
2082 | + |
2083 | + To define custom logic for $lock, create a subclass and |
2084 | + define a grant_$lock method. |
2085 | + |
2086 | + `unit` is the unit name making the request. |
2087 | + |
2088 | + `granted` is the set of units already granted the lock. It will |
2089 | + never include `unit`. It may be empty. |
2090 | + |
2091 | + `queue` is the list of units waiting for the lock, ordered by time |
2092 | + of request. It will always include `unit`, but `unit` is not |
2093 | + necessarily first. |
2094 | + |
2095 | + Returns True if the lock should be granted to `unit`. |
2096 | + ''' |
2097 | + return unit == queue[0] and not granted |
2098 | + |
2099 | + |
2100 | +def _implicit_peer_relation_name(): |
2101 | + md = hookenv.metadata() |
2102 | + assert 'peers' in md, 'No peer relations in metadata.yaml' |
2103 | + return sorted(md['peers'].keys())[0] |
2104 | + |
2105 | + |
2106 | +# A human readable, sortable UTC timestamp format. |
2107 | +_timestamp_format = '%Y-%m-%d %H:%M:%S.%fZ' |
2108 | + |
2109 | + |
2110 | +def _utcnow(): # pragma: no cover |
2111 | + # This wrapper exists as mocking datetime methods is problematic. |
2112 | + return datetime.utcnow() |
2113 | + |
2114 | + |
2115 | +def _timestamp(): |
2116 | + return _utcnow().strftime(_timestamp_format) |
2117 | |
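For reference, a minimal sketch of a charm using this module, including a custom grant_$lock method. The hook names follow the examples in the module docstring; the flag file, the 'repair' lock and the nodetool command are illustrative assumptions, not part of this branch:

    import os
    import subprocess
    import sys

    from charmhelpers.core import hookenv
    from charmhelpers import coordinator


    class Parallel2(coordinator.Serial):
        def grant_repair(self, lock, unit, granted, queue):
            # Custom grant_$lock logic: allow up to two concurrent
            # holders of 'repair'; other locks keep Serial's default
            # of a single holder.
            return len(granted) < 2 and unit == queue[0]


    hooks = hookenv.Hooks()
    pool = Parallel2()  # Singleton; must exist before hooks.execute().

    FLAG = '/var/run/needs-repair'  # hypothetical guard flag


    @hooks.hook('config-changed',
                # Peer and leader hooks must be wired up.
                'cluster-relation-changed', 'cluster-relation-departed',
                'leader-elected', 'leader-settings-changed')
    def default_hook():
        if os.path.exists(FLAG):  # guarded acquire, avoiding hook loops
            pool.acquire('repair')
        if pool.granted('repair'):
            subprocess.check_call(['nodetool', 'repair'])
            os.unlink(FLAG)


    if __name__ == '__main__':
        hooks.execute(sys.argv)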
2118 | === modified file 'hooks/charmhelpers/core/hookenv.py' |
2119 | --- hooks/charmhelpers/core/hookenv.py 2015-04-03 15:42:21 +0000 |
2120 | +++ hooks/charmhelpers/core/hookenv.py 2015-06-30 07:36:26 +0000 |
2121 | @@ -20,12 +20,17 @@ |
2122 | # Authors: |
2123 | # Charm Helpers Developers <juju@lists.ubuntu.com> |
2124 | |
2125 | +from __future__ import print_function |
2126 | +from distutils.version import LooseVersion |
2127 | from functools import wraps |
2128 | +import glob |
2129 | import os |
2130 | import json |
2131 | import yaml |
2132 | import subprocess |
2133 | import sys |
2134 | +import errno |
2135 | +import tempfile |
2136 | from subprocess import CalledProcessError |
2137 | |
2138 | import six |
2139 | @@ -90,7 +95,18 @@ |
2140 | if not isinstance(message, six.string_types): |
2141 | message = repr(message) |
2142 | command += [message] |
2143 | - subprocess.call(command) |
2144 | + # Missing juju-log should not cause failures in unit tests |
2145 | + # Send log output to stderr |
2146 | + try: |
2147 | + subprocess.call(command) |
2148 | + except OSError as e: |
2149 | + if e.errno == errno.ENOENT: |
2150 | + if level: |
2151 | + message = "{}: {}".format(level, message) |
2152 | + message = "juju-log: {}".format(message) |
2153 | + print(message, file=sys.stderr) |
2154 | + else: |
2155 | + raise |
2156 | |
2157 | |
2158 | class Serializable(UserDict): |
2159 | @@ -228,6 +244,7 @@ |
2160 | self.path = os.path.join(charm_dir(), Config.CONFIG_FILE_NAME) |
2161 | if os.path.exists(self.path): |
2162 | self.load_previous() |
2163 | + atexit(self._implicit_save) |
2164 | |
2165 | def __getitem__(self, key): |
2166 | """For regular dict lookups, check the current juju config first, |
2167 | @@ -307,6 +324,10 @@ |
2168 | with open(self.path, 'w') as f: |
2169 | json.dump(self, f) |
2170 | |
2171 | + def _implicit_save(self): |
2172 | + if self.implicit_save: |
2173 | + self.save() |
2174 | + |
2175 | |
2176 | @cached |
2177 | def config(scope=None): |
2178 | @@ -349,18 +370,49 @@ |
2179 | """Set relation information for the current unit""" |
2180 | relation_settings = relation_settings if relation_settings else {} |
2181 | relation_cmd_line = ['relation-set'] |
2182 | + accepts_file = "--file" in subprocess.check_output( |
2183 | + relation_cmd_line + ["--help"], universal_newlines=True) |
2184 | if relation_id is not None: |
2185 | relation_cmd_line.extend(('-r', relation_id)) |
2186 | - for k, v in (list(relation_settings.items()) + list(kwargs.items())): |
2187 | - if v is None: |
2188 | - relation_cmd_line.append('{}='.format(k)) |
2189 | - else: |
2190 | - relation_cmd_line.append('{}={}'.format(k, v)) |
2191 | - subprocess.check_call(relation_cmd_line) |
2192 | + settings = relation_settings.copy() |
2193 | + settings.update(kwargs) |
2194 | + for key, value in settings.items(): |
2195 | + # Force value to be a string: it should always be one, but some call |
2196 | + # sites pass in things like dicts or numbers. |
2197 | + if value is not None: |
2198 | + settings[key] = "{}".format(value) |
2199 | + if accepts_file: |
2200 | + # --file was introduced in Juju 1.23.2. Use it by default if |
2201 | + # available, since otherwise we'll break if the relation data is |
2202 | + # too big. Ideally we should tell relation-set to read the data from |
2203 | + # stdin, but that feature is broken in 1.23.2: Bug #1454678. |
2204 | + with tempfile.NamedTemporaryFile(delete=False) as settings_file: |
2205 | + settings_file.write(yaml.safe_dump(settings).encode("utf-8")) |
2206 | + subprocess.check_call( |
2207 | + relation_cmd_line + ["--file", settings_file.name]) |
2208 | + os.remove(settings_file.name) |
2209 | + else: |
2210 | + for key, value in settings.items(): |
2211 | + if value is None: |
2212 | + relation_cmd_line.append('{}='.format(key)) |
2213 | + else: |
2214 | + relation_cmd_line.append('{}={}'.format(key, value)) |
2215 | + subprocess.check_call(relation_cmd_line) |
2216 | # Flush cache of any relation-gets for local unit |
2217 | flush(local_unit()) |
2218 | |
2219 | |
2220 | +def relation_clear(r_id=None): |
2221 | + ''' Clears any relation data already set on relation r_id ''' |
2222 | + settings = relation_get(rid=r_id, |
2223 | + unit=local_unit()) |
2224 | + for setting in settings: |
2225 | + if setting not in ['public-address', 'private-address']: |
2226 | + settings[setting] = None |
2227 | + relation_set(relation_id=r_id, |
2228 | + **settings) |
2229 | + |
2230 | + |
2231 | @cached |
2232 | def relation_ids(reltype=None): |
2233 | """A list of relation_ids""" |
2234 | @@ -542,10 +594,14 @@ |
2235 | hooks.execute(sys.argv) |
2236 | """ |
2237 | |
2238 | - def __init__(self, config_save=True): |
2239 | + def __init__(self, config_save=None): |
2240 | super(Hooks, self).__init__() |
2241 | self._hooks = {} |
2242 | - self._config_save = config_save |
2243 | + |
2244 | + # For unknown reasons, we allow the Hooks constructor to override |
2245 | + # config().implicit_save. |
2246 | + if config_save is not None: |
2247 | + config().implicit_save = config_save |
2248 | |
2249 | def register(self, name, function): |
2250 | """Register a hook""" |
2251 | @@ -553,13 +609,16 @@ |
2252 | |
2253 | def execute(self, args): |
2254 | """Execute a registered hook based on args[0]""" |
2255 | + _run_atstart() |
2256 | hook_name = os.path.basename(args[0]) |
2257 | if hook_name in self._hooks: |
2258 | - self._hooks[hook_name]() |
2259 | - if self._config_save: |
2260 | - cfg = config() |
2261 | - if cfg.implicit_save: |
2262 | - cfg.save() |
2263 | + try: |
2264 | + self._hooks[hook_name]() |
2265 | + except SystemExit as x: |
2266 | + if x.code is None or x.code == 0: |
2267 | + _run_atexit() |
2268 | + raise |
2269 | + _run_atexit() |
2270 | else: |
2271 | raise UnregisteredHookError(hook_name) |
2272 | |
2273 | @@ -606,3 +665,160 @@ |
2274 | |
2275 | The results set by action_set are preserved.""" |
2276 | subprocess.check_call(['action-fail', message]) |
2277 | + |
2278 | + |
2279 | +def status_set(workload_state, message): |
2280 | + """Set the workload state with a message |
2281 | + |
2282 | + Use status-set to set the workload state with a message which is visible |
2283 | + to the user via juju status. If the status-set command is not found then |
2284 | + assume this is juju < 1.23 and juju-log the message instead. |
2285 | + |
2286 | + workload_state -- valid juju workload state. |
2287 | + message -- status update message |
2288 | + """ |
2289 | + valid_states = ['maintenance', 'blocked', 'waiting', 'active'] |
2290 | + if workload_state not in valid_states: |
2291 | + raise ValueError( |
2292 | + '{!r} is not a valid workload state'.format(workload_state) |
2293 | + ) |
2294 | + cmd = ['status-set', workload_state, message] |
2295 | + try: |
2296 | + ret = subprocess.call(cmd) |
2297 | + if ret == 0: |
2298 | + return |
2299 | + except OSError as e: |
2300 | + if e.errno != errno.ENOENT: |
2301 | + raise |
2302 | + log_message = 'status-set failed: {} {}'.format(workload_state, |
2303 | + message) |
2304 | + log(log_message, level='INFO') |
2305 | + |
2306 | + |
2307 | +def status_get(): |
2308 | + """Retrieve the previously set juju workload state |
2309 | + |
2310 | + If the status-set command is not found then assume this is juju < 1.23 and |
2311 | + return 'unknown' |
2312 | + """ |
2313 | + cmd = ['status-get'] |
2314 | + try: |
2315 | + raw_status = subprocess.check_output(cmd, universal_newlines=True) |
2316 | + status = raw_status.rstrip() |
2317 | + return status |
2318 | + except OSError as e: |
2319 | + if e.errno == errno.ENOENT: |
2320 | + return 'unknown' |
2321 | + else: |
2322 | + raise |
2323 | + |
2324 | + |
2325 | +def translate_exc(from_exc, to_exc): |
2326 | + def inner_translate_exc1(f): |
2327 | + def inner_translate_exc2(*args, **kwargs): |
2328 | + try: |
2329 | + return f(*args, **kwargs) |
2330 | + except from_exc: |
2331 | + raise to_exc |
2332 | + |
2333 | + return inner_translate_exc2 |
2334 | + |
2335 | + return inner_translate_exc1 |
2336 | + |
2337 | + |
2338 | +@translate_exc(from_exc=OSError, to_exc=NotImplementedError) |
2339 | +def is_leader(): |
2340 | + """Does the current unit hold the juju leadership |
2341 | + |
2342 | + Uses juju to determine whether the current unit is the leader of its peers |
2343 | + """ |
2344 | + cmd = ['is-leader', '--format=json'] |
2345 | + return json.loads(subprocess.check_output(cmd).decode('UTF-8')) |
2346 | + |
2347 | + |
2348 | +@translate_exc(from_exc=OSError, to_exc=NotImplementedError) |
2349 | +def leader_get(attribute=None): |
2350 | + """Juju leader get value(s)""" |
2351 | + cmd = ['leader-get', '--format=json'] + [attribute or '-'] |
2352 | + return json.loads(subprocess.check_output(cmd).decode('UTF-8')) |
2353 | + |
2354 | + |
2355 | +@translate_exc(from_exc=OSError, to_exc=NotImplementedError) |
2356 | +def leader_set(settings=None, **kwargs): |
2357 | + """Juju leader set value(s)""" |
2358 | + # Don't log secrets. |
2359 | + # log("Juju leader-set '%s'" % (settings), level=DEBUG) |
2360 | + cmd = ['leader-set'] |
2361 | + settings = settings or {} |
2362 | + settings.update(kwargs) |
2363 | + for k, v in settings.items(): |
2364 | + if v is None: |
2365 | + cmd.append('{}='.format(k)) |
2366 | + else: |
2367 | + cmd.append('{}={}'.format(k, v)) |
2368 | + subprocess.check_call(cmd) |
2369 | + |
2370 | + |
2371 | +@cached |
2372 | +def juju_version(): |
2373 | + """Full version string (eg. '1.23.3.1-trusty-amd64')""" |
2374 | + # Per https://bugs.launchpad.net/juju-core/+bug/1455368/comments/1 |
2375 | + jujud = glob.glob('/var/lib/juju/tools/machine-*/jujud')[0] |
2376 | + return subprocess.check_output([jujud, 'version'], |
2377 | + universal_newlines=True).strip() |
2378 | + |
2379 | + |
2380 | +@cached |
2381 | +def has_juju_version(minimum_version): |
2382 | + """Return True if the Juju version is at least the provided version""" |
2383 | + return LooseVersion(juju_version()) >= LooseVersion(minimum_version) |
2384 | + |
2385 | + |
2386 | +_atexit = [] |
2387 | +_atstart = [] |
2388 | + |
2389 | + |
2390 | +def atstart(callback, *args, **kwargs): |
2391 | + '''Schedule a callback to run before the main hook. |
2392 | + |
2393 | + Callbacks are run in the order they were added. |
2394 | + |
2395 | + This is useful for modules and classes to perform initialization |
2396 | + and inject behavior. In particular: |
2397 | + - Run common code before all of your hooks, such as logging |
2398 | + the hook name or interesting relation data. |
2399 | + - Defer object or module initialization that requires a hook |
2400 | + context until we know there actually is a hook context, |
2401 | + making testing easier. |
2402 | + - Rather than requiring charm authors to include boilerplate to |
2403 | + invoke your helper's behavior, have it run automatically if |
2404 | + your object is instantiated or module imported. |
2405 | + |
2406 | + This is not at all useful after your hook framework has been launched. |
2407 | + ''' |
2408 | + global _atstart |
2409 | + _atstart.append((callback, args, kwargs)) |
2410 | + |
2411 | + |
2412 | +def atexit(callback, *args, **kwargs): |
2413 | + '''Schedule a callback to run on successful hook completion. |
2414 | + |
2415 | + Callbacks are run in the reverse order that they were added.''' |
2416 | + _atexit.append((callback, args, kwargs)) |
2417 | + |
2418 | + |
2419 | +def _run_atstart(): |
2420 | + '''Hook frameworks must invoke this before running the main hook body.''' |
2421 | + global _atstart |
2422 | + for callback, args, kwargs in _atstart: |
2423 | + callback(*args, **kwargs) |
2424 | + del _atstart[:] |
2425 | + |
2426 | + |
2427 | +def _run_atexit(): |
2428 | + '''Hook frameworks must invoke this after the main hook body has |
2429 | + successfully completed. Do not invoke it if the hook fails.''' |
2430 | + global _atexit |
2431 | + for callback, args, kwargs in reversed(_atexit): |
2432 | + callback(*args, **kwargs) |
2433 | + del _atexit[:] |
2434 | |
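An illustrative sketch tying the new hookenv helpers together: atstart()/atexit(), which Hooks.execute() now runs around the hook body, plus status_set(). The hook name and messages are placeholders:

    import sys
    from charmhelpers.core import hookenv

    hooks = hookenv.Hooks()

    # atstart callbacks run in registration order before the hook body;
    # atexit callbacks run in reverse order, and only on clean exit.
    hookenv.atstart(hookenv.log, 'hook starting')
    hookenv.atexit(hookenv.log, 'hook completed cleanly')


    @hooks.hook('config-changed')
    def config_changed():
        # Falls back to juju-log when status-set is missing (juju < 1.23).
        hookenv.status_set('active', 'ready')


    if __name__ == '__main__':
        hooks.execute(sys.argv)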
2435 | === modified file 'hooks/charmhelpers/core/host.py' |
2436 | --- hooks/charmhelpers/core/host.py 2015-04-03 15:42:21 +0000 |
2437 | +++ hooks/charmhelpers/core/host.py 2015-06-30 07:36:26 +0000 |
2438 | @@ -413,7 +413,7 @@ |
2439 | the pkgcache argument is None. Be sure to add charmhelpers.fetch if |
2440 | you call this function, or pass an apt_pkg.Cache() instance. |
2441 | ''' |
2442 | - from apt import apt_pkg |
2443 | + import apt_pkg |
2444 | if not pkgcache: |
2445 | from charmhelpers.fetch import apt_cache |
2446 | pkgcache = apt_cache() |
2447 | |
2448 | === modified file 'hooks/charmhelpers/core/services/base.py' |
2449 | --- hooks/charmhelpers/core/services/base.py 2015-01-26 13:07:31 +0000 |
2450 | +++ hooks/charmhelpers/core/services/base.py 2015-06-30 07:36:26 +0000 |
2451 | @@ -15,8 +15,8 @@ |
2452 | # along with charm-helpers. If not, see <http://www.gnu.org/licenses/>. |
2453 | |
2454 | import os |
2455 | -import re |
2456 | import json |
2457 | +from inspect import getargspec |
2458 | from collections import Iterable, OrderedDict |
2459 | |
2460 | from charmhelpers.core import host |
2461 | @@ -128,15 +128,18 @@ |
2462 | """ |
2463 | Handle the current hook by doing The Right Thing with the registered services. |
2464 | """ |
2465 | - hook_name = hookenv.hook_name() |
2466 | - if hook_name == 'stop': |
2467 | - self.stop_services() |
2468 | - else: |
2469 | - self.provide_data() |
2470 | - self.reconfigure_services() |
2471 | - cfg = hookenv.config() |
2472 | - if cfg.implicit_save: |
2473 | - cfg.save() |
2474 | + hookenv._run_atstart() |
2475 | + try: |
2476 | + hook_name = hookenv.hook_name() |
2477 | + if hook_name == 'stop': |
2478 | + self.stop_services() |
2479 | + else: |
2480 | + self.reconfigure_services() |
2481 | + self.provide_data() |
2482 | + except SystemExit as x: |
2483 | + if x.code is None or x.code == 0: |
2484 | + hookenv._run_atexit() |
2485 | + hookenv._run_atexit() |
2486 | |
2487 | def provide_data(self): |
2488 | """ |
2489 | @@ -145,15 +148,36 @@ |
2490 | A provider must have a `name` attribute, which indicates which relation |
2491 | to set data on, and a `provide_data()` method, which returns a dict of |
2492 | data to set. |
2493 | + |
2494 | + The `provide_data()` method can optionally accept two parameters: |
2495 | + |
2496 | + * ``remote_service`` The name of the remote service that the data will |
2497 | + be provided to. The `provide_data()` method will be called once |
2498 | + for each connected service (not unit). This allows the method to |
2499 | + tailor its data to the given service. |
2500 | + * ``service_ready`` Whether or not the service definition had all of |
2501 | + its requirements met, and thus the ``data_ready`` callbacks run. |
2502 | + |
2503 | + Note that the ``provided_data`` methods are now called **after** the |
2504 | + ``data_ready`` callbacks are run. This gives the ``data_ready`` callbacks |
2505 | + a chance to generate any data necessary to provide to the remote |
2506 | + services. |
2507 | """ |
2508 | - hook_name = hookenv.hook_name() |
2509 | - for service in self.services.values(): |
2510 | + for service_name, service in self.services.items(): |
2511 | + service_ready = self.is_ready(service_name) |
2512 | for provider in service.get('provided_data', []): |
2513 | - if re.match(r'{}-relation-(joined|changed)'.format(provider.name), hook_name): |
2514 | - data = provider.provide_data() |
2515 | - _ready = provider._is_ready(data) if hasattr(provider, '_is_ready') else data |
2516 | - if _ready: |
2517 | - hookenv.relation_set(None, data) |
2518 | + for relid in hookenv.relation_ids(provider.name): |
2519 | + units = hookenv.related_units(relid) |
2520 | + if not units: |
2521 | + continue |
2522 | + remote_service = units[0].split('/')[0] |
2523 | + argspec = getargspec(provider.provide_data) |
2524 | + if len(argspec.args) > 1: |
2525 | + data = provider.provide_data(remote_service, service_ready) |
2526 | + else: |
2527 | + data = provider.provide_data() |
2528 | + if data: |
2529 | + hookenv.relation_set(relid, data) |
2530 | |
2531 | def reconfigure_services(self, *service_names): |
2532 | """ |
2533 | |
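To illustrate the new provide_data() signature, a hypothetical provider (the class name, relation name and data keys are invented for the example; the old zero-argument form keeps working because the argspec is inspected):

    class DatabaseProvider(object):
        name = 'database'  # must match a relation name in metadata.yaml

        def provide_data(self, remote_service, service_ready):
            # Called once per connected service, after the data_ready
            # callbacks have run.
            if not service_ready:
                return {}  # publish nothing until the service is ready
            return {'allowed-service': remote_service, 'port': '9042'}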
2534 | === modified file 'hooks/charmhelpers/core/strutils.py' |
2535 | --- hooks/charmhelpers/core/strutils.py 2015-02-27 05:36:25 +0000 |
2536 | +++ hooks/charmhelpers/core/strutils.py 2015-06-30 07:36:26 +0000 |
2537 | @@ -33,9 +33,9 @@ |
2538 | |
2539 | value = value.strip().lower() |
2540 | |
2541 | - if value in ['y', 'yes', 'true', 't']: |
2542 | + if value in ['y', 'yes', 'true', 't', 'on']: |
2543 | return True |
2544 | - elif value in ['n', 'no', 'false', 'f']: |
2545 | + elif value in ['n', 'no', 'false', 'f', 'off']: |
2546 | return False |
2547 | |
2548 | msg = "Unable to interpret string value '%s' as boolean" % (value) |
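With the change above, for example:

    from charmhelpers.core.strutils import bool_from_string

    assert bool_from_string('on') is True      # newly accepted
    assert bool_from_string(' OFF ') is False  # stripped and lower-cased
    assert bool_from_string('yes') is True     # existing values unchanged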
2549 | |
2550 | === modified file 'hooks/charmhelpers/fetch/giturl.py' |
2551 | --- hooks/charmhelpers/fetch/giturl.py 2015-02-18 14:24:32 +0000 |
2552 | +++ hooks/charmhelpers/fetch/giturl.py 2015-06-30 07:36:26 +0000 |
2553 | @@ -45,14 +45,16 @@ |
2554 | else: |
2555 | return True |
2556 | |
2557 | - def clone(self, source, dest, branch): |
2558 | + def clone(self, source, dest, branch, depth=None): |
2559 | if not self.can_handle(source): |
2560 | raise UnhandledSource("Cannot handle {}".format(source)) |
2561 | |
2562 | - repo = Repo.clone_from(source, dest) |
2563 | - repo.git.checkout(branch) |
2564 | + if depth: |
2565 | + Repo.clone_from(source, dest, branch=branch, depth=depth) |
2566 | + else: |
2567 | + Repo.clone_from(source, dest, branch=branch) |
2568 | |
2569 | - def install(self, source, branch="master", dest=None): |
2570 | + def install(self, source, branch="master", dest=None, depth=None): |
2571 | url_parts = self.parse_url(source) |
2572 | branch_name = url_parts.path.strip("/").split("/")[-1] |
2573 | if dest: |
2574 | @@ -63,7 +65,7 @@ |
2575 | if not os.path.exists(dest_dir): |
2576 | mkdir(dest_dir, perms=0o755) |
2577 | try: |
2578 | - self.clone(source, dest_dir, branch) |
2579 | + self.clone(source, dest_dir, branch, depth) |
2580 | except GitCommandError as e: |
2581 | raise UnhandledSource(e.message) |
2582 | except OSError as e: |
2583 | |
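An illustrative use of the new optional depth argument (the repository URL is a placeholder):

    from charmhelpers.fetch.giturl import GitUrlFetchHandler

    handler = GitUrlFetchHandler()
    # depth=1 requests a shallow clone; omitting depth keeps the old
    # full-clone behaviour.
    handler.install('git://example.com/project', branch='master', depth=1)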
2584 | === modified symlink 'hooks/cluster-relation-changed' (properties changed: -x to +x) |
2585 | === target was u'hooks.py' |
2586 | --- hooks/cluster-relation-changed 1970-01-01 00:00:00 +0000 |
2587 | +++ hooks/cluster-relation-changed 2015-06-30 07:36:26 +0000 |
2588 | @@ -0,0 +1,20 @@ |
2589 | +#!/usr/bin/python3 |
2590 | +# Copyright 2015 Canonical Ltd. |
2591 | +# |
2592 | +# This file is part of the Cassandra Charm for Juju. |
2593 | +# |
2594 | +# This program is free software: you can redistribute it and/or modify |
2595 | +# it under the terms of the GNU General Public License version 3, as |
2596 | +# published by the Free Software Foundation. |
2597 | +# |
2598 | +# This program is distributed in the hope that it will be useful, but |
2599 | +# WITHOUT ANY WARRANTY; without even the implied warranties of |
2600 | +# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR |
2601 | +# PURPOSE. See the GNU General Public License for more details. |
2602 | +# |
2603 | +# You should have received a copy of the GNU General Public License |
2604 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
2605 | +import hooks |
2606 | +if __name__ == '__main__': |
2607 | + hooks.bootstrap() |
2608 | + hooks.default_hook() |
2609 | |
2610 | === modified symlink 'hooks/cluster-relation-departed' (properties changed: -x to +x) |
2611 | === target was u'hooks.py' |
2612 | --- hooks/cluster-relation-departed 1970-01-01 00:00:00 +0000 |
2613 | +++ hooks/cluster-relation-departed 2015-06-30 07:36:26 +0000 |
2614 | @@ -0,0 +1,20 @@ |
2615 | +#!/usr/bin/python3 |
2616 | +# Copyright 2015 Canonical Ltd. |
2617 | +# |
2618 | +# This file is part of the Cassandra Charm for Juju. |
2619 | +# |
2620 | +# This program is free software: you can redistribute it and/or modify |
2621 | +# it under the terms of the GNU General Public License version 3, as |
2622 | +# published by the Free Software Foundation. |
2623 | +# |
2624 | +# This program is distributed in the hope that it will be useful, but |
2625 | +# WITHOUT ANY WARRANTY; without even the implied warranties of |
2626 | +# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR |
2627 | +# PURPOSE. See the GNU General Public License for more details. |
2628 | +# |
2629 | +# You should have received a copy of the GNU General Public License |
2630 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
2631 | +import hooks |
2632 | +if __name__ == '__main__': |
2633 | + hooks.bootstrap() |
2634 | + hooks.default_hook() |
2635 | |
2636 | === removed symlink 'hooks/cluster-relation-joined' |
2637 | === target was u'hooks.py' |
2638 | === modified symlink 'hooks/config-changed' (properties changed: -x to +x) |
2639 | === target was u'hooks.py' |
2640 | --- hooks/config-changed 1970-01-01 00:00:00 +0000 |
2641 | +++ hooks/config-changed 2015-06-30 07:36:26 +0000 |
2642 | @@ -0,0 +1,20 @@ |
2643 | +#!/usr/bin/python3 |
2644 | +# Copyright 2015 Canonical Ltd. |
2645 | +# |
2646 | +# This file is part of the Cassandra Charm for Juju. |
2647 | +# |
2648 | +# This program is free software: you can redistribute it and/or modify |
2649 | +# it under the terms of the GNU General Public License version 3, as |
2650 | +# published by the Free Software Foundation. |
2651 | +# |
2652 | +# This program is distributed in the hope that it will be useful, but |
2653 | +# WITHOUT ANY WARRANTY; without even the implied warranties of |
2654 | +# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR |
2655 | +# PURPOSE. See the GNU General Public License for more details. |
2656 | +# |
2657 | +# You should have received a copy of the GNU General Public License |
2658 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
2659 | +import hooks |
2660 | +if __name__ == '__main__': |
2661 | + hooks.bootstrap() |
2662 | + hooks.default_hook() |
2663 | |
2664 | === added file 'hooks/coordinator.py' |
2665 | --- hooks/coordinator.py 1970-01-01 00:00:00 +0000 |
2666 | +++ hooks/coordinator.py 2015-06-30 07:36:26 +0000 |
2667 | @@ -0,0 +1,35 @@ |
2668 | +# Copyright 2015 Canonical Ltd. |
2669 | +# |
2670 | +# This file is part of the Cassandra Charm for Juju. |
2671 | +# |
2672 | +# This program is free software: you can redistribute it and/or modify |
2673 | +# it under the terms of the GNU General Public License version 3, as |
2674 | +# published by the Free Software Foundation. |
2675 | +# |
2676 | +# This program is distributed in the hope that it will be useful, but |
2677 | +# WITHOUT ANY WARRANTY; without even the implied warranties of |
2678 | +# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR |
2679 | +# PURPOSE. See the GNU General Public License for more details. |
2680 | +# |
2681 | +# You should have received a copy of the GNU General Public License |
2682 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
2683 | +from charmhelpers.coordinator import BaseCoordinator |
2684 | + |
2685 | + |
2686 | +class CassandraCoordinator(BaseCoordinator): |
2687 | + def default_grant(self, lock, unit, granted, queue): |
2688 | + '''Grant locks to only one unit at a time, regardless of the lock name. |
2689 | + |
2690 | + This lets us keep separate locks like repair and restart, |
2691 | + while ensuring the operations do not occur on different nodes |
2692 | + at the same time. |
2693 | + ''' |
2694 | + # Return True if this unit has already been granted a lock. |
2695 | + if self.grants.get(unit): |
2696 | + return True |
2697 | + |
2698 | + # Otherwise, grant only if the unit is first in the queue and no lock is currently granted. |
2699 | + return queue[0] == unit and not granted |
2700 | + |
2701 | + |
2702 | +coordinator = CassandraCoordinator() |
2703 | |
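The rest of the charm is expected to drive this coordinator with the usual charmhelpers acquire/granted pattern: one hook run queues a lock request, and a later run acts once the leader has granted it. A minimal sketch under that assumption; restart_cassandra() is a placeholder for the charm's real restart action:

    # Minimal sketch, assuming the BaseCoordinator acquire/granted API.
    from coordinator import coordinator

    def maybe_restart():
        coordinator.acquire('restart')  # Queue the request (idempotent).
        if coordinator.granted('restart'):
            # default_grant() above guarantees only one unit in the
            # service reaches this branch at a time, even if another
            # unit is waiting on a different lock such as 'repair'.
            restart_cassandra()  # Placeholder for the real restart.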
2704 | === modified symlink 'hooks/data-relation-changed' (properties changed: -x to +x) |
2705 | === target was u'hooks.py' |
2706 | --- hooks/data-relation-changed 1970-01-01 00:00:00 +0000 |
2707 | +++ hooks/data-relation-changed 2015-06-30 07:36:26 +0000 |
2708 | @@ -0,0 +1,20 @@ |
2709 | +#!/usr/bin/python3 |
2710 | +# Copyright 2015 Canonical Ltd. |
2711 | +# |
2712 | +# This file is part of the Cassandra Charm for Juju. |
2713 | +# |
2714 | +# This program is free software: you can redistribute it and/or modify |
2715 | +# it under the terms of the GNU General Public License version 3, as |
2716 | +# published by the Free Software Foundation. |
2717 | +# |
2718 | +# This program is distributed in the hope that it will be useful, but |
2719 | +# WITHOUT ANY WARRANTY; without even the implied warranties of |
2720 | +# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR |
2721 | +# PURPOSE. See the GNU General Public License for more details. |
2722 | +# |
2723 | +# You should have received a copy of the GNU General Public License |
2724 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
2725 | +import hooks |
2726 | +if __name__ == '__main__': |
2727 | + hooks.bootstrap() |
2728 | + hooks.default_hook() |
2729 | |
2730 | === modified symlink 'hooks/data-relation-departed' (properties changed: -x to +x) |
2731 | === target was u'hooks.py' |
2732 | --- hooks/data-relation-departed 1970-01-01 00:00:00 +0000 |
2733 | +++ hooks/data-relation-departed 2015-06-30 07:36:26 +0000 |
2734 | @@ -0,0 +1,20 @@ |
2735 | +#!/usr/bin/python3 |
2736 | +# Copyright 2015 Canonical Ltd. |
2737 | +# |
2738 | +# This file is part of the Cassandra Charm for Juju. |
2739 | +# |
2740 | +# This program is free software: you can redistribute it and/or modify |
2741 | +# it under the terms of the GNU General Public License version 3, as |
2742 | +# published by the Free Software Foundation. |
2743 | +# |
2744 | +# This program is distributed in the hope that it will be useful, but |
2745 | +# WITHOUT ANY WARRANTY; without even the implied warranties of |
2746 | +# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR |
2747 | +# PURPOSE. See the GNU General Public License for more details. |
2748 | +# |
2749 | +# You should have received a copy of the GNU General Public License |
2750 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
2751 | +import hooks |
2752 | +if __name__ == '__main__': |
2753 | + hooks.bootstrap() |
2754 | + hooks.default_hook() |
2755 | |
2756 | === modified symlink 'hooks/database-admin-relation-changed' (properties changed: -x to +x) |
2757 | === target was u'hooks.py' |
2758 | --- hooks/database-admin-relation-changed 1970-01-01 00:00:00 +0000 |
2759 | +++ hooks/database-admin-relation-changed 2015-06-30 07:36:26 +0000 |
2760 | @@ -0,0 +1,20 @@ |
2761 | +#!/usr/bin/python3 |
2762 | +# Copyright 2015 Canonical Ltd. |
2763 | +# |
2764 | +# This file is part of the Cassandra Charm for Juju. |
2765 | +# |
2766 | +# This program is free software: you can redistribute it and/or modify |
2767 | +# it under the terms of the GNU General Public License version 3, as |
2768 | +# published by the Free Software Foundation. |
2769 | +# |
2770 | +# This program is distributed in the hope that it will be useful, but |
2771 | +# WITHOUT ANY WARRANTY; without even the implied warranties of |
2772 | +# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR |
2773 | +# PURPOSE. See the GNU General Public License for more details. |
2774 | +# |
2775 | +# You should have received a copy of the GNU General Public License |
2776 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
2777 | +import hooks |
2778 | +if __name__ == '__main__': |
2779 | + hooks.bootstrap() |
2780 | + hooks.default_hook() |
2781 | |
2782 | === modified symlink 'hooks/database-relation-changed' (properties changed: -x to +x) |
2783 | === target was u'hooks.py' |
2784 | --- hooks/database-relation-changed 1970-01-01 00:00:00 +0000 |
2785 | +++ hooks/database-relation-changed 2015-06-30 07:36:26 +0000 |
2786 | @@ -0,0 +1,20 @@ |
2787 | +#!/usr/bin/python3 |
2788 | +# Copyright 2015 Canonical Ltd. |
2789 | +# |
2790 | +# This file is part of the Cassandra Charm for Juju. |
2791 | +# |
2792 | +# This program is free software: you can redistribute it and/or modify |
2793 | +# it under the terms of the GNU General Public License version 3, as |
2794 | +# published by the Free Software Foundation. |
2795 | +# |
2796 | +# This program is distributed in the hope that it will be useful, but |
2797 | +# WITHOUT ANY WARRANTY; without even the implied warranties of |
2798 | +# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR |
2799 | +# PURPOSE. See the GNU General Public License for more details. |
2800 | +# |
2801 | +# You should have received a copy of the GNU General Public License |
2802 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
2803 | +import hooks |
2804 | +if __name__ == '__main__': |
2805 | + hooks.bootstrap() |
2806 | + hooks.default_hook() |
2807 | |
2808 | === modified file 'hooks/definitions.py' |
2809 | --- hooks/definitions.py 2015-04-13 16:39:20 +0000 |
2810 | +++ hooks/definitions.py 2015-06-30 07:36:26 +0000 |
2811 | @@ -15,13 +15,11 @@ |
2812 | # along with this program. If not, see <http://www.gnu.org/licenses/>. |
2813 | |
2814 | from charmhelpers.core import hookenv |
2815 | -from charmhelpers.core.hookenv import ERROR |
2816 | from charmhelpers.core import services |
2817 | |
2818 | import actions |
2819 | import helpers |
2820 | import relations |
2821 | -import rollingrestart |
2822 | |
2823 | |
2824 | def get_service_definitions(): |
2825 | @@ -35,32 +33,13 @@ |
2826 | config = hookenv.config() |
2827 | |
2828 | return [ |
2829 | - # Actions done before or while the Cassandra service is running. |
2830 | - dict(service=helpers.get_cassandra_service(), |
2831 | - |
2832 | - # Open access to client and replication ports. Client |
2833 | - # protocols require password authentication. Access to |
2834 | - # the unauthenticated replication ports is protected via |
2835 | - # ufw firewall rules. We do not open the JMX port, although |
2836 | - # we could since it is similarly protected by ufw. |
2837 | - ports=[config['rpc_port'], # Thrift clients |
2838 | - config['native_transport_port'], # Native clients. |
2839 | - config['storage_port'], # Plaintext replication |
2840 | - config['ssl_storage_port']], # Encrypted replication. |
2841 | - |
2842 | - required_data=[relations.StorageRelation()], |
2843 | - provided_data=[relations.StorageRelation()], |
2844 | + # Prepare for the Cassandra service. |
2845 | + dict(service='install', |
2846 | data_ready=[actions.set_proxy, |
2847 | actions.preinstall, |
2848 | actions.emit_meminfo, |
2849 | actions.revert_unchangeable_config, |
2850 | actions.store_unit_private_ip, |
2851 | - actions.set_unit_zero_bootstrapped, |
2852 | - actions.shutdown_before_joining_peers, |
2853 | - # Must open ports before attempting bind to the |
2854 | - # public ip address. |
2855 | - actions.configure_firewall, |
2856 | - actions.grant_ssh_access, |
2857 | actions.add_implicit_package_signing_keys, |
2858 | actions.configure_sources, |
2859 | actions.swapoff, |
2860 | @@ -68,65 +47,75 @@ |
2861 | actions.install_oracle_jre, |
2862 | actions.install_cassandra_packages, |
2863 | actions.emit_java_version, |
2864 | - actions.ensure_cassandra_package_status, |
2865 | + actions.ensure_cassandra_package_status], |
2866 | + start=[], stop=[]), |
2867 | + |
2868 | + # Get Cassandra running. |
2869 | + dict(service=helpers.get_cassandra_service(), |
2870 | + |
2871 | + # Open access to client and replication ports. Client |
2872 | + # protocols require password authentication. Access to |
2873 | + # the unauthenticated replication ports is protected via |
2874 | + # ufw firewall rules. We do not open the JMX port, although |
2875 | + # we could since it is similarly protected by ufw. |
2876 | + ports=[config['rpc_port'], # Thrift clients |
2877 | + config['native_transport_port'], # Native clients. |
2878 | + config['storage_port'], # Plaintext replication |
2879 | + config['ssl_storage_port']], # Encrypted replication. |
2880 | + |
2881 | + required_data=[relations.StorageRelation(), |
2882 | + relations.PeerRelation()], |
2883 | + provided_data=[relations.StorageRelation()], |
2884 | + data_ready=[actions.configure_firewall, |
2885 | + actions.maintain_seeds, |
2886 | actions.configure_cassandra_yaml, |
2887 | actions.configure_cassandra_env, |
2888 | actions.configure_cassandra_rackdc, |
2889 | actions.reset_all_io_schedulers, |
2890 | - actions.nrpe_external_master_relation, |
2891 | - actions.maybe_schedule_restart], |
2892 | + actions.maybe_restart, |
2893 | + actions.request_unit_superuser, |
2894 | + actions.reset_default_password], |
2895 | start=[services.open_ports], |
2896 | stop=[actions.stop_cassandra, services.close_ports]), |
2897 | |
2898 | - # Rolling restart. This service will call the restart hook when |
2899 | - # it is this units turn to restart. This is also where we do |
2900 | - # actions done while Cassandra is not running, and where we do |
2901 | - # actions that should only be done by one node at a time. |
2902 | - rollingrestart.make_service([helpers.stop_cassandra, |
2903 | - helpers.remount_cassandra, |
2904 | - helpers.ensure_database_directories, |
2905 | - helpers.pre_bootstrap, |
2906 | - helpers.start_cassandra, |
2907 | - helpers.post_bootstrap, |
2908 | - helpers.wait_for_agreed_schema, |
2909 | - helpers.wait_for_normality, |
2910 | - helpers.emit_describe_cluster, |
2911 | - helpers.reset_default_password, |
2912 | - helpers.ensure_unit_superuser, |
2913 | - helpers.reset_auth_keyspace_replication]), |
2914 | - |
2915 | # Actions that must be done while Cassandra is running. |
2916 | dict(service='post', |
2917 | required_data=[RequiresLiveNode()], |
2918 | - data_ready=[actions.publish_database_relations, |
2919 | + data_ready=[actions.post_bootstrap, |
2920 | + actions.create_unit_superusers, |
2921 | + actions.reset_auth_keyspace_replication, |
2922 | + actions.publish_database_relations, |
2923 | actions.publish_database_admin_relations, |
2924 | actions.install_maintenance_crontab, |
2925 | - actions.emit_describe_cluster, |
2926 | - actions.emit_auth_keyspace_status, |
2927 | - actions.emit_netstats], |
2928 | + actions.nrpe_external_master_relation, |
2929 | + actions.emit_cluster_info, |
2930 | + actions.set_active], |
2931 | start=[], stop=[])] |
2932 | |
2933 | |
2934 | class RequiresLiveNode: |
2935 | def __bool__(self): |
2936 | - return self.is_live() |
2937 | + is_live = self.is_live() |
2938 | + hookenv.log('Requirement RequiresLiveNode: {}'.format(is_live), |
2939 | + hookenv.DEBUG) |
2940 | + return is_live |
2941 | |
2942 | def is_live(self): |
2943 | + if helpers.is_decommissioned(): |
2944 | + hookenv.log('Node is decommissioned') |
2945 | + return False |
2946 | + |
2947 | if helpers.is_cassandra_running(): |
2948 | - if helpers.is_decommissioned(): |
2949 | - # Node is decommissioned and will refuse to talk. |
2950 | - hookenv.log('Node is decommissioned') |
2951 | - return False |
2952 | - try: |
2953 | - with helpers.connect(): |
2954 | - hookenv.log("Node live and authentication working") |
2955 | - return True |
2956 | - except Exception as x: |
2957 | - hookenv.log( |
2958 | - 'Unable to connect as superuser: {}'.format(str(x)), |
2959 | - ERROR) |
2960 | - return False |
2961 | - return False |
2962 | + hookenv.log('Cassandra is running') |
2963 | + if hookenv.local_unit() in helpers.get_unit_superusers(): |
2964 | + hookenv.log('Credentials created') |
2965 | + return True |
2966 | + else: |
2967 | + hookenv.log('Credentials have not been created') |
2968 | + return False |
2969 | + else: |
2970 | + hookenv.log('Cassandra is not running') |
2971 | + return False |
2972 | |
2973 | |
2974 | def get_service_manager(): |
2975 | |
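The services framework consumes RequiresLiveNode by plain truth-testing: a service's data_ready callbacks run only once every entry in required_data evaluates true. A sketch of the assumed readiness check in charmhelpers.core.services.base:

    # Sketch of the assumed readiness check in the services framework.
    def is_ready(service_definition):
        # bool(RequiresLiveNode()) invokes __bool__ above, which checks
        # that Cassandra is running and this unit's superuser exists.
        return all(bool(requirement)
                   for requirement in service_definition.get('required_data', []))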
2976 | === modified file 'hooks/helpers.py' |
2977 | --- hooks/helpers.py 2015-04-17 10:06:34 +0000 |
2978 | +++ hooks/helpers.py 2015-06-30 07:36:26 +0000 |
2979 | @@ -13,14 +13,12 @@ |
2980 | # |
2981 | # You should have received a copy of the GNU General Public License |
2982 | # along with this program. If not, see <http://www.gnu.org/licenses/>. |
2983 | - |
2984 | import configparser |
2985 | from contextlib import contextmanager |
2986 | from datetime import timedelta |
2987 | import errno |
2988 | from functools import wraps |
2989 | import io |
2990 | -from itertools import chain |
2991 | import json |
2992 | import os.path |
2993 | import re |
2994 | @@ -37,13 +35,11 @@ |
2995 | import cassandra.query |
2996 | import yaml |
2997 | |
2998 | -from charmhelpers.contrib import unison |
2999 | from charmhelpers.core import hookenv, host |
3000 | from charmhelpers.core.hookenv import DEBUG, ERROR, WARNING |
3001 | from charmhelpers import fetch |
3002 | |
3003 | -import relations |
3004 | -import rollingrestart |
3005 | +from coordinator import coordinator |
3006 | |
3007 | |
3008 | RESTART_TIMEOUT = 600 |
3009 | @@ -101,10 +97,9 @@ |
3010 | if hookenv.config('extra_packages'): |
3011 | packages.extend(hookenv.config('extra_packages').split()) |
3012 | packages = fetch.filter_installed_packages(packages) |
3013 | - # if 'ntp' in packages: |
3014 | - # fetch.apt_install(['ntp'], fatal=True) # With autostart |
3015 | - # packages.remove('ntp') |
3016 | if packages: |
3017 | + # The DSE packages are huge, so this might take some time. |
3018 | + status_set('maintenance', 'Installing packages') |
3019 | with autostart_disabled(['cassandra']): |
3020 | fetch.apt_install(packages, fatal=True) |
3021 | |
3022 | @@ -128,26 +123,13 @@ |
3023 | dpkg.communicate(input=''.join(selections).encode('US-ASCII')) |
3024 | |
3025 | |
3026 | -def seed_ips(): |
3027 | +def get_seed_ips(): |
3028 | '''Return the set of seed ip addresses. |
3029 | |
3030 | - If this is the only unit in the service, then the local unit's IP |
3031 | - address is returned as the only seed. |
3032 | - |
3033 | - If there is more than one unit in the service, then the IP addresses |
3034 | - of three lowest numbered peers (if bootstrapped) are returned. |
3035 | + We use IP addresses rather than unit names, as we may need to use |
3036 | + external seed IPs at some point. |
3037 | ''' |
3038 | - peers = rollingrestart.get_peers() |
3039 | - if not peers: |
3040 | - return set([hookenv.unit_private_ip()]) |
3041 | - |
3042 | - # The three lowest numbered units are seeds & may include this unit. |
3043 | - seeds = sorted(list(peers) + [hookenv.local_unit()], |
3044 | - key=lambda x: int(x.split('/')[-1]))[:3] |
3045 | - |
3046 | - relid = rollingrestart.get_peer_relation_id() |
3047 | - return set(hookenv.relation_get('private-address', seed, relid) |
3048 | - for seed in seeds) |
3049 | + return set((hookenv.leader_get('seeds') or '').split(',')) |
3050 | |
3051 | |
3052 | def actual_seed_ips(): |
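The write side of get_seed_ips() lives in actions.maintain_seeds, where the leader publishes the list through leadership settings. An illustrative round trip, assuming only that leader_set/leader_get behave as in Juju 1.24:

    # Illustrative write side; the charm's real seed selection is in
    # actions.maintain_seeds. Only the leader may call leader_set().
    from charmhelpers.core import hookenv

    def publish_seeds(ips):
        assert hookenv.is_leader()
        hookenv.leader_set(seeds=','.join(sorted(ips)))

    # Any unit can then read the list back, exactly as get_seed_ips() does.
    seeds = set((hookenv.leader_get('seeds') or '').split(','))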
3053 | @@ -163,6 +145,7 @@ |
3054 | Entries in the config file may be absolute, relative to |
3055 | /var/lib/cassandra, or relative to the mountpoint. |
3056 | ''' |
3057 | + import relations |
3058 | storage = relations.StorageRelation() |
3059 | if storage.mountpoint: |
3060 | root = os.path.join(storage.mountpoint, 'cassandra') |
3061 | @@ -177,9 +160,6 @@ |
3062 | |
3063 | Returns the absolute path. |
3064 | ''' |
3065 | - # Guard against changing perms on a running db. Although probably |
3066 | - # harmless, it causes shutil.chown() to fail. |
3067 | - assert not is_cassandra_running() |
3068 | absdir = get_database_directory(config_path) |
3069 | |
3070 | # Work around Bug #1427150 by ensuring components of the path are |
3071 | @@ -322,9 +302,7 @@ |
3072 | |
3073 | def get_cassandra_version(): |
3074 | if get_cassandra_edition() == 'dse': |
3075 | - # When we support multiple versions, we will need to map |
3076 | - # DataStax versions to Cassandra versions. |
3077 | - return '2.0' if get_package_version('dse-full') else None |
3078 | + return '2.1' if get_package_version('dse-full') else None |
3079 | return get_package_version('cassandra') |
3080 | |
3081 | |
3082 | @@ -352,8 +330,6 @@ |
3083 | edition = get_cassandra_edition() |
3084 | if edition == 'dse': |
3085 | pid_file = "/var/run/dse/dse.pid" |
3086 | - # elif apt_pkg.version_compare(get_cassandra_version(), "2.0") < 0: |
3087 | - # pid_file = "/var/run/cassandra.pid" |
3088 | else: |
3089 | pid_file = "/var/run/cassandra/cassandra.pid" |
3090 | return pid_file |
3091 | @@ -377,20 +353,20 @@ |
3092 | # agreement. |
3093 | pass |
3094 | else: |
3095 | + # NB. OpenJDK 8 not available in trusty. |
3096 | packages.add('openjdk-7-jre-headless') |
3097 | |
3098 | return packages |
3099 | |
3100 | |
3101 | @logged |
3102 | -def stop_cassandra(immediate=False): |
3103 | +def stop_cassandra(): |
3104 | if is_cassandra_running(): |
3105 | - if not immediate: |
3106 | - # If there are cluster operations in progress, wait until |
3107 | - # they are complete before restarting. This might take days. |
3108 | - wait_for_normality() |
3109 | + hookenv.log('Shutting down Cassandra') |
3110 | host.service_stop(get_cassandra_service()) |
3111 | - assert not is_cassandra_running() |
3112 | + if is_cassandra_running(): |
3113 | + hookenv.status_set('blocked', 'Cassandra failed to shut down') |
3114 | + raise SystemExit(0) |
3115 | |
3116 | |
3117 | @logged |
3118 | @@ -398,9 +374,19 @@ |
3119 | if is_cassandra_running(): |
3120 | return |
3121 | |
3122 | - actual_seeds = actual_seed_ips() |
3123 | + actual_seeds = sorted(actual_seed_ips()) |
3124 | assert actual_seeds, 'Attempting to start cassandra with empty seed list' |
3125 | - hookenv.log('Starting Cassandra with seeds {!r}'.format(actual_seeds)) |
3126 | + hookenv.config()['configured_seeds'] = actual_seeds |
3127 | + |
3128 | + if is_bootstrapped(): |
3129 | + status_set('maintenance', |
3130 | + 'Starting Cassandra with seeds {!r}' |
3131 | + .format(','.join(actual_seeds))) |
3132 | + else: |
3133 | + status_set('maintenance', |
3134 | + 'Bootstrapping with seeds {}' |
3135 | + .format(','.join(actual_seeds))) |
3136 | + |
3137 | host.service_start(get_cassandra_service()) |
3138 | |
3139 | # Wait for Cassandra to actually start, or abort. |
3140 | @@ -409,35 +395,8 @@ |
3141 | if is_cassandra_running(): |
3142 | return |
3143 | time.sleep(1) |
3144 | - hookenv.log('Cassandra failed to start.', ERROR) |
3145 | - raise SystemExit(1) |
3146 | - |
3147 | - |
3148 | -def is_responding(ip, port=None, timeout=5): |
3149 | - if port is None: |
3150 | - port = hookenv.config()['storage_port'] |
3151 | - try: |
3152 | - subprocess.check_output(['nc', '-zw', str(timeout), ip, str(port)], |
3153 | - stderr=subprocess.STDOUT) |
3154 | - hookenv.log('{} listening on port {}'.format(ip, port), DEBUG) |
3155 | - return True |
3156 | - except subprocess.TimeoutExpired: |
3157 | - hookenv.log('{} not listening on port {}'.format(ip, port), DEBUG) |
3158 | - return False |
3159 | - |
3160 | - |
3161 | -@logged |
3162 | -def are_all_nodes_responding(): |
3163 | - all_contactable = True |
3164 | - for ip in node_ips(): |
3165 | - if ip == hookenv.unit_private_ip(): |
3166 | - pass |
3167 | - elif is_responding(ip, timeout=5): |
3168 | - hookenv.log('{} is responding'.format(ip)) |
3169 | - else: |
3170 | - all_contactable = False |
3171 | - hookenv.log('{} is not responding'.format(ip)) |
3172 | - return all_contactable |
3173 | + status_set('blocked', 'Cassandra failed to start') |
3174 | + raise SystemExit(0) |
3175 | |
3176 | |
3177 | @logged |
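Both failure paths above use the same idiom: record a 'blocked' workload status, then exit zero so Juju reports the blocked state instead of a hook error. As a standalone sketch:

    # The blocked-and-bail idiom from stop_cassandra()/start_cassandra().
    # Exiting 0 avoids Juju's hook error state; the 'blocked' status is
    # what the operator sees.
    def abort_blocked(message):
        status_set('blocked', message)
        raise SystemExit(0)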
3178 | @@ -451,8 +410,10 @@ |
3179 | def remount_cassandra(): |
3180 | '''If a new mountpoint is ready, migrate data across to it.''' |
3181 | assert not is_cassandra_running() # Guard against data loss. |
3182 | + import relations |
3183 | storage = relations.StorageRelation() |
3184 | if storage.needs_remount(): |
3185 | + status_set('maintenance', 'Migrating data to new mountpoint') |
3186 | hookenv.config()['bootstrapped_into_cluster'] = False |
3187 | if storage.mountpoint is None: |
3188 | hookenv.log('External storage AND DATA gone. ' |
3189 | @@ -468,6 +429,9 @@ |
3190 | @logged |
3191 | def ensure_database_directories(): |
3192 | '''Ensure that directories Cassandra expects to store its data in exist.''' |
3193 | + # Guard against changing perms on a running db. Although probably |
3194 | + # harmless, it causes shutil.chown() to fail. |
3195 | + assert not is_cassandra_running() |
3196 | db_dirs = get_all_database_directories() |
3197 | unpacked_db_dirs = (db_dirs['data_file_directories'] + |
3198 | [db_dirs['commitlog_directory']] + |
3199 | @@ -476,28 +440,7 @@ |
3200 | ensure_database_directory(db_dir) |
3201 | |
3202 | |
3203 | -@logged |
3204 | -def reset_default_password(): |
3205 | - config = hookenv.config() |
3206 | - if config.get('default_admin_password_changed', False): |
3207 | - hookenv.log('Default admin password changed in an earlier hook') |
3208 | - return |
3209 | - |
3210 | - # If we can connect using the default superuser password |
3211 | - # 'cassandra', change it to something random. We do this in a loop |
3212 | - # to ensure the hook that calls this is idempotent. |
3213 | - try: |
3214 | - with connect('cassandra', 'cassandra') as session: |
3215 | - hookenv.log('Changing default admin password') |
3216 | - query(session, 'ALTER USER cassandra WITH PASSWORD %s', |
3217 | - ConsistencyLevel.QUORUM, (host.pwgen(),)) |
3218 | - except cassandra.AuthenticationFailed: |
3219 | - hookenv.log('Default admin password already changed') |
3220 | - |
3221 | - config['default_admin_password_changed'] = True |
3222 | - |
3223 | - |
3224 | -CONNECT_TIMEOUT = 300 |
3225 | +CONNECT_TIMEOUT = 10 |
3226 | |
3227 | |
3228 | @contextmanager |
3229 | @@ -568,72 +511,45 @@ |
3230 | raise |
3231 | |
3232 | |
3233 | -def ensure_user(username, password, superuser=False): |
3234 | +def encrypt_password(password): |
3235 | + return bcrypt.hashpw(password, bcrypt.gensalt()) |
3236 | + |
3237 | + |
3238 | +@logged |
3239 | +def ensure_user(session, username, encrypted_password, superuser=False): |
3240 | '''Create the DB user if it doesn't already exist & reset the password.''' |
3241 | if superuser: |
3242 | hookenv.log('Creating SUPERUSER {}'.format(username)) |
3243 | - sup = 'SUPERUSER' |
3244 | else: |
3245 | hookenv.log('Creating user {}'.format(username)) |
3246 | - sup = 'NOSUPERUSER' |
3247 | - with connect() as session: |
3248 | - query(session, |
3249 | - 'CREATE USER IF NOT EXISTS %s ' |
3250 | - 'WITH PASSWORD %s {}'.format(sup), |
3251 | - ConsistencyLevel.QUORUM, (username, password,)) |
3252 | - query(session, 'ALTER USER %s WITH PASSWORD %s {}'.format(sup), |
3253 | - ConsistencyLevel.QUORUM, (username, password,)) |
3254 | - |
3255 | - |
3256 | -@logged |
3257 | -def ensure_unit_superuser(): |
3258 | - '''If the unit's superuser account is not working, recreate it.''' |
3259 | - try: |
3260 | - with connect(auth_timeout=10): |
3261 | - hookenv.log('Unit superuser account already setup', DEBUG) |
3262 | - return |
3263 | - except cassandra.AuthenticationFailed: |
3264 | - pass |
3265 | - |
3266 | - create_unit_superuser() # Doesn't exist or can't access, so create it. |
3267 | - |
3268 | - with connect(): |
3269 | - hookenv.log('Unit superuser password reset successful') |
3270 | - |
3271 | - |
3272 | -@logged |
3273 | -def create_unit_superuser(): |
3274 | + query(session, |
3275 | + 'INSERT INTO system_auth.users (name, super) VALUES (%s, %s)', |
3276 | + ConsistencyLevel.ALL, (username, superuser)) |
3277 | + query(session, |
3278 | + 'INSERT INTO system_auth.credentials (username, salted_hash) ' |
3279 | + 'VALUES (%s, %s)', |
3280 | + ConsistencyLevel.ALL, (username, encrypted_password)) |
3281 | + |
3282 | + |
3283 | +@logged |
3284 | +def create_unit_superuser_hard(): |
3285 | '''Create or recreate the unit's superuser account. |
3286 | |
3287 | - As there may be no known superuser credentials to use, we restart |
3288 | - the node using the AllowAllAuthenticator and insert our user |
3289 | - directly into the system_auth keyspace. |
3290 | + This method is used when there are no known superuser credentials |
3291 | + to use. We restart the node using the AllowAllAuthenticator and |
3292 | + insert our credentials directly into the system_auth keyspace. |
3293 | ''' |
3294 | username, password = superuser_credentials() |
3295 | + pwhash = encrypt_password(password) |
3296 | hookenv.log('Creating unit superuser {}'.format(username)) |
3297 | |
3298 | # Restart cassandra without authentication & listening on localhost. |
3299 | - wait_for_normality() |
3300 | reconfigure_and_restart_cassandra( |
3301 | dict(authenticator='AllowAllAuthenticator', rpc_address='localhost')) |
3302 | - wait_for_normality() |
3303 | - emit_cluster_info() |
3304 | for _ in backoff('superuser creation'): |
3305 | try: |
3306 | with connect() as session: |
3307 | - pwhash = bcrypt.hashpw(password, |
3308 | - bcrypt.gensalt()) # Cassandra 2.1 |
3309 | - statement = dedent('''\ |
3310 | - INSERT INTO system_auth.users (name, super) |
3311 | - VALUES (%s, TRUE) |
3312 | - ''') |
3313 | - query(session, statement, ConsistencyLevel.QUORUM, (username,)) |
3314 | - statement = dedent('''\ |
3315 | - INSERT INTO system_auth.credentials (username, salted_hash) |
3316 | - VALUES (%s, %s) |
3317 | - ''') |
3318 | - query(session, statement, |
3319 | - ConsistencyLevel.QUORUM, (username, pwhash)) |
3320 | + ensure_user(session, username, pwhash, superuser=True) |
3321 | break |
3322 | except Exception as x: |
3323 | print(str(x)) |
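Because ensure_user() now writes directly into system_auth rather than issuing CREATE USER, it works even while the AllowAllAuthenticator is active. A usage sketch; the credentials are made up for illustration, and real callers pass superuser_credentials():

    # Hypothetical credentials for illustration only.
    with connect() as session:
        ensure_user(session, 'example_user',
                    encrypt_password('example_password'),
                    superuser=False)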
3324 | @@ -641,7 +557,6 @@ |
3325 | # Restart Cassandra with regular config. |
3326 | nodetool('flush') # Ensure our backdoor updates are flushed. |
3327 | reconfigure_and_restart_cassandra() |
3328 | - wait_for_normality() |
3329 | |
3330 | |
3331 | def get_cqlshrc_path(): |
3332 | @@ -703,12 +618,8 @@ |
3333 | sys.stdout.flush() |
3334 | |
3335 | |
3336 | -def nodetool(*cmd, ip=None, timeout=120): |
3337 | +def nodetool(*cmd, timeout=120): |
3338 | cmd = ['nodetool'] + [str(i) for i in cmd] |
3339 | - if ip is not None: |
3340 | - assert ip in unison.collect_authed_hosts( |
3341 | - rollingrestart.get_peer_relation_name()) |
3342 | - cmd = ['sudo', '-Hsu', 'juju_ssh', 'ssh', ip] + cmd |
3343 | i = 0 |
3344 | until = time.time() + timeout |
3345 | for _ in backoff('nodetool to work'): |
3346 | @@ -734,55 +645,16 @@ |
3347 | except subprocess.CalledProcessError as x: |
3348 | if i > 1: |
3349 | emit(x.output.expandtabs()) # Expand tabs for juju debug-log. |
3350 | + if not is_cassandra_running(): |
3351 | + status_set('blocked', |
3352 | + 'Cassandra has unexpectedly shut down') |
3353 | + raise SystemExit(0) |
3354 | if time.time() >= until: |
3355 | raise |
3356 | |
3357 | |
3358 | -def node_ips(): |
3359 | - '''IP addresses of all nodes in the Cassandra cluster.''' |
3360 | - if is_bootstrapped() or num_peers() == 0: |
3361 | - # If bootstrapped, or standalone, we trust our local info. |
3362 | - raw = nodetool('status', 'system_auth') |
3363 | - else: |
3364 | - # If not bootstrapped, we query a seed. |
3365 | - authed_ips = unison.collect_authed_hosts( |
3366 | - rollingrestart.get_peer_relation_name()) |
3367 | - seeds = [ip for ip in seed_ips() if ip in authed_ips] |
3368 | - if not seeds: |
3369 | - return set() # Not bootstrapped, and nobody else is either. |
3370 | - raw = nodetool('status', 'system_auth', ip=seeds[0]) |
3371 | - |
3372 | - ips = set() |
3373 | - for line in raw.splitlines(): |
3374 | - match = re.search(r'^(\w)([NLJM])\s+([\d\.]+)\s', line) |
3375 | - if match is not None and (match.group(1) == 'U' |
3376 | - or match.group(2) != 'L'): |
3377 | - ips.add(match.group(3)) |
3378 | - return ips |
3379 | - |
3380 | - |
3381 | -def up_node_ips(): |
3382 | - '''IP addresses of nodes that are up.''' |
3383 | - raw = nodetool('status', 'system_auth') |
3384 | - ips = set() |
3385 | - for line in raw.splitlines(): |
3386 | - match = re.search(r'^(\w)([NLJM])\s+([\d\.]+)\s', line) |
3387 | - if match is not None and match.group(1) == 'U': # Up |
3388 | - ips.add(match.group(3)) |
3389 | - return ips |
3390 | - |
3391 | - |
3392 | def num_nodes(): |
3393 | - '''Number of nodes in the Cassandra cluster. |
3394 | - |
3395 | - This is not necessarily the same as the number of peers, as nodes |
3396 | - may be decommissioned. |
3397 | - ''' |
3398 | - return len(node_ips()) |
3399 | - |
3400 | - |
3401 | -def num_peers(): |
3402 | - return len(rollingrestart.get_peers()) |
3403 | + return len(get_bootstrapped_ips()) |
3404 | |
3405 | |
3406 | def read_cassandra_yaml(): |
3407 | @@ -819,7 +691,8 @@ |
3408 | 'storage_port', 'ssl_storage_port'] |
3409 | cassandra_yaml.update((k, config[k]) for k in simple_config_keys) |
3410 | |
3411 | - seeds = ','.join(seeds or seed_ips()) # Don't include whitespace! |
3412 | + seeds = ','.join(seeds or get_seed_ips()) # Don't include whitespace! |
3413 | + hookenv.log('Configuring seeds as {!r}'.format(seeds), DEBUG) |
3414 | assert seeds, 'Attempting to configure cassandra with empty seed list' |
3415 | cassandra_yaml['seed_provider'][0]['parameters'][0]['seeds'] = seeds |
3416 | |
3417 | @@ -892,36 +765,6 @@ |
3418 | return False |
3419 | |
3420 | |
3421 | -@logged |
3422 | -def reset_all_io_schedulers(): |
3423 | - dirs = get_all_database_directories() |
3424 | - dirs = (dirs['data_file_directories'] + [dirs['commitlog_directory']] + |
3425 | - [dirs['saved_caches_directory']]) |
3426 | - config = hookenv.config() |
3427 | - for d in dirs: |
3428 | - if os.path.isdir(d): # Directory may not exist yet. |
3429 | - set_io_scheduler(config['io_scheduler'], d) |
3430 | - |
3431 | - |
3432 | -@logged |
3433 | -def reset_auth_keyspace_replication(): |
3434 | - # Cassandra requires you to manually set the replication factor of |
3435 | - # the system_auth keyspace, to ensure availability and redundancy. |
3436 | - n = max(min(num_nodes(), 3), 1) # Cap at 1-3 replicas. |
3437 | - datacenter = hookenv.config()['datacenter'] |
3438 | - with connect() as session: |
3439 | - strategy_opts = get_auth_keyspace_replication(session) |
3440 | - rf = int(strategy_opts.get(datacenter, -1)) |
3441 | - hookenv.log('system_auth rf={!r}'.format(strategy_opts)) |
3442 | - if rf != n: |
3443 | - strategy_opts['class'] = 'NetworkTopologyStrategy' |
3444 | - strategy_opts[datacenter] = n |
3445 | - if 'replication_factor' in strategy_opts: |
3446 | - del strategy_opts['replication_factor'] |
3447 | - set_auth_keyspace_replication(session, strategy_opts) |
3448 | - repair_auth_keyspace() |
3449 | - |
3450 | - |
3451 | def get_auth_keyspace_replication(session): |
3452 | statement = dedent('''\ |
3453 | SELECT strategy_options FROM system.schema_keyspaces |
3454 | @@ -931,321 +774,96 @@ |
3455 | return json.loads(r[0][0]) |
3456 | |
3457 | |
3458 | +@logged |
3459 | def set_auth_keyspace_replication(session, settings): |
3460 | - wait_for_normality() |
3461 | - hookenv.log('Updating system_auth rf={!r}'.format(settings)) |
3462 | + # Live operation, so keep status the same. |
3463 | + status_set(hookenv.status_get(), |
3464 | + 'Updating system_auth rf to {!r}'.format(settings)) |
3465 | statement = 'ALTER KEYSPACE system_auth WITH REPLICATION = %s' |
3466 | - query(session, statement, ConsistencyLevel.QUORUM, (settings,)) |
3467 | + query(session, statement, ConsistencyLevel.ALL, (settings,)) |
3468 | |
3469 | |
3470 | @logged |
3471 | def repair_auth_keyspace(): |
3472 | - # First, wait for schema agreement. Attempting to repair a keyspace |
3473 | - # with inconsistent replication settings will fail. |
3474 | - wait_for_agreed_schema() |
3475 | # Repair takes a long time, and may need to be retried due to 'snapshot |
3476 | # creation' errors, but should certainly complete within an hour since |
3477 | # the keyspace is tiny. |
3478 | + status_set(hookenv.status_get(), |
3479 | + 'Repairing system_auth keyspace') |
3480 | nodetool('repair', 'system_auth', timeout=3600) |
3481 | |
3482 | |
3483 | -def non_system_keyspaces(): |
3484 | - # If there are only system keyspaces defined, there is no data we |
3485 | - # may want to preserve and we can safely proceed with the bootstrap. |
3486 | - # This should always be the case, but it is worth checking for weird |
3487 | - # situations such as reusing an existing external mount without |
3488 | - # clearing its data. |
3489 | - dfds = get_all_database_directories()['data_file_directories'] |
3490 | - keyspaces = set(chain(*[os.listdir(dfd) for dfd in dfds])) |
3491 | - hookenv.log('keyspaces={!r}'.format(keyspaces), DEBUG) |
3492 | - return keyspaces - set(['system', 'system_auth', 'system_traces', |
3493 | - 'dse_system']) |
3494 | - |
3495 | - |
3496 | -def nuke_local_database(): |
3497 | - '''Destroy the local database, entirely, so this node may bootstrap. |
3498 | - |
3499 | - This needs to be lower level than just removing selected keyspaces |
3500 | - such as 'system', as commitlogs and other crumbs can also cause the |
3501 | - bootstrap process to fail disasterously. |
3502 | - ''' |
3503 | - # This function is dangerous enough to warrent a guard. |
3504 | - assert not is_bootstrapped() |
3505 | - assert unit_number() != 0 |
3506 | - |
3507 | - keyspaces = non_system_keyspaces() |
3508 | - if keyspaces: |
3509 | - hookenv.log('Non-system keyspaces {!r} detected. ' |
3510 | - 'Unable to bootstrap.'.format(keyspaces), ERROR) |
3511 | - raise SystemExit(1) |
3512 | - |
3513 | - dirs = get_all_database_directories() |
3514 | - nuke_directory_contents(dirs['saved_caches_directory']) |
3515 | - nuke_directory_contents(dirs['commitlog_directory']) |
3516 | - for dfd in dirs['data_file_directories']: |
3517 | - nuke_directory_contents(dfd) |
3518 | - |
3519 | - |
3520 | -def nuke_directory_contents(d): |
3521 | - '''Remove the contents of directory d, leaving d in place. |
3522 | - |
3523 | - We don't remove the top level directory as it may be a mount |
3524 | - or symlink we cannot recreate. |
3525 | - ''' |
3526 | - for name in os.listdir(d): |
3527 | - path = os.path.join(d, name) |
3528 | - if os.path.isdir(path): |
3529 | - shutil.rmtree(path) |
3530 | - else: |
3531 | - os.remove(path) |
3532 | - |
3533 | - |
3534 | -def unit_number(unit=None): |
3535 | - if unit is None: |
3536 | - unit = hookenv.local_unit() |
3537 | - return int(unit.split('/')[-1]) |
3538 | - |
3539 | - |
3540 | def is_bootstrapped(unit=None): |
3541 | '''Return True if the node has already bootstrapped into the cluster.''' |
3542 | - if unit is None: |
3543 | - unit = hookenv.local_unit() |
3544 | - if unit.endswith('/0'): |
3545 | - return True |
3546 | - relid = rollingrestart.get_peer_relation_id() |
3547 | - if not relid: |
3548 | + if unit is None or unit == hookenv.local_unit(): |
3549 | + return hookenv.config().get('bootstrapped', False) |
3550 | + elif coordinator.relid: |
3551 | + return bool(hookenv.relation_get(rid=coordinator.relid, |
3552 | + unit=unit).get('bootstrapped')) |
3553 | + else: |
3554 | return False |
3555 | - return hookenv.relation_get('bootstrapped', unit, relid) == '1' |
3556 | - |
3557 | - |
3558 | -def set_bootstrapped(flag): |
3559 | - relid = rollingrestart.get_peer_relation_id() |
3560 | - if bool(flag) is not is_bootstrapped(): |
3561 | - hookenv.log('Setting bootstrapped to {}'.format(flag)) |
3562 | - |
3563 | - if flag: |
3564 | - hookenv.relation_set(relid, bootstrapped="1") |
3565 | - else: |
3566 | - hookenv.relation_set(relid, bootstrapped=None) |
3567 | - |
3568 | - |
3569 | -def bootstrapped_peers(): |
3570 | - '''Return the ordered list of bootstrapped peers''' |
3571 | - peer_relid = rollingrestart.get_peer_relation_id() |
3572 | - return [peer for peer in rollingrestart.get_peers() |
3573 | - if hookenv.relation_get('bootstrapped', peer, peer_relid) == '1'] |
3574 | - |
3575 | - |
3576 | -def unbootstrapped_peers(): |
3577 | - '''Return the ordered list of unbootstrapped peers''' |
3578 | - peer_relid = rollingrestart.get_peer_relation_id() |
3579 | - return [peer for peer in rollingrestart.get_peers() |
3580 | - if hookenv.relation_get('bootstrapped', peer, peer_relid) != '1'] |
3581 | - |
3582 | - |
3583 | -@logged |
3584 | -def pre_bootstrap(): |
3585 | - """If we are about to bootstrap a node, prepare. |
3586 | - |
3587 | - Only the first node in the cluster is not bootstrapped. All other |
3588 | - nodes added to the cluster need to bootstrap. To bootstrap, the local |
3589 | - node needs to be shutdown, the database needs to be completely reset, |
3590 | - and the node restarted with a valid seed_node. |
3591 | - |
3592 | - Until juju gains the necessary features, the best we can do is assume |
3593 | - that Unit 0 should be the initial seed node. If the service has been |
3594 | - running for some time and never had a second unit added, it is almost |
3595 | - certain that it is Unit 0 that contains data we want to keep. This |
3596 | - assumption is false if you have removed the only unit in the service |
3597 | - at some point, so don't do that. |
3598 | - """ |
3599 | - if is_bootstrapped(): |
3600 | - hookenv.log("Already bootstrapped") |
3601 | - return |
3602 | - |
3603 | - if num_peers() == 0: |
3604 | - hookenv.log("No peers, no cluster, no bootstrapping") |
3605 | - return |
3606 | - |
3607 | - if not seed_ips(): |
3608 | - hookenv.log("No seeds available. Deferring bootstrap.") |
3609 | - raise rollingrestart.DeferRestart() |
3610 | - |
3611 | - # Don't attempt to bootstrap until all lower numbered units have |
3612 | - # bootstrapped. We need to do this as the rollingrestart algorithm |
3613 | - # fails during initial cluster setup, where two or more newly joined |
3614 | - # units may restart (and bootstrap) at the same time and exceed |
3615 | - # Cassandra's limits on the number of nodes that may bootstrap |
3616 | - # simultaneously. In addition, this ensures we wait until there is |
3617 | - # at least one seed bootstrapped, as the three first units will be |
3618 | - # seeds. |
3619 | - for peer in unbootstrapped_peers(): |
3620 | - if unit_number(peer) < unit_number(): |
3621 | - hookenv.log("{} is not bootstrapped. Deferring.".format(peer)) |
3622 | - raise rollingrestart.DeferRestart() |
3623 | - |
3624 | - # Don't attempt to bootstrap until all bootstrapped peers have |
3625 | - # authorized us. |
3626 | - peer_relname = rollingrestart.get_peer_relation_name() |
3627 | - peer_relid = rollingrestart.get_peer_relation_id() |
3628 | - authed_ips = unison.collect_authed_hosts(peer_relname) |
3629 | - for peer in bootstrapped_peers(): |
3630 | - peer_ip = hookenv.relation_get('private-address', peer, peer_relid) |
3631 | - if peer_ip not in authed_ips: |
3632 | - hookenv.log("{} has not authorized us. Deferring.".format(peer)) |
3633 | - raise rollingrestart.DeferRestart() |
3634 | - |
3635 | - # Bootstrap fail if we haven't yet opened our ports to all |
3636 | - # the bootstrapped nodes. This is the case if we have not yet |
3637 | - # joined the peer relationship with the node's unit. |
3638 | - missing = node_ips() - peer_ips() - set([hookenv.unit_private_ip()]) |
3639 | - if missing: |
3640 | - hookenv.log("Not yet in a peer relationship with {!r}. " |
3641 | - "Deferring bootstrap".format(missing)) |
3642 | - raise rollingrestart.DeferRestart() |
3643 | - |
3644 | - # Bootstrap will fail if all the nodes that contain data that needs |
3645 | - # to move to the new node are down. If you are storing data in |
3646 | - # rf==1, bootstrap can fail if a single node is not contactable. As |
3647 | - # far as this charm is concerned, we should not attempt bootstrap |
3648 | - # until all the nodes in the cluster and contactable. This also |
3649 | - # ensures that peers have run their relation-changed hook and |
3650 | - # opened their firewall ports to the new unit. |
3651 | - if not are_all_nodes_responding(): |
3652 | - raise rollingrestart.DeferRestart() |
3653 | - |
3654 | - hookenv.log('Joining cluster and need to bootstrap.') |
3655 | - |
3656 | - nuke_local_database() |
3657 | - |
3658 | - |
3659 | -@logged |
3660 | -def post_bootstrap(): |
3661 | - '''Maintain state on if the node has bootstrapped into the cluster. |
3662 | - |
3663 | - Per documented procedure for adding new units to a cluster, wait 2 |
3664 | - minutes if the unit has just bootstrapped to ensure no other. |
3665 | + |
3666 | + |
3667 | +def set_bootstrapped(): |
3668 | + # We need to store this flag in two locations: the peer relation, |
3669 | + # so peers can see it, and local state, for when we haven't joined |
3670 | + # the peer relation yet. actions.publish_bootstrapped_flag() |
3671 | + # calls this method again when necessary to ensure that state is |
3672 | + # propagated if/when the peer relation is joined. |
3673 | + config = hookenv.config() |
3674 | + config['bootstrapped'] = True |
3675 | + if coordinator.relid is not None: |
3676 | + hookenv.relation_set(coordinator.relid, bootstrapped="1") |
3677 | + if config.changed('bootstrapped'): |
3678 | + hookenv.log('Bootstrapped') |
3679 | + else: |
3680 | + hookenv.log('Already bootstrapped') |
3681 | + |
3682 | + |
3683 | +def get_bootstrapped(): |
3684 | + units = [hookenv.local_unit()] |
3685 | + if coordinator.relid is not None: |
3686 | + units.extend(hookenv.related_units(coordinator.relid)) |
3687 | + return set([unit for unit in units if is_bootstrapped(unit)]) |
3688 | + |
3689 | + |
3690 | +def get_bootstrapped_ips(): |
3691 | + return set([unit_to_ip(unit) for unit in get_bootstrapped()]) |
3692 | + |
3693 | + |
3694 | +def unit_to_ip(unit): |
3695 | + if unit is None or unit == hookenv.local_unit(): |
3696 | + return hookenv.unit_private_ip() |
3697 | + elif coordinator.relid: |
3698 | + return hookenv.relation_get(rid=coordinator.relid, |
3699 | + unit=unit).get('private-address') |
3700 | + else: |
3701 | + return None |
3702 | + |
3703 | + |
3704 | +def get_node_status(): |
3705 | + '''Return the Cassandra node status. |
3706 | + |
3707 | + May be NORMAL, JOINING, DECOMMISSIONED etc., or None if we can't tell. |
3708 | ''' |
3709 | - if num_peers() == 0: |
3710 | - # There is no cluster (just us), so we are not bootstrapped into |
3711 | - # the cluster. |
3712 | - return |
3713 | - |
3714 | - if not is_bootstrapped(): |
3715 | - for _ in backoff('bootstrap to finish'): |
3716 | - if not is_cassandra_running(): |
3717 | - hookenv.log('Bootstrap failed. Retrying') |
3718 | - start_cassandra() |
3719 | - break |
3720 | - if is_all_normal(): |
3721 | - hookenv.log('Cluster settled after bootstrap.') |
3722 | - config = hookenv.config() |
3723 | - hookenv.log('Waiting {}s.'.format( |
3724 | - config['post_bootstrap_delay'])) |
3725 | - time.sleep(config['post_bootstrap_delay']) |
3726 | - if is_cassandra_running(): |
3727 | - set_bootstrapped(True) |
3728 | - break |
3729 | - |
3730 | - |
3731 | -def is_schema_agreed(): |
3732 | - '''Return True if all the nodes that are up agree on a schema.''' |
3733 | - up_ips = set(up_node_ips()) |
3734 | - # Always include ourself since we may be joining just now. |
3735 | - up_ips.add(hookenv.unit_private_ip()) |
3736 | - raw = nodetool('describecluster') |
3737 | - # The output of nodetool describe cluster is almost yaml, |
3738 | - # so we use that tool once we fix the tabs. |
3739 | - description = yaml.load(raw.expandtabs()) |
3740 | - versions = description['Cluster Information']['Schema versions'] or {} |
3741 | - |
3742 | - for schema, schema_ips in versions.items(): |
3743 | - schema_ips = set(schema_ips) |
3744 | - if up_ips.issubset(schema_ips): |
3745 | - hookenv.log('{!r} agree on schema'.format(up_ips), DEBUG) |
3746 | - return True |
3747 | - hookenv.log('{!r} do not agree on schema'.format(up_ips), DEBUG) |
3748 | + if not is_cassandra_running(): |
3749 | + return None |
3750 | + raw = nodetool('netstats') |
3751 | + m = re.search(r'(?m)^Mode:\s+(\w+)$', raw) |
3752 | + if m is None: |
3753 | + return None |
3754 | + return m.group(1).upper() |
3755 | + |
3756 | + |
3757 | +def is_decommissioned(): |
3758 | + status = get_node_status() |
3759 | + if status in ('DECOMMISSIONED', 'LEAVING'): |
3760 | + hookenv.log('This node is {}'.format(status), WARNING) |
3761 | + return True |
3762 | return False |
3763 | |
3764 | |
3765 | @logged |
3766 | -def wait_for_agreed_schema(): |
3767 | - for _ in backoff('schema agreement'): |
3768 | - if is_schema_agreed(): |
3769 | - return |
3770 | - |
3771 | - |
3772 | -def peer_ips(): |
3773 | - ips = set() |
3774 | - relid = rollingrestart.get_peer_relation_id() |
3775 | - if relid is not None: |
3776 | - for unit in hookenv.related_units(relid): |
3777 | - ip = hookenv.relation_get('private-address', unit, relid) |
3778 | - ips.add(ip) |
3779 | - return ips |
3780 | - |
3781 | - |
3782 | -def is_all_normal(timeout=120): |
3783 | - '''All nodes in the cluster report status Normal. |
3784 | - |
3785 | - Returns false if a node is joining, leaving or moving in the ring. |
3786 | - ''' |
3787 | - is_all_normal = True |
3788 | - try: |
3789 | - raw = nodetool('status', 'system_auth', timeout=timeout) |
3790 | - except subprocess.TimeoutExpired: |
3791 | - return False |
3792 | - node_status_re = re.compile('^(.)([NLJM])\s+([\d\.]+)\s') |
3793 | - for line in raw.splitlines(): |
3794 | - match = node_status_re.search(line) |
3795 | - if match is not None: |
3796 | - updown, mode, address = match.groups() |
3797 | - if updown == 'D': |
3798 | - # 'Down' is unfortunately just informative. During service |
3799 | - # teardown, nodes will disappear without decommissioning |
3800 | - # leaving these entries. |
3801 | - if address == hookenv.unit_private_ip(): |
3802 | - is_all_normal = False |
3803 | - hookenv.log('Node {} (this node) is down'.format(address)) |
3804 | - else: |
3805 | - hookenv.log('Node {} is down'.format(address)) |
3806 | - elif updown == '?': # CASSANDRA-8791 |
3807 | - is_all_normal = False |
3808 | - hookenv.log('Node {} is transitioning'.format(address)) |
3809 | - |
3810 | - if mode == 'L': |
3811 | - hookenv.log('Node {} is leaving the cluster'.format(address)) |
3812 | - is_all_normal = False |
3813 | - elif mode == 'J': |
3814 | - hookenv.log('Node {} is joining the cluster'.format(address)) |
3815 | - is_all_normal = False |
3816 | - elif mode == 'M': |
3817 | - hookenv.log('Node {} is moving ring position'.format(address)) |
3818 | - is_all_normal = False |
3819 | - return is_all_normal |
3820 | - |
3821 | - |
3822 | -@logged |
3823 | -def wait_for_normality(): |
3824 | - for _ in backoff('cluster operators to complete'): |
3825 | - if is_all_normal(): |
3826 | - return |
3827 | - |
3828 | - |
3829 | -def is_decommissioned(): |
3830 | - if not is_cassandra_running(): |
3831 | - return False # Decommissioned nodes are not shut down. |
3832 | - |
3833 | - for _ in backoff('stable node mode'): |
3834 | - raw = nodetool('netstats') |
3835 | - if 'Mode: DECOMMISSIONED' in raw: |
3836 | - hookenv.log('This node is DECOMMISSIONED', WARNING) |
3837 | - return True |
3838 | - elif 'Mode: NORMAL' in raw: |
3839 | - return False |
3840 | - |
3841 | - |
3842 | -@logged |
3843 | def emit_describe_cluster(): |
3844 | '''Run nodetool describecluster for the logs.''' |
3845 | nodetool('describecluster') # Implicit emit |
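get_node_status() keys off the Mode line that nodetool netstats prints. A small worked example of the parse:

    # Worked example of the get_node_status() regex against typical
    # 'nodetool netstats' output.
    import re

    sample = 'Mode: NORMAL\nNot sending any streams.\n'
    match = re.search(r'(?m)^Mode:\s+(\w+)$', sample)
    assert match.group(1).upper() == 'NORMAL'
    # is_decommissioned() treats DECOMMISSIONED and LEAVING as terminal.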
3846 | @@ -1309,3 +927,62 @@ |
3847 | # FOR CHARMHELPERS. This should be a constant in nrpe.py |
3848 | def local_plugins_dir(): |
3849 | return '/usr/local/lib/nagios/plugins' |
3850 | + |
3851 | + |
3852 | +def leader_ping(): |
3853 | + '''Make a change in the leader settings, waking the non-leaders.''' |
3854 | + assert hookenv.is_leader() |
3855 | + last = int(hookenv.leader_get('ping') or 0) |
3856 | + hookenv.leader_set(ping=str(last + 1)) |
3857 | + |
3858 | + |
3859 | +def get_unit_superusers(): |
3860 | + '''Return the set of units that have had their superuser accounts created. |
3861 | + ''' |
3862 | + raw = hookenv.leader_get('superusers') |
3863 | + return set(json.loads(raw or '[]')) |
3864 | + |
3865 | + |
3866 | +def set_unit_superusers(superusers): |
3867 | + hookenv.leader_set(superusers=json.dumps(sorted(superusers))) |
3868 | + |
3869 | + |
3870 | +def status_set(state, message): |
3871 | + '''Set the unit status and log a message.''' |
3872 | + hookenv.status_set(state, message) |
3873 | + hookenv.log('{} unit state: {}'.format(state, message)) |
3874 | + |
3875 | + |
3876 | +def service_status_set(state, message): |
3877 | + '''Set the service status and log a message.''' |
3878 | + subprocess.check_call(['status-set', '--service', state, message]) |
3879 | + hookenv.log('{} service state: {}'.format(state, message)) |
3880 | + |
3881 | + |
3882 | +def get_service_name(relid): |
3883 | + '''Return the service name for the other end of relid.''' |
3884 | + units = hookenv.related_units(relid) |
3885 | + if units: |
3886 | + return units[0].split('/', 1)[0] |
3887 | + else: |
3888 | + return None |
3889 | + |
3890 | + |
3891 | +def peer_relid(): |
3892 | + return coordinator.relid |
3893 | + |
3894 | + |
3895 | +@logged |
3896 | +def set_active(): |
3897 | + '''Set happy state''' |
3898 | + if hookenv.unit_private_ip() in get_seed_ips(): |
3899 | + msg = 'Live seed' |
3900 | + else: |
3901 | + msg = 'Live node' |
3902 | + status_set('active', msg) |
3903 | + |
3904 | + if hookenv.is_leader(): |
3905 | + n = num_nodes() |
3906 | + if n == 1: |
3907 | + n = 'Single' |
3908 | + service_status_set('active', '{} node cluster'.format(n)) |
3909 | |
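The superuser bookkeeping above keeps one JSON-encoded list in leadership settings, and leader_ping() exists purely to trigger leader-settings-changed on the other units. An illustrative flow (the unit names are made up):

    # Illustrative only; set_unit_superusers() and leader_ping() must run
    # on the leader, while get_unit_superusers() works on any unit.
    set_unit_superusers({'cassandra/0', 'cassandra/1'})
    assert 'cassandra/1' in get_unit_superusers()
    leader_ping()  # Wakes non-leaders via leader-settings-changed.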
3910 | === modified file 'hooks/hooks.py' (properties changed: +x to -x) |
3911 | --- hooks/hooks.py 2015-04-28 16:20:05 +0000 |
3912 | +++ hooks/hooks.py 2015-06-30 07:36:26 +0000 |
3913 | @@ -1,5 +1,19 @@ |
3914 | #!/usr/bin/python3 |
3915 | - |
3916 | +# Copyright 2015 Canonical Ltd. |
3917 | +# |
3918 | +# This file is part of the Cassandra Charm for Juju. |
3919 | +# |
3920 | +# This program is free software: you can redistribute it and/or modify |
3921 | +# it under the terms of the GNU General Public License version 3, as |
3922 | +# published by the Free Software Foundation. |
3923 | +# |
3924 | +# This program is distributed in the hope that it will be useful, but |
3925 | +# WITHOUT ANY WARRANTY; without even the implied warranties of |
3926 | +# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR |
3927 | +# PURPOSE. See the GNU General Public License for more details. |
3928 | +# |
3929 | +# You should have received a copy of the GNU General Public License |
3930 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
3931 | from charmhelpers import fetch |
3932 | from charmhelpers.core import hookenv |
3933 | |
3934 | @@ -18,20 +32,21 @@ |
3935 | |
3936 | |
3937 | def default_hook(): |
3938 | + if not hookenv.has_juju_version('1.24'): |
3939 | + hookenv.status_set('blocked', 'Requires Juju 1.24 or higher') |
3940 | + # Hook error state, since without 1.24 we cannot set a nicer blocked state. |
3941 | + raise SystemExit(1) |
3942 | + |
3943 | # These need to be imported after bootstrap() or required Python |
3944 | # packages may not have been installed. |
3945 | import definitions |
3946 | - from loglog import loglog |
3947 | |
3948 | # Only useful for debugging, or perhaps have this enabled with a config |
3949 | # option? |
3950 | - loglog('/var/log/cassandra/system.log', prefix='C*: ') |
3951 | + ## from loglog import loglog |
3952 | + ## loglog('/var/log/cassandra/system.log', prefix='C*: ') |
3953 | |
3954 | hookenv.log('*** {} Hook Start'.format(hookenv.hook_name())) |
3955 | sm = definitions.get_service_manager() |
3956 | sm.manage() |
3957 | hookenv.log('*** {} Hook Done'.format(hookenv.hook_name())) |
3958 | - |
3959 | -if __name__ == '__main__': |
3960 | - bootstrap() |
3961 | - default_hook() |
3962 | |
3963 | === modified symlink 'hooks/install' (properties changed: -x to +x) |
3964 | === target was u'hooks.py' |
3965 | --- hooks/install 1970-01-01 00:00:00 +0000 |
3966 | +++ hooks/install 2015-06-30 07:36:26 +0000 |
3967 | @@ -0,0 +1,20 @@ |
3968 | +#!/usr/bin/python3 |
3969 | +# Copyright 2015 Canonical Ltd. |
3970 | +# |
3971 | +# This file is part of the Cassandra Charm for Juju. |
3972 | +# |
3973 | +# This program is free software: you can redistribute it and/or modify |
3974 | +# it under the terms of the GNU General Public License version 3, as |
3975 | +# published by the Free Software Foundation. |
3976 | +# |
3977 | +# This program is distributed in the hope that it will be useful, but |
3978 | +# WITHOUT ANY WARRANTY; without even the implied warranties of |
3979 | +# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR |
3980 | +# PURPOSE. See the GNU General Public License for more details. |
3981 | +# |
3982 | +# You should have received a copy of the GNU General Public License |
3983 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
3984 | +import hooks |
3985 | +if __name__ == '__main__': |
3986 | + hooks.bootstrap() |
3987 | + hooks.default_hook() |
3988 | |
3989 | === added file 'hooks/leader-elected' |
3990 | --- hooks/leader-elected 1970-01-01 00:00:00 +0000 |
3991 | +++ hooks/leader-elected 2015-06-30 07:36:26 +0000 |
3992 | @@ -0,0 +1,20 @@ |
3993 | +#!/usr/bin/python3 |
3994 | +# Copyright 2015 Canonical Ltd. |
3995 | +# |
3996 | +# This file is part of the Cassandra Charm for Juju. |
3997 | +# |
3998 | +# This program is free software: you can redistribute it and/or modify |
3999 | +# it under the terms of the GNU General Public License version 3, as |
4000 | +# published by the Free Software Foundation. |
4001 | +# |
4002 | +# This program is distributed in the hope that it will be useful, but |
4003 | +# WITHOUT ANY WARRANTY; without even the implied warranties of |
4004 | +# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR |
4005 | +# PURPOSE. See the GNU General Public License for more details. |
4006 | +# |
4007 | +# You should have received a copy of the GNU General Public License |
4008 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
4009 | +import hooks |
4010 | +if __name__ == '__main__': |
4011 | + hooks.bootstrap() |
4012 | + hooks.default_hook() |
4013 | |
4014 | === added file 'hooks/leader-settings-changed' |
4015 | --- hooks/leader-settings-changed 1970-01-01 00:00:00 +0000 |
4016 | +++ hooks/leader-settings-changed 2015-06-30 07:36:26 +0000 |
4017 | @@ -0,0 +1,20 @@ |
4018 | +#!/usr/bin/python3 |
4019 | +# Copyright 2015 Canonical Ltd. |
4020 | +# |
4021 | +# This file is part of the Cassandra Charm for Juju. |
4022 | +# |
4023 | +# This program is free software: you can redistribute it and/or modify |
4024 | +# it under the terms of the GNU General Public License version 3, as |
4025 | +# published by the Free Software Foundation. |
4026 | +# |
4027 | +# This program is distributed in the hope that it will be useful, but |
4028 | +# WITHOUT ANY WARRANTY; without even the implied warranties of |
4029 | +# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR |
4030 | +# PURPOSE. See the GNU General Public License for more details. |
4031 | +# |
4032 | +# You should have received a copy of the GNU General Public License |
4033 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
4034 | +import hooks |
4035 | +if __name__ == '__main__': |
4036 | + hooks.bootstrap() |
4037 | + hooks.default_hook() |
4038 | |
4039 | === modified symlink 'hooks/nrpe-external-master-relation-changed' (properties changed: -x to +x) |
4040 | === target was u'hooks.py' |
4041 | --- hooks/nrpe-external-master-relation-changed 1970-01-01 00:00:00 +0000 |
4042 | +++ hooks/nrpe-external-master-relation-changed 2015-06-30 07:36:26 +0000 |
4043 | @@ -0,0 +1,20 @@ |
4044 | +#!/usr/bin/python3 |
4045 | +# Copyright 2015 Canonical Ltd. |
4046 | +# |
4047 | +# This file is part of the Cassandra Charm for Juju. |
4048 | +# |
4049 | +# This program is free software: you can redistribute it and/or modify |
4050 | +# it under the terms of the GNU General Public License version 3, as |
4051 | +# published by the Free Software Foundation. |
4052 | +# |
4053 | +# This program is distributed in the hope that it will be useful, but |
4054 | +# WITHOUT ANY WARRANTY; without even the implied warranties of |
4055 | +# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR |
4056 | +# PURPOSE. See the GNU General Public License for more details. |
4057 | +# |
4058 | +# You should have received a copy of the GNU General Public License |
4059 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
4060 | +import hooks |
4061 | +if __name__ == '__main__': |
4062 | + hooks.bootstrap() |
4063 | + hooks.default_hook() |
4064 | |
4065 | === modified file 'hooks/relations.py' |
4066 | --- hooks/relations.py 2015-02-02 15:52:50 +0000 |
4067 | +++ hooks/relations.py 2015-06-30 07:36:26 +0000 |
4068 | @@ -22,8 +22,22 @@ |
4069 | from charmhelpers.core.hookenv import log, WARNING |
4070 | from charmhelpers.core.services.helpers import RelationContext |
4071 | |
4072 | - |
4073 | -# FOR CHARMHELPERS |
4074 | +from coordinator import coordinator |
4075 | + |
4076 | + |
4077 | +class PeerRelation(RelationContext): |
4078 | + interface = 'cassandra-cluster' |
4079 | + name = 'cluster' |
4080 | + |
4081 | + def is_ready(self): |
4082 | + # All units except the leader need to wait until the peer |
4083 | + # relation is available. |
4084 | + if coordinator.relid is not None or hookenv.is_leader(): |
4085 | + return True |
4086 | + return False |
4087 | + |
4088 | + |
4089 | +# FOR CHARMHELPERS (if we can integrate Juju 1.24 storage too) |
4090 | class StorageRelation(RelationContext): |
4091 | '''Wait for the block storage mount to become available. |
4092 | |
4093 | @@ -80,7 +94,9 @@ |
4094 | return False |
4095 | return True |
4096 | |
4097 | - def provide_data(self): |
4098 | + def provide_data(self, remote_service, service_ready): |
4099 | + hookenv.log('Requesting mountpoint {} from {}' |
4100 | + .format(self._requested_mountpoint, remote_service)) |
4101 | return dict(mountpoint=self._requested_mountpoint) |
4102 | |
4103 | def needs_remount(self): |
4104 | |
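Two changes stand out in relations.py: PeerRelation.is_ready() treats a lone leader as ready before the cluster peer relation exists, and StorageRelation.provide_data() adopts the new (remote_service, service_ready) signature from the charmhelpers services framework changes bundled in this merge. A hedged sketch of how the framework side plausibly drives the new signature; the authoritative loop is in hooks/charmhelpers/core/services/base.py, outside this section, and the __ready__ flag here is an assumption:

    from charmhelpers.core import hookenv

    def provide_data(services):
        # Assumed shape: for each relation a provider manages, derive the
        # remote service name and local readiness, then publish whatever
        # the provider returns.
        for service_name, service in services.items():
            service_ready = service.get('__ready__', False)  # hypothetical flag
            for provider in service.get('provided_data', []):
                for relid in hookenv.relation_ids(provider.name):
                    units = hookenv.related_units(relid)
                    if not units:
                        continue
                    remote_service = units[0].split('/')[0]
                    data = provider.provide_data(remote_service, service_ready)
                    if data:
                        hookenv.relation_set(relid, data)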
4105 | === removed file 'hooks/rollingrestart.py' |
4106 | --- hooks/rollingrestart.py 2015-02-14 04:58:55 +0000 |
4107 | +++ hooks/rollingrestart.py 1970-01-01 00:00:00 +0000 |
4108 | @@ -1,257 +0,0 @@ |
4109 | -# Copyright 2015 Canonical Ltd. |
4110 | -# |
4111 | -# This file is part of the Cassandra Charm for Juju. |
4112 | -# |
4113 | -# This program is free software: you can redistribute it and/or modify |
4114 | -# it under the terms of the GNU General Public License version 3, as |
4115 | -# published by the Free Software Foundation. |
4116 | -# |
4117 | -# This program is distributed in the hope that it will be useful, but |
4118 | -# WITHOUT ANY WARRANTY; without even the implied warranties of |
4119 | -# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR |
4120 | -# PURPOSE. See the GNU General Public License for more details. |
4121 | -# |
4122 | -# You should have received a copy of the GNU General Public License |
4123 | -# along with this program. If not, see <http://www.gnu.org/licenses/>. |
4124 | - |
4125 | -from datetime import datetime |
4126 | -import os.path |
4127 | - |
4128 | -from charmhelpers.contrib import peerstorage |
4129 | -from charmhelpers.core import hookenv, host |
4130 | -from charmhelpers.core.hookenv import DEBUG |
4131 | - |
4132 | - |
4133 | -def request_restart(): |
4134 | - '''Request a rolling restart.''' |
4135 | - # We store the local restart request as a flag on the filesystem, |
4136 | - # for rolling_restart() to deal with later. This allows |
4137 | - # rolling_restart to know that this is a new entry to be queued for |
4138 | - # a future hook to handle, or an existing entry in the queue that |
4139 | - # may be restarted this hook. |
4140 | - flag = os.path.join(hookenv.charm_dir(), '.needs-restart') |
4141 | - if os.path.exists(flag): |
4142 | - return |
4143 | - host.write_file(flag, utcnow_str().encode('US-ASCII')) |
4144 | - |
4145 | - |
4146 | -def is_waiting_for_restart(): |
4147 | - '''Is there an outstanding rolling restart request.''' |
4148 | - flag = os.path.join(hookenv.charm_dir(), '.needs-restart') |
4149 | - return os.path.exists(flag) |
4150 | - |
4151 | - |
4152 | -def cancel_restart(): |
4153 | - '''Cancel the rolling restart request, if any, and dequeue.''' |
4154 | - _enqueue(False) |
4155 | - flag = os.path.join(hookenv.charm_dir(), '.needs-restart') |
4156 | - if os.path.exists(flag): |
4157 | - os.remove(flag) |
4158 | - |
4159 | - |
4160 | -class DeferRestart(Exception): |
4161 | - '''Defer the restart sequence until a future hook. |
4162 | - |
4163 | - If a restart callback raises this exception, no further callbacks |
4164 | - are made and the restart request is moved to the end of the queue. |
4165 | - The restart will be triggered again in a future hook. |
4166 | - ''' |
4167 | - |
4168 | - |
4169 | -def get_restart_queue(): |
4170 | - '''The sorted list of units waiting for a rolling restart. |
4171 | - |
4172 | - The local unit will not be included until after rolling_restart |
4173 | - has been called. |
4174 | - ''' |
4175 | - relid = get_peer_relation_id() |
4176 | - if relid is None: |
4177 | - return [] |
4178 | - |
4179 | - all_units = set(get_peers()) |
4180 | - all_units.add(hookenv.local_unit()) |
4181 | - |
4182 | - queue = [] |
4183 | - |
4184 | - # Iterate over all units, retrieving restartrequest flags. |
4185 | - # We can't use the peerstorage helpers here, as that will only |
4186 | - # retrieve data that has been echoed and may be out of date if |
4187 | - # the necessary relation-changed hooks have not yet been invoked. |
4188 | - for unit in all_units: |
4189 | - relinfo = hookenv.relation_get(unit=unit, rid=relid) |
4190 | - request = relinfo.get('rollingrestart_{}'.format(unit)) |
4191 | - if request is not None: |
4192 | - queue.append((request, unit)) |
4193 | - queue.sort() |
4194 | - return [unit for _, unit in queue] |
4195 | - |
4196 | - |
4197 | -def _enqueue(flag): |
4198 | - '''Add or remove an entry from peerstorage queue.''' |
4199 | - relid = get_peer_relation_id() |
4200 | - if not relid: |
4201 | - # No peer relation, no problems. If there is no peer relation, |
4202 | - # there are no peers to coordinate with. |
4203 | - return |
4204 | - |
4205 | - unit = hookenv.local_unit() |
4206 | - queue = get_restart_queue() |
4207 | - if flag and unit in queue: |
4208 | - return # Already queued. |
4209 | - |
4210 | - key = _peerstorage_key() |
4211 | - value = utcnow_str() if flag else None |
4212 | - hookenv.log('Restart queue {} = {}'.format(key, value)) |
4213 | - hookenv.relation_set(relid, {key: value}) |
4214 | - |
4215 | - |
4216 | -def rolling_restart(restart_hooks, prerequisites=()): |
4217 | - '''To ensure availability, only restart one unit at a time. |
4218 | - |
4219 | - Returns: |
4220 | - True if the queued restart has occurred and restart_hook called. |
4221 | - True if no restart request has been queued. |
4222 | - False if we are still waiting on a queued restart. |
4223 | - |
4224 | - restart_hooks is a list of callables that takes no parameters. |
4225 | - |
4226 | - For best results, call rolling_restart() unconditionally at the |
4227 | - beginning and end of every hook. |
4228 | - |
4229 | - At a minimum, rolling_restart() must be called every peer |
4230 | - relation-changed hook, and after any non-peer hook calls |
4231 | - request_restart(), or the system will deadlock. |
4232 | - ''' |
4233 | - _peer_echo() # Always call this, or peerstorage will fail. |
4234 | - |
4235 | - if not is_waiting_for_restart(): |
4236 | - cancel_restart() # Clear any errant queue entries. |
4237 | - return True |
4238 | - |
4239 | - for prerequisite in prerequisites: |
4240 | - if not prerequisite(): |
4241 | - hookenv.log('Prerequisite {} not met. Restart deferred.') |
4242 | - _enqueue(False) |
4243 | - _enqueue(True) # Rejoin at the end of the queue. |
4244 | - return False |
4245 | - |
4246 | - def _restart(): |
4247 | - for restart_hook in restart_hooks: |
4248 | - restart_hook() |
4249 | - hookenv.log('Restart complete.') |
4250 | - cancel_restart() |
4251 | - hookenv.log('Restart queue entries purged.') |
4252 | - return True |
4253 | - |
4254 | - try: |
4255 | - # If there are no peers, restart the service now since there is |
4256 | - # nobody to coordinate with. |
4257 | - if len(get_peers()) == 0: |
4258 | - hookenv.log('Restart request with no peers. Restarting.') |
4259 | - return _restart() |
4260 | - |
4261 | - local_unit = hookenv.local_unit() |
4262 | - queue = get_restart_queue() |
4263 | - |
4264 | - # If we are not in the restart queue, join it and restart later. |
4265 | - # If there are peers, we cannot restart in the same hook we made |
4266 | - # the request or we will race with other units trying to restart. |
4267 | - if local_unit not in queue: |
4268 | - hookenv.log('Joining rolling restart queue') |
4269 | - _enqueue(True) |
4270 | - return False |
4271 | - |
4272 | - # We joined the restart queue in a previous hook, but we still have |
4273 | - # to wait until a peer relation-changed hook before we can do a |
4274 | - # restart. If we did the restart in any old hook, we might not see |
4275 | - # higher priority requests from peers. |
4276 | - peer_relname = get_peer_relation_name() |
4277 | - changed_hook = '{}-relation-changed'.format(peer_relname) |
4278 | - if local_unit == queue[0] and hookenv.hook_name() == changed_hook: |
4279 | - hookenv.log('First in rolling restart queue. Restarting.') |
4280 | - return _restart() |
4281 | - |
4282 | - queue_str = ', '.join(queue) |
4283 | - hookenv.log('Waiting in rolling restart queue ({})'.format(queue_str)) |
4284 | - return False |
4285 | - except DeferRestart: |
4286 | - _enqueue(False) |
4287 | - _enqueue(True) # Rejoin at the end of the queue. |
4288 | - return False |
4289 | - |
4290 | - |
4291 | -def _peerstorage_key(): |
4292 | - return 'rollingrestart_{}'.format(hookenv.local_unit()) |
4293 | - |
4294 | - |
4295 | -def _peer_echo(): |
4296 | - peer_relname = get_peer_relation_name() |
4297 | - changed_hook = '{}-relation-changed'.format(peer_relname) |
4298 | - |
4299 | - # If we are in the peer relation-changed hook, we must invoke |
4300 | - # peerstorage.peer_echo() or peerstorage will fail. |
4301 | - if hookenv.hook_name() == changed_hook: |
4302 | - # We only want to echo the restart entry for the remote unit, as |
4303 | - # any other information it has for other peers may be stale. |
4304 | - includes = ['rollingrestart_{}'.format(hookenv.remote_unit())] |
4305 | - hookenv.log('peerstorage.peer_echo for rolling restart.') |
4306 | - hookenv.log('peer_echo(includes={!r})'.format(includes), DEBUG) |
4307 | - # This method only works from the peer relation-changed hook. |
4308 | - peerstorage.peer_echo(includes) |
4309 | - |
4310 | - # If a peer is departing, clean out any rubbish it left in the |
4311 | - # peerstorage. |
4312 | - departed_hook = '{}-relation-departed'.format(peer_relname) |
4313 | - if hookenv.hook_name() == departed_hook: |
4314 | - key = 'rollingrestart_{}'.format(hookenv.remote_unit()) |
4315 | - peerstorage.peer_store(key, None, peer_relname) |
4316 | - |
4317 | - |
4318 | -def get_peer_relation_name(): |
4319 | - # Make this the charmhelpers.contrib.peerstorage default? |
4320 | - # Although it is normal to have a single peer relation, it is |
4321 | - # possible to have many. We use the first in alphabetical order. |
4322 | - md = hookenv.metadata() |
4323 | - assert 'peers' in md, 'No peer relations in metadata.yaml' |
4324 | - return sorted(md['peers'].keys())[0] |
4325 | - |
4326 | - |
4327 | -def get_peer_relation_id(): |
4328 | - for relid in hookenv.relation_ids(get_peer_relation_name()): |
4329 | - return relid |
4330 | - return None |
4331 | - |
4332 | - |
4333 | -def get_peers(): |
4334 | - """Return the sorted list of peer units. |
4335 | - |
4336 | - Peers only. Does not include the local unit. |
4337 | - """ |
4338 | - for relid in hookenv.relation_ids(get_peer_relation_name()): |
4339 | - return sorted(hookenv.related_units(relid), |
4340 | - key=lambda x: int(x.split('/')[-1])) |
4341 | - return [] |
4342 | - |
4343 | - |
4344 | -def utcnow(): |
4345 | - return datetime.utcnow() |
4346 | - |
4347 | - |
4348 | -def utcnow_str(): |
4349 | - return utcnow().strftime('%Y-%m-%d %H:%M:%S.%fZ') |
4350 | - |
4351 | - |
4352 | -def make_service(restart_hooks, prerequisites=()): |
4353 | - """Create a service dictionary for rolling restart. |
4354 | - |
4355 | - This needs to be a separate service, rather than a data_ready or |
4356 | - provided_data item on an existing service, because the rolling_restart() |
4357 | - function must be called for all hooks to avoid deadlocking the system. |
4358 | - It would only be safe to list rolling_restart() as a data_ready item in |
4359 | - a service with no requirements. |
4360 | - """ |
4361 | - def data_ready(servicename): |
4362 | - rolling_restart(restart_hooks, prerequisites) |
4363 | - |
4364 | - return dict(service='rollingrestart', data_ready=data_ready, |
4365 | - stop=[], start=[]) |
4366 | |
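All of the queue management above is superseded by the coordinator module this branch bundles (hooks/charmhelpers/coordinator.py, +607 lines, plus the small hooks/coordinator.py wrapper). Instead of peer-storage flags, a unit requests a named lock and the leader grants it to one unit at a time. A hedged sketch of the restart idiom, reconstructed from the test changes later in this diff; the acquire/granted method names are assumptions to be checked against the coordinator branch linked in the review comments:

    from coordinator import coordinator  # charm-local singleton wrapper
    import helpers  # charm-local module

    def maybe_restart(servicename):
        # Request the cluster-wide 'restart' lock; the leader serialises
        # grants, so at most one unit restarts at a time.
        coordinator.acquire('restart')        # assumed API
        if coordinator.granted('restart'):    # assumed API
            helpers.status_set('maintenance', 'Restarting Cassandra')
            helpers.stop_cassandra()
            helpers.remount_cassandra()       # migrate data if the mount changed
            helpers.ensure_database_directories()
            helpers.start_cassandra()

Since a grant only arrives in a later hook, the deadlock-avoidance rules that rolling_restart() documents above disappear: a unit simply re-runs the same logic each hook until its grant shows up.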
4367 | === modified symlink 'hooks/stop' (properties changed: -x to +x) |
4368 | === target was u'hooks.py' |
4369 | --- hooks/stop 1970-01-01 00:00:00 +0000 |
4370 | +++ hooks/stop 2015-06-30 07:36:26 +0000 |
4371 | @@ -0,0 +1,20 @@ |
4372 | +#!/usr/bin/python3 |
4373 | +# Copyright 2015 Canonical Ltd. |
4374 | +# |
4375 | +# This file is part of the Cassandra Charm for Juju. |
4376 | +# |
4377 | +# This program is free software: you can redistribute it and/or modify |
4378 | +# it under the terms of the GNU General Public License version 3, as |
4379 | +# published by the Free Software Foundation. |
4380 | +# |
4381 | +# This program is distributed in the hope that it will be useful, but |
4382 | +# WITHOUT ANY WARRANTY; without even the implied warranties of |
4383 | +# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR |
4384 | +# PURPOSE. See the GNU General Public License for more details. |
4385 | +# |
4386 | +# You should have received a copy of the GNU General Public License |
4387 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
4388 | +import hooks |
4389 | +if __name__ == '__main__': |
4390 | + hooks.bootstrap() |
4391 | + hooks.default_hook() |
4392 | |
4393 | === modified symlink 'hooks/upgrade-charm' (properties changed: -x to +x) |
4394 | === target was u'hooks.py' |
4395 | --- hooks/upgrade-charm 1970-01-01 00:00:00 +0000 |
4396 | +++ hooks/upgrade-charm 2015-06-30 07:36:26 +0000 |
4397 | @@ -0,0 +1,20 @@ |
4398 | +#!/usr/bin/python3 |
4399 | +# Copyright 2015 Canonical Ltd. |
4400 | +# |
4401 | +# This file is part of the Cassandra Charm for Juju. |
4402 | +# |
4403 | +# This program is free software: you can redistribute it and/or modify |
4404 | +# it under the terms of the GNU General Public License version 3, as |
4405 | +# published by the Free Software Foundation. |
4406 | +# |
4407 | +# This program is distributed in the hope that it will be useful, but |
4408 | +# WITHOUT ANY WARRANTY; without even the implied warranties of |
4409 | +# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR |
4410 | +# PURPOSE. See the GNU General Public License for more details. |
4411 | +# |
4412 | +# You should have received a copy of the GNU General Public License |
4413 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
4414 | +import hooks |
4415 | +if __name__ == '__main__': |
4416 | + hooks.bootstrap() |
4417 | + hooks.default_hook() |
4418 | |
4419 | === modified file 'templates/cassandra_maintenance_cron.tmpl' |
4420 | --- templates/cassandra_maintenance_cron.tmpl 2015-01-23 19:04:15 +0000 |
4421 | +++ templates/cassandra_maintenance_cron.tmpl 2015-06-30 07:36:26 +0000 |
4422 | @@ -1,4 +1,4 @@ |
4423 | # Cassandra maintenance |
4424 | # Staggered weekly repairs |
4425 | # m h dom mon dow user command |
4426 | -{{minute}} {{hour}} * * {{dow}} cassandra run-one-until-success nodetool repair -pr && run-one-until-success nodetool cleanup |
4427 | +{{minute}} {{hour}} * * {{dow}} cassandra run-one-until-success nodetool repair -pr |
4428 | |
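With the cleanup pass dropped, only the staggered repair remains in the weekly maintenance cron. Using illustrative values for the template variables (minute=15, hour=2, dow=6), the rendered entry would be:

    15 2 * * 6 cassandra run-one-until-success nodetool repair -pr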
4429 | === modified file 'tests/test_actions.py' |
4430 | --- tests/test_actions.py 2015-04-10 14:42:53 +0000 |
4431 | +++ tests/test_actions.py 2015-06-30 07:36:26 +0000 |
4432 | @@ -17,7 +17,6 @@ |
4433 | # along with this program. If not, see <http://www.gnu.org/licenses/>. |
4434 | |
4435 | import errno |
4436 | -import functools |
4437 | from itertools import repeat |
4438 | import os.path |
4439 | import re |
4440 | @@ -29,17 +28,16 @@ |
4441 | from unittest.mock import ANY, call, patch, sentinel |
4442 | import yaml |
4443 | |
4444 | +import cassandra |
4445 | from charmhelpers.core import hookenv |
4446 | |
4447 | from tests.base import TestCaseBase |
4448 | import actions |
4449 | +from coordinator import coordinator |
4450 | import helpers |
4451 | |
4452 | |
4453 | -patch = functools.partial(patch, autospec=True) # autospec=True as default. |
4454 | - |
4455 | - |
4456 | -class TestsActions(TestCaseBase): |
4457 | +class TestActions(TestCaseBase): |
4458 | def test_action_wrapper(self): |
4459 | @actions.action |
4460 | def somefunc(*args, **kw): |
4461 | @@ -61,40 +59,38 @@ |
4462 | '** Action catch-fire/somefunc (foo)') |
4463 | |
4464 | def test_revert_unchangeable_config(self): |
4465 | - hookenv.hook_name.return_value = 'config-changed' |
4466 | config = hookenv.config() |
4467 | |
4468 | self.assertIn('datacenter', actions.UNCHANGEABLE_KEYS) |
4469 | |
4470 | + # In the first hook, revert does nothing as there is nothing to |
4471 | + # revert to. |
4472 | config['datacenter'] = 'mission_control' |
4473 | + self.assertTrue(config.changed('datacenter')) |
4474 | + actions.revert_unchangeable_config('') |
4475 | + self.assertEqual(config['datacenter'], 'mission_control') |
4476 | + |
4477 | config.save() |
4478 | config.load_previous() |
4479 | config['datacenter'] = 'orbital_1' |
4480 | |
4481 | - self.assertTrue(config.changed('datacenter')) |
4482 | - |
4483 | actions.revert_unchangeable_config('') |
4484 | self.assertEqual(config['datacenter'], 'mission_control') # Reverted |
4485 | |
4486 | - hookenv.log.assert_any_call(ANY, hookenv.ERROR) |
4487 | - |
4488 | - def test_revert_unchangeable_config_install(self): |
4489 | - hookenv.hook_name.return_value = 'install' |
4490 | - config = hookenv.config() |
4491 | - |
4492 | - self.assertIn('datacenter', actions.UNCHANGEABLE_KEYS) |
4493 | - |
4494 | - config['datacenter'] = 'mission_control' |
4495 | - config.save() |
4496 | - config.load_previous() |
4497 | - config['datacenter'] = 'orbital_1' |
4498 | - |
4499 | - self.assertTrue(config.changed('datacenter')) |
4500 | - |
4501 | - # In the install hook, revert_unchangeable_config() does |
4502 | - # nothing. |
4503 | - actions.revert_unchangeable_config('') |
4504 | - self.assertEqual(config['datacenter'], 'orbital_1') |
4505 | + hookenv.log.assert_any_call(ANY, hookenv.ERROR) # Logged the problem. |
4506 | + |
4507 | + @patch('charmhelpers.core.hookenv.is_leader') |
4508 | + def test_leader_only(self, is_leader): |
4509 | + |
4510 | + @actions.leader_only |
4511 | + def f(*args, **kw): |
4512 | + return args, kw |
4513 | + |
4514 | + is_leader.return_value = False |
4515 | + self.assertIsNone(f(1, foo='bar')) |
4516 | + |
4517 | + is_leader.return_value = True |
4518 | + self.assertEqual(f(1, foo='bar'), ((1,), dict(foo='bar'))) |
4519 | |
4520 | def test_set_proxy(self): |
4521 | # NB. Environment is already mocked. |
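The new leader_only decorator exercised above gates an action on Juju leadership: non-leaders get a no-op returning None. A hypothetical reconstruction consistent with this test; the real definition lives in hooks/actions.py, outside this section:

    from functools import wraps
    from charmhelpers.core import hookenv

    def leader_only(func):
        '''Run the wrapped action only on the leader unit.'''
        @wraps(func)
        def wrapper(*args, **kw):
            if hookenv.is_leader():
                return func(*args, **kw)
            # Non-leaders fall through and return None, as the test expects.
        return wrapper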
4522 | @@ -315,6 +311,123 @@ |
4523 | # handles this. |
4524 | self.assertFalse(check_call.called) |
4525 | |
4526 | + @patch('actions._install_oracle_jre_tarball') |
4527 | + @patch('actions._fetch_oracle_jre') |
4528 | + def test_install_oracle_jre(self, fetch, install_tarball): |
4529 | + fetch.return_value = sentinel.tarball |
4530 | + |
4531 | + actions.install_oracle_jre('') |
4532 | + self.assertFalse(fetch.called) |
4533 | + self.assertFalse(install_tarball.called) |
4534 | + |
4535 | + hookenv.config()['jre'] = 'oracle' |
4536 | + actions.install_oracle_jre('') |
4537 | + fetch.assert_called_once_with() |
4538 | + install_tarball.assert_called_once_with(sentinel.tarball) |
4539 | + |
4540 | + @patch('helpers.status_set') |
4541 | + @patch('urllib.request') |
4542 | + def test_fetch_oracle_jre(self, req, status_set): |
4543 | + config = hookenv.config() |
4544 | + url = 'https://foo.example.com/server-jre-7u42-linux-x64.tar.gz' |
4545 | + expected_tarball = os.path.join(hookenv.charm_dir(), 'lib', |
4546 | + 'server-jre-7u42-linux-x64.tar.gz') |
4547 | + config['private_jre_url'] = url |
4548 | + |
4549 | + # Create a dummy tarball, since the mock urlretrieve won't. |
4550 | + os.makedirs(os.path.dirname(expected_tarball)) |
4551 | + with open(expected_tarball, 'w'): |
4552 | + pass # Empty file |
4553 | + |
4554 | + self.assertEqual(actions._fetch_oracle_jre(), expected_tarball) |
4555 | + req.urlretrieve.assert_called_once_with(url, expected_tarball) |
4556 | + |
4557 | + def test_fetch_oracle_jre_local(self): |
4558 | + # Create an existing tarball. If it is found, it will be used |
4559 | + # without needing to specify a remote url or actually download |
4560 | + # anything. |
4561 | + expected_tarball = os.path.join(hookenv.charm_dir(), 'lib', |
4562 | + 'server-jre-7u42-linux-x64.tar.gz') |
4563 | + os.makedirs(os.path.dirname(expected_tarball)) |
4564 | + with open(expected_tarball, 'w'): |
4565 | + pass # Empty file |
4566 | + |
4567 | + self.assertEqual(actions._fetch_oracle_jre(), expected_tarball) |
4568 | + |
4569 | + @patch('helpers.status_set') |
4570 | + def test_fetch_oracle_jre_notfound(self, status_set): |
4571 | + with self.assertRaises(SystemExit) as x: |
4572 | + actions._fetch_oracle_jre() |
4573 | + self.assertEqual(x.code, 0) |
4574 | + status_set.assert_called_once_with('blocked', ANY) |
4575 | + |
4576 | + @patch('subprocess.check_call') |
4577 | + @patch('charmhelpers.core.host.mkdir') |
4578 | + @patch('os.path.isdir') |
4579 | + def test_install_oracle_jre_tarball(self, isdir, mkdir, check_call): |
4580 | + isdir.return_value = False |
4581 | + |
4582 | + dest = '/usr/lib/jvm/java-8-oracle' |
4583 | + |
4584 | + actions._install_oracle_jre_tarball(sentinel.tarball) |
4585 | + mkdir.assert_called_once_with(dest) |
4586 | + check_call.assert_has_calls([ |
4587 | + call(['tar', '-xz', '-C', dest, |
4588 | + '--strip-components=1', '-f', sentinel.tarball]), |
4589 | + call(['update-alternatives', '--install', |
4590 | + '/usr/bin/java', 'java', |
4591 | + os.path.join(dest, 'bin', 'java'), '1']), |
4592 | + call(['update-alternatives', '--set', 'java', |
4593 | + os.path.join(dest, 'bin', 'java')]), |
4594 | + call(['update-alternatives', '--install', |
4595 | + '/usr/bin/javac', 'javac', |
4596 | + os.path.join(dest, 'bin', 'javac'), '1']), |
4597 | + call(['update-alternatives', '--set', 'javac', |
4598 | + os.path.join(dest, 'bin', 'javac')])]) |
4599 | + |
4600 | + @patch('os.path.exists') |
4601 | + @patch('subprocess.check_call') |
4602 | + @patch('charmhelpers.core.host.mkdir') |
4603 | + @patch('os.path.isdir') |
4604 | + def test_install_oracle_jre_tarball_already(self, isdir, |
4605 | + mkdir, check_call, exists): |
4606 | + isdir.return_value = True |
4607 | + exists.return_value = True # jre already installed |
4608 | + |
4609 | + # Store the version previously installed. |
4610 | + hookenv.config()['oracle_jre_tarball'] = sentinel.tarball |
4611 | + |
4612 | + dest = '/usr/lib/jvm/java-8-oracle' |
4613 | + |
4614 | + actions._install_oracle_jre_tarball(sentinel.tarball) |
4615 | + |
4616 | + self.assertFalse(mkdir.called) # The jvm dir already existed. |
4617 | + |
4618 | + exists.assert_called_once_with('/usr/lib/jvm/java-8-oracle/bin/java') |
4619 | + |
4620 | + # update-alternatives done, but tarball not extracted. |
4621 | + check_call.assert_has_calls([ |
4622 | + call(['update-alternatives', '--install', |
4623 | + '/usr/bin/java', 'java', |
4624 | + os.path.join(dest, 'bin', 'java'), '1']), |
4625 | + call(['update-alternatives', '--set', 'java', |
4626 | + os.path.join(dest, 'bin', 'java')]), |
4627 | + call(['update-alternatives', '--install', |
4628 | + '/usr/bin/javac', 'javac', |
4629 | + os.path.join(dest, 'bin', 'javac'), '1']), |
4630 | + call(['update-alternatives', '--set', 'javac', |
4631 | + os.path.join(dest, 'bin', 'javac')])]) |
4632 | + |
4633 | + @patch('subprocess.check_output') |
4634 | + def test_emit_java_version(self, check_output): |
4635 | + check_output.return_value = 'Line 1\nLine 2' |
4636 | + actions.emit_java_version('') |
4637 | + check_output.assert_called_once_with(['java', '-version'], |
4638 | + universal_newlines=True) |
4639 | + hookenv.log.assert_has_calls([call(ANY), |
4640 | + call('JRE: Line 1'), |
4641 | + call('JRE: Line 2')]) |
4642 | + |
4643 | @patch('helpers.configure_cassandra_yaml') |
4644 | def test_configure_cassandra_yaml(self, configure_cassandra_yaml): |
4645 | # actions.configure_cassandra_yaml is just a wrapper around the |
4646 | @@ -399,10 +512,62 @@ |
4647 | self.assertEqual(f.read().strip(), |
4648 | 'dc=test_dc\nrack=test_rack') |
4649 | |
4650 | - @patch('helpers.reset_auth_keyspace_replication') |
4651 | - def test_reset_auth_keyspace_replication(self, helpers_reset): |
4652 | + @patch('helpers.connect') |
4653 | + @patch('helpers.get_auth_keyspace_replication') |
4654 | + @patch('helpers.num_nodes') |
4655 | + def test_needs_reset_auth_keyspace_replication(self, num_nodes, |
4656 | + get_auth_ks_rep, |
4657 | + connect): |
4658 | + num_nodes.return_value = 4 |
4659 | + connect().__enter__.return_value = sentinel.session |
4660 | + connect().__exit__.return_value = False |
4661 | + get_auth_ks_rep.return_value = {'another': '8'} |
4662 | + self.assertTrue(actions.needs_reset_auth_keyspace_replication()) |
4663 | + |
4664 | + @patch('helpers.connect') |
4665 | + @patch('helpers.get_auth_keyspace_replication') |
4666 | + @patch('helpers.num_nodes') |
4667 | + def test_needs_reset_auth_keyspace_replication_false(self, num_nodes, |
4668 | + get_auth_ks_rep, |
4669 | + connect): |
4670 | + config = hookenv.config() |
4671 | + config['datacenter'] = 'mydc' |
4672 | + connect().__enter__.return_value = sentinel.session |
4673 | + connect().__exit__.return_value = False |
4674 | + |
4675 | + num_nodes.return_value = 4 |
4676 | + get_auth_ks_rep.return_value = {'another': '8', |
4677 | + 'mydc': '3'} |
4678 | + self.assertFalse(actions.needs_reset_auth_keyspace_replication()) |
4679 | + |
4680 | + @patch('helpers.set_active') |
4681 | + @patch('helpers.repair_auth_keyspace') |
4682 | + @patch('helpers.connect') |
4683 | + @patch('helpers.set_auth_keyspace_replication') |
4684 | + @patch('helpers.get_auth_keyspace_replication') |
4685 | + @patch('helpers.num_nodes') |
4686 | + @patch('charmhelpers.core.hookenv.is_leader') |
4687 | + def test_reset_auth_keyspace_replication(self, is_leader, num_nodes, |
4688 | + get_auth_ks_rep, |
4689 | + set_auth_ks_rep, |
4690 | + connect, repair, set_active): |
4691 | + is_leader.return_value = True |
4692 | + num_nodes.return_value = 4 |
4693 | + coordinator.grants = {} |
4694 | + coordinator.requests = {hookenv.local_unit(): {}} |
4695 | + coordinator.grant('repair', hookenv.local_unit()) |
4696 | + config = hookenv.config() |
4697 | + config['datacenter'] = 'mydc' |
4698 | + connect().__enter__.return_value = sentinel.session |
4699 | + connect().__exit__.return_value = False |
4700 | + get_auth_ks_rep.return_value = {'another': '8'} |
4701 | + self.assertTrue(actions.needs_reset_auth_keyspace_replication()) |
4702 | actions.reset_auth_keyspace_replication('') |
4703 | - helpers_reset.assert_called_once_with() |
4704 | + set_auth_ks_rep.assert_called_once_with( |
4705 | + sentinel.session, |
4706 | + {'class': 'NetworkTopologyStrategy', 'another': '8', 'mydc': 3}) |
4707 | + repair.assert_called_once_with() |
4708 | + set_active.assert_called_once_with() |
4709 | |
4710 | def test_store_unit_private_ip(self): |
4711 | hookenv.unit_private_ip.side_effect = None |
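The two tests above pin down when the auth keyspace replication needs resetting, and the next one shows the leader performing the reset under the coordinator's 'repair' lock. A hedged sketch of the flow they imply; the real code is in hooks/actions.py, and the min(num_nodes, 3) replication rule is inferred from the expected 'mydc': 3 with four nodes:

    from charmhelpers.core import hookenv
    from coordinator import coordinator  # charm-local singleton wrapper
    import helpers  # charm-local module

    def reset_auth_keyspace_replication(servicename):
        if not hookenv.is_leader():
            return  # leader_only behaviour
        if not coordinator.granted('repair'):  # assumed lock name, per the test
            return
        datacenter = hookenv.config()['datacenter']
        with helpers.connect() as session:
            replication = helpers.get_auth_keyspace_replication(session)
            replication['class'] = 'NetworkTopologyStrategy'
            replication[datacenter] = min(helpers.num_nodes(), 3)
            helpers.set_auth_keyspace_replication(session, replication)
        helpers.repair_auth_keyspace()
        helpers.set_active()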
4712 | @@ -410,181 +575,92 @@ |
4713 | actions.store_unit_private_ip('') |
4714 | self.assertEqual(hookenv.config()['unit_private_ip'], sentinel.ip) |
4715 | |
4716 | - @patch('helpers.set_bootstrapped') |
4717 | - @patch('helpers.unit_number') |
4718 | - def test_set_unit_zero_bootstrapped(self, unit_number, set_bootstrapped): |
4719 | - unit_number.return_value = 0 |
4720 | - hookenv.hook_name.return_value = 'cluster-whatever' |
4721 | - actions.set_unit_zero_bootstrapped('') |
4722 | - set_bootstrapped.assert_called_once_with(True) |
4723 | - set_bootstrapped.reset_mock() |
4724 | - |
4725 | - # Noop if unit number is not 0. |
4726 | - unit_number.return_value = 2 |
4727 | - actions.set_unit_zero_bootstrapped('') |
4728 | - self.assertFalse(set_bootstrapped.called) |
4729 | - |
4730 | - # Noop if called from a non peer relation hook. Unfortunatly, |
4731 | - # attempting to set attributes on the peer relation fails before |
4732 | - # the peer relation has been joined. |
4733 | - unit_number.return_value = 0 |
4734 | - hookenv.hook_name.return_value = 'install' |
4735 | - actions.set_unit_zero_bootstrapped('') |
4736 | - self.assertFalse(set_bootstrapped.called) |
4737 | - |
4738 | + @patch('charmhelpers.core.host.service_start') |
4739 | + @patch('helpers.status_set') |
4740 | + @patch('helpers.actual_seed_ips') |
4741 | + @patch('helpers.get_seed_ips') |
4742 | + @patch('relations.StorageRelation.needs_remount') |
4743 | + @patch('helpers.is_bootstrapped') |
4744 | @patch('helpers.is_cassandra_running') |
4745 | @patch('helpers.is_decommissioned') |
4746 | - @patch('rollingrestart.request_restart') |
4747 | - def test_maybe_schedule_restart_decommissioned(self, request_restart, |
4748 | - is_running, is_decom): |
4749 | + def test_needs_restart(self, is_decom, is_running, is_bootstrapped, |
4750 | + needs_remount, seed_ips, actual_seeds, |
4751 | + status_set, service_start): |
4752 | + is_decom.return_value = False |
4753 | + is_running.return_value = True |
4754 | + needs_remount.return_value = False |
4755 | + seed_ips.return_value = set(['1.2.3.4']) |
4756 | + actual_seeds.return_value = set(['1.2.3.4']) |
4757 | + |
4758 | + config = hookenv.config() |
4759 | + config['configured_seeds'] = list(sorted(seed_ips())) |
4760 | + config.save() |
4761 | + config.load_previous() # Ensure everything flagged as unchanged. |
4762 | + |
4763 | + self.assertFalse(actions.needs_restart()) |
4764 | + |
4765 | + # Decommissioned nodes are not restarted. |
4766 | is_decom.return_value = True |
4767 | - is_running.return_value = True |
4768 | - actions.maybe_schedule_restart('') |
4769 | - self.assertFalse(request_restart.called) |
4770 | - |
4771 | - @patch('helpers.is_decommissioned') |
4772 | - @patch('helpers.seed_ips') |
4773 | - @patch('helpers.is_cassandra_running') |
4774 | - @patch('relations.StorageRelation') |
4775 | - @patch('rollingrestart.request_restart') |
4776 | - def test_maybe_schedule_restart_need_remount(self, request_restart, |
4777 | - storage_relation, is_running, |
4778 | - seed_ips, is_decom): |
4779 | - config = hookenv.config() |
4780 | - |
4781 | - is_decom.return_value = False |
4782 | - is_running.return_value = True |
4783 | - |
4784 | - # Storage says we need to restart. |
4785 | - storage_relation().needs_remount.return_value = True |
4786 | - |
4787 | - # No new seeds |
4788 | - seed_ips.return_value = set(['a']) |
4789 | - config['configured_seeds'] = sorted(seed_ips()) |
4790 | - config.save() |
4791 | - config.load_previous() |
4792 | - |
4793 | - # IP address is unchanged. |
4794 | - config['unit_private_ip'] = hookenv.unit_private_ip() |
4795 | - |
4796 | - # Config items are unchanged. |
4797 | - config.save() |
4798 | - config.load_previous() |
4799 | - |
4800 | - actions.maybe_schedule_restart('') |
4801 | - request_restart.assert_called_once_with() |
4802 | - hookenv.log.assert_any_call('Mountpoint changed. ' |
4803 | - 'Restart and migration required.') |
4804 | - |
4805 | - @patch('helpers.is_decommissioned') |
4806 | - @patch('helpers.seed_ips') |
4807 | - @patch('helpers.is_cassandra_running') |
4808 | - @patch('relations.StorageRelation') |
4809 | - @patch('rollingrestart.request_restart') |
4810 | - def test_maybe_schedule_restart_unchanged(self, request_restart, |
4811 | - storage_relation, is_running, |
4812 | - seed_ips, is_decom): |
4813 | - config = hookenv.config() |
4814 | - config['configured_seeds'] = ['a'] |
4815 | - config.save() |
4816 | - config.load_previous() |
4817 | - |
4818 | - is_decom.return_value = False |
4819 | - is_running.return_value = True |
4820 | - |
4821 | - # Storage says we do not need to restart. |
4822 | - storage_relation().needs_remount.return_value = False |
4823 | - |
4824 | - # No new seeds |
4825 | - seed_ips.return_value = set(['a']) |
4826 | - config['configured_seeds'] = sorted(seed_ips()) |
4827 | - config.save() |
4828 | - config.load_previous() |
4829 | - |
4830 | - # IP address is unchanged. |
4831 | - config['unit_private_ip'] = hookenv.unit_private_ip() |
4832 | - |
4833 | - # Config items are unchanged, except for ones that do not |
4834 | - # matter. |
4835 | - config.save() |
4836 | - config.load_previous() |
4837 | - config['package_status'] = 'new' |
4838 | - self.assertTrue(config.changed('package_status')) |
4839 | - |
4840 | - actions.maybe_schedule_restart('') |
4841 | - self.assertFalse(request_restart.called) |
4842 | - |
4843 | - @patch('helpers.is_decommissioned') |
4844 | - @patch('helpers.is_cassandra_running') |
4845 | - @patch('helpers.seed_ips') |
4846 | - @patch('relations.StorageRelation') |
4847 | - @patch('rollingrestart.request_restart') |
4848 | - def test_maybe_schedule_restart_config_changed(self, request_restart, |
4849 | - storage_relation, |
4850 | - seed_ips, is_running, |
4851 | - is_decom): |
4852 | - config = hookenv.config() |
4853 | - |
4854 | - is_decom.return_value = False |
4855 | - is_running.return_value = True |
4856 | - |
4857 | - # Storage says we do not need to restart. |
4858 | - storage_relation().needs_remount.return_value = False |
4859 | - |
4860 | - # IP address is unchanged. |
4861 | - config['unit_private_ip'] = hookenv.unit_private_ip() |
4862 | - |
4863 | - # Config items are changed. |
4864 | - config.save() |
4865 | - config.load_previous() |
4866 | - config['package_status'] = 'new' |
4867 | - self.assertTrue(config.changed('package_status')) # Doesn't matter. |
4868 | - config['max_heap_size'] = 'lots and lots' |
4869 | - self.assertTrue(config.changed('max_heap_size')) # Requires restart. |
4870 | - |
4871 | - actions.maybe_schedule_restart('') |
4872 | - request_restart.assert_called_once_with() |
4873 | - hookenv.log.assert_any_call('max_heap_size changed. Restart required.') |
4874 | - |
4875 | - @patch('helpers.is_decommissioned') |
4876 | - @patch('helpers.seed_ips') |
4877 | - @patch('helpers.is_cassandra_running') |
4878 | - @patch('relations.StorageRelation') |
4879 | - @patch('rollingrestart.request_restart') |
4880 | - def test_maybe_schedule_restart_ip_changed(self, request_restart, |
4881 | - storage_relation, is_running, |
4882 | - seed_ips, is_decom): |
4883 | - is_decom.return_value = False |
4884 | - is_running.return_value = True |
4885 | - |
4886 | - # Storage says we do not need to restart. |
4887 | - storage_relation().needs_remount.return_value = False |
4888 | - |
4889 | - # No new seeds |
4890 | - seed_ips.return_value = set(['a']) |
4891 | - config = hookenv.config() |
4892 | - config['configured_seeds'] = sorted(seed_ips()) |
4893 | - config.save() |
4894 | - config.load_previous() |
4895 | - |
4896 | - # Config items are unchanged. |
4897 | - config = hookenv.config() |
4898 | - config.save() |
4899 | - config.load_previous() |
4900 | - |
4901 | - actions.store_unit_private_ip('') # IP address change |
4902 | - |
4903 | - actions.maybe_schedule_restart('') |
4904 | - request_restart.assert_called_once_with() |
4905 | - hookenv.log.assert_any_call('Unit IP address changed. ' |
4906 | - 'Restart required.') |
4907 | - |
4908 | - @patch('helpers.is_cassandra_running') |
4909 | - @patch('rollingrestart.request_restart') |
4910 | - def test_maybe_schedule_restart_down(self, request_restart, is_running): |
4911 | + self.assertFalse(actions.needs_restart()) |
4912 | + is_decom.return_value = False |
4913 | + self.assertFalse(actions.needs_restart()) |
4914 | + |
4915 | + # Nodes not running need to be restarted. |
4916 | is_running.return_value = False |
4917 | - actions.maybe_schedule_restart('') |
4918 | - request_restart.assert_called_once_with() |
4919 | + self.assertTrue(actions.needs_restart()) |
4920 | + is_running.return_value = True |
4921 | + self.assertFalse(actions.needs_restart()) |
4922 | + |
4923 | + # If we have a new mountpoint, we need to restart in order to |
4924 | + # migrate data. |
4925 | + needs_remount.return_value = True |
4926 | + self.assertTrue(actions.needs_restart()) |
4927 | + needs_remount.return_value = False |
4928 | + self.assertFalse(actions.needs_restart()) |
4929 | + |
4930 | + # Certain changed config items trigger a restart. |
4931 | + config['max_heap_size'] = '512M' |
4932 | + self.assertTrue(actions.needs_restart()) |
4933 | + config.save() |
4934 | + config.load_previous() |
4935 | + self.assertFalse(actions.needs_restart()) |
4936 | + |
4937 | + # A new IP address requires a restart. |
4938 | + config['unit_private_ip'] = 'new' |
4939 | + self.assertTrue(actions.needs_restart()) |
4940 | + config.save() |
4941 | + config.load_previous() |
4942 | + self.assertFalse(actions.needs_restart()) |
4943 | + |
4944 | + # If the seeds have changed, we need to restart. |
4945 | + seed_ips.return_value = set(['9.8.7.6']) |
4946 | + actual_seeds.return_value = set(['9.8.7.6']) |
4947 | + self.assertTrue(actions.needs_restart()) |
4948 | + is_running.side_effect = iter([False, True]) |
4949 | + helpers.start_cassandra() |
4950 | + is_running.side_effect = None |
4951 | + is_running.return_value = True |
4952 | + self.assertFalse(actions.needs_restart()) |
4953 | + |
4954 | + @patch('charmhelpers.core.hookenv.is_leader') |
4955 | + @patch('helpers.is_bootstrapped') |
4956 | + @patch('helpers.ensure_database_directories') |
4957 | + @patch('helpers.remount_cassandra') |
4958 | + @patch('helpers.start_cassandra') |
4959 | + @patch('helpers.stop_cassandra') |
4960 | + @patch('helpers.status_set') |
4961 | + def test_maybe_restart(self, status_set, stop_cassandra, start_cassandra, |
4962 | + remount, ensure_directories, is_bootstrapped, |
4963 | + is_leader): |
4964 | + coordinator.grants = {} |
4965 | + coordinator.requests = {hookenv.local_unit(): {}} |
4966 | + coordinator.relid = 'cluster:1' |
4967 | + coordinator.grant('restart', hookenv.local_unit()) |
4968 | + actions.maybe_restart('') |
4969 | + stop_cassandra.assert_called_once_with() |
4970 | + remount.assert_called_once_with() |
4971 | + ensure_directories.assert_called_once_with() |
4972 | + start_cassandra.assert_called_once_with() |
4973 | |
4974 | @patch('helpers.stop_cassandra') |
4975 | def test_stop_cassandra(self, helpers_stop_cassandra): |
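Taken together, the needs_restart() assertions above double as a specification of the restart triggers. A hedged paraphrase of the decision ladder they imply; the real implementation is in hooks/actions.py and may differ in detail:

    from charmhelpers.core import hookenv
    import helpers    # charm-local module
    import relations  # charm-local module

    def needs_restart():
        # Ordering reconstructed from the assertions above.
        if helpers.is_decommissioned():
            return False                     # never restart decommissioned nodes
        if not helpers.is_cassandra_running():
            return True                      # down nodes must be brought back up
        if relations.StorageRelation().needs_remount():
            return True                      # new mountpoint: restart to migrate data
        config = hookenv.config()
        if config.changed('max_heap_size'):  # one of several restart-worthy keys
            return True
        if config.changed('unit_private_ip'):
            return True
        if config.get('configured_seeds') != list(sorted(helpers.get_seed_ips())):
            return True                      # seed list changed
        return False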
4976 | @@ -596,15 +672,29 @@ |
4977 | actions.start_cassandra('ignored') |
4978 | helpers_start_cassandra.assert_called_once_with() |
4979 | |
4980 | - @patch('helpers.ensure_unit_superuser') |
4981 | - def test_ensure_unit_superuser(self, helpers_ensure_unit_superuser): |
4982 | - actions.ensure_unit_superuser('ignored') |
4983 | - helpers_ensure_unit_superuser.assert_called_once_with() |
4984 | + @patch('os.path.isdir') |
4985 | + @patch('helpers.get_all_database_directories') |
4986 | + @patch('helpers.set_io_scheduler') |
4987 | + def test_reset_all_io_schedulers(self, set_io_scheduler, dbdirs, isdir): |
4988 | + hookenv.config()['io_scheduler'] = sentinel.io_scheduler |
4989 | + dbdirs.return_value = dict( |
4990 | + data_file_directories=[sentinel.d1, sentinel.d2], |
4991 | + commitlog_directory=sentinel.cl, |
4992 | + saved_caches_directory=sentinel.sc) |
4993 | + isdir.return_value = True |
4994 | + actions.reset_all_io_schedulers('') |
4995 | + set_io_scheduler.assert_has_calls([ |
4996 | + call(sentinel.io_scheduler, sentinel.d1), |
4997 | + call(sentinel.io_scheduler, sentinel.d2), |
4998 | + call(sentinel.io_scheduler, sentinel.cl), |
4999 | + call(sentinel.io_scheduler, sentinel.sc)], |
5000 | + any_order=True) |
The charmhelpers coordinator module has not landed and can be reviewed at https://code.launchpad.net/~stub/charm-helpers/coordinator/+merge/261267